Defined Type: cdh::hadoop::worker::paths

Defined in:
modules/cdh/manifests/hadoop/worker/paths.pp

Overview

Define cdh::hadoop::worker::paths

Ensures directories needed for Hadoop Worker nodes are created with proper ownership and permissions. This has to be a define so that the $datanode_mounts array can be passed as the resource title, which declares one instance per element. (Puppet versions before 4, without the future parser, do not support iteration.)

You should probably create each $basedir yourself before this define is used. Each $basedir is expected to be a JBOD mount point that Hadoop will use to store data. This define does not create or mount any partitions.

Parameters:

$basedir - base path for directory creation. Default: $title

Usage:

cdh::hadoop::worker::paths { ['/mnt/hadoop/data/a', '/mnt/hadoop/data/b']: }

The above declaration will ensure that the following directory hierarchy exists:

/mnt/hadoop/data/a
/mnt/hadoop/data/a/hdfs
/mnt/hadoop/data/a/hdfs/dn
/mnt/hadoop/data/a/yarn
/mnt/hadoop/data/a/yarn/local
/mnt/hadoop/data/a/yarn/logs
/mnt/hadoop/data/b
/mnt/hadoop/data/b/hdfs
/mnt/hadoop/data/b/hdfs/dn
/mnt/hadoop/data/b/yarn
/mnt/hadoop/data/b/yarn/local
/mnt/hadoop/data/b/yarn/logs

(If you use MRv1 instead of YARN, the hierarchy will be slightly different.)
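
The usage above passes a literal array as the title. As a minimal sketch, the same declaration can take its titles from a variable, such as the $datanode_mounts array mentioned in the overview. The variable is set locally here for illustration only, and the include line assumes the parameters of cdh::hadoop are supplied elsewhere (for example through Hiera):

```puppet
# The define orders itself after Class['cdh::hadoop'], so that class
# must be in the catalog. Assumption: its parameters come from Hiera.
include cdh::hadoop

# Hypothetical local variable; in practice this list would come from
# wherever the node's JBOD mount points are configured.
$datanode_mounts = [
    '/mnt/hadoop/data/a',
    '/mnt/hadoop/data/b',
]

# An array title declares one cdh::hadoop::worker::paths instance per
# element. $basedir defaults to $title, so each instance manages the
# directory tree under its own mount point.
cdh::hadoop::worker::paths { $datanode_mounts: }
```

Because $basedir defaults to $title, no parameters need to be set when the title is already the mount point.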

Parameters:

  • basedir (Any) (defaults to: $title)


# File 'modules/cdh/manifests/hadoop/worker/paths.pp', line 36

define cdh::hadoop::worker::paths($basedir = $title) {
    Class['cdh::hadoop'] -> Cdh::Hadoop::Worker::Paths[$title]

    # hdfs, hadoop, and yarn users
    # are all added by packages
    # installed by cdh::hadoop

    # make sure mounts exist
    file { $basedir:
        ensure => 'directory',
        owner  => 'hdfs',
        group  => 'hdfs',
        mode   => '0755',
    }

    # Assume that $dfs_data_path is two levels.  e.g. hdfs/dn
    # We need to manage the parent directory too.
    $dfs_data_path_parent = inline_template("<%= File.dirname('${::cdh::hadoop::dfs_data_path}') %>")
    # create DataNode directories
    file { ["${basedir}/${dfs_data_path_parent}", "${basedir}/${::cdh::hadoop::dfs_data_path}"]:
        ensure  => 'directory',
        owner   => 'hdfs',
        group   => 'hdfs',
        mode    => '0700',
        require => File[$basedir],
    }

    # create yarn local directories
    file { ["${basedir}/yarn", "${basedir}/yarn/local", "${basedir}/yarn/logs"]:
        ensure => 'directory',
        owner  => 'yarn',
        group  => 'yarn',
        mode   => '0755',
    }
}
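
The parent-directory step in the listing above can be illustrated in isolation. This sketch is not the module's own approach: it swaps the inline_template call for the dirname() function and so assumes the puppetlabs-stdlib module is installed. The value 'hdfs/dn' is taken from the comment in the listing.

```puppet
# Assumption: puppetlabs-stdlib is available; dirname() comes from it.
# The define itself uses inline_template with Ruby's File.dirname.
$dfs_data_path        = 'hdfs/dn'
$dfs_data_path_parent = dirname($dfs_data_path)

# $dfs_data_path_parent evaluates to 'hdfs', so for a $basedir of
# '/mnt/hadoop/data/a' the define would manage both
# '/mnt/hadoop/data/a/hdfs' and '/mnt/hadoop/data/a/hdfs/dn'.
notice("parent of ${dfs_data_path} is ${dfs_data_path_parent}")
```

Only one parent level is derived this way, which is why the listing assumes a $dfs_data_path that is two levels deep.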