When running puppet apply, note that puppet's confdir is underneath the --confdir
value. For example: If you have site.csv
in /etc/puppet/config
, Then you should use --confdir=/etc/puppet
, and puppet finds the config dir underneath.
As an end to end example, you can follow the vagrant-puppet-vm
recipes to see how to set up a puppet managed bigtop hadoop installation. Those examples are gauranteed to work and serve as a pedagogical round trip to the way bigtop integrates packaging, deployment, and testing all into one package.
If in any case, you need to debug these recipes, you can add notify("...")
statements into the puppet scripts.
In time, we will add more logging and debugging to these recipes. Feel free to submit a patch for this!
As above, we defined a confdir (i.e. /etc/puppet/
) which has a config/
directory in it.
The heart of puppet is the manifests file. This file (manifests/init.pp
) expects configuration to live in hiera as specified by $confdir/hiera.yaml
. An example hiera.yaml
as well as hiera configuration yaml files are provided with the bigtop classes. They basically take the form:
key: value
with syntactic variations for hashes and arrays. Please consult the excellent puppet and hiera documentation for details.
All configuration is done via such key value assignments in hierdata/site.yaml
. Any options not defined there will revert to a default value defined in hieradata/cluster.yaml
, with the following exceptions (which are required):
bigtop::hadoop_head_node
: must be set to the FQDN of the name node of your cluster (which will also become its job tracker and gateway)
bigtop::bigtop_repo_uri
: uri of a repository containing packages for hadoop as built by Bigtop.
$confdir
is the directory that puppet will look into for its configuration. On most systems, this will be either /etc/puppet/
or /etc/puppetlabs/puppet/
. You may override this value by specifying --confdir=path/to/config/dir
on the puppet command line.
cluster.yaml
also serves as an example what parameters can be set and how they usually interact between modules.
You can instruct the recipes to install ssh-keys for user hdfs to enable passwordless login across the cluster. This is for test purposes only, so by default the option is turned off.
Files such as ssh-keys are imported from the master using the puppet:///
URL scheme. For this to work, fileserver has to be enabled on the puppet master, the files module enabled and access allowed in auth.conf. fileserver.conf should therefore contain e.g.:
[files] path /etc/puppet/files allow *
No changes are required to the default puppet 3 auth.conf.
For other options that may be set here, look for class parameters in the modules' manifests/init.pp files. Any class parameter can be used as a hiera key if prefixed with the module and class namespace. Module kafka's server class will look for its parameter port
as kafka::server::port
in hiera. Note that if hadoop::hadoop_storage_dirs
is left unset, puppet will attempt to guess which directories to use.
gradle toolchain-puppetmodules # if you already have JAVA installed
or
puppet apply --modulepath=<path_to_bigtop> -e "include bigtop_toolchain::puppet_modules"
This will install the following module(s) for you:
Note that, the puppet apt module version must be equal to or higher than 2.0.1 after BIGTOP-1870. Bigtop toolchan can take care of that for you, so just be aware of it.
And run the following on those nodes:
cp bigtop-deploy/puppet/hiera.yaml /etc/puppet mkdir -p /etc/puppet/hieradata rsync -a --delete bigtop-deploy/puppet/hieradata/site.yaml bigtop-deploy/puppet/hieradata/bigtop /etc/puppet/hieradata/
Edit /etc/puppet/hieradata/site.yaml to your liking, setting up the hostname for hadoop head node, path to storage directories and their number, list of the components you wish to install, and repo URL. At the end, the file will look something like this
bigtop::hadoop_head_node: "hadoopmaster.example.com" hadoop::hadoop_storage_dirs: - "/data/1" - "/data/2" hadoop_cluster_node::cluster_components: - ignite_hadoop - hive - spark - yarn - zookeeper bigtop::bigtop_repo_uri: "http://repos.bigtop.apache.org/releases/1.3.0/ubuntu/16.04/x86_64"
And finally execute
puppet apply -d --parser future --modulepath="bigtop-deploy/puppet/modules:/etc/puppet/modules" bigtop-deploy/puppet/manifests
When ignite-hadoop accelerator is deployed the client configs are placed under /etc/hadoop/ignite.client.conf
. All one needs to do to run Mapreduce jobs on ignite-hadoop grid is to set HADOOP_CONF_DIR=/etc/hadoop/ignite.client.conf
in the client session.
These classes are mostly used for regression testing. For ease of use they contain insecure default passwords in a number of places. If you intend to use them in production environments, make sure to track down all those places and set proper passwords. This can be done using the corresponding hiera settings. Some of these (but almost certainly not all!) are:
hadoop::common_hdfs::hadoop_http_authentication_signature_secret hadoop::httpfs::secret
Instead of explicitly setting passwords in hiera, they can be automatically generated using a program called trocla. However, there are a number of caveats with this approach at the moment:
trocla needs to be installed on the master only. To do so, run the following:
# gem install trocla # puppet module install duritong/trocla # puppet apply -e "class { 'trocla::config': manage_dependencies => false }"
The trocla ruby gem pulls in highline, moneta and bcrypt. The bcrypt gem needs ruby development packages (ruby.h) and a compiler.
Alternatively you can install your distributions' binary packages before running gem install. On Debian those packages can be installed as follows:
apt-get install ruby-highline ruby-moneta ruby-bcrypt
This installation process is expected to get easier once operation system packages have been created. This is actively underway for Debian. See Debian bugs #777761 and #777906 for progress.
After installing the trocla gem as outlined above, the following test should work:
# puppet apply -e "file { '/tmp/test': content => trocla("test", "plain") }" # cat /tmp/test puGNOX-G%zYDKHet
Now, automatic password generation can be activated in site.yaml using
hadoop::generate_secrets: true