My previous article, Puppet Infrastructure, was written back in 2013. It's one of the most popular articles on my blog and seems to have helped a good number of people (as well as drawn criticism from others).
With it being 2015 and all, I thought I'd write an updated version and include some notes and thoughts from using Puppet throughout 2014.
- Install and configure PuppetDB
- Setting Up Version Control
- Building the Site Module
- A Note About the Module Subcommand
- Keeping Track of Modules
- Configuring NTP
- Roles and Profiles
- The First Puppet Run
- Configuring the Firewall
- Hiera and the Firewall
- Finishing up the Puppet Server Role
- Committing Everything
This tutorial will explain how to create a new Puppet environment using best practices such as version control, a site-local module, and roles & profiles. It will also include some practices that aren't considered "best", but have helped my sanity while using Puppet in production on a daily basis. It assumes the reader will be using Ubuntu. The instructions were verified with Ubuntu 14.04.
Note that this is not an intro to Puppet – this tutorial assumes that you have at least a beginner's knowledge of Puppet.
The Puppet Labs Package Repo
Install the Puppet Labs apt repo:
Install and Configure Puppet Server
To install it, do:
I haven't used Puppet Server extensively yet, but it's worked fairly well so far. I plan to use it throughout 2015 and if I run into any major issues, I'll update this tutorial.
Configure Puppet Server
By default, Puppet Server requires 2 GB of RAM. If your server does not have 2 GB, you can lower the amount by editing
Additionally, I configure my Puppet servers with the following two settings:
The Future Parser
I've been using the future parser since it was first announced in Puppet 3.2. In my opinion, iteration was a sorely missing feature in Puppet and I have abused the hell out of it since it became available.
However cool the future parser may be, keep in mind that the features it provides are not official. Future releases could remove them, or worse, break them and thus break your environment. This actually happened to me with the Puppet 3.5 release and I spent the afternoon recovering from an almost catastrophic Puppet run.
I've included a list of great resources at the end of this article. One of them covers the "manifest ordering" feature in detail. It's a great read and I agree with its principles on why you shouldn't use it.
When I first started using Puppet, I hated dependency-based ordering, but since there was no other option, I had to live with it and eventually it became second nature.
However, if I could get back the ridiculous number of hours I've spent trying to resolve ordering issues, I'd have an awesome vacation. Dependency-based ordering is great in theory, but unless you write all of your Puppet manifests and modules yourself and are very strict about following dependency conventions, it's very difficult to get right in practice. Because of this, I've begun using manifest ordering.
I understand that by using manifest ordering, I'm not helping solve the larger issue of fixing dependency problems when I find them in others' modules. I wish I had more time to do that, but unfortunately I'm not paid to work on Puppet all day – sometimes I need to get back to work.
And to play devil's advocate: really, where else am I going to have to use a dependency-based system? Everything else I use executes in a top-down format. And has
Generate an SSL Certificate
Since the Puppet Server has just been installed, it hasn't generated an SSL cert yet. It will do this the first time you start the service, but the cert will be needed in the following step. To generate the cert, do:
Install and configure PuppetDB
PuppetDB is a complementary service to Puppet. It's not required, but it's very useful. Unfortunately, this tutorial will only cover the installation of PuppetDB and not what makes it so useful.
To install and configure it:
Note that the certificate generated in the previous step must exist for PuppetDB to install cleanly.
Setting Up Version Control
It's a best practice to keep all the configuration management information in a version control provider such as git. Since the repository will probably contain sensitive information about your environment, it's recommended to use an internal git server. Managing one is very easy by using gitolite.
Seriously. Do not use GitHub, or any other public git host, if your repository will contain sensitive information. And if you accidentally check in sensitive information, delete the entire repository immediately. Don't just delete the sensitive information and commit the changes. It sounds like common sense, I know, but it happens. Also consider using some type of encryption on your repo like hiera-gpg or blackbox.
Some people keep the entire /etc/puppet directory in the repository. There's nothing wrong with this and if this is what you'd like to do, the following should work:
Another way of keeping all Puppet configuration in a repository is to create a module that will only be used for the particular Puppet environment being created. This method will be used for this tutorial.
Begin by creating a module:
Three directories were created inside the
- manifests: where your site-local manifests will go
- data: where your Hiera data will go
- ext: where your "extra" files will go. These files are complementary or supplementary to your environment as well as the main
What About Environments?
I experimented with environments throughout 2013 and 2014 and ultimately decided to not use them. By not using environments, my production sites now have multiple Puppet servers deployed: one for each project or domain of responsibility. If I used environments effectively, all of those Puppet servers could be combined into a single server.
However, I found that using environments in production caused some issues:
- If the location that housed the main Puppet server was down (lost power, etc), no nodes anywhere could talk to Puppet.
- If the filesystem on the central Puppet server became corrupt, it could affect all nodes. This happened to me in 2014, but the damage was restricted to only one project.
- A single main Puppet server would require all projects to work on the same version of Puppet. This is not possible for some projects.
- Similarly, upgrading Puppet means upgrading across the entire "federation".
- Having to type extra characters and tabs to reach the environment got tedious (cd /etc/puppet/env<tab>/project-name<tab>/mod<tab>/site). Though this was not a significant reason, it was a sense of relief to just go back to
With regard to development and testing deployments, it's far easier for me to use Vagrant to fire up a small Puppet server than to configure a central Puppet server with environments.
This isn't to say that environments are a useless feature. I've just found that they haven't worked well for my specific use-cases.
Building the Site Module
At this point, Puppet is installed and an empty site module exists. Now we'll begin using Puppet to configure the Puppet server itself as well as any other server you place under Puppet control.
Things begin to break when two servers with skewed times try communicating with each other. To ensure this doesn't happen, NTP will be installed and configured. First, install the puppetlabs/ntp Puppet module:
A Note About the Module Subcommand
The puppet command has a built-in subcommand to install modules. It's able to find the module by looking it up at the Forge.
There are pros and cons to this. On one hand, it provides an easy way to install a module plus any other modules it depends on. On the other hand, if you install a module that has a conflicting dependency with another module, the command will break. Additionally, sometimes the version of the module hosted at the Forge is outdated. When this happens, you need to manually download the module from its designated home – usually GitHub.
I usually clone the modules directly from GitHub. The puppet module command was included in this tutorial as an example.
Keeping Track of Modules
The previous version of this article showed how to create a bash script that will re-download all modules you use. There's nothing wrong with that method and it works great.
The previous version also mentioned librarian-puppet. I tried to use it, but found its dependency resolution and module metadata checks to be way too strict.
Dan Bode has a tool called librarian-puppet-simple that is a stripped-down version of librarian-puppet. It simply installs a set of modules that you list in a Puppetfile – no dependency or metadata checks. Like Dan, librarian-puppet-simple is really awesome, and you should check it out. This is what I've been using for quite a while now and don't see a reason to stop.
Now that the puppetlabs/ntp module is installed, it can be used to install and configure NTP on any server under Puppet control. The puppetlabs/ntp module is a simple module and rarely needs any parameters.
In the site.pp file, add the following:
Roles and Profiles
There's an issue with this, though. This class will need to be applied to every server:
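As a rough sketch of the problem, assuming three hypothetical node names and the Puppet 3 default manifest location (both are my assumptions, not taken from the original listing), declaring the class node by node might look like this:

```puppet
# /etc/puppet/manifests/site.pp (assumed location)
# The node names below are hypothetical, for illustration only.

node 'puppet.example.com' {
  class { '::ntp': }
}

node 'www.example.com' {
  class { '::ntp': }
}

node 'db.example.com' {
  class { '::ntp': }
}
```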
There's a lot of repeated configuration here and it'll only get worse as more modules are added. A better way to apply modules to nodes is to use the "Roles and Profiles" pattern. The end of this article has some links that will describe this pattern in detail.
A good profile to start with is the "base" profile. This profile will be applied to all servers, so it's important that this profile contains very generic and global settings. To start, create a new manifest called /etc/puppet/modules/site/manifests/profiles/base.pp with the following contents:
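A minimal sketch of such a profile, assuming it only needs to declare the puppetlabs/ntp class with its default parameters:

```puppet
# /etc/puppet/modules/site/manifests/profiles/base.pp
# Generic settings that every node should receive.
class site::profiles::base {
  # puppetlabs/ntp with its default parameters.
  class { '::ntp': }
}
```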
Next, create a role:
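A role in this pattern is usually just a list of profiles. The sketch below assumes a role named site::roles::base stored at manifests/roles/base.pp inside the site module; both the name and the path are hypothetical.

```puppet
# /etc/puppet/modules/site/manifests/roles/base.pp (hypothetical path)
# A role takes no parameters; it only pulls in profiles.
class site::roles::base {
  contain site::profiles::base
}
```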
Finally, apply the role to the node:
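A sketch of the node definition, assuming the hypothetical role name site::roles::base and a hypothetical node name. The plain include form is used here as the most conservative way to declare a class in a node block.

```puppet
# site.pp
# 'puppet.example.com' is a hypothetical node name.
node 'puppet.example.com' {
  include site::roles::base
}
```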
With only the ntp module being used in site::profiles::base, this actually seems more complicated. To better show the usefulness of profiles, add the following to the
Before, you would have had to add those two lines to each node. Now you just add them to one profile and they get applied to every node that has the base profile.
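To make that concrete, here is a sketch of a base profile with a two-line addition. The package names are hypothetical stand-ins; any pair of resources that every node should receive would illustrate the same point.

```puppet
class site::profiles::base {
  class { '::ntp': }

  # Hypothetical base packages, added once here instead of in every node.
  $packages = ['git', 'tmux']
  package { $packages: ensure => installed }
}
```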
All Your Base
Since my "base" settings are so common across different Puppet-controlled environments, I have started creating an actual "base" module called bass. I haven't yet decided whether to pronounce it bass as in "base" or bass as in the fish.
Class or Include or Contain?
There's a lot of great documentation on this subject that is written better than I ever could. See the end of this article for links. Once you've read everything and understand the history between these three keywords, here's my $0.02:
- I use contain in my roles and nodes. This is because I enforce a no-parameter policy in roles and nodes.
- I use class in my profiles since those do use parameters. I still sometimes use Anchors and explicit ordering, but I'm finding that these are not needed (as much) in Puppet 3.7+ and Puppet Server and by using manifest ordering.
- Sometimes I'll throw an include in the profiles, though, if I'm positive that I'll never need to add parameters to them and that their ordering is stable.
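The three forms side by side, as a sketch. The class names site::roles::web, site::profiles::web, and site::banner are hypothetical; the servers parameter belongs to puppetlabs/ntp.

```puppet
# Role: no parameters, so contain is enough.
class site::roles::web {
  contain site::profiles::web
}

# Profile: passes parameters, so the resource-like class declaration is used.
class site::profiles::web {
  class { '::ntp':
    servers => ['0.pool.ntp.org', '1.pool.ntp.org'],
  }

  # A class that is not expected to ever need parameters here.
  include ::site::banner
}
```

One practical difference to keep in mind: a resource-like class declaration may appear only once for a given class in a catalog, while include and contain can be repeated safely.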
The First Puppet Run
At this point, Puppet can be run for the first time. If all goes well, NTP will be installed and running when Puppet has finished:
Configuring the Firewall
The puppetlabs/firewall module will be used to build the basis of a deny-by-default firewall.
Install the puppetlabs/firewall module by cloning it from GitHub:
Add it to your
Next, a new manifest called /etc/puppet/modules/site/manifests/firewall.pp will be created:
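A sketch of what a deny-by-default class could contain, modeled on the rule layout in the puppetlabs/firewall README. The parameter names (action, state) are the ones I know from the 1.x releases of that module; treat them as an assumption and check them against the version you have installed.

```puppet
# /etc/puppet/modules/site/manifests/firewall.pp
class site::firewall {
  # Base class from puppetlabs/firewall (packages and rule persistence).
  include ::firewall

  # Remove any rules that Puppet does not manage.
  resources { 'firewall':
    purge => true,
  }

  firewall { '000 accept all icmp':
    proto  => 'icmp',
    action => 'accept',
  }

  firewall { '001 accept all to lo interface':
    proto   => 'all',
    iniface => 'lo',
    action  => 'accept',
  }

  firewall { '002 accept related established rules':
    proto  => 'all',
    state  => ['RELATED', 'ESTABLISHED'],
    action => 'accept',
  }

  # Everything not explicitly accepted above is dropped.
  firewall { '999 drop all':
    proto  => 'all',
    action => 'drop',
  }
}
```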
Next, add the following to the site::profiles::base profile, before the base packages are applied:
Now all servers will have a deny-by-default firewall applied. I wouldn't recommend applying this configuration yet because you'll be locked out if you're working on this server remotely.
Hiera and the Firewall
This is a good place to introduce Hiera - a tool to store structured configuration data outside of Puppet manifests. Hiera is installed by default with the Puppet package, so installing a hiera package is not needed. However, to utilize Hiera's merging feature, the deep_merge gem needs to be installed:
To configure Hiera, create /etc/puppet/modules/site/ext/hiera.yaml with the following contents:
Next, link this configuration file to two locations:
The first location is so you can use the hiera command-line tool. The second location is for Puppet itself.
At the moment, the only hierarchy configured is common. This means that Hiera will only read data from a single file:
Add the following to that file:
Add any other networks or hosts (/32) to this list that you need.
Next, add the following before the 999 rule in
This block of Puppet code uses Puppet's iteration feature from the future parser, so you'll need to make sure you have it enabled in
Now the next time you apply the Puppet configuration, a deny-by-default firewall will be enabled with explicit allow rules for each trusted network you specified in Hiera.
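The iteration over trusted networks could be sketched like this. The Hiera key trusted_networks and the rule number 003 are assumptions of mine; the each syntax requires the future parser on Puppet 3.x, and the firewall parameter names are the ones from the 1.x releases of puppetlabs/firewall.

```puppet
# Placed inside site::firewall, before the 999 rule.
# 'trusted_networks' is an assumed Hiera key holding a list of CIDR blocks.
$trusted_networks = hiera_array('trusted_networks')

# One accept rule per trusted network.
$trusted_networks.each |$network| {
  firewall { "003 allow all traffic from ${network}":
    proto  => 'all',
    source => $network,
    action => 'accept',
  }
}
```

Because each rule title includes the network, every entry in the Hiera list produces a uniquely named firewall resource.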
My Hiera Structure
Over the past year+, I have standardized on the following Hiera hierarchy:
- "node" is the hostname (or FQDN) of the node. While the idea of having node-specific settings goes against the "Pets vs. Cattle" argument, it's sometimes unavoidable. In my production environments, only a small percentage of nodes have their own node-specific settings, and even then it's maybe one or two values.
- "location" is an arbitrary fact to group nodes:
- "role" is another fact that matches the role that the node will have applied:
Be careful about using Facter to categorize nodes on the node itself! Let's say a node with a role of dns was compromised and the intruder understood that they could change the role to mysql by replacing the fact in /etc/facter/facts.d/role.txt. On the next Puppet run, MySQL will be installed, which would apply sensitive information to the node that the intruder now has access to. It might also break your DNS server.
What Goes in Hiera and What Uses Hiera?
My personal rules of Hiera data are:
- Hiera data is used only in profiles. If I write my own module that will be located in /etc/puppet/modules, I still use class parameters.
- Since I only use Hiera in profiles, that means all Hiera data is site-specific. So the deciding question becomes: "what information needs to be stripped from this module to allow it to work in another environment?" That data is then moved to Hiera.
Finishing up the Puppet Server Role
Up until now, broad configurations that could be applied to any node have been used. Now we'll create a more specific role and profile to configure the Puppet Server.
In order to do this, several modules will be needed:
In production, I mark all modules in my Puppetfile with a reference. This reference is the known working release or commit of that module. This means that if I ever want to test an updated module, I'll need to actually create a test environment, rebuild everything, and confirm it works. But the alternative of just going cowboy and deploying all new releases to the production environment will only cause a lot of pain (and downtime).
Creating the Puppet Server Profile
Create /etc/puppet/modules/site/manifests/profiles/puppet/server.pp with the following contents:
This example profile is much more fleshed out than the previous site::profiles::base profile to show the format I'm currently using in my production profiles.
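As a rough sketch of one way a profile like this could be laid out, with Hiera lookups grouped at the top and resources below: the Hiera key name and the two-section layout are assumptions of mine. The puppetdb::master::config class and its puppetdb_server parameter come from the puppetlabs/puppetdb module.

```puppet
# /etc/puppet/modules/site/manifests/profiles/puppet/server.pp
class site::profiles::puppet::server {
  # Hiera lookups.
  # The key name is hypothetical; the second argument is the default
  # used when the key is not present in Hiera.
  $puppetdb_server = hiera('site::puppet::server::puppetdb_server', $::fqdn)

  # Resources.
  # Point the Puppet server at PuppetDB.
  class { '::puppetdb::master::config':
    puppetdb_server => $puppetdb_server,
  }
}
```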
In the common.yaml Hiera file, add this:
You can see how each Hiera item matches the corresponding section in the profile except for puppetdb::master::config::restart_puppet. This is because puppetdb::master::config::restart_puppet is an Automatic Parameter Lookup. Declaring this in Hiera is the same as if you did:
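A sketch of the equivalent explicit declaration, assuming the Hiera value is false (the actual value in your data may differ):

```puppet
# Equivalent to setting puppetdb::master::config::restart_puppet in Hiera.
# The value false is an assumption for illustration.
class { '::puppetdb::master::config':
  restart_puppet => false,
}
```

With the automatic lookup, Puppet searches Hiera for a key named after the class and parameter whenever the class is declared without that parameter set explicitly.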
Now create the
Now build a role for the Puppet server:
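A sketch of such a role, assuming it is named site::roles::puppet::server (a hypothetical name following the same convention as the profiles) and that it combines the base profile with the Puppet server profile:

```puppet
# /etc/puppet/modules/site/manifests/roles/puppet/server.pp (hypothetical path)
class site::roles::puppet::server {
  contain site::profiles::base
  contain site::profiles::puppet::server
}
```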
With all of this in place, run Puppet:
When everything has finished, you should now be able to switch to using puppet agent instead of
My Opinion on Automatic Parameter Lookups
I think they're a great idea, but ultimately they're too "magical" and unintuitive. There's no easy way to tell if they're being used by reading the Puppet manifests – you have to read both the manifests and Hiera data and correlate the two data sources.
If there are any automatic lookups in my Hiera data, it's because I got lazy. It happens.
A lot of work has been done here. To see all of the changes that were made, do the following:
All of this should be committed into
This tutorial was an updated version of my 2013 Puppet Infrastructure tutorial. It described how to install and configure Puppet Server, PuppetDB, and Hiera as well as how to lay the foundation of a maintainable Puppet environment.
In addition, I've included notes of my experiences learned over the past year with Puppet.
And as mentioned throughout this article: