- Hiera Integration
- Bootstrapping Servers
- Masterless Puppet
Since writing my last article, On Configuration Management, I've been researching different tools. I found several great ones on GitHub, such as:
- Capistrano Puppeteer
- Supply Drop
- Sprinkle
- Babushka
All these tools had great ideas and I'm glad to have found them. Sprinkle and Babushka were particularly interesting because of their conditional testing after each task. This allows both a form of testing and idempotence checking in one step.
Unfortunately, none of them were exactly what I was looking for, so I began working out what my perfect tool would be. I realized that I was looking for more than a configuration management system. Configuration management (at the level of Puppet et al) is only a small part of infrastructure management. I wanted a tool that would help me corral all parts of infrastructure management into a single framework: the ability to apply Puppet manifests on one server, configure a Gluster volume on a second, and create a Gitolite user on a third.
I defined three core values that I wanted this tool to have:
Capistrano was a perfect fit for all three of those items. It's also well known, which is a bonus, and being based on Rake is an added benefit.
My Capistrano setup is still a work in progress, but here's what I have so far:
Capistrano's README file does a great job explaining both how to install Capistrano and the basics of using it. If you aren't familiar with Capistrano, I recommend reading it along with the docs on Capistrano's homepage before continuing on.
I made a slight change to the `Gemfile`: I'm using the trunk version of `sshkit`, which has better error reporting than the currently released gem.
To begin, add an existing server to `config/deploy/staging.rb`. There are several examples in the stage template to go off of.
Next, add a file called `lib/capistrano/tasks/utils.cap` with the following contents:
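A minimal `utils.cap` in Capistrano 3 style might look like this (a sketch; the task name `utils:uptime` is an assumption):

```ruby
# lib/capistrano/tasks/utils.cap
# Sketch of a simple task using Capistrano 3's SSHKit DSL.
namespace :utils do
  desc 'Show the uptime of each server'
  task :uptime do
    on roles(:all) do |host|
      # capture runs the command remotely and returns its output
      info "#{host}: #{capture(:uptime)}"
    end
  end
end
```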
At this point, you should have a file and directory structure similar to this.
You should be able to run the following command and see the uptime of the server (or servers) that you added to the deploy file:
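For example, with a task named `utils:uptime` (an assumed name):

```shell
cap staging utils:uptime
```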
Note: Capistrano requires that all hostnames are resolvable. You can either create a proper DNS entry or add an entry to `/etc/hosts`. Also, if you will be churning servers, make sure to clean up old SSH fingerprints or add the following to
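One common approach (an assumption here, with the usual security trade-off) is an entry in `~/.ssh/config` that skips host-key checking for short-lived servers:

```
Host *.example.com
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null
```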
First, add the following to the
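If the destination is the `Gemfile` (an assumption), the addition would be the hiera gem:

```ruby
gem 'hiera'
```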
Next, create the following files:
- lib/cap_hiera.rb: This file makes a few Hiera-based functions available in Capistrano.
- hiera/hiera.yaml: This file is a basic Hiera configuration. You can see where this file is called in
- hiera/data/staging.yaml: This file will contain server definitions for the "staging" stage.
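A minimal `hiera/hiera.yaml` for this layout might look like the following (a sketch; the `%{stage}` hierarchy is an assumption):

```yaml
:backends:
  - yaml
:yaml:
  :datadir: hiera/data
:hierarchy:
  - "%{stage}"
```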
You can verify that Hiera is working by running the following:
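For example, querying a key directly with the `hiera` command-line tool (the `servers` key is an assumed name):

```shell
hiera -c hiera/hiera.yaml servers stage=staging
```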
Once confirmed working, you can change the Capfile and add the following lines:
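A sketch of what those Capfile lines might be (paths assumed from the file layout above):

```ruby
# Capfile: make lib/ requirable and load the Hiera helpers
$LOAD_PATH.unshift File.expand_path('lib', File.dirname(__FILE__))
require 'cap_hiera'
```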
You can now remove any server entries made in the `config/deploy/staging.rb` file. Unfortunately, you cannot remove the file itself, as Capistrano uses the name of the file as a stage definition.
You can also add global settings in Hiera. For example, add the following to
And add the following to
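As a hypothetical example of a global setting, a `hiera/data/default.yaml` might define an SSH user applied to every server (key name is a placeholder):

```yaml
ssh_user: deploy
```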
At this point, you should have a Capistrano directory that looks similar to this.
So far, this Capistrano installation has a `utils.cap` task file and some Hiera files. Since I'll be adding more features, I wanted to organize everything into quasi "modules". At the moment, this module structure is not suited for wide redistribution; it's more to organize local files.
Here's how the Hiera configuration looks after I converted it to a "module":
Several files, including the `Capfile`, will need to be modified to account for the new paths.
You can move the `utils.cap` task into its own module:
The `Capfile` will need to be modified to import the tasks at the new location:
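A sketch of the import change, assuming tasks now live under `modules/*/tasks`:

```ruby
# Capfile: import every task file from the module layout
Dir.glob('modules/*/tasks/*.cap').each { |task| import task }
```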
If you modified all files correctly, then the following should work as it did before:
At this point, you should have a directory structure similar to this.
There are a lot of hard-coded paths in the modules. They also contain site-specific data, so public distribution is a bad idea – especially if the module contains sensitive information.
The next feature is Vagrant support. Being able to add existing servers to Hiera is fine, but I want to be able to add new servers to Hiera and have Vagrant create those servers.
I use Vagrant with the vagrant-openstack-plugin, so this section will be specific to that. It's easy to swap this configuration out for another cloud plugin, and it should not be hard to adapt for basic VirtualBox.
In the `default.yaml` file, I have the following:
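A hypothetical shape for those defaults (the attribute names mirror common vagrant-openstack-plugin settings; all values are placeholders):

```yaml
:cloud:
  username: demo
  api_key: secret
  flavor: m1.tiny
  image: precise64
  keypair_name: deploy
  ssh_username: ubuntu
```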
These attributes map to settings used in the Vagrant OpenStack plugin. The `:cloud` attribute is an arbitrary name. Hiera merges these default settings with defined servers by the
The Vagrant module looks like this:
Notice the template, which you can view here. The values inside the template are filled in with the corresponding server attributes. Using a template like this makes building Vagrantfiles somewhat inflexible; maybe I'll be able to fix that at some point.
The `vagrant.cap` file contains three tasks for managing Vagrant machines. The `vagrant:new` task contains a method called `render_template`. This is a "helper" function which is defined in a new file, `modules/utils/lib/helpers.rb`. Add the following to it:
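A minimal sketch of such a helper, rendering an ERB template against a hash of server attributes (the exact implementation in the linked file may differ):

```ruby
require 'erb'

# Hypothetical render_template helper for modules/utils/lib/helpers.rb.
# Exposes each attribute as a local variable inside the template.
def render_template(template_path, attributes)
  b = binding
  attributes.each { |key, value| b.local_variable_set(key.to_sym, value) }
  ERB.new(File.read(template_path)).result(b)
end
```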
There are some methods to colourize output. This requires the `colorize` gem, which you should add to the `Gemfile`.
You should now be able to create a new Vagrant virtual machine by calling `vagrant:new`. If you run this task without a "host filter", Capistrano will create Vagrant virtual machines for every server defined. So to create a single server, do the following:
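Capistrano 3 supports filtering tasks to specific hosts; for example, assuming the `HOSTS` environment-variable filter:

```shell
HOSTS=example.com cap staging vagrant:new
```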
If all was successful, a `modules/vagrant/files/mycloud/example.com` directory will be available. You can change into that directory and run:
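For example, booting the machine with the OpenStack provider (provider name assumed from the plugin):

```shell
cd modules/vagrant/files/mycloud/example.com
vagrant up --provider=openstack
```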
Or use the `vagrant:up` task instead of running `vagrant` manually.
At this point, your Capistrano directory should look similar to this.
Vagrant can provision virtual machines with services such as shell, Puppet, or Chef. But I also want to provision other types of servers. By using Capistrano, I can create a bootstrap task that emulates the Vagrant shell provisioner. Now I can bootstrap bare-metal servers as well as virtual machines created outside of Vagrant.
I decided to place the "bootstrapping" task under a new module called "server":
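A sketch of what such a bootstrap task might look like (the task and script names are assumptions); it emulates Vagrant's shell provisioner by uploading and running a script:

```ruby
# modules/server/tasks/server.cap (sketch, Capistrano 3 / SSHKit DSL)
namespace :server do
  desc 'Bootstrap a server with a shell script'
  task :bootstrap do
    on roles(:all) do |host|
      upload! 'modules/server/files/bootstrap.sh', '/tmp/bootstrap.sh'
      execute :sudo, :bash, '/tmp/bootstrap.sh'
    end
  end
end
```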
With this module in place, I can bootstrap any type of server by performing the following task:
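For example (task name assumed):

```shell
HOSTS=example.com cap staging server:bootstrap
```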
At this point, your Capistrano directory should look something like this.
Capistrano can now read an inventory of servers from Hiera, provision them with Vagrant, and bootstrap them with shell scripts. This next section introduces some SSH tasks:
- Uploading keys
- Creating SSH host entries
- Running arbitrary commands
The SSH module looks like this:
You can see the `ssh.cap` file here. Note the `upload_and_move` method. This is a new method added to the `helpers.rb` library, seen here. This method uploads a file and then uses `sudo` to move the file to its remote destination.
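A minimal sketch of such a helper, assuming SSHKit's `upload!` and `execute` are available in the calling context (the real implementation may differ):

```ruby
# Hypothetical upload_and_move helper: upload to a writable staging
# path, then move the file into place with sudo.
def upload_and_move(local, remote)
  tmp = "/tmp/#{File.basename(remote)}"
  upload! local, tmp
  execute :sudo, :mv, tmp, remote
end
```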
With this module in place, the following workflow is possible:
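A hypothetical version of that workflow (all task names are assumptions based on the list above):

```shell
cap staging ssh:upload_key        # upload an SSH key to the new hosts
cap staging ssh:add_host_entries  # create SSH host entries
cap staging ssh:run CMD='uptime'  # run an arbitrary command
```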
Sharing private SSH keys across hosts isn't the most secure thing to do, but it's simpler than generating a key on each new host and then configuring a service to use the new key. (Though such tasks were difficult to do in Puppet, they are easier with a task-based approach!)
Now for Puppet. I was eager to try out a "masterless" Puppet workflow and read everything I could find on the topic. After actually implementing this style of Puppet, I've found that it's more difficult than it initially sounds.
The difficulty comes from ensuring a server receives only the configuration data it needs. Blanketing all servers with a single Puppet repository that includes all site-specific data puts your entire infrastructure at risk: if a single server is compromised, your entire infrastructure configuration is exposed.
With this in mind, a masterless Puppet workflow that ships each server only its own data might even be more secure than a central Puppet server.
The Puppet Module
The Puppet module looks like this:
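One plausible layout for such a module (a sketch based on the per-stage Puppetfile and per-role manifests described in this section):

```
modules/puppet/
├── tasks/
│   └── puppet.cap
└── files/
    └── staging/
        ├── Puppetfile
        └── manifests/
            ├── base.pp
            └── web.pp
```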
The idea is to have one Puppetfile per stage. The Puppetfile contains the modules used in that stage, and you can pin the modules to specific versions. Each server may not use every module, but in my opinion, this is an acceptable waste of space. The alternative is to create one Puppetfile per server or per role.
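A hypothetical per-stage Puppetfile (module names and versions are placeholders; the internal `site` module matches the description later in this section):

```ruby
# files/staging/Puppetfile (sketch)
forge 'https://forge.puppetlabs.com'

mod 'puppetlabs/stdlib', '4.1.0'
mod 'site',
  :git => 'git@git.example.com:puppet/site.git'
```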
Next, each role has one Puppet manifest. This maps to the common "roles and profiles" pattern. Capistrano will apply each role manifest separately. All information that the server needs for that particular role must be in the manifest. The inability to share data between manifests can be an issue, but I see it as a way to enforce contained and non-conflicting roles.
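A minimal role manifest in the roles-and-profiles style (the role name `web` is hypothetical):

```puppet
# manifests/web.pp: everything the "web" role needs, self-contained
include site::roles::web
```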
Finally, each server gets a "base" role for "free": you do not have to specify this role in Hiera. If a `base.pp` manifest exists for a stage, then each server in that stage will have the "base" role applied.
You can look through the staging files here. `base.pp` applies a `site::roles::base` role. According to the Puppetfile, the `site` module is located on an internal server. The `site` module might contain sensitive information or information that you don't feel like sharing publicly.
The Hiera Module
Hiera plays a big role in my Puppet work and I wanted to continue using it with a masterless Puppet environment. Since Capistrano is already configured to use Hiera, it makes sense to continue using that module. I re-arranged the Hiera module structure to look like this:
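Based on that description, the structure is roughly (`web.yaml` stands in for the per-role YAML files):

```
modules/hiera/
├── hiera.yaml
└── data/
    ├── capistrano/
    │   └── staging.yaml
    └── staging/
        ├── default.yaml
        └── web.yaml
```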
I split the `data` directory into two: a `capistrano` directory holds Capistrano-specific data, while a `staging` directory corresponds to the "staging" stage. The `staging` directory contains a `default.yaml` file with global settings (such as for the "base" role) and a YAML file for each other role.
The Puppet Module (cont'd)
The `puppet.cap` task file is the most complex task file to date. It contains a lot of logic, but it should not be difficult to understand.
I based a lot of `puppet.cap` off of the existing work done in Supply Drop.
The first few tasks should be self-explanatory. The task that requires some notes is the
The first thing to notice about this task is that it introduces a few new methods:
- md5_diff: Compares the MD5 hash of a local file and a remote file.
- upload_and_move_if_changed: Uploads the local file to the remote server if the MD5 hashes differ.
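A sketch of what these two methods might look like, assuming SSHKit's `capture`, `upload!`, and `execute` are available in the calling context (the actual implementation may differ):

```ruby
require 'digest'

# Compare the MD5 of a local file against the remote copy.
def md5_diff(local, remote)
  local_md5  = Digest::MD5.file(local).hexdigest
  # `md5sum` prints "<hash>  <path>"; keep only the hash field.
  remote_md5 = capture(:md5sum, remote).split.first
  local_md5 != remote_md5
end

# Only transfer the file when the hashes differ.
def upload_and_move_if_changed(local, remote)
  return unless md5_diff(local, remote)
  tmp = "/tmp/#{File.basename(remote)}"
  upload! local, tmp
  execute :sudo, :mv, tmp, remote
end
```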
Unfortunately, these two methods can involve a lot of SSH chatter to do the comparison and make the decision. I plan to look into different ways of simplifying this, such as storing an MD5 cache locally or building an
Once you understand the new methods, the rest of the task becomes quite simple:
- Capistrano generates a `hiera.yaml` configuration file which includes all server roles.
- Capistrano uploads `<role>.pp` manifests if they exist.
- Capistrano uploads and runs the stage's
Masterless Puppet Workflow
With all this in place, here is my current workflow for a masterless Puppet setup:
- Add a role to a server definition in
- Create a Puppet manifest titled
- Deploy the files to a server with:
- Preview the Puppet run by doing:
- Apply the Puppet manifests to the server:
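A hypothetical version of those three steps (task names are assumptions; Supply Drop, which this work is based on, uses similarly named `noop` and `apply` tasks):

```shell
HOSTS=web1.example.com cap staging puppet:deploy  # push manifests and modules
HOSTS=web1.example.com cap staging puppet:noop    # preview (puppet apply --noop)
HOSTS=web1.example.com cap staging puppet:apply   # apply the manifests
```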
At this point, your Capistrano directory should look similar to this.
Masterless Puppet Thoughts
When I started using a masterless Puppet workflow, a few things were immediately clear:
- Simplified Puppet with no more central server, PuppetDB, or certificates
- Lack of exported resources and sharing information between servers
I'm enjoying the first point but will have to dedicate some time to solving the second point. My initial thoughts are to use facter-dot-d more and perhaps something like Juju.
This concludes my current Capistrano setup. It's still a work-in-progress but I've been able to use it as a daily tool.
There are a few areas where I'd like to improve on:
- More efficient file transfer (rsync queue?)
- Redistributable modules (`.gitignore` in the files directory?)
Update: I have refactored the modules described in this article into something more redistributable. See here.
Some tasks can be difficult to kill or cancel. I'm not too sure how to resolve this issue.
I'd love to hear comments, ideas, patches, or criticism.