- Introduction
- Installation
- Hiera Integration
- Modules
- Vagrant
- Bootstrapping Servers
- SSH
- Masterless Puppet
- Conclusion
Introduction
Since writing my last article, On Configuration Management, I've been researching different tools. I found several great ones on GitHub, such as:
- Sprinkle
- Babushka
- Provizioning
- Ronin
- Capistrano Puppeteer
- Pupcap
- Miyamoto
- Baptize
- Makistrano
- Rutty
- gsh
- Supply Drop
- Dust
- Deliver
- Roco
- dtools
All these tools had great ideas and I'm glad to have found them. Sprinkle and Babushka were particularly interesting because of their conditional testing after each task. This allows both a form of testing and idempotence checking in one step.
Unfortunately, none of them were exactly what I was looking for, so I began working out what my perfect tool would be. I realized that I was looking for more than a configuration management system. Configuration management (at the level of Puppet et al) is only a small part of infrastructure management. I wanted a tool that would help me corral all parts of infrastructure management into a single framework: the ability to apply Puppet manifests on one server, configure a Gluster volume on a second server, and create a Gitolite user on a third.
I defined three core values that I wanted this tool to have:
- Push-based
- Task-driven
- Extendible
Capistrano was a perfect fit for all three of those items. It's also well-known, which is a bonus, and being based on Rake is an added benefit.
My Capistrano setup is still a work in progress, but here's what I have so far:
Installation
Capistrano's README file does a great job explaining both how to install Capistrano and the basics of using it. If you aren't familiar with Capistrano, I recommend reading it along with the docs on Capistrano's homepage before continuing on.
Initial Usage
I made a slight change to the Gemfile in that I'm using the trunk version of sshkit. It has better error reporting than the currently released gem.
To begin, add an existing server to config/deploy/staging.rb. There are several examples in the stage template to use as a guide.
Next, add a file called lib/capistrano/tasks/utils.cap with the following contents:
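The original contents aren't reproduced above; a minimal version matching the uptime check described below would be a sketch like this, in the Capistrano 3 DSL:

```ruby
# lib/capistrano/tasks/utils.cap -- a minimal sketch; the original file
# may have contained more tasks.
namespace :utils do
  desc 'Show the uptime of each server'
  task :uptime do
    on roles(:all) do |host|
      # capture runs the command remotely and returns its output
      info "#{host}: #{capture(:uptime)}"
    end
  end
end
```

You would then invoke it as cap staging utils:uptime.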
At this point, you should have a file and directory structure similar to this.
You should be able to run the following command and see the uptime of the server (or servers) that you added to the deploy file:
Note: Capistrano requires that all hostnames are resolvable. You can either create a proper DNS entry or add an entry to /etc/hosts. Also, if you will be churning servers, make sure to clean up old SSH fingerprints or add the following to ~/.ssh/config:
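The snippet isn't reproduced above; the usual approach is to disable host-key checking for the churned hosts (the Host pattern is a placeholder, and note this weakens protection against man-in-the-middle attacks):

```
Host *.example.com
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
```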
Hiera Integration
Adding servers to the deploy file is fine, but I thought it would be better if they were stored in Hiera. I'm still a novice with Ruby, but with some help I got it working.
First, add the following to the Gemfile and run Bundler:
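The snippet isn't reproduced above; it was presumably just the hiera gem:

```ruby
# Gemfile
gem 'hiera'

# then install it:
#   bundle install
```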
Next, create the following files:
- lib/cap_hiera.rb: This file makes a few Hiera-based functions available in Capistrano.
- hiera/hiera.yaml: This file is a basic Hiera configuration. You can see where this file is called in cap_hiera.rb.
- hiera/data/staging.yaml: This file will contain server definitions for the "staging" stage.
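The contents of the two Hiera files aren't shown above; they were presumably along these lines (the data layout, key names, and the %{stage} hierarchy variable are assumptions):

```yaml
# hiera/hiera.yaml -- basic Hiera configuration
---
:backends:
  - yaml
:yaml:
  :datadir: hiera/data
:hierarchy:
  - "%{stage}"
  - default

# hiera/data/staging.yaml -- server definitions for the "staging" stage
---
servers:
  - name: staging1.example.com
    roles:
      - web
```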
You can verify that Hiera is working by running the following:
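The verification command isn't shown; the hiera command-line tool can query a configuration directly (the servers key and stage scope variable are assumptions):

```shell
hiera -c hiera/hiera.yaml servers stage=staging
```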
Once confirmed working, you can change the Capfile and add the following lines:
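The Capfile lines aren't reproduced above; they would resemble the following sketch (the exact name and signature of the cap_hiera.rb helper may differ):

```ruby
# Capfile -- load the Hiera helper and define servers from the inventory
require_relative 'lib/cap_hiera'

# hiera_get_server is the helper mentioned later in this article;
# its signature here is an assumption.
hiera_get_server(fetch(:stage)).each do |srv|
  server srv['name'], roles: srv['roles']
end
```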
You can now remove any server entries made in the config/deploy/staging.rb file. Unfortunately, you cannot remove the file itself, as Capistrano uses the name of the file as a stage definition.
You can also add global settings in Hiera. For example, add the following to hiera/data/default.yaml:
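The snippet isn't shown above; global settings might look like this (the keys are illustrative):

```yaml
# hiera/data/default.yaml -- settings shared by all stages
ssh_user: deploy
ssh_port: 22
```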
And add the following to Capfile:
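That snippet isn't shown either; presumably it pulled those globals out of Hiera and exposed them as Capistrano settings (the hiera_get helper name and the keys are assumptions):

```ruby
# Capfile -- expose Hiera globals as Capistrano settings (sketch)
set :ssh_user, hiera_get('ssh_user')
set :ssh_port, hiera_get('ssh_port')
```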
At this point, you should have a Capistrano directory that looks similar to this.
Modules
So far, this Capistrano installation has a utils.cap task file and some Hiera files. Since I'll be adding more features, I wanted to organize everything into quasi "modules". At the moment, this module structure is not suited for redistribution; it's more a way to organize local files.
Hiera
Here's how the Hiera configuration looks after I converted it to a "module":
cap_hiera.rb, hiera.yaml, and Capfile will all need to be modified to account for the new paths.
Utils
You can move the utils.cap task into its own module:
Again, Capfile will need to be modified to import the tasks at the new location:
If you modified all files correctly, then the following should work as it did before:
At this point, you should have a directory structure similar to this.
Module Caveats
There are a lot of hard-coded paths in the modules. They also contain site-specific data, so public distribution is a bad idea – especially if the module contains sensitive information.
Vagrant
The next feature is Vagrant support. Being able to add existing servers to Hiera is fine, but I want to be able to add new servers to Hiera and have Vagrant create those servers.
I use Vagrant with the vagrant-openstack-plugin, so this section will be specific to that. It's easy to swap this configuration out for another cloud plugin, and it should not be hard to adapt for basic VirtualBox.
Hiera
In the default.yaml file, I have the following:
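That snippet isn't reproduced here; based on the plugin's settings it was presumably similar to this (all values are placeholders):

```yaml
# hiera/data/default.yaml -- Vagrant defaults merged into each server
:vagrant:
  :cloud: mycloud            # arbitrary name, per the text below
  :flavor: m1.small
  :image: precise-server
  :endpoint: https://keystone.example.com:5000/v2.0/tokens
  :keypair_name: deploy
  :ssh_username: ubuntu
```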
These attributes map to settings used in the Vagrant OpenStack plugin. The :cloud attribute is an arbitrary name. Hiera merges these default settings into the defined servers via the hiera_get_server method.
Vagrant Module
The Vagrant module looks like this:
Notice the template, which you can view here. The values inside the template are filled in with the corresponding server attributes. A static template like this makes building Vagrantfiles somewhat inflexible; I hope to fix that at some point.
The vagrant.cap file contains three tasks for managing Vagrant machines. The vagrant:new task uses a method called render_template. This is a "helper" function defined in a new file, modules/utils/lib/helpers.rb. Add helpers.rb to Capfile:
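The Capfile line isn't reproduced above; it was presumably a plain require of the helper file (the path follows the module layout described earlier):

```ruby
# Capfile -- load the shared helper methods
require_relative 'modules/utils/lib/helpers'
```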
There are also some methods to colourize output. These require the colorize gem, so you should add it to the Gemfile.
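To illustrate what a render_template helper does, here is a minimal, self-contained sketch using ERB; the real helper in helpers.rb may differ:

```ruby
require 'erb'
require 'ostruct'

# Sketch of render_template: binds a hash of server attributes to an
# ERB template and returns the rendered text.
def render_template(template, attributes)
  context = OpenStruct.new(attributes)
  ERB.new(template).result(context.instance_eval { binding })
end

template = 'Host <%= name %> uses image <%= image %>'
puts render_template(template, 'name' => 'example.com', 'image' => 'precise64')
# => Host example.com uses image precise64
```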
You should now be able to create a new Vagrant virtual machine by calling vagrant:new. If you run this task without a "host filter", Capistrano will create Vagrant virtual machines for every server defined. To create a single server, do the following:
If all was successful, a Vagrantfile will be available under modules/vagrant/files/mycloud/example.com. You can change into that directory and run:
Or use the vagrant:up task instead of vagrant:new.
At this point, your Capistrano directory should look similar to this.
Bootstrapping Servers
Vagrant can provision virtual machines with services such as shell, Puppet, or Chef. But I also want to provision other types of servers. By using Capistrano, I can create a bootstrap task that emulates the Vagrant shell provisioner. Now I can bootstrap bare-metal servers as well as virtual machines created outside of Vagrant.
I decided to place the "bootstrapping" task under a new module called "server":
The server.cap file looks like this and the ubuntu.sh bootstrap script looks like this.
With this module in place, I can bootstrap any type of server by performing the following task:
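The command itself isn't shown above; given the module name, the task was presumably something like (task and hostname are assumptions):

```shell
cap staging server:bootstrap HOSTS=example.com
```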
At this point, your Capistrano directory should look something like this.
SSH
Capistrano can now read an inventory of servers from Hiera, provision them with Vagrant, and bootstrap them with shell scripts. This next section introduces some SSH tasks:
- Uploading keys
- Creating SSH host entries
- Running arbitrary commands
The SSH module looks like this:
You can see the ssh.cap file here. Note the upload_and_move method. This is a new method added to the helpers.rb library, seen here. It uploads a file and then uses sudo to move the file to its remote destination.
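In SSHKit terms, upload_and_move likely resembles the following sketch (the real helper may differ):

```ruby
# helpers.rb -- sketch of upload_and_move. upload! writes to a temp
# path first because the SSH user usually can't write directly to
# root-owned destinations; sudo mv then puts the file in place.
def upload_and_move(local, remote)
  tmp = "/tmp/#{File.basename(remote)}"
  upload! local, tmp
  execute :sudo, :mv, tmp, remote
end
```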
With this module in place, the following workflow is possible:
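The workflow commands aren't reproduced above; based on the three tasks listed earlier, they would look something like this (all task names are assumptions):

```shell
cap staging ssh:upload_key HOSTS=example.com   # upload an SSH key
cap staging ssh:host_entry HOSTS=example.com   # create an SSH host entry
cap staging ssh:run CMD='uptime'               # run an arbitrary command
```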
Sharing private SSH keys across hosts isn't the most secure practice, but it's simpler than generating a key on each new host and then configuring a service to use it. (Such tasks were difficult to do in Puppet; they are now easier with a task-based approach!)
Masterless Puppet
Now for Puppet. I was eager to try out a "masterless" Puppet workflow and read everything I could find on the topic. After actually implementing this style of Puppet, I've found that it's more difficult than it initially sounds.
The difficulty comes from ensuring a server receives only the configuration data it needs. Blanketing all servers with a single Puppet repository that includes all site-specific data puts your entire infrastructure at risk: if a single server is compromised, your entire infrastructure configuration is exposed.
With this thought in mind, a carefully scoped masterless Puppet workflow might even be more secure than a central Puppet server, since no single location holds the entire site's configuration.
The Puppet Module
The Puppet module looks like this:
The idea is to have one Puppetfile per stage. The Puppetfile lists the modules used in that stage, and you can pin specific versions of those modules. Not every server will use every module, but in my opinion this is an acceptable waste of space. The alternative is to create one Puppetfile per server or per role.
Next, each role has one Puppet manifest. This maps to the common "roles and profiles" pattern. Capistrano will apply each role manifest separately. All information that the server needs for that particular role must be in the manifest. The inability to share data between manifests can be an issue, but I see it as a way to enforce contained and non-conflicting roles.
Finally, each server gets a "base" role for "free". You do not have to specify this role in Hiera. If a base.pp manifest exists for a stage, then each server in that stage will have the "base" role applied.
You can look through the staging files here.
Notice that base.pp applies a site::roles::base role. According to the Puppetfile, the site module is located on an internal server. The site module might contain sensitive information, or information that you don't feel like sharing publicly.
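The Puppetfile itself isn't reproduced above; in r10k syntax it would resemble this (module names, versions, and the internal Git URL are placeholders):

```ruby
# puppet/files/staging/Puppetfile
forge 'https://forge.puppetlabs.com'

mod 'puppetlabs/stdlib', '4.1.0'

# site-specific module hosted on an internal server
mod 'site',
  :git => 'git@git.example.com:puppet/site.git'
```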
The Hiera Module
Hiera plays a big role in my Puppet work and I wanted to continue using it with a masterless Puppet environment. Since Capistrano is already configured to use Hiera, it makes sense to continue using that module. I re-arranged the Hiera module structure to look like this:
I split the data directory in two: a capistrano directory holds Capistrano-specific data, and a staging directory corresponds to the "staging" stage. Inside the staging directory is a default.yaml file, which contains global settings (such as for the "base" role), plus a YAML file for each other role.
The Puppet Module (cont'd)
The puppet.cap task file is the most complex task file to date. It contains a lot of logic, but it should not be difficult to understand.
I based a lot of puppet.cap off of the existing work done in Supply Drop.
The first few tasks should be self-explanatory. The task that requires some notes is the puppet:deploy task.
The first thing to notice about this task is that it introduces a few new methods:
- md5_diff: Compares the MD5 hashes of a local file and a remote file.
- upload_and_move_if_changed: Uploads the local file to the remote server if the MD5 hashes differ.
Unfortunately, these two methods can involve a lot of SSH chatter to do the comparison and make the decision. I plan to look into ways of simplifying this, such as storing an MD5 cache locally or building an rsync queue.
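The local half of that comparison can be sketched like this; in the real puppet.cap, the remote digest would come from running md5sum over SSH, so it is simulated here (the helper name is hypothetical):

```ruby
require 'digest'
require 'tempfile'

# Sketch of the md5_diff idea: upload only when digests differ.
# remote_digest stands in for the output of `md5sum` captured over SSH.
def needs_upload?(local_path, remote_digest)
  Digest::MD5.file(local_path).hexdigest != remote_digest
end

file = Tempfile.new('manifest')
file.write("include site::roles::base\n")
file.close

same = Digest::MD5.file(file.path).hexdigest
puts needs_upload?(file.path, same)        # => false (unchanged, skip upload)
puts needs_upload?(file.path, 'deadbeef')  # => true  (digest differs, upload)
```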
Once you understand the new methods, the rest of the task becomes quite simple:
- Capistrano generates a hiera.yaml configuration file which includes all server roles.
- Capistrano uploads default.yaml, base.pp, or <role>.pp if they exist.
- Capistrano uploads and runs the stage's Puppetfile.
I use r10k to control the Puppetfile because I found librarian-puppet too strict with regard to certain Puppet modules' Modulefile.
Masterless Puppet Workflow
With all this in place, here is my current workflow for a masterless Puppet setup:
- Add a role to a server definition in hiera/files/data/capistrano/stage.yaml.
- Create a Puppet manifest titled role.pp in puppet/files/stage.
- Deploy the files to the server.
- Preview the Puppet run.
- Apply the Puppet manifests to the server.
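The commands for those last three steps aren't shown above; they would look something like this (the puppet:deploy, puppet:noop, and puppet:apply task names are assumptions based on the workflow described):

```shell
cap staging puppet:deploy HOSTS=web1.example.com   # push manifests, data, modules
cap staging puppet:noop   HOSTS=web1.example.com   # preview the Puppet run
cap staging puppet:apply  HOSTS=web1.example.com   # apply for real
```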
At this point, your Capistrano directory should look similar to this.
Masterless Puppet Thoughts
When I started using a masterless Puppet workflow, a few things were immediately clear:
- Simplified Puppet: no more central server, PuppetDB, or certificates
- No more exported resources or easy sharing of information between servers
I'm enjoying the first point but will have to dedicate some time to solving the second point. My initial thoughts are to use facter-dot-d more and perhaps something like Juju.
Conclusion
This concludes my current Capistrano setup. It's still a work-in-progress but I've been able to use it as a daily tool.
There are a few areas where I'd like to improve on:
- More efficient file transfer (rsync queue?)
- Redistributable modules (a .gitignore in the files directory?)
Update: I have refactored the modules described in this article into something more redistributable. See here.
Some tasks can be difficult to kill or cancel. I'm not too sure how to resolve this issue.
I'd love to hear comments, ideas, patches, or criticism.