terrarum

apt Infrastructure with Puppet

30 Nov 2012

Introduction

This article will explain various methods of controlling apt (the Debian and Ubuntu package manager) across several servers with Puppet and the benefits of doing so.

Caching apt Packages

Each time apt downloads a package, it is locally cached on the server until you explicitly clear the cache. This gives you the ability to reinstall the package without having apt download it again.

On individual servers, this works fine. But what about groups of several servers? If, for example, you ran the following on 15 different servers to install tmux:

$ apt-get install tmux

tmux would be downloaded from a public server 15 times. This is inefficient, especially if the package is large.

A solution to this is to designate a server to be an apt proxy server. All other servers will go through this server to download any packages. If the package does not exist on the apt proxy server, it will be downloaded from a public server. Now when tmux is installed on 15 different servers, it is only downloaded from a public server once.

apt-cacher-ng

A good application for apt proxying is apt-cacher-ng (acng for short).

To set up acng with Puppet, either install the module with the Puppet module tool or clone it directly from GitHub:

$ puppet module install jtopjian/acng

or

$ git clone https://github.com/jtopjian/puppet-acng acng

Next, apply the following class to the "apt server" (the "apt server" is the server that will provide all apt services throughout this article):

class { 'acng::server': }

When acng is installed, it will automatically use the contents of your sources.list file as the upstream apt repository. This works for most use cases.

On every other server that will use the apt server as a proxy, apply the following manifest:

class { 'acng::client':
  server => 'apt.example.com',
}
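
For reference, pointing apt at a proxy by hand comes down to a single apt.conf setting. The sketch below assumes apt-cacher-ng is listening on its default port, 3142. The function name and the 01proxy file name are illustrative, and this is not necessarily what the acng::client class writes:

```shell
#!/bin/sh
# Sketch: print the apt.conf snippet that routes apt's HTTP downloads
# through a proxy.
# Assumptions: apt-cacher-ng listens on its default port (3142); the
# function name and the 01proxy file name are illustrative only.
apt_proxy_conf() {
  proxy_host="$1"
  proxy_port="${2:-3142}"
  printf 'Acquire::http::Proxy "http://%s:%s";\n' "$proxy_host" "$proxy_port"
}

# Example (as root):
#   apt_proxy_conf apt.example.com > /etc/apt/apt.conf.d/01proxy
```

Removing that file returns the server to downloading directly from its configured repositories.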

Mirroring apt Repositories

acng is useful if your servers use the same subset of packages. If you find that your servers are downloading a wide variety of packages – maybe even most of the available packages in a repository – it would make more sense to just mirror the entire repository instead of letting acng inevitably do it.

apt-mirror

apt-mirror is an application that provides an easy way to mirror one or more repositories. To configure it with Puppet, first install the apt_mirror module:

$ puppet module install jtopjian/apt_mirror

or

$ git clone https://github.com/jtopjian/puppet-apt_mirror apt_mirror

Next, apply the base manifest:

class { 'apt_mirror': }

The apt_mirror module comes with a defined type to add individual mirrors. For example, to mirror the main and restricted components for Ubuntu Precise, use the following:

apt_mirror::mirror { 'ubuntu precise':
  mirror     => 'archive.ubuntu.com',
  os         => 'ubuntu',
  release    => 'precise',
  components => ['main', 'restricted'],
}

apt_mirror installs a cron entry that will run once a day. It creates new mirrors that do not yet exist or updates existing mirrors with any changes.

In order to fully utilize the mirror, you will need to make it accessible to all of your servers. I recommend following this article for instructions on how to do this manually. I'll also describe how to do this with Puppet later in this article.

Creating Your Own apt Repository

There are several reasons to create your own apt repository:

reprepro

Creating an apt repository is not the most straightforward task. The following page contains a very long list of tools to assist. The most popular tool seems to be reprepro which is what I have chosen to use.

For a great overview on how to use reprepro, please see this article.

To configure reprepro with Puppet, first install the module:

$ puppet module install jtopjian/reprepro

or

$ git clone https://github.com/jtopjian/puppet-reprepro reprepro

The following manifest will install reprepro and set up the reprepro user:

$basedir = '/var/lib/apt/repo'

class { 'reprepro':
  basedir => $basedir,
}

Once reprepro is installed, GPG will need to be configured. Modern apt versions use GPG to sign both packages and files that maintain the apt repository. In my opinion, it is difficult to use Puppet to maintain GPG keys, so I recommend doing this part outside of Puppet.

First, change to the reprepro user:

$ su - reprepro

Next, generate a GPG key:

$ gpg --gen-key

Follow and answer the prompts. I recommend not using a passphrase. Although it's insecure to do such a thing, GPG is just being used to sign arbitrary apt repository files – you can always delete the repository and re-create it if you feel your key has become compromised.

Once the key has been created, export it and store it in the reprepro Puppet module:

$ gpg --export --armor foo@bar.com > /etc/puppet/modules/reprepro/files/localpkgs.gpg

Now to continue with Puppet. The reprepro Puppet module has the ability to create and maintain several apt repositories. Each repository contains one or more distributions. For most cases, and for the purpose of this document, only one of each will be created.

First, the repository:

reprepro::repository { 'localpkgs':
  ensure  => present,
  basedir => $basedir,
  options => ['basedir .'],
}

And next the distribution:

reprepro::distribution { 'precise':
  basedir       => $basedir,
  repository    => 'localpkgs',
  origin        => 'Foobar',
  label         => 'Foobar',
  suite         => 'stable',
  architectures => 'amd64 i386',
  components    => 'main contrib non-free',
  description   => 'Package repository for local site maintenance',
  sign_with     => 'F4D5DAA8',
  not_automatic => 'No',
}

(The sign_with value is the ID of your key, which can be obtained by running gpg --list-keys.)
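
If you would rather not read the key ID off the screen, gpg's machine-readable listing can be filtered instead. This is a minimal sketch that assumes the short, 8-digit form of the ID is wanted, as in the manifest above. The function name is made up for illustration:

```shell
#!/bin/sh
# Sketch: read `gpg --list-keys --with-colons` output on stdin and print
# the short (last 8 hex digits) ID of each public key.
# In the colon-delimited format, field 1 is the record type and field 5
# is the long key ID. The function name is illustrative only.
short_key_ids() {
  awk -F: '$1 == "pub" { print substr($5, length($5) - 7) }'
}

# Example (as the reprepro user):
#   gpg --list-keys --with-colons | short_key_ids
```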

Once these manifests have been applied to the apt server, you can begin adding packages to your custom repository. The reprepro Puppet module installs a cron entry that will monitor the $basedir/$repository/tmp/$distribution directory for any *.deb files. If it finds any, it will add the files to the repository and clean the tmp dir out.
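
To make the cron job's behaviour concrete, here is a rough manual equivalent built on reprepro's includedeb command. It is a sketch of the described behaviour, not the module's actual script. The function name and the REPREPRO override are assumptions added for illustration:

```shell
#!/bin/sh
# Sketch: add every .deb waiting in an incoming directory to a reprepro
# repository, then remove it from that directory.
# This approximates what the cron job is described as doing; it is NOT
# the module's actual script. REPREPRO may be overridden (for example
# with "echo") to get a dry run.
REPREPRO="${REPREPRO:-reprepro}"

import_debs() {
  repo_dir="$1"    # e.g. /var/lib/apt/repo/localpkgs
  codename="$2"    # e.g. precise
  incoming="$repo_dir/tmp/$codename"

  for deb in "$incoming"/*.deb; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$deb" ] || continue
    # Delete the package from the incoming directory only if reprepro
    # accepted it.
    $REPREPRO -b "$repo_dir" includedeb "$codename" "$deb" && rm -f "$deb"
  done
}

# Example (as the reprepro user):
#   import_debs /var/lib/apt/repo/localpkgs precise
```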

Making Your Repository Available

Now that you have an apt repository with packages, you will want all of your servers to be able to download those packages. You'll need to make your repository publicly accessible in order to do this. The easiest way is by installing a web server like Apache:

$ puppet module install puppetlabs/apache

Next, create a virtual host that uses your repository as its DocumentRoot:

class { 'apache': }

apache::vhost { 'localpkgs':
  port       => '80',
  docroot    => '/var/lib/apt/repo/localpkgs',
  servername => 'apt.example.com',
  require    => Reprepro::Distribution['precise'],
}

Also make sure your servers can access the GPG key you generated:

file { '/var/lib/apt/repo/localpkgs/localpkgs.gpg':
  ensure  => present,
  owner   => 'www-data',
  group   => 'reprepro',
  mode    => '0644',
  source  => 'puppet:///modules/reprepro/localpkgs.gpg',
  require => Apache::Vhost['localpkgs'],
}

Finally, tell your servers about your repository. You can use the Puppet apt module to do this:

$ puppet module install puppetlabs/apt

And use the following manifest on each server:

apt::source { 'localpkgs':
  location    => 'http://apt.example.com',
  release     => 'precise',
  repos       => 'main contrib non-free',
  key         => 'F4D5DAA8',
  key_source  => 'http://apt.example.com/localpkgs.gpg',
  include_src => false,
}

Now you should be able to perform an apt-cache search on each server for packages in your repository.
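
For comparison, the manual equivalent of that manifest is a one-line entry under /etc/apt/sources.list.d plus importing the repository key. The sketch below only builds the entry. The function name and the localpkgs.list file name are assumptions made for illustration:

```shell
#!/bin/sh
# Sketch: print a sources.list entry for a binary package repository.
# The function name is illustrative; the arguments mirror the location,
# release and repos parameters used in the manifest.
sources_list_entry() {
  location="$1"
  release="$2"
  shift 2
  # The remaining arguments are the components, joined with spaces.
  printf 'deb %s %s %s\n' "$location" "$release" "$*"
}

# Example (as root):
#   sources_list_entry http://apt.example.com precise main contrib non-free \
#     > /etc/apt/sources.list.d/localpkgs.list
#   apt-get update
```

apt will refuse to treat the repository as trusted until the exported key has been imported on the client as well, for example with apt-key add.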

Use-case: Puppet 2.7.x

One creative way of using both apt-mirror and reprepro is to ensure your servers are running the latest version of Puppet, but keep them at the 2.7.x versions.

Puppetlabs maintains their own apt repository at http://apt.puppetlabs.com. It's a great repository and it's always up to date. However, it makes no distinction between the 2.7 and 3.0 branches of Puppet. This means that unless you explicitly request a 2.7.x version of Puppet, you will get a 3.0 version.

Personally, I'm not ready to work with 3.0, but at the same time, I want my servers to have the latest 2.7 version.

The first step in easily doing this is to mirror the Puppetlabs apt repository:

apt_mirror::mirror { 'puppetlabs':
  mirror     => 'apt.puppetlabs.com',
  os         => '',
  release    => 'precise',
  components => ['main'],
}

Once your apt server has mirrored the repository, navigate to the mirror's location (most likely /var/spool/apt-mirror/mirror/apt.puppetlabs.com):

$ cd /var/spool/apt-mirror/mirror/apt.puppetlabs.com

Inside this directory, navigate further to pool/precise/main:

$ cd pool/precise/main

This directory will contain several directories, each named with a single letter. These directories will contain subdirectories for the various Puppet-related packages.

Sort through these directories and find the latest 2.7.x versions of each. Copy them to your /var/lib/apt/repo/$repository/tmp/$distribution directory. cron will pick them up and add them to your personal repository.
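
Sorting through the directories by hand gets tedious, so the search can be scripted. This is a minimal sketch that assumes GNU sort (for version sorting with -V) and file names of the form package_version_arch.deb. The function name is hypothetical, and version sort only approximates Debian's version ordering:

```shell
#!/bin/sh
# Sketch: print the path of the newest 2.7.x .deb of one package found
# under a mirror's pool directory.
# Assumptions: GNU sort is available (-V is a GNU extension) and files
# are named <package>_<version>_<arch>.deb. sort -V approximates Debian
# version ordering; dpkg --compare-versions is the authoritative check.
latest_27_deb() {
  pool_dir="$1"    # e.g. .../apt.puppetlabs.com/pool/precise/main
  package="$2"     # e.g. puppet
  find "$pool_dir" -type f -name "${package}_2.7.*.deb" | sort -V | tail -n 1
}

# Example:
#   cp "$(latest_27_deb /var/spool/apt-mirror/mirror/apt.puppetlabs.com/pool/precise/main puppet)" \
#     /var/lib/apt/repo/localpkgs/tmp/precise/
```

Run it once per package name you want to carry over, since each package lives in its own subdirectory.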

Now all servers will have access to the latest 2.7.x version of Puppet without having to pin a specific version. Whenever a new 2.7.x release comes out, apt-mirror will retrieve it within 24 hours. Just copy the new version to /var/lib/apt/repo/$repository/tmp/$distribution and it will be added to your personal repository.

Conclusion

This article described several different ways of configuring apt to help control how groups of servers access packages. Implementing these methods yields benefits of faster downloads as well as limiting or widening access to packages when needed. By using Puppet, you can easily configure a central server to host these roles or configure groups of servers to access the designated apt server.
