Caching RPMs with automirror
Introduction
In a previous article, I wrote about how to use pkg-cacher to cache requested RPM files. Since then, the website for pkg-cacher has become unavailable. Coincidentally, I wrote my own tool, automirror, that provides the same functionality. This article describes how to use it.
Update – July 2011
automirror currently does not work with CentOS 6 (and probably RHEL6). It looks like there are some incompatibilities between the HTTP::Proxy Perl module and libcurl (which yum uses).
I have opened a bug report for HTTP::Proxy and hopefully it will be resolved. In the meantime, I would suggest just using a small caching proxy such as Polipo.
Caching Packages
Caching is useful when several machines need the same files. If 10 servers require a package update, it makes more sense to download the package once and distribute it locally than to have each server request it from the remote source. This saves outside bandwidth and makes retrieving the package much faster, since it travels over the local connection.
automirror's caching system is simple: if a server requests a file with an extension that you have told automirror to watch for (for example, an .rpm file), automirror first checks whether a local copy is available. If not, it retrieves the file from the remote source and saves a copy. On the next request for the same .rpm file, it sends back the locally saved copy instead of downloading the remote copy again.
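The lookup described above can be sketched in a few lines of shell. This is only an illustration of the decision logic, not automirror's actual code: the CACHE_DIR and EXTENSIONS names, their default values, and the choice to key the cache on the file's base name are all assumptions made for the example.

```shell
# Hypothetical sketch of the cache lookup; not taken from automirror itself.
# CACHE_DIR and EXTENSIONS are illustrative names and values.
CACHE_DIR="${CACHE_DIR:-/tmp/automirror-sketch}"
EXTENSIONS="${EXTENSIONS:-rpm img}"

# Succeeds if the URL ends in one of the watched extensions.
is_cacheable() {
    for ext in $EXTENSIONS; do
        case "$1" in
            *."$ext") return 0 ;;
        esac
    done
    return 1
}

# Prints "pass", "hit", or "miss" for a requested URL.
cache_status() {
    url="$1"
    if ! is_cacheable "$url"; then
        echo "pass"   # extension is not watched: always fetched remotely
    elif [ -f "$CACHE_DIR/$(basename "$url")" ]; then
        echo "hit"    # a local copy exists and would be served
    else
        echo "miss"   # would be fetched remotely, then saved locally
    fi
}
```

Calling cache_status with a URL reports which of the three paths a request would take, without touching the network.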
Installation
automirror can be downloaded from its BitBucket repository. You can either use Mercurial to clone the repository or download a run-of-the-mill .zip, .gz, or .bz2 archive.
Inside the archive is the actual automirror script and a sample init script called rc.automirror. This init script has been tested on CentOS.
Install the automirror script by just copying it to any directory:
# cp automirror /usr/local/bin
To install the init script, first edit it and change any settings or variables necessary. You should only have to edit the options in the OPTIONS section.
By default, the EXTENSIONS option is set to cache .rpm and .img files to support automated Red Hat / CentOS installs.
You can install the init script by doing:
# cp rc.automirror /etc/init.d/automirror
# chkconfig --add automirror
You can now start the service by running:
# /etc/init.d/automirror start
or
# service automirror start
Using automirror
For reference, the various options to automirror can be read by doing:
# perldoc /usr/local/bin/automirror
or
# /usr/local/bin/automirror --help
But the init script should be all you need to control it.
There are two main ways to use automirror: normal mode and tunneling mode.
Normal Mode
Normal mode is configured by setting TUNNEL to 0 in the init script. In normal mode, simply configure any application (or a global environment variable such as http_proxy) to point to the host and port where automirror is listening. For example, if automirror is listening on 192.168.255.1:8080, configure yum to use it as a proxy by editing /etc/yum.conf and adding the following:
proxy=http://192.168.255.1:8080
Now when performing any action with yum, it will utilize the proxy and caching service.
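For tools that read the http_proxy environment variable rather than their own configuration file, the same address can be exported in the shell. This is a minimal sketch that reuses the example address from above. Whether a given application honors the variable is an assumption to verify for each tool.

```shell
# Point proxy-aware tools in this shell session at automirror.
# The address is the example used in this article; adjust it to your setup.
http_proxy="http://192.168.255.1:8080"
export http_proxy
```

The setting only lasts for the current session unless it is added to a shell profile.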
Tunnel Mode
Tunnel mode can be turned on by setting TUNNEL to 1 in the init script.
There are some applications and services, such as Kickstart or an automated run of Anaconda, that do not support a proxy service. You can use automirror in tunnel mode to get around this. For example, if there is an .rpm file located at
http://some.centos.mirror.org/path/to/package-1.2-3.rpm
you can still download the package via automirror by specifying:
http://192.168.255.1:8080/some.centos.mirror.org/path/to/package-1.2-3.rpm
In tunnel mode, automirror strips its own host from the URL and uses the next path segment as the host to connect to.
In Kickstart scripts, you can perform an automated install that uses automirror by specifying:
url --url=http://192.168.255.1:8080/some.centos.mirror.org/pub/centos/5/os/i386
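The URL convention used in tunnel mode is plain string manipulation, which the following shell helpers illustrate. The function names to_tunnel_url and from_tunnel_url are hypothetical and exist only for this example. They are not part of automirror, and they assume plain http:// URLs.

```shell
# Hypothetical helpers illustrating the tunnel-mode URL convention.

# Turn an upstream URL into the form requested through automirror.
to_tunnel_url() {
    proxy="$1"                 # host:port where automirror listens
    upstream="${2#http://}"    # strip the scheme from the upstream URL
    echo "http://$proxy/$upstream"
}

# Recover the upstream URL from a tunnel-mode URL.
from_tunnel_url() {
    rest="${1#http://}"        # drop the scheme
    rest="${rest#*/}"          # drop the automirror host:port segment
    echo "http://$rest"
}
```

For example, to_tunnel_url 192.168.255.1:8080 http://some.centos.mirror.org/pub/centos/5/os/i386 produces the URL used in the Kickstart line above.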
Conclusion
automirror is a flexible proxy script that can cache any file types you configure it to watch for. This article covered how to install and use automirror to cache .rpm packages.

Oliver said,
Doesn’t that mean you still need to sync the yum metadata over to the mirror? Or does automirror have some way of handling that?
Joe Topjian said,
Yes and yes.
You can choose to cache the metadata by configuring automirror to cache extensions of xml and bz2, but I have chosen not to do this by default.
From my testing, I chose to have each install get a fresh copy of the metadata while caching the actual rpm packages. I found this balance acceptable: it still speeds up installs and updates greatly.
The problems I ran into / thought of are:
There isn’t a way to expire the cache in automirror yet (I might add this later, or an external solution such as cron and find could be used), so if the repomd.xml file is cached, clients using the proxy will never get a fresh copy.
Also, I think installs performed at different points in time may need different metadata, and if the files have the same names, those installs would get the wrong files.
Thanks for your question.
Niels de Vos said,
Hi Joe,
I’ll check if automirror works on Fedora and maybe package it and have it included. For the packaging I would need to know the license and the project should include a COPYING file. Can you please share the licensing details?
Thanks, Niels
Joe Topjian said,
Hi Niels,
Thanks for your interest. Do let me know if you have any problems with automirror and Fedora.
The license is BSD. I have uploaded a copy of the BSD license to the repository.
Niels de Vos said,
Thanks Joe!
CentOS 6 Cobbler Server » Terrarum said,
[...] am currently using Polipo to handle this since my own automirror will not work with CentOS [...]
bushidoka said,
Well I was pretty excited about this, but I cannot seem to make it go. There was already something on port 8080 so I figured out how to make it sit on 8181. I was not sure where it was caching files but think I figured out it is in /var/cache/yum/automirror. On the first try nothing was in there (not even a directory there) so I made the directory, then removed the RPM in question to try again. Did my yum install and it downloaded it afresh – but there is still nothing in /var/cache/yum/automirror.
During the download I did “lsof -i 8181” and saw that yum was indeed talking to port 8181. This was all on the same host BTW – automirror running on the same machine I was doing yum on.
Log file looks like this :
[Mon Dec 19 10:21:56 2011] (13545) Processing: mirrorlist
[Mon Dec 19 10:21:56 2011] (13545) Using: external mirror for /
[Mon Dec 19 10:21:56 2011] (13545) Processing: repomd.xml
[Mon Dec 19 10:21:56 2011] (13545) Using: external mirror for /
[Mon Dec 19 10:31:36 2011] (13545) ERROR: Getting request failed: Client closed
Joe Topjian said,
Hello,
What distribution are you using? I have found that automirror does not work with CentOS/RHEL 6+.
If you are using a 6+ distribution, see the note here: http://terrarum.net/administration/caching-rpms-with-automirror.html#update
Joe
bushidoka said,
I’m using RHEL5. And I just checked from another box and it is still not working. You can see it is talking to my proxy on 8181 but it is not saving files in /var/cache/yum/automirror over there
yum 31358 root 5u IPv4 2444784 TCP solexa4:47814->solexa-db:8181 (ESTABLISHED)
yum 31358 root 10u IPv4 2444789 TCP solexa4:54907->xmlrpc.rhn.redhat.com:https (ESTABLISHED)
Joe Topjian said,
In your last comment you have
“ERROR: Getting request failed: Client closed”
If this is the last log entry before automirror stops working, I would suspect there is a compatibility issue with RHEL and RedHat’s repositories.
Have you tried running automirror with --verbose? Does it give any hints?
Also does the user running automirror have proper permissions to write to /var/cache/yum/automirror ?
As a side note, the location of the cached files can be specified in the rc script or with command-line flags.
Joe Topjian said,
Another note to make:
I was confident that this script would work well for any RPM repository, but after testing over several months, I only had long-term success with CentOS5.
I’ve since begun using Polipo for caching. It’s a very small proxy server, and although it caches all data, I’ve been able to get it to work on some systems where automirror failed.