Puppet trick: Running apt-get update only when needed

If you’re running an environment where your private Debian apt repository is constantly changing with new packages being added or upgraded, you may want to have those packages deployed via Puppet as soon as possible.

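For example, suppose your Puppet manifests install the latest available version of a package (ensure => latest) and rely on an exec resource named apt-get-update-private-repo, which runs apt-get update against the private repository first.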

This will trigger the apt-get-update-private-repo exec resource on every Puppet run, even if nothing has changed in the repository. It also marks the resource as changed in the report, so when you look at Puppet Dashboard you wonder why the servers have changed every 30 minutes even though their configuration hasn’t actually changed.

The solution: pre-check for changes in the Packages file

A Debian repository generates a Packages file, which is downloaded every time an apt-get update is executed. See the Debian wiki for information on how a Debian repository works.

The Packages file changes on the repository server whenever the repository changes, so before an apt-get update we can download the Packages file from the server and compare it with the local Packages file.
The modified apt-get-update exec resource performs this check in its onlyif parameter.

Thanks to the onlyif, the apt-get-update-private-repo command will only run if the Packages file has been downloaded and its contents are different from the local Packages file.

onlyif must return true (exit status 0) for the command to run. However, cmp returns true when the two files are identical, hence the ! added before cmp to invert the exit value.
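
The check that onlyif performs can be sketched in plain shell. This is only an illustration of the idea, not the original exec resource: the function name packages_changed, the repository URL and the file paths are all assumptions.

```shell
#!/bin/sh
# Sketch of the onlyif pre-check. Names, paths and URLs are hypothetical.

# Exit 0 (true) when the two Packages files differ, meaning apt-get update
# is worth running; exit non-zero when they are identical.
packages_changed() {
    downloaded="$1"   # Packages file just fetched from the repository
    local_copy="$2"   # Packages file apt saved during the last update
    # cmp -s exits 0 for identical files, so "!" inverts it for onlyif.
    # A missing local copy makes cmp fail, which also counts as "changed".
    ! cmp -s "$downloaded" "$local_copy"
}

# Example use (hypothetical repository URL and apt list file name):
#   wget -q -O /tmp/Packages http://apt.example.com/dists/stable/main/binary-amd64/Packages \
#     && packages_changed /tmp/Packages \
#          /var/lib/apt/lists/apt.example.com_dists_stable_main_binary-amd64_Packages \
#     && apt-get update
```

Chaining the download with && matters: if the download fails, the comparison never runs and apt-get update is skipped.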

In terms of network usage, it’s almost exactly the same as a normal apt-get update, but with this trick system administrators can be at peace and know when a Puppet run has actually changed the server.

Packages caching with Approx

If you have a number of Ubuntu or Debian servers, especially if many of them are running within a private LAN with no direct access to the internet, then you should consider having a central place to cache all packages.

Approx is my favourite tool for caching packages: it’s lightweight and very simple to configure.

Installation & configuration

First, let’s install the package with apt-get install approx.

Next, let’s do some configuration in /etc/approx/approx.conf.

The important part is the unique alias you give to each apt repository URL. In this case, ubuntu is the alias for the normal packages and secure is the alias for security packages.
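
As a rough sketch, an approx.conf with those two aliases could be written as shown here. The aliases ubuntu and secure come from the text above. The upstream mirror URLs and the helper name write_approx_conf are assumptions, so substitute the repositories you actually use.

```shell
#!/bin/sh
# Write a two-alias approx.conf to the path given as $1.
# Each line is "<alias> <repository URL>". The URLs are assumed to be the
# stock Ubuntu mirrors; the function name is hypothetical.
write_approx_conf() {
    cat > "$1" <<'EOF'
ubuntu  http://archive.ubuntu.com/ubuntu
secure  http://security.ubuntu.com/ubuntu
EOF
}
```

Running write_approx_conf /etc/approx/approx.conf as root would replace the existing file, so keep a backup if it already holds other aliases.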

Now edit /etc/apt/sources.list on all the servers inside the LAN:

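Replace each repository URL with the approx server’s address, approx’s port 9999, and then the alias specified in approx.conf. With a server called approx-server (a placeholder name), the normal packages URL becomes http://approx-server:9999/ubuntu.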
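
The rewrite can be scripted instead of edited by hand. This is a minimal sketch that assumes the clients currently use the stock archive.ubuntu.com and security.ubuntu.com URLs and that the aliases are ubuntu and secure. The function name point_at_approx and the host name in the example are hypothetical.

```shell
#!/bin/sh
# Rewrite sources.list lines read from stdin so that they point at approx,
# and print the result on stdout. $1 is the approx server's host name.
# Assumes the stock Ubuntu mirror URLs; regional mirrors would need their
# own patterns.
point_at_approx() {
    approx_host="$1"
    sed -e "s|http://archive.ubuntu.com/ubuntu|http://${approx_host}:9999/ubuntu|" \
        -e "s|http://security.ubuntu.com/ubuntu|http://${approx_host}:9999/secure|"
}
```

For example, point_at_approx approx.example.lan < /etc/apt/sources.list > /tmp/sources.list.new writes a candidate file that you can review before copying it into place.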

Now you’re ready to run apt-get update and install packages from the proxy server!

Where’s the approx daemon?

It can be confusing to work out which daemon runs approx. There isn’t a dedicated one: approx is invoked by inetd.
So if you want to start or stop approx, you need to start or stop the openbsd-inetd service (on Ubuntu 12.04), for example with service openbsd-inetd restart.

Refreshing the cache

Running an apt-get update on a machine that points to Approx will trigger Approx to check for new packages.

Using approx with a proxy server

There may be a case where the server running approx doesn’t have direct access to the internet and must go via a proxy server. Approx does not have any proxy settings in approx.conf.
The trick is to export an http_proxy environment variable for the inetd service (assuming no other service invoked by inetd needs to stay off the proxy).

Under Ubuntu 12.04, edit inetd’s defaults file, /etc/default/openbsd-inetd, and export the variable there.
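
The addition to that file might look like this. The proxy host and port are placeholders, not values from a real setup.

```shell
# Hypothetical addition to /etc/default/openbsd-inetd.
# Processes started by inetd, including approx, inherit this variable.
# Replace the placeholder host and port with your own proxy's address.
export http_proxy="http://proxy.example.lan:3128"
```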

Restart the inetd service and you’re done!

Beware of system wide environment variables!

I’ve come across a case where someone put the http_proxy environment variable inside the system-wide environment file, /etc/environment.
This stopped approx from working at all, because on the approx server apt-get no longer fetched packages from approx directly. Instead, it tried to reach the approx server’s address through the proxy server, which is obviously wrong!

Make sure your proxy is running before doing an apt-get update

I’ve also come across a situation where a user attempted to install a package while the proxy server wasn’t running. As a consequence, Approx cached a 0-byte .deb package locally. So periodically check for that kind of bad file.
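
One way to run that periodic check is a small find wrapper. This sketch assumes the cache lives in /var/cache/approx, the packaged default, and the function name find_empty_debs is hypothetical.

```shell
#!/bin/sh
# List zero-byte .deb files under an approx cache directory.
# $1 is the cache directory; /var/cache/approx is assumed when it is omitted.
find_empty_debs() {
    cache_dir="${1:-/var/cache/approx}"
    # -size 0 matches only files that contain no data at all.
    find "$cache_dir" -type f -name '*.deb' -size 0
}
```

Calling find_empty_debs with no arguments prints any bad files in the default cache. Once you have reviewed the list, the files can be deleted so that approx fetches them again on the next request.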