Kristian Lyngstøl's Blog

Caching a Debian repository with Varnish

Posted on 2010-04-19

In the past I've both run my own Debian mirror and used apt-cacher to reduce the number of duplicate package downloads I do at home when I upgrade multiple computers (virtual or otherwise). Mirroring sort of defeated the purpose, as I used far more bandwidth than I needed, and apt-cacher was not terribly robust. The reason I want this is twofold: 1. I'm a geek. 2. I often have 5-10 Debian-based hosts (virtual+physical) at home.

So now I use Varnish instead.

Debian archives (in this context, Debian and Ubuntu are the same) consist of two parts: information about packages and the packages themselves, i.e. "apt-get update" versus "apt-get install". I cache "^/debian/.*\.deb$" for 21 days (an arbitrary number) and everything else in "^/debian" for 12 hours.

I've set up repo.kristian.int to point to the web host, which means that if I for some reason don't want to use my local Varnish cache, I can just point that name to a real Debian mirror and the clients won't notice the difference.

Just for the heck of it, Varnish is set up to use 50GB of "-s malloc" storage. It's fun to have disk space.

Pros:
- No maintenance needed; it's just an HTTP cache.
- Reduced bandwidth usage.
- Faster local upgrades.

Cons:
- No streaming delivery yet, so a cache miss adds a delay. Since the final delivery is gigabit, this is hardly a real issue, and streaming delivery is on the todo list for Varnish.
- Restarting Varnish flushes the cache. I will be using persistence to solve this.

VCL:

backend lo {
        .host = "127.0.0.1";
        .port = "8080";
}

backend debian {
        .host = "ftp.no.debian.org";
        .port = "80";
}

sub vcl_recv {
        if (req.url == "/purgeall") {
                purge("req.url ~ .*");
                error 200 "Purged all";
        }
        if (req.url ~ "^/debian/.*") {
                set req.backend = debian;
        } else {
                set req.backend = lo;
        }
}

sub vcl_fetch {
        if (req.url ~ "^/debian") {
                if (req.url ~ "\.deb$") {
                        set beresp.ttl = 21d;
                } else {
                        set beresp.ttl = 12h;
                }
        } elsif (req.url ~ "^/slaughter") {
                set beresp.ttl = 1s;
        } elsif (req.url ~ "^/munin/") {
                set beresp.ttl = 30s;
        } else {
                set beresp.ttl = 10s;
                set beresp.cacheable = false;
        }
}

Notes on the VCL

The VCL is essentially a copy/paste of the VCL I actually use, and it's running on an internal Varnish server. I'm not protecting things like /purgeall, which you should do if you copy this.
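One way to protect /purgeall is an ACL check before the purge. This is a minimal sketch in the same Varnish 2.1-era syntax as the VCL above, not something I run myself. The ACL name "purgers" and the 192.168.0.0/24 network are assumptions for illustration, so substitute your own trusted hosts. The block would replace the /purgeall check at the top of vcl_recv.

```vcl
# Hypothetical ACL: the name and the addresses are placeholders.
acl purgers {
        "localhost";
        "192.168.0.0"/24;
}

sub vcl_recv {
        if (req.url == "/purgeall") {
                # Refuse the request unless the client is in the ACL.
                if (!client.ip ~ purgers) {
                        error 405 "Not allowed";
                }
                purge("req.url ~ .*");
                error 200 "Purged all";
        }
}
```

Clients outside the ACL get a 405 and the cache is left alone. Everyone else sees the same behaviour as before.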

Also note that I consistently fall through to the default VCL instead of trying to outsmart it. That's how I recommend you write a VCL file, as the default VCL handles cookies, authorization headers and strange HTTP requests (e.g. TRACE) in a sensible way, in addition to adding X-Forwarded-For logic.

Adding support for Ubuntu would just mean adding a backend for an Ubuntu mirror and copying the logic for ^/debian to ^/ubuntu.
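A rough sketch of what that could look like, untested and in the same Varnish 2.1-era syntax as above. The backend name "ubuntu" and the mirror host archive.ubuntu.com are assumptions, so pick whichever mirror is closest to you. It relies on the "debian" and "lo" backends already defined in the VCL above, and the two subroutines would replace the backend selection in vcl_recv and the ^/debian branch in vcl_fetch.

```vcl
backend ubuntu {
        # Assumed mirror; substitute your preferred Ubuntu mirror.
        .host = "archive.ubuntu.com";
        .port = "80";
}

sub vcl_recv {
        # Route by path prefix; everything else goes to the local web server.
        if (req.url ~ "^/debian/.*") {
                set req.backend = debian;
        } elsif (req.url ~ "^/ubuntu/.*") {
                set req.backend = ubuntu;
        } else {
                set req.backend = lo;
        }
}

sub vcl_fetch {
        if (req.url ~ "^/(debian|ubuntu)") {
                if (req.url ~ "\.deb$") {
                        # Packages do not change once published.
                        set beresp.ttl = 21d;
                } else {
                        # Package lists and other metadata.
                        set beresp.ttl = 12h;
                }
        }
}
```

Clients would then use the same host name with /ubuntu as the path in their sources.list, just as the Debian hosts use /debian.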