Posted on 2010-07-30
We just released Varnish 2.1.3. And it's good.
I am a person who values stability and predictability over pretty much everything else. I will gladly use 3-year-old software over the latest version if it works well. I only upgrade my desktop if I see a really good reason to do it. I would rather wait 2 years for a new package to enter my favorite distribution than grab it from a source package to get a new feature.
Why do I tell you this? Because I am now ready to truly and whole-heartedly recommend Varnish 2.1.3. Varnish 2.0.6 was a great release. It was stable, it worked well, and we knew how to get the most out of it. There were no unknowns. When we released Varnish 2.1.0, we knew it was going to take a release or two to get the 2.1 series equally good. I finally believe we are there, and that using Varnish 2.1.3 is (almost) as safe as using 2.0.6. This is the version I will be recommending to our customers.
Varnish 2.1 represents roughly two years of development. The performance of Varnish 2.1.3 is about the same as that of Varnish 2.0.6, with a few exceptions.
First, we now use the "critbit" hashing algorithm instead of "classic" as the default. This switch revealed a few weaknesses in the implementation, which we gradually resolved between Varnish 2.1.0 and 2.1.3. The benefit of critbit is that it requires far less locking to deliver content. It scales better with large data sets and is generally a nice thing to have.
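If you want to compare the two while testing, the hash algorithm is selected with varnishd's -h option. A minimal invocation could look something like this; the listen address and backend are just placeholders:

    # critbit is now the default, but it can be selected explicitly
    varnishd -a :80 -b localhost:8080 -h critbit

    # or fall back to the old default
    varnishd -a :80 -b localhost:8080 -h classic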
We have also refactored much of the code related to directors, which allowed us to add several new directors, including directors that pick a backend based on source IP, URL hash and so on.
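As a taste of what that looks like, here is a rough sketch of a client (source-IP based) director in VCL. The backends, weights and names are made up for the example:

    backend www1 { .host = "192.0.2.11"; .port = "80"; }   # example backends
    backend www2 { .host = "192.0.2.12"; .port = "80"; }

    # The client director picks a backend based on the client's identity
    # (the source IP by default), so a given client sticks to one backend.
    director sticky client {
        { .backend = www1; .weight = 1; }
        { .backend = www2; .weight = 1; }
    }

    sub vcl_recv {
        set req.backend = sticky;
    }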
With Varnish 2.1.3 we also added a "log" command to VCL, which allows you to add generic log messages to the SHM-log through VCL, a much-requested feature.
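In its simplest form it looks something like this; the URL check is just an example:

    sub vcl_recv {
        if (req.url ~ "^/admin/") {
            # Writes a line to the shared memory log, visible in varnishlog
            log "VCL: admin request received";
        }
    }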
We have also added basic support for Range headers. This is not the smartest version of it around, but it fits the KISS approach of Varnish. When Range support is enabled, Varnish will fetch the entire object upon a Range request, but deliver only the range that the client requested. This allows Varnish to cache the entire object, but deliver it in smaller segments.
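Range support is off by default and, if memory serves, is toggled with a run-time parameter, along these lines (assuming the default management port):

    # at startup
    varnishd ... -p http_range_support=on

    # or at run-time through the management interface
    varnishadm -T localhost:6082 param.set http_range_support on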
Another important change between Varnish 2.0 and 2.1 is the removal of the object workspace. The most immediate effect is that you have to write "beresp" instead of "obj" in the vcl_fetch part of your VCL. The bigger consequence is that there is no longer an obj_workspace parameter: all the work previously done in obj_workspace is now done in sess_workspace, and Varnish then allocates exactly as much space as it needs for the object once it is finished. This should save you some memory on large data sets.
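In practice the VCL change is mostly mechanical; a vcl_fetch that used to talk about obj now talks about beresp. The TTL and header here are just illustrations:

    sub vcl_fetch {
        # Varnish 2.0: set obj.ttl = 5m;
        set beresp.ttl = 5m;

        # Varnish 2.0: set obj.http.X-Cacheable = "yes";
        set beresp.http.X-Cacheable = "yes";
    }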
Now, there are several other changes, but most of them are internal. This is partly to make way for persistent storage, and partly general house-keeping. Another important reason the perceived difference between Varnish 2.0.6 and 2.1 is not that big is that many of the features written for Varnish 2.1 were ported back to Varnish 2.0. This includes new purging mechanisms, saint mode, restarts in vcl_error and numerous bug fixes.
We are still working on persistent storage. It is available in Varnish 2.1 as an experimental feature, but it is missing certain key aspects, like LRU support. You can compare this to how critbit evolved: critbit was available in Varnish 2.0, but not stable. We used the 2.0 release to fine-tune critbit, and we will use 2.1 to improve persistent storage.
For Varnish 2.1.4, I will be merging the DNS director, which has been ready for some time now. I wanted to investigate some reports of memory leaks before I merged it, and those seem to be debunked now.
I will also be merging my return(refresh) code, which is fairly simple stuff. All it does is "guarantee" a cache miss, even if there is valid content in the cache. The use case for this is when you update content and want to control who does the initial waiting. The typical example is an updated front page: you send a script to it with a magic header (X-Refresh: Yes, for instance), look for that header in VCL, make sure the client is coming from an allowed IP (if (client.ip ~ purgers), for example) and issue return(refresh), which will (oddly enough) refresh the content. Your clients won't have to wait, and the front page is updated immediately.
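In VCL that would look something along these lines. The header name and ACL are of course just examples, and return(refresh) is not in a released version yet:

    acl purgers {
        "localhost";
        "192.0.2.0"/24;   # whoever is allowed to force a refresh
    }

    sub vcl_recv {
        if (req.http.X-Refresh == "Yes" && client.ip ~ purgers) {
            # Force a fetch from the backend even if we have a fresh copy
            return(refresh);
        }
    }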
In the longer run, we are also looking at proper support for gzip. For the uninitiated, it should be emphasized that for normal operation, Varnish doesn't need to support gzip. Normally, Varnish will simply forward the Accept-Encoding header to the web server, which will compress the content as it sees fit and return it with a Vary header. That way, Varnish can deliver compressed content without having to compress it itself. This works fine, until you introduce Edge Side Includes (ESI) into the mix. With ESI, Varnish has to parse the content returned to check for ESI commands, and it can't do that if the content is compressed. So today, Varnish only supports uncompressed ESI. We wish to solve that. Properly.
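For reference, ESI parsing in Varnish 2.x is turned on per object in vcl_fetch. Until proper gzip support arrives, a common workaround is to strip Accept-Encoding for the pages you want parsed, so the backend sends them uncompressed. A rough sketch, with the URL pattern purely as an example:

    sub vcl_recv {
        if (req.url ~ "^/front") {
            # Make sure the backend sends this page uncompressed,
            # so the ESI parser can actually read it
            remove req.http.Accept-Encoding;
        }
    }

    sub vcl_fetch {
        if (req.url ~ "^/front") {
            esi;   # parse this object for ESI commands
        }
    }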
I am sure I have forgotten some key elements, but this should hopefully be enough to make this a worthwhile read.