Posted on 2012-05-02
I'm a "HiFi-idiot". I don't believe $400 power cables give me better sound, but I'm not that far off either.
So a while back I bought a new AV receiver, a Denon 4311 to be precise, and it came with a few nifty hack-worthy features. The biggest one being the serial interface… which is also available over regular TCP/IP using a regular Ethernet connection.
The interface is meant for systems integration. It does offer a simple web interface that allows me to do most basic tasks too, but what's really interesting is the "Serial" interface. You talk to the receiver using a bastardized telnet-interface. It says TCP port 23, but don't expect telnet to "just work".
Denon kindly offers full documentation of the interface, and it gives you all the features of the amplifier, from basic volume control to complete control over the iPod docking, including navigation of said iPod. I just HAD to do something with it.
The first thing I did was of course to write a munin plugin to graph the volume. This is an old (obviously out-dated) example:
(main reason it's out of date: I've been fiddling with the munin install for a while and never bothered fixing it).
The code for that is available on Munin exchange, or my github page.
However, the Denon interface has a major drawback: It can only handle one connection at any given time. This means that if I'm doing something else with it, like adjusting the volume, the munin plugin won't execute. There are other problems too: If you try to bind "Volume up" to a key that spews the right command at the interface, it'll fail because chances are you'll want to adjust the volume by more than 0.5 dB at a time.
So I wrote a tiny little perl daemon:
#!/usr/bin/perl -w
# vold.pl, Volume control "daemon" for Denon x311 AVR
# Copyright (C) 2011-2012 Kristian Lyngstol <kristian@bohemians.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
# vold.pl talks to a denon x311 AV receiver over TCP/IP and offers a simple
# interface you can program against. It also talks to Spotify over D-Bus
# and hijacks the 'sleep'-function on the receiver, allowing you to use
# your regular remote control to pause Spotify and skip forward. If you
# want to use this, I suggest you read the code to hardcode some of this to
# your own needs.
use strict;
use IO::Socket;
use IO::Select;
use IO::Socket::INET6;
# The receiver
my $remote;
# The select() struct
my $sel;
# Read from the receiver. Assume unmuted at start-up (worst case scenario:
# Hit mute twice to unmute so vold catches the MUON in return).
my $mute = 0;
# No volume at start-up.
my $vol = "NA";
# for readin' stuff
my $line;
# listen-socket.
my $listen;
my $target = $ARGV[0];
if (!defined($target)) {
    print "Usage: vold.pl <IP of denon receiver>\n";
    exit(1);
}
# (Re-)connect to the receiver and re-add it to the select-queue.
sub recon {
    if (defined($remote)) {
        $sel->remove($remote);
    }
    $remote = IO::Socket::INET6->new(
        Proto    => "tcp",
        PeerAddr => $target,
        PeerPort => "23",
        Timeout  => 5,
    ) or die "cannot connect to avr";
    $sel->add($remote);
    print "Re-opened connection to AVR\n";
}
# Handle data from the receiver, possibly reconnecting if needed.
sub handle_remote {
    # The receiver terminates every message with a carriage return.
    local $/ = "\r";
    if (!$remote->connected) {
        recon();
        return;
    }
    $line = <$remote>;
    # EOF or read error: the receiver dropped us, so reconnect.
    if (!defined $line) {
        recon();
        return;
    }
    if ($line =~ m/MV[0-9][0-9][0-9]?/) {
        $vol = $line;
        $vol =~ s/[^0-9]//g;
        # A third digit is the half step; drop it.
        if (length($vol) > 2) {
            $vol =~ s/[0-9]$//;
        }
        if ($vol == 99) {
            $vol = "0";
        } else {
            $vol -= 1;
        }
    } elsif ($line =~ m/MUON/) {
        my $blah = `dbus-send --print-reply --dest=org.mpris.MediaPlayer2.spotify /org/mpris/MediaPlayer2 org.mpris.MediaPlayer2.Player.Pause`;
        $mute = 1;
    } elsif ($line =~ m/MUOFF/) {
        $mute = 0;
        my $blah = `dbus-send --print-reply --dest=org.mpris.MediaPlayer2.spotify /org/mpris/MediaPlayer2 org.mpris.MediaPlayer2.Player.Play`;
    } elsif ($line =~ m/SLP[0-9]/) {
        my $blah = `dbus-send --print-reply --dest=org.mpris.MediaPlayer2.spotify /org/mpris/MediaPlayer2 org.mpris.MediaPlayer2.Player.Next`;
        print $remote "SLPOFF\r";
    }
}
# Close a peer and remove it from the select queue
sub go_away {
    my $fh = $_[0];
    if (defined $fh) {
        $sel->remove($fh);
        $fh->close;
    }
    return;
}
# Handle client-data, defaulting to getting rid of it.
sub handle_client {
    my $fh = $_[0];
    if (!$fh->connected) {
        go_away($fh);
        return;
    }
    $line = <$fh>;
    if (!defined $line) {
        go_away($fh);
        return;
    }
    $line =~ s/[^a-zA-Z0-9]//g;
    if ($line =~ m/UP/) {
        print $remote "MVUP\r";
    } elsif ($line =~ m/DOWN/) {
        print $remote "MVDOWN\r";
    } elsif ($line =~ m/VOL/) {
        print $fh "". $vol . "\n";
    } elsif ($line =~ m/PC/) {
        print $remote "SIDVR\r";
    } elsif ($line =~ m/MUTE/) {
        if ($mute == 0) {
            print $remote "MUON\r";
        } else {
            print $remote "MUOFF\r";
        }
    } else {
        print $fh "What you say?\n";
        print $fh "Use: DOWN, UP, MUTE, PC or VOL\n";
        go_away($fh);
    }
}
$listen = IO::Socket::INET6->new(Listen => 1, LocalPort => 1337, Timeout => 0, ReuseAddr => 1) or die "WHAT?";
$sel = IO::Select->new($listen);
recon();
# Let the games begin.
# Short: Both accepts new connections on $listen and adds them to $sel,
# reads client-connections and reads the denon-receiver, hopefully catching
# disconnects.
while (my @ready = $sel->can_read) {
    if (!defined $remote || !$remote->connected) {
        recon();
    }
    foreach my $fh (@ready) {
        if ($fh == $listen) {
            # Create a new socket
            my $new = $listen->accept;
            $sel->add($new);
        } elsif ($fh == $remote) {
            handle_remote();
        } else {
            handle_client($fh);
        }
    }
}
(let me know if it's interesting and I'll move it to GitHub)
The features are simple: vold.pl opens a single connection to the AVR and forwards a handful of commands, like UP/DOWN/MUTE, and allows switching to the "PC" input, which I've got configured as DVR on the receiver.
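A client for vold only has to connect to port 1337 and send one of those command words. The following is a minimal Python sketch of such a client, not part of vold.pl itself. The function names and the `localhost` default are assumptions made up for illustration; the command words, the port and the 0.5 dB step are taken from the text and script above. It opens one short-lived connection per command, which is slow but does not depend on how vold handles several lines arriving at once.

```python
import socket

STEP_DB = 0.5  # one UP or DOWN moves the volume by 0.5 dB


def steps_for_db(change_db):
    """Return how many UP/DOWN commands a change of change_db needs."""
    return round(abs(change_db) / STEP_DB)


def command_lines(change_db):
    """Build the newline-terminated commands for a volume change."""
    word = "UP" if change_db > 0 else "DOWN"
    return [f"{word}\n".encode("ascii")] * steps_for_db(change_db)


def send_commands(lines, host="localhost", port=1337):
    """Send each command to vold over its own short-lived connection."""
    for line in lines:
        with socket.create_connection((host, port), timeout=5) as conn:
            conn.sendall(line)


# Example (needs a running vold): raise the volume by 3 dB.
# send_commands(command_lines(3.0))
```

Bound to a key, something like `send_commands(command_lines(3.0))` would turn one key press into six UP commands.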
It also taps into the sleep-function, which I never use. Why? Because sleep is the only button on the remote control which:
I also hooked into mute. If I hit mute on my receiver, it not only mutes, but it pauses Spotify.
If I hit sleep twice (The first hit just shows the sleep timer, which is off), it will skip forward to the next song, then reset the sleep timer. If I hit sleep three times, it will skip two songs and reset etc.
It's a fairly fugly interface, but fun. The ugliest part of the interface is perhaps that it uses carriage returns and goes rather bonkers if you try to send a newline. This makes it impossible to use the interface by just telnetting directly to it (unless you find some knob to make telnet only send carriage returns AND behave generally nice. Keep in mind it also doesn't SEND any line feeds, so your terminal will keep overwriting the old text.)
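Because the receiver frames everything with bare carriage returns, any client has to split incoming bytes on `\r` and terminate outgoing commands the same way. Here is a minimal Python sketch of that framing, assuming plain ASCII messages. The function names are hypothetical, and `parse_volume` only mirrors what vold.pl does with `MV` messages; it is not taken from Denon's documentation.

```python
import re


def frame_command(command):
    """Terminate a command with a bare carriage return, never a newline."""
    return command.encode("ascii") + b"\r"


def split_messages(buffer):
    """Split raw bytes into complete CR-terminated messages.

    Returns (messages, remainder), where remainder holds the bytes of an
    incomplete trailing message that should be kept for the next read.
    """
    *complete, remainder = buffer.split(b"\r")
    return [m.decode("ascii") for m in complete], remainder


def parse_volume(message):
    """Mirror vold.pl: keep two digits, map 99 to 0, otherwise subtract 1."""
    match = re.search(r"MV(\d\d)\d?", message)
    if match is None:
        return None  # not a volume message
    value = int(match.group(1))
    return 0 if value == 99 else value - 1
```

Keeping the remainder around matters because a TCP read can end in the middle of a message.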
Posted on 2012-04-25
I just pushed my HTTP request spewer, spew, to github.
http://github.com/varnish/spew
It's Linux-specific, since it uses epoll, and the http.c-code is still nasty, but it's also fast.
A reminder of what it can do:
The feature list contains:
Most of the boilerplate code is actually from an old defunct project I abandoned in 2009. All the stuff that deals with options and config files and debug messages and whatnot. The only thing I've done recently is src/http.c and integration.
Also: I know the code is still horrible. Patches are welcome, as are requests, (constructive!) comments, etc.
Posted on 2012-04-23
After my last post about testing Varnish (http://kly.no/posts/2012_04_19_Testing_Varnish.html), and a few years of frustration, I decided to take a look at what is actually possible.
So this is an example:
What you are seeing is Varnish doing 183k req/s on my home machine. The important thing, however, is that the tool generating this load, cleverly called a.out, is running at 22% CPU load, and it's the single-threaded result of one day of dirty hacking. Compare this to httperf which is hard to get over 100k req/s, or siege which kills itself at about 15k req/s on the same machine.
This being a prototype, the code will never see the light of day. Trust me when I say it's horrible - that's what you get from a day of fiddling. However, it has demonstrated to me what's possible, and I might re-start this project now that I have an idea of what I want to do.
As for how a.out works? It's connection-oriented, so it maintains N open connections at any given time and spews M requests over each connection in rather large (configurable) bursts. It also manages to NOT die if you stop the server for a while (httperf doesn't like this and siege doesn't really need any help to murder itself). It collects roughly 0 statistics and it does not really care about response.
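To make "M requests over each connection in bursts" a bit more concrete, here is a small Python sketch of what one burst could look like on the wire. This is not the tool's code (that is C on top of epoll); it reads "burst" as several pipelined HTTP/1.1 requests written in one go, which is an assumption, and the function names are made up for illustration.

```python
def build_burst(host, path, burst_length):
    """Return burst_length pipelined HTTP/1.1 GET requests as one byte string."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")
    return request * burst_length


def plan(connections, requests_per_connection, burst_length):
    """Return (bursts per connection, total number of requests)."""
    bursts = -(-requests_per_connection // burst_length)  # ceiling division
    return bursts, connections * requests_per_connection
```

A load generator built this way would write one such byte string per connection whenever the socket becomes writable, then read and discard whatever comes back.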
So no, it's not very good. But it's fast, and that was what I set out to achieve.
Update: So this tool is a bit more powerful. I've been able to do 280-290k req/s with a single process (and thread). This is the same machine I did the 275k req/s record with using httperf, but that required two extra machines to generate traffic.... Will be interesting to try booting those tomorrow.
Posted on 2012-04-21
I've been going through my recent photos in a vague attempt at sorting through them and getting the good stuff out, and I stumbled upon a set of pictures from a visit to Østensjøvannet in Oslo, an important nesting ground for quite a few birds in Oslo.
Click on them for bigger resolution.
For the technical details: All of these were taken with my Canon 5D ("Mark 1") using a Canon 70-200mm L f/2.8 lens (no stabilizer) and a 2x extender. Can't remember the precise settings, but I suppose exif might remember, if it survived the gimp.
Posted on 2012-04-19
These are questions I see people ask frequently, and there is no simple answer. In this blog post, I'll go through some of what you could do to test a Varnish-site.
This is not about benchmarking. I don't do benchmarking, never have. Why not? Because it's exceedingly hard and very few people succeed at proper benchmarking.
Neither is this a blog post about testing functionality on your site. You should be doing that already. I'll only say that you should test functionality, and it's often best done by browsing the site.
Also, don't expect it to be all that complete. Ask questions in comments and I might expand upon it!
Despite what most people ask about, the tools you choose are not nearly as important as what you want to test.
If you are hosting videos, I doubt testing request/second is a sensible metric. There are a few things you need to ask yourself:
These questions are important and they relate.
If your site is already in production under some different architecture, you are in luck. Your access logs can tell you a lot about the traffic pattern you can expect. This is a great start.
If this is a new site, though, it can be harder to estimate the answers. I recommend starting with "How much of my site is it possible to cache?". If your site is mostly static content, then Varnish will be able to help you a lot, assuming you set things up accordingly. If it's a site for logged in users, you have a much harder task. It's still possible to cache content with Varnish, but it's much harder. The details of how to do that are beyond the scope of this post.
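If you do have access logs, a rough first answer to the caching question can be scripted. The sketch below is a minimal Python example that counts how many GET/HEAD requests target static-looking URLs. It assumes the common/combined log format, and the list of extensions is a guess you should adapt to your own site; both are assumptions, and the function name is made up.

```python
import re
from urllib.parse import urlsplit

# Assumption: extensions that are usually easy to cache; adjust for your site.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico")

# Matches the request part of a common/combined log line: "GET /path HTTP/1.1"
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[0-9.]+"')


def static_share(log_lines):
    """Return the fraction of GET/HEAD requests for static-looking URLs."""
    total = 0
    static = 0
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # POSTs, garbage lines, etc. are ignored
        total += 1
        # Drop the query string before looking at the extension.
        path = urlsplit(match.group(1)).path.lower()
        if path.endswith(STATIC_EXTENSIONS):
            static += 1
    return static / total if total else 0.0
```

A high share only says that caching is likely to be easy, not that the remaining dynamic pages cannot be cached.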
As long as you can cache the majority of the content, chances are you will not be CPU bound as far as Varnish is concerned.
Testing a Varnish-site can be really fast or you can spend the next six months doing it. Let's start by getting a baseline.
I usually start out by something truly simple: Look at varnishstat and varnishlog while you use a browser to browse the site. It's important that this is not a script, because your users are likely using browsers too and you want to catch all the stuff they catch, like cookies.
To set this up, the best way is to modify /etc/hosts (or the Windows equivalent (there is one, all the viruses use it)). The reason you don't want to just add a test-domain is because your site will go on-line using a real domain, not a test-domain. A typical /etc/hosts file could look like this for me:
127.0.0.1 localhost
127.0.1.1 freud.kly.no freud
127.0.0.1 www.example.com example.com media.example.com

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Even better is if this is an external server. Then make sure you block any access to other port-80 services for that site. This will ensure that you don't miss any sub-domains.
What you are looking for is cache hits, misses and hitpasses. This should reveal if the site handles cookies properly or not. You may also want to fire up a different browser.
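One way to turn those counters into a number is to note them before and after a browsing session and compare. The Python sketch below does just that. `cache_hit`, `cache_hitpass` and `cache_miss` are the counter names varnishstat shows; how you collect the two snapshots is left out, and the function names are made up for illustration.

```python
COUNTERS = ("cache_hit", "cache_hitpass", "cache_miss")


def hit_ratio(cache_hit, cache_miss):
    """Return hits as a fraction of hits plus misses."""
    lookups = cache_hit + cache_miss
    return cache_hit / lookups if lookups else 0.0


def summarize(before, after):
    """Compare two counter snapshots taken before and after browsing."""
    delta = {name: after[name] - before[name] for name in COUNTERS}
    delta["hit_ratio"] = hit_ratio(delta["cache_hit"], delta["cache_miss"])
    return delta
```

Hitpasses are reported separately here on purpose: a growing `cache_hitpass` with few hits is a typical sign that cookies or headers are making Varnish pass requests it could have cached.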
You also want to keep a look out for:
Once you've got this nailed down, and if you doubt the speed of Varnish, we can always throw in wget too:
wget --delete-after -p -r www.example.com
This will give you a recursive request for www.example.com with all prerequisites (CSS, images, etc). It's not very useful in itself but it will give you a feel for how fast the site is without Varnish and then after you've cached it with Varnish. You can easily run multiple wget-commands in parallel to gauge the bandwidth usage:
while sleep 1; do wget --delete-after -p -r www.example.com; done &
while sleep 1; do wget --delete-after -p -r www.example.com; done &
while sleep 1; do wget --delete-after -p -r www.example.com; done &
Ideally this should be network-bound, but realistically speaking, wget is not /that/ fast when it comes to tiny requests.
Warning
Keep in mind that you are likely going to be hitting a DNS server frequently, especially if you don't use /etc/hosts. I've had DNS servers running at 50-70% CPU when I've done stress testing in the past, which means the DNS server is affecting the test more than you want it to.
So far none of these tricks have been very fancy.
So you won't reach 275 kreq/s using wget. I'm not sure that should be a goal either, but it's worthwhile taking a look at.
If you are moving on to testing just Varnish, not the site itself, then it's time to move away from browsers and wget. There are several tools available for this, and I tend to prefer httperf. It's not a good tool by any sensible measure, but it's a fast one. The best way to learn httperf is to stick all the arguments into a text file and set up a shell script that randomly picks them until you find something that works. The manual pages are unhelpful at best.
An alternative to httperf is siege. I'm sure siege is great, if you don't mind that it'll run into a wall and kill itself long before your web server. If you want further proof, take a look at this part of siegerc, documenting Keep-Alive:
# Connection directive. Options "close" and "keep-alive"
# Starting with release 2.57b3, siege implements persistent.
# connections in accordance to RFC 2068 using both chunked
# encoding and content-length directives to determine the.
# page size. To run siege with persistent connections set
# the connection directive to keep-alive. (Default close)
# CAUTION: use the keep-alive directive with care.
# DOUBLE CAUTION: this directive does not work well on HPUX
# TRIPLE CAUTION: don't use keep-alives until further notice
# ex: connection = close
# connection = keep-alive
# connection = close
A stress testing tool that doesn't support keep-alive properly isn't very helpful. Whenever I use siege, it tends to max out at about 5000-10000 requests/second.
There's also Apache Bench, commonly known as just ab. I've rarely used it, but what little use I've seen from it has not been impressive. It supports KeepAlive, but my brief look at it showed no way to control the KeepAlive-ness. From basic tests of it, it also seemed slightly slower than httperf. It does seem better today than it was the first time I looked at it, though. For this blog post, I'll use httperf simply because it's the tool I'm most familiar with and which has given me the right combination of control and performance.
However, httperf has several flaws:
The trick to httperf is to use --rate when you can. A typical httperf command might look like this (run on my laptop):
$ httperf --rate 2000 --num-conns=10000 \
    --num-calls 20 --burst-length 20 --server localhost \
    --port 8080 --uri /misc/dummy.png
httperf --client=0/1 --server=localhost --port=8080
--uri=/misc/dummy.png --rate=2000 --send-buffer=4096
--recv-buffer=16384 --num-conns=10000 --num-calls=20
--burst-length=20
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of
open files to FD_SETSIZE
Maximum connect burst length: 20
Total: connections 10000 requests 200000 replies 200000 test-duration 7.076 s
Connection rate: 1413.2 conn/s (0.7 ms/conn, <=266 concurrent connections)
Connection time [ms]: min 1.6 avg 70.4 max 3049.8 median 33.5 stddev 286.4
Connection time [ms]: connect 27.9
Connection length [replies/conn]: 20.000
Request rate: 28264.9 req/s (0.0 ms/req)
Request size [B]: 76.0
Reply rate [replies/s]: min 39514.8 avg 39514.8 max 39514.8 stddev 0.0 (1 samples)
Reply time [ms]: response 35.9 transfer 0.0
Reply size [B]: header 317.0 content 178.0 footer 0.0 (total 495.0)
Reply status: 1xx=0 2xx=200000 3xx=0 4xx=0 5xx=0
CPU time [s]: user 1.33 system 5.61 (user 18.8% system 79.3% total 98.0%)
Net I/O: 15761.0 KB/s (129.1*10^6 bps)
Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0
Note that httperf will echo the command you ran it with back to you, with all options expanded. I took the liberty of formatting the output a bit more to make it easier to read. The options I use here are:
The first thing you should look at in the output is Errors:. If you get errors, there's a very good chance you were too optimistic with your --rate setting. Also note that the uri matters greatly. /misc/dummy.png is just that: a dummy-png I have to test. (see for yourself at http://kly.no/misc/dummy.png). Let's try the same with the front page:
$ httperf --rate 2000 --num-conns=10000 --num-calls 20 \
    --burst-length 20 --server localhost --port 8080 --uri /
httperf --client=0/1 --server=localhost --port=8080 --uri=/
--rate=2000 --send-buffer=4096 --recv-buffer=16384
--num-conns=10000 --num-calls=20 --burst-length=20
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of
open files to FD_SETSIZE
Maximum connect burst length: 42
Total: connections 1738 requests 34760 replies 34760 test-duration 10.589 s
Connection rate: 164.1 conn/s (6.1 ms/conn, <=1018 concurrent connections)
Connection time [ms]: min 477.3 avg 4592.4 max 8549.5 median 5276.5 stddev 2233.3
Connection time [ms]: connect 8.7
Connection length [replies/conn]: 20.000
Request rate: 3282.8 req/s (0.3 ms/req)
Request size [B]: 62.0
Reply rate [replies/s]: min 3077.3 avg 3311.2 max 3545.1 stddev 330.8 (2 samples)
Reply time [ms]: response 3772.9 transfer 49.8
Reply size [B]: header 326.0 content 38915.0 footer 2.0 (total 39243.0)
Reply status: 1xx=0 2xx=34760 3xx=0 4xx=0 5xx=0
CPU time [s]: user 0.61 system 9.54 (user 5.8% system 90.1% total 95.9%)
Net I/O: 125998.3 KB/s (1032.2*10^6 bps)
Errors: total 8262 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 8262 addrunavail 0 ftab-full 0 other 0
Now see how the errors piled up. This is because we exceeded the performance httperf could offer. Yeah, httperf is far from perfect. Also note the bandwidth usage and CPU usage. I'm not sure if it's a coincidence that we're so close to gigabit, since this is a Varnish server running on localhost.
What you also may want to look at is reply status, to check the status codes. You also want to pay attention to connection times. Let's take a look at the first example again:
Connection rate: 1413.2 conn/s (0.7 ms/conn, <=266 concurrent connections)
Connection time [ms]: min 1.6 avg 70.4 max 3049.8 median 33.5 stddev 286.4
Connection time [ms]: connect 27.9
Connection length [replies/conn]: 20.000
This tells me the average connection time was 70.4ms, with a maximum at 3049.8ms. 3 seconds is quite a long time. You may want to look at that. What I do when I debug stuff like this is make sure that I rule out the tool itself as the source of worry. There is no 100% accurate method of doing this, but given the CPU load of httperf at the time, it's reasonable to assume httperf is part of the problem here. You can experiment by slightly adjusting the --rate option to see if you're close to the breaking point of httperf.
You also want to watch varnishstat during these tests.
Frankly very little.
Sure, this means I can run Varnish at around 30k req/s on my laptop, testing FROM my laptop too. But this is not that helpful.
Well, first of all, running 20 requests over a single connection is pointless. There's almost no browser or site out there which will cause this to happen. Depending on the site, numbers between 4 and 10 requests per connection are more realistic.
If all you want is a big number, then tons of requests over a single connection is fine. But it has nothing to do with reality.
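If you lower the number of requests per connection, --num-conns has to go up to keep the total number of requests the same. The Python sketch below does that arithmetic and builds an argument list using only the httperf options already shown in this post. The function name and the choice to set --burst-length equal to --num-calls (as in the commands above) are assumptions for illustration.

```python
def httperf_args(total_requests, calls_per_conn, rate, server, port, uri):
    """Build an httperf command line for a given total number of requests."""
    # Ceiling division: enough connections to cover every request.
    num_conns = -(-total_requests // calls_per_conn)
    return [
        "httperf",
        "--rate", str(rate),
        "--num-conns", str(num_conns),
        "--num-calls", str(calls_per_conn),
        "--burst-length", str(calls_per_conn),
        "--server", server,
        "--port", str(port),
        "--uri", uri,
    ]
```

With 200000 requests, 20 calls per connection gives the 10000 connections used earlier, while a more browser-like 5 calls per connection needs 40000. The resulting list can be handed to `subprocess.run` as-is.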
You can get httperf to do some pretty cool things if you invest time in setting up thorough tests. It can generate URLs, for instance, if that's your thing. Or simulate sessions where it asks for one page, then three other pages over the same connection X amount of time later, etc etc. This is where the six-month testing period comes into play.
I consider it a much better practice to look at access logs you have and use something simpler to iterate the list. wget can do it, and I know several newspapers that use curl for just this purpose. It was actually curl that first showed me what happens when Varnish becomes CPU bound without having a session_linger set (this is set by default now, but for the curious, what happened was that the request rate dropped to a 20th of what it was a moment before, due to context switching).
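Turning an access log into a list that wget or curl can iterate takes only a few lines. The Python sketch below is one way to do it, assuming the common/combined log format and that only GET requests are worth replaying. The function name and the base URL argument are made up for illustration.

```python
import re

# Matches the request part of a common/combined log line: "GET /path HTTP/1.1"
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[0-9.]+"')


def replay_urls(log_lines, base_url):
    """Return the GET requests from an access log as absolute URLs, in log order."""
    base = base_url.rstrip("/")
    urls = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            urls.append(base + match.group(1))
    return urls
```

Written to a file with one URL per line, the result can be fed to `wget --delete-after -i urls.txt`. Keeping the log order, duplicates included, preserves the real mix of hot and cold objects.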
Test your site and by all means test Varnish, but do not assume that just because httperf or some other tool gives you 80 000 requests/second that this will match real-life traffic.
Proper testing is an art and this is just a small look at some techniques I hope people find interesting.