High Performance Load Balancing Using DSR on a Raspberry Pi 2

TL;DR: Direct Server Return lets a Raspberry Pi or other low-end hardware balance load at a throughput it could not reach as an ordinary proxy, because the replies never pass through the load balancer.

Test conditions

Pi directly connected to laptop with Ethernet.

Laptop running a Debian VM in VirtualBox, with adapter 1 on NAT and adapter 2 bridged to Ethernet.

Two more VMs in VirtualBox on the same laptop, used as test targets and connected in the same way as the first.

One tricky aspect of this test is that DSR requires a dedicated network interface, and the Pi only has one. This means that everything needs to be set up with the interface configured normally, and then the interface must be reconfigured and the test controlled from the console.

The Debian VM is temporarily set up as gateway to the outside world.

root@debian:~# ifconfig eth1
root@debian:~# echo 1 > /proc/sys/net/ipv4/ip_forward
root@debian:~# iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

And the Pi is temporarily set up with this configuration:

Pi eth0 =
GW =

ping from pi


Install Pen

Make sure Raspbian on the Pi is up to date.

apt-get update
apt-get upgrade

Continue with instructions from the Wiki.

apt-get install automake autoconf gcc git
mkdir Git
cd Git
git clone https://github.com/UlricE/pen.git
cd pen
automake --add-missing

Verify the installation

./pen -dfU 53

And from the Debian VM:

root@debian:~# dig @ +short siag.nu

Pen is now confirmed to work.

Let’s try one more. There is an Apache server running on the Debian VM.

root@raspberrypi:~/Git/pen# ./pen -df 80

root@debian:~# lynx -dump
It works!

This is the default web page for this server.

The web server software is running but no content has been added, yet.

Configure the Pi for DSR

Everything seems good to go. We can now reconfigure eth0 on the Pi.

ifconfig eth0

./pen -df -O "dsr_if eth0" -r

This means that we intend to forward any TCP traffic with destination address to the two servers and, load balanced using round robin.
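Round robin hands each new connection to the next server in the list and wraps around at the end. The shell sketch below illustrates only that selection rule; it is not Pen's code, and the names server-a, server-b, SERVERS and pick_server are hypothetical stand-ins for the two backends.

```shell
# Hypothetical backend names; the real addresses are not shown here.
SERVERS="server-a server-b"
rr_index=0

pick_server() {
    # Split the list into positional parameters (word splitting is intended).
    set -- $SERVERS
    # Skip rr_index entries so that $1 is the server whose turn it is.
    shift "$rr_index"
    picked=$1
    # Advance the index, wrapping around at the end of the list.
    set -- $SERVERS
    rr_index=$(( (rr_index + 1) % $# ))
}

for i in 1 2 3 4; do
    pick_server
    echo "connection $i -> $picked"
done
```

The result is returned in the variable picked rather than printed, so that the index survives between calls instead of being lost in a subshell.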

Those two addresses exist on two additional Debian VMs on the same laptop. Like the first one, they each have eth0 connected to NAT and eth1 connected to wired Ethernet.

Set up test targets

Both VMs need a loopback interface configured with the virtual address,
which they must not tell anyone about:

ifconfig lo:1
echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
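Values written to /proc/sys with echo are lost at reboot. Each path corresponds to a sysctl key, which can be listed in /etc/sysctl.conf to make the setting permanent. The sketch below shows the mapping; proc_to_sysctl is a hypothetical helper name, and the simple slash-to-dot rule assumes the interface part of the path contains no dot.

```shell
proc_to_sysctl() {
    # Strip the /proc/sys/ prefix and turn the remaining slashes into dots.
    printf '%s\n' "${1#/proc/sys/}" | tr '/' '.'
}

for path in /proc/sys/net/ipv4/conf/all/arp_announce \
            /proc/sys/net/ipv4/conf/all/arp_ignore; do
    proc_to_sysctl "$path"
done
```

Lines such as net.ipv4.conf.all.arp_announce = 2 and net.ipv4.conf.all.arp_ignore = 1 in /etc/sysctl.conf should then apply the same values at boot.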

Restart Apache to make sure it listens on the new address:

service apache2 restart

Finally verify that we can access Apache on all addresses:

root@debian:~# lynx -dump
It works!

This is the default web page for this server.

The web server software is running but no content has been added, yet.
root@debian:~# lynx -dump
It works!

This is the default web page for this server.

The web server software is running but no content has been added, yet.
root@debian:~# lynx -dump
It works!

This is the default web page for this server.

The web server software is running but no content has been added, yet.

And here is one of the main reasons for wanting to use DSR:

root@debian:~# ab -n 1000 -c 20
This is ApacheBench, Version 2.3 <$Revision: 1604373 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests

Server Software: Apache/2.2.22
Server Hostname:
Server Port: 80

Document Path: /1000k
Document Length: 1024000 bytes

Concurrency Level: 20
Time taken for tests: 13.427 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 1024235000 bytes
HTML transferred: 1024000000 bytes
Requests per second: 74.48 [#/sec] (mean)
Time per request: 268.532 [ms] (mean)
Time per request: 13.427 [ms] (mean, across all concurrent requests)
Transfer rate: 74496.13 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    7   8.3      6     239
Processing:    74  258 139.7    210    1081
Waiting:        0   11  22.4      8     239
Total:         77  265 140.0    217    1087

Percentage of the requests served within a certain time (ms)
50% 217
66% 239
75% 269
80% 390
90% 439
95% 517
98% 691
99% 782
100% 1087 (longest request)

The document 1000k is a dummy file containing 1024000 zeroes. Fetching it at 74.48 requests per second corresponds to a bandwidth of 610 Mbps, a speed physically impossible to achieve through the Pi’s Fast Ethernet interface, but easily achieved using DSR since the return traffic bypasses the load balancer completely. CPU usage on the Pi hovered at 15-20% during the test.
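The 610 Mbps figure follows from the request rate and the document size reported by ab. As a quick check, the arithmetic can be reproduced in the shell; mbps is a hypothetical helper name, and the result counts payload only, ignoring HTTP and TCP overhead.

```shell
# Megabits per second for a given request rate and document size in bytes.
mbps() {
    awk -v rps="$1" -v bytes="$2" \
        'BEGIN { printf "%.0f\n", rps * bytes * 8 / 1000000 }'
}

mbps 74.48 1024000    # prints 610
```

That is roughly six times the 100 Mbps a Fast Ethernet port can carry in one direction.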


Direct Server Return for UDP

Pen has supported Direct Server Return for TCP for some time. Support for UDP has now been added, suitable for load balancing e.g. DNS.

Here, debian2 is the DNS client and debian uses Pen in DSR mode to load balance between debian3 and debian4 running Bind.

Pen command line:

ulric@debian:~/Git/pen$ sudo ./pen -df -U -O poll -O "dsr_if eth1" -S 2 -r
As of 0.28.1 the server table is expanded dynamically,
making the -S option obsolete
2015-08-03 16:24:09: read_cfg((null))
2015-08-03 16:24:09: Before: conns = (nil), connections_max = 0, clients = (nil), clients_max = 0
2015-08-03 16:24:09: expand_conntable(500)
2015-08-03 16:24:09: After: conns = 0x1ac4600, connections_max = 500, clients = 0x7f6428d5c010, clients_max = 2048
2015-08-03 16:24:09: pen 0.29.0 starting
2015-08-03 16:24:09: servers:
2015-08-03 16:24:09: 0
2015-08-03 16:24:09: 1

As far as debian2 can see, the responses are coming from a single DNS server:


But tcpdump on debian3 and debian4 shows requests and replies being load balanced across the hosts:




Pen 0.29.0 released

Available here:


And also here:


Pen 0.29.0 introduces transparent reverse proxying on supported platforms,
which currently means Linux, FreeBSD and OpenBSD. This allows the backend
servers to see the client’s real address. It can be used in combination
with SSL termination.

Another improvement is that the server table size is no longer fixed
at startup but grows dynamically as servers are added. The -S option is
still accepted but doesn’t do anything. The client and connection tables
can also be expanded on the fly, reducing the number of restarts.

Full list of changes from 0.28.0:

150608 Released 0.29.0.

150528 Transparent reverse proxy support for Linux, FreeBSD and OpenBSD.

150527 Allow the client table size to be updated on the fly. Default size still 2048.
Allow the connection table size to be updated on the fly. Default still 500.
See penctl.1, options clients_max and conn_max.

150526 Introduced the macro NO_SERVER to be used instead of -1 to signify
error conditions and such.
Removed the fixed server table size along with the -S option.

150525 Fixed cosmetic bug in startup code which required port to be specified
on backend servers even if it was the same as the listening port.


Transparent Reverse Proxy on OpenBSD

Continuing this series of posts on transparent reverse proxy, here’s how to do it on OpenBSD.

The OpenBSD host running Pen has IP addresses on em1 and on em2. The client debian2 has IP address and the server debian3 has IP address

OpenBSD takes first prize in the easy management department by not requiring any special firewall rules or policy routing whatsoever. Just start Pen exactly the same way as on Linux and FreeBSD:

sudo ./pen -df -O transparent

The client sees a connection from to The server sees a connection from to




Transparent Reverse Proxy on FreeBSD

A previous post described how to get transparent reverse proxy to work with Pen on Linux. The same functionality is available on FreeBSD.

The FreeBSD host running Pen has IP addresses on em1 and on em2. Like before, the client debian2 has IP address and the server debian3 has IP address

FreeBSD requires far less in the way of special preparations than Linux did in the earlier post; in fact, a single firewall rule is all we need:

ipfw add 10 fwd tcp from any 5001 to any in recv em2

The Pen command is the same whether on Linux or FreeBSD:

sudo ./pen -df -O transparent

And as before, the client sees a connection from to, while the server sees a connection from to




Transparent Reverse Proxy

This is for the version of Pen in Git, and 0.29.0 when it is released.

With the exception of Direct Server Return, Pen works as a proxy: a client connects to Pen and Pen opens a new connection to an available server. A side effect of this is that the server can’t see the original client IP address.

For http, and for https where Pen also does SSL termination, the X-Forwarded-For header can be used to communicate the address. It is activated by the -H option and adds the header to the request if it isn’t already there. But this is a web-specific solution and doesn’t work for e.g. mail, where you also want to preserve the client address.
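The rule described above, add the header unless the request already carries one, can be sketched as a small shell filter. This only illustrates the logic and is not how Pen implements -H; add_xff is a hypothetical name, 192.0.2.10 is a documentation address standing in for the client, and the sketch assumes LF-terminated header lines for simplicity, whereas real HTTP uses CRLF.

```shell
add_xff() {
    # $1 is the client address; the request arrives on stdin.
    awk -v client="$1" '
        # Remember whether the request already has the header.
        tolower($0) ~ /^x-forwarded-for:/ { seen = 1 }
        # The first blank line ends the headers: add ours if none was seen.
        $0 == "" && !ended {
            if (!seen) print "X-Forwarded-For: " client
            ended = 1
        }
        { print }
    '
}

printf 'GET / HTTP/1.1\nHost: www.example.com\n\n' | add_xff 192.0.2.10
```

A request that already contains an X-Forwarded-For line passes through unchanged.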

Now there is another solution to the problem. The transparent option makes Pen “spoof” the client’s IP address in its outgoing connection to the backend server.

Here, debian2 is the client with IP and debian3 is the server with IP Pen sits in between with IP addresses and Debian2 and debian3 have static routes set up so they can reach each other through the host running Pen.

A fair amount of network configuration is needed on the Pen host to get the return traffic to go where it should. First, some firewall rules:

root@debian:~# iptables -t mangle -N DIVERT
root@debian:~# iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
root@debian:~# iptables -t mangle -A DIVERT -j MARK --set-mark 1
root@debian:~# iptables -t mangle -A DIVERT -j ACCEPT

And then a few special routes:

root@debian:~# ip rule add fwmark 1 lookup 100
root@debian:~# ip route add local dev lo table 100

The Pen command line looks like this:

sudo ./pen -df -O transparent



The server sees the original client IP address


Pen 0.28.0 released

Available here:


And also here:


Pen 0.28.0 brings Direct Server Return on Linux and FreeBSD.

It also brings the Windows code up to speed.

Full list of changes from 0.27.5:

150520 Released 0.28.0.

150513 Numerous updates to support the madness that is Windows.

150501 Fix from Vincent Bernat: segfault when not using SSL.

150427 DSR support using Netmap on FreeBSD.
Unbroke DSR on Linux.

150424 Replaced all calls to perror with debug(…, strerror(errno));
Updated penlog and penlogd to use diag.[ch].

150422 More refactoring: broke out conn.[ch], client.[ch], server.[ch].
Made a hash index such that the load balancer may balance load.

150420 Broke out Windows code from pen.c into windows.c. Added windows.h.

150419 Broke out public definitions for dsr into dsr.h.
Broke out memory management into memory.[ch].
Broke out diagnostic and logging functions into diag.[ch].
Broke out settings into settings.[ch].
Broke out access lists into acl.[ch].
Broke out event initialization into event.[ch].
Added pen_epoll.h, pen_kqueue.h, pen_poll.h, pen_select.h.
Broke out pen_aton et al into netconv.[ch].

150416 Added dsr.c


Installing Windows 10 on the Raspberry Pi

For once, not a post about Pen. That said, Pen does run on Windows, and yes, it will run on Windows on the Pi.

Installing Windows was a bit of a challenge because of DISM.EXE, the tool used to write the image to the SD card. The problem is that the version in Windows 8.1, the most recent supported version of Windows, is too old! According to the installation instructions, Windows 10 must first be installed on a PC, an absurd requirement which is fortunately incorrect. Instead, the Windows Assessment and Deployment Kit for Windows 10 can be installed; it includes a newer release of the tool.

With that hurdle out of the way, the rest of the installation was easy. Microsoft seems to regard the Pi not as a self-hosted environment but rather as a kind of Arduino to which you deploy “apps”. We’ll see how well that is received.