Using Docker.io for Testing

It’s VMs all the way down…

The Problem

I found a rather embarrassing bug a while back in txtorcon 0.8.1: it wasn’t passing the interface argument down to Twisted’s underlying TCP4ServerEndpoint. That meant a hidden service using this interface could easily end up listening on not-the-loopback interface (which is Very Bad).
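
To make the failure mode concrete, here is a minimal sketch using plain Python sockets. It is not txtorcon or Twisted code, just an illustration of the same idea, and the helper name bound_address is made up for this example. An explicit loopback interface keeps the listener private, while an empty interface (what you typically end up with when the argument gets dropped) binds to every interface:

```python
import socket


def bound_address(interface):
    """Bind a TCP socket to `interface` on an OS-chosen port and
    return the IP address the socket actually ended up bound to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((interface, 0))  # port 0: let the OS pick a free port
        sock.listen(1)
        return sock.getsockname()[0]
    finally:
        sock.close()


# what we wanted: only reachable from the machine itself
loopback_only = bound_address("127.0.0.1")
# what a dropped interface argument amounts to: reachable from anywhere
all_interfaces = bound_address("")
print(loopback_only, all_interfaces)
```

The second address is the wildcard one, which is exactly what an external port scan would pick up.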

Of course, the unit-tests I wrote along with the fix help, but it would be really nice to start up a full txtorcon-based testcase and verify nothing is listening publicly (e.g. with nmap). I think you probably call this a functional test.

Thinking about this, a shiny hammer I’d seen recently came to mind: docker.io.

TL;DR: if you want to see all this in action, type make integration from a txtorcon clone, and/or look at these-here codez

One Possible Solution

It’d be too fragile to use a dev machine directly: there are probably lots of services running, and maybe even some listening publicly, like a Web server or SSH daemon. Although we could try to verify that “our” script isn’t listening publicly on the dev machine, there’d be a lot of noise to filter out. These are all variations on “looking at ps output”.

With docker we can completely specify a “container” and therefore know precisely what is listening on what (our hidden service and Tor on 127.0.0.1 interfaces, and nothing on the public interface).

When I first played with docker (around version 0.4) I didn’t like the idea of downloading images from “docker.io”. They’ve at least got HTTPS on this now, but I’d still like to make my own. Luckily, this is quite easy, especially if we want a Debian or Ubuntu system (and we do). (All the commands below are run as root.)

First we create a basic wheezy image in dockerbase-wheezy:

root@host:~# debootstrap wheezy dockerbase-wheezy

…and then we import the image into Docker

root@host:~# tar -C dockerbase-wheezy -c . | docker import - dockerbase-wheezy

There’s already some pretty good documentation on docker.io about this.

For building up our image from our own base, we simply follow the Dockerfile reference and come up with something like this:

FROM dockerbase-wheezy
RUN apt-get update
RUN apt-get install -y python-setuptools python-twisted python-ipaddr python-geoip graphviz
# we make our code available via a "container volume" (-v option to
# run) at /txtorcon
# we could also use ADD to copy our code into the image

This just takes the base image and installs the pre-requisites for txtorcon. Our code gets made available at /txtorcon so we can access it later. If we have our Dockerfile in a subdir called testcontainer/ then we can build the image:

root@host:~# docker build -t txtorcon-tester testcontainer/

This results in an image called txtorcon-tester being available.

The Actual Test

So, the actual test-case needs to be written. For this particular use-case, check out the test I developed for txtorcon. It starts a Twisted web service listening on a local port and connected (via Tor) to a hidden service. The details aren’t important, just how we run the script. Since it’s at integration/hidden_service_listen_ports/container_run in our checkout (which the container sees at /txtorcon, whether through a -v volume or ADD), we can get at it in the container:

root@host:~# docker run -d txtorcon-tester /txtorcon/integration/hidden_service_listen_ports/container_run
530194a7417a857cee3420a02506c85af2bf94a293b96939d3b73b8c9cc402da

This will launch a fresh container, run our script in the background (that’s the -d option) and print out the ID of the now-running container. So now we’ve got a super-lightweight VM running our test-script on a machine (“container”) whose configuration we know.
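
If you’d rather drive this step from a test script than from a shell, something like the following sketch would do. The function name run_detached and its runner parameter are my own inventions for illustration (the parameter only exists so the helper can be exercised without a docker daemon). The docker run -d invocation itself is the same one as above:

```python
import subprocess


def run_detached(image, command, runner=subprocess.check_output):
    """Start `command` in a detached container built from `image` and
    return the container ID that "docker run -d" prints."""
    output = runner(["docker", "run", "-d", image, command])
    if isinstance(output, bytes):
        output = output.decode("ascii")
    # docker prints the ID followed by a newline
    return output.strip()


# Example (needs a working docker install, so it is left commented out):
# container_id = run_detached(
#     "txtorcon-tester",
#     "/txtorcon/integration/hidden_service_listen_ports/container_run",
# )
```

Keeping the ID around matters, because it is the handle for everything else we want to ask docker about this container.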

So, if only we knew the IP address of it, we could use nmap to determine if anything is listening for TCP connections. Docker can spit out some JSON describing the container, so we can determine the IP address nicely with a little Python:

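import json
import subprocess

# the ID printed by "docker run -d"
container_id = 'that giant number docker run -d spit out'
# "docker inspect" prints a JSON list with one entry per container
data = subprocess.check_output(['docker', 'inspect', container_id])
data = json.loads(data)[0]
ip_addr = data['NetworkSettings']['IPAddress']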
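
For a test harness it can be handy to have the lookup as a function that works on the JSON text, so it can be tried out without a running container. The sketch below is one way to do that. The name container_ip and the sample address are made up, and the fallback to a per-network "Networks" entry is an assumption about how other docker versions lay out this JSON, so check it against what your docker inspect actually prints:

```python
import json


def container_ip(inspect_output):
    """Pull an IP address out of the JSON printed by "docker inspect"."""
    info = json.loads(inspect_output)[0]
    settings = info.get("NetworkSettings") or {}
    # the layout used in this post: a single top-level address
    if settings.get("IPAddress"):
        return settings["IPAddress"]
    # ASSUMPTION: some docker versions list addresses per network instead
    for network in (settings.get("Networks") or {}).values():
        if network.get("IPAddress"):
            return network["IPAddress"]
    raise ValueError("no IP address found (is the container still running?)")


# made-up sample in the shape used above
sample = json.dumps([{"NetworkSettings": {"IPAddress": "172.17.0.2"}}])
print(container_ip(sample))
```

Raising when no address is found is deliberate: a container that already exited has nothing to scan, and the test should fail loudly rather than scan nothing.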

Now we can run nmap on the whole mess and figure out if our container is listening on any TCP port:

nmap -T5 -p 1-65535 --open -sS <ip_address>

In practice, you probably want -oX in order to get parsable output (in this case, XML). In any case, this will tell us if there’s anything listening.
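
As a rough sketch of what to do with that XML, the snippet below pulls the open TCP ports out of nmap’s output using only the standard library. The function name open_tcp_ports and the sample document are made up for illustration. The sample is cut down to just the elements the code looks at, and real nmap output carries a lot more:

```python
import xml.etree.ElementTree as ET


def open_tcp_ports(nmap_xml):
    """Return a sorted list of TCP ports reported as "open" in nmap XML."""
    root = ET.fromstring(nmap_xml)
    ports = []
    for port in root.iter("port"):
        state = port.find("state")
        if state is None or state.get("state") != "open":
            continue
        if port.get("protocol") == "tcp":
            ports.append(int(port.get("portid")))
    return sorted(ports)


# hand-written, heavily trimmed sample (not real scan results)
SAMPLE = """<nmaprun>
  <host>
    <address addr="172.17.0.2" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80"><state state="open"/></port>
      <port protocol="tcp" portid="22"><state state="closed"/></port>
    </ports>
  </host>
</nmaprun>"""

print(open_tcp_ports(SAMPLE))
```

For our test the pass condition is then simply that the list comes back empty. Passing - as the file name to -oX makes nmap write the XML to stdout, which is convenient if the scan is launched via subprocess.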

Great!

Conclusion

This gives a fairly nice and very repeatable way to write a system-level test-case, where you can verify features or behavior of an entire system. I also think it’s easy enough to understand that you don’t really need to know very much about docker.io or LXC.

Once you’ve got your base image, it’s pretty fast (a few seconds) to re-build the test-case image and also very fast to launch a new container. It’s certainly not unit-test speeds, but plenty fast enough for tests that “can” be slow, like stuff you arbitrarily label “integration tests” :)

I don’t know that you’d want to really-deploy an application – especially a very possibly “security sensitive” one like a Tor hidden service – inside docker.io containers but I certainly like it for this use-case.

Note that for extra bonus points, you can run all the above in a KVM/QEMU VM if you don’t have a new-enough docker.io host system.