As anyone who accidentally stumbles on this blog knows, I’ve been playing with Docker for a few weeks now, which makes me an expert and entitled to opine on its future. Hell, you don’t need to be an expert to know where this is going. Docker rocks. It’s going to revolutionize the way applications and dependencies are managed in the datacenter. No technology has captivated me this way since… I don’t even know. Virtualization came close. But it is containers that are poised to really deliver the value virtualization promised. And Windows has nothing comparable to what they offer. There are plenty of virtualization solutions that work on Windows, but containers are not virtualization: they’re about environment isolation, and they are a lot lighter weight, a lot easier to use, and a lot more manageable than virtualization technologies.
Over the weekend I finished up an experiment to encapsulate the complete environment for my spider application in a Docker container. It was a huge success from my point of view. Just to recap, I’m working on an application that uses a bunch of spider processes to get information off the web. The environment these things run in is fairly sophisticated at this point: they get their work input from, and stream work product to, redis. They also stream log events to redis, from which the events are picked up by logstash and indexed into Elastic Search. The whole process runs under the control of supervisord. So there are a few pieces in play.
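To give a sense of the supervisord piece, the config that drives the spiders looks roughly like this. The program name, script path, and process count are simplified placeholders for illustration, not a copy of my real file:

[program:spider]
command=python /opt/spiders/spider.py
process_name=%(program_name)s_%(process_num)02d
numprocs=4
autorestart=true
redirect_stderr=true

Supervisord spins up numprocs copies of the spider and restarts any that die, which is exactly the babysitting I don’t want to do by hand.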
Configuring the environment on EC2 previously required approximately 50 steps, including installing the base dependencies from the Python, Debian, and Java repositories. After this weekend’s work it should require two: install Docker, and build the container image.
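Step one is itself only a handful of commands on Ubuntu. The repository URL and package name below are what the install docs list as I write this, so double-check them against the current docs before pasting:

curl -s https://get.docker.io/gpg | sudo apt-key add -
echo "deb http://get.docker.io/ubuntu docker main" | sudo tee /etc/apt/sources.list.d/docker.list
sudo apt-get update
sudo apt-get install lxc-docker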
All those manual steps are now captured declaratively in a Docker build file. The base image that the container build depends on is an Ubuntu 13.10 debootstrap install with Python and a few other things added, which I pushed to my repository on the Docker index. It can now be pulled to anywhere that Docker is installed. I could also build and push the final environment image, but all images on the Docker index are public. The way around that is either to run your own repository, or build the final container image in place on the server. I’m going to take the latter approach for now.
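To give a flavor of what “captured declaratively” means, the build file is shaped roughly like this. The image name, paths, and package list are simplified placeholders rather than my actual file:

# my Ubuntu 13.10 + Python base image, pulled from the Docker index
FROM myuser/ubuntu-python

# the rest of the runtime: redis, supervisord, and a JRE for logstash and Elastic Search
RUN apt-get update && apt-get install -y redis-server supervisor openjdk-7-jre-headless
# logstash and Elastic Search themselves get pulled in here too; details omitted

# application code and config baked into the image
ADD spiders/ /opt/spiders/
ADD supervisord.conf /etc/supervisor/conf.d/spiders.conf

# redis, Elastic Search, and the Kibana3 dashboard
EXPOSE 6379 9200 9292

# default command if none is given on the docker run line
CMD ["/usr/bin/supervisord", "-n"]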
So, with the base image in the repo, and the Docker build file created and tested, launching a new environment on a server with Docker installed looks like this:
sudo docker build -t="my:image" -rm=true - < mydockerfile
sudo docker run -t -i -n="my:name" my:image /bin/bash
Boom, the server is running. In my case what that means is that redis is up, logstash is up, Elastic Search is up, supervisord is up and has launched the spider processes, and the proper ports are exposed and bridged over to the host system. All I have to do is point my web browser at http://myhostname.com:9292 and the Kibana3 dashboard pops up with the event stream visualized so I can monitor the work progress. That. Is. Cool.
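If you’re curious what actually got wired up, docker ps on the host lists the running container, and its PORTS column shows which container ports are reachable from the host:

sudo docker ps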
Which leads me back to my opening thought. As a developer who spent 20 years writing code for Microsoft platforms I don’t want to come off like an ex-smoker who has seen the light and needs to let everyone know it. But at the same time, I can’t help but wonder how Microsoft will respond to this. Docker came out of nowhere, igniting a wildfire by making a formerly arcane technology (LXC containers) easier to understand and use. I think within another year or so containers will be as ubiquitous and as important in the datacenter as virtualization is now, perhaps more so. And Windows will be increasingly relegated to running Exchange and SharePoint servers.