I’ve been working in Python a lot over the last year. I actually started learning it about three years ago, but back then my day gig was 100% .NET/C# on the back-end, and just when I decided to do something interesting with Python in Ubuntu a series of time-constrained crises would erupt on the job and put me off the idea. Since that time my employer and my role have changed, the first totally, and the second at least incrementally. At KIP we use Python for daily development, and run all our software on Centos on Amazon Web Services. As such I got to dive back in and really learn to use the language, and it’s been a lot of fun.
One of the de facto standard components of the Python development toolchain is virtualenv. We use it for Django development, and I have used it for my personal projects as well. What virtualenv does is pretty simple, but highly useful: it copies your system Python install along with supporting packages to a directory associated with your project, and then updates the Python paths to point to this copied environment. That way you can pip install to your heart’s content, without making any changes to your global Python install. It’s a great way of localizing dependencies, but only for your Python code and Python packages and modules. If your project requires a redis-server install, or you need lxml and have to install libxml2-dev and three or four other dependencies first, virtualenv doesn’t capture changes to the system at that level. You’re still basically polluting your environment with project-specific dependencies.
Dependencies in general are a hassle. Back when I started messing with VMs a few years ago I thought that technology would solve a lot of these problems, and indeed it has in some applications. I still use VMs heavily, and in fact my daily development environment is an Ubuntu Saucy VirtualBox VM running on Windows 7. But VMs are a heavyweight solution. They consume a fixed amount of ram, and for best performance a fixed amount of disk. They take time to spin up. They’re not easy to move from place to place and it’s fairly complicated to automate the creation of them in a controllable way. Given all these factors I never quite saw myself having two or three VMs running with different projects at the same time. It’s just cumbersome.
And then along comes Docker. Docker is… I guess wrapper is too minimizing a term… let’s call it a management layer, over Linux containers. Linux containers are a technology I don’t know enough about yet. I plan to learn more soon, but for the moment it’s enough to know that they enable all sorts of awesome. For example, once you have installed Docker and have the daemon running, you can do this:
sudo debootstrap saucy saucy > /dev/null sudo tar -C saucy -c . | sudo docker import - myimages:saucy
The first command uses debootstrap to download a minimal Ubuntu 13.10 image into a directory named saucy. It will take a little while to pull the files, but that’s as long as you’ll ever have to wait when building and using a Docker image.
The second command tars up the saucy install and feeds the tarball to docker’s import command, which causes it to create a new image and register it locally as myimages:saucy. Now that we have a base Ubuntu 13.10 image the following coolness becomes possible:
sudo docker run -i -t -name="saucy" -h="saucy" myimages:saucy /bin/bash root@saucy#
This command tells Docker to launch a container from the image we just created, with an interactive shell. The -name option will give the container the name “saucy,” which will be a convenient way to refer to it in future commands. The -h option causes Docker to assign the container the host name “saucy” as well. The bit at the end ‘/bin/bash’ tells Docker what command to run when the container launches. Hit enter on this and we’re running as root in a shell in a self-contained minimal install of Ubuntu. The coolest thing of all is that once we got past the initial image building step, starting a container from that image was literally as fast as starting Sublime Text. Maybe faster.
So we have our base image running in a container. Now what? Now we install stuff. Add in a bunch of things the minimal install doesn’t include, like man, nano, wget, and screen. Install Python and PIP. Create a project folder and install GIT. Whatever we want in our base environment. Once that is done we can exit the container by typing ‘exit’ or hitting control-d. Once back at the host system shell prompt the command:
sudo docker ps -a
…will show all the containers we’ve launched. Ordinarily the ps command only shows running containers, but the -a option causes it to show those that are stopped, like the one we just exited from. That container still has all of our changes. If we want to preserve those changes so that we can easily launch new containers with the same stuff inside, we can do this:
sudo docker commit saucy myimages:saucy-dev
This just tells Docker to commit the current state of the container saucy to a new image named myimages:saucy-dev. This image can now serve as our base development image, which can be launched in a couple of seconds anytime we want to start a new project or just try something out. I can’t overemphasize how much speed contributes to the usefulness of this tool. You can launch a new Docker container before mkvirtualenv can get through copying the base Python install for a new virtual environment. And it launches completely configured and ready to run commands in.
Given that, I found myself wondering just what use virtualenv is to me at this point? Unlike virtualenv Docker captures the complete state of the system. I can fully localize all dependencies on a “virtualization” platform that is as easy to use as a text editor in terms of speed and accessibility. Even better, I can create a “DockerFile” that describes all of the steps to create my custom image from a knwon base, and now any copy of Docker, running anywhere, can recreate my container environment from a version controlled script. That is way cool.
Now, the truth is, there are probably some valid reasons why the picture isn’t quite that rosy yet. The Docker people will be the first to tell you it is a very young tool, and in fact they have warnings about this splashed all over the website. There are issues with not running as root in the container. I haven’t been able to get .bashrc to run when launching a container for a non-root user. There are issues running screen due to the way the pseudo-tty is allocated. And if you’re doing a GUI app, then things may be more complicated still. All this stuff is being worked on and improved, and it’s very exciting to think of where this approachable container technology might take us.