I was building out a search server container with Elasticsearch 1.0.1 today, and I ran into one of those irritating little problems that I could have solved a lot faster if I had just looked more carefully at what was actually going on. One of the steps in the build clones our git repo, which includes config files that get copied to various places. While testing I added a new file and pushed it, then re-ran the build. Halfway through, a cp command failed with a stat error because it couldn’t find the file.
But, but, I had pushed it, and pulled the repo, so where the hell was it? Yesterday something similar had happened when building a logstash/redis container. One of the nice things about a Docker build is that it leaves the interim containers and images around until the end of the build (or forever if you don’t use the -rm=true option). So you can start up a container from the image left by the last successful build step and look around inside it. In yesterday’s case it turned out I was pushing to one branch and cloning from another.
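The "look around inside it" step can be sketched roughly like this. It assumes the build output was saved to a file and that each finished step prints a line of the form " ---> <image id>", which is how the classic builder reports it. The function name last_image_id and the file name build.log are made up for illustration, and /bin/bash is only there if the base image ships it.

```shell
# Sketch: pull the most recent image ID out of a saved build log.
# Assumes each finished step prints a line like " ---> 0a1b2c3d4e5f".
# Lines such as "---> Running in ..." or "---> Using cache" are skipped
# because they are not a bare hex ID.
last_image_id() {
  grep -E '^ *---> [0-9a-f]{12,}$' "$@" | tail -n 1 | awk '{ print $2 }'
}

# Hypothetical usage (build.log is a made-up file name; not run here):
#   sudo docker build - < DockerFile 2>&1 | tee build.log
#   sudo docker run -i -t "$(last_image_id build.log)" /bin/bash
```

When a step fails, the last ID in the log belongs to the step before it, so the shell opens in exactly the state the failing command saw.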
But that problem had been solved yesterday. Today’s problem was different, because I was definitely cloning the right branch. I took a closer look at the output from the Docker build, and where I expected to see…
Step 4 : RUN git clone blahblahblah.git
 ---> Running in 51c842191693
I instead saw…
Step 4 : RUN git clone blahblahblah.git
 ---> Using cache
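A marker like that is easy to miss in a long build log. As a rough sketch, a small filter can list every step that came out of the cache. It assumes the log format quoted in this post ("Step N : ..." followed by "---> Using cache"), and the function name cached_steps is made up.

```shell
# Sketch: print every build step that was served from the cache.
# Remembers the most recent "Step N : ..." line and prints it whenever a
# "---> Using cache" marker follows (or appears on the same line).
cached_steps() {
  awk '/^Step [0-9]/ { step = $0 } /---> Using cache/ { print step }' "$@"
}

# Hypothetical usage (not run here):
#   sudo docker build - < DockerFile 2>&1 | cached_steps
```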
Docker was treating the RUN command as if its result were deterministic and was reusing the interim image from the last time I ran the build. The cache lookup for a RUN step compares only the command string, not anything the command fetches, so an unchanged git clone line counts as a hit no matter what has been pushed since. Interestingly it did the same thing with a later wget command that downloaded an external package. I’m not sure how commands that pull data from outside sources could ever be considered deterministic, but whatever. The important thing is that you can add the -no-cache option (spelled --no-cache in newer Docker releases) to the build command to get Docker to ignore the cache.
sudo docker build -no-cache -rm=true - < DockerFile
Note that this applies to the whole build, so any other commands that really are deterministic won’t use the cache either. It would be nice to have an argument to the RUN command to do this on a per-step basis, but at least -no-cache makes sure all your RUN steps get evaluated every time you build.
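One rough way to approximate a per-step switch is to change the text of the one RUN line you care about on every build, since a command string that has changed can't match the cached one. This is a minimal sketch, not anything built into Docker: the @CACHEBUST@ token is a made-up placeholder you would put in the Dockerfile by hand, and render_dockerfile is a hypothetical helper name.

```shell
# Hypothetical helper: swap the made-up @CACHEBUST@ token for a fresh value so
# the RUN line's text differs from the previous build and misses the cache.
#   $1 = path to the Dockerfile template
#   $2 = value to insert (defaults to the current Unix time; must not contain "/")
render_dockerfile() {
  sed "s/@CACHEBUST@/${2:-$(date +%s)}/g" "$1"
}

# Template line, assuming the token sits in front of the clone as an argument
# to the shell no-op ":":
#   RUN : @CACHEBUST@ && git clone blahblahblah.git
#
# Usage with the stdin-style build from this post (not run here):
#   render_dockerfile DockerFile | sudo docker build -rm=true -
```

Steps above the edited line can still come from the cache. That line and everything after it get re-run, because each later step builds on a new parent image.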