My first semester of graduate school has ended and I’m one step closer to earning my Master’s in Data Science. Since the semester’s been over for a few days, I’ve been itching to learn something new. Docker’s been on my to-learn list for a while. Similar to Bitcoin and cryptocurrencies, Docker and containers in general are something the entire tech industry has been talking about for a few years but no one seems to understand. Or, at least, I didn’t.
While I’ve been a part of a lot of those conversations and even worked at organizations where coworkers were actively using Docker, I still had very little idea what problem Docker was solving and why it was solving it well enough to gain the massive following it has now.
I knew the following things, vaguely:
Using containers allows sandboxing of your applications and services.
Docker containers allow you to run your apps and services on multiple infrastructures while reducing deployment headaches.
While this isn’t technically untrue, it’s missing a lot of nuance. So today we’re going to briefly cover what Docker is. I’ve been writing a lot of Flask applications recently, so as an example I’ll create a Docker image of my open-source powerlifting application, SaltySplatoon.
What are containers?
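Containers are a form of virtualization. Each one gets its own isolated namespaces (its own view of processes, the filesystem, and the network) that are not shared with the rest of the operating system. Unlike a virtual machine, though, a container still shares the host operating system’s kernel, and the host can share some other resources and (depending on optimization) binaries as well.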
As a concrete example, we’ll be creating a container that packages python3, all the necessary libraries (Flask), and gunicorn.
What is Docker?
Docker provides a software suite that lets a user specify blueprints, in the form of Dockerfiles, that can be used to create lightweight containers easily. Docker didn’t invent containers, but it standardized them very nicely.
Packaging a Flask Application with Docker
All of the following commands were run on a fresh install of Ubuntu. In my case I used a virtual machine built from the 16.04.3 ISO running on VMware Fusion. The process should be essentially the same for all Debian-based distributions.
You’re going to need a requirements.txt for your project. The documentation tells you everything you need to know. In essence, it’s just a text file that contains:
The Python packages needed for your application
The exact versions of the packages.
You know when you’re writing code and everything is working on your system, then you give the code to a colleague or coworker and it’s broken all of a sudden? More often than not, this is because you have something installed that they don’t. Keeping requirements.txt up to date allows another user to run pip install -r requirements.txt and have all the packages needed to run your code.
This is best used when combined with virtual environments, but that’s a subject for another blog post.
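Since the whole point is exact versions, here is a minimal sketch of a check that flags any requirement that isn’t pinned with ==. The function name and the sample package versions are made up for illustration, and the pattern only handles simple name==version lines, not every form pip accepts.

```python
import re

# Matches simple pinned lines such as "Flask==0.12.2":
# a package name, optional extras in brackets, "==", then an exact version.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[A-Za-z0-9,._-]+\])?==[^=\s]+$")


def unpinned_requirements(text):
    """Return the requirement lines that do not pin an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options like "-r other.txt"
        if not PINNED.match(line):
            problems.append(line)
    return problems


# Hypothetical file contents, not the real SaltySplatoon requirements.
sample = """\
Flask==0.12.2
gunicorn
pymongo>=3.5
"""
print(unpinned_requirements(sample))
```

Anything the function returns is a package whose version could differ between your machine and your coworker’s.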
Installing Docker should be pretty straightforward. Run:
sudo apt install docker.io
and you should be good to go. The Docker daemon starts automatically.
The Dockerfile is essentially a text file that acts as a schematic for Docker to create an image.
After a little work, mine looks like this:
FROM ubuntu:latest
MAINTAINER Shane Caldwell "firstname.lastname@example.org"
ARG buildtime_mongo_user=default_value
ENV mongo_user=$buildtime_mongo_user
ARG buildtime_mongo_pass=default_value
ENV mongo_pass=$buildtime_mongo_pass
ARG buildtime_app_secret=default_value
ENV salty_appsecret=$buildtime_app_secret
EXPOSE 8080
RUN apt-get update -y
RUN apt-get install -y python-pip python-dev build-essential python3-pip gunicorn
COPY . /saltysplatoon
WORKDIR /saltysplatoon
RUN pip3 install -r requirements.txt
ENTRYPOINT ["python3"]
CMD ["salty.py"]
So what’s going on here?
FROM ubuntu:latest lets Docker know that the base image for the app is the latest Ubuntu image. This could be Debian, or Red Hat, or any number of different distributions.
The series of ARG and ENV statements are used to set environment variables for the image. Each ARG declares a build-time variable with a placeholder default, and the ENV statement after it copies that value into an environment variable inside the image. My app connects to a Mongo database hosted on mLab, and I don’t want to store the credentials directly in the Dockerfile. With the build command I use, buildtime_mongo_user and the other build arguments are overridden at build time with the actual values needed to connect to the database.
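On the application side, those ENV values show up as ordinary environment variables. Here is a minimal sketch of how a Flask app could pick them up. The variable names come from the Dockerfile above, but load_settings is a hypothetical helper and not necessarily how salty.py actually does it.

```python
import os


def load_settings(environ=os.environ):
    """Collect the three values the Dockerfile exports with ENV."""
    names = ("mongo_user", "mongo_pass", "salty_appsecret")
    # "default_value" is the ARG placeholder from the Dockerfile; seeing it here
    # means the matching --build-arg was never passed, so treat it as missing.
    missing = [n for n in names if environ.get(n) in (None, "", "default_value")]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {n: environ[n] for n in names}
```

Failing loudly at start-up is a design choice: it is easier to debug than a database connection that quietly tries to log in as “default_value”.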
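EXPOSE documents which port the container listens on. On its own it doesn’t publish anything: the port only becomes reachable from the host when it is published with -p at run time, which is when Docker sets up the forwarding (iptables) rules on the host.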
RUN allows running of shell commands when the image is built. In this case, we update the apt package index and then install the packages needed to run the application. We then COPY the current directory structure into a folder called “saltysplatoon”. Note that this assumes the Dockerfile is kept in the root directory of the GitHub project, which is standard in the projects I’ve seen.
We then pip3 install the requirements.txt we created earlier.
The entrypoint for our container is python3, and the command is just the file to be run, so the container starts by running python3 salty.py.
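The way ENTRYPOINT and CMD combine took me a minute to picture, so here is a small model of the rule for the exec (JSON array) form used above. container_argv is a made-up function for illustration, not part of Docker, and it leaves out the --entrypoint flag, which can replace the entrypoint as well.

```python
def container_argv(entrypoint, cmd, run_args=None):
    """Model the exec-form rule: ENTRYPOINT is fixed, CMD supplies default arguments.

    Arguments given after the image name in `docker run` replace CMD entirely.
    """
    return list(entrypoint) + list(run_args if run_args else cmd)


# Plain `docker run <image>`: the defaults from the Dockerfile are used.
print(container_argv(["python3"], ["salty.py"]))
# `docker run <image> -V`: CMD is replaced, the entrypoint stays.
print(container_argv(["python3"], ["salty.py"], ["-V"]))
```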
Boom! Pretty easy, right?
Creating your image
sudo docker build --build-arg buildtime_mongo_user=secret_username --build-arg buildtime_mongo_pass=secret_password --build-arg buildtime_app_secret=secret_appkey -t saltysplatoon_docker:latest .
As you might expect, the mongo_user is not actually “secret_username”. It’d be pretty bad opsec to publish those values after going through all the trouble to not hardcode them in the Dockerfile.
The --build-arg flag lets me pass these variables to the Dockerfile at build time.
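To keep the real values out of your shell history too, one option is to assemble the build command from environment variables on the build machine. This is only a sketch: build_command and the SALTY_* variable names are assumptions for illustration, while the --build-arg names and the tag match the command above.

```python
import os


def build_command(environ=os.environ, tag="saltysplatoon_docker:latest"):
    """Assemble the docker build invocation, reading secrets from the environment."""
    # Maps each --build-arg name from the Dockerfile to a hypothetical
    # environment variable holding the real value on the build machine.
    sources = {
        "buildtime_mongo_user": "SALTY_MONGO_USER",
        "buildtime_mongo_pass": "SALTY_MONGO_PASS",
        "buildtime_app_secret": "SALTY_APP_SECRET",
    }
    cmd = ["sudo", "docker", "build"]
    for arg_name, env_name in sources.items():
        # A KeyError here means the variable is not set on the build machine.
        cmd += ["--build-arg", f"{arg_name}={environ[env_name]}"]
    cmd += ["-t", tag, "."]
    return cmd
```

The function only builds the argument list; passing the result to subprocess.run(build_command(), check=True) would actually kick off the build.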
sudo docker run -d -p 8080:8080 saltysplatoon_docker
This starts a container from the image we created, “saltysplatoon_docker”, in the background (-d) and maps port 8080 in the container to port 8080 on our local host (-p 8080:8080). That’s it! We now have a single container running an instance of our application that we can access from our local machine.
Monitoring your image
A couple commands proved to be really useful while I was figuring out how to package and run my application correctly.
docker ps -a: Essentially works the same as ps on a regular Linux machine. This lets you see the Docker containers you’ve started and their status. This includes how long they’ve been running, whether they’ve exited, which image they’re based on, and so on.
docker logs -f [docker_instance]
This ended up being useful frequently. Looking at the logs makes it easy to figure out why something isn’t working as expected.
docker inspect [docker_instance]
docker inspect lets you get all the nitty-gritty details about the container. This includes environment variables, which was useful until I got the ARG/ENV pattern to work correctly.
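docker inspect prints a JSON array, with the environment stored under Config.Env as a list of NAME=value strings, so it’s easy to pull the variables out with a few lines of Python. A minimal sketch follows; env_from_inspect is a made-up helper, and the sample output is trimmed down to just the field we care about.

```python
import json


def env_from_inspect(inspect_output):
    """Return the environment variables from `docker inspect` JSON output as a dict."""
    containers = json.loads(inspect_output)  # docker inspect prints a JSON array
    env_list = containers[0]["Config"]["Env"] or []
    env = {}
    for entry in env_list:
        # Each entry is a "NAME=value" string; split on the first "=" only.
        name, _, value = entry.partition("=")
        env[name] = value
    return env


# Trimmed-down, hypothetical output for a container built without --build-arg.
sample = '[{"Config": {"Env": ["PATH=/usr/bin:/bin", "mongo_user=default_value"]}}]'
print(env_from_inspect(sample)["mongo_user"])
```

If a variable still shows default_value here, the matching --build-arg never made it into the build.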
This barely scratches the surface of what Docker is capable of, but it’s not a bad start! Next, we’ll learn what the security concerns are when it comes to containers.