Lint your Dockerfile with Hadolint
While reading the Top 20 Dockerfile best practices, I came across a tool called Hadolint.
Hadolint is written in Haskell and gives you a hand when writing your Dockerfiles, steering you toward best practices for container environments.
To see how the tool works, download it from the releases page and install it on your computer. If you are on Linux, execute the following:
wget https://github.com/hadolint/hadolint/releases/download/v1.23.0/hadolint-Linux-x86_64
chmod +x hadolint-Linux-x86_64
sudo mv hadolint-Linux-x86_64 /usr/local/bin/hadolint
And you are good to go.
So, to test how Hadolint works, consider the following Dockerfile for a FastAPI Python application:
FROM python:3.8-slim-buster
RUN mkdir /home/app
WORKDIR /home/app
RUN apt update && apt install -y libpq-dev gcc libc-dev
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY ./partner_api partner_api
ENTRYPOINT ["uvicorn", "partner_api.main:app", "--host", "0.0.0.0"]
Looks like a good Dockerfile, yes?
Let's run Hadolint:
hadolint Dockerfile
Dockerfile:7 DL3027 warning: Do not use apt as it is meant to be a end-user tool, use apt-get or apt-cache instead
Dockerfile:10 DL3042 warning: Avoid use of cache directory with pip. Use `pip install --no-cache-dir <package>`
Hmm. I'm supposed to use apt-get instead of apt and pass --no-cache-dir to pip. Let's fix that:
RUN apt-get update && apt-get install -y libpq-dev gcc libc-dev
...
RUN pip install --no-cache-dir -r requirements.txt
Let's run Hadolint again:
Dockerfile:7 DL3008 warning: Pin versions in apt get install. Instead of `apt-get install <package>` use `apt-get install <package>=<version>`
Dockerfile:7 DL3009 info: Delete the apt-get lists after installing something
Dockerfile:7 DL3015 info: Avoid additional packages by specifying `--no-install-recommends`
More apt fixes... In this case I will skip the version pinning, add --no-install-recommends, delete the apt lists, and uninstall the libs used to compile psycopg2 after pip install.
RUN apt-get update && \
    apt-get install --no-install-recommends -y libpq-dev gcc libc-dev && \
    rm -rf /var/lib/apt/lists/*
...
RUN apt-get autoremove -y gcc libc-dev
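One caveat about the separate autoremove step: image layers are additive, so files deleted by a later RUN instruction generally still take up space in the layer that installed them. The following is a minimal sketch of a variant that installs the build tools, runs pip, and removes the tools again inside a single RUN. It assumes requirements.txt is copied before that step, and the scratch directory is only there for illustration.

```shell
#!/bin/sh
# Sketch: write a single-layer variant of the install step to a scratch directory.
# The scratch location is an assumption for illustration, not part of the project.
scratch="$(mktemp -d)"
cat > "$scratch/Dockerfile" <<'EOF'
FROM python:3.8-slim-buster
WORKDIR /home/app
COPY requirements.txt .
# Install the build tools, compile the Python dependencies, then remove the
# tools again, all within one layer.
RUN apt-get update && \
    apt-get install --no-install-recommends -y libpq-dev gcc libc-dev && \
    pip install --no-cache-dir -r requirements.txt && \
    apt-get autoremove -y gcc libc-dev && \
    rm -rf /var/lib/apt/lists/*
EOF

# Lint the variant when Hadolint is installed; "|| true" keeps the script
# going even if Hadolint reports findings such as DL3008.
if command -v hadolint >/dev/null 2>&1; then
    hadolint "$scratch/Dockerfile" || true
fi
echo "wrote $scratch/Dockerfile"
```

Whether this actually shrinks your image depends on the packages involved, so compare the sizes with docker images before adopting it.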
Now, Hadolint doesn't warn us about running the container as root, but following the tips from the post linked at the beginning of this article, let's set up a user:
FROM python:3.8-slim-buster
RUN mkdir /home/app
RUN useradd app-user && chown -R app-user /home/app
WORKDIR /home/app
RUN apt-get update && \
    apt-get install --no-install-recommends -y libpq-dev gcc libc-dev && \
    rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN apt-get autoremove -y gcc libc-dev
COPY ./partner_api partner_api
USER app-user
ENTRYPOINT ["uvicorn", "partner_api.main:app", "--host", "0.0.0.0"]
Let's run Hadolint one last time:
Dockerfile:9 DL3008 warning: Pin versions in apt get install. Instead of `apt-get install <package>` use `apt-get install <package>=<version>`
Well, no new warnings after adding our user.
I was unsure about pinning the versions of the installed packages, because at some point those versions might no longer be available in future versions of the Python base image. Do you have any suggestions? Leave them in the comments!
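If you decide that unpinned apt packages are an acceptable trade-off, Hadolint can be told to stop reporting DL3008. The sketch below shows the two mechanisms I am aware of: an inline ignore comment placed directly above the instruction, and the --ignore command-line option. The scratch directory and the trimmed Dockerfile are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: silence DL3008 instead of pinning apt versions.
# Assumption: unpinned apt packages are an accepted trade-off for this image.
workdir="$(mktemp -d)"
cat > "$workdir/Dockerfile" <<'EOF'
FROM python:3.8-slim-buster
# hadolint ignore=DL3008
RUN apt-get update && \
    apt-get install --no-install-recommends -y libpq-dev gcc libc-dev && \
    rm -rf /var/lib/apt/lists/*
EOF

# Option 1: the inline comment above silences DL3008 for that one RUN instruction.
# Option 2: ignore the rule for the whole file from the command line.
if command -v hadolint >/dev/null 2>&1; then
    # "|| true" keeps the script going if other rules still report findings.
    hadolint --ignore DL3008 "$workdir/Dockerfile" || true
else
    echo "hadolint not found; skipping lint"
fi
```

The inline comment keeps the decision visible next to the instruction it applies to, while the command-line option is handier when the rule should be off for every Dockerfile in a project.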
Check out how to integrate Hadolint into your favorite CI/CD tool here.
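For a CI job that has no dedicated Hadolint integration, a plain shell step may be enough. This is a minimal sketch, not an official recipe: lint_dockerfile is a hypothetical helper name, and it assumes the runner has either the hadolint binary or Docker available. It relies on Hadolint exiting with a non-zero status when it reports findings, which is what makes the CI step fail.

```shell
#!/bin/sh
# lint_dockerfile is a hypothetical helper name, not part of Hadolint.
# It lints the given file (default: ./Dockerfile) and passes through
# Hadolint's exit status so a CI step can fail on findings.
lint_dockerfile() {
    file="${1:-Dockerfile}"
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 2
    fi
    if command -v hadolint >/dev/null 2>&1; then
        # Prefer a locally installed binary.
        hadolint "$file"
    elif command -v docker >/dev/null 2>&1; then
        # The hadolint/hadolint image reads the Dockerfile from standard input.
        docker run --rm -i hadolint/hadolint < "$file"
    else
        echo "neither hadolint nor docker is installed" >&2
        return 127
    fi
}
```

In a pipeline step this could be called as lint_dockerfile Dockerfile || exit 1, so that any reported finding stops the build.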
Note: Hadolint is also available online here.
Size of the Images
The image built with the initial Dockerfile had a size of 371MB, and the one built with the final version 339MB. That is 32MB less, which doesn't look like much, but our final image is also safer than the initial one.
Docker Build Context
Do not forget to create a .dockerignore file to speed up the process of building your image. When you run docker build -t my-image ., all the files within the folder that contains your Dockerfile are sent to the build context, including hidden files and folders like .git.
Build context size without .dockerignore: 433.6MB, time spent until finish: 3 minutes and 38 seconds.
Build context size with .dockerignore: 81.92kB, time spent until finish: 1 minute and 42 seconds.
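To decide what belongs in a .dockerignore, it can help to see which top-level entries make the context so large. The following is a rough sketch under a few assumptions: context_hogs is a hypothetical helper name, it relies on find supporting -mindepth and -maxdepth (GNU and BSD find do), and it only approximates what Docker would send because it knows nothing about existing ignore rules.

```shell
#!/bin/sh
# context_hogs is a hypothetical helper: it prints the size in kilobytes of
# each top-level entry of a directory (hidden ones included), largest first.
# Usage: context_hogs [directory] [number_of_entries]
context_hogs() {
    dir="${1:-.}"
    # find lists the direct children only; du -sk sums each one in kilobytes.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + \
        | sort -rn \
        | head -n "${2:-10}"
}
```

Running context_hogs . 5 in the project root lists the five largest entries, which are usually the first candidates for the ignore file.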
Here is the .dockerignore that I'm using:
__pycache__
*.pyc
*.pyo
*.pyd
.Python
env
pip-log.txt
pip-delete-this-directory.txt
.tox
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
*.log
.git
tests
.venv
That's all folks!