How to write a secure container

Christopher Simpson

Jun 21, 2022 • 4 min read

So, how do you write a secure container?

Rule 1: Know where your data is coming from

Common things to avoid:

Pulling images from public registries
Not knowing your dependencies

Solution: Maintain your own copy of your dependencies, and keep it up to date as new patches are released to address security issues.

Reality: This is too much time/cost/resource for most organisations.

Most accept a higher level of risk and instead start with a trusted base image (e.g. the "official" signed python3 Docker image from Docker Hub, or a similar one from quay.io).

With that approach, you then build your container from the base image. You have more work to do than with a pre-made container, but you gain a better understanding of what it's 'made of'.

Contrast that with a pre-built container, for example the FastAPI uvicorn-gunicorn container: it makes a lot of assumptions about your production environment which you should be making yourself. The maintainer very kindly points out "WARNING: You Probably Don't Need this Docker Image", which is exactly the wider point I'm trying to make here: build your own containers!

Gotcha! Do you know what's in your project requirements / dependencies?

What's in your requirements.txt, package.json or pom.xml (Java)?

Do you know where they came from?

Are the projects dead/unmaintained?

They might be safe now, but what about in 3 months?

This is another argument for keeping and maintaining a copy of your dependencies and keeping them up to date (see rule 1). The advice is getting old, but pulling packages randomly from the internet is always a bad idea: they may not even exist when you want them or, worse, they may be malicious.
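One small, partial way to start answering "do you know what's in your requirements?" is to refuse anything that isn't pinned to an exact version. The sketch below assumes a pip-style requirements.txt. The function name find_unpinned is made up for illustration, and it only looks for exact == pins. It says nothing about hashes, where a package came from, or whether the project is still maintained.

```python
import re

# A line counts as pinned only if it names an exact version, e.g. "flask==2.1.2".
PINNED = re.compile(r"^[A-Za-z0-9._\[\],-]+\s*==\s*[^\s;]+")


def find_unpinned(requirements_text):
    """Return the requirement lines that do not pin an exact version.

    Illustrative helper (not part of pip): comments, blank lines and
    pip option lines such as "-r other.txt" or "--hash=..." are skipped.
    """
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


sample = "flask==2.1.2\nrequests>=2.0\nuvicorn\n"
print(find_unpinned(sample))  # the two entries without an exact pin
```

Run something like this in CI and fail the build when the list isn't empty. It won't tell you a package is safe, only that you at least know which version you're shipping.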

What not to do

Don't use some random person's base image you found on Docker Hub. Do the extra work and create your own container so you know what's in it. It's a pain, it's effort, but you'll sleep better at night.
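If you want to enforce that rather than rely on good intentions, a rough check over your Dockerfile's FROM lines goes a long way. This is only a sketch: audit_base_images and the "trusted" prefixes passed to it are my own hypothetical names, and what counts as trusted is a decision for you and your organisation. It flags base images outside your allowlist and ones that aren't pinned to a digest.

```python
def audit_base_images(dockerfile_text, trusted_prefixes):
    """Return (image, problem) pairs for FROM lines that look risky.

    trusted_prefixes is a hypothetical allowlist you maintain yourself,
    e.g. ["python:", "registry.example.internal/"].
    """
    findings = []
    stage_names = set()
    for raw in dockerfile_text.splitlines():
        parts = raw.strip().split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Skip flags such as --platform=linux/amd64 that may precede the image.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        image = args[0]
        # "FROM image AS name" declares a build stage that later FROMs may reuse.
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2])
        if image == "scratch" or image in stage_names:
            continue
        if not image.startswith(tuple(trusted_prefixes)):
            findings.append((image, "not from a trusted source"))
        if "@sha256:" not in image:
            findings.append((image, "not pinned to a digest"))
    return findings


dockerfile = """\
FROM python:3.10-slim AS build
RUN pip install flask
FROM randomperson/python-thing:latest
"""
for image, problem in audit_base_images(dockerfile, ["python:"]):
    print(image, "->", problem)
```

Pinning to a digest means the image you reviewed is the image you get, even if someone later pushes something different under the same tag.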

Rule 2: Use a real web server

This applies if yours is a web container serving HTTP(S).

Common issues:

Using a "development server", e.g. a Node Express server set up for local development, Flask's run command, or Django's built-in web server.

These 'built-in' servers are meant for local development*: they will typically fall over under real traffic and are not built with security in mind.

*There is a growing opinion (and a valid one, imho) that using such 'local' development servers can be counterproductive anyway. Why? Because you want to be as close to production as possible, and these 'play' web servers often have bugs or features which differ from production. It pays dividends to use something like Docker with docker-compose, or podman-compose, to run a production-like environment: you'll catch production-like bugs quicker and, crucially, be better able to reproduce them when you do have a production issue.

If you hit an issue which you can only reproduce in production (it happens), it's helpful to keep your local setup as close to production as possible, and a 'local development' web server is just one more difference which you could do without.

Solution: Do the extra work to add httpd/Apache, lighttpd, nginx or similar to your container.

Getting advanced

Randomise the container user's user id

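Did you know that, by default, a user id inside your container is the same user id on your host machine? Unless user namespaces are remapped, there's no isolation between the two: the operating system sees them as the same user. A process running as root (or as your own user id) in the container therefore has that user's permissions over any host files it can reach, for example through a bind mount or a container escape. In the worst case that means deleting your entire host filesystem.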

Whilst there's work under way to masquerade this, it is one reason why OpenShift randomises the user id of the container user when it starts, always to a user id above 10,000. This is to try to guarantee that the user id in the container is never the same as one on the host.

We wrote about "Building Non Root Docker Images OpenShift" in 2019 (so it will be out of date; treat accordingly), covering ways to create images which deal with the randomisation of the user id.

Reality: This is a surprising nightmare for most developers the first time they experience it, because practically zero Docker Hub images will run with a random user id. It was for me at least, hence the article.

Solution: If you're invested in this, seek out examples of how to add users and correct permissions during container startup.
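Before fixing anything, it helps to see what your container actually looks like when it starts under a user id it wasn't built for. Here's a minimal diagnostic sketch you could run at startup. The function name runtime_user_report and the directories you pass in are assumptions for illustration; it only reports problems, it doesn't add users or change permissions.

```python
import os
import pwd


def runtime_user_report(required_dirs):
    """Describe the user this process runs as and what it cannot write to.

    Illustrative helper: required_dirs is whatever your app needs to
    write to at runtime (hypothetical examples: "/app/data", "/tmp").
    """
    uid = os.getuid()
    try:
        username = pwd.getpwuid(uid).pw_name
    except KeyError:
        # Typical for a randomised uid: there is no /etc/passwd entry for it.
        username = None
    not_writable = [d for d in required_dirs if not os.access(d, os.W_OK)]
    return {
        "uid": uid,
        "username": username,
        "is_root": uid == 0,
        "not_writable": not_writable,
    }


print(runtime_user_report(["/tmp", "/app/data"]))
```

If username comes back as None or not_writable isn't empty, that's usually what's behind the "works on my machine, dies on the cluster" failures: tools that look up the current user fall over, and the app can't write where it expects to.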

Go distroless

Distroless containers are base images with an extremely minimal footprint (e.g. they don't even have a shell, cat or ls installed). It's very hard to hack software which doesn't exist, so it's very effective.

This approach also has cost and speed benefits because the images are (usually) much smaller: they cost less to store and to move across networks (and cloud providers love to charge for traffic between their internal networks).

Search for the distroless project and see if it works for you, especially if you're writing in Go, which seems to be better supported (no surprise, since distroless is a Google project).

Be warned though: other distroless containers, like the Python one, are less well maintained simply because less development effort and community exists for them, so you may actually end up less secure (last I checked, the Python distroless container was based on a rather outdated Debian image). This is an example of a distroless Python Flask application.

Other nice things to 'secure' your container (are we done yet?)

Use multi-stage builds. They're a nice way of using all your build tools to build your app in one container, then throwing them all away and copying only your built/compiled program into a new container (you don't need the build tools, so why ship them? Our non-root Docker images article is an example of multi-stage builds).

You need to apt-get upgrade / yum update your containers, did you know?

Containers are based on base images, and often that's Debian or similar. Just like your operating system, the containers you're using also need to be updated and patched, not just your dependencies. I'm embarrassed to admit how non-obvious this was to me when I started in containerland oh so long ago.

Let someone else do it

There are a lot of automatic scanning tools for containers now: some open source, some which say they're open source, and others. Have a look.

The thing is, none of them know your application or your attitude to risk, so you'll still want to stick to good housekeeping:

Have a disaster recovery plan
You've tested it recently, right?
Healthcheck your applications
Log aggressively, remember storage is cheap
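
Healthchecking doesn't need much. Here's a minimal sketch in Python using only the standard library, which is handy in slim images that don't ship curl. The /healthz path, port 8000 and the function names are my assumptions for illustration, not something your framework gives you for free.

```python
import urllib.request

# Hypothetical default; use whatever endpoint your application exposes.
DEFAULT_URL = "http://127.0.0.1:8000/healthz"


def is_healthy(url, timeout=2.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except Exception:
        # Refused connections, timeouts, HTTP error statuses and malformed
        # URLs all count as "not healthy" for the purposes of a probe.
        return False


def main(argv):
    """Return a process exit code: 0 when healthy, 1 otherwise."""
    url = argv[1] if len(argv) > 1 else DEFAULT_URL
    return 0 if is_healthy(url) else 1
```

End the script with raise SystemExit(main(sys.argv)) (after importing sys) and you can point a Dockerfile HEALTHCHECK instruction or an orchestrator probe at it: exit code 0 means healthy, 1 means not.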