Container Security Basics

By default, Docker containers run as root. If an attacker exploits a vulnerability in your application, they get root access inside the container and, if they can break out of it, potentially on the host. This lesson covers the three most impactful hardening controls.

The Three Core Controls

1. Non-root user (USER directive)
2. Drop capabilities (cap_drop: ALL)
3. Read-only filesystem (read_only: true)

Control 1: Run as Non-Root

In the Dockerfile

FROM node:18-alpine

# Create a non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# Set working directory and copy app
WORKDIR /app
COPY --chown=appuser:appgroup . .

# Switch to non-root user
USER appuser

CMD ["node", "server.js"]

At Runtime (Override)

docker run --user 1000:1000 my-app:1.0.0

In Compose

services:
  api:
    image: my-api:1.0.0
    user: "1000:1000"

Verify

docker exec my-app whoami
# Should NOT output: root
# (whoami fails if the UID has no passwd entry; "id -u" works either way and should not print 0)

docker inspect my-app --format '{{.Config.User}}'
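
As an additional guard, an entrypoint script can refuse to start when the process is running as UID 0, for example when someone overrides the image's USER with --user 0. The sketch below is an assumption layered on top of this lesson, not part of it: the function name refuse_root and its optional UID argument (there only to make the check easy to test) are hypothetical.

```shell
# refuse_root [UID]: fail if the given UID (default: the current user) is 0.
# The optional argument exists only so the check can be exercised without root.
refuse_root() {
    uid=${1:-$(id -u)}
    if [ "$uid" -eq 0 ]; then
        echo "refusing to run as root (uid 0)" >&2
        return 1
    fi
    return 0
}
```

In an entrypoint script this could be called as `refuse_root || exit 1` before handing off to the application with `exec node server.js`.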

Control 2: Drop Linux Capabilities

Linux capabilities grant specific privileges. Docker adds several by default (e.g., NET_RAW, CHOWN, SETUID). Drop all capabilities and add back only what is needed:

services:
  api:
    image: my-api:1.0.0
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE  # Only if binding to ports < 1024

| Capability | What It Permits | Usually Needed? |
|---|---|---|
| NET_BIND_SERVICE | Bind to ports below 1024 | Only for port 80/443 |
| CHOWN | Change file ownership | Rarely |
| SETUID / SETGID | Change user/group identity | Rarely |
| NET_RAW | Raw sockets (ping) | Rarely |
| SYS_ADMIN | Broad admin operations | Never (almost) |

Verify

docker inspect my-app --format '{{json .HostConfig.CapDrop}}'
docker inspect my-app --format '{{json .HostConfig.CapAdd}}'
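
Inside a container, the kernel reports each capability set as a hexadecimal bitmask in /proc/<pid>/status (CapBnd is the bounding set, CapEff the effective set). The following is a minimal sketch for testing one bit of such a mask. The function name has_cap is hypothetical; the bit numbers come from the Linux capability list (CHOWN is 0, NET_BIND_SERVICE is 10, NET_RAW is 13, SYS_ADMIN is 21); and the sample mask is a value commonly reported for Docker's default set, used here only for illustration.

```shell
# has_cap MASK BIT: succeed if capability number BIT is set in hex MASK.
has_cap() {
    mask=$((0x$1))
    [ $(( (mask >> $2) & 1 )) -eq 1 ]
}

# Illustrative mask; yours may differ by Docker version and configuration.
default_mask=00000000a80425fb
if has_cap "$default_mask" 13; then
    echo "NET_RAW is present"
fi
if ! has_cap "$default_mask" 21; then
    echo "SYS_ADMIN is absent"
fi
```

To read the live value, something like `docker exec my-app grep CapBnd /proc/1/status` should work if the image includes grep. With cap_drop: ALL and no cap_add, the mask is expected to be all zeros.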

Control 3: Read-Only Root Filesystem

Prevent attackers from writing malicious files to the container filesystem:

services:
  api:
    image: my-api:1.0.0
    read_only: true
    tmpfs:
      - /tmp            # App needs a writable temp directory
      - /run            # Some processes need /run
    volumes:
      - app-data:/data  # Persistent writable storage

volumes:
  app-data: {}          # Named volumes must be declared at the top level

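The read_only: true flag mounts the container's root filesystem read-only. Use tmpfs mounts for directories that need temporary write access (their contents live in memory and are lost when the container stops) and a volume for data that must persist.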

Verify

docker exec my-app touch /test-file
# Should fail: Read-only file system
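
The same check can be extended to confirm that the intended writable paths still work. This is a minimal sketch, assuming the image ships a POSIX shell; the function names is_writable and report_writable and the probe file name are hypothetical.

```shell
# is_writable DIR: succeed if a file can be created (and removed) in DIR.
is_writable() {
    probe="$1/.write-probe-$$"
    # The subshell keeps a failed redirection from aborting the caller.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# report_writable DIR...: print one status line per directory.
report_writable() {
    for dir in "$@"; do
        if is_writable "$dir"; then
            echo "$dir: writable"
        else
            echo "$dir: read-only or missing"
        fi
    done
}
```

Run inside the container, `report_writable / /tmp /run /data` should report / as not writable. The other three paths should be writable, provided the container user has permission on them.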

Avoid --privileged

--privileged gives the container nearly unrestricted access to the host. It grants all capabilities, exposes host devices, and disables the default seccomp and AppArmor profiles:

# NEVER do this unless absolutely necessary
docker run --privileged my-app

If a container needs specific kernel access, use --cap-add for the specific capability instead of --privileged.

Avoid Host Namespace Sharing

These flags weaken container isolation:

| Flag | What It Shares | Risk |
|---|---|---|
| --network host | Host network stack | Container sees all host traffic |
| --pid host | Host process namespace | Container can see/signal host processes |
| --ipc host | Host IPC namespace | Container can access host shared memory |
| -v /:/host | Entire host filesystem | Container has full host access |

Only use these when explicitly justified and with compensating controls.
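
As one possible compensating control, a wrapper or CI step can scan docker run arguments for the flags covered in this section and the previous one before executing them. This is a rough sketch, not a complete policy check: the function name check_run_flags is hypothetical, and it only recognizes the exact spellings listed in the code.

```shell
# check_run_flags ARG...: print each risky flag found in a docker run
# argument list. Returns 0 if none were found, 1 otherwise.
check_run_flags() {
    prev=
    found=0
    for arg in "$@"; do
        # Two-word forms, e.g. "--pid host" or "-v /:/host".
        case "$prev $arg" in
            "--network host"|"--pid host"|"--ipc host"|"-v /:"*|"--volume /:"*)
                echo "risky: $prev $arg"; found=1 ;;
        esac
        # Single-word forms, e.g. "--privileged" or "--pid=host".
        case "$arg" in
            --privileged|--network=host|--pid=host|--ipc=host)
                echo "risky: $arg"; found=1 ;;
        esac
        prev=$arg
    done
    [ "$found" -eq 0 ]
}
```

For example, `check_run_flags run --privileged my-app` prints `risky: --privileged` and returns a non-zero status, while `check_run_flags run --user 1000:1000 my-app:1.0.0` prints nothing and succeeds.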

Hardened Compose Template

Putting it all together:

services:
  api:
    image: my-api:1.0.0
    user: "1000:1000"
    read_only: true
    cap_drop: [ALL]
    security_opt:
      - no-new-privileges:true
    tmpfs:
      - /tmp
    restart: unless-stopped
    networks: [back]

networks:
  back: {}

The no-new-privileges flag prevents processes from gaining additional privileges through setuid binaries.
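
On Linux 4.10 and later, the kernel exposes this setting as the NoNewPrivs field in /proc/<pid>/status, which gives a way to confirm that it took effect. This is a minimal sketch; the function name no_new_privs_enabled and its optional file argument (used for testing) are hypothetical.

```shell
# no_new_privs_enabled [STATUS_FILE]: succeed if the NoNewPrivs field is 1.
# STATUS_FILE defaults to the status of PID 1 in the current PID namespace,
# which inside a container is the main process.
no_new_privs_enabled() {
    status_file=${1:-/proc/1/status}
    value=$(awk '/^NoNewPrivs:/ { print $2 }' "$status_file")
    [ "$value" = "1" ]
}
```

From the host, `docker exec my-app grep NoNewPrivs /proc/1/status` should show a value of 1 when the option is active, assuming the image includes grep.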

Quick Audit Commands

# Check what user a container runs as
docker inspect CONTAINER --format '{{.Config.User}}'

# Check capabilities
docker inspect CONTAINER --format '{{json .HostConfig.CapDrop}}'
docker inspect CONTAINER --format '{{json .HostConfig.CapAdd}}'

# Check if privileged
docker inspect CONTAINER --format '{{.HostConfig.Privileged}}'

# Check mounts
docker inspect CONTAINER --format '{{json .Mounts}}'
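
The individual commands above can be combined into a single pass/fail check. This is a minimal sketch under a few assumptions: the function name audit_container is hypothetical, an empty Config.User is treated as root, and the read-only check relies on the HostConfig.ReadonlyRootfs field of docker inspect output.

```shell
# audit_container NAME: print a FAIL line for each failed hardening check.
# Returns 0 if all checks pass, 1 otherwise.
audit_container() {
    name=$1
    failed=0

    user=$(docker inspect "$name" --format '{{.Config.User}}')
    case "$user" in
        ''|0|root|0:*|root:*)
            echo "FAIL: runs as root (user='$user')"; failed=1 ;;
    esac

    privileged=$(docker inspect "$name" --format '{{.HostConfig.Privileged}}')
    if [ "$privileged" = "true" ]; then
        echo "FAIL: privileged mode is enabled"; failed=1
    fi

    read_only=$(docker inspect "$name" --format '{{.HostConfig.ReadonlyRootfs}}')
    if [ "$read_only" != "true" ]; then
        echo "FAIL: root filesystem is writable"; failed=1
    fi

    cap_drop=$(docker inspect "$name" --format '{{json .HostConfig.CapDrop}}')
    case "$cap_drop" in
        *ALL*) ;;
        *) echo "FAIL: capabilities are not dropped with ALL"; failed=1 ;;
    esac

    return "$failed"
}
```

Calling `audit_container my-app` prints nothing and returns 0 for a container that passes all four checks, so it can gate a deployment script with `audit_container my-app || exit 1`.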

Key Takeaways

  • Run as non-root by default. Use USER in Dockerfile or user: in Compose.
  • Drop all capabilities with cap_drop: ALL and only add back what is needed.
  • Use read-only root filesystem with read_only: true. Mount tmpfs for writable temp directories.
  • Never use --privileged unless absolutely necessary. Use specific --cap-add instead.
  • Add no-new-privileges to prevent privilege escalation via setuid binaries.

What's Next

  • Continue to Image Security to learn how to secure your image supply chain.