Container Security Basics

By default, processes in a Docker container run as root. If an attacker exploits a vulnerability in your application, they gain root inside the container and, if they can escape it, potentially on the host. This lesson covers the three most impactful hardening controls.

The Three Core Controls

flowchart LR
    A["1. Non-Root User<br/>USER directive"] --> B["2. Drop Capabilities<br/>cap_drop: ALL"]
    B --> C["3. Read-Only Filesystem<br/>read_only: true"]

    style A fill:#e8f5e9,stroke:#2e7d32
    style B fill:#e3f2fd,stroke:#1565c0
    style C fill:#fff3e0,stroke:#ef6c00

Control 1: Run as Non-Root

In the Dockerfile

FROM node:18-alpine

# Create a non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# Set working directory and copy app
WORKDIR /app
COPY --chown=appuser:appgroup . .

# Switch to non-root user
USER appuser

CMD ["node", "server.js"]

At Runtime (Override)

docker run --user 1000:1000 my-app:1.0.0

In Compose

services:
  api:
    image: my-api:1.0.0
    user: "1000:1000"

Verify

docker exec my-app id -u
# Should NOT output: 0 (the UID of root)

docker inspect my-app --format '{{.Config.User}}'
# Empty output means no user was set, so the container runs as root
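
The Config.User value can take several forms (empty, a name, a UID, or user:group), so a scripted check has to normalize it first. A minimal sketch, assuming a hypothetical helper named is_root_user that is not part of Docker:

```shell
# Hypothetical helper: succeeds when a Config.User value means "root".
# Accepts the raw string printed by: docker inspect --format '{{.Config.User}}'
is_root_user() {
    user="${1%%:*}"             # keep the user part of "user:group"
    case "$user" in
        ""|0|root) return 0 ;;  # empty: no user was set, so the process runs as root
        *)         return 1 ;;
    esac
}

# Usage (requires a running container named my-app):
#   if is_root_user "$(docker inspect my-app --format '{{.Config.User}}')"; then
#       echo "my-app runs as root"
#   fi
```

This only inspects the configured string. A named user that maps to UID 0 inside the image would not be caught, so the id -u check inside the container remains the more reliable test.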

Control 2: Drop Linux Capabilities

Linux capabilities split root's privileges into separate units that can be granted or removed individually. Docker grants a default set to every container (including NET_RAW, CHOWN, and SETUID). Drop all capabilities and add back only what is needed:

services:
  api:
    image: my-api:1.0.0
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE   # Only if binding to ports < 1024

Capability	What It Permits	Usually Needed?
`NET_BIND_SERVICE`	Bind to ports below 1024	Only when serving on ports such as 80 or 443
`CHOWN`	Change file ownership	Rarely
`SETUID` / `SETGID`	Change user/group identity	Rarely
`NET_RAW`	Raw sockets (ping)	Rarely
`SYS_ADMIN`	Broad admin operations	Almost never

Verify

docker inspect my-app --format '{{json .HostConfig.CapDrop}}'
docker inspect my-app --format '{{json .HostConfig.CapAdd}}'
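
The inspect output shows what was requested. The kernel's view is also available inside the container: /proc/1/status reports the capability bounding set as a hexadecimal mask on the CapBnd line. The sketch below decodes one bit of such a mask. The helper name has_cap is hypothetical, and the bit numbers (NET_BIND_SERVICE = 10, NET_RAW = 13, SYS_ADMIN = 21) are taken from the Linux capability definitions.

```shell
# Hypothetical helper: has_cap MASK BIT
# Succeeds when bit number BIT is set in the hexadecimal mask MASK.
has_cap() {
    mask=$((0x$1))
    [ $(( (mask >> $2) & 1 )) -eq 1 ]
}

# Usage (requires a running container named my-app whose image provides awk):
#   mask=$(docker exec my-app awk '/^CapBnd:/ {print $2}' /proc/1/status)
#   has_cap "$mask" 13 && echo "NET_RAW is still available"
#   has_cap "$mask" 21 && echo "SYS_ADMIN is still available"
```

With cap_drop: ALL and only NET_BIND_SERVICE added back, the mask should have just bit 10 set.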

Control 3: Read-Only Root Filesystem

Prevent attackers from writing malicious files to the container filesystem:

services:
  api:
    image: my-api:1.0.0
    read_only: true
    tmpfs:
      - /tmp        # App needs a writable temp directory
      - /run        # Some processes need /run
    volumes:
      - app-data:/data   # Persistent writable storage

volumes:
  app-data: {}

The read_only: true flag mounts the container's root filesystem read-only. Paths backed by tmpfs mounts or volumes stay writable: use tmpfs for temporary files and a volume for data that must persist.

Verify

docker exec my-app touch /test-file
# Should fail: Read-only file system
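
To confirm that only the intended paths are writable, the same probe can be repeated per directory. A minimal sketch, assuming a hypothetical helper named is_writable and a hypothetical script file check-writable.sh; neither comes from Docker:

```shell
# Hypothetical helper: is_writable DIR
# Succeeds when a file can be created in DIR; the probe file is removed again.
is_writable() {
    probe="$1/.write-probe-$$"
    # The subshell keeps a failed redirection from aborting the calling shell.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Usage: save the function plus a loop such as the following as
# check-writable.sh, then run it inside the container:
#   for dir in / /tmp /run /data; do
#       if is_writable "$dir"; then echo "$dir: writable"; else echo "$dir: read-only"; fi
#   done
#
#   docker exec -i my-app sh < check-writable.sh
```

With the Compose settings above, / should be reported as read-only while /tmp, /run, and /data remain writable, provided the container user has permission to write to them.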

Avoid `--privileged`

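--privileged gives the container nearly unrestricted access to the host. It grants every capability, exposes the host's devices, and turns off the default seccomp and AppArmor/SELinux confinement: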

# NEVER do this unless absolutely necessary
docker run --privileged my-app

If a container needs specific kernel access, use --cap-add for the specific capability instead of --privileged.

Avoid Host Namespace Sharing

The following flags weaken container isolation by sharing host namespaces or the host filesystem:

Flag	What It Shares	Risk
`--network host`	Host network stack	Container sees all host traffic
`--pid host`	Host process namespace	Container can see/signal host processes
`--ipc host`	Host IPC namespace	Container can access host shared memory
`-v /:/host`	Entire host filesystem	Container has full host access

Only use these when explicitly justified and with compensating controls.
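
Namespace sharing is visible in docker inspect through the HostConfig.NetworkMode, HostConfig.PidMode, and HostConfig.IpcMode fields, each of which reads host when the corresponding namespace is shared. A minimal sketch of a check, assuming a hypothetical helper named report_host_namespaces:

```shell
# Hypothetical helper: report_host_namespaces NETWORK_MODE PID_MODE IPC_MODE
# Prints one line per namespace that is shared with the host.
report_host_namespaces() {
    [ "$1" = "host" ] && echo "network"
    [ "$2" = "host" ] && echo "pid"
    [ "$3" = "host" ] && echo "ipc"
    return 0
}

# Usage (requires a running container named my-app):
#   report_host_namespaces \
#       "$(docker inspect my-app --format '{{.HostConfig.NetworkMode}}')" \
#       "$(docker inspect my-app --format '{{.HostConfig.PidMode}}')" \
#       "$(docker inspect my-app --format '{{.HostConfig.IpcMode}}')"
```

An empty result means none of the three namespaces is shared. Host filesystem mounts are not covered here and still need a look at the Mounts field.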

Hardened Compose Template

Putting it all together:

services:
  api:
    image: my-api:1.0.0
    user: "1000:1000"
    read_only: true
    cap_drop: [ALL]
    security_opt:
      - no-new-privileges:true
    tmpfs:
      - /tmp
    restart: unless-stopped
    networks: [back]

networks:
  back: {}

The no-new-privileges flag prevents processes from gaining additional privileges through setuid binaries.
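
Whether the flag took effect can be read from the kernel: on reasonably recent Linux kernels, /proc/<pid>/status contains a NoNewPrivs line that reads 1 when the restriction is active. A minimal sketch, assuming a hypothetical helper named no_new_privs_set that reads the status text on standard input:

```shell
# Hypothetical helper: reads /proc/<pid>/status text on stdin and
# succeeds when the NoNewPrivs field is 1.
no_new_privs_set() {
    [ "$(awk '/^NoNewPrivs:/ {print $2}')" = "1" ]
}

# Usage (requires a running container named my-app whose image provides cat):
#   if docker exec my-app cat /proc/1/status | no_new_privs_set; then
#       echo "no-new-privileges is active"
#   fi
```

Parsing on the host side keeps the requirements inside the container to a single cat call.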

Quick Audit Commands

# Check what user a container runs as
docker inspect CONTAINER --format '{{.Config.User}}'

# Check capabilities
docker inspect CONTAINER --format '{{json .HostConfig.CapDrop}}'
docker inspect CONTAINER --format '{{json .HostConfig.CapAdd}}'

# Check if privileged
docker inspect CONTAINER --format '{{.HostConfig.Privileged}}'

# Check mounts
docker inspect CONTAINER --format '{{json .Mounts}}'
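
These checks can be combined into one pass over all running containers. The sketch below is one possible shape, not a complete audit: the helper name audit_findings and the pipe-separated input format are assumptions made for this example, while HostConfig.ReadonlyRootfs is the inspect field behind read_only: true.

```shell
# Hypothetical helper: audit_findings "USER|PRIVILEGED|READONLY_ROOTFS"
# Prints one finding per line; prints nothing when all three checks pass.
audit_findings() {
    user="${1%%|*}"                  # first field
    rest="${1#*|}"
    privileged="${rest%%|*}"         # second field
    readonly_rootfs="${rest#*|}"     # third field

    case "${user%%:*}" in
        ""|0|root) echo "runs as root" ;;
    esac
    [ "$privileged" = "true" ] && echo "privileged"
    [ "$readonly_rootfs" = "false" ] && echo "writable root filesystem"
    return 0
}

# Usage (requires Docker and at least one running container):
#   fmt='{{.Config.User}}|{{.HostConfig.Privileged}}|{{.HostConfig.ReadonlyRootfs}}'
#   for c in $(docker ps -q); do
#       echo "== $c"
#       audit_findings "$(docker inspect "$c" --format "$fmt")"
#   done
```

A pipe is used as the separator because Config.User may be empty, which would shift the fields if they were split on whitespace.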

Key Takeaways

Run as non-root by default. Use USER in Dockerfile or user: in Compose.
Drop all capabilities with cap_drop: ALL and only add back what is needed.
Use read-only root filesystem with read_only: true. Mount tmpfs for writable temp directories.
Avoid --privileged unless it is absolutely necessary. Grant the specific capability with --cap-add instead.
Add no-new-privileges to prevent privilege escalation via setuid binaries.

What's Next

Continue to Image Security to learn how to secure your image supply chain.
