Understanding Docker Storage

Every container has its own filesystem, but that filesystem is not what it appears to be. It is built from read-only image layers stacked together, with a thin writable layer on top. Understanding this architecture explains why containers are fast to start, why images share disk space, and why data disappears when a container is removed.

The Layer Architecture

flowchart TB
subgraph container["Container"]
W["Writable Layer\n(container-specific)"]
end

subgraph image["Image Layers (read-only)"]
L4["Layer 4: COPY app/ /app"]
L3["Layer 3: RUN npm install"]
L2["Layer 2: RUN apt-get update"]
L1["Layer 1: FROM ubuntu:22.04"]
end

W --> L4 --> L3 --> L2 --> L1

style W fill:#fff3e0,stroke:#ef6c00
style L4 fill:#e3f2fd,stroke:#1565c0
style L3 fill:#e3f2fd,stroke:#1565c0
style L2 fill:#e3f2fd,stroke:#1565c0
style L1 fill:#e3f2fd,stroke:#1565c0

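Dockerfile instructions that change the filesystem, such as RUN, COPY, and ADD, each create a new layer; instructions such as CMD and EXPOSE only add metadata. Layers are immutable and shared across all containers that use the same image.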

How Layers Work

| Property   | Image Layers                  | Writable Layer                    |
|------------|-------------------------------|-----------------------------------|
| Created by | Dockerfile instructions       | Container creation                |
| Writable?  | ❌ Read-only                  | ✅ Read-write                     |
| Shared?    | ✅ Across containers          | ❌ Per container                  |
| Persists?  | ✅ Until image is deleted     | ❌ Lost when container is removed |
| Size       | Usually large (OS, packages)  | Usually small (runtime changes)   |

Copy-on-Write (CoW)

When a container modifies a file from an image layer, Docker uses copy-on-write:

  1. The file is copied from the image layer to the writable layer
  2. The modification is applied to the copy
  3. The original file in the image layer is untouched

flowchart LR
subgraph before["Before modification"]
A["Image Layer\n/etc/nginx/nginx.conf (original)"]
B["Writable Layer\n(empty)"]
end

subgraph after["After modification"]
C["Image Layer\n/etc/nginx/nginx.conf (original, hidden)"]
D["Writable Layer\n/etc/nginx/nginx.conf (modified copy)"]
end

before -->|"Container edits file"| after

style A fill:#e3f2fd,stroke:#1565c0
style B fill:#fff3e0,stroke:#ef6c00
style C fill:#e3f2fd,stroke:#1565c0
style D fill:#fff3e0,stroke:#ef6c00

This is why modifying a large file inside a running container increases disk usage: with overlay2, the entire file is copied to the writable layer, even if only a few bytes change.
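
The copy-up idea can be imitated with two ordinary directories, no Docker required. The sketch below is only a toy model, not how overlay2 is implemented: `lower` stands in for a read-only image layer, `upper` for the writable layer, and `read_file` and `write_file` are hypothetical helper names.

```shell
#!/bin/sh
# Toy model of copy-on-write using plain directories (not real overlayfs).
# "lower" plays the read-only image layer, "upper" the writable layer.
work=$(mktemp -d)
lower="$work/lower"
upper="$work/upper"
mkdir -p "$lower/etc" "$upper"
echo "worker_processes 1;" > "$lower/etc/nginx.conf"

# Read: the writable layer wins if it holds a copy, otherwise fall through.
read_file() {
    if [ -e "$upper/$1" ]; then
        cat "$upper/$1"
    else
        cat "$lower/$1"
    fi
}

# Write: copy the whole file up on first modification, then edit the copy.
write_file() {
    mkdir -p "$(dirname "$upper/$1")"
    if [ ! -e "$upper/$1" ] && [ -e "$lower/$1" ]; then
        cp "$lower/$1" "$upper/$1"   # the copy-up step
    fi
    echo "$2" >> "$upper/$1"
}

write_file etc/nginx.conf "worker_connections 1024;"

echo "merged view:"
read_file etc/nginx.conf
echo "image layer (unchanged):"
cat "$lower/etc/nginx.conf"
# The temporary directory in $work can be deleted afterwards.
```

After the write, the merged view shows both lines while the file under `lower` still holds only the original one, which mirrors steps 1 to 3 above.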

Storage Drivers

Docker uses a storage driver to manage the layered filesystem. The default on modern Linux is overlay2:

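| Driver       | Status                  | Best For                        |
|--------------|-------------------------|---------------------------------|
| overlay2     | Default and recommended | All modern Linux distributions  |
| btrfs        | Supported               | Systems already using Btrfs     |
| zfs          | Supported               | Systems already using ZFS       |
| devicemapper | Deprecated              | Legacy systems only             |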

Check your current driver:

docker info --format '{{.Driver}}'
# overlay2

tip

Unless you have a specific reason to change the storage driver, stay with overlay2. It is the driver Docker recommends and the most widely supported.

Exploring Container Filesystem

View Layer Details

# See the layers in an image
docker image inspect nginx:alpine --format '{{json .RootFS.Layers}}' | python3 -m json.tool

# See image history (which instruction created each layer)
docker image history nginx:alpine

Output:

IMAGE          CREATED       CREATED BY                                SIZE
a1b2c3d4e5f6   2 weeks ago   CMD ["nginx" "-g" "daemon off;"]          0B
<missing>      2 weeks ago   EXPOSE map[80/tcp:{}]                     0B
<missing>      2 weeks ago   COPY docker-entrypoint.sh /               4.62kB
<missing>      2 weeks ago   RUN /bin/sh -c set -x && addgroup...      26.8MB
<missing>      3 weeks ago   /bin/sh -c #(nop) ADD file:...            7.38MB
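
Rows with a size of 0B are metadata-only steps that add no filesystem content. As a rough sketch, the history sizes can be piped through a small filter to count the rows that do carry data; `count_content_layers` is a hypothetical helper name, and the sample input simply repeats the sizes shown above.

```shell
#!/bin/sh
# Count history rows whose size is not 0B, i.e. steps that added files.
# Reads one size per line on standard input.
count_content_layers() {
    awk '$1 != "0B" { n++ } END { print n + 0 }'
}

# Intended use with a real image:
#   docker image history --format '{{.Size}}' nginx:alpine | count_content_layers

# Sample input taken from the SIZE column above.
printf '0B\n0B\n4.62kB\n26.8MB\n7.38MB\n' | count_content_layers
```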

Check Container's Writable Layer

# See what has changed in the writable layer
docker diff my-container

| Symbol | Meaning |
|--------|---------|
| A      | Added   |
| C      | Changed |
| D      | Deleted |
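
For a container with many changes, a summary can be easier to read than the full listing. The following is a minimal sketch that counts paths per symbol; `summarize_diff` is a hypothetical helper name, and the sample lines are made up to match the "SYMBOL PATH" shape that docker diff prints.

```shell
#!/bin/sh
# Count added (A), changed (C) and deleted (D) paths from docker diff
# output read on standard input.
summarize_diff() {
    awk '
        $1 == "A" { added++ }
        $1 == "C" { changed++ }
        $1 == "D" { deleted++ }
        END { printf "added=%d changed=%d deleted=%d\n", added, changed, deleted }
    '
}

# Intended use with a real container:
#   docker diff my-container | summarize_diff

# Made-up sample input for illustration.
printf 'C /etc\nA /etc/new.conf\nC /var/log\nD /tmp/old.txt\n' | summarize_diff
```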

Measure Container Size

# Show the size of the writable layer for running containers (add -a to include stopped ones)
docker ps -s

The SIZE column shows two values, for example 2B (virtual 133MB):

  • Size: Data in the container's writable layer only
  • Virtual size: Total (read-only image data + writable layer)
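
To use the two values separately, for example in a report, the column can be split with a small filter. This is a sketch under assumptions: `split_size` is a hypothetical helper, the `|` separator is an arbitrary choice passed to `--format`, and the sample line is invented.

```shell
#!/bin/sh
# Split "NAME|SIZE (virtual TOTAL)" lines into separate fields.
# Lines that do not match this shape are passed through unchanged.
split_size() {
    sed -E 's/^([^|]*)\|([^ ]+) \(virtual ([^)]+)\)$/\1 writable=\2 virtual=\3/'
}

# Intended use with real containers:
#   docker ps -s --format '{{.Names}}|{{.Size}}' | split_size

# Invented sample line for illustration.
printf 'web|2B (virtual 133MB)\n' | split_size
```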

Why Container Data Is Ephemeral

flowchart LR
A["Container Running\n(writable layer has data)"] -->|"docker rm"| B["Container Deleted\n(writable layer gone)"]
A -->|"docker stop + start"| C["Container Restarted\n(writable layer preserved)"]

style A fill:#e8f5e9,stroke:#2e7d32
style B fill:#ffebee,stroke:#c62828
style C fill:#e8f5e9,stroke:#2e7d32

Key facts:

  • docker stop + docker start → writable layer is preserved
  • docker rm → writable layer is deleted permanently
  • docker run (new container) → starts with a fresh writable layer

This means any data written inside the container (logs, database files, uploaded files) is lost when the container is removed. This is by design -- it ensures containers are reproducible and disposable.
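
The three key facts can be checked by hand. The sketch below wraps the steps in a function so nothing runs until it is called; the container name `layer-demo` and the file `/tmp/note.txt` are hypothetical choices, and it assumes a reachable Docker daemon and the nginx:alpine image.

```shell
#!/bin/sh
# Sketch: a file in the writable layer survives stop/start but not rm.
demo_ephemeral() {
    name=layer-demo

    docker run -d --name "$name" nginx:alpine >/dev/null
    docker exec "$name" sh -c 'echo hello > /tmp/note.txt'

    # Stop and start the same container: the file is still there.
    docker stop "$name" >/dev/null
    docker start "$name" >/dev/null
    docker exec "$name" cat /tmp/note.txt

    # Remove it and create a new container from the same image:
    # the new writable layer is empty, so the file is gone.
    docker rm -f "$name" >/dev/null
    docker run -d --name "$name" nginx:alpine >/dev/null
    docker exec "$name" cat /tmp/note.txt || echo "note.txt is gone"

    # Clean up the demo container.
    docker rm -f "$name" >/dev/null
}

# Run only where a Docker daemon is reachable:
#   demo_ephemeral
```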

danger

Never store important data in the container's writable layer. Use volumes or bind mounts for anything that must outlive the container.

Docker Storage Types Overview

Docker offers three ways to store data outside the container's writable layer:

flowchart TD
A["Docker Storage"] --> B["Volumes\n(Managed by Docker)"]
A --> C["Bind Mounts\n(Host path mapped)"]
A --> D["tmpfs Mounts\n(Memory only)"]

style B fill:#e8f5e9,stroke:#2e7d32
style C fill:#e3f2fd,stroke:#1565c0
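style D fill:#fff3e0,stroke:#ef6c00

| Type        | Location                 | Managed by Docker? | Persists on disk? | Best For                    |
|-------------|--------------------------|--------------------|-------------------|-----------------------------|
| Volumes     | /var/lib/docker/volumes/ | ✅ Yes             | ✅ Yes            | Databases, application data |
| Bind mounts | Anywhere on host         | ❌ No              | ✅ Yes            | Development, host configs   |
| tmpfs       | Memory (RAM)             | ❌ No              | ❌ No             | Secrets, temp scratch data  |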

Each type is covered in detail in the following lessons.
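
As a quick preview, all three types can be requested through the `--mount` flag of docker run. The sketch below only builds the flag values so the differences are easy to compare; `mount_arg` is a hypothetical helper, and the volume name, host path, and target paths are placeholders.

```shell
#!/bin/sh
# Build the --mount value for each storage type.
# Arguments: type, source (ignored for tmpfs), target path in the container.
mount_arg() {
    kind=$1 source=$2 target=$3
    case "$kind" in
        volume|bind) printf 'type=%s,source=%s,target=%s\n' "$kind" "$source" "$target" ;;
        tmpfs)       printf 'type=tmpfs,target=%s\n' "$target" ;;
        *)           echo "unknown mount type: $kind" >&2; return 1 ;;
    esac
}

mount_arg volume app-data /var/lib/data   # named volume managed by Docker
mount_arg bind "$PWD/config" /etc/app     # existing host directory
mount_arg tmpfs "" /run/scratch           # memory only, no source

# Intended use:
#   docker run --mount "$(mount_arg volume app-data /var/lib/data)" nginx:alpine
```

A tmpfs mount has no source because nothing on disk backs it, which is why its contents disappear when the container stops.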

Key Takeaways

  • Docker images are built from read-only layers stacked together. Containers add a thin writable layer on top.
  • The writable layer uses copy-on-write -- modifying an image file copies it to the writable layer first.
  • Data in the writable layer is ephemeral -- it is lost when the container is removed.
  • Use docker diff to see changes in the writable layer and docker ps -s to measure its size.
  • For persistent data, use volumes or bind mounts; use tmpfs for temporary data that should stay in memory. Never rely on the writable layer.

What's Next

  • Continue to Volume Management to learn how to create and manage persistent storage with Docker volumes.