Data Persistence Strategies

Knowing how volumes, bind mounts, and tmpfs work (covered in Volume Management) is the foundation. This lesson focuses on when to use each type and how to design storage patterns that handle permissions, multi-container sharing, and environment differences correctly.

Choosing the Right Storage Type

flowchart TD
A["Need to persist data?"] -->|"No"| B["Container writable layer\n(default, ephemeral)"]
A -->|"Yes"| C{"Data sensitivity?"}
C -->|"Secrets / tokens"| D["tmpfs mount\n(RAM only)"]
C -->|"Normal data"| E{"Who manages the path?"}
E -->|"Docker manages it"| F["Named volume"]
E -->|"I need a specific host path"| G{"Read-only?"}
G -->|"Yes"| H["Bind mount (:ro)"]
G -->|"No"| I["Bind mount"]

style B fill:#f5f5f5,stroke:#9e9e9e
style D fill:#fff3e0,stroke:#ef6c00
style F fill:#e8f5e9,stroke:#2e7d32
style H fill:#e3f2fd,stroke:#1565c0
style I fill:#e3f2fd,stroke:#1565c0

Quick Decision Table

| Use Case | Storage Type | Reason |
| --- | --- | --- |
| Database data (PostgreSQL, MySQL) | Named volume | Managed by Docker, portable, survives container removal |
| Application uploads / user files | Named volume | Persists independently from the container lifecycle |
| Source code in development | Bind mount | Hot-reload requires host filesystem access |
| Configuration files | Bind mount (:ro) | Host-managed, container should not modify |
| TLS certificates | Bind mount (:ro) | Host-managed by cert tool (certbot, etc.) |
| Secrets / API keys (runtime only) | tmpfs | Never written to disk |
| Temporary build artifacts | tmpfs or writable layer | Discarded after use |
| Log files (external aggregation) | Bind mount or volume | Depends on log collection strategy |

Database Persistence

Database files are usually the most critical data to persist correctly: if the data directory is lost, so is the database.

PostgreSQL

services:
  db:
    image: postgres:16
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./init-scripts:/docker-entrypoint-initdb.d:ro
    secrets:
      - db_password

volumes:
  pgdata:

secrets:
  db_password:
    file: ./secrets/db_password.txt

MySQL / MariaDB

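services:
  db:
    image: mysql:8
    restart: unless-stopped
    environment:
      MYSQL_ROOT_PASSWORD_FILE: /run/secrets/db_password
    volumes:
      - mysqldata:/var/lib/mysql
      - ./my-custom.cnf:/etc/mysql/conf.d/custom.cnf:ro
    # The secret must be granted to the service, or /run/secrets/db_password will not exist
    secrets:
      - db_password

volumes:
  mysqldata:

secrets:
  db_password:
    file: ./secrets/db_password.txt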

Redis (with persistence)

services:
  cache:
    image: redis:7-alpine
    restart: unless-stopped
    command: redis-server --appendonly yes
    volumes:
      - redisdata:/data

volumes:
  redisdata:

Tip: Always use a named volume for database data directories. Bind mounts can cause permission issues, especially on macOS and Windows, and are not portable.

Permission and Ownership Patterns

Permission mismatches are one of the most common Docker storage problems.

The Problem

flowchart LR
A["Host: file owned by uid 1000"] -->|"Bind mount"| B["Container: process runs as uid 999"]
B --> C["Permission denied!"]

style C fill:#ffebee,stroke:#c62828

Solution 1: Match UIDs

Ensure the container process runs as the same UID as the host file owner:

# Create a user whose UID/GID match the host file owner (1000 here)
# BusyBox/Alpine syntax; Debian-based images use groupadd/useradd instead
RUN addgroup -g 1000 appgroup && \
    adduser -u 1000 -G appgroup -s /bin/sh -D appuser
USER appuser
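
When rebuilding the image is not practical, the same UID match can be made at run time with the Compose `user` key. The following is a minimal sketch, not a drop-in config: `HOST_UID` and `HOST_GID` are assumed variable names that you would export yourself, the `1000` defaults are placeholders, and `./data` is an illustrative path.

```yaml
services:
  api:
    image: my-api:1.0.0
    # Run the process as the host user's UID:GID so files on the bind mount stay accessible.
    # HOST_UID / HOST_GID are assumed names; export them first,
    # e.g. HOST_UID=$(id -u) HOST_GID=$(id -g)
    user: "${HOST_UID:-1000}:${HOST_GID:-1000}"
    volumes:
      - ./data:/app/data   # illustrative host path
```

Not every image tolerates an arbitrary UID: some expect the user to exist in /etc/passwd or need root during startup, so treat this as something to test per image.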

Solution 2: Fix Ownership at Runtime

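#!/bin/sh
# Entrypoint script: fix ownership, then hand off to the main command
set -e
# chown only succeeds if the container starts as root
chown -R app:app /data
exec "$@"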

Solution 3: Use Named Volumes

When an empty named volume is first mounted, Docker copies the contents, ownership, and permissions of the image's directory at that path into the volume. Set the ownership in the Dockerfile and the volume inherits it:

# The /data directory will have correct ownership in the volume
RUN mkdir -p /data && chown -R app:app /data
VOLUME /data
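
A sketch of how a service might consume such an image. It assumes the Dockerfile above sits in the project root and that the image defines an `app` user; the `worker` service and `appdata` volume names are illustrative.

```yaml
services:
  worker:
    build: .        # assumes the Dockerfile above is in the project root
    user: app       # assumes the image defines an "app" user
    volumes:
      # Empty on first start, so Docker seeds it from the image's /data,
      # including the app:app ownership set by chown in the Dockerfile.
      - appdata:/data

volumes:
  appdata:
```

The copy happens only while the volume is empty. A volume that already holds data keeps its existing ownership, so a later change to the Dockerfile does not fix it retroactively.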

Permission Reference

| Scenario | Solution |
| --- | --- |
| Bind mount, host user ≠ container user | Match UIDs or use chown in entrypoint |
| Named volume, first use | Docker copies ownership from image; usually works correctly |
| Read-only config files | Mount with :ro, ensure host file is readable |
| Container runs as root | Works but insecure; use a non-root user |

Multi-Container Shared Data

Shared Volume Between Services

services:
  # Writer uploads files
  api:
    image: my-api:1.0.0
    volumes:
      - uploads:/app/uploads

  # Reader serves files
  cdn:
    image: nginx:alpine
    volumes:
      - uploads:/usr/share/nginx/html/uploads:ro

volumes:
  uploads:

File Lock Considerations

Warning: When multiple containers write to the same volume, you risk file corruption if they modify the same files simultaneously. Use application-level locking or a shared-nothing architecture where each container writes to its own subdirectory.
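
One way to get the shared-nothing layout is to give every writer its own subdirectory on the shared volume. The sketch below assumes a hypothetical `my-worker` image that reads its output location from an `OUTPUT_DIR` environment variable; neither name comes from a real project.

```yaml
services:
  worker-a:
    image: my-worker:1.0.0            # hypothetical image
    environment:
      OUTPUT_DIR: /shared/worker-a    # assumed setting: each writer gets its own directory
    volumes:
      - shared:/shared

  worker-b:
    image: my-worker:1.0.0            # hypothetical image
    environment:
      OUTPUT_DIR: /shared/worker-b
    volumes:
      - shared:/shared

  reader:
    image: nginx:alpine
    volumes:
      # Readers never need write access
      - shared:/usr/share/nginx/html/shared:ro

volumes:
  shared:
```

This avoids two processes writing the same file, but only by convention: nothing in Docker stops a worker from writing outside its directory, so the application still has to honor the setting.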

Environment-Specific Storage

Development

services:
  api:
    image: my-api:1.0.0
    volumes:
      # Source code: bind mount for hot-reload
      - ./src:/app/src
      # Node modules: named volume to avoid overwriting
      - node_modules:/app/node_modules
      # Config: bind mount, read-only
      - ./config/dev.json:/app/config.json:ro

volumes:
  node_modules:

Production

services:
  api:
    image: my-api:1.0.0
    volumes:
      # Data: named volume only
      - uploads:/app/uploads
      # Config: bind mount, read-only
      - ./config/prod.json:/app/config.json:ro
    # Secrets and scratch files: tmpfs (RAM only, capped at 50m)
    tmpfs:
      - /tmp:size=50m

volumes:
  uploads:

The node_modules Trick

In development, bind-mounting the project root (./:/app) also hides the image's /app/node_modules behind whatever the host has at that path: the host's own copy, or nothing if dependencies were never installed there. Fix this with a named volume overlay:

services:
  api:
    image: my-api:1.0.0
    volumes:
      - ./:/app                          # Bind mount project root
      - node_modules:/app/node_modules   # Named volume "covers" bind mount at this path

volumes:
  node_modules:

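The named volume at /app/node_modules takes precedence over the bind mount at that specific path because it is mounted on top of it. On first use the empty volume is populated from the image's /app/node_modules, so the container sees the dependencies installed at build time.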

Storage Anti-Patterns

| Anti-Pattern | Problem | Better Approach |
| --- | --- | --- |
| Storing data in writable layer | Lost on docker rm | Use named volumes |
| Bind-mounting database data dir | Permission issues, not portable | Use named volumes |
| Committing data into images | Image size bloated, stale data | Mount data at runtime |
| Using anonymous volumes | Hard to find, easy to accidentally prune | Use named volumes |
| Same volume, multiple writers, no locking | File corruption | Application-level locking or separate directories |
| Not setting :ro on config mounts | Container can modify host configs | Always mount configs as :ro |

Key Takeaways

  • Named volumes for database data and application state: Docker-managed, portable, and initialized with the image's ownership and permissions.
  • Bind mounts for source code (development) and config files (read-only).
  • tmpfs for secrets and scratch data that should never touch disk.
  • Match UIDs between host and container to avoid permission issues with bind mounts.
  • Use the node_modules named volume trick to prevent bind mounts from overwriting dependency directories.
  • Mount configuration files as read-only (:ro) to prevent containers from modifying host files.

What's Next

  • Continue to Backup and Restore to learn how to protect your persistent data with backup scripts and strategies.