Backup and Restore
Volumes persist data, but they do not protect against accidental deletion, corruption, or hardware failure. You need a backup strategy. This lesson covers backing up Docker volumes, database-specific backup patterns, and automating the process with scripts.
Backing Up Volumes
The standard technique uses a temporary container to create a tar archive of a volume's contents:
Back Up a Volume
docker run --rm \
-v pgdata:/source:ro \
-v "$(pwd)":/backup \
alpine \
tar czf /backup/pgdata-$(date +%Y%m%d-%H%M%S).tar.gz -C /source .
How it works:
- A temporary Alpine container is created
- The volume (`pgdata`) is mounted read-only at `/source`
- The current directory is mounted at `/backup`
- `tar` creates a compressed archive
- The container is removed automatically (`--rm`)
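To sanity-check the archive right after creating it, you can list its contents without extracting anything (the filename below is taken from the example above):

```shell
# List the archive's contents without extracting it
# (filename assumed from the backup example above)
tar tzf pgdata-20240115-103000.tar.gz | head
```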
flowchart LR
A["Volume: pgdata"] -->|"Mounted at /source (ro)"| B["Temp Container\n(alpine)"]
C["Host: $(pwd)"] -->|"Mounted at /backup"| B
B -->|"tar czf"| D["pgdata-20240115-103000.tar.gz"]
style A fill:#e8f5e9,stroke:#2e7d32
style D fill:#e3f2fd,stroke:#1565c0
Restore a Volume
# Create the volume (if it does not exist)
docker volume create pgdata
# Restore from backup
docker run --rm \
-v pgdata:/target \
-v "$(pwd)":/backup:ro \
alpine \
sh -c "rm -rf /target/* && tar xzf /backup/pgdata-20240115-103000.tar.gz -C /target"
Restoring overwrites all existing data in the volume. Stop any containers using the volume before restoring to avoid corruption.
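To find out which containers need stopping, `docker ps` accepts a `volume` filter. A quick sketch, using the `pgdata` volume from the example:

```shell
# List containers (including stopped ones) that mount the pgdata volume
docker ps -a --filter volume=pgdata --format '{{.Names}}'

# Stop every running container that uses it before restoring
# (-r skips the command entirely when the list is empty; GNU xargs)
docker ps -q --filter volume=pgdata | xargs -r docker stop
```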
Database-Specific Backups
For databases, a raw volume backup may not be safe if the database is actively writing. Use the database's native dump tool instead.
PostgreSQL
# Backup
docker exec db pg_dumpall -U postgres > backup-$(date +%Y%m%d).sql
# Or dump a single database
docker exec db pg_dump -U postgres mydb > mydb-$(date +%Y%m%d).sql
# Compressed backup
docker exec db pg_dump -U postgres -Fc mydb > mydb-$(date +%Y%m%d).dump
# Restore
docker exec -i db psql -U postgres < backup-20240115.sql
# Restore compressed format
docker exec -i db pg_restore -U postgres -d mydb < mydb-20240115.dump
MySQL / MariaDB
# Backup all databases
# (single quotes so $MYSQL_ROOT_PASSWORD expands inside the container, where it is set)
docker exec db sh -c 'mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" --all-databases' > backup-$(date +%Y%m%d).sql
# Backup a single database
docker exec db sh -c 'mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" mydb' > mydb-$(date +%Y%m%d).sql
# Restore
docker exec -i db sh -c 'mysql -u root -p"$MYSQL_ROOT_PASSWORD"' < backup-20240115.sql
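mysqldump has no built-in compressed format, but you can pipe the output through gzip on the host. A sketch, assuming the same `db` container and `MYSQL_ROOT_PASSWORD` variable as above:

```shell
# Compressed backup: pipe the dump through gzip on the host
docker exec db sh -c 'mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" --all-databases' \
  | gzip > backup-$(date +%Y%m%d).sql.gz

# Restore: decompress and stream the SQL back into mysql
gunzip -c backup-20240115.sql.gz \
  | docker exec -i db sh -c 'mysql -u root -p"$MYSQL_ROOT_PASSWORD"'
```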
Redis
# Trigger a snapshot
docker exec cache redis-cli BGSAVE
# Copy the dump file out
docker cp cache:/data/dump.rdb ./redis-backup-$(date +%Y%m%d).rdb
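BGSAVE runs in the background, so copying the dump immediately can race with an in-progress save. One way to avoid that is to poll LASTSAVE (the unix timestamp of the last successful save) until it advances; a sketch, assuming the `cache` container from the examples:

```shell
# Record the timestamp of the last completed save
last=$(docker exec cache redis-cli LASTSAVE)

# Trigger a new background save
docker exec cache redis-cli BGSAVE

# Wait until LASTSAVE advances, meaning the snapshot finished
while [ "$(docker exec cache redis-cli LASTSAVE)" = "$last" ]; do
  sleep 1
done

# Now the dump file is complete and safe to copy
docker cp cache:/data/dump.rdb ./redis-backup-$(date +%Y%m%d).rdb
```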
# Restore: stop the container first so Redis does not overwrite
# the dump with its in-memory state on shutdown, then copy and start
docker stop cache
docker cp redis-backup-20240115.rdb cache:/data/dump.rdb
docker start cache
Comparison
| Database | Backup Tool | Online Backup? | Compressed? | Restore Tool |
|---|---|---|---|---|
| PostgreSQL | pg_dump / pg_dumpall | ✅ Yes | ✅ With -Fc | psql / pg_restore |
| MySQL | mysqldump | ✅ Yes | ❌ Pipe to gzip | mysql |
| Redis | BGSAVE + docker cp | ✅ Yes | ❌ Binary format | Copy dump.rdb |
Backup Script
A reusable script that backs up multiple volumes and databases:
#!/usr/bin/env bash
set -euo pipefail
# Configuration
BACKUP_DIR="/opt/backups/docker"
RETENTION_DAYS=30
DATE=$(date +%Y%m%d-%H%M%S)
mkdir -p "$BACKUP_DIR"
echo "=== Docker Backup: $DATE ==="
# Volume backups
backup_volume() {
local volume="$1"
local file="$BACKUP_DIR/${volume}-${DATE}.tar.gz"
echo "Backing up volume: $volume -> $file"
docker run --rm \
-v "${volume}":/source:ro \
-v "$BACKUP_DIR":/backup \
alpine \
tar czf "/backup/${volume}-${DATE}.tar.gz" -C /source .
echo " Done ($(du -h "$file" | cut -f1))"
}
# Database backups
backup_postgres() {
local container="$1"
local file="$BACKUP_DIR/postgres-${container}-${DATE}.sql.gz"
echo "Backing up PostgreSQL: $container -> $file"
docker exec "$container" pg_dumpall -U postgres | gzip > "$file"
echo " Done ($(du -h "$file" | cut -f1))"
}
backup_mysql() {
local container="$1"
local password="$2"
local file="$BACKUP_DIR/mysql-${container}-${DATE}.sql.gz"
echo "Backing up MySQL: $container -> $file"
docker exec "$container" mysqldump -u root -p"${password}" --all-databases | gzip > "$file"
echo " Done ($(du -h "$file" | cut -f1))"
}
# Run backups
backup_volume "pgdata"
backup_volume "uploads"
backup_postgres "db"
# backup_mysql "mysql-db" "rootpassword"
# Cleanup old backups
echo "Cleaning backups older than ${RETENTION_DAYS} days..."
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +${RETENTION_DAYS} -delete
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +${RETENTION_DAYS} -delete
echo "=== Backup complete ==="
Make the script executable:
chmod +x backup.sh
Automating with Cron
Schedule the backup script to run daily:
# Edit crontab
crontab -e
# Run backup every day at 2:00 AM
0 2 * * * /opt/scripts/backup.sh >> /var/log/docker-backup.log 2>&1
Verify Cron Is Working
# Check cron logs
grep backup /var/log/syslog
# List scheduled jobs
crontab -l
# Check backup files exist
ls -lh /opt/backups/docker/
Restore Workflow
flowchart TD
A["Identify what to restore"] --> B{"Data type?"}
B -->|"Volume"| C["Stop containers using volume"]
C --> D["docker run --rm alpine tar xzf"]
D --> E["Start containers"]
B -->|"PostgreSQL"| F["docker exec psql < backup.sql"]
B -->|"MySQL"| G["docker exec mysql < backup.sql"]
B -->|"Redis"| H["docker cp dump.rdb + restart"]
E --> I["Verify data"]
F --> I
G --> I
H --> I
style A fill:#e3f2fd,stroke:#1565c0
style I fill:#e8f5e9,stroke:#2e7d32
Full Restore Example
# 1. Stop the service
docker compose stop db
# 2. Remove the old volume (if starting fresh)
docker volume rm myapp_pgdata
# 3. Create a new volume
docker volume create myapp_pgdata
# 4. Restore from backup
docker run --rm \
-v myapp_pgdata:/target \
-v /opt/backups/docker:/backup:ro \
alpine \
tar xzf /backup/pgdata-20240115-103000.tar.gz -C /target
# 5. Start the service
docker compose start db
# 6. Verify
docker exec db psql -U postgres -c "SELECT count(*) FROM users;"
Testing Your Backups
A backup that has never been restored is not a backup. It is a hope.
| Verification | How |
|---|---|
| Check file exists | ls -lh /opt/backups/docker/ |
| Check file is not empty | test -s backup.tar.gz |
| Check file is valid | tar tzf backup.tar.gz > /dev/null |
| Test full restore | Restore to a test volume and verify data |
| Automate verification | Add a restore-test step to the backup script |
Quick Verification Script
#!/usr/bin/env bash
# verify-backup.sh -- test that a backup can be restored
set -euo pipefail
BACKUP="${1:?Usage: verify-backup.sh <backup-file>}"
TEST_VOLUME="backup-test-$(date +%s)"
echo "Creating test volume: $TEST_VOLUME"
docker volume create "$TEST_VOLUME"
echo "Restoring backup to test volume..."
docker run --rm \
-v "$TEST_VOLUME":/target \
-v "$(dirname "$BACKUP")":/backup:ro \
alpine \
tar xzf "/backup/$(basename "$BACKUP")" -C /target
echo "Listing restored files:"
docker run --rm -v "$TEST_VOLUME":/data:ro alpine ls -lah /data
echo "Cleaning up test volume..."
docker volume rm "$TEST_VOLUME"
echo "Verification complete."
Key Takeaways
- Volume backups use a temporary container with `tar` -- mount the volume read-only for safety.
- Database backups use native dump tools (`pg_dump`, `mysqldump`) for consistency during live operations.
- Automate backups with cron and implement a retention policy to prevent disk exhaustion.
- Always stop containers before restoring a volume to avoid data corruption.
- Test your backups regularly by restoring to a test volume -- an untested backup is not a backup.
What's Next
- Return to the Containers and Runtime Management module overview.
- Continue to Module 5: Networking to learn how containers communicate with each other and the outside world.