Dockerfile Fundamentals

A Dockerfile is a text file that tells Docker how to build an image. Each instruction is one build step, and the instructions that change the filesystem (RUN, COPY, ADD) each add a layer to the image. The quality of your Dockerfile directly determines the size, security, speed, and reliability of every container you run from it.

How a Docker Build Works

When you run docker build, the Docker client sends your project files (the build context) to the Docker daemon. The daemon then executes the instructions in the Dockerfile one at a time, from top to bottom.

Steps that change the filesystem produce a new layer; steps such as ENV or CMD only update the image's metadata. Docker caches the result of each step, so unchanged steps do not need to rebuild, but once a step changes, every step after it is rebuilt too. This is why instruction order matters -- more on that in the next lesson.
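
A small sketch of how individual steps map to layers and cache entries. The base image and file names are placeholders chosen for illustration, not taken from a specific project.

```dockerfile
# Starts from the layers of the base image.
FROM node:20-alpine

# Sets the working directory for the steps below.
WORKDIR /app

# Records configuration in the image metadata; no files change.
ENV NODE_ENV=production

# Adds a layer containing the copied files. The cached result is reused
# until the content of either file changes.
COPY package.json package-lock.json ./

# Adds a layer containing whatever the command wrote to disk. It reruns
# only when this line changes or an earlier step's cache is invalidated.
RUN npm ci --omit=dev

# Records the default command in the image metadata.
CMD ["node", "server.js"]
```

Running docker history on the built image should list these steps, with the metadata-only ones shown at a size of 0B.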

Instruction Reference

Instruction	What It Does	Example
`FROM`	Sets the base image (starting filesystem and runtime)	`FROM node:20-alpine`
`WORKDIR`	Sets the working directory for all following instructions	`WORKDIR /app`
`COPY`	Copies files from your project into the image	`COPY package.json ./`
`ADD`	Like `COPY`, but also extracts local tar archives and fetches URLs	`ADD archive.tar.gz /app/`
`RUN`	Executes a command during build (install packages, compile, etc.)	`RUN npm ci --omit=dev`
`ENV`	Sets an environment variable that persists in the image	`ENV NODE_ENV=production`
`ARG`	Defines a build-time variable (not available at runtime)	`ARG VERSION=1.0.0`
`EXPOSE`	Documents which port the container listens on (does not publish it)	`EXPOSE 3000`
`USER`	Sets the user for the `RUN`, `CMD`, and `ENTRYPOINT` instructions that follow	`USER node`
`ENTRYPOINT`	Defines the main executable (runs on every start unless overridden with `--entrypoint`)	`ENTRYPOINT ["node"]`
`CMD`	Provides default arguments to `ENTRYPOINT`, or a default command	`CMD ["server.js"]`
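
The difference between ARG and ENV is easiest to see side by side. The following is a minimal sketch; the variable name APP_VERSION is a hypothetical one introduced for this example.

```dockerfile
FROM node:20-alpine

# ARG exists only while the image is being built. Override the default with:
#   docker build --build-arg VERSION=2.3.1 .
ARG VERSION=1.0.0

# ENV persists into every container started from the image. Copying the
# ARG into an ENV is one way to make a build-time value visible at runtime.
ENV APP_VERSION=$VERSION

# The shell expands APP_VERSION when the container starts.
CMD ["sh", "-c", "echo running version $APP_VERSION"]
```

Because ENV values are stored in the image, this pattern suits version strings and similar non-sensitive values, not secrets.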

COPY vs ADD

Use COPY for everything unless you specifically need ADD's extra features (archive extraction or URL fetching). COPY is more explicit and predictable.
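
To make the difference concrete, here is a small sketch that applies both instructions to the same local archive. The target directory /app/unpacked/ is a hypothetical path used only for illustration.

```dockerfile
FROM alpine:3.20
WORKDIR /app

# COPY places the file in the image exactly as it is on disk:
# the result is /app/archive.tar.gz, still compressed.
COPY archive.tar.gz ./

# ADD with a local tar archive unpacks its contents into the target
# directory instead of copying the archive file itself.
ADD archive.tar.gz /app/unpacked/
```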

Recommended Instruction Order

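The order of instructions in a Dockerfile affects both cache efficiency and readability. Order them from least to most frequently changing: base image, working directory, dependency manifests, dependency installation, application source, then runtime settings such as USER and CMD.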

The key insight: copy dependency files first, install dependencies, then copy source code. This way, Docker can reuse the cached dependency layer when only your source code changes (which happens far more often than dependency changes).

Complete Examples

Node.js

FROM node:20-alpine
WORKDIR /app

# Copy dependency manifests first (cache-friendly)
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Then copy application source
COPY . .

USER node
CMD ["node", "server.js"]

Python

FROM python:3.12-slim
WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

USER 10001
CMD ["python", "app.py"]

Go (with Multi-Stage)

# Build stage: compile the binary
FROM golang:1.23-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o app .

# Runtime stage: only the compiled binary
FROM alpine:3.20
COPY --from=build /src/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]

Java

FROM eclipse-temurin:21-jre
WORKDIR /app
COPY target/app.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]

Base Image Selection

Your base image determines the starting size, available packages, and security surface of your image. Choose carefully:

Base Image Type	Size	When to Use
`alpine` variants (e.g., `node:20-alpine`)	~5-50 MB	Most server workloads. Smallest common option
`slim` variants (e.g., `python:3.12-slim`)	~50-150 MB	When Alpine's `musl` libc causes compatibility issues
Full images (e.g., `node:20`)	~300-1000 MB	Development only, or when you need many system packages
`distroless` (e.g., `gcr.io/.../distroless`)	~2-20 MB	Maximum security. No shell, no package manager
`scratch`	0 MB	Statically compiled binaries (Go, Rust)
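
As one possible illustration of the scratch row, the Go example can end in an empty image instead of Alpine. This is a sketch that assumes the program is pure Go and needs nothing from the operating system at runtime; the UID 10001 is an arbitrary non-root value.

```dockerfile
# Build stage: compile the binary.
FROM golang:1.23-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 avoids linking against the C library of the build image.
RUN CGO_ENABLED=0 go build -o app .

# Runtime stage: an empty image containing only the binary.
FROM scratch
COPY --from=build /src/app /app
# A numeric UID is used because scratch has no /etc/passwd to resolve names.
USER 10001
ENTRYPOINT ["/app"]
```

A scratch image has no shell, CA certificates, or timezone data, so a program that makes TLS calls or needs time zones would have to copy those files in from the build stage as well.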

Pin your versions

Always use a specific version tag like node:20-alpine instead of node:latest. The latest tag changes without warning and can break your builds. Version tags can also be re-pointed to newer patch builds, so for maximum reproducibility, pin to a specific image digest.
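
Digest pinning looks roughly like the sketch below. The text <digest> is a placeholder, not a real value, and must be replaced before the file will build.

```dockerfile
# Tag plus digest: the tag documents intent, the digest fixes the exact image.
# Replace <digest> with the real value. After pulling the image, one way to
# look it up is:
#   docker images --digests node
FROM node:20-alpine@sha256:<digest>
```

The trade-off is that a pinned digest never picks up security patches on its own, so the value has to be updated deliberately.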

The `.dockerignore` File

Just like .gitignore prevents files from being tracked by Git, .dockerignore prevents files from being sent to the Docker daemon during builds. This makes builds faster and prevents sensitive files from accidentally ending up in your image.

Create a .dockerignore file in your project root:

.git
node_modules
dist
build
coverage
*.log
.env
.DS_Store
tmp

Without this file, your entire project directory (including node_modules, .git history, and local secrets) is part of the build context. The classic builder sends all of it to the daemon even if you never COPY it, and a broad COPY . . copies it into the image.

ENTRYPOINT vs CMD

These two instructions are often confused. Here is how they work together:

Dockerfile	`docker run app`	`docker run app --help`
`CMD ["node", "server.js"]`	Runs `node server.js`	Tries to run `--help` as the command and fails (CMD is replaced)
`ENTRYPOINT ["node", "server.js"]`	Runs `node server.js`	Runs `node server.js --help` (args appended)
`ENTRYPOINT ["node"]` + `CMD ["server.js"]`	Runs `node server.js`	Runs `node --help` (CMD is replaced)

Rule of thumb: Use CMD for simple applications. Use ENTRYPOINT + CMD when you want a fixed executable with configurable arguments.
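
The ENTRYPOINT + CMD combination can be sketched as follows. The image name app matches the table above; worker.js is a hypothetical second script used only to show an override.

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY . .

# Fixed executable: every container runs node.
ENTRYPOINT ["node"]

# Default argument: used only when docker run is given no arguments.
CMD ["server.js"]

# docker run app             -> node server.js
# docker run app worker.js   -> node worker.js   (CMD replaced)
# docker run app --help      -> node --help      (CMD replaced)
```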

Always use the exec form (JSON array syntax) for reliable signal handling:

# Good: exec form - process receives SIGTERM directly
CMD ["node", "server.js"]

# Bad: shell form - runs through /bin/sh, signals may not reach your process
CMD node server.js

Common Pitfalls

Mistake	Why It Hurts	Fix
Using `latest` as base image	Builds break unpredictably	Pin to a specific version tag
`COPY . .` before `RUN npm install`	Every code change re-installs all dependencies	Copy `package*.json` first, install, then copy source
Storing secrets in `ENV` or `COPY`	Secrets are baked into image layers forever	Inject at runtime via env vars or mounted secrets
Running as `root`	Compromised app has full system access	Add `USER node` or `USER 10001`
`RUN apt-get update` alone	Package cache becomes stale	Chain: `RUN apt-get update && apt-get install -y ...`
No `.dockerignore`	Slow builds, secrets sent to daemon	Create `.dockerignore` with standard exclusions
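
The apt-get fix from the table, written out as a sketch. The curl package is only an example of something to install.

```dockerfile
FROM python:3.12-slim

# update and install share one layer, so a cached, outdated package index
# is never reused for a new install line. Removing the lists afterwards
# keeps the index out of the image.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```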

Building and Running Your Image

# Build the image and tag it
docker build -t my-app:1.0.0 .

# Check the image was created
docker images my-app

# Run a container from the image
docker run --rm -p 3000:3000 my-app:1.0.0

# Run with a custom command (replaces CMD entirely, so name the executable)
docker run --rm my-app:1.0.0 node --version

Adding Metadata with Labels

Labels add metadata to your image that helps with auditing and traceability. Use the OCI standard label names:

FROM node:20-alpine
LABEL org.opencontainers.image.source="https://github.com/example/repo"
LABEL org.opencontainers.image.revision="abc123"
LABEL org.opencontainers.image.created="2026-02-13T00:00:00Z"

WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
USER node
CMD ["node", "server.js"]

You can inspect labels on any image with docker inspect <image>.
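
Hard-coded revision and date values go stale quickly. One option is to pass them in at build time, as in this sketch; the argument names GIT_SHA and BUILD_DATE are hypothetical and not part of any standard.

```dockerfile
FROM node:20-alpine

# Hypothetical build arguments with fallback values. Example invocation:
#   docker build --build-arg GIT_SHA="$(git rev-parse HEAD)" -t my-app:1.0.0 .
# BUILD_DATE can be passed the same way with a second --build-arg flag.
ARG GIT_SHA=unknown
ARG BUILD_DATE=unknown

LABEL org.opencontainers.image.revision=$GIT_SHA
LABEL org.opencontainers.image.created=$BUILD_DATE
```

In a full Dockerfile, placing these lines after the dependency install step would likely keep a changing commit hash from invalidating the cached dependency layer.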

Key Takeaways

A Dockerfile is a build contract -- it determines the size, security, and reliability of your containers.
Instruction order matters: copy dependency manifests first, install dependencies, then copy source code.
Always use a .dockerignore to keep builds fast and prevent secret leakage.
Pin your base image versions -- never rely on latest for production builds.
Run as a non-root user whenever possible.
Use exec form (["node", "server.js"]) for CMD and ENTRYPOINT to ensure proper signal handling.

What's Next

Continue to Layer Cache and Build Context to understand how Docker caches layers and how to make your builds faster.
