Docker
1. Why This Exists (The Real Infrastructure Problem)
The Problem: "It works on my machine."
- Dependency Hell: Developer A uses Python 3.8. Server uses Python 3.6. App crashes.
- Isolation Nightmares: Running 5 apps on one VM. App A leaks memory and kills App B. App C changes a shared library and breaks App D.
- Slow Scaling: VM booting takes minutes. You need seconds.
The Solution: Ship the computer (OS filesystem), not just the code.
2. Mental Model (Antigravity View)
The Analogy: The Shipping Container.
- Before: We threw loose sacks of flour, barrels of wine, and cars into a ship’s hold. Loading took weeks.
- After: Everything goes into a standard 20ft metal box. The ship doesn't care what's inside. The crane doesn't care. The truck doesn't care.
One-Sentence Definition: A container is an ordinary Linux process that the kernel isolates with namespaces and limits with cgroups, so the application behaves as if it had its own private OS.
3. Architecture Diagram
+-------------------------------------------------------+
|                        Host OS                        |
|       (Linux Kernel - Shared by ALL Containers)       |
+--------------------------+----------------------------+
|      Docker Daemon       |         containerd         |
+-----------+--------------+---------------+------------+
            |                              |
    +-------v--------+             +-------v--------+
    |  Container A   |             |  Container B   |
    |  (App + Libs)  |             |  (App + Libs)  |
    +----------------+             +----------------+
    |    Guest FS    |             |    Guest FS    |
    | (Debian/Alpine)|             |  (RHEL/Ubuntu) |
    +----------------+             +----------------+
            ^                              ^
            |                              |
      Namespace Isolation (PID, NET, MNT, IPC)
      Cgroups Resource Limits (CPU, RAM)
4. Core Concepts (No Fluff)
- Image: The read-only template (The class). A snapshot of a filesystem + metadata.
- Container: The running instance (The object). A writable layer on top of the image + runtime state.
- Layers: Filesystem changes, one per filesystem-modifying build step. Before pulling or building, Docker checks whether a layer already exists locally. If yes, it reuses it (Caching).
- Union Filesystem (Overlay2): The magic that stacks layers. It lets you see a unified view of multiple directories.
- Namespaces: The walls.
  - PID: Process ID 1 inside container != Process ID 1 on host.
  - NET: Own localhost, own IP.
  - MNT: Own /etc, /home (chroot on steroids).
- Cgroups (Control Groups): The limits. "You get 512MB RAM and 0.5 CPU."
- Volumes: The wormhole. A directory on the host mounted inside the container. Data persists here after container death.
- Multi-stage Builds: Compiling Java/Go requires the JDK/Go compiler (Heavy). Running requires only a JRE or the compiled binary and its libc (Light). Build in Stage 1, copy the artifact to Stage 2. Result: a much smaller image (illustratively, 50MB instead of 1GB).
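The layer, union filesystem, and copy-on-write ideas above can be illustrated with a toy model. This is a sketch in plain Python, not how overlay2 is implemented: layers are modeled as dictionaries from path to content, and the names `WHITEOUT`, `lookup`, `write`, and `delete` are made up for illustration (the marker is only loosely analogous to overlay whiteout files).

```python
# Toy model of a union filesystem. NOT the real overlay2 implementation.
# Each layer is a dict {path: content}; the last layer in the list is on top.

WHITEOUT = object()  # hypothetical marker meaning "deleted in this layer"


def lookup(layers, path):
    """Return the content visible at `path` in the merged view, or None."""
    for layer in reversed(layers):  # search from the top layer down
        if path in layer:
            content = layer[path]
            return None if content is WHITEOUT else content
    return None


def write(container_layer, path, content):
    """Copy-on-write: changes land only in the container's writable layer."""
    container_layer[path] = content


def delete(container_layer, path):
    """Hide a lower-layer file without touching the read-only layer."""
    container_layer[path] = WHITEOUT


# Two read-only image layers, shared by every container of this image.
base = {"/etc/os-release": "debian", "/bin/sh": "dash"}
app = {"/app/app.jar": "v1", "/etc/app.conf": "port=8080"}
image = [base, app]

# Each container gets its own empty writable layer on top of the image.
rw_a, rw_b = {}, {}
container_a = image + [rw_a]
container_b = image + [rw_b]

write(rw_a, "/etc/app.conf", "port=9090")  # container A edits a file
delete(rw_a, "/bin/sh")                    # container A removes a file

print(lookup(container_a, "/etc/app.conf"))  # port=9090
print(lookup(container_b, "/etc/app.conf"))  # port=8080 (B is unaffected)
print(lookup(container_a, "/bin/sh"))        # None (hidden by the marker)
print(app["/etc/app.conf"])                  # port=8080 (image layer unchanged)
```

Both containers reference the same `base` and `app` dictionaries, which mirrors why many containers started from one image store its layers only once.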
5. How It Works Internally
docker run nginx:
- Client talks to Daemon (dockerd).
- Daemon checks local cache for nginx:latest image layers.
- Pull: If missing, downloads layers from Registry (Docker Hub).
- Create: Creates a new container ID. Allocates a read-write layer (OverlayFS).
- Network: Creates a virtual ethernet pair (veth). Connects one end to docker0 bridge, other to container.
- Start: Calls containerd -> runc (low-level runtime) to actually spawn the process using Kernel namespaces.
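The cache check and pull steps above amount to a set difference over layer digests. The following is a minimal sketch of that idea only; the function name `layers_to_pull` and the shortened digests are hypothetical, and the real daemon also verifies and unpacks each layer.

```python
def layers_to_pull(manifest_layers, local_cache):
    """Return the layer digests that must be downloaded, in manifest order."""
    return [digest for digest in manifest_layers if digest not in local_cache]


# Hypothetical, shortened digests; real ones are full sha256 hashes.
nginx_manifest = ["sha256:aaa", "sha256:bbb", "sha256:ccc"]
local_cache = {"sha256:aaa"}  # e.g. a base layer pulled earlier with another image

missing = layers_to_pull(nginx_manifest, local_cache)
print(missing)  # ['sha256:bbb', 'sha256:ccc']

# After the pull, a second `docker run nginx` finds everything locally.
local_cache.update(missing)
print(layers_to_pull(nginx_manifest, local_cache))  # []
```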
6. Command Reference (Very Important)
Build
- docker build -t my-app:v1 .: Builds image from Dockerfile in current dir. Use -f to specify file.
- docker build -t my-app:v1 --target production .: Builds only up to a specific stage.
Run
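- docker run -d -p 8080:80 --name web-1 my-app:v1
  - -d: Detached (Background).
  - -p: Port Mapping (Host:Container). Traffic hitting Host:8080 flows to Container:80.
  - --name: Container name. Other containers on the same user-defined network can resolve it via DNS (the default bridge network has no name resolution).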
Logs & Debug
- docker logs -f web-1: Tail logs (stdout/stderr). Essential for debugging crashes.
- docker exec -it web-1 /bin/bash: Teleport inside the running container. -i (Interactive), -t (TTY).
- docker inspect web-1: The source of truth. Shows IP, volume mounts, environment variables, state.
Cleanup
- docker system prune -a: The nuclear option. Deletes all stopped containers, unused networks, unused build cache, and every image not used by a container (without -a, only dangling images are removed).
7. Production Deployment Example (Spring Boot)
1. Dockerfile (Multi-Stage)
# Stage 1: Build
FROM eclipse-temurin:17-jdk-jammy AS builder
WORKDIR /app
COPY .mvn/ .mvn
COPY mvnw pom.xml ./
# Download dependencies (Layer caching optimization)
RUN ./mvnw dependency:resolve
COPY src ./src
RUN ./mvnw package -DskipTests
# Stage 2: Run
FROM eclipse-temurin:17-jre-jammy
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
# Non-root user for security
RUN useradd -m myuser
USER myuser
ENTRYPOINT ["java", "-jar", "app.jar"]
2. docker-compose.yml (Local Dev / Simple Prod)
version: '3.8'
services:
  backend:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/mydb
    depends_on: # controls start order only; does not wait for readiness
      - db
      - redis
  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: mydb # must match the database name in the JDBC URL
    volumes:
      - db_data:/var/lib/postgresql/data
  redis:
    image: redis:7-alpine
volumes:
  db_data: # Named volume: persists DB data across container restarts
8. Scaling Model
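- Vertical: docker update --cpu-shares 512 web-1. Note that --cpu-shares is a relative weight (default 1024) that only matters when containers compete for CPU; to change hard limits, use docker update with --cpus or --memory.
- Horizontal: docker-compose up -d --scale backend=3. (Requires a Load Balancer like Nginx in front to distribute traffic. The fixed host port mapping "8080:8080" must also be dropped from the scaled service, since only one replica can bind host port 8080.)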
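Because --cpu-shares is a relative weight, its effect depends on what else is running. The arithmetic can be sketched as below, assuming every container is busy at the same time; `web-2` and the function name are hypothetical, and the model ignores that idle containers give their time back to the others.

```python
def cpu_fraction_under_contention(shares):
    """Map {container: cpu_shares} to each container's fraction of CPU time
    when all containers are competing for the CPU."""
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}


# web-1 was set to 512; web-2 (hypothetical) keeps Docker's default of 1024.
fractions = cpu_fraction_under_contention({"web-1": 512, "web-2": 1024})
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%}")
# web-1: 33%
# web-2: 67%
```

In this sketch, setting 512 on a host where everything else uses the default lowers the container's priority rather than raising its capacity.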
9. Failure Modes
- OOM Kill (Exit Code 137): Container used more RAM than Cgroup limit. Kernel kills it. Fix: Increase limit or fix memory leak.
- PID 1 Zombie: If your app (PID 1) crashes, the container stops. If it spawns child processes and never reaps them, the zombies accumulate and fill the process table. Use tini as the init process (docker run --init does this).
- Disk Full: Docker logs are stored as JSON files on the Host. If not rotated, they fill the disk. Fix: Configure max-size in daemon.json.
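Two of the failure modes above lend themselves to a short sketch. Exit codes above 128 follow the shell convention of 128 plus the signal number, so 137 means signal 9 (SIGKILL). The second half prints a daemon.json fragment for the json-file log driver; the values "10m" and "3" are example choices rather than recommendations, and the helper name `describe_exit_code` is made up. The sketch assumes a POSIX host.

```python
import json
import signal


def describe_exit_code(code):
    """Explain a container exit code using the 128 + signal convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        number = code - 128
        try:
            name = signal.Signals(number).name
        except ValueError:  # number is not a known signal on this platform
            name = f"signal {number}"
        return f"killed by {name}"
    return f"application error (exit status {code})"


print(describe_exit_code(137))  # killed by SIGKILL
print(describe_exit_code(143))  # killed by SIGTERM

# Log rotation for the json-file driver; log-opts values must be strings.
daemon_config = {
    "log-driver": "json-file",
    "log-opts": {"max-size": "10m", "max-file": "3"},
}
print(json.dumps(daemon_config, indent=2))
```

A 137 only says the process received SIGKILL. To confirm it was the OOM killer and not a manual kill, check the OOMKilled field under State in the docker inspect output. The log settings apply to containers created after the daemon restarts with the new configuration.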
10. Security Model
- Don't run as Root: By default, code runs as root inside the container. If an attacker escapes the container, they are root on the host. Always set USER myuser.
- Read-Only Root FS: docker run --read-only. Prevents an attacker from modifying binaries.
- Sensible Limits: Always set --memory and --cpus. One leaky app shouldn't crash the server.
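To make the limits concrete, here is a sketch of what values such as --memory 512m and --cpus 0.5 translate to numerically: a byte count, and a CPU time quota per scheduling period (Docker's default period is 100000 microseconds). The helper names are hypothetical and the parsing is simplified compared to the real CLI.

```python
DEFAULT_CPU_PERIOD_US = 100_000  # default scheduler period: 100 ms

UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def memory_limit_bytes(value):
    """Convert a --memory style string such as '512m' to bytes."""
    value = value.strip().lower()
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)  # a plain number is taken as bytes


def cpu_quota_us(cpus, period_us=DEFAULT_CPU_PERIOD_US):
    """Convert a --cpus value to microseconds of CPU time allowed per period."""
    return round(cpus * period_us)


print(memory_limit_bytes("512m"))  # 536870912
print(cpu_quota_us(0.5))           # 50000
```

In other words, --cpus 0.5 lets the container run for 50 ms out of every 100 ms, after which it is throttled until the next period.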
11. Cost Model
- Free: Docker Engine is open source.
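- Paid: Docker Desktop requires a paid subscription for larger companies (per Docker's terms, more than 250 employees or more than $10 million in annual revenue).
- Compute: You pay for the underlying hosts (for example EC2 instances). Packing containers densely usually serves more traffic per dollar than running one app per VM.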
12. When NOT To Use It
- High IOPS Databases: Running a primary Oracle/Postgres in Docker adds filesystem overhead when data is written to the container layer (OverlayFS copy-on-write). Volumes bypass that layer, but for extreme performance, run on bare metal or specialized VMs.
- GUI Apps: Possible (X11 forwarding), but painful. Docker is for headless daemons.
13. Interview & System Design Summary
- Layering: Copy-on-write means 10 containers sharing an image store its layers on disk only once; each container adds only its own thin writable layer.
- Isolation: Namespaces (Visibility) vs Cgroups (Resources).
- Networking: Bridge (Default) vs Host (Performance) vs Overlay (Swarm/K8s).
- Volumes: Bind Mount (Dev) vs Docker Volume (Prod/Persistence).
- Init Process: PID 1 must handle signals (SIGTERM/SIGINT) correctly to shut down gracefully.
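The point about PID 1 and signals can be sketched as follows. `docker stop` sends SIGTERM and, after a grace period (10 seconds by default), SIGKILL. A process running as PID 1 gets no default signal handling from the kernel, so without an explicit handler the SIGTERM is ignored and the container is killed hard. This is a minimal illustration, not a production server: `serve` is a stand-in loop, and the script signals itself only to demonstrate the flow.

```python
import os
import signal
import threading

shutdown_requested = threading.Event()


def handle_shutdown(signum, frame):
    """Only record the request; the main loop performs the actual cleanup."""
    shutdown_requested.set()


# docker stop sends SIGTERM; Ctrl+C in an attached terminal sends SIGINT.
signal.signal(signal.SIGTERM, handle_shutdown)
signal.signal(signal.SIGINT, handle_shutdown)


def serve(poll_seconds=0.1):
    """Stand-in for a server loop: keep working until shutdown is requested."""
    while not shutdown_requested.wait(timeout=poll_seconds):
        pass  # accept and handle requests here
    # Close connections, flush buffers, and so on, then exit.
    return "clean shutdown"


# Demo only: signal this process the way docker stop would signal PID 1.
os.kill(os.getpid(), signal.SIGTERM)
print(serve())  # clean shutdown
```

For the signal to reach the application at all, it has to be PID 1 or be started by an init that forwards signals. A shell-form ENTRYPOINT wraps the command in `/bin/sh -c`, which typically does not forward them.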