Docker

1. Why This Exists (The Real Infrastructure Problem)

The Problem: "It works on my machine."

  • Dependency Hell: Developer A uses Python 3.8. Server uses Python 3.6. App crashes.
  • Isolation Nightmares: Running 5 apps on one VM. App A leaks memory and kills App B. App C changes a shared library and breaks App D.
  • Slow Scaling: VM booting takes minutes. You need seconds.

The Solution: Ship the computer (OS filesystem), not just the code.

2. Mental Model (Antigravity View)

The Analogy: The Shipping Container.

  • Before: We threw loose sacks of flour, barrels of wine, and cars into a ship’s hold. Loading took weeks.
  • After: Everything goes into a standard 20ft metal box. The ship doesn't care what's inside. The crane doesn't care. The truck doesn't care.

One-Sentence Definition: A container is an ordinary Linux process (or process tree) that runs on the host's shared kernel but is walled off by namespaces and capped by cgroups, so the application believes it has its own private OS.

3. Architecture Diagram

+-------------------------------------------------------+
|                       Host OS                         |
|  (Linux Kernel - Shared by ALL Containers)            |
+--------------------------+----------------------------+
|       Docker Daemon      |      containerd            |
+-----------+--------------+----------------------------+
            |
    +-------v-------+          +-------v-------+
    |  Container A  |          |  Container B  |
    | (App + Libs)  |          | (App + Libs)  |
    +---------------+          +---------------+
    |  Guest FS     |          |  Guest FS     |
    |(Debian/Alpine)|          | (RHEL/Ubuntu) |
    +---------------+          +---------------+
            ^                          ^
            |                          |
       Namespace Isolation (PID, NET, MNT, IPC)
       Cgroups Resource Limits (CPU, RAM)

4. Core Concepts (No Fluff)

  1. Image: The read-only template (The class). A snapshot of a filesystem + metadata.
  2. Container: The running instance (The object). A writable layer on top of the image + execution state.
  3. Layers: Filesystem changes, one layer per build step that modifies the filesystem. Docker checks whether a layer already exists locally. If yes, it reuses it (Caching).
  4. Union Filesystem (Overlay2): The magic that stacks layers. It lets you see a unified view of multiple directories.
  5. Namespaces: The walls.
    • PID: Process ID 1 inside container != Process ID 1 on host.
    • NET: Own localhost, own IP.
    • MNT: Own /etc, /home (chroot on steroids).
  6. Cgroups (Control Groups): The limits. "You get 512MB RAM and 0.5 CPU."
  7. Volumes: The wormhole. A directory on the host mounted inside the container. Data persists here after container death.
  8. Multi-stage Builds: Compiling Java/Go requires JDK/Go compiler (Heavy). Running requires only JRE/Glibc (Light). Build in Stage 1, copy binary to Stage 2. Result: 50MB image instead of 1GB.
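The layer stacking and copy-on-write behaviour described above can be modelled in a few lines. This is only a toy sketch: it uses Python's `collections.ChainMap` as a stand-in for a union filesystem, and the layer contents and the `start_container` helper are hypothetical. Real OverlayFS works on directories, copies whole files up on first write, and uses whiteout entries for deletions, none of which is modelled here.

```python
from collections import ChainMap

# Hypothetical read-only image layers: path -> file contents.
base_layer = {"/etc/os-release": "debian", "/app/config.yml": "port: 80"}
app_layer = {"/app/server.py": "print('hi')", "/app/config.yml": "port: 8080"}

# Upper layers come first, so they shadow lower ones on lookup.
image_layers = [app_layer, base_layer]


def start_container(layers):
    """Return (unified view, writable layer) for a new container."""
    writable = {}  # fresh, empty read-write layer per container
    # ChainMap reads fall through the maps in order; writes go to the first map.
    return ChainMap(writable, *layers), writable


view_a, writable_a = start_container(image_layers)
view_b, writable_b = start_container(image_layers)

# Container A changes a file. Only its own writable layer is touched.
view_a["/app/config.yml"] = "port: 9090"

print(view_a["/app/config.yml"])  # A sees its own change
print(view_b["/app/config.yml"])  # B still sees the image's version
print(view_a["/etc/os-release"])  # unchanged files are read from the base layer
```

Both containers share the same two image dictionaries, which is the in-memory analogue of several containers sharing one set of image layers on disk.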

5. How It Works Internally

  1. docker run nginx:
    • Client talks to Daemon (dockerd).
    • Daemon checks local cache for nginx:latest image layers.
    • Pull: If missing, downloads layers from Registry (Docker Hub).
    • Create: Creates a new container ID. Allocates a read-write layer (OverlayFS).
    • Network: Creates a virtual ethernet pair (veth). Connects one end to docker0 bridge, other to container.
    • Start: Calls containerd -> runc (low-level runtime) to actually spawn the process using Kernel namespaces.
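The cache check and pull steps above boil down to a set-membership test per layer. A minimal sketch, assuming layers are identified by digest strings; the function name `layers_to_pull` and the digests are made up for illustration and are not part of Docker's API.

```python
def layers_to_pull(manifest_layers, local_store):
    """Return the manifest digests that are not cached locally, in manifest order."""
    return [digest for digest in manifest_layers if digest not in local_store]


# Hypothetical digests: the image needs three layers, two are already on disk.
manifest = ["sha256:aaa", "sha256:bbb", "sha256:ccc"]
local_cache = {"sha256:aaa", "sha256:ccc"}

missing = layers_to_pull(manifest, local_cache)
print(missing)  # only these layers would be downloaded from the registry
```

This is why pulling a new tag of an image you already have is usually fast: only the layers that changed are missing.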

6. Command Reference (Very Important)

Build

  • docker build -t my-app:v1 .: Builds image from Dockerfile in current dir. Use -f to specify file.
  • docker build -t my-app:v1 --target production .: Builds only up to a specific stage.

Run

  • docker run -d -p 8080:80 --name web-1 my-app:v1:
    • -d: Detached (Background).
    • -p: Port Mapping (Host:Container). Traffic hitting Host:8080 flows to Container:80.
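    • --name: Container name. Other containers on the same user-defined network can resolve it as a DNS name (this does not work on the default bridge).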

Logs & Debug

  • docker logs -f web-1: Tail logs (stdout/stderr). Essential for debugging crashes.
  • docker exec -it web-1 /bin/bash: Teleport inside the running container. -i (Interactive), -t (TTY). Use /bin/sh on images without bash (e.g. Alpine).
  • docker inspect web-1: The source of truth. Shows IP, volume mounts, environment variables, state.

Cleanup

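  • docker system prune -a: The nuclear option. Deletes all stopped containers, unused networks, build cache, and every image not used by a container (not just dangling ones). Volumes are kept unless you add --volumes.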

7. Production Deployment Example (Spring Boot)

1. Dockerfile (Multi-Stage)

# Stage 1: Build
FROM eclipse-temurin:17-jdk-jammy AS builder
WORKDIR /app
COPY .mvn/ .mvn
COPY mvnw pom.xml ./
# Download dependencies (Layer caching optimization)
RUN ./mvnw dependency:resolve
COPY src ./src
RUN ./mvnw package -DskipTests

# Stage 2: Run
FROM eclipse-temurin:17-jre-jammy
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
# Non-root user for security
RUN useradd -m myuser
USER myuser
ENTRYPOINT ["java", "-jar", "app.jar"]

2. docker-compose.yml (Local Dev / Simple Prod)

version: '3.8'
services:
  backend:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/mydb
      - SPRING_DATASOURCE_USERNAME=user
      - SPRING_DATASOURCE_PASSWORD=password
    depends_on:
      - db
      - redis

  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: mydb # Without this, the default database is named after POSTGRES_USER
    volumes:
      - db_data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine

volumes:
  db_data: # Named volume: keeps the Postgres data directory across container restarts

8. Scaling Model

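  • Vertical: docker update --cpus 2 web-1 raises the CPU limit of a running container (--memory works the same way). Note that --cpu-shares only sets a relative weight under contention, not a cap.
  • Horizontal: docker-compose up -d --scale backend=3. Requires a Load Balancer like Nginx in front to distribute traffic. The fixed host port mapping ("8080:8080") must be dropped from the scaled service, since only one replica can bind host port 8080.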
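To make "distribute traffic" concrete, here is a rough sketch of round-robin selection, the simplest strategy a load balancer in front of the replicas might use. The `RoundRobinBalancer` class and the replica addresses are hypothetical; a real setup would use Nginx or similar rather than hand-written code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in a fixed rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = cycle(backends)

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)


# Hypothetical addresses of three scaled replicas.
replicas = ["backend-1:8080", "backend-2:8080", "backend-3:8080"]
balancer = RoundRobinBalancer(replicas)

picks = [balancer.pick() for _ in range(4)]
print(picks)  # the fourth request wraps around to the first replica
```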

9. Failure Modes

  • OOM Kill (Exit Code 137): Container used more RAM than its Cgroup limit, so the kernel killed it with SIGKILL (137 = 128 + 9). Fix: Increase the limit or fix the memory leak.
  • PID 1 Zombie: Your app runs as PID 1. If it crashes, the container stops. If it spawns child processes and never reaps them, zombies pile up and fill the process table. Use tini as the init process (docker run --init).
  • Disk Full: Docker logs are stored as JSON files on the Host. If not rotated, they fill the disk. Fix: Configure max-size and max-file under log-opts in daemon.json.
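Exit codes above 128 follow the shell convention 128 + signal number, which is where 137 comes from. The helper below is a small sketch for decoding them; `describe_exit_code` is a hypothetical name, not a Docker tool.

```python
import signal


def describe_exit_code(code):
    """Explain a container exit code using the 128 + signal number convention."""
    if code == 0:
        return "exited cleanly"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"application error (exit status {code})"


print(describe_exit_code(137))
print(describe_exit_code(143))
print(describe_exit_code(1))
```

An exit code of 137 only says the process received SIGKILL. To confirm it was the OOM killer rather than a manual docker kill, check the OOMKilled field under State in docker inspect.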

10. Security Model

  • Don't run as Root: By default, code runs as root inside the container. Unless user namespace remapping is enabled, an attacker who escapes is root on the host. Always set USER myuser.
  • Read-Only Root FS: docker run --read-only. Prevents attacker from modifying binaries.
  • Sensible Limits: Always set --memory and --cpus. One leaky app shouldn't crash the server.
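It can help to see what the --memory and --cpus values turn into. The sketch below assumes the default 100000 microsecond CFS period that Docker documents for --cpus, and binary units for memory suffixes; the function names are hypothetical.

```python
CFS_PERIOD_US = 100_000  # default scheduler period assumed for --cpus

# Docker memory suffixes are binary units (1m = 1024 * 1024 bytes).
UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def cpu_quota_us(cpus):
    """CPU time (microseconds) the container may use per scheduler period."""
    return round(cpus * CFS_PERIOD_US)


def memory_bytes(limit):
    """Convert a limit such as '512m' or '1g' into bytes."""
    limit = limit.strip().lower()
    if limit[-1] in UNITS:
        return int(limit[:-1]) * UNITS[limit[-1]]
    return int(limit)  # a bare number is already in bytes


print(cpu_quota_us(0.5))     # quota per 100000 us period for --cpus 0.5
print(memory_bytes("512m"))  # byte limit for --memory 512m
```

So docker run --cpus 0.5 --memory 512m lets the container run for at most half of every scheduling period and caps it at 536870912 bytes of RAM.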

11. Cost Model

  • Free: Docker Engine is open source.
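  • Paid: Docker Desktop (Mac/Windows) requires a paid subscription for large companies (at the time of writing, more than 250 employees or more than $10 million in annual revenue).
  • Compute: You pay for the underlying hosts (e.g. EC2 instances). Packing containers densely typically serves more traffic per $ than running one app per VM.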

12. When NOT To Use It

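  • High IOPS Databases: Running a primary Oracle/Postgres in Docker adds filesystem overhead when data lives in the container's writable layer (OverlayFS copy-on-write). A volume avoids most of that, but for extreme performance, run on bare metal or specialized VMs.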
  • GUI Apps: Possible (X11 forwarding), but painful. Docker is for headless daemons.

13. Interview & System Design Summary

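  • Layering: Image layers are shared and read-only, so 10 containers from the same image store it on disk only once. Copy-on-write gives each container just a thin writable layer for its own changes.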
  • Isolation: Namespaces (Visibility) vs Cgroups (Resources).
  • Networking: Bridge (Default) vs Host (Performance) vs Overlay (Swarm/K8s).
  • Volumes: Bind Mount (Dev) vs Docker Volume (Prod/Persistence).
  • Init Process: PID 1 must handle signals (SIGTERM/SIGINT) correctly to shut down gracefully.