Kubernetes (K8s)

1. Why This Exists (The Real Infrastructure Problem)

The Problem: Docker solved "It works on my machine," but created "How do I manage 500 machines?"

  • Container Sprawl: You have 500 Docker containers running across 20 VMs. Who restarts them when they crash?
  • Service Discovery: How does Container A find Container B if IPs change every time containers restart?
  • Resource Tetris: VM 1 is 90% full; VM 2 is 10% full. How do you pack containers efficiently to save money?

The Solution: A Cluster Operating System. You give it 50 VMs, and it treats them as one giant supercomputer.

2. Mental Model (Antigravity View)

The Analogy: The Shipping Port Manager.

  • Docker: The Container Box.
  • Kubernetes: The Crane, the Logbook, and the Safety Inspector.
  • You: The Customer. You say, "I need 5 replicas of the Web App running."
  • K8s: "I don't care where they run, but I guarantee there will always be 5." (State Reconciliation.)

One-Sentence Definition: A declarative control plane that automates deployment, scaling, and management of containerized applications.

3. Architecture Diagram

+---------------------------------------------------------------+
|                      Control Plane (Brain)                    |
|                                                               |
|  [API Server] <---> [etcd (Database)]                         |
|       ^  |                                                    |
|       |  +--------> [Scheduler] (Decides where Pods go)       |
|       |                                                       |
|   [Controller Mgr] (Fixes broken state)                       |
+-------+-------------------------------------------------------+
        |  (Talks via HTTPS to Kubelets)
        v
+-------+------------------+    +-------+------------------+
|      Worker Node 1       |    |      Worker Node 2       |
|                          |    |                          |
| [Kubelet] (The Agent)    |    | [Kubelet]                |
|    |                     |    |    |                     |
|    v                     |    |    v                     |
| [Container Runtime]      |    | [Container Runtime]      |
|    |                     |    |    |                     |
|  +-----+ +-----+         |    |  +-----+                 |
|  | Pod | | Pod |         |    |  | Pod |                 |
|  +-----+ +-----+         |    |  +-----+                 |
|     |       ^            |    |                          |
| [Kube-Proxy] (Networking)|    | [Kube-Proxy]             |
+--------------------------+    +--------------------------+

4. Core Concepts (No Fluff)

  1. Pod: The atom. One or more containers that share an IP and Volume. (Main App + Sidecar).
  2. ReplicaSet: The Xerox machine. "Ensure 3 copies of Pod X exist."
  3. Deployment: The Rolling Update manager. Manages ReplicaSets to move from v1 to v2 without downtime.
  4. Service: The internal Load Balancer/DNS. "Talk to my-service and I'll route you to one of the 3 healthy Pods."
  5. Ingress: The External Door. HTTP/HTTPS routing (Host/Path based) into the cluster.
  6. ConfigMap/Secret: Configuration files and Passwords injected into Pods as Env Vars or Files.
  7. StatefulSet: For Databases. Guarantees ordering: db-0 starts before db-1. Stable Network IDs.
  8. DaemonSet: Runs one Pod on every Node (log collectors, monitoring agents; see the sketch after this list).
  9. HPA (Horizontal Pod Autoscaler): "If CPU > 50%, add more replicas."
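
A minimal DaemonSet manifest for item 8. The name (node-agent) and image tag are illustrative assumptions, not part of the stack built later:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent # Hypothetical log-collector agent
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      containers:
      - name: agent
        image: fluent/fluentd:v1.16 # One copy lands on every Node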

Internal Components

  1. etcd: The Source of Truth. Key-Value store holding the entire state of the cluster.
  2. kubelet: The agent on every node that says "API Server wants Pod A here. Docker, run Pod A."
  3. Scheduler: The Tetris player. Logic: "Node 1 has 2GB RAM free. Pod requires 1GB. Place it there."
  4. API Server: The Front Door. The only component that talks to etcd. Everyone else talks to API Server.

5. How It Works Internally

Lifecycle of a Pod Creation:

  1. User: kubectl apply -f pod.yaml -> API Server.
  2. API Server: Validates the manifest and writes it to etcd. Returns "Created".
  3. Scheduler: Watches the API Server, sees a Pending Pod, filters nodes (resources/taints), scores them, assigns the Pod to Node 1, and updates the API Server.
  4. Kubelet (Node 1): Watches the API Server and sees "Pod assigned to me".
  5. Kubelet: Calls the CNI plugin (allocate an IP), then the CRI (Container Runtime Interface -> containerd/Docker) to pull the image and start the container.
  6. Kubelet: Updates the API Server: "Pod is Running".
  7. Kube-Proxy: Updates iptables rules on all nodes to route traffic to the new Pod IP.
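
For concreteness, a minimal pod.yaml that step 1 could be applying (name and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: web
    image: nginx:1.25 # In step 5, the Kubelet asks the runtime to pull and start this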

6. Command Reference (Very Important)

Get & Inspect

  • kubectl get pods -o wide: Show Pods + Pod IPs + Node location.
  • kubectl describe pod my-pod: The Debugging Gold. Shows events (ImagePullBackOff, FailedScheduling, OOMKilled).
  • kubectl get all: Show everything in current namespace.

Interaction

  • kubectl logs -f my-pod: Tail logs. If multi-container: -c container-name.
  • kubectl exec -it my-pod -- /bin/sh: Shell into pod.
  • kubectl port-forward svc/my-service 8080:80: Forward local port 8080 to Cluster Service port 80. (Access internal apps without Ingress).

Troubleshooting Nodes

  • kubectl top nodes: Show CPU/RAM usage.
  • kubectl get events --sort-by='.lastTimestamp': Show cluster-wide error events.

Apply & Updates

  • kubectl apply -f manifest.yaml: Best practice. Declarative update.
  • kubectl rollout restart deployment/my-app: Force a restart (new pods) without changing YAML.
  • kubectl scale deployment/my-app --replicas=5: Imperative scaling (Will be overwritten by HPA/GitOps).
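
A typical rolling-update workflow chains these together (deployment name assumed):

# Ship a new image, watch the rollout, roll back if it misbehaves
kubectl apply -f manifest.yaml
kubectl rollout status deployment/my-app    # Blocks until all new Pods are Ready
kubectl rollout history deployment/my-app   # List previous revisions
kubectl rollout undo deployment/my-app      # Revert to the prior revision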

7. Production Deployment Example (Full Stack)

Kubernetes Manifests (k8s/)

1. Secret & ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DB_HOST: "postgres-svc"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
stringData:
  DB_PASS: "supersecret"
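
You can also generate the Secret imperatively instead of committing plaintext YAML to Git (values shown are the placeholders above):

kubectl create secret generic app-secret --from-literal=DB_PASS=supersecret
# Or render the YAML without touching the cluster:
kubectl create secret generic app-secret --from-literal=DB_PASS=supersecret \
  --dry-run=client -o yaml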

2. Database (StatefulSet)

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: "postgres-svc"
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
        env:
        - name: POSTGRES_PASSWORD # The postgres image refuses to initialize without this
          valueFrom:
            secretKeyRef:
              name: app-secret
              key: DB_PASS
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: pg-data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates: # Auto-provision EBS/PersistentDisk
  - metadata:
      name: pg-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
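
The serviceName above must point at a headless Service (clusterIP: None) so each replica gets a stable DNS name like postgres-0.postgres-svc. A minimal sketch:

apiVersion: v1
kind: Service
metadata:
  name: postgres-svc
spec:
  clusterIP: None # Headless: DNS resolves to individual Pod IPs
  selector:
    app: postgres
  ports:
  - port: 5432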

3. Backend (Deployment)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: app
        image: my-repo/backend:v1
        envFrom:
        - configMapRef:
            name: app-config # Injects DB_HOST from the ConfigMap above
        env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-secret
              key: DB_PASS
        livenessProbe: # Restart if dead
          httpGet:
            path: /actuator/health
            port: 8080
        readinessProbe: # Don't send traffic until ready
          httpGet:
            path: /actuator/health
            port: 8080

4. Service (Internal Load Balancer)

apiVersion: v1
kind: Service
metadata:
  name: backend-svc
spec:
  selector:
    app: backend
  ports:
    - port: 80
      targetPort: 8080
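
To expose backend-svc externally through one shared load balancer (see the Cost Model below), an Ingress sketch. The hostname and ingress class are assumptions about your environment:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: backend-ingress
spec:
  ingressClassName: nginx # Assumes an NGINX ingress controller is installed
  rules:
  - host: api.example.com # Hypothetical hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: backend-svc
            port:
              number: 80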

8. Scaling Model

  • HPA (Auto-Scaler): Scales Pods based on CPU/Memory/custom metrics (e.g. requests per second); manifest sketch after this list.
  • Cluster Autoscaler / Karpenter: Scales Nodes (EC2) when Pods are "Pending" (No space left).
  • Rolling Update: Default. Spawns v2 pod, waits for readinessProbe, kills v1 pod. Zero downtime.
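
A minimal autoscaling/v2 HPA targeting the backend Deployment above (thresholds are illustrative):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: backend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: spring-backend
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50 # "If CPU > 50%, add more replicas"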

9. Failure Modes

  • CrashLoopBackOff: App started, then crashed immediately. K8s restarts it with exponential backoff (10s, 20s, 40s... capped at 5m). Check kubectl logs.
  • ImagePullBackOff: Typo in image name or Missing Auth secret for Private Registry.
  • Pending: No Nodes available with enough CPU/RAM. Cluster Autoscaler should kick in.
  • Node NotReady: Kubelet stopped sending heartbeats. K8s waits 5 mins (pod-eviction-timeout) then reschedules pods elsewhere.
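
A triage sequence that covers most of these (pod name assumed):

kubectl describe pod my-pod                     # Events: why Pending / ImagePullBackOff
kubectl logs my-pod --previous                  # Logs from the crashed container (CrashLoopBackOff)
kubectl get nodes                               # Spot NotReady nodes
kubectl get events --sort-by='.lastTimestamp'   # Cluster-wide error timeline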

10. Security Model

  • RBAC: Role Based Access Control. "User Bob can only get pods in dev namespace".
  • Pod Security Standards (PSS): "Prevent Containers from running as Root."
  • Network Policies: Default is "Allow All". You must create policies to deny traffic, e.g. between Namespace A and B. (Requires CNI support such as Calico or VPC CNI.)
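
A default-deny NetworkPolicy that flips a namespace from "Allow All" to "Deny All" ingress (you then whitelist specific flows):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {} # Selects every Pod in the namespace
  policyTypes:
  - Ingress       # No ingress rules listed = deny all inbound traffic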

11. Cost Model

  • Control Plane: ~$73/month (EKS).
  • Worker Nodes: Paid per second (EC2).
  • Waste: If you request memory: 4Gi but use 500Mi, you pay for 4Gi (reserved capacity); see the requests/limits sketch after this list.
  • Load Balancers: Each Service of type LoadBalancer = $$$. Use Ingress (1 LB for 50 services).
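
The waste comes from the resources block on each container; a sketch with illustrative numbers:

    resources:
      requests:       # What the Scheduler reserves (and what you pay for)
        memory: "512Mi"
        cpu: "250m"
      limits:         # Hard cap; exceeding the memory limit -> OOMKilled
        memory: "1Gi"
        cpu: "500m"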

12. When NOT To Use It

  • Simple App: 1-2 containers? Use Docker Compose on a VM or App Runner.
  • Limited Ops Team: K8s is complex. If you don't have someone to manage upgrades, logging, and ingress, it will break you.
  • Stateful Monoliths: Legacy apps requiring specific MAC addresses or IP hardcoding.

13. Interview & System Design Summary

  • Imperative vs Declarative: K8s is Declarative ("Make it look like this YAML").
  • Self-Healing: The Controller Manager loop (Current State -> Desired State).
  • Service Discovery: DNS (CoreDNS) maps Service Name -> ClusterIP -> Pod IPs.
  • Probes: Liveness (Restart) vs Readiness (Traffic flow).
  • Sidecar Pattern: Helper container logging/proxying alongside main app in same Pod.
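
A minimal sidecar-pattern Pod for the last bullet: two containers sharing the Pod's network namespace and a volume (names and images are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  volumes:
  - name: logs
    emptyDir: {} # Shared scratch space between the two containers
  containers:
  - name: main-app
    image: my-repo/backend:v1 # Writes logs to /var/log/app
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-shipper # Sidecar: reads the same volume, ships logs out
    image: fluent/fluentd:v1.16
    volumeMounts:
    - name: logs
      mountPath: /var/log/app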