Kubernetes (K8s)
1. Why This Exists (The Real Infrastructure Problem)
The Problem: Docker solved "It works on my machine", but created "How do I manage 500 machines?"
- Container Sprawl: You have 500 Docker containers running across 20 VMs. Who restarts them when they crash?
- Service Discovery: How does Container A find Container B if IPs change every time containers restart?
- Resource Tetris: VM 1 is 90% full. VM 2 is 10% full. How do you efficiently pack containers to save money?
The Solution: A Cluster Operating System. You give it 50 VMs, and it treats them as one giant supercomputer.
2. Mental Model (Antigravity View)
The Analogy: The Shipping Port Manager.
- Docker: The Container Box.
- Kubernetes: The Crane, the Logbook, and the Safety Inspector.
- You: The Customer. You say, "I need 5 replicas of the Web App running."
- K8s: "I don't care where they run, but I guarantee there will always be 5." (State Reconciliation.)
One-Sentence Definition: A declarative control plane that automates deployment, scaling, and management of containerized applications.
3. Architecture Diagram
+----------------------------------------------------------+
|                  Control Plane (Brain)                   |
|                                                          |
|   [API Server] <---> [etcd (Database)]                   |
|        ^                                                 |
|        |  +------> [Scheduler] (Decides where Pods go)   |
|        |                                                 |
|   [Controller Mgr] (Fixes broken state)                  |
+--------+-------------------------------------------------+
         | (Talks via HTTPS to Kubelets)
         v
+--------+-------------------+    +----------------------------+
|       Worker Node 1        |    |       Worker Node 2        |
|                            |    |                            |
| [Kubelet] (The Agent)      |    | [Kubelet]                  |
|     |                      |    |     |                      |
|     v                      |    |     v                      |
| [Container Runtime]        |    | [Container Runtime]        |
|     |                      |    |     |                      |
| +-----+  +-----+           |    | +-----+                    |
| | Pod |  | Pod |           |    | | Pod |                    |
| +-----+  +-----+           |    | +-----+                    |
|    ^                       |    |    ^                       |
| [Kube-Proxy] (Networking)  |    | [Kube-Proxy]               |
+----------------------------+    +----------------------------+
4. Core Concepts (No Fluff)
- Pod: The atom. One or more containers that share an IP and Volume. (Main App + Sidecar).
- ReplicaSet: The Xerox machine. "Ensure 3 copies of Pod X exist."
- Deployment: The Rolling Update manager. Manages ReplicaSets to move from v1 to v2 without downtime.
- Service: The internal Load Balancer/DNS. "Talk to my-service and I'll route you to one of the 3 healthy Pods."
- Ingress: The External Door. HTTP/HTTPS routing (Host/Path based) into the cluster.
- ConfigMap/Secret: Configuration files and Passwords injected into Pods as Env Vars or Files.
- StatefulSet: For Databases. Guarantees ordering: db-0 starts before db-1. Stable Network IDs.
- DaemonSet: Runs one Pod on every Node. (Log shippers, monitoring agents.)
- HPA (Horizontal Pod Autoscaler): "If CPU > 50%, add more replicas."
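A minimal sketch of the HPA bullet above, using the autoscaling/v2 API (the target Deployment name my-app is illustrative):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                   # illustrative target
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # "If CPU > 50%, add more replicas"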
Internal Components
- etcd: The Source of Truth. Key-Value store holding the entire state of the cluster.
- kubelet: The agent on every node that says "API Server wants Pod A here. Docker, run Pod A."
- Scheduler: The Tetris player. Logic: "Node 1 has 2GB RAM free. Pod requires 1GB. Place it there."
- API Server: The Front Door. The only component that talks to etcd. Everyone else talks to the API Server.
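To see that the API Server really is the front door, note that kubectl is just an HTTP client. Two hedged ways to peek behind the curtain (paths assume the default namespace):

kubectl get --raw /api/v1/namespaces/default/pods   # a raw REST call through the API Server
kubectl get pods -v=8                               # verbose mode prints the HTTPS requests kubectl sends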
5. How It Works Internally
Lifecycle of a Pod Creation:
1. User: kubectl apply -f pod.yaml -> API Server.
2. API Server: Validates and writes to etcd. Returns "Created".
3. Scheduler: Watches the API Server. Sees a "Pending" Pod. Filters nodes (resources/taints), scores them, assigns the Pod to Node 1, and updates the API Server.
4. Kubelet (Node 1): Watches API Server. Sees "Pod assigned to me".
5. Kubelet:
- Calls CNI Plugin (Allocate IP).
- Calls CRI (Container Runtime Interface) -> (containerd/Docker) to pull image and start container.
6. Kubelet: Updates API Server: "Pod is Running".
7. Kube-Proxy: Updates iptables rules on all nodes to route traffic to the new Pod IP.
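For reference, a minimal pod.yaml that step 1 could be applying (name and image are illustrative). The resources.requests are what the Scheduler filters and scores on in step 3:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
    - name: web
      image: nginx:1.25        # illustrative image
      resources:
        requests:
          cpu: "250m"          # Scheduler only considers nodes with this much free
          memory: "256Mi"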
6. Command Reference (Very Important)
Get & Inspect
- kubectl get pods -o wide: Show Pods + IPs + Node placement.
- kubectl describe pod my-pod: The debugging gold. Shows events (ImagePullBackOff, FailedScheduling, OOMKilled).
- kubectl get all: Show everything in the current namespace.
Interaction
- kubectl logs -f my-pod: Tail logs. If multi-container, add -c container-name.
- kubectl exec -it my-pod -- /bin/sh: Shell into a Pod.
- kubectl port-forward svc/my-service 8080:80: Forward local port 8080 to Service port 80. (Access internal apps without Ingress.)
Troubleshooting Nodes
- kubectl top nodes: Show CPU/RAM usage (requires metrics-server).
- kubectl get events --sort-by='.lastTimestamp': Show cluster-wide events, newest last.
Apply & Updates
- kubectl apply -f manifest.yaml: Best practice. Declarative update.
- kubectl rollout restart deployment/my-app: Force a restart (new Pods) without changing YAML.
- kubectl scale deployment/my-app --replicas=5: Imperative scaling (will be overwritten by HPA/GitOps).
7. Production Deployment Example (Full Stack)
Kubernetes Manifests (k8s/)
1. Secret & ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DB_HOST: "postgres-svc"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
stringData:
  DB_PASS: "supersecret"
2. Database (StatefulSet)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: "postgres-svc"
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: pg-data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:            # Auto-provision EBS/PersistentDisk
    - metadata:
        name: pg-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
3. Backend (Deployment)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
        - name: app
          image: my-repo/backend:v1
          env:
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: app-secret
                  key: DB_PASS
          livenessProbe:           # Restart if dead
            httpGet:
              path: /actuator/health
              port: 8080
          readinessProbe:          # Don't send traffic until ready
            httpGet:
              path: /actuator/health
              port: 8080
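Note the ConfigMap from part 1 isn't consumed above. One hedged way to wire it in is an envFrom block alongside env: in the container spec, which injects every key (here DB_HOST) as an environment variable:

          envFrom:
            - configMapRef:
                name: app-config   # injects DB_HOST from part 1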
4. Service (Internal Load Balancer)
apiVersion: v1
kind: Service
metadata:
  name: backend-svc
spec:
  selector:
    app: backend
  ports:
    - port: 80
      targetPort: 8080
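The stack above still has no External Door. A minimal Ingress sketch routing to backend-svc (assumes an ingress controller such as ingress-nginx is installed; the hostname is illustrative):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  ingressClassName: nginx          # assumes the NGINX ingress controller
  rules:
    - host: api.example.com        # illustrative hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: backend-svc
                port:
                  number: 80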
8. Scaling Model
- HPA (Auto-Scaler): Scales Pods based on CPU/Memory/Custom Metrics (Requests per sec).
- Cluster Autoscaler / Karpenter: Scales Nodes (EC2) when Pods are "Pending" (No space left).
- Rolling Update: Default. Spawns a v2 Pod, waits for its readinessProbe, then kills a v1 Pod. Zero downtime. (See the fragment below.)
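The rolling-update knobs live under the Deployment spec. A hedged fragment that would slot into the spring-backend Deployment from section 7:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # allow one extra v2 Pod above the desired count
      maxUnavailable: 0     # never drop below the desired count -> zero downtime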
9. Failure Modes
- CrashLoopBackOff: App started, then crashed immediately. K8s restarts it with exponential backoff (10s, 20s, 40s... capped at 5m). Check kubectl logs.
- ImagePullBackOff: Typo in the image name, or missing auth secret for a private registry.
- Pending: No Nodes available with enough CPU/RAM. Cluster Autoscaler should kick in.
- Node NotReady: Kubelet stopped sending heartbeats. K8s waits 5 mins (pod-eviction-timeout) then reschedules pods elsewhere.
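A hedged first-response flow for these failure modes (pod name is illustrative):

kubectl describe pod my-pod                       # Events: ImagePullBackOff, FailedScheduling, OOMKilled
kubectl logs my-pod --previous                    # logs from the crashed container, not the fresh restart
kubectl get events --sort-by='.lastTimestamp'     # cluster-wide view, newest last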
10. Security Model
- RBAC: Role-Based Access Control. "User Bob can only get Pods in the dev namespace."
- Pod Security Standards (PSS): "Prevent containers from running as Root."
- Network Policies: Default is a permissive "Allow All". You must create policies to deny traffic, e.g. between Namespace A and B. (Requires CNI support like Calico/VPC CNI.)
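Minimal sketches of the first and third bullets (names and the dev namespace are illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: dev
rules:
  - apiGroups: [""]                 # "" = core API group (Pods live here)
    resources: ["pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bob-pod-reader
  namespace: dev
subjects:
  - kind: User
    name: bob
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: dev
spec:
  podSelector: {}                   # selects every Pod in the namespace
  policyTypes:
    - Ingress                       # no ingress rules listed -> all inbound traffic denied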
11. Cost Model
- Control Plane: ~$73/month (EKS).
- Worker Nodes: Paid per second (EC2).
- Waste: If you request memory: 4Gi but use 500Mi, you pay for 4Gi (reserved capacity).
- Load Balancers: Each Service of type: LoadBalancer = $$$. Use Ingress (1 LB for 50 services).
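The waste bullet in practice: requests are the capacity you reserve (and effectively pay for), limits are the ceiling. A hedged right-sizing fragment for a container spec (values are illustrative):

          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"    # capacity reserved on the node, billed whether used or not
            limits:
              memory: "1Gi"      # container is OOMKilled above this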
12. When NOT To Use It
- Simple App: 1-2 containers? Use Docker Compose on a VM or App Runner.
- Limited Ops Team: K8s is complex. If you don't have someone to manage upgrades, logging, and ingress, it will break you.
- Stateful Monoliths: Legacy apps requiring specific MAC addresses or IP hardcoding.
13. Interview & System Design Summary
- Imperative vs Declarative: K8s is Declarative ("Make it look like this YAML").
- Self-Healing: The Controller Manager loop (Current State -> Desired State).
- Service Discovery: DNS (CoreDNS) maps Service Name -> ClusterIP -> Pod IPs.
- Probes: Liveness (Restart) vs Readiness (Traffic flow).
- Sidecar Pattern: Helper container logging/proxying alongside main app in same Pod.
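A hedged sketch of the Sidecar Pattern from the last bullet: both containers share the Pod's network namespace and an emptyDir volume (images are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
    - name: app
      image: my-repo/backend:v1        # main app writes logs to /var/log/app
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: log-shipper                # sidecar reads the same files and ships them
      image: fluent/fluent-bit:2.2
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    - name: app-logs
      emptyDir: {}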