VPC (Virtual Private Cloud)
1. Why This Service Exists (The Real Problem)
The Problem: In the early cloud days (EC2-Classic), all customers shared one giant flat network.
- Security Nightmare: Your database was potentially reachable by my web server if I guessed your IP.
- No Customization: You couldn't choose your own IP range or control routing.
The Solution: AWS carved out a private, isolated slice of the AWS cloud just for you.
2. Mental Model (Antigravity View)
The Analogy: Your House in a Gated Community.
- VPC: Your house (The isolated boundary).
- Subnets: Rooms in the house (Bedrooms are private, Living Room is public).
- Route Table: The Hallway signs (To Kitchen -> Left).
- IGW (Internet Gateway): The Front Door (To the outside world).
- NACL: The Security Guard at the neighborhood gate (Subnet level).
- Security Group: The Lock on your Bedroom Door (Instance level).
One-Sentence Definition: A logically isolated virtual network where you launch your AWS resources, defined by an IP address range.
3. Core Components (No Marketing)
- CIDR Block: The IP range (e.g., 10.0.0.0/16). Defines the size of your network (65,536 IPs).
- Subnets:
- Public: Has a route to the Internet Gateway (IGW).
- Private: No direct route to the internet. Uses NAT Gateway to talk out.
- Isolated: No internet access at all (Database only).
- Route Tables: The rules engine. "Traffic for 0.0.0.0/0 goes to igw-123".
- Internet Gateway (IGW): The bridge between your VPC and the public internet.
- NAT Gateway: A one-way mirror. Private instances can "see out" (download updates) but nobody can "see in".
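The CIDR sizing above can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the helper name `usable_ips` is made up for illustration, and it assumes the AWS behaviour of reserving 5 addresses in every subnet.

```python
import ipaddress

# AWS reserves 5 addresses in every subnet
# (network address, VPC router, DNS, future use, broadcast).
AWS_RESERVED_PER_SUBNET = 5

def usable_ips(cidr: str) -> int:
    """Return how many addresses in a subnet CIDR can be assigned to resources."""
    network = ipaddress.ip_network(cidr)
    return network.num_addresses - AWS_RESERVED_PER_SUBNET

vpc = ipaddress.ip_network("10.0.0.0/16")
print(vpc.num_addresses)          # 65536
print(usable_ips("10.0.1.0/24"))  # 251
```

A /24 subnet therefore gives you 251 assignable addresses, not 256, which matters when you size subnets tightly.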
4. How It Works Internally (Simplified)
- Packet Flow:
- EC2 10.0.1.5 sends a packet to 8.8.8.8 (Google DNS).
- VPC Router looks at the Subnet's Route Table.
- Rule: 0.0.0.0/0 -> nat-gateway-id.
- Packet goes to NAT GW -> IGW -> ISP.
- Isolation:
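- VPC traffic is encapsulated by the AWS network virtualization layer (VLAN tagging on steroids), so every packet is tied to its VPC. The host only delivers packets addressed to your instance, so even if you sniff the wire, you can't see other customers' packets.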
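The route lookup in the packet flow can be sketched as a longest-prefix match. This is a toy model, not the actual VPC router: the `ROUTES` table and the `lookup_route` helper are assumptions made for illustration, with `nat-gateway-id` taken from the rule above.

```python
import ipaddress

# Hypothetical route table for a private subnet: (destination CIDR, target).
ROUTES = [
    ("10.0.0.0/16", "local"),          # traffic that stays inside the VPC
    ("0.0.0.0/0", "nat-gateway-id"),   # everything else goes to the NAT Gateway
]

def lookup_route(dest_ip, routes):
    """Return the target of the most specific route containing dest_ip, or None."""
    address = ipaddress.ip_address(dest_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if address in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The longest prefix (most specific route) wins.
    return max(matches)[1]

print(lookup_route("8.8.8.8", ROUTES))   # nat-gateway-id
print(lookup_route("10.0.2.7", ROUTES))  # local
```

Both routes match 10.0.2.7, but the /16 is more specific than the /0, so traffic between subnets never leaves the VPC.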
5. Common Production Use Cases
- Multi-Tier Web App:
- Public Subnet: Load Balancer (ALB) + NAT Gateway.
- Private Subnet: App Servers (EC2/EKS).
- Data Subnet: Databases (RDS/ElastiCache).
- Hybrid Cloud: Connecting your on-premises data center to a VPC via VPN or Direct Connect.
6. Architecture Patterns
The "Standard" 3-Tier Architecture
Requirement: 2 Availability Zones (AZs) for High Availability.
Total Layout:
- VPC CIDR: 10.0.0.0/16
- AZ 1:
- Public Subnet (ALB, NAT GW) 10.0.0.0/24
- Private App Subnet (EC2) 10.0.1.0/24
- Private DB Subnet (RDS) 10.0.2.0/24
- AZ 2: (Mirrors AZ 1 for redundancy)
- Public 10.0.10.0/24
- Private App 10.0.11.0/24
- Private DB 10.0.12.0/24
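A layout like this is easy to sanity-check before you create anything. The sketch below uses the CIDRs from the layout above; the subnet names and the `validate_layout` helper are hypothetical, chosen only for this example.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")

# Subnet names are illustrative; the CIDRs come from the layout above.
SUBNETS = {
    "az1-public": "10.0.0.0/24",
    "az1-app": "10.0.1.0/24",
    "az1-db": "10.0.2.0/24",
    "az2-public": "10.0.10.0/24",
    "az2-app": "10.0.11.0/24",
    "az2-db": "10.0.12.0/24",
}

def validate_layout(vpc, subnets):
    """Return a list of problems: subnets outside the VPC or overlapping each other."""
    problems = []
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    for name, network in networks.items():
        if not network.subnet_of(vpc):
            problems.append(f"{name} is outside {vpc}")
    names = list(networks)
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            if networks[first].overlaps(networks[second]):
                problems.append(f"{first} overlaps {second}")
    return problems

print(validate_layout(VPC_CIDR, SUBNETS))  # []
```

An empty list means every subnet sits inside the VPC range and no two subnets collide.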
7. IAM & Security Model
Network Access Control Lists (NACLs) vs. Security Groups (SGs):
- Security Group: Stateful. Applied to the Instance (its network interface). Allow rules only. If you allow a request IN, the response OUT is automatically allowed.
- NACL: Stateless. Applied to the Subnet. Supports both allow and deny rules. You must explicitly allow IN and OUT.
- Use Case: Use SGs for everything. Use NACLs only to block specific malicious IPs (e.g., a DDoS source).
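NACL rules are evaluated in ascending rule-number order and the first match wins, which is what makes the "block one bad IP" use case work. The sketch below is a simplified model that matches on source IP only and ignores ports and protocols; the rule set, the `evaluate_nacl` helper, and the IP addresses (taken from the documentation ranges) are all assumptions for illustration.

```python
import ipaddress

# Hypothetical inbound NACL rules: (rule number, source CIDR, action).
INBOUND_RULES = [
    (100, "0.0.0.0/0", "allow"),      # broad allow
    (50, "203.0.113.7/32", "deny"),   # made-up attacker IP, lower number so it is checked first
]

def evaluate_nacl(source_ip, rules):
    """Return the action of the lowest-numbered rule matching source_ip."""
    address = ipaddress.ip_address(source_ip)
    for _number, cidr, action in sorted(rules):
        if address in ipaddress.ip_network(cidr):
            return action
    return "deny"  # implicit final rule: anything unmatched is denied

print(evaluate_nacl("203.0.113.7", INBOUND_RULES))    # deny
print(evaluate_nacl("198.51.100.20", INBOUND_RULES))  # allow
```

A Security Group cannot express this, because it has no deny rules; it can only leave the address out of its allow list.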
VPC Flow Logs:
- The "Security Camera" for your network. Records metadata for each allowed/denied flow (addresses, ports, protocol, byte counts, ACCEPT/REJECT), not the packet contents. Essential for forensic analysis.
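To hunt for rejected traffic, you can parse flow log records yourself. This sketch assumes the default (version 2) record format with its 14 space-separated fields; the sample lines are made up for illustration and `parse_flow_log` is a hypothetical helper.

```python
# Field order of the default (version 2) flow log record.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

def parse_flow_log(line):
    """Split one default-format flow log record into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))

# Illustrative records, not real log output.
SAMPLE_LINES = [
    "2 123456789012 eni-0abc1234 10.0.1.5 10.0.2.9 49152 5432 6 3 180 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0abc1234 10.0.1.5 10.0.0.10 49153 443 6 10 5200 1700000000 1700000060 ACCEPT OK",
]

rejected = [
    record
    for record in map(parse_flow_log, SAMPLE_LINES)
    if record["action"] == "REJECT"
]
for record in rejected:
    print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"])
```

A REJECT on the destination port you expected to reach usually points to a missing Security Group or NACL rule.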
8. Cost Model (Very Important)
The Hidden Tax of Cloud Networking.
1. NAT Gateway:
- Hourly charge (~$0.045/hr) + Data Processing Charge ($0.045/GB).
- If you download terabytes through NAT, you will go broke.
2. VPC Peering:
- Data transfer across peered VPCs in same region is usually free (if in same AZ) or cheap (cross-AZ). Cross-region peering is expensive.
3. VPN: Hourly charge for connection.
4. Egress: Sending data out to the internet ($0.09/GB typically).
Optimization:
- Use VPC Endpoints (Gateway type) for S3 and DynamoDB. It's free and bypasses NAT Gateway.
- Use VPC Endpoints (Interface type) for other AWS services to keep traffic within AWS network.
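A rough estimate shows how quickly the NAT Gateway bill grows. The sketch below uses the approximate rates quoted above; actual prices vary by region, the 730-hour month is an assumption, and `nat_monthly_cost` is a hypothetical helper.

```python
HOURS_PER_MONTH = 730      # assumed average month length in hours
NAT_HOURLY_USD = 0.045     # approximate rate quoted above
NAT_PER_GB_USD = 0.045     # approximate data processing rate quoted above

def nat_monthly_cost(gb_processed, gateways=1):
    """Estimate the monthly NAT Gateway cost in USD for a given data volume."""
    hourly = gateways * HOURS_PER_MONTH * NAT_HOURLY_USD
    processing = gb_processed * NAT_PER_GB_USD
    return round(hourly + processing, 2)

all_through_nat = nat_monthly_cost(10_000)   # 10,000 GB, everything via NAT
s3_via_endpoint = nat_monthly_cost(2_000)    # 8,000 GB of S3 traffic moved to a Gateway Endpoint
print(all_through_nat, s3_via_endpoint)
```

Under these assumptions, pushing 10,000 GB through one NAT Gateway costs about $483 a month, and moving 8,000 GB of S3 traffic to a Gateway Endpoint cuts that to about $123.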
9. Common Mistakes & Anti-Patterns
- CIDR Overlap: Picking 192.168.1.0/24 for your VPC, then trying to VPN to your office which also uses 192.168.1.0/24. Routing breaks. Fix: Use obscure ranges like 10.123.0.0/16.
- One Giant Public Subnet: Putting Database and App servers in public subnet with Public IPs. Security Risk.
- Hardcoding IPs: AWS IPs are dynamic. Auto-assigned public IPs change on stop/start, and replacement instances get new private IPs. Use DNS names.
- Running out of IPs: Picking a /24 VPC (256 IPs) for an EKS cluster. Pods consume IPs fast. Use /16 (65k IPs).
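The CIDR overlap mistake can be caught before the VPC exists. This is a minimal sketch: the `OFFICE_NETWORKS` list and the `find_overlaps` helper are hypothetical and stand in for whatever ranges your on-premises network actually uses.

```python
import ipaddress

# Hypothetical on-premises ranges you will need to route to over VPN.
OFFICE_NETWORKS = ["192.168.1.0/24", "172.16.0.0/20"]

def find_overlaps(candidate, existing):
    """Return the existing CIDRs that overlap the candidate VPC CIDR."""
    candidate_network = ipaddress.ip_network(candidate)
    return [
        cidr
        for cidr in existing
        if candidate_network.overlaps(ipaddress.ip_network(cidr))
    ]

print(find_overlaps("192.168.1.0/24", OFFICE_NETWORKS))  # ['192.168.1.0/24']
print(find_overlaps("10.123.0.0/16", OFFICE_NETWORKS))   # []
```

An empty result means the candidate range can be routed to the office without ambiguity.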
10. When NOT to Use This Service
- You ALWAYS use VPC.
- (Unless you use high-level abstractions like Lightsail or App Runner which hide the VPC from you, but it's still there under the hood).
- Edge Case: Lambda runs outside your VPC by default. Attaching a function to a VPC used to mean slow cold starts, but since AWS reworked Lambda VPC networking in 2019 it runs efficiently inside.
11. Interview-Level Summary
- NAT Gateway: How do private instances talk to internet?
- Bastion Host: How do you SSH into private instances?
- Peering: How do two VPCs talk? (Non-transitive!).
- Endpoints: How to talk to S3 without NAT? (Gateway Endpoint).
- SGs vs NACLs: Stateful vs Stateless.
- Flow Logs: How to debug "Connect Timeout"?