AWS Connectivity — VPN, Direct Connect & VPC Peering

1. Connectivity Scenarios

Need Solution
On-premises ↔ AWS (over internet) Site-to-Site VPN
On-premises ↔ AWS (dedicated fiber) AWS Direct Connect
Multiple on-prem branches ↔ AWS VPN CloudHub
Remote user ↔ AWS AWS Client VPN
VPC ↔ VPC (same or different account/region) VPC Peering
Many VPCs + on-premises (hub model) Transit Gateway

2. Site-to-Site VPN

Components

Component Location Role
Virtual Private Gateway (VGW) AWS VPC AWS-side VPN endpoint — attached to VPC
Customer Gateway (CGW) On-premises Represents your physical VPN device in AWS
VPN Connection Logical link Two IPsec tunnels connecting VGW ↔ CGW
On-Premises                              AWS
Data Center                              VPC
  [Router/Firewall]                      [VGW]
  [CGW resource] ──── Tunnel 1 ──────────┤
                ──── Tunnel 2 ──────────┘  (two tunnels for redundancy)

Key Properties

Property Detail
Transport Public internet
Encryption IPsec (AES-256)
Tunnels per connection 2 (active/passive or ECMP active/active)
Routing Static routes OR BGP (dynamic)
Speed Up to 1.25 Gbps per tunnel
Setup time Minutes to hours
Cost Low (~$0.05/hr per VPN connection)
Latency Variable (depends on internet)

Two tunnels = redundancy. If one tunnel fails, traffic automatically shifts to the second. Each tunnel terminates in a different AZ on the AWS side.

Routing — Static vs BGP

Mode How Use When
Static You manually add on-prem CIDR to VPN config + route table Simple setups, predictable routes
BGP (dynamic) Routes automatically propagated via route propagation Complex/changing networks, multiple routes
# With BGP + route propagation enabled:
# Route table automatically gets:
# 192.168.0.0/24 → vgw-xxxxxxxx  (on-prem network, auto-propagated)

3. AWS VPN CloudHub

Extends Site-to-Site VPN to connect multiple on-premises offices to the same VGW — and allows those offices to communicate with each other through AWS.

Branch A (New York)  ─── VPN ───┐
Branch B (London)    ─── VPN ───┼── VGW (single VPC) ── AWS VPC
Branch C (Karachi)   ─── VPN ───┘
              ↕ (branches can also talk to each other via VGW)

Uses a hub-and-spoke model. Each branch needs a unique BGP ASN. Traffic between branches goes through AWS — billed as data transfer.


4. AWS Client VPN

For individual users (developers, remote employees) to connect to AWS or on-premises:

Your Laptop (OpenVPN client) → Client VPN Endpoint → VPC / On-Premises
Property Detail
Protocol OpenVPN (TLS)
Authentication Active Directory, SAML, certificate-based
Split tunneling ✅ Optional — only VPC traffic through VPN, rest goes to internet directly
Use case Remote workers, developers, temporary access

Client VPN ≠ Site-to-Site VPN. Client VPN is user-level access, not network-level.


5. AWS Direct Connect (DX)

Definition

A dedicated private fiber connection from your data center to an AWS Direct Connect Location (colocation facility) — bypassing the public internet entirely.

Your Data Center
  → Your private fiber →
    Direct Connect Location (AWS-partner colocation)
      → AWS backbone →
        AWS Region

You do NOT lay fiber directly to AWS. You connect to a Direct Connect Location (e.g., Equinix, Cyxtera) where AWS has a presence. A third-party carrier handles your data center → DX Location connection.

Connection Types

Type Bandwidth Provisioned By
Dedicated Connection 1, 10, or 100 Gbps AWS directly (from DX Location)
Hosted Connection 50 Mbps – 10 Gbps AWS Partner (sub-1G options available)

Key Properties

Property Detail
Network Private — no public internet
Encryption ❌ NOT encrypted by default
Latency Consistent and low
Bandwidth 1–100 Gbps (dedicated)
Setup time Weeks to months (physical provisioning)
Cost High — port fee + data transfer
SLA 99.99% with redundant connections

DX is NOT encrypted by default. For encryption: run a Site-to-Site VPN over the Direct Connect connection (VPN over DX = private path + encryption).

Virtual Interfaces (VIFs) ⭐

A VIF is a logical subdivision of the physical DX connection — allows one physical fiber to carry multiple traffic types.

VIF Type Connects To IP Addressing Max Bandwidth Use Case
Private VIF VPC (via VGW or DX Gateway) RFC 1918 (private) 10 Gbps Access private resources in VPC
Public VIF AWS public services (S3, DynamoDB, CloudFront) Public IPs 10 Gbps Access AWS public endpoints without internet
Transit VIF Transit Gateway (directly) RFC 1918 (private) 100 Gbps Multi-VPC modern architecture
Same DX connection can carry multiple VIFs (different VLANs):
  VLAN 100 → Private VIF → VPC-A
  VLAN 200 → Public VIF  → S3, SQS, SNS
  VLAN 300 → Transit VIF → Transit Gateway → multiple VPCs

Direct Connect Gateway

Connects one DX connection to multiple VPCs across Regions and accounts:

On-Premises → DX Location → Private VIF → DX Gateway → VGW (VPC-A, us-east-1)
                                                    → VGW (VPC-B, ap-south-1)
                                                    → VGW (VPC-C, eu-west-1)

Without DX Gateway — each VPC needs its own VIF (expensive and complex).

Architecture Decision

Single VPC, simple:       Private VIF → VGW → VPC
Multiple VPCs, modern:    Transit VIF → Transit Gateway → VPCs  (up to 100 Gbps)
Multiple VPCs, legacy:    Private VIF → DX Gateway → TGW → VPCs
Access S3/DynamoDB:       Public VIF → AWS Public Zone

6. DX + VPN — Backup Architecture ⭐

Direct Connect has no SLA on the fiber between your data center and the DX Location. Best practice for production:

Primary:  Direct Connect     (high bandwidth, low latency, private)
Backup:   Site-to-Site VPN   (internet-based, activates on DX failure)
On-Premises
  ├── Direct Connect → VGW   (primary, high-speed)
  └── VPN Connection → VGW   (backup, activates when DX fails)

BGP routing handles failover automatically — DX routes preferred (shorter AS path).


7. VPN vs Direct Connect

Feature Site-to-Site VPN Direct Connect
Transport Public internet Private dedicated fiber
Encryption ✅ IPsec (always) ❌ Not by default (add VPN for encryption)
Latency Variable Consistent, low
Bandwidth ~1.25 Gbps/tunnel 1–100 Gbps
Setup time Minutes–hours Weeks–months
Cost Low (~$0.05/hr) High (port + data)
Reliability Depends on internet Very high (99.99% with redundancy)
Use case Quick setup, backup, dev Enterprise, large data, compliance

8. VPC Peering ⭐

Private connection between two VPCs over the AWS internal network — traffic never leaves AWS backbone.

How It Works

VPC-A (Requester)     →     VPC-B (Accepter)
  sends peering request       accepts request
  updates own route table     updates own route table
  → communication works ✅

Setup Steps

1. Create VPC Peering Connection
   (VPC-A initiates → VPC-B accepts)

2. Update Route Table in VPC-A:
   Destination: 192.168.0.0/16 → Target: pcx-xxxxxxxx

3. Update Route Table in VPC-B:
   Destination: 10.0.0.0/16 → Target: pcx-xxxxxxxx

4. Update Security Groups / NACLs to allow traffic

Peering connection alone = nothing. Route tables are what actually make it work.

Properties

Property Detail
Transport AWS private network (no internet)
Encryption Encrypted in transit (AWS backbone)
Cross-Region ✅ Yes — inter-region peering (data transfer charges apply)
Cross-Account ✅ Yes — requester/accepter in different accounts
Max peering per VPC 50 (default) / 125 (max with increase)
Cost Free within same Region; data transfer charges inter-Region

Critical Limitations ⭐

1. No Overlapping CIDRs:

VPC-A: 10.0.0.0/16
VPC-B: 10.0.0.0/24  ← overlaps → peering BLOCKED ❌

VPC-A: 10.0.0.0/16
VPC-B: 192.168.0.0/16 ← no overlap → allowed ✅

2. No Transitive Routing:

VPC-A ↔ VPC-B (peered)
VPC-A ↔ VPC-C (peered)
VPC-B ↛ VPC-C (NOT routable — traffic cannot transit through VPC-A)

Fix: Create direct VPC-B ↔ VPC-C peering
OR:  Use Transit Gateway (supports transitive routing)

3. No Edge-to-Edge Routing:

VPC-A has a VPN to on-premises
VPC-B is peered with VPC-A
→ VPC-B CANNOT reach on-premises via VPC-A's VPN ❌

Fix: Use Transit Gateway


9. VPC Peering vs Transit Gateway

Factor VPC Peering Transit Gateway
Transitive routing ❌ No ✅ Yes
N VPCs (connections needed) N×(N-1)/2 mesh N attachments (hub-spoke)
10 VPCs 45 peering connections 10 TGW attachments
Cross-Region ✅ Yes ✅ Yes (TGW peering)
On-premises (VPN/DX) ❌ No ✅ Yes (unified hub)
Cost Free within Region Per attachment + data
Bandwidth No limit Up to 50 Gbps per VPC attachment
Use when ≤ 5 VPC connections 5+ VPCs or hybrid networking
10 VPCs with peering = 45 connections to manage ❌
10 VPCs with TGW    = 10 attachments to one TGW ✅

10. CIDR Planning — Architect-Level Responsibility ⭐

The most common connectivity failure is CIDR overlap. Design with future growth:

Rule: Plan all CIDR blocks BEFORE connecting anything.
      Overlapping CIDRs = no peering, no VPN routing, routing ambiguity.

Example Plan:
  VPC Production:   10.0.0.0/16
  VPC Staging:      10.1.0.0/16
  VPC Dev:          10.2.0.0/16
  On-Premises:      192.168.0.0/16

All unique → any connectivity option works ✅

11. Full Connectivity Architecture (Enterprise)

On-Premises Data Center
  ├── Direct Connect (primary)  ─────┐
  └── Site-to-Site VPN (backup) ─────┤
                              Transit Gateway
                                ┌────┴────┐
                            VPC-Prod   VPC-Dev
                            (10.0.0.0)  (10.2.0.0)
                                └────┬────┘
                                  VPC-Shared
                              (NAT GW, VPC Endpoints,
                               DNS resolver)

12. Common Mistakes

❌ Wrong ✅ Correct
IGW used for VPN connectivity VGW is used for VPN — IGW is for internet access only
Direct Connect is encrypted by default DX is NOT encrypted — add VPN over DX for encryption
VPC Peering supports transitive routing No transitive routing — use Transit Gateway
One VPN tunnel per connection AWS VPN provides two tunnels per connection (redundancy)
Direct Connect setup takes hours Takes weeks to months (physical provisioning)
Peering works without route table updates Route tables on both sides must be manually updated
Default VPCs can always be peered Default VPCs have 172.31.0.0/16 — same CIDR across regions → cannot peer
VPN CloudHub = internet VPN CloudHub routes via AWS backbone after entering via VPN tunnel
DX is always faster than VPN DX has consistent latency; VPN may be adequate for non-latency-sensitive workloads
Transit VIF needs DX Gateway Transit VIF connects directly to TGW — no DX Gateway needed

13. Interview Questions Checklist

  • What are the two components of a Site-to-Site VPN? (VGW + CGW)
  • How many IPsec tunnels per VPN connection? Why two?
  • Static routing vs BGP routing in VPN — when to use each?
  • What is VPN CloudHub? How does it work?
  • Client VPN vs Site-to-Site VPN — difference?
  • What is Direct Connect? Who lays the fiber?
  • Is Direct Connect encrypted? How do you add encryption?
  • What are the three VIF types? What does each connect to?
  • Transit VIF vs Private VIF — key difference?
  • What is a Direct Connect Gateway? When is it needed?
  • What is the best practice for DX resilience? (DX primary + VPN backup)
  • VPN vs Direct Connect — 5 comparisons
  • What are the two hard limits of VPC Peering? (no overlap, no transitive)
  • Walk through the 4 steps to set up VPC Peering
  • Why doesn't peering work between two default VPCs in different regions?
  • What is edge-to-edge routing? Why is it blocked in peering?
  • VPC Peering vs Transit Gateway — when to use which?
  • 10 VPCs need full mesh connectivity — how many peering connections? (45)
  • What is the CIDR planning responsibility of an architect?