AWS WAF (Web Application Firewall)
1. What is AWS WAF?
AWS WAF is a Layer 7 (application layer) firewall that inspects HTTP/HTTPS requests and allows, blocks, or counts them based on rules you define. It protects web applications from common exploits, bots, and malicious traffic before requests reach your application servers.
Without WAF:
Attacker sends SQL injection → hits your application → database compromised
With WAF:
Attacker sends SQL injection → WAF inspects HTTP body → matches rule → BLOCK ❌
Legitimate user → WAF inspects → no match → ALLOW ✅ → reaches application
What WAF Operates At
| Layer | Protocol | What WAF Inspects |
|---|---|---|
| Layer 3/4 | TCP/IP | ❌ Not WAF's job (that's Network ACL / Security Groups) |
| Layer 7 | HTTP/HTTPS | ✅ URL, headers, body, query strings, cookies, IP |
2. Where WAF Integrates (Protected Resources)
WAF is attached to a protected resource — traffic to that resource is inspected by WAF rules before being forwarded:
| Resource | WAF Scope | Use Case |
|---|---|---|
| CloudFront | Global (all edge locations) | Protect CDN, inspect at edge |
| Application Load Balancer (ALB) | Regional | Protect web apps behind ALB |
| API Gateway (REST APIs) | Regional | Protect REST APIs (HTTP APIs cannot be associated with a Web ACL directly) |
| AppSync | Regional | Protect GraphQL APIs |
| Cognito User Pool | Regional | Protect auth endpoints |
| App Runner | Regional | Protect containerized apps |
| Verified Access | Regional | Zero trust application access |
Architecture options:
Option A: WAF on CloudFront (global)
User → CloudFront (WAF here) → ALB → Application
✅ Inspects at the edge, closest to the attacker, so blocked requests never reach your region
❌ Bypassed if an attacker reaches the ALB directly, unless the ALB only accepts traffic from CloudFront
Option B: WAF on ALB (regional)
User → ALB (WAF here) → Application
✅ Inspects every request that reaches the ALB
❌ Attack traffic enters your region before it is blocked (no edge filtering)
Option C: WAF on both CloudFront AND ALB
Most secure: blocks at the edge, with a backstop if the ALB is reached directly [reddit](https://www.reddit.com/r/aws/comments/1kc0fnb/rate_limit_rules_in_waf_with_cloudfront/)
3. Core Components ⭐
Web ACL (Access Control List)
The top-level container — you create one Web ACL and attach it to a resource. A Web ACL contains ordered rules and rule groups.
Web ACL properties:
Scope: CLOUDFRONT (global) or REGIONAL (ALB, API GW, etc.)
Default action: ALLOW or BLOCK (what happens if no rule matches)
Rules: ordered list of rules and rule groups (priority 0–99,999,999)
Capacity: up to 5,000 WCUs (Web ACL Capacity Units); the base price covers the first 1,500
Scope matters: A Web ACL created for CloudFront (global) cannot be attached to an ALB (regional) and vice versa. You create separate Web ACLs per scope.
Rule
A single inspection unit — evaluates a condition and takes an action:
Rule:
Name: BlockSQLInjection
Priority: 10 ← lower number = evaluated first
Statement: SQL injection match on request body
Action: Block
Rule:
Name: AllowOfficeIP
Priority: 5 ← evaluated before BlockSQLInjection (lower priority number)
Statement: IP set match (203.0.113.0/24)
Action: Allow
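The two rules above could be written in the WAFv2 rule JSON shape roughly as follows. This is a sketch, not a definitive configuration: the IP set ARN is a made-up placeholder, the metric names are invented for illustration, and the `URL_DECODE` text transformation is an assumption about what the application needs.

```python
import json

# Hypothetical placeholder: substitute the ARN of a real IP set.
OFFICE_IP_SET_ARN = (
    "arn:aws:wafv2:us-east-1:111122223333:regional/ipset/office-allowlist/EXAMPLE-ID"
)


def visibility(metric_name: str) -> dict:
    """Build the VisibilityConfig block that every rule carries."""
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


allow_office_ip = {
    "Name": "AllowOfficeIP",
    "Priority": 5,
    "Statement": {"IPSetReferenceStatement": {"ARN": OFFICE_IP_SET_ARN}},
    "Action": {"Allow": {}},
    "VisibilityConfig": visibility("AllowOfficeIP"),
}

block_sqli = {
    "Name": "BlockSQLInjection",
    "Priority": 10,
    "Statement": {
        "SqliMatchStatement": {
            "FieldToMatch": {"Body": {}},
            # Assumed transformation; pick the ones that fit your traffic.
            "TextTransformations": [{"Priority": 0, "Type": "URL_DECODE"}],
        }
    },
    "Action": {"Block": {}},
    "VisibilityConfig": visibility("BlockSQLInjection"),
}

# Lower priority number first, which is the order WAF evaluates them in.
rules = sorted([block_sqli, allow_office_ip], key=lambda r: r["Priority"])
print(json.dumps(rules, indent=2))
```

Because `AllowOfficeIP` has the lower priority number and a terminating action, requests from the office range never reach the SQL injection rule.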
Rule Group
A reusable collection of rules, packaged together with a fixed capacity cost. Three types of rule groups exist (covered in Section 5).
WCU (Web ACL Capacity Unit)
Every rule consumes WCUs based on complexity:
Simple IP match: 1 WCU
Regex pattern set match: 25 WCUs per pattern set (a set holds up to 10 patterns)
SQL injection detection: 20 WCUs
Bot Control (common): 25 WCUs
Bot Control (targeted): 50 WCUs
Web ACL limit: 5,000 WCUs total
(The base price covers 1,500 WCUs; usage above that is billed per additional 500 WCUs)
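A quick way to sanity-check a planned Web ACL is to add up the WCU cost of its rules. The sketch below copies a few per-rule figures from the list above and treats them as illustrative; the rule-kind names and the helper functions are invented for this example.

```python
import math

# WCU costs copied from the list above; treat them as illustrative figures.
WCU_COST = {
    "ip_match": 1,
    "sqli_match": 20,
    "bot_control_common": 25,
    "bot_control_targeted": 50,
}

INCLUDED_WCUS = 1500    # capacity covered by the base price
EXTRA_BLOCK_SIZE = 500  # billing increment above the included capacity


def total_wcus(rule_kinds: list[str]) -> int:
    """Sum the WCU cost of every rule in the plan."""
    return sum(WCU_COST[kind] for kind in rule_kinds)


def extra_wcu_blocks(total: int) -> int:
    """Number of additional 500-WCU blocks needed above the included capacity."""
    if total <= INCLUDED_WCUS:
        return 0
    return math.ceil((total - INCLUDED_WCUS) / EXTRA_BLOCK_SIZE)


plan = ["ip_match", "ip_match", "sqli_match", "bot_control_targeted"]
used = total_wcus(plan)
print(f"{used} WCUs used, {extra_wcu_blocks(used)} extra block(s) billed")
```

A small plan like this one stays far below the included capacity; the capacity budget mostly matters once several managed rule groups are stacked in one Web ACL.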
4. Rule Actions
| Action | Effect | Continues Evaluation? |
|---|---|---|
| Allow | Forward request to protected resource | No — done |
| Block | Return HTTP 403 (or custom response) | No — done |
| Count | Count the request, add labels, continue | Yes — evaluation continues |
| CAPTCHA | Return CAPTCHA challenge to client | No until solved; with a valid token, evaluation continues |
| Challenge | Return silent browser challenge (JS) | No until passed; with a valid token, evaluation continues |
Count action use case:
New rule you're testing → set to Count first
Monitor in CloudWatch: how many requests would have been blocked?
If no false positives → switch to Block
CAPTCHA vs Challenge:
Challenge: silent, JavaScript-based browser verification (invisible to user)
CAPTCHA: visible image puzzle the user must solve
Use Challenge first → if bots pass it → escalate to CAPTCHA
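The difference between terminating actions (Allow, Block) and Count is easiest to see in a toy model of the evaluation loop. This is a simplified simulation, not AWS's implementation: CAPTCHA and Challenge are left out, the `Rule` class and `evaluate` function are invented here, and the substring checks are stand-ins for real match statements.

```python
from dataclasses import dataclass
from typing import Callable

TERMINATING = {"ALLOW", "BLOCK"}


@dataclass
class Rule:
    name: str
    priority: int
    matches: Callable[[dict], bool]
    action: str  # "ALLOW", "BLOCK", or "COUNT"


def evaluate(rules: list[Rule], request: dict, default_action: str = "ALLOW"):
    """Return (final_action, terminating_rule_name, counted_rule_names)."""
    counted = []
    # Lower priority number is evaluated first.
    for rule in sorted(rules, key=lambda r: r.priority):
        if not rule.matches(request):
            continue
        if rule.action in TERMINATING:
            return rule.action, rule.name, counted
        counted.append(rule.name)  # COUNT: record the match and keep going
    # No terminating rule matched, so the Web ACL default action applies.
    return default_action, None, counted


rules = [
    Rule("BlockSQLInjection", 10, lambda r: "' OR 1=1" in r["body"], "BLOCK"),
    Rule("AllowOfficeIP", 5, lambda r: r["ip"].startswith("203.0.113."), "ALLOW"),
    Rule("NewRuleUnderTest", 1, lambda r: "admin" in r["uri"], "COUNT"),
]

attack = {"ip": "198.51.100.7", "uri": "/admin", "body": "' OR 1=1"}
print(evaluate(rules, attack))
```

The rule under test matches first but only records a count, so the request still reaches `BlockSQLInjection`. That is what makes Count safe for trialling a new rule in front of production traffic.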
5. Rule Types ⭐
Type 1: AWS Managed Rules
Pre-built rule groups maintained by the AWS Threat Intelligence team. Updated automatically when new threats emerge — zero maintenance from you.
| Managed Rule Group | Protects Against |
|---|---|
| `AWSManagedRulesCommonRuleSet` | OWASP Top 10 (XSS, SQLi, path traversal, etc.) |
| `AWSManagedRulesAdminProtectionRuleSet` | Admin page access (/admin, /wp-admin) |
| `AWSManagedRulesKnownBadInputsRuleSet` | Exploits (Log4JRCE, Spring4Shell, etc.) |
| `AWSManagedRulesSQLiRuleSet` | SQL injection attacks |
| `AWSManagedRulesLinuxRuleSet` | Linux-specific exploits |
| `AWSManagedRulesWindowsRuleSet` | Windows-specific exploits |
| `AWSManagedRulesWordPressRuleSet` | WordPress vulnerabilities |
| `AWSManagedRulesAmazonIpReputationList` | Known malicious IPs (AWS threat intel) |
| `AWSManagedRulesAnonymousIpList` | Tor exits, VPN, proxies, hosting providers |
| `AWSManagedRulesBotControlRuleSet` | Bots (see Section 6) |
| `AWSManagedRulesFraudControlAccountTakeoverPrevention` | Credential stuffing |
Add managed rule group to Web ACL:
→ select rule group → set priority → choose override actions if needed
→ AWS maintains all rules inside — you do not write individual rules
Override action (per rule inside the group):
Use group defaults → respect all built-in Allow/Block/Count actions
Count all → override all actions to Count (safe testing mode)
Override specific → override individual rules from Block → Count or vice versa
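The override options above map onto two fields of a managed rule group entry: `OverrideAction` for the whole group and `RuleActionOverrides` for individual rules. A minimal sketch follows; the `managed_rule_group` helper is hypothetical, and `SizeRestrictions_BODY` is used only as an example of a rule inside the common rule set.

```python
from typing import Optional


def managed_rule_group(
    name: str,
    priority: int,
    count_all: bool = False,
    rule_overrides: Optional[dict] = None,
) -> dict:
    """Build a Web ACL rule entry that references an AWS managed rule group."""
    statement = {"VendorName": "AWS", "Name": name}
    if rule_overrides:
        # Per-rule overrides, e.g. {"SomeRuleName": "Count"}.
        statement["RuleActionOverrides"] = [
            {"Name": rule_name, "ActionToUse": {action: {}}}
            for rule_name, action in rule_overrides.items()
        ]
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {"ManagedRuleGroupStatement": statement},
        # {"None": {}} keeps the group's own actions; {"Count": {}} counts everything.
        "OverrideAction": {"Count": {}} if count_all else {"None": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


# Safe testing mode: every rule in the group only counts.
testing = managed_rule_group("AWSManagedRulesCommonRuleSet", 20, count_all=True)

# Production mode with one noisy rule relaxed to Count.
tuned = managed_rule_group(
    "AWSManagedRulesCommonRuleSet",
    20,
    rule_overrides={"SizeRestrictions_BODY": "Count"},
)
```

Note that a rule group entry uses `OverrideAction` where a single rule would use `Action`.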
Type 2: Customer Managed Rules (Your Own)
Rules you write yourself for your specific application logic:
Statement types you can use:
IP Set Match:
Block specific IP ranges
"Statement": { "IPSetReferenceStatement": { "ARN": "arn:aws:wafv2:...:ipset/block-list" } }
Geo Match:
Block all traffic from specific countries
"Statement": { "GeoMatchStatement": { "CountryCodes": ["RU", "CN", "KP"] } }
String Match:
Block requests containing specific string in URI, header, body, query string
"Statement": {
"ByteMatchStatement": {
"SearchString": "../../../etc/passwd",
"FieldToMatch": { "UriPath": {} },
"TextTransformations": [{ "Priority": 0, "Type": "URL_DECODE" }],
"PositionalConstraint": "CONTAINS"
}
}
Regex Match:
Block URIs matching regex pattern
FieldToMatch: URI, headers, body, query string, cookies, HTTP method
Rate-Based Rule:
Block IPs exceeding N requests within the evaluation window (5 minutes by default; see Section 7)
SQL Injection Match:
Built-in SQLi detection on specific field
XSS Match:
Built-in cross-site scripting detection on specific field
Logical Rules (AND / OR / NOT):
Combine multiple statements:
"AND": [geo match CN, rate > 1000] → Block
"NOT": [IP in allowlist] AND [SQLi match] → Block
Type 3: Marketplace Managed Rules
Third-party security vendors (Imperva, F5, Fortinet, TrendMicro) sell rule groups in AWS Marketplace. Subscribed via marketplace, then added to your Web ACL like AWS managed rules.
6. Bot Control ⭐
AWS WAF Bot Control is a managed rule group specifically for bot traffic:
Two Protection Levels
Common (Basic Bot Detection)
WCU cost: 25 WCUs
Detects: well-known bots by signature (user agent, IP, behavior patterns)
→ Googlebot, Bingbot (verified → Allow)
→ Scrapers, crawlers (unverified → Block)
Pricing: additional fee per million requests inspected
Targeted (Advanced Bot Detection + ML)
WCU cost: 50 WCUs
Adds: machine learning analysis of traffic patterns
browser fingerprinting (detects automated browsers)
coordinated activity detection (distributed bot farms)
Rules with TGT_ prefix = targeted rules
Rules with TGT_ML_ prefix = ML-powered rules (take up to 24h to baseline) [docs.aws.amazon](https://docs.aws.amazon.com/waf/latest/developerguide/waf-bot-control-rg-using.html)
Pricing: higher additional fee per million requests
What Bot Control Labels
WAF adds labels to requests for custom rule chaining:
awswaf:managed:aws:bot-control:bot:category:ai
awswaf:managed:aws:bot-control:bot:verified
awswaf:managed:aws:bot-control:signal:automated_browser
awswaf:managed:aws:bot-control:signal:known_bot_data_center
awswaf:managed:aws:bot-control:signal:non_browser_user_agent
Use labels in downstream rules:
"If label = signal:automated_browser → CAPTCHA"
"If label = bot:verified → Allow (Googlebot)"
"If label = signal:known_bot_data_center → Block"
Bot Categories WAF Detects
- Search engine crawlers (Googlebot, Bingbot) → verified → Allow by default
- Monitoring tools (Pingdom, UptimeRobot)
- SEO crawlers and scrapers
- AI training crawlers (CategoryAI, which blocks AI data scrapers)
- Automated browsers (Puppeteer, Selenium, Playwright)
- Credential stuffing bots
- DDoS bots and scanners
7. Rate-Based Rules ⭐
Rate limiting blocks clients that exceed a request threshold within an evaluation window (5 minutes by default):
Rate-based rule:
Threshold: 1,000 requests
Window: 5 minutes by default (configurable: 1, 2, 5, or 10 minutes)
Aggregate key: IP address (default)
Action: Block (or Count, CAPTCHA, Challenge)
Behavior:
AWS WAF counts requests per IP over the trailing evaluation window
When an IP exceeds 1,000 requests within the window → Action triggered
Block lifted when rate drops below threshold
[docs.aws.amazon](https://docs.aws.amazon.com/waf/latest/developerguide/waf-rule-statement-type-rate-based-request-limiting.html)
Aggregate Keys (What to Rate-Limit By)
| Key | Use Case |
|---|---|
| IP address (default) | Block single IPs hammering your API |
| Forwarded IP | When using CloudFront/proxy — rate-limit real client IP from X-Forwarded-For header |
| HTTP header value | Rate-limit by API key, session token, User-Agent |
| Query string component | Rate-limit by specific query parameter |
| HTTP method | Rate-limit POST separately from GET |
| Custom key combination | Combine IP + header + URI path |
Scope-down statement (optional): [docs.aws.amazon](https://docs.aws.amazon.com/waf/latest/developerguide/waf-rule-statement-type-rate-based.html)
Only count/rate-limit requests that match a condition
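Example: rate-limit only requests to /login (not all endpoints)
"ScopeDownStatement": {
"ByteMatchStatement": {
"SearchString": "/login",
"FieldToMatch": { "UriPath": {} },
"TextTransformations": [{ "Priority": 0, "Type": "NONE" }],
"PositionalConstraint": "EXACTLY"
}
}
→ Only requests to /login count toward the rate limit (add an HTTP method match inside an AND statement to restrict it to POST)
→ Other endpoints unaffected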
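The counting behavior described in this section can be modeled with a trailing-window counter per aggregate key. This is a toy model for intuition only: the real service estimates rates and applies the action with some delay, and the `RateTracker` class and `in_scope` function are invented for this sketch.

```python
from collections import defaultdict, deque


class RateTracker:
    """Count requests per aggregate key over a trailing time window."""

    def __init__(self, limit: int, window_sec: int = 300):
        self.limit = limit
        self.window_sec = window_sec
        self._hits = defaultdict(deque)  # aggregate key -> request timestamps

    def record(self, key: str, now: float) -> bool:
        """Record one request; return True if the key is now over the limit."""
        hits = self._hits[key]
        hits.append(now)
        # Drop timestamps that have fallen out of the trailing window.
        while hits and hits[0] <= now - self.window_sec:
            hits.popleft()
        return len(hits) > self.limit


def in_scope(request: dict) -> bool:
    """Scope-down condition: only requests to /login are counted."""
    return request["uri"] == "/login"


tracker = RateTracker(limit=3)
decisions = []
for second in range(5):
    request = {"ip": "198.51.100.7", "uri": "/login"}
    over = in_scope(request) and tracker.record(request["ip"], now=float(second))
    decisions.append("BLOCK" if over else "ALLOW")
print(decisions)
```

Requests that fail the scope-down check are never recorded, so traffic to other endpoints cannot push a client over the limit.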
8. IP Sets and Geo Matching
IP Sets
Reusable list of IP addresses/CIDR ranges
Create once → reference in multiple rules
IP set types:
Allowlist: known safe IPs (office, partners) → Always Allow
Blocklist: known bad IPs, threat intel feeds → Always Block
Create:
aws wafv2 create-ip-set \
--scope REGIONAL \
--name office-allowlist \
--ip-address-version IPV4 \
--addresses "203.0.113.0/24" "198.51.100.5/32"
Update without rule change:
Add/remove IPs from IP set → rule automatically uses updated list
Geo Matching
Block or allow based on country of origin (determined by IP geolocation):
Use cases:
GDPR: force EU users to EU endpoint only
Compliance: block service in specific jurisdictions
Cost reduction: block known bot-heavy regions
"Statement": {
"GeoMatchStatement": {
"CountryCodes": ["RU", "CN", "KP", "IR"]
}
}
Limitation: VPN/Tor users bypass geo match (their exit IP = different country)
→ Combine with AWSManagedRulesAnonymousIpList to also block Tor/VPN
9. Labels and Rule Chaining
Labels enable multi-stage inspection — rules can add labels that later rules match:
Stage 1: Bot Control rule group runs
→ Labels request: "awswaf:managed:aws:bot-control:signal:automated_browser"
Stage 2: Your custom rule checks for that label
If: label matches "automated_browser"
Then: CAPTCHA
Stage 3: If CAPTCHA is solved → allow through
If CAPTCHA failed → Block
This allows: coarse detection → fine-grained custom response
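Stage 2 above corresponds to a rule with a label match statement. A minimal sketch, assuming the Bot Control rule group sits at a lower priority number so its labels exist by the time this rule runs; the `label_rule` helper, the rule name, and the priority value are hypothetical.

```python
def label_rule(name: str, priority: int, label_key: str, action: str) -> dict:
    """Build a rule that acts on a label added by an earlier rule or rule group."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {"LabelMatchStatement": {"Scope": "LABEL", "Key": label_key}},
        "Action": {action: {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


# Must run after the Bot Control rule group (assumed here to be at priority 20).
captcha_automated_browsers = label_rule(
    name="CaptchaAutomatedBrowsers",
    priority=30,
    label_key="awswaf:managed:aws:bot-control:signal:automated_browser",
    action="Captcha",
)
```

For this to take effect, the matching rule inside the Bot Control group has to let the request continue (for example, by being overridden to Count); otherwise the group terminates evaluation before the label rule is reached.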
10. Logging and Monitoring ⭐
WAF Logging
Enable logging → WAF sends full request logs to:
Amazon S3 → long-term storage, Athena queries
CloudWatch Logs → real-time monitoring, alarms
Amazon Data Firehose (formerly Kinesis Data Firehose) → streaming to S3/Splunk/Datadog
Log contents:
Timestamp, action (ALLOW/BLOCK/COUNT), rule that matched,
Client IP, URI, headers (configurable), country, HTTP method,
Labels applied, terminating rule
Log filtering:
Log only BLOCK actions (reduces cost)
Log only specific rule matches
Redact sensitive headers before logging
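The filtering and redaction options above can be combined in one logging configuration. The sketch below keeps only blocked requests and redacts the `authorization` header. Both ARNs are made-up placeholders. With boto3, a dictionary of this shape would typically be passed as the `LoggingConfiguration` argument of the `wafv2` client's `put_logging_configuration` call, which is not executed here.

```python
import json

# Hypothetical placeholders: substitute real ARNs.
WEB_ACL_ARN = (
    "arn:aws:wafv2:us-east-1:111122223333:regional/webacl/example-acl/EXAMPLE-ID"
)
# WAF log destinations are expected to have names starting with "aws-waf-logs-".
LOG_DESTINATION_ARN = "arn:aws:s3:::aws-waf-logs-example-bucket"

logging_configuration = {
    "ResourceArn": WEB_ACL_ARN,
    "LogDestinationConfigs": [LOG_DESTINATION_ARN],
    # Replace the value of this header with a redaction marker in every log record.
    "RedactedFields": [{"SingleHeader": {"Name": "authorization"}}],
    "LoggingFilter": {
        "DefaultBehavior": "DROP",  # drop anything no filter keeps
        "Filters": [
            {
                "Behavior": "KEEP",
                "Requirement": "MEETS_ANY",
                "Conditions": [{"ActionCondition": {"Action": "BLOCK"}}],
            }
        ],
    },
}

print(json.dumps(logging_configuration, indent=2))
```

Dropping everything except blocked requests cuts log volume sharply, at the cost of losing allowed traffic as a baseline for later investigations.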
CloudWatch Metrics (per rule)
AllowedRequests → count of allowed requests
BlockedRequests → count of blocked requests per rule
CountedRequests → count of counted (passthrough) requests
CaptchaRequests → count of requests that had the CAPTCHA action applied
PassedRequests → requests evaluated by a rule group that matched none of its rules
Create alarms:
BlockedRequests > 1000 in 5 min → SNS → PagerDuty
(sudden spike in blocks = attack in progress)
WAF Sampled Requests
WAF stores samples of last 3 hours of requests (max 100 per rule):
Console: WAF → Web ACL → Sampled requests tab
Shows: actual request that matched (headers, URI, body snippet)
Use: debug why a rule is blocking legitimate traffic
11. AWS WAF vs AWS Shield ⭐
These are complementary, not competing:
| | AWS WAF | AWS Shield Standard | AWS Shield Advanced |
|---|---|---|---|
| What it stops | Layer 7 exploits, bots, custom threats | Layer 3/4 volumetric DDoS | L3/4 + L7 DDoS with response team |
| How it works | Inspect HTTP content, apply rules | Absorb large SYN floods, UDP floods | WAF + 24/7 Shield Response Team (SRT, formerly DRT) |
| Cost | Pay per Web ACL + rules + requests | Free (all AWS customers) | $3,000/month + data transfer out fee |
| Rules | Your rules + managed rules | Automatic, no configuration | Automatic + SRT tuning |
| Layer | 7 | 3/4 | 3/4 and 7 |
Combined architecture for full protection:
CloudFront (with WAF + Shield Advanced)
→ WAF handles: SQLi, XSS, bots, rate limiting, geo blocking
→ Shield handles: volumetric DDoS (millions of pps UDP/TCP floods)
→ Together: comprehensive application protection
12. WAF Pricing
Regional Web ACL: $5.00/month per Web ACL
CloudFront Web ACL: $5.00/month per Web ACL
Per rule: $1.00/month per rule
Per rule group: $1.00/month per rule group
Requests: $0.60 per 1 million requests
Managed rule groups (AWS Managed):
Basic groups (CommonRuleSet, SQLi, etc.): included
Bot Control Common: additional charge per million requests
Bot Control Targeted: higher per-million charge
Rule groups from marketplace: vendor-specific pricing
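A rough monthly estimate follows directly from the base charges listed above. The sketch below uses those list prices as defaults and applies a single per-million request rate. It deliberately leaves out Bot Control, Fraud Control, and Marketplace fees, and the function name and example workload are invented for illustration.

```python
def monthly_waf_cost(
    web_acls: int,
    rules: int,
    rule_groups: int,
    requests_millions: float,
    web_acl_price: float = 5.00,
    rule_price: float = 1.00,
    rule_group_price: float = 1.00,
    price_per_million: float = 0.60,
) -> float:
    """Estimate base monthly WAF charges in USD (add-on fees excluded)."""
    fixed = (
        web_acls * web_acl_price
        + rules * rule_price
        + rule_groups * rule_group_price
    )
    usage = requests_millions * price_per_million
    return round(fixed + usage, 2)


# Example workload (assumed): 1 Web ACL, 6 rules, 2 rule groups, 20M requests.
estimate = monthly_waf_cost(web_acls=1, rules=6, rule_groups=2, requests_millions=20)
print(f"Estimated base cost: ${estimate:.2f}/month")
```

For this workload the fixed charges come to $13 and the request charge to $12, so request volume quickly becomes the dominant term as traffic grows.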
13. WAF Deployment Best Practices
1. Start all new rules in COUNT mode → monitor for 1-2 weeks → switch to BLOCK
(Prevents blocking legitimate traffic due to false positives)
2. Layer defense:
Priority 1: IP allowlist (trusted IPs — highest priority, always allow)
Priority 2: IP blocklist (known bad IPs — block immediately)
Priority 3: AWS managed rules (OWASP, SQLi, etc.)
Priority 4: Rate-based rules (DDoS/brute force)
Priority 5: Custom application rules
Default: ALLOW (or BLOCK for strict mode)
3. Use scope-down statements on Bot Control and advanced rules:
→ Only apply expensive rules to sensitive endpoints (/login, /checkout)
→ Reduces WCU consumption and request inspection costs [aws.github](https://aws.github.io/aws-security-services-best-practices/guides/waf/configuring-waf-rules/docs/)
4. Attach WAF to CloudFront (not just ALB):
→ Blocks at edge before traffic reaches your region
→ Reduces origin load from attack traffic [reddit](https://www.reddit.com/r/aws/comments/1kc0fnb/rate_limit_rules_in_waf_with_cloudfront/)
5. Enable WAF logging to S3 + set up CloudWatch alarm on BlockedRequests spike
→ Operational visibility + incident response
6. Regularly review sampled requests for false positives
→ Tune rules before attacks happen, not during
14. Common Mistakes
| ❌ Wrong | ✅ Correct |
|---|---|
| WAF protects against DDoS volumetric attacks | WAF handles Layer 7 only — use Shield for volumetric DDoS |
| WAF on CloudFront alone fully protects the origin | Attackers who find the ALB's address can bypass CloudFront and its WAF; restrict the ALB to CloudFront traffic or attach a regional Web ACL to the ALB too |
| Regional Web ACL works with CloudFront | CloudFront requires CLOUDFRONT scope Web ACL (created in us-east-1) |
| New rules should immediately be set to Block | Start in Count mode — monitor for false positives before blocking |
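| Rate-limit window is always 5 minutes | The evaluation window defaults to 5 minutes but can be set to 1, 2, 5, or 10 minutes |
| Shield Advanced replaces WAF | Shield Advanced and WAF are separate services that work together; a Shield Advanced subscription covers standard AWS WAF fees for the resources it protects |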
| Bot Control detects all bots | Bot Control Targeted + ML takes up to 24h to establish baselines |
| One Web ACL works for CloudFront and ALB | Separate Web ACLs needed — scope is set at creation and cannot be changed |
| Managed rules are free | Basic managed rules are included; Bot Control and Fraud Control have per-request charges |
| WAF inspects encrypted HTTPS content | WAF sits after TLS termination at the load balancer/CloudFront — it sees decrypted HTTP |
15. Interview Questions Checklist
- What layer does WAF operate at? What does it inspect?
- List all resources WAF can be attached to
- What is a Web ACL? What is WCU?
- Five rule actions — what does Count do that others don't?
- When would you use CAPTCHA vs Challenge action?
- Three types of rule groups — what's the difference?
- Name five AWS managed rule groups and what they protect against
- How does Bot Control Common differ from Bot Control Targeted?
- What are WAF labels? How are they used for rule chaining?
- Rate-based rule — window duration, aggregate key options
- What is a scope-down statement? Why use it on Bot Control?
- WAF vs Shield — which stops what at which layer?
- Why attach WAF to CloudFront instead of (or in addition to) ALB?
- Why start new rules in Count mode?
- How do you debug a WAF rule blocking legitimate traffic? (Sampled Requests)
- WAF pricing model — what are the four charges?
- Geo match limitation — how do you handle VPN/Tor bypass?