AWS Lambda¶
1. What is Lambda?¶
AWS Lambda is a serverless, event-driven compute service — you upload code and AWS runs it. No servers to provision, no OS to manage, no capacity to plan. You pay only for the milliseconds your code actually executes.
Traditional server model:
Provision EC2 → install runtime → deploy app → manage 24/7
Pay: every hour the server exists (even idle hours)
Lambda model:
Upload code → Lambda runs it on demand
Pay: only while code executes (per 1ms)
Idle time: $0
Lambda is not for long-running processes. It is designed for short-lived, stateless functions that respond to events.
2. Lambda Execution Model¶
Handler Function¶
Every Lambda function has a handler — the entry point AWS invokes:
# Python
def handler(event, context):
    print(event['name'])
    return {
        'statusCode': 200,
        'body': 'Hello from Lambda'
    }
// Node.js
exports.handler = async (event, context) => {
    return {
        statusCode: 200,
        body: JSON.stringify({ message: 'Hello from Lambda' })
    };
};
| Parameter | Contains |
|---|---|
| event | Input data from the trigger (HTTP request, S3 event, SQS message, etc.) |
| context | Runtime info: function name, remaining time, request ID, memory limit |
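The `context` fields in the table above can be exercised locally with a stub. A sketch — `FakeContext` is a hypothetical test double, not part of the AWS SDK:

```python
import json

def handler(event, context):
    # Use context to budget work: check remaining time before long operations
    remaining_ms = context.get_remaining_time_in_millis()
    return {
        'statusCode': 200,
        'body': json.dumps({
            'function': context.function_name,
            'request_id': context.aws_request_id,
            'remaining_ms': remaining_ms,
            'greeting': f"Hello, {event.get('name', 'world')}"
        })
    }

# Local test stub mimicking the runtime's context object
class FakeContext:
    function_name = 'my-function'
    aws_request_id = 'test-request-id'
    def get_remaining_time_in_millis(self):
        return 30000

result = handler({'name': 'Lambda'}, FakeContext())
```

Stubbing the context like this lets you unit-test handlers without deploying.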
Execution Environment Lifecycle¶
Phase 1: INIT (cold start only)
├── Download your code/container
├── Start language runtime (Node.js, Python, Java…)
├── Run initialization code OUTSIDE handler
│ (import libraries, create DB connections, load config)
└── Duration: 100ms to 1+ second
Phase 2: INVOKE
├── Run handler function with event
└── Duration: your code execution time
Phase 3: SHUTDOWN (if no new invocations for ~15 min)
└── Execution environment frozen/terminated
# Code placement matters — INIT vs INVOKE
import boto3 # ← INIT: runs once per cold start
db_client = boto3.client('dynamodb') # ← INIT: connection created once
def handler(event, context):
    # ← INVOKE: runs every request
    result = db_client.get_item(...)  # reuses existing connection ✅
    return result
3. Cold Start vs Warm Start ⭐¶
Cold Start (new execution environment):
Trigger arrives → no available environment →
AWS provisions environment → INIT phase → INVOKE
Latency added: ~100ms (Python/Node.js) to ~1s+ (Java/.NET)
Warm Start (reuse existing environment):
Trigger arrives → existing environment available →
INVOKE directly (skip INIT)
Latency added: ~0ms
What Causes Cold Starts?¶
| Trigger | Explanation |
|---|---|
| First invocation ever | No environments exist yet |
| Traffic spike | 10 concurrent requests → 10 environments needed simultaneously |
| After ~15 minutes idle | Environment was frozen/recycled |
| Code/config update deployed | New environments needed for new version |
| Region first invocation | No warm environments in that region |
Cold Start Mitigation Strategies¶
1. Provisioned Concurrency (Eliminates Cold Starts)
Pre-initializes N execution environments — always warm and ready.
Incoming requests route to pre-warmed environment → zero cold start.
Must be applied to a FUNCTION VERSION or ALIAS — NOT $LATEST [docs.aws.amazon](https://docs.aws.amazon.com/lambda/latest/dg/provisioned-concurrency.html)
CLI:
aws lambda put-provisioned-concurrency-config \
  --function-name my-api \
  --qualifier prod \
  --provisioned-concurrent-executions 10
# --qualifier takes an alias or version — not $LATEST
Cost: charged per GB-second while environments are provisioned
(even when not actively executing — this is always-on compute)
Auto-scaling Provisioned Concurrency:
Scale up at 8am, scale down at 8pm (based on schedule)
Scale based on utilization metric (target 70% utilization)
2. ARM64 Architecture (Graviton2)
lambda.Architecture.ARM_64
→ 13–24% faster cold starts vs x86_64
→ 20% cheaper per GB-second
→ Use for: most workloads (Python, Node.js, Java)
3. Minimize Package Size
Smaller deployment package = faster code download during INIT
Use: tree-shaking, exclude dev dependencies
Use: Lambda layers for shared large libraries
Use: container images for large dependencies (cached per environment)
4. Move Heavy Work to INIT Phase
# Do expensive setup once at init, not every request
s3 = boto3.client('s3') # ← init phase
config = load_config_from_ssm() # ← init phase (cached)
def handler(event, context):
    return s3.get_object(...)  # ← invoke phase (reuses client)
5. Keep Functions Warm (Scheduled EventBridge)
EventBridge rule: every 5 minutes → invoke Lambda with warmup event
→ Prevents idle timeout from recycling environments
→ Free tier covers most warmup invocations
Note: keeps ONE environment warm — not useful for concurrent scaling
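A warmup handler typically short-circuits before doing any real work. Sketch below assumes the EventBridge rule sends a custom `{"warmup": true}` payload — the field name is arbitrary, any marker your rule sets works:

```python
def handler(event, context):
    # EventBridge warmup ping: return immediately, keep the environment warm
    if event.get('warmup'):
        return {'warmed': True}
    # ... real work for genuine requests ...
    return {'statusCode': 200, 'body': 'processed'}
```

Returning early keeps warmup invocations cheap — they bill only a few milliseconds of duration.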
4. Lambda Limits ⭐¶
| Resource | Limit |
|---|---|
| Memory | 128 MB – 10,240 MB (1 MB increments) |
| Timeout | 1 second – 900 seconds (15 minutes) |
| Ephemeral storage (/tmp) | 512 MB – 10,240 MB |
| Deployment package (zip) | 50 MB compressed / 250 MB uncompressed |
| Container image size | 10 GB |
| Environment variables | 4 KB total |
| Layers per function | 5 layers |
| Concurrent executions (default) | 1,000 per region (soft limit — can request increase) |
| Burst concurrency | 500–3,000 (varies by region) |
CPU Scales With Memory¶
Lambda allocates CPU proportional to memory:
128 MB → ~0.07 vCPU
1,769 MB → 1 full vCPU ← threshold for one full CPU
3,538 MB → 2 vCPUs
10,240 MB → ~5.8 vCPUs
For CPU-bound workloads (image processing, ML inference):
Increase memory → you get more CPU → faster execution
May actually REDUCE cost: faster = less duration billed
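The tradeoff can be checked with Lambda's published rates. This sketch assumes a perfectly CPU-bound job whose duration halves when vCPU doubles — real workloads rarely scale that cleanly:

```python
PRICE_PER_GB_SECOND = 0.0000166667  # x86_64 on-demand rate

def invocation_cost(memory_mb: float, duration_s: float) -> float:
    """Duration cost of one invocation in USD (ignores the per-request fee)."""
    return (memory_mb / 1024) * duration_s * PRICE_PER_GB_SECOND

def approx_vcpus(memory_mb: float) -> float:
    """CPU is allocated proportionally; 1,769 MB maps to one full vCPU."""
    return memory_mb / 1769

# CPU-bound job: doubling memory doubles CPU, halving duration
small = invocation_cost(1769, 2.0)  # 1 vCPU, runs 2 s
big = invocation_cost(3538, 1.0)    # 2 vCPUs, runs 1 s
# Same cost per invocation, half the latency — and any speedup
# better than linear makes the bigger setting strictly cheaper.
```

AWS Lambda Power Tuning (an open-source Step Functions tool) automates exactly this memory/cost sweep against a real function.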
5. Invocation Types¶
Synchronous¶
Caller waits for result before continuing
Sources:
API Gateway, ALB, CloudFront, SDK direct call
Flow:
Caller → Lambda → executes → returns response → Caller receives result
Error handling: caller receives error immediately
Retry: caller's responsibility
Asynchronous¶
Caller sends event and immediately gets 202 Accepted
Lambda processes in background — caller doesn't wait
Sources:
S3 event notifications, SNS, EventBridge, CloudWatch Events
Flow:
Event → Lambda internal queue → Lambda executes
Caller already moved on
Error handling: Lambda retries automatically
Default: 2 retries (3 total attempts)
On final failure → Dead Letter Queue (SQS or SNS)
Configure:
Maximum age of event: 60s – 6 hours
Maximum retry attempts: 0, 1, or 2
On-failure destination: SQS, SNS, EventBridge, another Lambda
Event Source Mapping (Poll-Based)¶
Lambda polls a source for new records, processes in batches
Sources:
SQS (standard + FIFO)
Kinesis Data Streams
DynamoDB Streams
MSK (Managed Streaming for Kafka)
MQ (ActiveMQ, RabbitMQ)
Flow:
Lambda service long-polls SQS (wait time up to 20 seconds)
Receives batch (up to 10,000 messages for SQS)
Invokes Lambda with the batch
On success: messages deleted from SQS
On failure: messages return to queue (retry) or go to DLQ
Batch settings:
Batch size: up to 10,000 (SQS standard — above 10 requires a batch window; FIFO max 10), up to 10,000 (Kinesis)
Batch window: 0–300 seconds (wait to collect larger batch)
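With SQS batches, reporting partial failures avoids reprocessing the whole batch on one bad message. A sketch of the `batchItemFailures` response format (requires `ReportBatchItemFailures` enabled on the event source mapping); `process_message` is a hypothetical stand-in for real work:

```python
import json

def process_message(body: dict) -> None:
    # Hypothetical business logic — raises on bad input
    if body.get('corrupt'):
        raise ValueError('bad message')

def handler(event, context):
    # Collect only the failed message IDs; SQS re-drives just those
    failures = []
    for record in event['Records']:
        try:
            process_message(json.loads(record['body']))
        except Exception:
            failures.append({'itemIdentifier': record['messageId']})
    return {'batchItemFailures': failures}
```

Without this, one poison message fails the whole batch and every message returns to the queue for retry.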
6. Event Sources (Triggers) ⭐¶
| Category | Source | Invocation Type |
|---|---|---|
| HTTP | API Gateway (REST, HTTP) | Synchronous |
| HTTP | Application Load Balancer | Synchronous |
| Storage | S3 (object events) | Asynchronous |
| Streaming | Kinesis Data Streams | Event Source Mapping (poll) |
| Queue | SQS | Event Source Mapping (poll) |
| Database | DynamoDB Streams | Event Source Mapping (poll) |
| Messaging | SNS | Asynchronous |
| Events | EventBridge | Asynchronous |
| Schedule | EventBridge Scheduler | Asynchronous |
| Auth | Cognito User Pools | Synchronous |
| Edge | CloudFront (Lambda@Edge) | Synchronous |
7. Versions and Aliases ⭐¶
Versions¶
$LATEST → mutable, always the most recent code
v1, v2, v3 → immutable snapshots of code + configuration
Publish a version:
aws lambda publish-version --function-name my-function
→ Creates an immutable version (v1, v2…)
→ Version has its own ARN: arn:aws:lambda:...:function:my-function:3
Qualified ARN: arn:...:my-function:3 → specific version
Unqualified ARN: arn:...:my-function → $LATEST [docs.aws.amazon](https://docs.aws.amazon.com/lambda/latest/dg/configuration-versions.html)
Cannot edit code of a published version — it is frozen
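A qualified ARN simply appends one more colon-separated segment. An illustrative parser — not an AWS API, just to make the distinction concrete:

```python
def arn_qualifier(arn: str):
    """Return the version/alias of a Lambda function ARN, or None if unqualified."""
    # arn:aws:lambda:region:account:function:name[:qualifier]
    parts = arn.split(':')
    return parts[7] if len(parts) == 8 else None
```

Invoking an unqualified ARN always hits $LATEST; a qualifier pins the call to one immutable version (or an alias pointing at one).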
Aliases¶
A named pointer to a version — can be updated without changing clients
my-function:prod → points to v3
my-function:dev → points to $LATEST
my-function:beta → points to v4
Update prod alias:
aws lambda update-alias --function-name my-function \
--name prod --function-version 4
→ prod now points to v4 (no client-side changes needed)
Traffic shifting (canary deployments):
prod alias: 90% → v3, 10% → v4
→ Gradually shift traffic to test new version in production
→ Use with CodeDeploy for automated rollback on alarms
aws lambda update-alias --function-name my-function \
--name prod \
--routing-config '{"AdditionalVersionWeights":{"4":0.10}}'
Alias ARN: arn:aws:lambda:...:function:my-function:prod
→ Always resolves to the version the alias points to
→ Use alias ARNs in all triggers/event sources — enables zero-downtime deploys
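Conceptually, the alias's routing config is a weighted coin flip per invocation. A local simulation of the `AdditionalVersionWeights` behavior — not how the Lambda service is actually implemented:

```python
import random

def route(primary: str, additional_weights: dict, rng: random.Random) -> str:
    """Pick a version the way a weighted alias would: extra versions get
    their configured share, the primary version gets the remainder."""
    r = rng.random()
    cumulative = 0.0
    for version, weight in additional_weights.items():
        cumulative += weight
        if r < cumulative:
            return version
    return primary

rng = random.Random(42)  # seeded for reproducibility
picks = [route('3', {'4': 0.10}, rng) for _ in range(10_000)]
share_v4 = picks.count('4') / len(picks)  # ≈ 0.10
```

In production the Lambda service makes this choice per request (sticky per execution environment for streams), so canary traffic converges on the configured weight.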
8. Lambda Layers ⭐¶
Layers are ZIP archives containing shared code, libraries, or binaries that can be attached to multiple functions:
Without layers:
function-A.zip: code + numpy + pandas + scipy (200 MB)
function-B.zip: code + numpy + pandas + scipy (200 MB)
function-C.zip: code + numpy + pandas + scipy (200 MB)
→ 600 MB total, duplicate libraries
With layers:
numpy-pandas-layer.zip: numpy + pandas + scipy (180 MB shared layer)
function-A.zip: code only (5 MB)
function-B.zip: code only (3 MB)
function-C.zip: code only (4 MB)
→ 192 MB total, fast deploys, single update point
Layer directory structure (Python):
python/lib/python3.12/site-packages/
numpy/
pandas/
Layer directory structure (Node.js):
nodejs/node_modules/
lodash/
moment/
Limits:
Max 5 layers per function
Total unzipped (code + all layers): 250 MB
Layers versioned (immutable when published)
9. Lambda in a VPC¶
By default Lambda runs outside your VPC (in AWS-managed infrastructure). Attach Lambda to a VPC when your function needs to access private resources:
Use cases for VPC attachment:
→ Lambda → private RDS (no public endpoint)
→ Lambda → private ElastiCache
→ Lambda → private EC2 microservice
VPC configuration:
Assign to: private subnets (NOT public subnets)
Assign to: Security Group (controls outbound connections)
Architecture:
Lambda (in VPC private subnet)
→ security group allows port 3306 → RDS security group
→ connects to private RDS ✅
Internet access from VPC Lambda:
Lambda in private subnet → NAT Gateway → Internet ✅
Lambda in private subnet (no NAT) → No internet ❌
Lambda in public subnet → still no internet (Lambda ignores public subnet routing)
Placing Lambda in a VPC adds ~100ms cold start overhead (setting up ENI — Elastic Network Interface). This was worse before 2019; AWS significantly reduced it with hyperplane ENIs. Still relevant for latency-sensitive synchronous functions.
10. Concurrency ⭐¶
Concurrency = number of requests being handled simultaneously at one moment
Each in-flight request occupies one execution environment.
If 50 requests arrive simultaneously → 50 concurrent Lambda instances needed.
Account concurrency limit: 1,000 per region (default, soft limit)
→ If function needs 1,001 simultaneous executions → throttled (429 error)
Reserved Concurrency:
Guarantee N executions are always available for a specific function
Also caps the function at N → prevents one function consuming all account concurrency
aws lambda put-function-concurrency \
--function-name critical-api \
--reserved-concurrent-executions 100
→ Guarantees: critical-api always gets 100 executions
→ Caps: critical-api never exceeds 100 (even if more available)
→ Set to 0 → throttle the function completely (useful for emergency stop)
Unreserved Concurrency:
The remaining concurrency (1,000 - sum of all reserved) shared by all other functions
Concurrency Types Summary¶
| Type | Purpose | Cold Starts |
|---|---|---|
| On-Demand | Default — scales to account limit | Yes |
| Reserved | Guarantees capacity + caps function | Yes |
| Provisioned | Pre-warmed environments for specific version/alias | No |
11. Lambda Destinations¶
Modern replacement for DLQ — route function output to a destination based on success or failure:
Asynchronous invocation destinations:
On success → SQS / SNS / EventBridge / another Lambda
On failure → SQS / SNS / EventBridge / another Lambda
Event source mapping destinations:
On failure → SQS / SNS
Example: order processing pipeline
Lambda: process-order
On success → SNS: notify-customer
On failure → SQS: failed-orders-dlq (for manual review + retry)
Destinations provide more context than DLQ — they include the full event, the function response, error details, and execution metadata. Prefer destinations over DLQ for async Lambda error handling.
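A downstream Lambda consuming the on-failure destination can recover both the original event and the error. Sketch based on the async destination payload shape (`requestContext`, `requestPayload`, `responsePayload`) — field names are worth verifying against current docs:

```python
def handler(event, context):
    # Destination payloads wrap the original event plus failure metadata
    original_event = event.get('requestPayload', {})
    condition = event.get('requestContext', {}).get('condition')  # e.g. 'RetriesExhausted'
    error = event.get('responsePayload', {}).get('errorMessage', 'unknown')
    return {
        'order_id': original_event.get('orderId'),
        'failed_because': error,
        'condition': condition,
    }
```

This is the extra context a plain DLQ message lacks — the DLQ carries only the original event, not the error or attempt count.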
12. Lambda@Edge and CloudFront Functions¶
Run code at CloudFront edge locations — closest to the end user:
| | Lambda@Edge | CloudFront Functions |
|---|---|---|
| Runtime | Node.js, Python | JavaScript only |
| Max memory | 128 MB (viewer) / 10 GB (origin) | 2 MB |
| Max duration | 5s (viewer) / 30s (origin) | < 1ms |
| Network access | ✅ Yes | ❌ No |
| Cost | Higher | ⅙th the cost |
| Triggers | CloudFront viewer/origin req/resp | Viewer request/response only |
| Use case | Auth, A/B test, URL rewrite, API | Header manipulation, URL rewrite, redirects |
CloudFront trigger points:
Viewer Request → before cache check (Lambda@Edge + CF Functions)
Origin Request → cache miss, before origin call (Lambda@Edge only)
Origin Response → after origin responds (Lambda@Edge only)
Viewer Response → before sending to user (Lambda@Edge + CF Functions)
Example: auth at edge
Viewer Request → Lambda@Edge → verify JWT → allow/deny before hitting origin
→ Blocks unauthenticated requests at CDN layer, not your origin server
13. Environment Variables and Configuration¶
import os
def handler(event, context):
    db_host = os.environ['DB_HOST']       # ← environment variable
    env_name = os.environ['ENVIRONMENT']  # ← "prod" / "dev"
    api_key = os.environ['API_KEY']       # ← sensitive: use KMS encryption
# Environment variables limit: 4 KB total (all variables combined)
# Encrypt with KMS: Lambda console → Configuration → Environment variables → Enable encryption
Secrets — Best Practice¶
Bad: hardcoded credentials in code or environment variables (plaintext)
OK: environment variable encrypted with KMS
Best: fetch from Secrets Manager at runtime (auto-rotation, audit trail)
import boto3, json
client = boto3.client('secretsmanager')
# Fetch once in INIT phase — cache in global variable
secret = json.loads(client.get_secret_value(SecretId='prod/db')['SecretString'])
DB_PASSWORD = secret['password']
14. Lambda Pricing¶
Two charges:
1. Requests: $0.20 per 1 million requests
2. Duration: $0.0000166667 per GB-second
GB-second = (memory in GB) × (duration in seconds)
Example:
Function: 512 MB memory, 500ms execution, 1 million requests/month
Duration cost:
0.5 GB × 0.5 seconds = 0.25 GB-seconds per request
× 1,000,000 requests = 250,000 GB-seconds
× $0.0000166667 = $4.17
Request cost: $0.20
Total: ~$4.37/month
Free tier (every month, does not expire):
1,000,000 requests/month
400,000 GB-seconds/month
ARM64 (Graviton): 20% cheaper per GB-second vs x86_64
15. Common Patterns ⭐¶
API (Serverless REST API)¶
Client → API Gateway → Lambda → DynamoDB
Scales from 0 to millions of requests
Cost: pay per request, $0 when idle
Event Processing¶
S3 upload → Lambda → resize image → save thumbnail → S3
SQS message → Lambda → process order → DynamoDB + SES email
DynamoDB Stream → Lambda → sync to Elasticsearch
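The S3-trigger step of these pipelines starts by unpacking the event. A minimal sketch — `make_thumbnail` is a hypothetical stand-in for the actual resize-and-upload work:

```python
from urllib.parse import unquote_plus

def make_thumbnail(bucket: str, key: str) -> str:
    # Hypothetical: download, resize, upload — returns the thumbnail key
    return f'thumbnails/{key}'

def handler(event, context):
    results = []
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # S3 object keys arrive URL-encoded (spaces become '+')
        key = unquote_plus(record['s3']['object']['key'])
        results.append(make_thumbnail(bucket, key))
    return results
```

Forgetting the `unquote_plus` decode is a classic bug: a key with spaces triggers the function but the subsequent `GetObject` 404s.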
Scheduled Jobs (Cron)¶
EventBridge rule: cron(0 2 * * ? *) → Lambda: cleanup-old-data
Every day at 2 AM → runs cleanup function
No server needed — serverless cron
Fan-Out Pattern¶
One event → SNS → multiple Lambda subscribers
Lambda-A: send email
Lambda-B: update analytics
Lambda-C: log to audit trail
All three execute in parallel from single SNS publish
16. Common Mistakes¶
| ❌ Wrong | ✅ Correct |
|---|---|
| Lambda can run indefinitely | Hard timeout: 15 minutes maximum |
| Increase the timeout to fix a hanging DB/API call | Lambda in a VPC without NAT/endpoint has no internet — the call hangs until timeout regardless; fix the networking, not the timeout |
| Put Lambda in public subnet for internet access | Lambda in public subnet still has no internet — use private subnet + NAT Gateway |
| Create DB connection inside handler | Create connection in INIT phase (outside handler) — reused across warm invocations |
| Provisioned Concurrency on $LATEST | Provisioned Concurrency must be on a version or alias — not $LATEST |
| Layers count against the 250 MB limit | Yes — total unzipped = code + all layers combined must be ≤ 250 MB |
| Cold starts happen on every invocation | Cold starts occur in < 1% of invocations in steady workloads |
| Reserved Concurrency = no cold starts | Reserved concurrency limits capacity — only Provisioned eliminates cold starts |
| Environment variables have no size limit | All env vars combined: 4 KB total limit |
| Lambda scales linearly | Lambda scales in burst increments; burst limit (500–3000/min) applies on rapid scale-out |
17. Interview Questions Checklist¶
- What is serverless computing? How does Lambda fit?
- Explain the Lambda execution environment lifecycle (3 phases)
- What is a cold start? What causes it? How do you fix it?
- Cold start duration range? When do they occur? (< 1% of invocations)
- Three invocation types — synchronous, asynchronous, event source mapping
- Error handling for each invocation type (retries, DLQ, destinations)
- Versions vs Aliases — what are they? How do you do canary deployments?
- What is Provisioned Concurrency? Why must it be on a version/alias?
- Reserved vs Provisioned vs On-Demand concurrency — differences?
- Why put Lambda in a VPC? What do you lose? What configuration is needed?
- How do you give Lambda internet access when it's in a VPC?
- CPU scaling with memory — at what memory is 1 full vCPU? (1,769 MB)
- What are Lambda Layers? Use case? Limits?
- Lambda@Edge vs CloudFront Functions — when to use each?
- Destinations vs DLQ — why prefer destinations for async Lambda?
- Lambda pricing model — two charges, free tier?
- What is the ephemeral storage (/tmp) and its limit?
- Where should you put DB connection creation — handler or outside? Why?