AWS Lambda

1. What is Lambda?

AWS Lambda is a serverless, event-driven compute service — you upload code and AWS runs it. No servers to provision, no OS to manage, no capacity to plan. You pay only for the milliseconds your code actually executes.

Traditional server model:
  Provision EC2 → install runtime → deploy app → manage 24/7
  Pay: every hour the server exists (even idle hours)

Lambda model:
  Upload code → Lambda runs it on demand
  Pay: only while code executes (per 1ms)
  Idle time: $0

Lambda is not for long-running processes. It is designed for short-lived, stateless functions that respond to events.


2. Lambda Execution Model

Handler Function

Every Lambda function has a handler — the entry point AWS invokes:

# Python
def handler(event, context):
    print(event['name'])
    return {
        'statusCode': 200,
        'body': 'Hello from Lambda'
    }

// Node.js
exports.handler = async (event, context) => {
    return {
        statusCode: 200,
        body: JSON.stringify({ message: 'Hello from Lambda' })
    };
};

Parameter   Contains
event       Input data from the trigger (HTTP request, S3 event, SQS message, etc.)
context     Runtime info: function name, remaining time, request ID, memory limit
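A minimal sketch of using both parameters. FakeContext is a hypothetical stub for local testing; the real runtime passes an object exposing these same attribute and method names (function_name, aws_request_id, get_remaining_time_in_millis, …):

```python
def handler(event, context):
    # Bail out early if we are close to the configured timeout
    if context.get_remaining_time_in_millis() < 1000:
        return {'statusCode': 503, 'body': 'not enough time left'}
    return {
        'statusCode': 200,
        'body': f"{context.function_name} handled request {context.aws_request_id}"
    }

class FakeContext:
    # Stub mirroring the real context object's interface (for local tests)
    function_name = 'my-function'
    aws_request_id = 'req-123'
    memory_limit_in_mb = 128
    def get_remaining_time_in_millis(self):
        return 30_000

print(handler({}, FakeContext())['statusCode'])   # 200
```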

Execution Environment Lifecycle

Phase 1: INIT (cold start only)
  ├── Download your code/container
  ├── Start language runtime (Node.js, Python, Java…)
  ├── Run initialization code OUTSIDE handler
  │   (import libraries, create DB connections, load config)
  └── Duration: 100ms to 1+ second

Phase 2: INVOKE
  ├── Run handler function with event
  └── Duration: your code execution time

Phase 3: SHUTDOWN (if no new invocations for ~15 min)
  └── Execution environment frozen/terminated

# Code placement matters — INIT vs INVOKE

import boto3                         # ← INIT: runs once per cold start
db_client = boto3.client('dynamodb') # ← INIT: connection created once

def handler(event, context):
    # ← INVOKE: runs every request
    result = db_client.get_item(...)  # reuses existing connection ✅
    return result

3. Cold Start vs Warm Start ⭐

Cold Start (new execution environment):
  Trigger arrives → no available environment →
  AWS provisions environment → INIT phase → INVOKE
  Latency added: ~100ms (Python/Node.js) to ~1s+ (Java/.NET)

Warm Start (reuse existing environment):
  Trigger arrives → existing environment available →
  INVOKE directly (skip INIT)
  Latency added: ~0ms

What Causes Cold Starts?

Trigger                       Explanation
First invocation ever         No environments exist yet
Traffic spike                 10 concurrent requests → 10 environments needed simultaneously
After ~15 minutes idle        Environment was frozen/recycled
Code/config update deployed   New environments needed for new version
Region first invocation       No warm environments in that region

Cold Start Mitigation Strategies

1. Provisioned Concurrency (Eliminates Cold Starts)

Pre-initializes N execution environments — always warm and ready.
Incoming requests route to pre-warmed environment → zero cold start.

Must be applied to a FUNCTION VERSION or ALIAS — NOT $LATEST
(docs: https://docs.aws.amazon.com/lambda/latest/dg/provisioned-concurrency.html)

CLI:
  aws lambda put-provisioned-concurrency-config \
    --function-name my-api \
    --qualifier prod \           ← alias or version, not $LATEST
    --provisioned-concurrent-executions 10

Cost: charged per GB-second while environments are provisioned
      (even when not actively executing — this is always-on compute)

Auto-scaling Provisioned Concurrency:
  Scale up at 8am, scale down at 8pm (based on schedule)
  Scale based on utilization metric (target 70% utilization)

2. ARM64 Architecture (Graviton2)

lambda.Architecture.ARM_64
→ 13–24% faster cold starts vs x86_64
→ 20% cheaper per GB-second
→ Use for: most workloads (Python, Node.js, Java)

3. Minimize Package Size

Smaller deployment package = faster code download during INIT
  Use: tree-shaking, exclude dev dependencies
  Use: Lambda layers for shared large libraries
  Use: container images for large dependencies (cached per environment)

4. Move Heavy Work to INIT Phase

# Do expensive setup once at init, not every request
s3 = boto3.client('s3')               # ← init phase
config = load_config_from_ssm()       # ← init phase (cached)

def handler(event, context):
    return s3.get_object(...)         # ← invoke phase (reuses client)

5. Keep Functions Warm (Scheduled EventBridge)

EventBridge rule: every 5 minutes → invoke Lambda with warmup event
→ Prevents idle timeout from recycling environments
→ Free tier covers most warmup invocations
Note: keeps ONE environment warm — not useful for concurrent scaling
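The warmup pattern can be sketched as a handler that short-circuits on the scheduled ping. The {'source': 'warmup'} marker is an assumption — use whatever constant input your EventBridge rule is configured to send:

```python
def handler(event, context):
    # EventBridge warmup ping: return immediately, skip the real work.
    # The 'warmup' marker is whatever constant you put in the rule's input.
    if event.get('source') == 'warmup':
        return {'warmed': True}

    # ... real work for genuine requests ...
    return {'statusCode': 200, 'body': 'processed'}

print(handler({'source': 'warmup'}, None))   # {'warmed': True}
```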

4. Lambda Limits ⭐

Resource                          Limit
Memory                            128 MB – 10,240 MB (1 MB increments)
Timeout                           1 second – 900 seconds (15 minutes)
Ephemeral storage (/tmp)          512 MB – 10,240 MB
Deployment package (zip)          50 MB compressed / 250 MB uncompressed
Container image size              10 GB
Environment variables             4 KB total
Layers per function               5 layers
Concurrent executions (default)   1,000 per region (soft limit — can request increase)
Burst concurrency                 500–3,000 (varies by region)

CPU Scales With Memory

Lambda allocates CPU proportional to memory:
  128 MB  → ~0.07 vCPU
  1,769 MB → 1 full vCPU  ← threshold for one full CPU
  3,538 MB → 2 vCPUs
  10,240 MB → ~5.8 vCPUs

For CPU-bound workloads (image processing, ML inference):
  Increase memory → you get more CPU → faster execution
  May actually REDUCE cost: faster = less duration billed
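A rough illustration of why more memory can be cost-neutral (or cheaper) for CPU-bound work. The durations are made-up numbers for a hypothetical task whose speed scales with allocated vCPU; the per-GB-second price is the published x86_64 rate:

```python
PRICE_PER_GB_SECOND = 0.0000166667   # x86_64 on-demand duration rate

def invocation_cost(memory_mb, duration_s):
    # GB-second = (memory in GB) x (duration in seconds)
    return (memory_mb / 1024) * duration_s * PRICE_PER_GB_SECOND

# Hypothetical CPU-bound task: doubling memory (hence CPU) halves duration
slow = invocation_cost(memory_mb=1024, duration_s=4.0)   # 1 GB x 4 s = 4 GB-s
fast = invocation_cost(memory_mb=2048, duration_s=2.0)   # 2 GB x 2 s = 4 GB-s

print(slow == fast)   # True: same cost, half the latency
```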

5. Invocation Types

Synchronous

Caller waits for result before continuing

Sources:
  API Gateway, ALB, CloudFront, SDK direct call

Flow:
  Caller → Lambda → executes → returns response → Caller receives result

Error handling: caller receives error immediately
Retry: caller's responsibility

Asynchronous

Caller sends event and immediately gets 202 Accepted
Lambda processes in background — caller doesn't wait

Sources:
  S3 event notifications, SNS, EventBridge, CloudWatch Events

Flow:
  Event → Lambda internal queue → Lambda executes
  Caller already moved on

Error handling: Lambda retries automatically
  Default: 2 retries (3 total attempts)
  On final failure → Dead Letter Queue (SQS or SNS)

Configure:
  Maximum age of event: 60s – 6 hours
  Maximum retry attempts: 0, 1, or 2
  On-failure destination: SQS, SNS, EventBridge, another Lambda

Event Source Mapping (Poll-Based)

Lambda polls a source for new records, processes in batches

Sources:
  SQS (standard + FIFO)
  Kinesis Data Streams
  DynamoDB Streams
  MSK (Managed Streaming for Kafka)
  MQ (ActiveMQ, RabbitMQ)

Flow:
  Lambda service long-polls SQS (wait time of up to 20 seconds per poll)
  Receives batch (up to 10,000 messages for SQS)
  Invokes Lambda with the batch
  On success: messages deleted from SQS
  On failure: messages return to queue (retry) or go to DLQ

Batch settings:
  Batch size: 1–10,000 (SQS), 1–10,000 (Kinesis)
  Batch window: 0–300 seconds (wait to collect larger batch)
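For SQS event source mappings, the handler can report partial batch failures so only the failed messages return to the queue instead of the whole batch. This requires enabling ReportBatchItemFailures in the mapping's FunctionResponseTypes; the process() helper is hypothetical:

```python
import json

def handler(event, context):
    # event['Records'] is the SQS batch delivered by the event source mapping
    failures = []
    for record in event['Records']:
        try:
            process(json.loads(record['body']))
        except Exception:
            # Report this message ID as failed; only it returns to the queue
            failures.append({'itemIdentifier': record['messageId']})
    # Response shape Lambda expects when ReportBatchItemFailures is enabled
    return {'batchItemFailures': failures}

def process(order):
    # Hypothetical business logic stand-in
    if order.get('amount', 0) < 0:
        raise ValueError('invalid amount')

batch = {'Records': [
    {'messageId': 'm1', 'body': '{"amount": 10}'},
    {'messageId': 'm2', 'body': '{"amount": -5}'},
]}
print(handler(batch, None))   # {'batchItemFailures': [{'itemIdentifier': 'm2'}]}
```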

6. Event Sources (Triggers) ⭐

Category    Source                       Invocation Type
HTTP        API Gateway (REST, HTTP)     Synchronous
HTTP        Application Load Balancer    Synchronous
Storage     S3 (object events)           Asynchronous
Streaming   Kinesis Data Streams         Event Source Mapping (poll)
Queue       SQS                          Event Source Mapping (poll)
Database    DynamoDB Streams             Event Source Mapping (poll)
Messaging   SNS                          Asynchronous
Events      EventBridge                  Asynchronous
Schedule    EventBridge Scheduler        Asynchronous
Auth        Cognito User Pools           Synchronous
Edge        CloudFront (Lambda@Edge)     Synchronous

7. Versions and Aliases ⭐

Versions

$LATEST → mutable, always the most recent code
v1, v2, v3 → immutable snapshots of code + configuration

Publish a version:
  aws lambda publish-version --function-name my-function
  → Creates an immutable version (v1, v2…)
  → Version has its own ARN: arn:aws:lambda:...:function:my-function:3

Qualified ARN:   arn:...:my-function:3  → specific version
Unqualified ARN: arn:...:my-function    → $LATEST
(docs: https://docs.aws.amazon.com/lambda/latest/dg/configuration-versions.html)

Cannot edit code of a published version — it is frozen

Aliases

A named pointer to a version — can be updated without changing clients

my-function:prod  → points to v3
my-function:dev   → points to $LATEST
my-function:beta  → points to v4

Update prod alias:
  aws lambda update-alias --function-name my-function \
    --name prod --function-version 4
  → prod now points to v4 (no client-side changes needed)

Traffic shifting (canary deployments):
  prod alias: 90% → v3, 10% → v4
  → Gradually shift traffic to test new version in production
  → Use with CodeDeploy for automated rollback on alarms

  aws lambda update-alias --function-name my-function \
    --name prod \
    --routing-config '{"AdditionalVersionWeights":{"4":0.10}}'

Alias ARN: arn:aws:lambda:...:function:my-function:prod
  → Always resolves to the version the alias points to
  → Use alias ARNs in all triggers/event sources — enables zero-downtime deploys

8. Lambda Layers ⭐

Layers are ZIP archives containing shared code, libraries, or binaries that can be attached to multiple functions:

Without layers:
  function-A.zip: code + numpy + pandas + scipy  (200 MB)
  function-B.zip: code + numpy + pandas + scipy  (200 MB)
  function-C.zip: code + numpy + pandas + scipy  (200 MB)
  → 600 MB total, duplicate libraries

With layers:
  numpy-pandas-layer.zip: numpy + pandas + scipy  (180 MB shared layer)
  function-A.zip: code only (5 MB)
  function-B.zip: code only (3 MB)
  function-C.zip: code only (4 MB)
  → 192 MB total, fast deploys, single update point

Layer directory structure (Python):
  python/lib/python3.12/site-packages/
    numpy/
    pandas/

Layer directory structure (Node.js):
  nodejs/node_modules/
    lodash/
    moment/

Limits:
  Max 5 layers per function
  Total unzipped (code + all layers): 250 MB
  Layers versioned (immutable when published)

9. Lambda in a VPC

By default Lambda runs outside your VPC (in AWS-managed infrastructure). Attach Lambda to a VPC when your function needs to access private resources:

Use cases for VPC attachment:
  → Lambda → private RDS (no public endpoint)
  → Lambda → private ElastiCache
  → Lambda → private EC2 microservice

VPC configuration:
  Assign to: private subnets (NOT public subnets)
  Assign to: Security Group (controls outbound connections)

Architecture:
  Lambda (in VPC private subnet)
    → security group allows port 3306 → RDS security group
    → connects to private RDS ✅

Internet access from VPC Lambda:
  Lambda in private subnet → NAT Gateway → Internet ✅
  Lambda in private subnet (no NAT) → No internet ❌
  Lambda in public subnet → still no internet (Lambda ENIs never get a public IP, so the IGW route is useless)

Placing Lambda in a VPC historically added significant cold start overhead (an ENI — Elastic Network Interface — was created per environment). Since 2019, Hyperplane ENIs are created once when the function's VPC configuration is set and shared across environments, so the per-invocation penalty is now small. Still worth measuring for latency-sensitive synchronous functions.


10. Concurrency ⭐

Concurrency = number of requests being handled simultaneously at one moment

Each in-flight request occupies one execution environment.
If 50 requests arrive simultaneously → 50 concurrent Lambda instances needed.

Account concurrency limit: 1,000 per region (default, soft limit)
  → If function needs 1,001 simultaneous executions → throttled (429 error)

Reserved Concurrency:
  Guarantee N executions are always available for a specific function
  Also caps the function at N → prevents one function consuming all account concurrency

  aws lambda put-function-concurrency \
    --function-name critical-api \
    --reserved-concurrent-executions 100
  → Guarantees: critical-api always gets 100 executions
  → Caps: critical-api never exceeds 100 (even if more available)
  → Set to 0 → throttle the function completely (useful for emergency stop)

Unreserved Concurrency:
  The remaining concurrency (1,000 - sum of all reserved) shared by all other functions
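Required concurrency can be estimated with Little's law: concurrency ≈ requests per second × average duration in seconds. A quick sketch:

```python
import math

def required_concurrency(requests_per_second, avg_duration_s):
    # Little's law: in-flight requests = arrival rate x time in system
    return math.ceil(requests_per_second * avg_duration_s)

# 200 req/s at 0.5 s average duration -> 100 concurrent environments
print(required_concurrency(200, 0.5))   # 100

# Same idea shows how a slow function eats the 1,000 default limit:
print(required_concurrency(400, 3.0))   # 1200 -> throttled without a limit increase
```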

Concurrency Types Summary

Type          Purpose                                              Cold Starts
On-Demand     Default — scales to account limit                    Yes
Reserved      Guarantees capacity + caps function                  Yes
Provisioned   Pre-warmed environments for specific version/alias   No

11. Lambda Destinations

Modern replacement for DLQ — route function output to a destination based on success or failure:

Asynchronous invocation destinations:
  On success → SQS / SNS / EventBridge / another Lambda
  On failure  → SQS / SNS / EventBridge / another Lambda

Event source mapping destinations:
  On failure  → SQS / SNS

Example: order processing pipeline
  Lambda: process-order
    On success → SNS: notify-customer
    On failure  → SQS: failed-orders-dlq (for manual review + retry)

Destinations provide more context than DLQ — they include the full event, the function response, error details, and execution metadata. Prefer destinations over DLQ for async Lambda error handling.
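The on-failure destination receives a JSON envelope wrapping the original event plus error and retry metadata. The field names below follow the documented async destination format; the consumer function itself is a hypothetical triage sketch:

```python
def handle_failed_order(destination_event, context):
    # Envelope fields per the async invocation destination format
    ctx = destination_event['requestContext']
    return {
        'original_event': destination_event['requestPayload'],   # what the function received
        'error': destination_event['responsePayload'],           # errorType / errorMessage
        'attempts': ctx['approximateInvokeCount'],               # total invoke attempts
        'reason': ctx['condition'],                              # e.g. RetriesExhausted
    }

sample = {
    'version': '1.0',
    'requestContext': {'requestId': 'abc', 'condition': 'RetriesExhausted',
                       'approximateInvokeCount': 3},
    'requestPayload': {'orderId': 42},
    'responsePayload': {'errorType': 'ValueError', 'errorMessage': 'bad order'},
}
print(handle_failed_order(sample, None)['attempts'])   # 3
```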


12. Lambda@Edge and CloudFront Functions

Run code at CloudFront edge locations — closest to the end user:

                 Lambda@Edge                          CloudFront Functions
Runtime          Node.js, Python                      JavaScript only
Max memory       128 MB (viewer) / 10 GB (origin)     2 MB
Max duration     5s (viewer) / 30s (origin)           < 1ms
Network access   ✅ Yes                               ❌ No
Cost             Higher                               ⅙th the cost
Triggers         CloudFront viewer/origin req/resp    Viewer request/response only
Use case         Auth, A/B test, URL rewrite, API     Header manipulation, URL rewrite, redirects

CloudFront trigger points:
  Viewer Request  → before cache check (Lambda@Edge + CF Functions)
  Origin Request  → cache miss, before origin call (Lambda@Edge only)
  Origin Response → after origin responds (Lambda@Edge only)
  Viewer Response → before sending to user (Lambda@Edge + CF Functions)

Example: auth at edge
  Viewer Request → Lambda@Edge → verify JWT → allow/deny before hitting origin
  → Blocks unauthenticated requests at CDN layer, not your origin server
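A minimal viewer-request sketch of this pattern in Python. The event shape (Records[0].cf.request) is the real Lambda@Edge format; verify_jwt is a stand-in for actual token validation:

```python
def handler(event, context):
    # CloudFront delivers the request under Records[0].cf.request
    request = event['Records'][0]['cf']['request']
    headers = request['headers']           # keys are lowercased header names

    auth = headers.get('authorization', [{}])[0].get('value', '')
    if not verify_jwt(auth):
        # Returning a response object short-circuits: origin is never hit
        return {'status': '401', 'statusDescription': 'Unauthorized',
                'body': 'missing or invalid token'}
    # Returning the request object forwards it toward the cache/origin
    return request

def verify_jwt(auth_header):
    # Stand-in for real JWT validation (signature, expiry, audience checks)
    return auth_header.startswith('Bearer ')

event = {'Records': [{'cf': {'request': {'uri': '/private', 'headers': {}}}}]}
print(handler(event, None)['status'])   # 401
```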

13. Environment Variables and Configuration

import os

def handler(event, context):
    db_host   = os.environ['DB_HOST']      # ← environment variable
    env_name  = os.environ['ENVIRONMENT']  # ← "prod" / "dev"
    api_key   = os.environ['API_KEY']      # ← sensitive: use KMS encryption

# Environment variables limit: 4 KB total (all variables combined)
# Encrypt with KMS: Lambda console → Configuration → Environment variables → Enable encryption

Secrets — Best Practice

Bad:  hardcoded credentials in code or environment variables (plaintext)
OK:   environment variable encrypted with KMS
Best: fetch from Secrets Manager at runtime (auto-rotation, audit trail)

import boto3, json
client = boto3.client('secretsmanager')

# Fetch once in INIT phase — cache in global variable
secret = json.loads(client.get_secret_value(SecretId='prod/db')['SecretString'])
DB_PASSWORD = secret['password']

14. Lambda Pricing

Two charges:
  1. Requests: $0.20 per 1 million requests
  2. Duration: $0.0000166667 per GB-second

GB-second = (memory in GB) × (duration in seconds)

Example:
  Function: 512 MB memory, 500ms execution, 1 million requests/month
  Duration cost:
    0.5 GB × 0.5 seconds = 0.25 GB-seconds per request
    × 1,000,000 requests  = 250,000 GB-seconds
    × $0.0000166667       = $4.17
  Request cost: $0.20
  Total: ~$4.37/month

Free tier (every month, does not expire):
  1,000,000 requests/month
  400,000 GB-seconds/month

ARM64 (Graviton): 20% cheaper per GB-second vs x86_64
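The worked example above can be checked with a few lines (free tier ignored; x86_64 rates):

```python
REQUEST_PRICE = 0.20 / 1_000_000   # $ per request
DURATION_PRICE = 0.0000166667      # $ per GB-second (x86_64)

def monthly_cost(memory_mb, duration_s, requests):
    # GB-seconds = memory (GB) x duration (s) x number of requests
    gb_seconds = (memory_mb / 1024) * duration_s * requests
    return requests * REQUEST_PRICE + gb_seconds * DURATION_PRICE

# 512 MB, 500 ms, 1M requests/month -> 250,000 GB-s
print(round(monthly_cost(512, 0.5, 1_000_000), 2))   # 4.37
```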

15. Common Patterns ⭐

API (Serverless REST API)

Client → API Gateway → Lambda → DynamoDB
Scales from 0 to millions of requests
Cost: pay per request, $0 when idle

Event Processing

S3 upload → Lambda → resize image → save thumbnail → S3
SQS message → Lambda → process order → DynamoDB + SES email
DynamoDB Stream → Lambda → sync to Elasticsearch

Scheduled Jobs (Cron)

EventBridge rule: cron(0 2 * * ? *)  → Lambda: cleanup-old-data
Every day at 2 AM → runs cleanup function
No server needed — serverless cron

Fan-Out Pattern

One event → SNS → multiple Lambda subscribers
  Lambda-A: send email
  Lambda-B: update analytics
  Lambda-C: log to audit trail
All three execute in parallel from single SNS publish

16. Common Mistakes

❌ Wrong → ✅ Correct

❌ Lambda can run indefinitely
✅ Hard timeout: 15 minutes maximum

❌ Increase timeout to fix slow DB queries
✅ Lambda in VPC without NAT/endpoint = no internet — fix networking first

❌ Put Lambda in public subnet for internet access
✅ Lambda in public subnet still has no internet — use private subnet + NAT Gateway

❌ Create DB connection inside handler
✅ Create connection in INIT phase (outside handler) — reused across warm invocations

❌ Provisioned Concurrency on $LATEST
✅ Provisioned Concurrency must be on a version or alias — not $LATEST

❌ Layers don't count against the 250 MB limit
✅ They do — total unzipped (code + all layers combined) must be ≤ 250 MB

❌ Cold starts happen on every invocation
✅ Cold starts occur in < 1% of invocations in steady workloads

❌ Reserved Concurrency = no cold starts
✅ Reserved Concurrency only guarantees/caps capacity — only Provisioned eliminates cold starts

❌ Environment variables have no size limit
✅ All env vars combined: 4 KB total limit

❌ Lambda scales linearly without limit
✅ Lambda scales in burst increments — the burst concurrency limit (500–3,000, varies by region) applies on rapid scale-out

17. Interview Questions Checklist

  • What is serverless computing? How does Lambda fit?
  • Explain the Lambda execution environment lifecycle (3 phases)
  • What is a cold start? What causes it? How do you fix it?
  • Cold start duration range? When do they occur? (< 1% of invocations)
  • Three invocation types — synchronous, asynchronous, event source mapping
  • Error handling for each invocation type (retries, DLQ, destinations)
  • Versions vs Aliases — what are they? How do you do canary deployments?
  • What is Provisioned Concurrency? Why must it be on a version/alias?
  • Reserved vs Provisioned vs On-Demand concurrency — differences?
  • Why put Lambda in a VPC? What do you lose? What configuration is needed?
  • How do you give Lambda internet access when it's in a VPC?
  • CPU scaling with memory — at what memory is 1 full vCPU? (1,769 MB)
  • What are Lambda Layers? Use case? Limits?
  • Lambda@Edge vs CloudFront Functions — when to use each?
  • Destinations vs DLQ — why prefer destinations for async Lambda?
  • Lambda pricing model — two charges, free tier?
  • What is the ephemeral storage (/tmp) and its limit?
  • Where should you put DB connection creation — handler or outside? Why?