Kubernetes securityContext Deep Dive
Official Kubernetes documentation: Security Context
What is securityContext?
In Kubernetes, the securityContext defines privilege and access control settings for a Pod or Container. It's crucial for securing workloads by configuring:
- Which user the container runs as
- Access to the file system
- Ability to escalate privileges
- POSIX group access
- Linux capabilities and kernel-level security features like SELinux
These settings help enforce the principle of least privilege and meet compliance requirements.
Why Use securityContext?
Kubernetes runs applications in isolated containers. To strengthen security, we need control over how containers run processes, access files, escalate privileges, etc. This is where securityContext comes in.
- Helps run containers as non-root users
- Prevents privilege escalation
- Controls access to the filesystem
- Assigns group ownership to mounted volumes
securityContext Fields Explained with Impact
Below are the most commonly used fields and their real impact in containers:
1. runAsUser
Defines the UID that the container's processes run as.
YAML:
securityContext:
  runAsUser: 1000
Effect: Inside the container, all processes run as user ID 1000. Example:
ps aux
PID USER TIME COMMAND
1 1000 0:00 sleep 1h
6 1000 0:00 sh
Use it when you want to run containers as non-root users.
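For context, here is a minimal sketch of where this fragment sits in a complete manifest. The Pod name run-as-user-demo and the busybox image are illustrative assumptions, not part of the original example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: run-as-user-demo        # hypothetical name
spec:
  containers:
    - name: app
      image: busybox            # assumed image; any image that works as a non-root user
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        runAsUser: 1000         # processes in this container run as UID 1000
```

Once the Pod is running, kubectl exec run-as-user-demo -- id -u is one way to confirm the effective UID.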
2. runAsGroup
Defines the primary GID that the container's processes run as.
YAML:
securityContext:
  runAsGroup: 3000
Effect: All processes run with the specified group ID 3000. Check with:
id
uid=1000 gid=3000 groups=3000
The runAsGroup field sets the primary group ID to 3000 for all processes in any container of the Pod. If the field is omitted, the primary group ID of the containers is root (0). With runAsUser: 1000 and runAsGroup: 3000 both set, any files created are owned by user 1000 and group 3000.
3. fsGroup
- Gives group ownership of mounted volumes to the specified GID.
- New files created in mounted volumes (e.g., /data) are owned by this group.
- The GID is also added to each container process as a supplementary group, which is why it appears in the id output below.
securityContext:
  fsGroup: 2000
Example Walkthrough:
$ id
uid=1000 gid=3000 groups=3000,2000
# Entries under /data show group ID = 2000 (from fsGroup)
$ ls -l /data
drwxrwsrwx 2 root 2000 4096 Apr 8 20:08 demo
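fsGroup only has a visible effect when the Pod mounts a volume that supports ownership management. The sketch below is one way to reproduce a walkthrough like the one above; the Pod name, the emptyDir volume, and the busybox image are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-demo            # hypothetical name
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000               # group applied to the volume and added to each process
  volumes:
    - name: data
      emptyDir: {}              # assumed volume type; supports fsGroup ownership
  containers:
    - name: app
      image: busybox            # assumed image
      command: ["sh", "-c", "sleep 1h"]
      volumeMounts:
        - name: data
          mountPath: /data      # files created here get group 2000
```

Running id and ls -ld /data inside this container should show GID 2000 among the process groups and as the group owner of the mount point.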
4. seLinuxOptions
- What it does: Defines SELinux labels for process and file access.
- Use Case: Fine-grained access control for systems using SELinux.
securityContext:
  seLinuxOptions:
    level: "s0:c123,c456"
    role: "system_r"
    type: "spc_t"
    user: "system_u"
5. supplementalGroups
Adds additional groups the container's processes will be part of.
YAML (flow style):
securityContext:
  supplementalGroups: [4000, 5000]
Equivalent block style:
securityContext:
  supplementalGroups:
    - 4000
    - 5000
Effect: Inside the container:
id
uid=1000 gid=3000 groups=3000,4000,5000
6. runAsNonRoot
Ensures the container cannot run as root.
YAML:
securityContext:
  runAsNonRoot: true
Effect: The kubelet checks the container's effective UID at startup and refuses to start the container if that UID is 0 (root). Useful for enforcing least privilege.
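runAsNonRoot only validates; it does not pick a user. A common pattern, sketched here with an assumed Pod name and image, is to pair it with an explicit runAsUser so the check always passes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: non-root-demo           # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true          # refuse to start containers that would run as UID 0
    runAsUser: 1000             # explicit non-root UID, so the check passes
  containers:
    - name: app
      image: busybox            # assumed image; its default user is root
      command: ["sh", "-c", "sleep 1h"]
```

If runAsUser were removed from this sketch, the image's default root user would trip the check and the container would stay unstarted, typically with a CreateContainerConfigError status visible in kubectl describe pod.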
7. allowPrivilegeEscalation
Controls whether a process can gain more privileges than its parent process.
YAML:
securityContext:
  allowPrivilegeEscalation: false
Effect: With false, setuid binaries such as sudo can no longer raise a process's privileges. Useful for untrusted containers.
Note: allowPrivilegeEscalation is always true when the container:
- is run as privileged, or
- has CAP_SYS_ADMIN
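Because this field is container-only, it has to be set on each container rather than once at the Pod level. A minimal sketch, with an assumed Pod name and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-escalation-demo      # hypothetical name
spec:
  containers:
    - name: app
      image: busybox            # assumed image
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        runAsUser: 1000                   # start as a non-root user
        allowPrivilegeEscalation: false   # setuid binaries cannot raise privileges
```

Under the hood this corresponds to the Linux no_new_privs flag being set on the container process.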
8. readOnlyRootFilesystem
Mounts the container's root filesystem as read-only.
YAML:
securityContext:
  readOnlyRootFilesystem: true
Effect: Prevents writing to /. The app must write to mounted volumes instead. Useful for hardened environments.
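Most applications still need somewhere to write temporary files. One possible arrangement, assuming the app only needs /tmp, is to mount an emptyDir volume there; the names and image are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readonly-rootfs-demo    # hypothetical name
spec:
  volumes:
    - name: tmp
      emptyDir: {}              # scratch space that lives as long as the Pod
  containers:
    - name: app
      image: busybox            # assumed image
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: tmp
          mountPath: /tmp       # writable location for temporary files
```

In this sketch a write to /tmp should succeed, while a write to a path on the root filesystem such as /etc should fail with a read-only file system error.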
9. capabilities
Controls Linux kernel capabilities granted to the container.
- Official documentation: Linux Capabilities
YAML:
securityContext:
  capabilities:
    drop: # flow style: ["ALL"]
      - ALL
    add: # flow style: ["NET_BIND_SERVICE", "CHOWN"]
      - NET_BIND_SERVICE
      - CHOWN
Effect: Drops all capabilities, then adds back only the ones needed: here NET_BIND_SERVICE (binding to ports below 1024) and CHOWN (changing file ownership).
10. privileged: true
When you set:
securityContext:
  privileged: true
it gives the container full host-level privileges. Basically, it's like running the container as root on the host itself, not just inside its namespace. This allows access to host devices, loading kernel modules, mounting filesystems, and more.
What "Host" Means
In Kubernetes (or Docker):
- Your host = the machine (node) on which the container actually runs.
- So if you're running Kubernetes locally (e.g., minikube, kind, or kubeadm on your laptop), then the host is your laptop.
If it's a cluster:
- Then "host" means the worker node (VM or physical machine) where that Pod is scheduled.
So when you set:
securityContext:
  privileged: true
It allows the container to access and control your host system, e.g.:
- read or modify host files,
- mount host devices,
- change kernel settings.
That's why it's dangerous: it's almost like giving the container root access to your laptop (the host).
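Workloads that reach for privileged often need only one or two capabilities. As a hedged sketch, assuming a hypothetical container that only has to configure network interfaces, a narrower alternative could look like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: net-admin-demo          # hypothetical name
spec:
  containers:
    - name: net-tool
      image: busybox            # assumed image; runs as root by default
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        privileged: false       # no blanket host-level access
        capabilities:
          drop: ["ALL"]         # start from zero capabilities
          add: ["NET_ADMIN"]    # assumed requirement: network configuration only
```

Which capability is actually needed depends on the workload; NET_ADMIN here is only an example.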
Pod-level vs Container-level securityContext
Kubernetes allows you to define securityContext:
- At the Pod level: applies defaults to all containers
- At the Container level: overrides pod-level for that container
Precedence
| Defined At | Takes Effect? |
|---|---|
| Pod-level only | Yes: applied to all containers |
| Container-level only | Yes: applied only to that container |
| Both | Yes: container-level overrides Pod-level |
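The "Both" row can be illustrated with a two-container Pod. This is a sketch with assumed names and image: one container inherits the Pod-level UID and the other overrides it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: precedence-demo         # hypothetical name
spec:
  securityContext:
    runAsUser: 1000             # Pod-level default for every container
  containers:
    - name: inherits
      image: busybox            # assumed image
      command: ["sh", "-c", "sleep 1h"]
      # no container-level securityContext: runs as UID 1000
    - name: overrides
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        runAsUser: 2000         # container-level value wins: runs as UID 2000
```

Running id -u in each container (kubectl exec precedence-demo -c inherits -- id -u, and the same with -c overrides) should report 1000 and 2000 respectively.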
Full Example
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext: # Pod-level security context
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    supplementalGroups: [4000]
    runAsNonRoot: true
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 1h"]
      securityContext: # Container-level security context
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
          add: ["NET_BIND_SERVICE"]
What This Means in Practice
- Pod processes run as UID 1000, GID 3000
- Any volumes mounted into the Pod (like /data) would get GID 2000; this manifest does not mount one
- Extra group access for GID 4000
- Root FS is read-only (container level)
- All kernel capabilities dropped except NET_BIND_SERVICE (binding to ports below 1024)
Kubernetes securityContext Key Reference
| Field Name | Applies To | Description |
|---|---|---|
| runAsUser | Pod & Container | Runs the container's processes as a specific UID. |
| runAsGroup | Pod & Container | Runs the container's processes with the specified primary GID. |
| fsGroup | Pod Only | Sets the GID for mounted volumes (shared among containers). |
| fsGroupChangePolicy | Pod Only | Controls when fsGroup ownership is applied to volume files (Always or OnRootMismatch). |
| supplementalGroups | Pod Only | Additional GIDs added to all containers in the Pod. |
| supplementalGroupsPolicy | Pod Only (alpha when introduced in v1.31) | Controls whether group memberships from the image's /etc/group are merged in (Merge, the default) or ignored (Strict). |
| capabilities.add, capabilities.drop | Container Only | Add or drop Linux capabilities (e.g., NET_ADMIN, SYS_TIME). |
| allowPrivilegeEscalation | Container Only | Controls whether a process can gain more privileges than its parent process. |
| readOnlyRootFilesystem | Container Only | Mounts the container's root filesystem as read-only to prevent tampering. |
| privileged | Container Only | Gives full host privileges to the container (dangerous!). |
| runAsNonRoot | Pod & Container | Ensures the container doesn't run as UID 0 (root). |
| seccompProfile.type | Pod & Container | Defines the seccomp profile (RuntimeDefault, Unconfined, or Localhost). |
| appArmorProfile.type | Pod & Container | Specifies the AppArmor profile to apply (only on nodes with AppArmor enabled). |
| seLinuxOptions.level | Pod & Container | Sets the SELinux level; together with user, role, and type it forms the SELinux context. |
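The seccompProfile and appArmorProfile rows have no example elsewhere in this guide, so here is a hedged sketch. It assumes a cluster recent enough to support the appArmorProfile field (v1.30 or later) and nodes with AppArmor enabled; the Pod name and image are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-apparmor-demo   # hypothetical name
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault      # container runtime's default syscall filter, for all containers
  containers:
    - name: app
      image: busybox            # assumed image
      command: ["sh", "-c", "sleep 1h"]
      securityContext:
        appArmorProfile:
          type: RuntimeDefault  # assumes AppArmor is available on the node
```

On clusters or nodes without AppArmor support, dropping the appArmorProfile block leaves a seccomp-only variant of the same sketch.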
Summary
| Scope | Fields |
|---|---|
| Pod Only | fsGroup, fsGroupChangePolicy, supplementalGroups, supplementalGroupsPolicy |
| Container Only | capabilities, allowPrivilegeEscalation, privileged, readOnlyRootFilesystem |
| Both | runAsUser, runAsGroup, runAsNonRoot, seccompProfile, appArmorProfile, seLinuxOptions |