Kubernetes SecurityContext Overview
To enforce policies at the pod level, we can use the Kubernetes SecurityContext field in the pod specification. A security context defines privilege and access control settings for a Pod or for the individual containers running inside it.
Here are some of the settings that can be configured as part of the Kubernetes SecurityContext field (a combined example follows this list):
- runAsUser: the UID with which each container process will run
- runAsNonRoot: prevents starting containers that would run as UID 0 (root)
- runAsGroup: the GID used to run the entrypoint of the container process
- supplementalGroups: additional groups (GIDs) applied to the first process in each container
- fsGroup: the GID used for filesystem ownership and newly created files on supported volumes; this is applied to the entire Pod and not per container
- allowPrivilegeEscalation: controls whether a process inside the container can gain more privileges than its parent process
- readOnlyRootFilesystem: mounts the container's root filesystem as read-only
- capabilities: Linux capabilities that can be enabled with the add keyword or disabled with the drop keyword for the container
- seccomp: filter a process's system calls
- AppArmor: use program profiles to restrict the capabilities of individual programs
- SELinux (Security-Enhanced Linux): assign security labels to objects
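To give an idea of how these settings fit together, here is a minimal sketch of a Pod definition that combines a few of them. The image and values are illustrative assumptions, not taken from a real cluster:
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    # Pod-level settings apply to all containers
    runAsUser: 1000
    runAsNonRoot: true
    fsGroup: 2000
  containers:
  - name: demo
    image: busybox
    command: ["sleep", "3600"]
    securityContext:
      # Container-level settings apply only to this container
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL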
Pre-requisites
Before you start with Kubernetes SecurityContext, make sure the following points are covered (a couple of quick verification commands follow this list):
- You have an up-and-running Kubernetes Cluster
- You have a Pod Security Policy in place that allows the SecurityContext settings used in the examples below, such as privileges, capabilities, etc.
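A couple of quick checks for these prerequisites (a sketch only; the testns-psp-01 policy referenced later in this article comes from my lab setup):
~]# kubectl get nodes        # confirm the cluster is up and the nodes are Ready
~]# kubectl get psp          # confirm at least one Pod Security Policy exists
~]# kubectl get psp testns-psp-01 -o yaml    # inspect the policy used in the later examples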
Using runAsUser with Kubernetes SecurityContext
In this section we will explore the runAsUser field of the Kubernetes SecurityContext. runAsUser can be applied at the Pod level or at the container level; let me demonstrate both with examples.
Example-1: Define runAsUser for entire Pod
Here we have a multi-container Pod where we define the runAsUser parameter under the Pod-level Kubernetes SecurityContext, so it applies to all containers running inside the Pod.
apiVersion: v1
kind: Pod
metadata:
  name: pod-as-user-guest
  namespace: test1
spec:
  securityContext:
    runAsUser: 1025
  containers:
  - name: one
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
  - name: two
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
We will create this Pod:
~]# kubectl create -f security-context-runasuser-1.yaml
pod/pod-as-user-guest created
Check the status of the Pod; both containers should be in the Running state:
~]# kubectl get pods -n test1
NAME READY STATUS RESTARTS AGE
pod-as-user-guest 2/2 Running 0 4s
We can connect to both the containers and verify the default user:
~]# kubectl exec -it pod-as-user-guest -n test1 -c one -- id
uid=1025(user2) gid=1025(user2) groups=1025(user2)
~]# kubectl exec -it pod-as-user-guest -n test1 -c two -- id
uid=1025(user2) gid=1025(user2) groups=1025(user2)
As expected, both containers are running with the user ID defined with runAsUser under the Pod-level Kubernetes SecurityContext.
We will delete this pod:
~]# kubectl delete pod pod-as-user-guest -n test1
pod "pod-as-user-guest" deleted
Example-2: Define runAsUser for container
Now we will define a different user for each individual container, using the container-level Kubernetes SecurityContext in the Pod definition file:
apiVersion: v1
kind: Pod
metadata:
  name: pod-as-user-guest
  namespace: test1
spec:
  containers:
  - name: one
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
    securityContext:
      runAsUser: 1025
  - name: two
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
    securityContext:
      runAsUser: 1026
Here I have defined runAsUser separately for each container inside its own Kubernetes SecurityContext, so the two containers run as different users.
Create this Pod:
~]# kubectl create -f security-context-runasuser-1.yaml
pod/pod-as-user-guest created
Check the status:
~]# kubectl get pods -n test1
NAME READY STATUS RESTARTS AGE
pod-as-user-guest 2/2 Running 0 94s
Verify the USER ID of both the containers:
~]# kubectl exec -it pod-as-user-guest -n test1 -c one -- id
uid=1025(user2) gid=1025(user2) groups=1025(user2)
~]# kubectl exec -it pod-as-user-guest -n test1 -c two -- id
uid=1026(user1) gid=1026(user1) groups=1026(user1)
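Instead of exec'ing into each container, we can also read the effective runAsUser values back from the pod spec. A small sketch using kubectl with a JSONPath expression:
~]# kubectl get pod pod-as-user-guest -n test1 \
    -o jsonpath='{range .spec.containers[*]}{.name}{" runAsUser="}{.securityContext.runAsUser}{"\n"}{end}'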
Using fsGroup with Kubernetes SecurityContext
When we share volumes across multiple containers, access permissions can become a concern. In such scenarios we can use fsGroup under the Kubernetes SecurityContext to define a common group which will act as the group owner for any such shared volumes.
fsGroup is assigned at the Pod level, so you cannot assign it in a container-level Kubernetes SecurityContext; if you try to assign it at the container level you will get the error below:
error: error validating "security-context-fsgroup-1.yaml": error validating data: [ValidationError(Pod.spec.containers[0].securityContext): unknown field "fsGroup" in io.k8s.api.core.v1.SecurityContext, ValidationError(Pod.spec.containers[1].securityContext): unknown field "fsGroup" in io.k8s.api.core.v1.SecurityContext]; if you choose to ignore these errors, turn validation off with --validate=false
This is our sample YAML file to create a Pod using fsGroup:
apiVersion: v1
kind: Pod
metadata:
  name: pod-as-user-guest
  namespace: test1
spec:
  securityContext:
    fsGroup: 555
  containers:
  # create container one
  - name: one
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
    # container one running as user id 1025
    securityContext:
      runAsUser: 1025
    # mount the emptyDir under /volume
    volumeMounts:
    - name: shared-volume
      mountPath: /volume
  # create container two
  - name: two
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
    # container two running as user id 1026
    securityContext:
      runAsUser: 1026
    # mount the emptyDir under /volume
    volumeMounts:
    - name: shared-volume
      mountPath: /volume
  # create the shared emptyDir volume
  volumes:
  - name: shared-volume
    emptyDir: {}
Create this pod:
~]# kubectl create -f security-context-fsgroup-1.yaml
pod/pod-as-user-guest created
Verify the group ownership on the shared volume:
~]# kubectl exec -it pod-as-user-guest -n test1 -c one -- bash
[user2@pod-as-user-guest /]$ id
uid=1025(user2) gid=1025(user2) groups=1025(user2),555
[user2@pod-as-user-guest /]$ ls -ld /volume/
drwxrwsrwx. 2 root 555 4096 Sep 3 09:28 /volume/
[user2@pod-as-user-guest /]$ exit
So, on container one the /volume path is group-owned by GID 555 as expected. The id command shows the container is running with user ID 1025, as specified in the pod definition. The effective group ID is 1025(user2), but group ID 555 is also associated with the user.
Let's verify the same on container two:
~]# kubectl exec -it pod-as-user-guest -n test1 -c two -- bash
[user1@pod-as-user-guest /]$ id
uid=1026(user1) gid=1026(user1) groups=1026(user1),555
[user1@pod-as-user-guest /]$ ls -ld /volume/
drwxrwsrwx. 2 root 555 4096 Sep 3 09:28 /volume/
One more thing you should know: with the fsGroup Kubernetes SecurityContext, any file created inside the shared volume will have the group ownership of the GID provided in the pod definition file. For example, here I will first create a file inside /tmp:
[user1@pod-as-user-guest ~]$ touch /tmp/file
[user1@pod-as-user-guest ~]$ ls -l /tmp/file
-rw-rw-r--. 1 user1 user1 0 Sep 3 09:42 /tmp/file
As you can see above, the file is owned by the user1 user and group. But if you create a file inside the shared volume, i.e. under the /volume path, then the group owner of that file will be the same as the fsGroup value, i.e. 555 in our case:
[user1@pod-as-user-guest ~]$ touch /volume/file
[user1@pod-as-user-guest ~]$ ls -l /volume/file
-rw-rw-r--. 1 user1 555 0 Sep 3 09:42 /volume/file
As you can see, the fsGroup Kubernetes SecurityContext property is applied when a process creates files in a volume (although this also depends on the volume plugin used).
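Related to this behaviour, newer Kubernetes releases (1.20 and later) also support an fsGroupChangePolicy field that controls when the ownership and permission change is applied to a volume. A minimal sketch, assuming the same Pod-level securityContext as above:
...
spec:
  securityContext:
    fsGroup: 555
    # Only change ownership/permissions when the volume root does not already
    # match fsGroup; avoids a full recursive chown on every pod start
    fsGroupChangePolicy: "OnRootMismatch"
...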
Define supplementalGroups inside Kubernetes SecurityContext
We can combine fsGroup with supplementalGroups inside the Pod's SecurityContext field to define additional groups. In that case the runAsUser (or the default image user) will also be added to these supplementary groups.
apiVersion: v1
kind: Pod
metadata:
  name: pod-as-user-guest
  namespace: test1
spec:
  securityContext:
    fsGroup: 555
    # Define additional groups for the default user
    supplementalGroups: [666, 777]
  containers:
  # create container one
  - name: one
    image: golinux-registry:8090/secure-context-img:latest
    command: ["/bin/sleep", "999999"]
...
We will create this Pod and verify the list of groups assigned to the container user:
~]# kubectl exec -it pod-as-user-guest -n test1 -c two -- id
uid=1026(user1) gid=1026(user1) groups=1026(user1),555,666,777
So now, along with the fsGroup GID, our user has also been added to the additional supplementary groups.
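For completeness, the runAsGroup field mentioned at the beginning can be combined with these settings to also control the primary GID of the container process. A minimal sketch, assuming the same image as above:
...
spec:
  securityContext:
    runAsUser: 1025
    # Primary GID of the container process (shown as gid= in the id output)
    runAsGroup: 3000
    fsGroup: 555
    supplementalGroups: [666, 777]
...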
Using allowPrivilegeEscalation with Kubernetes SecurityContext
In this section we will cover different areas related to privilege, where we add or remove Linux capabilities for the container. Every container in a Pod relies on kernel capabilities to perform privileged tasks: for example, changing the ownership of a file requires the CAP_CHOWN capability, switching user or group IDs requires CAP_SETUID and CAP_SETGID, and the mount and umount commands require CAP_SYS_ADMIN.
With containers you can assign or drop these capabilities selectively, so the user inside the container has only limited privileges, which is considered more secure. In the examples below we set allowPrivilegeEscalation to true inside the container's Kubernetes SecurityContext, and the Pod Security Policy must permit this for the pods to be admitted.
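As a side note, if you instead want to block privilege escalation entirely (for example through setuid binaries such as su or sudo inside the container), you can set the flag to false at the container level. A minimal sketch, assuming any non-privileged image:
...
        securityContext:
          runAsUser: 1025
          # The process (and its children) can never gain more privileges
          # than the parent process
          allowPrivilegeEscalation: false
...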
The capabilities a pod is allowed to use are controlled through the PodSecurityPolicy.
Example-1: Using allowedCapabilities in Pod Security Policy
The allowedCapabilities field specifies which capabilities pod authors can add to the securityContext.capabilities.add field in the container spec.
I have the following PSP currently applied in my Kubernetes Cluster, with some capabilities listed under allowedCapabilities:
~]# kubectl get psp testns-psp-01 -o yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
...
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - SYS_ADMIN
  - NET_BIND_SERVICE
  - CHOWN
  requiredDropCapabilities:
  - ALL
...
The status of the PSP can be checked using the command below:
~]# kubectl get psp | grep -E 'PRIV|testns'
NAME            PRIV    CAPS                               SELINUX    RUNASUSER          FSGROUP    SUPGROUP   READONLYROOTFS   VOLUMES
testns-psp-01   false   SYS_ADMIN,NET_BIND_SERVICE,CHOWN   RunAsAny   MustRunAsNonRoot   RunAsAny   RunAsAny   false            *
So we are only allowing the capabilities listed under the CAPS column.
We will create a StatefulSet that requests a capability which is not part of the allowedCapabilities in the Pod Security Policy. Here is my definition file:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-statefulset
  namespace: testns
spec:
  selector:
    matchLabels:
      app: dev
  serviceName: test-pod
  replicas: 2
  template:
    metadata:
      labels:
        app: dev
    spec:
      containers:
      - name: test-statefulset
        image: golinux-registry:8090/secure-context-img:latest
        command: ["supervisord", "-c", "/etc/supervisord.conf"]
        imagePullPolicy: Always
        securityContext:
          runAsUser: 1025
          ## by default the privilege will be disabled
          privileged: false
          ## allow the use of capabilities
          allowPrivilegeEscalation: true
          capabilities:
            ## drop all capabilities
            drop:
            - ALL
            ## The creation of the statefulset should fail as SUID is not in allowedCapabilities
            add:
            - SUID
Here we are trying to add a capability via the Kubernetes SecurityContext which has not been allowed in the Pod Security Policy, so let's try to create this StatefulSet:
~]# kubectl create -f test-statefulset.yaml
statefulset.apps/test-statefulset created
The statefulset has been successfully created but the pods have not come up:
~]# kubectl get statefulset -n testns
NAME               READY   AGE
test-statefulset   0/2     12m
We can use the kubectl describe statefulset test-statefulset -n testns command to troubleshoot the issue:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 10s (x6 over 12s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added]
Warning FailedCreate 7s (x2 over 12s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed]
Warning FailedCreate 2s (x4 over 12s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "SUID": capability may not be added]
As expected, since the SUID capability was not allowed by the PodSecurityPolicy, the StatefulSet failed to create its pods.
Now I have modified my StatefulSet definition file to use the CHOWN capability instead of SUID, and the pod creation was successful.
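For reference, the relevant change is simply swapping the added capability; a sketch of the modified securityContext fragment (the same fragment is reused in Example-2 below):
...
        securityContext:
          runAsUser: 1025
          privileged: false
          allowPrivilegeEscalation: true
          capabilities:
            drop:
            - ALL
            ## CHOWN is listed in the PSP's allowedCapabilities, so admission succeeds
            add:
            - CHOWN
...
We can verify the capabilities applied inside the running container using the following command: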
~]# kubectl exec -it test-statefulset-0 -n testns -- capsh --print
Current: = cap_chown+i
Bounding set =cap_chown
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
uid=1025(user2) gid=1025(user2) groups=
Example-2: Using defaultAddCapabilities in PodSecurityPolicy
Next we will update our Pod Security Policy to also add some defaultAddCapabilities. All capabilities listed under the defaultAddCapabilities field will be added to every deployed pod's containers. If a user doesn't want certain containers to have those capabilities, they need to explicitly drop them in the specs of those containers, as shown in the sketch after this paragraph.
I have modified my testns-psp-01 using the kubectl edit psp testns-psp-01 -n testns command and added the defaultAddCapabilities field with a new capability:
...
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - SYS_ADMIN
  - NET_BIND_SERVICE
  - CHOWN
  defaultAddCapabilities:
  - NET_RAW
  fsGroup:
    rule: RunAsAny
  requiredDropCapabilities:
  - ALL
...
So, we have marked NET_RAW as a default capability that will be added to any container using this Pod Security Policy.
Let us quickly create a StatefulSet using our previous definition file and verify whether the NET_RAW capability is automatically added to the container:
...
        securityContext:
          runAsUser: 1025
          ## by default the privilege will be disabled
          privileged: false
          ## allow the use of capabilities
          allowPrivilegeEscalation: true
          capabilities:
            ## drop all capabilities
            drop:
            - ALL
            ## Add chown capability
            add:
            - CHOWN
...
Let's create this statefulset and verify the list of allowed capabilities:
~]# kubectl exec -it test-statefulset-0 -n testns -- capsh --print
Current: = cap_chown,cap_net_raw+i
Bounding set =cap_chown,cap_net_raw
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
uid=1025(user2) gid=1025(user2) groups=
As expected, although we only added the CHOWN capability, our pod also has the NET_RAW capability, which was added as part of defaultAddCapabilities from the Pod Security Policy.
Example-3: Using requiredDropCapabilities in Pod Security Policy
The final field in this example is requiredDropCapabilities. The capabilities listed in this field are dropped automatically from every container (the PodSecurityPolicy admission plugin adds them to every container's securityContext.capabilities.drop field).
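One way to observe this mutation is to read the capabilities back from the admitted pod spec; a sketch assuming the StatefulSet pod from this example is running:
~]# kubectl get pod test-statefulset-0 -n testns \
    -o jsonpath='{.spec.containers[0].securityContext.capabilities}{"\n"}'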
We have updated our StatefulSet definition file, and now we are not dropping or adding any capability explicitly:
...
        securityContext:
          runAsUser: 1025
          ## by default the privilege will be disabled
          privileged: false
          ## allow the use of capabilities
          allowPrivilegeEscalation: true
...
We have also updated our PodSecurityPolicy using kubectl edit psp testns-psp-01 -n testns and added SYS_ADMIN under requiredDropCapabilities:
...
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - NET_BIND_SERVICE
  - CHOWN
  defaultAddCapabilities:
  - NET_RAW
  fsGroup:
    rule: RunAsAny
  requiredDropCapabilities:
  - SYS_ADMIN
...
Next, we deploy our statefulset and verify the applied Linux capabilities:
~]# kubectl exec -it test-statefulset-0 -n testns -- capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+i
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
uid=1025(user2) gid=1025(user2) groups=
Here you can see that the SYS_ADMIN capability is not available, as it was removed using requiredDropCapabilities.
What would happen if you explicitly try to add a dropped capability in SecurityContext?
Here we have dropped the SYS_ADMIN capability using requiredDropCapabilities in the Pod Security Policy. Now what happens if we explicitly try to add the same capability to our StatefulSet using the Kubernetes SecurityContext field, as shown below:
...
        securityContext:
          runAsUser: 1025
          ## by default the privilege will be disabled
          privileged: false
          ## allow the use of capabilities
          allowPrivilegeEscalation: true
          capabilities:
            add:
            - SYS_ADMIN
...
Next, let's try to create this StatefulSet:
~]# kubectl create -f test-statefulset.yaml
statefulset.apps/test-statefulset created
Now let's check the events of this statefulset for any errors:
~]# kubectl describe statefulset test-statefulset -n testns
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 5s statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added]
Warning FailedCreate 5s (x3 over 5s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed]
Warning FailedCreate 3s (x6 over 5s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added]
Hence, if a user tries to create a pod that explicitly adds one of the capabilities listed in the policy's requiredDropCapabilities field, the pod is rejected.
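As a closing tip, when a pod is admitted you can check which Pod Security Policy validated it through the kubernetes.io/psp annotation; a sketch assuming one of the successfully created pods from the earlier examples:
~]# kubectl get pod test-statefulset-0 -n testns \
    -o jsonpath='{.metadata.annotations.kubernetes\.io/psp}{"\n"}'
# expected to print the name of the admitting policy, e.g. testns-psp-01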
Summary
In this tutorial we explored the different Kubernetes SecurityContext settings that we can use for a Pod and its containers. In a nutshell, we covered the following topics:
- Containers can be configured to run as a different user and/or group than the one defined in the container image.
- Containers can also run in privileged mode, allowing them to access the node’s devices that are otherwise not exposed to pods.
- Containers can be run as read-only, preventing processes from writing to the container’s filesystem (and only allowing them to write to mounted volumes).
- Cluster-level PodSecurityPolicy resources can be created to prevent users from creating pods that could compromise a node.
Further Readings
Configure a Security Context for a Pod or Container