Overview of Pod Security Policy in Kubernetes
PSP is the abbreviation for Pod Security Policy in Kubernetes. A PSP is a cluster-scoped resource that checks a set of conditions before a pod is admitted and scheduled to run in a cluster. This is achieved via a Kubernetes admission controller, which evaluates every pod creation request for compliance with the PSP assigned to the pod.
PSP allows you to control:
- Running of privileged containers
- Usage of host namespaces
- Usage of host networking and ports
The following table explains the different fields used with a PSP:
Field | Usage |
---|---|
privileged | Allow containers to run in privileged mode, which grants capabilities such as access to host mounts, the ability to change filesystem settings, and many more. |
hostPID, hostIPC | The container shares the host's process ID and IPC namespaces, making host processes visible to the container. |
hostNetwork, hostPorts | The container has access to the host network and ports. |
volumes | Allow volume types such as configMap, emptyDir, or secret. |
allowedHostPaths | Whitelist of host paths that hostPath volumes may use, e.g. /tmp. |
allowedFlexVolumes | Allow specific FlexVolume drivers, e.g. azure/kv. |
fsGroup | Set a GID or range of GIDs that own the pod's volumes. |
readOnlyRootFilesystem | Mount the container's root filesystem as read-only. |
runAsUser, runAsGroup, supplementalGroups | Define the container's UID and GIDs; here you can require non-root users or groups. |
allowPrivilegeEscalation, defaultAllowPrivilegeEscalation | Restrict privilege escalation by a process. |
defaultAddCapabilities, requiredDropCapabilities, allowedCapabilities | Add or drop Linux capabilities as needed. |
seLinux | Define the SELinux context of the container. |
allowedProcMountTypes | Define the allowed proc mount types for the container. |
forbiddenSysctls, allowedUnsafeSysctls | Control which sysctls the pod may set. |
annotations | AppArmor and seccomp profiles used by containers. |
Getting started with Kubernetes Pod Security Policy
In this section we will look at different Pod Security Policy examples and understand the fields used in the definition YAML file:
Example-1: Restrict hostIPC, hostPID, hostNetwork and hostPorts using PSP
The following listing shows a sample PodSecurityPolicy, which prevents pods from using the host’s IPC, PID, and network namespaces, prevents running privileged containers, and blocks the use of most host ports (except ports 10000-11000 and 13000-14000). The policy doesn’t set any constraints on what users, groups, or SELinux contexts the container can run as.
```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: default
spec:
  # Containers are not allowed to use the host's IPC, PID, or network namespace
  hostIPC: false
  hostPID: false
  hostNetwork: false
  # Containers can only bind to host ports 10000-11000 or 13000-14000
  hostPorts:
  - min: 10000
    max: 11000
  - min: 13000
    max: 14000
  # Containers cannot run in privileged mode
  privileged: false
  # Containers must run with a read-only root filesystem
  readOnlyRootFilesystem: true
  # Containers can run as any user and any group
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  # Containers can use any SELinux context
  seLinux:
    rule: RunAsAny
  # All volume types can be used
  volumes:
  - '*'
```
Here,
- Containers aren’t allowed to use the host’s IPC, PID, or network namespace.
- They can only bind to host ports 10000 to 11000 (inclusive) or host ports 13000 to 14000.
- Containers cannot run in privileged mode.
- Containers are forced to run with a read-only root filesystem.
- Containers can run as any user and any group.
- They can also use any SELinux groups they want.
- All volume types can be used in pods.
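To see the policy in action, a pod that requests a host namespace would be rejected at admission, assuming this is the only policy that applies to it. A hypothetical example (pod and container names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: host-net-pod   # hypothetical name
spec:
  hostNetwork: true    # violates hostNetwork: false in the policy above
  containers:
  - name: main
    image: alpine:latest
    command: ["sleep", "3600"]
```

Submitting this pod would be denied by the admission controller with an "unable to validate against any pod security policy" style error.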
Example-2: Restrict runAsUser, fsGroup, and supplementalGroups using PSP
The policy in the previous example doesn’t impose any limits on which users and groups containers can run as, because it uses the RunAsAny rule for the runAsUser, fsGroup, and supplementalGroups fields. If you want to constrain the list of allowed user or group IDs, change the rule to MustRunAs and specify the range of allowed IDs.
```yaml
runAsUser:
  rule: MustRunAs
  ranges:
  # A single range with min equal to max allows one specific ID
  - min: 2
    max: 2
fsGroup:
  rule: MustRunAs
  ranges:
  # Multiple ranges are supported; IDs can be 2-10 or 20-30
  - min: 2
    max: 10
  - min: 20
    max: 30
supplementalGroups:
  rule: MustRunAs
  ranges:
  - min: 2
    max: 10
  - min: 20
    max: 30
```
Here,
- Add a single range with min equal to max to set one specific ID.
- Multiple ranges are supported—here, group IDs can be 2–10 or 20–30 (inclusive).
If the pod spec tries to set either of those fields to a value outside these ranges, the pod will not be accepted by the API server.
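For example, a pod that complies with these rules might declare its IDs explicitly in its security context (pod and container names here are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: compliant-pod        # hypothetical name
spec:
  securityContext:
    runAsUser: 2             # must be exactly 2 under the runAsUser rule above
    fsGroup: 25              # inside the allowed 20-30 range
    supplementalGroups: [5]  # inside the allowed 2-10 range
  containers:
  - name: main
    image: alpine:latest
    command: ["sleep", "3600"]
```

If runAsUser were set to, say, 3, the API server would reject the pod.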
Example-3: Restrict allowed, default, and disallowed capabilities using PSP
As you have seen, you can control whether containers run in privileged mode, and you can define a more fine-grained permission configuration by adding or dropping Linux kernel capabilities in each container. Three fields influence which capabilities containers can or cannot use:
- The allowedCapabilities field specifies which capabilities pod authors can add in the securityContext.capabilities field of the container spec.
- The defaultAddCapabilities field lists capabilities that are added to every container by default.
- The requiredDropCapabilities field contains the list of capabilities that are dropped automatically from every container.
Here, we specify certain capabilities in the PodSecurityPolicy; you can check the capabilities(7) man page for the complete list of capabilities you can use with a PSP:
```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
spec:
  # Allow containers to add the SYS_TIME capability
  allowedCapabilities:
  - SYS_TIME
  # Automatically add the CHOWN capability to every container
  defaultAddCapabilities:
  - CHOWN
  # Require containers to drop SYS_ADMIN and SYS_MODULE
  requiredDropCapabilities:
  - SYS_ADMIN
  - SYS_MODULE
...
```
Here,
- Allow containers to add the SYS_TIME capability.
- Automatically add the CHOWN capability to every container.
- Require containers to drop the SYS_ADMIN and SYS_MODULE capabilities.
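Under such a policy, a container can opt in to the allowed SYS_TIME capability through its own security context; a minimal sketch (names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: set-time-pod         # hypothetical name
spec:
  containers:
  - name: main
    image: alpine:latest
    command: ["sleep", "3600"]
    securityContext:
      capabilities:
        add:
        - SYS_TIME           # permitted because it appears in allowedCapabilities
```

CHOWN would be added to this container automatically via defaultAddCapabilities, while SYS_ADMIN and SYS_MODULE are dropped from every container.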
Example-4: Restricting the types of volumes which any pod can use
The last thing a PodSecurityPolicy resource can do is define which volume types users can add to their pods. At the minimum, a PodSecurityPolicy should allow using at least the emptyDir
, configMap
, secret
, downwardAPI
, and the persistentVolumeClaim
volumes.
```yaml
kind: PodSecurityPolicy
spec:
  volumes:
  - emptyDir
  - configMap
  - secret
  - downwardAPI
  - persistentVolumeClaim
```
If multiple PodSecurityPolicy resources are in place, pods can use any volume type defined in any of the policies (the union of all volumes lists is used).
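For example, if a second policy (hypothetical name hostpath-psp) also applied to a pod and allowed only hostPath volumes, that pod could use hostPath in addition to the five types listed above:

```yaml
kind: PodSecurityPolicy
metadata:
  name: hostpath-psp   # hypothetical second policy
spec:
  volumes:
  - hostPath
```

The effective allowed set is the union: emptyDir, configMap, secret, downwardAPI, persistentVolumeClaim, and hostPath.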
Lab Environment
I have already deployed a multi-node Kubernetes cluster in my previous tutorial, so I will use the same setup for the demonstration.
Workflow to create Pod Security Policy
Follow this workflow to create a Pod Security Policy in Kubernetes:
- Create a PSP.
- Create a ClusterRole with the 'use' verb, which authorizes pod deployment controllers to use the policies.
- Create a ClusterRoleBinding, which enforces the policy for groups (e.g. system:authenticated or system:unauthenticated) or Service Accounts (SA).
Step-1: Create Pod Security Policy
In this section we will go ahead and create our first Pod Security Policy on the Kubernetes cluster. Following is the content of my restricted-psp.yaml file:
```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-psp
spec:
  # Required to prevent escalations to root.
  privileged: false
  # This is redundant with non-root + disallow privilege escalation,
  # but we can provide it for defense in depth.
  allowPrivilegeEscalation: false
  # Drop all capabilities
  requiredDropCapabilities:
  - ALL
  # Allow core volume types.
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  # Assume that persistentVolumes set up by the cluster admin are safe to use.
  - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  seLinux:
    # This policy assumes the nodes are using AppArmor rather than SELinux.
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
  readOnlyRootFilesystem: false
```
I have added comments on each field to explain its purpose. Let us go ahead and create this PSP:
]# kubectl create -f restricted-psp.yaml
podsecuritypolicy.policy/restricted-psp created
List the created pod security policy:
]# kubectl get psp | grep -E 'PRIV|restricted-psp'
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
restricted-psp false RunAsAny MustRunAsNonRoot MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim
So our PSP has been successfully created.
Step-2: Create Cluster Role
Next we will create the ClusterRole that grants access to use the desired policies. Here is the content of my ClusterRole YAML file restricted-psp-role.yaml:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: restricted-psp
rules:
- apiGroups:
  # Use the 'policy' apiGroup for the PodSecurityPolicy resource
  - policy
  resourceNames:
  # Name of the Pod Security Policy; you can add more than one
  - restricted-psp
  resources:
  # Resource name for PodSecurityPolicy
  - podsecuritypolicies
  verbs:
  # Provide access to 'use'
  - use
```
Create this Cluster Role:
]# kubectl create -f restricted-psp-role.yaml
clusterrole.rbac.authorization.k8s.io/restricted-psp created
List the role which we just created:
]# kubectl get clusterrole | grep restricted-psp
restricted-psp 2021-09-03T05:12:07Z
Step-3: Create Cluster Role Binding
Next we need to bind the ClusterRole using a ClusterRoleBinding to grant its usage to pods. Here is my sample file content from restricted-psp-role-bind.yaml:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: restricted-binding-psp
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  # Name of the cluster role to bind
  name: restricted-psp
subjects:
# Authorize all service accounts in all namespaces
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  # You may restrict to a namespace using system:serviceaccounts:<authorized namespace>
  name: system:authenticated
```
Let us create this Cluster Role Binding:
]# kubectl create -f restricted-psp-role-bind.yaml
clusterrolebinding.rbac.authorization.k8s.io/restricted-binding-psp created
List the ClusterRoleBinding which we created above:
]# kubectl get clusterrolebinding | grep -i restricted-binding
restricted-binding-psp ClusterRole/restricted-psp 2m28s
Now let us go ahead and cover different examples of creating Kubernetes cluster resources with a security context so that pods start with limited privileges and capabilities.
Step-4: Verify Pod Security Policy using StatefulSet
Create StatefulSet
In this example I will try to create a non-privileged pod that starts as the 'root' user:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-statefulset
  namespace: deepak
spec:
  selector:
    matchLabels:
      app: dev
  serviceName: test-pod
  replicas: 2
  template:
    metadata:
      labels:
        app: dev
    spec:
      containers:
      - name: test-statefulset
        image: alpine:latest
        command: ["sleep", "10000"]
        securityContext:
          # Run the pod as root user
          runAsUser: 0
          # Pod will start with no privilege
          privileged: false
          # Allow privilege escalation
          allowPrivilegeEscalation: true
          # Privilege escalation is allowed but first drop all capabilities
          capabilities:
            drop:
            - ALL
            # Allow only the NET_BIND_SERVICE capability
            add:
            - NET_BIND_SERVICE
```
Let's try to create this statefulset:
]# kubectl create -f test-statefulset.yaml
statefulset.apps/test-statefulset created
If you check the status of the statefulset, none of the replica pods are created:
]# kubectl get statefulset -n deepak
NAME READY AGE
test-statefulset 0/2 28s
Troubleshoot "unable to validate against any pod security policy" Errors
So this would mean that something has failed. We can use kubectl describe to get more details:
]# kubectl describe statefulset test-statefulset -n deepak
....Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 44s (x2 over 44s) statefulset-controller create Pod test-statefulset-0 in StatefulSet test-statefulset failed error: pods "test-statefulset-0" is forbidden: unable to validate against any pod security policy: [spec.containers[0].securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden spec.containers[0].securityContext.capabilities.add: Invalid value: "NET_BIND_SERVICE": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden spec.containers[0].securityContext.capabilities.add: Invalid value: "NET_BIND_SERVICE": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden spec.containers[0].securityContext.capabilities.add: Invalid value: "NET_BIND_SERVICE": capability may not be added spec.containers[0].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[0].securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden]
Why are we getting "unable to validate against any pod security policy"?
Here, our pod creation has failed because, if you remember, our restricted-psp does not allow pods to run as root. Check the PSP again:
]# kubectl get psp | grep -E 'PRIV|restricted-psp'
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
restricted-psp false RunAsAny MustRunAsNonRoot MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim
The RUNASUSER field contains MustRunAsNonRoot, while we were trying to run as the root user, hence the command above failed to create the pods.
To overcome this, either:
- start the pod as a non-root user, or
- modify the Pod Security Policy to set the runAsUser rule to RunAsAny
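A minimal sketch of the first option: change the runAsUser value in the StatefulSet's container securityContext to a non-root UID (1000 here is an assumed example). Note that the error messages above also flag allowPrivilegeEscalation and the capability addition, so depending on the policies in effect those may need fixing as well:

```yaml
securityContext:
  # Any non-zero UID satisfies MustRunAsNonRoot; 1000 is an example
  runAsUser: 1000
  privileged: false
  # Avoids the "privilege escalation ... not allowed" violation
  allowPrivilegeEscalation: false
  capabilities:
    drop:
    - ALL
```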
Delete the existing statefulset as it is in failed state:
]# kubectl delete statefulset test-statefulset -n deepak
statefulset.apps "test-statefulset" deleted
Re-create this statefulset after fixing the runAsUser field to use a non-root user:
]# kubectl create -f test-statefulset.yaml
statefulset.apps/test-statefulset created
Verify StatefulSet Status
Verify the statefulset status:
]# kubectl get statefulset -n deepak
NAME READY AGE
test-statefulset 2/2 2m1s
Verify Applied PodSecurityPolicy to the Pod
Verify the PSP applied to our Pods:
]# kubectl describe pod test-statefulset-0 -n deepak | grep psp
kubernetes.io/psp: restricted-psp
So, both our pods are in running state. Let us connect to one of the pods:
[root@ncs20fp1-w2-egress-control-02 hardening]# kubectl exec -it test-statefulset-0 -n deepak -- bash
List applied Capabilities to the container
List the applied capabilities to this Pod
[sdl@test-statefulset-0 /]$ capsh --print
Since we dropped all the capabilities from the container and added NET_BIND_SERVICE in our StatefulSet YAML file, only that one capability is applied to the pod.
Limitation of Pod Security Policy
- PodSecurityPolicySpec has references to allowedCapabilities, privileged, and hostNetwork. These enforcements can only work on Linux-based runtimes.
- If you are creating a pod using controllers (e.g. replication controller), it’s worth checking if PSPs are authorized for use by those controllers.
- Once PSPs are enabled cluster-wide and a pod doesn’t start because of an incorrect PSP, troubleshooting the issue becomes tedious. Moreover, if PSPs are enabled cluster-wide in production clusters, you need to test every component in your cluster, including dependencies like mutating admission controllers, and watch for conflicting verdicts.
- Azure Kubernetes Service (AKS) has deprecated support for PSPs in favor of OPA Gatekeeper, which supports more flexible policies using the OPA engine.
- PSPs are deprecated and scheduled to be removed in Kubernetes 1.25.
- Kubernetes can have edge cases where PSPs can be bypassed.
Summary
Pod Security Policy is quite a vast topic where you can control different areas such as privileges, namespaces, networking, and ports. In this tutorial we covered the privileged pod section with an example demonstrating how pod creation can fail when the pod doesn't match the PodSecurityPolicy fields.