Configure Pod storage with Kubernetes Persistent Volume (PV)



In our previous tutorial we already covered some of the Kubernetes volume types which can act as persistent storage, so you may be a little confused about why a separate PersistentVolume resource is needed.

For example, we can use NFS as a persistent volume, but to create an NFS-backed volume the developer has to know the actual server the NFS export is located on. This goes against the basic idea of Kubernetes, which aims to hide the actual infrastructure from both the application and its developer, leaving them free from worrying about infrastructure specifics and making apps portable across a wide array of cloud providers and on-premises datacenters.

With Kubernetes Persistent Volumes, when a developer needs a certain amount of persistent storage for their application, they can request it from Kubernetes the same way they can request CPU, memory, and other resources when creating a pod.

 

How Kubernetes Persistent Volume and Persistent Volume Claim works

To enable apps to request storage in a Kubernetes cluster without having to deal with infrastructure specifics, two new resources were introduced: PersistentVolumes and PersistentVolumeClaims.


Let us walk through, step by step, how a Persistent Volume can be configured and used:

  • The cluster administrator sets up the underlying storage and then registers it in Kubernetes by creating a PersistentVolume resource through the Kubernetes API server. In this example we use an NFS-backed solution.
  • When creating the PersistentVolume, the admin specifies its size and the access modes it supports.
  • When a cluster user needs to use persistent storage in one of their pods, they first create a PersistentVolumeClaim manifest, specifying the minimum size and the access mode they require.
  • The user then submits the PersistentVolumeClaim manifest to the Kubernetes API server, and Kubernetes finds the appropriate PersistentVolume and binds the volume to the claim.
  • The PersistentVolumeClaim can then be used as one of the volumes inside a pod.
  • Other users cannot use the same PersistentVolume until it has been released by deleting the bound PersistentVolumeClaim.

 

1. Create Persistent Volume

We will use our existing NFS server which we had created in our previous article while learning about Kubernetes Volumes. If you don't have one, you can configure your own NFS server, or if you are working in the cloud you can use a GCE Persistent Disk or AWS EBS volume instead.

Before we start with the YAML file, we must know the kind and apiVersion needed to create a Persistent Volume. We have covered this topic in depth in most of the tutorials of this series. We can use kubectl api-resources to get the KIND value:

[root@controller ~]# kubectl api-resources | grep -iE 'KIND|persistent'
NAME                              SHORTNAMES   APIGROUP                       NAMESPACED   KIND
persistentvolumeclaims            pvc                                         true         PersistentVolumeClaim
persistentvolumes                 pv                                          false        PersistentVolume

Here we have the KIND value for both Persistent Volume and Persistent Volume Claim, which we will use later in this tutorial. Next, we can use this value to get the corresponding apiVersion:

[root@controller ~]# kubectl explain PersistentVolume | head -n 2
KIND:     PersistentVolume
VERSION:  v1

Now that we have both the mandatory params, we can start writing our YAML to create persistent volume:

[root@controller ~]# cat persistent-volume.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-share-pv
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
   - ReadWriteMany
   - ReadOnlyMany
  persistentVolumeReclaimPolicy: Recycle
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /nfs_share
    server: 192.168.43.48

Let's cover the spec here. There are six parts: capacity, volume mode, access modes, reclaim policy, mount options, and the volume type (nfs in this example).

 

1.1 Capacity

Each volume has a designated amount of storage. A storage claim may be satisfied by any persistent volume that has at least that amount of storage. In our example, the persistent volume has a capacity of 1 gibibyte (a single gibibyte (GiB) is 2 to the power of 30 bytes).

You can refer to the following table for more detail on the units:

Name       Bytes     Suffix     Name       Bytes     Suffix
kilobyte   1000      K          kibibyte   1024      Ki
megabyte   1000^2    M          mebibyte   1024^2    Mi
gigabyte   1000^3    G          gibibyte   1024^3    Gi
terabyte   1000^4    T          tebibyte   1024^4    Ti
petabyte   1000^5    P          pebibyte   1024^5    Pi
exabyte    1000^6    E          exbibyte   1024^6    Ei
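
For illustration, all of the following are valid quantity notations for spec.capacity.storage in a PersistentVolume (the values themselves are arbitrary examples, not part of our NFS setup):

capacity:
  storage: 1Gi     # 1 gibibyte = 1024^3 bytes (what we use in this tutorial)
# or
capacity:
  storage: 1G      # 1 gigabyte = 1000^3 bytes, slightly smaller than 1Gi
# or
capacity:
  storage: 500Mi   # 500 mebibytes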

 

1.2 Volume Mode

Here you can specify whether you want a filesystem ("Filesystem") or raw storage ("Block"). If you don't specify a volume mode, the default is "Filesystem".
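
As a rough sketch (not part of our NFS example; the name and size are placeholders), a raw block PersistentVolume carries volumeMode: Block, and a pod attaches it through volumeDevices instead of volumeMounts:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: block-pv                  # hypothetical name
spec:
  capacity:
    storage: 10Gi
  volumeMode: Block               # expose a raw device, no filesystem
  accessModes:
    - ReadWriteOnce
  # backing storage omitted; it could be a local disk, iSCSI LUN, or cloud disk

In the consuming pod, the container would declare the device instead of a mount path:

  volumeDevices:
  - name: data
    devicePath: /dev/xvda         # device node the container will see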

 

1.3 Access Modes

There are three access modes:

  • ReadWriteOnce (RWO): only a single node can mount the volume for reading and writing.
  • ReadOnlyMany (ROX): multiple nodes can mount the volume for reading.
  • ReadWriteMany (RWX): multiple nodes can mount the volume for both reading and writing.
NOTE:
The storage is mounted to nodes, so even with ReadWriteOnce, multiple pods on the same node can mount the volume and write to it.
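
If you want to see this yourself, a minimal sketch would be to pin two pods onto the same node (nodeName is used here only to force co-location; the pod and claim names are hypothetical) and mount the same ReadWriteOnce claim in both; each pod can read and write the volume:

apiVersion: v1
kind: Pod
metadata:
  name: writer-1                      # hypothetical
spec:
  nodeName: worker-1.example.com      # pin to one node
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: some-rwo-claim       # hypothetical RWO claim

A second pod (say writer-2) with the same nodeName and claimName would mount the volume read-write as well, because the restriction is per node, not per pod.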

 

1.4 Reclaim Policy

The reclaim policy determines what happens when a persistent volume claim is deleted. There are three different policies:

  • Retain: The volume will need to be reclaimed manually
  • Delete: The associated storage asset, such as AWS EBS, GCE PD, Azure disk, or OpenStack Cinder volume, is deleted
  • Recycle: Delete content only (rm -rf /volume/*)

The Retain and Delete policies mean the persistent volume is not available anymore for future claims. The Recycle policy allows the volume to be claimed again.

Currently, only NFS and HostPath support recycling. AWS EBS, GCE PD, Azure disks, and Cinder volumes support deletion. Dynamically provisioned volumes are always deleted.
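
You can also change the reclaim policy of an existing PersistentVolume after it has been created. For example, to switch our NFS volume from Recycle to Retain, something like the following should work (adjust the PV name to your environment):

kubectl patch pv nfs-share-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'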

 

1.5 Volume Type

The volume type is specified by name in the spec. There is no volumeType section. In our example, NFS is the volume type. Each volume type may have its own set of parameters. In this case, it's a path and server.

 

1.6 Mount Options

Some persistent volume types have additional mount options you can specify. Mount options are not validated, so if you provide an invalid mount option, the mount will fail.

Let's now create our Persistent Volume:

[root@controller ~]# kubectl create -f persistent-volume.yml
persistentvolume/nfs-share-pv created

 

1.7 Listing PersistentVolume

A PV is a cluster-wide resource, like a node. We can use kubectl get pv to see the currently provisioned PVs. Here, pv is used as a shorthand for persistentvolume.

[root@controller ~]# kubectl get pv
NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
nfs-share-pv   1Gi        ROX,RWX        Recycle          Available                                   18s

As expected, the PersistentVolume is shown as Available, because we haven’t yet created the PersistentVolumeClaim.

NOTE:
PersistentVolumes don't belong to any namespace. They're cluster-level resources like nodes.

 

2. Claiming a PersistentVolume by creating a PersistentVolumeClaim

Say you need to deploy a pod that requires persistent storage. You’ll use the PersistentVolume you created earlier. But you can’t use it directly in the pod. You need to claim it first.

Claiming a PersistentVolume is a completely separate process from creating a pod, because you want the same PersistentVolumeClaim to stay available even if the pod is rescheduled (remember, rescheduling means the previous pod is deleted and a new one is created).

 

2.1 Creating a PersistentVolumeClaim

We have got the KIND value for Persistent Volume Claim, so let's also get the apiVersion value:

[root@controller ~]# kubectl explain PersistentVolumeClaim | head -n 2
KIND:     PersistentVolumeClaim
VERSION:  v1

Following is my YAML file to create the PVC for nfs-share-pv:

[root@controller ~]# cat persistent-volume-claim.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-share-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 500Mi
  storageClassName: ""

As soon as you create the claim, Kubernetes finds the appropriate PersistentVolume and binds it to the claim. The PersistentVolume’s capacity must be large enough to accommodate what the claim requests.

Here the accessModes field is used to match this claim with a suitable PersistentVolume. In our case, the claim requests 500 MiB of storage and the ReadWriteMany access mode. The PersistentVolume we created earlier satisfies both requirements, so it will be bound to our claim.
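
If you want to bypass the matching logic entirely and bind the claim to one specific PersistentVolume, you can also set spec.volumeName in the claim. A minimal sketch, reusing the PV name from this tutorial:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-share-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 500Mi
  storageClassName: ""
  volumeName: nfs-share-pv    # request binding to this exact pre-provisioned PV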

 

2.2 Listing PersistentVolumeClaims

List all PersistentVolumeClaims to see the state of your PVC. Here, pvc is used as a shorthand for persistentvolumeclaim:

[root@controller ~]# kubectl get pvc
NAME            STATUS   VOLUME         CAPACITY   ACCESS MODES   STORAGECLASS   AGE
nfs-share-pvc   Bound    nfs-share-pv   1Gi        ROX,RWX                       4m31s

You can also see that the PersistentVolume is now Bound and no longer Available by inspecting it with kubectl get:

[root@controller ~]# kubectl get pv
NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                   STORAGECLASS   REASON   AGE
nfs-share-pv   1Gi        ROX,RWX        Recycle          Bound    default/nfs-share-pvc                           7m26s

The PersistentVolume shows it’s bound to claim default/nfs-share-pvc. The default part is the namespace the claim resides in (we created the claim in the default namespace). I already mentioned that PersistentVolume resources are cluster-scoped and thus cannot be created in a specific namespace, but PersistentVolumeClaims can only be created in a specific namespace. They can then only be used by pods in the same namespace.
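
So if you wanted the claim in some namespace other than default, you would create it there explicitly, for example (the namespace name here is just an illustration):

kubectl create namespace my-app
kubectl create -f persistent-volume-claim.yml -n my-app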

To get more details on the PVC we created, we can use the kubectl describe command:

[root@controller ~]# kubectl describe pvc nfs-share-pvc
Name:          nfs-share-pvc
Namespace:     default
StorageClass:
Status:        Bound
Volume:        nfs-share-pv
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  ROX,RWX
VolumeMode:    Filesystem
Mounted By:    <none>
Events:        <none>

 

3. Using a PersistentVolumeClaim in a pod

The PersistentVolume is now yours to use. Nobody else can claim the same volume until you release it. To use it inside a pod, you need to reference the PersistentVolumeClaim by name inside the pod’s volume (Make sure you read that right, the PersistentVolumeClaim, not the PersistentVolume directly!)

Here I am creating an nginx container to use the PV storage:

[root@controller ~]# cat nfs-share-pod.yml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nfs-share
spec:
  containers:
  - image: nginx
    name: pv-container
    ports:
     - containerPort: 80
       name: "http-server"
    volumeMounts:
    - name: data
      mountPath: /var/www
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: nfs-share-pvc

As you can see, we are referencing the PersistentVolumeClaim (nfs-share-pvc) which we created earlier.

Next we will go ahead and create this Pod:

[root@controller ~]# kubectl create -f nfs-share-pod.yml
pod/pod-nfs-share created

Make sure the pod has started successfully and the status is Running:

[root@controller ~]# kubectl get pods pod-nfs-share
NAME            READY   STATUS    RESTARTS   AGE
pod-nfs-share   1/1     Running   0          10m

So the pod is running properly. Next we can check the status of the mount inside our container:

[root@controller ~]# kubectl exec -it pod-nfs-share -- df -h /var/www
Filesystem                Size  Used Avail Use% Mounted on
192.168.43.48:/nfs_share   14G  8.6G  4.1G  68% /var/www

As expected, the NFS share is mounted on the provided directory, and we can also find the files under this directory which we had created in our previous tutorial on Kubernetes Volumes:

[root@controller ~]# kubectl exec -it pod-nfs-share -- ls -l /var/www
total 0
-rw-r--r-- 1 root root 0 Jan  7 11:04 someFile.txt
-rw-r--r-- 1 root root 0 Jan  7 11:09 someFile1.txt

 

4. Recycling Persistent Volumes

If you remember, we created our Persistent Volume with persistentVolumeReclaimPolicy: Recycle, which means the data will be deleted once the volume is released. Let us verify this behaviour by deleting our Pod and PVC:

[root@controller ~]# kubectl delete pod pod-nfs-share
pod "pod-nfs-share" deleted

[root@controller ~]# kubectl delete pvc nfs-share-pvc
persistentvolumeclaim "nfs-share-pvc" deleted

So now I don't have any Persistent Volume Claim:

[root@controller ~]# kubectl get pvc
No resources found in default namespace.

Meanwhile, the Persistent Volume status shows as "Released":

[root@controller ~]# kubectl get pv
NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                         STORAGECLASS    REASON   AGE
nfs-share-pv   1Gi        ROX,RWX        Recycle          Released   default/nfs-share-pvc                                  80m

Let's re-create the PVC

[root@controller ~]# kubectl create -f persistent-volume-claim.yml
persistentvolumeclaim/nfs-share-pvc created

The PVC creation was successful even though we know there was data in our NFS share path.

[root@controller ~]# kubectl get pv
NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                   STORAGECLASS   REASON   AGE
nfs-share-pv   1Gi        ROX,RWX        Recycle          Bound    default/nfs-share-pvc                           81m

I will create the same Pod again:

[root@controller ~]# kubectl create -f nfs-share-pod.yml
pod/pod-nfs-share created

The Pod is also created successfully and it is in Running state now:

[root@controller ~]# kubectl get pods pod-nfs-share
NAME            READY   STATUS    RESTARTS   AGE
pod-nfs-share   1/1     Running   0          48s

Let's check if our files are still there in the shared path

[root@controller ~]# kubectl exec -it pod-nfs-share -- ls -l /var/www
total 0

Since the reclaim policy was Recycle, the files on our NFS share were deleted after we deleted the PVC. If we had used Retain as the reclaim policy instead, the files would not have been deleted; once the PVC and Pod are deleted, the PersistentVolume cannot be bound again until the data is manually cleaned up.
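
With Retain, manually reclaiming the volume typically means deleting the released PersistentVolume object, cleaning up the data on the backing storage, and re-creating the PV. A rough sketch of those steps, using the names and paths from this tutorial:

# 1. Delete the released PV object (the data on the NFS export is untouched)
kubectl delete pv nfs-share-pv

# 2. Clean up the data on the NFS server itself (run on 192.168.43.48)
#    rm -rf /nfs_share/*

# 3. Re-create the PV so it can be claimed again
kubectl create -f persistent-volume.yml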

Defining the available storage types through StorageClass resources
The cluster admin, instead of creating PersistentVolumes, can deploy a PersistentVolume provisioner and define one or more StorageClass objects to let users choose what type of PersistentVolume they want. The users can refer to the StorageClass in their PersistentVolumeClaims and the provisioner will take that into account when provisioning the persistent storage.

NOTE:
Similar to PersistentVolumes, StorageClass resources aren't namespaced.
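
Although we cannot demonstrate it in this on-premises lab, a dynamically provisioned setup on a cloud provider would look roughly like this sketch: a StorageClass that names a provisioner (here the in-tree GCE Persistent Disk provisioner) and a claim that references it, after which the provisioner creates a matching PersistentVolume automatically (the names fast and dynamic-claim are illustrative):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: dynamic-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast        # triggers dynamic provisioning of a new PV
  resources:
    requests:
      storage: 5Gi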

 

5. Local Persistent Volumes with StorageClass

The Local Persistent Volumes beta feature in Kubernetes 1.10 makes it possible to leverage local disks in your StatefulSets. You can specify directly-attached local disks as PersistentVolumes, and use them in StatefulSets with the same PersistentVolumeClaim objects that previously only supported remote volume types.

 

5.1 Create StorageClass

Storage classes let an administrator configure your cluster with custom persistent storage. A storage class has a name in its metadata (which a claim references through its storageClassName field), a provisioner, and parameters.

As I don't have access to a cloud environment, I will define a storage class for using local volumes. Local volumes are similar to HostPath, but they persist across pod restarts and node restarts; in that sense, they are considered persistent volumes. You can get the complete list of provisioners from the official Kubernetes StorageClass documentation.

Here is our YAML file to create a storageclass:

[root@controller ~]# cat storage-class.yml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

Create this storage class:

[root@controller ~]# kubectl create -f storage-class.yml
storageclass.storage.k8s.io/local-storage created

Check the status:

[root@controller ~]# kubectl get sc
NAME            PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-storage   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  6m57s

Here, sc is a short abbreviation for storageclass.

 

5.2 Creating a local persistent volume

Now, we can create a persistent volume using the storage class that will persist even after the pod that's using it is terminated:

[root@controller ~]# cat local-pv-sc.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
   - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /tmp/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-2.example.com

You must provide the node hostname in the values section of the nodeAffinity field in the PersistentVolume object. This is how the Kubernetes scheduler understands that this PersistentVolume is tied to a specific node. nodeAffinity is a required field for local PersistentVolumes. Here I intend to create the persistent volume on worker-2.example.com.

When local volumes are manually created like this, the only supported persistentVolumeReclaimPolicy is "Retain". When the PersistentVolume is released from the PersistentVolumeClaim, an administrator must manually clean up and set up the local volume again for reuse.

To get the list of available nodes you can use:

[root@controller ~]# kubectl get nodes
NAME                     STATUS   ROLES    AGE   VERSION
controller.example.com   Ready    master   42d   v1.19.3
worker-1.example.com     Ready    <none>   42d   v1.19.3
worker-2.example.com     Ready    <none>   42d   v1.19.3
IMPORTANT NOTE:
By default, your cluster will not schedule Pods on the control-plane node for security reasons. If you try to create a local volume on the controller node, then you may get an error similar to "FailedScheduling 4s (x2 over 4s) default-scheduler 0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) didn't find available persistent volumes to bind." when you create your Pod.
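
You can check whether that taint is present on a node, and remove it in a lab environment (not recommended for production), with:

kubectl describe node controller.example.com | grep -i taint
kubectl taint nodes controller.example.com node-role.kubernetes.io/master-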

Let's create this Persistent Volume:

[root@controller ~]# kubectl create -f local-pv-sc.yml
persistentvolume/local-pv created

Check the status of the PV:

[root@controller ~]# kubectl get pv
NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                   STORAGECLASS    REASON   AGE
local-pv       2Gi        RWO            Retain           Available                           local-storage            24s
nfs-share-pv   1Gi        ROX,RWX        Recycle          Released    default/nfs-share-pvc                            12h

Currently the status of this PV is Available and it is mapped to the local-storage StorageClass. We can use kubectl describe to get more details on this PV:

[root@controller ~]# kubectl describe pv local-pv
Name:              local-pv
Labels:            <none>
Annotations:       <none>
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      local-storage
Status:            Available
Claim:
Reclaim Policy:    Retain
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          2Gi
Node Affinity:
  Required Terms:
    Term 0:        kubernetes.io/hostname in [worker-2.example.com]
Message:
Source:
    Type:  LocalVolume (a persistent volume backed by local storage on a node)
    Path:  /tmp/data
Events:    <none>

 

5.3 Making Persistent Volume Claims

Next we will create a Persistent Volume Claim to mount the local volume in the respective Pod.

[root@controller ~]# cat local-pvc.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: local-storage-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 1Gi

We will create this PVC:

[root@controller ~]# kubectl create -f local-pvc.yml
persistentvolumeclaim/local-storage-claim created

It is important to note that the persistent volume claim didn't actually claim any storage yet and wasn't bound to our local volume. The claim is pending until some container actually attempts to mount a volume using the claim:

[root@controller ~]# kubectl get pvc
NAME                  STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    AGE
local-storage-claim   Pending                                      local-storage   17s

You can use kubectl describe to get more information on this PVC:

[root@controller ~]# kubectl describe pvc local-storage-claim
Name:          local-storage-claim
Namespace:     default
StorageClass:  local-storage
Status:        Pending
Volume:
Labels:        <none>
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Mounted By:    <none>
Events:
  Type    Reason                Age                  From                         Message
  ----    ------                ----                 ----                         -------
  Normal  WaitForFirstConsumer  1s (x17 over 3m54s)  persistentvolume-controller  waiting for first consumer to be created before binding

Under Events you can see that the claim is waiting for the first consumer before it is bound to the local volume.

 

5.4 Create a Pod

We have provisioned a volume and claimed it. It's time to use the claimed storage in a container. This turns out to be pretty simple. First, the persistent volume claim must be used as a volume in the pod and then the containers in the pod can mount it, just like any other volume. Here is a pod configuration file that specifies the persistent volume claim we created earlier (bound to the local persistent volume we provisioned):

[root@controller ~]# cat local-pv-pod.yml
kind: Pod
apiVersion: v1
metadata:
  name: local-pod
spec:
  containers:
    - name: local-pod
      image: nginx
      ports:
      - containerPort: 80
        name: "httpd-server"
      volumeMounts:
      - mountPath: "/mnt/tmp"
        name: persistent-volume
  volumes:
    - name: persistent-volume
      persistentVolumeClaim:
        claimName: local-storage-claim

The key is the persistentVolumeClaim section under volumes. The claim name (local-storage-claim here) uniquely identifies the specific claim within the current namespace and makes it available as a volume named persistent-volume. The container can then refer to it by that name and mount it at /mnt/tmp.

Let's create this Pod:

[root@controller ~]# kubectl create -f local-pv-pod.yml
pod/local-pod created

Check the status of this pod and make sure it is running:

[root@controller ~]# kubectl get pods local-pod
NAME        READY   STATUS    RESTARTS   AGE
local-pod   1/1     Running   0          21s

Now the claim will be in Bound state assuming everything is successful:

[root@controller ~]# kubectl get pvc
NAME                  STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
local-storage-claim   Bound    local-pv   2Gi        RWO            local-storage   7m47s

The same can be checked for the Persistent Volume:

[root@controller ~]# kubectl get pv
NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                         STORAGECLASS    REASON   AGE
local-pv       2Gi        RWO            Retain           Bound      default/local-storage-claim   local-storage            17m
nfs-share-pv   1Gi        ROX,RWX        Recycle          Released   default/nfs-share-pvc                                  12h
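
To verify the local volume end to end, you can write a file through the pod and then look for it on the node's backing path. A quick check, assuming the /tmp/data directory already exists on worker-2.example.com:

kubectl exec -it local-pod -- touch /mnt/tmp/hello.txt
kubectl exec -it local-pod -- ls -l /mnt/tmp

# and on worker-2.example.com itself:
# ls -l /tmp/data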

 

Conclusion

In this Kubernetes Tutorial we learned about Persistent Volumes and Persistent Volume Claims. Due to lack of infrastructure I could not cover dynamic provisioning in detail, where no Persistent Volume has to be created by hand: the Persistent Volume Claim simply references a storage class and the provisioner creates a volume on demand. That is also why we set storageClassName: "" in our first claim; an empty storage class name prevents the dynamic provisioner from interfering when you want the PersistentVolumeClaim to be bound to a pre-provisioned PersistentVolume.
