In our previous tutorial we already covered some of the Kubernetes volume types which can act as persistent storage, so the distinction here may seem a little confusing.
For example, we can use NFS as a persistent volume, but to create an NFS-backed volume the developer has to know the actual server where the NFS export is located. This goes against the basic idea of Kubernetes, which aims to hide the actual infrastructure from both the application and its developer, freeing them from worrying about infrastructure specifics and making apps portable across a wide array of cloud providers and on-premises datacenters.
With Kubernetes Persistent Volumes, when a developer needs a certain amount of persistent storage for their application, they can request it from Kubernetes the same way they can request CPU, memory, and other resources when creating a pod.
How Kubernetes Persistent Volumes and Persistent Volume Claims work
To enable apps to request storage in a Kubernetes cluster without having to deal with infrastructure specifics, two resources were introduced: PersistentVolumes and PersistentVolumeClaims.
Let us understand, step by step, how a Persistent Volume is configured and used:
- The cluster administrator sets up the underlying storage and then registers it in Kubernetes by creating a PersistentVolume resource through the Kubernetes API server. In this tutorial we use an NFS-backed solution.
- When creating the PersistentVolume, the admin specifies its size and the access modes it supports.
- When a cluster user needs persistent storage in one of their pods, they first create a PersistentVolumeClaim manifest, specifying the minimum size and the access mode they require.
- The user then submits the PersistentVolumeClaim manifest to the Kubernetes API server, and Kubernetes finds an appropriate PersistentVolume and binds the volume to the claim.
- The PersistentVolumeClaim can then be used as one of the volumes inside a pod.
- Other users cannot use the same PersistentVolume until it has been released by deleting the bound PersistentVolumeClaim.
1. Create Persistent Volume
We will use the NFS server we created in our previous article while learning about Kubernetes Volumes. If you don't have one, you can configure your own NFS server, or if you are working in the cloud you can use a GCE Persistent Disk or AWS EBS volume instead.
Before we start with the YAML file, we must know the KIND and apiVersion values needed to create a Persistent Volume. We have covered this topic in depth in earlier tutorials of this series. We can use kubectl api-resources to get the KIND value:
[root@controller ~]# kubectl api-resources | grep -iE 'KIND|persistent'
NAME SHORTNAMES APIGROUP NAMESPACED KIND
persistentvolumeclaims pvc true PersistentVolumeClaim
persistentvolumes pv false PersistentVolume
Here we have the KIND value for both Persistent Volume and Persistent Volume Claim, which we will use later in this tutorial. Next, we can use this value to get the matching apiVersion:
[root@controller ~]# kubectl explain PersistentVolume | head -n 2
KIND: PersistentVolume
VERSION: v1
Now that we have both the mandatory parameters, we can start writing our YAML file to create the persistent volume:
[root@controller ~]# cat persistent-volume.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-share-pv
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Recycle
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /nfs_share
    server: 192.168.43.48
Let's cover the spec here. There are six parts: capacity, volume mode, access modes, reclaim policy, mount options, and the volume type (nfs in this example).
1.1 Capacity
Each volume has a designated amount of storage. Storage claims may be satisfied by persistent volumes that have at least that amount of storage. In the example, the persistent volume has a capacity of 1 gibibyte (a gibibyte, GiB, is 2 to the power of 30 bytes).
You can refer to the following table for a better understanding of the units:
Name | Bytes | Suffix | Name | Bytes | Suffix |
---|---|---|---|---|---|
kilobyte | 1000 | K | kibibyte | 1024 | Ki |
megabyte | 1000² | M | mebibyte | 1024² | Mi |
gigabyte | 1000³ | G | gibibyte | 1024³ | Gi |
terabyte | 1000⁴ | T | tebibyte | 1024⁴ | Ti |
petabyte | 1000⁵ | P | pebibyte | 1024⁵ | Pi |
exabyte | 1000⁶ | E | exbibyte | 1024⁶ | Ei |
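To make the notation concrete, here is a small illustrative fragment (not part of our manifest above); all three values below describe the same amount of storage, only the notation differs:
spec:
  capacity:
    storage: 1536Mi        # 1536 x 1024^2 bytes
    # equivalent ways to write the same quantity:
    # storage: 1610612736  # plain byte count, no suffix
    # storage: 1.5Gi       # 1.5 x 1024^3 bytes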
1.2 Volume Mode
Here you can specify whether you want a filesystem ("Filesystem") or raw storage ("Block"). If you don't specify the volume mode, the default is "Filesystem".
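For illustration only, a PersistentVolume that hands a raw block device to the pod would look something like the sketch below; the device path and node name are placeholders, not part of our setup, and the consuming pod would then use volumeDevices with a devicePath instead of volumeMounts:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: raw-block-pv                    # hypothetical name
spec:
  capacity:
    storage: 10Gi
  volumeMode: Block                     # expose the device as-is, without a filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /dev/sdb                      # placeholder: a spare block device on the node
  nodeAffinity:                         # local volumes must be pinned to a node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1.example.com  # placeholder node name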
1.3 Access Modes
There are three access modes:
- RWO (ReadWriteOnce): only a single node can mount the volume for reading and writing.
- ROX (ReadOnlyMany): multiple nodes can mount the volume for reading.
- RWX (ReadWriteMany): multiple nodes can mount the volume for both reading and writing.
Note that these modes apply to nodes, not pods: even with ReadWriteOnce, multiple pods on the same node can mount the volume and write to it.
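The access mode is expressed on both sides: the PersistentVolume advertises the modes it supports, and the claim requests the mode it needs, so a claim asking for ReadWriteMany can only bind to a PV that lists it. In our example the two fragments line up like this:
# PersistentVolume side (what the NFS share supports)
accessModes:
  - ReadWriteMany
  - ReadOnlyMany

# PersistentVolumeClaim side (what we will request in the claim later)
accessModes:
  - ReadWriteMany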
1.4 Reclaim Policy
The reclaim policy determines what happens when a persistent volume claim is deleted. There are three different policies:
- Retain: the volume will need to be reclaimed manually.
- Delete: the associated storage asset, such as an AWS EBS, GCE PD, Azure Disk, or OpenStack Cinder volume, is deleted.
- Recycle: only the contents are deleted (rm -rf /volume/*).
The Retain and Delete policies mean the persistent volume is not available anymore for future claims. The Recycle policy allows the volume to be claimed again.
Currently, only NFS and HostPath volumes support recycling. AWS EBS, GCE PD, Azure disks, and Cinder volumes support deletion. Dynamically provisioned volumes are always deleted.
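You can also change the reclaim policy of an existing PersistentVolume without recreating it, for example to switch our volume from Recycle to Retain; something along these lines should work:
[root@controller ~]# kubectl patch pv nfs-share-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
[root@controller ~]# kubectl get pv nfs-share-pv -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'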
1.5 Volume Type
The volume type is specified by name in the spec; there is no volumeType section. In our example, nfs is the volume type. Each volume type may have its own set of parameters; in this case, it's a path and a server.
1.6 Mount Options
Some persistent volume types have additional mount options you can specify. Mount options are not validated up front; if you provide an invalid mount option, the mount will fail when the volume is used.
Let's now create our Persistent Volume:
[root@controller ~]# kubectl create -f persistent-volume.yml
persistentvolume/nfs-share-pv created
1.7 Listing PersistentVolume
A PV is a cluster-level resource, like a node. We can use kubectl get pv to see the currently provisioned PVs; here, pv is a shorthand for persistentvolume.
[root@controller ~]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
nfs-share-pv 1Gi ROX,RWX Recycle Available 18s
As expected, the PersistentVolume is shown as Available, because we haven't yet created a PersistentVolumeClaim.
PersistentVolumes don't belong to any namespace; they are cluster-level resources like nodes.
2. Claiming a PersistentVolume by creating a PersistentVolumeClaim
Say you need to deploy a pod that requires persistent storage. You'll use the PersistentVolume you created earlier, but you can't use it directly in the pod; you need to claim it first.
Claiming a PersistentVolume is a completely separate process from creating a pod, because you want the same PersistentVolumeClaim to stay available even if the pod is rescheduled (remember, rescheduling means the previous pod is deleted and a new one is created).
2.1 Creating a PersistentVolumeClaim
We already have the KIND value for Persistent Volume Claim, so let's also get the apiVersion value:
[root@controller ~]# kubectl explain PersistentVolumeClaim | head -n 2
KIND: PersistentVolumeClaim
VERSION: v1
Following is my YAML file to create the PVC for nfs-share-pv:
[root@controller ~]# cat persistent-volume-claim.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-share-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 500Mi
  storageClassName: ""
As soon as you create the claim, Kubernetes finds the appropriate PersistentVolume and binds it to the claim. The PersistentVolume's capacity must be large enough to accommodate what the claim requests. Here the accessModes are used to match this claim with a suitable PersistentVolume. In our case, the claim requests 500 MiB of storage and the ReadWriteMany access mode; the PersistentVolume we created earlier satisfies both requirements, so it will be bound to our claim. Also note the empty storageClassName: setting it explicitly to "" ensures that no dynamic provisioner interferes and the claim is bound to our pre-provisioned PersistentVolume.
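If you want to double-check which PersistentVolume the claim was bound to, the bound volume name is recorded in the claim's spec; for example:
[root@controller ~]# kubectl get pvc nfs-share-pvc -o jsonpath='{.spec.volumeName}'
Once the binding has happened, this should print nfs-share-pv.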
2.2 Listing PersistentVolumeClaims
List all PersistentVolumeClaims to see the state of your PVC; here, pvc is a shorthand for persistentvolumeclaim:
[root@controller ~]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nfs-share-pvc Bound nfs-share-pv 1Gi ROX,RWX 4m31s
You can also see that the PersistentVolume is now Bound and no longer Available by inspecting it with kubectl get:
[root@controller ~]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
nfs-share-pv 1Gi ROX,RWX Recycle Bound default/nfs-share-pvc 7m26s
The PersistentVolume shows it's bound to the claim default/nfs-share-pvc. The default part is the namespace the claim resides in (we created the claim in the default namespace). As mentioned earlier, PersistentVolume resources are cluster-scoped and cannot be created in a specific namespace, whereas PersistentVolumeClaims can only be created in a specific namespace and can then only be used by pods in that same namespace.
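You can verify this scoping difference directly from the API metadata; the NAMESPACED column we saw earlier carries the same information:
[root@controller ~]# kubectl api-resources --namespaced=false | grep -i persistentvolume
[root@controller ~]# kubectl api-resources --namespaced=true | grep -i persistentvolume
The first command lists only persistentvolumes (cluster-scoped), while the second lists only persistentvolumeclaims (namespaced).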
To get more details on the PVC which we created, we can use the kubectl describe command:
[root@controller ~]# kubectl describe pvc nfs-share-pvc
Name: nfs-share-pvc
Namespace: default
StorageClass:
Status: Bound
Volume: nfs-share-pv
Labels: <none>
Annotations: pv.kubernetes.io/bind-completed: yes
             pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 1Gi
Access Modes: ROX,RWX
VolumeMode: Filesystem
Mounted By: <none>
Events: <none>
3. Using a PersistentVolumeClaim in a pod
The PersistentVolume is now yours to use; nobody else can claim the same volume until you release it. To use it inside a pod, you need to reference the PersistentVolumeClaim by name inside the pod's volumes (make sure you read that right: the PersistentVolumeClaim, not the PersistentVolume directly!).
Here I am creating an nginx container to use the PV storage:
[root@controller ~]# cat nfs-share-pod.yml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nfs-share
spec:
  containers:
    - image: nginx
      name: pv-container
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - name: data
          mountPath: /var/www
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: nfs-share-pvc
As you can see, we are referencing the PersistentVolumeClaim (nfs-share-pvc) which we created earlier.
Next we will go ahead and create this Pod:
[root@controller ~]# kubectl create -f nfs-share-pod.yml
pod/pod-nfs-share created
Make sure the pod has started successfully and its status is Running:
[root@controller ~]# kubectl get pods pod-nfs-share
NAME READY STATUS RESTARTS AGE
pod-nfs-share 1/1 Running 0 10m
So the pod is running properly. Next we can check the status of Mount in our container:
[root@controller ~]# kubectl exec -it pod-nfs-share -- df -h /var/www
Filesystem Size Used Avail Use% Mounted on
192.168.43.48:/nfs_share 14G 8.6G 4.1G 68% /var/www
As expected, the provided directory is mounted, and we can also find the files under this directory that we created in our previous tutorial on Kubernetes Volumes:
[root@controller ~]# kubectl exec -it pod-nfs-share -- ls -l /var/www
total 0
-rw-r--r-- 1 root root 0 Jan 7 11:04 someFile.txt
-rw-r--r-- 1 root root 0 Jan 7 11:09 someFile1.txt
4. Recycling Persistent Volumes
If you remember, we created our Persistent Volume with persistentVolumeReclaimPolicy: Recycle, which means the data will be deleted once the volume is released. Let us verify this behaviour. We will delete our Pod and PVC:
[root@controller ~]# kubectl delete pod pod-nfs-share
pod "pod-nfs-share" deleted
[root@controller ~]# kubectl delete pvc nfs-share-pvc
persistentvolumeclaim "nfs-share-pvc" deleted
So now I don't have any Persistent Volume Claim:
[root@controller ~]# kubectl get pvc
No resources found in default namespace.
While the Persistent Volume status shows as "Released":
[root@controller ~]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
nfs-share-pv 1Gi ROX,RWX Recycle Released default/nfs-share-pvc 80m
Let's re-create the PVC:
[root@controller ~]# kubectl create -f persistent-volume-claim.yml
persistentvolumeclaim/nfs-share-pvc created
The PVC creation was successful even though we know there was data in our NFS share path.
[root@controller ~]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
nfs-share-pv 1Gi ROX,RWX Recycle Bound default/nfs-share-pvc 81m
I will create the same Pod again:
[root@controller ~]# kubectl create -f nfs-share-pod.yml
pod/pod-nfs-share created
The Pod is also created successfully and it is in the Running state now:
[root@controller ~]# kubectl get pods pod-nfs-share
NAME READY STATUS RESTARTS AGE
pod-nfs-share 1/1 Running 0 48s
Let's check if our files are still there in the shared path:
[root@controller ~]# kubectl exec -it pod-nfs-share -- ls -l /var/www
total 0
Since the reclaim policy was Recycle, the files in our NFS share were deleted once the claim was released. If we had used Retain as the reclaim policy instead, the files would not have been deleted, but once the PVC and Pod are deleted the PersistentVolume cannot be bound again until the data is manually cleaned up and the PV is re-registered.
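To give an idea of what that manual reclamation looks like with Retain (a rough sketch, not something we ran above): the administrator deletes the released PV object, cleans up the data on the backing storage, and registers the PV again:
[root@controller ~]# kubectl delete pv nfs-share-pv
# then, on the NFS server itself, clean up the exported data:
# rm -rf /nfs_share/*
[root@controller ~]# kubectl create -f persistent-volume.yml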
Defining the available storage types through StorageClass resources
Instead of creating PersistentVolumes manually, the cluster admin can deploy a PersistentVolume provisioner and define one or more StorageClass objects to let users choose what type of PersistentVolume they want. The users can refer to the StorageClass in their PersistentVolumeClaims and the provisioner will take that into account when provisioning the persistent storage.
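For illustration only (we don't have a cloud provisioner in this cluster), a dynamic StorageClass and a claim that uses it could look roughly like this; the class name, provisioner, and parameters below are examples, not something we create in this tutorial:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast                           # example class name
provisioner: kubernetes.io/gce-pd      # example: in-tree GCE Persistent Disk provisioner
parameters:
  type: pd-ssd
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: dynamic-claim                  # example claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast               # the provisioner creates a matching PV on demand
  resources:
    requests:
      storage: 5Gi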
5. Local Persistent Volumes with StorageClass
The Local Persistent Volumes feature (beta in Kubernetes 1.10) makes it possible to leverage local disks in your StatefulSets. You can specify directly-attached local disks as PersistentVolumes, and use them in StatefulSets with the same PersistentVolumeClaim objects that previously only supported remote volume types.
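As a sketch of how a StatefulSet would consume such volumes (the names and image here are placeholders), the claim is declared through volumeClaimTemplates so each replica gets its own PVC bound to a matching local PersistentVolume; this assumes a local-storage StorageClass like the one we create in the next step:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: local-test                     # placeholder name
spec:
  serviceName: local-test
  replicas: 1
  selector:
    matchLabels:
      app: local-test
  template:
    metadata:
      labels:
        app: local-test
    spec:
      containers:
        - name: app
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: local-storage
        resources:
          requests:
            storage: 1Gi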
5.1 Create StorageClass
Storage classes let an administrator configure the cluster with custom persistent storage. A storage class has a name in its metadata (claims refer to the class by this name in their storageClassName field), a provisioner, and parameters.
As I don't have access to a cloud environment, I will define a storage class for using local volumes. Local volumes are similar to HostPath volumes, but they persist across pod restarts and node restarts; in that sense they are considered persistent volumes. You can get the complete list of provisioners from the official Kubernetes StorageClass documentation.
Here is our YAML file to create a storageclass:
[root@controller ~]# cat storage-class.yml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
Create this storage class:
[root@controller ~]# kubectl create -f storage-class.yml
storageclass.storage.k8s.io/local-storage created
Check the status:
[root@controller ~]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-storage kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 6m57s
Here, sc is a shorthand for storageclass.
5.2 Creating a local persistent volume
Now, we can create a persistent volume using the storage class that will persist even after the pod that's using it is terminated:
[root@controller ~]# cat local-pv-sc.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /tmp/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-2.example.com
You must provide the node hostname in the values section of the nodeAffinity field in the PersistentVolume object. This is how the Kubernetes scheduler understands that this PersistentVolume is tied to a specific node; nodeAffinity is a required field for local PersistentVolumes. Here I intend to create the persistent volume on worker-2.example.com.
When local volumes are manually created like this, the only supported persistentVolumeReclaimPolicy is Retain. When the PersistentVolume is released from the PersistentVolumeClaim, an administrator must manually clean up and set up the local volume again for reuse.
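Roughly, that manual reuse could look like this once the claim has been deleted (a sketch using the names from our example):
# on worker-2.example.com, wipe the data backing the local volume:
# rm -rf /tmp/data/*
[root@controller ~]# kubectl delete pv local-pv
[root@controller ~]# kubectl create -f local-pv-sc.yml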
To get the list of available nodes you can use:
[root@controller ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
controller.example.com Ready master 42d v1.19.3
worker-1.example.com Ready 42d v1.19.3
worker-2.example.com Ready 42d v1.19.3
If the node name you specify here does not match an actual node, any Pod that uses this volume will fail to schedule, with an event similar to "FailedScheduling 4s (x2 over 4s) default-scheduler 0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) didn't find available persistent volumes to bind." when you create your Pod.
Let's create this Persistent Volume:
[root@controller ~]# kubectl create -f local-pv-sc.yml
persistentvolume/local-pv created
Check the status of the PV:
[root@controller ~]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv 2Gi RWO Retain Available local-storage 24s
nfs-share-pv 1Gi ROX,RWX Recycle Released default/nfs-share-pvc 12h
Currently the status of this PV is Available, and it is mapped to the local-storage StorageClass. We can use kubectl describe to get more details on this PV:
[root@controller ~]# kubectl describe pv local-pv
Name: local-pv
Labels: <none>
Annotations: <none>
Finalizers: [kubernetes.io/pv-protection]
StorageClass: local-storage
Status: Available
Claim:
Reclaim Policy: Retain
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 2Gi
Node Affinity:
  Required Terms:
    Term 0: kubernetes.io/hostname in [worker-2.example.com]
Message:
Source:
  Type: LocalVolume (a persistent volume backed by local storage on a node)
  Path: /tmp/data
Events: <none>
5.3 Making Persistent Volume Claims
Next we will create a Persistent Volume Claim to mount the local volume in the respective Pod.
[root@controller ~]# cat local-pvc.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: local-storage-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 1Gi
We will create this PVC:
[root@controller ~]# kubectl create -f local-pvc.yml
persistentvolumeclaim/local-storage-claim created
It is important to note that the persistent volume claim hasn't actually claimed any storage yet and isn't bound to our local volume. The claim will remain Pending until some pod actually attempts to mount a volume using the claim:
[root@controller ~]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
local-storage-claim Pending local-storage 17s
You can use kubectl describe to get more information about this PVC:
[root@controller ~]# kubectl describe pvc local-storage-claim
Name: local-storage-claim
Namespace: default
StorageClass: local-storage
Status: Pending
Volume:
Labels: <none>
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Mounted By: <none>
Events:
  Type    Reason                Age                  From                         Message
  ----    ------                ----                 ----                         -------
  Normal  WaitForFirstConsumer  1s (x17 over 3m54s)  persistentvolume-controller  waiting for first consumer to be created before binding
Under Events you can see that the claim is waiting for the first consumer, which will bind the local path.
5.4 Create a Pod
We have provisioned a volume and claimed it. It's time to use the claimed storage in a container. This turns out to be pretty simple. First, the persistent volume claim must be used as a volume in the pod, and then the containers in the pod can mount it just like any other volume. Here is a pod configuration file that specifies the persistent volume claim we created earlier (which will be bound to the local persistent volume we provisioned):
[root@controller ~]# cat local-pv-pod.yml
kind: Pod
apiVersion: v1
metadata:
  name: local-pod
spec:
  containers:
    - name: local-pod
      image: nginx
      ports:
        - containerPort: 80
          name: "httpd-server"
      volumeMounts:
        - mountPath: "/mnt/tmp"
          name: persistent-volume
  volumes:
    - name: persistent-volume
      persistentVolumeClaim:
        claimName: local-storage-claim
The key is the persistentVolumeClaim section under volumes. The claim name (local-storage-claim here) uniquely identifies the specific claim within the current namespace and makes it available as a volume (named persistent-volume here). The container can then refer to the volume by name and mount it at /mnt/tmp.
Let's create this Pod:
[root@controller ~]# kubectl create -f local-pv-pod.yml
pod/local-pod created
Check the status of this pod and make sure it is Running:
[root@controller ~]# kubectl get pods local-pod
NAME READY STATUS RESTARTS AGE
local-pod 1/1 Running 0 21s
Now the claim should be in the Bound state, assuming everything was successful:
[root@controller ~]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
local-storage-claim Bound local-pv 2Gi RWO local-storage 7m47s
The same can be checked for the Persistent Volume:
[root@controller ~]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv 2Gi RWO Retain Bound default/local-storage-claim local-storage 17m
nfs-share-pv 1Gi ROX,RWX Recycle Released default/nfs-share-pvc 12h
Conclusion
In this Kubernetes tutorial we learned about Persistent Volumes and Persistent Volume Claims. Due to the lack of suitable infrastructure I could not cover dynamic provisioning, which doesn't actually require us to create a PersistentVolume at all: the PersistentVolumeClaim simply references a StorageClass and the provisioner creates a volume on demand. This is also why we set storageClassName: "" in the claim for our NFS-backed volume, as it prevents a dynamic provisioner from interfering when you want the PersistentVolumeClaim to be bound to a pre-provisioned PersistentVolume.