Beginner's guide to Kubernetes Pods with examples



In our previous article I explained the Kubernetes architecture, where we briefly covered the different components, including Pods. In this article I will explain more about the different types of objects (resources) in Kubernetes. We'll start with pods, because they are the central, most important concept in Kubernetes: everything else either manages, exposes, or is used by pods.

 

Different methods to create objects in Kubernetes

There are two approaches to creating objects in Kubernetes: declarative and imperative.

  • The recommended way to work with kubectl is the declarative way: write your manifest files and apply them with kubectl {apply|create} -f manifest.yml
  • With a YAML file you have more control over the different properties you can add to your container compared to the imperative method
  • With the imperative method you just use the kubectl command line to create different objects, as shown in the sketch below
  • The one challenge with the declarative method is writing the YAML file itself. To overcome this you can dump the YAML of an existing object, for example kubectl get <object> nginx -o yaml, and then use it as a template to create another object with kubectl {replace|apply} -f nginx.yaml.
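For illustration, here is roughly how the same nginx pod could be created with either approach on a recent kubectl version (you would use only one of them; the rest of this article follows the declarative path):

# Imperative: create the pod directly from the command line
[root@controller ~]# kubectl run nginx --image=nginx --port=80

# Declarative: generate a manifest (or write it by hand) and apply it
[root@controller ~]# kubectl run nginx --image=nginx --port=80 --dry-run=client -o yaml > nginx.yml
[root@controller ~]# kubectl apply -f nginx.yml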

 

Overview on Kubernetes Pods

  • We already know that a pod is a co-located group of containers and represents the basic building block in Kubernetes.
  • Instead of deploying containers individually, you always deploy and operate on a pod of containers.
  • We’re not implying that a pod always includes more than one container; it’s common for pods to contain only a single container.
  • The key thing about pods is that when a pod does contain multiple containers, all of them always run on a single worker node; a pod never spans multiple worker nodes.


 

Creating Pods using YAML file

Pods and other Kubernetes resources are usually created by posting a JSON or YAML manifest to the Kubernetes REST API endpoint.

[root@controller ~]# cat nginx.yml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: default
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80

To create the pod:

[root@controller ~]# kubectl create -f nginx.yml
pod/nginx created

 

Check status of the Pod

Verify the newly created pod; you can see that the nginx container is still being created inside the pod:

[root@controller ~]# kubectl get pods
NAME    READY   STATUS              RESTARTS   AGE
nginx   0/1     ContainerCreating   0          4s
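Instead of running kubectl get pods repeatedly, you can also keep watching the pod until its status changes (press Ctrl+C to stop watching):

[root@controller ~]# kubectl get pods --watch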

We check again after a few seconds, this time with more details:

[root@controller ~]# kubectl get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP              NODE                   NOMINATED NODE   READINESS GATES
nginx                           1/1     Running   0          43s   10.36.0.4       worker-1.example.com   <none>     <none>

So now our pod is in the Running state and we know that it is running on the worker-1 node.

 

Get details of the Pod

You can use kubectl describe to get more details of a specific resource, which in this case is a Pod.

[root@controller ~]# kubectl describe pod nginx
Name:         nginx
Namespace:    default
Priority:     0
Node:         worker-1.example.com/192.168.43.49
Start Time:   Mon, 04 Jan 2021 10:40:32 +0530
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           10.36.0.4
IPs:
  IP:  10.36.0.4
Containers:
  nginx:
    Container ID:   docker://b0a1d708daf4684cb9bee10ed26235097b6e5a91754ff54b1c7c06188aef1715
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:4cf620a5c81390ee209398ecc18e5fb9dd0f5155cd82adcbae532fec94006fb9
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Mon, 04 Jan 2021 10:40:45 +0530
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-glntg (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-glntg:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-glntg
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  12m   default-scheduler  Successfully assigned default/nginx to worker-1.example.com
  Normal  Pulling    11m   kubelet            Pulling image "nginx"
  Normal  Pulled     11m   kubelet            Successfully pulled image "nginx" in 10.728263967s
  Normal  Created    11m   kubelet            Created container nginx
  Normal  Started    11m   kubelet            Started container nginx

As you can see, this command gives you all the details related to this Pod along with the latest events.

 

Check status of the container from the Pod

We added a single container named "nginx" to our Pod in the YAML file. You can check the Docker container status on the worker-1 node using docker ps:

[root@worker-1 ~]# docker ps
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS               NAMES
b0a1d708daf4        nginx                  "/docker-entrypoint.…"   3 minutes ago       Up 3 minutes                            k8s_nginx_nginx_default_191b8a01-2c76-47ea-9e21-b4495bccaaaa_0
641d95dc5a8e        k8s.gcr.io/pause:3.2   "/pause"                 4 minutes ago       Up 4 minutes                            k8s_POD_nginx_default_191b8a01-2c76-47ea-9e21-b4495bccaaaa_0

So this container was created 3 minutes ago.

 

Connecting to the Pod

We can use kubectl exec to execute a command inside a Pod. To connect to a specific Pod we can execute a shell, which gives us a shell prompt inside the target container. The syntax is:

kubectl exec -it <pod-name> -c <container-name> -- <CMD>

Here,

 -i, --stdin=false: Pass stdin to the container
 -t, --tty=false: Stdin is a TTY
 -c, --container='': Container name. If omitted, the first container in the pod will be chosen

We will execute /bin/bash as the command so that we get a shell inside the nginx container. For example, to connect to our nginx Pod, which has a container named nginx:

[root@controller ~]# kubectl exec -it nginx -c nginx -- /bin/bash
root@nginx:/#

Now that we have a shell inside the container, we can execute commands in it. To make sure our nginx server is up and running we can query the web server using curl:

root@nginx:/# curl localhost
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

 

Perform port forwarding using kubectl

From kubectl get pods -o wide output we know that our container is running on 10.36.0.4, but is this IP reachable from the controller?

[root@controller ~]# ping -c 2 10.36.0.4
PING 10.36.0.4 (10.36.0.4) 56(84) bytes of data.
^C
--- 10.36.0.4 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 51ms

As expected the request times out. This is because these IPs belong to the internal pod network, so hosts outside that network cannot connect to this IP directly.

However, if you check the same from the worker node where this container is actually running, i.e. worker-1, the IP is reachable:

[root@worker-1 ~]# ping -c 2 10.36.0.4
PING 10.36.0.4 (10.36.0.4) 56(84) bytes of data.
64 bytes from 10.36.0.4: icmp_seq=1 ttl=64 time=0.109 ms
64 bytes from 10.36.0.4: icmp_seq=2 ttl=64 time=0.058 ms

--- 10.36.0.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 23ms
rtt min/avg/max/mdev = 0.058/0.083/0.109/0.027 ms

But it is not reachable from worker-2:

[root@worker-2 ~]# ping -c 2 10.36.0.4
PING 10.36.0.4 (10.36.0.4) 56(84) bytes of data.

--- 10.36.0.4 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 28ms

So this internal IP is only reachable from the node where the container is actually running. Since this is an nginx server, we can use curl to query the web page from worker-1:

[root@worker-1 ~]# curl 10.36.0.4
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Now if we want to access this page from an external host such as the controller node, we have to forward the port used by nginx using kubectl port-forward <pod-name> <local_port>:<pod_port>, where local_port is a port on the local machine and pod_port is the port used by the pod's container. If local_port is left empty (for example :80), a random free local port is chosen:

[root@controller ~]# kubectl port-forward nginx :80 &
[2] 32237
Forwarding from 127.0.0.1:35341 -> 80

Now we try to access the container using the forwarded port:

[root@controller ~]# curl 127.0.0.1:35341
Handling connection for 35341
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Once done you can kill the PID of the port-forward command:

[root@controller ~]# kill -9 32237
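If you would rather use a fixed local port instead of a random one, you can specify it explicitly. For example, to map local port 8080 (just an example, it must be free on the controller) to container port 80:

[root@controller ~]# kubectl port-forward nginx 8080:80 &
[root@controller ~]# curl 127.0.0.1:8080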

 

Understanding multi container Pods

Normally we use a single container per pod, as single-container pods are easier to build and maintain, but there are some cases where you might want to run multiple containers in a Pod:

  • Sidecar container: a container that enhances the primary application, for instance for logging purposes
  • Ambassador container: a container that represents the primary container to the outside world, such as a proxy
  • Adapter container: used to adapt the traffic or data pattern to match the traffic or data pattern of other applications in the cluster

When using multi-container Pods, the containers typically share data through shared storage.

 

Understanding the sidecar scenario

  • A sidecar container provides additional functionality to the main container; it makes no sense to run this functionality in a separate pod because the two are closely related to one another
  • Think of logging, monitoring and syncing
  • The essence is that the main container and the sidecar container have access to shared resources to exchange information
  • Often, shared volumes are used for this purpose

Here I have a sample YAML file used to create a multi-container sidecar pod:

[root@controller ~]# cat create-sidecar.yml
kind: Pod
apiVersion: v1
metadata:
  name: sidecar-pod
spec:
  volumes:
  - name: logs
    emptyDir: {}

  containers:
  - name: app
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "while true; do date >> /var/log/date.txt; sleep 10; done"]
    volumeMounts:
    - name: logs
      mountPath: /var/log

  - name: sidecar
    image: centos/httpd
    ports:
    - containerPort: 80
    volumeMounts:
    - name: logs
      mountPath: /var/www/html

Next we create this sidecar pod:

[root@controller ~]# kubectl create -f create-sidecar.yml
pod/sidecar-pod created

Verify the status; it will take some time while both containers are created:

[root@controller ~]# kubectl get pods
NAME          READY   STATUS              RESTARTS   AGE
nginx         1/1     Running             0          55m
sidecar-pod   0/2     ContainerCreating   0          25s

Re-check after some time and the pod should be in the Running state:

[root@controller ~]# kubectl get pods -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP          NODE                   NOMINATED NODE   READINESS GATES
nginx         1/1     Running   0          67m   10.36.0.4   worker-1.example.com   <none>           <none>
sidecar-pod   2/2     Running   0          12m   10.44.0.1   worker-2.example.com   <none>           <none>

We can connect to the sidecar container in this pod using:

[root@controller ~]# kubectl exec -it sidecar-pod -c sidecar -- /bin/bash
[root@sidecar-pod /]#

Now we can use curl to check the content of date.txt, to which the app container appends the date command output every 10 seconds in a loop:

[root@sidecar-pod ~]# curl http://localhost/date.txt
Fri Nov 27 05:43:00 UTC 2020
Fri Nov 27 05:43:10 UTC 2020
Fri Nov 27 05:43:20 UTC 2020
Fri Nov 27 05:43:30 UTC 2020
Fri Nov 27 05:43:40 UTC 2020
Fri Nov 27 05:43:50 UTC 2020
Fri Nov 27 05:44:00 UTC 2020
Fri Nov 27 05:44:11 UTC 2020
Fri Nov 27 05:44:21 UTC 2020

 

Inspecting Pods

In this section I will share different commands which you can use to inspect and analyse pods. For example, I currently have the sidecar pod which we created in the previous example:

[root@controller ~]# kubectl get pods
NAME          READY   STATUS    RESTARTS   AGE
sidecar-pod   2/2     Running   0          111s

To get more details of this pod we will use the kubectl describe pods <pod_name> command; as you can see, it gives us a lot of information about this pod:

[root@controller ~]# kubectl describe pods sidecar-pod
Name:         sidecar-pod
Namespace:    default
Priority:     0
Node:         worker-1.example.com/192.168.43.49
Start Time:   Sat, 28 Nov 2020 00:10:56 +0530
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           10.36.0.1
IPs:
  IP:  10.36.0.1
Containers:
  app:
    Container ID:  docker://94d1acadd85e0f1557019c858e9a47d2ed3f99650b103ce52b76916b69309ab9
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:9f1c79411e054199210b4d489ae600a061595967adb643cd923f8515ad8123d2
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
...
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m54s  default-scheduler  Successfully assigned default/sidecar-pod to worker-1.example.com
  Normal  Pulling    2m53s  kubelet            Pulling image "busybox"
  Normal  Pulled     2m45s  kubelet            Successfully pulled image "busybox" in 7.523991553s
  Normal  Created    2m45s  kubelet            Created container app
  Normal  Started    2m45s  kubelet            Started container app
  Normal  Pulling    2m45s  kubelet            Pulling image "centos/httpd"
  Normal  Pulled     92s    kubelet            Successfully pulled image "centos/httpd" in 1m13.290810172s
  Normal  Created    91s    kubelet            Created container sidecar
  Normal  Started    91s    kubelet            Started container sidecar

The next command can be used to get the STDOUT and STDERR messages from a pod. To demonstrate this I have created another pod, sleepy, which writes the date output to the console every 5 seconds:

[root@controller ~]# kubectl get pods -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP          NODE                   NOMINATED NODE   READINESS GATES
sidecar-pod   2/2     Running   0          15m   10.36.0.1   worker-1.example.com   <none>           <none>              
sleepy        1/1     Running   0          12s   10.44.0.1   worker-2.example.com   <none>           <none>
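The manifest used for the sleepy pod is not shown above, but it could look something like this minimal sketch (the busybox image and the exact loop are assumptions based on the description):

[root@controller ~]# cat sleepy.yml
apiVersion: v1
kind: Pod
metadata:
  name: sleepy
spec:
  containers:
  - name: sleepy
    image: busybox                  # assumed image for a simple shell loop
    command: ["/bin/sh"]
    args: ["-c", "while true; do date; sleep 5; done"]
[root@controller ~]# kubectl create -f sleepy.yml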

 

Check the logs from a Pod

Now to check the logs from this pod we will use the following command, which gives us the content printed by the container on the console:

[root@controller ~]# kubectl logs sleepy
Fri Nov 27 18:56:32 UTC 2020
Fri Nov 27 18:56:37 UTC 2020
Fri Nov 27 18:56:42 UTC 2020
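Note that kubectl logs works directly here because sleepy has only one container. For a multi-container pod such as sidecar-pod you have to tell kubectl which container's logs you want using -c, for example:

[root@controller ~]# kubectl logs sidecar-pod -c sidecar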

To connect to a container inside a pod we will use:

kubectl exec -it <pod-name> -c <container-name> -- <CMD>

For example to connect to the sidecar container from sidecar-pod, I will use:

[root@controller ~]# kubectl exec -it sidecar-pod -c sidecar -- /bin/bash
[root@sidecar-pod /]#

You can enter exit followed by ENTER to come out of the container shell.

 

Deleting a Pod

The syntax to delete a pod is

kubectl delete pod <pod-name>

So to delete our nginx pod we will use:

[root@controller ~]# kubectl delete pod nginx
pod "nginx" deleted

 

Running pod instances in a Job

  • A Job creates one or more Pods and ensures that a specified number of them successfully terminate.
  • Pods are normally created to run forever
  • To create a Pod that runs for a limited duration, use Jobs instead
  • Jobs are useful for tasks, like backup, calculation, batch processing and more
  • A Pod that is started by a Job must have its restartPolicy set to OnFailure or Never
    • OnFailure will re-run the container on the same Pod
    • Never will re-run the failing container in a new Pod

 

Understanding different available Job types

There are 3 different Job types, which can be created by specifying the completions and parallelism parameters:

Non-parallel Jobs: one Pod is started, unless the Pod fails

completions=1
parallelism=1

Parallel Jobs with a fixed completion count: the Job is complete after successfully running as many times as specified in .spec.completions

completions=n
parallelism=m

Parallel Jobs with a work queue: multiple Pods are started; when one of them completes successfully, the Job is complete

completions=1
parallelism=n
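As a rough illustration, a parallel Job with a fixed completion count would set both parameters under its spec. The manifest below is only a sketch (the name, image and timings are arbitrary); the completions and parallelism fields are the relevant part:

apiVersion: batch/v1
kind: Job
metadata:
  name: pod-parallel-job
spec:
  completions: 6      # run the pod to successful completion 6 times in total
  parallelism: 2      # run up to 2 pods at the same time
  template:
    spec:
      containers:
      - name: sleepy
        image: alpine
        command: ["/bin/sleep"]
        args: ["5"]
      restartPolicy: Never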

 

Running job pods sequentially

If you need a Job to run more than once, set completions to the number of times you want the Job's pod to run. For this demonstration we need to create a Job, so let us first check the required apiVersion.

Using kubectl explain we can see that the required apiVersion for the Job kind is batch/v1:

[root@controller ~]# kubectl explain Job
KIND:     Job
VERSION:  batch/v1

Following is a sample YAML file:

[root@controller ~]# cat pod-simple-job.yml
apiVersion: batch/v1
kind: Job
metadata:
  name: pod-simple-job
spec:
  completions: 3
  template:
    spec:
      containers:
      - name: sleepy
        image: alpine
        command: ["/bin/sleep"]
        args: ["5"]
      restartPolicy: Never

Here we have defined completions: 3, so we want the Job's pod to run 3 times sequentially (parallelism defaults to 1).

[root@controller ~]# kubectl create -f pod-simple-job.yml
job.batch/pod-simple-job created

Check the list of available Pods:

[root@controller ~]# kubectl get pods
NAME                            READY   STATUS              RESTARTS   AGE
nginx                           1/1     Running             0          47m
pod-simple-job-52vr7            0/1     ContainerCreating   0          3s

So a new Pod is started and its container is being created as per our YAML file. Once the command executed in this container completes, another Pod should start automatically, since we want this to happen sequentially:

[root@controller ~]# kubectl get pods
NAME                            READY   STATUS              RESTARTS   AGE
nginx                           1/1     Running             0          48m
pod-simple-job-52vr7            0/1     Completed           0          13s
pod-simple-job-bhrf5            0/1     ContainerCreating   0          2s

As you can see, as soon as the first Pod's status is Completed, another Pod is started. You can check the status of the Job using:

[root@controller ~]# kubectl get jobs
NAME             COMPLETIONS   DURATION   AGE
pod-simple-job   1/3           16s        16s

Once all the jobs are completed:

[root@controller ~]# kubectl get jobs
NAME             COMPLETIONS   DURATION   AGE
pod-simple-job   3/3           36s        3m36s

 

Deleting a job

When a Job completes, no more Pods are created, but the Pods are not deleted either. Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output. The job object also remains after it is completed so that you can view its status. It is up to the user to delete old jobs after noting their status.

[root@controller ~]# kubectl delete jobs pod-simple-job
job.batch "pod-simple-job" deleted

Another way to terminate a Job is by setting an active deadline. Do this by setting the .spec.activeDeadlineSeconds field of the Job to a number of seconds. The activeDeadlineSeconds applies to the duration of the job, no matter how many Pods are created. Once a Job reaches activeDeadlineSeconds, all of its running Pods are terminated and the Job status will become type: Failed with reason: DeadlineExceeded.
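For example, to give a Job at most 100 seconds to finish all of its pods, you could add the field under spec like this (the value and the Job name are only examples):

apiVersion: batch/v1
kind: Job
metadata:
  name: pod-deadline-job
spec:
  activeDeadlineSeconds: 100   # terminate the whole Job after 100 seconds
  completions: 3
  template:
    spec:
      containers:
      - name: sleepy
        image: alpine
        command: ["/bin/sleep"]
        args: ["5"]
      restartPolicy: Never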

 

Clean up finished jobs automatically

Finished Jobs are usually no longer needed in the system, and keeping them around puts pressure on the API server. If the Jobs are managed directly by a higher-level controller, such as a CronJob, the Jobs can be cleaned up by the CronJob based on its capacity-based cleanup policy, as shown in the sketch below.
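For example, a CronJob keeps only a limited number of finished Jobs around based on its history limits. A minimal sketch could look like this (the schedule and names are only examples, and on older clusters the apiVersion may be batch/v1beta1 instead of batch/v1):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: date-cronjob
spec:
  schedule: "*/5 * * * *"
  successfulJobsHistoryLimit: 3   # keep only the last 3 successful Jobs
  failedJobsHistoryLimit: 1       # keep only the last failed Job
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: date
            image: alpine
            command: ["date"]
          restartPolicy: Never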

 

Conclusion

In this Kubernetes tutorial we covered different areas related to Pods, so now you should have a basic idea of how to create, manage and analyse Pods and containers. Pods can run multiple processes and are similar to physical hosts in the non-container world. YAML or JSON descriptors can be written and used to create pods and then examined to see the specification of a pod and its current state. We also learned to run Pod instances inside a Job, which allows the pod to complete its task and then terminate.
