OOM is an abbreviation for Out Of Memory, and if you work in the IT field then this term is probably not new to you. It can be a nightmare for developers and administrators, as figuring out the root cause of an OOMKilled error often makes our lives miserable.
When you are working in a non-containerized environment you still have the added advantage of syslog messages, which can give you a pretty good hint about what might have caused the Out Of Memory error. Unfortunately, in a containerized environment the chances are you have not configured syslog. This is because syslog does not work by default inside containers and requires additional Linux capabilities, privileges and configuration to read journalctl or the /dev/log socket. So you have to rely entirely on the application logs to figure out the root cause.
In this tutorial we will cover some key pointers which can help you identify the application or service which might have caused the OOMKilled error. Normally in a container environment we run one process per container, but due to application requirements we may be running multiple services inside a single container, and if any one of those services causes an OOM then the entire container will crash.
Understanding OOMKilled Error (Exit Code 137)
The Out Of Memory (OOM) Killer is a mechanism in Linux-based systems which gets invoked when the system runs out of physical memory and swap space. When this situation arises, the OOM Killer is tasked with freeing up memory by terminating one or more processes on the system. This is a last resort action taken by the kernel to prevent the entire system from crashing due to a lack of memory.
When a process is terminated by the OOM Killer, it typically exits with code 137. This exit code is significant because it indicates that the process was killed due to a SIGKILL signal. Here’s how the code breaks down:
- 128 + 9 = 137
- 128: Indicates that a process was terminated by a signal.
- 9: The signal number for SIGKILL.
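You can reproduce this exit code outside of Kubernetes with any process that gets killed by SIGKILL; for example, in a shell (a tiny illustration, not specific to containers):
sh -c 'kill -9 $$'   # the child shell kills itself with SIGKILL
echo $?              # prints 137, i.e. 128 + 9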
In a Kubernetes context, containers in pods can be killed by the OOM Killer if they try to use more memory than their allocated limit. When a container exceeds its memory request it risks being OOM killed, especially if it approaches or exceeds its memory limit. Kubernetes also sets an oom_score_adj value for each container, based on its QoS class, so that the OOM Killer prefers containers that exceed their requests over well-behaved ones.
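If you are curious about the value Kubernetes assigned, you can read it from inside a running container, assuming the image ships a shell and cat; the pod name and namespace below are placeholders:
kubectl exec -n <namespace> <pod-name> -- cat /proc/1/oom_score_adj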
Can we disable the OOM Killer feature?
We have to understand that the OOM Killer is not a villain; it is actually there to avoid a complete system crash. Imagine there is a bug in your code that leaks memory and the code is running in a pod with no memory limit defined. At some point it will eat up all the memory and can crash the whole node. To avoid that we define Kubernetes resource limits, so that a pod is not allowed to consume more than the defined limit and only the specific container running your code is impacted while all other services continue to work.
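For reference, this is roughly how such a memory request and limit are defined on a container; the names and values here are only placeholders, not a recommendation:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: my-app
    image: my-app:latest        # placeholder image
    resources:
      requests:
        memory: "128Mi"         # amount used for scheduling
      limits:
        memory: "256Mi"         # the container is OOM killed if it crosses this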
However, there are some sysctl parameters and configurations related to how the OOM Killer behaves that you can adjust, depending on your specific requirements and the characteristics of your workloads.
oom_kill_allocating_task:
- This parameter determines whether the Linux kernel should kill the task that triggered the out-of-memory condition instead of selecting a victim based on heuristics (such as oom_score).
- Setting this to 1 makes the OOM Killer kill the process that caused the memory shortage.
sysctl vm.oom_kill_allocating_task=1
oom_control:
- This parameter is available per cgroup (with the cgroup v1 memory controller) and allows you to disable the OOM Killer for specific groups of processes managed by cgroups.
- You can set oom_control to 1 on a specific cgroup to disable the OOM Killer for processes in that cgroup:
echo 1 > /sys/fs/cgroup/<path-to-cgroup>/memory.oom_control
panic_on_oom:
- This parameter controls the kernel's reaction to an OOM condition. By default it is set to 0, meaning the kernel will call the OOM Killer.
- Setting this parameter to 1 causes the kernel to panic when an OOM condition occurs, which can be useful on systems where you prefer a reboot over allowing the OOM Killer to terminate processes.
sysctl vm.panic_on_oom=1
overcommit_memory:
- This parameter controls the kernel's memory overcommit behavior. There are three possible values:
  - 0: Heuristic overcommit handling. The kernel tries to estimate the amount of free memory left when allocating memory.
  - 1: Always overcommit. Applications can allocate more memory than is actually available, which can lead to OOM conditions.
  - 2: Never overcommit. The total address space commit for the system is not allowed to exceed swap plus a configurable percentage (overcommit_ratio) of physical RAM.
- Adjusting this setting helps control the system's approach to memory allocation and can influence the frequency of OOM conditions.
sysctl vm.overcommit_memory=2
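Before changing any of these parameters you can check their current values with a quick read-only query (listing just the parameters discussed above):
sysctl vm.oom_kill_allocating_task vm.panic_on_oom vm.overcommit_memory vm.overcommit_ratio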
You can read more on how to apply sysctl parameters in sysctl reload without reboot.
Steps to debug OOMKilled Error (Exit Code 137) in Kubernetes
Here I created a small Python script which allocates memory in 10MB chunks and keeps allocating until the container crashes. I have also written another article with commands to induce artificial CPU and memory load: Increase load with stress command in Linux.
Here is my python script which is reproducing the OOMKilled error:
import time
import datetime
import sys

def allocate_memory():
    # Start with an empty list to hold the allocated chunks
    allocated = []
    step_size = 10  # MB allocated per iteration, adjust as needed
    try:
        while True:
            # Allocate memory in chunks of step_size MB
            allocated.append(' ' * step_size * 1024 * 1024)  # 10MB of space
            # Get current time and format it
            current_time = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            print(f"{current_time} - Allocated {len(allocated) * step_size}MB of memory", flush=True)
            time.sleep(1)  # Wait for a second before allocating more
    except Exception as e:
        print(f"{datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')} - Error: {str(e)}", flush=True)
        sys.exit(1)

if __name__ == "__main__":
    allocate_memory()
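To trigger the OOMKilled error I run this script inside a pod with a deliberately small memory limit. The snippet below is only a rough sketch of such a pod spec; the image name, script path and values are placeholders, not the exact manifest from my setup:
apiVersion: v1
kind: Pod
metadata:
  name: oom-test
  namespace: deepak
spec:
  containers:
  - name: stress
    image: <your-registry>/stress:latest   # placeholder image containing the script
    command: ["python3", "/opt/stress.py"]
    resources:
      requests:
        memory: "32Mi"
      limits:
        memory: "64Mi"                     # small limit so the script hits it within seconds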
1. Identify the OOMKilled container
You can check the status of the individual containers in your pod using the kubectl describe command to determine which container crashed with an OOMKilled error.
For example, I have created the working-ssh-0 pod, so let me describe it:
kubectl describe pods -n deepak working-ssh-0
Here you can see the container status along with other details:
test-statefulset:
Container ID: containerd://d6c719abf823f68eab2b9d3cb936073afb627ede4d3824cf58a650d7b7982491
Image: bcmt-registry:5000/ssh:deepak
Image ID: bcmt-registry:5000/ssh@sha256:3ee989bcce8e9923c79b2e5e279ab4a1e0c2fc231d7feea57a40119612b82978
Port: <none>
Host Port: <none>
Command:
python3
/opt/nokia/sdl-security/scripts/bin/stress.py
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Sat, 04 May 2024 12:24:25 +0530
Finished: Sat, 04 May 2024 12:24:29 +0530
Ready: False
Restart Count: 6
You can also monitor the kubectl get pod -n <namespace> output to check the status of the pods.
NAME READY STATUS RESTARTS AGE
working-ssh-0 0/1 CrashLoopBackOff 1 (11s ago) 22s
working-ssh-1 0/1 OOMKilled 1 (14s ago) 20s
As you can see in my case both the pods have single container and both these containers are crashing.
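If you are dealing with many pods, you can also pull just the last termination reason and exit code with a JSONPath query instead of scanning the full describe output (shown here for the pod from my example):
kubectl get pod working-ssh-0 -n deepak -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}'
kubectl get pod working-ssh-0 -n deepak -o jsonpath='{.status.containerStatuses[*].lastState.terminated.exitCode}{"\n"}'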
2. Identify the service causing OOMKilled Error
Now that we know which container is crashing, we need to identify the service that is causing the OOMKilled error. Here things become tricky because, as highlighted earlier, we may not have syslog, so we have to rely on the application logs.
If you have configured some way of forwarding logs and capturing metrics from the container to a third-party service such as Prometheus or Splunk, then you can check the logs on those servers. Otherwise you can use the kubectl logs command to check the logs generated by all the services inside the pod.
The problem is that kubectl logs shows logs from the current session only, so you must use the --previous argument to see the pod logs from the previous session, i.e. before the pod was restarted. This can potentially give you a clue about which service was killed with exit code 137.
kubectl logs <pod-name> -c <container> --previous
Let's check the logs in our case:
2024-05-04 07:08:42,550 DEBG 'stress' stdout output:
2024-05-04 07:08:42 - Allocated 30MB of memory
2024-05-04 07:08:43,709 DEBG fd 7 closed, stopped monitoring <POutputDispatcher at 140209320959616 for <Subprocess at 140209323341640 with name stress in state RUNNING> (stdout)>
2024-05-04 07:08:43,709 DEBG fd 9 closed, stopped monitoring <POutputDispatcher at 140209320959280 for <Subprocess at 140209323341640 with name stress in state RUNNING> (stderr)>
2024-05-04 07:08:43,710 INFO exited: stress (terminated by SIGKILL; not expected)
Here we can see that the process which received SIGKILL was stress, which is the Python script I used to induce memory load.
So now we know the service which caused the container to crash with the OOMKilled error.
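If the metrics-server is installed in your cluster, you can additionally watch the container's memory usage climb towards its limit before the kill, which is a quick way to confirm the culprit:
kubectl top pod -n deepak --containers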
3. Debugging Memory Leaks
This is the part where there is no single answer for every case. Now that you know which service is leaking memory and eating it all up, you have to take a decision:
- Does your application actually need more memory? In that case you should consider increasing the memory limit of the pod. Modify the deployment YAML to adjust resources.limits.memory and resources.requests.memory. You can refer to How to limit Kubernetes resources (CPU & Memory) for more information.
- Is your application leaking memory? Then you have to identify what is causing the leak (see 7 tools to detect Memory Leaks) and optimize the code; a small example of narrowing down allocations in a Python service follows below.
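For Python services like my stress script, the built-in tracemalloc module is a quick first step before reaching for heavier profilers. This is only a sketch of the idea, with a stand-in workload instead of your real code:
import tracemalloc

tracemalloc.start()

# ... exercise the suspected code path here ...
leaky = [' ' * 10 * 1024 * 1024 for _ in range(5)]  # stand-in for your real workload

# Print the top allocation sites by size to see where memory is being retained
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics('lineno')[:5]:
    print(stat)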
4. Implement Horizontal Pod Autoscaling
I am adding this option so that you can consider it as a long-term solution. It is applicable if you have an application which needs more memory at certain times of high demand while at other times the demand is low, and obviously we wouldn't want our pods to be killed during the high-load periods.
In such cases you should consider autoscaling: once a pre-defined memory utilization threshold is reached, new pods are spawned to handle the load, and once the overall memory consumption drops, the additionally spawned pods are automatically removed.
Here is a sample YAML file which can be used to configure horizontal autoscaling. Note that memory-based scaling requires the autoscaling/v2 API, so the CPU and memory targets are expressed as resource metrics:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: your-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
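You can then apply the manifest and keep an eye on the autoscaler's decisions; the file name and namespace below are placeholders:
kubectl apply -f your-hpa.yaml -n <namespace>
kubectl get hpa your-hpa -n <namespace> --watch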
Summary
When you encounter an OOMKilled event with exit code 137 in Kubernetes, it typically indicates that the pod was terminated because it exceeded the memory resources allocated to it. Here’s a summary of the key points regarding this scenario:
- Exit Code 137: This code signifies that a process was forcibly killed with a SIGKILL signal due to memory constraints. The calculation is based on the formula 128 + 9, where 128 indicates a process killed by a signal and 9 is the signal number for SIGKILL.
- OOMKilled: This label is assigned by Kubernetes when the pod is terminated due to the Out Of Memory Killer (OOM Killer), a mechanism in the Linux kernel that kills processes to free up memory and prevent the system from crashing.
The most common cause for a pod receiving an OOMKilled status is that it attempted to use more memory than its configured limit in Kubernetes. Pods can have memory requests, which the scheduler uses to place the pod, and limits, which cap the amount of memory the pod can use. Even if usual memory consumption is within limits, unexpected spikes can cause the pod to exceed its memory allocation temporarily, leading to termination.