What OOMKilled means
The container exceeded its resources.limits.memory and the kernel's
out-of-memory killer terminated it, recorded as Reason: OOMKilled with exit
code 137. The official docs are explicit: "a Container is not allowed to use more
than its memory limit. If a Container allocates more memory than its limit, the
Container becomes a candidate for termination … If the Container continues to
consume memory beyond its limit, the Container is terminated"
(Assign Memory Resources).
It is almost always the container hitting its own limit — not the node running
out of memory.
Enforcement is reactive, not a hard ceiling: "terminations only happen when
the kernel detects memory pressure … A container may use more memory than its
memory limit, but if it does, it may get killed"
(Resource Management).
That's why a leak can run over the limit for a while, then die seemingly at
random when pressure hits.
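The exit code 137 mentioned above follows the common convention of 128 plus the signal number, and SIGKILL is signal 9. A minimal Python sketch of that arithmetic, assuming a POSIX host; the helper name `signal_from_exit_code` is made up for illustration and is not part of any Kubernetes tooling:

```python
import signal
from typing import Optional


def signal_from_exit_code(exit_code: int) -> Optional[str]:
    """Return the signal name encoded in a 128+N exit code, or None.

    Hypothetical helper; relies on the 128+signal convention used by
    shells and container runtimes.
    """
    if exit_code <= 128:
        return None  # ordinary exit status, no signal encoded
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        return None  # not a known signal number on this platform


print(signal_from_exit_code(137))  # 137 - 128 = 9, i.e. SIGKILL on Linux
```

An exit code of 137 on its own only says the process received SIGKILL; it is the accompanying Reason: OOMKilled that ties the kill to memory.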
Two ways a pod gets killed — and who dies first
There are two distinct mechanisms, and they have different victims:
- Your own limit: the container exceeds limits.memory and is OOMKilled (above). The victim is that container.
- Node memory pressure: the whole node runs low and the kubelet proactively evicts pods to reclaim memory ("the process by which the kubelet proactively terminates pods to reclaim resource on nodes", Node-pressure Eviction). Here the victim may not be the heaviest user; it's chosen by QoS class.
Under node pressure, "Kubernetes will first evict BestEffort Pods … followed
by Burstable and finally Guaranteed Pods", and "only Pods exceeding resource
requests are candidates for eviction"
(QoS Classes):
- Guaranteed: every container has memory and CPU requests == limits. "Least likely to face eviction." Use this for critical workloads.
- Burstable: has some requests/limits but isn't Guaranteed. Evicted after BestEffort, and only when over its requests.
- BestEffort: no requests or limits at all. Killed first under node pressure, even if it wasn't the pod that caused the problem.
Practical consequence: a pod with no requests/limits is the first casualty when a
node is squeezed. Set requests == limits (Guaranteed) on anything you can't
afford to lose, so it's evicted last.
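The three QoS rules above can be sketched as a small Python function over a pod spec held as a plain dict (the shape `kubectl get pod -o json` returns under `spec`). This is an approximation under stated assumptions, not the kubelet's implementation: the function name `qos_class` is hypothetical, only `spec.containers` is inspected, and quantities are compared as strings.

```python
def qos_class(pod_spec: dict) -> str:
    """Approximate a pod's QoS class from its spec (a plain dict).

    Simplifications of this sketch: init containers are ignored, and
    quantities are compared as strings, so "1Gi" and "1024Mi" would
    wrongly count as different values.
    """
    any_set = False          # does any container set a cpu/memory request or limit?
    all_guaranteed = True    # does every container have request == limit for both?
    for container in pod_spec.get("containers", []):
        resources = container.get("resources") or {}
        requests = resources.get("requests") or {}
        limits = resources.get("limits") or {}
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # When only a limit is set, the request defaults to the limit.
            request = requests.get(name, limit)
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


critical = {"containers": [{"resources": {
    "requests": {"cpu": "500m", "memory": "256Mi"},
    "limits": {"cpu": "500m", "memory": "256Mi"},
}}]}
print(qos_class(critical))  # Guaranteed
```

On a live cluster the authoritative value is the pod's `status.qosClass` field; the sketch is only meant to make the rules concrete before a manifest is applied.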
Diagnose it
kubectl describe pod <pod> -n <namespace>
# Last State: Terminated
# Reason: OOMKilled
# Exit Code: 137
kubectl top pod <pod> -n <namespace> --containers
# Compare MEMORY against the container's limits.memory
Is it your limit, or the node? The tell is in the status:
- Container Last State: Terminated, Reason: OOMKilled (exit 137) → it hit a memory limit (the kernel OOM-killed it).
- Pod status.reason: Evicted, phase: Failed → the kubelet evicted it under node memory pressure: "the kubelet sets the phase for the selected pods to Failed, and terminates the Pod" (Node-pressure Eviction).
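The same two tells can be checked programmatically. The sketch below takes a pod object as a dict (for example, parsed from `kubectl get pod <pod> -o json`) and reports which mechanism applies. The function name and its return strings are hypothetical; the field names are those of the Pod status API.

```python
def why_did_it_die(pod: dict) -> str:
    """Classify a pod as 'evicted', 'oomkilled:<container>' or 'unknown'.

    Hypothetical helper: it only looks at the two signals described
    above and ignores every other failure reason.
    """
    status = pod.get("status") or {}
    # Kubelet eviction is recorded at the pod level.
    if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
        return "evicted"
    # A kernel OOM kill is recorded per container, usually in lastState
    # after a restart, or in state if the container has not restarted.
    for cs in status.get("containerStatuses") or []:
        for key in ("lastState", "state"):
            terminated = (cs.get(key) or {}).get("terminated")
            if terminated and terminated.get("reason") == "OOMKilled":
                return "oomkilled:" + str(cs.get("name"))
    return "unknown"


pod = {"status": {"phase": "Running", "containerStatuses": [
    {"name": "app", "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
]}}
print(why_did_it_die(pod))  # oomkilled:app
```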
# Is the NODE under memory pressure?
kubectl describe node <node> | grep -iA5 conditions # MemoryPressure: True?
kubectl get events -A --field-selector reason=Evicted
kubectl top nodes # which node is hot
kubectl top pods -A --sort-by=memory # the noisy neighbour on it
The kubelet evicts before the node truly runs out, so this usually shows up as
a kubelet eviction (reason Evicted) rather than a kernel OOM kill, unless
eviction can't keep up. The signal is memory.available, and the default hard
eviction threshold on Linux is memory.available<100Mi (Node-pressure Eviction).
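To make the threshold concrete: the eviction docs derive memory.available as node capacity minus the node's working set. A minimal sketch of the comparison follows, with hypothetical function names and a deliberately small quantity parser that handles only plain integers and the Ki/Mi/Gi/Ti suffixes:

```python
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_binary_quantity(quantity: str) -> int:
    """Convert a quantity such as '100Mi' to bytes.

    Sketch only: decimal suffixes (M, G) and fractional values that
    Kubernetes also accepts are not handled here.
    """
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def under_hard_eviction(capacity_bytes: int, working_set_bytes: int,
                        threshold: str = "100Mi") -> bool:
    """True when memory.available has fallen below the hard threshold."""
    available = capacity_bytes - working_set_bytes
    return available < parse_binary_quantity(threshold)


node_capacity = 8 * 2**30                  # an 8Gi node (illustrative figure)
working_set = node_capacity - 50 * 2**20   # only 50Mi left
print(under_hard_eviction(node_capacity, working_set))  # True
```

The default is 100Mi, but the threshold is configurable per kubelet, so the `threshold` argument should match the node's actual configuration.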
Common causes
- Limit set too low for the workload's real usage.
- A memory leak — steady climb to the limit, killed, restart, repeat.
- A cache/heap sized in absolute terms while the limit was left unchanged — a classic deploy-induced OOM.
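One rough way to tell a leak from a limit that is simply too low is to fit a line through recent working-set samples and project when it would cross the limit. The sketch below is a heuristic under stated assumptions, not a diagnostic method from the Kubernetes docs: the function name is hypothetical, the samples are assumed to come from whatever metrics source is available, and a straight-line fit ignores garbage-collection sawtooth patterns.

```python
from typing import Optional, Sequence, Tuple


def seconds_until_limit(samples: Sequence[Tuple[float, float]],
                        limit_bytes: float) -> Optional[float]:
    """Project seconds until usage reaches the limit via a least-squares line.

    samples are (timestamp_seconds, working_set_bytes) pairs in time order.
    Returns None when there are too few samples or usage is not rising.
    """
    if len(samples) < 2:
        return None
    mean_t = sum(t for t, _ in samples) / len(samples)
    mean_m = sum(m for _, m in samples) / len(samples)
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        return None  # all samples share one timestamp
    slope = sum((t - mean_t) * (m - mean_m) for t, m in samples) / denom
    if slope <= 0:
        return None  # flat or shrinking: does not look like a leak
    latest = samples[-1][1]
    return max(0.0, (limit_bytes - latest) / slope)


mi = 2**20
samples = [(0, 100 * mi), (60, 110 * mi), (120, 120 * mi)]  # +10Mi per minute
print(seconds_until_limit(samples, 200 * mi))
```

A steady positive projection across restarts points to a leak; a flat line that sits just under the limit points to a limit set too low.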
Fix it
If it's the container's own limit:
- Confirm Reason: OOMKilled and the real working-set size.
- If usage is legitimately higher than the limit, raise limits.memory (and requests) with headroom.
- If usage climbs without bound, profile and fix the leak; a higher limit only delays the kill.
- If a recent change set a cache/heap size, align the limit with it or revert.
If it's node memory pressure:
- Find the noisy neighbour: list the pods on that node with kubectl get pods -A --field-selector spec.nodeName=<node>, compare them against kubectl top pods -A --sort-by=memory, and right-size or move the heaviest one.
- Set honest requests everywhere so the scheduler stops overcommitting the node (only pods over their requests are eviction candidates).
- Make critical workloads Guaranteed (requests == limits) and/or give them a higher PriorityClass; the kubelet evicts lower-priority, over-request pods first.
- Add node capacity or enable the Cluster Autoscaler so pressure has an escape valve.
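For the "raise the limit with headroom" step, the arithmetic can be sketched as below. The 25% default is an illustrative assumption, not a figure from the Kubernetes docs, and the function name is hypothetical; pick headroom based on how spiky the workload is.

```python
import math


def suggest_memory_limit_mi(peak_working_set_bytes: int,
                            headroom: float = 0.25) -> str:
    """Return a limits.memory value: observed peak plus headroom, rounded up to Mi.

    headroom=0.25 is an assumed default for illustration only.
    """
    mi = 2**20
    target_bytes = peak_working_set_bytes * (1 + headroom)
    return str(math.ceil(target_bytes / mi)) + "Mi"


# A container that peaked at 400Mi gets a 500Mi limit with 25% headroom.
print(suggest_memory_limit_mi(400 * 2**20))  # 500Mi
```

For a Guaranteed pod the same value would go into both requests.memory and limits.memory.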
For the long-form version, see the guide: Kubernetes OOMKilled: a complete debugging guide.
How Intellira diagnoses this
Intellira reads the termination state and working set read-only and walks the causality chain — flagging the ArgoCD sync, Jenkins build or Bitbucket commit that changed a cache or heap setting while the limit stayed put — and names the file to fix, with evidence.
Sources
- Kubernetes — Assign Memory Resources to Containers and Pods
- Kubernetes — Resource Management for Pods and Containers
- Kubernetes — Pod Quality of Service Classes
- Kubernetes — Node-pressure Eviction
By Intellira Engineering. AI-assisted draft, reviewed by the Intellira engineering team; claims cited inline; last verified 2026-06-02.