Intellira
Kubernetes · high severity

CrashLoopBackOff

A pod in CrashLoopBackOff keeps crashing and restarting. Read the real reason — app error, failed probe, init container, or exit 0 — and fix it.

Written by Intellira Engineering, Editorial team

What CrashLoopBackOff means

CrashLoopBackOff is not the error itself: it is the kubelet telling you that a container keeps exiting and being restarted, and that a back-off delay between restart attempts is now in effect. Kubernetes documents that the status appears "when a Pod is failing to start repeatedly" (Pod Lifecycle). Note that this is a value of the Status field that kubectl displays for human readers, not the Pod phase, which is a separate part of the API (Pod Lifecycle).

The back-off is deliberate. Per the official docs, after a container exits the kubelet restarts it with an exponential delay (10s, 20s, 40s, …) capped at 300 seconds (5 minutes); once a container has run for 10 minutes without problems the kubelet resets the back-off timer for that container (Pod Lifecycle). So a high RESTARTS count with long gaps is expected, not a second bug — the real reason is in the previous container's logs and its termination state.
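
The schedule described above can be sketched as a small shell function. This illustrates the documented doubling-and-cap arithmetic only; it is not the kubelet's implementation. The function name backoff_delay and the numbering of restart attempts from 1 are assumptions made for the example.

```shell
# backoff_delay N: print the delay in seconds before restart attempt N,
# assuming attempt 1 waits 10s, each later attempt doubles, capped at 300s.
# (Hypothetical helper for illustration; not part of kubectl or the kubelet.)
backoff_delay() {
  attempt=$1
  delay=10
  i=1
  # Double the delay once per earlier attempt; stop once the cap is passed.
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt 300 ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  if [ "$delay" -gt 300 ]; then
    delay=300
  fi
  echo "$delay"
}

for n in 1 2 3 4 5 6 7; do
  echo "attempt $n: wait $(backoff_delay "$n")s"
done
```

Under these assumptions the delay reaches the 300-second cap at the sixth attempt, which is why a long-running crash loop settles into restarts roughly five minutes apart.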

The decider: restartPolicy

Whether you ever see CrashLoopBackOff depends on the Pod's restartPolicy. With Always (the default, and the only value a Deployment's Pod template accepts) and OnFailure, the kubelet keeps restarting the container and the loop produces the back-off. With Never, the container is not restarted and a crashed Pod is marked Failed instead, so if you expect a crash loop but see a terminal failure, check restartPolicy first (Pod Lifecycle). OnFailure restarts only on a non-zero exit, which is why a container that exits 0 will not crash-loop under OnFailure but will under Always.
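
The restartPolicy rules above reduce to a small decision table. The sketch below encodes it as a shell function; the name will_restart and its output strings are hypothetical, chosen only for this example.

```shell
# will_restart POLICY EXIT_CODE: print whether the kubelet would restart a
# container that exited with EXIT_CODE under the given Pod restartPolicy.
# (Hypothetical helper; it mirrors the rules in the text, not kubelet code.)
will_restart() {
  policy=$1
  exit_code=$2
  case "$policy" in
    Always)
      echo "restart" ;;
    OnFailure)
      # Only a non-zero exit counts as a failure.
      if [ "$exit_code" -ne 0 ]; then
        echo "restart"
      else
        echo "no-restart"
      fi ;;
    Never)
      echo "no-restart" ;;
    *)
      echo "unknown-policy"
      return 1 ;;
  esac
}

for policy in Always OnFailure Never; do
  for code in 0 1; do
    echo "$policy, exit $code: $(will_restart "$policy" "$code")"
  done
done
```

Only the combinations that print "restart" can ever lead to CrashLoopBackOff.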

Diagnose it

Read the logs from the crashed instance, not the current one. The --previous flag returns logs from a previously terminated container instance (Debug Running Pods):

kubectl logs <pod> -n <namespace> --previous
kubectl describe pod <pod> -n <namespace>
# Under "Last State": Reason and Exit Code tell you the class of failure
kubectl get pod <pod> -n <namespace> -o yaml

kubectl describe pod surfaces the Pod events and the container's Last State; -o yaml gives the full status (including .status.initContainerStatuses) when describe is not enough (Debug Running Pods).

Use the Exit Code and Reason to pick the matching mechanism below.
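
As a reading aid for the Exit Code field, the sketch below applies the common convention that a code above 128 means the process was terminated by signal (code minus 128). The function name describe_exit_code is hypothetical, and the convention is an assumption about how the exit status was reported; it does not replace the Reason field.

```shell
# describe_exit_code CODE: print a rough interpretation of a container exit
# code, assuming the 128 + signal-number convention for signal deaths.
# (Hypothetical helper for illustration only.)
describe_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "exited cleanly"
  elif [ "$code" -gt 128 ]; then
    echo "terminated by signal $((code - 128))"
  else
    echo "application exit status $code"
  fi
}

for c in 0 1 137 143; do
  echo "exit $c: $(describe_exit_code "$c")"
done
```

With this convention, 137 maps to signal 9 (SIGKILL) and 143 maps to signal 15 (SIGTERM), the two values discussed in the causes below.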

Causes, each end to end

1. Application error on startup (non-zero exit)

The process exits non-zero: bad config, unreachable dependency, failed migration, wrong entrypoint, missing permission, or a port conflict. This is the catch-all class and is documented as the most common (GKE — Troubleshoot CrashLoopBackOff).

  • Diagnose: Last State: Terminated with a non-zero Exit Code (often 1). kubectl logs <pod> --previous shows the stack trace or error line.
  • Fix: correct the failing call shown in the previous logs — fix the config value, restore the dependency, or repair the command/entrypoint, then redeploy. Beware a moving :latest image tag silently pulling a breaking change (GKE — Troubleshoot CrashLoopBackOff).

2. OOMKilled (exit 137)

The container exceeded its memory limit and was killed by the kernel OOM killer.

  • Diagnose: Last State: Terminated, Reason: OOMKilled, Exit Code: 137 (128 + SIGKILL 9) (GKE — Troubleshoot CrashLoopBackOff).
  • Fix: raise resources.limits.memory if the workload genuinely needs it, or fix the leak / unbounded buffer. See the dedicated OOMKilled page.

3. Missing config or secret

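The container starts, but a ConfigMap or Secret value the app needs is absent (for example, a key missing from a mounted volume, or a reference marked optional), so the app aborts at boot. (A missing object reference that blocks container creation surfaces as CreateContainerConfigError, not a loop; see that page.)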

  • Diagnose: previous logs show "key not found" / "no such file"; describe shows the env or volume reference.
  • Fix: create or correct the referenced ConfigMap/Secret key and redeploy.

4. Failing liveness probe

A liveness probe failure makes the kubelet kill and restart the container; if it never passes, you get a loop (Configure Probes). The kubelet kills after failureThreshold consecutive failures (default 3), with periodSeconds default 10 and timeoutSeconds default 1 (Configure Probes).

  • Diagnose: events show Liveness probe failed then Container ... failed liveness probe, will be restarted (Configure Probes). The app itself may log nothing wrong, because the kill is external. Watch for exit 143 (128 + SIGTERM 15) when the kubelet terminates a "healthy" container, or exit 137 if the process does not stop within the termination grace period and is then killed with SIGKILL.
  • Fix: correct the probe path/port/command if it is wrong; if the app is just slow to boot, raise initialDelaySeconds/failureThreshold, or better, add a startupProbe that holds off liveness checks until the container is up (Configure Probes).
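
To get a feel for the probe settings above, a rough budget is failureThreshold multiplied by periodSeconds. The sketch below is an approximation that ignores initialDelaySeconds and timeoutSeconds; the function name probe_window and the startupProbe values 30 and 10 are assumptions for illustration, not recommendations.

```shell
# probe_window FAILURE_THRESHOLD PERIOD_SECONDS: approximate number of seconds
# of consecutive probe failures tolerated before the kubelet acts.
# (Hypothetical helper; ignores initialDelaySeconds and timeoutSeconds.)
probe_window() {
  echo $(( $1 * $2 ))
}

echo "liveness with defaults (3 x 10s): about $(probe_window 3 10)s"
echo "startupProbe with assumed 30 x 10s: up to $(probe_window 30 10)s to start"
```

With the liveness defaults, an app that needs more than about half a minute to become healthy is at risk of being killed during boot, which is the case a startupProbe with a larger budget is meant to cover.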

5. Crashing init container (Init:CrashLoopBackOff)

If an init container keeps failing, the main container never starts and the status reads Init:CrashLoopBackOff. The kubelet repeatedly restarts a failed init container until it succeeds; under a Pod restartPolicy of Always, init containers run with OnFailure (Init Containers). With restartPolicy: Never, a failed init container fails the whole Pod (Init Containers).

  • Diagnose: status shows Init:... (e.g. Init:CrashLoopBackOff or Init:0/2). Read the init container's logs by name, not the app container:
    kubectl logs <pod> -c <init-container-name> -n <namespace>
    kubectl get pod <pod> -n <namespace> -o yaml  # .status.initContainerStatuses
    
  • Fix: resolve what the init step waits on (DB reachable, migration applied, volume populated). The app container will start once the init step exits 0.

6. Container exits 0 with no long-running process

The process completes and exits 0, but a Deployment expects it to keep running, so the kubelet restarts it — repeatedly (GKE — Troubleshoot CrashLoopBackOff). Common when the command is a one-shot script, a worker that exits on an empty queue, or a shell that runs to completion.

  • Diagnose: Last State: Terminated, Reason: Completed, Exit Code: 0, yet the Pod keeps restarting (GKE — Troubleshoot CrashLoopBackOff).
  • Fix: make the command a long-running foreground process; or, if it is truly a run-to-completion task, model it as a Job/CronJob (which expects exit 0), not a Deployment.

Fix it — order of operations

  1. kubectl logs <pod> --previous: read the actual error from the crashed run.
  2. kubectl describe pod <pod>: read the Last State Reason and Exit Code to pick the mechanism above. An Init: prefix in the status means the failing container is an init container (case 5).
  3. Exit 137 / OOMKilled → case 2. Exit 0 / Completed while restarting → case 6.
  4. Events show Liveness probe failed → case 4 (probe, not the app).
  5. Otherwise read the previous logs for the failing call → case 1 or 3.
  6. Check restartPolicy if the behavior (loop vs terminal failure) is not what you expect.
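
The triage order above can be written down as a small classifier. This is a sketch under stated assumptions: the function name classify_crash and its four arguments (the Status string, the Exit Code, the Reason, and a yes/no flag for whether events show a liveness failure) are hypothetical inputs you would copy by hand from kubectl describe, not something kubectl produces in this form.

```shell
# classify_crash STATUS EXIT_CODE REASON LIVENESS_FAILED
# Print which cause from this page best matches the observed values.
# (Hypothetical helper; inputs are transcribed manually from kubectl describe.)
classify_crash() {
  status=$1
  exit_code=$2
  reason=$3
  liveness_failed=$4

  # An Init: prefix in the status points at an init container first.
  case "$status" in
    Init:*)
      echo "case 5: crashing init container"
      return ;;
  esac

  if [ "$reason" = "OOMKilled" ]; then
    echo "case 2: OOMKilled"
  elif [ "$exit_code" -eq 0 ] && [ "$reason" = "Completed" ]; then
    echo "case 6: container exits 0"
  elif [ "$liveness_failed" = "yes" ]; then
    echo "case 4: failing liveness probe"
  else
    echo "case 1 or 3: read the previous logs"
  fi
}

classify_crash "CrashLoopBackOff" 137 "OOMKilled" "no"
classify_crash "Init:CrashLoopBackOff" 1 "Error" "no"
classify_crash "CrashLoopBackOff" 0 "Completed" "no"
classify_crash "CrashLoopBackOff" 143 "Error" "yes"
classify_crash "CrashLoopBackOff" 1 "Error" "no"
```

The sketch keys case 2 on the Reason rather than on exit code 137 alone, since a container killed with SIGKILL for another reason would report the same code.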

How Intellira diagnoses this

Intellira reads the previous logs and termination state read-only, classifies the failure by its exit code, Reason, and whether the failing container is an init container, and walks the causality chain back to the commit, build, or sync that introduced it — so the answer is "commit X broke startup," with evidence, not just "the pod is crash-looping."

Sources

By Intellira Engineering. AI-assisted draft, reviewed by the Intellira engineering team; claims cited inline; last verified 2026-06-02.

Frequently asked questions

What does CrashLoopBackOff mean?
The container repeatedly starts, crashes, and is restarted by the kubelet with an exponential back-off delay. It is a symptom — the real cause is in the previous container logs and the termination reason.
Is CrashLoopBackOff the same as OOMKilled?
No. OOMKilled is one possible cause of CrashLoopBackOff, shown as Reason: OOMKilled with exit code 137 in kubectl describe. Other causes include a failed command, a missing config or secret, a failing liveness probe, a crashing init container, or a container that exits 0 with no long-running process.
Why does the container restart count keep climbing with long gaps?
That is the back-off working as designed. The kubelet restarts a failed container with an exponential delay (10s, 20s, 40s, …) capped at 5 minutes, and resets the timer after the container runs cleanly for 10 minutes. A high restart count with long gaps is expected, not a second bug.
