Start from the error you actually hit.
Each guide explains the failure and walks the causal chain back to the change that caused it. Guides are grouped by the system that surfaced the problem.
Kubernetes
ContainerCreating
A Pod stuck in ContainerCreating is waiting on a volume, image pull, secret, or network attachment. Here is how to find which one and fix it.
CrashLoopBackOff
A pod in CrashLoopBackOff keeps crashing and restarting. Read the real reason — app error, failed probe, init container, or exit 0 — and fix it.
CreateContainerConfigError
Why a Pod is stuck in CreateContainerConfigError and how to fix the missing ConfigMap, Secret, or key reference behind it.
DNSResolution
Pods cannot resolve Service or external names. Six causes diagnosed and fixed: CoreDNS down, dnsPolicy, ndots, NetworkPolicy, resolv.conf loops, name forms.
ImagePullBackOff
ImagePullBackOff means the kubelet cannot pull the image. Read the exact pull error and fix it: wrong tag, auth, rate limit, wrong arch, or pull policy.
NetworkPolicyBlocked
Connections time out or are refused because a NetworkPolicy denies them. How to confirm isolation, find the missing allow rule, and unblock DNS.
NodeNotReady
A NotReady node has stopped reporting a healthy status to the control plane. Its pods get evicted and rescheduled. Here is how to find why the kubelet went unhealthy.
OOMKilled
OOMKilled (exit code 137) means a container exceeded its memory limit and the kernel killed it. Here is how to confirm it and fix the real cause.
PodPending
A pod stuck in Pending was not scheduled. The FailedScheduling event says why — resources, taints, affinity, topology spread, or an unbound volume.
PVCPending
A PVC stuck in Pending has no volume bound to it: missing StorageClass, WaitForFirstConsumer, no provisioner, no matching PV, or access-mode mismatch.
ServiceNoEndpoints
A Service with no endpoints returns connection refused or timeouts. Causes: selector mismatch, unready pods, port mismatch, or no running pods.
ArgoCD
Degraded
A Degraded ArgoCD app synced fine but a resource is unhealthy. Health — not sync — names the failing resource, and the app inherits the worst child health.
OutOfSync
An ArgoCD app shows OutOfSync when live cluster state deviates from the target state in Git. How to read the diff, find the real source of drift, and fix it.
ProgressingStuck
An ArgoCD app stuck Progressing is moving toward Healthy but never arrives. Sync status and health status are orthogonal — fix the workload, not the sync.
SyncFailed
When an ArgoCD sync fails, either the manifests never rendered or the apply errored. Read the operation message to find which, then the failing resource.
Jenkins
AgentOffline
A Jenkins agent shows offline and will not connect. Separate transport, secret, JDK, host, label, and cloud causes — then diagnose and fix each.
BuildFailure
A Jenkins build that fails at a stage needs the right log, not the whole console. Find the failing stage, tell FAILURE from UNSTABLE, and trace the change.
ExecutorStarvation
Jobs sit in the queue waiting for an executor. Separate true starvation from label mismatch, blocked builds, throttling, and flyweight waits — then clear it.