What Pod Pending means
A Pending pod has been accepted by the API server but is not yet running; in the case covered here, it has not yet been bound to a node.
The kube-scheduler
runs a two-step filtering then scoring pass; if filtering eliminates every
node, the pod stays Pending and the scheduler records a FailedScheduling event
naming the blocker. So this is almost always a read-the-event problem — but the
event can name several distinct mechanisms, and the fix differs for each.
The scheduler filters on requests, not live usage: a node sitting at 5% CPU can still be "full" if existing pods have reserved its allocatable CPU via their resource requests.
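The arithmetic behind that last point can be sketched in a few lines of shell. This is only an illustration: the function name fits_on_node and the millicore figures are made up for the example, and the real scheduler applies the same kind of check to memory and other resources as well.

```shell
# Hypothetical helper: does a CPU request fit on a node?
# All values are millicores. Live usage is deliberately not an input,
# because the comparison is between requests and allocatable only.
fits_on_node() {
  allocatable_m=$1   # node's allocatable CPU
  reserved_m=$2      # sum of CPU requests of pods already on the node
  request_m=$3       # CPU request of the pending pod
  [ $((allocatable_m - reserved_m)) -ge "$request_m" ]
}

# Example node: 4000m allocatable, 3800m already requested by other pods.
# Even if those pods are nearly idle, a new 500m request does not fit.
if fits_on_node 4000 3800 500; then
  echo "fits"
else
  echo "Insufficient cpu"
fi
```

With these sample numbers only 200m remains unreserved, so the script prints Insufficient cpu regardless of how busy the node actually is.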
Diagnose it
# The FailedScheduling event states the exact blocker, per node
kubectl describe pod <pod> -n <namespace>
# e.g. "0/5 nodes are available: 3 Insufficient cpu, 2 node(s) had untolerated taint ..."
kubectl get events -n <namespace> --sort-by=.lastTimestamp | grep <pod>
# Node-level facts the event hints at
kubectl describe node <node> # Taints, Conditions, Allocatable, Allocated, Non-terminated Pods
kubectl get nodes -o wide # SchedulingDisabled = cordoned
Read the per-node breakdown in the event. Each clause ("Insufficient cpu", "untolerated taint", "didn't match Pod's node affinity/selector", "didn't match pod topology spread constraints", "Too many pods", "node(s) didn't have free ports") maps to one of the mechanisms below.
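To make the per-node breakdown easier to scan, the message can be split into one clause per line. The sketch below assumes the message has the simple shape shown in the example above (a prefix ending in a colon, then comma-separated clauses); split_clauses is a hypothetical helper name, the sample taint is invented, and real messages may carry extra text that this does not handle.

```shell
# Hypothetical helper: read a FailedScheduling message on stdin and
# print one "<count> <reason>" clause per line.
split_clauses() {
  # 1) drop the "N/M nodes are available: " prefix and the trailing period
  # 2) turn the comma separators into newlines
  # 3) trim the leading space left behind by each separator
  sed -e 's/^[^:]*: //' -e 's/\.$//' | tr ',' '\n' | sed -e 's/^ *//'
}

# Sample message with an invented taint, in the format described above.
printf '%s\n' '0/5 nodes are available: 3 Insufficient cpu, 2 node(s) had untolerated taint {dedicated: gpu}.' \
  | split_clauses
```

For the sample message this prints two lines, "3 Insufficient cpu" and "2 node(s) had untolerated taint {dedicated: gpu}", each of which points at one of the sections that follow.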
Insufficient resources (container-level)
No node has enough allocatable CPU/memory to satisfy the pod's
resources.requests. The scheduler uses requests, not limits, to choose a node
(docs).
- Diagnose: event says Insufficient cpu / Insufficient memory. Compare the pod's requests against kubectl describe node → Allocatable minus the Allocated resources already reserved by other pods.
- Fix: lower the pod's requests to fit, or add node capacity. If the request is larger than any single node's allocatable, no node can ever fit it; size it down or use a larger node type.
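One way to put the two sides of that comparison next to each other is a pair of small wrappers around kubectl. The function names node_allocatable and pod_requests are hypothetical; the custom-columns and jsonpath expressions read standard fields of the Node and Pod objects.

```shell
# Hypothetical wrapper: allocatable CPU, memory, and pod slots per node.
node_allocatable() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory,PODS:.status.allocatable.pods'
}

# Hypothetical wrapper: CPU and memory requests of each container in a pod.
# Usage: pod_requests <pod> <namespace>
pod_requests() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.requests.cpu}{"\t"}{.resources.requests.memory}{"\n"}{end}'
}
```

Neither wrapper shows what is already reserved on a node; that figure still comes from the Allocated resources block of kubectl describe node.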
Node-level pressure and cordoning (node-level, not user taints)
Distinct from the user taints below: the control plane auto-taints unhealthy or drained nodes, removing them from the feasible set.
- Diagnose: kubectl describe node shows a Taints: line such as node.kubernetes.io/memory-pressure:NoSchedule, node.kubernetes.io/disk-pressure:NoSchedule, or node.kubernetes.io/unschedulable:NoSchedule (a cordoned node, also shown as SchedulingDisabled in kubectl get nodes). The pressure taints are added automatically while the matching node condition is active, and the unschedulable taint while the node is cordoned (built-in taints).
- Fix: for pressure taints, relieve the underlying condition (free disk/memory on the node); the control plane removes the taint once the kubelet reports the condition as resolved. For a cordoned node, run kubectl uncordon <node> once maintenance is done. Do not add tolerations for pressure taints; that masks a real node problem.
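The two checks above can be wrapped as follows. This is a sketch with hypothetical function names; cordoned_nodes assumes the default table output of kubectl get nodes, where the second column is STATUS.

```shell
# Hypothetical helper: names of cordoned nodes, taken from the STATUS
# column (for example "Ready,SchedulingDisabled").
cordoned_nodes() {
  kubectl get nodes --no-headers | awk '$2 ~ /SchedulingDisabled/ { print $1 }'
}

# Hypothetical helper: one "<ConditionType>=<status>" line per node condition,
# e.g. to see whether MemoryPressure or DiskPressure is currently True.
# Usage: node_conditions <node>
node_conditions() {
  kubectl get node "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"="}{.status}{"\n"}{end}'
}
```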
Taints without matching tolerations (user taints)
An operator tainted eligible nodes (e.g. dedicated GPU or system pools) and the
pod has no matching toleration. A NoSchedule taint blocks new pods that do not
tolerate it (taints and tolerations).
- Diagnose: event says node(s) had untolerated taint {key: value}. Confirm with kubectl describe node → Taints:.
- Fix: add a matching toleration to the pod spec (key, effect, and value/Exists must match), or remove the taint deliberately with kubectl taint nodes <node> <key>:NoSchedule- if it was set in error.
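To compare a node's taints with a pod's tolerations side by side, something like the following may help. The function names are hypothetical; the jsonpath expressions read the standard spec.taints and spec.tolerations fields.

```shell
# Hypothetical helper: one "<key>=<value>:<effect>" line per taint on a node.
# Usage: node_taints <node>
node_taints() {
  kubectl get node "$1" \
    -o jsonpath='{range .spec.taints[*]}{.key}{"="}{.value}{":"}{.effect}{"\n"}{end}'
}

# Hypothetical helper: one "<key> <operator> <value> <effect>" line per toleration.
# Usage: pod_tolerations <pod> <namespace>
pod_tolerations() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .spec.tolerations[*]}{.key}{" "}{.operator}{" "}{.value}{" "}{.effect}{"\n"}{end}'
}
```

A taint that has no corresponding line in the pod's tolerations is the one the event is reporting.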
Node affinity / nodeSelector mismatch
nodeSelector or required nodeAffinity matches no node. Required (hard) affinity
is restrictive — if no node satisfies it, the pod will not schedule at all
(affinity).
- Diagnose: event says didn't match Pod's node affinity/selector. List node labels: kubectl get nodes --show-labels.
- Fix: correct the label expression, label a node to match (kubectl label node <node> <key>=<value>), or relax a hard rule (requiredDuringScheduling...) to a soft one (preferredDuringScheduling...).
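A quick way to test a selector is to ask the API server which nodes carry the label. The sketch below uses hypothetical function names, and disktype=ssd in the usage comment is only a placeholder label.

```shell
# Hypothetical helper: list nodes whose labels match a selector.
# Usage: nodes_matching 'disktype=ssd'
nodes_matching() {
  kubectl get nodes -l "$1" --no-headers
}

# Hypothetical helper: how many nodes match the selector (0 explains the Pending pod).
count_nodes_matching() {
  nodes_matching "$1" 2>/dev/null | wc -l | tr -d ' '
}
```

This covers nodeSelector and simple equality rules; a required nodeAffinity with several matchExpressions needs each expression checked in turn.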
Pod affinity / anti-affinity
Distinct from node affinity: the pod must be co-located with (affinity) or kept apart from (anti-affinity) other pods by topology key. Required anti-affinity is a common cause of "works at 1 replica, Pending at N" — each replica demands a different node (inter-pod affinity).
- Diagnose: event says didn't match pod affinity/anti-affinity rules or didn't satisfy existing pods anti-affinity rules. Check podAntiAffinity and whether enough distinct topology domains (nodes/zones) exist.
- Fix: add nodes in distinct topology domains, or change required anti-affinity to preferredDuringSchedulingIgnoredDuringExecution.
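The "works at 1 replica, Pending at N" pattern comes down to counting topology domains. As a rough sketch, assuming every node carries the label in question and that the label column is the last one in the default table output, the hypothetical helper below counts the distinct domains.

```shell
# Hypothetical helper: number of distinct values of a topology label across nodes.
# Usage: topology_domains kubernetes.io/hostname
#        topology_domains topology.kubernetes.io/zone
topology_domains() {
  # -L appends the label value as an extra, final column
  kubectl get nodes --no-headers -L "$1" | awk '{ print $NF }' | sort -u | wc -l | tr -d ' '
}
```

If a workload uses required anti-affinity against its own replicas, it cannot run more replicas than this number; nodes that are cordoned or tainted reduce the usable count further.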
Pod topology spread constraints
A topologySpreadConstraints entry with whenUnsatisfiable: DoNotSchedule is a
hard limit: if placing the pod would exceed maxSkew across the topology, the
scheduler leaves it Pending (topology spread).
- Diagnose: event says didn't match pod topology spread constraints. Inspect spec.topologySpreadConstraints and how matching pods are already distributed across the topologyKey (e.g. zones).
- Fix: add a node/zone in the under-filled topology domain, raise maxSkew, or switch whenUnsatisfiable to ScheduleAnyway (soft) if even spread is not mandatory.
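The maxSkew rule can be checked by hand with a little arithmetic. The sketch below is a simplification: it assumes the pending pod matches the constraint's own label selector, treats every listed domain as eligible, and ignores options such as minDomains. The function name spread_allows is hypothetical.

```shell
# Hypothetical helper: would one more matching pod in a domain respect maxSkew?
# Usage: spread_allows <maxSkew> <pods-in-candidate-domain> <pods-in-each-domain...>
spread_allows() {
  max_skew=$1
  candidate=$2
  shift 2
  # Find the smallest number of matching pods in any domain.
  min=$1
  for count in "$@"; do
    if [ "$count" -lt "$min" ]; then
      min=$count
    fi
  done
  # Skew after placement = pods in the candidate domain (plus the new one) minus the minimum.
  [ $((candidate + 1 - min)) -le "$max_skew" ]
}

# Zones currently hold 2, 2, and 1 matching pods, with maxSkew 1.
spread_allows 1 2 2 2 1 || echo "a zone with 2 pods is rejected"
spread_allows 1 1 2 2 1 && echo "the zone with 1 pod is accepted"
```

In this example the pod can only go to the zone holding one pod; if that zone has no schedulable node, the pod stays Pending even though the other zones have room.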
hostPort conflict
If the pod requests a hostPort, only one pod per node can hold that port. Once a
node has it bound, that node is filtered out (scheduler).
- Diagnose: event says node(s) didn't have free ports for the requested pod ports.
- Fix: remove the hostPort (use a Service instead), change the port, or reduce the pod to one replica per node (e.g. a DaemonSet pattern).
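To see which pods already hold a hostPort on which node, a listing along these lines may be useful. It is a sketch under the assumption that kubectl prints an empty field for pods that declare no hostPort; host_port_pods is a hypothetical name.

```shell
# Hypothetical helper: tab-separated "node, namespace/pod, hostPort(s)" for
# every pod that declares at least one hostPort.
host_port_pods() {
  kubectl get pods --all-namespaces \
    -o jsonpath='{range .items[*]}{.spec.nodeName}{"\t"}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.containers[*].ports[*].hostPort}{"\n"}{end}' \
    | awk -F '\t' '$3 != ""'
}
```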
Max pods per node / Pod CIDR exhaustion
Every pod needs an IP from the node's Pod CIDR, and the kubelet enforces a max-pods ceiling (the upstream default is 110 pods per node). A node can hit this even with idle CPU and memory (Pending pods debugging). On managed clusters the CIDR sizing is tied to this limit (GKE max Pods per node).
- Diagnose: event says Too many pods. kubectl describe node shows Non-terminated Pods near the node's pods capacity.
- Fix: add nodes so pods spread out, or raise max-pods / the per-node Pod CIDR size (a node-pool / kubelet setting on managed platforms; it changes IP allocation, so plan it).
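The same comparison can be made without reading the full describe output. The helpers below are a sketch with hypothetical names; they rely on the spec.nodeName and status.phase field selectors for pods and on the node's status.allocatable.pods field.

```shell
# Hypothetical helper: number of non-terminated pods currently assigned to a node.
# Usage: pods_on_node <node>
pods_on_node() {
  kubectl get pods --all-namespaces --no-headers \
    --field-selector "spec.nodeName=$1,status.phase!=Succeeded,status.phase!=Failed" \
    | wc -l | tr -d ' '
}

# Hypothetical helper: the node's pod capacity as seen by the scheduler.
# Usage: node_pod_capacity <node>
node_pod_capacity() {
  kubectl get node "$1" -o jsonpath='{.status.allocatable.pods}'
}
```

When the two numbers are equal, the node is full for scheduling purposes no matter how much CPU or memory is free.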
No cluster autoscaling (cloud clusters)
When every node is full and the cluster cannot grow, the pod stays Pending indefinitely. On managed clusters this is the autoscaler's job; if it is absent or cannot add the right node type, nothing happens.
- Diagnose: Insufficient cpu/memory on all nodes with no new nodes appearing. Check the Cluster Autoscaler / Karpenter logs and events for scale-up decisions or limits (Karpenter scheduling).
- Fix: confirm an autoscaler is installed and its node pools/provisioners can satisfy the pod's requests, taints, and affinity (an autoscaler will not add a node a pod could not schedule onto anyway). Otherwise add capacity manually.
Unbound PersistentVolumeClaim
The pod mounts a PVC that cannot bind. With volumeBindingMode: Immediate the PVC
must bind before scheduling; with WaitForFirstConsumer binding is deferred until
scheduling, so a provisioning failure surfaces as a Pending pod
(persistent volumes).
- Diagnose: kubectl get pvc -n <namespace> shows Pending; the pod event mentions unbound or unschedulable volumes. kubectl describe pvc <pvc> shows the provisioner error.
- Fix: ensure the named StorageClass exists and its provisioner works, that requested capacity is available, and (for zonal volumes) that a node in the volume's zone is schedulable.
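Both halves of that check can be scripted. The sketch below uses hypothetical function names and assumes the default table output of kubectl get pvc, where the second column is STATUS.

```shell
# Hypothetical helper: names of PVCs in a namespace that are still Pending.
# Usage: pending_pvcs <namespace>
pending_pvcs() {
  kubectl get pvc -n "$1" --no-headers | awk '$2 == "Pending" { print $1 }'
}

# Hypothetical helper: each StorageClass with its provisioner and binding mode,
# to tell Immediate classes apart from WaitForFirstConsumer ones.
storage_class_binding_modes() {
  kubectl get storageclass \
    -o custom-columns='NAME:.metadata.name,PROVISIONER:.provisioner,BINDINGMODE:.volumeBindingMode'
}
```

A Pending claim whose class uses WaitForFirstConsumer is expected until the pod schedules; a Pending claim on an Immediate class points at the provisioner.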
How Intellira diagnoses this
Intellira reads the scheduling events and node state read-only, parses the
per-node FailedScheduling breakdown into the specific mechanism (resources,
taint, affinity, topology spread, hostPort, max-pods, or PVC), and ties a sudden
Pending state to the change that introduced it — a new resource request, an added
affinity or spread rule, a taint, or a StorageClass change — with the evidence
behind it.
Sources
- Kubernetes scheduler
- Managing resources for containers (requests vs limits)
- Taints and tolerations
- Assigning pods to nodes (affinity / anti-affinity)
- Pod topology spread constraints
- Persistent volumes
- GKE: configure maximum Pods per node
- Datadog: debug Kubernetes pending pods
- Karpenter scheduling
By Intellira Engineering. AI-assisted draft, reviewed by the Intellira engineering team; claims cited inline; last verified 2026-06-02.