My agent is offline — is that the same as executor starvation?

No. Offline means the agent cannot connect to the controller or keep the remoting channel open, so its executors leave the pool entirely. Executor starvation means agents are online but every executor is busy, or no online agent carries the label the job needs. Check Manage Jenkins → Nodes: a red x is offline (this page); all executors green-but-busy is starvation. They are diagnosed and fixed differently.

Why does the agent connect, then immediately drop?

The two common causes are a JDK mismatch and a stale secret. Since Jenkins 2.357 both the controller and the agent JVM must run Java 11 or newer; a controller on Java 11 with an agent on Java 8 fails with UnsupportedClassVersionError (class file version 55.0 vs 52.0). A wrong or regenerated secret is rejected during the handshake. Read the agent launch log: the error names which one.

Do I need to open the TCP agent port?

Only for inbound TCP agents. Since Jenkins 2.0 the inbound TCP port is disabled by default (Docker images expose 50000). As of Jenkins 2.217 inbound agents can use WebSocket transport instead, which needs no extra TCP port and no special firewall rule — it rides the existing HTTP(S) port. WebSocket is the simpler choice behind a reverse proxy.

AgentOffline — Jenkins

A Jenkins agent goes offline when it cannot open or hold the remoting channel back to the controller. The fix depends on which link in that chain broke: the transport (TCP agent port closed, wrong controller URL, a reverse proxy not forwarding the agent port), the secret (wrong or regenerated, so the handshake is refused), the JDK (controller and agent on incompatible Java versions), the agent host (process died, remote root disk full, network lost), or a cloud/Kubernetes pod that never provisioned. A job pinned to a label whose only agent is offline is a downstream symptom of the same failure. Open Manage Jenkins → Nodes, click the offline agent, and read its launch log; the error string usually points at one of these. This is distinct from executor starvation, where agents are online but every executor is busy; see executor starvation.

Offline vs executor-starved (get this right first)

These look similar in the queue and are fixed in opposite ways:

Offline (this page). The agent cannot connect or keep its channel open. Its executors are removed from the pool. Manage Jenkins → Nodes shows a red x. You fix the connection — transport, secret, JDK, host, or provisioning.
Executor-starved. Agents are online and reachable, but all executors are busy, or no online agent carries the job's label. You fix capacity or labels, not the connection. See executor starvation.

The single tell: a red, disconnected node icon is offline; a green node whose slots are all occupied is starvation.
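As a rough sketch of that distinction, the function below classifies one node from three values. It assumes you have already obtained them from the controller yourself (for instance the offline flag and executor count that the node API reports, plus a busy count you tallied). The function name and argument order are illustrative and not part of Jenkins.

```shell
# classify_node OFFLINE BUSY TOTAL
# OFFLINE is "true" or "false"; BUSY and TOTAL are executor counts.
# All three inputs are assumed to come from your own query of the controller.
classify_node() {
  offline=$1
  busy=$2
  total=$3
  if [ "$offline" = "true" ]; then
    echo "offline"      # channel down: fix the connection
  elif [ "$total" -gt 0 ] && [ "$busy" -ge "$total" ]; then
    echo "starved"      # channel up, no free slot: fix capacity or labels
  else
    echo "available"
  fi
}

classify_node true 0 2     # prints: offline
classify_node false 2 2    # prints: starved
```

The offline check comes first on purpose: an offline node's executor counts say nothing about capacity.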

How agents connect, in one paragraph

Jenkins Remoting is the library that implements the agent ⇔ controller channel (Jenkins Remoting). An inbound agent (formerly "JNLP") dials out to the controller; an SSH agent is launched by the controller over SSH. For inbound agents there are two transports. The classic path uses a separate TCP agent port — disabled by default since Jenkins 2.0, exposed as 50000 in the Docker images, and configurable as fixed or random in Manage Jenkins → Security (Exposed Services and Ports). Since Jenkins 2.217 an inbound agent can instead use WebSocket transport, which needs no extra TCP port and no special security configuration because it rides the existing HTTP(S) port (Exposed Services and Ports). A typical inbound launch is:

java -jar agent.jar \
  -url https://jenkins.example.com/ \
  -secret <hex-secret> \
  -name build-agent-01 \
  -workDir /var/jenkins
# add -webSocket to use the WebSocket transport instead of the TCP agent port

The -secret is a long string of hex digits the client needs to establish the connection (Inbound agent — jenkinsci/remoting). With -webSocket, only a single connection is made; without it, the agent first connects over HTTP(S) to retrieve connection info, then opens a second connection to the TCP agent port (Inbound agent — jenkinsci/remoting).

Diagnose it

Three places tell you almost everything; check them in this order.

Manage Jenkins → Nodes, then click the agent. The node page shows whether it is connected and, when offline, the cause (a monitor threshold, a launch failure, or a manual disconnect) (Managing nodes).
The agent's launch / log output — the console where agent.jar runs, or the node's log page. This is where transport, secret, and JDK errors surface verbatim.
The controller log (Manage Jenkins → System Log) for the matching rejection or provisioning entry.

For a Kubernetes agent, add kubectl against the agent pod:

kubectl get pods -n jenkins-agents
# NAME                    READY   STATUS   RESTARTS   AGE
# build-agent-01-7q4kx    0/1     Error    0          12s

kubectl describe pod build-agent-01-7q4kx -n jenkins-agents   # events: scheduling, image pull, exit code
kubectl logs build-agent-01-7q4kx -c jnlp -n jenkins-agents   # the remoting/connect-back output

The error string maps almost one-to-one to a cause below.
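A minimal sketch of that mapping, assuming the log text is piped in on stdin. The patterns come from the example messages on this page plus java.net.UnknownHostException for an unresolvable controller host; real wording can differ between Remoting versions, so treat the function (a hypothetical helper, not a Jenkins tool) as a starting point.

```shell
# classify_agent_log: read an agent launch log on stdin, print a likely cause.
# Order matters: the most specific pattern is checked first.
classify_agent_log() {
  log=$(cat)
  case $log in
    *UnsupportedClassVersionError*)        echo "jdk" ;;
    *"rejected the connection"*)           echo "secret" ;;
    *"Connection refused"*|*"timed out"*)  echo "transport" ;;
    *UnknownHostException*)                echo "url" ;;
    *)                                     echo "unknown" ;;
  esac
}

echo 'java.net.ConnectException: Connection refused (Connection refused)' | classify_agent_log
# prints: transport
```

An "unknown" result means the log needs reading by hand rather than that the agent is healthy.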

Causes, each end to end

Transport: inbound TCP agent can't reach the controller

The agent reaches the controller over HTTP(S) but then cannot connect to the TCP agent port, because the port is disabled, firewalled, or dropped by a reverse proxy in front of Jenkins that forwards only HTTP.

What it is. The inbound TCP port is disabled by default since Jenkins 2.0 and must be enabled in Manage Jenkins → Security; a random port changes on reboot and is hard to firewall, while a fixed port is stable (Exposed Services and Ports). A reverse proxy that terminates HTTP usually does not forward the raw TCP agent port unless explicitly configured.

Diagnose. The agent log shows the HTTP step succeeding, then a timeout or refusal on the agent port — for example:

INFO: Locating server among [https://jenkins.example.com/]
INFO: Agent discovery successful
INFO:   Agent address: jenkins.example.com
INFO:   Agent port:    50000
SEVERE: Failed to connect to jenkins.example.com:50000
java.net.ConnectException: Connection refused (Connection refused)

From a shell on the agent host, confirm the port is actually reachable with nc -vz jenkins.example.com 50000 (a refusal or timeout here is the proof).
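The same nc check can be wrapped so it returns a usable exit status in a script. This is a sketch that assumes an nc build supporting -z (connect without sending data) and -w (timeout in seconds); the function name is hypothetical.

```shell
# probe_agent_port HOST PORT
# Returns 0 if a TCP connection to HOST:PORT succeeds within 3 seconds,
# 1 otherwise (refused, timed out, or nc not available).
probe_agent_port() {
  if nc -z -w 3 "$1" "$2" >/dev/null 2>&1; then
    echo "reachable: $1:$2"
  else
    echo "NOT reachable: $1:$2 (check agent port setting, firewall, proxy)"
    return 1
  fi
}

# Example, using the placeholder host and port from this page:
# probe_agent_port jenkins.example.com 50000
```

Run it from the agent host, not from the controller, since the path being tested is agent to controller.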

Fix. Pick one transport and make it consistent end to end. Easiest behind a proxy: switch the agent to WebSocket (-webSocket), which rides the HTTP(S) port and needs no extra port or firewall rule (Exposed Services and Ports). If you keep TCP, set a fixed agent port, open it through the firewall, and forward it at the proxy. Tradeoff: WebSocket simplifies networking but depends on the proxy passing WebSocket upgrade requests and ties the channel to the HTTP path's proxy timeouts; a fixed TCP port is direct but is one more rule to manage and one more thing a proxy can silently drop.

Transport: wrong controller URL

The agent's -url does not match how the controller advertises itself, so the agent either can't find the controller or is handed an address it can't route to.

Diagnose. The log fails at "Locating server" / "Agent discovery", or discovery succeeds but hands back an internal hostname the agent can't resolve. Check the -url value and the Jenkins URL under Manage Jenkins → System.
Fix. Set -url to the externally reachable controller URL, and set the Jenkins URL in system config to the same address agents actually use. Mismatched internal-vs-external hostnames are the usual culprit behind a proxy.
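A quick way to spot a mismatch is to compare the two values after light normalization. The sketch below assumes you have copied both strings by hand; the helper names and the internal hostname in the example are hypothetical. It lowercases the whole URL, which is a simplification if your controller runs under a case-sensitive context path.

```shell
# normalize_url URL: lowercase the URL and drop any trailing slashes, so that
# https://Jenkins.example.com/ and https://jenkins.example.com compare equal.
normalize_url() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's:/*$::'
}

# same_controller_url AGENT_URL SYSTEM_URL: succeed if the two match.
same_controller_url() {
  [ "$(normalize_url "$1")" = "$(normalize_url "$2")" ]
}

same_controller_url "https://jenkins.example.com/" "http://jenkins.internal:8080/" \
  || echo "mismatch: agent -url and system Jenkins URL differ"
# prints: mismatch: agent -url and system Jenkins URL differ
```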

Secret: wrong or expired agent secret → handshake refused

The agent connects but the controller rejects it because the secret does not match — commonly after the node was deleted and recreated, or the secret was regenerated, leaving the agent launching with a stale value.

What it is. The secret is the hex string the client must present to establish the connection (Inbound agent — jenkinsci/remoting).

Diagnose. The log gets past discovery, then the controller closes the channel during the handshake. For example (exact wording varies by Remoting version):

INFO: Connecting to jenkins.example.com:50000
INFO: Trying protocol: JNLP4-connect
SEVERE: The server rejected the connection: build-agent-01 is already connected
        or the secret did not match
java.io.IOException: The server rejected the connection

Fix. Re-copy the current secret from the node page (Manage Jenkins → Nodes → the agent shows the exact launch command and secret) and relaunch the agent with it. Note the remoting guidance: if a secret is compromised, do not reuse the agent name on that controller (Inbound agent — jenkinsci/remoting).

JDK: Java version mismatch between controller and agent

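Remoting requires a compatible JVM on both ends. Since Jenkins 2.357 (and LTS 2.361.1) both the controller JVM and the agent JVM must run Java 11 or newer (Jenkins requires Java 11). Later releases raise the minimum again (Java 17 from Jenkins 2.463 and LTS 2.479.1), so check the requirement for the controller version you run.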

Diagnose. A controller on Java 11 with an agent on Java 8 throws, on the agent side:

Exception in thread "main" java.lang.UnsupportedClassVersionError:
  hudson/remoting/Launcher has been compiled by a more recent version of the
  Java Runtime (class file version 55.0), this version of the Java Runtime only
  recognizes class file versions up to 52.0

Class file 55.0 is Java 11; 52.0 is Java 8 (Jenkins requires Java 11). Confirm with java -version on the agent host.

Fix. Run the agent JVM (the process executing agent.jar/remoting.jar) on Java 11 or newer (Jenkins requires Java 11). This is separate from the JDK your builds use — you can still build with Java 8 via Global Tool Configuration; only the agent process itself must meet the minimum.
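The version check can be scripted. The sketch below turns a Java version string into its major version and compares it with a minimum; the function names are hypothetical, and the default of 11 simply mirrors the requirement cited above.

```shell
# java_major VERSION: print the major Java version for a version string.
# "1.8.0_392" -> 8 (legacy 1.x scheme), "11.0.21" -> 11, "17" -> 17.
java_major() {
  case $1 in
    1.*) v=${1#1.}; echo "${v%%.*}" ;;
    *)   echo "${1%%.*}" ;;
  esac
}

# agent_java_ok VERSION [MINIMUM]: succeed if VERSION meets MINIMUM (default 11).
agent_java_ok() {
  [ "$(java_major "$1")" -ge "${2:-11}" ]
}

agent_java_ok 1.8.0_392 || echo "agent JVM is too old"
# prints: agent JVM is too old

# On the agent host, java -version prints to stderr; the version is the
# double-quoted token on its first line:
# ver=$(java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p')
# agent_java_ok "$ver" || echo "agent JVM $ver is too old"
```

Pass a second argument to test against a higher minimum if your controller version requires one.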

Agent host: process died, disk full, or lost network

The agent was healthy and then dropped because something on the host failed: the agent.jar process exited, the remote root filled up, or the network went away. Jenkins also takes a node offline on its own when a monitor threshold is crossed.

What it is. Jenkins monitors each node for disk space, free temp space, free swap, clock difference, and response time, and takes the node offline if any value crosses its threshold (Managing nodes).
Diagnose. The node page names the offline cause directly — e.g. "Disk space is too low" or "Free Swap Space is too low" for a monitor trip, versus a connection-lost cause for a dead process or network drop. On the host, check the agent process is alive and df -h the agent's remote root.
Fix. For a monitor trip, clear the underlying condition (free disk on the remote root, fix clock skew with NTP) and bring the node back online — do not just disable the monitor. For a dead process, restart agent.jar (run it under a supervisor/systemd so it restarts on exit). For a network drop, the agent reconnects once connectivity returns.
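For the disk case, a small check on the agent host can flag a filling remote root before the monitor trips. This is a sketch: the helper names are hypothetical, and the 1 GiB figure in the example is an arbitrary illustration, not the threshold Jenkins is configured with.

```shell
# free_kb DIR: print free space in KiB on the filesystem holding DIR.
# df -Pk gives POSIX-format output in 1024-byte blocks; column 4 is "Available".
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# disk_ok DIR MIN_KB: succeed if DIR has at least MIN_KB free.
disk_ok() {
  [ "$(free_kb "$1")" -ge "$2" ]
}

# Example, using the -workDir from the launch command earlier on this page:
# disk_ok /var/jenkins 1048576 || echo "remote root is low on disk"
```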

Label: no online agent carries the requested label

A build pinned to a label can run only on an agent with that label. If the only agent carrying it is offline, the job waits — but the cause here is the offline agent, not capacity.

Diagnose. The queue reason reads "There are no nodes with the label '...'" when no configured agent carries the label, or "All nodes of label '...' are offline" when labelled agents exist but none is connected. Confirm whether an agent with that label exists and is merely offline (this page) versus online-but-busy. If the labelled agent is offline, fix its connection using the cause above that matches its log.
Differentiation. If an agent with the label is online and the job still waits, that is not an offline problem — it is executor starvation (all slots busy, or label/usage restrictions). Offline = the channel is down; starved = the channel is up but no slot is free.
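The queue reason text can be sorted into the same two buckets. The sketch below assumes the English UI wording and a reason string you copied from the queue; the function is a hypothetical helper and the patterns are illustrative, since wording can vary by version and locale.

```shell
# classify_queue_reason "REASON": map a queue reason string to a next step.
classify_queue_reason() {
  case $1 in
    *"are offline"*|*"is offline"*)
      echo "offline: fix the agent connection" ;;
    *"no nodes with the label"*)
      echo "label: check whether any agent carries this label and is online" ;;
    *"next available executor"*)
      echo "starved: add capacity or adjust labels" ;;
    *)
      echo "unknown" ;;
  esac
}

classify_queue_reason "All nodes of label 'linux' are offline"
# prints: offline: fix the agent connection
```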

Cloud / Kubernetes: pod or instance never provisioned

With the Kubernetes plugin, an agent is a pod created per build from a pod template; the pod's jnlp container is launched as an inbound agent and connects back using injected JENKINS_URL, JENKINS_SECRET, and JENKINS_AGENT_NAME environment variables (kubernetes-plugin). It shows offline when the pod never reaches Running, or starts and immediately exits.

Diagnose. kubectl get pods for the agent namespace, then describe the pod for events (scheduling failures, image pull errors, missing required fields) and logs -c jnlp for the connect-back output. The plugin's troubleshooting guidance is to check pod status via kubectl and raise the controller log level for org.csanchez.jenkins.plugins.kubernetes (kubernetes-plugin). A pod that scheduled but exits points at the container; one stuck Pending points at the cluster (resources, image).
Common cause — image JRE. The pod image must have a JRE compatible with the Java version the controller requires (kubernetes-plugin); an incompatible or missing JRE produces the same UnsupportedClassVersionError as the JDK case above, or a jnlp container that exits immediately.
Common cause — connect-back transport. If the pod can't reach the agent port, enable WebSocket on the cloud configuration so agents connect over HTTP(S) rather than the TCP service port — the documented fix when the controller sits behind a proxy (kubernetes-plugin).
Fix. Correct whatever the events/logs name: fix the pod template (image, required fields, resource requests so it can schedule), align the image JRE with the controller's Java version, and set the connect-back transport/URL the pod can actually reach. Tradeoff: podRetention such as onFailure() keeps failed pods around so you can inspect them, at the cost of leftover pods you must clean up (kubernetes-plugin).
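The rule of thumb above (a pod stuck Pending points at the cluster, one that exits points at the container) can be sketched as a lookup on the STATUS column of kubectl get pods. The grouping is a simplification and the function name is hypothetical.

```shell
# classify_pod_status STATUS: map a pod STATUS value to where to look next.
classify_pod_status() {
  case $1 in
    Pending|ContainerCreating)
      echo "cluster: check scheduling events and resource requests" ;;
    ErrImagePull|ImagePullBackOff)
      echo "image: check the image in the pod template" ;;
    Error|CrashLoopBackOff|Completed)
      echo "container: read the jnlp container logs" ;;
    Running)
      echo "running: check the agent connect-back output" ;;
    *)
      echo "unknown" ;;
  esac
}

classify_pod_status Error
# prints: container: read the jnlp container logs

# Against a live pod (names as in the kubectl examples earlier on this page):
# classify_pod_status "$(kubectl get pod build-agent-01-7q4kx -n jenkins-agents --no-headers | awk '{print $3}')"
```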

Fix it (order of operations)

Manage Jenkins → Nodes, open the offline agent, read the offline cause.
Read the agent launch log. Match the error string: connection refused on the agent port (transport), discovery/URL failure (URL), "server rejected the connection" (secret), UnsupportedClassVersionError (JDK).
Confirm it is offline, not starved. A red node is offline; an online node with all slots busy is executor starvation.
Transport: prefer WebSocket behind a proxy; otherwise fix the fixed TCP port + firewall + proxy forwarding. Align -url with the system Jenkins URL.
Secret: re-copy the current secret from the node page and relaunch.
JDK: put the agent JVM on Java 11+; this is separate from the build JDK.
Host: clear the tripped monitor (disk, clock), restart a dead agent.jar under a supervisor, restore network.
Kubernetes: kubectl describe/logs the pod; fix the template, image JRE, resources, and connect-back transport.

How Intellira diagnoses this

Intellira is read-only: it never restarts an agent, deletes a pod, or edits a node. It correlates evidence and names the cause. It reads the node state and offline cause via the Jenkins MCP server, the agent and controller logs, and, for Kubernetes agents, the agent pod's status, events, and jnlp container logs via the Kubernetes MCP server. It then classifies the failure: a connection-refused on the agent port reads as a transport/firewall problem; a rejected handshake reads as a stale secret; an UnsupportedClassVersionError reads as a JDK mismatch; a monitor-tripped node reads as a host condition (disk/clock); a pod that never reaches Running reads as a provisioning/template fault. Critically, it distinguishes offline (channel down, executors gone) from executor starvation (channel up, slots busy) so the remediation it surfaces matches the actual link that broke, rather than reporting a generic "agent unavailable."

Sources

By Intellira Engineering. AI-assisted draft; claims cited inline; last verified 2026-06-02. Pending technical review.
