Connector health check is red

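VaultPAM runs periodic health probes against each connector and the resource hosts it serves. When the health indicator turns red (status Degraded or Offline), one of four causes is almost always responsible: network connectivity loss, an unreachable resource host, a probe timeout, or a connector version mismatch. Network connectivity loss is the most common; version mismatches are the least common but are easy to confirm. A fifth status, Needs recovery, has its own recovery path and is covered last under Causes.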

Symptoms

The connector row in Connectors shows a red health indicator or the status Degraded.
A health badge on a specific Safe shows the connector serving it is unhealthy.
Alerts fire for ControlPlaneTunnelHeartbeatTimeoutsHigh or a similar metric.
Sessions launched through the connector fail or are slow to connect.
The connector health was green recently and turned red without an apparent change.

What the health states mean

Status	Meaning
Online	The connector and all probed resource hosts are reachable.
Degraded	The connector is reachable but at least one resource host probe is failing.
Offline	The connector is not reachable from the control plane.
Needs recovery	The connector has lost its durable identity and cannot reconnect normally. The internal runtime state is `needs_recovery`; the connector does not silently fall back to the original bootstrap token and fails closed until an operator starts the controlled recovery flow.

A red health indicator corresponds to Degraded (connector up, some hosts down) or Offline (connector itself unreachable).

Causes

1. Network connectivity loss

The connector host has lost its outbound path to the VaultPAM control plane (port 443), or a network policy change has blocked the route.

Check: From the connector host, run:

curl -v https://dev.euwarden.com/healthz

If this fails, the connector cannot reach the control plane. Also check whether the connector machine itself is still reachable (ping, SSH, or RDP into it).

Fix: Restore the outbound network path. Confirm corporate firewalls or cloud security groups allow TCP 443 from the connector host to the control plane. After restoring connectivity, the connector will reconnect automatically within 30 seconds — no manual restart is needed.
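
The check above can be wrapped in a small script that turns common curl exit codes into a likely cause. This is a minimal sketch, not a VaultPAM tool: the function names and the 10-second timeout are illustrative assumptions, while the exit codes are the ones documented in the curl man page.

```shell
# Sketch: probe the control-plane health endpoint and explain a failure.
# The URL is the one used in this article; names and timeout are illustrative.
HEALTH_URL="${HEALTH_URL:-https://dev.euwarden.com/healthz}"

# Map a curl exit code to a likely cause.
explain_curl_exit() {
  case "$1" in
    0)     echo "reachable" ;;
    6)     echo "DNS resolution failed" ;;
    7)     echo "connection refused or blocked" ;;
    22)    echo "server returned an HTTP error" ;;
    28)    echo "timed out" ;;
    35|60) echo "TLS handshake or certificate problem" ;;
    *)     echo "curl failed with exit code $1" ;;
  esac
}

# Run the probe; -f makes curl exit with code 22 on HTTP status >= 400.
check_control_plane() {
  local rc=0
  curl -fsS --max-time 10 -o /dev/null "$HEALTH_URL" || rc=$?
  echo "control plane: $(explain_curl_exit "$rc")"
  return "$rc"
}
```

Running check_control_plane on the connector host prints a single line and returns curl's exit code, so it can also be used in a cron job or monitoring hook.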

2. Resource host unreachable

The connector itself is healthy, but one or more of the resource hosts it probes (the targets registered in the connected Safes) have become unreachable from the connector network. This produces a Degraded status rather than Offline.

Check: Identify which resource host is flagged. In Connectors → pick connector → Health details, you may see per-host probe results. From the connector machine, test reachability to the affected target:

nc -zv <target-host> <target-port>

Fix: Restore the route from the connector machine to the resource host. The probe will recover on the next health check cycle (typically every 60 seconds). No connector restart is needed.
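
When several hosts are flagged, the same nc test can be looped. The sketch below assumes targets are passed as host:port pairs and uses a 5-second connect timeout; both the input format and the function name are illustrative, and the hostnames in the usage line are placeholders.

```shell
# Sketch: test reachability of several resource hosts from the connector machine.
# Arguments are "host:port" pairs (an assumed format for illustration).
check_targets() {
  local failed=0 target host port
  for target in "$@"; do
    host="${target%:*}"    # text before the last colon
    port="${target##*:}"   # text after the last colon
    if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
      echo "OK   $host:$port"
    else
      echo "FAIL $host:$port"
      failed=$((failed + 1))
    fi
  done
  # Succeed only if every target was reachable.
  [ "$failed" -eq 0 ]
}

# Example (placeholder hosts):
#   check_targets db.example.internal:5432 jump.example.internal:22
```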

3. Probe timeout

Health probes have a configurable timeout. If the resource host or connector is slow to respond — due to high CPU load, a hung service, or an overloaded network path — the probe may time out even though the host is technically reachable. Intermittent red states that self-resolve often fall into this category.

Check: Open Connectors → pick connector → Health history. If the health state flips between red and green without intervention, probe timeouts are the likely cause. Confirm the resource host is not under unusual load.

Fix: If the probe timeout is consistently too short for the host's normal response time, ask a VaultPAM operator to increase the probe timeout in the connector configuration. If the host is overloaded, address the load issue on the target system.
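
Intermittent flapping is easier to judge with a count. As a rough sketch, assume the states shown in Health history have been copied by hand into a text file, one state per line, oldest first. This article does not describe an export format, so the file and the function name here are hypothetical.

```shell
# Sketch: count state changes in a list of health states read from stdin.
# Input: one state per line, oldest first (assumed, hand-copied format).
count_flaps() {
  awk 'NR > 1 && $0 != prev { n++ } { prev = $0 } END { print n + 0 }'
}

# Example:
#   count_flaps < health-history.txt
```

A high count over a short period, with no operator action in between, points toward probe timeouts rather than a hard outage.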

4. Connector version mismatch

VaultPAM periodically releases connector updates. If a connector is running a version that is no longer supported, the control plane may stop accepting its health reports or flag it as degraded until the connector is updated.

Check: Open Connectors → pick connector → Details. Compare the Version field against the latest supported connector version listed in System → Connector versions or the VaultPAM release notes.

Fix: Update the connector to the current supported version. The update procedure depends on the deployment method:

Docker: pull the latest image and recreate the container.
Native / VM: download the updated binary and restart the service.
Kubernetes: update the image tag in the deployment manifest and apply.

After the update, the connector will re-pair using its existing durable identity — no new enrollment token is needed.
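
The version comparison in the check can be scripted once both values are known. The sketch below assumes versions are dotted numeric strings such as 1.9.0; this article does not state the version format, and the function name is illustrative. It relies on sort -V (version sort), which is available in GNU coreutils and in recent BSD and macOS sort.

```shell
# Sketch: succeed (exit 0) when version $1 is strictly older than version $2.
# Assumes dotted numeric versions; the format is not confirmed by this article.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example (illustrative values):
#   version_lt "1.9.0" "1.10.0" && echo "connector is outdated"
```

Version sort is used instead of a plain string comparison because a plain comparison would rank 1.10.0 below 1.9.0.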

5. Needs recovery

If the status is Needs recovery, the connector has lost its durable identity, and none of the fixes for the four causes above will restore it.

Check: Open Connectors and confirm the status is Needs recovery rather than Degraded or Offline.

Fix: Follow My connector is offline and use the guided-recovery branch. Do not keep retrying the ordinary health-recovery steps in this article.

Resolution steps

Open Connectors → pick connector and note whether the status is Degraded (connector up, hosts down), Offline (connector unreachable), or Needs recovery. If it is Needs recovery, stop here and follow My connector is offline.
If Offline: from the connector host, run curl -v https://dev.euwarden.com/healthz. If it fails, restore the outbound network path to the control plane.
If Degraded: check which resource host probes are failing in Connectors → pick connector → Health details. From the connector machine, run nc -zv <target-host> <target-port> for each failing host and restore connectivity.
If the status flips intermittently without intervention, check for probe timeouts and resource host load.
Compare the connector version against the supported version. Update if outdated.
After any fix, wait 60–90 seconds and refresh the Connectors page. The health indicator updates automatically on the next probe cycle.

Escalation path

If the health check remains red after working through all four causes:

Note the connector name, current status, and the time the health state changed.
Collect the connector logs (last 200 lines):
- Docker: docker logs -n 200 vaultpam-connector
- systemd: journalctl -u vaultpam-connector -n 200
Include the output of curl -v https://dev.euwarden.com/healthz from the connector host.
Open a support ticket with: connector name, status, log output, and network connectivity test results.
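
The log and connectivity items above can be gathered in one pass. This is a sketch under assumptions: the commands and the vaultpam-connector name are the ones quoted in this article, while the function name and output directory layout are illustrative. The connector name, status, and time of the state change still need to be noted by hand.

```shell
# Sketch: collect escalation artifacts into one directory for a support ticket.
collect_diagnostics() {
  local out="${1:-vaultpam-diagnostics}"
  mkdir -p "$out" || return 1

  # Connector logs, last 200 lines: try Docker first, then systemd.
  docker logs -n 200 vaultpam-connector >"$out/connector.log" 2>&1 ||
    journalctl -u vaultpam-connector -n 200 --no-pager >"$out/connector.log" 2>&1 ||
    echo "no connector logs found via docker or journalctl" >"$out/connector.log"

  # Verbose connectivity test; a failed request is still useful output.
  curl -v --max-time 10 https://dev.euwarden.com/healthz >"$out/healthz.txt" 2>&1 || true

  # Record when the bundle was collected (UTC).
  date -u +"%Y-%m-%dT%H:%M:%SZ" >"$out/collected-at.txt"
  echo "$out"
}

# Example:
#   collect_diagnostics /tmp/vaultpam-diagnostics
```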
