Connector health check is red
VaultPAM runs periodic health probes against each connector and the resource hosts it serves. When the health indicator turns red (status Degraded or Offline), one of four causes is almost always responsible. Network connectivity loss is the most common; version mismatches are the least common but easy to confirm.
Symptoms
- The connector row in Connectors shows a red health indicator or the status Degraded.
- A health badge on a specific Safe shows the connector serving it is unhealthy.
- Alerts fire for ControlPlaneTunnelHeartbeatTimeoutsHigh or a similar metric.
- Sessions launched through the connector fail or are slow to connect.
- The connector health was green recently and turned red without an apparent change.
What the health states mean
| Status | Meaning |
|---|---|
| Online | The connector and all probed resource hosts are reachable. |
| Degraded | The connector is reachable but at least one resource host probe is failing. |
| Offline | The connector is not reachable from the control plane. |
| Needs recovery | The connector has lost its durable identity and cannot reconnect normally. |
A red health indicator corresponds to Degraded (connector up, some hosts down) or Offline (connector itself unreachable).
Causes
1. Network connectivity loss
The connector host has lost its outbound path to the VaultPAM control plane (port 443), or a network policy change has blocked the route.
Check: From the connector host, run:
curl -v https://dev.euwarden.com/healthz
If this fails, the connector cannot reach the control plane. Also check whether the connector machine itself is still reachable (ping, SSH, or RDP into it).
Fix: Restore the outbound network path. Confirm corporate firewalls or cloud security groups allow TCP 443 from the connector host to the control plane. After restoring connectivity, the connector will reconnect automatically within 30 seconds — no manual restart is needed.
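The check above can be wrapped in a small script that turns curl's exit status into a likely cause. This is a minimal sketch, not VaultPAM tooling: the function names and the 10-second limit are illustrative assumptions, and the exit codes are the ones documented in curl's man page.

```shell
# Sketch: probe the control plane and explain a failure.
# explain_curl_exit and check_control_plane are illustrative names,
# not VaultPAM commands.

# Map a curl exit status to a likely cause (codes from curl's man page).
explain_curl_exit() {
  case "$1" in
    0)     echo "reachable" ;;
    6)     echo "DNS resolution failed" ;;
    7)     echo "connection refused or blocked" ;;
    28)    echo "timed out (possible firewall drop)" ;;
    35|60) echo "TLS handshake or certificate problem" ;;
    *)     echo "curl failed with exit status $1" ;;
  esac
}

# Probe the health endpoint; the 10-second limit is an assumed value.
check_control_plane() {
  local url="${1:-https://dev.euwarden.com/healthz}" rc=0
  curl --silent --show-error --fail --max-time 10 --output /dev/null "$url" || rc=$?
  echo "control plane: $(explain_curl_exit "$rc")"
  return "$rc"
}

# Usage, from the connector host:
#   check_control_plane
```

A timeout (28) usually points to a firewall silently dropping traffic, while a refused connection (7) or DNS failure (6) points to routing or resolver changes on the connector host.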
2. Resource host unreachable
The connector itself may be healthy, but one or more of the resource hosts it probes (the targets registered in the connected Safes) have become unreachable from the connector network. This produces a Degraded status rather than Offline.
Check: Identify which resource host is flagged. In Connectors → pick connector → Health details, you may see per-host probe results. From the connector machine, test reachability to the affected target:
nc -zv <target-host> <target-port>
Fix: Restore the route from the connector machine to the resource host. The probe will recover on the next health check cycle (typically every 60 seconds). No connector restart is needed.
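When several targets are flagged, the same nc test can be looped over a list. The sketch below assumes you supply the host and port pairs yourself; probe_targets is an illustrative name and the 5-second timeout is an assumption, not a VaultPAM default.

```shell
# Sketch: test each target a Degraded connector serves, run from the
# connector machine. probe_targets is an illustrative name.

# Reads "host port" pairs from stdin, prints one result line per target,
# and returns non-zero if any target is unreachable.
probe_targets() {
  local host port failed=0
  while read -r host port; do
    [ -n "$host" ] || continue   # skip blank lines
    # </dev/null keeps nc from consuming the target list on stdin.
    if nc -z -w 5 "$host" "$port" </dev/null >/dev/null 2>&1; then
      echo "OK    $host:$port"
    else
      echo "FAIL  $host:$port"
      failed=1
    fi
  done
  return "$failed"
}

# Usage (replace the placeholders with the flagged targets):
#   printf '%s\n' '<target-host> <target-port>' | probe_targets
```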
3. Probe timeout
Health probes have a configurable timeout. If the resource host or connector is slow to respond — due to high CPU load, a hung service, or an overloaded network path — the probe may time out even though the host is technically reachable. Intermittent red states that self-resolve often fall into this category.
Check: Open Connectors → pick connector → Health history. If the health state flips between red and green without intervention, probe timeouts are the likely cause. Confirm the resource host is not under unusual load.
Fix: If the probe timeout is consistently too short for the host's normal response time, ask a VaultPAM operator to increase the probe timeout in the connector configuration. If the host is overloaded, address the load issue on the target system.
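To tell an intermittent timeout from a lost route, it can help to repeat a TCP probe several times with a short timeout and count the failures. This sketch only approximates VaultPAM's own probe, whose mechanism and timeout are not described here; sample_probe and its default values are illustrative assumptions.

```shell
# Sketch: repeat a probe with a short timeout to see whether failures
# are intermittent. sample_probe is an illustrative name.

# Usage: sample_probe <host> <port> [attempts] [timeout_seconds] [pause_seconds]
sample_probe() {
  local host="$1" port="$2" attempts="${3:-10}" timeout="${4:-2}" pause="${5:-1}"
  local i=0 failures=0
  while [ "$i" -lt "$attempts" ]; do
    nc -z -w "$timeout" "$host" "$port" </dev/null >/dev/null 2>&1 ||
      failures=$((failures + 1))
    i=$((i + 1))
    sleep "$pause"
  done
  echo "$failures/$attempts probes failed"
  # Some but not all attempts failing suggests timeouts, not a lost route.
  if [ "$failures" -gt 0 ] && [ "$failures" -lt "$attempts" ]; then
    echo "intermittent: check load on $host or ask an operator about the probe timeout"
  fi
}

# Usage, from the connector machine:
#   sample_probe <target-host> <target-port>
```

If every attempt fails, the problem is more likely a lost route (cause 2) than a timeout.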
4. Connector version mismatch
VaultPAM periodically releases connector updates. If a connector is running a version that is no longer supported, the control plane may stop accepting its health reports or flag it as Degraded until the connector is updated.
Check: Open Connectors → pick connector → Details. Compare the Version field against the latest supported connector version listed in System → Connector versions or the VaultPAM release notes.
Fix: Update the connector to the current supported version. The update procedure depends on the deployment method:
- Docker: pull the latest image and recreate the container.
- Native / VM: download the updated binary and restart the service.
- Kubernetes: update the image tag in the deployment manifest and apply.
After the update, the connector will re-pair using its existing durable identity — no new enrollment token is needed.
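The version comparison in the check above can be scripted once you have both version strings. This sketch assumes dotted numeric versions and a sort that supports -V (GNU coreutils); version_lt is an illustrative helper and the version numbers in the usage line are made up.

```shell
# Sketch: compare a connector's version with the minimum supported one.
# version_lt is an illustrative helper, not a VaultPAM command.

# Succeeds when $1 is strictly older than $2.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage (hypothetical version numbers):
#   version_lt 1.4.2 1.5.0 && echo "connector is outdated, update it"
```

Version sort is used rather than a plain string comparison so that 1.10.0 is treated as newer than 1.9.0.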
Resolution steps
- Open Connectors → pick connector and note whether the status is Degraded (connector up, hosts down) or Offline (connector unreachable).
- If Offline: from the connector host, run curl -v https://dev.euwarden.com/healthz and, if it fails, restore the outbound network path to the control plane.
- If Degraded: check which resource host probes are failing in Connectors → pick connector → Health details. From the connector machine, run nc -zv <target-host> <target-port> for each failing host and restore connectivity.
- If the status flips intermittently without intervention, check for probe timeouts and resource host load.
- Compare the connector version against the supported version. Update if outdated.
- After any fix, wait 60–90 seconds and refresh the Connectors page. The health indicator updates automatically on the next probe cycle.
Escalation path
If the health check remains red after working through all four causes:
- Note the connector name, current status, and the time the health state changed.
- Collect the connector logs (last 200 lines):
  - Docker: docker logs -n 200 vaultpam-connector
  - systemd: journalctl -u vaultpam-connector -n 200
- Include the output of curl -v https://dev.euwarden.com/healthz from the connector host.
- Open a support ticket with: connector name, status, log output, and network connectivity test results.
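The collection steps above can be scripted so the ticket attachments are gathered in one pass. This is a sketch under assumptions: collect_diagnostics, the output file names, and the 10-second curl limit are illustrative, while the log commands and the vaultpam-connector name are taken from the steps above.

```shell
# Sketch: gather the material for a support ticket into one directory.
# collect_diagnostics and the file names are illustrative.

# Usage: collect_diagnostics docker|systemd [output_dir]
collect_diagnostics() {
  local mode="$1" out="${2:-connector-diagnostics}"
  mkdir -p "$out" || return 1
  case "$mode" in
    docker)  docker logs -n 200 vaultpam-connector >"$out/connector.log" 2>&1 ;;
    systemd) journalctl -u vaultpam-connector -n 200 >"$out/connector.log" 2>&1 ;;
    *)       echo "usage: collect_diagnostics docker|systemd [output_dir]" >&2
             return 2 ;;
  esac
  # curl -v writes connection details to stderr, so capture both streams.
  curl -v --max-time 10 https://dev.euwarden.com/healthz >"$out/healthz.txt" 2>&1
  echo "curl exit status: $?" >>"$out/healthz.txt"
  # Record when the data was collected (UTC) for the ticket timeline.
  date -u +"%Y-%m-%dT%H:%M:%SZ" >"$out/collected-at.txt"
  echo "wrote diagnostics to $out"
}

# Usage, on the connector host:
#   collect_diagnostics docker
```

Review the collected log for anything sensitive before attaching it to the ticket.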