Expose healthcheck data in Prometheus metrics #10678

Closed as not planned
@johnstcn

As an operator, I would like to be able to view the [health data from Coder](#8971) in Prometheus.

Something like the below could be useful:

coderd_health_access_url { healthy: [1|0], reachable: [1|0], status_code: [status_code], response_len: [response_len] }
coderd_health_database { healthy: [1|0], reachable: [1|0], latency_ms: [latency_ms], threshold_ms: [threshold_ms] }
coderd_health_derp { region_id: [region_id], healthy: [1|0], round_trip_ping_ms: [round_trip_ping_ms], uses_websocket: [1|0], stun_enabled: [1|0], ... }
coderd_health_websocket { healthy: [1|0], response_len: [response_len], code: [code] }
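
To make the shape above more concrete, here is a minimal sketch of how two of these entries might map onto the Prometheus text exposition format. Prometheus samples carry a single numeric value, so the sketch assumes each numeric field becomes its own metric (for example `coderd_health_database_latency_ms`) and identifiers such as `region_id` become labels. The `DatabaseHealth` and `DERPRegionHealth` structs, the helper functions, and the exact metric names are hypothetical and are not taken from the Coder codebase. A real implementation would more likely register gauges with a Prometheus client library than format lines by hand.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// DatabaseHealth is a hypothetical stand-in for the database section
// of the health report; the field names are assumptions.
type DatabaseHealth struct {
	Healthy     bool
	Reachable   bool
	LatencyMS   float64
	ThresholdMS float64
}

// DERPRegionHealth is a hypothetical stand-in for one DERP region's report.
type DERPRegionHealth struct {
	RegionID        int
	Healthy         bool
	RoundTripPingMS float64
}

// boolGauge converts a boolean to the 1/0 convention used in the proposal.
func boolGauge(b bool) float64 {
	if b {
		return 1
	}
	return 0
}

// sample renders one line of the Prometheus text exposition format.
// Labels are sorted by name so the output is deterministic. Label values
// are quoted with %q, which is adequate for simple ASCII values.
func sample(name string, labels map[string]string, value float64) string {
	if len(labels) == 0 {
		return fmt.Sprintf("%s %g", name, value)
	}
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return fmt.Sprintf("%s{%s} %g", name, strings.Join(pairs, ","), value)
}

// databaseMetrics emits one sample per field of the database health report.
func databaseMetrics(h DatabaseHealth) []string {
	return []string{
		sample("coderd_health_database_healthy", nil, boolGauge(h.Healthy)),
		sample("coderd_health_database_reachable", nil, boolGauge(h.Reachable)),
		sample("coderd_health_database_latency_ms", nil, h.LatencyMS),
		sample("coderd_health_database_threshold_ms", nil, h.ThresholdMS),
	}
}

// derpMetrics emits per-region samples, using region_id as a label.
func derpMetrics(r DERPRegionHealth) []string {
	labels := map[string]string{"region_id": fmt.Sprint(r.RegionID)}
	return []string{
		sample("coderd_health_derp_healthy", labels, boolGauge(r.Healthy)),
		sample("coderd_health_derp_round_trip_ping_ms", labels, r.RoundTripPingMS),
	}
}

func main() {
	db := DatabaseHealth{Healthy: true, Reachable: true, LatencyMS: 12.5, ThresholdMS: 15}
	derp := DERPRegionHealth{RegionID: 1, Healthy: true, RoundTripPingMS: 34}
	for _, line := range append(databaseMetrics(db), derpMetrics(derp)...) {
		fmt.Println(line)
	}
}
```

Keeping `region_id` as a label rather than part of the metric name would let a single query cover every DERP region, which fits the per-region question listed below.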

This will allow me to answer questions such as:

  • At what periods does Coder notice the worst database latency?
  • Does Coder's access URL become unreachable at specific times?
  • Do any DERP regions report errors at specific times?
  • Do websocket requests fail at specific times?
