fix(helm): use /healthz for liveness and readiness probes instead of /api/v2/buildinfo #8035

Merged 1 commit on Jun 15, 2023

Conversation

johnstcn
Member

@johnstcn johnstcn commented Jun 14, 2023

Using /api/v2/buildinfo will cause the pod to fail liveness checks if the database is temporarily unavailable.

Apparently the /healthz endpoint was handled by the UI at one point.
This no longer appears to be the case, as we have had a dedicated /healthz endpoint for a while now.
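
As a rough illustration of the change described above, the probe section of the rendered pod spec might look like the following. This is only a sketch: the named port `http` and the scheme are assumptions made for the example, and the actual chart template may be structured differently.

```yaml
# Hypothetical probe fragment for the coder container.
# Both probes target /healthz, which does not depend on the database,
# instead of /api/v2/buildinfo.
livenessProbe:
  httpGet:
    path: /healthz
    port: http      # assumed named container port
    scheme: HTTP    # assumed; would be HTTPS if TLS is terminated in the pod
readinessProbe:
  httpGet:
    path: /healthz
    port: http      # assumed named container port
    scheme: HTTP
```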

@johnstcn johnstcn requested review from coadler and deansheather June 14, 2023 17:13
@johnstcn johnstcn self-assigned this Jun 14, 2023
@deansheather
Member

If the pod can't connect to the DB couldn't that be a possible indicator that the coder pod is broken (perhaps broken networking)?

@johnstcn
Member Author

johnstcn commented Jun 14, 2023

> If the pod can't connect to the DB couldn't that be a possible indicator that the coder pod is broken (perhaps broken networking)?

Wouldn't coderd eventually crash anyway if the DB became unavailable?

Update: looks like it doesn't, but the DB being unavailable isn't necessarily something that restarting the pod will fix. How about we instead use /healthz for liveness and /api/v2/buildinfo for readiness? That way, coderd will simply no longer have traffic routed to it if it can't connect to the database.
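
The split suggested here could be sketched roughly as follows. Liveness only checks that the process is serving, so a database outage does not trigger restarts, while readiness hits an endpoint that, per the PR description, fails when the database is unavailable, so Kubernetes stops routing traffic to the pod. The named port `http` and the timing values are illustrative assumptions, not values from the chart.

```yaml
# Hypothetical fragment showing the liveness/readiness split.
livenessProbe:
  httpGet:
    path: /healthz            # no database dependency: failing means the process itself is unhealthy
    port: http                # assumed named container port
  periodSeconds: 10           # assumed value
  failureThreshold: 3         # assumed value
readinessProbe:
  httpGet:
    path: /api/v2/buildinfo   # fails while the database is unreachable
    port: http                # assumed named container port
  periodSeconds: 10           # assumed value
  failureThreshold: 3         # assumed value
```

A failing readiness probe removes the pod from Service endpoints without restarting it, whereas a failing liveness probe causes the kubelet to restart the container.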

@coadler
Contributor

coadler commented Jun 14, 2023

I don't think using buildinfo over healthz really has any benefits. They're both initialized at essentially the same time during startup, and the only difference seems to be that buildinfo is behind the API ratelimiter.

@ammario
Member

ammario commented Jun 15, 2023

👍🏽

There is an expectation that pinging a health-check endpoint is essentially free, both in terms of latency and downstream resource costs.

@johnstcn
Member Author

The API ratelimiter is enough of a reason to prefer healthz IMO.

@johnstcn johnstcn merged commit b1588fa into main Jun 15, 2023
@johnstcn johnstcn deleted the cj/helm-healthz branch June 15, 2023 09:08
@github-actions github-actions bot locked and limited conversation to collaborators Jun 15, 2023
4 participants