Skip to content

feat(coderd/healthcheck): add health check for proxy #10846

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Nov 24, 2023

Conversation

johnstcn
Copy link
Member

@johnstcn johnstcn commented Nov 23, 2023

Fixes #9558

Adds a health check for workspace proxies:

  • Healthy iff all proxies are healthy and the same version,
  • Warning if some proxies are unhealthy,
  • Error if all proxies are unhealthy, or do not all have the same version.

@johnstcn johnstcn changed the title WIP [wip] feat(coderd/healthcheck): add health check for proxy Nov 23, 2023
@johnstcn johnstcn force-pushed the cj/workspaceproxy-healthcheck branch from 84b0060 to e80c05d Compare November 23, 2023 12:56
@johnstcn johnstcn force-pushed the cj/workspaceproxy-healthcheck branch from 11597b6 to c7e202c Compare November 23, 2023 15:16
@johnstcn johnstcn changed the title [wip] feat(coderd/healthcheck): add health check for proxy feat(coderd/healthcheck): add health check for proxy Nov 23, 2023
@johnstcn johnstcn marked this pull request as ready for review November 24, 2023 09:33
@johnstcn johnstcn requested review from mtojek and mafredri November 24, 2023 09:33
Copy link
Member

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good test coverage, minor things only.

coderd/coderd.go Outdated
AppSecurityKey workspaceapps.SecurityKey
AppSecurityKey workspaceapps.SecurityKey

// The following two functions are dependencies of HealthcheckFunc but are only implemented
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe move the comment to the place where you assign these no-op {}

@@ -153,6 +174,9 @@ func Run(ctx context.Context, opts *ReportOptions) *Report {
if !report.Database.Healthy {
report.FailingSections = append(report.FailingSections, SectionDatabase)
}
if !report.WorkspaceProxy.Healthy {
report.FailingSections = append(report.FailingSections, SectionWorkspaceProxy)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

side thought: frontend does not use FailingSections at all. Maybe we should remove it for simplicity.

Copy link
Member Author

@johnstcn johnstcn Nov 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Will file a follow-up PR

Edit: #10854

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's beneficial to have a structure that can be navigated in a generic way.

for (let section of report.failing_sections) {
	console.log(report[section].reason);
}

Given a good design, it will "always works", even when new fields and structures are added. Granted, I'm not a proponent for the current design, just for the ability to do so.

// health.SeverityError if all proxies are unhealthy,
// health.SeverityOK if all proxies are healthy,
// health.SeverityWarning otherwise.
func calculateSeverity(total, healthy int) health.Severity {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH this could probably be moved to a separate utils file

},
}, nil
}),
updateProxyHealth: ptr.Ref(func(ctx context.Context) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can create a no-op function somewhere around, and just refer to it

coderd/coderd.go Outdated
@@ -396,9 +402,19 @@ func New(options *Options) *API {
*options.UpdateCheckOptions,
)
}

if options.FetchWorkspaceProxiesFunc == nil {
options.FetchWorkspaceProxiesFunc = &atomic.Pointer[func(context.Context) (codersdk.RegionsResponse[codersdk.WorkspaceProxy], error)]{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random comment: This looks pretty gnarly (generics within generics), I kind of get why the Go team delayed generics for as long as they did. 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, generics within generics look gnarly in any language!

@@ -153,6 +174,9 @@ func Run(ctx context.Context, opts *ReportOptions) *Report {
if !report.Database.Healthy {
report.FailingSections = append(report.FailingSections, SectionDatabase)
}
if !report.WorkspaceProxy.Healthy {
report.FailingSections = append(report.FailingSections, SectionWorkspaceProxy)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's beneficial to have a structure that can be navigated in a generic way.

for (let section of report.failing_sections) {
	console.log(report[section].reason);
}

Given a good design, it will "always works", even when new fields and structures are added. Granted, I'm not a proponent for the current design, just for the ability to do so.

for _, proxy := range r.WorkspaceProxies.Regions {
if vErr := checkVersion(proxy, opts.CurrentVersion); vErr != nil {
r.Healthy = false
r.Severity = health.SeverityError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not suggesting, just asking: Is this always an error? Could it be a warning instead if it's a patch version difference, for instance?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our current behaviour won't even let workspace proxies register if they don't match major.minor, but makes no further complaints if it's just a patch version difference.

Making it warn on a patch version difference sounds like a good idea in theory, but it will mean every time you upgrade, you'll get a short period of warning that there's a patch version diff. Although that's the current behaviour as well when you do a major version upgrade, so... 🤷

I'll see about adding it in a follow-up.

@johnstcn johnstcn requested a review from mafredri November 24, 2023 14:42
Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

// in an actually useful and meaningful way.
type workspaceProxiesFetchUpdater struct {
fetchFunc func(context.Context) (codersdk.RegionsResponse[codersdk.WorkspaceProxy], error)
updateFunc func(context.Context) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: You could either export this as-is to re-use in tests, or you could simplify this to just pass in the whole API (api *coderd.API) and call it from the methods.

@johnstcn johnstcn merged commit 411ce46 into main Nov 24, 2023
@johnstcn johnstcn deleted the cj/workspaceproxy-healthcheck branch November 24, 2023 15:06
@github-actions github-actions bot locked and limited conversation to collaborators Nov 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add health check for proxy
3 participants