Skip to content

feat(coderd/healthcheck): add access URL error codes and healthcheck doc #10915

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Nov 30, 2023

Conversation

johnstcn
Copy link
Member

@johnstcn johnstcn commented Nov 28, 2023

Relates to #8965

  • Added error codes for separate code paths in health checks
  • Prefixed errors and warnings with error code prefixes
  • Added a docs page with details on each code, cause and solution

Future work:

  • add a convenience function in site/src/api to create a troubleshooting link from an error that extracts the error code prefix matching ^([A-Z0-9]+):.
  • Refactor both .error and .warnings[] to be a shape of { message: string, code: string}. I started on this but realised it makes the diff too large, so will do it in a follow-up.

@BrunoQuaresma
Copy link
Collaborator

What would be the expected shape here? Something like:

{
  "warnings": [
    "code": "W001",
    "message": "This is a warning"
  ],
  "resolutions": {
    "W001": "https://docs.coder.com/warning/W001"
  }
}

@johnstcn
Copy link
Member Author

johnstcn commented Nov 29, 2023

What would be the expected shape here? Something like:

{
  "warnings": [
    "code": "W001",
    "message": "This is a warning"
  ],
  "resolutions": {
    "W001": "https://docs.coder.com/warning/W001"
  }
}

I had envisioned { warnings: [ { code: string, message: string } ... ] }, without the resolutions field.
Idea is that the FE would automatically construct this from ${CODER_DOCS_URL}/path/to/doc#W001.

@BrunoQuaresma
Copy link
Collaborator

@johnstcn sounds good!

@johnstcn johnstcn force-pushed the cj/health-resolutions branch from 8080874 to 1994ce1 Compare November 30, 2023 09:45
Comment on lines +13 to +34
// CodeUnknown is a catch-all health code when something unexpected goes wrong (for example, a panic).
CodeUnknown Code = "EUNKNOWN"

CodeProxyUpdate Code = "EWP01"
CodeProxyFetch Code = "EWP02"
CodeProxyVersionMismatch Code = "EWP03"
CodeProxyUnhealthy Code = "EWP04"

CodeDatabasePingFailed Code = "EDB01"
CodeDatabasePingSlow Code = "EDB02"

CodeWebsocketDial Code = "EWS01"
CodeWebsocketEcho Code = "EWS02"
CodeWebsocketMsg Code = "EWS03"

CodeAccessURLNotSet Code = "EACS01"
CodeAccessURLInvalid Code = "EACS02"
CodeAccessURLFetch Code = "EACS03"
CodeAccessURLNotOK Code = "EACS04"

CodeDERPNodeUsesWebsocket Code = `EDERP01`
CodeDERPOneNodeUnhealthy Code = `EDERP02`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review: these all have to go in here to be generated properly

Co-authored-by: Muhammad Atif Ali <atif@coder.com>
Copy link
Member

@matifali matifali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments but this is Awesome <3

@johnstcn johnstcn marked this pull request as ready for review November 30, 2023 11:30
@mtojek mtojek self-requested a review November 30, 2023 12:00
Copy link
Member

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@johnstcn johnstcn enabled auto-merge (squash) November 30, 2023 12:14
@johnstcn johnstcn merged commit 4f92928 into main Nov 30, 2023
@johnstcn johnstcn deleted the cj/health-resolutions branch November 30, 2023 12:15
@github-actions github-actions bot locked and limited conversation to collaborators Nov 30, 2023
Copy link
Contributor

@spikecurtis spikecurtis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions inline

> [forced websocket connections for DERP](../cli/server.md#--derp-force-websockets).

**Solution:** ensure that any configured reverse proxy does not strip the
`Upgrade: derp` header.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensure that any proxies you use allow connection upgrade with the Upgrade: derp header.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

performance may be impacted for clients closest to the unhealthy DERP server.

**Solution:** Ensure that the DERP server is available and reachable over the
network on port 443, for example:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure that the DERP server is available and reachable over the network, for example:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

network on port 443, for example:

```shell
curl -v "https://coder.company.com:443/derp"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:443 is redundant if you use https://

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

running Coder, using standard troubleshooting tools like `curl`:

```shell
curl -v "https://coder.company.com:443/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:443 is redundant with https://

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

```

2. Ensure that any reverse proxy that is sitting in front of Coder's configured
access URL is not stripping the HTTP header `Upgrade: websocket`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure that any reverse proxy that is serving Coder's configured access URL allows connection upgrade with the header Upgrade: websocket

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

**Problem:** One or more workspace proxies are not reachable.

**Solution:** Ensure that Coder can establish a connection to the configured
workspace proxies on port 443.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure that Coder can establish a connection to the configured
workspace proxies

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants