Skip to content

fix: Improve coder server shutdown procedure #3246

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 27, 2022

Conversation

mafredri
Copy link
Member

This commit improves the coder server shutdown procedure so that all
triggers for shutdown do so in a graceful way without skipping any
steps.

We also improve cancellation and shutdown of services by ensuring
resources are cleaned up at the end.

Notable changes:

  • We wrap cmd.Context() to allow us to control cancellation better
  • We attempt graceful shutdown of the http server (server.Shutdown)
    because it's less abrupt (compared to shutdownConns)
  • All exit paths share the same shutdown procedure (except for early
    exit)
  • provisionerds are now shutdown concurrently instead of one at a
    time, the also now get a new context for shutdown because
    cmd.Context() may be cancelled
  • Resources created by newProvisionerDaemon are cleaned up
  • Lifecycle Executor exits its goroutine on context cancellation

Fixes #3245

Combined with hashicorp/hc-install#68, this PR
allows goleaks test to pass for cli TestServer tests.

This commit improves the `coder server` shutdown procedure so that all
triggers for shutdown do so in a graceful way without skipping any
steps.

We also improve cancellation and shutdown of services by ensuring
resources are cleaned up at the end.

Notable changes:
- We wrap `cmd.Context()` to allow us to control cancellation better
- We attempt graceful shutdown of the http server (`server.Shutdown`)
  because it's less abrupt (compared to `shutdownConns`)
- All exit paths share the same shutdown procedure (except for early
  exit)
- `provisionerd`s are now shutdown concurrently instead of one at a
  time, the also now get a new context for shutdown because
  `cmd.Context()` may be cancelled
- Resources created by `newProvisionerDaemon` are cleaned up
- Lifecycle `Executor` exits its goroutine on context cancellation

Fixes #3245
@mafredri mafredri self-assigned this Jul 27, 2022
@mafredri mafredri requested a review from a team July 27, 2022 13:14
// Clean up idle connections at the end, e.g.
// embedded-postgres can leave an idle connection
// which is caught by goleaks.
defer http.DefaultClient.CloseIdleConnections()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 TIL!

devTunnelErrChan = make(<-chan error, 1)
ctxTunnel, closeTunnel = context.WithCancel(ctx)
devTunnel *devtunnel.Tunnel
devTunnelErr <-chan error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we moving to an un-buffered channel here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just so that the channel is accessible outside the if, the actual channel is returned by devtunnel.New().

I thought it was weird that we're allocating a channel here since the nil channel blocks forever which is ideal, and informs the reader of our intent.

// Trigger context cancellation for any remaining services.
cancel()

return exitErr
Copy link
Member Author

@mafredri mafredri Jul 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll always be returning an error now. There's really no way to cleanly shut down the server (e.g. click shut down in WebUI), so this makes sense. And interrupt for instance usually returns non-zero exit code.

The only downside is that we will print context cancelled and see --help at the end. I kind of want us to have a cobra post-exec for server that does os.Exit(n) is the context is cancelled. Thoughts?

Copy link
Member

@kylecarbs kylecarbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a brief review, but on mobile so hard to confirm! Changes look good to me.

@mafredri mafredri merged commit d27076c into main Jul 27, 2022
@mafredri mafredri deleted the mafredri/coder-server-shutdown-improvements branch July 27, 2022 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor coder server shutdown procedure and allow TestServer tests to pass goleaks detection
3 participants