fix: close SSH sessions bottom-up if top-down fails #14678

Merged
merged 1 commit into from
Sep 17, 2024

spikecurtis
Contributor

@spikecurtis spikecurtis commented Sep 16, 2024

Fixes https://github.com/coder/customers/issues/669

In `coder ssh` we normally attempt to tear things down top to bottom, so that SSH state like remote-forwards is cleaned up nicely, the tailnet Coordination gets a disconnect, etc.

But TCP timeouts can be very long (currently 72 hours for SSH), so if network connectivity is down, we can effectively deadlock trying to tear down the remote-forward state. That teardown involves sending an SSH command message, which doesn't time out independently of the underlying TCP connection.

This PR introduces a "graceful shutdown" timeout for the upper layers of our SSH stuff to finish closing. If they haven't closed in 5 seconds, we shut down the agent connection, which cascades bottom-up.
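
For readers who want the shape of the idea, here is a minimal sketch of a top-down close with a grace period and a bottom-up fallback. It is not the code in this PR: `layer`, `closeWithGrace`, and `runDemo` are hypothetical names made up for illustration, and the demo uses a 50 ms grace period instead of the 5 seconds described above so that it finishes quickly.

```go
package main

import (
	"sync"
	"time"
)

// layer is a hypothetical stand-in for one upper level of the SSH stack.
type layer struct {
	name     string
	shutdown func() // may block for a long time if the network is down
}

// closeWithGrace closes the upper layers top-down in a goroutine. If they
// finish within grace, the bottom connection is closed last. Otherwise the
// bottom connection is closed first, which is assumed to unblock the stuck
// layers, and we wait for them to finish. It reports whether it had to force.
func closeWithGrace(upper []layer, closeBottom func(), grace time.Duration) (forced bool) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for _, l := range upper {
			l.shutdown()
		}
	}()

	timer := time.NewTimer(grace)
	defer timer.Stop()
	select {
	case <-done:
		closeBottom() // graceful path: bottom layer goes last
		return false
	case <-timer.C:
		closeBottom() // forced path: failure cascades bottom-up
		<-done
		return true
	}
}

// runDemo simulates a teardown and returns the order in which things closed.
// When hang is true, the top layer blocks until the bottom connection is
// closed, imitating an SSH request stuck on a dead TCP connection.
func runDemo(hang bool) (forced bool, order []string) {
	var mu sync.Mutex
	record := func(s string) {
		mu.Lock()
		defer mu.Unlock()
		order = append(order, s)
	}
	connClosed := make(chan struct{})
	upper := []layer{
		{name: "ssh-session", shutdown: func() {
			if hang {
				<-connClosed
			}
			record("ssh-session")
		}},
		{name: "coordination", shutdown: func() { record("coordination") }},
	}
	closeBottom := func() {
		record("agent-conn")
		close(connClosed)
	}
	forced = closeWithGrace(upper, closeBottom, 50*time.Millisecond)
	return forced, order
}
```

Calling `runDemo(false)` exercises the graceful path, where the agent connection is closed last; `runDemo(true)` exercises the fallback, where it is closed first and the blocked layers unwind afterwards. The sketch assumes that closing the bottom connection always unblocks the upper layers; if that did not hold, the final `<-done` would itself need a bound.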

@spikecurtis spikecurtis marked this pull request as ready for review September 16, 2024 06:43
Member

@mafredri mafredri left a comment

While I think this is a valid approach and appreciate the test coverage you've written, I have a slightly different approach in mind that I feel will be more robust. I'd like to hear your thoughts.

@spikecurtis spikecurtis force-pushed the spike/customers-669-ssh-hang branch from d9dc47a to 21ee4de Compare September 17, 2024 09:14
Member

@mafredri mafredri left a comment

Thanks for taking another look at the implementation and making it more robust, looking great! 👍🏻

@spikecurtis spikecurtis force-pushed the spike/customers-669-ssh-hang branch from 21ee4de to a3d0c5b Compare September 17, 2024 10:28
@spikecurtis spikecurtis force-pushed the spike/customers-669-ssh-hang branch from a3d0c5b to 7dd56e0 Compare September 17, 2024 10:34
@spikecurtis spikecurtis merged commit 6ff9a05 into main Sep 17, 2024
27 checks passed
@spikecurtis spikecurtis deleted the spike/customers-669-ssh-hang branch September 17, 2024 10:46
@github-actions github-actions bot locked and limited conversation to collaborators Sep 17, 2024