Skip to content

feat: adds callback call when connection ends #4

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 21, 2023

Conversation

spikecurtis
Copy link

@spikecurtis spikecurtis commented Jun 20, 2023

I want to add logging to the Coder Agent when ssh connections disconnect, including the reason for disconnection. c.f. coder/coder#7961

Currently, our ssh library doesn't expose or log this information, as it only calls the ConnectionFailedCallback if the connection fails during the initial SSH handshake, not if it fails later.

This PR makes it so that if the initial ssh handshake succeeds, then we start a goroutine that waits for the connection to end, and calls the callback. We already log calls to this callback in our server code, but in a separate PR we might want to adjust so that we only log at Info or Debug for "expected" disconnection errors like EOF.

@spikecurtis spikecurtis requested review from mafredri and mtojek June 20, 2023 12:43
Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the purpose of ConnectionFailedCallback is different from what we're trying to do here and repurposing it for established connection errors seems like it could lead to confusion.

Is there a reason we wouldn't want to handle this in agentssh instead? (E.g. get sshConn via ContextKeyConn).

Or perhaps ultimately, a new kind of callback named ConnectionErrorCallback could be added.

server.go Outdated
if srv.ConnectionFailedCallback != nil {
go func() {
wErr := sshConn.Wait()
srv.ConnectionFailedCallback(conn, wErr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of spawning a goroutine that can theoretically be left running after the server has closed, we could move this check to after the for loop for ch := range chans { ... }. That channel will be closed anyway before .Wait() here returns.

Copy link
Member

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps ultimately, a new kind of callback named ConnectionErrorCallback could be added.

As I have spent some time recently diving in coder/ssh, I would recommend researching that approach.

BTW Sometimes the connection doesn't end, but will timeout eventually if there is no network activity (connection read/write timeout). With TCP keep alive enabled it might be hard to determine if the connection ended or if it is a transient network issue.

@spikecurtis
Copy link
Author

Is there a reason we wouldn't want to handle this in agentssh instead? (E.g. get sshConn via ContextKeyConn).

I do want to handle the logging in agentssh, but there are currently no callbacks at the scope I need. I want to log the connection lifetime & errors, so doing it in a session handler is the wrong scope: a connection could have 0, 1, or several sessions, so we can't easily arrange to log exactly once. There is a "connection" callback, but that has a net.Conn and what I want is the ssh.Conn (net.Conn doesn't have any method to wait for close, nor would it expose the error from the SSH layer).

@spikecurtis
Copy link
Author

Or perhaps ultimately, a new kind of callback named ConnectionErrorCallback could be added.

As I have spent some time recently diving in coder/ssh, I would recommend researching that approach.

I like this idea. I'll refactor to introduce a new callback.

Signed-off-by: Spike Curtis <spike@coder.com>
@spikecurtis spikecurtis force-pushed the spike/conn-wait-callback branch from 3aef9e1 to c92d705 Compare June 21, 2023 05:40
@spikecurtis spikecurtis requested review from mafredri and mtojek June 21, 2023 05:40
@spikecurtis
Copy link
Author

I went with ConnectionCompleteCallback since "error" sounds like something that gets called for abnormal termination, but I want this to get called for every connection.

Copy link
Member

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for making my suggestion even better and also documenting the always err behavior!

@spikecurtis spikecurtis merged commit 9a7e234 into master Jun 21, 2023
@spikecurtis spikecurtis deleted the spike/conn-wait-callback branch June 21, 2023 09:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants