-
Notifications
You must be signed in to change notification settings - Fork 887
fix: ensure wsproxy MultiAgent
is closed when websocket dies
#11414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: ensure wsproxy MultiAgent
is closed when websocket dies
#11414
Conversation
In this stack: This stack of pull requests is managed by Graphite. Learn more about stacking. |
50c92b4
to
12f1878
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work tracking down the bug!
I don't see any new tests though that will catch and prevent a regression, so please add
<-ctx.Done() | ||
ma.Close() | ||
}() | ||
|
||
go func() { | ||
defer cancel() | ||
dec := json.NewDecoder(nc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth dropping an INFO log here.
See this comment inline on Graphite.
@@ -472,6 +472,11 @@ func (c *Client) DialCoordinator(ctx context.Context) (agpl.MultiAgentConn, erro | |||
OnRemove: func(agpl.Queue) { conn.Close(websocket.StatusGoingAway, "closed") }, | |||
}).Init() | |||
|
|||
go func() { | |||
<-ctx.Done() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding this correctly, we're depending on the fact that the reader goroutine below cancels the context on a failed read.
I think we should also tear down the multi-agent on a failed write of subscription messages. It's unlikely that we'd have a failure that leaves the connection half-open (e.g. for reads but not writes), but such things are possible and you don't want the proxy limping on unable to subscribe to new agents.
func HeartbeatClose(ctx context.Context, exit func(), conn *websocket.Conn) { | ||
ticker := time.NewTicker(30 * time.Second) | ||
ticker := time.NewTicker(15 * time.Second) | ||
defer ticker.Stop() | ||
|
||
for { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The `SingleTailnet` behavior only checked to see if the `MultiAgent` was closed, but the websocket error was not being propogated into the `MultiAgent`, causing it to never be swapped for a new working one.
12f1878
to
72ad811
Compare
The
SingleTailnet
behavior only checked to see if theMultiAgent
wasclosed, but the websocket error was not being propogated into the
MultiAgent
, causing it to never be swapped for a new working one.Fixes #11401
Before:
After: