It turns out https://github.com/nhooyr/websocket is buggy when we use SetWriteDeadline: the next write will often fail. Prior to #7345 we only called this on client connections, but now we use it in both directions, so agents are getting disconnected as well. We probably didn't notice the client disconnections because a certain number of client disconnects are, of course, expected.
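For reference, here is a minimal sketch of how a write deadline is expected to behave on a plain `net.Conn`. It does not reproduce the websocket library bug and is not the coordinator code. It uses the standard library's `net.Pipe` as a stand-in connection, and `writeWithTimeout` and `demo` are hypothetical names made up for this illustration. On a well-behaved connection, a write that times out should not cause a later write under a fresh deadline to fail.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"os"
	"time"
)

// writeWithTimeout is a hypothetical helper (not from coder or nhooyr/websocket).
// It applies a write deadline to a single write and clears it afterwards, so an
// expired deadline cannot leak into a later write on the same connection.
func writeWithTimeout(c net.Conn, p []byte, d time.Duration) (int, error) {
	if err := c.SetWriteDeadline(time.Now().Add(d)); err != nil {
		return 0, err
	}
	// The zero time.Time means "no deadline".
	defer c.SetWriteDeadline(time.Time{})
	return c.Write(p)
}

// demo performs one write that nobody reads (so it should hit its deadline)
// followed by one write that has a reader, and returns both errors.
func demo() (timedOut, next error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	// net.Pipe is unbuffered: with no reader, this write blocks until the deadline.
	_, timedOut = writeWithTimeout(client, []byte("a"), 20*time.Millisecond)

	// Start draining the other end, then write again under a fresh deadline.
	go io.Copy(io.Discard, server)
	_, next = writeWithTimeout(client, []byte("b"), time.Second)
	return timedOut, next
}

func main() {
	timedOut, next := demo()
	fmt.Println("first write hit deadline:", errors.Is(timedOut, os.ErrDeadlineExceeded))
	fmt.Println("second write error:", next)
}
```

The behavior described above is different: a write after SetWriteDeadline often fails even though the connection should still be usable.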
I have to wonder whether the client disconnections that were present even before #7345 are the underlying cause of some of the flakiness we have seen in SSH, JetBrains Gateway, etc.
Agent disconnects started increasing substantially with #7345.
Logs: https://app.datadoghq.com/logs?query=%40caller%3A%2Fhome%2Frunner%2Fwork%2Fcoder%2Fcoder%2Ftailnet%2Fcoordinator.go%5C%3A%2A%20%40msg%3A%28%22unable%20to%20read%20agent%20update%3B%20closed%20conn%3F%22%20OR%20%22unable%20to%20read%20client%20update%3B%20closed%20conn%3F%22%20OR%20%22could%20not%20write%20nodes%20to%20connection%22%29&agg_q=%40fields.error%2C%40fields.error&analyticsOptions=%5B%22bars%22%2C%22dog_classic%22%2Cnull%2Cnull%5D&cols=host%2Cservice&index=%2A&messageDisplay=inline&sort_m=%2C&sort_t=%2C&stream_sort=desc&top_n=10%2C10&top_o=top%2Ctop&viz=toplist&x_missing=true%2Ctrue&from_ts=1682961720000&to_ts=1683220920000&live=false