-
Notifications
You must be signed in to change notification settings - Fork 963
chore: add backed reader, writer and pipe implementation #19147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
9dd9c4a
to
3223bf9
Compare
d2ee08d
to
27da7ef
Compare
16abe05
to
dde9516
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not done with my review, but taking a break and wanted to send you comments so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another batch of comments.
I haven't really looked at BackedPipe yet, but I think maybe it would be better to wait for the comments I've made so far to be resolved.
… to block on errors
…being closed/reconnected race conditions
7468299
to
b2188f9
Compare
if runtime.GOOS == "windows" { | ||
// Don't run goleak on windows tests, they're super flaky right now. | ||
// See: https://github.com/coder/coder/issues/8954 | ||
os.Exit(m.Run()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Windows CLI tests are still flaky, then we need to look into it. That issue is closed not planned and has been for years.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible for a test to leak on one os and not leak on another?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, especially on the CLI and Agent we need to do some OS specific shit around pseduo-TTYs. File I/O is different, so something might block indefinitely on one but not another OS. Stuff like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that and it sounds reasonable, but this test is conditioned only to execute goleak.VerifyTestMain(m, testutil.GoleakOptions...)
The rest of the testing logic of this package itself is not conditioned on the OS.
So my bottom line is - if we believe that goroutine leak detection flakes on Windows (and that condition is almost everywhere in the codebase ) - I wanted to stay consistent with that.
// Notify parent of error | ||
select { | ||
case bw.errorChan <- err: | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has me worried. Dropping errors like this means that you might get a different number of error events depending on race conditions. How do we ensure we reconnect the correct number of times?
At the least, it needs a comment explaining why this works.
Since the reader and writer share the error channel, when the connection goes down you could get either 1 or 2 errors. Maybe one of the errors was swallowed by this branch, or maybe nothing was being written at the time, and so there was no error. If you get two errors, how do we know to reconnect only once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reconnect process is debounced by singleflight on the pipe level so I wouldn't expect them to trigger two reconnects.
However I've added separate channels for read & write errors in the backed pipe.
Is comment describing the behavior enough or do you feel there might be an edge case here I'm not testing for right now?
separate channels for writer/reader errors moved to an interface for reconnection coordination logic for closing a pipe
Fixes: #18101
This PR introduces a new
backedpipe
package that provides reliable bidirectional byte streams over unreliable network connections. The implementation includes:BackedPipe
: Orchestrates a reader and writer to provide transparent reconnection and data replayBackedReader
: Handles reading with automatic reconnection, blocking reads when disconnectedBackedWriter
: Maintains a ring buffer of recent writes for replay during reconnectionRingBuffer
: Efficient circular buffer implementation for storing dataThe package enables resilient connections by tracking sequence numbers and replaying missed data after reconnection. It handles connection failures gracefully, automatically reconnecting and resuming data transfer from the appropriate point.