-
Notifications
You must be signed in to change notification settings - Fork 4
fix: avoid data race in session signal channel register/deregister #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: avoid data race in session signal channel register/deregister #5
Conversation
0786b47
to
f6b35a4
Compare
f6b35a4
to
9fde237
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. One suggestion inline
} | ||
go func() { | ||
defer sess.sigMu.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is unorthodox enough to warrant a comment; it looked like a bug to me at first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what I meant is the fact that you're calling
defer sess.sigMu.Unlock()
without the corresponding Lock()
in the same function looks very strange to me! We often talk about a function "holding" the lock while it runs, but here you "transfer" the lock from the Signals() function to the child goroutine so the parent can return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I understood that's what you meant. Did you feel the motivation for it was lacking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an aside, this is explicitly documented in https://pkg.go.dev/sync#Mutex.Unlock
A locked Mutex is not associated with a particular goroutine. It is allowed for one goroutine to lock a Mutex and then arrange for another goroutine to unlock it.
Not that I'm making any claims about this being standard in any way... I was simply trying to retrofit an edge-cases covered fix into the existing code without too many changes.
Interestingly, there's been a proposal to disallow this, but it didn't gain traction: golang/go#9201
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was actually just about the mechanics --- like
// here we're relying on the fact that the mutex was locked in the outer `Signal()`
// function, so this goroutine/function just has to unlock the mutex when it is done
We recently added signal support to
agentssh
, however, I noticed the following goleak flake today:https://github.com/coder/coder/actions/runs/7003589751/job/19049631559?pr=10883
This made me take a closer look at the implementation and saw that it was racy. This PR fixes the race.