-
Notifications
You must be signed in to change notification settings - Fork 887
fix: remove pty match for TestSSH/RemoteForward #10789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Current dependencies on/for this PR: This stack of pull requests is managed by Graphite. |
043bda0
to
0930d7a
Compare
// true exits the loop. | ||
return true | ||
} | ||
resp, err := http.DefaultClient.Do(req) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this may leave idle connections after it (triggering leak detection). If it does though, perhaps we just need to handle it in test main so we can keep using the default client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't seen any. I think idle connections are cleaned up promptly.
assert.NoError(t, err) | ||
assert.EqualValues(t, "hello world", body) | ||
return true | ||
}, testutil.WaitLong, testutil.IntervalFast) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous matcher waited for i-am-ready
echoed by the startup script. It means that the agent didn't start yet, and localhost:8222
is not available. I'm curious where is the trick in the new implementation. Is it a matter of testutil.WaitLong
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, before we waited for i-am-ready
and then connected.
Here, we just (re)try connecting in a loop until success or a timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it the same root cause? The agent is still not ready to connect. Are you just trying to reduce complexity and remove extra conditions?
EDIT:
I don't argue with the new implementation. I'm trying to understand the root cause. Would extending the timeout for pty "expect match" work as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we used PTY/expect to wait until we thought the agent was connected and SSH remote forward was up, then we connected and verified the output. The root cause of the flake is that we didn't always get the startup script log because of the way the ssh
command works. It detects that a startup script is running, then asks to follow logs, but the log may have already been dispatched and we miss it, causing the expect to hang.
In the new implementation we do away with PTY/expect entirely, and thus dodge the root cause from before. To deal with the fact that we no longer get an explicit signal that the remote forward is up, we just retry it multiple times until it comes up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The root cause of the flake is that we didn't always get the startup script log because of the way the ssh command works.
Ok, that's the part I wasn't aware of. Thanks!
Merge activity
|
Fixes #10578