Skip to content

fix: Enable reconnectingpty loadtest and fix/improve logging #5403

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 13, 2022

Conversation

mafredri
Copy link
Member

This commit re-enabled reconnectingpty loadtests after a logging
refactor of (*agent).handleReconnectingPTY. The reasons the tests were
flaking was that logger.Error was being called and slogtest failing
the test.

We could have set the option for slogtest to disable failing, but that
could hide real issues. The current approach improves reconnectingpty
logging overall and provides more insight into what's happening. It's
expected that reconnectingpty sessions fail after the agent is closed,
so calling logger.Error at that point is not wanted.

Ref: #5322

@mafredri mafredri force-pushed the mafredri/fix-loadtest-reconnectingpty-error branch 2 times, most recently from 0393aa2 to fda0bb7 Compare December 13, 2022 18:14
agent/agent.go Outdated
Comment on lines 926 to 928
prevBuf := slices.Clone(rpty.circularBuffer.Bytes())
rpty.circularBufferMutex.RUnlock()
_, err = conn.Write(prevBuf)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Unrelated fix(?). It seemed like a bad idea (to me) to perform a write on a network connection whilst holding the mutex, this shortens the span.

However, neither now, nor before, do we achieve perfect synchronization. In theory we should ensure that writes happening between writing prevBuf to conn and adding conn to activeConns are not lost. I did not try to fix this issue in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. It might be worth leaving that in a comment.

agent/agent.go Outdated
}
_, err = rpty.ptty.Input().Write([]byte(req.Data))
if err != nil {
a.logger.Warn(ctx, "write to reconnecting pty", slog.F("id", msg.ID), slog.Error(err))
return
logger.Warn(ctx, "write to pty", slog.Error(err)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: It's unclear to my why some logs have been Warn whereas others are Error, I did however notice some tests fail unless I mimic the previous behavior exactly, which is what I've attempted here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's unfortunate D: I believe they were best effort

@@ -604,6 +615,7 @@ func AwaitWorkspaceAgents(t *testing.T, client *codersdk.Client, workspaceID uui

return true
}, testutil.WaitLong, testutil.IntervalFast)
t.Logf("got workspace agents (workspace %s)", workspaceID)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: These log messages help discern the order of events.

defer conn.Close()

connectionID := uuid.NewString()
logger = logger.With(slog.F("id", msg.ID), slog.F("connection_id", connectionID))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Addition of connection_id will help debug if logs are coming from one or multiple connections.

@@ -300,10 +301,12 @@ func (a *agent) createTailnet(ctx context.Context, derpMap *tailcfg.DERPMap) (_
}
}()
if err = a.trackConnGoroutine(func() {
logger := a.logger.Named("reconnecting-pty")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Named logging seemed like it would be helpful here, even though it's not fully a separate service.

server.init(ctx)
return server
a.init(ctx)
return a
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: This just bothered me, lol.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

;p I agree

This commit re-enabled reconnectingpty loadtests after a logging
refactor of `(*agent).handleReconnectingPTY`. The reasons the tests were
flaking was that `logger.Error` was being called and `slogtest` failing
the test.

We could have set the option for `slogtest` to disable failing, but that
could hide real issues. The current approach improves reconnectingpty
logging overall and provides more insight into what's happening. It's
expected that reconnectingpty sessions fail after the agent is closed,
so calling `logger.Error` at that point is not wanted.

Ref: #5322
@mafredri mafredri force-pushed the mafredri/fix-loadtest-reconnectingpty-error branch from fda0bb7 to 772dff1 Compare December 13, 2022 18:23
mafredri added a commit that referenced this pull request Dec 13, 2022
While working on #5403 I noticed the web terminal connects with a
different UUID than what is serialized in the URL, this resulted in a
new session being opened if I refreshed the page.

The reasons seems to be that we were accessing raw DOM elements
(`location.search`) for navigation and omitting `react-router-dom`. This
resulted in a new navigation and UUID being generated after the fact.
server.init(ctx)
return server
a.init(ctx)
return a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

;p I agree

agent/agent.go Outdated
Comment on lines 926 to 928
prevBuf := slices.Clone(rpty.circularBuffer.Bytes())
rpty.circularBufferMutex.RUnlock()
_, err = conn.Write(prevBuf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. It might be worth leaving that in a comment.

@mafredri mafredri force-pushed the mafredri/fix-loadtest-reconnectingpty-error branch from 3051881 to 41e119c Compare December 13, 2022 18:55
@mafredri mafredri marked this pull request as ready for review December 13, 2022 18:57
@mafredri mafredri merged commit 4fc4c01 into main Dec 13, 2022
@mafredri mafredri deleted the mafredri/fix-loadtest-reconnectingpty-error branch December 13, 2022 19:28
@github-actions github-actions bot locked and limited conversation to collaborators Dec 13, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants