Skip to content

feat: replace ssh maxTimeout with keep-alive mechanism #8062

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 16, 2023

Conversation

mtojek
Copy link
Member

@mtojek mtojek commented Jun 16, 2023

Fixes: #7512
Fixes: #7581

This PR enables Keep-Alive support for SSH connections instead of MaxTimeout.

The previous fix mentioned in this PR did not effectively address the issue with the VS Code extension. The problem arose when there was no activity (neither read nor write operations), causing the agent to terminate the connection after 1 minute. This pull request modifies the behavior to allow the client to respond to pings before terminating the connection, and keeping the connection alive.

How I tested this:

  1. Set CODER_AGENT_SSH_MAX_TIMEOUT to 60s or modify ssh-max-timeout.Default in the source code.
  2. ./script/develop.sh
  3. Start a new workspace or restart existing one.
  4. Test: ssh coder.workspace-2 -vvv <- observe pings every 20s
  5. Test: ./scripts/coder-dev.sh ssh workspace-2 <- leave the terminal without activity for more than 1 minute, the shell is still alive
  6. Test: run browser Terminal <- leave the terminal without activity for more than 1 minute, the shell is still alive.

@mtojek mtojek self-assigned this Jun 16, 2023
@mtojek mtojek force-pushed the 7581-update-coder-ssh-b branch from aa3744d to 8334f26 Compare June 16, 2023 11:32
@mtojek mtojek changed the title feat: set default agent timeout to ~72h feat: ssh: use keep-alive mechanism instead of maxTimeout Jun 16, 2023
@mtojek mtojek changed the title feat: ssh: use keep-alive mechanism instead of maxTimeout feat: replace ssh maxTimeout with keep-alive mechanism Jun 16, 2023
@mtojek mtojek requested review from mafredri and johnstcn June 16, 2023 11:41
@mtojek mtojek marked this pull request as ready for review June 16, 2023 11:42
Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such a small change. ❤️

cli/agent.go Outdated
Default: "0",
Flag: "ssh-max-timeout",
// tcpip.KeepaliveIdleOption = 72h (forwardTCPSockOpts() in tailnet/conn.go)
Default: fmt.Sprintf("%v", 72*time.Hour-time.Minute),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two thoughts:

  1. How about we set this to an even 72h, but do the subtraction elsewhere? I think the even number will be easier for users to understand
    • Or, we could bump the tailnet conn keepalive slightly to account for this
  2. Perhaps the description should mention this is the maximum sensible value for this field?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I set it to 72h, and changed tcpip.KeepaliveIdleOption to 72h + 1min as it seems straightforward to me. I guess that it doesn't make any difference that we elongate the TCP timeout by 1 minute.
  2. Done (range: 60s - 72h).

Copy link
Member

@johnstcn johnstcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I don't have any issues with this change, but will defer to @mafredri for approval.

@mtojek mtojek requested a review from mafredri June 16, 2023 12:25
Copy link
Member

@mafredri mafredri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@mtojek mtojek merged commit 247f8a9 into main Jun 16, 2023
@mtojek mtojek deleted the 7581-update-coder-ssh-b branch June 16, 2023 13:22
@github-actions github-actions bot locked and limited conversation to collaborators Jun 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

agent: Add support for keep-alive SSH messages agent: hanging "bash -l" processes
3 participants