workspace proxy fails to proxy; error "ensure agent: subscribe agent" #11401

Closed
@johnstcn

(TODO: write a better issue title)

After #11366 was merged, I observed 502 errors when attempting to open vscode-web on my workspace:

Failed to proxy request to application: acquire agent conn: ensure agent: subscribe agent: write message: failed to write msg: WebSocket closed: failed to read frame header: EOF

and alternately, upon refresh:

Failed to proxy request to application: acquire agent conn: ensure agent: subscribe agent: write message: failed to write msg: failed to acquire lock: context canceled

I observed this behaviour with:

But not with:

After restarting paris.fly.dev.coder.com, the issue was apparently resolved.

Current theory is that there is a bug in the retry logic.

This appears to be supported by the following timeline of events:

  • 2024-01-04T05:33:34.9677772Z 'paris-coder' fly.io wsproxy restarted
  • 2024-01-04T05:34:08.0138014Z 'sydney-coder' fly.io wsproxy restarted
  • 2024-01-04T05:34:17.3245364Z deployment.apps/coder restarted
  • 2024-01-04T05:34:25.7908454Z 'sydney' GCP wsproxy restarted
  • 2024-01-04T05:34:29.7072117Z 'sao-paulo-coder' fly.io wsproxy restarted
  • 2024-01-04T05:34:33.5160577Z deployment "coder" successfully rolled out
  • 2024-01-04T05:34:44.6466916Z 'europe' GCP wsproxy restarted
  • 2024-01-04T05:35:02.7787815Z 'brazil' GCP wsproxy restarted

The Paris and Sydney fly.io wsproxies restarted before the coderd rollout, so they would have been connected to the old coderd replicas when the rollout restart happened. The restart would have interrupted the persistent websocket connection for those wsproxies, while the others, which restarted afterwards, most likely connected to the new coderd replicas.

Curiously, the workspace proxy healthcheck reported no issues:

curl https://sydney.fly.dev.coder.com/healthz-report
{"errors":null,"warnings":null}

The logs on the Paris fly.io wsproxy had already rotated, but we observed the following in the Sydney fly.io wsproxy's log output:

2024-01-04T05:34:06Z app[918577d4bd5538] syd [info]Started HTTP listener at http://0.0.0.0:3000
2024-01-04T05:34:06Z app[918577d4bd5538] syd [info]View the Web UI: https://sydney.fly.dev.coder.com
2024-01-04T05:34:08Z app[918577d4bd5538] syd [info]==> Logs will stream in below (press ctrl+c to gracefully exit):
2024-01-04T05:34:35Z app[918577d4bd5538] syd [info]2024-01-04 05:34:35.000 [warn]  net.workspace-proxy.servertailnet: broadcast server node to agents ...
2024-01-04T05:34:35Z app[918577d4bd5538] syd [info]    error= write message:
2024-01-04T05:34:35Z app[918577d4bd5538] syd [info]               github.com/coder/coder/v2/enterprise/wsproxy/wsproxysdk.(*remoteMultiAgentHandler).writeJSON
2024-01-04T05:34:35Z app[918577d4bd5538] syd [info]                   /home/runner/actions-runner/_work/coder/coder/enterprise/wsproxy/wsproxysdk/wsproxysdk.go:524
2024-01-04T05:34:35Z app[918577d4bd5538] syd [info]             - failed to write msg: WebSocket closed: failed to read frame header: EOF

Labels: s2: Broken use cases or features (with a workaround). Only humans may set this.