SSH to workspace not working anymore with Cloudflare proxying enabled

From one day to the next, I can no longer connect to my workspace via SSH. I can SSH into other containers, but my main workspace does not connect. I downgraded, rebooted the whole instance, and restarted the workspace, but nothing helped, even though the workspace is shown as running:
![image](https://github.com/coder/coder/assets/5358638/7c5abc5b-e727-4a9e-a1ea-e091a1f38a9c)

Neither the Terminal button (or any other button) nor `coder ssh jaulz.workspace` works. Neither of them returns so much as an error message. When I check the logs of the Docker container, I can see this output from the agent:
```
2023-08-25 14:15:56.198 [debu]  net.tailnet.net.wgengine: derphttp.Client.Recv: connecting to derp-999 (coder)
2023-08-25 14:15:56.266 [debu]  net.tailnet.net.wgengine: magicsock: [0xc0003c32c0] derp.Recv(derp-999): derp.Recv: unexpectedly large frame of 1414811695 bytes returned
2023-08-25 14:15:56.266 [debu]  net.tailnet.net.wgengine: derp-999: [v1] backoff: 5697 msec
2023-08-25 14:15:56.278 [debu]  net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1000stun0 from 167.235.129.2:57958 (1c14801356802acb4ae1ba97) in 10.998169ms
2023-08-25 14:15:56.303 [debu]  net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1001stun0 from 167.235.129.2:57958 (7e84418b23ab8c0e48696dca) in 36.522108ms
2023-08-25 14:15:56.493 [debu]  net.tailnet.net.wgengine: netcheck: [v1] measuring ICMP latency of coder (999): no address for node 999b
2023-08-25 14:15:56.548 [debu]  net.tailnet.net.wgengine: netcheck: [v1] report: udp=true v6=false v6os=false mapvarydest=false hair= portmap= v4a=167.235.129.2:57958 derp=999 derpdist=999v4:19ms
2023-08-25 14:16:01.965 [debu]  net.tailnet.net.wgengine: derphttp.Client.Recv: connecting to derp-999 (coder)
2023-08-25 14:16:02.051 [debu]  net.tailnet.net.wgengine: magicsock: [0xc0003c32c0] derp.Recv(derp-999): derp.Recv: unexpectedly large frame of 1414811695 bytes returned
2023-08-25 14:16:02.051 [debu]  net.tailnet.net.wgengine: derp-999: [v1] backoff: 5150 msec
2023-08-25 14:16:02.063 [debu]  net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1000stun0 from 167.235.129.2:57958 (7757ffae0f447c677f87d0fd) in 11.04406ms
2023-08-25 14:16:02.272 [debu]  net.tailnet.net.wgengine: netcheck: [v1] measuring ICMP latency of coder (999): no address for node 999b
2023-08-25 14:16:02.320 [debu]  net.tailnet.net.wgengine: netcheck: [v1] report: udp=true v6=false v6os=false mapvarydest= hair= portmap= v4a=167.235.129.2:57958 derp=999 derpdist=999v4:23ms
2023-08-25 14:16:02.321 [debu]  net.tailnet: netinfo callback  netinfo="NetInfo{varies= hairpin= ipv6=false ipv6os=false udp=true icmpv4=false derp=#999 portmap= link=\"\"}"
2023-08-25 14:16:02.321 [debu]  net.tailnet: sending node  node="&{ID:nodeid:4be00bc6922401af AsOf:2023-08-25 14:16:02.321403 +0000 UTC Key:nodekey:30b321849632e6672891c1696bf0ac5dd4785b8f4bc8b80fe26a54f5e64d7b63 DiscoKey:discokey:4d5786f5d6f02e50372f6dc0da59bd31874763f61a76d68ff40458721c5d4f18 PreferredDERP:999 DERPLatency:map[999-v4:0.023000846] DERPForcedWebsocket:map[] Addresses:[fd7a:115c:a1e0:4270:b41f:f439:6ddb:fe51/128 fd7a:115c:a1e0:49d6:b259:b7ac:b1b2:48f4/128] AllowedIPs:[fd7a:115c:a1e0:4270:b41f:f439:6ddb:fe51/128 fd7a:115c:a1e0:49d6:b259:b7ac:b1b2:48f4/128] Endpoints:[167.235.129.2:57958 192.168.176.5:57958]}"
2023-08-25 14:16:07.202 [debu]  net.tailnet.net.wgengine: derphttp.Client.Recv: connecting to derp-999 (coder)
2023-08-25 14:16:07.268 [debu]  net.tailnet.net.wgengine: magicsock: [0xc0003c32c0] derp.Recv(derp-999): derp.Recv: unexpectedly large frame of 1414811695 bytes returned
2023-08-25 14:16:07.268 [debu]  net.tailnet.net.wgengine: derp-999: [v1] backoff: 4659 msec
2023-08-25 14:16:07.306 [debu]  net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1001stun0 from 167.235.129.2:57958 (5ebfbbea8624a955074f5a09) in 36.667137ms
2023-08-25 14:16:07.328 [debu]  net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1002stun0 from 167.235.129.2:57958 (b23e39419eeceb4b133924f2) in 59.409467ms
2023-08-25 14:16:07.503 [debu]  net.tailnet.net.wgengine: netcheck: [v1] measuring ICMP latency of coder (999): no address for node 999b
2023-08-25 14:16:07.569 [debu]  net.tailnet.net.wgengine: netcheck: [v1] report: udp=true v6=false v6os=false mapvarydest=false hair= portmap= v4a=167.235.129.2:57958 derp=999 derpdist=999v4:23ms
2023-08-25 14:16:07.570 [debu]  net.tailnet: netinfo callback  netinfo="NetInfo{varies=false hairpin= ipv6=false ipv6os=false udp=true icmpv4=false derp=#999 portmap= link=\"\"}"
2023-08-25 14:16:07.570 [debu]  net.tailnet: sending node  node="&{ID:nodeid:4be00bc6922401af AsOf:2023-08-25 14:16:07.570705 +0000 UTC Key:nodekey:30b321849632e6672891c1696bf0ac5dd4785b8f4bc8b80fe26a54f5e64d7b63 DiscoKey:discokey:4d5786f5d6f02e50372f6dc0da59bd31874763f61a76d68ff40458721c5d4f18 PreferredDERP:999 DERPLatency:map[999-v4:0.023359301] DERPForcedWebsocket:map[] Addresses:[fd7a:115c:a1e0:4270:b41f:f439:6ddb:fe51/128 fd7a:115c:a1e0:49d6:b259:b7ac:b1b2:48f4/128] AllowedIPs:[fd7a:115c:a1e0:4270:b41f:f439:6ddb:fe51/128 fd7a:115c:a1e0:49d6:b259:b7ac:b1b2:48f4/128] Endpoints:[167.235.129.2:57958 192.168.176.5:57958]}"
```

Could this error, `unexpectedly large frame of 1414811695 bytes returned`, be what is causing the problem?
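One way to sanity-check that number is to look at its raw bytes. The sketch below assumes that a DERP frame header is one type byte followed by a 4-byte big-endian length. That is an assumption about the wire format, not something shown in the logs above. The helper name `frame_length_as_ascii` is made up for illustration.

```python
import struct


def frame_length_as_ascii(length: int) -> str:
    """Render a 32-bit frame length as the four ASCII bytes it was read from.

    Assumes the length field is an unsigned 32-bit big-endian integer
    (an assumption about the DERP frame header, not confirmed here).
    """
    raw = struct.pack(">I", length)  # 4 bytes, big-endian
    return raw.decode("ascii", errors="replace")


# The length reported in the agent log.
reported_length = 1414811695

print(hex(reported_length))                      # 0x5454502f
print(repr(frame_length_as_ascii(reported_length)))  # 'TTP/'
```

If that assumption holds, the "length" is really the text `TTP/`, and the preceding type byte would be `H`. So the agent may be receiving a plain `HTTP/...` response where it expects a DERP frame, which could happen if something between the agent and the server (for example the Cloudflare proxy) answers the request instead of passing the upgraded connection through.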

The workspace itself is created via this template:
```terraform
resource "docker_container" "workspace" {
  count    = data.coder_workspace.me.start_count
  image    = docker_image.workspace.image_id
  # Uses lower() to avoid Docker restriction on container names.
  name     = "${data.coder_parameter.namespace.value}-${data.coder_workspace.me.owner}-${lower(data.coder_workspace.me.name)}-workspace"
  hostname = lower(data.coder_workspace.me.name)
  dns      = ["1.1.1.1"]
  memory   = 10000
  memory_swap   = -1
  # Use the docker gateway if the access URL is 127.0.0.1
  command  = ["sh", "-c", replace(coder_agent.workspace.init_script, "/localhost|127\\.0\\.0\\.1/", "host.docker.internal")]
  env      = [
    "CODER_AGENT_TOKEN=${coder_agent.workspace.token}"
  ]
  host {
    host = "host.docker.internal"
    ip   = "host-gateway"
  }
  volumes {
    container_path = "/home/coder/"
    volume_name    = docker_volume.home.name
    read_only      = false
  }
  networks_advanced {
    name = docker_network.internal.name
  }
  host {
    host = "${data.coder_parameter.workspace_host.value}"
    ip   = "127.0.0.1"
  }
  restart = "unless-stopped"
}
```

Has anything changed in recent updates that could cause this behaviour? Thanks a lot for your feedback!
