Description
From one day to the next I can no longer connect to my workspace via SSH. I can SSH into other containers, but my main workspace does not connect. I downgraded, rebooted the whole instance, and restarted the workspace, but nothing helped, even though it is shown as running:
Neither the Terminal button (nor any other button) nor `coder ssh jaulz.workspace` works; both fail without even printing an error message. When I check the logs of the Docker container, I see this output from the agent:
2023-08-25 14:15:56.198 [debu] net.tailnet.net.wgengine: derphttp.Client.Recv: connecting to derp-999 (coder)
2023-08-25 14:15:56.266 [debu] net.tailnet.net.wgengine: magicsock: [0xc0003c32c0] derp.Recv(derp-999): derp.Recv: unexpectedly large frame of 1414811695 bytes returned
2023-08-25 14:15:56.266 [debu] net.tailnet.net.wgengine: derp-999: [v1] backoff: 5697 msec
2023-08-25 14:15:56.278 [debu] net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1000stun0 from 167.235.129.2:57958 (1c14801356802acb4ae1ba97) in 10.998169ms
2023-08-25 14:15:56.303 [debu] net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1001stun0 from 167.235.129.2:57958 (7e84418b23ab8c0e48696dca) in 36.522108ms
2023-08-25 14:15:56.493 [debu] net.tailnet.net.wgengine: netcheck: [v1] measuring ICMP latency of coder (999): no address for node 999b
2023-08-25 14:15:56.548 [debu] net.tailnet.net.wgengine: netcheck: [v1] report: udp=true v6=false v6os=false mapvarydest=false hair= portmap= v4a=167.235.129.2:57958 derp=999 derpdist=999v4:19ms
2023-08-25 14:16:01.965 [debu] net.tailnet.net.wgengine: derphttp.Client.Recv: connecting to derp-999 (coder)
2023-08-25 14:16:02.051 [debu] net.tailnet.net.wgengine: magicsock: [0xc0003c32c0] derp.Recv(derp-999): derp.Recv: unexpectedly large frame of 1414811695 bytes returned
2023-08-25 14:16:02.051 [debu] net.tailnet.net.wgengine: derp-999: [v1] backoff: 5150 msec
2023-08-25 14:16:02.063 [debu] net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1000stun0 from 167.235.129.2:57958 (7757ffae0f447c677f87d0fd) in 11.04406ms
2023-08-25 14:16:02.272 [debu] net.tailnet.net.wgengine: netcheck: [v1] measuring ICMP latency of coder (999): no address for node 999b
2023-08-25 14:16:02.320 [debu] net.tailnet.net.wgengine: netcheck: [v1] report: udp=true v6=false v6os=false mapvarydest= hair= portmap= v4a=167.235.129.2:57958 derp=999 derpdist=999v4:23ms
2023-08-25 14:16:02.321 [debu] net.tailnet: netinfo callback netinfo="NetInfo{varies= hairpin= ipv6=false ipv6os=false udp=true icmpv4=false derp=#999 portmap= link=\"\"}"
2023-08-25 14:16:02.321 [debu] net.tailnet: sending node node="&{ID:nodeid:4be00bc6922401af AsOf:2023-08-25 14:16:02.321403 +0000 UTC Key:nodekey:30b321849632e6672891c1696bf0ac5dd4785b8f4bc8b80fe26a54f5e64d7b63 DiscoKey:discokey:4d5786f5d6f02e50372f6dc0da59bd31874763f61a76d68ff40458721c5d4f18 PreferredDERP:999 DERPLatency:map[999-v4:0.023000846] DERPForcedWebsocket:map[] Addresses:[fd7a:115c:a1e0:4270:b41f:f439:6ddb:fe51/128 fd7a:115c:a1e0:49d6:b259:b7ac:b1b2:48f4/128] AllowedIPs:[fd7a:115c:a1e0:4270:b41f:f439:6ddb:fe51/128 fd7a:115c:a1e0:49d6:b259:b7ac:b1b2:48f4/128] Endpoints:[167.235.129.2:57958 192.168.176.5:57958]}"
2023-08-25 14:16:07.202 [debu] net.tailnet.net.wgengine: derphttp.Client.Recv: connecting to derp-999 (coder)
2023-08-25 14:16:07.268 [debu] net.tailnet.net.wgengine: magicsock: [0xc0003c32c0] derp.Recv(derp-999): derp.Recv: unexpectedly large frame of 1414811695 bytes returned
2023-08-25 14:16:07.268 [debu] net.tailnet.net.wgengine: derp-999: [v1] backoff: 4659 msec
2023-08-25 14:16:07.306 [debu] net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1001stun0 from 167.235.129.2:57958 (5ebfbbea8624a955074f5a09) in 36.667137ms
2023-08-25 14:16:07.328 [debu] net.tailnet.net.wgengine: netcheck: netcheck.runProbe: got STUN response for 1002stun0 from 167.235.129.2:57958 (b23e39419eeceb4b133924f2) in 59.409467ms
2023-08-25 14:16:07.503 [debu] net.tailnet.net.wgengine: netcheck: [v1] measuring ICMP latency of coder (999): no address for node 999b
2023-08-25 14:16:07.569 [debu] net.tailnet.net.wgengine: netcheck: [v1] report: udp=true v6=false v6os=false mapvarydest=false hair= portmap= v4a=167.235.129.2:57958 derp=999 derpdist=999v4:23ms
2023-08-25 14:16:07.570 [debu] net.tailnet: netinfo callback netinfo="NetInfo{varies=false hairpin= ipv6=false ipv6os=false udp=true icmpv4=false derp=#999 portmap= link=\"\"}"
2023-08-25 14:16:07.570 [debu] net.tailnet: sending node node="&{ID:nodeid:4be00bc6922401af AsOf:2023-08-25 14:16:07.570705 +0000 UTC Key:nodekey:30b321849632e6672891c1696bf0ac5dd4785b8f4bc8b80fe26a54f5e64d7b63 DiscoKey:discokey:4d5786f5d6f02e50372f6dc0da59bd31874763f61a76d68ff40458721c5d4f18 PreferredDERP:999 DERPLatency:map[999-v4:0.023359301] DERPForcedWebsocket:map[] Addresses:[fd7a:115c:a1e0:4270:b41f:f439:6ddb:fe51/128 fd7a:115c:a1e0:49d6:b259:b7ac:b1b2:48f4/128] AllowedIPs:[fd7a:115c:a1e0:4270:b41f:f439:6ddb:fe51/128 fd7a:115c:a1e0:49d6:b259:b7ac:b1b2:48f4/128] Endpoints:[167.235.129.2:57958 192.168.176.5:57958]}"
Could the error `unexpectedly large frame of 1414811695 bytes returned` be what is causing the issue?
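For what it's worth, that frame size looks like misparsed ASCII rather than a real length. If I read the DERP framing right (a 1-byte frame type followed by a 4-byte big-endian length), a reply that begins with the text `HTTP/` would produce exactly this value, since the four bytes `TTP/` decode to 1414811695. A minimal sketch of that decoding in Go (the 502 response body is just an illustrative assumption):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	// A reply starting with "HTTP/..." parsed as a DERP frame:
	// 'H' becomes the frame type, and the next four bytes "TTP/"
	// become the big-endian length field.
	payload := []byte("HTTP/1.1 502 Bad Gateway\r\n")
	frameLen := binary.BigEndian.Uint32(payload[1:5])
	fmt.Println(frameLen) // prints 1414811695
}
```

If that reading is correct, the agent is receiving a plain HTTP response where it expects the DERP byte stream, which would explain why the connection never comes up.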
The workspace itself is created via this template:
resource "docker_container" "workspace" {
count = data.coder_workspace.me.start_count
image = docker_image.workspace.image_id
# Uses lower() to avoid Docker restriction on container names.
name = "${data.coder_parameter.namespace.value}-${data.coder_workspace.me.owner}-${lower(data.coder_workspace.me.name)}-workspace"
hostname = lower(data.coder_workspace.me.name)
dns = ["1.1.1.1"]
memory = 10000
memory_swap = -1
# Use the docker gateway if the access URL is 127.0.0.1
command = ["sh", "-c", replace(coder_agent.workspace.init_script, "/localhost|127\\.0\\.0\\.1/", "host.docker.internal")]
env = [
"CODER_AGENT_TOKEN=${coder_agent.workspace.token}"
]
host {
host = "host.docker.internal"
ip = "host-gateway"
}
volumes {
container_path = "/home/coder/"
volume_name = docker_volume.home.name
read_only = false
}
networks_advanced {
name = docker_network.internal.name
}
host {
host = "${data.coder_parameter.workspace_host.value}"
ip = "127.0.0.1"
}
restart = "unless-stopped"
}
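Given the misparsed frame above, one thing worth checking is whether something between the agent and coderd (e.g. a reverse proxy) answers the DERP endpoint with a plain HTTP error page. A minimal sketch, assuming the embedded DERP server is exposed at `/derp` on the access URL and using `coder.example.com` as a placeholder for the real deployment URL:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Placeholder URL: replace with your actual Coder access URL.
	resp, err := http.Get("https://coder.example.com/derp")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	// Print the status and the first bytes of the body; a proxy or
	// gateway error page here would match the misparsed frame length
	// seen in the agent log.
	n := len(body)
	if n > 200 {
		n = 200
	}
	fmt.Println(resp.Status)
	fmt.Println(string(body[:n]))
}
```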
Has anything changed in recent updates that could cause this misbehaviour? Thanks a lot for your feedback!