SSH to workspace not working anymore with Cloudflare proxying enabled #9337
Comments
Same here. Starting today, I'm experiencing the same issue: I'm unable to connect to my workspace using either VS Code or JetBrains Gateway. curl -X GET https://mycoder.uri/api/v2/debug/health -H 'Accept: application/json' -H 'Coder-Session-Token: czZF56msxH-NOD4anc2wec5nx6YjeHXai' "999": { netcheck: [v1] measuring ICMP latency of coder (999): no address for node 999b" So, no clue what happened. |
Hi @jaulz and @zpantskhava, could you both confirm what version you're running? |
Did this happen after a Coder upgrade? Also, what setup do you use? What clouds do you use and what load balancers, reverse proxies or VPNs sit between you and Coder? |
Hi. The only thing that works is the terminal in the web UI. Full build of Coder, supports the server subcommand. |
I'm using self-hosted Coder, on-prem. |
coder login https://mycoder.uri/
coder list
coder config-ssh
ssh -vvv coder.z.main
OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
Note: It gets stuck here. The above is what I get from a remote host while trying to connect to the workspace. Below is what I get on the Coder host itself:
ssh -vvv coder.z.main
coder@z:~$
Note: I have not generated any SSH keys, either on the local host or on the remote server. |
@coadler @deansheather it did not happen immediately after a Coder upgrade but instead I think it happened overnight when the workspace was rebooted. I am using the latest version |
I just tested it with the delivered Docker template and even that template is not working. I see a similar message there (i.e.
|
This log line is very confusing to us and we've never really seen it before:
Since you're using Cloudflare, what happens if you disable Cloudflare proxying (turn off the orange cloud) and try it directly? |
@deansheather just tested it but unfortunately it's the very same behaviour. This is by the way the part of the log that repeats every few seconds (while trying to access the Terminal from web or SSH):
|
Likewise, I tried to connect to it directly, but without success. Here is the log from JetBrains Gateway: 2023-08-29 00:21:43,473 [1713088] INFO - CoderWorkspacesStepView - Configuring Coder CLI... And the log from the workspace container: 2023-08-28 20:26:47.301 [debu] net.tailnet.net.wgengine: derphttp.Client.Recv: connecting to derp-999 (coder) The HTTPS connection works perfectly fine: I'm able to list workspaces and start/stop them. However, as I mentioned earlier, I can only SSH into the workspace from the Coder host itself. Where can we check the logs other than: Can Coder be started in "debug" mode with the ability to stream output to a log file? |
|
Thanks @spikecurtis. OpenSSH_8.7p1, OpenSSL 3.0.7 1 Nov 2022 It gets stuck at this point. |
@zpantskhava it looks like you're sending the output of
|
Yeah, that's basically the same log for me 😞 |
Okay, it seems that my downgrade didn't work the way I expected it (forgot to pass the |
@deansheather I set up a completely new server and also took Cloudflare out of the setup, and now it seems to work again. Any idea which setting in Cloudflare could cause these issues? @zpantskhava are you also using Cloudflare? |
Hi @jaulz, yes, I am using it. However, previously I connected directly to the Coder instance without using a domain name. I'm still trying to pinpoint the root cause and would appreciate any help. |
Another user reported this in Discord and we were able to fix it by disabling Cloudflare proxying, so I'm fairly confident this is where the issue is. One way to work around this for now is to use Tailscale's hosted DERPs. You can do this by setting |
I can also replicate this behaviour using v2.1.4 and Cloudflare. Disabling the Cloudflare proxy does fix the issue. |
@0xCiBeR cool, where exactly did you disable the proxy? |
Just turning off the orange cloud on the specific DNS records in Cloudflare |
I can confirm that setting the two options fixed the issue, even with Cloudflare proxying still active:
# ...
# Fix for SSH behind Cloudflare
CODER_DERP_CONFIG_URL: "https://controlplane.tailscale.com/derpmap/default"
CODER_DERP_SERVER_ENABLE: "false"
# ... |
As we have a workaround to make this work, I am moving this from |
Seems like a change to Coder (probably a change to the Tailscale fork) caused Cloudflare proxying to break, which is affecting some of our enterprise customers. |
100% works for me as well. I've set it in coder.env: CODER_DERP_CONFIG_URL="https://controlplane.tailscale.com/derpmap/default" Awesome |
DERP headers consist of 1 byte of type, followed by 4 bytes (big-endian, unsigned) of length. These logs don't include the type, but
We don't know the version or status code from the logs, as the DERP client doesn't get far enough decoding to tell. The mystery is why the DERP client is operating on HTTP data. In a normal HTTP upgrade scenario, the client would send something like
Then wait for the response like
and then it would start processing data as the new protocol. However, Tailscale has this "fast start" mechanism where it sets a header. If Cloudflare (or another proxy) were to strip this header out, or return an HTTP error response, then we'd end up with symptoms like our logs show. For this to work, the DERP server has to have TLS enabled locally, and it injects an additional "meta certificate" into the ServerHello, which is self-signed and contains the DERP public key encoded in the CommonName. That seems to imply that the Cloudflare proxy is leaving this metacert on the protocol even while it decrypts the TLS session from the client. |
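To make the frame-header reasoning concrete, here is a minimal Python sketch (not Coder's or Tailscale's actual code) that parses five bytes the way the comment describes: 1 byte of type, then a 4-byte big-endian unsigned length. The helper name `parse_frame_header` and the sample status line are hypothetical; as noted above, the real HTTP version and status code are not known from the logs. Any response that starts with `HTTP/` would produce the same header bytes.

```python
import struct
from typing import Tuple

HEADER_LEN = 5  # 1 byte frame type + 4 bytes big-endian unsigned length


def parse_frame_header(data: bytes) -> Tuple[int, int]:
    """Return (frame_type, frame_length) from the first five bytes."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 5 bytes for a frame header")
    # ">BI" = big-endian, one unsigned byte, one unsigned 32-bit integer
    return struct.unpack(">BI", data[:HEADER_LEN])


# Hypothetical input: the start of a plain-text HTTP response.
# The status line is an assumption for illustration only.
http_bytes = b"HTTP/1.1 400 Bad Request\r\n"

frame_type, length = parse_frame_header(http_bytes)
print(hex(frame_type), length)      # type byte is ASCII "H"
print(length.to_bytes(4, "big"))    # the "length" is really ASCII text
```

The four length bytes are the ASCII characters `TTP/`, which decode to 1414811695, the exact size reported in the "unexpectedly large frame" log line. That is consistent with the DERP client reading an HTTP response where it expected a DERP frame.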
@jaulz @0xCiBeR @zpantskhava I'm so far unable to reproduce this while experimenting with Cloudflare. Can any of you share more details about the Cloudflare configuration you have when you're seeing this issue? In particular, what TLS/SSL encryption mode are you using for Coder? |
TL;DR Use these environment variables to disable the internal DERP and use Tailscale's official map instead:
CODER_DERP_CONFIG_URL: "https://controlplane.tailscale.com/derpmap/default"
CODER_DERP_SERVER_ENABLE: "false"
Due to the obscurity of the DERP protocol, we don't guarantee any support of DERP behind reverse proxies such as Cloudflare. |
Sorry, I missed this. In the end it was just the orange cloud in the DNS settings, which hides the IP of the actual server. Since there is a workaround, I also agree that it should be enough for the time being. Thanks for your support! |
From one day to the next I can no longer connect to my workspace via SSH. I can SSH into other containers, but my main workspace is not connecting. I downgraded, rebooted the whole instance and restarted the workspace, but nothing helped, even though the workspace is shown as running:

Neither the Terminal button (or any other button) nor
coder ssh jaulz.workspace
is working. Neither even returns an error message. When I check the logs of the Docker container, I can see this output from the agent:
Could this error
unexpectedly large frame of 1414811695 bytes returned
be causing any issues?
The workspace itself is created via this template:
Has anything changed during recent updates which could cause this misbehaviour? Thanks a lot for your feedback!