Closed
Description
i'm creating this issue to track a problem experienced by a strategic customer: the `/listening-ports` API route returned 504s with the `single_tailnet` experiment enabled and `CODER_BLOCK_DIRECT=true`. this happened upon upgrade from 2.2.x to 2.3.1. the agent and workspace displayed as healthy in the dashboard.
coderd logs returned:
2023-10-20 17:15:22.614 [warn] coderd: GET host=coder.<REDACTED DOMAIN> path=/api/v2/workspaceagents/0fcbe2cb-f591-49b2-89f5-df0bbea137e1/listening-ports proto=HTTP/1.1 remote_addr=<REDACTED IP> start="2023-10-20T17:14:22.610280152Z" took=1m0.004586037s status_code=500 latency_ms=60004 response_body="{\"message\":\"Internal error dialing workspace agent.\",\"detail\":\"agent is unreachable\"}\n" request_id=c83a5819-db40-429b-880d-624e834b53eb
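for triage, the `key=value` fields in a log line like the one above can be pulled out programmatically. this is a minimal sketch in Python, assuming the space-separated `key=value` layout shown in this log; `parse_fields` and `is_slow_failure` are hypothetical helper names, and the 60-second threshold is an assumption read off this single log line, not a documented coderd timeout.

```python
import re

# key=value pairs; a value is either a double-quoted string (backslash
# escapes allowed) or a run of non-whitespace characters.
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line: str) -> dict:
    """Return the key=value fields found in one log line, as strings."""
    return dict(_PAIR.findall(line))


def is_slow_failure(fields: dict, min_ms: int = 60_000) -> bool:
    """True for a 5xx response that took at least min_ms milliseconds.

    min_ms defaults to an assumed 60 s threshold (hypothetical).
    """
    status = int(fields.get("status_code", 0))
    latency = int(fields.get("latency_ms", 0))
    return status >= 500 and latency >= min_ms


# Excerpt of the log line from this issue (redacted fields omitted).
line = (
    "path=/api/v2/workspaceagents/0fcbe2cb-f591-49b2-89f5-df0bbea137e1/listening-ports "
    "proto=HTTP/1.1 took=1m0.004586037s status_code=500 latency_ms=60004"
)
fields = parse_fields(line)
print(fields["status_code"], fields["latency_ms"], is_slow_failure(fields))
```

filtering on both status and latency separates requests that hung until a timeout from ones that failed fast, which may help narrow down whether the dial to the agent never completed.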
removing the `single_tailnet` experiment resolved things. here's their full env var list to help debug:
env:
  - name: CODER_PG_CONNECTION_URL
    valueFrom:
      secretKeyRef:
        name: <REDACTED>
        key: url
  - name: CODER_ACCESS_URL
    value: "https://coder.<REDACTED>"
  - name: CODER_WILDCARD_ACCESS_URL
    value: "*.coder.<REDACTED>"
  - name: CODER_OIDC_IGNORE_USERINFO
    value: "true"
  - name: CODER_OIDC_ISSUER_URL
    value: "https://login.microsoftonline.com/<REDACTED>"
  - name: CODER_OIDC_EMAIL_DOMAIN
    value: "<REDACTED>"
  - name: CODER_OIDC_CLIENT_ID
    value: "<REDACTED>"
  - name: CODER_OIDC_CLIENT_SECRET
    valueFrom:
      secretKeyRef:
        name: <REDACTED>
        key: secret
  - name: CODER_OIDC_SCOPES
    value: "openid,email,profile"
  - name: CODER_OIDC_GROUP_MAPPING
    value: <REDACTED>
  - name: CODER_PROMETHEUS_ENABLE
    value: "true"
  - name: CODER_PROMETHEUS_COLLECT_AGENT_STATS
    value: "true"
  - name: CODER_PROMETHEUS_ADDRESS
    value: "0.0.0.0:2112"
  - name: CODER_BLOCK_DIRECT
    value: "true"
  # - name: CODER_VERBOSE
  #   value: "true"
  # - name: TF_LOG
  #   value: "DEBUG"
  - name: CODER_EXPERIMENTS
    value: "deployment_health_page"
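
to spot other deployments with the combination reported here, a small check over the env values might look like the following. this is only a sketch: `has_reported_combo` is a hypothetical helper, and it assumes `CODER_EXPERIMENTS` is a comma-separated list and that `CODER_BLOCK_DIRECT` is set to the literal string `"true"` as in the config above.

```python
def has_reported_combo(env: dict) -> bool:
    """True when single_tailnet is enabled together with CODER_BLOCK_DIRECT=true.

    Assumes CODER_EXPERIMENTS is a comma-separated list of experiment names.
    """
    raw = env.get("CODER_EXPERIMENTS", "")
    experiments = {name.strip() for name in raw.split(",") if name.strip()}
    block_direct = env.get("CODER_BLOCK_DIRECT", "").strip().lower() == "true"
    return "single_tailnet" in experiments and block_direct


# The failing setup described in this issue.
failing = {
    "CODER_EXPERIMENTS": "single_tailnet,deployment_health_page",
    "CODER_BLOCK_DIRECT": "true",
}
# The config listed above, after the experiment was removed.
working = {
    "CODER_EXPERIMENTS": "deployment_health_page",
    "CODER_BLOCK_DIRECT": "true",
}
print(has_reported_combo(failing), has_reported_combo(working))
```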