
Commit 1832a75

docs: describe AWS hard NAT (#13205)
Documents what I've learned about getting direct connections on AWS. Several customers have had issues.
1 parent 35cb572

1 file changed: +50 −0


docs/networking/stun.md

@@ -122,3 +122,53 @@

originate. Using these internal addresses is much more likely to result in a
successful direct connection.

![Diagram of a workspace agent and client over VPN](../images/networking/stun3.png)
## Hard NAT

Some NATs are known to use a different port when forwarding requests to the STUN
server than when forwarding probe packets to peers. In that case, the address a
peer discovers over the STUN protocol will have the correct IP address, but the
wrong port. Tailscale refers to this as "hard" NAT in
[How NAT traversal works (tailscale.com)](https://tailscale.com/blog/how-nat-traversal-works).
If both peers are behind a "hard" NAT, direct connections may take longer to
establish or may not be established at all. If one peer is behind a "hard" NAT
and the other is running a firewall (including Windows Defender Firewall), the
firewall may block direct connections.

In both cases, peers fall back to DERP connections if they cannot establish a
direct connection.
If your workspaces are behind a "hard" NAT, you can:

1. Ensure clients are not also behind a "hard" NAT. You may have limited ability
   to control this if end users connect from their homes.
2. Ensure firewalls on client devices (e.g. Windows Defender Firewall) have an
   inbound policy allowing all UDP ports either to the `coder` or `coder.exe`
   CLI binary, or from the IP addresses of your workspace NATs.
3. Reconfigure your workspace network's NAT connection to the public internet to
   be an "easy" NAT. See below for specific examples.
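For step 2, a Windows client firewall policy can be added with `netsh`; a sketch, assuming a hypothetical install path for `coder.exe` (substitute the actual location on your machines, and run from an elevated prompt):

```shell
:: Sketch: allow inbound UDP to the Coder CLI binary on a Windows client.
:: The program path below is an assumption; adjust it to where coder.exe
:: is actually installed.
netsh advfirewall firewall add rule name="Coder CLI (UDP)" ^
  dir=in action=allow protocol=udp ^
  program="C:\Program Files\Coder\coder.exe"
```

Scoping the rule to the binary (rather than opening ports globally) keeps the policy narrow while still permitting the inbound UDP probes used for NAT traversal.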
### AWS NAT Gateway

The
[AWS NAT Gateway](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html)
is a known "hard" NAT. You can use a
[NAT Instance](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_NAT_Instance.html)
instead of a NAT Gateway, and configure it to use the same port assignment for
all UDP traffic from a particular source IP:port combination (Tailscale calls
this "easy" NAT). Linux `MASQUERADE` rules work well for this.
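On a Linux NAT instance, "easy" NAT behavior can be configured with a plain `MASQUERADE` rule; a sketch, assuming `eth0` is the instance's public-facing interface:

```shell
# Sketch for a Linux NAT instance; assumes eth0 is the public-facing
# interface. Enable forwarding, then masquerade outbound traffic.
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
```

Note that `MASQUERADE` without the `--random` or `--random-fully` flags preserves the source port where possible; passing those flags would randomize source ports and reintroduce "hard" NAT behavior.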
### AWS Elastic Kubernetes Service (EKS)

The default configuration of AWS Elastic Kubernetes Service (EKS) includes the
[Amazon VPC CNI Driver](https://github.com/aws/amazon-vpc-cni-k8s), which by
default randomizes the public port for different outgoing UDP connections. This
makes it act as a "hard" NAT, even if the EKS nodes are on a public subnet (and
thus do not need to use the AWS NAT Gateway to reach the Internet).
This behavior can be disabled by setting the environment variable
`AWS_VPC_K8S_CNI_RANDOMIZESNAT=none` in the `aws-node` DaemonSet. Note, however,
that if your nodes are on a private subnet, they will still need NAT to reach
the public Internet, meaning that issues with the
[AWS NAT Gateway](#aws-nat-gateway) might affect you.
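The environment variable above can be set with `kubectl`; a sketch, assuming the default VPC CNI install (the `aws-node` DaemonSet lives in `kube-system`):

```shell
# Disable SNAT port randomization in the Amazon VPC CNI. The aws-node
# pods restart as the DaemonSet rolls out the new environment variable.
kubectl set env daemonset/aws-node -n kube-system \
  AWS_VPC_K8S_CNI_RANDOMIZESNAT=none
kubectl rollout status daemonset/aws-node -n kube-system
```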
