Skip to content

feature request: subdomain on applications #9903

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
raphaelfff opened this issue Sep 28, 2023 · 28 comments · Fixed by #10150
Closed

feature request: subdomain on applications #9903

raphaelfff opened this issue Sep 28, 2023 · 28 comments · Fixed by #10150

Comments

@raphaelfff
Copy link

We need to support urls in the form of http://subdomain.3000--main--test3--admin.coder.com to be forwarded to the application, i started playing around with the required changes, some pointers would be great! (and how to test it too)

main...raphaelfff:coder:patch-1

@cdr-bot cdr-bot bot added the feature label Sep 28, 2023
@Emyrk
Copy link
Member

Emyrk commented Oct 3, 2023

@raphaelfff appreciate the feature request.

Do you have a solution in mind for handling tls of the nested subdomain? Right now Coder uses wildcard certificates that work for 1 depth subdomains. So the cert for *.coder.com works for all apps since they are <application>.coder.com.

If we extend the subdomain depth to 2, <subdomain>.<application>.coder.com, the wildcard certs are not sufficient. Our <application> subdomain is also an unbounded set, so multi-domain wildcard certs would still have issues as there is a limit to the number of SANs to list.


I think to get this to work, some application would need to generate certs "on demand" as applications are hit, and maintain this set of generated certs. Ideally making them deterministic so if we regenerate them, they are the same cert to the client at all times.

Doing this in Coder would be unfortunate, as I assume most users use some sort of ingress to handle their certs and cert rotation.

@raphaelfff
Copy link
Author

Indeed the cert is the main blocker here, as plumbing it through coder seems trivial.
We are currently evaluating using traefik as ingress, and my understanding is that is should be able to generate the cert on demand (don't quote me on that...).
If traefik (or similar) doesnt work, then no point implementing that feature is coder.

@Emyrk
Copy link
Member

Emyrk commented Oct 3, 2023

The problem you will have with generating certs on demand is that I don't think there exists any CAs on the public net that will sign an intermediate CA. Most CAs like Letsencrypt only sign leaf certs.

If you can get an intermediate CA that the client's trust (if your company has some root CA it controls and clients install for example), than I see this as feasible in theory. Not sure where the implementation will live, but the extra latency from cert generations doesn't seem too bad, especially with caching.

Without an intermediate CA, you would need to do the ACME exchange for each new application I suppose 🤔. I am sure you'd hit any rate limit any cert provider has setup. Lets encrypt is 50 per week per registered domain.


So I think this would only work on environments where the deployment operator has some CA they control and clients trust. Which is not that uncommon in big orgs, so could be worth implementing this with a large disclaimer on who can use it.

@raphaelfff
Copy link
Author

Indeed, thats a valid concern, but for our use its fine, we expected 2/3 subdomain per workspace.
Also, Lets Encrypt can issue wildcard, so i wonder how the Traefik ACME will generate them.
In an ideal world (i ll confirm this week):
coder.lol is the coder instance
8080--main--ws1--raph.coder.lol is the application, with cert covered by the (coder.lol cert under *.coder.lol)
Accessing hello.8080--main--ws1--raph.coder.lol will trigger the creation of a new cert with 8080--main--ws1--raph.coder.lol and *.8080--main--ws1--raph.coder.lol
So that really boils down to not having more than 50 workspaces created per week, should be fine

@Emyrk
Copy link
Member

Emyrk commented Oct 3, 2023

Yup exactly for the cert creations. Definitely feels weird on the cert creation. Another rate limit to keep in mind is the "Duplicate Certificate Limit" of 5 per week (renewing certs)

https://letsencrypt.org/docs/duplicate-certificate-limit/

Lets encrypt certs expire in 90 days. So you will probably need some extra logic on how to determine renewals, eg if a workspace is deleted, do not renew it.

Another tool you can use is each cert can have up to 100 names for lets encrypt. So you could batch together your renewals based on certs nearing their expiry.


This all is possible, but does feel weird to support in any official capacity. However, I do not see any issue with supporting it on the Coder side.

For the implementation, we need to support some syntax for the wildcard access url. We currently only support 1 asterisks. The new syntax for the param would have to indicate which asterisks is the app name, and which is the new depth 2 subdomain. Because *.*.coder.lol is ambiguous to coder as to how to handle it imo. Maybe it should be like *.<app>.coder.lol or something, not sure...

The app itself could have some value to indicate subdomain support, but I think this should also be set at the deployment level given the cert handling that needs to be done.

The other reason to handle this explicitly is we redirect all unknown requests to the access url if the config value is set. I am not sure the reason for this flag, but it would be good to keep it working.

Tl;dr: If the cert stuff can be handled, I can implement this on our end. I think the cert stuff is definitely a harder lift 😄

@raphaelfff
Copy link
Author

raphaelfff commented Oct 3, 2023

I think Traefik creates the certs on demand anyway, so I dont think thats an issue. I ll see as I start using it :)
Also maybe Traefik is smart about a renewals and grouping 🤷‍♂️, again, will see

Yeah, my point is that Coder should not have limitation on that, being self hosted its the customer's responsibility to figure out the certs etc.
It should defo be *.<app>.coder.lol, *.*.<app>.coder.lol etc

Will let you know once i have an answer on traefik.

@deansheather
Copy link
Member

I think the best way to implement this would be to count the segments in the configured wildcard access URL and only check the last n segments in the hostname provided in the request. So it wouldn't be done through the regex directly and would be done in the parsing function.

This would have to be behind an opt-in site-wide flag though to enable additional hostname segments in app URLs.

As Steven said you'll quickly hit the wildcard certificate issuance/renewal limit per domain. The only way to handle this without locking people out of apps when the limit is reached is by running your own certificate authority and trusting it on all of your developer's devices, and getting traefik to use that CA to issue new wildcard certs.

@raphaelfff
Copy link
Author

raphaelfff commented Oct 4, 2023

Seems like traefik doesnt do on demand TLS, but caddy does! (next experiment)
Regardless whether caddy does it or not, or wether its viable to use letsencrypt (as you mentionned a pvt CA would work) or people may be okay with an untrusted cert; Coder should not block that behavior, so my 2 cent is that feature should be baked into Coder anyways.

@raphaelfff
Copy link
Author

Actually i would add that another thing that would work for us is supporting randomstuff-8080--main--ws1--raph.coder.lol, basically having a prefix before the application name

@Emyrk
Copy link
Member

Emyrk commented Oct 5, 2023

The only issue I see with prefixes is the max length of a subdomain is 255 characters. Aside from that, I think it can be very reasonable to ignore some prefix when parsing the subdomain for routing traffic to the app.

@raphaelfff
Copy link
Author

raphaelfff commented Oct 5, 2023

255 is plenty, do you have an ETA as to when this could be implemented ?

@Emyrk
Copy link
Member

Emyrk commented Oct 5, 2023

Can you explain the problem you are trying to solve? The reason to require prefixes vs using path based routing within your app?

8080--main--ws1--raph.coder.lol/randomstuff/*?

@raphaelfff
Copy link
Author

raphaelfff commented Oct 5, 2023

Our product is tenant based, tenant is determined from subdomain: acme.example.com
In dev we have a nifty trick (mostly to go around oauth callback url restrictions): acme-dot-8080--main--whatever.coder.lol
That nifty trick actually comes useful here for the certificate issue mentionned earlier too (*.coder.lol would cover all our devenvs)
So here my goal would be to define a single app: 8080 and have devs use the prefix acme-dot- to let the system know its the acme tenant.
At this stage of dev changing from domain based to path based is not an option.

@smolinari
Copy link

"tenant is determined from subdomain" - Basing anything off of a URL will be a bad idea from the start. People can type in other tenant names and you'd only have security via obscurity, which is pretty weak. I'd highly recommend not making any authorization decisions based on the sub-domain. Just sayin.... 😁

Scott

@Emyrk
Copy link
Member

Emyrk commented Oct 5, 2023

@smolinari I imagine each subdomain still has it's own authorization + authentication. Any SAAS product you could just type in the url and find other tenants, but be blocked by the "login screen" for that given tenant.

So I don't think subdomain based tenants is a bad decision outright. It can provide some better "built in" isolation from browsers as far as cookies and CORs requests. You also get to have different DNS for each tenant if you intend to geo distribute the tenant services or something.

So I see the value. If the routing is baked into the product, I see why the default solution is to extend the app urls to include some routing information in the subdomain.

@raphaelfff if you have some load balancer in front, would it be possible in dev environments to route based on the url? The load balancer can strip the url from the path if hosting on a sub path is not supported. That does assume the app does not need the tenant information from the url though.

@raphaelfff
Copy link
Author

The design of our product is out of scope here, Steven raises a lot of good points here and the reason for the current design.

Load balancer isn't an option, the app needs the subdomain part...

@Emyrk
Copy link
Member

Emyrk commented Oct 5, 2023

The design of our product is out of scope here

Agreed, just wanted to validate the original subdomain thing 😄

Coder uses Subdomains as a core feature, so I understand the requirement to get that information from the url. The reason we have path based apps is primarily for local development and coder in coder testing where subdomains are not trivial to make work.

Load balancer isn't an option, the app needs the subdomain part...

And if we supported prefixes, you could switch based on that instead of a subdomain?

@raphaelfff
Copy link
Author

Yes the switch would happen based off 'acme-dot-'

@Emyrk
Copy link
Member

Emyrk commented Oct 5, 2023

One idea is to completely expose the regex capture group for breaking apart app urls.

Golang:

// ParseSubdomainAppURL parses an ApplicationURL from the given subdomain. If
// the subdomain is not a valid application URL hostname, returns a non-nil
// error. If the hostname is not a subdomain of the given base hostname, returns
// a non-nil error.
//
// The base hostname should not include a scheme, leading asterisk or dot.
//
// Subdomains should be in the form:
//
// {PORT/APP_SLUG}--{AGENT_NAME}--{WORKSPACE_NAME}--{USERNAME}
// (eg. https://8080--main--dev--dean.hi.c8s.io)
func ParseSubdomainAppURL(subdomain string) (ApplicationURL, error) {
matches := appURL.FindAllStringSubmatch(subdomain, -1)
if len(matches) == 0 {
return ApplicationURL{}, xerrors.Errorf("invalid application url format: %q", subdomain)
}
matchGroup := matches[0]
return ApplicationURL{
AppSlugOrPort: matchGroup[appURL.SubexpIndex("AppSlug")],
AgentName: matchGroup[appURL.SubexpIndex("AgentName")],
WorkspaceName: matchGroup[appURL.SubexpIndex("WorkspaceName")],
Username: matchGroup[appURL.SubexpIndex("Username")],
}, nil
}

Regex:
https://regex101.com/r/0Y5LR6/1

(?P<AppSlug>[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)--(?P<AgentName>[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)--(?P<WorkspaceName>[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)--(?P<Username>[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)

We use capture groups to identify each section. It would not be hard to add a prefix/suffix, or any other variations.
It does get tricky on the UI to create urls for the application. So it would require a matching "url builder", likely a template syntax like $APP_SLUG--$AGENT_NAME--$WORKSPACE_NAME--$USERNAME.

So something like:

$CODER_APP_REGEX="^(?P<Prefix>[^\s-]+-)?(?P<AppSlug>[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)--(?P<AgentName>[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)--(?P<WorkspaceName>[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)--(?P<Username>[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)$"
$CODER_APP_BUILDER="$APP_SLUG--$AGENT_NAME--$WORKSPACE_NAME--$USERNAME"

https://regex101.com/r/kn1Bij/1


This passes full control to the configuration. It pushes a lot of complexity to the configuration, which means things can be misconfigured more easily.

I've thought of more simple solutions, but they all will omit some control. Obviously this would optional, and not required. The existing wildcard


Implementation detail:

We also need to expand the validHostnameLabelRegex which comes from the wildcardAccessURL and replaces the leading *. We could probably just use the regex passed in by the user, unsure why we have 2 different regex strings for this atm.

@smolinari
Copy link

@Emyrk

I imagine each subdomain still has it's own authorization + authentication.

I'd hope that would be implemented. I could only go on what was written. If I was incorrect, I apologize for budging in unnecessarily.

I do have another two cents though. 😁 We ran into the challenge of the sub-subdomains and creating the certs for them too. Since we have to generate DNS entries for those subdomains anyway, because of our LB setup, we decided to run the process ourselves. The automation isn't written yet, but we'll be working with Argo Workflows to make it happen. Should be easy enough.

We see Coder as a simpler service in a bigger system. To us, it's just a workspace generator. How the workspaces are used and wired up will be completely outside of Coder in our current plans. Not sure that helps the request here, but maybe some food for thought? 🙂

Scott

@raphaelfff
Copy link
Author

@Emyrk

Exposing the regex would be cool! (for example i m running a single agent, having it in the URL is kinda useless for me...)
I would not expose the regex, but simply the URL pattern, like that its flexible enough for the user, yet not too complicated while maintaining the inner constraints for each segments:

"$PREFIX$APP_SLUG--$AGENT_NAME--$WORKSPACE_NAME--$USERNAME"
"$PREFIX$APP_SLUG--$WORKSPACE_NAME--$USERNAME" (i would use a pattern like that)

You can use that to generate the regex, and the URL in the UI


Another suggestion is to have each app define its subdomain:
Then its user responsibility to make sure its unique enough so that there is no conflict with other workspaces:

resource "coder_app" "myapp" {
  agent_id  = coder_agent.main.id
  display_name = "MyApp"
  slug      = "myapp"
  domain = "myapp-${lower(data.coder_workspace.me.name)}--${data.coder_workspace.me.owner}"
  url       = "http://localhost:8080"
  subdomain = true
}

Prefix would always be prepended (or not, may be configurable)


If we keep it simple for now, and closer to what's existing, I have a suggestion (not super happy with the naming...):

resource "coder_app" "myapp" {
  agent_id  = coder_agent.main.id
  display_name = "MyApp"
  slug      = "myapp"
  prefix = "acme-dot-" //                  <<------------
  url       = "http://localhost:8080"
  subdomain = true
}

The app would match *myapp--$AGENT_NAME--$WORKSPACE_NAME--$USERNAME
But the button in the UI would build the url to acme-dot-myapp--$AGENT_NAME--$WORKSPACE_NAME--$USERNAME

One question here is how to resolve ambiguities, like declaring another app with name dot-myapp... (if you think that even is a problem)

@Emyrk
Copy link
Member

Emyrk commented Oct 6, 2023

The ambiguities are a problem, and I agree your suggestion is simplier. I was unsure if the naming pattern atm would be an issue for anyone ($APP_SLUG--$AGENT_NAME--$WORKSPACE_NAME--$USERNAME).

Interesting thought on putting it on the app instead, but that brings about some a chicken and the egg problem. Currently the router parses the url, then finds the app based on the parameters it has.

If we put it in the terraform, we have to do this route check in postgres where the apps are I assume? Doing a DB call for each request would be quite exorbitant, so requires some caching.... just adds complexity to the parsing I don't think we want to undertake.

In Coder v1 we allowed static suffixes to these urls. I figured it would be nice to have something more dynamic.

The issue with not doing regex, and doing something like: $PREFIX$APP_SLUG--$WORKSPACE_NAME--$USERNAME, is that Coder has no idea where the $PREFIX ends and the $APP_SLUG starts. There needs to be some separation character or regex character limit in the $PREFIX.

So keeping things "simple" I think adds some ambiguity, where the regex, although much more complicated, lets the operator address the ambiguity. And regex is more easily testable locally, like the links I sent above.


I'm still tinkering with some ideas, but the regex feels inclusive of all possible wants. And it allows the configuration at the deployment wide (for better or worse) which keeps the computation easily in memory.

@deansheather
Copy link
Member

The only issue I see with prefixes is the max length of a subdomain is 255 characters. Aside from that, I think it can be very reasonable to ignore some prefix when parsing the subdomain for routing traffic to the app.

It's 63 per segment. I don't think we should support this for blahblah--app--agent--workspace--username.coder.com because hitting the 63 char limit results in the browser refusing to send the request (you can't detect this and show a nice error). https://www.rfc-editor.org/rfc/rfc1035#section-2.3.4

@raphaelfff
Copy link
Author

63 chars is still plently... we could go up to blahblahblahblahblahblahlahbla--app--agent--workspace--username.coder.lol.

Although you cant catch that beautifully, I still see value in the prefix feature

@deansheather
Copy link
Member

With real usernames, workspace names, agent names and app names/ports you get a lot less characters to work with. In our wgtunnel project we recently had issues where we were hitting the hostname segment length limit with real names and we had to come up with a solution to make our suffixes much shorter to save space.

If subdomains work for your use case (and you can support TLS and such) then I think we should implement that rather than the segment prefix thing.

@raphaelfff
Copy link
Author

I see, well with our real usernames, workspace names, agent names and app names/ports we have plenty of characters left! (real example: mgmt--main--0--raphael.coder.lol, and i really only need to prefix another ~10 characters)

On the short term getting TLS to work is not a viable solution (need to setup a self hosted ACME and figure our the cert trust etc), so having the prefix feature would really help.

@raphaelfff
Copy link
Author

Coder being self-hosted, its all about flexibility, prefix might will probably work for most people, but if you have the need/technical capability of a subdomain setup, then i guess that should be a possibility as well

@deansheather
Copy link
Member

I've made a PR to allow for prefixes in the hyphenated subdomain thing. They must be separated with three hyphens instead of two so we can expand/remove segments from the parsing algorithm easily in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants