relative url's can cause bad links #77

Closed
SvenDowideit opened this issue Apr 15, 2014 · 20 comments

@SvenDowideit

I am deploying to S3 and have a hidden MkDocs 404 error page. When a user enters a long (incorrect) URL, the error page is rendered at that URL, so all the menu links are broken.

@d0ugal
Member

d0ugal commented Dec 17, 2014

FWIW, relative URLs have also caused a ton of issues with implementing #222.

I'm not really a fan of relative URLs, it is only really useful for people that view the docs in static mode and I'm not sure how popular a use case that is.

The only way I can think of solving this would be by having a config option - but I'm somewhat wary of this.

@SvenDowideit I assume you already use this approach in your forked version?

@tomchristie
Member

Is there an example for this?

I use relative interlinking all the way through the REST framework docs. (Eg see links at the bottom of this document... https://raw.githubusercontent.com/tomchristie/django-rest-framework/master/docs/topics/rest-hypermedia-hateoas.md )

What doesn't work?

@d0ugal
Member

d0ugal commented Dec 17, 2014

Consider for example the search index, it is generated once for all pages. However, if you display the results in a modal the links need to be relative (or they won't work in static mode). To make this work in #222 I had to do this bit of gross magic, I'm trying to come up with a better solution.

https://github.com/d0ugal/mkdocs/blob/search/mkdocs/themes/mkdocs/js/tipuesearch/tipuesearch_content.js
https://github.com/d0ugal/mkdocs/blob/search/mkdocs/themes/mkdocs/base.html#L65

@tomchristie
Member

> it is only really useful for people that view the docs in static mode

Maybe I'm just misunderstanding, but being able to click between docs in my editor is pretty neat.

> Consider for example a search index

Are we just talking about relative links in the theme then? No problem from me with not supporting that.

@d0ugal
Member

d0ugal commented Dec 17, 2014

Updated my comment as I rushed it on my phone. So yeah, I think for me the issue is supporting it at the theme level.

I think with @SvenDowideit's example you have a similar issue when the 404 could be rendered at /invalid-path/ or /invalid/path/ but I may be misunderstanding that.

@d0ugal
Member

d0ugal commented Dec 17, 2014

It is worth noting that we don't have this issue with the DRF theme, but I'm not sure if that is specific to that theme. The menu works fine: http://www.django-rest-framework.org/testing/testing/testing/

@tomchristie
Member

I'm still surprised tho - I've never used anything except relative links with MkDocs,
and put a stack of time into making them work initially.
Be helpful to see a broken example case.

@tomchristie
Member

I'm further confused because this report seems to be completely at odds with #192

If you can't use relative or absolute URLs how are the rest of you folks getting by? :)

@d0ugal
Member

d0ugal commented Dec 17, 2014

Hah, so I'd always use relative for document linking. I don't think absolute URLs work (although, with the strict mode disabled they may happen to work).

I chimed in as I've had issues with #222 but that is a specific theme case so I may have just confused things. Maybe Sven can clarify the original issue.

@tomchristie
Member

> we don't have this issue with the DRF theme

Yeah just noticed that the nav on the 404 page there is hardcoded in the theme :p

@SvenDowideit
Author

http://docs.docker.com/ is an example - we also publish a version of the same docs to http://docs.docker.com/v1.2/ etc

and at the same time, there are webserver based redirects that may display a page, but come from a totally different place in the url structure.

so for a (made-up) example, http://yoursite/one/two/three/index.html may also be shown when the user requests http://yoursite/other/place.index.html

so the relative link is going to be at the wrong place, and the wrong level.
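
A minimal sketch of the failure described above, using Python's standard `urllib.parse.urljoin` to resolve a link roughly the way a browser would. The link `../../index.html` is a made-up nav link, assumed only for illustration; the two `yoursite` URLs are the ones from the example.

```python
from urllib.parse import urljoin

# A relative link baked into the page when it was built for /one/two/three/.
# (Hypothetical link, chosen only to illustrate the problem.)
nav_link = "../../index.html"

# Where the page was built to live.
built_at = "http://yoursite/one/two/three/index.html"
# Where the server actually shows it (redirect or shared error page).
served_at = "http://yoursite/other/place.index.html"

intended = urljoin(built_at, nav_link)
actual = urljoin(served_at, nav_link)

print(intended)  # http://yoursite/one/index.html
print(actual)    # http://yoursite/index.html
```

The same link text resolves to two different targets, because a relative link is always interpreted against the URL in the address bar, not the URL the page was built for.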

I've obviously made a change in my fork for this... :)

@SvenDowideit
Author

mmm, iirc, it's also very relevant if you use the same 'search' page for all 404s, which we do.

@ePirat

ePirat commented Jan 7, 2015

@d0ugal What exactly do you mean with

> I'm not really a fan of relative URLs, it is only really useful for people that view the docs in static mode and I'm not sure how popular a use case that is.

Static mode?

@d0ugal
Member

d0ugal commented Jan 7, 2015

@ePirat We possibly need a better name for this, basically when you browse the documentation in your browser on the local filesystem. So links like "/" don't work as that isn't the root of the documentation. So, for example mkdocs build && firefox site/index.html.
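
To make the "static mode" problem concrete, here is a small sketch, again using `urllib.parse.urljoin` as a stand-in for browser link resolution. The filesystem path and the `css/base.css` asset name are assumptions picked for the example.

```python
from urllib.parse import urljoin, urlparse

# A page opened straight from disk after `mkdocs build` (hypothetical path).
page = "file:///home/user/project/site/about/index.html"

# A root-relative link points at the filesystem root, not the docs root.
root_link = urlparse(urljoin(page, "/css/base.css")).path
# A relative link stays inside the built site directory.
relative_link = urlparse(urljoin(page, "../css/base.css")).path

print(root_link)      # /css/base.css
print(relative_link)  # /home/user/project/site/css/base.css
```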

@waylan
Member

waylan commented Aug 26, 2016

The report in #1035 brought me to this issue. This is the first time I've taken a close look at it. It appears from my reading that there is some confusion about this issue. As I see it, there are two separate (unresolved) issues with relative URLs. This one and the one in #192. The issue in #192 applies only to URL munging within Markdown and is not related to this issue.

In this issue the OP encountered the same problem as encountered in #1035 (which I closed as a duplicate, but it clearly defines the user's problem and helps clear up the issue). Namely, a template renders a relative URL which is hard coded into the page. Normally, not a problem as each page is always only ever at one location. However the problem exists when the same page can exist in multiple different locations. For example, error pages are served for any URL in which an error occurs. The location of any other page relative to the current URL will be different for most cases--especially now that we allow multi-level nesting.

Of course, in dynamic servers, the error page is often regenerated each time (from a template), in which case the current URL is accounted for when rendering the template, avoiding this problem. But we don't have this option as our error page is rendered without knowledge of the requested URL. Therefore, we need nav items, images, js, css, etc, to point to the proper location regardless of the location of the current page.

The easy way to do that is via absolute URLs. But, of course, we cannot assume that the server root is the same as the MkDocs root. For example, consider documentation hosted on http://pythonhosted.org as explained in this comment. The point is, we can't just assume / is the documentation root on the server. It may be /projectname/ or /username/projectname/ or some other thing.

Previous comments have mentioned adding some sort of new config setting. That should not be necessary. We already have the site_url setting. If a project's documentation root is at http://example.com/projectname/, then that is going to be the value assigned to site_url. We can extrapolate from that that any absolute URL should have /projectname/ prepended to it to be a proper absolute URL.

Of course, there is the issue of the dev server (livereload or not) in which the server root is always the same as MkDocs root. But that should not be a problem as we also have the dev_addr setting. If that is set, just extrapolate the root from that rather than site_url. Incidentally, I find it strange that this config setting exists. Yes, it should exist as an option on the serve command, but shouldn't the value just be passed into the config as an override of the site_url setting during development? But I digress.

So my proposal would be that the base_url template variable be the absolute path from the server root to the MkDocs root. To illustrate, here's some examples of what the base_url template variable should contain depending on the value of the site_url config setting:

| site_url | base_url |
| --- | --- |
| http://example.com/ | / |
| http://example.com | / |
| http://example.com/foo/ | /foo/ |
| http://example.com/foo | /foo/ |
| http://example.com/foo/bar/baz/ | /foo/bar/baz/ |
| http://127.0.0.1:8000 (from dev_addr) | / |
| http://0.0.0.0:80 (from dev_addr) | / |
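
One way the mapping above could be computed is sketched below. The function name `base_url_from_site_url` is hypothetical and this is not taken from the MkDocs code base; it only uses the standard library's `urlparse`.

```python
from urllib.parse import urlparse


def base_url_from_site_url(site_url: str) -> str:
    """Return the path from the server root to the docs root.

    The result always ends with a trailing slash, matching the table above.
    """
    path = urlparse(site_url).path
    if not path.endswith("/"):
        path += "/"
    return path


# The rows of the table, as (site_url, expected base_url) pairs.
examples = [
    ("http://example.com/", "/"),
    ("http://example.com", "/"),
    ("http://example.com/foo/", "/foo/"),
    ("http://example.com/foo", "/foo/"),
    ("http://example.com/foo/bar/baz/", "/foo/bar/baz/"),
    ("http://127.0.0.1:8000", "/"),
    ("http://0.0.0.0:80", "/"),
]
for site_url, expected in examples:
    print(site_url, "->", base_url_from_site_url(site_url))
```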

@waylan
Member

waylan commented Aug 26, 2016

For completeness, I should point out that browsing files via the local file system (file://) will break with absolute URLs every time. Of course, currently it is broken anyway (search won't work), but if we want to support that, then absolute URLs are out. In that case, I would propose using absolute URLs on error pages only (as described in my previous comment). Error pages are server specific and don't get returned by the local file system on an error, so it should be okay if they are broken in that instance. The trick is how to address the nav on error pages. If the nav contains relative URLs, those URLs still need to be absolute on the error page. Perhaps the error page should only contain a link to the homepage, not the entire nav.

@waylan
Member

waylan commented Aug 27, 2016

Warning, the following is a complicated explanation of how the code works. Sorry, but explaining it is part of how I work out solutions, which you'll find at the end.

I think the key to fixing this issue is to understand how the `URLContext` works. The 'short' explanation is that in `mkdocs.commands.build.build_pages` we have the loop `for page in site_navigation.walk_pages():`. The key is in the `walk_pages` method. That method sets a page as "active" and yields it. `build_pages` then takes that page and builds it (renders Markdown and template, then writes to disk). The `url` property of each page is a method which makes a call to the `URLContext` using the "active" page to create URLs which are relative to that "active" page (the entire loop through the nav in the template happens with this one page set as "active"). After the page is built, the loop (in `build_pages`) continues, asking `walk_pages` for the next page. `walk_pages` then deactivates the current page and activates the next page before yielding it.

The entire thing is dependent on the global "active" page and `URLContext` and the fact that we are in a synchronous system (it would never work in an asynchronous environment). That is how each page is generated with a complete nav where all the URLs are relative to that page, with absolutely no consideration for the location of the current page in the template. It's a clever piece of code which removes the need for the template to care about relative URLs.
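
The pattern described above can be modelled in a few lines. This is a simplified sketch, not MkDocs' actual implementation: the names `URLContext` and `walk_pages` are borrowed from the explanation, while the method bodies, the `Page` class and the example URLs are assumptions made for illustration.

```python
import posixpath


class URLContext:
    """Shared state: the directory of the page currently being built."""

    def __init__(self):
        self.base_path = "/"

    def set_current_url(self, url):
        self.base_path = posixpath.dirname(url)

    def make_relative(self, url):
        # relpath() drops trailing slashes, so restore one for directory URLs.
        suffix = "/" if url.endswith("/") and len(url) > 1 else ""
        return posixpath.relpath(url, start=self.base_path) + suffix


class Page:
    def __init__(self, abs_url, context):
        self.abs_url = abs_url
        self.context = context

    @property
    def url(self):
        # Relative to whichever page is "active" at the moment of access.
        return self.context.make_relative(self.abs_url)


def walk_pages(pages, context):
    for page in pages:
        context.set_current_url(page.abs_url)  # activate, then yield
        yield page


context = URLContext()
pages = [Page("/", context), Page("/user-guide/install/", context)]

nav_per_page = {}
for active in walk_pages(pages, context):
    # The "template" just reads page.url; the context makes it relative.
    nav_per_page[active.abs_url] = [p.url for p in pages]

print(nav_per_page)
```

Because `url` reads shared state at access time, the same `Page` objects yield different link text depending on which page is being built, which is why the approach depends on a strictly sequential build loop.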

However, there are pages which do not get this treatment. Specifically, any page which is not listed in the `pages` config setting. As it happens, all those pages are template-based pages, such as error pages, `sitemap.xml`, `search.html` and anything in the `extra_templates` config setting. The key difference is that for these pages there is no "active" page, in which case the `URLContext` defaults to assuming an "active" URL of `/`. Of course, `search.html` is at the site root (as is `sitemap.xml`) so the relative paths in the nav are correct. However, `extra_templates` could be nested at any location (I assume this is an unreported bug as it's a little-used feature). Presumably, setting the "active" page in the `URLContext` to the path of the current template would build a proper nav for each page in `extra_templates` (untested, but I think it should work).

But for error pages it's a little more tricky. As discussed previously, error pages could be served at any URL nested at any level by the server. There is no correct "active" page to make nav items relative to. Therefore, I propose that if there is no "active" page, rather than the default being to generate a URL for a page relative to `/`, instead generate an absolute URL with the base extrapolated from `site_url`. This absolute URL should then be returned by the `mkdocs.nav.Page.url` method when the error template builds its nav.
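
A rough sketch of that proposal, under stated assumptions: `nav_url` is a hypothetical helper (not the actual `mkdocs.nav.Page.url` code), `active_url=None` stands for "no active page", and the example URLs are invented.

```python
import posixpath
from urllib.parse import urlparse


def nav_url(target, active_url=None, site_url="http://example.com/"):
    """Relative URL when a page is active; absolute path under the docs root otherwise."""
    if active_url is not None:
        # Normal pages: link relative to the directory of the active page.
        start = posixpath.dirname(active_url)
        rel = posixpath.relpath(target, start=start)
        return rel + "/" if target.endswith("/") and rel != "." else rel
    # Error pages: no active page, so build an absolute path from site_url.
    base = urlparse(site_url).path
    if not base.endswith("/"):
        base += "/"
    return base + target.lstrip("/")


# A regular page keeps its relative nav link.
print(nav_url("/about/", active_url="/user-guide/install/"))
# The error page gets a link that works from any request URL.
print(nav_url("/about/", site_url="http://example.com/projectname/"))
```

With this split, only pages built without an active page change behaviour, which matches the intent that everything else keeps using relative URLs.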

waylan added a commit to waylan/mkdocs that referenced this issue Aug 28, 2016
An error page can be served from any location and therefore it is
impossible to pre-build an error page with correct relative URLs.
With absolute URLs, the error pages will properly link to other
pages in the nav as well as media files (css, js, images, etc) from
the template regardless of the actual URL the file is served from.

However, to continue to support environments where the docs root is a subdir
of the server root, all other pages must continue to use relative URLs.
The `site_url` is used to determine the server root when building
absolute URLs for the error page to ensure those URLs continue to work
in that type of environment.

Relative URLs are also necessary for those who browse the site on the
local file system (via `file://`). In that case, the error page will be
broken. However, as error pages are not served by a local file system,
this is no more than a minor inconvenience. Error pages should always be
tested from a server environment.

Fixes mkdocs#77.
@waylan
Member

waylan commented Aug 29, 2016

I have a working solution in #1039. It took a few iterations, but I have the least invasive patch possible (added 9 lines of code, not counting tests and comments). Only the error page is affected, everything else continues to behave as before.

@csandanov

@waylan there's so much information and changes regarding the URLs I'm confused whether there is a solution to my problem described in #192 or not. Is it possible to make mkdocs generate absolute URLs (to avoid 404 by search crawlers) or not? I realize there are variables I can use in theme templates to get absolute URLs in navigation but what about markdown links in .md files? Relative URLs ../ generate tons of 404 errors by search crawler.

@waylan
Member

waylan commented Sep 14, 2018

@csandanov please open a new issue and provide examples describing your problem.
