MAINT create robots.txt for setting up pydata-sphinx-theme preview #28376
Looks good to me. Let's try it out. I think the chances of this breaking something on the main website are small, but I'll keep an eye on it after merging to check.
@@ -0,0 +1,2 @@
+User-agent: *
+Disallow: /_pst_preview/
Suggested change:
-Disallow: /_pst_preview/
+# Do not let search engines index the PyData theme preview site
+# during the live testing phase.
+# https://github.com/scikit-learn/scikit-learn/pull/28353
+Disallow: /_pst_preview/
My review arrived too late. It would be good to insert a comment explaining the why behind such config files.
Thanks @ogrisel for the comment, I did not know that robots.txt can have comments. I will open a follow-up for that.
I did not know either, but it looked like the intuitive thing to do, so I checked: https://developers.google.com/search/docs/crawling-indexing/robots/create-robots-txt#create_rules
I opened #28378 for this.
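For anyone who wants to check the commented version locally before it goes live, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It assumes the file is served from the root of `scikit-learn.org`, and the two page URLs are only illustrative. The standard-library parser is not Google's crawler, so this only confirms that the file parses as intended, not how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# robots.txt content from the suggestion above, comments included.
ROBOTS_TXT = """\
User-agent: *
# Do not let search engines index the PyData theme preview site
# during the live testing phase.
# https://github.com/scikit-learn/scikit-learn/pull/28353
Disallow: /_pst_preview/
"""


def build_parser(text):
    """Parse robots.txt content held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Illustrative URLs (assumptions): one under the preview prefix, one outside it.
preview_allowed = parser.can_fetch(
    "*", "https://scikit-learn.org/_pst_preview/index.html"
)
stable_allowed = parser.can_fetch(
    "*", "https://scikit-learn.org/stable/index.html"
)

print("preview crawlable:", preview_allowed)
print("stable crawlable:", stable_allowed)
```

With this parser, the `#` lines are skipped, so the `Disallow` rule still applies to the `User-agent: *` group.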
Reading https://github.com/scikit-learn/scikit-learn/blob/main/build_tools/circle/push_doc.sh more closely to understand how the docs repo works I think the result of this PR will be a Are the files above Maybe @thomasjpfan knows more?
As an alternative, adding a robots.txt directly into https://github.com/scikit-learn/scikit-learn.github.io may be simpler ... That would mean either opening a PR on the scikit-learn.github.io repo (better for remembering why we did this) or pushing directly to the repo if we want something quick and dirty. There are precedents for doing things directly in the scikit-learn.github.io repo, see https://github.com/scikit-learn/scikit-learn.github.io/pulls?q=is%3Apr+sort%3Aupdated-desc+is%3Amerged
I am +1 for directly adding into
Yep, and (though I'm not one) I think maintainers need to make some modifications directly in that repo for each major/minor release: see the 9th step of https://scikit-learn.org/dev/developers/maintainer.html#making-a-release.
I opened scikit-learn/scikit-learn.github.io#21 to add a robots.txt directly in the .github.io repo.
Related to: #28353. In particular, see #28353 (comment).
@betatim