MNT Remove encoding declarations: # -*- coding: utf-8 -*-
#21260
In Python 3, the default source file encoding is UTF-8.
# -*- coding: utf-8 -*-
While for Python it's indeed not necessary, editors/IDEs will also need an indication of what encoding it is (and I guess otherwise could use the system encoding). https://stackoverflow.com/a/14083123 For instance on Windows the default encoding is not UTF-8.
So I think it's probably safer to keep those headers. |
The default encoding of Python source files and the default encoding associated to the system locale are two different things.
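To illustrate the distinction, here is a minimal sketch using only the standard library. The helper name `source_encoding` and the sample byte strings are made up for illustration; `tokenize.detect_encoding` applies the same rules Python uses to pick a source file's encoding (BOM, then declaration, else UTF-8), while `locale.getpreferredencoding` reports the locale-dependent default.

```python
import io
import locale
import tokenize


def source_encoding(source_bytes):
    """Return the encoding the tokenizer would use for these source bytes.

    Hypothetical helper for illustration, not part of scikit-learn.
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Source bytes without any encoding declaration: Python 3 assumes UTF-8.
no_declaration = 'author = "Gaël"\n'.encode("utf-8")

# Source bytes with an explicit declaration: the declaration wins.
with_declaration = b"# -*- coding: latin-1 -*-\nx = 1\n"

print(source_encoding(no_declaration))
print(source_encoding(with_declaration))

# The locale's preferred encoding is a separate setting and varies by
# platform and configuration; it does not affect how .py files are parsed.
print(locale.getpreferredencoding(False))
```

The first two results depend only on the file contents, whereas the last one depends on the machine it runs on.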
Do you know of a broken IDE that would understand the encoding declaration Currently, some pure ASCII files do have the encoding declaration and some UTF-8 files lack it. No one complains about it, this is a strong indication the encoding declaration is probably NOT important. Also the current situation is inconsistent. Either all source files should start with the encoding declaration, or all UTF-8 files (and only them) should start with the encoding declaration - whatever you prefer. |
A majority of the files I have modified seem to be ASCII files:
|
Here are the UTF-8
And here are the UTF-8
|
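For anyone wanting to reproduce this kind of split, here is a rough sketch of how a file could be classified. It is not the command used above: the `classify` helper is hypothetical, and the check is simplified (it looks for a PEP 263 style declaration in the first two lines and tests whether all bytes are ASCII).

```python
import re

# Pattern for an encoding declaration, following the regular expression
# given in PEP 263; the declaration must sit on line 1 or 2 of the file.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def classify(source_bytes):
    """Return (is_ascii, has_declaration) for raw source bytes.

    Hypothetical helper for illustration only.
    """
    first_two_lines = source_bytes.splitlines()[:2]
    has_declaration = any(CODING_RE.match(line) for line in first_two_lines)
    is_ascii = source_bytes.isascii()
    return is_ascii, has_declaration


# A pure ASCII file that carries a declaration it does not need.
print(classify(b"# -*- coding: utf-8 -*-\nx = 1\n"))

# A UTF-8 file with a non-ASCII author name and no declaration.
print(classify("# Author: Gaël\nx = 1\n".encode("utf-8")))
```

Running it over `path.read_bytes()` for each `.py` file in a checkout would give the four groups discussed here.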
Well the question is more how confident are you that IDEs will rely on the file extension to determine the encoding :) For instance in https://www.jetbrains.com/help/pycharm/encoding.html I see nothing about it being a Python or a non-Python file. I guess it depends on the default settings. One could hope they are reasonable, but personally I don't know. I would just rather be careful here. So a confirmation that UTF-8 is used by default, particularly on Windows, for some of the more popular editors would be useful. In VS Code the default is indeed UTF-8. It's probably fine, but it doesn't hurt to double check. |
I don't have Windows to double-check. Instead I can add Alternatively we can close this PR. |
Note that VS saves as UTF-8 with a BOM by default, but we probably want to avoid the BOM in a multi-platform context. |
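As a small illustration of what the BOM changes at the byte level, here is a sketch using Python's built-in `utf-8` and `utf-8-sig` codecs. The sample text is made up.

```python
import codecs

text = "# Author: Loïc\n"

# "utf-8-sig" writes the three-byte BOM before the payload; "utf-8" does not.
with_bom = text.encode("utf-8-sig")
without_bom = text.encode("utf-8")

print(with_bom.startswith(codecs.BOM_UTF8))
print(with_bom[len(codecs.BOM_UTF8):] == without_bom)

# Decoding BOM-prefixed bytes as plain UTF-8 keeps the BOM as a leading
# U+FEFF character, which is what tools that do not expect a BOM will see.
print(with_bom.decode("utf-8").startswith("\ufeff"))

# Decoding with "utf-8-sig" strips the BOM again.
print(with_bom.decode("utf-8-sig") == text)
```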
Well the problem is that a file that is currently ASCII will become non ASCII as soon as one adds a non ASCII character. Let's keep it open. Maybe someone else has other opinions or could provide feedback on their IDE configuration. |
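To make that concern concrete, here is a minimal sketch of what happens when UTF-8 bytes are read with a non-UTF-8 system encoding. `cp1252` is used here only as an assumed example of a legacy Windows code page.

```python
ascii_name = "Muller"
utf8_name = "Müller"

# Pure ASCII text has the same bytes in UTF-8 and reads back identically
# under cp1252, so a wrong guess by the editor goes unnoticed.
ascii_bytes = ascii_name.encode("utf-8")
print(ascii_bytes.decode("cp1252"))

# As soon as a non-ASCII character is added, the UTF-8 bytes are
# misinterpreted when decoded with the legacy code page.
utf8_bytes = utf8_name.encode("utf-8")
print(utf8_bytes.decode("cp1252"))

# Decoding with the right encoding restores the original text.
print(utf8_bytes.decode("utf-8"))
```

This only shows the failure mode an editor would hit if it ignored UTF-8; whether a given editor actually does so is a separate question.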
Then we need to add |
Note that large projects like NumPy live without the encoding declarations. It seems to be working well for them. |
Maybe they do not have authorship lines with "Gaël", "Müller" or "Loïc" or docstrings with "Schölkopf". |
+1 for keeping UTF-8 markers to avoid problems with editors on Windows and add them when needed on a case by case basis. |
The authorship lines are not relevant as such. The number of UTF-8 files and total files is similar in both projects, but I agree Scikit-learn has twice the NumPy proportion of UTF-8 files. Still, it's similar. In NumPy:
In Scikit-learn:
|
Just to be clear:
|
I don't know. Can someone with a Windows machine try to edit a On Linux and macOS I am pretty sure that UTF-8 is always used by default nowadays. |
If you really want to take into account Windows editors, I think |
I checked with these 4 editors. VSCode, PyCharm and Notepad++ use UTF-8 by default. Spyder seems to be using ASCII by default but automatically switches to UTF-8 as soon as you add a non-ASCII character. In all cases, reading the file shows the characters correctly. I think we can safely remove the UTF-8 markers. |
(please don't ask me more experiments with these editors, I'm uninstalling 3 of them as I speak 😄) |
I trust @jeremiedbb's test results. Thanks @jeremiedbb!
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
What does this implement/fix? Explain your changes.
In Python 3, the default source file encoding is UTF-8.
Any other comments?
This is a follow-up of #21246 which was limited to examples.