Skip to content

MAINT force encoding when opening file to not be platform specific #27731

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 10, 2023

Conversation

glemaitre
Copy link
Member

closes #27725

Trying to enforce the encoding while reading the CSS file such that it is not platform dependent.

@Charlie-XIAO Could you try to know if you still have the issue locally.

@Charlie-XIAO
Copy link
Contributor

Charlie-XIAO commented Nov 6, 2023

@glemaitre Yes I can confirm that this fixes my issue.

Copy link

github-actions bot commented Nov 6, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 199e360. Link to the linter CI: here

Copy link
Member

@betatim betatim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Not sure that there is a (not too complex) way to test for this. You'd probably have to patch the method used to determine the platform's default encoding in open()

@lesteve
Copy link
Member

lesteve commented Nov 6, 2023

Not sure that there is a (not too complex) way to test for this. You'd probably have to patch the method used to determine the platform's default encoding in open()

I pushed a commit with a test. I am guessing the test will run on Posix platforms and be skipped on Windows. I could not find a simpler way to do it, if deemed to complex this can always be reverted 😉.

@glemaitre
Copy link
Member Author

Better to have a test. Thanks @lesteve

@betatim
Copy link
Member

betatim commented Nov 6, 2023

Nice! I like it. And with C as locale reading the file fails for me, so this looks like a good enough test.

@lesteve
Copy link
Member

lesteve commented Nov 6, 2023

Oh well the CI error comes from the fact that the locale resetting logic does not work on some platforms. This is probably due to python/cpython#78115 (comment) but this needs more investigation.

@lesteve
Copy link
Member

lesteve commented Nov 7, 2023

I tweaked the test in my last commit, the debian 32bit build is now green.

codecov is red because we don't have coverage for Windows since #26052 it seems (and I expect locale.setlocale(locale.LC_CTYPE, "C") to raise a locale.Error on Windows but I have no easy way to double-check)

Edit: actually I double-checked and the "C" locale was available on the Windows VM I managed to create. It might be always available if I interpret this correctly

A value of "C" for locale specifies the minimal environment for C translation

In any case, I feel the try/except in the test does not hurt ...

@lesteve
Copy link
Member

lesteve commented Nov 10, 2023

Let's merge this!

@lesteve lesteve merged commit 2c1472d into scikit-learn:main Nov 10, 2023
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: pytest giving UnicodeDecodeError on Windows machine
4 participants