Skip to content

[MNT]: Require that matplotlibrc/style files use utf-8 (or have an encoding cookie) #23026

Closed
@anntzer

Description

@anntzer

Summary

Currently, matplotlibrc and style files are read with the locale encoding, since #3575. There's even a test for it in test_rcparams.py, which reads

def test_Issue_1713(tmpdir):
    rcpath = Path(tmpdir) / 'test_rcparams.rc'
    rcpath.write_text('timezone: UTC', encoding='UTF-32-BE')
    with mock.patch('locale.getpreferredencoding', return_value='UTF-32-BE'):
        rc = mpl.rc_params_from_file(rcpath, True, False)
    assert rc.get('timezone') == 'UTC'

But actually, we probably never really supported non-ascii encodings (such as utf-32-be), because if you try to import matplotlib in such a context, we will fail much earlier, when trying to read the default matplotlibrc file:

from unittest import mock
with mock.patch("locale.getpreferredencoding", return_value="utf-32-be"):
    import matplotlib

gives

Traceback (most recent call last):
  File "/tmp/test.py", line 3, in <module>
    import matplotlib
  File ".../matplotlib/__init__.py", line 883, in <module>
    rcParamsDefault = _rc_params_in_file(
  File ".../matplotlib/__init__.py", line 785, in _rc_params_in_file
    for line_no, line in enumerate(fd, 1):
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-32-be' codec can't decode bytes in position 0-3: code point not in range(0x110000)

(the test doesn't see that because the default matplotlibrc file has already been imported at this point...). This behavior also means that style files are actually not shareable between systems that use incompatible encodings.

Given that #3575 was implemented in response to #1713, which is about the Py2/Py3 unicode transition and not any user actually requesting support for non-standard encodings, I think we should just drop any intent of reading matplotlibrc/style files using the user locale, and instead spec them as being utf-8 (or, if we want to be super-flexible, support encoding cookies as in https://docs.python.org/3/library/tokenize.html#tokenize.detect_encoding / https://peps.python.org/pep-0263/ -- but I'd say it's probably not worth it?).

Proposed fix

No response

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions