Skip to content

MNT: let bad rcParam keys pass #19588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

jklymak
Copy link
Member

@jklymak jklymak commented Feb 26, 2021

PR Summary

Close #19578

I don't see why we would fail to import if the rcdefault and the rcParam dicts don't exactly match. It definitely is a pain for developers who need to bisect. But I frankly don't follow why we have all this machinery at all, so no doubt I am missing a subtlety...

EDIT: As pointed out by @timhoffm the issue is that /matplotlibrc.template can get out of sync with rcdefaults._validators.
This is confusing because we were not checking against /matplotlibrc.template, but rather /lib/matplotlib.mpl-data/matplotlibrc which gets copied from /matplotlibrc.template when setup.py is called.

This PR moves the check from being done on import matplotlib to a test that is run by CI.

PR Checklist

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (run flake8 on changed files to check).
  • New features are documented, with examples if plot related.
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • Conforms to Matplotlib style conventions (install flake8-docstrings and run flake8 --docstring-convention=all).
  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).

Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm undecided if a configuration MPLBISECT_MODE=1 as proposed here is needed or if the warning is sufficient.

The source if rcParamsDefault is cbook._get_data_path("matplotlibrc"). Am I right that we control that file and it's not user configurable. Then, erroring out with undefined keys serves the purpose that we don't carry around inconsistent state in the repo. Warnings can be overlooked. Otherwise, just warning would be ok.

if key == 'backend':
defaultParams['backend'] = rcsetup._auto_backend_sentinel
elif key in rcParamsDefault.keys():
defaultParams[key] = validator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused that before defaultParams was a mapping key: [default_value, validator] and now it's key: validator.
Both don't make sense to me. I'd expect key: default_value. But maybe that's just my before-first-coffee brain.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooops, well, it is certainly a mystery to me what this is even supposed to do if this big error didn't kill the test suite. A search in the repo turns up defaultParams only gets used here, and nowhere else. Removing this whole block of code leads to no changes so far as I can tell. So I am completely mystified why this code even exists.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed a new commit that cuts it all. So far tests are fine. So again, what is this supposed to do (he sheepishly asks of a PR change he approved #15029)

@anntzer
Copy link
Contributor

anntzer commented Feb 26, 2021

I think I would rather have something like: if loading the default matplotlibrc fails, then check whether matplotlibrc.template exists in the directory twice above matplotlib/__init__.py (i.e. where it would be in a dev install), and if so load it from there.

@jklymak jklymak added this to the v3.4.0 milestone Feb 26, 2021
@jklymak jklymak added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Feb 26, 2021
@jklymak
Copy link
Member Author

jklymak commented Feb 26, 2021

I'm marking release critical because #15029 has not been released yet, and I think this would be good to sort out. But apologies in advance if I'm just missing something obvious.

@jklymak
Copy link
Member Author

jklymak commented Feb 26, 2021

So this only checks consistency between the template, and the rcParams (and spits out rcsetup.defaultParams).

I'd argue this should not be done every time the matplotlib starts up, but rather should be a test, since inconsistency between rcsetup.py and the template is a development problem, not a user problem?

@jklymak jklymak marked this pull request as draft February 26, 2021 16:18
@jklymak
Copy link
Member Author

jklymak commented Feb 26, 2021

So all this is doing is testing, every time we load matplotlib, that the installed version of the template in mpl-data is the same as what is hardcoded in rcsetup.py? a) are we protecting against the user manually mucking with this? Why? b) if we want to give the user a way to make a matplotlibrc.template shouldn't we just give them a function to do that? Why do we have mpl-data/matplotlibrc at all?

@WeatherGod
Copy link
Member

WeatherGod commented Feb 26, 2021 via email

@jklymak
Copy link
Member Author

jklymak commented Feb 26, 2021

The check went in a few years ago when someone noticed that the template wasn't getting updated appropriately,

OK, that makes sense. But I think we now make the template during setup.py (right?) so it is impossible for it to be out of sync. So I think this whole check can safely be removed?

Edit: confirmed, lib/matplotlib/mpl-data/matplotlibrc is made on setup (which is the whole problem here). I can't think of a reason for this check given that is the case. @anntzer? @tacaswell?

@jklymak jklymak marked this pull request as ready for review February 26, 2021 18:51
@jklymak jklymak force-pushed the mnt-let-bad-key-pass branch from 43f16ae to 2fdb7df Compare February 26, 2021 18:53
@timhoffm
Copy link
Member

timhoffm commented Feb 26, 2021

The past

If you look at #15029, you can see what is going on:

  • Originally, we had rcsetup.defaultParams specifying the defaults and validators as mapping key: (default_value, validator). The matplotlibrc.template was detached and thus could get out of sync.
  • defaultParams was imported into top-level matplotlib, making matplotlib.defaultParamsandmatplotlib.rcsetup.defaultParams` effectively public API.
  • Get default params from matplotlibrc.template. #15029 partially addressed this by making matplotlibrc.template leading wrt. the default value and use it to generate defaultParams / rcsetup.defaultParams at startup for backward compatibility. But the template does only hold the defaults. The validators are now stored in a separate mapping rcsetup._validators.

The present

  • matplotlib.rcParamsDefault is a mapping key: default_value and determines the defaults for rcParams. It is populated from lib/matplotlib/mpl-data/matplotlibrc, which is created from matplotlibrc.template during setup.py.
  • defaultParams / rcsetup.defaultParams is a mapping key: (default_value, validator). It seems unused within matplotlib nowadays, but is former public API and might be accessed by users. It's keys come from rcsetup._validators. These keys can theoretically get out of sync with the keys in the template / rcParamsDefault. In practice we would note that:

The problem

This is a messy but carefully tuned setup that works for production.

The bisect problem #19578 comes from the fact that the two carefully synced sources have different update mechanisms. Validators are immediately updated as you walk through the history. But defaults are only updated when you run setup.py. This lets them drift apart, which is (somewhat correctly) detected by our synchronization safeguards.

The future

is left as an exercise to the able reader. 😄

Hint: It's probably not feasible/desired to unite the two sources. If that's the case, you need to be able to get consistent defaults at startup, which may mean we need a flag that one has to enable for bisecting, which generates matplotlibrc at startup.

@jklymak
Copy link
Member Author

jklymak commented Feb 27, 2021

Thanks @timhoffm I think I understand now:

We have

  • rcdefaults._validators that map keys to their validators,
  • /matplotlibrc.template that has the actual default values for each key, serves as the template, and has a bunch of comments in it that are helpful to the user
  • lib/matplotlib/mpl-data/matplotlibrc gets made at setup.py and is a direct copy of matplotlibrc.template at setup.

I still feel, however, that there is no need to check the consistency of rcdefaults._validators and lib/matplotlib/mpl-data/matplotlibrc every time that matplotlib is imported. I don't see how these could get out of sync unless someone deliberately did so (or you are developing and this file is not being updated because you are running in editable mode). We should have a test that matplotlibrc.template and rcdefaults._validators are in sync as part of CI. Having lib/matplotlib/mpl-data/matplotlibrc and rcdefaults_validators go out of sync doesn't actually hurt anything, so far as I can tell - certainly all the tests continue to run.

@jklymak jklymak force-pushed the mnt-let-bad-key-pass branch from 1abe137 to e2a21aa Compare February 27, 2021 06:09
@jklymak
Copy link
Member Author

jklymak commented Feb 27, 2021

Added a test that maplotlibrc.template and rcsetup._validators are in sync.

@jklymak jklymak force-pushed the mnt-let-bad-key-pass branch from fd5996a to 88b2d1f Compare February 27, 2021 06:34
@jklymak jklymak force-pushed the mnt-let-bad-key-pass branch from 88b2d1f to 1a46563 Compare February 27, 2021 07:26
@anntzer
Copy link
Contributor

anntzer commented Feb 27, 2021

Thanks @timhoffm for the long writeup :-)
I'm not sure I really like just warning if an entry is missing in mpl-data/matplotlibrc (because I guess that'll just crash with a KeyError if we ever try to access that entry?). Moreover if that ever happens to an end user that likely means that their install is quite messed up. Instead (as briefly mentioned above) I think that if we detect that things are out of sync (i.e. an entry is missing), then we should check whether matplotlibrc.template exists in the directory twice above matplotlib/__init__.py (i.e. where it would be in a dev install), and if so load it from there; if it's not there then we should crash. e.g. something like

diff --git i/lib/matplotlib/__init__.py w/lib/matplotlib/__init__.py
index c3d4aaf62..5cf679ad5 100644
--- i/lib/matplotlib/__init__.py
+++ w/lib/matplotlib/__init__.py
@@ -823,6 +823,17 @@ rcParamsDefault = _rc_params_in_file(
     transform=lambda line: line[1:] if line.startswith("#") else line,
     fail_on_error=True)
 dict.update(rcParamsDefault, rcsetup._hardcoded_defaults)
+if {*rcParamsDefault} ^ {*rcsetup._validators}:
+    # This should only occur e.g. when bisecting on a dev install when setup.py
+    # hasn't been run to update mpl-data/matplotlibrc.
+    _repo_template = (Path(__file__).parents[2] / "matplotlibrc.template")
+    if _repo_template.exists():  # Do we want to update the template here?
+        rcParamsDefault = _rc_params_in_file(
+            _repo_template,
+            # Strip leading comment.
+            transform=lambda line: line[1:] if line.startswith("#") else line,
+            fail_on_error=True)
+        dict.update(rcParamsDefault, rcsetup._hardcoded_defaults)
 rcParams = RcParams()  # The global instance.
 dict.update(rcParams, dict.items(rcParamsDefault))
 dict.update(rcParams, _rc_params_in_file(matplotlib_fname()))

@jklymak
Copy link
Member Author

jklymak commented Feb 27, 2021

@anntzer, I'm not following the point of this. If the user has broken their /lib/matplotlib/mpl-data/matplotlibrc then we should instead check, at run time, the /matplotlibrc.template? Why would that file be any more trustworthy? I mean at that point we should check all the *.py files because they could have edited those.

Maybe an idea is to hide a copy of /lib/matplotlib/mpl-data/matplotlibrc as /lib/matplotlib/mpl-data/._matplotlibrc_installed and use that as canonical if we are afraid of folks editing their installs?

The other idea would be to wrap the config template in a python file, and or make rcsetup._validators have
"default", "preamble" and "postscript" properties that they are written as comments around the defaults, and just create all the templates on setup.py and never look at them again.

@anntzer
Copy link
Contributor

anntzer commented Feb 28, 2021

If the user has broken their /lib/matplotlib/mpl-data/matplotlibrc then we should instead check, at run time, the /matplotlibrc.template?

Here I am specifically talking about the case of bisecting on editable installs, because that's a case that occurs relatively often for devs (I have noticed the problem too). If a plain end-user install is broken, then we should just error out -- mpl-data/matplotlibrc being messed up on a plain install should be treated just as if an end-user edits and breaks any .py file in their matplotlib install, we shouldn't try to handle that.

For editable installs, mpl-data/matplotlibrc can (as you noticed) easily get broken nowadays if you change your local repo HEAD and don't run setup.py to regen it from the toplevel matplotlibrc.template. On the other hand, the toplevel matplotlibrc.template should always be in sync with the rest of the repo state (because any change to e.g. rcParams.validators should be commited simultaneously with a change to matplotlibrc.template in order to make the tests pass -- if they are not in sync, then CI will not pass). So matplotlibrc.template should always be correct.

@jklymak
Copy link
Member Author

jklymak commented Feb 28, 2021

So matplotlibrc.template should always be correct.

Right, so why don't we load the defaults from there for everyone? Why does lib/matplotlib/mpl-data/matplotlibrc exist at all?

@anntzer
Copy link
Contributor

anntzer commented Feb 28, 2021

Because matplotlibrc.template doesn't exist for user installs? (where would it be for a matplotlib installed in someone's site-packages?)

@jklymak
Copy link
Member Author

jklymak commented Feb 28, 2021

Ok, then conversely why don't we version control lib/matplotlib/mpl-data/matplotlibrc and why does matplotlibrc.template exist at all?

@anntzer
Copy link
Contributor

anntzer commented Feb 28, 2021

I would be in favor of killing matplotlibrc.template and having just mpl-data/matplotlibrc. However, I think in #14929 there were some concerns about people relying on the template as documentation (see e.g. #14929 (comment)). Also, we allow redistributors to change the default backend (via setup.cfg, see #11844) which means that the file needs to be modified to support that -- although that's admittedly fixable by doing slightly more nasty things in setup.py to change the file in the wheel.

@jklymak
Copy link
Member Author

jklymak commented Feb 28, 2021

For documentation consistency we could easily just softlink matplotlibrc.template to lib/matplotlib/mpl-data/matplotlibrc. I don't see that as a problem.

I don't understand wheel building well enough - is the idea that setup.cfg needs to be able to modify a file to create lib/matplotlib/mpl-data/matplotlibrc? Is it a problem to just modify the file in-place? The relevant code is at

    # Write the default matplotlibrc file
    with open('matplotlibrc.template') as fd:
        template_lines = fd.read().splitlines(True)
    backend_line_idx, = [  # Also asserts that there is a single such line.
        idx for idx, line in enumerate(template_lines)
        if line.startswith('#backend:')]
    if setupext.options['backend']:
        template_lines[backend_line_idx] = (
            'backend: {}'.format(setupext.options['backend']))
    with open('lib/matplotlib/mpl-data/matplotlibrc', 'w') as fd:
        fd.write(''.join(template_lines))

and it seems that could easily be done in-place.

@jklymak
Copy link
Member Author

jklymak commented Feb 28, 2021

See #19602 for that approach

@anntzer anntzer mentioned this pull request Feb 28, 2021
7 tasks
@anntzer
Copy link
Contributor

anntzer commented Feb 28, 2021

soft-links will be a problem on Windows (they can be made to work but ugh); updating matplotlibrc in place directly means your repo will be in a dirty state which may or may not confuse versioneer. See #19603.

@jklymak
Copy link
Member Author

jklymak commented Feb 28, 2021

I don't think the softlink is for windows in particular, but so anyone who was grabbing matplotlibrc.template from the repo can still find it. Also if we remove it, then tutorials/introductory/customizing.py will need to be fixed.

@jklymak
Copy link
Member Author

jklymak commented Mar 2, 2021

Closing for #19603.

@jklymak jklymak closed this Mar 2, 2021
@QuLogic QuLogic removed this from the v3.4.0 milestone Mar 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bisect very hard with rcParam changes
5 participants