MEP32: matplotlibrc/mplstyle with Python syntax. #9528

Closed
wants to merge 3 commits into from

anntzer
Contributor

@anntzer anntzer commented Oct 22, 2017

See proposed MEP text in the PR. The rendered version is available by clicking on view at the top right of the "files changed" tab.

I had this written for a while but was hoping to push this a bit later (I don't really have the time to work on it now). #6157 (comment) made me consider at least publishing my current thoughts.

attn @matplotlib/developers

@jklymak
Member

jklymak commented Oct 22, 2017

Cool - I think I prefer the straight python syntax. How does that look to the user in their code where they call the style sheet? Same as before?

@anntzer
Contributor Author

anntzer commented Oct 22, 2017

Indeed, I did not mention that point (nothing changes for the end user). Edited accordingly.

@anntzer anntzer force-pushed the mplrc-mep branch 3 times, most recently from a1dff3f to 43cab58 Compare October 23, 2017 08:49
@jklymak jklymak added this to the v3.0 milestone Dec 21, 2017
@jklymak
Member

jklymak commented Dec 21, 2017

Milestoning as 3.0, a) because I think this would be great and should pop to the top of everyone's queue, and b) because I would be surprised if it were accepted for 2.2.

@efiring
Member

efiring commented Dec 21, 2017

I also like plain python for configuration. I ran into a speed bump, but got over it, when doing something like this. See the Bunch.from_pyfile() method in https://currents.soest.hawaii.edu/hgstage/pycurrents/file/tip/system/misc.py.

@tacaswell
Member

Thanks for putting this together! I am coming around to this being the right thing to do (despite my grumbling).

A third option for the full-python implementation is to look for modules with a function

```python
def apply_style(rcparams) -> None:
    ...
```

and have style.use pass the rcParams through them in sequence. This would also let us do the update atomically by collecting all the changes in a temporary namespace and then updating the global one once. It would also avoid having to delete the module to re-import it.

This would also allow for some interesting import hook implementations...
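
To make the hook idea above concrete, here is a minimal sketch, not Matplotlib's actual implementation. It assumes each style exposes the `apply_style(rcparams)` function suggested above; `use_styles`, the example hooks, and the plain `dict` standing in for `rcParams` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Iterable

# Hypothetical type alias: a style hook mutates the mapping it is given.
StyleHook = Callable[[Dict[str, object]], None]


def use_styles(global_rc: Dict[str, object], hooks: Iterable[StyleHook]) -> None:
    """Run every hook on a scratch copy, then update global_rc once."""
    # Work on a copy so a hook that raises leaves global_rc untouched.
    staged = dict(global_rc)
    for hook in hooks:
        hook(staged)
    # Single update at the end: either every change lands or none does.
    # (Hooks that delete keys are not handled in this sketch.)
    global_rc.update(staged)


# Hypothetical style hooks, standing in for functions found in style modules.
def dark_background(rcparams: Dict[str, object]) -> None:
    rcparams["axes.facecolor"] = "black"


def thick_lines(rcparams: Dict[str, object]) -> None:
    rcparams["lines.linewidth"] = 3.0


def broken_style(rcparams: Dict[str, object]) -> None:
    rcparams["axes.facecolor"] = "red"
    raise RuntimeError("style failed halfway through")
```

Calling `use_styles(rc, [dark_background, thick_lines])` applies both hooks in order. If `broken_style` is in the list, the exception propagates and `rc` keeps its previous values, which is the atomic behaviour described above.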

Another option would be to make rcparams a Traitlets object and leverage all of their configuration management.

@jklymak
Member

jklymak commented Mar 7, 2018

Ping @anntzer - I personally think this would be a great feature for 3.0.... (once you are done expunging any py2-isms that have been driving you nuts).

@WeatherGod
Member

WeatherGod commented Mar 7, 2018 via email

@jklymak
Member

jklymak commented Mar 7, 2018

Good point. I do wonder about MPL arbitrarily loading a config file, without the opportunity to turn that convenience off, or even better to have it turned off by default. I think it makes sense that downstream packages should never load any matplotlibrc file.

@anntzer
Contributor Author

anntzer commented Mar 8, 2018

I am not particularly interested in relitigating this question (although I will state once again that 1) did you know that np.load can also execute arbitrary code? 2) the "proper parser" proposed has been in a state of vaporware for quite a while), but given that both @tacaswell and @efiring consider that this is a good approach, I'll let them present their thoughts (if any) on the security implications.

As for the implementation strategy I think I quite like @tacaswell's suggestion of requiring an apply_style hook.

@WeatherGod
Member

WeatherGod commented Mar 8, 2018 via email

@pwuertz
Contributor

pwuertz commented Mar 8, 2018

Just to throw in my 2 cents: From what I see none of the arguments against YAML mentioned in the linked discussion apply here (any more). The ruamel.yaml implementation is actively maintained (forget about PyYAML) and does safe parsing by default, i.e. there is no security risk of importing/constructing arbitrary objects by default. You just explicitly add the constructors for the custom types (like cycler) to be supported.

In my personal opinion YAML currently is the most human-readable, human-editable configuration language out there. And having a parser/dumper with round-trip support (like ruamel.yaml) makes it machine-readable/writable too.

@anntzer
Contributor Author

anntzer commented Mar 8, 2018

I guess I'm +0 on using yaml (if it is indeed possible to define custom type loaders), I don't like the language but it's not too bad and most importantly at least it has a spec...

@WeatherGod
Member

WeatherGod commented Mar 8, 2018 via email

@dopplershift
Contributor

Given how it shows up in my conda updates, I think ruamel is used by conda for parsing its config.

I agree somewhat with @WeatherGod here: as a library, we need to tread differently than an application. In theory, the same level of permissions required to modify the config would be used to run the script (barring misconfiguration), so no security risk. On the other hand, we're literally talking about adding a hook to trigger arbitrary code execution, which screams out CVE to me (as irrational as that might be). (I think my feelings on this have changed from the past.)

@anntzer
Contributor Author

anntzer commented Mar 8, 2018

> In theory, the same level of permissions required to modify the config would be used to run the script (barring misconfiguration), so no security risk.

Thanks for pointing that out. At this point I'm genuinely curious what threat model we're trying to guard against. As in, can you present any way in which such a patch (say e.g. matplotlib executes ~/.config/matplotlibrc.py on import, just to pick a concrete case) actually makes things more dangerous?

Note also that you can opt out: just set the MATPLOTLIBRC environment variable to os.devnull (/dev/null) before importing matplotlib. (Well, if that doesn't work, then there should be a patch that makes it work.)
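
As a minimal sketch of the opt-out described above: the environment variable has to be set before the first `import matplotlib` in the process. Whether a given Matplotlib version actually treats `os.devnull` as "load nothing" is, as noted above, an assumption rather than something verified here, so the import itself is left as a comment.

```python
import os

# Point MATPLOTLIBRC at the null device *before* matplotlib is imported,
# so that no user-level rc file is picked up (assumed behaviour, see above).
os.environ["MATPLOTLIBRC"] = os.devnull

# import matplotlib  # the import would come only after the variable is set
```

A downstream package that never wants a user rc file loaded could run the assignment at the very top of its own entry point.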

@dopplershift
Contributor

I'm not a security expert, nor do I want to pretend to be, so I feel really uncomfortable trying to decide what's "safe".

There's also a difference between what's actually safe and what our user community would perceive as safe.

@anntzer
Contributor Author

anntzer commented Mar 8, 2018

I guess it's good that the discussion restarted there, because I got to think about the issue again and am going to renege on my "uninterest in relitigating it" mentioned above.

Let's, again, use a concrete proposal as basis of discussion: at import time, matplotlib tries to import a module named matplotlibrc(.py) from the normal config path (possibly modified by MATPLOTLIBRC), and tries to call the apply_style(rc) function defined there. It is possible to disable this feature (for example) by setting MATPLOTLIBRC to os.devnull. Note that neither eval nor exec appear in this proposal :-)
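
The concrete proposal above could look roughly like the following sketch. It is an illustration under stated assumptions, not code from the PR: `apply_rc_module` is a hypothetical helper, the caller is assumed to have already resolved `path` (for example from `MATPLOTLIBRC`), and a plain `dict` stands in for `rcParams`. It uses only the standard `importlib` machinery.

```python
import importlib.util
import os
from typing import Dict


def apply_rc_module(path: str, rc: Dict[str, object]) -> bool:
    """Import the file at `path` and call its apply_style(rc) hook, if any.

    Returns True only if a hook was found and called.
    """
    # Opt-out: os.devnull (or any non-regular file) disables loading.
    if path == os.devnull or not os.path.isfile(path):
        return False
    spec = importlib.util.spec_from_file_location("matplotlibrc", path)
    if spec is None or spec.loader is None:
        return False
    module = importlib.util.module_from_spec(spec)
    # This runs the module's top-level code, as any import would.
    spec.loader.exec_module(module)
    hook = getattr(module, "apply_style", None)
    if not callable(hook):
        return False
    hook(rc)
    return True
```

Because the module is loaded from an explicit file location and never registered in `sys.modules`, loading it again simply re-reads the file, which sidesteps the "delete the module to re-import it" problem mentioned earlier in the thread.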

The fear appears to be that this module can be modified by an attacker or an oblivious user to execute arbitrary code, e.g. os.system("rm -rf /"), and we don't want to be responsible for that. But wait! An attacker or an oblivious user has a much simpler way to achieve the same thing: they could just write this into six.py and put that in the user's cwd (or ~/.local/lib/python3.6/site-packages), and we do import six (and other dependencies) without any kind of validation (just as nearly all Python packages in the world do).

From this I conclude that we (and nearly all other Python packages that have external dependencies) are already vulnerable to arbitrary code execution (in that model), and can essentially do nothing about it.

Edit: for improved impact, replace six.py by some stdlib module name, of course.
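
The import-resolution point above can be checked without running anything harmful. The sketch below only asks Python which file it would pick for a module name when extra directories sit at the front of `sys.path`; `resolve_origin` is a hypothetical helper, and the lookup does not execute the module it finds.

```python
import importlib
import importlib.util
import sys
from typing import List, Optional


def resolve_origin(module_name: str, extra_dirs: List[str]) -> Optional[str]:
    """Return the file Python would import for module_name, or None.

    extra_dirs are temporarily prepended to sys.path, mimicking the
    precedence of the current directory or a user site-packages directory.
    """
    saved_path = list(sys.path)
    try:
        sys.path[:0] = extra_dirs
        importlib.invalidate_caches()
        # find_spec locates a top-level module without executing it.
        spec = importlib.util.find_spec(module_name)
        return None if spec is None else spec.origin
    finally:
        # Always restore the original search path.
        sys.path[:] = saved_path
```

If two directories both contain a file with the same module name, the one listed first wins, which is why a file dropped into a writable directory early on the path shadows the intended dependency.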

@choldgraf
Contributor

Just my 2 cents, I think this conversation should have input from somebody with experience in security issues who can vet the proposal. Security vulnerabilities are hard to spot and, as suggested in this thread, can be a deal-breaker for some groups. I'd feel more comfortable moving forward on this if a trusted voice said that it was OK from a security standpoint.

@anntzer anntzer mentioned this pull request May 13, 2018
@jklymak jklymak modified the milestones: v3.0, v3.1 Jul 9, 2018
@anntzer anntzer mentioned this pull request Oct 12, 2018
6 tasks
@tacaswell tacaswell modified the milestones: v3.1.0, v3.2.0 Feb 26, 2019
@anntzer anntzer mentioned this pull request Jul 31, 2019
6 tasks
@tacaswell tacaswell modified the milestones: v3.2.0, v3.3.0 Aug 19, 2019
@QuLogic QuLogic modified the milestones: v3.3.0, v3.4.0 Apr 30, 2020
@jklymak jklymak marked this pull request as draft September 12, 2020 19:54
@QuLogic QuLogic modified the milestones: v3.4.0, v3.5.0 Jan 21, 2021
@h-vetinari

> In my personal opinion YAML currently is the most human-readable, human-editable configuration language out there. And having a parser/dumper with round-trip support (like ruamel.yaml) makes it machine-readable/writable too.

Sorry for resurrecting an old issue, but I just saw it milestoned for 3.5.0 and had a look. I was wondering if toml had been considered (it's not part of the list of formats in the alternatives). It basically combines (IMO) the best of both worlds between YAML & JSON, and results in a very readable format (and is getting ever more widespread use through PEP518, cargo, etc.).

Since this allows easily defining arrays etc., and has a fully specified & verifiable grammar, that might also help with having to eval strings that currently contain more complicated expressions.

@anntzer
Contributor Author

anntzer commented May 1, 2021

I don't think PEP518 is a good argument here; PEP518 explicitly chose to introduce a new format rather than using Python literals (https://www.python.org/dev/peps/pep-0518/#python-literals) because they expect build systems to be written in other languages than Python (note that this is the only point against them), whereas we certainly don't expect matplotlibrc being read by anyone other than Matplotlib.
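
For readers unfamiliar with the "Python literals" option mentioned above, here is a minimal sketch of what reading such a config could look like. It is an illustration only and not necessarily what the MEP proposes: `load_literal_rc` is a hypothetical helper, and the file is assumed to contain a single dict literal. `ast.literal_eval` accepts only literals (strings, numbers, tuples, lists, dicts, sets, booleans, `None`) and rejects function calls and name lookups.

```python
import ast
from typing import Dict


def load_literal_rc(source: str) -> Dict[str, object]:
    """Parse a config written as one Python dict literal, without executing code."""
    # literal_eval raises ValueError or SyntaxError for anything
    # that is not a plain literal, e.g. a function call.
    value = ast.literal_eval(source.strip())
    if not isinstance(value, dict):
        raise TypeError("expected a dict literal at the top level")
    return value
```

The trade-off is visible immediately: a value such as a `cycler(...)` call is not a literal, so it would be rejected and would need separate handling.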

@h-vetinari

I'm not saying it's a complete answer (e.g. I haven't understood the cycler requirements beyond the fact that some arrays are incompatible with the current comma-dependent parsing), my core point was mainly: TOML > YAML

@tacaswell tacaswell modified the milestones: v3.5.0, v3.6.0 Aug 5, 2021
@timhoffm timhoffm modified the milestones: v3.6.0, unassigned Apr 30, 2022
@story645 story645 modified the milestones: unassigned, needs sorting Oct 6, 2022
@github-actions

Since this Pull Request has not been updated in 60 days, it has been marked "inactive." This does not mean that it will be closed, though it may be moved to a "Draft" state. This helps maintainers prioritize their reviewing efforts. You can pick the PR back up anytime - please ping us if you need a review or guidance to move the PR forward! If you do not plan on continuing the work, please let us know so that we can either find someone to take the PR over, or close it.

@github-actions github-actions bot added the status: inactive Marked by the “Stale” Github Action label Apr 21, 2023
@anntzer
Contributor Author

anntzer commented Apr 21, 2023

I think the idea is well-advertised now, and whether the issue is open or closed won't change the discussion much.

@anntzer anntzer closed this Apr 21, 2023
@timhoffm
Member

As a remark, strictYAML may be an improvement over the very general and complex YAML standard. https://github.com/crdoconnor/strictyaml

Labels
status: inactive Marked by the “Stale” Github Action topic: rcparams