| 1 | +MEP32: ``matplotlibrc``/``mplstyle`` with Python syntax |
| 2 | +======================================================= |
| 3 | + |
| 4 | +.. contents:: :local: |
| 5 | + |
| 6 | +Status |
| 7 | +------ |
| 8 | + |
| 9 | +Discussion |
| 10 | + |
| 11 | +Branches and Pull Requests |
| 12 | +-------------------------- |
| 13 | + |
| 14 | +None |
| 15 | + |
| 16 | +Abstract |
| 17 | +-------- |
| 18 | + |
| 19 | +I propose to replace ``matplotlibrc`` and ``mplstyle`` files (henceforth |
| 20 | +"old-style configs") by files using regular Python syntax (henceforth |
| 21 | +"new-style configs"). This will fix a number of issues that currently exist |
| 22 | +with parsing old-style configs. |
| 23 | + |
| 24 | +Detailed description |
| 25 | +-------------------- |
| 26 | + |
| 27 | +The problem |
| 28 | +~~~~~~~~~~~ |
| 29 | + |
| 30 | +Current ("old-style") configuration files use a custom syntax, of the form |
| 31 | + |
| 32 | +.. code:: conf |
| 33 | + |
| 34 | + key: value # possible comment |
| 35 | + key: value |
| 36 | + |
| 37 | +For each allowable key, a specific parser is defined in ``mpl.rcsetup``, |
| 38 | +possibly via some helper functions. Typical parsers parse booleans, numbers, |
| 39 | +strings (always unquoted), enumerated strings, comma-separated lists of the |
| 40 | +above-mentioned types, etc. |
| 41 | + |
| 42 | +However, some desirable inputs are difficult to parse, or currently only |
| 43 | +partially parsed: |
| 44 | + |
| 45 | +- Property cycles (e.g., ``axes.prop_cycle``), of the form :: |
| 46 | + |
| 47 | + cycler("key", [value1, value2, ...]) |
| 48 | + |
| 49 | + are currently simply parsed via ``eval`` in a restricted environment |
| 50 | + (`#6274`_), which may (or may not) be a security hole (especially combined |
| 51 | + with the ability to load style files from a URL). |
| 52 | + |
| 53 | +- Path effects (e.g., for implementing an XKCD style sheet, `#6157`_). They |
| 54 | + would have a form similar to :: |
| 55 | + |
| 56 | + patheffects.withStroke(linewidth=4, foreground="w") |
| 57 | + |
| 58 | + and would, like ``cycler``\s, require either a custom parser or being |
| 59 | + ``eval``'d. |
| 60 | + |
| 61 | +- Long inputs (e.g. property cycles) cannot be split over multiple lines, as |
| 62 | + the parser has no support for line continuations (`#9184`_). |
| 63 | + |
| 64 | +- LaTeX and PGF preambles (multiple, comma separated strings) cannot contain |
| 65 | + commas, because commas are replaced by newlines by the parser (`#4371`_). |
| 66 | + Note that commas are in fact fairly common in "normal" LaTeX preambles (e.g., |
| 67 | + ``\usepackage[option1, option2]{package}``). Actually inputting the preamble |
| 68 | + over multiple lines is not possible due to the lack of multiline support (see |
| 69 | + above). |
| 70 | + |
| 71 | +- Strings cannot contain the hash (``#``) symbol, as strings are unquoted and |
| 72 | + the hash is unconditionally interpreted as the start of a comment (`#7089`_). |
| 73 | + The hash symbol had been proposed to indicate a "plus" marker. |
| 74 | + |
| 75 | +- Dash specs of the form ``(offset, (ink-on, ink-off, ink-on, ...))`` are |
| 76 | + misparsed: while :: |
| 77 | + |
| 78 | + plt.plot([1, 2], ls=(0, (5, 5))) |
| 79 | + |
| 80 | + works just fine, :: |
| 81 | + |
| 82 | + plt.rcParams["lines.linestyle"] = (0, (5, 5)) |
| 83 | + |
| 84 | + and :: |
| 85 | + |
| 86 | + plt.rcParams["lines.linestyle"] = "(0, (5, 5))" |
| 87 | + |
| 88 | + as well as setting this value in ``matplotlibrc`` all raise an exception |
| 89 | + (indirectly a cause of `#7219`_). |
| 90 | + |
| 91 | +- Custom color palettes (redefining the meaning of ``"r"``, ``"g"``, ``"b"``, |
| 92 | + etc. as seaborn used to do) have been proposed (`#8430`_), but the |
| 93 | + hypothetical rcParam value would have a type of ``Dict[str, Dict[str, str]]`` |
| 94 | + (mapping palettes to mappings of color names to color values), which was |
| 95 | + described by @tacaswell as "not too hard to parse" but "would further stress |
| 96 | + our current configuration system". |
| 97 | + |
| 98 | +Overall, the syntax of the config file is defined as "whatever the parser |
| 99 | +accepts" (`#3670`_). |
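Two of the failure modes listed above (the hash and the comma) can be reproduced with a deliberately simplified model of a line-based parser. This is only an illustrative sketch: ``parse_old_style_line`` and ``parse_preamble`` are hypothetical names, not functions from ``mpl.rcsetup``, and the real parsers are more involved.

```python
def parse_old_style_line(line):
    """Parse one ``key: value  # comment`` line the way a naive parser might."""
    # Everything after the first hash is dropped, whether or not the hash was
    # meant to be part of the value.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    key, _, value = line.partition(":")
    return key.strip(), value.strip()


def parse_preamble(value):
    """Split a comma-separated preamble into individual entries."""
    return [part.strip() for part in value.split(",")]


# The value is lost entirely because it starts with a hash.
hash_result = parse_old_style_line("some.key: #value  # a comment")

# The option list of a single \usepackage is cut at its internal comma.
preamble_result = parse_preamble(r"\usepackage[option1, option2]{package}")
```

Because values are unquoted, a parser of this shape has no way to tell a hash or comma that belongs to the value from one that acts as a delimiter.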
| 100 | + |
| 101 | +An additional feature that has been requested (but should not be particularly |
| 102 | +difficult to implement using the current machinery) is "cascading" style |
| 103 | +sheets, either by adding a ``style`` key to style files (`#4240`_) or by |
| 104 | +loading all ``matplotlibrc``\s in order (`#6320`_). |
| 105 | + |
| 106 | +Proposed solution |
| 107 | +~~~~~~~~~~~~~~~~~ |
| 108 | + |
| 109 | +Instead of playing whack-a-mole with parser bugs, I propose to replace the |
| 110 | +syntax of config files with plain Python syntax. Two main possibilities are |
| 111 | +considered below. In all cases, it is necessary to encode, in one way or |
| 112 | +another, that the config file uses the new syntax, so that Matplotlib can tell |
| 113 | +which file parser to use. |
| 114 | + |
| 115 | +Maintain a matplotlibrc-like layout |
| 116 | +``````````````````````````````````` |
| 117 | + |
| 118 | +The config files would maintain the format |
| 119 | + |
| 120 | +.. code:: conf |
| 121 | + |
| 122 | + key: value # possible comment |
| 123 | + key: value |
| 124 | + |
| 125 | +but all values would simply be parsed by passing to ``eval`` in the same |
| 126 | +restricted environment as for cyclers. Further validation of the inputs should |
| 127 | +try to reuse whatever validation code Matplotlib already uses to validate |
| 128 | +the same input when passed to an actual artist's property setter (e.g., |
| 129 | +validating a linestyle should call the same helper validator function as |
| 130 | +``Line2D.set_linestyle``). |
| 131 | + |
| 132 | +- The fact that a config file uses the new syntax could be indicated by some |
| 133 | + "magic string" (e.g. ``# matplotlibrc-syntax-version: 2``), or a different |
| 134 | + naming convention. |
| 135 | + |
| 136 | +- Parser handling for line-continuations would still need to be implemented. A |
| 137 | + relatively simple possibility would be to support backslash continuations |
| 138 | + (lack of support for implicit continuations based on parentheses could be |
| 139 | + somewhat surprising to a user inputting Python syntax, though). |
| 140 | + |
| 141 | +- From a security point of view, this is exactly as secure as the current |
| 142 | + situation (whatever one can pass to ``eval`` with this syntax, one could |
| 143 | + already do it by passing it as value for the ``axes.prop_cycle`` key). |
| 144 | + |
| 145 | +- Support for ``patheffects`` would require adding more entries into the |
| 146 | + restricted environment. |
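The ``key: value`` variant could be sketched roughly as follows. The function name ``load_new_style_lines`` and its ``env`` argument are assumptions made for illustration. Line continuations and validation are left out, and emptying ``__builtins__`` is not a real sandbox.

```python
def load_new_style_lines(text, env):
    """Evaluate each ``key: value`` line of *text* as a Python expression."""
    # Empty builtins plus an explicit set of allowed names in *env*. This
    # mirrors the idea of a restricted environment; it is not a security
    # boundary.
    restricted_globals = {"__builtins__": {}}
    settings = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line, comment, or the "magic string" header
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"line {lineno}: expected 'key: value'")
        # eval() skips a trailing Python comment, and a hash or comma inside
        # a quoted string stays part of the value.
        settings[key.strip()] = eval(value.strip(), restricted_globals, dict(env))
    return settings


example = r'''
# matplotlibrc-syntax-version: 2
lines.linestyle: (0, (5, 5))
lines.marker: "#"  # a quoted hash is data, not a comment
pgf.preamble: [r"\usepackage[option1, option2]{package}"]
'''

settings = load_new_style_lines(example, env={})
```

Names such as ``cycler`` or ``patheffects`` would have to be passed in through ``env`` before values that use them could be evaluated.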
| 147 | + |
| 148 | +Full Python syntax |
| 149 | +`````````````````` |
| 150 | + |
| 151 | +The config files would simply be Python source files, of the form :: |
| 152 | + |
| 153 | + from matplotlib import rcParams |
| 154 | + rcParams["key"] = value # possible comment |
| 155 | + rcParams["key"] = value |
| 156 | + |
| 157 | +or :: |
| 158 | + |
| 159 | + from matplotlib import rcParams |
| 160 | + rcParams.update( |
| 161 | + {"key": value, # possible comment |
| 162 | + "key": value} |
| 163 | + ) |
| 164 | + |
| 165 | +The files (with a ``.py`` extension, thus immediately distinguishable from |
| 166 | +old-style configs) would be either |
| 167 | + |
| 168 | +- option 1: ``exec``'d in a completely standard context (empty globals, all |
| 169 | + builtins available). A few variables (``rcParams``, ``cycler``, etc.) could |
| 170 | + be preloaded into the globals, but I would prefer not (`#8235`_; see also |
| 171 | + `here <explicit-imports_>`_). |
| 172 | + |
| 173 | +- option 2: Imported (operating by side-effect of the import), and then |
| 174 | + immediately removed from ``sys.modules`` so that reloading works; the config |
| 175 | + loader code would be in charge of locally patching ``sys.path`` to make the |
| 176 | + config files visible to the import system. |
| 177 | + |
| 178 | +In either case, cascading style sheets can be implemented by having a config |
| 179 | +file ``exec`` or ``import`` (depending on the option chosen) itself another |
| 180 | +config file. |
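A minimal sketch of option 1 with cascading might look like this. To keep it runnable without Matplotlib, ``rcParams`` is a plain ``dict`` stand-in and is preloaded into the globals together with a hypothetical ``exec_config`` helper. The proposal itself prefers explicit imports inside the config file over such preloading.

```python
import tempfile
from pathlib import Path

rcParams = {}  # stand-in for matplotlib.rcParams (assumption for this sketch)


def exec_config(path):
    """Execute the config file at *path*; all builtins remain available."""
    source = Path(path).read_text()
    # Preloaded names are a shortcut for this sketch only.
    namespace = {
        "rcParams": rcParams,
        "exec_config": exec_config,
        "__file__": str(path),
    }
    # Compiling with the file name gives useful tracebacks on errors.
    exec(compile(source, str(path), "exec"), namespace)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp, "base.py")
    base.write_text(
        'rcParams["lines.linewidth"] = 2\n'
        'rcParams["font.size"] = 10\n'
    )
    derived = Path(tmp, "derived.py")
    # The derived style first runs the base style, then overrides one key.
    derived.write_text(
        "from pathlib import Path\n"
        'exec_config(Path(__file__).with_name("base.py"))\n'
        'rcParams["font.size"] = 14\n'
    )
    exec_config(derived)
```

After ``exec_config(derived)`` returns, ``rcParams`` holds the line width from the base file and the font size from the derived one.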
| 181 | + |
| 182 | +It would remain possible to disallow (accidental) modification of certain |
| 183 | +rcParams from style files by locally patching ``RcParams.__setitem__`` in |
| 184 | +``style.use``. However, the style files would be able to execute arbitrary |
| 185 | +code (this is a *feature* of this proposal). |
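The local patching of ``__setitem__`` could be sketched as below. The ``RcParams`` class here is a bare ``dict`` subclass standing in for Matplotlib's class, and ``block_keys`` is a hypothetical helper. A real implementation would probably warn rather than skip silently.

```python
from contextlib import contextmanager


class RcParams(dict):
    """Minimal stand-in for Matplotlib's ``RcParams`` mapping."""


@contextmanager
def block_keys(blocked):
    """Temporarily make ``RcParams.__setitem__`` ignore the keys in *blocked*."""
    original = RcParams.__setitem__

    def guarded_setitem(self, key, value):
        if key in blocked:
            return  # skip silently; a real implementation might warn
        original(self, key, value)

    RcParams.__setitem__ = guarded_setitem
    try:
        yield
    finally:
        # Always restore the original method, even if the style file raised.
        RcParams.__setitem__ = original


rc = RcParams()
style_source = 'rcParams["backend"] = "agg"\nrcParams["lines.linewidth"] = 3\n'
with block_keys({"backend"}):
    exec(style_source, {"rcParams": rc})
```

In this stand-in, ``dict.update`` does not go through ``__setitem__``, so a style file calling ``rcParams.update(...)`` would bypass the guard. An actual implementation would need to cover that path as well.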
| 186 | + |
| 187 | +As above, validation should share as much code as possible with the actual artist |
| 188 | +property setters. |
| 189 | + |
| 190 | +No parser would need to be written at all -- it's done for us by Python! |
| 191 | + |
| 192 | +Direct loading from a URL would be disabled, as it is inherently insecure. |
| 193 | +The documentation would encourage manual downloading (... or could even |
| 194 | +document how to do it using ``urllib`` if we really want to) of style sheets, |
| 195 | +which I believe is a good enough replacement (but I am happy to hear arguments |
| 196 | +that it is not). |
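If the documentation did show a manual download with ``urllib``, it could be as short as the following sketch. ``download_style`` is a hypothetical helper name. The point is that the file lands on disk, where it can be read before it is ever executed.

```python
import urllib.request
from pathlib import Path


def download_style(url, destination):
    """Fetch *url* and save it to *destination* for inspection before use."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    destination = Path(destination)
    destination.write_bytes(data)
    return destination
```

The user would then open the saved file, check its contents, and only afterwards load it as a local style.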
| 197 | + |
| 198 | +Implementation |
| 199 | +-------------- |
| 200 | + |
| 201 | +The general implementation strategy is outlined in the proposed solutions. |
| 202 | +Neither strategy appears to present large technical difficulties. Actual work |
| 203 | +will be based on the agreed-upon syntax. |
| 204 | + |
| 205 | +Backward compatibility |
| 206 | +---------------------- |
| 207 | + |
| 208 | +New-style configs use a different code path, so old-style config support can |
| 209 | +remain in order to maintain full backward compatibility. Deprecating support |
| 210 | +for old-style configs can be discussed and decided upon at a later time (or |
| 211 | +never done). |
| 212 | + |
| 213 | +Alternatives |
| 214 | +------------ |
| 215 | + |
| 216 | +- Proposal: Fix the current issues with the parsers and implement custom |
| 217 | + parsers for the additional kinds of values we want to support. |
| 218 | + |
| 219 | + Issues: Is it really worth maintaining a large corpus of custom parsers for |
| 220 | + a custom-designed language that is essentially used only by us? |
| 221 | + |
| 222 | +- Proposal: Switch to another configuration language (JSON, YAML, etc.). |
| 223 | + |
| 224 | + Issues: It remains necessary to be able to encode certain specific Python |
| 225 | + objects (certainly cyclers, possibly path effects), which means that they |
| 226 | + will need to be ``eval``'d (in which case I fail to see the advantage |
| 227 | + over using Python throughout), or that custom syntax (compatible with the |
| 228 | + underlying configuration language!) will need to be invented and custom |
| 229 | + parsers maintained. Additionally, JSON does not support comments, and YAML |
| 230 | + is an extremely (overly, in my opinion) complex language. See also the |
| 231 | + discussion that took place over PEP518_ (not that I particularly like the |
| 232 | + final choice of yet another obscure configuration language by that PEP). |
| 233 | + |
| 234 | +.. _#3670: https://github.com/matplotlib/matplotlib/issues/3670 |
| 235 | +.. _#4240: https://github.com/matplotlib/matplotlib/issues/4240 |
| 236 | +.. _#4371: https://github.com/matplotlib/matplotlib/issues/4371 |
| 237 | +.. _#6157: https://github.com/matplotlib/matplotlib/issues/6157 |
| 238 | +.. _#6274: https://github.com/matplotlib/matplotlib/issues/6274 |
| 239 | +.. _#6320: https://github.com/matplotlib/matplotlib/issues/6320 |
| 240 | +.. _#7089: https://github.com/matplotlib/matplotlib/issues/7089 |
| 241 | +.. _#7219: https://github.com/matplotlib/matplotlib/issues/7219 |
| 242 | +.. _#8235: https://github.com/matplotlib/matplotlib/issues/8235 |
| 243 | +.. _#8430: https://github.com/matplotlib/matplotlib/issues/8430 |
| 244 | +.. _#9184: https://github.com/matplotlib/matplotlib/issues/9184 |
| 245 | +.. _PEP518: https://www.python.org/dev/peps/pep-0518/#other-file-formats |
| 246 | +.. _explicit-imports: https://www.reddit.com/r/Python/comments/ex54j/seeking_clarification_on_pylonsturbogearspyramid/c1bo1v5/ |