Skip to content

[Bug]: Markers with string-type integer values are converted to integers in cycler and matplotlibrc #29646

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
conradhaupt opened this issue Feb 19, 2025 · 6 comments

Comments

@conradhaupt
Copy link

conradhaupt commented Feb 19, 2025

Bug summary

When creating a cycler with a "marker" property value contains strings that could be converted to integers, they are always converted to integers, even if the string value is valid and the intended value. This is most obvious with the list ["1", "2"]; which is distinctly different from [1, 2]. However, the latter is always used. Edit (20.02.2025): This is also the case for "lines.markers", "boxplot.flierprops.marker", "boxplot.meanprops.marker", and "scatter.marker".

Relevant functions are _validate_markers (here) and _listify_validator (here). It seems like this issue was introduced by #27613, which solved a similar issue reported in #27596.

Code for reproduction

Edited 20.02.2025

from matplotlib import cycler
import matplotlib.pyplot as plt

markers = ["1", "2", "o", 4]
print("Cycler")
test_cycler = cycler("marker", markers)
for prop_dict in test_cycler:
    print(
        "Marker {} with type {}.".format(prop_dict["marker"], type(prop_dict["marker"]))
    )
print("lines.marker")
for marker in markers:
    plt.rcParams["lines.marker"] = marker
    print(
        "Marker {} with type {}.".format(
            plt.rcParams["lines.marker"], type(plt.rcParams["lines.marker"])
        )
    )

Actual outcome

$ python main.py
Cycler
Marker 1 with type <class 'int'>.
Marker 2 with type <class 'int'>.
Marker o with type <class 'str'>.
Marker 4 with type <class 'int'>.
lines.marker
Marker 1 with type <class 'int'>.
Marker 2 with type <class 'int'>.
Marker o with type <class 'str'>.
Marker 4 with type <class 'int'>.

which indicates that the string entries "1" and "2" are converted into integers. As the string "o" cannot be converted into an integer with int("o"), it is used explicitly.

Expected outcome

$ python main.py
Cycler
Marker 1 with type <class 'str'>.
Marker 2 with type <class 'str'>.
Marker o with type <class 'str'>.
Marker 4 with type <class 'int'>.
lines.marker
Marker 1 with type <class 'str'>.
Marker 2 with type <class 'str'>.
Marker o with type <class 'str'>.
Marker 4 with type <class 'int'>.

This output would indicate that strings containing integers are parsed correctly.

Additional information

This bug was not present prior to #27613, but it did solve a similar bug.

Looking at the code which causes this bug, _validate_markers (here) and _listify_validator (here), it's not clear to me why the validation is done this way. I.e., to validate types, values are explicitly converted to see if a ValueError is raised, instead of testing the type explicitly. Is this the intended method for validation?

I have a draft solution, but I would like clarification on the intended validation mechanisms before I submit the code for review. Edit (20.02.2025): Since finding that other RcParams entries are affected by this bug, my solution is no longer applicable.

Operating system

Fedora 41

Matplotlib Version

3.10.0

Matplotlib Backend

No response

Python version

Python 3.12.9

Jupyter version

No response

Installation

pip

@tacaswell
Copy link
Member

The source of this is likely that this code is also used to validate reading matplotlibrc which always comes to us as strings. The impacts there need to be understand and balanced with any fix to this.

@timhoffm
Copy link
Member

Yes, the validators accept two kinds of inputs: (1) a python object suitable for the context, (2) a string that can be evaluated to such a python object. The latter is needed because matplotlibrc is plain text and we have to parse this. This is quite ugly but works well enough for simple cases.
Markers complicate the situation in that they can both be ints 1 and strings that contain an int "1". I think it is possible to separate the cases - maybe we cannot use the listify approach for that.

Anyway, a solution should make the following modification to the tests pass.

  • The first swiches from cycler.cycler() to matplotlib.cycler, which is the validated version.
  • The second ensures that matplotlibrc is parsed correctly.
diff --git a/lib/matplotlib/tests/test_cycles.py b/lib/matplotlib/tests/test_cycles.py
index 4fa2616194..587d4f0509 100644
--- a/lib/matplotlib/tests/test_cycles.py
+++ b/lib/matplotlib/tests/test_cycles.py
@@ -29,7 +29,10 @@ def test_marker_cycle():
 
 def test_valid_marker_cycles():
     fig, ax = plt.subplots()
-    ax.set_prop_cycle(cycler(marker=[1, "+", ".", 4]))
+    ax.set_prop_cycle(mpl.cycler(marker=["1", "+", ".", 4]))
+    for _ in range(4):
+        ax.plot(range(10), range(10))
+    assert [l.get_marker() for l in ax.lines] == ["1", "+", ".", 4]
 
 
 def test_marker_cycle_kwargs_arrays_iterators():
diff --git a/lib/matplotlib/tests/test_rcparams.py b/lib/matplotlib/tests/test_rcparams.py
index 096c5cef6c..44be6a9929 100644
--- a/lib/matplotlib/tests/test_rcparams.py
+++ b/lib/matplotlib/tests/test_rcparams.py
@@ -268,6 +268,8 @@ def generate_validator_testcases(valid):
                      ("cycler('c', 'rgb') * cycler('linestyle', ['-', '--'])",
                       (cycler('color', 'rgb') *
                        cycler('linestyle', ['-', '--']))),
+                     ('cycler(marker=["1", 2])',
+                      cycler(marker=["1", 2])),
                      (cycler('ls', ['-', '--']),
                       cycler('linestyle', ['-', '--'])),
                      (cycler(mew=[2, 5]),

That said, a workaround for you would be to use the unvalidated cycler.cycler() (or pyplot.cycler(), which is currently identical #26868).

@anntzer
Copy link
Contributor

anntzer commented Feb 19, 2025

Prior to #27613 (mpl 3.8.3/3.9.0), it was not possible to specify integer markers (TICKLEFT=1, etc.) in the matplotlibrc file, which uses strings only (or in a cycler, #27596). That PR "fixed" the issue by instead interpreting the string "1" as an integer, but the consequence is that it has now become impossible, instead, to specify string markers whose string shorthand is an integer (this issue).

Given that it is anyways confusing to have two different markers called 1 (tri_down) and "1" (tickleft), that neither of the names is particularly intuitive anyways (IMO...), and that I suspect they are rarely used in practice, I would suggest reverting #27596 and at least deprecating the integer markers and replace them by their full names (i.e. one would have to write plot(..., marker="tickright") instead of plot(..., marker=1)). (More likely, first introduce support for full names, then deprecate the ints in the following release.)

Perhaps we could also deprecate the int-as-str markers (plot(..., marker="1") becoming plot(..., marker="tri_down")) but at least in this case one could have been using shorthand notations such as plot(x, y, "1") so it's maybe reasonable to keep them...

@timhoffm
Copy link
Member

timhoffm commented Feb 20, 2025

Introducing full names for the markers is a good idea. That makes marker definition unambiguous in every situation. We can separately decide whether we want to deprecate the current forms.

I would not revert anything. AFAICS the before/after of #27613 are just the two choices that a validator can make - convert "2" to int or not. Neither seems better than the other. I therefore would leave it like this. Though I suspect it should be possible to fix this by jumping some extra hoops

  • in matplotlibrc (all input is string) bare numbers are converted to integers, quoted numbers become strings.
  • in cycler validation we should not convert string to int.
    IMHO the right approach would be to separate conversion from validation. Cycler should only use validation. Although we could discuss whether cycler needs to do validation at all. Using an unvalidated cycler would just blow up later on usage of an inappropriate value.

And also, we should move away from plain-text matplotlibrc. But that's a separate topic.

@anntzer
Copy link
Contributor

anntzer commented Feb 20, 2025

just the two choices that a validator can make - convert "2" to int or not. Neither seems better than the other.

The point is that I believe that one of the choices (don't convert) is better than the other, for more than one reason:

  • String markers are clearly designed to be markers: "1"..."4" are "tripods" with various orientations and "8" is an octagon, positioned at the "natural" center of the shape, so they indeed mark a specific position; int markers are clearly helpers (and indeed used that way in the library) for drawing axis ticks (tickdown, etc.) and arrows at the end of "half-errorbars" using lolims/uplims (caretdownbase, etc.), if used standalone they don't even clearly mark a point because the actual underlying point is actually at one end of the marker (e.g. the relationship between tickdown and the actual position marked on the axis).
  • Supporting string markers is consistent with supporting shorthands in plot() (at least plot(x, y, "8") can reasonably be remembered, if you've seen it before). Naturally it's not the same implementation as in matplotlibrc, but from a consistency POV it would be weird if "8" in matplotlibrc and "8" in plot marker shorthand had different meanings.

@conradhaupt
Copy link
Author

I appreciate all of the input, especially the patch for tests from @timhoffm.1

In fact, the reason I found out about this bug was because my mplstyle file was being processed incorrectly. I had traced the issue to validation in matplotlib.rcsetup, though it could be present elsewhere. I believe the core problem is that validation functions are relied upon to coerce values between types. Looking at other validators in matplotlib.rcsetup, most do type conversions as they use _make_type_validator which "validates" by converting the value to the intended type.2

Note that this bug actually affects the following four additional rcParams entries, where _validate_marker is used, and the original cycler bug for axes.prop_cycle:

  • "lines.marker"
  • "boxplot.flierprops.marker"
  • "boxplot.meanprops.marker"
  • "scatter.marker"
  • The evaluation of cycler("marker", ...) in rcParams and explicit Python code. This uses _validate_markerlist, which wraps _validate_marker.

These rcParams entries all behave differently pre- and post-#27613, as they all use _validate_marker.

I agree with @timhoffm, that it would be better to "[...] separate conversion from validation". However, that may require some re-architecting of how matplotlibrc is loaded and parsed. For example, double-quoting would be a nice way of indicating a string but double-quotes are removed in _rc_params_in_file [1] and their addition may require changing the syntax of maplotlibrc/mplstyle files.

A few notes:

  1. cycler(), when loaded from matplotlibrc or an mplstyle file, is handled with validate_cycler, where eval is used to process string values. Therefore, markers passed to cycler() are handled correctly up-until _validate_marker. Integers used to be rejected by _validate_marker, now they are kept but string-integers are converted to integers .
  2. The other rcParams entries that use _validate_marker, e.g., "lines.marker", do not use eval(). This means their values are always passed to _validate_marker as strings. Prior to Fix marker validator with cycler (allow mix of classes)  #27613 they were always treated as strings, now string-integers are converted to ints. Given _rc_params_in_file removes quotes, I have been unable to modify my matplotlibrc file to coerce string-integers into string-integers for these RcParams entries.
  3. We could allow for str() and int() functions in eval()[2], allowing us to write something akin to axes.prop_cycle: (cycler("marker": [".", str("1"), int(2)]),). Then, we would implement proper type checking in _validate_marker. However, then we wouldn't handle the other RcParams entries properly, which do not use eval(). These would always pass strings to _validate_marker.

It would be great if the following matplotlibrc/mplstyle file was possible.

lines.marker: 2 # int, TICKUP
scatter.marker: '1' # str, tri_down
axes.prop_cycle: (cycler("marker", [".", "1", 2]),) # Cycles through markers point, tri_down, and TICKUP

Footnotes

  1. I've added some additional tests, for the new RcParams entries that use _validate_marker, in my fork here.

  2. This is useful if we provide a value that can be coerced into the valid type.

@conradhaupt conradhaupt changed the title [Bug]: Cycler converts string-type integers to integers, even when values are valid [Bug]: Markers with string-type integer values are converted to integers in cycler and matplotlibrc Feb 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants