-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
FIX: Remove type checking for strings in '_validate_linestyle' #8165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: Remove type checking for strings in '_validate_linestyle' #8165
Conversation
lib/matplotlib/rcsetup.py
Outdated
@@ -926,7 +930,7 @@ def _validate_linestyle(ls): | |||
# (called inside the instance of validate_nseq_float). | |||
pass | |||
|
|||
raise ValueError("linestyle must be a string or " + | |||
raise ValueError("linestyle must be a valid string or " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the "+" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do that. Edit: done in 278913d.
Well Travis is really angry at me for the |
Ok, it looks like Python 2 (, which I am working with) is less strict about string-like object comparisons: # Python 2.7.12 |Continuum Analytics, Inc.| (default, Jul 2 2016, 17:42:40)
In [1]: b'dotted' in [u'dotted']
Out[1]: True
In [2]: 'dotted'.encode('ascii') in [u'dotted']
Out[2]: True than Python 3: # Python 3.6.0 |Continuum Analytics, Inc.| (default, Dec 23 2016, 12:22:00)
In [1]: b'dotted' in ['dotted']
Out[1]: False
In [2]: b'dotted'.decode() in ['dotted']
Out[2]: True
In [3]: 'dotted'.encode('ascii').decode() in ['dotted']
Out[3]: True The commit 973dd5e should fix the previous failures, by using the |
Hum, Travis seems a bit happier with Python 3, but I just tried locally What is the usual way to deal with this kind of encoding problem? PS: I'll take take of the PEP8 issue, do not worry. |
Ok, I had a look at how try:
return _validate_named_linestyle("{:s}".format(ls))
except KeyError:
# ls is not a valid name of line style.
pass should avoid raising a try:
return _validate_named_linestyle(six.text_type(ls))
except KeyError:
# ls is not a valid name of line style.
pass may be more consistent with the other string checking done in |
Removing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove attempts to decode.
@@ -342,6 +342,8 @@ def generate_validator_testcases(valid): | |||
('', ''), (' ', ' '), | |||
('None', 'none'), ('none', 'none'), | |||
('DoTtEd', 'dotted'), | |||
(b'dotted', 'dotted'), # binary string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should only pass on python2 (where str and bytes are conflated). In python 3 this is correctly failing asb'dotted' != 'dotted'
because to do the comparison you would have to either decode the bytes (which we can't do because we don't know the encoding) or encode the string into the same encoding as the bytes (which we can't do because we don't know the encoding).
I suggest putting this in another test with a skip if python3 mark?
Just pass the user input through, in 3, users passing bytes in is wrong and should fail, on 2 it happens to work due to the warts of string v unicode on python2. |
Ok thanks for the comment @tacaswell . I'll do this in a moment. Note to myself: go back to before the decode commit, and play with |
973dd5e
to
9ae7265
Compare
Rebased! I squashed some similar commits and dropped the former one that attempted to decode the arguments. @tacaswell No more attempt to decode the arguments, and I modified the tests to take care of the difference between Python 2 and Python 3. TBH, I struggled a bit to find a clean way to do the latter: I hope the style of the new version of the tests is fine. On my local computer with Python 2, the tests are passing (and the doc is building): hopefully Travis with Python 3 will agree. |
The new approach may not play nicely (as in, cause a warning) due to the same issue as #7658? I haven't checked, but you may want to have a look. |
@anntzer Thanks, I'll have a look this weekend. |
Based on @anntzer's remark, it appears that indeed np.array([1, 2]) in ['dotted', 'solid'] raises a With 97234b5, I reintroduce some early type checking to avoid any comparison of a Numpy array with strings. I agree it is kind of a nuke for a problem that does not even currently exist, but some (super) thin reasons to support this may be:
|
@tacaswell Ping if you want to have a look when Travis will have run (TL; DR: @anntzer I hope you will notice the effort of not using |
lib/matplotlib/rcsetup.py
Outdated
if isinstance(ls, six.string_types): | ||
try: | ||
return _validate_named_linestyle(ls) | ||
except (KeyError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no parentheses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed… 844115f takes care of it.
@afvincent Congratulations, you've earned un bon point. (Dix points = une image. Dropping Python2 support = many many points :-)) |
@anntzer You know, I had a few teachers that were doing so in primary school 😄. Abouy dropping Python2, well… Let's say that I made a “mistake” when I started to learn Python at the beginning of my Ph.D. But I promise, I'll switch for my next big research project (2020 is coming close anyway…). We'll see if the bounty points lasts until then ;). If it doesn't, at least I'll have access to all the wonderful things like @, f-strings & Co! |
You know, I grew up in France... (in fact, pretty close to where you did your PhD...) |
Rahh, Travis does not seem to be happy (@anntzer you may want to take your bon point back). I think I understand why (if I am right, I was lucky about exceptions catching before…): I'll push a commit when I will have a fix. |
844115f
to
d4bd0cf
Compare
Rebased (and squashed) with a new version! The former spirit is still there but now the exception catching should be more robust, both on Python 2 (a local pytest runs smoothly with For the record, previously on both Python versions, the exception handling was flaky or error-prone. For example on Python 2, a |
Ok, now Travis stays calm :)! Ping to @tacaswell if you want to have a look, now that Travis is peacefull. And @anntzer because there have been quite a few changes since you had a look yesterday (sorry for that 🐑 ). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor fix.
lib/matplotlib/rcsetup.py
Outdated
# 'solid'.encode('utf-16'), may raise a unicode error. | ||
raise ValueError("the linestyle string is not a valid string.") | ||
|
||
if hasattr(ls, 'decode'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isinstance(ls, bytes)
seems to be more explicit? (in the Py2 case this will already have been covered by the case above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I guess it should be isinstance(ls, (bytes, bytearray))
if one want to avoid jokes (and keep an identical behavior). I'll try to test that during the day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's memoryview
on Py3, and it doesn't have a decode
attribute. I think we should just not worry about that case (people can always pass in objects with arbitrarily messed up __len__
/__iter__
/...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, on Python 3.6, I get
In [16]: hello = bytearray("Hello".encode('utf-16'))
In [17]: hasattr(hello, 'decode')
Out[17]: True
so this looks like bytearray instances do have a decode method, don't they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops sorry, was confusing with buffer/memoryview. Ignore what I said :-)
lib/matplotlib/rcsetup.py
Outdated
except (UnicodeDecodeError, KeyError): | ||
# On Python 2, string-like *ls*, like for example | ||
# 'solid'.encode('utf-16'), may raise a unicode error. | ||
raise ValueError("the linestyle string is not a valid string.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For debugging purpose of our users, could you add here what the value passed is?
raise ValueError("the linestyle string %r is not a valid string." % ls)
should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure @anntzer will support the idea of using old string formatting ^^. Joke apart, the idea to return argument that was passed seems OK to me: I'll do that during the day too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait doesn't everyone use f"the linestyle string {ls} is not a valid string."
these days? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
however you prefer the string formatting to be :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we let master branch go 3-only we probably can target 3.6+ if we go back to supporting the 'last 2' python minor versions and are targeting a 3 only release for 2018-07 (py3.7 is scheduled for 2018-06-15).....
d4bd0cf
to
ed04d93
Compare
Re-rebased: squashed to 2 single commits and reordered them (1. code changes, 2. update of the related test). The only difference with the previous commits should be:
|
This looks good to me. |
Thanks! |
Thanks to all who reviewed the PR :). It was a windy journey, but an instructive one from my side ^^. |
You have hit on both why the bytes / unicode changes were made in python3 and why the transition has been a bit rough 😈 . |
There does not appear to be a |
@QuLogic Indeed, backporting does not seem required. #8040 (which is fixed by this PR) was not backported but milestoned for 2.1 (by @tacaswell) instead: that is why there is no |
This PR fixes #8145 .
The type checking of strings was raising errors when non Unicode strings were passed as arguments to
_validate_linestyle
(that was introduced in #8040), which can easily occur under Python 2.7 for example. This PR replaces this type checking with a simpletry: … except: …
approach. It also adds a couple of test cases to explicitly test the situation of non Unicode arguments.