burn python2.x bridges more #12948

asottile · 2019-01-10T05:26:36Z

I saw https://twitter.com/amuellerml/status/1083129999643824128 and #12639 and was inspired

This was almost entirely automated via pyupgrade

I ran:

git ls-files -- '*.py' | xargs pyupgrade --py3-plus --keep-percent-format

And then fixed the flake8 issues pointed out by

./build_tools/circle/flake8_diff.sh

rth

Thanks @asottile ! A few comments are below, maybe this could be done by providing specific options to pyupgrade...

rth · 2019-01-10T06:53:18Z

benchmarks/bench_20newsgroups.py

+    print("X_test {}".format(X_test.shape))
+    print("X_test.format = {}".format(X_test.format))
+    print("X_test.dtype = {}".format(X_test.dtype))
+    print("y_test {}".format(y_test.shape))


All changes of the form

- "something {0}".format(x) + "something {}".format(x)

don't really bring anything and should be reverted IMO. It's one character less, for the cost of creating merge conflicts with all the open PRs.

they do offer an improvement and were added in python2.7 -- numbering format specifiers is human-error-prone operation

Yes and it's good to use them going forward, but adding merge conflicts for this seems not useful.

I think you could say the same about using super() -- what's the hangup here?

Fair, that would be an argument to keep the super calls the same. But I do feel the new way makes them more readable. The "hangup" is that every code change brings maintenance costs with it and we're trying to balance the benefit of making the change with that cost.
This is the reason why we discourage changes that are largely cosmetic, like applying pep8.

I don't really have that strong an opinion, though...

rth · 2019-01-10T06:55:28Z

benchmarks/bench_plot_nmf.py

@@ -203,7 +203,7 @@ class _PGNMF(NMF):
    def __init__(self, n_components=None, solver='pg', init=None,
                 tol=1e-4, max_iter=200, random_state=None,
                 alpha=0., l1_ratio=0., nls_max_iter=10):
-        super(_PGNMF, self).__init__(
+        super().__init__(


All changes to super() are done in #12812 and should be reverted here..

An easier solution might also be to wait until #12812 is merged ~~and re-run pyupgrade then~~ which will make this diff smaller.

rth · 2019-01-10T06:57:01Z

sklearn/externals/_arff.py

@@ -384,11 +384,11 @@ def __str__(self):
 # INTERNAL ====================================================================
 def encode_string(s):
    if _RE_QUOTE_CHARS.search(s):
-        return u"'%s'" % _RE_ESCAPE_CHARS.sub(r'\\', s)
+        return "'%s'" % _RE_ESCAPE_CHARS.sub(r'\\', s)


This is a vendored package. All changes to sklearn/externals/* need to be reverted (and if you wish submitted upstream to the corresponding package).

asottile · 2019-01-10T19:40:19Z

closing as "largely cosmetic"

amueller · 2019-01-10T19:44:17Z

I guess it's a bit spitting hairs which changes are cosmetic and which are not. Unfortunately we don't really have clear guidelines or consensus on that and I get it if you don't want to wade through that. Maybe it would make sense to swallow the bitter pill in this case and just make all the changes? I don't know.

rth · 2019-01-10T21:50:52Z

@asottile I have not realized that you wrote pyupgrade: it is quite nice.

Personally most of the changes in this PR look good to me (now that #12812 was merged) except for python2.7+ format specifiers. The latter is responsible for almost half of the changes and I just don't see the (debatable) advantages it brings in this case justifying the risk of merge conflicts. It's a bit along the same line as the argument about keeping percent formatting in asottile/pyupgrade#63 (comment) It works, the previous way is not deprecated, it's only slightly more compact and the fact that this was not supported in python2.6- is not reason enough to start using it.

Any chance of an option in pyupdate to disable the python2.7+ format which could be applied to this PR?

All the other changes look quite useful.

asottile · 2019-01-10T22:12:29Z

The --keep-percent-format option was not added as a "convenience" -- there are places where it introduces behaviour changes due to bugs in python2.x -- I don't know the state of the unicode compatibility in this codebase so I did not run with that option enabled.

I don't have interest in participating in option-hell at personal preference whims -- the whole point of the tool and tools like it (black, import reorderers, whitespace trimmers, etc.) is that you don't discuss stylistic nitpicks: format the code automatically and focus on architectural changes.

The reason to use py27+ format literals is not "because it breaks python2.6" and I've not suggested that. The reason to use it is the same reasons the syntax was introduced in the first place -- it's less error prone.

If you're afraid of merge conflicts, I'd suggest adding more aggressive formatting tools instead of shying away from them. For instance add-trailing-comma or any of the other formatters / linters might be a good idea to invest in.

rth · 2019-01-16T22:32:46Z

the whole point of the tool and tools like it (black, import reorderers, whitespace trimmers, etc.) is that you don't discuss stylistic nitpicks: format the code automatically and focus on architectural changes.

In general, I adhere to that vision and am pretty enthusiastic about black (and pyupgrade). However, when you take a large legacy project like scikit-learn we currently have 680 open PRs. A large number of these PRs were made at a significant cost both in contributor and reviewers time, can take years (sic) to merge, and are still very relevant today. So we have to take into account the cost of resolving merge conflicts, auto-formatting tools only help so much in this situation.

The reason to use py27+ format literals is not "because it breaks python2.6" and I've not suggested that.

Well, that's the feeling I got from the PR title "burn python2.x bridges more".

The reason to use it is the same reasons the syntax was introduced in the first place -- it's less error prone.

I'm not really interested in a long discussion about this, but after skimming through the issue where it was introduced https://bugs.python.org/issue5237 it seemed mostly about convenience. Presumably, you are not changing functionality there, so if any code was wrong, it would still be. Also, those same lines could use f-string formatting soon.

Anyway, I have continued this in #12997 (can't re-open this PR since you deleted the branch).

burn python2.x bridges more

f53646b

rth requested changes Jan 10, 2019

View reviewed changes

asottile closed this Jan 10, 2019

rth mentioned this pull request Jan 16, 2019

MAINT Run pyupgrade following Python2 deprecation #12997

Merged

Uh oh!

burn python2.x bridges more #12948

burn python2.x bridges more #12948

Uh oh!

Conversation

asottile commented Jan 10, 2019

Uh oh!

rth left a comment

Choose a reason for hiding this comment

Uh oh!

rth Jan 10, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

asottile Jan 10, 2019

Choose a reason for hiding this comment

Uh oh!

amueller Jan 10, 2019

Choose a reason for hiding this comment

Uh oh!

asottile Jan 10, 2019

Choose a reason for hiding this comment

Uh oh!

amueller Jan 10, 2019

Choose a reason for hiding this comment

Uh oh!

amueller Jan 10, 2019

Choose a reason for hiding this comment

Uh oh!

rth Jan 10, 2019

Choose a reason for hiding this comment

Uh oh!

rth Jan 10, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rth Jan 10, 2019

Choose a reason for hiding this comment

Uh oh!

asottile commented Jan 10, 2019

Uh oh!

amueller commented Jan 10, 2019

Uh oh!

rth commented Jan 10, 2019

Uh oh!

asottile commented Jan 10, 2019

Uh oh!

rth commented Jan 16, 2019

Uh oh!

Uh oh!

rth Jan 10, 2019 •

edited

Loading

rth Jan 10, 2019 •

edited

Loading