Skip to content

STY: Apply pre-commit hooks to codebase #22809

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

matthewfeickert
Copy link
Contributor

@matthewfeickert matthewfeickert commented Apr 8, 2022

PR Summary

This is a follow up PR to PR #21583 based on discussions in PR #22777 (c.f. #22777 (comment)) and on Gitter. This touches a lot of files so I'll walk through the story here, which should help motivate turning on pre-commit.ci.

  1. The pre-commit hooks were added in PR Add pre-commit config and dev instructions #21583 (November 2021) with a pre-commit.ci

autoupdate_schedule: 'quarterly'

though as pre-commit.ci wasn't turned on, there were no pre-commit hook updates coming in on that schedule and so I've run pre-commit autoupdate to update the pre-commit-hooks and flake8 ids.

  1. With those updated, I then ran the failing (c.f. DEV: Add codespell to pre-commit hooks #22777 (comment)) pre-commit hooks and applied their necessary revisions as individual commits. So those are the style commits for end-of-file-fixer, trailing-whitespace, and check-docstring-first. These were also done as individual commits so that if in review any of these are viewed as not acceptable they can be dropped easily with a rebase.

  2. As the following commits touch many files, this is a noisy first time run of the hooks. To help limit this, add a .git-blame-ignore-revs file that will automatically ignore commits in the blame view on GitHub (this file is required to be in the top level of the repo). This can also be done locally with git blame by running

git blame --ignore-revs-file .git-blame-ignore-revs

Example on my fork:
example_ignore

Note to reviewer: As this file contains commits it is sensitive to rebases, and so please make sure that these commits are correct if anyone rebases this during review.

This PR should probably be merged in advance of PR #22777 which can be trivially rebased and would be a good time to turn pre-commit.ci on.

pre-commit now passes the full repo:

$ pre-commit run --all-files
check for added large files..............................................Passed
check docstring is first.................................................Passed
fix end of files.........................................................Passed
mixed line ending........................................................Passed
trim trailing whitespace.................................................Passed
flake8...................................................................Passed

PR Checklist

Tests and Styling

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • [n/a] New features are documented, with examples if plot related.
  • [n/a] New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • [n/a] API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).

Comment on lines 1 to 8
# style: end-of-file-fixer pre-commit hook
b372d66ab9e9aa30573f240d87a6ea92474cce75

# style: trailing-whitespace pre-commit hook
0f0a977d8aaba773570dff8f828c358df22203aa

# style: check-docstring-first pre-commit hook
1a36dc83af39159749ce5857eca6ddacd345e6bb
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer: As this file contains commits it is sensitive to rebases, and so please make sure that these commits are correct if anyone rebases this during review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great that this is possible. Really helpful for big cosmetic changes like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rebased, but I also updated the commits.

$ git log --graph --oneline --decorate --all | head -n 10
* 02fc0611cf (HEAD -> mnt/apply-pre-commit-to-all-files, origin/mnt/apply-pre-commit-to-all-files) DOC: Add pre-commit hook run dev instructions
* f3aa826d83 DEV: Add .git-blame-ignore-revs to ignore style in blame view
* 66c12f2778 STY: Apply check-docstring-first pre-commit hook
* 33516082c5 STY: Apply trailing-whitespace pre-commit hook
* 94d985a744 STY: Apply end-of-file-fixer pre-commit hook
* 7e04925fcd MNT: Update pre-commit hooks to latest versions
| * 3dd6049160 (mnt/add-codespell-to-pre-commit-hooks) FIX: Fix typo caught by codespell pre-commit hook
| * 19987645ec DEV: Add codespell to pre-commit hooks
|/  
*   19992285cb (upstream/main, origin/main, main) Merge pull request #22807 from anntzer/qd

Comment on lines +73 to +80
The hooks can also be run manually. All the hooks can be run, in order as
listed in ``.pre-commit-config.yaml``, against the full codebase with ::

pre-commit run --all-files

To run a particular hook manually, run ``pre-commit run`` with the hook id ::

pre-commit run <hook id> --all-files
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can of course target particular files with --file (e.g. pre-commit run --file README.rst) but given that is already done on the files that you're attempting to commit when you have pre-commit installed locally this didn't really seem important to note.

From pre-commit run --help:

  --files [FILES ...]   Specific filenames to run hooks on.

@matthewfeickert
Copy link
Contributor Author

matthewfeickert commented Apr 8, 2022

To address a question from Gitter:

In PR #17143 reviewdog was added. reviewdog currently appears to run linting for only flake8 (please correct me if I'm wrong). As to how pre-commit differs, it applies all the pre-commit hooks (including flake8) and will quarterly (as defined in the config)

autoupdate_schedule: 'quarterly'

open a PR to automatically update the hooks that have new releases. It also runs all of the pre-commit hooks faster then reviewdog (though this really doesn't matter at all when you're comparing against how long the rest of the CI takes).

I'll leave it up to the dev team if this is sufficiently interesting/useful to have.

@matthewfeickert
Copy link
Contributor Author

matthewfeickert commented Apr 8, 2022

I also enabled pre-commit.ci on my fork to do a demo of what this looks like. c.f. matthewfeickert#9 which shows things working, and then also a little autofix demo from a comment.

my_fork

@oscargus
Copy link
Member

oscargus commented Apr 9, 2022

I'm not sure that we should require pre-commit hooks (as I take is an indirect suggestion here). While I tend to use it, there are setups where it is not really convenient. For example, I have different shells on Windows, one for running git (https://gitforwindows.org/) and one for Python (conda). While I can of course unify them (being too lazy so far, and doing much development on Linux), this is a bit of extra work and nothing I think should be an obstacle for contributing.

Also, I am not sure how this works when doing commits from graphical interfaces (which for some reason seems to be a popular way of using git...).

Running the hooks on all files is as you say a one-time effort. IMHO I do not think we need to use the .git-blame-ignore-revs here, but just go for it (if we should do it). All newly committed code will have these checks anyway and if someone touches some code and have white-space removal in the editor, it will be enforced anyway. With that said, I am not against cleaning the code in one go, but think the .git-blame-ignore-revs is really not required.

To me, pre-commit hooks primarily serve the purpose of not having to run the code quality tests locally and get warnings if something is not right, so I am not a big fan of the hooks actually changing stuff. Better to "learn" how it should be done.

@matthewfeickert
Copy link
Contributor Author

I'm not sure that we should require pre-commit hooks (as I take is an indirect suggestion here).

This doesn't require developers to use them. Using pre-commit.ci is just going to find things that would later require a developer to go fix to and alert the PR contributor and the reviewers to them.

While I tend to use it, there are setups where it is not really convenient. For example, I have different shells on Windows, one for running git (https://gitforwindows.org/) and one for Python (conda). While I can of course unify them (being too lazy so far, and doing much development on Linux), this is a bit of extra work and nothing I think should be an obstacle for contributing.

I don't think I understand. :? What is the obstacle? Are you saying that pre-commit doesn't work on your Windows setup? Can you elaborate?

Also, I am not sure how this works when doing commits from graphical interfaces (which for some reason seems to be a popular way of using git...).

That is an interesting point that I've never thought about before. I would be a bit surprised (maybe naively?) to find out that popular Git GUIs have no plugin support though, and so I would (definitely naively) hope things "just work".

Running the hooks on all files is as you say a one-time effort. IMHO I do not think we need to use the .git-blame-ignore-revs here, but just go for it (if we should do it). All newly committed code will have these checks anyway and if someone touches some code and have white-space removal in the editor, it will be enforced anyway. With that said, I am not against cleaning the code in one go, but think the .git-blame-ignore-revs is really not required.

It is certainly not required, but the diff is pretty ugly and meaningless, so this was more of a "making it less painful" effort. So I'm quite happy to remove this, but the blame avoidance seems almost custom designed for these huge diffs situations.

To me, pre-commit hooks primarily serve the purpose of not having to run the code quality tests locally and get warnings if something is not right, so I am not a big fan of the hooks actually changing stuff. Better to "learn" how it should be done.

I take a different view on their purpose (automate the boring stuff that would need to get human changed anyway), but my opinion on that isn't the relevant thing here. The hooks only run locally if the developer has them installed locally (which is totally optional) and if a developer pushes code that fails pre-commit.ci then as automatic pushing back fixes is disabled

autofix_prs: false

they can speed up the process and avoid having a reviewer have to ask for them for changes by just commenting "pre-commit.ci autofix". Nothing is happening automatically unless a developer opts in by using pre-commit locally or by asking pre-commit.ci to fix things for them. (I think I am probably telling you things that you already know though, so apologies here. 😬 Can you elaborate/clarify your points if I've misunderstood?)

I'll note also that this PR is simply enforcing the pre-commit hooks that were added in PR #21583, and if the dev team isn't happy with that hook selection I'm certainly happy to fix that here to whatever is desired. 👍

Copy link
Member

@jklymak jklymak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the git-blame-ignore-revs, I think these cosmetic changes are fine to get the recommit hooks to pass.

@oscargus
Copy link
Member

oscargus commented Apr 9, 2022

I don't think I understand. :? What is the obstacle? Are you saying that pre-commit doesn't work on your Windows setup? Can you elaborate?

The problem is that the git (for Windows) bash shell and the (mini)conda shell (Anaconda prompt) don't have the same Python environment, so if I run git in the conda shell I have everything I need for the pre-commit, but not the nice aspects of the git bash shell (like branch name and tab-completion), and if I use the git shell, I do not have the conda environment.

As I said, this can be sorted out with a bit of effort and is just a convenience thing really, but I would assume it is a similar situation for the graphical clients as conda typically doesn't set up any environment variables except for it's own shell.

(I think I have three different Python installs on my Windows machine: conda, that I use for most things, Windows Python that ends up in the path, and the one provided with the MS dev-prompt for building certain stuff. So while I can have all three synced, it is not very practical...)

Copy link
Contributor

@greglucas greglucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually surprised there is this little churn in the files, it is almost all trailing whitespace.

I also prefer having the .git-blame-ignore-revs for the whitespace changes because what I'm really interested in with the blame is looking for who updated the actual code piece and I don't want to ping @matthewfeickert to fix something he only did a whitespace update on :)

@@ -989,12 +989,12 @@ def trigger(self, *args, **kwargs):
'help': ToolHelpBase,
'copy': ToolCopyToClipboardBase,
}
"""Default tools"""
# Default tools
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just delete this comment and the one below, the comment is the same as the variable name :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in rebase I just pushed. 👍

@matthewfeickert matthewfeickert force-pushed the mnt/apply-pre-commit-to-all-files branch from c677636 to 02fc061 Compare April 9, 2022 16:02
@matthewfeickert
Copy link
Contributor Author

Just showing that things still pass after rebase:

$ pre-commit run --all-files
check for added large files..............................................Passed
check docstring is first.................................................Passed
fix end of files.........................................................Passed
mixed line ending........................................................Passed
trim trailing whitespace.................................................Passed
flake8...................................................................Passed

@@ -989,12 +989,10 @@ def trigger(self, *args, **kwargs):
'help': ToolHelpBase,
'copy': ToolCopyToClipboardBase,
}
"""Default tools"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these removed? Not sure if they are actually rendered, but those are doc-strings for the attributes.

Copy link
Contributor Author

@matthewfeickert matthewfeickert Apr 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They were removed given @greglucas review comment here: #22809 (comment)

They don't show up in the "files changed" tab during review and only show up in the "Conversation" tab because I rebased the PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK! Yes, considering that the doc-string is the same as the name and that they do not show up in the rendered documentation it may make sense. (Not having them as a comment for sure makes sense.)

I was a bit worried that this indicates that one cannot really have variable doc-strings though. That those will be interpreted as "multiple top-level doc-strings".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@matthewfeickert matthewfeickert Apr 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oscargus I'll let you and the other members of the dev team make a call here, but the fact that this does show up in the current docs looks unintentional to me given the render. At least I don't see their value here in the docs, but I'm also obviously not an expert in this section of the API.

not_intentional

These docstring comments were added in 2015 back in PR #3652 by @fariza. So while I sure hope no one ever asks me to explain code that I wrote over 5 years ago 😬 maybe they can take a look and give their thoughts.

@greglucas as you suggested removal maybe you can also give your thoughts here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, check-docstring-first had a long-standing issue for these false positives, see pre-commit/pre-commit-hooks#159. It can be turned off per-file, and does catch the common case where imports sneak above the docstring. Or it can be dropped, either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this affects excess docstrings in examples far more than variable docstrings in the actual library, so I think I'm net positive on keeping it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I think I'm net positive on keeping it.

(Sorry, long week and my brain is a bit fuzzy.) Positive on keeping the check-docstring-first hook, correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty indifferent on this (as it sounds like most of us are!). I'd lean towards sticking with what you have currently (removing the docstring/comment [and because of that removing the sphinx generated documentation] and leaving the pre-commit check in place).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@QuLogic @oscargus @jklymak @greglucas @tacaswell As it seems that there are no strong feelings against keeping the check-docstring-first hook and some positive leanings towards it, does anyone have any further questions on this PR?

To keep the noise down on this I've stopped rebasing it against main but I have a branch on my fork where I've continued to do so and pre-commit run --all-files is still passing with this rebased on main's HEAD.

@oscargus
Copy link
Member

I don't want to ping

That is the general reason, yes. However, looking at the changes here: do you really expect that you will have to ping for any of these changes? They are primarily in non-code and the very few that are in the code are quite non-critical.

@greglucas
Copy link
Contributor

I agree that this specific change isn't too invasive and wouldn't be a big issue either way for the ignore commits from my perspective. However, I see this as setting a good precedent for potential future updates dropping big style changes and makes it easy to see how to do it properly by just adding the commit hash to this file.

@matthewfeickert
Copy link
Contributor Author

However, I see this as setting a good precedent for potential future updates dropping big style changes and makes it easy to see how to do it properly by just adding the commit hash to this file.

Yeah. 👍 I got the idea from @henryiii using it in scikit-build/scikit-build#685 where they use Black. I know that Matplotlib doesn't use any formatters like that, but if there was ever a big style revision it could be done with a diminished disruption with respect to using the Git history.

@tacaswell tacaswell added this to the v3.6.0 milestone Apr 13, 2022
Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a slight preference that the LICENCES files not be touched / be ignored by these tools but not a big enough concern to block over it.

@matthewfeickert
Copy link
Contributor Author

matthewfeickert commented Apr 13, 2022

I have a slight preference that the LICENCES files not be touched / be ignored by these tools but not a big enough concern to block over it.

@tacaswell That can easily happen. They can just get added to the exclude list

exclude: |
(?x)^(
extern|
lib/matplotlib/mpl-data|
doc/devel/gitwash|
doc/users/prev|
doc/api/prev|
lib/matplotlib/tests/tinypages
)

I'll do that, rebase to get the right commits, and then request re-review.

Requested by Thomas Caswell in PR 22809
* Update pre-commit hooks:
   - https://github.com/pre-commit/pre-commit-hooks: v4.0.1 → v4.2.0
   - https://github.com/pycqa/flake8: v3.9.2 → v4.0.1

This is being applied manually and not waiting for pre-commit.ci to
apply it as pre-commit.ci hasn't been turned on yet since these
hooks were introduced in PR 21583.
Also remove comments in lib/matplotlib/backend_tools.py per request
of maintainer in PR review.
Add a git blame ignore-revs file to automatically ignore commits listed
in it from the blame view on GitHub. Include in it style changes added by
pre-commit hooks which touch multiple files.

To ignore the commits in the .git-blame-ignore-revs when running `git blame`
locally run: `git blame --ignore-revs-file .git-blame-ignore-revs`.

c.f.:
* https://git-scm.com/docs/git-blame#Documentation/git-blame.txt---ignore-revs-fileltfilegt
* https://docs.github.com/en/repositories/working-with-files/using-files/viewing-a-file#ignore-commits-in-the-blame-view
Add instructions to the devs on how to manually run pre-commit hooks, both for
all the hooks, and for individual hook ids. Example:
`pre-commit run flake8 --all-files`

Also use the 'python -m pip' pattern to be consistent with the rest of the docs.
@matthewfeickert matthewfeickert force-pushed the mnt/apply-pre-commit-to-all-files branch from 02fc061 to 48ade7d Compare April 13, 2022 23:15
Comment on lines +1 to +8
# style: end-of-file-fixer pre-commit hook
c1a33a481b9c2df605bcb9bef9c19fe65c3dac21

# style: trailing-whitespace pre-commit hook
213061c0804530d04bbbd5c259f10dc8504e5b2b

# style: check-docstring-first pre-commit hook
046533797725293dfc2a6edb9f536b25f08aa636
Copy link
Contributor Author

@matthewfeickert matthewfeickert Apr 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this last rebase:

$ git log --graph --oneline --decorate --all | head -n 7
* 48ade7d904 (HEAD -> mnt/apply-pre-commit-to-all-files, origin/mnt/apply-pre-commit-to-all-files) DOC: Add pre-commit hook run dev instructions
* 4c003775bb DEV: Add .git-blame-ignore-revs to ignore style in blame view
* 0465337977 STY: Apply check-docstring-first pre-commit hook
* 213061c080 STY: Apply trailing-whitespace pre-commit hook
* c1a33a481b STY: Apply end-of-file-fixer pre-commit hook
* 787bdf08e5 MNT: Update pre-commit hooks to latest versions
* 95038d96a4 DEV: Exclude LICENSE files from pre-commit

and

$ git grep "c1a33a481b\|0465337977\|213061c080"
.git-blame-ignore-revs:c1a33a481b9c2df605bcb9bef9c19fe65c3dac21
.git-blame-ignore-revs:213061c0804530d04bbbd5c259f10dc8504e5b2b
.git-blame-ignore-revs:046533797725293dfc2a6edb9f536b25f08aa636

so these are correct. 👍

As a single line checker:

$ git log --oneline | head -n 7 | grep STY | awk '{print $1}' | xargs -I {} git grep {}
.git-blame-ignore-revs:046533797725293dfc2a6edb9f536b25f08aa636
.git-blame-ignore-revs:213061c0804530d04bbbd5c259f10dc8504e5b2b
.git-blame-ignore-revs:c1a33a481b9c2df605bcb9bef9c19fe65c3dac21

@matthewfeickert
Copy link
Contributor Author

@tacaswell The LICENSE/ dir is properly excluded now:

$ pre-commit run --all-files
check for added large files..............................................Passed
check docstring is first.................................................Passed
fix end of files.........................................................Passed
mixed line ending........................................................Passed
trim trailing whitespace.................................................Passed
flake8...................................................................Passed
$ git diff upstream/main -- LICENSE/  # No diff
$

@oscargus
Copy link
Member

setting a good precedent

Fair enough!

@tacaswell tacaswell merged commit f569f21 into matplotlib:main Apr 18, 2022
@tacaswell
Copy link
Member

Thanks @matthewfeickert !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants