-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MNT Add author/license note where missing and add the linter #29477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@betatim @lorentzenchr , wanna have a look? This should close #20813 maybe? |
@@ -5,6 +5,9 @@ | |||
the gradients and hessians of the training data. | |||
""" | |||
|
|||
# Authors: The scikit-learn developers | |||
# SPDX-License-Identifier: BSD-3-Clause | |||
|
|||
# Author: Nicolas Hug |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we keep lines like this? I don't remember if we discussed/decided something, my knee jerk reaction while looking at this PR is "we should remove it because it is confusing to have authors and author".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that we'll remove them all at some point. It can be done in follow-up PRs though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another PR was supposed to remove these, this was missed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing all these administrative fix ups.
One question about having "author" and "authors", but otherwise looks good to me!
# E721 is in preview (july 2024) and gives many false positives. | ||
# Use `is` and `is not` for type comparisons, or `isinstance()` for | ||
# isinstance checks | ||
"E721", | ||
# F841 is in preview (july 2024), and we don't care much about it. | ||
# Local variable ... is assigned to but never used | ||
"F841", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of ignoring them, should we just fix them first ?
Looks like you've already started it in #29501. When it's merged, we can remove this filter, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes and now, preview
enables some stuff in existing enabled rules that we might not want to fix (astral-sh/ruff#12343)
# Use `is` and `is not` for type comparisons, or `isinstance()` for | ||
# isinstance checks | ||
"E721", | ||
# F841 is in preview (july 2024), and we don't care much about it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and we don't care much about it.
well, it shouldn't be too hard to fix and I think it hurts readability to have useless variables :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually not sure. I'm happy to fix them in a separate PR, but I'm not sure if it helps readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep the filters for now, and see what we do about them afterwards.
The thing is that these preview rules will become stable at some point and I don't think we should add filters for new rules to which we don't comply. I think that we should follow ruff/flake8/black rules as they come, unless it represents a major change that everyone dislike.
@@ -7,7 +7,7 @@ repos: | |||
- id: trailing-whitespace | |||
- repo: https://github.com/astral-sh/ruff-pre-commit | |||
# Ruff version. | |||
rev: v0.2.1 | |||
rev: v0.5.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we bump the min version in _min_dependencies
and update the lock files then ?
Otherwise I can imagine that a user has an older version and then the CI complains for errors that he doesn't see locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR also updates _min_dependencies
, should I try to recreate the lock files? I don't mind, but that happens anyway the monday after we merge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I missed that. For the lock files, it should be fine because it's just the linter, but usually it's better to regenerate in the PR updating the dependencies so that we can see and fix any issue right away.
pyproject.toml
Outdated
] | ||
|
||
[tool.ruff.lint.flake8-copyright] | ||
notice-rgx = "\\#\\ Authors:\\ The\\ scikit\\-learn\\ developers\\\n\\#\\ SPDX\\-License\\-Identifier:\\ BSD\\-3\\-Clause" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add an extra blank line after because currently there are occurences of
# authors...
# copyright
import xxx
and
# authors...
# copyright...
import xxx
I prefer the latter
I think this is ready for a merge? |
I update the lockfiles and pushed them to this PR. Let's see what happens. I was expecting to see |
The answer to why |
haha I haven't thought about that :) then definitely no need to regenerate them here. |
examples/cluster/plot_hdbscan.py
Outdated
# Authors: The scikit-learn developers | ||
# SPDX-License-Identifier: BSD-3-Clause | ||
# %% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hum the extra blank line doesn't seem to work
I undid the lockfile change. |
Did more cleanups, we haven't even touched the cython files, maybe for another PR? |
sure :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the extra blank line thing is not working, see
#29477 (comment). I saw many occurrences. Otherwise LGTM.
@jeremiedbb I don't know how to fix the newline issue in linter, but I added all missing newlines with $ git grep -n '# SPDX-License-Identifier: BSD-3-Clause' | awk -F: '
{
filename=$1;
linenum=$2;
cmd="sed -n " linenum+1 "p " filename;
cmd | getline nextline;
close(cmd);
if (nextline !~ /^$/) {
insertCmd="sed -i \"" linenum "a\\\\\n\" " filename;
system(insertCmd);
}
}' (code generated partly by chatgpt) |
Alright, not a big issue though, thanks for trying :) |
The linter seems to raise false positives CPY001 if the file uses CRLF. Since the recommended setup of Git is |
Would it then make sense to change that |
If disabling |
would modifying the regex with |
…learn#29477) Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
This adds a linter and adds the notice where missing.