Skip to content

DOC Add guidelines for automated contributions #29287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 19, 2024

Conversation

ArturoAmorQ
Copy link
Member

What does this implement/fix? Explain your changes.

As presented in the dev meeting notes, it was decided to add a statement to the contribution guide about automated issues/PRs.

Any other comments?

I was not present during the dev meeting so my wording may not be completely inline with what was discussed there. Feel free to improve it.

Copy link

github-actions bot commented Jun 18, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 4630e8d. Link to the linter CI: here

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ArturoAmorQ

contribution, including code generated by language models (LLMs), must conform
to the project's existing style guide and coding conventions, in particular:

- Keep all lines under 88 characters.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here. it's linter's job

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I had in mind is to avoid contributions that would just result in a red CI. In the case of #29268 (review), I would have liked to refer the OP to an official statement in the contributor's guide regarding raw copy-paste from an LLM.

That being said, it's true that the bot already comments on such CI fails, but I have the feeling that raw-LLM contributions are going to become more and more of a problem. WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, at least for now we haven't had many. And when they get them, we simply block them in most cases anyway, so we don't care about the linter issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And there is no reason to believe that LLMs wouldn't be better at following format standard than the average human.

The real problem with LLM code contributions, is not necessarily their form or ability to make tests pass but the fact that no human contributor fully understands the goal of the contribution, its inner value for end-users and more generally if this a change of the code base that is useful for our community or not.

- Avoid modifying unrelated lines.
- Do not duplicate imports.
- Provide comments that add value and avoid stating the obvious.
- **Most importantly**: Do not contribute code that you don't understand.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the only point I'd like to keep from this section.

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@adrinjalali adrinjalali added the Waiting for Second Reviewer First reviewer is done, need a second one! label Jun 18, 2024
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor improvements but otherwise LGTM.

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Copy link
Contributor

@StefanieSenger StefanieSenger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ArturoAmorQ. Here are some suggestions for being less wordy.

I would also encourage to add the policies about automated contributions in a less prominent place. It certainly should not be the first thing we give interested contributors to read after the "Our community, our values" block. Actually, it only needs to be somewhere, so the five people that are interested in that per year can find it through search (and we can refer to it if we need to).

The second part about coding conventions has some thematic similarities with the Coding guidelines which I think can be pulled off from "Developing scikit-learn estimators", because it is too general for this section anyways and could be unified with the new passage here.

ArturoAmorQ and others added 2 commits June 19, 2024 11:55
Co-authored-by: Stefanie Senger <91849487+StefanieSenger@users.noreply.github.com>
@ArturoAmorQ
Copy link
Member Author

It certainly should not be the first thing we give interested contributors to read after the "Our community, our values" block.

I see your point that it's quite a mood change w.r.t. the previous paragraph, but I still want it to be visible enough as in the future it may not be only 5 contributions per year. Maybe right after "Ways to contribute"?

@StefanieSenger
Copy link
Contributor

I see your point that it's quite a mood change w.r.t. the previous paragraph, but I still want it to be visible enough as in the future it may not be only 5 contributions per year. Maybe right after "Ways to contribute"?

Yes, I like that idea. :)

@adrinjalali adrinjalali enabled auto-merge (squash) June 19, 2024 13:13
@adrinjalali adrinjalali merged commit 36104b5 into scikit-learn:main Jun 19, 2024
30 checks passed
@ArturoAmorQ ArturoAmorQ deleted the automated_tools branch June 19, 2024 15:47
jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Jul 2, 2024
Co-authored-by: ArturoAmorQ <arturo.amor-quiroz@polytechnique.edu>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Stefanie Senger <91849487+StefanieSenger@users.noreply.github.com>
@jeremiedbb jeremiedbb mentioned this pull request Jul 2, 2024
11 tasks
jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Jul 2, 2024
Co-authored-by: ArturoAmorQ <arturo.amor-quiroz@polytechnique.edu>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Stefanie Senger <91849487+StefanieSenger@users.noreply.github.com>
jeremiedbb pushed a commit that referenced this pull request Jul 2, 2024
Co-authored-by: ArturoAmorQ <arturo.amor-quiroz@polytechnique.edu>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Stefanie Senger <91849487+StefanieSenger@users.noreply.github.com>
github-merge-queue bot pushed a commit to probabl-ai/skore that referenced this pull request Apr 29, 2025
Muhammad-Rebaal pushed a commit to Muhammad-Rebaal/skore that referenced this pull request May 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Waiting for Second Reviewer First reviewer is done, need a second one!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants