
DOC: do not posting LLM output as your own work #28335


Merged
2 commits merged into matplotlib:main on Nov 25, 2024

Conversation

tacaswell
Member

I am not going to link to the issues or comments that inspired me to write this. We are starting to see comments that look like they are the output of feeding the OP of an issue into an AI assistant, and at least one bug report that was "chatGPT said this should work but it does not".

I think that users who report bugs with us should expect to get feedback from an expert human, and I do not think we should be spending time doing user support for OpenAI.

I thought about including a sentence to the effect of "We obviously can't stop you from using them, but please rewrite it in your own words" but did not like any of the formulations I came up with.

I do not think asking people who use LLMs to generate comments/issues to attribute them addresses the actual problem; attribution alone is insufficient.

@github-actions bot added the "Documentation: devdocs" (files in doc/devel) label Jun 3, 2024
@story645 added the "status: needs comment/discussion" (needs consensus on next step) label Jun 3, 2024
@story645
Member

story645 commented Jun 3, 2024

I get the frustration, but I think we should have a discussion on how we want to deal with this - frankly, in part because I think many of the folks who are copy-pasting ChatGPT into our forums are not reading our contributing guide. I think this information may therefore be more effective in the issue/bug report template and as part of the discourse rules.

Also, because this is a behavior thing and therefore conduct, I don't know that the middle of the newbie contribution section is where we want to put conduct guidelines as their own thing - for example, we put the guidance on not duplicating PRs as part of the general PR guidance.

Also, when it's not explicitly stated, it's not always clear whether a post is AI, Stack Overflow, or inexperience, and we have generally not banned Stack Overflow or suggestions that are essentially links to Stack Overflow (which nowadays may also be AI). I think it may be more effective to provide better guidance on what we want in a bug report - including in the triage section of the contributing guide - and to shut it down on the issue tracker as needed. Possibly as part of picking #23109 back up.

@rcomer
Member

rcomer commented Jun 3, 2024

FWIW, Stack Overflow explicitly bans AI-generated content:
https://stackoverflow.com/help/gen-ai-policy

I agree with @story645 that those posters are unlikely to have read the contributors' guide, but it is useful to have something policy-like to reference when we say "please don't do that".

@story645
Member

story645 commented Jun 3, 2024

but it is useful to have something policy-like to reference when we say "please don’t do that"

I'm not opposed to this, and honestly think we should include PRs in this too, similar to Stack Overflow... but then I think we should just write an AI policy as its own thing and put it under the Review and Triage section, since in some ways the guidance will be geared as much toward what to do with content you suspect to be AI as toward "don't write with AI".

I'm uncomfortable with it being stuck between "contribute" and "talk to us" as part of the newbie intro packet (ETA: where newbie = new to contributing to Matplotlib, not necessarily their skill level). Feels a bit like the plagiarism statement in a syllabus.

@tacaswell
Member Author

We mention the CoC at the very top of this page, so we already talk about expected behavior in this document (and I do not think that this belongs in the CoC). I'm relatively indifferent to where it goes on this page (maybe it should go up at the top where we mention the CoC, but that seemed like giving it too high a profile), but I think it should be on this page someplace rather than buried in the review/triage guidelines (which are even less likely to be read by new-to-Matplotlib contributors). While it may be the case that people don't read the docs (or read them but convince themselves that their use of AI is OK anyway), that is not a reason not to write them down!

We may also want guidelines on how to respond to (suspected) disingenuous AI content, which would make sense in the review guide (as it is guidance to ourselves about how to deal with the problem), but I do not think that should block this. We should also get more evidence/experience with the problem before we come up with a whole process (because otherwise we will imagine the wrong problems 🤣).

Maybe it should also go in the templates (although there is no template for just leaving a comment), but that is not the right place to put a canonical policy which, as @rcomer points out, we will likely want to link to.


I suspect that keeping out code that is generated by LLMs is a losing battle (that we have probably already lost) given how they are being embedded directly in IDEs. However, we are already experienced at integrating code from developers at many different skill levels, so I hope we will be able to absorb AI-generated code as well (I also have not seen any demos of "fix this code with a minimal change", so it is not clear AI can do what we need done, but then I have also not gone looking). At any rate, I do not think we need to pause to come up with a complete AI policy to say "please do not verbatim post the output of chatGPT as a comment responding to someone else's bug report".


Added another sentence (that probably needs some wordsmithing) saying that machine translation is not a problem.

@timhoffm
Member

timhoffm commented Jun 4, 2024

There are different cases of AI usage to be considered, and I have the feeling we should address them individually:

  • Use of GenAI as part of development: generated code in PRs and thus in our codebase. This may have legal implications; I haven't followed what the status is. This may be a general topic for NumFOCUS.
  • Bug reports on GenAI code ("chatGPT said this should work but it does not"). We're not a debugger for that, but it's no different from the "please help me with my code" requests we've already had in the past. AI or not is not the relevant point. We expect a certain form and content for bug reports: they should focus on a specific error and contain a minimal example (see the sketch after this list).
  • Giving answers in issues based on GenAI: putting the issue into chatGPT and posting the result is neither helpful nor acceptable. Issue authors can do that themselves if an AI answer is sufficient for them. We expect answers to be curated feedback. (Should AI be good enough for that one day, we will run a bot for it.) Again, we expect a certain form and quality in answers, and verbatim AI responses currently don't meet these. It's OK to use AI as part of the process, but we expect contributors to curate their answers.
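For reference, a minimal sketch of the kind of self-contained example we ask for in bug reports; the plotting calls below are generic placeholders rather than code from any real issue:

```python
# Hypothetical minimal reproducible example for a bug report: it imports
# everything it needs, runs as-is, and isolates a single behavior.
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], label="data")
ax.legend()
plt.show()  # state what you expected vs. what actually happened
```

The point is that a maintainer can copy it, run it, and see the problem without reconstructing missing context.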

@story645
Member

story645 commented Jun 4, 2024

but think it should be on this page someplace rather than buried someplace in the review/triage guidelines

So my issue with discoverability/scope/organization comes from entering via https://matplotlib.org/devdocs/devel/index.html:

This is the current contribute.html TOC:
[screenshot: current contribute.html table of contents]

This is how it's overviewed/indexed on the contribute landing page:
[screenshot: contribute landing page overview]

What I am proposing as review/triage guidelines is a standalone short page that gets added as a new entry to:
[screenshot: section of the devel docs index]

@rcomer
Member

rcomer commented Jun 4, 2024

If someone is replying to issues, then that is arguably closer to triage work than it is to anything else.

@story645
Member

story645 commented Jun 4, 2024

Just to try and be clearer, what I'm proposing is:

  • one source-of-truth policy doc on generative AI that goes under a policies and guidelines subsection, arguably triage since that's the umbrella of contributing/main it's most applicable to.
  • link out relevant subsections as needed in contribute.html, issue tracker, discourse, etc.

As Tim pointed out in #28335 (comment), we expect this to crop up in various flavors. I think it'll just be cleaner to maintain if we have one page that lays out the project's attitude towards gen AI and what we don't consider acceptable usage.

@timhoffm
Member

timhoffm commented Jun 4, 2024

I think it'll just be cleaner to maintain if we have one page that lays out the project attitude towards gen AI and what we don't consider acceptable usage.

I draw the opposite conclusion. We should address the individual aspects separately in the respective relevant sections of the docs - that's where users will encounter them. "I'll read what I can and cannot do with AI in matplotlib" is not a practical approach. Nobody (except maybe core devs) is interested in the general attitude towards AI usage across all topics.

In particular, the topics "bug reports on AI code" and "issue answers using AI" would not strictly need an explicit AI comment, because the fact that AI is not a fit there already follows from our existing policy. But if we see too much of it, we can simply add a sentence along the lines of (for issue answers) "Please do not respond with AI-generated answers, because they do not match our quality criteria."

@story645
Member

story645 commented Jun 4, 2024

We should address the individual aspects separately in the respective relevant sections of the docs. - That's where users will encounter them.

I propose we do that through links and framing, not that we don't do it at all. Mostly because I'm thinking of it from "where do I find this when I need to link it?"

Here's the ToC from this PR, where "use of generative AI" is binned as the same type of thing as the sections explaining getting started with contributing code/docs/community:
[screenshot: contribute.html ToC as proposed in this PR]

@story645
Member

story645 commented Jun 4, 2024

What I mean by linkout + framing is that, for example, in contribute.html, where it's relevant, we add something like the excerpt below (a sketch of the corresponding reST cross-reference target follows it):


> 
> Work on an issue
> ============
> 
>   In general, the Matplotlib project does not assign issues. Issues are "assigned" or "claimed" by opening a PR; 
> <snip> We ask that code not be generated via AI because :ref:`license-issues <gen-ai-code>`.
> 
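For context, a minimal sketch of how the hypothetical `gen-ai-code` target referenced above could be defined in a standalone policy page; the label and section wording are placeholders, not existing docs:

```rst
.. _gen-ai-code:

Use of generative AI
====================

Code produced by generative AI tools can raise licensing and copyright
concerns, and contributors are expected to understand and be able to
explain any code they submit.
```

With a label like that in place, the :ref: role in contribute.html would render "license-issues" as a link pointing at that section.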

@story645
Member

story645 commented Jun 4, 2024

Nobody (except maybe core devs) is interested in the general attitude towards AI usage across all topics.

Also, that's sort of what this PR is doing, but it singles out new contributors and gives them instructions on how to use AI, mostly when triaging issues, which is not a topic that's even covered in contribute.html.

@story645
Member

story645 commented Jun 4, 2024

We should address the individual aspects separately in the respective relevant sections of the docs. - That’s where users will encounter them.

So here's an alternate approach to "respective relevant sections" (because yes, my concern is more about where than what):

And I'm sure there are other places I'm not thinking of.

@rcomer
Member

rcomer commented Jun 4, 2024

I think the question of AI in code development is complicated because it comes in various forms: letting GitHub Copilot* auto-fill a docstring or suggest the next few lines of your loop is quite different from asking ChatGPT for a complete code solution. Drafting any policy on this should probably also involve a lawyer, since ultimately the LLM is copying (a lot of) other people's code.

For bug reports I agree with @timhoffm that AI use is irrelevant because we already have guidance to make a minimal example and that should apply regardless of how the code originated.

So I suggest for now focussing on

"please do not verbatim post the output of chatGPT as a comment responding to someone else's bug report"


*My knowledge of GH Copilot comes purely from having attended a 10-minute demo.

@tacaswell
Member Author

So I suggest for now focussing on

I removed the sentence about reporting. I agree that it is just another version of "please reduce that to a reproducible example" that we already ask people to do, and we have plenty of experience in kindly nudging users to do the right thing.

I also completely agree with @timhoffm that we should drop this stuff in where it is relevant rather than having a central page.

@jklymak
Member

jklymak commented Jun 4, 2024

I think I understand where this change is coming from, but I'm not sure about the implementation. Contributions should be high quality and not plagiarized, full stop. But if someone uses AI to spit out a chunk of code faster than they could type it, I don't know that we need to specifically make that sound discouraged, or like it can't show up in discussions or even our code base. If anything, I think we should be encouraging beginners to ask AI smart questions to help them learn how to use the library.

@story645
Member

story645 commented Jun 4, 2024

But if someone uses AI to spit out a chunk of code faster than they can type it, I don't know that we need to specifically make that sound like it's discouraged or it can't show up in discussions or even our code base.

Yeah, also this - the real issue here is "please don't post solutions unless you're sure/have verified they work". If someone posted verbatim ChatGPT output that worked, we wouldn't have (much) reason to care.

@story645
Member

story645 commented Jun 5, 2024

So I suggest for now focussing on

"please do not verbatim post the output of chatGPT as a comment responding to someone else's bug report"


Wondering if maybe we should add a "triage" section to `contribute.html` / a "triage issues" button to the index - then this admonition has a relevant section, and it'd give us a place to summarize/roadmap/signpost the triage docs and add in any other triage-specific newbie guidance. xref #28347

@tacaswell
Member Author

I've rebased and tweaked the wording a bit. I think we have recently seen an issue where two different people copy-pasted ChatGPT output into an issue [we got two almost identical comments, and I had a co-worker put the issue into ChatGPT, which produced about the same thing], so this is something we need to address sooner rather than later.


We have the expectation that maintainers are going to engage with contributors in an earnest and empathetic way. I think this is the right thing to do, both out of basic human decency and to help foster/onboard future regular contributors. However, in the spirit of the discussion about having our project used in classes (where we effectively get drafted as unpaid TAs), spending the time (and emotional energy) to engage with someone who is (best case) acting as a human proxy to an LLM is not a good use of our time.


We expect authentic engagement in our community. Do not post verbatim output
of Large Language Models or similar generative AI as comments on GitHub or our
discourse server as your own work. For proposed code changes ensure that you
Member

Is that sufficient? The wording allows copy-pasting AI code verbatim as long as it's reasonable and the user can explain it. Do we know the legal/copyright implications of that?

Member Author

I'm worried that we have no way to tell what is copy-pasted (other than the honor system), and I do not think we want to get into the slicing game of deciding when you have changed it enough to qualify. I suspect there are also slicing issues as to what is "too AI". Using tab-complete is clearly OK, "dear claude please fix this bug and add a test" is clearly too far, but there is lots of stuff in the middle where 🤷🏻.

The analogy I would go for is: if you turned to a co-worker for help and they sent you a mostly working code snippet that you then opened a PR based on (or you are adapting a patch someone left as a comment), we would not be happy with "I don't know why this works, but my coworker said it should". So our least-bad option is to ask people to understand their code before they open the PR 🤣.

Currently we say nothing about them, so I suspect that people will assume it is OK. This at least puts some (social) guard rails on and doesn't make the problem any worse.

Member Author

I pushed some slight rewording; not sure if I made it better or worse though.

@tacaswell
Member Author

I asked chatGPT why you should not use it for open source work 🤣


Using ChatGPT to develop contributions to open source projects may not always be the best choice for several important reasons:

1. Quality and Accuracy of Code

ChatGPT generates responses based on patterns it has learned from a large dataset of code and text. While it can often generate functional code snippets, these snippets may not always be optimal, free from bugs, or aligned with the best practices for a specific project. Open source projects require high-quality, maintainable, and secure code, and there is a risk that using ChatGPT to develop contributions could introduce mistakes or inefficiencies.

2. Lack of Context

Open source projects often have specific coding standards, architectural patterns, and design philosophies that contributors must adhere to. ChatGPT doesn't have deep context about a specific project, such as its history, its community standards, or the intricacies of its codebase. Consequently, contributions generated by ChatGPT might not fit well within the existing structure or could conflict with other parts of the project.

3. Ethical Considerations and Licensing

ChatGPT generates responses based on a mixture of licensed and publicly available data, but it does not have full awareness of the licensing and copyright status of specific code fragments in its training data. If an open source contributor uses ChatGPT-generated code without fully understanding its origin, there could be concerns about code reuse violating licenses or intellectual property rights, even if unintentionally.

4. Inability to Engage with the Community

Contributions to open source projects often require engaging with the community — discussing design decisions, reviewing code, getting feedback, and collaborating on features. ChatGPT is not capable of engaging in meaningful, context-rich conversations about the project's vision or requirements. The iterative process of improving code based on community feedback, understanding users’ needs, and adjusting features is something ChatGPT cannot replicate effectively.

5. Security Risks

Generating code via AI tools like ChatGPT can sometimes result in insecure code patterns or vulnerabilities that may not be immediately apparent. Open source projects, especially those used in production environments, need to ensure their contributions are free from security issues. Since ChatGPT may not be aware of the latest security trends or specific project vulnerabilities, contributions generated by AI may inadvertently introduce risks.

6. Lack of Test Coverage and Documentation

Contributions to open source projects often require proper test cases, documentation, and explanations to ensure the code can be effectively maintained. ChatGPT can assist with generating test cases and documentation, but it may not fully understand the intricacies of the project or write tests and documentation at the level of detail or quality that a human contributor with domain expertise would. Without strong test coverage and clear documentation, contributions can become difficult to maintain or expand upon.

7. Overreliance on AI

Relying heavily on AI for code contributions could undermine the value of human creativity, problem-solving, and innovation. Open source projects thrive on the diverse ideas, perspectives, and collaboration of human developers. AI-generated contributions may reduce the opportunity for learning, growth, and skill development for the developers working on the project.

8. Responsibility and Accountability

Contributors to open source projects are typically responsible for the quality, licensing, and security of the code they submit. If something goes wrong with AI-generated code, it may be difficult to assign responsibility, as the AI doesn't have accountability or the ability to understand the full impact of its suggestions.


Conclusion

While ChatGPT can be a useful tool for generating code snippets, offering advice, or helping with documentation, it is not a substitute for a human developer’s thoughtful engagement with a project. When contributing to open source projects, it is essential to have a deep understanding of the project’s goals, architecture, and community, and to ensure that contributions meet high standards for quality, security, and maintainability. Thus, ChatGPT should be used cautiously, and human oversight is necessary to ensure that any contribution is appropriate and beneficial to the project.

@story645 removed the "status: needs comment/discussion" (needs consensus on next step) label Nov 20, 2024
Co-authored-by: Jody Klymak <jklymak@gmail.com>
@jklymak jklymak (Member) left a comment

This all looks good to me. Thanks for the discussion!

@timhoffm timhoffm (Member) left a comment

I'll resist bikeshedding further. 😄 This is a reasonable improvement.

@timhoffm timhoffm added this to the v3.10.0 milestone Nov 25, 2024
@timhoffm timhoffm merged commit 7293157 into matplotlib:main Nov 25, 2024
43 checks passed
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Nov 25, 2024
@tacaswell tacaswell deleted the doc/llm_rule branch November 25, 2024 15:16
tacaswell added a commit that referenced this pull request Nov 27, 2024
…335-on-v3.10.x

Backport PR #28335 on branch v3.10.x (DOC: do not posting LLM output as your own work)
@nabobalis nabobalis mentioned this pull request Dec 6, 2024
@tacaswell
Member Author

https://sethmlarson.dev/slop-security-reports <- commentary about AI-generated security reports.
