-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MNT Add code scanning workflow #28312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious to see how this evolves. Thanks @lesteve
.github/workflows/codeql.yml
Outdated
branches: [ "main" ] | ||
pull_request: | ||
branches: [ "main" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering if we should also do release branches, or at least the ones we're doing from now on?
.github/workflows/codeql.yml
Outdated
# - https://gh.io/supported-runners-and-hardware-resources | ||
# - https://gh.io/using-larger-runners | ||
# Consider using larger runners for possible analysis time improvements. | ||
runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
swift? O_o
I removed the explicit swift mentions from the automatically generated workflow and also added I don't really know wheter this is likely to highlight many times the same issue in different In case that happens and it's too noisy, we could as you mention start with only |
strategy: | ||
fail-fast: false | ||
matrix: | ||
language: [ 'javascript-typescript', 'python' ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to understand if this line means that the workflow will scan JS and Python code only?
I am somewhat interested about getting reports about our Python code, but also somewhat not excited because I assume it'll lead to false positives and "yes I know you think this is dangerous but we are adults and want to do it".
What I would be very excited about is something that vets/scans/checks the workflows and CI configs. At least I think this is where you could manage to sneak things past us or we just misconfigure things to make life easier for attackers. Do you know if something like this exists? (I think it doesn't :()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I don't understand the reason why, but I think javascript-typescript actually also checks CI workflows ... see for example the PR in my fork I linked above lesteve#31 where malicious workflows are detected in a PR and also this Github blog post:
The CodeQL workflow scanning queries are (currently) only included in the query suite for JavaScript [...] If the main programming language of your project is something else, such as Python then you need to [...] add JavaScript as an additional language
I agree with you on the Python code, for example I never found LGTM very useful when it was enabled ... and I would admit that my tolerance to false positives is quite low.
The fact that there is no security report on the Python code (tested on my fork) is a good sign for me that there will not be that many false positives. We can always disable python code scanning if we agree it is too much on the noisy side.
Also I haven't played too much with the security report but it looks like you can tag a defect as false positive, hopefully that means that it never shows up again, see this for more details.
Whoops, I wrote that code. Always knew that I did not how to code in javascript :) |
@glemaitre @betatim should we merge this then? We can always revert and disable if it gets too noisy. |
Merged. Let's see what happens. Having something keep an eye on our workflow configs seems useful |
Great, if you want you can now look at security reports in https://github.com/scikit-learn/scikit-learn/security/code-scanning. |
Folllowing #28151, I started to read a few security-related blog posts, for example https://securitylab.github.com/research/github-actions-untrusted-input/.
This PR would add Code scanning to our repo and would potentially detect if we try to add an insecure workflow (amongst other things).
I set it up on my fork, this shows it does protect against a user that creates a PR to add unsafe workflows like the above Github blog post:
lesteve#31
I created the workflow, clicking on the right buttons on Github. The only tweak I did is to disable code scanning for C/C++. By default C/C++ was enabled (probably because of the svm code), except that it was failing. Probably it is possible to enable it but then you need to give a custom build command. I don't think it is worth it right now.
By the way it looks like one of our js file triggers a Medium security alert (screenshot from my fork setup):
