-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Python : Add Xpath injection query #3522
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Detailed answer in #3521 (comment) -- just recording it here so I don't forget myself 😅 |
430ae96
to
87c1ea3
Compare
python/ql/test/experimental/CWE-643/XpathLibTests/xpathSinks.ql
Outdated
Show resolved
Hide resolved
I have added the qhelp too now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor things I noticed
I've enabled testing on both this PR and #3521. I think there's a few things in the stubbing that needs to be fixed before tests will pass. Let me know if you want me to review the PR in-depth now, or if you're planning to make more changes before I should do so 😊 |
@RasmusWL I think both the PR's can be reviewed in depth now. I have made the necessary changes to the stubs. They resemble the original code more now. One more thing, since |
This PR adds support for detecting XPATH injection in Python. I have included the ql files as well as the tests with this.
Right, I'll get to that soon 😊
I noticed that myself, but thanks for pointing it out 👍 Stubbing libraries like this is not a perfect solution, and in particular we still don't have a way to verify that our queries work against code using the real libraries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a more in-depth review of the QL code. Overall I'm very impressed with the quality of the code, compared to how long you have been working with CodeQL 🎉
There are some points I would to get addressed, most of them are mentioned as comments directly on the code. The ones that didn't fit on the actual code are:
- I'm not sure what the purpose of
python/ql/test/experimental/CWE-643/XpathLibTests/
is, but I do like thepython/ql/test/experimental/CWE-643/XpathLibTests/xpathSinks.ql
file. Could we just move it one directory up, so it lives inpython/ql/test/experimental/CWE-643/xpathSinks.ql
? 😊 - In the
qhelp
file you mention 3 mitigation strategies (1) using parameterized XPath interface, (2) escaping the user input, and (3) precompiled XPath query. Is there any clear-cut examples of how to do this that we could include? (possibly as part of the<example>
tag) - As a nitpick, think it would be slightly more straightforward if we moved
python/ql/test/query-tests/Security/lib/lxml/etree/__init__.py
topython/ql/test/query-tests/Security/lib/lxml/etree.py
(something to keep in mind for the future as well 😉).
Lastly, I want to highlight the next steps. Once you've addressed my (QL code) concerns here, I'll run your query on all Python projects we currently have on LGTM.com and share the results here. This helps us (me) determine the false-positive rate of the query. Our contributing guidelines also requires that The query must have at least one true positive result on some revision of a real project., so having all results can be helpful in that regard (unless you have one in mind already).
P.S., I did not review #3521 explicitly, since it follows pretty much the same pattern, and needs the same changes.
python/ql/src/experimental/semmle/python/security/injection/Xpath.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/semmle/python/security/injection/Xpath.qll
Outdated
Show resolved
Hide resolved
python/ql/src/experimental/semmle/python/security/injection/Xpath.qll
Outdated
Show resolved
Hide resolved
@RasmusWL I have included changes you asked. Please note that while this is similar to the XSLT injection PR, both of them tackle different issues and different impact. In the case of this one, XPath injection, the attacker only gets access to the XML datasheet. While in the XSLT injection's case, code execution is possible. XSLT also has a bunch of extra flow steps which this does not require. As for "one valid CVE", I don't have any out the top of my head. Googling does give you a few results But this is a quite common vulnerability. Most of the other scanners such as Fortify, SonarSource support this. Hence, I think CodeQL should support this too. |
One more very important thing, I think I have found a bug in the current master. The tests in this PR are successful in its current state. However, if I rebase it to the the master, the tests start failing. I have not made any changes anywhere. so one of the commits between this and the current master must have caused this. I have tried debugging the code, for some reason, the taint kinds are not playing well. For ex, see the code below which fails to return any results. import python
import experimental.semmle.python.security.injection.Xpath
from XpathInjection::XpathInjectionSink sink, TaintKind kind
where sink.sinks(kind)
select sink, kind However, this works, import python
import experimental.semmle.python.security.injection.Xpath
from XpathInjection::XpathInjectionSink sink
select sink, kind I don't have a clue what's happening here. |
Thanks 👍 I'll look properly at this (and the other XML PR) tomorrow. Could you please mark the things you fixed from my previous review as "resolved"? That will make things a bit more tidy. (Also note that you can accept the suggested changes directly from the GitHub web UI).
Oh, it doesn't have to be a CVE, it just has to be a true positive (a real result flagged by the query). If there is a CVE associated, that's definitely acceptable as well 😊 We'll see once I run the query on all Python projects. I've misunderstood our internal procedures a bit, so I won't be sharing the results here directly, but we'll provide some feedback based on what we see 😉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I rebase it to the the master, the tests start failing. I have not made any changes anywhere. so one of the commits between this and the current master must have caused this.
The automatic tests are run as if your PR was merged into master. Since they pass, I'm happy 👍 I don't know exactly what happened in your case, but since the tests pass, all should be good. (Maybe you have not pulled the latest master from this repo?)
Besides that, one small fix to the qhelp :) otherwise I think the rest looks ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my point of view I'm happy to merge this query into experimental. Just waiting on the evaluation on all Python projects before actually doing so.
@porcupineyhairs for considering of bounty award it is a requirement that the query finds a real world result. If you can map this query to any CVE or an existing project result, that would be helpful in further assessing the bounty award eligibility. We were unable to uncover a qualifying result set via LGTM. Thanks! |
@anticomputer @RasmusWL Sorry for the delayed response. I was OOO last week. I couldn't find a CVE for this per se. However I did come across a intentionally vulnerable application which has a xpath vulnerability. See https://github.com/tweksteen/django_vuln/blob/a18f5e86a441e6d0b7ea82f9dde7fb4bd94732e7/xpath_injection/views.py#L10-L15 This uses libxml2 which was not supported earlier but I have added the support now. So this should now be detected. Also, https://github.com/andresriancho/django-moth/blob/f836e4f914821f99766623238806c1db68beb6ba/moth/views/vulnerabilities/audit/xpath.py#L32 has a similar vuln but it uses a web framework which we don't support yet. I could add this but since all of the libs are being rewritten, it makes very less sense to add a new framework now. However, if you feel I should add that, I am happy to include that too. Will these qualify as valid detections for bounty? |
Looks like the build is failing due to an unrelated issue. |
Thanks, going to rerun the query on all projects on LGTM 👍 |
I believe according to the rules at least 1 finding of the vulnerability pattern in an external project (i.e. not your own) will meet the bar for consideration. Thanks! |
@porcupineyhairs unfortunately we were unable to reproduce a full result with the query in its current form against the mentioned project, which means it is not eligible under the "all for one" bounty award from the perspective of the Security Lab evaluation (which is an independent evaluation from e.g. the CodeQL team's evaluation) and we have to be consistent in applying those criteria to be fair. We understand that some of the complications are due to the way the CodeQL Python back end currently functions and since the CodeQL Python libs are under active rewrite we're putting Python queries temporarily out of scope for Bounty awards until the feature set there solidifies. You can track any changes for that on https://securitylab.github.com/bounties We enjoy your submissions very much and hope to be able to reward your future work under the bounty program! |
@anticomputer you are right, I tried it myself again. QL fails to detect a function as a django handler if it declared the way it is declared in the mentioned project. I got a detection initially as I had combined the files together in the way the other tests do. However, this is not the case here. With that said, I understand that this makes it ineligible for a bounty under the program. Anyways, I am fine with this getting merged even though this does not qualify for a bounty. |
You are right to point out the golang xpath PR, that submission was reviewed under our old evaluation process which lacked a more strict results and FP review, we've since addressed that and now have a more comprehensive review pipeline in place. In general most of the final bounty award decisions are based on the security lab evaluation, which aims to apply the bounty criteria as outlined https://securitylab.github.com/bounties as consistently as possible. This sometimes results in situations where the CodeQL team considers the query high quality, but it does not meet the result set requirements for bounty award from the seclab bounty award perspective . We had a lot of internal discussion about how to best treat these cases going forward, and came to the conclusion that it's best to stick to a strict interpretation of the bounty requirements to remove as much ambiguity from the award process as possible. Thanks for understanding in this case, and hopefully we'll be able to award future submissions :) |
@anticomputer No issues. I am closing this PR now. Feel free to reopen and merge this one irrespective of the bounty decision if you think this would be a good addition to the set. |
Thanks for being so understanding @porcupineyhairs. As said before, I'm happy to get this into experimental 👍 I can't merge it right now, since it needs an approving review, but since I added a commit to fix up the qhelp code I am no longer able to provide such. I'll get it merged once I persuade one of my colleagues to approve this PR 😊 |
This can't detect the case mentioned here. Adding a comment here to keep track of this limitation. |
This PR adds support for detecting Xpath injection in Python.
I have included the
ql
files as well as the tests with this.However, I haven't included included the
*.expected
files as lxml isnot stubbed into master yet. If someone could run
depstubber
and pushthe stubs, I would be able to complete my work.
cc: @RasmusWL @tausbn