Skip to content

Python : Add Xpath injection query #3522

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 25, 2020
Merged

Conversation

ghost
Copy link

@ghost ghost commented May 19, 2020

This PR adds support for detecting Xpath injection in Python.
I have included the ql files as well as the tests with this.

However, I haven't included included the *.expected files as lxml is
not stubbed into master yet. If someone could run depstubber and push
the stubs, I would be able to complete my work.

cc: @RasmusWL @tausbn

@ghost ghost self-requested a review as a code owner May 19, 2020 21:09
@RasmusWL RasmusWL self-assigned this May 22, 2020
@RasmusWL
Copy link
Member

Detailed answer in #3521 (comment) -- just recording it here so I don't forget myself 😅

@ghost ghost force-pushed the pythonXpath branch 2 times, most recently from 430ae96 to 87c1ea3 Compare May 23, 2020 23:41
@ghost
Copy link
Author

ghost commented May 25, 2020

I have added the qhelp too now.

Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor things I noticed

@RasmusWL
Copy link
Member

I've enabled testing on both this PR and #3521. I think there's a few things in the stubbing that needs to be fixed before tests will pass.

Let me know if you want me to review the PR in-depth now, or if you're planning to make more changes before I should do so 😊

@ghost
Copy link
Author

ghost commented May 27, 2020

@RasmusWL I think both the PR's can be reviewed in depth now. I have made the necessary changes to the stubs. They resemble the original code more now.

One more thing, since lxml is a wrapper around a c library, most of the code is written in Cython/Pyrex. So, I don't know how we can match the cython_function_or_method signature exactly here.

This PR adds support for detecting XPATH injection in Python.
I have included the ql files as well as the tests with this.
@ghost ghost force-pushed the pythonXpath branch from 241d4f8 to 8c5a971 Compare May 27, 2020 21:46
@RasmusWL
Copy link
Member

@RasmusWL I think both the PR's can be reviewed in depth now. I have made the necessary changes to the stubs. They resemble the original code more now.

Right, I'll get to that soon 😊

One more thing, since lxml is a wrapper around a c library, most of the code is written in Cython/Pyrex. So, I don't know how we can match the cython_function_or_method signature exactly here.

I noticed that myself, but thanks for pointing it out 👍 Stubbing libraries like this is not a perfect solution, and in particular we still don't have a way to verify that our queries work against code using the real libraries.

Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a more in-depth review of the QL code. Overall I'm very impressed with the quality of the code, compared to how long you have been working with CodeQL 🎉

There are some points I would to get addressed, most of them are mentioned as comments directly on the code. The ones that didn't fit on the actual code are:

  • I'm not sure what the purpose of python/ql/test/experimental/CWE-643/XpathLibTests/ is, but I do like the python/ql/test/experimental/CWE-643/XpathLibTests/xpathSinks.ql file. Could we just move it one directory up, so it lives in python/ql/test/experimental/CWE-643/xpathSinks.ql ? 😊
  • In the qhelp file you mention 3 mitigation strategies (1) using parameterized XPath interface, (2) escaping the user input, and (3) precompiled XPath query. Is there any clear-cut examples of how to do this that we could include? (possibly as part of the <example> tag)
  • As a nitpick, think it would be slightly more straightforward if we moved python/ql/test/query-tests/Security/lib/lxml/etree/__init__.py to python/ql/test/query-tests/Security/lib/lxml/etree.py (something to keep in mind for the future as well 😉).

Lastly, I want to highlight the next steps. Once you've addressed my (QL code) concerns here, I'll run your query on all Python projects we currently have on LGTM.com and share the results here. This helps us (me) determine the false-positive rate of the query. Our contributing guidelines also requires that The query must have at least one true positive result on some revision of a real project., so having all results can be helpful in that regard (unless you have one in mind already).

P.S., I did not review #3521 explicitly, since it follows pretty much the same pattern, and needs the same changes.

@ghost
Copy link
Author

ghost commented Jun 7, 2020

@RasmusWL I have included changes you asked.

Please note that while this is similar to the XSLT injection PR, both of them tackle different issues and different impact. In the case of this one, XPath injection, the attacker only gets access to the XML datasheet. While in the XSLT injection's case, code execution is possible. XSLT also has a bunch of extra flow steps which this does not require.

As for "one valid CVE", I don't have any out the top of my head. Googling does give you a few results But this is a quite common vulnerability. Most of the other scanners such as Fortify, SonarSource support this. Hence, I think CodeQL should support this too.

@ghost
Copy link
Author

ghost commented Jun 7, 2020

One more very important thing, I think I have found a bug in the current master. The tests in this PR are successful in its current state. However, if I rebase it to the the master, the tests start failing. I have not made any changes anywhere. so one of the commits between this and the current master must have caused this. I have tried debugging the code, for some reason, the taint kinds are not playing well. For ex, see the code below which fails to return any results.

import python
import experimental.semmle.python.security.injection.Xpath

from XpathInjection::XpathInjectionSink sink, TaintKind kind
where sink.sinks(kind)
select sink, kind

However, this works,

import python
import experimental.semmle.python.security.injection.Xpath

from XpathInjection::XpathInjectionSink sink
select sink, kind

I don't have a clue what's happening here.

@RasmusWL
Copy link
Member

RasmusWL commented Jun 8, 2020

Thanks 👍 I'll look properly at this (and the other XML PR) tomorrow.

Could you please mark the things you fixed from my previous review as "resolved"? That will make things a bit more tidy. (Also note that you can accept the suggested changes directly from the GitHub web UI).

As for "one valid CVE", I don't have any out the top of my head.

Oh, it doesn't have to be a CVE, it just has to be a true positive (a real result flagged by the query). If there is a CVE associated, that's definitely acceptable as well 😊

We'll see once I run the query on all Python projects. I've misunderstood our internal procedures a bit, so I won't be sharing the results here directly, but we'll provide some feedback based on what we see 😉

Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I rebase it to the the master, the tests start failing. I have not made any changes anywhere. so one of the commits between this and the current master must have caused this.

The automatic tests are run as if your PR was merged into master. Since they pass, I'm happy 👍 I don't know exactly what happened in your case, but since the tests pass, all should be good. (Maybe you have not pulled the latest master from this repo?)

Besides that, one small fix to the qhelp :) otherwise I think the rest looks ok.

@RasmusWL RasmusWL self-requested a review June 10, 2020 14:59
RasmusWL
RasmusWL previously approved these changes Jun 12, 2020
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my point of view I'm happy to merge this query into experimental. Just waiting on the evaluation on all Python projects before actually doing so.

@anticomputer
Copy link

@porcupineyhairs for considering of bounty award it is a requirement that the query finds a real world result. If you can map this query to any CVE or an existing project result, that would be helpful in further assessing the bounty award eligibility. We were unable to uncover a qualifying result set via LGTM. Thanks!

@ghost
Copy link
Author

ghost commented Jun 21, 2020

@anticomputer @RasmusWL Sorry for the delayed response. I was OOO last week.

I couldn't find a CVE for this per se. However I did come across a intentionally vulnerable application which has a xpath vulnerability. See https://github.com/tweksteen/django_vuln/blob/a18f5e86a441e6d0b7ea82f9dde7fb4bd94732e7/xpath_injection/views.py#L10-L15

This uses libxml2 which was not supported earlier but I have added the support now. So this should now be detected.

Also, https://github.com/andresriancho/django-moth/blob/f836e4f914821f99766623238806c1db68beb6ba/moth/views/vulnerabilities/audit/xpath.py#L32 has a similar vuln but it uses a web framework which we don't support yet. I could add this but since all of the libs are being rewritten, it makes very less sense to add a new framework now. However, if you feel I should add that, I am happy to include that too.

Will these qualify as valid detections for bounty?

@ghost
Copy link
Author

ghost commented Jun 21, 2020

Looks like the build is failing due to an unrelated issue.

@RasmusWL
Copy link
Member

Thanks, going to rerun the query on all projects on LGTM 👍

@anticomputer
Copy link

@anticomputer @RasmusWL Sorry for the delayed response. I was OOO last week.

I couldn't find a CVE for this per se. However I did come across a intentionally vulnerable application which has a xpath vulnerability. See https://github.com/tweksteen/django_vuln/blob/a18f5e86a441e6d0b7ea82f9dde7fb4bd94732e7/xpath_injection/views.py#L10-L15

This uses libxml2 which was not supported earlier but I have added the support now. So this should now be detected.

Also, https://github.com/andresriancho/django-moth/blob/f836e4f914821f99766623238806c1db68beb6ba/moth/views/vulnerabilities/audit/xpath.py#L32 has a similar vuln but it uses a web framework which we don't support yet. I could add this but since all of the libs are being rewritten, it makes very less sense to add a new framework now. However, if you feel I should add that, I am happy to include that too.

Will these qualify as valid detections for bounty?

I believe according to the rules at least 1 finding of the vulnerability pattern in an external project (i.e. not your own) will meet the bar for consideration. Thanks!

@anticomputer
Copy link

@anticomputer @RasmusWL Sorry for the delayed response. I was OOO last week.
I couldn't find a CVE for this per se. However I did come across a intentionally vulnerable application which has a xpath vulnerability. See https://github.com/tweksteen/django_vuln/blob/a18f5e86a441e6d0b7ea82f9dde7fb4bd94732e7/xpath_injection/views.py#L10-L15
This uses libxml2 which was not supported earlier but I have added the support now. So this should now be detected.
Also, https://github.com/andresriancho/django-moth/blob/f836e4f914821f99766623238806c1db68beb6ba/moth/views/vulnerabilities/audit/xpath.py#L32 has a similar vuln but it uses a web framework which we don't support yet. I could add this but since all of the libs are being rewritten, it makes very less sense to add a new framework now. However, if you feel I should add that, I am happy to include that too.
Will these qualify as valid detections for bounty?

I believe according to the rules at least 1 finding of the vulnerability pattern in an external project (i.e. not your own) will meet the bar for consideration. Thanks!

@porcupineyhairs unfortunately we were unable to reproduce a full result with the query in its current form against the mentioned project, which means it is not eligible under the "all for one" bounty award from the perspective of the Security Lab evaluation (which is an independent evaluation from e.g. the CodeQL team's evaluation) and we have to be consistent in applying those criteria to be fair.

We understand that some of the complications are due to the way the CodeQL Python back end currently functions and since the CodeQL Python libs are under active rewrite we're putting Python queries temporarily out of scope for Bounty awards until the feature set there solidifies. You can track any changes for that on https://securitylab.github.com/bounties

We enjoy your submissions very much and hope to be able to reward your future work under the bounty program!

@ghost
Copy link
Author

ghost commented Jun 24, 2020

@anticomputer you are right, I tried it myself again. QL fails to detect a function as a django handler if it declared the way it is declared in the mentioned project. I got a detection initially as I had combined the files together in the way the other tests do. However, this is not the case here. With that said, I understand that this makes it ineligible for a bounty under the program.
An interesting point to note here is that the golang xpath pr was rewarded a bounty even when it didn't find a cve.

Anyways, I am fine with this getting merged even though this does not qualify for a bounty.

@anticomputer
Copy link

@anticomputer you are right, I tried it myself again. QL fails to detect a function as a django handler if it declared the way it is declared in the mentioned project. I got a detection initially as I had combined the files together in the way the other tests do. However, this is not the case here. With that said, I understand that this makes it ineligible for a bounty under the program.
An interesting point to note here is that the golang xpath pr was rewarded a bounty even when it didn't find a cve.

Anyways, I am fine with this getting merged even though this does not qualify for a bounty.

You are right to point out the golang xpath PR, that submission was reviewed under our old evaluation process which lacked a more strict results and FP review, we've since addressed that and now have a more comprehensive review pipeline in place.

In general most of the final bounty award decisions are based on the security lab evaluation, which aims to apply the bounty criteria as outlined https://securitylab.github.com/bounties as consistently as possible. This sometimes results in situations where the CodeQL team considers the query high quality, but it does not meet the result set requirements for bounty award from the seclab bounty award perspective .

We had a lot of internal discussion about how to best treat these cases going forward, and came to the conclusion that it's best to stick to a strict interpretation of the bounty requirements to remove as much ambiguity from the award process as possible.

Thanks for understanding in this case, and hopefully we'll be able to award future submissions :)

@ghost
Copy link
Author

ghost commented Jun 24, 2020

@anticomputer No issues. I am closing this PR now.

Feel free to reopen and merge this one irrespective of the bounty decision if you think this would be a good addition to the set.

@ghost ghost closed this Jun 24, 2020
@RasmusWL RasmusWL reopened this Jun 25, 2020
@RasmusWL RasmusWL self-requested a review June 25, 2020 09:00
@RasmusWL
Copy link
Member

Thanks for being so understanding @porcupineyhairs. As said before, I'm happy to get this into experimental 👍

I can't merge it right now, since it needs an approving review, but since I added a commit to fix up the qhelp code I am no longer able to provide such. I'll get it merged once I persuade one of my colleagues to approve this PR 😊

@RasmusWL RasmusWL merged commit 0b36cd4 into github:master Jun 25, 2020
@ghost ghost deleted the pythonXpath branch June 26, 2020 15:14
@ghost
Copy link
Author

ghost commented Jun 26, 2020

This can't detect the case mentioned here. Adding a comment here to keep track of this limitation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants