Python : Add Xpath injection query #3522

ghost · 2020-05-19T21:09:27Z

This PR adds support for detecting Xpath injection in Python.
I have included the ql files as well as the tests with this.

However, I haven't included included the *.expected files as lxml is
not stubbed into master yet. If someone could run depstubber and push
the stubs, I would be able to complete my work.

cc: @RasmusWL @tausbn

RasmusWL · 2020-05-22T16:20:34Z

Detailed answer in #3521 (comment) -- just recording it here so I don't forget myself 😅

python/ql/test/experimental/CWE-643/XpathLibTests/xpathSinks.ql

ghost · 2020-05-25T14:27:10Z

I have added the qhelp too now.

RasmusWL

A few minor things I noticed

python/ql/test/query-tests/Security/lib/lxml/etree/__init__.py

RasmusWL · 2020-05-26T09:00:21Z

I've enabled testing on both this PR and #3521. I think there's a few things in the stubbing that needs to be fixed before tests will pass.

Let me know if you want me to review the PR in-depth now, or if you're planning to make more changes before I should do so 😊

ghost · 2020-05-27T21:43:35Z

@RasmusWL I think both the PR's can be reviewed in depth now. I have made the necessary changes to the stubs. They resemble the original code more now.

One more thing, since lxml is a wrapper around a c library, most of the code is written in Cython/Pyrex. So, I don't know how we can match the cython_function_or_method signature exactly here.

This PR adds support for detecting XPATH injection in Python. I have included the ql files as well as the tests with this.

RasmusWL · 2020-05-28T09:24:33Z

@RasmusWL I think both the PR's can be reviewed in depth now. I have made the necessary changes to the stubs. They resemble the original code more now.

Right, I'll get to that soon 😊

One more thing, since lxml is a wrapper around a c library, most of the code is written in Cython/Pyrex. So, I don't know how we can match the cython_function_or_method signature exactly here.

I noticed that myself, but thanks for pointing it out 👍 Stubbing libraries like this is not a perfect solution, and in particular we still don't have a way to verify that our queries work against code using the real libraries.

RasmusWL

Here is a more in-depth review of the QL code. Overall I'm very impressed with the quality of the code, compared to how long you have been working with CodeQL 🎉

There are some points I would to get addressed, most of them are mentioned as comments directly on the code. The ones that didn't fit on the actual code are:

I'm not sure what the purpose of python/ql/test/experimental/CWE-643/XpathLibTests/ is, but I do like the python/ql/test/experimental/CWE-643/XpathLibTests/xpathSinks.ql file. Could we just move it one directory up, so it lives in python/ql/test/experimental/CWE-643/xpathSinks.ql ? 😊
In the qhelp file you mention 3 mitigation strategies (1) using parameterized XPath interface, (2) escaping the user input, and (3) precompiled XPath query. Is there any clear-cut examples of how to do this that we could include? (possibly as part of the <example> tag)
As a nitpick, think it would be slightly more straightforward if we moved python/ql/test/query-tests/Security/lib/lxml/etree/__init__.py to python/ql/test/query-tests/Security/lib/lxml/etree.py (something to keep in mind for the future as well 😉).

Lastly, I want to highlight the next steps. Once you've addressed my (QL code) concerns here, I'll run your query on all Python projects we currently have on LGTM.com and share the results here. This helps us (me) determine the false-positive rate of the query. Our contributing guidelines also requires that The query must have at least one true positive result on some revision of a real project., so having all results can be helpful in that regard (unless you have one in mind already).

P.S., I did not review #3521 explicitly, since it follows pretty much the same pattern, and needs the same changes.

python/ql/src/experimental/CWE-643/xpath.qhelp

python/ql/src/experimental/semmle/python/security/injection/Xpath.qll

ghost · 2020-06-07T21:40:01Z

@RasmusWL I have included changes you asked.

Please note that while this is similar to the XSLT injection PR, both of them tackle different issues and different impact. In the case of this one, XPath injection, the attacker only gets access to the XML datasheet. While in the XSLT injection's case, code execution is possible. XSLT also has a bunch of extra flow steps which this does not require.

As for "one valid CVE", I don't have any out the top of my head. Googling does give you a few results But this is a quite common vulnerability. Most of the other scanners such as Fortify, SonarSource support this. Hence, I think CodeQL should support this too.

ghost · 2020-06-07T21:45:27Z

One more very important thing, I think I have found a bug in the current master. The tests in this PR are successful in its current state. However, if I rebase it to the the master, the tests start failing. I have not made any changes anywhere. so one of the commits between this and the current master must have caused this. I have tried debugging the code, for some reason, the taint kinds are not playing well. For ex, see the code below which fails to return any results.

import python
import experimental.semmle.python.security.injection.Xpath

from XpathInjection::XpathInjectionSink sink, TaintKind kind
where sink.sinks(kind)
select sink, kind

However, this works,

import python
import experimental.semmle.python.security.injection.Xpath

from XpathInjection::XpathInjectionSink sink
select sink, kind

I don't have a clue what's happening here.

RasmusWL · 2020-06-08T16:16:52Z

Thanks 👍 I'll look properly at this (and the other XML PR) tomorrow.

Could you please mark the things you fixed from my previous review as "resolved"? That will make things a bit more tidy. (Also note that you can accept the suggested changes directly from the GitHub web UI).

As for "one valid CVE", I don't have any out the top of my head.

Oh, it doesn't have to be a CVE, it just has to be a true positive (a real result flagged by the query). If there is a CVE associated, that's definitely acceptable as well 😊

We'll see once I run the query on all Python projects. I've misunderstood our internal procedures a bit, so I won't be sharing the results here directly, but we'll provide some feedback based on what we see 😉

RasmusWL

if I rebase it to the the master, the tests start failing. I have not made any changes anywhere. so one of the commits between this and the current master must have caused this.

The automatic tests are run as if your PR was merged into master. Since they pass, I'm happy 👍 I don't know exactly what happened in your case, but since the tests pass, all should be good. (Maybe you have not pulled the latest master from this repo?)

Besides that, one small fix to the qhelp :) otherwise I think the rest looks ok.

python/ql/src/experimental/CWE-643/xpath.qhelp

RasmusWL

From my point of view I'm happy to merge this query into experimental. Just waiting on the evaluation on all Python projects before actually doing so.

anticomputer · 2020-06-12T16:25:30Z

@porcupineyhairs for considering of bounty award it is a requirement that the query finds a real world result. If you can map this query to any CVE or an existing project result, that would be helpful in further assessing the bounty award eligibility. We were unable to uncover a qualifying result set via LGTM. Thanks!

ghost · 2020-06-21T20:41:58Z

@anticomputer @RasmusWL Sorry for the delayed response. I was OOO last week.

I couldn't find a CVE for this per se. However I did come across a intentionally vulnerable application which has a xpath vulnerability. See https://github.com/tweksteen/django_vuln/blob/a18f5e86a441e6d0b7ea82f9dde7fb4bd94732e7/xpath_injection/views.py#L10-L15

This uses libxml2 which was not supported earlier but I have added the support now. So this should now be detected.

Also, https://github.com/andresriancho/django-moth/blob/f836e4f914821f99766623238806c1db68beb6ba/moth/views/vulnerabilities/audit/xpath.py#L32 has a similar vuln but it uses a web framework which we don't support yet. I could add this but since all of the libs are being rewritten, it makes very less sense to add a new framework now. However, if you feel I should add that, I am happy to include that too.

Will these qualify as valid detections for bounty?

ghost · 2020-06-21T20:44:42Z

Looks like the build is failing due to an unrelated issue.

RasmusWL · 2020-06-22T10:03:37Z

Thanks, going to rerun the query on all projects on LGTM 👍

anticomputer · 2020-06-22T14:20:18Z

@anticomputer @RasmusWL Sorry for the delayed response. I was OOO last week.

I couldn't find a CVE for this per se. However I did come across a intentionally vulnerable application which has a xpath vulnerability. See https://github.com/tweksteen/django_vuln/blob/a18f5e86a441e6d0b7ea82f9dde7fb4bd94732e7/xpath_injection/views.py#L10-L15

This uses libxml2 which was not supported earlier but I have added the support now. So this should now be detected.

Also, https://github.com/andresriancho/django-moth/blob/f836e4f914821f99766623238806c1db68beb6ba/moth/views/vulnerabilities/audit/xpath.py#L32 has a similar vuln but it uses a web framework which we don't support yet. I could add this but since all of the libs are being rewritten, it makes very less sense to add a new framework now. However, if you feel I should add that, I am happy to include that too.

Will these qualify as valid detections for bounty?

I believe according to the rules at least 1 finding of the vulnerability pattern in an external project (i.e. not your own) will meet the bar for consideration. Thanks!

anticomputer · 2020-06-24T14:06:46Z

@anticomputer @RasmusWL Sorry for the delayed response. I was OOO last week.
I couldn't find a CVE for this per se. However I did come across a intentionally vulnerable application which has a xpath vulnerability. See https://github.com/tweksteen/django_vuln/blob/a18f5e86a441e6d0b7ea82f9dde7fb4bd94732e7/xpath_injection/views.py#L10-L15
This uses libxml2 which was not supported earlier but I have added the support now. So this should now be detected.
Also, https://github.com/andresriancho/django-moth/blob/f836e4f914821f99766623238806c1db68beb6ba/moth/views/vulnerabilities/audit/xpath.py#L32 has a similar vuln but it uses a web framework which we don't support yet. I could add this but since all of the libs are being rewritten, it makes very less sense to add a new framework now. However, if you feel I should add that, I am happy to include that too.
Will these qualify as valid detections for bounty?

I believe according to the rules at least 1 finding of the vulnerability pattern in an external project (i.e. not your own) will meet the bar for consideration. Thanks!

@porcupineyhairs unfortunately we were unable to reproduce a full result with the query in its current form against the mentioned project, which means it is not eligible under the "all for one" bounty award from the perspective of the Security Lab evaluation (which is an independent evaluation from e.g. the CodeQL team's evaluation) and we have to be consistent in applying those criteria to be fair.

We understand that some of the complications are due to the way the CodeQL Python back end currently functions and since the CodeQL Python libs are under active rewrite we're putting Python queries temporarily out of scope for Bounty awards until the feature set there solidifies. You can track any changes for that on https://securitylab.github.com/bounties

We enjoy your submissions very much and hope to be able to reward your future work under the bounty program!

ghost · 2020-06-24T16:12:01Z

@anticomputer you are right, I tried it myself again. QL fails to detect a function as a django handler if it declared the way it is declared in the mentioned project. I got a detection initially as I had combined the files together in the way the other tests do. However, this is not the case here. With that said, I understand that this makes it ineligible for a bounty under the program.
An interesting point to note here is that the golang xpath pr was rewarded a bounty even when it didn't find a cve.

Anyways, I am fine with this getting merged even though this does not qualify for a bounty.

anticomputer · 2020-06-24T17:32:27Z

@anticomputer you are right, I tried it myself again. QL fails to detect a function as a django handler if it declared the way it is declared in the mentioned project. I got a detection initially as I had combined the files together in the way the other tests do. However, this is not the case here. With that said, I understand that this makes it ineligible for a bounty under the program.
An interesting point to note here is that the golang xpath pr was rewarded a bounty even when it didn't find a cve.

Anyways, I am fine with this getting merged even though this does not qualify for a bounty.

You are right to point out the golang xpath PR, that submission was reviewed under our old evaluation process which lacked a more strict results and FP review, we've since addressed that and now have a more comprehensive review pipeline in place.

In general most of the final bounty award decisions are based on the security lab evaluation, which aims to apply the bounty criteria as outlined https://securitylab.github.com/bounties as consistently as possible. This sometimes results in situations where the CodeQL team considers the query high quality, but it does not meet the result set requirements for bounty award from the seclab bounty award perspective .

We had a lot of internal discussion about how to best treat these cases going forward, and came to the conclusion that it's best to stick to a strict interpretation of the bounty requirements to remove as much ambiguity from the award process as possible.

Thanks for understanding in this case, and hopefully we'll be able to award future submissions :)

ghost · 2020-06-24T19:28:16Z

@anticomputer No issues. I am closing this PR now.

Feel free to reopen and merge this one irrespective of the bounty decision if you think this would be a good addition to the set.

RasmusWL · 2020-06-25T09:05:47Z

Thanks for being so understanding @porcupineyhairs. As said before, I'm happy to get this into experimental 👍

I can't merge it right now, since it needs an approving review, but since I added a commit to fix up the qhelp code I am no longer able to provide such. I'll get it merged once I persuade one of my colleagues to approve this PR 😊

ghost · 2020-06-26T15:14:43Z

This can't detect the case mentioned here. Adding a comment here to keep track of this limitation.

ghost self-requested a review as a code owner May 19, 2020 21:09

ghost mentioned this pull request May 21, 2020

Python : Add Xpath injection query github/securitylab#92

Closed

RasmusWL added the Python label May 22, 2020

RasmusWL self-assigned this May 22, 2020

ghost force-pushed the pythonXpath branch 2 times, most recently from 430ae96 to 87c1ea3 Compare May 23, 2020 23:41

ghost commented May 23, 2020

View reviewed changes

python/ql/test/experimental/CWE-643/XpathLibTests/xpathSinks.ql Outdated Show resolved Hide resolved

RasmusWL reviewed May 26, 2020

View reviewed changes

Python : Add Xpath injection query

8c5a971

This PR adds support for detecting XPATH injection in Python. I have included the ql files as well as the tests with this.

ghost force-pushed the pythonXpath branch from 241d4f8 to 8c5a971 Compare May 27, 2020 21:46

RasmusWL requested changes Jun 2, 2020

View reviewed changes

include sugestions from review

424e88d

RasmusWL requested changes Jun 10, 2020

View reviewed changes

python/ql/src/experimental/CWE-643/xpath.qhelp Outdated Show resolved Hide resolved

Python: Minor fixup of qhelp for XPath injection

ce1f0a3

RasmusWL self-requested a review June 10, 2020 14:59

RasmusWL previously approved these changes Jun 12, 2020

View reviewed changes

anticomputer mentioned this pull request Jun 12, 2020

Python : Add support for detecting XSLT Injection #3521

Closed

add support for libxml2

a519132

ghost dismissed RasmusWL’s stale review via a519132 June 21, 2020 20:31

ghost closed this Jun 24, 2020

RasmusWL reopened this Jun 25, 2020

RasmusWL self-requested a review June 25, 2020 09:00

tausbn approved these changes Jun 25, 2020

View reviewed changes

RasmusWL merged commit 0b36cd4 into github:master Jun 25, 2020

ghost mentioned this pull request Jun 26, 2020

Python: Add support for detecting XSLT Injection (#3521 revived) #3801

Merged

ghost deleted the pythonXpath branch June 26, 2020 15:14

RasmusWL mentioned this pull request Feb 28, 2022

Python : Improve Xpath Injection Query #6331

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python : Add Xpath injection query #3522

Python : Add Xpath injection query #3522

ghost commented May 19, 2020

RasmusWL commented May 22, 2020

ghost commented May 25, 2020

RasmusWL left a comment

RasmusWL commented May 26, 2020

ghost commented May 27, 2020 •

edited by ghost

Loading

RasmusWL commented May 28, 2020

RasmusWL left a comment

ghost commented Jun 7, 2020

ghost commented Jun 7, 2020

RasmusWL commented Jun 8, 2020

RasmusWL left a comment

RasmusWL left a comment

anticomputer commented Jun 12, 2020

ghost commented Jun 21, 2020

ghost commented Jun 21, 2020

RasmusWL commented Jun 22, 2020

anticomputer commented Jun 22, 2020

anticomputer commented Jun 24, 2020

ghost commented Jun 24, 2020 •

edited by ghost

Loading

anticomputer commented Jun 24, 2020

ghost commented Jun 24, 2020

RasmusWL commented Jun 25, 2020

ghost commented Jun 26, 2020

Python : Add Xpath injection query #3522

Python : Add Xpath injection query #3522

Conversation

ghost commented May 19, 2020

RasmusWL commented May 22, 2020

ghost commented May 25, 2020

RasmusWL left a comment

Choose a reason for hiding this comment

RasmusWL commented May 26, 2020

ghost commented May 27, 2020 • edited by ghost Loading

RasmusWL commented May 28, 2020

RasmusWL left a comment

Choose a reason for hiding this comment

ghost commented Jun 7, 2020

ghost commented Jun 7, 2020

RasmusWL commented Jun 8, 2020

RasmusWL left a comment

Choose a reason for hiding this comment

RasmusWL left a comment

Choose a reason for hiding this comment

anticomputer commented Jun 12, 2020

ghost commented Jun 21, 2020

ghost commented Jun 21, 2020

RasmusWL commented Jun 22, 2020

anticomputer commented Jun 22, 2020

anticomputer commented Jun 24, 2020

ghost commented Jun 24, 2020 • edited by ghost Loading

anticomputer commented Jun 24, 2020

ghost commented Jun 24, 2020

RasmusWL commented Jun 25, 2020

ghost commented Jun 26, 2020

ghost commented May 27, 2020 •

edited by ghost

Loading

ghost commented Jun 24, 2020 •

edited by ghost

Loading