Skip to content

Python: CWE-943 - Add NoSQL injection query #5612

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 81 commits into from
Aug 24, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
81 commits
Select commit Hold shift + click to select a range
517a920
PR init
jorgectf Mar 30, 2021
aea7546
Add Concepts
jorgectf Mar 30, 2021
bd5ff01
PyMongo and Mongoengine sinks
jorgectf Mar 30, 2021
d856f16
Adapt query configs and custom classes
jorgectf Mar 30, 2021
4579132
Add left tests
jorgectf Mar 30, 2021
ccd57be
Fix imports
jorgectf Mar 30, 2021
01f9d4a
Fix MongoEngine Sink
jorgectf Mar 31, 2021
7a4dc46
Fix Sinks
jorgectf Mar 31, 2021
5a1dc48
Fix Mongoengine test
jorgectf Mar 31, 2021
017a826
Remove unused class variables
jorgectf Mar 31, 2021
f0a50eb
Polish up configs
jorgectf Mar 31, 2021
3a47a45
Attempt to apply TaintTracking2
jorgectf Mar 31, 2021
c8740a2
Update naming
jorgectf Apr 1, 2021
f980d06
Fix taint configs
jorgectf Apr 1, 2021
15e176a
Polish query select
jorgectf Apr 1, 2021
9072d19
Update qhelp file
Apr 5, 2021
be9a3a9
Add relevant PyMongo sink methods
Apr 5, 2021
80216f6
Rename classes
Apr 5, 2021
3f0c758
Add required __raw__ keyword
mrthankyou Apr 5, 2021
759fa2c
Update query to search for more pymongo sink methods
mrthankyou Apr 6, 2021
6ade120
Add check for mongoengine raw queries
mrthankyou Apr 6, 2021
ac31260
Made grammar changes
mrthankyou Apr 6, 2021
520e65e
Remove unnecessary example code
mrthankyou Apr 6, 2021
dc274ec
Improve sentence structure and grammar
mrthankyou Apr 6, 2021
4e98348
Remove comment
mrthankyou Apr 6, 2021
719c30b
Fix file name and adjust where the test points to
mrthankyou Apr 7, 2021
83f28bf
Catch any keyword argument passed to MongoEngine's objects method
mrthankyou Apr 7, 2021
0e51dbe
Polish tests
jorgectf Apr 9, 2021
a6b3aef
Add flask_mongoengine sink
jorgectf Apr 9, 2021
fa5869a
Polish qhelp and examples
jorgectf Apr 9, 2021
983af32
Polish qhelp examples
jorgectf Apr 9, 2021
208b53e
Polish query file
jorgectf Apr 9, 2021
1663857
Polish Calls naming
jorgectf Apr 9, 2021
4615927
Fix flask_mongoengine Call
jorgectf Apr 9, 2021
5d25a27
Add .expected
jorgectf Apr 9, 2021
bbd3552
Rename predicate to getQuery
mrthankyou Apr 20, 2021
7773c53
Replace any(string) with _ wildcard
mrthankyou Apr 20, 2021
62f3e8d
Add sanitizer for ObjectId
mrthankyou Apr 26, 2021
d85b1a2
Replace recursive getAMember*() method
mrthankyou Apr 28, 2021
56dc4d8
Add comment on BsonObjectIdCall
mrthankyou May 4, 2021
c4a67e5
Rewrite query to take into account MongoClient and subscript expressions
mrthankyou May 4, 2021
1d36aa6
Add additional querying for mongoengine Document subclassing
mrthankyou May 7, 2021
7693d69
Add additional query tests
mrthankyou May 7, 2021
8f8eff2
Fix comment description of predicate
mrthankyou May 7, 2021
9a44020
Rename StdLib.qll file to NoSQL.qll file
mrthankyou May 7, 2021
83f0870
Update file path of module
mrthankyou May 7, 2021
aa24c68
Add back accidentally deleted StdLib.qll file
mrthankyou May 7, 2021
65c6f19
Rename mongoengine-flask-db-document-subclass
jorgectf May 7, 2021
e7bdc73
Update .expected
jorgectf May 7, 2021
0fc044d
Checkout Stdlib.qll
jorgectf May 7, 2021
67bc576
Delete StdLib.qll
mrthankyou May 7, 2021
07c3e22
Fix method name to match flask_mongoengine library
mrthankyou May 9, 2021
0238e51
Add checks for EmbeddedDocument classes
mrthankyou May 9, 2021
3e25b14
Update NoSQLInjection.expected
mrthankyou May 12, 2021
c948970
resolve merge conflicts
jorgectf Jun 14, 2021
6bed859
Match sanitizer inputs' naming
jorgectf Jun 15, 2021
e61cf9a
Simplify tests
jorgectf Jun 15, 2021
5123b8f
Update .expected
jorgectf Jun 15, 2021
81505fb
Normalize tests
jorgectf Jun 16, 2021
5c7229c
Optimize Type Tracking stuff
jorgectf Jun 16, 2021
8527ccc
Update .expected
jorgectf Jun 16, 2021
b8e619a
Extend qhelp references
jorgectf Jun 17, 2021
8e3d5ff
Rename mongoclient tests
jorgectf Jun 17, 2021
7e6032f
Port to Decoding
jorgectf Jun 17, 2021
4e74003
Polish Concepts documentation
jorgectf Jun 17, 2021
eb16018
Update .expected
jorgectf Jun 17, 2021
5971142
Python: Fix qhelp for NoSQL injection
RasmusWL Jun 28, 2021
318694c
Python: Don't rely on `d = d.getOutput()` for `Decoding`
RasmusWL Jun 28, 2021
a5009ef
Merge pull request #5 from RasmusWL/nosql-fixes
jorgectf Jun 28, 2021
0ca4f24
Merge tests and update `.expected`
jorgectf Jun 28, 2021
3fd1129
Delete trivial tests
jorgectf Jun 28, 2021
68c6831
Polish documentation, `mongoCollectionMethod()` and update `.expected`
jorgectf Jun 28, 2021
3504408
Apply suggestions from code review
jorgectf Jun 28, 2021
51395d1
Move `xmltodict` to its own file under `frameworks/`
jorgectf Jun 28, 2021
0819090
Fix qldocs typo
jorgectf Jun 29, 2021
9a8d1f8
Take back non-trivial tests
jorgectf Jun 29, 2021
621a810
Update `.expected`
jorgectf Jun 29, 2021
e02a63a
Delete trivial `*_good.py` tests
jorgectf Jun 29, 2021
51a6140
Change variable name to correct sanitized input variable
mrthankyou Jul 13, 2021
6f09b95
Update `.expected`
jorgectf Jul 15, 2021
e6ce10b
Merge remote-tracking branch 'origin/main' into jty/python/nosqlInjec…
jorgectf Aug 10, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>

<overview>
<p>
Passing user-controlled sources into NoSQL queries can result in a NoSQL injection flaw.
This tainted NoSQL query containing a user-controlled source can then execute a malicious query in a NoSQL database such as MongoDB.
In order for the user-controlled source to taint the NoSQL query, the user-controller source must be converted into a Python object using something like <code>json.loads</code> or <code>xmltodict.parse</code>.
</p>
<p>
Because a user-controlled source is passed into the query, the malicious user can have complete control over the query itself.
When the tainted query is executed, the malicious user can commit malicious actions such as bypassing role restrictions or accessing and modifying restricted data in the NoSQL database.
</p>
</overview>

<recommendation>
<p>
NoSQL injections can be prevented by escaping user-input's special characters that are passed into the NoSQL query from the user-supplied source.
Alternatively, using a sanitize library such as MongoSanitizer will ensure that user-supplied sources can not act as a malicious query.
</p>
</recommendation>

<example>
<p>In the example below, the user-supplied source is passed to a MongoDB function that queries the MongoDB database.</p>
<sample src="examples/NoSQLInjection-bad.py" />
<p> This can be fixed by using a sanitizer library like MongoSanitizer as shown in this annotated code version below.</p>
<sample src="examples/NoSQLInjection-good.py" />
</example>

<references>
<li>Mongoengine: <a href="http://mongoengine.org/">Documentation</a>.</li>
<li>Flask-Mongoengine: <a href="http://docs.mongoengine.org/projects/flask-mongoengine/en/latest/">Documentation</a>.</li>
<li>PyMongo: <a href="https://pypi.org/project/pymongo/">Documentation</a>.</li>
<li>Flask-PyMongo: <a href="https://flask-pymongo.readthedocs.io/en/latest/">Documentation</a>.</li>
<li>OWASP: <a href="https://owasp.org/www-pdf-archive/GOD16-NOSQL.pdf">NoSQL Injection</a>.</li>
<li>Security Stack Exchange Discussion: <a href="https://security.stackexchange.com/questions/83231/mongodb-nosql-injection-in-python-code">Question 83231</a>.</li>
</references>
</qhelp>
19 changes: 19 additions & 0 deletions python/ql/src/experimental/Security/CWE-943/NoSQLInjection.ql
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
/**
* @name NoSQL Injection
* @description Building a NoSQL query from user-controlled sources is vulnerable to insertion of
* malicious NoSQL code by the user.
* @kind path-problem
* @problem.severity error
* @id py/nosql-injection
* @tags experimental
* security
* external/cwe/cwe-943
*/

import python
import experimental.semmle.python.security.injection.NoSQLInjection

from CustomPathNode source, CustomPathNode sink
where noSQLInjectionFlow(source, sink)
select sink, source, sink, "$@ NoSQL query contains an unsanitized $@", sink, "This", source,
"user-provided value"
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
from flask import Flask, request
from flask_pymongo import PyMongo
import json

mongo = PyMongo(app)


@app.route("/")
def home_page():
unsanitized_search = request.args['search']
json_search = json.loads(unsanitized_search)

result = mongo.db.user.find({'name': json_search})
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
from flask import Flask, request
from flask_pymongo import PyMongo
from mongosanitizer.sanitizer import sanitize
import json

mongo = PyMongo(app)


@app.route("/")
def home_page():
unsafe_search = request.args['search']
json_search = json.loads(unsafe_search)
safe_search = sanitize(unsanitized_search)

result = client.db.collection.find_one({'data': safe_search})
58 changes: 58 additions & 0 deletions python/ql/src/experimental/semmle/python/Concepts.qll
Original file line number Diff line number Diff line change
Expand Up @@ -209,3 +209,61 @@ class SQLEscape extends DataFlow::Node {
*/
DataFlow::Node getAnInput() { result = range.getAnInput() }
}

/** Provides a class for modeling NoSQL execution APIs. */
module NoSQLQuery {
/**
* A data-flow node that executes NoSQL queries.
*
* Extend this class to model new APIs. If you want to refine existing API models,
* extend `NoSQLQuery` instead.
*/
abstract class Range extends DataFlow::Node {
/** Gets the argument that specifies the NoSQL query to be executed. */
abstract DataFlow::Node getQuery();
}
}

/**
* A data-flow node that executes NoSQL queries.
*
* Extend this class to refine existing API models. If you want to model new APIs,
* extend `NoSQLQuery::Range` instead.
*/
class NoSQLQuery extends DataFlow::Node {
NoSQLQuery::Range range;

NoSQLQuery() { this = range }

/** Gets the argument that specifies the NoSQL query to be executed. */
DataFlow::Node getQuery() { result = range.getQuery() }
}

/** Provides classes for modeling NoSQL sanitization-related APIs. */
module NoSQLSanitizer {
/**
* A data-flow node that collects functions sanitizing NoSQL queries.
*
* Extend this class to model new APIs. If you want to refine existing API models,
* extend `NoSQLSanitizer` instead.
*/
abstract class Range extends DataFlow::Node {
/** Gets the argument that specifies the NoSQL query to be sanitized. */
abstract DataFlow::Node getAnInput();
Copy link
Contributor Author

@mrthankyou mrthankyou Jun 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better name! :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not my idea 😆 #5443 (review)

}
}

/**
* A data-flow node that collects functions sanitizing NoSQL queries.
*
* Extend this class to model new APIs. If you want to refine existing API models,
* extend `NoSQLSanitizer::Range` instead.
*/
class NoSQLSanitizer extends DataFlow::Node {
NoSQLSanitizer::Range range;

NoSQLSanitizer() { this = range }

/** Gets the argument that specifies the NoSQL query to be sanitized. */
DataFlow::Node getAnInput() { result = range.getAnInput() }
}
1 change: 1 addition & 0 deletions python/ql/src/experimental/semmle/python/Frameworks.qll
Original file line number Diff line number Diff line change
Expand Up @@ -4,3 +4,4 @@

private import experimental.semmle.python.frameworks.Stdlib
private import experimental.semmle.python.frameworks.LDAP
private import experimental.semmle.python.frameworks.NoSQL
215 changes: 215 additions & 0 deletions python/ql/src/experimental/semmle/python/frameworks/NoSQL.qll
Original file line number Diff line number Diff line change
@@ -0,0 +1,215 @@
/**
* Provides classes modeling security-relevant aspects of the standard libraries.
* Note: some modeling is done internally in the dataflow/taint tracking implementation.
*/

private import python
private import semmle.python.dataflow.new.DataFlow
private import semmle.python.dataflow.new.TaintTracking
private import semmle.python.dataflow.new.RemoteFlowSources
private import experimental.semmle.python.Concepts
private import semmle.python.ApiGraphs

private module NoSQL {
// API Nodes returning `Mongo` instances.
/** Gets a reference to `pymongo.MongoClient` */
private API::Node pyMongo() {
result = API::moduleImport("pymongo").getMember("MongoClient").getReturn()
}

/** Gets a reference to `flask_pymongo.PyMongo` */
private API::Node flask_PyMongo() {
result = API::moduleImport("flask_pymongo").getMember("PyMongo").getReturn()
}

/** Gets a reference to `mongoengine` */
private API::Node mongoEngine() { result = API::moduleImport("mongoengine") }

/** Gets a reference to `flask_mongoengine.MongoEngine` */
private API::Node flask_MongoEngine() {
result = API::moduleImport("flask_mongoengine").getMember("MongoEngine").getReturn()
}

/**
* Gets a reference to an initialized `Mongo` instance.
* See `pyMongo()`, `flask_PyMongo()`
*/
private API::Node mongoInstance() {
result = pyMongo() or
result = flask_PyMongo()
}

/**
* Gets a reference to an initialized `Mongo` DB instance.
* See `mongoEngine()`, `flask_MongoEngine()`
*/
private API::Node mongoDBInstance() {
result = mongoEngine().getMember(["get_db", "connect"]).getReturn() or
result = mongoEngine().getMember("connection").getMember(["get_db", "connect"]).getReturn() or
result = flask_MongoEngine().getMember("get_db").getReturn()
}

/**
* Gets a reference to a `Mongo` DB use.
*
* See `mongoInstance()`, `mongoDBInstance()`.
*/
private DataFlow::LocalSourceNode mongoDB(DataFlow::TypeTracker t) {
t.start() and
(
exists(SubscriptNode subscript |
subscript.getObject() = mongoInstance().getAUse().asCfgNode() and
result.asCfgNode() = subscript
)
or
result.(DataFlow::AttrRead).getObject() = mongoInstance().getAUse()
or
result = mongoDBInstance().getAUse()
)
or
exists(DataFlow::TypeTracker t2 | result = mongoDB(t2).track(t2, t))
}

/**
* Gets a reference to a `Mongo` DB use.
*
* ```py
* from flask_pymongo import PyMongo
* mongo = PyMongo(app)
* mongo.db.user.find({'name': safe_search})
* ```
*
* `mongo.db` would be a use of a `Mongo` instance, and so the result.
*/
private DataFlow::Node mongoDB() { mongoDB(DataFlow::TypeTracker::end()).flowsTo(result) }

/**
* Gets a reference to a `Mongo` collection use.
*
* See `mongoDB()`.
*/
private DataFlow::LocalSourceNode mongoCollection(DataFlow::TypeTracker t) {
t.start() and
(
exists(SubscriptNode subscript | result.asCfgNode() = subscript |
subscript.getObject() = mongoDB().asCfgNode()
)
or
result.(DataFlow::AttrRead).getObject() = mongoDB()
)
or
exists(DataFlow::TypeTracker t2 | result = mongoCollection(t2).track(t2, t))
}

/**
* Gets a reference to a `Mongo` collection use.
*
* ```py
* from flask_pymongo import PyMongo
* mongo = PyMongo(app)
* mongo.db.user.find({'name': safe_search})
* ```
*
* `mongo.db.user` would be a use of a `Mongo` collection, and so the result.
*/
private DataFlow::Node mongoCollection() {
mongoCollection(DataFlow::TypeTracker::end()).flowsTo(result)
}

/** This class represents names of find_* relevant `Mongo` collection-level operation methods. */
private class MongoCollectionMethodNames extends string {
MongoCollectionMethodNames() {
this in [
"find", "find_raw_batches", "find_one", "find_one_and_delete", "find_and_modify",
"find_one_and_replace", "find_one_and_update", "find_one_or_404"
]
}
}

/**
* Gets a reference to a `Mongo` collection method.
*
* ```py
* from flask_pymongo import PyMongo
* mongo = PyMongo(app)
* mongo.db.user.find({'name': safe_search})
* ```
*
* `mongo.db.user.find` would be a collection method, and so the result.
*/
private DataFlow::Node mongoCollectionMethod() {
mongoCollection() = result.(DataFlow::AttrRead).getObject() and
result.(DataFlow::AttrRead).getAttributeName() instanceof MongoCollectionMethodNames
}

/**
* Gets a reference to a `Mongo` collection method call
*
* ```py
* from flask_pymongo import PyMongo
* mongo = PyMongo(app)
* mongo.db.user.find({'name': safe_search})
* ```
*
* `mongo.db.user.find({'name': safe_search})` would be a collection method call, and so the result.
*/
private class MongoCollectionCall extends DataFlow::CallCfgNode, NoSQLQuery::Range {
MongoCollectionCall() { this.getFunction() = mongoCollectionMethod() }

override DataFlow::Node getQuery() { result = this.getArg(0) }
}

/**
* Gets a reference to a call from a class whose base is a reference to `mongoEngine()` or `flask_MongoEngine()`'s
* `Document` or `EmbeddedDocument` objects and its attribute is `objects`.
*
* ```py
* from flask_mongoengine import MongoEngine
* db = MongoEngine(app)
* class Movie(db.Document):
* title = db.StringField(required=True)
*
* Movie.objects(__raw__=json_search)
* ```
*
* `Movie.objects(__raw__=json_search)` would be the result.
*/
private class MongoEngineObjectsCall extends DataFlow::CallCfgNode, NoSQLQuery::Range {
MongoEngineObjectsCall() {
this =
[mongoEngine(), flask_MongoEngine()]
.getMember(["Document", "EmbeddedDocument"])
.getASubclass()
.getMember("objects")
.getACall()
}

override DataFlow::Node getQuery() { result = this.getArgByName(_) }
}

/** Gets a reference to `mongosanitizer.sanitizer.sanitize` */
private class MongoSanitizerCall extends DataFlow::CallCfgNode, NoSQLSanitizer::Range {
MongoSanitizerCall() {
this =
API::moduleImport("mongosanitizer").getMember("sanitizer").getMember("sanitize").getACall()
}

override DataFlow::Node getAnInput() { result = this.getArg(0) }
}

/**
* ObjectId returns a string representing an id.
* If at any time ObjectId can't parse it's input (like when a tainted dict in passed in),
* then ObjectId will throw an error preventing the query from running.
*/
private class BsonObjectIdCall extends DataFlow::CallCfgNode, NoSQLSanitizer::Range {
BsonObjectIdCall() {
this =
API::moduleImport(["bson", "bson.objectid", "bson.json_util"])
.getMember("ObjectId")
.getACall()
}

override DataFlow::Node getAnInput() { result = this.getArg(0) }
}
}
Loading