-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Python: Add SqlAlchemy model #5680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
this = API::moduleImport("sqlalchemy").getMember("create_engine").getReturn() | ||
} | ||
|
||
override string toString() { result = "Use of SqlAlchemy create_engine member" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% the best wording practices here so if this needs to be changed let me know!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually we don't build classes for this, but instead use a predicate, such as
/** Gets an instance of ... */
getSqlAlchemyEngineInstance() {
result = API::moduleImport("sqlalchemy").getMember("create_engine").getReturn()
}
and I think doing so in this case would also make things a bit more clean.
Small note: I have another model I'm currently building and is related to the security query that will be using the model in this PR. I've decided to wait for this to be merged in before I'll submit the other query. This way, if there are any glaring issues in this PR, I can apply them to upcoming PR as well. |
python/ql/src/experimental/semmle/python/frameworks/SqlAlchemy.qll
Outdated
Show resolved
Hide resolved
- Replaced classes that look for SqlAlchemy instances with predicates - General clean-up of code
This is ready for re-review. PS: Are there additional requirements such as tests for supplementary code like this to a query? I know tests matter for the actual queries and I thought I saw tests written for other models but I haven't seen any guidance on models. |
Friendly ping about this PR 😄 |
Missed this response, but yes, please add tests demonstrating behavior that is covered by the library modeling. Please also include test examples for any interesting behavior that you can do with the library that you currently don't model (that is, if there is some SQL executions that you know can happen with the SQL Alchemy library, that it just hasn't been possible to model yet). The way I would want you to do this is by creating a new folder For an example, you can see the tests we did for SQL execution in django: https://github.com/github/codeql/blob/c8a6e837b57513df5c8ccd081e193be3443987d1/python/ql/test/library-tests/frameworks/django/SqlExecution.py and this is the |
I think there might be a misconception about the One of the examples I found was (rewritten slightly) with engine.connect() as conn:
t = text("select * from table")
result = conn.execution_options(stream_results=True).execute(t) As I see it, the current modeling would capture that the I'm not so sure about the Looking forward to seeing your tests, probably they will clear most of this up 👍 |
I'm a little lost here so I'm going to ask some questions and hopefully this will get cleared up on my end: When I eventually submit my SqlAlchemy query that relies on this PR, I will have accounted for this taint-step in the query I'm writing. However, as it stands the current SQL execution query can't account for this additional taint-step and will fail. Is the expectation here that if this PR is merged in it should automatically work with the current queries?
There are some cases where the taint-step |
After researching SqlAlchemy and it's various query methods, I discovered several types of SQL injection possibilities. The SQLExecution.py file contains these examples and can be broken up into two types of injections. Injections requiring the text() taint-step and injections NOT requiring the text() taint step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've talked this over in private a bit. For the record, I've created a few small changes in mrthankyou#1, and with that merged, I think this PR is OK to go, and I'm happy to do the last bit of polishing to get this promoted out of experiemntal.
Minor updates to SQL alchemy PR
*/ | ||
private class SqlAlchemyExecuteCall extends DataFlow::CallCfgNode, SqlExecution::Range { | ||
SqlAlchemyExecuteCall() { | ||
// new way |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote this and I have no idea what I was referencing. This model library points to SqlAlchemy 1.X so I don't think I'm referencing 2.0.
private import semmle.python.ApiGraphs | ||
private import semmle.python.Concepts | ||
|
||
private module SqlAlchemy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any desire on your side Rasmus to version this model? SqlAlchemy came out with 2.0 and I don't know how compatible this model library is for 2.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2.0 has still not been released it seems (https://pypi.org/project/SQLAlchemy/#history), so I think it's OK. Thanks for letting me know about this though 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR is ok to go now 👍
I assume the
MISSING
keyword represents a scenario where the test is skippable?
It means that the modeling was not able to find the result we know should be there. (so the expected result is MISSING
, if the modeling provides a result where we know there shouldn't be any, we mark that as SPURIOUS
)
So if you want to, feel free to fix up those instances (like filter_by
and having
).
private import semmle.python.ApiGraphs | ||
private import semmle.python.Concepts | ||
|
||
private module SqlAlchemy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2.0 has still not been released it seems (https://pypi.org/project/SQLAlchemy/#history), so I think it's OK. Thanks for letting me know about this though 👍
Hi, I have submitted the changes but am still working on getting the tests updated. Unfortunately I'm encountering errors while attempting to codeql test run and am having to debug it. Please note that I'm slammed right now with work and moving. I have no free time to work on this at least for the next week. If I have some free time today I will try to tackle it. I'm sorry for the delay on my end. This is unfortunately out of my control and will be tied down to at least next Wednesday. |
All the tests do pass though, so I don't think there's anything more for you to do on the tests 😉 |
Yep, you're correct Rasmus. Testing results below:
|
This PR is meant to add a module for dealing with SqlAlchemy in Python CodeQL queries. For context, this will be used in an upcoming CodeQL security query so I decided to go ahead and submit this PR to get the ball rolling.