-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Python: promote log injection #7735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- move from custom concept `LogOutput` to standard concept `Logging` - remove `Log.qll` from experimental frameworks - fold models into standard models (naively for now) - stdlib: - make Logger module public - broaden definition of instance - add `extra` keyword as possible source - flak: add app.logger as logger instance - django: `add django.utils.log.request_logger` as logger instance (should we add the rest?) - remove LogOutput from experimental concepts
- modernize the sanitizer, but do not make it less specific
should we have the `lgtm,codescanning` handshake or not?
I set the category to newQuery since that is what users will see. When we have tags, it would be nice to tag it as a query promotion.
80da48e
to
9d41666
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python/ql/src/Security/CWE-117/LogInjection.ql
needs to have meta-data updated, to include security severity at least. I haven't been involved in looking at this query before, do you feel that @precision high
is reasonable? For security-severity, we might just copy the one from java (which is the same as the one for JS)
Looking up those @security-severity
scores, I noticed that both queires had @precision medium
. That got me thinking, in the example below, neither username
nor foo
can contain newlines, but would probably end up causing an alert 😐 so we need to be mindful about the precision we set here.
@app.route("/users/<username>")
def show_user(username):
foo = request.args.get("foo", 10)
LOGGER.debug(f"showing user {username} with foo={foo}")
nitpick: QLDoc for classes doesn't follow our style-guide (must start with A
or An
)
nitpick: Would be very nice with documentations links.
(I've made a suggestion for Django to fix the two nitpicks above)
For the qhelp, I see that it is in large part just a copy of the qhelp from the java query. I personally have a mild preference for the JS version (since it better highlights the problems of HTML). Have you made a consideration between the two?
@@ -2683,14 +2698,12 @@ private module StdlibPrivate { | |||
method = "log" and | |||
msgIndex = 1 | |||
| | |||
this = Logger::instance().getMember(method).getACall() | |||
or | |||
this = API::moduleImport("logging").getMember(method).getACall() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't agree about this removal. I see that API::moduleImport("logging")
is now modeled as a LoggerInstance... but that is not true, these are helper methods defined on the module leve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there value in modelling instances specifically? Or could we just rename instance
to something like loggingMethodProvider
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted to previous modelling approach as discussed offline.
python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll
Outdated
Show resolved
Hide resolved
class ReplaceLineBreaksSanitizer extends Sanitizer, DataFlow::CallCfgNode { | ||
ReplaceLineBreaksSanitizer() { | ||
this.getFunction().(DataFlow::AttrRead).getAttributeName() = "replace" and | ||
this.getArg(0).asExpr().(StrConst).getText() in ["\r\n", "\n"] | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sanitizer check seems a bit weak. What if it only did text.replace("\r\n", "\n")
? I guess it's not trivial to write a sanitizer that knows we've eliminated both, so maybe this is ok? (what are your thoughts on this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have had the same concerns from the beginning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.
module Logger { | ||
/** | ||
* An instance of `logging.Logger`. Extend this class to model new instances. | ||
* Most major frameworks will provide a logger instance as a class attribute. | ||
*/ | ||
abstract class LoggerInstance extends API::Node { | ||
override string toString() { result = "logger" } | ||
} | ||
|
||
/** Gets a reference to the `logging.Logger` class or any subclass. */ | ||
API::Node subclassRef() { | ||
result = API::moduleImport("logging").getMember("Logger").getASubclass*() | ||
} | ||
|
||
/** Gets a reference to an instance of `logging.Logger` or any subclass. */ | ||
API::Node instance() { | ||
result instanceof LoggerInstance | ||
or | ||
result = subclassRef().getReturn() | ||
or | ||
result = API::moduleImport("logging") | ||
or | ||
result = API::moduleImport("logging").getMember("root") | ||
or | ||
result = API::moduleImport("logging").getMember("getLogger").getReturn() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that I created this non-standard setup with just using API::Node
, but now that we're exposing it more publicly, we need to rewrite this to use the standard library modeling setup with InstanceSource
and type-tracking (which has a snippet, so it's quite easy).
Looking over the code again, I don't quite know why I did this in the first place. LoggerLogCall
can easily be written with the standard approach together with DataFlow::MethodCallNode
🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely it is nicer, if we can get away with using an API:Node
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to standard setup as discussed offline.
I did not even consider that we would have such options, I just checked that it looked reasonable :-) Thanks for the heads-up! |
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
Given both the original FP score and our concerns regarding sanitizers, `@precision medium`, which is aligned with other languages, feels appropriate.
(combining the Java version and the JS version)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice updates so far 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two minor things, then we should be good to go 👍
class ReplaceLineBreaksSanitizer extends Sanitizer, DataFlow::CallCfgNode { | ||
ReplaceLineBreaksSanitizer() { | ||
this.getFunction().(DataFlow::AttrRead).getAttributeName() = "replace" and | ||
this.getArg(0).asExpr().(StrConst).getText() in ["\r\n", "\n"] | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
I actually got a bit annoyed with the explanation. Do let me know if we can live with postponing this, or if I should attempt a rewrite in this PR.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.
I actually got a bit annoyed with the explanation. Do let me know if we can live with postponing this, or if I should attempt a rewrite in this PR..
Thanks for writing this 👍 I think it's very nice that the next person thinking about this code can just read the comment explaining we thought about this 👍 I have a small suggestion for writing it, but otherwise LGTM 👍
python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll
Outdated
Show resolved
Hide resolved
python/ql/lib/semmle/python/security/dataflow/LogInjectionCustomizations.qll
Outdated
Show resolved
Hide resolved
…omizations.qll Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
…omizations.qll Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
Trailing whitespace is a bit too easy with the ```suggestions through the UI :|
Thanks for the auto format fix :-) (it confused me for a bit that I could not get it to change the file locally 😁) |
Python 3 Language tests failed seems to have been flaky, so I've rerun them. 🤞 |
Tests are all good, Actually, we should do a performance test for this 😳 |
Performance does looks fine, so in it goes 👍 |
This PR promotes the log injection (CWE-117) query out of experimental.
Some concerns: