Skip to content

Python: promote log injection #7735

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Feb 23, 2022
Merged

Conversation

yoff
Copy link
Contributor

@yoff yoff commented Jan 25, 2022

This PR promotes the log injection (CWE-117) query out of experimental.

Some concerns:

  • the models of loggers, particularly for Django is a bit curated. In general, should we do way more than the contribution when we promote?
  • the sanitization seems very specific. I added the string const compare sanitizer, but there are probably other ways to sanitize also. Should we have a concept for sanitizers similar to escaping or just reuse escaping?

@yoff yoff requested a review from a team as a code owner January 25, 2022 10:07
yoff added 4 commits January 31, 2022 11:27
- move from custom concept `LogOutput` to standard concept `Logging`
- remove `Log.qll` from experimental frameworks
  - fold models into standard models (naively for now)
    - stdlib:
      - make Logger module public
      - broaden definition of instance
      - add `extra` keyword as possible source
   - flak: add app.logger as logger instance
   - django: `add django.utils.log.request_logger` as logger instance
     (should we add the rest?)
- remove LogOutput from experimental concepts
- modernize the sanitizer, but do not make it less specific
should we have the `lgtm,codescanning` handshake or not?
I set the category to newQuery since that is what users will see.
When we have tags, it would be nice to tag it as a query promotion.
@yoff yoff force-pushed the python/promote-log-injection branch from 80da48e to 9d41666 Compare January 31, 2022 10:28
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python/ql/src/Security/CWE-117/LogInjection.ql needs to have meta-data updated, to include security severity at least. I haven't been involved in looking at this query before, do you feel that @precision high is reasonable? For security-severity, we might just copy the one from java (which is the same as the one for JS)

Looking up those @security-severity scores, I noticed that both queires had @precision medium. That got me thinking, in the example below, neither username nor foo can contain newlines, but would probably end up causing an alert 😐 so we need to be mindful about the precision we set here.

@app.route("/users/<username>")
def show_user(username):
    foo = request.args.get("foo", 10)
    LOGGER.debug(f"showing user {username} with foo={foo}")

nitpick: QLDoc for classes doesn't follow our style-guide (must start with A or An)
nitpick: Would be very nice with documentations links.

(I've made a suggestion for Django to fix the two nitpicks above)

For the qhelp, I see that it is in large part just a copy of the qhelp from the java query. I personally have a mild preference for the JS version (since it better highlights the problems of HTML). Have you made a consideration between the two?

@@ -2683,14 +2698,12 @@ private module StdlibPrivate {
method = "log" and
msgIndex = 1
|
this = Logger::instance().getMember(method).getACall()
or
this = API::moduleImport("logging").getMember(method).getACall()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree about this removal. I see that API::moduleImport("logging") is now modeled as a LoggerInstance... but that is not true, these are helper methods defined on the module leve.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there value in modelling instances specifically? Or could we just rename instance to something like loggingMethodProvider?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted to previous modelling approach as discussed offline.

Comment on lines 59 to 64
class ReplaceLineBreaksSanitizer extends Sanitizer, DataFlow::CallCfgNode {
ReplaceLineBreaksSanitizer() {
this.getFunction().(DataFlow::AttrRead).getAttributeName() = "replace" and
this.getArg(0).asExpr().(StrConst).getText() in ["\r\n", "\n"]
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sanitizer check seems a bit weak. What if it only did text.replace("\r\n", "\n")? I guess it's not trivial to write a sanitizer that knows we've eliminated both, so maybe this is ok? (what are your thoughts on this)

Copy link
Contributor Author

@yoff yoff Feb 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have had the same concerns from the beginning.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.

Comment on lines 249 to 275
module Logger {
/**
* An instance of `logging.Logger`. Extend this class to model new instances.
* Most major frameworks will provide a logger instance as a class attribute.
*/
abstract class LoggerInstance extends API::Node {
override string toString() { result = "logger" }
}

/** Gets a reference to the `logging.Logger` class or any subclass. */
API::Node subclassRef() {
result = API::moduleImport("logging").getMember("Logger").getASubclass*()
}

/** Gets a reference to an instance of `logging.Logger` or any subclass. */
API::Node instance() {
result instanceof LoggerInstance
or
result = subclassRef().getReturn()
or
result = API::moduleImport("logging")
or
result = API::moduleImport("logging").getMember("root")
or
result = API::moduleImport("logging").getMember("getLogger").getReturn()
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that I created this non-standard setup with just using API::Node, but now that we're exposing it more publicly, we need to rewrite this to use the standard library modeling setup with InstanceSource and type-tracking (which has a snippet, so it's quite easy).

Looking over the code again, I don't quite know why I did this in the first place. LoggerLogCall can easily be written with the standard approach together with DataFlow::MethodCallNode 🤷

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely it is nicer, if we can get away with using an API:Node?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to standard setup as discussed offline.

@yoff
Copy link
Contributor Author

yoff commented Feb 1, 2022

For the qhelp, I see that it is in large part just a copy of the qhelp from the java query. I personally have a mild preference for the JS version (since it better highlights the problems of HTML). Have you made a consideration between the two?

I did not even consider that we would have such options, I just checked that it looked reasonable :-) Thanks for the heads-up!

yoff and others added 5 commits February 1, 2022 10:04
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
Given both the original FP score and our concerns
regarding sanitizers, `@precision medium`, which
is aligned with other languages, feels appropriate.
(combining the Java version and the JS version)
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice updates so far 👍

@yoff yoff requested a review from RasmusWL February 1, 2022 12:42
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor things, then we should be good to go 👍

Comment on lines 59 to 64
class ReplaceLineBreaksSanitizer extends Sanitizer, DataFlow::CallCfgNode {
ReplaceLineBreaksSanitizer() {
this.getFunction().(DataFlow::AttrRead).getAttributeName() = "replace" and
this.getArg(0).asExpr().(StrConst).getText() in ["\r\n", "\n"]
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.

yoff and others added 2 commits February 9, 2022 09:22
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
@yoff yoff requested a review from RasmusWL February 14, 2022 10:38
@yoff
Copy link
Contributor Author

yoff commented Feb 14, 2022

I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.

I actually got a bit annoyed with the explanation. Do let me know if we can live with postponing this, or if I should attempt a rewrite in this PR..

Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be great to have a small comment in the code to explain this dilemma, and explain why the solution implemented was deemed ok.

I actually got a bit annoyed with the explanation. Do let me know if we can live with postponing this, or if I should attempt a rewrite in this PR..

Thanks for writing this 👍 I think it's very nice that the next person thinking about this code can just read the comment explaining we thought about this 👍 I have a small suggestion for writing it, but otherwise LGTM 👍

yoff and others added 2 commits February 14, 2022 16:07
…omizations.qll

Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
…omizations.qll

Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
@yoff yoff requested a review from RasmusWL February 14, 2022 15:08
Trailing whitespace is a bit too easy with the ```suggestions through
the UI :|
@yoff
Copy link
Contributor Author

yoff commented Feb 15, 2022

Thanks for the auto format fix :-) (it confused me for a bit that I could not get it to change the file locally 😁)

@RasmusWL
Copy link
Member

Python 3 Language tests failed seems to have been flaky, so I've rerun them. 🤞

@RasmusWL
Copy link
Member

RasmusWL commented Feb 17, 2022

Tests are all good, so shipit

Actually, we should do a performance test for this 😳

@RasmusWL
Copy link
Member

Performance does looks fine, so in it goes 👍

@RasmusWL RasmusWL merged commit aeba497 into github:main Feb 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants