Skip to content

Java: Regex injection #5704

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from Jun 1, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
package com.example.demo;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DemoApplication {

@GetMapping("/string1")
public String string1(@RequestParam(value = "input", defaultValue = "test") String input,
@RequestParam(value = "pattern", defaultValue = ".*") String pattern) {
// BAD: Unsanitized user input is used to construct a regular expression
if (input.matches("^" + pattern + "=.*$"))
return "match!";

return "doesn't match!";
}

@GetMapping("/string2")
public String string2(@RequestParam(value = "input", defaultValue = "test") String input,
@RequestParam(value = "pattern", defaultValue = ".*") String pattern) {
// GOOD: User input is sanitized before constructing the regex
if (input.matches("^" + escapeSpecialRegexChars(pattern) + "=.*$"))
return "match!";

return "doesn't match!";
}

Pattern SPECIAL_REGEX_CHARS = Pattern.compile("[{}()\\[\\]><-=!.+*?^$\\\\|]");

String escapeSpecialRegexChars(String str) {
return SPECIAL_REGEX_CHARS.matcher(str).replaceAll("\\\\$0");
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>

<overview>
<p>
Constructing a regular expression with unsanitized user input is dangerous as a malicious user may
be able to modify the meaning of the expression. In particular, such a user may be able to provide
a regular expression fragment that takes exponential time in the worst case, and use that to
perform a Denial of Service attack.
</p>
</overview>

<recommendation>
<p>
Before embedding user input into a regular expression, use a sanitization function
to escape meta-characters that have special meaning.
</p>
</recommendation>

<example>
<p>
The following example shows a HTTP request parameter that is used to construct a regular expression:
</p>
<sample src="RegexInjection.java" />
<p>
In the first case the user-provided regex is not escaped.
If a malicious user provides a regex that has exponential worst case performance,
then this could lead to a Denial of Service.
</p>
<p>
In the second case, the user input is escaped using <code>escapeSpecialRegexChars</code> before being included
in the regular expression. This ensures that the user cannot insert characters which have a special
meaning in regular expressions.
</p>
</example>

<references>
<li>
OWASP:
<a href="https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS">Regular expression Denial of Service - ReDoS</a>.
</li>
<li>
Wikipedia: <a href="https://en.wikipedia.org/wiki/ReDoS">ReDoS</a>.
</li>
</references>
</qhelp>
89 changes: 89 additions & 0 deletions java/ql/src/experimental/Security/CWE/CWE-730/RegexInjection.ql
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
/**
* @name Regular expression injection
* @description User input should not be used in regular expressions without first being sanitized,
* otherwise a malicious user may be able to provide a regex that could require
* exponential time on certain inputs.
* @kind path-problem
* @problem.severity error
* @precision high
* @id java/regex-injection
* @tags security
* external/cwe/cwe-730
* external/cwe/cwe-400
*/

import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.dataflow.TaintTracking
import DataFlow::PathGraph

/**
* A data flow sink for untrusted user input used to construct regular expressions.
*/
class RegexSink extends DataFlow::ExprNode {
RegexSink() {
exists(MethodAccess ma, Method m | m = ma.getMethod() |
(
m.getDeclaringType() instanceof TypeString and
(
ma.getArgument(0) = this.asExpr() and
m.hasName(["matches", "split", "replaceFirst", "replaceAll"])
)
or
m.getDeclaringType().hasQualifiedName("java.util.regex", "Pattern") and
(
ma.getArgument(0) = this.asExpr() and
m.hasName(["compile", "matches"])
)
or
m.getDeclaringType().hasQualifiedName("org.apache.commons.lang3", "RegExUtils") and
(
ma.getArgument(1) = this.asExpr() and
m.getParameterType(1) instanceof TypeString and
m.hasName([
"removeAll", "removeFirst", "removePattern", "replaceAll", "replaceFirst",
"replacePattern"
])
)
)
)
}
}

abstract class Sanitizer extends DataFlow::ExprNode { }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since only RegExpSanitizationCall is extending this class it might be best to remove it and have RegExpSanitizationCall directly extend DataFlow::ExprNode (unless this pull request is not finished yet and you are planning to add more sanitizers).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You never know when pull request is finished :)


/**
* A call to a function whose name suggests that it escapes regular
* expression meta-characters.
*/
class RegExpSanitizationCall extends Sanitizer {
RegExpSanitizationCall() {
exists(string calleeName, string sanitize, string regexp |
calleeName = this.asExpr().(Call).getCallee().getName() and
sanitize = "(?:escape|saniti[sz]e)" and
regexp = "regexp?"
|
calleeName
.regexpMatch("(?i)(" + sanitize + ".*" + regexp + ".*)" + "|(" + regexp + ".*" + sanitize +
".*)")
)
}
}

/**
* A taint-tracking configuration for untrusted user input used to construct regular expressions.
*/
class RegexInjectionConfiguration extends TaintTracking::Configuration {
RegexInjectionConfiguration() { this = "RegexInjectionConfiguration" }

override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

override predicate isSink(DataFlow::Node sink) { sink instanceof RegexSink }

override predicate isSanitizer(DataFlow::Node node) { node instanceof Sanitizer }
}

from DataFlow::PathNode source, DataFlow::PathNode sink, RegexInjectionConfiguration c
where c.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "$@ is user controlled.", source.getNode(),
"This regular expression pattern"
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
edges
| RegexInjection.java:13:22:13:52 | getParameter(...) : String | RegexInjection.java:16:26:16:47 | ... + ... |
| RegexInjection.java:20:22:20:52 | getParameter(...) : String | RegexInjection.java:23:24:23:30 | pattern |
| RegexInjection.java:27:22:27:52 | getParameter(...) : String | RegexInjection.java:30:31:30:37 | pattern |
| RegexInjection.java:34:22:34:52 | getParameter(...) : String | RegexInjection.java:37:29:37:35 | pattern |
| RegexInjection.java:41:22:41:52 | getParameter(...) : String | RegexInjection.java:44:34:44:40 | pattern |
| RegexInjection.java:51:22:51:52 | getParameter(...) : String | RegexInjection.java:54:28:54:34 | pattern |
| RegexInjection.java:58:22:58:52 | getParameter(...) : String | RegexInjection.java:61:28:61:34 | pattern |
| RegexInjection.java:65:22:65:52 | getParameter(...) : String | RegexInjection.java:68:36:68:42 | pattern : String |
| RegexInjection.java:68:32:68:43 | foo(...) : String | RegexInjection.java:68:26:68:52 | ... + ... |
| RegexInjection.java:68:36:68:42 | pattern : String | RegexInjection.java:68:32:68:43 | foo(...) : String |
| RegexInjection.java:84:22:84:52 | getParameter(...) : String | RegexInjection.java:90:26:90:47 | ... + ... |
| RegexInjection.java:100:22:100:52 | getParameter(...) : String | RegexInjection.java:103:40:103:46 | pattern |
| RegexInjection.java:107:22:107:52 | getParameter(...) : String | RegexInjection.java:110:42:110:48 | pattern |
| RegexInjection.java:114:22:114:52 | getParameter(...) : String | RegexInjection.java:117:44:117:50 | pattern |
| RegexInjection.java:121:22:121:52 | getParameter(...) : String | RegexInjection.java:124:41:124:47 | pattern |
| RegexInjection.java:128:22:128:52 | getParameter(...) : String | RegexInjection.java:131:43:131:49 | pattern |
| RegexInjection.java:143:22:143:52 | getParameter(...) : String | RegexInjection.java:146:45:146:51 | pattern |
nodes
| RegexInjection.java:13:22:13:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:16:26:16:47 | ... + ... | semmle.label | ... + ... |
| RegexInjection.java:20:22:20:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:23:24:23:30 | pattern | semmle.label | pattern |
| RegexInjection.java:27:22:27:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:30:31:30:37 | pattern | semmle.label | pattern |
| RegexInjection.java:34:22:34:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:37:29:37:35 | pattern | semmle.label | pattern |
| RegexInjection.java:41:22:41:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:44:34:44:40 | pattern | semmle.label | pattern |
| RegexInjection.java:51:22:51:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:54:28:54:34 | pattern | semmle.label | pattern |
| RegexInjection.java:58:22:58:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:61:28:61:34 | pattern | semmle.label | pattern |
| RegexInjection.java:65:22:65:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:68:26:68:52 | ... + ... | semmle.label | ... + ... |
| RegexInjection.java:68:32:68:43 | foo(...) : String | semmle.label | foo(...) : String |
| RegexInjection.java:68:36:68:42 | pattern : String | semmle.label | pattern : String |
| RegexInjection.java:84:22:84:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:90:26:90:47 | ... + ... | semmle.label | ... + ... |
| RegexInjection.java:100:22:100:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:103:40:103:46 | pattern | semmle.label | pattern |
| RegexInjection.java:107:22:107:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:110:42:110:48 | pattern | semmle.label | pattern |
| RegexInjection.java:114:22:114:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:117:44:117:50 | pattern | semmle.label | pattern |
| RegexInjection.java:121:22:121:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:124:41:124:47 | pattern | semmle.label | pattern |
| RegexInjection.java:128:22:128:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:131:43:131:49 | pattern | semmle.label | pattern |
| RegexInjection.java:143:22:143:52 | getParameter(...) : String | semmle.label | getParameter(...) : String |
| RegexInjection.java:146:45:146:51 | pattern | semmle.label | pattern |
#select
| RegexInjection.java:16:26:16:47 | ... + ... | RegexInjection.java:13:22:13:52 | getParameter(...) : String | RegexInjection.java:16:26:16:47 | ... + ... | $@ is user controlled. | RegexInjection.java:13:22:13:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:23:24:23:30 | pattern | RegexInjection.java:20:22:20:52 | getParameter(...) : String | RegexInjection.java:23:24:23:30 | pattern | $@ is user controlled. | RegexInjection.java:20:22:20:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:30:31:30:37 | pattern | RegexInjection.java:27:22:27:52 | getParameter(...) : String | RegexInjection.java:30:31:30:37 | pattern | $@ is user controlled. | RegexInjection.java:27:22:27:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:37:29:37:35 | pattern | RegexInjection.java:34:22:34:52 | getParameter(...) : String | RegexInjection.java:37:29:37:35 | pattern | $@ is user controlled. | RegexInjection.java:34:22:34:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:44:34:44:40 | pattern | RegexInjection.java:41:22:41:52 | getParameter(...) : String | RegexInjection.java:44:34:44:40 | pattern | $@ is user controlled. | RegexInjection.java:41:22:41:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:54:28:54:34 | pattern | RegexInjection.java:51:22:51:52 | getParameter(...) : String | RegexInjection.java:54:28:54:34 | pattern | $@ is user controlled. | RegexInjection.java:51:22:51:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:61:28:61:34 | pattern | RegexInjection.java:58:22:58:52 | getParameter(...) : String | RegexInjection.java:61:28:61:34 | pattern | $@ is user controlled. | RegexInjection.java:58:22:58:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:68:26:68:52 | ... + ... | RegexInjection.java:65:22:65:52 | getParameter(...) : String | RegexInjection.java:68:26:68:52 | ... + ... | $@ is user controlled. | RegexInjection.java:65:22:65:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:90:26:90:47 | ... + ... | RegexInjection.java:84:22:84:52 | getParameter(...) : String | RegexInjection.java:90:26:90:47 | ... + ... | $@ is user controlled. | RegexInjection.java:84:22:84:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:103:40:103:46 | pattern | RegexInjection.java:100:22:100:52 | getParameter(...) : String | RegexInjection.java:103:40:103:46 | pattern | $@ is user controlled. | RegexInjection.java:100:22:100:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:110:42:110:48 | pattern | RegexInjection.java:107:22:107:52 | getParameter(...) : String | RegexInjection.java:110:42:110:48 | pattern | $@ is user controlled. | RegexInjection.java:107:22:107:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:117:44:117:50 | pattern | RegexInjection.java:114:22:114:52 | getParameter(...) : String | RegexInjection.java:117:44:117:50 | pattern | $@ is user controlled. | RegexInjection.java:114:22:114:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:124:41:124:47 | pattern | RegexInjection.java:121:22:121:52 | getParameter(...) : String | RegexInjection.java:124:41:124:47 | pattern | $@ is user controlled. | RegexInjection.java:121:22:121:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:131:43:131:49 | pattern | RegexInjection.java:128:22:128:52 | getParameter(...) : String | RegexInjection.java:131:43:131:49 | pattern | $@ is user controlled. | RegexInjection.java:128:22:128:52 | getParameter(...) | This regular expression pattern |
| RegexInjection.java:146:45:146:51 | pattern | RegexInjection.java:143:22:143:52 | getParameter(...) : String | RegexInjection.java:146:45:146:51 | pattern | $@ is user controlled. | RegexInjection.java:143:22:143:52 | getParameter(...) | This regular expression pattern |
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.ServletException;

import org.apache.commons.lang3.RegExUtils;

public class RegexInjection extends HttpServlet {
public boolean string1(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return input.matches("^" + pattern + "=.*$"); // BAD
}

public boolean string2(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return input.split(pattern).length > 0; // BAD
}

public boolean string3(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return input.replaceFirst(pattern, "").length() > 0; // BAD
}

public boolean string4(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return input.replaceAll(pattern, "").length() > 0; // BAD
}

public boolean pattern1(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

Pattern pt = Pattern.compile(pattern);
Matcher matcher = pt.matcher(input);

return matcher.find(); // BAD
}

public boolean pattern2(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return Pattern.compile(pattern).matcher(input).matches(); // BAD
}

public boolean pattern3(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return Pattern.matches(pattern, input); // BAD
}

public boolean pattern4(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return input.matches("^" + foo(pattern) + "=.*$"); // BAD
}

String foo(String str) {
return str;
}

public boolean pattern5(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

// GOOD: User input is sanitized before constructing the regex
return input.matches("^" + escapeSpecialRegexChars(pattern) + "=.*$");
}

public boolean pattern6(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

escapeSpecialRegexChars(pattern);

// BAD: the pattern is not really sanitized
return input.matches("^" + pattern + "=.*$");
}

Pattern SPECIAL_REGEX_CHARS = Pattern.compile("[{}()\\[\\]><-=!.+*?^$\\\\|]");

String escapeSpecialRegexChars(String str) {
return SPECIAL_REGEX_CHARS.matcher(str).replaceAll("\\\\$0");
}

public boolean apache1(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return RegExUtils.removeAll(input, pattern).length() > 0; // BAD
}

public boolean apache2(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return RegExUtils.removeFirst(input, pattern).length() > 0; // BAD
}

public boolean apache3(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return RegExUtils.removePattern(input, pattern).length() > 0; // BAD
}

public boolean apache4(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return RegExUtils.replaceAll(input, pattern, "").length() > 0; // BAD
}

public boolean apache5(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return RegExUtils.replaceFirst(input, pattern, "").length() > 0; // BAD
}

public boolean apache6(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

Pattern pt = (Pattern)(Object) pattern;
return RegExUtils.replaceFirst(input, pt, "").length() > 0; // GOOD, Pattern compile is the sink instead
}

public boolean apache7(javax.servlet.http.HttpServletRequest request) {
String pattern = request.getParameter("pattern");
String input = request.getParameter("input");

return RegExUtils.replacePattern(input, pattern, "").length() > 0; // BAD
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
experimental/Security/CWE/CWE-730/RegexInjection.ql
Loading