-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
bpo-32981: Fix catastrophic backtracking vulns #5955
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-32981: Fix catastrophic backtracking vulns #5955
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. Thanks again to your contribution and we look forward to looking at it! |
I have now signed the CLA. |
Because this is a security issue no issue was previously created to track it. Perhaps someone might create one now? [Update: I created bpo-32981.] |
Travis failure:
Took a look, I can't see what's wrong. Is there a command (or flag to |
That error comes from |
6c9b1ef
to
38bdbdd
Compare
Thank you. Fixed. |
I created bpo-32981 for this problem. I skimmed through the git log. Commit messages for security seem to follow the 'XXX (CVE-...)' format rather than the 'bpo-XXX: ...' format. Let me know if I should change the commit messages though. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a news entry and add your name into Misc/ACKS.
Lib/difflib.py
Outdated
@@ -1083,7 +1083,7 @@ def _qformat(self, aline, bline, atags, btags): | |||
|
|||
import re | |||
|
|||
def IS_LINE_JUNK(line, pat=re.compile(r"\s*#?\s*$").match): | |||
def IS_LINE_JUNK(line, pat=re.compile(r"\s*(#\s*)?$").match): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use non-capturing group: (?:#\s*)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Lib/test/test_difflib.py
Outdated
def test_is_line_junk_true(self): | ||
for line in ['#', ' ', ' #', '# ', ' # ', '']: | ||
is_junk = difflib.IS_LINE_JUNK(line) | ||
self.assertEqual(is_junk, True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not assertTrue()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this and others, thanks.
Lib/test/test_difflib.py
Outdated
elapsed = end - start | ||
|
||
self.assertEqual(is_junk, False) | ||
self.assertEqual(elapsed.seconds, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timing test is very fragile. It can fail randomly when tests are paused for some reasons. I suggest to remove the timing test and increase the size of input so that it would take a hour with non-fixed code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then the test will pass if you wait long enough. Since the test suite already takes a long time to run and does so in parallel (so one hanging test might not be visible), this seems undesirable to me. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing in Lib/test/test_poplib.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the test takes too long time for running it will be failed on buildbots due to total timeout.
1a03bf9
to
83bf67c
Compare
@serhiy-storchaka Thank you! I think I have addressed all of your comments so far. |
83bf67c
to
539f390
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you James. Here yet few comments.
@@ -0,0 +1,3 @@ | |||
Regexes in difflib and poplib were vulnerable to catastrophic backtracking. | |||
These regexes formed potential DOS vectors (REDOS). They have been | |||
refactored. This resolves CVE-2018-1060 and CVE-2018-1061. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add "Patch by your name."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Lib/test/test_difflib.py
Outdated
def test_is_character_junk_true(self): | ||
for char in [' ', '\t']: | ||
is_junk = difflib.IS_CHARACTER_JUNK(char) | ||
self.assertEqual(is_junk, True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertTrue()
?
Lib/test/test_difflib.py
Outdated
def test_is_line_junk_true(self): | ||
for line in ['#', ' ', ' #', '# ', ' # ', '']: | ||
is_junk = difflib.IS_LINE_JUNK(line) | ||
self.assertTrue(is_junk) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just
self.assertTrue(difflib.IS_LINE_JUNK(line))
?
And you can add the second argument of assertTrue()
for getting more information if the test fails:
self.assertTrue(difflib.IS_LINE_JUNK(line), rept(line))
But don't do this in test_is_line_junk_REDOS.
Lib/test/test_poplib.py
Outdated
# At this length, evil input makes each apop call take | ||
# on the order of milliseconds instead of microseconds. | ||
evil_welcome = bytes('+OK' + ('<' * 500000), self.client.encoding) | ||
self.client.welcome = evil_welcome |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use a special helper swap_attr()
for temporary setting the attribute:
with test.support.swap_attr(self.client, 'welcome', evil_welcome):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, nice, thanks for the tip.
Lib/test/test_poplib.py
Outdated
except poplib.error_proto: | ||
threw = True | ||
# The evil welcome is invalid, so apop should throw. | ||
self.assertTrue(threw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use assertRaises()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for the poor choices of asserts is that I didn't know what asserts are available and made do with what I could find. I really appreciate your help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.python.org/3/library/unittest.html#test-cases
assertRaises()
can be used as a function:
self.assertRaises(poplib.error_proto, self.client.apop, 'foo', 'dummypassword')
or as a context manager:
with self.assertRaises(poplib.error_proto):
self.client.apop('foo', 'dummypassword')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup I have at last found the documentation for unittest.
Lib/test/test_poplib.py
Outdated
# NB The upper bound on welcome length is currently 2048. | ||
# At this length, evil input makes each apop call take | ||
# on the order of milliseconds instead of microseconds. | ||
evil_welcome = bytes('+OK' + ('<' * 500000), self.client.encoding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the advantage over just b'+OK' + b'<' * 500000
?
Lib/test/test_poplib.py
Outdated
@@ -10,6 +10,7 @@ | |||
import os | |||
import errno | |||
import threading | |||
import datetime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer used.
Lib/test/test_difflib.py
Outdated
@@ -3,6 +3,7 @@ | |||
import unittest | |||
import doctest | |||
import sys | |||
import datetime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer used.
539f390
to
71db0e4
Compare
The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>
The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>
71db0e4
to
7bf9559
Compare
The latest version uses 1-million character attack strings. These take at least 3 minutes to evaluate on the master branch. I didn't want to wait for the full timeout. |
Lib/test/test_difflib.py
Outdated
|
||
def test_is_line_junk_false(self): | ||
for line in ['##', ' ##', '## ', 'abc ', 'abc #', 'abc # ', 'Mr. Moose is up!']: | ||
self.assertFalse(difflib.IS_LINE_JUNK(line), 'should not be junk: {}'.format(line)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The length of this line exceeds the 79 character limit allowed by PEP 8.
And since testing strings contain spaces it is hard to distinguish the following cases:
should not be junk: abc #
and
should not be junk: abc #
This is why I suggested repr()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The length of this line exceeds the 79 character limit allowed by PEP 8.
Didn't realize this was being enforced, there are other lines over this limit as well. Will do.
This is why I suggested
repr()
.
Got it, I misread that as rept earlier and couldn't understand what you meant.
Lib/test/test_poplib.py
Outdated
evil_welcome = b'+OK' + (b'<' * 1000000) | ||
with test_support.swap_attr(self.client, 'welcome', evil_welcome): | ||
# The evil welcome is invalid, so apop should throw. | ||
self.assertRaises(poplib.error_proto, self.client.apop, 'arthur', 'kingofbritons') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this line is too long, a context manager form of assertRaises() would be more preferable.
And please make your changes as new commits rather of editing old commits. This would make reviewing easier. |
@serhiy-storchaka Acknowledged. I believe I've addressed all of the concerns you've raised. |
Should I rebase or will these be squashed during merge anyway? |
Will be squashed during merge. |
Thanks @davisjam for the PR, and @benjaminp for merging it 🌮🎉.. I'm working now to backport this PR to: 2.7, 3.6, 3.7. |
@benjaminp: Please replace |
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org> (cherry picked from commit 0e6c8ee) Co-authored-by: Jamie Davis <davisjam@vt.edu>
GH-5969 is a backport of this pull request to the 3.7 branch. |
Sorry, @davisjam and @benjaminp, I could not cleanly backport this to |
Sorry, @davisjam and @benjaminp, I could not cleanly backport this to |
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>. (cherry picked from commit 0e6c8ee)
GH-5970 is a backport of this pull request to the 2.7 branch. |
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>. (cherry picked from commit 0e6c8ee)
GH-5971 is a backport of this pull request to the 3.6 branch. |
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org> Co-authored-by: Jamie Davis <davisjam@vt.edu> (cherry picked from commit 0e6c8ee)
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>. (cherry picked from commit 0e6c8ee)
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>. (cherry picked from commit 0e6c8ee)
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>. (cherry picked from commit 0e6c8ee)
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>.
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>.
* Prevent low-grade poplib REDOS (CVE-2018-1060) The regex to test a mail server's timestamp is susceptible to catastrophic backtracking on long evil responses from the server. Happily, the maximum length of malicious inputs is 2K thanks to a limit introduced in the fix for CVE-2013-1752. A 2KB evil response from the mail server would result in small slowdowns (milliseconds vs. microseconds) accumulated over many apop calls. This is a potential DOS vector via accumulated slowdowns. Replace it with a similar non-vulnerable regex. The new regex is RFC compliant. The old regex was non-compliant in edge cases. * Prevent difflib REDOS (CVE-2018-1061) The default regex for IS_LINE_JUNK is susceptible to catastrophic backtracking. This is a potential DOS vector. Replace it with an equivalent non-vulnerable regex. Also introduce unit and REDOS tests for difflib. Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>. (cherry picked from commit 0e6c8ee)
Fix catastrophic backtracking vulns disclosed by email.
https://bugs.python.org/issue32981