Commit 2a88782

Fix potential infinite loop in regular expression execution.
In cfindloop(), if the initial call to shortest() reports that a zero-length match is possible at the current search start point, but then it is unable to construct any actual match to that, it'll just loop around with the same start point, and thus make no progress. We need to force the start point to be advanced. This is safe because the loop over "begin" points has already tried and failed to match starting at "close", so there is surely no need to try that again.

This bug was introduced in commit e2bd904, wherein we allowed continued searching after we'd run out of match possibilities, but evidently failed to think hard enough about exactly where we needed to search next. Because of the way this code works, such a match failure is only possible in the presence of backrefs --- otherwise, shortest()'s judgment that a match is possible should always be correct. That probably explains how come the bug has escaped detection for several years. The actual fix is a one-liner, but I took the trouble to add/improve some comments related to the loop logic.

After fixing that, the submitted test case "()*\1" didn't loop anymore. But it reported failure, though it seems like it ought to match a zero-length string; both Tcl and Perl think it does. That seems to be from overenthusiastic optimization on my part when I rewrote the iteration match logic in commit 173e29a: we can't just "declare victory" for a zero-length match without bothering to set match data for capturing parens inside the iterator node.

Per fuzz testing by Greg Stark. The first part of this is a bug in all supported branches, and the second part is a bug since 9.2 where the iteration rewrite happened.
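The second problem described in the message (a zero-length iteration must still record data for capturing parens) can be illustrated with a loose toy model. This is a sketch only, not the code in regexec.c: `toy_span`, `toy_backref`, `toy_match`, and the `record_capture` flag are all hypothetical names invented here, and the model assumes a backref to an unset group simply fails.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical capture record: so/eo of -1 mean "group was never set". */
typedef struct
{
    int so;                     /* start offset of the capture */
    int eo;                     /* end offset of the capture */
} toy_span;

/*
 * Hypothetical backref check: returns the number of characters matched at
 * "pos", or -1 if the group is unset or its text does not recur there.
 */
static int
toy_backref(const char *s, int pos, toy_span g)
{
    int len;

    if (g.so < 0)
        return -1;              /* unset group: the backref cannot match */
    len = g.eo - g.so;
    if (strncmp(s + g.so, s + pos, (size_t) len) != 0)
        return -1;
    return len;
}

/*
 * Toy matcher for the shape "()*\1" anchored at "pos".  With
 * record_capture == 0 it models the shortcut of accepting a zero-length
 * iteration without filling in the group; with 1 it records the empty span.
 */
static int
toy_match(const char *s, int pos, int record_capture)
{
    toy_span g = {-1, -1};

    assert(pos >= 0 && (size_t) pos <= strlen(s));
    if (record_capture)
    {
        g.so = pos;             /* the group matched the empty string here */
        g.eo = pos;
    }
    return toy_backref(s, pos, g);
}
```

In this model, `toy_match("abc", 0, 1)` yields a zero-length match, while `toy_match("abc", 0, 0)` reports failure because the backref sees an unset group, which mirrors the failure reported for "()*\1".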
1 parent d4f6488 commit 2a88782

1 file changed

+16
-5
lines changed

src/backend/regex/regexec.c

Lines changed: 16 additions & 5 deletions
@@ -423,6 +423,7 @@ cfindloop(struct vars * v,
 	close = v->search_start;
 	do
 	{
+		/* Search with the search RE for match range at/beyond "close" */
 		MDEBUG(("\ncsearch at %ld\n", LOFF(close)));
 		close = shortest(v, s, close, close, v->stop, &cold, (int *) NULL);
 		if (ISERR())
@@ -431,10 +432,11 @@ cfindloop(struct vars * v,
 			return v->err;
 		}
 		if (close == NULL)
-			break;				/* NOTE BREAK */
+			break;				/* no more possible match anywhere */
 		assert(cold != NULL);
 		open = cold;
 		cold = NULL;
+		/* Search for matches starting between "open" and "close" inclusive */
 		MDEBUG(("cbetween %ld and %ld\n", LOFF(open), LOFF(close)));
 		for (begin = open; begin <= close; begin++)
 		{
@@ -443,6 +445,7 @@ cfindloop(struct vars * v,
 			estop = v->stop;
 			for (;;)
 			{
+				/* Here we use the top node's detailed RE */
 				if (shorter)
 					end = shortest(v, d, begin, estart,
 								   estop, (chr **) NULL, &hitend);
@@ -457,8 +460,9 @@ cfindloop(struct vars * v,
 				if (hitend && cold == NULL)
 					cold = begin;
 				if (end == NULL)
-					break;		/* NOTE BREAK OUT */
+					break;		/* no match with this begin point, try next */
 				MDEBUG(("tentative end %ld\n", LOFF(end)));
+				/* Dissect the potential match to see if it really matches */
 				zapsubs(v->pmatch, v->nmatch);
 				zapmem(v, v->g->tree);
 				er = cdissect(v, v->g->tree, begin, end);
@@ -478,21 +482,28 @@ cfindloop(struct vars * v,
 					*coldp = cold;
 					return er;
 				}
-				/* try next shorter/longer match with same begin point */
+				/* Try next longer/shorter match with same begin point */
 				if (shorter)
 				{
 					if (end == estop)
-						break;	/* NOTE BREAK OUT */
+						break;	/* no more, so try next begin point */
 					estart = end + 1;
 				}
 				else
 				{
 					if (end == begin)
-						break;	/* NOTE BREAK OUT */
+						break;	/* no more, so try next begin point */
 					estop = end - 1;
 				}
 			}					/* end loop over endpoint positions */
 		}						/* end loop over beginning positions */
+
+		/*
+		 * If we get here, there is no possible match starting at or before
+		 * "close", so consider matches beyond that.  We'll do a fresh search
+		 * with the search RE to find a new promising match range.
+		 */
+		close++;
 	} while (close < v->stop);
 
 	*coldp = cold;
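
The progress problem that the added `close++` addresses can be seen in a much-reduced model of the outer loop. This is a sketch under stated assumptions, not the real function: `toy_search` and `toy_dissect` are hypothetical stand-ins for shortest() with the search RE and for cdissect(), positions are plain integers, the inner loops are collapsed to a single attempt at "close", and `max_iters` is an artificial cap so the stuck case terminates.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the search RE: it optimistically claims that a
 * zero-length match is possible right at "from".  Returns -1 if "from" is
 * already past "stop".
 */
static int
toy_search(int from, int stop)
{
    return (from <= stop) ? from : -1;
}

/*
 * Hypothetical stand-in for dissection: it always fails, as the real check
 * can when backrefs make the search RE's optimistic answer wrong.
 */
static int
toy_dissect(int begin, int end)
{
    (void) begin;
    (void) end;
    return 0;
}

/*
 * Reduced model of the outer do/while loop.  Returns the number of outer
 * iterations performed, or -1 if the artificial cap was hit (no progress).
 * "advance" selects whether the start point is forced forward after a
 * failed attempt, as the one-line fix does.
 */
static int
toy_findloop(int start, int stop, int advance, int max_iters)
{
    int close = start;
    int iters = 0;

    assert(max_iters > 0);
    do
    {
        if (++iters > max_iters)
            return -1;          /* stuck: same start point every time */
        close = toy_search(close, stop);
        if (close < 0)
            break;              /* no more possible match anywhere */
        if (toy_dissect(close, close))
            return iters;       /* a real match was found */
        if (advance)
            close++;            /* nothing matches at or before "close" */
    } while (close < stop);
    return iters;
}
```

With `start = 0` and `stop = 5`, calling `toy_findloop(0, 5, 0, 1000)` runs into the cap because "close" never changes, whereas `toy_findloop(0, 5, 1, 1000)` finishes after five outer iterations.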
