Commit facce1d

Fix performance bug in regexp's citerdissect/creviterdissect.
After detecting a sub-match "dissect" failure (i.e., a backref match failure) in the i'th sub-match of an iteration node, we should proceed by adjusting the attempted length of the i'th submatch. As coded, though, these functions changed the attempted length of the *last* sub-match, and only after exhausting all possibilities for that would they back up to adjust the next-to-last sub-match, and then the second-from-last, etc; all of which is wasted effort, since only changing the start or length of the i'th sub-match can possibly make it succeed. This oversight creates the possibility for exponentially bad performance.

Fortunately the problem is masked in most cases by optimizations or constraints applied elsewhere; which explains why we'd not noticed it before. But it is possible to reach the problem with fairly simple, if contrived, regexps.

Oversight in my commit 173e29a. That's pretty ancient now, so back-patch to all supported branches.

Discussion: https://postgr.es/m/1808998.1629412269@sss.pgh.pa.us
1 parent 9a9c8b9 commit facce1d

1 file changed: +10 -8 lines changed
src/backend/regex/regexec.c

Lines changed: 10 additions & 8 deletions
@@ -1147,8 +1147,8 @@ citerdissect(struct vars *v,
 	 * Our strategy is to first find a set of sub-match endpoints that are
 	 * valid according to the child node's DFA, and then recursively dissect
 	 * each sub-match to confirm validity. If any validity check fails,
-	 * backtrack the last sub-match and try again. And, when we next try for
-	 * a validity check, we need not recheck any successfully verified
+	 * backtrack that sub-match and try again. And, when we next try for a
+	 * validity check, we need not recheck any successfully verified
 	 * sub-matches that we didn't move the endpoints of. nverified remembers
 	 * how many sub-matches are currently known okay.
 	 */
@@ -1236,12 +1236,13 @@ citerdissect(struct vars *v,
 			return REG_OKAY;
 		}
 
-		/* match failed to verify, so backtrack */
+		/* i'th match failed to verify, so backtrack it */
+		k = i;
 
 backtrack:
 
 		/*
-		 * Must consider shorter versions of the current sub-match. However,
+		 * Must consider shorter versions of the k'th sub-match. However,
 		 * we'll only ask for a zero-length match if necessary.
 		 */
 		while (k > 0)
@@ -1352,8 +1353,8 @@ creviterdissect(struct vars *v,
 	 * Our strategy is to first find a set of sub-match endpoints that are
 	 * valid according to the child node's DFA, and then recursively dissect
 	 * each sub-match to confirm validity. If any validity check fails,
-	 * backtrack the last sub-match and try again. And, when we next try for
-	 * a validity check, we need not recheck any successfully verified
+	 * backtrack that sub-match and try again. And, when we next try for a
+	 * validity check, we need not recheck any successfully verified
 	 * sub-matches that we didn't move the endpoints of. nverified remembers
 	 * how many sub-matches are currently known okay.
 	 */
@@ -1447,12 +1448,13 @@ creviterdissect(struct vars *v,
 			return REG_OKAY;
 		}
 
-		/* match failed to verify, so backtrack */
+		/* i'th match failed to verify, so backtrack it */
+		k = i;
 
 backtrack:
 
 		/*
-		 * Must consider longer versions of the current sub-match.
+		 * Must consider longer versions of the k'th sub-match.
 		 */
 		while (k > 0)
 		{
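
To illustrate why the `k = i` assignment matters, here is a minimal toy model in C. It is not the code in regexec.c: `NPIECES`, `LEN`, `verify()` and `count_verify_calls()` are hypothetical names made up for this sketch, the "DFA" is reduced to "every piece is nonempty", and sub-match 1 is assumed never to pass its dissect check (standing in for a backref that cannot match). The function splits a text of length `LEN` into `NPIECES` pieces, longest first, and counts how often the verification step runs under each backtracking rule.

```c
#include <assert.h>

#define NPIECES 6	/* hypothetical number of sub-matches (iterations) */
#define LEN     14	/* hypothetical length of the text being split */

static long verify_calls;

/* Stand-in for the recursive dissect check: sub-match 1 never verifies. */
static int
verify(int i, int start, int end)
{
	(void) start;
	(void) end;
	verify_calls++;
	return i != 1;
}

/*
 * Try to split [0, LEN] into NPIECES nonempty pieces and verify each one.
 * backtrack_failed = 0: after a failure, shorten the last piece first.
 * backtrack_failed = 1: after a failure, shorten the piece that failed.
 * Returns the number of verify() calls made before the search ended.
 */
static long
count_verify_calls(int backtrack_failed)
{
	int		endpts[NPIECES + 1];
	int		k = 1, i, lim = LEN, nverified = 0;

	assert(NPIECES <= LEN);
	verify_calls = 0;
	endpts[0] = 0;

	for (;;)
	{
		/* Choose endpoints for pieces k..NPIECES, longest first. */
		while (k <= NPIECES)
		{
			int		maxend = LEN - (NPIECES - k);	/* room for later pieces */
			int		e = (lim < maxend) ? lim : maxend;

			if (e <= endpts[k - 1] || (k == NPIECES && e != LEN))
			{
				k--;			/* no usable endpoint; back up one piece */
				goto backtrack;
			}
			endpts[k] = e;
			if (nverified >= k)
				nverified = k - 1;	/* a moved endpoint voids old checks */
			k++;
			lim = LEN;
		}
		k = NPIECES;

		/* Verify only the pieces not already known okay. */
		for (i = nverified + 1; i <= k; i++)
		{
			if (!verify(i, endpts[i - 1], endpts[i]))
				break;
			nverified = i;
		}
		if (i > k)
			return verify_calls;	/* every piece verified */

		if (backtrack_failed)
			k = i;				/* the analogue of the one-line fix */

backtrack:
		/* Consider shorter versions of the k'th piece, else back up. */
		while (k > 0)
		{
			if (endpts[k] > endpts[k - 1] + 1)
			{
				lim = endpts[k] - 1;
				break;
			}
			k--;
		}
		if (k == 0)
			return verify_calls;	/* all possibilities exhausted */
	}
}
```

Under these assumptions, `count_verify_calls(0)` re-checks the failing piece once for every way of arranging the later pieces, while `count_verify_calls(1)` checks it once per candidate endpoint of the failing piece itself. The gap widens quickly as `NPIECES` and `LEN` grow, which is the kind of blow-up the commit message describes.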
