Closed as not planned
Labels
type-bug: An unexpected behavior, bug, or error
Description
Bug report
Bug description:
I encountered a serious performance issue with the re module.
As a test, I ran the following script (I'm aware the regex will match nothing; it is a small example extracted from a larger codebase):
import re
re.search(
r'(?P<body>(?:.|\s)*)(//.*END.*$)',
"""export function isIconName(iconName: unknown): iconName is IconName {
if (!iconName || typeof iconName !== 'string') {
return false;
}
return iconName in availableIconsIndex;
}""").group('body')
This causes the CPU to hit 100% on a single core and the call never completes. After more than 10 minutes, I had to terminate the process.
I tested the same pattern on regex101 (Python flavor), and it completes instantly, well below 1 ms.
Something is definitely off with how re.search handles this pattern.
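A minimal sketch for reproducing the slowdown on inputs short enough to finish, assuming the cause is that a space matches both alternatives of (?:.|\s), so the number of ways to backtrack can roughly double with each extra space. The names AMBIGUOUS, UNAMBIGUOUS and time_search are hypothetical, and the [\s\S]* rewrite is only an assumed equivalent, not a confirmed fix:

```python
import re
import time

# Pattern from the report: '.' and '\s' both match a space.
AMBIGUOUS = r'(?P<body>(?:.|\s)*)(//.*END.*$)'
# Assumed equivalent rewrite: one character class, so each character
# can be consumed in only one way.
UNAMBIGUOUS = r'(?P<body>[\s\S]*)(//.*END.*$)'


def time_search(pattern, text):
    """Return (match, elapsed_seconds) for a single re.search call."""
    start = time.perf_counter()
    match = re.search(pattern, text)
    return match, time.perf_counter() - start


# Keep n small: the ambiguous pattern is expected to slow down sharply
# as the number of spaces grows.
for n in (8, 12, 16):
    text = " " * n  # no "//" anywhere, so neither pattern can match
    _, slow = time_search(AMBIGUOUS, text)
    _, fast = time_search(UNAMBIGUOUS, text)
    print(f"n={n:2d}  ambiguous={slow:.6f}s  unambiguous={fast:.6f}s")
```

If the timings of the ambiguous pattern grow much faster than those of the rewrite, that would point to backtracking in the pattern rather than to the size of the input.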
re.DEBUG
Running this search with the re.DEBUG flag gave the following output:
SUBPATTERN 1 0 0
  MAX_REPEAT 0 MAXREPEAT
    BRANCH
      ANY None
    OR
      IN
        CATEGORY CATEGORY_SPACE
SUBPATTERN 2 0 0
  LITERAL 47
  LITERAL 47
  MAX_REPEAT 0 MAXREPEAT
    ANY None
  LITERAL 69
  LITERAL 78
  LITERAL 68
  MAX_REPEAT 0 MAXREPEAT
    ANY None
  AT AT_END
0. INFO 4 0b0 5 MAXREPEAT (to 5)
5: MARK 0
7. REPEAT 17 0 MAXREPEAT (to 25)
11. BRANCH 4 (to 16)
13. ANY
14. JUMP 10 (to 25)
16: branch 8 (to 24)
17. IN 4 (to 22)
19. CATEGORY UNI_SPACE
21. FAILURE
22: JUMP 2 (to 25)
24: FAILURE
25: MAX_UNTIL
26. MARK 1
28. MARK 2
30. LITERAL 0x2f ('/')
32. LITERAL 0x2f ('/')
34. REPEAT_ONE 5 0 MAXREPEAT (to 40)
38. ANY
39. SUCCESS
40: LITERAL 0x45 ('E')
42. LITERAL 0x4e ('N')
44. LITERAL 0x44 ('D')
46. REPEAT_ONE 5 0 MAXREPEAT (to 52)
50. ANY
51. SUCCESS
52: AT END
54. MARK 3
56. SUCCESS
But the search itself still hangs and makes no progress, so I still had to terminate it.
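As far as I understand, the re.DEBUG flag only prints the parsed and compiled form of the pattern at compile time and does not change how matching runs, which would explain why the search hangs the same way. A minimal sketch that captures the debug dump without running the search at all (dump_debug is a hypothetical helper name):

```python
import contextlib
import io
import re

PATTERN = r'(?P<body>(?:.|\s)*)(//.*END.*$)'


def dump_debug(pattern):
    """Compile the pattern with re.DEBUG and return the printed dump."""
    buf = io.StringIO()
    # The debug dump is written to stdout during compilation,
    # so redirect stdout to capture it.
    with contextlib.redirect_stdout(buf):
        re.compile(pattern, re.DEBUG)
    return buf.getvalue()


print(dump_debug(PATTERN))
```

Because nothing is matched here, this returns immediately even for patterns that are slow to search with.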
CPython versions tested on:
3.13, 3.12, 3.10
Operating systems tested on:
Linux