bpo-28293: The regex cache is no longer completely dumped when full. #3768

Merged
14 changes: 12 additions & 2 deletions Lib/re.py
@@ -128,6 +128,13 @@
 except ImportError:
     _locale = None

+# try _collections first to reduce startup cost
+try:
+    from _collections import OrderedDict
+except ImportError:
+    from collections import OrderedDict
+
+
 # public symbols
 __all__ = [
     "match", "fullmatch", "search", "sub", "subn", "split",
@@ -260,7 +267,7 @@ def escape(pattern):
 # --------------------------------------------------------------------
 # internals

-_cache = {}
+_cache = OrderedDict()

 _pattern_type = type(sre_compile.compile("", 0))

@@ -281,7 +288,10 @@ def _compile(pattern, flags):
     p = sre_compile.compile(pattern, flags)
     if not (flags & DEBUG):
         if len(_cache) >= _MAXCACHE:
-            _cache.clear()
+            try:
+                _cache.popitem(False)
Member

Sorry that I'm late to the game in reviewing this, and thanks for fixing this. One comment: should this be

_cache.popitem(last=False)

for readability? (I think if history had been different, we would have made that a keyword-only argument.)
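
To illustrate the suggestion, here is a minimal sketch of the two spellings side by side. The `cache` contents are made up for the example; only `OrderedDict.popitem` and its `last` parameter come from the standard library.

```python
from collections import OrderedDict

# Hypothetical cache contents, inserted in the order a, b, c.
cache = OrderedDict()
cache["a"] = 1
cache["b"] = 2
cache["c"] = 3

# The keyword and positional spellings are the same call:
# both remove and return the oldest (first-inserted) entry.
oldest = cache.popitem(last=False)
next_oldest = cache.popitem(False)

# With the default last=True, the newest entry is removed instead.
newest = cache.popitem()

# On an empty OrderedDict, popitem() raises KeyError,
# which is why the diff wraps the call in try/except.
try:
    cache.popitem(last=False)
    empty_raised = False
except KeyError:
    empty_raised = True
```

The keyword form makes it clear at the call site which end of the ordering is dropped.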

Member Author

Maybe. This wouldn't harm performance, because the sre_compile.compile() call a few lines above is much more expensive. Open a new PR to change this if you like.

Member

I'm reworking my deferred compilation branch to add a new function, so I might just fold this simple change into that branch.

Member

Never mind; I'm abandoning the deferred compilation branch, so #3791

+            except KeyError:
+                pass
         _cache[type(pattern), pattern, flags] = p
     return p

@@ -0,0 +1 @@
The regular expression cache is no longer completely dumped when it is full.