gh-130167: Improve speed of inspect.formatannotation by replacing re #130242

Open: wants to merge 8 commits into base: main

donBarbos (Contributor) commented Feb 18, 2025

timeit benchmark with my script:

inspect_bench.py:

import re
import timeit
from typing import List, Dict, Any, Union

def old_format_annotation(annotation):
    if getattr(annotation, "__module__", None) == "typing":
        def repl(match):
            text = match.group()
            return text.removeprefix("typing.")
        return re.sub(r"[\w\.]+", repl, repr(annotation))
    return repr(annotation)

def new_format_annotation(annotation):
    if getattr(annotation, "__module__", None) == "typing":
        return (repr(annotation)
                .replace(".typing.", "@TYPING@")
                .replace("typing.", "")
                .replace("@TYPING@", ".typing."))
    return repr(annotation)

old_time = timeit.timeit(lambda: old_format_annotation(Union[List[str], Dict[str, Any]]), number=100_000)
new_time = timeit.timeit(lambda: new_format_annotation(Union[List[str], Dict[str, Any]]), number=100_000)

print(f"Old version (re.sub): {old_time:.6f}s")
print(f"New version (.replace()): {new_time:.6f}s")
print(f"Difference: {old_time / new_time:.2f}x")

Result: 2.04s -> 1.18s, about 1.74x as fast

$ ./python -B inspect_bench.py
Old version (re.sub): 2.043900s
New version (.replace()): 1.175867s
Difference: 1.74x
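
The timing comparison only matters if both versions produce the same text, so a small string-level check may be useful alongside the benchmark. The sketch below is an illustration, not part of the PR: `strip_with_regex` and `strip_with_replace` are hypothetical helper names that apply the two strategies directly to a repr string, and the second input is a contrived repr chosen to show where they can diverge.

```python
import re


def strip_with_regex(text):
    # Old strategy: drop "typing." only when it starts a dotted-name token.
    return re.sub(r"[\w\.]+", lambda m: m.group().removeprefix("typing."), text)


def strip_with_replace(text):
    # New strategy: protect ".typing." with a placeholder, drop every
    # remaining "typing.", then restore the protected occurrences.
    return (text
            .replace(".typing.", "@TYPING@")
            .replace("typing.", "")
            .replace("@TYPING@", ".typing."))


# An ordinary typing repr: both strategies agree.
ordinary = "typing.Union[typing.List[str], typing.Dict[str, typing.Any]]"

# Contrived repr (assumption, for illustration only): a module whose
# name merely ends in "typing".
contrived = "typing.List[mytyping.Foo]"

if __name__ == "__main__":
    for text in (ordinary, contrived):
        print(text)
        print("  regex:  ", strip_with_regex(text))
        print("  replace:", strip_with_replace(text))
```

On the ordinary repr the two agree. On the contrived one, the `.replace()` chain also strips a `typing.` that is only the tail of a longer module name. Whether such a repr can occur for objects whose `__module__` is `"typing"` is not something this sketch establishes.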

✔️ This was also the only use of re in the module, so I removed the import.

@donBarbos donBarbos changed the title Improve speed of inspect.formatannotation by replacing re gh-130167: Improve speed of inspect.formatannotation by replacing re Feb 18, 2025
donBarbos and others added 2 commits March 28, 2025 01:58
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>

AA-Turner (Member) left a comment

I'm less sure this is worth it; the replacement isn't clearly better maintenance-wise.

What do the benchmarks look like if you extract repl to a module-level _formatannotation_repl and use .sub() on a pre-compiled pattern?

A

donBarbos (Contributor, Author) commented May 1, 2025

Sorry, I forgot how Python was compiled for the earlier run :( but I'm pretty sure I built it with gcc.

So I reran the test with the configure args '--enable-optimizations' '--with-lto' 'CC=/usr/bin/clang-18' and got a new result for my benchmark:

Old version (re.sub): 0.982520s
New version (.replace()): 0.701049s
Difference: 1.40x # sometimes it's 1.35x

And if we rewrite the old version of the benchmark as you suggested:

_pattern = re.compile(r"[\w\.]+")

def _repl(match):
    text = match.group()
    return text.removeprefix("typing.")

def old_format_annotation(annotation):
    if getattr(annotation, "__module__", None) == "typing":
        return re.sub(_pattern, _repl, repr(annotation))
    return repr(annotation)

we get a slightly bigger difference:

Old version (re.sub): 1.027774s
New version (.replace()): 0.698902s
Difference: 1.47x # the result can drop to 1.40
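
Note that the rewritten benchmark above still goes through the module-level `re.sub()` with the compiled pattern as its first argument, whereas the review comment asked about calling `.sub()` on the pre-compiled pattern itself. A minimal sketch of that variant follows; the function names `precompiled_format_annotation` and `replace_format_annotation` are hypothetical, and no timings were measured for it here, so any speed difference is an open question rather than a result.

```python
import re
import timeit
from typing import Any, Dict, List, Union

# Compiled once at import time, as in the rewritten benchmark.
_pattern = re.compile(r"[\w\.]+")


def _repl(match):
    # Strip a leading "typing." from each dotted-name token.
    return match.group().removeprefix("typing.")


def precompiled_format_annotation(annotation):
    if getattr(annotation, "__module__", None) == "typing":
        # Call the method on the compiled pattern directly instead of
        # passing the pattern to the module-level re.sub().
        return _pattern.sub(_repl, repr(annotation))
    return repr(annotation)


def replace_format_annotation(annotation):
    # The .replace() chain from the PR benchmark, unchanged.
    if getattr(annotation, "__module__", None) == "typing":
        return (repr(annotation)
                .replace(".typing.", "@TYPING@")
                .replace("typing.", "")
                .replace("@TYPING@", ".typing."))
    return repr(annotation)


if __name__ == "__main__":
    sample = Union[List[str], Dict[str, Any]]
    for fn in (precompiled_format_annotation, replace_format_annotation):
        elapsed = timeit.timeit(lambda: fn(sample), number=100_000)
        print(f"{fn.__name__}: {elapsed:.6f}s")
```

Running this on the same build as the numbers above would show whether skipping the `re.sub()` wrapper narrows the gap.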

P.S.: another reason I like the PR version is that it gets rid of the re import, which is used only once here.
