Skip to content

GH-94822: Respect metaclasses when caching class attributes #94863

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

brandtbucher
Copy link
Member

@brandtbucher brandtbucher commented Jul 14, 2022

This patch changes LOAD_ATTR_CLASS to:

  • fail if the named attribute is present on the metaclass
  • fail if the metaclass defines __getattribute__
  • use the metaclass type version for additional cache validation

It also adds a bunch of regression tests for nasty edge-cases when caching class attributes and methods.

@brandtbucher brandtbucher added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) needs backport to 3.11 only security fixes labels Jul 14, 2022
@brandtbucher brandtbucher self-assigned this Jul 14, 2022
@brandtbucher
Copy link
Member Author

Note to self: this might be able to be automatically backported by the bots. We just need to change all instances of the term ATTR to METHOD in the resulting patch.

Comment on lines +957 to +961
if (metaclass->tp_getattro != PyType_Type.tp_getattro) {
// The metaclass defines __getattribute__. All bets are off:
SPECIALIZATION_FAIL(LOAD_ATTR, SPEC_FAIL_OVERRIDDEN);
return -1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should just play safe and fail whenever we see a non-type type? (ie whenever there's a metaclass)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have simple ways of checking for all known problematic cases (names present in the metaclass, __getattribute__ defined, mutating metaclasses). I guess we could do that, but it sort of feels like overkill to me.

I'm currently running the benchmarks and gathering stats for this branch, but I worry that disallowing all metaclasses would really hurt benchmarks like sympy which have tons of metaclasses. So I'd want to make sure we don't suffer any big regressions before disallowing them entirely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick work @brandtbucher

I don't know if this helps but if you want to time SymPy specifically then one way is:

import sympy
sympy.test()

That takes about half an hour to run on this computer.

Comparing different CPython versions in CI shows that 3.11.0b4 currently gives a speedup for some jobs and is neutral for others:
https://github.com/sympy/sympy/actions/runs/2662492934

If you look at for example the test2 matrix then the timings from there are:

3.8: 19m28s
3.9: 16m44s
3.10: 16m4s
3.11.0b4: 14m45s

The test2 job does roughly half of what sympy.test() does and can be reproduce directly with sympy.test(split='2/2'). The 3.10 timing is shown in CI as test2-latest separately from the test2 matrix of different Python versions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll take a look.

Side note, I just realized that potentially removing support for any metaclasses here would probably (further) trash enum performance. :( So that's also something we want visibility into before going that route.

@Fidget-Spinner
Copy link
Member

Also could you run make patchcheck please?

@brandtbucher
Copy link
Member Author

Looks like this patch makes things about 1% slower. It's not immediately clear to me if this is mainly from removing illegitimate specializations, the increased cost of specializing class attribute loads, or the overhead of additional metaclass version checks:

Slower (33):
- scimark_sparse_mat_mult: 3.93 ms +- 0.09 ms -> 4.13 ms +- 0.05 ms: 1.05x slower
- json_dumps: 12.0 ms +- 0.1 ms -> 12.5 ms +- 0.3 ms: 1.03x slower
- xml_etree_iterparse: 103 ms +- 2 ms -> 107 ms +- 2 ms: 1.03x slower
- logging_silent: 88.9 ns +- 1.7 ns -> 91.5 ns +- 0.6 ns: 1.03x slower
- spectral_norm: 93.2 ms +- 1.6 ms -> 95.5 ms +- 2.3 ms: 1.02x slower
- richards: 44.5 ms +- 0.9 ms -> 45.5 ms +- 1.0 ms: 1.02x slower
- scimark_fft: 303 ms +- 5 ms -> 310 ms +- 3 ms: 1.02x slower
- pickle_pure_python: 287 us +- 3 us -> 293 us +- 3 us: 1.02x slower
- chaos: 65.6 ms +- 1.0 ms -> 67.0 ms +- 1.0 ms: 1.02x slower
- thrift: 728 us +- 14 us -> 740 us +- 12 us: 1.02x slower
- pathlib: 17.8 ms +- 0.2 ms -> 18.1 ms +- 0.3 ms: 1.02x slower
- deltablue: 3.21 ms +- 0.04 ms -> 3.25 ms +- 0.04 ms: 1.01x slower
- regex_compile: 126 ms +- 1 ms -> 128 ms +- 1 ms: 1.01x slower
- scimark_lu: 103 ms +- 2 ms -> 105 ms +- 2 ms: 1.01x slower
- go: 132 ms +- 1 ms -> 133 ms +- 1 ms: 1.01x slower
- pyflate: 393 ms +- 4 ms -> 398 ms +- 5 ms: 1.01x slower
- json: 4.64 ms +- 0.12 ms -> 4.69 ms +- 0.11 ms: 1.01x slower
- logging_format: 6.27 us +- 0.11 us -> 6.34 us +- 0.11 us: 1.01x slower
- nqueens: 78.6 ms +- 0.9 ms -> 79.5 ms +- 1.2 ms: 1.01x slower
- hexiom: 5.92 ms +- 0.02 ms -> 5.98 ms +- 0.02 ms: 1.01x slower
- tornado_http: 91.5 ms +- 1.3 ms -> 92.4 ms +- 1.3 ms: 1.01x slower
- regex_v8: 21.5 ms +- 0.3 ms -> 21.7 ms +- 0.2 ms: 1.01x slower
- xml_etree_parse: 157 ms +- 3 ms -> 158 ms +- 5 ms: 1.01x slower
- nbody: 91.8 ms +- 1.4 ms -> 92.6 ms +- 0.7 ms: 1.01x slower
- sympy_sum: 158 ms +- 1 ms -> 159 ms +- 2 ms: 1.01x slower
- telco: 6.41 ms +- 0.09 ms -> 6.46 ms +- 0.09 ms: 1.01x slower
- unpickle_list: 4.98 us +- 0.04 us -> 5.01 us +- 0.06 us: 1.01x slower
- 2to3: 248 ms +- 1 ms -> 250 ms +- 1 ms: 1.01x slower
- scimark_sor: 110 ms +- 1 ms -> 110 ms +- 1 ms: 1.01x slower
- crypto_pyaes: 72.2 ms +- 0.5 ms -> 72.5 ms +- 0.5 ms: 1.00x slower
- python_startup: 8.19 ms +- 0.01 ms -> 8.22 ms +- 0.01 ms: 1.00x slower
- python_startup_no_site: 6.05 ms +- 0.01 ms -> 6.06 ms +- 0.00 ms: 1.00x slower
- flaskblogging: 4.26 ms +- 0.01 ms -> 4.26 ms +- 0.02 ms: 1.00x slower

Faster (12):
- pickle_list: 4.14 us +- 0.04 us -> 3.98 us +- 0.05 us: 1.04x faster
- raytrace: 294 ms +- 7 ms -> 286 ms +- 6 ms: 1.03x faster
- regex_dna: 207 ms +- 0 ms -> 202 ms +- 0 ms: 1.02x faster
- regex_effbot: 3.58 ms +- 0.03 ms -> 3.51 ms +- 0.01 ms: 1.02x faster
- mako: 9.47 ms +- 0.03 ms -> 9.30 ms +- 0.06 ms: 1.02x faster
- pickle: 10.4 us +- 0.1 us -> 10.2 us +- 0.1 us: 1.02x faster
- pickle_dict: 30.6 us +- 0.1 us -> 30.3 us +- 0.1 us: 1.01x faster
- chameleon: 6.43 ms +- 0.15 ms -> 6.38 ms +- 0.05 ms: 1.01x faster
- scimark_monte_carlo: 65.6 ms +- 0.9 ms -> 65.1 ms +- 1.1 ms: 1.01x faster
- django_template: 32.2 ms +- 0.4 ms -> 31.9 ms +- 0.3 ms: 1.01x faster
- json_loads: 24.3 us +- 0.2 us -> 24.2 us +- 0.2 us: 1.01x faster
- xml_etree_process: 52.2 ms +- 0.6 ms -> 52.0 ms +- 0.5 ms: 1.00x faster

Benchmark hidden because not significant (16): dulwich_log, fannkuch, float, html5lib, logging_simple, meteor_contest, pidigits, pycparser, sqlite_synth, sympy_expand, sympy_integrate, sympy_str, unpack_sequence, unpickle, unpickle_pure_python, xml_etree_generate

Geometric mean: 1.01x slower

Total hit rates for LOAD_ATTR_CLASS have gone down by 15%, but hit rates for the whole LOAD_ATTR family are largely unchanged. Specialization successes decreased by 3%. 5% more failures are attributed to metaclass __getattribute__ methods, while 7% more failures are due to the named attributes being present on the metaclass itself.

main

Name Count Self Cumulative Miss ratio
LOAD_ATTR_CLASS 102,062,535 0.2% 96.5% 1.6%
specialization stats for LOAD_ATTR family
Kind Count Ratio
unquickened 193618648 3.4%
specialization.deferred 629431714 11.2%
specialization.deopt 2359119 0.0%
hit 4675648068 83.1%
miss 129400942 2.3%

Specialization attempts

Count Ratio
Success 4,958,258 64.8%
Failure 2,693,208 35.2%
Failure kind Count Ratio
not managed dict 586,470 21.8%
overridden 540,857 20.1%
method 384,095 14.3%
non object slot 264,897 9.8%
out of range 254,521 9.5%
metaclass attribute 187,132 6.9%
non overriding descriptor 171,346 6.4%
class method obj 152,554 5.7%
has managed dict 116,971 4.3%
mutable class 112,244 4.2%
builtin class method 53,527 2.0%
shadowed 3,840 0.1%
other 1,669 0.1%
module attr not found 1,648 0.1%
expected error 591 0.0%
property 401 0.0%

This PR

Name Count Self Cumulative Miss ratio
LOAD_ATTR_CLASS 88,312,319 0.1% 97.0% 0.4%
specialization stats for LOAD_ATTR family
Kind Count Ratio
unquickened 190837306 3.4%
specialization.deferred 636947625 11.3%
specialization.deopt 2321544 0.0%
hit 4660987261 83.0%
miss 126561331 2.3%

Specialization attempts

Count Ratio
Success 4,829,110 61.4%
Failure 3,032,191 38.6%
Failure kind Count Ratio
overridden 765,516 25.2%
not managed dict 582,142 19.2%
metaclass attribute 417,214 13.8%
method 383,456 12.6%
non object slot 265,664 8.8%
out of range 253,965 8.4%
class method obj 144,317 4.8%
has managed dict 117,040 3.9%
mutable class 108,085 3.6%
non overriding descriptor 74,494 2.5%
builtin class method 53,548 1.8%
shadowed 3,840 0.1%
module attr not found 1,648 0.1%
expected error 591 0.0%
property 380 0.0%

@brandtbucher
Copy link
Member Author

brandtbucher commented Jul 14, 2022

I might prepare a second branch that disallows metaclasses entirely, and compare the results. Again, we should be careful to make sure that enum and sympy performance aren't totally destroyed before choosing to go that route.

@Fidget-Spinner
Copy link
Member

Wow, these are really comprehensive stats. Thanks Brandt.

FYI, enum attribute access isn't specialised in 3.11. So whatever we do it probably won't hurt 3.11 that much :).

@brandtbucher
Copy link
Member Author

brandtbucher commented Jul 15, 2022

Disallowing metaclasses entirely (main...brandtbucher:no-metaclasses) gets rid of the 1% slowdown for the benchmark suite as a whole. It looks like it does come with a ~1% hit to the suite's sympy benchmarks when compared to this PR, but that's actually much better than I was expecting:

Slower (27):
- nqueens: 78.6 ms +- 0.9 ms -> 82.1 ms +- 0.9 ms: 1.04x slower
- fannkuch: 370 ms +- 5 ms -> 386 ms +- 7 ms: 1.04x slower
- go: 132 ms +- 1 ms -> 137 ms +- 1 ms: 1.04x slower
- sqlite_synth: 2.59 us +- 0.05 us -> 2.64 us +- 0.07 us: 1.02x slower
- thrift: 728 us +- 14 us -> 743 us +- 20 us: 1.02x slower
- regex_compile: 126 ms +- 1 ms -> 129 ms +- 1 ms: 1.02x slower
- sympy_sum: 158 ms +- 1 ms -> 160 ms +- 2 ms: 1.02x slower
- pidigits: 190 ms +- 0 ms -> 193 ms +- 0 ms: 1.02x slower
- logging_silent: 88.9 ns +- 1.7 ns -> 90.2 ns +- 1.7 ns: 1.01x slower
- regex_v8: 21.5 ms +- 0.3 ms -> 21.8 ms +- 0.2 ms: 1.01x slower
- hexiom: 5.92 ms +- 0.02 ms -> 5.98 ms +- 0.03 ms: 1.01x slower
- pathlib: 17.8 ms +- 0.2 ms -> 18.0 ms +- 0.2 ms: 1.01x slower
- pickle_dict: 30.6 us +- 0.1 us -> 31.0 us +- 0.1 us: 1.01x slower
- scimark_lu: 103 ms +- 2 ms -> 104 ms +- 2 ms: 1.01x slower
- dulwich_log: 61.1 ms +- 0.4 ms -> 61.7 ms +- 0.4 ms: 1.01x slower
- pyflate: 393 ms +- 4 ms -> 397 ms +- 5 ms: 1.01x slower
- sympy_str: 279 ms +- 4 ms -> 281 ms +- 6 ms: 1.01x slower
- sympy_integrate: 20.1 ms +- 0.1 ms -> 20.3 ms +- 0.2 ms: 1.01x slower
- unpickle_list: 4.98 us +- 0.04 us -> 5.02 us +- 0.09 us: 1.01x slower
- pickle_list: 4.14 us +- 0.04 us -> 4.17 us +- 0.03 us: 1.01x slower
- 2to3: 248 ms +- 1 ms -> 250 ms +- 1 ms: 1.01x slower
- crypto_pyaes: 72.2 ms +- 0.5 ms -> 72.8 ms +- 0.7 ms: 1.01x slower
- xml_etree_iterparse: 103 ms +- 2 ms -> 104 ms +- 2 ms: 1.01x slower
- python_startup: 8.19 ms +- 0.01 ms -> 8.24 ms +- 0.01 ms: 1.01x slower
- python_startup_no_site: 6.05 ms +- 0.01 ms -> 6.08 ms +- 0.01 ms: 1.01x slower
- scimark_fft: 303 ms +- 5 ms -> 304 ms +- 3 ms: 1.00x slower
- flaskblogging: 4.26 ms +- 0.01 ms -> 4.26 ms +- 0.01 ms: 1.00x slower

Faster (20):
- regex_effbot: 3.58 ms +- 0.03 ms -> 3.37 ms +- 0.01 ms: 1.06x faster
- raytrace: 294 ms +- 7 ms -> 282 ms +- 4 ms: 1.04x faster
- regex_dna: 207 ms +- 0 ms -> 202 ms +- 1 ms: 1.02x faster
- scimark_monte_carlo: 65.6 ms +- 0.9 ms -> 64.4 ms +- 0.5 ms: 1.02x faster
- scimark_sparse_mat_mult: 3.93 ms +- 0.09 ms -> 3.86 ms +- 0.08 ms: 1.02x faster
- nbody: 91.8 ms +- 1.4 ms -> 90.3 ms +- 0.5 ms: 1.02x faster
- pickle: 10.4 us +- 0.1 us -> 10.2 us +- 0.1 us: 1.02x faster
- spectral_norm: 93.2 ms +- 1.6 ms -> 91.8 ms +- 0.5 ms: 1.01x faster
- logging_format: 6.27 us +- 0.11 us -> 6.19 us +- 0.07 us: 1.01x faster
- logging_simple: 5.69 us +- 0.08 us -> 5.63 us +- 0.07 us: 1.01x faster
- chameleon: 6.43 ms +- 0.15 ms -> 6.36 ms +- 0.05 ms: 1.01x faster
- django_template: 32.2 ms +- 0.4 ms -> 31.9 ms +- 0.3 ms: 1.01x faster
- json_dumps: 12.0 ms +- 0.1 ms -> 11.9 ms +- 0.1 ms: 1.01x faster
- xml_etree_process: 52.2 ms +- 0.6 ms -> 51.7 ms +- 0.4 ms: 1.01x faster
- deltablue: 3.21 ms +- 0.04 ms -> 3.18 ms +- 0.05 ms: 1.01x faster
- json_loads: 24.3 us +- 0.2 us -> 24.1 us +- 0.2 us: 1.01x faster
- chaos: 65.6 ms +- 1.0 ms -> 65.2 ms +- 0.8 ms: 1.01x faster
- xml_etree_generate: 75.0 ms +- 0.7 ms -> 74.6 ms +- 0.4 ms: 1.01x faster
- scimark_sor: 110 ms +- 1 ms -> 109 ms +- 1 ms: 1.00x faster
- mako: 9.47 ms +- 0.03 ms -> 9.44 ms +- 0.04 ms: 1.00x faster

Benchmark hidden because not significant (14): float, html5lib, json, meteor_contest, pickle_pure_python, pycparser, richards, sympy_expand, telco, tornado_http, unpack_sequence, unpickle, unpickle_pure_python, xml_etree_parse

Geometric mean: 1.00x slower

More stats:

Name Count Self Cumulative Miss ratio
LOAD_ATTR_CLASS 67,240,997 0.1% 97.5% 0.2%
specialization stats for LOAD_ATTR family
Kind Count Ratio
unquickened 192112856 3.5%
specialization.deferred 663275886 12.0%
specialization.deopt 2346873 0.0%
hit 4553072619 82.2%
miss 127895746 2.3%

Specialization attempts

Count Ratio
Success 4,842,590 61.2%
Failure 3,070,807 38.8%
Failure kind Count Ratio
metaclass attribute 724,696 23.6%
not managed dict 586,451 19.1%
overridden 528,290 17.2%
method 383,637 12.5%
non object slot 264,709 8.6%
out of range 254,092 8.3%
class method obj 128,659 4.2%
has managed dict 116,914 3.8%
mutable class 97,596 3.2%
non overriding descriptor 65,531 2.1%
builtin class method 53,397 1.7%
shadowed 3,840 0.1%
module attr not found 1,648 0.1%
expected error 591 0.0%
property 254 0.0%

It seems that your earlier intuition was spot-on @Fidget-Spinner (I keep forgetting that 3.11 doesn't specialize class attributes). At this point, I'm thinking that we:

  • Use my other branch, and fail to specialize in the presence of a metaclass. This will be backported to 3.11.
  • Try adding a LOAD_ATTR_METACLASS opcode to boost performance for things like enum and sympy in 3.12. That way we only need to bother with (apparently expensive) metaclass versions in those rare cases.

I'm planning to move forward with these today.

@brandtbucher
Copy link
Member Author

Superseded by #94892.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting core review interpreter-core (Objects, Python, Grammar, and Parser dirs) needs backport to 3.11 only security fixes type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants