GH-94822: Respect metaclasses when caching class attributes #94863

brandtbucher · 2022-07-14T18:55:00Z

This patch changes LOAD_ATTR_CLASS to:

fail if the named attribute is present on the metaclass
fail if the metaclass defines __getattribute__
use the metaclass type version for additional cache validation

It also adds a bunch of regression tests for nasty edge-cases when caching class attributes and methods.

Issue: Optimiser breaks attribute lookup from metaclass property #94822

brandtbucher · 2022-07-14T18:56:18Z

Note to self: this might be able to be automatically backported by the bots. We just need to change all instances of the term ATTR to METHOD in the resulting patch.

Fidget-Spinner · 2022-07-14T19:24:05Z

Python/specialize.c

+    if (metaclass->tp_getattro != PyType_Type.tp_getattro) {
+        // The metaclass defines __getattribute__. All bets are off:
+        SPECIALIZATION_FAIL(LOAD_ATTR, SPEC_FAIL_OVERRIDDEN);
+        return -1;
+    }


Do you think we should just play safe and fail whenever we see a non-type type? (ie whenever there's a metaclass)

We have simple ways of checking for all known problematic cases (names present in the metaclass, __getattribute__ defined, mutating metaclasses). I guess we could do that, but it sort of feels like overkill to me.

I'm currently running the benchmarks and gathering stats for this branch, but I worry that disallowing all metaclasses would really hurt benchmarks like sympy which have tons of metaclasses. So I'd want to make sure we don't suffer any big regressions before disallowing them entirely.

Thanks for the quick work @brandtbucher

I don't know if this helps but if you want to time SymPy specifically then one way is:

import sympy sympy.test()

That takes about half an hour to run on this computer.

Comparing different CPython versions in CI shows that 3.11.0b4 currently gives a speedup for some jobs and is neutral for others:
https://github.com/sympy/sympy/actions/runs/2662492934

If you look at for example the test2 matrix then the timings from there are:

3.8: 19m28s 3.9: 16m44s 3.10: 16m4s 3.11.0b4: 14m45s

The test2 job does roughly half of what sympy.test() does and can be reproduce directly with sympy.test(split='2/2'). The 3.10 timing is shown in CI as test2-latest separately from the test2 matrix of different Python versions.

Thanks, I'll take a look.

Side note, I just realized that potentially removing support for any metaclasses here would probably (further) trash enum performance. :( So that's also something we want visibility into before going that route.

Lib/test/test_opcache.py

Fidget-Spinner · 2022-07-14T19:27:43Z

Also could you run make patchcheck please?

brandtbucher · 2022-07-14T23:32:54Z

Looks like this patch makes things about 1% slower. It's not immediately clear to me if this is mainly from removing illegitimate specializations, the increased cost of specializing class attribute loads, or the overhead of additional metaclass version checks:

Slower (33):
- scimark_sparse_mat_mult: 3.93 ms +- 0.09 ms -> 4.13 ms +- 0.05 ms: 1.05x slower
- json_dumps: 12.0 ms +- 0.1 ms -> 12.5 ms +- 0.3 ms: 1.03x slower
- xml_etree_iterparse: 103 ms +- 2 ms -> 107 ms +- 2 ms: 1.03x slower
- logging_silent: 88.9 ns +- 1.7 ns -> 91.5 ns +- 0.6 ns: 1.03x slower
- spectral_norm: 93.2 ms +- 1.6 ms -> 95.5 ms +- 2.3 ms: 1.02x slower
- richards: 44.5 ms +- 0.9 ms -> 45.5 ms +- 1.0 ms: 1.02x slower
- scimark_fft: 303 ms +- 5 ms -> 310 ms +- 3 ms: 1.02x slower
- pickle_pure_python: 287 us +- 3 us -> 293 us +- 3 us: 1.02x slower
- chaos: 65.6 ms +- 1.0 ms -> 67.0 ms +- 1.0 ms: 1.02x slower
- thrift: 728 us +- 14 us -> 740 us +- 12 us: 1.02x slower
- pathlib: 17.8 ms +- 0.2 ms -> 18.1 ms +- 0.3 ms: 1.02x slower
- deltablue: 3.21 ms +- 0.04 ms -> 3.25 ms +- 0.04 ms: 1.01x slower
- regex_compile: 126 ms +- 1 ms -> 128 ms +- 1 ms: 1.01x slower
- scimark_lu: 103 ms +- 2 ms -> 105 ms +- 2 ms: 1.01x slower
- go: 132 ms +- 1 ms -> 133 ms +- 1 ms: 1.01x slower
- pyflate: 393 ms +- 4 ms -> 398 ms +- 5 ms: 1.01x slower
- json: 4.64 ms +- 0.12 ms -> 4.69 ms +- 0.11 ms: 1.01x slower
- logging_format: 6.27 us +- 0.11 us -> 6.34 us +- 0.11 us: 1.01x slower
- nqueens: 78.6 ms +- 0.9 ms -> 79.5 ms +- 1.2 ms: 1.01x slower
- hexiom: 5.92 ms +- 0.02 ms -> 5.98 ms +- 0.02 ms: 1.01x slower
- tornado_http: 91.5 ms +- 1.3 ms -> 92.4 ms +- 1.3 ms: 1.01x slower
- regex_v8: 21.5 ms +- 0.3 ms -> 21.7 ms +- 0.2 ms: 1.01x slower
- xml_etree_parse: 157 ms +- 3 ms -> 158 ms +- 5 ms: 1.01x slower
- nbody: 91.8 ms +- 1.4 ms -> 92.6 ms +- 0.7 ms: 1.01x slower
- sympy_sum: 158 ms +- 1 ms -> 159 ms +- 2 ms: 1.01x slower
- telco: 6.41 ms +- 0.09 ms -> 6.46 ms +- 0.09 ms: 1.01x slower
- unpickle_list: 4.98 us +- 0.04 us -> 5.01 us +- 0.06 us: 1.01x slower
- 2to3: 248 ms +- 1 ms -> 250 ms +- 1 ms: 1.01x slower
- scimark_sor: 110 ms +- 1 ms -> 110 ms +- 1 ms: 1.01x slower
- crypto_pyaes: 72.2 ms +- 0.5 ms -> 72.5 ms +- 0.5 ms: 1.00x slower
- python_startup: 8.19 ms +- 0.01 ms -> 8.22 ms +- 0.01 ms: 1.00x slower
- python_startup_no_site: 6.05 ms +- 0.01 ms -> 6.06 ms +- 0.00 ms: 1.00x slower
- flaskblogging: 4.26 ms +- 0.01 ms -> 4.26 ms +- 0.02 ms: 1.00x slower

Faster (12):
- pickle_list: 4.14 us +- 0.04 us -> 3.98 us +- 0.05 us: 1.04x faster
- raytrace: 294 ms +- 7 ms -> 286 ms +- 6 ms: 1.03x faster
- regex_dna: 207 ms +- 0 ms -> 202 ms +- 0 ms: 1.02x faster
- regex_effbot: 3.58 ms +- 0.03 ms -> 3.51 ms +- 0.01 ms: 1.02x faster
- mako: 9.47 ms +- 0.03 ms -> 9.30 ms +- 0.06 ms: 1.02x faster
- pickle: 10.4 us +- 0.1 us -> 10.2 us +- 0.1 us: 1.02x faster
- pickle_dict: 30.6 us +- 0.1 us -> 30.3 us +- 0.1 us: 1.01x faster
- chameleon: 6.43 ms +- 0.15 ms -> 6.38 ms +- 0.05 ms: 1.01x faster
- scimark_monte_carlo: 65.6 ms +- 0.9 ms -> 65.1 ms +- 1.1 ms: 1.01x faster
- django_template: 32.2 ms +- 0.4 ms -> 31.9 ms +- 0.3 ms: 1.01x faster
- json_loads: 24.3 us +- 0.2 us -> 24.2 us +- 0.2 us: 1.01x faster
- xml_etree_process: 52.2 ms +- 0.6 ms -> 52.0 ms +- 0.5 ms: 1.00x faster

Benchmark hidden because not significant (16): dulwich_log, fannkuch, float, html5lib, logging_simple, meteor_contest, pidigits, pycparser, sqlite_synth, sympy_expand, sympy_integrate, sympy_str, unpack_sequence, unpickle, unpickle_pure_python, xml_etree_generate

Geometric mean: 1.01x slower

Total hit rates for LOAD_ATTR_CLASS have gone down by 15%, but hit rates for the whole LOAD_ATTR family are largely unchanged. Specialization successes decreased by 3%. 5% more failures are attributed to metaclass __getattribute__ methods, while 7% more failures are due to the named attributes being present on the metaclass itself.

`main`

Name	Count	Self	Cumulative	Miss ratio
LOAD_ATTR_CLASS	102,062,535	0.2%	96.5%	1.6%

specialization stats for LOAD_ATTR family

Kind	Count	Ratio
unquickened	193618648	3.4%
specialization.deferred	629431714	11.2%
specialization.deopt	2359119	0.0%
hit	4675648068	83.1%
miss	129400942	2.3%

Specialization attempts

	Count	Ratio
Success	4,958,258	64.8%
Failure	2,693,208	35.2%

Failure kind	Count	Ratio
not managed dict	586,470	21.8%
overridden	540,857	20.1%
method	384,095	14.3%
non object slot	264,897	9.8%
out of range	254,521	9.5%
metaclass attribute	187,132	6.9%
non overriding descriptor	171,346	6.4%
class method obj	152,554	5.7%
has managed dict	116,971	4.3%
mutable class	112,244	4.2%
builtin class method	53,527	2.0%
shadowed	3,840	0.1%
other	1,669	0.1%
module attr not found	1,648	0.1%
expected error	591	0.0%
property	401	0.0%

This PR

Name	Count	Self	Cumulative	Miss ratio
LOAD_ATTR_CLASS	88,312,319	0.1%	97.0%	0.4%

specialization stats for LOAD_ATTR family

Kind	Count	Ratio
unquickened	190837306	3.4%
specialization.deferred	636947625	11.3%
specialization.deopt	`2321544`	0.0%
hit	4660987261	83.0%
miss	126561331	2.3%

Specialization attempts

	Count	Ratio
Success	4,829,110	61.4%
Failure	3,032,191	38.6%

Failure kind	Count	Ratio
overridden	765,516	25.2%
not managed dict	582,142	19.2%
metaclass attribute	417,214	13.8%
method	383,456	12.6%
non object slot	265,664	8.8%
out of range	253,965	8.4%
class method obj	144,317	4.8%
has managed dict	117,040	3.9%
mutable class	108,085	3.6%
non overriding descriptor	74,494	2.5%
builtin class method	53,548	1.8%
shadowed	3,840	0.1%
module attr not found	1,648	0.1%
expected error	591	0.0%
property	380	0.0%

brandtbucher · 2022-07-14T23:38:24Z

I might prepare a second branch that disallows metaclasses entirely, and compare the results. Again, we should be careful to make sure that enum and sympy performance aren't totally destroyed before choosing to go that route.

Fidget-Spinner · 2022-07-15T05:58:37Z

Wow, these are really comprehensive stats. Thanks Brandt.

FYI, enum attribute access isn't specialised in 3.11. So whatever we do it probably won't hurt 3.11 that much :).

brandtbucher · 2022-07-15T17:27:04Z

Disallowing metaclasses entirely (main...brandtbucher:no-metaclasses) gets rid of the 1% slowdown for the benchmark suite as a whole. It looks like it does come with a ~1% hit to the suite's sympy benchmarks when compared to this PR, but that's actually much better than I was expecting:

Slower (27):
- nqueens: 78.6 ms +- 0.9 ms -> 82.1 ms +- 0.9 ms: 1.04x slower
- fannkuch: 370 ms +- 5 ms -> 386 ms +- 7 ms: 1.04x slower
- go: 132 ms +- 1 ms -> 137 ms +- 1 ms: 1.04x slower
- sqlite_synth: 2.59 us +- 0.05 us -> 2.64 us +- 0.07 us: 1.02x slower
- thrift: 728 us +- 14 us -> 743 us +- 20 us: 1.02x slower
- regex_compile: 126 ms +- 1 ms -> 129 ms +- 1 ms: 1.02x slower
- sympy_sum: 158 ms +- 1 ms -> 160 ms +- 2 ms: 1.02x slower
- pidigits: 190 ms +- 0 ms -> 193 ms +- 0 ms: 1.02x slower
- logging_silent: 88.9 ns +- 1.7 ns -> 90.2 ns +- 1.7 ns: 1.01x slower
- regex_v8: 21.5 ms +- 0.3 ms -> 21.8 ms +- 0.2 ms: 1.01x slower
- hexiom: 5.92 ms +- 0.02 ms -> 5.98 ms +- 0.03 ms: 1.01x slower
- pathlib: 17.8 ms +- 0.2 ms -> 18.0 ms +- 0.2 ms: 1.01x slower
- pickle_dict: 30.6 us +- 0.1 us -> 31.0 us +- 0.1 us: 1.01x slower
- scimark_lu: 103 ms +- 2 ms -> 104 ms +- 2 ms: 1.01x slower
- dulwich_log: 61.1 ms +- 0.4 ms -> 61.7 ms +- 0.4 ms: 1.01x slower
- pyflate: 393 ms +- 4 ms -> 397 ms +- 5 ms: 1.01x slower
- sympy_str: 279 ms +- 4 ms -> 281 ms +- 6 ms: 1.01x slower
- sympy_integrate: 20.1 ms +- 0.1 ms -> 20.3 ms +- 0.2 ms: 1.01x slower
- unpickle_list: 4.98 us +- 0.04 us -> 5.02 us +- 0.09 us: 1.01x slower
- pickle_list: 4.14 us +- 0.04 us -> 4.17 us +- 0.03 us: 1.01x slower
- 2to3: 248 ms +- 1 ms -> 250 ms +- 1 ms: 1.01x slower
- crypto_pyaes: 72.2 ms +- 0.5 ms -> 72.8 ms +- 0.7 ms: 1.01x slower
- xml_etree_iterparse: 103 ms +- 2 ms -> 104 ms +- 2 ms: 1.01x slower
- python_startup: 8.19 ms +- 0.01 ms -> 8.24 ms +- 0.01 ms: 1.01x slower
- python_startup_no_site: 6.05 ms +- 0.01 ms -> 6.08 ms +- 0.01 ms: 1.01x slower
- scimark_fft: 303 ms +- 5 ms -> 304 ms +- 3 ms: 1.00x slower
- flaskblogging: 4.26 ms +- 0.01 ms -> 4.26 ms +- 0.01 ms: 1.00x slower

Faster (20):
- regex_effbot: 3.58 ms +- 0.03 ms -> 3.37 ms +- 0.01 ms: 1.06x faster
- raytrace: 294 ms +- 7 ms -> 282 ms +- 4 ms: 1.04x faster
- regex_dna: 207 ms +- 0 ms -> 202 ms +- 1 ms: 1.02x faster
- scimark_monte_carlo: 65.6 ms +- 0.9 ms -> 64.4 ms +- 0.5 ms: 1.02x faster
- scimark_sparse_mat_mult: 3.93 ms +- 0.09 ms -> 3.86 ms +- 0.08 ms: 1.02x faster
- nbody: 91.8 ms +- 1.4 ms -> 90.3 ms +- 0.5 ms: 1.02x faster
- pickle: 10.4 us +- 0.1 us -> 10.2 us +- 0.1 us: 1.02x faster
- spectral_norm: 93.2 ms +- 1.6 ms -> 91.8 ms +- 0.5 ms: 1.01x faster
- logging_format: 6.27 us +- 0.11 us -> 6.19 us +- 0.07 us: 1.01x faster
- logging_simple: 5.69 us +- 0.08 us -> 5.63 us +- 0.07 us: 1.01x faster
- chameleon: 6.43 ms +- 0.15 ms -> 6.36 ms +- 0.05 ms: 1.01x faster
- django_template: 32.2 ms +- 0.4 ms -> 31.9 ms +- 0.3 ms: 1.01x faster
- json_dumps: 12.0 ms +- 0.1 ms -> 11.9 ms +- 0.1 ms: 1.01x faster
- xml_etree_process: 52.2 ms +- 0.6 ms -> 51.7 ms +- 0.4 ms: 1.01x faster
- deltablue: 3.21 ms +- 0.04 ms -> 3.18 ms +- 0.05 ms: 1.01x faster
- json_loads: 24.3 us +- 0.2 us -> 24.1 us +- 0.2 us: 1.01x faster
- chaos: 65.6 ms +- 1.0 ms -> 65.2 ms +- 0.8 ms: 1.01x faster
- xml_etree_generate: 75.0 ms +- 0.7 ms -> 74.6 ms +- 0.4 ms: 1.01x faster
- scimark_sor: 110 ms +- 1 ms -> 109 ms +- 1 ms: 1.00x faster
- mako: 9.47 ms +- 0.03 ms -> 9.44 ms +- 0.04 ms: 1.00x faster

Benchmark hidden because not significant (14): float, html5lib, json, meteor_contest, pickle_pure_python, pycparser, richards, sympy_expand, telco, tornado_http, unpack_sequence, unpickle, unpickle_pure_python, xml_etree_parse

Geometric mean: 1.00x slower

More stats:

Name	Count	Self	Cumulative	Miss ratio
LOAD_ATTR_CLASS	67,240,997	0.1%	97.5%	0.2%

specialization stats for LOAD_ATTR family

Kind	Count	Ratio
unquickened	192112856	3.5%
specialization.deferred	663275886	12.0%
specialization.deopt	2346873	0.0%
hit	4553072619	82.2%
miss	127895746	2.3%

Specialization attempts

	Count	Ratio
Success	4,842,590	61.2%
Failure	3,070,807	38.8%

Failure kind	Count	Ratio
metaclass attribute	724,696	23.6%
not managed dict	586,451	19.1%
overridden	528,290	17.2%
method	383,637	12.5%
non object slot	264,709	8.6%
out of range	254,092	8.3%
class method obj	128,659	4.2%
has managed dict	116,914	3.8%
mutable class	97,596	3.2%
non overriding descriptor	65,531	2.1%
builtin class method	53,397	1.7%
shadowed	3,840	0.1%
module attr not found	1,648	0.1%
expected error	591	0.0%
property	254	0.0%

It seems that your earlier intuition was spot-on @Fidget-Spinner (I keep forgetting that 3.11 doesn't specialize class attributes). At this point, I'm thinking that we:

Use my other branch, and fail to specialize in the presence of a metaclass. This will be backported to 3.11.
Try adding a LOAD_ATTR_METACLASS opcode to boost performance for things like enum and sympy in 3.12. That way we only need to bother with (apparently expensive) metaclass versions in those rare cases.

I'm planning to move forward with these today.

brandtbucher · 2022-07-16T05:23:05Z

Superseded by #94892.

brandtbucher added 6 commits July 13, 2022 16:37

Add failing regression tests

dbc624c

Check metaclass versions in LOAD_ATTR_CLASS

28d1093

blurb add

3f1e5ce

Add more tests

4f84a5c

Refactor tests

606d460

Don't specialize with metaclass __getattribute__

6864e95

brandtbucher added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) needs backport to 3.11 only security fixes labels Jul 14, 2022

brandtbucher requested review from pablogsal and Fidget-Spinner July 14, 2022 18:55

brandtbucher self-assigned this Jul 14, 2022

brandtbucher requested a review from markshannon as a code owner July 14, 2022 18:55

bedevere-bot added the awaiting core review label Jul 14, 2022

Fidget-Spinner reviewed Jul 14, 2022

View reviewed changes

make patchcheck

03aa529

brandtbucher mentioned this pull request Jul 16, 2022

GH-94822: Don't specialize when metaclasses are involved #94892

Merged

brandtbucher closed this Jul 16, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GH-94822: Respect metaclasses when caching class attributes #94863

GH-94822: Respect metaclasses when caching class attributes #94863

brandtbucher commented Jul 14, 2022 •

edited by bedevere-bot

Loading

brandtbucher commented Jul 14, 2022

Fidget-Spinner Jul 14, 2022

brandtbucher Jul 14, 2022

oscarbenjamin Jul 14, 2022

brandtbucher Jul 14, 2022

Fidget-Spinner commented Jul 14, 2022

brandtbucher commented Jul 14, 2022

Specialization attempts

Specialization attempts

brandtbucher commented Jul 14, 2022 •

edited

Loading

Fidget-Spinner commented Jul 15, 2022

brandtbucher commented Jul 15, 2022 •

edited

Loading

Specialization attempts

brandtbucher commented Jul 16, 2022

GH-94822: Respect metaclasses when caching class attributes #94863

GH-94822: Respect metaclasses when caching class attributes #94863

Conversation

brandtbucher commented Jul 14, 2022 • edited by bedevere-bot Loading

brandtbucher commented Jul 14, 2022

Fidget-Spinner Jul 14, 2022

Choose a reason for hiding this comment

brandtbucher Jul 14, 2022

Choose a reason for hiding this comment

oscarbenjamin Jul 14, 2022

Choose a reason for hiding this comment

brandtbucher Jul 14, 2022

Choose a reason for hiding this comment

Fidget-Spinner commented Jul 14, 2022

brandtbucher commented Jul 14, 2022

main

Specialization attempts

This PR

Specialization attempts

brandtbucher commented Jul 14, 2022 • edited Loading

Fidget-Spinner commented Jul 15, 2022

brandtbucher commented Jul 15, 2022 • edited Loading

Specialization attempts

brandtbucher commented Jul 16, 2022

brandtbucher commented Jul 14, 2022 •

edited by bedevere-bot

Loading

`main`

brandtbucher commented Jul 14, 2022 •

edited

Loading

brandtbucher commented Jul 15, 2022 •

edited

Loading