@@ -3382,6 +3383,68 @@ Why are same lookups becoming slower?
+ This process is not reversible for the particular `dict` instance, and the key doesn't even have to exist in the dictionary. That's why attempting a failed lookup has the same effect.


### ▶ Bloating instance `dict`s *
<!-- Example ID: fe706ab4-1615-c0ba-a078-76c98cbe3f48 --->
```py
import sys

class SomeClass:
    def __init__(self):
        self.some_attr1 = 1
        self.some_attr2 = 2
        self.some_attr3 = 3
        self.some_attr4 = 4


def dict_size(o):
    return sys.getsizeof(o.__dict__)

```

**Output:** (Python 3.8, other Python 3 versions may vary a little)
```py
>>> o1 = SomeClass()
>>> o2 = SomeClass()
>>> dict_size(o1)
104
>>> dict_size(o2)
104
>>> del o1.some_attr1
>>> o3 = SomeClass()
>>> dict_size(o3)
232
>>> dict_size(o1)
232
```

Let's try again... In a new interpreter:

```py
>>> o1 = SomeClass()
>>> o2 = SomeClass()
>>> dict_size(o1)
104  # as expected
>>> o1.some_attr5 = 5
>>> o1.some_attr6 = 6
>>> dict_size(o1)
360
>>> dict_size(o2)
272
>>> o3 = SomeClass()
>>> dict_size(o3)
232
```

What makes those dictionaries become bloated? And why are newly created objects bloated as well?

#### 💡 Explanation:
+ CPython is able to reuse the same "keys" object in multiple dictionaries. This was added in [PEP 412](https://www.python.org/dev/peps/pep-0412/) with the motivation of reducing memory usage, specifically in dictionaries of instances, where keys (instance attributes) tend to be common to all instances.
+ This optimization is entirely seamless for instance dictionaries, but it is disabled if certain assumptions are broken.
+ Key-sharing dictionaries do not support deletion; if an instance attribute is deleted, the dictionary is "unshared", and key-sharing is disabled for all future instances of the same class.
+ Additionally, if the dictionary keys have to be resized (because new keys are inserted), they are kept shared *only* if they are used by exactly one dictionary (this allows adding many attributes in the `__init__` of the very first created instance, without causing an "unshare"). If multiple instances exist when a resize happens, key-sharing is disabled for all future instances of the same class: CPython can't tell if your instances are using the same set of attributes anymore, and decides to bail out on attempting to share their keys.
+ A small tip, if you aim to lower your program's memory footprint: don't delete instance attributes, and make sure to initialize all attributes in your `__init__`!
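
The two sessions above each need a fresh interpreter, because un-sharing is remembered per class. As a rough sketch, the same experiments can be run from a single script by building a brand-new class for each one. The helper names `make_class`, `deletion_experiment`, and `resize_experiment` are made up for this illustration. The sizes printed depend on the interpreter version, and the bloat may not show up at all on versions other than the one used above, so no expected output is listed.

```py
import sys


def make_class():
    # Hypothetical helper: return a brand-new class on every call, so each
    # experiment starts with its own, untouched key-sharing state.
    class SomeClass:
        def __init__(self):
            self.some_attr1 = 1
            self.some_attr2 = 2
            self.some_attr3 = 3
            self.some_attr4 = 4

    return SomeClass


def dict_size(o):
    return sys.getsizeof(o.__dict__)


def deletion_experiment():
    # Mirrors the first session: delete an attribute, then create a new instance.
    cls = make_class()
    o1, o2 = cls(), cls()
    before = dict_size(o1)
    del o1.some_attr1
    o3 = cls()
    return {
        "before": before,
        "new_instance": dict_size(o3),
        "after_delete": dict_size(o1),
    }


def resize_experiment():
    # Mirrors the second session: add attributes while two instances exist.
    cls = make_class()
    o1, o2 = cls(), cls()
    before = dict_size(o1)
    o1.some_attr5 = 5
    o1.some_attr6 = 6
    o3 = cls()
    return {
        "before": before,
        "grown_instance": dict_size(o1),
        "other_instance": dict_size(o2),
        "new_instance": dict_size(o3),
    }


if __name__ == "__main__":
    print("Python", sys.version_info[:2])
    print("deletion:", deletion_experiment())
    print("resize:  ", resize_experiment())
```

Comparing `"before"` with the other values in each result shows whether the running interpreter bloats the instance dictionaries in the way described above.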


### ▶ Minor Ones *
<!-- Example ID: f885cb82-f1e4-4daa-9ff3-972b14cb1324 --->
* `join()` is a string operation instead of list operation. (sort of counter-intuitive at first usage)