Commit ee0696b

Add "bloating instance dicts" section
Closes: satwikkansal#210
1 parent f72d732 commit ee0696b

2 files changed

+64
-1
lines changed

CONTRIBUTORS.md

+1-1
@@ -21,7 +21,7 @@ Following are the wonderful people (in no specific order) who have contributed t
2121
| Ghost account | N/A | [#96](https://github.com/satwikkansal/wtfpython/issues/96)
2222
| koddo | [koddo](https://github.com/koddo) | [#80](https://github.com/satwikkansal/wtfpython/issues/80), [#73](https://github.com/satwikkansal/wtfpython/issues/73) |
2323
| jab | [jab](https://github.com/jab) | [#77](https://github.com/satwikkansal/wtfpython/issues/77) |
24-
| Jongy | [Jongy](https://github.com/Jongy) | [#208](https://github.com/satwikkansal/wtfpython/issues/208) |
24+
| Jongy | [Jongy](https://github.com/Jongy) | [#208](https://github.com/satwikkansal/wtfpython/issues/208), [#210](https://github.com/satwikkansal/wtfpython/issues/210) |
2525
---
2626

2727
**Translations**

README.md

+63
@@ -93,6 +93,7 @@ So, here we go...
9393
+ [`+=` is faster](#--is-faster)
9494
+ [▶ Let's make a giant string!](#-lets-make-a-giant-string)
9595
+ [▶ Slowing down `dict` lookups *](#-slowing-down-dict-lookups)
96+
+ [▶ Bloating instance `dict`s *](#-bloating-instance-dicts-)
9697
+ [▶ Minor Ones *](#-minor-ones-)
9798
- [Contributing](#contributing)
9899
- [Acknowledgements](#acknowledgements)
@@ -3382,6 +3383,68 @@ Why are same lookups becoming slower?
+ This process is not reversible for the particular `dict` instance, and the key doesn't even have to exist in the dictionary. That's why attempting a failed lookup has the same effect.


### ▶ Bloating instance `dict`s *
<!-- Example ID: fe706ab4-1615-c0ba-a078-76c98cbe3f48 --->
```py
import sys

class SomeClass:
    def __init__(self):
        self.some_attr1 = 1
        self.some_attr2 = 2
        self.some_attr3 = 3
        self.some_attr4 = 4


def dict_size(o):
    return sys.getsizeof(o.__dict__)

```

**Output:** (Python 3.8, other Python 3 versions may vary a little)
```py
>>> o1 = SomeClass()
>>> o2 = SomeClass()
>>> dict_size(o1)
104
>>> dict_size(o2)
104
>>> del o1.some_attr1
>>> o3 = SomeClass()
>>> dict_size(o3)
232
>>> dict_size(o1)
232
```

Let's try again... In a new interpreter:

```py
>>> o1 = SomeClass()
>>> o2 = SomeClass()
>>> dict_size(o1)
104  # as expected
>>> o1.some_attr5 = 5
>>> o1.some_attr6 = 6
>>> dict_size(o1)
360
>>> dict_size(o2)
272
>>> o3 = SomeClass()
>>> dict_size(o3)
232
```
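
Both sessions can also be reproduced from a single script. The sketch below is not part of the original snippet: `make_class`, `deletion_experiment` and `insertion_experiment` are hypothetical helper names, and it assumes that the key-sharing state is tracked per class, so defining a fresh class for each experiment stands in for starting a new interpreter. The printed sizes depend on the Python version, so none are listed here.

```py
import sys


def make_class():
    """Return a brand-new class so that each experiment starts from a clean state.

    Assumption: key-sharing is tracked per class, so a fresh class
    replaces the "new interpreter" step of the interactive sessions.
    """
    class SomeClass:
        def __init__(self):
            self.some_attr1 = 1
            self.some_attr2 = 2
            self.some_attr3 = 3
            self.some_attr4 = 4
    return SomeClass


def dict_size(o):
    return sys.getsizeof(o.__dict__)


def deletion_experiment():
    cls = make_class()
    o1, o2 = cls(), cls()  # o2 only exists so that two instances are alive
    before = dict_size(o1)
    del o1.some_attr1      # deleting an attribute from one instance
    o3 = cls()             # instance created after the deletion
    return {"before": before, "after_del": dict_size(o1), "new_instance": dict_size(o3)}


def insertion_experiment():
    cls = make_class()
    o1, o2 = cls(), cls()
    before = dict_size(o1)
    o1.some_attr5 = 5      # adding attributes outside __init__
    o1.some_attr6 = 6
    o3 = cls()             # instance created after the insertions
    return {
        "before": before,
        "o1_after_insert": dict_size(o1),
        "o2_untouched": dict_size(o2),
        "new_instance": dict_size(o3),
    }


results = {"deletion": deletion_experiment(), "insertion": insertion_experiment()}

print(sys.version)
for name, sizes in results.items():
    print(name, sizes)
```

Running it under several Python versions is a quick way to see how much the numbers in the sessions above depend on the interpreter.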

What makes those dictionaries become bloated? And why are newly created objects bloated as well?

#### 💡 Explanation:
+ CPython is able to reuse the same "keys" object in multiple dictionaries. This was added in [PEP 412](https://www.python.org/dev/peps/pep-0412/) with the motivation to reduce memory usage, specifically in dictionaries of instances, where keys (instance attributes) tend to be common to all instances.
+ This optimization is entirely seamless for instance dictionaries, but it is disabled if certain assumptions are broken.
+ Key-sharing dictionaries do not support deletion; if an instance attribute is deleted, the dictionary is "unshared", and key-sharing is disabled for all future instances of the same class.
+ Additionally, if the dictionary keys have been resized (because new keys are inserted), they are kept shared *only* if they are used by exactly one dictionary (this allows adding many attributes in the `__init__` of the very first created instance, without causing an "unshare"). If multiple instances exist when a resize happens, key-sharing is disabled for all future instances of the same class: CPython can't tell if your instances are using the same set of attributes anymore, and decides to bail out on attempting to share their keys.
+ A small tip, if you aim to lower your program's memory footprint: don't delete instance attributes, and make sure to initialize all attributes in your `__init__`!


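The tip above could look like this in practice. This is only a minimal sketch: the `Connection` class and its attributes are hypothetical and not taken from the original text. Every attribute is assigned in `__init__`, and an attribute that is no longer needed is reset to `None` instead of being removed with `del`, so all instances keep the same set of attributes.

```py
class Connection:  # hypothetical example class
    def __init__(self, host):
        self.host = host
        self.socket = None       # value not known yet, but declared up front
        self.last_error = None   # same here: no attribute is added later

    def open(self):
        # Stand-in string for a real socket object, to keep the sketch self-contained.
        self.socket = f"socket to {self.host}"

    def close(self):
        # Reset the attribute instead of `del self.socket`.
        self.socket = None


a = Connection("a.example")
b = Connection("b.example")
a.open()
a.close()

# Both instances still expose the same attribute names, in the same order.
same_layout = list(vars(a)) == list(vars(b))
print(same_layout)
```

Whether this saves memory in a given program still depends on the CPython version, so it is worth measuring with `sys.getsizeof` as in the sessions above.
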
### ▶ Minor Ones *
<!-- Example ID: f885cb82-f1e4-4daa-9ff3-972b14cb1324 --->
* `join()` is a string operation instead of list operation. (sort of counter-intuitive at first usage)
