Skip to content

Commit f97cbdd

Browse files
committed
Minor updates to slowinig down dict lookups example
1 parent 098d71f commit f97cbdd

File tree

1 file changed

+18
-16
lines changed

1 file changed

+18
-16
lines changed

README.md

+18-16
Original file line numberDiff line numberDiff line change
@@ -92,7 +92,7 @@ So, here we go...
9292
* [Section: Miscellaneous](#section-miscellaneous)
9393
+ [`+=` is faster](#--is-faster)
9494
+ [▶ Let's make a giant string!](#-lets-make-a-giant-string)
95-
+ [`dict` lookup performance](#-dict-lookup-performance)
95+
+ [`dict` lookup performance](#-slowing-down-dict-lookups)
9696
+ [▶ Minor Ones *](#-minor-ones-)
9797
- [Contributing](#contributing)
9898
- [Acknowledgements](#acknowledgements)
@@ -3349,36 +3349,38 @@ Let's increase the number of iterations by a factor of 10.
33493349
33503350
---
33513351
3352-
### ▶ `dict` lookup performance
3352+
### ▶ Slowing down `dict` lookups
33533353
33543354
```py
3355-
>>> some_dict = {str(i): 1 for i in range(1_000_000)}
3355+
some_dict = {str(i): 1 for i in range(1_000_000)}
3356+
another_dict = {str(i): 1 for i in range(1_000_000)}
3357+
```
3358+
3359+
**Output:**
3360+
```py
33563361
>>> %timeit some_dict['5']
33573362
28.6 ns ± 0.115 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
33583363
>>> some_dict[1] = 1
33593364
>>> %timeit some_dict['5']
33603365
37.2 ns ± 0.265 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3361-
# why did it become much slower?
3362-
```
33633366
3364-
#### 💡 Explanation:
3365-
+ CPython has a generic dictionary lookup function that handles all types of keys (`str`, `int`, any object ...), and a specialized one for the common case of dictionaries composed of `str`-only keys.
3366-
+ The specialized function (named `lookdict_unicode` in CPython's sources) knows all existing keys (including the looked-up key) are strings, and uses the faster & simpler string comparison to compare keys, instead of calling the `__eq__` method.
3367-
+ The first time a `dict` instance is accessed with a non-`str` key, it's modified so future lookups use the generic function.
3368-
+ This process is not reversible for the particular `dict` instance, and the key doesn't even have to exist in the dictionary - attempting a failed lookup has the same effect:
3369-
```py
3370-
>>> some_dict = {str(i): 1 for i in range(1_000_000)}
3371-
>>> %timeit some_dict['5']
3367+
>>> %timeit another_dict['5']
33723368
28.5 ns ± 0.142 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3373-
>>> some_dict[1]
3369+
>>> another_dict[1] # Trying to access a key that doesn't exist
33743370
Traceback (most recent call last):
33753371
File "<stdin>", line 1, in <module>
33763372
KeyError: 1
3377-
>>> %timeit some_dict['5']
3373+
>>> %timeit another_dict['5']
33783374
38.5 ns ± 0.0913 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
33793375
```
3376+
Why are same lookups becoming slower?
3377+
3378+
#### 💡 Explanation:
3379+
+ CPython has a generic dictionary lookup function that handles all types of keys (`str`, `int`, any object ...), and a specialized one for the common case of dictionaries composed of `str`-only keys.
3380+
+ The specialized function (named `lookdict_unicode` in CPython's [source](https://github.com/python/cpython/blob/522691c46e2ae51faaad5bbbce7d959dd61770df/Objects/dictobject.c#L841)) knows all existing keys (including the looked-up key) are strings, and uses the faster & simpler string comparison to compare keys, instead of calling the `__eq__` method.
3381+
+ The first time a `dict` instance is accessed with a non-`str` key, it's modified so future lookups use the generic function.
3382+
+ This process is not reversible for the particular `dict` instance, and the key doesn't even have to exist in the dictionary. That's why attempting a failed lookup has the same effect.
33803383
3381-
---
33823384
33833385
### ▶ Minor Ones *
33843386
<!-- Example ID: f885cb82-f1e4-4daa-9ff3-972b14cb1324 --->

0 commit comments

Comments
 (0)