Skip to content

Commit 0b74f9b

Browse files
committed
Add "dict lookup performance" section
Closes: satwikkansal#208
1 parent 7457ffb commit 0b74f9b

File tree

1 file changed

+32
-0
lines changed

1 file changed

+32
-0
lines changed

README.md

+32
Original file line numberDiff line numberDiff line change
@@ -92,6 +92,7 @@ So, here we go...
9292
* [Section: Miscellaneous](#section-miscellaneous)
9393
+ [`+=` is faster](#--is-faster)
9494
+ [▶ Let's make a giant string!](#-lets-make-a-giant-string)
95+
+ [`dict` lookup performance](#-dict-lookup-performance)
9596
+ [▶ Minor Ones *](#-minor-ones-)
9697
- [Contributing](#contributing)
9798
- [Acknowledgements](#acknowledgements)
@@ -3348,6 +3349,37 @@ Let's increase the number of iterations by a factor of 10.
33483349
33493350
---
33503351
3352+
### ▶ `dict` lookup performance
3353+
3354+
```py
3355+
>>> some_dict = {str(i): 1 for i in range(1_000_000)}
3356+
>>> %timeit some_dict['5']
3357+
28.6 ns ± 0.115 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3358+
>>> some_dict[1] = 1
3359+
>>> %timeit some_dict['5']
3360+
37.2 ns ± 0.265 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3361+
# why did it become much slower?
3362+
```
3363+
3364+
#### 💡 Explanation:
3365+
+ CPython has a generic dictionary lookup function that handles all types of keys (`str`, `int`, any object ...), and a specialized one for the common case of dictionaries composed of `str`-only keys.
3366+
+ The specialized function (named `lookdict_unicode` in CPython's sources) knows all existing keys (including the looked-up key) are strings, and uses the faster & simpler string comparison to compare keys, instead of calling the `__eq__` method.
3367+
+ The first time a `dict` instance is accessed with a non-`str` key, it's modified so future lookups use the generic function.
3368+
+ This process is not reversible for the particular `dict` instance, and the key doesn't even have to exist in the dictionary - attempting a failed lookup has the same effect:
3369+
```py
3370+
>>> some_dict = {str(i): 1 for i in range(1_000_000)}
3371+
>>> %timeit some_dict['5']
3372+
28.5 ns ± 0.142 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3373+
>>> some_dict[1]
3374+
Traceback (most recent call last):
3375+
File "<stdin>", line 1, in <module>
3376+
KeyError: 1
3377+
>>> %timeit some_dict['5']
3378+
38.5 ns ± 0.0913 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3379+
```
3380+
3381+
---
3382+
33513383
### ▶ Minor Ones *
33523384
<!-- Example ID: f885cb82-f1e4-4daa-9ff3-972b14cb1324 --->
33533385
* `join()` is a string operation instead of list operation. (sort of counter-intuitive at first usage)

0 commit comments

Comments
 (0)