You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
@@ -3348,6 +3349,37 @@ Let's increase the number of iterations by a factor of 10.
3348
3349
3349
3350
---
3350
3351
3352
+
### ▶ `dict` lookup performance
3353
+
3354
+
```py
3355
+
>>> some_dict = {str(i): 1 for i in range(1_000_000)}
3356
+
>>> %timeit some_dict['5']
3357
+
28.6 ns ± 0.115 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3358
+
>>> some_dict[1] = 1
3359
+
>>> %timeit some_dict['5']
3360
+
37.2 ns ± 0.265 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3361
+
# why did it become much slower?
3362
+
```
3363
+
3364
+
#### 💡 Explanation:
3365
+
+ CPython has a generic dictionary lookup function that handles all types of keys (`str`, `int`, any object ...), and a specialized one for the common case of dictionaries composed of `str`-only keys.
3366
+
+ The specialized function (named `lookdict_unicode` in CPython's sources) knows all existing keys (including the looked-up key) are strings, and uses the faster & simpler string comparison to compare keys, instead of calling the `__eq__` method.
3367
+
+ The first time a `dict` instance is accessed with a non-`str` key, it's modified so future lookups use the generic function.
3368
+
+ This process is not reversible for the particular `dict` instance, and the key doesn't even have to exist in the dictionary - attempting a failed lookup has the same effect:
3369
+
```py
3370
+
>>> some_dict = {str(i): 1 for i in range(1_000_000)}
3371
+
>>> %timeit some_dict['5']
3372
+
28.5 ns ± 0.142 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3373
+
>>> some_dict[1]
3374
+
Traceback (most recent call last):
3375
+
File "<stdin>", line 1, in <module>
3376
+
KeyError: 1
3377
+
>>> %timeit some_dict['5']
3378
+
38.5 ns ± 0.0913 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
3379
+
```
3380
+
3381
+
---
3382
+
3351
3383
### ▶ Minor Ones *
3352
3384
<!-- Example ID: f885cb82-f1e4-4daa-9ff3-972b14cb1324 --->
3353
3385
* `join()` is a string operation instead of list operation. (sort of counter-intuitive at first usage)
0 commit comments