Commit 5bf46f6

Bring compiler optimisation stuff back, and add minor updates.
1 parent 312c5d5 commit 5bf46f6

1 file changed

+56
-25
lines changed

README.md

Lines changed: 56 additions & 25 deletions
@@ -297,6 +297,10 @@ True
297297
>>> b = "wtf!"
298298
>>> a is b
299299
False
300+
301+
>>> a, b = "wtf!", "wtf!"
302+
>>> a is b
303+
True
300304
```
301305
302306
3\.
@@ -320,6 +324,7 @@ Makes sense, right?
320324
* Strings are interned at compile time (`'wtf'` will be interned but `''.join(['w', 't', 'f'])` will not be interned)
321325
* Strings that contain characters other than ASCII letters, digits, or underscores are not interned. This explains why `'wtf!'` was not interned: it contains `!`. The CPython implementation of this rule can be found [here](https://github.com/python/cpython/blob/3.6/Objects/codeobject.c#L19)
322326
<img src="images/string-intern/string_intern.png" alt="">
327+
+ When `a` and `b` are set to `"wtf!"` on the same line, the Python interpreter creates one new object and binds both variables to it. If you do it on separate lines, it doesn't "know" that a `"wtf!"` object already exists (because `"wtf!"` is not implicitly interned, as per the facts mentioned above). It's a compiler optimization and specifically applies to the interactive environment. This optimization doesn't apply to 3.7.x versions of CPython (check this [issue](https://github.com/satwikkansal/wtfpython/issues/100) for more discussion).
323328
+ The abrupt change in output of the third snippet is due to a [peephole optimization](https://en.wikipedia.org/wiki/Peephole_optimization) technique known as constant folding. This means the expression `'a'*20` is replaced by `'aaaaaaaaaaaaaaaaaaaa'` during compilation to save a few clock cycles at runtime. Constant folding only occurs when the resulting string has a length of at most 20. (Why? Imagine the size of the `.pyc` file generated as a result of the expression `'a'*10**10`.) [Here's](https://github.com/python/cpython/blob/3.6/Python/peephole.c#L288) the implementation source for the same.
324329
+ Note: In Python 3.7, constant folding was moved out of the peephole optimizer into the new AST optimizer, with some changes in logic as well, so the third snippet doesn't work for Python 3.7. You can read more about the change [here](https://bugs.python.org/issue11549).
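
The interning rules above can also be explored with explicit interning. The following is a minimal sketch, not part of the original README: it builds two equal strings at runtime (so the compiler cannot share a constant) and then uses `sys.intern` from the standard library. The variable names are illustrative only.

```python
import sys

# Build two equal strings at runtime; neither is a compile-time constant.
a = "".join(["w", "t", "f", "!"])
b = "".join(["w", "t", "f", "!"])

# Equality compares contents and is always reliable.
print(a == b)  # True

# Whether `a is b` holds here is implementation-specific, so it is not relied on.

# sys.intern returns one canonical object per distinct string value,
# so both calls hand back the same object.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True
```

The takeaway is the usual one: compare strings with `==`, and treat `is` on strings as an observation about the interpreter rather than about your data.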
325330
@@ -771,7 +776,8 @@ True
771776
```
772777
773778
3\.
774-
**Output (< Python 3.7)**
779+
**Output**
780+
775781
```
776782
>>> a, b = 257, 257
777783
True
@@ -781,7 +787,8 @@ True
781787
True
782788
```
783789
784-
**Output (Python 3.7)**
790+
**Output (Python 3.7.x specifically)**
791+
785792
```
786793
>>> a, b = 257, 257
787794
False
@@ -836,7 +843,8 @@ Similar optimization applies to other **immutable** objects like empty tuples as
836843
837844
**Both `a` and `b` refer to the same object when initialized with same value in the same line.**
838845
839-
**Output (< Python 3.7)**
846+
**Output**
847+
840848
```py
841849
>>> a, b = 257, 257
842850
>>> id(a)
@@ -852,7 +860,17 @@ Similar optimization applies to other **immutable** objects like empty tuples as
852860
```
853861
854862
* When `a` and `b` are set to `257` on the same line, the Python interpreter creates one new object and binds both variables to it. If you do it on separate lines, it doesn't "know" that a `257` object already exists.
855-
* It's a compiler optimization and specifically applies to the interactive environment. When you enter two lines in a live interpreter, they're compiled separately, therefore optimized separately. If you were to try this example in a `.py` file, you would not see the same behavior, because the file is compiled all at once.
863+
864+
* It's a compiler optimization and specifically applies to the interactive environment. When you enter two lines in a live interpreter, they're compiled separately, therefore optimized separately. If you were to try this example in a `.py` file, you would not see the same behavior, because the file is compiled all at once. This optimization is not limited to integers; it works for other immutable data types like strings and floats as well.
865+
866+
```py
867+
>>> a, b = 257.0, 257.0
868+
>>> a is b
869+
True
870+
```
871+
872+
873+
856874
* Why didn't this work for Python 3.7? The abstract reason is that such compiler optimizations are implementation-specific (i.e. they may change with version, OS, etc.). I'm still figuring out exactly which implementation change caused the issue; you can check out this [issue](https://github.com/satwikkansal/wtfpython/issues/100) for updates.
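
The "compiled all at once" versus "compiled separately" distinction can be approximated without a live interpreter. This is a rough sketch, assuming CPython: `shares_constant` and `compiled_separately` are hypothetical helper names, and the built-in `compile` with mode `"single"` is used here only as a stand-in for how an interactive session handles one line at a time.

```python
SOURCE = "a = 257\nb = 257\n"


def shares_constant(source):
    """Compile the whole source as one unit, as happens for a .py file."""
    code = compile(source, "<example>", "exec")
    namespace = {}
    exec(code, namespace)
    # Both assignments load the same entry from the code object's constants.
    return namespace["a"] is namespace["b"]


def compiled_separately():
    """Compile each line on its own, roughly like two interactive inputs."""
    namespace = {}
    for line in ("a = 257", "b = 257"):
        exec(compile(line, "<example>", "single"), namespace)
    # The result is implementation-specific, so no value is promised here.
    return namespace["a"] is namespace["b"]


print(shares_constant(SOURCE))  # True on CPython, where equal constants in one code object are merged
print(compiled_separately())
```

The second result is printed without an expected value on purpose: it depends on the interpreter and version, which is the point the explanation makes.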
857875
858876
---
@@ -1189,6 +1207,7 @@ Before Python 3.5, the boolean value for `datetime.time` object was considered t
11891207
### ▶ What's wrong with booleans?
11901208
<!-- Example ID: 0bba5fa7-9e6d-4cd2-8b94-952d061af5dd --->
11911209
1\.
1210+
11921211
```py
11931212
# A simple example to count the number of booleans and
11941213
# integers in an iterable of mixed data types.
@@ -1222,15 +1241,33 @@ for item in mixed_list:
12221241
''
12231242
```
12241243
1244+
3\.
1245+
1246+
```py
1247+
True = False
1248+
if True == False:
1249+
print("I've lost faith in truth!")
1250+
```
1251+
1252+
**Output (< 3.x):**
1253+
1254+
```
1255+
I've lost faith in truth!
1256+
```
1257+
1258+
1259+
12251260
#### 💡 Explanation:
12261261
12271262
* `bool` is a subclass of `int` in Python
1263+
12281264
```py
12291265
>>> issubclass(bool, int)
12301266
True
12311267
>>> issubclass(int, bool)
12321268
False
12331269
```
1270+
12341271
* And thus, `True` and `False` are instances of `int`
12351272
```py
12361273
>>> isinstance(True, int)
@@ -1249,6 +1286,10 @@ for item in mixed_list:
12491286
12501287
* See this StackOverflow [answer](https://stackoverflow.com/a/8169049/4354153) for the rationale behind it.
12511288
1289+
* Initially, Python had no `bool` type (people used 0 for false and a non-zero value like 1 for true). `True`, `False`, and a `bool` type were added in 2.x versions, but, for backward compatibility, `True` and `False` couldn't be made constants. They were just built-in variables, and it was possible to reassign them.
1290+
1291+
* Python 3 was backward-incompatible, so the issue was finally fixed there, and thus the last snippet won't work with Python 3.x!
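
Because `bool` is a subclass of `int`, a counter that tests for `int` first will also count booleans. One way to avoid that is sketched below, testing for `bool` before `int`. The function names and the sample list are illustrative, not taken from the README. The second helper checks, on Python 3, that `True = False` is rejected at compile time.

```python
def count_bools_and_ints(items):
    """Count booleans and non-boolean integers separately."""
    bools = ints = 0
    for item in items:
        # bool must be tested first, since isinstance(True, int) is also True.
        if isinstance(item, bool):
            bools += 1
        elif isinstance(item, int):
            ints += 1
    return bools, ints


def can_reassign_true():
    """Return True if `True = False` compiles (it does not on Python 3)."""
    try:
        compile("True = False", "<example>", "exec")
    except SyntaxError:
        return False
    return True


mixed = [False, 1.0, "some_string", 3, True, [], False]
print(count_bools_and_ints(mixed))  # (3, 1)
print(can_reassign_true())          # False on Python 3
```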
1292+
12521293
---
12531294
12541295
### ▶ Class attributes and instance attributes
@@ -1510,27 +1551,6 @@ NameError: name 'e' is not defined
15101551
15111552
---
15121553
1513-
### ▶ When True is actually False
1514-
<!-- Example ID: c8317047-48ae-4306-af5a-04c6d8b7c2b9 --->
1515-
```py
1516-
True = False
1517-
if True == False:
1518-
print("I've lost faith in truth!")
1519-
```
1520-
1521-
**Output (< 3.x):**
1522-
1523-
```
1524-
I've lost faith in truth!
1525-
```
1526-
1527-
#### 💡 Explanation:
1528-
1529-
- Initially, Python used to have no `bool` type (people used 0 for false and non-zero value like 1 for true). Then they added `True`, `False`, and a `bool` type, but, for backward compatibility, they couldn't make `True` and `False` constants- they just were built-in variables.
1530-
- Python 3 was backward-incompatible, so it was now finally possible to fix that, and so this example won't work with Python 3.x!
1531-
1532-
---
1533-
15341554
### ▶ Yielding from... return!
15351555
15361556
1\.
@@ -3346,6 +3366,17 @@ f()
33463366
33473367
* `int('١٢٣٤٥٦٧٨٩')` returns `123456789` in Python 3. In Python, Decimal characters include digit characters, and all characters that can be used to form decimal-radix numbers, e.g. U+0660, ARABIC-INDIC DIGIT ZERO. Here's an [interesting story](http://chris.improbable.org/2014/8/25/adventures-in-unicode-digits/) related to this behavior of Python.
33483368
3369+
* You can separate numeric literals with underscores (for better readability) from Python 3.6 onwards.
3370+
3371+
```py
3372+
>>> six_million = 6_000_000
3373+
>>> six_million
3374+
6000000
3375+
>>> hex_address = 0xF00D_CAFE
3376+
>>> hex_address
3377+
4027435774
3378+
```
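
Underscores are also accepted when parsing and formatting numbers, not only in literals. A small sketch, assuming Python 3.6 or later (the variable names are illustrative):

```python
# int() accepts underscores between digits in a string, as in a literal.
six_million = int("6_000_000")
print(six_million)  # 6000000

# The "_" format option groups decimal digits in threes.
grouped_decimal = format(six_million, "_")
print(grouped_decimal)  # 6_000_000

# With the "x" presentation type, "_" groups hex digits in fours.
grouped_hex = format(0xF00D_CAFE, "_x")
print(grouped_hex)  # f00d_cafe
```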
3379+
33493380
* `'abc'.count('') == 4`. Here's an approximate implementation of the `count` method, which makes things clearer
33503381
```py
33513382
def count(s, sub):
