* Add another function `add_bytes_with_plus` actually illustrating
quadratic behavior for the `+=` operator.
* Add an explanation for the linear behavior due to `+=` optimizations
in case of strings.
* Change the order of examples (move string interning example just
before the giant string example).
Closes satwikkansal#38
+ `+=` is faster than `+` for concatenating more than two strings because the first string (example, `s1` for `s1 += s2 + s3`) is not destroyed while calculating the complete string.
427
+
+ Both strings refer to the same object because of a CPython optimization that tries to reuse existing immutable objects in some cases (implementation specific) rather than creating a new object every time. You can read more about this [here](https://stackoverflow.com/questions/24245324/about-the-changing-id-of-an-immutable-string).
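As a rough illustration of the object sharing described above, the sketch below uses `sys.intern` from the standard library, which returns one shared object for equal strings. The variable names are hypothetical. Whether two equal strings share an object without explicit interning is implementation specific, so the sketch does not assert that.

```py
import sys

# Build an equal string at runtime so it is not a compile-time constant.
parts = ["some", "_", "string"]
built = "".join(parts)
literal = "some_string"

# Equal values always compare equal, whatever their identity.
assert built == literal

# Explicit interning returns one shared object for equal strings.
interned_built = sys.intern(built)
interned_literal = sys.intern(literal)
same_object = interned_built is interned_literal
print(same_object)  # True
```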
428
+
429
+
---
430
+
409
431
### Let's make a giant string!
410
432
411
433
This is not a WTF at all, just some nice things to be aware of :)
- You can read more about [timeit](https://docs.python.org/3/library/timeit.html) in the Python documentation. It is generally used to measure the execution time of snippets.
452
-
- Don't use `+` for generating long strings. In Python, `str` is immutable, so the left and right strings have to be copied into the new string for every pair of concatenations. If you concatenate four strings of length 10, you'll be copying (10+10) + ((10+10)+10) + (((10+10)+10)+10) = 90 characters instead of just 40 characters. Things get quadratically worse as the number and size of the strings increase.
453
-
- Therefore, it's advised to use `.format` or `%` syntax (however, they are slightly slower than `+` for short strings).
454
-
- Or better, if you already have the contents available in the form of an iterable object, use `''.join(iterable_object)`, which is much faster.
455
-
456
-
---
457
-
458
-
### String interning
480
+
Let's increase the number of iterations by a factor of 10.
459
481
460
482
```py
461
-
>>> a = "some_string"
462
-
>>> id(a)
463
-
140420665652016
464
-
>>> id("some" + "_" + "string") # Notice that both the ids are same.
>>> timeit(add_string_with_format(100000)) # Linear increase
488
+
100 loops, best of 3: 5.25 ms per loop
489
+
>>> timeit(add_string_with_join(100000)) # Linear increase
490
+
100 loops, best of 3: 9.85 ms per loop
491
+
>>> l = ["xyz"]*100000
492
+
>>> timeit(convert_list_to_string(l, 100000)) # Linear increase
493
+
1000 loops, best of 3: 723 µs per loop
472
494
```
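The timing lines above resemble interactive `timeit` output. Outside an interactive shell, a comparable measurement can be sketched with the standard `timeit` module. The function body and the `time_call` helper below are assumptions for illustration (the original definitions are not shown in this part of the diff), and absolute timings will differ from machine to machine.

```py
import timeit

def add_string_with_join(iters):
    # Hypothetical body: collect the pieces, then join them once.
    pieces = []
    for _ in range(iters):
        pieces.append("xyz")
    s = "".join(pieces)
    assert len(s) == 3 * iters
    return s

def time_call(func, iters, repeat=3, number=10):
    # Best-of-`repeat` average time per call, in seconds.
    timings = timeit.repeat(lambda: func(iters), repeat=repeat, number=number)
    return min(timings) / number

small = time_call(add_string_with_join, 1_000)
large = time_call(add_string_with_join, 10_000)
print(f"1,000 iterations:  {small * 1e6:.1f} µs per call")
print(f"10,000 iterations: {large * 1e6:.1f} µs per call")
```

Comparing the two printed figures gives a rough sense of how the running time scales with the number of iterations.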
473
495
474
-
#### 💡 Explanation:
475
-
+ `+=` is faster than `+` for concatenating more than two strings because the first string (example, `s1` for `s1 += s2 + s3`) is not destroyed while calculating the complete string.
476
-
+ Both strings refer to the same object because of a CPython optimization that tries to reuse existing immutable objects in some cases (implementation specific) rather than creating a new object every time. You can read more about this [here](https://stackoverflow.com/questions/24245324/about-the-changing-id-of-an-immutable-string).
496
+
#### 💡 Explanation
497
+
- You can read more about [timeit](https://docs.python.org/3/library/timeit.html) in the Python documentation. It is generally used to measure the execution time of snippets.
498
+
- Don't use `+` for generating long strings. In Python, `str` is immutable, so the left and right strings have to be copied into the new string for every pair of concatenations. If you concatenate four strings of length 10, you'll be copying (10+10) + ((10+10)+10) + (((10+10)+10)+10) = 90 characters instead of just 40 characters. Things get quadratically worse as the number and size of the strings increase (as the execution times of the `add_bytes_with_plus` function show).
499
+
- Therefore, it's advised to use `.format` or `%` syntax (however, they are slightly slower than `+` for short strings).
500
+
- Or better, if you already have the contents available in the form of an iterable object, use `''.join(iterable_object)`, which is much faster.
501
+
- `add_string_with_plus` didn't show a quadratic increase in execution time, unlike `add_bytes_with_plus`, because of the `+=` optimizations discussed in the previous example. Had the statement been `s = s + "x" + "y" + "z"` instead of `s += "xyz"`, the increase would have been quadratic.
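The copy count quoted in the list above (90 characters versus 40) can be checked with a small sketch. It models every `+` as copying both operands into a new string, so it deliberately ignores the CPython `+=` optimization. `chars_copied_by_plus` and `chars_copied_by_join` are hypothetical helpers, not part of the original examples.

```py
def chars_copied_by_plus(lengths):
    # Model: every `left + right` copies len(left) + len(right) characters.
    copied = 0
    current = lengths[0]
    for n in lengths[1:]:
        current += n       # length of the newly built string
        copied += current  # both operands are copied into it
    return copied

def chars_copied_by_join(lengths):
    # Model: join sizes the result once and copies each piece a single time.
    return sum(lengths)

four_strings = [10, 10, 10, 10]
print(chars_copied_by_plus(four_strings))  # 90
print(chars_copied_by_join(four_strings))  # 40
```

Under the same model, 100 strings of length 10 cost 50,490 copied characters with `+` but only 1,000 with `join`, which is the quadratic versus linear gap described above.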
502
+
```py
def add_string_with_plus(iters):
    s = ""
    for i in range(iters):
        s = s + "x" + "y" + "z"
    assert len(s) == 3*iters

>>> timeit(add_string_with_plus(10000))
100 loops, best of 3: 9.87 ms per loop
>>> timeit(add_string_with_plus(100000)) # Quadratic increase in execution time
```