|
| 1 | +# Handy stuff: Strings |
| 2 | + |
| 3 | +Python strings are just pieces of text. |
| 4 | + |
| 5 | +```py |
| 6 | +>>> our_string = "Hello World!" |
| 7 | +>>> our_string |
| 8 | +'Hello World!' |
| 9 | +>>> |
| 10 | +``` |
| 11 | + |
| 12 | +So far we know how to add them together. |
| 13 | + |
| 14 | +```py |
| 15 | +>>> "I said: " + our_string |
| 16 | +'I said: Hello World!' |
| 17 | +>>> |
| 18 | +``` |
| 19 | + |
| 20 | +We also know how to repeat them multiple times. |
| 21 | + |
| 22 | +```py |
| 23 | +>>> our_string * 3 |
| 24 | +'Hello World!Hello World!Hello World!' |
| 25 | +>>> |
| 26 | +``` |
| 27 | + |
| 28 | +Also note that these operations return a new string, and the original string |
| 29 | +is never modified. |
| 30 | + |
| 31 | +```py |
| 32 | +>>> our_string = "Hello World!" |
| 33 | +>>> our_string |
| 34 | +'Hello World!' |
| 35 | +>>> |
| 36 | +``` |
| 37 | + |
| 38 | +Python strings are **immutable**. That's basically a fancy way to say that |
| 39 | +they cannot be changed in-place, and you need to create a new string to |
| 40 | +change them. Even `some_string += another_string` creates a new string. |
| 41 | +Python will treat that as `some_string = some_string + another_string`, |
| 42 | +so it creates a new string but it puts it back to the same variable. |
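
A minimal sketch of what that means in practice. The names `greeting` and `alias` are illustrative and not part of the tutorial's own examples. After `+=`, the variable refers to a new string, while another name that still refers to the old string sees no change:

```python
greeting = "Hello"
alias = greeting          # both names refer to the same string

greeting += " World!"     # builds a new string and rebinds `greeting` to it

print(greeting)           # Hello World!
print(alias)              # Hello (the original string was not modified)
```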
| 43 | + |
| 44 | +`+` and `*` are nice, but what else can we do with strings? |
| 45 | + |
| 46 | +## Slicing |
| 47 | + |
| 48 | +Slicing is really simple. It just means getting a part of the string. |
| 49 | +For example, to get all characters between the second and the fifth |
| 50 | +place between the characters, we can do this: |
| 51 | + |
| 52 | +```py |
| 53 | +>>> our_string[2:5] |
| 54 | +'llo' |
| 55 | +>>> |
| 56 | +``` |
| 57 | + |
| 58 | +So the syntax is like `some_string[start:end]`. |
| 59 | + |
| 60 | +This picture shows you how the slicing works: |
| 61 | + |
| 62 | + |
| 63 | + |
| 64 | +But what happens if we slice with negative values? |
| 65 | + |
| 66 | +```py |
| 67 | +>>> our_string[-5:-2] |
| 68 | +'orl' |
| 69 | +>>> |
| 70 | +``` |
| 71 | + |
| 72 | +It turns out that slicing with negative values simply starts counting |
| 73 | +from the end of the string. |
| 74 | + |
| 75 | + |
| 76 | + |
| 77 | +If we don't specify the beginning it defaults to 0, and if we don't |
| 78 | +specify the end it defaults to the length of the string. For example, we |
| 79 | +can get everything except the first or last character like this: |
| 80 | + |
| 81 | +```py |
| 82 | +>>> our_string[1:] |
| 83 | +'ello World!' |
| 84 | +>>> our_string[:-1] |
| 85 | +'Hello World' |
| 86 | +>>> |
| 87 | +``` |
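
The rules above can be checked with a short script. This is only a sketch; the helper names `length`, `negative_slice` and `positive_slice` are made up for illustration:

```python
our_string = "Hello World!"
length = len(our_string)    # 12

# A negative value counts from the end, so -5 means the same as length - 5.
negative_slice = our_string[-5:-2]
positive_slice = our_string[length - 5:length - 2]
print(negative_slice, positive_slice)    # orl orl

# Leaving out a value is the same as writing the default explicitly.
print(our_string[1:] == our_string[1:length])    # True
print(our_string[:-1] == our_string[0:-1])       # True
```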
| 88 | + |
| 89 | +Remember that strings can't be changed in-place. |
| 90 | + |
| 91 | +```py |
| 92 | +>>> our_string[:5] = 'Howdy' |
| 93 | +Traceback (most recent call last): |
| 94 | + File "<stdin>", line 1, in <module> |
| 95 | +TypeError: 'str' object does not support item assignment |
| 96 | +>>> |
| 97 | +``` |
| 98 | + |
| 99 | +There's also a step argument we can give to our slices, but I'm not |
| 100 | +going to talk about it in this tutorial. |
| 101 | + |
| 102 | +## Indexing |
| 103 | + |
| 104 | +So now we know how slicing works. But what happens if we forget the `:`? |
| 105 | + |
| 106 | +```py |
| 107 | +>>> our_string[1] |
| 108 | +'e' |
| 109 | +>>> |
| 110 | +``` |
| 111 | + |
| 112 | +That's interesting. We got a string that is only one character long. But |
| 113 | +the first character of `Hello World!` should be `H`, not `e`, so why did |
| 114 | +we get an e? |
| 115 | + |
| 116 | +Programming starts at zero. Indexing strings also starts at zero. The |
| 117 | +first character is `our_string[0]`, the second character is |
| 118 | +`our_string[1]`, and so on. |
| 119 | + |
| 120 | +So string indexes work like this: |
| 121 | + |
| 122 | + |
| 123 | + |
| 124 | +How about negative values? |
| 125 | + |
| 126 | +```py |
| 127 | +>>> our_string[-1] |
| 128 | +'!' |
| 129 | +>>> |
| 130 | +``` |
| 131 | + |
| 132 | +But why didn't that start at zero? `our_string[-1]` is the last |
| 133 | +character, but `our_string[1]` is not the first character! |
| 134 | + |
| 135 | +That's because 0 and -0 are equal, so indexing with -0 would do the same |
| 136 | +thing as indexing with 0. |
| 137 | + |
| 138 | +So indexing with negative values works like this: |
| 139 | + |
| 140 | + |
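
As a rough sketch of the same idea in code, a negative index `-n` gives the same character as the index `len(our_string) - n`. The names `last` and `same_last` are illustrative:

```python
our_string = "Hello World!"

last = our_string[-1]
same_last = our_string[len(our_string) - 1]
print(last, same_last)      # ! !

# -0 is just 0, so this is the first character, not the last one.
first = our_string[-0]
print(first)                # H
```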
| 141 | + |
| 142 | +## The in keyword |
| 143 | + |
| 144 | +We can use `in` and `not in` to check if a string contains another |
| 145 | +string: |
| 146 | + |
| 147 | +```py |
| 148 | +>>> "Hello" in our_string |
| 149 | +True |
| 150 | +>>> "Python" in our_string |
| 151 | +False |
| 152 | +>>> "Python" not in our_string |
| 153 | +True |
| 154 | +>>> |
| 155 | +``` |
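
One detail the examples above don't show is that `in` compares characters exactly, so the check is case-sensitive. A small sketch, using the `lower` method that is introduced in the next section (the variable names are illustrative):

```python
our_string = "Hello World!"

exact_match = "hello" in our_string               # 'h' is not 'H'
lowered_match = "hello" in our_string.lower()     # compare in lowercase

print(exact_match)      # False
print(lowered_match)    # True
```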
| 156 | + |
| 157 | +## String methods |
| 158 | + |
| 159 | +Python's strings have many useful methods. |
| 160 | +[The official documentation](https://docs.python.org/3/library/stdtypes.html#string-methods) covers |
| 161 | +them all, but I'm just going to show some of the most commonly used ones |
| 162 | +briefly. **You don't need to remember all of these string methods, just |
| 163 | +learn to use the link above so you can find them when you need them.** |
| 164 | +Python also comes with built-in documentation about the string methods. |
| 165 | +You can run `help(str)` to read it. |
| 166 | + |
| 167 | +Remember that nothing can modify strings in-place. Most string methods |
| 168 | +return a new string, but things like `our_string = our_string.upper()` |
| 169 | +still work because the new string is assigned to the old variable. |
| 170 | + |
| 171 | +Here are some of the most commonly used string methods: |
| 172 | + |
| 173 | +- `upper` and `lower` can be used for converting to uppercase and |
| 174 | + lowercase. |
| 175 | + |
| 176 | + ```py |
| 177 | + >>> our_string.upper() |
| 178 | + 'HELLO WORLD!' |
| 179 | + >>> our_string.lower() |
| 180 | + 'hello world!' |
| 181 | + >>> |
| 182 | + ``` |
| 183 | + |
| 184 | +- To check if a string starts or ends with another string we could just |
| 185 | + slice the string and compare the slice to the other string. |
| 186 | + |
| 187 | + ```py |
| 188 | + >>> our_string[:5] == 'Hello' |
| 189 | + True |
| 190 | + >>> our_string[-2:] == 'hi' |
| 191 | + False |
| 192 | + >>> |
| 193 | + ``` |
| 194 | + |
| 195 | + But that gets a bit complicated if we don't know the length of the |
| 196 | + substring beforehand. |
| 197 | + |
| 198 | + ```py |
| 199 | + >>> substring = 'Hello' |
| 200 | + >>> our_string[:len(substring)] == substring |
| 201 | + True |
| 202 | + >>> substring = 'hi' |
| 203 | + >>> our_string[-len(substring):] == substring |
| 204 | + False |
| 205 | + >>> |
| 206 | + ``` |
| 207 | + |
| 208 | + That's why it's recommended to use `startswith` and `endswith`: |
| 209 | + |
| 210 | + ```py |
| 211 | + >>> our_string.startswith('Hello') |
| 212 | + True |
| 213 | + >>> our_string.endswith('hi') |
| 214 | + False |
| 215 | + >>> |
| 216 | + ``` |
| 217 | + |
| 218 | +- If we need to find out where a substring is located, we can do that |
| 219 | + with `index`: |
| 220 | + |
| 221 | + ```py |
| 222 | + >>> our_string.index('World') |
| 223 | + 6 |
| 224 | + >>> our_string[6:] |
| 225 | + 'World!' |
| 226 | + >>> |
| 227 | + ``` |
| 228 | + |
| 229 | +- The `join` method joins a list of other strings. We'll talk more about |
| 230 | + lists later. |
| 231 | + |
| 232 | + ```py |
| 233 | + >>> '-'.join(['Hello', 'World', 'test']) |
| 234 | + 'Hello-World-test' |
| 235 | + >>> |
| 236 | + ``` |
| 237 | + |
| 238 | + The `split` method is the opposite of joining: it splits a string into |
| 239 | + a list. |
| 240 | + |
| 241 | + ```py |
| 242 | + >>> 'Hello-World-test'.split('-') |
| 243 | + ['Hello', 'World', 'test'] |
| 244 | + >>> |
| 245 | + ``` |
| | + |
| 246 | +- Last but not least, we can use `strip`, `lstrip` and `rstrip` to |
| 247 | + remove spaces, newlines and some other whitespace characters from |
| 248 | + the ends of a string. `lstrip` strips from the left side, `rstrip` |
| 249 | + strips from the right side and `strip` strips from both sides. |
| 250 | + |
| 251 | + ```py |
| 252 | + >>> ' hello 123 \n '.lstrip() |
| 253 | + 'hello 123 \n ' |
| 254 | + >>> ' hello 123 \n '.rstrip() |
| 255 | + ' hello 123' |
| 256 | + >>> ' hello 123 \n '.strip() |
| 257 | + 'hello 123' |
| 258 | + >>> |
| 259 | + ``` |
| 260 | + |
| 261 | +It's also possible to combine string methods with slicing and other |
| 262 | +string methods: |
| 263 | + |
| 264 | +```py |
| 265 | +>>> our_string.upper()[:7].startswith('HELLO') |
| 266 | +True |
| 267 | +>>> |
| 268 | +``` |
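
Here is a slightly longer sketch that chains several of the methods from this section. The input `raw_line` is a made-up example, not something from the tutorial:

```python
raw_line = "  hello-world-test \n"    # hypothetical input with extra whitespace

# Remove the surrounding whitespace, then split at each '-'.
words = raw_line.strip().split("-")
print(words)                          # ['hello', 'world', 'test']

# Join the pieces with spaces and convert the result to uppercase.
shouted = " ".join(words).upper()
print(shouted)                        # HELLO WORLD TEST

# Find where 'WORLD' starts in the new string.
position = shouted.index("WORLD")
print(position)                       # 6
```

Every step returns a new string or list, so `raw_line` itself stays the same.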
| 269 | + |
| 270 | +## String formatting |
| 271 | + |
| 272 | +To add a string in the middle of another string, you can do something |
| 273 | +like this: |
| 274 | + |
| 275 | +```py |
| 276 | +>>> name = 'Akuli' |
| 277 | +>>> 'My name is ' + name + '.' |
| 278 | +'My name is Akuli.' |
| 279 | +>>> |
| 280 | +``` |
| 281 | + |
| 282 | +But that gets complicated if you have many things to add. |
| 283 | + |
| 284 | +```py |
| 285 | +>>> channel = '##learnpython' |
| 286 | +>>> network = 'freenode' |
| 287 | +>>> "My name is " + name + " and I'm on the " + channel + " channel on " + network + "." |
| 288 | +"My name is Akuli and I'm on the ##learnpython channel on freenode." |
| 289 | +>>> |
| 290 | +``` |
| 291 | + |
| 292 | +Instead it's recommended to use string formatting. It means putting |
| 293 | +other things in the middle of a string. |
| 294 | + |
| 295 | +This tutorial covers two ways to format strings. Neither is better than |
| 296 | +the other; they are just different. The two ways are: |
| 297 | + |
| 298 | +- `.format()`-formatting, also known as new-style formatting. This |
| 299 | + formatting style has a lot of features, but it's a little bit more |
| 300 | + typing than `%s`-formatting. |
| 301 | + |
| 302 | + ```py |
| 303 | + >>> "Hello {}.".format(name) |
| 304 | + 'Hello Akuli.' |
| 305 | + >>> "My name is {} and I'm on the {} channel on {}.".format(name, channel, network) |
| 306 | + "My name is Akuli and I'm on the ##learnpython channel on freenode." |
| 307 | + >>> |
| 308 | + ``` |
| 309 | + |
| 310 | +- `%s`-formatting, also known as printf-formatting and old-style |
| 311 | + formatting. This has fewer features than `.format()`-formatting, but |
| 312 | + `'Hello %s.' % name` is shorter and faster to type than |
| 313 | + `'Hello {}.'.format(name)`. |
| 314 | + |
| 315 | + ```py |
| 316 | + >>> "Hello %s." % name |
| 317 | + 'Hello Akuli.' |
| 318 | + >>> "My name is %s and I'm on the %s channel on %s." % (name, channel, network) |
| 319 | + "My name is Akuli and I'm on the ##learnpython channel on freenode." |
| 320 | + >>> |
| 321 | + ``` |
| 322 | + |
| 323 | +Both formatting styles also have many other features: |
| 324 | + |
| 325 | +```py |
| 326 | +>>> 'Three zeros and number one: {:04d}'.format(1) |
| 327 | +'Three zeros and number one: 0001' |
| 328 | +>>> 'Three zeros and number one: %04d' % 1 |
| 329 | +'Three zeros and number one: 0001' |
| 330 | +>>> |
| 331 | +``` |
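
A few more of those features, as a sketch in both styles. The value `price` and the result names are made up for illustration:

```python
name = 'Akuli'
price = 3.14159    # hypothetical value

# Round a number to two decimal places.
new_rounded = '{:.2f}'.format(price)
old_rounded = '%.2f' % price
print(new_rounded, old_rounded)    # 3.14 3.14

# Pad to a width of 10 characters, aligned to the right.
new_padded = '{:>10}'.format(name)
old_padded = '%10s' % name
print(repr(new_padded))            # '     Akuli'
print(repr(old_padded))            # '     Akuli'

# .format() can also refer to its arguments by name.
greeting = 'Hello {who}.'.format(who=name)
print(greeting)                    # Hello Akuli.
```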
| 332 | + |
| 333 | +If you need to know more about formatting, I recommend reading |
| 334 | +[pyformat.info](https://pyformat.info/). |
| 335 | + |
| 336 | +## Summary |
| 337 | + |
| 338 | +- Slicing returns a copy of the part of a string between two indexes. |
| 339 | + The indexes work like this: |
| 340 | + |
| 341 | +  |
| 342 | + |
| 343 | +- Indexing returns one character of a string. Remember that you don't |
| 344 | + need a `:` with indexing. The indexes work like this: |
| 345 | + |
| 346 | +  |
| 347 | + |
| 348 | +- The `in` keyword can be used for checking if a string contains another |
| 349 | + string. |
| 350 | + |
| 351 | +- Python has many string methods. Use |
| 352 | + [the documentation](https://docs.python.org/3/library/stdtypes.html#string-methods) |
| 353 | + when you don't remember something about them. |
| 354 | + |
| 355 | +- String formatting means adding other things to the middle of a string. |