| 1 | +# Managing files in Python |
| 2 | + |
| 3 | +## What are files, directories and paths? |
| 4 | + |
| 5 | +These are simple things that many computer users already know, but I'll go |
| 6 | +through them just to make sure you know them too. |
| 7 | + |
| 8 | +#### Files |
| 9 | + |
| 10 | +- Each file has a **name**, like `hello.py`, `mytext.txt` or |
| 11 | + `coolimage.png`. Usually the name ends with an **extension** that |
| 12 | + describes the content, like `py` for Python, `txt` for text or `png` |
| 13 | + for "Portable Network Graphics". |
| 14 | +- With just names identifying the files, it wouldn't be possible to have |
| 15 | + two files with the same name. That's why files also have a |
| 16 | + **location**. We'll talk more about this in a moment. |
| 17 | +- Files have **content** that consists of |
| 18 | + [8-bit bytes](https://www.youtube.com/watch?v=Dnd28lQHquU). |
| 19 | + |
| 20 | +#### Directories/folders |
| 21 | + |
| 22 | +Directories are a way to group files. They also have a name and a location |
| 23 | +like files, but instead of containing data directly like files do, they |
| 24 | +contain other files and directories. |
| 25 | + |
| 26 | +#### Paths |
| 27 | + |
| 28 | +Directories and files have a path, like `C:\Users\me\hello.py`. That just |
| 29 | +means that there's a folder called `C:`, and inside it there's a folder |
| 30 | +called `Users`, and inside it there's a folder called `me` and inside it |
| 31 | +there's a `hello.py`. Like this: |
| 32 | + |
| 33 | +``` |
| 34 | +C: |
| 35 | +└── Users |
| 36 | +    └── me |
| 37 | +        └── hello.py |
| 38 | +``` |
| 39 | + |
| 40 | +`C:\Users\me\hello.py` is an **absolute path**. But there are also |
| 41 | +**relative paths**. For example, if you're in `C:\Users`, `me\hello.py` |
| 42 | +is the same as `C:\Users\me\hello.py`. The place we are in is sometimes |
| 43 | +called the **current directory**, **working directory** or |
| 44 | +**current working directory**. |
| 45 | + |
| 46 | +So far we've talked about Windows paths, but not all computers run |
| 47 | +Windows. For example, an equivalent to `C:\Users\me\hello.py` is |
| 48 | +`/home/me/hello.py` on my Ubuntu, and if I'm in `/home`, `me/hello.py` |
| 49 | +is the same as `/home/me/hello.py`. |
| 50 | + |
| 51 | +``` |
| 52 | +/ |
| 53 | +└── home |
| 54 | +    └── me |
| 55 | +        └── hello.py |
| 56 | +``` |
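The difference between absolute and relative paths can also be checked from Python. The following is only a small sketch using the standard `pathlib` module; the `PureWindowsPath` and `PurePosixPath` classes just manipulate the text of a path without touching any real files, so the same code should behave the same way on Windows and on other systems. The variable names are made up for this example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths only work with the text of a path, not with real files.
windows_dir = PureWindowsPath(r'C:\Users')
windows_file = windows_dir / r'me\hello.py'     # relative path inside C:\Users
print(windows_file)                             # C:\Users\me\hello.py
print(windows_file.is_absolute())               # True
print(PureWindowsPath(r'me\hello.py').is_absolute())    # False

posix_file = PurePosixPath('/home') / 'me/hello.py'
print(posix_file)                               # /home/me/hello.py
print(posix_file.is_absolute())                 # True
```

The `/` operator here joins a directory and a relative path, which is the same thing we did in our heads above when we were "in" `C:\Users` or `/home`.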
| 57 | + |
| 58 | +## Writing to a file |
| 59 | + |
| 60 | +Let's create a file and write a hello world to it. |
| 61 | + |
| 62 | +```py |
| 63 | +>>> with open('hello.txt', 'w') as f: |
| 64 | +...     print("Hello World!", file=f) |
| 65 | +... |
| 66 | +>>> |
| 67 | +``` |
| 68 | + |
| 69 | +Doesn't seem like it did anything. But actually it created a `hello.txt` |
| 70 | +somewhere on your system. On Windows it's probably in `C:\Users\YourName`, |
| 71 | +and on most Linux systems it should be in `/home/yourname`. You can open |
| 72 | +it with Notepad or any other plain text editor your system comes with. |
| 73 | + |
| 74 | +So how does that code work? |
| 75 | + |
| 76 | +First of all, we open a path with `open`, and it gives us a Python file |
| 77 | +object that is assigned to the variable `f`. |
| 78 | + |
| 79 | +```py |
| 80 | +>>> f |
| 81 | +<_io.TextIOWrapper name='hello.txt' mode='w' encoding='UTF-8'> |
| 82 | +>>> |
| 83 | +``` |
| 84 | + |
| 85 | +So the first argument we passed to `open` was the path we wanted to write to. |
| 86 | +Our path was more like a filename than a path, so the file ended up in |
| 87 | +the current working directory. |
| 88 | + |
| 89 | +The second argument was `w`... but where the heck does that come from? |
| 90 | +`w` is short for write, and that just means that we'll create a new file. |
| 91 | +There are some other modes you can use too: |
| 92 | + |
| 93 | +| Mode | Short for | Meaning | |
| 94 | +|-------|-----------|-----------------------------------------------------------------------| |
| 95 | +| `r` | read | Read from an existing file. | |
| 96 | +| `w` | write | Write to a file. **If the file exists, its old content is removed.** | |
| 97 | +| `a` | append | Write to the end of a file, and keep the old content. | |
| 98 | + |
| 99 | +The `w` and `a` modes create a new file if it doesn't exist yet, but trying |
| 100 | +to read from a non-existent file is an error. |
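Here's a small sketch that tries out all three modes. It works inside a temporary directory created with the standard `tempfile` module so that it doesn't touch any of your real files; the file names `modes.txt` and `missing.txt` are just made up for this example.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'modes.txt')    # hypothetical file name

    with open(path, 'w') as f:      # 'w' creates the file
        print("first", file=f)
    with open(path, 'w') as f:      # 'w' again throws away "first"
        print("second", file=f)
    with open(path, 'a') as f:      # 'a' keeps "second" and adds to the end
        print("third", file=f)

    with open(path, 'r') as f:
        content = f.read()

    # Reading a file that doesn't exist raises FileNotFoundError.
    try:
        open(os.path.join(tmpdir, 'missing.txt'), 'r')
    except FileNotFoundError:
        missing_file_is_error = True
    else:
        missing_file_is_error = False

print(repr(content))            # 'second\nthird\n'
print(missing_file_is_error)    # True
```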
| 101 | + |
| 102 | +But what is that `with open(...) as f` crap? That's just a fancy way to make |
| 103 | +sure that the file gets closed, no matter what happens. As you can see, |
| 104 | +the file was indeed closed. |
| 105 | + |
| 106 | +```py |
| 107 | +>>> f.closed |
| 108 | +True |
| 109 | +>>> |
| 110 | +``` |
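Roughly speaking, the `with` block does the same thing as opening the file, using it inside a `try` and closing it in a `finally`. The sketch below is only meant to illustrate that idea, not to show exactly how Python implements `with`. It writes to a temporary directory, and the names `closed_after` and `written` are made up for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'hello.txt')

    # Roughly what "with open(path, 'w') as f:" does for us.
    f = open(path, 'w')
    try:
        print("Hello World!", file=f)
    finally:
        f.close()       # runs even if the print above raised an error

    closed_after = f.closed
    with open(path, 'r') as f2:
        written = f2.read()

print(closed_after)     # True
print(repr(written))    # 'Hello World!\n'
```

The `with` version is shorter and harder to get wrong, so there's usually no reason to write the `try` and `finally` yourself.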
| 111 | + |
| 112 | +Once we had opened the file we could just print to it. The print is just |
| 113 | +like any other print, but we also need to specify that we want to print |
| 114 | +to the file we opened using `file=f`. |
| 115 | + |
| 116 | +## Reading from files |
| 117 | + |
| 118 | +After opening a file with the `r` mode you can for loop over it, just |
| 119 | +like it was a list. So let's go ahead and read everything in the file |
| 120 | +we created into a list of lines. |
| 121 | + |
| 122 | +```py |
| 123 | +>>> lines = [] |
| 124 | +>>> with open('hello.txt', 'r') as f: |
| 125 | +...     for line in f: |
| 126 | +...         lines.append(line) |
| 127 | +... |
| 128 | +>>> lines |
| 129 | +['Hello World!\n'] |
| 130 | +>>> |
| 131 | +``` |
| 132 | + |
| 133 | +But why is there a `\n` at the end of our hello world? |
| 134 | + |
| 135 | +`\n` means newline. Note that it needs to be a backslash, so `/n` |
| 136 | +doesn't have any special meaning like `\n` has. When we wrote the file |
| 137 | +with print it actually added a `\n` to the end of it. It's good practise |
| 138 | +to end the content of files with a newline character, but it's not |
| 139 | +necessary. |
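To see that the `\n` really comes from `print`, we can tell `print` not to add it with `end=''`. This is just a small sketch that writes to a temporary directory; the file name `newline.txt` is made up for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'newline.txt')  # hypothetical file name

    with open(path, 'w') as f:
        print("with newline", file=f)               # print adds '\n'
        print("without newline", file=f, end='')    # end='' adds nothing

    with open(path, 'r') as f:
        content = f.read()

print(repr(content))    # 'with newline\nwithout newline'
```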
| 140 | + |
| 141 | +So how does that work if we have more than one line in the file? |
| 142 | + |
| 143 | +```py |
| 144 | +>>> with open('hello.txt', 'w') as f: |
| 145 | +...     print("Hello one!", file=f) |
| 146 | +...     print("Hello two!", file=f) |
| 147 | +...     print("Hello three!", file=f) |
| 148 | +... |
| 149 | +>>> lines = [] |
| 150 | +>>> with open('hello.txt', 'r') as f: |
| 151 | +...     for line in f: |
| 152 | +...         lines.append(line) |
| 153 | +... |
| 154 | +>>> lines |
| 155 | +['Hello one!\n', 'Hello two!\n', 'Hello three!\n'] |
| 156 | +>>> |
| 157 | +``` |
| 158 | + |
| 159 | +There we go, each of our lines now ends with a `\n`. When we for |
| 160 | +loop over the file it's divided into lines based on where the `\n` |
| 161 | +characters are, not based on how we printed to it. |
| 162 | + |
| 163 | +But how do we get rid of that `\n`? The `rstrip` |
| 164 | +[string method](handy-stuff-strings.md#string-methods) is great |
| 165 | +for this: |
| 166 | + |
| 167 | +```py |
| 168 | +>>> stripped = [] |
| 169 | +>>> for line in lines: |
| 170 | +...     stripped.append(line.rstrip('\n')) |
| 171 | +... |
| 172 | +>>> stripped |
| 173 | +['Hello one!', 'Hello two!', 'Hello three!'] |
| 174 | +>>> |
| 175 | +``` |
| 176 | + |
| 177 | +It's also possible to read lines one by one. Files have a |
| 178 | +`readline()` method that reads the next line, and returns `''` |
| 179 | +if we're at the end of the file. |
| 180 | + |
| 181 | +**TODO:** example of readline() |
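A minimal sketch of `readline()` could look like this. It first writes a two-line file into a temporary directory and then calls `readline()` three times; the `results` list is just a name made up for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'hello.txt')
    with open(path, 'w') as f:
        print("Hello one!", file=f)
        print("Hello two!", file=f)

    results = []
    with open(path, 'r') as f:
        results.append(f.readline())    # first line
        results.append(f.readline())    # second line
        results.append(f.readline())    # nothing left, so we get ''

print(results)      # ['Hello one!\n', 'Hello two!\n', '']
```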
| 182 | + |
| 183 | +There's only one confusing thing about reading files. If you try |
| 184 | +to read an open file twice you'll find out that it only gets read once: |
| 185 | + |
| 186 | +```py |
| 187 | +>>> first = [] |
| 188 | +>>> second = [] |
| 189 | +>>> with open('hello.txt', 'r') as f: |
| 190 | +...     for line in f: |
| 191 | +...         first.append(line) |
| 192 | +...     for line in f: |
| 193 | +...         second.append(line) |
| 194 | +... |
| 195 | +>>> first |
| 196 | +['Hello one!\n', 'Hello two!\n', 'Hello three!\n'] |
| 197 | +>>> second |
| 198 | +[] |
| 199 | +>>> |
| 200 | +``` |
| 201 | + |
| 202 | +But if we open the file again, everything works. |
| 203 | + |
| 204 | +```py |
| 205 | +>>> first = [] |
| 206 | +>>> second = [] |
| 207 | +>>> with open('hello.txt', 'r') as f: |
| 208 | +...     for line in f: |
| 209 | +...         first.append(line) |
| 210 | +... |
| 211 | +>>> with open('hello.txt', 'r') as f: |
| 212 | +...     for line in f: |
| 213 | +...         second.append(line) |
| 214 | +... |
| 215 | +>>> first |
| 216 | +['Hello one!\n', 'Hello two!\n', 'Hello three!\n'] |
| 217 | +>>> second |
| 218 | +['Hello one!\n', 'Hello two!\n', 'Hello three!\n'] |
| 219 | +>>> |
| 220 | +``` |
| 221 | + |
| 222 | +Usually it's best to just read the file once, and use the |
| 223 | +content you have read from it multiple times. |
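The reason for this is that a file object remembers how far it has been read, and after the first loop it's at the end of the file. If you really need to read the same open file again, the `seek(0)` method moves back to the beginning. Here's a small sketch of that, using a temporary directory so that it doesn't touch your `hello.txt`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'hello.txt')
    with open(path, 'w') as f:
        print("Hello one!", file=f)
        print("Hello two!", file=f)

    first = []
    second = []
    with open(path, 'r') as f:
        for line in f:
            first.append(line)
        f.seek(0)       # jump back to the beginning of the file
        for line in f:
            second.append(line)

print(first)    # ['Hello one!\n', 'Hello two!\n']
print(second)   # ['Hello one!\n', 'Hello two!\n']
```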
| 224 | + |
| 225 | +As you can see, files behave a lot like lists. The `join()` |
| 226 | +string method joins together strings from a list, but we can |
| 227 | +also use it to join together lines of a file: |
| 228 | + |
| 229 | +```py |
| 230 | +>>> with open('hello.txt', 'r') as f: |
| 231 | +...     full_content = ''.join(f) |
| 232 | +... |
| 233 | +>>> full_content |
| 234 | +'Hello one!\nHello two!\nHello three!\n' |
| 235 | +>>> |
| 236 | +``` |
| 237 | + |
| 238 | +But if you need all of the content as a string, you can just |
| 239 | +use the `read()` method. |
| 240 | + |
| 241 | +```py |
| 242 | +>>> with open('hello.txt', 'r') as f: |
| 243 | +...     full_content = f.read() |
| 244 | +... |
| 245 | +>>> full_content |
| 246 | +'Hello one!\nHello two!\nHello three!\n' |
| 247 | +>>> |
| 248 | +``` |
| 249 | + |
| 250 | +**TODO:** Explain paths and \\. |
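One thing that is safe to say already: a backslash has a special meaning inside Python strings (we saw `\n` above), so Windows paths need some care. The sketch below shows two common ways to write them, doubling the backslashes or using a raw string with an `r` in front. The variable names are made up for this example, and nothing here opens any files.

```python
# Two ways to write the same Windows path as a Python string.
doubled = 'C:\\Users\\me\\hello.py'     # each \\ means one backslash
raw = r'C:\Users\me\hello.py'           # r'...' turns off the special meanings

print(doubled)          # C:\Users\me\hello.py
print(doubled == raw)   # True

# Without doubling or r, the \n below becomes a newline character.
oops = 'C:\new'
print(len(oops))        # 5
print('\n' in oops)     # True
```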
| 251 | + |
| 252 | +## Example: File viewer |
| 253 | + |
| 254 | +The following program prints the contents of files: |
| 255 | + |
| 256 | +```py |
| 257 | +while True: |
| 258 | +    filename = input("Filename or path, or nothing at all to exit: ") |
| 259 | +    if filename == '': |
| 260 | +        break |
| 261 | + |
| 262 | +    with open(filename, 'r') as f: |
| 263 | +        # We could read the whole file at once, but this |
| 264 | +        # needs less memory if the file is very large. |
| 265 | +        for line in f: |
| 266 | +            print(line.rstrip('\n')) |
| 267 | +``` |
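The program above stops with an error if the file doesn't exist. One possible way to deal with that is sketched below: it catches `FileNotFoundError` in a helper function. The `read_lines` function and the file names are made up for this example, and the demonstration part uses a temporary directory instead of asking for a filename.

```python
import os
import tempfile


def read_lines(filename):
    """Return the file's lines without trailing newlines, or None if it's missing."""
    try:
        with open(filename, 'r') as f:
            lines = []
            for line in f:
                lines.append(line.rstrip('\n'))
            return lines
    except FileNotFoundError:
        return None


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'hello.txt')
    with open(path, 'w') as f:
        print("Hello one!", file=f)
        print("Hello two!", file=f)

    found = read_lines(path)
    missing = read_lines(os.path.join(tmpdir, 'nope.txt'))

print(found)    # ['Hello one!', 'Hello two!']
print(missing)  # None
```

In the file viewer we could then print a friendly message when `read_lines` returns `None` and ask for another filename.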
| 268 | + |
| 269 | +## Example: User information |
| 270 | + |
| 271 | +This program stores the user's username and password in a file. |
| 272 | +Plain text files are definitely not a good way to store usernames |
| 273 | +and passwords, but this is just an example. |
| 274 | + |
| 275 | +```py |
| 276 | +# Ask repeatedly until the user answers 'y' or 'n'. |
| 277 | +while True: |
| 278 | +    answer = input("Have you been here before? (y/n) ") |
| 279 | +    if answer == 'Y' or answer == 'y': |
| 280 | +        been_here_before = True |
| 281 | +        break |
| 282 | +    elif answer == 'N' or answer == 'n': |
| 283 | +        been_here_before = False |
| 284 | +        break |
| 285 | +    else: |
| 286 | +        print("Enter 'y' or 'n'.") |
| 287 | + |
| 288 | +if been_here_before: |
| 289 | +    # Read username and password from a file. |
| 290 | +    with open('userinfo.txt', 'r') as f: |
| 291 | +        username = f.readline().rstrip('\n') |
| 292 | +        password = f.readline().rstrip('\n') |
| 293 | + |
| 294 | +    if input("Username: ") != username: |
| 295 | +        print("Wrong username!") |
| 296 | +    elif input("Password: ") != password: |
| 297 | +        print("Wrong password!") |
| 298 | +    else: |
| 299 | +        print("Correct password, welcome!") |
| 300 | + |
| 301 | +else: |
| 302 | +    # Write username and password to a file. |
| 303 | +    username = input("Username: ") |
| 304 | +    password = input("Password: ") |
| 305 | +    with open('userinfo.txt', 'w') as f: |
| 306 | +        print(username, file=f) |
| 307 | +        print(password, file=f) |
| 308 | + |
| 309 | +    print("Done! Now run this program again and select 'y'.") |
| 310 | +``` |