Zero To Py
Integers
Floats
Complex
Strings
Container Types
Tuples and Lists
Dictionaries
Sets
Python Objects
References to Objects
Chapter 2. Operators
Arithmetic
Assignment
Packing operators
Comparison
Logical
What is Truthy?
Short-Circuits
Logical Assignments
Membership
Bitwise
Identity
Chapter 3. Lexical Structure
Line Structure
Comments
Explicit and Implicit Line Joining
Indentation
Chapter 4. Control Flow
if, elif, and else
while
break
CONTENTS
continue
for
What is an Iterable?
for/else and break
Exception Handling
raise from
else and finally
match
type checking
guards
Or Patterns
as
Chapter 5. Functions
Function Signatures
Explicitly positional/key-value
Default Values
Mutable Types as Default Values
Scope
Nested Scopes
nonlocal and global
Closures
Anonymous Functions
Decorators
Chapter 6. Classes
data types as classes
__dunder__ methods
The __init__ method
Attributes
Class Attributes
A Functional Approach
@staticmethod
@classmethod
Part II. A Deeper Dive
Chapter 7. Expressions, Comprehensions, and Generators
Generator Expressions
Generator Functions
yield from
List Comprehensions
Dictionary Comprehensions
Expressions and the Walrus Operator
Chapter 8. Python’s Built-in Functions
Type Conversions
Mathematical Functions
all() and any()
dir()
enumerate()
eval() and exec()
map()
filter()
input() and print()
open()
range()
sorted()
reversed()
zip()
Chapter 9. The Python Data Model
Object Creation Using __new__ and __init__
Singletons
Rich Comparisons
Operator Overloading
String Representations
TypeVar
Protocols
Runtime type checking
Generics
TypedDict
Chapter 14. Modules, Packages, and Namespaces
Modules
Module Attributes
if __name__ == "__main__":
Packages
Imports within packages
Installation
Part III. The Python Standard Library
Chapter 15. Virtual Environments
Chapter 16. Copy
Chapter 17. Itertools
Chaining Iterables
Filtering Iterables
Cycling Through Iterables
Creating Groups
Slicing Iterables
Zip Longest
Chapter 18. Functools
Partials
Reduce
Pipes
Caching
Dispatching
Chapter 19. Enums, NamedTuples, and Dataclasses
Enums
NamedTuples
Dataclasses
Chapter 20. Multithreading and Multiprocessing
Multithreading
Thread Locks
Multiprocessing
Process and Pool
Process Locks
Pipes and Queues
concurrent.futures
Chapter 21. Asyncio
Coroutines
Tasks and Task Groups
ExceptionGroup and Exception unpacking
Part VI. The Underbelly of the Snake
Chapter 22. Debugging
pdb
Other Debuggers
Chapter 23. Profiling
cProfile
flameprof
snakeviz
memory_profiler
Chapter 24. C extensions
Hello World
hello_world.c
setup.py
Passing data in and out of Python
Memory Management
Parsing Arguments
Part I: A Whirlwind Tour
Installing the Python interpreter is the first step to getting started with
programming in Python. The installation process differs between operating
systems. To install the latest version (Python 3.11 at the time of
writing), go to the downloads¹ page of the official website of the Python
Software Foundation (PSF) and click the “download” button for the latest
release. This will take you to a page where you can select the installer
specific to your operating system.
Python Versions
Windows
The PSF provides several installation options for Windows users. The first
two options are the 32-bit and 64-bit versions of the Python interpreter.
Your default choice is likely to be the 64-bit interpreter; the 32-bit
interpreter is only necessary for hardware which doesn’t support 64-bit,
which nowadays is atypical for standard desktops and laptops. The second
two options are for installing offline vs. over the web. If you intend to
hold onto the original installer for archival purposes, choose the offline
version. Otherwise, choose the web installer.
When you start the installer, a dialogue box will appear. There will be
four clickable items in this installer: an “Install Now” button, a
“Customize installation” button, an “install launcher for all users” checkbox,
¹https://www.python.org/downloads
macOS
Linux
For Linux users, the latest version of Python can be installed by
compiling the source code. First, download a tarball from the downloads
page and extract it into a local directory. Next, install the build
dependencies and run the ./configure script to configure the workspace.
Run make to compile the source code, and finally use make altinstall to
install this Python version as python<version> (this is so your latest
version doesn’t conflict with the system installation).
What’s in a PATH?
Before moving forward, it’s worthwhile to talk a bit about the PATH
variable and what it’s used for. When you type a name into the terminal,
python for example, the computer looks for an executable on your
machine that matches this name. But it doesn’t look just anywhere; it
    root@d9687c856f09:/# /usr/local/bin/python
    Python 3.11.1 (main, Jan 23 2023, 21:04:06) [GCC 10.2.1 20210110] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
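The PATH lookup described above can also be inspected from Python itself. The following is a minimal sketch, not part of the original text: it assumes an executable named python3 may exist on your machine, and it uses the standard library's os and shutil modules, which are not introduced until later.

```python
import os
import shutil

# PATH is a single string of directories joined by os.pathsep
# (":" on Linux and macOS, ";" on Windows).
path_dirs = os.environ.get("PATH", "").split(os.pathsep)
for directory in path_dirs:
    print(directory)

# shutil.which() searches those directories in order and returns the
# first matching executable, or None if nothing matches.
# "python3" is an assumed name; substitute whatever you type in a terminal.
python_location = shutil.which("python3")
print(python_location)
```

If the printed location is not the interpreter you expected, a different directory earlier in PATH is probably providing it.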
Developers hold strong and often conflicting opinions about the best
setup for writing Python. Some prefer a lightweight text editor like
Sublime Text or Neovim, while others prefer an IDE like PyCharm or Visual
Studio Code. Some prefer a minimalist setup with only a terminal and the
interpreter itself, while others prefer a more feature-rich environment
with built-in tools for debugging and testing. Ultimately, the best setup
for writing Python depends on the individual developer’s needs and
preferences.
With that being said, if you’re new to Python and writing software, it
might be best to keep your setup simple and focus on learning the basics
of the language. When it comes down to it, all you really need is an
interpreter and a .py file. This minimal setup allows you to focus on
the core concepts of the programming language without getting bogged
down by the plethora of additional tools and features. As you progress
and gain more experience, you can explore more advanced tools and
setups. But when starting out, keeping things simple will allow you to
quickly start writing and running your own code.
Running Python
    root@2711ea43ad26:~# python3.11
    Python 3.11.2 (main, Feb 14 2023, 05:47:57) [GCC 11.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
Within the REPL we can write code, submit it, and the Python interpreter
will execute that code and update its state accordingly. To exit the REPL,
simply type exit() or press Ctrl-D.
Many examples in this textbook are depicted as code written in the Python
REPL. If you see three right-pointing angle brackets >>>,
    root@2711ea43ad26:~# python3.11
    Python 3.11.2 (main, Feb 14 2023, 05:47:57) [GCC 11.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> for i in range(2):
    ...     print(i)
    ...
    0
    1
    >>>
A second mode is to write Python code in a file and then use the
interpreter to execute that file. This is referred to as scripting: you
execute your Python script from the terminal using python ./script.py,
where the only argument is the filepath to your script, which is a file
with a .py extension. Many examples in this textbook are depicted as
scripts. Each one starts with a comment line (#) that gives the relative
filepath of the script.
    # ./script.py

    for i in range(2):
        print(i)
There are other ways to interact with the Python interpreter and execute
Python code, and the choice of which one to use will depend on your
specific needs and preferences. But when just starting out, these two
means of interacting with the interpreter are sufficient for our use cases.
In the Python programming language, data types are the fundamental
building blocks for creating and manipulating data structures.
Understanding how to work with different data structures is essential
for any programmer, as it allows you to store, process, and manipulate
data in a variety of ways. In this chapter, we will cover the basic data
types of Python, including integers, floats, strings, booleans, and various
container types. We will explore their properties, how to create and
manipulate them, and how to perform common operations with them.
By the end of this chapter, you will have a solid understanding of the
basic data types in Python and will be able to use them effectively in
your own programs.
But before we talk about data structures, we should first talk about how
we go about referencing a given data structure in Python. This is done by
assigning our data structures to variables using the assignment operator,
=. Variables are references to values, like data structures, and the type of
the variable is determined by the type of the value that is assigned to the
variable. For example, if you assign an integer value to a variable, that
variable will be of the type int. If you assign a string value to a variable,
that variable will be of the type str. You can check the type of a variable
by using the built-in type() function.
    >>> x = "Hello!"
    >>> x
    'Hello!'
    >>> type(x)
    <class 'str'>
Note: this example using type() is our first encounter with what’s called
a “function” in Python. We’ll talk more about functions at a later point in
this book. For now, just know that to use a function, you “call” it using
parentheses, and you pass it arguments. This type() function can take
one variable as an argument, and it returns the “type” of the variable it
was passed - in this case x is a str type, so type(x) returns the str type.
It’s also important to note that, in Python, variables do not have a set data
type; they are simply a name that refers to a value. The data type of a
variable is determined by the type of the value that is currently assigned
to the variable.
Variable names must follow a set of rules in order to be considered valid.
These rules include:
    >>> import keyword
    >>> keyword.kwlist
    ['False', 'None', 'True', 'and', 'as', 'assert',
     'async', 'await', 'break', 'class', 'continue',
     'def', 'del', 'elif', 'else', 'except', 'finally',
     'for', 'from', 'global', 'if', 'import', 'in',
     'is', 'lambda', 'nonlocal', 'not', 'or', 'pass',
     'raise', 'return', 'try', 'while', 'with', 'yield']
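Whether a given name is usable as a variable can also be checked programmatically. The sketch below is an illustration rather than the book's own example; the candidate names are made up. It combines str.isidentifier(), which checks the lexical rules (letters, digits, and underscores, not starting with a digit), with keyword.iskeyword(), which checks the reserved words listed above.

```python
import keyword

# Hypothetical candidate names, chosen only for illustration.
candidates = ["total", "_private", "snake_case2", "2fast", "my-var", "class"]

valid_names = []
for name in candidates:
    # A usable variable name must be a lexically valid identifier
    # and must not be one of the reserved keywords.
    is_valid = name.isidentifier() and not keyword.iskeyword(name)
    print(f"{name!r}: {is_valid}")
    if is_valid:
        valid_names.append(name)
```

Here "2fast" fails because it starts with a digit, "my-var" fails because of the hyphen, and "class" fails because it is a keyword.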
    >>> x
    9
    >>> x = "message"
    >>> x
    'message'

    CONSTANT = 3.14
Python str and bytes types are somewhat unique in that they have
properties of both primitive and container data types. They can be used
like primitive data types, as a single, indivisible value; but they also
behave like container types in that they are sequences. They can be
indexed and iterated, so each element can be accessed individually. In
this sense, these types can be said to be a hybrid that combines the
properties of both primitive and container data types.
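A short sketch of this sequence behaviour follows; the variable names are illustrative. Note that indexing a str gives back a one-character str, while indexing a bytes object gives back an integer.

```python
word = "snake"
raw = b"snake"

# Used as a single value, a string compares as a whole.
same = word == "snake"

# Used as a sequence, it can be indexed, measured, and iterated.
first_letter = word[0]           # a one-character str
first_byte = raw[0]              # an int: the value of the first byte
length = len(word)
letters = [ch for ch in word]    # one item per character

print(same, first_letter, first_byte, length, letters)
```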
Mutability vs Immutability
    >>> x = [1, 2, 3]
    >>> x[0] = 0
    >>> x
    [0, 2, 3]

    >>> y = (1, 2, 3)
    >>> y[0] = 0
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'tuple' object does not support item assignment
Primitive Types
Booleans
    >>> x = True
    >>> type(x)
    <class 'bool'>
Integers
The Python integer, represented by the int data type, is a primitive data
type that is used to represent whole numbers. Integers can be positive
or negative and are commonly used in mathematical operations such as
addition, subtraction, multiplication, and division.
    >>> x = 5
    >>> type(x)
    <class 'int'>
Integers are written in base 10 (decimal) by default. However, Python
integers can also be written in other number systems: hexadecimal, octal,
and binary. Hexadecimal representation uses the base 16 number system
with digits 0-9 and letters A-F. In Python, you can represent a
hexadecimal number by prefixing it with 0x. For example, the decimal
number 15 can be represented as 0xF in hexadecimal. Octal representation
uses the base 8 number system with digits 0-7. In Python, you can
represent an octal number by prefixing it with 0o. For example, the
decimal number 8 can be represented as 0o10 in octal. Binary
representation uses the base 2 number system with digits 0 and 1. In
Python, you can represent a binary number by prefixing it with 0b. For
example, the decimal number 5 can be represented as 0b101 in binary.
    >>> 0xF
    15
    >>> 0o10
    8
    >>> 0b101
    5
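The prefixes only affect how a literal is written; the resulting int is the same object type either way. Going in the other direction, the built-in functions hex(), oct(), and bin() produce the prefixed text form, and int() accepts a base as its second argument. A brief sketch with an arbitrarily chosen value:

```python
n = 255

as_hex = hex(n)            # text form in base 16
as_oct = oct(n)            # text form in base 8
as_bin = bin(n)            # text form in base 2

# int() parses text in a given base back into an integer.
from_hex = int("ff", 16)
from_bin = int("101", 2)

print(as_hex, as_oct, as_bin, from_hex, from_bin)
```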
Floats
    >>> x = 3.14
    >>> type(x)
    <class 'float'>
Complex
    >>> x = (1+2j)
    >>> type(x)
    <class 'complex'>
Strings
The Python string, represented by the str data type, is a primitive data
type that is used to represent sequences of characters, such as "hello"
or "world". Strings are one of the most commonly used data types in
Python and are used to store and manipulate text. Strings can be created
by enclosing a sequence of characters in 'single' or "double" quotes.
They can also span multiple lines when enclosed in triple quotes (""" or
'''). Strings are immutable, meaning that once they are created, their
value cannot be modified.
    >>> x = "Hello"
    >>> type(x)
    <class 'str'>
    >>> _str = """
    ... this is a
    ... multiline string.
    ... """
    >>> print(_str)

    this is a
    multiline string.

    >>>
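To make the immutability of strings concrete, here is a small sketch (variable names are illustrative). Assigning to an index raises a TypeError, so "changing" a string really means building a new one. The try/except syntax used here is covered later, in the chapter on control flow.

```python
greeting = "hello"

try:
    greeting[0] = "j"          # strings do not support item assignment
except TypeError as error:
    message = str(error)
    print(message)

# Build a new string instead; the original is left untouched.
new_greeting = "j" + greeting[1:]
print(greeting, new_greeting)
```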
Strings can contain special characters that are represented using the
backslash \ as an escape character. These escape sequences include the
newline character \n, the tab character \t, the backspace character \b,
the carriage return \r, the form feed character \f, the single quote \',
the double quote \", and the backslash character itself \\.
By default, Python interprets these escape sequences in a string, meaning
that when a backslash is encountered followed by a special character,
Python will replace the sequence with the corresponding character.
So for example, the string "hello\nworld" contains a newline character
that will place "hello" and "world" on separate lines when printed
to the console or written to a file.
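As a quick sketch of this behaviour, the example below counts characters to show that an escape sequence such as \n is stored as a single character, while the same text in a raw string keeps the backslash and the letter n as two separate characters.

```python
text = "hello\nworld"
raw_text = r"hello\nworld"

print(text)        # printed on two lines
print(raw_text)    # printed on one line, backslash included

lines = text.splitlines()     # split on the newline character
escaped_length = len(text)    # "\n" counts as one character
raw_length = len(raw_text)    # "\" and "n" count separately
```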
Raw strings, represented by the prefix r before the string, are used to
represent strings that should not have special characters interpreted. For
example, a raw string like r"c:\Windows\System32" will include the
backslashes in the string, whereas a normal string would interpret them
as escape characters.
    >>> "\n"
    '\n'
    >>> r"\n"
    '\\n'
Byte strings, represented by the bytes data type, are used to represent
strings as a sequence of bytes. They are typically used when working
with binary data or when working with encodings that are not Unicode.
Byte strings can be created by prefixing a string with a b, for example,
b"Hello".
    >>> type(b"")
    <class 'bytes'>
    >>>
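Converting between str and bytes is done with the str.encode() and bytes.decode() methods. The following sketch assumes UTF-8 as the encoding and uses an arbitrary example word; it shows that one character of text may occupy more than one byte.

```python
text = "café"

encoded = text.encode("utf-8")      # str -> bytes
decoded = encoded.decode("utf-8")   # bytes -> str

print(encoded)
print(len(text), len(encoded))      # characters vs. bytes
```

The accented letter takes two bytes in UTF-8, which is why the byte string is one element longer than the text.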
Container Types
The Python list, represented by the list data type, is a container data
type that is used to store an ordered collection of items. A list is
similar to a tuple, but it is mutable, meaning that the items in the
container can be modified after it is created. Lists are created by
enclosing a sequence of items in square brackets, separated by commas.
For example, a list of integers [1, 2, 3] or a list of strings
["one", "two", "three"]. Again, these items can be of varying types.
Both tuples and lists in Python can be indexed to access the individual
elements within their collection. Indexing is done by using square
brackets [] and providing the index of the element you want to access.
Indexing starts at 0, so the first element in the tuple or list has an index
of 0, the second element has an index of 1, and so on.
So for example, if you have a tuple my_tuple = ("Peter", "Theo",
"Sarah"), you can access the first element by using the index
my_tuple[0]. Similarly, if you have a list my_list = ["Bob", "Karen",
"Steve"], you can access the second element by using the index
my_list[1].
Both tuples and lists use this convention for getting items from their
respective collections. Lists, given their mutability, also use this
convention for setting items.
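The indexing conventions above can be sketched as follows, reusing the my_tuple and my_list values from the text. The replacement name "Carol" is an illustrative assumption.

```python
my_tuple = ("Peter", "Theo", "Sarah")
my_list = ["Bob", "Karen", "Steve"]

first = my_tuple[0]     # getting an item from a tuple
second = my_list[1]     # getting an item from a list

# Lists are mutable, so the same syntax can set an item.
my_list[1] = "Carol"

print(first, second, my_list)
```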
Negative indexing can also be used to access the elements of the tuple
or list in reverse order. For example, given the tuple my_tuple =
When you try to access an index that is out of bounds of a list or a tuple,
Python raises an exception called IndexError (we’ll talk more about
exceptions later, but for now just consider exceptions as how Python
indicates that something went wrong).
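A minimal sketch of both behaviours, again using the example list from earlier: a negative index counts from the end of the collection, and an index past the end raises IndexError. The try/except syntax is explained later in the book.

```python
my_list = ["Bob", "Karen", "Steve"]

last = my_list[-1]            # -1 refers to the final element
second_to_last = my_list[-2]

try:
    my_list[3]                # valid indexes are 0, 1, and 2
except IndexError as error:
    message = str(error)
    print(message)
```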
Dictionaries
Sets
The Python set, represented by the set data type, is a container data type
that is used to store an unordered collection of unique items. Sets are
created by enclosing a sequence of items, separated by commas, in curly
braces {}. For example, a set of integers {1, 2, 3} or a set of strings
{"bippity", "boppity", "boop"}. It is important to note that you
cannot create an empty set without using a constructing function (a
constructor), as {} by default creates a dictionary. The constructor for
creating an empty set is set().
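The following sketch illustrates both points with made-up values: duplicates collapse when a set is created, and empty braces produce a dictionary rather than a set.

```python
unique = {1, 2, 2, 3, 3, 3}     # duplicate items are stored once

empty_braces = {}               # this is a dict, not a set
empty_set = set()               # the constructor makes an empty set

print(unique)
print(type(empty_braces), type(empty_set))
```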
Python also includes a data type called frozenset that is similar to
the set data type, but with one key difference: it is immutable. This
means that once a frozenset is created, elements cannot be added to it
or removed from it. Frozensets are created by using the frozenset()
constructor; for example, frozenset({1, 2, 3}) or frozenset([4,
5, 6]).
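As a rough sketch of what immutability means in practice: a frozenset simply has no add() method, so attempting to call it raises AttributeError. One consequence, shown here with an illustrative dictionary, is that a frozenset is hashable and can be used as a dictionary key, which a regular set cannot.

```python
frozen = frozenset([4, 5, 6])

try:
    frozen.add(7)               # frozenset defines no add() method
except AttributeError as error:
    message = str(error)
    print(message)

# Hashable, so it can serve as a dictionary key.
lookup = {frozen: "a frozen key"}
print(lookup[frozenset([4, 5, 6])])
```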
Python Objects
For example, the following code creates two variables, x and y, and
assigns the value 257 to both of them. Even though the values of x and
y are the same, they are different objects in memory, so each is given
a different id value.
    >>> x = 257
    >>> y = 257
    >>> id(x)
    140097007292784
    >>> id(y)
    140097007292496
It is generally the case that variable assignment creates an object that has
a unique identity. However, there are some exceptions to this rule.
There are specific values which Python has elected to make what is
called a singleton, meaning that there will only ever be a single instance
of the object. For example, in CPython the integers -5 through 256 are
singletons because of how common they are in everyday programming. The
values None, False, and True are singletons as well.
    >>> x = 1
    >>> y = 1
    >>> id(x)
    140097008697584
    >>> id(y)
    140097008697584
References to Objects
the value directly, it instead stores a reference to the object that contains
the value.
For example, consider the following:
    >>> x = 5
    >>> y = x
    >>> x = 10

    x = 5           y = x           x = 10
    -----           -----           ------
    x -- int(5)     x               x -- int(10)
                     \
                      int(5)
                     /
                    y               y -- int(5)
It’s worth noting that when you work with mutable objects, like lists or
dictionaries, you have to be aware that if you assign a variable to another
variable, both of the variables fundamentally reference the same object
in memory. This means that any modification made to the object from
one variable will be visible to the other variable.
    x = {}          y = x           x["fizz"] = "buzz"
    ------          -----           ------------------
    x -- {}         x               x
                     \               \
                      {}              {"fizz": "buzz"}
                     /               /
                    y               y
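The diagram above can be reproduced in code. This sketch also shows one way to avoid the shared reference, using the dict.copy() method to make a separate (shallow) copy; the extra "foo" key is an illustrative assumption.

```python
x = {}
y = x                      # y refers to the same dict object as x
x["fizz"] = "buzz"
print(y)                   # the change made through x is visible via y

z = x.copy()               # z refers to a new, separate dict
x["foo"] = "bar"
print(y)                   # still tracks x
print(z)                   # unaffected by the later change
```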
Chapter 2. Operators
Operators are special symbols in Python that carry out various types
of computation. The value that a given operator operates on is called
the operand. For example: in the expression 4 + 5 = 9, 4 and 5 are
operands and + is the operator that designates addition. Python supports
the following types of operators: arithmetic, assignment, comparison,
logical, membership, bitwise, and identity. Each of these operators will
be covered in depth in this chapter.
Arithmetic
and Division, and Addition and Subtraction. That being said, it’s
recommended to use parentheses liberally in order to make the code more
readable and avoid confusion about the order of operations.
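As a brief sketch of why parentheses help, the three expressions below use the same numbers (chosen arbitrarily). The first relies on the default precedence, the second spells that precedence out, and the third uses parentheses to force a different grouping.

```python
implicit = 2 + 3 * 4 ** 2        # exponent first, then multiply, then add
explicit = 2 + (3 * (4 ** 2))    # the same order, written out
grouped = (2 + 3) * 4 ** 2       # addition forced to happen first

print(implicit, explicit, grouped)
```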
Here are the most commonly used arithmetic operators in Python:
    >>> x = 5
    >>> y = 2
    >>> x + y
    7
    >>> x - y
    3
    >>> x * y
    10
    >>> x / y
    2.5
    >>> x // y
    2
    >>> x % y
    1
    >>> x ** y
    25
Assignment
• += - adds the right operand to the left operand and assigns the result
to the left operand
• -= - subtracts the right operand from the left operand and assigns
the result to the left operand
• *= - multiplies the left operand by the right operand and assigns the
result to the left operand
• /= - divides the left operand by the right operand and assigns the
result to the left operand
• %= - takes the modulus of the left operand by the right operand and
assigns the result to the left operand
• //= - floor divides the left operand by the right operand and assigns
the result to the left operand
• **= - raises the left operand to the power of the right operand and
assigns the result to the left operand
For example:
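A minimal sketch of the augmented assignment operators listed above, applied one after another to the same variable. The starting value and operands are illustrative assumptions, and the history list is only there to record each intermediate result.

```python
x = 10
history = []

x += 5;  history.append(x)    # 10 + 5
x -= 3;  history.append(x)    # 15 - 3
x *= 2;  history.append(x)    # 12 * 2
x //= 5; history.append(x)    # 24 floor-divided by 5
x **= 2; history.append(x)    # 4 squared
x %= 7;  history.append(x)    # remainder of 16 / 7
x /= 4;  history.append(x)    # true division always yields a float

print(history)
```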
Packing operators
Python also defines two unpacking operators which can be used to assign
the elements of a sequence to multiple variables in a single expression.
These include:
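The list of operators is not reproduced here. Assuming it refers to the iterable unpacking operator * and the dictionary unpacking operator **, a rough sketch of how each behaves might look like this (all names and values are illustrative):

```python
# * collects the leftover items of a sequence during assignment.
first, *rest = [1, 2, 3, 4]

# * also unpacks iterables inside a new list or tuple.
combined = [*[1, 2], *(3, 4)]

# ** unpacks the key-value pairs of dictionaries into a new dictionary.
defaults = {"color": "red", "size": "M"}
overrides = {"size": "L"}
merged = {**defaults, **overrides}    # later keys win

print(first, rest, combined, merged)
```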
Comparison
• >= - returns True if the left operand is greater than or equal to the
right operand, False otherwise
• <= - returns True if the left operand is less than or equal to the right
operand, False otherwise
• is - returns True if the left operand is the same instance as the right
operand, False otherwise
• is not - returns True if the left operand is not the same instance
as the right operand, False otherwise
    >>> 1 == 1
    True
    >>> 1 == 2
    False
    >>> 1 != 2
    True
    >>> 1 is 1  # Python 3.8+ also emits a SyntaxWarning for "is" with a literal
    True
    >>> 5 < 2
    False
    >>> 5 > 2
    True
    >>> 5 <= 2
    False
    >>> 2 >= 2
    True
    >>> x = int(1)
    >>> y = int(1)
    >>> x is y  # 1 is a singleton in Python
    True
    >>> x = int(257)
    >>> y = int(257)
    >>> x is y  # 257 is not a singleton
    False
    >>> x is not y
    True
Logical
    >>> x = 4
    >>> y = 2
    >>> z = 8
    >>> (x > 2 or y > 5) and z > 6
    True
What is Truthy?
• False
• None
• 0
• 0.0
• Empty collections (e.g. "", [], (), {}, set())
• Objects which define a __bool__() method that returns False
• Objects which define a __len__() method that returns 0 (when no
__bool__() method is defined)
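The falsy values listed above can be checked with the bool() constructor. The sketch below also defines a small hypothetical class, Basket, to show the __len__() rule; classes are covered in Chapter 6, so treat this part as a preview.

```python
falsy_values = [False, None, 0, 0.0, "", [], (), {}, set()]
all_falsy = all(not bool(value) for value in falsy_values)
print(all_falsy)


class Basket:
    """Hypothetical container whose truthiness follows its length."""

    def __init__(self, items):
        self.items = items

    def __len__(self):
        return len(self.items)


empty_basket_truthy = bool(Basket([]))          # __len__() returns 0
full_basket_truthy = bool(Basket(["apple"]))    # __len__() returns 1
print(empty_basket_truthy, full_basket_truthy)
```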
Short-Circuits
It’s important to note that the and and or operators are short-circuit
operators, which means that they stop evaluating as soon as the result is
known. The and operator only evaluates the second operand if the first
operand is truthy, and the or operator only evaluates the second operand
if the first operand is falsy.
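One way to observe short-circuiting is to record which operands actually get evaluated. In this sketch, check() is a hypothetical helper that logs its label before returning a value; functions are covered in Chapter 5.

```python
calls = []


def check(label, result):
    """Record that this operand was evaluated, then return its value."""
    calls.append(label)
    return result


# The first operand is falsy, so "and" never evaluates the second.
check("a", False) and check("b", True)

# The first operand is truthy, so "or" never evaluates the second.
check("c", True) or check("d", True)

print(calls)
```

Only the labels "a" and "c" are recorded; the second operand in each expression is skipped entirely.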
Logical Assignments
    >>> x = "" or 3
    >>> x
    3
This example assigns the variable x the value 3: the empty string is
falsy, so or evaluates to its second operand.
    >>> x = 0 and 6
    >>> x
    0
This example assigns the variable x the value 0: zero is falsy, so and
short-circuits and evaluates to its first operand.
    >>> x = 3 and 6
    >>> x
    6
This example assigns the variable x the value 6: 3 is truthy, so and goes
on to evaluate its second operand, which becomes the result.
In instances where we want to explicitly assign a given variable a boolean
representation of the expression, we would need some way to evaluate
the logical operation before the assignment operation. One way to do this
is using an explicit type conversion, by passing the value to the bool()
constructor:
However, we can also do this using only logical operators, and without
a function call.
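A minimal sketch of both approaches. It assumes that the operator-only form meant here is double negation with not, which is a common idiom but is an assumption about the intended listing.

```python
value = "" or 3                 # evaluates to 3, not True

# Explicit conversion with the bool() constructor.
as_bool = bool("" or 3)

# `not` always returns a bool; applying it twice restores the original sense.
double_not = not not ("" or 3)

# "" is falsy, so `and` returns it and the boolean form is False.
empty = not not ("" and 3)
```

The bool() form is usually easier to read; the double negation merely avoids the function call.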
Membership
Bitwise
>>> x = 0b101
>>> y = 0b011
>>> x & y
1 # 0b001
>>> x | y
7 # 0b111
>>> x ^ y
6 # 0b110
>>> ~x
-6 # -0b110
>>> x << 1
10 # 0b1010
>>> x >> 1
2 # 0b010
It’s important to note that the bitwise NOT operator (~) inverts every
bit of the operand. Because Python integers behave like two’s complement
numbers of unlimited width, ~x is equal to -x - 1, which is why ~0b101
evaluates to -6.
Python also supports the analogous bitwise assignment operators, such as
&=, |=, ^=, <<=, and >>=, which allow you to perform a bitwise operation
and assignment in a single statement.
Identity
• is - returns True if the operands are the same object, False otherwise
• is not - returns True if the operands are not the same object, False
otherwise
You can use this identity operator to check against singletons, such as
None, False, True, etc.
# ./script.py

x = None
if x is None:
    print("x is None!")
It’s important to note that two objects can have the same value but
different memory addresses; in this case the == operator can be used
to check if they have the same value and the is operator can be used to
check if they have the same memory address.
>>> x = [1, 2, 3]
>>> y = [1, 2, 3]
>>> z = x

>>> z is x
True
>>> x is y
False
>>> x == y
True
This example shows that even though x and y contain the same elements,
they are two different lists, so x is not y returns True. The variables
x and z however point to the same list, so x is z returns True.
Chapter 3. Lexical Structure
Up until now, we’ve been working with Python in the context of
single-line statements. However, in order to proceed to writing full
Python scripts and complete programs, we need to take a moment to discuss
the lexical structure which goes into producing valid Python code.
Lexical Structure refers to the smallest units of code that the language is
made up of, and the rules for combining these units into larger structures.
In other words, it is the set of rules that define the syntax and grammar of
the language, and they determine how programs written by developers
will be interpreted.
Line Structure
# ./script.py

for i in range(2):
    print(i)
In this example, we see how the Python script ‘script.py’ looks in our
editor. We can contrast that with how the computer interprets the file: as
a sequence of characters separated by newline characters (as this is on a
Linux machine, the newline character is \n).
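One way to see the raw character sequence is to hold the loop from the script in a string and inspect it with repr(). This is only an illustrative sketch; the variable names are not from the text.

```python
# The loop from the script as one string; "\n" ends each physical line.
source = "for i in range(2):\n    print(i)\n"

# repr() shows the newline characters explicitly instead of rendering them.
raw_view = repr(source)

# splitlines() recovers the physical lines an editor would display.
physical_lines = source.splitlines()
```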
Comments
Two or more physical lines in Python can be joined into a single logical
line using backslash characters \. This is done by placing a backslash at
the end of a physical line, which will be followed by the next physical
line. The backslash and the end-of-line character are removed, resulting
in a single logical line.
>>> # you can join multiple physical lines to make a single logical line
>>> x = 1 + 2 + \
... 3 + 4
>>> x
10
>>> x = 1 + 2 \ # comment
  File "<stdin>", line 1
    x = 1 + 2 \ # comment
               ^
SyntaxError: unexpected character after line continuation character
In this example, all the expressions are spread across multiple physical
lines, but they are still considered as a single logical line, because they
are enclosed in parentheses, square brackets, or curly braces.
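A sketch of implicit line joining with each of the three bracket types. The names and values are illustrative and do not come from the text.

```python
# Parentheses: the expression continues until the closing ")".
total = (
    1 + 2
    + 3 + 4  # comments are allowed on implicitly joined lines
)

# Square brackets: one list literal spread over several physical lines.
names = [
    "Alice",
    "Bob",
]

# Curly braces: the same applies to dictionary (and set) literals.
ages = {
    "Alice": 30,
    "Bob": 25,
}
```

Unlike backslash continuation, implicitly joined lines may carry trailing comments.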
Indentation
For example, in the following code snippet, the lines of code that are
indented under the if statement are considered to be within the scope
of the if statement and will only be executed if the condition x > 0 is
True.
>>> x = 5
>>> if x > 0:
...     print("x is positive")
...     x = x - 1
...     print("x is now", x)
...
x is positive
x is now 4
if condition:
    # code to be executed if condition is Truthy
In addition to the if statement, Python also provides the elif (short for
“else if”) and else statements, which can be used to specify additional
blocks of code to be executed if the initial condition is not met.
if condition:
    # code to be executed if `condition` is truthy
elif other_condition:
    # code to be executed if `condition` is falsy
    # and `other_condition` is truthy
else:
    # code to be executed if both conditions are falsy
Question 1:
What values would trigger the if statement to execute?
while
while condition:
    # code to be executed while `condition` is truthy

>>> condition = 10
>>> while condition:
...     condition -= 1 # condition = condition - 1
...
>>> condition
0 # 0 is falsy
break
>>> condition = 10
>>> while condition > 0:
...     if condition == 5:
...         break
...     condition -= 1
...
>>> condition
5
continue
In instances where you would like to skip an iteration of a loop and move
onto the next, you can use a continue statement. When the interpreter
encounters a continue statement, it skips the rest of the current
iteration and proceeds to the next one.
>>> i = 0
>>> skipped = {2, 4}
>>> while i < 5:
...     i += 1
...     if i in skipped:
...         continue
...     print(i)
...
1
3
5
for
Here, the variable item will take on the value of each item in the
sequence in turn, allowing you to operate on each item of the collection
one at a time.
What is an Iterable?
In this case, word is an iterable where each item is a character, and the
items can be accessed by their position in the string.
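A small sketch of a string used as an iterable. The name word comes from the paragraph above; its value here is a hypothetical choice.

```python
word = "hello"  # hypothetical value

# Iterating over a string yields its characters one at a time.
characters = [character for character in word]

# Strings are also sequences, so items can be read by position.
first = word[0]
last = word[-1]
```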
The for/else construct is a special combination of the for loop and the
else statement. It allows you to specify a block of code (the else block)
that will be executed after the for loop completes, but only if the loop
completes without a break. This means that if the loop is exited early
using a break statement, the else block will not be executed.
2
3
Found 4, exiting loop.
In this example, the for loop iterates over the numbers in the list. When
it encounters the number 4, it exits the loop using the break statement.
Since the loop was exited early, the else block is not executed, and the
message “Loop completed normally” is not printed.
If the loop completes normally, the else block will be executed:
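Both outcomes can be sketched with one function. Here search is a hypothetical helper that collects the messages in a list rather than printing them, so the two runs are easy to compare.

```python
def search(numbers, target):
    """Hypothetical helper: collect messages instead of printing them."""
    messages = []
    for number in numbers:
        if number == target:
            messages.append(f"Found {target}, exiting loop.")
            break
        messages.append(str(number))
    else:
        # Runs only when the loop finishes without hitting `break`.
        messages.append("Loop completed normally")
    return messages


early_exit = search([1, 2, 3, 4, 5], 4)   # ends with "Found 4, exiting loop."
normal_exit = search([1, 2, 3], 4)        # ends with "Loop completed normally"
```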
Exception Handling
>>> x = None
>>> if x is None:
...     raise ValueError("value 'x' should not be None")
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
ValueError: value 'x' should not be None
try:
    # code that may raise an exception
except Exception:
    # code to be executed if the exception occurred
Each try block can have one or more except blocks, so as to handle
multiple different types of errors. Each except block can have multiple
exceptions to match against, by using a tuple of exceptions. The exception
in an except block can be assigned to a variable using the as keyword.
When an Exception is raised in the try block, the interpreter checks
each except block in the order in which they were defined to see if a
particular exception handler matches the type of exception that was
raised. If a match is found, the code in the corresponding except block
is executed, and the exception is considered handled. If no except block
matches the exception type, the exception continues to bubble up the call
stack until it is either handled elsewhere or the program terminates.
>>> try:
...     raise ArithmeticError
... except (KeyError, ValueError):
...     print("handle Key and Value errors")
... except ArithmeticError as err:
...     print(type(err))
...
<class 'ArithmeticError'>
raise from
The raise from idiom can be used to raise a new exception from one that
was caught in an except block, recording the original as its direct
cause. This is useful in instances where we want to add more context to
an exception, so as to give more meaning to the error that was raised.
>>> try:
...     x = int("abc")
... except ValueError as e:
...     raise ValueError(f"Variable `x` received invalid input: {e}") from e
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
ValueError: invalid literal for int() with base 10: 'abc'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
ValueError: Variable `x` received invalid input: invalid literal for int() with base 10: 'abc'
>>> try:
...     pass
... except Exception:
...     print("an exception was raised")
... else:
...     print("no exception was raised")
... finally:
...     print("this is going to be executed regardless")
...
no exception was raised
this is going to be executed regardless
>>> try:
...     raise ValueError
... except ArithmeticError:
...     pass
... finally:
...     print("this is going to be executed regardless")
...
this is going to be executed regardless
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
ValueError
match
match <expression>:
    case <pattern>:
        <block>
cases were to match the expression. Since the second case was a match,
this block was passed over.
Patterns can be composed of a wide range of objects, from simple primitive
types to collections and even user-defined types. Furthermore, collections
can make use of unpacking operators to assign values to variables.
type checking
>>> item = 1
>>> match item:
...     case int(0 | -1):
...         print(f"{item} is either 0 or -1")
...     case int():
...         print(f"{item} is an int, not zero or -1")
...
1 is an int, not zero or -1
In this example, we’re using a match statement to check both the type
and the value of the given item. The first case defines a class pattern
for matching int types if the item matches the pattern of 0 or -1. This
pattern fails to match because item = 1. The second case defines a class
pattern for matching all int types, since a pattern is not provided in the
type call. This pattern matches the item, so the case block is executed.
guards
In this example, the first case matches because the pattern matches the
coordinate and the guard expression y == 1 is True.
Or Patterns
as
In this example, the or pattern is used in the case definition, and each
pattern in the case maps the values of the coordinate to the variables x
and y. Both variables are defined in each pattern, so the case does not
throw a syntax error. The as keyword is used to bind matching values
to variables, which can subsequently be used in the case block when the
pattern is matched.
Chapter 5. Functions
that are returned to the function caller are specified by the return
keyword. If a return is not defined, the function implicitly returns None.
For example, the following code defines a simple function called greet
that takes in a single argument, name, and returns a greeting using that
parameter:
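A minimal sketch of such a function. The exact greeting text is an assumption, and log is a hypothetical second function added to show the implicit None.

```python
def greet(name):
    """Return a greeting for `name`."""
    return f"Hello, {name}!"


def log(message):
    """No return statement, so calling this evaluates to None."""
    print(message)


greeting = greet("Ada")
nothing = log("called log()")
```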
Functions are not executed immediately when they are defined. Instead,
they are executed only when they are called. This means that when
a script or program is running, the interpreter will read the function
definition, but it will not execute the code within the function block until
the program calls the function.
Consider the following script:
# ./script.py

def my_function():
    print("Function called")

print("Script start")
my_function()
print("Script end")
When this script is run, the Python interpreter first reads the function
definition. The function is defined, but the function code is not executed
Once a function is defined, it can be called multiple times, and each time
it will execute the code inside the function.
Function Signatures
    function name
    |
    |   /--/------- function parameters
def add(a, b):
The function name is the identifier that is used to call the function. It
should be chosen to be descriptive and meaningful, so other developers
can ascertain the function’s purpose.
The parameters of a function are variables that are assigned in the scope
of the body of the function when it is called. In Python, the parameters
are defined in parentheses following the function name. Each parameter
has a name, which is used to reference values within the function. It’s
Explicitly positional/key-value
Default Values
Unless this shared state is desired, it’s better to use immutable state for
default values, or a unique object to serve as an identity check.
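The difference can be sketched with two hypothetical functions: one uses a mutable list as its default, the other uses None as a sentinel and builds a fresh list on each call.

```python
def append_shared(item, items=[]):
    """The default list is created once and reused on every call."""
    items.append(item)
    return items


def append_fresh(item, items=None):
    """None acts as a sentinel; a new list is built per call."""
    if items is None:
        items = []
    items.append(item)
    return items


shared_first = list(append_shared(1))
shared_second = list(append_shared(2))  # the earlier 1 is still there

fresh_first = append_fresh(1)
fresh_second = append_fresh(2)
```

The is None check is the identity check mentioned above; when None is itself a meaningful argument, a dedicated object() instance can play the same role.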
Scope
Up until this point, the code we’ve written has made no distinction about
where a variable can be accessed. Once a variable has been defined, later
code is able to access its value and make use of it. With the
introduction of functions, this assumption no longer strictly holds.
Variables defined within a function block cannot be accessed from outside
that block. The function block is said to have its own local scope, also
referred to as a namespace (the space where a given variable name is
assigned), and that block scope is separate from the top-level global
scope of the interpreter.
def my_function():
    x = 5
    print(x)
In this example, the variable x is defined within the local scope of the
function my_function, and it can only be accessed within the block
scope of the function. If you try to access it outside the function, you
will get an error.
Nested Scopes
In Python, the block scopes can be nested. A nested scope is a scope that
is defined within another scope. Any scope which is nested can read
variables from scopes which enclose the nested scope.
When the interpreter searches for a variable referenced in a nested scope,
it follows what’s referred to as the LEGB rule for searching scopes. LEGB
is an acronym for Local, Enclosing, Global, and Built-in. The interpreter
first looks in the local scope for a variable. If it is not found, the interpreter
then looks in any enclosing scopes for the variable. If it is still not found,
the interpreter then looks in the top-level global scope for the variable.
If it is still not found, it looks in the built-in scope.
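The lookup order can be sketched as follows. The names greeting, outer_function, and inner_function follow the surrounding text; reader is a hypothetical addition that has no local variable of its own.

```python
greeting = "Hello"          # Global scope


def outer_function():
    greeting = "Hi"         # Enclosing scope for the nested functions

    def inner_function():
        greeting = "Hey"    # Local scope of inner_function
        return greeting

    def reader():
        # No local `greeting`, so the enclosing scope is searched next.
        return greeting

    return inner_function(), reader()


local_value, enclosing_value = outer_function()
builtin_value = len("abc")  # `len` is found last, in the built-in scope
```

Assigning to greeting inside a function creates a new local name; it does not change the global one.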
and the inner_function scope. When the inner function is called, the
interpreter first looks for the variable greeting in the local scope of the
inner function, and finds it with the value “Hey”, which is then printed to
the console. This scope is then exited when the function returns. The next
print(greeting) call looks for the variable greeting. This variable is
not defined in the local scope, so it next checks for any enclosing scopes.
There is no enclosing scope, as outer_function is a top-level function.
So the interpreter then checks the global scope for greeting, where
greeting is defined as “Hello”, which is then printed to the console.
>>> outer_function()
Hi Hey
Closures
Anonymous Functions
Lambda functions are typically used when you need to define a function
that will be used only once, or when you need to pass a small function as
an argument to another function.
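A short sketch of a lambda passed as the key argument of sorted(), next to the equivalent named function. The word list is illustrative.

```python
words = ["banana", "fig", "cherry"]

# A throwaway key function, passed directly to sorted().
by_length = sorted(words, key=lambda word: len(word))


# The same behavior written with def, for comparison.
def word_length(word):
    return len(word)


same_result = sorted(words, key=word_length)
```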
Decorators
>>> my_function()
entering the function...
inside the function...
exiting the function...
It’s important to note that decorators are syntactic sugar, meaning they
make a particular design pattern more elegant, but they do not add any
additional functionality to the language.
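The equivalence can be sketched by applying the same wrapper twice: once with the @ syntax and once by plain reassignment. The names trace, decorated, and plain are hypothetical, and messages are collected in a list rather than printed.

```python
import functools

events = []


def trace(function):
    """Hypothetical decorator that records entry and exit."""

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        events.append("entering the function...")
        result = function(*args, **kwargs)
        events.append("exiting the function...")
        return result

    return wrapper


@trace
def decorated():
    events.append("inside the function...")


def plain():
    events.append("inside the function...")


# The @ syntax above is shorthand for this explicit reassignment.
plain = trace(plain)

decorated()
decorated_events = list(events)
events.clear()
plain()
plain_events = list(events)
```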
Chapter 6. Classes
details of how those properties and methods work. This allows you to
separate the interface of an object, which defines how it can be used,
from its implementation, which defines how it works.
In Python, a class is defined using the class keyword, followed by the
name of the class and a colon. The definition is followed by an indented
block of code, which is known as the class body.
The pass keyword is used as a placeholder and does not do anything, but
it is needed in this case to create an empty class.
Once a class is defined, it can be instantiated by calling the type.
Inside the block scope of a class, you can define methods that are
associated with an object or class. Methods are used to define the
behavior of an object and to perform operations on the properties of the
object.
Methods are defined inside a class using the def keyword, followed by
the name of the method, a set of parentheses which contain the method’s
arguments, and finally a colon :. The first parameter of a standard
method is always a variable which references the instance of the class.
By convention, this is typically given the name self.
For example:
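A minimal sketch, using a hypothetical class Greeter. The second method shows an additional parameter after self.

```python
class Greeter:
    def greet(self):
        # `self` is the instance the method was called on.
        return f"Hello from {type(self).__name__}!"

    def greet_by_name(self, name):
        # Extra parameters follow `self`, just as in a regular function.
        return f"Hello, {name}!"


greeter = Greeter()           # instantiate by calling the class
anonymous = greeter.greet()   # `greeter` is passed in as `self`
personal = greeter.greet_by_name("Ada")
```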
Methods can also take additional parameters and return values, similar
to functions.
You might have noticed that earlier in the book when we called the
type() function on an object, the return specified that the type was
oftentimes a class.
>>> x = "Hello!"
>>> x
'Hello!'
>>> type(x)
<class 'str'>
In Python, all of the built-in data types, such as integers, strings, lists,
and dictionaries, are implemented in a manner similar to classes. And
similar to user-defined classes, all of the built-in data types in Python
have associated methods that can be used to manipulate their underlying
data structures. For example, and as we just saw, the str class defines a
method called join(). This join method takes the instance of the calling
object, a string, as self, and a collection of other strings which are joined
using self as the separator.
Given this, calling the class method is equally valid; in this case we’re
simply passing the self parameter explicitly.
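The two call styles can be compared directly. The word list here is illustrative.

```python
words = ["a", "b", "c"]

# Usual form: the separator string is bound to `self` automatically.
via_instance = ", ".join(words)

# Equivalent form: call the method on the class and pass `self` explicitly.
via_class = str.join(", ", words)
```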
__dunder__ methods
object by calling it. Those values will be passed into the call of the
__init__ method, where they can be assigned to the instance self via an
instance attribute.
Attributes
Class Attributes
Class attributes are variables that are defined at the class level, rather
than the instance level. They are shared amongst all instances of a class
and can be accessed using either the class name, or an instance of the
class.
It’s worth noting that the same precaution about using default function
arguments also applies to class attributes. Since class attributes are
shared across all instances of the class, mutable state should likewise
be avoided.
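A sketch of the pitfall, using a hypothetical class Team with one mutable class attribute and one per-instance attribute.

```python
class Team:
    """Hypothetical class with a mutable class attribute."""

    members = []                  # one list, shared by every instance

    def __init__(self):
        self.own_members = []     # a separate list per instance


first, second = Team(), Team()

first.members.append("Ada")        # mutates the shared class-level list
first.own_members.append("Grace")  # mutates only `first`

shared_view = second.members       # the change is visible through `second`
private_view = second.own_members  # the per-instance list is untouched
```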
A Functional Approach
the attribute exists, it will be deleted; if it does not exist the function call
will raise an AttributeError.
@staticmethod
@classmethod
Generator Expressions
Generator expressions are useful when working with large data sets
or when the values in the expression are the result of some expensive
computation. Because they generate values on-the-fly, they can be more
memory-efficient than creating a list in memory.
They can also be used to build powerful and efficient iterator pipelines.
Using iterators in this fashion allows you to perform multiple operations
on data without necessitating that each operation hold intermediary
values in memory.
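A sketch of a small iterator pipeline. Each stage is a generator expression, so no intermediate list of a million squares is ever built; the stage names are illustrative.

```python
from itertools import islice

numbers = range(1, 1_000_001)

# Each stage is lazy: nothing is computed until a value is requested.
squares = (n * n for n in numbers)
even_squares = (s for s in squares if s % 2 == 0)

# islice() pulls just three values through the whole pipeline.
first_three = list(islice(even_squares, 3))

# The pipeline remembers its position and continues from there.
fourth = next(even_squares)
```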
Generator Functions
yield from
The yield from statement can be used to pass the context of an iteration
from one generator to the next. When a generator executes yield from,
the control of an iteration passes to the called generator or iterator.
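A minimal sketch of delegation with yield from. The generator names inner and outer are hypothetical.

```python
def inner():
    yield 1
    yield 2


def outer():
    yield 0
    # Delegate to inner() until it is exhausted, then resume here.
    yield from inner()
    # Any iterable works, not only generators.
    yield from [3, 4]


values = list(outer())
```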
Part II. A Deeper Dive 96
List Comprehensions
for loop, because the code that generates the list is fully implemented in
C.
Dictionary Comprehensions
This however is not strictly necessary; for example the following
dictionary comprehension creates a dictionary that maps some letters of
the alphabet to their corresponding ASCII values:
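A sketch of such a comprehension. The choice of letters is an assumption, since the original listing is not reproduced here.

```python
# ord() returns the code point, which equals the ASCII value for these letters.
ascii_values = {letter: ord(letter) for letter in "abc"}
```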
Type Conversions
Type conversions are the process of converting one data type to another
data type. Python provides the built-in functions int(), float(),
complex(), str(), bytes(), list(), dict(), set(), and frozenset()
to use for type conversions.
The int() function is used to convert a number or a string into an
integer. It takes a single object and returns a new object of type int if
a conversion is possible; otherwise it raises a ValueError (or a
TypeError for unsupported types). An optional base keyword can be
provided if the string is in a format other than base 10.
>>> int('123')
123
>>> int('a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'a'
>>> int('101', base=2)
5

>>> float('12.34')
12.34
>>> float('a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float: 'a'

>>> complex(1, 2)
(1+2j)

>>> str(123)
'123'
>>> str(b'\xdf', encoding='latin1')
'ß'

>>> list('abcde')
['a', 'b', 'c', 'd', 'e']
>>> list()
[]

>>> set([1,2,3,1])
{1, 2, 3}
>>> set()
set()

>>> frozenset([1,2,3,1])
frozenset({1, 2, 3})
>>> frozenset()
frozenset()
Mathematical Functions
>>> abs(-5)
5
>>> abs(5)
5
The max() function returns the largest item in an iterable or the largest
of two or more arguments. The same optional keyword values from min
apply.
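A short sketch of max() and min() with the optional key and default keywords. The sample data is illustrative.

```python
words = ["pear", "fig", "banana"]

longest = max(words, key=len)        # compare items by their length
shortest = min(words, key=len)
largest_of_args = max(3, 7, 5)       # several positional arguments
fallback = max([], default=None)     # `default` avoids ValueError on empty input
```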
The sum() function returns the sum of all items in an iterable. It also
accepts an optional second argument which is used as the starting value.
>>> pow(2, 3)
8
The all() and any() functions are used to check if all or any of the
elements in an iterable are truthy. The all() function returns True if
all elements in an iterable are truthy and False otherwise. The any()
function returns True if any of the elements in an iterable are truthy,
and False otherwise.
all() and any() can also be used with a generator expression to check if
all or any elements in a sequence meet a certain condition.
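A sketch of both functions driven by generator expressions, plus the edge case of an empty iterable. The sample numbers are illustrative.

```python
numbers = [2, 4, 6, 7]

all_even = all(n % 2 == 0 for n in numbers)   # False: 7 is odd
any_odd = any(n % 2 == 1 for n in numbers)    # True: 7 is odd

# Edge case: with no elements, all() is True and any() is False.
all_of_nothing = all([])
any_of_nothing = any([])
```

Both functions short-circuit, so the generator is only consumed up to the first element that decides the result.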
dir()
The dir() function is used to find out the attributes and methods of an
object. When called without an argument, dir() returns a list of names in
the current local scope or global scope. When called with an argument,
it returns a list of attribute and method names in the namespace of the
object.
>>> dir()
['__annotations__', '__builtins__', '__doc__',
 '__loader__', '__name__', '__package__', '__spec__']
>>> dir(list)
['__add__', '__class__', '__contains__', '__delattr__',
 '__delitem__', '__dir__', '__doc__', '__eq__',
 '__format__', '__ge__', '__getattribute__',
 '__getitem__', '__gt__', '__hash__', '__iadd__',
 '__imul__', '__init__', '__init_subclass__',
 '__iter__', '__le__', '__len__', '__lt__', '__mul__',
 '__ne__', '__new__', '__reduce__', '__reduce_ex__',
 '__repr__', '__reversed__', '__rmul__', '__setattr__',
 '__setitem__', '__sizeof__', '__str__',
 '__subclasshook__', 'append', 'clear', 'copy', 'count',
 'extend', 'index', 'insert', 'pop', 'remove',
 'reverse', 'sort']
enumerate()
The eval() and exec() functions are built-in Python functions that are
used to evaluate and execute code, respectively.
The eval() function takes at least one argument, which is a string
containing a valid Python expression, and evaluates it, returning the
result of the expression. It also takes optional globals and locals
arguments which act as global and local namespaces. If these values
aren’t provided, the global and local namespaces of the current scope
are used.
>>> x = 1
>>> y = 2
>>> eval("x + y")
3
>>> eval("z + a", {}, {"z": 2, "a": 3})
5

>>> x = 1
>>> y = 2
>>> exec("result = x + y")
>>> result
3
It’s important to note that eval() and exec() can execute any code
that could be written in a Python script. If the strings passed to these
functions are not properly sanitized, a hacker could use them to execute
arbitrary code with the permissions of the user running the script.
Both eval() and exec() have the potential to introduce security
vulnerabilities in your code, so it’s important to be extremely careful
when using them, and to avoid them if possible.
map()
filter()
The input() and print() functions are used to read input from the
user and print output to the console, respectively.
The input() function reads a line of text from the standard input
(usually the keyboard) and returns it as a string. A prompt can be passed
as an optional argument that will be displayed to the end user.
print(
    *objects,
    sep=' ',
    end='\n',
    file=sys.stdout,
    flush=False
)
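A sketch of the sep, end, and file keywords from the signature above. An in-memory io.StringIO stream stands in for a real file so the written text can be inspected.

```python
import io

buffer = io.StringIO()  # in-memory text stream standing in for a file

# Join the objects with " | ", end with "!\n", and write to `buffer`.
print("a", "b", "c", sep=" | ", end="!\n", file=buffer)

# An empty `end` suppresses the trailing newline.
print("no newline", end="", file=buffer)

written = buffer.getvalue()
```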
open()
The open() function takes the path of a file (or an integer file
descriptor) as an argument and opens it in a specified mode. Different
modes can be selected by passing a second positional argument, or the
mode keyword. Modes include r for read mode (default), w for write mode,
a for append mode, x for exclusive creation mode, b for binary mode, t
for text mode (which is the default), and + for both reading and writing.
Modes which don’t conflict can be used in concert.
• buffering - the buffering policy for the file. The default is to use
the system default buffering policy (-1). A value of 1 configures the
file object to buffer per-line in text mode. A value greater than 1
configures a buffer size, in bytes, for the file object to use. Finally, a
value of 0 switches buffering off (though this option is only available
with binary mode).
• encoding - the encoding to be used for the file. The default is None,
which sets the file object to use the default encoding for the platform.
Can be any string which Python recognizes as a valid codec.
• errors - the error handling policy for encoding and decoding errors.
The default is None, which has the same effect as 'strict': a ValueError
is raised when an encoding error occurs.
• newline - controls how line endings are handled in text mode. The
default is None, which enables universal newlines mode: on input, \n, \r,
and \r\n are all translated to \n, and on output \n is translated to the
platform’s default line separator.
• closefd - whether the file descriptor should be closed when the file
is closed. The default is True.
• opener - a custom opener for opening the file. The default is None,
which means that the built-in opener will be used.
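A sketch of several modes and keywords in use. The file name example.txt is hypothetical, and a temporary directory keeps the example self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.txt")  # hypothetical file name

    # "w" creates or truncates the file; text mode is the default.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write("first\n")

    # "a" appends to the end instead of truncating.
    with open(path, mode="a", encoding="utf-8", newline="\n") as handle:
        handle.write("second\n")

    # "rb" combines read mode with binary mode and returns bytes.
    with open(path, "rb") as handle:
        raw = handle.read()

    # With no mode given, the file is opened for reading in text mode.
    with open(path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
```

Passing newline="\n" when writing turns off newline translation, so the bytes on disk are the same on every platform.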
range()
one value is provided, it yields a sequence from [0, n), where n is the
stop value passed to range. If two arguments are provided, the range
is [a, b), where a is the start value and b is the stop value. If a third
argument is provided, it acts as a step value, incrementing the count per
iteration by the value of the third step argument.
>>> tuple(range(2))
(0, 1)
>>> list(range(1, 3))
[1, 2]
>>> for i in range(2, 8, 2):  # range() accepts positional arguments only
...     print(i)
...
2
4
6
sorted()
reversed()
zip()
__new__ and __init__ are two special methods in Python that are used
in the process of creating and initializing new objects.
__new__ is a method that is called when a new object is to be constructed.
It is responsible for creating a new instance of the class. It takes the
class as its first argument, and any additional arguments passed to the
class constructor are passed into the __new__ method. The __new__ method
is then responsible for creating an instance of a class, and returning
that object to the interpreter.
Once an instance has been created by __new__, it is passed to the
__init__ method, along with the provided initializing values as
positional and keyword arguments. __init__ is then responsible for
initializing the state of the new object and performing any other
necessary setup.
In most cases, developers do not need to implement __new__ themselves.
Most of the time it is sufficient to rely on the base implementation,
which creates a new instance of the class. Python then calls __init__ on
that instance, passing the instance as self along with all the provided
arguments.
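The order of the two steps can be sketched with a hypothetical class Point that records each call.

```python
calls = []


class Point:
    """Hypothetical class that records the order of construction steps."""

    def __new__(cls, x, y):
        calls.append("__new__")
        # The base implementation only needs the class itself.
        return super().__new__(cls)

    def __init__(self, x, y):
        calls.append("__init__")
        self.x = x
        self.y = y


point = Point(1, 2)
```

If __new__ returned an object that is not an instance of cls, Python would skip the __init__ call.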
Singletons
can only have one instance, while providing a global access point to that
instance. This pattern can be implemented in Python by using the
__new__ method.
The basic idea is to override the __new__ method in the singleton class
so that it only creates a new instance if one does not already exist.
If an instance already exists, the __new__ method simply returns that
instance, instead of creating a new one.
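A minimal sketch of the idea, using a hypothetical class Settings. The class attribute _instance caches the single object.

```python
class Settings:
    """Hypothetical singleton: every call returns the same instance."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            # First call: create and remember the one instance.
            cls._instance = super().__new__(cls)
        return cls._instance


first = Settings()
second = Settings()
```

Note that if the class also defines __init__, it runs on every call, so any state it sets is reset each time.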
Rich Comparisons
In Python, rich comparison methods are special methods that allow you
to define custom behavior for comparison operators, such as <, >, ==, !=,
<=, and >=.
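The rich comparison methods are:
• __eq__(self, other) - implements ==
• __ne__(self, other) - implements !=
• __lt__(self, other) - implements <
• __le__(self, other) - implements <=
• __gt__(self, other) - implements >
• __ge__(self, other) - implements >=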
class Money:
    def __init__(self, amount: int, currency: str):
        self.amount = amount
        self.currency = currency

    def __eq__(self, other):
        return (
            self.amount == other.amount
            and self.currency == other.currency
        )

    def __lt__(self, other):
        if self.currency != other.currency:
            raise ValueError(
                "Can't compare money of differing currencies."
            )
        return self.amount < other.amount

    def __le__(self, other):
        if self.currency != other.currency:
            raise ValueError(
                "Can't compare money of differing currencies."
            )
        return self.amount <= other.amount
In this example, the Money class has two attributes: amount, which is
the monetary value, and currency, which is the currency type. The
__eq__ method compares the amount and currency of two Money objects,
and returns True if they are the same, and False otherwise. The __lt__
and __le__ methods compare the amount values of two Money objects with
the same currency, and raise a ValueError if the currencies differ.
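You typically only need to define about half of the rich comparison
methods for a given class. If the left operand does not implement a
requested operation, Python tries the reflected method on the right
operand (a > b falls back to b < a, and a >= b to b <= a), and !=
defaults to the inverse of ==.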
With these methods defined, a Money object can be used in comparison
operations such as <, >, ==, !=, <=, and >= in a natural and intuitive
way. For example, Money(10, "USD") < Money(20, "USD") will return True,
and Money(10, "USD") == Money(10, "EUR") will return False.
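A sketch of the fallbacks in action. This trimmed version of Money defines only __eq__ and __lt__, so > and != have to be resolved through them.

```python
class Money:
    def __init__(self, amount: int, currency: str):
        self.amount = amount
        self.currency = currency

    def __eq__(self, other):
        return (
            self.amount == other.amount
            and self.currency == other.currency
        )

    def __lt__(self, other):
        if self.currency != other.currency:
            raise ValueError("Can't compare money of differing currencies.")
        return self.amount < other.amount


ten, twenty = Money(10, "USD"), Money(20, "USD")

less_than = ten < twenty              # calls ten.__lt__(twenty)
greater_than = twenty > ten           # no __gt__: falls back to ten.__lt__(twenty)
not_equal = ten != Money(10, "EUR")   # no __ne__: inverts the result of __eq__

try:
    ten < Money(10, "EUR")
except ValueError as error:
    mismatch_message = str(error)
```

When every ordering operator is needed, functools.total_ordering can derive the remaining ones from __eq__ and one ordering method.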
Operator Overloading
In this example, the Money class has two attributes: amount, which is the
monetary value, and currency, which is the currency type.
The __add__ method overloads the + operator, allowing you to add two
Money objects. It also checks that the operands are of the same
currency; otherwise it raises a ValueError.
The __sub__ method overloads the - operator, allowing you to subtract
two Money objects. It also checks that the operands are of the same
currency; otherwise it raises a ValueError.
The __mul__ method overloads the * operator, allowing you to multiply a
Money object by an int. If the value is not an int, it raises a
ValueError.
The __truediv__ method overloads the / operator, allowing you to divide
a Money object by an int. It returns the quotient and the remainder in
the form of new Money objects in the same currency.
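A sketch of a Money class matching the description above. It is a reconstruction, not the original listing: the helper _check_currency, the error messages, and the use of divmod() are assumptions.

```python
class Money:
    def __init__(self, amount: int, currency: str):
        self.amount = amount
        self.currency = currency

    def _check_currency(self, other):
        # Hypothetical helper shared by __add__ and __sub__.
        if self.currency != other.currency:
            raise ValueError("Can't combine money of differing currencies.")

    def __add__(self, other):
        self._check_currency(other)
        return Money(self.amount + other.amount, self.currency)

    def __sub__(self, other):
        self._check_currency(other)
        return Money(self.amount - other.amount, self.currency)

    def __mul__(self, factor):
        if not isinstance(factor, int):
            raise ValueError("Money can only be multiplied by an int.")
        return Money(self.amount * factor, self.currency)

    def __truediv__(self, divisor):
        # Assumption: amounts are integers, so return (quotient, remainder).
        quotient, remainder = divmod(self.amount, divisor)
        return Money(quotient, self.currency), Money(remainder, self.currency)


total = Money(500, "USD") + Money(250, "USD")
change = Money(500, "USD") - Money(125, "USD")
doubled = Money(500, "USD") * 2
share, left_over = Money(1000, "USD") / 3
```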
String Representations
Money(USD $12.05)
>>> str(Money(1200, "USD"))
'$12.0'
Emulating Containers
Container types such as lists, tuples, and dictionaries have built-in
behavior for certain operations, such as the ability to iterate over
elements, check if an element is in their collection, and retrieve the
length of the container. We can emulate this behavior using special
methods.
The [] operator can be overloaded using the following methods:
• __getitem__(self, key) - called to evaluate obj[key]
• __setitem__(self, key, value) - called for obj[key] = value
• __delitem__(self, key) - called for del obj[key]
By using these methods, we can create classes that behave like the
built-in container types, making our code more expressive and Pythonic.
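A sketch of a container built from these special methods. Playlist is a hypothetical class that wraps a list and forwards each operation to it.

```python
class Playlist:
    """Hypothetical container wrapping a list of track names."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):                      # len(playlist)
        return len(self._tracks)

    def __contains__(self, track):          # track in playlist
        return track in self._tracks

    def __iter__(self):                     # for track in playlist
        return iter(self._tracks)

    def __getitem__(self, index):           # playlist[index]
        return self._tracks[index]

    def __setitem__(self, index, track):    # playlist[index] = track
        self._tracks[index] = track

    def __delitem__(self, index):           # del playlist[index]
        del self._tracks[index]


playlist = Playlist(["intro", "verse", "outro"])
playlist[1] = "chorus"
del playlist[0]
```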
Emulating Functions
Using Slots
...
...     def __getattr__(self, key):
...         return self._data.get(key)
...
>>> this = ADTRecursion(my_value=1)
>>> this.my_value
  File "<stdin>", line 12, in __getattribute__
  File "<stdin>", line 12, in __getattribute__
  File "<stdin>", line 12, in __getattribute__
  [Previous line repeated 996 more times]
RecursionError: maximum recursion depth exceeded
>>> that = AbstractDataType(my_value=1)
>>> that.my_value
1
The __setattr__ method is called when an attribute is set using the dot
notation (e.g., obj.attribute = value) or when using the setattr()
built-in function. It takes the attribute name and value as its parameters
and should set the attribute to the specified value.
The __delattr__ method is called when an attribute is deleted using the
del statement (e.g., del obj.attribute) or when using the delattr()
built-in function. It takes the attribute name as its parameter and should
delete the attribute.
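The two hooks can be sketched together in a class that records every
attribute change. The Audited class and its log attribute are hypothetical
names chosen for this illustration. Note that both methods delegate to the
default implementation through super() so the attribute is actually stored
or removed.

```python
class Audited:
    """Hypothetical class that records attribute changes."""

    def __init__(self):
        # Bypass our own __setattr__ so creating the log is not itself logged.
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        self.log.append(("set", name))
        # Delegate to the default implementation to store the value.
        super().__setattr__(name, value)

    def __delattr__(self, name):
        self.log.append(("del", name))
        # Delegate to the default implementation to remove the attribute.
        super().__delattr__(name)


obj = Audited()
obj.x = 1             # dot notation calls __setattr__
setattr(obj, "y", 2)  # so does the setattr() built-in
del obj.x             # the del statement calls __delattr__
```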
Iterators
stops iterating when the internal value reaches the stop value, as at this
point the StopIteration exception is raised.
Lazy Evaluation
Context Managers
To hook into the context manager protocol, an object should define the
special methods __enter__ and __exit__. The __enter__ method
is called when the context is entered, and can be used to acquire the
resources needed by the block of code. It can return an object that will
be used as the context variable in the as clause of the with statement.
The __exit__ method is called when the context is exited, and it can
be used to release the resources acquired by the __enter__ method.
It takes three arguments: an exception type, an exception value, and
a traceback object. The __exit__ method can use these arguments to
perform cleanup actions, suppress exceptions, or log errors. If you don’t
plan on using the exception values in the object, they can be ignored.
entering!
exiting!
Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
    raise ValueError
ValueError
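As a minimal sketch of the protocol described above, the hypothetical
Resource class below records when the context is entered and exited.
Returning a falsy value from __exit__ lets the exception propagate, which
is why the ValueError still reaches the surrounding try block.

```python
class Resource:
    """Hypothetical resource that records when it is acquired and released."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("entering")
        return self  # bound to the name given in the `as` clause

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exiting")
        # A falsy return value lets any exception propagate;
        # returning True would suppress it.
        return False


resource = Resource()
try:
    with resource as r:
        raise ValueError("boom")
except ValueError:
    resource.events.append("propagated")
```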
Descriptors
Descriptors are objects that define one or more of the special methods
__get__, __set__, and __delete__. These methods are used to customize
While valid, this isn’t the typical use case of the property descriptor. You
will most commonly see the property descriptor invoked as a decorator
(we’ll talk more about decorators later).
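A minimal sketch of the decorator form follows. The Temperature class is a
hypothetical example: celsius is a validated read-write property, while
fahrenheit defines no setter and is therefore read-only.

```python
class Temperature:
    """Hypothetical class exposing a validated attribute via property."""

    def __init__(self, celsius):
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self):
        # Read-only: no setter is defined for this property.
        return self._celsius * 9 / 5 + 32


t = Temperature(100)
```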
Inheritance
class SuperClass:
    def __init__(self, name):
        self.name = name

    def print_name(self):
        print(self.name)


class SubClass(SuperClass):
    pass
In Python, a class can inherit from multiple classes by listing them in the
class definition, separated by commas. For example:
diamond problem, and can lead to ambiguity about which version of the
method or property to use.
To resolve this ambiguity, Python uses a method resolution order (MRO)
to determine the order in which the classes are searched for a method
or property. The MRO used in Python is the C3 linearization algorithm,
which creates a linearization of the class hierarchy that is guaranteed to
be consistent and predictable.
C3 guarantees the following:
>>> class A:
...     def method(self):
...         print("A")
...
... class B(A):
...     pass
...
... class C(A):
...     def method(self):
...         print("C")
...
... class D(B, C):
...     pass
...
Here, class D inherits from both B and C which both inherit from A. If we
create an instance of D and call the method method, C is printed, because
>>> D().method()
C
>>> D.__mro__
(__main__.D, __main__.B, __main__.C, __main__.A, object)
Encapsulation
Polymorphism
The syntax for creating a custom metaclass is to define a new class that
inherits from the type class. From there, we can overload the default
methods of the type class to create custom hooks that run when a class
is defined. This is typically done using the __new__ method to define
the behavior of class creation. The __init__ method is also commonly
overridden to define the behavior of the class initialization.
For example, the following code defines a custom metaclass that adds a
greeting attribute to the class definition, and prints to the console when
this is done:
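A minimal sketch of such a metaclass, assuming the hypothetical names
GreetingMeta and Person, might look like this. The __new__ method runs when
the class statement is executed, not when instances are created.

```python
class GreetingMeta(type):
    """Hypothetical metaclass that adds a `greeting` attribute to classes."""

    def __new__(mcls, name, bases, namespace):
        # Runs once, when a class using this metaclass is defined.
        namespace.setdefault("greeting", f"Hello from {name}!")
        cls = super().__new__(mcls, name, bases, namespace)
        print(f"created class {name} with a greeting")
        return cls


class Person(metaclass=GreetingMeta):
    pass
```

Defining Person triggers the print call, and Person.greeting is then
available as an ordinary class attribute.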
In this chapter, we’re going to circle back and look more in depth at some
of Python’s built-in data types. As previously discussed, the built-in data
types are all classes which implement a series of attributes and methods.
In order to inspect these attributes and methods, we can use the built-in
function dir() to see a list of all the names of attributes and
methods that the object has. This can be a useful tool for exploring
>>> help(str.count)
Help on method_descriptor:

count(...)
    S.count(sub[, start[, end]]) -> int

    Return the number of non-overlapping occurrences
    of substring sub in string S[start:end]. Optional
    arguments start and end are interpreted as in slice
    notation.
Numbers
The three major numeric types in Python are int, float, and complex.
While each data type is distinct, they share some attributes to make
interoperability easier. For example, the .conjugate() method
returns the complex conjugate of a complex number - for integers and
floats, this is just the number itself. Furthermore, the .real and .imag
attributes define the real and imaginary part of a given number. For floats
and ints, the .real part is the number itself, and the .imag part is always
zero.
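The shared interface can be seen by applying the same three operations to
one value of each numeric type. The variable names here are chosen only for
illustration.

```python
values = [3, 2.5, 3 - 4j]  # an int, a float, and a complex

# Each numeric type exposes .real, .imag, and .conjugate()
summary = [(v.real, v.imag, v.conjugate()) for v in values]

for row in summary:
    print(row)
```

For the int and the float, the conjugate is the number itself and the
imaginary part is zero; only the complex value has its imaginary sign flipped.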
Integers
The Python int has several methods for convenience when it comes to
bit and byte representations. The int.bit_length() method returns
the number of bits necessary to represent an integer in binary, excluding
the sign and leading zeros. int.bit_count() returns the number of bits
set to 1 in the binary representation of an integer. int.from_bytes()
and int.to_bytes() convert integer values to and from a bytestring
representation.
>>> int(5).bit_count()
2
>>> int(5).bit_length()
3
>>> int(5).to_bytes()
b'\x05'
>>> int.from_bytes(b'\x05')
5
Floats
The Python floating point value, represented by the float data type, is
a primitive data type that is used to represent decimal numbers. Floats
are numbers with decimal points, such as 3.14 or 2.718. There are also
a few special float values, such as float('inf'), float('-inf'), and
float('nan'), which are representations of infinity, negative infinity,
and “Not a Number”, respectively.
Float Methods
>>> float(5).is_integer()
True

>>> float(1.5).as_integer_ratio()
(3, 2)
Hex Values
Complex Numbers
>>> (3-4j).conjugate()
(3+4j)
Strings
Paddings
Formatting
Translating
Partitioning
delimiter in the string, instead of the first. If the delimiter is not found,
the tuple contains two empty strings, followed by the original string.
ends with the specified suffix and returns the modified string. They are
non-failing, so they return a resultant string regardless of whether or
not the substring was present.
Boolean Checks
Case Methods
Encodings
Bytes
>>> "Maß".encode('utf8')
b'Ma\xc3\x9f'
There are a few distinct methods which are unique to the bytes type,
and we’ll discuss those here.
Decoding
Hex Values
Tuples
The Python tuple data type has limited functionality, given its nature as
an immutable data type. The two methods it does define however are
tuple.count() and tuple.index().
>>> t = (1, 2, 3, 2, 1)
>>> t.count(2)
2
The .index() method is used to find the index of the first occurrence of
a specific element in a tuple. The method takes one required argument,
which is the element to find, and returns the index of the first occurrence
of that element in the tuple. If the element is not found in the tuple, the
method raises a ValueError.
>>> t = (1, 2, 3, 2, 1)
>>> t.index(2)
1
Lists
The Python list type defines a suite of methods which are used to inspect
and mutate the data structure.
>>> l = [1, 2, 3, 2, 1]
>>> l.count(2)
2
The .index() method is used to find the index of the first occurrence
of a specific element in a list. The method takes one required argument,
which is the element to find, and returns the index of the first occurrence
of that element in the list. If the element is not found in the list, the
method raises a ValueError.
>>> l = [1, 2, 3, 2, 1]
>>> l.index(2)
1
Copying
The copy method is used to create a shallow copy of a list. It does not
take any arguments and it returns a new list that is a copy of the original
list.
>>> l1 = [object()]
>>> l2 = l1.copy()
>>> l2
[<object at 0x7f910fec4630>]
>>> l1[0] is l2[0]
True
Mutations
>>> l = [0]
>>> l.append(1)
>>> l
[0, 1]

>>> l = [1, 2, 3]
>>> l.extend((4, 5, 6))
>>> l
[1, 2, 3, 4, 5, 6]

>>> l = [1, 2, 3, 4]
>>> l.insert(2, 5)
>>> l
[1, 2, 5, 3, 4]

>>> l = [1, 2, 3, 2, 1]
>>> l.remove(2)
>>> l
[1, 3, 2, 1]
The .pop() method is used to remove an element from a list and return
it. The method takes one optional argument, which is the index of the
element to remove. If no index is provided, the method removes and
returns the last element of the list.
>>> l = [1, 2, 3, 4]
>>> l.pop(2)
3
>>> l
[1, 2, 4]
Finally, the .clear() method is used to remove all elements from a list.
It does not take any arguments and empties the list in place.
>>> l = [1, 2, 3, 4]
>>> l.clear()
>>> l
[]
Orderings
>>> l = [1, 2, 3, 4]
>>> l.reverse()
>>> l
[4, 3, 2, 1]
Finally, the .sort() method is used to sort the elements in a list. The
method is an in-place operation and returns None. It takes no required
arguments but accepts two optional keyword arguments, key and reverse.
The key argument specifies a function that is used to extract a
comparison value from each element in the list. The reverse argument is
a Boolean value that specifies whether the list should be sorted in
descending rather than ascending order. The sort() method sorts the
elements in ascending order by default.
>>> l = [3, 2, 4, 1]
>>> l.sort()
>>> l
[1, 2, 3, 4]
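The key and reverse arguments can be sketched with a list of words sorted
by length. The word list is made up for this illustration. Because the sort
is stable, words of equal length keep their existing relative order in both
directions.

```python
words = ["pear", "fig", "banana", "kiwi"]

# Sort in place by word length, shortest first.
words.sort(key=len)
by_length = list(words)

# Sort in place by word length, longest first.
words.sort(key=len, reverse=True)
by_length_desc = list(words)

print(by_length)       # ['fig', 'pear', 'kiwi', 'banana']
print(by_length_desc)  # ['banana', 'pear', 'kiwi', 'fig']
```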
Dictionaries
The Python dict type defines a suite of methods which are used to inspect
and mutate the data structure.
Iter Methods
Getter/Setter Methods
>>> d = {}
>>> d.setdefault("a", []).append(1)
>>> d
{'a': [1]}
Mutations
the second is an optional value to set as the default value for each key. If
a default value is not specified, it defaults to None.
Sets
Sets are used to store unique elements. They are useful for operations
such as membership testing, removing duplicates from a sequence and
mathematical operations. They are mutable and have a variety of built-in
methods to perform various set operations.
Mutations
The .add() method is used to add an element to a set. The method takes
one required argument, which is the element to add, and adds it to the
set.
>>> s = {1, 2, 3}
>>> s.add(4)
>>> s
{1, 2, 3, 4}

>>> s = {1, 2, 3}
>>> s.remove(2)
>>> s
{1, 3}

>>> s = {1, 2, 3}
>>> s.discard(2)
>>> s
{1, 3}
>>> s.discard(4)
>>> s
{1, 3}

>>> s = {1, 2, 3}
>>> s.pop()
1
>>> s
{2, 3}

>>> s = {1, 2}
>>> s.update([2, 3], (4, 5))
>>> s
{1, 2, 3, 4, 5}
The .clear() method is used to remove all elements from a set. It does
not take any arguments and empties the set in place.
>>> s = {1, 2, 3}
>>> s.clear()
>>> s
set()
Set Theory
>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s1.union(s2)
{1, 2, 3, 4}

>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s1.intersection(s2)
{2, 3}
>>> s1.intersection_update(s2)
>>> s1
{2, 3}

>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s1.difference(s2)
{1}
>>> s1.difference_update(s2)
>>> s1
{1}

>>> s1 = {1, 2, 3}
>>> s2 = {2, 3, 4}
>>> s1.symmetric_difference(s2)
{1, 4}
>>> s1.symmetric_difference_update(s2)
>>> s1
{1, 4}
Boolean Checks
Several methods available for set objects allow you to check the
relationship between a pair of sets.
The .isdisjoint() method returns True if two sets have no common
elements.
>>> s1 = {1, 2, 3}
>>> s2 = {4, 5, 6}
>>> s1.isdisjoint(s2)
True

>>> s1 = {1, 2, 3}
>>> s2 = {1, 2, 3, 4, 5, 6}
>>> s1.issubset(s2)
True

>>> s1 = {1, 2, 3, 4, 5, 6}
>>> s2 = {1, 2, 3}
>>> s1.issuperset(s2)
True
Copying
are only used for static analysis and documentation purposes. It should
also be explicitly noted that, absent a mechanism for type checking, type
hints are not guaranteed to be accurate by the interpreter. They are hints
in the very sense of the word.
One of the best places to start adding type hints is in existing functions.
Functions typically have explicit inputs and outputs, so adding type
definitions to I/O values is relatively straightforward. Once you have
identified a function that can be type annotated, determine the types of
inputs and outputs for that function. This should be done by inspecting
the code and looking at the context in which each variable of the function
is used. For example, if a function takes two integers as inputs and returns
a single integer as output, you would know that the inputs should have
type hints int and the return type hint should be int as well.
Type hints for functions go in the function signature. The syntax for these
hints is name: type[=default] for function arguments, and -> type:
for the return value.
In this example, the multiply function is given type annotations for the
a and b variables. In this case, those types are specified as integers. Since
the resulting type of multiplying two integers is always an integer, the
return type of this function is also int.
It’s worth reiterating that this multiply function will not fail at runtime
if it is passed values which aren’t integers. However, static checkers like
mypy will fail if the function is called elsewhere in the code base with
non-int values, and text editors with LSP support will indicate that you
are using the function in error.
# ./script.py

def multiply(a: int, b: int) -> int:
    return a * b

print(multiply(1, 2.0))
Union types
You can specify multiple possible types for a single argument or return
value by using a union type. This is done either by using the Union type
from the typing module, enclosing the possible types in square brackets,
or by using the bitwise or operator | as of Python 3.10.
# ./script.py

def multiply(a: int|float, b: int|float) -> int|float:
    return a * b

print(multiply(1, 2.0))
Optional
# ./script.py

from typing import Optional

def greet(name: str, title: Optional[str] = None) -> str:
    if title:
        return f"Hello, {title} {name}"
    return f"Hello, {name}"

print(greet("Justin", "Dr."))
print(greet("Cory"))
type|None
# ./script.py

def greet(name: str, title: str|None = None) -> str:
    if title:
        return f"Hello, {title} {name}"
    return f"Hello, {name}"

print(greet("Justin", "Dr."))
print(greet("Cory"))
Literal
The Literal type allows you to restrict the values that an argument or
variable can take to a specific set of literal values. This can be used for
ensuring that a function or method is only called with a specific correct
argument.
# ./script.py

from typing import Literal

def color_picker(
    color: Literal["red", "green", "blue"]
) -> tuple[int, int, int]:
    match color:
        case "red":
            return (255, 0, 0)
        case "green":
            return (0, 255, 0)
        case "blue":
            return (0, 0, 255)

print(color_picker("pink"))
Final
# ./script.py

from typing import Final

API_KEY: Final = "77d75da2e4a24afb85480c3c61f2eb09"
API_KEY = "c25e77071f5a4733bcd453c037adeb3f"

class User:
    MAXSIZE: Final = 32

class NewUser(User):
    MAXSIZE: Final = 64
TypeAlias
In instances where inlining type hints becomes too verbose, you can use
a TypeAlias type to create an alias for a definition based on existing
types.
# ./script.py

from typing import Literal, TypeAlias

ColorName = Literal["red", "green", "blue"]
Color: TypeAlias = tuple[int, int, int]

def color_picker(color: ColorName) -> Color:
    match color:
        case "red":
            return (255, 0, 0)
        case "green":
            return (0, 255, 0)
        case "blue":
            return (0, 0, 255)

print(color_picker("green"))
NewType
# ./script.py

from typing import Literal, NewType

ColorName = Literal["red", "green", "blue"]
Color = NewType("Color", tuple[int, int, int])

def color_picker(color: ColorName) -> Color:
    match color:
        case "red":
            return Color((255, 0, 0))
        case "green":
            return Color((0, 255, 0))
        case "blue":
            return Color((0, 0, 255))

print(color_picker("green"))
TypeVar
# ./script.py

from typing import TypeVar

T = TypeVar('T')

def identity(item: T) -> T:
    return item

identity(1)
# ./script.py

from typing import TypeVar

class Missing(int):
    _instance = None
    def __new__(cls, *args, **kwargs):
        if not cls._instance:
            cls._instance = super().__new__(cls, -1)
        return cls._instance

# True, False, and Missing all satisfy this type
# because they are subclasses of int. However,
# int types also satisfy.
# The TypeVar's name must match the variable it is assigned to.
BoundTrinary = TypeVar("BoundTrinary", bound=int)

def is_falsy(a: BoundTrinary) -> bool:
“Missing” state for trinary logic (in comparison to boolean logic which
has 1 and 0, trinary logic has three states: 1, 0, and -1). Since this
new missing state, as well as the True and False singletons, are all
instances of int subclasses, we can create a TypeVar named BoundTrinary
which is bound to the int type, and this type definition will satisfy
the three states in the trinary. However, this type will also match any
variable of int type; if this is undesirable we can instead constrain
the type definition to exclusively the bool and Missing types, as
demonstrated in the ConstrainedTrinary type definition. Static type
checkers will now report a type error if an int type is passed where a
ConstrainedTrinary is expected.
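The constrained variant described above could be sketched as follows. The
body of is_falsy is an assumption made for illustration. The constraint
list is passed as positional arguments to TypeVar, after the name.

```python
from typing import TypeVar


class Missing(int):
    """Singleton standing in for the third, "missing" state (-1)."""

    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls, -1)
        return cls._instance


# Constrained to exactly bool or Missing; static type checkers
# reject a plain int where this type is expected.
ConstrainedTrinary = TypeVar("ConstrainedTrinary", bool, Missing)


def is_falsy(a: ConstrainedTrinary) -> bool:
    # Hypothetical body: report whether the value is falsy.
    return not a
```

As with all type hints, the constraint is only enforced by static checkers;
at runtime is_falsy accepts any object.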
Protocols
# ./script.py

from typing import Protocol

class Incrementable(Protocol):
    def increment(self) -> None: ...

class Counter:
    def __init__(self):
        self.value = 0

class CountByOnes(Counter):
    def increment(self):
        self.value += 1

class CountByTwos(Counter):
    def increment(self):
        self.value += 2

def increment_n(counter: Incrementable, n: int) -> None:
    for _ in range(n):
        counter.increment()

for c in (CountByOnes(), CountByTwos()):
    increment_n(c, 10)
    print(c.value)
# ./script.py

from typing import Protocol, runtime_checkable

@runtime_checkable
class Incrementable(Protocol):
    def increment(self) -> None: ...

def increment_n(counter: Incrementable, n: int) -> None:
    if not isinstance(counter, Incrementable):
        raise TypeError
Generics
Generics are used to create type-safe data abstractions while being type
agnostic. To do this, we can create classes which subclass the Generic
type from the typing module. When the class is later instantiated, a type
definition can then be supplied at the point of object construction using
square brackets.
# ./script.py

from typing import Generic, TypeVar

T = TypeVar("T")

class Queue(Generic[T]):
    def __init__(self) -> None:
        self._data: list[T] = []

    def push(self, item: T) -> None:
        self._data.append(item)

    def pop(self) -> T:
        return self._data.pop(0)

int_queue = Queue[int]()
str_queue = Queue[str]()

# fails type check, pushing a string to an `int` queue
int_queue.push('a')
square bracket notation. This makes it so code fails static type checking if
items of incorrect type are pushed and popped from either of the queues.
TypedDict
# ./script.py

from typing import TypedDict

class Person(TypedDict):
    name: str
    age: int

person = Person(name="Chloe", age=27)
Classes which subclass TypedDict can use class variables to specify keys
in a dictionary key-value store. The included type annotation asserts
during static type checking that the value associated with a given key is
of a specified type.
At instantiation, TypedDict objects require that all variables are passed
to the constructor as keyword arguments. Failing to pass an expected
variable will cause static type checking to fail. This default behavior
can be changed, however, by passing a total=False flag in the class
definition. Setting this flag will make all keys optional by default. We
can further mark a specific key as either Required or NotRequired to
designate whether that key in the collection requires a value at
instantiation.
# ./script.py

from typing import TypedDict, NotRequired, Required

class FirstClass(TypedDict, total=False):
    a: str
    b: str

class SecondClass(TypedDict):
    a: str
    b: NotRequired[str]

class ThirdClass(TypedDict, total=False):
    a: str
    b: Required[str]

first = FirstClass()    # passes type check, nothing is required
second = SecondClass()  # fails type check, missing 'a'
third = ThirdClass()    # fails type check, missing 'b'
Finally, TypedDict objects can subclass the Generic type. This allows
dictionary values to be collections of types which are specified at
instantiation.
# ./script.py

from typing import TypedDict, Generic, TypeVar

T = TypeVar("T")

class Collection(TypedDict, Generic[T]):
    name: str
    items: list[T]

coll = Collection[int](name="uid", items=[1,2,3])
Modules
Consider the following directory structure with two files. The first file is
a module file, and the second file is a script.
# ./module.py

def add(a, b):
    return a + b

def subtract(a, b):
    return a - b
In the script file, we use the import statement to import our module.
The import statement will import the entire module and make all of
its definitions and statements available in the current namespace of
the script, under the name module. We can access all of the defined
functionality of the module using the dot syntax, similar to accessing
the attributes of a class.
# ./script.py

import module

print(module.add(1, 2))
print(module.subtract(1, 2))
With this, we can run python ./script.py and the output to the
console is 3, and then -1.
root@b9ba278f248d:/# cd code/
root@b9ba278f248d:/code# ls
module.py script.py
root@b9ba278f248d:/code# python script.py
3
-1
root@b9ba278f248d:/code#
If we want to instead import our module under a given alias, we can use
the as keyword in our import statement to rename the module in the
scope of the current namespace.
# ./script.py

import module as m

print(m.add(1, 2))
print(m.subtract(1, 2))

# ./script.py

from module import (
    add,
    subtract,
)

print(add(1, 2))
print(subtract(1, 2))
In instances where our modules are collected into folders within the root
directory, we can use dot syntax to specify the path to a module.
# ./script.py
import modules.module as m

print(m.add(1, 2))
Module Attributes
There are several attributes that are automatically created for every
module in Python. Some of the most commonly used module attributes
are:
if __name__ == "__main__":
# ./script.py

import module

def main():
    print(module.add(1, 2))

if __name__ == "__main__":
    main()
Packages
modules and the package name serves as a prefix for the modules
it contains. For example, a package named mypackage could contain
modules mypackage.module1, mypackage.module2 and so on.
To create a package, you need to create a directory with the package
name, and within that directory, you can have one or more modules.
The directory should contain an __init__.py file, which is a special
file that tells Python that this directory should be treated as a package.
Outside the directory you should have a setup file. Typically these come
in the form of either a setup.py file or a pyproject.toml file. The
setup.py file is a setup script that the setuptools package uses to
configure the build of a Python package. A typical setup.py file will
call a function named setup() that is used to specify build parameters
and metadata. A pyproject.toml file accomplishes much of the same
functionality, but uses a declarative configuration scheme to define the
project’s metadata, dependencies, and build settings. For now, we’ll
just use the pyproject.toml toolchain, and introduce setup.py later
when necessary.
Let’s consider the following package structure:
my_package/
    logging/
        log.py
    math/
        addition.py
        subtraction.py
    __init__.py
    __main__.py
pyproject.toml
# ./pyproject.toml

[build-system]
requires = ["setuptools", "wheel"]
# ./my_package/__main__.py

if __name__ == "__main__":
    print("hello from main!")
Inside the my_package directory are two folders. The first is a folder called
logging, and inside that folder there is a log.py file. There is only one
function in this file, named fn. This function simply calls the print()
function.
# ./my_package/logging/log.py

def fn(*args, **kwargs):
    print(*args, **kwargs)
Inside the second folder, math, are two files. One file, addition.py,
contains a function which adds two numbers. This function also calls fn
from logging.log to print out the two numbers whenever it is called. The
second file, subtraction.py, does the same, except it returns the result
of subtracting the two numbers, instead of adding them.
# ./my_package/math/addition.py

from ..logging.log import fn

def add(a, b):
    fn(a, b)
    return a + b

# ./my_package/math/subtraction.py

from my_package.logging.log import fn

def subtract(a, b):
    fn(a, b)
    return a - b
Packages allow for both relative and absolute imports across the package.
Relative imports allow you to specify the package and module names
relative to the current namespace using dot notation. For example, in
addition.py the log file’s fn is imported relatively. The logging module
is one level up the folder tree, so ..logging uses a second . to specify
that the logging module sits one directory above the current one. As
another example, from the addition.py file you could
import the subtract function by specifying a relative import from
.subtraction import subtract. Absolute imports on the other hand
Installation
root@e08d854dfbfe:/code# ls
my_package pyproject.toml

root@e08d854dfbfe:/code# python -m pip install -e .
Obtaining file:///code
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: my-package
Building editable for my-package (pyproject.toml) ... done
Created wheel for my-package: filename=my_package-0.0.0-0.editable-py3-none-any.whl size=2321 sha256=2137df7f84a48acdcb77e9d1ab333bf838693910338918fe16870419cb351979
Inside the newly created venv directory, we’ll see a few folders. First is
the ./venv/bin folder (on Windows this will be the ./venv/Scripts
folder) which is where executables will be placed. The two files within
this folder that are worth knowing about are the python file, which
is a symlink to the binary executable of the global Python version,
and the activate script, which for POSIX systems (Linux, macOS) is
generally ./venv/bin/activate (or an analogous activate.fish
or activate.csh if using either the fish shell or csh/tcsh shells,
respectively), and for Windows is ./venv/Scripts/Activate.ps1 if
Part III. The Python Standard Library 209
Apart from the ./venv/bin folder, the venv package also creates a
./venv/lib folder. Inside this folder nested under a python version
folder is a folder called site-packages. This site-packages directory
inside the ./venv/lib/ directory of the virtual environment is where
third-party packages are installed when the virtual environment is
active.
This can be contrasted against the action of deepcopy. Given the same
object my_list, we can call copy.deepcopy on this list, which returns
a new list that we assign to deep_copy. The deep_copy is a copy of the
list my_list, so calling id() on deep_copy and my_list yields different
identifiers, as they are different lists. The difference however is that
now if we call id() on any of the items of the two lists, such as the
item at the zeroth index, the call yields different values. This is because
copy.deepcopy recursively traverses collections, making copies of
each item, until the new copy is completely independent of the original
item.
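The contrast between the two functions can be sketched with a nested list.
The contents of my_list here are made up for illustration; the items are
themselves lists so that the difference in identity is observable.

```python
import copy

my_list = [[1, 2], [3, 4]]

shallow_copy = copy.copy(my_list)   # new outer list, same inner lists
deep_copy = copy.deepcopy(my_list)  # new outer list, new inner lists

# Mutating an inner list of the original shows which copies share it.
my_list[0].append(99)

print(shallow_copy[0] is my_list[0])  # True: the item is shared
print(deep_copy[0] is my_list[0])     # False: the item was copied
print(shallow_copy[0])                # [1, 2, 99]
print(deep_copy[0])                   # [1, 2]
```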
Chaining Iterables
Filtering Iterables
Finally, if you only wish to filter out the elements of an iterable while
the callback is satisfied, the itertools.dropwhile() function can be
used. This function creates an iterator that drops elements from the
input iterator as long as a given callback is truthy for those elements.
Once the callback returns a falsy value for an element, that element and
all remaining elements are included in the output iterator. There’s also
an analogous itertools.takewhile() function, which instead of dropping
values yields them until the callback returns a falsy value.
>>> next(result)
2
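The two functions can be compared side by side on the same input. The
numbers list and the is_odd callback are hypothetical names used only for
this sketch.

```python
import itertools

numbers = [1, 3, 5, 2, 7, 4]


def is_odd(n):
    return n % 2 == 1


# Drops 1, 3, 5; once 2 fails the test, everything after is kept,
# including the later odd value 7.
dropped = list(itertools.dropwhile(is_odd, numbers))

# Yields 1, 3, 5 and stops at the first value that fails the test.
taken = list(itertools.takewhile(is_odd, numbers))

print(dropped)  # [2, 7, 4]
print(taken)    # [1, 3, 5]
```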
Creating Groups
>>> c
{1: [1, 3, 5], 0: [2, 4, 6]}
Slicing Iterables
Zip Longest
Partials
Reduce
Pipes
Caching
for the same input. It stores the result of the function call for a specific
input in a cache. If the function is called again with the same input, the
cached result is returned instead of recomputing the result. This can be
useful for reducing the amount of computation required for a function,
and can also be useful for improving the performance of a function that
is called multiple times with the same input.
In some instances it may be necessary to limit the cache size. In this case,
you can use functools.lru_cache() to set an optional maxsize for
the cache, where the “least recently used” values that exceed the cache
size are dropped. By default this value is 128 items. The cache can be
invalidated using the .cache_clear() method on the cache.
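A minimal sketch of a bounded cache follows. The square function and the
calls list are hypothetical; the list records each real computation so
that cache hits and evictions can be observed.

```python
import functools

calls = []


@functools.lru_cache(maxsize=2)
def square(n):
    calls.append(n)  # record each real computation
    return n * n


square(2)  # computed
square(2)  # served from the cache
square(3)  # computed
square(4)  # computed; 2 is now least recently used and is evicted
square(2)  # computed again

info = square.cache_info()
print(calls)                    # [2, 3, 4, 2]
print(info.hits, info.misses)   # 1 4

square.cache_clear()            # invalidate every cached entry
```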
Dispatching
>>> fn = functools.singledispatch(
...     lambda x: print("unknown type:", x)
... )
>>> fn.register(int)(lambda x: print(f"type int: {x}"))
>>> fn.register(float)(lambda x: print(f"type float: {x}"))
>>> fn("a")
unknown type: a
>>> fn(1)
type int: 1
>>> fn(1.0)
type float: 1.0
Enums
match current_day:
    case Weekdays.Monday:
        print("Today is Monday")
    case Weekdays.Tuesday:
        print("Today is Tuesday")
    case Weekdays.Wednesday:
        print("Today is Wednesday")
    case Weekdays.Thursday:
        print("Today is Thursday")
    case Weekdays.Friday:
        print("Today is Friday")
    case Weekends.Saturday | Weekends.Sunday:
        print("Weekend!")
NamedTuples
NamedTuples are a special type of tuple in Python that allows you to give
names to the elements of the tuple. They are designed to be interoperable
with regular tuples, but they also allow you to access elements by name,
in addition to by index.
The NamedTuple class from the typing module can be used to create a
NamedTuple. The derived class defines attributes via type annotations.
Instance variables are defined at instantiation. A NamedTuple can be
instantiated using variadic arguments or keyword arguments; in the
case of using variadic args, the order of attributes defined in the class
determines which indexed value is assigned to each attribute.
A NamedTuple can also be created using the factory function namedtuple
in the collections module. In this case, the first argument is
the name of the NamedTuple, and the second argument is an iterable
containing the name of each attribute.
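Both ways of creating a NamedTuple can be sketched as follows. The Point
and Point2 names are hypothetical and chosen only for this illustration.

```python
from collections import namedtuple
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


# Factory-function equivalent from the collections module.
Point2 = namedtuple("Point2", ["x", "y"])

p = Point(1, y=2)     # positional and keyword arguments can be mixed
q = Point2(x=3, y=4)

x, y = p              # unpacks like a regular tuple
print(p.x, p[1])      # access by name or by index
```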
Dataclasses
Multithreading
Thread Locks
...
>>> for t in threads:
...     t.join()
...
>>> value  # should be 2
1
Multiprocessing
The Process() class can be used to create and manage new processes.
A process is a separate execution environment, with its own memory
space and Python interpreter. This allows you to take advantage of
multiple cores on a machine, and to work around the Global Interpreter
Lock (GIL) that prevents multiple threads from executing Python code
simultaneously. The multiprocessing.Process class has the same
API as the threading.Thread class.
The multiprocessing library also provides a Pool class that can be used
to orchestrate multiple tasks in parallel. A pool of worker processes is a
group of processes that can be reused to execute multiple tasks.
Process Locks
concurrent.futures
a maximum time to wait for the futures to complete before stopping the
iteration.
It’s common to utilize as_completed when you want to process the
results of the tasks as soon as they are completed, regardless of the order
in which they were submitted. On the other hand, wait is useful when you
want to block until all the tasks are completed; it returns unordered sets
of done and not-done futures, so to read the results in submission order
you iterate over your own list of futures afterwards.
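The difference can be sketched with a thread pool and a few tasks of
different durations. The slow_echo task and the delay values are
hypothetical; only the as_completed and wait calls come from
concurrent.futures.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed, wait


def slow_echo(delay):
    """Hypothetical task: sleep for `delay` seconds, then return it."""
    time.sleep(delay)
    return delay


delays = [0.3, 0.05, 0.15]

with ThreadPoolExecutor(max_workers=3) as executor:
    # as_completed yields each future as soon as it finishes.
    futures = [executor.submit(slow_echo, d) for d in delays]
    by_completion = [f.result() for f in as_completed(futures, timeout=5)]

    # wait blocks until every future is done (ALL_COMPLETED is the default).
    futures = [executor.submit(slow_echo, d) for d in delays]
    done, not_done = wait(futures, timeout=5)
    # `done` is a set, so walk our own list to keep submission order.
    by_submission = [f.result() for f in futures]

print(by_completion)  # usually the shortest delay first
print(by_submission)  # same order as `delays`
```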
chapter. Our focus instead will be how developers can use asyncio in
the context of backend systems; framework developers are encouraged
to read the full specification in Python’s official documentation.
Coroutines
raises the group. Exceptions from the group can be selectively handled
using an except* syntax, which is akin to unpacking exceptions from
the group to handle individually.
In this example, the try block catches both the AssertionError and the
ValueError within a single iteration of the event loop. Both exceptions
are grouped together into a single ExceptionGroup by the TaskGroup,
and the ExceptionGroup is raised. Tasks which aren’t finished by the
time the ExceptionGroup is raised are cancelled.
The except* handler unpacks the exceptions of the exception group, which
allows you to handle each exception in isolation from the other exceptions
caught by the group exception handler. In this case, the AssertionError
and the ValueError are printed to the console, and the main block exits.
Part VI. The Underbelly of
the Snake
As Python developers, it’s important to have the skills and tools to find
and fix bugs, and to optimize the performance of our code. As a codebase
grows and becomes more complex, it becomes increasingly difficult to
identify and fix issues. Additionally, as our respective platforms start
to gain users, or are deployed in more demanding environments, ensuring
that they perform well becomes even more critical.
In this section, we’ll cover the various techniques and tools available for
debugging and profiling Python, as well as how to optimize bottlenecks
with C extensions.
pdb
The Python Debugger, also known as pdb, is a built-in module that allows
you to debug your code by stepping through it line by line and inspecting
the state of the program at each step. It is a command-line interface that
provides a set of commands for controlling the execution of the program
and for inspecting the program’s state.
When using pdb, you can set breakpoints in your code, which will cause
the interpreter to pause execution at that point, and drop you into a
debugger REPL. You can then use various commands to inspect the state
of the program, such as viewing the values of variables, inspecting the
call stack, and seeing the source code. You can also step through your
code, line by line, to see how each command mutates state.
# ./script.py

def my_closure(value):
    def my_decorator(fn):
        def wrapper(*args, **kwargs):
            _enclosed = (fn, value)
            breakpoint()
        return wrapper
    return my_decorator

@my_closure("this")
def my_function(*args, **kwargs):
    pass

my_function(0, kwarg=1)
of a decorator. The breakpoint will drop us into the pdb repl when the
interpreter hits this line in execution.
From here, we have a set of commands which we can use to inspect the
state of our program. Here are some of the most common:
root@e08d854dfbfe:~# ls
script.py
root@e08d854dfbfe:~# python ./script.py
--Return--
> /root/script.py(5)wrapper()->None
-> breakpoint()
(Pdb) help

Documented commands (type help <topic>):
========================================
EOF    c          d        h         list      q        rv       undisplay
a      cl         debug    help      ll        quit     s        unt
alias  clear      disable  ignore    longlist  r        source   until
args   commands   display  interact  n         restart  step     up
b      condition  down     j         next      return   tbreak   w
break  cont       enable   jump      p         retval   u        whatis
bt     continue   exit     l         pp        run      unalias  where

Miscellaneous help topics:
==========================
exec  pdb
(Pdb) help whatis
whatis arg
        Print the type of the argument.
In addition, the pdb repl is able to run executable python. This includes
(Pdb) list
  1     def my_closure(value):
  2         def my_decorator(fn):
  3             def wrapper(*args, **kwargs):
  4                 _enclosed = (fn, value)
  5  ->             breakpoint()
  6             return wrapper
  7         return my_decorator
  8
  9     @my_closure("this")
 10     def my_function(*args, **kwargs):
 11         pass

(Pdb) where
  /root/script.py(13)<module>()
-> my_function(0, kwarg=1)
> /root/script.py(5)wrapper()->None
-> breakpoint()

(Pdb) p args
(0,)

(Pdb) p value
'this'

(Pdb) dir(kwargs)
['__class__', '__class_getitem__', '__contains__', '__delattr__',
'__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
'__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__',
Python programs can be run within the context of the python debugger.
This allows the debugger to catch unexpected exceptions, dropping you
# ./script.py

def my_closure(value):
    def my_decorator(fn):
        def wrapper(*args, **kwargs):
            _enclosed = (fn, value)
            raise ValueError
        return wrapper
    return my_decorator

@my_closure("this")
def my_function(*args, **kwargs):
    pass

my_function(0, kwarg=1)
(Pdb) c
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/pdb.py", line 1774, in main
    pdb._run(target)
  File "/usr/local/lib/python3.11/pdb.py", line 1652, in _run
    self.run(target.code)
  File "/usr/local/lib/python3.11/bdb.py", line 597, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/root/script.py", line 13, in <module>
    my_function(0, kwarg=1)
  File "/root/script.py", line 5, in wrapper
    raise ValueError
ValueError
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> /root/script.py(5)wrapper()
-> raise ValueError

(Pdb) ll
  3             def wrapper(*args, **kwargs):
  4                 _enclosed = (fn, value)
  5  ->             raise ValueError
Other Debuggers
such as call or return, and report how much time elapsed between the
auditable events of a given set of instructions. There are also third party
libraries available which can analyze other aspects of the runtime, like
memory usage.
cProfile
One of the most commonly used tools for profiling Python code is the
cProfile module. It is a built-in library that generates statistics on the
number of calls and the time spent in each function. This information
can be used to identify which parts of the code are taking the most time
to execute, and make adjustments accordingly.
To use cProfile, you can run your script with the command python -m
cProfile ./script.py and it will output the statistics of the script’s
execution. You can pass an optional -s argument to control how the output
is sorted; by default the output is sorted by standard name (file name,
line number, function name), but it can be set to cumulative to sort by
cumulative time, ncalls to sort by the call count, etc. You can also pass
-o ./file.prof to dump the results to a file instead of printing them;
the -s argument only applies when -o is not supplied.
import time


def slow_mult(a, b):
    time.sleep(1.1)
    return a * b


def fast_mult(a, b):
    time.sleep(0.1)
    return a * b


def run_mult(a, b):
    x = slow_mult(a, b)
    y = fast_mult(a, b)
    _abs = abs(x - y)
    return _abs < 0.001


def main():
    a, b = 1, 2
    run_mult(a, b)


if __name__ == "__main__":
    main()
>)
        1    0.000    0.000    1.200    1.200 script.py:19(main)
        1    0.000    0.000    1.200    1.200 script.py:12(run_mult)
        2    1.200    0.600    1.200    0.600 {built-in method time.sleep}
        1    0.000    0.000    1.100    1.100 script.py:3(slow_mult)
        1    0.000    0.000    0.100    0.100 script.py:7(fast_mult)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.abs}
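cProfile can also be driven from inside a program rather than from the
command line. The sketch below profiles a small hypothetical workload
(slow_step, fast_step and workload are made-up names) and uses the
standard pstats module to sort the report by cumulative time.

```python
import cProfile
import io
import pstats
import time


def slow_step():
    time.sleep(0.05)


def fast_step():
    time.sleep(0.01)


def workload():
    slow_step()
    fast_step()


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Render the five most expensive entries, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Calling profiler.dump_stats with a file path instead would write the same
data that the -o command line argument produces.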
flameprof
The data dump of a cProfile run can be used to generate what’s called a
flame graph. Flame graphs are visual representations of how much time
is spent within the scope of a given function call. Each bar in a flame
graph represents a function and its subfunctions, with the width of the
bar representing the amount of time spent in that function. Functions
that take up more time are represented by wider bars, and functions
that take up less time are represented by narrower bars. The functions
are stacked vertically, with the main function at the bottom and the
subfunctions at the top.
The Python library flameprof can be used to generate flame graphs
from the output of cProfile. To generate one, first run cProfile with
the -o argument to dump results to a file. Next, use flameprof to ingest
the dump file; flameprof will use that profile to generate an SVG file,
which you can open in a web browser to see the results.
flamegraph
snakeviz
memory_profiler
# ./script.py

from memory_profiler import profile


@profile
def main():
    my_list = [257] * (10**6)
    return my_list


if __name__ == "__main__":
    main()
your Python code, which can be especially useful for large or complex
applications.
One of the benefits of using C extensions is that C is much faster than
Python, especially for tasks that are computationally intensive or involve
heavy manipulation of data. With C extensions, you can take advantage
of the performance benefits of C while still being able to use Python for
the higher-level logic and user interface parts of your code.
Additionally, C extensions can also be used to interface with existing
C libraries and APIs. This allows you to leverage existing libraries and
tools that are already available in C, making it easier to build complex
systems and applications.
It is important to note that the use of C extensions in Python requires a
good understanding of programming in C. If you are unfamiliar with C
or are not comfortable with its syntax and concepts, it is recommended
that you first learn C before attempting to use C extensions in Python.
Writing and using C extensions can be complex and requires a strong
understanding of both Python and C, so it is important to have a solid
foundation in both languages before diving into this aspect of Python
development.
Hello World
To start, we’re going to create a new python package with the following
folder structure:
my_package/
    __init__.py
    hello_world.c
setup.py
hello_world.c
// ./my_package/hello_world.c

// Python.h must be included before any standard headers.
#include <Python.h>

#include <stdio.h>

// METH_NOARGS functions still receive (self, args); args is always NULL.
static PyObject* hello_world(PyObject* self, PyObject* Py_UNUSED(args)) {
    puts("Hello World!");
    Py_RETURN_NONE;
}

static char HelloWorldFunctionDocs[] =
    "prints 'Hello World!' to the screen, from C.";

static PyMethodDef MethodTable[] = {
    {
        "hello_world",
        (PyCFunction) hello_world,
        METH_NOARGS,
        HelloWorldFunctionDocs
    },
    {NULL, }  // sentinel marking the end of the table
};

static char HelloWorldModuleDocs[] =
    "module documentation for 'Hello World'";

static struct PyModuleDef HelloWorld = {
    PyModuleDef_HEAD_INIT,
    "hello_world",
    HelloWorldModuleDocs,
    -1,
    MethodTable
};

PyMODINIT_FUNC PyInit_hello_world(void) {
    return PyModule_Create(&HelloWorld);
}
setup.py
Now that we have our C extension written to a file, we need to tell python
how to compile this extension into a dynamically linked shared object
# ./setup.py

import os.path
from setuptools import setup, Extension

extensions = [
    Extension(
        'my_package.hello_world',
        [os.path.join('my_package', 'hello_world.c')]
    )
]

setup(
    name="my_package",
    ext_modules=extensions
)
Finally, to use our extension, install the package and call the module
function in Python.
root@edc7d7fa9220:/code# ls
my_package setup.py

root@edc7d7fa9220:/code# python -m pip install -e .
Obtaining file:///code
  Preparing metadata (setup.py) ... done
Installing collected packages: my-package
  Running setup.py develop for my-package
Successfully installed my-package-0.0.0

[notice] A new release of pip available: 22.3.1 -> 23.0
[notice] To update, run: pip install --upgrade pip

root@edc7d7fa9220:/code# python
Python 3.11.1 (main, Jan 23 2023, 21:04:06) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from my_package import hello_world
>>> hello_world.__doc__
"module documentation for 'Hello World'"
>>> hello_world.hello_world.__doc__
"prints 'Hello World!' to the screen, from C."
>>> hello_world.hello_world()
Hello World!
>>>
// ./my_package/hello_world.c

// Python.h must be included before any standard headers.
#include <Python.h>

#include <stddef.h>
#include <stdio.h>

// METH_NOARGS functions still receive (self, args); args is always NULL.
static PyObject* hello_world(PyObject* self, PyObject* Py_UNUSED(args)) {
    puts("Hello World! \n- With <3 from C");
    Py_RETURN_NONE;
}

static char HelloWorldFunctionDocs[] =
    "prints 'Hello World!' to the screen, from C.";

static PyMethodDef MethodTable[] = {
    {
        "hello_world",
        (PyCFunction) hello_world,
        METH_NOARGS,
        HelloWorldFunctionDocs
    },
    {NULL, }  // sentinel marking the end of the table
};

static char HelloWorldModuleDocs[] =
    "module documentation for 'Hello World'";

static struct PyModuleDef HelloWorld = {
    PyModuleDef_HEAD_INIT,
    "hello_world",
    HelloWorldModuleDocs,
    -1,
    MethodTable
};

PyMODINIT_FUNC PyInit_hello_world(void) {
    return PyModule_Create(&HelloWorld);
}
on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from my_package import hello_world
>>> hello_world.hello_world()
Hello World!
- With <3 from C
>>>
#include <Python.h>

// Sums any mix of int and float arguments; other types are skipped.
static PyObject* sum(PyObject* self, PyObject* args) {
    PyObject* iter = PyObject_GetIter(args);
    if (iter == NULL)
        return NULL;
    PyObject* item;

    long res_i = 0;
    double res_f = 0;
    int saw_float = 0;  // set once any float argument is seen

    while ((item = PyIter_Next(iter))) {
        if (PyLong_Check(item)) {
            long val_i = PyLong_AsLong(item);
            res_i += val_i;
        }
        else if (PyFloat_Check(item)) {
            double val_f = PyFloat_AsDouble(item);
            res_f += val_f;
            saw_float = 1;
        }
        Py_DECREF(item);
    }
    Py_DECREF(iter);

    // Return a float if any argument was a float, even if they sum to 0.0.
    if (saw_float) {
        double result = res_f + res_i;
        return PyFloat_FromDouble(result);
    }
    return PyLong_FromLong(res_i);
}

static PyMethodDef MethodTable[] = {
    {
        "sum",
        (PyCFunction) sum,
        METH_VARARGS,
        "returns the sum of a series of numeric types"
    },
    {NULL, }  // sentinel marking the end of the table
};


static struct PyModuleDef MyMathModule = {
    PyModuleDef_HEAD_INIT,
    "math",
    "my math module",
    -1,
    MethodTable,
};

PyMODINIT_FUNC PyInit_math(void) {
    return PyModule_Create(&MyMathModule);
}
<class 'int'>
>>> (val := math.sum(1, 2, 3, 4.0))
10.0
>>> type(val)
<class 'float'>
>>>
Memory Management
Parsing Arguments
the specified variables, and zero otherwise. If the arguments are invalid,
it raises an exception in Python to indicate the error, and the C function
should immediately return NULL.
include names for all of the arguments, whether or not they are strictly
positional vs keyword.
my_package/math.cpython-311-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-cpython-311/my_package/hello_world.cpython-311-x86_64-linux-gnu.so -> my_package
copying build/lib.linux-x86_64-cpython-311/my_package/math.cpython-311-x86_64-linux-gnu.so -> my_package

root@cb4fbbf71628:/code# python -c "from my_package import math; math.print_ints(pos=1, keyword=2)"
postional: 1
keyword: 2
root@cb4fbbf71628:/code#
Creating PyObjects
It should be noted that in this example each object is assigned to only
one collection. If you wish to assign the same object to multiple
collections, for example appending the _float to the _list as well as
adding it to the _set, you’ll need to increase the reference count.
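The effect of sharing one object between collections can be observed from
Python itself. This is a rough sketch that assumes CPython, where
sys.getrefcount reports the object's reference count; the variable names
are hypothetical.

```python
import sys

value = object()
before = sys.getrefcount(value)

as_list = [value]  # the list now holds one reference
as_set = {value}   # the set holds another

after = sys.getrefcount(value)
print(after - before)  # number of references added by the two containers

# Dropping both containers releases the references they owned.
del as_list, as_set
released = sys.getrefcount(value)
```

Each container owns its own reference, which is the bookkeeping that C
code has to perform by hand when it stores one object in several places.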
Importing Modules
#include <Python.h>
#include "structmember.h"

typedef struct {
    PyObject_HEAD
    PyObject *first_name;
    PyObject *last_name;
} Person;

static void Person_Destruct(Person* self) {
    Py_XDECREF(self->first_name);
    Py_XDECREF(self->last_name);
    Py_TYPE(self)->tp_free((PyObject*)self);
}

static int Person_Init(Person *self, PyObject *args, PyObject *kwargs) {

    PyObject *first=NULL, *last=NULL;
    static char *kwlist[] = {"first_name", "last_name", NULL};
    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "OO", kwlist,
                                     &first, &last))
        return -1;

    PyObject* _first = self->first_name;
    Py_INCREF(first);
    self->first_name = first;
    Py_XDECREF(_first);

    PyObject* _last = self->last_name;
    Py_INCREF(last);
    self->last_name = last;
    Py_XDECREF(_last);

    return 0;
}

static PyObject* Person_FullName(Person* self) {
    if (self->first_name == NULL) {
        PyErr_SetString(PyExc_AttributeError, "first_name is not defined");
        return NULL;
    }

    if (self->last_name == NULL) {
        PyErr_SetString(PyExc_AttributeError, "last_name is not defined");
        return NULL;
    }

    return PyUnicode_FromFormat("%S %S", self->first_name, self->last_name);
}

static PyMemberDef PersonMembers[] = {
    {
        "first_name",
        T_OBJECT_EX,
        offsetof(Person, first_name),
        0,
        "first name"
    },
    {
        "last_name",
        T_OBJECT_EX,
        offsetof(Person, last_name),
        0,
        "last name"
    },
    {NULL, }
};

static PyMethodDef PersonMethods[] = {
    {
        "full_name",
        (PyCFunction)Person_FullName,
        METH_NOARGS,
        "return the full name of the Person"
    },
    {NULL, }
};

static PyTypeObject PersonType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    "person.Person",              /* tp_name */
    sizeof(Person),               /* tp_basicsize */
    0,                            /* tp_itemsize */
    (destructor)Person_Destruct,  /* tp_dealloc */
    0,                            /* tp_print */
    0,                            /* tp_getattr */
    0,                            /* tp_setattr */
    0,                            /* tp_reserved */
    0,                            /* tp_repr */
    0,                            /* tp_as_number */
    0,                            /* tp_as_sequence */
    0,                            /* tp_as_mapping */
    0,                            /* tp_hash */
    0,                            /* tp_call */
    0,                            /* tp_str */
    0,                            /* tp_getattro */
    0,                            /* tp_setattro */
    0,                            /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT
        | Py_TPFLAGS_BASETYPE,    /* tp_flags */
    "Person objects",             /* tp_doc */
    0,                            /* tp_traverse */
    0,                            /* tp_clear */
    0,                            /* tp_richcompare */
    0,                            /* tp_weaklistoffset */
    0,                            /* tp_iter */
    0,                            /* tp_iternext */
    PersonMethods,                /* tp_methods */
    PersonMembers,                /* tp_members */
    0,                            /* tp_getset */
    0,                            /* tp_base */
    0,                            /* tp_dict */
    0,                            /* tp_descr_get */
    0,                            /* tp_descr_set */
    0,                            /* tp_dictoffset */
    (initproc)Person_Init,        /* tp_init */
    0,                            /* tp_alloc */
    PyType_GenericNew,            /* tp_new */
};

static PyModuleDef PersonModule = {
    PyModuleDef_HEAD_INIT,
    "person",
    "Example module for creating Python types in C",
    -1,
    NULL
};

PyMODINIT_FUNC PyInit_person() {
    if (PyType_Ready(&PersonType) < 0)
        return NULL;

    PyObject* module = PyModule_Create(&PersonModule);
    if (module == NULL)
        return NULL;

    Py_INCREF(&PersonType);
    PyModule_AddObject(module, "Person", (PyObject*)&PersonType);
    return module;
}
#include <Python.h>
#include "structmember.h"

typedef struct {
    PyObject_HEAD
    PyObject *first_name;
    PyObject *last_name;
} Person;
Next, we define a Person struct. This struct represents the new type in
Python that can be used to model a person. The struct begins with the
PyObject_HEAD macro, which declares the standard header fields (the
reference count and the type pointer) shared by all Python objects in C.
It is followed by two custom fields, first_name and last_name, which are
pointers to the Python objects that will represent the first and last
name of our person.
    0,                            /* tp_iter */
    0,                            /* tp_iternext */
    PersonMethods,                /* tp_methods */
    PersonMembers,                /* tp_members */
    0,                            /* tp_getset */
    0,                            /* tp_base */
    0,                            /* tp_dict */
    0,                            /* tp_descr_get */
    0,                            /* tp_descr_set */
    0,                            /* tp_dictoffset */
    (initproc)Person_Init,        /* tp_init */
    0,                            /* tp_alloc */
    PyType_GenericNew,            /* tp_new */
};
Stack type
1 #include <Python.h>
2 #include "structmember.h"
3
4 typedef struct {
5 PyObject_HEAD
6 size_t length;
7 PyObject* pop;
8 PyObject* push;
9 PyObject** _data;
10 } Stack;
11
12 static PyObject* Stack_Push(Stack* self, PyObject* item) {
13 size_t len = self->length + 1;
14 self->_data = realloc(self->_data, len*sizeof(PyObject*));
15 Py_INCREF(item);
16 self->_data[self->length] = item;
17 self->length = len;
18 Py_RETURN_NONE;
19 }
20
21 static PyObject* Stack_Pop(Stack* self) {
22 if (self->length == 0)
23 Py_RETURN_NONE;
24 long len = self->length - 1;
25 PyObject* item = self->_data[len];
26 self->_data = realloc(self->_data, len*sizeof(PyObject*));
27 self->length = len;
28 return item;
29 }
30
31 static PyObject* Stack_New(PyTypeObject* type, PyObject* args, Py\
32 Object* kwargs) {
65 sizeof(Stack), /* tp_basicsize */
66 0, /* tp_itemsize */
67 (destructor)Stack_Destruct, /* tp_dealloc */
68 0, /* tp_print */
69 0, /* tp_getattr */
70 0, /* tp_setattr */
71 0, /* tp_reserved */
72 0, /* tp_repr */
73 0, /* tp_as_number */
74 0, /* tp_as_sequence */
75 0, /* tp_as_mapping */
76 0, /* tp_hash */
77 0, /* tp_call */
78 0, /* tp_str */
79 0, /* tp_getattro */
80 0, /* tp_setattro */
81 0, /* tp_as_buffer */
82 Py_TPFLAGS_DEFAULT
83 | Py_TPFLAGS_BASETYPE, /* tp_flags */
84 "Stack objects", /* tp_doc */
85 0, /* tp_traverse */
86 0, /* tp_clear */
87 0, /* tp_richcompare */
88 0, /* tp_weaklistoffset */
89 0, /* tp_iter */
90 0, /* tp_iternext */
91 StackMethods, /* tp_methods */
92 StackMembers, /* tp_members */
93 0, /* tp_getset */
94 0, /* tp_base */
95 0, /* tp_dict */
96 0, /* tp_descr_get */
97 0, /* tp_descr_set */
98 0, /* tp_dictoffset */
99 0, /* tp_init */
100 0, /* tp_alloc */
101 Stack_New, /* tp_new */
102 };
103
104 static PyModuleDef StackModule = {
105 PyModuleDef_HEAD_INIT,
106 "stack",
107 "module for custom stack object",
108 -1,
109 NULL
110 };
111
112 PyMODINIT_FUNC PyInit_stack() {
113 if (PyType_Ready(&StackType) < 0)
114 return NULL;
115
116 PyObject* module = PyModule_Create(&StackModule);
117 if (!module)
118 return NULL;
119
120 Py_INCREF(&StackType);
121 PyModule_AddObject(module, "Stack", (PyObject*)&StackType);
122 return module;
123 }
tiation of our Stack struct, setting the default values for the _data and
length fields. Since Stack() won’t take any initialization arguments,
we set tp_init to 0. We also use T_LONG as the member type in the
PyMemberDef struct, so that the field value is automatically converted to
a Python int when the attribute is requested from the interpreter. We
also flag this member as READONLY so as to prevent its value from being
reassigned.
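To make the intended behaviour concrete, here is a pure-Python model of
the same interface. It is only a sketch for comparison, not the C
extension itself: a property without a setter stands in for the READONLY
member, and a list stands in for the _data array.

```python
class Stack:
    """Pure-Python model of the C Stack type (a sketch, not the extension)."""

    def __init__(self):
        self._data = []

    @property
    def length(self):
        # No setter is defined, so assignment raises AttributeError,
        # which mirrors the READONLY member flag.
        return len(self._data)

    def push(self, item):
        self._data.append(item)

    def pop(self):
        # Like the C version, popping an empty stack returns None.
        if not self._data:
            return None
        return self._data.pop()


stack = Stack()
stack.push(1)
stack.push("two")
print(stack.length)
print(stack.pop(), stack.pop(), stack.pop())

try:
    stack.length = 5
except AttributeError as exc:
    print("read-only:", exc)
```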
Since this object is responsible for its own data collection, it’s worth
taking a moment to look at its implementation in finer detail,
specifically the push() and pop() methods of the struct, as well as the
Stack_Destruct implementation that is called at garbage collection.
When an item is passed to the Stack during a push operation, this
PyObject* is being passed as a borrowed reference. Since we are writing
a copy of this reference to _data, we need to explicitly inform the
interpreter that the Stack owns a reference, and until Stack releases
this reference, the item should not be garbage collected. We do this by
increasing the reference count.
When an item is popped from the Stack, we can simply return it without
decreasing the reference count, because we are returning an owned
reference to the caller.
Finally, when the Stack is deallocated, every owned reference in the
collection of PyObject* items must be released; if this is not done then
the reference count of each item in the stack will never hit zero, resulting
in memory leaks. This is done by iterating over the length of the _data
array and calling Py_DECREF on each of the items. Only after this is done
can you call tp_free to delete the Stack object.
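The same ownership rules can be watched from Python with a weak
reference, which observes an object without owning it. This sketch
assumes CPython's reference counting; a list stands in for the Stack and
the Item class is hypothetical.

```python
import gc
import weakref


class Item:
    """Hypothetical payload; a plain object() cannot be weakly referenced."""


stack = []                 # stands in for the Stack's _data array
item = Item()
probe = weakref.ref(item)  # observes the item without owning it

stack.append(item)         # push: the container takes its own reference
del item                   # the caller drops its reference
alive_while_stacked = probe() is not None

del stack                  # deallocation releases every owned reference
gc.collect()               # not needed on CPython, harmless elsewhere
alive_after_dealloc = probe() is not None

print(alive_while_stacked, alive_after_dealloc)
```

If the container failed to release its references on deallocation, the
second check would still find the item alive, which is the leak
described above.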
Debugging C extensions
Similar to pdb, it’s possible to stop the execution of the Python
interpreter inside C extensions using a C debugger like gdb. gdb allows
you to find and fix errors in C code by providing you with information
about the state of the extension while it is running.
To best use this, it’s important to first compile extensions without
optimizations. Extensions are compiled with the flags the interpreter
itself was built with, which typically include an -O3 optimization flag;
this is good for production but can result in variables being optimized
out when you try to inspect them. To compile without optimizations, set
the CFLAGS environment variable to -O0 during the extension build step.
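To check what your own interpreter would use, the standard sysconfig
module exposes the build-time settings. The helper names below are
hypothetical, and the set of variables inspected is an assumption about
what is useful to look at.

```python
import os
import sysconfig


def default_compile_flags():
    """Return the compiler settings recorded when this interpreter was built."""
    names = ("CC", "OPT", "CFLAGS")
    # get_config_var returns None for names the platform does not define
    # (for example on Windows), so missing values are kept as None.
    return {name: sysconfig.get_config_var(name) for name in names}


def debug_build_env(base_env=None):
    """Copy an environment and request an unoptimized extension build."""
    env = dict(os.environ if base_env is None else base_env)
    env["CFLAGS"] = "-O0"
    return env


for name, value in default_compile_flags().items():
    print(f"{name} = {value}")
```

The dictionary returned by debug_build_env can be passed as the env
argument of subprocess.run when invoking the build step from a script.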
Once the C extension is compiled, use gdb to start the Python interpreter.
It should be noted that gdb requires a binary executable, so modules
(for example pytest) and scripts need to be invoked through the python
executable as either a module or a script.
Once in the gdb shell, we can do things such as set breakpoints, inspect
the call stack, observe variables, and run our program.
(gdb) b Stack_Push
Function "Stack_Push" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (Stack_Push) pending.
(gdb) run
Starting program: /usr/local/bin/python script.py
warning: Error disabling address space randomization: Operation not permitted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, Stack_Push (self=0x7f0d84d2bde0, item=0) at my_package/stack.c:13
13          size_t len = self->length + 1;
(gdb) b 17
Breakpoint 2 at 0x7f0d84adb25c: file my_package/stack.c, line 17.
(gdb) c
Continuing.

Breakpoint 2, Stack_Push (self=0x7f0d84d2bde0, item=0) at my_package/stack.c:17
17          self->length = len;
(gdb) p len
$1 = 1
(gdb) l
12      static PyObject* Stack_Push(Stack* self, PyObject* item) {
13          size_t len = self->length + 1;
14          self->_data = realloc(self->_data, len*sizeof(PyObject*));
15          Py_INCREF(item);
16          self->_data[self->length] = item;
17          self->length = len;
18          Py_RETURN_NONE;
19      }
20
21      static PyObject* Stack_Pop(Stack* self) {
(gdb) p *self
$2 = {ob_base = {ob_refcnt = 2, ob_type = 0x7f0d84ade140 <StackType>}, length = 0,
  _data = 0x563ee09d7000, push = 0x0, pop = 0x0}
(gdb) p *item
$3 = {ob_refcnt = 1000000155, ob_type = 0x7f0d8564b760 <PyLong_Type>}
(gdb) p (long)PyLong_AsLong(item)
$4 = 0
(gdb)