Python Handout cs6452
TRAINING Email: contact@dcsqrd.com
Python 3
(Intro excerpted from Python for Informatics)
1. Introduction
Programming is a very creative and rewarding activity. You can write programs for
many reasons, ranging from making your living to solving a difficult data analysis problem
to having fun to helping someone else solve a problem. We believe that everyone needs to
know how to program, and that once you know how to program you will figure out what
you want to do with your newfound skills.
There are many things that you might need to do that you could offload to a
computer. If you know the language, you can “tell” the computer to do tasks that are
repetitive. Interestingly, the kinds of things computers can do best are often the kinds of
things that we humans find boring and mind-numbing. For example, you can easily read
and understand the above paragraph, but if I ask you to tell me the word that is used most,
counting them is almost painful because it is not the kind of problem that human minds are
designed to solve.
This very fact that computers are good at things that humans are not is why you need
to become skilled at talking “computer language”. Once you learn this new language, you
can delegate mundane tasks to your computer, leaving more time for you to do the things
that you are uniquely suited for. You bring creativity, intuition, and inventiveness to this
partnership.
DCsqrd, 2016
For Georgia Tech CS6452 Prototyping Interactive Systems, please do not distribute
DCsqrd Website: http://dcsqrd.com
2. Installing Python
This section tells you where to download Python, which ships with its own IDE (IDLE):
https://www.python.org/downloads/
Download and install like any other program, no license required, completely free!
Remember, Python 3 and Python 2 are not compatible! Download 3 for this
workshop.
For a better IDE, download PyCharm Community from
https://www.jetbrains.com/pycharm/download/
During initial configuration, you will need to point PyCharm to where Python (select
Local) is installed. This is the Project SDK.
[Figure: characteristics of Python: indentation-based, space-insensitive, procedure-based programming, scripting language]
This is like the Linux terminal: you can “script” from here, mainly with one-line commands.
To write a “batch” of commands, press Ctrl+N to open a new Python file.
Type the program, save it with Ctrl+S, and press F5 to run it. Output will appear in the
console window.
To print “hello world!”, just type print("hello world!") at the chevron prompt (>>>)
or into a new file.
7.1 Primitives
Primitives, as the name suggests, are primitive values bound to a variable. For
example, a=15 is a primitive assignment, where the variable name is a and the primitive
value is 15.
True and False, which are Booleans. Note that they do not have quotes around
them, to differentiate them from strings.
None, which indicates the absence of any value. A related special value is
NotImplemented, which binary special methods such as __eq__() return to signal that they
do not support the given operand. NotImplemented is rarely needed in everyday code.
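As a minimal sketch of the primitives described above, the snippet below binds one value of each kind to a name and prints its type. The variable names are illustrative only and are not taken from the handout.

```python
# Binding primitive values to names; type() reports what each name refers to.
count = 15            # int
ratio = 2.5           # float
greeting = "True"     # str: the quotes make this text, not a Boolean
flag = True           # bool: no quotes
nothing = None        # absence of a value

for value in (count, ratio, greeting, flag, nothing):
    print(repr(value), type(value).__name__)
```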
7.2 Lists
Lists contain a sequence of values accessed by “index”. The index is a contiguous integer
sequence that starts with 0. Example: arr=[10,20,35] means arr[0]=10, arr[1]=20 and
arr[2]=35.
Note that, since Python is dynamically typed and a list can hold objects of any type,
it is possible to have mixed types of data inside a list, unlike arrays in C-like languages. For
instance, arr=[10,'a',"Hello!\n",22.7895,"Learn Python!"] is absolutely valid.
Elements are accessed in the same way as in a single-type list, and the type stored at an
index can change dynamically. Python calls this a “list” to differentiate it from arrays (in
other languages), which usually contain elements of similar data types. Note that lists can
contain lists as elements, like this:
['spam', 1, ['Brie', 'Roquefort', 'Pol le Veq'], [1, 2, 3]] # length=4
The in operator can be used to check if an element is in a given list. For example:
numbers = [3, 5, 7, 14] # avoid naming a variable list, which would hide the built-in type
3 in numbers # returns True
'adam' in numbers # returns False
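The list behaviour described above can be sketched as follows. The values are reused from the examples in this section, and the variable names are illustrative only.

```python
mixed = [10, 'a', "Hello!\n", 22.7895, "Learn Python!"]
nested = ['spam', 1, ['Brie', 'Roquefort', 'Pol le Veq'], [1, 2, 3]]

first = mixed[0]            # indexes start at 0
mixed[1] = 42               # the type stored at an index may change
inner = nested[2][1]        # index twice to reach into a nested list

print(first, mixed[1], inner, len(nested))
print(3 in nested[3], 'adam' in nested)
```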
7.3 Dictionaries
Dictionaries are like hash maps or associative arrays in other languages. For the
uninitiated, they are collections of key-value pairs. Dictionaries are similar to lists, except
that elements are accessed via “keys” instead of integer indexes. Example:
person = {"name":"John","age":21,"college":"NIE IT"}
Note that dictionaries are delimited by {} and elements are separated by , whereas
keys and values are separated by :
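A short sketch of common dictionary operations follows. The "city" key and its value are hypothetical additions for illustration; the other entries come from the example above.

```python
person = {"name": "John", "age": 21, "college": "NIE IT"}

name = person["name"]                  # look a value up by its key
person["age"] = 22                     # overwrite an existing entry
person["city"] = "Mysore"              # hypothetical new key, added on assignment
missing = person.get("email")          # get() returns None instead of raising KeyError

for key, value in person.items():      # iterate over key-value pairs
    print(key, "->", value)
```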
7.4 Tuples
Tuples are like lists, except they are immutable. Values can be of any type, and tuples
are comparable and (as long as every element is hashable) hashable. Just like lists, they are
indexed by integers. Tuples are usually enclosed in round brackets, although this is not
strictly necessary. Example: t = ('a','b','c','d','e'). If you need to create a tuple with
a single value, remember to put a comma after the element, like t = ('a',), to differentiate
it from a plain string assignment.
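The tuple properties above (immutability, the single-element comma, optional brackets, hashability) can be sketched like this. The dictionary of locations is hypothetical data, included only to show a tuple used as a key.

```python
t = ('a', 'b', 'c', 'd', 'e')
single = ('a',)          # the trailing comma makes this a tuple
not_a_tuple = ('a')      # without the comma this is just the string 'a'
x, y = 3, 4              # brackets are optional; this packs and unpacks a tuple

try:
    t[0] = 'z'           # tuples are immutable, so item assignment fails
except TypeError as err:
    error_message = str(err)

locations = {(12.3, 76.6): "Mysore"}   # hashable, so usable as a dictionary key (hypothetical data)
print(t[1], type(single).__name__, type(not_a_tuple).__name__, error_message)
```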
9. Operators
Operators work pretty much like C-based languages, but with a few differences:
== and != are used to compare values, but is and is not compare references (for example,
whether two variables refer to a single object). A special use is comparison to None and
NotImplemented, like if a is not None or if result is NotImplemented.
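The difference between comparing values and comparing references can be sketched as follows. The variable names are illustrative only.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a separate list with equal contents
c = a           # a second name for the same list object

print(a == b)   # values are equal
print(a is b)   # but they are two distinct objects
print(a is c)   # c refers to the very same object as a

result = None
if result is not None:
    print("got a result")
else:
    print("no result yet")
```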
Similarly, byte sequences have a b prefix. (Literal characters are ASCII only; each byte
holds a value from 0 to 255.) These are used in rare cases, and cannot be treated as or
concatenated with strings. Example:
print(b'\x41\x37') # prints b'A7'
There are three useful methods used in association with strings.
string.rstrip([chars]) removes any of the specified characters from the right end of the
string (lstrip() works on the left end and strip() on both ends), and removes whitespace
if no argument is specified.
string.split([delimiter]) splits a string into a list of the pieces separated by the
specified delimiter, and delimiter.join(list) joins all the elements of the list with the
specified delimiter string.
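A minimal sketch of these string methods, plus decoding a bytes value so it can be combined with text. The sample strings are made up for illustration.

```python
line = "  name,age,college\n"

cleaned = line.strip()              # whitespace removed from both ends
right_only = "xxhixx".rstrip("x")   # only the right end is trimmed
fields = cleaned.split(",")         # string -> list
rejoined = " | ".join(fields)       # list -> string

data = b'\x41\x37'                  # a bytes literal
text = data.decode("ascii")         # bytes must be decoded before mixing with str

print(cleaned, right_only, fields, rejoined, text)
```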
11.1 if
if cond:
    run_this
11.2 if-else
if cond:
    run_this
else:
    run_this
11.3 if-elif-else
if cond:
    run_this
elif cond:
    run_this
else:
    run_this
Note that the expression in cond should, by best practice, evaluate to a Boolean value
or be a Boolean value in the first place. In practice, any value can be tested: None,
False, numeric zero (0, 0.0) and empty containers ('', [], (), {}) evaluate to False, and
everything else evaluates to True.
Also, Python style recommends that you combine conditions with the logical operators
(and, or, not) and avoid nesting if-else conditionals.
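The sketch below shows a flat if-elif-else chain that uses a logical operator instead of nesting, followed by a quick check of the values that evaluate to False. The describe() function and its parameters are hypothetical.

```python
def describe(age, has_ticket):
    """Hypothetical helper: one flat condition instead of nested if-else blocks."""
    if age >= 18 and has_ticket:
        return "admit"
    elif age >= 18:
        return "buy a ticket"
    else:
        return "too young"

falsy_values = [0, 0.0, None, False, "", [], (), {}]
all_falsy = not any(falsy_values)       # every one of these evaluates to False

print(describe(20, True), describe(20, False), describe(12, True), all_falsy)
```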
The try block contains the “risky code”, where there is a possibility of having an
error. For example, you might be trying to open a file, and there is a possibility that the file
may not exist. In such a case, you put the code to open the file in the try block, specify
FileNotFoundError in the except statement, and write code to handle the error (say, tell
the user the file was not found). This mechanism ensures that your program “handles” and
recovers from an error that happens, and does not crash silently in the background.
The except block executes only if the specified type of error / exception happens;
otherwise, it is skipped. The finally block contains code that is always executed, regardless
of whether the error / exception occurred or not. For example, you will need to release the
connection to the database whether or not the insert operation fails.
If you are writing custom code, you can create your own exceptions for debugging,
by creating exception classes that inherit from the Exception class (defined by Python).
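A minimal sketch of try / except / finally and of a custom exception follows. The names ConfigError, read_first_line and check_port, and the file name, are hypothetical and chosen only for illustration.

```python
class ConfigError(Exception):
    """Hypothetical custom exception for this sketch."""

def read_first_line(path):
    handle = None
    try:
        handle = open(path)                 # risky: the file may not exist
        return handle.readline().rstrip("\n")
    except FileNotFoundError:
        return "file not found"             # handle the error and recover
    finally:
        if handle is not None:
            handle.close()                  # always runs, error or not

def check_port(port):
    if not 0 < port < 65536:
        raise ConfigError("port out of range: " + str(port))
    return port

print(read_first_line("no_such_file_here.txt"))
try:
    check_port(70000)
except ConfigError as err:
    print("caught:", err)
```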
12. Iteration
Iteration repeats a set of instructions based on a condition. The condition should be a
Boolean expression, or follow Boolean conventions as explained in primitives. Care should
be taken to avoid infinite loops. Iteration or looping falls into three categories in Python
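The handout's own list of loop categories is not reproduced here. As a sketch, assuming the common forms, Python's two loop statements (while and for) are typically used in these three ways:

```python
# while: repeat as long as a condition holds
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n -= 1            # without this update the loop would never end

# for over a range of integers
squares = []
for i in range(4):
    squares.append(i * i)

# for over the elements of a collection
total = 0
for price in [10, 20, 35]:
    total += price

print(countdown, squares, total)
```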
13. Functions
Functions provide a mechanism for modularity and reuse, by allowing code to “jump”
to a different point and return to the jump point after performing some computation.
Parameters and return values are optional. def is used to “define” a function. In addition to
those you define, Python defines many built-in functions like print(), min(), len()
and some library functions like random.random() and math.sin(). Python’s syntax for
defining a function is:
def func_name([params]):
    do_something_here
    do_more_here
    return [something]
You can specify default values for function parameters (including None), as shown:
def add(a=0,b=0):
    return a+b # returns 0 if add() is called, 6 if add(6) is called
Also, you can change the order of the arguments in a call, but you then need to specify
the names of the parameters along with the values. For example, if a function
def login(username, password, url) is defined, then it is legal to call it as
login(url='http://some_url/',username='user',password='r-crYp1:')
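Default values and keyword arguments can be tried out with the sketch below. The login() body is a hypothetical stand-in that only formats its arguments; it performs no real login.

```python
def add(a=0, b=0):
    return a + b

def login(username, password, url):
    """Hypothetical stand-in: only formats its arguments, no network access."""
    return username + "@" + url

print(add(), add(6), add(6, 4), add(b=5))
print(login(url="http://some_url/", username="user", password="secret"))
```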
There are two types of functions, depending on the return type. A third type is called
the lambda function, and is loosely akin to macros in C.
13.3 Lambdas
Lambdas are anonymous functions that do not follow the same syntax as a normal
function. In many places (in spirit of functional programming), lambdas can be passed
to other functions like any other parameter. Example:
f = lambda x: x**2 # print(f(8)) prints 64
foo = [2, 18, 9, 22, 17, 24, 8, 12, 27]
print(list(filter(lambda x: x%3==0, foo))) # prints [18, 9, 24, 12, 27]; in Python 3, filter() returns an iterator, so wrap it in list()
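As a further sketch of passing lambdas to other functions, the snippet below uses one as a sort key and one as an argument to a user-defined function. The employee data and apply_twice() are hypothetical.

```python
employees = [("Asha", 52000), ("Ravi", 48000), ("Meera", 61000)]  # hypothetical data

by_salary = sorted(employees, key=lambda e: e[1])       # lambda used as a sort key
names = list(map(lambda e: e[0], by_salary))
doubled = list(map(lambda x: x * 2, [1, 2, 3]))

def apply_twice(func, value):
    return func(func(value))      # functions are passed around like any other value

print(names, doubled, apply_twice(lambda x: x ** 2, 3))
```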
All the classic OOP constructs like encapsulation, polymorphism, inheritance and
abstraction are supported. Let us consider the program below:
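class Employee:
    """Common base class for all employees""" # class documentation
    empCount = 0 # class variable

    def __init__(self, name, salary): # constructor
        self.name = name
        self.salary = salary
        Employee.empCount += 1

    def displayEmployee(self):
        print("Name:",self.name,",Salary:",self.salary) # Python style print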
The keyword class tells Python that we are defining a class. The class name follows,
followed by the class block.
Triple quoted strings inside the class represent ‘Class Documentation’ that can be
accessed through the classname.__doc__ attribute. It is good practice to write
documentation for each class you define in Python.
Class members and member functions are placed inside the class block. The normal
variable naming conventions and rules apply.
self is the reference to the object, like the this keyword in C++ and Java. Python requires
you to explicitly mention this variable as the first parameter of each class function, unless it
is a static function (not associated with the class instance). Note that self is not a keyword;
you can use this or any other valid variable name in place of it.
The __init__() function is the constructor, and is called each time you instantiate an
object of that class. Similarly, there exists the __del__() function for the destructor, though
it is not used frequently (Python manages garbage collection internally).
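Names that begin with double underscores (__) and do not end with them are treated as
private class members: Python mangles such names to make them hard to reach from outside
the class. By convention, a single leading underscore (_) marks a protected member. All
other names are public by default.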
Inheritance works by inheriting from a parent class or a set of parent classes. The syntax
is:
class SubClassName (ParentClass1[, ParentClass2, ...]):
    """Optional class documentation string"""
    # class_suite
Method overriding is achieved by having a function with the same name in the child
class. The nearest function (child precedence) is called.
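Inheritance and method overriding can be sketched as follows. The Employee class here is a simplified, hypothetical version, and Manager, describe() and the sample data are illustrative names only.

```python
class Employee:
    """Base class (a simplified, hypothetical version for this sketch)."""
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def describe(self):
        return self.name + " earns " + str(self.salary)

class Manager(Employee):
    """Child class that overrides describe()."""
    def __init__(self, name, salary, reports):
        super().__init__(name, salary)     # reuse the parent constructor
        self.reports = reports

    def describe(self):                    # same name: the child version wins
        return super().describe() + " and manages " + str(self.reports) + " people"

staff = [Employee("Asha", 52000), Manager("Ravi", 70000, 4)]
for person in staff:
    print(person.describe())
```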
Operator overloading is achieved by defining __add__(), __sub__(),
__radd__(), __lt__(), __gt__(), __eq__(), __iadd__(), etc. These are special
methods that are executed in lieu of operators for classes. The names surrounded by __
are called magic methods or dunder methods, and __init__() is pronounced “dunder
init”.
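A minimal sketch of a class that defines a few of these magic methods follows. The Vector class is hypothetical and exists only to show +, == and < being routed to __add__(), __eq__() and __lt__().

```python
class Vector:
    """Hypothetical 2-D vector used to illustrate magic methods."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):              # called for v + w
        if not isinstance(other, Vector):
            return NotImplemented          # let Python try other options, then raise TypeError
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):               # called for v == w
        if not isinstance(other, Vector):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __lt__(self, other):               # called for v < w; compares squared lengths
        return self.x ** 2 + self.y ** 2 < other.x ** 2 + other.y ** 2

    def __repr__(self):
        return "Vector(" + str(self.x) + ", " + str(self.y) + ")"

print(Vector(1, 2) + Vector(3, 4), Vector(1, 2) == Vector(1, 2), Vector(1, 1) < Vector(2, 2))
```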
15. Imports
We grew out of single-file source code long ago. As programs get more complex,
for the sake of maintainability and logical organization, we need to “break” the code over
several files in order to separate functionality. In fact, software architecture defines
patterns like MVVM and MVC (Model-View-Controller) that suggest separation of concerns.
Also, libraries that you use in Python need a mechanism to be included into the
program, like the C #include<> or Java import.
Python uses the import statement to reference and use code from other files. Printing
an imported module will print its location, like so:
import math
print(math) # prints something like <module 'math' (built-in)> or <module 'math' from '<path to math>'>, depending on your installation
Finally, you can import either modules or libraries. To import a library, a plain
import library_name is used, whereas, to import a specific module, from
library_name import module_name or import library_name.module_name is
used. For the second approach, remember that the library name prefix is mandatory
whenever the module is called, because it is not imported to the program namespace.
Python is not pre-processed. If you want to, say, read constants from a file
constants.py, you can just place it in the same directory and just say import constants to
be able to access variables in that file.
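The import forms described above can be sketched with standard-library modules. The alias p is a hypothetical name chosen for illustration.

```python
import math                      # whole module; names need the math. prefix
from math import sqrt            # one name, usable without a prefix
import os.path                   # module inside a package; the full prefix is required
from os import path as p         # same module under a shorter, hypothetical alias

print(math.pi, sqrt(16), os.path.basename("/tmp/constants.py"), p.splitext("constants.py"))
```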
To create your own libraries, you need to create a Python package (basically, a
directory) with a file called __init__.py. This file decides which modules are exposed
as APIs, while keeping other modules internal. This is done by defining the __all__
list in it, containing the names of the exposed modules (this list controls what
from package import * brings in).
With this in mind, the most used persistence techniques are writing to flat files,
followed by databases. We will get to databases later, and look at writing to files and
reading from files. The program needs a “handle” to deal with files. Handle is basically a
variable that points to the file in order to perform operations on it.
handle = open(file_name,[mode],[buffering])
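The file name follows the standard relative / absolute path convention, and mode is read
(r, the default) or write (w). Other modes like binary (b), append (a) and update (+, which
opens the file for both reading and writing) can be combined with these. The buffering
argument is 0 to disable buffering (allowed only in binary mode) or 1 to select line
buffering (text mode); larger values set the buffer size in bytes.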
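A minimal sketch of writing, appending to and reading a file through a handle follows. The file lives in a temporary directory, and the name notes.txt is hypothetical.

```python
import os
import tempfile

file_name = os.path.join(tempfile.mkdtemp(), "notes.txt")   # hypothetical scratch file

handle = open(file_name, "w")        # w: write, creating or truncating the file
handle.write("first line\n")
handle.close()

with open(file_name, "a") as handle:  # a: append; with closes the file automatically
    handle.write("second line\n")

with open(file_name) as handle:       # default mode is read (r)
    lines = [line.rstrip("\n") for line in handle]

print(lines)
```

Using with is a common way to make sure the handle is closed even if an error occurs while the file is open.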
While we are discussing file operations, the os library supports file renames,
deletions, and working with directories.
os.rename(cur_file_name, new_file_name) # rename a file
os.mkdir(new_dir_name) # create new directory
os.chdir(dir_name) # change to specified directory
os.getcwd() # returns current working directory
os.rmdir('dirname') # delete a directory
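The os calls listed above can be tried safely inside a temporary directory, as in this sketch. The directory and file names are hypothetical.

```python
import os
import tempfile

start_dir = os.getcwd()
base = tempfile.mkdtemp()                 # scratch area so nothing real is touched
os.chdir(base)

os.mkdir("reports")                       # create new directory
open("draft.txt", "w").close()            # create an empty file to rename
os.rename("draft.txt", "final.txt")       # rename a file
listing = sorted(os.listdir("."))

os.remove("final.txt")                    # delete a file
os.rmdir("reports")                       # delete an (empty) directory
remaining = os.listdir(".")

os.chdir(start_dir)                       # go back to where we started
os.rmdir(base)
print(listing, remaining)
```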
Installed pip libraries can be recreated with their corresponding version numbers on
another computer / cloud system or in a virtual environment (venv) by using
pip install -r requirements.txt
We will be looking at MySQL in particular, for this demonstration. Install the access
library by opening the command line and using pip install pymysql. Once that is done,
the library can be imported into any Python program by using the import pymysql
statement at the beginning of your program. Note that you should have set up MySQL
previously, and know the database name, username and password (this typically constitutes
the “connection string”) to connect to the database.
Next, you have to use a ‘cursor’ to point to this connection and perform query
operations on the database: cur = con.cursor(), where con is the connection object.
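The connection-and-cursor pattern can be sketched without a MySQL server by swapping in sqlite3 from the standard library, which follows the same Python database API. This is a stand-in, not the pymysql code itself, and the table and row are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")        # in-memory database, nothing to install
cur = con.cursor()                       # the cursor runs queries on this connection

cur.execute("CREATE TABLE students (name TEXT, age INTEGER)")   # hypothetical table
cur.execute("INSERT INTO students VALUES (?, ?)", ("John", 21))
con.commit()

cur.execute("SELECT name, age FROM students")
rows = cur.fetchall()
print(rows)

cur.close()
con.close()
```

With pymysql, the connection would instead be created with pymysql.connect(), passing your own host, user, password and database values, and query placeholders are written as %s rather than ?.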
BeautifulSoup allows you to “read” a webpage by its tags, IDs, etc., and filter out
the information for further processing. Even though you can write a web parser by reading
page contents, BeautifulSoup handles malformed tags and generic attributes well.
from bs4 import BeautifulSoup will import the BeautifulSoup library. It makes
use of a “soup object” that lets you read components from the page: soup =
BeautifulSoup(html_doc, 'html.parser'). html_doc is the page markup (a string, or a
file handle / URL handle to the webpage), and the second argument is optional, indicating
which parser to use. lxml, lxml-xml and html5lib can also be used.
Tag contents can be read easily by using the dot operator with the soup object. For
example, soup.title gives the title tag of the page. You can also navigate as
soup.article.em, which searches for the tag em inside the first article element
only. Note that this finds only the first occurrence of the tag. For example, to find the first
link in a page, use soup.a. If you would instead like to find all the links on a page, use
soup.find_all('a'), which returns a list with all the links. To find a tag with a particular ID
(say, price), you can use soup.find(id="price").
If you want to print a particular attribute of the tag, you can use the .get method.
For example, printing all the links in a page, you can use:
for link in soup.find_all('a'):
    print(link.get('href'))
Attribute    Meaning
.string      the bit of string within the tag
.attrs       dictionary of attributes and values

Search argument                      Matches
(["p", "title"])                     all p and title tags
("a", attrs={"class": "danger"})     all a tags which have the class “danger”
Our point is, code that is not “Pythonic” tends to look odd or cumbersome to an
experienced Python programmer. It may also be harder to understand, because a longer
sequence of code is used to accomplish the desired effect instead of a common,
recognizable, brief idiom. Since the language tends to support the right idioms,
non-idiomatic code frequently also executes more slowly.
Remember that Python was built around a core of understandability and simplicity.
To achieve that, more efficient data structures and functions have been added over the
years that let you concentrate on the task with the least amount of coding effort, without
sacrificing understandability.
These idioms extend beyond programming constructs, and you should keep them in
mind when you are, say, writing a library. It is your duty to make the code as easy and
natural as possible for a Python programmer to pick up and perform a task. Also, as you
become more experienced in Python programming, you will realize the importance of the
possibilities that you can exploit by utilizing these idioms, like passing methods to functions.
Dedicate some time to tell your story the pythonic way!
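As one illustration of the point above, here is the same word count written in a non-idiomatic and an idiomatic way. The word list is hypothetical, and Counter is just one of several idiomatic options.

```python
from collections import Counter

words = ["spam", "eggs", "spam", "ham", "spam"]   # hypothetical data

# Non-idiomatic: manual index bookkeeping
counts_long = {}
i = 0
while i < len(words):
    if words[i] in counts_long:
        counts_long[words[i]] = counts_long[words[i]] + 1
    else:
        counts_long[words[i]] = 1
    i = i + 1

# Idiomatic: let the standard library do the counting
counts = Counter(words)
most_common_word = max(counts, key=counts.get)    # a method passed to a function

print(counts_long == dict(counts), most_common_word)
```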