python AQJ
Python Introduction:
Python is a popular programming language. It was created by Guido van Rossum, and
released in 1991. It was designed with an emphasis on code readability, and its syntax
allows programmers to express their concepts in fewer lines of code.
It is used for:
web development (server-side),
software development,
mathematics,
system scripting.
What can Python do?
Python can be used on a server to create web applications.
Python can be used alongside software to create workflows.
Python can connect to database systems. It can also read and modify files.
Python can be used to handle big data and perform complex mathematics.
Python can be used for rapid prototyping, or for production-ready software
development.
Why Python?
Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc).
Python has a simple syntax similar to the English language.
Python has syntax that allows developers to write programs with fewer lines than
some other programming languages.
Python runs on an interpreter system, meaning that code can be executed as soon as it
is written. This means that prototyping can be very quick.
Python can be treated in a procedural way, an object-oriented way or a functional
way.
Python Syntax:
Execute Python Syntax
As we learned in the previous page, Python syntax can be executed by writing directly in the
Command Line:
>>> print("Hello, World!")
Hello, World!
Or by creating a Python file, using the .py file extension, and running it in the
Command Line:
C:\Users\Your Name>python myfile.py
Python Indentation:
Indentation refers to the spaces at the beginning of a code line.
Where in other programming languages the indentation in code is for readability only, the
indentation in Python is very important.
Python uses indentation to indicate a block of code.
Example
if 5 > 2:
    print("Five is greater than two!")
Python will give you an error if you skip the indentation:
Example
Syntax Error:
if 5 > 2:
print("Five is greater than two!")
You have to use the same number of spaces in the same block of code, otherwise Python will
give you an error:
Example
Syntax Error:
if 5 > 2:
 print("Five is greater than two!")
        print("Five is greater than two!")
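The indentation rule can also be checked from code. The sketch below is only an illustration: it uses the built-in compile() function to parse a snippet without running it, and the helper name check_indentation is hypothetical, not part of Python.

```python
def check_indentation(source):
    """Return "ok" if the source parses, otherwise the name of the error raised."""
    try:
        compile(source, "<example>", "exec")  # parse only, nothing is executed
        return "ok"
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return type(err).__name__

# The block under the if statement is indented.
good = 'if 5 > 2:\n    print("Five is greater than two!")\n'
# The block under the if statement is not indented.
bad = 'if 5 > 2:\nprint("Five is greater than two!")\n'

print(check_indentation(good))
print(check_indentation(bad))
```

The first snippet parses cleanly, while the second is rejected with an IndentationError before any line of it runs.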
Python Comments:
Python supports comments for in-code documentation. Comments can
be used to explain Python code.
Comments can be used to make the code more readable.
Comments can be used to prevent execution when testing code.
Comments start with a #, and Python will treat the rest of the line as a comment:
Example
Comments in Python:
#This is a comment.
print("Hello, World!")
Comments can be placed at the end of a line, and Python will ignore the rest of the line:
Example
print("Hello, World!") #This is a comment
A comment does not have to be text that explains the code, it can also be used to prevent
Python from executing code:
Example
#print("Hello, World!")
print("Cheers, Mate!")
Multi Line Comments
Python does not really have a syntax for multi line comments.
To add a multiline comment you could insert a # for each line:
Example
#This is a comment
#written in
#more than just one line
print("Hello, World!")
Or, not quite as intended, you can use a multiline string.
Since Python will ignore string literals that are not assigned to a variable, you can add a
multiline string (triple quotes) in your code, and place your comment inside it:
Example
"""
This is a comment
written in
more than just one line
"""
print("Hello, World!")
Python Variables:
In Python, a variable is created when you assign a value to it. Variables are containers for
storing data values.
Variable Names:
A variable can have a short name (like x and y) or a more descriptive name (age, carname,
total_volume). Rules for Python variables:
A variable name must start with a letter or the underscore character
A variable name cannot start with a number
A variable name can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, and _)
Variable names are case-sensitive (age, Age and AGE are three different variables)
Example
Legal variable names:
myvar = "John"
my_var = "John"
_my_var = "John"
myVar = "John"
MYVAR = "John"
myvar2 = "John"
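The naming rules above can be tested with the string method isidentifier() and the standard keyword module. This is a minimal sketch; the helper name is_legal_name is hypothetical. Note that isidentifier() also accepts non-ASCII letters, so it is slightly more permissive than the A-Z rule stated above.

```python
import keyword

def is_legal_name(name):
    """Hypothetical helper: True if name can be used as a variable name."""
    # A legal name is a valid identifier that is not a reserved keyword.
    return name.isidentifier() and not keyword.iskeyword(name)

for candidate in ["myvar", "_my_var", "myvar2", "2myvar", "my-var", "my var", "for"]:
    print(candidate, is_legal_name(candidate))
```

Names that start with a digit, contain a hyphen or a space, or collide with a keyword such as for are reported as not legal.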
Creating Variables:
Python has no command for declaring a variable.
A variable is created the moment you first assign a value to it.
Example
x = 5
y = "John"
print(x)
print(y)
Variables do not need to be declared with any particular type, and can even change type after
they have been set.
Example
x = 4 # x is of type int
x = "Sally" # x is now of type str
print(x)
Casting
If you want to specify the data type of a variable, this can be done with casting.
Example
x = str(3) # x will be '3'
y = int(3) # y will be 3
z = float(3) # z will be 3.0
Example
x = "John"
# is the same as
x = 'John'
Case-Sensitive:
Variable names are case-sensitive.
Example
This will create two variables:
a = 4
A = "Sally"
#A will not overwrite a
Many Values to Multiple Variables:
Python allows you to assign values to multiple variables in one line:
Example
x, y, z = "Orange", "Banana", "Cherry"
print(x)
print(y)
print(z)
Global Variables:
Variables that are created outside of a function are known as global variables.
If you create a variable with the same name inside a function, this variable will be local, and
can only be used inside the function. The global variable with the same name will remain as it
was, global and with the original value.
Example
Create a variable inside a function, with the same name as the global variable
x = "awesome"
def myfunc():
    x = "fantastic"
    print("Python is " + x)
myfunc()
print("Python is " + x)
The global Keyword:
Normally, when you create a variable inside a function, that variable is local, and can only be
used inside that function.
To create a global variable inside a function, you can use the global keyword.
Example
If you use the global keyword, the variable belongs to the global scope:
def myfunc():
    global x
    x = "fantastic"
myfunc()
print("Python is " + x)
Python Data Types:
Built-in Data Types: In programming, data type is an important concept. Variables can
store data of different types, and different types can do different things.
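Python has the following data types built-in by default, in these categories:
Text Type:        str
Numeric Types:    int, float, complex
Sequence Types:   list, tuple, range
Mapping Type:     dict
Set Types:        set, frozenset
Boolean Type:     bool
Binary Types:     bytes, bytearray, memoryview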
You can get the data type of any object by using the type() function:
Example
Print the data type of the variable x:
x = 5
print(type(x))
Setting the Data Type:
In Python, the data type is set when you assign a value to a variable:
Example                                         Data Type
x = 20                                          int
x = 20.5                                        float
x = 1j                                          complex
x = range(6)                                    range
x = frozenset({"apple", "banana", "cherry"})    frozenset
x = True                                        bool
x = b"Hello"                                    bytes
x = bytearray(5)                                bytearray
x = memoryview(bytes(5))                        memoryview
Setting the Specific Data Type:
If you want to specify the data type, you can use the following constructor functions:
Example                                         Data Type
x = float(20.5)                                 float
x = complex(1j)                                 complex
x = list(("apple", "banana", "cherry"))         list
x = tuple(("apple", "banana", "cherry"))        tuple
x = range(6)                                    range
x = dict(name="John", age=36)                   dict
x = set(("apple", "banana", "cherry"))          set
x = frozenset(("apple", "banana", "cherry"))    frozenset
x = bool(5)                                     bool
x = bytes(5)                                    bytes
x = bytearray(5)                                bytearray
x = memoryview(bytes(5))                        memoryview
Python Numbers:
There are three numeric types in Python:
int
float
complex
Variables of numeric types are created when you assign a value to them:
Example
x = 1 # int
y = 2.8 # float
z = 1j # complex
To verify the type of any object in Python, use the type() function:
Example
print(type(x))
print(type(y))
print(type(z))
Int:
Int, or integer, is a whole number, positive or negative, without decimals, of unlimited length.
Example
Integers:
x = 1
y = 35656222554887711
z = -3255522
print(type(x))
print(type(y))
print(type(z))
Float:
Float, or "floating point number" is a number, positive or negative, containing one or
more decimals.
Example
Floats:
x = 1.10
y = 1.0
z = -35.59
print(type(x))
print(type(y))
print(type(z))
Float can also be scientific numbers with an "e" to indicate the power of 10.
Example
Floats:
x = 35e3
y = 12E4
z = -87.7e100
print(type(x))
print(type(y))
print(type(z))
Complex:
Complex numbers are written with a "j" as the imaginary part:
Example
Complex:
x = 3+5j
y = 5j
z = -5j
print(type(x))
print(type(y))
print(type(z))
Type Conversion:
You can convert from one type to another with the int(), float(), and complex() functions:
Example
Convert from one type to another:
x = 1 # int
y = 2.8 # float
z = 1j # complex
#convert from int to float:
a = float(x)
#convert from float to int:
b = int(y)
#convert from int to complex:
c = complex(x)
print(a)
print(b)
print(c)
print(type(a))
print(type(b))
print(type(c))
Note: You cannot convert complex numbers into another number type.
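A brief sketch of what the note above means in practice: passing a complex number to int() raises a TypeError. The variable names are illustrative only. The real and imaginary parts can still be read separately through the real and imag attributes.

```python
z = 1j
conversion_failed = False

try:
    int(z)  # a complex number cannot be converted to int
except TypeError as err:
    conversion_failed = True
    print("Conversion failed:", err)

# The two parts of a complex number are ordinary floats.
w = 3 + 5j
print(w.real)  # 3.0
print(w.imag)  # 5.0
```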
Random Number:
Python does not have a built-in random() function to make a random number, but it has a
built-in module called random that can be used to make random numbers:
Example
Import the random module, and display a random number between 1 and 9:
import random
print(random.randrange(1, 10))
Python Casting:
Specify a Variable Type:
There may be times when you want to specify a type for a variable. This can be done with
casting. Python is an object-oriented language, and as such it uses classes to define data
types, including its primitive types. Casting in Python is therefore done using constructor
functions:
int() - constructs an integer number from an integer literal, a float literal (by removing
all decimals), or a string literal (providing the string represents a whole number)
float() - constructs a float number from an integer literal, a float literal or a string
literal (providing the string represents a float or an integer)
str() - constructs a string from a wide variety of data types, including strings, integer
literals and float literals
Example
Integers:
x = int(1) # x will be 1
y = int(2.8) # y will be 2
z = int("3") # z will be 3
Example:
Floats:
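A short sketch of float() and str() casting that follows the rules listed above. The variable names and values are illustrative only.

```python
x = float(1)      # x will be 1.0
y = float(2.8)    # y will be 2.8
z = float("3")    # z will be 3.0
w = float("4.2")  # w will be 4.2

s1 = str("s1")    # s1 will be 's1'
s2 = str(2)       # s2 will be '2'
s3 = str(3.0)     # s3 will be '3.0'

print(x, y, z, w)
print(s1, s2, s3)
```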
Python Strings:
Strings: Strings in python are surrounded by either single quotation marks, or double
quotation marks.
'hello' is the same as "hello".
You can display a string literal with the print() function:
Example
print("Hello")
print('Hello')
Assign String to a Variable:
Assigning a string to a variable is done with the variable name followed by an equal sign and
the string:
Example
a = "Hello"
print(a)
Multiline Strings:
You can assign a multiline string to a variable by using three quotes:
Example
You can use three double quotes:
a = """Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."""
print(a)
Example
a = '''Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.'''
print(a)
Note: in the result, the line breaks are inserted at the same position as in the code.
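Strings are Arrays:
Strings in Python are sequences of Unicode characters. Python has no separate character
type; a single character is simply a string of length 1. Square brackets can be used to access
individual characters:
Example
Get the character at position 1 (remember that the first character has position 0):
a = "Hello, World!"
print(a[1])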
Looping Through a String:
Since strings are arrays, we can loop through the characters in a string, with a for loop.
Example
Loop through the letters in the word "banana":
for x in "banana":
    print(x)
String Length:
To get the length of a string, use the len() function.
Example
The len() function returns the length of a string:
a = "Hello, World!"
print(len(a))
Check String:
To check if a certain phrase or character is present in a string, we can use the keyword in.
Example
Check if "free" is present in the following text:
txt = "The best things in life are free!"
print("free" in txt)
Use it in an if statement:
Example
Print only if "free" is present:
txt = "The best things in life are free!"
if "free" in txt:
    print("Yes, 'free' is present.")
Check if NOT:
To check if a certain phrase or character is NOT present in a string, we can use the
keyword not in.
Example
Check if "expensive" is NOT present in the following text:
txt = "The best things in life are free!"
print("expensive" not in txt)
Use it in an if statement:
Example
print only if "expensive" is NOT present:
txt = "The best things in life are free!"
if "expensive" not in txt:
    print("No, 'expensive' is NOT present.")
Python - Slicing Strings:
You can return a range of characters by using the slice syntax.
Specify the start index and the end index, separated by a colon, to return a part of the string.
Example
Get the characters from position 2 to position 5 (not included):
b = "Hello, World!"
print(b[2:5])
Slice From the Start:
By leaving out the start index, the range will start at the first character:
Example
Get the characters from the start to position 5 (not included):
b = "Hello, World!"
print(b[:5])
Slice To the End:
By leaving out the end index, the range will go to the end:
Example
Get the characters from position 2, and all the way to the end:
b = "Hello, World!"
print(b[2:])
Negative Indexing:
Use negative indexes to start the slice from the end of the string:
Example
Get the characters:
From: "o" in "World!" (position -5)
To, but not included: "d" in "World!" (position -2):
b = "Hello, World!"
print(b[-5:-2])
Python - Modify Strings
Upper Case:
Example
The upper() method returns the string in upper case:
a = "Hello, World!"
print(a.upper())
Lower Case:
Example
The lower() method returns the string in lower case:
a = "Hello, World!"
print(a.lower())
Remove Whitespace:
Whitespace is the space before and/or after the actual text, and very often you want to remove
this space.
Example
The strip() method removes any whitespace from the beginning or the end:
a = " Hello, World! "
print(a.strip()) # returns "Hello, World!"
Replace String:
Example
The replace() method replaces a string with another string:
a = "Hello, World!"
print(a.replace("H", "J"))
Split String:
The split() method returns a list where the text between the specified separator becomes the
list items.
Example
The split() method splits the string into substrings if it finds instances of the separator:
a = "Hello, World!"
print(a.split(",")) # returns ['Hello', ' World!']
Python String Format:
As we learned in the Python Variables chapter, we cannot combine strings and numbers like
this:
Example
age = 36
txt = "My name is John, I am " + age
print(txt)
But we can combine strings and numbers by using the format() method!
The format() method takes the passed arguments, formats them, and places them in the string
where the placeholders {} are:
Example
Use the format() method to insert numbers into strings:
age = 36
txt = "My name is John, and I am {}"
print(txt.format(age))
The format() method takes an unlimited number of arguments, which are placed into the respective
placeholders:
Example
quantity = 3
itemno = 567
price = 49.95
myorder = "I want {} pieces of item {} for {} dollars."
print(myorder.format(quantity, itemno, price))
You can use index numbers {0} to be sure the arguments are placed in the correct
placeholders:
Example
quantity = 3
itemno = 567
price = 49.95
myorder = "I want to pay {2} dollars for {0} pieces of item {1}."
print(myorder.format(quantity, itemno, price))
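Besides index numbers, format() also accepts named placeholders filled from keyword arguments, and a placeholder may carry a format specification such as :.2f for two decimals. The sketch below reuses the values from the examples above; the name text is illustrative only.

```python
quantity = 3
itemno = 567
price = 49.95

# Named placeholders are filled from keyword arguments of the same name.
myorder = "I want {quantity} pieces of item {itemno} for {price:.2f} dollars."
text = myorder.format(quantity=quantity, itemno=itemno, price=price)
print(text)
```

Named placeholders make the template readable on its own, and the order of the arguments no longer matters.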
String Methods:
Python has a set of built-in methods that you can use on strings.
Note: All string methods return new values. They do not change the original string.
Method          Description
endswith()      Returns True if the string ends with the specified value
expandtabs()    Sets the tab size of the string
find()          Searches the string for a specified value and returns the position where it was found, or -1 if it was not found
index()         Searches the string for a specified value and returns the position where it was found; raises an error if it was not found
isalpha()       Returns True if all characters in the string are in the alphabet
islower()       Returns True if all characters in the string are lower case
isupper()       Returns True if all characters in the string are upper case
partition()     Returns a tuple where the string is parted into three parts
replace()       Returns a string where a specified value is replaced with a specified value
rfind()         Searches the string for a specified value and returns the last position where it was found, or -1 if it was not found
rindex()        Searches the string for a specified value and returns the last position where it was found; raises an error if it was not found
rpartition()    Returns a tuple where the string is parted into three parts, splitting at the last occurrence
rsplit()        Splits the string at the specified separator, starting from the right, and returns a list
split()         Splits the string at the specified separator, and returns a list
startswith()    Returns True if the string starts with the specified value
swapcase()      Swaps cases, lower case becomes upper case and vice versa
zfill()         Fills the string with a specified number of 0 values at the beginning
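The find() and index() methods look alike in the table but differ when the value is missing. The following sketch illustrates that difference and the note that string methods leave the original string unchanged. The variable names are illustrative only.

```python
txt = "Hello, World!"

print(txt.find("World"))   # 7
print(txt.find("Python"))  # -1, the value was not found

index_failed = False
try:
    txt.index("Python")    # index() raises ValueError instead of returning -1
except ValueError:
    index_failed = True
    print("index() raised ValueError")

shouted = txt.upper()      # returns a new string
print(shouted)
print(txt)                 # the original string is unchanged
```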
Python Booleans:
Booleans represent one of two values: True or False.
Boolean Values:
In programming you often need to know if an expression is True or False.
You can evaluate any expression in Python, and get one of two answers, True or False.
When you compare two values, the expression is evaluated and Python returns the Boolean
answer:
Example
print(10 > 9)
print(10 == 9)
print(10 < 9)
The bool() function allows you to evaluate any value and gives you True or False in return:
Example
Evaluate a string and a number:
print(bool("Hello"))
print(bool(15))
Example
Evaluate two variables:
x = "Hello"
y = 15
print(bool(x))
print(bool(y))
Most Values are True
Almost any value is evaluated to True if it has some sort of content.
Any string is True, except empty strings.
Any number is True, except 0.
Any list, tuple, set, and dictionary are True, except empty ones.
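Example
The following will return True:
bool("abc")
bool(123)
bool(["apple", "cherry", "banana"])
Some Values are False
Not many values evaluate to False: only empty values such as (), [], {}, and "", the number 0,
and the value None. The value False itself also evaluates to False.
Example
The following will return False:
bool(False)
bool(None)
bool(0)
bool("")
bool(())
bool([])
bool({})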
One more value, or object in this case, evaluates to False, and that is if you have an object
that is made from a class with a __len__ function that returns 0 or False:
Example
class myclass():
    def __len__(self):
        return 0
myobj = myclass()
print(bool(myobj))
Example
Print "YES!" if the function returns True, otherwise print "NO!":
def myFunction():
    return True
if myFunction():
    print("YES!")
else:
    print("NO!")
Python also has many built-in functions that return a boolean value, like
the isinstance() function, which can be used to determine if an object is of a certain data type:
Example
Check if an object is an integer or not:
x = 200
print(isinstance(x, int))
Python Operators:
Operators are used to perform operations on variables and values.
In the example below, we use the + operator to add together two values:
Example
print(10 + 5)
Python divides the operators in the following groups:
Arithmetic operators
Assignment operators
Comparison operators
Logical operators
Identity operators
Membership operators
Bitwise operators
Arithmetic operators are used with numeric values to perform common mathematical
operations:
Operator    Name              Example
+           Addition          x + y
-           Subtraction       x - y
*           Multiplication    x * y
/           Division          x / y
%           Modulus           x % y
**          Exponentiation    x ** y
//          Floor division    x // y
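A quick sketch of the arithmetic operators from the table, with illustrative values for x and y. It shows in particular how / differs from //.

```python
x = 7
y = 2

print(x + y)    # 9
print(x - y)    # 5
print(x * y)    # 14
print(x / y)    # 3.5, division always returns a float
print(x % y)    # 1, the remainder of the division
print(x ** y)   # 49
print(x // y)   # 3, the result is rounded down
print(-x // y)  # -4, floor division rounds toward negative infinity
```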
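Assignment operators are used to assign values to variables:
Operator    Example    Same As
=           x = 5      x = 5
+=          x += 3     x = x + 3
-=          x -= 3     x = x - 3
*=          x *= 3     x = x * 3
/=          x /= 3     x = x / 3
%=          x %= 3     x = x % 3
//=         x //= 3    x = x // 3
**=         x **= 3    x = x ** 3
^=          x ^= 3     x = x ^ 3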
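Comparison operators are used to compare two values:
Operator    Name         Example
==          Equal        x == y
!=          Not equal    x != y
Logical operators are used to combine conditional statements:
Operator    Description                                                 Example
and         Returns True if both statements are true                    x < 5 and x < 10
not         Reverses the result, returns False if the result is true    not(x < 5 and x < 10)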
Identity operators are used to compare the objects, not if they are equal, but if they are
actually the same object, with the same memory location:
Operator    Description                                               Example
is not      Returns True if both variables are not the same object    x is not y
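The difference between equality and identity can be sketched as follows. The variable names are illustrative only. Two lists with the same content are equal, but they are only identical when both names refer to one and the same object.

```python
x = ["apple", "banana"]
y = ["apple", "banana"]  # same content, but a separate object
z = x                    # a second name for the same object as x

print(x == y)      # True, the contents are equal
print(x is y)      # False, they are two different objects
print(x is z)      # True, both names refer to the same object
print(x is not y)  # True
```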
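Python Membership Operators:
Membership operators are used to test if a sequence is present in an object:
Operator    Description                                                                         Example
not in      Returns True if a sequence with the specified value is not present in the object    x not in y
Bitwise operators are used to compare (binary) numbers:
Operator    Name                    Description
<<          Zero fill left shift    Shift left by pushing zeros in from the right and let the leftmost bits fall off
>>          Signed right shift      Shift right by pushing copies of the leftmost bit in from the left, and let the rightmost bits fall off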
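A small sketch of the two shift operators, using illustrative values. One caveat: Python integers have unlimited length, so with << no bits actually fall off on the left and the number simply grows.

```python
x = 5               # binary 101

print(x << 2)       # 20, two zeros are pushed in from the right
print(bin(x << 2))  # 0b10100
print(x >> 1)       # 2, the rightmost bit falls off
print(-8 >> 1)      # -4, the sign is preserved when shifting right
```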
Python Lists:
List:
Lists are used to store multiple items in a single variable.
Lists are one of 4 built-in data types in Python used to store collections of data, the other 3
are Tuple, Set, and Dictionary, all with different qualities and usage.
Lists are created using square brackets:
Example
Create a List:
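A minimal sketch of creating a list with square brackets. The item values are illustrative, chosen to match the later examples in this section.

```python
thislist = ["apple", "banana", "cherry"]
print(thislist)

# Items keep the order in which they were written.
print(thislist[0])
```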
Note: There are some list methods that will change the order, but in general: the order of the
items will not change.
Changeable:
The list is changeable, meaning that we can change, add, and remove items in a list after it
has been created.
Allow Duplicates:
Since lists are indexed, lists can have items with the same value:
Example
Lists allow duplicate values:
thislist = ["apple", "banana", "cherry", "apple", "cherry"]
print(thislist)
List Length:
To determine how many items a list has, use the len() function:
Example
Print the number of items in the list:
thislist = ["apple", "banana", "cherry"]
print(len(thislist))
List Items - Data Types:
List items can be of any data type:
Example
String, int and boolean data types:
list1 = ["apple", "banana", "cherry"]
list2 = [1, 5, 7, 9, 3]
list3 = [True, False, False]
Example
A list with strings, integers and boolean values:
list1 = ["abc", 34, True, 40, "male"]
type()
From Python's perspective, lists are defined as objects with the data type 'list':
<class 'list'>
Example
What is the data type of a list?
mylist = ["apple", "banana", "cherry"]
print(type(mylist))
The list() Constructor:
It is also possible to use the list() constructor when creating a new list.
Example
Using the list() constructor to make a List:
thislist = list(("apple", "banana", "cherry")) # note the double round-brackets
print(thislist)
Python Collections (Arrays):
There are four collection data types in the Python programming language:
List is a collection which is ordered and changeable. Allows duplicate members.
Tuple is a collection which is ordered and unchangeable. Allows duplicate members.
Set is a collection which is unordered, unchangeable*, and unindexed. No duplicate
members.
Dictionary is a collection which is ordered** and changeable. No duplicate members.
Access Items
List items are indexed and you can access them by referring to the index number:
Example
Print the second item of the list:
thislist = ["apple", "banana", "cherry"]
print(thislist[1])
Negative Indexing:
Negative indexing means start from the end
-1 refers to the last item, -2 refers to the second last item etc.
Example
Print the last item of the list:
thislist = ["apple", "banana", "cherry"]
print(thislist[-1])
Range of Indexes:
You can specify a range of indexes by specifying where to start and where to end the range.
When specifying a range, the return value will be a new list with the specified items.
Example
Return the third, fourth, and fifth item:
thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]
print(thislist[2:5])
Append Items:
To add an item to the end of the list, use the append() method:
Example
Using the append() method to append an item:
thislist = ["apple", "banana", "cherry"]
thislist.append("orange")
print(thislist)
Insert Items:
To insert a list item at a specified index, use the insert() method.
The insert() method inserts an item at the specified index:
Example
Insert an item as the second position:
thislist = ["apple", "banana", "cherry"]
thislist.insert(1, "orange")
print(thislist)
Extend List:
To append elements from another list to the current list, use the extend() method.
Example
Add the elements of tropical to thislist:
thislist = ["apple", "banana", "cherry"]
tropical = ["mango", "pineapple", "papaya"]
thislist.extend(tropical)
print(thislist)
Remove Specified Item:
The remove() method removes the specified item.
Example
Remove "banana":
thislist = ["apple", "banana", "cherry"]
thislist.remove("banana")
print(thislist)
Remove Specified Index:
The pop() method removes the specified index.
Example
Remove the second item:
thislist = ["apple", "banana", "cherry"]
thislist.pop(1)
print(thislist)
If you do not specify the index, the pop() method removes the last item.
Example
Remove the last item:
thislist = ["apple", "banana", "cherry"]
thislist.pop()
print(thislist)
The del keyword also removes the specified index:
Example
Remove the first item:
thislist = ["apple", "banana", "cherry"]
del thislist[0]
print(thislist)
The del keyword can also delete the list completely.
Example
Delete the entire list:
thislist = ["apple", "banana", "cherry"]
del thislist
Clear the List:
The clear() method empties the list.
The list still remains, but it has no content.
Example
Clear the list content:
thislist = ["apple", "banana", "cherry"]
thislist.clear()
print(thislist)
Python - Loop Lists:
Loop Through a List
You can loop through the list items by using a for loop:
Example
Print all items in the list, one by one:
thislist = ["apple", "banana", "cherry"]
for x in thislist:
    print(x)
Loop Through the Index Numbers:
You can also loop through the list items by referring to their index number.
Use the range() and len() functions to create a suitable iterable.
Example
Print all items by referring to their index number:
thislist = ["apple", "banana", "cherry"]
for i in range(len(thislist)):
    print(thislist[i])
What if you want to reverse the order of a list, regardless of the alphabet?
The reverse() method reverses the current sorting order of the elements.
Example
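A minimal sketch of reverse(), using an illustrative list. The method changes the list in place and returns None. The built-in reversed() function can be used when the original order must be kept.

```python
thislist = ["banana", "Orange", "Kiwi", "cherry"]

# reversed() returns an iterator and leaves the list untouched.
backwards = list(reversed(thislist))

# reverse() changes the list itself.
thislist.reverse()

print(thislist)   # ['cherry', 'Kiwi', 'Orange', 'banana']
print(backwards)  # ['cherry', 'Kiwi', 'Orange', 'banana']
```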
Copy a List
You cannot copy a list simply by typing list2 = list1, because: list2 will only be
a reference to list1, and changes made in list1 will automatically also be made in list2.
There are ways to make a copy, one way is to use the built-in List method copy().
Example
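The difference between a reference and a copy can be sketched like this. The names alias, copied, and also_copied are illustrative only.

```python
list1 = ["apple", "banana", "cherry"]

alias = list1              # a second name for the same list object
copied = list1.copy()      # a new list with the same items
also_copied = list(list1)  # the list() constructor also makes a copy

list1.append("orange")

print(alias)        # ['apple', 'banana', 'cherry', 'orange']
print(copied)       # ['apple', 'banana', 'cherry']
print(also_copied)  # ['apple', 'banana', 'cherry']
```

Only alias sees the appended item, because it refers to the same object as list1.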
Example
There are several ways to join, or concatenate, two or more lists in Python.
Example
Another way to join two lists is by appending all the items from list2 into list1, one by one:
Example
Or you can use the extend() method, whose purpose is to add elements from one list to
another list:
Example
list1.extend(list2)
print(list1)
List Methods
Python has a set of built-in methods that you can use on lists.
Method Description
extend() Add the elements of a list (or any iterable), to the end of the current list
index() Returns the index of the first element with the specified value
Python Tuples:
Tuples are unchangeable, meaning that we cannot change, add or remove items after the tuple
has been created.
Allow Duplicates
Since tuples are indexed, they can have items with the same value:
Example
You can access tuple items by referring to the index number, inside square brackets:
Example
Negative Indexing
-1 refers to the last item, -2 refers to the second last item etc.
Example
Range of Indexes
You can specify a range of indexes by specifying where to start and where to end the range.
When specifying a range, the return value will be a new tuple with the specified items.
Example
Note: The search will start at index 2 (included) and end at index 5 (not included).
By leaving out the start value, the range will start at the first item:
Example
This example returns the items from the beginning to, but NOT included, "kiwi":
By leaving out the end value, the range will go on to the end of the tuple:
Example
This example returns the items from "cherry" and to the end:
Specify negative indexes if you want to start the search from the end of the tuple:
Example
This example returns the items from index -4 (included) to index -1 (excluded)
Example
Once a tuple is created, you cannot change, add, or remove its items. Tuples are unchangeable,
or immutable, as it is also called.
But there is a workaround. You can convert the tuple into a list, change the list, and convert
the list back into a tuple.
Example
print(x)
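The workaround described above can be sketched as follows. The values are illustrative only. A direct assignment to a tuple item fails with a TypeError, so the tuple is converted to a list, changed, and converted back.

```python
x = ("apple", "banana", "cherry")

assignment_failed = False
try:
    x[1] = "kiwi"  # tuples do not support item assignment
except TypeError:
    assignment_failed = True

y = list(x)    # convert the tuple into a list
y[1] = "kiwi"  # change the list
x = tuple(y)   # convert the list back into a tuple

print(x)  # ('apple', 'kiwi', 'cherry')
```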
Add Items
Since tuples are immutable, they do not have a built-in append() method, but there are other
ways to add items to a tuple.
1. Convert into a list: Just like the workaround for changing a tuple, you can convert it into a
list, add your item(s), and convert it back into a tuple.
Example
Convert the tuple into a list, add "orange", and convert it back into a tuple:
2. Add tuple to a tuple. You are allowed to add tuples to tuples, so if you want to add one
item, (or many), create a new tuple with the item(s), and add it to the existing tuple:
Example
Create a new tuple with the value "orange", and add that tuple:
When we create a tuple, we normally assign values to it. This is called "packing" a tuple:
Example
Packing a tuple:
But, in Python, we are also allowed to extract the values back into variables. This is called
"unpacking":
Example
Unpacking a tuple:
print(green)
print(yellow)
print(red)
Note: The number of variables must match the number of values in the tuple, if not, you must
use an asterisk to collect the remaining values as a list.
Using Asterisk*
If the number of variables is less than the number of values, you can add an * to the variable
name and the values will be assigned to the variable as a list:
Example
print(green)
print(yellow)
print(red)
If the asterisk is added to another variable name than the last, Python will assign values to the
variable until the number of values left matches the number of variables left.
Example
print(green)
print(tropic)
print(red)
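A sketch of unpacking with an asterisk, using an illustrative tuple named fruits. The starred variable always receives a list, wherever it appears in the target.

```python
fruits = ("apple", "mango", "papaya", "pineapple", "cherry")

# The starred name in the middle collects whatever is left over.
(green, *tropic, red) = fruits
print(green)   # apple
print(tropic)  # ['mango', 'papaya', 'pineapple']
print(red)     # cherry

# The starred name at the end collects the remaining values.
(first, second, *rest) = fruits
print(rest)    # ['papaya', 'pineapple', 'cherry']
```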
You can loop through the tuple items by using a for loop.
Example
You can also loop through the tuple items by referring to their index number.
Example
You can loop through the tuple items by using a while loop.
Use the len() function to determine the length of the tuple, then start at 0 and loop your way
through the tuple items by referring to their indexes.
Example
Print all items, using a while loop to go through all the index numbers:
Example
Multiply Tuples
If you want to multiply the content of a tuple a given number of times, you can use
the * operator:
Example
print(mytuple)
Tuple Methods
Python has two built-in methods that you can use on tuples.
Method     Description
count()    Returns the number of times a specified value occurs in a tuple
index()    Searches the tuple for a specified value and returns the position of where it was found
Python Sets:
Set is one of 4 built-in data types in Python used to store collections of data, the other 3
are List, Tuple, and Dictionary, all with different qualities and usage.
* Note: Set items are unchangeable, but you can remove items and add new items.
Example
Create a Set:
Note: Sets are unordered, so you cannot be sure in which order the items will appear.
Set Items
Set items are unordered, unchangeable, and do not allow duplicate values.
Unordered
Unordered means that the items in a set do not have a defined order.
Set items can appear in a different order every time you use them, and cannot be referred to
by index or key.
Unchangeable
Set items are unchangeable, meaning that we cannot change the items after the set has been
created.
Once a set is created, you cannot change its items, but you can remove items and add new
items.
Example
print(thisset)
To determine how many items a set has, use the len() function.
Example
print(len(thisset))
Example
String, int and boolean data types:
Example
type()
From Python's perspective, sets are defined as objects with the data type 'set':
<class 'set'>
Example
Example
Access Items
You cannot access items in a set by referring to an index or a key, but you can loop through
the set items using a for loop, or ask if a specified value is present in a set, by using the
in keyword.
Example
Loop through the set, and print the values:
for x in thisset:
    print(x)
Example
print("banana" in thisset)
Change Items
Once a set is created, you cannot change its items, but you can add new items.
Add Items
To add one item to a set, use the add() method.
Example
thisset.add("orange")
print(thisset)
Add Sets
To add items from another set into the current set, use the update() method.
Example
thisset.update(tropical)
print(thisset)
The object in the update() method does not have to be a set, it can be any iterable object
(tuples, lists, dictionaries etc.).
Example
Add elements of a list to a set:
thisset.update(mylist)
print(thisset)
Remove Item
Example
thisset.remove("banana")
print(thisset)
Note: If the item to remove does not exist, remove() will raise an error.
Example
thisset.discard("banana")
print(thisset)
Note: If the item to remove does not exist, discard() will NOT raise an error.
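The difference between remove() and discard() can be sketched like this, using an illustrative set and a value that is not in it.

```python
thisset = {"apple", "banana", "cherry"}

# discard() silently does nothing when the item is missing.
thisset.discard("kiwi")

# remove() raises KeyError when the item is missing.
remove_failed = False
try:
    thisset.remove("kiwi")
except KeyError:
    remove_failed = True
    print("remove() raised KeyError")

print(len(thisset))  # 3, the set is unchanged
```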
You can also use the pop() method to remove an item, but this method removes an arbitrary
item. Because sets are unordered, you cannot know in advance which item gets removed.
Example
x = thisset.pop()
print(x)
print(thisset)
Note: Sets are unordered, so when using the pop() method, you do not know which item
gets removed.
Example
thisset.clear()
print(thisset)
Example
del thisset
print(thisset) #this will raise an error because the set no longer exists.
Loop Items
You can loop through the set items by using a for loop:
Example
for x in thisset:
    print(x)
Join Sets
Join Two Sets
You can use the union() method that returns a new set containing all items from both sets, or
the update() method that inserts all the items from one set into another:
Example
The union() method returns a new set with all items from both sets:
set3 = set1.union(set2)
print(set3)
Example
set1.update(set2)
print(set1)
Note: Both union() and update() will exclude any duplicate items.
The intersection_update() method will keep only the items that are present in both sets.
Example
x.intersection_update(y)
print(x)
The intersection() method will return a new set, that only contains the items that are present
in both sets.
Example
Return a set that contains the items that exist in both set x, and set y:
z = x.intersection(y)
print(z)
The symmetric_difference_update() method will keep only the elements that are NOT present
in both sets.
Example
x.symmetric_difference_update(y)
print(x)
The symmetric_difference() method will return a new set, that contains only the elements that
are NOT present in both sets.
Example
Return a set that contains all items from both sets, except items that are present in both:
z = x.symmetric_difference(y)
print(z)
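The set definitions for the join examples above are not shown in these notes. The following self-contained sketch uses the names x and y from those examples, but the values are assumptions chosen only for illustration.

```python
# Illustrative sets; the values are assumptions for this sketch.
x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

all_items = x.union(y)                  # every item from both sets
common = x.intersection(y)              # items present in both sets
not_shared = x.symmetric_difference(y)  # items present in exactly one set

print(common)           # {'apple'}
print(len(all_items))   # 5
print(len(not_shared))  # 4
```

These three methods return new sets and leave x and y unchanged; the matching *_update() methods modify x in place instead.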
Set Methods:
Python has a set of built-in methods that you can use on sets.
Method                  Description
difference_update()     Removes the items in this set that are also included in another, specified set
intersection_update()   Removes the items in this set that are not present in other, specified set(s)
update()                Updates the set with the union of this set and others
Python Dictionaries:
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
Dictionary
Dictionaries are used to store data values in key:value pairs. A dictionary is a collection
which is ordered*, changeable and does not allow duplicate keys.
As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries
are unordered.
Dictionaries are written with curly brackets, and have keys and values:
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(thisdict)
Dictionary Items
Dictionary items are ordered, changeable, and do not allow duplicates.
Dictionary items are presented in key:value pairs, and can be referred to by using the key
name.
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(thisdict["brand"])
Ordered or Unordered?
As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries
are unordered.
When we say that dictionaries are ordered, it means that the items have a defined order, and
that order will not change.
Unordered means that the items do not have a defined order, so you cannot refer to an item by
using an index.
Changeable
Dictionaries are changeable, meaning that we can change, add or remove items after the
dictionary has been created.
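Duplicates Not Allowed
Dictionaries cannot have two items with the same key. A duplicate key overwrites the existing value:
Example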
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964,
"year": 2020
}
print(thisdict)
Dictionary Length
To determine how many items a dictionary has, use the len() function:
Example
print(len(thisdict))
Example
thisdict = {
"brand": "Ford",
"electric": False,
"year": 1964,
"colors": ["red", "white", "blue"]
}
type()
From Python's perspective, dictionaries are defined as objects with the data type 'dict':
<class 'dict'>
Example
Print the data type of a dictionary:
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(type(thisdict))
There are four collection data types in the Python programming language:
Accessing Items
You can access the items of a dictionary by referring to its key name, inside square brackets:
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
x = thisdict["model"]
There is also a method called get() that will give you the same result:
Example
x = thisdict.get("model")
Get Keys
The keys() method will return a view object listing all the keys in the dictionary.
Example
x = thisdict.keys()
The list of the keys is a view of the dictionary, meaning that any changes done to the
dictionary will be reflected in the keys list.
Example
Add a new item to the original dictionary, and see that the keys list gets updated as well:
car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
x = car.keys()
car["color"] = "white"
The values() method will return a view object listing all the values in the dictionary.
Example
x = thisdict.values()
The list of the values is a view of the dictionary, meaning that any changes done to the
dictionary will be reflected in the values list.
Example
Make a change in the original dictionary, and see that the values list gets updated as well:
car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
x = car.values()
car["year"] = 2020
Example
Add a new item to the original dictionary, and see that the values list gets updated as well:
car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
x = car.values()
print(x) #before the change
car["color"] = "red"
The items() method will return each item in a dictionary, as (key, value) tuples in a view object.
Example
x = thisdict.items()
The returned list is a view of the items of the dictionary, meaning that any changes done to
the dictionary will be reflected in the items list.
Example
Make a change in the original dictionary, and see that the items list gets updated as well:
car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
x = car.items()
car["year"] = 2020
Add a new item to the original dictionary, and see that the items list gets updated as well:
car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
x = car.items()
car["color"] = "red"
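The examples above change the dictionary but do not print the result. The following sketch makes the "view" behavior visible by comparing a view with a list copy; the names keys_view and snapshot are illustrative assumptions.

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

keys_view = car.keys()       # a view, not a copy
snapshot = list(car.keys())  # an independent list taken now

car["color"] = "white"       # change the dictionary afterwards

print(len(keys_view))  # 4: the view sees the new key
print(len(snapshot))   # 3: the list does not
```

The same applies to values() and items(): wrap the view in list() when a fixed snapshot is needed.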
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
if "model" in thisdict:
    print("Yes, 'model' is one of the keys in the thisdict dictionary")
Change Values
You can change the value of a specific item by referring to its key name:
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict["year"] = 2018
Update Dictionary
The update() method will update the dictionary with the items from the given argument.
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict.update({"year": 2020})
Adding Items
Adding an item to the dictionary is done by using a new index key and assigning a value to it:
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict["color"] = "red"
print(thisdict)
Removing Items
Example
The pop() method removes the item with the specified key name:
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict.pop("model")
print(thisdict)
Example
The popitem() method removes the last inserted item (in versions before 3.7, a random item
is removed instead):
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict.popitem()
print(thisdict)
Example
The del keyword removes the item with the specified key name:
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
del thisdict["model"]
print(thisdict)
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
del thisdict
print(thisdict) #this will cause an error because "thisdict" no longer exists.
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict.clear()
print(thisdict)
When looping through a dictionary, the return values are the keys of the dictionary, but there
are methods to return the values as well.
Example
for x in thisdict:
    print(x)
Example
for x in thisdict:
    print(thisdict[x])
Example
You can also use the values() method to return values of a dictionary:
for x in thisdict.values():
    print(x)
Example
You can use the keys() method to return the keys of a dictionary:
for x in thisdict.keys():
    print(x)
Example
Loop through both keys and values, by using the items() method:
for x, y in thisdict.items():
    print(x, y)
Copy a Dictionary
You cannot copy a dictionary simply by typing dict2 = dict1, because: dict2 will only be
a reference to dict1, and changes made in dict1 will automatically also be made in dict2.
There are ways to make a copy, one way is to use the built-in Dictionary method copy().
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
mydict = thisdict.copy()
print(mydict)
Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
mydict = dict(thisdict)
print(mydict)
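The difference between a reference and a copy can be checked directly. This is a minimal sketch; the names alias and copied are illustrative assumptions.

```python
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}

alias = thisdict          # a second name for the same dictionary
copied = thisdict.copy()  # an independent (shallow) copy

thisdict["year"] = 2020   # change the original

print(alias["year"])   # 2020: the alias follows the change
print(copied["year"])  # 1964: the copy does not
```

Note that copy() and dict() both make shallow copies, so nested dictionaries inside the copy are still shared with the original.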
Nested Dictionaries
Example
myfamily = {
"child1" : {
"name" : "Emil",
"year" : 2004
},
"child2" : {
"name" : "Tobias",
"year" : 2007
},
"child3" : {
"name" : "Linus",
"year" : 2011
}
}
Or, if you want to add three dictionaries into a new dictionary:
Example
Create three dictionaries, then create one dictionary that will contain the other three
dictionaries:
child1 = {
"name" : "Emil",
"year" : 2004
}
child2 = {
"name" : "Tobias",
"year" : 2007
}
child3 = {
"name" : "Linus",
"year" : 2011
}
myfamily ={
"child1" : child1,
"child2" : child2,
"child3" : child3
}
Dictionary Methods
Python has a set of built-in methods that you can use on dictionaries.
Method          Description
items()         Returns a view object containing a tuple for each key-value pair
setdefault()    Returns the value of the specified key. If the key does not exist, inserts the key with
                the specified value
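The two behaviors of setdefault() can be sketched as follows; the dictionary contents and the default values are assumptions chosen for illustration.

```python
car = {"brand": "Ford", "model": "Mustang"}

# Key exists: the stored value is returned and nothing changes.
model = car.setdefault("model", "Bronco")

# Key is missing: it is inserted with the given value, which is returned.
color = car.setdefault("color", "white")

print(model)  # Mustang
print(color)  # white
print(car)    # {'brand': 'Ford', 'model': 'Mustang', 'color': 'white'}
```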
Mr. D.Gangadhar
Associate Professor
Here, the condition evaluates to either true or false. The if statement accepts Boolean values: if
the value is true, it executes the block of statements below it; otherwise it does not. We can
also enclose the condition in parentheses '(' ')'.
As we know, python uses indentation to identify a block. So the block under an if statement will be
identified as shown in the below example:
if condition:
    statement1
statement2
# Here, if the condition is true, the if block
# will consider only statement1 to be inside
# its block.
Example: Python if Statement
Output:
I am Not in if
As the condition present in the if statement is false, the block below the if statement is not
executed.
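The code for this example is missing from the notes. A minimal sketch that would produce the output shown, assuming a variable i set to 10 and a test of i > 15 (both are assumptions):

```python
i = 10  # assumed value; any value not greater than 15 behaves the same

if i > 15:
    # Skipped, because the condition is false.
    print("i is greater than 15")

# This line is outside the if block, so it always runs.
print("I am Not in if")
```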
if-else
The if statement alone tells us that if a condition is true it will execute a block of statements, and if the
condition is false it won't. But what if we want to do something else when the condition is false? Here
comes the else statement. We can use the else statement with the if statement to execute a block of code
when the condition is false.
Syntax:
if (condition):
    # Executes this block if
    # condition is true
else:
    # Executes this block if
    # condition is false
Example 1: Python if else statement
Output:
i is greater than 15
i'm in else Block
i'm not in if and not in else Block
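The code for this example is missing from the notes. A sketch consistent with the output above, assuming i is 20 and the test is i < 15 (both are assumptions):

```python
i = 20  # assumed value

if i < 15:
    print("i is smaller than 15")
    print("i'm in if Block")
else:
    print("i is greater than 15")
    print("i'm in else Block")

# Not indented under if or else, so it runs in both cases.
print("i'm not in if and not in else Block")
```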
nested-if
A nested if is an if statement placed inside another if statement. Python allows us to nest if
statements within if statements, so the inner condition is checked only when the outer condition
is true.
Syntax:
if (condition1):
    # Executes when condition1 is true
    if (condition2):
        # Executes when condition2 is true
    # inner if block ends here
# outer if block ends here
Output:
i is smaller than 15
i is smaller than 12 too
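The code that produced this output is missing from the notes. A sketch consistent with it, assuming i is 10 (the value and the exact conditions are assumptions):

```python
i = 10  # assumed value

if i == 10:
    # The inner if statements run only when the outer condition is true.
    if i < 15:
        print("i is smaller than 15")
    if i < 12:
        print("i is smaller than 12 too")
    else:
        print("i is 12 or larger")
```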
if-elif-else ladder
Here, a user can decide among multiple options. The if statements are executed from the top down.
As soon as one of the conditions controlling the if is true, the statement associated with that if is
executed, and the rest of the ladder is bypassed. If none of the conditions is true, then the final else
statement will be executed.
Syntax:
if (condition):
    statement
elif (condition):
    statement
.
.
else:
    statement
Example: Python if else elif statements
Output:
i is 20
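The code for this example is missing from the notes. A sketch that would print the output above, assuming i is 20; the variable result is a hypothetical helper used only to hold the message.

```python
i = 20  # assumed value

# Conditions are tested from the top down; the first true one wins.
if i == 10:
    result = "i is 10"
elif i == 15:
    result = "i is 15"
elif i == 20:
    result = "i is 20"
else:
    result = "i is not present"

print(result)  # i is 20
```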
Loops:
In programming, a loop is a sequence of instructions that repeats a specific set of tasks
based on some condition and continues until a certain condition is reached.
Python provides three types of looping techniques:
Loop            Description
for Loop        Traditionally used when programmers have a piece of code and
                want to repeat it 'n' number of times.
while Loop      The loop is repeated as long as a specific Boolean condition remains true.
Nested Loops    Programmers can use one loop inside another; i.e., a for loop inside a while
                loop or vice versa, a for loop inside a for loop, or a while loop inside a while loop.
Syntax:
for var in iterable:
    # statements
Here the iterable is a collection of objects like lists, tuples. The indented statements inside the for
loops are executed once for each item in an iterable. The variable var takes the value of the next item
of the iterable each time through the loop.
# Iterating over dictionary
print("\nDictionary Iteration")
d = dict()
d['xyz'] = 123
d['abc'] = 345
for i in d:
    print("%s %d" % (i, d[i]))
Output:
List Iteration
geeks
for
geeks
Tuple Iteration
geeks
for
geeks
String Iteration
G
e
e
k
s
Dictionary Iteration
xyz 123
abc 345
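The output above also lists list, tuple and string iteration, but the code for those parts is missing from the notes. A sketch consistent with that output; the variable names are assumptions.

```python
# Iterating over a list
print("List Iteration")
words = ["geeks", "for", "geeks"]
for word in words:
    print(word)

# Iterating over a tuple (immutable)
print("\nTuple Iteration")
t = ("geeks", "for", "geeks")
for item in t:
    print(item)

# Iterating over a string, one character at a time
print("\nString Iteration")
s = "Geeks"
for ch in s:
    print(ch)
```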
Python While Loop:
Python While Loop is used to execute a block of statements repeatedly as long as a given condition is
true. When the condition becomes false, the line immediately after the loop in the program
is executed. The while loop falls under the category of indefinite iteration. Indefinite iteration means
that the number of times the loop is executed isn't specified explicitly in advance.
Syntax:
while expression:
    statement(s)
All the statements indented by the same number of character spaces after a
programming construct are considered to be part of a single block of code. Python uses indentation as
its method of grouping statements. When a while loop is executed, the expression is first evaluated in a
Boolean context and, if it is true, the loop body is executed. Then the expression is checked again; if it is
still true, the body is executed again, and this continues until the expression becomes false.
Output:
Hello Geek
Hello Geek
Hello Geek
In the above example, the condition for while will be True as long as the counter variable (count) is
less than 3.
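The loop this paragraph describes is missing from the notes. A minimal sketch consistent with the output, assuming a counter named count that starts at 0:

```python
count = 0  # assumed starting value

# The body runs while count is 0, 1 and 2, so three times in total.
while count < 3:
    count = count + 1
    print("Hello Geek")
```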
Output
4
3
2
1
In the above example, we have run a while loop over a list; it keeps running as long as there is an
element present in the list.
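The code for this example is missing from the notes. A sketch that would print 4, 3, 2, 1, assuming a list named a (the name and contents are assumptions):

```python
a = [1, 2, 3, 4]  # assumed list

# An empty list is treated as false, so the loop stops when a is empty.
while a:
    print(a.pop())  # pop() removes and returns the last element
```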
Single statement while block
Just like the if block, if the while block consists of a single statement we can declare the entire loop
in a single line. If there are multiple statements in the block that makes up the loop body, they can be
separated by semicolons (;).
Output:
Hello Geek
Hello Geek
Hello Geek
Hello Geek
Hello Geek
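The single-line loop that produced this output is missing from the notes. A sketch that would print the five lines above, assuming a counter named count; the statements of the loop body are separated by semicolons.

```python
count = 0  # assumed starting value

# Whole loop on one line: both statements belong to the loop body.
while count < 5: count += 1; print("Hello Geek")
```

This form is legal but harder to read, so the multi-line form is usually preferable outside of quick experiments.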
Nested Loops
Syntax
Example
Source Code
OUTPUT
1*1=1
1*2=2
2*1=2
2*2=4
3*1=3
3*2=6
4*1=4
4*2=8
5*1=5
5*2=10
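The source code for this nested loop example is missing from the notes. A sketch consistent with the output above, assuming two for loops over range() (the ranges are inferred from the output):

```python
for i in range(1, 6):      # outer loop: i = 1..5
    for j in range(1, 3):  # inner loop: j = 1..2, restarted for every i
        print(f"{i}*{j}={i * j}")
```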
These statements are used to change execution from its normal sequence.
Python supports three types of loop control statements:
Break statement      It is used to exit a while loop or a for loop. It terminates the looping
                     and transfers execution to the statement next to the loop.
Continue statement   It causes the loop to skip the rest of its body and start re-testing
                     its condition.
Pass statement       It is a null statement that does nothing; it is used as a placeholder
                     where a statement is required.
# Python program to
# demonstrate break statement
s = 'geeksforgeeks'
# Using for loop
for letter in s:
    print(letter)
    # break the loop as soon as it sees 'e'
    # or 's'
    if letter == 'e' or letter == 's':
        break
print("Out of for loop")
print()
i = 0
# Using while loop
while True:
    print(s[i])
    # break the loop as soon as it sees 'e'
    # or 's'
    if s[i] == 'e' or s[i] == 's':
        break
    i += 1
print("Out of while loop")
Output:
g
e
Out of for loop
g
e
Out of while loop
2. Continue statement:
The continue statement is a loop control statement that forces the next iteration of the loop to execute
while skipping the rest of the code inside the loop for the current iteration only; i.e., when the continue
statement is executed in the loop, the code inside the loop following the continue statement is
skipped for the current iteration and the next iteration of the loop begins.
Syntax:
continue
Example: Continue statement in Python
Consider the situation when you need to write a program which prints the numbers from 1 to 10
but not 6. It is specified that you have to do this using a loop, and only one loop is allowed. Here
comes the usage of the continue statement. We can run a loop from 1 to 10 and
each time compare the value of the iterator with 6. If it is equal to 6, we use the
continue statement to move to the next iteration without printing anything; otherwise we print
the value.
Below is the implementation of the above idea:
# Python program to
# demonstrate continue
# statement
# loop from 1 to 10
for i in range(1, 11):
    # If i is equal to 6,
    # continue to next iteration
    # without printing
    if i == 6:
        continue
    else:
        # otherwise print the value
        # of i
        print(i, end=" ")
Output:
1 2 3 4 5 7 8 9 10
3. Pass Statement:
The pass statement is a null statement. The difference between pass and a comment is that a
comment is ignored by the interpreter whereas pass is not ignored.
The pass statement is generally used as a placeholder, i.e. when the user does not yet know what code to
write, so the user simply places pass at that line. Sometimes pass is used when the user doesn't want
any code to execute. The user can place pass where empty code is not allowed, like in loops,
function definitions, class definitions, or if statements. Using the pass statement avoids the
error that an empty block would cause.
Syntax:
pass
Example 1: pass statement can be used in empty functions
def geekFunction():
    pass
Example 2: pass statement can be used in an empty class
class geekClass:
    pass
Example 3: pass statement can be used in a for loop when the user doesn't know what to code inside the
loop
n = 10
for i in range(n):
    # pass can be used as placeholder
    # when code is to be added later
    pass
Example 4: pass statement can be used with a conditional statement
a = 10
b = 20
if(a<b):
    pass
else:
    print("b<a")
Example 5: let's take another example in which the pass statement is executed when the condition is
true
pass
else:
print(i)
Output:
b
c
d
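Only the last lines of this example survive in the notes. A sketch consistent with those lines and with the output b, c, d, assuming a list named li and a test for 'a' (both are assumptions):

```python
li = ['a', 'b', 'c', 'd']  # assumed list

for i in li:
    if i == 'a':
        pass      # condition is true: do nothing for 'a'
    else:
        print(i)  # prints b, c and d
```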
Python Exception:
An exception can be defined as an unusual condition in a program that interrupts the
flow of the program.
Whenever an exception occurs, the program stops executing, and the code that follows is not
executed. An exception is therefore a run-time error that the Python script is unable to handle. An
exception is a Python object that represents an error.
Python provides a way to handle the exception so that the code can be executed without any
interruption. If we do not handle the exception, the interpreter doesn't execute all the code that exists
after the exception.
Python has many built-in exceptions that enable our program to run without interruption and give the
output. These exceptions are given below:
Common Exceptions
Python provides a number of built-in exceptions, but here we describe only the common standard
exceptions. A list of common exceptions that can be raised from a standard Python program is given
below.
ZeroDivisionError: Occurs when a number is divided by zero.
NameError: It occurs when a name is not found. It may be local or global.
IndentationError: If incorrect indentation is given.
IOError: It occurs when Input Output operation fails.
EOFError: It occurs when the end of the file is reached, and yet operations are being performed.
Output:
Enter a:10
Enter b:0
Traceback (most recent call last):
File "exception-test.py", line 3, in <module>
c = a/b;
ZeroDivisionError: division by zero
The above program is syntactically correct, but it throws an error because of unusual input. That kind
of programming is not suitable or recommended for projects that
require uninterrupted execution. That's why exception handling plays an essential role in handling
these unexpected exceptions. We can handle these exceptions in the following way.
If the Python program contains suspicious code that may throw the exception, we must place that code
in the try block. The try block must be followed with the except statement, which contains a block of
code that will be executed if there is some exception in the try block.
Syntax
try:
    #block of code
except Exception1:
    #block of code
except Exception2:
    #block of code
#other code
Example 1
try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    c = a/b
except:
    print("Can't divide with zero")
Output:
Enter a:10
Enter b:0
Can't divide with zero
We can also use the else statement with the try-except statement in which, we can place the code which
will be executed in the scenario if no exception occurs in the try block.
The syntax to use the else statement with the try-except statement is given below.
try:
    #block of code
except Exception1:
    #block of code
else:
    #this code executes if no except block is executed
Example 2
try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    c = a/b
    print("a/b = %d"%c)
# Using Exception with except statement. If we print(Exception) it will return exception class
except Exception:
    print("can't divide by zero")
    print(Exception)
else:
    print("Hi I am else block")
Output:
Enter a:10
Enter b:0
can't divide by zero
<class 'Exception'>
The except statement with no exception
Python provides the flexibility not to specify the name of the exception with the except statement.
Consider the following example.
Example
try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    c = a/b
    print("a/b = %d"%c)
except:
    print("can't divide by zero")
else:
    print("Hi I am else block")
Using the except statement with an exception variable
We can use an exception variable with the except statement by using the as keyword. This
object holds the cause of the exception. Consider the following example:
try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    c = a/b
    print("a/b = %d"%c)
# Using exception object with the except statement
except Exception as e:
    print("can't divide by zero")
    print(e)
else:
    print("Hi I am else block")
Output:
Enter a:10
Enter b:0
can't divide by zero
division by zero
Points to remember
Python lets us omit the exception name in the except statement.
We can declare multiple exceptions in the except statement, since the try block may contain
statements that throw different types of exceptions.
We can also specify an else block along with the try-except statement, which will be executed if no
exception is raised in the try block.
The statements that don't throw the exception should be placed inside the else block.
Example
try:
    #this will throw an exception if the file doesn't exist.
    fileptr = open("file.txt","r")
except IOError:
    print("File not found")
else:
    print("The file opened successfully")
    fileptr.close()
Output:
File not found
Declaring Multiple Exceptions
Python allows us to declare multiple exceptions with the except clause. Declaring multiple
exceptions is useful in cases where a try block can throw several kinds of exceptions. The syntax is given
below.
Syntax
try:
    #block of code
except (<Exception 1>,<Exception 2>,<Exception 3>,...<Exception n>):
    #block of code
else:
    #block of code
Consider the following example.
try:
    a=10/0
except (ArithmeticError, IOError):
    print("Arithmetic Exception")
else:
    print("Successfully Done")
Output
Arithmetic Exception
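The try...finally block
The finally block is optional. Its code always runs, whether or not the try block raises an
exception, so it is typically used to release resources such as open files.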
Example
try:
    fileptr = open("file2.txt","r")
    try:
        fileptr.write("Hi I am good")
    finally:
        fileptr.close()
        print("file closed")
except:
    print("Error")
Output:
file closed
Error
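The order in which the except, else and finally blocks run can be sketched without user input. The function divide() below is a hypothetical helper written for this illustration; it is not part of the examples above.

```python
def divide(a, b):
    """Return a / b, or None when b is zero (hypothetical helper)."""
    try:
        result = a / b
    except ZeroDivisionError:
        # Runs only when the try block raises ZeroDivisionError.
        print("Can't divide by zero")
        result = None
    else:
        # Runs only when the try block raised no exception.
        print("No exception occurred")
    finally:
        # Runs in both cases.
        print("finally always runs")
    return result

print(divide(10, 2))  # 5.0
print(divide(10, 0))  # None
```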
Raising exceptions:
An exception can be raised deliberately by using the raise statement in Python. It is useful in scenarios
where we need to raise an exception to stop the execution of the program.
For example, if a program is allowed 2GB of memory for execution and it tries to
occupy more than that, we can raise an exception to stop the execution of the program.
The syntax to use the raise statement is given below.
Syntax
raise Exception_class(<value>)
Points to remember
To raise an exception, the raise statement is used. The exception class name follows it.
An exception can be provided with a value, given in parentheses.
To access the value, the "as" keyword is used. "e" is used as a reference variable which stores the value of
the exception.
We can pass the value to an exception to specify the exception type.
Example
try:
    age = int(input("Enter the age:"))
    if(age<18):
        raise ValueError
    else:
        print("the age is valid")
except ValueError:
    print("The age is not valid")
Output:
Enter the age:17
The age is not valid
Example 2 Raise the exception with message
try:
    num = int(input("Enter a positive integer: "))
    if(num <= 0):
        # we can pass the message in the raise statement
        raise ValueError("That is a negative number!")
except ValueError as e:
    print(e)
Output:
Enter a positive integer: -5
That is a negative number!
Example 3
try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    # compare values with ==, not with the identity operator "is"
    if b == 0:
        raise ArithmeticError
    else:
        print("a/b = ",a/b)
except ArithmeticError:
    print("The value of b can't be 0")
Output:
Enter a:10
Enter b:0
The value of b can't be 0
Custom Exception:
Python allows us to create our own exceptions that can be raised from the program and caught using the
except clause. However, we suggest you read this section after studying Python objects and classes.
Consider the following example.
Example:
class ErrorInCode(Exception):
    def __init__(self, data):
        self.data = data
    def __str__(self):
        return repr(self.data)
try:
    raise ErrorInCode(2000)
except ErrorInCode as ae:
    print("Received error:", ae.data)
Output:
Received error: 2000
Python Random module:
The Python random module functions depend on a pseudo-random number generator function random(),
which generates a float number in the range 0.0 (inclusive) to 1.0 (exclusive).
There are different types of functions in the random module, which are given below:
random.random()
This function generates a random float number in the range 0.0 (inclusive) to 1.0 (exclusive).
random.randint()
This function returns a random integer between the specified integers, with both endpoints included.
random.choice()
This function returns a randomly selected element from a non-empty sequence.
Example
# importing "random" module.
import random
# We are using the choice() function to generate a random number from
# the given list of numbers.
print ("The random number from list is : ",end="")
print (random.choice([50, 41, 84, 40, 31]))
Output:
random.randrange()
This function is used to generate a number within the range specified in its arguments. It accepts three
arguments: beginning number, last number (which is excluded), and step, which is used to skip numbers
in the range.
Consider the following example.
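The example itself is missing from these notes. A minimal sketch of random.randrange(), with start, stop and step values chosen only for illustration:

```python
import random

# Picks one of 0, 2, 4, 6, 8: start at 0, stop before 10, step by 2.
value = random.randrange(0, 10, 2)
print(value)
```

The printed number differs from run to run, so no fixed output is shown here.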
Output:
random.seed()
This function initializes the random number generator with the given seed argument, so that the same
seed reproduces the same sequence of random numbers. Consider the following example.
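The example itself is missing from these notes. A minimal sketch of random.seed(); the seed value 5 and the names first and second are assumptions chosen for illustration.

```python
import random

random.seed(5)           # fix the starting state of the generator
first = random.random()

random.seed(5)           # same seed again
second = random.random()

# The same seed reproduces the same number.
print(first == second)   # True
```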
Output:
Pi (π): It is a well-known mathematical constant, defined as the ratio of the circumference to the
diameter of a circle. Its value is 3.141592653589793.
Euler's number (e): It is defined as the base of the natural logarithm, and its value is
2.718281828459045.
Example
import math
number = 2e-7 # small value of x
print('log(fabs(x), base) is :', math.log(math.fabs(number), 10))
Output:
math.log10()
This method returns the base-10 logarithm of the given number, also called the standard logarithm.
Example
import math
x = 13 # value of x
print('log10(x) is :', math.log10(x))
Output:
log10(x) is : 1.1139433523068367
math.exp()
This method returns a floating-point number after raising e to the given number.
Example
import math
number = 5e-2 # small value of x
print('The given number (x) is :', number)
print('e^x (using exp() function) is :', math.exp(number))
Output:
math.pow(x,y)
This method returns x raised to the power y, as a float. If x is negative and y is
not an integer value, it raises a ValueError.
Example
import math
number = math.pow(10,2)
print("The power of number:",number)
Output:
math.floor(x)
This method returns the floor value of x, i.e. the largest integer less than or equal to x.
Example:
import math
number = math.floor(10.25201)
print("The floor value is:",number)
Output:
math.ceil(x)
This method returns the ceiling value of x, i.e. the smallest integer greater than or equal to x.
import math
number = math.ceil(10.25201)
print("The ceil value is:",number)
Output:
math.fabs(x)
This method returns the absolute value of x as a float.
import math
number = math.fabs(10.001)
print("The absolute value is:",number)
Output:
math.factorial()
This method returns the factorial of the given number x. If x is negative, it raises a ValueError;
non-integer arguments are rejected as well (recent Python versions raise a TypeError for floats).
Example
import math
number = math.factorial(7)
print("The factorial of number:",number)
Output:
math.modf(x)
This method returns the fractional and integer parts of x. Both parts carry the sign of x and are floats.
Example
import math
number = math.modf(44.5)
print("The modf of number:",number)
Output:
The Python math module provides several functions which can perform complex tasks in a single line of
code. In this tutorial, we have discussed a few important math functions.
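Several of the examples above are shown without their output. As a quick check, the following sketch calls the same functions together; the only change from the examples above is the negative argument to fabs(), chosen to make the effect visible.

```python
import math

print(math.pow(10, 2))       # 100.0
print(math.floor(10.25201))  # 10
print(math.ceil(10.25201))   # 11
print(math.fabs(-10.001))    # 10.001
print(math.factorial(7))     # 5040
print(math.modf(44.5))       # (0.5, 44.0)
```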
Python OS Module:
Python OS module provides the facility to establish the interaction between the user and the operating
system. It offers many useful OS functions that are used to perform OS-based tasks and get related
information about operating system.
The OS comes under Python's standard utility modules. This module offers a portable way of using
operating system dependent functionality.
The Python OS module lets us work with the files and directories.
There are some functions in the OS module which are given below:
os.name
This attribute provides the name of the operating-system-dependent module that Python has imported,
for example 'posix' or 'nt'.
Example
import os
print(os.name)
Output:
nt
os.mkdir()
The os.mkdir() function is used to create a new directory. Consider the following example.
import os
os.mkdir("d:\\newdir")
It will create a new directory named newdir on the D drive, at the path given in the string argument of
the function.
os.getcwd()
It returns the current working directory (CWD) of the process.
Example
import os
print(os.getcwd())
Output:
C:\Users\Python\Desktop\ModuleOS
os.chdir()
The os module provides the chdir() function to change the current working directory.
import os
os.chdir("d:\\")
Output:
d:\\
os.rmdir()
The rmdir() function removes the specified directory, given as an absolute or relative path. If the
directory is the current working directory, we first have to change the working directory and then
remove the folder.
Example
import os
# It will throw a Permission error; that's why we have to change the current working directory.
os.rmdir("d:\\newdir")
os.chdir("..")
os.rmdir("newdir")
os.error
os.error is an alias for the built-in OSError exception, which represents OS-level errors. OSError is
raised in case of invalid or inaccessible file names, paths, etc.
Example
import os

try:
    # If the file does not exist,
    # then it throws an IOError
    filename = 'Python.txt'
    # the 'rU' mode was removed in recent Python versions; plain 'r' is used instead
    f = open(filename, 'r')
    text = f.read()
    f.close()

# Control jumps directly to here if
# any line throws IOError.
except IOError:

    # print(os.error) would print <class 'OSError'>
    print('Problem reading: ' + filename)
Output:
os.popen()
This function opens a pipe to or from the command specified, and it returns a file object which is
connected to that pipe.
Example
import os
fd = "python.txt"
# write to the file and read it back with the built-in open()
file = open(fd, 'w')
file.write("This is awesome")
file.close()
file = open(fd, 'r')
text = file.read()
print(text)
file.close()

# popen() does not open a file: it runs a shell command and
# returns a file object connected to the command's pipe
pipe = os.popen("echo This is awesome")
output = pipe.read()
pipe.close()
Output:
This is awesome
os.close()
This function closes the file associated with the file descriptor fd.
Example
import os
fr = "Python1.txt"
# os.open() returns an integer file descriptor, which is what os.close() expects
fd = os.open(fr, os.O_RDONLY)
text = os.read(fd, 1024)
print(text.decode())
os.close(fd)
Output:
os.rename()
A file or directory can be renamed by using the function os.rename(). The user must have permission to
change the file.
Example
import os
fd = "python.txt"
os.rename(fd, 'Python1.txt')
Output:
os.access()
This function uses real uid/gid to test if the invoking user has access to the path.
Example
import os

# Checking existence with os.F_OK
path1 = os.access("Python.txt", os.F_OK)
print("Path exists:", path1)

# Checking access with os.R_OK
path2 = os.access("Python.txt", os.R_OK)
print("Access to read the file:", path2)

# Checking access with os.W_OK
path3 = os.access("Python.txt", os.W_OK)
print("Access to write the file:", path3)

# Checking access with os.X_OK
path4 = os.access("Python.txt", os.X_OK)
print("Check if path can be executed:", path4)
Output:
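The result of os.access() depends on the file being checked. As a self-contained sketch, the checks can be run against a temporary file created for the purpose (the use of tempfile is an assumption, not part of the original example):

```python
import os
import tempfile

# Create a temporary file so the checks have something real to inspect.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name

exists = os.access(path, os.F_OK)
readable = os.access(path, os.R_OK)
writable = os.access(path, os.W_OK)
print(exists, readable, writable)

os.remove(path)
gone = os.access(path, os.F_OK)   # False once the file has been deleted
print(gone)
```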
The python sys module provides functions and variables which are used to manipulate different parts of
the Python Runtime Environment. It lets us access system-specific parameters and functions.
import sys
First, we have to import the sys module in our program before running any functions.
sys.modules
This is a dictionary that maps the names of the modules that have already been imported to the module objects.
sys.argv
This is the list of command line arguments passed to a Python script. The name of the script
is always the item at index 0, and the rest of the arguments are stored at subsequent indices.
sys.base_exec_prefix
It is set up during Python startup, before site.py is run, to the same value as exec_prefix. If not running
in a virtual environment, the two values remain the same.
sys.base_prefix
It is set up during Python startup, before site.py is run, to the same value as prefix.
sys.byteorder
sys.maxsize
sys.path
This is a list of strings that specifies the search path for Python modules. It is initialized from the
PYTHONPATH environment variable, plus an installation-dependent default.
sys.stdin
It is the file object the interpreter uses for standard input. The related object sys.__stdin__ contains the
original value of stdin at the start of the program; it is used during finalization and can be used to
restore sys.stdin.
sys.getrefcount
sys.exit
This function is used to exit from either the Python console or command prompt, and also used to exit
from the program in case of an exception.
sys.executable
This value is the absolute path to the Python interpreter. It is useful for knowing where
Python is installed on someone else's machine.
sys.platform
This value identifies the platform on which we are working.
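Several of the sys values listed above can be inspected in a few lines. This is a minimal sketch; the printed values depend on the machine and interpreter, so no output is shown.

```python
import sys

print(sys.byteorder)          # 'little' or 'big'
print(sys.maxsize)            # largest value a container index can take
print(sys.platform)           # for example 'linux', 'win32' or 'darwin'
print(sys.executable)         # path of the running interpreter
print(sys.argv)               # script name followed by its arguments
print("sys" in sys.modules)   # imported modules are keys of this dictionary

data = []
count = sys.getrefcount(data)  # includes the temporary reference made by the call
print(count)
```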
The Python statistics module provides functions for calculating mathematical statistics of numeric data.
Some popular statistical functions defined in this module are described below.
mean() function: The mean() function is used to calculate the arithmetic mean of the numbers in the
list.
Example:
import statistics
# list of positive integer numbers
datasets = [5, 2, 7, 4, 2, 6, 8]
x = statistics.mean(datasets)
# Printing the mean
print("Mean is :", x)
Output:
Mean is : 4.857142857142857
median() function :The median() function is used to return the middle value of the numeric data in the
list.
Example
import statistics
datasets = [4, -5, 6, 6, 9, 4, 5, -2]
# the message text is an assumption
print("Median of data-set is: %s" % (statistics.median(datasets)))
Output:
mode() function: The mode() function returns the most common value that occurs in the list.
Example
import statistics
Output:
Calculated Mode 2
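A minimal sketch of a mode() call follows. The data-set is an assumption chosen so that 2 is the most frequent value; it is not taken from the original example.

```python
import statistics

# Hypothetical data-set in which 2 occurs most often.
datasets = [2, 4, 2, 7, 2, 5, 7]
m = statistics.mode(datasets)
print("Calculated Mode", m)
```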
stdev() function: The stdev() function is used to calculate the standard deviation on a given sample
which is available in the form of the list.
Example
import statistics
# creating a simple data-set
sample = [7, 8, 9, 10, 11]
# Prints standard deviation
print("Standard Deviation of sample is %s " % (statistics.stdev(sample)))
Output:
Standard Deviation of sample is 1.5811388300841898
median_low(): The median_low function is used to return the low median of numeric data in the list.
Example
import statistics
# simple list of a set of integers
set1 = [4, 6, 2, 5, 7, 7]
# Note: low median will always be a member of the data-set.
# Print low median of the data-set
print("Low median of the data-set is %s" % (statistics.median_low(set1)))
Output:
Low median of the data-set is 5
median_high():
The median_high function is used to return the high median of numeric data in the list.
Example:
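import statistics
# data-set assumed to be the same as in the median_low example
dataset = [4, 6, 2, 5, 7, 7]
print("High median of the data-set is %s" % (statistics.median_high(dataset)))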
Output:
High median of the data-set is 6
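The difference between the three median functions only shows up when the list has an even number of items. The following sketch compares them on one list; the data is chosen for illustration.

```python
import statistics

data = [4, 6, 2, 5, 7, 7]            # sorted: 2, 4, 5, 6, 7, 7

mid = statistics.median(data)        # mean of the two middle values
low = statistics.median_low(data)    # smaller of the two middle values
high = statistics.median_high(data)  # larger of the two middle values
print(mid, low, high)
```

Here median() returns 5.5, which is not a member of the list, while median_low() and median_high() return the actual members 5 and 6.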
o date - It is a naive, idealized date. It has year, month, and day as attributes.
o time - It is an idealized time, assuming every day has exactly 24*60*60 seconds. It has hour,
minute, second, microsecond, and tzinfo as attributes.
o datetime - It is a combination of a date and a time, with the attributes year, month, day, hour,
minute, second, microsecond, and tzinfo.
o timedelta - It represents the difference between two date, time or datetime instances to
microsecond resolution.
o tzinfo - It is an abstract base class for time zone information objects.
o timezone - It is a class that implements the tzinfo abstract base class as a fixed offset from UTC.
It was added in Python 3.2.
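The classes listed above work together. As a short sketch with dates chosen for illustration:

```python
from datetime import date, time, datetime, timedelta, timezone

d = date(2020, 4, 4)
t = time(13, 18, 35)
dt = datetime.combine(d, t)               # date + time -> datetime
later = dt + timedelta(days=1, hours=2)   # shift by a duration
gap = later - dt                          # subtracting gives a timedelta
aware = dt.replace(tzinfo=timezone.utc)   # attach a fixed UTC tzinfo

print(dt)
print(later)
print(gap)
print(aware.isoformat())
```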
Tick: In Python, time instants are counted from 12:00 AM, 1 January 1970 (the epoch). The time() function
of the time module returns the number of seconds elapsed since then as a floating-point number. These
seconds are referred to as ticks, the unit in which the time is measured.
import time
# prints the number of ticks elapsed since 12:00 AM, 1 January 1970
print(time.time())
Output:
1585928913.6519969
The localtime() function of the time module is used to get the current time as a time tuple. Consider the
following example.
Example
import time

# returns a time tuple
print(time.localtime(time.time()))
Output:
Time tuple
The time is treated as a tuple of 9 numbers. Let's look at the members of the time tuple.
Index   Attribute          Values
0       Year               4 digit (for example 2018)
1       Month              1 to 12
2       Day                1 to 31
3       Hour               0 to 23
4       Minute             0 to 59
5       Second             0 to 60
6       Day of week        0 to 6
7       Day of year        1 to 366
8       Daylight savings   -1, 0, 1
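The members of the time tuple can also be read by name. The sketch below uses time.gmtime(0), the epoch expressed in UTC, so that the values do not depend on when or where the code runs.

```python
import time

t = time.gmtime(0)   # the epoch itself, expressed in UTC
print(t.tm_year, t.tm_mon, t.tm_mday)
print(t.tm_hour, t.tm_min, t.tm_sec)
print(t.tm_wday, t.tm_yday, t.tm_isdst)
print(len(t))
```

The weekday value 3 stands for Thursday, because Monday is counted as 0.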
The time can be formatted by using the asctime() function of the time module. It returns the formatted
time for the time tuple being passed.
Example
import time

# returns the formatted time
print(time.asctime(time.localtime(time.time())))
Output:
The sleep() method of time module is used to stop the execution of the script for a given amount of
time. The output will be delayed for the number of seconds provided as the float.
Example
import time
for i in range(0,5):
    print(i)
    # Each element will be printed after 1 second
    time.sleep(1)
Output:
0
1
2
3
4
The datetime module enables us to create custom date objects and perform various operations on dates,
such as comparison.
To work with dates as date objects, we have to import the datetime module into the Python source code.
Consider the following example to get the datetime object representation for the current time.
Example
import datetime
# returns the current datetime object
print(datetime.datetime.now())
Output:
2020-04-04 13:18:35.252578
We can create a date object by passing the desired date to the datetime constructor. The numbers must be
written without leading zeros, because a literal such as 04 is a syntax error in Python 3.
Example
import datetime
# returns the datetime object for the specified date
print(datetime.datetime(2020, 4, 4))
Output:
2020-04-04 00:00:00
We can also specify the time along with the date to create the datetime object. Consider the following
example.
Example:
import datetime

# returns the datetime object for the specified time
print(datetime.datetime(2020,4,4,1,26,40))
Output:
2020-04-04 01:26:40
In the above code, we have passed the year, month, day, hour, minute, and second attributes to the
datetime() constructor in sequential order.
Output:
fun hours
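Comparison of datetime objects, mentioned earlier as one of the operations the module supports, can be sketched as follows. The function name describe, the 08:00 to 16:00 window, and the two labels are assumptions made for illustration.

```python
from datetime import datetime

def describe(moment):
    """Return a label depending on whether moment falls between 08:00 and 16:00."""
    start = moment.replace(hour=8, minute=0, second=0, microsecond=0)
    end = moment.replace(hour=16, minute=0, second=0, microsecond=0)
    # datetime objects support chained comparison like numbers do
    return "working hours" if start < moment < end else "fun hours"

print(describe(datetime(2020, 4, 4, 10, 30)))
print(describe(datetime(2020, 4, 4, 20, 0)))
print(describe(datetime.now()))
```

The first two calls use fixed times, so they always give "working hours" and "fun hours"; the last call depends on the current time.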
The calendar module
Python provides a calendar module that contains various functions to work with calendars.
Consider the following example to print the calendar for March 2020.
Example
import calendar
cal = calendar.month(2020,3)
# printing the calendar of March 2020
print(cal)
Output:
     March 2020
Mo Tu We Th Fr Sa Su
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31
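The calendar module can also answer questions about a month without printing it. A short sketch for the same month, March 2020:

```python
import calendar

# monthrange() gives the weekday of the 1st (Monday is 0) and the number of days.
first_weekday, days = calendar.monthrange(2020, 3)
print(first_weekday, days)

leap = calendar.isleap(2020)
print(leap)
```

The weekday value 6 means that 1 March 2020 was a Sunday, which agrees with the printed calendar.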
SHUTIL MODULE:
The shutil module offers high-level operations on files, such as copying, creating, and removing them.
It is one of Python's standard utility modules. This module helps in automating the process of
copying and removing files and directories.
Copying Files to another directory
shutil.copy() method in Python is used to copy the content of the source file to the destination file or
directory. It also preserves the file's permission mode, but other metadata of the file, like its
creation and modification times, is not preserved.
The source must represent a file, but the destination can be a file or a directory. If the destination is a
directory, then the file will be copied into the destination using the base filename from the source.
Also, the destination must be writable. If the destination is a file and already exists, then it will be
replaced with the source file; otherwise a new file will be created.
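Both destination cases described above can be sketched inside a temporary directory. The file and folder names are hypothetical, and the use of tempfile is an assumption made so that the example cleans up after itself.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "main.py")          # hypothetical file names
    with open(src, "w") as f:
        f.write("print('hello')\n")

    # Destination is a file: copied (or overwritten) under that name.
    dest_file = shutil.copy(src, os.path.join(workdir, "main2.py"))

    # Destination is a directory: the source's base name is reused.
    sub = os.path.join(workdir, "backup")
    os.mkdir(sub)
    dest_in_dir = shutil.copy(src, sub)

    names = sorted(os.listdir(workdir))
    copied_name = os.path.basename(dest_in_dir)
    print(names, copied_name)
```

shutil.copy() returns the path of the newly created file, which is why the result can be printed as the destination path.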
Output:
Destination path: path/main2.py
Output:
Before copying file:
['archive (2)', 'c.jpg', 'c.PNG', 'Capture.PNG', 'cc.jpg', 'check.zip', 'cv.csv', 'd.png', 'Done! Terms
And Conditions Generator – The Fastest Free Terms and Conditions Generator!.pdf', 'file1.csv', 'gfg',
'haarcascade_frontalface_alt2.xml', 'log_transformed.jpg', 'main.py', 'nba.csv', 'new_gfg.png', 'r.gif',
'Result -_ Terms and Conditions are Ready!.pdf', 'rockyou.txt', 'sample.txt']
Metadata: os.stat_result(st_mode=33206, st_ino=2251799814202896, st_dev=1689971230, st_nlink=1,
st_uid=0, st_gid=0, st_size=1916, st_atime=1612953710, st_mtime=1612613202, st_ctime=1612522940)
After copying file:
['archive (2)', 'c.jpg', 'c.PNG', 'Capture.PNG', 'cc.jpg', 'check.zip', 'cv.csv', 'd.png', 'Done! Terms
And Conditions Generator – The Fastest Free Terms and Conditions Generator!.pdf', 'file1.csv', 'gfg',
'haarcascade_frontalface_alt2.xml', 'log_transformed.jpg', 'main.py', 'nba.csv', 'new_gfg.png', 'r.gif',
'Result -_ Terms and Conditions are Ready!.pdf', 'rockyou.txt', 'sample.txt']
Metadata: os.stat_result(st_mode=33206, st_ino=2251799814202896, st_dev=1689971230, st_nlink=1,
st_uid=0, st_gid=0, st_size=1916, st_atime=1612953710, st_mtime=1612613202, st_ctime=1612522940)
Destination path: csv/gfg/check.txt
Output:
Destination path: csv/gfg/main_2.py
shutil.copytree() method recursively copies an entire directory tree rooted at source (src) to the
destination directory. The destination directory, named by dst, must not already exist. It will be
created during copying.
Syntax: shutil.copytree(src, dst, symlinks = False, ignore = None, copy_function = copy2,
ignore_dangling_symlinks = False)
Parameters:
src: A string representing the path of the source directory.
dst: A string representing the path of the destination.
symlinks (optional): This parameter accepts True or False. If it is True, symbolic links in the source tree
are copied as symbolic links together with their metadata; if it is False, the contents and metadata of the
linked files are copied to the new tree.
ignore (optional): If ignore is given, it must be a callable that will receive as its arguments the
directory being visited by copytree(), and a list of its contents, as returned by os.listdir().
copy_function (optional): The default value of this parameter is copy2. We can use another copy
function, like copy(), for this parameter.
ignore_dangling_symlinks (optional): When set to True, this parameter silences the exception raised if
the file pointed to by a symlink doesn't exist.
Return Value: This method returns a string which represents the path of the newly created directory.
Output:
Before copying file:
['cc.jpg', 'check.txt', 'log_transformed.jpg', 'main.py', 'main2.py', 'main_2.py']
After copying file:
['cc.jpg', 'check.txt', 'dest', 'log_transformed.jpg', 'main.py', 'main2.py', 'main_2.py']
Destination path: C:/Users/ksaty/csv/gfg/dest
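A minimal sketch of copytree() with the ignore parameter follows. The directory layout and file names are hypothetical; shutil.ignore_patterns() is a helper from the standard library that builds a suitable callable.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "source")            # hypothetical layout
    os.makedirs(os.path.join(src, "nested"))
    for name in ("main.py", "notes.txt", os.path.join("nested", "data.txt")):
        with open(os.path.join(src, name), "w") as f:
            f.write("demo")

    dst = os.path.join(workdir, "dest")               # must not exist yet
    result = shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.txt"))

    copied = sorted(os.listdir(dst))
    nested = os.listdir(os.path.join(dst, "nested"))
    print(copied, nested)
```

The nested folder is still created in the copy, but it stays empty because its only file matches the ignored pattern.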
Removing a Directory:
shutil.rmtree() is used to delete an entire directory tree. The path must point to a directory (but not a
symbolic link to a directory).
Syntax: shutil.rmtree(path, ignore_errors=False, onerror=None)
Parameters:
path: A path-like object representing a file path. A path-like object is either a string or bytes object
representing a path.
ignore_errors: If ignore_errors is true, errors resulting from failed removals will be ignored.
onerror: If ignore_errors is false or omitted, such errors are handled by calling a handler specified by
onerror.
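The following sketch removes a small directory tree and then shows the effect of ignore_errors. The tree is created in a temporary location, which is an assumption made to keep the example harmless.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
with open(os.path.join(root, "a", "b", "file.txt"), "w") as f:
    f.write("demo")

shutil.rmtree(root)                       # removes the directory and everything below it
removed = not os.path.exists(root)

# The path no longer exists; ignore_errors=True suppresses the resulting error.
shutil.rmtree(root, ignore_errors=True)
print(removed)
```

Unlike os.rmdir(), rmtree() does not require the directory to be empty, so it should be used with care.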
Finding files:
shutil.which() method tells the path to an executable application that would be run if the
given cmd was called. This method can be used to find a file on a computer which is present on the
PATH.
Syntax: shutil.which(cmd, mode = os.F_OK | os.X_OK, path = None)
Parameters:
cmd: A string representing the file.
mode: This parameter specifies the checks the method should perform. os.F_OK tests the existence of the
path and os.X_OK checks if the path can be executed; in other words, mode determines whether the file
exists and is executable.
path: This parameter specifies the search path to be used. If no path is specified, then the PATH entry of
os.environ is used.
Return Value: This method returns the path to an executable application, or None if no match is found.
Output:
D:\Installation_bulk\Scripts\anaconda.EXE
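A small sketch of which() follows. It looks up the running Python interpreter by name inside its own folder, which avoids depending on what happens to be installed; the second command name is hypothetical and assumed not to exist.

```python
import os
import shutil
import sys

# Look the running interpreter up by its file name, searching only its own folder.
folder, exe_name = os.path.split(sys.executable)
found = shutil.which(exe_name, path=folder)
print(found)

# A command that is not on the PATH yields None rather than raising an error.
missing = shutil.which("no-such-command-for-demo")   # hypothetical name
print(missing)
```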
Python Glob Module
Python has many built-in modules for performing various tasks. One such task is finding all the files on
our system whose names follow a similar pattern, such as a common file extension, a common prefix, or any
other similarity between two or more file names. Several Python modules can be used for this, and the glob
module is one of the most convenient. In this tutorial, we will learn about the glob module in detail: how
we can use it inside a program, what its key features are, and where it can be applied.
With the help of the Python glob module, we can search for all the path names that match a specific pattern
(which is defined by us). The pattern is defined according to the rules used by the Unix shell, and the
results are returned in arbitrary order. The module goes through the list of files at the given location on
our local disk and keeps only those that follow the pattern.
Python has several functions that are used to list the files in a specified folder that match a pattern
defined in a program. The following functions are involved:
1. fnmatch()
2. scandir()
3. path.expandvars()
4. path.expanduser()
The first two functions in the list above, fnmatch.fnmatch() and os.scandir(), are used together to perform
the pattern matching; no sub-shell is invoked. The matching filenames are returned in arbitrary order. One
catch is that the glob module treats files whose names begin with a dot (.) as special cases, unlike the
fnmatch.fnmatch() function. The last two functions, os.path.expandvars() and os.path.expanduser(), can be
used for shell variable and tilde expansion.
We cannot use just any pattern for filename matching. We have to follow a specific set of rules while
defining the pattern for the filename pattern matching functions in the glob module.
In this section, we briefly discuss the rules to keep in mind while defining a pattern. We do not go into
depth, as they are not our primary focus in this tutorial.
Following is the set of rules for the pattern that we define inside the glob module's pattern matching
functions:
o The pattern follows the standard rules of UNIX path expansion.
o The path we define inside the pattern should be either absolute or relative; we can't define
any unclear path inside the pattern.
o The only special characters allowed inside the pattern are the two wildcards '*' and '?', plus
character ranges expressed with [].
o The rules of the pattern are applied to each filename segment, and matching stops at the path
separator, i.e., '/'.
These are the general rules for the patterns we define inside the glob module functions, and we have to
follow them for the pattern matching to succeed.
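The wildcard rules above can be sketched with a few empty files in a temporary directory. The file names and the helper function match are assumptions made for illustration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    for name in ("code1.py", "code2.py", "codeA.py", "notes.txt", ".hidden.py"):
        open(os.path.join(workdir, name), "w").close()

    def match(pattern):
        """Return the sorted base names matched by pattern inside workdir."""
        paths = glob.glob(os.path.join(workdir, pattern))
        return sorted(os.path.basename(p) for p in paths)

    star = match("*.py")              # '*' matches any run of characters
    single = match("code?.py")        # '?' matches exactly one character
    digits = match("code[0-9].py")    # '[]' matches one character from a range
    print(star, single, digits)
```

The file .hidden.py is not matched by '*.py', which illustrates the special treatment of names that begin with a dot.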
We have already discussed how helpful pattern matching is when we are looking for similar files on our
disk. Here, we will discuss the applications of the glob module.
Following are some applications of the Python glob module:
1. Sometimes we want to search for files that share a certain prefix, a common string in the middle of
their names, or the same extension. Without glob, we would have to write code that scans the whole
directory and filters the result ourselves. The functions of the glob module perform this task
easily and save us time.
2. The glob module is also very useful when one of our programs has to list all the files in a given
file system whose names match a similar pattern. The glob module can perform this task without
running a separate sub-shell.
So, looking at these applications, we can see how important this module is and where we can use it to
reduce the complexity of the code and save time.
Glob Module Functions
Now, we will discuss the functions of the glob module and understand how they work inside a Python
program and how they help us in the pattern matching task. The glob module provides the following
functions, with which we can carry out filename pattern matching:
1. iglob()
2. glob()
3. escape()
Now, we will briefly discuss these functions and then use each of them in an example program to get the
list of file names following a pattern (that we define in the function) in the output.
iglob() Function: The iglob() function returns an iterator that yields the matching file names in arbitrary
order. In other words, iglob() gives us a Python generator that we can use to list the files under a given
directory. The iterator yields the same values as glob(), but without storing all of the filenames
simultaneously.
Syntax: Following is the syntax for using the iglob() function of glob module inside a Python program:
iglob(pathname, *, recursive=False)
As we can see in the syntax of the iglob() function, it takes two parameters, separated by a bare '*':
(i) pathname: This is the mandatory parameter of the function. It is a string containing the path
specification, i.e., the location and the pattern (such as '*.py' for a file extension) for which iglob()
will collect the file names. A relative pattern is resolved against the current working directory.
(ii) recursive: It is an optional parameter of the iglob() function, and it takes only bool values (True
or False). If it is True, the pattern '**' matches any files and zero or more directories and
subdirectories.
(iii) '*': This is not a parameter that we pass. In the signature, it indicates that the arguments after it
(here, recursive) must be passed by keyword.
Now, let's use this iglob() function in an example program so that we can understand its implementation
and function in a better way.
Example 1:
Look at the following Python program with the implementation of iglob() function:
# Import glob module in the program
import glob as gb
# Initialize a variable
inVar = gb.iglob("*.py") # Set Pattern in iglob() function
# Returning class type of variable
print(type(inVar))
# Printing list of names of all files that matched the pattern
print("List of the all the files in the directory having extension .py: ")
for py in inVar:
    print(py)
Output:
<class 'generator'>
List of the all the files in the directory having extension .py:
adding.py
changing.py
code#1.py
code#2.py
code-3.py
code-4.py
code.py
code37.py
code_5.py
code_6.py
configuring.py
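The effect of the recursive parameter can be sketched as follows. The directory layout is hypothetical and is created in a temporary folder.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    os.makedirs(os.path.join(workdir, "pkg", "sub"))
    for rel in ("top.py", os.path.join("pkg", "mid.py"),
                os.path.join("pkg", "sub", "deep.py")):
        open(os.path.join(workdir, rel), "w").close()

    # Without recursive=True only the top-level directory is searched.
    flat = glob.iglob(os.path.join(workdir, "*.py"))
    # With recursive=True, '**' also descends into subdirectories.
    deep = glob.iglob(os.path.join(workdir, "**", "*.py"), recursive=True)

    flat_names = sorted(os.path.basename(p) for p in flat)
    deep_names = sorted(os.path.basename(p) for p in deep)
    print(flat_names)
    print(deep_names)
```

Because iglob() returns a generator, the names are consumed inside the with block, while the temporary files still exist.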
glob() Function: With the help of the glob() function, we can also get the list of files that match a
specific pattern (which we define inside the function). The pathname we pass must be a string containing a
path specification. The glob() function returns the same values as the iglob() function, but it stores all
the filenames in a list instead of yielding them one at a time.
Syntax:
Following is the syntax for using the glob() function of the glob module inside a Python program:
glob(pathname, *, recursive=False)
As we can see in the syntax of the glob() function, it takes the same parameters as the iglob() function,
with the same meaning as described above. Now, let's use this glob() function in an example program so that
we can understand its implementation and function in a better way.
Example 2: Look at the following Python program with the implementation of glob() function:
Output:
List of the all the files in the directory having extension .py:
adding.py
changing.py
code#1.py
code#2.py
code-3.py
code-4.py
code.py
code37.py
code_5.py
code_6.py
configuring.py
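A minimal sketch of a glob() call follows. It is not the original listing; the file names are hypothetical and are created in a temporary directory so that the result is predictable.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    for name in ("adding.py", "changing.py", "readme.txt"):
        open(os.path.join(workdir, name), "w").close()

    matches = glob.glob(os.path.join(workdir, "*.py"))
    is_list = isinstance(matches, list)   # glob() returns a list, not a generator
    names = sorted(os.path.basename(p) for p in matches)
    print(is_list)
    for name in names:
        print(name)
```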
escape() Function: The escape() function escapes all the special characters ('?', '*' and '[') in the given
pathname. It is very handy for locating files that have these characters in their file names, because it
lets us match an arbitrary literal string that may contain special characters.
Syntax:
Following is the syntax for using the escape() function of glob module inside a Python program:
escape(pathname)
The escape() function should be used with either the glob() or the iglob() function so that we can print
the list of file names in the output as a result. Now, let's use this escape() function in an example
program so that we can understand its implementation and function in a better way.
Example 3: Look at the following Python program with the implementation of escape() function:
Output:
Following is the list of filenames that match the special character sequence of escape function:
code-3.py
code-4.py
code_5.py
code_6.py
code#1.py
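A minimal sketch of escape() follows. It is not the original listing; the two file names are hypothetical and chosen so that the escaped and unescaped patterns give different results.

```python
import glob
import os
import tempfile

# Each special character is wrapped in brackets so that it matches itself.
escaped = glob.escape("data[1]*.txt")
print(escaped)

with tempfile.TemporaryDirectory() as workdir:
    for name in ("data[1].txt", "data1.txt"):
        open(os.path.join(workdir, name), "w").close()

    # Unescaped, '[1]' is a character range and matches the single character 1.
    raw = glob.glob(os.path.join(workdir, "data[1].txt"))
    # Escaped, the brackets are taken literally.
    safe = glob.glob(os.path.join(workdir, glob.escape("data[1].txt")))

    raw_names = [os.path.basename(p) for p in raw]
    safe_names = [os.path.basename(p) for p in safe]
    print(raw_names, safe_names)
```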
Object-oriented Programming:
Class − A user-defined prototype for an object that defines a set of attributes that characterize any object
of the class. The attributes are data members (class variables and instance variables) and methods,
accessed via dot notation.
Class variable − A variable that is shared by all instances of a class. Class variables are defined within a
class but outside any of the class's methods. Class variables are not used as frequently as instance
variables are.
Data member − A class variable or instance variable that holds data associated with a class and its
objects.
Function overloading − The assignment of more than one behavior to a particular function. The
operation performed varies by the types of objects or arguments involved.
Instance variable − A variable that is defined inside a method and belongs only to the current instance
of a class.
Inheritance − The transfer of the characteristics of a class to other classes that are derived from it.
Instance − An individual object of a certain class. An object obj that belongs to a class Circle, for
example, is an instance of the class Circle.
Instantiation − The creation of an instance of a class.
Method − A special kind of function that is defined in a class definition.
Object − A unique instance of a data structure that's defined by its class. An object comprises both data
members (class variables and instance variables) and methods.
Operator overloading − The assignment of more than one function to a particular operator.
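Several of the terms above can be seen together in one small class. This is a sketch; the class Circle echoes the name used in the definition of an instance, and its contents are assumptions made for illustration.

```python
class Circle:
    count = 0                      # class variable, shared by every instance

    def __init__(self, radius):
        self.radius = radius       # instance variable, owned by this object
        Circle.count += 1

    def area(self):                # method
        return 3.14159 * self.radius ** 2

    def __add__(self, other):      # operator overloading for '+'
        return Circle(self.radius + other.radius)


a = Circle(1)                      # instantiation
b = Circle(2)
c = a + b                          # calls Circle.__add__
print(c.radius, Circle.count)
```

Adding two circles creates a third instance, so the shared class variable count ends at 3.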
ATTRIBUTE AND METHODS IN PYTHON:
As an object-oriented programming language, Python puts emphasis on objects. Classes are the blueprints
from which objects are created. Each class in Python can have many attributes, including functions.
Accessing the attributes of a class
To check the attributes of a class and also to manipulate those attributes, we use several Python built-in
functions, as shown below.
getattr() − A Python function used to access an attribute of a class.
hasattr() − A Python function used to verify the presence of an attribute in a class.
setattr() − A Python function used to set an additional attribute in a class.
The program below illustrates the use of the above functions to access class attributes in Python.
Example
class StateInfo:
    StateName = 'Telangana'
    population = '3.5 crore'

    def func1(self):
        print("Hello from my function")

print(getattr(StateInfo, 'StateName'))
# returns True if the object has the attribute
print(hasattr(StateInfo, 'population'))
setattr(StateInfo, 'ForestCover', 39)
print(getattr(StateInfo, 'ForestCover'))
print(hasattr(StateInfo, 'func1'))
Output
Running the above code gives us the following result −
Telangana
True
39
True
Accessing the method of a class
To access the method of a class, we need to instantiate the class into an object. Then we can access the
method as an instance method of the class, as shown in the program below. Through the self
parameter, instance methods can access attributes and other methods on the same object.
Example
class StateInfo:
    StateName = 'Telangana'
    population = '3.5 crore'

    def func1(self):
        print("Hello from my function")

print(getattr(StateInfo, 'StateName'))
# returns True if the object has the attribute
print(hasattr(StateInfo, 'population'))
setattr(StateInfo, 'ForestCover', 39)
print(getattr(StateInfo, 'ForestCover'))
print(hasattr(StateInfo, 'func1'))
obj = StateInfo()
obj.func1()
Output
Running the above code gives us the following result −
Telangana
True
39
True
Hello from my function
Accessing the method of one class from another
To access the method of one class from another class, we need to pass the called class (or an instance of
it) to the calling class. The example below passes the class itself and instantiates it inside the calling
method.
Example
class ClassOne:
    def m_class1(self):
        print("Method in class 1")

# Defining the calling class
class ClassTwo(object):
    def __init__(self, c1):
        self.c1 = c1

    # The calling method
    def m_class2(self):
        Object_inst = self.c1()
        Object_inst.m_class1()

# Passing the class ClassOne as an argument to ClassTwo
obj = ClassTwo(ClassOne)
obj.m_class2()
Output
Running the above code gives us the following result −
Method in class 1
Python Inheritance:
Inheritance is an important aspect of the object-oriented paradigm. Inheritance provides code reusability
to the program because we can use an existing class to create a new class instead of creating it from
scratch. In inheritance, the child class acquires the properties and can access all the data members and
functions defined in the parent class. A child class can also provide its specific implementation to the
functions of the parent class. In this section of the tutorial, we will discuss inheritance in detail.
In Python, a derived class can inherit a base class by naming the base class in parentheses after the
derived class name. Consider the following syntax to inherit a base class into the derived class.
Syntax
class derived-class(<base class>):
    <class-suite>
A class can inherit multiple classes by naming all of them inside the parentheses. Consider the following
syntax.
Syntax
class derived-class(<base class 1>, <base class 2>, ..... <base class n>):
    <class-suite>
Example:
class Animal:
    def speak(self):
        print("Animal Speaking")

# child class Dog inherits the base class Animal
class Dog(Animal):
    def bark(self):
        print("dog barking")

d = Dog()
d.bark()
d.speak()
Output:
dog barking
Animal Speaking
Multi-level inheritance is possible in Python, as in other object-oriented languages. Multi-level
inheritance is achieved when a derived class inherits another derived class. There is no limit on the
number of levels up to which multi-level inheritance can be achieved in Python.
The syntax of multi-level inheritance is given below.
Syntax
class class1:
    <class-suite>
class class2(class1):
    <class-suite>
class class3(class2):
    <class-suite>
.
.
Example
class Animal:
    def speak(self):
        print("Animal Speaking")

# The child class Dog inherits the base class Animal
class Dog(Animal):
    def bark(self):
        print("dog barking")

# The child class DogChild inherits another child class Dog
class DogChild(Dog):
    def eat(self):
        print("Eating bread...")

d = DogChild()
d.bark()
d.speak()
d.eat()
Output:
dog barking
Animal Speaking
Eating bread...
Python Multiple inheritance
Python provides us the flexibility to inherit multiple base classes in the child class.
Syntax
class Base1:
    <class-suite>

class Base2:
    <class-suite>
.
.
.
class BaseN:
    <class-suite>

class Derived(Base1, Base2, ...... BaseN):
    <class-suite>
Example
class Calculation1:
    def Summation(self, a, b):
        return a + b
class Calculation2:
    def Multiplication(self, a, b):
        return a * b
class Derived(Calculation1, Calculation2):
    def Divide(self, a, b):
        return a / b
d = Derived()
print(d.Summation(10, 20))
print(d.Multiplication(10, 20))
print(d.Divide(10, 20))
Output:
30
200
0.5
The issubclass(sub, sup) built-in function checks the relationship between the specified classes. It
returns True if the first class is a subclass of the second class, and False otherwise.
Example
class Calculation1:
    def Summation(self, a, b):
        return a + b
class Calculation2:
    def Multiplication(self, a, b):
        return a * b
class Derived(Calculation1, Calculation2):
    def Divide(self, a, b):
        return a / b
d = Derived()
print(issubclass(Derived, Calculation2))
print(issubclass(Calculation1, Calculation2))
Output:
True
False
The isinstance(obj, class) built-in function checks the relationship between objects and classes. It returns
True if the first parameter, obj, is an instance of the second parameter, class (or of one of its
subclasses), and False otherwise.
Example
class Calculation1:
    def Summation(self, a, b):
        return a + b
class Calculation2:
    def Multiplication(self, a, b):
        return a * b
class Derived(Calculation1, Calculation2):
    def Divide(self, a, b):
        return a / b
d = Derived()
print(isinstance(d, Derived))
Output:
True
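As a small supplementary sketch (the empty class bodies are a simplification and not part of the original example), the following shows that isinstance() also accepts a base class or a tuple of classes, and that a class counts as a subclass of itself:

```python
class Calculation1:
    pass


class Calculation2:
    pass


class Derived(Calculation1, Calculation2):
    pass


d = Derived()

# An instance of Derived is also an instance of each of its base classes.
print(isinstance(d, Calculation1))          # True
# A tuple means "an instance of any of these classes".
print(isinstance(d, (int, Calculation2)))   # True
# Every class is considered a subclass of itself.
print(issubclass(Derived, Derived))         # True
```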
Method Overriding
A child class can provide its own implementation of a method that is already defined in its parent class.
When a parent class method is redefined in the child class with a specific implementation, the concept is
called method overriding. We use method overriding when the child class needs a different definition of a
parent class method.
Example
class Animal:
    def speak(self):
        print("speaking")
class Dog(Animal):
    def speak(self):
        print("Barking")
d = Dog()
d.speak()
Output:
Barking
class Bank:
    def getroi(self):
        return 10
class SBI(Bank):
    def getroi(self):
        return 7

class ICICI(Bank):
    def getroi(self):
        return 8
b1 = Bank()
b2 = SBI()
b3 = ICICI()
print("Bank Rate of interest:", b1.getroi())
print("SBI Rate of interest:", b2.getroi())
print("ICICI Rate of interest:", b3.getroi())
Output:
Bank Rate of interest: 10
SBI Rate of interest: 7
ICICI Rate of interest: 8
Abstraction is an important aspect of object-oriented programming. In Python, we can also perform data
hiding by adding a double underscore (__) as a prefix to the attribute which is to be hidden. Python then
mangles the attribute name, so it cannot be accessed under its original name from outside the class
through the object.
Example
class Employee:
    __count = 0
    def __init__(self):
        Employee.__count = Employee.__count + 1
    def display(self):
        print("The number of employees", Employee.__count)
emp = Employee()
emp2 = Employee()
try:
    print(emp.__count)
finally:
    emp.display()
Output:
The number of employees 2
AttributeError: 'Employee' object has no attribute '__count'
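The following is a minimal sketch of what happens behind the double-underscore prefix. It reuses the Employee class from the example; the count() helper and the hidden variable are illustrative names, not part of the original. Python stores __count under the mangled name _Employee__count, so the attribute is hidden by convention rather than truly private.

```python
class Employee:
    __count = 0  # stored on the class under the mangled name _Employee__count

    def __init__(self):
        Employee.__count += 1

    def count(self):  # hypothetical helper, used here instead of display()
        return Employee.__count


emp = Employee()
emp2 = Employee()

# Access through the original name fails outside the class ...
try:
    emp.__count
    hidden = False
except AttributeError:
    hidden = True

print(hidden)                 # True
# ... but the mangled name is still reachable.
print(emp._Employee__count)   # 2
```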
Polymorphism is taken from the Greek words poly (many) and morphism (forms). It means that the
same function name can be used for different types. This makes programming more intuitive and easier.
In Python, there are different ways to achieve polymorphism. So let's move ahead and see how
polymorphism works in Python.
Polymorphism in Python
A child class inherits all the methods from the parent class. However, in some situations, the method
inherited from the parent class doesn't quite fit into the child class. In such cases, you will have to
re-implement the method in the child class.
There are different ways to use polymorphism in Python. You can use different functions, class
methods or objects to define polymorphism. So, let's move ahead and have a look at each of these
methods in detail.
Polymorphism with Function and Objects
You can create a function that can take any object, allowing for polymorphism.
Let's take an example and create a function called "func()" which will take an object which we will
name "obj". Now, let's give the function something to do that uses the 'obj' object we passed to it. In
this case, let's call the methods type() and color(), each of which is defined in the two classes 'Tomato'
and 'Apple'. Now, we have to create instances of both the 'Tomato' and 'Apple' classes if we
don't have them already:
class Tomato():
    def type(self):
        print("Vegetable")
    def color(self):
        print("Red")
class Apple():
    def type(self):
        print("Fruit")
    def color(self):
        print("Red")

def func(obj):
    obj.type()
    obj.color()

obj_tomato = Tomato()
obj_apple = Apple()
func(obj_tomato)
func(obj_apple)
Output:
Vegetable
Red
Fruit
Red
UNIT-3 NumPy Arrays and Vectorized Computation
NumPy Arrays:
NumPy, which stands for 'Numerical Python', is a Python package consisting of multidimensional array
objects and a collection of routines for processing those arrays. Using NumPy, mathematical and
logical operations on arrays can be performed. This tutorial explains the basics of NumPy such
as its architecture and environment. It also discusses the various array functions, types of
indexing, etc.
Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package Numarray
was also developed, having some additional functionalities. In 2005, Travis Oliphant created
NumPy package by incorporating the features of Numarray into Numeric package. There are
many contributors to this open source project.
Operations using NumPy:
Using NumPy, a developer can perform the following operations −
Mathematical and logical operations on arrays.
Fourier transforms and routines for shape manipulation.
Operations related to linear algebra. NumPy has in-built functions for linear algebra and
random number generation.
NumPy – A Replacement for MATLAB
NumPy is often used along with packages like SciPy (Scientific Python)
and Matplotlib (plotting library). This combination is widely used as a replacement for
MATLAB, a popular platform for technical computing. However, the Python alternative to MATLAB is
now seen as a more modern and complete programming language.
NumPy – Environment:
The standard Python distribution doesn't come bundled with the NumPy module. A lightweight
alternative is to install NumPy using the popular Python package installer, pip.
pip install numpy
The best way to enable NumPy is to use an installable binary package specific
to your operating system. These binaries contain the full SciPy stack (inclusive of NumPy, SciPy,
matplotlib, IPython, SymPy and nose packages along with core Python).
Windows
Anaconda (from https://www.continuum.io) is a free Python distribution for the SciPy stack. It is
also available for Linux and Mac.
Canopy (https://www.enthought.com/products/canopy/) is available as a free as well as a
commercial distribution with the full SciPy stack for Windows, Linux and Mac.
Python (x,y): It is a free Python distribution with the SciPy stack and Spyder IDE for Windows OS.
(Downloadable from https://www.python-xy.github.io/)
NumPy Array objects:
The most important object defined in NumPy is an N-dimensional array type called ndarray. It
describes the collection of items of the same type. Items in the collection can be accessed using
a zero-based index.
Every item in an ndarray takes the same size of block in the memory. Each element in ndarray
is an object of data-type object (called dtype). Any item extracted from ndarray object (by
slicing) is represented by a Python object of one of array scalar types. The following diagram
shows a relationship between ndarray, data type object (dtype) and array scalar type −
An instance of the ndarray class can be constructed by different array creation routines described
later in the tutorial. The basic ndarray is created using the array function in NumPy as follows −
numpy.array
It creates an ndarray from any object exposing the array interface, or from any method
that returns an array.
numpy.array(object, dtype = None, copy = True, order = None, subok = False, ndmin = 0)
The above constructor takes the following parameters −
Parameter   Description
object      Any object exposing the array interface, an object whose __array__ method returns an
            array, or any (nested) sequence
dtype       Desired data type of the array, optional
copy        Optional. By default (True), the object is copied
order       C (row major) or F (column major) or A (any) (default)
subok       By default, the returned array is forced to be a base class array. If True, sub-classes
            are passed through
ndmin       Specifies the minimum number of dimensions of the resultant array
Example 1
import numpy as np
a = np.array([1,2,3])
print(a)
The output is as follows −
[1 2 3]
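The optional parameters described above can be tried out in a short sketch. The variable names a, b and c are illustrative only; the calls use the dtype and ndmin parameters of numpy.array.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=float)   # force a floating-point element type
b = np.array([1, 2, 3], ndmin=2)       # guarantee at least two dimensions
c = np.array([[1, 2], [3, 4]])         # a nested sequence gives a 2-D array

print(a.dtype)   # float64
print(b.shape)   # (1, 3)
print(c.ndim)    # 2
```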
NumPy - Data Types:
NumPy supports a much greater variety of numerical types than Python does. The following
table shows different scalar data types defined in NumPy.
Data type    Description
bool_        Boolean (True or False) stored as a byte
int_         Default integer type (same as C long; normally either int64 or int32)
intc         Identical to C int (normally int32 or int64)
intp         Integer used for indexing (same as C ssize_t; normally either int32 or int64)
int8         Byte (-128 to 127)
int16        Integer (-32768 to 32767)
int32        Integer (-2147483648 to 2147483647)
int64        Integer (-9223372036854775808 to 9223372036854775807)
uint8        Unsigned integer (0 to 255)
uint16       Unsigned integer (0 to 65535)
uint32       Unsigned integer (0 to 4294967295)
uint64       Unsigned integer (0 to 18446744073709551615)
float_       Shorthand for float64
float16      Half precision float: sign bit, 5 bits exponent, 10 bits mantissa
float32      Single precision float: sign bit, 8 bits exponent, 23 bits mantissa
float64      Double precision float: sign bit, 11 bits exponent, 52 bits mantissa
complex_     Shorthand for complex128
complex64    Complex number, represented by two 32-bit floats (real and imaginary components)
complex128   Complex number, represented by two 64-bit floats (real and imaginary components)
NumPy numerical types are instances of dtype (data-type) objects, each having unique
characteristics. The dtypes are available as np.bool_, np.float32, etc.
Data Type Objects (dtype):
A data type object describes interpretation of fixed block of memory corresponding to an array,
depending on the following aspects −
Type of data (integer, float or Python object)
Size of data
Byte order (little-endian or big-endian)
In case of structured type, the names of fields, data type of each field and part of the
memory block taken by each field.
If data type is a subarray, its shape and data type
The byte order is decided by prefixing '<' or '>' to data type. '<' means that encoding is little-
endian (least significant is stored in smallest address). '>' means that encoding is big-endian
(most significant byte is stored in smallest address).
A dtype object is constructed using the following syntax −
numpy.dtype(object, align, copy)
The parameters are −
Object − To be converted to data type object
Align − If true, adds padding to the field to make it similar to C-struct
Copy − Makes a new copy of dtype object. If false, the result is reference to builtin data
type object
Example
# using array-scalar type
import numpy as np
dt = np.dtype(np.int32)
print(dt)
The output is as follows −
int32
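The byte-order prefix and the structured-type case described above can be sketched as follows. The field names 'name' and 'age' and the sample records are hypothetical, chosen only for illustration.

```python
import numpy as np

# '>' requests big-endian storage; 'i4' is a 4-byte integer.
big = np.dtype('>i4')
print(big.itemsize)    # 4

# A structured dtype with two hypothetical fields.
student = np.dtype([('name', 'S20'), ('age', 'i1')])
records = np.array([(b'abc', 21), (b'xyz', 18)], dtype=student)

print(student.names)   # ('name', 'age')
print(records['age'])  # [21 18]
```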
NumPy – Array:
NumPy arrays are a very good substitute for Python lists. They are faster than
Python lists and take less memory. For those who are
unaware of what NumPy arrays are, let's begin with a definition. They are a special kind of
data structure: essentially multi-dimensional matrices, or lists of fixed size whose elements
are all of the same type.
2D-Array:
Here, all parameters of numpy.array other than object are optional. So, do not worry even if you do not
understand a lot about the other parameters.
Attributes of an Array
The main attributes of an array are shown in the following example:
import numpy as np
#creating an array to understand its attributes
A = np.array([[1,2,3],[1,2,3],[1,2,3]])
print("Array A is:\n",A)
#type of array
print("Type:", type(A))
#Shape of array
print("Shape:", A.shape)
#no. of dimensions
print("Rank:", A.ndim)
#size of array
print("Size:", A.size)
#type of each element in the array
print("Element type:", A.dtype)
ndarray(shape, dtype): Creates an array of the given shape without initializing its entries (the
values are arbitrary)
array(array_object): Creates an array from the given list or tuple
zeros(shape): Creates an array of the given shape with all zeros
ones(shape): Creates an array of the given shape with all ones
full(shape, fill_value, dtype): Creates an array of the given shape filled with fill_value
arange(range): Creates an array with the specified range
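As a brief sketch of the routines listed above that are not used in the example that follows (zeros, ones, full and arange), with illustrative variable names and shapes:

```python
import numpy as np

z = np.zeros((2, 3))        # 2x3 array of 0.0
o = np.ones((2, 2))         # 2x2 array of 1.0
f = np.full((2, 2), 7)      # 2x2 array filled with the value 7
r = np.arange(2, 10, 2)     # start at 2, stop before 10, step 2

print(z.shape)   # (2, 3)
print(f)
print(r)         # [2 4 6 8]
```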
import numpy as np
#creating array using ndarray
A = np.ndarray(shape=(2,2), dtype=float)
print("Array with random values:\n", A)
# Creating array from list
B = np.array([[1, 2, 3], [4, 5, 6]])
print ("Array created with list:\n", B)
# Creating array from tuple
C = np.array((1 , 2, 3))
print ("Array created with tuple:\n", C)
Output:
import numpy as np
#creating an array to understand indexing
A = np.array([[1,2,1],[7,5,3],[9,4,8]])
print("Array A is:\n",A)
#accessing elements at any given indices
B = A[[0, 1, 2], [0, 1, 2]]
print("Elements at indices (0, 0),(1, 1), (2, 2) are : \n", B)
#changing the value of elements at a given index
A[0,0] = 12
A[1,1] = 4
A[2,2] = 7
print("Array A after change is:\n", A)
Output:
import numpy as np
#creating a 3d array to understand indexing in a 3D array
I = np.array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
print("3D Array is:\n", I)
print("Elements at index (0,0,1):\n", I[0,0,1])
print("Elements at index (1,0,1):\n", I[1,0,1])
#changing the value of elements at a given index
I[1,0,2] = 31
print("3D Array after change is:\n", I)
Output:
Array creation using List : Arrays are used to store multiple values in one single variable. The core
Python language does not have a built-in array type, but Python lists can be used instead.
Example :
arr = [1, 2, 3, 4, 5]
arr1 = ["geeks", "for", "geeks"]
# Python program to create
# an array
arr = [1, 2, 3, 4, 5]
for i in arr:
    print(i)
Output:
1
2
3
4
5
Output:
The new created array is : 1 2 3 1 5
Reshaping array: We can use reshape method to reshape an array. Consider an array with shape
(a1, a2, a3, …, aN). We can reshape and convert it into another array with shape (b1, b2, b3, …,
bM).
The only required condition is: a1 x a2 x a3 … x aN = b1 x b2 x b3 … x bM . (i.e original size of
array remains unchanged.)
numpy.reshape (array, shape, order = ‘C’) : Shapes an array without changing data of array.
# Python Program illustrating
# numpy.reshape() method
import numpy as geek
array = geek.arange(8)
print("Original array : \n", array)
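A minimal sketch of the size condition stated above, assuming the same eight-element array; the names two_by_four and cube are illustrative. A reshape succeeds only when the product of the new dimensions equals the original number of elements.

```python
import numpy as np

array = np.arange(8)

two_by_four = array.reshape(2, 4)        # 2 * 4 = 8 elements
cube = np.reshape(array, (2, 2, 2))      # 2 * 2 * 2 = 8 elements

print(two_by_four.shape)   # (2, 4)
print(cube.shape)          # (2, 2, 2)

# 3 * 3 = 9 does not match 8 elements, so NumPy raises ValueError.
try:
    array.reshape(3, 3)
except ValueError as err:
    print("reshape failed:", err)
```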
A
[[0 1]
[2 3]]
A
[4 5 6 7 8 9]
A
[ 4 7 10 13 16 19]
Flatten array: We can use the flatten method to get a copy of the array collapsed into one dimension. It
accepts an order argument. The default value is 'C' (for row-major order). Use 'F' for column-major
order.
numpy.ndarray.flatten(order = 'C') : Return a copy of the array collapsed into one dimension.
# Python Program illustrating
# numpy.flatten() method
import numpy as geek
array = geek.array([[1, 2], [3, 4]])
# using flatten method (row-major order); flatten returns a new array
print(array.flatten())
# using flatten method (column-major order)
print(array.flatten('F'))
Output :
[1 2 3 4]
[1 3 2 4]
FUNCTION         DESCRIPTION
empty()          Return a new array of given shape and type, without initializing entries
empty_like()     Return a new array with the same shape and type as a given array
eye()            Return a 2-D array with ones on the diagonal and zeros elsewhere.
ones()           Return a new array of given shape and type, filled with ones
ones_like()      Return an array of ones with the same shape and type as a given array
zeros()          Return a new array of given shape and type, filled with zeros
zeros_like()     Return an array of zeros with the same shape and type as a given array
full_like()      Return a full array with the same shape and type as a given array.
asanyarray()     Convert the input to an ndarray, but pass ndarray subclasses through
fromfunction()   Construct an array by executing a function over each coordinate
tri()            An array with ones at and below the given diagonal and zeros elsewhere
Example:
import numpy as np
a = np.arange(10)
s = slice(2,7,2)
print(a[s])
Its output is as follows −
[2 4 6]
In the above example, an ndarray object is prepared by the arange() function. Then a slice object
is defined with start, stop, and step values 2, 7, and 2 respectively. When this slice object is
passed to the ndarray, the part of it starting at index 2 and running up to (but not including)
index 7 with a step of 2 is sliced.
The same result can also be obtained by giving the slicing parameters separated by a colon :
(start:stop:step) directly to the ndarray object.
Example:
import numpy as np
a = np.arange(10)
b = a[2:7:2]
print(b)
Here, we will get the same output −
[2 4 6]
If only one parameter is put, a single item corresponding to the index will be returned. If a : is
inserted after it, all items from that index onwards will be extracted. If two parameters
(with : between them) are used, items between the two indexes (not including the stop index)
with default step one are sliced.
Example:
# slice single item
import numpy as np
a = np.arange(10)
b = a[5]
print(b)
Its output is as follows −
5
Example:
# slice items starting from index
import numpy as np
a = np.arange(10)
print(a[2:])
Now, the output would be −
[2 3 4 5 6 7 8 9]
Example:
# slice items between indexes
import numpy as np
a = np.arange(10)
print(a[2:5])
Here, the output would be −
[2 3 4]
The above description applies to multi-dimensional ndarray too.
Example:
import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print(a)
Example:
import numpy as np
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
rows = np.array([[0,0],[3,3]])
cols = np.array([[0,2],[0,2]])
y = x[rows,cols]
print('The corner elements of this array are:')
print(y)
The output of this program is as follows −
Our array is:
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
The corner elements of this array are:
[[ 0 2]
[ 9 11]]
The resultant selection is an ndarray object containing corner elements.
Advanced and basic indexing can be combined by using one slice (:) or ellipsis (…) with an
index array. The following example uses slice for row and advanced index for column. The
result is the same when slice is used for both. But advanced index results in copy and may have
different memory layout.
Example:
import numpy as np
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
# slicing
z = x[1:4,1:3]
print('After slicing, our array becomes:')
print(z)
print('\n')
# using advanced index for column
y = x[1:4,[1,2]]
print('Slicing using advanced index for column:')
print(y)
The output of this program would be as follows −
Our array is:
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
After slicing, our array becomes:
[[ 4 5]
[ 7 8]
[10 11]]
Slicing using advanced index for column:
[[ 4 5]
[ 7 8]
[10 11]]
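The remark that an advanced index results in a copy can be checked with a short sketch. It rebuilds the same 4x3 array with arange and uses np.shares_memory to test whether two arrays share data.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

z = x[1:4, 1:3]       # basic slicing returns a view on x
y = x[1:4, [1, 2]]    # advanced indexing returns a copy

print(np.shares_memory(x, z))   # True
print(np.shares_memory(x, y))   # False

# Writing through the view changes x; writing to the copy does not.
z[0, 0] = 100
y[0, 1] = -1
print(x[1, 1], x[1, 2])         # 100 5
```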
Boolean Array Indexing
This type of advanced indexing is used when the resultant object is meant to be the result of
Boolean operations, such as comparison operators.
Example:
import numpy as np
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
# Now we will print the items greater than 5
print('The items greater than 5 are:')
print(x[x > 5])
The output of this program would be −
Our array is:
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
The items greater than 5 are:
[ 6 7 8 9 10 11]
Example:
import numpy as np
a = np.array([np.nan, 1,2,np.nan,3,4,5])
print(a[~np.isnan(a)])
Its output would be −
[ 1. 2. 3. 4. 5.]
Operations on Numpy Arrays
NumPy is a Python package whose name stands for 'Numerical Python'. It is a library for scientific
computing that provides a powerful n-dimensional array object and tools for integrating C,
C++ and so on. It is also useful for linear algebra, random number generation and so on.
NumPy arrays can also be used as an efficient multi-dimensional container for
generic data.
NumPy Array: A NumPy array is a powerful N-dimensional array object which is in the form of
rows and columns. We can initialize NumPy arrays from nested Python lists and access their
elements. A NumPy array on a structural level is made up of a combination of:
The Data pointer indicates the memory address of the first byte in the array.
The Data type or dtype pointer describes the kind of elements that are contained within the
array.
The shape indicates the shape of the array.
The strides are the number of bytes that should be skipped in memory to go to the next
element.
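The structural parts listed above can be inspected directly on an array. This is a small sketch using an illustrative 2x3 array of 4-byte integers; the strides follow from the element size and the row length.

```python
import numpy as np

A = np.arange(6, dtype=np.int32).reshape(2, 3)

print(A.dtype)      # int32
print(A.shape)      # (2, 3)
# Each element is 4 bytes, so the next column is 4 bytes away
# and the next row is 3 * 4 = 12 bytes away.
print(A.strides)    # (12, 4)
```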
Operations on Numpy Array
Arithmetic Operations:
import numpy as np
# first array, reconstructed from the output shown below
arr1 = np.arange(4, dtype=float).reshape(2, 2)
print('First array:')
print(arr1)
print('\nSecond array:')
arr2 = np.array([12, 12])
print(arr2)
print('\nAdding the two arrays:')
print(np.add(arr1, arr2))
print('\nSubtracting the two arrays:')
print(np.subtract(arr1, arr2))
print('\nMultiplying the two arrays:')
print(np.multiply(arr1, arr2))
print('\nDividing the two arrays:')
print(np.divide(arr1, arr2))
Output:
First array:
[[ 0. 1.]
[ 2. 3.]]
Second array:
[12 12]
Adding the two arrays:
[[ 12. 13.]
[ 14. 15.]]
Subtracting the two arrays:
[[-12. -11.]
[-10. -9.]]
Multiplying the two arrays:
[[ 0. 12.]
[ 24. 36.]]
Dividing the two arrays:
[[ 0. 0.08333333]
[ 0.16666667 0.25 ]]
numpy.reciprocal()
This function returns the reciprocal of the argument, element-wise. For integer-typed elements with
absolute values larger than 1, the result is always 0 because of integer division, and for integer 0
an overflow warning is issued.
Example:
import numpy as np
# first array, reconstructed from the output shown below
arr = np.array([25, 1.33, 1, 1, 100])
print('Our array is:')
print(arr)
print('\nAfter applying reciprocal function:')
print(np.reciprocal(arr))
arr2 = np.array([25], dtype = int)
print('\nThe second array is:')
print(arr2)
print('\nAfter applying reciprocal function:')
print(np.reciprocal(arr2))
Output
Our array is:
[ 25. 1.33 1. 1. 100. ]
After applying reciprocal function:
[ 0.04 0.7518797 1. 1. 0.01 ]
The second array is:
[25]
After applying reciprocal function:
[0]
numpy.power()
This function treats elements in the first input array as the base and returns it raised to the power
of the corresponding element in the second input array.
Output:
First array is:
[ 5 10 15]
Applying power function:
[ 25 100 225]
Second array is:
[1 2 3]
Applying power function again:
[ 5 100 3375]
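The listing that produced this output is not included here. A sketch that is consistent with the printed values, assuming the first call raises every element to the power 2, could look like this:

```python
import numpy as np

a = np.array([5, 10, 15])
print('First array is:')
print(a)

print('\nApplying power function:')
squared = np.power(a, 2)          # scalar exponent, applied to every element
print(squared)

b = np.array([1, 2, 3])
print('\nSecond array is:')
print(b)

print('\nApplying power function again:')
elementwise = np.power(a, b)      # element i of a raised to element i of b
print(elementwise)
```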
numpy.mod()
This function returns the remainder of division of the corresponding elements in the input array.
The function numpy.remainder() also produces the same result.
Output:
First array:
[ 5 15 20]
Second array:
[2 5 9]
Applying mod() function:
[1 0 2]
Applying remainder() function:
[1 0 2]
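The code for this output is also not included. A sketch with the same input values, showing that mod() and remainder() agree, might be:

```python
import numpy as np

a = np.array([5, 15, 20])
b = np.array([2, 5, 9])

print('First array:')
print(a)
print('\nSecond array:')
print(b)

print('\nApplying mod() function:')
m = np.mod(a, b)
print(m)

print('\nApplying remainder() function:')
r = np.remainder(a, b)
print(r)
```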
Array functions In Python:
NumPy is a Python package used for scientific computing, so it supports a wide variety
of functions for computation. The functions supported by NumPy include mathematical,
universal, window, and logical functions (older releases also shipped financial functions).
Universal functions operate element-wise on arrays and support broadcasting, type casting, and
several other standard features, while window functions are used
in signal processing. We will be learning mathematical functions in detail in this article.
Mathematical Functions in NumPy
The core of NumPy is written in C. Hence, its mathematical functions are closely associated
with the functions present in the math.h library in C.
1. Arithmetic Functions
Function                  Description
add(arr1, arr2, ...)      Add arrays element-wise
power(arr1, arr2)         Raise each element of the first array to the power of the corresponding
                          element in the second array (element-wise)
float_power(arr1, arr2)   Same as power, but the computation is carried out in floating point
                          (at least float64), so the result is always a floating-point array
Code:
import numpy as np
a = np.array([10,20,30])
b= np.array([2,3,4])
print("division of a and b :",np.divide(a,b))
print("true division of a :",np.true_divide(a,b))
print("floor_division of a and b :",np.floor_divide(a,b))
print("float_power of a raised to b :",np.float_power(a,b))
print("fmod of a and b :",np.fmod(a,b))
print("mod of a and b :",np.mod(a,b))
print("quotient and remainder of a and b :",np.divmod(a,b))
print("remainders when a/b :",np.remainder(a,b))
Output:
2. Trigonometric Functions
Function      Description
arctan(arr)   Returns the trigonometric inverse tangent, element-wise
import numpy as np
angles = np.array([0, np.pi/2, np.pi])  # input array of angles in radians
sin_angles = np.sin(angles)
cosine_angles = np.cos(angles)
tan_angles = np.tan(angles)
rad2degree = np.degrees(angles)
print("sin of angles:", sin_angles)
print("cosine of angles:", cosine_angles)
print("tan of angles:", tan_angles)
print("angles in degrees:", rad2degree)
Output:
3. Exponential and Logarithmic Functions
Function     Description
log10(arr)   Returns log base 10 of an input array, element-wise
import numpy as np
a = np.array([1,2,3,4,5])
a_log = np.log(a)
a_exp = np.exp(a)
print("log of input array a is:",a_log)
print("exponent of input array a is:",a_exp)
Output:
4. Rounding Functions
Function                Description
around(arr, decimals)   Rounds the elements of an input array to the given number of decimal places
round_(arr, decimals)   Alias of around (removed in NumPy 2.0; use round or around instead)
rint(arr)               Rounds the elements of an input array to the nearest integer
fix(arr)                Rounds the elements of an input array to the nearest integer towards
                        zero
Code:
import numpy as np
a = np.array([1.23,4.165,3.8245])
# np.round works in both old and new NumPy releases
rounded_a = np.round(a,2)
print(rounded_a)
Output:
5. Miscellaneous Functions
Function Description
With the NumPy package, we can easily solve many kinds of data processing tasks without
writing complex loops. It is very helpful for us to control our code as well as the performance of
the program. In this part, we want to introduce some mathematical and statistical functions.
See the following table for a listing of mathematical and statistical functions:
sum: Calculate the sum of all the elements in an array or along the axis
>>> a = np.array([[2,4], [3,5]])
>>> np.sum(a, axis=0)
array([5, 9])
diff: Calculate the discrete difference along the given axis
>>> np.diff(a, axis=0)
array([[1, 1]])
import numpy as np
a = np.array([1, 3, 5, 7])
a.tofile('test2.dat')
a2 = np.fromfile('test2.dat', dtype=int)
print(a == a2)
Output:
The Linear Algebra module of NumPy offers various methods to apply linear algebra on any
numpy array.
One can find:
rank, determinant, trace, etc. of an array.
eigen values of matrices
matrix and vector products (dot, inner, outer,etc. product), matrix exponentiation
solve linear or tensor equations and much more!
# Importing numpy as np
import numpy as np
A = np.array([[6, 1, 1],
[4, -2, 5],
[2, 8, 7]])
# Rank of a matrix
print("Rank of A:", np.linalg.matrix_rank(A))
# Trace of matrix A
print("\nTrace of A:", np.trace(A))
# Determinant of a matrix
print("\nDeterminant of A:", np.linalg.det(A))
# Inverse of matrix A
print("\nInverse of A:\n", np.linalg.inv(A))
print("\nMatrix A raised to power 3:\n",
np.linalg.matrix_power(A, 3))
Output:
Rank of A: 3
Trace of A: 11
Determinant of A: -306.0
Inverse of A:
[[ 0.17647059 -0.00326797 -0.02287582]
[ 0.05882353 -0.13071895 0.08496732]
[-0.11764706 0.1503268 0.05228758]]
Matrix A raised to power 3:
[[336 162 228]
[406 162 469]
[698 702 905]]
Matrix eigenvalues Functions
numpy.linalg.eigh(a, UPLO='L') : This function is used to return the eigenvalues and
eigenvectors of a complex Hermitian (conjugate symmetric) or a real symmetric matrix. It returns
two objects, a 1-D array containing the eigenvalues of a, and a 2-D square array or matrix
(depending on the input type) of the corresponding eigenvectors (in columns).
# Python program explaining
# eigh() function
import numpy as np
from numpy import linalg as geek
# Creating an array using array
# function
a = np.array([[1, -2j], [2j, 5]])
print("Array is :",a)
# calculating the eigenvalues and eigenvectors
# using eigh() function
c, d = geek.eigh(a)
print("Eigen value is :", c)
print("Eigen vector is :", d)
Output :
Array is : [[ 1.+0.j, 0.-2.j],
[ 0.+2.j, 5.+0.j]]
[ 0.00000000+0.38268343j, 0.00000000-0.92387953j]]
numpy.linalg.eig(a) : This function is used to compute the eigenvalues and right eigenvectors of a
square array.
# Python program explaining
# eig() function
import numpy as np
from numpy import linalg as geek
# Creating an array using diag
# function
a = np.diag((1, 2, 3))
print("Array is :",a)
# calculating the eigenvalues and eigenvectors
# using eig() function
c, d = geek.eig(a)
print("Eigen value is :",c)
print("Eigen vector is :",d)
Output :
Array is : [[1 0 0]
 [0 2 0]
 [0 0 3]]
Eigen value is : [1. 2. 3.]
Eigen vector is : [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
FUNCTION             DESCRIPTION
linalg.multi_dot()   Compute the dot product of two or more arrays in a single function call,
                     while automatically selecting the fastest evaluation order.
tensordot()          Compute tensor dot product along specified axes for arrays >= 1-D.
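As a brief sketch of linalg.multi_dot() from the table above, using three illustrative matrices of ones, the chained product matches the result of multiplying the matrices one after another:

```python
import numpy as np

A = np.ones((2, 3))
B = np.ones((3, 4))
C = np.ones((4, 2))

# multi_dot chooses the multiplication order itself.
chained = np.linalg.multi_dot([A, B, C])
manual = A @ B @ C

print(chained.shape)                  # (2, 2)
print(np.allclose(chained, manual))   # True
```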
UNIT 4 Data Analysis with Pandas
Pandas is an open-source Python library providing high-performance data manipulation and
analysis tools built on its powerful data structures. The name Pandas is derived from the term
'Panel Data', an econometrics term for multidimensional data.
In 2008, developer Wes McKinney started developing pandas when he needed a high-performance,
flexible tool for data analysis. Prior to Pandas, Python was mainly used for
data munging and preparation. It had very little contribution towards data analysis. Pandas
solved this problem. Using Pandas, we can accomplish five typical steps in the processing and
analysis of data, regardless of the origin of data: load, prepare, manipulate, model, and
analyze. Python with Pandas is used in a wide range of academic and
commercial domains, including finance, economics, statistics, analytics, etc.
Key Features of Pandas
Fast and efficient DataFrame object with default and customized indexing.
Tools for loading data into in-memory data objects from different file formats.
Data alignment and integrated handling of missing data.
Reshaping and pivoting of data sets.
Label-based slicing, indexing and subsetting of large data sets.
Columns from a data structure can be deleted or inserted.
Group by data for aggregation and transformations.
High performance merging and joining of data.
Time Series functionality.
Python Pandas - Environment Setup:
The standard Python distribution doesn't come bundled with the Pandas module. A lightweight
alternative is to install Pandas using the popular Python package installer, pip.
pip install pandas
If you install Anaconda Python package, Pandas will be installed by default with the following :
Windows
Anaconda (from https://www.continuum.io) is a free Python distribution for SciPy
stack. It is also available for Linux and Mac.
Canopy (https://www.enthought.com/products/canopy/) is available as free as well as
commercial distribution with full SciPy stack for Windows, Linux and Mac.
Python (x,y) is a free Python distribution with SciPy stack and Spyder IDE for Windows
OS. (Downloadable from http://python-xy.github.io/)
PANDAS Data Structures:
Pandas deals with the following three data structures −
Series
DataFrame
Panel
These data structures are built on top of Numpy array, which means they are fast.
Dimension & Description:
The best way to think of these data structures is that the higher dimensional data structure is a
container of its lower dimensional data structure. For example, DataFrame is a container of
Series, Panel is a container of DataFrame.
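The container relationship described above can be checked directly. The following is a minimal sketch; the column names and values are made up for illustration only.

```python
import pandas as pd

# A small DataFrame with illustrative (hypothetical) data
df = pd.DataFrame({'Name': ['Tom', 'Jack'], 'Age': [28, 34]})

# Selecting one column of a DataFrame gives back a Series
col = df['Age']
print(type(df).__name__)   # DataFrame
print(type(col).__name__)  # Series
```

Each column of the two-dimensional DataFrame is itself a one-dimensional Series that shares the row index of the frame.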
Mr. D.Gangadhar
Associate Professor
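Data Structure   Dimensions   Description
Series           1            One-dimensional labeled array of homogeneous data; size immutable.
DataFrame        2            Two-dimensional labeled tabular structure with heterogeneous columns; size mutable.
Panel            3            Three-dimensional labeled structure with heterogeneous data; size mutable.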
Series:
Series is a one-dimensional array like structure with homogeneous data. For example, the
following series is a collection of integers 10, 23, 56, …
10 23 56 17 52 61 73 90 26 72
Key Points
Homogeneous data
Size Immutable
DataFrame:
DataFrame is a two-dimensional structure with heterogeneous data. For example, consider a table
that holds the data of a sales team of an organization with their overall performance rating.
The data is represented in rows and columns. Each column represents an attribute and each row
represents a person.
Data Type of Columns
The data types of the four columns are as follows −
Column Type
Name String
Age Integer
Gender String
Rating Float
Key Points
Heterogeneous data
Size Mutable
Data Mutable
Panel
Panel is a three-dimensional data structure with heterogeneous data. It is hard to represent a
panel graphically, but it can be illustrated as a container of DataFrames. Note that Panel was
deprecated in pandas 0.20 and removed in pandas 0.25, so it is not available in current releases.
Key Points
Heterogeneous data
Size Mutable
Data Mutable
pandas.Series
A pandas Series can be created using the following constructor −
pandas.Series( data, index, dtype, copy)
The parameters of the constructor are as follows −
1 data
data takes various forms like ndarray, list, constants
2 index
Index values must be hashable and the same length as data. Defaults to np.arange(n) if no
index is passed.
3 dtype
dtype is for data type. If None, data type will be inferred
4 copy
Copy data. Default False
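As a rough illustration of the four constructor parameters, the sketch below passes each of them explicitly. The labels and values are chosen only for demonstration.

```python
import numpy as np
import pandas as pd

# data: a plain Python list; index: custom labels; dtype: force floats
s = pd.Series(data=[10, 23, 56], index=['x', 'y', 'z'], dtype='float64')
print(s)

# copy=True makes the Series independent of the source array
arr = np.array([1, 2, 3])
s_copy = pd.Series(arr, copy=True)
arr[0] = 99              # modify the source array afterwards
print(s_copy.iloc[0])    # the copied Series still holds the original value 1
```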
A series can be created using various inputs like −
Array
Dict
Scalar value or constant
Create an Empty Series
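The most basic series that can be created is an empty Series.
Example
#import the pandas library and aliasing as pd
import pandas as pd
#pass dtype explicitly; the default dtype of an empty Series differs between pandas versions
s = pd.Series(dtype='float64')
print(s)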
Its output is as follows −
Series([], dtype: float64)
Create a Series from ndarray
If data is an ndarray, then the index passed must be of the same length. If no index is passed,
then by default the index will be range(n), where n is the array length, i.e., [0, 1, 2, ..., len(array)-1].
Example:
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)
Its output is as follows −
0 a
1 b
2 c
3 d
dtype: object
We did not pass any index, so by default, it assigned the indexes ranging from 0 to len(data)-1,
i.e., 0 to 3.
Example:
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data,index=[100,101,102,103])
print(s)
Its output is as follows −
100 a
101 b
102 c
103 d
dtype: object
We passed the index values here. Now we can see the customized indexed values in the output.
Create a Series from dict
A dict can be passed as input. If no index is specified, the dictionary keys are used to
construct the index (in insertion order in current pandas versions; older versions sorted them).
If an index is passed, the values in data corresponding to the labels in the index will be pulled out.
Example:
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data)
print(s)
Its output is as follows −
a 0.0
b 1.0
c 2.0
dtype: float64
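The second case mentioned above, where an index is passed along with the dict, can be sketched as follows. The label 'd' is deliberately absent from the dict to show what happens to a label with no matching key.

```python
import pandas as pd

data = {'a': 0., 'b': 1., 'c': 2.}

# Labels are looked up in the dict; 'd' has no entry, so it becomes NaN
s = pd.Series(data, index=['b', 'c', 'd', 'a'])
print(s)
```

The result follows the order of the index that was passed, not the order of the dict keys.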
Create a Series from Scalar
If data is a scalar value, an index must be provided. The value will be repeated to match the
length of index
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
s = pd.Series(5, index=[0, 1, 2, 3])
print(s)
Its output is as follows −
0 5
1 5
2 5
3 5
dtype: int64
Accessing Data from Series with Position
Data in the series can be accessed similar to that in an ndarray.
Example:
Retrieve the first element. Counting starts from zero for the array, which means the first
element is stored at position zero, and so on.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve the first element by position
print(s.iloc[0])
Its output is as follows −
1
Example:
Retrieve the last three elements.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve the last three elements
print(s.iloc[-3:])
Its output is as follows −
c 3
d 4
e 5
dtype: int64
Retrieve Data Using Label (Index)
A Series is like a fixed-size dict in that you can get and set values by index label.
Example 1
Retrieve a single element using index label value.
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve a single element
print(s['a'])
Its output is as follows −
1
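Label-based retrieval also works for several labels at once, and a label that is not in the index raises a KeyError. The following is a small sketch of both cases using the same series; the variable names subset and missing are introduced here only for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# Retrieve multiple elements by passing a list of index labels
subset = s[['a', 'c', 'd']]
print(subset)

# A label that does not exist in the index raises KeyError
missing = False
try:
    s['f']
except KeyError:
    missing = True
    print("label 'f' is not in the index")
```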
Python Pandas – DataFrame:
A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in
rows and columns.
Features of DataFrame
Potentially columns are of different types
Size – Mutable
Labeled axes (rows and columns)
Can Perform Arithmetic operations on rows and columns
Structure
Let us assume that we are creating a data frame with student’s data. You can think of it as an
SQL table or a spreadsheet data representation.
pandas.DataFrame
A pandas DataFrame can be created using the following constructor −
pandas.DataFrame( data, index, columns, dtype, copy)
The parameters of the constructor are as follows −
1 data
data takes various forms like ndarray, series, map, lists, dict, constants and also another
DataFrame.
2 index
Row labels to use for the resulting frame. Optional; defaults to np.arange(n) if no index is
passed.
3 columns
Column labels to use for the resulting frame. Optional; defaults to np.arange(n) if no column
labels are passed.
4 dtype
Data type of each column.
5 copy
Whether to copy data from the inputs. Historically the default was False; recent pandas versions
default to None, which copies dict input but not DataFrame or 2D ndarray input.
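A short sketch of these parameters in use is given below. The labels r1..r3 and x, y are arbitrary names picked for the illustration.

```python
import numpy as np
import pandas as pd

values = np.array([[1, 2], [3, 4], [5, 6]])

# data, index, columns and dtype passed explicitly
df = pd.DataFrame(data=values,
                  index=['r1', 'r2', 'r3'],
                  columns=['x', 'y'],
                  dtype='float64')
print(df)

# Without index and columns, both default to 0..n-1
default_df = pd.DataFrame(values)
print(default_df.index.tolist())    # [0, 1, 2]
print(default_df.columns.tolist())  # [0, 1]
```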
Create DataFrame:
A pandas DataFrame can be created using various inputs like −
Lists
dict
Series
Numpy ndarrays
Another DataFrame
In the subsequent sections of this chapter, we will see how to create a DataFrame using these
inputs.
Create an Empty DataFrame
The most basic DataFrame that can be created is an empty DataFrame.
Example:
#import the pandas library and aliasing as pd
import pandas as pd
df = pd.DataFrame()
print(df)
Its output is as follows −
Empty DataFrame
Columns: []
Index: []
Create a DataFrame from Lists
The DataFrame can be created using a single list or a list of lists.
Example
import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)
Its output is as follows −
0
0 1
1 2
2 3
3 4
4 5
Example
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)
Its output is as follows −
Name Age
0 Alex 10
1 Bob 12
2 Clarke 13
Create a DataFrame from Dict of ndarrays / Lists:
All the ndarrays must be of the same length. If an index is passed, then the length of the index
should be equal to the length of the arrays. If no index is passed, then by default the index
will be range(n), where n is the array length.
Example:
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print(df)
Its output is as follows −
Name Age
0 Tom 28
1 Jack 34
2 Steve 29
3 Ricky 42
Note − Observe the values 0,1,2,3. They are the default index assigned to each using the
function range(n).
Example
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data, index=['rank1','rank2','rank3','rank4'])
print(df)
Its output is as follows −
Name Age
rank1 Tom 28
rank2 Jack 34
rank3 Steve 29
rank4 Ricky 42
Note − Observe, the index parameter assigns an index to each row.
Create a DataFrame from List of Dicts:
List of Dictionaries can be passed as input data to create a DataFrame. The dictionary keys are
by default taken as column names.
Example:
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)
Its output is as follows −
a b c
0 1 2 NaN
1 5 10 20.0
Note − Observe, NaN (Not a Number) is appended in missing areas.
Example:
The following example shows how to create a DataFrame with a list of dictionaries, row
indices, and column indices.
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
#With two column indices, values same as dictionary keys
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
#With two column indices with one index with other name
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print(df1)
print(df2)
Its output is as follows −
#df1 output
a b
first 1 2
second 5 10
#df2 output
a b1
first 1 NaN
second 5 NaN
Note − Observe that the df2 DataFrame is created with a column label (b1) that is not a
dictionary key, so NaNs are filled in for that column. In contrast, df1 is created with column
labels that match the dictionary keys, so no NaNs are appended.
Create a DataFrame from Dict of Series:
Dictionary of Series can be passed to form a DataFrame. The resultant index is the union of all
the series indexes passed.
Example:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df)
Its output is as follows −
one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4
Note − Observe that series one has no label ‘d’, so in the result the d label of column one is
filled with NaN.
Let us now understand column selection, addition, and deletion through examples.
Column Selection
We will understand this by selecting a column from the DataFrame.
Example:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df['one'])
Its output is as follows −
a 1.0
b 2.0
c 3.0
d NaN
Name: one, dtype: float64
Column Addition
We will understand this by adding a new column to an existing data frame.
Example:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
# Adding a new column to an existing DataFrame object with a column label by passing a new Series
print("Adding a new column by passing as Series:")
df['three']=pd.Series([10,20,30],index=['a','b','c'])
print(df)
print("Adding a new column using the existing columns in DataFrame:")
df['four']=df['one']+df['three']
print(df)
Its output is as follows −
Adding a new column by passing as Series:
one two three
a 1.0 1 10.0
b 2.0 2 20.0
c 3.0 3 30.0
d NaN 4 NaN
Adding a new column using the existing columns in DataFrame:
one two three four
a 1.0 1 10.0 11.0
b 2.0 2 20.0 22.0
c 3.0 3 30.0 33.0
d NaN 4 NaN NaN
Column Deletion
Columns can be deleted or popped; let us take an example to understand how.
Example:
# Using the previous DataFrame, we will delete a column
# using del function
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
'three' : pd.Series([10,20,30], index=['a','b','c'])}
df = pd.DataFrame(d)
print ("Our dataframe is:")
print(df)
# using del statement
print("Deleting the first column using DEL function:")
del df['one']
print(df)
# using pop method
print("Deleting another column using POP function:")
df.pop('two')
print(df)
Its output is as follows −
Our dataframe is:
one two three
a 1.0 1 10.0
b 2.0 2 20.0
c 3.0 3 30.0
d NaN 4 NaN
Deleting the first column using DEL function:
two three
a 1 10.0
b 2 20.0
c 3 30.0
d 4 NaN
Deleting another column using POP function:
three
a 10.0
b 20.0
c 30.0
d NaN
Row Selection, Addition, and Deletion:
We will now understand row selection, addition and deletion through examples. Let us begin
with the concept of selection.
Selection by Label
Rows can be selected by passing a row label to the loc indexer.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df.loc['b'])
Its output is as follows −
one 2.0
two 2.0
Name: b, dtype: float64
The result is a series with labels as column names of the DataFrame. And, the Name of the
series is the label with which it is retrieved.
Selection by integer location
Rows can be selected by passing an integer location to the iloc indexer.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df.iloc[2])
Its output is as follows −
one 3.0
two 3.0
Name: c, dtype: float64
Slice Rows:
Multiple rows can be selected using ‘ : ’ operator.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df[2:4])
Its output is as follows −
one two
c 3.0 3
d NaN 4
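The three selection styles above can be compared side by side. The sketch below, using the same DataFrame, shows that a label slice with loc includes both endpoints while a position slice with iloc excludes the stop position; the variable names by_label and by_position are introduced only for this illustration.

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

# Label slice: both 'b' and 'c' are included
by_label = df.loc['b':'c']

# Position slice: positions 1 and 2 are included, position 3 is not
by_position = df.iloc[1:3]

print(by_label)
print(by_label.equals(by_position))  # True: both select rows b and c

# loc also accepts a row label and a column label together
print(df.loc['b', 'two'])
```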
Addition of Rows:
Add new rows to a DataFrame using the pd.concat function (the older DataFrame.append method was
removed in pandas 2.0). The new rows are appended at the end.
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
df = pd.concat([df, df2])
print(df)
Its output is as follows −
a b
0 1 2
1 3 4
0 5 6
1 7 8
Deletion of Rows:
Use an index label to delete or drop rows from a DataFrame. If the label is duplicated, then
multiple rows will be dropped.
Observe that in the above example the labels are duplicated. Let us drop a label and see how
many rows get dropped.
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
df = pd.concat([df, df2])
# Drop rows with label 0
df = df.drop(0)
print(df)
Its output is as follows −
a b
1 3 4
1 7 8
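If dropping several rows at once is not what is wanted, one option is to avoid duplicate labels in the first place. The sketch below assumes the ignore_index parameter of pd.concat, which renumbers the rows of the combined frame; the names combined and remaining are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])

# ignore_index=True discards the old labels and assigns 0..n-1
combined = pd.concat([df, df2], ignore_index=True)
print(combined)

# Label 0 is now unique, so only one row is dropped
remaining = combined.drop(0)
print(remaining)
```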
Python Pandas - Basic Functionality
Series Basic Functionality
Sr.No. Attribute or Method & Description
1 axes
Returns a list of the row axis labels
2 dtype
Returns the dtype of the object.
3 empty
Returns True if series is empty.
4 ndim
Returns the number of dimensions of the underlying data, by definition 1.
5 size
Returns the number of elements in the underlying data.
6 values
Returns the Series as ndarray.
7 head()
Returns the first n rows.
8 tail()
Returns the last n rows.
Let us now create a Series and see how each of the attributes tabulated above works.
Example:
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print(s)
Its output is as follows −
0 0.967853
1 -0.148368
2 -1.395906
3 -1.758394
dtype: float64
axes:
Returns the list of the labels of the series.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print("The axes are:")
print(s.axes)
Its output is as follows −
The axes are:
[RangeIndex(start=0, stop=4, step=1)]
The above result is a compact format of a list of values from 0 to 3, i.e., [0,1,2,3].
Empty:
Returns the Boolean value saying whether the Object is empty or not. True indicates that the
object is empty.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print("Is the Object empty?")
print(s.empty)
Its output is as follows −
Is the Object empty?
False
Ndim:
Returns the number of dimensions of the object. By definition, a Series is a 1D data structure,
so it returns 1.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print(s)
print("The dimensions of the object:")
print(s.ndim)
Its output is as follows −
0 0.175898
1 0.166197
2 -0.609712
3 -1.377000
dtype: float64
The dimensions of the object:
1
Size:
Returns the size(length) of the series.
import pandas as pd
import numpy as np
#Create a series with 2 random numbers
s = pd.Series(np.random.randn(2))
print(s)
print("The size of the object:")
print(s.size)
Its output is as follows −
0 3.078058
1 -1.207803
dtype: float64
The size of the object:
2
Values:
Returns the actual data in the series as an array.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print(s)
print("The actual data series is:")
print(s.values)
Its output is as follows −
0 1.787373
1 -0.605159
2 0.180477
3 -0.140922
dtype: float64
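head() and tail():
To view a small sample of a Series, use the head() and tail() methods. head() returns the first
n rows (observe the index values). The default number of elements to display is five, but you
may pass a custom number. For example, calling s.head(2) on a series of four random numbers
prints output such as:
The first two rows of the data series: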
0 0.720876
1 -0.765898
dtype: float64
tail() returns the last n rows(observe the index values). The default number of elements to
display is five, but you may pass a custom number.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print ("The original series is:")
print(s)
print("The last two rows of the data series:")
print(s.tail(2))
Its output is as follows −
The original series is:
0 -0.655091
1 -0.881407
2 -0.608592
3 -2.341413
dtype: float64
The last two rows of the data series:
2 -0.608592
3 -2.341413
dtype: float64
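Because the examples above use random numbers, their output changes on every run. As a complement, the following sketch walks through the same Series attributes on a small fixed series, so the results are reproducible. The values and labels are arbitrary.

```python
import pandas as pd

s = pd.Series([10, 23, 56, 17], index=['a', 'b', 'c', 'd'])

print(s.axes)     # list holding the single row-label index
print(s.dtype)    # data type of the elements
print(s.empty)    # False, the series has elements
print(s.ndim)     # 1, a Series is one-dimensional
print(s.size)     # 4 elements
print(s.values)   # underlying data as a NumPy array
print(s.head(2))  # first two rows: labels a and b
print(s.tail(2))  # last two rows: labels c and d
```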
Let us now understand DataFrame basic functionality. The following table lists the important
attributes and methods of a DataFrame.
1 T
Transposes rows and columns.
2 axes
Returns a list with the row axis labels and column axis labels as the only members.
3 dtypes
Returns the dtypes in this object.
4 empty
True if NDFrame is entirely empty [no items]; if any of the axes are of length 0.
5 ndim
Number of axes / array dimensions.
6 shape
Returns a tuple representing the dimensionality of the DataFrame.
7 size
Number of elements in the NDFrame.
8 values
Numpy representation of NDFrame.
9 head()
Returns the first n rows.
10 tail()
Returns last n rows.
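Let us now create a DataFrame and see how the above-mentioned attributes operate.
Example
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print("Our data series is:")
print(df)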
Its output is as follows −
Our data series is:
Name Age Rating
0 Tom 25 4.23
1 James 26 3.24
2 Ricky 25 3.98
3 Vin 23 2.56
4 Steve 30 3.20
5 Smith 29 4.60
6 Jack 23 3.80
T (Transpose)
Returns the transpose of the DataFrame. The rows and columns will interchange.
import pandas as pd
import numpy as np
# Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
# Create a DataFrame
df = pd.DataFrame(d)
print ("The transpose of the data series is:")
print(df.T)
Its output is as follows −
The transpose of the data series is:
0 1 2 3 4 5 6
Name Tom James Ricky Vin Steve Smith Jack
Age 25 26 25 23 30 29 23
Rating 4.23 3.24 3.98 2.56 3.2 4.6 3.8
axes
Returns the list of row axis labels and column axis labels.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("Row axis labels and column axis labels are:")
print(df.axes)
Its output is as follows −
Row axis labels and column axis labels are:
[RangeIndex(start=0, stop=7, step=1), Index(['Name', 'Age', 'Rating'], dtype='object')]
dtypes
Returns the data type of each column.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("The data types of each column are:")
print(df.dtypes)
Its output is as follows −
The data types of each column are:
Name object
Age int64
Rating float64
dtype: object
empty
Returns the Boolean value saying whether the Object is empty or not; True indicates that the
object is empty.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("Is the object empty?")
print(df.empty)
Its output is as follows −
Is the object empty?
False
ndim
Returns the number of dimensions of the object. By definition, DataFrame is a 2D object.
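import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print("Our object is:")
print(df)
print("The dimension of the object is:")
print(df.ndim)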
Its output is as follows −
Our object is:
Name Age Rating
0 Tom 25 4.23
1 James 26 3.24
2 Ricky 25 3.98
3 Vin 23 2.56
4 Steve 30 3.20
5 Smith 29 4.60
6 Jack 23 3.80
The dimension of the object is:
2
shape
Returns a tuple representing the dimensionality of the DataFrame. Tuple (a,b), where a
represents the number of rows and b represents the number of columns.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("Our object is:")
print(df)
print("The shape of the object is:")
print(df.shape)
Its output is as follows −
Our object is:
Name Age Rating
0 Tom 25 4.23
1 James 26 3.24
2 Ricky 25 3.98
3 Vin 23 2.56
4 Steve 30 3.20
5 Smith 29 4.60
6 Jack 23 3.80
The shape of the object is:
(7, 3)
size
Returns the number of elements in the DataFrame.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("Our object is:")
print(df)
print("The total number of elements in our object is:")
print(df.size)
Its output is as follows −
Our object is:
Name Age Rating
0 Tom 25 4.23
1 James 26 3.24
2 Ricky 25 3.98
3 Vin 23 2.56
4 Steve 30 3.20
5 Smith 29 4.60
6 Jack 23 3.80
The total number of elements in our object is:
21
values
Returns the actual data in the DataFrame as an NDarray.
import pandas as pd
import numpy as np
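#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print("Our object is:")
print(df)
print("The actual data in our data frame is:")
print(df.values)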
Its output is as follows −
Our object is:
Name Age Rating
0 Tom 25 4.23
1 James 26 3.24
2 Ricky 25 3.98
3 Vin 23 2.56
4 Steve 30 3.20
5 Smith 29 4.60
6 Jack 23 3.80
The actual data in our data frame is:
[['Tom' 25 4.23]
['James' 26 3.24]
['Ricky' 25 3.98]
['Vin' 23 2.56]
['Steve' 30 3.2]
['Smith' 29 4.6]
['Jack' 23 3.8]]
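head()
Returns the first n rows. The default is five, but you may pass a custom number.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print("Our data frame is:")
print(df)
print("The first two rows of the data frame is:")
print(df.head(2))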
Its output is as follows −
Our data frame is:
Name Age Rating
0 Tom 25 4.23
1 James 26 3.24
2 Ricky 25 3.98
3 Vin 23 2.56
4 Steve 30 3.20
5 Smith 29 4.60
6 Jack 23 3.80
The first two rows of the data frame is:
Name Age Rating
0 Tom 25 4.23
1 James 26 3.24
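tail()
Returns the last n rows. The default is five, but you may pass a custom number.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print("Our data frame is:")
print(df)
print("The last two rows of the data frame is:")
print(df.tail(2))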
Its output is as follows −
Our data frame is:
Name Age Rating
0 Tom 25 4.23
1 James 26 3.24
2 Ricky 25 3.98
3 Vin 23 2.56
4 Steve 30 3.20
5 Smith 29 4.60
6 Jack 23 3.80
The last two rows of the data frame is:
Name Age Rating
5 Smith 29 4.6
6 Jack 23 3.8
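Reindexing
Reindexing changes the row labels and column labels of a DataFrame. Labels that are not present
in the original data are filled with NaN.
Example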
import pandas as pd
import numpy as np
N=20
df = pd.DataFrame({
'A': pd.date_range(start='2016-01-01',periods=N,freq='D'),
'x': np.linspace(0,stop=N-1,num=N),
'y': np.random.rand(N),
'C': np.random.choice(['Low','Medium','High'],N).tolist(),
'D': np.random.normal(100, 10, size=(N)).tolist()
})
#reindex the DataFrame
df_reindexed = df.reindex(index=[0,2,5], columns=['A', 'C', 'B'])
print(df_reindexed)
Its output is as follows −
A C B
0 2016-01-01 Low NaN
2 2016-01-03 High NaN
5 2016-01-06 Low NaN
Example
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(10,3),columns=['col1','col2','col3'])
df2 = pd.DataFrame(np.random.randn(7,3),columns=['col1','col2','col3'])
df1 = df1.reindex_like(df2)
print(df1)
Its output is as follows −
col1 col2 col3
0 -2.467652 -1.211687 -0.391761
1 -0.287396 0.522350 0.562512
2 -0.255409 -0.483250 1.866258
3 -1.150467 -0.646493 -0.222462
4 0.152768 -2.056643 1.877233
5 -1.155997 1.528719 -1.343719
6 -1.015606 -1.245936 -0.295275
Note − Here, the df1 DataFrame is altered and reindexed like df2. The column names should
match, or else NaN will be added for the entire column label.
reindex() takes an optional parameter method which is a filling method with values as follows
−
pad/ffill − Fill values forward
bfill/backfill − Fill values backward
nearest − Fill from the nearest index values
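The three fill methods listed above can be compared on a small fixed Series. The sketch below is only an illustration; the index values 0 and 3 are chosen so that the new labels 1, 2 and 4 have no data of their own.

```python
import pandas as pd

s = pd.Series([10.0, 20.0], index=[0, 3])
new_index = [0, 1, 2, 3, 4]

# Without a method, the new labels are filled with NaN
plain = s.reindex(new_index)

# ffill: carry the last known value forward
ffilled = s.reindex(new_index, method='ffill')

# bfill: take the next known value; label 4 has none, so it stays NaN
bfilled = s.reindex(new_index, method='bfill')

# nearest: take the value whose label is closest
nearest = s.reindex(new_index, method='nearest')

print(plain)
print(ffilled)
print(bfilled)
print(nearest)
```

Note that the fill methods require the original index to be sorted (monotonic), which is the case here.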
Example
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
df2 = pd.DataFrame(np.random.randn(2,3),columns=['col1','col2','col3'])
# Padding NaNs
print(df2.reindex_like(df1))
# Now fill the NaNs with preceding values
print("Data Frame with Forward Fill:")
print(df2.reindex_like(df1,method='ffill'))
Its output is as follows −
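col1 col2 col3
0 1.311620 -0.707176 0.599863
1 -0.423455 -0.700265 1.133371
2 NaN NaN NaN
3 NaN NaN NaN
4 NaN NaN NaN
5 NaN NaN NaN
Data Frame with Forward Fill:
col1 col2 col3
0 1.311620 -0.707176 0.599863
1 -0.423455 -0.700265 1.133371
2 -0.423455 -0.700265 1.133371
3 -0.423455 -0.700265 1.133371
4 -0.423455 -0.700265 1.133371
5 -0.423455 -0.700265 1.133371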
The limit argument provides additional control over filling while reindexing. It specifies the
maximum number of consecutive missing values that will be filled. Let us consider the following
example to understand the same −
Example
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
df2 = pd.DataFrame(np.random.randn(2,3),columns=['col1','col2','col3'])
# Padding NaNs
print(df2.reindex_like(df1))
# Now fill the NaNs with preceding values
print("Data Frame with Forward Fill limiting to 1:")
print(df2.reindex_like(df1,method='ffill',limit=1))
Its output is as follows −
col1 col2 col3
0 0.247784 2.128727 0.702576
1 -0.055713 -0.021732 -0.174577
2 NaN NaN NaN
3 NaN NaN NaN
4 NaN NaN NaN
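5 NaN NaN NaN
Data Frame with Forward Fill limiting to 1:
col1 col2 col3
0 0.247784 2.128727 0.702576
1 -0.055713 -0.021732 -0.174577
2 -0.055713 -0.021732 -0.174577
3 NaN NaN NaN
4 NaN NaN NaN
5 NaN NaN NaN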
Renaming
The rename() method relabels an axis based on a mapping such as a dict.
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
print(df1)
print("After renaming the rows and columns:")
print(df1.rename(columns={'col1' : 'c1', 'col2' : 'c2'},
index = {0 : 'apple', 1 : 'banana', 2 : 'durian'}))
Its output is as follows −
col1 col2 col3
0 0.486791 0.105759 1.540122
1 -0.990237 1.007885 -0.217896
2 -0.483855 -1.645027 -1.194113
3 -0.122316 0.566277 -0.366028
4 -0.231524 -0.721172 -0.112007
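5 0.438810 0.000225 0.435479
After renaming the rows and columns:
c1 c2 col3
apple 0.486791 0.105759 1.540122
banana -0.990237 1.007885 -0.217896
durian -0.483855 -1.645027 -1.194113
3 -0.122316 0.566277 -0.366028
4 -0.231524 -0.721172 -0.112007
5 0.438810 0.000225 0.435479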
Sorting
There are two kinds of sorting available in Pandas: by label and by actual value. Let us
consider an example with an output.
import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame(np.random.randn(10,2),index=[1,4,6,2,3,5,9,8,0,7],
columns=['col2','col1'])
print(unsorted_df)
Its output is as follows −
col2 col1
1 -2.063177 0.537527
4 0.142932 -0.684884
6 0.012667 -0.389340
2 -0.548797 1.848743
3 -1.044160 0.837381
5 0.385605 1.300185
9 1.031425 -1.002967
8 -0.407374 -0.435142
0 2.237453 -1.067139
7 -1.445831 -1.701035
In unsorted_df, the labels and the values are unsorted. Let us see how these can be sorted.
By Label
Using the sort_index() method, by passing the axis arguments and the order of sorting,
DataFrame can be sorted. By default, sorting is done on row labels in ascending order.
import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame(np.random.randn(10,2),index=[1,4,6,2,3,5,9,8,0,7],
columns = ['col2','col1'])
sorted_df=unsorted_df.sort_index()
print(sorted_df)
Its output is as follows −
col2 col1
0 0.208464 0.627037
1 0.641004 0.331352
2 -0.038067 -0.464730
3 -0.638456 -0.021466
4 0.014646 -0.737438
5 -0.290761 -1.669827
6 -0.797303 -0.018737
7 0.525753 1.628921
8 -0.567031 0.775951
9 0.060724 -0.322425
Order of Sorting
By passing the Boolean value to ascending parameter, the order of the sorting can be controlled.
Let us consider the following example to understand the same.
import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame(np.random.randn(10,2),index=[1,4,6,2,3,5,9,8,0,7],
columns = ['col2','col1'])
sorted_df = unsorted_df.sort_index(ascending=False)
print(sorted_df)
Its output is as follows −
col2 col1
9 0.825697 0.374463
8 -1.699509 0.510373
7 -0.581378 0.622958
6 -0.202951 0.954300
5 -1.289321 -1.551250
4 1.302561 0.851385
3 -0.157915 -0.388659
2 -1.222295 0.166609
1 0.584890 -0.291048
0 0.668444 -0.061294
Sort the Columns
By passing the axis argument with a value of 1, the sorting can be done on the column labels.
By default, axis=0, which sorts by row labels. Let us consider the following example to
understand the same.
import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame(np.random.randn(10,2),index=[1,4,6,2,3,5,9,8,0,7],
columns = ['col2','col1'])
sorted_df=unsorted_df.sort_index(axis=1)
print(sorted_df)
Its output is as follows −
col1 col2
1 -0.291048 0.584890
4 0.851385 1.302561
6 0.954300 -0.202951
2 0.166609 -1.222295
3 -0.388659 -0.157915
5 -1.551250 -1.289321
9 0.374463 0.825697
8 0.510373 -1.699509
0 -0.061294 0.668444
7 0.622958 -0.581378
By Value
Like index sorting, sort_values() is the method for sorting by values. It accepts a 'by' argument
that names the DataFrame column whose values are used for sorting.
import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame({'col1':[2,1,1,1],'col2':[1,3,2,4]})
sorted_df = unsorted_df.sort_values(by='col1')
print(sorted_df)
Its output is as follows −
col1 col2
1 1 3
2 1 2
3 1 4
0 2 1
Observe that the col1 values are sorted and that the respective col2 values and row index labels
move along with col1. Thus, col2 and the index look unsorted.
The 'by' argument also accepts a list of column names.
import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame({'col1':[2,1,1,1],'col2':[1,3,2,4]})
sorted_df = unsorted_df.sort_values(by=['col1','col2'])
print(sorted_df)
Its output is as follows −
col1 col2
2 1 2
1 1 3
3 1 4
0 2 1
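When sorting by several columns, each column can also be given its own direction. The following sketch assumes the ascending parameter of sort_values(), which accepts a list of Booleans matching the list passed to by.

```python
import pandas as pd

unsorted_df = pd.DataFrame({'col1': [2, 1, 1, 1], 'col2': [1, 3, 2, 4]})

# col1 ascending, and within equal col1 values, col2 descending
sorted_df = unsorted_df.sort_values(by=['col1', 'col2'],
                                    ascending=[True, False])
print(sorted_df)
```

Here the three rows with col1 equal to 1 come first, ordered by col2 as 4, 3, 2, followed by the row with col1 equal to 2.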
Sorting Algorithm
sort_values() provides a provision to choose the algorithm from mergesort, heapsort and
quicksort. Mergesort is the only stable algorithm.
import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame({'col1':[2,1,1,1],'col2':[1,3,2,4]})
sorted_df = unsorted_df.sort_values(by='col1' ,kind='mergesort')
print(sorted_df)
Its output is as follows −
col1 col2
1 1 3
2 1 2
3 1 4
0 2 1
Working with Missing Data in Pandas:
Missing data occurs when no information is provided for one or more items or for a whole unit.
It is a very common problem in real-life scenarios. In pandas, missing data is also referred to
as NA (Not Available) values. Many datasets simply arrive with missing data, either because the
data exists but was not collected or because it never existed. For example, some users being
surveyed may choose not to share their income and others may choose not to share their address,
so those values are missing from the dataset. In Pandas, missing data is represented by two
values:
None: None is a Python singleton object that is often used for missing data in Python code.
NaN: NaN (an acronym for Not a Number) is a special floating-point value recognized by all
systems that use the standard IEEE floating-point representation.
Pandas treats None and NaN as essentially interchangeable for indicating missing or null values.
To facilitate this convention, there are several useful functions for detecting, removing, and
replacing null values in a Pandas DataFrame:
isnull()
notnull()
dropna()
fillna()
replace()
interpolate()
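As a minimal sketch of how None and NaN are treated alike, the Series below (made-up values, not taken from these notes) holds one of each, and both are reported as missing.

```python
import numpy as np
import pandas as pd

# A Series holding one real number, one None and one NaN.
s = pd.Series([1.0, None, np.nan])

print(s.isnull().tolist())   # [False, True, True]
print(s.notnull().tolist())  # [True, False, False]
```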
Checking for missing values using isnull() and notnull()
To check for missing values in a Pandas DataFrame, we use the functions isnull() and
notnull(). Both functions help in checking whether a value is NaN or not. These functions can
also be used on a Pandas Series to find null values in the series.
Checking for missing values using isnull():
To check for null values in a Pandas DataFrame, we use the isnull() function. This function
returns a DataFrame of Boolean values which are True for NaN values.
Code:
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists (named scores so the built-in dict is not shadowed)
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, 45, 56, np.nan],
          'Third Score': [np.nan, 40, 80, 98]}
# creating a dataframe from the dictionary
df = pd.DataFrame(scores)
# using isnull() function
df.isnull()
Output:
Checking for missing values using notnull():
To check for non-null values in a Pandas DataFrame, we use the notnull() function. This function
returns a DataFrame of Boolean values which are False for NaN values.
Code:
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists (named scores so the built-in dict is not shadowed)
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, 45, 56, np.nan],
          'Third Score': [np.nan, 40, 80, 98]}
# creating a dataframe using dictionary
df = pd.DataFrame(scores)
# using notnull() function
df.notnull()
Output:
Code:
# displaying data only with Gender = Not NaN
data[bool_series]
Output:
As shown in the output image, only the rows having Gender = NOT NULL are displayed.
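The filtering step can be sketched as follows. The DataFrame here is hypothetical (the names and values are made up for illustration and are not the data set used above); only the names data and bool_series follow the fragment.

```python
import numpy as np
import pandas as pd

# Hypothetical data, not the data set used in the example above.
data = pd.DataFrame({'Name': ['Asha', 'Ravi', 'Kiran'],
                     'Gender': ['F', np.nan, 'M']})

# Boolean Series: True where Gender is present.
bool_series = pd.notnull(data['Gender'])

# Keep only the rows where the mask is True.
filtered = data[bool_series]
print(filtered)
```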
UNIT-5 Data Analysis Application Examples
Cleaning Data:
Missing data is always a problem in real-life scenarios. Areas like machine learning and
data mining face severe issues in the accuracy of their model predictions because of poor
data quality caused by missing values. In these areas, missing value treatment is a major
point of focus for making models more accurate and valid.
Let us consider an online survey for a product. Many times, people do not share all the
information related to them. A few people share their experience, but not how long they have
been using the product; a few share how long they have been using the product and their
experience, but not their contact information. Thus, in one way or another, a part of the data
is always missing, and this is very common in practice.
Dealing with missing values: as we can see from the previous output, there
are NaN values present in the Marks column. They are handled by
replacing them with the column mean.
# Compute average of the numeric entries
c = avg = 0
for ele in df['Marks']:
    if str(ele).isnumeric():
        c += 1
        avg += ele
avg /= c
# Replace missing values (stored here as the string "NaN")
df = df.replace(to_replace="NaN", value=avg)
# Display data
df
Output:
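When the missing entries are real NaN values rather than the string "NaN", the same mean replacement can be sketched with the built-in mean() and fillna(). The data below is hypothetical and only mirrors the Marks column name used above.

```python
import numpy as np
import pandas as pd

# Hypothetical marks; the missing entry is a real NaN, not the string "NaN".
df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav'],
                   'Marks': [85.0, np.nan, 91.0]})

# mean() ignores NaN, so this is the average of 85 and 91.
avg = df['Marks'].mean()

# Fill the gap with the column mean.
df['Marks'] = df['Marks'].fillna(avg)
print(df)
```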
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(3, 3), index=['a', 'c', 'e'],columns=['one',
'two', 'three'])
df = df.reindex(['a', 'b', 'c'])
print(df)
print("NaN replaced with '0':")
print(df.fillna(0))
Its output is as follows −
one two three
a -0.576991 -0.741695 0.553172
b NaN NaN NaN
c 0.744328 -1.735166 1.749580
Fill NA Forward and Backward
Using the concepts of filling discussed in the ReIndexing Chapter, we will fill the missing
values.
Method           Action
pad/ffill        Fill values forward
bfill/backfill   Fill values backward
Example
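import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5, 3), index=['a', 'c', 'e', 'f', 'h'], columns=['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
# forward fill; older pandas versions spell this df.fillna(method='pad')
print(df.ffill())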
Its output is as follows −
one two three
a 0.077988 0.476149 0.965836
b 0.077988 0.476149 0.965836
c -0.390208 -0.551605 -2.301950
d -0.390208 -0.551605 -2.301950
e -2.000303 -0.788201 1.510072
f -0.930230 -0.670473 1.146615
g -0.930230 -0.670473 1.146615
h 0.085100 0.532791 0.887415
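The difference between the two fill directions can be sketched on a small frame with fixed, made-up values. Note that a backward fill leaves a trailing gap as NaN, because there is no later value to copy from.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame with a gap in the middle and one at the end.
df = pd.DataFrame({'one': [1.0, np.nan, 3.0, np.nan]},
                  index=['a', 'b', 'c', 'd'])

forward = df.ffill()   # each gap takes the last valid value above it
backward = df.bfill()  # each gap takes the next valid value below it

print(forward['one'].tolist())
print(backward['one'].tolist())
```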
Drop Missing Values
If you want to simply exclude the missing values, then use the dropna function along with
the axis argument. By default, axis=0, i.e., along row, which means that if any value within
a row is NA then the whole row is excluded.
Example
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5, 3), index=['a', 'c', 'e', 'f',
'h'],columns=['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print(df.dropna())
Its output is as follows −
one two three
a 0.077988 0.476149 0.965836
c -0.390208 -0.551605 -2.301950
e -2.000303 -0.788201 1.510072
f -0.930230 -0.670473 1.146615
h 0.085100 0.532791 0.887415
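The effect of the axis argument can be sketched on a small frame with made-up values: axis=0 removes rows that contain NA, while axis=1 removes columns that contain NA.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'one': [1.0, np.nan, 3.0],
                   'two': [4.0, 5.0, 6.0]})

rows_kept = df.dropna()        # axis=0 (default): drop rows containing NA
cols_kept = df.dropna(axis=1)  # axis=1: drop columns containing NA

print(rows_kept.index.tolist())    # [0, 2]
print(cols_kept.columns.tolist())  # ['two']
```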
Replace Missing (or) Generic Values
Many times, we have to replace a generic value with some specific value. We can achieve
this by applying the replace method.
Replacing NA with a scalar value is equivalent to the behavior of the fillna() function.
Example
import pandas as pd
import numpy as np
df = pd.DataFrame({'one':[10,20,30,40,50,2000],
'two':[1000,0,30,40,50,60]})
print(df.replace({1000:10,2000:60}))
Its output is as follows −
one two
0 10 10
1 20 0
2 30 30
3 40 40
4 50 50
5 60 60
Filtering data:
Suppose there is a requirement for the name, gender and marks of the top-scoring
students. Here we need to remove some unwanted data.
# Filter top scoring students
df = df[df['Marks'] >= 75]
# Remove age column
df = df.drop(['Age'], axis=1)
# Display data
df
Output:
Merging Data:
The Pandas library in Python provides a single function, merge, as the entry point for all
standard database join operations between DataFrame objects.
pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None,
left_index=False, right_index=False, sort=False)
Let us now create two different DataFrames and perform the merging operations on them.
Name id subject_id
0 Billy 1 sub2
1 Brian 2 sub4
2 Bran 3 sub3
3 Bryce 4 sub6
4 Betty 5 sub5
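A minimal sketch of an inner join with merge follows. Both frames are hypothetical; only the column names (Name, id, subject_id) follow the fragment above, and the choice of subject_id as the join key is an assumption made for illustration.

```python
import pandas as pd

# Hypothetical frames; only the column names follow the fragment above.
left = pd.DataFrame({'id': [1, 2, 3],
                     'Name': ['Alex', 'Amy', 'Allen'],
                     'subject_id': ['sub1', 'sub2', 'sub4']})
right = pd.DataFrame({'id': [1, 2, 3],
                      'Name': ['Billy', 'Brian', 'Bran'],
                      'subject_id': ['sub2', 'sub4', 'sub3']})

# Inner join on subject_id: keep only subjects present in both frames.
merged = pd.merge(left, right, on='subject_id', how='inner')
print(merged)
```

Columns that exist in both frames but are not join keys (here id and Name) get the suffixes _x and _y in the result.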
Reshaping data: in the Gender column, we can reshape the data by mapping the categories
to different numbers.
# Categorize gender
df['Gender'] = df['Gender'].map({'M': 0, 'F': 1}).astype(float)
# Display data
df
Output:
Grouping Data:
Grouping data sets is a frequent need in data analysis where we need the result in terms of
various groups present in the data set. Pandas has in-built methods which can roll the data
into various groups.
A typical case is grouping the data by year and then getting the result for a specific year.
The example below builds two DataFrames and stacks them with pd.concat, giving one combined
data set on which such grouping can be applied.
import pandas as pd
one = pd.DataFrame({
    'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
    'Marks_scored': [98, 90, 87, 69, 78]},
    index=[1, 2, 3, 4, 5])
two = pd.DataFrame({
    'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
    'Marks_scored': [89, 80, 79, 97, 88]},
    index=[1, 2, 3, 4, 5])
print(pd.concat([one, two]))
Its output is as follows −
     Name subject_id  Marks_scored
1    Alex       sub1            98
2     Amy       sub2            90
3   Allen       sub4            87
4   Alice       sub6            69
5  Ayoung       sub5            78
1   Billy       sub2            89
2   Brian       sub4            80
3    Bran       sub3            79
4   Bryce       sub6            97
5   Betty       sub5            88
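Grouping itself can be sketched on the same combined data. Since this data has no year column, the sketch groups by subject_id instead; the names combined, grouped and mean_marks are introduced only for illustration.

```python
import pandas as pd

# Same two frames as in the example above, stacked into one data set.
one = pd.DataFrame({
    'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
    'Marks_scored': [98, 90, 87, 69, 78]},
    index=[1, 2, 3, 4, 5])
two = pd.DataFrame({
    'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
    'Marks_scored': [89, 80, 79, 97, 88]},
    index=[1, 2, 3, 4, 5])
combined = pd.concat([one, two])

# Split the rows into one group per subject.
grouped = combined.groupby('subject_id')

# Mean marks per subject.
mean_marks = grouped['Marks_scored'].mean()
print(mean_marks)

# All rows that belong to one specific group.
print(grouped.get_group('sub2'))
```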
Data Aggregation
Python has several methods available to perform aggregations on data. This is done using
the pandas and NumPy libraries. The data must be available as, or converted to, a DataFrame to
apply the aggregation functions.
Applying Aggregations on DataFrame
Let us create a DataFrame and apply aggregations on it.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10, 4),
index = pd.date_range('1/1/2000', periods=10),
columns = ['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
print(r)
Its output is as follows −
A B C D
2000-01-01 1.088512 -0.650942 -2.547450 -0.566858
2000-01-02 0.790670 -0.387854 -0.668132 0.267283
2000-01-03 -0.575523 -0.965025 0.060427 -2.179780
2000-01-04 1.669653 1.211759 -0.254695 1.429166
2000-01-05 0.100568 -0.236184 0.491646 -0.466081
2000-01-06 0.155172 0.992975 -1.205134 0.320958
2000-01-07 0.309468 -0.724053 -1.412446 0.627919
2000-01-08 0.099489 -1.028040 0.163206 -1.274331
2000-01-09 1.639500 -0.068443 0.714008 -0.565969
2000-01-10 0.326761 1.479841 0.664282 -1.361169
Rolling [window=3,min_periods=1,center=False,axis=0]
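We can aggregate by passing a function to the entire DataFrame, or select a column via the
standard get item method. For example, summing over each rolling window of the DataFrame
above with r.aggregate(np.sum) gives the following output: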
A B C D
2000-01-01 1.088512 -0.650942 -2.547450 -0.566858
2000-01-02 1.879182 -1.038796 -3.215581 -0.299575
2000-01-03 1.303660 -2.003821 -3.155154 -2.479355
2000-01-04 1.884801 -0.141119 -0.862400 -0.483331
2000-01-05 1.194699 0.010551 0.297378 -1.216695
2000-01-06 1.925393 1.968551 -0.968183 1.284044
2000-01-07 0.565208 0.032738 -2.125934 0.482797
2000-01-08 0.564129 -0.759118 -2.454374 -0.325454
2000-01-09 2.048458 -1.820537 -0.535232 -1.212381
2000-01-10 2.065750 0.383357 1.541496 -3.201469
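Both forms of aggregation can be sketched on a small frame with fixed, made-up values, so the window sums can be checked by hand. The string 'sum' is used here as the aggregation function; this is one of several ways to name it.

```python
import pandas as pd

df = pd.DataFrame({'A': [1.0, 2.0, 3.0, 4.0],
                   'B': [10.0, 20.0, 30.0, 40.0]},
                  index=pd.date_range('1/1/2000', periods=4))

r = df.rolling(window=3, min_periods=1)

# Sum over each window of up to three rows, for every column.
window_sums = r.aggregate('sum')

# Same aggregation on one column, selected with the get item syntax.
a_sums = r['A'].aggregate('sum')

print(window_sums)
print(a_sums.tolist())  # [1.0, 3.0, 6.0, 9.0]
```

Because min_periods=1, the first two windows hold fewer than three rows and are summed as they are instead of producing NaN.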
UNIT-6 Data Visualization
Matplotlib:
Matplotlib is one of the most popular Python packages used for data visualization. It is a
cross-platform library for making 2D plots from data in arrays. Matplotlib is written in
Python and makes use of NumPy, the numerical mathematics extension of Python. It
provides an object-oriented API that helps in embedding plots in applications using Python
GUI toolkits such as PyQt, wxPython or Tkinter. It can also be used in Python and IPython
shells, Jupyter notebooks and web application servers.
Matplotlib has a procedural interface named Pylab, which is designed to resemble
MATLAB, a proprietary programming language developed by MathWorks. Matplotlib
along with NumPy can be considered the open source equivalent of MATLAB.
Matplotlib was originally written by John D. Hunter in 2003. Version 2.2.0 was released in
2018, and newer versions have been released since.
Matplotlib - Environment Setup:
Matplotlib and its dependency packages are available in the form of wheel packages on the
standard Python package repositories and can be installed on Windows, Linux as well as
MacOS systems using the pip package manager.
pip3 install matplotlib
In case Python 2.7 or 3.4 is not installed for all users, the Microsoft Visual C++
2008 (64 bit or 32 bit for Python 2.7) or Microsoft Visual C++ 2010 (64 bit or 32 bit for
Python 3.4) redistributable packages need to be installed.
If you are using Python 2.7 on a Mac, execute the following command −
xcode-select --install
Upon execution of the above command, subprocess32, a dependency, may be
compiled.
On extremely old versions of Linux and Python 2.7, you may need to install the master
version of subprocess32.
Matplotlib requires a large number of dependencies −
cycler
six
Matplotlib - Pyplot API:
A new untitled notebook with the .ipynb extension (stands for the IPython notebook) is
displayed in the new tab of the browser.
Sr.No  Function    Description
1      bar         Make a bar plot.
2      barh        Make a horizontal bar plot.
3      boxplot     Make a box and whisker plot.
4      hist        Plot a histogram.
5      hist2d      Make a 2D histogram plot.
6      pie         Plot a pie chart.
7      plot        Plot lines and/or markers to the Axes.
8      polar       Make a polar plot.
9      scatter     Make a scatter plot of x vs y.
10     stackplot   Draw a stacked area plot.
11     stem        Create a stem plot.
12     step        Make a step plot.
13     quiver      Plot a 2-D field of arrows.
Image Functions
Sr.No  Function  Description
1      imread    Read an image from a file into an array.
2      imsave    Save an array as an image file.
3      imshow    Display an image on the axes.
Axis Functions
Sr.No  Function  Description
1      axes      Add axes to the figure.
2      text      Add text to the axes.
3      title     Set a title of the current axes.
4      xlabel    Set the x axis label of the current axis.
5      xlim      Get or set the x limits of the current axes.
6      xscale    Set the scaling of the x-axis.
7      xticks    Get or set the current tick locations and labels of the x-axis.
8      ylabel    Set the y axis label of the current axis.
9      ylim      Get or set the y-limits of the current axes.
10     yscale    Set the scaling of the y-axis.
11     yticks    Get or set the current tick locations and labels of the y-axis.
Figure Functions
Sr.No  Function  Description
1      figtext   Add text to figure.
2      figure    Create a new figure.
3      show      Display a figure.
4      savefig   Save the current figure.
5      close     Close a figure window.
Pyplot
Most of the Matplotlib utilities lie under the pyplot submodule, and are usually imported
under the plt alias:
Code:
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([0, 6])
ypoints = np.array([0, 250])
plt.plot(xpoints, ypoints)
plt.show()
Result:
Matplotlib Plotting:
Plotting x and y points
The plot() function is used to draw points (markers) in a diagram.
By default, the plot() function draws a line from point to point.
The function takes parameters for specifying points in the diagram.
Parameter 1 is an array containing the points on the x-axis.
Parameter 2 is an array containing the points on the y-axis.
If we need to plot a line from (1, 1) to (5, 7), we have to pass two arrays [1, 5] and [1, 7] to
the plot function.
Code:
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([1, 5])
ypoints = np.array([1, 7])
plt.plot(xpoints, ypoints)
plt.show()
Result:
Multiple Points:
You can plot as many points as you like; just make sure you have the same number of points
on both axes.
Code:
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([1, 3, 6, 9])
ypoints = np.array([2, 8, 2, 8])
plt.plot(xpoints, ypoints)
plt.show()
Result:
Default X-Points:
If we do not specify the points on the x-axis, they will get the default values 0, 1, 2, 3 (etc.,
depending on the length of the y-points).
So, if we plot only an array of y-points and leave out the x-points, the diagram will look like
this:
Example
Plotting without x-points:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2, 4, 3, 6, 8])
plt.plot(ypoints)
plt.show()
Result:
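The default x-values can also be checked directly, without looking at the picture. This sketch relies on the fact that plot() returns the Line2D objects it created; the names line and xdata are introduced only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([2, 4, 3, 6, 8])

# plot() returns a list of Line2D objects; unpack the single line.
(line,) = plt.plot(ypoints)

# The x-values Matplotlib filled in because none were given.
xdata = line.get_xdata()
print(xdata)
```

Five y-points give the five default x-values 0 to 4.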
Markers:
You can use the keyword argument marker to emphasize each point with a specified marker:
Example
Mark each point with a star:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2, 5, 1, 8])
plt.plot(ypoints, marker = '*')
plt.show()
Result:
Marker Reference
You can choose any of these markers:
Marker Description
'o' Circle
'*' Star
'.' Point
',' Pixel
'x' X
'X' X (filled)
'+' Plus
's' Square
'D' Diamond
'p' Pentagon
'H' Hexagon
'h' Hexagon
'^' Triangle Up
'2' Tri Up
'|' Vline
'_' Hline
Format Strings fmt:
You can also use the shortcut string notation parameter to specify the marker.
This parameter is also called fmt, and is written with this syntax:
marker|line|color
Example
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, '*-.g')
plt.show()
Result:
The marker value can be anything from the Marker Reference above.
The line value can be one of the following:
Line Reference
Line Syntax   Description
'-'           Solid line
':'           Dotted line
'--'          Dashed line
'-.'          Dashed/dotted line
Color Reference
'r' Red
'g' Green
'b' Blue
'c' Cyan
'm' Magenta
'y' Yellow
'k' Black
'w' White
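How a format string is split into its three parts can be sketched by reading the properties back from the plotted line. The string '*-.g' is the one used in the example above; reading it back through the Line2D getters is just one way to inspect the result.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_rgba

ypoints = np.array([3, 8, 1, 10])

# '*-.g' = star marker, dash-dot line, green color.
(line,) = plt.plot(ypoints, '*-.g')

print(line.get_marker())     # *
print(line.get_linestyle())  # -.
print(to_rgba(line.get_color()))  # the color as an RGBA tuple
```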
Marker Size:
You can use the keyword argument markersize or the shorter version, ms to set the size of the
markers:
Example
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2, 8, 1, 6])
plt.plot(ypoints, marker = '*', ms = 23)
plt.show()
Result:
Marker Color:
You can use the keyword argument markeredgecolor or the shorter mec to set the color of
the edge of the markers:
Example
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 2, 6])
plt.plot(ypoints, marker = '*', ms = 25, mec = 'r')
plt.show()
Result:
Matplotlib Line:
Linestyle
You can use the keyword argument linestyle, or shorter ls, to change the style of the plotted
line:
Example
Use a dotted line:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2, 8, 1, 5])
plt.plot(ypoints, linestyle = 'dotted')
plt.show()
Result:
Example
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2,6, 1, 4])
plt.plot(ypoints, linestyle = 'dashed')
plt.show()
Result:
Line Styles
You can choose any of these styles:
Style Or
'dotted' ':'
'dashed' '--'
'dashdot' '-.'
Line Width:
You can use the keyword argument linewidth or the shorter lw to change the width of the
line.
The value is a floating-point number, in points:
Example
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, linewidth = 26.5)
plt.show()
Result:
Multiple Lines:
You can plot as many lines as you like by simply adding more plt.plot() functions:
Example
Draw two lines by specifying a plt.plot() function for each line:
import matplotlib.pyplot as plt
import numpy as np
y1 = np.array([1, 8, 1, 6])
y2 = np.array([3, 2, 6, 9])
plt.plot(y1)
plt.plot(y2)
plt.show()
Result:
Matplotlib Subplot:
Display Multiple Plots
With the subplot() function you can draw multiple plots in one figure:
Example
Draw 2 plots:
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([2, 8, 4, 6])
plt.subplot(1, 2, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 15, 11, 24])
plt.subplot(1, 2, 2)
plt.plot(x,y)
plt.show()
Result:
The subplot() Function:
The subplot() function takes three arguments that describe the layout of the figure.
The layout is organized in rows and columns, which are represented by
the first and second argument.
The third argument represents the index of the current plot.
plt.subplot(1, 2, 1)
#the figure has 1 row, 2 columns, and this plot is the first plot.
plt.subplot(1, 2, 2)
#the figure has 1 row, 2 columns, and this plot is the second plot.
So, if we want a figure with 2 rows and 1 column (meaning that the two plots will be
displayed on top of each other instead of side-by-side), we can write the syntax like this:
Example:
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([2, 8, 4, 6])
plt.subplot(2, 1, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 15, 11, 24])
plt.subplot(2, 1, 2)
plt.plot(x,y)
plt.show()
Result:
You can draw as many plots as you like on one figure; just describe the number of rows, columns,
and the index of the plot.
Example
import matplotlib.pyplot as plt
import numpy as np
x = np.array([0, 1, 2, 3])
y = np.array([3, 5, 1, 8])
plt.subplot(2, 3, 1)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([10, 12, 30, 15])
plt.subplot(2, 3, 2)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(2, 3, 3)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(2, 3, 4)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(2, 3, 5)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(2, 3, 6)
plt.plot(x,y)
plt.show()
Result:
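The six-plot grid can also be written as a loop, which makes the role of the index argument easier to see. This is only a sketch: the y-values are the ones from the example above, and the names ys, rows, cols and fig are introduced for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([0, 1, 2, 3])
# The six sets of y-values from the example above, one row per subplot.
ys = np.array([[3, 5, 1, 8],
               [10, 12, 30, 15],
               [3, 8, 1, 10],
               [10, 20, 30, 40],
               [3, 8, 1, 10],
               [10, 20, 30, 40]])

rows, cols = 2, 3
fig = plt.figure()
for i, y in enumerate(ys):
    # Subplot indices start at 1 and run row by row.
    plt.subplot(rows, cols, i + 1)
    plt.plot(x, y)

print(len(fig.axes))  # 6
```

Adding plt.show() at the end displays the figure, as in the other examples.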
Exploring plot types-Scatter plots:
Creating Scatter Plots
With Pyplot, you can use the scatter() function to draw a scatter plot.
The scatter() function plots one dot for each observation. It needs two arrays of the same
length, one for the values of the x-axis, and one for values on the y-axis:
Example
A simple scatter plot:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
plt.show()
Result:
The observation in the example above is the result of 13 cars passing by.
The X-axis shows how old the car is.
The Y-axis shows the speed of the car when it passes.
Are there any relationships between the observations?
It seems that the newer the car, the faster it drives, but that could be a coincidence, after all
we only registered 13 cars.
Compare Plots
In the example above, there seems to be a relationship between speed and age, but what if
we plot the observations from another day as well? Will the scatter plot tell us something
else?
Example
Draw two plots on the same figure:
import matplotlib.pyplot as plt
import numpy as np
#day one, the age and speed of 13 cars:
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
#day two, the age and speed of 15 cars:
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y)
plt.show()
Result:
By comparing the two plots, it seems safe to say that they both give us the same
conclusion: the newer the car, the faster it drives.
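One hedged way to put a number on this visual impression is the Pearson correlation coefficient, computed here with NumPy for the day-one data. This is an addition for illustration and not part of the plotting example itself.

```python
import numpy as np

# Day one: age (x) and speed (y) of the 13 cars from the example above.
x = np.array([5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6])
y = np.array([99, 86, 87, 88, 111, 86, 103, 87, 94, 78, 77, 85, 86])

# Pearson correlation coefficient between age and speed.
r = np.corrcoef(x, y)[0, 1]
print(r)
```

A negative value means that speed tends to fall as age rises, which agrees with the plot. With only 13 observations it still does not rule out coincidence.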
Colors
You can set your own color for each scatter plot with the color or the c argument:
Example
Set your own color of the markers:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y, color = 'Red')
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y, color = '#88c999')
plt.show()
Result:
Color Each Dot
You can even set a specific color for each dot by using an array of colors as value for
the c argument:
Example:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array(["red", "green", "blue", "yellow", "pink", "black", "orange",
                   "purple", "beige", "brown", "gray", "cyan", "magenta"])
plt.scatter(x, y, c=colors)
plt.show()
Result:
ColorMap:
The Matplotlib module has a number of available colormaps. A colormap is like a list of
colors, where each color has a value; in the examples here the values range from 0 to 100.
Here is an example of a colormap:
This colormap is called 'viridis' and, as you can see, it ranges from 0, which is a purple color,
up to 100, which is a yellow color.
How to Use the ColorMap
You can specify the colormap with the keyword argument cmap with the value of the
colormap, in this case 'viridis', which is one of the built-in colormaps available in Matplotlib.
In addition, you have to create an array with values (from 0 to 100), one value for each
point in the scatter plot:
Example
Create a color array, and specify a colormap in the scatter plot:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.show()
Result:
You can include the colormap in the drawing by including the plt.colorbar() statement:
Example:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()
plt.show()
Result:
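How a value between 0 and 100 turns into a color can be sketched by querying the colormap directly. A colormap object works on the range 0 to 1, so the values are scaled first; the explicit Normalize step below is written out for illustration, and the names cmap, norm, low and high are assumptions of this sketch.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

cmap = plt.get_cmap('viridis')

# Scale the 0..100 values used above onto the colormap's 0..1 range.
norm = Normalize(vmin=0, vmax=100)

low = cmap(float(norm(0)))     # RGBA tuple at the bottom of the range
high = cmap(float(norm(100)))  # RGBA tuple at the top of the range
print(low)
print(high)
```

The first tuple is the purple end of 'viridis' and the second is the yellow end.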
Available ColorMaps
You can choose any of the built-in colormaps:
Name Reverse
Accent Accent_r
Blues Blues_r
BrBG BrBG_r
BuGn BuGn_r
BuPu BuPu_r
CMRmap CMRmap_r
Dark2 Dark2_r
GnBu GnBu_r
Greens Greens_r
Greys Greys_r
OrRd OrRd_r
Oranges Oranges_r
PRGn PRGn_r
Paired Paired_r
Pastel1 Pastel1_r
Pastel2 Pastel2_r
PiYG PiYG_r
PuBu PuBu_r
PuBuGn PuBuGn_r
PuOr PuOr_r
PuRd PuRd_r
Purples Purples_r
RdBu RdBu_r
RdGy RdGy_r
RdPu RdPu_r
RdYlBu RdYlBu_r
RdYlGn RdYlGn_r
Reds Reds_r
Set1 Set1_r
Set2 Set2_r
Set3 Set3_r
Spectral Spectral_r
Wistia Wistia_r
YlGn YlGn_r
YlGnBu YlGnBu_r
YlOrBr YlOrBr_r
YlOrRd YlOrRd_r
afmhot afmhot_r
autumn autumn_r
binary binary_r
bone bone_r
brg brg_r
bwr bwr_r
cividis cividis_r
cool cool_r
coolwarm coolwarm_r
copper copper_r
cubehelix cubehelix_r
flag flag_r
gist_earth gist_earth_r
gist_gray gist_gray_r
gist_heat gist_heat_r
gist_ncar gist_ncar_r
gist_rainbow gist_rainbow_r
gist_stern gist_stern_r
gist_yarg gist_yarg_r
gnuplot gnuplot_r
gnuplot2 gnuplot2_r
gray gray_r
hot hot_r
hsv hsv_r
inferno inferno_r
jet jet_r
magma magma_r
nipy_spectral nipy_spectral_r
ocean ocean_r
pink pink_r
plasma plasma_r
prism prism_r
rainbow rainbow_r
seismic seismic_r
spring spring_r
summer summer_r
tab10 tab10_r
tab20 tab20_r
tab20b tab20b_r
tab20c tab20c_r
terrain terrain_r
twilight twilight_r
twilight_shifted twilight_shifted_r
viridis viridis_r
winter winter_r
Size
You can change the size of the dots with the s argument.
Just like colors, make sure the array for sizes has the same length as the arrays for the x- and
y-axis:
Example
Set your own size for the markers:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])
plt.scatter(x, y, s=sizes)
plt.show()
Result:
Matplotlib Bars:
Creating Bars
With Pyplot, you can use the bar() function to draw bar graphs:
Example
Draw 4 bars:
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([6, 2, 7, 4])
plt.bar(x,y)
plt.show()
Result:
Horizontal Bars
If you want the bars to be displayed horizontally instead of vertically, use
the barh() function:
Example
Draw 4 horizontal bars:
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.barh(x, y)
plt.show()
Result:
Bar Color:
The bar() and barh() functions take the keyword argument color to set the color of the bars:
Example
Draw 4 hot pink bars:
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x, y, color = "hotpink")
plt.show()
Result:
Bar Width:
The bar() function takes the keyword argument width to set the width of the bars:
Example
Draw 4 very thin bars:
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x, y, width = 0.3)
plt.show()
Result:
Bar Height:
The barh() function takes the keyword argument height to set the height of the bars:
Example
Draw 4 very thin bars:
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.barh(x, y, height = 0.1)
plt.show()
Result:
Matplotlib Histograms:
Histogram
A histogram is a graph showing frequency distributions. It is a graph showing the number of
observations within each given interval.
Create Histogram
In Matplotlib, we use the hist() function to create histograms.
The hist() function uses an array of numbers to create a histogram; the array is passed to
the function as an argument.
For simplicity we use NumPy to randomly generate an array with 250 values, where the
values will concentrate around 170, and the standard deviation is 10.
Example
A Normal Data Distribution by NumPy:
import numpy as np
x = np.random.normal(170, 10, 250)
print(x)
Result:
[167.62255766 175.32495609 152.84661337 165.50264047 163.17457988
162.29867872 172.83638413 168.67303667 164.57361342 180.81120541
170.57782187 167.53075749 176.15356275 176.95378312 158.4125473
187.8842668 159.03730075 166.69284332 160.73882029 152.22378865
164.01255164 163.95288674 176.58146832 173.19849526 169.40206527
166.88861903 149.90348576 148.39039643 177.90349066 166.72462233
1776004 170.93335636 173.26312881 174.76534435 162.28791953
166.77301551 160.53785202 170.67972019 159.11594186 165.36992993
178.38979253 171.52158489 173.32636678 159.63894401 151.95735707
175.71274153 165.00458544 164.80607211 177.50988211 149.28106703
179.43586267 181.98365273 170.98196794 179.1093176 176.91855744
168.32092784 162.33939782 165.18364866 160.52300507 174.14316386
163.01947601 172.01767945 173.33491959 169.75842718 198.04834503
192.82490521 164.54557943 206.36247244 165.47748898 195.26377975
164.37569092 156.15175531 162.15564208 179.34100362 167.22138242
147.23667125 162.86940215 167.84986671 172.99302505 166.77279814
196.6137667 159.79012341 166.5840824 170.68645637 165.62204521
174.5559345 165.0079216 187.92545129 166.86186393 179.78383824
161.0973573 167.44890343 157.38075812 151.35412246 171.3107829
162.57149341 182.49985133 163.24700057 168.72639903 169.05309467
167.19232875 161.06405208 176.87667712 165.48750185 179.68799986
158.7913483 170.22465411 182.66432721 173.5675715 176.85646836
157.31299754 174.88959677 183.78323508 174.36814558 182.55474697
180.03359793 180.53094948 161.09560099 172.29179934 161.22665588
171.88382477 159.04626132 169.43886536 163.75793589 157.73710983
174.68921523 176.19843414 167.39315397 181.17128255 174.2674597
186.05053154 177.06516302 171.78523683 166.14875436 163.31607668
174.01429569 194.98819875 169.75129209 164.25748789 180.25773528
170.44784934 157.81966006 171.33315907 174.71390637 160.55423274
163.92896899 177.29159542 168.30674234 165.42853878 176.46256226
162.61719142 166.60810831 165.83648812 184.83238352 188.99833856
161.3054697 175.30396693 175.28109026 171.54765201 162.08762813
164.53011089 189.86213299 170.83784593 163.25869004 198.68079225
166.95154328 152.03381334 152.25444225 149.75522816 161.79200594
162.13535052 183.37298831 165.40405341 155.59224806 172.68678385
179.35359654 174.19668349 163.46176882 168.26621173 162.97527574
192.80170974 151.29673582 178.65251432 163.17266558 165.11172588
183.11107905 169.69556831 166.35149789 178.74419135 166.28562032
169.96465166 178.24368042 175.3035525 170.16496554 158.80682882
187.10006553 178.90542991 171.65790645 183.19289193 168.17446717
155.84544031 177.96091745 186.28887898 187.89867406 163.26716924
169.71242393 152.9410412 158.68101969 171.12655559 178.1482624
187.45272185 173.02872935 163.8047623 169.95676819 179.36887054
157.01955088 185.58143864 170.19037101 157.221245 168.90639755
178.7045601 168.64074373 172.37416382 165.61890535 163.40873027
168.98683006 149.48186389 172.20815568 172.82947206 173.71584064
189.42642762 172.79575803 177.00005573 169.24498561 171.55576698
161.36400372 176.47928342 163.02642822 165.09656415 186.70951892
153.27990317 165.59289527 180.34566865 189.19506385 183.10723435
173.48070474 170.28701875 157.24642079 157.9096498 176.4248199 ]
The hist() function will read the array and produce a histogram:
Example
A simple histogram:
import matplotlib.pyplot as plt
import numpy as np
# 250 random values, normally distributed around 170 with standard deviation 10
x = np.random.normal(170, 10, 250)
# Draw the histogram and display it
plt.hist(x)
plt.show()
Result:
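The resulting plot is a bar chart in which each bar's height is the number of values that fall into one bin. As a minimal sketch of how to inspect those numbers, the values returned by hist() can be printed. The fixed seed and the names counts, edges and patches are assumptions added here for repeatability, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)   # fixed seed: an assumption, for repeatability
x = rng.normal(170, 10, 250)     # same distribution as in the example

# hist() returns the bin counts, the bin edges and the bar patches
counts, edges, patches = plt.hist(x)

print(len(counts))        # number of bins (10 by default)
print(len(edges))         # one more edge than there are bins
print(int(counts.sum()))  # every one of the 250 values lands in some bin
```

Passing a bins argument, for example plt.hist(x, bins=20), changes how finely the values are grouped.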
Legends and annotations:
Legends and annotations are effective tools for displaying the information needed to
understand a plot at a glance. A typical plot has the following additional information elements:
A legend describing the various data series in the plot. It is produced by calling the
matplotlib legend() function and supplying a label for each data series.
Annotations for important points in the plot. The matplotlib annotate() function serves this
purpose. An annotation consists of a label and an arrow. The function has many parameters
describing the style and position of the label and the arrow, so you may need to call
help(plt.annotate) for a detailed description.
Labels on the horizontal and vertical axes. These are drawn by the xlabel() and ylabel()
functions, which take the label text as a string and optional parameters such as the font size.
A descriptive title for the graph, set with the matplotlib title() function.
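The elements described above can be combined in one plot. The following is a minimal sketch rather than a definitive recipe: the sine and cosine data, the label texts and the annotation position are assumptions chosen for illustration, and the Agg backend is selected only so the sketch runs without a display (remove that line and call plt.show() to open a window).

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# label= supplies the text that legend() shows for each data series
plt.plot(x, np.sin(x), label="sin(x)")
plt.plot(x, np.cos(x), label="cos(x)")

# Title and axis labels; fontsize is one of the optional parameters
plt.title("Sine and cosine")
plt.xlabel("x (radians)", fontsize=12)
plt.ylabel("value", fontsize=12)

# annotate() draws a label at xytext and an arrow pointing at xy
plt.annotate("maximum of sin(x)",
             xy=(np.pi / 2, 1.0),
             xytext=(3.0, 1.2),
             arrowprops=dict(arrowstyle="->"))
plt.ylim(-1.5, 1.5)  # leave room for the annotation text

legend = plt.legend()  # builds the legend from the label= values
```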
Labels
Add labels to the pie chart with the labels parameter.
The labels parameter must be an array with one label for each wedge:
Example
A simple pie chart:
import matplotlib.pyplot as plt
import numpy as np
# Wedge sizes and one label per wedge
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.show()
Result:
Start Angle
As mentioned, the default start angle is at the x-axis, but you can change it by
specifying the startangle parameter.
The startangle parameter is given as an angle in degrees; the default is 0:
Example
Start the first wedge at 90 degrees:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
# startangle = 90 rotates the start of the first wedge to 90 degrees
plt.pie(y, labels = mylabels, startangle = 90)
plt.show()
Result:
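With startangle = 90 the first wedge begins at the top of the circle. Wedges are drawn counterclockwise by default, so the "Apples" wedge covers 35% of 360 degrees, that is 126 degrees, and ends at 216 degrees. A minimal sketch that checks this through the Wedge objects returned by pie(); the names wedges, texts and first are assumptions introduced here, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

# Without autopct, pie() returns the wedge patches and the label texts
wedges, texts = plt.pie(y, labels=mylabels, startangle=90)

first = wedges[0]  # the "Apples" wedge
# theta1 and theta2 are the wedge's start and end angles in degrees
print(f"{float(first.theta1):.1f}")  # start angle, equal to startangle
print(f"{float(first.theta2):.1f}")  # end angle, about 90 + 0.35 * 360
```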
Explode
Maybe you want one of the wedges to stand out? The explode parameter lets you do
that. If specified and not None, it must be an array with one value for each wedge.
Each value represents how far from the center the wedge is displayed, as a fraction of the
pie's radius:
Example
Pull the "Apples" wedge 0.2 from the center of the pie:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
# One offset per wedge; only "Apples" is pulled out
myexplode = [0.2, 0, 0, 0]
plt.pie(y, labels = mylabels, explode = myexplode)
plt.show()
Result:
Shadow
Add a shadow to the pie chart by setting the shadow parameter to True:
Example
Add a shadow:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]
# shadow = True draws a shadow beneath the pie
plt.pie(y, labels = mylabels, explode = myexplode, shadow = True)
plt.show()
Result:
Colors
You can set the color of each wedge with the colors parameter. The colors parameter, if
specified, must be an array with one value for each wedge:
Example
Specify a new color for each wedge:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
# One color per wedge: color names, a shortcut letter and a hexadecimal value
mycolors = ["black", "hotpink", "b", "#4CAF50"]
plt.pie(y, labels = mylabels, colors = mycolors)
plt.show()
Result:
You can use hexadecimal color values, any of the 140 supported color names, or one of these
shortcuts:
'r' - Red
'g' - Green
'b' - Blue
'c' - Cyan
'm' - Magenta
'y' - Yellow
'k' - Black
'w' - White
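A quick way to see what a color specification stands for is to convert it to a hexadecimal string. A minimal sketch, assuming the helper functions to_hex() and is_color_like() from matplotlib.colors; the list specs and the string "not-a-color" are illustrative only.

```python
from matplotlib import colors

# A hexadecimal value, a color name and two shortcut letters
specs = ["#4CAF50", "hotpink", "b", "k"]

for spec in specs:
    # to_hex() converts any valid matplotlib color to "#rrggbb"
    print(spec, "->", colors.to_hex(spec))

# is_color_like() reports whether matplotlib accepts a value as a color
print(colors.is_color_like("hotpink"))      # True
print(colors.is_color_like("not-a-color"))  # False
```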
Legend
To add a list of explanations for each wedge, use the legend() function:
Example
Add a legend:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
# legend() builds its entries from the wedge labels
plt.legend()
plt.show()
Result: