
python AQJ

Python is a widely-used programming language known for its readability and versatility, created by Guido van Rossum in 1991. It supports various applications including web development, software development, and data analysis, and features a simple syntax that allows for quick prototyping. Key concepts covered include Python syntax, variables, data types, and string manipulation.

Uploaded by

vasantha
Copyright
© © All Rights Reserved

UNIT-1 INTRODUCTION TO PYTHON

Python Introduction:
Python is a popular programming language. It was created by Guido van Rossum, and
released in 1991. It was designed with an emphasis on code readability, and its syntax
allows programmers to express their concepts in fewer lines of code.
It is used for:
• web development (server-side),
• software development,
• mathematics,
• system scripting.
What can Python do?
• Python can be used on a server to create web applications.
• Python can be used alongside software to create workflows.
• Python can connect to database systems. It can also read and modify files.
• Python can be used to handle big data and perform complex mathematics.
• Python can be used for rapid prototyping, or for production-ready software
development.

Why Python?
• Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc.).
• Python has a simple syntax similar to the English language.
• Python has syntax that allows developers to write programs with fewer lines than
some other programming languages.
• Python runs on an interpreter system, meaning that code can be executed as soon as it
is written. This means that prototyping can be very quick.
• Python can be treated in a procedural way, an object-oriented way or a functional
way.

Python Syntax compared to other programming languages

• Python was designed for readability, and has some similarities to the English
language with influence from mathematics.
• Python uses new lines to complete a command, as opposed to other programming
languages which often use semicolons or parentheses.
• Python relies on indentation, using whitespace, to define scope, such as the scope of
loops, functions and classes. Other programming languages often use curly brackets
for this purpose.

Python Syntax:
Execute Python Syntax
Python syntax can be executed by writing directly in the Command Line:
>>> print("Hello, World!")
Hello, World!
Or by creating a Python file, using the .py file extension, and running it in the
Command Line:
C:\Users\Your Name>python myfile.py
Python Indentation:
Indentation refers to the spaces at the beginning of a code line.
Where in other programming languages the indentation in code is for readability only, the
indentation in Python is very important.
Python uses indentation to indicate a block of code.
Example
if 5 > 2:
    print("Five is greater than two!")
Python will give you an error if you skip the indentation:
Example
Syntax Error:
if 5 > 2:
print("Five is greater than two!")

The number of spaces is up to you as a programmer, but it has to be at least one.


Example
if 5 > 2:
 print("Five is greater than two!")
if 5 > 2:
        print("Five is greater than two!")

You have to use the same number of spaces in the same block of code, otherwise Python will
give you an error:
Example
Syntax Error:
if 5 > 2:
 print("Five is greater than two!")
        print("Five is greater than two!")
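The indentation errors described above can be observed without crashing a script. The following is only an illustrative sketch: it passes badly indented source text to the built-in compile() function and catches the exception. The names snippet and error_name are made up for this example.

```python
# Source text whose print line is missing its indentation.
snippet = 'if 5 > 2:\nprint("Five is greater than two!")\n'

try:
    # compile() only parses the text; nothing is executed.
    compile(snippet, "<example>", "exec")
    error_name = None
except IndentationError as err:
    # IndentationError is a subclass of SyntaxError.
    error_name = type(err).__name__
    print(error_name, "reported for line", err.lineno)
```

Catching SyntaxError instead would also work here, because IndentationError inherits from it.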
Python Comments:
Python supports comments for the purpose of in-code documentation.
Comments can be used to explain Python code.
Comments can be used to make the code more readable.
Comments can be used to prevent execution when testing code.
Comments start with #, and Python will ignore the rest of the line:
Example
Comments in Python:
#This is a comment.
print("Hello, World!")
Comments can be placed at the end of a line, and Python will ignore the rest of the line:
Example
print("Hello, World!") #This is a comment
A comment does not have to be text that explains the code, it can also be used to prevent
Python from executing code:
Example
#print("Hello, World!")
print("Cheers, Mate!")
Multi Line Comments
Python does not really have a syntax for multi line comments.
To add a multiline comment you could insert a # for each line:
Example
#This is a comment
#written in
#more than just one line
print("Hello, World!")
Or, not quite as intended, you can use a multiline string.
Since Python will ignore string literals that are not assigned to a variable, you can add a
multiline string (triple quotes) in your code, and place your comment inside it:
Example
"""
This is a comment
written in
more than just one line
"""
print("Hello, World!")
Python Variables:
Variables are containers for storing data values. In Python, a variable is created when you
assign a value to it.
Variable Names:
A variable can have a short name (like x and y) or a more descriptive name (age, carname,
total_volume). Rules for Python variables:
• A variable name must start with a letter or the underscore character
• A variable name cannot start with a number
• A variable name can only contain alphanumeric characters and underscores (A-Z, a-z,
0-9, and _ )
• Variable names are case-sensitive (age, Age and AGE are three different variables)
Example
Legal variable names:
myvar = "John"
my_var = "John"
_my_var = "John"
myVar = "John"
MYVAR = "John"
myvar2 = "John"
Creating Variables:
Python has no command for declaring a variable.
A variable is created the moment you first assign a value to it.
Example
x=5
y = "John"
print(x)
print(y)

Variables do not need to be declared with any particular type, and can even change type after
they have been set.
Example
x=4 # x is of type int
x = "Sally" # x is now of type str
print(x)

Casting
If you want to specify the data type of a variable, this can be done with casting.
Example
x = str(3) # x will be '3'
y = int(3) # y will be 3
z = float(3) # z will be 3.0

Get the Type


You can get the data type of a variable with the type() function.
Example
x=5
y = "John"
print(type(x))
print(type(y))
Single or Double Quotes?

String variables can be declared either by using single or double quotes:

Example
x = "John"
# is the same as
x = 'John'

Case-Sensitive:
Variable names are case-sensitive.
Example
This will create two variables:
a=4
A = "Sally"
#A will not overwrite a
Many Values to Multiple Variables:
Python allows you to assign values to multiple variables in one line:
Example
x, y, z = "Orange", "Banana", "Cherry"
print(x)
print(y)
print(z)

One Value to Multiple Variables:


And you can assign the same value to multiple variables in one line:
Example
x = y = z = "Orange"
print(x)
print(y)
print(z)
Output Variables:
The Python print() function is often used to output variables. To combine both text and a
variable, Python uses the + character:
Example
x = "awesome"
print("Python is " + x)
You can also use the + character to add a variable to another variable:
Example
x = "Python is "
y = "awesome"
z= x+y
print(z)
For numbers, the + character works as a mathematical operator:
Example
x=5
y = 10
print(x + y)
If you try to combine a string and a number, Python will give you an error:
Example
x=5
y = "John"
print(x + y)
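The error raised in this case is a TypeError. As a minimal sketch, the example below catches it and then shows one common way around it, converting the number with str() before concatenating. The names result and combined are illustrative only.

```python
x = 5
y = "John"

try:
    result = x + y  # adding an int and a str is not defined
except TypeError as err:
    result = None
    print("TypeError:", err)

# Converting the number to a string first makes the concatenation valid.
combined = str(x) + y
print(combined)  # 5John
```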
Global Variables:
Variables that are created outside of a function (as in all of the examples above) are known as
global variables. Global variables can be used anywhere, both inside of functions and
outside.
Example
Create a variable outside of a function, and use it inside the function
x = "awesome"
def myfunc():
    print("Python is " + x)
myfunc()

If you create a variable with the same name inside a function, this variable will be local, and
can only be used inside the function. The global variable with the same name will remain as it
was, global and with the original value.

Example
Create a variable inside a function, with the same name as the global variable
x = "awesome"

def myfunc():
    x = "fantastic"
    print("Python is " + x)
myfunc()
print("Python is " + x)
The global Keyword:
Normally, when you create a variable inside a function, that variable is local, and can only be
used inside that function.
To create a global variable inside a function, you can use the global keyword.
Example
If you use the global keyword, the variable belongs to the global scope:
def myfunc():
    global x
    x = "fantastic"
myfunc()
print("Python is " + x)
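The global keyword is also needed when a function changes a global variable that already exists. The sketch below assumes a hypothetical counter variable and an increment() function, neither of which comes from the examples above.

```python
counter = 0  # global variable

def increment():
    # Without this declaration, "counter += 1" would raise UnboundLocalError,
    # because the assignment would make counter a local name.
    global counter
    counter += 1

increment()
increment()
print(counter)  # 2
```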
Python Data Types:
Built-in Data Types: In programming, data type is an important concept. Variables can
store data of different types, and different types can do different things.

Python has the following data types built-in by default, in these categories:

Text Type: str

Numeric Types: int, float, complex

Sequence Types: list, tuple, range

Mapping Type: dict

Set Types: set, frozenset

Boolean Type: bool

Binary Types: bytes, bytearray, memoryview


Getting the Data Type:

You can get the data type of any object by using the type() function:

Example
Print the data type of the variable x:
x=5
print(type(x))
Setting the Data Type:

In Python, the data type is set when you assign a value to a variable:

Example                                         Data Type

x = "Hello World"                               str
x = 20                                          int
x = 20.5                                        float
x = 1j                                          complex
x = ["apple", "banana", "cherry"]               list
x = ("apple", "banana", "cherry")               tuple
x = range(6)                                    range
x = {"name" : "John", "age" : 36}               dict
x = {"apple", "banana", "cherry"}               set
x = frozenset({"apple", "banana", "cherry"})    frozenset
x = True                                        bool
x = b"Hello"                                    bytes
x = bytearray(5)                                bytearray
x = memoryview(bytes(5))                        memoryview
Setting the Specific Data Type:

If you want to specify the data type, you can use the following constructor functions:

Example                                         Data Type

x = str("Hello World")                          str
x = int(20)                                     int
x = float(20.5)                                 float
x = complex(1j)                                 complex
x = list(("apple", "banana", "cherry"))         list
x = tuple(("apple", "banana", "cherry"))        tuple
x = range(6)                                    range
x = dict(name="John", age=36)                   dict
x = set(("apple", "banana", "cherry"))          set
x = frozenset(("apple", "banana", "cherry"))    frozenset
x = bool(5)                                     bool
x = bytes(5)                                    bytes
x = bytearray(5)                                bytearray
x = memoryview(bytes(5))                        memoryview

Python Numbers

There are three numeric types in Python:

• int
• float
• complex

Variables of numeric types are created when you assign a value to them:

Example
x = 1 # int
y = 2.8 # float
z = 1j # complex
To verify the type of any object in Python, use the type() function:

Example
print(type(x))
print(type(y))
print(type(z))
Int:
Int, or integer, is a whole number, positive or negative, without decimals, of unlimited length.
Example

Integers:

x=1
y = 35656222554887711
z = -3255522

print(type(x))
print(type(y))
print(type(z))

Float:
Float, or "floating point number" is a number, positive or negative, containing one or
more decimals.
Example

Floats:

x = 1.10
y = 1.0
z = -35.59
print(type(x))
print(type(y))
print(type(z))

Float can also be scientific numbers with an "e" to indicate the power of 10.

Example

Floats:

x = 35e3
y = 12E4
z = -87.7e100
print(type(x))
print(type(y))
print(type(z))
Complex:

Complex numbers are written with a "j" as the imaginary part:

Example

Complex:

x = 3+5j
y = 5j
z = -5j
print(type(x))
print(type(y))
print(type(z))

Type Conversion:
You can convert from one type to another with the int(), float(), and complex() functions:
Example
Convert from one type to another:
x = 1 # int
y = 2.8 # float
z = 1j # complex
#convert from int to float:
a = float(x)
#convert from float to int:
b = int(y)
#convert from int to complex:
c = complex(x)
print(a)
print(b)
print(c)
print(type(a))
print(type(b))
print(type(c))

Note: You cannot convert complex numbers into another number type.
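As a small illustration of this note, passing a complex number to int() raises a TypeError. The parts of a complex number can still be read through its real and imag attributes, and abs() gives its magnitude. The variable names below are chosen only for this sketch.

```python
z = 3 + 4j

try:
    int(z)
    converted = True
except TypeError:
    # complex values cannot be passed to int() or float()
    converted = False

real_part = z.real       # 3.0
imaginary_part = z.imag  # 4.0
magnitude = abs(z)       # 5.0

print(converted, real_part, imaginary_part, magnitude)
```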

Random Number:
Python does not have a built-in random() function to make a random number, but it has a
standard-library module called random that can be used to make random numbers:
Example
Import the random module, and display a random number between 1 and 9:
import random
print(random.randrange(1, 10))
Python Casting:
Specify a Variable Type:
There may be times when you want to specify a type for a variable. This can be done with
casting. Python is an object-oriented language, and as such it uses classes to define data
types, including its primitive types. Casting in Python is therefore done using constructor
functions:

• int() - constructs an integer number from an integer literal, a float literal (by removing
all decimals), or a string literal (providing the string represents a whole number)
• float() - constructs a float number from an integer literal, a float literal or a string
literal (providing the string represents a float or an integer)
• str() - constructs a string from a wide variety of data types, including strings, integer
literals and float literals

Example
Integers:
x = int(1) # x will be 1
y = int(2.8) # y will be 2
z = int("3") # z will be 3
Example
Floats:
x = float(1) # x will be 1.0
y = float(2.8) # y will be 2.8
z = float("3") # z will be 3.0
w = float("4.2") # w will be 4.2
Example
Strings:
x = str("s1") # x will be 's1'
y = str(2) # y will be '2'
z = str(3.0) # z will be '3.0'
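The constructor rules above have edge cases worth trying. The sketch below shows that int() rejects a string that does not represent a whole number, and that int() removes decimals rather than rounding. The variable names are illustrative.

```python
whole = int("3")  # 3

try:
    int("3.7")  # the string does not represent a whole number
except ValueError as err:
    print("ValueError:", err)

# Going through float() first works; the decimals are removed, not rounded.
truncated = int(float("3.7"))  # 3
negative = int(-2.8)           # -2, truncation goes toward zero

print(whole, truncated, negative)
```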

Python Strings:
Strings: Strings in python are surrounded by either single quotation marks, or double
quotation marks.
'hello' is the same as "hello".
You can display a string literal with the print() function:
Example
print("Hello")
print('Hello')
Assign String to a Variable:
Assigning a string to a variable is done with the variable name followed by an equal sign and
the string:
Example
a = "Hello"
print(a)

Multiline Strings:
You can assign a multiline string to a variable by using three quotes:
Example
You can use three double quotes:
a = """Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."""
print(a)

Or three single quotes:

Example
a = '''Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.'''
print(a)

Note: in the result, the line breaks are inserted at the same position as in the code.

Strings are Arrays:


Like arrays in many other popular programming languages, strings in Python are sequences
of Unicode characters. However, Python does not have a character data type; a
single character is simply a string with a length of 1.
Square brackets can be used to access elements of the string.
Example
Get the character at position 1 (remember that the first character has the position 0):

a = "Hello, World!"
print(a[1])
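To make the point about characters concrete, the sketch below uses a hypothetical word containing an accented letter. Indexing and len() count characters, while the encoded form can contain more bytes than the string has characters.

```python
word = "caf\u00e9"  # the word "café", written with an escape for the accented letter

char_count = len(word)                  # 4 characters
byte_count = len(word.encode("utf-8"))  # 5 bytes; the accented letter needs two
last_char = word[3]                     # a string of length 1

print(char_count, byte_count, type(last_char))
```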
Looping Through a String:
Since strings are arrays, we can loop through the characters in a string, with a for loop.
Example
Loop through the letters in the word "banana":
for x in "banana":
    print(x)

String Length:
To get the length of a string, use the len() function.
Example
The len() function returns the length of a string:
a = "Hello, World!"
print(len(a))

Check String:
To check if a certain phrase or character is present in a string, we can use the keyword in.
Example
Check if "free" is present in the following text:
txt = "The best things in life are free!"
print("free" in txt)
Use it in an if statement:
Example
Print only if "free" is present:
txt = "The best things in life are free!"
if "free" in txt:
    print("Yes, 'free' is present.")
Check if NOT:
To check if a certain phrase or character is NOT present in a string, we can use the
keyword not in.
Example
Check if "expensive" is NOT present in the following text:
txt = "The best things in life are free!"
print("expensive" not in txt)

Use it in an if statement:

Example
print only if "expensive" is NOT present:
txt = "The best things in life are free!"
if "expensive" not in txt:
    print("No, 'expensive' is NOT present.")
Python - Slicing Strings:
You can return a range of characters by using the slice syntax.
Specify the start index and the end index, separated by a colon, to return a part of the string.
Example
Get the characters from position 2 to position 5 (not included):
b = "Hello, World!"
print(b[2:5])
Slice From the Start:
By leaving out the start index, the range will start at the first character:
Example
Get the characters from the start to position 5 (not included):
b = "Hello, World!"
print(b[:5])
Slice To the End:
By leaving out the end index, the range will go to the end:
Example

Get the characters from position 2, and all the way to the end:

b = "Hello, World!"
print(b[2:])

Negative Indexing:
Use negative indexes to start the slice from the end of the string:

Example
Get the characters:
From: "o" in "World!" (position -5)
To, but not included: "d" in "World!" (position -2):
b = "Hello, World!"
print(b[-5:-2])
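The slices from this section are collected below with their results, as a quick reference. The variable names on the left are illustrative only.

```python
b = "Hello, World!"

middle = b[2:5]      # "llo"          positions 2, 3 and 4
start = b[:5]        # "Hello"        from the start up to position 5
rest = b[2:]         # "llo, World!"  from position 2 to the end
from_end = b[-5:-2]  # "orl"          counted from the end of the string

print(middle, start, rest, from_end, sep=" | ")
```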
Python - Modify Strings
Upper Case:
Example
The upper() method returns the string in upper case:
a = "Hello, World!"
print(a.upper())
Lower Case:
Example
The lower() method returns the string in lower case:
a = "Hello, World!"
print(a.lower())

Remove Whitespace:
Whitespace is the space before and/or after the actual text, and very often you want to remove
this space.
Example
The strip() method removes any whitespace from the beginning or the end:
a = " Hello, World! "
print(a.strip()) # returns "Hello, World!"
Replace String:
Example
The replace() method replaces a string with another string:
a = "Hello, World!"
print(a.replace("H", "J"))
Split String:
The split() method returns a list where the text between the specified separator becomes the
list items.
Example
The split() method splits the string into substrings if it finds instances of the separator:
a = "Hello, World!"
print(a.split(",")) # returns ['Hello', ' World!']
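Note that the second list item above keeps its leading space. One possible way to clean the pieces, sketched here, is to combine split() with strip(), and join() can then put them back together. The names parts, cleaned and rejoined are made up for this example.

```python
a = "Hello, World!"

parts = a.split(",")                        # ['Hello', ' World!']
cleaned = [part.strip() for part in parts]  # ['Hello', 'World!']
rejoined = ", ".join(cleaned)               # 'Hello, World!'

print(parts, cleaned, rejoined)
```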
Python String Format:
As we learned in the Python Variables chapter, we cannot combine strings and numbers like
this:
Example
age = 36
txt = "My name is John, I am " + age
print(txt)

But we can combine strings and numbers by using the format() method!
The format() method takes the passed arguments, formats them, and places them in the string
where the placeholders {} are:
Example
Use the format() method to insert numbers into strings:
age = 36
txt = "My name is John, and I am {}"
print(txt.format(age))
The format() method takes an unlimited number of arguments, which are placed into the
respective placeholders:
Example
quantity = 3
itemno = 567
price = 49.95
myorder = "I want {} pieces of item {} for {} dollars."
print(myorder.format(quantity, itemno, price))
You can use index numbers {0} to be sure the arguments are placed in the correct
placeholders:
Example
quantity = 3
itemno = 567
price = 49.95
myorder = "I want to pay {2} dollars for {0} pieces of item {1}."
print(myorder.format(quantity, itemno, price))
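As an alternative to format(), Python 3.6 and later also support formatted string literals (f-strings), where the variables are written directly inside the placeholders. This is a brief sketch reusing the variable names from the example above; the :.2f format specifier is an addition that fixes the price to two decimals.

```python
quantity = 3
itemno = 567
price = 49.95

# The f prefix lets the placeholders refer to variables by name.
myorder = f"I want {quantity} pieces of item {itemno} for {price:.2f} dollars."
print(myorder)  # I want 3 pieces of item 567 for 49.95 dollars.
```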
String Methods:

Python has a set of built-in methods that you can use on strings.

Note: All string methods return new values. They do not change the original string.

Method          Description

capitalize()    Converts the first character to upper case
casefold()      Converts string into lower case
center()        Returns a centered string
count()         Returns the number of times a specified value occurs in a string
encode()        Returns an encoded version of the string
endswith()      Returns True if the string ends with the specified value
expandtabs()    Sets the tab size of the string
find()          Searches the string for a specified value and returns the position of where it was found (-1 if not found)
format()        Formats specified values in a string
format_map()    Formats specified values in a string, taking them from a mapping
index()         Searches the string for a specified value and returns the position of where it was found (raises ValueError if not found)
isalnum()       Returns True if all characters in the string are alphanumeric
isalpha()       Returns True if all characters in the string are in the alphabet
isdecimal()     Returns True if all characters in the string are decimals
isdigit()       Returns True if all characters in the string are digits
isidentifier()  Returns True if the string is an identifier
islower()       Returns True if all characters in the string are lower case
isnumeric()     Returns True if all characters in the string are numeric
isprintable()   Returns True if all characters in the string are printable
isspace()       Returns True if all characters in the string are whitespaces
istitle()       Returns True if the string follows the rules of a title
isupper()       Returns True if all characters in the string are upper case
join()          Joins the elements of an iterable into one string, using the string as the separator
ljust()         Returns a left justified version of the string
lower()         Converts a string into lower case
lstrip()        Returns a left trim version of the string
maketrans()     Returns a translation table to be used in translations
partition()     Returns a tuple where the string is parted into three parts
replace()       Returns a string where a specified value is replaced with a specified value
rfind()         Searches the string for a specified value and returns the last position of where it was found (-1 if not found)
rindex()        Searches the string for a specified value and returns the last position of where it was found (raises ValueError if not found)
rjust()         Returns a right justified version of the string
rpartition()    Returns a tuple where the string is parted into three parts
rsplit()        Splits the string at the specified separator, and returns a list
rstrip()        Returns a right trim version of the string
split()         Splits the string at the specified separator, and returns a list
splitlines()    Splits the string at line breaks and returns a list
startswith()    Returns True if the string starts with the specified value
strip()         Returns a trimmed version of the string
swapcase()      Swaps cases, lower case becomes upper case and vice versa
title()         Converts the first character of each word to upper case
translate()     Returns a translated string
upper()         Converts a string into upper case
zfill()         Fills the string with a specified number of 0 values at the beginning
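Two points from this section can be checked with a short sketch: string methods return a new string and leave the original unchanged, and find() and index() differ in how they report a missing value. The variable names are illustrative.

```python
txt = "Hello, World!"

shouted = txt.upper()
print(txt)      # Hello, World!  (the original is unchanged)
print(shouted)  # HELLO, WORLD!

position = txt.find("xyz")  # -1, because "xyz" does not occur

try:
    txt.index("xyz")
    index_failed = False
except ValueError:
    # index() raises an exception instead of returning -1
    index_failed = True

print(position, index_failed)
```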
Python Booleans:
Booleans represent one of two values: True or False.
Boolean Values:
In programming you often need to know if an expression is True or False.
You can evaluate any expression in Python, and get one of two answers, True or False.
When you compare two values, the expression is evaluated and Python returns the Boolean
answer:
Example
print(10 > 9)
print(10 == 9)
print(10 < 9)

When you run a condition in an if statement, Python returns True or False:


Example

Print a message based on whether the condition is True or False:


a = 200
b = 33
if b > a:
    print("b is greater than a")
else:
    print("b is not greater than a")

Evaluate Values and Variables:

The bool() function allows you to evaluate any value, and gives you True or False in return.

Example
Evaluate a string and a number:
print(bool("Hello"))
print(bool(15))
Example
Evaluate two variables:
x = "Hello"
y = 15
print(bool(x))
print(bool(y))
Most Values are True
Almost any value is evaluated to True if it has some sort of content.
Any string is True, except empty strings.
Any number is True, except 0.
Any list, tuple, set, and dictionary are True, except empty ones.
Example
The following will return True:
bool("abc")
bool(123)
bool(["apple", "cherry", "banana"])

Some Values are False:


In fact, there are not many values that evaluate to False, except empty values, such
as (), [], {}, "", the number 0, and the value None. And of course the value False evaluates
to False.
Example
The following will return False:

bool(False)
bool(None)
bool(0)
bool("")
bool(())
bool([])
bool({})

One more value, or object in this case, evaluates to False, and that is if you have an object
that is made from a class with a __len__ function that returns 0 or False:

Example
class myclass():
    def __len__(self):
        return 0
myobj = myclass()
print(bool(myobj))

Functions can return a Boolean:


You can create functions that return a Boolean Value:
Example
Print the answer of a function:
def myFunction():
    return True
print(myFunction())

You can execute code based on the Boolean answer of a function:

Example
Print "YES!" if the function returns True, otherwise print "NO!":
def myFunction():
    return True
if myFunction():
    print("YES!")
else:
    print("NO!")

Python also has many built-in functions that return a boolean value, like
the isinstance() function, which can be used to determine if an object is of a certain data type:

Example
Check if an object is an integer or not:
x = 200

print(isinstance(x, int))
Python Operators:
Operators are used to perform operations on variables and values.
In the example below, we use the + operator to add together two values:
Example
print(10 + 5)
Python divides the operators in the following groups:

 Arithmetic operators
 Assignment operators
 Comparison operators
 Logical operators
 Identity operators
 Membership operators
 Bitwise operators

Python Arithmetic Operators:

Arithmetic operators are used with numeric values to perform common mathematical
operations:

Operator    Name              Example

+           Addition          x + y
-           Subtraction       x - y
*           Multiplication    x * y
/           Division          x / y
%           Modulus           x % y
**          Exponentiation    x ** y
//          Floor division    x // y
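The difference between /, // and % is easiest to see with concrete numbers. The values below are chosen only for illustration; note that floor division rounds down, which matters for negative numbers.

```python
x = 7
y = 2

division = x / y         # 3.5   always returns a float
floor_division = x // y  # 3     rounded down to a whole number
modulus = x % y          # 1     remainder of the division
power = x ** y           # 49    7 to the power of 2

negative_floor = -7 // 2  # -4, because the result is rounded down
negative_mod = -7 % 2     # 1, the result takes the sign of the divisor

print(division, floor_division, modulus, power, negative_floor, negative_mod)
```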

Python Assignment Operators:

Assignment operators are used to assign values to variables:

Operator    Example     Same As

=           x = 5       x = 5
+=          x += 3      x = x + 3
-=          x -= 3      x = x - 3
*=          x *= 3      x = x * 3
/=          x /= 3      x = x / 3
%=          x %= 3      x = x % 3
//=         x //= 3     x = x // 3
**=         x **= 3     x = x ** 3
&=          x &= 3      x = x & 3
|=          x |= 3      x = x | 3
^=          x ^= 3      x = x ^ 3
>>=         x >>= 3     x = x >> 3
<<=         x <<= 3     x = x << 3

Python Comparison Operators:

Comparison operators are used to compare two values:

Operator    Name                        Example

==          Equal                       x == y
!=          Not equal                   x != y
>           Greater than                x > y
<           Less than                   x < y
>=          Greater than or equal to    x >= y
<=          Less than or equal to       x <= y

Python Logical Operators:

Logical operators are used to combine conditional statements:

Operator    Description                                                 Example

and         Returns True if both statements are true                    x < 5 and x < 10
or          Returns True if one of the statements is true               x < 5 or x < 4
not         Reverses the result, returns False if the result is true    not(x < 5 and x < 10)

Python Identity Operators:

Identity operators are used to compare the objects, not if they are equal, but if they are
actually the same object, with the same memory location:

Operator    Description                                              Example

is          Returns True if both variables are the same object       x is y
is not      Returns True if both variables are not the same object   x is not y
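The difference between equality (==) and identity (is) shows up with two lists that have the same content. The following sketch uses illustrative variable names x, y and z.

```python
x = ["apple", "banana"]
y = ["apple", "banana"]  # same content, but a separate list object
z = x                    # a second name for the same object as x

equal_content = x == y  # True: the contents match
same_object = x is y    # False: two different objects in memory
same_alias = x is z     # True: z refers to the very same object

print(equal_content, same_object, same_alias)
```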
Python Membership Operators:

Membership operators are used to test if a sequence is present in an object:

Operator    Description                                                                     Example

in          Returns True if a sequence with the specified value is present in the object     x in y
not in      Returns True if a sequence with the specified value is not present in the object x not in y

Python Bitwise Operators:

Bitwise operators are used to operate on the individual bits of (binary) numbers:

Operator    Name                    Description

&           AND                     Sets each bit to 1 if both bits are 1
|           OR                      Sets each bit to 1 if one of two bits is 1
^           XOR                     Sets each bit to 1 if only one of two bits is 1
~           NOT                     Inverts all the bits
<<          Zero fill left shift    Shift left by pushing zeros in from the right (Python integers grow as needed, so no bits are lost)
>>          Signed right shift      Shift right by pushing copies of the leftmost bit in from the left, and let the rightmost bits fall off
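A small worked example may help with the table above. The numbers 6 and 3 are chosen only for illustration; their binary forms are shown in the comments.

```python
a = 6  # binary 110
b = 3  # binary 011

and_result = a & b   # 2   binary 010
or_result = a | b    # 7   binary 111
xor_result = a ^ b   # 5   binary 101
not_result = ~a      # -7  inverting all bits of n gives -(n + 1)
left_shift = a << 1  # 12  binary 1100
right_shift = a >> 1 # 3   binary 11

print(and_result, or_result, xor_result, not_result, left_shift, right_shift)
```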
Python Lists:

mylist = ["apple", "banana", "cherry"]

List:
Lists are used to store multiple items in a single variable.
Lists are one of 4 built-in data types in Python used to store collections of data, the other 3
are Tuple, Set, and Dictionary, all with different qualities and usage.
Lists are created using square brackets:
Example

Create a List:

thislist = ["apple", "banana", "cherry"]


print(thislist)
List Items:
List items are ordered, changeable, and allow duplicate values.
List items are indexed, the first item has index [0], the second item has index [1] etc.
Ordered:
When we say that lists are ordered, it means that the items have a defined order, and that
order will not change. If you add new items to a list, the new items will be placed at the end
of the list.

Note: There are some list methods that will change the order, but in general: the order of the
items will not change.

Changeable:
The list is changeable, meaning that we can change, add, and remove items in a list after it
has been created.
Allow Duplicates:
Since lists are indexed, lists can have items with the same value:
Example
Lists allow duplicate values:
thislist = ["apple", "banana", "cherry", "apple", "cherry"]
print(thislist)
List Length:
To determine how many items a list has, use the len() function:
Example
Print the number of items in the list:
thislist = ["apple", "banana", "cherry"]
print(len(thislist))
List Items - Data Types:
List items can be of any data type:
Example
String, int and boolean data types:
list1 = ["apple", "banana", "cherry"]
list2 = [1, 5, 7, 9, 3]
list3 = [True, False, False]

A list can contain different data types:

Example
A list with strings, integers and boolean values:
list1 = ["abc", 34, True, 40, "male"]
type()
From Python's perspective, lists are defined as objects with the data type 'list':
<class 'list'>
Example
What is the data type of a list?
mylist = ["apple", "banana", "cherry"]
print(type(mylist))
The list() Constructor:
It is also possible to use the list() constructor when creating a new list.
Example
Using the list() constructor to make a List:
thislist = list(("apple", "banana", "cherry")) # note the double round-brackets
print(thislist)
Python Collections (Arrays):
There are four collection data types in the Python programming language:
• List is a collection which is ordered and changeable. Allows duplicate members.
• Tuple is a collection which is ordered and unchangeable. Allows duplicate members.
• Set is a collection which is unordered, unchangeable*, and unindexed. No duplicate
members.
• Dictionary is a collection which is ordered** and changeable. No duplicate members.
*Set items are unchangeable, but you can remove items and add new items.
**As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries
are unordered.
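The properties listed above can be compared side by side. This is a rough sketch with made-up variable names; it builds each collection from the same values, including a duplicate, and then tries to change the tuple.

```python
fruits_list = ["apple", "banana", "apple"]           # duplicates are kept
fruits_tuple = ("apple", "banana", "apple")          # duplicates are kept
fruits_set = {"apple", "banana", "apple"}            # the duplicate is dropped
fruits_dict = {"apple": 1, "banana": 2, "apple": 3}  # the last value for a repeated key wins

sizes = (len(fruits_list), len(fruits_tuple), len(fruits_set), len(fruits_dict))
print(sizes)  # (3, 3, 2, 2)

try:
    fruits_tuple[0] = "kiwi"  # tuples are unchangeable
    tuple_changed = True
except TypeError:
    tuple_changed = False

fruits_list[0] = "kiwi"  # lists are changeable
```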
Access Items
List items are indexed and you can access them by referring to the index number:
Example
Print the second item of the list:
thislist = ["apple", "banana", "cherry"]
print(thislist[1])
Negative Indexing:
Negative indexing means start from the end
-1 refers to the last item, -2 refers to the second last item etc.
Example
Print the last item of the list:
thislist = ["apple", "banana", "cherry"]
print(thislist[-1])
Range of Indexes:
You can specify a range of indexes by specifying where to start and where to end the range.
When specifying a range, the return value will be a new list with the specified items.
Example
Return the third, fourth, and fifth item:
thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]
print(thislist[2:5])
Append Items:
To add an item to the end of the list, use the append() method:
Example
Using the append() method to append an item:
thislist = ["apple", "banana", "cherry"]
thislist.append("orange")
print(thislist)
Insert Items:
To insert a list item at a specified index, use the insert() method.
The insert() method inserts an item at the specified index:
Example
Insert an item in the second position:
thislist = ["apple", "banana", "cherry"]
thislist.insert(1, "orange")
print(thislist)
Extend List:
To append elements from another list to the current list, use the extend() method.
Example
Add the elements of tropical to thislist:
thislist = ["apple", "banana", "cherry"]
tropical = ["mango", "pineapple", "papaya"]
thislist.extend(tropical)
print(thislist)
Remove Specified Item:
The remove() method removes the specified item.
Example
Remove "banana":
thislist = ["apple", "banana", "cherry"]
thislist.remove("banana")
print(thislist)
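Two details of remove() are worth trying out: it removes only the first matching item, and it raises a ValueError if the value is not in the list. A minimal sketch, using a hypothetical list named basket:

```python
basket = ["apple", "banana", "cherry", "banana"]

basket.remove("banana")  # only the first "banana" is removed
print(basket)

# Removing a value that is not present raises ValueError.
missing_raised = False
try:
    basket.remove("grape")
except ValueError:
    missing_raised = True
print(missing_raised)
```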
Remove Specified Index:
The pop() method removes the item at the specified index and returns it.
Example
Remove the second item:
thislist = ["apple", "banana", "cherry"]
thislist.pop(1)
print(thislist)
If you do not specify the index, the pop() method removes the last item.
Example
Remove the last item:
thislist = ["apple", "banana", "cherry"]
thislist.pop()
print(thislist)
The del keyword also removes the item at the specified index:
Example
Remove the first item:
thislist = ["apple", "banana", "cherry"]
del thislist[0]
print(thislist)
The del keyword can also delete the list completely.
Example
Delete the entire list:
thislist = ["apple", "banana", "cherry"]
del thislist
Clear the List:
The clear() method empties the list.
The list still remains, but it has no content.
Example
Clear the list content:
thislist = ["apple", "banana", "cherry"]
thislist.clear()
print(thislist)
Python - Loop Lists:
Loop Through a List
You can loop through the list items by using a for loop:
Example
Print all items in the list, one by one:
thislist = ["apple", "banana", "cherry"]
for x in thislist:
    print(x)
Loop Through the Index Numbers:
You can also loop through the list items by referring to their index number.
Use the range() and len() functions to create a suitable iterable.
Example
Print all items by referring to their index number:
thislist = ["apple", "banana", "cherry"]
for i in range(len(thislist)):
    print(thislist[i])
The iterable created in the example above is range(0, 3), which yields the indexes 0, 1, and 2.
Using a While Loop:
You can loop through the list items by using a while loop.
Use the len() function to determine the length of the list, then start at 0 and loop your way
through the list items by referring to their indexes.
Remember to increase the index by 1 after each iteration.
Example
Print all items, using a while loop to go through all the index numbers
thislist = ["apple", "banana", "cherry"]
i = 0
while i < len(thislist):
    print(thislist[i])
    i = i + 1
Looping Using List Comprehension:
List Comprehension offers the shortest syntax for looping through lists:
Example
A shorthand for loop that prints all items in a list (the list of None values it builds is discarded):
thislist = ["apple", "banana", "cherry"]
[print(x) for x in thislist]
List Comprehension:
List comprehension offers a shorter syntax when you want to create a new list based on the
values of an existing list.
Example:
Based on a list of fruits, you want a new list, containing only the fruits with the letter "a" in
the name.
Without list comprehension you will have to write a for statement with a conditional test
inside:
Example
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = []
for x in fruits:
    if "a" in x:
        newlist.append(x)
print(newlist)
With list comprehension you can do all that with only one line of code:
Example
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = [x for x in fruits if "a" in x]
print(newlist)
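A list comprehension has the general form [expression for item in iterable if condition], and the condition is optional. The sketch below, with illustrative variable names (shouting, lengths), shows the expression part being used to transform each item rather than just copy it:

```python
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

# Transform every item: no condition is needed.
shouting = [x.upper() for x in fruits]

# Transform and filter at the same time.
lengths = [len(x) for x in fruits if x != "kiwi"]

print(shouting)
print(lengths)
```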
Sort List Alphanumerically
List objects have a sort() method that will sort the list alphanumerically, ascending, by
default:
Example
Sort the list alphabetically:
thislist = ["orange", "mango", "kiwi", "pineapple", "banana"]
thislist.sort()
print(thislist)
Example
Sort the list numerically:
thislist = [100, 50, 65, 82, 23]
thislist.sort()
print(thislist)
Sort Descending:
To sort descending, use the keyword argument reverse = True:
Example
Sort the list descending:
thislist = ["orange", "mango", "kiwi", "pineapple", "banana"]
thislist.sort(reverse = True)
print(thislist)
Example
Sort the list descending:
thislist = [100, 50, 65, 82, 23]
thislist.sort(reverse = True)
print(thislist)
Customize Sort Function:
You can also customize your own function by using the keyword argument key = function.
The function will return a number that will be used to sort the list (the lowest number first):
Example
Sort the list based on how close the number is to 50:
def myfunc(n):
    return abs(n - 50)
thislist = [100, 50, 65, 82, 23]
thislist.sort(key = myfunc)
print(thislist)
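The sort() method changes the list in place and returns None. When the original order must be kept, the built-in sorted() function accepts the same key and reverse arguments and returns a new list. A small sketch, where distance_from_50 is a hypothetical helper equivalent to the key function above:

```python
def distance_from_50(n):
    # Smaller results sort first.
    return abs(n - 50)

numbers = [100, 50, 65, 82, 23]

closest_first = sorted(numbers, key=distance_from_50)  # new list
print(closest_first)
print(numbers)  # still in the original order

in_place_result = numbers.sort()  # sorts numbers itself, returns None
print(in_place_result)
print(numbers)
```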
Case Insensitive Sort:
By default the sort() method is case sensitive, resulting in all capital letters being sorted
before lower case letters:
Example
Case sensitive sorting can give an unexpected result:
thislist = ["banana", "Orange", "Kiwi", "cherry"]
thislist.sort()
print(thislist)
Luckily we can use built-in functions as key functions when sorting a list.
So if you want a case-insensitive sort function, use str.lower as a key function:
Example
Perform a case-insensitive sort of the list:
thislist = ["banana", "Orange", "Kiwi", "cherry"]
thislist.sort(key = str.lower)
print(thislist)
Reverse Order

What if you want to reverse the order of a list, regardless of the alphabet?

The reverse() method reverses the current order of the elements, whether or not the list is sorted.

Example

Reverse the order of the list items:

thislist = ["banana", "Orange", "Kiwi", "cherry"]


thislist.reverse()
print(thislist)

Copy a List

You cannot copy a list simply by typing list2 = list1, because: list2 will only be
a reference to list1, and changes made in list1 will automatically also be made in list2.

There are ways to make a copy, one way is to use the built-in List method copy().

Example

Make a copy of a list with the copy() method:

thislist = ["apple", "banana", "cherry"]


mylist = thislist.copy()
print(mylist)

Another way to make a copy is to use the built-in list() constructor.

Example

Make a copy of a list with the list() constructor:

thislist = ["apple", "banana", "cherry"]


mylist = list(thislist)
print(mylist)
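The difference between a reference and a copy can be seen by changing the original list afterwards. In this sketch the names original, alias, and snapshot are illustrative:

```python
original = ["apple", "banana", "cherry"]

alias = original            # second name for the same list object
snapshot = original.copy()  # independent copy of the list

original.append("orange")

print(alias)     # the change is visible through the alias
print(snapshot)  # the copy still has three items
```

Note that copy() and list() make a shallow copy: the new list is independent, but any mutable items inside it are still shared with the original.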

Join Two Lists

There are several ways to join, or concatenate, two or more lists in Python.

One of the easiest ways is to use the + operator.

Example

Join two lists:

list1 = ["a", "b", "c"]


list2 = [1, 2, 3]

list3 = list1 + list2


print(list3)

Another way to join two lists is by appending all the items from list2 into list1, one by one:
Example

Append list2 into list1:

list1 = ["a", "b" , "c"]


list2 = [1, 2, 3]
for x in list2:
    list1.append(x)
print(list1)

Or you can use the extend() method, whose purpose is to add elements from one list to
another list:

Example

Use the extend() method to add list2 at the end of list1:

list1 = ["a", "b" , "c"]


list2 = [1, 2, 3]

list1.extend(list2)
print(list1)

List Methods

Python has a set of built-in methods that you can use on lists.

Method Description

append() Adds an element at the end of the list

clear() Removes all the elements from the list

copy() Returns a copy of the list

count() Returns the number of elements with the specified value

extend() Add the elements of a list (or any iterable), to the end of the current list

index() Returns the index of the first element with the specified value

insert() Adds an element at the specified position

pop() Removes the element at the specified position

remove() Removes the item with the specified value

reverse() Reverses the order of the list

sort() Sorts the list


Python Tuples:

mytuple = ("apple", "banana", "cherry")


Tuple: Tuples are used to store multiple items in a single variable.
Tuple is one of 4 built- in data types in Python used to store collections of data, the other 3
are List, Set, and Dictionary, all with different qualities and usage.
A tuple is a collection which is ordered and unchangeable.
Tuples are written with round brackets.
Example
Create a Tuple:
thistuple = ("apple", "banana", "cherry")
print(thistuple)
Tuple Items
Tuple items are ordered, unchangeable, and allow duplicate values.
Tuple items are indexed, the first item has index [0], the second item has index [1] etc.
Ordered
When we say that tuples are ordered, it means that the items have a defined order, and that
order will not change.
Unchangeable

Tuples are unchangeable, meaning that we cannot change, add or remove items after the tuple
has been created.

Allow Duplicates

Since tuples are indexed, they can have items with the same value:

Example

Tuples allow duplicate values:

thistuple = ("apple", "banana", "cherry", "apple", "cherry")


print(thistuple)
Access Tuple Items

You can access tuple items by referring to the index number, inside square brackets:

Example

Print the second item in the tuple:

thistuple = ("apple", "banana", "cherry")


print(thistuple[1])

Note: The first item has index 0.

Negative Indexing

Negative indexing means start from the end.

-1 refers to the last item, -2 refers to the second last item etc.

Example

Print the last item of the tuple:


thistuple = ("apple", "banana", "cherry")
print(thistuple[-1])

Range of Indexes

You can specify a range of indexes by specifying where to start and where to end the range.

When specifying a range, the return value will be a new tuple with the specified items.

Example

Return the third, fourth, and fifth item:

thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")


print(thistuple[2:5])

Note: The search will start at index 2 (included) and end at index 5 (not included).

Remember that the first item has index 0.

By leaving out the start value, the range will start at the first item:

Example

This example returns the items from the beginning up to, but NOT including, "kiwi":

thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")


print(thistuple[:4])

By leaving out the end value, the range will go on to the end of the tuple:

Example

This example returns the items from "cherry" and to the end:

thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")


print(thistuple[2:])

Range of Negative Indexes

Specify negative indexes if you want to start the search from the end of the tuple:

Example

This example returns the items from index -4 (included) to index -1 (excluded)

thistuple = ("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")


print(thistuple[-4:-1])
Check if Item Exists

To determine if a specified item is present in a tuple use the in keyword:

Example

Check if "apple" is present in the tuple:


thistuple = ("apple", "banana", "cherry")
if "apple" in thistuple:
    print("Yes, 'apple' is in the fruits tuple")
Python - Update Tuples

Tuples are unchangeable, meaning that you cannot change, add, or remove items once the
tuple is created.

Change Tuple Values

Once a tuple is created, you cannot change its values. Tuples are unchangeable,
or immutable as it also is called.

But there is a workaround. You can convert the tuple into a list, change the list, and convert
the list back into a tuple.

Example

Convert the tuple into a list to be able to change it:

x = ("apple", "banana", "cherry")


y = list(x)
y[1] = "kiwi"
x = tuple(y)

print(x)
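To see why the workaround is needed, the following sketch tries to assign to a tuple item directly, which raises a TypeError. It also shows an alternative that builds a new tuple from slices instead of converting to a list. The names fruits, error_message, and changed are illustrative:

```python
fruits = ("apple", "banana", "cherry")

# Direct item assignment is not allowed on a tuple.
try:
    fruits[1] = "kiwi"
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Alternative: build a new tuple from slices of the old one.
changed = fruits[:1] + ("kiwi",) + fruits[2:]
print(changed)
```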

Add Items

Since tuples are immutable, they do not have a built-in append() method, but there are other
ways to add items to a tuple.

1. Convert into a list: Just like the workaround for changing a tuple, you can convert it into a
list, add your item(s), and convert it back into a tuple.

Example

Convert the tuple into a list, add "orange", and convert it back into a tuple:

thistuple = ("apple", "banana", "cherry")


y = list(thistuple)
y.append("orange")
thistuple = tuple(y)

2. Add tuple to a tuple. You are allowed to add tuples to tuples, so if you want to add one
item, (or many), create a new tuple with the item(s), and add it to the existing tuple:

Example

Create a new tuple with the value "orange", and add that tuple:

thistuple = ("apple", "banana", "cherry")


y = ("orange",)
thistuple += y
print(thistuple)

Python - Unpack Tuples:

When we create a tuple, we normally assign values to it. This is called "packing" a tuple:

Example

Packing a tuple:

fruits = ("apple", "banana", "cherry")

But, in Python, we are also allowed to extract the values back into variables. This is called
"unpacking":

Example

Unpacking a tuple:

fruits = ("apple", "banana", "cherry")

(green, yellow, red) = fruits

print(green)
print(yellow)
print(red)

Note: The number of variables must match the number of values in the tuple, if not, you must
use an asterisk to collect the remaining values as a list.

Using Asterisk*

If the number of variables is less than the number of values, you can add an * to the variable
name and the values will be assigned to the variable as a list:

Example

Assign the rest of the values as a list called "red":

fruits = ("apple", "banana", "cherry", "strawberry", "raspberry")

(green, yellow, *red) = fruits

print(green)
print(yellow)
print(red)

If the asterisk is added to another variable name than the last, Python will assign values to the
variable until the number of values left matches the number of variables left.

Example

Assign a list of values to the "tropic" variable:


fruits = ("apple", "mango", "papaya", "pineapple", "cherry")

(green, *tropic, red) = fruits

print(green)
print(tropic)
print(red)
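Two edge cases of unpacking can be checked with a short sketch: the starred variable always receives a list (possibly an empty one), and leaving out the asterisk when the counts differ raises a ValueError. The variable names here are illustrative:

```python
pair = ("apple", "banana")

# The starred variable receives a list, even when nothing is left over.
first, second, *rest = pair
print(rest)

# Without an asterisk, a mismatch in the number of variables raises ValueError.
mismatch_raised = False
try:
    a, b, c = pair
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```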

Python - Loop Tuples


Loop Through a Tuple

You can loop through the tuple items by using a for loop.

Example

Iterate through the items and print the values:

thistuple = ("apple", "banana", "cherry")


for x in thistuple:
    print(x)

Loop Through the Index Numbers

You can also loop through the tuple items by referring to their index number.

Use the range() and len() functions to create a suitable iterable.

Example

Print all items by referring to their index number:

thistuple = ("apple", "banana", "cherry")


for i in range(len(thistuple)):
    print(thistuple[i])

Using a While Loop

You can loop through the tuple items by using a while loop.

Use the len() function to determine the length of the tuple, then start at 0 and loop your way
through the tuple items by referring to their indexes.

Remember to increase the index by 1 after each iteration.

Example

Print all items, using a while loop to go through all the index numbers:

thistuple = ("apple", "banana", "cherry")


i = 0
while i < len(thistuple):
    print(thistuple[i])
    i = i + 1
Join Two Tuples

To join two or more tuples you can use the + operator:

Example

Join two tuples:

tuple1 = ("a", "b" , "c")


tuple2 = (1, 2, 3)

tuple3 = tuple1 + tuple2


print(tuple3)

Multiply Tuples

If you want to multiply the content of a tuple a given number of times, you can use
the * operator:

Example

Multiply the fruits tuple by 2:

fruits = ("apple", "banana", "cherry")


mytuple = fruits * 2

print(mytuple)
Tuple Methods

Python has two built-in methods that you can use on tuples.

Method Description

count() Returns the number of times a specified value occurs in a tuple

index() Searches the tuple for a specified value and returns the position of
where it was found
Python Sets:

myset = {"apple", "banana", "cherry"}

Set: Sets are used to store multiple items in a single variable.

Set is one of 4 built-in data types in Python used to store collections of data, the other 3
are List, Tuple, and Dictionary, all with different qualities and usage.

A set is a collection which is unordered, unchangeable*, and unindexed.

* Note: Set items are unchangeable, but you can remove items and add new items.

Sets are written with curly brackets.

Example
Create a Set:

thisset = {"apple", "banana", "cherry"}


print(thisset)

Note: Sets are unordered, so you cannot be sure in which order the items will appear.

Set Items

Set items are unordered, unchangeable, and do not allow duplicate values.

Unordered

Unordered means that the items in a set do not have a defined order.

Set items can appear in a different order every time you use them, and cannot be referred to
by index or key.

Unchangeable

Set items are unchangeable, meaning that we cannot change the items after the set has been
created.

Once a set is created, you cannot change its items, but you can remove items and add new
items.

Duplicates Not Allowed

Sets cannot have two items with the same value.

Example

Duplicate values will be ignored:

thisset = {"apple", "banana", "cherry", "apple"}

print(thisset)

Get the Length of a Set

To determine how many items a set has, use the len() function.

Example

Get the number of items in a set:

thisset = {"apple", "banana", "cherry"}

print(len(thisset))

Set Items - Data Types

Set items can be of any data type:

Example
String, int and boolean data types:

set1 = {"apple", "banana", "cherry"}


set2 = {1, 5, 7, 9, 3}
set3 = {True, False, False}

A set can contain different data types:

Example

A set with strings, integers and boolean values:

set1 = {"abc", 34, True, 40, "male"}

type()

From Python's perspective, sets are defined as objects with the data type 'set':

<class 'set'>

Example

What is the data type of a set?

myset = {"apple", "banana", "cherry"}


print(type(myset))

The set() Constructor

It is also possible to use the set() constructor to make a set.

Example

Using the set() constructor to make a set:

thisset = set(("apple", "banana", "cherry")) # note the double round-brackets


print(thisset)

Access Items

You cannot access items in a set by referring to an index or a key.

But you can loop through the set items using a for loop, or ask if a specified value is present
in a set, by using the in keyword.

Example
Loop through the set, and print the values:

thisset = {"apple", "banana", "cherry"}

for x in thisset:
    print(x)

Example

Check if "banana" is present in the set:

thisset = {"apple", "banana", "cherry"}

print("banana" in thisset)
Change Items

Once a set is created, you cannot change its items, but you can add new items.

Add Items

To add one item to a set use the add() method.

Example

Add an item to a set, using the add() method:

thisset = {"apple", "banana", "cherry"}

thisset.add("orange")

print(thisset)

Add Sets

To add items from another set into the current set, use the update() method.

Example

Add elements from tropical into thisset:

thisset = {"apple", "banana", "cherry"}


tropical = {"pineapple", "mango", "papaya"}

thisset.update(tropical)

print(thisset)

Add Any Iterable

The object in the update() method does not have to be a set, it can be any iterable object
(tuples, lists, dictionaries etc.).

Example
Add elements of a list to a set:

thisset = {"apple", "banana", "cherry"}


mylist = ["kiwi", "orange"]

thisset.update(mylist)

print(thisset)

Remove Item

To remove an item in a set, use the remove(), or the discard() method.

Example

Remove "banana" by using the remove() method:

thisset = {"apple", "banana", "cherry"}

thisset.remove("banana")

print(thisset)

Note: If the item to remove does not exist, remove() will raise an error.

Example

Remove "banana" by using the discard() method:

thisset = {"apple", "banana", "cherry"}

thisset.discard("banana")

print(thisset)

Note: If the item to remove does not exist, discard() will NOT raise an error.
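The difference between the two methods shows up only when the item is missing. A minimal sketch (the flag remove_raised is illustrative): discard() silently does nothing, while remove() raises a KeyError.

```python
thisset = {"apple", "banana", "cherry"}

thisset.discard("mango")  # no error, the set is unchanged

# remove() raises KeyError when the item is not in the set.
remove_raised = False
try:
    thisset.remove("mango")
except KeyError:
    remove_raised = True

print(remove_raised)
print(len(thisset))
```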

You can also use the pop() method to remove an item, but this method removes an
arbitrary item. Because sets are unordered, you cannot know in advance which item gets
removed.

The return value of the pop() method is the removed item.

Example

Remove an arbitrary item by using the pop() method:

thisset = {"apple", "banana", "cherry"}

x = thisset.pop()

print(x)

print(thisset)
Note: Sets are unordered, so when using the pop() method, you do not know which item
gets removed.

Example

The clear() method empties the set:

thisset = {"apple", "banana", "cherry"}

thisset.clear()

print(thisset)

Example

The del keyword will delete the set completely:

thisset = {"apple", "banana", "cherry"}

del thisset

print(thisset) #this will raise an error because the set no longer exists.

Loop Items

You can loop through the set items by using a for loop:

Example

Loop through the set, and print the values:

thisset = {"apple", "banana", "cherry"}

for x in thisset:
    print(x)

Join Sets
Join Two Sets

There are several ways to join two or more sets in Python.

You can use the union() method that returns a new set containing all items from both sets, or
the update() method that inserts all the items from one set into another:

Example

The union() method returns a new set with all items from both sets:

set1 = {"a", "b" , "c"}


set2 = {1, 2, 3}

set3 = set1.union(set2)
print(set3)
Example

The update() method inserts the items in set2 into set1:


set1 = {"a", "b" , "c"}
set2 = {1, 2, 3}

set1.update(set2)
print(set1)

Note: Both union() and update() will exclude any duplicate items.

Keep ONLY the Duplicates

The intersection_update() method will keep only the items that are present in both sets.

Example

Keep the items that exist in both set x, and set y:

x = {"apple", "banana", "cherry"}


y = {"google", "microsoft", "apple"}

x.intersection_update(y)

print(x)

The intersection() method will return a new set, that only contains the items that are present
in both sets.

Example

Return a set that contains the items that exist in both set x, and set y:

x = {"apple", "banana", "cherry"}


y = {"google", "microsoft", "apple"}

z = x.intersection(y)

print(z)

Keep All, But NOT the Duplicates

The symmetric_difference_update() method will keep only the elements that are NOT present
in both sets.

Example

Keep the items that are not present in both sets:

x = {"apple", "banana", "cherry"}


y = {"google", "microsoft", "apple"}

x.symmetric_difference_update(y)

print(x)

The symmetric_difference() method will return a new set, that contains only the elements that
are NOT present in both sets.
Example

Return a set that contains all items from both sets, except items that are present in both:

x = {"apple", "banana", "cherry"}


y = {"google", "microsoft", "apple"}

z = x.symmetric_difference(y)

print(z)
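The methods that return a new set also have operator forms in Python: | for union, & for intersection, ^ for symmetric difference, and - for difference. The sketch below compares the two spellings. The result names (all_items, common, not_shared, only_in_x) are illustrative:

```python
x = {"apple", "banana", "cherry"}
y = {"google", "microsoft", "apple"}

all_items = x | y    # same result as x.union(y)
common = x & y       # same result as x.intersection(y)
not_shared = x ^ y   # same result as x.symmetric_difference(y)
only_in_x = x - y    # same result as x.difference(y)

# sorted() is used only to get a predictable print order.
print(sorted(all_items))
print(sorted(common))
print(sorted(not_shared))
print(sorted(only_in_x))
```

One difference in behavior: the operators require both operands to be sets, while the methods accept any iterable as their argument.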
Set Methods:

Python has a set of built-in methods that you can use on sets.

Method Description

add() Adds an element to the set

clear() Removes all the elements from the set

copy() Returns a copy of the set

difference() Returns a set containing the difference between two or more sets

difference_update() Removes the items in this set that are also included in
another, specified set

discard() Remove the specified item

intersection() Returns a set, that is the intersection of two other sets

intersection_update() Removes the items in this set that are not present in other,
specified set(s)

isdisjoint() Returns whether two sets have no items in common

issubset() Returns whether another set contains this set or not

issuperset() Returns whether this set contains another set or not

pop() Removes an element from the set

remove() Removes the specified element

symmetric_difference() Returns a set with the symmetric differences of two sets

symmetric_difference_update() Updates this set with the symmetric difference of this set and another

union() Return a set containing the union of sets

update() Update the set with the union of this set and others
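Several methods in the table are not demonstrated elsewhere in this section. The sketch below tries out the comparison methods and difference(). The set names (small, fruits, tech) are made up for illustration:

```python
small = {"apple", "banana"}
fruits = {"apple", "banana", "cherry"}
tech = {"google", "microsoft"}

is_sub = small.issubset(fruits)        # every item of small is in fruits
is_super = fruits.issuperset(small)    # fruits contains every item of small
no_overlap = fruits.isdisjoint(tech)   # the two sets share no items
only_fruits = fruits.difference(small) # items in fruits but not in small

print(is_sub, is_super, no_overlap)
print(only_fruits)
```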
Python Dictionaries:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

Dictionary

Dictionaries are used to store data values in key:value pairs. A dictionary is a collection
which is ordered*, changeable, and does not allow duplicate keys.

As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries
are unordered.

Dictionaries are written with curly brackets, and have keys and values:

Example

Create and print a dictionary:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(thisdict)

Dictionary Items

Dictionary items are ordered, changeable, and does not allow duplicates.

Dictionary items are presented in key:value pairs, and can be referred to by using the key
name.

Example

Print the "brand" value of the dictionary:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(thisdict["brand"])

Ordered or Unordered?

As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries
are unordered.

When we say that dictionaries are ordered, it means that the items have a defined order, and
that order will not change.
Unordered means that the items do not have a defined order, so you cannot refer to an item by
using an index.

Changeable

Dictionaries are changeable, meaning that we can change, add or remove items after the
dictionary has been created.

Duplicates Not Allowed

Dictionaries cannot have two items with the same key:

Example

A duplicate key will overwrite the existing value:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964,
"year": 2020
}
print(thisdict)
Dictionary Length

To determine how many items a dictionary has, use the len() function:

Example

Print the number of items in the dictionary:

print(len(thisdict))

Dictionary Items - Data Types

The values in dictionary items can be of any data type:

Example

String, int, boolean, and list data types:

thisdict = {
"brand": "Ford",
"electric": False,
"year": 1964,
"colors": ["red", "white", "blue"]
}

type()

From Python's perspective, dictionaries are defined as objects with the data type 'dict':

<class 'dict'>

Example
Print the data type of a dictionary:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
print(type(thisdict))

Accessing Items

You can access the items of a dictionary by referring to its key name, inside square brackets:

Example

Get the value of the "model" key:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
x = thisdict["model"]
There is also a method called get() that will give you the same result:

Example

Get the value of the "model" key:

x = thisdict.get("model")
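The two forms differ when the key is missing: square brackets raise a KeyError, while get() returns None, or a default value if one is passed as the second argument. A minimal sketch with a hypothetical car dictionary:

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

model = car.get("model")                 # key exists: returns its value
color = car.get("color")                 # key missing: returns None
color_text = car.get("color", "unknown") # key missing: returns the default

# Square brackets raise KeyError for a missing key.
key_error_raised = False
try:
    car["color"]
except KeyError:
    key_error_raised = True

print(model, color, color_text, key_error_raised)
```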
Get Keys

The keys() method returns a view object that lists all the keys in the dictionary.

Example

Get a list of the keys:

x = thisdict.keys()

The list of the keys is a view of the dictionary, meaning that any changes done to the
dictionary will be reflected in the keys list.

Example

Add a new item to the original dictionary, and see that the keys list gets updated as well:
car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.keys()

print(x) #before the change

car["color"] = "white"

print(x) #after the change


Get Values

The values() method returns a view object that lists all the values in the dictionary.

Example

Get a list of the values:

x = thisdict.values()

The list of the values is a view of the dictionary, meaning that any changes done to the
dictionary will be reflected in the values list.

Example

Make a change in the original dictionary, and see that the values list gets updated as well:

car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.values()

print(x) #before the change

car["year"] = 2020

print(x) #after the change

Example

Add a new item to the original dictionary, and see that the values list gets updated as well:

car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.values()
print(x) #before the change

car["color"] = "red"

print(x) #after the change


Get Items

The items() method returns a view object that contains each item of the dictionary as a (key, value) tuple.

Example

Get a list of the key:value pairs

x = thisdict.items()

The returned list is a view of the items of the dictionary, meaning that any changes done to
the dictionary will be reflected in the items list.

Example

Make a change in the original dictionary, and see that the items list gets updated as well:

car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.items()

print(x) #before the change

car["year"] = 2020

print(x) #after the change


Example

Add a new item to the original dictionary, and see that the items list gets updated as well:

car = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}

x = car.items()

print(x) #before the change

car["color"] = "red"

print(x) #after the change


Check if Key Exists

To determine if a specified key is present in a dictionary use the in keyword:

Example

Check if "model" is present in the dictionary:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
if "model" in thisdict:
    print("Yes, 'model' is one of the keys in the thisdict dictionary")

Change Values

You can change the value of a specific item by referring to its key name:

Example

Change the "year" to 2018:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict["year"] = 2018

Update Dictionary

The update() method will update the dictionary with the items from the given argument.

The argument must be a dictionary, or an iterable object with key:value pairs.

Example

Update the "year" of the car by using the update() method:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict.update({"year": 2020})
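The argument does not have to be a dictionary. As a sketch of the second form mentioned above, update() also accepts an iterable of key-value pairs, here a list of tuples. Existing keys are overwritten and new keys are added. The car dictionary is illustrative:

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

# A list of (key, value) tuples works like a dictionary argument.
car.update([("year", 2020), ("color", "red")])

print(car)
```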
Adding Items

Adding an item to the dictionary is done by using a new index key and assigning a value to it:

Example
thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict["color"] = "red"
print(thisdict)

Removing Items

There are several methods to remove items from a dictionary:

Example

The pop() method removes the item with the specified key name:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict.pop("model")
print(thisdict)

Example

The popitem() method removes the last inserted item (in versions before 3.7, a random item
is removed instead):

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict.popitem()
print(thisdict)

Example

The del keyword removes the item with the specified key name:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
del thisdict["model"]
print(thisdict)

Example

The del keyword can also delete the dictionary completely:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
del thisdict
print(thisdict) #this will cause an error because "thisdict" no longer exists.

Example

The clear() method empties the dictionary:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
thisdict.clear()
print(thisdict)

Loop Through a Dictionary

You can loop through a dictionary by using a for loop.

When looping through a dictionary, the loop variable takes each key in turn, but there
are methods to return the values as well.

Example

Print all key names in the dictionary, one by one:

for x in thisdict:
    print(x)

Example

Print all values in the dictionary, one by one:

for x in thisdict:
    print(thisdict[x])

Example

You can also use the values() method to return values of a dictionary:

for x in thisdict.values():
    print(x)

Example

You can use the keys() method to return the keys of a dictionary:

for x in thisdict.keys():
    print(x)

Example

Loop through both keys and values, by using the items() method:

for x, y in thisdict.items():
    print(x, y)
Copy a Dictionary

You cannot copy a dictionary simply by typing dict2 = dict1, because: dict2 will only be
a reference to dict1, and changes made in dict1 will automatically also be made in dict2.

There are ways to make a copy, one way is to use the built-in Dictionary method copy().

Example

Make a copy of a dictionary with the copy() method:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
mydict = thisdict.copy()
print(mydict)

Another way to make a copy is to use the built-in function dict().

Example

Make a copy of a dictionary with the dict() function:

thisdict = {
"brand": "Ford",
"model": "Mustang",
"year": 1964
}
mydict = dict(thisdict)
print(mydict)
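
To see why plain assignment is not a copy, the short sketch below compares an alias with a copy made by copy(). The names alias and independent are illustrative only and do not come from the text above. Note also that copy() and dict() make shallow copies, so nested objects would still be shared.

```python
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}

alias = thisdict               # a second name for the same dictionary
independent = thisdict.copy()  # a new dictionary holding the same items

thisdict["year"] = 2020        # change the original

print(alias["year"])           # the alias sees the change
print(independent["year"])     # the copy still holds the old value
print(alias is thisdict)       # both names refer to one object
```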

Nested Dictionaries

A dictionary can contain other dictionaries; this is called nesting.

Example

Create a dictionary that contains three dictionaries:

myfamily = {
"child1" : {
"name" : "Emil",
"year" : 2004
},
"child2" : {
"name" : "Tobias",
"year" : 2007
},
"child3" : {
"name" : "Linus",
"year" : 2011
}
}
Or, if you want to add three dictionaries into a new dictionary:

Example

Create three dictionaries, then create one dictionary that will contain the other three
dictionaries:

child1 = {
"name" : "Emil",
"year" : 2004
}
child2 = {
"name" : "Tobias",
"year" : 2007
}
child3 = {
"name" : "Linus",
"year" : 2011
}

myfamily ={
"child1" : child1,
"child2" : child2,
"child3" : child3
}
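
Once a nested dictionary exists, its inner values are read by chaining keys. The following is a minimal sketch of that idea; the list names is a hypothetical helper added for illustration.

```python
myfamily = {
    "child1": {"name": "Emil", "year": 2004},
    "child2": {"name": "Tobias", "year": 2007},
    "child3": {"name": "Linus", "year": 2011},
}

# Chain the keys: outer key first, then inner key
print(myfamily["child2"]["name"])

# Loop over the outer dictionary and read each inner dictionary
for key, child in myfamily.items():
    print(key, child["name"], child["year"])

# Collect one field from every inner dictionary
names = [child["name"] for child in myfamily.values()]
print(names)
```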

Dictionary Methods

Python has a set of built-in methods that you can use on dictionaries.

Method          Description

clear()         Removes all the elements from the dictionary
copy()          Returns a shallow copy of the dictionary
fromkeys()      Returns a new dictionary with the specified keys and value
get()           Returns the value of the specified key (or a default if the key is missing)
items()         Returns a view object containing a (key, value) tuple for each pair
keys()          Returns a view object containing the dictionary's keys
pop()           Removes the element with the specified key and returns its value
popitem()       Removes and returns the last inserted key-value pair
setdefault()    Returns the value of the specified key. If the key does not exist, inserts the key
                with the specified value
update()        Updates the dictionary with the specified key-value pairs
values()        Returns a view object containing all the values in the dictionary
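
A few of the methods listed above are easier to understand side by side. The sketch below is only an illustration; the dictionary car and the variable names are made up for this example.

```python
car = {"brand": "Ford", "model": "Mustang"}

# get() returns a default instead of raising KeyError for a missing key
color = car.get("color", "unknown")

# setdefault() inserts the key only if it is missing, then returns its value
year = car.setdefault("year", 1964)

# update() adds new pairs and overwrites existing ones
car.update({"model": "Bronco", "color": "red"})

# pop() removes a key and returns its value
removed = car.pop("brand")

# fromkeys() builds a new dictionary with the same value for every key
blank = dict.fromkeys(["x", "y"], 0)

print(color, year, removed)
print(car)
print(blank)
```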


UNIT-2 Python Control Structures, Functions and OOP

Conditional Statements in Python:


In real life we often have to make a decision and then choose what to do next based on it. Similar
situations arise in programming, where the result of a decision determines which block of code runs
next. Decision-making statements control the direction of the flow of program execution.
In Python, the if, elif and else statements are used for decision making.
Conditional Statement    Description
if statement             Consists of a Boolean expression, which evaluates to True or False,
                         followed by one or more statements.
if-else statement        Also contains a Boolean expression. The if statement is followed by an
                         optional else statement; if the expression evaluates to False, the else
                         block is executed. This is also called alternative execution: there are
                         two possibilities and exactly one of them is executed.
Nested statements        An if or if-else statement can be placed inside another if or if-else
                         statement, so more than one condition is tested. An elif branch can
                         also contain further if statements.
if statement
The if statement is the simplest decision-making statement. It decides whether a certain
statement or block of statements is executed: if the condition is true, the block runs;
otherwise it is skipped.
Syntax:
if condition:
    # Statements to execute if
    # condition is true

Mr. D.Gangadhar
Associate Professor
Here, the condition evaluates to either true or false. If the value is true, the if statement executes the
block of statements below it; otherwise it skips the block. The condition may also be written inside
parentheses ( ).
Python uses indentation to identify a block, so the block under an if statement is
identified as shown in the example below:
if condition:
    statement1
statement2
# Here, if the condition is true, the if block
# contains only statement1; statement2 is
# outside the block and always runs.
Example: Python if Statement

# Python program to illustrate the if statement
i = 10
if (i > 15):
    print("i is greater than 15")
print("I am Not in if")

Output:
I am Not in if
The condition in the if statement is false, so the block below the if statement is not
executed.
if-else
The if statement alone tells us that if a condition is true a block of statements is executed, and if the
condition is false it is not. But what if we want to do something else when the condition is false? Here
comes the else statement. We can use the else statement with the if statement to execute a block of code
when the condition is false.
Syntax:
if (condition):
    # Executes this block if
    # condition is true
else:
    # Executes this block if
    # condition is false

Example 1: Python if else statement

#!/usr/bin/python
# Python program to illustrate the if-else statement
i = 20
if (i < 15):
    print("i is smaller than 15")
    print("i'm in if Block")
else:
    print("i is greater than 15")
    print("i'm in else Block")
print("i'm not in if and not in else Block")

Output:
i is greater than 15
i'm in else Block
i'm not in if and not in else Block
nested-if
A nested if is an if statement placed inside another if statement. Python allows if statements to be
nested to any depth; the inner if is tested only when the outer condition is true.
Syntax:
if (condition1):
    # Executes when condition1 is true
    if (condition2):
        # Executes when condition2 is true
    # inner if block ends here
# outer if block ends here

Example: Python Nested if

#!/usr/bin/python
# Python program to illustrate nested if statements
i = 10
if (i == 10):
    # First if statement
    if (i < 15):
        print("i is smaller than 15")
    # Nested if statement
    # Will only be executed if the outer
    # condition (i == 10) is true
    if (i < 12):
        print("i is smaller than 12 too")
    else:
        print("i is not smaller than 12")

Output:
i is smaller than 15
i is smaller than 12 too

if-elif-else ladder
Here, a user can decide among multiple options. The if statements are executed from the top down.
As soon as one of the conditions controlling the if is true, the statement associated with that if is
executed, and the rest of the ladder is bypassed. If none of the conditions is true, then the final else
statement will be executed.
Syntax:
if (condition):
    statement
elif (condition):
    statement
.
.
else:
    statement
Example: Python if else elif statements

#!/usr/bin/python
# Python program to illustrate if-elif-else ladder
i = 20
if (i == 10):
    print("i is 10")
elif (i == 15):
    print("i is 15")
elif (i == 20):
    print("i is 20")
else:
    print("i is not present")

Output:
i is 20
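
The ladder is often used to map a value onto one of several ranges. The sketch below shows that only the first true branch runs; the function grade and its thresholds are hypothetical and chosen just for illustration.

```python
def grade(score):
    # Conditions are tested from the top down; the first true branch wins
    if score >= 90:
        return "A"
    elif score >= 75:
        return "B"
    elif score >= 50:
        return "C"
    else:
        return "Fail"

for s in (95, 80, 50, 20):
    print(s, grade(s))
```

Because the tests run in order, a score of 95 never reaches the later branches even though it also satisfies score >= 75 and score >= 50.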
Loops:

In programming, a loop is a sequence of instructions that is executed repeatedly, either a fixed number
of times or until a certain condition is reached.
Python provides three types of looping techniques:

Loop            Description

for loop        Traditionally used when a piece of code has to be repeated 'n' number of times.
while loop      The loop body is repeated as long as the specified Boolean condition remains true.
Nested loops    One loop can be placed inside another: a for loop inside a while loop or vice
                versa, a for loop inside a for loop, or a while loop inside a while loop.

Python For loop:


The Python for loop is used for sequential traversal, i.e. for iterating over an iterable such as a
string, tuple or list. It falls under the category of definite iteration, which means the
number of repetitions is specified explicitly in advance. Python has no C-style for loop, i.e.
for (i=0; i<n; i++). Instead there is a "for in" loop, which is similar to the for-each loop in other
languages. Let us learn how to use the for in loop for sequential traversal.
Note: In Python, the for loop implements only collection-based iteration.

Syntax:
for var in iterable:
    # statements
Here the iterable is a collection of objects like lists, tuples. The indented statements inside the for
loops are executed once for each item in an iterable. The variable var takes the value of the next item
of the iterable each time through the loop.

Example: Python For Loop using List, Dictionary, String

# Python program to illustrate
# iterating over a list
print("List Iteration")
l = ["geeks", "for", "geeks"]
for i in l:
    print(i)
# Iterating over a tuple (immutable)
print("\nTuple Iteration")
t = ("geeks", "for", "geeks")
for i in t:
    print(i)
# Iterating over a String
print("\nString Iteration")
s = "Geeks"
for i in s:
    print(i)
# Iterating over dictionary
print("\nDictionary Iteration")
d = dict()
d['xyz'] = 123
d['abc'] = 345
for i in d:
    print("%s %d" % (i, d[i]))

Output:
List Iteration
geeks
for
geeks
Tuple Iteration
geeks
for
geeks
String Iteration
G
e
e
k
s
Dictionary Iteration
xyz 123
abc 345
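
Since Python has no C-style counting loop, the built-in range() function is the usual way to repeat a body a fixed number of times, and enumerate() supplies an index alongside each item. The following is a small sketch; the variable names evens and pairs are illustrative.

```python
# range(n) yields 0, 1, ..., n-1, so this repeats the body 3 times
for i in range(3):
    print("pass", i)

# range(start, stop, step): the stop value itself is excluded
evens = []
for n in range(2, 11, 2):
    evens.append(n)
print(evens)

# enumerate() pairs each item with its index
words = ["geeks", "for", "geeks"]
pairs = []
for index, word in enumerate(words):
    pairs.append((index, word))
print(pairs)
```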
Python While Loop:
The Python while loop executes a block of statements repeatedly as long as a given condition is
true. When the condition becomes false, the line immediately after the loop in the program
is executed. The while loop falls under the category of indefinite iteration, which means
that the number of times the loop is executed isn't specified explicitly in advance.
Syntax:
while expression:
    statement(s)
All the statements indented by the same number of character spaces after a
programming construct are considered to be part of a single block of code. Python uses indentation as
its method of grouping statements. When a while loop is executed, the expression is first evaluated in a
Boolean context, and if it is true, the loop body is executed. Then the expression is checked again; if it is
still true the body is executed again, and this continues until the expression becomes false.

Example 1: Python While Loop

# Python program to illustrate
# while loop
count = 0
while (count < 3):
    count = count + 1
    print("Hello Geek")

Output:
Hello Geek
Hello Geek
Hello Geek
In the above example, the condition for while will be True as long as the counter variable (count) is
less than 3.

Example 2: Python while loop with list

# checks if list still
# contains any element
a = [1, 2, 3, 4]
while a:
    print(a.pop())

Output
4
3
2
1
In the above example, the while loop runs as long as there is an element
present in the list.
Single statement while block
Just like the if block, if the while block consists of a single statement we can write the entire loop
on a single line. If there are multiple statements in the block that makes up the loop body, they can be
separated by semicolons (;).

# Python program to illustrate
# Single statement while block
count = 0
while (count < 5): count += 1; print("Hello Geek")

Output:
Hello Geek
Hello Geek
Hello Geek
Hello Geek
Hello Geek
Nested Loops

Syntax

for iterating_var in sequence:
    for iterating_var in sequence:
        # execute your code
    # execute your code

Example

Source Code

for g in range(1, 6):
    for k in range(1, 3):
        print("%d * %d = %d" % (g, k, g*k))

OUTPUT

1 * 1 = 1
1 * 2 = 2
2 * 1 = 2
2 * 2 = 4
3 * 1 = 3
3 * 2 = 6
4 * 1 = 4
4 * 2 = 8
5 * 1 = 5
5 * 2 = 10

Loop Control Statements

These statements are used to change execution from its normal sequence.
Python supports three types of loop control statements:

Python Loop Control Statements

Control Statement     Description

break statement       Exits a while loop or a for loop. It terminates the loop and
                      transfers execution to the statement that follows the loop.
continue statement    Skips the rest of the loop body for the current iteration and goes
                      back to re-testing the loop condition.
pass statement        Used when a statement is required syntactically but the programmer
                      does not want to execute any code or command.

Python break statement:


Loops automate repetitive tasks efficiently. But sometimes a condition may
arise where you want to exit the loop completely, skip an iteration, or do nothing for that condition.
This can be done with loop control statements, which change execution from its
normal sequence. Python supports the following control statements.
1. Break statement
2. Continue statement
3. Pass statement
1. Break statement
Break statement in Python is used to bring the control out of the loop when some external condition is
triggered. Break statement is put inside the loop body (generally after if condition).
Syntax:
break
Example:

# Python program to
# demonstrate break statement
s = 'geeksforgeeks'
# Using for loop
for letter in s:
    print(letter)
    # break the loop as soon as it sees 'e'
    # or 's'
    if letter == 'e' or letter == 's':
        break
print("Out of for loop")
print()
i = 0
# Using while loop
while True:
    print(s[i])
    # break the loop as soon as it sees 'e'
    # or 's'
    if s[i] == 'e' or s[i] == 's':
        break
    i += 1
print("Out of while loop")

Output:
g
e
Out of for loop
g
e
Out of while loop
2. Continue statement:
The continue statement is a loop control statement that forces the next iteration of the loop to begin
while skipping the rest of the code inside the loop for the current iteration only. That is, when the
continue statement is executed, the code inside the loop following the continue statement is
skipped for the current iteration and the next iteration of the loop begins.
Syntax:
continue
Example: Continue statement in Python
Consider a program that must print the numbers from 1 to 10 but not 6, using only one loop. Here
the continue statement is useful. We run a loop from 1 to 10 and on every iteration compare the value
of the loop variable with 6. If it is equal to 6, we use the continue statement to move on to the next
iteration without printing anything; otherwise we print the value.
Below is the implementation of the above idea:

# Python program to
# demonstrate continue
# statement
# loop from 1 to 10
for i in range(1, 11):
    # If i is equal to 6,
    # continue to next iteration
    # without printing
    if i == 6:
        continue
    else:
        # otherwise print the value
        # of i
        print(i, end=" ")

Output:
1 2 3 4 5 7 8 9 10
3. Pass Statement:
The pass statement is a null statement. The difference between pass and a comment is that a
comment is ignored by the interpreter, whereas pass is executed as a statement that does nothing.
The pass statement is generally used as a placeholder, i.e. when the user does not yet know what code to
write and simply places pass at that line. Sometimes pass is used when the user doesn't want
any code to execute. Python does not allow an empty block in loops,
function definitions, class definitions or if statements, so placing pass there avoids a
syntax error.
Syntax:
pass
Example 1: Pass statement can be used in empty functions

def geekFunction():
    pass

Example 2: pass statement can also be used in empty class

class geekClass:
    pass

Example 3: pass statement can be used in a for loop when the user doesn't know what to code inside the
loop

n = 10
for i in range(n):
    # pass can be used as placeholder
    # when code is to be added later
    pass

Example 4: pass statement can be used with conditional statements

a = 10
b = 20
if (a < b):
    pass
else:
    print("b<a")

Example 5: let's take another example in which the pass statement gets executed when the condition is
true

li = ['a', 'b', 'c', 'd']
for i in li:
    if (i == 'a'):
        pass
    else:
        print(i)

Output:
b
c
d
Python Exception:
An exception is an unusual condition in a program that interrupts the normal
flow of the program.

Whenever an unhandled exception occurs, the program stops executing, and the code that follows is not
executed. An exception is therefore a run-time error that the Python script has not handled. An
exception is a Python object that represents an error.
Python provides a way to handle exceptions so that the rest of the code can be executed without
interruption. If we do not handle the exception, the interpreter does not execute the code that comes
after the point where the exception occurred.
Python has many built-in exceptions. Handling them lets our program continue running and produce its
output. The common ones are given below:
Common Exceptions
Python provides a number of built-in exceptions; here we describe the common standard
exceptions. A list of common exceptions that can be raised by a standard Python program is given
below.
ZeroDivisionError: Occurs when a number is divided by zero.
NameError: Occurs when a name is not found. It may be local or global.
IndentationError: Occurs when incorrect indentation is given.
IOError: Occurs when an input/output operation fails. In Python 3 it is an alias of OSError.
EOFError: Occurs when input() reaches the end of the file without reading any data.

The problem without handling exceptions:


As we have already discussed, an exception is an abnormal condition that halts the execution of the
program.
Suppose we have two variables a and b, which take input from the user, and we divide
a by b. What if the user enters zero as the denominator? It will interrupt the program
execution and raise a ZeroDivisionError. Let's see the following example.
Example
a = int(input("Enter a:"))
b = int(input("Enter b:"))
c = a/b
print("a/b = %d" %c)
#other code:
print("Hi I am other part of the program")

Output:
Enter a:10
Enter b:0
Traceback (most recent call last):
File "exception-test.py", line 3, in <module>
c = a/b;
ZeroDivisionError: division by zero

The above program is syntactically correct, but it raises an error because of unusual input. That kind
of programming is not suitable or recommended for projects that
require uninterrupted execution. That is why exception handling plays an essential role in dealing with
these unexpected exceptions. We can handle these exceptions in the following way.

Exception handling in python


The try-except statement

If the Python program contains suspicious code that may throw the exception, we must place that code
in the try block. The try block must be followed with the except statement, which contains a block of
code that will be executed if there is some exception in the try block.

Syntax
try:
    #block of code
except Exception1:
    #block of code
except Exception2:
    #block of code
#other code
Example 1
try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    c = a/b
except ZeroDivisionError:
    print("Can't divide with zero")
Output:
Enter a:10
Enter b:0

Can't divide with zero
We can also use the else statement with the try-except statement in which, we can place the code which
will be executed in the scenario if no exception occurs in the try block.
The syntax to use the else statement with the try-except statement is given below.
try:
    #block of code
except Exception1:
    #block of code
else:
    #this code executes if no except block is executed

Example 2
try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    c = a/b
    print("a/b = %d"%c)
# Using Exception with except statement. If we print(Exception) it will return exception class
except Exception:
    print("can't divide by zero")
    print(Exception)
else:
    print("Hi I am else block")
Output:
Enter a:10
Enter b:0
can't divide by zero
<class 'Exception'>
The except statement with no exception
Python provides the flexibility not to specify the name of the exception in the except statement. Such a
bare except clause catches every exception. Consider the following example.
Example
try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    c = a/b
    print("a/b = %d"%c)
except:
    print("can't divide by zero")
else:
    print("Hi I am else block")
The except statement with an exception variable
We can bind the exception to a variable in the except statement by using the as keyword. This
object holds the cause of the exception. Consider the following example:

try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    c = a/b
    print("a/b = %d"%c)
# Using exception object with the except statement
except Exception as e:
    print("can't divide by zero")
    print(e)
else:
    print("Hi I am else block")
Output:
Enter a:10
Enter b:0
can't divide by zero
division by zero
Points to remember
Python does not require us to name the exception in the except statement.
We can declare multiple exceptions in the except statement, since the try block may contain
statements that throw different types of exceptions.
We can also specify an else block along with the try-except statement, which is executed if no
exception is raised in the try block.
Statements that do not throw an exception should be placed inside the else block.
Example
try:
    #this will throw an exception if the file doesn't exist.
    fileptr = open("file.txt","r")
except IOError:
    print("File not found")
else:
    print("The file opened successfully")
    fileptr.close()
Output:
File not found
Declaring Multiple Exceptions
Python allows us to declare multiple exceptions in one except clause. Declaring multiple
exceptions is useful when a try block can throw more than one kind of exception. The syntax is given
below.
Syntax
try:
    #block of code
except (<Exception 1>,<Exception 2>,<Exception 3>,...<Exception n>):
    #block of code
else:
    #block of code
Consider the following example.
try:
    a = 10/0
except (ArithmeticError, IOError):
    print("Arithmetic Exception")
else:
    print("Successfully Done")

Output
Arithmetic Exception
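
When different exceptions need different responses, separate except clauses can be used instead of one tuple. The sketch below takes text arguments in place of input() so that it can be run without typing; the function name safe_divide and its return messages are assumptions made for this illustration.

```python
def safe_divide(a_text, b_text):
    try:
        a = int(a_text)    # may raise ValueError
        b = int(b_text)
        result = a / b     # may raise ZeroDivisionError
    except ValueError:
        return "not a number"
    except ZeroDivisionError:
        return "division by zero"
    else:
        # runs only when the try block raised nothing
        return result

print(safe_divide("10", "2"))
print(safe_divide("10", "0"))
print(safe_divide("ten", "2"))
```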

The try...finally block


Python provides the optional finally statement, which is used with the try statement. It is executed
whether or not an exception occurs and is typically used to release external resources. The finally block
is guaranteed to execute.
We can use the finally block with the try block to place the necessary cleanup code, which must be
executed even if the try block throws an exception.
The syntax to use the finally block is given below.
Syntax
try:
    # block of code
    # this may throw an exception
finally:
    # block of code
    # this will always be executed

Example
try:
    fileptr = open("file2.txt","r")
    try:
        fileptr.write("Hi I am good")
    finally:
        fileptr.close()
        print("file closed")
except:
    print("Error")

Output:

file closed
Error
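
The order in which the try, except, else and finally blocks run can be traced with a small sketch. The function run and the list steps are hypothetical names used only to record which blocks executed.

```python
def run(divisor):
    steps = []
    try:
        steps.append("try")
        10 / divisor               # raises ZeroDivisionError when divisor is 0
    except ZeroDivisionError:
        steps.append("except")
    else:
        steps.append("else")       # only when no exception was raised
    finally:
        steps.append("finally")    # always runs
    return steps

print(run(2))
print(run(0))
```

In both calls the finally block is the last one recorded, whether or not the division failed.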
Raising exceptions:
An exception can be raised deliberately by using the raise statement in Python. It is useful in scenarios
where we need to raise an exception to stop the execution of the program.
For example, suppose a program is allowed 2GB of memory for execution; if the program tries to
occupy more than 2GB of memory, we can raise an exception to stop the execution of the program.
The syntax to use the raise statement is given below.
Syntax
raise Exception_class(<value>)
Points to remember
To raise an exception, the raise statement is used. The exception class name follows it.
An exception can be provided with a value, which is given in the parentheses.
To access the value, the "as" keyword is used. "e" is used as a reference variable which stores the
exception object.
We can pass a value to the exception to describe the cause of the error.
Example
try:
    age = int(input("Enter the age:"))
    if (age < 18):
        raise ValueError
    else:
        print("the age is valid")
except ValueError:
    print("The age is not valid")
Output:
Enter the age:17
The age is not valid
Example 2 Raise the exception with message
try:
    num = int(input("Enter a positive integer: "))
    if (num <= 0):
        # we can pass the message in the raise statement
        raise ValueError("That is a negative number!")
except ValueError as e:
    print(e)

Output:
Enter a positive integer: -5
That is a negative number!

Example 3

try:
    a = int(input("Enter a:"))
    b = int(input("Enter b:"))
    if b == 0:
        raise ArithmeticError
    else:
        print("a/b = ", a/b)
except ArithmeticError:
    print("The value of b can't be 0")

Output:
Enter a:10
Enter b:0
The value of b can't be 0
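
A common pattern is to raise the exception inside a function and catch it in the caller. The following sketch assumes a hypothetical validation function check_age; the limits and messages are illustrative only.

```python
def check_age(age):
    # Raise with a message that describes what went wrong
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 18:
        raise ValueError("age must be at least 18")
    return age

try:
    check_age(17)
except ValueError as e:
    message = str(e)    # the value passed to the exception
    print("Rejected:", message)
```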
Custom Exception:
Python allows us to create our own exceptions that can be raised from the program and caught using the
except clause. However, we suggest you read this section after studying Python objects and classes.
Consider the following example.
Example:
class ErrorInCode(Exception):
    def __init__(self, data):
        self.data = data
    def __str__(self):
        return repr(self.data)
try:
    raise ErrorInCode(2000)
except ErrorInCode as ae:
    print("Received error:", ae.data)
Output:
Received error: 2000
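
A custom exception can also carry a readable message by passing it to the base Exception class. The sketch below is one possible way to do this; the class InsufficientFundsError and the function withdraw are hypothetical names invented for the example.

```python
class InsufficientFundsError(Exception):
    """Raised when a withdrawal exceeds the balance (illustrative)."""
    def __init__(self, balance, amount):
        # The base class stores the message returned by str(error)
        super().__init__(f"cannot withdraw {amount}, balance is {balance}")
        self.balance = balance
        self.amount = amount

def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(balance, amount)
    return balance - amount

try:
    withdraw(100, 250)
except InsufficientFundsError as err:
    message = str(err)
    print(message)
```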
Python Random module:
The functions of the Python random module depend on a pseudo-random number generator function, random(),
which generates a float number in the range 0.0 (inclusive) to 1.0 (exclusive).
The random module provides different functions, some of which are given below:
random.random()
This function generates a random float number in the range 0.0 (inclusive) to 1.0 (exclusive).
random.randint()
This function returns a random integer between the two specified integers, both inclusive.
random.choice()
This function returns a randomly selected element from a non-empty sequence.
Example
# importing "random" module.
import random
# We are using the choice() function to pick a random number from
# the given list of numbers.
print("The random number from list is : ", end="")
print(random.choice([50, 41, 84, 40, 31]))
Output:

The random number from list is : 84


random.shuffle():
This function randomly reorders the elements of the list in place.
random.randrange(beg,end,step)

This function generates a number within the range specified by its arguments. It accepts three
arguments: the beginning number, the end number (which is excluded), and the step, which is the gap
between candidate numbers in the range. Consider the following example.

# We are using randrange() function to generate a number in the range from 100
# to 500. The last parameter 10 is the step size, so only
# every tenth number (100, 110, ..., 490) can be selected.
import random
print("A random number from range is : ", end="")
print(random.randrange(100, 500, 10))

Output:

A random number from range is : 290


random.seed():

This function initializes the random number generator with the seed argument. Seeding with the same
value makes the generator produce the same sequence of numbers each time. It returns None. Consider the following example.

# importing "random" module.
import random
# using random() to generate a random number
# between 0 and 1
print("The random number between 0 and 1 is : ", end="")
print(random.random())

# using seed() to seed the random number generator
random.seed(4)

Output:

The random number between 0 and 1 is : 0.4405576668981033
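
The effect of seed() is easiest to see by seeding twice with the same value and comparing the results. This is a minimal sketch; the seed value 4 is reused from the example above and the list names are illustrative. It also shows that shuffle() changes the list in place.

```python
import random

random.seed(4)
first = [random.randint(1, 6) for _ in range(5)]

random.seed(4)                  # same seed again
second = [random.randint(1, 6) for _ in range(5)]

print(first == second)          # the same sequence is produced again

cards = [1, 2, 3, 4, 5]
random.shuffle(cards)           # reorders the list in place, returns None
print(sorted(cards))            # same elements, whatever the order
```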

Python Math Module:


The Python math module provides the most commonly used mathematical functions, including
trigonometric functions, representation functions, logarithmic functions, etc. It also defines
mathematical constants such as pi and Euler's number.

Pi (π): A well-known mathematical constant, defined as the ratio of the circumference to the
diameter of a circle. Its value is 3.141592653589793.

Euler's number (e): The base of the natural logarithm; its value is
2.718281828459045.

There are different math functions, which are given below:


math.log()
This method returns the natural logarithm (base e) of a given number. An optional second argument specifies a different base.

Example

import math
number = 2e-7  # small value of x
print('log(fabs(x), base) is :', math.log(math.fabs(number), 10))

Output:

log(fabs(x), base) is : -6.698970004336019

math.log10()

This method returns the base-10 logarithm of the given number, also called the common logarithm.
Example

import math
x = 13  # value of x
print('log10(x) is :', math.log10(x))

Output:

log10(x) is : 1.1139433523068367

math.exp()

This method returns a floating-point number after raising e to the given number.

Example

import math
number = 5e-2  # small value of x
print('The given number (x) is :', number)
print('e^x - 1 (using exp() function) is :', math.exp(number)-1)

Output:

The given number (x) is : 0.05
e^x - 1 (using exp() function) is : 0.05127109637602412

math.pow(x,y)

This method returns x raised to the power y as a float. If x is negative and y is
not an integer, it raises a ValueError.

Example

import math
number = math.pow(10,2)
print("The power of number:",number)

Output:

The power of number: 100.0

math.floor(x)

This method returns the floor value of x, i.e. the largest integer less than or equal to x.

Example:

import math
number = math.floor(10.25201)
print("The floor value is:",number)

Output:

The floor value is: 10

math.ceil(x)

This method returns the ceil value of x, i.e. the smallest integer greater than or equal to x.

import math
number = math.ceil(10.25201)
print("The ceil value is:",number)

Output:

The ceil value is: 11

math.fabs(x)

This method returns the absolute value of x.

import math
number = math.fabs(10.001)
print("The absolute value is:",number)

Output:

The absolute value is: 10.001

math.factorial()

This method returns the factorial of the given number x. If x is negative or not an integer, it raises an error (a ValueError for negative values).

Example

import math
number = math.factorial(7)
print("The factorial of number:",number)

Output:

The factorial of number: 5040

math.modf(x)

This method returns the fractional and integer parts of x as a pair of floats. Both parts carry the sign of x.
Example

import math
number = math.modf(44.5)
print("The modf of number:",number)

Output:

The modf of number: (0.5, 44.0)
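
Several of the functions above behave differently on negative numbers, which the positive examples do not show. The following sketch compares them; the variable names frac and whole are illustrative.

```python
import math

print(math.pi, math.e)

# log(x, 10) should agree with log10(x) up to floating-point rounding
print(math.isclose(math.log(1000, 10), math.log10(1000)))

# floor rounds down and ceil rounds up; the difference shows on negatives
print(math.floor(-2.5), math.ceil(-2.5))

# fabs always returns a non-negative float
print(math.fabs(-10.001))

# modf splits a float into (fractional part, integer part)
frac, whole = math.modf(44.5)
print(frac, whole)
```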

Python's math module provides several functions that can perform complex tasks in a single line of code. In
this tutorial, we have discussed a few important math functions.

Python OS Module:

The Python os module provides functions for interacting with the operating
system. It offers many useful functions that are used to perform OS-based tasks and get related
information about the operating system.

The os module comes under Python's standard utility modules. This module offers a portable way of using
operating system dependent functionality.

The Python os module lets us work with files and directories.

To work with the os module, we need to import it:

import os

There are some functions in the OS module which are given below:

os.name

This attribute gives the name of the operating system dependent module that is imported.

Currently registered values include 'posix', 'nt' and 'java'; older Python versions also used 'os2', 'ce' and 'riscos'.

Example

import os
print(os.name)

Output:

nt

os.mkdir()

The os.mkdir() function is used to create new directory. Consider the following example.

import os
os.mkdir("d:\\newdir")

It creates a new directory at the path given in the string argument; here, a folder named
newdir on the D drive.

os.getcwd()

It returns the current working directory (CWD) of the running process.

Example

import os
print(os.getcwd())

Output:

C:\Users\Python\Desktop\ModuleOS

os.chdir()

The os module provides the chdir() function to change the current working directory.

import os
os.chdir("d:\\")

The function returns nothing; os.getcwd() can be called afterwards to confirm the new working
directory.

os.rmdir()

The rmdir() function removes the specified empty directory, given as an absolute or relative path. If the
directory is the current working directory, we first have to change the working directory and then
remove the folder.

Example

import os
# Removing the current working directory can raise a PermissionError,
# so move to the parent directory first.
os.chdir("..")
os.rmdir("newdir")
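
The directory functions above can be tried together without touching a real drive. The sketch below is an assumption-laden illustration: it uses the standard tempfile module to create a scratch folder, which the text above does not mention, and the variable names are made up.

```python
import os
import tempfile

start = os.getcwd()                  # remember where we began
base = tempfile.mkdtemp()            # scratch folder for the demo
target = os.path.join(base, "newdir")

os.mkdir(target)                     # create the directory
created = os.path.isdir(target)

os.chdir(target)                     # move into it
inside = os.path.samefile(os.getcwd(), target)

os.chdir(start)                      # leave before removing
os.rmdir(target)                     # rmdir only removes empty directories
os.rmdir(base)
removed = not os.path.exists(target)

print(created, inside, removed)
```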

os.error

os.error is an alias for the built-in OSError exception, which represents OS-level errors. OSError is
raised in case of invalid or inaccessible file names, paths, etc.

Example

import os

try:
    # If the file does not exist,
    # open() raises an IOError
    filename = 'Python.txt'
    f = open(filename, 'r')
    text = f.read()
    f.close()

# Control jumps directly to here if
# any line raises IOError.
except IOError:
    # print(os.error) will show <class 'OSError'>
    print('Problem reading: ' + filename)

Output:

Problem reading: Python.txt

os.popen()

This function opens a pipe to or from the specified command, and it returns a file object which is
connected to that pipe. The argument is a shell command, not a file name.

Example

import os
fd = "python.txt"
# open() works on a file
file = open(fd, 'w')
file.write("This is awesome")
file.close()
file = open(fd, 'r')
text = file.read()
print(text)
file.close()

# popen() runs a command and gives access to its output through a pipe
pipe = os.popen("echo This is awesome")
print(pipe.read().strip())
pipe.close()

Output:

This is awesome
This is awesome

os.close()

This function closes the file associated with the file descriptor fd. A file descriptor is the integer
returned by os.open(); a file object returned by the built-in open() is closed with its own close()
method instead.

Example

import os
fr = "Python1.txt"
fd = os.open(fr, os.O_RDONLY)
text = os.read(fd, 1024)
print(text)
os.close(fd)

Output (when Python1.txt does not exist):

Traceback (most recent call last):
  File "main.py", line 3, in <module>
    fd = os.open(fr, os.O_RDONLY)
FileNotFoundError: [Errno 2] No such file or directory: 'Python1.txt'

os.rename()

A file or directory can be renamed by using the function os.rename(). The user can rename the file only
if they have the privilege to change it.

Example

import os
fd = "python.txt"
os.rename(fd,'Python1.txt')

Output (when python.txt does not exist):

Traceback (most recent call last):
  File "main.py", line 3, in <module>
    os.rename(fd,'Python1.txt')
FileNotFoundError: [Errno 2] No such file or directory: 'python.txt' -> 'Python1.txt'

os.access()

This function uses the real uid/gid to test whether the invoking user has access to the path.

Example

import os

# Checking existence with os.F_OK
path1 = os.access("Python.txt", os.F_OK)
print("Exist path:", path1)

# Checking access with os.R_OK
path2 = os.access("Python.txt", os.R_OK)
print("Access to read the file:", path2)

# Checking access with os.W_OK
path3 = os.access("Python.txt", os.W_OK)
print("Access to write the file:", path3)

# Checking access with os.X_OK
path4 = os.access("Python.txt", os.X_OK)
print("Check if path can be executed:", path4)

Output:

Exist path: False
Access to read the file: False
Access to write the file: False
Check if path can be executed: False
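
To see os.rename() and os.access() return something other than an error or False, we can create the file first. The following is a minimal sketch, assuming a temporary folder from the tempfile module; the file names simply reuse the ones from the examples above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    old_path = os.path.join(base, "python.txt")
    new_path = os.path.join(base, "Python1.txt")

    # Create the file so that there is something to rename.
    with open(old_path, "w") as f:
        f.write("This is awesome")

    exists_before = os.access(old_path, os.F_OK)      # the file is there
    os.rename(old_path, new_path)                     # first rename succeeds
    old_exists_after = os.access(old_path, os.F_OK)   # old name is gone
    new_readable = os.access(new_path, os.R_OK)       # new name can be read

    # Renaming a second time fails because the old name no longer exists.
    try:
        os.rename(old_path, new_path)
        second_rename_failed = False
    except FileNotFoundError:
        second_rename_failed = True

print(exists_before, old_exists_after, new_readable, second_rename_failed)
```

FileNotFoundError is a subclass of OSError, so the same failure could also be caught with except OSError.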

Python sys module:

The Python sys module provides functions and variables which are used to work with different parts of
the Python runtime environment. It lets us access system-specific parameters and functions.

import sys

First, we have to import the sys module in our program before using any of its members. Most of the
names below are attributes (variables), not functions, so they are used without parentheses.

sys.modules

This is a dictionary that maps the names of the Python modules which have already been imported to
the module objects.

sys.argv

This is the list of command line arguments passed to a Python script. The name of the script
is always the item at index 0, and the rest of the arguments are stored at subsequent indices.

sys.base_exec_prefix

It is set up during Python startup, before site.py is run, to the same value as exec_prefix. If not
running in a virtual environment, the two values remain the same.

sys.base_prefix

It is set up during Python startup, before site.py is run, to the same value as prefix.

sys.byteorder

It is an indication of the native byte order. Its value is 'big' on big-endian platforms and 'little' on
little-endian platforms.

sys.maxsize

This is an integer giving the maximum value that a variable of type Py_ssize_t can take, for example
the largest possible length of a list. Python integers themselves have no upper limit.

sys.path

This is a list of strings that specifies the search path for all the Python modules. It is initialized
from the PYTHONPATH environment variable plus an installation-dependent default.

sys.stdin

It is the file object used by the interpreter for standard input, for example by input(). The original
value of stdin at the start of the program is kept in sys.__stdin__; it is used during finalization and
can be used to restore sys.stdin.

sys.getrefcount()

This function returns the reference count of an object.

sys.exit()

This function is used to exit from either the Python console or command prompt, and also to exit
from the program in case of an exception. It works by raising the SystemExit exception.

sys.executable

The value of this attribute is the absolute path to the Python interpreter. It is useful for knowing where
Python is installed on someone else's machine.

sys.platform

The value of this attribute is used to identify the platform on which we are working.
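
The members described above can be inspected with a few lines of code. The following is a minimal sketch; the variable names are only illustrative, and the printed values depend on the machine on which it runs.

```python
import sys

# Attributes are read without parentheses.
extra_args = sys.argv[1:]                  # arguments after the script name
byte_order = sys.byteorder                 # 'little' or 'big'
in_venv = sys.prefix != sys.base_prefix    # True inside a virtual environment
loaded = "sys" in sys.modules              # sys itself has been imported

# Functions are called with parentheses.
sample = []
ref_count = sys.getrefcount(sample)

# sys.exit() raises SystemExit, so it can be caught like any other exception.
try:
    sys.exit(2)
except SystemExit as exc:
    exit_code = exc.code

print("Extra arguments:", extra_args)
print("Byte order:", byte_order)
print("Running in a virtual environment:", in_venv)
print("Platform:", sys.platform)
print("Interpreter:", sys.executable)
print("Largest container size:", sys.maxsize)
print("Exit code requested:", exit_code)
```

Comparing sys.prefix with sys.base_prefix is a common way to detect a virtual environment, because the two values differ only when one is active.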

PYTHON STATISTICS MODULE:

The Python statistics module provides functions for the mathematical statistics of numeric data. Some
popular statistical functions defined in this module are described below.

mean() function: The mean() function is used to calculate the arithmetic mean of the numbers in the
list.

Example:

import statistics

# list of positive integer numbers
datasets = [5, 2, 7, 4, 2, 6, 8]
x = statistics.mean(datasets)
# Printing the mean
print("Mean is :", x)

Output:
Mean is : 4.857142857142857

median() function: The median() function is used to return the middle value of the numeric data in the
list. When the number of values is even, it returns the average of the two middle values.

Example

import statistics
datasets = [4, -5, 6, 6, 9, 4, 5, -2]

# Printing median of the random data-set
print("Median of data-set is : % s "
      % (statistics.median(datasets)))

Output:

Median of data-set is : 4.5

mode() function: The mode() function returns the most common data that occurs in the list.

Example

import statistics

# declaring a simple data-set consisting of real valued positive integers.
dataset = [2, 4, 7, 7, 2, 2, 3, 6, 6, 8]

# Printing out the mode of given data-set
print("Calculated Mode % s" % (statistics.mode(dataset)))

Output:
Calculated Mode 2

stdev() function: The stdev() function is used to calculate the standard deviation of a given sample
which is available in the form of a list.

Example

import statistics

# creating a simple data-set
sample = [7, 8, 9, 10, 11]
# Prints standard deviation
print("Standard Deviation of sample is %s "
      % (statistics.stdev(sample)))

Output:
Standard Deviation of sample is 1.5811388300841898

median_low(): The median_low() function is used to return the low median of numeric data in the list.

Example

import statistics

# simple list of a set of integers
set1 = [4, 6, 2, 5, 7, 7]
# Note: low median will always be a member of the data-set.
# Print low median of the data-set
print("Low median of data-set is % s "
      % (statistics.median_low(set1)))

Output:
Low median of data-set is 5

median_high(): The median_high() function is used to return the high median of numeric data in the list.

Example:

import statistics

# list of set of the integers
dataset = [2, 1, 7, 6, 1, 9]
print("High median of data-set is %s "
      % (statistics.median_high(dataset)))

Output:
High median of data-set is 6
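
The three median functions are easiest to compare when they are run on the same data. The following is a minimal sketch that reuses the list from the median_low() example; the variable names are only illustrative.

```python
import statistics

# Six values, so there are two middle values after sorting: 5 and 6.
data = [4, 6, 2, 5, 7, 7]

middle = statistics.median(data)        # average of the two middle values
low = statistics.median_low(data)       # smaller of the two middle values
high = statistics.median_high(data)     # larger of the two middle values
most_common = statistics.mode(data)     # value that occurs most often
average = statistics.mean(data)
spread = statistics.stdev(data)         # sample standard deviation

print("median:", middle)
print("median_low:", low)
print("median_high:", high)
print("mode:", most_common)
print("mean:", average)
print("stdev:", spread)
```

For a list with an odd number of values, median(), median_low() and median_high() all return the same middle value.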

Python Date and time:


Python provides the datetime module to work with real dates and times. In real-world applications, we
need to work with the date and time. Python also enables us to schedule a Python script to run at a
particular time. In Python, the date is not a built-in data type, but we can work with date objects by
importing the modules named datetime, time, and calendar.

The datetime module provides six main classes.

o date - It is a naive ideal date. It consists of the year, month, and day as attributes.
o time - It is an idealized time, assuming every day has precisely 24*60*60 seconds. It has hour,
minute, second, microsecond, and tzinfo as attributes.
o datetime - It is a grouping of date and time, along with the attributes year, month, day, hour,
minute, second, microsecond, and tzinfo.
o timedelta - It represents the difference between two date, time or datetime instances to
microsecond resolution.
o tzinfo - It is an abstract base class for time zone information objects.
o timezone - It was added in Python 3.2. It is the class that implements the tzinfo abstract base
class as a fixed offset from UTC.
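
The classes listed above work together. The following is a minimal sketch that subtracts two datetime objects to obtain a timedelta and attaches a timezone; the chosen dates are arbitrary and the variable names are only illustrative.

```python
from datetime import date, datetime, timedelta, timezone

start = datetime(2020, 4, 4, 1, 26, 40)
end = datetime(2020, 4, 5, 3, 0, 0)

# Subtracting two datetime objects gives a timedelta.
gap = end - start
print("Difference:", gap)
print("Days:", gap.days, "Seconds:", gap.seconds)

# Adding a timedelta to a datetime gives another datetime.
one_week_later = start + timedelta(days=7)
print("One week later:", one_week_later)

# The date part of a datetime is a date object.
only_date = start.date()
print("Is a date object:", isinstance(only_date, date))

# A timezone object makes the naive datetime aware of its UTC offset.
aware = start.replace(tzinfo=timezone.utc)
print("UTC offset:", aware.utcoffset())
```

A timedelta stores its value as days, seconds and microseconds, which is why the hours and minutes of the difference appear together in gap.seconds.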

Tick: In Python, the time instants are counted since 12 AM, 1st January 1970 (the epoch). The function
time() of the module time returns the number of seconds elapsed since the epoch as a floating-point
number. These seconds are often called ticks.

Consider the following example

import time
# prints the number of ticks spent since 12 AM, 1st January 1970
print(time.time())

Output:

1585928913.6519969

How to get the current time?

The localtime() function of the time module is used to get the current time tuple. Consider the
following example.

Example

import time

# returns a time tuple
print(time.localtime(time.time()))

Output:

time.struct_time(tm_year=2020, tm_mon=4, tm_mday=3, tm_hour=21, tm_min=21, tm_sec=40, tm_wday=4, tm_yday=94, tm_isdst=0)

Time tuple

The time is treated as a tuple of 9 numbers. Let's look at the members of the time tuple.

Index   Attribute          Values
0       Year               4 digit (for example 2018)
1       Month              1 to 12
2       Day                1 to 31
3       Hour               0 to 23
4       Minute             0 to 59
5       Second             0 to 61 (60 allows for leap seconds)
6       Day of week        0 to 6 (Monday is 0)
7       Day of year        1 to 366
8       Daylight savings   -1, 0, or 1
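
Each member of the time tuple can be read either by its index or by its attribute name. The following is a minimal sketch; it uses time.gmtime(0), which is not covered in the text above, only because it gives the same result on every machine (the epoch in UTC).

```python
import time

# gmtime(0) is the epoch, 1st January 1970, expressed in UTC.
t = time.gmtime(0)

print("Number of members:", len(t))
print("Year by index:", t[0])
print("Year by name:", t.tm_year)
print("Day of week (Monday is 0):", t.tm_wday)
print("Day of year:", t.tm_yday)

# The same tuple can be passed to asctime() to get a formatted string.
print(time.asctime(t))
```

The attribute names (tm_year, tm_mon and so on) are usually easier to read than the numeric indexes.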

Getting formatted time

The time can be formatted by using the asctime() function of the time module. It returns the formatted
time for the time tuple being passed.

Example

import time
# returns the formatted time
print(time.asctime(time.localtime(time.time())))

Output:

Tue Dec 18 15:31:39 2018

Python sleep time

The sleep() function of the time module is used to stop the execution of the script for a given amount of
time. The output will be delayed for the number of seconds provided, which may be a float.

Example

import time
for i in range(0,5):
    print(i)
    # Each element will be printed after 1 second
    time.sleep(1)

Output:

0
1
2
3
4

The datetime Module

The datetime module enables us to create custom date objects and to perform various operations on
dates, such as comparison.
To work with dates as date objects, we have to import the datetime module into the Python source code.
Consider the following example to get the datetime object representation for the current time.

Example

import datetime
# returns the current datetime object
print(datetime.datetime.now())

Output:

2020-04-04 13:18:35.252578

Creating date objects

We can create date objects by passing the desired date to the datetime constructor.

Example

import datetime
# returns the datetime object for the specified date
# (write 4, not 04: integer literals with leading zeros are a syntax error in Python 3)
print(datetime.datetime(2020, 4, 4))

Output:

2020-04-04 00:00:00

We can also specify the time along with the date to create the datetime object. Consider the following
example.

Example:

import datetime

# returns the datetime object for the specified time
print(datetime.datetime(2020,4,4,1,26,40))

Output:

2020-04-04 01:26:40

In the above code, we have passed the year, month, day, hour, minute, and second attributes to the
datetime() constructor in a sequential manner.

Comparison of two dates


We can compare two dates by using the comparison operators like >, >=, <, and <=.
Consider the following example.

Example

from datetime import datetime as dt

# Compares the time. If the time is in between 8AM and 4PM, then it prints working hours,
# otherwise it prints fun hours
now = dt.now()
if dt(now.year, now.month, now.day, 8) < now < dt(now.year, now.month, now.day, 16):
    print("Working hours....")
else:
    print("fun hours")

Output:

fun hours

The calendar module

Python provides a calendar module that contains various functions to work with calendars.
Consider the following example to print the calendar for March 2020.

Example

import calendar
cal = calendar.month(2020, 3)
# printing the calendar of March 2020
print(cal)

Output:

     March 2020
Mo Tu We Th Fr Sa Su
                   1
 2  3  4  5  6  7  8
 9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31

SHUTIL MODULE:

The shutil module offers high-level operations on files, such as copy, create, and remove operations.
It comes under Python's standard utility modules. This module helps in automating the process of
copying and removal of files and directories.

Copying files to another directory:

shutil.copy() method in Python is used to copy the content of the source file to the destination file or
directory. It also preserves the file's permission mode, but other metadata of the file, like the file's
creation and modification times, is not preserved.
The source must represent a file but the destination can be a file or a directory. If the destination is a
directory then the file will be copied into the destination using the base filename from the source.
Also, the destination must be writable. If the destination is a file and already exists then it will be
replaced with the source file, otherwise a new file will be created.

Syntax: shutil.copy(source, destination, *, follow_symlinks = True)

Parameter:
o source: A string representing the path of the source file.
o destination: A string representing the path of the destination file or directory.
o follow_symlinks (optional): The default value of this parameter is True. If it is False and source
represents a symbolic link then destination will be created as a symbolic link.
Return Type: This method returns a string which represents the path of the newly created file.
Example 1:

# Python program to explain shutil.copy() method


# importing shutil module
import shutil
source = "path/main.py"
destination ="path/main2.py"
# Copy the content of
# source to destination
dest = shutil.copy(source, destination)
# Print path of newly
# created file
print("Destination path:", dest)

Output:
Destination path: path/main2.py

Copying the Metadata along with File:


shutil.copy2() method in Python is used to copy the content of the source file to the destination file or
directory. This method is identical to shutil.copy() method but it also tries to preserve the file's
metadata.
Syntax: shutil.copy2(source, destination, *, follow_symlinks = True)
Parameter:
o source: A string representing the path of the source file.
o destination: A string representing the path of the destination file or directory.
o follow_symlinks (optional): The default value of this parameter is True. If it is False and source
represents a symbolic link then it attempts to copy all metadata from the source symbolic link to
the newly-created destination symbolic link. This functionality is platform dependent.
Return Type: This method returns a string which represents the path of the newly created file.

# Python program to explain shutil.copy2() method

# importing os module
import os
# importing shutil module
import shutil
# path
path = 'csv/'
# List files and directories
# in 'csv/'
print("Before copying file:")
print(os.listdir(path))
# Source path
source = "csv/main.py"
# Print the metadata
# of source file
metadata = os.stat(source)
print("Metadata:", metadata, "\n")
# Destination path
destination = "csv/gfg/check.txt"
# Copy the content of
# source to destination
dest = shutil.copy2(source, destination)
# List files and directories
# in 'csv/'
print("After copying file:")
print(os.listdir(path))
# Print the metadata
# of the destination file
# (fields such as st_ino and st_ctime can differ from the source)
metadata = os.stat(destination)
print("Metadata:", metadata)
# Print path of newly
# created file
print("Destination path:", dest)

Output:
Before copying file:
['archive (2)', 'c.jpg', 'c.PNG', 'Capture.PNG', 'cc.jpg', 'check.zip', 'cv.csv', 'd.png', 'Done! Terms And Conditions Generator – The Fastest Free Terms and Conditions Generator!.pdf', 'file1.csv', 'gfg', 'haarcascade_frontalface_alt2.xml', 'log_transformed.jpg', 'main.py', 'nba.csv', 'new_gfg.png', 'r.gif', 'Result -_ Terms and Conditions are Ready!.pdf', 'rockyou.txt', 'sample.txt']
Metadata: os.stat_result(st_mode=33206, st_ino=2251799814202896, st_dev=1689971230, st_nlink=1, st_uid=0, st_gid=0, st_size=1916, st_atime=1612953710, st_mtime=1612613202, st_ctime=1612522940)
After copying file:
['archive (2)', 'c.jpg', 'c.PNG', 'Capture.PNG', 'cc.jpg', 'check.zip', 'cv.csv', 'd.png', 'Done! Terms And Conditions Generator – The Fastest Free Terms and Conditions Generator!.pdf', 'file1.csv', 'gfg', 'haarcascade_frontalface_alt2.xml', 'log_transformed.jpg', 'main.py', 'nba.csv', 'new_gfg.png', 'r.gif', 'Result -_ Terms and Conditions are Ready!.pdf', 'rockyou.txt', 'sample.txt']
Metadata: os.stat_result(st_mode=33206, st_ino=2251799814202896, st_dev=1689971230, st_nlink=1, st_uid=0, st_gid=0, st_size=1916, st_atime=1612953710, st_mtime=1612613202, st_ctime=1612522940)
Destination path: csv/gfg/check.txt

Copying the content of one file to another:


shutil.copyfile() method in Python is used to copy the content of the source file to the destination file.
The metadata of the file is not copied. Source and destination must represent a file and destination
must be writable. If the destination already exists then it will be replaced with the source file,
otherwise a new file will be created.
If source and destination represent the same file then a SameFileError exception will be raised.
Syntax: shutil.copyfile(source, destination, *, follow_symlinks = True)
Parameter:
o source: A string representing the path of the source file.
o destination: A string representing the path of the destination file.
o follow_symlinks (optional): The default value of this parameter is True. If False and source
represents a symbolic link then a new symbolic link will be created instead of copying the file.
Return Type: This method returns a string which represents the path of the newly created file.

# Python program to explain shutil.copyfile() method


# importing shutil module
import shutil
# Source path
source = "csv/main.py"
# Destination path
destination = "csv/gfg/main_2.py"
dest = shutil.copyfile(source, destination)
print("Destination path:", dest)

Output:
Destination path: csv/gfg/main_2.py

Replicating complete Directory:

shutil.copytree() method recursively copies an entire directory tree rooted at source (src) to the
destination directory. The destination directory, named by (dst), must not already exist. It will be
created during copying.
Syntax: shutil.copytree(src, dst, symlinks = False, ignore = None, copy_function = copy2,
ignore_dangling_symlinks = False)
Parameters:
src: A string representing the path of the source directory.
dst: A string representing the path of the destination.
symlinks (optional): This parameter accepts True or False. If it is True, symbolic links in the source
tree are recreated as symbolic links in the new tree and the metadata of the original links is copied
as far as the platform allows. If it is False or omitted, the contents and metadata of the linked files
are copied to the new tree.
ignore (optional): If ignore is given, it must be a callable that will receive as its arguments the
directory being visited by copytree(), and a list of its contents, as returned by os.listdir().
copy_function (optional): The default value of this parameter is copy2. We can use another copy
function like copy() for this parameter.
ignore_dangling_symlinks (optional): When this parameter is set to True, it silences the exception
raised if the file pointed to by the symlink doesn't exist.
Return Value: This method returns a string which represents the path of the newly created directory.

# Python program to explain shutil.copytree() method


# importing os module
import os
# importing shutil module
import shutil
# path
path = 'C:/Users/ksaty/csv/gfg'
print("Before copying file:")
print(os.listdir(path))
# Source path
src = 'C:/Users/ksaty/csv/gfg'
# Destination path
dest = 'C:/Users/ksaty/csv/gfg/dest'
# Copy the content of
# source to destination
destination = shutil.copytree(src, dest)
print("After copying file:")
print(os.listdir(path))
# Print path of newly
# created file
print("Destination path:", destination)

Output:
Before copying file:
['cc.jpg', 'check.txt', 'log_transformed.jpg', 'main.py', 'main2.py', 'main_2.py']
After copying file:
['cc.jpg', 'check.txt', 'dest', 'log_transformed.jpg', 'main.py', 'main2.py', 'main_2.py']
Destination path: C:/Users/ksaty/csv/gfg/dest

Removing a Directory:
shutil.rmtree() is used to delete an entire directory tree. The path must point to a directory (but not a
symbolic link to a directory).

Syntax: shutil.rmtree(path, ignore_errors=False, onerror=None)
Parameters:
path: A path-like object representing a file path. A path-like object is either a string or bytes object
representing a path.
ignore_errors: If ignore_errors is true, errors resulting from failed removals will be ignored.
onerror: If ignore_errors is false or omitted, such errors are handled by calling a handler specified
by onerror.

# Python program to demonstrate


# shutil.rmtree()
import shutil
import os
# location
location = "csv/gfg/"
# directory
dir = "dest"
# path
path = os.path.join(location, dir)
# removing directory
shutil.rmtree(path)
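
The difference between the copy functions described in this section can be checked on a throwaway folder. The following is a minimal sketch, assuming a temporary directory from the tempfile module; the file names and the artificial modification time are only illustrative.

```python
import os
import shutil
import tempfile

OLD_TIME = 1_000_000_000  # an arbitrary timestamp in the past (seconds since the epoch)

with tempfile.TemporaryDirectory() as base:
    src_dir = os.path.join(base, "src")
    os.mkdir(src_dir)
    source = os.path.join(src_dir, "main.py")
    with open(source, "w") as f:
        f.write("print('hello')\n")
    # Give the source an old modification time so that the difference is visible.
    os.utime(source, (OLD_TIME, OLD_TIME))

    plain = shutil.copy(source, os.path.join(base, "copy.py"))       # content + permissions
    with_meta = shutil.copy2(source, os.path.join(base, "copy2.py"))  # also the timestamps
    into_dir = shutil.copy(source, base)                              # destination is a directory

    plain_mtime = os.stat(plain).st_mtime
    meta_mtime = os.stat(with_meta).st_mtime

    # Copy the whole folder to a destination that does not exist yet, then remove it.
    tree = shutil.copytree(src_dir, os.path.join(base, "dest"))
    tree_contents = os.listdir(tree)
    shutil.rmtree(tree)
    tree_removed = not os.path.exists(tree)

    # copyfile() refuses to copy a file onto itself.
    try:
        shutil.copyfile(source, source)
        same_file_error = False
    except shutil.SameFileError:
        same_file_error = True

print("copy() kept the old time:", plain_mtime == OLD_TIME)
print("copy2() kept the old time:", meta_mtime == OLD_TIME)
print("Copied into directory as:", os.path.basename(into_dir))
print("Tree contents:", tree_contents, "removed:", tree_removed)
print("SameFileError raised:", same_file_error)
```

Only copy2() carries the old modification time over to the new file, which is the metadata difference described above.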

Finding files:
shutil.which() method returns the path to the executable application that would be run if the
given cmd was called. This method can be used to find an executable file which is present on the
PATH.
Syntax: shutil.which(cmd, mode = os.F_OK | os.X_OK, path = None)
Parameters:
cmd: A string representing the command (file) to look for.
mode: This parameter specifies the permission mask used to check the file. os.F_OK tests the existence
of the path and os.X_OK checks if the path can be executed, so the default mode determines whether
the file exists and is executable.
path: This parameter specifies the search path to be used. If no path is specified, the PATH
environment variable (read from os.environ) is used.
Return Value: This method returns the path to the executable application, or None if no matching
executable is found.

# importing shutil module


import shutil
# file search
cmd = 'anaconda'
# Using shutil.which() method
locate = shutil.which(cmd)
# Print result
print(locate)

Output:
D:\Installation_bulk\Scripts\anaconda.EXE
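
Because shutil.which() returns None when nothing is found, the result should be checked before it is used. The following is a minimal sketch; the command name no-such-command-xyz is a made-up name that is assumed not to exist, and the second call searches only the folder of the running interpreter.

```python
import os
import shutil
import sys

# A made-up command name: which() returns None instead of raising an error.
missing = shutil.which("no-such-command-xyz")
print("Missing command:", missing)

# Search a specific folder instead of PATH by passing the path parameter.
interpreter_dir = os.path.dirname(sys.executable)
interpreter_name = os.path.basename(sys.executable)
located = shutil.which(interpreter_name, path=interpreter_dir)

if located is None:
    print("Interpreter not found in", interpreter_dir)
else:
    print("Interpreter found at:", located)
```

Checking for None first avoids passing an invalid path to later calls such as os.stat() or subprocess functions.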

Python Glob Module

Python has many built-in modules for performing various tasks. One such task is finding all the files on
our system whose names follow a similar pattern. This pattern can be a file extension, a prefix of the
file name, or any other similarity between two or more file names. Several Python modules can perform
this task, but not all of them are equally convenient. In this tutorial, we are going to learn about one
such module, the glob module, with which we can match files against a specific pattern from inside a
program. We will learn how to use the glob module in a program, what its key features are, and where
it is applied.

Glob Module in Python

With the help of the Python glob module, we can search for all the path names that match a specific
pattern (which is defined by us). The pattern is defined according to the rules used by the Unix shell.
The matching path names are returned in arbitrary order. The module only goes through the files at the
location named in the pattern on our local disk, and it reports only those files whose names follow the
pattern.

Pattern Matching Functions

The glob module relies on a few other functions to list the files that match the pattern we have defined
in a program. With the help of these functions, we get the list of files which match the given pattern in
the specified folder, in arbitrary order.

We will discuss the following such functions in this section:

1. fnmatch.fnmatch()
2. os.scandir()
3. os.path.expandvars()
4. os.path.expanduser()

The first two functions in the above list, i.e., fnmatch.fnmatch() and os.scandir(), are what the glob
module actually uses to perform the pattern matching; it does not invoke a sub-shell. Together, these
two functions produce the list of all matching filenames, in arbitrary order. There is one catch: the glob
module treats files whose names begin with a dot (.) as special cases. Such files are matched only by
patterns that also begin with a dot, which is unlike the fnmatch.fnmatch() function.

The last two functions in the list, i.e., os.path.expandvars() and os.path.expanduser(), can be used for
shell variable and tilde expansion in the filename pattern, because the glob module does not perform
these expansions itself.
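
The special treatment of file names that begin with a dot can be seen by comparing glob with fnmatch.fnmatch() directly. The following is a minimal sketch, assuming two empty files created in a temporary folder; the file names are only illustrative.

```python
import fnmatch
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # One ordinary file and one file whose name begins with a dot.
    for name in ("notes.txt", ".hidden.txt"):
        open(os.path.join(base, name), "w").close()

    # glob.escape() protects any special characters in the folder name itself.
    folder = glob.escape(base)

    star = sorted(os.path.basename(p) for p in glob.glob(os.path.join(folder, "*.txt")))
    dotted = [os.path.basename(p) for p in glob.glob(os.path.join(folder, ".*.txt"))]

# fnmatch.fnmatch() has no special rule for a leading dot.
fn_hit = fnmatch.fnmatch(".hidden.txt", "*.txt")

# Tilde expansion is done separately, before the pattern is given to glob.
home_pattern = os.path.expanduser("~/*.txt")

print("glob '*.txt':", star)
print("glob '.*.txt':", dotted)
print("fnmatch '*.txt' on .hidden.txt:", fn_hit)
print("Expanded pattern:", home_pattern)
```

The pattern "*.txt" skips the dot file when used with glob, while fnmatch.fnmatch() accepts it, which is the difference described above.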
Rules of Pattern

We cannot use just any pattern for filename pattern matching. We have to follow a specific set of rules
while defining the pattern for the filename pattern matching functions in the glob module.

In this section, we briefly discuss the rules which we have to keep in mind while defining a pattern.
We do not go in depth about them, as they are not our primary focus in this tutorial.

Following is the set of rules for the pattern that we define inside the glob module's pattern matching
functions:

o The pattern follows the standard rules of UNIX path expansion.
o The path we define inside the pattern should be either absolute or relative.
o The only special characters in the pattern are the two wildcards '*' and '?', and character ranges
expressed in []. All other characters match themselves.
o The rules of the pattern are applied to one filename segment at a time, and matching stops at the
path separator, i.e., '/'.

These are the general rules for the patterns we define inside the glob module functions, and we have to
follow them in order to perform filename pattern matching successfully.
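
The rules above can be tried on a small set of files. The following is a minimal sketch, assuming a temporary folder with a few empty files; the file names and the helper function names() are only illustrative.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, "sub"))
    for rel in ("a1.py", "a2.py", "ab.py", "b10.py", "notes.txt", "sub/c1.py"):
        open(os.path.join(base, *rel.split("/")), "w").close()

    def names(pattern):
        """Return the sorted matches of pattern, relative to the temporary folder."""
        full = os.path.join(glob.escape(base), pattern)
        return sorted(os.path.relpath(p, base).replace(os.sep, "/")
                      for p in glob.glob(full))

    any_py = names("*.py")         # '*' matches any run of characters in one segment
    one_char = names("a?.py")      # '?' matches exactly one character
    one_digit = names("a[0-9].py") # [] matches one character from the range
    one_level = names("*/*.py")    # the separator must be written explicitly

print("*.py      ->", any_py)
print("a?.py     ->", one_char)
print("a[0-9].py ->", one_digit)
print("*/*.py    ->", one_level)
```

The file sub/c1.py is not matched by "*.py", because a wildcard never crosses the path separator.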

Applications of Glob Module

We have already discussed how pattern matching helps us when we are looking for similar files on our
disk. Here, we will discuss the applications of the glob module.

Following are some applications of the Python glob module:

1. Sometimes, we want to search for files that have a certain prefix in their names, a common string
in the middle of their names, or the same extension. To perform this task by hand, we would have
to write code that scans the whole directory and then filters the result. Instead, we can use the
functions of the glob module to perform this task very easily and save our time.
2. Other than this, the glob module is also very useful when one of our programs has to look for
the list of all the files in a given file system whose names match a similar pattern. The glob
module can easily perform this task without starting a separate sub-shell.

So, by looking at the applications of the glob module, we can see how important this module is for
us and where we can use it to reduce the complexity of the code and save our time.

Glob Module Functions

Now, we will discuss the functions of the glob module and understand how they work inside a Python
program. We will also learn how these functions help us in the pattern matching task. Look at the
following list of functions that we have in the glob module; with the help of these functions, we can
carry out the task of filename pattern matching very smoothly:

1. iglob()
2. glob()
3. escape()

Now, we will briefly discuss these functions and then use each of them in an example program to get
the list of file names following a pattern (that we define in the function) in the output.

1. iglob() Function: The iglob() function of the glob module returns an iterator (a Python generator)
that yields the matching file names in arbitrary order. We can use this generator to list the files under
a given directory. It yields the same values as the glob() function, but without storing all of the
filenames in memory simultaneously.

Syntax: Following is the syntax for using the iglob() function of the glob module inside a Python
program:

iglob(pathname, *, recursive=False)

As we can see in the syntax, the iglob() function takes two parameters, separated by a '*' marker,
which can be described as given below:

(i) pathname: This is the mandatory parameter of the function. It is a string containing the pattern for
which the iglob() function will collect the file names, such as "*.py". It may include a directory part,
either absolute or relative. If no directory is given, the current working directory is searched.

(ii) recursive: It is an optional parameter for the iglob() function, and it takes only bool values (True
or False); the default is False. If it is True, the pattern "**" matches any files and zero or more
directories and subdirectories.

(iii) '*': This is not a parameter that we pass. In the function signature it only marks that the
parameters after it (here, recursive) must be given by keyword, as in recursive=True. It should not be
confused with the '*' wildcard that we write inside the pattern.
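
The effect of the recursive parameter is easiest to see on a small folder tree. The following is a minimal sketch, assuming a temporary folder with files at three levels; the file names and the helper function rel() are only illustrative.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "pkg", "inner"))
    for name in ("top.py", "pkg/mod.py", "pkg/inner/deep.py"):
        open(os.path.join(base, *name.split("/")), "w").close()

    def rel(paths):
        """Return the sorted paths relative to the temporary folder, using '/'."""
        return sorted(os.path.relpath(p, base).replace(os.sep, "/") for p in paths)

    pattern = os.path.join(glob.escape(base), "**", "*.py")

    # Without recursive=True, '**' behaves like a single '*'.
    flat = rel(glob.glob(pattern))

    # With recursive=True, '**' matches zero or more directory levels.
    deep = rel(glob.glob(pattern, recursive=True))

    # iglob() yields the same matches one at a time.
    lazy = glob.iglob(pattern, recursive=True)
    first = next(lazy)
    rest = list(lazy)

print("recursive=False:", flat)
print("recursive=True: ", deep)
print("iglob yielded 1 +", len(rest), "paths")
```

Because recursive must be passed by keyword, a call such as glob.glob(pattern, True) raises a TypeError.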

Now, let's use this iglob() function in an example program so that we can understand its implementation
and function in a better way.

Example 1:

Look at the following Python program with the implementation of iglob() function:

# Import glob module in the program
import glob as gb
# Initialize a variable
inVar = gb.iglob("*.py") # Set Pattern in iglob() function
# Returning class type of variable
print(type(inVar))
# Printing list of names of all files that matched the pattern
print("List of the all the files in the directory having extension .py: ")
for py in inVar:
    print(py)

Output:

<class 'generator'>
List of the all the files in the directory having extension .py:
adding.py
changing.py
code#1.py
code#2.py
code-3.py
code-4.py
code.py
code37.py
code_5.py
code_6.py
configuring.py

glob() Function: With the help of the glob() function, we can also get the list of files that matching a
specific pattern (We have to define that specific pattern inside the function). The list returned by the
glob() function will be a string that should contain a path specification according to the path we have
defined inside the function. The string or iterator for glob() function actually returns the same value as
returned by the iglob() function without actually storing these values (filenames) in it.

Syntax:

Following is the syntax for using the glob() function of the glob module inside a Python program:

glob(pathname, *, recursive = False)

As the syntax shows, the glob() function accepts the same parameters as the iglob() function: the
pathname pattern and the keyword-only recursive flag. Now, let's use this glob() function in an example
program so that we can understand its implementation and function in a better way.

Example 2: Look at the following Python program with the implementation of glob() function:

# Import glob module in the program
import glob as gb
# Initialize a variable
genVar = gb.glob("*.py")  # Set pattern in glob() function
# Printing names of all files that matched the pattern
print("List of the all the files in the directory having extension .py: ")
for py in genVar:
    print(py)

Output:

List of the all the files in the directory having extension .py:
adding.py
changing.py
code#1.py
code#2.py
code-3.py
code-4.py
code.py
code37.py
code_5.py
code_6.py
configuring.py
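
The recursive parameter is not exercised by the examples above. The following is a minimal sketch of how it might be used, assuming a small throwaway directory tree; the file and folder names (a.py, pkg, sub) are hypothetical and are created only for this illustration.

```python
import glob
import os
import tempfile

# Build a small temporary directory tree so the example is reproducible.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "pkg", "sub"))
    for rel in ("a.py", "notes.txt", "pkg/b.py", "pkg/sub/c.py"):
        open(os.path.join(root, *rel.split("/")), "w").close()

    # Non-recursive: only .py files directly inside root.
    top_level = glob.glob(os.path.join(root, "*.py"))

    # Recursive: "**" also descends into subdirectories.
    everywhere = glob.glob(os.path.join(root, "**", "*.py"), recursive=True)

    # iglob() yields the same paths lazily, one at a time.
    lazy = glob.iglob(os.path.join(root, "**", "*.py"), recursive=True)
    lazy_names = sorted(os.path.basename(p) for p in lazy)

top_names = sorted(os.path.basename(p) for p in top_level)
all_names = sorted(os.path.basename(p) for p in everywhere)
print(top_names)
print(all_names)
```

Without recursive=True, the '**' pattern behaves like a single '*' and does not descend into nested folders.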

escape() Function: The escape() function escapes all the special characters of a pathname ('?', '*' and
'['), so that they are matched literally instead of being treated as wildcards. It is very handy for
locating files whose names may contain such characters. Other characters (for example '-', '_' and '#')
are returned unchanged, so they simply match themselves in the file names.

Syntax:

Following is the syntax for using the escape() function of glob module inside a Python program:

escape(pathname)

The escape() should be used with either glob() or iglob() function so that we can print the list of file
names in the output as a result. Now, let's use this escape() function in an example program so that we
can understand its implementation and function in a better way.

Example 3: Look at the following Python program with the implementation of escape() function:

# Import glob module in the program
import glob as gb
# Initialize a variable
charSeq = "-_#"
print("Following is the list of filenames that match the special character sequence of escape function: ")
# Using nested for loop to get the filenames
for splChar in charSeq:
    # Pathname for the glob() function
    escSet = "*" + gb.escape(splChar) + "*" + ".py"
    # Printing list of filenames with glob() function
    for py in (gb.glob(escSet)):
        print(py)

Output:

Following is the list of filenames that match the special character sequence of escape function:
code-3.py
code-4.py
code_5.py
code_6.py
code#1.py
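
The effect of escape() is easiest to see with a character that glob really treats as special. The sketch below assumes two hypothetical file names, data[1].csv and data1.csv, created in a temporary folder purely for illustration.

```python
import glob
import os
import tempfile

# escape() wraps each special character ('*', '?', '[') in brackets.
escaped = glob.escape("report[1]*.txt")
print(escaped)

with tempfile.TemporaryDirectory() as root:
    for name in ("data[1].csv", "data1.csv"):
        open(os.path.join(root, name), "w").close()

    # Unescaped: "[1]" is a character class that matches the single character "1".
    raw = glob.glob(os.path.join(root, "data[1].csv"))
    # Escaped: the brackets are matched literally.
    literal = glob.glob(os.path.join(root, glob.escape("data[1].csv")))

raw_names = [os.path.basename(p) for p in raw]
literal_names = [os.path.basename(p) for p in literal]
print(raw_names)
print(literal_names)
```

The unescaped pattern finds data1.csv, while the escaped pattern finds the file whose name actually contains the brackets.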
Object-oriented Programming:

Class − A user-defined prototype for an object that defines a set of attributes that characterize any object
of the class. The attributes are data members (class variables and instance variables) and methods,
accessed via dot notation.
Class variable − A variable that is shared by all instances of a class. Class variables are defined within a
class but outside any of the class's methods. Class variables are not used as frequently as instance
variables are.
Data member − A class variable or instance variable that holds data associated with a class and its
objects.
Function overloading − The assignment of more than one behavior to a particular function. The
operation performed varies by the types of objects or arguments involved.
Instance variable − A variable that is defined inside a method and belongs only to the current instance
of a class.
Inheritance − The transfer of the characteristics of a class to other classes that are derived from it.
Instance − An individual object of a certain class. An object obj that belongs to a class Circle, for
example, is an instance of the class Circle.
Instantiation − The creation of an instance of a class.
Method − A special kind of function that is defined in a class definition.
Object − A unique instance of a data structure that's defined by its class. An object comprises both data
members (class variables and instance variables) and methods.
Operator overloading − The assignment of more than one function to a particular operator.
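
Several of the terms above can be seen together in one small class. This is only an illustrative sketch; the Circle class (borrowed from the Instance definition above) and its attribute names are assumptions made for the example.

```python
class Circle:
    # Class variable: shared by every instance of Circle.
    count = 0

    def __init__(self, radius):
        # Instance variable: belongs only to this particular object.
        self.radius = radius
        Circle.count += 1

    # Method: a function defined inside the class definition.
    def area(self):
        return 3.14159 * self.radius ** 2

    # Operator overloading: give "+" a meaning for Circle objects.
    def __add__(self, other):
        return Circle(self.radius + other.radius)


# Instantiation: creating instances (objects) of the class.
small = Circle(1)
large = Circle(2)
combined = small + large

print(Circle.count)
print(combined.radius)
```

Here count is accessed through the class, while radius is accessed through each object using dot notation.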
ATTRIBUTE AND METHODS IN PYTHON:
As an object-oriented programming language, Python emphasizes objects. Classes are the blueprints from
which objects are created. Each class in Python can have many attributes, including functions.
Accessing the attributes of a class
To check the attributes of a class and also to manipulate those attributes, we use the Python built-in
functions shown below.
 getattr() − A built-in function used to access an attribute of a class.
 hasattr() − A built-in function used to verify the presence of an attribute in a class.
 setattr() − A built-in function used to set an attribute (new or existing) on a class.
The below program illustrates the use of the above methods to access class attributes in python.
Example
class StateInfo:
    StateName = 'Telangana'
    population = '3.5 crore'

    def func1(self):
        print("Hello from my function")

print(getattr(StateInfo, 'StateName'))
# returns True if the class has the attribute
print(hasattr(StateInfo, 'population'))
setattr(StateInfo, 'ForestCover', 39)
print(getattr(StateInfo, 'ForestCover'))
print(hasattr(StateInfo, 'func1'))
Output
Running the above code gives us the following result −
Telangana
True
39
True
Accessing the method of a class
To access a method of a class, we need to instantiate the class into an object. Then we can call the
method as an instance method, as shown in the program below. Through the self parameter, instance
methods can access attributes and other methods on the same object.
Example
class StateInfo:
    StateName = 'Telangana'
    population = '3.5 crore'

    def func1(self):
        print("Hello from my function")

print(getattr(StateInfo, 'StateName'))
# returns True if the class has the attribute
print(hasattr(StateInfo, 'population'))
setattr(StateInfo, 'ForestCover', 39)
print(getattr(StateInfo, 'ForestCover'))
print(hasattr(StateInfo, 'func1'))
# Instantiate the class and call the method on the object
obj = StateInfo()
obj.func1()
Output
Running the above code gives us the following result −
Telangana
True
39
True
Hello from my function

Accessing the method of one class from another
To access the method of one class from another class, we need to pass an instance of the called class to
the calling class. The below example shows how it is done.
Example
class ClassOne:
    def m_class1(self):
        print("Method in class 1")

# Defining the calling class
class ClassTwo(object):
    def __init__(self, c1):
        self.c1 = c1

    # The calling method
    def m_class2(self):
        self.c1.m_class1()

# Passing a ClassOne object as an argument to ClassTwo
obj = ClassTwo(ClassOne())
obj.m_class2()
Output
Running the above code gives us the following result −
Method in class 1
Python Inheritance:
Inheritance is an important aspect of the object-oriented paradigm. Inheritance provides code reusability
to the program because we can use an existing class to create a new class instead of creating it from
scratch. In inheritance, the child class acquires the properties and can access all the data members and
functions defined in the parent class. A child class can also provide its specific implementation to the
functions of the parent class. In this section of the tutorial, we will discuss inheritance in detail.

In Python, a derived class can inherit a base class by just mentioning the base class in the brackets
after the derived class name. Consider the following syntax to inherit a base class into the derived class.

Syntax

class <derived class>(<base class>):
    <class-suite>

A class can inherit multiple classes by mentioning all of them inside the bracket. Consider the following
syntax.

Syntax

class <derived class>(<base class 1>, <base class 2>, ..... <base class n>):
    <class-suite>

Example:

class Animal:
    def speak(self):
        print("Animal Speaking")

# child class Dog inherits the base class Animal
class Dog(Animal):
    def bark(self):
        print("dog barking")

d = Dog()
d.bark()
d.speak()

Output:

dog barking
Animal Speaking

Python Multi-Level inheritance:

Multi-level inheritance is possible in Python like other object-oriented languages. Multi-level
inheritance is achieved when a derived class inherits another derived class. There is no limit on the
number of levels up to which multi-level inheritance can be achieved in Python.

The syntax of multi- level inheritance is given below.

Syntax

class class1:
    <class-suite>
class class2(class1):
    <class-suite>
class class3(class2):
    <class-suite>
.
.

Example

class Animal:
    def speak(self):
        print("Animal Speaking")

# The child class Dog inherits the base class Animal
class Dog(Animal):
    def bark(self):
        print("dog barking")

# The child class DogChild inherits another child class Dog
class DogChild(Dog):
    def eat(self):
        print("Eating bread...")

d = DogChild()
d.bark()
d.speak()
d.eat()

Output:

dog barking
Animal Speaking
Eating bread...
Python Multiple inheritance

Python provides us the flexibility to inherit multiple base classes in the child class.

The syntax to perform multiple inheritance is given below.

Syntax

class Base1:
    <class-suite>

class Base2:
    <class-suite>
.
.
.
class BaseN:
    <class-suite>

class Derived(Base1, Base2, ...... BaseN):
    <class-suite>

Example

class Calculation1:
    def Summation(self, a, b):
        return a + b

class Calculation2:
    def Multiplication(self, a, b):
        return a * b

class Derived(Calculation1, Calculation2):
    def Divide(self, a, b):
        return a / b

d = Derived()
print(d.Summation(10, 20))
print(d.Multiplication(10, 20))
print(d.Divide(10, 20))

Output:

30
200
0.5

The issubclass(sub,sup) method

The issubclass(sub, sup) function is used to check the relationship between the specified classes. It
returns True if the first class is a subclass of the second class, and False otherwise.

Example

class Calculation1:
    def Summation(self, a, b):
        return a + b

class Calculation2:
    def Multiplication(self, a, b):
        return a * b

class Derived(Calculation1, Calculation2):
    def Divide(self, a, b):
        return a / b

d = Derived()
print(issubclass(Derived, Calculation2))
print(issubclass(Calculation1, Calculation2))

Output:

True
False

The isinstance (obj, class) method

The isinstance() function is used to check the relationship between objects and classes. It returns True
if the first parameter, i.e., obj, is an instance of the second parameter, i.e., class (or of one of its
subclasses), and False otherwise.

Example

class Calculation1:
    def Summation(self, a, b):
        return a + b

class Calculation2:
    def Multiplication(self, a, b):
        return a * b

class Derived(Calculation1, Calculation2):
    def Divide(self, a, b):
        return a / b

d = Derived()
print(isinstance(d, Derived))

Output:

True

Method Overriding

We can provide some specific implementation of the parent class method in our child class. When the
parent class method is defined in the child class with some specific implementation, then the concept is
called method overriding. We may need to perform method overriding in the scenario where the
different definition of a parent class method is needed in the child class.

Example

class Animal:
    def speak(self):
        print("speaking")

class Dog(Animal):
    def speak(self):
        print("Barking")

d = Dog()
d.speak()

Output:

Barking
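
An overriding method does not have to discard the parent implementation completely. The sketch below shows one common option, calling the parent version through the built-in super(); the returned strings are chosen only for illustration.

```python
class Animal:
    def speak(self):
        return "speaking"


class Dog(Animal):
    def speak(self):
        # Reuse the parent's behaviour, then extend it.
        base = super().speak()
        return base + " and barking"


d = Dog()
print(d.speak())
```

This keeps the shared behaviour in one place (the parent class) while the child class adds only what is specific to it.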

Real Life Example of method overriding

class Bank:
    def getroi(self):
        return 10

class SBI(Bank):
    def getroi(self):
        return 7

class ICICI(Bank):
    def getroi(self):
        return 8

b1 = Bank()
b2 = SBI()
b3 = ICICI()
print("Bank Rate of interest:", b1.getroi())
print("SBI Rate of interest:", b2.getroi())
print("ICICI Rate of interest:", b3.getroi())

Output:

Bank Rate of interest: 10
SBI Rate of interest: 7
ICICI Rate of interest: 8

Data abstraction in python

Abstraction is an important aspect of object-oriented programming. In Python, we can also perform data
hiding by adding a double underscore (__) as a prefix to the attribute which is to be hidden. After
this, the attribute will not be directly accessible by that name outside of the class through the object.

Example

class Employee:
    __count = 0

    def __init__(self):
        Employee.__count = Employee.__count + 1

    def display(self):
        print("The number of employees", Employee.__count)

emp = Employee()
emp2 = Employee()
try:
    print(emp.__count)
finally:
    emp.display()

Output:

The number of employees 2


AttributeError: 'Employee' object has no attribute '__count'
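
The AttributeError arises because Python rewrites (mangles) a double-underscore name used inside a class body into _ClassName__name. The following sketch illustrates this; the total() class method is a hypothetical helper added only for the example.

```python
class Employee:
    __count = 0  # stored on the class under the mangled name _Employee__count

    def __init__(self):
        Employee.__count += 1

    @classmethod
    def total(cls):
        # Inside the class body, __count is mangled automatically.
        return cls.__count


Employee()
Employee()

print(Employee.total())
print(hasattr(Employee, "__count"))   # the plain name does not exist
print(Employee._Employee__count)      # the mangled name is still reachable
```

So the double underscore discourages accidental access from outside the class, but it is a naming convention rather than strict privacy.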
Polymorphism:

Polymorphism is taken from the Greek words Poly (many) and morphism (forms). It means that the
same function name can be used for different types. This makes programming more intuitive and easier.

In Python, we have different ways to define polymorphism. So let's move ahead and see how
polymorphism works in Python.

Polymorphism in Python

A child class inherits all the methods from the parent class. However, in some situations, the method
inherited from the parent class doesn't quite fit into the child class. In such cases, you will have to
re-implement the method in the child class.

There are different ways to use polymorphism in Python. You can use functions, class methods or
objects to define polymorphism. So, let's move ahead and have a look at these in detail.

Polymorphism with Function and Objects

You can create a function that can take any object, allowing for polymorphism.

Let's take an example and create a function called "func()" which will take an object which we will
name "obj". Now, let's give the function something to do that uses the 'obj' object we passed to it. In
this case, let's call the methods type() and color(), each of which is defined in both classes, 'Tomato'
and 'Apple'. We then create instances of both the 'Tomato' and 'Apple' classes and pass them to
func():

class Tomato():
    def type(self):
        print("Vegetable")
    def color(self):
        print("Red")

class Apple():
    def type(self):
        print("Fruit")
    def color(self):
        print("Red")

def func(obj):
    obj.type()
    obj.color()

obj_tomato = Tomato()
obj_apple = Apple()
func(obj_tomato)
func(obj_apple)
Output:

Vegetable
Red
Fruit
Red

UNIT-3 NumPy Arrays and Vectorized Computation

NumPy Arrays:
NumPy, which stands for 'Numerical Python', is a Python package: a library consisting of
multidimensional array objects and a collection of routines for processing those arrays. Using
NumPy, mathematical and logical operations on arrays can be performed. This tutorial explains
the basics of NumPy such as its architecture and environment. It also discusses the various
array functions, types of indexing, etc.
Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package, Numarray,
was also developed, having some additional functionalities. In 2005, Travis Oliphant created
the NumPy package by incorporating the features of Numarray into the Numeric package. There
are many contributors to this open source project.
Operations using NumPy:
Using NumPy, a developer can perform the following operations −
 Mathematical and logical operations on arrays.
 Fourier transforms and routines for shape manipulation.
 Operations related to linear algebra. NumPy has in-built functions for linear algebra and
random number generation.
NumPy – A Replacement for MATLAB
NumPy is often used along with packages like SciPy (Scientific Python) and Matplotlib
(plotting library). This combination is widely used as a replacement for MATLAB, a popular
platform for technical computing. However, the Python alternative to MATLAB is now seen as a
more modern and complete programming language.
NumPy – Environment:
Standard Python distribution doesn't come bundled with the NumPy module. A lightweight
alternative is to install NumPy using the popular Python package installer, pip.
pip install numpy
The best way to enable NumPy is to use an installable binary package specific
to your operating system. These binaries contain the full SciPy stack (inclusive of NumPy, SciPy,
matplotlib, IPython, SymPy and nose packages along with core Python).
Windows
Anaconda (from https://www.continuum.io) is a free Python distribution for SciPy stack. It is
also available for Linux and Mac.
Canopy (https://www.enthought.com/products/canopy/) is available as free as well as
commercial distribution with full SciPy stack for Windows, Linux and Mac.
Python (x,y): It is a free Python distribution with SciPy stack and Spyder IDE for Windows OS.
(Downloadable from https://www.python-xy.github.io/)
NumPy Array objects:
The most important object defined in NumPy is an N-dimensional array type called ndarray. It
describes the collection of items of the same type. Items in the collection can be accessed using
a zero-based index.
Every item in an ndarray takes the same size of block in the memory. Each element in an ndarray
is described by a data-type object (called dtype). Any item extracted from an ndarray object (by
indexing) is represented by a Python object of one of the array scalar types. The following diagram
shows a relationship between ndarray, data type object (dtype) and array scalar type −
An instance of the ndarray class can be constructed by the different array creation routines
described later in the tutorial. The basic ndarray is created using the array function in NumPy as
follows −
numpy.array
It creates an ndarray from any object exposing the array interface, or from any method that
returns an array.
numpy.array(object, dtype = None, copy = True, order = 'K', subok = False, ndmin = 0)
The above constructor takes the following parameters −

Sr.No. Parameter & Description

1 object
Any object exposing the array interface, an object whose __array__ method returns an array, or
any (nested) sequence

2 dtype
Desired data type of the array, optional

3 copy
Optional. By default (True), the object is copied

4 order
C (row major), F (column major), A (any) or K (keep; the default)

5 subok
By default, the returned array is forced to be a base class array. If True, sub-classes are passed
through

6 ndmin
Specifies the minimum number of dimensions of the resultant array
Example 1
import numpy as np
a = np.array([1,2,3])
print(a)
The output is as follows −
[1 2 3]
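
The dtype and ndmin parameters from the table above can be tried out in the same way. The following is a small sketch; the variable names are chosen only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=float)   # force a floating-point element type
b = np.array([1, 2, 3], ndmin=2)       # result has at least two dimensions
c = np.array([[1, 2], [3, 4]])         # nested sequence gives a 2-D array

print(a.dtype, a)
print(b.shape)
print(c.ndim)
```

With ndmin=2, the one-dimensional input is wrapped in an extra leading axis, so its shape becomes (1, 3).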
NumPy - Data Types:
NumPy supports a much greater variety of numerical types than Python does. The following
table shows different scalar data types defined in NumPy.

Sr.No. Data Types & Description

1 bool_
Boolean (True or False) stored as a byte

2 int_
Default integer type (same as C long; normally either int64 or int32)

3 intc
Identical to C int (normally int32 or int64)

4 intp
Integer used for indexing (same as C ssize_t; normally either int32 or int64)

5 int8
Byte (-128 to 127)

6 int16
Integer (-32768 to 32767)

7 int32
Integer (-2147483648 to 2147483647)

8 int64
Integer (-9223372036854775808 to 9223372036854775807)

9 uint8
Unsigned integer (0 to 255)

10 uint16
Unsigned integer (0 to 65535)

11 uint32
Unsigned integer (0 to 4294967295)

12 uint64
Unsigned integer (0 to 18446744073709551615)

13 float_
Shorthand for float64

14 float16
Half precision float: sign bit, 5 bits exponent, 10 bits mantissa

15 float32
Single precision float: sign bit, 8 bits exponent, 23 bits mantissa

16 float64
Double precision float: sign bit, 11 bits exponent, 52 bits mantissa

17 complex_
Shorthand for complex128

18 complex64
Complex number, represented by two 32-bit floats (real and imaginary components)

19 complex128
Complex number, represented by two 64-bit floats (real and imaginary components)
NumPy numerical types are instances of dtype (data-type) objects, each having unique
characteristics. The dtypes are available as np.bool_, np.float32, etc.
Data Type Objects (dtype):
A data type object describes interpretation of fixed block of memory corresponding to an array,
depending on the following aspects −
 Type of data (integer, float or Python object)
 Size of data
 Byte order (little-endian or big-endian)
 In case of structured type, the names of fields, data type of each field and part of the
memory block taken by each field.
 If data type is a subarray, its shape and data type
The byte order is decided by prefixing '<' or '>' to the data type. '<' means that the encoding is
little-endian (least significant byte is stored in the smallest address). '>' means that the encoding
is big-endian (most significant byte is stored in the smallest address).
A dtype object is constructed using the following syntax −
numpy.dtype(object, align, copy)
The parameters are −
 Object − To be converted to data type object
 Align − If true, adds padding to the field to make it similar to C-struct
 Copy − Makes a new copy of dtype object. If false, the result is reference to builtin data
type object

Example
# using array-scalar type
import numpy as np
dt = np.dtype(np.int32)
print(dt)
The output is as follows −
int32
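
The byte-order prefixes and structured types described above can be sketched as follows. The field names (name, age, marks) and the sample records are hypothetical and serve only to illustrate the idea.

```python
import numpy as np

# 4-byte integers with explicit byte order
little = np.dtype("<i4")   # little-endian
big = np.dtype(">i4")      # big-endian
print(little.itemsize, big.itemsize)

# Structured type: each field has a name and its own data type
student = np.dtype([("name", "S20"), ("age", "i1"), ("marks", "f4")])
records = np.array([(b"abc", 21, 50.0), (b"xyz", 18, 75.0)], dtype=student)

print(student.names)
print(records["age"])      # access one field across all records
```

Each record here occupies 25 bytes: 20 for the name, 1 for the age and 4 for the marks.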

NumPy – Array:
NumPy arrays are a very good substitute for Python lists. They are better than Python lists
because they provide better speed and take less memory space. For those who are unaware of
what NumPy arrays are, let's begin with the definition. They are a special kind of data
structure: multi-dimensional arrays (matrices) of fixed size whose elements are all of the same
type.

2D-Array:

A typical array function looks something like this:

numpy.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

Here, all parameters other than object are optional. So, do not worry even if you do not
understand a lot about the other parameters.

 Object: Specify the object from which you want to create an array
 Dtype: Specify the desired data type of the array
 Copy: Specify whether the object should be copied or not
 Order: Specify the memory layout of the array (row-major or column-major)
 Subok: Specify whether sub-classes are passed through or a base-class array is returned
 Ndmin: Specify the minimum number of dimensions of the array

Attributes of an Array
An array has the following six main attributes:

 Size: The total number of elements in an array
 Shape: The number of elements along each dimension of an array
 Dimension: The number of dimensions (rank) of an array
 Dtype: Data type of the elements of an array
 Itemsize: Size of each element of an array in bytes
 Nbytes: Total size of an array in bytes

Example – To Illustrate the Attributes of an Array


Code:

import numpy as np
#creating an array to understand its attributes
A = np.array([[1,2,3],[1,2,3],[1,2,3]])
print("Array A is:\n",A)
#type of array
print("Type:", type(A))
#Shape of array
print("Shape:", A.shape)
#no. of dimensions
print("Rank:", A.ndim)
#size of array
print("Size:", A.size)
#type of each element in the array
print("Element type:", A.dtype)
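
The example above does not print Itemsize and Nbytes. A short sketch that covers them is given below, assuming a 2 x 3 array of 32-bit integers so that the sizes are predictable.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

print("Size:", A.size)          # total number of elements
print("Shape:", A.shape)        # elements along each dimension
print("Rank:", A.ndim)          # number of dimensions
print("Itemsize:", A.itemsize)  # bytes per element
print("Nbytes:", A.nbytes)      # total bytes = size * itemsize
```

For any array, nbytes equals size multiplied by itemsize.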

Create an Array in NumPy:


NumPy provides us with several built-in functions to create and work with arrays from scratch.
An array can be created using the following functions:

 ndarray(shape, dtype): Creates an array of the given shape with uninitialized (arbitrary) values
 array(array_object): Creates an array from the given list or tuple
 zeros(shape): Creates an array of the given shape with all zeros
 ones(shape): Creates an array of the given shape with all ones
 full(shape, fill_value, dtype): Creates an array of the given shape filled with fill_value
 arange(range): Creates an array with the specified range
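
The functions in this list that are not covered by the example that follows (zeros, ones, full and arange) can be sketched as below; the shapes and fill values are arbitrary choices made for illustration.

```python
import numpy as np

Z = np.zeros((2, 3))            # 2 x 3 array of 0.0
O = np.ones((2, 3), dtype=int)  # 2 x 3 array of integer 1
F = np.full((2, 2), 7)          # 2 x 2 array filled with 7
R = np.arange(0, 10, 2)         # start 0, stop 10 (excluded), step 2

print(Z)
print(O)
print(F)
print(R)
```

Unlike ndarray(), these functions always return arrays with well-defined contents.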

Example– Creation of a NumPy Array


Code:

import numpy as np
#creating array using ndarray
A = np.ndarray(shape=(2,2), dtype=float)
print("Array with uninitialized (arbitrary) values:\n", A)
# Creating array from list
B = np.array([[1, 2, 3], [4, 5, 6]])
print ("Array created with list:\n", B)
# Creating array from tuple
C = np.array((1 , 2, 3))
print ("Array created with tuple:\n", C)

Output:

Example– Element Accessing in a 2D Array


Code:

import numpy as np
#creating an array to understand indexing
A = np.array([[1,2,1],[7,5,3],[9,4,8]])
print("Array A is:\n",A)
#accessing elements at any given indices
B = A[[0, 1, 2], [0, 1, 2]]
print("Elements at indices (0, 0),(1, 1), (2, 2) are : \n", B)
#changing the value of elements at a given index
A[0,0] = 12
A[1,1] = 4
A[2,2] = 7
print("Array A after change is:\n", A)

Output:

Example– Array Indices in a 3D Array


Code:

import numpy as np
#creating a 3d array to understand indexing in a 3D array
I = np.array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
print("3D Array is:\n", I)
print("Elements at index (0,0,1):\n", I[0,0,1])
print("Elements at index (1,0,1):\n", I[1,0,1])
#changing the value of elements at a given index
I[1,0,2] = 31
print("3D Array after change is:\n", I)

Output:

Array creation using List : Arrays are used to store multiple values in one single variable. Python
does not have built-in support for arrays, but Python lists can be used instead.
Example :
arr = [1, 2, 3, 4, 5]
arr1 = ["geeks", "for", "geeks"]
# Python program to create
# an array

# Creating an array using list
arr = [1, 2, 3, 4, 5]

for i in arr:
    print(i)

Output:
1
2
3
4
5

Array creation using array functions:


The array(typecode, initializer) function of the standard array module creates an array whose
element type is given by the type code and whose initial values come from the value list.
Example :
# Python code to demonstrate the working of
# array()
# importing "array" for array operations
import array
# initializing array with array values
# initializes array with signed integers
arr = array.array('i', [1, 2, 3])
# printing original array
print("The new created array is : ", end="")
for i in range(0, 3):
    print(arr[i], end=" ")
print("\r")

Output:
The new created array is : 1 2 3

Array creation using numpy methods:


NumPy offers several functions to create arrays with initial placeholder content. These minimize
the necessity of growing arrays, an expensive operation. For example: np.zeros, np.empty etc.
numpy.empty(shape, dtype = float, order = 'C') : Return a new array of given shape and type,
without initializing entries (the values are arbitrary).
# Python Programming illustrating
# numpy.empty method
import numpy as geek
b = geek.empty(2, dtype = int)
print("Matrix b : \n", b)
a = geek.empty([2, 2], dtype = int)
print("\nMatrix a : \n", a)
c = geek.empty([3, 3])
print("\nMatrix c : \n", c)
Output (the values are arbitrary and may differ on each run):
Matrix b :
[ 0 1079574528]
Matrix a :
[[0 0]
 [0 0]]
Matrix c :
[[ 0. 0. 0.]
 [ 0. 0. 0.]
 [ 0. 0. 0.]]
numpy.zeros(shape, dtype = float, order = 'C') : Return a new array of given shape and type,
filled with zeros.
# Python Program illustrating
# numpy.zeros method
import numpy as geek
b = geek.zeros(2, dtype = int)
print("Matrix b : \n", b)
a = geek.zeros([2, 2], dtype = int)
print("\nMatrix a : \n", a)
c = geek.zeros([3, 3])
print("\nMatrix c : \n", c)
Output :
Matrix b:
[0 0]
Matrix a:
[[0 0]
[0 0]]
Matrix c:
[[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]]

Reshaping array: We can use the reshape method to reshape an array. Consider an array with shape
(a1, a2, a3, …, aN). We can reshape and convert it into another array with shape (b1, b2, b3, …,
bM).
The only required condition is: a1 x a2 x a3 … x aN = b1 x b2 x b3 … x bM (i.e. the total number
of elements in the array remains unchanged).
numpy.reshape(array, shape, order = 'C') : Gives a new shape to an array without changing its data.
# Python Program illustrating
# numpy.reshape() method
import numpy as geek
array = geek.arange(8)
print("Original array : \n", array)

# shape array with 2 rows and 4 columns
array = geek.arange(8).reshape(2, 4)
print("\narray reshaped with 2 rows and 4 columns : \n", array)
# shape array with 4 rows and 2 columns
array = geek.arange(8).reshape(4, 2)
print("\narray reshaped with 4 rows and 2 columns : \n", array)
# Constructs 3D array
array = geek.arange(8).reshape(2, 2, 2)
print("\nOriginal array reshaped to 3D : \n", array)
Output :
Original array :
[0 1 2 3 4 5 6 7]
array reshaped with 2 rows and 4 columns :
[[0 1 2 3]
 [4 5 6 7]]
array reshaped with 4 rows and 2 columns :
[[0 1]
 [2 3]
 [4 5]
 [6 7]]
Original array reshaped to 3D :
[[[0 1]
  [2 3]]
 [[4 5]
  [6 7]]]
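
The size condition for reshaping can be checked directly. The sketch below assumes a 12-element array; it shows the use of -1 to let NumPy infer one dimension, and what happens when the element counts do not match.

```python
import numpy as np

a = np.arange(12)

# -1 asks NumPy to infer this dimension: 12 / 3 = 4 columns
b = a.reshape(3, -1)
print(b.shape)

# 5 x 3 = 15 elements does not equal 12, so reshape raises ValueError
message = None
try:
    a.reshape(5, 3)
except ValueError as exc:
    message = str(exc)
print(message)
```

Only one dimension may be given as -1, because the remaining dimensions must determine it uniquely.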
To create sequences of numbers, NumPy provides a function analogous to range that returns arrays
instead of lists.
arange returns evenly spaced values within a given interval. step size is specified.
linspace returns evenly spaced values within a given interval. num no. of elements are returned.
arange ([start,] stop[, step,][, dtype]) : Returns an array with evenly spaced elements as per the
interval. The interval mentioned is half opened i.e. [Start, Stop)
# Python Programming illustrating
# numpy.arange method
import numpy as geek
print("A\n", geek.arange(4).reshape(2, 2), "\n")
print("A\n", geek.arange(4, 10), "\n")
print("A\n", geek.arange(4, 20, 3), "\n")
Output :

A
[[0 1]
[2 3]]
A
[4 5 6 7 8 9]
A
[ 4 7 10 13 16 19]

numpy.linspace(start, stop, num = 50, endpoint = True, retstep = False, dtype = None) : Returns
num evenly spaced numbers over the interval. Similar to arange, but instead of a step size it uses
the number of samples.
# Python Programming illustrating
# numpy.linspace method
import numpy as geek
# retstep set to True
print("B\n", geek.linspace(2.0, 3.0, num=5, retstep=True), "\n")
# To evaluate sin() in long range
x = geek.linspace(0, 2, 10)
print("A\n", geek.sin(x))
Output :
B
(array([ 2. , 2.25, 2.5 , 2.75, 3. ]), 0.25)
A
[ 0. 0.22039774 0.42995636 0.6183698 0.77637192 0.8961922
0.9719379 0.99988386 0.9786557 0.90929743]

Flatten array: We can use the flatten method to get a copy of an array collapsed into one dimension.
It accepts an order argument. The default value is 'C' (for row-major order). Use 'F' for
column-major order.
numpy.ndarray.flatten(order = 'C') : Return a copy of the array collapsed into one dimension.
# Python Program illustrating
# numpy.flatten() method
import numpy as geek
array = geek.array([[1, 2], [3, 4]])
# using flatten method (row-major order); the original array is unchanged
print(array.flatten())
# using flatten method (column-major order)
print(array.flatten('F'))

Output :
[1 2 3 4]
[1 3 2 4]

Methods for array creation in Numpy:

FUNCTION DESCRIPTION

empty() Return a new array of given shape and type, without initializing entries

empty_like() Return a new array with the same shape and type as a given array

eye() Return a 2-D array with ones on the diagonal and zeros elsewhere.

identity() Return the identity array

ones() Return a new array of given shape and type, filled with ones

ones_like() Return an array of ones with the same shape and type as a given array

zeros() Return a new array of given shape and type, filled with zeros

zeros_like() Return an array of zeros with the same shape and type as a given array

full_like() Return a full array with the same shape and type as a given array.

array() Create an array

asarray() Convert the input to an array

asanyarray() Convert the input to an ndarray, but pass ndarray subclasses through

ascontiguousarray() Return a contiguous array in memory (C order)

asmatrix() Interpret the input as a matrix

copy() Return an array copy of the given object

frombuffer() Interpret a buffer as a 1-dimensional array

fromfile() Construct an array from data in a text or binary file

fromfunction() Construct an array by executing a function over each coordinate

fromiter() Create a new 1-dimensional array from an iterable object

fromstring() A new 1-D array initialized from text data in a string

loadtxt() Load data from a text file

arange() Return evenly spaced values within a given interval

linspace() Return evenly spaced numbers over a specified interval

logspace() Return numbers spaced evenly on a log scale

geomspace() Return numbers spaced evenly on a log scale (a geometric progression)

meshgrid() Return coordinate matrices from coordinate vectors

mgrid() nd_grid instance which returns a dense multi-dimensional "meshgrid"

ogrid() nd_grid instance which returns an open multi-dimensional "meshgrid"

diag() Extract a diagonal or construct a diagonal array

diagflat() Create a two-dimensional array with the flattened input as a diagonal

tri() An array with ones at and below the given diagonal and zeros elsewhere

tril() Lower triangle of an array

triu() Upper triangle of an array

vander() Generate a Vandermonde matrix

mat() Interpret the input as a matrix

bmat() Build a matrix object from a string, nested sequence, or array
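
A few of the creation functions from the table can be tried out as follows. This is only a small sketch; the shapes and values are chosen arbitrarily for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))            # 2x3 array filled with 0.0
ones = np.ones((2, 3), dtype=int)   # 2x3 array filled with integer 1
ident = np.eye(3)                   # 3x3 array with ones on the diagonal
steps = np.arange(0, 10, 2)         # evenly spaced values: 0, 2, 4, 6, 8
points = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced numbers, end points included
filled = np.full_like(ones, 7)      # same shape and type as `ones`, filled with 7
diagonal = np.diag([1, 2, 3])       # 3x3 array with 1, 2, 3 on the diagonal

for name, arr in [("zeros", zeros), ("ones", ones), ("eye", ident),
                  ("arange", steps), ("linspace", points),
                  ("full_like", filled), ("diag", diagonal)]:
    print(name, arr.shape, arr.dtype)
```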


NumPy - Indexing & Slicing:
The contents of an ndarray object can be accessed and modified by indexing or slicing, just like
Python's built-in container objects. As mentioned earlier, items in an ndarray object follow a
zero-based index. Three types of indexing methods are available: field access, basic
slicing and advanced indexing.
Basic slicing is an extension of Python's basic concept of slicing to n dimensions. A Python
slice object is constructed by giving start, stop, and step parameters to the built-in
slice function. This slice object is passed to the array to extract a part of the array.

Example:
import numpy as np
a = np.arange(10)
s = slice(2,7,2)
print(a[s])
Its output is as follows −
[2 4 6]
In the above example, an ndarray object is prepared by the arange() function. Then a slice object
is defined with start, stop, and step values 2, 7, and 2 respectively. When this slice object is
passed to the ndarray, a part of it starting at index 2 and stopping before index 7, with a step
of 2, is sliced.
The same result can also be obtained by giving the slicing parameters separated by a colon :
(start:stop:step) directly to the ndarray object.
Example:
import numpy as np
a = np.arange(10)
b = a[2:7:2]
print(b)
Here, we will get the same output −
[2 4 6]
If only one parameter is given, a single item corresponding to the index will be returned. If a : is
inserted after it, all items from that index onwards will be extracted. If two parameters
(with : between them) are used, items between the two indexes (not including the stop index)
with the default step of one are sliced.
Example:
# slice single item
import numpy as np
a = np.arange(10)
b = a[5]
print(b)
Its output is as follows −
5
Example:
# slice items starting from index
import numpy as np
a = np.arange(10)
print(a[2:])
Now, the output would be −
[2 3 4 5 6 7 8 9]
Example:
# slice items between indexes
import numpy as np
a = np.arange(10)
print(a[2:5])
Here, the output would be −
[2 3 4]
The above description applies to multi-dimensional ndarray too.
Example:
import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print(a)
# slice items starting from index
print('Now we will slice the array from the index a[1:]')
print(a[1:])
The output is as follows −
[[1 2 3]
[3 4 5]
[4 5 6]]
Now we will slice the array from the index a[1:]
[[3 4 5]
[4 5 6]]
Slicing can also include an ellipsis (…) to make a selection tuple of the same length as the
number of dimensions of an array. If an ellipsis is used at the row position, it will return an
ndarray comprising the items in rows.
Example:
# array to begin with
import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print('Our array is:')
print(a)
print('\n')
# this returns array of items in the second column
print('The items in the second column are:')
print(a[...,1])
print('\n')
# Now we will slice all items from the second row
print('The items in the second row are:')
print(a[1,...])
print('\n')
# Now we will slice all items from column 1 onwards
print('The items column 1 onwards are:')
print(a[...,1:])
The output of this program is as follows −
Our array is:
[[1 2 3]
[3 4 5]
[4 5 6]]
The items in the second column are:
[2 4 5]
The items in the second row are:
[3 4 5]
The items column 1 onwards are:
[[2 3]
[4 5]
[5 6]]
Integer Indexing:
This mechanism helps in selecting any arbitrary item in an array based on its N-dimensional
index. Each integer array represents the indexes into one dimension. When the index
consists of as many integer arrays as the dimensions of the target ndarray, it becomes
straightforward.
In the following example, one element of specified column from each row of ndarray object is
selected. Hence, the row index contains all row numbers, and the column index specifies the
element to be selected.
Example:
import numpy as np
x = np.array([[1, 2], [3, 4], [5, 6]])
y = x[[0,1,2], [0,1,0]]
print(y)
Its output would be as follows −
[1 4 5]
The selection includes elements at (0,0), (1,1) and (2,0) from the first array.
In the following example, elements placed at corners of a 4X3 array are selected. The row
indices of selection are [0, 0] and [3,3] whereas the column indices are [0,2] and [0,2].

Example:
import numpy as np
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
rows = np.array([[0,0],[3,3]])
cols = np.array([[0,2],[0,2]])
y = x[rows,cols]
print('The corner elements of this array are:')
print(y)
The output of this program is as follows −
Our array is:
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
The corner elements of this array are:
[[ 0 2]
[ 9 11]]
The resultant selection is an ndarray object containing corner elements.
Advanced and basic indexing can be combined by using one slice (:) or ellipsis (…) with an
index array. The following example uses a slice for the rows and an advanced index for the
columns. The result is the same as when slices are used for both, but advanced indexing returns
a copy and may have a different memory layout.

Example:
import numpy as np
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
# slicing
z = x[1:4,1:3]
print('After slicing, our array becomes:')
print(z)
print('\n')
# using advanced index for column
y = x[1:4,[1,2]]
print('Slicing using advanced index for column:')
print(y)
The output of this program would be as follows −
Our array is:
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
After slicing, our array becomes:
[[ 4 5]
[ 7 8]
[10 11]]
Slicing using advanced index for column:
[[ 4 5]
[ 7 8]
[10 11]]
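
Although the two results print identically, they differ in how they relate to the original array. The sketch below, using the same 4x3 array, checks this with np.shares_memory(); the variable names follow the example above.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

z = x[1:4, 1:3]       # basic slicing: a view onto x
y = x[1:4, [1, 2]]    # advanced indexing: an independent copy

print(np.array_equal(z, y))      # True: same values
print(np.shares_memory(x, z))    # True: the slice shares x's data
print(np.shares_memory(x, y))    # False: the advanced index copied the data

z[0, 0] = -1          # writing through the view changes x ...
print(x[1, 1])        # prints -1
print(y[0, 0])        # ... but not the copy: prints 4
```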
Boolean Array Indexing
This type of advanced indexing is used when the resultant object is meant to be the result of
Boolean operations, such as comparison operators.
Example:
import numpy as np
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
# Now we will print the items greater than 5
print('The items greater than 5 are:')
print(x[x > 5])
The output of this program would be −
Our array is:
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
The items greater than 5 are:
[ 6 7 8 9 10 11]
Example:
import numpy as np
a = np.array([np.nan, 1,2,np.nan,3,4,5])
print(a[~np.isnan(a)])
Its output would be −
[ 1. 2. 3. 4. 5.]
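
Boolean masks can also be combined and used on the left-hand side of an assignment. The following is a small sketch on the same 4x3 array; note that conditions are combined with & and | (each wrapped in parentheses), not with the Python keywords and/or.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

# Combine two conditions element-wise: greater than 3 AND even
mask = (x > 3) & (x % 2 == 0)
evens = x[mask]
print(evens)

# Assign through a mask: set every item greater than 5 to 0 (on a copy)
y = x.copy()
y[y > 5] = 0
print(y)
```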
Operations on Numpy Arrays
NumPy is a Python package whose name stands for 'Numerical Python'. It is the library for
scientific computing; it contains a powerful n-dimensional array object and provides tools to
integrate C, C++ and so on. It is also useful for linear algebra, random number generation and so on.
NumPy arrays can also be used as an efficient multi-dimensional container for
generic data.
NumPy Array: A NumPy array is a powerful N-dimensional array object which is in the form of
rows and columns. We can initialize NumPy arrays from nested Python lists and access their
elements. A NumPy array on a structural level is made up of a combination of:
• The data pointer indicates the memory address of the first byte in the array.
• The data type or dtype pointer describes the kind of elements that are contained within the
array.
• The shape indicates the shape of the array.
• The strides are the number of bytes that should be skipped in memory to go to the next
element.
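
These structural attributes can be inspected directly on any array. The sketch below assumes a small 2x3 array of 32-bit integers (4 bytes per element) so that the stride values are easy to follow.

```python
import numpy as np

a = np.arange(6, dtype=np.int32).reshape(2, 3)

print(a.dtype)      # int32: the kind of elements stored
print(a.shape)      # (2, 3): two rows, three columns
print(a.strides)    # (12, 4): 12 bytes to the next row, 4 bytes to the next column

# Transposing does not move any data; only shape and strides are swapped.
t = a.T
print(t.shape)      # (3, 2)
print(t.strides)    # (4, 12)
```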
Operations on Numpy Array
Arithmetic Operations:

# Python code to perform arithmetic
# operations on NumPy array
import numpy as np
# Initializing the array
# (np.float64 is used because the np.float_ alias was removed in NumPy 2.0)
arr1 = np.arange(4, dtype = np.float64).reshape(2, 2)
print('First array:')
print(arr1)
print('\nSecond array:')
arr2 = np.array([12, 12])
print(arr2)
print('\nAdding the two arrays:')
print(np.add(arr1, arr2))
print('\nSubtracting the two arrays:')
print(np.subtract(arr1, arr2))
print('\nMultiplying the two arrays:')
print(np.multiply(arr1, arr2))
print('\nDividing the two arrays:')
print(np.divide(arr1, arr2))

Output:
First array:
[[ 0. 1.]
[ 2. 3.]]
Second array:
[12 12]
Adding the two arrays:
[[ 12. 13.]
[ 14. 15.]]
Subtracting the two arrays:
[[-12. -11.]
[-10. -9.]]
Multiplying the two arrays:
[[ 0. 12.]
[ 24. 36.]]
Dividing the two arrays:
[[ 0. 0.08333333]
[ 0.16666667 0.25 ]]
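
In the example above the two arrays have different shapes, (2, 2) and (2,). NumPy handles this through broadcasting: the smaller array is treated as if it were repeated along the missing dimension. A minimal sketch of what happens, using np.broadcast_to() purely for illustration:

```python
import numpy as np

arr1 = np.arange(4, dtype=np.float64).reshape(2, 2)
arr2 = np.array([12, 12])

# arr2 has shape (2,), so it is broadcast across each row of arr1.
stretched = np.broadcast_to(arr2, arr1.shape)
print(stretched)

# The arithmetic operators are shorthand for the corresponding functions.
print(np.array_equal(arr1 + arr2, np.add(arr1, arr2)))
print(np.array_equal(arr1 / arr2, np.divide(arr1, arr2)))
```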
numpy.reciprocal()
This function returns the reciprocal of the argument, element-wise. It is not designed for integer
arrays: for integer elements with absolute values larger than 1, the result is always 0 because of
integer division, and for integer 0 the result is an overflow.
Example:

# Python code to perform reciprocal operation
# on NumPy array
import numpy as np
arr = np.array([25, 1.33, 1, 1, 100])
print('Our array is:')
print(arr)
print('\nAfter applying reciprocal function:')
print(np.reciprocal(arr))
arr2 = np.array([25], dtype = int)
print('\nThe second array is:')
print(arr2)
print('\nAfter applying reciprocal function:')
print(np.reciprocal(arr2))

Output
Our array is:
[ 25. 1.33 1. 1. 100. ]
After applying reciprocal function:
[ 0.04 0.7518797 1. 1. 0.01 ]
The second array is:
[25]
After applying reciprocal function:
[0]
numpy.power()
This function treats elements in the first input array as the base and returns it raised to the power
of the corresponding element in the second input array.

# Python code to perform power operation


# on NumPy array
import numpy as np
arr = np.array([5, 10, 15])
print('First array is:')
print(arr)
print('\nApplying power function:')
print(np.power(arr, 2))
print('\nSecond array is:')
arr1 = np.array([1, 2, 3])
print(arr1)
print('\nApplying power function again:')
print(np.power(arr, arr1))

Output:
First array is:
[ 5 10 15]
Applying power function:
[ 25 100 225]
Second array is:
[1 2 3]
Applying power function again:
[ 5 100 3375]
numpy.mod()
This function returns the remainder of division of the corresponding elements in the input array.
The function numpy.remainder() also produces the same result.

# Python code to perform mod function


# on NumPy array
import numpy as np
arr = np.array([5, 15, 20])
arr1 = np.array([2, 5, 9])
print('First array:')
print(arr)
print('\nSecond array:')
print(arr1)
print('\nApplying mod() function:')
print(np.mod(arr, arr1))
print('\nApplying remainder() function:')
print(np.remainder(arr, arr1))

Output:
First array:
[ 5 15 20]
Second array:
[2 5 9]
Applying mod() function:
[1 0 2]
Applying remainder() function:
[1 0 2]
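
For positive inputs, mod() and fmod() agree, but they differ when negative numbers are involved. The sketch below uses small illustrative arrays: mod() (like Python's % operator) gives a result with the sign of the divisor, while fmod() follows C's fmod and keeps the sign of the dividend.

```python
import numpy as np

a = np.array([-7, 7])
b = np.array([3, -3])

m = np.mod(a, b)       # sign follows the divisor
f = np.fmod(a, b)      # sign follows the dividend
q, r = np.divmod(a, b) # floor quotient and the matching remainder

print("mod   :", m)
print("fmod  :", f)
print("divmod:", q, r)
```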
Array functions In Python:
NumPy is a Python package used for scientific computing, so it supports a wide variety
of functions used for computation. The functions supported by NumPy include mathematical,
universal, window, and logical functions (financial functions now live in the separate
numpy-financial package). Universal functions operate element-wise on arrays and support
broadcasting, type casting, and several other standard features, while window functions are used
in signal processing. We will be learning mathematical functions in detail in this section.
Mathematical Functions in NumPy
The core of NumPy is written in C. Hence, its mathematical functions are closely associated
with functions present in the math.h library in C.

1. Arithmetic Functions

Function Description
add(arr1, arr2, …) Add arrays element-wise
reciprocal(arr) Returns the reciprocal of the elements of the argument array
negative(arr) Returns the numerical negative of the elements of an array
multiply(arr1, arr2, …) Multiply arrays element-wise
divide(arr1, arr2) Divide arrays element-wise
power(arr1, arr2) Returns the first array with each of its elements raised to the power of the corresponding element in the second array (element-wise)
subtract(arr1, arr2, …) Subtract arrays element-wise
true_divide(arr1, arr2) Returns the true division of the arrays element-wise
floor_divide(arr1, arr2) Returns the floor of the quotient after dividing the arrays element-wise
float_power(arr1, arr2) Same as power, but the result is computed in floating point (element-wise)
fmod(arr1, arr2) Returns the remainder after division element-wise, with the sign of the dividend (as in C's fmod)
mod(arr1, arr2) Returns the remainder after division element-wise, with the sign of the divisor
remainder(arr1, arr2) Returns the remainder after division element-wise (same as mod)
divmod(arr1, arr2) Returns the quotient and the remainder after division element-wise



The above-mentioned operations can be performed in the following ways:
In the given code snippet, we try to do some basic operations on the arguments, array a and array
b.
Code:
import numpy as np
a = np.array([10,20,30])
b= np.array([1,2,3])
print("addition of a and b :",np.add(a,b))
print("multiplication of a and b :",np.multiply(a,b))
print("subtraction of a and b :",np.subtract(a,b))
print("a raised to b is:",np.power(a,b))
Output:

Code:
import numpy as np
a = np.array([10,20,30])
b= np.array([2,3,4])
print("division of a and b :",np.divide(a,b))
print("true division of a :",np.true_divide(a,b))
print("floor_division of a and b :",np.floor_divide(a,b))
print("float_power of a raised to b :",np.float_power(a,b))
print("fmod of a and b :",np.fmod(a,b))
print("mod of a and b :",np.mod(a,b))
print("quotient and remainder of a and b :",np.divmod(a,b))
print("remainders when a/b :",np.remainder(a,b))
Output:

2. Trigonometric Functions

Function Description
sin(arr) Returns trigonometric sine element-wise
cos(arr) Returns trigonometric cosine element-wise
tan(arr) Returns trigonometric tangent element-wise
arcsin(arr) Returns trigonometric inverse sine element-wise
arccos(arr) Returns trigonometric inverse cosine element-wise
arctan(arr) Returns trigonometric inverse tangent element-wise
hypot(a, b) Returns the hypotenuse of a right triangle with perpendicular and base as arguments
degrees(arr), rad2deg(arr) Convert input angles from radians to degrees
radians(arr), deg2rad(arr) Convert input angles from degrees to radians
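
The conversion functions and hypot() from the table can be sketched as follows. The angle values are illustrative. Results of trigonometric functions are floating-point numbers, so they are compared with np.allclose() rather than with ==.

```python
import numpy as np

deg = np.array([0.0, 90.0, 180.0])

rad = np.radians(deg)        # degrees -> radians
back = np.degrees(rad)       # radians -> degrees (round trip)
print(rad)
print(back)

# sin(180 degrees) is a tiny number rather than exactly 0, hence allclose()
print(np.allclose(np.sin(rad), [0.0, 1.0, 0.0]))

# Hypotenuse of a right triangle with sides 3 and 4
h = np.hypot(3.0, 4.0)
print(h)
```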
Code:

import numpy as np
angles = np.array([0,np.pi/2, np.pi])  # input array of angles (in radians)
sin_angles = np.sin(angles)
cosine_angles = np.cos(angles)
tan_angles = np.tan(angles)
rad2degree = np.degrees(angles)
print("sin of angles:",sin_angles)
print("cosine of angles:",cosine_angles)
print("tan of angles:",tan_angles)
print("angles in degrees:",rad2degree)
Output:

3. Logarithmic and Exponential Functions

Function Description
exp(arr) Returns the exponential of an input array element-wise
expm1(arr) Returns exp(x)-1 of an input array element-wise
exp2(arr) Returns 2**x for all elements in an array
log(arr) Returns the natural log of an input array element-wise
log10(arr) Returns the log base 10 of an input array element-wise
log2(arr) Returns the log base 2 of an input array element-wise
logaddexp(arr1, arr2) Returns the logarithm of the sum of exponentiations of the inputs, log(exp(arr1) + exp(arr2))
logaddexp2(arr1, arr2) Returns the logarithm of the sum of exponentiations of the inputs in base 2, log2(2**arr1 + 2**arr2)
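
One reason logaddexp() exists is numerical stability: for very negative inputs, exp() underflows to zero and the naive formula breaks down. The sketch below uses two illustrative values to show the difference.

```python
import numpy as np

a, b = -1000.0, -1000.5

# Naive evaluation: exp() underflows to 0.0, so the log becomes -inf.
with np.errstate(divide="ignore"):
    naive = np.log(np.exp(a) + np.exp(b))

# logaddexp computes the same quantity without leaving log space.
stable = np.logaddexp(a, b)

print("naive :", naive)
print("stable:", stable)
```

The same idea applies to expm1(), which stays accurate for inputs close to zero where exp(x) - 1 would lose precision.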


Code:

import numpy as np
a = np.array([1,2,3,4,5])
a_log = np.log(a)
a_exp = np.exp(a)
print("log of input array a is:",a_log)
print("exponent of input array a is:",a_exp)
Output:

4. Rounding Functions

Function Description
around(arr, decimals) Rounds the elements of an input array to the given number of decimal places
round(arr, decimals) Same as around (the older alias round_ was removed in NumPy 2.0)
rint(arr) Rounds the elements of an input array to the nearest integer
fix(arr) Rounds the elements of an input array to the nearest integer towards zero
floor(arr) Returns the floor of an input array element-wise
ceil(arr) Returns the ceiling of an input array element-wise
trunc(arr) Returns the truncated value of an input array element-wise
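
The differences between these functions are easiest to see on values that lie exactly halfway between two integers. The input values below are chosen for illustration. Note that rint() rounds halves to the nearest even integer.

```python
import numpy as np

x = np.array([-1.5, -0.5, 0.5, 1.5, 2.5])

nearest = np.rint(x)     # nearest integer, halves go to the even neighbour
toward0 = np.fix(x)      # towards zero
down = np.floor(x)       # towards negative infinity
up = np.ceil(x)          # towards positive infinity
cut = np.trunc(x)        # drop the fractional part (same values as fix)

for name, values in [("rint", nearest), ("fix", toward0), ("floor", down),
                     ("ceil", up), ("trunc", cut)]:
    print(name, values)
```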


Example of using rounding functions with numpy arrays:

Code:

import numpy as np
a = np.array([1.23,4.165,3.8245])
rounded_a = np.round(a,2)
print(rounded_a)
Output:

5. Miscellaneous Functions

Function Description

sqrt(arr) Returns the square root of an input array element wise

cbrt(arr) Returns cube root of an input array element wise

absolute(arr) Returns the absolute value of each element in an input array

maximum(arr1,arr2,…) Returns element wise maximum of the input arrays

minimum(arr1,arr2,…) Returns element wise minimum of the input arrays

interp(arr, xp, fp) Calculates one-dimensional linear interpolation

convolve(arr, v) Returns linear convolution of two one-dimensional sequences

clip(arr, arr_min, arr_max) Limits the values in an input array
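
A short sketch of three functions from this table, clip(), interp() and cbrt(), with illustrative inputs:

```python
import numpy as np

# clip: limit values to the range [0, 10]
a = np.array([-5, 0, 5, 10, 15])
clipped = np.clip(a, 0, 10)
print(clipped)

# interp: linear interpolation between the points (0, 0) and (10, 100)
xp = [0.0, 10.0]
fp = [0.0, 100.0]
y = np.interp(2.5, xp, fp)
print(y)

# cbrt: cube root element-wise
roots = np.cbrt(np.array([8.0, 27.0]))
print(roots)
```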


Finding the Maxima:
Code:
import numpy as np
a = [1,2,3]
b = [3,1,2]
maximum_elementwise = np.maximum(a,b)
print("maxima are:",maximum_elementwise)
Output:

Data processing using arrays:

With the NumPy package, we can easily solve many kinds of data processing tasks without
writing complex loops. This keeps the code simple and helps the performance of
the program. In this part, we introduce some mathematical and statistical functions.

See the following table for a listing of mathematical and statistical functions:

Function Description Example

sum Calculate the sum of all the elements in an array or along the axis
    >>> a = np.array([[2,4], [3,5]])
    >>> np.sum(a, axis=0)
    array([5, 9])

prod Compute the product of array elements over the given axis
    >>> np.prod(a, axis=1)
    array([8, 15])

diff Calculate the discrete difference along the given axis
    >>> np.diff(a, axis=0)
    array([[1,1]])

gradient Return the gradient of an array
    >>> np.gradient(a)
    [array([[1., 1.], [1., 1.]]), array([[2., 2.], [2., 2.]])]

cross Return the cross product of two arrays ...
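
The axis argument decides the direction in which a function works: axis=0 runs down the rows (one result per column) and axis=1 runs across the columns (one result per row). A minimal sketch using the same array a as in the table:

```python
import numpy as np

a = np.array([[2, 4], [3, 5]])

total = np.sum(a)                # no axis: sum of all elements
col_sums = np.sum(a, axis=0)     # one sum per column
row_sums = np.sum(a, axis=1)     # one sum per row
row_prods = np.prod(a, axis=1)   # one product per row
row_diffs = np.diff(a, axis=1)   # difference between neighbouring columns

print(total, col_sums, row_sums)
print(row_prods)
print(row_diffs)
```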

Save and Load NumPy Array in Python:


1. The numpy.savetxt() and the numpy.loadtxt():
The numpy.savetxt() function saves a NumPy array to a text file
and the numpy.loadtxt() function loads a NumPy array from a text file in Python.
The numpy.savetxt() function takes the name of the text file, the array to be saved, and the
desired format as input parameters and saves the array inside the text file.
The numpy.loadtxt() function takes the name of the text file and the data type of the array
and returns the saved array. The following code example shows us how we can save and load
a NumPy array with the numpy.savetxt() and numpy.loadtxt() functions in Python.
Code:
import numpy as np
a = np.array([1, 3, 5, 7])
np.savetxt('test1.txt', a, fmt='%d')
a2 = np.loadtxt('test1.txt', dtype=int)
print(a == a2)
Output:

[ True True True True]

2. The ndarray.tofile() and numpy.fromfile() :

The tofile() method of an array saves the array in a binary file and
the numpy.fromfile() function loads a NumPy array from a binary file.
The tofile() method takes the name of the file as an input argument and saves the
calling array inside the file in a binary format. The numpy.fromfile() function takes the
name of the file and the data type of the array as input parameters and returns the array.
The following code example shows us how to save and load a NumPy array with
the tofile() method and the numpy.fromfile() function in Python.
Code:

import numpy as np
a = np.array([1, 3, 5, 7])
a.tofile('test2.dat')
a2 = np.fromfile('test2.dat', dtype=int)
print(a == a2)
Output:

[ True True True True]


3. The numpy.save() function and the numpy.load() function:

This approach is a platform-independent way of saving and loading a NumPy array in
Python. The numpy.save() function saves a NumPy array to a file, and
the numpy.load() function loads a NumPy array from a file. Files in this method use
the .npy extension; numpy.save() appends it if the given file name does not already end with it.
The numpy.save() function takes the name of
the file and the array to be saved as input parameters and saves the array inside the specified
file. The numpy.load() function takes the name of the file as an input parameter and returns
the array. The following code example shows us how we can save and load a NumPy array
with the numpy.save() and numpy.load() functions in Python.
Code:
import numpy as np
a = np.array([1, 3, 5, 7])
np.save('test3.npy', a)
a2 = np.load('test3.npy')
print(a == a2)
Output:

[ True True True True]
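
One practical difference between the approaches is what information the file keeps. A raw binary file written with tofile() stores only the bytes, so the shape and data type must be supplied again when loading, whereas a .npy file stores both. The sketch below illustrates this with a 2-D array; the file names are hypothetical and a temporary directory is used so that no files are left behind.

```python
import os
import tempfile

import numpy as np

a = np.arange(6, dtype=np.int32).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "raw.dat")   # hypothetical file names
    npy_path = os.path.join(tmp, "arr.npy")

    # Raw binary: shape is lost, dtype must be given when reading
    a.tofile(raw_path)
    raw = np.fromfile(raw_path, dtype=np.int32)

    # .npy format: shape and dtype are restored automatically
    np.save(npy_path, a)
    loaded = np.load(npy_path)

print(raw.shape)       # (6,): flat, needs reshape(2, 3)
print(loaded.shape)    # (2, 3)
print(np.array_equal(raw.reshape(2, 3), loaded))
```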


Numpy | Linear Algebra

The Linear Algebra module of NumPy offers various methods to apply linear algebra on any
numpy array.
One can find:
• rank, determinant, trace, etc. of an array
• eigenvalues of matrices
• matrix and vector products (dot, inner, outer, etc.), matrix exponentiation
• solutions of linear or tensor equations, and much more!
# Importing numpy as np
import numpy as np
A = np.array([[6, 1, 1],
[4, -2, 5],
[2, 8, 7]])
# Rank of a matrix
print("Rank of A:", np.linalg.matrix_rank(A))
# Trace of matrix A
print("\nTrace of A:", np.trace(A))
# Determinant of a matrix
print("\nDeterminant of A:", np.linalg.det(A))
# Inverse of matrix A
print("\nInverse of A:\n", np.linalg.inv(A))
print("\nMatrix A raised to power 3:\n",
np.linalg.matrix_power(A, 3))
Output:
Rank of A: 3
Trace of A: 11
Determinant of A: -306.0
Inverse of A:
[[ 0.17647059 -0.00326797 -0.02287582]
[ 0.05882353 -0.13071895 0.08496732]
[-0.11764706 0.1503268 0.05228758]]
Matrix A raised to power 3:
[[336 162 228]
[406 162 469]
[698 702 905]]
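
The results above can be cross-checked against their definitions. This is a small sanity-check sketch on the same matrix A; floating-point results are compared with np.allclose() and np.isclose() because they are only accurate up to round-off error.

```python
import numpy as np

A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])

A_inv = np.linalg.inv(A)

# A multiplied by its inverse should give the identity matrix
print(np.allclose(A @ A_inv, np.eye(3)))

# The determinant is -306 up to round-off error
print(np.isclose(np.linalg.det(A), -306.0))

# matrix_power(A, 3) is the same as A @ A @ A
print(np.array_equal(np.linalg.matrix_power(A, 3), A @ A @ A))
```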
Matrix eigenvalues Functions
numpy.linalg.eigh(a, UPLO='L') : This function is used to return the eigenvalues and
eigenvectors of a complex Hermitian (conjugate symmetric) or a real symmetric matrix. It returns
two objects, a 1-D array containing the eigenvalues of a, and a 2-D square array or matrix
(depending on the input type) of the corresponding eigenvectors (in columns).
# Python program explaining
# eigh() function
import numpy as np
from numpy import linalg as geek
# Creating an array using array
# function
a = np.array([[1, -2j], [2j, 5]])
print("Array is :",a)
# calculating the eigenvalues and eigenvectors
# using eigh() function
c, d = geek.eigh(a)
print("Eigen values are :", c)
print("Eigen vectors are :", d)
Output :
Array is : [[ 1.+0.j, 0.-2.j],
[ 0.+2.j, 5.+0.j]]
Eigen values are : [ 0.17157288, 5.82842712]
Eigen vectors are : [[-0.92387953+0.j , -0.38268343+0.j ],
[ 0.00000000+0.38268343j, 0.00000000-0.92387953j]]


numpy.linalg.eig(a) : This function is used to compute the eigenvalues and right eigenvectors of a
square array.
# Python program explaining
# eig() function
import numpy as np
from numpy import linalg as geek
# Creating an array using diag
# function
a = np.diag((1, 2, 3))
print("Array is :",a)
# calculating the eigenvalues and eigenvectors
# using eig() function
c, d = geek.eig(a)
print("Eigen values are :",c)
print("Eigen vectors are :",d)
Output :
Array is : [[1 0 0],
[0 2 0],
[0 0 3]]
Eigen values are : [ 1. 2. 3.]
Eigen vectors are : [[ 1. 0. 0.],
[ 0. 1. 0.],
[ 0. 0. 1.]]
FUNCTION DESCRIPTION

linalg.eigvals() Compute the eigenvalues of a general matrix.

linalg.eigvalsh(a[, UPLO]) Compute the eigenvalues of a complex Hermitian or real symmetric matrix.
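
An eigenvalue w and its eigenvector v satisfy S v = w v. The sketch below checks this property for a small real symmetric matrix (chosen for illustration) and shows that eigvalsh() returns the same eigenvalues as eigh() without the eigenvectors.

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh() returns the eigenvalues in ascending order, eigenvectors in columns
w, v = np.linalg.eigh(S)
print(w)

# Check S v = v * w for all columns at once
# (v * w scales column j of v by eigenvalue w[j])
print(np.allclose(S @ v, v * w))

# eigvalsh() computes only the eigenvalues
print(np.allclose(np.linalg.eigvalsh(S), w))
```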
Matrix and vector products:
numpy.dot(vector_a, vector_b, out = None) : returns the dot product of vectors a and b. It can
also handle 2-D arrays, which it treats as matrices and multiplies using matrix multiplication. For N
dimensions it is a sum product over the last axis of a and the second-to-last axis of b :
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
Code #1:
# Python Program illustrating
# numpy.dot() method
import numpy as geek
# Scalars
product = geek.dot(5, 4)
print("Dot Product of scalar values : ", product)
# Complex scalars
vector_a = 2 + 3j
vector_b = 4 + 5j
product = geek.dot(vector_a, vector_b)
print("Dot Product : ", product)
Output:
Dot Product of scalar values : 20
Dot Product : (-7+22j)
How Code #1 works ?
vector_a = 2 + 3j
vector_b = 4 + 5j
now dot product
= (2 + 3j)(4 + 5j)
= 2(4 + 5j) + 3j(4 + 5j)
= 8 + 10j + 12j - 15
= -7 + 22j
numpy.vdot(vector_a, vector_b) : Returns the dot product of vectors a and b. If the first argument is
complex, the complex conjugate of the first argument is used for the calculation of the dot product
(this is where vdot() differs from the dot() method). It can handle multi-dimensional
arrays, but it works on them as flattened arrays.
Code #1:
# Python Program illustrating
# numpy.vdot() method
import numpy as geek
# Complex scalars
vector_a = 2 + 3j
vector_b = 4 + 5j
product = geek.vdot(vector_a, vector_b)
print("Dot Product : ", product)
Output :
Dot Product : (23-2j)

FUNCTION DESCRIPTION

matmul() Matrix product of two arrays.

inner() Inner product of two arrays.

outer() Compute the outer product of two vectors.

linalg.multi_dot() Compute the dot product of two or more arrays in a single function call, while automatically selecting the fastest evaluation order.

tensordot() Compute tensor dot product along specified axes for arrays >= 1-D.

einsum() Evaluates the Einstein summation convention on the operands.

einsum_path() Evaluates the lowest cost contraction order for an einsum expression by considering the creation of intermediate arrays.

linalg.matrix_power() Raise a square matrix to the (integer) power n.

kron() Kronecker product of two arrays.
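
Some of the products listed above can be compared on small arrays. The following sketch uses illustrative 2x2 matrices and 2-element vectors. For 2-D arrays, dot(), matmul() and the @ operator all perform matrix multiplication.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
u = np.array([1, 2])
v = np.array([3, 4])

product = A @ B                  # matrix product, same as np.matmul(A, B)
print(product)
print(np.array_equal(product, np.dot(A, B)))

inner = np.inner(u, v)           # 1*3 + 2*4
outer = np.outer(u, v)           # every u[i] * v[j] as a 2x2 array
kron = np.kron(u, v)             # Kronecker product of the two vectors

print(inner)
print(outer)
print(kron)
```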


Solving equations and inverting matrices:
numpy.linalg.solve() : Solve a linear matrix equation, or system of linear scalar
equations. Computes the "exact" solution, x, of the well-determined, i.e., full rank, linear matrix
equation ax = b.
# Python Program illustrating
# numpy.linalg.solve() method
import numpy as np
# Creating an array using array
# function
a = np.array([[1, 2], [3, 4]])
# Creating an array using array
# function
b = np.array([8, 18])
print("Solution of linear equations:",
      np.linalg.solve(a, b))
Output:
Solution of linear equations: [ 2. 3.]
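
The solution can be verified by substituting it back into the equation. The sketch below does this for the same system and also compares the result with the one obtained through the inverse matrix; solve() is generally preferred over computing the inverse explicitly.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([8, 18])

x = np.linalg.solve(a, b)

# Substituting x back: a @ x should reproduce b
print(np.allclose(a @ x, b))

# Same answer via the inverse matrix (more work, usually less accurate)
x_via_inv = np.linalg.inv(a) @ b
print(np.allclose(x, x_via_inv))
```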
numpy.linalg.lstsq() : Return the least-squares solution to a linear matrix equation. Solves the
equation a x = b by computing a vector x that minimizes the squared Euclidean 2-norm || b – a x ||^2. The
equation may be under-, well-, or over-determined (i.e., the number of linearly independent rows
of a can be less than, equal to, or greater than its number of linearly independent columns). If a is
square and of full rank, then x (but for round-off error) is the "exact" solution of the equation.
# Python Program illustrating
# numpy.linalg.lstsq() method
import numpy as np
import matplotlib.pyplot as plt
# x co-ordinates
x = np.arange(0, 9)
A = np.array([x, np.ones(9)])
# observed y values (roughly linear in x)
y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]
# obtaining the parameters of regression line
# (rcond=None avoids a FutureWarning on older NumPy versions)
w = np.linalg.lstsq(A.T, y, rcond=None)[0]
# plotting the line
line = w[0]*x + w[1] # regression line
plt.plot(x, line, 'r-')
plt.plot(x, y, 'o')
plt.show()
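
The same fit can be examined without plotting. The sketch below rebuilds the design matrix with np.column_stack() (one column of x values, one column of ones), unpacks all four values returned by lstsq(), and cross-checks the slope and intercept against np.polyfit(), which fits a degree-1 polynomial to the same data. Using np.polyfit() here is only a cross-check and is not part of the example above.

```python
import numpy as np

x = np.arange(0, 9)
y = np.array([19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24])

# Each row is [x_i, 1], so that A @ [slope, intercept] approximates y
A = np.column_stack([x, np.ones_like(x)])

w, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = w

print("slope     :", slope)
print("intercept :", intercept)
print("rank of A :", rank)

# Cross-check with a degree-1 polynomial fit
print(np.allclose(w, np.polyfit(x, y, 1)))
```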
FUNCTION DESCRIPTION

numpy.linalg.tensorsolve() Solve the tensor equation a x = b for x.

numpy.linalg.inv() Compute the (multiplicative) inverse of a matrix.

numpy.linalg.pinv() Compute the (Moore-Penrose) pseudo-inverse of a matrix.

numpy.linalg.tensorinv() Compute the 'inverse' of an N-dimensional array.


Special Functions
numpy.linalg.det() : Compute the determinant of an array.
# Python Program illustrating
# numpy.linalg.det() method
import numpy as np
# creating an array using
# array method
A = np.array([[6, 1, 1],
[4, -2, 5],
[2, 8, 7]])
print("\nDeterminant of A:",
      np.linalg.det(A))
Output:
Determinant of A: -306.0
numpy.trace() : Return the sum along diagonals of the array. If a is 2-D, the sum along its diagonal
with the given offset is returned, i.e., the sum of elements a[i,i+offset] for all i. If a has more than
two dimensions, then the axes specified by axis1 and axis2 are used to determine the 2-D
sub-arrays whose traces are returned. The shape of the resulting array is the same as that of a with axis1
and axis2 removed.
# Python Program illustrating
# numpy.trace() method
import numpy as np
# creating an array using
# array method
A = np.array([[6, 1, 1],
[4, -2, 5],
[2, 8, 7]])
print("\nTrace of A:", np.trace(A))
Output:
Trace of A: 11
FUNCTION DESCRIPTION

numpy.linalg.norm() Matrix or vector norm.

numpy.linalg.cond() Compute the condition number of a matrix.

numpy.linalg.matrix_rank() Return matrix rank of array using SVD method

numpy.linalg.cholesky() Cholesky decomposition.

numpy.linalg.qr() Compute the qr factorization of a matrix.

numpy.linalg.svd() Singular Value Decomposition.
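
A few of these functions can be tried together on a small matrix. This is a minimal sketch with an illustrative 2x2 matrix: a decomposition is checked by multiplying its factors back together, and the condition number (in the default 2-norm) is compared with the ratio of the largest to the smallest singular value.

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# Norms: Frobenius norm of the matrix, Euclidean norm of a vector
fro = np.linalg.norm(A)
vec = np.linalg.norm(np.array([3.0, 4.0]))
print(fro, vec)

# QR factorization: A = Q R
Q, R = np.linalg.qr(A)
print(np.allclose(Q @ R, A))

# Singular Value Decomposition: A = U diag(s) Vt
U, s, Vt = np.linalg.svd(A)
print(np.allclose(U @ np.diag(s) @ Vt, A))

# Rank and condition number
rank = np.linalg.matrix_rank(A)
cond = np.linalg.cond(A)
print(rank, cond)
```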

UNIT 4 Data Analysis with Pandas

Pandas is an open-source Python library providing high-performance data manipulation and
analysis tools using its powerful data structures. The name Pandas is derived from the term
Panel Data, an econometrics term for multidimensional data.
In 2008, developer Wes McKinney started developing pandas when in need of a high-performance,
flexible tool for analysis of data. Prior to Pandas, Python was mainly used for
data munging and preparation. It had very little contribution towards data analysis. Pandas
solved this problem. Using Pandas, we can accomplish five typical steps in the processing and
analysis of data, regardless of the origin of data: load, prepare, manipulate, model, and
analyze. Python with Pandas is used in a wide range of academic and
commercial domains, including finance, economics, statistics, analytics, etc.
Key Features of Pandas
• Fast and efficient DataFrame object with default and customized indexing.
• Tools for loading data into in-memory data objects from different file formats.
• Data alignment and integrated handling of missing data.
• Reshaping and pivoting of data sets.
• Label-based slicing, indexing and subsetting of large data sets.
• Columns from a data structure can be deleted or inserted.
• Group by data for aggregation and transformations.
• High performance merging and joining of data.
• Time Series functionality.
Python Pandas - Environment Setup:
Standard Python distribution doesn't come bundled with the Pandas module. A lightweight
alternative is to install Pandas using the popular Python package installer, pip.
pip install pandas
If you install the Anaconda Python package, Pandas will be installed by default. The following
distributions include it:
Windows
• Anaconda (from https://www.continuum.io) is a free Python distribution for the SciPy
stack. It is also available for Linux and Mac.
• Canopy (https://www.enthought.com/products/canopy/) is available as a free as well as
a commercial distribution with the full SciPy stack for Windows, Linux and Mac.
• Python (x,y) is a free Python distribution with the SciPy stack and Spyder IDE for Windows
OS. (Downloadable from http://python-xy.github.io/)
PANDAS Data Structures:
Pandas deals with the following three data structures:
• Series
• DataFrame
• Panel
These data structures are built on top of Numpy array, which means they are fast.
Dimension & Description:
The best way to think of these data structures is that the higher dimensional data structure is a
container of its lower dimensional data structure. For example, DataFrame is a container of
Series, Panel is a container of DataFrame.
Data Structure Dimensions Description

Series 1 1D labeled homogeneous array, size-immutable.

Data Frames 2 General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.

Panel 3 General 3D labeled, size-mutable array.


Building and handling arrays of two or more dimensions is tedious, because the user must keep
the orientation of the data set in mind when writing functions. Pandas data structures reduce
this mental effort.
For example, with tabular data (DataFrame) it is more semantically helpful to think of
the index (the rows) and the columns rather than axis 0 and axis 1.
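As a small illustration of this point, the sketch below sums a toy DataFrame along each axis, using the names 'index' and 'columns' in place of 0 and 1. The data and the labels (alice, bob, test1, test2) are made up for the example.

```python
import pandas as pd

# Hypothetical data: two students, two test scores
df = pd.DataFrame({'test1': [10, 20], 'test2': [30, 40]},
                  index=['alice', 'bob'])

# axis='index' (same as axis=0) collapses the rows: one total per column
col_totals = df.sum(axis='index')

# axis='columns' (same as axis=1) collapses the columns: one total per row
row_totals = df.sum(axis='columns')

print(col_totals)
print(row_totals)
```

Naming the axis makes it easier to see which direction is being collapsed than remembering what 0 and 1 stand for.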
Mutability
All Pandas data structures are value mutable (can be changed) and except Series all are size
mutable. Series is size immutable.
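Note − DataFrame is widely used and one of the most important data structures. Panel is used
much less; it was deprecated in pandas 0.20 and removed in pandas 0.25, so the Panel material
here applies only to older versions.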

Series:

Series is a one-dimensional array like structure with homogeneous data. For example, the
following series is a collection of integers 10, 23, 56, …

10 23 56 17 52 61 73 90 26 72
Key Points
 Homogeneous data
 Size Immutable
 Values of Data Mutable
Data Frame
Data Frame is a two-dimensional array with heterogeneous data. For example,

Name    Age   Gender   Rating
Steve   32    Male     3.45
Lia     28    Female   4.6
Vin     45    Male     3.9
Katie   38    Female   2.78

The table represents the data of a sales team of an organization with their overall performance
rating. The data is represented in rows and columns. Each column represents an attribute and
each row represents a person.
Data Type of Columns
The data types of the four columns are as follows −

Column Type

Name String

Age Integer

Gender String

Rating Float
Key Points
 Heterogeneous data
 Size Mutable
 Data Mutable

Panel
Panel is a three-dimensional data structure with heterogeneous data. It is hard to represent the
panel in graphical representation. But a panel can be illustrated as a container of DataFrame.
Key Points
 Heterogeneous data
 Size Mutable
 Data Mutable
pandas.Series
A pandas Series can be created using the following constructor −
pandas.Series( data, index, dtype, copy)
The parameters of the constructor are as follows −

Sr.No   Parameter   Description
1       data        data takes various forms like ndarray, list, constants.
2       index       Index values must be hashable and the same length as data; duplicate labels are allowed. Defaults to np.arange(n) if no index is passed.
3       dtype       dtype is for data type. If None, the data type will be inferred.
4       copy        Copy data. Default False.
A series can be created using various inputs like −
 Array
 Dict
 Scalar value or constant
Create an Empty Series
The most basic Series that can be created is an empty Series.
Example
#import the pandas library and aliasing as pd
import pandas as pd
s = pd.Series()
print(s)
Its output is as follows (older pandas versions report float64 for an empty Series; pandas 2.0
and later report object) −
Series([], dtype: float64)
Create a Series from ndarray
If data is an ndarray, then the index passed must be of the same length. If no index is passed,
then by default the index will be range(n), where n is the array length, i.e., [0, 1, 2, …, len(array)-1].
Example:
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)
Its output is as follows −
0 a
1 b
2 c
3 d
dtype: object
We did not pass any index, so by default, it assigned the indexes ranging from 0 to len(data)-1,
i.e., 0 to 3.

Example:
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = np.array(['a','b','c','d'])
s = pd.Series(data,index=[100,101,102,103])
print(s)
Its output is as follows −
100 a
101 b
102 c
103 d
dtype: object
We passed the index values here. Now we can see the customized indexed values in the output.
Create a Series from dict
A dict can be passed as input. If no index is specified, the dictionary keys are used to
construct the index (in insertion order in current pandas versions; older versions sorted the
keys). If an index is passed, the values in data corresponding to the labels in the index are
pulled out.

Example:
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
data = {'a' : 0., 'b' : 1., 'c' : 2.}
s = pd.Series(data)
print(s)
Its output is as follows −
a 0.0
b 1.0
c 2.0
dtype: float64
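The example above covers only the case where no index is given. A minimal sketch of the other case follows, reusing the same dict and passing an index whose label 'd' is not a key (the choice of labels is just for illustration).

```python
import pandas as pd

data = {'a': 0., 'b': 1., 'c': 2.}

# Labels are looked up in the dict; 'd' has no entry, so it becomes NaN
s = pd.Series(data, index=['b', 'c', 'd', 'a'])
print(s)
```

The resulting Series follows the order of the index that was passed, not the order of the dict.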
Create a Series from Scalar
If data is a scalar value, an index must be provided. The value will be repeated to match the
length of the index.
#import the pandas library and aliasing as pd
import pandas as pd
import numpy as np
s = pd.Series(5, index=[0, 1, 2, 3])
print(s)
Its output is as follows −
0 5
1 5
2 5
3 5
dtype: int64
Accessing Data from Series with Position
Data in the series can be accessed by position, much like the elements of an ndarray.

Example:
Retrieve the first element. Counting starts from zero, so the first element is stored at
position zero, the second at position one, and so on. Use iloc for a single position: with a
non-integer index such as this one, recent pandas versions deprecate s[0] as a positional lookup.

import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])

#retrieve the first element
print(s.iloc[0])
Its output is as follows −
1
Example:
Retrieve the first three elements in the Series. If a : is placed before an index, all items
before that index are extracted; if it is placed after an index, all items from that index
onwards are extracted. If two indexes are given (with : between them), the items between the
two indexes are extracted (not including the stop index).
import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])

#retrieve the first three elements
print(s[:3])
Its output is as follows −
a 1
b 2
c 3
dtype: int64
Example 3
Retrieve the last three elements.

import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])
#retrieve the last three elements
print(s[-3:])
Its output is as follows −
c 3
d 4
e 5
dtype: int64
Retrieve Data Using Label (Index)
A Series is like a fixed-size dict in that you can get and set values by index label.
Example 1
Retrieve a single element using index label value.

import pandas as pd
s = pd.Series([1,2,3,4,5],index = ['a','b','c','d','e'])

#retrieve a single element
print(s['a'])
Its output is as follows −
1
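The dict-like behaviour described above goes a little further than a single lookup. The following sketch, using the same Series, shows retrieving several labels at once, setting a value by label, and what happens when a label is missing; the variable names subset and missing_raised are only for this illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# A list of labels returns a new Series with just those labels
subset = s[['a', 'c', 'd']]
print(subset)

# Setting by label changes the stored value
s['a'] = 100

# A label that is not in the index raises KeyError
try:
    s['f']
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)
```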
Python Pandas – DataFrame:
A Data frame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in
rows and columns.
Features of DataFrame
 Columns can be of different types
 Size – Mutable
 Labeled axes (rows and columns)
 Can Perform Arithmetic operations on rows and columns
Structure
Let us assume that we are creating a data frame with student’s data. You can think of it as an
SQL table or a spreadsheet data representation.
pandas.DataFrame
A pandas DataFrame can be created using the following constructor −
pandas.DataFrame( data, index, columns, dtype, copy)
The parameters of the constructor are as follows −

Sr.No   Parameter   Description
1       data        data takes various forms like ndarray, Series, map, lists, dict, constants and also another DataFrame.
2       index       Row labels to use for the resulting frame. Optional; defaults to np.arange(n) if no index is passed.
3       columns     Column labels to use for the resulting frame. Optional; defaults to np.arange(n) if no column labels are passed.
4       dtype       Data type of each column.
5       copy        Whether to copy the input data. Default False.
Create DataFrame:
A pandas DataFrame can be created using various inputs like −
 Lists
 dict
 Series

 Numpy ndarrays
 Another DataFrame
In the subsequent sections of this chapter, we will see how to create a DataFrame using these
inputs.
Create an Empty DataFrame
The most basic DataFrame that can be created is an empty DataFrame.
Example:
#import the pandas library and aliasing as pd
import pandas as pd
df = pd.DataFrame()
print(df)
Its output is as follows −
Empty DataFrame
Columns: []
Index: []
Create a DataFrame from Lists
The DataFrame can be created using a single list or a list of lists.
Example
import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print(df)
Its output is as follows −
0
0 1
1 2
2 3
3 4
4 5
Example
import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)
Its output is as follows −
Name Age
0 Alex 10
1 Bob 12
2 Clarke 13
Create a DataFrame from Dict of ndarrays / Lists:

All the ndarrays must be of the same length. If an index is passed, its length should equal the
length of the arrays. If no index is passed, then by default the index will be range(n), where
n is the array length.
Example:
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
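print(df)
Its output is as follows −
    Name  Age
0    Tom   28
1   Jack   34
2  Steve   29
3  Ricky   42
Note − Observe the values 0,1,2,3. They are the default index assigned to each row using the
function range(n). The columns appear in the order of the dict keys; older pandas versions
sorted them alphabetically.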
Example
import pandas as pd
data = {'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]}
df = pd.DataFrame(data, index=['rank1','rank2','rank3','rank4'])
print(df)
Its output is as follows −
        Name  Age
rank1    Tom   28
rank2   Jack   34
rank3  Steve   29
rank4  Ricky   42
Note − Observe, the index parameter assigns an index to each row.
Create a DataFrame from List of Dicts:
List of Dictionaries can be passed as input data to create a DataFrame. The dictionary keys are
by default taken as column names.
Example:
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)
Its output is as follows −
a b c
0 1 2 NaN
1 5 10 20.0
Note − Observe, NaN (Not a Number) is appended in missing areas.
Example:
The following example shows how to create a DataFrame with a list of dictionaries, row
indices, and column indices.
import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
#With two column indices, values same as dictionary keys
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
#With two column indices, one of which is not a dictionary key
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print(df1)
print(df2)
Its output is as follows −
#df1 output
a b
first 1 2
second 5 10

#df2 output
a b1
first 1 NaN
second 5 NaN
Note − Observe, df2 is created with a column label (b1) that is not a dictionary key, so NaN's
are filled in for that column. Whereas df1 is created with column labels that match the
dictionary keys, so no NaN's are added.
Create a DataFrame from Dict of Series:
Dictionary of Series can be passed to form a DataFrame. The resultant index is the union of all
the series indexes passed.

Example:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df)
Its output is as follows −
one two
a 1.0 1
b 2.0 2
c 3.0 3
d NaN 4
Note − Observe, for the series one, there is no label ‘d’ passed, so in the result NaN is
filled in for the d label.
Let us now understand column selection, addition, and deletion through examples.

Column Selection
We will understand this by selecting a column from the DataFrame.
Example:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df['one'])
Its output is as follows −
a 1.0
b 2.0
c 3.0
d NaN
Name: one, dtype: float64
Column Addition
We will understand this by adding a new column to an existing data frame.

Example:
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
# Adding a new column to an existing DataFrame object with column label by passing new series
print ("Adding a new column by passing as Series:")
df['three']=pd.Series([10,20,30],index=['a','b','c'])
print(df)
print ("Adding a new column using the existing columns in DataFrame:")
df['four']=df['one']+df['three']
print(df)
Its output is as follows −
Adding a new column by passing as Series:
one two three
a 1.0 1 10.0
b 2.0 2 20.0
c 3.0 3 30.0
d NaN 4 NaN
Adding a new column using the existing columns in DataFrame:
one two three four
a 1.0 1 10.0 11.0
b 2.0 2 20.0 22.0
c 3.0 3 30.0 33.0
d NaN 4 NaN NaN
Column Deletion
Columns can be deleted or popped; let us take an example to understand how.
Example:
# Using the previous DataFrame, we will delete a column
# using del function
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
'three' : pd.Series([10,20,30], index=['a','b','c'])}
df = pd.DataFrame(d)
print ("Our dataframe is:")
print(df)
# using del function
print ("Deleting the first column using DEL function:")
del df['one']
print(df)
# using pop function
print ("Deleting another column using POP function:")
df.pop('two')
print(df)
Its output is as follows −
Our dataframe is:
   one  two  three
a  1.0    1   10.0
b  2.0    2   20.0
c  3.0    3   30.0
d  NaN    4    NaN
Deleting the first column using DEL function:
   two  three
a    1   10.0
b    2   20.0
c    3   30.0
d    4    NaN
Deleting another column using POP function:
   three
a   10.0
b   20.0
c   30.0
d    NaN
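Both del and pop modify the DataFrame itself. As a hedged aside, the sketch below contrasts them with drop(columns=...), which returns a new DataFrame and leaves the original alone; the small frame and the names trimmed and popped are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'one': [1, 2, 3], 'two': [4, 5, 6], 'three': [7, 8, 9]})

# drop() returns a new DataFrame; df itself still has all three columns
trimmed = df.drop(columns=['one', 'two'])

# pop() removes the column from df and returns it as a Series
popped = df.pop('three')

print(list(trimmed.columns))
print(list(df.columns))
print(list(popped))
```

Using drop is convenient when the original frame is still needed later in the program.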
Row Selection, Addition, and Deletion:
We will now understand row selection, addition and deletion through examples. Let us begin
with the concept of selection.

Selection by Label
Rows can be selected by passing a row label to the loc indexer.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df.loc['b'])
Its output is as follows −
one 2.0
two 2.0
Name: b, dtype: float64
The result is a series with labels as column names of the DataFrame. And, the Name of the
series is the label with which it is retrieved.
Selection by integer location
Rows can be selected by passing an integer location to the iloc indexer.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df.iloc[2])
Its output is as follows −
one 3.0
two 3.0
Name: c, dtype: float64
Slice Rows:
Multiple rows can be selected using ‘ : ’ operator.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df[2:4])
Its output is as follows −
one two
c 3.0 3
d NaN 4
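One detail worth seeing side by side is that slicing by position excludes the stop position, while slicing by label with loc includes the end label. The following is a minimal sketch on a made-up one-column frame.

```python
import pandas as pd

df = pd.DataFrame({'one': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])

by_position = df.iloc[1:3]   # rows at positions 1 and 2 -> labels b, c
by_label = df.loc['b':'d']   # label slices include the end label -> b, c, d

print(by_position)
print(by_label)
```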
Addition of Rows:
Add new rows to a DataFrame using the pd.concat function, which places the rows of one
DataFrame at the end of another. (The older DataFrame.append method was removed in pandas 2.0.)
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
df = pd.concat([df, df2])
print(df)
Its output is as follows −
a b
0 1 2
1 3 4
0 5 6
1 7 8
Deletion of Rows:
Use an index label to delete or drop rows from a DataFrame. If the label is duplicated, then
multiple rows will be dropped.
In the above example the labels are duplicated. Let us drop a label and see how many rows get
dropped.
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns = ['a','b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns = ['a','b'])
df = pd.concat([df, df2])
# Drop rows with label 0
df = df.drop(0)
print(df)
Its output is as follows −
   a  b
1  3  4
1  7  8
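If dropping two rows at once is not what is wanted, one option is to renumber the rows while combining the frames. The sketch below assumes the same two frames and uses the ignore_index parameter of pd.concat; the names combined and remaining are only for this illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])

# ignore_index=True discards the old labels and numbers the rows 0..n-1
combined = pd.concat([df, df2], ignore_index=True)

# With unique labels, drop(0) removes exactly one row
remaining = combined.drop(0)
print(remaining)
```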
Python Pandas - Basic Functionality
Series Basic Functionality
Sr.No.   Attribute or Method   Description
1        axes                  Returns a list of the row axis labels.
2        dtype                 Returns the dtype of the object.
3        empty                 Returns True if series is empty.
4        ndim                  Returns the number of dimensions of the underlying data, by definition 1.
5        size                  Returns the number of elements in the underlying data.
6        values                Returns the Series as ndarray.
7        head()                Returns the first n rows.
8        tail()                Returns the last n rows.
Let us now create a Series and see how each of the attributes tabulated above operates.
Example:
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print(s)
Its output is as follows −
0 0.967853
1 -0.148368
2 -1.395906
3 -1.758394
dtype: float64
axes:
Returns the list of the labels of the series.

import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print ("The axes are:")
print(s.axes)
Its output is as follows −
The axes are:
[RangeIndex(start=0, stop=4, step=1)]
The above result is a compact representation of the labels 0 to 3, i.e., [0,1,2,3].
Empty:
Returns the Boolean value saying whether the Object is empty or not. True indicates that the
object is empty.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print ("Is the Object empty?")
print(s.empty)
Its output is as follows −
Is the Object empty?
False

Ndim:
Returns the number of dimensions of the object. By definition, a Series is a 1D data structure,
so it returns 1.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print(s)
print ("The dimensions of the object:")
print(s.ndim)
Its output is as follows −
0 0.175898
1 0.166197
2 -0.609712
3 -1.377000
dtype: float64

The dimensions of the object:
1
Size:
Returns the size(length) of the series.
import pandas as pd
import numpy as np
#Create a series with 2 random numbers
s = pd.Series(np.random.randn(2))
print(s)
print ("The size of the object:")
print(s.size)
Its output is as follows −
0 3.078058
1 -1.207803
dtype: float64

The size of the object:


2

Values:
Returns the actual data in the series as an array.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print(s)
print ("The actual data series is:")
print(s.values)
Its output is as follows −
0 1.787373
1 -0.605159
2 0.180477
3 -0.140922
dtype: float64

The actual data series is:


[ 1.78737302 -0.60515881 0.18047664 -0.1409218 ]
Head & Tail
To view a small sample of a Series or the DataFrame object, use the head() and the tail()
methods.
head() returns the first n rows(observe the index values). The default number of elements to
display is five, but you may pass a custom number.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print ("The original series is:")
print(s)
print ("The first two rows of the data series:")
print(s.head(2))
Its output is as follows −
The original series is:
0 0.720876
1 -0.765898
2 0.479221
3 -0.139547
dtype: float64

The first two rows of the data series:
0 0.720876
1 -0.765898
dtype: float64
tail() returns the last n rows(observe the index values). The default number of elements to
display is five, but you may pass a custom number.
import pandas as pd
import numpy as np
#Create a series with 4 random numbers
s = pd.Series(np.random.randn(4))
print ("The original series is:")
print(s)
print ("The last two rows of the data series:")
print(s.tail(2))
Its output is as follows −
The original series is:
0 -0.655091
1 -0.881407
2 -0.608592
3 -2.341413
dtype: float64

The last two rows of the data series:


2 -0.608592
3 -2.341413
dtype: float64

DataFrame Basic Functionality

The following table lists the important attributes and methods that provide DataFrame basic
functionality.

Sr.No.   Attribute or Method   Description
1        T                     Transposes rows and columns.
2        axes                  Returns a list with the row axis labels and column axis labels as the only members.
3        dtypes                Returns the dtypes in this object.
4        empty                 True if the DataFrame is entirely empty (no items), i.e., if any of the axes are of length 0.
5        ndim                  Number of axes / array dimensions.
6        shape                 Returns a tuple representing the dimensionality of the DataFrame.
7        size                  Number of elements in the DataFrame.
8        values                NumPy representation of the DataFrame.
9        head()                Returns the first n rows.
10       tail()                Returns the last n rows.

Let us now create a DataFrame and see how the above mentioned attributes operate.

Example
import pandas as pd
import numpy as np

#Create a Dictionary of series


d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}

#Create a DataFrame
df = pd.DataFrame(d)
print ("Our data series is:")
print(df)
Its output is as follows −
Our data series is:
    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20
5  Smith   29    4.60
6   Jack   23    3.80
T (Transpose)
Returns the transpose of the DataFrame. The rows and columns will interchange.
import pandas as pd
import numpy as np
# Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
# Create a DataFrame
df = pd.DataFrame(d)
print ("The transpose of the data series is:")
print(df.T)
Its output is as follows −
The transpose of the data series is:
           0      1      2     3      4      5     6
Name     Tom  James  Ricky   Vin  Steve  Smith  Jack
Age       25     26     25    23     30     29    23
Rating  4.23   3.24   3.98  2.56    3.2    4.6   3.8
axes
Returns the list of row axis labels and column axis labels.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("Row axis labels and column axis labels are:")
print(df.axes)
Its output is as follows −
Row axis labels and column axis labels are:
[RangeIndex(start=0, stop=7, step=1), Index(['Name', 'Age', 'Rating'], dtype='object')]

dtypes
Returns the data type of each column.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("The data types of each column are:")
print(df.dtypes)
Its output is as follows −
The data types of each column are:
Name       object
Age         int64
Rating    float64
dtype: object

empty
Returns the Boolean value saying whether the Object is empty or not; True indicates that the
object is empty.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("Is the object empty?")
print(df.empty)
Its output is as follows −
Is the object empty?
False
ndim
Returns the number of dimensions of the object. By definition, DataFrame is a 2D object.

import pandas as pd
import numpy as np

#Create a Dictionary of series


d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}

#Create a DataFrame
df = pd.DataFrame(d)
print ("Our object is:")
print(df)
print ("The dimension of the object is:")
print(df.ndim)
Its output is as follows −
Our object is:
    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20
5  Smith   29    4.60
6   Jack   23    3.80

The dimension of the object is:
2

shape
Returns a tuple representing the dimensionality of the DataFrame. Tuple (a,b), where a
represents the number of rows and b represents the number of columns.
import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
#Create a DataFrame
df = pd.DataFrame(d)
print ("Our object is:")
print(df)
print ("The shape of the object is:")
print(df.shape)
Its output is as follows −
Our object is:
    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20
5  Smith   29    4.60
6   Jack   23    3.80

The shape of the object is:
(7, 3)

size
Returns the number of elements in the DataFrame.

import pandas as pd
import numpy as np
#Create a Dictionary of series
d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}

#Create a DataFrame
df = pd.DataFrame(d)
print ("Our object is:")
print(df)
print ("The total number of elements in our object is:")
print(df.size)
Its output is as follows −
Our object is:
    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20
5  Smith   29    4.60
6   Jack   23    3.80

The total number of elements in our object is:
21

values
Returns the actual data in the DataFrame as an NDarray.

import pandas as pd
import numpy as np

#Create a Dictionary of series


d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}

#Create a DataFrame
df = pd.DataFrame(d)
print ("Our object is:")
print(df)
print ("The actual data in our data frame is:")
print(df.values)
Its output is as follows −
Our object is:
    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20
5  Smith   29    4.60
6   Jack   23    3.80
The actual data in our data frame is:
[['Tom' 25 4.23]
 ['James' 26 3.24]
 ['Ricky' 25 3.98]
 ['Vin' 23 2.56]
 ['Steve' 30 3.2]
 ['Smith' 29 4.6]
 ['Jack' 23 3.8]]

Head & Tail


To view a small sample of a DataFrame object, use the head() and tail()
methods. head() returns the first n rows (observe the index values). The default number of
elements to display is five, but you may pass a custom number.

import pandas as pd
import numpy as np

#Create a Dictionary of series


d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}

#Create a DataFrame
df = pd.DataFrame(d)
print ("Our data frame is:")
print(df)
print ("The first two rows of the data frame is:")
print(df.head(2))
Its output is as follows −
Our data frame is:
    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20
5  Smith   29    4.60
6   Jack   23    3.80

The first two rows of the data frame is:
    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
tail() returns the last n rows (observe the index values). The default number of elements to
display is five, but you may pass a custom number.

import pandas as pd
import numpy as np

#Create a Dictionary of series


d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
'Age':pd.Series([25,26,25,23,30,29,23]),
'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}

#Create a DataFrame
df = pd.DataFrame(d)
print ("Our data frame is:")
print(df)
print ("The last two rows of the data frame is:")
print(df.tail(2))
Its output is as follows −
Our data frame is:
    Name  Age  Rating
0    Tom   25    4.23
1  James   26    3.24
2  Ricky   25    3.98
3    Vin   23    2.56
4  Steve   30    3.20
5  Smith   29    4.60
6   Jack   23    3.80

The last two rows of the data frame is:
    Name  Age  Rating
5  Smith   29     4.6
6   Jack   23     3.8
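The attributes covered in this section can also be inspected together on one small frame. The following is a compact sketch using a shortened, four-row version of the data above (the shortening is only to keep the example brief).

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Tom', 'James', 'Ricky', 'Vin'],
                   'Age': [25, 26, 25, 23],
                   'Rating': [4.23, 3.24, 3.98, 2.56]})

print(df.shape)      # tuple of (number of rows, number of columns)
print(df.size)       # number of rows multiplied by number of columns
print(df.ndim)       # number of axes of the DataFrame
print(df.empty)      # whether the DataFrame has no elements
print(df.T.shape)    # the transpose swaps the two entries of shape
print(len(df.head(2)), len(df.tail(1)))
```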
Python Pandas – Reindexing:
Reindexing changes the row labels and column labels of a DataFrame. To reindex means to
conform the data to match a given set of labels along a particular axis.
Multiple operations can be accomplished through reindexing, like −
 Reorder the existing data to match a new set of labels.
 Insert missing value (NA) markers in label locations where no data for the label existed.

Example
import pandas as pd
import numpy as np

N=20

df = pd.DataFrame({
'A': pd.date_range(start='2016-01-01',periods=N,freq='D'),
'x': np.linspace(0,stop=N-1,num=N),
'y': np.random.rand(N),
'C': np.random.choice(['Low','Medium','High'],N).tolist(),
'D': np.random.normal(100, 10, size=(N)).tolist()
})
#reindex the DataFrame
df_reindexed = df.reindex(index=[0,2,5], columns=['A', 'C', 'B'])
print(df_reindexed)
Its output is as follows −
A C B
0 2016-01-01 Low NaN
2 2016-01-03 High NaN
5 2016-01-06 Low NaN
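Because the example above uses random data, the two effects of reindexing (reordering and inserting missing markers) may be easier to see on a tiny Series with fixed values. The sketch below also uses the fill_value parameter of reindex to put a chosen value in place of NaN; the labels are made up for the illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

# Reorder existing labels and introduce a new one ('z'), which gets NaN
reordered = s.reindex(['c', 'a', 'z'])

# fill_value replaces the NaN that would otherwise mark the new label
filled = s.reindex(['c', 'a', 'z'], fill_value=0)

print(reordered)
print(filled)
```

Label 'b' is absent from the new index, so it is simply left out of both results.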

Reindex to Align with Other Objects


You may wish to take an object and reindex its axes to be labeled the same as another object.
Consider the following example to understand the same.

Example
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(10,3),columns=['col1','col2','col3'])
df2 = pd.DataFrame(np.random.randn(7,3),columns=['col1','col2','col3'])
df1 = df1.reindex_like(df2)
print(df1)
Its output is as follows −
col1 col2 col3
0 -2.467652 -1.211687 -0.391761
1 -0.287396 0.522350 0.562512
2 -0.255409 -0.483250 1.866258
3 -1.150467 -0.646493 -0.222462
4 0.152768 -2.056643 1.877233
5 -1.155997 1.528719 -1.343719
6 -1.015606 -1.245936 -0.295275
Note − Here, the df1 DataFrame is altered and reindexed like df2. The column names should
match, or else NaN will be added for the entire column.

Filling while ReIndexing

reindex() takes an optional parameter method which is a filling method with values as follows

 pad/ffill − Fill values forward
 bfill/backfill − Fill values backward
 nearest − Fill from the nearest index values

Example
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
df2 = pd.DataFrame(np.random.randn(2,3),columns=['col1','col2','col3'])
# Padding NaN's
print(df2.reindex_like(df1))
# Now fill the NaN's with preceding values
print ("Data Frame with Forward Fill:")
print(df2.reindex_like(df1,method='ffill'))
Its output is as follows −
col1 col2 col3
0 1.311620 -0.707176 0.599863
1 -0.423455 -0.700265 1.133371
2 NaN NaN NaN
3 NaN NaN NaN
4 NaN NaN NaN
5 NaN NaN NaN

Data Frame with Forward Fill:


col1 col2 col3
0 1.311620 -0.707176 0.599863
1 -0.423455 -0.700265 1.133371
2 -0.423455 -0.700265 1.133371
3 -0.423455 -0.700265 1.133371
4 -0.423455 -0.700265 1.133371
5 -0.423455 -0.700265 1.133371
Note − The last four rows are padded.
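To compare the three filling methods directly, the following sketch reindexes a small Series with fixed values at labels 0, 5 and 10. The labels and values are chosen only so that the result of each method is easy to follow; the fill methods require the original index to be sorted.

```python
import pandas as pd

# Known values at labels 0, 5 and 10
s = pd.Series([1.0, 2.0, 3.0], index=[0, 5, 10])
new_index = [0, 2, 4, 6, 10]

forward = s.reindex(new_index, method='ffill')     # last known value at or before the label
backward = s.reindex(new_index, method='bfill')    # next known value at or after the label
closest = s.reindex(new_index, method='nearest')   # value whose label is closest

print(forward.tolist())
print(backward.tolist())
print(closest.tolist())
```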

Limits on Filling while Reindexing

The limit argument provides additional control over filling while reindexing. Limit specifies the
maximum count of consecutive matches. Let us consider the following example to understand
the same −

Example
import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
df2 = pd.DataFrame(np.random.randn(2,3),columns=['col1','col2','col3'])
# Padding NaN's
print(df2.reindex_like(df1))
# Now fill the NaN's with preceding values
print ("Data Frame with Forward Fill limiting to 1:")
print(df2.reindex_like(df1,method='ffill',limit=1))
Its output is as follows −
col1 col2 col3
0 0.247784 2.128727 0.702576
1 -0.055713 -0.021732 -0.174577
2 NaN NaN NaN
3 NaN NaN NaN
4 NaN NaN NaN
5 NaN NaN NaN

Data Frame with Forward Fill limiting to 1:


col1 col2 col3
0 0.247784 2.128727 0.702576
1 -0.055713 -0.021732 -0.174577
2 -0.055713 -0.021732 -0.174577
3 NaN NaN NaN
4 NaN NaN NaN
5 NaN NaN NaN
Note − Observe, only the row with index 2 is filled from the preceding row (index 1). The
remaining rows are left as they are.
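The effect of limit can also be checked on fixed values. The sketch below is a minimal version of the same idea on a two-element Series, calling reindex directly; the names unlimited and limited are only for this illustration.

```python
import pandas as pd

s = pd.Series([1.0, 2.0], index=[0, 1])

# Without a limit, every new label after 1 inherits the value at label 1
unlimited = s.reindex(range(5), method='ffill')

# limit=1 fills at most one consecutive missing label; the rest stay NaN
limited = s.reindex(range(5), method='ffill', limit=1)

print(unlimited.tolist())
print(limited.tolist())
```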
Renaming:
The rename() method allows you to relabel an axis based on some mapping (a dict or Series) or
an arbitrary function.
Let us consider the following example to understand this −

import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
print(df1)
print("After renaming the rows and columns:")
print(df1.rename(columns={'col1': 'c1', 'col2': 'c2'},
                 index={0: 'apple', 1: 'banana', 2: 'durian'}))
Its output is as follows −
col1 col2 col3
0 0.486791 0.105759 1.540122
1 -0.990237 1.007885 -0.217896
2 -0.483855 -1.645027 -1.194113
3 -0.122316 0.566277 -0.366028
4 -0.231524 -0.721172 -0.112007
5 0.438810 0.000225 0.435479

After renaming the rows and columns:


c1 c2 col3
apple 0.486791 0.105759 1.540122
banana -0.990237 1.007885 -0.217896
durian -0.483855 -1.645027 -1.194113
3 -0.122316 0.566277 -0.366028
4 -0.231524 -0.721172 -0.112007
5 0.438810 0.000225 0.435479
The rename() method provides an inplace named parameter, which by default is False and
copies the underlying data. Pass inplace=True to rename the data in place.
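The text above mentions that rename() also accepts a function and has an inplace parameter. The following is a minimal sketch of both, using a small hypothetical DataFrame (the column names and the label 'first' are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# A function mapper is applied to every label on the chosen axis.
upper = df.rename(columns=str.upper)
print(list(upper.columns))   # ['COL1', 'COL2']
print(list(df.columns))      # the original is unchanged: ['col1', 'col2']

# inplace=True modifies df itself and returns None.
result = df.rename(index={0: "first"}, inplace=True)
print(result)                # None
print(list(df.index))        # ['first', 1]
```

Because the in-place call returns None, its result should not be assigned back to the DataFrame variable.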

Python Pandas – Sorting:


There are two kinds of sorting available in Pandas. They are −
 By label
 By Actual Value

Let us consider an example with an output.

import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame(np.random.randn(10, 2), index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                           columns=['col2', 'col1'])
print(unsorted_df)
Its output is as follows −
col2 col1
1 -2.063177 0.537527
4 0.142932 -0.684884
6 0.012667 -0.389340
2 -0.548797 1.848743
3 -1.044160 0.837381
5 0.385605 1.300185
9 1.031425 -1.002967
8 -0.407374 -0.435142
0 2.237453 -1.067139
7 -1.445831 -1.701035
In unsorted_df, the labels and the values are unsorted. Let us see how these can be sorted.
By Label
The sort_index() method sorts a DataFrame by its labels. The axis and ascending arguments
choose which labels are sorted and in what order. By default, rows are sorted by label in
ascending order.
import pandas as pd
import numpy as np

unsorted_df = pd.DataFrame(np.random.randn(10, 2), index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                           columns=['col2', 'col1'])

sorted_df = unsorted_df.sort_index()
print(sorted_df)
Its output is as follows −
col2 col1
0 0.208464 0.627037
1 0.641004 0.331352
2 -0.038067 -0.464730
3 -0.638456 -0.021466
4 0.014646 -0.737438
5 -0.290761 -1.669827
6 -0.797303 -0.018737
7 0.525753 1.628921
8 -0.567031 0.775951
9 0.060724 -0.322425
Order of Sorting
The order of sorting can be controlled by passing a Boolean value to the ascending parameter.
Consider the following example.

import pandas as pd
import numpy as np

unsorted_df = pd.DataFrame(np.random.randn(10, 2), index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                           columns=['col2', 'col1'])

sorted_df = unsorted_df.sort_index(ascending=False)
print(sorted_df)
Its output is as follows −
col2 col1
9 0.825697 0.374463
8 -1.699509 0.510373
7 -0.581378 0.622958
6 -0.202951 0.954300
5 -1.289321 -1.551250
4 1.302561 0.851385
3 -0.157915 -0.388659
2 -1.222295 0.166609
1 0.584890 -0.291048
0 0.668444 -0.061294
Sort the Columns
Passing axis=1 to sort_index() sorts by the column labels instead of the row labels. The
default, axis=0, sorts by row labels. Consider the following example.

import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame(np.random.randn(10, 2), index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                           columns=['col2', 'col1'])
sorted_df = unsorted_df.sort_index(axis=1)
print(sorted_df)
Its output is as follows −
col1 col2
1 -0.291048 0.584890
4 0.851385 1.302561
6 0.954300 -0.202951
2 0.166609 -1.222295
3 -0.388659 -0.157915
5 -1.551250 -1.289321
9 0.374463 0.825697
8 0.510373 -1.699509
0 -0.061294 0.668444
7 0.622958 -0.581378
By Value
Just as sort_index() sorts by labels, sort_values() sorts by values. It accepts a 'by' argument
that names the column (or columns) whose values determine the order.

import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame({'col1':[2,1,1,1],'col2':[1,3,2,4]})
sorted_df = unsorted_df.sort_values(by='col1')
print(sorted_df)
Its output is as follows −
col1 col2
1 1 3
2 1 2
3 1 4
0 2 1
Observe that the col1 values are sorted and that each row keeps its own col2 value and index
label. This is why col2 and the index look unsorted.
The 'by' argument also accepts a list of column names; ties in the first column are then
broken by the second.

import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame({'col1':[2,1,1,1],'col2':[1,3,2,4]})
sorted_df = unsorted_df.sort_values(by=['col1','col2'])
print(sorted_df)
Its output is as follows −
col1 col2
2 1 2
1 1 3
3 1 4
0 2 1
Sorting Algorithm
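sort_values() lets you choose the algorithm through the kind argument: 'quicksort' (the
default), 'mergesort', 'heapsort' or 'stable'. Of these, only mergesort and stable are
guaranteed to be stable, that is, to keep rows with equal keys in their original order.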
import pandas as pd
import numpy as np
unsorted_df = pd.DataFrame({'col1':[2,1,1,1],'col2':[1,3,2,4]})
sorted_df = unsorted_df.sort_values(by='col1' ,kind='mergesort')
print(sorted_df)
Its output is as follows −
col1 col2
1 1 3
2 1 2
3 1 4
0 2 1
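To make the idea of stability concrete, here is a small sketch on the same four-row data. It also shows a descending sort, which the text does not cover for sort_values(); the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'col1': [2, 1, 1, 1], 'col2': [1, 3, 2, 4]})

# A stable sort keeps rows with equal col1 in their original order,
# so the tied rows stay in index order 1, 2, 3.
stable = df.sort_values(by='col1', kind='mergesort')
print(stable.index.tolist())   # [1, 2, 3, 0]

# Descending order is requested with ascending=False.
desc = df.sort_values(by='col2', ascending=False)
print(desc['col2'].tolist())   # [4, 3, 2, 1]
```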
Working with Missing Data in Pandas:
Missing data occurs when no information is provided for one or more items, or for a whole unit.
It is a common problem in real-life datasets. In pandas, missing values are also referred to as
NA (Not Available) values. Many datasets arrive with missing data, either because the data
exists but was not collected or because it never existed. For example, in a survey some users
may choose not to share their income and others may choose not to share their address, so
those fields are left empty. In pandas, missing data is represented by two values:
None: None is a Python singleton object that is often used for missing data in Python code.
NaN: NaN (short for Not a Number) is a special floating-point value recognized by all
systems that use the standard IEEE floating-point representation.
Pandas treats None and NaN as essentially interchangeable for indicating missing or null values.
To support this convention, there are several useful functions for detecting, removing, and
replacing null values in a pandas DataFrame:
isnull()
notnull()
dropna()
fillna()
replace()
interpolate()
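As a quick check of the claim that None and NaN are treated alike, the sketch below builds a small Series containing both (the values are arbitrary) and inspects it with isnull() and notnull():

```python
import numpy as np
import pandas as pd

# None and np.nan are both reported as missing.
s = pd.Series([1.0, None, np.nan, 4.0])
print(s.isnull().tolist())    # [False, True, True, False]
print(s.notnull().tolist())   # [True, False, False, True]

# In a numeric Series, None is stored as NaN, so the dtype stays float64.
print(s.dtype)                # float64
n_missing = int(s.isnull().sum())
print(n_missing)              # 2
```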
Checking for missing values using isnull() and notnull()

To check for missing values in a pandas DataFrame, we use the functions isnull() and
notnull(). Both report, value by value, whether an entry is NaN or not. These functions can
also be used on a pandas Series to find null values in the series.
Checking for missing values using isnull():
The isnull() function returns a DataFrame of Boolean values that are True wherever the
original value is NaN.

Code:

# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, 45, 56, np.nan],
          'Third Score': [np.nan, 40, 80, 98]}
# creating a dataframe from the dictionary
df = pd.DataFrame(scores)
# using isnull() function
df.isnull()
Output:







Checking for missing values using notnull():
The notnull() function is the opposite of isnull(): it returns a DataFrame of Boolean values
that are False wherever the original value is NaN.
Code:
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, 45, 56, np.nan],
          'Third Score': [np.nan, 40, 80, 98]}
# creating a dataframe using dictionary
df = pd.DataFrame(scores)
# using notnull() function
df.notnull()

Output:

Code:

# importing pandas package
import pandas as pd
# making data frame from csv file
data = pd.read_csv("employees.csv")
# creating bool series that is True for non-NaN values
bool_series = pd.notnull(data["Gender"])
# filtering data: displaying only the rows where Gender is not NaN
data[bool_series]
Output:

As shown in the output image, only the rows having Gender = NOT NULL are displayed.

Filling missing values using fillna(), replace() and interpolate()


To fill null values in a dataset, we use the fillna(), replace() and interpolate() functions.
Each of them replaces NaN values with some other value. The interpolate() function also
fills NA values in the DataFrame, but it uses an interpolation technique to estimate the
missing values rather than a hard-coded value.
Code: Filling null values with a single value
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, 45, 56, np.nan],
          'Third Score': [np.nan, 40, 80, 98]}
# creating a dataframe from dictionary
df = pd.DataFrame(scores)
# filling missing value using fillna()
df.fillna(0)
Output:
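
The section names interpolate() but only demonstrates fillna(). As a rough sketch of the difference, the following compares the two on a short hypothetical Series (the numbers are chosen so the result is easy to verify by hand):

```python
import numpy as np
import pandas as pd

s = pd.Series([10.0, np.nan, np.nan, 40.0])

# fillna(): every gap receives the same fixed value.
filled = s.fillna(0)
print(filled.tolist())    # [10.0, 0.0, 0.0, 40.0]

# interpolate(): by default, gaps are filled linearly between the neighbours.
interp = s.interpolate()
print(interp.tolist())    # [10.0, 20.0, 30.0, 40.0]
```

Linear interpolation is only the default; whether it is appropriate depends on the data.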

Dropping missing values using dropna ()


To drop null values from a DataFrame, we use the dropna() function. It can drop the rows or
columns that contain null values in several different ways.
Code: Dropping rows with at least 1 null value.
# importing pandas as pd
import pandas as pd
# importing numpy as np
import numpy as np
# dictionary of lists
scores = {'First Score': [100, 90, np.nan, 95],
          'Second Score': [30, np.nan, 45, 56],
          'Third Score': [52, 40, 80, 98],
          'Fourth Score': [np.nan, np.nan, np.nan, 65]}
# creating a dataframe from dictionary
df = pd.DataFrame(scores)
df
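The code above only builds and displays the DataFrame. A minimal sketch of the dropping step itself, on the same data, might look like this (the axis and thresh variants are additions for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'First Score': [100, 90, np.nan, 95],
                   'Second Score': [30, np.nan, 45, 56],
                   'Third Score': [52, 40, 80, 98],
                   'Fourth Score': [np.nan, np.nan, np.nan, 65]})

# Default (axis=0, how='any'): drop every row that has at least one NaN.
rows_kept = df.dropna()
print(rows_kept.index.tolist())      # [3]

# axis=1: drop every column that has at least one NaN.
cols_kept = df.dropna(axis=1)
print(cols_kept.columns.tolist())    # ['Third Score']

# thresh=3: keep only the rows with at least three non-NaN values.
thresh_kept = df.dropna(thresh=3)
print(thresh_kept.index.tolist())    # [0, 3]
```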

UNIT-5 Data Analysis Application Examples

Data Wrangling OR Data Munging in Python:


Data wrangling is the process of gathering and transforming raw data into another format
so that it can be understood, accessed, and analyzed more quickly and used for
decision-making. Data wrangling is also known as data munging.
Importance of Data Wrangling:
Data wrangling is a very important step. The following example illustrates why:
A book-selling website wants to show the top-selling books of different domains according
to user preference. For example, when a new user searches for motivational books, the site
wants to show the motivational books that sell the most or have a high rating.
But the website holds plenty of raw data from different users. This is where data munging,
or data wrangling, is used. The data is not wrangled by the system itself; this process is
done by data scientists. The data scientist wrangles the data so that the motivational books
that sell more, have high ratings, or are frequently bought together with other books are
sorted to the top. On that basis, the new user makes a choice. This illustrates the
importance of data wrangling.
Data Wrangling in Python
Data wrangling is a crucial topic for data science and data analysis. The pandas library
for Python is used for data wrangling. Pandas is an open-source library specifically
developed for data analysis and data science. It supports operations such as sorting,
filtering, and grouping data.
Data wrangling in Python deals with the following functionalities:
1. Data exploration: In this process, the data is studied, analyzed and understood by
visualizing representations of the data.
2. Dealing with missing values: Most large datasets contain missing (NaN) values. They
need to be taken care of by replacing them with the mean or the mode (the most frequent
value) of the column, or simply by dropping the rows that have a NaN value.
3. Reshaping data: In this process, data is manipulated according to the requirements,
where new data can be added or pre-existing data can be modified.
4. Filtering data: Sometimes datasets contain unwanted rows or columns, which need to be
removed or filtered out.
5. Other: After processing the raw dataset with the above functionalities, we get a
dataset that fits our requirements. It can then be used for purposes such as data
analysis, machine learning, data visualization, and model training.
Data exploration: here we assign the data and then display it in a tabular format.
# Import pandas package
import pandas as pd
# Assign data ('NaN' strings mark the missing marks)
data = {'Name': ['Jai', 'Princi', 'Gaurav',
                 'Anuj', 'Ravi', 'Natasha', 'Riya'],
        'Age': [17, 17, 18, 17, 18, 17, 17],
        'Gender': ['M', 'F', 'M', 'M', 'M', 'F', 'F'],
        'Marks': [90, 76, 'NaN', 74, 65, 'NaN', 71]}
# Convert into DataFrame
df = pd.DataFrame(data)
# Display data
df

Cleaning Data:
Missing data is always a problem in real-life scenarios. Areas like machine learning and
data mining face severe issues in the accuracy of their model predictions because of poor
data quality caused by missing values. In these areas, missing-value treatment is a major
point of focus for making models more accurate and valid.
Consider an online survey for a product. People often do not share all the information
related to them. Some share their experience but not how long they have been using the
product; others share how long they have been using the product and their experience, but
not their contact information. Thus, in one way or another, part of the data is always
missing, and this is very common in practice.
Dealing with missing values: as we can see from the previous output, there are NaN values
in the Marks column. We handle them by replacing them with the column mean.

# Compute average of the numeric marks
c = avg = 0
for ele in df['Marks']:
    if str(ele).isnumeric():
        c += 1
        avg += ele
avg /= c
# Replace missing values
df = df.replace(to_replace="NaN", value=avg)
# Display data
df
Output:
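
The loop above computes the mean by hand. A more idiomatic alternative, sketched here on a shortened hypothetical copy of the data, converts the column to numbers first and lets pandas skip the missing entries:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Marks': [90, 76, 'NaN', 74]})

# Turn the column into numbers; anything non-numeric becomes a real NaN.
marks = pd.to_numeric(df['Marks'], errors='coerce')

# Series.mean() skips NaN by default, so only 90, 76 and 74 are averaged.
df['Marks'] = marks.fillna(marks.mean())
print(df['Marks'].tolist())   # [90.0, 76.0, 80.0, 74.0]
```

This is one possible approach, not the one the notes use; it has the side effect of turning the column into floats.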

Cleaning / Filling Missing Data:


Pandas provides various methods for cleaning missing values. The fillna function can
“fill in” NA values with non-null data in a couple of ways, which are illustrated in the
following sections.

Replace NaN with a Scalar Value


The following program shows how you can replace "NaN" with "0".

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(3, 3), index=['a', 'c', 'e'],
                  columns=['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c'])
print(df)
print("NaN replaced with '0':")
print(df.fillna(0))
Its output is as follows −
one two three
a -0.576991 -0.741695 0.553172
b NaN NaN NaN
c 0.744328 -1.735166 1.749580

NaN replaced with '0':


one two three
a -0.576991 -0.741695 0.553172
b 0.000000 0.000000 0.000000
c 0.744328 -1.735166 1.749580
Here, we are filling with value zero; instead we can also fill with any other value.

Fill NA Forward and Backward
Using the filling concepts discussed in the Reindexing chapter, we can fill the missing
values.

Method          Action
pad/ffill       Fill values forward
bfill/backfill  Fill values backward

Example
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), index=['a', 'c', 'e', 'f', 'h'],
                  columns=['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])

# Forward fill; equivalent to fillna(method='pad') in older pandas versions
print(df.ffill())
Its output is as follows −
one two three
a 0.077988 0.476149 0.965836
b 0.077988 0.476149 0.965836
c -0.390208 -0.551605 -2.301950
d -0.390208 -0.551605 -2.301950
e -2.000303 -0.788201 1.510072
f -0.930230 -0.670473 1.146615
g -0.930230 -0.670473 1.146615
h 0.085100 0.532791 0.887415
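The table above also lists backward filling, which the example does not show. A minimal sketch on a hypothetical one-column DataFrame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'one': [1.0, np.nan, 3.0, np.nan]},
                  index=['a', 'b', 'c', 'd'])

# Backward fill: each NaN takes the next valid value below it.
# The last row has nothing after it, so it stays NaN.
back = df.bfill()
print(back)

# Forward fill with limit=1: at most one consecutive NaN is filled per gap.
forward = df.ffill(limit=1)
print(forward)
```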
Drop Missing Values
If you simply want to exclude the missing values, use the dropna function along with
the axis argument. By default, axis=0, i.e., along rows, which means that if any value
within a row is NA then the whole row is excluded.

Example
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(5, 3), index=['a', 'c', 'e', 'f', 'h'],
                  columns=['one', 'two', 'three'])
df = df.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
print(df.dropna())
Its output is as follows −

one two three
a 0.077988 0.476149 0.965836
c -0.390208 -0.551605 -2.301950
e -2.000303 -0.788201 1.510072
f -0.930230 -0.670473 1.146615
h 0.085100 0.532791 0.887415
Replace Missing (or) Generic Values
We often have to replace a generic value with some specific value. We can achieve this
by applying the replace method.
Replacing NA with a scalar value behaves the same way as the fillna() function.

Example
import pandas as pd
import numpy as np
df = pd.DataFrame({'one':[10,20,30,40,50,2000],
'two':[1000,0,30,40,50,60]})
print(df.replace({1000: 10, 2000: 60}))
Its output is as follows −
one two
0 10 10
1 20 0
2 30 30
3 40 40
4 50 50
5 60 60
Filtering data:
Suppose we need the name, gender, and marks of the top-scoring students. Here we need
to remove some unwanted data.
# Filter top scoring students
df = df[df['Marks'] >= 75]
# Remove the Age column
df = df.drop(['Age'], axis=1)
# Display data
df
Output:

Merging Data:
The pandas library provides a single function, merge, as the entry point for all
standard database join operations between DataFrame objects.
pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None,
left_index=False, right_index=False, sort=False)
Let us now create two different DataFrames and perform merging operations on them.

# import the pandas library
import pandas as pd
left = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5']})
right = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5']})
print(left)
print(right)
Its output is as follows −
Name id subject_id
0 Alex 1 sub1
1 Amy 2 sub2
2 Allen 3 sub4
3 Alice 4 sub6
4 Ayoung 5 sub5

Name id subject_id
0 Billy 1 sub2
1 Brian 2 sub4
2 Bran 3 sub3
3 Bryce 4 sub6
4 Betty 5 sub5
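The two DataFrames are created above but never actually merged. The following sketch shows a few merges on the same data; the result variable names (by_id, both, left_join) are chosen here for illustration.

```python
import pandas as pd

left = pd.DataFrame({'id': [1, 2, 3, 4, 5],
                     'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
                     'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5']})
right = pd.DataFrame({'id': [1, 2, 3, 4, 5],
                      'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
                      'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5']})

# Inner join on one key: the other overlapping columns get _x / _y suffixes.
by_id = pd.merge(left, right, on='id')
print(by_id.columns.tolist())

# Inner join on two keys keeps only rows where both id and subject_id agree.
both = pd.merge(left, right, on=['id', 'subject_id'])
print(both)

# A left join on subject_id keeps every row of `left`; unmatched rows get NaN.
left_join = pd.merge(left, right, on='subject_id', how='left')
print(left_join[['Name_x', 'subject_id', 'Name_y']])
```

The how argument also accepts 'right' and 'outer', which keep the unmatched rows of the right table or of both tables respectively.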
Reshaping data: in the Gender column, we can reshape the data by mapping the categories
to numbers.

# Categorize gender
df['Gender'] = df['Gender'].map({'M': 0, 'F': 1}).astype(float)
# Display data
df

Output:

Grouping Data:
Grouping data sets is a frequent need in data analysis, where we want results for each of
the groups present in the data set. Pandas has built-in methods that can roll the data up
into groups.
In the example below, we group the data by year and then get the result for a specific year.

# import the pandas library


import pandas as pd
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings',
'kings', 'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
'Rank': [1, 2, 2, 3, 3,4 ,1 ,1,2 , 4,1,2],
'Year': [2014,2015,2014,2015,2014,2015,2016,2017,2016,2014,2015,2017],
'Points':[876,789,863,673,741,812,756,788,694,701,804,690]}
df = pd.DataFrame(ipl_data)
grouped = df.groupby('Year')
print(grouped.get_group(2014))
Its output is as follows −
Points Rank Team Year
0 876 1 Riders 2014
2 863 2 Devils 2014
4 741 3 Kings 2014
9 701 4 Royals 2014
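Selecting one group is only the first step; grouping is usually followed by an aggregation per group. A short sketch on a reduced, hypothetical version of the data:

```python
import pandas as pd

df = pd.DataFrame({'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'Kings'],
                   'Year': [2014, 2015, 2014, 2015, 2014, 2016],
                   'Points': [876, 789, 863, 673, 741, 756]})

# One aggregate per group: mean points for each year.
mean_points = df.groupby('Year')['Points'].mean()
print(mean_points)

# Several aggregates at once: one row per team, one column per function.
summary = df.groupby('Team')['Points'].agg(['sum', 'max'])
print(summary)
```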
Concatenating Data:
Pandas provides various facilities for easily combining Series and DataFrame objects
(the Panel type mentioned in older tutorials has been removed from pandas). In the
example below, the concat function performs concatenation along an axis. Let us create
different objects and concatenate them.

import pandas as pd
one = pd.DataFrame({
    'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
    'Marks_scored': [98, 90, 87, 69, 78]},
    index=[1, 2, 3, 4, 5])
two = pd.DataFrame({
    'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
    'Marks_scored': [89, 80, 79, 97, 88]},
    index=[1, 2, 3, 4, 5])
print(pd.concat([one, two]))
Its output is as follows −
Marks_scored Name subject_id
1 98 Alex sub1
2 90 Amy sub2
3 87 Allen sub4
4 69 Alice sub6
5 78 Ayoung sub5
1 89 Billy sub2
2 80 Brian sub4
3 79 Bran sub3
4 97 Bryce sub6
5 88 Betty sub5
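Note that the row labels 1 to 5 appear twice in the output above. Two common ways of dealing with that are sketched below on shortened, hypothetical versions of the two frames:

```python
import pandas as pd

one = pd.DataFrame({'Name': ['Alex', 'Amy'], 'Marks_scored': [98, 90]}, index=[1, 2])
two = pd.DataFrame({'Name': ['Billy', 'Brian'], 'Marks_scored': [89, 80]}, index=[1, 2])

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([one, two], ignore_index=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3]

# keys=... keeps the old labels but adds an outer level naming the source.
labelled = pd.concat([one, two], keys=['x', 'y'])
print(labelled.loc['y'])
```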

Data Aggregation
Python has several methods available for performing aggregations on data. They are
provided by the pandas and NumPy libraries. The data must be available as, or converted
to, a DataFrame before the aggregation functions can be applied.
Applying Aggregations on DataFrame
Let us create a DataFrame and apply aggregations on it.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10, 4),
index = pd.date_range('1/1/2000', periods=10),
columns = ['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
print(r)
Its output is as follows −
A B C D
2000-01-01 1.088512 -0.650942 -2.547450 -0.566858
2000-01-02 0.790670 -0.387854 -0.668132 0.267283
2000-01-03 -0.575523 -0.965025 0.060427 -2.179780
2000-01-04 1.669653 1.211759 -0.254695 1.429166
2000-01-05 0.100568 -0.236184 0.491646 -0.466081
2000-01-06 0.155172 0.992975 -1.205134 0.320958
2000-01-07 0.309468 -0.724053 -1.412446 0.627919
2000-01-08 0.099489 -1.028040 0.163206 -1.274331
2000-01-09 1.639500 -0.068443 0.714008 -0.565969
2000-01-10 0.326761 1.479841 0.664282 -1.361169

Rolling [window=3,min_periods=1,center=False,axis=0]
We can aggregate by passing a function to the entire DataFrame, or select a column via the
standard get item method.

Apply Aggregation on a Whole Dataframe


import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10, 4),
index = pd.date_range('1/1/2000', periods=10),
columns = ['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
print(r.aggregate(np.sum))
Its output is as follows −

A B C D
2000-01-01 1.088512 -0.650942 -2.547450 -0.566858
2000-01-02 0.790670 -0.387854 -0.668132 0.267283
2000-01-03 -0.575523 -0.965025 0.060427 -2.179780
2000-01-04 1.669653 1.211759 -0.254695 1.429166
2000-01-05 0.100568 -0.236184 0.491646 -0.466081
2000-01-06 0.155172 0.992975 -1.205134 0.320958
2000-01-07 0.309468 -0.724053 -1.412446 0.627919
2000-01-08 0.099489 -1.028040 0.163206 -1.274331
2000-01-09 1.639500 -0.068443 0.714008 -0.565969
2000-01-10 0.326761 1.479841 0.664282 -1.361169

A B C D
2000-01-01 1.088512 -0.650942 -2.547450 -0.566858
2000-01-02 1.879182 -1.038796 -3.215581 -0.299575
2000-01-03 1.303660 -2.003821 -3.155154 -2.479355
2000-01-04 1.884801 -0.141119 -0.862400 -0.483331
2000-01-05 1.194699 0.010551 0.297378 -1.216695
2000-01-06 1.925393 1.968551 -0.968183 1.284044
2000-01-07 0.565208 0.032738 -2.125934 0.482797
2000-01-08 0.564129 -0.759118 -2.454374 -0.325454
2000-01-09 2.048458 -1.820537 -0.535232 -1.212381
2000-01-10 2.065750 0.383357 1.541496 -3.201469

Apply Aggregation on a Single Column of a Dataframe


import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10, 4),
                  index=pd.date_range('1/1/2000', periods=10),
                  columns=['A', 'B', 'C', 'D'])
print(df)
r = df.rolling(window=3, min_periods=1)
print(r['A'].aggregate(np.sum))
Its output is as follows −
A B C D
2000-01-01 1.088512 -0.650942 -2.547450 -0.566858
2000-01-02 0.790670 -0.387854 -0.668132 0.267283
2000-01-03 -0.575523 -0.965025 0.060427 -2.179780
2000-01-04 1.669653 1.211759 -0.254695 1.429166
2000-01-05 0.100568 -0.236184 0.491646 -0.466081
2000-01-06 0.155172 0.992975 -1.205134 0.320958
2000-01-07 0.309468 -0.724053 -1.412446 0.627919
2000-01-08 0.099489 -1.028040 0.163206 -1.274331
2000-01-09 1.639500 -0.068443 0.714008 -0.565969
2000-01-10 0.326761 1.479841 0.664282 -1.361169
2000-01-01 1.088512
2000-01-02 1.879182
2000-01-03 1.303660
2000-01-04 1.884801
2000-01-05 1.194699
2000-01-06 1.925393
2000-01-07 0.565208
2000-01-08 0.564129
2000-01-09 2.048458
2000-01-10 2.065750
Freq: D, Name: A, dtype: float64
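More than one aggregation can be applied to the same rolling window in a single call. The sketch below uses a tiny deterministic column instead of random data so that the numbers can be checked by hand:

```python
import pandas as pd

df = pd.DataFrame({'A': [1.0, 2.0, 3.0, 4.0]},
                  index=pd.date_range('1/1/2000', periods=4))
r = df.rolling(window=3, min_periods=1)

# Passing a list applies every function to each window of column A.
# Windows: [1], [1, 2], [1, 2, 3], [2, 3, 4]
out = r['A'].aggregate(['sum', 'mean'])
print(out)
```

The result is a DataFrame with one column per function; the first two rows use shorter windows because min_periods=1.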

UNIT-6 Data Visualization
Matplotlib:
Matplotlib is one of the most popular Python packages used for data visualization. It is a
cross-platform library for making 2D plots from data in arrays. Matplotlib is written in
Python and makes use of NumPy, the numerical mathematics extension of Python. It
provides an object-oriented API that helps in embedding plots in applications using Python
GUI toolkits such as PyQt, wxPython or Tkinter. It can also be used in Python and IPython
shells, Jupyter notebooks and web application servers.
Matplotlib has a procedural interface named Pylab, which is designed to resemble
MATLAB, a proprietary programming language developed by MathWorks. Matplotlib
together with NumPy can be considered the open-source equivalent of MATLAB.
Matplotlib was originally written by John D. Hunter in 2003. At the time these notes were
written, the stable version was 2.2.0 (released in 2018); newer releases have appeared since.
Matplotlib - Environment Setup:
Matplotlib and its dependency packages are available in the form of wheel packages on the
standard Python package repositories and can be installed on Windows, Linux as well as
MacOS systems using the pip package manager.
pip3 install matplotlib
In case Python 2.7 or 3.4 is not installed for all users, the Microsoft Visual C++
2008 (64 bit or 32 bit for Python 2.7) or Microsoft Visual C++ 2010 (64 bit or 32 bit for
Python 3.4) redistributable packages need to be installed.
If you are using Python 2.7 on a Mac, execute the following command −

xcode-select --install
When this command is executed, subprocess32, a dependency, may be compiled.
On extremely old versions of Linux and Python 2.7, you may need to install the master
version of subprocess32.
Matplotlib requires a large number of dependencies −

 Python (>= 2.7 or >= 3.4)
 NumPy
 setuptools
 dateutil
 pyparsing
 libpng
 pytz
 FreeType
 cycler
 six
Matplotlib - Pyplot API:

A new untitled notebook with the .ipynb extension (short for IPython notebook) is
displayed in a new tab of the browser.

Pyplot is a Matplotlib module which provides a MATLAB-like interface. Matplotlib is
designed to be as usable as MATLAB, with the ability to use Python and the advantage of
being free and open-source. Each pyplot function makes some change to a figure: e.g.,
creates a figure, creates a plotting area in a figure, plots some lines in a plotting area,
decorates the plot with labels, etc. The plots we can create using Pyplot include Line
Plot, Histogram, Scatter, 3D Plot, Image, Contour, and Polar.
Syntax :
matplotlib.pyplot.plot(*args, scalex=True, scaley=True, data=None, **kwargs)
Types of Plots
Sr.No  Function   Description
1      bar        Make a bar plot.
2      barh       Make a horizontal bar plot.
3      boxplot    Make a box and whisker plot.
4      hist       Plot a histogram.
5      hist2d     Make a 2D histogram plot.
6      pie        Plot a pie chart.
7      plot       Plot lines and/or markers to the Axes.
8      polar      Make a polar plot.
9      scatter    Make a scatter plot of x vs y.
10     stackplot  Draw a stacked area plot.
11     stem       Create a stem plot.
12     step       Make a step plot.
13     quiver     Plot a 2-D field of arrows.

Image Functions
Sr.No  Function  Description
1      imread    Read an image from a file into an array.
2      imsave    Save an array as an image file.
3      imshow    Display an image on the axes.

Axis Functions
Sr.No  Function  Description
1      axes      Add an Axes to the figure.
2      text      Add text to the axes.
3      title     Set a title for the current axes.
4      xlabel    Set the x-axis label of the current axes.
5      xlim      Get or set the x-limits of the current axes.
6      xscale    Set the scaling of the x-axis.
7      xticks    Get or set the current tick locations and labels of the x-axis.
8      ylabel    Set the y-axis label of the current axes.
9      ylim      Get or set the y-limits of the current axes.
10     yscale    Set the scaling of the y-axis.
11     yticks    Get or set the current tick locations and labels of the y-axis.

Figure Functions
Sr.No  Function  Description
1      figtext   Add text to the figure.
2      figure    Create a new figure.
3      show      Display a figure.
4      savefig   Save the current figure.
5      close     Close a figure window.

Pyplot

Most of the Matplotlib utilities lie under the pyplot submodule and are usually imported
under the plt alias:

import matplotlib.pyplot as plt

Now the Pyplot package can be referred to as plt.
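To show how several of the pyplot functions fit together, here is a small sketch that creates a figure, decorates it, and saves it. It writes the image into an in-memory buffer rather than calling plt.show(), which is an assumption made so that it also runs without a display; the data and labels are arbitrary.

```python
import io

import matplotlib.pyplot as plt

fig = plt.figure()                      # figure(): create a new figure
plt.plot([0, 1, 2, 3], [0, 1, 4, 9])    # plot(): draw a line through the points
plt.title("Squares")                    # title(): title of the current axes
plt.xlabel("x")                         # xlabel(): label of the x-axis
plt.ylabel("x squared")                 # ylabel(): label of the y-axis
plt.xlim(0, 3)                          # xlim(): set the x-limits

buffer = io.BytesIO()                   # in-memory target instead of a file on disk
plt.savefig(buffer, format="png")       # savefig(): save the current figure
plt.close(fig)                          # close(): release the figure
print(buffer.getbuffer().nbytes > 0)    # True when image data was written
```

In an interactive session, plt.show() would normally be called in place of the savefig step.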

Code:
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([0, 6])
ypoints = np.array([0, 250])
plt.plot(xpoints, ypoints)
plt.show()

Result:

Matplotlib Plotting:
Plotting x and y points
The plot() function is used to draw points (markers) in a diagram.
By default, the plot() function draws a line from point to point.
The function takes parameters for specifying points in the diagram.
Parameter 1 is an array containing the points on the x-axis.
Parameter 2 is an array containing the points on the y-axis.

If we need to plot a line from (1, 1) to (5, 7), we have to pass two arrays [1, 5] and [1, 7] to
the plot function.
Code:
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([1, 5])
ypoints = np.array([1, 7])
plt.plot(xpoints, ypoints)
plt.show()

Result:

The x-axis is the horizontal axis.


The y-axis is the vertical axis.
Plotting Without Line:
To plot only the markers, you can use the shortcut string notation parameter 'o', which means
'rings' (circle markers).
Code:
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([1, 5])
ypoints = np.array([3, 7])
plt.plot(xpoints, ypoints, 'o')
plt.show()

Result:

Multiple Points:
You can plot as many points as you like; just make sure you have the same number of points
on both axes.

Code:
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([1, 3, 6, 9])
ypoints = np.array([2, 8, 2, 8])
plt.plot(xpoints, ypoints)
plt.show()

Result:

Default X-Points:
If we do not specify the points on the x-axis, they will get the default values 0, 1, 2, 3, etc.,
depending on the length of the y-points.
So, if we pass only the y-points and leave out the x-points, the diagram will look like
this:
Example
Plotting without x-points:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2, 4, 3, 6, 8])
plt.plot(ypoints)
plt.show()
Result:

Markers:
You can use the keyword argument marker to emphasize each point with a specified marker.
Example
Mark each point with a star:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2, 5, 1, 8])
plt.plot(ypoints, marker = '*')
plt.show()
Result:

Marker Reference
You can choose any of these markers:
Marker  Description
'o'     Circle
'*'     Star
'.'     Point
','     Pixel
'x'     X
'X'     X (filled)
'+'     Plus
'P'     Plus (filled)
's'     Square
'D'     Diamond
'd'     Diamond (thin)
'p'     Pentagon
'H'     Hexagon
'h'     Hexagon
'v'     Triangle Down
'^'     Triangle Up
'<'     Triangle Left
'>'     Triangle Right
'1'     Tri Down
'2'     Tri Up
'3'     Tri Left
'4'     Tri Right
'|'     Vline
'_'     Hline
Format Strings fmt:
You can also use the shortcut string notation parameter to specify the marker.
This parameter is also called fmt, and is written with this syntax:
marker|line|color
Example
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, '*-.g')
plt.show()
Result:

The marker value can be anything from the Marker Reference above.
The line value can be one of the following:
Line Reference
Line Syntax  Description
'-'          Solid line
':'          Dotted line
'--'         Dashed line
'-.'         Dashed/dotted line


Note: If you leave out the line value in the fmt parameter, no line will be plotted.

The short color value can be one of the following:

Color Reference

Color Syntax Description

'r' Red

'g' Green

'b' Blue

'c' Cyan

'm' Magenta

'y' Yellow

'k' Black

'w' White
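
To make the fmt shortcut concrete, the sketch below draws the same kind of line twice: once with the format string '*-.g' and once with the marker, line style and color spelled out as keyword arguments. It is an illustration under the assumption that both forms are meant to be interchangeable; the variable names short and spelled are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import numpy as np

ypoints = np.array([3, 8, 1, 10])

fig, ax = plt.subplots()
# fmt string: star marker, dashed/dotted line, green
(short,) = ax.plot(ypoints, '*-.g')
# the same settings written as keyword arguments (shifted up by 1 to stay visible)
(spelled,) = ax.plot(ypoints + 1, marker='*', linestyle='-.', color='g')

for line in (short, spelled):
    print(line.get_marker(), line.get_linestyle(), mcolors.to_rgba(line.get_color()))
```

The keyword form is longer but easier to read when a plot uses many style settings.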

Marker Size:
You can use the keyword argument markersize or the shorter version, ms to set the size of the
markers:
Example
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2, 8, 1, 6])
plt.plot(ypoints, marker = '*', ms = 23)
plt.show()
Result:

Marker Color:
You can use the keyword argument markeredgecolor or the shorter mec to set the color of
the edge of the markers:
Example
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 2, 6])
plt.plot(ypoints, marker = '*', ms = 25, mec = 'r')
plt.show()

Result:

Matplotlib Line:
Linestyle
You can use the keyword argument linestyle, or shorter ls, to change the style of the plotted
line:
Example
Use a dotted line:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2, 8, 1, 5])
plt.plot(ypoints, linestyle = 'dotted')
plt.show()
Result:

Example

Use a dashed line:

import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([2,6, 1, 4])
plt.plot(ypoints, linestyle = 'dashed')
plt.show()
Result:

Line Styles
You can choose any of these styles:
Style Or

'solid' (default) '-'

'dotted' ':'

'dashed' '--'

'dashdot' '-.'

'None' '' or ' '


Line Color
You can use the keyword argument color or the shorter c to set the color of the line:
Example
Set the line color to hotpink:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([2, 5, 1, 10])
plt.plot(ypoints, c = 'hotpink')
plt.show()
Result:

Line Width:
You can use the keyword argument linewidth or the shorter lw to change the width of the
line.
The value is a floating-point number, in points:
Example
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, linewidth = 26.5)
plt.show()
Result:
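
The short names ls, lw and c can be combined in a single call. The sketch below is only an illustration of that combination, with arbitrarily chosen values, and reads the settings back from the line object to show what was applied.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

fig, ax = plt.subplots()
# ls = linestyle, lw = linewidth (in points), c = color
(line,) = ax.plot(ypoints, ls=':', lw=2.5, c='hotpink')

print(line.get_linestyle())   # style of the line
print(line.get_linewidth())   # width in points
print(line.get_color())       # color as it was passed in
```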

Multiple Lines:
You can plot as many lines as you like by simply adding more plt.plot() functions:
Example
Draw two lines by specifying a plt.plot() function for each line:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
y1 = np.array([1, 8, 1, 6])
y2 = np.array([3, 2, 6, 9])

plt.plot(y1)
plt.plot(y2)
plt.show()
Result:
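
As an alternative to calling plot() once per line, plot() also accepts several x/y pairs in one call. The sketch below assumes explicit x-points (x1 and x2, which are not in the example above) and reuses the same y-values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# x1 and x2 are illustrative x-points; y1 and y2 are the values used above
x1 = np.array([0, 1, 2, 3])
y1 = np.array([1, 8, 1, 6])
x2 = np.array([0, 1, 2, 3])
y2 = np.array([3, 2, 6, 9])

fig, ax = plt.subplots()
lines = ax.plot(x1, y1, x2, y2)   # one call, two x/y pairs, two lines

print(len(lines))
```

When the x-points are left out, each line needs its own plot() call, as in the example above.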

Matplotlib Subplot:
Display Multiple Plots
With the subplot() function you can draw multiple plots in one figure:
Example
Draw 2 plots:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([2, 8, 4, 6])
plt.subplot(1, 2, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 15, 11, 24])
plt.subplot(1, 2, 2)
plt.plot(x,y)
plt.show()
Result:

The subplot() Function:
The subplot() function takes three arguments that describe the layout of the figure.
The layout is organized in rows and columns, which are represented by
the first and second argument.
The third argument represents the index of the current plot.
plt.subplot(1, 2, 1)
#the figure has 1 row, 2 columns, and this plot is the first plot.
plt.subplot(1, 2, 2)
#the figure has 1 row, 2 columns, and this plot is the second plot.
So, if we want a figure with 2 rows and 1 column (meaning that the two plots will be
displayed on top of each other instead of side-by-side), we can write the syntax like this:
Example:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([2, 8, 4, 6])
plt.subplot(2, 1, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 15, 11, 24])
plt.subplot(2, 1, 2)
plt.plot(x,y)
plt.show()

Result:

You can draw as many plots as you like on one figure; just describe the number of rows, columns,
and the index of the plot.

Example
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array([0, 1, 2, 3])
y = np.array([3, 5, 1, 8])
plt.subplot(2, 3, 1)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([10, 12, 30, 15])
plt.subplot(2, 3, 2)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(2, 3, 3)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(2, 3, 4)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(2, 3, 5)
plt.plot(x,y)
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(2, 3, 6)
plt.plot(x,y)
plt.show()
Result:
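
Because only the index and the y-values change between the six plots, the same figure can be built with a loop. This is a sketch of one possible way to shorten the example, not the method used in the notes; the list name all_y is illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.array([0, 1, 2, 3])
# one y-array per plot, in the order the plots appear in the grid
all_y = [
    np.array([3, 5, 1, 8]),
    np.array([10, 12, 30, 15]),
    np.array([3, 8, 1, 10]),
    np.array([10, 20, 30, 40]),
    np.array([3, 8, 1, 10]),
    np.array([10, 20, 30, 40]),
]

fig = plt.figure()
for index, y in enumerate(all_y, start=1):   # subplot indexes start at 1
    plt.subplot(2, 3, index)                 # 2 rows, 3 columns
    plt.plot(x, y)

print(len(fig.axes))
```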

Mr. D.Gangadhar
Associate Professor
Exploring plot types-Scatter plots:
Creating Scatter Plots
With Pyplot, you can use the scatter() function to draw a scatter plot.
The scatter() function plots one dot for each observation. It needs two arrays of the same
length, one for the values of the x-axis, and one for values on the y-axis:
Example
A simple scatter plot:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
plt.show()
Result:

The observation in the example above is the result of 13 cars passing by.
The X-axis shows how old the car is.
The Y-axis shows the speed of the car when it passes.
Are there any relationships between the observations?
It seems that the newer the car, the faster it drives, but that could be a coincidence; after all,
we only registered 13 cars.
Compare Plots
In the example above, there seems to be a relationship between speed and age, but what if
we plot the observations from another day as well? Will the scatter plot tell us something
else?
Example
Draw two plots on the same figure:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
#day one, the age and speed of 13 cars:
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
#day two, the age and speed of 15 cars:
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y)
plt.show()

Result:

By comparing the two plots, it seems safe to say that they both give us the same
conclusion: the newer the car, the faster it drives.
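
The visual impression can be backed up with a number. The sketch below is an addition to the notes: it uses NumPy's corrcoef() to compute the correlation between age and speed for each day. A negative value means that speed tends to fall as age rises. The variable names are illustrative.

```python
import numpy as np

# day one, the age and speed of 13 cars
age_day1 = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
speed_day1 = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])

# day two, the age and speed of 15 cars
age_day2 = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
speed_day2 = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])

# corrcoef() returns a 2x2 matrix; the off-diagonal entry is the correlation
r_day1 = np.corrcoef(age_day1, speed_day1)[0, 1]
r_day2 = np.corrcoef(age_day2, speed_day2)[0, 1]

print(r_day1, r_day2)
```

With so few observations, a correlation like this suggests a trend but does not prove one.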
Colors
You can set your own color for each scatter plot with the color or the c argument:
Example
Set your own color of the markers:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y, color = 'Red')
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y, color = '#88c999')
plt.show()
Result:

Color Each Dot
You can even set a specific color for each dot by using an array of colors as value for
the c argument:
Example:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array(["red","green","blue","yellow","pink","black","orange","purple",
                   "beige","brown","gray","cyan","magenta"])
plt.scatter(x, y, c=colors)
plt.show()
Result:

ColorMap:
The Matplotlib module has a number of available colormaps. A colormap is like a list of
colors, where each color has a value that ranges from 0 to 100.
Here is an example of a colormap:
This colormap is called 'viridis' and as you can see it ranges from 0, which is a purple color,
and up to 100, which is a yellow color.
How to Use the ColorMap
You can specify the colormap with the keyword argument cmap with the value of the
colormap, in this case 'viridis' which is one of the built-in colormaps available in Matplotlib.
In addition, you have to create an array with values (from 0 to 100), one value for each
point in the scatter plot:
Example
Create a color array, and specify a colormap in the scatter plot:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.show()
Result:

You can include the colormap in the drawing by including the plt.colorbar() statement:
Example:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()
plt.show()
Result:
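
It may help to see what the colormap does with the values. Strictly speaking, Matplotlib scales whatever range the c values cover onto the colormap, so 0 to 100 is simply the range used in this example. The sketch below looks up both ends of 'viridis' and reads back the value range that the scatter plot detected. The names low, high, vmin and vmax are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])

cmap = plt.get_cmap('viridis')
low = cmap(0.0)    # RGBA tuple at the bottom of the colormap (purple)
high = cmap(1.0)   # RGBA tuple at the top of the colormap (yellow)

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=colors, cmap='viridis')
vmin, vmax = points.get_clim()   # smallest and largest value in the color array
fig.colorbar(points, ax=ax)

print(low, high)
print(vmin, vmax)
```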

Available ColorMaps
You can choose any of the built-in colormaps:
Name Reverse

Accent Accent_r

Blues Blues_r

BrBG BrBG_r

BuGn BuGn_r

BuPu BuPu_r

CMRmap CMRmap_r

Dark2 Dark2_r

GnBu GnBu_r

Greens Greens_r

Greys Greys_r

OrRd OrRd_r

Oranges Oranges_r

PRGn PRGn_r

Paired Paired_r

Pastel1 Pastel1_r

Pastel2 Pastel2_r

PiYG PiYG_r

PuBu PuBu_r

PuBuGn PuBuGn_r

PuOr PuOr_r

PuRd PuRd_r

Purples Purples_r

RdBu RdBu_r

RdGy RdGy_r

RdPu RdPu_r

RdYlBu RdYlBu_r

RdYlGn RdYlGn_r

Reds Reds_r

Set1 Set1_r

Set2 Set2_r

Set3 Set3_r

Spectral Spectral_r

Wistia Wistia_r

YlGn YlGn_r

YlGnBu YlGnBu_r

YlOrBr YlOrBr_r

YlOrRd YlOrRd_r

afmhot afmhot_r

autumn autumn_r

binary binary_r

bone bone_r

brg brg_r

bwr bwr_r

cividis cividis_r

cool cool_r

coolwarm coolwarm_r

copper copper_r

cubehelix cubehelix_r

flag flag_r

gist_earth gist_earth_r

gist_gray gist_gray_r

gist_heat gist_heat_r

gist_ncar gist_ncar_r

gist_rainbow gist_rainbow_r

gist_stern gist_stern_r

gist_yarg gist_yarg_r

gnuplot gnuplot_r

gnuplot2 gnuplot2_r

gray gray_r

hot hot_r

hsv hsv_r

inferno inferno_r

jet jet_r

magma magma_r

nipy_spectral nipy_spectral_r

ocean ocean_r

pink pink_r

plasma plasma_r

prism prism_r

rainbow rainbow_r

seismic seismic_r

spring spring_r

summer summer_r

tab10 tab10_r

tab20 tab20_r

tab20b tab20b_r

tab20c tab20c_r

terrain terrain_r

twilight twilight_r

twilight_shifted twilight_shifted_r

viridis viridis_r

winter winter_r
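
The table above may not match every Matplotlib version exactly, so it can be useful to list the colormaps that the installed version provides. The sketch below uses plt.colormaps() for this; the names names and reversed_names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

names = plt.colormaps()                                   # all registered colormap names
reversed_names = [n for n in names if n.endswith('_r')]   # the reversed variants

print(len(names), "colormaps available")
print('viridis' in names, 'viridis_r' in names)
```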
Size
You can change the size of the dots with the s argument.
Just like colors, make sure the array for sizes has the same length as the arrays for the x- and
y-axis:
Example
Set your own size for the markers:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])
plt.scatter(x, y, s=sizes)
plt.show()
Result:

Combine Color Size and Alpha


You can combine a colormap with different sizes on the dots. This is best visualized if the
dots are transparent:
Example
Create random arrays with 100 values for x-points, y-points, colors and sizes:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.random.randint(100, size=(100))
y = np.random.randint(100, size=(100))
colors = np.random.randint(100, size=(100))
sizes = 10 * np.random.randint(100, size=(100))
plt.scatter(x, y, c=colors, s=sizes, alpha=0.5, cmap='nipy_spectral')
plt.colorbar()
plt.show()
Result:

Matplotlib Bars:
Creating Bars
With Pyplot, you can use the bar() function to draw bar graphs:
Example
Draw 4 bars:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([6, 2, 7, 4])
plt.bar(x,y)
plt.show()
Result:

Horizontal Bars
If you want the bars to be displayed horizontally instead of vertically, use
the barh() function:
Example
Draw 4 horizontal bars:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.barh(x, y)
plt.show()
Result:

Bar Color:
The bar() and barh() functions take the keyword argument color to set the color of the bars:
Example
Draw 4 hot pink bars:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x, y, color = "hotpink")
plt.show()
Result:

Bar Width:
The bar() function takes the keyword argument width to set the width of the bars:
Example
Draw 4 very thin bars:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x, y, width = 0.3)
plt.show()
Result:

Bar Height:
The barh() function takes the keyword argument height to set the height of the bars:
Example
Draw 4 very thin bars:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.barh(x, y, height = 0.1)
plt.show()
Result:
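
The width and height arguments play the same role for the two bar types: width sets the thickness of vertical bars, and height sets the thickness of horizontal bars. The sketch below draws both side by side and reads the sizes back from the bars. It uses plt.subplots(), which is not covered in the notes, purely to place the two plots in one figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])

fig, (ax_v, ax_h) = plt.subplots(1, 2)
vertical = ax_v.bar(x, y, width=0.3, color="hotpink")       # thickness set by width
horizontal = ax_h.barh(x, y, height=0.1, color="hotpink")   # thickness set by height

print([bar.get_width() for bar in vertical])      # thickness of each vertical bar
print([bar.get_height() for bar in horizontal])   # thickness of each horizontal bar
```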

Matplotlib Histograms:
Histogram
A histogram is a graph showing frequency distributions. It is a graph showing the number of
observations within each given interval.
Create Histogram
In Matplotlib, we use the hist() function to create histograms.
The hist() function uses an array of numbers to create a histogram; the array is sent into
the function as an argument.
For simplicity we use NumPy to randomly generate an array with 250 values, where the
values will concentrate around 170, and the standard deviation is 10.
Example
A Normal Data Distribution by NumPy:
import numpy as np
x = np.random.normal(170, 10, 250)
print(x)
Result:
[167.62255766 175.32495609 152.84661337 165.50264047 163.17457988
162.29867872 172.83638413 168.67303667 164.57361342 180.81120541
170.57782187 167.53075749 176.15356275 176.95378312 158.4125473
187.8842668 159.03730075 166.69284332 160.73882029 152.22378865
164.01255164 163.95288674 176.58146832 173.19849526 169.40206527
166.88861903 149.90348576 148.39039643 177.90349066 166.72462233
1776004 170.93335636 173.26312881 174.76534435 162.28791953
166.77301551 160.53785202 170.67972019 159.11594186 165.36992993
178.38979253 171.52158489 173.32636678 159.63894401 151.95735707
175.71274153 165.00458544 164.80607211 177.50988211 149.28106703
179.43586267 181.98365273 170.98196794 179.1093176 176.91855744
168.32092784 162.33939782 165.18364866 160.52300507 174.14316386
163.01947601 172.01767945 173.33491959 169.75842718 198.04834503
192.82490521 164.54557943 206.36247244 165.47748898 195.26377975
164.37569092 156.15175531 162.15564208 179.34100362 167.22138242
147.23667125 162.86940215 167.84986671 172.99302505 166.77279814
196.6137667 159.79012341 166.5840824 170.68645637 165.62204521
174.5559345 165.0079216 187.92545129 166.86186393 179.78383824
161.0973573 167.44890343 157.38075812 151.35412246 171.3107829
162.57149341 182.49985133 163.24700057 168.72639903 169.05309467
167.19232875 161.06405208 176.87667712 165.48750185 179.68799986
158.7913483 170.22465411 182.66432721 173.5675715 176.85646836
157.31299754 174.88959677 183.78323508 174.36814558 182.55474697
180.03359793 180.53094948 161.09560099 172.29179934 161.22665588
171.88382477 159.04626132 169.43886536 163.75793589 157.73710983
174.68921523 176.19843414 167.39315397 181.17128255 174.2674597
186.05053154 177.06516302 171.78523683 166.14875436 163.31607668
174.01429569 194.98819875 169.75129209 164.25748789 180.25773528
170.44784934 157.81966006 171.33315907 174.71390637 160.55423274
163.92896899 177.29159542 168.30674234 165.42853878 176.46256226
162.61719142 166.60810831 165.83648812 184.83238352 188.99833856
161.3054697 175.30396693 175.28109026 171.54765201 162.08762813
164.53011089 189.86213299 170.83784593 163.25869004 198.68079225
166.95154328 152.03381334 152.25444225 149.75522816 161.79200594
162.13535052 183.37298831 165.40405341 155.59224806 172.68678385
179.35359654 174.19668349 163.46176882 168.26621173 162.97527574
192.80170974 151.29673582 178.65251432 163.17266558 165.11172588
183.11107905 169.69556831 166.35149789 178.74419135 166.28562032
169.96465166 178.24368042 175.3035525 170.16496554 158.80682882
187.10006553 178.90542991 171.65790645 183.19289193 168.17446717
155.84544031 177.96091745 186.28887898 187.89867406 163.26716924
169.71242393 152.9410412 158.68101969 171.12655559 178.1482624
187.45272185 173.02872935 163.8047623 169.95676819 179.36887054
157.01955088 185.58143864 170.19037101 157.221245 168.90639755
178.7045601 168.64074373 172.37416382 165.61890535 163.40873027
168.98683006 149.48186389 172.20815568 172.82947206 173.71584064
189.42642762 172.79575803 177.00005573 169.24498561 171.55576698
161.36400372 176.47928342 163.02642822 165.09656415 186.70951892
153.27990317 165.59289527 180.34566865 189.19506385 183.10723435
173.48070474 170.28701875 157.24642079 157.9096498 176.4248199 ]

The hist() function will read the array and produce a histogram:
Example
A simple histogram:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
x = np.random.normal(170, 10, 250)
plt.hist(x)
plt.show()
Result:
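
To see the numbers behind the bars, the sketch below captures what hist() returns: the number of observations in each interval and the edges of the intervals. It also compares them with NumPy's histogram() function. The seeded random generator is an assumption added here so that the sketch gives the same data on every run.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded so the data is repeatable
x = rng.normal(170, 10, 250)          # 250 values around 170, standard deviation 10

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(x)   # by default the data is split into 10 intervals

np_counts, np_edges = np.histogram(x) # the same counting done by NumPy alone

print(counts)   # observations per interval
print(edges)    # 11 edges delimit 10 intervals
```

The counts always add up to the number of values in the array, here 250.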

Legends and annotations:
Legends and annotations are effective tools for displaying the information needed to
comprehend a plot at a glance. A typical plot has the following additional information elements:
A legend describing the various data series in the plot. This is provided by invoking the
matplotlib legend() function and supplying the labels for each data series.
Annotations for important points in the plot. The matplotlib annotate() function can be used
for this purpose. A matplotlib annotation consists of a label and an arrow. This function has
many parameters describing the label and arrow style and position, so you may need to
call help(plt.annotate) for a detailed description.
Labels on the horizontal and vertical axes. These labels can be drawn by the xlabel() and
ylabel() functions. We need to give these functions the text of the labels as a string and
optional parameters such as the font size of the label.
A descriptive title for the graph, drawn with the matplotlib title() function.
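
The elements described above can be sketched together in one small plot. The data, label texts and arrow position below are made up for illustration; only legend(), annotate(), xlabel(), ylabel() and title() come from the text.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.array([0, 1, 2, 3])
y1 = np.array([1, 8, 1, 6])
y2 = np.array([3, 2, 6, 9])

fig = plt.figure()
plt.plot(x, y1, label="Series 1")   # the label text is used by the legend
plt.plot(x, y2, label="Series 2")

plt.xlabel("Sample index", fontsize=12)   # label on the horizontal axis
plt.ylabel("Value", fontsize=12)          # label on the vertical axis
plt.title("Two data series")              # descriptive title

# annotate the highest point of the second series with a label and an arrow
peak = int(np.argmax(y2))
plt.annotate("Highest value",
             xy=(x[peak], y2[peak]),      # point the arrow points to
             xytext=(1.0, 8.5),           # where the label text is placed
             arrowprops=dict(arrowstyle="->"))

legend = plt.legend()                     # one entry per labelled series
plt.show()
```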

Matplotlib Pie Charts:


Creating Pie Charts
With Pyplot, you can use the pie() function to draw pie charts:
Example
A simple pie chart:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
plt.pie(y)
plt.show()
Result:

Labels
Add labels to the pie chart with the labels parameter.
The labels parameter must be an array with one label for each wedge:
Example
A pie chart with labels:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.show()
Result:

Start Angle
By default, the first wedge starts at the x-axis, but you can change the start angle by
specifying a startangle parameter.
The startangle parameter is an angle in degrees; the default angle is 0:

Example
Start the first wedge at 90 degrees:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])

mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels, startangle = 90)
plt.show()
Result:
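
The effect of startangle can be checked on the wedge objects that pie() returns. The sketch below relies on the fact that each wedge covers a share of the 360 degrees proportional to its value (a point the notes do not spell out) and reads the angles back from the wedges. The names expected_spans and actual_spans are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

fig, ax = plt.subplots()
wedges = ax.pie(y, labels=mylabels, startangle=90)[0]   # first item: the wedges

# each wedge spans a share of 360 degrees proportional to its value
expected_spans = 360 * y / y.sum()
actual_spans = [w.theta2 - w.theta1 for w in wedges]

print(wedges[0].theta1)   # angle where the first wedge starts
print(actual_spans)       # angle covered by each wedge, in degrees
```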

Explode
Maybe you want one of the wedges to stand out? The explode parameter allows you to do
that. The explode parameter, if specified, and not None, must be an array with one value for
each wedge. Each value represents how far from the center each wedge is displayed:
Example
Pull the "Apples" wedge 0.2 from the center of the pie:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]
plt.pie(y, labels = mylabels, explode = myexplode)
plt.show()
Result:

Shadow
Add a shadow to the pie chart by setting the shadow parameter to True:
Example
Add a shadow:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]
plt.pie(y, labels = mylabels, explode = myexplode, shadow = True)
plt.show()
Result:

Colors
You can set the color of each wedge with the colors parameter. The colors parameter, if
specified, must be an array with one value for each wedge:
Example
Specify a new color for each wedge:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
mycolors = ["black", "hotpink", "b", "#4CAF50"]
plt.pie(y, labels = mylabels, colors = mycolors)
plt.show()
Result:

You can use Hexadecimal color values, any of the 140 supported color names, or one of these
shortcuts:
'r' - Red
'g' - Green
'b' - Blue
'c' - Cyan
'm' - Magenta
'y' - Yellow
'k' - Black
'w' - White
Legend
To add a list of explanations for each wedge, use the legend() function:
Example
Add a legend:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.legend()
plt.show()

Result:

Legend with Header


To add a header to the legend, add the title parameter to the legend function.
Example
Add a legend with a header:
import sys
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.legend(title = "Four Fruits:")
plt.show()

Result:
