Python Module 5 Important Topics

For more notes visit
https://rtpnotes.vercel.app
Python-Module-5-Important-Topics
1. Os Module
Creating a Directory
Changing current working directory
How to know what is my current working directory?
Removing a directory
List files and subdirectories
2. Sys Module
sys.argv
sys.exit
sys.maxsize
sys.path
sys.version
3. Numpy
Difference between python list and a NumPy array
ndarray Object
Features of ndarray
Simple example
Example - Creating Arrays
Program
Output
ndarray Object - Parameters
Example - ndarray Object Parameters
Program
Output
Arithmetic operations with NumPy Array
Basic Operations with Scalars
Program
Output
Arithmetic Operators in numpy
Trigonometry operations with NumPy Array
Comparison in NumPy
4. Pandas
What is Pandas?
Advantages
Series
Example of Series
Dataframe
What is Dataframe?
Basic operations of Dataframe
Creating a DataFrame
5. Matplotlib
What is Matplotlib?
Example - 1 -Sin Wave
Example - 2 - Creating a bar plot
Pylab
Example - 3 - Creating a Pieplot
Python-Module-5-University-Questions-Part-A
1. How do you assign a random number to a variable in Python?
2. What is the use of os module in python?
3. Write a Python code that checks to see, if a file with the given pathname exists on
the disk, before attempting to open a file for input
4. What is Flask in Python?
5. Explain the os and os.path modules in Python with examples. Also, discuss the
walk( ) and getcwd( ) methods of the os module
What is os and os.path?
Example of os
Example of os.path
os.walk()
os.getcwd()
6. What are the important characteristics of CSV file format.
What is CSV?
Characteristics of CSV File format
7. Write the output of the following python code:
8. What is the difference between loc and iloc in pandas DataFrame. Give a suitable
example
Difference between loc and iloc
Example
Using loc
Using iloc
9. Explain the attributes of an ndarray object.

Python-Module-5-University-Questions-Part-B
1. Explain how the matrix multiplications are done using numpy arrays.
2. How to plot two or more lines on a same plot with suitable legends, labels and title.
3. Consider a CSV file ‘employee.csv’ with the following columns(name, gender,
start_date, salary, team)
4. Write Python program to write the data given below to a CSV file
5. Consider the following two-dimensional array named arr2d
arr2d[:2]
arr2d[:2, 1:]
arr2d[1, :2]
arr2d[:2, 1:] = 0
6. Write a Python program to add two matrices and also find the transpose of the
resultant matrix.
7. Given a file “auto.csv” of automobile data with the fields index, company,body-
style, wheel- base, length, engine-type, num-of-cylinders, horsepower,average-
mileage, and price, write Python codes using Pandas to
8. Given the sales information of a company as CSV file with the following fields
month_number, facecream, facewash, toothpaste, bathingsoap, shampoo, moisturizer,
total_units, total_profit. Write Python codes to visualize the data as follows
9. Write a code segment that prints the names of all of the items in the current working
directory.
10. Write a python program to create two numpy arrays of random integers between
0 and 20 of shape (3, 3) and perform matrix addition, multiplication and transpose of
the product matrix.
11. Write Python program to write the data given below to a CSV file named
student.csv
12. Consider the above student.csv file with fields Name, Branch, Year, CGPA .
13. Consider a CSV file ‘weather.csv’ with the following columns (date,temperature,
humidity, windSpeed, precipitationType, place, weather {Rainy,Cloudy, Sunny}).
1. Os Module
OS Module in python provides functions for interacting with the OS
Creating a Directory
We can create a new directory using mkdir() function from OS Module
import os
os.mkdir("d:\\tempdir")
Changing current working directory
This is done using chdir
import os
os.chdir("d:\\tempdir")
How to know what is my current working directory?
You can use getcwd() to get the current working directory
Removing a directory
The rmdir() function removes the specified directory
import os
os.rmdir('d:\\samplefolder')
List files and subdirectories
The listdir() function returns the list of all files and directories in the specified directory
If we don't specify a directory, the list of files and directories in the current working
directory is returned
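The os functions above can be tried together in one short script. This is only a sketch: it works inside a throwaway folder created with tempfile (an assumption made so that nothing on your disk is disturbed), and the names tempdir and notes.txt are made up for the example.

```python
import os
import tempfile

# Work inside a throwaway directory so nothing on disk is disturbed.
start_dir = os.getcwd()
base = tempfile.mkdtemp()

os.chdir(base)                    # change the current working directory
os.mkdir("tempdir")               # create a subdirectory
open("notes.txt", "w").close()    # create an empty file

cwd_now = os.getcwd()             # path of the current working directory
entries = sorted(os.listdir())    # no argument: lists the current directory
print(entries)                    # ['notes.txt', 'tempdir']

# Clean up: rmdir() only removes empty directories.
os.rmdir("tempdir")
os.remove("notes.txt")
os.chdir(start_dir)
os.rmdir(base)
```

Note that rmdir() fails if the directory still contains files, which is why notes.txt is removed before the base folder.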
2. Sys Module
The sys module provides functions and variables used to manipulate different parts of the
python runtime environment
sys.argv
Returns a list of command line arguments passed to a python script. The item at
index 0 in this list is always the name of the script. The rest of the arguments are stored at
their subsequent indices
sys.exit
This causes the script to exit back to either the Python console or the command prompt.
It works by raising the SystemExit exception, so it is typically used to stop the program cleanly, for example after an error has been detected
sys.maxsize
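Gives the largest value that the interpreter's native size type (Py_ssize_t) can hold, which limits the size of lists, strings and other containers. It is usually 2**63 - 1 on a 64-bit platform. Python integers themselves are not limited to this value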
sys.path
This is a list of strings giving the search path for Python modules. It is initialised from the PYTHONPATH environment variable plus installation-dependent defaults
sys.version
This attribute displays a string containing the version number of the current python
interpreter
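A small sketch of the sys attributes described above. The printed values depend on your machine and on how the script is started, so the comments show typical values only.

```python
import sys

script_name = sys.argv[0] if sys.argv else ""   # name of the running script
extra_args = sys.argv[1:]                       # any further command-line arguments
print("Script:", script_name, "Arguments:", extra_args)

print(sys.maxsize)                   # 9223372036854775807 on a typical 64-bit build
print(sys.version.split()[0])        # version number only, e.g. 3.12.1
print(isinstance(sys.path, list))    # True: sys.path is an ordinary list

# Python integers are not limited to sys.maxsize.
big = sys.maxsize + 1
print(big > sys.maxsize)             # True

# sys.exit() works by raising SystemExit, which can be caught like any exception.
try:
    sys.exit(2)
except SystemExit as exc:
    exit_code = exc.code
print("exit code:", exit_code)       # exit code: 2
```

In a real script you would normally not catch SystemExit; it is caught here only so the example keeps running and can show the exit code.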
3. Numpy
Note
You can try these numpy examples yourselves at

https://www.w3schools.com/python/numpy/trypython.asp?filename=demo_numpy_editor
Numpy is a library consisting of multidimensional array objects and a collection of routines

for processing those arrays. Using NumPy, mathematical and logical operations on arrays
can be performed
Using NumPy, a developer can perform the following operations
Mathematical and logical operations on arrays
Fourier transforms and routines for shape manipulation
Operations related to linear algebra. NumPy has in-
Difference between python list and a NumPy array
NumPy gives you an enormous range of fast and efficient ways of creating arrays and
manipulating numerical data inside them
While a python list can contain different data types within a single list, all the elements in a
NumPy array should be homogeneous
The mathematical operations that are meant to be performed on arrays would be
extremely inefficient if arrays werent homogeneous
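The difference can be seen in a few lines. This sketch (variable names are made up for illustration) shows that a list keeps each element's own type, while NumPy converts everything to one common dtype and applies arithmetic to all elements at once.

```python
import numpy as np

mixed_list = [1, 2.5, 3]               # a list keeps each element's own type
list_types = [type(v).__name__ for v in mixed_list]
print(list_types)                      # ['int', 'float', 'int']

arr = np.array(mixed_list)             # NumPy converts everything to one dtype
print(arr.dtype)                       # float64

# Arithmetic applies to every element of an array at once.
doubled_arr = arr * 2
print(doubled_arr.tolist())            # [2.0, 5.0, 6.0]

# The same operator on a list repeats the list instead.
doubled_list = mixed_list * 2
print(doubled_list)                    # [1, 2.5, 3, 1, 2.5, 3]
```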
ndarray Object
The most important object defined in NumPy is an N-dimensional array type called
ndarray.
Features of ndarray
N-dimensional Array: The ndarray is like a grid or table of numbers. This grid can have
any number of dimensions. For example, a 1-dimensional array (like a list), a 2-
dimensional array (like a table or matrix), or even higher dimensions.
Collection of Same Type: All the items (elements) in this array are of the same type, like
all integers or all floats. This makes operations on the array very fast.
Zero-based Indexing: You can access each item in the array using an index that starts
from 0. For example, array[0] gives you the first element.
Consistent Memory Size: Each item in the array takes up the same amount of memory
space. This is different from regular Python lists, where each element can be of different
sizes.
Data-type Object (dtype): Every ndarray is associated with a data-type object (called
dtype) that describes the type of its elements
Indexing and Array Scalars: When you extract a single item from the array (by indexing), the
extracted item is represented as a Python object of a specific type known as an array
scalar. Slicing a range of items returns another ndarray.
Simple example
import numpy as np
# Creating an ndarray
array = np.array([1, 2, 3, 4])
# Accessing the first element (zero-based index)

print(array[0]) # Output: 1
# Checking the data type

print(array.dtype) # Output: int64 (or int32 depending on your system)
# Slicing the array

part = array[1:3]  # avoid the name "slice", which is a Python built-in
print(part) # Output: [2 3]
The relationship between ndarray, dtype and array scalar types can be pictured as a diagram:
The large rectangle at the bottom is the ndarray
The ndarray contains a header
The header holds the data type (dtype) for the entire array
The smaller squares inside the ndarray are the array scalars
Example - Creating Arrays
Program
import numpy as np
a = np.array([1,2,3,4])
print("Value of a is\n",a,"\n") # Gives output [1 2 3 4]
b = np.array([(1,2,3),(4,5,6)], dtype = float)

print("Value of b is\n",b,"\n")
c = np.array([(1,2,3),(4,5,6),(7,8,9)])
print("Value of c is\n",c,"\n")
Output
Value of a is
[1 2 3 4]
Value of b is
[[1. 2. 3.]
[4. 5. 6.]]
Value of c is
[[1 2 3]
[4 5 6]
[7 8 9]]
ndarray Object - Parameters
ndarray.ndim
ndim represents the number of dimensions (axes) of the ndarray
a = np.array([1,2,3,4])
This is a 1D array, so ndim gives 1
b = np.array([(1,2,3),(4,5,6)])
This is a 2D array, so ndim gives 2
ndarray.shape
shape is a tuple of integers representing the size of ndarray in each dimension
If the array is 3x3
ndarray.shape gives (3,3)
ndarray.size
Total number of elements
Product of elements in shape
if shape is (3,3) then size is 3x3 = 9
ndarray.dtype
Data type of elements of numpy array
ndarray.itemsize
Returns size of each element of a numpy array
Example - ndarray Object Parameters
Program
import numpy as np
a = np.array([[[1,2,3],[4,3,5]],[[3,6,7],[2,1,0]]])
print("The dimension of array a is:",a.ndim)
print("The size of array a is:",a.shape)
print("The total no of elements in array a is:",a.size)
print("The datatype of elements in array a is:",a.dtype)
print("The size of each element in array a is:",a.itemsize)
Output
The dimension of array a is: 3

The size of array a is: (2, 2, 3)
The total no of elements in array a is: 12
The datatype of elements in array a is: int64
The size of each element in array a is: 8
Arithmetic operations with NumPy Array
The arithmetic operations with NumPy arrays perform element wise operations. This
means the operators are applied only between corresponding elements
Arithmetic operations are possible only if the arrays have the same shape, or shapes that NumPy can broadcast to a common shape (for example an array and a scalar)
Basic Operations with Scalars
Program
import numpy as np
a = np.array([1,2,3,4,5])
b = a + 1
print(b)
c = 2**a
print(c)
Output
[2 3 4 5 6]
[ 2 4 8 16 32]
In the first one, 1 is added to each element
In the second one, each result is 2**e, where e is the corresponding element of a
Arithmetic Operators in numpy
Program
import numpy as np
a = np.array([7,3,4,5,1])
b = np.array([3,4,5,6,7])
print(a+b)
print(np.add(a,b))
print("----------")
print(a-b)
print(np.subtract(a,b))
print("----------")
print(a*b)
print(np.multiply(a,b))
print("-----------")
print(a/b)
print(np.divide(a,b))
print("-----------")
print(np.remainder(a,b))
print("-----------")
print(np.mod(a,b))
print("-----------")
print(np.power(a,b))
print("-----------")
print(np.reciprocal(a))  # takes one array; for integers the result is truncated to an integer
print("-----------")
Output
[10 7 9 11 8]
[10 7 9 11 8]
----------
[ 4 -1 -1 -1 -6]
[ 4 -1 -1 -1 -6]
----------
[21 12 20 30 7]
[21 12 20 30 7]
-----------
[2.33333333 0.75 0.8 0.83333333 0.14285714]
[2.33333333 0.75 0.8 0.83333333 0.14285714]
-----------
[1 3 4 5 1]
-----------
[1 3 4 5 1]
-----------
[ 343 81 1024 15625 1]
-----------
[0 0 0 0 1]
-----------
Here we are using add, subtract, multiply, divide and so on
Each operation can be written either with the operator symbol or with the named function
For example, either + or np.add can be used
Trigonometry operations with NumPy Array
np.sin()
np.cos()
np.tan()
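A short sketch of these three functions. They work element by element and expect angles in radians; np.deg2rad is used here to convert from degrees. The angle values are chosen only for illustration.

```python
import numpy as np

angles = np.array([0, np.pi / 6, np.pi / 2])    # angles in radians

sines = np.sin(angles)
cosines = np.cos(angles)
tangents = np.tan(np.array([0, np.pi / 4]))

print(np.round(sines, 3).tolist())      # [0.0, 0.5, 1.0]
print(np.round(cosines, 3).tolist())    # [1.0, 0.866, 0.0]
print(np.round(tangents, 3).tolist())   # [0.0, 1.0]

# Angles given in degrees must be converted to radians first.
degrees = np.array([0, 30, 90])
from_degrees = np.sin(np.deg2rad(degrees))
print(np.round(from_degrees, 3).tolist())   # [0.0, 0.5, 1.0]
```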
Comparison in NumPy
We can use the == operator to compare two arrays element by element. NumPy also provides these comparison functions

numpy.greater(x1,x2)
numpy.greater_equal(x1,x2)
numpy.less(x1,x2)
numpy.less_equal(x1,x2)
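The comparison functions return an array of True/False values, one per element. A minimal sketch with two made-up arrays x1 and x2:

```python
import numpy as np

x1 = np.array([1, 5, 3])
x2 = np.array([2, 5, 1])

equal = (x1 == x2).tolist()
print(equal)                                  # [False, True, False]
print(np.greater(x1, x2).tolist())            # [False, False, True]
print(np.greater_equal(x1, x2).tolist())      # [False, True, True]
print(np.less(x1, x2).tolist())               # [True, False, False]
print(np.less_equal(x1, x2).tolist())         # [True, True, False]

# To test whether two whole arrays are identical, use np.array_equal.
same = np.array_equal(x1, x2)
print(same)                                   # False
```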
4. Pandas
Note
You can try these pandas examples yourselves at

https://www.w3schools.com/python/pandas/trypandas.asp?filename=demo_pandas_editor
What is Pandas?
Pandas is an open-source library built on top of the NumPy library. It is a Python
package that offers various data structures and operations for manipulating numerical data
and time series.
It is popular mainly because it makes importing and analyzing data much easier. Pandas is fast and offers
high performance and productivity for users.
Advantages
Fast and efficient for manipulating and analyzing data.

Data from different file objects can be loaded.
Easy handling of missing data (represented as NaN)
Series
Pandas Series is a one-dimensional labeled array capable of holding data of any type
(integer, string, float, python objects, etc.)
The axis labels are collectively called index. Pandas Series is nothing but a column in an
excel sheet.
We can form a simple series using an array of data
Example of Series
Program-1 (Simple Series)
import pandas as pd
obj = pd.Series([3,5,-8,7,9])
print(obj)
Output
0 3
1 5
2 -8
3 7
4 9
dtype: int64
Here the index values are 0,1,2,3,4 for the Values in the series
Program-2 (Series with custom index)
We can also put custom index values

import pandas as pd
obj = pd.Series([3,5,-8,7,9],index=['d','b','a','c','e'])
print(obj)
Output
d 3
b 5
a -8
c 7
e 9
dtype: int64
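Once a Series has an index, values can be looked up by label as well as by position. A small sketch using the same Series as Program-2:

```python
import pandas as pd

obj = pd.Series([3, 5, -8, 7, 9], index=['d', 'b', 'a', 'c', 'e'])

by_label = obj['a']              # lookup by index label
by_position = obj.iloc[0]        # lookup by integer position
print(by_label)                  # -8
print(by_position)               # 3

labels = obj.index.tolist()
print(labels)                    # ['d', 'b', 'a', 'c', 'e']

# Boolean filtering keeps the labels of the matching values.
large = obj[obj > 4]
print(large.index.tolist())      # ['b', 'c', 'e']
print(large.tolist())            # [5, 7, 9]
```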
Dataframe
What is Dataframe?
A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular
data structure with labeled axes (rows and columns)
A Pandas DataFrame is like a table of data.
It has rows and columns.
You can change its size by adding or removing rows and columns.
It can hold different types of data in different columns (e.g., numbers, text).
Pandas DataFrame consists of three principal components, the data, rows, and columns.
Basic operations of Dataframe
Dealing with Rows and Columns
Indexing and Selecting Data
Working with Missing Data
Iterating over rows and columns
Example-1
import pandas as pd
lst = ['mec','minor','stud','eee','bio']
df = pd.DataFrame(lst)
print(df)
Output
0
0 mec
1 minor
2 stud
3 eee
4 bio
Here the values occupy column 0
Let's see how to do it for multiple columns
Example-2
import pandas as pd
lst = {
'Column 0': ['mec', 'minor', 'stud', 'eee', 'bio'],
'Column 1': ['data1', 'data2', 'data3', 'data4', 'data5']
}
df = pd.DataFrame(lst)
print(df)
Output
Column 0 Column 1
0 mec data1
1 minor data2
2 stud data3
3 eee data4
4 bio data5
Example-3
Let's make a table with name and age

import pandas as pd
# initialise data of lists.
data = {'Name':['Tom', 'nick', 'krish', 'jack'], 'Age':[20, 21, 19, 18]}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
print(df)
Output
Name Age
0 Tom 20
1 nick 21
2 krish 19
3 jack 18
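The basic operations listed earlier (rows and columns, selecting data, missing data, iterating) can be sketched on the same Name/Age table. The City column and the missing age are made-up values added only to have something to work with.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Tom', 'nick', 'krish', 'jack'],
                   'Age': [20, 21, np.nan, 18]})    # one age is missing (NaN)

# Dealing with rows and columns: add a column, then drop a row.
df['City'] = ['Kochi', 'Delhi', 'Pune', 'Goa']      # hypothetical values
trimmed = df.drop(index=3)

# Indexing and selecting data.
names = df['Name'].tolist()
print(names)                                        # ['Tom', 'nick', 'krish', 'jack']

# Working with missing data.
missing_count = int(df['Age'].isna().sum())
filled = df['Age'].fillna(0).tolist()
print(missing_count)                                # 1
print(filled)                                       # [20.0, 21.0, 0.0, 18.0]

# Iterating over rows.
for idx, row in df.iterrows():
    print(idx, row['Name'], row['City'])
```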
5. Matplotlib
What is Matplotlib?
Matplotlib is one of the most popular Python packages used for data visualization
It is a cross-platform library for making 2D plots from data in arrays. Matplotlib is written in
Python and makes use of NumPy.
Example - 1 -Sin Wave
from matplotlib import pyplot as plt

import numpy as np
import math
x = np.arange(0,math.pi*2,0.05)
y = np.sin(x)
plt.plot(x,y)
plt.xlabel("angle")
plt.ylabel("sine")
plt.title("sine wave")
plt.show()
Output
1. To begin with, the Pyplot module from Matplotlib package is imported

import matplotlib.pyplot as plt
2. Next we need an array of numbers to plot.

import numpy as np
import math
x = np.arange(0, math.pi * 2, 0.05)
np.arange: This is a function from the NumPy library that generates an array of
evenly spaced values within a given range.
0: The starting value of the range (inclusive).
math.pi * 2: The end value of the range (exclusive). This calculates 2π,
which is approximately 6.2832.
0.05: The step size, which determines the spacing between values in the array.
3. The ndarray object serves as values on x axis of the graph. The corresponding sine
values of angles in x to be displayed on y axis are obtained by the following
statement
y = np.sin(x)
4. The values from two arrays are plotted using the plot() function.
plt.plot(x,y)
5. You can set the plot title, and labels for x and y axes.
plt.xlabel("angle")
plt.ylabel("sine")
plt.title('sine wave')
6. The Plot viewer window is invoked by the show() function

plt.show()
Example - 2 - Creating a bar plot
Program
import matplotlib.pyplot as plt
x = [5, 2, 9, 4, 7]
y = [10, 5, 8, 4, 2]
# Function to plot the bar
plt.bar(x,y)
# function to show the plot
plt.show()
Output
Pylab
PyLab is a convenience module that bulk imports matplotlib.pyplot (for
plotting) and NumPy (for mathematics and working with arrays) into a single
namespace. Its use is now discouraged in favour of importing pyplot and numpy explicitly.
Example - 3 - Creating a Pieplot
Program
data=[20,30,10,50]
from pylab import *
pie(data)
show()
Output
Python-Module-5-University-Questions-Part-A
1. How do you assign a random number to a variable in
Python?
To assign a random number to a variable in Python, you can use the random module,
which provides various methods for generating random numbers.
import random
random_num = random.random()
print(random_num)
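random.random() gives a float between 0 and 1. The random module has other functions for other kinds of random values; a short sketch follows (the seed call is optional and is used here only to make the run repeatable).

```python
import random

random.seed(42)                        # optional: makes the sequence repeatable

fraction = random.random()             # float in the range [0.0, 1.0)
die = random.randint(1, 6)             # integer from 1 to 6, both ends included
temperature = random.uniform(20, 30)   # float between 20 and 30
colour = random.choice(['red', 'green', 'blue'])   # one item picked from a list

print(fraction, die, temperature, colour)
```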
2. What is the use of os module in python?

The os module in Python provides functions to interact with the operating system. It allows you
to create, delete, rename, and list files and directories.
3. Write a Python code that checks to see, if a file with the

given pathname exists on the disk, before attempting to
open a file for input
import os
# Function to check if the file exists and then open it
def open_file_if_exists(filepath):
    if os.path.isfile(filepath):
        with open(filepath, 'r') as file:
            content = file.read()
            print("File content:\n", content)
    else:
        print("File does not exist.")
# Example usage
filepath = 'example.txt'
open_file_if_exists(filepath)
4. What is Flask in Python?
Flask is a lightweight and flexible web framework for Python. It’s designed to make getting
started with web development quick and easy, while still being powerful enough to build
complex web applications.
5. Explain the os and os.path modules in Python with

examples. Also, discuss the walk( ) and getcwd( ) methods of
the os module
What is os and os.path?
The os module in Python provides a way to interact with the operating system, offering
various functions for file and directory manipulation, process management etc.
The os.path module, which is part of os , provides functions for manipulating file and
directory paths.
Example of os
import os
os.mkdir('new_directory') # Create a new directory
os.rename('old_name.txt', 'new_name.txt') # Rename a file
os.remove('file_to_delete.txt') # Remove a file
os.rmdir('directory_to_delete') # Remove a directory
Example of os.path
import os
path = os.path.join('directory', 'file.txt') # Join paths
exists = os.path.exists('file.txt') # Check if file exists
is_dir = os.path.isdir('directory') # Check if it is a directory
os.walk()
The os.walk() method generates the file names in a directory tree by walking either top-
down or bottom-up through the directory.
import os
for dirpath, dirnames, filenames in os.walk('path/to/directory'):
    print(f"Directory: {dirpath}")
    for dirname in dirnames:
        print(f"Subdirectory: {dirname}")
    for filename in filenames:
        print(f"File: {filename}")
os.getcwd()
The os.getcwd() method returns the current working directory.
import os
current_directory = os.getcwd()
print(f"Current Working Directory: {current_directory}")
6. What are the important characteristics of CSV file format.

What is CSV?
CSV is a data format that has fields/columns separated by the comma character and
records/rows terminated by newlines
Example of a csv file
ID,Name,Age,City
1,John Doe,28,New York
2,Jane Smith,34,Los Angeles
3,Emily Jones,22,Chicago
4,Michael Brown,45,Houston
5,Sarah Davis,29,Miami
Here the first line holds the column names
The remaining lines are the rows (one record per line)
Characteristics of CSV File format
One line for each record

Comma separated fields
Space characters adjacent to commas are ignored by some CSV readers, while others treat them as part of the field
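These characteristics can be checked with Python's built-in csv module. This is a sketch that reads CSV text from a string through io.StringIO instead of a real file (an assumption made to keep the example self-contained). Note that csv.reader keeps spaces after commas unless skipinitialspace=True is passed.

```python
import csv
import io

text = "ID,Name,Age,City\n1,John Doe,28,New York\n2,Jane Smith,34,Los Angeles\n"

rows = list(csv.reader(io.StringIO(text)))
header, records = rows[0], rows[1:]
print(header)         # ['ID', 'Name', 'Age', 'City']
print(records[0])     # ['1', 'John Doe', '28', 'New York']
print(len(records))   # 2 (one record per line)

# By default a space after a comma is kept as part of the field.
kept = list(csv.reader(io.StringIO("a, b, c\n")))
print(kept)           # [['a', ' b', ' c']]

# skipinitialspace=True drops spaces that directly follow a comma.
skipped = list(csv.reader(io.StringIO("a, b, c\n"), skipinitialspace=True))
print(skipped)        # [['a', 'b', 'c']]
```

Every value comes back as a string, so numbers such as the Age column need int() or float() before doing arithmetic.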
7. Write the output of the following python code:
import numpy as np
arr1 = np.arange(6).reshape((3, 2))
arr2 = np.arange(6).reshape((3,2))
arr3 = arr1 + arr2[0].reshape((1, 2))
print(arr3)
1. Importing numpy:
1. import numpy as np
2. Creating the first array ( arr1 ):
1. arr1 = np.arange(6).reshape((3, 2))
2. np.arange(6) generates an array with values [0, 1, 2, 3, 4, 5]
3. .reshape((3, 2)) reshapes this array into a 3x2 array:
4. So the arr1 variable will have 3 rows and 2 columns
arr1 = [[0, 1],

[2, 3],
[4, 5]]
3. Creating the second array ( arr2 ):

1. arr2 = np.arange(6).reshape((3, 2))
arr2 = [[0, 1],

[2, 3],
[4, 5]]
4. Adding the first row of arr2 to arr1 :

1. arr3 = arr1 + arr2[0].reshape((1, 2))
2. arr2[0] selects the first row of arr2 , which is [0, 1] .
3. .reshape((1, 2)) reshapes this into a 1x2 array:
4. so arr2[0].reshape((1, 2)) is [[0, 1]]
NumPy broadcasts this 1x2 array across each row of arr1, so the addition is element-wise:
arr3 = [[0 + 0, 1 + 1],

[2 + 0, 3 + 1],
[4 + 0, 5 + 1]]
= [[0, 2],
[2, 4],
[4, 6]]
5. Printing the resulting array ( arr3 ):

1. print(arr3)
6. The output of the code is:
[[0 2]
[2 4]
[4 6]]
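The step where the 1x2 array is added to every row is NumPy broadcasting. As a rough check, the sketch below stretches the row explicitly with np.broadcast_to and confirms it gives the same result as the plain addition. The names row, stretched, explicit and implicit are made up for this example.

```python
import numpy as np

arr1 = np.arange(6).reshape((3, 2))
arr2 = np.arange(6).reshape((3, 2))
row = arr2[0].reshape((1, 2))                  # [[0, 1]], shape (1, 2)

# What broadcasting does behind the scenes: repeat the row to match arr1's shape.
stretched = np.broadcast_to(row, arr1.shape)
print(stretched.tolist())                      # [[0, 1], [0, 1], [0, 1]]

explicit = arr1 + stretched
implicit = arr1 + row                          # NumPy stretches the row automatically
print(implicit.tolist())                       # [[0, 2], [2, 4], [4, 6]]
print(np.array_equal(explicit, implicit))      # True
```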
8. What is the difference between loc and iloc in pandas

DataFrame. Give a suitable example
Difference between loc and iloc
In Pandas, loc and iloc are both methods used for indexing and selecting data in a
DataFrame
loc : It is used to select data by label. When you use loc , you specify rows and columns
based on their labels. This means you refer to the index labels and column names to
retrieve data
iloc : It is used to select data by integer location. When you use iloc , you specify rows
and columns based on their integer positions, starting from 0. This means you refer to the
numerical index positions to retrieve data.
Example
First, let's create a DataFrame
import pandas as pd
# Creating a sample DataFrame

data = {'A': [1, 2, 3, 4, 5],
'B': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data, index=['row1', 'row2', 'row3', 'row4', 'row5'])
The dataframe will look like this
A B
row1 1 a
row2 2 b
row3 3 c
row4 4 d
row5 5 e
Here row1, row2, etc. are the index labels. We can use these labels to access values inside the
table
Using loc
Suppose I want to print the content of row2
In this case I need to use loc
print("Using loc:")
print(df.loc['row2']) # Selecting row with label 'row2'
print(df.loc['row2', 'B']) # Selecting value at row 'row2' and column 'B'
This gives the following output
Using loc:
A 2
B b
Name: row2, dtype: object
b
The lines after the heading are the content of row2, and the final b is the value at row2, column B
Using iloc
Suppose I want to access row 0 and row 1 by position, without using the index labels
For that we use iloc
print("Printing content of 0th row")
print(df.iloc[0]) # Selecting row at integer position 0 (first row)
print("Printing content of 0th row and 0th column")
print(df.iloc[0,0]) # Selecting value at row 0 and column 0
print("Printing content of 1st row")
print(df.iloc[1]) # Selecting row at integer position 1 (second row)
print("Printing content of 1st row and 1st column")
print(df.iloc[1, 1]) # Selecting value at row 1 and column 1 (second row, second column)
Printing content of 0th row
A 1
B a
Name: row1, dtype: object
Printing content of 0th row and 0th column
1
Printing content of 1st row
A 2
B b
Name: row2, dtype: object
Printing content of 1st row and 1st column
b
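One more difference shows up when slicing: a loc slice includes the end label, while an iloc slice excludes the end position (like normal Python slicing). A short sketch on the same DataFrame:

```python
import pandas as pd

data = {'A': [1, 2, 3, 4, 5],
        'B': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data, index=['row1', 'row2', 'row3', 'row4', 'row5'])

# loc: label-based slice, the end label 'row4' is included.
by_label = df.loc['row2':'row4', 'A'].tolist()
print(by_label)        # [2, 3, 4]

# iloc: position-based slice, the end position 3 is excluded.
by_position = df.iloc[1:3, 0].tolist()
print(by_position)     # [2, 3]
```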
9. Explain the attributes of an ndarray object.

ndarray.ndim
ndim represents the number of dimensions (axes) of the ndarray
a = np.array([1,2,3,4])
This is a 1D array, so ndim gives 1
b = np.array([(1,2,3),(4,5,6)])
This is a 2D array, so ndim gives 2
ndarray.shape
shape is a tuple of integers representing the size of ndarray in each dimension
If the array is 3x3
ndarray.shape gives (3,3)
ndarray.size
Total number of elements
Product of elements in shape
if shape is (3,3) then size is 3x3 = 9
ndarray.dtype
Data type of elements of numpy array
ndarray.itemsize
Returns size of each element of a numpy array
Python-Module-5-University-Questions-Part-B
1. Explain how the matrix multiplications are done using
numpy arrays.
Matrix multiplication is also called the matrix dot product.
The rule for matrix multiplication is as follows:
The number of columns (n) in the first matrix (A) must equal the number of rows (m)
in the second matrix (B).
from numpy import array

# define first matrix
A = array([[1, 2],[3, 4],[5, 6]])
print(A)
# define second matrix
B = array([[1, 2],[3, 4]])
print(B)
# multiply matrices
C = A.dot(B)
print(C)
Here the number of columns of A = 2
and the number of rows of B = 2, so the product A.dot(B) is defined and has shape 3x2
The output of the program is
[[1 2]
[3 4]
[5 6]]
[[1 2]
[3 4]]
[[ 7 10]
[15 22]
[23 34]]
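Besides dot(), NumPy also offers the @ operator and np.matmul for the same product. The sketch below shows that all three agree for these 2D arrays, that * is element-wise (not matrix multiplication), and that breaking the columns-equals-rows rule raises a ValueError.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
B = np.array([[1, 2], [3, 4]])           # shape (2, 2)

C_dot = A.dot(B)
C_at = A @ B                  # matrix multiplication operator
C_matmul = np.matmul(A, B)
print(C_at.tolist())          # [[7, 10], [15, 22], [23, 34]]
print(C_at.shape)             # (3, 2)

# The * operator multiplies element by element instead.
elementwise = B * B
print(elementwise.tolist())   # [[1, 4], [9, 16]]

# B has 2 columns but A has 3 rows, so B @ A is not defined.
try:
    B @ A
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
print(type(mismatch_error).__name__)     # ValueError
```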
2. How to plot two or more lines on a same plot with suitable

legends, labels and title.
import matplotlib.pyplot as plt
# Create data for the lines
x = [1, 2, 3, 4, 5]
y1 = [2, 3, 5, 7, 11]
y2 = [1, 4, 9, 16, 25]
# Plot each line with labels

plt.plot(x, y1, label='Line 1')
plt.plot(x, y2, label='Line 2')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Two Lines Plot')
# Add legend
plt.legend()
# Show the plot

plt.show()
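Here we plot 2 lines using plt.plot
We set the axis labels using plt.xlabel and plt.ylabel
We set the title using plt.title
The legend is drawn by plt.legend(), using the label argument given to each plt.plot() call
The output is a single figure showing both lines, the axis labels, the title and the legend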
3. Consider a CSV file ‘employee.csv’ with the following
columns(name, gender, start_date ,salary, team)
Write commands to do the following using the pandas library.
1. print first 7 records from employees file
2. print all employee names in alphabetical order
3. find the name of the employee with highest salary
4. list the names of male employees
5. Display to which all teams employees belong
1. First we import pandas
1. import pandas as pd
2. Now let's read the employee.csv file
1. For that we will use pd.read_csv
2. data = pd.read_csv("employee.csv") # Read the CSV file
3. Question 1 says to print the first 7 records
1. For that we will use .head
2. print(data.head(7))
4. Question 2 says to print all the employee names in alphabetical order
1. print(data.sort_values(by="name")["name"]) # Sort by name and print the "name" column
5. Question 3 says to find the name of the employee with the highest salary
1. highest_paid_employee = data.loc[data["salary"].idxmax(), "name"]
2. It uses the .idxmax() function
3. .idxmax() returns the index label (here, the row number) where the maximum value in the
"salary" column is located.
4. So here row = row number with highest salary
5. Column = name
6. Question 4 says to list the names of male employees
1. male_employees = data[data["gender"] == "M"]["name"]
7. Question 5 says to display the teams
1. unique_teams = data["team"].unique()
The entire code will look like this
import pandas as pd
# Sample CSV file (replace 'employee.csv' with your actual file path)
data = pd.read_csv("employee.csv")
# 1. Print first 7 records

print("First 7 records:")
print(data.head(7))
# 2. Print employee names in alphabetical order

print("\nEmployee names (alphabetical):")
print(data.sort_values(by="name")["name"])
# 3. Find employee with highest salary

highest_paid_employee = data.loc[data["salary"].idxmax(), "name"]
print("\nEmployee with highest salary:", highest_paid_employee)
# 4. List names of male employees

male_employees = data[data["gender"] == "M"]["name"]
print("\nMale employees:", male_employees.tolist())
# 5. List all teams
unique_teams = data["team"].unique()
print("\nTeams:", unique_teams.tolist())
employee.csv file
name,gender,start_date,salary,team
Alice,F,2023-01-01,50000,Marketing
Bob,M,2022-05-15,72000,Engineering
Charlie,M,2024-02-10,48000,Sales
David,M,2021-12-25,65000,Marketing
Emily,F,2023-07-09,38000,Finance
Frank,M,2022-09-22,80000,Engineering
Grace,F,2024-03-14,42000,Sales
Output
First 7 records:
name gender start_date salary team
0 Alice F 2023-01-01 50000 Marketing
1 Bob M 2022-05-15 72000 Engineering
2 Charlie M 2024-02-10 48000 Sales
3 David M 2021-12-25 65000 Marketing
4 Emily F 2023-07-09 38000 Finance
5 Frank M 2022-09-22 80000 Engineering
6 Grace F 2024-03-14 42000 Sales
Employee names (alphabetical):

0 Alice
1 Bob
2 Charlie
3 David
4 Emily
5 Frank
6 Grace
Name: name, dtype: object
Employee with highest salary: Frank
Male employees: ['Bob', 'Charlie', 'David', 'Frank']
Teams: ['Marketing', 'Engineering', 'Sales', 'Finance']
4. Write Python program to write the data given below to a
CSV file
Reg_no Name Sub_Mark1 Sub_Mark2 Sub_Mark3
10001 Jack 76 88 76
10002 John 77 84 79
10003 Alex 74 79 81
import csv
data = [
["Reg_no", "Name", "Sub_Mark1", "Sub_Mark2", "Sub_Mark3"],
[10001, "Jack", 76, 88, 76],
[10002, "John", 77, 84, 79],
[10003, "Alex", 74, 79, 81],
]
with open("student_data.csv", "w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(data)
print("Data written to student_data.csv successfully!")
The csv.writer function creates a writer object that helps write data to the CSV file.
The writer.writerows method writes all rows from the
data list to the CSV file.
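To confirm that the file was written correctly, it can be read back. This sketch writes the same data into a temporary folder (an assumption, so that no file is left in your working directory) and reads it with csv.DictReader, which uses the header row as keys.

```python
import csv
import os
import tempfile

data = [
    ["Reg_no", "Name", "Sub_Mark1", "Sub_Mark2", "Sub_Mark3"],
    [10001, "Jack", 76, 88, 76],
    [10002, "John", 77, 84, 79],
    [10003, "Alex", 74, 79, 81],
]

# Write to a temporary location instead of the current directory.
path = os.path.join(tempfile.mkdtemp(), "student_data.csv")
with open(path, "w", newline="") as csvfile:
    csv.writer(csvfile).writerows(data)

# Read the file back; each row becomes a dictionary keyed by the header.
with open(path, newline="") as csvfile:
    rows = list(csv.DictReader(csvfile))

print(rows[0]["Name"])        # Jack
print(rows[0]["Sub_Mark1"])   # 76 (read back as a string)

# Values are strings after reading, so convert before adding.
total_jack = sum(int(rows[0][key]) for key in ("Sub_Mark1", "Sub_Mark2", "Sub_Mark3"))
print(total_jack)             # 240
```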
5. Consider the following two-dimensional array named arr2d
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]
Write the output of following Python Numpy expressions:
1. arr2d[:2]
2. arr2d[:2, 1:]
3. arr2d[1, :2]
4. arr2d[:2, 1:] = 0
arr2d[:2]
[[1 2 3] [4 5 6]]
Slicing with [:2] selects all rows from the beginning ( 0 ) up to, but not including, index
2.
So, it extracts the first two rows (index 0 and 1) of the original array and returns a 2D
array (a view on the original data) containing those rows.
arr2d[:2, 1:]
[[2 3] [5 6]]
Slicing with [:2, 1:] selects elements based on both rows and columns.
:2 selects rows as explained for arr2d[:2].
, 1: selects columns starting from index 1 (the second column) up to the end for
each row included in the first selection.
Therefore, it extracts elements from the second column (index 1) onwards for the first two
rows (0 and 1) and returns a 2D array (a view) with those elements.
arr2d[1, :2]
[4 5]
Slicing with [1, :2] selects elements from a specific row and columns.
[1] selects the second row (index 1) of the array.
, :2 selects columns from the beginning ( 0 ) up to, but not including, index 2 (i.e.,
the first two columns).
This extracts elements from the first two columns (0 and 1) of the second row (index 1)
and returns a 1D array (a view) containing those elements.
arr2d[:2, 1:] = 0
[[1 0 0] [4 0 0] [7 8 9]]
[:2, 1:] selects elements from the second column onwards for the first two rows.
Assigning 0 to this selection replaces those elements with zeros, effectively modifying the
original arr2d array.
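The four expressions can be run in order to check the answers. The sketch below also shows a point worth knowing: a basic slice is a view on the original array, so a slice taken before the assignment sees the zeros too, while a .copy() does not.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

top = arr2d[:2]                  # view on the first two rows
top_copy = arr2d[:2].copy()      # independent copy of the same rows
print(arr2d[:2, 1:].tolist())    # [[2, 3], [5, 6]]
print(arr2d[1, :2].tolist())     # [4, 5]

arr2d[:2, 1:] = 0                # modifies the original array in place
print(arr2d.tolist())            # [[1, 0, 0], [4, 0, 0], [7, 8, 9]]

print(top.tolist())              # [[1, 0, 0], [4, 0, 0]]  the view changed too
print(top_copy.tolist())         # [[1, 2, 3], [4, 5, 6]]  the copy did not
print(np.shares_memory(top, arr2d))   # True
```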
6. Write a Python program to add two matrices and also find

the transpose of the resultant matrix.
import numpy as np
# Define two matrices

matrix1 = np.array([[1, 2, 3], [4, 5, 6]])
matrix2 = np.array([[7, 8, 9], [10, 11, 12]])
# Add the matrices

sum_matrix = matrix1 + matrix2
print("Sum of matrices:\n", sum_matrix)
# Find the transpose of the resultant matrix

transpose_sum = sum_matrix.T
print("\nTranspose of the sum:\n", transpose_sum)
Output
Sum of matrices:
[[ 8 10 12]
[14 16 18]]
Transpose of the sum:

[[ 8 14]
[10 16]
[12 18]]
7. Given a file “auto.csv” of automobile data with the fields

index, company,body-style, wheel- base, length, engine-type,
num-of-cylinders, horsepower,average-mileage, and price,
write Python codes using Pandas to
1. Print total cars of all companies
2. Find the average mileage of all companies
3. Find the highest priced car of all companies
Question 1
import pandas as pd
# 1. Print total cars of all companies
data = pd.read_csv("auto.csv")
total_cars = len(data)
print(f"Total Cars: {total_cars}")
Question 2
average_mileage = data["average-mileage"].mean()
print(f"Average Mileage: {average_mileage:.2f} MPG")
Question 3
highest_priced_car = data[data["price"] == data["price"].max()]
print("Highest Priced Car:")
print(highest_priced_car)
Example- auto.csv
index,company,body-style,wheel-base,length,engine-type,num-of-cylinders,horsepower,average-mileage,price
1,Chevrolet,Wagon,98.6,190.9,2.5L,4,98,25,8145
2,Chevrolet,Minivan,97.0,186.6,3.0L,4,161,19,10095
3,Dodge,Sedan,96.8,190.0,2.3L,4,100,21,7875
4,Dodge,Sedan,96.8,190.0,2.0L,4,130,23,8845
5,Plymouth,Minivan,95.2,187.1,2.4L,4,100,18,8845
6,Ford,Sedan,97.5,190.0,2.0L,4,120,21,7195
7,Ford,Sedan,97.5,190.0,2.0L,4,100,25,7595
8,Ford,Wagon,98.8,195.0,2.3L,4,120,20,8575
9,Mercury,Sedan,97.5,190.0,2.0L,4,120,21,7195
10,Mercury,Sedan,97.5,190.0,2.0L,4,100,25,7595
Output
Total Cars: 10
Average Mileage: 21.80 MPG
Highest Priced Car:
index company body-style wheel-base length engine-type num-of-
cylinders horsepower average-mileage price
1 2 Chevrolet Minivan 97.0 186.6 3.0L
4 161 19 10095
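The questions could also be read as asking for results per company rather than for the whole file. Under that reading, a possible sketch uses value_counts and groupby. To keep it self-contained, only the company, average-mileage and price columns of the example data are loaded, from a string instead of auto.csv.

```python
import io
import pandas as pd

# Reduced copy of the example data (assumption: only three columns are needed here).
csv_text = """index,company,average-mileage,price
1,Chevrolet,25,8145
2,Chevrolet,19,10095
3,Dodge,21,7875
4,Dodge,23,8845
5,Plymouth,18,8845
6,Ford,21,7195
7,Ford,25,7595
8,Ford,20,8575
9,Mercury,21,7195
10,Mercury,25,7595
"""
data = pd.read_csv(io.StringIO(csv_text))

# 1. Number of cars for each company
cars_per_company = data["company"].value_counts()
print(cars_per_company["Ford"])             # 3

# 2. Average mileage for each company
mileage_per_company = data.groupby("company")["average-mileage"].mean()
print(mileage_per_company["Mercury"])       # 23.0

# 3. Highest priced car of each company
top_rows = data.loc[data.groupby("company")["price"].idxmax(), ["company", "price"]]
top_price = dict(zip(top_rows["company"], top_rows["price"]))
print(top_price["Chevrolet"])               # 10095
```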
8. Given the sales information of a company as CSV file with

the following fields month_number, facecream, facewash,
toothpaste, bathingsoap, shampoo, moisturizer, total_units,
total_profit. Write Python codes to visualize the data as
follows
1. Toothpaste sales data of each month and show it using a scatter plot
2. Face cream and face wash product sales data and show it using the bar chart
import pandas as pd
import matplotlib.pyplot as plt
# Read the CSV file

data = pd.read_csv("sales_data.csv")
# 1. Toothpaste sales scatter plot

plt.figure(figsize=(8, 6)) # Adjust figure size as needed
plt.scatter(data["month_number"], data["toothpaste"], label="Toothpaste Sales")
plt.xlabel("Month Number")
plt.ylabel("Sales")
plt.title("Toothpaste Sales by Month")
plt.grid(True)
plt.legend()
plt.show()
# 2. Face cream and face wash bar chart

plt.figure(figsize=(8, 6)) # Adjust figure size as needed
face_cream_sales = data["facecream"]
face_wash_sales = data["facewash"]
product_labels = ["Face Cream", "Face Wash"]
plt.bar(product_labels, [face_cream_sales.sum(), face_wash_sales.sum()])
plt.xlabel("Product")
plt.ylabel("Sales")
plt.title("Face Cream vs Face Wash Sales")
plt.grid(axis="y") # Grid on y-axis only
plt.show()
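The bar chart above compares the yearly totals of the two products. If the question is read as wanting month-by-month sales, a grouped bar chart is one possible alternative. The sketch below inlines only the three columns it needs from the sample sales_data.csv; the bar width of 0.4 and the variable names are arbitrary choices made for illustration.

```python
import io
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# month_number, facecream and facewash columns copied from the sample sales_data.csv
csv_text = """month_number,facecream,facewash
1,100,80
2,120,90
3,90,70
4,110,85
5,80,65
6,130,100
7,100,80
8,140,110
9,95,75
10,120,90
11,85,70
12,110,85
"""
data = pd.read_csv(io.StringIO(csv_text))

x = np.arange(len(data))   # one slot on the x-axis per month
width = 0.4                # each slot holds two bars side by side

fig, ax = plt.subplots(figsize=(8, 6))
cream_bars = ax.bar(x - width / 2, data["facecream"], width, label="Face Cream")
wash_bars = ax.bar(x + width / 2, data["facewash"], width, label="Face Wash")

ax.set_xticks(x)
ax.set_xticklabels(data["month_number"])
ax.set_xlabel("Month Number")
ax.set_ylabel("Sales")
ax.set_title("Face Cream and Face Wash Sales by Month")
ax.legend()
plt.show()
```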
sales_data.csv
month_number,facecream,facewash,toothpaste,bathingsoap,shampoo,moisturizer,t
otal_units,total_profit
1,100,80,150,120,90,70,510,2000
2,120,90,180,130,100,80,600,2500
3,90,70,140,110,80,60,450,1800
4,110,85,160,125,95,75,550,2200
5,80,65,130,100,70,55,400,1600
6,130,100,170,140,110,85,635,2700
7,100,80,150,120,90,70,510,2000
8,140,110,180,150,120,90,690,3000
9,95,75,145,115,85,65,480,1900
10,120,90,160,130,100,80,600,2400
11,85,70,135,105,75,60,430,1700
12,110,85,155,120,90,75,535,2100
Output
9. Write a code segment that prints the names of all of the
items in the current working directory.
import os
for item in os.listdir():
    print(item)
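os.listdir() returns files and subdirectories together, in no guaranteed order. As a small extension, the sketch below sorts the names and marks each one as a file or a directory. The helper name describe_items is made up for this example and is not part of the os module.

```python
import os

def describe_items(path="."):
    """Return (name, kind) pairs for every entry in path, sorted by name."""
    entries = []
    for name in sorted(os.listdir(path)):
        # os.listdir gives bare names, so join with the folder before testing
        kind = "dir" if os.path.isdir(os.path.join(path, name)) else "file"
        entries.append((name, kind))
    return entries

for name, kind in describe_items(os.getcwd()):
    print(f"{kind:4} {name}")
```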
10. Write a python program to create two numpy arrays of
random integers between 0 and 20 of shape (3, 3) and
perform matrix addition, multiplication and transpose of the
product matrix.
import numpy as np
# Create two NumPy arrays of random integers between 0 and 20 of shape (3, 3)
array1 = np.random.randint(0, 21, size=(3, 3))
array2 = np.random.randint(0, 21, size=(3, 3))
# Print the original arrays

print("Array 1:\n", array1)
print("Array 2:\n", array2)
# Perform matrix addition

sum_matrix = np.add(array1, array2)
# Print the sum matrix

print("\nSum of matrices:\n", sum_matrix)
# Perform matrix multiplication

product_matrix = np.dot(array1, array2)
# Print the product matrix

print("\nProduct of matrices:\n", product_matrix)
# Perform transpose of the product matrix

transposed_product = product_matrix.T
# Print the transposed product matrix

print("\nTranspose of the product matrix:\n", transposed_product)
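Two points are easy to mix up here: the * operator on NumPy arrays multiplies element by element, while np.dot (or the @ operator) performs the matrix product. The sketch below shows both side by side. It uses NumPy's newer Generator interface with a fixed seed (42 is an arbitrary choice) so that repeated runs give the same arrays; that is a variation on the program above, not a requirement of the question.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # fixed, arbitrary seed for repeatable output
a = rng.integers(0, 21, size=(3, 3))   # upper bound is exclusive, so 21 allows 20
b = rng.integers(0, 21, size=(3, 3))

matrix_product = a @ b         # row-by-column product, same as np.dot(a, b) for 2-D arrays
elementwise_product = a * b    # multiplies matching positions only

print("Matrix product:\n", matrix_product)
print("Element-wise product:\n", elementwise_product)

# Sanity check on the transpose: (AB)^T equals B^T A^T
print("Transpose identity holds:", np.array_equal(matrix_product.T, b.T @ a.T))
```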
11. Write Python program to write the data given below to a
CSV file named student.csv
fields = ['Name', 'Branch', 'Year', 'CGPA']
rows = [ ['Nikhil', 'CSE', '2', '8.0'],
['Sanchit', 'CSE', '2', '9.1'],
['Aditya', 'IT', '2', '9.3'],
['Sagar', 'IT', '1', '9.5']]
import csv
# Define data fields and rows

fields = ['Name', 'Branch', 'Year', 'CGPA']
rows = [
['Nikhil', 'CSE', '2', '8.0'],
['Sanchit', 'CSE', '2', '9.1'],
['Aditya', 'IT', '2', '9.3'],
['Sagar', 'IT', '1', '9.5'],
]
# Open the CSV file in write mode

with open('student.csv', 'w', newline='') as csvfile:
    # Create a CSV writer object
    writer = csv.writer(csvfile)
    # Write the header row
    writer.writerow(fields)
    # Write each data row
    writer.writerows(rows)
print("Student data written to student.csv successfully!")
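One way to check that the file was written as intended is to read it back. The sketch below repeats the write step and then loads the rows with csv.DictReader, which uses the header row as dictionary keys. It writes into a temporary folder (an assumption made here so the example does not overwrite a real student.csv).

```python
import csv
import os
import tempfile

fields = ['Name', 'Branch', 'Year', 'CGPA']
rows = [
    ['Nikhil', 'CSE', '2', '8.0'],
    ['Sanchit', 'CSE', '2', '9.1'],
    ['Aditya', 'IT', '2', '9.3'],
    ['Sagar', 'IT', '1', '9.5'],
]

# Temporary location, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'student.csv')

with open(path, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(fields)
    writer.writerows(rows)

# Read the file back; each record is a dict keyed by the header names
with open(path, newline='') as csvfile:
    records = list(csv.DictReader(csvfile))

for record in records:
    print(record['Name'], record['Branch'], record['Year'], record['CGPA'])
```

Note that the csv module returns every value as a string, so CGPA would need float() before any arithmetic.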
12. Consider the above student.csv file with fields Name,
Branch, Year, CGPA.
Write python code using pandas to
1. To find the average CGPA of the students
2. To display the details of all students having CGPA > 9
3. To display the details of all CSE students with CGPA > 9
4. To display the details of student with maximum CGPA
5. To display average CGPA of each branch
import pandas as pd
# Read the CSV data into a DataFrame

data = pd.read_csv('student.csv')
# 1) Average CGPA of all students

avg_cgpa = data['CGPA'].mean()
print("Average CGPA:", avg_cgpa)
# 2) Students with CGPA > 9

high_cgpa_students = data[data['CGPA'] > 9]
print("\nStudents with CGPA > 9:\n", high_cgpa_students)
# 3) CSE students with CGPA > 9

cse_high_cgpa = data[(data['Branch'] == 'CSE') & (data['CGPA'] > 9)]
print("\nCSE Students with CGPA > 9:\n", cse_high_cgpa)
# 4) Student with maximum CGPA

max_cgpa_student = data.loc[data['CGPA'].idxmax()]
print("\nStudent with maximum CGPA:\n", max_cgpa_student)
# 5) Average CGPA of each branch

avg_cgpa_branch = data.groupby('Branch')['CGPA'].mean()
print("\nAverage CGPA of each branch:\n", avg_cgpa_branch)
student.csv
Name,Branch,Year,CGPA
Nikhil,CSE,2,8.0
Sanchit,CSE,2,9.1
Aditya,IT,2,9.3
Sagar,IT,1,9.5
Output
Average CGPA: 8.975000000000001
Students with CGPA > 9:

Name Branch Year CGPA
1 Sanchit CSE 2 9.1
2 Aditya IT 2 9.3
3 Sagar IT 1 9.5
CSE Students with CGPA > 9:

Name Branch Year CGPA
1 Sanchit CSE 2 9.1
Student with maximum CGPA:

Name Sagar
Branch IT
Year 1
CGPA 9.5
Name: 3, dtype: object
Average CGPA of each branch:

Branch
CSE 8.55
IT 9.40
Name: CGPA, dtype: float64
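idxmax() returns only the first row that holds the maximum, so a tie for the top CGPA would be hidden. A boolean filter keeps every tied row. The sketch below adds one extra student (Meera, a made-up row that is not in the original data) purely to create a tie.

```python
import io
import pandas as pd

# student.csv contents plus one hypothetical row (Meera) to force a tie at 9.5
csv_text = """Name,Branch,Year,CGPA
Nikhil,CSE,2,8.0
Sanchit,CSE,2,9.1
Aditya,IT,2,9.3
Sagar,IT,1,9.5
Meera,CSE,1,9.5
"""
data = pd.read_csv(io.StringIO(csv_text))

# idxmax: a single row, the first occurrence of the maximum
first_top = data.loc[data['CGPA'].idxmax()]

# Boolean filter: every row whose CGPA equals the maximum
all_top = data[data['CGPA'] == data['CGPA'].max()]

print("First student with maximum CGPA:", first_top['Name'])
print("All students with maximum CGPA:\n", all_top)
```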
13. Consider a CSV file ‘weather.csv’ with the following
columns (date, temperature, humidity, windSpeed,
precipitationType, place, weather {Rainy, Cloudy, Sunny}).
Write commands to do the following using Pandas library.
1. Print first 10 rows of weather data.
2. Find the maximum and minimum temperature
3. List the places with temperature less than 28°C.
4. List the places with weather = “Cloudy”
5. Sort and display each weather and its frequency
6. Create a bar plot to visualize temperature of each day
import pandas as pd
import matplotlib.pyplot as plt
# Read the CSV data into a DataFrame

data = pd.read_csv('weather.csv')
# 1. Print first 10 rows

print("First 10 rows:\n", data.head(10))
# 2. Maximum and minimum temperature

max_temp = data['temperature'].max()
min_temp = data['temperature'].min()
print("\nMaximum temperature:", max_temp, "°C")
print("Minimum temperature:", min_temp, "°C")
# 3. Places with temperature less than 28°C

cold_places = data[data['temperature'] < 28]
print("\nPlaces with temperature less than 28°C:\n",
cold_places['place'].unique())
# 4. Places with weather = "Cloudy"

cloudy_places = data[data['weather'] == "Cloudy"]
print("\nPlaces with weather = 'Cloudy':\n",
cloudy_places['place'].unique())
# 5. Sort and display weather frequency

weather_counts = data['weather'].value_counts()
print("\nWeather frequency:\n", weather_counts)
# 6. Bar plot for temperature

plt.bar(data['date'], data['temperature'], color='skyblue')
plt.xlabel('Date')
plt.ylabel('Temperature (°C)')
plt.title('Daily Temperature')
plt.show()
weather.csv
First 10 rows:
         date  temperature  humidity  windSpeed precipitationType      place weather
0  2024-06-02         30.5        65         10        Light Rain   New York   Rainy
1  2024-06-02         25.8        72          8               NaN     London  Cloudy
2  2024-06-01         28.2        58         12               NaN      Paris   Sunny
3  2024-06-02         22.1        80          5     Light Drizzle     Berlin  Cloudy
4  2024-06-01         33.7        48         15               NaN      Tokyo   Sunny
5  2024-06-02         29.9        70          9     Moderate Rain  Singapore   Rainy
6  2024-06-01         21.5        62         11               NaN     Moscow  Cloudy
7  2024-06-02         18.7        85          7        Heavy Rain     Ottawa   Rainy
8  2024-06-01         31.2        55         14               NaN    Beijing   Sunny
9  2024-06-02         27.4        75          6        Light Rain       Rome   Rainy
Maximum temperature: 33.7 °C

Minimum temperature: 18.7 °C
Places with temperature less than 28°C:

['London' 'Berlin' 'Moscow' 'Ottawa' 'Rome']
Places with weather = 'Cloudy':

['London' 'Berlin' 'Moscow']
Weather frequency:
weather
Rainy 4
Cloudy 3
Sunny 3
Name: count, dtype: int64
Output
