Python Lab Manual


Acropolis Institute of Technology and Research

LAB MANUAL
CS-506: Python Lab

BTech - Computer Science and Engineering


Course Educational Objectives

CEO1 To learn and understand Python programming basics and paradigms.

CEO2 To design and analyze Python looping, control statements and string
manipulations.
CEO3 To learn how to create and manipulate strings, lists, tuples and
dictionaries.
CEO4 To analyze and use different libraries such as NumPy and Pandas.
CEO5 To learn file handling for CSV, JSON and other file formats.

Course Outcomes
Upon completion of this subject / course the student will be able to:

CO1 Understand Python syntax and semantics and be fluent in the use of Python flow
control and functions.
CO2 Define and demonstrate the use of built-in data structures and control
structures.
CO3 Identify the methods to create and manipulate strings, lists, tuples and dictionaries.

CO4 Define and demonstrate different functions of the NumPy and Pandas libraries.

CO5 Determine the need for scraping websites and working with CSV, JSON and other file
formats.
CS-506: Python lab
List of Programs:

1. To write a Python program to find GCD of two numbers.
2. To write a Python program to find the square root of a number by Newton's Method.
3. To write a Python program to find the exponentiation of a number.
4. To write a Python program to find the maximum from a list of numbers.
5. To write a Python program to perform linear search.
6. To write a Python program to perform binary search.
7. To write a Python program to perform selection sort.
8. To write a Python program to perform insertion sort.
9. To write a Python program to perform merge sort.
10. To write a Python program to find first n prime numbers.
11. To write a Python program to multiply matrices.
12. To write a Python program for command line arguments.
13. To write a Python program to find the most frequent words in a text read from a file.
14. To write a Python program to simulate elliptical orbits in Pygame.

NumPy, Pandas, Matplotlib and Seaborn

1. Create a dictionary for a student database and perform the following:


1. Display top 3 rows of dataset.
2. Check last 3 rows of dataset.
3. Find shape of dataset.
4. Get information about the dataset, such as total number of rows, total number of columns, datatype of each column and memory requirement.
5. Check null values in the dataset and get overall statistics about the dataframe.
6. Find the total number of students having marks between 90 and 100 using the between method.

2. Perform the following operations on the Ecommerce Purchases dataset.

Download the dataset from

https://www.kaggle.com/datasets/utkarsharya/ecommerce-purchases

1. Display Top 10 Rows of The Dataset


2. Check Last 10 Rows of The Dataset
3. Check Datatype of Each Column
4. Check null values in the dataset
5. How many rows and columns are there in our Dataset?
6. Highest and Lowest Purchase Prices.
7. Average Purchase Price
8. How many people have French 'fr' as their Language?
9. Job Title Contains Engineer
10. Find The Email of the person with the following IP Address: 132.207.160.22
11. How many people have Mastercard as their Credit Card Provider and made a purchase above 50?
12. Find the email of the person with the following Credit Card Number: 4664825258997302
13. How many people purchase during the AM and how many people purchase during PM?
14. How many people have a credit card that expires in 2020?
15. What are the top 5 most popular email providers (e.g. gmail.com, yahoo.com, etc...)

3. Perform the following operations on the Employee Salary dataset.

Download database from

https://www.kaggle.com/datasets/kaggle/sf-salaries

1. Display Top 10 Rows of The Dataset


2. Check Last 10 Rows of The Dataset
3. Find Shape of Our Dataset (Number of Rows And Number of Columns)
4. Getting Information About Our Dataset Like Total Number Rows, Total Number of Columns,
Datatypes of Each Column And Memory Requirement
5. Check Null Values In The Dataset
6. Drop ID, Notes, Agency, and Status Columns
7. Get Overall Statistics About The Dataframe
8. Find Occurrence of The Employee Names (Top 5)
9. Find The Number of Unique Job Titles
10. Total Number of Job Titles That Contain Captain
11. Display All the Employee Names From Fire Department
12. Find Minimum, Maximum, and Average BasePay
13. Replace 'Not Provided' in 'EmployeeName' Column With NaN
14. Drop The Rows Having 5 Missing Values
15. Find Job Title of ALBERT PARDINI
16. How Much Does ALBERT PARDINI Make (Including Benefits)?
17. Display Name of The Person Having The Highest BasePay
18. Find Average BasePay of All Employees Per Year
19. Find Average BasePay of All Employees Per JobTitle
20. Find Average BasePay of Employee Having Job Title ACCOUNTANT
21. Find Top 5 Most Common Jobs

4. Perform the following operations on the Income dataset.

Download database from

https://www.kaggle.com/datasets/wenruliu/adult-income-dataset

1. Display Top 10 Rows of The Dataset


2. Check Last 10 Rows of The Dataset
3. Find Shape of Our Dataset (Number of Rows And Number of Columns)
4. Getting Information About Our Dataset Like Total Number Rows, Total Number of Columns,
Datatypes of Each Column And Memory Requirement
5. Fetch Random Sample From the Dataset (50%)
6. Check Null Values In The Dataset
7. Perform Data Cleaning [ Replace '?' with NaN ]
8. Drop all The Missing Values
9. Check For Duplicate Data and Drop Them
10. Get Overall Statistics About The Dataframe
11. Drop The Columns education-num, capital-gain, and capital-loss
12. What Is The Distribution of Age Column?
13. Find Total Number of Persons Having Age Between 17 To 48 (Inclusive) Using Between Method
14. What is The Distribution of Workclass Column?
15. How Many Persons Having Bachelors and Masters Degree?
16. Bivariate Analysis
17. Replace Salary Values With 0 and 1
18. Which Workclass Is Getting The Highest Salary?
19. Who Has Better Chance To Get Salary Greater Than 50K, Male or Female?
20. Convert workclass Column Datatype To Category Datatype

5. Perform the following operations on Titanic - Machine Learning from Disaster.

Download database from

https://www.kaggle.com/c/titanic

1. Display Top 5 Rows of The Dataset


2. Check the Last 3 Rows of The Dataset
3. Find Shape of Our Dataset (Number of Rows & Number of Columns)
4. Get Information About Our Dataset Like Total Number Rows, Total Number of Columns,
Datatypes of Each Column And Memory Requirement
5. Get Overall Statistics About The Dataframe
6. Data Filtering
7. Check Null Values In The Dataset
8. Drop the Column
9. Handle Missing Values
10. Categorical Data Encoding
11. What is Univariate Analysis? How Many People Survived And How Many Died? How Many Passengers Were In First Class, Second Class, and Third Class? Number of Male And Female Passengers
12. Bivariate Analysis: Who Has Better Chance of Survival, Male or Female? Which Passenger Class Has Better Chance of Survival (First, Second, or Third Class)?
13. Feature Engineering

6. Perform the following operations on the Play Store Apps dataset.

Download database from

https://www.kaggle.com/datasets/lava18/google-play-store-apps

1. Display Top 5 Rows of The Dataset


2. Check the Last 3 Rows of The Dataset
3. Find Shape of Our Dataset (Number of Rows & Number of Columns)
4. Get Information About Our Dataset Like Total Number Rows, Total Number of Columns,
Datatypes of Each Column And Memory Requirement
5. Get Overall Statistics About The Dataframe
6. Total Number of App Titles Contain Astrology
7. Find Average App Rating
8. Find Total Number of Unique Category
9. Which Category Getting The Highest Average Rating?
10. Find Total Number of App having 5 Star Rating
11. Find Average Value of Reviews
12. Find Total Number of Free and Paid Apps
13. Which App Has Maximum Reviews?
14. Display Top 5 Apps Having Highest Reviews
15. Find Average Rating of Free and Paid Apps
16. Display Top 5 Apps Having Maximum Installs
1. To write a Python program to find GCD of two numbers.
In Python, the math module provides a number of mathematical operations. The math.gcd() function
computes the greatest common divisor of the two numbers passed as its arguments.

Algorithm:
Euclidean Algorithm (subtraction form) to find the GCD of two non-negative numbers a and b.

1. Begin
2. if a = 0, then return b; if b = 0, then return a
3. if a = b, then return b
4. if a > b, then
5. return GCD(a-b, b)
6. else return GCD(a, b-a)
7. End
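The steps above can be sketched in Python as follows. This is a minimal illustration, not the manual's official solution: the function names gcd_subtract and gcd_mod are chosen here for the example, and the zero case returns the other operand so that the result agrees with math.gcd().

```python
import math


def gcd_subtract(a, b):
    """Subtraction form of Euclid's algorithm for non-negative integers."""
    if a == 0:
        return b
    if b == 0:
        return a
    # Repeatedly subtract the smaller value from the larger one.
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
    return a


def gcd_mod(a, b):
    """Remainder form of Euclid's algorithm (fewer iterations)."""
    while b:
        a, b = b, a % b
    return a


print(gcd_subtract(48, 18))  # 6
print(gcd_mod(48, 18))       # 6
print(math.gcd(48, 18))      # 6
```

The remainder form performs the same reduction as repeated subtraction but in fewer steps, which matters when one number is much larger than the other.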
2. To write a Python Program to find the square root of a number by Newton's Method.
Newton's method, also known as the Newton–Raphson method, named after Isaac Newton and Joseph
Raphson, is a root-finding algorithm which produces successively better approximations to the roots (or
zeroes) of a real-valued function. For the square root of a number a, it is applied to f(x) = x*x - a,
which gives the update x = 0.5 * (x + a / x).

Algorithm:

def newton_method(number, number_iters=100):
    # Assumes number > 0; the number itself is used as the initial guess.
    a = float(number)
    x = a
    for i in range(number_iters):
        # Newton update for f(x) = x*x - a
        x = 0.5 * (x + a / x)
    return x

a = float(input("Enter a number: "))
print("Square root of the number:", newton_method(a))
3. To write a Python program to find the exponentiation of a number.
In mathematics, an exponent of a number says how many times that number is repeatedly multiplied
with itself (Wikipedia, 2019). We usually express that operation as b^n, where b is the base and n is
the exponent or power. We often call that type of operation "b raised to the n-th power", "b raised to the
power of n", or most briefly "b to the n".

Algorithm:
(x -> number, y -> exponent, where y is a non-negative integer)
1. Assign 1 to a result variable (say r).
2. Loop y times; each time assigning r = r*x.
3. Return the value of r (which is x to the power y).
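A minimal sketch of exponentiation by repeated multiplication is given below. It assumes a non-negative integer exponent and starts the running product at 1, so that exactly y multiplications are performed; the function name power is chosen here for illustration.

```python
def power(x, y):
    """Return x raised to the non-negative integer power y."""
    if y < 0:
        raise ValueError("this sketch only handles non-negative exponents")
    result = 1
    # Multiply by x exactly y times.
    for _ in range(y):
        result *= x
    return result


print(power(2, 10))  # 1024
print(power(5, 0))   # 1
```

Python's built-in ** operator and pow() function give the same result and can be used to check the output.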
4. To write a Python Program to find the maximum from a list of numbers.
Algorithm:

i. Create a variable m.
ii. Assign it the first element of the list, L[0] (a fixed value such as -1000000000 gives a wrong answer if every element is smaller than it).
iii. Now iterate through the list (L), each time comparing the value of m with L[i].
iv. If L[i] > m then
v. assign m = L[i]
vi. Now print the value of m.
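The loop can be sketched as below. As an assumption of this example, m is initialized from the first list element rather than from a fixed very negative number, so the function also works for lists whose values are all very small; find_max is an illustrative name.

```python
def find_max(numbers):
    """Return the largest value in a non-empty list."""
    if not numbers:
        raise ValueError("the list must not be empty")
    m = numbers[0]
    # Compare every remaining element with the current maximum.
    for value in numbers[1:]:
        if value > m:
            m = value
    return m


print(find_max([12, 45, 7, 89, 23]))  # 89
```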
5. To write a Python Program to perform Linear Search.
Linear search is a sequential searching algorithm where we start from one end and check every
element of the list until the desired element is found. It is the simplest searching algorithm.

Algorithm

i. Start
ii. Set i=0
iii. While i<size
iv. If Ar[i]==item
v. Return i and go to step x
vi. End If
vii. Set i=i+1
viii. End While
ix. Return -1.
x. Stop.
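A short sketch of linear search following these steps is shown here. The function name linear_search and the sample data are illustrative; as in the algorithm, -1 signals that the item is not present.

```python
def linear_search(items, target):
    """Return the index of target in items, or -1 if it is absent."""
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1


data = [4, 2, 9, 7]
print(linear_search(data, 9))  # 2
print(linear_search(data, 5))  # -1
```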
6. To write a Python Program to perform binary search.
Binary Search is a searching algorithm for finding an element's position in a sorted array. In this approach,
the element is always searched in the middle of a portion of an array.
Note: Binary search can be implemented only on a sorted list of items. If the elements are not sorted
already, we need to sort them first.

Algorithm

i. Start
ii. Set Low=0
iii. Set High=Size-1
iv. While Low<=High
v. Set Mid=(Low+High)/2 (integer division)
vi. If Item=Ar[Mid]
vii. Return Mid and go to step xv
viii. Else If Item<Ar[Mid]
ix. High=Mid-1
x. Else
xi. Low=Mid+1
xii. End If
xiii. End While
xiv. Return -1
xv. Stop.
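The following sketch implements the steps above for a list that is already sorted in ascending order. The name binary_search is illustrative, and the midpoint uses integer division (//) so that it is always a valid index.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if target < items[mid]:
            high = mid - 1   # continue in the left half
        else:
            low = mid + 1    # continue in the right half
    return -1


data = [1, 3, 5, 7, 9, 11]
print(binary_search(data, 7))  # 3
print(binary_search(data, 4))  # -1
```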
7. To write a Python Program to perform selection sort.
Selection sort is a simple, in-place, comparison-based sorting algorithm in which the list is divided
into two parts: the sorted part at the left end and the unsorted part at the right end. Initially, the
sorted part is empty and the unsorted part is the entire list.

Algorithm:
i. Set MIN to location 0
ii. Search the minimum element in the unsorted part of the list (from location MIN to the end)
iii. Swap it with the value at location MIN
iv. Increment MIN to point to the next element
v. Repeat until the list is sorted
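One possible rendering of selection sort is sketched below. It sorts the list in place and also returns it for convenience; the names selection_sort, start and min_index are chosen for this example.

```python
def selection_sort(items):
    """Sort items in place in ascending order and return the list."""
    n = len(items)
    for start in range(n - 1):
        # Find the smallest element in the unsorted part items[start:].
        min_index = start
        for j in range(start + 1, n):
            if items[j] < items[min_index]:
                min_index = j
        # Move it to the end of the sorted part.
        items[start], items[min_index] = items[min_index], items[start]
    return items


print(selection_sort([64, 25, 12, 22, 11]))  # [11, 12, 22, 25, 64]
```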
8. To write a Python Program to perform insertion sort.
Insertion sort is a sorting algorithm that places an unsorted element at its suitable place in each
iteration. Insertion sort works the way we sort cards in our hand in a card game: we assume that the
first card is already sorted, then we select an unsorted card and insert it at its correct position among
the sorted cards.

Algorithm:
i. If it is the first element, it is already sorted. return 1;
ii. Pick the next element
iii. Compare it with the elements in the sorted sub-list
iv. Shift all the elements in the sorted sub-list that are greater than the value to be sorted
v. Insert the value
vi. Repeat until the list is sorted
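A minimal sketch of insertion sort under these steps is given below. It sorts in place and returns the list; insertion_sort and key are illustrative names.

```python
def insertion_sort(items):
    """Sort items in place in ascending order and return the list."""
    for i in range(1, len(items)):
        key = items[i]          # the next unsorted element
        j = i - 1
        # Shift larger elements of the sorted part one place to the right.
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = key      # insert the element at its place
    return items


print(insertion_sort([9, 5, 1, 4, 3]))  # [1, 3, 4, 5, 9]
```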
9. To write a Python Program to perform Merge sort.
Merge sort is a sorting algorithm based on the divide-and-conquer strategy. It works by recursively
dividing the array into two halves of (nearly) equal size, sorting them, and then merging them. It takes
O(n log n) time in the worst case.

Algorithm:
1. Find the middle index of the array.
Middle = first + (last - first)/2 (integer division)
2. Divide the array at the middle.
3. Call merge sort for the first half of the array:
MergeSort(array, first, middle)
4. Call merge sort for the second half of the array:
MergeSort(array, middle+1, last)
5. Merge the two sorted halves into a single sorted array.
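The divide and merge steps can be sketched as follows. For brevity this version returns a new sorted list using slices instead of passing first and last indices, which is a simplification made for this example; merge and merge_sort are illustrative names.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One of the lists is exhausted; append the rest of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a new list with the elements of items in ascending order."""
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2
    return merge(merge_sort(items[:middle]), merge_sort(items[middle:]))


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```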
10. To write a Python program to find first n prime numbers.
Algorithm:

1. Start
2. Set ct = 0, n = 0, i = 1, j = 1
3. Repeat steps 4 to 12 while n < 10 (here 10 is the number of primes required)
4. j = 1
5. ct = 0
6. Repeat steps 7 to 9 while j <= i
7. if i%j == 0 then
8. ct = ct+1
9. j = j+1
10. if ct == 2 then
11. PRINT i and set n = n+1
12. i = i+1
13. End
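A direct sketch of the divisor-counting approach is shown below. A number is treated as prime when it has exactly two divisors (1 and itself); the function name first_n_primes is illustrative, and the count n is passed as a parameter instead of being fixed.

```python
def first_n_primes(n):
    """Return a list with the first n prime numbers."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # Count how many numbers from 1 to candidate divide it exactly.
        divisors = 0
        for j in range(1, candidate + 1):
            if candidate % j == 0:
                divisors += 1
        if divisors == 2:
            primes.append(candidate)
        candidate += 1
    return primes


print(first_n_primes(10))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Counting every divisor keeps the code close to the algorithm; testing divisors only up to the square root of the candidate would be faster for large n.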
11. To write a Python program to multiply matrices.

Algorithm: matrixMultiply(A, B):


1. Assume dimension of A is (m x n), dimension of B is (p x q)
2. Begin
3. if n is not same as p, then exit; otherwise define C matrix as (m x q), initialized to 0
4. for i in range 0 to m - 1, do
5. for j in range 0 to q - 1, do
6. for k in range 0 to p - 1, do
7. C[i, j] = C[i, j] + (A[i, k] * B[k, j])
8. done
9. done
10. done
11. End
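The triple loop can be sketched with plain nested lists as below. The function name matrix_multiply is illustrative, the matrices are assumed to be non-empty and rectangular, and each product term multiplies an element of A by an element of B.

```python
def matrix_multiply(A, B):
    """Multiply an (m x n) matrix A by a (p x q) matrix B, where n == p."""
    m, n = len(A), len(A[0])
    p, q = len(B), len(B[0])
    if n != p:
        raise ValueError("columns of A must equal rows of B")
    # Result matrix of size (m x q), filled with zeros.
    C = [[0] * q for _ in range(m)]
    for i in range(m):
        for j in range(q):
            for k in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matrix_multiply(A, B))  # [[19, 22], [43, 50]]
```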
12. To write a Python program for command line arguments.

The sys module provides functions and variables used to manipulate different parts of the Python
runtime environment. It gives access to some variables used or maintained by the
interpreter and to functions that interact strongly with the interpreter.
One such variable is sys.argv, which is a simple list structure. Its main properties are:
• It is a list of command line arguments.
• len(sys.argv) provides the number of command line arguments (including the script name).
• sys.argv[0] is the name of the current Python script.
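A small sketch of reading command line arguments with sys.argv follows. The helper describe_args and the file name demo.py mentioned in the usage note are hypothetical and only used for illustration.

```python
import sys


def describe_args(argv):
    """Split an argv-style list into the script name and its arguments."""
    script = argv[0] if argv else ""
    args = argv[1:]
    return script, args


if __name__ == "__main__":
    script, args = describe_args(sys.argv)
    print("Script name:", script)
    print("Number of entries in sys.argv:", len(sys.argv))
    for position, value in enumerate(args, start=1):
        print("Argument", position, "=", value)
```

Saved as demo.py and run as "python demo.py hello 42", the script would report demo.py as the script name and list hello and 42 as arguments. All values in sys.argv are strings, so numbers must be converted with int() or float() before use.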
13. To write a Python program to find the most frequent words in a text read from
a file.
Python provides built-in functions for creating, writing, and reading files. Two types of files can be
handled in Python: normal text files and binary files (stored as raw bytes, 0s and 1s).
1. Text files: In this type of file, each line of text is terminated with a special character called
EOL (End of Line), which is the newline character ('\n') in Python by default.
2. Binary files: In this type of file, there is no terminator for a line, and the data is stored after
converting it into machine-understandable binary form.
Here we are operating on a .txt file in Python. Through this program, we will find the most repeated
word in a file.

Algorithm:
1. Variable maxCount will store the count of the most repeated word.
2. Open the file in read mode using a file pointer.
3. Read a line from the file. Convert each line into lowercase and remove the punctuation marks.
4. Split the line into words and store them in an array.
5. Use two loops to iterate through the array. The outer loop selects the word to be counted.
The inner loop matches the selected word with the rest of the array. If a match is found, increment count by 1.
6. If count is greater than maxCount, store the value of count in maxCount and the corresponding word
in variable word.
7. At the end, maxCount will hold the maximum count and variable word will hold the most repeated word.
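The sketch below counts words with collections.Counter from the standard library instead of the two nested loops described above; this is a substitution made here for brevity, and it gives the same counts. The function names and the sample text are illustrative, and most_frequent_in_file assumes a UTF-8 text file at the given path.

```python
import string
from collections import Counter


def most_frequent_words(text, top=3):
    """Return the top most common words in text as (word, count) pairs."""
    # Lowercase the text and strip punctuation before splitting into words.
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return Counter(cleaned.split()).most_common(top)


def most_frequent_in_file(path, top=3):
    """Read a text file and return its most frequent words."""
    with open(path, encoding="utf-8") as handle:
        return most_frequent_words(handle.read(), top)


sample = "The cat and the dog. The bird!"
print(most_frequent_words(sample, 1))  # [('the', 3)]
```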
14. To write a Python program to simulate elliptical orbits in Pygame.

Algorithm:
1. Define the class Solar system and initialize all the child classes under it.
2. Define the class for Sun and initialize all the attributes it takes as input
3. Define the planet class and provide details of all the various planets and their attributes
4. End the program with source code .
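The geometry behind such a simulation can be sketched without Pygame. The example below only computes points on an axis-aligned ellipse centred on a given point; a Pygame program would draw a planet at each point in turn inside its main loop. The function names, the screen coordinates and the semi-axes are assumptions made for illustration, and placing the centre of the ellipse (rather than a focus) at the Sun is a simplification.

```python
import math


def orbit_position(center, a, b, angle):
    """Point on an ellipse with semi-axes a and b around center."""
    cx, cy = center
    return (cx + a * math.cos(angle), cy + b * math.sin(angle))


def orbit_points(center, a, b, steps):
    """Return `steps` evenly spaced (in angle) points along the orbit."""
    return [
        orbit_position(center, a, b, 2 * math.pi * k / steps)
        for k in range(steps)
    ]


# Hypothetical values: an 800 x 600 window with the Sun at its centre.
points = orbit_points((400, 300), 200, 120, 8)
for x, y in points:
    print(round(x), round(y))
```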
15. To write a Python program to simulate a bouncing ball in Pygame.

Algorithm:
1. Start
2. Set screen size and background color.
3. Set speed of moving ball.
4. Create a graphical window using set_mode()
5. Set caption
6. Load the ball image and create a rectangle area covering the image
7. Use blit() method to copy the pixel color of the ball to the screen
8. Set background color of screen and use flip() method to make all images visible.
9. Move the ball in specified speed.
10. If ball hits the edges of the screen reverse the direction.
11. Create an infinite loop and Repeat steps 9 and 10 until user quits the program
12. Stop.
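The movement and edge check in steps 9 and 10 can be sketched as a plain function, independent of Pygame. In a real program the position would come from the rectangle covering the ball image and the result would be drawn with blit() and flip(); the function name step and the sizes used below are assumptions for illustration.

```python
def step(pos, speed, size, ball):
    """Move the ball one frame and reverse direction at a screen edge.

    pos   -- (x, y) of the ball's top-left corner
    speed -- (dx, dy) movement per frame
    size  -- (width, height) of the screen
    ball  -- (width, height) of the ball image
    """
    x, y = pos[0] + speed[0], pos[1] + speed[1]
    dx, dy = speed
    # Reverse the horizontal direction at the left or right edge.
    if x < 0 or x + ball[0] > size[0]:
        dx = -dx
    # Reverse the vertical direction at the top or bottom edge.
    if y < 0 or y + ball[1] > size[1]:
        dy = -dy
    return (x, y), (dx, dy)


# Hypothetical 40 x 30 screen with a 10 x 10 ball.
pos, speed = (0, 0), (3, 2)
for _ in range(20):
    pos, speed = step(pos, speed, (40, 30), (10, 10))
    print(pos, speed)
```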
Numpy & Pandas
1. Create a dataframe where columns are :- SNo, Name, Enrollment, Mobile and
insert 5 rows in it.
Solu: The required program is :-

import pandas as pd

data= {'SNo':[1,2,3,4,5],'Name':['A','B','C','D','E'],'Enrollment':[101,102,103,104,105],
'Mobile':[123456789,234567891,345678912,456789123,567891234]}

df = pd.DataFrame(data)

print(df)
2. Create a dataframe where columns are :- Student name, MST1 and MST2. Insert
10 rows and apply different operations on it.
Solu: The required program is :-

import pandas as pd

d={ 'Name':pd.Series(['A','B','C','D','E','F','G','H','I','J']),
'MST1':pd.Series([10,15,12,13,10,8,20,19,15,13]),
'MST2':pd.Series([15,16,13,18,19,20,10,5,18,10])
}

df = pd.DataFrame(d)
print(df.describe())
3. Create a dataframe where columns are student name, MST1 and MST2 .Insert 10
rows and apply different operations on it.

Solu: The required program is :-

import pandas as pd
import numpy as np

roll = pd.Series([1,2,3,4,5,6,7,8,9,10])
name = pd.Series(['A','B','C','D','E','F','G','H','I','J'])
mst1 = pd.Series([10,20,30,40,10,15,12,13,17,25])
mst2 = pd.Series([10,20,30,40,10,15,12,13,17,25])

S = pd.DataFrame({'roll': roll, 'name': name, 'mst1': mst1, 'mst2': mst2})
print(S)
3. Create a dataframe where columns are student name, MST1 and MST2 .Insert 10
rows and apply Sorting operations on it.
Solu: The required program is :-

import pandas as pd
import numpy as np

df = pd.DataFrame({'Name':['A','D','C','B','J','K','P','H','G','U'],
'MST1':[10,15,12,14,9,13,20,14,10,11],
'MST2':[13,12,9,8,16,17,19,20,7,18]})

print("Sort on basis of Columns")
sorted_df = df.sort_index(axis=1)
print(sorted_df)

print("Sort on basis of MST1 Marks")
sorted_df1 = df.sort_values(by='MST1')
print(sorted_df1)

print("Sort on basis of MST2 Marks")
sorted_df2 = df.sort_values(by='MST2')
print(sorted_df2)


4. Reshape 5 x 4 array into 2 x 10 array.
Solu: The required program is :-

import numpy as np

A = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16],[17,18,19,20]])

# using reshape function of numpy
B = A.reshape(2,10)

print("Reshaped array : ",B)


5. Flatten 5 x 3 array.
Solu: The required program is :-

import numpy as np

# 5 x 3 array (5 rows, 3 columns)
A = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15]])

# using flatten function of numpy
B = A.flatten()

print("Flattened Array : ",B)


6. Split 6 x 8 array into 3 x 8 and 3 x 8 two arrays.
Solu: The required program is :-

import numpy as np

A = np.array([[0,0,0,0,0,0,0,0],
[1,1,1,1,1,1,1,1],
[0,0,0,0,0,0,0,0],
[1,1,1,1,1,1,1,1],
[0,0,0,0,0,0,0,0],
[1,1,1,1,1,1,1,1]])

# using array_split function of numpy
B = np.array_split(A,2)

print("Two 3*8 arrays are : ")
print('1: ',B[0])
print('2: ',B[1])
7. Plot a Graph for Y = 2x + 3
Solu: The required program is :-

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0,10)
y = 2*x+3

plt.plot(x, y)
plt.title('Graph for y=2x+3')
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.grid()
plt.show()
8. Plot Bar Graph between x and y array.
Solu: The required program is :-

import matplotlib.pyplot as plt

x = [2,3,4,5,6,7,8,9,20,25]
y = [4,9,16,25,36,49,64,81,400,625]

# plt.bar draws a bar graph (plt.plot would draw a line graph instead)
plt.bar(x, y, color='#FF0000')
plt.title('Graph for SQUARE function')
plt.xlabel('x-axis')
plt.ylabel('y-axis')
plt.show()
9. Create a dataframe from excel (Name, mst-1 marks) and print.
Solu: The required program is :-

import pandas as pd

df = pd.read_excel("Student.xlsx")
print(df)
10. Sort dataframe in ascending order.
Solu: The required program is :-

import pandas as pd

df = pd.read_excel("Student.xlsx")

df = df.sort_values("Marks")
print(df)
11. Find max, min, mean value in dataframe.

Solu: The required program is :-

import pandas as pd
df = pd.read_excel("Student.xlsx")

maxx = df['Marks'].max()
minn = df['Marks'].min()
mean = df['Marks'].mean()

print('Maximum marks : ',maxx)
print('Minimum marks : ',minn)
print('Average marks : ',mean)
