Pds Record Document Ds II


21AD321 - PRINCIPLES OF DATA SCIENCE

LABORATORY PRACTICAL RECORD

DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND DATA SCIENCE

SRI SHAKTHI
INSTITUTE OF ENGINEERING AND TECHNOLOGY
An Autonomous Institution, Affiliated to Anna University
Accredited by NAAC with “A” Grade
COIMBATORE - 62

DECEMBER 2024
SRI SHAKTHI
INSTITUTE OF ENGINEERING AND TECHNOLOGY
COIMBATORE - 62.

DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND DATA SCIENCE

CERTIFICATE

Certified that this is a bonafide record of practical work done
by Mr. /Ms. ____________________ bearing Register Number ____________________ of
Third Semester Bachelor of Technology in Artificial Intelligence and Data
Science in the 21AD321 Principles of Data Science Laboratory during the
Academic Year 2024-2025 under our supervision.

Place: Coimbatore

Date:

Staff In-Charge Head of the Department

Submitted for the End Semester Practical Examination held on ………………

Internal Examiner External Examiner


EXP. NO   DATE   LIST OF EXPERIMENTS                                                  PAGE NO   MARKS   SIGN

01               NUMPY SIMPLE IMPLEMENTATION
02               PANDAS-INDEXING AND SELECTION
03               VARIABILITY-RANGE
04               IMPLEMENTATION OF NORMAL CURVE
05               FINDING MEAN AND MEDIAN USING PYTHON
06               THE CORRELATION BETWEEN STUDY HOURS AND EXAM SCORES USING PYTHON
07               BASIC OPERATIONS ON A NUMPY ARRAY: MULTIPLICATION AND DIVISION
08               CREATING A DATAFRAME USING PANDAS
09               CREATING A DATAFRAME USING LISTS
10               Z-SCORES CALCULATION
11               AVERAGE NUMPY
12               PYTHON PROGRAM TO FIND MEAN IN AN ARRAY
13               PYTHON PROGRAM TO FIND MEDIAN IN AN ARRAY
14               CUMULATIVE FREQUENCY
15               FREQUENCY OF EACH UNIQUE ELEMENT IN A LIST USING PYTHON

EX:NO:01
NUMPY SIMPLE IMPLEMENTATION
DATE:

AIM:
Develop a python program to create a one-dimensional array from user input
and perform various comparison and arithmetic operations on it.

ALGORITHM:

STEP 1: Import the numpy module as np.

STEP 2: Initialize an empty list called arr.

STEP 3: Take an integer input from the user and store it in a variable n.

STEP 4: Loop from 0 to n-1, and for each iteration:

STEP 4.1: Take an integer input from the user and store it in a variable ele.

STEP 4.2: Append ele to the list arr.

STEP 5: Convert the list arr to a numpy array and store it in a variable arr1.

STEP 6: Print the result of applying the __le__ method (<=) on arr1 with 8 as the argument.

STEP 7: Print the result of applying the __lt__ method (<) on arr1 with 10 as the
argument.

STEP 8: Print the result of applying the __gt__ method (>) on arr1 with 6 as the argument.

STEP 9: Print the result of applying the __ge__ method (>=) on arr1 with 4 as the
argument.

STEP 10: Print the result of applying the __eq__ method (==) on arr1 with 4 as the
argument.

STEP 11: Print the result of applying the __ne__ method (!=) on arr1 with 4 as the
argument.

STEP 12: Print the result of applying the np.negative function on arr1.
PROGRAM:
import numpy as np  # numerical Python
arr = []
n = int(input("Enter no of elements - "))
for i in range(0, n):
    ele = int(input("Enter the value - "))
    arr.append(ele)
arr1 = np.array(arr)
# Each comparison returns an element-wise Boolean array
print(arr1.__le__(8))   # arr1 <= 8
print(arr1.__lt__(10))  # arr1 < 10
print(arr1.__gt__(6))   # arr1 > 6
print(arr1.__ge__(4))   # arr1 >= 4
print(arr1.__eq__(4))   # arr1 == 4
print(arr1.__ne__(4))   # arr1 != 4
print(np.negative(arr1))  # element-wise negation
OUTPUT:

RESULT:
Thus, a Python program to create a one-dimensional array from user
input and perform various comparison and arithmetic operations on it is executed
successfully and the output is verified.
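
The comparison methods used in this experiment are the methods that Python's comparison operators (<=, <, >, >=, ==, !=) call on a NumPy array, so the operators give the same element-wise results. The sketch below is only an illustration: it assumes a fixed sample array in place of user input, and the values are not taken from the record.

```python
import numpy as np

# Illustrative values standing in for the user's input (an assumption)
arr1 = np.array([2, 4, 6, 8, 10])

# Each operator produces the same Boolean array as the matching method
le_mask = arr1 <= 8   # equivalent to arr1.__le__(8)
eq_mask = arr1 == 4   # equivalent to arr1.__eq__(4)
negated = -arr1       # equivalent to np.negative(arr1)

print(le_mask)
print(eq_mask)
print(negated)
```

The operator form is the one normally seen in NumPy code; the method form is mainly useful for showing what the operators do internally.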
EX:NO:02

DATE: PANDAS-INDEXING AND SELECTION

AIM:
Develop a python program for indexing and selection using pandas.

ALGORITHM:
STEP1: Start the program.
STEP2: Import pandas as pd
STEP3: Create a Series and apply pandas indexing and selection operations to it
STEP4: Display the result
STEP5: Stop the program

PROGRAM:

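import pandas as pd
# Series with explicit string labels as the index
data = pd.Series([0.25, 0.50, 0.75, 1.0], index=['a', 'b', 'c', 'd'])
print(data, '\n')
# Label-based selection (data[1] would rely on a positional fallback
# that recent pandas versions deprecate for labelled Series)
print(data['b'], '\n')
print(data.loc['b'], '\n')
print(data.loc['b':'c'], '\n')
# Position-based selection
print(data.iloc[1], '\n')
print(data.iloc[1:3], '\n')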
OUTPUT:

RESULT:
Thus, the python program for indexing and selection using pandas is
executed successfully and the output is verified.
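
One detail of indexing and selection that is easy to miss is how slices end: a label-based slice with loc includes its end label, while a position-based slice with iloc excludes its end position. A minimal sketch, reusing the Series from this experiment (the variable names by_label and by_position are introduced here only for illustration):

```python
import pandas as pd

data = pd.Series([0.25, 0.50, 0.75, 1.0], index=['a', 'b', 'c', 'd'])

# Label-based slice: the end label 'c' is included
by_label = data.loc['b':'c']

# Position-based slice: the end position 3 is excluded
by_position = data.iloc[1:3]

print(by_label)
print(by_position)
```

Both slices select the elements labelled 'b' and 'c', even though the two slice expressions look different.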
EX NO: 03
DATE: VARIABILITY-RANGE

AIM:
To calculate and display the maximum, minimum, and range of the entered
values.

ALGORITHM:
STEP 1: Initialize an empty list lst.
STEP 2: Prompt the user to enter the number of elements (n).
STEP 3: Loop n times:
STEP 3.1: Prompt the user to enter a value (ele).
STEP 3.2: Append ele to the list lst.
STEP 4: Calculate the minimum value using min(lst) and store it in minimum.
STEP 5: Calculate the maximum value using max(lst) and store it in maximum.
STEP 6: Calculate the range by subtracting the minimum from the
maximum and store it in ran.
STEP 7: Print "Maximum is {maximum}".
STEP 8: Print "Minimum is {minimum}".
STEP 9: Print "Range is {ran}".

PROGRAM:

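lst = []
n = int(input("Enter no of elements: "))
for i in range(n):
    ele = int(input("Enter value: "))
    lst.append(ele)

minimum = min(lst)
maximum = max(lst)
ran = maximum - minimum  # range = largest value - smallest value
print(f"Maximum is {maximum}")
print(f"Minimum is {minimum}")
print(f"Range is {ran}")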
OUTPUT:

RESULT:
The program calculates and displays the maximum, minimum, and range
successfully, and the output was verified.
EX:NO:04
IMPLEMENTATION OF NORMAL CURVE
DATE:

AIM:
To develop a python program to visualize a normal curve.

ALGORITHM:

STEP 1: Import necessary libraries: `numpy`, `matplotlib.pyplot`,
`scipy.stats.norm`, and `statistics`.

STEP 2: Define the x-axis range using `np.arange()` from -20 to 20 with a step
of 0.01.

STEP 3: Calculate the mean of the x-axis values using `statistics.mean()`.

STEP 4: Calculate the standard deviation of the x-axis values using
`statistics.stdev()`.

STEP 5: Generate y-axis values using `norm.pdf()` with the x-axis, mean, and
standard deviation as parameters.

STEP 6: Plot the x-axis against the y-axis using `plt.plot()`.

STEP 7: Set the title of the plot to "First Curve" using `plt.title()`.

STEP 8: Display the plot using `plt.show()`.


PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
import statistics

x_axis = np.arange(-20,20,0.01)
mean = statistics.mean(x_axis)
sd = statistics.stdev(x_axis)
plt.plot(x_axis,norm.pdf(x_axis,mean,sd))
plt.title("First Curve")
plt.show()
OUTPUT:

RESULT:
Thus, a Python program to visualize a normal curve has been executed
and the output has been verified.
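
The curve plotted in this experiment is the normal probability density, f(x) = 1 / (σ√(2π)) · exp(−(x − μ)² / (2σ²)). The sketch below evaluates that formula directly and cross-checks it against the standard library's NormalDist. The helper name normal_pdf and the parameter values are assumptions made for illustration; they are not part of the record's program.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Hypothetical parameters chosen only for illustration
mu, sigma = 0.0, 1.0

# The density is highest at the mean
peak = normal_pdf(mu, mu, sigma)
print(peak)

# Cross-check the hand-written formula against the standard library
reference = NormalDist(mu, sigma).pdf(mu)
print(math.isclose(peak, reference))
```

Because the peak height is 1 / (σ√(2π)), a larger standard deviation gives a lower and wider curve.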
EX.NO:05 FINDING MEAN AND MEDIAN USING PYTHON
DATE:

AIM:

To calculate the mean and median of an array of numbers using NumPy.

ALGORITHM:

STEP 1: Create an array with the values [55, 65, 75, 85, 95, 105, 115].

STEP 2:Use the NumPy function to sum all elements in the array and
divide by the total number of elements.

STEP 3:Use the NumPy function to sort the array and find the middle
value.

STEP 4: If the array has an even number of elements, average the two
middle values to find the median.

STEP 5:Save the calculated mean and median in variables.

STEP 6:Print the mean and median values to the console.

STEP 7:Terminate the program after displaying the results

PROGRAM:

import numpy as np
arr = np.array([55, 65, 75, 85, 95, 105, 115])
mean_value = np.mean(arr)      # sum of the elements divided by their count
median_value = np.median(arr)  # middle value of the sorted array
print(f"The mean of the array is: {mean_value}")
print(f"The median of the array is: {median_value}")
OUTPUT:

The mean of the array is: 85.0


The median of the array is: 85.0

RESULT:

The program successfully calculates the mean and median of the array [55, 65,
75, 85, 95, 105, 115] using NumPy.
The mean is 85.0 and the median is also 85.0 .
EX.NO:06
DATE: THE CORRELATION BETWEEN STUDY HOURS AND
EXAM SCORES USING PYTHON

AIM:

To calculate the correlation between study hours and exam scores using the given
datasets.

ALGORITHM:

STEP 1: Create an array representing study hours: [1,2,3,4,5].

STEP 2:Create an array representing exam scores: [50,60,70,80,90] .

STEP 3:Import the NumPy library to utilize its functions for calculations.

STEP 4: Use NumPy’s corrcoef function to compute the correlation coefficient
between the two arrays.

STEP 5:Retrieve the correlation value from the output of the correlation
coefficient calculation.

STEP 6: Print the correlation coefficient to the console for interpretation.

STEP 7: Assess the correlation value to determine the direction (positive, negative,
or none) and the strength of the relationship between study hours and exam scores.

STEP 8:Terminate the program after displaying and interpreting the results.

PROGRAM:

import numpy as np
study_hours = np.array([1, 2, 3, 4, 5])
exam_scores = np.array([50, 60, 70, 80, 90])
correlation = np.corrcoef(study_hours, exam_scores)[0, 1]
print(f"The correlation between study hours and exam scores is: {correlation}")
OUTPUT:

The correlation between study hours and exam scores is: 1.0

RESULT:

The program calculates the correlation coefficient between study hours and
exam scores as approximately 1.0, indicating a perfect positive correlation.
This means that as study hours increase, exam scores increase linearly: in this
dataset each additional study hour corresponds to 10 more marks.
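
To show what the correlation coefficient measures, the sketch below computes Pearson's r from its definition: the sum of the products of deviations divided by the square root of the product of the summed squared deviations. The helper name pearson_r is introduced here for illustration only, and the code uses plain Python rather than the NumPy call from the program.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

study_hours = [1, 2, 3, 4, 5]
exam_scores = [50, 60, 70, 80, 90]

r = pearson_r(study_hours, exam_scores)
print(r)
```

Reversing one of the lists makes the scores fall as the hours rise, which flips the sign of r.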
EX.NO:07
DATE: BASIC OPERATIONS ON A NUMPY ARRAY:
MULTIPLICATION AND DIVISION

AIM:

To create a NumPy array and perform basic operations such as multiplication and
division.

ALGORITHM:

STEP 1:Import the NumPy library to access its functionalities.

STEP 2:Define a NumPy array with a set of numerical values.

STEP 3:Define a second NumPy array of the same shape to perform operations.

STEP 4:Use the multiplication operator to multiply the two arrays element-wise.

STEP 5:Use the division operator to divide the elements of the first array
by the corresponding elements of the second array.

STEP 6:Ensure that division by zero is handled if applicable.

STEP 7:Print the results of the multiplication and division operations.

STEP 8:Terminate the program after displaying the results.

PROGRAM:

import numpy as np
array1 = np.array([10, 20, 30, 40, 50])
array2 = np.array([2, 4, 6, 8, 10])
multiplication_result = array1 * array2  # element-wise product
division_result = array1 / array2        # element-wise true division (float result)
print("Multiplication Result:", multiplication_result)
print("Division Result:", division_result)
OUTPUT:

Multiplication Result: [ 20 80 180 320 500]


Division Result: [5. 5. 5. 5. 5.]

RESULT:

The program successfully creates two NumPy arrays and performs basic operations.
The multiplication of the arrays results in [20, 80, 180, 320, 500], while the
division yields [5., 5., 5., 5., 5.].
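
The algorithm asks for division by zero to be handled, but the arrays in this experiment contain no zeros, so that case never arises. One possible way to guard against it is sketched below using the out and where parameters of np.divide. The divisor values and the choice of NaN as the placeholder are assumptions made for illustration.

```python
import numpy as np

array1 = np.array([10, 20, 30, 40, 50])
# Hypothetical divisor containing zeros, used only to exercise the guard
array2 = np.array([2, 0, 6, 8, 0])

# Divide only where the divisor is non-zero; the remaining positions keep
# the placeholder value already stored in `out` (NaN here)
safe_result = np.divide(
    array1,
    array2,
    out=np.full(array1.shape, np.nan),
    where=array2 != 0,
)
print(safe_result)
```

Positions with a zero divisor stay NaN, so they can be detected later with np.isnan instead of producing a runtime warning and an infinite value.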
EX.NO:08
DATE: CREATING A DATAFRAME USING PANDAS

AIM :

To create a DataFrame from a dictionary and insert a new column of data.

ALGORITHM:

Step 1: Import the Pandas library.

Step 2: Prepare the data in dictionary format.

Step 3: Create a DataFrame from the dictionary using pd.DataFrame().

Step 4: Insert a new column into the DataFrame.

Step 5: Display the DataFrame.

PROGRAM:

import pandas as pd
# Each dictionary key becomes a column name
data = {'Product': ['Laptop', 'Mobile', 'Tablet'],
        'Price': [1000, 500, 300],
        'Stock': [50, 150, 100]}
df = pd.DataFrame(data)
df['Discount'] = [10, 5, 7]  # new column, one value per existing row
print(df)
OUTPUT:

  Product  Price  Stock  Discount
0  Laptop   1000     50        10
1  Mobile    500    150         5
2  Tablet    300    100         7

RESULT:

The DataFrame was successfully created from a dictionary, and a new column
named "Discount" was inserted. The DataFrame was displayed with the updated data,
showing products, their prices, stock, and discount values.
EX.NO:09
DATE: CREATING A DATAFRAME USING LISTS

AIM :

To create a DataFrame from lists and add a new row.

ALGORITHM:

STEP 1: Import the Pandas library.

STEP 2: Prepare the data as lists.

STEP 3: Create a DataFrame by passing the lists to pd.DataFrame()

STEP 4: Insert a new row using loc[].

STEP 5: Display the updated DataFrame.

PROGRAM:

import pandas as pd
names = ['John', 'Emma', 'Sophia']
ages = [28, 22, 32]
cities = ['New York', 'London', 'Sydney']
# Build the DataFrame column by column from the lists
df = pd.DataFrame({'Name': names, 'Age': ages, 'City': cities})
# Labels 0-2 already exist, so assigning to label 3 appends a new row
df.loc[3] = ['Michael', 26, 'Toronto']
print(df)
OUTPUT:

      Name  Age      City
0     John   28  New York
1     Emma   22    London
2   Sophia   32    Sydney
3  Michael   26   Toronto

RESULT:

The DataFrame was created using lists, and a new row with data for
"Michael" was successfully added using the loc[] indexer. The updated
DataFrame, showing names, ages, and cities, was displayed as expected.
EX.NO:10
DATE: Z-SCORES CALCULATION

AIM :

To standardize data points and compare them across different datasets by converting
them to a common scale.

ALGORITHM:

STEP 1: Calculate the mean of the dataset.

STEP 2: Compute the standard deviation.

STEP 3: Apply the Z-score formula:

Z = (X - μ) / σ

where X is the data point, μ is the mean, and σ is the standard deviation.

PROGRAM :
import numpy as np
data = np.array([10, 12, 23, 23, 16, 23, 21, 16])
mean = np.mean(data)
std_dev = np.std(data)  # population standard deviation (divides by n)
z_scores = (data - mean) / std_dev
print(f"Data: {data}")
print(f"Mean: {mean}")
print(f"Standard Deviation: {std_dev:.2f}")
print(f"Z-Scores: {np.round(z_scores, 2)}")
OUTPUT :
Data: [10 12 23 23 16 23 21 16]
Mean: 18.0
Standard Deviation: 4.90
Z-Scores: [-1.63 -1.22  1.02  1.02 -0.41  1.02  0.61 -0.41]

RESULT :
The program displays the mean, standard deviation, and Z-scores for the dataset.
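
Z-scores depend on which standard deviation is used. np.std divides by n by default (the population form), whereas the sample form divides by n - 1 and gives slightly smaller Z-scores in magnitude. The sketch below compares the two on the same dataset using only the standard library. The names z_population and z_sample are introduced here for illustration.

```python
from statistics import mean, pstdev, stdev

data = [10, 12, 23, 23, 16, 23, 21, 16]
mu = mean(data)

population_sd = pstdev(data)  # divides by n, like np.std(data)
sample_sd = stdev(data)       # divides by n - 1, like np.std(data, ddof=1)

# Z-scores under each convention
z_population = [(x - mu) / population_sd for x in data]
z_sample = [(x - mu) / sample_sd for x in data]

print(round(population_sd, 2), round(sample_sd, 2))
print([round(z, 2) for z in z_population])
print([round(z, 2) for z in z_sample])
```

Whichever convention is chosen, it should be used consistently when Z-scores from different datasets are compared.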
EX.NO:11
DATE: AVERAGE NUMPY

AIM :

To compute the average (mean) of a list of numbers using NumPy.

ALGORITHM :

STEP 1:Import NumPy: You need to have the NumPy library available in your Python
environment.

STEP 2:Create a List or Array of Numbers: Prepare the data for which you want to calculate
the average.

STEP 3:Use NumPy's Mean Function: Apply the np.mean() function to compute the average.

STEP 4:Output the Result: Display or use the computed average.

PROGRAM :

import numpy as np

data = [10, 20, 30, 40, 50]

average = np.mean(data)

print(f"The average is: {average}")


OUTPUT:

The average is: 30.0

RESULT :

The given program is executed and output is verified.


EX.NO:12
DATE: PYTHON PROGRAM TO FIND MEAN IN AN ARRAY

AIM :

Write a python program to find mean in an array.

ALGORITHM :

STEP 1: Get Input from User .

STEP 1.1: Prompt the user to enter a series of numbers separated by spaces

STEP 1.2: Read the input string from the user

STEP 2: Convert Input to List of Integers

STEP 2.1: Split the input string by spaces to create a list of substrings

STEP 2.2: Convert each substring to an integer

STEP 2.3: Form a list of integers from the converted substrings

STEP 3: Calculate the Mean

STEP 3.1: Check if the list of integers is not empty

STEP 3.2: If the list is not empty

STEP 3.2.1: Compute the sum of the integers in the list

STEP 3.2.2: Divide the sum by the number of integers to obtain the mean

STEP 3.3: If the list is empty, set the mean to 0

STEP 4: Display the Mean

STEP 4.1: Print the calculated mean to the user

STEP 5: End.
PROGRAM :

def calculate_mean(array):
    # Return 0 for an empty list to avoid dividing by zero
    return sum(array) / len(array) if array else 0

user_input = input("Enter numbers separated by spaces: ")
array = list(map(int, user_input.split()))
print("Mean:", calculate_mean(array))

OUTPUT :
10 20 30 40
Mean: 25.0

RESULT :

The given program is executed and output is verified.


EX.NO:13
DATE: PYTHON PROGRAM TO FIND MEDIAN IN AN ARRAY

AIM :

Write a python program to find median in an array.

ALGORITHM :

STEP 1: Get Input from User

STEP 1.1: Prompt the user to enter a series of numbers separated by spaces

STEP 1.2: Read the input string from the user

STEP 2: Convert Input to List of Integers

STEP 2.1: Split the input string by spaces to create a list of substrings

STEP 2.2: Convert each substring to an integer

STEP 2.3: Form a list of integers from the converted substrings

STEP 3: Calculate the Median

STEP 3.1: Check if the list of integers is not empty

STEP 3.2: If the list is not empty

STEP 3.2.1: Sort the list of integers

STEP 3.2.2: Find the median:
- If the list has an odd number of elements, the median is the middle element.
- If the list has an even number of elements, the median is the average of the
two middle elements.

STEP 3.3: If the list is empty, set the median to 0


STEP 4: Display the Median

STEP 4.1: Print the calculated median to the user

STEP 5: End

PROGRAM :

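def calculate_median(array):
    if not array:
        return 0
    array.sort()
    n = len(array)
    mid = n // 2
    if n % 2 == 1:  # odd length: the middle element
        return array[mid]
    return (array[mid - 1] + array[mid]) / 2  # even length: mean of the two middle elements

array = list(map(int, input("Enter numbers separated by spaces: ").split()))
print("Median:", calculate_median(array))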

OUTPUT :
10 20 30 40 50
Median: 30

RESULT :

The given program is executed and output is verified.


EX.NO:14
DATE: CUMULATIVE FREQUENCY

AIM :

To compute the cumulative frequency of a dataset using pandas, whose built-in
counting, sorting, and cumulative-sum functions simplify the process.

ALGORITHM :

STEP 1:Import Libraries: Use pandas to handle and process the data.

STEP 2:Create a DataFrame: Convert the dataset into a DataFrame for easier
manipulation.

STEP 3:Calculate Frequencies: Count the occurrences of each unique value using
pandas functions.

STEP 4:Sort Data: Sort the values in ascending order.

STEP 5:Compute Cumulative Frequency: Use the cumulative sum function to compute
cumulative frequencies.

STEP 6:Display Results: Print or display the results in a readable format.

PROGRAM :

import pandas as pd

data = [5, 1, 9, 2, 3, 5, 6, 2, 9, 5, 1, 3, 2, 7, 9]

df = pd.Series(data).value_counts().sort_index().cumsum().reset_index()

df.columns = ['Value', 'Cumulative Frequency']

print(df)
OUTPUT :

   Value  Cumulative Frequency
0      1                     2
1      2                     5
2      3                     7
3      5                    10
4      6                    11
5      7                    12
6      9                    15

RESULT :

Thus, the above program is executed and the output is verified.
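
The same count, sort, and accumulate steps can be written out one at a time, which makes the chained pandas expression easier to follow. The sketch below uses only the standard library (Counter and itertools.accumulate). It is an alternative illustration, not the method used in the record.

```python
from collections import Counter
from itertools import accumulate

data = [5, 1, 9, 2, 3, 5, 6, 2, 9, 5, 1, 3, 2, 7, 9]

counts = Counter(data)                      # frequency of each unique value
values = sorted(counts)                     # unique values in ascending order
frequencies = [counts[v] for v in values]   # frequencies in the same order
cumulative = list(accumulate(frequencies))  # running total of the frequencies

for value, cum in zip(values, cumulative):
    print(value, cum)
```

The last cumulative frequency always equals the number of observations, which is a quick check on the result.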


EX.NO:15
DATE: FREQUENCY OF EACH UNIQUE ELEMENT IN A LIST
USING PYTHON

AIM:

To determine the frequency of each unique element in a list using Python.

ALGORITHM:

STEP 1:Initialize a Data Structure: Use a dictionary to store each unique element and its
corresponding frequency.

STEP 2:Iterate Through the List: Traverse the list and update the frequency count of each
element in the dictionary.

STEP 3:Output the Results: Print or return the dictionary containing the elements and their
frequencies.

PROGRAM:

from collections import Counter

elements = ['apple', 'banana', 'apple', 'orange', 'banana', 'banana']
# Counter is a dictionary subclass that maps each element to its count
frequency = Counter(elements)
print("Element Frequencies:")
for element, count in frequency.items():
    print(f"{element}: {count}")
OUTPUT :

Element Frequencies:

apple: 2

banana: 3

orange: 1

RESULT:

Thus, the above program is executed and output is verified.
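
The algorithm describes building the counts in a dictionary while traversing the list, which is the work Counter does internally. A minimal sketch of that manual approach, using a plain dictionary and the same sample list, is given below for comparison.

```python
elements = ['apple', 'banana', 'apple', 'orange', 'banana', 'banana']

frequency = {}
for element in elements:
    # Start from 0 the first time an element is seen, then add one
    frequency[element] = frequency.get(element, 0) + 1

print("Element Frequencies:")
for element, count in frequency.items():
    print(f"{element}: {count}")
```

Dictionaries preserve insertion order, so the elements are reported in the order they first appear in the list.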
