Ml record_merged (1)

ROCHIS VALLEY, MANIKBANDAR
NIZAMABD -503 003
CERTIFICATE
NAME OF THE LABORATORY________________________________
ACADEMIC YEAR:20___-20___
Certified that is the bonafide record of work done in the_______________

____________laboratory by Mr/Ms _______________________________
__________________________of____Year B.Tech, ___SEM____Branch
With HALLTICKET NO:_____________________during Academic year
20___-20___ and has performed_____no.Of Experiments out of_____no.
Of experiment under my Supervision
LECTURER INCHARGE HEAD OF THE DEPARTMENT

WITH SEAL
DATE EXTERNAL EXAMINER

VIJAY RURAL ENGINEERING COLLEGE
MACHINE LEARNING LAB INDEX
DATE OF
DATE OF LECTURER
S.NO NAME OF THE EXPERIMENT EXPERIMENT PAGE NO. REMARK
SUBMISSION SIGN
PERFORMED
Write a python program to compute Central

Tendency Measures: Mean, Median,
1
Mode Measure of Dispersion: Variance,
Standard Deviation
Study of Python Basic Libraries such as

2
Statistics, Math, Numpy and Scipy
Study of Python Libraries for ML application

3
such as Pandas and Matplotlib
Write a Python program to implement Simple

4
Linear Regression
Implementation of Multiple Linear Regression

5
for House Price Prediction using sklearn
Implementation of Decision tree using sklearn

6
and its parameter tuning
7 Implementation of KNN using sklearn

Implementation of Logistic Regression using
8
sklearn
9 Implementation of K-Means Clustering
Performance analysis of Classification

10
Algorithms on a specific dataset (Mini Project)
MEACHINE LEARNING
LAB MANUAL
III - II
1)Write a python program to compute Central Tendency Measures: Mean, Median,
Mode Measure of Dispersion: Variance, Standard Deviation
Program:
import statistics
def ctm(data):
if not data:
return("no data found")
mean = statistics.mean(data)
median = statistics.median(data)
try:
mode = statistics.mode(data)
except statistics.StatisticsError:
mode= "No unique mode found"
variance = statistics.variance(data)
sd = statistics.stdev(data)
print(f"mean: {mean}")
print(f"median: {median}")
print(f"mode: {mode}")
print(f"varience: {variance}")
print(f"standard division: {sd}")
if __name__ == "__main__":
data = [10,20,30,40,40,50,60,70,80,90]
ctm(data)
Output:
mean: 49
median: 45.0
mode: 40
varience: 676.6666666666666
standard division: 26.01281735350223
2) Study of Python Basic Libraries such as Statistics, Math, Numpy and Scipy
1. Statistics Library
The statistics module in Python provides functions for calculating mathematical statistics of
numeric data.
Key Features:
 Central Tendency Measures:

o mean(data): Arithmetic mean.
o median(data): Median value.
o mode(data): Most common value.
 Spread Measures:
o variance(data): Variance of the data.
o stdev(data): Standard deviation.
 Additional Functions:
o median_low(data): Low median of data.
o median_high(data): High median of data.
o harmonic_mean(data): Harmonic mean of data.
Example:
import statistics as stats
data = [1, 2, 2, 3, 4]
print("Mean:", stats.mean(data))
print("Median:", stats.median(data))
print("Mode:", stats.mode(data))
2. Math Library
The math module provides access to mathematical functions defined by the C standard.
Key Features:
 Basic Math Operations:

o sqrt(x): Square root.
o pow(x, y): x raised to the power y.
o factorial(x): Factorial of x.
 Trigonometric Functions:
o sin(x), cos(x), tan(x): Trigonometric functions.
 Logarithmic Functions:
o log(x, base): Logarithm of x to the specified base.
o log10(x): Base-10 logarithm.
 Constants:
o pi: Mathematical constant π.
o e: Euler's number.
Example:
import math
print("Square root of 16:", math.sqrt(16))

print("Value of Pi:", math.pi)
print("Sine of 90 degrees:", math.sin(math.radians(90)))
3. NumPy
NumPy is a powerful library for numerical computations.
Key Features:
 Arrays:
o numpy.array(): Create arrays.
o numpy.zeros(shape): Create an array of zeros.
o numpy.ones(shape): Create an array of ones.
 Mathematical Operations:
o Element-wise operations on arrays (+, -, *, /).
o Linear algebra functions (dot, cross, linalg).
 Statistical Functions:
o numpy.mean(): Mean of array elements.
o numpy.std(): Standard deviation.
o numpy.median(): Median value.
 Indexing and Slicing:
o Access specific elements or subarrays.
Example:
import numpy as np
arr = np.array([1, 2, 3, 4])

print("Array:", arr)
print("Mean:", np.mean(arr))
print("Standard Deviation:", np.std(arr))
4. SciPy
SciPy builds on NumPy and provides additional functionality for scientific computing.
Key Features:
 Optimization:
o scipy.optimize.minimize(): Minimize a scalar function.
 Integration:
o scipy.integrate.quad(): Numerical integration.
 Linear Algebra:
o scipy.linalg.solve(): Solve linear systems.
 Statistics:
o scipy.stats: Statistical distributions and functions.
 Signal Processing:
o scipy.signal: Signal processing utilities.
Example:
from scipy import stats
data = [1, 2, 2, 3, 4]
print("Mode:", stats.mode(data).mode[0])
Applications:
1. Statistics: Data analysis and summarization.

2. Math: Solving equations, trigonometry, and logarithmic calculations.
3. NumPy: High-performance array manipulation.
4. SciPy: Advanced computation in engineering, machine learning, and scientific domains.
3) Study of Python Libraries for ML application such as Pandas and Matplotlib
Python libraries like Pandas and Matplotlib are essential for Machine Learning (ML)
applications, as they help with data manipulation, analysis, and visualization. Here’s a detailed
overview:
1. Pandas
Pandas is a powerful library for data manipulation and analysis. It provides data structures like
Series and DataFrame, which are widely used in ML for preprocessing and exploration.
Key Features:
 Data Structures:
o Series: One-dimensional labeled array.
o DataFrame: Two-dimensional labeled data structure (like a table).
 Data Manipulation:
o read_csv(), read_excel(): Load datasets from files.
o to_csv(), to_excel(): Save datasets to files.
o Filtering and indexing using .loc[] and .iloc[].
 Data Cleaning:
o Handling missing values: dropna(), fillna().
o Duplicates: drop_duplicates().
 Data Analysis:
o Aggregation: groupby(), pivot_table().
o Statistical methods: .mean(), .std(), .describe().
 Integration with ML Libraries:
o Easily convert DataFrames to NumPy arrays or directly use them in ML libraries
like Scikit-learn.
Example:
import pandas as pd
# Load data
data = pd.read_csv('sample.csv')
# Display first few rows

print(data.head())
# Basic statistics
print(data.describe())
# Handling missing values

data.fillna(0, inplace=True)
2. Matplotlib
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations.
It is often used to visualize data trends and patterns in ML.
Key Features:
 Basic Plotting:
o plot(): Line plots.
o scatter(): Scatter plots.
o bar(): Bar charts.
 Customizations:
o Title, labels, and legends: title(), xlabel(), ylabel(), legend().
o Colors, markers, and styles.
 Subplots:
o Multiple plots in one figure: subplot().
 Integration:
o Works seamlessly with Pandas: Direct plotting from DataFrames.
Example:
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
# Line plot
plt.plot(x, y, label='Line Plot')
plt.title('Basic Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend()
plt.show()
Combined Use of Pandas and Matplotlib in ML:
The combination of Pandas and Matplotlib is particularly useful in the exploratory data analysis
(EDA) phase of ML, where you examine your dataset to identify trends, correlations, and
anomalies.
Example: Data Analysis and Visualization
import pandas as pd
# Load dataset
data = pd.read_csv('sample.csv')
# Check missing values

print(data.isnull().sum())
# Visualize a specific column

data['Age'].hist(bins=10)
plt.title('Age Distribution')
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.show()
# Scatter plot
plt.scatter(data['Height'], data['Weight'], color='blue', alpha=0.5)
plt.title('Height vs. Weight')
plt.xlabel('Height')
plt.ylabel('Weight')
plt.show()
Applications in ML:
1. Pandas:
o Preprocessing: Data cleaning, normalization, and transformation.
o Feature engineering: Creating new features from existing ones.
o Handling time-series data.
2. Matplotlib:
o Visualizing data distributions, trends, and outliers.
o Understanding relationships between features.
o Plotting model performance (e.g., training and validation loss curves)
4)Write a Python program to implement Simple Linear Regression
PROGRAM
Example as pizza
import statistics
#defining sizes
data_items_collected = int(input("enter the parameter of data:"))
pizza_sizes = []
for i in range (data_items_collected):
size_of_pizza = int(input("enter the pizza sizes: "))
pizza_sizes.append(size_of_pizza)
print("list of pizza sizes collected are: ")
for size in pizza_sizes:
print(size)
#defining prizes
prizes = []
no_of_prizes = int(input("enter the pizza prizes: "))
prizes.append(no_of_prizes)
print("list of pizza prizes collected are ")
for prize in prizes:
print(prize)
#mean of sizes(x)n
x_mean = statistics.mean(pizza_sizes)
print(f"mean of sizes:{x_mean}")
#mean of costs(y)
y_mean = statistics.mean(prizes)
print(f"mean of prizes:{y_mean}")
#diviation of sizes
diviation_sizes = []
dev_x = pizza_sizes[i]-x_mean
diviation_sizes.append(dev_x)
print("diviation of x: ")
for dev_s in diviation_sizes:
print(dev_s)
#diviation of prizes
diviation_prizes = []
dev_y = prizes[i]-y_mean
diviation_prizes.append(dev_y)
print("diviation of y:")
for dev_p in diviation_prizes:
print(dev_p)
#product of divitions(pod)
product_of_diviation = []
p_o_d = diviation_sizes[i] * diviation_prizes[i]
product_of_diviation.append(p_o_d)
print("product_of_diviation")
for pod in product_of_diviation:
print(pod)
#sum of product
sum_of_product_of_diviation = sum(product_of_diviation)
print(f"sum of the product: {sum_of_product_of_diviation}")
#square_of_diviation_of_sizes
square_of_diviation_of_sizes = []
sod = diviation_sizes[i]** 2
square_of_diviation_of_sizes.append(sod)
print("square_of_diviation_of_sizes:")
for sodos in square_of_diviation_of_sizes:
print(sodos)
#sum of square_of_diviation_of_sizes_x(sod)
sum_of_square_of_diviation_of_sizes_x = sum(square_of_diviation_of_sizes)
print(f"sum of the product: {sum_of_square_of_diviation_of_sizes_x}")
#m vlaue
slopp_m = sum_of_product_of_diviation/sum_of_square_of_diviation_of_sizes_x
print(f"m:{slopp_m}")
# Y = m * mean of x = > (mean of y)-m * (mean of x) final output is denoted as flag
flag = y_mean - slopp_m * x_mean
print(f"flag value:{flag}")
#doing a pridiction
new_size = int(input("enter the size of pizza you created: "))
prize_pridiction = slopp_m * new_size + flag
print(f"size of {new_size} pizza can be: {prize_pridiction} ")
OUTPUT
enter the parameter of data:3
enter the pizza sizes: 8
list of pizza sizes collected are:
10
12
enter the pizza prizes: 10
list of pizza prizes collected are
10
13
16
mean of sizes:10
mean of prizes:13
diviation of x:
-2
diviation of y:
-3
product_of_diviation
sum of the product: 12
square_of_diviation_of_sizes:
sum of the product: 8
m:1.5
flag value:-2.0
enter the size of pizza you created: 20
size of 20 pizza can be: 28.0

5. Implementation of Multiple Linear Regression for House Price Prediction using sklearn
Program:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
# Sample Data
data = {
'square_feets': [1000, 1500, 2000, 2500, 3000],
'bedrooms': [1, 2, 3, 4, 5],
'bathrooms': [1, 2, 2, 2.5, 3],
'prize': [5000, 8000, 12000, 18000, 25000]
df = pd.DataFrame(data)
print(df)
# Features and Target Variable
X = df[['square_feets', 'bedrooms', 'bathrooms']]
y = df[['prize']]
# Train the Model
model = LinearRegression()
model.fit(X, y)
# User Input
sq_feet = float(input("Enter the area of the plot: "))
bdrooms = float(input("Enter the number of bedrooms: ")) # Converted to float
btrooms = float(input("Enter the number of bathrooms: ")) # Converted to float
# Predict Rent
predict_rent = model.predict(np.array([[sq_feet, bdrooms, btrooms]]))
# Print Prediction
print(f"Predicted rent for {sq_feet} square feet, {bdrooms} bedrooms, and {btrooms} bathrooms
is: ₹{predict_rent[0][0]:,.2f}")
Output:
6. Implementation of Decision tree using sklearn and its parameter tuning
Program
from sklearn.tree import DecisionTreeClassifier
b_tech = ["cse","aiml","ece","eee","mech","civil"]
bse = ["mpcs", "mecs", "ba", "ca", "bba"]
X=[
[1, 450],
[1, 800],
[0, 0],
[1, 250],
[1, 600],
]
y = [0, 1, 2, 3, 1]
clf = DecisionTreeClassifier()
clf.fit(X, y)
emcet_rank = int(input("did you get rank(yes(1)/no(0))"))
if emcet_rank == 1 or emcet_rank == 0:
score_1 = int(input("enter the score you got in emcet: "))

prediction = clf.predict([[emcet_rank, score_1]])[0]
if prediction == 0:
print("You can apply for this courses::", b_tech[len(b_tech)//2:])
elif prediction == 1:
print("You can apply for this courses:", b_tech[:len(b_tech)//2])
print(bse)
print("You are not eligible for application")
else:
print("Invalid decision")
else:
print("Invalid input")
Output:
7. Implementation of KNN using sklearn
Program:
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
marks = [95, 75, 69, 51, 39, 21, 0]
grade = ["a1", "a", "b", "c", "d", "e", "f"]
testing_score = int(input("enter your score: "))
marks_array = np.array(marks).reshape(-1, 1)
k=3
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(marks_array, grade)
distances, indices = knn.kneighbors([[testing_score]])
print(f"Distances: {distances[0]}")
print(f"Indices of the nearest neighbors: {indices[0]}")
predicted_grade = [grade[i] for i in indices[0]]
print(f"The grade for {testing_score} marks is: {predicted_grade[0]}")
Output:
8. Implementation of Logistic Regression using sklearn
from sklearn.linear_model import LogisticRegression
import numpy as np
import pandas as pd
data = {
'in_study_hours' : [2,4,6,8,10,12],
'y_n':[0,0,1,1,1,1]
}
print(data)
x = df[['in_study_hours']]
y = df['y_n']
model = LogisticRegression()
model.fit(x,y)
study_hours = float(input("enter the no of study hours: "))

predict = model.predict(np.array([[study_hours]]))
if predict == 1:
print("pass")
else:
print("fail")
Output:
9. Implementation of K-Means Clustering
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

data = {
'sr_no':["cr1","cr2","cr3","cr4","cr5","cr6","cr7"],
'age':[20,40,30,18,28,35,45],
'amount':[500,1000,800,300,1200,1400,1800]
}
print(df)
x = df[['age','amount']].values
nc = KMeans(n_clusters=3)
nc.fit(x)
test_data = np.array([[13,750]])
pridect = nc.predict(test_data)
print(f"the cloeset class are: {pridect[0]}")
Output:
10. Performance analysis of Classification Algorithms on a specific dataset (Mini Project)
Program:
# Import necessary libraries
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score,
confusion_matrix
# Load Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the data into training and test sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Standardize the features (important for some models like SVM, KNN)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Initialize classifiers
models = {
"Logistic Regression": LogisticRegression(),
"Decision Tree": DecisionTreeClassifier(),
"KNN": KNeighborsClassifier(),
"SVM": SVC(),
"Random Forest": RandomForestClassifier()
}
# Function to evaluate models

def evaluate_model(model, X_train, X_test, y_train, y_test):
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')

recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')
cm = confusion_matrix(y_test, y_pred)
return accuracy, precision, recall, f1, cm
# Create a DataFrame to store results
results = []
# Loop through each model, evaluate and store results
for model_name, model in models.items():

accuracy, precision, recall, f1, cm = evaluate_model(model, X_train, X_test, y_train, y_test)
results.append([model_name, accuracy, precision, recall, f1, cm])
# Convert the results into a DataFrame for easy display

results_df = pd.DataFrame(results, columns=["Model", "Accuracy", "Precision", "Recall", "F1-
Score", "Confusion Matrix"])
# Print the results

print(results_df)
# Plot the performance comparison
metrics = ['Accuracy', 'Precision', 'Recall', 'F1-Score']

for metric in metrics:
plt.figure(figsize=(10, 6))
sns.barplot(x="Model", y=metric, data=results_df)
plt.title(f'Comparison of {metric} across Models')

plt.show()
# Plot confusion matrix for the best model (based on accuracy)
best_model_index = results_df['Accuracy'].idxmax()
best_model_cm = results_df.iloc[best_model_index]['Confusion Matrix']
plt.figure(figsize=(6,6))
sns.heatmap(best_model_cm, annot=True, fmt="d", cmap='Blues', xticklabels=iris.target_names,
yticklabels=iris.target_names)
plt.title(f'Confusion Matrix of {results_df.iloc[best_model_index]["Model"]}')
plt.ylabel('True Label')
plt.xlabel('Predicted Label')
plt.show()
Output

Ml record_merged (1)

Uploaded by

Document Informationclick to expand document information

Copyright:

Available Formats

Ml record_merged (1)

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Ml record_merged (1)

Uploaded by

Copyright:

Available Formats

ROCHIS VALLEY, MANIKBANDAR

NIZAMABD -503 003

NAME OF THE LABORATORY________________________________

Certified that is the bonafide record of work done in the_______________

LECTURER INCHARGE HEAD OF THE DEPARTMENT

DATE EXTERNAL EXAMINER

Write a python program to compute Central

Study of Python Basic Libraries such as

Study of Python Libraries for ML application

Write a Python program to implement Simple

Implementation of Multiple Linear Regression

Implementation of Decision tree using sklearn

7 Implementation of KNN using sklearn

Performance analysis of Classification

Mode Measure of Dispersion: Variance, Standard Deviation

return("no data found")

mode= "No unique mode found"

print(f"standard division: {sd}")

 Central Tendency Measures:

import statistics as stats

 Basic Math Operations:

print("Square root of 16:", math.sqrt(16))

NumPy is a powerful library for numerical computations.

arr = np.array([1, 2, 3, 4])

from scipy import stats

1. Statistics: Data analysis and summarization.

# Display first few rows

# Handling missing values

import matplotlib.pyplot as plt

Combined Use of Pandas and Matplotlib in ML:

Example: Data Analysis and Visualization

# Check missing values

# Visualize a specific column

data_items_collected = int(input("enter the parameter of data:"))

for i in range (data_items_collected):

size_of_pizza = int(input("enter the pizza sizes: "))

print("list of pizza sizes collected are: ")

for size in pizza_sizes:

for i in range (data_items_collected):

no_of_prizes = int(input("enter the pizza prizes: "))

print("list of pizza prizes collected are ")

for prize in prizes:

for i in range (data_items_collected):

for dev_s in diviation_sizes:

for i in range (data_items_collected):

for dev_p in diviation_prizes:

p_o_d = diviation_sizes[i] * diviation_prizes[i]

for pod in product_of_diviation:

print(f"sum of the product: {sum_of_product_of_diviation}")

for i in range (data_items_collected):

for sodos in square_of_diviation_of_sizes:

print(f"sum of the product: {sum_of_square_of_diviation_of_sizes_x}")

flag = y_mean - slopp_m * x_mean

new_size = int(input("enter the size of pizza you created: "))

prize_pridiction = slopp_m * new_size + flag

print(f"size of {new_size} pizza can be: {prize_pridiction} ")

enter the parameter of data:3

enter the pizza sizes: 8

enter the pizza sizes: 10