Ashin ML Record - Merged


KGiSL Institute of Technology

(Affiliated to ANNA University, Chennai and Approved by AICTE, New Delhi)


365, KGiSL Campus, Thudiyalur Road, Saravanampatti
Coimbatore – 641035

Department of Artificial
Intelligence and Data Science

Name :

Register Number :

Regulation : R-2021

Branch : B.Tech – Artificial Intelligence and Data Science

Subject Code/ Title : AD3461 Machine Learning

Semester/ Year : IV/ II


KGiSL Institute of Technology
(Affiliated to ANNA University, Chennai and Approved by AICTE, New Delhi)
365, KGiSL Campus, Thudiyalur Road, Saravanampatti
Coimbatore – 641035


NAME :

CLASS : II YEAR/ IV SEM AI&DS

UNIVERSITY REG NO :

This is to certify that this is a bonafide record of practical work done by


of the Artificial Intelligence and Data Science branch in AD3461 MACHINE LEARNING
during the Fourth Semester of the academic year 2023 - 2024.

FACULTY IN CHARGE HEAD OF THE DEPARTMENT

Submitted during Anna University Practical Examination held on ........................... at


KGiSL Institute of Technology, Coimbatore – 641 035.

INTERNAL EXAMINER EXTERNAL EXAMINER


INDEX

EX.NO   DATE   NAME OF THE EXPERIMENT                    PAGE NO   MARKS   SIGN OF FACULTY

  1            CANDIDATE ELIMINATION ALGORITHM
  2            ID3 ALGORITHM
  3            BACK PROPAGATION ALGORITHM
  4            NAÏVE BAYESIAN CLASSIFIER (DATASET)
  5            NAÏVE BAYESIAN CLASSIFIER MODEL
  6            BAYESIAN NETWORK
  7            EM ALGORITHM (K-MEANS ALGORITHM)
  8            K-NEAREST NEIGHBOUR ALGORITHM
  9            LOCALLY WEIGHTED REGRESSION ALGORITHM

EX.NO: 01 IMPLEMENTATION OF CANDIDATE-ELIMINATION ALGORITHM

DATE:

AIM:
To implement and demonstrate the Candidate-Elimination algorithm, for a given set of training
data examples stored in a .CSV file, to output a description of the set of all hypotheses consistent with the
training examples.

ALGORITHM:

Step 1: Load Data set.

Step 2: Initialize General Hypothesis and Specific Hypothesis.

Step 3: For each training example:

Step 4: If the example is positive:

            if attribute_value == hypothesis_value:
                do nothing
            else:
                replace the attribute value with '?' (i.e. generalize it)

Step 5: If the example is negative:

            make the general hypothesis more specific.
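
As an illustration of Steps 4 and 5, a minimal Python trace on hypothetical EnjoySport-style attribute values (not the record's actual dataset):

# Step 4: generalize the specific hypothesis S on a positive example
S = ['Sunny', 'Warm', 'Normal', 'Strong']            # current most specific hypothesis
positive = ['Sunny', 'Warm', 'High', 'Strong']       # a positive training example
S = [s if s == v else '?' for s, v in zip(S, positive)]
print(S)                                             # ['Sunny', 'Warm', '?', 'Strong']

# Step 5: specialize the general hypothesis G on a negative example
negative = ['Rainy', 'Cold', 'High', 'Strong']
G = [['?'] * i + [S[i]] + ['?'] * (len(S) - i - 1)
     for i in range(len(S)) if S[i] != '?' and S[i] != negative[i]]
print(G)                                             # [['Sunny', '?', '?', '?'], ['?', 'Warm', '?', '?']]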

SOURCE CODE:

import pandas as pd
import numpy as np

data = pd.read_csv("trainingdata.csv")
concepts = data.iloc[:, 0:-1]
target = data.iloc[:, -1]

def learn(concepts, target):
    specific_h = concepts.iloc[0].copy()
    print("\nInitialization of specific_h and general_h")
    print(specific_h)

    general_h = pd.DataFrame([["?" for _ in range(len(specific_h))] for _ in range(len(specific_h))],
                             columns=specific_h.index)
    print(general_h)

    for i, h in concepts.iterrows():
        if target[i] == "Yes":
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h.iloc[x, x] = '?'
        if target[i] == "No":
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h.iloc[x, x] = specific_h[x]
                else:
                    general_h.iloc[x, x] = '?'
        print("\nSteps of Candidate Elimination Algorithm", i + 1)
        print(specific_h)
        print(general_h)

    general_h = general_h.loc[~(general_h == '?').all(axis=1)]
    return specific_h, general_h

s_final, g_final = learn(concepts, target)
print("\nFinal Specific_h:")
print(s_final)
print("\nFinal General_h:")
print(g_final)



OUTPUT:



QR Codes:

GitHub: Experience the power of prediction through the candidate elimination algorithm. Explore the code on GitHub for seamless integration and advanced forecasting capabilities.

Explanation Video: Explore concept learning with ease. Our app harnesses the power of the candidate elimination algorithm to iteratively refine hypotheses based on data. Watch how it works.

App: Predict weather patterns seamlessly with our Streamlit app leveraging the candidate elimination algorithm. Witness dynamic hypothesis refinement for accurate forecasts.

Blog: Unlock the power of prediction with our latest blog post. Dive into the depths of the candidate elimination algorithm and discover how it enhances forecasting accuracy.



RESULT:

Thus, the Candidate-Elimination algorithm, which finds all hypotheses consistent with the training set, was
executed and verified successfully using Python.



EX. NO: 02 IMPLEMENTATION OF DECISION TREE IN ID3 ALGORITHM

DATE:

AIM:
To build a decision tree using the ID3 algorithm and classify a new sample using Python.

ALGORITHM:

Step 1: Observe the dataset. Import the necessary basic python libraries.

Step 2: Read the dataset.

Step 3: Calculate the Entropy of the whole dataset.

Step 4: Calculate the Entropy of the filtered dataset.

Step 5: Calculate the Information gain for the feature (outlook).

Step 6: Finding the most informative feature (feature with highest information gain).

Step 7: Adding a node to the tree.

Step 8: Perform ID3 algorithm and generate a tree.

Step 9: Finding unique classes of the label.

Step 10: Predicting from the tree.

Step 11: Evaluating the test dataset.

Step 12: Checking the test dataset.
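
Before the full program, a small sketch of Steps 3-5 (entropy of the whole set, entropy of the filtered subsets, and the information gain of a feature) on a hypothetical 'Outlook' column; the values are assumptions, not the record's dataset:

import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only
df = pd.DataFrame({'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rain', 'Rain', 'Overcast'],
                   'PlayTennis': ['No', 'No', 'Yes', 'Yes', 'No', 'Yes']})

def entropy(col):
    _, counts = np.unique(col, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

total = entropy(df['PlayTennis'])                          # Step 3: entropy of the whole dataset
weighted = sum(len(sub) / len(df) * entropy(sub['PlayTennis'])
               for _, sub in df.groupby('Outlook'))        # Step 4: entropy of the filtered subsets
print('Information gain for Outlook:', total - weighted)   # Step 5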



SOURCE CODE:

import streamlit as st
import numpy as np
import pandas as pd

class Node:
    def __init__(self, feature=None, value=None, result=None):
        self.feature = feature
        self.value = value
        self.result = result
        self.children = {}

class DecisionTreeID3:
    def __init__(self):
        self.root = None

    def entropy(self, data):
        _, counts = np.unique(data, return_counts=True)
        probabilities = counts / len(data)
        return -np.sum(probabilities * np.log2(probabilities))

    def information_gain(self, data, feature_name, target_name):
        total_entropy = self.entropy(data[target_name])
        unique_values = data[feature_name].unique()
        weighted_entropy = 0
        for value in unique_values:
            subset = data[data[feature_name] == value]
            weighted_entropy += len(subset) / len(data) * self.entropy(subset[target_name])
        return total_entropy - weighted_entropy

    def build_tree(self, data, features, target_name):
        if len(data) == 0:
            return None
        if len(data[target_name].unique()) == 1:
            return Node(result=data[target_name].iloc[0])
        information_gains = [(feature, self.information_gain(data, feature, target_name)) for feature in features]
        best_feature, _ = max(information_gains, key=lambda x: x[1])
        root = Node(feature=best_feature)
        for value in data[best_feature].unique():
            subset = data[data[best_feature] == value]
            root.children[value] = self.build_tree(subset, [f for f in features if f != best_feature], target_name)
        return root

    def fit(self, data, target_name):
        features = [col for col in data.columns if col != target_name]
        self.root = self.build_tree(data, features, target_name)

    def predict_instance(self, instance, node):
        if node.result is not None:
            return node.result
        value = instance[node.feature]
        if value not in node.children:
            return None
        return self.predict_instance(instance, node.children[value])

    def predict(self, data):
        predictions = []
        for index, row in data.iterrows():
            result = self.predict_instance(row, self.root)
            predictions.append(result)
        return predictions

def main():
    st.title("Decision Tree ID3 with Streamlit")

    # Upload dataset file
    uploaded_file = st.file_uploader("Upload Dataset", type=["csv"])
    if uploaded_file is not None:
        # Read the dataset
        data = pd.read_csv(uploaded_file)
        df = pd.DataFrame(data)
        # Initialize and train the DecisionTreeID3 model
        model = DecisionTreeID3()
        model.fit(df, 'PlayTennis')
        # Make predictions
        predictions = model.predict(df)
        st.write("Predictions:", predictions)

if __name__ == "__main__":
    main()



OUTPUT:



QR Codes:

GitHub: Experience the power of prediction through the ID3 algorithm. Explore the code on GitHub for seamless integration and advanced forecasting capabilities.

Explanation Video: Explore concept learning with ease. Our app harnesses the power of the ID3 algorithm to iteratively refine hypotheses based on data. Watch how it works.

App: Predict weather patterns seamlessly with our Streamlit app leveraging the ID3 algorithm. Witness dynamic hypothesis refinement for accurate forecasts.

Blog: Unlock the power of prediction with our latest blog post. Dive into the depths of the ID3 algorithm and discover how it enhances forecasting accuracy.



RESULT:

Thus, the program to implement the decision tree based ID3 algorithm using Python was executed
and verified successfully.



EX.NO:03 IMPLEMENTATION OF BACK PROPAGATION ALGORITHM

DATE:

AIM:
To implement the Back Propagation algorithm to build an Artificial Neural Network.

ALGORITHM:

Step 1: Inputs X arrive through the preconnected path.

Step 2: The input is modelled using real weights W. The weights are usually selected randomly.

Step 3: Calculate the output for every neuron, from the input layer through the hidden layers to the output layer.

Step 4: Calculate the error in the outputs.

Step 5: Travel back from the output layer to the hidden layer and adjust the weights so that the error decreases.
Keep repeating the process until the desired output is achieved.
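
As an illustration of the Step 4-5 update (the delta rule for a sigmoid unit), a minimal sketch with hypothetical values; the full network program follows under SOURCE CODE:

from math import exp

# One sigmoid neuron with two inputs and a bias (hypothetical illustrative values)
weights = [0.4, -0.2, 0.1]            # w1, w2, bias
inputs, target, l_rate = [1.0, 0.5], 1.0, 0.5

activation = weights[-1] + sum(w * x for w, x in zip(weights[:-1], inputs))
output = 1.0 / (1.0 + exp(-activation))                # forward pass (Step 3)
delta = (output - target) * output * (1.0 - output)    # error times transfer derivative (Step 4)
weights = [w - l_rate * delta * x for w, x in zip(weights[:-1], inputs)] + [weights[-1] - l_rate * delta]
print(weights)                                          # adjusted weights (Step 5)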

SOURCE CODE:

from math import exp
from random import seed
from random import random

# Initialize a network
def initialize_network(n_inputs, n_hidden, n_outputs):
    network = list()
    hidden_layer = [{'weights': [random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights': [random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
    network.append(output_layer)
    return network

# Calculate neuron activation for an input
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights) - 1):
        activation += weights[i] * inputs[i]
    return activation

# Transfer neuron activation
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))

# Forward propagate input to a network output
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'], inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

# Calculate the derivative of a neuron output
def transfer_derivative(output):
    return output * (1.0 - output)

# Backpropagate error and store in neurons
def backward_propagate_error(network, expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network) - 1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i + 1]:
                    error += (neuron['weights'][j] * neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(neuron['output'] - expected[j])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])

# Update network weights with error
def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] -= l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] -= l_rate * neuron['delta']

# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch, n_outputs):
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            outputs = forward_propagate(network, row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            sum_error += sum([(expected[i] - outputs[i]) ** 2 for i in range(len(expected))])
            backward_propagate_error(network, expected)
            update_weights(network, row, l_rate)
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))

# Test training backprop algorithm
seed(1)
dataset = [[2.7810836, 2.550537003, 0],
           [1.465489372, 2.362125076, 0],
           [3.396561688, 4.400293529, 0],
           [1.38807019, 1.850220317, 0],
           [3.06407232, 3.005305973, 0],
           [7.627531214, 2.759262235, 1],
           [5.332441248, 2.088626775, 1],
           [6.922596716, 1.77106367, 1],
           [8.675418651, -0.242068655, 1],
           [7.673756466, 3.508563011, 1]]
n_inputs = len(dataset[0]) - 1
n_outputs = len(set([row[-1] for row in dataset]))
network = initialize_network(n_inputs, 2, n_outputs)
# Train for 20 epochs with learning rate 0.5 (illustrative values)
train_network(network, dataset, 0.5, 20, n_outputs)
for layer in network:
    print(layer)



OUTPUT:



QR Codes:

GitHub: Experience the power of prediction through the backpropagation algorithm. Explore the code on GitHub for seamless integration and advanced forecasting capabilities.

Explanation Video: Explore concept learning with ease. Our app harnesses the power of the backpropagation algorithm to iteratively refine hypotheses based on data. Watch how it works.

App: Predict weather patterns seamlessly with our Streamlit app leveraging the backpropagation algorithm. Witness dynamic hypothesis refinement for accurate forecasts.

Blog: Unlock the power of prediction with our latest blog post. Dive into the depths of the backpropagation algorithm and discover how it enhances forecasting accuracy.



RESULT:

Thus, the Back propagation algorithm to build an Artificial Neural network was implemented
successfully.



EX.NO:04 IMPLEMENTATION OF NAÏVE BAYESIAN CLASSIFIER

DATE:

AIM:

To implement a Naïve Bayesian classifier for the Tennis data set and to compute the accuracy on a few datasets.

ALGORITHM:

Step 1: Convert the data set into a frequency table.


Step 2: Create a likelihood table by finding probabilities such as overcast probability = 0.29 and the probability
of playing = 0.64.
Step 3: Now, use the Naive Bayesian equation to calculate the posterior probability for each class. The class with
the highest posterior probability is the outcome of the prediction.
Step 4: Exit.
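
For example, taking the illustrative figures from Step 2 (and assuming a likelihood P(Overcast | Yes) of 0.44, a value not taken from the record's dataset), the posterior in Step 3 works out as:

# P(Yes | Overcast) = P(Overcast | Yes) * P(Yes) / P(Overcast)
p_overcast_given_yes = 0.44   # assumed likelihood from the frequency table (illustrative)
p_yes = 0.64                  # prior probability of playing (from Step 2)
p_overcast = 0.29             # evidence (from Step 2)

posterior = p_overcast_given_yes * p_yes / p_overcast
print(posterior)              # about 0.97; the class with the highest posterior is the prediction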

SOURCE CODE:

import streamlit as st
import pandas as pd
import numpy as np

class NaiveBayesClassifier:
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.parameters = {}
        for i, c in enumerate(self.classes):
            X_c = X[np.where(y == c)]
            self.parameters[c] = {
                'mean': X_c.mean(axis=0),
                'var': X_c.var(axis=0),
                'prior': X_c.shape[0] / X.shape[0]
            }

    def predict(self, X):
        posteriors = []
        for x in X:
            posteriors.append([self._posterior(x, c) for c in self.classes])
        return self.classes[np.argmax(posteriors, axis=1)]

    def _posterior(self, x, c):
        mean = self.parameters[c]['mean']
        var = self.parameters[c]['var']
        prior = self.parameters[c]['prior']
        posterior = np.sum(-0.5 * np.log(2. * np.pi * var) - ((x - mean) ** 2) / (2. * var))
        return posterior + np.log(prior)

def main():
    st.title("Tennis Data Classifier")

    # File upload
    uploaded_file = st.file_uploader("Upload CSV file", type=['csv'])
    if uploaded_file is not None:
        try:
            data = pd.read_csv(uploaded_file)
            st.write("The first 5 rows of data:")
            st.write(data.head())
            X = data.iloc[:, :-1]
            y = data.iloc[:, -1]
            # Convert categorical data to numerical
            for col in X.columns:
                X[col] = X[col].astype('category').cat.codes
            y = y.astype('category').cat.codes
            # Split data into train and test sets
            split_ratio = 0.8
            indices = np.random.permutation(len(X))
            train_size = int(len(X) * split_ratio)
            train_idx, test_idx = indices[:train_size], indices[train_size:]
            X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
            y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]
            # Train classifier
            classifier = NaiveBayesClassifier()
            classifier.fit(X_train.to_numpy(), y_train.to_numpy())
            # Predict and evaluate
            y_pred = classifier.predict(X_test.to_numpy())
            accuracy = np.mean(y_pred == y_test.to_numpy())
            st.write(f"Accuracy: {accuracy:.2f}")
        except Exception as e:
            st.error(f"An error occurred: {e}")

if __name__ == "__main__":
    main()



OUTPUT:



QR Codes:

GitHub: Experience the power of prediction through the naïve Bayesian classifier (dataset). Explore the code on GitHub for seamless integration and advanced forecasting capabilities.

Explanation Video: Explore concept learning with ease. Our app harnesses the power of the naïve Bayesian classifier (dataset) to iteratively refine hypotheses based on data. Watch how it works.

App: Predict weather patterns seamlessly with our Streamlit app leveraging the naïve Bayesian classifier (dataset). Witness dynamic hypothesis refinement for accurate forecasts.

Blog: Unlock the power of prediction with our latest blog post. Dive into the depths of the naïve Bayesian classifier (dataset) and discover how it enhances forecasting accuracy.



RESULT:

Thus, the program to implement the Naïve Bayesian classifier and compute its accuracy on a few datasets using
Python was executed and verified successfully.



EX.NO: 05
NAÏVE BAYESIAN CLASSIFIER FOR DOCUMENT CLASSIFICATION
DATE:

AIM:

To classify a set of documents using a Naïve Bayesian classifier and to measure the accuracy and precision.

ALGORITHM:

Step 1: Import basic libraries.

Step 2: Importing the dataset.

Step 3: Data preprocessing.

Step 4: Training the model.

Step 5: Testing and evaluation of the model.

Step 6: Visualizing the model.
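
A minimal sketch of the preprocessing and scoring idea behind Steps 3-5, using two hypothetical documents (the labels, vectorization and Laplace smoothing mirror the program below, but the data are made up):

import numpy as np
from collections import Counter

docs = ['good great movie', 'bad boring movie']   # hypothetical labelled documents
labels = np.array([1, 0])                          # 1 = pos, 0 = neg

vocab = sorted({w for d in docs for w in d.split()})
counts = np.array([[Counter(d.split())[w] for w in vocab] for d in docs])

# Laplace-smoothed log-likelihoods per class and log priors
feature_count = np.array([counts[labels == c].sum(axis=0) for c in (0, 1)])
feature_log_prob = np.log((feature_count + 1) / (feature_count.sum(axis=1, keepdims=True) + len(vocab)))
class_log_prior = np.log(np.bincount(labels) / labels.size)

test = np.array([[Counter('great movie'.split())[w] for w in vocab]])
print((test @ feature_log_prob.T + class_log_prior).argmax(axis=1))   # predicted class (1 = pos)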

SOURCE CODE:

import pandas as pd
import numpy as np
from collections import Counter
import math
import streamlit as st

# File uploader
uploaded_file = st.file_uploader("Choose a CSV file", type="csv")
if uploaded_file is not None:
    # Load data
    msg = pd.read_csv(uploaded_file, names=['message', 'label'])
    st.write("Total Instances of Dataset: ", msg.shape[0])
    msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})

    X = msg.message
    y = msg.labelnum

    # Split data (using numpy)
    def train_test_split(X, y, test_size=0.25, random_state=None):
        if random_state:
            np.random.seed(random_state)
        indices = np.arange(X.shape[0])
        np.random.shuffle(indices)
        split_idx = int(X.shape[0] * (1 - test_size))
        train_idx, test_idx = indices[:split_idx], indices[split_idx:]
        return X.iloc[train_idx], X.iloc[test_idx], y.iloc[train_idx], y.iloc[test_idx]

    Xtrain, Xtest, ytrain, ytest = train_test_split(X, y)

    # Count Vectorization
    def count_vectorize(corpus, vocab=None):
        if vocab is None:
            vocab = Counter()
            for doc in corpus:
                vocab.update(doc.split())
            vocab = sorted(vocab.keys())

        def vectorize(doc):
            vec = np.zeros(len(vocab))
            word_counts = Counter(doc.split())
            for i, word in enumerate(vocab):
                vec[i] = word_counts[word]
            return vec

        return np.array([vectorize(doc) for doc in corpus]), vocab

    Xtrain_dm, vocab = count_vectorize(Xtrain)
    Xtest_dm, _ = count_vectorize(Xtest, vocab)

    # Custom Multinomial Naive Bayes
    class MultinomialNB:
        def fit(self, X, y):
            self.classes = np.unique(y)
            self.class_count = np.array([np.sum(y == c) for c in self.classes])
            self.feature_count = np.array([np.sum(X[y == c], axis=0) for c in self.classes])
            self.feature_log_prob = np.log((self.feature_count + 1) /
                                           (self.feature_count.sum(axis=1, keepdims=True) + X.shape[1]))
            self.class_log_prior = np.log(self.class_count / y.size)

        def predict(self, X):
            jll = X @ self.feature_log_prob.T + self.class_log_prior
            return self.classes[np.argmax(jll, axis=1)]

    clf = MultinomialNB()
    clf.fit(Xtrain_dm, ytrain)
    pred = clf.predict(Xtest_dm)

    # Print predictions (over Xtest, not Xtrain)
    for doc, p in zip(Xtest, pred):
        p = 'pos' if p == 1 else 'neg'
        st.write(f"{doc} -> {p}")

    # Accuracy Metrics
    def accuracy_score(y_true, y_pred):
        return np.mean(y_true == y_pred)

    def confusion_matrix(y_true, y_pred):
        classes = np.unique(y_true)
        cm = np.zeros((classes.size, classes.size), dtype=int)
        for i, j in zip(y_true, y_pred):
            cm[i, j] += 1
        return cm

    def precision_score(y_true, y_pred):
        cm = confusion_matrix(y_true, y_pred)
        return np.diag(cm) / cm.sum(axis=0)

    def recall_score(y_true, y_pred):
        cm = confusion_matrix(y_true, y_pred)
        return np.diag(cm) / cm.sum(axis=1)

    st.write('Accuracy Metrics: \n')
    st.write('Accuracy: ', accuracy_score(ytest, pred))
    st.write('Recall: ', recall_score(ytest, pred))
    st.write('Precision: ', precision_score(ytest, pred))
    st.write('Confusion Matrix: \n', confusion_matrix(ytest, pred))
else:
    st.write("Please upload a CSV file.")



OUTPUT:



QR Codes:

GitHub: Experience the power of prediction through the naïve Bayesian classifier model. Explore the code on GitHub for seamless integration and advanced forecasting capabilities.

Explanation Video: Explore concept learning with ease. Our app harnesses the power of the naïve Bayesian classifier model to iteratively refine hypotheses based on data. Watch how it works.

App: Predict weather patterns seamlessly with our Streamlit app leveraging the naïve Bayesian classifier model. Witness dynamic hypothesis refinement for accurate forecasts.

Blog: Unlock the power of prediction with our latest blog post. Dive into the depths of the naïve Bayesian classifier model and discover how it enhances forecasting accuracy.



RESULT:

Thus, the accuracy and precision were measured by Naïve Bayesian classifier model.



EX.NO:06 BAYESIAN NETWORK TO DIAGNOSE CORONA INFECTION

DATE:

AIM:
To construct a Bayesian network to diagnose corona infection using the WHO data set.

ALGORITHM:

Step 1: Separate by Class.

Step 2: Summarize Dataset.

Step 3: Summarize Data by Class.

Step 4: Gaussian Probability Density Function.

Step 5: Class Probabilities.
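
Because the program below only stubs out the prediction step, here is a small sketch of Steps 4-5 (Gaussian probability density and class probabilities) with hypothetical per-class summaries; the feature, means, variances and priors are assumptions for illustration only:

import numpy as np

def gaussian_pdf(x, mean, var):
    # Gaussian probability density function (Step 4)
    return np.exp(-((x - mean) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var)

# Hypothetical per-class summaries for one feature (e.g. body temperature)
summaries = {'positive': (38.5, 0.6), 'negative': (36.8, 0.3)}   # (mean, variance)
priors = {'positive': 0.3, 'negative': 0.7}

x = 38.0
scores = {c: priors[c] * gaussian_pdf(x, m, v) for c, (m, v) in summaries.items()}
print(max(scores, key=scores.get))   # class with the highest probability (Step 5)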

SOURCE CODE:

import pandas as pd
import streamlit as st

# Define your BayesianNetwork class here
class BayesianNetwork:
    def __init__(self, data):
        self.data = data
        # Add your Bayesian Network initialization logic here

    def predict(self, symptoms):
        # Add your prediction logic based on symptoms here
        # This is just a placeholder
        return 1  # Placeholder for positive diagnosis

# File uploader for WHO dataset
uploaded_file = st.file_uploader("Upload WHO Dataset (CSV)", type=["csv"])

if uploaded_file is not None:
    data = pd.read_csv(uploaded_file)
    st.write("Dataset Preview:")
    st.write(data.head())

    # Instantiate Bayesian Network
    bayesian_network = BayesianNetwork(data)

    # Streamlit app
    st.title('Corona Infection Diagnosis')

    # User input for symptoms
    symptoms = {}
    for feature in data.columns[:-1]:  # Exclude 'Diagnosis'
        symptoms[feature] = st.selectbox(f'Select {feature}', data[feature].unique())

    # Predict diagnosis
    if st.button('Diagnose'):
        diagnosis = bayesian_network.predict(symptoms)
        st.write(f'The diagnosis based on the symptoms is: {"Positive" if diagnosis == 1 else "Negative"}')
else:
    st.write("Please upload a CSV file.")



OUTPUT:



QR Codes:

GitHub: Experience the power of prediction through the Bayesian network. Explore the code on GitHub for seamless integration and advanced forecasting capabilities.

Explanation Video: Explore concept learning with ease. Our app harnesses the power of the Bayesian network to iteratively refine hypotheses based on data. Watch how it works.

App: Predict weather patterns seamlessly with our Streamlit app leveraging the Bayesian network. Witness dynamic hypothesis refinement for accurate forecasts.

Blog: Unlock the power of prediction with our latest blog post. Dive into the depths of the Bayesian network and discover how it enhances forecasting accuracy.



RESULT:

Thus, the program to diagnose corona infection using Bayesian network was successfully implemented
using python.



EX.NO: 07 IMPLEMENTATION OF K-MEANS ALGORITHM FOR CLUSTERING

DATE:

AIM:

To compare the clustering of the EM algorithm and the K-means algorithm using the same data sets.

ALGORITHM:

The K-means implementation is as follows:

Step 1: Choose the number of clusters k.

Step 2: Select k random points from the data as centroids.

Step 3: Assign all the points to the closest cluster centroid.

Step 4: Recompute the centroids of newly formed clusters.

Step 5: Repeat steps 3 and 4.

The EM implementation is as follows:

Step 1: Expectation step (E-step): Estimate (guess) all missing values in the dataset, so that after completing
this step there are no missing values.
Step 2: Maximization step (M-step): Use the data estimated in the E-step to update the parameters.
Step 3: Repeat E-step and M-step until the convergence of the values occurs.
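
The program below uses scikit-learn for both clusterings; as a reference, a minimal from-scratch sketch of the K-means loop in Steps 2-5 on random toy data (illustrative only):

import numpy as np

np.random.seed(0)
X = np.random.rand(20, 2)                                     # toy 2-D points
k = 3
centroids = X[np.random.choice(len(X), k, replace=False)]     # Step 2: random initial centroids

for _ in range(10):                                           # Step 5: repeat assignment and update
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)                         # Step 3: assign points to the closest centroid
    centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                          for j in range(k)])                 # Step 4: recompute centroids

print(labels)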

SOURCE CODE:

import streamlit as st
from sklearn.cluster import KMeans
from sklearn import preprocessing
from sklearn.mixture import GaussianMixture
from sklearn.datasets import load_iris
import sklearn.metrics as sm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

def main():
    st.title("Iris Clustering Visualization")

    # Load the Iris dataset
    dataset = load_iris()
    X = pd.DataFrame(dataset.data, columns=['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width'])
    y = pd.DataFrame(dataset.target, columns=['Targets'])

    # Real Plot
    plt.figure(figsize=(14, 7))
    colormap = np.array(['red', 'lime', 'black'])
    plt.subplot(1, 3, 1)
    plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
    plt.title('Real')

    # KMeans Plot
    plt.subplot(1, 3, 2)
    model = KMeans(n_clusters=3)
    model.fit(X)
    predY = np.choose(model.labels_, [0, 1, 2]).astype(np.int64)
    plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[predY], s=40)
    plt.title('KMeans')

    # Gaussian Mixture Model Plot
    scaler = preprocessing.StandardScaler()
    scaler.fit(X)
    xsa = scaler.transform(X)
    xs = pd.DataFrame(xsa, columns=X.columns)
    gmm = GaussianMixture(n_components=3)
    gmm.fit(xs)
    y_cluster_gmm = gmm.predict(xs)
    plt.subplot(1, 3, 3)
    plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y_cluster_gmm], s=40)
    plt.title('GMM Classification')

    # Display the plots
    st.pyplot(plt.gcf())

if __name__ == "__main__":
    main()



OUTPUT:



QR Codes:

GitHub: Experience the power of prediction through the EM algorithm and the K-means algorithm. Explore the code on GitHub for seamless integration and advanced forecasting capabilities.

Explanation Video: Explore concept learning with ease. Our app harnesses the power of the EM algorithm and the K-means algorithm to iteratively refine hypotheses based on data. Watch how it works.

App: Predict weather patterns seamlessly with our Streamlit app leveraging the EM algorithm and the K-means algorithm. Witness dynamic hypothesis refinement for accurate forecasts.

Blog: Unlock the power of prediction with our latest blog post. Dive into the depths of the EM algorithm and the K-means algorithm and discover how they enhance forecasting accuracy.



RESULT:
Thus, the program to compare clustering by the EM and K-means algorithms on a few datasets was performed
successfully.



EX.NO: 08 IMPLEMENTATION OF K-NEAREST NEIGHBOUR ALGORITHM

DATE:

AIM:
To implement the K-Nearest Neighbour algorithm to classify the iris data set.

ALGORITHM:

Step 1: Load the iris data set and split it into training and test samples.

Step 2: Choose the number of neighbours K.

Step 3: For each test sample, calculate its distance (e.g. Euclidean distance) to every training sample.

Step 4: Select the K training samples that are nearest to the test sample.

Step 5: Assign the test sample the class that occurs most often among those K neighbours (majority vote).
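
The listed program below only displays samples of the data set; as a reference, a minimal sketch of the classification step itself (Steps 3-5), using hypothetical two-feature samples:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    distances = np.linalg.norm(X_train - x, axis=1)        # Step 3: distances to all training samples
    nearest = np.argsort(distances)[:k]                    # Step 4: indices of the K nearest neighbours
    return Counter(y_train[nearest]).most_common(1)[0][0]  # Step 5: majority vote

# Hypothetical two-feature samples with classes 0 and 1
X_train = np.array([[5.1, 3.5], [4.9, 3.0], [6.7, 3.1], [6.3, 2.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([5.0, 3.4])))   # expected: 0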

SOURCE CODE:

import streamlit as st
import numpy as np

# Define the features (sepal length, sepal width, petal length, petal width)
features = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
    [5.0, 3.6, 1.4, 0.2],
    [5.4, 3.9, 1.7, 0.4],
    [4.6, 3.4, 1.4, 0.3],
    [5.0, 3.4, 1.5, 0.2],
    [4.4, 2.9, 1.4, 0.2],
    [4.9, 3.1, 1.5, 0.1],
    [5.4, 3.7, 1.5, 0.2],
    [4.8, 3.4, 1.6, 0.2],
    [4.8, 3.0, 1.4, 0.1],
    [4.3, 3.0, 1.1, 0.1],
    [5.8, 4.0, 1.2, 0.2]
])

# Define the target labels (0: setosa, 1: versicolor, 2: virginica)
targets = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# Define the target names
target_names = np.array(['setosa'])

# Combine features and targets into a dictionary similar to the Iris dataset
iris_dataset = {
    'data': features,
    'target': targets,
    'target_names': target_names,
    'feature_names': ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
}

# Streamlit code to display the dataset
st.title("Sample of the Iris dataset")
for i in range(5):
    st.write(f"Sample {i+1}:")
    st.write(" Features:", iris_dataset['data'][i])
    st.write(" Target:", iris_dataset['target'][i], "(", iris_dataset['target_names'][iris_dataset['target'][i]], ")")
    st.write("")



OUTPUT:



QR Codes:

GitHub: Experience the power of prediction through the K-nearest neighbour algorithm. Explore the code on GitHub for seamless integration and advanced forecasting capabilities.

Explanation Video: Explore concept learning with ease. Our app harnesses the power of the K-nearest neighbour algorithm to iteratively refine hypotheses based on data. Watch how it works.

App: Predict weather patterns seamlessly with our Streamlit app leveraging the K-nearest neighbour algorithm. Witness dynamic hypothesis refinement for accurate forecasts.

Blog: Unlock the power of prediction with our latest blog post. Dive into the depths of the K-nearest neighbour algorithm and discover how it enhances forecasting accuracy.



RESULT:

Thus, the program for the K-Nearest Neighbour algorithm was implemented successfully using the iris data set.



EX.NO: 09 LOCALLY WEIGHTED REGRESSION ALGORITHM

DATE:

AIM:
To implement the non-parametric Locally Weighted Regression algorithm in order to fit data points.

ALGORITHM:

Step 1: Read the Given Data Sample to X and the curve (linear or nonlinear) to Y

Step 2: Set the value for Smoothening parameter or Free parameter say τ

Step 3: Set the bias /Point of interest set x0 which is a subset of X

Step 4: Determine the weights using w(i) = exp( -(x(i) - x0)^2 / (2 * τ^2) )

Step 5: Determine the value of the model parameter β using β = (Xᵀ W X)⁻¹ Xᵀ W Y

Step 6: Prediction = x0 · β

SOURCE CODE:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import streamlit as st
from sklearn.datasets import make_regression

def locally_weighted_regression(x0, X, Y, tau):
    """
    Locally Weighted Linear Regression

    Args:
        x0 : array-like, shape (m,)
            The input point where the prediction is to be made.
        X : array-like, shape (n, m)
            The input features.
        Y : array-like, shape (n,)
            The output values.
        tau : float
            The bandwidth parameter.

    Returns:
        y0 : float
            The predicted value at x0.
    """
    m = X.shape[0]
    x0 = np.r_[1, x0]          # Add intercept term
    X = np.c_[np.ones(m), X]   # Add intercept term

    # Calculate weights
    W = np.exp(-np.sum((X - x0) ** 2, axis=1) / (2 * tau ** 2))

    # Compute the theta values using the normal equation
    theta = np.linalg.inv(X.T @ (W[:, None] * X)) @ (X.T @ (W * Y))

    # Predict the value at x0
    y0 = x0 @ theta
    return y0

def main():
    st.title("Locally Weighted Regression")

    # Generate synthetic dataset
    np.random.seed(0)
    X, Y = make_regression(n_samples=100, n_features=1, noise=10.0)
    X = X.flatten()  # Ensure X is 1D

    st.subheader("Generated Dataset")
    data = np.vstack((X, Y)).T
    st.write(pd.DataFrame(data, columns=["Feature", "Target"]))

    tau = st.slider("Select Bandwidth (tau)", 0.01, 1.0, 0.1)

    # Query points at which to evaluate the fitted curve
    x_pred = np.linspace(X.min(), X.max(), 200)

    # Predict y values using locally weighted regression
    y_pred = np.array([locally_weighted_regression(x, X, Y, tau) for x in x_pred])

    # Plotting the results
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.scatter(X, Y, color='blue', label='Data Points')
    ax.plot(x_pred, y_pred, color='red', label='LWR Curve')
    ax.set_xlabel('Feature')
    ax.set_ylabel('Target')
    ax.legend()
    ax.set_title('Locally Weighted Regression')

    # Display plot in Streamlit
    st.pyplot(fig)

if __name__ == "__main__":
    main()

OUTPUT:



QR Codes:

GitHub: Experience the power of prediction through the non-parametric Locally Weighted Regression algorithm. Explore the code on GitHub for seamless integration and advanced forecasting capabilities.

Explanation Video: Explore concept learning with ease. Our app harnesses the power of the non-parametric Locally Weighted Regression algorithm to iteratively refine hypotheses based on data. Watch how it works.

App: Predict weather patterns seamlessly with our Streamlit app leveraging the non-parametric Locally Weighted Regression algorithm. Witness dynamic hypothesis refinement for accurate forecasts.

Blog: Unlock the power of prediction with our latest blog post. Dive into the depths of the non-parametric Locally Weighted Regression algorithm and discover how it enhances forecasting accuracy.



RESULT:
Thus, the non-parametric Locally Weighted Regression algorithm to fit data points was implemented
successfully.



