Lab Manual
on
Machine Learning Lab
Submitted By: Ravi Kumawat (18BCON695)
Submitted To: Ms. Varsha Himthani
Lab 1. Implement the CANDIDATE-ELIMINATION algorithm. Show how it is used to learn
from training examples.
Lab 3. Implement the ID3 algorithm for learning Boolean-valued functions, classifying the
training examples by searching through the space of possible decision trees.
Lab 4. Design and implement Naïve Bayes Algorithm for learning and classifying TEXT
DOCUMENTS.
Lab 5. Implement K-Nearest Neighbor algorithm to classify the iris data set. Also calculate the
score.
Lab 6. Write a program to implement Support Vector Machine. Also discuss the confusion matrix
and score of the model.
Lab 7. Apply the EM algorithm to cluster a set of data, and apply the K-Means algorithm on the
same data set to compare the two algorithms.
Lab 8. Build an Artificial Neural Network by implementing Back-Propagation algorithm and test
the same using an appropriate data set.
Lab 9. Implement the Non-Parametric Locally Weighted Regression Algorithm in order to fit
data points. Select the appropriate data set for your experiment and draw a graph.
Lab 10. Build a Face detection system to recognize faces in a frame or image. You can use
OpenCV for this task.
1. Implement the CANDIDATE-ELIMINATION algorithm. Show
how it is used to learn from training examples.
import numpy as np
import pandas as pd

# Load the training examples; the last column holds the target concept
data = pd.read_csv('CE.csv')
print(data.head())

# Separate the attribute values (concepts) from the target labels
concepts = np.array(data.iloc[:, 0:-1])
target = np.array(data.iloc[:, -1])
print(target)
print(concepts)
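The code above only loads the training examples. A minimal sketch of the CANDIDATE-ELIMINATION learning step itself could look like the following (the function name learn, the 'yes'/'no' target labels, and the assumption that the first example is positive are all assumptions, not taken from the original):

def learn(concepts, target):
    # S starts as the first (assumed positive) example; G starts fully general
    specific_h = concepts[0].copy()
    general_h = [['?' for _ in range(len(specific_h))] for _ in range(len(specific_h))]
    for i, h in enumerate(concepts):
        if target[i] == 'yes':
            # Positive example: generalize S where it disagrees with h
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        else:
            # Negative example: specialize G to exclude h
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
    # Drop rows of G that stayed fully general
    general_h = [h for h in general_h if h != ['?'] * len(specific_h)]
    return specific_h, general_h

s_final, g_final = learn(concepts, target)
print("Final Specific hypothesis:", s_final)
print("Final General hypotheses:", g_final)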
OUTPUT -
Logistic Regression
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Splitting the dataset into the Training set and Test set
# (features and labels are assumed to have been extracted from the dataset beforehand)
from sklearn.model_selection import train_test_split
features_train, features_test, labels_train, labels_test = train_test_split(
    features, labels, test_size=0.25, random_state=0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
features_train = sc.fit_transform(features_train)
features_test = sc.transform(features_test)
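The steps above stop at feature scaling and never fit the model. A minimal sketch of the fitting and evaluation step, using the scaled features_train/labels_train produced above, could be:

# Fitting Logistic Regression to the Training set
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

classifier = LogisticRegression(random_state=0)
classifier.fit(features_train, labels_train)

# Predicting the Test set results and evaluating
labels_pred = classifier.predict(features_test)
print(confusion_matrix(labels_test, labels_pred))
print('Test accuracy:', classifier.score(features_test, labels_test))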
4. Assuming a set of documents that need to be classified, use the naïve
Bayesian classifier model to perform this task. Built-in classes/APIs
can be used to write the program. Calculate the accuracy, precision, and
recall for your data set.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

msg = pd.read_csv('naivetext1.csv', names=['message', 'label'])
print('The dimensions of the dataset', msg.shape)
msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})
X = msg.message
y = msg.labelnum
print(X)
print(y)

# Split the data, then turn each message into a bag-of-words count vector
xtrain, xtest, ytrain, ytest = train_test_split(X, y)
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm = count_vect.transform(xtest)

df = pd.DataFrame(xtrain_dtm.toarray(), columns=count_vect.get_feature_names())
print(df)          # tabular representation
print(xtrain_dtm)  # sparse matrix representation

# Training Naive Bayes (NB) classifier on training data
clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)

# Classify new, unseen documents (docs_new is a placeholder list of examples)
docs_new = ['I love this sandwich', 'This is a horrible place']
X_new_counts = count_vect.transform(docs_new)
predictednew = clf.predict(X_new_counts)
for doc, category in zip(docs_new, predictednew):
    print('%s -> %s' % (doc, 'pos' if category == 1 else 'neg'))
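The problem statement asks for accuracy, precision, and recall, which the code above never computes. A minimal sketch using sklearn.metrics on the held-out split (ytest and predicted from above) could be:

from sklearn.metrics import accuracy_score, precision_score, recall_score

print('Accuracy :', accuracy_score(ytest, predicted))
print('Precision:', precision_score(ytest, predicted))
print('Recall   :', recall_score(ytest, predicted))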
OUTPUT -
['about', 'am', 'amazing', 'an', 'and', 'awesome', 'beers', 'best', 'boss', 'can', 'deal',
'do', 'enemy', 'feel', 'fun', 'good', 'have', 'horrible', 'house', 'is', 'like', 'love', 'my',
'not', 'of', 'place', 'restaurant', 'sandwich', 'sick', 'stuff', 'these', 'this', 'tired', 'to',
'today', 'tomorrow', 'very', 'view', 'we', 'went', 'what', 'will', 'with', 'work']
(followed by the tabular DataFrame of word counts, one column per feature)
5. Implement K-Nearest Neighbor algorithm to classify the iris data set.
Also calculate the score.
import csv
import random
import math
import operator
def getResponse(neighbors):
    # Majority vote over the class labels of the k nearest neighbors
    classVotes = {}
    for x in range(len(neighbors)):
        response = neighbors[x][-1]
        if response in classVotes:
            classVotes[response] += 1
        else:
            classVotes[response] = 1
    # Sort classes by vote count, highest first (items() replaces Python 2's iteritems())
    sortedVotes = sorted(classVotes.items(), key=operator.itemgetter(1), reverse=True)
    return sortedVotes[0][0]
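getResponse only tallies votes; the distance computation and neighbor search it depends on are not shown. A minimal sketch of those two helpers in the same from-scratch style (euclideanDistance and getNeighbors are assumed names, not taken from the original) might be:

def euclideanDistance(instance1, instance2, length):
    # Sum squared differences over the first `length` numeric attributes
    distance = 0
    for x in range(length):
        distance += (instance1[x] - instance2[x]) ** 2
    return math.sqrt(distance)

def getNeighbors(trainingSet, testInstance, k):
    # Rank all training instances by distance and keep the k closest
    distances = []
    length = len(testInstance) - 1
    for x in range(len(trainingSet)):
        dist = euclideanDistance(testInstance, trainingSet[x], length)
        distances.append((trainingSet[x], dist))
    distances.sort(key=operator.itemgetter(1))
    return [distances[x][0] for x in range(k)]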
OUTPUT -
Confusion matrix is as follows -
[[11  0  0]
 [ 0  9  1]
 [ 0  1  8]]
Accuracy metrics -
             precision    recall  f1-score   support

          0       1.00      1.00      1.00        11
          1       0.90      0.90      0.90        10
          2       0.89      0.89      0.89         9

avg / total       0.93      0.93      0.93        30
6. Write a program to implement Support Vector Machine. Also discuss
the confusion matrix and score of the model.
# Importing the dataset
import pandas as pd
data_set = pd.read_csv('user_data.csv')
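Only the data-loading line survives here; a minimal sketch of the rest of the experiment, assuming the feature columns come first and the last column of user_data.csv is the class label (the column positions are assumptions), could be:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

# Assume the last column is the class label
X = data_set.iloc[:, :-1].values
y = data_set.iloc[:, -1].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Scale features so the SVM margins are not dominated by one attribute
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

classifier = SVC(kernel='linear', random_state=0)
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print('Model score:', classifier.score(X_test, y_test))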
OUTPUT -
7. Apply EM algorithm to cluster a set of data and also apply K-Means
algorithm on the same data set to compare two algorithms.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs  # sklearn.datasets.samples_generator is deprecated

# Generate four Gaussian blobs of sample data
X, y_true = make_blobs(n_samples=100, centers=4, cluster_std=0.60, random_state=0)
X = X[:, ::-1]  # swap the two feature columns for nicer plotting
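The snippet above only generates the blobs; a minimal sketch of the actual comparison, fitting K-Means and a Gaussian mixture (whose fitting procedure is the EM algorithm) on the same X, might be:

from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# K-Means: hard assignments based on distance to centroids
kmeans = KMeans(n_clusters=4, random_state=0)
labels_km = kmeans.fit_predict(X)

# Gaussian mixture fitted with EM: soft assignments via posterior probabilities
gmm = GaussianMixture(n_components=4, random_state=0)
labels_em = gmm.fit_predict(X)

# Plot the two clusterings side by side for comparison
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(X[:, 0], X[:, 1], c=labels_km, s=40, cmap='viridis')
ax1.set_title('K-Means')
ax2.scatter(X[:, 0], X[:, 1], c=labels_em, s=40, cmap='viridis')
ax2.set_title('EM (Gaussian Mixture)')
plt.show()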
OUTPUT -
[[1, 0, 0, 0]
 [0, 0, 1, 0]
 [1, 0, 0, 0]
 [1, 0, 0, 0]
 [1, 0, 0, 0]]
8. Build an Artificial Neural Network by implementing Back-Propagation
algorithm and test the same using an appropriate data set.
import numpy as np

# Training data and targets (values chosen to match the normalized Input/Output shown below)
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X / np.amax(X, axis=0)  # normalize each feature to [0, 1]
y = y / 100                 # scale targets to [0, 1]

#Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

#Derivative of the sigmoid (x is already a sigmoid activation)
def derivatives_sigmoid(x):
    return x * (1 - x)

#Variable initialization
epoch = 7000                #Setting training iterations
lr = 0.1                    #Setting learning rate
inputlayer_neurons = 2      #number of features in data set
hiddenlayer_neurons = 3     #number of hidden layer neurons
output_neurons = 1          #number of neurons at output layer

#Weight and bias initialization: draws a random range of numbers uniformly of dim x*y
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    #Forward Propagation
    hinp = np.dot(X, wh) + bh
    hlayer_act = sigmoid(hinp)
    outinp = np.dot(hlayer_act, wout) + bout
    output = sigmoid(outinp)

    #Backpropagation
    EO = y - output                               #error at the output layer
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)                     #error propagated back to the hidden layer
    hiddengrad = derivatives_sigmoid(hlayer_act)  #how much hidden layer wts contributed to error
    d_hiddenlayer = EH * hiddengrad
    wout += hlayer_act.T.dot(d_output) * lr       #dot product of next-layer error and current-layer output
    wh += X.T.dot(d_hiddenlayer) * lr
    #optional bias updates:
    #bout += np.sum(d_output, axis=0, keepdims=True) * lr
    #bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)
OUTPUT -
Input:
[[ 0.66666667 1. ]
[ 0.33333333 0.55555556]
[ 1. 0.66666667]]
Actual Output:
[[ 0.92]
[ 0.86]
[ 0.89]]
Predicted Output:
[[ 0.89559591]
[ 0.88142069]
[ 0.8928407 ]]
9. Implement the non-parametric Locally Weighted Regression algorithm
in order to fit data points. Select an appropriate data set for your
experiment and draw graphs.
import pandas as pd
import numpy as np1

def kernel(point, xmat, k):
    # Gaussian kernel: training points near the query point get weights near 1
    m, n = np1.shape(xmat)
    weights = np1.mat(np1.eye(m))
    for j in range(m):
        diff = point - xmat[j]
        weights[j, j] = np1.exp(diff * diff.T / (-2.0 * k ** 2))
    return weights

def localWeight(point, xmat, ymat, k):
    # Solve the weighted least-squares normal equations at this query point
    wei = kernel(point, xmat, k)
    W = (xmat.T * (wei * xmat)).I * (xmat.T * (wei * ymat.T))
    return W

def localWeightRegression(xmat, ymat, k):
    m, n = np1.shape(xmat)
    ypred = np1.zeros(m)
    for i in range(m):
        ypred[i] = xmat[i] * localWeight(xmat[i], xmat, ymat, k)
    return ypred

# Order the points by feature value for plotting (X is the design matrix; see the sketch below)
SortIndex = X[:, 1].argsort(0)
xsort = X[SortIndex][:, 0]
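The fragment never builds the design matrix X, loads data, or draws the graph the exercise asks for. A minimal end-to-end sketch, assuming a tips.csv file with total_bill and tip columns (the file name, column names, and bandwidth k=0.5 are all assumptions), could be:

import matplotlib.pyplot as plt

data = pd.read_csv('tips.csv')  # assumed data file
bill = np1.array(data.total_bill)
tip = np1.array(data.tip)

# Build the design matrix X = [1, bill] as numpy matrices
mbill = np1.mat(bill)
mtip = np1.mat(tip)
one = np1.mat(np1.ones(np1.shape(mbill)[1]))
X = np1.hstack((one.T, mbill.T))

# Fit, recompute the plotting order now that X exists, and draw the graph
ypred = localWeightRegression(X, mtip, 0.5)
SortIndex = X[:, 1].argsort(0)
xsort = X[SortIndex][:, 0]
plt.scatter(bill, tip, color='green')
plt.plot(xsort[:, 1], ypred[SortIndex], color='red', linewidth=2)
plt.xlabel('Total bill')
plt.ylabel('Tip')
plt.show()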
OUTPUT -
10. Build a Face detection system to recognize faces in a frame or image.
You can use OpenCV for this task.
import cv2
from matplotlib import pyplot as plt  #To plot the image

# Load OpenCV's pre-trained frontal-face Haar cascade
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
img = cv2.imread('test.jpg')  # placeholder image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
print("face found", len(faces))