
SL Classification For Data Science

This document discusses implementing the Naive Bayes algorithm on a social network advertising dataset. It loads and prepares the data, then splits it into training and test sets. It fits three Naive Bayes models - Gaussian, Bernoulli, and Multinomial - and evaluates their accuracy using metrics like accuracy score, classification report, and confusion matrix. It finds that the Gaussian model has the highest accuracy at 65%.

Uploaded by

shivaybhargava33

Naïve Bayes Algorithm

Import the Libraries


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

Load the Data Set


os.chdir('C:\\Noble\\Training\\Top Mentor\\Training\\Data Set\\')
df1 = pd.read_csv('Social_Network_Ads.csv')
print (df1)

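Independent Variable X

# Feature columns at positions 2 and 3 (labelled Age and Salary further down)
x = df1.iloc[:, 2:4].values
print(x)

Dependent Variable y

# Target column at position 4
y = df1.iloc[:, 4].values
print(y)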
Feature Scaling – Standardization (to put Age and Salary on a comparable scale)

from sklearn.preprocessing import StandardScaler
sc_x = StandardScaler()
x = sc_x.fit_transform(x)  # each column now has mean 0 and unit variance
print(x)
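
As a rough illustration of what the standardization step computes, the sketch below rescales a single column by hand: subtract the column mean, then divide by the population standard deviation, which is the convention StandardScaler follows. The salary values and the helper name `standardize` are invented for this example and are not taken from the data set.

```python
import statistics

def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Hypothetical salaries, chosen only to show the effect of scaling.
salaries = [19000, 20000, 43000, 57000, 76000]
scaled = standardize(salaries)

print([round(z, 3) for z in scaled])
# After scaling, the column mean is (numerically) 0 and its spread is 1.
print(round(statistics.pstdev(scaled), 6))
```

Because both columns end up on the same scale, the much larger salary figures no longer dominate the age values.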
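Train Test Split Data

# 80% of the rows go to training and 20% to testing; pass random_state=<int> for a repeatable split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)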
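
The split is random, so the rows that land in the test set change on every run unless a seed is fixed. The following sketch shows the idea in plain Python; the helper `split_rows` and the figure of 400 rows are assumptions made for illustration. In scikit-learn, the corresponding control is the `random_state` argument of `train_test_split`.

```python
import random

def split_rows(rows, test_size=0.2, seed=0):
    """Shuffle row indices with a fixed seed, then hold out a test fraction."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]

rows = list(range(400))  # stand-in for 400 observations (hypothetical size)
train, test = split_rows(rows, test_size=0.2, seed=42)
print(len(train), len(test))  # 320 80

# The same seed reproduces the same split.
print(split_rows(rows, test_size=0.2, seed=42) == (train, test))  # True
```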

Create Model - GaussianNB


from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(x_train,y_train)
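
To make the fitting step less of a black box, here is a minimal from-scratch sketch of the Gaussian Naive Bayes idea: estimate a prior and a per-feature mean and variance for each class, then pick the class with the highest log posterior. It is a simplification, not scikit-learn's implementation; the function names, the tiny variance floor, and the sample rows are all assumptions made for this example.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(X, y):
        grouped[label].append(row)
    model = {}
    for label, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        # Small constant keeps the variance strictly positive (an assumption
        # of this sketch; scikit-learn applies its own variance smoothing).
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                     for c, m in zip(columns, means)]
        model[label] = (len(rows) / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one row."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Hypothetical standardized (age, salary) rows with labels 0 = No, 1 = Yes.
X_train = [[-1.2, -0.9], [-0.8, -1.1], [-1.0, -0.7],
           [1.1, 0.9], [0.9, 1.2], [1.3, 0.8]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[-0.9, -1.0], [1.0, 1.0]]

model = fit_gaussian_nb(X_train, y_train)
predictions = [predict_gaussian_nb(model, row) for row in X_test]
print(predictions)  # [0, 1]
```

The "naive" part is the inner loop: each feature contributes its own log-likelihood term, as if the features were independent given the class.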

Predict the Result


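# Note: this predicts for every row in x, including the rows used for training.
# For a held-out estimate, predict on x_test and compare with y_test instead.
y_pred = classifier.predict(x)
y_pred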

Display the result in a Data Frame


# x holds the standardized values, so Age and Salary appear here in scaled form
df_x = pd.DataFrame(x, columns=['Age', 'Salary'])
df_y = pd.DataFrame(y, columns=['Y'])
df_ypred = pd.DataFrame(y_pred, columns=['Y Pred'])
result = pd.concat([df_x, df_y, df_ypred], axis=1)
print(result)

Accuracy Score to check Accuracy

from sklearn.metrics import accuracy_score


accuracy_score (y,y_pred)

Classification Report to check Accuracy

from sklearn.metrics import classification_report


print(classification_report (y,y_pred))
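
The classification report lists precision, recall and F1 for each class. As a sketch of how those figures are derived, the snippet below computes them by hand for one class at a time. The helper `precision_recall_f1` and the label lists are hypothetical and not taken from the data set; returning 0.0 when a denominator is zero is a choice made for this example.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one class from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical labels (0 = No, 1 = Yes).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

for label in (0, 1):
    p, r, f = precision_recall_f1(y_true, y_pred, positive=label)
    print(label, round(p, 2), round(r, 2), round(f, 2))
```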
Confusion Matrix to check Accuracy

cm = confusion_matrix(y, y_pred)
print(cm)
                       As Per Model
                       No (0)   Yes (1)
As Per Data   No (0)     52        0
              Yes (1)    28        0

Accuracy = Sum of diagonal values / Sum of all values = (52 + 0) / 80 = 52/80 = 0.65
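
The arithmetic above can be reproduced in a few lines. The sketch below hard-codes the four counts from the confusion matrix shown (rows are the actual class, columns the predicted class) rather than recomputing them from the data. It also prints the share of the largest actual class: in this matrix every prediction falls in the No column, so the accuracy is the same as the score of always answering No.

```python
# Counts copied from the confusion matrix above:
# rows = actual class (No, Yes), columns = predicted class (No, Yes).
cm = [[52, 0],
      [28, 0]]

correct = sum(cm[i][i] for i in range(len(cm)))  # diagonal entries
total = sum(sum(row) for row in cm)              # every entry
accuracy = correct / total

# Share of the largest actual class: what "always predict No" would score.
majority_share = max(sum(row) for row in cm) / total

print(accuracy, majority_share)  # 0.65 0.65
```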

Create Model - BernoulliNB

from sklearn.naive_bayes import BernoulliNB

classifier = BernoulliNB()

classifier.fit(x_train,y_train)
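
BernoulliNB models binary features. With its default setting `binarize=0.0`, it thresholds each input at zero before fitting, so standardized columns are reduced to "above the column mean or not". The sketch below mimics that thresholding in plain Python; the helper `binarize` and the sample rows are assumptions made for illustration.

```python
def binarize(rows, threshold=0.0):
    """Map each feature to 1 if it is above the threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in rows]

# Hypothetical standardized (age, salary) rows.
scaled_rows = [[-1.78, -1.49], [0.25, -0.32], [0.92, 1.43], [0.0, 0.5]]
binary_rows = binarize(scaled_rows)
print(binary_rows)  # [[0, 0], [1, 0], [1, 1], [0, 1]]
```

Most of the information in Age and Salary is discarded by this step, which is worth keeping in mind when comparing the Bernoulli result with the Gaussian one.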

Prediction
y_pred= classifier.predict(x)

y_pred

Confusion Matrix to check Accuracy


cm = confusion_matrix (y,y_pred)
print (cm)

Normalization with Min-Max Scaler, since MultinomialNB requires non-negative feature values
x= df1.iloc [:,2:4].values

y= df1.iloc[:,4].values

from sklearn.preprocessing import MinMaxScaler

# Keep the scaler and the scaled array under separate names
minmax_scaler = MinMaxScaler()

minmax_x = minmax_scaler.fit_transform(x)

print(minmax_x)
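
For reference, min-max scaling maps each column linearly onto the range 0 to 1, which is why the result contains no negative values. The following is a minimal sketch of that formula for a single column; the helper `min_max_scale` and the ages are invented for this example, and the helper assumes the column is not constant.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    # Assumes hi > lo; a constant column would divide by zero here.
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical ages, chosen only to show the effect of scaling.
ages = [18, 25, 35, 46, 60]
scaled_ages = min_max_scale(ages)
print([round(v, 3) for v in scaled_ages])
```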

Train Test Split


x_train,x_test,y_train,y_test = train_test_split(minmax_x,y,test_size= 0.2)

Create Model - MultinomialNB


from sklearn.naive_bayes import MultinomialNB

classifier = MultinomialNB()

classifier.fit(x_train,y_train)

Predict the Result


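# Predict on the min-max scaled features; x was reset to the raw values above,
# so passing x here would not match the data the model was trained on
y_pred = classifier.predict(minmax_x)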

y_pred

Accuracy Score to check Accuracy


from sklearn.metrics import accuracy_score

accuracy_score (y,y_pred)

Classification Report to check Accuracy


from sklearn.metrics import classification_report

print(classification_report (y,y_pred))

Confusion Matrix to check Accuracy


cm = confusion_matrix (y,y_pred)

print (cm)
