Data Analysis in Python-4

The document discusses using a K-nearest neighbors (KNN) classifier model in Python. It imports the KNN classifier from scikit-learn, creates a classifier with 5 neighbors, fits it on the training data, and predicts the test values. It then computes the confusion matrix and accuracy, experiments with different K values, and finds that the number of misclassified samples is lowest with K=16.

Uploaded by

mohan

Data Analysis in Python-4

'''
Now we will see the KNN classifier model
'''
#Import the KNN classifier from scikit-learn
from sklearn.neighbors import KNeighborsClassifier
#Now create an instance of the model using the K-nearest neighbors classifier
#Here K is 5, i.e. each sample is assigned the majority class
#(for example, salary less than or equal to 50000) among its 5 nearest training samples
KNN_classifier=KNeighborsClassifier(n_neighbors=5)
KNN_classifier.fit(train_x,train_y) #fit the model on the training features and labels
#predicting the test values with this model
prediction=KNN_classifier.predict(test_x)
print(prediction)
#Now check the performance metrics
from sklearn.metrics import accuracy_score,confusion_matrix
#Use new variable names so the imported functions are not overwritten
conf_matrix=confusion_matrix(test_y,prediction)
print('\t','predicted values')
print('original values','\n',conf_matrix)
acc_score=accuracy_score(test_y,prediction)
print(acc_score)
print('Misclassified samples: %d' % (test_y!=prediction).sum())
'''
Now check the effect of the K value on the classifier
'''
Misclassified_sample=[]
#Calculate the number of misclassified samples for K values from 1 to 19
#(range(1,20) stops before 20)
for i in range(1,20):
    knn=KNeighborsClassifier(n_neighbors=i)
    knn.fit(train_x,train_y)
    pred_i=knn.predict(test_x)
    Misclassified_sample.append((test_y!=pred_i).sum())
print(Misclassified_sample)
#Therefore, from these K values we can take K=16, for which the
#number of misclassified samples is lowest (1401)
'''
So, we considered and studied two algorithms for the classification problem
1. Logistic Regression
2. KNN
'''
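
The script above relies on train_x, train_y, test_x and test_y prepared in an earlier part, so it cannot be run on its own. The following is a minimal, self-contained sketch of the same K search, assuming synthetic data from scikit-learn's make_classification as a stand-in for the original dataset. The names k_values, misclassified, best_k, best_matrix and best_accuracy are illustrative and do not come from the original script, and the numbers it prints will not match the 1401 reported above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (an assumption; the original uses its own dataset).
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
train_x, test_x, train_y, test_y = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Count misclassified test samples for K = 1..19, as in the loop above.
k_values = list(range(1, 20))
misclassified = []
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(train_x, train_y)
    pred = knn.predict(test_x)
    misclassified.append(int((test_y != pred).sum()))

# Pick the K with the fewest misclassified samples (first one on ties).
best_index = int(np.argmin(misclassified))
best_k = k_values[best_index]

# Refit with the chosen K and report the same metrics as the script.
best_knn = KNeighborsClassifier(n_neighbors=best_k).fit(train_x, train_y)
best_pred = best_knn.predict(test_x)
best_matrix = confusion_matrix(test_y, best_pred)
best_accuracy = accuracy_score(test_y, best_pred)

print('Misclassified samples per K:', misclassified)
print('Best K:', best_k)
print(best_matrix)
print('Accuracy: %.3f' % best_accuracy)
```

Using np.argmin avoids reading the best K off the printed list by eye. Note that choosing K on the test set, as both the script and this sketch do, tends to give an optimistic accuracy estimate; a separate validation set or cross-validation is usually preferred for tuning K.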
