MLT 8 KK

Uploaded by

717821F109
DEPARTMENT OF INFORMATION TECHNOLOGY 21ID14 MACHINE LEARNING TECHNIQUES

Ex no: 8
Date:
Data And Text Clustering Using K Means Clustering

Aim:
To write a Python program for data and text clustering using k-means clustering.

Algorithm:
Step 1: Import the libraries required to run the k-means clustering algorithm.
Step 2: Suppress warnings, load the dataset, then check, preview and view a summary
of the loaded dataset.
Step 3: Drop the redundant columns, then view the dataset again to explore it and drop
any other unwanted variables.
Step 4: Declare the feature vector and target variable, and convert the categorical
variables into integers.
Step 5: Apply the k-means algorithm with two clusters and check the quality of the
clustering. If the quality is poor, apply the elbow method and try different numbers of clusters.
Step 6: Compare the accuracy obtained for each number of clusters and select the one
with the highest value.
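
Steps 2 to 6 above can be sketched roughly as follows. This is only an illustrative outline under stated assumptions: the data is synthetic, the column names (row_id, spend_a, spend_b, region) are hypothetical, and the adjusted Rand index is swapped in as the agreement measure for Step 6 because cluster numbers are arbitrary and cannot be compared with target labels directly. The categorical target is kept out of the feature vector here so that it does not influence the clustering.

```python
import warnings

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import LabelEncoder

warnings.filterwarnings("ignore")  # Step 2: suppress warnings

# Hypothetical stand-in dataset (assumption): three groups of 30 rows each.
rng = np.random.default_rng(42)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 8.0]])
points = np.vstack([c + rng.normal(scale=0.6, size=(30, 2)) for c in centres])
data = pd.DataFrame({
    "row_id": np.arange(len(points)),                      # redundant identifier
    "spend_a": points[:, 0],
    "spend_b": points[:, 1],
    "region": np.repeat(["north", "south", "east"], 30),   # categorical variable
})
print(data.head())  # Step 2: preview the dataset

# Step 3: drop the redundant column.
data = data.drop(columns=["row_id"])

# Step 4: feature vector and target; encode the categorical target as integers.
y = LabelEncoder().fit_transform(data["region"])
X = data[["spend_a", "spend_b"]].to_numpy()

# Steps 5 and 6: fit k-means for several cluster counts, recording the inertia
# (used by the elbow method) and the agreement with the target labels.
inertia = {}
agreement = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    labels = model.fit_predict(X)
    inertia[k] = model.inertia_
    agreement[k] = adjusted_rand_score(y, labels)
    print(f"k={k}  inertia={inertia[k]:.1f}  adjusted Rand index={agreement[k]:.3f}")

best_k = max(agreement, key=agreement.get)
print("Cluster count with the highest agreement:", best_k)
```

Plotting the inertia values against k and looking for the point where the curve flattens gives the usual elbow plot. When no target labels are available, a label-free measure such as the silhouette score could take the place of the adjusted Rand index.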

Code:

import numpy as np
import matplotlib.pyplot as mtp
import pandas as pd
from sklearn.cluster import KMeans

# Load the dataset and preview its first and last rows
dataset = pd.read_csv('/content/50_Startups (1).csv')
dataset.head()
dataset.tail()


# Select the columns at positions 1 and 2 as the two features
x = dataset.iloc[:, [1, 2]].values

# Fit k-means with five clusters and get a cluster label for every row
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_predict = kmeans.fit_predict(x)

print('717821F122_Kanish_R')

# Plot each cluster in its own colour
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s=100, c='blue', label='Cluster 1')
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s=100, c='green', label='Cluster 2')
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s=100, c='red', label='Cluster 3')
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s=100, c='cyan', label='Cluster 4')
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s=100, c='magenta', label='Cluster 5')

# Plot the cluster centroids
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='yellow', label='Centroid')

mtp.title('K-Means Cluster')
# Take the axis labels from the selected columns so they match the plotted data
mtp.xlabel(dataset.columns[1])
mtp.ylabel(dataset.columns[2])
mtp.legend()
mtp.show()

Result:
Thus, the Python program for data and text clustering using the k-means clustering
algorithm has been implemented successfully.

717821F122-Kanish R
