
Chittagong University of Engineering and Technology

Department Of Computer Science and Engineering

Final Report on Machine Learning Algorithms Implementation

Course No. : CSE-464


Course Title : Machine Learning (Sessional)
Date of Submission : 31/08/2023

Submitted To:
Md. Rashadur Rahman
Lecturer,
Department of CSE, CUET
Hasan Murad
Lecturer,
Department of CSE, CUET

Submitted By:


Name: Estiak Ahamed Sazid
ID: 1804051
Section: A
Level: 4
Term: 1
Table of Contents
Abstract
1 Introduction
2 Algorithm Descriptions and Implementations
  2.1 Apriori Algorithm
    2.1.1 Dataset
    2.1.2 Implementation
    2.1.3 Performance Evaluation
  2.2 Multivariable Linear Regression
    2.2.1 Dataset
    2.2.2 Implementation
    2.2.3 Performance Evaluation
  2.3 K-Means Clustering
    2.3.1 Dataset
    2.3.2 Implementation
    2.3.3 Performance Evaluation
  2.4 Decision Tree
    2.4.1 Dataset
    2.4.2 Implementation
    2.4.3 Performance Evaluation
  2.5 Artificial Neural Network (ANN)
    2.5.1 Dataset
    2.5.2 Implementation
    2.5.3 Performance Evaluation
3 Discussion
4 Conclusion
5 References
6 Appendices

Abstract
Machine learning algorithms have become indispensable in our modern world, where data is
king. They can be used to extract valuable insights and make informed decisions about a
wide range of topics. This report examines five distinct machine learning algorithms:
Apriori, Multivariable Linear Regression, Decision Trees, K-Means Clustering, and Artificial
Neural Networks (ANN). For each method we describe how it works, where it is useful, and
what its strengths and limitations are. The algorithms are described in detail, with examples
of how they can be used in practice, so that readers can make informed decisions about
which algorithm to use for a particular task.

1 Introduction
Machine learning is a rapidly evolving field that lies at the intersection of computer science
and artificial intelligence (AI). It provides computers with the ability to learn from data patterns
and experiences, continuously improving their performance without requiring explicit
programming. In essence, it enables machines to make predictions, recognize patterns, and
autonomously solve complex problems. As this field continues to progress, grasping its
principles and applications becomes increasingly crucial in our data-centric world, where it
plays a transformative role [1].

In our lab we have implemented five fundamental machine learning algorithms: Apriori,
Multivariable Linear Regression, Decision Trees, K-Means Clustering, and Artificial Neural
Networks (ANN). Each algorithm possesses unique characteristics and is used to solve specific
types of problems.

Apriori is well suited to discovering patterns in transaction data, while multivariable linear
regression models how several explanatory variables relate to an outcome. K-means clustering
groups similar data points together, and decision trees are useful for classification and
prediction. Artificial neural networks (ANNs) are more advanced models used for complex
tasks such as image recognition and language understanding. Each algorithm has its own
characteristics, making it a valuable tool across a variety of fields [1].

Machine learning is a powerful tool for examining data and making intelligent choices. By
understanding the different kinds of machine learning methods and their strengths and
weaknesses, we can select the most suitable one for a particular task. Since machine learning
is a rapidly changing area, keeping up to date with the latest advancements is crucial.

2 Algorithm Descriptions and Implementations


2.1 Apriori Algorithm

The Apriori method is a well-known algorithm for association rule mining. It begins by
identifying all the frequent itemsets, which are collections of items that appear together in a
dataset with at least a specified minimum frequency (support). Once the frequent itemsets
have been identified, the algorithm can generate association rules from them.

The Apriori algorithm starts by finding all frequent 1-itemsets, i.e., the individual items that
occur in the dataset with the minimum frequency. It then finds all frequent 2-itemsets, i.e.,
pairs of items that occur together in transactions with the minimum frequency. This process
is repeated for k-itemsets of increasing size k until no further frequent itemsets are found [2].

The Apriori method is a simple and effective way to find frequent itemsets. It is comparatively
easy to implement and can be used to mine large datasets, although for very large datasets it
can become computationally costly. A minimal sketch of this level-wise search is shown below.
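As an illustration of this level-wise search, the following is a minimal sketch (the report's
actual implementation, shown later, uses the apyori library); the toy transactions and the
min_support value are placeholders.

from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for frequent itemsets."""
    n = len(transactions)
    sets = [set(t) for t in transactions]
    # Level 1: candidate single-item itemsets
    current = [frozenset([item]) for item in {i for t in sets for i in t}]
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates whose support meets the threshold
        level = {}
        for cand in current:
            support = sum(1 for t in sets if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets to form (k + 1)-item candidates
        keys = list(level)
        current = list({a | b for a, b in combinations(keys, 2) if len(a | b) == k + 1})
        k += 1
    return frequent

# Toy example with placeholder transactions
baskets = [["milk", "bread"], ["milk", "eggs"], ["milk", "bread", "eggs"], ["bread"]]
print(frequent_itemsets(baskets, min_support=0.5))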
2.1.1 Dataset

The dataset comprises 7,500 rows and 20 columns. Each row denotes a transaction (record)
and each column denotes an item or product that may appear in that transaction.

Figure 01: Snapshot of the dataset


2.1.2 Implementation

The code is implemented using the apyori Python library, which provides the Apriori
algorithm. It can be imported with the statement “from apyori import apriori”.

➢ Transaction List Creation:


• A list called transactions is created to hold the items of each record in the dataset.
• The dataset's cells are iterated over in a nested loop, where i is the row index and j is
the column index.
• For each row, the items present in its cells are collected into a list, which is appended
to transactions.
➢ Apriori Algorithm:
• The apyori library is loaded to perform association rule mining with the Apriori
algorithm.
• Association rules are generated by calling the apriori function:
• transactions: set to the transaction list that contains the individual item entries.
• min_support specifies the minimum support an itemset must attain to be considered
frequent. It is set to 0.003, meaning an itemset must occur in at least 0.3% of the
transactions.
• min_confidence specifies the minimum confidence level for association rules. It is
set to 0.2, meaning a rule must have a confidence of at least 20% to be kept.
• min_lift specifies the minimum lift threshold for association rules. It is set to 3,
meaning a rule must have a lift of at least 3 to be reported. The resulting call is
sketched below.
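The call described by these options looks roughly as follows (a sketch that mirrors the
appendix code; transactions is the list built in the previous step).

from apyori import apriori

rules = apriori(
    transactions=transactions,  # list of item lists, one per record
    min_support=0.003,          # itemset must appear in at least 0.3% of transactions
    min_confidence=0.2,         # rule must hold with at least 20% confidence
    min_lift=3,                 # rule must be at least 3 times better than chance
    min_length=2,
    max_length=2,
)
results = list(rules)           # materialize the generator of association rules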
2.1.3 Performance Evaluation

The list of association rules produced by the Apriori algorithm is stored in the results variable.
It contains the rules that were found, together with their support and confidence values.

Figure 02: Association rules with their confidence and support values

2.2 Multivariable Linear Regression

Multivariable linear regression is a statistical method that uses two or more independent
variables to predict the outcome of a dependent variable. Its aim is to model the linear
relationship between the independent (explanatory) variables and the response (dependent)
variable. Because it considers several explanatory variables, multivariable regression is
essentially an extension of ordinary least-squares (OLS) regression. The model can be
represented as:

$y = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n$

Where:

• y is the predicted value of the target variable.
• w_0 is the bias (intercept) term.
• w_1, w_2, …, w_n are the coefficients of the independent variables x_1, x_2, …, x_n.

The coefficient values that minimize the difference between the predicted and the actual
target values are found through an iterative optimization process known as gradient descent.
This error is usually expressed as the mean squared error. Starting from initial values, the
algorithm updates the coefficients iteratively to move them closer to the optimal values [3].
Each coefficient w_i is updated according to the rule:

$w_i = w_i - \alpha \cdot \frac{1}{m} \sum_{j=1}^{m} (\hat{y}_j - y_j)\, x_{ij}$

where $\alpha$ is the learning rate, m is the number of training examples, $\hat{y}_j$ is the
prediction for example j, and $x_{ij}$ is the value of feature i for example j.

2.2.1 Dataset

For multivariable linear regression, the “property listing data in Bangladesh” dataset is used.
It has about 10 features and one target column named ‘price’.

Figure 03: Snapshot of dataset

2.2.2 Implementation

Data Preprocessing and Feature Selection:

In this dataset, the 'title', 'adress', 'purpose', 'flooPlan', 'url', and 'lastUpdated' features are
not relevant to the predicted target value, so these columns are dropped first. Next, rows
whose 'type' is Building or Duplex are removed, and the 'type' column itself is then dropped.
After that, some preprocessing is applied to the beds, bath, area, and price columns, and
z-normalization is used to normalize the data. The data is split into 20% test data, 20%
validation data, and the remaining 60% training data, as sketched below.
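A condensed sketch of the normalization and split steps described above (it assumes df is
the already-cleaned, all-numeric DataFrame with ‘price’ as its last column, following the
appendix code).

import numpy as np
import pandas as pd

def normalize_and_split(df, seed=10):
    # z-normalization: subtract the column mean and divide by the column std
    df_norm = (df - df.mean()) / df.std()
    values = df_norm.sample(frac=1, random_state=seed).values  # shuffle the rows
    x, y = values[:, :-1], values[:, -1]
    n_train, n_val = int(0.6 * len(x)), int(0.2 * len(x))
    return (x[:n_train], y[:n_train],                               # 60% training
            x[n_train:n_train + n_val], y[n_train:n_train + n_val], # 20% validation
            x[n_train + n_val:], y[n_train + n_val:])               # 20% test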

Hyperparameters:

For our model, we defined various hyperparameters:

• Learning Rate: The learning rate was set to 0.01. It determines the step size of each
gradient descent iteration.
• Number of Epochs: The number of iterations (iter_number) was set to a maximum of
1000. Gradient descent continues until convergence or until this many iterations have
been performed.

Cost Function:

To measure the error of our model, we constructed a cost function that computes the mean
squared error (MSE) between the predicted values and the actual target values.

Gradient Descent:

The gradient descent technique is the foundation of our implementation. The weight vector is
updated repeatedly to reduce the cost function: in each iteration we compute the gradient of
the cost function with respect to the weights and update the weights accordingly. A minimal
sketch follows.
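A minimal sketch of the cost function and the gradient-descent loop described above,
mirroring the appendix implementation (X is assumed to already contain a leading bias
column of ones).

import numpy as np

def cost_function(X, y, w):
    # mean squared error with the conventional 1/2 factor
    h = X.dot(w)
    return (1 / (2 * len(y))) * np.sum((h - y) ** 2)

def gradient_descent(X, y, w, alpha, n_iter):
    m = len(y)
    for _ in range(n_iter):
        error = X.dot(w) - y                  # predicted minus actual values
        w = w - (alpha / m) * X.T.dot(error)  # update all coefficients at once
    return w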

Model Training:

We used the gradient_descent function to train the model, feeding it the training data along
with the initial weights, the learning rate, and the other hyperparameters. The model was
trained for 1000 epochs.

Figure 04: Training loss and validation loss per epoch
2.2.3 Performance Evaluation

Model Evaluation using Mean Squared Error (MSE)

A popular metric for assessing how well regression models fit data is the Mean Squared Error
(MSE). It quantifies the average squared difference between the actual target values and the
predicted values; lower MSE values indicate better model performance. Our model achieved
a test loss of about 11%.

The formula for MSE is:

$MSE = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$

Where,

• n is the number of data points.
• $\hat{y}_i$ is the predicted value for the i-th data point.
• $y_i$ is the actual value of the i-th data point.

Model Evaluation using R-squared (R²):

Another important metric for regression models is R-squared (R²). It measures the proportion
of the dependent variable's variation that is explained by the model's independent variables.
R-squared values range from 0 to 1, with higher values indicating a better fit; a value of 1
means the model predicts the target variable perfectly. Our model's score is about 76.8%.

The formula for R-squared is:

$R^2 = 1 - \frac{SSR}{SST}$

Where,

• $SSR = \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$ (residual sum of squares)
• $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$ (total sum of squares)
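Both metrics can be computed directly from the predictions; a short sketch, assuming y_true
and y_pred are NumPy arrays of actual and predicted targets.

import numpy as np

def mse(y_true, y_pred):
    # average squared difference between predictions and actual values
    return np.mean((y_pred - y_true) ** 2)

def r_squared(y_true, y_pred):
    ssr = np.sum((y_pred - y_true) ** 2)          # residual sum of squares
    sst = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1 - ssr / sst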

2.3 K-Means Clustering

Clustering is an unsupervised machine learning process that groups unlabeled data points into
clusters. The aim is to find groups of data points that are similar to one another and distinct
from the data points in other clusters.

K-means is a simple and popular clustering technique. It starts by choosing k initial cluster
centroids (for example, k data points picked at random). Each data point is then assigned to
the cluster with the nearest centroid, and the centroids are recomputed from the assigned
points. These two steps are repeated until the centroids stop changing.
2.3.1 Dataset

A 3-dimensional dataset is used to implement the k-means clustering algorithm. We took 15
rows of arbitrarily chosen data in a CSV file and loaded it into our algorithm.

Figure 05: Snapshot of dataset for K-Means Clustering


2.3.2 Implementation

➢ k_means() function
• The cluster centers are initialized by randomly choosing k data points from the input
data x without replacement; these random points serve as the initial cluster centers.
• A loop is run for a maximum of 100 iterations.
• Each data point in x is assigned to the closest cluster center, based on Euclidean
distance, using the calc_cluster function.
• The update_centers function computes new cluster centers from the data points
assigned to each cluster.
• If the new centers are the same as the previous centers, the loop breaks; otherwise, the
centers are replaced with new_centers and the next iteration begins.
➢ update_centers() function:

This function computes new cluster centers from the data points assigned to each cluster.

• new_centers is first initialized as an array of zeros with dimensions
(k, number_of_features); each row of new_centers will hold one cluster center.
• For cluster i, the data points belonging to that cluster are selected with x[cluster == i].
• The mean of these data points is computed along each feature dimension with
.mean(axis=0); this gives the cluster's new center.
• The new center is stored in the corresponding row of the new_centers array.
➢ calc_cluster () function:

This function calculates the cluster assignments for each data point based on the Euclidean
distance between the data points and the cluster centers.
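A condensed sketch of the three functions described above, following the appendix
implementation (x is an (n_samples, n_features) NumPy array).

import numpy as np

def calc_cluster(x, centers):
    # Euclidean distance from every point to every center, then index of the nearest center
    dist = np.sqrt(np.sum((x[:, np.newaxis] - centers) ** 2, axis=2))
    return np.argmin(dist, axis=1)

def update_centers(x, cluster, k):
    # each new center is the mean of the points currently assigned to that cluster
    return np.array([x[cluster == i].mean(axis=0) for i in range(k)])

def k_means(x, k, max_iter=100):
    centers = x[np.random.choice(x.shape[0], k, replace=False)]  # random initial centers
    for _ in range(max_iter):
        cluster = calc_cluster(x, centers)
        new_centers = update_centers(x, cluster, k)
        if np.allclose(centers, new_centers):  # converged: centers stopped moving
            break
        centers = new_centers
    return cluster, centers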

Cluster Result:

We have chosen 2 clusters for this data set. The 3D plot of the cluster with respective data
points is shown below:

Figure 06: 3-dimensional plot of the clusters

2.3.3 Performance Evaluation

With two clusters, the k-means model gives a good interpretation of the 15 3D data points.
If the dataset is divided into more than two clusters, the points appear scattered in the plot.
Cluster one contains 9 points and cluster two contains the remaining 6 points.

2.4 Decision Tree

A decision tree is a supervised machine learning tool used for both classification and
regression tasks. It works by breaking down data into smaller groups, assigning each to a
specific class or value. This tree-like structure consists of nodes representing feature tests
and branches showing the test results. The final classifications or values are found in the
leaf nodes. To make predictions, the tree is traversed from the root to a leaf, following
branches that match the data's characteristics, leading to a prediction located at the leaf node
[4].
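The traversal step can be illustrated with a minimal recursive sketch (assuming the tree is
stored as nested dictionaries keyed by feature and feature value, as in the appendix code).

def predict(instance, tree):
    # Leaf node: the stored class label is returned
    if not isinstance(tree, dict):
        return tree
    feature = next(iter(tree))                 # feature tested at this node
    branch = tree[feature][instance[feature]]  # follow the branch matching the instance
    return predict(instance, branch)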
2.4.1 Dataset

For the implementation of the decision tree algorithm, we use the ‘Play Tennis’ dataset. It has
three categorical features and one numerical feature. The categorical features are Outlook,
Temperature, and Humidity; the numerical feature is wind speed in mph. By analyzing this
data, we have to predict whether a person will play tennis or not.

Figure 07: Snapshot of PlayTennis dataset

The numerical feature is then converted into categorical values as part of preprocessing.
2.4.2 Implementation

❖ Entropy Calculation (calc_entropy):

• Before building the decision tree, the entropy of the dataset is calculated. Entropy is
an indicator of impurity or disorder in a dataset and serves as the splitting criterion in
this code; a more homogeneous dataset has lower entropy.
❖ Information Gain Calculation (infoGain):
• Information gain quantifies the reduction in entropy achieved by splitting the dataset
on a particular feature. It measures how much information about the target variable
(here, "Play") is gained by using that feature for splitting.
• The algorithm determines the information gain by comparing the entropy of the
dataset before and after the split for each feature. The feature that yields the highest
information gain is chosen as the splitting node. (A minimal sketch of the entropy and
information-gain functions follows this list.)
❖ Attribute Selection (best_tree):
• Using an iterative process, the best_tree function chooses the feature that maximizes
information gain from among all those that are accessible. For the current node of
the decision tree, this chosen characteristic serves as the splitting attribute.
❖ Tree Construction (tree):
• The decision tree is built using the recursive tree function.
• Base case: It returns the value of the target variable ('Play') as a leaf node if all data
points in the current node have the same value for it.
• Recursive case: If the target variable in the current node has more than one value,
the method uses ‘best_tree’ function to choose the best splitting attribute.
• After that, the function builds a subtree for each distinct value of the chosen
characteristic. Based on this property, the dataset is separated into subgroups, and
the procedure is repeated for each subset. Subtrees that are created are joined to the
current node.
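As referenced above, a minimal sketch of the entropy and information-gain calculations (it
mirrors the appendix code; data is a DataFrame whose target column is 'Play').

import numpy as np
import pandas as pd

def calc_entropy(data):
    if len(data) == 0:
        return 0.0
    p = data["Play"].value_counts() / len(data)   # class proportions
    return -np.sum(p * np.log2(p))

def info_gain(data, feature):
    # entropy before the split minus the weighted entropy of the subsets after the split
    weighted = sum((len(subset) / len(data)) * calc_entropy(subset)
                   for _, subset in data.groupby(feature))
    return calc_entropy(data) - weighted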
2.4.3 Performance Evaluation
The decision tree model achieves almost 100% accuracy on the Play Tennis test data. From
the fitted tree we can see that the ‘Outlook’ feature has the greatest impact: if the outlook is
Overcast, the tree immediately predicts ‘Yes’.

2.5 Artificial Neural Network (ANN)


A machine learning model called an artificial neural network (ANN) is modeled after the
human brain. It is a network of linked nodes known as neurons that develops the ability to carry
out a job by processing data and modifying their connections [5].
The following are the main ideas behind ANNs:

• Neurons: Neurons are an ANN's fundamental building blocks. They are connected to
one another and collaborate to complete a task.
• Weights: Weights are associated with the connections between neurons. They control
how strongly the signal passed from one neuron influences another.
• Activation function: A mathematical function that determines the output of a neuron.
• Learning: The process by which an ANN adjusts its weights from data, typically
through a method known as backpropagation. (A one-neuron sketch follows this list.)
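These ideas can be condensed into a single artificial neuron: a weighted sum of the inputs
plus a bias, passed through an activation function. A minimal sketch (the inputs, weights,
and bias below are placeholder values).

import numpy as np

def neuron(x, w, b):
    # weighted sum of the inputs plus a bias, passed through a sigmoid activation
    return 1 / (1 + np.exp(-(np.dot(w, x) + b)))

print(neuron(x=np.array([0.5, 1.0]), w=np.array([0.2, -0.4]), b=0.1))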
2.5.1 Dataset
We implemented a basic Artificial Neural Network (ANN) using the well-known Iris dataset,
which can be loaded with the `load_iris()` function from the `sklearn.datasets` module. This
dataset is frequently used for classification tasks and contains 150 data points, each
representing a flower. Each flower's characteristics include sepal length, sepal width, petal
length, and petal width. The goal is to classify each flower into one of three species: Iris-setosa,
Iris-versicolor, or Iris-virginica.
2.5.2 Implementation
The Iris dataset has four features per sample, so the input layer of our ANN has 4 inputs. We
then declare a hidden layer of 32 nodes, and the output layer has 3 nodes. The output layer is
passed through the softmax activation function, as this is a multiclass classification model.

Figure 08: Implemented artificial neural network model

• Forward Propagation (forward_propagation):
This function calculates the neural network's forward pass given input data X and
initialized weights and biases. Using the sigmoid activation for the hidden layer and the
softmax activation for the output layer, it determines the hidden layer activations (a)
and the final class probabilities (y_cap).
• Backpropagation (back_propagation):
This function uses backpropagation to compute the gradients of the loss with respect to the
weights and biases of the hidden and output layers. The resulting parameter updates are:

$w_i = w_i - \alpha \cdot \frac{\partial L}{\partial w_i}$

$b_i = b_i - \alpha \cdot \frac{\partial L}{\partial b_i}$
• Loss Function (loss_function):
The cross-entropy loss between the anticipated and actual class probabilities is
calculated using the loss function. It gauges how effectively the model is working.
• SoftMax and SoftMax derivative activation functions:
The softmax function, which is frequently employed for multiclass classification
applications, is calculated by softmax(z). It transforms a vector z of scores into
probability distributions across several classes.
The softmax function's derivative is calculated by softmax_derivative(z). When dealing
with multiclass classification, it is utilized in backpropagation.
• Training loop (train):
The train function trains the neural network with gradient descent. In every epoch the
gradients are computed with backpropagation and used to update the weights and
biases, the loss is computed, and its value is appended to the loss history. The learning
rate (lr) and the number of training epochs (epochs) control the training process. (A
minimal sketch of one training step follows this list.)
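As noted above, a minimal sketch of one training step (forward pass, cross-entropy loss,
backpropagation, and the parameter update), loosely following the appendix implementation;
here the sigmoid derivative is written directly in terms of the hidden activations, X is
(n_samples, n_features) and y is one-hot encoded.

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(z):
    return np.exp(z) / np.sum(np.exp(z), axis=0, keepdims=True)

def train_step(X, y, w_hid, b_hid, w_out, b_out, lr):
    # forward pass: hidden layer (sigmoid), then output layer (softmax)
    a = sigmoid(w_hid @ X.T + b_hid)                 # shape (hidden, n)
    y_cap = softmax(w_out @ a + b_out)               # shape (classes, n)
    loss = -np.mean(np.sum(y.T * np.log(y_cap + 1e-10), axis=0))

    # backpropagation: gradients of the loss w.r.t. each weight and bias
    n = X.shape[0]
    dz_out = y_cap - y.T
    dw_out = dz_out @ a.T / n
    db_out = np.sum(dz_out, axis=1, keepdims=True) / n
    dz_hid = (w_out.T @ dz_out) * a * (1 - a)        # sigmoid derivative via its output
    dw_hid = dz_hid @ X / n
    db_hid = np.sum(dz_hid, axis=1, keepdims=True) / n

    # gradient-descent update of all parameters (in place)
    w_hid -= lr * dw_hid; b_hid -= lr * db_hid
    w_out -= lr * dw_out; b_out -= lr * db_out
    return loss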
2.5.3 Performance Evaluation

We trained our ANN model for 5000 epochs.

Figure 09: Loss per thousand epochs

Figure 10: Loss vs no. of epochs curve

The model's overall accuracy on the test dataset is about 83.33%.

3 Discussion
Machine learning algorithms are effective tools for prediction and data analysis. The problem
determines the algorithm to use. These algorithms have been applied in a variety of industries
to offer information and support decision-making.

We have implemented five major machine learning algorithms in the lab. Each algorithm has
its own strengths and weaknesses and specific fields of application.

➢ Apriori: Apriori excels at finding common itemsets in transactional data, making it ideal
for market basket analysis and recommendations. However, it can struggle with large and
high-dimensional datasets, potentially missing complex patterns. It finds applications in retail
for better product recommendations, in healthcare for treatment analysis, and in web analytics
for understanding user click behavior [6].
➢ Multivariable Linear Regression: Multivariable linear regression offers simplicity and
the ability to model the combined effect of multiple variables, making it ideal for forecasting
outcomes influenced by many factors. However, it assumes linear relationships, which do not
always hold, and it can be sensitive to outliers and multicollinearity. It is widely used in fields
like economics, medicine, and marketing for tasks such as GDP prediction, patient outcome
forecasting, and sales projection.
➢ K-Means Clustering: K-Means Clustering is great for handling large datasets efficiently
by grouping similar data points. Yet, it requires knowing the number of clusters in advance
and can be sensitive to initial cluster choices, struggling with irregular clusters. It finds
practical use in customer segmentation for market analysis, image compression for efficient
storage, and anomaly detection to detect fraud by identifying unusual data patterns [7].
➢ Decision Tree: Decision Trees are adaptable; they can perform both classification and
regression tasks and handle different types of data. Their drawbacks include a tendency to
overfit when trees become complex, sensitivity to small changes in the data, and difficulty
capturing relationships that go beyond hierarchical splits. They are useful in recommendation
systems, credit risk analysis, and medical diagnostics.
➢ ANN: Artificial Neural Networks (ANNs) excel at processing complex, high-
dimensional data, making them ideal for tasks like image and text analysis. However, they
require substantial data and computational resources, and their intricate structure can be
hard to understand. Nevertheless, ANNs find extensive use in image recognition, natural
language processing, autonomous vehicles, and various fields that demand complex data
analysis and decision-making.

Every machine learning algorithm comes with its unique advantages and limitations, rendering
them appropriate for particular tasks and fields. The selection of the most suitable algorithm
hinges on the characteristics of the data and the specific problem being addressed.

4 Conclusion
Machine learning algorithms are powerful tools that can be used to analyze and predict data.
The best algorithm for a particular problem depends on the specific circumstances, so it is
important to carefully evaluate the strengths and weaknesses of each algorithm before applying
any algorithm to a specific task. We have implemented five important machine learning
algorithms throughout the lab. This implementation provides clear and in-depth knowledge
of each algorithm, and we have learned how mathematical calculations and statistical
formulas are applied to build a machine learning algorithm. This hands-on experience of
implementing machine learning from scratch gives us a proper understanding of machine
learning models and their uses.

5 References
[1] I. H. Sarker, "Machine Learning: Algorithms, Real-World Applications and Research
Directions," SN Computer Science, vol. 2, p. 160, 2021.
[2] J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques, Morgan
Kaufmann, 2011.
[3] M. Badole, "Mastering Multiple Linear Regression: A Comprehensive Guide," Analytics
Vidhya, 1 May 2021. [Online]. Available:
https://www.analyticsvidhya.com/blog/2021/05/multiple-linear-regression-using-python-and-scikit-learn/.
[4] IBM, "Decision Trees," IBM. [Online]. Available:
https://www.ibm.com/topics/decision-trees.
[5] G. Singh, "Introduction to Artificial Neural Networks," Analytics Vidhya, 6 September
2021. [Online]. Available:
https://www.analyticsvidhya.com/blog/2021/09/introduction-to-artificial-neural-networks/.
[6] P. S, "Underrated Apriori Algorithm Based Unsupervised Machine Learning," Analytics
Vidhya, 29 January 2022. [Online]. Available:
https://www.analyticsvidhya.com/blog/2022/01/underrated-apriori-algorithm-based-unsupervised-machine-learning/.
[7] B. M. Banoula, "K-means Clustering Algorithm: Applications, Types, & How Does It
Work?," Simplilearn, 23 April 2023. [Online]. Available:
https://www.simplilearn.com/tutorials/machine-learning-tutorial/k-means-clustering-algorithm.

6 Appendices
➢ Apriori Algorithm (Source Code)
import pandas as pd

# pip install apyori
from apyori import apriori

# Each row of the CSV is one transaction; each column is one purchased item (or NaN).
data = pd.read_csv("store_data.csv")
data.shape
data.head()

# Build the list of transactions, one item list per row (skipping empty cells).
transactions = []
for i in range(data.shape[0]):
    transactions.append([str(data.values[i, j]) for j in range(data.shape[1])
                         if pd.notna(data.values[i, j])])

rules = apriori(transactions=transactions, min_support=0.003,
                min_confidence=0.2, min_lift=3, min_length=2, max_length=2)

results = list(rules)
results

➢ Multivariable Linear Regression (Source Code)


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv("property_listing_data_in_Bangladesh.csv")
df.head(3)

# --- Data Preprocessing ---
# Dropping unnecessary columns
df.drop(['title', 'adress', 'purpose', 'flooPlan', 'url', 'lastUpdated'],
        axis='columns', inplace=True)
df.head()
df['type'].value_counts()

# Removing Building and Duplex type properties from the dataset
df = df[~df['type'].isin(['Duplex', 'Building'])]
df['type'].value_counts()

# Dropping the "type" column
df.drop(["type"], axis=1, inplace=True)

# Cleaning the "beds", "bath" and "area" columns
df['beds'] = df['beds'].str.replace(' Bed', '').astype(int)
df['bath'] = df['bath'].str.replace(' Bath', '').astype(int)
df['area'] = df['area'].str.replace(' sqft', '').str.replace(',', '').astype(int)
df.head(3)
df.dtypes

# Converting the "price" column to numeric values
price_value = {'Thousand': 1000, 'Lakh': 100000}
def convert_price(value):
    number, unit = value.split()
    return float(number) * price_value[unit]
df['price'] = df['price'].apply(convert_price)
df.head()
df.isnull().sum()

# --- Normalizing the data (z-normalization) ---
data = pd.DataFrame(df)
mean = data.mean()
std = data.std()
df_norm = (data - mean) / std
df_norm.head()

# Shuffling the rows
shuffled_df = df_norm.sample(frac=1, random_state=10)
df_final = shuffled_df
values = df_final.values
x = values[:, :-1]
y = values[:, -1]

# 60% train / 20% validation / 20% test split
total_size = len(x)
train_size = int(0.6 * total_size)
val_size = int(0.2 * total_size)
test_size = total_size - train_size - val_size

x_train = x[:train_size]
y_train = y[:train_size]
x_val = x[train_size:train_size + val_size]
y_val = y[train_size:train_size + val_size]
x_test = x[train_size + val_size:]
y_test = y[train_size + val_size:]

# Adding a bias term column
X_train = np.column_stack((np.ones(len(x_train)), x_train))
X_val = np.column_stack((np.ones(len(x_val)), x_val))
X_test = np.column_stack((np.ones(len(x_test)), x_test))

learning_rate = 0.01
iter_number = 1000
weight = np.zeros(X_train.shape[1])

def cost_function(x, y, weight):
    m = len(y)
    h = x.dot(weight)
    cost = (1 / (2 * m)) * np.sum((h - y) ** 2)
    return cost

def gradient_descent(x_train, y_train, weight, alpha, iter_number, x_val, y_val):
    m = len(y_train)
    for i in range(iter_number):
        h = x_train.dot(weight)
        error = h - y_train
        gradient = (alpha / m) * x_train.T.dot(error)
        weight -= gradient
        train_loss = cost_function(x_train, y_train, weight)
        val_loss = cost_function(x_val, y_val, weight)
        print("Epoch", i + 1, "Training Loss:", train_loss, "Validation Loss:", val_loss)
    return weight

weight = gradient_descent(X_train, y_train, weight, learning_rate, iter_number, X_val, y_val)
print("Final Weight:", weight)

def model_evaluation(X, y, weight):
    cost = cost_function(X, y, weight)
    return cost

# Evaluating the model on the test dataset
print("Test Loss:", model_evaluation(X_test, y_test, weight))

# Calculating the model score (R-squared)
def model_score(x, y, weight):
    y_pred = x.dot(weight)
    sst = np.sum((y - np.mean(y)) ** 2)
    ssr = np.sum((y - y_pred) ** 2)
    score = 1 - (ssr / sst)
    return score

score = model_score(X_test, y_test, weight)
print("Model score:", score)

➢ K-Means Algorithm (Source Code)


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

p_df = pd.read_csv("data.csv")
p_df.head()
data = p_df.drop(['id'], axis=1)
np_data = data.values

# Normalizing the data
scaler = StandardScaler()
np_data = scaler.fit_transform(np_data)
df = pd.DataFrame(np_data, columns=['X', 'Y', 'Z'])
df.head()

def calc_cluster(x, centers):
    # Assign each point to the nearest center (Euclidean distance)
    dist = np.sqrt(np.sum((x[:, np.newaxis] - centers) ** 2, axis=2))
    return np.argmin(dist, axis=1)

def update_centers(x, cluster, k):
    # New center of each cluster = mean of the points assigned to it
    new_centers = np.zeros((k, x.shape[1]))
    for i in range(k):
        cluster_points = x[cluster == i]
        new_centers[i] = cluster_points.mean(axis=0)
    return new_centers

def k_means(x, k):
    # Initialize centers with k random data points (without replacement)
    centers = x[np.random.choice(x.shape[0], k, replace=False)]
    for i in range(100):
        cluster = calc_cluster(x, centers)
        new_centers = update_centers(x, cluster, k)
        if np.all(centers == new_centers):
            break
        centers = new_centers
    return cluster, centers

x = df.values
k = int(input("Enter Number of Clusters:"))
clusters, centers = k_means(x, k)
print(clusters, centers, end='\n')

# Plotting in 3D
fig = plt.figure(figsize=(15, 12))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x[:, 0], x[:, 1], x[:, 2], c=clusters)
ax.scatter(centers[:, 0], centers[:, 1], centers[:, 2], marker='X', color='red')
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
plt.title('K-means Clustering')
plt.show()

➢ Decision Tree (Source Code)


import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

df = pd.read_csv("PlayTennis.csv")
df.head()

# Binning the numerical wind-speed column into categories
w = "Wind Speed (mph)"
def bin_wind(speed):
    if speed <= 25:
        return "Normal"
    elif speed <= 40:
        return "Medium"
    return "Gale"
df["Wind"] = df[w].apply(bin_wind)
df = df.drop([w], axis=1)
df.head()

# Renaming the target column to "Play"
df["Play"] = df["Play Tennis"]
df = df.drop(['Play Tennis'], axis=1)
df.head()

x = df.drop(columns=['Play']).columns
train, test = train_test_split(df, test_size=0.2, random_state=42)

def calc_entropy(data):
    n = len(data)
    if n == 0:
        return 0
    play_counts = data['Play'].value_counts()
    p = play_counts / n
    entropy = -np.sum(p * np.log2(p))
    return entropy

def infoGain(data, play_class):
    entropy_val = calc_entropy(data)
    n = len(data)
    unique_val = data[play_class].unique()
    weighted_entropy = 0
    for value in unique_val:
        subset = data[data[play_class] == value]
        p = len(subset) / n
        weighted_entropy += p * calc_entropy(subset)
    info_gain = entropy_val - weighted_entropy
    return info_gain

def best_tree(data, features):
    # Choose the feature with the highest information gain
    max_gain = 0
    best_node = None
    for feature in features:
        gain = infoGain(data, feature)
        if gain > max_gain:
            max_gain = gain
            best_node = feature
    return best_node

def tree(data, features):
    # Leaf node: all remaining rows share the same class
    if len(data['Play'].unique()) == 1:
        return data['Play'].iloc[0]
    best_node = best_tree(data, features)
    ttree = {best_node: {}}
    for value in data[best_node].unique():
        subset = data[data[best_node] == value].drop(columns=[best_node])
        subtree = tree(subset, features.drop(best_node))
        ttree[best_node][value] = subtree
    return ttree

DT = tree(train, x)
DT

def predict(instance, tree):
    if isinstance(tree, str):
        return tree
    x = next(iter(tree))
    val = instance[x]
    sub_tree = tree[x][val]
    return predict(instance, sub_tree)

pred_data = {'Outlook': 'Rain', 'Temperature': 'Hot', 'Humidity': 'High', 'Wind': 'Gale'}
play = predict(pred_data, DT)
print("Predicted class:", play)

pred_data = {'Outlook': 'Rain', 'Temperature': 'Hot', 'Humidity': 'High', 'Wind': 'Normal'}
play = predict(pred_data, DT)
print("Predicted class:", play)

def accuracy(test_set, decision_tree):
    correct_predictions = 0
    total_instances = len(test_set)
    for i, instance in test_set.iterrows():
        predicted_class = predict(instance, decision_tree)
        if predicted_class == instance['Play']:
            correct_predictions += 1
    ac = correct_predictions / total_instances
    return ac

ac = accuracy(test, DT)
print("Accuracy:", ac)

➢ Artificial Neural Network – ANN (Source Code)


import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data
y = iris.target
num_classes = len(np.unique(y))
y_one_hot = np.eye(num_classes)[y]   # one-hot encode the labels

X_train, X_test, y_train, y_test = train_test_split(X, y_one_hot, test_size=0.2, random_state=1)

def normalize(X):
    mean = np.mean(X, axis=0)
    std = np.std(X, axis=0)
    return (X - mean) / std

X_train = normalize(X_train)
X_test = normalize(X_test)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return sigmoid(x) * (1 - sigmoid(x))

def softmax(z):   # for multiclass classification
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z, axis=0, keepdims=True)

def softmax_derivative(z):
    s = softmax(z)
    return s * (1 - s)

def initialize_parameters(in_dim, hid_dim, out_dim):
    np.random.seed(1)
    w_hid = np.random.randn(hid_dim, in_dim) * 0.01
    b_hid = np.zeros((hid_dim, 1))
    w_out = np.random.randn(out_dim, hid_dim) * 0.01
    b_out = np.zeros((out_dim, 1))
    return w_hid, b_hid, w_out, b_out

def loss_function(A_output, y):
    # cross-entropy loss between predicted and actual class probabilities
    epsilon = 1e-10
    m = y.shape[1]
    loss = -1 / m * np.sum(y * np.log(A_output + epsilon))
    return loss

def forward_propagation(X, w_hid, b_hid, w_out, b_out):
    z_hid = np.dot(w_hid, X.T) + b_hid
    a = sigmoid(z_hid)                 # hidden layer activations
    z_out = np.dot(w_out, a) + b_out
    y_cap = softmax(z_out)             # output class probabilities
    return y_cap, a

def back_propagation(X, y, y_cap, w_out, a):
    k = X.shape[0]
    dz_out = y_cap - y.T
    dw_out = np.dot(dz_out, a.T) / k
    db_out = np.sum(dz_out, axis=1, keepdims=True) / k
    dz_hid = np.dot(w_out.T, dz_out) * sigmoid_derivative(a)
    dw_hid = np.dot(dz_hid, X) / k
    db_hid = np.sum(dz_hid, axis=1, keepdims=True) / k
    return dw_hid, db_hid, dw_out, db_out

loss_history = []

def train(X, y, hid_layer, lr, epochs):
    input_size = X.shape[1]
    output_size = y.shape[1]
    w_hid, b_hid, w_out, b_out = initialize_parameters(input_size, hid_layer, output_size)
    for epoch in range(epochs):
        y_cap, a = forward_propagation(X, w_hid, b_hid, w_out, b_out)
        loss = loss_function(y_cap.T, y)
        loss_history.append(loss)
        dw_hid, db_hid, dw_out, db_out = back_propagation(X, y, y_cap, w_out, a)
        w_hid -= lr * dw_hid
        b_hid -= lr * db_hid
        w_out -= lr * dw_out
        b_out -= lr * db_out
        if epoch % 1000 == 0:
            print(f"Epoch {epoch}, Loss: {loss}")
    return w_hid, b_hid, w_out, b_out

lr = 0.1
epochs = 5000
hid_layer = 32

weights_hidden, biases_hidden, weights_output, biases_output = train(X_train, y_train, hid_layer, lr, epochs)

def predict(X, weights_hidden, biases_hidden, weights_output, biases_output):
    A_output, _ = forward_propagation(X, weights_hidden, biases_hidden, weights_output, biases_output)
    predictions = np.argmax(A_output, axis=0)
    return predictions

predictions = predict(X_test, weights_hidden, biases_hidden, weights_output, biases_output)

def calculate_accuracy(y_true, y_pred):
    accuracy = np.mean(y_true == y_pred) * 100
    return accuracy

test_accuracy = calculate_accuracy(np.argmax(y_test, axis=1), predictions)
print("Test Accuracy:", test_accuracy)

plt.plot(range(epochs), loss_history)
plt.xlabel('No. of Epochs')
plt.ylabel('Loss')
plt.title('Training Loss Curve')
plt.show()
