Faculty Development Program on Artificial Intelligence & Machine Learning for Engineering Applications
Dr. G.MADHU
Associate Professor
E-mail: madhu_g@vnrvjiet.in
Pre-Processing in Machine Learning
Objectives
Standardization
Robust Scaler
Instances:
A1    A2  A3  A4  Class
12    ?   2   5   Yes
14.5  B   4   7   Yes
10.7  A   6   9   No
Step-1: Form two clusters (k = 2) from the given dataset, which has nine data samples and four features. (The initial clusters are chosen randomly.)
Cluster-1: R4, R5, R6, R9
Cluster-2: R1, R2, R8
For example, for attribute A1 in Cluster-1 (R4, R5, R6, R9):
Centroid(A1) = (30 + 35 + 25 + 25)/4 = 115/4 = 28.75
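The centroid calculation above can be sketched for all four attributes at once. This is a minimal illustration, assuming NumPy is available; the record values are copied from the data table of this worked example, and the names `records`, `clusters`, and `centroid` are hypothetical helpers, not part of the original slides.

```python
import numpy as np

# Attribute values (A1..A4) of the complete records in the worked example.
records = {
    "R1": [15, 1, 1, 9],
    "R2": [20, 3, 2, 7],
    "R4": [30, 4, 1, 9],
    "R5": [35, 2, 1, 7],
    "R6": [25, 4, 2, 9],
    "R8": [20, 3, 2, 7],
    "R9": [25, 2, 1, 9],
}

# Initial (randomly chosen) cluster membership from Step-1.
clusters = {
    "Cluster-1": ["R4", "R5", "R6", "R9"],
    "Cluster-2": ["R1", "R2", "R8"],
}

def centroid(names):
    """Attribute-wise mean of the named records."""
    return np.mean([records[n] for n in names], axis=0)

centroids = {name: centroid(members) for name, members in clusters.items()}
for name, c in centroids.items():
    print(name, np.round(c, 4))
```

The first entry of the Cluster-1 centroid is the A1 value 28.75 computed by hand above.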
Record  C1 Distance   C2 Distance
R1      13.91492005   3.887301263
R2      8.909264841   1.943650632
R3      5.22613624    7.187952884
R4      1.695582496   11.87901979
R5      6.33442973    16.69663972
R6      3.984344363   7.007932014
R7      13.80670127   3.366501646
R8      8.909264841   1.943650632
R9      3.921096785   6.839428176
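For cluster-2 (Centroid-2 = (55/3, 7/3, 5/3, 23/3) is the mean of R1, R2, R8; the missing attribute of each record is left out of the sum):
D(R3, Centroid-2) = sqrt((25 - 55/3)^2 + (2 - 5/3)^2 + (5 - 23/3)^2) = 7.187952884
D(R7, Centroid-2) = sqrt((15 - 55/3)^2 + (2 - 7/3)^2 + (2 - 5/3)^2) = 3.366501646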
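The distances for the two records with missing values can be checked with a short sketch. It assumes that the missing attribute is simply skipped when computing the Euclidean distance; the slides do not state this rule explicitly, but it reproduces the values quoted above. The function name `partial_distance` and the use of `np.nan` as the missing-value marker are illustrative choices.

```python
import numpy as np

# Centroids of the initial clusters (attribute-wise means).
centroid_1 = np.array([28.75, 3.0, 1.25, 8.5])    # mean of R4, R5, R6, R9
centroid_2 = np.array([55 / 3, 7 / 3, 5 / 3, 23 / 3])  # mean of R1, R2, R8

# Records with a missing attribute; np.nan stands for "?".
r3 = np.array([25, np.nan, 2, 5])   # A2 missing
r7 = np.array([15, 2, 2, np.nan])   # A4 missing

def partial_distance(record, centroid):
    """Euclidean distance over the observed attributes only (assumption)."""
    observed = ~np.isnan(record)
    diff = record[observed] - centroid[observed]
    return float(np.sqrt(np.sum(diff ** 2)))

d_r3_c2 = partial_distance(r3, centroid_2)
d_r7_c2 = partial_distance(r7, centroid_2)
d_r3_c1 = partial_distance(r3, centroid_1)
d_r7_c1 = partial_distance(r7, centroid_1)
print(d_r3_c2, d_r7_c2, d_r3_c1, d_r7_c1)
```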
Step-7: Again form the mapping of these two distances for the missing records (Mapping = C2 Distance + C1 Distance).
Record  C2 Distance  C1 Distance  Mapping
R3      7.187953     5.18411      12.37206
R7      3.366502     13.8067      17.1732
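In the table above, the Mapping column appears to be the sum of the two distances for each record. The sketch below checks that arithmetic; the dictionary names are hypothetical and the numbers are copied from the table.

```python
# (C2 distance, C1 distance) for the two records with missing values.
distances = {
    "R3": (7.187953, 5.18411),
    "R7": (3.366502, 13.8067),
}

# Mapping value taken as the sum of both distances (assumption inferred from the table).
mapping = {record: c2 + c1 for record, (c2, c1) in distances.items()}

for record, value in mapping.items():
    print(record, round(value, 5))
```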
R3 25 ? 2 5 CLASS-1
R9 25 2 1 9 CLASS-2
R7 15 2 2 ? CLASS-2
R9 25 2 1 9 CLASS-2
A1 A2 A3 A4 Decision
R1 15 1 1 9 CLASS-1
R2 20 3 2 7 CLASS-2
R3 25 2 2 5 CLASS-1
R4 30 4 1 9 CLASS-2
R5 35 2 1 7 CLASS-2
R6 25 4 2 9 CLASS-1
R7 15 2 2 9 CLASS-2
R8 20 3 2 7 CLASS-1
R9 25 2 1 9 CLASS-2
Step 1: Normalize the data
• The first step is to normalize the data so that PCA works properly.
• This is done by subtracting each column's mean from every value in that column.
• So if we have two dimensions X and Y, every X value becomes x = X - mean(X) and every Y value becomes y = Y - mean(Y).
• This produces a dataset whose mean is zero.
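The mean-subtraction step can be sketched in a few lines. The data below are hypothetical values chosen only for illustration (they do not come from the slides), and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical two-column data set: column 0 is X, column 1 is Y.
data = np.array([
    [2.0, 1.0],
    [4.0, 3.0],
    [6.0, 8.0],
    [8.0, 4.0],
])

# Subtract each column's mean from every value in that column.
column_means = data.mean(axis=0)
centered = data - column_means

# Each column of the centered data now has mean zero (up to rounding).
print(centered.mean(axis=0))
```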
Step 2: Computing the covariance matrix
• Mathematically, a covariance matrix is a p × p matrix, where p is the number of dimensions of the data set.
• Each entry in the matrix is the covariance of the corresponding pair of variables.
• For a 2-dimensional data set with variables a and b, the covariance matrix is the 2×2 matrix:
    | Cov(a, a)  Cov(a, b) |
    | Cov(b, a)  Cov(b, b) |
How to Compute Covariance
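A minimal sketch of the covariance computation follows. It assumes the sample covariance with an n - 1 denominator, since the formula on the original slide did not survive; some presentations divide by n instead. The variables `a` and `b` hold hypothetical values, and `covariance` is an illustrative helper.

```python
import numpy as np

# Hypothetical observations of two variables a and b.
a = np.array([2.0, 4.0, 6.0, 8.0])
b = np.array([1.0, 3.0, 8.0, 4.0])

def covariance(u, v):
    """Sample covariance: sum((u - mean(u)) * (v - mean(v))) / (n - 1)."""
    return float(np.sum((u - u.mean()) * (v - v.mean())) / (len(u) - 1))

# 2x2 covariance matrix: variances on the diagonal, covariances off it.
cov_matrix = np.array([
    [covariance(a, a), covariance(a, b)],
    [covariance(b, a), covariance(b, b)],
])
print(cov_matrix)
```

The same matrix is returned by `np.cov(a, b)`, which also uses the n - 1 denominator by default.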
Step 3: Calculating the Eigenvectors and Eigenvalues
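As a hedged illustration of this step, the eigenvalues and eigenvectors of a covariance matrix can be obtained with `np.linalg.eigh`, which is intended for symmetric matrices. The 2×2 matrix below is a hypothetical example, not a value from the slides; sorting by decreasing eigenvalue is a common convention rather than something the source specifies.

```python
import numpy as np

# Hypothetical symmetric covariance matrix.
cov_matrix = np.array([
    [2.0, 0.8],
    [0.8, 0.6],
])

# eigh returns eigenvalues in ascending order; columns of the second
# result are the corresponding unit-length eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

# Reorder so that the largest eigenvalue (most variance) comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

print(eigenvalues)
print(eigenvectors)
```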
Step 5: Forming Principal Components
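One common way to form the principal components is to project the mean-centered data onto the eigenvectors of its covariance matrix. The sketch below assumes that approach; the data are hypothetical, and the names `components` and `scores` are illustrative.

```python
import numpy as np

# Hypothetical data set: rows are samples, columns are variables.
data = np.array([
    [2.0, 1.0],
    [4.0, 3.0],
    [6.0, 8.0],
    [8.0, 4.0],
])

# Center the data and compute its covariance matrix (columns are variables).
centered = data - data.mean(axis=0)
cov_matrix = np.cov(centered, rowvar=False)

# Eigen-decomposition, sorted by decreasing eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
components = eigenvectors[:, order]

# Project the centered data onto the eigenvectors.
scores = centered @ components
print(scores)
```

In this sketch the columns of `scores` are uncorrelated, and the variance of each column equals the corresponding eigenvalue.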