
PRINCIPAL COMPONENTS ANALYSIS
REMOTE SENSING-II, ASSIGNMENT-II
NARESH J
2022107302

Principal Components Analysis:

Principal Component Analysis (PCA) is an unsupervised learning algorithm used for dimensionality reduction in machine learning. It is a statistical procedure that converts observations of correlated features into a set of linearly uncorrelated features by means of an orthogonal transformation. These new transformed features are called the Principal Components.

PCA generally tries to find a lower-dimensional surface onto which the high-dimensional data can be projected.

PCA works by considering the variance of each attribute, because attributes with high variance carry most of the information and tend to give a good separation between classes; this is what allows the dimensionality to be reduced. Some real-world applications of PCA are image processing, movie recommendation systems, and optimizing the power allocation in various communication channels. Because PCA is a feature extraction technique, it retains the most important variables and drops the least important ones.
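As a hedged illustration of PCA in practice, the following minimal Python sketch (assuming NumPy and scikit-learn are available; the data, shapes, and variable names are purely illustrative) reduces a small synthetic data set from four variables to two principal components:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))           # 100 observations, 4 variables
X[:, 1] = X[:, 0] + 0.1 * X[:, 1]       # make two variables strongly correlated

pca = PCA(n_components=2)               # keep the 2 most informative components
X_reduced = pca.fit_transform(X)        # project the data onto those components

print(X_reduced.shape)                  # (100, 2)
print(pca.explained_variance_ratio_)    # share of variance carried by each component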


Step-by-Step Explanation of PCA:


Step 1: Standardization:

The aim of this step is to standardize the range of the continuous initial
variables so that each one of them contributes equally to the analysis.

More specifically, the reason it is critical to perform standardization prior to PCA is that PCA is quite sensitive to the variances of the initial variables. If there are large differences between the ranges of the initial variables, those with larger ranges will dominate over those with smaller ranges (for example, a variable that ranges between 0 and 100 will dominate over a variable that ranges between 0 and 1), which will lead to biased results. Transforming the data to comparable scales prevents this problem.

Mathematically, this can be done by subtracting the mean and dividing by the
standard deviation for each value of each variable.

Once the standardization is done, all the variables will be transformed to the
same scale.
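A minimal sketch of this standardization step in Python/NumPy (the data values and variable names below are only illustrative):

import numpy as np

# Rows are observations, columns are variables (illustrative values)
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0]])

# z = (x - mean) / standard deviation, applied column by column
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_std.mean(axis=0))   # approximately 0 for every variable
print(X_std.std(axis=0))    # 1 (up to floating point) for every variable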

Step 2: Covariance Matrix Computation

The aim of this step is to understand how the variables of the input data set vary from the mean with respect to each other, or in other words, to see whether there is any relationship between them. Variables are sometimes so highly correlated that they contain redundant information; to identify these correlations, we compute the covariance matrix.


The covariance matrix is a p × p symmetric matrix (where p is the number of dimensions) whose entries are the covariances associated with all possible pairs of the initial variables. For example, for a 3-dimensional data set with 3 variables x, y, and z, the covariance matrix is a 3 × 3 matrix of this form:

Covariance Matrix for 3-Dimensional Data:
Cov(x,x)  Cov(x,y)  Cov(x,z)
Cov(y,x)  Cov(y,y)  Cov(y,z)
Cov(z,x)  Cov(z,y)  Cov(z,z)

Since the covariance of a variable with itself is its variance (Cov(a,a) = Var(a)), the main diagonal (top left to bottom right) actually contains the variances of each initial variable. And since covariance is commutative (Cov(a,b) = Cov(b,a)), the entries of the covariance matrix are symmetric with respect to the main diagonal, which means that the upper and lower triangular portions are equal.

Now that we know that the covariance matrix is nothing more than a table that summarizes the correlations between all possible pairs of variables, let's move to the next step.
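Continuing the illustrative sketch from Step 1 (same assumed data; np.cov with rowvar=False treats columns as variables), the covariance matrix can be computed as follows:

import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0]])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)    # Step 1: standardize

# rowvar=False tells NumPy that variables are in columns, observations in rows
cov_matrix = np.cov(X_std, rowvar=False)
print(cov_matrix)    # p x p symmetric matrix; the diagonal holds the variances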

Step 3: Compute the eigenvectors and eigenvalues of the covariance matrix to identify the principal components

Eigenvectors and eigenvalues are the linear algebra concepts that we need to
compute from the covariance matrix in order to determine the principal
components of the data.


What you first need to know about eigenvectors and eigenvalues is that they
always come in pairs, so that every eigenvector has an eigenvalue. Also, their
number is equal to the number of dimensions of the data. For example, for a 3-
dimensional data set, there are 3 variables, therefore there are 3 eigenvectors
with 3 corresponding eigenvalues.

Eigenvectors and eigenvalues are what lie behind all the magic of principal components: the eigenvectors of the covariance matrix are actually the directions of the axes with the most variance (most information), and these are what we call the Principal Components. The eigenvalues are simply the coefficients attached to the eigenvectors, and they give the amount of variance carried by each Principal Component.

By ranking your eigenvectors in order of their eigenvalues, highest to lowest, you get the principal components in order of significance.
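A short Python/NumPy sketch of this step (continuing the same illustrative two-variable data; np.linalg.eigh is used because the covariance matrix is symmetric):

import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0]])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
cov_matrix = np.cov(X_std, rowvar=False)

# eigh returns eigenvalues in ascending order for a symmetric matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

# Rank from highest to lowest eigenvalue to get components in order of significance
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]    # column i holds the eigenvector of PC(i+1)

print(eigenvalues)
print(eigenvectors)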

Principal Component Analysis Example:

Let’s suppose that our data set is 2-dimensional with 2 variables x,y and that the
eigenvectors and eigenvalues of the covariance matrix are as follows:

If we rank the eigenvalues in descending order, we get λ1>λ2, which means that
the eigenvector that corresponds to the first principal component (PC1)
is v1 and the one that corresponds to the second principal component (PC2)
is v2.


After having the principal components, to compute the percentage of variance (information) accounted for by each component, we divide the eigenvalue of each component by the sum of all eigenvalues. If we apply this to the example above, we find that PC1 and PC2 carry respectively 96 percent and 4 percent of the variance of the data.
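As a purely numerical illustration (the eigenvalues below are hypothetical, chosen only so that the split comes out close to the 96 percent / 4 percent figures quoted above):

lam1, lam2 = 1.28, 0.05    # hypothetical eigenvalues of PC1 and PC2
total = lam1 + lam2
print(lam1 / total)        # about 0.96 -> PC1 carries roughly 96% of the variance
print(lam2 / total)        # about 0.04 -> PC2 carries roughly 4% of the variance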


Step 4: Create a Feature Vector

As we saw in the previous step, computing the eigenvectors and ordering them by their eigenvalues in descending order allows us to find the principal components in order of significance. In this step, we choose whether to keep all of these components or discard those of lesser significance (those with low eigenvalues), and form with the remaining ones a matrix of vectors that we call the feature vector.

So, the feature vector is simply a matrix that has as columns the eigenvectors of the components that we decide to keep. This makes it the first step towards dimensionality reduction, because if we choose to keep only k eigenvectors (components) out of p, the final data set will have only k dimensions.

Principal Component Analysis Example:

Continuing with the example from the previous step, we can either form a
feature vector with both of the eigenvectors v1 and v2:


Or discard the eigenvector v2, which is the one of lesser significance, and form
a feature vector with v1 only:

Discarding the eigenvector v2 will reduce the dimensionality by 1, and will consequently cause a loss of information in the final data set. But given that v2 was carrying only 4 percent of the information, the loss will therefore not be important, and we will still have the 96 percent of the information that is carried by v1.

So, as we saw in the example, it is up to you to choose whether to keep all the components or discard the ones of lesser significance, depending on what you are looking for. If you just want to describe your data in terms of new, uncorrelated variables (the principal components) without seeking to reduce dimensionality, leaving out the less significant components is not needed.
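A short sketch of forming the feature vector in Python/NumPy (continuing the illustrative example; k is the number of components kept):

import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0]])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X_std, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvectors = eigenvectors[:, order]    # columns sorted by decreasing eigenvalue

k = 1                                    # keep only PC1, discard PC2
feature_vector = eigenvectors[:, :k]     # p x k matrix of the kept eigenvectors
print(feature_vector)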

Step 5: Recast the Data Along the Principal Components Axes

In the previous steps, apart from standardization, you do not make any changes to the data; you just select the principal components and form the feature vector, but the input data set always remains in terms of the original axes (i.e., in terms of the initial variables).

In this step, which is the last one, the aim is to use the feature vector formed from the eigenvectors of the covariance matrix to reorient the data from the original axes to the ones represented by the principal components (hence the name Principal Components Analysis). This can be done by multiplying the transpose of the feature vector by the transpose of the standardized original data set.
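A minimal sketch of this final projection in Python/NumPy, following the FeatureVector-transpose times StandardizedData-transpose formulation described above (data values are illustrative):

import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0]])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X_std, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
feature_vector = eigenvectors[:, order][:, :1]   # keep PC1 only

# FinalData = FeatureVector^T x StandardizedData^T (components in rows, samples in columns)
final_data = feature_vector.T @ X_std.T
print(final_data.T)    # one PC1 score per observation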

Advantages of Principal Component Analysis:

Dimensionality Reduction: Principal Component Analysis is a popular technique used for dimensionality reduction, which is the process of reducing the number of variables in a dataset. By reducing the number of variables, PCA simplifies data analysis, improves performance, and makes it easier to visualize data.

Feature Selection: Principal Component Analysis can be used for feature selection, which is the process of selecting the most important variables in a dataset. This is useful in machine learning, where the number of variables can be very large, and it is difficult to identify the most important variables.

Data Visualization: Principal Component Analysis can be used for data visualization. By reducing the number of variables, PCA can plot high-dimensional data in two or three dimensions, making it easier to interpret.

Multicollinearity: Principal Component Analysis can be used to deal with multicollinearity, which is a common problem in regression analysis where two or more independent variables are highly correlated. PCA can help identify the underlying structure in the data and create new, uncorrelated variables that can be used in the regression model.

Noise Reduction: Principal Component Analysis can be used to reduce the noise in data. By removing the principal components with low variance, which are assumed to represent noise, Principal Component Analysis can improve the signal-to-noise ratio and make it easier to identify the underlying structure in the data.

Data Compression: Principal Component Analysis can be used for data compression. By representing the data using a smaller number of principal components, which capture most of the variation in the data, PCA can reduce the storage requirements and speed up processing.
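A hedged sketch of PCA-based compression and reconstruction with scikit-learn (assuming it is available; the data, shapes, and n_components value are only illustrative):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                # 500 samples, 20 variables
X[:, 10:] = X[:, :10] + 0.05 * X[:, 10:]      # introduce redundancy between columns

pca = PCA(n_components=5)                     # store 5 component scores instead of 20 values
X_compressed = pca.fit_transform(X)           # compact representation: 500 x 5
X_restored = pca.inverse_transform(X_compressed)

# A small reconstruction error means little information was lost by the compression
print(np.mean((X - X_restored) ** 2))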

Outlier Detection: Principal Component Analysis can be used for outlier detection. Outliers are data points that are significantly different from the other data points in the dataset. Principal Component Analysis can identify these outliers by looking for data points that are far from the other points in the principal component space.
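One simple, hedged way to implement this idea (a sketch only; the synthetic data and the 3-sigma threshold are illustrative choices, not a prescribed method) is to project the data into principal component space and flag points that lie unusually far from the centre:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
X[0] += 10.0                                    # plant one obvious outlier

scores = PCA(n_components=2).fit_transform(X)   # coordinates in principal component space
dist = np.linalg.norm(scores - scores.mean(axis=0), axis=1)

threshold = dist.mean() + 3 * dist.std()        # simple 3-sigma rule on the distances
print(np.where(dist > threshold)[0])            # indices of suspected outliers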

Disadvantages of Principal Component Analysis:

Interpretation of Principal Components: The principal components created by Principal Component Analysis are linear combinations of the original variables, and it is often difficult to interpret them in terms of the original variables. This can make it difficult to explain the results of PCA to others.

Data Scaling: Principal Component Analysis is sensitive to the scale of the data. If the data is not properly scaled, then PCA may not work well. Therefore, it is important to scale the data before applying Principal Component Analysis.

Information Loss: Principal Component Analysis can result in information loss. While Principal Component Analysis reduces the number of variables, it can also lead to loss of information. The degree of information loss depends on the number of principal components selected. Therefore, it is important to carefully select the number of principal components to retain.


Non-linear Relationships: Principal Component Analysis assumes that the relationships between variables are linear. However, if there are non-linear relationships between variables, Principal Component Analysis may not work well.

Computational Complexity: Computing Principal Component Analysis can be computationally expensive for large datasets. This is especially true if the number of variables in the dataset is large.

Overfitting: Principal Component Analysis can sometimes result in overfitting, which is when the model fits the training data too well and performs poorly on new data. This can happen if too many principal components are used or if the model is trained on a small dataset.
