Principal Component Analysis Tutorial


Principal Component Analysis

Introduction
The goal of PCA is dimensionality reduction, which is achieved by forming a set of basis vectors. Any sample from a data set can then be recreated as a linear combination of these basis vectors.

For example, in 3 dimensions the standard basis vectors are:

e_1 = [1, 0, 0]^T, \quad e_2 = [0, 1, 0]^T, \quad e_3 = [0, 0, 1]^T

So if we are given the sample vector:

v = [v_1, v_2, v_3]^T

We can reconstruct it using a linear combination of the basis vectors:

v = v_1 e_1 + v_2 e_2 + v_3 e_3

The basis vectors must be orthogonal to one another. This means that the dot product of any two distinct basis vectors is 0. For example, in the 3 dimensional case:

e_1 \cdot e_2 = e_1 \cdot e_3 = e_2 \cdot e_3 = 0

Naturally, the next question becomes: how can we find the basis vectors? This will be explained mathematically in the next section.
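Before moving on, here is a quick numerical check of the example above, written as a small Python/NumPy sketch (the sample values are made up purely for illustration):

import numpy as np

# Standard basis vectors in 3 dimensions
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
e3 = np.array([0.0, 0.0, 1.0])

# An arbitrary sample vector (values chosen only for illustration)
v = np.array([4.0, -2.0, 7.0])

# Reconstruct the sample as a linear combination of the basis vectors
reconstruction = v[0] * e1 + v[1] * e2 + v[2] * e3
print(np.allclose(v, reconstruction))                   # True

# The basis vectors are mutually orthogonal: every pairwise dot product is 0
print(np.dot(e1, e2), np.dot(e1, e3), np.dot(e2, e3))   # 0.0 0.0 0.0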

Principal Component Analysis Algorithm Steps


1. Find the mean vector.
2. Assemble all the data samples in a mean adjusted matrix.
3. Create the covariance matrix.
4. Compute the Eigen vectors and Eigen values.
5. Compute the basis vectors.
6. Represent each sample as a linear combination of basis vectors.

To begin, assume that we have a set of images we would like to perform Principal Component Analysis
(PCA) on. Assume that each image is x pixels by y pixels. We can treat each image as a vector of size x*y
without loss of information. In this way, we can consider an image as a point in x*y dimensional space.
[Figure: a 3x3 image and its 9x1 vector representation]
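A small NumPy sketch of this flattening (the 3x3 pixel values below are arbitrary placeholders, not taken from the figure):

import numpy as np

# A 3x3 image (pixel values are arbitrary placeholders)
image = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]])

# Flatten the image into a 9x1 column vector; no information is lost,
# since reshaping back recovers the original 3x3 image exactly
vector = image.reshape(-1, 1)            # shape (9, 1)
restored = vector.reshape(3, 3)          # shape (3, 3)
print(np.array_equal(image, restored))   # True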

Since images typically have hundreds of pixels in both the x and y dimensions, an image becomes a point in a very high dimensional space. Using PCA we can reduce this dimensionality significantly. One reason for reducing the dimensions is so that we can focus on the dimensions where there is a large difference between the images in our dataset, i.e., high variance. PCA reduces the dimensions mathematically so that the high variance dimensions (for a given dataset) are the ones kept in the reduced set of dimensions.

To understand this further, assume we have 500 images of faces in our dataset, each of which is 100 pixels by 100 pixels. Essentially, this means each image of a face is represented by 10,000 numbers (dimensions). Using PCA we can have each face represented by only 10 numbers (or however many reduced dimensions we want).

The steps in computing the PCA are:

1. Find the mean vector:

Assuming we have N samples x_1, x_2, \ldots, x_N, we can compute the mean vector m as:

m = \frac{1}{N} \sum_{i=1}^{N} x_i

Note that in our example N = 500, and the dimensions of m will be (10,000x1).
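A minimal NumPy sketch of this step, assuming the 500 image vectors are stacked as the columns of an array X of shape (10,000 x 500); the variable names and the random placeholder data are illustrative only:

import numpy as np

# Placeholder data set: 500 image vectors stacked as columns, shape (10000, 500)
X = np.random.rand(10000, 500)

# Step 1: mean vector -- the average over all samples, kept as a column vector
m = X.mean(axis=1, keepdims=True)
print(m.shape)   # (10000, 1)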

2. Assemble the mean adjusted matrix:

For every image vector x_i, the mean adjusted vector a_i can be computed as:

a_i = x_i - m

We put every mean adjusted vector together to form the mean adjusted matrix A:

A = [a_1 \; a_2 \; \cdots \; a_{500}]

Dimensions of A will be (10,000x500)
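Continuing the same illustrative sketch (X holds the placeholder image vectors as columns and m is the mean vector from step 1):

import numpy as np

X = np.random.rand(10000, 500)       # placeholder image vectors (columns)
m = X.mean(axis=1, keepdims=True)    # step 1: mean vector, shape (10000, 1)

# Step 2: subtract the mean vector from every column of X (via broadcasting)
A = X - m                            # mean adjusted matrix
print(A.shape)                       # (10000, 500)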

3. Compute the covariance matrix:

C = A^T \cdot A

Dimensions of C will be (500x500)

Where:
A is the mean adjusted matrix from step 2 (\cdot indicates matrix dot product)

Note that the covariance matrix represents the covariance of each sample in the data set with every other sample. In general, the covariance is computed between two random variables and describes their relationship with one another. If the covariance is positive, the two variables increase and decrease together. If the covariance is negative, the variables move in opposite directions. In short, the covariance matrix captures the relationship between each pair of samples in the dataset.
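In NumPy, this step is a single matrix product; as before, X is an illustrative placeholder for the stacked image vectors:

import numpy as np

X = np.random.rand(10000, 500)              # placeholder image vectors (columns)
A = X - X.mean(axis=1, keepdims=True)       # step 2: mean adjusted matrix

# Step 3: sample-by-sample covariance matrix -- a (500, 500) matrix
# rather than a (10000, 10000) one, which keeps the problem tractable
C = A.T @ A
print(C.shape)                              # (500, 500)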

4. Compute the Eigen vectors and Eigen values of the covariance matrix.

The Eigen values \lambda of the covariance matrix C can be solved for using the following equation:

\det(C - \lambda I) = 0

Where:
I is the (500x500) identity matrix

Solving the above equation for a 500x500 covariance matrix will result in 500 possible values for \lambda (Eigen values), \lambda_1, \lambda_2, \ldots, \lambda_{500}.

After computing the Eigen values, sort the Eigen values by magnitude. Keep the highest n Eigen values and discard the rest (where n is the number of desired reduced dimensions). In our example, we want to reduce down to 10 dimensions, so after sorting the 500 Eigen values we computed, we will only keep the highest 10.

Why does the magnitude of the Eigen values matter? The Eigen vectors associated with the selected Eigen values will be used to form the basis vectors. The larger the magnitude of an Eigen value, the more variance its corresponding basis vector will capture. Since we are reducing dimensionality we inevitably lose some information; however, by picking the basis vectors whose Eigen values have the largest magnitude (and therefore the most variance), we preserve as much of the original information as possible.

From these 10 Eigen values, we can compute an Eigen vector associated with each Eigen value. For a given Eigen value \lambda_i, the Eigen vector v_i can be computed by solving:

(C - \lambda_i I) v_i = 0

Where:
C is the covariance matrix and I is the identity matrix
Note v_i will have dimensions of (500x1)
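In practice the Eigen values and Eigen vectors are computed numerically rather than by solving the determinant equation by hand. A sketch using NumPy's symmetric eigen-solver (np.linalg.eigh), again on placeholder data:

import numpy as np

X = np.random.rand(10000, 500)              # placeholder image vectors (columns)
A = X - X.mean(axis=1, keepdims=True)       # mean adjusted matrix
C = A.T @ A                                 # (500, 500) covariance matrix

# Step 4: eigen decomposition. C is symmetric, so eigh applies; it returns
# the Eigen values in ascending order along with their Eigen vectors.
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Sort by magnitude (largest first) and keep only the top n = 10
n = 10
order = np.argsort(eigenvalues)[::-1][:n]
top_values = eigenvalues[order]             # the 10 largest Eigen values
V = eigenvectors[:, order]                  # corresponding Eigen vectors
print(V.shape)                              # (500, 10)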

5. Compute the basis vectors.

From the previous step we have 10 Eigen vectors v_1, v_2, \ldots, v_{10}, each with dimensions (500x1). We assemble these vectors into an Eigen vector matrix V:

V = [v_1 \; v_2 \; \cdots \; v_{10}]

Dimensions of V will be (500x10)

To compute the basis vectors, we multiply the mean adjusted matrix computed in step 2 by the Eigen
vector matrix:

B = A \cdot V

Where:

B is the basis vector matrix, dimensions (10,000x10)
A is the mean adjusted matrix, dimensions (10,000x500)
V is the Eigen vector matrix, dimensions (500x10)
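Continuing the sketch, the basis vector matrix is a single matrix product of the mean adjusted matrix and the Eigen vector matrix (placeholder data and variable names as before):

import numpy as np

X = np.random.rand(10000, 500)                            # placeholder image vectors
A = X - X.mean(axis=1, keepdims=True)                     # mean adjusted matrix
eigenvalues, eigenvectors = np.linalg.eigh(A.T @ A)       # step 4
V = eigenvectors[:, np.argsort(eigenvalues)[::-1][:10]]   # top 10 Eigen vectors

# Step 5: basis vectors -- map the Eigen vectors back into image space
B = A @ V
print(B.shape)                                            # (10000, 10)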

6. Represent each sample (i.e., image) as a linear combination of basis vectors.

Now each sample x_i can be represented as a linear combination of basis vectors using the following formula:

x_i \approx m + B \cdot w_i

Where:
x_i is the original image vector, dimensions (10,000x1)
m is the mean vector, dimensions (10,000x1)
B is the basis vector matrix, dimensions (10,000x10)
w_i is the vector of 10 weights for sample i, dimensions (10x1)
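A sketch of this final step on the same placeholder data. The weights here are found with least squares so that x_i \approx m + B \cdot w_i; this is one reasonable way to compute them, and if the columns of B are first normalized to unit length it reduces to the simple projection B^T (x_i - m):

import numpy as np

X = np.random.rand(10000, 500)                            # placeholder image vectors
m = X.mean(axis=1, keepdims=True)                         # mean vector
A = X - m                                                 # mean adjusted matrix
eigenvalues, eigenvectors = np.linalg.eigh(A.T @ A)
V = eigenvectors[:, np.argsort(eigenvalues)[::-1][:10]]
B = A @ V                                                 # basis vectors, (10000, 10)

# Step 6: the 10 weights for one image, chosen so that x_i is approximated
# as m + B @ w_i (least squares solution for the coefficients)
x_i = X[:, [0]]                                           # first image, (10000, 1)
w_i, *_ = np.linalg.lstsq(B, x_i - m, rcond=None)         # weights, shape (10, 1)

# Approximate reconstruction of the image from only 10 numbers
x_approx = m + B @ w_i
print(w_i.ravel())                                        # the 10-number representation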

Recall that we had images of faces (100 pixels x 100 pixels) that were represented as 10,000 dimensional points in space. Now that we have completed Principal Component Analysis, instead of each face being represented by 10,000 numbers, each face is represented by just 10 numbers. For a specific face, these 10 numbers are the weights of the linear combination of basis vectors used to recreate the image of that face.
