
PCA Analysis: R and interpretation


Data Analysis: Chapter 3: What to get!

● Interpretation of variable factor maps:

● Interpretation of the individual factors map

How to determine the optimal number of principal components (PCs) to retain in a PCA analysis?

● Kaiser Criterion: Only PCs with eigenvalues greater than 1 should be retained (this applies to a PCA on standardized variables).
● Inertia Criterion: focuses on the percentage of total variance explained by each PC. You can choose a threshold (e.g., 80%) and retain the PCs that cumulatively explain at least that much variance.
● Elbow method: plot the eigenvalues against the dimensions and retain the PCs that come before the point where the curve flattens (the "elbow"):

plot(r$eig[,1],type="b",xlab="Dimensions",ylab="Eigenvalues",main="Scree plot")

How to evaluate the projection quality (variable or individual)?

1. Find the cos²: For each variable, find the squared cosine of its angle with each principal axis on the factor map.
2. Apply threshold: Identify variables with cos² ≥ 0.5 (threshold) for at least one PC. These are well-projected.
3. Interpretation:
● Variable: Well-projected variables contribute significantly to PCs and are accurately represented on the map.
● Attribute: Well-projected, accurately represented by the principal axes.

Perform PCA:

Factor Map:
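The retention criteria and the cos² check can be sketched in base R as follows. This is only an illustration under stated assumptions: the `r$eig` call in these notes suggests a FactoMineR-style result object, whereas this sketch swaps in base R's `prcomp()` on the built-in `iris` measurements and computes the eigenvalues and cos² by hand. All object names (`pca`, `eig`, `var_cos2`, ...) are hypothetical.

```r
# Assumption: standardized PCA on the four numeric iris columns,
# using base R's prcomp() instead of the object `r` used in the notes.
X <- iris[, 1:4]
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Eigenvalues and percentage of total variance (inertia) per PC
eig <- pca$sdev^2
pct <- 100 * eig / sum(eig)
cum_pct <- cumsum(pct)

# Kaiser criterion: count the PCs whose eigenvalue exceeds 1
k_kaiser <- sum(eig > 1)

# Inertia criterion: first PC at which cumulative variance reaches 80%
k_inertia <- which(cum_pct >= 80)[1]

# Variable cos2: squared correlation between each variable and each PC
var_coord <- sweep(pca$rotation, 2, pca$sdev, `*`)
var_cos2 <- var_coord^2

# Individual cos2: squared coordinate divided by squared distance to the origin
ind_cos2 <- pca$x^2 / rowSums(pca$x^2)

# Variables with cos2 >= 0.5 on at least one PC are treated as well projected
well_projected <- rownames(var_cos2)[apply(var_cos2 >= 0.5, 1, any)]

print(round(cbind(eigenvalue = eig, pct = pct, cum_pct = cum_pct), 2))
print(well_projected)
```

For each variable (and each individual) the cos² values sum to 1 across all PCs, so a cos² close to 1 on one axis means the point is represented almost entirely by that axis.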

Other commands:

Dealing with R:
● Row binding: rbind()

● Column binding: cbind()
● Extracting rows and columns:
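A small sketch of binding and extraction, assuming two made-up vectors `a` and `b` (hypothetical names, not taken from the notes):

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

# Row binding: each vector becomes a row (2 rows, 3 columns)
by_row <- rbind(a, b)

# Column binding: each vector becomes a column (3 rows, 2 columns)
by_col <- cbind(a, b)

# Extraction uses [row, column]; leaving one side empty keeps all of it
second_row <- by_row[2, ]     # whole second row
column_b   <- by_col[, "b"]   # whole column named "b"
one_cell   <- by_row[1, 3]    # row 1, column 3

print(by_row)
print(by_col)
```

Note that R is case-sensitive, so the functions must be written `rbind` and `cbind` in lower case.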

Matrix things:
● Vectors into Matrix:

● Extracting specific rows and columns using vector indices (e.g. rows 1, 3, 5 and columns 3, 4)

● Deleting:
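The matrix operations listed above might look like this. The matrix `m` and its size are assumptions chosen for illustration only.

```r
# Vectors into a matrix: matrix() fills column by column by default
m <- matrix(1:20, nrow = 5, ncol = 4)

# Extract specific rows and columns with index vectors
sub <- m[c(1, 3, 5), c(3, 4)]

# Deleting: a negative index drops the corresponding row or column
no_row2 <- m[-2, ]   # every row except the second
no_col1 <- m[, -1]   # every column except the first

print(m)
print(sub)
```

Negative indexing returns a copy without the dropped rows or columns; `m` itself is unchanged unless the result is assigned back to it.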

● Selecting specific columns from the iris dataset (famous dataset)
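As a sketch, columns of the built-in `iris` data frame can be selected by name or by position. The particular columns kept in `iris2` here are an assumption; the notes do not show how `iris2` was actually built.

```r
# Select columns by name (hypothetical choice of columns)
iris2 <- iris[, c("Sepal.Length", "Petal.Length", "Species")]

# Select columns by position: the four numeric measurements
iris_num <- iris[, 1:4]

print(head(iris2))
print(names(iris_num))
```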

Variable types in data analysis and programming include:

● Logical Variables: Store binary conditions (TRUE or FALSE).

● String Variables: Store text data.


If you want to randomize the order of your data to avoid a biased
sample, you can use the sample() function to shuffle the data.
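A minimal sketch of shuffling rows with `sample()`, assuming the built-in `iris` data frame; the seed value is arbitrary and only makes the shuffle reproducible.

```r
set.seed(42)  # arbitrary seed so the same order is produced on each run

# sample(n) returns a random permutation of 1..n, used here as row indices
shuffled <- iris[sample(nrow(iris)), ]

print(head(shuffled))
```

Every row is kept exactly once; only the order changes.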

● Numeric Variables: Store numerical data.

● What is the difference between a matrix and a dataframe?

Matrix:
- All components must be of the same data type.
- Typically used for numeric or homogeneous data.
- More restrictive and suited for mathematical operations.

Dataframe:
- Columns can have different data types (e.g., numeric, character, factor, logical). NB: factor is a data type used to represent categorical variables.
- More flexible and versatile for diverse datasets.
- Commonly used for data manipulation, analysis, and visualization.
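The difference can be seen with a tiny made-up example (all names and values here are hypothetical): mixing numbers and text in a matrix coerces everything to one type, while a data frame keeps one type per column.

```r
# Matrix: mixing integers and text forces every cell to character
m <- cbind(id = 1:3, name = c("a", "b", "c"))

# Data frame: each column keeps its own type
df <- data.frame(
  id    = 1:3,
  name  = c("a", "b", "c"),
  group = factor(c("x", "y", "x")),   # categorical variable
  ok    = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

print(typeof(m))          # storage type shared by all cells of the matrix
print(sapply(df, class))  # one class per data frame column
```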
● Summary function


● class(iris2)

● The str() function will provide a compact overview of the iris2
dataframe: the number of observations and variables, the data type
of each column, and the first few values of each column.
str(iris2)
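The three inspection functions can be tried together as below. How `iris2` was created is not shown in these notes, so the definition here is an assumption made only so the sketch runs on its own.

```r
# Hypothetical definition of iris2: a subset of the built-in iris data
iris2 <- iris[, c("Sepal.Length", "Species")]

# summary(): min/quartiles/mean/max for numeric columns, counts for factors
print(summary(iris2))

# class(): the kind of object
print(class(iris2))

# str(): dimensions, column types and the first few values
str(iris2)
```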
