PCA Analysis_ R and interpretation (1)
How to determine the optimal number of principal components (PCs)
to retain in a PCA analysis?
● Kaiser Criterion: Only PCs with eigenvalues greater than 1
should be retained.
● Inertia Criterion: focuses on the percentage of total variance
explained by each PC. You can choose a threshold (e.g., 80%) and
retain the PCs that cumulatively explain at least that much
variance.
● Elbow method:
plot(r$eig[,1],type="b",xlab="Dimensions",ylab="Eigenvalues",
main="Scree plot")
3. Interpretation:
● Variable: Well-projected variables contribute significantly to PCs
and are accurately represented on the map
● Attribute: Well-projected, accurately represented by the
principal axes.
Perform PCA:
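As a rough illustration of running a PCA and applying the Kaiser and inertia criteria, here is a minimal sketch. It assumes the built-in iris dataset and uses base R's prcomp() rather than whatever call produced the r object in the scree plot command (r$eig looks like FactoMineR output, but that is an assumption). The names pca, eigenvalues, n_kaiser, and n_inertia are illustrative only.

```r
# Sketch: PCA on the four numeric iris columns (assumed dataset).
# scale. = TRUE standardizes the variables, which is what makes the
# "eigenvalue > 1" rule meaningful.
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Eigenvalues are the variances of the PCs (squared standard deviations).
eigenvalues <- pca$sdev^2

# Kaiser criterion: count the PCs whose eigenvalue exceeds 1.
n_kaiser <- sum(eigenvalues > 1)

# Inertia criterion: smallest number of PCs whose cumulative share of
# variance reaches the chosen threshold (80% here, as in the notes).
threshold <- 0.80
cum_var <- cumsum(eigenvalues) / sum(eigenvalues)
n_inertia <- which(cum_var >= threshold)[1]

print(round(cum_var, 3))
cat("Kaiser keeps", n_kaiser, "PC(s); the 80% rule keeps", n_inertia, "PC(s)\n")
```

The two criteria need not agree, so it can help to compare both with the scree plot before settling on a number of PCs.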
Other commands:
Dealing with R:
● Row binding: rbind()
● Column binding: cbind()
● Extracting rows and columns:
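The binding and extraction operations listed above can be sketched as follows. The vectors a and b and the object names m_rows and m_cols are made up for illustration; the original screenshots are not available.

```r
# Two small example vectors (hypothetical data).
a <- c(1, 2, 3)
b <- c(4, 5, 6)

# Row binding: each vector becomes a row (result is 2 x 3).
m_rows <- rbind(a, b)

# Column binding: each vector becomes a column (result is 3 x 2).
m_cols <- cbind(a, b)

# Extracting rows and columns with [row, column] indexing.
first_row  <- m_rows[1, ]   # whole first row
second_col <- m_rows[, 2]   # whole second column
one_value  <- m_rows[2, 3]  # single element: row 2, column 3

print(m_rows)
print(m_cols)
```

Leaving the row or column position empty inside the brackets selects everything along that dimension.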
Matrix things:
● Vectors into Matrix:
● Extracting specific rows and columns using vector indices
(e.g., rows 1, 3, 5 and columns 3, 4)
● Deleting:
● Selecting specific columns from the iris dataset (famous dataset)
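A minimal sketch of the matrix operations and the column selection listed above. The 5 x 4 matrix, the row and column indices (1, 3, 5 and 3, 4, an assumed reading of the indices mentioned above), and the columns chosen for iris2 are all illustrative assumptions; the notes do not show how iris2 was actually built.

```r
# Vectors into a matrix: matrix() fills column by column by default.
v <- 1:20
m <- matrix(v, nrow = 5, ncol = 4)

# Extracting specific rows and columns using vector indices.
sub <- m[c(1, 3, 5), c(3, 4)]

# Deleting: negative indices drop rows or columns (here row 2 and column 1).
m_drop <- m[-2, -1]

# Selecting specific columns from the built-in iris dataset.
# The choice of columns is hypothetical.
iris2 <- iris[, c("Sepal.Length", "Petal.Length", "Species")]

print(sub)
print(dim(m_drop))
print(head(iris2))
```

Negative indexing returns a copy without the dropped rows or columns; the original m is unchanged unless the result is assigned back to it.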
● Numeric Variables: Store numerical data.
● What is the difference between a matrix and a dataframe?
Matrix:
- All components must be of the same data type.
- Typically used for numeric or homogeneous data.
- More restrictive and suited for mathematical operations.
Dataframe:
- Columns can have different data types (e.g., numeric, character,
factor, logical).
NB: factor is a data type used to represent categorical variables.
- More flexible and versatile for diverse datasets.
- Commonly used for data manipulation, analysis, and
visualization.
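To make the matrix versus dataframe difference concrete, here is a small sketch with made-up values. The object names m and df and the column names x, label, and flag are illustrative.

```r
# A matrix holds a single data type: mixing numbers and text
# coerces every element to character.
m <- cbind(c(1, 2, 3), c("a", "b", "c"))
print(typeof(m))

# A dataframe keeps a separate type for each column.
df <- data.frame(
  x     = c(1, 2, 3),                # numeric
  label = factor(c("a", "b", "a")),  # factor (categorical)
  flag  = c(TRUE, FALSE, TRUE)       # logical
)
print(sapply(df, class))
```

Once the numbers in m have been coerced to character, arithmetic on them fails, which is one reason mixed data is usually kept in a dataframe.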
● Summary function: summary()
#y,d
● class(iris2)
● The str() function gives a compact overview of the iris2
dataframe: its dimensions, the data type of each column, and the
first few values of each column.
str(iris2)
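The three inspection functions can be tried together as below. Since the notes do not show how iris2 was created, the sketch rebuilds it from a hypothetical choice of iris columns.

```r
# Hypothetical reconstruction of iris2: a subset of iris columns.
iris2 <- iris[, c("Sepal.Length", "Petal.Length", "Species")]

# class() reports the kind of object; a column subset of iris is
# still a data.frame.
cls <- class(iris2)
print(cls)

# summary() gives per-column statistics: quartiles and mean for
# numeric columns, counts per level for factors.
print(summary(iris2))

# str() shows the dimensions, each column's type, and its first values.
str(iris2)
```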