Matrix: a collection of elements of the same data type (numeric, character, or logical) arranged
into a fixed number of rows and columns.
Construct a matrix in R with the matrix() function:
matrix(ARG1, byrow = .., nrow = ..)
ARG1: collection of elements (a vector) to arrange into the matrix
byrow: TRUE to fill the matrix row by row, FALSE (the default) to fill it column by column
nrow: number of rows
examples:
matrix(1:14, byrow = TRUE, nrow = 2)
matrix(c("a","b","c","d","e","f"), byrow = TRUE, nrow = 3)
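To make the effect of byrow concrete, here is a minimal sketch that fills the same six values both ways. The names by_row and by_col are illustrative only.

```r
# Fill the same six values row by row, then column by column.
by_row <- matrix(1:6, byrow = TRUE, nrow = 2)
by_col <- matrix(1:6, byrow = FALSE, nrow = 2)

print(by_row)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

print(by_col)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# The number of columns is derived from the data length and nrow.
dim(by_row)  # 2 3
```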
Add names for the rows and the columns of a matrix:
rownames(my_matrix) <- row_names_vector
colnames(my_matrix) <- col_names_vector
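Alternatively, set the names while creating the matrix with the dimnames argument (rows and
columns are vectors holding the row names and the column names):
matrix(c("a","b","c","d","e","f"), byrow = TRUE, nrow = 3, dimnames = list(rows, columns))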
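The two naming approaches should give the same result. The sketch below checks this with hypothetical names ("r1", "left", and so on) that were made up for the illustration.

```r
# Hypothetical names, chosen only for this illustration.
row_names_vector <- c("r1", "r2", "r3")
col_names_vector <- c("left", "right")

# Approach 1: create the matrix first, then attach the names.
my_matrix <- matrix(c("a", "b", "c", "d", "e", "f"), byrow = TRUE, nrow = 3)
rownames(my_matrix) <- row_names_vector
colnames(my_matrix) <- col_names_vector

# Approach 2: pass the names at creation time through dimnames.
same_matrix <- matrix(c("a", "b", "c", "d", "e", "f"), byrow = TRUE, nrow = 3,
                      dimnames = list(row_names_vector, col_names_vector))

# Named rows and columns can then be used for selection.
my_matrix["r2", "right"]  # "d"
```

Note that dimnames expects a list (lowercase list()), with the row names first and the column names second.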
The function rowSums() calculates the total for each row of a matrix and returns the totals as a
new vector; colSums() does the same for each column:
rowSums(my_matrix)
colSums(my_matrix)
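As a small worked example, assuming a 2 x 3 numeric matrix built just for this purpose:

```r
# Rows are (1, 2, 3) and (4, 5, 6).
my_matrix <- matrix(1:6, byrow = TRUE, nrow = 2)

row_totals <- rowSums(my_matrix)  # 6 15    (one total per row)
col_totals <- colSums(my_matrix)  # 5 7 9   (one total per column)

# Both results are plain vectors, not matrices.
is.null(dim(row_totals))  # TRUE
```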
Add a column or multiple columns to a matrix with the cbind() function, which merges matrices
and/or vectors together by column. For example:
big_matrix <- cbind(matrix1, matrix2, vector1 ...)
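Add a row or multiple rows with the rbind() function, which merges matrices and/or vectors
together by row (the inputs must have the same number of columns):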
new_matrix <- rbind(matrix1, matrix2)
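The difference between the two binding functions is easiest to see in the resulting dimensions. A minimal sketch, with small made-up inputs standing in for matrix1, matrix2 and vector1:

```r
matrix1 <- matrix(1:4, nrow = 2)  # 2 x 2
matrix2 <- matrix(5:8, nrow = 2)  # 2 x 2
vector1 <- c(9, 10)               # length 2, becomes one extra column

# cbind() places the inputs side by side: row count stays, columns add up.
big_matrix <- cbind(matrix1, matrix2, vector1)
dim(big_matrix)  # 2 5

# rbind() stacks the inputs: column count stays, rows add up.
new_matrix <- rbind(matrix1, matrix2)
dim(new_matrix)  # 4 2
```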
Selection of matrix elements
Similar to vectors, you can use the square brackets [ ] to select one or multiple elements from a
matrix. Whereas vectors have one dimension, matrices have two dimensions. You should
therefore use a comma to separate the rows you want to select from the columns. For example:
my_matrix[1,2] selects the element at the first row and second column.
my_matrix[1:3,2:4] results in a matrix with the data on the rows 1, 2, 3 and columns 2,
3, 4.
If you want to select all elements of a row or a column, no number is needed before or after the
comma, respectively:
my_matrix[,1] selects all elements of the first column.
my_matrix[1,] selects all elements of the first row.
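The selection rules above can be tried on a small matrix. This sketch assumes a 3 x 4 matrix filled with the numbers 1 to 12, created only for illustration:

```r
# Rows are (1 2 3 4), (5 6 7 8) and (9 10 11 12).
my_matrix <- matrix(1:12, byrow = TRUE, nrow = 3)

my_matrix[1, 2]       # 2, the element at row 1, column 2
my_matrix[1:3, 2:4]   # a 3 x 3 matrix: rows 1-3, columns 2-4
my_matrix[, 1]        # 1 5 9, all elements of the first column
my_matrix[1, ]        # 1 2 3 4, all elements of the first row
```

Selecting a single row or column returns a plain vector rather than a one-row or one-column matrix.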
Calculate means
mean(my_variable)
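mean() works on a plain vector and also on a selection from a matrix. A short sketch with made-up values:

```r
my_variable <- c(2, 4, 9)
mean(my_variable)  # 5

# Combined with matrix selection: the mean of one column, or of all elements.
my_matrix <- matrix(1:6, byrow = TRUE, nrow = 2)
mean(my_matrix[, 1])  # 2.5, the mean of 1 and 4
mean(my_matrix)       # 3.5, the mean of all six elements
```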
Factors:
A factor is a statistical data type used to store categorical variables. Create one with factor():
factor(my_vector)
There are two types of categorical variables: a nominal categorical variable and an ordinal
categorical variable.
A nominal variable is a categorical variable without an implied order. This means that it is
impossible to say that 'one is worth more than the other'. For example, think of the categorical
variable animals_vector with the categories "Elephant", "Giraffe", "Donkey" and "Horse".
Here, it is impossible to say that one stands above or below the other. (Note that some of you
might disagree ;-) ).
Example:
animals_vector <- c("Elephant", "Giraffe", "Donkey", "Horse")
factor_animals_vector <- factor(animals_vector)
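To see what factor() did with the nominal variable, one can inspect the result. A small sketch using only base R functions:

```r
animals_vector <- c("Elephant", "Giraffe", "Donkey", "Horse")
factor_animals_vector <- factor(animals_vector)

# Levels are the distinct categories, sorted alphabetically by default.
levels(factor_animals_vector)      # "Donkey" "Elephant" "Giraffe" "Horse"

# A nominal factor carries no order between its levels.
is.ordered(factor_animals_vector)  # FALSE

# Internally, each element is stored as the index of its level.
as.integer(factor_animals_vector)  # 2 3 1 4
```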
In contrast, ordinal variables do have a natural ordering. Consider for example the categorical
variable temperature_vector with the categories: "Low", "Medium" and "High". Here it is
obvious that "Medium" stands above "Low", and "High" stands above "Medium".
Example:
temperature_vector <- c("High", "Low", "High","Low", "Medium")
factor_temperature_vector <- factor(temperature_vector, ordered = TRUE,
                                    levels = c("Low", "Medium", "High"))
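Because the factor is ordered, its elements can be compared. The sketch below spells the argument out as ordered = TRUE, which is the full name of the factor() parameter:

```r
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
factor_temperature_vector <- factor(temperature_vector, ordered = TRUE,
                                    levels = c("Low", "Medium", "High"))

is.ordered(factor_temperature_vector)  # TRUE

# Comparisons follow the order given in levels, not alphabetical order.
factor_temperature_vector[1] > factor_temperature_vector[2]  # TRUE ("High" > "Low")
factor_temperature_vector[5] < factor_temperature_vector[1]  # TRUE ("Medium" < "High")

# The highest level that occurs in the data.
as.character(max(factor_temperature_vector))  # "High"
```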
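Renaming levels within a factor (levels are sorted alphabetically by default, so "F" comes before
"M"; give the new names in that same order):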
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)
levels(factor_survey_vector) <- c("Female", "Male")
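The order of the new names matters, so it can help to check the levels before and after renaming. A minimal sketch:

```r
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey_vector <- factor(survey_vector)

# Check the current order first: alphabetical, so "F" then "M".
levels(factor_survey_vector)  # "F" "M"

# New names are matched by position: "F" -> "Female", "M" -> "Male".
levels(factor_survey_vector) <- c("Female", "Male")

as.character(factor_survey_vector)
# "Male" "Female" "Female" "Male" "Male"
```

Writing c("Male", "Female") instead would silently swap the labels.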
To count the elements in each level (category) of a factor (categorical variable):
summary(my_factor)
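For instance, applied to a survey factor with made-up data, summary() returns one count per level:

```r
survey_vector <- c("M", "F", "F", "M", "M")
my_factor <- factor(survey_vector)
levels(my_factor) <- c("Female", "Male")

counts <- summary(my_factor)
print(counts)
# Female   Male
#      2      3

# On the raw character vector, summary() reports only length and type,
# which is why converting to a factor first is useful.
summary(survey_vector)
```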