Programming R - 3

Uploaded by

hafsulli123
Copyright
© All Rights Reserved
Vectors

 The c() function can be used to create vectors of objects by concatenating
things together.
 The vector() function creates a vector of a given mode and length. For example,
vector(mode = "numeric", length = 10) produces a vector of ten zeros. With
mode = "logical" or mode = "character", the elements are initialized to FALSE
or to the empty string "", respectively.
 When different objects are mixed in a vector, coercion occurs so that every
element in the vector is of the same class.
 Combining a numeric object with a character object creates a
character vector, because numbers can usually be easily represented as
strings.
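The points above can be sketched in a few lines of R. This is only an illustrative example; the variable names are made up for this sketch and do not come from the slides.

```r
# Build vectors by concatenation with c()
num <- c(0.5, 0.6)          # numeric
lgl <- c(TRUE, FALSE)       # logical
chr <- c("a", "b", "c")     # character

# Pre-allocate vectors of a given mode and length with vector()
zeros  <- vector(mode = "numeric", length = 10)    # ten zeros
falses <- vector(mode = "logical", length = 3)     # FALSE FALSE FALSE
blanks <- vector(mode = "character", length = 3)   # "" "" ""

# Mixing classes triggers implicit coercion to a single common class
mixed_chr <- c(1.7, "a")    # the number becomes the string "1.7"
mixed_num <- c(TRUE, 2)     # TRUE becomes the number 1
class(mixed_chr)            # "character"
class(mixed_num)            # "numeric"
```

Checking class() after mixing objects is a quick way to see which class coercion has settled on.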
Matrix
 Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length
2 (number of rows, number of columns).
 Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and
running down the columns.
> m <- matrix(1:6, nrow = 2, ncol = 3)
> m
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Matrices can also be created directly from vectors by adding a dimension attribute.
> m <- 1:10
> m
 [1]  1  2  3  4  5  6  7  8  9 10
> dim(m) <- c(2, 5)
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
 Check the dimensions of m using dim(m) or attributes(m).
 Examples of R object attributes:
   • names, dimnames
   • dimensions (e.g. matrices, arrays)
   • class (e.g. integer, numeric)
   • length
   • other user-defined attributes/metadata
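As a rough illustration of the attributes listed above, the following sketch inspects and sets a few of them. The "units" attribute is a hypothetical user-defined attribute chosen only for this example.

```r
# A vector becomes a matrix once it carries a dim attribute
m <- 1:10
dim(m) <- c(2, 5)
dim(m)            # 2 5
attributes(m)     # a list holding the dim attribute

# names is an attribute as well
x <- c(a = 1, b = 2)
names(x)          # "a" "b"

# attr() reads or sets one attribute by name;
# "units" is a made-up attribute for illustration
attr(x, "units") <- "cm"
attributes(x)     # names and units
```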
 Matrices can be created by column-binding or row-binding with the cbind() and rbind() functions.
> x <- 1:3
> y <- 10:12
> cbind(x, y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x, y)
  [,1] [,2] [,3]
x    1    2    3
y   10   11   12
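 Column names and row names can be set separately using the colnames() and
rownames() functions. The number of names must match the corresponding dimension,
so a 2 × 2 matrix is used here.
> m <- matrix(1:4, nrow = 2, ncol = 2)
> colnames(m) <- c("h", "f")
> rownames(m) <- c("x", "z")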

Lists
 Lists are a special type of vector that can contain elements of different
classes.
 Lists can be explicitly created using the list() function, which takes an
arbitrary number of arguments.
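A minimal sketch of list creation follows; the object names are illustrative only.

```r
# A list can hold elements of different classes without coercion
x <- list(1, "a", TRUE, 1 + 4i)
sapply(x, class)   # "numeric" "character" "logical" "complex"

# vector() can also pre-allocate an empty list of a given length
empty <- vector("list", length = 3)
length(empty)      # 3; every element starts out as NULL
```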

Factors
 Factors are used to represent categorical data and can be unordered or
ordered.
 One can think of a factor as an integer vector where each integer has a label.
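The "integer vector with labels" view can be checked directly. The sketch below uses made-up data and shows one way to create unordered and ordered factors with factor().

```r
# Unordered factor; levels default to sorted order of the unique values
x <- factor(c("yes", "yes", "no", "yes", "no"))
levels(x)        # "no" "yes"
as.integer(x)    # 2 2 1 2 1 (the underlying integer codes)
table(x)         # counts per level

# The level order can be set explicitly with the levels argument
x2 <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"))
levels(x2)       # "yes" "no"

# Ordered factor: levels have a rank, so comparisons are meaningful
sizes <- factor(c("low", "high", "medium"),
                levels = c("low", "medium", "high"), ordered = TRUE)
sizes[1] < sizes[2]   # TRUE, since low < high
```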
Data Frames
 Data frames are used to store tabular data in R. They are an important type
of object in R and are used in a variety of statistical modeling applications.
 Data frames are represented as a special type of list where every element of
the list has to have the same length. Each element of the list can be thought
of as a column and the length of each element of the list is the number of
rows.
 Unlike matrices, data frames can store different classes of objects in each
column. Matrices must have every element be the same class (e.g. all
integers or all numeric).
 Data frames can be converted to a matrix by calling data.matrix(). While it
might seem that the as.matrix() function should be used to coerce a data
frame to a matrix, almost always, what you want is the result of
data.matrix().
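The following sketch, using a small made-up data frame, shows the list-of-columns structure and the difference between as.matrix() and data.matrix(). stringsAsFactors = FALSE is set explicitly so the example behaves the same across R versions.

```r
# Each column may have its own class, but all columns share one length
df <- data.frame(id = 1:3,
                 grp = c("a", "b", "a"),
                 score = c(2.5, 3.0, 4.5),
                 stringsAsFactors = FALSE)
nrow(df)            # 3
ncol(df)            # 3
is.list(df)         # TRUE: a data frame is a special kind of list
sapply(df, length)  # every column has length 3

# as.matrix() falls back to a character matrix when a column is character
as_chr <- as.matrix(df)
is.character(as_chr)   # TRUE

# data.matrix() forces a numeric matrix; character columns are first
# converted to factors and then to their integer codes
as_num <- data.matrix(df)
is.numeric(as_num)     # TRUE
```

When the goal is a numeric matrix for modeling, data.matrix() avoids the silent conversion of every value to a string.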
Subsetting R Objects
There are three operators that can be used to extract subsets of R objects.
 The [ operator always returns an object of the same class as the original. It
can be used to select multiple elements of an object.
 The [[ operator is used to extract elements of a list or a data frame. It can
only be used to extract a single element and the class of the returned object
will not necessarily be a list or data frame.
 The $ operator is used to extract elements of a list or data frame by literal
name. Its semantics are similar to those of [[.
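As a quick illustration of the three operators on a data frame (an object type not covered by the later examples), consider the sketch below. The data frame and its column names are invented for this example.

```r
df <- data.frame(foo = 1:4,
                 bar = c("a", "b", "c", "d"),
                 stringsAsFactors = FALSE)

# [ with a single index keeps the class of the original: a data frame
class(df["foo"])     # "data.frame"

# [[ extracts the single element itself: here an integer vector
class(df[["foo"]])   # "integer"

# $ extracts by literal name, with a result like that of [[
class(df$foo)        # "integer"
```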
 Vectors are basic objects in R and they can be subsetted using the [ operator.
> x <- c("a", "b", "c", "c", "d", "a")
> x[1] ## Extract the first element
[1] "a"
> x[2] ## Extract the second element
[1] "b"
 The [ operator can be used to extract multiple elements of a vector by passing the
operator an integer sequence. Here we extract the first four elements of the
vector.
> x[1:4]
[1] "a" "b" "c" "c"
 The sequence does not have to be in order; you can specify any arbitrary integer
vector.
> x[c(1, 3, 4)]
[1] "a" "c" "c"
 We can also pass a logical sequence to the [ operator to extract elements of a
vector that satisfy a given condition.
 For example, here we want the elements of x that come lexicographically after
the letter “a”.
> u <- x > "a"
> u
[1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE
> x[u]
[1] "b" "c" "c" "d"
 A more compact way to do this is to skip the creation of a logical vector
and subset the vector directly with the logical expression.
> x[x > "a"]
[1] "b" "c" "c" "d"
 Subsetting a Matrix: Matrices can be subsetted in the usual way with (i,j) type
indices. Here, we create a simple 2 × 3 matrix with the matrix function.
> x <- matrix(1:6, 2, 3)
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
 We can access the (1, 2) or the (2, 1) element of this matrix using the appropriate
indices.
> x[1, 2]
[1] 3
> x[2, 1]
[1] 2
 Indices can also be missing. This behavior is used to access entire rows or columns
of a matrix.
> x[1, ] ## Extract the first row
[1] 1 3 5
> x[, 2] ## Extract the second column
[1] 3 4
Dropping matrix dimensions
 By default, when a single element of a matrix is retrieved, it is returned as a
vector of length 1 rather than a 1 × 1 matrix. Often, this is exactly what we want,
but this behavior can be turned off by setting drop = FALSE.
> x <- matrix(1:6, 2, 3)
> x[1, 2]
[1] 3
> x[1, 2, drop = FALSE]
     [,1]
[1,]    3
 Similarly, when we extract a single row or column of a matrix, R by default drops the dimension of
length 1, so instead of getting a 1 × 3 matrix after extracting the first row, we get a vector of length
3. This behavior can similarly be turned off with the drop = FALSE option.
> x <- matrix(1:6, 2, 3)
> x[1, ]
[1] 1 3 5
> x[1, , drop = FALSE]
     [,1] [,2] [,3]
[1,]    1    3    5
 Subsetting Lists: Lists in R can be subsetted using all three of the operators mentioned earlier.
> x <- list(foo = 1:4, bar = 0.6)
> x
$foo
[1] 1 2 3 4

$bar
[1] 0.6

 The [[ operator can be used to extract single elements from a list. Here we
extract the first element of the list.
> x[[1]]
[1] 1 2 3 4
 The [[ operator can also use named indices so that you don’t have to
remember the exact ordering of every element of the list. You can also use
the $ operator to extract elements by name.
> x[["bar"]]
[1] 0.6
> x$bar
[1] 0.6
 Notice you don’t need the quotes when you use the $ operator.
 One thing that differentiates the [[ operator from the $ is that the [[ operator
can be used with computed indices. The $ operator can only be used with
literal names.
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> name <- "foo"
>
> ## computed index for "foo"
> x[[name]]
[1] 1 2 3 4
>
> ## element "name" doesn’t exist! (but no error here)
> x$name
NULL
>
> ## element "foo" does exist
> x$foo
[1] 1 2 3 4
 The [[ operator can take an integer sequence if you want to extract a nested
element of a list.
> x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))
>
> ## Get the 3rd element of the 1st element
> x[[c(1, 3)]]
[1] 14
>
> ## Same as above
> x[[1]][[3]]
[1] 14
>
> ## 1st element of the 2nd element
> x[[c(2, 1)]]
[1] 3.14
 The [ operator can be used to extract multiple elements from a list. For
example, to extract the first and third elements of a list, you
would do the following:
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> x[c(1, 3)]
$foo
[1] 1 2 3 4

$baz
[1] "hello"

 Note that x[c(1, 3)] is NOT the same as x[[c(1, 3)]].
 Remember that the [ operator always returns an object of the same class as
the original. Since the original object was a list, the [ operator returns a list.
In the above code, we returned a list with two elements (the first and the
third).
Reading datasets
 read.table, read.csv, for reading tabular data
 readLines, for reading lines of a text file
 source, for reading in R code files (inverse of dump)
 dget, for reading in R code files (inverse of dput)
 load, for reading in saved workspaces
 unserialize, for reading single R objects in binary form
Writing datasets

There are analogous functions for writing data to files.
 write.table, for writing tabular data to text files (e.g. CSV) or connections
 writeLines, for writing character data line-by-line to a file or connection
 dump, for dumping a textual representation of multiple R objects
 dput, for outputting a textual representation of an R object
 save, for saving an arbitrary number of R objects in binary format (possibly
compressed) to a file.
 serialize, for converting an R object into a binary format for outputting to a
connection (or file).
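The read/write pairs above can be exercised with a round trip through temporary files. This is a sketch under a few assumptions: the data frame and file paths are made up, write.csv() is used as the CSV convenience wrapper around write.table(), and only some of the listed pairs are shown.

```r
df <- data.frame(id = 1:3, name = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Tabular text: write.csv() / read.csv()
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)
df_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# Plain lines of text: writeLines() / readLines()
txt_path <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), txt_path)
lines_in <- readLines(txt_path)

# Textual representation of a single object: dput() / dget()
dput_path <- tempfile(fileext = ".R")
dput(df, file = dput_path)
df_dput <- dget(dput_path)

# Binary format for one or more named objects: save() / load()
rda_path <- tempfile(fileext = ".rda")
save(df, file = rda_path)
env <- new.env()             # load into a separate environment
load(rda_path, envir = env)  # restores an object called df inside env

# Binary form of a single object in memory: serialize() / unserialize()
raw_bytes <- serialize(df, connection = NULL)
df_ser <- unserialize(raw_bytes)
```

Loading into a separate environment avoids overwriting objects of the same name in the current workspace.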
