R Programming Swirl
> swirl()
> install.packages("swirl")
--- Please select a CRAN mirror for use in this session ---
downloaded 341 KB
package ‘swirl’ successfully unpacked and MD5 sums checked
C:\Users\Paras Dhama\AppData\Local\Temp\RtmpOquy0B\downloaded_packages
> packageVersion("swirl")
[1] ‘2.4.4’
> library(swirl)
> swirl()
| Welcome to swirl! Please sign in. If you've been here before, use the same
| name as you did then. If you are new, call yourself something unique.
| begin our first lesson. First of all, you should know that when you see
| '...', that means you should press Enter when you are done reading and ready
| to continue.
| Also, when you see 'ANSWER:', the R prompt (>), or when you are asked to
| select from a list, that means it's your turn to enter a response, then press
| Enter to continue.
1: Continue.
2: Proceed.
Selection: 1
| You can exit swirl and return to the R prompt (>) at any time by pressing the
| Esc key. If you are already at the prompt, type bye() to exit and save your
| progress. When you exit properly, you'll see a short message letting you know
| -- Typing play() lets you experiment with R on your own; swirl will ignore
...
| To begin, you must install a course. I can install a course for you from the
| and directions for installing courses yourself. (If you are not connected to
Selection: 1
|
|======================================================================| 100%
1: R Programming
2: Take me to the swirl course repository!
Selection: 1
Selection: 2
|
| | 0%
| In this lesson, you'll learn how to examine your local workspace in R and
| begin to explore the relationship between your workspace and the file system
| of your machine.
...
|
|== | 3%
| to things like file paths, the outputs of these commands may vary across
| machines.
...
|
|==== | 5%
| However it's important to note that R provides a common API (a common set of
| commands) for interacting with files, that way your code will work across
...
|
|===== | 8%
| Let's jump right in so you can get a feel for how these special functions
| work!
...
|
|======= | 10%
> getwd()
| Great job!
|
|========= | 13%
> ls()
character(0)
|
|=========== | 15%
| Mac. Both Linux and Mac operating systems are based on an operating system
| called Unix. It's always a good idea to learn more about Unix!
...
|
|============= | 18%
> x <- 9
|
|============== | 21%
| Now take a look at objects that are in your workspace using ls().
> ls()
[1] "x"
|
|================ | 23%
| List all the files in your working directory using list.files() or dir().
> list.files()
|
|================== | 26%
| As we go through this lesson, you should be examining the help page for each
| new function. Check out the help page for list.files with the command
| ?list.files.
> ?list.files
| Excellent job!
|
|==================== | 28%
| One of the most helpful parts of any R help file is the See Also section.
| Read that section for list.files. Some of these functions may be used in
...
|
|====================== | 31%
| Using the args() function on a function name is also a handy way to see what
...args()
|
|======================= | 33%
> args()
> args(list.files())
NULL
| One more time. You can do it! Or, type info() for more options.
> args(list.files)
NULL
|
|========================= | 36%
| "old.dir".
|
|=========================== | 38%
| We will use old.dir at the end of this lesson to move back to the place that
| we started. A lot of query functions like getwd() have the useful property
| that they return the answer to the question as a result of the function.
...
|
|============================= | 41%
| called "testdir".
> args(dir.create)
| Not quite right, but keep trying. Or, type info() for more options.
> dir.create("testdir")
|
|=============================== | 44%
| We will do all our work in this new directory and then delete it after we are
| done. This is the R analog to "Take only pictures, leave only footprints."
...
|
|================================ | 46%
> info(setwd)
> info("setwd")
> info("setwd()")
> setwd()
> setwd("testdir")
|
|================================== | 49%
| perhaps created for the specific project that you are working on. In fact,
...
|
|==================================== | 51%
| file.create() function.
> file.create("mytest.R")
[1] TRUE
|
|====================================== | 54%
| This should be the only file in this newly created directory. Let's check
<bytecode: 0x00000000149e9000>
<environment: namespace:base>
| Not quite! Try again. Or, type info() for more options.
> list.files()
[1] "mytest.R"
|
|======================================= | 56%
| file.exists() function.
> file.exists(getwd())
[1] TRUE
| That's not the answer I was looking for, but try again. Or, type info() for
| more options.
[1] TRUE
|
|========================================= | 59%
| These sorts of functions are excessive for interactive use. But, if you are
| running a program that loops through a series of files and does some
| processing on each one, you will want to check to see that each exists before
...
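A minimal sketch of the pattern described above: check that each file exists before processing it. The file names and the temporary directory are assumptions made for illustration; they are not part of the lesson.

```r
# Hypothetical file names; only the first one is actually created.
work_dir <- tempfile("swirl_demo_")
dir.create(work_dir)
files <- file.path(work_dir, c("present.R", "missing.R"))
invisible(file.create(files[1]))

processed <- character(0)
for (f in files) {
  if (!file.exists(f)) {
    message("Skipping missing file: ", basename(f))
    next
  }
  # Stand-in for real processing: record the file's base name.
  processed <- c(processed, basename(f))
}
print(processed)

unlink(work_dir, recursive = TRUE)  # clean up the temporary directory
```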
|
|=========================================== | 62%
> file.info("mytest.R")
atime exe
|
|============================================= | 64%
| You can use the $ operator --- e.g., file.info("mytest.R")$mode --- to grab
| specific items.
...
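As a rough illustration of grabbing specific items with `$`, the sketch below calls file.info() on a throwaway file. The file and its contents are hypothetical; the column names (size, isdir, mtime) are standard file.info() columns.

```r
# Hypothetical file created in a temporary location for illustration.
path <- tempfile(fileext = ".R")
writeLines("x <- 9", path)

fi <- file.info(path)   # a data frame with one row per file
fi$size                 # size in bytes
fi$isdir                # FALSE for a regular file
fi$mtime                # last-modification time

unlink(path)            # remove the temporary file
```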
|
|=============================================== | 67%
> ?file.rename
> file.rename("mytest.R","mytest2.R")
[1] TRUE
|
|================================================ | 69%
| Your operating system will provide simpler tools for these sorts of tasks,
| won't work since mytest.R no longer exists. You have already renamed it.
...
|
|================================================== | 72%
> file.copy("mytest2.R","mytest3.R")
[1] TRUE
| Great job!
|
|==================================================== | 74%
| You now have two files in the current directory. That may not seem very
| files would be absolutely necessary. Don't forget that you can, temporarily,
| leave the lesson by typing play() and then return by typing nxt().
...
|
|====================================================== | 77%
> file.path("mytest.R")
[1] "mytest.R"
| Not quite right, but keep trying. Or, type info() for more options.
| file.path("mytest3.R") works.
> file.path("mytest3.R")
[1] "mytest3.R"
|
|======================================================== | 79%
| You can use file.path to construct file and directory paths that are
| independent of the operating system your R code is running on. Pass 'folder1'
| pathname.
>
> file.path("folder1")
[1] "folder1"
| Not quite right, but keep trying. Or, type info() for more options.
> file.path("folder1","folder2")
[1] "folder1/folder2"
| That's correct!
|
|========================================================= | 82%
> ?dir.create
|
|=========================================================== | 85%
> dir.create("testdir3")
| Not quite, but you're learning! Try again. Or, type info() for more options.
| trick. If you forgot the recursive argument, the command may have appeared to
| Excellent job!
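The two ideas above can be combined: build a nested path with file.path() and create every level in one call with recursive = TRUE. This is only a sketch; the scratch location and the parent directory name "testdir2" are assumptions, not taken from the transcript.

```r
# Hypothetical scratch location so nothing is left in the working directory.
base_dir <- tempfile("nested_demo_")
nested <- file.path(base_dir, "testdir2", "testdir3")

# Without recursive = TRUE this fails with a warning, because the
# parent directories do not exist yet.
dir.create(nested, recursive = TRUE)
created <- dir.exists(nested)
print(created)

unlink(base_dir, recursive = TRUE)  # clean up
```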
|
|============================================================= | 87%
| created the variable old.dir with the full path for the original working
> ..
> setwd("old.dir")
> setwd(old.dir)
| It is often helpful to save the settings that you had before you began an
| analysis and then go back to them at the end. This trick is often used within
| functions; you save, say, the par() settings that you started with, mess
| around a bunch, and then set them back to the original values at the end.
| This isn't the same as what we have done here, but it seems similar enough to
| mention.
...
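One way to apply the save-and-restore idea inside a function is sketched below for the working directory. The helper name list_in_dir is hypothetical; it relies on setwd() invisibly returning the previous directory and on on.exit() to restore it.

```r
# Hypothetical helper: list the files in another directory, then
# return to wherever we started.
list_in_dir <- function(dir) {
  old_dir <- setwd(dir)       # setwd() returns the previous directory
  on.exit(setwd(old_dir))     # restore it even if an error occurs
  list.files()
}

start <- getwd()
files <- list_in_dir(tempdir())
identical(getwd(), start)     # check that we are back where we began
```

The same shape works for par(): save the old settings, register the restore with on.exit(), then change what you need.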
|
|================================================================= | 92%
| After you finish this lesson delete the 'testdir' directory that you just
...
|
|================================================================== | 95%
| Take nothing but results. Leave nothing but assumptions. That sounds like
| 'Take nothing but pictures. Leave nothing but footprints.' But it makes no
...
|
|==================================================================== | 97%
| In this lesson, you learned how to examine your R workspace and work with the
|
|======================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?
1: No
2: Yes
Selection: 2
Timeout was reached: [www.coursera.org] Operation timed out after 10002 milliseconds with 0 out
of 0 bytes received
> swirl()
| Welcome to swirl! Please sign in. If you've been here before, use the same
| name as you did then. If you are new, call yourself something unique.
| Would you like to receive credit for completing this course on Coursera.org?
1: No
2: Yes
Selection: 2
| Excellent job!
| You've reached the end of this lesson! Returning to the main menu...
1: R Programming
Selection: 1
Selection: 3
|
| | 0%
...
|
|=== | 4%
> 1:20
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
| Excellent work!
|
|====== | 9%
| That gave us every integer between (and including) 1 and 20. We could also
> pi:10
[1] 3.141593 4.141593 5.141593 6.141593 7.141593 8.141593 9.141593
|
|========= | 13%
...
|
|============ | 17%
> 15:1
[1] 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
|
|=============== | 22%
...
|
|================== | 26%
| Remember that if you have questions about a particular R function, you can
| access its documentation with a question mark followed by the function name:
| above, you must enclose the symbol in backticks like this: ?`:`. (NOTE: The
| backtick (`) key is generally located in the top left corner of a keyboard,
| above the Tab key. If you don't have a backtick key, you can use regular
| quotes.)
...
|
|===================== | 30%
> ?`:`
|
|======================== | 35%
| Often, we'll desire more control over a sequence we're creating than what the
| `:` operator gives us. The seq() function serves this purpose.
...
|
|=========================== | 39%
| The most basic use of seq() does exactly the same thing as the `:` operator.
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
|
|============================== | 43%
| This gives us the same output as 1:20. However, let's say that instead we
| want a vector of numbers ranging from 0 to 10, incremented by 0.5. seq(0, 10,
> seq(0,10,by=0.5)
[1] 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0
| Great job!
|
|================================= | 48%
| Or maybe we don't care what the increment is and we just want a sequence of
| 30 numbers between 5 and 10. seq(5, 10, length=30) does the trick. Give it a
| shot now and store the result in a new variable called my_seq.
| Not quite, but you're learning! Try again. Or, type info() for more options.
| You're using the same function here, but changing its arguments for different
| results. Be sure to store the result in a new variable called my_seq, like
|
|===================================== | 52%
| To confirm that my_seq has length 30, we can use the length() function. Try
| it now.
> length(my_seq)
[1] 30
|
|======================================== | 57%
| Let's pretend we don't know the length of my_seq, but we want to generate a
| vector. In other words, we want a new vector (1, 2, 3, ...) that is the same
| length as my_seq.
...
|
|=========================================== | 61%
| There are several ways we could do this. One possibility is to combine the
| `:` operator and the length() function like this: 1:length(my_seq). Give that
| a try.
> 1:length(my_seq)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27 28 29 30
|
|============================================== | 65%
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27 28 29 30
|
|================================================= | 70%
| However, as is the case with many common tasks, R has a separate built-in
| it in action.
> seq_along(my_seq)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27 28 29 30
|
|==================================================== | 74%
| There are often several approaches to solving the same problem, particularly
| in R. Simple approaches that involve less typing are generally best. It's
| also important for your code to be readable, so that you and others can
...
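A small sketch of why the built-in seq_along() is often preferred over 1:length(). The two agree for my_seq, but differ for an empty vector. The variable named empty is an assumption added for illustration.

```r
my_seq <- seq(5, 10, length = 30)
identical(1:length(my_seq), seq_along(my_seq))  # TRUE

empty <- numeric(0)   # hypothetical zero-length vector
1:length(empty)       # 1 0 (counts down from 1 to 0)
seq_along(empty)      # integer(0)
```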
|
|======================================================= | 78%
| If R has a built-in function for a particular task, it's likely that function
| is highly optimized for that purpose and is your best option. As you become a
| more advanced R programmer, you'll design your own functions to perform tasks
| when there are no better options. We'll explore writing your own functions in
| future lessons.
...
|
|========================================================== | 83%
...
|
|============================================================= | 87%
| Not exactly. Give it another go. Or, type info() for more options.
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[39] 0 0
|
|================================================================ | 91%
| Not quite, but you're learning! Try again. Or, type info() for more options.
| Try rep(c(0, 1, 2), times = 10) for a different variation on the same theme.
| Be sure to use the c() function to tell R that the numbers 0, 1, and 2 make
| up a vector.
[1] 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
| Finally, let's say that rather than repeating the vector (0, 1, 2) over and
| over again, we want our vector to contain 10 zeros, then 10 ones, then 10
| twos. We can do this with the `each` argument. Try rep(c(0, 1, 2), each =
| 10).
[1] 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
|
|======================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?
1: No
2: Yes
Selection: yes
Selection: 2
| You've reached the end of this lesson! Returning to the main menu...
| Please choose a course, or type 0 to exit swirl.
1: R Programming
Selection: 1
Selection: 4
|
| | 0%
...
|
|== | 3%
| Vectors come in two different flavors: atomic vectors and lists. An atomic
| vector contains exactly one data type, whereas a list may contain multiple
| data types. We'll explore atomic vectors further before we get to lists.
...
|
|==== | 5%
| In previous lessons, we dealt entirely with numeric vectors, which are one
| character, integer, and complex. In this lesson, we'll take a closer look at
...
|
|====== | 8%
| Logical vectors can contain the values TRUE, FALSE, and NA (for 'not
...
|
|======= | 11%
| First, create a numeric vector num_vect that contains the values 0.5, 55,
| -10, and 6.
| Excellent work!
|
|========= | 13%
| Now, create a variable called tf that gets the result of num_vect < 1, which
|
|=========== | 16%
Selection: 1
| Remember our lesson on vector arithmetic? The theme was that R performs many
| operations.
Selection: 1
| You're the best!
|
|============= | 18%
> tf
|
|=============== | 21%
| condition.
...
|
|================= | 24%
| The first element of num_vect is 0.5, which is less than 1 and therefore the
| statement 0.5 < 1 is TRUE. The second element of num_vect is 55, which is
| greater than 1, so the statement 55 < 1 is FALSE. The same logic applies for
...
|
|================== | 26%
| Let's try another. Type num_vect >= 6 without assigning the result to a new
| variable.
|
|==================== | 29%
| greater than OR equal to 6. Since only 55 and 6 are greater than or equal to
| 6, the second and fourth elements of the result are TRUE and the first and
...
|
|====================== | 32%
| The `<` and `>=` symbols in these examples are called 'logical operators'.
| Other logical operators include `>`, `<=`, `==` for exact equality, and `!=`
| for inequality.
...
|
|======================== | 34%
| If we have two logical expressions, A and B, we can ask whether at least one
| is TRUE with A | B (logical 'or' a.k.a. 'union') or whether they are both
...
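The operators discussed so far can be tried on the num_vect values from earlier in this lesson. This is just a sketch for experimenting; the comments show the results element by element.

```r
num_vect <- c(0.5, 55, -10, 6)

num_vect < 1                       # TRUE FALSE TRUE FALSE
num_vect >= 6                      # FALSE TRUE FALSE TRUE
(num_vect < 1) | (num_vect >= 6)   # TRUE TRUE TRUE TRUE
(num_vect < 1) & (num_vect >= 6)   # FALSE FALSE FALSE FALSE
!(num_vect < 1)                    # FALSE TRUE FALSE TRUE
(3 > 5) & (4 == 4)                 # FALSE
```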
|
|========================== | 37%
| It's a good idea to spend some time playing around with various combinations
| of these logical operators until you get comfortable with their use. We'll do
...
|
|============================ | 39%
| Try your best to predict the result of each of the following statements. You
| can use pencil and paper to work them out if it's helpful. If you get stuck,
| just guess and you've got a 50% chance of getting the right answer!
...
|
|============================= | 42%
| (3 > 5) & (4 == 4)
1: TRUE
2: FALSE
Selection: 2
1: FALSE
2: TRUE
Selection: 2
|
|================================= | 47%
1: TRUE
2: FALSE
Selection: 2
| Nice try, but that's not exactly what I was hoping for. Try again.
| This is a tricky one. Remember that the `!` symbol negates whatever comes
| that are enclosed within parentheses should be evaluated first. Then, work
1: FALSE
2: TRUE
Selection: 1
| This is a tricky one. Remember that the `!` symbol negates whatever comes
| that are enclosed within parentheses should be evaluated first. Then, work
1: TRUE
2: FALSE
Selection: 1
|
|=================================== | 50%
| Don't worry if you found these to be tricky. They're supposed to be. Working
...
|
|===================================== | 53%
| Character vectors are also very common in R. Double quotes are used to
|
|======================================= | 55%
| Create a character vector that contains the following words: "My", "name",
| "is". Remember to enclose each word in its own set of double quotes, so that
| R knows they are character strings. Store the vector in a variable called
| my_char.
| Nice work!
|
|========================================= | 58%
> my_char
|
|========================================== | 61%
| join the elements of my_char together into one continuous character string
| (i.e. a character vector of length 1). We can do this using the paste()
| function.
...
|
|============================================ | 63%
| Type paste(my_char, collapse = " ") now. Make sure there's a space between
| the double quotes in the `collapse` argument. You'll see why in a second.
|
|============================================== | 66%
| The `collapse` argument to the paste() function tells R that when we join
| together the elements of the my_char character vector, we'd like to separate
...
|
|================================================ | 68%
...
|
|================================================== | 71%
| To add (or 'concatenate') your name to the end of my_char, use the c()
| quotes where I've put "your_name_here". Try it now, storing the result in a
| new variable called my_name.
|
|==================================================== | 74%
> my_name
|
|===================================================== | 76%
| Now, use the paste() function once more to join the words in my_name together
| into a single character string. Don't forget to say collapse = " "!
|
|======================================================= | 79%
| single character vector. paste() can also be used to join the elements of
|
|========================================================= | 82%
| In the simplest case, we can join two character vectors that are each of
| length 1 (i.e. join two words). Try paste("Hello", "world!", sep = " "),
| where the `sep` argument tells R that we want to separate the joined elements
|
|=========================================================== | 84%
| For a slightly more complicated example, we can join two vectors, each of
| length 3. Use paste() to join the integer vector 1:3 with the character
| vector c("X", "Y", "Z"). This time, use sep = "" to leave no space between
| Not quite, but you're learning! Try again. Or, type info() for more options.
| Use paste(1:3, c("X", "Y", "Z"), sep = "") to see what happens when we join
| One more time. You can do it! Or, type info() for more options.
| Use paste(1:3, c("X", "Y", "Z"), sep = "") to see what happens when we join
| That's not the answer I was looking for, but try again. Or, type info() for
| more options.
| Use paste(1:3, c("X", "Y", "Z"), sep = "") to see what happens when we join
| That's not exactly what I'm looking for. Try again. Or, type info() for more
| options.
| Use paste(1:3, c("X", "Y", "Z"), sep = "") to see what happens when we join
| What do you think will happen if our vectors are of different length? (Hint:
...
|
|=============================================================== | 89%
[1] "A-1" "B-2" "C-3" "D-4" "E-1" "F-2" "G-3" "H-4" "I-1" "J-2" "K-3" "L-4"
[13] "M-1" "N-2" "O-3" "P-4" "Q-1" "R-2" "S-3" "T-4" "U-1" "V-2" "W-3" "X-4"
|
|================================================================ | 92%
| Since the character vector LETTERS is longer than the numeric vector 1:4, R
...
|
|================================================================== | 95%
| Also worth noting is that the numeric vector 1:4 gets 'coerced' into a
| character vector by the paste() function.
...
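A short sketch of the paste() behaviours covered in this lesson: collapsing, element-wise joining, recycling of the shorter vector, and coercion of numbers to character. The four-letter vector is a hypothetical, smaller stand-in for LETTERS.

```r
paste(c("My", "name", "is"), collapse = " ")   # "My name is"
paste(1:3, c("X", "Y", "Z"), sep = "")         # "1X" "2Y" "3Z"

# The shorter vector 1:2 is recycled to match the longer one.
recycled <- paste(c("a", "b", "c", "d"), 1:2, sep = "-")
recycled                                       # "a-1" "b-2" "c-1" "d-2"

class(paste(1:4))                              # "character"
```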
|
|==================================================================== | 97%
| We'll discuss coercion in another lesson, but all it really means is that the
...
|
|======================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?
1: No
2: Yes
Selection: yes
Selection: 2
| You've reached the end of this lesson! Returning to the main menu...
| Please choose a course, or type 0 to exit swirl.
1: R Programming
Selection: 1
Selection: 5
|
| | 0%
| Missing values play an important role in statistics and data analysis. Often,
| missing values must not be ignored, but rather they should be carefully
| missingness.
...
|
|==== | 5%
| (in the statistical sense). In this lesson, we'll explore missing values
| further.
...
|
|======= | 10%
|
|=========== | 15%
> x*3
[1] 132 NA 15 NA
|
|============== | 20%
| Notice that the elements of the resulting vector that correspond with the NA
|
|================== | 25%
|
|===================== | 30%
| Next, let's create a vector containing 1000 NAs with z <- rep(NA, 1000).
|
|========================= | 35%
| Finally, let's select 100 elements at random from these 2000 values
| (combining y and z) such that we don't know how many NAs we'll wind up with
| Let's first ask the question of where our NAs are located in our data. The
|
|================================ | 45%
> my_na
[1] FALSE FALSE TRUE FALSE TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE
[13] TRUE TRUE FALSE FALSE TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
[25] FALSE FALSE FALSE TRUE TRUE TRUE FALSE TRUE TRUE FALSE FALSE FALSE
[37] TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE
[49] FALSE FALSE FALSE TRUE FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE
[61] FALSE FALSE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
[73] TRUE TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE FALSE FALSE FALSE
[85] FALSE TRUE FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE FALSE
|
|=================================== | 50%
| Everywhere you see a TRUE, you know the corresponding element of my_data is
| NA. Likewise, everywhere you see a FALSE, you know the corresponding element
| of my_data is one of our random draws from the standard normal distribution.
...
|
|====================================== | 55%
| operator as a method of testing for equality between two objects. So, you
| might think the expression my_data == NA yields the same results as is.na().
| Give it a try.
> my
> my
> my_data == NA
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[26] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[76] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
|
|========================================== | 60%
| The reason you got a vector of all NAs is that NA is not really a value, but
...
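To see the difference between == NA and is.na() on a small case, here is a sketch using the same four values that produced the x*3 output earlier in this lesson.

```r
x <- c(44, NA, 5, NA)

x == NA     # NA NA NA NA (comparing with NA is always NA)
is.na(x)    # FALSE TRUE FALSE TRUE
x * 3       # 132 NA 15 NA
```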
|
|============================================== | 65%
| when using logical expressions anytime NAs might creep in, since a single NA
...
|
|================================================= | 70%
| So, back to the task at hand. Now that we have a vector, my_na, that has a
| TRUE for every NA and FALSE for every numeric value, we can compute the total
...
|
|==================================================== | 75%
| the number 1 and FALSE as the number 0. Therefore, if we take the sum of a
...
|
|======================================================== | 80%
| Let's give that a try here. Call the sum() function on my_na to count the
| total number of TRUEs in my_na, and thus the total number of NAs in my_data.
> sum(my_na)
[1] 56
| Nice work!
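The counting trick works on any logical vector. The sketch below uses a short, made-up vector (small_data is hypothetical) so the result is easy to check by eye; mean() gives the proportion of NAs in the same way.

```r
small_data <- c(1.2, NA, -0.4, NA, NA)   # hypothetical data
small_na <- is.na(small_data)

sum(small_na)    # 3, the number of NAs
mean(small_na)   # 0.6, the proportion of NAs
```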
|
|============================================================ | 85%
| Pretty cool, huh? Finally, let's take a look at the data to convince
> my_data
[37] NA NA NA NA NA 2.50670961
[43] NA NA NA NA NA NA
[55] NA -0.83728019 NA NA NA NA
[67] NA NA NA NA -0.52111371 NA
[73] NA NA NA NA 0.40754269 NA
|
|=============================================================== | 90%
| Now that we've got NAs down pat, let's look at a second type of missing value
| -- NaN, which stands for 'not a number'. To generate NaN, try dividing (using
> 0/0
[1] NaN
|
|================================================================== | 95%
| Let's do one more, just for fun. In R, Inf stands for infinity. What happens
> Inf-Inf
[1] NaN
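A brief sketch contrasting NaN with NA. Note that is.na() is TRUE for both, while is.nan() is TRUE only for NaN. The test vector is made up for illustration.

```r
0 / 0        # NaN
Inf - Inf    # NaN
1 / 0        # Inf

vals <- c(1, NA, NaN)   # hypothetical test vector
is.nan(vals)            # FALSE FALSE TRUE
is.na(vals)             # FALSE TRUE TRUE
```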
|
|======================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?
1: Yes
2: No
Selection: 1
| You've reached the end of this lesson! Returning to the main menu...
1: R Programming
Selection: 1
Selection: 6
|
| | 0%
| In this lesson, we'll see how to extract elements from a vector based on some
...
|
|== | 3%
| or only the elements that are not NA, or only those that are positive or
...
|
|==== | 5%
| I've created for you a vector called x that contains a random ordering of 20
| numbers (from a standard normal distribution) and 20 NAs. Type x now to see
> x
[21] -0.067661834 NA NA NA NA
|
|===== | 8%
| The way you tell R that you want to select some particular elements (i.e. a
...
|
|======= | 10%
| For a simple example, try x[1:10] to view the first ten elements of x.
> x[1:10]
|
|========= | 13%
...
|
|=========== | 15%
| Let's start by indexing with logical vectors. One common scenario when
| vector that are not NA (i.e. missing data). Recall that is.na(x) yields a
...
|
|============= | 18%
4: A vector of length 0
Selection: 1
|
|============== | 21%
> x[is.na(x)]
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
| Great job!
|
|================ | 23%
| can be read as 'is not NA'. Therefore, if we want to create a vector called y
| that contains all of the non-NA values from x, we can use y <- x[!is.na(x)].
| Give it a try.
> x[!is.na(x)]
|
|================== | 26%
> y
|
|==================== | 28%
| Now that we've isolated the non-missing values of x and put them in y, we can
| subset y as we please.
...
|
|====================== | 31%
| Recall that the expression y > 0 will give us a vector of logical values the
| than zero and FALSEs corresponding to values of y that are less than or equal
2: A vector of length 0
Selection: 4
| Type y[y > 0] to see that we get all of the positive elements of y, which are
> y[y>0]
| Excellent job!
|
|========================= | 36%
| You might wonder why we didn't just start with x[x > 0] to isolate the
> x[x>0]
[13] NA NA NA NA 0.69527742 NA
[31] NA 0.40097933
|
|=========================== | 38%
| the expression NA > 0 evaluates to NA. Hence we get a bunch of NAs mixed in
| with our positive numbers when we do this.
...
|
|============================= | 41%
|
|=============================== | 44%
| In this case, we request only values of x that are both non-missing AND
...
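A sketch of the combined condition on a small, made-up vector (the values in demo are hypothetical). Subsetting with x > 0 alone lets the NAs through; adding !is.na() removes them.

```r
demo <- c(-1.5, NA, 0.7, NA, 2.1)   # hypothetical data

demo[demo > 0]                  # NA 0.7 NA 2.1
demo[!is.na(demo) & demo > 0]   # 0.7 2.1
```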
|
|================================ | 46%
| I've already shown you how to subset just the first ten values of x using
...
|
|================================== | 49%
| 'one-based indexing', which (you guessed it!) means the first element of a
...
|
|==================================== | 51%
| Can you figure out how we'd subset the 3rd, 5th, and 7th elements of x? Hint
| -- Use the c() function to specify the element numbers as a numeric vector.
> c(x[3],x[5],x[7])
[1] NA 0.1130294 NA
| Not quite, but you're learning! Try again. Or, type info() for more options.
| Create a vector of indexes with c(3, 5, 7), then put that inside of the
| square brackets.
> c(3,5,7)
[1] 3 5 7
| Nice try, but that's not exactly what I was hoping for. Try again. Or, type
| Create a vector of indexes with c(3, 5, 7), then put that inside of the
| square brackets.
> info()
| -- Typing play() lets you experiment with R on your own; swirl will ignore
> c(x[3],x[5],x[7])
[1] NA 0.1130294 NA
| You're close...I can feel it! Try it again. Or, type info() for more options.
| Create a vector of indexes with c(3, 5, 7), then put that inside of the
| square brackets.
> c(x[3],x[5],x[7])
[1] NA 0.1130294 NA
| One more time. You can do it! Or, type info() for more options.
| Create a vector of indexes with c(3, 5, 7), then put that inside of the
| square brackets.
> c(x[3],x[5],x[7])
[1] NA 0.1130294 NA
| That's not exactly what I'm looking for. Try again. Or, type info() for more
| options.
| Create a vector of indexes with c(3, 5, 7), then put that inside of the
| square brackets.
> c(y[3],y[5],y[7])
| Create a vector of indexes with c(3, 5, 7), then put that inside of the
| square brackets.
> x[c(3,5,7)]
[1] NA 0.1130294 NA
|
|====================================== | 54%
| It's important that when using integer vectors to subset our vector x, we
| stick with the set of indexes {1, 2, ..., 40} since x only has 40 elements.
| What happens if we ask for the zeroth element of x (i.e. x[0])? Give it a
| try.
> x[0]
numeric(0)
| us from doing this. What if we ask for the 3000th element of x? Try it out.
> x[3000]
[1] NA
|
|========================================= | 59%
| Again, nothing useful, but R doesn't prevent us from asking for it. This
| should be a cautionary tale. You should always make sure that what you are
| asking for is within the bounds of the vector you're working with.
...
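Since R does not complain about out-of-range indexes, one option is to check the bounds yourself. The helper safe_get below is hypothetical, a sketch of one way to do it rather than a standard R function.

```r
v <- c(10, 20, 30)
v[0]   # numeric(0)
v[5]   # NA

# Hypothetical helper that stops instead of silently returning NA.
safe_get <- function(vec, i) {
  if (i < 1 || i > length(vec)) {
    stop("index ", i, " is outside 1..", length(vec))
  }
  vec[i]
}

safe_get(v, 2)   # 20
result <- tryCatch(safe_get(v, 5), error = function(e) conditionMessage(e))
result           # "index 5 is outside 1..3"
```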
|
|=========================================== | 62%
| What if we're interested in all elements of x EXCEPT the 2nd and 10th? It
...
|
|============================================= | 64%
| EXCEPT for the 2nd and 10th elements. Try x[c(-2, -10)] now to see this.
> x[c(-2,-10)]
[21] NA NA NA NA 0.031770282
| Excellent work!
|
|=============================================== | 67%
| negative sign out in front of the vector of positive numbers. Type x[-c(2,
> x[-c(2,10)]
[21] NA NA NA NA 0.031770282
|
|================================================ | 69%
...
|
|================================================== | 72%
| Create a numeric vector with three named elements using vect <- c(foo = 11,
|
|==================================================== | 74%
| When we print vect to the console, you'll see that each element has a name.
| Try it out.
> vect
11 2 NA
| We can also get the names of vect by passing vect as an argument to the
> names(vect)
| Nice work!
|
|======================================================== | 79%
| that now.
|
|========================================================= | 82%
| Then, we can add the `names` attribute to vect2 after the fact with
> names
NULL
| You almost had it, but not quite. Try again. Or, type info() for more
| options.
|
|=========================================================== | 85%
| Now, let's check that vect and vect2 are the same by passing them as
> identical(vect2)
> identical("vect2")
> identical("vect","vect2")
[1] FALSE
| That's not exactly what I'm looking for. Try again. Or, type info() for more
| options.
| The identical() function tells us if its first two arguments are, well,
| identical.
> identical(vect,vect2)
[1] TRUE
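The failed attempts above passed quoted strings, so identical() compared the text "vect" with the text "vect2" rather than the two objects. A minimal sketch, assuming a shortened two-element version of the lesson's vectors:

```r
# Shortened, hypothetical versions of the lesson's vect and vect2.
vect  <- c(foo = 11, bar = 2)   # names supplied at creation
vect2 <- c(11, 2)               # same values, no names yet

identical(vect, vect2)          # FALSE: the names attribute differs

names(vect2) <- c("foo", "bar") # add the names after the fact
identical(vect, vect2)          # TRUE

identical("vect", "vect2")      # FALSE: two different character strings
```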
|
|============================================================= | 87%
...
|
|=============================================================== | 90%
| the following commands do you think would give us the second element of vect?
1: vect["bar"]
2: vect[bar]
3: vect["2"]
Selection: 1
|
|================================================================= | 92%
> vect["bar"]
bar
2
|
|================================================================== | 95%
| out.
> vect[c("foo","bar")]
foo bar
11 2
|
|==================================================================== | 97%
| Now you know all four methods of subsetting data from vectors. Different
| approaches are best in different scenarios and when in doubt, try it out!
...
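As a recap, here is a sketch of the four subsetting methods (logical vectors, positive integers, negative integers, and names) side by side on one made-up named vector:

```r
# Made-up named vector with one missing value.
v <- c(a = 5, b = NA, c = 12, d = 8)

v[!is.na(v)]      # logical vector: keeps a, c, d
v[c(1, 3)]        # positive integers: keeps a, c
v[-2]             # negative integer: drops b, keeps a, c, d
v[c("a", "d")]    # names: keeps a, d
```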
|
|======================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?
1: Yes
2: No
Selection: 1
What is your email address? parasdhama1@gmail.com
| You've reached the end of this lesson! Returning to the main menu...
1: R Programming
Selection: 1
Selection: 7
|
| | 0%
| In this lesson, we'll cover matrices and data frames. Both represent
| 'rectangular' data types, meaning that they are used to store tabular data,
...
|
|== | 3%
| The main difference, as you'll see, is that matrices can only contain a
| single class of data, while data frames can consist of many different classes
| of data.
...
|
|==== | 6%
| Let's create a vector containing the numbers 1 through 20 using the `:`
|
|====== | 8%
> my_vector
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
| You nailed it! Good job!
|
|======== | 11%
> dim(my_vector)
NULL
|
|========== | 14%
| have a `dim` attribute (so it's just NULL), but we can find its length using
> length(my_vector)
[1] 20
|
|============ | 17%
| Ah! That's what we wanted. But, what happens if we give my_vector a `dim`
|
|============== | 19%
| It's okay if that last command seemed a little strange to you. It should! The
| dim() function allows you to get OR set the `dim` attribute for an R object.
| my_vector.
...
|
|================ | 22%
| Use dim(my_vector) to confirm that we've set the `dim` attribute correctly.
> dim(my_vector)
[1] 4 5
|
|================== | 25%
| Try it now.
> attributes(my_vector)
$dim
[1] 4 5
| Perseverance, that's the answer.
|
|=================== | 28%
| Just like in math class, when dealing with a 2-dimensional object (think
| rectangular table), the first number is the number of rows and the second is
| columns.
...
|
|===================== | 31%
| But, wait! That doesn't sound like a vector any more. Well, it's not. Now
| it's a matrix. View the contents of my_vector now to see what it looks like.
> my_vector
[1,] 1 5 9 13 17
[2,] 2 6 10 14 18
[3,] 3 7 11 15 19
[4,] 4 8 12 16 20
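The steps so far can be replayed in a few lines. This sketch uses a fresh variable v rather than my_vector, and shows that setting `dim` fills the matrix column by column:

```r
v <- 1:20
dim(v)              # NULL: a plain vector has no dim attribute
length(v)           # 20

dim(v) <- c(4, 5)   # 4 rows, 5 columns
dim(v)              # 4 5
is.matrix(v)        # TRUE

v[2, 3]             # 10: row 2, column 3 (values fill down each column)
```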
|
|======================= | 33%
| Now, let's confirm it's actually a matrix by using the class() function. Type
[1] "matrix"
|
|========================= | 36%
| that helps us remember what it is. Store the value of my_vector in a new
|
|=========================== | 39%
| The example that we've used so far was meant to illustrate the point that a
...
|
|============================= | 42%
| Bring up the help file for the matrix() function now using the `?` function.
> ?matrix
| That's correct!
|
|=============================== | 44%
| Now, look at the documentation for the matrix function and see if you can
| figure out how to create a matrix containing the same numbers (1-20) and
|
|================================= | 47%
| Finally, let's confirm that my_matrix and my_matrix2 are actually identical.
| The identical() function will tell us if its first two arguments are the
[1] TRUE
| Excellent job!
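A sketch of the two routes to the same matrix, using the hypothetical names m1 and m2 in place of my_matrix and my_matrix2. The byrow variant is an extra illustration and not part of the lesson:

```r
# Route 1: set the dim attribute on a vector.
m1 <- 1:20
dim(m1) <- c(4, 5)

# Route 2: call matrix() directly.
m2 <- matrix(1:20, nrow = 4, ncol = 5)

identical(m1, m2)   # TRUE
m2[1, ]             # 1 5 9 13 17: filled column by column

# Filling row by row instead gives a different matrix.
m3 <- matrix(1:20, nrow = 4, ncol = 5, byrow = TRUE)
m3[1, ]             # 1 2 3 4 5
```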
|
|=================================== | 50%
| Now, imagine that the numbers in our table represent some measurements from a
| clinical experiment, where each row represents one patient and each column
|
|===================================== | 53%
| We may want to label the rows, so that we know which numbers belong to each
...
|
|======================================= | 56%
| patients -- Bill, Gina, Kelly, and Sean. Remember that double quotes tell R
| patients.
|
|========================================= | 58%
| Now we'll use the cbind() function to 'combine columns'. Don't worry about
| storing the result in a new variable. Just call cbind() with two arguments --
patients
| That's correct!
|
|=========================================== | 61%
| Something is fishy about our result! It appears that combining the character
| quotes. This means we're left with a matrix of character strings, which is no
| good.
...
|
|============================================= | 64%
| If you remember back to the beginning of this lesson, I told you that
| matrices can only contain ONE class of data. Therefore, when we tried to
...
|
|=============================================== | 67%
| This is called 'implicit coercion', because we didn't ask for it. It just
| happened. But why didn't R just convert the names of our patients to numbers?
...
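The coercion can be checked directly. A minimal sketch, assuming the four patient names from the lesson and a fresh 4 x 5 matrix m standing in for my_matrix:

```r
patients <- c("Bill", "Gina", "Kelly", "Sean")
m <- matrix(1:20, nrow = 4, ncol = 5)

combined <- cbind(patients, m)

is.numeric(m)            # TRUE: the original matrix holds numbers
is.character(combined)   # TRUE: every entry is now a character string
combined[1, 2]           # "1": a string, not the number 1
dim(combined)            # 4 6
```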
|
|================================================= | 69%
| So, we're still left with the question of how to include the names of our
| patients in the table without destroying the integrity of our numeric data.
|
|=================================================== | 72%
| Now view the contents of my_data to see what we've come up with.
> my_data
patients X1 X2 X3 X4 X5
1 Bill 1 5 9 13 17
2 Gina 2 6 10 14 18
3 Kelly 3 7 11 15 19
4 Sean 4 8 12 16 20
| vector of names right alongside our matrix of numbers. That's exactly what we
...
|
|====================================================== | 78%
| Behind the scenes, the data.frame() function takes any number of arguments
| original objects.
...
|
|======================================================== | 81%
| Let's confirm this by calling the class() function on our newly created data
| frame.
> class(data_frame)
> class(my_data)
[1] "data.frame"
| Nice work!
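Unlike cbind(), data.frame() keeps each column's own class. A sketch under the same assumptions as the lesson (four patient names and a 4 x 5 matrix of the numbers 1 to 20, here called m):

```r
patients <- c("Bill", "Gina", "Kelly", "Sean")
m <- matrix(1:20, nrow = 4, ncol = 5)

my_data <- data.frame(patients, m)

class(my_data)          # "data.frame"
dim(my_data)            # 4 6
is.numeric(my_data$X1)  # TRUE: the numeric columns stay numeric

# The patients column is character in recent R versions and a factor in
# older ones; either way, the numeric columns are untouched.
sapply(my_data, class)
```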
|
|========================================================== | 83%
| It's also possible to assign names to the individual rows and columns of a
| data frame, which presents another possible way of determining which row of
...
|
|============================================================ | 86%
| However, since we've already solved that problem, let's solve a different
| problem by assigning names to the columns of our data frame so that we know
...
|
|============================================================== | 89%
| Since we have six columns (including patient names), we'll need to first
| create a vector containing one element for each column. Create a character
| vector called cnames that contains the following values (in order) --
|
|================================================================ | 92%
| Now, use the colnames() function to set the `colnames` attribute for our data
| frame. This is similar to the way we used the dim() function earlier in this
| lesson.
> colnames(cnames)
NULL
| That's not the answer I was looking for, but try again. Or, type info() for
| more options.
| Nice work!
|
|================================================================== | 94%
| Let's see if that got the job done. Print the contents of my_data.
> my_data
1 Bill 1 5 9 13 17
2 Gina 2 6 10 14 18
3 Kelly 3 7 11 15 19
4 Sean 4 8 12 16 20
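The failed attempt above, colnames(cnames), asks the vector cnames for its own column names, which is why it returns NULL. Setting the names uses the assignment form instead. A sketch on a small hypothetical data frame with placeholder labels (not the lesson's actual column names):

```r
# Small hypothetical data frame; the labels below are placeholders.
df <- data.frame(patients = c("Bill", "Gina"), X1 = 1:2, X2 = 3:4)
cnames <- c("name", "first", "second")

colnames(cnames)          # NULL: a plain vector has no column names

colnames(df) <- cnames    # assignment form sets the column names
colnames(df)              # "name" "first" "second"
```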
|
|==================================================================== | 97%
| In this lesson, you learned the basics of working with two very important and
| common data structures -- matrices and data frames. There's much more to
| learn and we'll be covering more advanced topics, particularly with respect
...
|
|======================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?
1: No
2: Yes
Selection: yes
Selection: 2
| Great job!
| You've reached the end of this lesson! Returning to the main menu...
1: R Programming