How To Choose A Random Number in R

As a language for statistical analysis, R has a comprehensive library of functions for
generating random numbers from various statistical distributions. In this post, I want to focus
on the simplest of questions: How do I generate a random number?

The answer depends on what kind of random number you want to generate. Let's illustrate by
example.

Generate a random number between 5.0 and 7.5

If you want to generate a decimal number where any value (including fractional values)
between the stated minimum and maximum is equally likely, use the runif function. This
function generates values from the Uniform distribution. Here's how to generate one random
number between 5.0 and 7.5:

> x1 <- runif(1, 5.0, 7.5)
> x1
[1] 6.715697

Of course, when you run this, you'll get a different number, but it will definitely be between
5.0 and 7.5. You won't get the values 5.0 or 7.5 exactly, either.

If you want to generate multiple random values, don't use a loop. You can generate several
values at once by specifying the number of values you want as the first argument to runif.
Here's how to generate 10 values between 5.0 and 7.5:

> x2 <- runif(10, 5.0, 7.5)
> x2
[1] 6.339188 5.311788 7.099009 5.746380 6.720383 7.433535 7.159988
[8] 5.047628 7.011670 7.030854
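
To check the claims above, here is a minimal sketch that draws a larger batch in one call and inspects its range. The seed value 42, the batch size and the variable name x are arbitrary choices for illustration and do not come from the original post.

```r
# Fix the seed so the run is repeatable (42 is an arbitrary choice)
set.seed(42)

# Draw 1000 values in a single vectorized call; named arguments make the bounds explicit
x <- runif(1000, min = 5.0, max = 7.5)

length(x)               # 1000 values, no loop needed
range(x)                # smallest and largest draw
all(x > 5.0 & x < 7.5)  # TRUE: every value lies strictly inside the interval
mean(x)                 # should be close to the midpoint, 6.25
```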

Generate a random integer between 1 and 10

This looks like the same exercise as the last one, but now we only want whole numbers, not
fractional values. For that, we use the sample function:

> x3 <- sample(1:10, 1)
> x3
[1] 4

The first argument is a vector of valid numbers to generate (here, the numbers 1 to 10), and
the second argument indicates that one number should be returned. If we want to generate more
than one random number and let the same value come up more than once, we have to add an
additional argument to indicate that repeats are allowed:

> x4 <- sample(1:10, 5, replace=TRUE)
> x4
[1] 6 9 7 6 5

Note the number 6 appears twice in the 5 numbers generated. (Here's a fun exercise: what is
the probability of running this command and having no repeats in the 5 numbers generated?)
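
One way to work the exercise, offered as a sketch and not as the author's own solution: the five draws contain no repeats when each draw differs from all earlier ones, so the probability is (10 x 9 x 8 x 7 x 6) / 10^5 = 0.3024. The simulation below is only a sanity check; the seed and the number of trials are arbitrary.

```r
# Exact probability: 10 choices for the first draw, 9 for the second, and so on
p_exact <- prod(10:6) / 10^5
p_exact  # 0.3024

# Monte Carlo check: repeat the draw many times and count runs with no duplicates
set.seed(1)  # arbitrary seed
n_trials <- 100000
no_repeats <- replicate(
  n_trials,
  anyDuplicated(sample(1:10, 5, replace = TRUE)) == 0
)
p_sim <- mean(no_repeats)
p_sim    # should land near 0.3024
```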

Select 6 random numbers between 1 and 40, without replacement

If you wanted to simulate the lotto game common to many countries, where you randomly
select 6 balls from 40 (each labelled with a number from 1 to 40), you'd again use the sample
function, but this time without replacement:

> x5 <- sample(1:40, 6, replace=FALSE)
> x5
[1] 10 21 29 12 7 31

You'll get a different 6 numbers when you run this, but they'll all be between 1 and 40
(inclusive), and no number will repeat. Also, you don't actually need to include the
replace=FALSE option -- sampling without replacement is the default -- but it doesn't hurt to
include it for clarity.
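
A minimal sketch of the lotto draw that also verifies the two properties just described. It uses sample.int, base R's integer-only counterpart of sample; the seed and the variable name ticket are arbitrary illustration choices.

```r
set.seed(2024)  # arbitrary seed, only so the run is repeatable

# sample.int(40, 6) draws 6 distinct integers from 1..40, like sample(1:40, 6)
ticket <- sample.int(40, 6)

sort(ticket)                     # lotto numbers are usually reported in ascending order
anyDuplicated(ticket) == 0       # TRUE: no number repeats
all(ticket >= 1 & ticket <= 40)  # TRUE: every number is in range
```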

Select 10 items from a list of 50

You can use this same idea to generate a random subset of any vector, even one that doesn't
contain numbers. For example, to select 10 distinct states of the US at random:

> sample(state.name, 10)
[1] "Virginia" "Oklahoma" "Maryland" "Michigan"
[5] "Alaska" "South Dakota" "Minnesota" "Idaho"
[9] "Indiana" "Connecticut"

You can't sample more values than you have without allowing replacements:

> sample(state.name, 52)
Error in sample(state.name, 52) :
  cannot take a sample larger than the population when 'replace = FALSE'

... but sampling exactly the number you do have is a great way to randomize the order of a
vector. Here are the 50 states of the US, in random order:

> sample(state.name, 50)
[1] "California" "Iowa" "Hawaii"
[4] "Montana" "South Dakota" "North Dakota"
[7] "Louisiana" "Maine" "Maryland"
[10] "New Hampshire" "Rhode Island" "Texas"
[13] "Florida" "North Carolina" "Minnesota"
[16] "Arkansas" "Pennsylvania" "Colorado"
[19] "Idaho" "Connecticut" "Utah"
[22] "South Carolina" "Illinois" "Ohio"
[25] "New Jersey" "Indiana" "Wisconsin"
[28] "Mississippi" "Michigan" "Wyoming"
[31] "West Virginia" "Alaska" "Georgia"
[34] "Vermont" "Virginia" "Oklahoma"
[37] "Washington" "New Mexico" "New York"
[40] "Delaware" "Nevada" "Alabama"
[43] "Kentucky" "Missouri" "Oregon"
[46] "Tennessee" "Arizona" "Massachusetts"
[49] "Kansas" "Nebraska"

You could also have just used sample(state.name) for the same result -- sampling as
many values as provided is the default.
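
The following sketch checks that a shuffle really is a permutation, and shows that asking for more than 50 values works once replacement is allowed. The seed and the variable names shuffled and oversampled are made up for illustration.

```r
set.seed(7)  # arbitrary seed for a repeatable run

# Shuffle: same 50 names, different order
shuffled <- sample(state.name)
length(shuffled)                # 50
setequal(shuffled, state.name)  # TRUE: nothing lost, nothing added
anyDuplicated(shuffled) == 0    # TRUE: each state appears exactly once

# Oversampling only works with replacement, so some states must repeat
oversampled <- sample(state.name, 52, replace = TRUE)
length(oversampled)             # 52
anyDuplicated(oversampled) > 0  # TRUE: 52 draws from 50 names force a repeat
```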

Further reading

For more information about how R generates random numbers, check out the following help
pages:

> ?runif
> ?sample
> ?.Random.seed

The last of these provides technical detail on the random number generator R uses, and how
you can set the random seed to reproduce a sequence of random numbers.
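
As a short illustration of reseeding, the sketch below uses set.seed, the usual base R entry point for fixing the generator's state. The seed value 123 and the variable names are arbitrary; any integer seed would behave the same way.

```r
# Same seed, same call, same numbers
set.seed(123)
a <- runif(3, 5.0, 7.5)

set.seed(123)
b <- runif(3, 5.0, 7.5)

identical(a, b)    # TRUE: reseeding reproduces the sequence

# The same applies to sample()
set.seed(123)
s1 <- sample(1:40, 6)

set.seed(123)
s2 <- sample(1:40, 6)

identical(s1, s2)  # TRUE
```

Without the second call to set.seed, b would continue from wherever the generator left off after a, and the two vectors would differ.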
