R Complete
Example:
barplot(
x,
main = "Number of customers visited the store",
xlab = "Days",
ylab = "Number of customers",
names.arg = c("Monday", "Tuesday", "Wednesday", "Thursday",
"Friday", "Saturday", "Sunday"),
col = rainbow(length(x))
)
Exercise
Create a bar plot of the following data using R.
Good 13
Above Average 12
Average 15
Poor 10
Total 50
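One way to approach this exercise, as a minimal sketch: the counts come from the table above, while the variable name `ratings`, the title and the axis labels are illustrative assumptions. The Total row is left out because it is a sum, not a category.

```r
# Frequencies from the exercise table (Total = 50 is omitted: it is not a category)
ratings <- c(Good = 13, "Above Average" = 12, Average = 15, Poor = 10)

barplot(
  ratings,
  main = "Ratings",        # assumed title
  xlab = "Rating",         # assumed axis label
  ylab = "Frequency",      # assumed axis label
  col = rainbow(length(ratings))
)
```

Because `ratings` is a named vector, `barplot()` uses the names as bar labels, so `names.arg` is not needed here.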
col = rainbow(length(x))
)
barplot(
x,
main = "Number of customers visited the store",
xlab = "Days",
ylab = "Number of customers",
names.arg = c("Monday", "Tuesday", "Wednesday", "Thursday",
"Friday", "Saturday", "Sunday"),
col = rainbow(length(x)),
density = seq(10, 70, 10)
)
Tabular Method
Example:
set = c(1, 3, 8, 4, 2, 3, 6, 5, 5, 8, 4, 2, 4, 1, 5)
# Relative frequency
prop.table(ta)
# Cumulative Frequency
cumsum(ta)
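The example uses `ta` without showing where it is defined. A minimal sketch, assuming `ta` is the frequency table of `set` built with `table()`:

```r
set <- c(1, 3, 8, 4, 2, 3, 6, 5, 5, 8, 4, 2, 4, 1, 5)

# Assumption: ta is the frequency table of set
ta <- table(set)
ta

# Relative frequency: each count divided by the total number of observations
prop.table(ta)

# Cumulative frequency: running total of the counts
cumsum(ta)
```

The last cumulative frequency equals the number of observations, which is a quick way to check the table.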
Pie Chart
# Function to create pie charts:
pie()
Example:
lbls <- paste(piepercent, "%", sep = "")
pie(
x,
labels = lbls,
main = "Pie chart with slice percentage",
col = rainbow(length(x)),
radius = 1
)
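The snippet above relies on `x` and `piepercent`, which are not defined in this part of the notes. A minimal sketch, assuming `x` holds hypothetical counts and `piepercent` is each slice's share of the total, rounded to one decimal place:

```r
# Hypothetical data, for illustration only
x <- c(20, 30, 10, 40)

# Assumption: piepercent is the percentage share of each slice
piepercent <- round(100 * x / sum(x), 1)
lbls <- paste(piepercent, "%", sep = "")

pie(
  x,
  labels = lbls,
  main = "Pie chart with slice percentage",
  col = rainbow(length(x)),
  radius = 1
)
```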
radius = 1
)
Find mean values for each category in a column (similar to group by in SQL):
# Find the maximum value of the 'uptake' variable for each category
# in the 'Treatment' column
tapply(d$uptake, d$Treatment, max)
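The code for the group means is not shown in this part of the notes. As a sketch, assuming `d` is a copy of R's built-in `CO2` dataset (which has `uptake`, `Treatment`, `Plant` and `Type` columns), the same `tapply()` pattern gives the mean per category:

```r
# Assumption: d is the built-in CO2 dataset
d <- CO2

# Mean uptake for each Treatment category (like GROUP BY in SQL)
group_means <- tapply(d$uptake, d$Treatment, mean)
group_means

# Maximum uptake for each Treatment category
tapply(d$uptake, d$Treatment, max)

# aggregate() computes the same group means and returns a data frame
aggregate(uptake ~ Treatment, data = d, FUN = mean)
```

`tapply()` returns a named vector, which is handy for quick inspection; `aggregate()` may be preferable when the result feeds into further data-frame operations.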
Median
median(u) # Find the median of 'u'
Mode
## Find the mode -- Method 1
💬 match(v, uniqv) returns a vector of the same length as v, where each element is the index of the corresponding element of v in uniqv.
which.max() returns the index of the maximum value in the input vector.
So, uniqv[which.max(tabulate(match(v, uniqv)))] returns the value in uniqv that corresponds to the maximum count in v, which is the mode of v.
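The code for Method 1 is missing from these notes. A minimal sketch built from the expression explained above; the function name `getmode` and the sample vector are assumptions for illustration:

```r
# Hypothetical helper; the notes do not show the original code
getmode <- function(v) {
  uniqv <- unique(v)                    # distinct values, in order of first appearance
  counts <- tabulate(match(v, uniqv))   # how often each distinct value occurs
  uniqv[which.max(counts)]              # value with the highest count
}

v <- c(2, 7, 7, 3, 7, 2)   # illustrative data
getmode(v)
```

If several values share the highest count, `which.max()` picks the first one, so this sketch reports only one mode.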
💬 The table() function counts the number of times each unique value occurs in u.
💬 max(y) finds the maximum frequency in y.
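The code for this second method is not shown in the notes. A sketch consistent with the two remarks above, assuming `y` is the frequency table of `u` and `u` is an illustrative vector:

```r
u <- c(2, 7, 7, 3, 7, 2)   # illustrative data

y <- table(u)                        # frequency of each unique value
modes <- names(y)[y == max(y)]       # value(s) whose frequency equals the maximum
modes
```

Unlike the `which.max()` approach, this returns every value tied for the highest frequency. The result is a character vector, so wrap it in `as.numeric()` if numbers are needed.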
Measures of Dispersion
Range and Interquartile Range
# Find the range
Range = max(u) - min(u)
Range
Quartiles
# Find Quartiles
quantile(u, 0.25) #First Quartile
quantile(u, 0.5) #Second Quartile
quantile(u, 0.75) #Third Quartile
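The three calls above can be combined, and the interquartile range follows directly from the first and third quartiles. A short sketch, assuming for illustration that `u` is the `uptake` column of the built-in `CO2` dataset:

```r
u <- CO2$uptake   # assumption: u is the uptake column of CO2

# All three quartiles in one call
q <- quantile(u, probs = c(0.25, 0.5, 0.75))
q

# Interquartile range: third quartile minus first quartile
iqr_manual <- q[["75%"]] - q[["25%"]]
iqr_manual

IQR(u)   # built-in equivalent
```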
Find the five-number summary for a specific category in a column:
Exercise
Find the summary statistics for uptake where plant type is Qn1 and uptake value
is more than 20
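Neither the five-number summary code nor a solution to the exercise appears in these notes. One possible sketch, assuming `d` is the built-in `CO2` dataset and that "plant type is Qn1" refers to the `Plant` column:

```r
d <- CO2   # assumption: d is the built-in CO2 dataset

# Five-number summary of uptake for one category (here Treatment == "chilled")
fivenum(d$uptake[d$Treatment == "chilled"])

# Exercise: summary statistics for uptake where Plant is Qn1 and uptake > 20
sel <- d$uptake[d$Plant == "Qn1" & d$uptake > 20]
summary(sel)
```

Logical conditions inside `[ ]` are combined with `&`, so only rows that satisfy both conditions are kept.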
others do not.
To ignore missing values in other functions, you have to request it explicitly
(for example, with na.rm = TRUE).
Example:
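The original example is not preserved here. A small sketch of the behaviour described, using a made-up vector `w` with one missing value:

```r
w <- c(4, 8, NA, 15, 16)   # illustrative vector with one missing value

summary(w)                 # reports the NA count separately
mean(w)                    # NA: the missing value propagates
mean(w, na.rm = TRUE)      # missing value dropped before averaging
median(w, na.rm = TRUE)    # the same argument works for median(), var(), sd(), ...
```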
Deciles
# Find Deciles
quantile(u, 0.4) # Fourth Decile
quantile(u, 0.7) #Seventh Decile
Percentiles
# Find Percentiles
quantile(u, 0.98) # 98th Percentile
quantile(u, 0.37) # 37th Percentile
Variance
# Find the Sample Variance
var(u)
varwidth: if set to FALSE, all boxes have the same width regardless of
the size of the group.
Examples:
datasets::ToothGrowth
TG<-ToothGrowth
boxplot(
TG$len,
main="Box plot of tooth length",
ylab="Tooth length",
col="hotpink",
border="lightpink",
notch = FALSE,
varwidth = FALSE,
horizontal = TRUE
)
datasets::ToothGrowth
TG<-ToothGrowth
boxplot(
len~supp,
data = TG,
main = "Tooth growth with supplement types",
xlab = "Supplement type",
ylab = "Tooth length",
col = c("hotpink", "lightpink")
)
Tally Table
# Tally Table
datasets::iris
i <- iris
table(i$Species)
Output:
Contingency Table
# Contingency Table
table(d$Plant, d$Type)
table(d$Plant, d$Treatment)
Output:
Binomial Distribution
dbinom
For binomial distributions, dbinom is used in R.
# dbinom Help
help(dbinom)
Example:
Find P(x = 1) when n = 5 and θ = 0.1.
x = 1
n = 5
θ = 0.1
$$P(x = 1) = {}^{5}C_{1}\,(0.1)^{1}\,(0.9)^{4} = 5 \times 0.1 \times 0.6561 = 0.32805$$
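The hand calculation can be checked in R. A minimal sketch using `dbinom()` and, for comparison, the formula written out with `choose()`:

```r
# P(x = 1) for n = 5 trials with success probability 0.1
p_dbinom <- dbinom(x = 1, size = 5, prob = 0.1)
p_dbinom

# Same value from the formula: 5C1 * 0.1^1 * 0.9^4
p_formula <- choose(5, 1) * 0.1^1 * 0.9^4
p_formula
```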
Exercise
A customer receiving service from a customer care center can be classified as
good service or bad service. The probability of getting good service is 0.4.
n = 10
x=2
θ = 0.4
2. What is the probability of getting bad service between 3 and 7 times
(exclusive) out of 10 tries?
n = 10
3 < x < 7
θ = 0.6
sum(dbinom(x = 4:6, size = 10, prob = 0.6))
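The wording of the first question is missing from these notes. Assuming it asks for the probability of exactly 2 good services in 10 tries (which is what n = 10, x = 2, θ = 0.4 suggest), a sketch of both parts:

```r
# Part 1 (assumed question): P(x = 2) good services out of 10, prob = 0.4
p_good_2 <- dbinom(x = 2, size = 10, prob = 0.4)
p_good_2

# Part 2: P(3 < x < 7) bad services, i.e. x = 4, 5 or 6, prob = 0.6
p_bad_4_to_6 <- sum(dbinom(x = 4:6, size = 10, prob = 0.6))
p_bad_4_to_6
```

`dbinom()` is vectorised over `x`, which is why a range such as `4:6` can be passed in and summed.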
pbinom
pbinom is the cumulative distribution function: pbinom(q, size, prob) returns P(x ≤ q).
# pbinom Help
help(pbinom)
Examples:
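The original examples are not preserved here. As an illustrative sketch, the following reuses the numbers from the binomial examples earlier in these notes and shows that `pbinom()` agrees with summing `dbinom()` values:

```r
# P(x <= 1) when n = 5 and prob = 0.1
p_cum <- pbinom(q = 1, size = 5, prob = 0.1)
p_cum
sum(dbinom(x = 0:1, size = 5, prob = 0.1))   # same value

# P(3 < x < 7) = P(x <= 6) - P(x <= 3) when n = 10 and prob = 0.6
p_between <- pbinom(6, size = 10, prob = 0.6) - pbinom(3, size = 10, prob = 0.6)
p_between
```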
Poisson Distribution
dpois
dpois is used for Poisson distributions in R
# dpois Help
help(dpois)
Examples:
Find P(x = 0) when λ = 0.03.
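The code for this example is missing from the notes. A minimal sketch with `dpois()`, checked against the Poisson formula, which reduces to e^(-λ) for x = 0:

```r
# P(x = 0) for a Poisson distribution with lambda = 0.03
p0 <- dpois(x = 0, lambda = 0.03)
p0

# Same value from the formula: exp(-lambda) * lambda^0 / 0!
exp(-0.03)
```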
ppois
ppois is the cumulative distribution function: ppois(q, lambda) returns P(x ≤ q).
# ppois Help
help(ppois)
Example:
Find the value of P(x = 0) + P(x = 1) + P(x = 2) when λ = 2.
ppois(2, lambda = 2)
# Method 1
p1 <- dpois(x = 0, lambda = 2)
p2 <- dpois(x = 1, lambda = 2)
p3 <- dpois(x = 2, lambda = 2)
p <- p1 + p2 + p3
p
# Method 2
sum(dpois(x=0:2, lambda = 2))
Exercise
Suppose it has been observed that, on average, 180 cars per hour pass a
specified point on a particular road in the morning rush hour. Due to impending
road works, it is estimated that congestion will occur closer to the city center if
more than 5 cars pass the point in any one minute. What is the probability of
congestion occurring?
$$\lambda = \frac{180}{60} = 3$$
x > 5
1 - ppois(5, lambda = 3)
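The same probability can be obtained without the subtraction by asking `ppois()` for the upper tail. A short sketch comparing the two forms:

```r
lambda <- 180 / 60   # average number of cars per minute

# P(x > 5) = 1 - P(x <= 5)
p_complement <- 1 - ppois(5, lambda = lambda)
p_complement

# Equivalent: request the upper tail directly
p_upper <- ppois(5, lambda = lambda, lower.tail = FALSE)
p_upper
```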
Exercise
A manufacturer of balloons produces 40% that are oval and 60% that are
round. Packets of 20 balloons may be assumed to contain random samples of
balloons. Determine the probability that such a packet contains:
2. P(oval) = 0.4
P(round) = 0.6
$${}^{20}C_{10}\,(0.4)^{10}\,(0.6)^{10}$$
P(x ≤ 9)
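The individual questions are missing from these notes, so the following is only a sketch of how the two expressions above could be evaluated. It assumes that `x` counts oval balloons in a packet of 20, which the notes do not state:

```r
# 20C10 * 0.4^10 * 0.6^10: exactly 10 oval (and 10 round) balloons
p_equal <- dbinom(x = 10, size = 20, prob = 0.4)
p_equal

# P(x <= 9), assuming x is the number of oval balloons
p_at_most_9 <- pbinom(q = 9, size = 20, prob = 0.4)
p_at_most_9
```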
The number of trials is not fixed, even though they are independent events.
# dunif Help
?dunif
Example:
Find the PDF of a uniform distribution between 0 and 5 at the point x = 2.
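The code for this example is missing. A minimal sketch with `dunif()`: for a uniform distribution on [0, 5] the density is constant at 1 / (5 - 0) everywhere inside the interval.

```r
# Density of Uniform(0, 5) at x = 2
dens <- dunif(x = 2, min = 0, max = 5)
dens

# Outside the interval the density is zero
dunif(x = 6, min = 0, max = 5)
```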
punif
punif is the cumulative distribution function of the uniform distribution: punif(q, min, max) returns P(x ≤ q).
qunif
qunif is the quantile function of the uniform distribution: qunif(p, min, max) returns the value q for which P(x ≤ q) = p.
Example:
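The original example is not preserved. An illustrative sketch, reusing the Uniform(0, 5) distribution from the `dunif` example, shows that `punif()` and `qunif()` are inverses of each other:

```r
# P(x <= 2) for Uniform(0, 5)
p <- punif(q = 2, min = 0, max = 5)
p

# The value below which 40% of the distribution lies
q <- qunif(p = 0.4, min = 0, max = 5)
q
```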
pnorm
pnorm is the cumulative distribution function of the normal distribution in R: pnorm(q, mean, sd) returns P(x < q).
# pnorm Help
?pnorm
Example:
Find P(x < 18) when the mean is 15 and the standard deviation is 2.
$$\frac{x - \mu}{\sigma} < \frac{18 - 15}{2}$$
$$z < 1.5$$
$$P(z < 1.5) = 0.9332$$
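The table lookup can be reproduced in R. A minimal sketch that computes the probability both from the original scale and from the standardised value z = 1.5:

```r
# P(x < 18) with mean 15 and standard deviation 2
p_less <- pnorm(q = 18, mean = 15, sd = 2)
p_less

# Same probability via the standard normal: P(z < 1.5)
pnorm(1.5)
```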
Example:
Find P(x > 18) when the mean is 15 and the standard deviation is 2.
1 − P (x < 18)
= 1 − 0.9331928
= 0.0668072
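In R the upper tail can be computed either by subtraction or by setting `lower.tail = FALSE`. A short sketch of both:

```r
# P(x > 18) = 1 - P(x < 18)
p_sub <- 1 - pnorm(q = 18, mean = 15, sd = 2)
p_sub

# Equivalent: request the upper tail directly
p_tail <- pnorm(q = 18, mean = 15, sd = 2, lower.tail = FALSE)
p_tail
```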
Example:
Find P(970000 < x < 1060000) when the mean is 1000000 and the standard
deviation is 30000.
# P(x < 1060000)
P1 <- pnorm(q = 1060000, mean = 1000000, sd = 30000, lower.tail = TRUE)
# P(x < 970000)
P2 <- pnorm(q = 970000, mean = 1000000, sd = 30000, lower.tail = TRUE)
# P(970000 < x < 1060000) = P(x < 1060000) - P(x < 970000)
P <- P1 - P2
P
H1 : μ > 80
Sample Mean = 83
Standard Deviation = 8
sd = 65
mu_0 = 170
n = 400
x_bar = 178
The owner of a shop wants to draw a conclusion about the shop's annual income
rate. He suspects that, compared with previous years, the annual income rate has
declined to less than 5%. He tests this at the 5% significance level. The standard
deviation of the annual income rate for the last 16 years is 0.1%. The population
mean is 5%, and the sample mean is 4.962%.
sd = 0.1
mu_0 = 5
n = 16
x_bar = 4.962
print("Reject the H0")
} else {
print("Failed to reject H0")
}
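Only the end of the decision rule survives in these notes. A minimal sketch of the full test for the shop example, assuming a one-sided (lower-tailed) z-test with H0: μ = 5 and H1: μ < 5; the names `alpha`, `z` and `z_crit` are assumptions, not taken from the original:

```r
# Values from the shop example
sd    <- 0.1     # standard deviation (%)
mu_0  <- 5       # hypothesised population mean (%)
n     <- 16      # sample size
x_bar <- 4.962   # sample mean (%)
alpha <- 0.05    # significance level (assumed name)

# Test statistic: z = (x_bar - mu_0) / (sd / sqrt(n))
z <- (x_bar - mu_0) / (sd / sqrt(n))
z

# Lower-tailed test: reject H0 when z falls below the critical value
z_crit <- qnorm(alpha)
z_crit

if (z < z_crit) {
  print("Reject the H0")
} else {
  print("Failed to reject H0")
}
```

For an upper-tailed alternative such as H1 : μ > 80, the comparison would instead be `z > qnorm(1 - alpha)`.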