MYLab 3
MYLab 3
MYLab 3
12 August 2016
Question 1
yawn_expt <- data.frame(group=c(rep("treatment", 34), rep("control", 16)),
yawn=c(rep("yes", 10), rep("no", 24), rep("yes", 4), rep("no", 12)))
library(dplyr)
##
## Attaching package: 'dplyr'
library(tidyr)
library(knitr)
yawn_expt %>%
group_by(group, yawn) %>%
tally() %>%
ungroup() %>%
spread(yawn, n, fill=0) %>%
mutate(total = rowSums(.[-1])) %>%
mutate(proportion=yes/total) %>%
kable()
b. The participants were randomly assigned to two groups. 16 people were assigned to
the control group where there wasn't a person yawning near them and 34 people were
assigned to the treatment group where a person near them yawned.
c. The two variables are: i) which group they belong to (control or treatment) and ii) the
outcome depending on whether they yawned or not (yes or no).
d. 25% of the control group yawned and roughly 29.41% of the treatment group yawned.
Question 2
a. The null hypothesis is the hypothesis that the proportion of people yawning in the
control group and the treatment group is the same.
b. The alternative hypothesis would be the hypothesis that the proportion of people
yawning in the treatment group is greater than the proportion of people yawning in
the control group i.e. : < .
Question 3
yawn_expt <- data.frame(group=c(rep("treatment", 34), rep("control", 16)),
yawn=c(rep("yes", 10), rep("no", 24), rep("yes", 4), rep("no", 12)))
yawn_expt %>%
group_by(group, yawn) %>%
tally() %>%
ungroup() %>%
spread(yawn, n, fill=0) %>%
mutate(total = rowSums(.[-1])) %>%
mutate(p=yes/total)->yawn.table
expt.diff<-yawn.table$p[2]-yawn.table$p[1]
set.seed(2)
permutation <-function(data){
data%>%
mutate(yawn=sample(yawn)) %>%
group_by(group, yawn) %>%
tally() %>%
ungroup() %>%
spread(yawn, n, fill=0) %>%
mutate(total = rowSums(.[-1])) %>%
mutate(p=yes/total)->yawn.table
expt.diff<-yawn.table$p[2]-yawn.table$p[1]
return(expt.diff)
}
permutation(yawn_expt) #difference between proportions of treatment and contr
ol groups
## [1] 0.1360294
Question 4
prop.difference<-vector(length=10000)
set.seed(2)
for (i in 1:10000) {
prop.difference[i] <- permutation(yawn_expt)
}
library(ggplot2)
prop.diff<-data.frame(prop.difference)
ggplot(data=prop.diff)+geom_histogram(aes(x=prop.difference), binwidth=0.025)
+geom_vline(xintercept=expt.diff,colour="red")+labs(x="Permutation test diffe
rences")
sum(prop.difference>expt.diff)
## [1] 2549
sum(prop.difference>expt.diff)/length(prop.difference)
## [1] 0.2549
above=filter(prop.diff,prop.difference>expt.diff)
ggplot()+geom_histogram(data=prop.diff, aes(x=prop.difference),binwidth=0.025
)+geom_histogram(data=above,aes(x=prop.difference), fill="red",binwidth=0.025
)+geom_vline(xintercept=expt.diff,colour="red")
The results indicate that 25.49% of the time, the permuted data yields a difference larger
than the difference of the actual data.
Question 5
a. p-value = 0.2549
c. We conclude that there is insufficient evidence to reject the null hypothesis that the
proportion of people yawning in the control group is the same as the proportion of
people yawning in the treatment group.