Assignment Y1: (I) Median

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 8

Assignment Y1

Question 1

(i) Median
n<-1000
p<-0.05
qbinom(0.5,n,p)

## [1] 50

(ii) Probability that the number of deaths D lies between 45 and 59


inclusive
#(a) exactly
pbinom(59,n,p)-pbinom(45,n,p)

## [1] 0.6523035

#(b) using a Poisson approximation


# Since Bin(n,p) is approximately Pois(np), we have a Poisson Distribution
with mean 50
ppois(59,n*p)-ppois(45,n*p)

## [1] 0.6408685

#(c) using a normal approximation


#Since Bin(n,p) is approximately N(np, npq), we have
pnorm(59,50,sqrt(47.5)) - pnorm(45, 50, sqrt(47.5))

## [1] 0.6701195
Question 2

(i) (a) Simulated values


set.seed(13)
u<-runif(1000,0,1)
u

(i) (b)
Rearranging u=F ( x ) to get x=F−1 (u)
1 1
u=1− becomes x= −1
1+ x 1−u
x<- (1-u)^(-1)-1
x

(ii) (a)
plot(density(x),xlim=c(0,200), main="Empirical PDF of the simulations",
xlab="x", col="blue")
(ii) (a)
mean(x)

## [1] 14.54969

sd(x)

## [1] 199.7557

skew<-sum((x-mean(x))^3)/length(x)

skew/(sd(x)^3)

## [1] 27.72327

The huge standard deviation for such a small mean indicates that we have a very long tail
(as the values must be greater than zero).
This is confirmed by the very large positive coefficient of skewness.
Question 3

(i) Width of as exact 95% confidence interval


n<-20
x<-10
test<-binom.test(x,n,conf=0.95)
test$conf[2]-test$conf[1]

## [1] 0.4560843

Hence the width is approximately45.6.

(ii) Showing that the greatest width occurs when there are 10
successes
width<-rep(0,20)
for (i in 1:20)
{test<-binom.test(i,20); width[i]<-test$conf[2]-test$conf[1]}

width

## [1] 0.2474677 0.3046342 0.3468559 0.3792800 0.4044744 0.4238576 0.4382793


## [8] 0.4482674 0.4541440 0.4560843 0.4541440 0.4482674 0.4382793 0.4238576
## [15] 0.4044744 0.3792800 0.3468559 0.3046342 0.2474677 0.1684335

max(width)

## [1] 0.4560843

By examining the widths, we see that the greatest width (0.4560843) occurs for 10
successes.
Question 4

(i) 99% Confidence interval for the exponential parameter


data<-c(14, 4, 3, 2, 3, 1, 5, 10, 4, 23)
n<-length(data)
bm<-rep(0,1000)

set.seed(17)
for (i in 1:1000)
{y<-sample(data,replace=TRUE);bm[i]<-mean(y)}
ci<-quantile(bm,c(0.005,0.995))

Therefore a 99 % confidence interval for λ is given by:


1/ci

## 0.5% 99.5%
## 0.37043897 0.07517949

This is (0.07518, 0.3704).

(ii) (a) Simulating 1000 sample means from Exp(0.145)


l<-0.145
xbar<-rep(0,1000)
set.seed(19)
for (i in 1:1000)
{x<-rexp(10,l);xbar[i]<-mean(x)}

(ii) (b) Histogram


hist(xbar, prob=TRUE, main="Densities of the sample means from Exp(0.145)",
xlab="Sample means")
(ii) (c) Probability that empirical mean is less than 5
length(xbar[xbar<5])/length(xbar)

## [1] 0.183

This gives an answer of 0.183.

(iii) Gamma
Using X ∼ Gamma(n , nλ), we have X ∼ Gamma(10,1.45)
xvals<-seq(0,20,by=0.01)
hist(xbar, prob=TRUE, main="Densities of the sample means from Exp(0.145)",
xlab="Sample means")
lines(xvals, dgamma(xvals, 10, 1.45), type="l", col="blue")
The histogram has
a similar shape to the gamma distribution.

(iv) Exact probability


pgamma(5,10,1.45)

## [1] 0.1957324

This gives an answer of 0.19573. There is about a 7% difference in the answers so they’re
not that close. # (v) (a) Simulation
set.seed(21)
sim<-rgamma(1000,10,1.45)

(v) (b) QQ plot


qqplot(sim, xbar, xlab="Gamma quantiles", ylab="Sample mean quantiles")
abline(0,1,col="red", lty=2, lwd=2)
The fit appears fairly good in the middle. The lower end sample means are slightly higher
than expected - so we have a lighter lower tail. The upper end sample means are much
lower than expected - so we have a lighter upper tail except for a handful of extremely large
sample mean values. So it’s not a very good fit, possibly because of the sample size not
being large enough.

You might also like