CS1B044
i.
a. pbeta(0.8,5,1)-pbeta(0.2,5,1)
[1] 0.32736
b. qbeta(0.65,5,1)
[1] 0.9174506
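As a cross-check, a Beta(5, 1) variable has density 5x^4 on [0, 1], so its CDF is x^5 and both answers above can be reproduced in closed form. A minimal sketch (the variable names are illustrative and not part of the original answer):

```r
# Beta(5, 1): density 5 * x^4 on [0, 1], hence CDF F(x) = x^5.
prob_closed <- 0.8^5 - 0.2^5                        # P(0.2 < X < 0.8)
prob_pbeta  <- pbeta(0.8, 5, 1) - pbeta(0.2, 5, 1)  # same probability via pbeta

# The 65th percentile q solves q^5 = 0.65.
q_closed <- 0.65^(1 / 5)
q_qbeta  <- qbeta(0.65, 5, 1)

# Both comparisons should report agreement up to floating-point tolerance.
all.equal(prob_closed, prob_pbeta)
all.equal(q_closed, q_qbeta)
```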
ii.
> alpha1<-5
> beta1<-1
>
> alpha2<-1
> beta2<-5
>
> alpha3<-3
> beta3<-3
> skew1<-2*((beta1-alpha1)/(alpha1+beta1+2))*sqrt((alpha1+beta1+1)/(alpha1*beta1))
> skew1
[1] -1.183216
>
> skew2<-2*((beta2-alpha2)/(alpha2+beta2+2))*sqrt((alpha2+beta2+1)/(alpha2*beta2))
> skew2
[1] 1.183216
>
> skew3<-2*((beta3-alpha3)/(alpha3+beta3+2))*sqrt((alpha3+beta3+1)/(alpha3*beta3))
> skew3
[1] 0
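The three calculations above apply one formula to different parameters, so it can be wrapped in a small function. This is only a sketch: `beta_skew` is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: coefficient of skewness of a Beta(a, b) distribution,
# using the same formula as skew1, skew2 and skew3 above.
beta_skew <- function(a, b) {
  2 * (b - a) / (a + b + 2) * sqrt((a + b + 1) / (a * b))
}

# Vectorised over the three scenarios: (5, 1), (1, 5) and (3, 3).
skews <- beta_skew(c(5, 1, 3), c(1, 5, 3))
round(skews, 6)
```

Because the arithmetic is vectorised, one call covers all three scenarios and avoids copy-paste slips between the three expressions.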
iii.
> set.seed(421967)
> x1<-rbeta(12000,alpha1,beta1)
> head(x1)
[1] 0.9488977 0.9572279 0.9720574 0.5234291 0.9041575 0.6089291
> x2<-rbeta(12000,alpha2,beta2)
> head(x2)
[1] 0.002203991 0.028198505 0.172158009 0.122690885 0.043582907 0.533057382
> x3<-rbeta(12000,alpha3,beta3)
> head(x3)
[1] 0.6647891 0.6581145 0.3857413 0.4688010 0.2719120 0.5816465
>
> hist(x1,probability = T)
> hist(x2,probability = T)
> hist(x3,probability = T)
iv.
Under scenario 1, the coefficient of skewness is negative, i.e. -1.183216, and we can see in the
histogram that the data is negatively skewed.
Under scenario 2, the coefficient of skewness is positive, i.e. 1.183216, and we can see in the
histogram that the data is positively skewed.
Under scenario 3, the coefficient of skewness is 0, and we can see in the histogram that the
data is symmetrical and hence not skewed to either side.
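The visual comparison can also be backed by a number: the sample coefficient of skewness of each simulated data set should sit close to the theoretical value. The sketch below assumes the moment-based estimator m3 / m2^(3/2); `sample_skew` is a hypothetical helper, since base R has no built-in skewness function.

```r
# Hypothetical helper: moment-based sample coefficient of skewness.
sample_skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Regenerate the three samples in the same order and with the same seed as in part iii.
set.seed(421967)
x1 <- rbeta(12000, 5, 1)
x2 <- rbeta(12000, 1, 5)
x3 <- rbeta(12000, 3, 3)

# Compare with the theoretical values -1.183216, 1.183216 and 0.
c(sample_skew(x1), sample_skew(x2), sample_skew(x3))
```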
v.
set.seed(421967)
x1<-rbeta(12000,alpha1,beta1)
x1_bar <- mean(x1);x1_bar
set.seed(421967)
x2<-rbeta(12000,alpha2,beta2)
x2_bar <- mean(x2);x2_bar
set.seed(421967)
x3<-rbeta(12000,alpha3,beta3)
x3_bar <- mean(x3);x3_bar
vi.
As the number of simulations increases, the histogram of each sample settles towards its
underlying beta density, so scenarios 1 and 2 remain skewed. It is the sample means, not the
individual values, that tend to a normal distribution as per the central limit theorem.
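The central limit theorem concerns the distribution of sample means, which can be illustrated by simulating many means from the most skewed scenario. This is a rough sketch only: the number of repetitions `n_rep` and the sample size `n` are assumed values chosen for illustration, not taken from the question.

```r
set.seed(421967)   # same seed as above, reused for convenience
n_rep <- 2000      # assumed number of repeated samples
n     <- 50        # assumed size of each sample

# Mean of each of n_rep independent samples from Beta(5, 1).
means1 <- replicate(n_rep, mean(rbeta(n, 5, 1)))

# Theoretical mean and standard deviation of Beta(5, 1).
mu    <- 5 / (5 + 1)
sigma <- sqrt(5 * 1 / ((5 + 1)^2 * (5 + 1 + 1)))

# Histogram of the sample means with the normal curve suggested by the CLT.
hist(means1, probability = TRUE, main = "Sample means, Beta(5, 1)")
curve(dnorm(x, mean = mu, sd = sigma / sqrt(n)), add = TRUE)
```

The histogram of `means1` should look roughly bell-shaped even though the individual Beta(5, 1) values are strongly negatively skewed.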
Q2.
i. Y=alpha+beta*x
Here alpha = 100
Beta =1
ii. Plot
x=c(28, 37, 41, 52, 57, 49, 38, 25, 23, 48, 60, 55, 29, 43, 36, 50, 34, 40, 26, 33)
y=c(132, 140, 155, 160, 167, 148, 128, 131, 118, 139, 149, 154, 117, 146, 142,
168, 144, 156, 114, 133)
iii.
plot(x,y,main = "Age and Blood Pressure",xlab = "age",ylab = "sys BP",col="red")
abline(lm(y~x))
iv.
> fit<-lm(y~x)
> fit
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x
     96.499        1.133
> anova(fit)
Analysis of Variance Table
Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)
x          1 3082.9 3082.94  33.591 1.717e-05 ***
Residuals 18 1652.0   91.78
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
v.
> summary(fit)
Call:
lm(formula = y ~ x)
Residuals:
    Min      1Q  Median      3Q     Max
-15.485  -6.504   1.177   5.979  14.846
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  96.4994     8.1460  11.846 6.21e-10 ***
x             1.1331     0.1955   5.796 1.72e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Fitted alpha = 96.4994
Fitted beta = 1.1331
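The fitted values can be read directly from the model object instead of being copied from the printed summary. The sketch below refits the model on the data from part ii and also checks that, with a single explanatory variable, the ANOVA F statistic equals the square of the slope's t statistic. The names `t_slope` and `f_stat` are illustrative.

```r
# Data from part ii.
x <- c(28, 37, 41, 52, 57, 49, 38, 25, 23, 48, 60, 55, 29, 43, 36, 50, 34, 40, 26, 33)
y <- c(132, 140, 155, 160, 167, 148, 128, 131, 118, 139, 149, 154, 117, 146, 142,
       168, 144, 156, 114, 133)

fit <- lm(y ~ x)

# Named vector holding the fitted intercept (alpha) and slope (beta).
coef(fit)

# In simple linear regression the F statistic is the squared t statistic of the slope.
t_slope <- summary(fit)$coefficients["x", "t value"]
f_stat  <- anova(fit)["x", "F value"]
all.equal(t_slope^2, f_stat)
```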