Analysis Multivariate Statistics

Homework 2
January 29, 2021
Exercise 1
The data in the following table, taken from Anderson (1984), correspond to the increase in hours of sleep due
to the use of two medications A and B. The experiment was performed on 10 patients. It is assumed that
the data follow a bivariate normal distribution. Find the maximum likelihood estimators of the mean and
variance-covariance matrix and the correlations matrix. Do medications have similar effects on patients?
Patient MEDICATION A MEDICATION B

1 1.9 0.7
2 0.8 -1.6
3 1.1 -0.2
4 0.1 -1.2
5 -0.1 -0.1
6 4.4 3.4
7 5.5 3.7
8 1.6 0.8
9 4.6 0.0
10 3.4 2.0
Sol.
Let’s note that ¨ ˛
1.9 0.7
˚ 0.8 ´1.6‹
˚ ‹
˚ 1.1 ´0.2‹
˚ ‹
˚ 0.1 ´1.2‹
˚ ‹
˚´0.1 ´0.1‹
x“˚
˚ 4.4
‹ p“2 n “ 10
˚ 3.4 ‹
‹
˚ 5.5 3.7 ‹
˚ ‹
˚ 1.6 0.8 ‹
˚ ‹
˝ 4.6 0.0 ‚
3.4 2.0
Let’s calculate the sample mean
10 ˆ ˙
1 ÿ 2.33
x̄ “ xi “
10 i“1 0.75
The value was found using the following R code
x = matrix ( c ( 1 . 9 , 0 . 8 , 1 . 1 , 0 . 1 , −0.1 , 4 . 4 , 5 . 5 , 1 . 6 , 4 . 6 ,
3 . 4 , 0 . 7 , −1.6 , −0.2 , −1.2 , −0.1 , 3 . 4 , 3 . 7 , 0 . 8 , 0 . 0 , 2 . 0 ) , n c o l = 2 )
colMeans ( x )
Let’s calculate
n
1 ÿ 1
S“ pxi ´ x̄qpxi ´ x̄q
n ´ 1 i“1
Using the following R command
1
x = matrix ( c ( 1 . 9 , 0 . 8 , 1 . 1 , 0 . 1 , −0.1 , 4 . 4 , 5 . 5 , 1 . 6 , 4 . 6 ,
3 . 4 , 0 . 7 , −1.6 , −0.2 , −1.2 , −0.1 , 3 . 4 , 3 . 7 , 0 . 8 , 0 . 0 , 2 . 0 ) , n c o l = 2 )
var ( x )
We found ˆ ˙
4.009000 2.848333
S“
2.848333 3.200556
Now, let’s calculate the correlation matrix using the following R command
x = matrix ( c ( 1 . 9 , 0 . 8 , 1 . 1 , 0 . 1 , −0.1 , 4 . 4 , 5 . 5 , 1 . 6 , 4 . 6 ,
3 . 4 , 0 . 7 , −1.6 , −0.2 , −1.2 , −0.1 , 3 . 4 , 3 . 7 , 0 . 8 , 0 . 0 , 2 . 0 ) , n c o l = 2 )
cor (x)
So, we have that ˆ ˙

1.0000000 0.7951702
R“
0.7951702 1.0000000
Analyzing the difference in the mean of A and B medications which is 1.58 we can conclude that since is a big
value then do not have similar effects. Moreover, their variance is also different.
Exercise 2
A biologist takes measurements of the skulls of two species of mice. Specifically, he observes three variables Y1,
Y2 and Y3 in a set of mice of which 50 are of species A, and the remaining 60 are of species B. Let us suppose
that the data are ordered in such a way that the first 50 (in the matrix Y) are those of species A and the last
60 are those of species B.
a) Let us denote by YA the matrix of data observed in species A. If
YA1 150 “ p25.5, 14.1, 11.3q1
and ¨ ˛
40.2 10.9 15.6
YA1 YA “ ˝10.9 13.7 14.5‚
15.6 14.5 20.1
Calculate the mean vector and covariance matrix for this species.
Sol.
Let’s calculate the mean vector ȲA
¨ ˛ ¨ř50 ˛
25.5 i“1 Yi1
YA1 150 “ ˝14.1‚ “ ˝ 50
ř
Yi2 ‚
ři“1
50
11.3 i“1 Yi3
Then ¨ ˛ ¨ ˛ ¨ ˛
Ȳ1A 25.5 0.51
1
ȲA “ ˝Ȳ2A ‚ “ ˝14.1‚ “ ˝0.282‚
50
Ȳ3A 11.3 0.226
Now, let’s calculate the covariance matrix SA
¨ ˛ ¨ 2 ˛
40.2 10.9 15.6 ÿ50 Yi1 Yi1 Yi2 Yi1 Yi3
YA1 YA “ ˝10.9 13.7 14.5‚ “ ˝Yi2 Yi1 Yi22 Yi2 Yi3 ‚
15.6 14.5 20.1 i“1 Yi3 Yi1 Yi3 Yi2 Yi32
Finally, we have that

1 “ 1 ‰
SA “ YA YA ´ 50ȲA ȲA1
50 ´ 1
Using the following R code
2
Yam <− matrix ( c ( 0 . 5 1 , 0 . 2 8 2 , 0 . 2 2 6 ) , n c o l =1)
YtY <− matrix ( c ( 4 0 . 2 , 1 0 . 9 , 1 5 . 6 , 1 0 . 9 , 1 3 . 7 , 1 4 . 5 , 1 5 . 6 , 1 4 . 5 , 2 0 . 1 ) , n c o l =3)
Sa <− 1/49 ∗ (YtY−50∗Yam%∗%t (Yam) )
Sa
We get the following ¨ ˛

0.55500000 0.07569388 0.2007551
SA “ ˝0.07569388 0.19844490 0.2308857‚
0.20075510 0.23088571 0.3580857
b) Let us denote by YB the matrix of data observed in species B. If
YB1 160 “ p25.5, 14.1, 11.3q1
and ¨ ˛
50.7 32.6 24.8
YB1 YB “ ˝32.6 29.0 12.6‚
24.8 12.6 35.8
Calculate the mean vector and covariance matrix for this species.
Sol. Let’s calculate the mean vector ȲB
¨ ˛ ¨ř60 ˛
25.5 i“1 Yi1
YB1 160 “ ˝14.1‚ “ ˝ 60
ř
Yi2 ‚
ři“1
60
11.3 i“1 Yi3
Then ¨ ˛ ¨ ˛ ¨ ˛
Ȳ1B 25.5 0.452
1 ˝
ȲB “ ˝Ȳ2B ‚ “ 14.1‚ “ ˝ 0.235 ‚
60
Ȳ3B 11.3 0.1883
Now, let’s calculate the covariance matrix SB
¨ ˛
50.7 32.6 24.8
YB1 YB “ ˝32.6 29.0 12.6‚
24.8 12.6 35.8
Finally, we have that

1 “ 1 ‰
SA “ YB YB ´ 60ȲB ȲB1
60 ´ 1
YBm <− matrix ( c ( 0 . 4 2 5 , 0 . 2 3 5 , 0 . 1 8 8 3 ) , n c o l =1)
YtY <− matrix ( c ( 5 0 . 7 , 3 2 . 6 , 2 4 . 8 , 3 2 . 6 , 2 9 . 0 , 1 2 . 6 , 2 4 . 8 , 1 2 . 6 , 3 5 . 8 ) , n c o l =3)
SB <− 1/59 ∗ (YtY−60∗YBm%∗%t (YBm) )
SB
We get the following ¨ ˛

0.6756356 0.4509746 0.3389551
SB “ ˝0.4509746 0.4353644 0.1685588‚
0.3389551 0.1685588 0.5707218
c) Calculate the mean vector and covariance matrix for all mice.
Sol. Let’s calculate the mean vector for all mice
»¨ ˛ ¨ ˛fi ¨ ˛
25.5 25.5 0.4636364
1 –˝
Ȳ “ 14.1‚` ˝14.1‚fl “ ˝ 0.2563636 ‚
110
11.3 11.3 0.20555455
3
and the covariance for all mices
“ ‰
Y 1 Y “ YA1 YA ` YB1 YB
»¨ ˛ ¨ ˛fi
40.2 10.9 15.6 50.7 32.6 24.8
“ –˝10.9 13.7 14.5‚` ˝32.6 29.0 12.6‚fl
15.6 14.5 20.1 24.8 12.6 35.8
¨ ˛
90.9 43.5 40.4
“ ˝43.5 42.7 27.1‚
40.4 27.1 55.9
So, for the covariance matrix S we have that

1 “ 1 ‰
S“ Y Y ´ 110Ȳ Ȳ 1
109

YBm <− matrix ( c ( 0 . 4 2 5 , 0 . 2 3 5 , 0 . 1 8 8 3 ) , n c o l =1)
YtY <− matrix ( c ( 5 0 . 7 , 3 2 . 6 , 2 4 . 8 , 3 2 . 6 , 2 9 . 0 , 1 2 . 6 , 2 4 . 8 , 1 2 . 6 , 3 5 . 8 ) , n c o l =3)
SB <− 1/59 ∗ (YtY−60∗YBm%∗%t (YBm) )
SB
We get ¨ ˛
0.6170142 0.2791326 0.2745121
˝0.2791326 0.3254178 0.1954696‚
0.2745121 0.1954696 0.4702452
Exercise 3
i) What is the geometric shape of the next equation?
px ´ µq1 Σ´1 px ´ µq “ c2 (0.1)
Sol.
We know that the hyperellipsoid is the extension of an elipse of more than 2 dimmensions. So, the equation
of a p-dimensional hyperellipsoid is px ´ x̄q1 S ´1 px ´ x̄ “ c2 with base in S ´1 and centered in x̄. It contains
a proportions of the observations x1 , x2 , ..., xn in the sample. Let’s notice that in (0.1) S is replaced by Σ,
then the c2 could be determined by a chi-square distribution.
ii) Plot (use R): px ´ µq1 Σ´1 px ´ µq “ c2, with c “ 1 and:
ˆ ˙
1 1 0
a) µ “ p1, 1q; Σ “
0 2
Sol.
Let’s compute ˆ ˙ ˆ ˙
1 2 0 1 0
Σ´1 “ “ 1
2 0 1 0 2
Then, (0.1) becomes
ˆ ˙ˆ ˙
1 0 x´1
ppx ´ 1qpy ´ 1qq “1
0 21 y´1
ˆ ˙
x´1
ppx ´ 1qpy ´ 1qq “1
0.5py ´ 1q
px ´ 1q2 ` 0.5py ´ 1q2 “ 1
ˆ ˙
x
where x “ Now, with the following code
y
4
e1 <− f u n c t i o n ( x , y ) { ( x − 1 ) ∗ ∗ 2 + ( ( 0 . 5 ) ∗ ( y −1)∗∗2) −1}
x <− s e q ( −5 ,5 , l e n g t h =1000)
y<− s e q ( −5 ,5 , l e n g t h =1000)
z<− o u t e r ( x , y , e1 )
contour (x , y , z , l e v e l s = 0)
we get the plot
ˆ ˙
1 0
b) µ1 “ p0, 0q; Σ “
0 1
Sol.
Let’s compute ˆ ˙ ˆ ˙
1 1 0 1 0
Σ´1 “ “
1 0 1 0 1
Then, (0.1) becomes
ˆ ˙ˆ ˙
` ˘ 1 0 x
x y “1
0 1 y
ˆ ˙
` ˘ x
x y “1
y
x2 ` y 2 “ 1
ˆ ˙
x
where x “ Now, with the following code
y
e2 <− f u n c t i o n ( x , y ) { ( x ∗∗2)+( y∗∗2) −1}
x <− s e q ( − 1 . 5 , 1 . 5 , l e n g t h =1000)
y<− s e q ( − 1 . 5 , 1 . 5 , l e n g t h =1000)
z<− o u t e r ( x , y , e2 )
We get the plot
5
Figure 1: Plot 3b
ˆ ˙
1 1 ´1
c) µ “ p0, 0q; Σ “
´1 2
Sol.
First let’s find the inverse of Σ ˆ ˙ ˆ ˙
´1 1 2 1 2 1
Σ “ “
2´1 1 1 1 2
Then
ˆ ˙ˆ ˙
` ˘ 2 1 x
x y “1
1 2 y
ˆ ˙
` ˘ 2x ` 4
x y “1
x ` 2y
2x2 ` 2xy ` 2y 2 “ 1
Now, we plot this equation and get:

e3 <− f u n c t i o n ( x , y ) { ( 2 ∗ ( x ∗∗2))+(2∗ x∗y )+(2∗( y ∗∗2)) −1}
x <− s e q ( − 2 . 5 , 2 . 5 , l e n g t h =1000)
y<− s e q ( −2 ,2 , l e n g t h =1000)
z<− o u t e r ( x , y , e3 )
Figure 2: Plot 3c
ˆ ˙
1 2 1
d) µ “ p0, 0q; Σ “
1 2
First, let’s find the inverse of Σ
6
ˆ ˙ ˆ ˙
1 2 ´1 1 2 ´1
Σ´1 “ “
4´1 ´1 2 3 ´1 2
Then
ˆ ˙ˆ ˙
1 2 ´1 x
px, 4q “1
3 ´1 2 4
ˆ ˙
1 2x ´ 4
px, 4q “1
3 ´x ` 2y
1
p2x2 ´ xy ´ xy ` 2y 2 q “ 1
3
p2x2 ´ 2xy ` 2y 2 q “ 3
Finally, we plot this equation and get:

e4 <− f u n c t i o n ( x , y ) { ( 2 ∗ ( x ∗∗2) −(2∗( y ∗∗2)) −3}
x <− s e q ( − 2 . 5 , 2 . 5 , l e n g t h =1000)
y<− s e q ( −2 ,2 , l e n g t h =1000)
z<− o u t e r ( x , y , e4 )
Figure 3: Plot 3d
Exercise 4.
Let’s consider the sample: ¨ ˛
2 6 ´3
˚´4 8 7‹
A“˚ ‹
˝´2 9 7‚
´7 8 2
From N3 pµ, Σq with µ and Σ unknown

i) Calculate the maximum likelihood estimator of µ
Sol.
Code in R
7
shapiro . test (y)
r<−mean ( y [ , 1 ] )
s<−mean ( y [ , 2 ] )
t<−mean ( y [ , 3 ] )
u <− data . frame ( r , s , t ) ;
u
Then :
pµq1 “ Y “ p´2.75; 7.75; 3.25q
ii) Calculate a unbiased estimator of Σ

Sol.
Code in R
sigma<−round ( cov ( y ) , 3 ) ;
sigma
round ( c o r ( y ) , 3 )
n <−l e n g t h ( y [ , 1 ] ) ;
( ( n−1)/n ) ∗ sigma
Results: ¨ ˛
10.68750 ´2.18775 ´6.56250
˝´2.18775 1.18725 4.06275 ‚
´6.56250 4.06275 17.18775
iii) Calculate the sample variance-covariance matrix.

Result.
sigma<−round ( cov ( y ) , 3 ) ;
sigma
¨ ˛
14.250 ´2.917 ´8.750
˝´2.917 1.583 5.417 ‚
´8.750 5.417 22.917
Exercise 5.
Let X be a random p dimensional vector of mean and matrix of variances covariances I (matrix identity). Given
a square matrix of order p, A, consider Y “ X 1 AX and proof that EpY q “ trapAq ` µ1 Aµ Sol.
We have:
ErX 1 AXs “ trapAq ` µ1 ` Aµ

ErpX ´ µq1 Apx ´ µqs “ ErX 1 AXs ´ ErpX 1 ´ Aµq ´ Epµ1 AXq ` Epµ1 Auq
ErpX ´ µq1 Apx ´ µqs “ ErX 1 AXs ´ µ1 Aµ
From there, we get:
ErX 1 AXs “ ErX 1 AXs ` µ1 Aµ

ErX 1 AXs “ ErpX ´ µ1 qApX ´ µqs ` µ1 Aµ
ErX 1 AXs “ ErtrapX ´ µ1 qApX ´ µqs ` µ1 Aµ
ErX 1 AXs “ ErtraApX ´ µ1 qApX ´ µqs ` µ1 Aµ
ErX 1 AXs “ traErApX ´ µ1 qpX ´ µqs ` µ1 Aµ
ErX 1 AXs “ trapAqErpX ´ µ1 qApX ´ µqs ` µ1 Aµ
8
ErX 1 AXs “ trapAqΣ ` µ1 Aµ
Σ “ I, therefore
ErX 1 AXs “ trapAq ` µ1 Aµ

ErY s “ trapAq ` µ1 Aµ
Exercise 6.
If:
y “ Σ´1{2 px ´ µq
proof that
px ´ µq1 Σ´1 px ´ µq “ y 1 y
Then,
y 1 “ rΣ´1{2 px ´ µqs1
y 1 “ pΣ´1{2 q1 px ´ µq1
y 1 “ px ´ µq1 Σ´1{2
Therefore,
y 1 y “ px ´ µq1 Σ´1{2 Σ´1{2 px ´ µq
Finally,
y 1 y “ px ´ µq1 Σ´1 px ´ µq
Exercise 7.
Do the data represented in the following multiple scatter diagram come from a multivariate normal distribution?
Figure 4: Plots for supposed multivariate normal distribution.
The first graph plotted on each row of Figure 1. shows the bivariate normal density. Based on this analysis,
we can infer from Figure 1. that any of the graphs showed on each row follows a multiple normal distribution
as is shown in Figure 2. Therefore, the data in not normally distributed.
9
Figure 5: Bivariate normal density graph for multivariate normal distributions.
10

Analysis Multivariate Statistics

Uploaded by

Copyright:

Available Formats

Analysis Multivariate Statistics

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Analysis Multivariate Statistics

Uploaded by

Copyright:

Available Formats

Homework 2

January 29, 2021

Patient MEDICATION A MEDICATION B

So, we have that ˆ ˙

a) Let us denote by YA the matrix of data observed in species A. If

YA1 150 “ p25.5, 14.1, 11.3q1

Finally, we have that

We get the following ¨ ˛

b) Let us denote by YB the matrix of data observed in species B. If

YB1 160 “ p25.5, 14.1, 11.3q1

Finally, we have that

We get the following ¨ ˛

So, for the covariance matrix S we have that

Using the following R code

px ´ µq1 Σ´1 px ´ µq “ c2 (0.1)

we get the plot

We get the plot

Now, we plot this equation and get:

Finally, we plot this equation and get:

From N3 pµ, Σq with µ and Σ unknown

pµq1 “ Y “ p´2.75; 7.75; 3.25q

ii) Calculate a unbiased estimator of Σ

iii) Calculate the sample variance-covariance matrix.

ErX 1 AXs “ trapAq ` µ1 ` Aµ

ErX 1 AXs “ ErX 1 AXs ` µ1 Aµ

ErX 1 AXs “ trapAq ` µ1 Aµ

Figure 4: Plots for supposed multivariate normal distribution.

You might also like