Assessment 2


CONFIDENTIAL 1 CS/JAN 2023/STA404

UNIVERSITI TEKNOLOGI MARA


ASSESSMENT 2 - TEST

COURSE : STATISTICS FOR BUSINESS AND SOCIAL SCIENCES
COURSE CODE : STA404
EXAMINATION : JAN 2023
TIME : 2 HOURS

INSTRUCTIONS TO CANDIDATES

1. This question paper consists of SIX (6) questions.

2. Candidates are given 2 hours to complete and submit this assessment via email or any
suitable platform in the form of a Word/PDF file.

3. Answer ALL parts of the questions on A4 paper. Start each answer on a new page.

4. Please check to make sure that this examination pack consists of:

i) the Question Paper
ii) a two-page Appendix 1

5. Answer ALL questions in English.

NAME:

STUDENT NO:
2 0

GROUP:
NACAB1D

PLEASE SUBMIT THIS ASSESSMENT BY THE REQUIRED TIME


This assessment paper consists of 8 printed pages

QUESTION 1

The director of a government agency heard that its financial department receives an
average of 6 complaints from customers per week. To address the problem, he assigned his
secretary to collect data to see whether he needs to replace the supervisor of that department.
The director will replace the supervisor if the actual mean number of complaints about the
financial department is greater than 6 per week. The secretary gathered data over the next 12
weeks and found that the mean number of weekly complaints about the financial
department was 7, with a variance of 3.25.

a) Determine an appropriate statistical analysis to be used in this study.


(1 mark)
b) Calculate the t-statistic for this study.
(2 marks)
c) At the 5% significance level, should the director replace the department
supervisor? Show the relevant steps.
(5 marks)

QUESTION 2

A professor at a local university wishes to determine whether there is a significant difference in the
average final examination marks between the students who took his STA404 course online
and those who took it face-to-face. Fifteen students were randomly selected from each group and
their final examination marks were recorded. He analysed the data using IBM SPSS and the
results are as follows.

Independent Samples Test
Mark
                                                   Equal variances   Equal variances
                                                   assumed           not assumed
Levene's Test for       F                          2.041
Equality of Variances   Sig.                       .164
t-test for Equality     t                          -1.524            -1.524
of Means                df                         W                 24.625
                        Sig. (2-tailed)            .139              .140
                        Mean Difference            -6.42000          -6.42000
                        Std. Error Difference      4.21320           4.21320
                        95% Confidence Interval of the Difference
                          Lower                    X                 -15.10390
                          Upper                    Y                 2.26390

a) Are the variances of the two populations equal? Use α = 0.05.
(3 marks)
b) Find the value of W.
(1 mark)
c) Calculate the values of X and Y.
(4 marks)
d) Based on the confidence interval obtained in c), is there any evidence that the
average final examination mark for students who took the online class is different from
that of the face-to-face class?
(2 marks)

QUESTION 3

A new study program is introduced in Faculty A. To assess the effectiveness of this program, 9
students are randomly selected to undergo it. If the new program is effective, the
number of hours the students spend on their assignments per week will increase. The
table below shows the number of hours each student spends on assignments per week
before and after the new program is introduced.

Before the new program   15    9    6   12    7   10   18   13    3
After the new program    20    9    9   17    6   15   21   22    2

The SPSS output for the above information is as follows:


Pair 1: before - after
Paired Differences   Mean                           -3.111
                     Std. Deviation                 3.333
                     Std. Error Mean                1.111
                     95% Confidence Interval of the Difference
                       Lower                        -5.673
                       Upper                        -.549
t                                                   -2.800
df                                                  8
Sig. (2-tailed)                                     .023

Using the p-value in the SPSS output, test at the 10% significance level whether the new
study program is effective.
(5 marks)


QUESTION 4

A grocery chain wants to know if the three types of advertisements affect the mean sales
differently. They used each type of advertisement at four different randomly selected stores for
a month and measured the sales (RM’000) for each store at the end of the month. The results
are as follows.

Descriptives
Sales
Advertisement   Statistic
Type 1          Mean              11.5000
                Std. Deviation    3.41565
                Sum               46.00
Type 2          Mean              10.0000
                Std. Deviation    3.26599
                Sum               40.00
Type 3          Mean              7.5000
                Std. Deviation    2.51661
                Sum               30.00

ANOVA
Sales
                 Sum of Squares   df   Mean Square   F    Sig.
Between Groups   A                2    16.333        D    .235
Within Groups    86.000           9    C
Total            B                11

a) Using the sum of squares between groups formula, calculate the value of A.
(3 marks)
b) Compute the values of B, C and D.
(3 marks)
c) State the null and alternative hypotheses for the above study.
(1 mark)
d) Using the p-value method, is there any evidence that the types of advertisements
affect the mean sales? Test at α = 0.01.
(3 marks)


QUESTION 5

The lecturers of the Mathematical Science Department at University M intended to study the
association between stress levels and the hours of online lessons in a week among
accounting students. A questionnaire designed to assess stress levels was administered
to the respondents of the study. Their responses on stress levels were categorized
into low, medium and high levels. The students were also asked to state the number of hours
of their online lessons each week, according to the following categories: less than 16 hours, 16
to 18 hours, 19 to 21 hours and more than 21 hours. The data were collected and the results
are as follows.

Hours of Online Lessons * Stress Levels Crosstabulation

                                                Stress Levels
Hours of Online Lessons                     Low     Medium   High    Total
Less than 16 hours     Count                17      71       18      106
                       Expected Count       15.4    69.4     21.2    106.0
16 - 18 hours          Count                18      92       37      147
                       Expected Count       21.3    S        29.5    147.0
19 - 21 hours          Count                22      97       28      147
                       Expected Count       21.3    96.2     29.5    147.0
More than 21 hours     Count                T       60       15      89
                       Expected Count       12.9    58.2     17.8    89.0
Total                  Count                71      320      98      489
                       Expected Count       71.0    320.0    98.0    489.0

Chi-Square Tests
                               Value    df
Pearson Chi-Square             4.032    U
Likelihood Ratio               3.963    6
Linear-by-Linear Association   .148     1
N of Valid Cases               489

a) Give a reason for conducting the Chi-Square Test of Independence for the above study.
(1 mark)
b) Calculate the value of S using the expected value formula.
(1 mark)
c) Calculate the values of T and U.
(2 marks)
d) State the null and alternative hypotheses for the above study.
(1 mark)
e) At the 10% significance level, is there sufficient evidence to conclude that the stress
level is associated with the hours of online lessons in a week among the accounting
students?
(4 marks)


QUESTION 6

A study was conducted to investigate the influence of the fathers’ height on the sons’ height. The heights
(cm) of a random sample of fathers and sons were recorded and analysed by using IBM SPSS. The
following results were obtained from the bivariate analysis.

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .446   .199       .065                6.071

Coefficients
                                Unstandardized Coefficients   Standardized Coefficients
Model                           B         Std. Error          Beta                        t       Sig.
1   (Constant)                  96.281    60.053                                          1.603   .160
    Heights of fathers (cm)     .432      .354                .446                        1.220   .268

Answer the following questions based on the above output.


a) Name the independent and dependent variables involved in this study.
(2 marks)
b) State the correlation coefficient value. Hence, interpret the relationship between the
variables.
(2 marks)
c) Write the least squares regression equation.
(1 mark)
d) Based on the equation in c), comment on the slope value in the context of the above study.
(1 mark)
e) Predict the height of a son if the height of his father is 192 cm.
(2 marks)

END OF QUESTION PAPER


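CONFIDENCE INTERVAL

Parameter and description, followed by the $(1 - \alpha)100\%$ confidence interval:

Mean $\mu$, for large samples, $\sigma^2$ unknown:
$\bar{x} \pm z_{\alpha/2} \dfrac{s}{\sqrt{n}}$

Mean $\mu$, for small samples, $\sigma^2$ unknown:
$\bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}}$; df = n – 1

Difference in means of two normal distributions, $\mu_1 - \mu_2$, $\sigma_1^2 = \sigma_2^2$ and unknown:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \, s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$; df = n1 + n2 – 2
$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

Difference in means of two normal distributions, $\mu_1 - \mu_2$, $\sigma_1^2 \neq \sigma_2^2$ and unknown:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$;
$\text{df} = \dfrac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}}$

Mean difference of two normal distributions for paired samples, $\mu_d$:
$\bar{d} \pm t_{\alpha/2} \dfrac{s_d}{\sqrt{n}}$; df = n – 1, where n is no. of pairs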
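
As an illustration of the paired-sample interval, the sketch below recomputes the 95% confidence interval reported in the Question 3 SPSS output from the raw before/after hours. It is a minimal sketch only: the function name `paired_mean_ci` is hypothetical, and the critical value 2.306 (the t-table value for α/2 = 0.025 with df = 8) is supplied by hand as an assumed input rather than computed.

```python
import math
import statistics


def paired_mean_ci(before, after, t_crit):
    """Return (lower, upper) for d_bar ± t_crit * s_d / sqrt(n)."""
    # d = before - after, matching the "before - after" pairing in the output
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


before = [15, 9, 6, 12, 7, 10, 18, 13, 3]
after = [20, 9, 9, 17, 6, 15, 21, 22, 2]

# Assumed input: t_crit = 2.306 is read from a t table (alpha/2 = 0.025, df = 8)
lower, upper = paired_mean_ci(before, after, t_crit=2.306)
print(round(lower, 3), round(upper, 3))  # -5.673 -0.549
```

The printed limits agree with the Lower and Upper values in the SPSS output, which suggests the output uses the same formula.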


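HYPOTHESIS TESTING

Null hypothesis, followed by its test statistic:

$H_0: \mu = \mu_0$; $\sigma^2$ unknown, large samples:
$z = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$

$H_0: \mu = \mu_0$; $\sigma^2$ unknown, small samples:
$t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$; df = n – 1

$H_0: \mu_1 - \mu_2 = 0$; $\sigma_1^2 = \sigma_2^2$ and unknown:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$; df = n1 + n2 – 2
$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

$H_0: \mu_1 - \mu_2 = 0$; $\sigma_1^2 \neq \sigma_2^2$ and unknown:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$;
$\text{df} = \dfrac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2 / n_2 \right)^2}{n_2 - 1}}$

$H_0: \mu_d = 0$:
$t = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$; df = n – 1, where n is no. of pairs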
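
To show how the small-sample statistic for a single mean is evaluated from summary figures, here is a minimal sketch. The function name `one_sample_t` and the numbers passed to it are hypothetical and are not taken from the question paper. If a variance is reported instead of a standard deviation, take its square root to obtain s first.

```python
import math


def one_sample_t(x_bar, mu_0, s, n):
    """Return (t, df) for t = (x_bar - mu_0) / (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)
    return (x_bar - mu_0) / standard_error, n - 1


# Hypothetical summary statistics, chosen only for illustration
t_stat, df = one_sample_t(x_bar=52.0, mu_0=50.0, s=4.0, n=16)
print(t_stat, df)  # 2.0 15
```

The resulting t would then be compared with the t-table critical value for the chosen significance level and df.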

Hypothesis for categorical data:
$\chi^2 = \sum \dfrac{(o_{ij} - e_{ij})^2}{e_{ij}}$
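
The chi-square statistic can be sketched in code as follows. This is an illustration under stated assumptions: the function name `chi_square_statistic` and the 2 × 2 table are hypothetical. The expected counts use the usual rule e_ij = (row total × column total) / grand total and the degrees of freedom use (rows − 1)(columns − 1); neither rule is printed in this appendix.

```python
def chi_square_statistic(observed):
    """Return (chi2, df, expected) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count for each cell: row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum (o - e)^2 / e over every cell of the table
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical 2 x 2 table, not taken from the question paper
chi2, df, expected = chi_square_statistic([[10, 20], [30, 40]])
print(round(chi2, 4), df)  # 0.7937 1
```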


ANALYSIS OF VARIANCE FOR A COMPLETELY RANDOMIZED DESIGN

Let:
$k$ = the number of different samples (or treatments)
$n_i$ = the size of sample i
$T_i$ = the sum of the values in sample i
$n$ = the number of values in all samples
    = $n_1 + n_2 + n_3 + \dots$
$\sum x$ = the sum of the values in all samples
    = $T_1 + T_2 + T_3 + \dots$
$\sum x^2$ = the sum of the squares of values in all samples

Degrees of freedom for the numerator = k – 1

Degrees of freedom for the denominator = n – k

Total sum of squares: $SST = \sum x^2 - \dfrac{(\sum x)^2}{n}$

Between-samples sum of squares:
$SSB = \left( \dfrac{T_1^2}{n_1} + \dfrac{T_2^2}{n_2} + \dfrac{T_3^2}{n_3} + \dots \right) - \dfrac{(\sum x)^2}{n}$

Within-samples sum of squares: $SSW = SST - SSB$

Variance between samples: $MSB = \dfrac{SSB}{k - 1}$

Variance within samples: $MSW = \dfrac{SSW}{n - k}$

Test statistic for a one-way ANOVA test: $F = \dfrac{MSB}{MSW}$
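
The ANOVA quantities can be chained together as in the sketch below. It is a minimal illustration only: the function name `one_way_anova` and the three small samples are hypothetical and are not the data from Question 4.

```python
def one_way_anova(samples):
    """Compute SST, SSB, SSW, MSB, MSW and F from a list of samples."""
    k = len(samples)                                  # number of samples
    n = sum(len(s) for s in samples)                  # number of values overall
    sum_x = sum(sum(s) for s in samples)              # sum of all values
    sum_x2 = sum(x * x for s in samples for x in s)   # sum of squared values

    correction = sum_x ** 2 / n                       # (sum x)^2 / n
    sst = sum_x2 - correction
    ssb = sum(sum(s) ** 2 / len(s) for s in samples) - correction  # sum T_i^2/n_i
    ssw = sst - ssb

    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return {"SST": sst, "SSB": ssb, "SSW": ssw,
            "MSB": msb, "MSW": msw, "F": msb / msw}


# Hypothetical samples, chosen so the arithmetic is easy to check by hand
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result["SSB"], result["SSW"], result["F"])  # 26.0 6.0 13.0
```

Computing SSW as SST − SSB mirrors the formula sheet; summing squared deviations within each sample would give the same value.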


SIMPLE LINEAR REGRESSION

Sum of squares of xy, xx, and yy:

$SS_{xy} = \sum xy - \dfrac{(\sum x)(\sum y)}{n}$

$SS_{xx} = \sum x^2 - \dfrac{(\sum x)^2}{n}$ and $SS_{yy} = \sum y^2 - \dfrac{(\sum y)^2}{n}$

Least Squares Regression Line:

$Y = a + bx$

Least Squares Estimates of a and b:

$b = \dfrac{SS_{xy}}{SS_{xx}}$ and $a = \bar{y} - b\bar{x}$

Total sum of squares: $SST = \sum y^2 - \dfrac{(\sum y)^2}{n}$

Linear correlation coefficient: $r = \dfrac{SS_{xy}}{\sqrt{SS_{xx} \, SS_{yy}}}$
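
A short sketch of how the regression formulas fit together is given below. The function name `least_squares_fit` and the five data points are hypothetical and are not the fathers' and sons' heights from Question 6, whose raw data are not given.

```python
import math


def least_squares_fit(xs, ys):
    """Return (a, b, r) for the line Y = a + bx using the SS formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)

    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n

    b = ss_xy / ss_xx                  # slope
    a = sum_y / n - b * sum_x / n      # intercept: y_bar - b * x_bar
    r = ss_xy / math.sqrt(ss_xx * ss_yy)
    return a, b, r


# Hypothetical data, chosen only for illustration
a, b, r = least_squares_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(a, 3), round(b, 3), round(r, 4))  # 2.2 0.6 0.7746

# A prediction substitutes a value of x into Y = a + bx
print(round(a + b * 6, 3))  # 5.8
```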
