Stats Assign
Stats Assign
Stats Assign
Assignment 03
Instructions:
This assignment consists of 4 questions. All questions carry 5 points each. Attempt
all questions.
Refer to the questions and related data with them. You are required to solve these
questions in an Excel document. Create a separate sheet for each question in a
single Excel document.
Name the Excel document as your Student ID (e.g. 2020PBM5000) ONLY.
Upload the Excel sheet and TURN IN your assignment for evaluation.
-----------------------------------------------------------------------------------
-----------------------------------------------------
Q1. An analyst is studying the relationship between shopping centre traffic and a
department store’s daily sales. The analyst develops an index to measure the daily
volume if traffic entering the shopping centre and an index of daily sales. The
following table shows the index values for 10 randomly selected days:
Traffic index X
Sales Index Y
80
6
100
8
100
8
110
9
130
10
190
15
200
16
85
5
100
8
110
8
Using both formula and in-built Data Analysis Toolpack, compute following in a
SINGLE Excel sheet:
a. Compute cross-product.
b. Compute co-variance.
c. Compute coefficient of correlation.
Q2. Use the data from Club_Visit.xlsx from the sessions on Hypothesis Testing and
answer the following:
Using the inbuilt-function, compute the z-statistics for each value of “Visits”.
Use the sample mean and standard deviation of the Visits for the computation of z-
values.
Using the inbuilt-function, compute the p-values for each value of “Visits”. Use
the sample mean and standard deviation of the Visits for the computation of z-
values.
Q3. Using the data from Club_Visit.xlsx from the sessions on Hypothesis Testing and
answer the following:
Conduct an Independent sample t-test (assuming equal variances) using variables -
“Visits” and “Will_buy”.
State the Null and Alternate Hypothesis for the analysis.
Highlight the cell containing p-value of the test with Green color if the null
hypothesis is to be rejected and with Red color if the test results fail to reject
the null hypothesis.
Q4. Using the data from Club_Visit.xlsx from the sessions on Hypothesis Testing and
answer the following:
Conduct Paired sample t-test (assuming equal variances) using variables - “Pre-
Campaign” and “Post-Campaign”.
State the Null and Alternate Hypothesis for the analysis.
Highlight the cell containing p-value of the test with Green color if the null
hypothesis is to be rejected and with Red color if the test results fail to reject
the null hypothesis.
Student ID:
2020PBM5188
Student Name:
PRAGATI AGARWAL
Q1. A manager recorded the number of days employees were out sick from the office
during the last year. For n=7 employees, the counts were 0, 0, 4, 7, 11, 13, and 35
days.
a. What are the units of average and standard deviation?
[1 pts]
AVERAGE=10
STANDARD DEVIATION=125.71
b. Are the units of average and standard deviation different from (answer in Yes/
No): [2 pts]
Median
[ Yes / No ] YES
Range
[ Yes / No ]YES
Interquartile range
[ Yes / No ] YES
Variance
[ Yes / No ]YES
c. Find [2
pts]
Average
10
Median
7
Mode
0
Range
35
A jar contains 3 red marbles, 7 green marbles and 10 white marbles. If a marble is
drawn from the jar at random, what is the probability that this marble is white?
TOTAL MARBLES=20
WHITE MARBLES=10
P(W)=10/20=½=0.5
Two dice are rolled, find the probability that the sum is less than 13.
MAXIMUM SUM=12 WHICH IS LESS THAN 12
P=1
A card is randomly drawn from a pack of 52 cards. What is the probability that the
card is a number 3 heart card
P(3 OF HEART)=1/13*13/52=1/52
Three dice are rolled together. What is the probability of getting at least one
'2’?
P(AT LEAST 2)=91/216
Q3. Read the analysis case below and report all relevant information that you would
like to include in the report of the results.
[15 pts.]
A luxury real estate builder wished to analyze if all his three market segments –
Premium, High Networth (HNI) and Ultra High Networth (UHNI) segments, are equally
profitable. The builder used the profits (in Rs. Lakhs) from deals conducted over
one quarter as sample data for the analysis using SPSS. The test results are as
below.
Case Processing Summary
Cases
Valid
Missing
Total
N
Percent
N
Percent
N
Percent
Profit
16
100.0%
0
0.0%
16
100.0%
Descriptives
Statistic
Std. Error
Profit
Mean
759.8125
95.97544
95% Confidence Interval for Mean
Lower Bound
555.2457
Upper Bound
964.3793
5% Trimmed Mean
751.6806
Median
694.0000
Variance
147380.563
Std. Deviation
383.90176
Minimum
154.00
Maximum
1512.00
Range
1358.00
Interquartile Range
622.00
Skewness
.323
.564
Kurtosis
-.547
1.091
Sample-wise Descriptive
N
Mean
Std. Deviation
Std. Error
95% Confidence Interval
for Mean
Minimum
Maximum
Lower
Upper
Premium
7
449.428
215.502
81.452
250.123
648.734
154.00
770.00
HNI
5
970.600
263.949
118.042
642.864
1298.337
667.00
1219.00
UHNI
4
1039.500
384.691
192.345
427.371
1651.629
675.00
1512.00
Total
16
759.812
383.902
95.975
555.246
964.379
154.00
1512.00
2.046
2
13
.169
ANOVA
Sum of Squares
df
Mean Square
F
Sig.
Between Groups
1209424.523
2
604712.262
7.851
.006
Within Groups
1001283.914
13
77021.840
Total
2210708.438
15
Robust Tests of Equality of Means
Statistica
df1
df2
Sig.
Welch
7.921
2
6.379
.019
Brown-Forsythe
6.537
2
7.141
.024
a. Asymptotically F distributed.
Post Hoc Tests - Multiple Comparisons
Tukey HSD
Dependent Variable: Profit
(I) Segment
(J) Segment
Mean Diff. (I-J)
Std. Err.
Sig.
95% Confidence Interval
Lower
Upper
Premium
High Networth
-521.171
162.504
.018
-950.253
-92.090
Ultra High Network
-590.071
173.950
.012
-1049.375
-130.768
HNI
Premium
521.171
162.504
.018
92.090
950.253
Ultra High Network
-68.900
186.172
.928
-560.474
422.674
UHNI
Premium
590.071
173.950
.012
130.768
1049.375
High Networth
68.900
186.172
.928
-422.674
560.474
ALPHA=5%=0.05