Descriptive Statistics, Cross-Tabulation, and Hypothesis Testing
Variable
(In)dependent Variables
• Research studies indicate that successful new
product development has an influence on the stock
market price of the company. That is, the more
successful the new product turns out to be, the
higher will be the stock market price of that firm.
• Cross‐cultural research indicates that managerial
values govern the power distance between
superiors and subordinates.
Moderators
• Moderating variable
– A moderator is a qualitative (e.g., gender, race, class)
or quantitative (e.g., level of reward) variable that
affects the direction and/or strength of the relation
between the independent and dependent variables.
• Example
Mediating Variable
• Mediating variable
– surfaces between the time the
independent variables start operating
to influence the dependent variable
and the time their impact is felt on it.
• Example
Causal relationship
• Increase in X → increase in Y
• Assumption: X is cause of Y
Hypothesis
• Good hypothesis:
– Must be adequate for its purpose
– Must be testable
Exercise
[Figure: diagram relating service quality, customer switching, and switching cost]
Exercise
Argumentation
• A store manager observes that the morale of employees in her
supermarket is low. She thinks that if their working conditions
are improved, pay scales raised, and the vacation benefits
made more attractive, the morale will be boosted. She doubts,
however, if an increase in pay scales would raise the morale of
all employees. Her conjecture is that those who have
supplemental incomes will just not be “turned on” by higher
pay, and only those without side incomes will be happy with
increased pay, with a resultant boost in morale.
• List and label the variables in this situation.
• Develop a few hypotheses.
Levels of Measurement
• Nominal: classification
• Ordinal: classification, order
• Interval: classification, order, distance
• Ratio: classification, order, distance, natural origin
Internet Usage Data
Columns, in order: Respondent Number; Sex; Familiarity; Internet Usage; Attitude Toward Internet; Attitude Toward Technology; Usage of Internet for Shopping; Usage of Internet for Banking
1 1.00 7.00 14.00 7.00 6.00 1.00 1.00
2 2.00 2.00 2.00 3.00 3.00 2.00 2.00
3 2.00 3.00 3.00 4.00 3.00 1.00 2.00
4 2.00 3.00 3.00 7.00 5.00 1.00 2.00
5 1.00 7.00 13.00 7.00 7.00 1.00 1.00
6 2.00 4.00 6.00 5.00 4.00 1.00 2.00
7 2.00 2.00 2.00 4.00 5.00 2.00 2.00
8 2.00 3.00 6.00 5.00 4.00 2.00 2.00
9 2.00 3.00 6.00 6.00 4.00 1.00 2.00
10 1.00 9.00 15.00 7.00 6.00 1.00 2.00
11 2.00 4.00 3.00 4.00 3.00 2.00 2.00
12 2.00 5.00 4.00 6.00 4.00 2.00 2.00
13 1.00 6.00 9.00 6.00 5.00 2.00 1.00
14 1.00 6.00 8.00 3.00 2.00 2.00 2.00
15 1.00 6.00 5.00 5.00 4.00 1.00 2.00
16 2.00 4.00 3.00 4.00 3.00 2.00 2.00
17 1.00 6.00 9.00 5.00 3.00 1.00 1.00
18 1.00 4.00 4.00 5.00 4.00 1.00 2.00
19 1.00 7.00 14.00 6.00 6.00 1.00 1.00
20 2.00 6.00 6.00 6.00 4.00 2.00 2.00
21 1.00 6.00 9.00 4.00 2.00 2.00 2.00
22 1.00 5.00 5.00 5.00 4.00 2.00 1.00
23 2.00 3.00 2.00 4.00 2.00 2.00 2.00
24 1.00 7.00 15.00 6.00 6.00 1.00 1.00
25 2.00 6.00 6.00 5.00 3.00 1.00 2.00
26 1.00 6.00 13.00 6.00 6.00 1.00 1.00
27 2.00 5.00 4.00 5.00 5.00 1.00 1.00
28 2.00 4.00 2.00 3.00 2.00 2.00 2.00
29 1.00 4.00 4.00 5.00 3.00 1.00 2.00
30 1.00 3.00 3.00 7.00 5.00 1.00 2.00
Frequency Distribution of Familiarity
with the Internet
Value label | Value | Frequency (N) | Percentage | Valid percentage | Cumulative percentage
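The columns of this frequency table can be rebuilt from the Familiarity column of the Internet Usage Data. The following is a minimal sketch, not the output of any particular statistics package. It assumes that the value 9 recorded for respondent 10 is a missing-value code rather than a rating, which is consistent with the 7-point scale and the n = 29 used in the one-sample t test; the variable names are illustrative.

```python
from collections import Counter

# Familiarity ratings for respondents 1-30, copied from the Internet Usage Data table.
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]

MISSING = 9  # assumption: 9 is a missing-value code, not a rating

counts = Counter(familiarity)
n_total = len(familiarity)
n_valid = n_total - counts[MISSING]

rows = []          # (value, frequency, percentage, valid percentage, cumulative percentage)
cumulative = 0.0
for value in sorted(v for v in counts if v != MISSING):
    pct = 100 * counts[value] / n_total          # share of all 30 respondents
    valid_pct = 100 * counts[value] / n_valid    # share of the valid responses only
    cumulative += valid_pct
    rows.append((value, counts[value], pct, valid_pct, cumulative))

for value, freq, pct, valid_pct, cum in rows:
    print(f"{value:>5} {freq:>9} {pct:>10.1f} {valid_pct:>10.1f} {cum:>10.1f}")
```

The percentage column is based on all respondents, while the valid and cumulative percentages exclude the missing case.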
Frequency Histogram
[Figure: histogram of Familiarity (horizontal axis, values 2 to 7) against Frequency (vertical axis, 0 to 8)]
Statistics Associated with Frequency
Distribution Measures of Location
• The mean, or average value, is the most commonly
used measure of central tendency. The mean, $\bar{X}$, is
given by

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

where
$X_i$ = observed values of the variable $X$
$n$ = number of observations (sample size)
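As a quick check of the formula, the sketch below applies it to the Familiarity ratings from the Internet Usage Data. It assumes that the 9 recorded for respondent 10 is a missing-value code and drops it; the median and mode are included only for comparison, using Python's standard `statistics` module.

```python
from statistics import median, mode

# Familiarity ratings for respondents 1-30, copied from the Internet Usage Data table.
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]

# Assumption: 9 is a missing-value code, so only the other 29 ratings are valid.
valid = [x for x in familiarity if x != 9]

n = len(valid)
x_bar = sum(valid) / n      # the mean: sum of X_i divided by n

print(f"n = {n}, mean = {x_bar:.3f}")
print(f"median = {median(valid)}, mode = {mode(valid)}")
```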
Statistics Associated with Frequency
Distribution Measures of Variability
[Figure (a): skewed distribution with the mean, median, and mode marked]
Type II Error
• Type II error occurs when, based on the sample
results, the null hypothesis is not rejected when it is in
fact false.
• The probability of type II error is denoted by β.
• Unlike α, which is specified by the researcher, the
magnitude of β depends on the actual value of
the population parameter (proportion).
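To illustrate how β changes with the true population proportion, the sketch below computes β for a one-tailed test of a proportion under a normal approximation. The null value of 0.40, n = 30, and α = 0.05 are taken from the worked example in these slides; the true proportions tried and the function name `type_ii_error` are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(pi_true, pi_null=0.40, n=30, alpha=0.05):
    """Approximate beta for H0: pi <= pi_null against H1: pi > pi_null."""
    z = NormalDist()  # standard normal distribution
    # Smallest sample proportion that leads to rejecting H0.
    sigma_null = sqrt(pi_null * (1 - pi_null) / n)
    cutoff = pi_null + z.inv_cdf(1 - alpha) * sigma_null
    # Probability of staying below the cutoff when the true proportion is pi_true.
    sigma_true = sqrt(pi_true * (1 - pi_true) / n)
    return z.cdf((cutoff - pi_true) / sigma_true)

# Hypothetical true proportions, chosen only to show the trend.
betas = {pi: type_ii_error(pi) for pi in (0.45, 0.50, 0.60, 0.70)}
for pi, beta in betas.items():
    print(f"true proportion {pi:.2f}: beta = {beta:.3f}")
```

The further the true proportion lies above the null value, the smaller β becomes.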
Probability of z with a One-Tailed Test
[Figure: standard normal curve; the shaded area to the left of z = 1.88 is 0.9699 and the unshaded area to the right is 0.0301]
A General Procedure for Hypothesis Testing
Step 4: Collect Data and Calculate Test Statistic
• The required data are collected and the
value of the test statistic computed.
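• In our example, the value of the sample
proportion is
$\hat{p} = 17/30 = 0.567$.
• The value of $\sigma_p$ can be determined as
follows, where $\pi = 0.40$ is the proportion specified under the null hypothesis:

$$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{(0.40)(0.60)}{30}} = 0.089$$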
A General Procedure for Hypothesis Testing
Step 4: Collect Data and Calculate Test Statistic
$$z = \frac{\hat{p} - \pi}{\sigma_p} = \frac{0.567 - 0.40}{0.089} = 1.88$$
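The arithmetic of Step 4 can be reproduced with a short sketch. Note that the slides round the sample proportion and its standard error to three decimals before dividing, which gives 1.88; carrying full precision gives a slightly smaller value. The variable names are illustrative.

```python
from math import sqrt

n = 30            # sample size
successes = 17    # respondents counted in the numerator of the sample proportion
pi_0 = 0.40       # proportion specified under the null hypothesis

p_hat = successes / n                       # sample proportion
sigma_p = sqrt(pi_0 * (1 - pi_0) / n)       # standard error under the null hypothesis

z_full = (p_hat - pi_0) / sigma_p                           # no intermediate rounding
z_slides = (round(p_hat, 3) - pi_0) / round(sigma_p, 3)     # rounding as in the slides

print(f"p_hat = {p_hat:.3f}, sigma_p = {sigma_p:.3f}")
print(f"z (rounded inputs) = {z_slides:.2f}, z (full precision) = {z_full:.2f}")
```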
A General Procedure for Hypothesis Testing
Step 5: Determine the Probability
(Critical Value)
• Using standard normal tables, the probability of obtaining a z value
of 1.88 can be calculated.
• The shaded area between −∞ and 1.88 is 0.9699. Therefore, the
area to the right of z = 1.88 is 1.0000 − 0.9699 = 0.0301.
• Alternatively, the critical value of z, which will give an area to the
right side of the critical value of 0.05, is between 1.64 and 1.65 and
equals 1.645.
• Note, in determining the critical value of the test statistic, the area
to the right of the critical value is either α (one-tailed test) or α/2 (two-tailed test).
A General Procedure for Hypothesis Testing
Steps 6 & 7: Compare the Probability
(Critical Value) and Make the Decision
• If the probability associated with the calculated or
observed value of the test statistic (TS_CAL) is less than
the level of significance (α), the null hypothesis is
rejected.
• The probability associated with the calculated or
observed value of the test statistic is 0.0301. This is
the probability of getting a sample proportion of 0.567 or larger when π =
0.40. This is less than the level of significance of 0.05.
Hence, the null hypothesis is rejected.
• Alternatively, if the calculated value of the test
statistic is greater than the critical value of the test
statistic (TS_CR), the null hypothesis is rejected. Here, 1.88 exceeds 1.645,
so the null hypothesis is again rejected.
A Broad Classification of
Hypothesis Tests
• Tests of association
• Tests of differences
– Distributions
– Means
– Proportions
– Median/Rankings
Cross-Tabulation
Gender and Internet Usage
                      Gender
Internet Usage    Male    Female    Row Total
Light (1)            5        10           15
Heavy (2)           10         5           15
Column Total        15        15           30
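The cell counts in this table can be reproduced from the raw Internet Usage Data. The sketch below rests on two assumptions that are not stated in the slides but do reproduce the counts: Sex is coded 1 = male and 2 = female, and an Internet Usage value of 5 or less is classified as light usage. It also computes column percentages, that is, usage within each gender.

```python
from collections import Counter

# Sex and Internet Usage for respondents 1-30, copied from the data table.
sex = [1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 1, 1,
       2, 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 2, 2, 1, 1]
usage = [14, 2, 3, 3, 13, 6, 2, 6, 6, 15, 3, 4, 9, 8, 5,
         3, 9, 4, 14, 6, 9, 5, 2, 15, 6, 13, 4, 2, 4, 3]

SEX_LABELS = {1: "Male", 2: "Female"}   # assumed coding
LIGHT_MAX = 5                           # assumed cutoff: values <= 5 count as "Light"

def usage_group(value):
    """Classify a raw usage value as Light or Heavy."""
    return "Light" if value <= LIGHT_MAX else "Heavy"

# Cell counts keyed by (usage group, gender).
cells = Counter((usage_group(u), SEX_LABELS[s]) for s, u in zip(sex, usage))
column_totals = Counter(SEX_LABELS[s] for s in sex)

# Column percentages: share of each usage group within a gender.
column_pct = {cell: 100 * count / column_totals[cell[1]]
              for cell, count in cells.items()}

for group in ("Light", "Heavy"):
    print(group, cells[(group, "Male")], cells[(group, "Female")])
```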
Internet Usage by Gender
Gender
Gender by Internet Usage
Internet Usage
Introduction of a Third Variable in
Cross-Tabulation
Original Two Variables
                                        Sex
Purchase of                Male                     Female
Fashion Clothing     Married   Not Married    Married   Not Married
High                   35%        40%           25%        60%
Ownership of Expensive Automobiles by Education Level
No 68% 79%
                                       Income
Own Expensive            Low Income                  High Income
Automobile         College     No College      College     No College
                   Degree      Degree          Degree      Degree
Three Variables Cross-Tabulation
Reveal Suppressed Association
Desire to Travel Abroad by Age
No 50% 50%
Desire to Travel Abroad by
Age and Gender
                                 Sex
Desire to Travel         Male              Female
Abroad                   Age                Age
                     <45    >=45       <45    >=45
Eating Frequently in
Fast-Food Restaurants by Family Size
Eat Frequently in            Family Size
Fast-Food Restaurants      Small    Large
No                          35%      35%
Eating Frequently in Fast-Food Restaurants
by Family Size and Income
                                        Income
Eat Frequently in               Low                  High
Fast-Food Restaurants       Family size          Family size
                           Small    Large       Small    Large
Yes                         65%      65%         65%      65%
No                          35%      35%         35%      35%
Column totals              100%     100%        100%     100%
Number of respondents       250      250         250      250
Statistics Associated with
Cross-Tabulation Chi-Square
• To determine whether a systematic association exists, the
probability of obtaining a value of chi-square as large or larger
than the one calculated from the cross-tabulation is estimated.
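Chi-square Distribution
[Figure: chi-square distribution with χ² on the horizontal axis; the "Do Not Reject H0" region lies to the left of the critical value and the "Reject H0" region to its right]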
Statistics Associated with
Cross-Tabulation Chi-Square
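The expected frequency for each cell is

$$f_e = \frac{n_r n_c}{n}$$

where $n_r$ is the total number in the row, $n_c$ is the total number in the column, and $n$ is the total sample size. For the gender and Internet usage table, every row and column total is 15, so each cell has the same expected frequency:

$$f_e = \frac{15 \times 15}{30} = 7.50$$

Then the value of $\chi^2$ is calculated as follows, where $f_o$ is the observed frequency in a cell:

$$\chi^2 = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e}$$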
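For the Internet usage data, the value of $\chi^2$ is
calculated as:

$$\chi^2 = \frac{(5-7.5)^2}{7.5} + \frac{(10-7.5)^2}{7.5} + \frac{(10-7.5)^2}{7.5} + \frac{(5-7.5)^2}{7.5}$$

$$= 0.833 + 0.833 + 0.833 + 0.833 = 3.333$$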
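The same calculation can be sketched in code from the observed counts in the Gender and Internet Usage table. The probability step uses the identity that, for one degree of freedom, the upper-tail chi-square probability equals `erfc(sqrt(x / 2))`; this shortcut is an illustrative choice and does not apply to larger tables.

```python
from math import erfc, sqrt

# Observed counts: rows are Light and Heavy usage, columns are Male and Female.
observed = [[5, 10],
            [10, 5]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, f_o in enumerate(row):
        f_e = row_totals[i] * col_totals[j] / n     # expected frequency n_r * n_c / n
        chi_square += (f_o - f_e) ** 2 / f_e

df = (len(row_totals) - 1) * (len(col_totals) - 1)  # degrees of freedom

# Upper-tail probability; this closed form is valid only when df == 1.
p_value = erfc(sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.4f}")
```

With one degree of freedom the tabled critical value at the 0.05 level is 3.841, so a chi-square of 3.333 does not reach significance at that level.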
Hypothesis Testing Related to Differences
• Parametric tests assume that the variables of interest are measured
on at least an interval scale.
• Nonparametric tests assume that the variables are measured on a
nominal or ordinal scale.
• These tests can be further classified based on whether one or two
or more samples are involved.
• The samples are independent if they are drawn randomly from
different populations. For the purpose of analysis, data pertaining
to different groups of respondents, e.g., males and females, are
generally treated as independent samples.
• The samples are paired when the data for the two samples relate
to the same group of respondents.
A Classification of Hypothesis Testing
Procedures for Examining Differences
Hypothesis Tests
One Sample : t Test
For the Internet usage data, suppose we wanted to test
the hypothesis that the mean familiarity rating exceeds
4.0, the neutral value on a 7-point scale. A significance
level of α = 0.05 is selected. The hypotheses may be
formulated as:

$$H_0: \mu \le 4.0$$
$$H_1: \mu > 4.0$$

$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}$$

$$s_{\bar{X}} = \frac{s}{\sqrt{n}} = \frac{1.579}{\sqrt{29}} = \frac{1.579}{5.385} = 0.293$$

$$t = \frac{4.724 - 4.0}{0.293} = \frac{0.724}{0.293} = 2.471$$
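The one-sample t statistic can be recomputed directly from the Familiarity ratings. This sketch assumes that the 9 recorded for respondent 10 is a missing-value code, which leaves the 29 observations used above. The critical value of 1.701 is the standard tabled one-tailed value for α = 0.05 and 28 degrees of freedom. Without intermediate rounding the statistic comes out at about 2.47, marginally below the 2.471 obtained from rounded inputs.

```python
from math import sqrt
from statistics import mean, stdev

# Familiarity ratings for respondents 1-30, copied from the Internet Usage Data table.
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]
ratings = [x for x in familiarity if x != 9]   # assumption: 9 marks a missing response

mu_0 = 4.0            # hypothesized mean (neutral point of the scale)
T_CRITICAL = 1.701    # tabled one-tailed value, alpha = 0.05, df = 28

n = len(ratings)
x_bar = mean(ratings)
s = stdev(ratings)              # sample standard deviation (n - 1 in the denominator)
std_error = s / sqrt(n)         # standard error of the mean
t = (x_bar - mu_0) / std_error

reject_null = t > T_CRITICAL
print(f"mean = {x_bar:.3f}, s = {s:.3f}, t = {t:.3f}, reject H0: {reject_null}")
```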
Two Independent Samples Means
• In the case of means for two independent samples, the
hypotheses take the following form.
H0: µ1 = µ2
H1: µ1 ≠ µ2
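As an illustration of testing these hypotheses, the sketch below compares mean Internet Usage between the two Sex groups in the Internet Usage Data. The slides do not state which formula to use; the pooled-variance t statistic shown here is one common choice and assumes equal population variances. The coding 1 = male, 2 = female is also an assumption.

```python
from math import sqrt
from statistics import mean, variance

# Sex and Internet Usage for respondents 1-30, copied from the data table.
sex = [1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 1, 1,
       2, 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 2, 2, 1, 1]
usage = [14, 2, 3, 3, 13, 6, 2, 6, 6, 15, 3, 4, 9, 8, 5,
         3, 9, 4, 14, 6, 9, 5, 2, 15, 6, 13, 4, 2, 4, 3]

group_1 = [u for s, u in zip(sex, usage) if s == 1]   # assumed: males
group_2 = [u for s, u in zip(sex, usage) if s == 2]   # assumed: females
n_1, n_2 = len(group_1), len(group_2)

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n_1 - 1) * variance(group_1) +
              (n_2 - 1) * variance(group_2)) / (n_1 + n_2 - 2)
std_error = sqrt(pooled_var * (1 / n_1 + 1 / n_2))

t = (mean(group_1) - mean(group_2)) / std_error
df = n_1 + n_2 - 2

print(f"means: {mean(group_1):.3f} vs {mean(group_2):.3f}, t = {t:.3f}, df = {df}")
```

If the equal-variance assumption is doubtful, a separate-variance form of the test would be the usual alternative.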
Paired Samples
The difference in these cases is examined by a paired
samples t test. To compute t for paired samples, the
paired difference variable, denoted by D, is formed
and its mean and variance calculated. Then the t
statistic is computed. The degrees of freedom are n -
1, where n is the number of pairs. The relevant
formulas are:
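$$H_0: \mu_D = 0$$
$$H_1: \mu_D \ne 0$$

$$t_{n-1} = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$$

where:

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

$$s_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n-1}}$$

$$s_{\bar{D}} = \frac{s_D}{\sqrt{n}}$$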
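The paired-samples formulas can be applied to any two ratings given by the same respondents. As an illustration only, the sketch below pairs the Attitude Toward Internet and Attitude Toward Technology columns of the Internet Usage Data; choosing these two variables is an assumption made for the example, not something the slides specify.

```python
from math import sqrt
from statistics import mean, stdev

# Attitude ratings for respondents 1-30, copied from the Internet Usage Data table.
attitude_internet = [7, 3, 4, 7, 7, 5, 4, 5, 6, 7, 4, 6, 6, 3, 5,
                     4, 5, 5, 6, 6, 4, 5, 4, 6, 5, 6, 5, 3, 5, 7]
attitude_technology = [6, 3, 3, 5, 7, 4, 5, 4, 4, 6, 3, 4, 5, 2, 4,
                       3, 3, 4, 6, 4, 2, 4, 2, 6, 3, 6, 5, 2, 3, 5]

# Paired difference variable D, one value per respondent.
d = [a - b for a, b in zip(attitude_internet, attitude_technology)]

n = len(d)                  # number of pairs
d_bar = mean(d)             # mean difference
s_d = stdev(d)              # standard deviation of D (n - 1 in the denominator)
std_error = s_d / sqrt(n)   # standard error of the mean difference

mu_d = 0                    # value of the mean difference under H0
t = (d_bar - mu_d) / std_error
df = n - 1

print(f"mean difference = {d_bar:.3f}, s_D = {s_d:.3f}, t = {t:.3f}, df = {df}")
```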