Quantitative Methods in Management
Day-4
Recap..
• Introduction
• Definition
• Terms and terminologies
• Types of statistics
• Types of data
• Levels of measurements
• Application of statistics in business
• Sources of data
Organizing and visualizing variables
• Tables
• Frequency distribution
• Relative frequency distribution
• Relative percent frequency distribution
• Cumulative frequency distribution
• Univariate
• Bivariate / cross tabulation
• Diagrams
• Bar charts
• Pie charts
• Graphs
• Histogram
• Frequency polygon
• Frequency curve
• Cumulative frequency curve (Ogive)
• EDA
• Stem and leaf plot
• Scatter diagram
• Dot plots
• Pareto chart
Numerical descriptive statistics (cont…)
Pg. 99-148
• Measures of location
• Measures of dispersion
• Measures of shape
• kurtosis
RELATIVE LOCATION
Z score
Chebyshev's inequality
Empirical rule
Relative location – Z score
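• In addition to measures of location, variability, and shape, we are also interested in the relative location of values within a data set.
$$z_i = \frac{x_i - \bar{x}}{s}$$
where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation.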
The larger the absolute value of the Z-score, the farther the
data value is from the mean.
Locating Extreme Outliers:
Z-Score
DCOVA
$$Z = \frac{X - \bar{X}}{S}$$
A score of 620 is 1.3 standard deviations above the mean and would not
be considered an outlier.
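The Z-score calculation can be sketched in a few lines of Python. The slide does not state the mean and standard deviation behind the score of 620, so this sketch assumes the Math SAT figures used elsewhere in these slides (mean 500, standard deviation 90). Under that assumption the result is consistent with the quoted 1.3. The cutoff of 3 for an extreme outlier is a common convention and is also an assumption here.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std_dev


# Assumed values: the slide does not state them for this example.
ASSUMED_MEAN = 500
ASSUMED_STD_DEV = 90
# Assumed convention: a value is an extreme outlier when |z| exceeds 3.
OUTLIER_THRESHOLD = 3.0

z = z_score(620, ASSUMED_MEAN, ASSUMED_STD_DEV)
is_outlier = abs(z) > OUTLIER_THRESHOLD
print(f"z = {z:.2f}, extreme outlier: {is_outlier}")
```

A positive z means the value lies above the mean, and a negative z means it lies below.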
z-Scores
[Figure: distribution of x with the axis marked at μ − 3σ, μ − 2σ, μ − 1σ, μ, μ + 1σ, μ + 2σ, μ + 3σ]
The Empirical Rule
• The empirical rule approximates the variation of data in a bell-shaped distribution
• Approximately 68% of the data in a bell-shaped distribution is within 1 standard deviation of the mean, or μ ± 1σ
[Figure: bell-shaped curve showing 68% of the data within μ ± 1σ]
The Empirical Rule
• Approximately 95% of the data in a bell-shaped
distribution lies within two standard deviations of
the mean, or µ ± 2σ
• Approximately 99.7% of the data in a bell-shaped
distribution lies within three standard deviations
of the mean, or µ ± 3σ
[Figure: bell-shaped curves showing 95% of the data within μ ± 2σ and 99.7% within μ ± 3σ]
Using the Empirical Rule
Suppose that the variable Math SAT scores is bell-shaped with a mean of 500 and
a standard deviation of 90. Then,
68% of all test takers scored between 410 and 590 (500 ± 90).
95% of all test takers scored between 320 and 680 (500 ± 180).
99.7% of all test takers scored between 230 and 770 (500 ± 270).
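The three intervals above follow directly from mean ± k standard deviations. A minimal Python sketch, using the mean of 500 and standard deviation of 90 from this example (the function name is illustrative, not from the slides):

```python
def empirical_rule_intervals(mean, std_dev):
    """Return (k, approx. percent, low, high) for k = 1, 2, 3 standard deviations.

    The percentages apply only to bell-shaped distributions.
    """
    coverage = [(1, 68.0), (2, 95.0), (3, 99.7)]
    return [(k, pct, mean - k * std_dev, mean + k * std_dev) for k, pct in coverage]


intervals = empirical_rule_intervals(500, 90)
for k, pct, low, high in intervals:
    print(f"about {pct}% of scores lie between {low} and {high} (mean ± {k} sd)")
```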
Chebyshev’s Theorem
At least $(1 - 1/z^2)$ of the items in any data set will be within z standard deviations of the mean, where z is any value greater than 1.
At least
$(1 - 1/2^2) \times 100\% = 75\%$ of the items are within z = 2 standard deviations (μ ± 2σ)
$(1 - 1/3^2) \times 100\% \approx 89\%$ of the items are within z = 3 standard deviations (μ ± 3σ)
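The bound is easy to evaluate for any z greater than 1. The sketch below reproduces the two values above and adds z = 4 as an extra illustration (the function name is hypothetical):

```python
def chebyshev_lower_bound(z):
    """Minimum proportion of items within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2


bounds = {z: chebyshev_lower_bound(z) for z in (2, 3, 4)}
for z, bound in bounds.items():
    print(f"z = {z}: at least {bound * 100:.1f}% of the items")
```

Unlike the empirical rule, this bound holds for any data set, which is why its percentages are lower.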
EXPLORATORY DATA ANALYSIS
FIVE NUMBER SUMMARY
BOX PLOT
Exploratory Data Analysis
Exploratory data analysis procedures enable us to use
simple arithmetic and easy-to-draw pictures to
summarize data.
(For small sample sizes, conflicting results may occur and the shape cannot be clearly determined.)
• The monthly starting salaries for a sample of 12 business school graduates are given below (in ascending order):
3310 3355 3450 3480 3480
3490 3520 3540 3550 3650
3730 3925
Five-number summary: Minimum, Q1, Q2, Q3, Maximum
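A five-number summary for the salary data can be sketched as below. Quartile conventions differ between textbooks and software. This sketch assumes one common convention, taking Q1 and Q3 as the medians of the lower and upper halves of the sorted data, so other tools may report slightly different quartiles.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves of the sorted data; the middle value is left out when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


salaries = [3310, 3355, 3450, 3480, 3480, 3490,
            3520, 3540, 3550, 3650, 3730, 3925]
summary = five_number_summary(salaries)
print(summary)
```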
Five-Number Summary
Example: Apartment Rents
Lowest Value = 425
First Quartile = 445
Median = 475
Third Quartile = 525
Largest Value = 615
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Box Plot
[Figure: box plot of the apartment rents on an axis from 400 to 625, with Q1 = 445, Q2 = 475, Q3 = 525]
Box Plot
Data: 3310 3355 3450 3480 3480 3490 3520 3540 3550 3650 3730 3925

Mean                 3540
Standard Error       47.81989569
Median               3505
Mode                 3480
Standard Deviation   165.6529779
Sample Variance      27440.90909
Kurtosis             1.718883645
Skewness             1.091108688
Range                615
Minimum              3310
Maximum              3925
Sum                  42480
Count                12
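Most of these figures can be reproduced with Python's standard library, as a rough cross-check. The skewness line uses the adjusted sample skewness formula, which is assumed here to be the one behind the value shown. Kurtosis is left out to keep the sketch short.

```python
import math
import statistics

salaries = [3310, 3355, 3450, 3480, 3480, 3490,
            3520, 3540, 3550, 3650, 3730, 3925]

n = len(salaries)
mean = statistics.mean(salaries)
median = statistics.median(salaries)
mode = statistics.mode(salaries)
std_dev = statistics.stdev(salaries)       # sample standard deviation (n - 1)
variance = statistics.variance(salaries)   # sample variance (n - 1)
std_error = std_dev / math.sqrt(n)         # standard error of the mean
data_range = max(salaries) - min(salaries)
# Adjusted sample skewness (assumed to match the spreadsheet output).
skewness = n / ((n - 1) * (n - 2)) * sum(((x - mean) / std_dev) ** 3 for x in salaries)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"std dev={std_dev:.4f}, variance={variance:.4f}, std error={std_error:.4f}")
print(f"range={data_range}, sum={sum(salaries)}, count={n}, skewness={skewness:.4f}")
```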
General Descriptive Stats Using Microsoft Excel Data Analysis Tool
1. Select Data.
2. Select Data Analysis.
3. Select Descriptive Statistics and click OK.
General Descriptive Stats Using Microsoft Excel

Variable      Total Count   Mean     SE Mean   StDev    Variance      Sum       Minimum
House Price   5             600000   357771    800000   6.40000E+11   3000000   100000

Variable      Median   Maximum   Range     Mode     N for Mode   Skewness   Kurtosis
House Price   300000   2000000   1900000   100000   2            2.01       4.13
Distribution Shape and the Boxplot
[Figure: three boxplots, each marked Q1, Q2, Q3]
Boxplot Example
[Figure: example boxplot]
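Sample covariance:
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
Sample coefficient of correlation:
$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y}$$
where
$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}, \qquad S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}, \qquad S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$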
Features of the Coefficient of Correlation
• The population coefficient of correlation is referred to as ρ.
• The sample coefficient of correlation is referred to as r.
• Both ρ and r have the following features:
• Unit free
• Range between –1 and 1
• The closer to –1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship
Scatter Plots of Sample Data with Various Coefficients of Correlation
[Figure: five scatter plots of Y against X, with r = −1, r = −.6, r = +1, r = +.3, and r = 0]
The Coefficient of Correlation Using Microsoft Excel Function

Test #1 Score   Test #2 Score
78              82
92              88
86              91
83              90
95              92
85              85
91              89
76              81
88              96
79              77

Correlation Coefficient: 0.7332, from =CORREL(A2:A11,B2:B11)
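As a cross-check on the spreadsheet result, the sample covariance and the coefficient of correlation can be computed from their definitions: the sum of cross-deviations divided by n − 1, then divided by the product of the two sample standard deviations. This is a minimal sketch using the ten pairs of test scores, and the function names are illustrative.

```python
import math

test1 = [78, 92, 86, 83, 95, 85, 91, 76, 88, 79]
test2 = [82, 88, 91, 90, 92, 85, 89, 81, 96, 77]


def sample_covariance(x, y):
    """Sum of cross-deviations from the means, divided by n - 1."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def sample_std_dev(x):
    """Sample standard deviation: square root of the covariance of x with itself."""
    return math.sqrt(sample_covariance(x, x))


def correlation(x, y):
    """Sample coefficient of correlation r = cov(X, Y) / (S_X * S_Y)."""
    return sample_covariance(x, y) / (sample_std_dev(x) * sample_std_dev(y))


cov_xy = sample_covariance(test1, test2)
r = correlation(test1, test2)
print(f"cov = {cov_xy:.4f}, r = {r:.4f}")
```

Because r divides the covariance by both standard deviations, it is unit free, whereas the covariance carries the units of both variables.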
The Coefficient of Correlation Using Microsoft Excel Data Analysis Tool
1. Select Data
2. Choose Data Analysis
3. Choose Correlation and click OK
The Coefficient of Correlation Using Microsoft Excel
r = .733
Scatter Plot of Test Scores
[Figure: scatter plot of Test #2 Score against Test #1 Score]
Students who scored high on the first test tended to score high on the second test.
Pitfalls in Numerical Descriptive Measures
• Data analysis is objective
• It should report the summary measures that best describe and communicate the important aspects of the data set
Models with DVD Player   Price   Models without DVD Player   Price
Sony HT-1800DP           $450    Pioneer HTP-230             $300
Pioneer HTD-330DV        300     Sony HT-DDW750              300
Sony HT-C800DP           400     Kenwood HTB-306             360
Panasonic SC-HT900       500     RCA RT-2600                 290
Panasonic SC-MTI         400     Kenwood HTB-206             300
• Compute the mean price for models with a DVD player and the mean price for
models without a DVD player. What is the additional price paid to have a DVD
player included in a home theatre unit?
• Compute the range, variance, and standard deviation for the two samples. What does
this information tell you about the prices for models with and without a DVD player?
Price with DVD player Price without DVD player
Count 5 Count 5
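One way to work through both questions is sketched below, using the sample (n − 1) variance and standard deviation. The helper name `summarize` is illustrative and not from the slides.

```python
import statistics

with_dvd = [450, 300, 400, 500, 400]
without_dvd = [300, 300, 360, 290, 300]


def summarize(prices):
    """Return count, mean, range, sample variance and sample standard deviation."""
    return {
        "count": len(prices),
        "mean": statistics.mean(prices),
        "range": max(prices) - min(prices),
        "variance": statistics.variance(prices),
        "std_dev": statistics.stdev(prices),
    }


dvd_stats = summarize(with_dvd)
no_dvd_stats = summarize(without_dvd)
extra_price = dvd_stats["mean"] - no_dvd_stats["mean"]

print("with DVD:   ", dvd_stats)
print("without DVD:", no_dvd_stats)
print(f"additional mean price for a DVD player: ${extra_price}")
```

On these samples, the models with a DVD player have both the higher mean price and the larger spread in prices.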
• The following data were used to construct the histograms of the number of days required to fill orders for Dawson Supply, Inc., and J.C. Clark Distributors.
• Use the range and standard deviation to support the claim that Dawson Supply provides the more consistent and reliable delivery times.
                               Dawson        Clark
Range                          2             8
Minimum                        9             7
Maximum                        11            15
Count                          10            10
Coefficient of variation (%)   6.552898619   25.08873455
Practice
• The following times were recorded by the quarter-mile and mile runners of a
university track team (times are in minutes).
Quarter-Mile Times: .92 .98 1.04 .90 .99
Mile Times: 4.52 4.35 4.60 4.70 4.50
After viewing this sample of running times, one of the coaches commented
that the quarter milers turned in the more consistent times. Use the standard
deviation and the coefficient of variation to summarize the variability in the
data. Does the use of the coefficient of variation indicate that the coach’s
statement should be qualified?
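One possible way to set up this comparison is sketched below. The coefficient of variation is computed as the sample standard deviation divided by the mean, times 100. The function name is illustrative.

```python
import statistics

quarter_mile = [0.92, 0.98, 1.04, 0.90, 0.99]
mile = [4.52, 4.35, 4.60, 4.70, 4.50]


def coefficient_of_variation(times):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(times) / statistics.mean(times) * 100


quarter_sd = statistics.stdev(quarter_mile)
mile_sd = statistics.stdev(mile)
quarter_cv = coefficient_of_variation(quarter_mile)
mile_cv = coefficient_of_variation(mile)

print(f"quarter-mile: sd = {quarter_sd:.4f} min, CV = {quarter_cv:.2f}%")
print(f"mile:         sd = {mile_sd:.4f} min, CV = {mile_cv:.2f}%")
```

The standard deviation is smaller for the quarter-mile times, but relative to the mean the mile times vary less, so the two measures rank the groups differently.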