Introduction To Data Analysis: Professor David Richardson IIT Stuart School of Business
• R
• Ordinal
– e.g. Likert scale rating, ranking, education
• Pictorial representation
– Histogram (or bar chart, pie chart)
– Scatter plot
Histogram
• A histogram shows the frequency distribution of the data
• In Excel, only quantitative (numeric) data can be displayed with a histogram
[Figure: Histogram. Vertical axis: Frequency (0 to 30). Horizontal axis: bin upper limits from 15000 to 104900 in steps of 10000, followed by "More".]
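The frequency distribution behind a histogram like this one can be sketched in a few lines. The data values below are hypothetical (the slide's underlying data are not given), and the helper name `frequency_table` is an assumption made for illustration. The sketch assumes each bin counts values up to and including its upper limit, with anything larger going to "More".

```python
from bisect import bisect_left
from collections import Counter


def frequency_table(values, upper_limits):
    """Count how many values fall into each bin.

    A bin holds values greater than the previous limit and less than or
    equal to its own upper limit. Values above the last limit go to "More".
    upper_limits must be sorted in increasing order.
    """
    labels = [str(limit) for limit in upper_limits] + ["More"]
    counts = Counter()
    for value in values:
        # Index of the first limit that is >= value; len(upper_limits) means "More".
        counts[labels[bisect_left(upper_limits, value)]] += 1
    return [(label, counts[label]) for label in labels]


# Hypothetical data, not taken from the slide.
incomes = [12000, 24900, 30000, 47000, 51000, 110000]
limits = [15000, 24900, 34900, 44900, 54900]

table = frequency_table(incomes, limits)
for label, count in table:
    print(f"{label:>6}  {'#' * count}")
```

Each row of `table` is one bar of the histogram: the label is the bin's upper limit and the count is the bar's height.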
Statistics for Central Tendency
• Central tendency: where is the bulk of the data?
– Mean: average value, used for interval & ratio data
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{21 + 22 + 22 + \dots + 21}{10} = 21.6$$
– Variance: $$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = 0.26667$$
– Standard deviation: $$S = \sqrt{S^2} = 0.516398$$
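As a quick check of these figures, the following sketch recomputes the mean, sample variance, and sample standard deviation for the ten ages listed in the observation table. It uses Python's standard `statistics` module; the variable names are chosen here for illustration.

```python
import statistics

# The ten ages from the observation table.
ages = [21, 22, 22, 22, 21, 22, 22, 21, 22, 21]

mean_age = statistics.mean(ages)        # sum of the values divided by n
sample_var = statistics.variance(ages)  # sum of squared deviations divided by n - 1
sample_sd = statistics.stdev(ages)      # square root of the sample variance

print(mean_age, round(sample_var, 5), round(sample_sd, 6))
```

Note that `statistics.variance` and `statistics.stdev` divide by n - 1, matching the formula on the slide; `statistics.pvariance` would divide by n instead.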
Measures of Dispersion
Obs   Age    Age - 21   Deviation   Abs. Dev.   Squared Deviation
1     21     0          -0.6        0.6         0.36
2     22     1           0.4        0.4         0.16
3     22     1           0.4        0.4         0.16
4     22     1           0.4        0.4         0.16
5     21     0          -0.6        0.6         0.36
6     22     1           0.4        0.4         0.16
7     22     1           0.4        0.4         0.16
8     21     0          -0.6        0.6         0.36
9     22     1           0.4        0.4         0.16
10    21     0          -0.6        0.6         0.36
sum   216    6           0.00       4.80        2.40
avg   21.6   0.6         0.00       0.48        0.24
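The columns of this table can be rebuilt step by step. The sketch below is one way to do it, with variable names chosen for illustration: it codes each age as "age minus 21", recovers the mean from the coded values, and then forms the deviation, absolute deviation, and squared deviation columns.

```python
# The ten ages from the table.
ages = [21, 22, 22, 22, 21, 22, 22, 21, 22, 21]
n = len(ages)

# "Age - 21" column: subtracting a constant makes the mean easy to compute by hand.
coded = [age - 21 for age in ages]
mean_age = 21 + sum(coded) / n  # 21 + 0.6

# Deviation, absolute deviation, and squared deviation columns.
deviations = [age - mean_age for age in ages]
abs_devs = [abs(d) for d in deviations]
sq_devs = [d * d for d in deviations]

mean_abs_dev = sum(abs_devs) / n   # the "avg" entry of the Abs. Dev. column
sum_sq = sum(sq_devs)              # the "sum" entry of the Squared Deviation column
sample_var = sum_sq / (n - 1)      # divide by n - 1, not n, for the sample variance

print(round(sum(deviations), 2), round(mean_abs_dev, 2), round(sum_sq, 2))
```

The deviations always sum to zero, which is why dispersion is measured with absolute or squared deviations. The "avg" row of the table divides the squared deviations by n (giving 0.24), whereas the sample variance divides by n - 1.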
Normal Distribution
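$$r_{XY} = \frac{\mathrm{Cov}(X, Y)}{s_x s_y} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n - 1)\, s_x s_y} = \frac{1}{n - 1} \sum_{i=1}^{n} \frac{(X_i - \bar{X})}{s_x} \cdot \frac{(Y_i - \bar{Y})}{s_y}$$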
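The two forms of the correlation coefficient (covariance divided by the product of the standard deviations, and the average product of standardized values) give the same number. The sketch below illustrates this on a small hypothetical data set; the function names and the data are assumptions made for this example.

```python
import math


def sample_mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = sample_mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation_from_covariance(xs, ys):
    """r = Cov(X, Y) / (s_x * s_y)."""
    mx, my = sample_mean(xs), sample_mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (sample_sd(xs) * sample_sd(ys))


def correlation_from_z_scores(xs, ys):
    """r = (1 / (n - 1)) * sum of products of standardized values."""
    mx, my = sample_mean(xs), sample_mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    products = (((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_cov = correlation_from_covariance(xs, ys)
r_z = correlation_from_z_scores(xs, ys)
print(round(r_cov, 4), round(r_z, 4))
```

Both functions require at least two observations and non-constant data, since a standard deviation of zero would make the ratio undefined.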
Y = “Fit” + “Error”
• Dependent variable: Y, must be an interval/ratio variable
• Independent variables: X1, X2, …, Xk
• Coefficients: b0, b1, b2, …, bk
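To make the "Fit" plus "Error" decomposition concrete, here is a minimal sketch with a single independent variable, so only b0 and b1 are estimated. The general case with X1 through Xk needs matrix methods and is not shown. The data and the function name `fit_line` are hypothetical, and ordinary least squares is assumed as the estimation method.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (b0, b1)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared X deviations.
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    # Intercept: the fitted line passes through the point of means.
    b0 = my - b1 * mx
    return b0, b1


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
fitted = [b0 + b1 * x for x in xs]           # the "Fit" part
errors = [y - f for y, f in zip(ys, fitted)]  # the "Error" part (residuals)

for y, f, e in zip(ys, fitted, errors):
    print(f"{y} = {f:.2f} + {e:+.2f}")
```

Every observed Y is exactly its fitted value plus its residual, and with an intercept in the model the residuals sum to zero.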
Regression – Log 12 Pack Volume with Carry Over
[Output tables not reproduced: Coefficients, Model Summary]