Introduction To Descriptive Statistics
Descriptive Statistics
17.871
Key measures

Describing data | Moment                        | Non-mean based measure
Center          | Mean                          | Mode, median
Spread          | Variance (standard deviation) | Range, interquartile range
Skew            | Skewness                      | --
Peaked          | Kurtosis                      | --
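Key distinction
Population vs. Sample Notation

Population: Greeks ($\mu$, $\sigma$, $\beta$)
vs. Sample: Romans ($s$, $b$)

Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Variance and standard deviation of a population

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}, \qquad \sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$$

Variance and standard deviation of a sample (the divisor $n - 1$ is the degrees of freedom)

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}, \qquad s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$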
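The difference between dividing by n and by n - 1 is easiest to see on a small example. The following is a minimal sketch, assuming a hypothetical list of eight values chosen only so the arithmetic is easy to follow; the variable names are illustrative and do not come from the slides.

```python
import math
import statistics

# Hypothetical data, chosen only for illustration (not from the slides).
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(x)

# Mean: sum of the observations divided by n.
mean = sum(x) / n

# Sum of squared deviations from the mean, shared by both variance formulas.
ss = sum((xi - mean) ** 2 for xi in x)

# Population variance divides by n; sample variance divides by n - 1
# (the degrees of freedom).
pop_var = ss / n
samp_var = ss / (n - 1)
samp_sd = math.sqrt(samp_var)

# Cross-check against the standard library's implementations.
assert math.isclose(pop_var, statistics.pvariance(x))
assert math.isclose(samp_var, statistics.variance(x))

print(mean, pop_var, samp_var, samp_sd)
```

The sample variance is always the larger of the two, and the gap shrinks as n grows.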
Binary data
$$\bar{X} = \mathrm{prob}(X = 1) = \text{proportion of time } x = 1$$

$$s_x^2 = \bar{x}(1 - \bar{x}) \quad\Rightarrow\quad s_x = \sqrt{\bar{x}(1 - \bar{x})}$$
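A quick numerical check of the binary-data shortcut, as a sketch on a hypothetical 0/1 variable (the data and names are assumptions for illustration). Note that the shortcut x̄(1 - x̄) matches the variance computed with divisor n; with divisor n - 1 the result is slightly larger.

```python
import math

# Hypothetical binary (0/1) observations, for illustration only.
x = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]
n = len(x)

# The mean of a 0/1 variable is the proportion of ones.
x_bar = sum(x) / n

# Shortcut variance for binary data.
var_shortcut = x_bar * (1 - x_bar)
sd_shortcut = math.sqrt(var_shortcut)

# Direct computation with divisor n, for comparison.
var_direct_n = sum((xi - x_bar) ** 2 for xi in x) / n

# Direct computation with divisor n - 1 (slightly larger).
var_direct_n1 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)

print(x_bar, var_shortcut, var_direct_n, var_direct_n1)
```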
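[Figure: symmetrical, bell-shaped distribution; x-axis: Value, y-axis: Frequency]

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / (2\sigma^2)}$$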
IQ
SAT
Height
No skew
Zero skew
Symmetrical
Mean = median = mode
Skewness
Asymmetrical distribution
[Figure: asymmetrical distribution; x-axis: Value, y-axis: Frequency]
Income
Contribution to candidates
Populations of countries
Residual vote rates
Positive skew
Right skew
Skewness
Asymmetrical distribution
[Figure: asymmetrical distribution; x-axis: Value, y-axis: Frequency]
Negative skew
Left skew
Skewness
[Figure: x-axis: Value, y-axis: Frequency]
Kurtosis
k > 3: leptokurtic
k = 3: mesokurtic
k < 3: platykurtic
[Figure: three distributions of differing peakedness; x-axis: Value, y-axis: Frequency]
Normal distribution
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / (2\sigma^2)}$$
Skewness = 0
Kurtosis = 3
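The two benchmark values for the normal distribution can be checked numerically. The sketch below assumes the moment-based definitions (third and fourth central moments scaled by powers of the second central moment); the slides do not spell out a formula, so this choice, the helper names, and the simulated data are all assumptions for illustration.

```python
import random


def central_moment(x, k):
    """k-th central moment with divisor n."""
    m = sum(x) / len(x)
    return sum((xi - m) ** k for xi in x) / len(x)


def skewness(x):
    """Third central moment divided by the 1.5 power of the second."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5


def kurtosis(x):
    """Fourth central moment divided by the square of the second.

    This is 'raw' kurtosis, so a normal distribution scores about 3.
    """
    return central_moment(x, 4) / central_moment(x, 2) ** 2


# Simulated standard-normal draws; the seed is fixed for repeatability.
rng = random.Random(0)
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Small hand-made examples: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print(skewness(normal_draws), kurtosis(normal_draws))
print(skewness(symmetric), skewness(right_skewed))
```

On the simulated draws the skewness should land near 0 and the kurtosis near 3, while the right-skewed example gives a positive skewness. Some software reports excess kurtosis (kurtosis minus 3) instead, so it is worth checking which convention a package uses.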
The z-score, or the standardized score

$$z = \frac{x - \bar{x}}{\sigma_x}$$
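Standardizing is a two-step recipe: subtract the mean, then divide by the standard deviation. A minimal sketch on hypothetical scores follows; it assumes the divisor-n standard deviation (`statistics.pstdev`), and `statistics.stdev` could be swapped in for the n - 1 version.

```python
import statistics

# Hypothetical scores, for illustration only.
scores = [0.2, 0.4, 0.5, 0.6, 0.8]

mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)  # assumption: divisor-n standard deviation

# z-score: distance from the mean in standard-deviation units.
z = [(s - mean) / sd for s in scores]

# After standardizing, the scores have mean 0 and standard deviation 1.
print(z)
print(statistics.fmean(z), statistics.pstdev(z))
```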
recodedtype | mean(totals~e)
------------+---------------
          1 |       .3729735
          2 |       .4475548
          3 |        .589883
Graph totalscore
. hist totalscore
[Figure: histogram of totalscore; x-axis: totalscore, y-axis: Density]

. histogram totalscore, width(.01) xlabel(.2(.1)1)
(bin=124, start=.24209334, width=.01)
[Figure: histogram of totalscore with bin width .01; x-axis: totalscore, y-axis: Density]
Histograms by category
. histogram totalscore, width(.01) xlabel(.2(.1)1) by(recodedtype)
(bin=124, start=.24209334, width=.01)
[Figure: histograms of totalscore in three panels (Public, Religious private, Nonsectarian private); x-axis: totalscore, y-axis: Density; Graphs by recodedtype]
No Response
Refused
Don't know
Not in universe
Less than 1 month
1-6 months
7-11 months
1-2 years
3-4 years
5 years or longer
Solution, Step 1
Map artificial category onto "natural" midpoint

Code | Category          | Value (years)
-9   | No Response       | missing
-3   | Refused           | missing
-2   | Don't know        | missing
-1   | Not in universe   | missing
1    | Less than 1 month | 1/24 = 0.042
2    | 1-6 months        | 3.5/12 = 0.29
3    | 7-11 months       | 9/12 = 0.75
4    | 1-2 years         | 1.5
5    | 3-4 years         | 3.5
6    | 5 years or longer | 10 (arbitrary)
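The recoding step can be sketched as a simple lookup from survey code to midpoint in years. The mapping values are the ones listed above; the response list, the function name, and the use of `None` for missing are assumptions made for illustration.

```python
# Midpoints in years, taken from the mapping above.
# None marks responses treated as missing.
MIDPOINT_YEARS = {
    -9: None,        # No Response
    -3: None,        # Refused
    -2: None,        # Don't know
    -1: None,        # Not in universe
    1: 1 / 24,       # Less than 1 month
    2: 3.5 / 12,     # 1-6 months
    3: 9 / 12,       # 7-11 months
    4: 1.5,          # 1-2 years
    5: 3.5,          # 3-4 years
    6: 10.0,         # 5 years or longer (arbitrary value)
}


def recode(code):
    """Return the midpoint in years for a survey code, or None if missing."""
    return MIDPOINT_YEARS[code]


# Hypothetical raw responses, for illustration only.
responses = [6, 2, -9, 4, 1, 6]
longevity = [recode(c) for c in responses]

# Drop missing values before computing any summary statistic.
valid = [v for v in longevity if v is not None]
mean_longevity = sum(valid) / len(valid)

print(longevity)
print(mean_longevity)
```

Because the top category's value is arbitrary, any mean computed this way depends on that choice.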
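[Figure: histogram of longevity; x-axis: longevity (0 to 10), y-axis: Fraction (0 to .557134)]
[Figure: histogram of longevity; x-axis: longevity (0 to 15), y-axis: Fraction]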
Category | Fraction | X-min | X-max | X-length | Height (density)
<1 mo.   | .0156    | 0     | 1/12  | .082     | .19*
1-6 mo.  | .0909    | 1/12  |       | .417     | .22
7-11 mo. | .0430    |       |       | .500     | .09
1-2 yr.  | .1529    |       |       |          | .15
3-4 yr.  | .1404    |       |       |          | .07
5+ yr.   | .5571    |       | 15    | 11       | .05

* = .0156/.082
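The starred footnote generalizes to every row: the bar height (density) is the fraction of cases in the category divided by the category's length on the x-axis, so that each bar's area equals its fraction. A minimal sketch, using only the rows where both the fraction and the X-length survive in the table; the label "<1 mo." for the first row is inferred from its 0 to 1/12 range and is an assumption.

```python
# (label, fraction of cases, interval length in years)
# Only rows with both a fraction and an X-length in the table are included.
rows = [
    ("<1 mo.", 0.0156, 0.082),    # label inferred, not printed in the table
    ("1-6 mo.", 0.0909, 0.417),
    ("7-11 mo.", 0.0430, 0.500),
    ("5+ yr.", 0.5571, 11.0),
]

# Density height = fraction / interval length.
heights = {label: fraction / length for label, fraction, length in rows}

for label, height in heights.items():
    print(f"{label:9s} {height:.2f}")
```

Rounded to two decimals, these reproduce the heights in the table for the same rows.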
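. graph box totalscore
[Figure: box plot of totalscore, annotated with the upper quartile, median, lower quartile, inter-quartile range, and 1.5 x IQR]

[Figure: box plots of totalscore; Graphs by recodedtype]

. graph box totalscore, over(recodedtype)
[Figure: box plots of totalscore over recodedtype]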
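The quantities annotated on a box plot can be computed directly. The sketch below uses hypothetical data and the "inclusive" quartile method from Python's standard library; quartile conventions differ between packages, so the numbers need not match what Stata draws. Flagging points beyond 1.5 x IQR from the quartiles is a common convention and is assumed here.

```python
import statistics

# Hypothetical data with one unusually large value, for illustration only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]

# Lower quartile, median, and upper quartile.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Inter-quartile range: the height of the box.
iqr = q3 - q1

# Fences at 1.5 x IQR beyond the quartiles; points outside are plotted
# individually as outliers under this convention.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in data if v < lower_fence or v > upper_fence]

print(q1, median, q3, iqr)
print(lower_fence, upper_fence, outliers)
```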