
Summarizing Graphics

Uploaded by

David Gbolagun
Copyright
© All Rights Reserved

Descriptive Statistics

Summarizing data using graphs


Which graph to use?
• Depends on type of data
• Depends on what you want to illustrate
• Depends on available statistical software
Bar Chart
Birth Order of Spring 1998 Stat 250 Students
[Figure: bar chart. Vertical axis: Percent. Horizontal axis: Birth Order (Middle, Oldest, Only, Youngest). n=92 students]
Bar Chart
• Summarizes categorical data.
• Horizontal axis represents categories, while
vertical axis represents either counts
(“frequencies”) or percentages (“relative
frequencies”).
• Used to illustrate the differences in
percentages (or counts) between categories.
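
The counts and percentages that a bar chart displays can be tabulated directly. Below is a minimal Python sketch; the `frequency_table` helper and the list of responses are made up for illustration and are not the Stat 250 data.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: (count, percent)} for a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}

# Made-up responses for illustration only (not the Stat 250 data).
birth_order = ["Oldest", "Youngest", "Only", "Oldest", "Middle",
               "Youngest", "Oldest", "Middle"]

table = frequency_table(birth_order)
for category in sorted(table):
    count, percent = table[category]
    # One '#' per observation gives a rough text bar for each category.
    print(f"{category:<9}{count:>3}{percent:>7.1f}%  {'#' * count}")
```

Either column (count or percent) can be used for the bar heights; the shape of the chart is the same.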
Histogram
Age of Spring 1998 Stat 250 Students
[Figure: histogram. Vertical axis: Frequency (Count). Horizontal axis: Age (in years), 18 to 27. n=92 students]
Analogy

Bar chart is to categorical data as
histogram is to measurement data.
Histogram
• Divide measurement up into equal-sized
categories.
• Determine number (or percentage) of
measurements falling into each category.
• Draw a bar for each category so bars’
heights represent number (or percent)
falling into the categories.
• Label and title appropriately.
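
The steps above can be sketched in a few lines of Python. This is only an illustration: the `histogram_counts` helper, the starting value, the category width, and the ages are all assumptions, not the Stat 250 data.

```python
def histogram_counts(data, low, width, n_bins):
    """Count values in n_bins equal-width categories.

    Category i covers the interval [low + i*width, low + (i+1)*width).
    Values outside all categories are ignored.
    """
    counts = [0] * n_bins
    for x in data:
        i = int((x - low) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts

# Made-up ages for illustration only.
ages = [18, 19, 19, 20, 20, 20, 21, 21, 22, 24, 27]

counts = histogram_counts(ages, low=18, width=2, n_bins=5)
for i, c in enumerate(counts):
    left = 18 + 2 * i
    # Each '#' stands for one measurement falling into the category.
    print(f"{left:>2} to <{left + 2:<2} | {'#' * c} ({c})")
```

Changing `width` and `n_bins` is an easy way to try out different numbers of categories on the same data.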
Histogram

Use common sense in determining
number of categories to use.

(Trial-and-error works fine, too.)


Too few categories
Age of Spring 1998 Stat 250 Students
[Figure: histogram with too few categories. Vertical axis: Frequency (Count). Horizontal axis: Age (in years), labeled at 18, 23, 28. n=92 students]
Too many categories
GPAs of Spring 1998 Stat 250 Students
[Figure: histogram with too many categories. Vertical axis: Frequency (Count). Horizontal axis: GPA, 2 to 4. n=92 students]
Dot Plot
Fastest Ever Driving Speed
226 Stat 100 Students, Fall '98
[Figure: dot plots, one for 100 Men and one for 126 Women. Horizontal axis: Speed, 70 to 160]
Dot Plot
• Summarizes measurement data.
• Horizontal axis represents measurement
scale.
• Plot one dot for each data point.
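
A rough text version of a dot plot shows the idea. The sketch below assumes whole-number data with values under three digits; the `dot_plot_rows` helper and the numbers are made up for illustration.

```python
from collections import Counter

def dot_plot_rows(data):
    """Return text rows of a dot plot: one 'o' per data point,
    stacked above its value, with the axis as the last row."""
    counts = Counter(data)
    values = list(range(min(data), max(data) + 1))
    height = max(counts.values())
    rows = []
    for level in range(height, 0, -1):  # draw the top level first
        rows.append("".join(" o " if counts[v] >= level else "   "
                            for v in values))
    rows.append("".join(f"{v:^3}" for v in values))  # measurement scale
    return rows

# Made-up measurements for illustration only.
hours = [5, 6, 6, 7, 7, 7, 8, 8, 10]

rows = dot_plot_rows(hours)
print("\n".join(rows))
```

Because every data point gets its own dot, nothing is summarized away, which is why this display suits small data sets.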
Stem-and-Leaf Plot
Stem-and-leaf of Shoes N = 139 Leaf Unit = 1.0

12    0  223334444444
63    0  555555555555566666666677777778888888888888999999999
(33)  1  000000000000011112222233333333444
43    1  555555556667777888
25    2  0000000000023
12    2  5557
8     3  0023
4     3
4     4  00
2     4
2     5  0
1     5
1     6
1     6
1     7
1     7  5
Stem-and-Leaf Plot
• Summarizes measurement data.
• Each data point is broken down into a
“stem” and a “leaf.”
• First, “stems” are aligned in a column.
• Then, “leaves” are attached to the stems.
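
The stem/leaf split can be sketched in Python. This illustration assumes non-negative whole numbers with a leaf unit of 1, and prints one row per stem rather than the two rows per stem used in the display above; the `stem_and_leaf` helper and the data are made up.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Map each stem (tens digit and above) to its sorted leaves (units digit)."""
    stems = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(x, 10)
        stems[stem].append(leaf)
    return stems

# Made-up counts for illustration only (not the Shoes data).
pairs_of_shoes = [2, 5, 5, 8, 10, 12, 13, 15, 20, 20, 23, 40]

plot = stem_and_leaf(pairs_of_shoes)
# Stems go in a column first, including stems with no leaves.
for stem in range(min(plot), max(plot) + 1):
    leaves = "".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

Stems without any data are still printed, so gaps in the data stay visible.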
Box Plot
Amount of sleep in past 24 hours
of Spring 1998 Stat 250 Students
[Figure: box plot. Vertical axis: Hours of sleep, 0 to 10]
Box Plot
• Summarizes measurement data.
• Vertical (or horizontal) axis represents
measurement scale.
• Lines in box represent the 25th percentile
(“first quartile”), the 50th percentile
(“median”), and the 75th percentile (“third
quartile”), respectively.
An aside...
• Roughly speaking:
– The “25th percentile” is the number such that
25% of the data points fall below the number.
– The “median” or “50th percentile” is the
number such that half of the data points fall
below the number.
– The “75th percentile” is the number such that
75% of the data points fall below the number.
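
These three numbers can be computed with Python's standard library. Software packages use slightly different percentile rules, so the default rule of `statistics.quantiles` used here is one choice among several, and the data and the `fraction_below` helper are made up for illustration.

```python
import statistics

def fraction_below(data, value):
    """Fraction of data points strictly below the given value."""
    return sum(x < value for x in data) / len(data)

# Made-up measurements for illustration only.
data = [2, 4, 5, 7, 8, 9, 11, 12, 14, 15, 18]

# n=4 asks for the three cut points that split the data into quarters.
q1, median, q3 = statistics.quantiles(data, n=4)

for name, value in [("25th percentile", q1),
                    ("median", median),
                    ("75th percentile", q3)]:
    print(f"{name}: {value}  (fraction below: {fraction_below(data, value):.2f})")
```

With a small data set the fractions below each number only come close to 25%, 50%, and 75%, which is the sense of "roughly speaking."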
Box Plot (cont’d)
• “Whiskers” are drawn to the most extreme
data points that are not more than 1.5 times
the length of the box beyond either quartile.
– Whiskers are useful for identifying outliers.
• “Outliers,” or extreme observations, are
denoted by asterisks.
– Generally, data points falling beyond the
whiskers are considered outliers.
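
The whisker rule can be written out as a short calculation. The sketch below is an illustration under assumptions: quartiles come from the default rule of Python's `statistics.quantiles` (other software may give slightly different quartiles), and the `box_plot_summary` helper and the data are made up.

```python
import statistics

def box_plot_summary(data):
    """Return quartiles, whisker ends, and outliers using the 1.5 rule."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    box_length = q3 - q1                      # length of the box
    low_limit = q1 - 1.5 * box_length
    high_limit = q3 + 1.5 * box_length
    inside = [x for x in data if low_limit <= x <= high_limit]
    return {
        "quartiles": (q1, median, q3),
        # Whiskers end at the most extreme points still within the limits.
        "whiskers": (min(inside), max(inside)),
        # Points beyond the limits are flagged as outliers.
        "outliers": sorted(x for x in data if x < low_limit or x > high_limit),
    }

# Made-up measurements for illustration only.
data = [2, 4, 5, 7, 8, 9, 11, 12, 14, 15, 40]

summary = box_plot_summary(data)
print(summary)
```

Note that the whiskers stop at actual data points, not at the computed limits themselves.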
Using Box Plots to Compare
Fastest Ever Driving Speed
226 Stat 100 Students, Fall 1998
[Figure: side-by-side box plots. Vertical axis: Fastest Speed (mph), 60 to 160. Horizontal axis: Gender (female, male)]
Which graph to use when?
• Stem-and-leaf plots and dotplots are good
for small data sets, while histograms and
box plots are good for large data sets.
• Boxplots and dotplots are good for
comparing two groups.
• Boxplots are good for identifying outliers.
• Histograms and boxplots are good for
identifying “shape” of data.
Scatter Plots
Foot sizes of Spring 1998 Stat 250 students
[Figure: scatter plot. Vertical axis: Right foot (in cm), 22 to 31. Horizontal axis: Left foot (in cm), 22 to 31. n=88 students]
Scatter Plots
• Summarizes the relationship between two
measurement variables.
• Horizontal axis represents one variable and
vertical axis represents second variable.
• Plot one point for each pair of
measurements.
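
A rough text version of a scatter plot shows how the pairs are placed. The sketch assumes whole-number measurements; the `scatter_rows` helper and the foot lengths are made up for illustration and are not the Stat 250 data.

```python
def scatter_rows(xs, ys):
    """Return text rows with one '*' per (x, y) pair.

    Each row is one value on the vertical scale (largest at the top);
    each column is one value on the horizontal scale. Identical pairs
    share a single mark.
    """
    points = set(zip(xs, ys))  # pair the two measurements
    rows = []
    for y in range(max(ys), min(ys) - 1, -1):
        cells = "".join("*" if (x, y) in points else "."
                        for x in range(min(xs), max(xs) + 1))
        rows.append(f"{y:>3} {cells}")
    return rows

# Made-up paired measurements (in cm) for illustration only.
left_foot = [23, 24, 25, 26, 27]
right_foot = [23, 24, 26, 26, 28]

rows = scatter_rows(left_foot, right_foot)
print("\n".join(rows))
```

When the marks drift upward from left to right, as they do here, larger values of one variable tend to go with larger values of the other.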
No relationship
Lengths of left forearms and head circumferences
of Spring 1998 Stat 250 Students
[Figure: scatter plot. Vertical axis: Left forearm (in cm), 22 to 32. Horizontal axis: Head circumference (in cm), 52 to 62. n=89 students]
Closing comments
• Many possible types of graphs.
• Use common sense in reading graphs.
• When creating graphs, don’t summarize
your data too much or too little.
• When creating graphs, label everything for
others. Remember you are trying to
communicate something to others!
