Ch-1 Solution
Descriptive Statistics
1.1 Introduction
1.2 Basic concepts
1.3 Sampling schemes
1.4 Graphical representation of data
1.5 Numerical description of data
1.6 Computers and statistics
1.7 Chapter summary
1.8 Computer examples
Projects for Chapter 1
Chapter 1
Exercises 1.2
1.2.1
The suggested solutions:
For qualitative data we can have color, sex, race, ZIP code, and so on. For quantitative data we
can have age, temperature, time, height, weight, and so on. For cross-sectional data we can have
school funding for each department in 2000. For time series data we can have the crude oil
price from 1995 to 2008.
1.2.2
The suggested solutions:
For qualitative data we collect the frequency of each category and compare the categories
with either a bar chart or a pie chart.
For quantitative data we collect numerical values and examine their distribution with a
histogram.
For cross-sectional data we collect data from different sections at the same point in time and
compare the sections with one another.
For time series data we collect the same type of data at different points in time and look for
any trend or pattern in the data over time.
1.2.3
Suggested questions:
1. What type of data is the amount?
2. Do these federal agencies receive the same amount of money? If not, why?
3. Which federal agency should receive more money? Why?
1.2.4
Suggested questions:
1. Is the amount of money the same each year?
2. Should we change the proportions allocated among the agencies?
3. Should we increase the total amount?
1.3.1
For a stratified sample, suppose we decide to sample 100 college students from a
population of 1000 (that is, 10% of the population). These 1000 students come from three
majors: Math (200 students), Computer Science (400), and Social Science (400). We then
choose 10% of each major (20 Math, 40 CS, and 40 SS students) by simple random
sampling within each major.
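The stratified scheme above can be sketched in Python. This is only an illustration: the student IDs, the function name `stratified_sample`, and the fixed seed are assumptions made for the example, not part of the exercise.

```python
import random

def stratified_sample(strata, fraction, seed=0):
    """Draw the same fraction from every stratum by simple random sampling."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)  # proportional allocation
        sample[name] = rng.sample(members, k)  # sampling without replacement
    return sample

# Hypothetical student IDs for the three majors (200, 400, and 400 students).
strata = {
    "Math": [f"M{i}" for i in range(200)],
    "CS": [f"C{i}" for i in range(400)],
    "SS": [f"S{i}" for i in range(400)],
}

sample = stratified_sample(strata, 0.10)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)
```

Because the same 10% is taken inside every major, each major's share of the sample matches its share of the population.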
For a cluster sample, suppose we decide to sample college students from a population
of 2000. These 2000 students come from 20 different countries, and we choose 3 of the 20
countries by random sampling. We then collect the information of every individual student
from each of the 3 chosen countries.
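A minimal Python sketch of the cluster scheme follows. The country labels, the equal cluster size of 100 students, the function name `cluster_sample`, and the seed are all assumptions for illustration; the exercise only fixes 2000 students, 20 countries, and 3 chosen countries.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Pick whole clusters at random and keep every member of each one."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical population: 20 countries with 100 students each (2000 total).
clusters = {
    f"Country{c:02d}": [f"C{c:02d}-S{i}" for i in range(100)]
    for c in range(20)
}

picked = cluster_sample(clusters, 3)
total = sum(len(students) for students in picked.values())
print(sorted(picked), total)
```

Unlike the stratified scheme, the randomness is applied to the groups themselves, and no sampling happens inside a chosen group.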
1.3.2
Answers will vary; try to cover all the possible errors studied in this chapter.
Exercises 1.4
1.4.1
By Minitab.
1.4.2
(a) Bar graph
1.4.3
(a) Bar graph
1.4.4
(a) Bar graph
1.4.6
1.4.7
1.4.9
Bar chart
1.4.10
1.4.11
[Figure: bar graph of revenues]
(b)
[Figure: pie chart of expenditure]
[Figure: pie chart of revenues]
1.4.13
1.4.14
(a) Stem and leaf
Stem-and-Leaf Display: C1
Stem-and-leaf of C1  N = 40
Leaf Unit = 1.0
  2   0  00
 12   0  2222223333
 13   0  5
 20   0  6666677
 20   0  888899
 14   1  111
 11   1  223333
  5   1  55
  3   1  677
(b) Histogram
1.4.15
(a) Stem and leaf
Stem-and-leaf of C1  N = 20
Leaf Unit = 10
  1   4  7
  3   4  99
  8   5  00011
 10   5  22
 10   5  4455
  6   5  6667
  2   5  9
  1   6  0
(b) Histogram
1.4.16
(a)
Frequency table
(b)
Histogram
1.4.17
Exercises 1.5
1.5.1
The mean is 165.6667 and the standard deviation is 63.15397.
1.5.2
1.5.3
The required data are 3, 3, 5, 13, with mean 6, median 4, and mode 3. The sample standard
deviation is 4.760952.
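These values can be checked with Python's standard `statistics` module. Note that `statistics.stdev` uses the sample (n - 1) divisor, which is the convention that reproduces 4.760952.

```python
import statistics

data = [3, 3, 5, 13]

mean = statistics.mean(data)      # arithmetic mean
median = statistics.median(data)  # average of the two middle values
mode = statistics.mode(data)      # most frequent value
sd = statistics.stdev(data)       # sample standard deviation (divisor n - 1)

print(mean, median, mode, round(sd, 6))
```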
1.5.4
(a) Mean is 1243.5, variance is 792365.8, and range is 2621.
(b) Lower quartile is 532.5, median is 1083.5, upper quartile is 1814.25, and interquartile
range is 1281.75. The lower limit of outliers is -1390.125 and the upper limit of outliers
is 3736.875. Therefore, there are no outliers.
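The outlier limits in (b) are consistent with the usual 1.5 × IQR rule: lower limit = Q1 - 1.5 IQR and upper limit = Q3 + 1.5 IQR. A short Python sketch of that arithmetic, using the quartiles reported above (the function name `outlier_fences` is made up for this example):

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower limit, upper limit, IQR) for the k * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr, iqr

# Quartiles reported in part (b).
lower, upper, iqr = outlier_fences(532.5, 1814.25)
print(iqr, lower, upper)
```

Any observation below the lower limit or above the upper limit is flagged as an outlier. The same function applies unchanged to the other box-plot exercises in this section.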
(c) [Figure: box plot; axis ticks from 500 to 2500]
1.5.5
(a) Lower quartile is 80, median is 95, upper quartile is 115, and interquartile range is
35. The lower limit of outliers is 27.5 and the upper limit of outliers is 167.5.
1.5.6
Mean is 11.8
N=50
Sample Variance is 34.653
Sample Standard Deviation is 5.887
1.5.7
(a)
(b)
1.5.8
1.5.9
(a) Mean is 33.105, variance is 177.0430, and range is 48.19.
(b) Lower quartile is 24.9225, median is 32, and upper quartile is 42.985. The
interquartile range is 18.0625. The lower limit of outliers is -2.17125 and the upper limit of
outliers is 70.07875. Therefore, there are no outliers.
(c) [Figure: box plot; axis ticks from 10 to 50]
(d) [Figure: histogram of y; vertical axis: Frequency, 0 to 8; horizontal axis: 0 to 60]
1.5.10
(a) Mean is 8.34, variance is 24.21477, and range is 16.7.
(b) Lower quartile is 3.65, median is 8.1, and upper quartile is 12.55. The interquartile
range is 8.9. The lower limit of outliers is -9.7 and the upper limit of outliers is 25.9.
Therefore, there are no outliers.
(c) [Figure: box plot; axis ticks from 0 to 15]
(d) [Figure: histogram of y; vertical axis: Frequency, 0 to 10; horizontal axis: 0 to 15]
(e)
1.5.11
(b) By assuming bell-shaped distribution, from empirical rule we can say that
1.5.12
(a) Lower quartile is 39, median is 41, and upper quartile is 46. The interquartile range
is 7. Mean is 41.8 and standard deviation is 11.30192.
(b)
(c) [Figure: box plot; axis ticks from 20 to 60]
(d)
The lower limit of outliers is 28.5 and the upper limit of outliers is 56.5. Therefore, there are
two outliers, 18 and 60.
1.5.13
(a) Mean is 3.7433, Variance is 3.501 and Standard Deviation is 1.871323.
(b) Frequency table
(c)
For the grouped data, mean is 3.69, variance is 3.62, and standard deviation is 1.9.
The results are similar to those for the ungrouped data.
1.5.14
(a) Mean is 60.47, Variance is 685.0851 and Standard Deviation is 26.17413.
(b)
SN   Class   Frequency
1    0-19    1
2    20-39   6
3    40-59   8
4    60-79   5
5    80-99   10
(c)
Class Interval fi mi
$\bar{x} = \frac{\sum f_i m_i}{n} = 61.333$
$s = 25.0149$
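The grouped-data values 61.333 and 25.0149 can be reproduced with a short Python sketch. The midpoint column did not survive, so the midpoints 10, 30, 50, 70, 90 used here are an assumption: they are the values that reproduce the reported results when combined with the frequencies from part (b) and the n - 1 divisor for the variance.

```python
import math

freqs = [1, 6, 8, 5, 10]      # class frequencies from part (b)
mids = [10, 30, 50, 70, 90]   # ASSUMED class midpoints (not in the source)

n = sum(freqs)

# Grouped mean: sum of f_i * m_i divided by n.
mean = sum(f * m for f, m in zip(freqs, mids)) / n

# Grouped sample variance: sum of f_i * (m_i - mean)^2 divided by n - 1.
var = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids)) / (n - 1)
sd = math.sqrt(var)

print(round(mean, 3), round(sd, 4))
```

Midpoints of 9.5, 29.5, and so on, taken from the printed class limits, would give a slightly different mean of about 60.83, which suggests the solution treated the classes as 0-20, 20-40, and so on.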
1.5.15
Class   fi        Cumulative fi   Cumulative fi/n
10-14   895       895             0.0017
15-19   55,373    56,268          0.1093
20-24   122,591   178,859         0.3475
25-29   139,615   318,474         0.6188
30-34   127,502   445,976         0.8665
35-39   68,685    514,661         1.0000
Grouped-data median: $M = 27.248$
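The value 27.248 is consistent with the grouped-data median formula M = L + (w / f_m)(0.5n - C_b), where L is the lower limit of the median class, w its width, f_m its frequency, and C_b the cumulative frequency before it. The sketch below makes two assumptions that reproduce the reported value: L is the printed lower limit (25) and w is the difference of the printed limits (29 - 25 = 4).

```python
from itertools import accumulate

classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]
freqs = [895, 55373, 122591, 139615, 127502, 68685]

cum = list(accumulate(freqs))  # cumulative frequencies
n = cum[-1]

# Median class: the first class whose cumulative frequency reaches n / 2.
idx = next(i for i, c in enumerate(cum) if c >= n / 2)
low, high = classes[idx]
width = high - low             # ASSUMPTION: w taken from the printed limits
c_before = cum[idx - 1] if idx > 0 else 0

median = low + (width / freqs[idx]) * (n / 2 - c_before)
print(classes[idx], round(median, 3))
```

The first cumulative relative frequency is 895 / 514,661, which is about 0.0017.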
1.5.16
(a) Mean is 177.5, Variance is 134.694 and Standard Deviation is 11.6058.
(b)
1.5.17
(a) Mean is 44.27, Variance is 536.15 and Standard Deviation is 23.15.
(b)
Exercises 1.8
1.8.1
(a) [Figure: histogram of y; vertical axis: Frequency, 0 to 20; horizontal axis: 66 to 80]
(b) Mean is 74.0625, median is 74, variance is 7.223892 and standard deviation is 2.68773.
(c) [Figure: box plot; axis ticks from 66 to 80]
The lower limit of outliers is 66 and the upper limit of outliers is 82. Therefore, there are no
outliers.
1.8.2
(a) [Figure: histogram of y; vertical axis: Frequency, 0 to 10; horizontal axis: y, 0 to 50]
(b) Mean is 20.16667, median is 18, variance is 125.7299 and standard deviation is 11.21293.
(c) [Figure: box plot; axis ticks from 0 to 40]
The lower limit of outliers is -16.25 and the upper limit of outliers is 57.75. Therefore, there
are no outliers.