Group Mid-Term Exam: Ministry of Education and Training National Economics University 000
Ha Noi - 02/2021
Question 1:
1. Make a frequency table for the variable. Does the frequency table make sense?
Does it make sense to make a histogram of the variable? A bar chart?
The variable we chose is “income”, which records the total family income in
the year before the survey.
FREQUENCY TABLE
Statistics
=> A frequency distribution records the number of times each value occurs and
is presented in the form of a table. However, the frequency table here lists so
many distinct values that it is hard to identify a trend or draw a conclusion
from it. Therefore, in our opinion, this frequency table does not make sense,
because readers would struggle to interpret it fully.
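To make concrete what such a table contains, the following is a minimal Python sketch that builds a frequency table (count, percent, cumulative percent). The function name `frequency_table` and the sample codes are hypothetical and are not the survey data; the table in this report was produced in SPSS.

```python
from collections import Counter

def frequency_table(codes):
    """Return rows of (value, count, percent, cumulative percent)."""
    counts = Counter(codes)
    n = len(codes)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100 * counts[value] / n
        cumulative += percent
        rows.append((value, counts[value], round(percent, 1), round(cumulative, 1)))
    return rows

# Hypothetical category codes, for illustration only (not the survey data).
sample_codes = [1, 2, 2, 3, 3, 3, 3, 4, 4, 4]
rows = frequency_table(sample_codes)

for value, count, percent, cumulative in rows:
    print(value, count, percent, cumulative)
```

With only four distinct codes the table is easy to read; with a couple of dozen categories, as for “income”, the same table becomes much harder to scan.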
HISTOGRAM
BAR CHART
Advanced Finance 61B
=> A bar chart is used to show the distribution of data points or to compare
values across groups. From a bar chart, we can see which groups are the highest
or most common and how the other groups compare with them. Even though there are
many values on the X-axis of the bar chart, we can still see the overall trend
and compare the values in order to draw a conclusion. In addition, the
missing-values column (labelled “REFUSED”) is shown, which helps readers
understand the graph more clearly. Hence, the bar chart does make sense in this
case.
2. What is the scale of measurement for the variable?
The scale of measurement for the variable “income” is ordinal because it has
been divided into several ranked categories, whose codes are not mathematically
measured or determined but are merely assigned as labels.
Measures of Frequency
- Count, Percent, Frequency
- Shows how often something occurs
- Use this when you want to show how often a response is given

Measures of Central Tendency
- Mean, Median, and Mode
- Locates the distribution by various points
- Use this when you want to show an average or the most commonly indicated
response

Measures of Dispersion or Variation
- Range, Variance, Standard Deviation
- Identifies the spread of scores by stating intervals
- Range = High/Low points
- Variance or Standard Deviation = difference between observed score and mean
- Use this when you want to show how "spread out" the data are. It is helpful to
know when your data are so spread out that it affects the mean

Measures of Position
- Percentile Ranks, Quartile Ranks
- Describes how scores fall in relation to one another. Relies on standardized
scores
- Use this when you need to compare scores to a normalized score (e.g., a
national norm)
=> As we can see from the table, Measures of Central Tendency are the most
appropriate for describing the data and the closest to our purpose, which is to
find the most commonly indicated response.
Statistics
TOTAL FAMILY INCOME FOR LAST YEAR
N Valid                   1354
N Missing                 65
Mean                      16.13
Median                    17.00
Mode                      24
Skewness                  -.608
Std. Error of Skewness    .066
As can be observed from the histogram drawn in part 1.1, and from the fact that
the coefficient of skewness is negative (-.608), the distribution is
NEGATIVELY SKEWED. Moreover, the frequency table shows that this is an
open-ended distribution, which means that one or more of the classes (or bins)
has no upper or lower limit. Such a class has no midpoint, so the mean cannot be
computed reliably.
=> Hence, it does NOT make sense to compute a Mean.
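As a check on the skewness figures in the Statistics table, the sketch below recomputes the standard error of skewness from the valid N alone. It uses a commonly cited formula, sqrt(6n(n-1) / ((n-2)(n+1)(n+3))); that SPSS uses exactly this formula is our assumption, although it does reproduce the reported .066. The function name is hypothetical, and the "ratio beyond about 2 in absolute value" reading is a rule of thumb rather than a formal test.

```python
import math

def skewness_standard_error(n):
    """Standard error of sample skewness for a sample of size n (n > 2)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

n_valid = 1354      # "Valid" N from the Statistics table
skewness = -0.608   # Skewness from the Statistics table

se = skewness_standard_error(n_valid)
ratio = skewness / se  # rule of thumb: |ratio| > 2 suggests real skew

print(f"standard error of skewness = {se:.3f}")
print(f"skewness / standard error  = {ratio:.2f}")
```

The ratio is far below -2, which is consistent with the conclusion that the distribution is negatively skewed.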
Advantages

Median
- Easy to understand and calculate
- Values of every item are included => representative for the whole set of data

Mode
- Easy to understand and calculate
- Not affected by extreme values
- Can be computed in an open-ended frequency table
Statistics
Median            2.00
Mode              2
Percentiles  25   1.00
             50   2.00
             75   4.00
             95   8.00
As can be seen from the results in the SPSS output view:
The value of the 25th percentile is 1.00
The value of the 50th percentile is 2.00
The value of the 75th percentile is 4.00
The value of the 95th percentile is 8.00
Both the median and the mode are 2
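For readers without SPSS, the sketch below shows one way such percentiles could be computed in Python. The helper `percentile` is hypothetical and uses the position (n + 1) * p / 100 with linear interpolation, which is one common definition; we assume, but have not verified, that it matches the SPSS default. The sample is a small made-up list of hours, not the survey data.

```python
import statistics

def percentile(values, p):
    """p-th percentile using the (n + 1) * p / 100 position rule."""
    xs = sorted(values)
    n = len(xs)
    pos = (n + 1) * p / 100          # 1-based position in the sorted list
    if pos <= 1:
        return float(xs[0])
    if pos >= n:
        return float(xs[-1])
    lower = int(pos)                 # whole part of the position
    frac = pos - lower               # fractional part, used to interpolate
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])

# Hypothetical hours of TV per day, for illustration only.
hours = [0, 1, 1, 2, 2, 2, 2, 3, 4, 4, 8]

for p in (25, 50, 75, 95):
    print(p, percentile(hours, p))

print("median:", statistics.median(hours))
print("mode:", statistics.mode(hours))
```

Other software may interpolate differently, so small differences between packages are expected for the same data.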
(d) Make a bar chart of the hours of TV watched. What problem do you see with
this display?
As can be seen from the bar graph below, most of the respondents watch TV for
1 to 4 hours per day, whereas only a small minority watch TV for more than 10
hours. As a result, the responses are not evenly distributed across the values.
Moreover, the values “9, 13, 16, 17, 18, 19, 21, 22, 23” are not included in the
bar chart because they do not appear in the survey answers (this might have
occurred because the number of respondents is not large enough). Therefore,
the problem in the bar chart below is that it does not show a gap which represents
BAR GRAPH
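The spacing issue described above can be sketched in a few lines of Python. The list of absent values is taken from the text; that the answers otherwise cover every whole hour from 0 to 24 is our assumption. A bar chart gives each observed category the next equally spaced slot, whereas a numeric axis keeps a position for every hour, including the ones nobody reported.

```python
# Values that do NOT appear in the survey answers, as listed in the text.
absent = {9, 13, 16, 17, 18, 19, 21, 22, 23}

# Assumption: all other whole hours from 0 to 24 were reported at least once.
observed = [h for h in range(0, 25) if h not in absent]

# Bar chart: one equally spaced slot per observed category.
bar_slots = {value: slot for slot, value in enumerate(observed)}

# Numeric axis: every whole hour keeps its own position, so gaps are visible.
gaps = [h for h in range(min(observed), max(observed) + 1) if h not in observed]

print("bar chart slots for 8 and 10:", bar_slots[8], bar_slots[10])
print("bar chart slots for 20 and 24:", bar_slots[20], bar_slots[24])
print("hours with no bar on a numeric axis:", gaps)
```

On the bar chart, 8 and 10 sit in neighbouring slots, and so do 20 and 24, even though the underlying values are 2 and 4 hours apart.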
(e) Make a histogram of the hours of TV watched. What causes all of the values to
be clumped together? Compare this histogram to the bar chart you generated in
question 2d. Which is a better display for these data?
HISTOGRAM