CHAPTER 2 Norms and Basic Statistics For Testing
CHAPTER 2 EXERCISES
Statistics are necessary to describe the meaning of test scores. Without statistics we cannot know whether a
test score of 15, for example, is a good score or a poor score. Further, we can use statistics to make inferences
about larger populations of individuals based on a representative sample of those individuals. Chapter 2 reviews
both descriptive statistics (such as the mean and standard deviation) and inferential statistics (such as
correlation). Even if you have completed a statistics course, this chapter will serve as a good review. Moreover,
you will discover many applications of familiar statistics that are unique to testing and measurement.
A. Properties of Scales (text pp. 27-29)
2. Measurement scales differ from one another in terms of the properties of magnitude, equal intervals,
and an absolute zero. Summarize the characteristics of each property below.
a. magnitude: ______________________________________________________________________________
__________________________________________________________________________________________
b. equal intervals: ___________________________________________________________________________
__________________________________________________________________________________________
c. absolute zero: ____________________________________________________________________________
__________________________________________________________________________________________
3. In the table below, describe the type of scale listed, indicate (with a ✓) whether the scale possesses each
of the three properties you summarized above, and give an example of the type of scale.
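Type of Scale      Description        Magnitude    Equal Intervals    Absolute Zero    Example
Nominal            ______________     ____         ____               ____             ______________
Ordinal            ______________     ____         ____               ____             ______________
Equal Intervals    ______________     ____         ____               ____             ______________
Ratio              ______________     ____         ____               ____             ______________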
4. Indicate whether each of the following scales is nominal, ordinal, equal intervals, or ratio.
Time (e.g., hours, minutes, seconds): ___________________________________________________
USDA stickers placed on cuts of meat (e.g., “prime,” “lean”): ________________________________
Political party membership: ___________________________________________________________
Grade point average (be careful with this one!): ___________________________________________
Percentage scores on an exam: _________________________________________________________
C. Permissible Operations (text pp. 30-31)
5. Mathematical manipulations cannot be applied to ________________ data. Some mathematical
operations can be applied to _______________________ data, but the results are sometimes difficult to
interpret. On the other hand, most mathematical operations can be applied to ________________________ data,
and all can be applied to __________________ data.
NOTE: Several activities in Chapters 2-6 draw upon a fabricated data set of 15 scores on a hypothetical test
called the University Aptitude Test (UAT). The UAT is a verbal analogies test consisting of 18 items. Parts of
the data set corresponding to specific activities are provided in each Workbook chapter.
UNIVERSITY APTITUDE TEST (UAT) SCORES
The table below shows the raw scores of 15 examinees on the University Aptitude Test (UAT). The UAT consists
of 18 verbal analogies items. In Chapter 2, exercises using the UAT data set emphasize descriptive statistics.
You will use answers to Chapter 2 exercises in exercises utilizing the UAT data set in subsequent chapters.
Therefore, be sure to compare your answers to the answer key to make sure you have done your calculations
correctly.

Examinee    UAT Raw Score
Greg        16
Allison     7
Janine      10
Corey       17
Michelle    3
Thomas      11
Randall     14
Tina        10
LeeAnn      13
David       12
Marcia      4
Lance       9
Keisha      15
Blair       12
Joe         6
1. Using the template below, create a histogram depicting the distribution of UAT raw scores. The class
interval is presented on the horizontal (x) axis and frequency of occurrence is presented on the vertical (y) axis.
[Histogram template: vertical (y) axis labeled “Frequency of Occurrence,” marked from 1 to 6; horizontal (x) axis for the class intervals]
After reading through text pp. 34-38 (and Box 2-1), you should be ready to calculate the percentile rank of
selected UAT scores. Begin by arranging the scores in ascending order (the method suggested in Box 2-1) or in
descending order (the method the Workbook author prefers!). Then use the formula for percentile rank shown
below to find the percentile ranks corresponding to the UAT raw scores listed.
$P_r = \frac{B}{N} \times 100$
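The percentile rank formula can be checked with a few lines of Python. This is an illustrative sketch, not part of the Workbook. It assumes that B counts the scores strictly below the score of interest and that N is the total number of scores, which agrees with the worked example for Janine (5 of the 15 scores fall below 10). The function name `percentile_rank` is hypothetical.

```python
def percentile_rank(score, scores):
    """Return Pr = B / N x 100, where B is the number of scores below `score`."""
    b = sum(1 for x in scores if x < score)  # B: scores strictly below the score of interest
    n = len(scores)                          # N: total number of scores
    return b / n * 100

# UAT raw scores from the table of 15 examinees
uat_scores = [16, 7, 10, 17, 3, 11, 14, 10, 13, 12, 4, 9, 15, 12, 6]

janine = percentile_rank(10, uat_scores)
print(round(janine, 2))  # 33.33, matching the worked example (5/15 x 100)
```

Tied scores are not counted in B here; a different convention for ties would give slightly different ranks.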
2. Calculate and record the percentile ranks corresponding to the following examinees’ raw scores in the
table below. The first is completed as an example.
Examinee Name Raw Score (Xi) B/N x 100 Percentile rank
Janine 10 5/15 x 100 33rd (or 33.33)
Keisha
Thomas
Joe
Corey
Marcia
1. Percentiles are very closely related to percentile ranks. Percentiles are expressed as raw scores below
which a certain percentage of scores fall. Identify the following percentiles from the UAT data set.
a. What UAT score is approximately the 33rd percentile? _______________
b. What UAT score is approximately the 13th percentile? _______________
c. What UAT score is approximately the 93rd percentile? _______________
2. Imagine that you are a high school guidance counselor. One morning, Blair walks into your office and
asks about her UAT score. Obviously, telling Blair “you got a 12” will mean nothing to her. Write a response to
Blair’s question using percentile rank and percentile. _______________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
VI. DESCRIBING DISTRIBUTIONS (text pp. 39-53)
A. Mean (text pp. 39-40)
B. Standard Deviation (text pp. 40-42)
1. The mean is the average score in a distribution. What is the mean UAT score? ___________________
The standard deviation is an approximation of the average deviation around the mean. Standard deviation is
an extremely important statistic in psychological testing. Questions 2 through 6 deal with standard deviation.
Data sets 1, 2, and 3 (the six numbers presented in columns under Set 1, Set 2, and Set 3) are scores on an
8-item math test administered to three different classes of second grade students who just completed a unit on
arithmetic. The mean score of all three data sets (or all three arithmetic tests) is 4. If we used only the mean to
assess how much arithmetic students in the three classes learned, we might be tempted to conclude that all 18
students learned about the same amount. However, simply looking at the three data sets is enough to tell us that
the three classes of students are not the same in terms of their performance on the arithmetic test!
2. On the templates shown above, draw the frequency distributions for each of the three data sets. These
distributions graphically represent the fact that even though the data sets have the same mean score, 4, they are
different in terms of __________________________ (or spread).
3. At first, it seems as though measuring how much variation there is in a data set would be easy – all one
would do is subtract the mean score from each score in the data set ($X - \bar{X}$). One could add up the “variations
(more accurately, deviations) around the mean” and calculate the average deviation around the mean. Try this
method of calculating variation or “spread” below.
Σx1 = ____, Σx1/N = ____ Σx2 = ____, Σx2/N = ____ Σx3 = ____, Σx3/N = ____
4. What you discovered is that the method used on the previous page to calculate variation or “spread” is
not very useful! A better method might be to square the deviations around the mean – this will get rid of all the
negative numbers. Go back to the previous page and square each of the values you found earlier (6 in each data
set). Write these new values next to the old ones. Finally, add up the “squared deviations around the mean” and
then calculate the average squared deviation around the mean: this is called the variance.
a. variance of Set 1: ________________
b. variance of Set 2: ________________
c. variance of Set 3: ________________
5. The method you used in the previous exercise to estimate variation is better, but it still isn’t as useful as
it could be. After all, what does “squared deviation” really mean? What does a “squared deviation” look like?
We can resolve this problem by taking the square root of the variance. This method will bring the resulting
number back into the real world of raw score units, yielding an approximation of the average deviation around
the mean, and this is the standard deviation (voilà!).
a. standard deviation of Set 1: __________________
b. standard deviation of Set 2: __________________
c. standard deviation of Set 3: __________________
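The three steps in exercises 3 through 5 (raw deviations, squared deviations, square root) can be traced in a short Python sketch. The six scores below are hypothetical and were chosen only so that their mean is 4; they are not one of the Workbook's data sets.

```python
import math

# Hypothetical six-score data set with a mean of 4 (NOT one of the Workbook's sets)
scores = [1, 2, 4, 4, 6, 7]

n = len(scores)
mean = sum(scores) / n

# Step 1: raw deviations around the mean cancel out
deviations = [x - mean for x in scores]
print(sum(deviations))            # 0.0

# Step 2: squaring removes the negative signs; the average is the variance
squared = [d ** 2 for d in deviations]
variance = sum(squared) / n       # divides by N, as in exercise 4

# Step 3: the square root returns to raw-score units
sd = math.sqrt(variance)

print(round(variance, 2), round(sd, 2))  # 4.33 2.08
```

The first printed value shows why simply averaging the deviations is not useful: positive and negative deviations always sum to zero.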
Whether you realize it or not, you have just applied the formula $\sqrt{\frac{\sum (X - \bar{X})^2}{N}}$ to calculate the standard
deviations of the three data sets! But…there’s just one problem. As you have read on text pp. 41-42, the
formula shown above is used to calculate the standard deviation of a population, not a sample. In psychological
testing, you will almost always use the formula that estimates the standard deviation of a sample, which looks
like this:
$S = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$   or like this:   $S = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N - 1}}$   (note: in most cases, this formula is much easier!)
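The two sample formulas always give the same value of S. The Python sketch below illustrates this under stated assumptions: the function names are hypothetical, and the scores are made up for illustration rather than taken from the Workbook's data sets.

```python
import math

def sample_sd_deviation(scores):
    """S from deviations: sqrt(sum((X - mean)^2) / (N - 1))."""
    n = len(scores)
    mean = sum(scores) / n
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))

def sample_sd_raw(scores):
    """S from raw scores: sqrt((sum(X^2) - (sum(X))^2 / N) / (N - 1))."""
    n = len(scores)
    sum_x = sum(scores)                    # sum of X
    sum_x2 = sum(x ** 2 for x in scores)   # sum of X squared
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Hypothetical scores, chosen only for illustration
scores = [1, 2, 4, 4, 6, 7]
print(round(sample_sd_deviation(scores), 2), round(sample_sd_raw(scores), 2))  # 2.28 2.28
```

The raw score version needs only N, the sum of the scores, and the sum of the squared scores, which is why it is often quicker by hand.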
6. Go back and re-calculate the standard deviation for a sample this time. Use either formula shown above
(although in this particular case, the one on the left may be easier).
a. S of Set 1 = __________________
b. S of Set 2 = __________________
c. S of Set 3 = __________________
After completing the exercise above, go back and look at the frequency distributions of the three data
sets you created earlier. Note that the set of data with the largest variance and standard deviation is also
spread out a lot more than the other two. Whenever you encounter two or more data sets with
approximately the same mean, but very different standard deviations, bring the image of these
distributions into your mind’s eye! For many students, visualizing statistics increases understanding of
statistics.
7. Use the table below to find the values you need to calculate the standard deviation of UAT scores.
Try using the raw score equivalent formula, $S = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N - 1}}$.

Examinee    UAT Raw Score (X)    X²
Greg        16                   ____
Allison     7                    ____
Janine      10                   ____
Corey       17                   ____
Michelle    3                    ____
Thomas      11                   ____
Randall     14                   ____
Tina        10                   ____
LeeAnn      13                   ____
David       12                   ____
Marcia      4                    ____
Lance       9                    ____
Keisha      15                   ____
Blair       12                   ____
Joe         6                    ____

N = ____    ΣX = ____    ΣX² = ____
S = ___________
How are the mean and standard deviation used to interpret test scores?
If you know the mean and standard deviation of scores on a test, you can interpret an examinee’s test
score in relation to the scores of others who have taken the test. The mean tells you what an “average” score is
and the standard deviation can tell you if a score is more than just an average distance (or deviation) away from
the mean.
For example, let’s say several examinees were administered a measure of loneliness. On this measure,
higher scores are associated with feeling more lonely. The mean score on the loneliness measure is 21 and the
standard deviation is 3.5. If an examinee obtained a score of 21 on the loneliness measure, we would know that
he or she probably feels an average level of loneliness. In fact, examinees who score within ±1 standard
deviation from the mean (i.e., who obtain any score between 17.5 and 24.5) would still be considered fairly
average, since these examinees’ scores fall within the “average deviation from the mean.” On the other hand, if
an examinee obtained a score of 29 on the loneliness measure, we might be very concerned about how lonely he
or she was feeling. A score of 29 on this measure falls more than 2 standard deviations above the mean,
suggesting that this examinee is significantly more lonely than others who took the test.
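The arithmetic behind the loneliness example can be sketched in Python. The mean (21), standard deviation (3.5), and score (29) come from the example; the function name `sds_from_mean` is hypothetical.

```python
def sds_from_mean(score, mean, sd):
    """Return how many standard deviations `score` lies from the mean."""
    return (score - mean) / sd

mean, sd = 21, 3.5            # loneliness measure values from the example

low, high = mean - sd, mean + sd
print(low, high)              # 17.5 24.5: the "fairly average" range

print(round(sds_from_mean(29, mean, sd), 2))  # 2.29: more than 2 SDs above the mean
```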
8. After reading the box above, identify the UAT scores that fall ±1 standard deviation (SD) from the
mean. Then identify scores that fall ±2 SDs from the mean.
10. Z scores transform raw test scores into ___________________________________ units that are easier
to interpret.
As you will discover a bit later in this chapter, the mean of the Z score distribution is set at 0 and the
standard deviation is set at 1. This means no matter what the raw score mean and standard deviation of a
distribution happens to be, when those raw scores are converted to Z scores the distribution will have a mean of
0 and a standard deviation of 1. Always. So, if an examinee has a Z score of 0, we know that his or her raw
score was right at the mean of the original (untransformed) score distribution. If an examinee has a Z score of
-1, it means that his or her raw score was exactly 1 standard deviation below the mean in the original score
distribution.
The bottom line is, the Z score is probably among the most efficient and useful scores you will ever
encounter! The Z score is efficient because it combines information about the distribution’s mean and the
standard deviation into one score. The Z score is useful for many reasons, and here is just one of them: if two (or
more) sets of raw test scores with different means and standard deviations are converted into Z scores, it is
possible to compare relative performance across the two (or more) tests because both sets of scores are expressed
in the same, or standardized, units. [Isn’t that brilliant?]
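The claim that Z scores always have a mean of 0 and a standard deviation of 1 can be verified with a short Python sketch. It assumes the sample standard deviation S (dividing by N - 1) is used, as the Workbook recommends for psychological testing; the function name `z_scores` is hypothetical. Applied to the UAT data, it reproduces the Z score of -1.09 that the Workbook gives for Joe in exercise 11.

```python
import math

def z_scores(scores):
    """Convert raw scores to Z scores using the sample standard deviation S."""
    n = len(scores)
    mean = sum(scores) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    return [(x - mean) / s for x in scores]

# UAT raw scores in the order listed in the Workbook table (Joe is last)
uat_scores = [16, 7, 10, 17, 3, 11, 14, 10, 13, 12, 4, 9, 15, 12, 6]
z = z_scores(uat_scores)

print(round(z[-1], 2))  # Joe's Z score: -1.09
```

Passing the Z scores back through the same function returns them essentially unchanged, because they already have a mean of 0 and a standard deviation of 1.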
11. After reading through text pp. 42-45 (including the CES-D example), calculate and record the Z scores
of the four examinees listed in the table below.
Examinee    UAT score (X)    $X - \bar{X}$    $Z = \frac{X - \bar{X}}{S}$
Joe         ____             ____             Z = ______ = -1.09
Keisha
Thomas
Corey
Here’s the fun part! Compare the Z scores you calculated above with your interpretations of the four
examinees’ raw scores in exercise 9 above. You should be able to see the resemblance!
D. Standard Normal Distribution (text pp. 45-50)
Examine the figure below, which is adapted from Figure 2-7 on text p. 46. This figure shows the standard
normal distribution (also known as the “bell curve”). Theoretically, this bell-shaped curve describes the
distribution in the population of scores on measures of most psychological constructs, including intelligence and
personality traits. There are several things to note about this figure:
1. Units on the horizontal (x) axis are expressed as Z scores. As you know, Z scores have a mean of 0
and a standard deviation (SD) of 1.
2. The numbers inside the curve show the proportion of cases (or scores) that are expected to fall
within ±1 SD from the mean, ±2 SDs from the mean, etc. These proportions can easily be turned into
percentages – just multiply by 100. For example, the percentage of cases (or scores) we would expect to find
between the mean score and the score that is one SD above the mean is 34.13% (because .3413 x 100 = 34.13).
3. The shaded area ±1 SD from the mean contains about 68% of cases (or scores) in the normal
distribution. Scores in this range are considered relatively average.
4. Scores that are more than ±1 SD from the mean are, by definition, less frequently observed in the
population. As the figure shows, less than 16% of scores in the distribution are in the area above +1 SD from the
mean, and less than 16% are below -1 SD from the mean (because .1359+.0214+.00135=.15865, or 15.87%, in each
tail). Therefore, a score that falls outside the shaded area in either direction might be considered significantly
above average or significantly below average. Obviously, scores that are ±2 SDs and ±3 SDs from the mean are
considered even more unusual. Less than 3% of scores are above +2 SDs, and less than 3% are below -2 SDs from
the mean; less than 1% of scores are above +3 SDs, and less than 1% are below -3 SDs from the mean.
_________
12. When scores are normally or near-normally distributed, it is possible to convert Z scores into percentile
ranks. For example, a Z score of 0 converts to the 50th percentile rank (since 50% of scores fall below a Z score
of 0; see figure above). Identify the percentile ranks associated with the following Z scores below.
Because “real world” Z scores are rarely whole numbers like -3.00 or +1.00 (and are more often like -2.67
or +0.54), most Z score-to-percentile rank conversions require the use of special tables like those in Appendix 1
in the back of your text. Exercises 13 through 15 will give you practice using these tables.
Turn to Part I of Appendix 1 now. This table simply lists Z scores and corresponding percentile ranks.
13. In a normal distribution, a Z score of .90 has a percentile rank of 81.59 (in other words, almost 82% of
scores fall below a Z score of .90). Make sure you can find this value on Part I of Appendix 1. Then find the
percentile ranks associated with the following Z scores:
Z Score Percentile Rank
-2.00 ____________
-1.30 ____________
-0.50 ____________
Z Score Percentile Rank
+.50 ____________
+1.80 ____________
+2.00 ____________
Turn to Part II of Appendix 1, in the back of your text, now. This table allows you to convert more
specific Z scores (like +2.67 rather than +2.6) to percentile ranks. The four-digit numbers inside the table show
the proportion of scores that are between a Z score of 0 (or the mean) and the absolute value of the Z score of
interest. You can turn these proportions into percentages – just multiply by 100.
For example, let’s say you want to convert a Z score of +1.46 into a percentile rank. Find “1.4” in the
far left column (and put your finger on it) and then find “.06” in the row at the top (and put your finger on it).
Now, move one finger across the table and the other finger down the table – you should find a value of .4279.
You can turn that proportion into a percentage: 42.79%. This is the percentage of scores between a Z score of 0
(or the mean) and a Z score of 1.46. But you are not finished yet! You are interested in converting a Z score of
+1.46 into a percentile rank. Because 50% (or .5000) of scores fall below a Z score of 0, you must add .5000
and .4279. This sum equals .9279, so the percentile rank is 92.79. In other words, 92.79% of scores fall below a
Z score of +1.46.
Now let’s say you want to convert a negative Z score, such as -1.46, into a percentile rank. Follow the
same procedure as you did to find the Z score of +1.46—you will still find a value of .4279. Remember, this
number tells you the proportion of scores between a Z score of 0 (or the mean) and a Z score of 1.46. Because
you are interested in converting a Z score of -1.46 into a percentile rank, you must subtract .4279 from .5000 to
find the proportion of scores that fall below a Z score of -1.46. Since .5000-.4279=.0721, the percentile rank
corresponding to a Z score of -1.46 is 7.21. In other words, only 7.21% of scores are below a Z score of -1.46.
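The table lookups described above can be cross-checked in Python. This sketch is an alternative to Appendix 1, not the Workbook's method: it uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later), whose `cdf` method returns the proportion of a normal distribution falling below a value. The function name `z_to_percentile_rank` is hypothetical.

```python
from statistics import NormalDist

def z_to_percentile_rank(z):
    """Percentage of scores below `z` in a standard normal distribution."""
    return NormalDist().cdf(z) * 100   # NormalDist() defaults to mean 0, SD 1

print(round(z_to_percentile_rank(1.46), 2))   # about 92.79, as in the worked example
print(round(z_to_percentile_rank(-1.46), 2))  # about 7.21
```

Because the cumulative proportion already includes the 50% of scores below the mean, no separate step of adding to or subtracting from .5000 is needed.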
Important note! Remember that you can convert Z scores into percentile ranks (and vice versa) using
Part II of Appendix 1 only when scores are normally or near-normally distributed. Let’s say you found the
percentile rank of a raw score by using the formula Pr=B/N x 100. Now let’s say you found the Z score using the
formula $Z = (X - \bar{X})/S$. If the raw score distribution is normal or nearly normal, the percentile rank you find in
Part II of Appendix 1 based on the Z score you calculated will be the same as (or very close to) the percentile rank
you found using the formula. On the other hand, if the raw score distribution is not normally distributed, the
percentile rank you find in Part II of Appendix 1 based on the Z score you calculated will be different from the
percentile rank you found using the formula. The degree to which the percentile ranks differ will depend on the
extent to which the raw score distribution deviates from a normal distribution of scores.
14. Using Part II of Appendix 1, find the percentile rank associated with the following Z scores.
15. You can also use Part II of Appendix 1 to find Z scores associated with particular percentile ranks. For
example, suppose you want to find the Z score that corresponds to the 82nd percentile rank. Because you know
the Z score is going to be positive (since the percentile rank is above the 50th), you need to find the value in the table
closest to .8200-.5000, or .3200. The value in the table closest to .3200 is .3212. This value corresponds to a Z
score of +.92. Make sure you can find this Z score.
Now let’s say you want to find the Z score that corresponds to the 31st percentile rank. Because you
know this Z score is going to be negative (since the percentile rank is below the 50th), you need to find the value
closest to .5000-.3100, or .1900. The value closest to .1900 in the table is .1915. This value corresponds to a Z
score of -.50. Make sure you can find this Z score.
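The reverse lookup can also be sketched in Python, again as a cross-check rather than a replacement for the table. It assumes `statistics.NormalDist.inv_cdf` from the standard library (Python 3.8 or later), which returns the value below which a given proportion of a normal distribution falls. The function name `percentile_rank_to_z` is hypothetical.

```python
from statistics import NormalDist

def percentile_rank_to_z(percentile_rank):
    """Z score below which `percentile_rank` percent of a normal distribution falls."""
    return NormalDist().inv_cdf(percentile_rank / 100)

print(round(percentile_rank_to_z(82), 2))  # 0.92
print(round(percentile_rank_to_z(31), 2))  # -0.5
```

The unrounded results differ slightly from the table answers because the table forces a choice of the closest listed proportion; rounded to two decimals, they agree with the two worked examples.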
Using Part II of Appendix 1, find the Z score associated with the following percentile ranks.
62nd ___________
44th _______
29th ___________
75th _______
First, go back to the figure on Workbook page 25 showing the standard normal distribution with Z
scores indicated on the x axis. Underneath the label “Z scores” you will see a blank line. On that line, write “T
scores.” Then, under the specific Z scores indicated, write the corresponding T score. For example, under the Z
score of 0, write “50.” Under the Z score of -3, write “20,” and so on.
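The examples given (a Z score of 0 becomes 50, and a Z score of -3 becomes 20) follow from rescaling Z scores to a mean of 50 and a standard deviation of 10. A minimal Python sketch of that conversion, with the hypothetical function name `z_to_t`:

```python
def z_to_t(z):
    """McCall's T score: a Z score rescaled to a mean of 50 and an SD of 10."""
    return 10 * z + 50

for z in (-3, 0, 1):
    print(z, z_to_t(z))   # -3 -> 20, 0 -> 50, 1 -> 60
```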
Next, find the T scores associated with the following Z scores:
Z Score    T Score
-2.28      _______________
-1.77      _______________
-0.63      _______________
+.19       _______________
+2.55      _______________
+3.06      _______________
17. Quartiles divide the frequency distribution into equal ____________, with Q1 at the 25th percentile, Q2 at
the __________ percentile, Q3 at the ___________ percentile, and Q4 at the ____________ percentile.
18. Deciles divide the frequency distribution into equal tenths, with D2 at the __________ percentile, D5 at the
_____________ percentile, D7 at the _____________ percentile, etc.
19. The stanine system converts scores to a scale ranging from ____ to ____; the term stanine comes from
________________________. Stanine distributions have a mean of ______ and a standard deviation of ______.
1. Norms are used to interpret examinees’ scores relative to the scores of individuals making up the
standardization sample (also called the norm group). Both the mean and standard deviation are examples of norms.
What are other examples? _____________________________________________________________
2. What assumptions are made about the relationship between characteristics of a test’s standardization
sample (or norm group) and characteristics of current test-takers? _____________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
3. Most intelligence tests, such as the Wechsler scales and the Stanford-Binet scales, have different normative
groups for particular ________ groups.
B. Tracking (text pp. 55-59)
4. Pediatricians often use age-related norms to assess whether children’s growth (in height and weight) is
occurring at expected rates. As they grow older, children tend to stay at about their same percentile level, relative to
other children in their age group, in terms of their height and weight. This tendency is called
___________________.
5. Pediatricians use charts such as Figure 2-9 on p. 57 and Figure 2-10 on p. 58 to assess whether children are
growing at the expected rate. For example, Figure 2-9 (the tracking chart for boys) shows that a 9-month-old baby
boy at the 50th percentile for length is about 28.5 inches long. If the baby grows as expected (i.e., stays at the 50th
percentile), at 27 months he should be about 35.5 inches long. If a child does not stay on his “track” for growth, the
pediatrician will take a closer look at potential causes and interventions, if necessary.
Look at Figure 2-10 (p. 58), the tracking chart for girls’ growth. Complete the table below which
summarizes length (height) information for four baby girls. Be sure to indicate whether the baby is growing at the
expected rate (i.e., has roughly maintained her percentile rank).
6. If you were a pediatrician or a parent, which of the baby girls might you be concerned about? Why?
__________________________________________________________________________________________
__________________________________________________________________________________________
8. What are criterion-referenced tests, and how are the results of these tests used? __________________
_____________________________________________________________________________________________
_____________________________________________________________________________________________
_____________________________________________________________________________________________
Key Terms and Concepts to Know from Chapter Two
descriptive statistics
inferential statistics
properties of scales: magnitude
properties of scales: equal intervals
properties of scales: absolute zero
nominal scale
ordinal scale
interval scale
ratio scale
frequency distribution
percentile rank
percentile
variable
mean
standard deviation
variance
Z score
standard normal distribution
McCall’s T score
quartiles
median
deciles
stanine system
norms
tracking
overselection
Section 106 of Civil Rights Act of 1991
norm-referenced test
criterion-referenced test
CHAPTER 2 PRACTICE QUIZ
1. If you lined children up according to their weight, from highest to lowest, you would be using a(n)
_________________ scale.
a. interval
b. ordinal
c. nominal
d. ratio
3. The score in the exact middle of the distribution of scores, such that equal numbers of scores fall
above and below it, is the _______.
a. mean
b. median
c. standard score
d. standard deviation unit
a. you are in the group of 13 top-scoring people who took the test.
b. 87% of the students got a score lower than your score.
c. you got 87% of the items on the test correct.
d. 87% of the students in the class scored higher than you did.
a. made it illegal for employers to use separate race-related norms for employment testing.
b. made it legal for employers to use different cut-off scores on employment-related tests for the
purposes of increasing racial diversity in the workplace.
c. made overselection of particular racial groups illegal in employment settings.
d. was deliberately vague regarding how employers could use tests to select employees.
Questions 6-9 relate to subjects’ scores on the Green Test, described in the box below.
Several subjects have just taken a test designed to measure the tendency to be envious of others, called the “Green
Test.” On the Green Test, higher scores indicate a greater tendency to experience envy. The test manual indicates
that the original sample distribution (which conformed to the normal distribution) had a raw score mean of 20 and a
standard deviation of 2.
6. Pete received a raw score of 21 on the Green Test. His z-score would be _____ and his T-score would
be _______.
a. +1.50; 60
b. -1.50; 35
c. -.50; 40
d. +.50; 55
7. If Margie received a z-score of -1.00 on the Green Test, her percentile rank would be
a. 16.
b. 50.
c. 84.
d. 97.
8. Daniel’s percentile rank on the Green Test was 50. Therefore, his raw score was _____.
a. 18
b. 20
c. 22
d. 24
9. Which is the correct order of subjects’ tendency to be envious, as measured by the Green Test, from
least envious to most envious?
10. Although the use of ____________________ is accepted in medical settings, it is much more
controversial in educational settings.
a. Z score transformations
b. criterion-referenced tests
c. tracking
d. the stanine system
Student Workbook Assignment 2.1
This assignment provides a “real world” problem using Z scores and percentiles. It is modeled after “An example
close to home” on pp. 49-50 of your text. It is very important that you go through this example carefully before you
begin this assignment (including referring to Part II of Appendix 1 in the back of your text).
DIRECTIONS
Before beginning this assignment, make sure you have read Chapter 2.
Imagine that you are a teaching assistant for a professor who would like you to convert raw test scores into Z
scores and then into letter grades. The professor gives you the following information about the test:
c. The professor grades on a curve that employs the following percentile cutoffs:
Grade Percentiles
A 84-100
B 65-83
C 18-64
D 10-17
F 0-9
Student #   Score     Student #   Score
1           13        14          16
2           11        15          13
3           15        16          6
4           13        17          13
5           16        18          9
6           13        19          12
7           9         20          12
8           12        21          11
9           10        22          14
10          19        23          15
11          13        24          12
12          8         25          17
13          14        26          14
(1) First, calculate the mean ______________ and standard deviation _______________.
(2) In order to determine what grade each student should receive, you must first convert each of the
professor’s percentile cutoffs to z-score cutoffs. First, find the value from Appendix 1, Part II, that corresponds
to the lowest percentile rank for the particular grade. Then write the Z score cutoff. The first is done for you as
an example.
Value from
Grade Appendix 1 Part II Z-score cutoff
A .3389 +.99
B ______________ ______________
C ______________ ______________
D ______________ ______________
F ______________ ______________
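The link between percentile cutoffs, Z scores, and letter grades can be illustrated with a Python sketch. It is a cross-check under stated assumptions, not the assignment's table-based procedure: it assumes scores are normally distributed, converts a Z score directly to a percentile rank with `statistics.NormalDist` (standard library, Python 3.8 or later), and compares that rank with the lowest percentile for each grade. The names `GRADE_FLOORS` and `grade_for_z` are hypothetical.

```python
from statistics import NormalDist

# Lowest percentile rank for each grade, taken from the professor's cutoffs
GRADE_FLOORS = [("A", 84), ("B", 65), ("C", 18), ("D", 10), ("F", 0)]

def grade_for_z(z):
    """Return the letter grade whose percentile range contains the Z score."""
    percentile_rank = NormalDist().cdf(z) * 100
    for grade, floor in GRADE_FLOORS:      # ordered from highest floor to lowest
        if percentile_rank >= floor:
            return grade
    return "F"

print(grade_for_z(1.00))   # A (about the 84.13th percentile)
print(grade_for_z(0.00))   # C (50th percentile)
```

Z scores very close to a cutoff may be graded differently here than with the table, because the table rounds Z scores to two decimals.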
(3) Now determine the z-scores and letter grades for each of the 8 students listed below.