Psych Testing Issues

This document provides an overview of key issues in psychological testing and measurement. It discusses topics like reliability, validity, bias and different types of tests. It defines the difference between psychological testing which measures specific variables, and assessment which is more comprehensive. It emphasizes that for tests to be considered valid and useful, they must demonstrate reliability, validity, lack of bias, standardization and normalization. It also discusses sources of error and bias that can influence test results. A wide range of psychological constructs and types of tests are used for assessment purposes.

Abstract

A summary of the important issues associated with psychological tests and measurements is offered.
Topics such as reliability, validity, bias, and errors are examined. Intellectual, personality,
neuropsychological, and disability and workplace assessment are briefly discussed. After careful review,
the author then offers a critical analysis of relevant testing and measurement practices and theory,
culminating in a synthesis of ideas into a modern psychological perspective on the nature, use, and
purpose of psychological assessment.

A Summary, Critical Analysis, and Synthesis of Issues and Aspects in Psychological Tests and Measurements
Human curiosity is a natural fact. We see images, we manipulate objects, we communicate ideas and
emotions, and we try desperately to organize our world into a neat little sphere that we can hold in
our hands. Humans organize to understand, categorizing everything possible in order to find our place
in this fragile world and the immensity of the universe.
As we investigate outward, we are drawn ever back into ourselves and find that an organization of our
minds and emotions is inevitable. The field of psychology is challenged with the awesome task of
unveiling mysteries deep within the human condition. Psychological assessment then is the light by
which this discipline must find its path. However, the tools of assessment that psychologists use are
never perfect; they sometimes paint pictures that are misleading. Constant discourse,
scrutiny, and investigation will move future knowledge toward a more complete interpretation
of the variance of the human condition. The road is dark and changing. We must be clever in the use of
our light, patient with what it shows us, and ever mindful to challenge the truths we might see.
Summary of Psychometrics
The myriad issues surrounding psychological tests and measurements make finding a starting point
painfully difficult. Several definitions are necessary before one can begin. According to
Cohen & Swerdlik (2002), psychological assessment refers to the gathering and integration of data for
the purpose of making a psychological evaluation, by using tools such as tests, interviews, case studies,
observations, and special measurement apparatuses and procedures. In other words, assessment is a
more holistic venture, relying on different tools used together. In contrast, psychological testing can be
understood as "the process of measuring psychology-related variables by means of devices or
procedures designed to obtain a sample of behavior" (Cohen & Swerdlik, 2002, p. 4). While it may be
difficult to draw a clear line between testing and assessment, testing is generally a singular
activity aimed at a specific purpose, whereas assessment is more comprehensive in scope and
accomplishment. More than one type and style of psychological test may be used in an assessment
plan.
Psychological tests can differ in a number of ways, such as content, format, administration procedures,
scoring and interpretation procedures, and psychometric or technical quality (Cohen & Swerdlik, 2002).
Regardless, any test or assessment tool that hopes to be taken seriously must be able to prove that it is
reliable, valid, normed, standardized, and free from unreasonable bias.
The reliability of a test speaks to its consistency in measurement (Cohen & Swerdlik, 2002, p. 128). A
test for depression must measure depression consistently across different individuals without
an excessive amount of error. Unfortunately, small amounts of measurement error are unavoidable for
any test. The test takers may try harder, guess luckier, be more alert and awake, feel less anxious, or
be healthier on a given occasion (Joint Committee on Standards for Educational and Psychological
Testing, 1999, p. 25). During test construction, questions may be worded differently, or the content may
be slightly altered from question to question. These variances, small and not so small, serve to
erode the reliability of tests.
Reliability is generally estimated by one of several methods. Test-retest estimates rely
on test takers scoring similarly on two different occasions. A test's reliability may also be estimated
by using alternate forms or parallel forms; Cohen and Swerdlik (2002) discuss how coefficients of
equivalence can be derived using these techniques. Split-half reliability estimates are especially useful
when it is impractical or undesirable to assess reliability with two tests or to have two test
administrations (Cohen & Swerdlik, 2002, p. 133). Split-half reliability is the practice of dividing a test
into equivalent halves, scoring each half separately, correlating the two sets of half-test scores, and then
adjusting that correlation with the Spearman-Brown formula to estimate the reliability of the
full-length test (Cohen & Swerdlik, 2002). Psychologists may also utilize inter-scorer reliability or a
combination of several estimates.
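The split-half procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function names, the odd-even split, and the small data set are all assumptions made for the example, and the Spearman-Brown adjustment is the standard form r_full = 2r / (1 + r) for a test twice the length of each half.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half, length_factor=2.0):
    """Project the reliability of a test `length_factor` times as long."""
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per test taker.

    The odd-even split used here is an assumption; other ways of forming
    equivalent halves are possible.
    """
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical data: 5 test takers, 6 items scored 0 (wrong) or 1 (right).
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
r_half, r_full = split_half_reliability(scores)
print(f"half-test correlation: {r_half:.2f}")
print(f"Spearman-Brown estimate for the full test: {r_full:.2f}")
```

The same `pearson_r` helper would serve for a test-retest estimate, where the two lists are the same test takers' total scores on two occasions. With a half-test correlation of 0.60, for instance, the adjustment gives a full-test estimate of about 0.75.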
Tests must also validly measure what they purport to measure. A self-report test of eating disorders
should be consistently able to identify the majority of test takers who have a diagnosable eating disorder.
Otherwise, such tests are worthless to clients and psychologists. Validity also refers to "the degree to
which evidence and theory support the interpretations of test scores entailed by proposed uses of
tests" (Joint Committee on Standards for Educational and Psychological Testing, 1999, p. 9). New
psychological assessments may need to undergo revisions, as the validation of test scores relies in part
on the conceptual framework of the actual test (Cohen & Swerdlik, 2002). As a clearer picture of a
certain trait begins to emerge, the test can be focused. As this process continues, the validity of a test
becomes stronger.
Beyond concerns of validity and reliability, any professional who uses educational, psychological, or
other types of tests must be aware of any sources of error and bias that might exist. Since bias is a
factor inherent within a test that systematically prevents accurate, impartial measurement (Cohen &
Swerdlik, 2001, p. 179), it is essential to eliminate as much potential and actual bias as possible before
the results of the test can be utilized effectively. The Joint Committee on Standards for Educational
and Psychological Testing (1999) has written standards of practice with reference to eliminating
possible error and bias in testing. The committee identified several different areas that need to be
carefully examined by administrators. These areas cover bias both inside and outside of a test and test
situation:

Construct-irrelevant components: items that may lower or raise scores for different
groups of examinees.

Content-related issues: especially in educational testing, how well a given
test covers the domain and whether that domain is appropriate; how clearly the questions and
instructions are written; and the type of response required from the test taker (e.g., essay, short answer,
bubble).

Testwiseness: familiarity with the skills needed to take a test, answer questions
in a timely manner, and guess well on questions the test taker cannot answer.

Equitable treatment of test takers: fair treatment of all
test takers by the administrator.

Testing environment: aspects of the physical test environment, such as
temperature, comfort level, and noise.

Perceived relationship between test taker and administrator: how the test
administrator and test taker relate during the evaluation period.

State of the test taker: the emotional, physical, and mental condition of the test taker.

Even if all test biases could be reduced to a minimum, the psychologist must still consider the
possible sources of error in the interpretation of scores or responses on any given evaluation. Raters
may be too lenient in their scoring (leniency error), may take an extremely negative attitude
(severity error), may fail to give any extreme scores, causing the test takers' scores to cluster in the
middle of the continuum (central tendency error), or may see the ratee in an excessively positive manner,
causing the scores to be unnaturally skewed (halo effect) (Cohen & Swerdlik, 2002, pp. 181-182).
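As a rough illustration of how leniency, severity, and central tendency errors might show up in rating data, the sketch below compares each rater's mean and spread with the pooled ratings. It is not a procedure taken from the sources cited here. The function name, the cutoff values, and the ratings are hypothetical, and it assumes every rater scored the same ratees on the same scale. A halo effect cannot be detected this way, since it concerns one ratee across several traits.

```python
from statistics import mean, pstdev


def screen_rater(ratings, pool_mean, pool_sd, shift=0.5, spread_ratio=0.5):
    """Return descriptive flags for one rater.

    `shift` and `spread_ratio` are arbitrary illustrative cutoffs,
    expressed relative to the pooled standard deviation.
    """
    flags = []
    rater_mean = mean(ratings)
    rater_sd = pstdev(ratings)
    if rater_mean - pool_mean > shift * pool_sd:
        flags.append("possible leniency error")
    elif pool_mean - rater_mean > shift * pool_sd:
        flags.append("possible severity error")
    elif rater_sd < spread_ratio * pool_sd:
        # Mean near the pooled mean but almost no extreme scores.
        flags.append("possible central tendency error")
    return flags


# Hypothetical ratings of the same five ratees on a 1-7 scale.
ratings_by_rater = {
    "A": [6, 7, 6, 7, 6],
    "B": [2, 1, 2, 1, 2],
    "C": [4, 4, 4, 4, 4],
    "D": [2, 4, 6, 3, 5],
}
pooled = [score for ratings in ratings_by_rater.values() for score in ratings]
pool_mean, pool_sd = mean(pooled), pstdev(pooled)
report = {
    name: screen_rater(ratings, pool_mean, pool_sd)
    for name, ratings in ratings_by_rater.items()
}
for name, flags in report.items():
    print(name, flags or ["no flag"])
```

A flag here is only a prompt for closer review; a rater may have scored high because the ratees genuinely performed well.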
In addition to the many sources of error in construction, development, implementation, scoring,
interpretation, and application, psychologists must utilize a number of different psychometric tools in
order to assess clients in a wide spectrum of areas. Different aspects psychologists may need to assess
are intelligence (including achievement tests and aptitude tests), personality (including behavioral
aspects), and neuropsychological mental status. The assessment of these areas differs greatly with
regard to the specific design of tools, approach to assessment, testing environment, data collection,
and interpretation of client information. Tests and assessments are as varied as
the human condition. The choice of which test to use in any given situation depends upon a number of
criteria including the purpose of assessment, time available, monetary cost, or level of diagnosis.
Tests designed to measure intelligence take varied forms depending upon the test creator's theory of
intelligence (Cohen & Swerdlik, 2002). While there are literally hundreds of intelligence tests
available, several have taken the spotlight in recent times. The Stanford-Binet Intelligence Scale
(SB:FE) is considered a sound, reliable, and valid measure of overall general ability (Cohen & Swerdlik,
2002). Recently, however, its use among practitioners has declined significantly due to the ease and
comprehensiveness of the Wechsler tests.
The Wechsler tests are perhaps the most widely used intelligence assessments. This may be due to their
ease of administration, their exceptionally good reliability and validity coefficients, and the fact that,
because so much research has been done on these tests, psychologists consider them extremely
dependable. Many other types of intelligence assessments are available. The Kaufman Adolescent and
Adult Intelligence Test focuses on fluid and crystallized intelligence while also assessing immediate and
intermediate-term memory (Daniel, 1997). A popular intelligence test in school settings is the
Woodcock-Johnson Tests of Cognitive Ability-Revised. This test has been called a thorough
implementation of a multifactor model, assessing seven dimensions of ability (Daniel, 1997).
Measures of personality can be divided into two distinct categories: objective and projective.
Objective measures contain short-answer items for which the assessee selects one response from two or
more answers (Cohen & Swerdlik, 2002). The respondent's pattern of responses is
interpreted in order to gauge the strength or absence of a given personality trait or state. Objective
measures may be administered by computer or paper-and-pencil, and scored easily by either a
computer or template. Generally, objective measures are quick, cheap, and beneficial for preliminary
identification of personality factors. However, objective measures have distinct disadvantages, such as
the test taker "faking good" or "faking bad" in order to present a certain picture of themselves to the
assessor.
Personality tests can be projective as well as objective. Projective (subjective) measures require the
assessee to make a judgment about an unstructured stimulus; the assessor then uses the
responses given to infer personality aspects (Cohen & Swerdlik, 2002). The advantage of
projective measures is their flexibility. Masling (1997) has written specifically on the ability of
projective measures to predict long-term behavior more efficiently than objective tests. The Rorschach
Inkblot Test may be the most widely recognized icon of psychology. This controversial personality test
uses inkblots printed on white cards as the unstructured stimuli. The Thematic Apperception Test
(TAT) is a collection of 31 cards, one blank, illustrating human situations. Both tests require the
assessee to project responses, which are written down verbatim by the assessor. The responses, as well
as body language, are then analyzed to determine personality aspects.
Neuropsychological assessment entails a very different form of test battery. Cohen & Swerdlik (2002)
take their definition of neuropsychological evaluation from Benton (1994), who states that the
objective of this evaluation is "to draw inferences about the structural and functional characteristics
of a person's brain by evaluating an individual's behaviour in defined stimulus-response situations"
(Benton, 1994, p. 1).
Under this definition, evaluating the biological basis of behavior caused by deficiencies of the brain
requires a more comprehensive examination, consisting of an interview, a case history or case study, and a
battery of psychological, behavioral, intellectual, functional, memory, and perceptual-motor tests.
Psychologists may also be called on to assess persons with disabilities for educational, workplace, or legal
reasons. Persons with disabilities are protected under federal law and granted certain accommodations
in the work and school environment. The types of tests administered to assess disabilities may include
visual, hearing, motor, or cognitive functioning batteries. Because persons with disabilities are protected
under the Americans with Disabilities Act, proper assessment and sensitivity are crucial to both
employers and schools.
Workplace assessment is a growing trend among large companies. Selecting employees who are best
suited to and capable of performing certain duties is important to companies that wish to maintain a
productive and safe working environment. To many companies, assessment of aptitude, physical ability,
motivation, personality, and organizational and leadership skills makes business sense because
such measures may help predict future efficiency, growth, productivity, motivation, and satisfaction
(Joint Committee on Standards for Educational and Psychological Testing, 1999).
Clearly, psychological evaluation is a complex matter. The multitude of tests, the differing purposes of
assessment, the complexity of reliability and validity issues, the variety of situational testing factors,
and the growing capabilities of technology create an environment so vast that psychologists must be
constantly vigilant with regard to evaluation processes, or run the risk of falling by the wayside.

Critical Analysis
The American Psychological Association has estimated that upward of 20,000 new psychological tests
are developed every year (Cohen & Swerdlik, 2002). The sheer volume and diversity of psychological
assessments is both a blessing and a curse to psychological measurement. With so many tools available,
the modern psychologist must have a crystal clear understanding of the theory and purpose of testing
in any given situation. Testing theory seems to be as varied as psychological theories of
personality and intelligence. According to Cohen and Swerdlik (2002), a psychologist necessarily makes
twelve assumptions in any testing process. These assumptions drive the creation of
psychological tests, the theoretical framework from which they come, the situations in which they will
be applied, and how interpreted results will be utilized in a given setting. The assumptions also provide
an excellent springboard for an analysis of the complex issues in testing and measurement.

Assumption # 1
The first assumption is that psychological traits and states exist (Cohen & Swerdlik, 2002). Traits and
states differ in that a psychological trait is seen as a relatively enduring aspect of a person, whereas a
state is less enduring. Arguably, these are not stringent definitions. Traits and states, then, should be
distinguishable from each other. The word "distinguishable" conveys the idea that behavior labeled with
one trait term can be differentiated from behavior that is labeled with another trait term (Cohen &
Swerdlik, 2002, p. 13). Inextricable from test construction theories is the fact that in order to measure
some facet of human behavior, we must be able to identify the domain to be measured. It has been
stated that for testing and evaluation purposes, psychological traits do not exist except as constructs:
informed scientific ideas developed or constructed to describe or explain behavior (Cohen &
Swerdlik, 2002).
Construct development is not an easy task. The pinpointing of personality or intelligence depends
upon a mountain of choices. What theory of personality is the assessor going to use (e.g.,
psychoanalytic, behavioral)? What items make up personality? Which specific aspect of the
defined collection of aspects of personality will the assessor be trying to isolate? Which observed
behaviors would serve to suggest a specific personality aspect is present? What other factors, intrinsic
and extrinsic, might also produce similar behavior to the observed behavior? Without an anchor, any
professional attempting to delineate a useful construct needs some guidance.
The Joint Committee on Standards for Educational and Psychological Testing (1999) has written
standards to address this and other issues concerning test development. Standard 3.2 specifically
states:
"The purpose(s) of the test, definition of the domain, and the test specifications should be stated
clearly so that judgments can be made about the appropriateness of the defined domain for the stated
purpose(s) of the test and about the relation of items to the dimensions of the domain they are
intended to represent" (p. 43).
The importance of choosing a specific enough domain (construct) to measure cannot be overstated, as
it is the beginning point of any test construction.
Assumption # 2
Closely related to the first assumption, the second assumption is that the defined traits and states can
be reliably quantified and measured. Deciding how to measure some aspect is nearly as difficult as
deciding what to measure. Cohen & Swerdlik (2002) use the term "aggressive" to illustrate this point.
An aggressive salesperson, an aggressive killer, an aggressive dancer, and an aggressive football player
all display aggressiveness in a different manner (Cohen & Swerdlik, 2002, p. 14). So how does one go
about measuring such a phenomenon? Very carefully.
We create data from the measurement device in the hopes that it will provide us with a pattern or
relationship of the behavior. It is assumed that this illumination is possible because behavior aspects
can be quantified into a pattern of numbers and symbols. Since these numbers and symbols have no
place in the natural world outside of humanity, it is critical to remember that facts are created by
those who look for them, that facts, scientific and otherwise, are forever constructions by us, and do
not tangibly exist.
Assumption # 3
Just as an inch and centimeter can both measure length, so too different types of assessments are
believed to be useful in measuring various aspects. Psychologists come from a wide array of theoretical
frameworks, and each theory has its own methods of measuring constructs dependent upon their
definition and source. A collection of different methods seems, logically, sounder than one
type or style of test. However, not all test styles and types provide equivalent data on a given subject.
For example, although subjective personality measures are widely seen as less reliable than objective
measures because of the ambiguity in their administration and interpretation, some
authors (e.g., Masling, 1997; Exner, 1986) have noted that long-term behavior can be predicted more
reliably with subjective tests.
Tests may vary in how they are linked to a particular theory, in how the test items are selected
(rationally vs. empirically), and in the way they are presented,
administered, scored, interpreted, and applied (Cohen & Swerdlik, 2002). The range of test
options provides ample opportunity for a psychologist to assess some aspect in a number of different
ways. Because more than one psychological test can measure a given construct, or characteristic (Joint
Committee on Standards for Educational and Psychological Testing, 1999), the exact purpose of
evaluation and theoretical background of the psychologist will greatly influence which type of test will
be used in a certain situation. Indeed, a diligent psychologist is obligated to utilize a number of
different tests and methods.
Assumption # 4
In order to justify testing, it must be assumed that assessment is able to provide answers to
momentous questions. Cohen & Swerdlik (2002) argue that users of tests and assessments must believe
that the process of assessment is capable of providing useful answers. Forensic psychologists are often
required to give expert testimony on the mental status of individuals. It would be contrary to the
judicial system if the assessments used to determine the mental state of such individuals were not up
to the task. Confidence that a test reliably measures what it is supposed to measure is built up by
research.
Assumption # 5
To be useful, assessments must pinpoint certain phenomena. The entire field of clinical psychology
could be said to rest upon this assumption. If psychologists are to be useful to their clients then they
must be able to diagnose psychological functions or behaviors. A diagnostic test may be administered to
help guide a clinician toward more specific avenues of future assessment (Cohen & Swerdlik, 2002).
Correctly diagnosing a patient should be a direct outcome of good assessment.
The modern psychologist has many well-researched assessments available. For
instance, Daniel (1997) has noted that since the mid-1980s, at least half a dozen new or fundamentally
restructured intelligence batteries have been published, and the trend does not seem to be slowing.
With so many tools at their disposal, psychologists have a wide range of choices in
both how and what will be measured (Daniel, 1997). For the purposes of educational assessment,
educators use a range of diagnostic tools geared to measure visual, auditory, motor, and cognitive
functioning. Legal requirements ensuring the right of all students to the least restrictive class environment
possible help to assure that proper testing is accomplished. However, given the dire state of funding in
education, access to qualified licensed school psychologists is insufficient to bring about substantial change
in the quality of mental health of many students.
More violence has been reported in public schools, and the number of children struggling with
delinquency has also increased. According to a study by Wilson, Lipsey, and Derzon (2003),
programs aimed at decreasing aggressive behavior in schools have the greatest effect when they consist
of behavioral and counseling strategies. In contrast, programs that utilized peer mediation and
multimodal strategies showed the smallest effects (Wilson et al., 2003). The diagnosis of
students with aggressive behavior is better left to qualified school psychologists. Unfortunately,
teachers who do not have such training are often the only source of intervention for aggressive students, and
tend to utilize mediation strategies that are less effective.
Assumption # 6
Because assessment is a multifaceted approach to evaluation, it is assumed that many sources of data
are a part of this process. "Testing and assessment professionals understand that decisions that are
likely to significantly influence the course of an examinee's life are ideally made not on the basis of a
single test score but, rather, from data from many different sources" (Cohen & Swerdlik, 2002, p. 18).
Psychologists wishing for the best possible picture of a client's condition should utilize interviews,
personal history, and other types of information. It would be unfair to assess an alleged murderer solely
with a test designed to identify aggressive behavior. Interview material and past medical records
might serve to build a stronger picture of the ability of such a person to commit the alleged act.
Assumption # 7
Error is simply a part of the assessment process. Cohen and Swerdlik (2002) write that "potential sources of
error are legion" (p. 18). Humans are not perfect and cannot act as if they were. Anything produced and
evaluated by a human is imperfect. The overwhelming realization that every diagnosis is based only on an
approximation of an evaluation, not on a "true" evaluation, is enough to make one cringe. Factors
other than what a test attempts to measure will, to some extent, influence an individual's
performance on a test (Cohen & Swerdlik, 2002). In addition, test administrators and developers are
also sources of error. Because there are many sources of bias and error in psychological testing, the
practicing psychologist must be well versed in the statistical analysis of individual tests, open to
collaboration with other knowledgeable professionals, reflective and consistent in testing practices,
and evenhanded in the treatment of all test takers. Results should always be critically scrutinized so
that conclusions based upon interpretations of test results reflect a high level of validity and
usefulness.
The systematic reduction of possible sources of error is necessary for any given assessment tool.
Fortunately for psychologists, standardization of error management has been discussed widely, and
standards from the Joint Committee on Standards for Educational and Psychological Testing (1999)
provide a structured blueprint for error analysis. Possibly the best source of error feedback comes
from within the discipline of psychology. As stated earlier, the APA estimates that more than 20,000 new
psychological assessments are developed every year (APA, 1993, as cited in Cohen & Swerdlik, 2002). Of
these, many will not be used widely by the discipline; some, however, will be researched and debated
by professionals in the field, and it is this scholarly debate that will drive the error analysis in a
productive direction.
Assumption # 8
All tests and measurement devices have strengths and weaknesses. The ability of tests to measure
constructs effectively must be researched and discussed before such tests are useful to psychology. An
argument for collaboration with other test professionals is that they bring different expertise with
different psychological assessment tools. Understanding the limitations of a test is emphasized
repeatedly in the codes of ethics of associations of assessment professionals (Cohen & Swerdlik, 2002,
p. 19).
Assumption # 9 and # 10
Psychologists must assume that test-related behavior is capable of predicting non-test behavior. In
general, both testing and assessment are performed under the presumption that meaningful
generalizations can be made from the test data to behavior that lies outside of the specific testing
situation (Cohen & Swerdlik, 2002). Often, psychological tests require a response that has nothing to do
with the actual measured behavior. For example, a student may be asked to write "T" for true or "F" for
false on a number of questions. The behavior of writing Ts and Fs has no relevance to the test's ability
to measure, say, depression or self-confidence. Computerized assessments frequently require a
test taker to press a key signifying a response. It would be rather odd for a psychologist to assess the
individual's ability to press a computer key, unless the test was a measure of motor functioning.
Testing theory also must assume that individuals will duplicate or indicate real-life behavior on a test.
Since a test is given on a particular day, the sample of behavior obtained might miss a
behavior that is clearly evident on other days. An evaluator attempting to assess manic-depressive disorder
might only be able to test the individual during depressive states because, during manic
episodes, the client is not likely to think problems exist and may fail to seek treatment.
Training and experience with individual assessments greatly increase the likelihood that psychologists
will be able to recognize whether an instrument has succeeded in delineating a certain behavior at the
time of testing. Other tools of assessment, such as case history, personal or family interview, or observation,
can provide a more accurate picture of behavior (Cohen & Swerdlik, 2002).
Assumption # 11
Tests must be conducted in a fair and unbiased manner. This premise is so important that Cohen and
Swerdlik (2002) remark, "If we had to pick the one of these 12 assumptions that is more controversial
than the remaining 11, this one is it" (p. 20). Tests not administered or interpreted in as fair and
unbiased a manner as possible are worthless to the assessor, psychologist, and client. The fact that
psychological measuring tools are relied upon to measure important aspects of human psychology
demands that the utmost care and attention be given to their fairness.
Many sources of possible bias exist, both within the test and outside of the actual test. Culture,
language, age, sex, social and economic status, physical condition, test situation, test purpose, and a
host of other factors not taken into consideration during test development serve to diminish the
usefulness of test data in making sound interpretations of a client's mental condition. "Some potential
problems related to test fairness are more political than psychometric in nature" (Cohen & Swerdlik,
2002, p. 20). Many assessments in social programs that may require ethnic or cultural information are
surrounded by stigma. With an awareness of the different backgrounds test takers may come from and
the different purposes for the assessment, one can reflect on the possible sources of error and, in good
faith, try to limit, to the extent possible, deviation in scores due to these influences.
A person's true score is a hypothetical error-free value that characterizes an examinee at the time of
testing (Joint Committee on Standards for Educational and Psychological Testing, 1999). Because this
true score cannot be observed in actuality, the estimated deviation of the examinee's observed score
on a given assessment from the true score is treated as measurement error. Identification of
measurement errors serves to strengthen the interpretation of scores by the psychologist.
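The relation between an observed score and the hypothetical true score can be illustrated with the standard error of measurement from classical test theory, SEM = SD * sqrt(1 - reliability). The following is a minimal sketch rather than a procedure taken from the Standards; the function names and the numbers used (a standard deviation of 15, a reliability of .91, and an observed score of 108) are assumptions chosen only for illustration.

```python
import math
from typing import Tuple


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory estimate: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed: float, sem: float, z: float = 1.96) -> Tuple[float, float]:
    """Band of z * SEM around an observed score (z = 1.96 is roughly 95%)."""
    return (observed - z * sem, observed + z * sem)


if __name__ == "__main__":
    # Hypothetical values, not drawn from any published test manual.
    sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
    low, high = score_band(observed=108.0, sem=sem)
    print(f"SEM = {sem:.2f}; band = {low:.1f} to {high:.1f}")
```

Under these assumed values the SEM is about 4.5 points, so a psychologist would read the observed score as a range rather than as a single exact figure. A less reliable instrument widens that range.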
"Errors of measurement are generally viewed as random and unpredictable" (Joint Committee on
Standards for Educational and Psychological Testing, 1999, p. 26). The Standards for Educational and
Psychological Testing call for systematic procedures to be established for evaluations so that error
management can effectively take place. When tests are normed on a small cross section of society, they
run the risk of being unfairly biased against cultural and other groups. Modern psychology benefits from
the action of courts, which have issued rulings supporting the careful management of bias in tests,
especially when tests are used in political arenas.
Assumption # 12
Testing is useful. If the testing and assessment process were not beneficial to a variety of agencies of
society, it would not be a multibillion-dollar industry. It serves to categorize, prioritize, criminalize,
organize, and many other "-izes." Tests are useful for measuring progress and change in the professional
world as well as in education. Evaluations are necessary for society to function on a highly complex level.
Requirements for medical doctors and other licensed health professionals stem from the premise that
these people are highly qualified and competent in their knowledge and abilities. Without tests and
measurements there would be no accountability for the qualifications of individuals in positions that
demand it.
Standardized tests and measurement tools in education have been at the forefront of
controversy over the last few decades. As politicians try to implement educational reform, they
seek a cheap measure of the success of new programs. Consequently, standardized achievement
tests have become a norm across the country. The test scores determine a school's future funding
and, in some cases, whether the administration will be taken over by state agencies if test scores fail to
make some defined climb. But can testing go too far?
According to an internet article by Stephen Horowitz (2001), too much focus on testing results in a
decline of creative thinking skills. Since the purpose of this type of assessment is to evaluate the
student's progress in learned material, a cautionary note should be issued to proponents
and practitioners of "teaching to the test." Educators forced to spend an abundant amount of time
preparing students to take standardized achievement tests run the risk of sacrificing valuable learning,
leading to poorer abilities to think critically and, in the long run, to lower test scores. While
education, at the moment, seems to be overburdened with standardized testing, alternative
assessments are gaining widespread support among teacher organizations.

Closing Remarks: A Synthesis


Attempting to create an individualized perspective on the field of psychometrics is like trying to stuff
an iceberg into a paper cup. The magnitude of information and relevant issues demands that any
professional in the field be highly trained and aware of the statistical data supporting testing and score
interpretation. Computer assessment tools have given psychologists a precious advantage in the
processing of data and the formulation of models that hitherto did not exist or were impossible.
Given that the computer technology available to psychology will only expand in the future, the
development and use of computer assessments will have a profound effect on what psychologists
are able to measure and how they measure it.
The author has had a difficult time coming to terms with the immensity of testing theory. One theme
that seems to surface again and again is the purpose of testing. Beyond the critical questions of reliability and
validity, psychologists must ask in earnest: What am I trying to assess? What tools are available?
These questions pervade everything in testing. One question that seems to be less prevalent is this:
What good will assessments do the client? An article by Brown and Dean (2002) suggests that in forensic
settings, where assessment is typically devoid of action significantly beneficial for the assessed,
useful therapy originating from the evaluation process can result in very significant
growth for families and children. What good is testing if it does not directly affect the one being
assessed?
Often in practice, assessment is done in less than perfect environments. Managed health care
organizations driven by profit try desperately to cut costs wherever possible, resulting in
disabling decreases in the number and type of assessments psychologists have available. Chronically
underfunded mental health facilities cut back where necessary, eliminating properly trained
personnel and leaving untrained assessors the complex job of assessment. In school settings,
psychologists are so few that it takes weeks, if not months, before some students can be properly
assessed for learning disabilities, social disorders, or emotional problems. Economics is perhaps the
strongest driving factor in psychological measurement. Without the proper resources or trained
professionals, measurement error increases dramatically, assessment relevance to the client is
diminished, and future costs of treatment are increased due to improper diagnosis.
The usefulness of evaluation cannot be overstated. However, the efficiency and effectiveness of
assessment need to be analyzed and adapted where they suffer waste. Brown and Dean (2002) make an
excellent observation:
"the current economics of both public and private mental health sectors in both developed and
developing countries, and in adult as well as in child and adolescent areas of work, are experienced as
demanding increasingly brief clinical contact with consumers. It can be anticipated, of course, that
longitudinal cost-benefit studies will eventually demonstrate the brevity of clinical contact does not
necessarily mean less expense to the community in the long-term" (paragraph 4, Implications for
Clinical Psychological Practice).
Though Brown and Dean (2002) are speaking specifically about the economics of clinical visits, it is easy
to see how this passage extends to, if it does not already include, proper clinical assessment.
A thorough clinician must not only comprehend the psychometric aspects of the evaluation tools used;
he/she must also persevere in using the correct tools in the proper situations whenever he/she is called
upon to assess an individual. Economic factors will probably continue to limit the
number of recommended tools. What seems to supersede the importance of the economic problem is
the value for the individual being assessed. What good will proper assessment do for the client? This
question drives valuable, scientific, productive measurement. A clinician who remembers that he/she
serves an individual's interest will consistently be more effective in positively effecting change in the
larger society.
