Item Banking. ERIC/AE Digest
ERIC Development Team
www.eric.ed.gov
Developing a test is a very time-consuming endeavor. Not only do test writers need to
compose the test items, they also must determine each item's difficulty in order to
ensure that a test will be neither too hard nor too easy.
Using item banks, test makers can escape this process. Item banks are files of various
suitable test items that are "coded by subject area, instructional level, instructional
objective measured, and various pertinent item characteristics (e.g., item difficulty and
discriminating power)" (Gronlund, 1998, p. 130). The purpose of this digest is to discuss
the advantages and disadvantages of using item banks as well as provide useful
information to those who are considering implementing an item banking project in their
school district.
Item banking provides substantial savings of time and energy over conventional test
development. In traditional test development, items can be described only relative to the
other items within the test and to the group to whom they were administered. That is,
item characteristics are extremely group- and test-specific. With item banking, items are
described by their relative difficulty across grade levels. In order to develop a new test or subtest, one
does not need to go through the laborious process of developing a large set of items for
piloting and evaluating. Instead, one just draws from the bank. Further, drawing from the
bank allows one to make fairly accurate predictions concerning composite test
characteristics.
One additional advantage of item banking is that it helps establish a language for
discussing curriculum goals and objectives. The items describe individual tasks
students are capable or incapable of doing. The location of the items on a calibrated
scale allows one to identify the relative difficulty of particular tasks. This provides a way
to discuss possible learning hierarchies and ways to better structure curriculum.
While some districts have implemented very successful item banks and Rasch-
calibrated testing programs without knowing anything about item response theory (IRT),
good practice calls for a staff that is comfortable with and knowledgeable about what it is
doing. A district
undertaking an item banking project should have full understanding of the practical as
well as the mathematical/theoretical aspects of item banking.
An item bank really consists of multiple collections of items, each covering a fairly
unidimensional content area, such as mathematics computation or vocabulary. Collections of items
usually span several grade levels. In order to develop the bank, many tests must be
calibrated, linked (or equated), and organized. This requires a great deal of work in
terms of preparation and planning and in terms of computer time and expertise. Once
the item bank is established, however, test development time, effort, and cost are
reduced.
Everyone on the staff should have enough familiarity with Rasch measurement
principles and item banking to be able to knowledgeably discuss and explain the
project. You can formally train your staff by using in-house personnel, bringing in a
traveling workshop, or having people attend a pre-session at a research association
conference.
You should have senior level personnel available to answer technical questions that
might arise. You should also have computer experts that are capable of doing the
following tasks: (1) modifying computer programs, (2) establishing a database system,
and (3) running packaged programs.
If you intend to do any item bank exchanges or purchases, you should have someone
on your staff who knows what is available. You need personnel capable of critically
evaluating test items for technical quality, curriculum match, unidimensionality, and
potential bias. In order to accurately calibrate test items and establish scales, items
need to be presented to examinees with a wide range of ability.
In order to link various forms and grade levels within a content area, common anchor
items are needed. (These anchor items must be administered along with the items
within a given form. The form and anchor items are calibrated together. The anchor item
parameter values based on calibration with one form are compared with the anchor item
parameter values based on calibration with another form. The difference in parameter
values is used to link the forms.) You need to identify for which content areas you have
administered overlapping subtests and the number of students responding to the set of
items. You may find you will need to gather additional item response data to link forms
and grade levels.
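As a concrete illustration, here is a minimal sketch of that linking step in Python, assuming Rasch difficulty estimates for each form are already in hand. All item identifiers and parameter values are hypothetical.

# Linking two calibrated forms through common anchor items
# (mean-shift linking under the Rasch model).

# Anchor item difficulties (in logits) as estimated within each
# form's own calibration; the values are hypothetical.
anchors_form_a = {"M101": -0.42, "M102": 0.15, "M103": 0.88}
anchors_form_b = {"M101": -0.10, "M102": 0.49, "M103": 1.17}

# Under the Rasch model, the two sets of estimates should differ by
# (roughly) a constant; its mean is the Form B-to-Form A link.
common = anchors_form_a.keys() & anchors_form_b.keys()
shift = sum(anchors_form_a[i] - anchors_form_b[i] for i in common) / len(common)

# Place every other Form B item on the Form A (bank) scale.
form_b_items = {"M201": 0.75, "M202": -0.30}
linked = {item: b + shift for item, b in form_b_items.items()}
print(f"linking constant: {shift:+.2f}")
print(linked)

In practice the anchor-by-anchor differences would be inspected as well; an anchor whose shift departs sharply from the others may have drifted and should be dropped from the link.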
Your data processing staff should examine literature and programs on item banking to
determine what programs must be developed and what programs can be modified.
As much as possible, you should identify your projected testing needs for the next five
years. This would involve identification of which subtests you will need to revise, what
additional areas you may need to assess, and how objectives might be stressed
differently.
START-UP ACTIVITIES
The start-up activities would mostly involve administrative activities and the data
processing staff. Each test would have to be calibrated and equated to the parallel form
and adjacent grade levels. The data processing staff would have to adapt existing
computer programs to the local system and develop a database system. They would
then calibrate each test, equate the tests, and store the equated item parameters and
their descriptors in a database system. With a large number of tests and items, this
becomes a major undertaking.
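To make the storage side concrete, here is a minimal sketch of such a database using Python's standard sqlite3 module. The table layout and column names are illustrative assumptions, not a prescribed format; they simply mirror the descriptors mentioned earlier (subject area, level, objective, difficulty).

import sqlite3

con = sqlite3.connect("item_bank.db")
con.execute("""
    CREATE TABLE IF NOT EXISTS items (
        item_id     TEXT PRIMARY KEY,
        content     TEXT NOT NULL,   -- e.g. 'math computation'
        grade_low   INTEGER,         -- lowest grade level piloted
        grade_high  INTEGER,         -- highest grade level piloted
        objective   TEXT,            -- instructional objective measured
        difficulty  REAL NOT NULL,   -- equated Rasch difficulty (logits)
        source_form TEXT             -- form the item was calibrated on
    )
""")
con.execute(
    "INSERT OR REPLACE INTO items VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("M201", "math computation", 4, 6, "add fractions", 0.75, "Form B"),
)
con.commit()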
Administrative staff would have to coordinate activities to ensure that the data
requirements are met. During the planning process, a chart can be developed to identify
which tests and anchor items have been and will need to be administered to the
requisite sample. Working from these charts, testing coordinators will need to organize
the administration of tests and subtests needed to calibrate and equate all the items
going into the item bank. This involves compiling test booklets, making testing
arrangements, collecting response sheets, and preparing data for data processing.
Depending on how often students take multiple subtests from different levels and
forms, this too can be a major undertaking.
The major task involved in using items from another item bank is a thorough, careful
review of the items. All potential entries must be evaluated for technical quality,
curriculum match, and potential bias. This would involve your test development experts,
curriculum/instructional staff, and coordination between the two.
After an item review, non-calibrated items could be treated like items developed by your
staff. "Small deposits" would be made by calibrating and equating a few items at a time.
One very efficient approach to collecting the requisite data is to append subtests of new
items to original groups. The items within the original group would serve as anchor
items for the new subtest(s) of items. In this manner, you can be constantly adding to
your item bank.
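The following sketch illustrates the idea of a small deposit under deliberately simplified assumptions. An operational program would use a full Rasch calibration package; here a crude log-odds estimate stands in for calibration, and the anchor items are used only to shift the new items onto the bank scale. All values are hypothetical.

import math

def logit_difficulty(p_correct):
    # Sample log-odds of an incorrect response: a rough difficulty proxy.
    return math.log((1 - p_correct) / p_correct)

# Proportion correct on the appended administration. M-items are
# anchors from the original group; N-items are the new deposit.
p_values = {"M101": 0.70, "M102": 0.55, "N301": 0.48, "N302": 0.62}
bank = {"M101": -0.42, "M102": 0.15}   # banked anchor difficulties

raw = {item: logit_difficulty(p) for item, p in p_values.items()}
# Shift the raw estimates so the anchors agree with their banked values.
shift = sum(bank[i] - raw[i] for i in bank) / len(bank)
deposits = {i: round(d + shift, 2) for i, d in raw.items() if i not in bank}
print(deposits)   # N301 and N302, now on the bank scale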
Once developed and growing, your item bank is ready to provide the advantages
discussed above. To develop a new subtest, you would develop a blueprint/table of
specifications to outline what you want your new subtest to be like. Curriculum
specialists and test development experts would then go to the item bank and identify
which items in the bank appear appropriate in terms of content and in terms of their
relative difficulty. If they find an insufficient number of items, they can make
arrangements to add new items to the bank.
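Continuing the hypothetical sqlite3 table sketched earlier, a blueprint-driven withdrawal might look something like this; the content label, grade, and difficulty window are illustrative.

import sqlite3

con = sqlite3.connect("item_bank.db")
rows = con.execute(
    """SELECT item_id, objective, difficulty
         FROM items
        WHERE content = ?
          AND grade_low <= ? AND grade_high >= ?
          AND difficulty BETWEEN ? AND ?
        ORDER BY difficulty""",
    ("math computation", 5, 5, -1.0, 1.0),   # grade 5, mid-range items
).fetchall()
for item_id, objective, difficulty in rows:
    print(f"{item_id}  {difficulty:+.2f}  {objective}")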
If the bank contains a sufficient number of items of the appropriate nature, the items can
be grouped to form a new subtest. Without pilot testing, the characteristics of this new
subtest can be predicted. With reasonable accuracy, you will know how much skill an
examinee needs to obtain any given total raw score on the new subtest. The prediction
should be validated by administering the subtest to students having received
appropriate instruction and students not having received such instruction. This can also
be accomplished by appending items to the existing forms. This validation would not need
as large a sample as you used in field testing the original group.
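Under the Rasch model, such predictions follow directly from the banked difficulties: the expected raw score at a given ability is the sum of the item success probabilities, and inverting that curve tells you how much skill a given raw score requires. A minimal sketch, with hypothetical difficulties:

import math

difficulties = [-1.2, -0.6, -0.1, 0.3, 0.8, 1.4]   # logits, from the bank

def expected_raw_score(theta):
    # Test characteristic curve: expected raw score at ability theta.
    return sum(1 / (1 + math.exp(b - theta)) for b in difficulties)

def ability_for_score(target, lo=-6.0, hi=6.0):
    # Invert the (monotone) curve by bisection.
    for _ in range(60):
        mid = (lo + hi) / 2
        if expected_raw_score(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for score in (2, 3, 4, 5):
    print(f"raw score {score}: ability of about {ability_for_score(score):+.2f} logits")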
An item bank provides a scale of relative difficulty of tasks that covers multiple grade
levels and skills within content areas. As a service to the instructional/curriculum staff,
you can provide information on the relative difficulty of different tasks within and across
grade levels. For example, you can identify which fraction problems seventh graders
find as difficult as certain decimal problems, or you can identify which reading skills
taught in fourth grade can be mastered by students in earlier grades. The scale could also
be used to help organize special programs for gifted and remedial students.
ADDITIONAL READING
Gronlund, N.E. (1998). Assessment of Student Achievement. Sixth Edition. Needham
Heights, MA: Allyn and Bacon.
Lord, F.M. (1980). Applications of item response theory to practical testing problems.
Hillsdale, N.J.: L. Erlbaum Associates.
Mengel, B.E.; Schorr, L.L. (1992). Developing Item Bank Based Achievement
Tests and Curriculum-Based Measures: Lessons Learned Enroute. (ED 344 915).
Ward, A.W.; Murray-Ward, M. (1994). Guidelines for the development of item banks. An
NCME instructional module. Educational Measurement: Issues and Practice, 13(1),
34-39.
Wright, B.D.; Stone, M.H. (1979). Best Test Design: Rasch Measurement. Chicago, IL:
MESA Press.
This publication was prepared with funding from the Office of Educational Research and
Improvement, U.S. Department of Education, under contract RR93002002. The
opinions expressed in this report do not necessarily reflect the positions or policies of
OERI or the U.S. Department of Education. Permission is granted to copy and distribute
this ERIC/AE Digest.