431 1 To 8
1. Measurement
Measurement is the process by which the attributes or dimensions of some object (physical
or abstract) are quantified. The tools used for this purpose may include tests,
observation, checklists, homework, portfolios, projects, etc. (The process of converting a person's
ability into numbers is measurement.) Measurement is easy to understand when we
measure height or distance, because these things exist physically, so
height can easily be measured with a scale. But in the field of education our variables are not
physical and cannot be measured directly, e.g. attitude, behavior, and achievement. These are all
abstract, so their measurement is more difficult than that of things that have physical existence.
The tools used for measuring abstract variables cannot measure as exactly as a scale or
thermometer.
So throughout this course, whenever the word measurement is used, it means the tool that will be
used to measure students' abilities and convert them into numerical form.
2. Assessment
It means the appraisal of something in order to improve the quality of the teaching and learning
process, and to decide what more can be done to improve teaching, learning and outcomes.
3. Evaluation
Evaluation is the process of making a value judgment against intended learning outcomes
and behavior, to decide the quality and extent of learning.
Evaluation is always related to your purpose: you align your purpose of teaching with what
students achieved at the end, in terms of the quality and quantity of their learning.
Let's take an example.
In a classroom situation, when you as a teacher teach a chapter or unit to a class, you first
set the objectives; either you write them yourself or you take them from the curriculum
document of the particular subject. Objectives are also written at the start of the chapter or book,
and they state what students will be able to do at the end of the unit; these are also referred to as
student learning outcomes (SLOs). SLOs can be checked in two ways: i. assessment and ii.
evaluation. When you as a teacher reflect on your teaching daily, asking whether the way you taught
yesterday was a good one, whether what you taught was what the students needed, and whether they
understood it, you can then decide which changes would improve
the students' learning process. This is ASSESSMENT. The conclusion of an
assessment is not an opinion about the individual; it is about the process through which the
students are passing, for the betterment of that process.
Evaluation takes place when the learning process is complete and you want to see what your
targets or objectives were and how far your students achieved them. Tools are then made,
and the measures that come from them tell how much your students learned and what the quality of
their learning is.
Classification of Assessment
Assessment can be classified in four ways
1. Nature of Assessment
2. Format of Assessment
3. Use in classroom instruction
4. Method of interpreting results
1. Nature of Assessment
i. Maximum Performance Assessment
ii. Typical Performance Assessment
2. Format of Assessment
i. Fixed Choice Assessment
ii. Complex Performance Assessment
i. Placement Assessment
Placement Assessment determines prerequisite skills, degree of mastery of course goals and
mode of learning. Placement assessment is used when we want to assess students' prior
knowledge so that we can decide what the level of each student is. It is associated with the student's
entry-level performance, to find out whether the student has sufficient knowledge for a particular
course. Through placement assessment, the teacher can find out where students
should be placed according to their present knowledge or skills. It determines the level of student
knowledge at the beginning of the session and helps the teacher plan lessons accordingly. In the
classroom, the teacher can use placement assessment to assess the level of students' knowledge
and skills and then make lesson plans that keep the level and needs of the students in mind.
It also determines the interest and aptitude of a student regarding a subject and helps in selecting
the correct path for the future.
Examples
Readiness test: A test used to determine students' knowledge or concepts relating to a
particular course of instruction, i.e., the level of the students
Aptitude test: It is used for admission to a particular program
Pretest: It is made according to the course objectives and determines the student's present
knowledge of them
Self-report inventories: Determine the student's level through interviews or discussion
ii. Formative Assessment
Formative Assessment determines learning progress, provides feedback to reinforce learning,
and corrects learning errors. When we assess students during classroom instruction in order
to get feedback on how we can make our teaching-learning process better, that is formative
assessment. In this assessment we are not assessing what students have or have not learnt; rather we
assess the process behind the students' learning. The process behind student learning includes,
among other things, the teaching method and the book. If we make all these things fit the needs of
students, then learning will improve.
It is conducted during the academic session or teaching-learning process, so that the teacher gets
feedback about the way of teaching and how students are learning, and decisions are made
immediately on the basis of the results. It is an ongoing process of modifying teaching strategies on
the basis of students' needs.
It provides feedback to teachers
. About weaknesses and strengths of the learning process
. To modify their teaching practices
. To improve the teaching-learning process
It also helps students to reflect on their weaknesses and encourages them towards
successful learning. (When we tell students that their problem is their way of learning rather
than their intelligence, we then tell them how to change their way of learning to learn better.
With this, students can reflect on their learning process on their own and work out what is going
wrong with it.)
Formative assessment provides feedback to students who are struggling with a specific content
area or concept.
The main difference between formative and summative assessment is that formative
assessment aims at improvement in the process of learning rather than at certifying students.
We use different tools for formative assessment, including teacher-made tests, custom-made
tests from textbook publishers, and observational techniques.
Topic 6: Types of Assessment, Methods of Interpreting Results
Norm-referenced tests include standardized aptitude and achievement tests, teacher-made survey
tests, interest inventories, and adjustment inventories.
In this unit, students will learn about the link between curriculum and assessment. For this purpose,
we proceed with our discussion with reference to the National Curriculum of Pakistan 2006.
• Competency
• Standards
• Benchmarks
• Student learning outcome (SLOs)
Competency
It is a key learning area. For example algebra, arithmetic, geometry etc. in mathematics and
vocabulary, grammar, composition etc. in English.
Standards
These define the competency by specifying broadly, the knowledge, skills and attitudes that
students will acquire, should know and be able to do in a particular key learning area during
twelve years of schooling.
Benchmarks
The benchmarks further elaborate the standards, indicating what the students will accomplish at
the end of each of the five developmental levels in order to meet the standard.
Student learning outcomes (SLOs)
These are built on the descriptions of the benchmarks and describe what students will accomplish
at the end of each grade. They are the lowest level of the hierarchy.
Topic 8: Connecting all four levels in curriculum
In the image above, SLOs are at the bottom, which is the lowest level. All SLOs combine to
make a benchmark, benchmarks combine into standards, and standards into a competency.
Example:
Standard 1: All students will search for, discover and understand a variety of text types through
tasks which require multiple reading and thinking strategies for comprehension, fluency and
enjoyment.
Example:
6.1.1. periodic/formative assessment through homework, quizzes, class tests and group
discussions.
- A clear statement of the specific purpose(s) for which the assessment is being carried
out.
- A wide variety of assessment tools and techniques to measure students' ability to use
language effectively.
- Criteria to be used for determining performance levels for the SLOs for each grade level.
- Procedures for interpretation and use of assessment results to evaluate the learning
outcomes.
- MCQs
- Constructed response
o Restricted response
o Extended response
- Performance tasks
Lecture 3
1. A model for how students present knowledge and develop competence in the subject
domain
2. Tasks or situations that allow the examiner to observe the students‘ performance
Popular Taxonomies
1. Pre-structural
2. Uni-structural
3. Multi-structural
4. Relational
5. Extended Abstract
DOK (Depth of Knowledge) was presented by Webb in 1997, giving four levels of learning
activities
1. Recall
2. Skill/Concept
3. Strategic Thinking
4. Extended Thinking
Bloom's Taxonomy was presented by Benjamin Bloom in 1956. It consists of a framework
covering the most common objectives of classroom instruction.
Bloom’s Taxonomy of Learning Objectives
The objectives fall into three different domains, each with further subcategories.
1. Cognitive
2. Affective
3. Psychomotor
Cognitive Domain
i. Knowledge
ii. Comprehension
iii. Application
iv. Analysis
v. Synthesis
vi. Evaluation
Affective Domain
i. Receiving
ii. Responding
iii. Valuing
iv. Organization
v. Characterization
Psychomotor Domain
i. Perception
ii. Set
iii. Guided Response
iv. Mechanism
v. Complex Overt Response
vi. Adaptation
vii. Origination
Topic 12: SOLO Taxonomy
Levels of SOLO
1. Pre-structural
2. Uni-Structural
3. Multi-structural
4. Relational
5. Extended Abstract
1. Pre-structural
Students are simply able to acquire bits of unconnected information and respond to a question in
a meaningless way. Example of pre-structural level:
2. Uni Structural
The student shows a concrete understanding of the topic, but at this level is only able to respond
with one relevant element from the stimulus or item that is provided.
3. Multi- Structural
The student can understand several components, but the understanding of each remains discrete.
A number of connections are made but the significance of the whole is not determined. Ideas
and concepts around an issue are disorganized and aren't related together.
4. Relational
Student can indicate connection between facts and theory, action and purpose. Shows
understanding of several components which are integrated conceptually showing
understanding of how the parts contribute to the whole. Indicative verbs: compare/contrast,
explain causes, integrate, analyze, relate, and apply.
5. Extended Abstract
A student at this level is able to think hypothetically and can synthesize material logically.
The student makes connections not only within the given subject area; the understanding is
also transferable and generalizable to different areas. Indicative verbs: theorize, generalize,
hypothesize, reflect, generate
Topic 13: Depth of Knowledge
Levels of DOK
1. Recall
2. Skill/concept
3. Strategic Thinking
4. Extended Thinking
DOK measures the degree to which the knowledge elicited from students on assessments is
as complex as what students are expected to know and do as stated in the curriculum.
Recall
Recall of a fact, information, or procedure. The subject matter at this particular level usually
involves working with facts, terms and/or properties of objects.
Skill/Concept
Strategic Thinking
Items falling in this category demand a short-term use of higher order thinking processes, such as
analysis and evaluation, to solve real-world problems with predictable outcomes.
Extended Thinking
Learning outcomes at this level demand extended use of higher order thinking processes such as
synthesis, reflection, assessment and adjustment of plans over time.
There are three main domains of learning and all teachers should know about them and use them
to construct lessons.
• Cognitive Domain
• Affective Domain
• Psychomotor Domain
In 2000-01 revisions to the cognitive taxonomy were spearheaded by one of Bloom's former
students, Lorin Anderson, and Bloom's original partner in defining and publishing the cognitive
domain, David Krathwohl. One of the major changes that occurred between the old and the
newer updated version is that the two highest forms of cognition have been reversed.
Knowledge:
It is defined as the remembering of previously learned material. This may involve the recall of a
wide range of facts, principles and generalizations, and the recall of procedures and
processes.
Sample Question: Define the 6 levels of Bloom's taxonomy of the cognitive domain.
Comprehension:
It is defined as the ability to grasp the meaning of the material. The individual can make use of the
content or idea being communicated without necessarily relating it to other content or seeing its
fullest implications. Sample Question: Explain the purpose of Bloom's taxonomy of the cognitive
domain.
Application:
It refers to the ability to use previously learned material in new and concrete situations. The
abstractions applied may take the form of general ideas, rules or methods. Sample Question: Write an
instructional objective for each level of Bloom's taxonomy.
Analysis:
The breakdown of a concept into its constituent parts such that the relative hierarchy of the
concept is made easy to understand or the relation between the parts of the concept is made explicit.
Sample Question: Compare and contrast the cognitive and affective domains.
Synthesis:
Evaluation:
It is concerned with the ability to judge the value of the material for a given purpose. Judgments
are based on definite criteria. Sample Question: How far are the different BISEs and universities
developing papers using Bloom's taxonomy? Support your answer with arguments.
Topic 16: Revised version of Bloom’s Taxonomy
Levels
Remembering:
Exhibit memory of previously learned material by recalling facts, terms, basic concepts, and
answers.
Key verbs:
Choose, Define, Find, How, Label, List, Match, Name, Omit, Recall, Relate, Select, Show, Spell,
Tell, What, When, Where, Which, Who, Why
Understanding:
Constructing meaning from different types of messages, whether written or graphic, or from
activities.
Key verbs: Classify, Compare, Contrast, Demonstrate, Explain, Extend, Illustrate, Infer,
Interpret, Outline, Relate, Rephrase, Show, Summarize, Translate
Applying:
Solve problems in new situations by applying acquired knowledge, facts, techniques and rules in
a different way.
Key verbs: Apply, Build, Choose, Construct, Develop, Experiment with, Identify, Interview,
Make use of, Model, Organize, Plan, Select, Solve, Utilize.
Analyzing:
Breaking materials or concepts into parts, determining how the parts relate to one another, or
how the parts relate to an overall structure or purpose.
Key verbs: Analyze, Assume, Categorize, Classify, Compare, Conclusion, Contrast, Discover,
Dissect, Distinguish, Divide, Examine, Function, Inference, Inspect
Evaluating:
Making judgments based on criteria and standards through checking and critiquing.
Key verbs: Agree, Appraise, Assess, Award, Choose, Compare, Conclude, Criteria, Criticize,
Decide, Deduct, Defend, Determine, Disprove, Estimate
Creating:
Putting elements together to form a coherent or functional whole; reorganizing elements into a
new pattern or structure through generating, planning, or producing.
Key verbs: Adapt, Build, Change, Choose, Combine, Compile, Compose, Construct, Create,
Delete, Design, Develop, Discuss, Elaborate, Estimate, Formulate
These categories range from simple to complex and from concrete to abstract levels of students'
learning. It is assumed that the taxonomy represents a cumulative hierarchy, so that mastery of
each simpler category is considered a prerequisite to mastery of the next, more complex one.
Comparison of Bloom, SOLO and DoK
2. General objectives
3. Specific objectives
• Methods books
• Year books
• Curriculum Frameworks
• Test manuals
1. Completeness
2. Appropriateness
3. Soundness
4. Feasibility
General Objectives
• Objectives should be specific enough to provide direction for instruction, but not so
specific that instruction is reduced to training
• By stating general objectives in general terms, we provide for the integration of specific
facts and skills into complex responses
• General statements give teachers freedom in selecting the methods and materials of
instruction
• Understands concepts
• Interpret graphs
3. State each general objective to include only one general learning outcome
5. Keep each general objective sufficiently free of course content so it can be used with
various units of study
Each general objective must be defined by a sample of specific learning outcomes to clarify how
students can demonstrate that they have achieved the general objective. Until the general objectives
are further defined in this manner, they will not provide adequate direction for assessment.
Steps for Stating Specific Outcomes
1. List beneath each general objective a representative sample of specific learning outcomes
that describe the terminal performance students are expected to demonstrate
2. Begin each specific learning outcome with an action verb that specifies observable
performance
3. Make sure that each specific learning outcome is relevant to the general objective it
describes
4. Include enough SLOs to describe adequately the performances of students who have
attained the objectives.
5. Keep the SLOs sufficiently free of course content so that the list can be used with various
units of study
6. Consult reference materials for the specific components of those complex outcomes that
are difficult to define
Why Test?
In the classroom, decisions are constantly being made. Teachers face huge numbers of dilemmas
every day. These decisions can be of the following kinds
• Instructional
• Grading
• Diagnostic
• Selecting
• Placement
• Program or Curriculum
• Administrative
These types of decisions are taken at different levels. Some are decided at Board/Administrative
level, some are taken at school management level, and others are taken in classrooms by
teachers.
Instructional Decisions
Instructional decisions are the nuts-and-bolts decisions made in the classroom by teachers.
These are the most frequently made decisions. Such decisions include deciding to
• Instructional plans
Grading Decisions
Educational decisions based on grades are also made by the classroom teacher, but much less
frequently than instructional decisions. For most students, grading decisions are the most
influential decisions made about them.
Diagnostic Decisions
Diagnostic decisions are those made about a student's strengths and weaknesses and the reasons
behind them. Teachers make diagnostic decisions based on information yielded by an informal
teacher-made test.
Decisions of a diagnostic nature can also be made with the help of standardized tests (discussed
in the next session).
Selection Decisions
Selection decisions involve test data used in part for accepting or rejecting applicants for
admission into a group, program, or institution.
Placement Decisions
Placement decisions are made after an individual has been accepted into a program. They involve
determining where in the program someone is best suited to begin.
Counseling and guidance decisions involve the use of test data to help recommend programs of
study that are likely to be appropriate for the students.
Program or curriculum decision
This type of decision is taken at policy level, where it is decided whether a lesson, unit or subject
will be continued or abandoned for the next academic session according to the national objectives of
education.
Administrative Decisions
Administrative policy decisions may be made at school, district, state or national level.
How to Measure
In classroom assessment, different forms of assessment are used. Each form of test has its
own benefits and disadvantages. The most common type of assessment used in classrooms is written
assessment.
• Verbal
• Non-verbal
• Objective
• Subjective
• Teacher Made
• Standardized
• Power
• Speed
Verbal
Emphasizes reading, writing, or speaking. Most tests in education are verbal tests.
Non-verbal
Does not require reading, writing or speaking ability. Tests composed of numerals or drawings are
examples.
Objective
Refers to the scoring of tests. When two or more scorers can easily agree on whether an answer is
correct or incorrect, the test is an objective one. True-false, multiple-choice and matching tests are
examples.
Subjective
Also refers to scoring. When it is difficult for two scorers to agree on whether an item is correct
or incorrect, the test is a subjective one. Essay tests are an example.
Teacher Made
Constructed solely by the teacher, to be used only in his/her own classroom. This type of test is
custom designed according to the needs and issues of a specific class.
Standardized
Tests constructed by measurement experts over a period of years. They are designed to measure
broad national objectives and have a uniform set of instructions that are adhered to during each
administration.
Most also have tables of norms, to which a student's performance may be compared to determine
where the student stands in relation to a national sample of students at the same age or grade level.
Power
Tests with liberal time limits that allow each student to attempt each item. Items tend to be
difficult.
Speed
Tests with time limits so strict that no one is expected to complete all items. Items tend to be
easy.
Why Test?
The general purpose of assessment is to gather information in order to make better and more
informed decisions. The use to which that information is put is what differentiates types of
assessment. In an earlier session, the classification of assessment by method of interpreting results
was discussed. This session will further unpack the complexity of norm- and criterion-referenced
assessment.
NRT
A type of test that tells us where a student stands compared to other students. It helps
determine a student's place or rank among a group of similar students. This kind of test is
called a norm-referenced test (NRT).
Dimensions
• NRTs tend to be general. They measure a variety of skills at the same time but fail to measure
them thoroughly.
• It is hard to make decisions regarding the student's mastery of a skill in the subject.
An NRT provides an estimate of ability in a variety of skills in a much shorter time. NRT items are
more difficult for students to solve: on average, only about 50% of students get an item right.
A second type of test tells us about a student's level of proficiency in, or mastery of, some skill or
set of skills. This is achieved by comparing the student's performance to a standard of mastery called
a criterion. A test that yields such information is called a Criterion-Referenced Test (CRT).
Dimensions
• CRTs tend to be specific. A CRT measures a particular set of skills at one time and focuses on the
level of achievement of those skills. It gives a clear picture of the student's mastery of a
skill in the subject.
• It measures a skill more thoroughly, so it naturally takes more time than an NRT to
measure mastery of that skill.
• Items included in a CRT are relatively easier. Around 80% of the students are expected to
answer an item correctly.
• Sampled content in a CRT is much more comprehensive; usually three or more items are
used to cover a single objective.
• The meaning of the score does not depend on comparison with other scores.
• It flows directly from the connection between the items and the criterion.
• Items are chosen to reflect the criterion behavior. Emphasis is placed upon the domain of
relevant responses.
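The two ways of interpreting the same score can be sketched in a few lines of Python. This is only an illustration: the class scores, the maximum of 40 marks and the 80% mastery cutoff are assumed values chosen for the example, and the function names are hypothetical.

```python
from bisect import bisect_left


def percentile_rank(score, group_scores):
    """Norm-referenced reading: percentage of the group scoring below `score`."""
    ordered = sorted(group_scores)
    below = bisect_left(ordered, score)  # number of scores strictly lower
    return 100.0 * below / len(ordered)


def has_mastery(score, max_score, cutoff=0.8):
    """Criterion-referenced reading: compare the score with a fixed standard.

    The 0.8 cutoff is an assumed criterion, not one prescribed by the notes.
    """
    return score / max_score >= cutoff


# Hypothetical class scores out of 40 marks
group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]
student = 30

nrt_view = percentile_rank(student, group)  # standing relative to classmates
crt_view = has_mastery(student, 40)         # standing relative to the criterion
print(nrt_view, crt_view)
```

With these assumed numbers the student outscores most classmates yet falls short of the assumed mastery criterion, which shows why the two interpretations can lead to different decisions.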
Basis of comparison
• Comparison targets
• Selection of items
• Meaning of success
• Average item difficulty
• Score distribution
• Reported scores
Comparison targets
Meaning of success
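In CRT, the average item difficulty index (the proportion of examinees answering an item correctly)
is fairly high, i.e., items are relatively easy, because examinees are expected to show mastery. In
NRT, the average item difficulty index is lower, so that the test can spread out the examinees and
provide a reliable ranking.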
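As a small illustration, the item difficulty index can be computed as the proportion of examinees who answer an item correctly. The response data below are made up for the example (1 = correct, 0 = incorrect), and the function name is hypothetical.

```python
def item_difficulty(responses):
    """Proportion of examinees who answered the item correctly (1 = correct, 0 = wrong)."""
    return sum(responses) / len(responses)


# Hypothetical responses of ten examinees to one item from each kind of test
crt_item = [1, 1, 1, 1, 0, 1, 1, 1, 0, 1]
nrt_item = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

p_crt = item_difficulty(crt_item)
p_nrt = item_difficulty(nrt_item)
print(p_crt, p_nrt)
```

A higher index means an easier item, so an index near 0.8 fits the mastery orientation of a CRT, while an index near 0.5 spreads examinees out, as an NRT requires.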
Score Distributions
In CRT, a plot of the resulting score distribution will show most of the scores clustering near the
high end of the score scale. In NRT, a broader spread of scores is expected, with a few examinees
earning very low or very high scores and many earning medium scores.
Reported Scores
In an earlier session, the classification of assessment by use in classroom instruction was discussed.
This session will further unpack the complexity of formative and summative assessment.
Formative Assessment
Formative assessment provides feedback and information during the instructional process, while
learning is taking place. Formative assessment measures student
progress, but it can also assess your own progress as an instructor.
• Question and answer sessions, both formal (planned) and informal (spontaneous)
• Conferences between the instructor and student at various points in the semester
Summative Assessment
Summative assessment takes place after the learning has been completed and provides
information and feedback that sums up the teaching and learning process. Typically, no more
formal learning is taking place at this stage, other than incidental learning which might take place
through the completion of projects and assignments.
Summative assessment is more product-oriented and assesses the final product, whereas
formative assessment focuses on the process toward completing the product. Once the project is
completed, no further revisions can be made.
If students are allowed to make revisions, the assessment becomes formative.
• Term papers (drafts submitted during the semester would be a formative assessment)
• Performances
Topic 27: Functions of Summative Assessment
Table of specification
One of the tools used by teachers to develop a blueprint for a test is called a "Table of
Specification"; in other words, Table of Specification is a technical name for the blueprint of a test.
It is the first formal step in developing a test.
Carey (1988) listed six major elements that should be attended to in developing a Table of
Specifications for a comprehensive end of unit exam:
1. Balance among the goals selected for the exam (weighing objectives)
2. Balance among the levels of learning (higher order and lower order)
5. The number of test items for each goal and level of learning
2. Do the specifications indicate the nature and limits of the achievement domain?
5. Is the number of test items indicated for the total test and for each subdivision?
6. Are the types of items to be used appropriate for the outcomes to be measured?
7. Is the difficulty of the items appropriate for the types of interpretation to be made?
Topic 31: Balance among Learning Objectives and their Weight in table of
specification
In developing a test blueprint, it is first necessary to select some learning objectives.
Within this list, some objectives are more important, in the sense that more
instruction time is spent on them, while others are less important in terms of the time spent on
them in the classroom. So in developing a table of specification, balance among these learning
objectives is important; for this purpose we need to weigh the learning objectives to calculate
their relative weightage in the test.
Step 3
25 ±2 = 25 ±2
It can be a bit tricky if the total marks of the test are 50. Then 25% of 50 will
be 12.5 marks.
(Point total of questions for objective / total points on examination) * 100
= % of examination value
We have learnt to give weightage to the content areas in a table of specification. Now we
look at an example to develop a table of specification in practice. The following table of
specification comprises the topics to be covered in the test and their weightage, which represents
the percentage of marks for each topic.
Pakistan Movement
Time: (100/500)*100 = 20%
Geography of Pakistan
Time: (150/500)*100 = 30%
Climate Change
Time: (150/500)*100 = 30%
Industries
Time: (50/500)*100 = 10%
Economy
Time: (50/500)*100 = 10%
Let's suppose that we have to develop a test of 50 marks according to the table of
specification discussed above; then the distribution of marks for each topic is as under.
Industries 5 (10%)
Time: (50/500)*100 = 10%
Economy 5 (10%)
Time: (50/500)*100 = 10%
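The weighting arithmetic above can be sketched in Python. The sketch assumes that the figures 100, 150, 150, 50 and 50 (out of 500) are the instructional time spent on each topic, as in the table, and that the test carries 50 marks; the function and variable names are hypothetical.

```python
def topic_weights(time_by_topic):
    """Weight of each topic as a percentage of total instructional time."""
    total = sum(time_by_topic.values())
    return {topic: 100 * t / total for topic, t in time_by_topic.items()}


def marks_by_topic(weights, total_marks):
    """Marks for each topic in proportion to its percentage weight."""
    return {topic: total_marks * w / 100 for topic, w in weights.items()}


# Instructional time per topic, taken from the table of specification above
time_spent = {
    "Pakistan Movement": 100,
    "Geography of Pakistan": 150,
    "Climate Change": 150,
    "Industries": 50,
    "Economy": 50,
}

weights = topic_weights(time_spent)
marks = marks_by_topic(weights, 50)

for topic in time_spent:
    print(f"{topic}: {weights[topic]:.0f}% -> {marks[topic]:g} marks")
```

When a weight does not divide the total evenly (for example 25% of 50 marks gives 12.5), the test developer still has to decide how to round, which this sketch does not attempt.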
Published tests supplement and complement informal classroom tests, and aid in many
instructional decisions.
Published tests are designed and conducted in such a manner that each and every
characteristic is pre-planned and known.
There are many published tests available for school use. The two of most value to the instructional
program are:
1. Achievement tests
2. Aptitude tests
There are hundreds of tests available of each type. Selecting the most appropriate one is an
important task. In some cases published tests are used by teachers, but more frequently they are
used by provincial or national testing programs.
In classrooms the most used published tests are:
1. Achievement tests
2. Reading test
Published tests commonly used by provincial or national testing programs are:
1. Aptitude tests
2. Readiness tests
3. Placement tests
Regardless of the type of assessment or how the results are to be used, all assessments
should possess certain characteristics. The most essential of these are:
• Validity
• Reliability
• Usability
Validity
Reliability
Reliability vs Validity
Reliability of measurement is needed to obtain valid results, but we can have reliability
without validity. Reliability is a necessary but not sufficient condition for validity.
Usability
In addition to validity and reliability, an assessment procedure must meet certain practical
requirements, which include feasibility, administration environment and availability of results for
decision makers.
1. Nature of validity.
Validity is often referred to as the "validity of a test", but it is in fact the validity of the
interpretation and use to be made of the results.
It does not exist on an all-or-none basis. It is best considered in terms of categories that specify
degree, such as high, moderate or low validity.
No assessment is valid for all purposes. An arithmetic test may have a high degree of validity for
computational skill and a low degree for arithmetical reasoning.
Validity does not have different types. It is viewed as a unitary concept based on different kinds
of evidence.
1. Evidences of validity
2. Concept of content validity
3. Procedure to find content validity
4. Method of ensuring content validity.
Content
Construct
Criterion
Meaning
How well the sample of assessment tasks represents the domain of the tasks to be measured.
Procedure
It compares the assessment tasks to the specifications describing the task domain under
consideration
Method
Meaning
How well a test measures up to its claims. A test designed to measure depression must only
measure that particular construct, not closely related ideas such as anxiety or stress.
Procedure
Introduction Paragraph
It introduces the main idea, captures the interest of the reader and tells why the topic is
important.
1. Single sentence called the thesis statement is written
2. Background information about your topic provided
3. Definitions of important terms written
Supporting Paragraphs
Supporting paragraphs make up the main body of your essay.
1. List the points about main idea of essay.
2. Write separate paragraph for each supporting point.
3. Develop each supporting point with facts, details, and examples.
Method
1. Expert judgment
There are experts in the field. For the above example, people who are experts in essay writing
will be asked to assess the construct validity of the table, and the table will be revised
under their guidance.
2. Factor analysis
In this method, we group the questions according to the responses of respondents to them.
Criterion
Meaning
It demonstrates the degree of accuracy of a test by comparing it with another test, measure or
procedure which has been demonstrated to be valid.
Concurrent validity
This approach shows that a test is valid by comparing it with an already valid test.
Predictive validity
It involves testing a group of subjects for a certain construct and then comparing their scores
with results obtained at some point in the future.
Procedure
Compare assessment results with another measure of performance obtained at a later date (for
prediction) or with another measure of performance obtained concurrently (for estimating
present status)
Method
The degree of relationship can be described more precisely by statistically correlating the two
sets of scores. The resulting correlation coefficient provides a numerical summary of the
relationship.
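As a minimal sketch of the correlation step described above, the Python function below computes the Pearson correlation coefficient between a set of assessment scores and a criterion measure. The function name and the sample scores are hypothetical and are used only for illustration.

```python
from math import sqrt

def pearson_r(test_scores, criterion_scores):
    """Pearson correlation between two equally long lists of scores."""
    n = len(test_scores)
    if n != len(criterion_scores) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion_scores))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in criterion_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: aptitude test scores and later course grades for five students
aptitude = [50, 60, 70, 80, 90]
grades = [55, 65, 70, 85, 95]
r = pearson_r(aptitude, grades)
```

A value of r close to +1 would suggest that the assessment ranks students in much the same order as the criterion measure. A value near 0 would suggest little relationship.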
Consequences
Meaning
How well the use of assessment results accomplishes the intended purposes and avoids unintended
effects.
Procedure
Evaluate the effects of the use of assessment results on teachers and students. Both the
intended positive effects (e.g., increased learning) and possible unintended negative effects
(e.g., dropping out of school) need to be evaluated.
Considerations
• Unclear directions
• Ambiguity
• Overemphasis of easy-to-assess aspects of the domain at the expense of important but
hard-to-assess aspects
1. Reliability refers to the results obtained with an assessment instrument and not to the
instrument itself.
2. An estimate of reliability always refers to a particular type of consistency (stability,
equivalence, internal consistency).
3. Reliability is a necessary but not sufficient condition for validity.
4. Reliability is primarily statistical (coefficients range between -1 and +1).
Characteristics
1. Stability:
2. Equivalence:
3. Internal consistency:
• Test-Retest (stability)
1. Test-retest method
• It gives the same test twice to the same group, with any time interval between the tests. The
time interval can range from several minutes to several years.
Test-Retest
September 25        October 15
Form A              Form A
1. Item a  yes      1. Item a  yes
2. Item b  no       2. Item b  no
3. Item c  yes      3. Item c  yes
The time interval is the key point in this method.
• A very long interval will influence the results through instability and actual changes in
students over time.
Equivalent Forms method
• It gives two forms of the test to the same group in close succession.
September 25        September 25
Form A              Form B
1. Item a           1. Item a
2. Item b           2. Item b
3. Item c           3. Item c
Score = 82          Score = 78
Test-Retest with Equivalent Forms
• It gives two forms of the test to the same group with an increased interval between forms.
September 25        October 15
Form A              Form B
1. Item a           1. Item a
2. Item b           2. Item b
3. Item c           3. Item c
Score = 82          Score = 74
Split-Half method
• The test is given once. Two equivalent halves of the test are scored, and the correlation
between the halves is corrected to fit the whole test using the Spearman-Brown formula.
Split-half reliabilities tend to be higher than equivalent-forms reliabilities because the
split-half method is based on the administration of a single assessment.
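The Spearman-Brown correction mentioned above can be sketched as follows. This is a minimal illustration, assuming the correlation between the two half-tests has already been computed. The function name and the example value of 0.60 are hypothetical.

```python
def spearman_brown(r_half):
    """Estimate whole-test reliability from the correlation between two half-tests."""
    if r_half <= -1:
        raise ValueError("the correction is undefined for r_half <= -1")
    # Whole-test reliability = 2r / (1 + r), where r is the half-test correlation
    return (2 * r_half) / (1 + r_half)

# Hypothetical example: the two halves of a test correlate 0.60
full_test_reliability = spearman_brown(0.60)  # about 0.75
```

The corrected value is higher than the half-test correlation because the whole test is twice as long as each half.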
Kuder-Richardson method
• The test is given once. The total test is scored and a Kuder-Richardson formula is applied.
As with the split-half method, these formulas provide an index of internal consistency but do
not require splitting the assessment in half for scoring purposes.
One formula, KR20, is applicable only when student responses are scored dichotomously (0 or 1).
It is most useful with traditional test items scored correct or incorrect.
The generalization of KR20 for assessments that have more than dichotomous, right-wrong
scores is called Coefficient Alpha.
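A minimal sketch of Coefficient Alpha is given below, assuming the scores are arranged as one row per student and one column per item. When every item is scored 0 or 1, the same calculation gives the KR20 value. The function name and the response data are hypothetical.

```python
from statistics import pvariance

def coefficient_alpha(item_scores):
    """Coefficient alpha for a students-by-items table of scores.

    With items scored 0 or 1, this gives the same value as KR20.
    """
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    # Variance of each item (column) across students
    item_variances = [pvariance([row[i] for row in item_scores])
                      for i in range(n_items)]
    # Variance of the students' total scores
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 4 students (rows) x 3 items (columns), 1 = correct
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = coefficient_alpha(responses)  # 0.75 for this made-up data
```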
Inter-Rater Method
• A set of student responses requiring judgmental scoring is given to two or more raters, who
score the responses independently.
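The notes do not say how the raters' scores are compared. As an assumption, the sketch below shows two common summaries for two raters: simple percent agreement and Cohen's kappa, which corrects the agreement for chance. The function names and the ratings are hypothetical.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of responses that both raters scored identically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters independently pick the same category
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in counts_a)
    if expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of four essays by two raters
rater_a = ["pass", "pass", "fail", "fail"]
rater_b = ["pass", "fail", "fail", "fail"]
agreement = percent_agreement(rater_a, rater_b)  # 0.75
kappa = cohens_kappa(rater_a, rater_b)           # 0.5
```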
Lecture 8: Alternate Assessment Tools
Many outcomes in the cognitive domain, such as those pertaining to knowledge, understanding,
and thinking skills, can be measured by paper-and-pencil tests. But there are still many
learning outcomes that require informal observation of natural interactions. Such outcomes can
be assessed by:
1. Observing students as they perform and describing or judging their behavior (Anecdotal
record).
2. Asking their peers about them and assessing social relationships (Peer appraisal).
3. Questioning them directly and assessing expressed interests (Self-appraisal).
4. Measuring progress by recorded work (portfolio).
Anecdotal records
Impressions gained through observation are apt to provide an incomplete and biased
picture, however, unless we keep an accurate record of our observations. The method for doing
so is called anecdotal records.
Anecdotal records are factual descriptions of meaningful incidents and events that the
teacher observes.
One should keep in mind the following points to use anecdotal records effectively.
1. Peer appraisal
Peer appraisal
In this procedure, students rate their peers on the same rating device used by their teacher.
It depends on greatly simplified procedures.
The "guess who" technique is based on the nomination method of obtaining peer ratings and is
scored by simply counting the number of mentions each student receives on each description.
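The counting step of the "guess who" technique can be sketched as below. This is only an illustration: the function name, the descriptions and the student names are hypothetical.

```python
from collections import Counter, defaultdict

def score_guess_who(nominations):
    """Count how often each student is named under each description.

    nominations: iterable of (description, nominated_student) pairs.
    Returns {description: Counter({student: number_of_mentions})}.
    """
    tallies = defaultdict(Counter)
    for description, student in nominations:
        tallies[description][student] += 1
    return dict(tallies)

# Hypothetical nominations collected from a class
nominations = [
    ("always helps others", "Ali"),
    ("always helps others", "Sara"),
    ("always helps others", "Ali"),
    ("good at sports", "Sara"),
]
scores = score_guess_who(nominations)
# scores["always helps others"]["Ali"] is 2
```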
Sociometric technique
This form is used to measure students' acceptance as seating companions, work
companions and play companions.
1. Portfolio
2. Weakness and strengths of portfolio
Portfolio
Systematic collection of student work into portfolios can serve a variety of instructional
and assessment purposes. The value of portfolios depends heavily on the clarity of purpose, the
guidelines for the inclusion of materials, and the criteria to be used in evaluating portfolios.
1. Specify purpose.
2. Provide guidelines for selecting portfolio entries.
3. Define the student's role in selection and self-evaluation.
4. Specify evaluation criteria.
5. Use portfolios in instruction and communication.
Strengths of portfolios
1. They can be readily integrated with instruction.
2. Provide an opportunity for students to show what they can do.
3. Encourage students to become reflective learners.
4. Help students in setting goals and in self-evaluation.
5. Help teacher and student to collaborate and reflect on the student's progress.
6. Are an effective way to communicate with parents.
7. Provide a mechanism for student-centered and student-directed conferences with
parents.
8. Provide concrete examples of students' development and current skills.
Weaknesses of portfolios
1. Purpose of Portfolio
2. Guidelines for portfolios entries.
Purposes of portfolios
Fundamentally, there are two global purposes for creating portfolios of student work: student
assessment and instruction. Portfolios can be used to showcase students' accomplishments and to
document their progress.
Instructional purposes:
When the primary purpose is instruction, the portfolio might be used as a means of:
Assessment purposes:
When the focus is on accomplishments, portfolios usually are limited to finished work and
may cover only a relatively small period of time.
When the focus is on demonstrating growth and development, the time frame is longer. The
portfolio will include multiple versions of the same work over time to measure progress.
It contains student-selected entries. It demonstrates the student's ability to choose their
best work, which in turn demonstrates their ability to do a task.
It implies that the work is complete for a specific audience. A job application portfolio, for
example, is a finished product for a specific audience.