Exploring Teachers' Informal Formative Assessment Practices and Students' Understanding in the Context of Scientific Inquiry
Contract grant sponsor: Educational Research and Development Centers Program (Office of Educational Research
and Improvement, U.S. Department of Education); Contract grant number: R305B60002.
Correspondence to: M. Araceli Ruiz-Primo; E-mail: aruiz@stanford.edu
DOI 10.1002/tea.20163
Published online 8 December 2006 in Wiley InterScience (www.interscience.wiley.com).
© 2006 Wiley Periodicals, Inc.
learning (acting). Teachers plan the implementation of this type of formative assessment at the
beginning, during, or at the end of a unit.
Conversely, informal formative assessment is more improvisational and can take place in any
student–teacher interaction at the whole-class, small-group, or one-on-one level. It can arise out of
any instructional/learning activity at hand, and it is "embedded and strongly linked to learning
and teaching activities" (Bell & Cowie, 2001, p. 86). The information gathered during informal
formative assessment is transient (Bell & Cowie, 2001; e.g., students' comments, responses, and
questions) and many times goes unrecorded. It can also be nonverbal (based on teachers'
observations of students during the course of an activity). The time-frame for interpreting and
acting is more immediate when compared with formal formative assessments. A student's
incorrect response or unexpected question can trigger an assessment event by making a teacher
aware of a student's misunderstanding. Acting in response to the evidence found is usually quick,
spontaneous, and flexible, because it can take different forms (e.g., responding with a question,
eliciting other points of view from other students, conducting a demonstration when appropriate,
repeating an activity).
Our study focuses on the latter type of formative assessment, the ongoing and informal
assessment that can take place in any student–teacher interaction and that helps teachers acquire
information on a continuing basis. In this context, we use eliciting, recognizing, and using to
describe the teachers' activities, rather than gathering, interpreting, and acting, to more accurately
reflect the model of informal formative assessment. In formal formative assessment, teachers
gather (i.e., collect or bring together) information from all the students in the groups at a planned
time; however, the word eliciting means evoking, educing, bringing out, or developing. To
describe a teacher's actions as eliciting during informal formative assessment is thus a more
accurate description, as teachers are calling for a reaction, clarification, elaboration, or
explanation from students. During formal formative assessment, teachers have the time to step
back to analyze and interpret the information collected/gathered. Based on this interpretation an
action can be planned (e.g., reteaching a concept). However, during informal formative
assessment, teachers must react on the fly by recognizing whether a student's response is a
scientifically accepted idea and then use the information from the response in a way that the
general flow of the classroom narrative is not interrupted (e.g., calling on students in the class to start a
discussion, shaping students' ideas). Therefore, we believe that informal formative assessment,
unlike formal formative assessment, is better conceptualized as a cycle of eliciting, recognizing,
and using. Explanations and examples of formal and informal formative assessment practices are
provided in Table 1. In what follows we contextualize informal assessment in the everyday
classroom talk in a scientific inquiry learning environment.
Classroom Talk
Classroom talk has been established as a legitimate object of study (Edwards & Westgate,
1994). Early research in the process–product tradition focused on the conceptual level of teacher
questions (Redfield & Rousseau, 1981) and other teacher behaviors that might be correlated
with measures of student learning (Brophy & Good, 1986). Contemporary studies have placed
teacher–student discourse in context by examining authority structures, the responsiveness of the
teacher to students' contributions, and patterns in classroom talk (Cazden, 2001; Edwards &
Mercer, 1987; Lemke, 1990; Scott, 1998).
One of these patterns of student–teacher interaction that has become the subject of extensive
discussion has been alternately described as IRE (Initiation, Response, Evaluation) or IRF
(Initiation, Response, Feedback). In this sequence, the teacher initiates a query, a student
responds, and the teacher provides a form of evaluation or generic feedback to the student
(Cazden, 2001). The IRE and IRF sequences are characterized by the teacher often asking
"inauthentic" questions (Nystrand & Gamoran, 1991) to which the answer is already known by
the teacher, sometimes for the sake of making the classroom conversation appear more like a
dialogue than a monologue. Teaching practices that constrain students within IRE/F patterns
have been criticized because they involve students in procedural rather than authentic
engagement (Nystrand & Gamoran, 1991).

Table 1
Differences between formal and informal formative assessment practices

Formal (designed to provide evidence about students' learning): Gathering, Interpreting, Acting.
Example of acting: the teacher plans an action to help students achieve learning goals, such as
writing or changing lesson plans to address the state of student learning.
Informal (embedded in everyday classroom talk): Eliciting, Recognizing, Using.
Informal Formative Assessment and Classroom Talk: Assessment Conversations
Ongoing formative assessment occurs in a classroom learning environment that helps
teachers acquire information on a continuing and informal basis, such as within the course of daily
classroom talk. This type of classroom talk has been termed an assessment conversation (Duschl,
2003; Duschl & Gitomer, 1997), or an instructional dialogue that embeds assessment into an
activity already occurring in the classroom. Assessment conversations permit teachers to
recognize students' conceptions, mental models, strategies, language use, or communication
skills, and allow them to use this information to guide instruction. In classroom learning
environments in which assessment conversations take place, the boundaries of curriculum,
instruction, and assessment should blur (Duschl & Gitomer, 1997). For example, an instructional
activity suggested by a curriculum, such as discussion of the results of an investigation, can be
viewed by the teachers as an assessment conversation to find out about how students evaluate the
quality of evidence and how they use evidence in explanations.
Assessment conversations have the three characteristics of informal assessment previously
described: eliciting, recognizing, and using information. Eliciting information employs strategies
that allow students to share and make visible or explicit their understanding as completely as
possible (e.g., sharing their thinking with the class on overheads or posters). Recognizing students'
thinking requires the teacher to judge differences between students' responses, explanations, or
mental models so that the critical dimensions relevant for their learning can be made explicit (e.g.,
the teacher compares students' responses according to the evidence provided or responds to students
by asking which explanation is more scientifically acceptable based on the information provided).
Using information from assessment conversations involves taking action on the basis of student
responses to help students move toward learning goals (e.g., helping students to reach consensus
on a particular explanation based on scientific evidence). The range of student conceptions at
different points of a unit should determine the nature of the conversation; therefore, more than one
iteration of the cycle of eliciting, recognizing, and using may be needed to reach a consensus that
reflects the most complete and appropriate understanding.
Assessment conversations can thus be described in the context of classroom talk moves
(Christie, 2002) as ESRU cycles: the teacher Elicits a question; the Student responds; the teacher
Recognizes the student's response; and then Uses the information collected to support student
learning.2 This model of informal formative assessment is depicted in Figure 1.
We distinguish the ESRU model from the IRE/F sequence in three ways: First, the eliciting
questions used to initiate a sequence have the potential for providing information on the evolving
status of student conceptions and understanding about scientific inquiry skills and habits of mind.
For example, teachers can ask questions about understanding key points (e.g., clarification
questions about vague statements or lifting questions to improve the level of discussion). Second,
recognizing students' responses in diverse ways (e.g., revoicing, rephrasing) is considered
fundamental because it indicates to the student that his/her contribution has been heard and
accepted into the ongoing classroom narrative (Scott, 1998). Recognizing students' responses also
provides the opportunity for the teacher to pull out the essential aspects of student participation
and to act on them, and also provides the student the opportunity to evaluate the correctness of the
teacher's interpretation of their contribution.
Third, our conception of using comes from the formative assessment and the scientific inquiry
literature (Black & Wiliam, 1998; Clarke, 2000; Duschl, 2000, 2003; Duschl & Gitomer, 1997;
National Research Council, 2001; Ramaprasad, 1983; Sadler, 1989, 1998). Use, in our sequence,
is more specific than the meaning assigned to feedback in the IRF sequence. Within use, a
teacher can provide students with specific information on actions they may take to reach learning
goals: ask another question that challenges or redirects the student's thinking; model
communication; promote the exploration and contrast of students' ideas; make connections
between new ideas and familiar ones; recognize a student's contribution with respect to the topic
under discussion; or increase the difficulty of the task at hand. Clearly, evaluations by themselves
(e.g., "good" or "excellent") cannot be part of our ESRU pattern unless they are embedded in
some elaboration on the rationale behind the evaluation provided; that is, helpful feedback.
Defining feedback as a kind of use of formative assessment separates it from more general
interpretations normally included in the IRF sequence.3
Assessment conversations, then, require teachers to be facilitators and mediators of student
learning, rather than providers or evaluators of correct or acceptable answers. In summary,
successful classrooms emphasize not only the management of actions, materials, and behavior, but
also the management of reasoning, ideas, and communication (Duschl & Gitomer, 1997).
Assessment Conversations in the Context of Scientific Inquiry
A goal of the present science education reforms is the development of thinking, reasoning,
and problem-solving skills to prepare students for the development and evaluation of scientific
knowledge claims, explanations, models, scientific questions, and experimental designs (Duschl
& Gitomer, 1997). The National Research Council (2001) describes these habits of mind as
fundamental abilities for students as they engage in the learning of science through the process of
inquiry. If students are taught science in the context of inquiry, they will know what they know, how
they know what they know, and why they believe what they know (Duschl, 2003).
Frequent and ongoing assessment activities in the classroom can help achieve these habits of
mind by providing information on the students' progress toward the goal and helping teachers
incorporate this information to guide their instruction (Black & Wiliam, 1998; Duschl & Gitomer,
1997; Duschl, 2003). Assessment conversations in the context of scientific inquiry should focus on
three integrated domains (Driver, Leach, Millar, & Scott, 1996; Duschl, 2000, 2003): epistemic
frameworks for developing and evaluating scientific reasoning; conceptual structures used when
reasoning scientifically; and social processes that focus on how knowledge is communicated,
represented, and argued. Epistemic structures are the knowledge frameworks that involve the rules
and criteria used to develop and/or judge what counts as scientific (e.g., experiments, hypotheses,
or explanations). Epistemic frameworks emphasize not only the abilities involved in the processes
of science (e.g., observing, hypothesizing, and experimenting, using evidence, logic, and
knowledge to construct explanations), but also the development of the criteria to make judgments
about the products of inquiry (e.g., explanations or any other scientific information). When
students are asked to develop or evaluate an experiment or an explanation, we expect them to use
epistemic frameworks. Conceptual structures involve deep understanding of concepts and
principles as parts of larger scientific conceptual schemes. Scientific inquiry requires knowledge
integration of those concepts and principles that allows students to use that knowledge in an
effective manner in appropriate situations. Social processes refer to the frameworks involved in
the scientific communication required of students while engaging in scientific inquiry, which can
be oral, written, or pictorial. These involve the syntactic, pragmatic, and semantic structures of scientific
knowledge claims, their accurate presentation and representation, and the use of diverse forms of
discourse and argumentation (Duschl & Grandy, 2005; Lemke, 1990; Marks & Mousley, 1990;
Martin, 1989, 1993; Schwab, 1962).
A key characteristic of effective informal formative assessment is to promote frequent
assessment conversations that may allow teachers to listen to inquiry (Duschl, 2003). Listening to
inquiry should focus on helping students examine "how scientists have come to know what they
believe to be scientific knowledge and why they believe this knowledge over other competing
knowledge claims" (Duschl, 2003, p. 53). Therefore, informal formative assessment that
facilitates listening to inquiry should: (1) involve discussions in which students share their
thinking, beliefs, ideas, and products (eliciting); (2) allow teachers to acknowledge student
participation (recognizing); and (3) allow teachers to use students' participation as the springboard
to develop questions and/or activities that can promote their learning (using).
Table 2

Epistemic frameworks

Eliciting. Teacher asks students to:
- Compare/contrast observations, data, or procedures
- Use and apply known procedures
- Make predictions/provide hypotheses
- Interpret information, data, or patterns
- Provide evidence and examples

Recognizing. Teacher:
- Clarifies/elaborates based on students' responses
- Takes votes to acknowledge different students' ideas
- Repeats/paraphrases students' words
- Revoices students' words (incorporates students' contributions into the class conversation, summarizes what a student said, acknowledges a student's contribution)

Using. Teacher:
- Promotes students' thinking by asking them to elaborate on their responses (why, how)
- Compares/contrasts students' responses to acknowledge and discuss alternative explanations/conceptions
- Promotes debate and discussion among students' ideas/conceptions
- Helps students to achieve consensus
Table 2 presents examples of the teacher strategies that constitute informal formative
assessment (eliciting, recognizing, and using information) across two of the three domains, epistemic frameworks and conceptual
structures.4 We have not specified teacher interventions in the social domain because we believe
this domain is by nature embedded in assessment conversations. Duschl (2003) described the
social domain as "processes and forums that shape how knowledge is communicated, represented,
argued and debated" (p. 42). Thus, when teachers elicit student conceptions with a question of a
conceptual or epistemic nature, they are simultaneously engaging students in the social process of
scientific inquiry; that is, inviting students to communicate, represent, argue, and debate their
ideas about science. In this manner, questions categorized as either conceptual or epistemic can
also be demonstrated to have a social function. For example, a teacher may ask a student to
compare and contrast the concepts of weight and mass. While doing so serves the purpose of
helping students to differentiate between the two concepts, it may also inform students about the
appropriate scientific language for communicating ideas in class. In the epistemic domain, asking
students to provide evidence for their ideas serves the dual purposes of challenging students to
consider how they know what they know and of establishing standards for knowledge claims
communicated in science.
Rather than including students' responses in Table 2, we chose to examine the teachers'
responses by referring to students' responses and then using that information to determine the kind
of teacher response that had been delivered (e.g., repeats students' words, revoices students'
words). It is important to note that the dimensions of scientific inquiry are used only to distinguish
the strategies used in the eliciting phase. The rationale is that the eliciting strategies (questions)
can be linked more clearly to these dimensions, whereas recognizing and using strategies (actions
taken by the teacher) can be used as a reaction to any type of initial question and student response.
The Appendix provides a complete list of the strategies used to code teachers' questions and
actions.
The Study
The data analyzed in the present study were collected as part of a larger project conducted
during the 2003–2004 school year (Shavelson & Young, 2000). This project was a collaboration
between the Stanford Educational Assessment Laboratory (SEAL) and the Curriculum Research
and Development Group (CRDG) at the University of Hawaii. The project focused on the first
physical science unit of the Foundational Approaches to Science Teaching (FAST I; Pottenger &
Young, 1992) curriculum, titled Properties of Matter. The FAST curriculum is a constructivist,
inquiry-based, middle-school science education program developed by the CRDG and aligned
with the National Science Education Standards (Curriculum Research and Development Group,
2003; Rogg & Kahle, 1997).
In Properties of Matter, students investigate the concepts of mass, volume, and density,
culminating in a universal, density-based explanation of sinking and floating. Most of the
investigations in the unit follow a basic pattern of presenting a problem for students to investigate,
collecting data, preparing a graph, and responding to summary questions in a class discussion. The
curriculum places an emphasis on students working in small groups and participating in class
discussion to identify and clarify student understanding; thus, the FAST curriculum provides
multiple opportunities for assessment conversations to take place. The present analysis focuses on
the implementation of the first four investigations in this unit, Physical Investigation 1 (PS1) to
Physical Investigation 4 (PS4).
It is important to note that analysis of the teachers' informal formative assessment practices
took place over time. That is, the information provided herein is not based on a single lesson but on
all the lessons that took place during the implementation of the four FAST investigations. By
analyzing each of these teachers over a period of time, we sought to characterize their informal
assessment practices in a manner not possible in studies that analyze conversation data from one
teacher (Kelly & Crawford, 1997; Roth, 1996), across different grade levels (van Zee, Iwasyk,
Kurose, Simpson, & Wild, 2001), or in excerpts of short duration (Resnick, Salmon, Zeitz,
Wathen, & Holowchask, 1993). In the next section, we describe in detail the instruments and
means used to collect the information.
Method
Participants
We selected three teachers and their students from the original 12 involved in the SEAL/
CRDG project on the basis of maximizing contrasts between informal assessment practices. These
teachers were also selected because, at the time of the original analysis, videotape data and
measures of student performance on the pretest were available. There were no significant
differences in students' pretest performance across the three teachers' classrooms.
The teachers participating in this study were asked to videotape their classrooms in every
session they taught beginning with their first FAST lesson. Each teacher was provided with a
digital video camera, a microphone, and videotapes. We provided each teacher with suggestions
on camera placement and trained them on how to use the camera. Teachers were asked to submit
the tapes to SEAL every week in stamped envelopes. The teachers' characteristics are briefly
described in Table 3.

Table 3
Characteristics of teachers

                                 Rob                          Danielle                     Alex
Gender                           Male                         Female                       Male
Ethnicity                        White (not Hispanic origin)  White (not Hispanic origin)  White (not Hispanic origin)
Highest degree earned            BA                           MA                           MA
Major in science                 Yes                          No                           Yes
Minor in science                 Yes                          No                           Yes
Teacher credential               State, in science            PreK-6                       Residency certification,
                                 (diverse areas)                                           K-8 science and English
Years of teaching                14                           –                            2
Years of teaching science        14                           –                            2
Years teaching sixth/seventh     3                            1                            –
Grade level taught               7                            6                            7
Science session length           55 minutes                   40 minutes                   55 minutes
Number of students taught        25                           25                           26
Rob. Rob teaches at a middle school in a rural community near a large city in the Midwest.
Students at the school are mostly white, and the highest percentage minority population in the
school is Native American (11%). Both Rob and a school administrator characterized the school as
being a "joiner," taking opportunities to participate in nationwide assessments such as the
National Assessment of Educational Progress in the hope of receiving financial compensation.
Rob majored in science and recently completed a master's degree in geosciences. He has taught
science for 14 years, working for 3 years at the sixth and seventh grade level. Although he has been
trained in multiple levels of FAST, the SEAL/CRDG project represented his first time teaching
FAST I. During the investigations, pairs of students would turn around to work with the pair of
students seated behind them as a group of four.
Danielle. Danielle teaches at an elementary school in the Northeast. While she is credentialed
to teach pre-kindergarten through sixth grade, she is now exclusively teaching sixth grade
mathematics, science, and reading as a result of being bumped from her position as a kindergarten
teacher during the year prior to the study. She majored in English as an undergraduate, and holds a
master's degree in elementary education and reading. Her participation in the SEAL/CRDG
project came during her second year of teaching FAST. Danielle's school is located in a
predominantly white, lower middle-class suburb in an expansive building that also houses the
district offices. Students sit at desks clustered into tables of four to six, and conduct their
investigations at these same tables. Danielle described her school district as being very supportive
of the FAST program, as well as able to provide teachers with opportunities to learn more about
students' understanding. For example, Danielle took the eighth grade science test along with other
teachers from her district, discussing students' responses in depth along with possible ways to
identify the areas in which students need more assistance.
Alex. Alex was in his third year of teaching during the study, working at a middle school in the
Pacific Northwest. The student population is mostly white and middle class, but with a
relatively high proportion of Native American students from a nearby reservation. He majored in
natural resource management and holds a master's degree in teaching. Alex described his school's
administration as being supportive of his professional development, paying for him and other
teachers in the department to attend National Science Teachers Association conferences. During
the year of the SEAL/CRDG study, Alex taught six classes each day: five seventh grade FAST
classes and one elective class, which he added voluntarily to his course load, called Exploring
Science. His classroom is larger than Danielle's or Rob's, with space for desks aligned in rows in
the center of the classroom and laboratory stations, each with a large table and sink, around the
periphery of the classroom.
Coding the Videotape Data
All the videotapes for every session taught from PS1 to PS4 were transcribed, amounting
to transcripts from 30 lessons across the three teachers.5 The transcripts identified whether
statements were made by the teacher or by students. Each speaking turn made by the teacher was
numbered and segmented into verbal units (VUs). VUs were determined by the content of the
teacher's statement.
Next, we identified the assessment conversations that occurred within each transcript. We
focused only on assessment conversations that involved teacher–whole-class interactions.
Assessment conversations were identified based on the following criteria (Duschl & Gitomer,
1997): the conversation concerned a concept from or aspect of one of the FAST investigations
(e.g., classroom procedures such as checking out books or upcoming school assemblies were not
considered); the teacher was not the only speaker during the episode (i.e., students also had turns
speaking); and student responses were elicited by the teacher through questions. Transcripts that
did not include assessment conversations were not coded further. Twenty-six assessment
conversations were identified across the instructional episodes and teachers. Of these
conversations, most (46%) were observed in Danielle's classroom, followed by Alex's (31%)
and Rob's (23%).
We then coded the transcripts of the assessment conversations according to the ESRU model.
All coding of transcripts was performed while watching the videotapes. First, all the assessment
conversations identified across transcripts were coded at the speaking-turn level by two coders.
This coding process consisted of identifying the exact strategy being used by the teacher in a
particular speaking turn and/or verbal unit (e.g., teacher asks students to provide predictions,
teacher repeats student response, etc.).
Because each strategy we coded was mapped onto the ESRU model (see Table 2), we were
then able to identify instances in which the teacher–student conversation followed the ESRU
model. In this phase of the coding process, we identified complete (ESRU) and incomplete cycles
(ES and ESR) within the two types of inquiry dimensions, Epistemic and Conceptual. We
also coded ESRU cycles that involved Non-Inquiry types of codes (e.g., asking the students
simple questions). Therefore, nine types of ESRU cycles could be identified in the transcripts:
three for each of the two inquiry dimensions and three for the Non-Inquiry category.
In some cases, parts of the ESRU cycle were implicit in some of the teachers' comments. For
example, a teacher's question, "How do you know that?" implicitly recognizes the student's
response, and uses that response to elicit more information from the student. Thus, we coded
statements such as this as implicit cases of recognizing and eliciting. In other cases, teachers would
elicit a particular question from the class, and then recognize individual students by name without
repeating the question. In these cases, we coded the speaking turn as an implicit case of eliciting
the previously asked question. Finally, coders noted when the same student was involved in the
teacher–student interactions by using arrows to connect the cycles.
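To make the coding step concrete, the following minimal sketch (ours, not the authors' actual coding software; the Turn structure and function names are assumptions for illustration) shows how a sequence of speaking turns coded E, S, R, or U could be grouped into complete (ESRU) and incomplete (ES, ESR) cycles, with each cycle inheriting the inquiry dimension of its eliciting question:

```python
# Minimal sketch of grouping coded speaking turns into ESRU cycles.
# Illustrative only; not the authors' actual tooling.
from dataclasses import dataclass

@dataclass
class Turn:
    code: str       # "E" (elicit), "S" (student response), "R" (recognize), "U" (use)
    dimension: str  # "epistemic", "conceptual", or "non-inquiry"

def classify_cycles(turns):
    """Split a coded transcript into cycles (one per eliciting question) and
    label each cycle ES, ESR, or ESRU by how far it progressed."""
    cycles, current = [], []
    for turn in turns:
        if turn.code == "E" and current:    # a new elicit closes the open cycle
            cycles.append(current)
            current = []
        current.append(turn)
    if current:
        cycles.append(current)

    labeled = []
    for cycle in cycles:
        codes = "".join(t.code for t in cycle)
        for kind in ("ESRU", "ESR", "ES"):  # longest completed prefix wins
            if codes.startswith(kind):
                # the eliciting question fixes the cycle's inquiry dimension
                labeled.append((cycle[0].dimension, kind))
                break
    return labeled

# A complete epistemic cycle followed by an incomplete conceptual one:
transcript = [Turn(c, "epistemic") for c in "ESRU"] + [Turn(c, "conceptual") for c in "ESR"]
print(classify_cycles(transcript))  # [('epistemic', 'ESRU'), ('conceptual', 'ESR')]
```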
Intercoder reliability. The identification of the ESRU cycles was done independently. Four
of the transcripts were used for training purposes and five were used to assess the consistency of
coders in identifying the cycles. Because the main interest was in determining the consistency of
coders in identifying types of ESRUs, we calculated the intercoder reliability by type. The
averaged intercoder reliability coefficient across the nine types of ESRUs was 0.74. The lowest
reliability was found in the Non-Inquiry ESRU category. When only the Epistemic and
Conceptual ESRUs were considered, the averaged intercoder reliability was 0.81. We
concluded that coders were consistent in identifying the two important types of ESRUs. Therefore,
the remaining transcripts (10), omitting those that had poor sound, no sound, or did not include
discussions (11), were coded independently.6
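The article does not state the formula behind these coefficients, so the sketch below illustrates only one plausible convention, the proportion of matching labels per cycle type, for quantifying how consistently two coders classify the same ordered list of cycles; the label strings are hypothetical:

```python
# One plausible (assumed) per-type agreement measure between two coders.
from collections import Counter

def agreement_by_type(coder_a, coder_b):
    """Proportion of matching labels for each cycle type assigned by coder A."""
    assert len(coder_a) == len(coder_b), "coders must label the same cycles"
    matches, totals = Counter(), Counter()
    for a, b in zip(coder_a, coder_b):
        totals[a] += 1
        if a == b:
            matches[a] += 1
    return {t: matches[t] / totals[t] for t in totals}

a = ["epistemic-ESRU", "epistemic-ESR", "conceptual-ES", "non-inquiry-ESR"]
b = ["epistemic-ESRU", "epistemic-ESR", "conceptual-ESR", "non-inquiry-ESR"]
print(agreement_by_type(a, b))
# {'epistemic-ESRU': 1.0, 'epistemic-ESR': 1.0, 'conceptual-ES': 0.0, 'non-inquiry-ESR': 1.0}
```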
Measures of Student Learning
To assess student learning we used two sources of information: a multiple-choice
achievement test administered as a pretest, and the information collected in an embedded
assessment administered after Investigation 4 (PS4). The 38-item, multiple-choice test included
questions on density, mass, volume, and relative density.
The embedded assessment analyzed for this study involved three types of assessments:
Graphing (3 items); Predict–Observe–Explain (POE; 12 items); and Prediction Question
(2 items). These assessments were designed to be implemented in three sessions, each taking about
40 minutes to complete.7 In the Graphing prompt, students were provided with a representation of
data that is familiar to them: a scatterplot that shows the relationship between two variables, mass
and depth of sinking. This prompt requires students to: (a) judge the quality of the graph
(completeness and accuracy); (b) interpret a graph that focuses on the variables involved in sinking
and floating; and (c) self-evaluate how well they judged the quality of the graph presented. In the
POE prompt students were first asked to predict the outcome of some event related to sinking and
floating and then justify their prediction. Then students were asked to observe the teacher carrying
out the activity and describe the event that they saw. Finally, students were asked to reconcile and
explain any conflict between prediction and observation. The Predict–Observe (PO) prompt is a
slight variation on the POE. This prompt asked students only to predict and explain their
predictions (P) and observe (O) an event; they were not asked to reconcile their predictions with
what was observed. POs act as a transition to the next instructional activity of the unit in which the
third piece of the prompt, the explanation, will emerge. POs can be thought of as a springboard that
motivates students to start thinking about the next investigation.
Reliability of measures of student learning. The internal consistency of the multiple-choice
pretest was 0.86 (Yin, 2004). To assess the interrater reliability of the embedded assessments, we
randomly selected 20 students' responses to each prompt, which were scored independently by two scorers.
The magnitude of the interrater reliability coefficients varied according to the question at hand
(Graphing = 0.97, POE = 0.98, PO = 0.94). We concluded that scorers were consistent in scoring
and coding students' responses to the different questions involved in the embedded assessment
after PS4. Based on this information, the remainder of the students' responses (56) were
scored by only one of the two scorers. For the 20 students scored for assessing the interscorer reliability, the
averaged score across the two raters was used for the analyses conducted.
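As a worked illustration of this double-scoring step, the sketch below assumes a Pearson correlation as the interrater coefficient (the article reports only the resulting values) and shows the across-rater averaging used for the double-scored responses; the score values are hypothetical:

```python
# Hypothetical double-scored responses; Pearson r is an assumed choice of
# interrater coefficient. Requires Python 3.10+ for statistics.correlation.
from statistics import correlation, mean

scorer_1 = [12.0, 8.5, 14.0, 6.0, 10.5]
scorer_2 = [11.5, 8.0, 14.0, 6.5, 10.0]

print(round(correlation(scorer_1, scorer_2), 2))          # interrater coefficient
final = [mean(pair) for pair in zip(scorer_1, scorer_2)]  # averaged scores used in analyses
print(final)                                              # [11.75, 8.25, 14.0, 6.25, 10.25]
```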
Results
This study responds to two research questions: (1) Can the model of informal formative
assessment distinguish the quality of informal assessment practices across teachers? (2) Can the
quality of teachers' informal formative assessment practices be linked to student performance? To
respond to the first question, we provide evidence about teachers' informal assessment practices.
To respond to the second question, we provide information about students' observed performance
on the pretest and embedded assessments, and link students' performance to the informal
assessment practices observed across teachers.
Distinguishing Informal Assessment Practices Across Teachers
Table 4 presents the frequency of each type of cycle observed in all the assessment
conversations analyzed. We identified both the complete and incomplete informal formative
assessment (ESRU) cycles in the epistemic and conceptual dimensions. Note that the incomplete
cycle, ESR (the teacher elicits, the student responds, and the teacher recognizes), was the most
frequent type of cycle observed for Rob and Alex.

Table 4
Frequency of informal formative assessment cycles by inquiry dimension and teacher
(For each teacher, the table reports ES, ESR, and ESRU cycle counts within the Epistemic,
Conceptual, and Non-Inquiry dimensions. Numbers in the second row for each teacher represent
the number of consecutive cycles with the same student: the number in parentheses is the number
of iterations with the same student, and the number outside the parentheses is the number of
times observed.)
Rob had the smallest number of complete and incomplete cycles. Only one complete cycle
was observed from the discussions we analyzed from his videotapes. Although Danielle and Alex
showed the highest frequency of incomplete cycles (ESRs) across the two scientific inquiry
dimensions, there is an important difference in the patterns of these cycles, which is discussed in
more detail in what follows. More complete cycles were observed for Alex than for Rob, but far
fewer than were observed for Danielle.
Another important aspect of the cycles captured during the coding was the consecutive
interactions that the teachers had with the same students (we named them iterations). They are
represented in the table as the number in parentheses. The number outside the parentheses
indicates the number of times that double, triple, or quadruple iterations were observed. We
interpreted these iterations as an indicator of spiraling assessment conversations in which the
teacher was looking for more evidence of the student's level of understanding that could
provide information for helping the student move forward in her/his learning. We observed
more iterations with the same student in Danielle's lessons than in those of Rob or Alex. Most
involved two iterations, although three and four iterations were also observed.
It is important to note that most of the cycles were classified as focusing on the Epistemic
dimension and very few on the Conceptual dimension, even though there was opportunity for doing
so, especially in Investigation 4 (PS4). Fifty-two percent of the cycles across the three teachers
were classified as Epistemic and only 7% as Conceptual. The rest of the cycles were
classified as Non-Inquiry. To explore this finding further, Table 5 presents information
regarding the eliciting cycle component by category within each dimension. The highest
percentage (25%) of the eliciting questions across the three teachers focused on Asking students
for the application of known procedures, observations, and predictions. Notice that very few
assessment conversations involved Formulation of explanations, Evaluation of the quality of the
evidence (to support explanations), or Comparing or contrasting others' ideas or explanations. It
is possible that omitting these important steps in favor of focusing on predictions and
observations provides students with an incomplete experience of inquiry learning. This
information is important because it suggests that these teachers paid more attention to the procedural
aspects of the epistemic frameworks than to the development of the criteria to make
judgments about the products of inquiry (e.g., explanations).

Table 5
Percentage of eliciting questions by category within dimension
(Percentages of eliciting questions in each strategy category of the Epistemic and Conceptual
dimensions for Rob, Danielle, and Alex.)
The highest percentage of the eliciting conceptual questions focused on asking students to
Define concepts (44%) and on Checking for students' understanding (32%). Some questions focused
on Applying concepts and very few on Comparing or contrasting concepts. The
highest percentage in the Non-Inquiry category, not shown in the table, focused on asking
students simple and yes/no questions. Alex had the highest percentage on this type of question
(52%), followed by Rob (47%) and Danielle (37%).
Across the three teachers, Repeating and Revoicing were the two recognizing strategies used
most frequently. Although repeating is a common way of acknowledging that students have
contributed to the classroom conversation, revoicing has been considered a better strategy for
engaging students in the classroom conversations (O'Connor & Michaels, 1993). The highest
percentage for use of this code was found for Danielle (31%), followed by Alex (20%), then Rob
(17%). When teachers practice revoicing, they work students' answers "into the fabric of an
unfolding exchange, and as these answers modify the topic or affect the course of discussion in
some way, these teachers certify these contributions and modifications" (Nystrand & Gamoran,
1991, p. 272). Revoicing, then, is not only a recognition of what the student is saying, but also
constitutes, in a way, an evaluation strategy because the teacher acknowledges and builds on the
substance of what the student says. Furthermore, it has been found that this type of engagement in
the classroom has positive effects on achievement (Nystrand & Gamoran, 1991).
Another strategy that has been considered essential in engaging students in assessment
conversations is Comparing and contrasting students' responses to acknowledge and discuss
alternative explanations or conceptions (Duschl, 2003). This strategy is essential in examining
students' beliefs and decision making concerning the transformation of data to evidence, evidence
to patterns, and patterns to explanations. We found that comparing and contrasting students'
responses was not a strategy frequently used by these teachers. The highest percentage was
observed for Danielle (10%), followed by Alex (7%). Across all the assessment conversations,
Rob used this strategy only once. The critical issue in this strategy is whether teachers point out
the critical differences in students' reasoning, explanations, or points of view. Furthermore, for
the ESRU cycle to be completed, the next necessary step would be helping students to achieve a
consensus based on scientific reasoning. This can be done by promoting discussion and
argumentation. Clearly, Promoting argumentation was not a strategy used by these teachers. It
seems, then, that the third and most important step to complete the cycle was missed. The
opportunity to move students forward in their understanding of conceptual structures and/or
epistemic frameworks was lost. Helpful feedback, or comments that highlight what has been done
well and what needs further work, was also a strategy rarely used. Danielle was the teacher who
provided the most helpful feedback (14%), followed by Alex (2%).
Characterizing Informal Assessment Practices Across Teachers
Rob. During Rob's enactment of the four investigations, he exhibited only a few instances of
complete ESRU cycles. More commonly, Rob's teaching involved broken cycles of ESR; that is,
cycles without the key step of using, basically the same as the IRE/F sequence. Although most of the
broken cycles took place in the Epistemic domain, the majority of his questions involved
discussions of procedures, such as how to create a graph. An example of Rob's conversational style
is illustrated in the excerpt contained in Table 6. During this part of the lesson, Rob provides some
guidance to the students on the task they will complete and models how to graph on an overhead.
Notice how Rob repeats students' responses but does not ask students to elaborate on them.
In addition, Rob's speaking turns are quite long in comparison to the shorter responses provided by
his students.

Table 6
Transcript excerpt from Rob's class (cycle codes in brackets)

Student [S]: [Inaudible.]
Rob [R]: I just stopped at nine because that's all we have up here. Now, we've been preaching about you've got to have units and variables, so what were we measuring down here, length?
Student [S]: Of the straw.
Rob [E]: And how did we measure it?
Student [S]: Centimeters.
Rob [R]: Centimeters, okay?
Rob: Well, we're going to have to come up with some sort of a scale, some way that we're going to be able to use our X and Y, horizontal and vertical axes, and so when we do that, let's try to figure out, how many tall is this? Three boxes tall. Focus that. It's blurry to me. Is it blurry to you? How many tall? I've only counted 20 boxes high, and what's our maximum number over here on the sheet? It looks like 39, so we could probably go by twos, okay, we'll start out with twos. Let's go with zero here, and everything, every full line is going to be two. Two, 4, 6, 8, 10, 12, 14, 16, 18, you guys can count, I don't have to count for you. I can almost get the whole thing on there for you to look at. Okay. Now, what did you say, 3 here? If we look back over to PS3 [inaudible]. It doesn't tell us, so we can put PS3 on [inaudible], that's fine. You'll know what we're talking about. So, PS3 and then put individual or mine or something like that that will tell you that this is your group's, and then we'll go to the next one and it will be PS3 and then we'll put the class; that way, you'll know one's class data and one's yours.
Student: [Inaudible.]
Rob: Okay. Now, going across the bottom here, let's move a little bit quicker on this, going across the bottom, our numbers are going to be, what, we've got 4 centimeters, 6, 5, 6, 7, 8, so we've got 10, so if we could go by four, how many [inaudible]?
Student: Sixteen.
Rob: Sixteen across the bottom. Let's say each one could be, or each two could be one, but that's probably going to push the issue. [Inaudible.] Okay, so let's go with each box is worth one, and that way we fit everything, we may get a little cramped but we'll fit everything on the page.
In the excerpt, Rob provides the students with direction on how to make a graph as he models
it for the class. Although he asks questions in the course of his initial speaking turn, he does not
wait for students to respond (repaired questions). Although students do respond to some of his
questions later, the conversation elicits a minimal amount of information, and the students'
responses are simply repeated by the teacher. Given these trends in Rob's teaching, it appears that
most of his discussions are not assessment conversations at all, but seem to be more general
whole-class discussions in which the teacher guides classroom talk and students participate
minimally.
Danielle. In contrast to Rob's teaching style, Danielle's classroom discussions are
characterized by a larger number of complete ESRU cycles. Although the majority of these
cycles were also in the Epistemic domain, as with Rob, Danielle's teaching differs in that she
asks multiple questions of the same student, often going through several iterations of the ESRU
cycle with the same student. Furthermore, she asks students to interpret patterns in their graphs in
addition to speaking about the procedure of constructing them. The differences between Danielle's
and Rob's teaching can be illustrated in the excerpt in Table 7, in which Danielle discusses
construction of the same graph as Rob worked with in the previous example. As with Rob, during
this conversation Danielle provides some guidance to the students on the task and she also
models on an overhead how to graph. However, Danielle asks questions that will provide her with
information about the students' understanding of constructing a graph. After repeating a
student's response, she asks students to elaborate on their responses. Also, she positively reinforces a
student for using "a very scientific word."

In contrast to Rob, Danielle starts her conversation in this excerpt with student input, and
builds upon it by repeating students' words (accepting them into the ongoing conversation), asking
students to elaborate, and clarifying and elaborating upon student comments. This pattern
occurred many times in our observations of Danielle's teaching. This example thus fits the model
of an assessment conversation, in that Danielle's questions are more instructionally responsive
than would be expected from an everyday lesson. That is, Danielle asks questions similar to
those Rob asks his students, and also frequently repeats what they have said; however, Danielle
pushes her students for more elaboration on their responses, and also provides feedback
that allows students to learn more about classroom expectations for learning. Danielle's lessons
also showed characteristics of assessment conversations not found in Rob's or Alex's lessons. For
example, she was the only one comparing and contrasting students' responses. Rather than
moving from student to student without identifying how students' ideas were different, Danielle
highlights these differences in the excerpt in Table 8, taken from a discussion about an unknown
liquid.
Alex. Like Rob, Alex's teaching included more broken than complete ESRU cycles,
although Alex completed cycles more frequently than Rob did. Like the other two teachers, Alex
modeled graphing on an overhead and checked students' understanding. However, as with Rob, the
conversation has more the tone of delivering guidance on how to make a graph than of building
upon student input. Also, like Danielle and Rob, Alex asked mostly procedural questions, but the
responses he sought were not provided by the students, so ultimately Alex provided the responses
himself. The excerpt in Table 9, like previous ones, is taken primarily from the Epistemic
domain. Rather than illustrating iterative ESRU cycles, this excerpt shows multiple broken cycles,
including one that ends with an evaluative comment.
Table 7
Transcript excerpt from Danielle's class (cycle codes in brackets; parenthesized codes are implicit)

Danielle [U (E)]: You didn't. . . What do we need to do? Eric?
Student [S]: The numbers.
Danielle [(R) U]: The numbers. Good. Excellent. I liked that you used the word scales, it's a very scientific word. So, yes, we need to figure out what the scales are, what we should number the different axes.
Although Alex does show concern for the level of his students' understanding, reflecting the
fact that he has gained information from their responses, he does not use the responses in a manner
consistent with the ESRU model; rather, he provides answers himself, and asks students if what he
has said makes sense to them. In this manner, the model of assessment conversations also does not
characterize well the patterns of interactions in Alex's classroom.
Summary. Given the divergent pictures of informal formative assessment just
constructed, examining assessment conversations through the lens of the ESRU model has
allowed us to capture differences between teachers' informal formative assessment practices.
Based on the evidence collected, we conclude that Danielle conducted more assessment
conversations than Rob and Alex, and that the characteristics of her assessment conversations were
more consistent with our conception of informal assessment practices in the context of
scientific inquiry. We also conclude that Alex's informal formative assessment strategies
were somewhat more aligned with our model than those of Rob. In the next section we provide
evidence on students' performance and link that performance with teachers' informal assessment
practices.
Table 8
Transcript excerpt from Danielle's class with comparing and contrasting students' responses and promoting discussion among students (cycle codes in brackets; parenthesized codes are implicit)

Danielle [E]: So, that's when we're talking about liquid 2; what about that bottom liquid, does anyone have any theories on that bottom liquid? I know a few tables mentioned it. Amanda?
Student [S]: It's water.
Danielle [(R) U (E)]: Why do you think it's water. She's confident. She didn't even say, well, I think, she's like, it's water. Why do you think it's water. (Teacher compares/contrasts students' responses)
Student [S]: We thought, we knew the bottom couldn't have been water because the bottom, water would float on top of oil because. . .
Danielle [(R) U (E)]: How do you know that? How do you know that? (Teacher promotes discussion/argumentation among students)
Table 9
Transcript excerpt from Alex's class (cycle codes in brackets)

Student [S]: The [inaudible].
Alex [R, E]: When I say it, you guys will remember it. It's called the scale. Right, we have to put a scale, our scale, it tells us what it is we're measuring, right? Actually, it tells, the title tells us what it is we're measuring, but the scale gives us increments to use to measure it with. So, looking on our data, where is my data sheet, looking on our data up here, how far do we have to go, how big of a number do we [inaudible]? Someone who hasn't talked yet today. If you look on here, we want to know the number of bbs; right? We need to be able to fit on our graph all the bbs that were counted off in this data table. What's the biggest number that we have? Someone who hasn't talked yet. Jessica.
Student [S]: Forty.
Alex [R, E]: Forty. Is 40 your biggest one? Over here? Oh, you know what, that is our biggest one. We can use these numbers. We don't have all of them but we can go ahead and use them. So, 40. You want to make sure that we can get 40 on the graph. So, you know what, guys, I take that back, I take that back. I only want to graph this data to the left of that line; does that make sense? Do you guys want to graph these numbers over here?
Students [S]: No.
Alex [R]: The only reason I say that is because I'm looking at my graph, and we're going to have our data here all kind of clumped in one spot, and then these numbers are going to be huge because we [inaudible] centimeters. So, I think this, for this graph, it might look, it might make more sense to [inaudible] this is our first graph, I don't want to confuse you, because its line of best fit I find often confuses people. So I'm going to make this first one a little less confusing. So, I only want to graph this part of the [inaudible]. Okay? So, if you're in group 1, you're going to do that first, group 1 first. But, anyway, you have that [inaudible]. Does that make sense? Did that confuse anybody? No? Okay.
Table 10
Mean scores and standard deviations across the embedded assessments

                                        Rob               Danielle           Alex
Assessment (maximum score)          n   Mean (SD)     n   Mean (SD)      n   Mean (SD)
1. Graphing (14)                    23  6.17 (2.87)   23  8.76 (2.73)    26  6.44 (3.09)
2. Predict–Observe–Explain (20)     25  6.32a (2.89a) 25  14.84 (4.49)   22  10.78 (4.34)
3. Predict–Observe                  22  1.27 (0.98)   22  1.54 (1.01)    10  0.40 (0.52)
At the beginning of the school year there was no significant difference in student performance
on the pretest among the students of the three teachers, with teacher as the source of variability
between groups. At the end of the fourth investigation, however, teacher was a good predictor of
student performance. This information leads us to conclude that better informal assessment
practices could be linked to better student performance, at least in the sample we used for this
study.
Conclusions
In this investigation we studied an approach for examining informal assessment practices
based on four components of assessment conversations (the teacher Elicits a question, the Student
responds, the teacher Recognizes the student's response, and then Uses the information collected
to support student learning) and three domains linked to science inquiry (epistemic frameworks,
conceptual structures, and social processes). We have proposed assessment conversations (ESRU
cycles) as a step beyond the IRE/F pattern and have described how they may lead to increases in
student learning. Our findings regarding broken cycles (ESRs) are consistent with those of the
IRE/F studies, in that teachers generally conduct one-sided discussions in which students provide
short answers that are then evaluated or given generic feedback by the teacher (Lemke, 1990).
However, we believe that our ESRU model is a more useful way of distinguishing those
teacher–student interactions that go beyond the generic description of feedback; it is the final
step of using information about students' learning that distinguishes ESRU cycles from IRE/F
sequences. Using implies more than providing evaluation or generic forms of feedback to learners;
rather, it involves helping students move toward learning goals. In the ESRU cycle a teacher can
ask another question that challenges or redirects the student's thinking, promotes the exploration
and contrast of students' ideas, or makes connections between new ideas and familiar ones.
Teachers also can provide students with specific information on actions they may take to reach
learning goals. In this manner, viewing whole-class discussions in an inquiry context through
the model of ESRU cycles allows the reflective and adaptive nature of inquiry teaching to be
highlighted.
We asked two questions in relation to the ESRU cycle: Can the ESRU cycle allow
distinguishing the quality of informal assessment practices across teachers? Can the quality of
teachers' informal assessment practices be linked to students' performance? To respond to
these questions we studied the informal formative assessment practices in different science
classrooms. Using the model of ESRU cycles we tracked three teachers' informal formative
assessment practices over the course of the implementation of four consecutive FAST
investigations. Furthermore, to track informal assessment practices and their impact on student
learning, we collected information on the students' performance using two sources of
information: a multiple-choice pretest and three less conventional types of assessments
administered as embedded assessments: a graphing prompt, a predict–observe–explain prompt,
and a predict–observe prompt.
We have provided evidence that the approach developed could capture differences in
assessment practices across the three teachers studied. Based on our small sample size, it seems
that the teacher whose whole-class conversations were more consistent with the ESRU cycle had
students with higher performance on embedded assessments. We have also provided evidence
about the technical qualities of the approach used to capture ESRU cycles. Intercoder reliability
was 0.81 for cycles involved in the Epistemic and Conceptual dimensions. A piece of
information related to the validity of the coding system was also provided by showing that the
approach could capture differences in informal assessment practices. We found that the model
provided important information about the teachers' informal assessment practices.
Our findings indicate that, in the context of scientific inquiry, teachers seem to focus more on
procedures than on the process of knowledge generation, and thus students may not be
getting a full experience with scientific inquiry. Based on the information collected, we believe
that Duschl's (2003) epistemic dimension seems to include both the procedures necessary to
generate scientific evidence and the reasoning processes involved with generation of scientific
knowledge. Classifying these two elements of the scientific process together masks our finding
that teachers focused on the procedures involved in scientific inquiry rather than the process of
developing scientific explanation. Making this distinction can provide clearer evidence about how
students are provided with opportunities to determine what they know, how they know what they
know, and why they believe what they know (Duschl, 2003).
Our findings also suggest that instructional responsiveness is a crucial aspect of scientific
inquiry teaching. That is, teachers need to continuously adapt instruction to students' present level
of understanding rather than pursue a more teacher-directed instructional agenda. In the
absence of the crucial step of using information about student learning, teachers may fall into the
IRE/F sequence, following an instructional agenda that is unresponsive to students' evolving
understanding. Being responsive is a challenging task, as it involves both planning and
flexibility, or what Sawyer (2004) termed disciplined improvisation.
In drawing conclusions from the study, it is important to note differences in the level of
training, experience, and other contextual factors among these three teachers. Although Rob and Alex
were trained in science and would be expected to have a deeper understanding of the content area,
their practices were less consistent with the ESRU model than Danielle's. This result suggests
that content knowledge alone may not be a sufficient precursor to conducting informal formative
assessment. Furthermore, Rob had more years of teaching experience than Danielle and Alex;
however, the near absence of informal formative assessment practices in his conversations suggests
that he may not have been focusing on collecting information about student learning during class. Finally,
it is important to note that Rob and Alex were teaching seventh grade science in large middle
schools, where they saw more than 100 students each day. In contrast, Danielle taught sixth grade
science and mathematics at an elementary school, working with about 55–60 students each day.
Although the design of this study makes it impossible to disaggregate all the contextual factors
associated with the differences between the elementary and middle school settings, we note
that some of the differences in patterns of interaction may have been associated with the contexts
of each school.
As the science education community continues to struggle to define what it means to be an
instructionally responsive teacher in the context of scientific inquiry, being explicit about
the differences between IRE/F and ESRU questioning could help teachers understand the
difference between asking questions for the purpose of recitation and asking questions for the
purpose of eliciting information about, and improving, student learning. Pre- and in-service
teachers alike may benefit from learning about the ESRU cycle as a way of thinking about
classroom discussions as assessment conversations: opportunities to understand students'
understanding and to move students toward learning goals in an informal manner.
Future research should extend the present study by applying the ESRU model of
informal formative assessment to more teachers across more lessons, to see whether the trend observed in
this small sample can be replicated. Furthermore, additional studies should consider students'
responses in more detail than we have here. For a classroom discussion to be truly dialogic,
teachers should ask open questions that leave room for student responses longer than a word
or two and that contribute real substance to classroom conversations
(Scott, 1998).
As mentioned at the outset of this study, assessment, instruction, and curriculum need to be
aligned in order for educational reform to be successful. This alignment is, like other
reforms, predicated upon the teacher, who is the instrumental figure in any classroom. Providing
teachers with tools that support integrating assessment into the course of everyday
instruction to meet the goals of a curriculum will help to realize these reforms. Scientific
inquiry learning, which has been identified as an essential element of students' experience in
science education (National Research Council, 2001), may thus be better achieved by speaking to
teachers about models of informal formative assessment, so that their instruction may be
continuously adapted to meet students' learning goals.
The findings and opinions expressed in this report do not reflect the positions or
policies of the National Institute on Student Achievement, Curriculum, and Assessment,
the Office of Educational Research and Improvement, or the U.S. Department of
Education.
Notes
1. We acknowledge that this is not always the case. Some laboratory studies (Kruger & Dunning, 1999;
Markman, 1979) have indicated that less competent students do not have the necessary metacognitive skills
and/or processing capacity to be aware of the quality of their performance or to know what to do to
improve it. On the other hand, multiple classroom studies have shown the positive impact, on average, of
appropriate feedback on student performance, especially for low-competence students (see Black & Wiliam,
1998). We rely on these latter results because they were obtained during classroom instruction and are therefore
of greater relevance to the present study.
2. The focus of this study is confined to the teachers' actions during the process of formative assessment.
The reason behind this decision is practical more than conceptual: because information was collected on
videotape, with only the teachers using microphones, students' comments and questions were not always
captured.
3. Information can be considered feedback only if it can lead to an improvement of student competence
(Ramaprasad, 1983; Sadler, 1989). If the information is simply recorded, passed to a third party, or
too coded (e.g., a grade or a phrase such as "good!") to lead to an appropriate action, the main purpose of
the feedback is lost, and the feedback can even be counterproductive (Sadler, 1989, 1998). Effective feedback should lead
students to be able to judge the quality of what they are producing and to monitor themselves during
the act of production (Sadler, 1989, 1998).
4. Although describing the final step of ESRU as feedback would clearly be more appropriate in the
context of informal formative assessment, we employ the term "use" rather than "feedback" to make a
clear distinction between the IRF pattern and the ESRU pattern.
5. One of Danielle's transcripts was lost due to low battery charge.
6. The low intercoder reliability coefficient was due mainly to the small sample of transcripts
considered: if coders identified one or two ESRUs in a different category, either within or between types,
the coefficient could change easily. We acknowledge that the sample size for evaluating consistency across
coders is small.
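As a purely illustrative computation with hypothetical numbers: if two coders classify $n$ cycles and disagree on $d$ of them, a simple agreement index is

$$p = \frac{n - d}{n},$$

so a single disagreement ($d = 1$) shifts the coefficient by $1/20 = 0.05$ when $n = 20$, but by $1/10 = 0.10$ when $n = 10$; this is why one or two differently categorized ESRUs can noticeably move the coefficient in a small sample.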
7. We did not include in our analysis the information on open-ended questions (a single open-ended
question that asked students to explain why things sink and float, with supporting examples and evidence)
because the information gathered was similar to that from the interpretation of the graph. This information
will be analyzed at a later point.
Appendix

Teacher moves:
- Takes votes.
- Compares/contrasts students' responses/explanations.
- Drives students to achieve consensus.
- Relates evidence and explanations (WTSF).
- Repeats/paraphrases students' words.
- Defines; clarifies/elaborates.
- Provides/reviews criteria.
- Captures/displays student responses/explanations.
- Asks students questions of a factual nature, but not in a yes/no or fill-in-the-blank manner.
- Incorporates students' words into the flow of a classroom conversation by using selected students' words in an ongoing train of thought. This involves the teacher modifying or building upon the students' original meaning to further a point, or tossing a comment back to a student with new wording to see if the teacher has grasped the student's original meaning.
- Reads or displays responses from homework or some other format that were not initially gathered as observations.

Teacher asks students to:
- Provide observations.
- Provide predictions/hypotheses.
- Provide explanations.
- Provide evidence and examples.
- Interpret data/results and graphs.
- Relate evidence to explanations.
- Evaluate the quality of evidence.
- Compare/contrast ideas.
- Suggest a hypothetical procedure/experimental plan.
- Define concept(s).
- Use, apply, and compare concept(s).

Students:
- Share their observations, either oral or written.
- Share their predictions and hypotheses, either oral or written.
- Share their explanations, either oral or written.
- Provide supporting evidence and examples collected in a scientific manner.
- Provide experiences they have had outside the context of the present science classroom that relate to the content of the current discussion.
References
Bell, B., & Cowie, B. (2001). Formative assessment and science education. Dordrecht:
Kluwer.
Black, P. (1993). Formative and summative assessment by teachers. Studies in Science
Education, 21, 49–97.
Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in
Education, 5, 7–74.
Brophy, J.E., & Good, T.L. (1986). Teacher behavior and student achievement. In M.C.
Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 328–375). New York:
Macmillan.
Cazden, C.B. (2001). Classroom discourse: The language of teaching and learning (2nd ed.).
Portsmouth, NH: Heinemann.
Christie, F. (2002). Classroom discourse analysis: A functional perspective. London:
Continuum.
Clarke, S. (2000, April). Closing the gap through feedback in formative assessment: Effective
distance marking in elementary schools in England. Paper presented at the annual meeting of the
American Educational Research Association, New Orleans, LA.
Curriculum Research and Development Group. (2003). Foundational approaches in science
teaching I. Retrieved June 1, 2004, from www.hawaii.edu/crdg/FAST.pdf
Driver, R., Leach, J., Millar, R., & Scott, P. (1996). Young people's images of science.
Buckingham, UK: Open University Press.
Duschl, R.A. (2000). Making the nature of science explicit. In R. Millar, J. Leach, &
J. Osborne (Eds.), Improving science education: The contribution of research. Buckingham, UK:
Open University Press.
Duschl, R.A. (2003). Assessment of inquiry. In J.M. Atkin & J.E. Coffey (Eds.), Everyday
assessment in the science classroom (pp. 41–59). Washington, DC: National Science Teachers
Association Press.
Duschl, R.A., & Gitomer, D.H. (1997). Strategies and challenges to changing the focus of
assessment and instruction in science classrooms. Educational Assessment, 4, 37–73.
Duschl, R.A., & Grandy, R.E. (2005, February). Reconsidering the character and role of
inquiry in school science: Framing the debates. Paper presented at the Inquiry Conference on
Developing a Consensus Research Agenda, Piscataway, NJ.
Edwards, D., & Mercer, N. (1987). Common knowledge: The development of understanding
in the classroom. London: Routledge.
Edwards, A.D., & Westgate, D.P.G. (1994). Investigating classroom talk (2nd ed.). London:
Falmer Press.
Kelly, G.J., & Crawford, T. (1997). An ethnographic investigation of the discourse processes
of school science. Science Education, 81, 533–559.
Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in
recognizing one's own incompetence lead to inflated self-assessments. Journal of Personality and
Social Psychology, 77, 1121–1134.
Lemke, J.L. (1990). Talking science: Language, learning and values. Norwood, NJ: Ablex.
Markman, E.M. (1979). Realizing that you don't understand: Elementary school children's
awareness of inconsistencies. Child Development, 50, 643–655.
Marks, G., & Mousley, J. (1990). Mathematics education and genre: Dare we make the
process writing mistake again? Language and Education, 4, 117–135.
Martin, J.R. (1989). Factual writing: Exploring and challenging social reality. Oxford, UK:
Oxford University Press.
Martin, J.R. (1993). Literacy in science: Learning to handle text as technology. In M.A.K.
Halliday & J.R. Martin (Eds.), Writing science: Literacy and discursive power (pp. 166–202).
Pittsburgh, PA: University of Pittsburgh Press.
National Research Council. (1999). The assessment of science meets the science of
assessment. Board on Testing and Assessment, Commission on Behavioral and Social Sciences
and Education. Washington, DC: National Academy Press.
National Research Council. (2001). Inquiry and the national science education standards.
Washington, DC: National Academy Press.
Nystrand, M., & Gamoran, A. (1991). Instructional discourse, student engagement, and
literature achievement. Research in the Teaching of English, 25, 261–290.
O'Connor, M.C., & Michaels, S. (1993). Aligning academic task and participation status
through revoicing: Analysis of a classroom discourse strategy. Anthropology and Education
Quarterly, 24, 318–335.
Pottenger, F., & Young, D. (1992). FAST 1: The local environment. Manoa, HI: Curriculum
Research and Development Group, University of Hawaii.
Ramaprasad, A. (1983). On the definition of feedback. Behavioral Science, 28, 4–13.
Redfield, D.L., & Rousseau, E.W. (1981). A meta-analysis of experimental research on
teacher questioning behavior. Review of Educational Research, 51, 237–245.
Resnick, L.B., Salmon, M., Zeitz, C.M., Wathen, S.H., & Holowchak, M. (1993). Reasoning
in conversation. Cognition and Instruction, 11, 347–364.
Rogg, S., & Kahle, J.B. (1997). Middle level standards-based inventory. Oxford, OH:
Miami University.
Roth, W.-M. (1996). Teacher questioning in an open-inquiry learning environment:
Interactions of context, content, and student response. Journal of Research in Science Teaching,
33, 709–736.
Sadler, D.R. (1989). Formative assessment and the design of instructional systems.
Instructional Science, 18, 119–144.
Sadler, D.R. (1998). Formative assessment: Revisiting the territory. Assessment in Education,
5, 77–84.
Sawyer, R.K. (2004). Creative teaching: Collaborative discussion as disciplined improvisation. Educational Researcher, 33, 12–20.
Schwab, J.J. (1962). The concept of the structure of a discipline. The Educational Record, 43,
197–205.
Scott, P. (1998). Teacher talk and meaning making in science classrooms: A Vygotskyian
analysis and review. Studies in Science Education, 32, 45–80.
Shavelson, R.J., Black, P., Wiliam, D., & Coffey, J. On aligning summative and
formative functions in the design of large-scale assessment systems. Paper submitted for
publication.
Shavelson, R.J., & Young, D. (2000). Embedding assessments in the FAST curriculum: On
beginning the romance among curriculum, teaching, and assessment. Proposal submitted to the
Elementary, Secondary and Informal Education Division of the National Science
Foundation.
Shepard, L. (2003). Reconsidering large-scale assessment to heighten its relevance to
learning. In J.M. Atkin & J.E. Coffey (Eds.), Everyday assessment in the science classroom
(pp. 121–146). Washington, DC: National Science Teachers Association Press.
van Zee, E.H., Iwasyk, M., Kurose, A., Simpson, D., & Wild, J. (2001). Student and teacher
questioning during conversations about science. Journal of Research in Science Teaching, 38,
159–190.
Yin, Y. (2004). Formative assessment influence on students' science learning and motivation.
Proposal to the American Educational Research Association.
Zellermayer, M. (1989). The study of teachers' written feedback to students' writing:
Changes in theoretical considerations and the expansion of research contexts. Instructional
Science, 18, 145–165.