Student Evaluation of Teaching Effectiveness: An Assessment of Student Perception and Motivation
YINING CHEN & LEON B. HOSHOWER, Ohio University, Athens, Ohio, USA
ABSTRACT Over the past century, student ratings have steadily continued to take
precedence in faculty evaluation systems in North America and Australia, are increas-
ingly reported in Asia and Europe and are attracting considerable attention in the Far
East. Since student ratings are the most, if not the only, influential measure of teaching
effectiveness, active participation by and meaningful input from students can be critical
in the success of such teaching evaluation systems. Nevertheless, very few studies have
looked into students’ perception of the teaching evaluation system and their motivation
to participate. This study employs expectancy theory to evaluate some key factors that
motivate students to participate in the teaching evaluation process. The results show that
students generally consider an improvement in teaching to be the most attractive
outcome of a teaching evaluation system. The second most attractive outcome was using
teaching evaluations to improve course content and format. Using teaching evaluations
for a professor’s tenure, promotion and salary rise decisions and making the results of
evaluations available for students’ decisions on course and instructor selection were less
important from the students’ standpoint. Students’ motivation to participate in teaching
evaluations is also impacted significantly by their expectation that they will be able to
provide meaningful feedback. Since quality student input is an essential antecedent of
meaningful student evaluations of teaching effectiveness, the results of this study should
be considered thoughtfully as the evaluation system is designed, implemented and
operated.
Introduction
Student evaluations have become routine at most colleges and universities. Evidence
from many studies indicates that most universities and colleges throughout the world use
student ratings of instruction as part of their evaluation of teaching effectiveness (Seldin,
1985; Abrami, 1989; Wagenaar, 1995; Ahmadi et al., 2001; Hobson & Talbot, 2001).
With the surge in public demand for accountability in higher education and the
great concern for quality of university teaching, the practice of collecting
student ratings of teaching has been widely adopted by universities all over the
world as part of the quality assurance system. (Kwan, 1999, p. 181)
Student evaluations of teaching effectiveness are commonly used to provide: (1)
formative feedback to faculty for improving teaching, course content and structure; (2)
a summary measure of teaching effectiveness for promotion and tenure decisions; (3)
information to students for the selection of courses and teachers (Marsh & Roche, 1993).
Research on student evaluations of teaching effectiveness often examines issues like the
development and validity of an evaluation instrument (Marsh, 1987), the validity
(Cohen, 1981) and reliability (Feldman, 1977) of student ratings in measuring teaching
effectiveness and the potential bias of student ratings (Hofman & Kremer, 1980; Abrami
& Mizener, 1983; Tollefson et al., 1989). Very few studies, however, have examined
students’ perceptions of teaching evaluations and their motivation to participate in the
evaluation. Since students’ input is the root and source of student evaluation data,
meaningful and active participation of students is essential. The usefulness of student
evaluation data is severely undermined unless students willingly provide quality input.
Expectancy theory has been recognised as one of the most promising conceptualisa-
tions of individual motivation (Ferris, 1977). Many researchers have proposed that
expectancy theory can provide an appropriate theoretical framework for research that
examines a user’s acceptance of and intent to use a system (DeSanctis, 1983). However,
empirical research employing expectancy theory within an educational context has been
limited. This study uses expectancy theory as part of a student-based experiment to
examine students’ acceptance of and motivation to participate in a teaching evaluation
system. The following section provides a review of prior research on teaching evaluation
and a discussion of expectancy theory.
The first objective of this study was to assess the relative attractiveness to students of
four uses of teaching evaluations, two formative and two summative, which the literature
identifies as the primary uses of teaching evaluations, as discussed earlier. The two
formative uses are: (1) improving the professor's teaching; and (2) improving the
course's content and format. The two summative uses are: (3) influencing the professor's
tenure, promotion and salary rises; and (4) making these results available for students to
use in the selection of courses and teachers [1]. The second objective of this study was
to examine whether an inappropriately designed teaching evaluation that, in the
perception of students, hinders students
from providing valid or meaningful feedback will affect their motivation to participate
in the evaluation. The third objective is to discover whether these results are consistent
across class rank or whether freshmen and seniors have different preferences in the use
of student-generated teaching evaluations and consequently whether they have different
motivations to participate. Through a better understanding of students’ needs and
behavioural intentions, the results of this study can aid in creating evaluation systems
that truly respond to the needs of those who evaluate teaching performance.
Expectancy Theory
The theory of reasoned action, as proposed by Ajzen and Fishbein (1980), is a well-
researched model that has successfully predicted behaviour in a variety of contexts. They
propose that attitudes and other variables (i.e. an individual’s normative beliefs) do not
directly influence actual behaviour (e.g. participation) but are fully mediated through
behaviour intentions or the strength of one’s intention to perform a specific behaviour.
This would imply that measurement of behavioural intentions (motivation) to participate
in a system is a strong and more appropriate predictor (than just attitudes) of the success
of the system.
Expectancy theory is considered one of the most promising conceptualisations of
individual motivation. It was originally developed by Vroom (1964) and has served as
a theoretical foundation for a large body of studies in psychology, organisational
behaviour and management accounting (Harrell et al., 1985; Brownell & McInnes, 1986;
Hancock, 1995; Snead & Harrell, 1994; Geiger & Cooper, 1996). Expectancy models are
cognitive explanations of human behaviour that cast a person as an active, thinking,
predicting creature in his/her environment. He or she continuously evaluates the
outcomes of his or her behaviour and subjectively assesses the likelihood that each of
his or her possible actions will lead to various outcomes. The choice of the amount of
effort he or she exerts is based on a systematic analysis of: (1) the values of the rewards
from these outcomes; (2) the likelihood that rewards will result from these outcomes;
and (3) the likelihood of reaching these outcomes through his or her actions and efforts.
According to Vroom, expectancy theory is composed of two related models: the
valence model and the force model. In our application of the theory, the valence model
shows that the overall attractiveness of a teaching evaluation system to a student (Vj) is
the summation of the products of the attractiveness of those outcomes associated with
the system (Vk) and the probability that the system will produce those outcomes (Ijk):
V_j = \sum_{k=1}^{n} (V_k \times I_{jk})

where Vj is the valence, or attractiveness, of a teaching evaluation (outcome j, a first
level outcome), Vk is the valence, or attractiveness, of outcome k (a second level
outcome) and Ijk is the perceived probability (instrumentality) that the teaching
evaluation will lead to outcome k.
In our case the four potential outcomes (i.e. n = 4) are the four uses of teaching
evaluations that are described in the literature. They are: (1) improving the professor’s
teaching; (2) influencing the professor’s tenure, promotion and salary rises; (3) improv-
ing the course’s content and format; (4) making these results available for students to use
in the selection of courses and teachers.
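To make the valence model concrete, the following minimal Python sketch computes Vj for one student. The outcome valences and instrumentalities below are hypothetical values chosen only to illustrate the calculation; they are not data from the study.

    # Valence model: V_j = sum over k of (V_k * I_jk).
    # The four second level outcomes follow the paper; the numeric values
    # are hypothetical, chosen only to illustrate the calculation.

    outcome_valences = {            # V_k, on the -5 to +5 scale
        "improve_teaching": 5,      # V1: improving the professor's teaching
        "tenure_promotion": 2,      # V2: tenure, promotion and salary rises
        "improve_course": 4,        # V3: improving course content and format
        "results_available": 3,     # V4: results available for course selection
    }
    instrumentalities = {           # I_jk, at the 10% and 90% levels of the study
        "improve_teaching": 0.9,
        "tenure_promotion": 0.1,
        "improve_course": 0.9,
        "results_available": 0.1,
    }

    V_j = sum(outcome_valences[k] * instrumentalities[k] for k in outcome_valences)
    print(f"Overall attractiveness of the evaluation: V_j = {V_j:.1f}")  # 8.6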
The force model shows that a student’s motivation to exert effort into a teaching
evaluation system (Fi) is the summation of the products of the attractiveness of the
system (Vj) and the probability that a certain level of effort will result in a successful
contribution to the system (Eij):
F_i = \sum_{j=1}^{n} (E_{ij} \times V_j)
where Fi is the motivational force to participate in a teaching evaluation at some level
i, Eij is the expectancy that a particular level of participation (or effort) will result in a
successful contribution to the evaluation and Vj is the valence, or attractiveness, of the
teaching evaluation, derived in the previous equation of the valence model.
In terms of the decision making process, each student first uses the valence model and
then the force model. In the valence model, each participant in a teaching evaluation
system evaluates the system’s outcomes (e.g. improved teaching, rewarding effective
teaching, improved course content and availability of results for students’ decision
making) and subjectively assesses the likelihood that these outcomes will occur. Next,
by placing his or her own intrinsic values (or weights) on the various outcomes, each
student evaluates the overall attractiveness of the teaching evaluation system. Finally, the
student uses the force model to determine the amount of effort he or she is willing to
exert in the evaluation process. This effort level is determined by the product of the
attractiveness generated by the valence model (above) and the likelihood that his or her
effort will result in a successful contribution to the system. Based on this systematic
analysis, the student will determine how much effort he or she would like to exert in
participating in the evaluation system.
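A corresponding sketch of the force model (Decision B) follows, carrying over the hypothetical Vj value from the valence sketch above; since only a single evaluation is being judged, the summation reduces to one term.

    # Force model: F_i = sum over j of (E_ij * V_j). With one evaluation
    # (one first level outcome) the sum has a single term. Values are
    # hypothetical, continuing the valence sketch above.

    V_j = 8.6                     # attractiveness of the evaluation, from Decision A
    for E_ij in (0.10, 0.90):     # the two expectancy levels used in the cases
        F_i = E_ij * V_j
        print(f"expectancy {E_ij:.0%}: motivational force F_i = {F_i:.2f}")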
Research Method
Subject Selection
This study was conducted at a mid-sized (15 000–20 000 total enrollment), mid-western
university. The freshmen participants were gathered from two sections of Western
Civilization, which were designated as ‘freshmen and sophomores only’, although a few
upper classmen were registered for the class. Seniors were gathered from ‘Tier III’
courses. Tier III courses are the capstone course of the general education requirement.
All students are required to take one Tier III class before graduation. Although each Tier
III course is unique, they have a number of common factors. Tier III courses must
integrate more than one academic discipline, are restricted exclusively to seniors, address
widely diverse topics and have few or no prerequisites. As a general education
requirement, Tier III courses are never directed towards a particular major and are usually
populated by students from diverse academic backgrounds.
The instrument was administered at the beginning of a regularly scheduled class
around the middle of the quarter to all the students who were present on that particular
day. We explained the use of the instrument, read the instruction page to the students and
then asked the students to complete the instrument. The entire process took between 15
and 20 minutes. Students other than freshmen and seniors were eliminated from the
sample as were the instruments with incomplete data [2]. This resulted in 208 usable
instruments completed by 105 freshman and 103 senior students. The demographic
information for students in the two groups is summarised in Table 1. We used the
freshmen versus seniors design to examine whether freshmen and seniors have different
motivations to participate in the teaching evaluation system.
Judgment Exercise
The within-person or individual focus of expectancy theory suggests that appropriate
tests of this theory should involve comparing measurements of the same individual’s
motivation under different circumstances (Harrell et al., 1985; Murray & Frazier, 1986).
In response to this suggestion, this study incorporates a well-established within-person
methodology originally developed by Stahl and Harrell (1981) and later proven to be
valid by other studies in various circumstances (see, for example, Snead & Harrell, 1995;
Geiger & Cooper, 1996). This methodology uses a judgment modelling decision exercise
that provides a set of cues that an individual uses in arriving at a particular judgment or
decision. Multiple sets of these cues are presented, each representing a unique combi-
nation of strengths or values associated with the cues. A separate judgment is required
from the individual for each unique combination of cues presented.
We employed a one-half fractional factorial design using the four second level
outcomes [3]. This resulted in eight different combinations of the second level outcomes
(2^4 × 1/2 = 8 combinations). Each of the resulting eight combinations was then presented
at two levels (10 and 90%) of expectancy to obtain 16 unique cases (8 combinations × 2
levels of expectancy = 16 cases). This furnished each participant with multiple cases
that, in turn, provided multiple measures of each individual’s behavioural intentions
under varied circumstances. This is a prerequisite for the within-person application of
expectancy theory (Snead & Harrell, 1995).
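The construction of the 16 cases can be sketched as follows. The choice of the I = ABCD generator for the one-half fraction is our assumption; the paper cites Montgomery (1984) but does not state which fraction was used.

    from itertools import product

    LOW, HIGH = 0.10, 0.90   # the two levels used for instrumentality/expectancy

    # Full 2^4 factorial over the four second level outcomes, coded -1/+1.
    full = list(product([-1, 1], repeat=4))

    # One-half fraction: keep the 8 runs whose signs multiply to +1. This is
    # the standard I = ABCD construction (Montgomery, 1984); the generator
    # actually used in the study is not stated, so this is an assumption.
    half = [run for run in full if run[0] * run[1] * run[2] * run[3] == 1]

    # Cross the 8 combinations with the two expectancy levels -> 16 cases.
    cases = [{"instrumentalities": [LOW if s < 0 else HIGH for s in run],
              "expectancy": e}
             for run in half for e in (LOW, HIGH)]
    print(len(half), len(cases))   # 8 16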
In each of the 16 cases, the participants were asked to make two decisions. The first
decision, Decision A, corresponded to Vj in the valence model and represented the
overall attractiveness of participating in the evaluation, given the likelihood (10 or 90%)
that the four second level outcomes (Ijk) would result from their participation. (The
instructions and a sample case are provided in the Appendix.) As mentioned earlier, the
four second level outcomes are (1) improving the professor’s teaching, (2) influencing
the professor’s tenure, promotion and salary rise, (3) improving the course content and
format and (4) making the results available to students. The second decision, Decision
B, corresponded to Fi in the force model and reflected the strength of a participant’s
motivation to participate in the evaluation, using (1) the attractiveness of the evaluation
(Vj) obtained from Decision A and (2) the expectancy (Eij, 10 or 90%) that if the
participant exerted a great deal of effort he or she would be successful in providing
meaningful or useful input to the evaluation process. We used an 11 point response scale
with a range of −5 to +5 for Decision A and 0 to 10 for Decision B. For Decision
A, −5 represented 'very unattractive' and +5 represented 'very attractive'; for
Decision B, 0 represented 'zero effort' and 10 represented a 'great deal of effort'.
There is a problem with applying the expectancy theory model to teaching evaluations.
Expectancy theory holds that an individual may devote a great deal of effort towards
achieving an outcome and despite this best effort he or she may not be able to achieve
the desired outcome. It may be that in the student’s perception, the ‘successful’
completion of the evaluation is trivial. All one has to do is fill in the multiple choice grid.
Likewise, the range of effort that may be devoted to completing the evaluation may not
be apparent. Consequently, in the ‘Further Information’ supplied between Decision A
and Decision B, we provided a situation in which the student was told that the
hypothetical course evaluation contained ‘several open-ended essay questions which will
require a great deal of effort for you to complete’. Furthermore, we told the students that
despite their best efforts their feedback might not be helpful to the reader. This added
the necessary uncertainty about the reward of effort, as well as providing a feeling that
the required effort could be considerable. The students were further reminded that their
participation in student evaluations is voluntary and they are free to decide to what extent
they would participate in the evaluation.
Open-ended questions are commonly included in a course evaluation to
allow students to express an unconstrained opinion about some aspect of the class and/or
instructor. Such questions usually provide important diagnostic information and insight
for the formative evaluation about the course and instructor (Calderon et al., 1996).
Though important, open-ended questions are more difficult to summarise and report. Our
instrument explained that the reader could misinterpret the evaluator’s feedback in the
essay. Likewise, the data from multiple choice questions could be difficult to interpret
or meaningless if the questionnaire is designed poorly or the questions are ambiguous or
the evaluation is administered inappropriately. Therefore, despite his or her efforts, the
student may not be successful in contributing meaningfully to the evaluation process.
Experimental Controls
The participants were presented with 16 hypothetical situations. They were expected to
detach themselves from their past experiences and evaluate the hypothetical situations
from a third party perspective. If the respondents were successful in doing this, we would
expect to find no correlation between their actual experiences with student-generated
evaluations or background and their responses. To test this, we calculated Pearson’s
correlations between the R2 value of the valence model and four selected demographic
factors. These factors are gender, grade point average (GPA), impression of professors
and perception about the evaluation system. The coding of gender was 1 for male and
0 for female. The impression of professors and perception about the evaluation system
were measured by two 11 point scale demographic questions. Participating students were
asked ‘In general, how do you describe the professors you have had at this institution?’
and ‘What is your general impression about the course evaluation system?’. We also
calculated the correlation between the R2 value of the force model and the four
demographic factors. We used these correlations to assess whether the subjects were able
to evaluate the 16 hypothetical situations objectively without bias and thus were
appropriate for this study.
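This control analysis could be reproduced along the following lines; the data here are simulated, since the study's raw responses are not available.

    # A sketch of the control correlations with simulated data.
    # scipy.stats.pearsonr returns the correlation and its two-sided P value.
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(0)
    n = 105                                  # freshman group size
    valence_r2 = rng.uniform(0.35, 0.98, n)  # simulated per-subject R2 values
    gender = rng.integers(0, 2, n)           # 1 = male, 0 = female
    gpa = rng.uniform(2.0, 4.0, n)           # simulated GPAs

    for name, factor in [("gender", gender), ("GPA", gpa)]:
        r, p = pearsonr(valence_r2, factor)
        print(f"R2 vs {name}: r = {r:.2f} (P = {p:.2f})")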
Results
Valence Model
Through the use of multiple regression analysis, we sought to determine each student’s
perception of the attractiveness of participating in the evaluation. Decision A (Vj) served
as the dependent variable and the four second level outcome instrumentalities (Ijk) served as the
the independent variables. The resulting standardised regression coefficients represent
the relative importance (attractiveness) of each of the second level outcomes to each
participant in arriving at Decision A. Table 2 presents the mean adjusted R2 of the
regressions and the mean standardised values of each outcome. Detailed regression
results for each participant are not presented but are available from the authors.
As indicated in Table 2, the mean R2 of the individual regression models is 0.69 for
the freshman group and 0.71 for the senior group. The mean R2 represents the percentage
of total variation in responses which is explained by the multiple regression. Thus, these
relatively high mean R2 values indicate that the valence model of expectancy theory
explains students' assessments of the attractiveness of a teaching evaluation system well.
TABLE 2. Valence model regression results (a)

                                                              Frequency of
                                                              significance at
                     n      Mean     SD      Range            0.05 level
Group I: Freshmen
Adjusted R2          105    0.69     0.14     0.35–0.98       104/105
Standardised weight
V1                   105    0.47     0.17     0.03–0.92        86/105
V2                   105    0.26     0.24    −0.69–0.83        50/105
V3                   105    0.42     0.16    −0.08–0.77        80/105
V4                   105    0.39     0.19    −0.43–0.78        72/105
Group II: Seniors
Adjusted R2          103    0.71     0.15     0.20–0.98       101/103
Standardised weight
V1                   103    0.44     0.22    −0.27–0.78        78/103
V2                   103    0.33     0.30    −0.55–0.97        63/103
V3                   103    0.42     0.19    −0.23–0.87        77/103
V4                   103    0.33     0.19    −0.14–0.75        62/103

V1, valence of teaching improvement; V2, valence of tenure and promotion decisions; V3,
valence of course improvement; V4, valence of result availability.
(a) Results (i.e. mean, standard deviation, range and frequency of significance at 0.05) of
individual within-person regression models are reported.
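One within-person regression of this kind can be sketched as follows. The responses are simulated, and standardising both sides yields coefficients comparable to the standardised weights in Table 2; this is our illustration, not the authors' code.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    cues = rng.choice([0.10, 0.90], size=(16, 4))   # I_jk levels in the 16 cases
    true_valences = np.array([5.0, 2.0, 4.0, 3.0])  # hypothetical V_k values
    decision_a = cues @ true_valences + rng.normal(0.0, 1.0, 16)  # simulated V_j

    standardise = lambda x: (x - x.mean(axis=0)) / x.std(axis=0)
    fit = sm.OLS(standardise(decision_a), sm.add_constant(standardise(cues))).fit()
    print(fit.rsquared_adj)    # adjusted R2, as reported in Table 2
    print(fit.params[1:])      # standardised weights, one per outcome (V1-V4)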
Ranked mean standardised weights of the second level outcomes, with P values of
adjacent pairwise comparisons in parentheses:

Group I: Freshmen
V1    0.47
V3    0.42    0.05 (V1 versus V3)
V4    0.39    0.16 (V3 versus V4)
V2    0.26    0.00 (V4 versus V2)
Group II: Seniors
V1    0.44
V3    0.42    0.47 (V1 versus V3)
V2    0.33    0.05 (V3 versus V2)
V4    0.33    0.86 (V2 versus V4)
The comparison between the freshman and senior groups showed a significant difference
for V4 and a close to significant difference for V2. The freshman group considered V4
(the results made available to students) an outcome significantly more attractive than did
the senior group. The senior group, in contrast, considered V2 (tenure and promotion
influence) a close to significantly more attractive outcome. Our interpretation of the
former result is that
freshmen may be seeking more guidance in choosing professors, while seniors may have
an effective word-of-mouth system. Our interpretation of the latter result is that freshmen
are naive about the promotion and tenure system, while seniors are more aware of its
impact on individual professors and upon the composition of the faculty.
Table 3C presents the comparison between male and female students' perceptions of the
four second level outcomes. The results indicate no significant difference in the weights
they placed on the second level outcomes except for V4: male students considered the
results being made available a more attractive outcome of course evaluations than did
female students.

TABLE 4. Force model regression results (a)

                                                              Frequency of
                                                              significance at
                     n      Mean     SD      Range            0.05 level
Group I: Freshmen
Adjusted R2          105    0.75     0.20     0.10–1.00       104/105
Standardised weight
W1                   105    0.50     0.35    −0.19–1.00        73/105
W2                   105    0.53     0.37    −0.23–1.00        75/105
Group II: Seniors
Adjusted R2          103    0.78     0.18     0.13–1.00       101/103
Standardised weight
W1                   103    0.53     0.31    −0.05–0.99        79/103
W2                   103    0.56     0.33    −0.21–1.00        81/103

W1, weight placed on attractiveness of the evaluation; W2, weight placed on the expectancy of
successfully participating in the evaluation.
(a) Results (i.e. mean, standard deviation, range and frequency of significance at 0.05) of individual
within-person regression models are reported.
Force Model
We then used multiple regression analysis to examine the force model (Decision B) in
the experiment. The dependent variable is the individual’s level of effort to participate
in the evaluation (Fi). The two independent variables are (1) each student’s perception
about the attractiveness of the system (Vj) from Decision A and (2) the expectancy
information (Eij = 10 or 90%), which is provided by the 'Further Information' of the test
instrument (see Appendix). The force model results are summarised in Table 4.
The mean R2 values (0.75 and 0.78) indicate that the force model adequately explains the
students' motivation to participate in the evaluation system. The mean standardised
regression coefficient W1 indicates the impact of the overall attractiveness of the
evaluation (Vj) while W2 indicates the impact of the expectation that a certain level of
effort leads to successful participation in the evaluation. Our results found no significant
difference between the mean standardised values for W1 and W2 for either group of
students. The P values of these t-tests were 0.60 and 0.54 for the freshman and senior
groups, respectively. (P values are not shown in Table 4.) These results imply that both
factors, the attractiveness of the evaluation system (W1) and the likelihood that the
student’s efforts will lead to success (W2), are of similar importance to the student’s
motivation to participate in the evaluation.
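Reading these as paired t-tests across participants (the paper does not state the exact test form used), the W1 versus W2 comparison can be sketched as follows with simulated weights.

    import numpy as np
    from scipy.stats import ttest_rel

    rng = np.random.default_rng(2)
    w1 = rng.normal(0.50, 0.35, 105)   # weights on attractiveness (V_j)
    w2 = rng.normal(0.53, 0.37, 105)   # weights on expectancy (E_ij)

    t, p = ttest_rel(w1, w2)           # paired comparison across participants
    print(f"t = {t:.2f}, P = {p:.2f}")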
Experimental Controls
Table 5 presents Pearson’s correlations between the R2 values of the valence and force
models and four demographic factors: gender, GPA, impression of professors and
perception about the evaluation system [4]. This creates eight correlations for each group
or a total of 16 correlations. These correlations are shown in the two right hand columns
of Table 5. None of these 16 correlations is significant, suggesting that neither the
TABLE 5. Pearson's correlations between demographic factors and model R2 values (P values in parentheses)

                              GPA            Impression of   Impression of    Valence         Force
                                             professors      evaluation       model R2        model R2
Group I: Freshmen
Gender                        0.03 (0.75)    0.03 (0.73)    −0.18 (0.07)      0.09 (0.37)    −0.00 (0.98)
GPA                                          0.10 (0.37)     0.10 (0.35)      0.18 (0.09)     0.17 (0.11)
Impression of professors                                     0.20 (0.04)     −0.03 (0.73)     0.01 (0.96)
Impression of evaluation                                                      0.08 (0.42)     0.13 (0.08)
Group II: Seniors
Gender                       −0.28 (0.00)   −0.14 (0.16)    −0.10 (0.31)     −0.15 (0.12)    −0.03 (0.76)
GPA                                          0.20 (0.04)    −0.02 (0.86)      0.15 (0.13)     0.04 (0.72)
Impression of professors                                     0.41 (0.00)      0.10 (0.32)     0.03 (0.79)
Impression of evaluation                                                      0.07 (0.47)     0.03 (0.78)
students’ perception of the attractiveness of the evaluation system nor their motivation
to participate is correlated with their background or with their prior experience with
evaluation systems. These results also support our argument that the subjects we used
were appropriate for this study because neither their background nor their prior
experience with professors and teaching evaluations affected their perceptions of the
evaluation systems tested in the questionnaire [5].
To examine if an order effect is present in our experimental design, we administered
two versions of the instrument; each had the order of the 16 hypothetical situations
determined at random. We then ran a regression using the average R2 values from the two
random order versions as the dependent variable and the version order as a dummy
independent variable, and found no association between the two. This result suggests that
there was no order effect in our experimental design.
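One plausible reading of this check, sketched here with simulated data, regresses the per-subject R2 values on a dummy variable for the instrument version received.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    r2 = rng.uniform(0.35, 0.98, 208)   # simulated per-subject R2 values
    version = np.repeat([0, 1], 104)    # dummy: which of the two orderings

    fit = sm.OLS(r2, sm.add_constant(version.astype(float))).fit()
    print(fit.pvalues[1])               # large P value -> no order effect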
An alternative speculation, which competes with our initial speculation, is that this gender–GPA
relationship is due to a self-selection of males into majors that are traditionally stingier
with grades, such as engineering. GPA differences due to majors for freshmen are
minimal since most freshmen are taking similar general education requirements.
Our final interesting finding is drawn from Table 1. The data show that the freshman
group has a significantly higher regard for professors and student-generated teaching
evaluation systems than the senior group, with t-test P values of 0.01 and 0.00,
respectively (P values not shown in Table 1). This result is the opposite of our a priori
belief. We expected that seniors would be engaged in courses required by their major,
which presumably would be more consistent with their educational interests. Typically,
these senior level courses would have a smaller class size and would be staffed by
professors rather than graduate assistants. All of these factors are correlated with higher
evaluations of the professor. So, if these correlations are real, rather than spurious, it
is likely that an important change, which is probably worth investigating, has taken
place.
Concluding Remarks
Limitations and Future Research
Some limitations of this study need to be discussed. First, the selection of subjects was
not random. Students became subjects by virtue of being present the day their class was
surveyed. The selection of classes was arbitrary. Consequently, caution should be used
in generalising the results to other groups and settings. Second, an experimental task was
used in this study and the subjects’ responses were gathered in a controlled environment
rather than in a real world setting, although sitting in a classroom completing a teaching
evaluation and sitting in a classroom completing an instrument about teaching evalua-
tions are similar activities. Third, students were not given the opportunity for input on
the outcomes that motivate them to participate in a teaching evaluation. In the
instrument, four possible outcomes are given to the students. It is possible that other
possible outcomes of teaching evaluations may have a stronger impact on students’
motivation than the four outcomes used in this study. Future research can solicit input
from college students on what specifically they see or would like to see as the outcomes
of an evaluation system. Fourth, extreme levels of instrumentality and expectancy (10
and 90%) were used in the cases. This did not allow us to test for the full range within
the extremes. In another sense, such extremes may not exist in actual practice. Fifth, all
subjects came from only one institution, which may limit the applicability of the results
to other academic environments. Extensions can be made by future studies to examine
the effect of academic environments on the results of this study.
Implications
The expectancy model used in this study provides a good overall explanation of a
student’s motivation to participate in the evaluation of teaching effectiveness. The
valence model significantly explains a student’s assessment of the attractiveness of a
teaching evaluation system. Further, the force model provides a good explanation of a
student’s motivation to participate in the teaching evaluation. By the successful appli-
cation of expectancy theory, this study provides a better understanding of the behavioural
intent (motivation) of students’ participation in the teaching evaluation process.
Our empirical results show that students have strong preferences for the uses of
teaching evaluations and these preferences are remarkably consistent across individuals.
Since quality student participation is an essential antecedent of the success of student
evaluations of teaching effectiveness, this knowledge of student motivation must be
considered thoughtfully when the system is implemented. If, however, students are kept
ignorant of the use of teaching evaluations or if teaching evaluations are used for
purposes that students do not value or if they see no visible results from their
participatory efforts, they will cease to give meaningful input.
Suggestions
Towards the goal of motivating students to participate in the teaching evaluation process,
we make the following practical suggestions. First, consider listing prominently the uses
of the teaching evaluation on the evaluation instrument. This will inform the students of
the uses of the evaluation. If these uses are consistent with the uses that students prefer
(and they believe that the evaluations will truly be used for these purposes), the students
will assign a high valence to the evaluation system. The next step is to show students
that their feedback really is used. Accomplishing this will increase their subjective
probabilities of the secondary outcomes that are stated on the evaluation. It would also
increase their subjective probabilities that they will be successful in providing meaning-
ful feedback, since they will see that previous feedback has been used successfully.
Thus, their force or motivation to participate will be high. One way of showing students
that their feedback has been used successfully is to require every instructor to cite on the
course syllabus one recent example of how student evaluations have helped improve this
particular course or have helped the instructor to improve his or her teaching. This seems
like a low cost, but highly visible way to show students the benefits of teaching
evaluations. (It may also have the salutary effect of encouraging faculty to ponder the
information contained in student evaluations and to act upon it.)
This research shows that, for both seniors and freshmen, the most attractive outcome of
an evaluation system is improving the professor's teaching, while improving the course
is the second most attractive outcome. Thus, students who believe that their feedback
on evaluations will improve teaching or the course or both should be highly motivated
to provide such feedback. Through better understanding of students’ needs and be-
havioural intentions, the results of this study can aid in creating evaluation systems that
truly respond to the needs of those who evaluate teaching performance.
Notes on Contributors
YINING CHEN PhD is Associate Professor, School of Accountancy at Ohio University.
Her current teaching and research interests are in accounting information systems,
financial accounting and auditing. Professor Chen earned her doctorate from the
College of Business Administration, University of South Carolina. Before joining
Ohio University, she was Assistant Professor of Accounting at Concordia University
in Canada for 2 years. She has also held instructional positions at the University of
South Carolina. Professor Chen has authored articles in Auditing: A Journal of
Practice & Theory, Journal of Management Information Systems, Issues in Accounting
Education, Review of Quantitative Finance & Accounting, Journal of End User
Computing, Journal of Computer Information Systems and Internal Auditing. Correspondence:
634 Copeland Hall, Ohio University, Athens, OH 45701, USA. Tel: +1
740 593 4841. Fax: +1 740 593 9342. E-mail: cheny@ohiou.edu
NOTES
[1] The two formative uses of teaching evaluations reflect teaching effectiveness issues.
[2] Five and 14 questionnaires were turned in incomplete or blank for the freshman and senior groups,
respectively.
[3] According to Montgomery (1984, p. 325), ‘If the experimenter can reasonably assume that certain
high-order interactions are negligible, then information on main effects and low-order interactions
may be obtained by running only a fraction of the complete factorial experiment'. A one-half
fraction of the 2^4 design can be found in Montgomery (1984, pp. 331–334). Prior expectancy theory
studies (see, for example, Burton et al., 1992; Snead & Harrell, 1995) also used one-half fractional
factorial design.
[4] Though gender is a dichotomous variable with 1 = male and 0 = female, the Pearson correlation
would still provide a basis for directional inferences.
[5] It is reasonable to expect an association between someone’s prior experience with an evaluation
system and his or her motivation to participate in that particular system. However, the participants
were asked to evaluate the 16 proposed cases (evaluation systems), none of which they had
experienced. Therefore, the non-significant correlations indicate that the subjects were able to
evaluate the proposed systems objectively without bias, thus supporting our argument that the
subjects we used were appropriate for this study.
REFERENCES
ABRAMI, P. C. (1989) How should we use student ratings to evaluate teaching?, Research in Higher
Education, 30 (2), pp. 221–227.
ABRAMI, P. C. & MIZENER, D. A. (1983) Does the attitude similarity of college professors and their
students produce “bias” in the course evaluations?, American Educational Research Journal, 20 (1),
pp. 123–136.
AHMADI, M., HELMS, M. M. & RAISZADEH, F. (2001) Business students' perceptions of faculty
evaluations, The International Journal of Educational Management, 15 (1), pp. 12–22.
AJZEN, I. & FISHBEIN, M. (1980) Understanding Attitudes and Predicting Social Behavior (Englewood
Cliffs, NJ, Prentice Hall).
ARUBAYI, E. A. (1987) Improvement of instruction and teacher effectiveness: are student ratings reliable
and valid?, Higher Education, 16 (3), pp. 267–278.
BROWNELL, P. & MCINNES, M. (1986) Budgetary participation, motivation, and managerial performance,
Accounting Review, 61 (4), pp. 587–600.
BURTON, F. G., CHEN, Y., GROVER, V. & STEWART, K. A. (1992) An application of expectancy theory
for assessing user motivation to utilize an expert system, Journal of Management Information Systems,
9 (3), pp. 183–198.
BYRNE, C. J. (1992) Validity studies of teacher rating instruments: design and interpretation, Research
in Education, 48 (November), pp. 42–54.
CALDERON, T. G., GREEN, B. P. & REIDER, B. P. (1994) Extent of use of multiple information sources
in assessing accounting faculty teaching performance, in: American Accounting Association Ohio
Regional Meeting Proceeding (Columbus, OH, American Accounting Association).
CALDERON, T. G., GABBIN, A. L. & GREEN, B. P. (1996) A Framework for Encouraging Effective
Teaching (Harrisonburg, VA, American Accounting Association Center for Research in Account-
ing Education, James Madison University).
CASHIN, W. E. (1983) Concerns about using student ratings in community colleges, in: A. SMITH (Ed.)
Evaluating Faculty and Staff: new directions for community colleges (San Francisco, CA, Jossey-
Bass).
CASHIN, W. E. & DOWNEY, R. G. (1992) Using global student rating items for summative evaluation,
Journal of Educational Psychology, 84 (4), pp. 563–572.
CENTRA, J. A. (1993) Reflective Faculty Evaluation (San Francisco, CA, Jossey-Bass).
CENTRA, J. A. (1994) The use of the teaching portfolio and student evaluations for summative evaluation,
Journal of Higher Education, 65 (5), pp. 555–570.
COHEN, P. A. (1981) Student ratings of instruction and student achievement: a meta-analysis of
multisection validity studies, Review of Educational Research, 51 (3), pp. 281–309.
DESANCTIS, G. (1983) Expectancy theory as an explanation of voluntary use of a decision support
system, Psychological Reports, 52 (1), pp. 247–260.
DIVOKY, J. J. & ROTHERMEL, M. A. (1989) Improving teaching using systematic differences in student
course ratings, Journal of Education for Business, 65 (2), pp. 116–119.
DOUGLAS, P. D. & CARROLL, S. R. (1987) Faculty evaluations: are college students influenced by
differential purposes?, College Student Journal, 21 (4), pp. 360–365.
DRISCOLL, L. A. & GOODWIN, W. L. (1979) The effects of varying information about use and disposition
of results on university students’ evaluations of faculty courses, American Educational Research
Journal, 16 (1), pp. 25–37.
FELDMAN, K. A. (1977) Consistency and variability among college students in their ratings among
courses: a review and analysis, Research in Higher Education, 6 (3), pp. 223–274.
FERRIS, K. R. (1977) A test of the expectancy theory as motivation in an accounting environment, The
Accounting Review, 52 (3), pp. 605–614.
GEIGER, M. A. & COOPER, E. A. (1996) Using expectancy theory to assess student motivation, Issues in
Accounting Education, 11 (1), pp. 113–129.
GREEN, B. P., CALDERON, T. G. & REIDER, B. P. (1998) A content analysis of teaching evaluation
instruments used in accounting departments, Issues in Accounting Education, 13 (1), pp. 15–30.
HANCOCK, D. R. (1995) What teachers may do to influence student motivation: an application of
expectancy theory, The Journal of General Education, 44 (3), pp. 171–179.
HARRELL, A. M., CALDWELL, C. & DOTY, E. (1985) Within-person expectancy theory predictions of
accounting students’ motivation to achieve academic success, Accounting Review, 60 (4), pp. 724–735.
HOBSON, S. M. & TALBOT, D. M. (2001) Understanding student evaluations, College Teaching, 49 (1),
pp. 26–31.
HOFMAN, F. E. & KREMER, L. (1980) Attitudes toward higher education and course evaluation, Journal
of Educational Psychology, 72 (5), pp. 610–617.
HOWARD, G. S., CONWAY, C. G. & MAXWELL, S. E. (1985) Construct validity of measures of college
teaching effectiveness, Journal of Educational Psychology, 77 (2), pp. 187–196.
KEMP, B. W. & KUMAR, G. S. (1990) Student evaluations: are we using them correctly?, Journal of
Education for Business, 66 (2), pp. 106–111.
KWAN, K. P. (1999) How fair are student ratings in assessing the teaching performance of university
teachers?, Assessment & Evaluation in Higher Education, 24 (2), pp. 181–195.
LIN, Y. G., MCKEACHIE, W. J. & TUCKER, D. G. (1984) The use of student ratings in promotion decisions,
Journal of Higher Education, 55 (5), pp. 583–589.
MARLIN, J. E., JR & GAYNOR, P. (1989) Do anticipated grades affect student evaluations? A discriminant
analysis approach, College Student Journal, 23 (2), pp. 184–192.
Appendix
Instructions
At the end of each quarter you are asked to evaluate the courses you have taken and the professors who
have conducted those courses. These evaluations may be used in various ways, such as: improving
teaching; rewarding (or punishing) the professors’ performance with tenure, promotion, or salary
increases; improving course content; and providing information to future students who are contemplating
taking this course or this professor.
This exercise presents 16 situations. Each situation is different with respect to how the evaluation is
likely to be used. We want to know how attractive participation in the evaluation is to you in each given
situation.
You are asked to make two decisions. You must first decide how attractive it would be for you to
participate in the evaluation (DECISION A). Next you must decide how much effort to exert in
completing the evaluation (DECISION B). Use the information provided in each situation to reach your
decisions. There are no ‘right’ or ‘wrong’ responses, so express your opinions freely.
Situations 2 to 16 vary in their combinations of the second level outcomes and expectancy levels, e.g. situation
2, low/low/low/low/low; situation 3, low/high/low/high/high; situation 4, low/low/low/low/high; situation
5, low/low/high/high/low; situation 6, high/low/low/high/low; situation 7, high/high/high/high/high;
situation 8, high/low/high/low/high; situation 9, high/low/high/low/low; situation 10, low/high/high/low/high;
situation 11, low/high/high/low/low; situation 12, high/high/high/high/low; situation 13, high/high/low/low/high;
situation 14, low/high/low/high/low; situation 15, high/low/low/high/high; situation 16, low/low/high/high/high.