
LESSON 3

ASSESSMENT PROCESSES
IN THE ENGLISH LANGUAGE

Prof. Miriam Sester Retorta


INTRODUCTION

Evaluation: types, purposes and criteria

Having a well-defined purpose and appropriate criteria in language testing
is crucial for several reasons. Firstly, it promotes fairness and reduces bias by
ensuring that all test-takers are evaluated against the same standards, which
reduces the chance of unjust advantages or disadvantages. Secondly, the purpose
and criteria of a language test must be aligned with the skills, abilities, and
knowledge that the test is designed to assess in order to ensure validity. If the
purpose and criteria are unclear or misaligned, the test may not accurately
evaluate what it intends to, leading to invalid results. Thirdly, the purpose and
criteria must be applied consistently across all test-takers and administrations
to ensure reliability. Fourthly, a clear purpose and clear criteria support
accountability and transparency, making it easier to evaluate the effectiveness
of the test and the results obtained. Lastly, they help ensure that the test is
aligned with the learning objectives of the language program or course, so that
the test evaluates what the students are expected to learn and produces results
that are useful to both students and teachers. Ultimately, a clear purpose and
appropriate criteria are essential for ensuring validity, reliability, fairness,
accountability, and transparency.

THEME 1 – ASSESSMENT, EVALUATION, TESTS AND EXAMS

When we think about evaluation, the first thing that normally comes to our
mind is a test or an exam. But tests and exams are just some of the many ways
a student or candidate may be evaluated. The choice of a method for assessing
someone will depend on the purpose of the evaluation. There are several types
of tests/exams, each of them with a specific purpose and criteria. These
tests/exams can be categorized according to the information they provide and
according to what the examiner intends to do with the results.
Before we talk about the types of tests/exams there are, we would like to
define the difference between assessment, evaluation, tests and exams.

1.1 Assessment and evaluation

According to the East Stroudsburg University (ESU) guide (2023), “an
assessment system is typically defined as the process of collecting, synthesizing,
and interpreting information to aid in educational decision making – and
assessment is an umbrella term for the comprehensive process of measurement
and evaluation”. It is a process for evaluating a student’s or candidate’s abilities
based on empirical data, such as the results of tests and exams. The objective of
an assessment is to compare a student’s or candidate’s skills against
pre-established parameters in order to show his/her strengths and weaknesses.
An assessment can be formative, when it happens throughout the learning process
with immediate feedback, or summative, when we compare what a student shows
he/she knows with the level of knowledge expected by the end of a course or
program. Therefore, assessment should be seen as the systematic collection and
analysis of information for the main purpose of improving student learning and
performance (ESU, 2023).
Evaluation, on the other hand, focuses on making judgements about
students’ or candidates’ performance. It determines the extent to which a subject,
program or course achieves predetermined goals or outcomes. When evaluating
a test, exam, subject or program, you critically analyze its outcomes in order to
measure its successes and deficiencies. The focus is mainly on the product, i.e.
the result.
Therefore, the objective of an evaluation is to make a judgment on the
outcomes of a test or exam using a set of well-defined criteria. Thus, evaluations
are normally high-stakes and have an impact on the examinee.
Greene (2021) summarizes the differences between assessment and
evaluation in the following table.

Difference between assessment and evaluation

Assessment                           Evaluation
Ongoing                              At the end of a course
Improves language learning           Judges language level
Individualised                       Administered against external criteria
Often ungraded                       Commonly graded
Provides feedback for improvement    Shows results (strengths and weaknesses)

Source: Greene, 2021.

Assessment, consequently, focuses on gathering elements for language
improvement during the process of language learning, whereas evaluation aims
at demonstrating the final results, i.e. whether the student has learned or not and
how much he/she has learned.

1.2 Tests and exams

Tests and exams can be considered to be synonymous and can be used
interchangeably. Both are basically forms of assessing skills or knowledge.
However, one can differentiate between these two terms based on the contexts
they are commonly used for.
A language test is a formal instrument of assessment. It can be written or
oral and is generally administered in small groups, in a classroom, for example.
It can be used either to evaluate the language without reference to a particular
program or course or to evaluate the extent to which learners have achieved the
goals of a specific course. According to Ur (1996, p. 33), “a test is an activity
whose main purpose is to convey (usually to the tester) how well the testee knows
or can do something”.
An exam is a more formal kind of test that evaluates the knowledge of
students or candidates. In the educational field, an exam usually means a mid-term
or end-of-year test. In the field of language testing, the term is commonly used
for official proficiency exams such as the TOEFL or the IELTS. Exams are
usually large-scale, since they are administered to a large number of candidates.
They are usually scheduled by the school or institution rather than by the
teacher, and they are taken at certain times of the year or school term.

Difference between tests and exams

Tests                                  Exams
May be informal                        Always formal
Small-scale                            Large-scale
Usually administered by the teacher    Usually administered by an institution
Can be taken at any time               Usually scheduled at certain times of the year
Source: Retorta, 2023.

Although both terms may be used interchangeably, in the field of language
testing, tests tend to be informal and administered by a teacher in order to
evaluate students’ progress, whereas an exam is formal and official, generally
administered by an institution with the purpose of determining a student’s or
candidate’s future educational, vocational or professional life.

THEME 2 – PROFICIENCY, PLACEMENT AND ENTRANCE EXAMS

There are several types of tests with different purposes and criteria,
depending on the desired objective and the type of information they provide.
According to Hughes (2003), this categorization is useful both to verify whether a
test is suitable for a specific purpose and to help conceive new ones. The author
mentions four types of tests: proficiency, placement, achievement and diagnostic.
Shohamy (1985) adds two more types: entrance and mastery exams. Some
theorists (e.g., Sheen, 2007; Sparks et al., 2011; Robinson, 2005; Granena,
2013) advocate the use of aptitude tests.

2.1 Proficiency exam

A language proficiency exam assesses a user's general competence in a
language, regardless of any previous formal instruction. Brown (2001, p. 390)
states that “a proficiency test is not intended to be limited to any one course,
curriculum, or single skill in the language”. For Hughes (2003), the function of
these exams is to show whether the speaker has reached a certain standard in
relation to a series of predetermined skills. He also states that proficiency
exams may be conceptualized in various ways depending on the purposes of the
tests. Let’s give some examples. A candidate might need to show that he/she will
be able to perform tasks which require a B2 level of language proficiency. A
student who wishes to enter a university in an English-speaking country may have
to demonstrate a certain level of proficiency in the language in an academic
environment. Another example is a candidate who wishes to prove that he/she has
reached a certain level of proficiency in relation to a set of specific
competencies, as in countries that require teachers to hold a C1 level to be
eligible for job opportunities.
Some of the well-known proficiency tests include the Test of English as a
Foreign Language (TOEFL) and the Cambridge Main Suite exams, such as the First
Certificate in English (FCE), the Certificate in Advanced English (CAE) and the
Certificate of Proficiency in English (CPE). There are also proficiency tests that
focus on a particular area of English, for example the Test of English for Aviation
(TEA) or the Business English Certificates (BEC).

2.2 Placement exams

A placement exam, exame de nivelamento in Portuguese, is used when the
aim of the examiner is to assess the student's linguistic knowledge in order to
determine the most appropriate level and place him/her in a class that is
suitable for his/her language abilities.
In general, these exams are adapted to meet the specific programs of each
educational institution and, depending on the objectives of the course, they can
emphasize a given skill, although proficiency exams can sometimes act as a
substitute for placement tests. The results of a placement exam can also
serve as a reference for future teaching.

2.3 Entrance exams

Entrance exams are used to assess a candidate's linguistic knowledge in
relation to the language skills that will be used in a future course or program (not
necessarily a language course).
In Brazil, these exams, commonly known as vestibular, are used to assess
the knowledge acquired in primary and secondary education and, thus, determine
whether the candidate is able to continue his/her studies at a higher level. These
assessment instruments cover several subjects, including modern
foreign languages (English, Spanish, French, German and Italian).
More recently, another admission exam has been used to select candidates for
university: the National High School Examination, known as Exame
Nacional do Ensino Médio (Enem).

2.4 Aptitude tests

An aptitude test is used to predict a student's likely linguistic performance,
and its results can be used for future reference. The ability to learn a
language refers to how well an individual will be able to learn a foreign
language in a given time and under certain conditions. In
general, this type of exam evaluates different skills, such as identifying and
memorizing new sounds, understanding the function of words in sentences,
deducing grammatical rules through examples, and memorizing new words.
It was commonly used to ascertain whether a student had the capacity to learn
a language and, therefore, was also used to select employees. However, due to its
structuralist character, this type of exam is no longer administered to assess
competence in a foreign language.

THEME 3 – MASTERY EXAMS, ACHIEVEMENT AND DIAGNOSTIC TESTS

Mastery and diagnostic tests/exams are commonly large-scale and are
administered to larger groups of students to gather information about their
proficiency so that action can be taken. Achievement tests, on the other hand,
are commonly small-scale tests and are used to assess students individually. In
this section, we will discuss each of them.

3.1 Mastery tests

A mastery exam is a domain-referenced test that is used to decide whether
individuals have attained some particular level of performance. Most of the time
the term is used interchangeably with proficiency exam.
In the literature published outside Brazil, there are no references to “exames
de suficiência”. In Brazil, this type of test serves to assess linguistic
knowledge in relation to specific and well-defined objectives in order to select
candidates for master's and doctoral programs. The specific objective is to verify
whether the student is able to read and understand texts in a modern foreign
language (German, Spanish, French, English or Italian) published in scientific
journals. Other public and private universities also use this type of test under
the name of proficiency test. Therefore, in Brazil the term “exame de
suficiência” should be replaced by “exame de proficiência”.

3.2 Achievement tests

An achievement test, known as “teste de rendimento” in Portuguese, is
used to assess whether or not the objectives of a given course or discipline have
been achieved. It is directly related to the language, material and skills that have
been covered in a course. According to McNamara (2000), such tests, associated
with the instructional process, accumulate evidence that indicates if (and when)
the learning goals were reached, so that they can be administered to evaluate
the student's progress.
Achievement tests should be administered frequently, at short intervals,
during a course so that, depending on the results obtained, the teacher can
intervene, i.e. provide remedial work (“material para recuperação”, in
Portuguese) for the students or even adjust his/her own teaching.
Likewise, they can be administered at the end of the course, to check
whether the student has reached the goal previously established by the course.
The problem with achievement tests being administered at the end of a course is
that if the student does not perform well, there is no time to recover or learn what
was not learned.
Besides written or oral tests, there are other kinds of achievement assessment
instruments, such as portfolios, e-portfolios, and observation procedures
like diaries.
Remember that low grades in these tests can also reveal flaws in the
course, such as complex content mistakenly taught in the initial stages. This may
cause students to fail the tests through no fault of their own. The problem
may also lie in the test itself, for instance in overly difficult questions or
stems. Another example is teaching material that was not adequately designed,
which may lead the student to perform poorly. The curricular syllabus may also
have been poorly conceived: an introductory subject is replaced by a more
advanced one on the assumption that the student already knows the introductory
content, for example. In this case, many students perform poorly on tests in the
new subject. As a last example, some institutions allow students to be passed
indiscriminately by professors who are not committed to teaching and who have
not adequately taught the contents of their discipline. This can lead these
students to perform worse than expected in the subsequent discipline. Thus,
the teacher must always be very cautious when evaluating and making decisions,
as the result of an achievement test may be compromised by
factors unrelated to learning.

3.3 Diagnostic tests

The objective of a diagnostic test is to identify students’ strengths and
weaknesses so future actions may be taken. It may also help explain why certain
language problems occur.
According to Hughes (2003), the results of these tests make it possible to
verify elements of the content that need to be reinforced. Another purpose would
be, according to Haydt (2002), to verify the presence or absence of requirements
for new learning, which is why they are generally administered at the beginning
of the school term.
A diagnostic test will usually be accompanied by a checklist for the
administrator to use to help focus on the problematic areas. Diagnostic tests are
better than other types of tests for this purpose because they have been
developed to specifically find and define the problems and, in some cases, offer
remedial activities. They may also show a teacher where to start his/her course.

THEME 4 – THE PURPOSE FOR TESTING

Each type of test may serve different purposes. So, before choosing
what type of test you are going to use, you should ask yourself some
questions. Before asking them, however, you need to define whether you
are going to use classroom tests (achievement tests), written and administered
by teachers, or external tests (proficiency, mastery, entrance or placement
exams), which are planned and administered by an external agency such as the
Ministry of Education.

Once you have chosen between classroom tests and external tests, you
have to come up with questions such as:

CLASSROOM TESTS
● Do I want to know if my students have successfully acquired what was taught?
● Do I want to gather evidence to improve my teaching?
● Do I want to know if my students have progressed?
● Do I have to administer a test because I am required to grade students?
● Do I want to see what my students’ weaknesses are, so I can propose remedial work?
● Will this test motivate students to study?
● Will this test provide evidence to report to parents?
● Will this test provide evidence to report to the school coordinator?

Source: Retorta, 2023.

EXTERNAL TESTS
1. Do we want to evaluate proficiency?
2. Do we want to decide whether to accept students to certain programs?
3. Do we need to provide information for administrative decisions, such as special
treatment for certain groups? (e.g. Prova Brasil, SAEB)
4. Do we need to evaluate a curriculum? (e.g. Enade)
5. Do we want to choose the best candidates for job openings? (e.g. concursos públicos)

Source: Retorta, 2023.

Tests can basically serve three different purposes: selecting, classifying
or diagnosing. Unfortunately, a considerable number of teachers use their tests
only to select or classify their students, disregarding the primary function of
diagnosing and offering remedial work.
The first two purposes usually go together. When a teacher
identifies the best, the average and the weak students, he/she is classifying and
selecting them, i.e. deciding who is going to pass and who is going to fail.
From this culture of evaluation, some schools absurdly assign students to their
classroom according to their school records: the students considered “brilliant”
go to classroom A, the average ones to classroom B, and those considered weak
to classroom C.
Most summative assessments (Perrenoud, 1999; Haydt, 2002;
Luckesi, 2002) are used to classify and select students. According to Haydt
(2002), summative assessment aims to assess the degree of achievement
obtained by the student at the end of the school term, to determine whether
he/she will pass or fail. Unfortunately, a large number of schools subscribe to
this view of evaluation, which is related solely and exclusively to the culture of
grading and complying with legal and bureaucratic requirements.
It is worth mentioning that classifying or selecting a student is not always
an inappropriate practice: it all depends on the purpose of the test. If you
organize a test in order to select the best candidate for a job opening (concursos
públicos), for example, it is necessary to classify and select one or some
candidates among a larger group. Another example is a university entrance
examination (vestibulares or testes de seleção) in which you must choose the
candidates with the highest scores in order to fill the vacancies.
Now, when we use achievement tests in the school environment, it is
important for the teacher not only to use his/her assessment to classify students
and select the ones who will pass or fail. Teachers should use these tests in order
to diagnose problems during the school term and help students overcome their
limitations.
Perrenoud (1999), Haydt (2002), and Luckesi (2002) refer to tests with a
diagnostic purpose as formative. Formative assessments are carried out with the
purpose of informing the teacher and the student about learning outcomes
during the course. Formative assessments are diagnostic by nature: they allow
teachers to identify deficiencies in teaching and learning, in order to enable
reformulations and ensure the achievement of objectives.
When a teacher uses formative tests, i.e. tests with the purpose of
diagnosing problems in order to propose immediate remedial work during the
school term, this teacher is in line with what we call “recuperação paralela” in
Brazil.
The purposes of tests need to be well defined for teachers and students.
When the purpose is well established, negative views of tests will slowly
disappear. Shohamy (1985) conducted a survey among high school students in
Israel regarding their attitudes toward language tests. Students pointed out the
following feelings, which may be the same ones our own students have:

ADJECTIVES THAT BEST DESCRIBE WHAT YOU THINK ABOUT
LANGUAGE TESTS
Threatening, waste of time, unfair, painful, difficult, motivating, challenging, boring, fun, useless
and terrible.

Source: Shohamy, 1985.

In this survey, 90% of the students answered that their tests did not reflect
their actual knowledge in the language! This is astonishing, isn’t it? Even more
astonishing were the answers about what they liked or disliked about language
tests.

WHAT DO YOU LIKE OR DISLIKE ABOUT LANGUAGE TESTS?

Student 1: We spent ten lessons conjugating the past tense but on the test there
were only two conjugations.

Student 2: I never learn anything from tests because the teacher never corrects the
mistakes I make, so I end up at the same place where I was before I took the
test, except now I also have a bad grade.

Student 3: The teacher bases the grade on the test, so if I did not feel well on the
same day that the test was given, I flunked the course.

Student 4: I don't believe that any test is a good measure of my proficiency, let
alone on a day when I do not feel well.

Student 5: I don't see the connection between the test and my knowledge, otherwise,
how can I explain the fact that I get good grades on English tests, but last week,
when I met an American, I couldn't say anything in English? How come we
never speak on tests?

Student 6: Our teacher uses the test as a punishment, whenever we don't behave she
tells us that we have a test the following day. Sometimes she even asks us to
take out a sheet of paper and write the test on the spot.

Student 7: A test really makes me study, I would have never opened a book if not for
the test. I think the pressure is good for me.

Student 8: I hate to flunk, tests always show me that I am a failure.

Student 9: It seems that whenever the teacher is unprepared she asks us to write a test.

Student 10: I don't mind the test. What I do mind is that it takes the teacher such a
long time to correct the tests. By the time I get the test back, I forget what the
test was all about.

Student 11: I think that it is really strange that whenever I study hard, I don't get
a good grade, but when I don't study at all, I happen to succeed. Does it say
something about me or about the test?

Student 12: What I hate most is when the teacher does not tell us in advance what the
test will cover. It seems that I'm always studying the wrong things.

Student 13: Why do we need tests, the teacher knows how well we are doing anyway.
Source: Shohamy, 1985.

So, as Shohamy (1985) asked: what is wrong with our tests that makes
students have such negative attitudes towards them? This is a question all of us
should try to answer. Defining the purpose and criteria of a test with your students
is a good start towards changing these views.

THEME 5 – THE CRITERIA FOR JUDGING THE TEST RESULTS

The results of a test can be interpreted in two different ways:
norm-referenced or criterion-referenced. Both terms refer to how
scores are interpreted, not to the type of assessment.

5.1 Norm-Referenced Assessment

A norm-referenced assessment is designed to help teachers identify which
students need extra support, which are on track, and which might
benefit from accelerated instruction.
In a norm-referenced assessment, an individual score is interpreted in
comparison with the group, i.e. a student's performance is evaluated in relation
to the performance of his/her classmates. An example would be a teacher who,
after reading his/her students' essays, chooses the best one and assigns it the
maximum grade. The remaining essays are then graded according to the
standard set by the best one.
In general, a norm-referenced test is used to discriminate between
students with different degrees of ability. Therefore, in examinations which have
the objective of choosing the candidates with the best scores for a job opening
(concursos públicos, for example) or for entering university (vestibulares),
the scoring is normally norm-referenced.
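
As an illustration only, the norm-referenced reading of scores described above can be sketched in a few lines of Python. The candidate names, the scores and the number of vacancies are invented for the example, and the two helper functions (percentile_ranks and select_top) are hypothetical names, not part of any official scoring procedure.

```python
# Hypothetical raw scores for four candidates (invented data).
scores = {"Ana": 82, "Bruno": 67, "Carla": 91, "Davi": 74}


def percentile_ranks(scores):
    """For each candidate, return the percentage of the group that scored strictly lower."""
    group_size = len(scores)
    return {
        name: 100 * sum(other < score for other in scores.values()) / group_size
        for name, score in scores.items()
    }


def select_top(scores, vacancies):
    """Return the names of the highest-scoring candidates, best first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:vacancies]


ranks = percentile_ranks(scores)
selected = select_top(scores, vacancies=2)

print(ranks)     # each score is read only in relation to the rest of the group
print(selected)  # candidates who would fill the two hypothetical vacancies
```

Note that a score of 82 says nothing by itself here: the same score could rank first or last depending on how the rest of the group performed, which is precisely what makes the interpretation norm-referenced.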

5.2 Criterion-Referenced Assessment

A criterion-referenced assessment, or score, compares a student’s
knowledge or skills against a criterion. The criterion might be based on a course
syllabus, on lesson plans, on a scale of descriptors or even on an external indicator
like the performance on a well-established test. In other words, this kind of
interpretation is not based strictly on a comparison with the performance of
other students. Criterion-referenced interpretation is well suited to contexts
where there is consensus about the standard, i.e. teachers and students are well
aware of what is supposed to be learned.
In the classroom, these tests should be used to ascertain the linguistic
knowledge expected of the student in relation to the course objectives. This is
how we become capable of perceiving the strengths and weaknesses of the
student's repertoire and, therefore, of proposing remedial activities, when
necessary.
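
A minimal sketch of this criterion-referenced reading, under stated assumptions, might look like the Python below. The course objectives, the item counts and the 70% cut score are all hypothetical values chosen for the example, and the function name interpret is ours. In practice the criterion would come from the course syllabus or from a scale of descriptors.

```python
# Hypothetical mastery threshold: proportion of items answered correctly.
CUT_SCORE = 0.7


def interpret(results, cut_score=CUT_SCORE):
    """Compare one student's result on each objective with the cut score.

    `results` maps an objective to a (correct, total) pair of item counts.
    """
    report = {}
    for objective, (correct, total) in results.items():
        proportion = correct / total
        report[objective] = "met" if proportion >= cut_score else "remedial work"
    return report


# Invented results for a single student on three course objectives.
student = {
    "past simple": (8, 10),
    "listening for gist": (5, 10),
    "question forms": (9, 10),
}

report = interpret(student)
print(report)
```

The student's result does not depend on how classmates performed: the whole class could meet every objective, or none of them could. The report also points directly to the objective that calls for remedial activities.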
In short, it can be said that, while the main purpose of criterion-referenced
tests is to discover the student's ability, that of the norm-referenced test is to
classify students, discriminating between good, fair and weak.
We will talk more about the purpose and criteria of assessments further in
this course when we turn to concepts of validity, reliability, fairness,
accountability, and transparency.

REFERENCES

BROWN, H. D. Teaching by principles: an interactive approach to language
pedagogy. White Plains, NY: Pearson Education, 2001.

EAST STROUDSBURG UNIVERSITY (ESU). What is Assessment.
Pennsylvania - USA, 2023. Available at:
<https://www.esu.edu/assessment/about.cfm>. Accessed: Feb. 2023.

GRANENA, G. Cognitive aptitudes for second language learning and the LLAMA
language aptitude test. Sensitive periods, language aptitude, and ultimate L2
attainment, v. 35, p. 105, 2013.

GREENE, S. Evaluation: issues associated with evaluation. Aula 1. Curitiba:
Uninter, 2021.

HAYDT, R. C. Avaliação do processo ensino-aprendizagem. São Paulo:
Ática, 2002.

HUGHES, A. Testing for language teachers. 2. ed. Cambridge: Cambridge
University Press, 2003.

LUCKESI, C. C. Avaliação da aprendizagem escolar. 13. ed. São Paulo:
Cortez, 2002.

McNAMARA, T. F. Language testing. Oxford: Oxford University Press, 2000.

PERRENOUD, P. Avaliação: da excelência à regulação das aprendizagens -
entre duas lógicas. Artmed, 1999.

ROBINSON, P. Aptitude and second language acquisition. Annual Review of
Applied Linguistics, v. 25, p. 46-73, 2005.

SHEEN, Y. The effect of focused written corrective feedback and language
aptitude on ESL learners' acquisition of articles. TESOL Quarterly, v. 41, n. 2,
p. 255-283, 2007.

SHOHAMY, E. A practical handbook in language testing for the second
language teacher: a collection of principles, procedures, and examples in the
planning, writing, administering, analyzing and using language tests. Tel-Aviv
University (experimental edition), 1985.

SPARKS, R. L. et al. Subcomponents of second language aptitude and second
language proficiency. The Modern Language Journal, v. 95, n. 2, p. 253-273,
2011.

UR, P. A course in language teaching. Cambridge: Cambridge University Press,
1996.
