Technical Report
Linking the Aptis Reporting
Scales to the CEFR
TR/2015/003
Barry O'Sullivan
British Council
CONTENTS
EXECUTIVE SUMMARY
PART 1 BACKGROUND TO THE STUDY
1.1. THE PURPOSE OF THE PROJECT
1.2. APTIS
1.2.1. INTENDED TEST POPULATION
1.2.2. STAKES AND DECISIONS
PART 2 OVERVIEW OF THE STUDY
PART 3 THE SPECIFICATION PHASE
PART 4 THE STANDARD-SETTING PHASE
4.1. THE APPROACH TAKEN
4.2. THE EXPERT PANEL
4.3. THE READING PAPER
4.3.1. PRE-EVENT TEST OVERVIEW
4.3.2. FAMILIARISATION ACTIVITIES
4.3.3. BOUNDARY DISCUSSIONS
4.3.4. ROUND 1 OF JUDGEMENTS
4.3.5. ANALYSIS OF JUDGEMENTS FROM ROUND 1
4.3.6. DISCUSSION OF ROUND 1
4.3.7. ROUND 2 OF JUDGEMENTS
4.3.8. ANALYSIS OF JUDGEMENTS FROM ROUND 2
4.3.9. DISCUSSION OF ROUND 2
4.3.10. FINAL DECISION
4.3.11. COMMENTARY
4.4. THE LISTENING PAPER
4.4.1. PRE-EVENT TEST OVERVIEW
4.4.2. FAMILIARISATION ACTIVITIES
4.4.3. BOUNDARY DISCUSSIONS
4.4.4. ROUND 1 OF JUDGEMENTS
4.4.5. ANALYSIS OF JUDGEMENTS FROM ROUND 1
4.4.6. DISCUSSION OF ROUND 1
4.4.7. ROUND 2 OF JUDGEMENTS
TABLES
FIGURES
FIGURE 1.1: MODEL FOR LINKING A TEST TO THE CEFR (BASED ON O’SULLIVAN, 2009)
FIGURE 1.2: MODEL FOR LINKING APTIS TO THE CEFR
FIGURE 4.1: SUMMARY OF THE DESIGN OF THE STANDARDISATION PROCESS (KNOWLEDGE & RECEPTIVE PAPERS)
FIGURE 4.2: SUMMARY OF THE DESIGN OF THE STANDARDISATION PROCESS (PRODUCTIVE PAPERS)
FIGURE 5.1: EXAMPLE OF HOW THE LANGUAGE KNOWLEDGE SCORE IS USED
EXECUTIVE SUMMARY
Aptis comprises five papers, one assessing knowledge of the language system and four assessing language use:
1. Core (language knowledge)
2. Reading
3. Listening
4. Writing
5. Speaking
1.2. Aptis
As indicated above, Aptis is made up of five language papers, one focusing on a candidate’s knowledge of the
systems of the language and the others focusing on the ability to actually use the language.
The individual papers are outlined in the following tables (Table 1.1. to 1.5.). A fuller description of the papers,
with exemplar tasks, can be found at the Aptis website (www.britishcouncil.org/exams/aptis).
Vocabulary (multiple task types): A. Word definition, B. Synonym, C. Collocation. Candidates match a word to a definition, synonym or collocant; items use sets of five target words with ten options.

Reading 2: short text cohesion. Re-order a series of sentences to form a story (ordering task).
Writing 4: formal and informal text writing. Write one informal message to a friend and a more formal complaint, both on the same topic. Approx. 50 words for Part 1; approx. 120–150 words for Part 2.
Speaking 3: describe, compare and speculate. Two contrasting pictures are presented, with three increasingly complex questions on the two pictures. Q1 – 40 seconds; Q2 – 60 seconds; Q3 – 60 seconds.
Not all papers need to be taken by all test takers. Aptis offers a range of packages, from which a client can
choose to suit their needs. Table 1.6. shows the range of packages available at launch in August 2012.
The approach taken to the development of Aptis is outlined in O’Sullivan (2015).
Table 1.6: Aptis packages available at launch. The 15 packages combine between two and five of the papers: packages 1–4 include two papers, packages 5–10 three papers, packages 11–14 four papers, and package 15 all five papers.
1.2.1. Intended test population

Aptis is designed to assess the English proficiency of non-native speakers of English at CEFR levels A1 to C. It has been designed to be used across a range of contexts and a number of domains, where a measure of general proficiency is required. Aptis is what is called a B-to-B (business-to-business) test, designed to be sold to an institution rather than an individual. It does not offer an internationally recognised certification of ability, but can be certified by the client institution.

Therefore, Aptis is designed to be used where an institution wishes to establish an estimate of the language ability of a known population (e.g. employees, students). The test is designed primarily for adults and young adults. While Aptis was not specifically developed for use with younger learners, it has been shown in controlled trials to function well down to 13 years of age in specific contexts. The British Council expects that potential users with this range in mind carry out carefully designed feasibility studies in cooperation with the British Council to establish empirically that its use is justified.

1.2.2. Stakes and decisions

Aptis is a medium stakes test, designed to allow institutions to make decisions about test takers within that institution. It is not intended for use for high stakes decisions such as university entrance, immigration or citizenship.
The study reported here comprises a series of four standard-setting events, each one aimed at establishing a series of cut-points on a separate skill paper. Since the cut-points are designed to indicate different CEFR levels, a formal standard-setting event is required in order to supply empirical support for the veracity of the claims made by the British Council in this regard.

As this is not meant to be a full and formal ‘linking’ study (as was the case with the City & Guilds Communicator project, O’Sullivan, 2009), it was not considered necessary to follow the complete set of procedures as laid out in the Council of Europe Manual (2009). This is because Aptis is a new testing service, which has been developed from scratch by a team of developers at the British Council using the British Council/EAQUALS Core Inventory as its basis. The Inventory itself represented a significant attempt by the two organisations to add detail to the CEFR level descriptors. The approach adopted by O’Sullivan (2009) added a critical review phase to the Manual procedure and also limited the claims made with regard to the test at each stage of the process. O’Sullivan also argued that the entire process was iterative, and not linear as implied by the Manual, and that the process should be supported by a clearly stated model of test validation.

Therefore, the original approach (see Figure 1.1) was replaced with a slightly updated and contextualised version (see Figure 1.2).

Figure 1.1: Model for linking a test to the CEFR (based on O’Sullivan, 2009)
[Figure: familiarisation precedes each phase; progression runs from specification to standard setting to validation, with the final claim made on the basis of empirical evidence and evidence from the earlier phases.]
Figure 1.2: Model for linking Aptis to the CEFR
[Figure: familiarisation precedes each stage, and each stage is followed by an evaluation leading either to progression (✔) or to repetition of the stage (✘).]
In this model, familiarisation of participants with the CEFR is suggested before each stage of the process.
This is to ensure that the participants at these stages are fully competent in their understanding of the
purpose of the stage, as well as being able to accurately apply their shared understanding of the CEFR levels.
Once a process has been carried out, it is evaluated in a number of appropriate ways, so that one of two
decisions can be made: (1) continue to the next phase (the positive ✔ direction) of the project or (2) go
back to the beginning of the stage (or even further back, depending on the findings of the evaluation)
and repeat the process, having taken into consideration the negative aspects of the evaluation (the negative
✖ direction). Since there are three distinct stages, and as progress is always dependent on the positive
outcomes of the evaluation, the entire process can be seen to be iterative in nature.
In this section of the report, the outcomes of the standard-setting stage for each of the four skills are reported. Since the familiarisation phase is designed to allow participants in the process to internalise the relevant details and interpretations of the CEFR descriptors, it was the first element of each standard-setting event. This element was followed by a discussion of the minimally competent candidate and then by rounds of judgements related to items or performances, as appropriate.
Figure 4.1: Summary of the design of the standardisation process (Knowledge & Receptive papers)
[Figure: rounds of judgements followed by analysis of judgements, discussion and a final decision.]
As has been noted elsewhere (O’Sullivan, 2009), when it comes to setting standards for a productive paper
(i.e. writing or speaking), the event is more reflective of a rating event than of a typical receptive skill
standard-setting event.
Figure 4.2: Summary of the design of the standardisation process (Productive papers)
With the productive papers (speaking and writing), the main focus of our work was to explore the accuracy of the interpretation of level in the Aptis specifications and in rater training and standardisation, and the accuracy of the resultant decisions made by Aptis raters. The procedure for the productive skills is summarised in Figure 4.2 above and outlined in more detail below.
Pre-event test overview Participants were presented with information about the particular productive
skill as described in the CEFR and about the Aptis paper being focused on.
The pre-event activity was to review and familiarise themselves with the test
paper and re-familiarise themselves with the CEFR level descriptors.
Familiarisation activities Since the panel selected for the work on the receptive papers was the same as
that for the productive papers, we were again able to employ a limited number
of familiarisation activities. Again, three such activities were found to be sufficient
(though of course many more had been created in case they were needed). The
activities were based on matching descriptors to CEFR level. In addition to these
activities, panel members were shown a specially constructed scale, created using
descriptors from the CEFR, and asked to discuss it and to ensure that it was likely to
be functional. By this we mean, the differences between the levels described were
clear and easy to distinguish and apply operationally.
Boundary discussions Following on from the familiarisation activities, participants were asked to
discuss the various boundaries, with the aim of internalising the definition of the
minimally competent candidate at each boundary point. As was the case with
the receptive skills definitions, the resultant definitions are published here. The
discussions led to an operational consensus on the levels, their range and the
boundaries between them.
Round 1 of judgements When it was agreed that all participants were ready to begin the rating process, they
were asked to consider a set of eight pre-selected scripts. The scripts were selected
to represent a range of performances across the CEFR levels and came from
different geographical locations (to eliminate any ‘local’ effects – e.g. a rater might be
familiar with the language use or handwriting associated with a particular education
system). The panel members then used the scale (described above) to help them
decide on the likely CEFR level of each task performance they encountered.
Analysis of judgements The judgements were entered into a pre-prepared Excel workbook, and individual
and group mean CEFR scale levels were automatically estimated. The resulting
outputs were then fed back to the participants.
Discussion Using the data from the first round of ratings, the participants were encouraged to
discuss their decisions, particularly where there were significant differences (though
in reality there were few if any such cases – with almost all ratings coming within one
level of each other). This discussion was led and focused by the event facilitator.
Round 2 of judgements When the group felt that the discussion had reached a natural conclusion,
participants were asked if they wished to reconsider each rating. As was the case
with the receptive skills, some participants chose to make changes to their initial
ratings based on the preceding discussions, while others did not make any changes.
Analysis of judgements The data were again entered into the pre-prepared worksheet in the Excel
workbook, and the individual and overall mean CEFR levels automatically estimated.
Consensus on ratings The results of the analysis were then discussed by the participants, who were
informed that further rounds of rating could follow if they deemed it necessary, i.e.,
if they were unable to come to a consensus on the final agreed CEFR level for each
task. This option was not required for any of the papers, as overall agreement was
quickly reached.
Compare ratings At this stage in the process, panel members were shown the original scores
awarded by Aptis raters (as reported on the CEFR). The idea here was to promote
discussion if and when disagreements were found.
General discussion As it turned out, there was a significant level of agreement between the original
levels (from the Aptis raters) and the judgements made by the expert panel.
Final decision When the final discussion was completed, the participants were asked to agree that
the two rating processes had reached an appropriate level of agreement. This was
done and the proceedings closed.
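The analysis steps described above (judgements entered into a pre-prepared Excel workbook, with individual and group mean CEFR levels estimated automatically) can be illustrated with a short sketch. This is a minimal illustration only, not the workbook used in the study: the script identifiers, the 0–5 band scale and the band-to-level lookup are all assumptions made for the example.

```python
# Minimal sketch: aggregate panel ratings for one productive task.
# Ratings are on an assumed 0-5 band scale; the mapping of mean
# bands to CEFR labels below is illustrative, not the Aptis mapping.
from statistics import mean

ratings = {                      # script id -> one rating per panel member
    "script_01": [5, 4, 5, 5, 4, 5, 4, 4],
    "script_02": [3, 2, 3, 3, 3, 2, 3, 3],
}

def band_to_cefr(band: float) -> str:
    """Illustrative band-to-level lookup (assumption, not the Aptis table)."""
    cuts = [(4.5, "B1"), (3.5, "A2.2"), (2.5, "A2.1"), (1.5, "A1"), (0.0, "A0")]
    return next(label for cut, label in cuts if band >= cut)

for script, panel in ratings.items():
    group_mean = mean(panel)                 # group mean band for the script
    print(script, round(group_mean, 2), band_to_cefr(group_mean))

# Per-rater means show whether an individual panel member is drifting
n_raters = len(next(iter(ratings.values())))
rater_means = [mean(r[i] for r in ratings.values()) for i in range(n_raters)]
print("per-rater means:", [round(m, 2) for m in rater_means])
```

In the study itself there were 15 panel members and the level for each script was agreed by consensus; the sketch only shows the kind of arithmetic the workbook automates.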
Table 4.1: Summary of the Round 1 judgements for the A0 – A1 boundary (Reading)
Items M D A P Mi J L An Ma Mh Do Po V N Ni
1 30 60 40 70 60 50 40 50 40 30 60 10 50 40 50
2 30 30 50 80 60 60 40 60 60 40 60 10 60 60 50
3 20 60 40 70 50 70 40 50 50 50 60 10 50 40 50
4 10 40 40 70 30 80 20 60 50 50 60 30 50 50 50
5 20 20 30 70 20 50 20 60 40 50 50 40 60 50 50
6 10 10 20 40 0 50 40 40 10 20 20 10 50 10 10
7 10 10 10 50 0 50 20 30 10 30 20 10 40 10 10
8 10 10 10 50 0 50 40 40 20 40 20 10 50 10 0
9 10 10 10 50 0 50 20 40 20 40 20 10 40 10 0
10 10 10 10 40 0 50 20 40 30 30 20 10 30 10 0
11 10 10 10 40 0 50 20 30 20 20 20 10 60 10 0
12 10 0 0 0 0 0 10 0 0 0 0 0 0 0 0
13 10 0 0 0 0 0 10 0 0 0 0 0 0 0 0
14 10 0 10 0 0 0 10 0 0 10 0 0 10 0 0
15 10 0 0 0 0 0 10 0 0 0 0 0 0 0 0
16 10 0 0 0 0 0 0 0 0 10 0 0 10 0 0
17 10 0 0 0 0 0 10 0 0 0 0 0 0 0 0
18 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0
19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
21 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
23 0 0 0 0 0 0 0 0 0 0 0 0 0 10 0
24 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
25 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
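The individual and group cut-scores referred to later in the report are derived from judgements like those in Table 4.1. The report does not give the exact formula used in the Excel workbook, so the sketch below shows a standard Angoff-style aggregation (average each panel member's percentage judgements across the items, then average across the panel) using a few illustrative rows; it is not the study's own calculation.

```python
# Sketch of an Angoff-style aggregation for percentage judgements such as
# those in Table 4.1. Data below are illustrative, not the study's figures.
judgements = [                 # item -> judged probability (as %) per panel member
    [30, 60, 40, 70, 60],
    [30, 30, 50, 80, 60],
    [10, 10, 20, 40, 0],
    [10, 0, 0, 0, 0],
]

n_items = len(judgements)
n_judges = len(judgements[0])

# Each judge's cut-score: mean judged probability across items (in %).
judge_cuts = [sum(row[j] for row in judgements) / n_items for j in range(n_judges)]

# Group cut-score: mean of the individual cut-scores, expressed as a
# percentage of the raw score (the form in which the report quotes
# boundary points, e.g. "14 per cent").
group_cut = sum(judge_cuts) / n_judges

print("individual cut-scores (%):", [round(c, 1) for c in judge_cuts])
print("group cut-score (%):", round(group_cut, 1))
```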
Table 4.2: Summary of the Round 2 judgements for the A0 – A1 boundary (Reading)
Items M D A P Mi J L An Ma Mh Do Po V N Ni
1 40 60 20 30 50 30 40 50 40 30 30 10 30 40 30
2 40 30 30 50 50 60 40 60 60 50 40 10 50 60 50
3 30 60 20 30 50 70 40 50 50 50 40 10 30 40 50
4 20 40 20 40 40 80 20 50 50 50 40 30 40 50 50
5 20 20 20 30 40 50 20 50 40 50 30 40 30 40 50
6 20 10 10 20 20 20 40 40 10 20 20 10 20 10 20
7 20 10 10 30 20 30 20 30 10 30 20 10 30 10 20
8 20 10 10 30 20 30 40 40 20 40 20 10 30 10 0
9 20 10 10 30 20 30 20 40 20 40 20 10 30 10 0
10 20 10 10 20 20 20 20 40 30 30 20 10 30 10 0
11 20 10 10 20 20 20 20 30 20 20 20 10 30 10 0
12 20 0 0 0 0 0 10 0 0 0 0 0 0 0 0
13 20 0 0 0 0 0 10 0 0 0 0 0 0 0 0
14 20 0 10 0 0 0 10 0 0 10 0 0 0 0 0
15 20 0 0 0 0 0 10 0 0 0 0 0 0 0 0
16 20 0 0 0 0 0 0 0 0 10 0 0 0 0 0
17 20 0 0 0 0 0 10 0 0 0 0 0 0 0 0
18 20 0 0 0 0 0 0 0 0 0 0 0 0 0 0
19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
21 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
23 0 0 0 0 0 0 0 0 0 0 0 0 0 10 0
24 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
25 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4.3.8. Analysis of judgements from Round 2

The data were input into the appropriate section of the spreadsheet and the individual and group cut-scores were again automatically calculated. These data were then used as the basis of the discussion that followed. The summary results for the A0 – A1 boundary point for this round of judgements are shown in Table 4.2. In total, 20 per cent of the judgements were changed by the panel members. The boundary point shifted slightly (to 14 per cent) from Round 1.

4.3.9. Discussion of Round 2

The results of the analysis of the judgements from Round 2 were then discussed by the participants. Again, the initial focus was on items where there tended to be disagreement, though later the individual mean judgements were also discussed to ensure that the group was aware of the impact of the judgements they had made. During this discussion, the panel members were assured that additional rounds of judgements could follow if they deemed it necessary. These additional rounds were not required as the group felt that a final set of decisions could be agreed on.

4.3.10. Final decision

The event concluded with the reaching of a consensus on the placing of the various CEFR boundaries. In fact, the final agreed boundaries reflected those initially identified in Round 1 (the only material change between Rounds 1 and 2 came at the A0 – A1 boundary and that was just a single percentage point lower).

4.3.11. Commentary

The standard-setting event for Aptis Reading resulted in a set of robust boundary points that were endorsed by the expert panel. The experience in reaching a consensus on these boundary points was to contribute to the later panels, as it demonstrated the need for a clearly focused set of test and CEFR familiarisation tasks. It also demonstrated to all concerned (both the organisers and the panel members) the importance of a highly skilled and experienced panel whose members were independent of the test developer.

Following this event, we were satisfied that the boundary points identified represent a genuine and successful attempt to allow Aptis to present test performance data using either a traditional reporting scale or the CEFR (or indeed both approaches).

4.4. The listening paper

As with the reading paper, the listening paper consists of 25 items. Unlike the reading paper, the listening paper consists entirely of independent items (while the reading paper has 25 items focused on four individual reading tasks). The panel members were asked to review a complete listening paper when making their judgements.

The stages shown in Figure 4.1 above, which were followed by the reading panel, were also followed by the listening panel.

4.4.1. Pre-event test overview

Participants were given details of the listening skill from the CEFR and also of the Aptis listening paper. A set of self-access familiarisation activities was developed to give the panel members an opportunity to familiarise themselves with the listening paper and re-familiarise themselves with the CEFR level descriptors appropriate to listening.
Table 4.3: Summary of the Round 1 judgements for the A2 – B1 boundary (Listening)
Items M D A P Mi J L An Ma Mh Do Po V N Ni
2 60 60 100 100 100 100 80 100 100 80 100 100 80 100 100
5 50 40 50 70 70 50 60 90 100 50 50 90 30 50 80
7 30 40 70 60 100 80 60 80 90 40 70 90 30 80 70
8 50 30 50 30 100 20 40 80 80 50 80 50 20 50 50
9 40 40 50 20 90 60 60 60 60 40 80 20 10 50 30
10 30 60 70 50 90 80 60 60 80 40 50 50 30 60 60
11 20 50 40 60 90 80 40 50 70 50 30 30 10 40 50
12 20 30 50 10 80 60 80 50 60 60 20 40 0 30 50
13 30 60 40 0 70 30 60 50 70 70 20 10 20 40 70
14 30 70 60 20 100 60 60 50 60 50 30 60 50 50 50
15 20 50 30 20 60 30 40 40 40 10 10 20 50 50 60
16 30 90 60 10 60 20 60 50 70 60 70 40 0 60 40
17 20 30 40 0 50 20 60 30 50 40 10 0 0 40 50
18 10 50 30 30 50 20 60 30 40 30 60 0 0 40 40
19 50 40 50 20 40 70 60 10 50 30 80 10 50 40 60
20 20 30 30 20 40 30 60 20 30 10 50 0 0 30 50
21 20 10 40 0 50 20 10 20 30 10 40 0 50 20 60
22 10 0 20 0 40 0 10 0 10 0 10 0 0 30 30
23 20 30 40 20 30 10 20 10 20 10 40 10 50 50 50
24 30 40 30 0 20 10 20 10 10 0 30 0 40 30 30
25 20 10 10 0 0 0 30 0 0 0 10 0 0 10 20
Table 4.4: Summary of the Round 2 judgements for the A2 – B1 boundary (Listening)
Items M D A P Mi J L An Ma Mh Do Po V N Ni
1 100 70 70 100 100 100 90 100 100 100 100 100 80 100 100
2 70 60 100 100 100 100 80 100 100 100 100 100 90 100 100
5 60 40 50 70 70 50 60 90 100 50 50 90 60 50 70
7 40 40 50 60 100 80 60 80 90 40 70 90 30 80 60
8 60 30 40 30 100 20 40 80 80 80 80 50 20 50 40
9 50 40 50 20 90 60 60 60 60 40 80 30 10 50 30
10 40 60 60 50 90 80 60 60 80 40 50 50 30 60 50
11 30 50 40 60 90 80 40 50 70 50 30 30 30 40 40
12 30 30 40 10 80 60 80 50 60 70 20 40 40 30 30
13 40 60 40 0 70 30 60 50 70 80 20 30 30 40 40
14 40 70 40 20 100 60 60 50 60 50 30 60 50 50 40
15 30 50 30 20 60 30 40 40 40 10 10 30 50 50 40
16 50 90 50 10 60 20 60 50 60 60 70 40 40 60 20
17 30 30 40 0 50 20 60 30 40 40 10 0 40 40 30
18 30 50 30 30 50 20 60 30 30 30 60 0 40 40 20
19 60 40 40 20 40 70 60 10 40 30 80 10 60 40 40
20 40 30 30 20 40 30 60 20 30 10 50 0 0 30 30
21 40 10 40 0 50 20 10 20 30 10 40 0 50 20 40
22 30 0 20 0 40 0 10 0 10 0 10 0 30 30 30
23 40 30 40 20 30 10 20 10 20 10 40 10 50 50 30
24 40 40 30 0 20 10 20 10 10 0 30 0 40 30 10
25 40 10 10 0 0 0 30 0 0 0 10 0 0 10 10
4.4.8. Analysis of judgements from Round 2

The summary results for the A2 – B1 boundary point for this round of judgements are shown in Table 4.4. Interestingly enough, just as was found in the reading paper for the A0 – A1 border, a total of 20 per cent of the judgements were changed by the panel members. The boundary point shifted slightly (to 44 per cent) from Round 1.

A major difference arises when we are dealing with production-based papers, as we are not expecting expert panel members to make judgements on the probability of the difficulty of an item. Instead, we are asking panel members to make informed judgements on the actual level of a piece of language (be it written or spoken) produced by a learner. While we still go through a similarly systematic set of procedures, the task we asked panel members to perform is more akin to rating than to judging, as will be seen.
93680513 5 5 5 5 5 5 5 5 5 4 5 5 5 5 5 Above A
93680511 5 4 5 5 4 5 4 4 5 3 4 4 5 4 5 A2.2
93680516 5 0 3 4 0 4 2 2 4 3 3 2 3 0 5 A2.1
93680572 4 3 2 3 3 2 3 2 4 2 3 2 3 2 4 A2.1
93683074 3 2 4 3 3 3 2 2 3 2 3 4 3 2 3 A2.1
93683062 4 1 3 2 4 3 2 2 4 1 2 4 2 1 4 A2.1
93683094 5 3 4 3 4 3 2 3 4 1 3 4 4 3 5 A2.1
93683092 5 4 5 4 4 4 5 4 5 3 3 5 5 4 5 A2.2
CofE1 3 3 5 3 3 3 4 3 4 3 3 4 4 2 3 A2.1
The outcomes from the judgements for task 3 are shown in Table 4.6 and are similar in nature to the situation
with the first task. Clearly, there was some significant disagreement among the panel members. This was again
later used to fuel the discussions.
93680513 5 3 4 5 4 4 5 4 4 4 3 5 4 3 5 B1.2
93680511 5 5 5 5 4 5 3 4 5 4 4 5 5 4 5 Above B1
93680516 5 2 4 3 3 1 0 3 5 1 2 1 3 3 5 B1.1
93680572 4 3 2 2 2 3 1 2 3 1 2 1 2 2 4 A2.2
93683074 4 3 2 2 2 2 0 2 3 1 2 1 2 2 4 A2.2
93683062 4 5 5 4 4 5 2 3 4 3 3 3 5 5 5 B1.2
93683094 4 2 2 2 2 3 1 3 3 2 2 1 3 2 3 A2.2
93683092 2 4 4 2 2 3 3 3 3 3 2 3 2 3 4 B1.1
CofE1 3 3 3 4 3 3 2 2 3 4 3 3 0 4 4 B1.1
Finally, the outcomes for task 4 are shown in Table 4.7. Here again there is some level of disagreement, though it
does not appear to be as great as with the other tasks.
93680513 5 4 3 3 3 4 5 3 3 3 2 3 4 3 5 B2.2
93680511 4 3 4 2 2 3 2 3 3 4 2 3 5 2 4 B2.1
93680516 3 3 2 1 2 2 0 3 3 2 2 2 3 1 3 B1.2
93680572 1 1 1 0 1 1 0 2 2 1 1 0 2 0 1 B1.1
93683074 2 0 2 0 0 0 0 2 1 2 1 1 1 0 3 B1.1
93683062 3 3 5 3 0 3 0 3 4 4 4 3 4 4 2 B2.1
93683094 1 0 1 0 0 0 0 1 0 0 1 0 0 0 2 Below B
93683092 2 0 1 0 0 0 1 0 1 1 1 0 0 0 3 B1.1
CofE1 1 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
4.5.6. Discussion
The data tables from the first round of ratings were used as the basis of the discussions. The point of these
discussions was to help the panel members further clarify their thinking, and to help them make decisions
about any future ratings. This phase of the process was essentially used to replicate the procedures typical
of a rater training event.
93680513 5 5 5 5 5 5 5 5 5 4 5 5 5 5 5 Above A
93680511 5 4 5 5 4 5 4 4 5 3 4 4 5 4 5 A2.2
93680516 4 0 3 4 0 4 2 2 4 3 3 2 3 0 5 A2.1
93680572 4 3 2 3 3 2 3 2 4 2 3 2 3 2 4 A2.1
93683074 3 2 4 3 3 3 2 2 3 2 3 4 3 2 3 A2.1
93683062 4 2 3 2 4 3 2 2 4 3 2 4 2 2 4 A2.1
93683094 4 3 4 3 4 3 2 3 4 3 3 4 3 3 3 A2.1
93683092 5 4 5 4 4 4 5 4 5 3 3 5 5 4 5 A2.2
CofE1 3 3 4 3 3 3 4 3 4 3 3 4 4 2 3 A2.1
93680513 5 3 4 5 4 4 5 4 4 4 3 5 4 3 5 B1.2
93680511 5 5 5 5 4 5 3 4 5 4 4 5 5 4 5 Above B1
93680516 5 2 4 3 3 3 2 3 3 2 2 2 3 3 4 B1.1
93680572 3 3 2 2 2 3 1 2 3 1 2 1 2 2 3 A2.1
93683074 4 3 2 2 2 2 0 2 3 1 2 1 2 2 3 A2.2
93683062 4 5 5 4 4 5 2 3 4 3 3 3 5 5 5 B1.2
93683094 3 2 2 2 2 3 1 3 3 2 2 1 3 2 3 A2.1
93683092 2 4 4 2 2 3 3 3 3 3 2 3 2 3 4 B1.1
CofE1 3 3 3 4 3 3 2 2 3 4 3 3 3 4 4 B1.1
93680513 5 4 3 3 3 4 5 3 3 3 3 3 4 3 5 B2.2
93680511 3 3 3 2 2 3 2 3 3 4 2 3 3 2 4 B2.1
93680516 3 3 2 1 2 2 1 3 3 2 2 2 3 1 3 B1.2
93680572 1 1 1 0 1 1 0 2 2 1 1 0 2 0 1 B1.1
93683074 2 0 2 0 0 0 0 2 1 2 1 1 1 0 2 B1.1
93683062 3 3 4 3 3 3 2 3 4 4 4 3 4 4 2 B2.1
93683094 1 0 1 0 0 0 0 1 0 0 1 0 0 0 1 Below B
93683092 2 0 1 0 0 0 1 0 1 1 1 0 0 0 2 B1.1
CofE1 2 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
93690131 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Below A
93686854 3 2 2 3 3 2 2 2 3 3 2 3 3 2 2 A2
93686728 3 2 2 3 3 3 3 1 3 2 2 2 2 2 3 A2
93690115 3 4 4 4 2 3 4 2 4 4 3 2 2 3 4 B1
93686711 4 3 3 4 3 4 4 1 4 3 2 2 3 3 4 B1
93686905 3 3 2 4 3 3 3 2 3 3 3 4 2 3 3 B1
93690100 4 4 4 4 4 3 5 3 4 2 3 3 4 4 3 B2
93685911 4 4 3 4 3 3 4 2 4 4 3 3 4 3 5 B2
CofE1 2 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
93690131 1 1 1 1 2 1 1 1 2 1 1 2 1 1 1 A1
93686854 3 3 3 4 4 3 3 2 4 4 3 3 4 2 4 B1
93686728 3 2 2 3 3 3 3 1 3 3 2 2 2 2 3 A1
93690115 3 4 4 4 2 4 4 2 4 4 3 3 3 3 4 B1
93686711 3 2 3 3 3 4 4 2 3 4 2 2 3 3 4 B1
93686905 2 3 2 4 2 3 3 2 3 2 3 3 2 3 3 B1
93690100 5 4 3 4 4 4 5 3 4 3 3 3 4 4 4 B2
93685911 4 3 3 4 3 4 4 1 3 4 3 3 3 3 5 B1
CofE1 2 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
93690131 2 1 1 1 2 1 1 0 1 1 1 2 1 1 2 A1
93686854 3 3 3 4 3 3 3 2 4 3 2 3 2 3 4 B1
93686728 3 2 2 3 2 3 3 1 3 3 2 1 3 2 3 A1
93690115 3 3 4 4 3 3 3 2 4 4 3 4 4 3 4 B1
93686711 3 3 3 3 3 4 4 2 3 4 2 2 3 3 4 B1
93686905 2 2 3 4 2 4 3 1 3 2 3 3 2 3 3 B1
93690100 5 4 3 4 3 4 5 2 4 2 3 3 3 4 4 B2
93685911 3 3 3 3 3 2 3 1 3 3 2 3 3 2 4 B1
CofE1 2 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
93690131 2 1 1 2 2 1 2 1 1 1 1 1 2 1 2 A1
93686854 4 4 3 4 4 3 3 2 4 4 2 3 4 3 4 B1
93686728 3 3 2 3 2 3 3 1 3 3 2 2 3 2 3 B1
93690115 3 4 4 4 3 3 4 2 4 4 3 5 4 3 4 B2
93686711 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 A0
93686905 3 3 2 4 3 3 3 1 4 3 3 4 3 3 3 B1
93690100 5 5 3 5 4 5 5 3 5 3 3 4 4 4 5 B2
93685911 4 4 3 4 3 4 3 2 4 5 2 4 4 3 5 B2
CofE1 2 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
4.6.6. Discussion
There was a high level of agreement among the panel members, though one (‘An’) appeared somewhat
wayward when compared to the others. That said, the deviation shown by ‘An’ tended to be in the same
direction. In other words, this panel member tended to be consistently harsher than the others when making
judgements. The discussions, therefore, centred around these (and other less significant) variations.
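One simple way to surface the kind of systematic severity noted for ‘An’ is to compare each rater's mean rating with the mean of the rest of the panel over the same performances. The sketch below is illustrative only: the data, the list-of-lists layout and the flagging threshold are assumptions, not part of the study's analysis.

```python
# Sketch: flag raters whose ratings are consistently below (harsher than)
# the rest of the panel. Data layout and threshold are assumptions.
rater_ids = ["M", "D", "A", "P", "An"]
ratings = [                       # one row per performance, one column per rater
    [3, 2, 2, 3, 2],
    [3, 4, 4, 4, 2],
    [4, 3, 3, 4, 1],
    [3, 3, 2, 4, 2],
]

for j, rid in enumerate(rater_ids):
    own = [row[j] for row in ratings]
    others = [
        sum(v for k, v in enumerate(row) if k != j) / (len(row) - 1)
        for row in ratings
    ]
    bias = sum(o - t for o, t in zip(own, others)) / len(ratings)
    flag = "harsh" if bias < -0.5 else ""     # illustrative threshold
    print(f"{rid:>3}: mean deviation {bias:+.2f} {flag}")
```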
93690131 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Below A
93686854 3 2 2 3 3 2 2 2 3 3 2 3 3 2 2 A2
93686728 3 2 2 3 3 3 3 2 3 2 2 2 2 2 3 A2
93690115 3 4 4 4 2 3 4 2 4 4 3 2 2 3 4 B1
93686711 4 3 3 4 3 4 4 3 4 3 2 2 3 3 4 B1
93686905 3 3 2 4 3 3 3 2 3 3 3 4 2 3 3 B1
93690100 4 4 4 4 4 3 5 3 4 2 3 3 4 4 3 B2
93685911 4 4 3 4 3 3 4 3 4 4 3 3 4 3 5 B2
CofE1 2 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
93690131 1 1 1 1 2 1 1 1 2 1 1 2 1 1 1 A1
93686854 3 3 3 4 4 3 3 2 4 4 3 3 4 2 4 B1
93686728 3 2 2 3 3 3 3 1 3 3 2 2 2 2 3 A2
93690115 3 4 4 4 2 4 4 2 4 4 3 3 3 3 4 B1
93686711 3 2 3 3 3 4 4 2 3 4 2 2 3 3 4 B1
93686905 2 3 2 4 2 3 3 2 3 2 3 3 2 3 3 B1
93690100 5 4 3 4 4 4 5 3 4 3 3 3 4 4 4 B2
93685911 4 3 3 4 3 4 4 3 3 4 3 3 3 3 5 B1
CofE1 2 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
93690131 2 1 1 1 2 1 1 0 1 1 1 2 1 1 2 A1
93686854 3 3 3 4 3 3 3 2 4 3 2 3 2 3 4 B1
93686728 3 2 2 3 2 3 3 1 3 3 2 1 3 2 3 A2
93690115 3 3 4 4 3 3 3 2 4 4 3 4 4 3 4 B1
93686711 3 3 3 3 3 4 4 2 3 4 2 2 3 3 4 B1
93686905 2 2 3 4 2 4 3 2 3 2 3 3 2 3 3 B1
93690100 5 4 3 4 3 4 5 3 4 3 3 3 3 4 4 B2
93685911 3 3 3 3 3 2 3 2 3 3 2 3 3 2 4 B1
CofE1 2 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
93690131 2 1 1 2 2 1 2 1 1 1 1 1 2 1 2 A1
93686854 4 4 3 4 4 3 3 2 4 4 2 3 4 3 4 B1
93686728 3 3 2 3 2 3 3 2 3 3 2 2 3 2 3 B1
93690115 3 4 4 4 3 3 4 3 4 4 3 5 4 3 4 B2
93686711 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 A0
93686905 3 3 2 4 3 3 3 3 4 3 3 4 3 3 3 B1
93690100 5 5 3 5 4 5 5 3 5 3 3 4 4 4 5 B2
93685911 4 4 3 4 3 4 3 3 4 5 2 4 4 3 5 B2
CofE1 2 3 3 2 2 3 4 2 3 3 3 2 3 2 2 B2.1
4.7. Claims
It is now widely accepted that, in order to make any
claim of a link between a test’s reporting system and
the CEFR, it is first necessary to indicate how the
test itself has been informed by the CEFR, and then to demonstrate how the important decision boundaries are set. When deciding on the precise placement of
these boundaries, we should again both demonstrate
how the decision-making process has been informed
by the CEFR and also demonstrate that the process
has been transparent, systematic and accurate.
This section of the report has clearly demonstrated
that the boundaries set for the Aptis papers are robust
and reliable. However, remaining consistent with earlier
suggestions related to the linking process (O’Sullivan
2009), we will not at this point be making any final
definitive claims regarding Aptis. All final claims will be
addressed following the presentation of evidence of
the validity of the examination.
The procedures followed in the standard-setting
process indicate that the validity of the claim of the
CEFR levels reported by Aptis is strong.
To ensure that the final element of the linking project provides a coherent argument, we will present evidence of the validity of Aptis in terms of the components of the validation frameworks developed initially by Weir & O’Sullivan at Roehampton University over a decade ago and later published by Weir (2005) and updated by O’Sullivan & Weir (2011) and O’Sullivan (2011). This model is used here for a number of reasons. As argued by O’Sullivan & Weir (2011), the framework is the only practically operational model of validation in existence. Others have been proposed but they fail to offer the user a sufficiently detailed or coherent account of what is expected of the validation process; see for example Kane (1992), Messick (1975, 1980, 1989) or Mislevy et al. (2002, 2003). The following elements of the updated framework are presented for each paper:
• the test taker
• the test task
• the scoring system

5.1. The test taker

The test taker is considered within the Aptis approach from the very beginning. The underlying model of validation which drives the test is O’Sullivan’s (2011) modification of the earlier Weir (2005) validation framework. This model suggests that we consider the test taker from two major perspectives, personal characteristics and cognitive characteristics. The successful test will take both sets of characteristics into account when creating items and tasks.

Where Aptis is to be used with a population other than the one it has been created for, we either suggest an alternative test or propose a research study to investigate test performance from a representative sample of the population. At the time of writing this report, this has been done in three countries. In one of these places (India), the results of a review process resulted in some changes to the test (e.g. people and place names, reading task topics, writing task topic, and speaking task photographs) before it was considered appropriate for trialling with a group of secondary school students (Maghera & Rutherford, 2013). At a trial of the test in these circumstances, various additional pieces of information are collected; these include:
• linguistic background (L1)
• age
• educational level
• ethnic background
• gender

Some of these variables (age and gender) are routinely collected and test data analysed for potential bias.

In addition to these procedures, Aptis is delivered using the Surpass platform (BTL, 2012), which conforms to all international requirements for accessibility.
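The report notes above that age and gender data are routinely collected and that test data are analysed for potential bias, but it does not specify the method used. The sketch below shows one common first-pass screening step (comparing item facility across two groups against a simple threshold) with invented data; a full analysis would use a proper DIF procedure rather than this rough check.

```python
# Sketch: screen items for large facility differences between two groups.
# A real analysis would use a proper DIF procedure; this is a first-pass check.
responses = [
    # (group, item scores: 1 = correct, 0 = incorrect)
    ("female", [1, 1, 0, 1]),
    ("female", [1, 0, 0, 1]),
    ("male",   [1, 1, 1, 0]),
    ("male",   [0, 1, 1, 0]),
]

n_items = len(responses[0][1])

def facility(group: str, item: int) -> float:
    scores = [r[1][item] for r in responses if r[0] == group]
    return sum(scores) / len(scores)

for item in range(n_items):
    diff = facility("female", item) - facility("male", item)
    note = "review" if abs(diff) > 0.25 else ""   # illustrative threshold
    print(f"item {item + 1}: facility difference {diff:+.2f} {note}")
```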
Listening: assessed using 25 discrete items. The items are designed to become progressively more difficult and reflect the different aspects of listening assessed: 1. sound system; 2. literal meaning; 3. inferred meaning (Weir, 1993; Buck, 2001).

Reading: Task 1: limited challenge, concrete topic and task and simple sentence-level understanding. Task 2: higher challenge, identification of cohesive structure, so sentence-level focus, though within text.

Writing: Task 1: lowest challenge, basic personal information in an online form. Task 2: slightly higher challenge, individual preferences (interests etc.) in sentence form.

Speaking: Task 1: lowest challenge, short responses to three personal questions, though they increase slightly in complexity. Task 2: higher challenge as the candidate describes a picture, then responds to a personal question related to the picture, then makes comparisons between the scene in the picture and their own culture/city etc.
Intended population: Mid-teens to adult learners of English. Data collected at test trial and administration stages (completed by each test taker).

Known criteria: Core (grammar – MCQ), Listening and Reading: N/A – answer key. Writing and Speaking: not shown on the test, but available online on the Aptis website.

Task types:
– Core: Grammar – focus on grammatical form (including discourse usage) using MCQ items. Vocabulary – focus on word definition, usage, synonyms, collocations.
– Listening: 1. Listening for detail (pragmatic competence) – MCQ items; 2. Listening for overall meaning – MCQ items; 3. Listening for detail – note taking; 4. Listening for detail – MCQ items.
– Reading: 1. Careful local reading – MCQ cloze items; 2. Careful global reading – re-building text; 3. Global meaning – gapped text/matching; 4. Global reading – overall meaning.
– Writing: 1. Form filling; 2. Short extended guided writing (personal information); 3. Interactive writing (social media, semi-guided); 4. Extended writing – informal and formal text.
– Speaking: 1. Personal information questions; 2. Short answer non-personal questions (picture prompt); 3. Describe and compare questions (two picture prompts); 4. Extended output based on prompt.

Weighting: Core, Listening, Reading and Speaking – equal weighting. Writing – weighted: Task 1: max 3; Task 2: max 5; Task 3: max 7; Task 4: max 9.

Order of items: Core, Reading and Writing – order as above for computer; test takers may respond in any order on P&P. Listening and Speaking – as described in task types.
5.2.1.1. Purpose

In each of the papers in the Aptis system, candidates are offered a wide variety of tasks, each with specifically defined purposes. The rationale behind this approach is to ensure as broad a coverage of the underlying construct as possible, and to ensure that candidates are encouraged to set goals from the beginning of each task that reflect those expected by the development team.

The flexibility of the Aptis approach means the British Council is in a position to work with clients to localise the test (i.e. make it appropriate to the particular context and domain of language use), thus ensuring it will meet the expectations and requirements of the client while maintaining its internal integrity (from a content and a measurement perspective). An example of this approach was presented by Maghera & Rutherford (2013) when describing the work done on the Aptis papers to ensure they met the needs of a major Indian client.

5.2.1.2. Response format

In the same way that the different items and tasks have a variety of purposes, they also contain a range of response formats, from multiple choice to matching in the knowledge and receptive skills papers, to structured and open responses in the productive skills papers. This commitment to offering a wide variety of task and item formats reduces the potential for any format-related bias (either positive or negative).

5.2.1.3. Known criteria

In order to ensure that all candidates set similar goals with regard to their expected responses, the assessment criteria for all tasks and items are made clear both within the test papers and on the Aptis website.

It is also the case that the assessment criteria were very carefully considered by the development team in the early stages of the process to ensure that they reflect the underlying knowledge and ability being assessed in each paper. This link is recognised by Weir (2005) and O’Sullivan and Weir (2011) as being critical to the validity of the test.

5.2.1.4. Weighting

All items are equally weighted in each paper and this information is made clear to the candidates both within the paper and on the Aptis website. This is done to ensure that candidates are all equally informed as to the expectations of the developers (and therefore do not spend more time than intended on particular aspects of the test).

5.2.1.5. Order of items

While the papers are set out in a particular order, the candidate is free to respond in any order, with the exception of the speaking and the listening papers.

5.2.1.6. Time constraints

Candidates are allowed a limited amount of pre-performance preparation time for both writing and speaking (the time is built into the response times). In addition to this, the time allowed for responding to items and tasks is carefully controlled to ensure a similar test experience for all candidates. In fact, all timings are automatically gathered and will be used by the Aptis research team to study specific aspects of the test papers.

5.2.2. The linguistic parameters

The linguistic parameters refer to the language of the input, the expected output and also to factors such as variables associated with the interlocutor or audience that may affect language performance, e.g. gender, status, nature of acquaintanceship; see O’Sullivan (2000, 2002, 2008) and Berry (2007). These are explained in Appendix 5 and outlined as they relate to Aptis in Table 5.3.
5.2.2.1. Channel
In terms of input, this can be written, visual (photo,
artwork, etc.), graphical (charts, tables, etc.) or
aural (input from examiner, recorded medium, etc.).
Output depends on the ability being tested, although
candidates will use different channels depending
on the response format. With Aptis, we consider
channel from a number of perspectives, taking into
account lessons learnt from multi-literacy research
(see Unsworth, 2001) and assessment research
(Ginther, 2001; Wagner, 2008) into the impact on test
performance of features of visual input, for example.
Table 5.3: Test task evidence (demands) of the Aptis papers

Discourse mode:
– Core: Grammar – short input, using descriptive, narrative and discursive texts [formal & informal]. Vocabulary – discrete (single word) and simple descriptive texts.
– Listening: announcements, phone messages/conversations, short monologue; all including a range of accents and all delivered at normal speed (approx. 125–150 words per minute).
– Reading: Task 1: related sentences, each can be understood in isolation. Task 2: narrative or biographical text of 7 sentences. Task 3: medium length narrative or descriptive text of 1 or 2 paragraphs. Task 4: a longer narrative, discursive, explanatory, descriptive, or instructive text.
– Writing: Task 1: form completion. Task 2: personal information exchange. Task 3: informal narrative or descriptive text. Task 4: two texts, a) informal email, b) formal request or complaint.
– Speaking: Task 1: response to short personal questions (interactive type). Task 2: descriptive & narrative. Task 3: description, comparison & speculation. Task 4: extended monologue.

Text length:
– Core: Grammar – maximum 15 words for most forms, 30 words for discourse-based items. Vocabulary – typically single word, short sentences (10 word max) for usage items.
– Listening: typically approx. 50 word input; 1 to 5 word options in MCQs; maximum 40 words in rubric (Task 3: maximum 25 words in rubric).
– Reading: Task 1: max 50 words. Task 2: 100 words in 7 sentences. Task 3: 135 words. Task 4: 750 words in text; headings are a maximum of 12 words.
– Writing: Task 1: 110–130 words. Task 2: 30–50 words in input text. Approx. 100–120 words written by the learner.
– Speaking: Task 1: max. 10 words per question (x3). Task 2: max. 25 words. Task 3: approx. 100 words. Task 4: approx. 35 words.

Writer-reader relationship:
– Core: not relevant to these items as they are accessing an individual’s knowledge of the language system and are not concerned with usage, with the exception of discourse appropriacy.
– Listening: in most cases the speaker is identified as a friend, colleague or boss, for example. With other items (e.g. announcements), it is assumed that the speaker is a stranger.
– Reading: Task 1: friend, family. Task 2: unspecified. Task 3: unknown writer. Task 4: unknown writer.
– Writing: Task 1: unknown reader. Task 2: unknown reader. Task 3: friend. Task 4: a) friend, b) person of status.
– Speaking: unspecified audience for all tasks.

Nature of information:
– Core: concrete.
– Listening, Reading, Writing and Speaking: concrete and some more abstract.
Continued: Table 5.3: Test task evidence (demands) of the Aptis papers

Content knowledge:
– Core: unfamiliar. Listening: mix of familiar and unfamiliar. Reading: mix of familiar and unfamiliar. Writing: familiar. Speaking: familiar.
– There is a broad candidature, so this is dealt with by selecting only clear topics accessible to the general reader. There is no expectation of knowledge of British culture. In some cases, a domain- or population-specific version will include some level of expected knowledge of that domain or culture.

Linguistic parameters (lexical, structural and functional range):
– The papers have as their basis the British Council/EAQUALS Core Inventory, which can be found at: http://www.teachingenglish.org.uk/article/british-council-eaquals-core-inventory-general-english-0
– The language of the Aptis papers is carefully controlled, with clear specification of grammar and vocabulary for each task type (input and expected output). Lexical profiles are provided for all input texts (including instructions and prompts) and are based on the Compleat Lexical Tutor (www.lextutor.ca); a simple illustration of this kind of profiling follows below.
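The lexical profiling mentioned in the table above essentially reports what proportion of the running words in an input text fall within given frequency bands. The sketch below illustrates that idea with a tiny, made-up stand-in for a frequency list; it is not the Lextutor tool or the Aptis profiling procedure itself.

```python
# Sketch: proportion of running words covered by a frequency band list.
# The band list here is a tiny stand-in for a real frequency list.
import re

FIRST_1000 = {"the", "a", "to", "of", "and", "is", "in", "you", "write",
              "friend", "message", "on", "same", "topic", "one"}

def coverage(text: str, band: set[str]) -> float:
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in band for t in tokens) / len(tokens)

prompt = "Write one informal message to a friend on the same topic."
print(f"first-1000 coverage: {coverage(prompt, FIRST_1000):.0%}")
```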
Administrative parameters

Physical conditions: Physical conditions for all tests are set out in the Administrator Guidelines, to which all delivering centres have access. Close monitoring of delivery is essential, particularly (though not exclusively) for the listening and speaking papers, where interference from nearby candidates is a risk. The British Council has extensive experience delivering tests across the world and does so for many language examination boards, as well as for general education examination boards and professional bodies.

Uniformity of administration: With the computer delivered versions of the papers, this is not a major issue, though there is a clear dependence on the physical conditions being appropriate. Administration is strictly controlled, and Aptis is treated by the British Council as any of the high stakes tests it administers around the world. With pen and paper versions, there are clearly set out procedures, which are monitored at the local level and on occasion from outside.

Security: This is less of an issue for Aptis than for major high stakes tests such as IELTS. However, security is seen as important and all test papers and test data are routinely stored (and securely destroyed when called for) in the same way we deal with high stakes tests. Security issues are dealt with in the Invigilator Guidelines and the Administrator Guidelines.
Accuracy of the answer key: Answer keys are systematically checked on production of the task/item, then again both pre and post test administration.

Marker reliability: When taken online, responses are automatically scored within the system. This is the most accurate procedure. When a pen & paper version is taken, responses are manually input and routinely checked. However, Aptis is about to move to an Optical Mark Reader (OMR) to capture test scores; expected reliability is 99.98%.

Item performance: All items are routinely trialled with a large (100+) representative sample of candidates from a range of countries. At this point, logit values (required in order to include items in the item bank) and other important data are collected (facility, point biserial, infit) in order to ensure that only properly functioning items are included in the test papers. Items are also routinely analysed post test delivery to confirm that they are working as expected.

SEM: 4% / 6% / 7%
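Of the trialling statistics listed above, facility and the point-biserial correlation are classical item statistics that can be illustrated directly; the Rasch logit values and infit statistics require an IRT analysis and are not shown. The response matrix below is invented for the example.

```python
# Sketch: classical item statistics (facility and point-biserial) for a
# small invented 0/1 response matrix. Rasch logits / infit are not shown.
from statistics import mean, pstdev

responses = [            # one row per candidate, one column per item
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]

totals = [sum(row) for row in responses]

def point_biserial(item: int) -> float:
    scores = [row[item] for row in responses]
    # Correlate the item score with the total score on the other items
    rest = [t - s for t, s in zip(totals, scores)]
    sx, sy = pstdev(scores), pstdev(rest)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean(s * r for s, r in zip(scores, rest)) - mean(scores) * mean(rest)
    return cov / (sx * sy)

for item in range(len(responses[0])):
    fac = mean(row[item] for row in responses)          # facility (p-value)
    print(f"item {item + 1}: facility {fac:.2f}, "
          f"corrected point-biserial {point_biserial(item):+.2f}")
```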
Rating scale: The rating scales used for Aptis Writing were developed based directly on the descriptors from the CEFR. The scales are task specific (one each for Tasks 2, 3 and 4) and can be seen in Appendix 2. The rating scale used for Aptis Speaking was also based directly on the descriptors from the CEFR. The scale is test specific (a single scale is used for all four tasks) and can be seen in Appendix 4.

Rater selection: Minimum requirements for rater selection are set out in the Aptis guidelines. Experience in teaching and assessing at a range of levels is considered vital.

Rater training: All raters are trained using materials based on the CEFR and representing the Aptis test tasks. Trainers are examiners who have received additional training.

Rater monitoring: Raters are routinely monitored to ensure they are on level. This is done in two ways:
1. Control scripts (pre-scored by expert raters) are fed to the rater during every session (approx. 5% of all performances marked); failure to meet pre-set conditions can result in removal from the system pending additional training.
2. Data from test scoring sessions are routinely analysed to ensure that all markers are on level.

Rater consistency: No inconsistent raters remain in the system; the attrition rate is perhaps due to the constant monitoring.

Estimated SEM: 7% (writing); 7% (speaking).

Rating conditions: Raters may mark scripts in their own work environment, though they are given clear and strict instructions relating to the conduct of the assessment.

Grading and awarding: Since all final decisions are made within the system, a unique approach to dealing with SEM is used. This is described in Section 5.3.1.
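The control-script monitoring described above (roughly 5% of performances are pre-scored by expert raters and fed to each rater) can be sketched as a simple agreement check. The tolerance, the flagging rule and the scores below are assumptions for illustration; the actual pre-set conditions used for Aptis are not described in this report.

```python
# Sketch: check a rater's marks on control scripts against expert scores.
# Tolerance and the 'fail' rule are illustrative, not the Aptis conditions.
expert_scores = {"ctrl_01": 3, "ctrl_02": 4, "ctrl_03": 2, "ctrl_04": 5}
rater_scores  = {"ctrl_01": 3, "ctrl_02": 2, "ctrl_03": 2, "ctrl_04": 4}

TOLERANCE = 1          # allowed difference in bands (assumption)
MAX_MISSES = 1         # misses permitted before flagging (assumption)

misses = [
    script for script, expected in expert_scores.items()
    if abs(rater_scores[script] - expected) > TOLERANCE
]

if len(misses) > MAX_MISSES:
    print("flag rater for additional training:", misses)
else:
    print("rater within tolerance; misses:", misses)
```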
5.3.1. Using the core score to resolve boundary cases

The language knowledge score contributes to the overall CEFR level allocation in the following way.

Where a candidate achieves a score on their skills paper (in this example we are looking at speaking) that falls within 1 standard error of measurement (SEM) of a CEFR level boundary (e.g. achieving a score of 17 when the cut-score for B2 is 18), their score on the language knowledge paper is taken into consideration when deciding whether they should remain at the lower level or be upgraded to the higher level. To receive this upgrade, they should perform significantly above the average (we set this at 1 standard deviation above the worldwide mean). This system greatly increases the accuracy of the CEFR level decisions and contributes significantly to the increased reliability of the outcomes.

In the example shown in Figure 5.1, a candidate who achieves score A on the language knowledge paper, which is clearly above the review point (mean plus 1 standard deviation), will have their speaking score reviewed. If, like score C, it falls within the level review range (boundary point minus 1 SEM), then the person in this case will be awarded a B2 (rather than the lower B1). If it falls below this range (score D), then no action will be taken. If the candidate scores below the review point for language knowledge (score B), then no action is taken regarding the speaking paper score, regardless of where the speaking paper score lies in relation to the level review range.

Figure 5.1: Example of how the language knowledge score is used
[Figure: scores A and B shown against the language knowledge review point; scores C and D shown against the CEFR B2 boundary and the level review range.]
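The decision rule described in this section (and illustrated in Figure 5.1) can be written out as a short sketch. All of the numbers and parameter names are illustrative placeholders: the operational cut-scores, SEM values and the worldwide mean and standard deviation for the language knowledge paper come from the Aptis analyses, not from this example.

```python
# Sketch of the boundary-review rule in Section 5.3.1. All numeric values
# are illustrative placeholders, not operational Aptis parameters.

def resolve_level(skill_score: float,
                  knowledge_score: float,
                  cut_score: float = 18.0,        # e.g. the B2 boundary
                  sem: float = 1.5,               # SEM of the skill paper
                  knowledge_mean: float = 25.0,   # worldwide mean (illustrative)
                  knowledge_sd: float = 5.0) -> str:
    """Return 'upper' or 'lower' for a score near a CEFR boundary."""
    review_point = knowledge_mean + knowledge_sd          # mean + 1 SD
    in_review_range = (cut_score - sem) <= skill_score < cut_score

    if skill_score >= cut_score:
        return "upper"                                    # already above the boundary
    if in_review_range and knowledge_score >= review_point:
        return "upper"                                    # upgraded (scores A + C case)
    return "lower"                                        # scores B or D cases

# A candidate one point below the B2 cut, with a strong knowledge score:
print(resolve_level(skill_score=17, knowledge_score=31))   # -> upper
# Same skill score but an average knowledge score:
print(resolve_level(skill_score=17, knowledge_score=24))   # -> lower
```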
5.3.2. Conclusions from this phase

The evidence presented here, from the overview of the test and its rationale and from the trials, strongly suggests a set of test papers that are working well to offer an accurate indication of an individual’s language ability. The stability of the different papers, as shown in the tables, indicates that Aptis meets the expectations of a high stakes, international examination.

5.4. Claims

In keeping with the approach suggested in earlier linking projects (e.g. O’Sullivan, 2009), we now arrive at the stage when substantial claims regarding a test can be made.

Since Aptis is a completely new test, it was felt that the critical review stage suggested by O’Sullivan (2009) would not be needed, as the procedures outlined in this technical report act as a validation of the test. Of course, it would be naïve to think that the validation process ends with this report. On the contrary, this marks the formal beginning of the whole process.

In recognition of the fact that test validation is an ongoing, long-term process, the British Council has undertaken two valuable initiatives. The first of these is the creation of the British Council Assessment Research Awards and Grants (the first of which were confirmed in early 2013). This initiative is designed to gather a broad range of validity evidence from external researchers across the world and is expected to contribute greatly to the test (in much the same way as the IELTS Joint Funded Research Scheme has for that test). The initiative is also designed to support young researchers with a series of small awards to help them complete work important to their careers. The other initiative is the revitalisation of the British Council’s in-house research expertise. It is planned that this combination of internal and external research will add significantly to the validity evidence in support of various uses of Aptis in the coming years.

In order to lend support to claims of a link between the CEFR and the Aptis boundary points, we first completed the specification forms as suggested in the Council of Europe’s Manual (2009). The evidence emerging from this activity supported the progression to the next phase, that of formal standard setting. The report of the standard-setting events presented here offers strong vindication of the veracity of the British Council’s CEFR-related claims. Finally, the validation stage demonstrated the test’s accuracy and validity (in terms of content coverage and relevance; appropriacy of cognitive challenge; delivery and linguistic parameters; and scoring system).

All of this evidence combines to support both the validity of the test for use as a measure of general proficiency and the accuracy and appropriacy of the claimed links to the CEFR.
The Aptis development project marked a new era for the British Council, even though it had been involved in a number of test development projects in the past, most notably ELTS (later IELTS). The decision was taken at an early stage in the project that the test should reflect best practice in the area of language testing and also ‘fit’ with the British Council’s ambitions in the area of assessment literacy. These ambitions relate to the aim of offering world-class advice and consultancy to the many governments, institutions and corporations it works with across the globe. To make the most of the opportunities offered to the British Council itself and to its many partners in the UK and beyond, a wide-ranging assessment literacy agenda has been envisaged in which all British Council staff will be given the opportunity to learn about assessment. In addition, the plan is to pass on this knowledge and expertise to clients so that they can begin to make more informed decisions when it comes to assessment.

Aptis was developed as a low to medium stakes test to be used by large institutions such as education ministries, recruitment agencies and corporations in a variety of situations where an accurate, though affordable, estimation of the language levels of their employees or prospective employees was required.

The decision to undertake a formal CEFR linking project, normally the domain of high stakes tests, reflected a will to continue to push the boundaries of language testing.

The success of the project, as presented in this report, should not be taken as an end in itself. As already indicated, the British Council is committed to a long-term exploration of issues around the validation of Aptis and any future tests it is involved with.

6.1. Summary of the main findings

The project findings can be summarised as follows:
1. The Aptis papers offer a broad measure of ability across the different skills, as well as the key area of knowledge of the system of the language.
2. The Aptis test papers are robust in terms of quality of content and accuracy and consistency of decisions.
3. The CEFR boundary points suggested are robust and accurate.

6.2. Limitations

As with any project of this nature, there are limitations to this project. Pressure of time means that ongoing work to further support the psychometric qualities of the test cannot be included in this report, although this evidence will be made public in a future technical report.

6.3. Concluding comments

The project reported here was designed to offer evidence of the validity of claims of a link between the boundary points across the various Aptis skills papers and the CEFR. The fact that the project has provided evidence in support of these claims is of great importance to the British Council and the end-users of the test.

The development of Aptis and the completion of this project mark a significant beginning for the British Council in high quality test development and validation.
REFERENCES
Abad Florescano, A., O’Sullivan, B., Sanchez Chavez, C., Ryan, D. E., Zamora Lara, E., Santana
Martinez, L. A., Gonzalez Macias, M. I., Maxwell Hart, M., Grounds, P. E., Reidy Ryan, P., Dunne, R.
A. and Romero Barradas, T. de E. (2011). Developing affordable ‘local’ tests: the EXAVER project. In Barry
O’Sullivan (ed), Language Testing: Theory & Practice (pp. 228-243). Oxford: Palgrave Macmillan.
Alderson, J. C., Clapham, C. & Wall, D. (1995). Language Test Construction and Evaluation. Cambridge:
Cambridge University Press.
Berk, R. A. (1986). A consumer’s guide to setting performance standards on criterion referenced tests.
Review of Educational Research, 56, pp. 137-172.
Berry, V. (2007). Personality Differences and Oral Test Performance. Frankfurt: Peter Lang.
Cizek, G. J. & Bunch, M. B. (2007). Standard Setting. Thousand Oaks, CA: Sage.
Council of Europe. (2001). Common European Framework of Reference for Languages: Learning, teaching,
assessment. Cambridge: Cambridge University Press.
Council of Europe. (2009). Relating Language Examinations to the Common European Framework of
Reference for Languages: Learning, teaching, assessment: Manual. Strasbourg: Council of Europe, Language
Policy Division.
Council of Europe. (Undated). Relating language examinations to the Common European Framework of
Reference for Languages: learning, teaching, assessment – writing samples. Retrieved January 10, 2013 from
http://www.coe.int/t/dg4/education/elp/elp-reg/Source/Key_reference/exampleswriting_EN.pdf
Ginther, A. (2001). Effects of the Presence and Absence of Visuals on Performance on TOEFL CBT Listening-
Comprehension Stimuli. TOEFL Research Report 66. Princeton: Educational Testing Service.
Kane, M. T. (1992). An argument-based approach to validity, Psychological Bulletin 112 (3), pp. 527–535.
Kantarcıoğlu, E. (2012). A Case-Study of the Process of Linking an Institutional English Language Proficiency
Test (COPE) for Access to University Study in the Medium Of English to the Common European Framework for
Languages: Learning, Teaching and Assessment. (Unpublished PhD thesis.) University of Roehampton, London.
Kantarcıoğlu, E., Thomas, C., O’Dwyer, J. and O’Sullivan, B. (2010). The COPE linking project: a case
study. In Waldemar Martyniuk (ed.) Aligning Tests with the CEFR: Case studies and reflections on the use of the
Council of Europe’s Draft Manual (pp. 102-118). Cambridge: Cambridge University Press.
Khalifa, H. & Weir, C., J. (2009). Examining Reading. Cambridge: Cambridge University Press.
Maghera, D. & Rutherford, K. (2013). Flexibility and Accessibility in a large-scale test of English proficiency.
Paper presented at the 3rd International Teacher Educator Conference, Hyderabad, India.
Messick, S. (1975). The standard problem: Meaning and values in measurement and evaluation. American
Psychologist, 30, pp. 955-966.
Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35, pp. 1012–1027.
Messick, S. (1989). Validity. In R. L. Linn (ed.), Educational Measurement (3rd edition). New York: Macmillan.
Mislevy R. J., Steinberg, L. S. & Almond, R. G. (2003). On the structure of educational assessments,
Measurement: Interdisciplinary Research and Perspectives 1 (1), pp. 3-62.
Mislevy, R. J., Steinberg, L. S. & Almond, R. G. (2002). Design and analysis in task-based language
assessment, Language Testing 19 (4), pp. 477–496.
O’Sullivan, B. (2002). Learner Acquaintanceship and Oral Proficiency Test Pair-Task Performance.
Language Testing, 19 (3), pp. 277-295.
O’Sullivan, B. (2005). Levels Specification Project Report. (Internal report.) Zayed University,
United Arab Emirates.
O’Sullivan, B. (2008). Modelling Performance in Tests of Spoken Language. Frankfurt: Peter Lang.
O’Sullivan, B. (2009). City & Guilds Communicator IESOL Examination (B2) CEFR Linking Project.
London: City & Guilds.
O’Sullivan, B. (2011). Language Testing. In James Simpson (ed) Routledge Handbook of Applied Linguistics
(pp. 259-273). Oxford: Routledge.
O’Sullivan, B. (2015). Aptis Test Development Approach. Aptis Technical Report TR/2015/003: British Council.
O’Sullivan, B. & Weir, C. (2011). Language Testing and Validation. In Barry O’Sullivan (ed) Language Testing:
Theory & Practice (pp. 13-32). Oxford: Palgrave Macmillan.
Porter, D. & O’Sullivan, B. (1999). The effect of audience age on measured written performance.
System, 27, pp. 65–77.
QALSPELL. (2004). Quality Assurance in Language for Specific Purposes, Estonia, Latvia, Lithuania. Leonardo da
Vinci funded project. Website accessed June 8, 2008: http://www.qalspell.ttu.ee/
Swain, M. (1984). Teaching and testing communicatively. TESL Talk, 15 (1 & 2), pp. 7-18.
Unsworth, L. (2001). Teaching multiliteracies across the curriculum: Changing contexts of text and image in
classroom practice. Buckingham: Open University Press.
Wagner, E. (2008). Video listening tests: What are they measuring? Language Assessment Quarterly,
5, pp. 218-243.
Weir, C. J. (2005). Language Testing and Validation: an evidence-based approach. Oxford: Palgrave.
Wu, J. R. W. & Wu, R. Y. F. (2010). Relating the GEPT Reading Comprehension Test to the CEFR. In Waldemar
Martyniuk (ed.) Aligning Tests with the CEFR: Case studies and reflections on the use of the Council of Europe’s
Draft Manual (pp. 204-224). Cambridge: Cambridge University Press.
APPENDIX 1:
COMPLETED SPECIFICATION FORMS
CEFR DRAFT LINKING MANUAL SPECIFICATION FORMS FOR APTIS
Completed September - December 2013
1. General Information
Name of examination: Aptis
Language tested: English
Examining institution: British Council
Versions analysed (date): August 2013
Type of examination: [x] International  [ ] National  [ ] Regional  [ ] Institutional
Purpose: To test general proficiency in English – four skills plus a language knowledge paper
Target population: [ ] Lower Sec  [x] Upper Sec  [x] Uni/College Students  [x] Adult
No. of test takers per year: New test
4. What is/are the principal domain(s)?
[x] Public
[x] Personal
[ ] Occupational
[x] Educational
5. Which communicative activities are tested?
[x] 1 Listening comprehension
[x] 2 Reading comprehension
[ ] 3 Spoken interaction
[x] 4 Written interaction
[x] 5 Spoken production
[x] 6 Written production
[ ] 7 Integrated skills
[ ] 8 Spoken mediation of text
[ ] 9 Written mediation of text
[x] 10 Language usage
[x] 11 Other (specify): Written Interaction
Name of subtest(s) and duration:
1. Core (Grammar & Vocabulary) – 25 minutes
2. Reading – 40 minutes
3. Listening – 25-50 minutes
4. Writing – 40 minutes
5. Speaking – 15 minutes
6. What is the weighting of the different subtests in the global result?
Aptis is designed to offer a profile of language ability, with no ‘overall’ CEFR level reported. Where a client requires an overall grade, the ratio of importance of the skills is first agreed; this ratio is then used as the basis of any calculation (illustrated in the sketch below).
7. Describe briefly the structure of each subtest.
Core: Grammar – focus on grammatical form (including discourse usage) using MCQ items. Vocabulary – focus on word definition, usage, synonyms, collocations.
Listening: 1. Listening for detail (pragmatic competence) – MCQ items. 2. Listening for overall meaning – MCQ items. 3. Listening for detail – note taking. 4. Listening for detail – MCQ items.
Reading: 1. Careful local reading – MCQ cloze items. 2. Careful global reading – re-building text. 3. Global meaning – gapped text/matching. 4. Global reading – overall meaning.
Writing: 1. Form filling. 2. Short extended guided writing (personal information). 3. Interactive writing (social media), semi-guided. 4. Extended writing – informal and formal text.
Speaking: 1. Personal information questions. 2. Short answer non-personal questions (picture prompt). 3. Describe and compare questions (2 picture prompts). 4. Extended output based on prompt.
8. What type(s) of responses are required? (Subtests used in)
[x] Multiple-choice – Cg, L, R
[ ] True/False
[x] Matching – Cv
[x] Ordering – R2
[x] Gap fill sentence – Cv
[x] Sentence completion – Cv, Cg
[x] Gapped text / cloze, selected response – R1
[x] Open gapped text / cloze – R3
[x] Short answer to open question(s) – W1
[x] Extended answer (text) – W2, W3, W4
[ ] Interaction with examiner
[ ] Interaction with peers
[x] Other (short answer – spoken) – S1, S2
10. Where is this accessible?
[x] On the website
[ ] From bookshops
[x] In test centres
[x] On request from the institution
[ ] Other
11. What is reported?
[ ] Global grade  [ ] Global grade plus graphic profile
[x] Grade per subtest (scale 0-50)  [ ] Profile per subtest
[x] CEFR Profile
1. What organisation decided that the examination was required?
[x] Own organisation/school
[ ] A cultural institute
[ ] Ministry of Education
[ ] Ministry of Justice
[ ] Other: specify: ____________________________
5. Who writes the items or develops the test tasks?
A specially trained team of experienced British Council teachers (up to 250 hours of training in assessment and item design and writing).
11. If yes, how?
With a population of over 100 candidates who have been identified by centres as being at the appropriate level.
15. Are different aspects of validity estimated?
[x] Test taker related
[x] Test task related
[x] Scoring system related
Face validity – during piloting: questionnaires to teachers in examination centres.
16. If yes, describe how.
By a team of trained experts during the development process; analysis as part of routine quality assurance procedures.
1. How are the test tasks marked?
For receptive test tasks:
[x] Optical mark reader (coming in 2013)
[x] Clerical marking
2. Where are the test tasks marked?
[x] Centrally (on computer versions)
[x] Locally:
  [x] By local teams
  [x] By individual examiners
5. Describe the specifications of the rating criteria of productive and/or integrative test tasks.
N/A
6. Are productive or integrated test tasks single or double rated?
N/A
1. How are the test tasks marked?
For receptive test tasks:
[x] Optical mark reader (coming in 2013)
[x] Clerical marking
2. Where are the test tasks marked?
[x] Centrally (on computer versions)
[x] Locally:
  [x] By local teams
  [x] By individual examiners
5. Describe the specifications of the rating criteria of productive and/or integrative test tasks.
N/A
6. Are productive or integrated test tasks single or double rated?
N/A
1. How are the test tasks marked?
For receptive test tasks (editing):
[ ] Optical mark reader
[ ] Clerical marking
For productive or integrated test tasks:
[x] Trained examiners
[ ] Teachers
5. Describe the specifications of the rating criteria of productive and/or integrative test tasks.
[x] One holistic score for each task
[ ] Marks for different aspects for each task
[ ] Rating scale for overall performance in test
[ ] Rating grid for aspects of test performance
[x] Rating scale for each task
[ ] Rating grid for aspects for each task
[ ] Rating scale bands are defined, but not to CEFR
[x] Rating scale bands are defined in relation to CEFR
6. Are productive or integrated test tasks single or double rated?
[ ] Single rater
[ ] Two simultaneous raters
[x] Double marking of scripts (random)
[x] Other: specify: Approx. 5% of all scripts are pre-marked by multiple expert raters. Failure to mark to standard results in withdrawal until additional training is passed.
7. If double rated, what procedures are used when differences between raters occur?
[x] Use of third rater and that score holds
[ ] Use of third marker and two closest marks used
[ ] Average of two marks
[ ] Two markers discuss and reach agreement
8. Is inter-rater agreement calculated?
[x] Yes – inter-rater reliability is currently estimated with Spearman’s rho and multi-faceted Rasch analysis (a minimal sketch of the Spearman calculation follows below)
[ ] No
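As a hedged illustration of the first of these statistics, the sketch below computes Spearman’s rho for two invented sets of examiner scores on the same scripts; it is not Aptis code and assumes no tied scores, which keeps the rank-difference formula exact.

```python
# Minimal sketch: Spearman's rho as an inter-rater agreement check.
# The two rating vectors are invented and stand in for two examiners'
# scores on the same set of scripts (no ties assumed).

def spearman_rho(x, y):
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

rater_a = [4, 2, 5, 3, 1, 6]   # hypothetical scores from examiner A
rater_b = [3, 1, 6, 4, 2, 5]   # hypothetical scores from examiner B
print(round(spearman_rho(rater_a, rater_b), 2))  # 0.83
```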
1. How are the test tasks marked?
For receptive test tasks (editing):
[ ] Optical mark reader
[ ] Clerical marking
For productive or integrated test tasks:
[x] Trained examiners
[ ] Teachers
5. Describe the specifications of the rating criteria of productive and/or integrative test tasks.
[x] One holistic score for each task
[ ] Marks for different aspects for each task
[ ] Rating scale for overall performance in test
[ ] Rating grid for aspects of test performance
[x] Rating scale for each task
[ ] Rating grid for aspects for each task
[ ] Rating scale bands are defined, but not to CEFR
[x] Rating scale bands are defined in relation to CEFR
6. Are productive or integrated test tasks single or double rated?
[ ] Single rater
[ ] Two simultaneous raters
[x] Double marking of scripts (random)
[x] Other: specify: Approx. 5% of all scripts are pre-marked by multiple expert raters. Failure to mark to standard results in withdrawal until additional training is passed.
7. If double rated, what procedures are used when differences between raters occur?
[x] Use of third rater and that score holds
[ ] Use of third marker and two closest marks used
[ ] Average of two marks
[ ] Two markers discuss and reach agreement
8. Is inter-rater agreement calculated?
[x] Yes – inter-rater reliability is currently estimated with Spearman’s rho and multi-faceted Rasch analysis
[ ] No
2. Describe the procedures used to establish pass marks and/or grades and cut-scores.
The boundaries are set using a modified Angoff standard-setting procedure, described in full in the standard-setting section of this report (a minimal worked sketch of the basic calculation follows below).
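Purely to make the basic Angoff logic concrete, the sketch below shows the core calculation with invented judgement values: each panellist estimates, item by item, the probability that a just-at-the-boundary candidate would answer correctly, and the cut score is the mean of the panellists’ summed estimates. It does not reproduce the panel data reported in this study.

```python
# Minimal worked sketch of the basic Angoff calculation. The probability
# estimates are invented and do not reproduce the panel data in this study.

judgements = {                      # per-item probability estimates, per judge
    "judge_1": [0.6, 0.8, 0.4, 0.7, 0.5],
    "judge_2": [0.5, 0.9, 0.5, 0.6, 0.6],
    "judge_3": [0.7, 0.7, 0.3, 0.8, 0.5],
}

per_judge_totals = [sum(probs) for probs in judgements.values()]
cut_score = sum(per_judge_totals) / len(per_judge_totals)
print(per_judge_totals, round(cut_score, 2))    # [3.0, 3.1, 3.0] 3.03
```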
4. If grades are given, how are the grade boundaries decided?
See item 2 in this table.
5. How is consistency in these standards maintained?
Consistency is maintained by ensuring that the parallel versions of the test are equivalent. Tests are compiled from an item bank and must reflect a specified difficulty profile (IRT-based; see the sketch below). In addition, item writers are carefully trained and follow the specifications, while a quality assurance system pre-proofs all items prior to pilot testing.
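The sketch below illustrates one way such a check against a target difficulty profile could look; the target values, tolerance and item difficulties are invented for the example and do not describe the actual Aptis compilation rules.

```python
# Minimal sketch: checking a compiled test form against a target IRT
# difficulty profile. All numbers are illustrative, not Aptis values.

TARGET_MEAN_B = 0.0      # target mean item difficulty (logits)
TARGET_SD_B = 1.2        # target spread of item difficulty
TOLERANCE = 0.15

def profile_ok(b_params):
    n = len(b_params)
    mean_b = sum(b_params) / n
    sd_b = (sum((b - mean_b) ** 2 for b in b_params) / n) ** 0.5
    return (abs(mean_b - TARGET_MEAN_B) <= TOLERANCE
            and abs(sd_b - TARGET_SD_B) <= TOLERANCE)

form_a = [-1.8, -1.0, -0.4, 0.1, 0.6, 1.2, 1.9]   # banked item difficulties
print(profile_ok(form_a))  # True for this illustrative form
```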
1. Are pass marks and/or grades given?
[ ] Pass marks
[x] CEFR levels
[x] Scale scores (0-50)
2. Describe the procedures used to establish pass marks and/or grades and cut-scores.
The boundaries are set using a modified Angoff standard-setting procedure, described in full in the standard-setting section of this report.
4. If grades are given, how are the grade boundaries decided?
See item 2 in this table.
5. How is consistency in these standards maintained?
Consistency is maintained by ensuring that the parallel versions of the test are equivalent. Tests are compiled from an item bank and must reflect a specified difficulty profile (IRT-based). In addition, item writers are carefully trained and follow the specifications, while a quality assurance system pre-proofs all items prior to pilot testing.
2. Describe the procedures used to establish pass marks and/or grades and cut-scores.
The boundaries are set using a modified Angoff standard-setting procedure, described in full in the standard-setting section of this report.
4. If grades are given, how are the grade boundaries decided?
See item 2 in this table.
5. How is consistency in these standards maintained?
Item writers are carefully trained and follow the specifications, while a quality assurance system pre-proofs all items prior to pilot testing. Test versions are routinely analysed for consistency of level using multi-faceted Rasch analysis.
2. Describe the procedures used to establish pass marks and/or grades and cut-scores.
The boundaries are set using a modified Angoff standard-setting procedure, described in full in the standard-setting section of this report.
4. If grades are given, how are the grade boundaries decided?
See item 2 in this table.
5. How is consistency in these standards maintained?
Item writers are carefully trained and follow the specifications, while a quality assurance system pre-proofs all items prior to pilot testing. Test versions are routinely analysed for consistency of level using multi-faceted Rasch analysis.
4. Is information provided to help candidates to interpret results? Give details.
Details on the report form and on the dedicated website explain the meaning of the CEFR levels used on the certificate, based on ‘Can Do’ statements.
5. Do candidates have the right to see the corrected and scored examination papers?
No
1. Is feedback gathered on the examinations?
[x] Yes, in the course of pre-testing and live testing
[ ] No
6. For which features is analysis of the gathered data carried out?
[x] Difficulty
[x] Discrimination
[x] Reliability
[x] Validity (content)
7. State which analytic methods have been used (e.g. in terms of psychometric procedures).
• Descriptive statistics – measures of central tendency and dispersion
• Classical item statistics (a minimal sketch follows after this list)
• IRT – item-level difficulty and item misfit
• Qualitative feedback (how it works/rater remarks)
• Inter-subtest correlations
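As a hedged illustration of the classical item statistics mentioned in the list above, the sketch below computes item facility (proportion correct) and an uncorrected item-total point-biserial for a small invented response matrix; it is not Aptis analysis code.

```python
# Minimal sketch: classical item statistics (facility and discrimination).
# The dichotomous response matrix is invented for illustration.

responses = [            # rows = candidates, columns = items (1 = correct)
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

def facility(item):
    col = [row[item] for row in responses]
    return sum(col) / len(col)

def point_biserial(item):
    col = [row[item] for row in responses]
    totals = [sum(row) for row in responses]
    n = len(col)
    mean_x, mean_y = sum(col) / n, sum(totals) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(col, totals)) / n
    sd_x = (sum((x - mean_x) ** 2 for x in col) / n) ** 0.5
    sd_y = (sum((y - mean_y) ** 2 for y in totals) / n) ** 0.5
    return cov / (sd_x * sd_y)

for i in range(len(responses[0])):
    print(f"item {i + 1}: facility={facility(i):.2f}, r_pb={point_biserial(i):.2f}")
```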
8. Are performances of candidates from different groups analysed? If so, describe how.
Yes – bias analysis/DIF based on the candidate data, performed during the annual test review.
9. Describe the procedures to protect the confidentiality of data.
All scripts are handled and stored within secure areas. Data are analysed using spreadsheets held on a secure network drive, and access to these data is limited.
10. Are relevant measurement concepts explained for test users? If so, describe how.
A summary of how final scores are calculated is available.
Give the rationale for the decisions that have been made in relation to the examination or the test tasks in question.
The basic underlying philosophy is flexibility and accessibility. For this reason:
• Different delivery options – computer (tablets and iPad in 2013), phone, pen & paper
• The client decides which skills to test
• Customisation of content is possible
Is there a review cycle for the examination? (How often? Who by? Procedures for revising decisions.)
There is no fixed cycle, though already (six months after launch) we are returning to the Listening paper with a view to future amendments; constant improvement is part of the philosophy of the test.
2. Which communication themes are the test takers expected to be able to handle?
As Aptis tests across the levels, different themes are found in different items. Examples include: Self and Family, House and Home, Environment, Daily Life, Free time, Entertainment, Travel, Shopping, Food and Drink, Public Services, Places, Language, Time, Numbers, Weather, Measures and Shapes.
3. Which communicative tasks, activities and strategies are the test takers expected to be able to handle?
• Listening for detail in announcements and messages
• Listening for speaker intent/mood/attitude
4. What text-types and what length of text are the test takers expected to be able to handle?
• Interpersonal dialogues and conversations
• Broadcasts
• Discussions
• Instructions and directions
• Telephone conversations
5. After reading the scale for Overall Listening Comprehension, given below, indicate and justify at which level(s) of the scale the subtest should be situated.
Levels: A1 to C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR.
• Trialling of items indicates a broad range of difficulty.
• Standard setting acts to triangulate this evidence.
C2: Has no difficulty in understanding any kind of spoken language, whether live or broadcast, delivered at fast native speed.
C1: Can understand enough to follow extended speech on abstract and complex topics beyond his/her own field, though he/she may need to confirm occasional details, especially if the accent is unfamiliar. Can recognise a wide range of idiomatic expressions and colloquialisms, appreciating register shifts. Can follow extended speech even when it is not clearly structured and when relationships are only implied and not signalled explicitly.
B2: Can understand standard spoken language, live or broadcast, on both familiar and unfamiliar topics normally encountered in personal, social, academic or vocational life. Only extreme background noise, inadequate discourse structure and/or idiomatic usage influences the ability to understand. Can understand the main ideas of propositionally and linguistically complex speech on both concrete and abstract topics delivered in a standard dialect, including technical discussions in his/her field of specialisation. Can follow extended speech and complex lines of argument provided the topic is reasonably familiar, and the direction of the talk is sign-posted by explicit markers.
B1: Can understand straightforward factual information about common everyday or job related topics, identifying both general messages and specific details, provided speech is clearly articulated in a generally familiar accent. Can understand the main points of clear standard speech on familiar matters regularly encountered in work, school, leisure etc., including short narratives.
A2: Can understand enough to be able to meet needs of a concrete type provided speech is clearly and slowly articulated. Can understand phrases and expressions related to areas of most immediate priority (e.g. very basic personal and family information, shopping, local geography, employment) provided speech is clearly and slowly articulated.
A1: Can follow speech which is very slow and carefully articulated, with long pauses for him/her to assimilate meaning.
1. In what contexts (domains, situations, …) are the test takers to show ability?
Public, personal and educational.
2. Which communication themes are the test takers expected to be able to handle?
As Aptis tests across the levels, different themes are found in different items. Examples include: Self and Family, House and Home, Environment, Daily Life, Free time, Entertainment, Travel, Shopping, Food and Drink, Public Services, Places, Language, Time, Numbers, Weather, Measures and Shapes.
3. Which communicative tasks, activities and strategies are the test takers expected to be able to handle?
These are as outlined in the British Council/EAQUALS Core Inventory.
Tasks:
• Completing texts (variety of text types) by inserting missing sentences/words into phrases
• Completing short texts
• Locating specific information
Activities:
• Reading for global comprehension
• Reading for local detail
The language user may read:
• for gist
• for specific information
• for detailed understanding
Strategies:
• Planning: framing
• Execution: identifying cues and inferring from them
• Evaluation: hypothesis testing
• Repair: revising hypotheses
4. What text-types and what length of text are the test takers expected to be able to handle?
Text types:
• Narratives
• Explanations
• Descriptions
Text length: max. 750 words (excluding items)
5. After reading the scale for Overall Reading Comprehension, given below, indicate and justify at which level(s) of the scale the subtest should be situated.
Levels: A1 to C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR.
• Trialling of items indicates a broad range of difficulty.
• Standard setting acts to triangulate this evidence.
• The test is based on extensive research during development, using Khalifa & Weir’s (2009) reading model.
C2: Can understand and interpret critically virtually all forms of the written language including abstract, structurally complex, or highly colloquial literary and non-literary writings. Can understand a wide range of long and complex texts, appreciating subtle distinctions of style and implicit as well as explicit meaning.
C1: Can understand in detail lengthy, complex texts, whether or not they relate to his/her own area of speciality, provided he/she can reread difficult sections.
B2: Can read with a large degree of independence, adapting style and speed of reading to different texts and purposes, and using appropriate reference sources selectively. Has a broad active reading vocabulary, but may experience some difficulty with low-frequency idioms.
B1: Can read straightforward factual texts on subjects related to his/her field and interest with a satisfactory level of comprehension.
A2: Can understand short, simple texts on familiar matters of a concrete type which consist of high frequency everyday or job-related language. Can understand short, simple texts containing the highest frequency vocabulary, including a proportion of shared international vocabulary items.
A1: Can understand very short, simple texts a single phrase at a time, picking up familiar names, words and basic phrases and rereading as required.
A3.2 Interaction
1. In what contexts (domains, situations, …) are the test takers to show ability?
Personal, public and educational.
2. Which communication themes are the test takers expected to be able to handle?
As Aptis tests across the levels, different themes are found in different items. Interactive writing is examined in Tasks 3 and 4 only. Examples of themes include: Environment, Entertainment, Travel, Shopping, Food and Drink, Public Services, Clubs.
3. Which communicative tasks, activities and strategies are the test takers expected to be able to handle?
Tasks (awareness of audience is the key in both tasks):
• Social media reading and responding
• Formal and informal writing on the same topic
Strategies:
• Planning
• Execution
• Evaluation
• Repair
4. What kind of texts and text-types are the test takers expected to be able to handle?
• Social media
• Email
5. After reading the scale for Overall Written Interaction, given below, indicate and justify at which level(s) of the scale the subtest should be situated.
Levels: B1 to C
Justification (incl. reference to documentation):
• The tasks, themes and foci of the input texts were drawn from the CEFR.
• Tasks designed to elicit language at B1 and above.
• Standard setting acts to triangulate this evidence.
C2: As C1.
C1: Can express him/herself with clarity and precision, relating to the addressee flexibly and effectively.
B2: Can express news and views effectively in writing, and relate to those of others.
B1: Can convey information and ideas on abstract as well as concrete topics, check information and ask about or explain problems with reasonable precision. Can write personal letters and notes asking for or conveying simple information of immediate relevance, getting across the point he/she feels to be important.
A2: Can write short, simple formulaic notes relating to matters in areas of immediate need.
• Correspondence English: page 83
A3.3 Production
1. In what contexts (domains, situations, …) are the test takers to show ability?
Home, office, place of study.
2. Which communication themes are the test takers expected to be able to handle?
As Aptis tests across the levels, different themes are found in different items. Written production is examined in Tasks 1 and 2 only. Examples of themes include: House and Home, Daily Life, Free time, Entertainment, Personal Information.
3. Which communicative tasks, activities and strategies are the test takers expected to be able to handle?
• Form filling (basic)
• Form filling (extended response)
4. What kind of texts and text-types are the test takers expected to be able to handle? (The lists in CEFR 4.6.2 and 4.6.3 might be of help as a reference.)
• Descriptive
• Narrative
• Expository
5. After reading the scale for Overall Written Production, given below, indicate and justify at which level(s) of the scale the subtest should be situated. (The subscales for written production in CEFR 4.4.1.2 listed after the scale might be of help as a reference.)
Levels: A1 to A2
Justification (incl. reference to documentation):
• The tasks, themes and foci of the input texts were drawn from the CEFR.
• Tasks designed to elicit language at level A.
• Standard setting acts to triangulate this evidence.
C2: Can write clear, smoothly flowing, complex texts in an appropriate and effective style and a logical structure which helps the reader to find significant points.
C1: Can write clear, well-structured texts of complex subjects, underlining the relevant salient issues, expanding and supporting points of view at some length with subsidiary points, reasons and relevant examples, and rounding off with an appropriate conclusion.
B2: Can write clear, detailed texts on a variety of subjects related to his/her field of interest, synthesising and evaluating information and arguments from a number of sources.
B1: Can write straightforward connected texts on a range of familiar subjects within his/her field of interest, by linking a series of shorter discrete elements into a linear sequence.
A2: Can write a series of simple phrases and sentences linked with simple connectors like “and”, “but” and “because”.
1. In what contexts (domains, situations, …) are the test takers to show ability?
Home, office, place of study.
2. Which communication themes are the test takers expected to be able to handle?
As Aptis tests across the levels, different themes are found in different items. Interactive writing is examined in Tasks 1 and 2 only. Examples of themes include: Self and Family, House and Home, Environment, Daily Life, Free time, Entertainment, Travel, Shopping, Food and Drink, Public Services, Places, Language, Time, Numbers, Weather.
3. Which communicative tasks, activities and strategies are the test takers expected to be able to handle?
• Responding to questions
• Describing
• Comparing
• Speculating
4. What kind of texts and text-types are the test takers expected to be able to handle?
• Descriptive
• Narrative
• Expository
5. After reading the scale for Overall Written Production, given below, indicate and justify at which level(s) of the scale the subtest should be situated.
Levels: A1 to C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR.
• Quality control of items indicates a broad range of difficulty.
• Standard setting acts to triangulate this evidence.
C2: Can write clear, smoothly flowing, complex texts in an appropriate and effective style and a logical structure which helps the reader to find significant points.
C1: Can write clear, well-structured texts of complex subjects, underlining the relevant salient issues, expanding and supporting points of view at some length with subsidiary points, reasons and relevant examples, and rounding off with an appropriate conclusion.
B2: Can write clear, detailed texts on a variety of subjects related to his/her field of interest, synthesising and evaluating information and arguments from a number of sources.
B1: Can write straightforward connected texts on a range of familiar subjects within his/her field of interest, by linking a series of shorter discrete elements into a linear sequence.
A2: Can write a series of simple phrases and sentences linked with simple connectors like “and”, “but” and “because”.
A4.1 Reception
Those CEFR scales most relevant to Receptive skills have been used to create Table A3, which can be referred
to in this section. Table A3 does not include any descriptors for “plus levels”. The original scales consulted,
some of which do define plus levels, include:
Linguistic Competence
• General Linguistic Range English: page 110
• Vocabulary Range English: page 112
Socio-linguistic Competence
• Socio-linguistic Appropriateness English: page 122
Pragmatic Competence
• Thematic Development English: page 125
• Cohesion and Coherence English: page 125
• Propositional Precision English: page 129
Strategic Competence
• Identifying Cues/Inferring English: page 72
1. What is the range of lexical and grammatical competence that the test takers are expected to be able to handle?
This is clearly set out in the British Council/EAQUALS Core Inventory.
2. After reading the scale for Linguistic Competence in Table A3, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of linguistic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
• See Table A3.
3. What are the socio-linguistic competences that the test takers are expected to be able to handle: linguistic markers, politeness conventions, register, adequacy, dialect/accent, etc.?
These are clearly set out in the British Council/EAQUALS Core Inventory.
4. After reading the scale for Socio-linguistic Competence in Table A3, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of linguistic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
• See Table A3.
TABLE A3: RELEVANT QUALITATIVE FACTORS FOR RECEPTION (appropriate cells are shaded in the original)
Columns: LINGUISTIC (edited from General Linguistic Range; Vocabulary Range); SOCIO-LINGUISTIC (edited from Socio-linguistic Appropriateness); PRAGMATIC (edited from Thematic Development and Propositional Precision); STRATEGIC (Identifying Cues and Inferring).

C2
• Linguistic: Can understand a very wide range of language precisely, appreciating emphasis and differentiation. No signs of comprehension problems. Has a good command of a very broad lexical repertoire including idiomatic expressions and colloquialisms; shows awareness of connotative levels of meaning.
• Socio-linguistic: Has a good command of idiomatic expressions and colloquialisms with awareness of connotative levels of meaning. Appreciates fully the socio-linguistic and sociocultural implications of language used by native speakers and can react accordingly.
• Pragmatic: Can understand precisely finer shades of meaning conveyed by a wide range of qualifying devices (e.g. adverbs expressing degree, clauses expressing limitations). Can understand emphasis and differentiation without ambiguity.
• Strategic: As C1.

C1
• Linguistic: Has a good command of a broad lexical repertoire. Good command of idiomatic expressions and colloquialisms.
• Socio-linguistic: Can recognise a wide range of idiomatic expressions and colloquialisms, appreciating register shifts; may, however, need to confirm occasional details, especially if the accent is unfamiliar. Can follow films employing a considerable degree of slang and idiomatic usage. Can understand language effectively for social purposes, including emotional, allusive and joking usage.
• Pragmatic: Can understand elaborate descriptions and narratives, recognising sub-themes, points of emphasis. Can understand precisely the qualifications in opinions and statements that relate to degrees of, for example, certainty/uncertainty, belief/doubt, likelihood etc.
• Strategic: Is skilled at using contextual, grammatical and lexical cues to infer attitude, mood and intentions and anticipate what will come next.

B2
• Linguistic: Has a sufficient range of language to be able to understand descriptions, viewpoints and arguments on most topics pertinent to his everyday life such as family, hobbies and interests, work, travel, and current events.
• Socio-linguistic: Can with some effort keep up with fast and colloquial discussions.
• Pragmatic: Can understand description or narrative, identifying main points from relevant supporting detail and examples. Can understand detailed information reliably.
• Strategic: Can use a variety of strategies to achieve comprehension, including listening for main points; checking comprehension by using contextual clues.

B1
• Linguistic: Has enough language to get by, with sufficient vocabulary to understand most texts on topics such as family, hobbies and interests, work, travel, and current events.
• Socio-linguistic: Can respond to a wide range of language functions, using their most common exponents in a neutral register. Can recognise salient politeness conventions. Is aware of, and looks out for signs of, the most significant differences between the customs, usages, attitudes, values and beliefs prevalent in the community concerned and those of his or her own.
• Pragmatic: Can reasonably accurately understand a straightforward narrative or description that is a linear sequence of points.
• Strategic: Can identify unfamiliar words from the context on topics related to his/her field and interests. Can extrapolate the meaning of occasional unknown words from the context and deduce sentence meaning provided the topic discussed is familiar.

A2
• Linguistic: Has a sufficient vocabulary for coping with everyday situations with predictable content and simple survival needs.
• Socio-linguistic: Can handle very short social exchanges, using everyday polite forms of greeting and address. Can make and respond to invitations, apologies etc.
• Pragmatic: Can understand a simple story or description that is a list of points. Can understand a simple and direct exchange of limited information on familiar and routine matters.
• Strategic: Can use an idea of the overall meaning of short texts and utterances on everyday topics of a concrete type to derive the probable meaning of unknown words from the context.

A1
• Linguistic: Has a very basic range of simple expressions about personal details and needs of a concrete type.
• Socio-linguistic: Can understand the simplest everyday polite forms of: greetings and farewells; introductions; saying please, thank you, sorry etc.
• Pragmatic: No descriptor available.
• Strategic: No descriptor available.
5. What are the pragmatic competences that the test takers are expected to be able to handle: discourse competences, functional competences?
This is clearly set out in the British Council/EAQUALS Core Inventory.
6. After reading the scale for Pragmatic Competence in Table A3, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of linguistic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
7. What are the strategic competences that the test takers are expected to be able to handle?
These are clearly set out in the British Council/EAQUALS Core Inventory.
8. After reading the scale for Strategic Competence in Table A3, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of linguistic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
A4.2 Interaction
Those CEFR scales most relevant to Interaction have been used to create Table A4 which can be referred to in
this section. Table A4 does not include any descriptors for “plus levels”. The original scales consulted, some of
which do define plus levels, include:
Linguistic Competence
• General Linguistic Range English: page 110
• Vocabulary Range English: page 112
• Vocabulary Control English: page 112
• Grammatical Accuracy English: page 114
Socio-linguistic Competence
• Socio-linguistic Appropriateness English: page 122
Pragmatic Competence
• Flexibility English: page 124
• Turntaking English: page 124
• Spoken Fluency English: page 129
• Propositional Precision English: page 129
Strategic Competence
• Turntaking (repeated) English: page 86
• Cooperating English: page 86
• Asking for Clarification English: page 87
• Compensating English: page 64
• Monitoring and Repair English: page 65
1. What is the range of lexical and grammatical competence that the test takers are expected to be able to handle?
This is clearly set out in the British Council/EAQUALS Core Inventory.
2. What is the range of phonological and orthographic competence that the test takers are expected to be able to handle?
This is clearly set out in the British Council/EAQUALS Core Inventory.
3. After reading the scales for Range and Accuracy in Table A4, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of linguistic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
• See Table A4.
Socio-linguistic Competence
4. What are the socio-linguistic competences that the test takers are expected to be able to handle: linguistic markers, politeness conventions, register, adequacy, dialect/accent, etc.?
This is clearly set out in the British Council/EAQUALS Core Inventory.
5. After reading the scale for Socio-linguistic Competence in Table A4, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of sociolinguistic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
• See Table A4.
6. What are the pragmatic competences that the test takers are expected to be able to handle: discourse competences, functional competences?
This is clearly set out in the British Council/EAQUALS Core Inventory.
7. After reading the scale for Fluency in Table A4, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of pragmatic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
• See Table A4.
8. What are the interaction strategies that the test takers are expected to be able to handle? (The discussion in CEFR 4.4.3.5 might be of help as a reference.)
This is clearly set out in the British Council/EAQUALS Core Inventory.
9. After reading the scale for Interaction in Table A4, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of strategic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
• See Table A4.
A4.3 Production
Those CEFR scales most relevant to Production have been used to create Table A5, which can be referred to in
this section. Table A5 does not include any descriptors for “plus levels”. The original scales consulted, some of
which do define plus levels, include:
Linguistic Competence
• General Linguistic Range English: page 110
• Vocabulary Range English: page 112
• Vocabulary Control English: page 112
• Grammatical Accuracy English: page 114
Socio-linguistic Competence
• Socio-linguistic Appropriateness English: page 122
Pragmatic Competence
• Flexibility English: page 124
• Thematic Development English: page 125
• Cohesion and Coherence English: page 125
• Spoken Fluency English: page 129
• Propositional Precision English: page 129
Strategic Competence
• Planning English: page 64
• Compensating English: page 64
• Monitoring and Repair English: page 65
1. What is the range of lexical and grammatical competence that the test takers are expected to be able to handle?
This is clearly set out in the British Council/EAQUALS Core Inventory.
2. What is the range of phonological and orthographic competence that the test takers are expected to be able to handle?
This is clearly set out in the British Council/EAQUALS Core Inventory.
3. After reading the scales for Range and Accuracy in Table A5, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of strategic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
• See Table A5.
4. What are the socio-linguistic competences that the test takers are expected to be able to handle: linguistic markers, politeness conventions, register, adequacy, dialect/accent, etc.?
This is clearly set out in the British Council/EAQUALS Core Inventory.
6. What are the pragmatic competences that the test takers are expected to be able to handle: discourse competences, functional competences? (The lists in CEFR 5.2.3 might be of help as a reference.)
This is clearly set out in the British Council/EAQUALS Core Inventory.
7. After reading the scale for Pragmatic Competence in Table A5, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of strategic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
• See Table A5.
8. What are the production strategies that the test takers are expected to be able to handle? (The discussion in CEFR 4.4.1.3 might be of help as a reference.)
This is clearly set out in the British Council/EAQUALS Core Inventory.
9. After reading the scale for Strategic Competence in Table A5, indicate and justify at which level(s) of the scale the examination should be situated.
Levels: A1-C
Justification (incl. reference to documentation):
• The items, themes and foci of the input texts were drawn from the CEFR, while the areas of sociolinguistic competence are based on a Core Inventory which itself is very much driven by the CEFR.
• Quality control of items ensures that the item and task writers continue to meet the expectations of the specifications.
• The specifications are created in such a way as to encourage interaction between the item writers and the quality assurance team.
• Standard setting acts to triangulate this evidence.
• See Table A5.
TABLE A4: RELEVANT QUALITATIVE FACTORS FOR INTERACTION
Columns: Range; Accuracy; Socio-linguistic; Fluency; Interaction.

C2
• Range: Shows great flexibility reformulating ideas in differing linguistic forms to convey finer shades of meaning precisely, to give emphasis, to differentiate and to eliminate ambiguity. Also has a good command of idiomatic expressions and colloquialisms.
• Accuracy: Maintains consistent grammatical control of complex language, even while attention is otherwise engaged (e.g. in forward planning, in monitoring others’ reactions).
• Socio-linguistic: Appreciates fully the socio-linguistic and sociocultural implications of language used by speakers and can react accordingly. Can mediate effectively between speakers of the target language and that of his/her community of origin taking account of sociocultural and socio-linguistic differences.
• Fluency: Can express him/herself spontaneously at length with a natural colloquial flow, avoiding or backtracking around any difficulty so smoothly that the interlocutor is hardly aware of it.
• Interaction: Can interact with ease and skill, picking up and using non-verbal and intonational cues apparently effortlessly. Can interweave his/her contribution into the joint discourse with fully natural turntaking, referencing, allusion making etc.

C1
• Range: Has a good command of a broad range of language allowing him/her to select a formulation to express him/herself clearly in an appropriate style on a wide range of general, academic, professional or leisure topics without having to restrict what he/she wants to say.
• Accuracy: Consistently maintains a high degree of grammatical accuracy; errors are rare, difficult to spot and generally corrected when they do occur.
• Socio-linguistic: Can use language flexibly and effectively for social purposes, including emotional, allusive and joking usage.
• Fluency: Can express him/herself fluently and spontaneously, almost effortlessly. Only a conceptually difficult subject can hinder a natural, smooth flow of language.
• Interaction: Can select a suitable phrase from a readily available range of discourse functions to preface his remarks in order to get or to keep the floor and to relate his/her own contributions skilfully to those of other speakers.

B2
• Range: Has a sufficient range of language to be able to give clear descriptions, express viewpoints on most general topics, without much conspicuous searching for words, using some complex sentence forms to do so.
• Accuracy: Shows a relatively high degree of grammatical control. Does not make errors which cause misunderstanding, and can correct most of his/her mistakes.
• Socio-linguistic: Can with some effort keep up with and contribute to group discussions even when speech is fast and colloquial. Can sustain relationships with native speakers without unintentionally amusing or irritating them or requiring them to behave other than they would with a native speaker.
• Fluency: Can adjust to the changes of direction, style and emphasis normally found in conversation. Can produce stretches of language with a fairly even tempo; although he/she can be hesitant as he or she searches for patterns and expressions, there are few noticeably long pauses.
• Interaction: Can initiate discourse, take his/her turn when appropriate and end conversation when he/she needs to, though he/she may not always do this elegantly. Can help the discussion along on familiar ground confirming comprehension, inviting others in, etc.

B1
• Range: Has enough language to get by, with sufficient vocabulary to express him/herself with some hesitation and circumlocutions on topics such as family, hobbies and interests, work, travel, and current events.
• Accuracy: Uses reasonably accurately a repertoire of frequently used “routines” and patterns associated with more predictable situations.
• Socio-linguistic: Can perform and respond to basic language functions, such as information exchange and requests and express opinions and attitudes in a simple way. Is aware of the salient politeness conventions and acts appropriately.
• Fluency: Can exploit a wide range of simple language flexibly to express much of what he/she wants. Can keep going comprehensibly, even though pausing for grammatical and lexical planning and repair is very evident, especially in longer stretches of free production.
• Interaction: Can initiate, maintain and close simple face-to-face conversation on topics that are familiar or of personal interest. Can repeat back part of what someone has said to confirm mutual understanding.
A2
• Range: Uses basic sentence patterns with memorised phrases, groups of a few words and formulae in order to communicate limited information in simple everyday situations.
• Accuracy: Uses some simple structures correctly, but still systematically makes basic mistakes.
• Socio-linguistic: Can handle very short social exchanges, using everyday polite forms of greeting and address. Can make and respond to invitations, apologies etc.
• Fluency: Can make him/herself understood in very short utterances, even though pauses, false starts and reformulation are very evident. Can expand learned phrases through simple recombinations of their elements.
• Interaction: Can indicate when he/she is following but is rarely able to understand enough to keep conversation going of his/her own accord. Can ask for attention.

A1
• Range: Has a very basic repertoire of words and simple phrases related to personal details and particular concrete situations.
• Accuracy: Shows only limited grammatical control of a few simple grammatical structures and sentence patterns in a memorised repertoire.
• Socio-linguistic: Can establish basic social contact by using the simplest everyday polite forms of: greetings and farewells; introductions; saying please, thank you, sorry etc.
• Fluency: Can manage very short, isolated, mainly pre-packaged utterances, with much pausing to search for expressions, to articulate less familiar words, and to repair communication.
• Interaction: Can interact in a simple way but communication is totally dependent on repetition, rephrasing and repair.
TABLE A5: RELEVANT QUALITATIVE FACTORS FOR PRODUCTION
Columns: Range; Accuracy; Socio-linguistic; Fluency; Coherence and thematic development; Strategic (compensation and repair).

C2
• Range: Shows great flexibility reformulating ideas in differing linguistic forms to convey finer shades of meaning precisely, to give emphasis, to differentiate and to eliminate ambiguity. Also has a good command of idiomatic expressions and colloquialisms.
• Accuracy: Maintains consistent grammatical control of complex language, even while attention is otherwise engaged (e.g. in forward planning, in monitoring others’ reactions).
• Socio-linguistic: Appreciates fully the socio-linguistic and sociocultural implications of language used by speakers and can react accordingly.
• Fluency: Can express him/herself spontaneously at length with a natural colloquial flow, avoiding or backtracking around any difficulty so smoothly that the interlocutor is hardly aware of it.
• Coherence: Can create coherent and cohesive discourse making full and appropriate use of a variety of organisational patterns and a wide range of connectors and other cohesive devices.
• Strategic: Can substitute an equivalent term for a word he/she can’t recall so smoothly that it is scarcely noticeable.

C1
• Range: Has a good command of a broad range of language allowing him/her to select a formulation to express him/herself clearly in an appropriate style on a wide range of general, academic, professional or leisure topics without having to restrict what he/she wants to say.
• Accuracy: Consistently maintains a high degree of grammatical accuracy; errors are rare, difficult to spot and generally corrected when they do occur.
• Socio-linguistic: Can use language flexibly and effectively for social purposes, including emotional, allusive and joking usage.
• Fluency: Can express him/herself fluently and spontaneously, almost effortlessly. Only a conceptually difficult subject can hinder a natural, smooth flow of language.
• Coherence: Can produce clear, smoothly flowing, well-structured speech, showing controlled use of organisational patterns, connectors and cohesive devices. Can give elaborate descriptions and narratives, integrating sub-themes, developing particular points and rounding off with an appropriate conclusion.
• Strategic: Can backtrack when he/she encounters a difficulty and reformulate what he/she wants to say without fully interrupting the flow of speech.
B2
Range: Has a sufficient range of language to be able to give clear descriptions, express viewpoints on most general topics, without much conspicuous searching for words, using some complex sentence forms to do so.
Accuracy: Shows a relatively high degree of grammatical control. Does not make errors which cause misunderstanding, and can correct most of his/her mistakes.
Sociolinguistic: Can express him or herself appropriately in situations and avoid crass errors of formulation.
Fluency: Can produce stretches of language with a fairly even tempo; although he/she can be hesitant as he or she searches for patterns and expressions, there are few noticeably long pauses.
Coherence: Can develop a clear description or narrative, expanding and supporting his/her main points with relevant supporting detail and examples. Can use a limited number of cohesive devices to link his/her utterances into clear, coherent discourse, though there may be some "jumpiness" in a long contribution.
Strategies: Can use circumlocution and paraphrase to cover gaps in vocabulary and structure. Can make a note of "favourite mistakes" and consciously monitor speech for it/them.

B1
Range: Has enough language to get by, with sufficient vocabulary to express him/herself with some hesitation and circumlocutions on topics such as family, hobbies and interests, work, travel, and current events.
Accuracy: Uses reasonably accurately a repertoire of frequently used "routines" and patterns associated with more predictable situations.
Sociolinguistic: No descriptor available.
Fluency: Can exploit a wide range of simple language flexibly to express much of what he/she wants. Can keep going comprehensibly, even though pausing for grammatical and lexical planning and repair is very evident, especially in longer stretches of free production.
Coherence: Can link a series of shorter, discrete simple elements in order to reasonably fluently relate a straightforward narrative or description as a linear sequence of points.
Strategies: Can use a simple word meaning something similar to the concept he/she wants to convey and invites "correction". Can start again using a different tactic when communication breaks down.

A2
Range: Uses basic sentence patterns with memorised phrases, groups of a few words and formulae in order to communicate limited information in simple everyday situations.
Accuracy: Uses some simple structures correctly, but still systematically makes basic mistakes.
Sociolinguistic: No descriptor available.
Fluency: Can make him/herself understood in very short utterances, even though pauses, false starts and reformulation are very evident. Can expand learned phrases through simple recombinations of their elements.
Coherence: Can link groups of words with simple connectors like "and", "but" and "because".
Strategies: No descriptor available.

A1
Range: Has a very basic repertoire of words and simple phrases related to personal details and particular concrete situations.
Accuracy: Shows only limited control of a few simple grammatical structures and sentence patterns in a memorised repertoire.
Sociolinguistic: No descriptor available.
Fluency: Can manage very short, isolated, mainly pre-packaged utterances, with much pausing to search for expressions, to articulate less familiar words, and to repair communication.
Coherence: Can link words or groups of words with very basic linear connectors like "and" or "then".
Strategies: No descriptor available.
Form A23: Graphic Profile of the Relationship of the Examination to CEFR Levels
[Graphic profile chart showing the relationship of the examination to CEFR levels B2, B1, A2 and A1.]
Short rationale, reference to documentation. If this form presents a different conclusion to the initial estimation
in Form A8, please comment on the principal reasons for the revised view.
The evidence presented here and in the rest of the report indicates that the test assesses language across all levels.
The specifications for the test were created with the CEFR as their basis. The specifications are constantly reviewed and reflected on
by the quality assurance team and the item writers. In addition, all items and tasks are extensively trialled before use in the test.
The standard-setting section of this report shows that there is a clear link between the various boundary points and the CEFR,
as claimed.
Finally, the validation section of the report offers evidence that the test is robust, accurate and reliable. This evidence also supports
and justifies the claim that the test is likely to function in a consistent way.
APPENDIX 2:
APTIS WRITING PAPER SCALES
Task 2 Scale
4 [A2.2] Clearly defined sentences all on topic. Mostly accurate grammar with few serious errors of vocabulary usage
(i.e. appropriateness and spelling). The text organisation is completely appropriate for the task. Attempts at textual
cohesion and accurate punctuation.
3 [A2.1] There are some serious issues with grammar and vocabulary usage. However, the meaning is still clear.
Text written in complete sentences, organised appropriately for the text form, with mostly accurate punctuation.
2 [A1.2] Numerous serious errors of grammar and vocabulary usage which make the text sometimes difficult to follow.
A series of phrases, not sentences. Poor punctuation.
1 [A1.1] There is too little language or the usage is so poor that the text is almost impossible to follow.
There is no clear structure.
Task 3 Scale
4 [B1.2] Replies fully to each piece of input. The grammar is appropriate to B1 and is mostly accurate, while there is a good
range of vocabulary on general topics. Some errors, but these don't impede communication. Cohesive and coherent
text, using an appropriate range of linguistic devices. Few if any punctuation or spelling errors.
3 [B1.1] Replies well to at least two of the input texts. An adequate range of grammar used with no major errors which impact
on understanding. There is good control of elementary vocabulary, though evidence of some major errors when
expressing unfamiliar or complex topics. Cohesive and coherent text, adequately using a range of linguistic devices.
Spelling and/or punctuation errors do not impede communication.
2 [A2.2] Replies to at least two of the input texts. Many errors which make the text sometimes difficult to follow. Narrow lexical
repertoire; here again, frequent errors make the message difficult to follow. Some effort to use connecting devices,
though not always consistent. Errors, including punctuation and spelling, make the text difficult to follow.
1 [A2.1] Does not reply to more than one input. There is little language, with such poor control as to make the text almost
impossible to follow without considerable effort. Very basic everyday vocabulary. Lacks cohesion and/or uses
linguistic devices inappropriately. Spelling and punctuation errors make the text almost impossible to follow.
Task 4 Scale
4 [B2.2] Task fulfilled in terms of appropriateness of register (i.e. two distinct registers used in the different messages written).
Evidence of clear, assured and precise use of a broad range of grammatical forms. A good command of a
broad lexicon. Good use of idiomatic expressions and no impeding errors of grammar or lexis. Few if any errors of
cohesion or coherence.
3 [B2.1] Task partially fulfilled in terms of appropriateness of register (i.e. fully appropriate register used in one of the two
different messages written). An adequate range of grammatical forms used, with no impeding errors. A good range of
lexis with a high level of accuracy. Errors do not affect the message. Cohesive and coherent text adequately using a
range of linguistic devices. Spelling and/or punctuation errors evident but these do not affect the message.
2 [B1.2] Task not fulfilled in terms of appropriateness of register (i.e. appropriate register not used in either of the two different
messages written). A relatively narrow range of grammatical forms used, with some impeding errors. The lexical range is
adequate for the description of situations relating to him/herself. Some errors which tend to make understanding
difficult. Attempts to use linguistic devices though not always consistent. Errors, including punctuation and spelling,
can make understanding difficult.
1 [B1.1] Task not fulfilled in terms of appropriateness of register (i.e. no evidence of awareness of register). A limited range of
grammatical forms and vocabulary used, and not always with sufficient accuracy. Errors may make the text difficult to
follow. Lacks systematic cohesion and/or uses linguistic devices inappropriately. Spelling and punctuation errors can
make understanding difficult.
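The bracketed CEFR labels attached to each band above amount to a simple band-to-level mapping for the three rated writing tasks. The Python sketch below is illustrative only: the dictionary contents are copied from the bracketed labels in this appendix, while the function name, variable names and overall structure are invented for the example and are not part of the operational Aptis scoring system.

# Illustrative sketch: band-to-CEFR labels copied from the Aptis writing
# scales above; names and structure are hypothetical.
WRITING_BAND_TO_CEFR = {
    "task2": {4: "A2.2", 3: "A2.1", 2: "A1.2", 1: "A1.1"},
    "task3": {4: "B1.2", 3: "B1.1", 2: "A2.2", 1: "A2.1"},
    "task4": {4: "B2.2", 3: "B2.1", 2: "B1.2", 1: "B1.1"},
}

def cefr_label(task: str, band: int) -> str:
    """Return the CEFR sub-level associated with a rater's band on one writing task."""
    return WRITING_BAND_TO_CEFR[task][band]

# A rater awarding band 3 on Task 3 is, in effect, placing the script at B1.1.
print(cefr_label("task3", 3))  # B1.1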
Student ID 93680513
I joined the club because it sounded interesting and it was exactly what I was looking for!
How can anyone dislike something about this club? It's almost impossible! The things I like more are the way we are learning and
the activities we do! The only thing I dislike it's some of the tecnical problems!
Student ID 93680511
I joined the club because there's activities i'm interested in participate and a lot of my friends recommended this club to me.
I like the friendship that we can create by joining a club like this, what i don't like are the little problems that sometimes i have to
carry on because of some things that i've done in the club.
Student ID 93680516
I like all kind of films but specially drama ones. "Slumdog Millionaire" is my favourite one.
I'm sorry but already have something to do after the Saturday's film showing...
Student ID 93680572
I join the club to improve My English to help me to emigrate to Austerlia. i want to discover more Land and area.
I like the club because it helps me to improve my Enhlish and teah me more in English. Idislike it because it
sometimes comes boring.
XXXXX
Student ID 93683074
I am going in businus trip. Have meeting with Factory pump to use it in the project i am working on it.
It is my first time to me to visit London so of course i want to see roal palce alos to visit the all the musiums in london. also i heared
that the will make a big conference for the new technolgy avilable in the pumps cntrol may i can go there.
I really want buy as much as i can to buy a souvners and gifts from London.also i will try to catch if there are good shows of
threater are there, i will try to go. may be if i have some time i will try to go to another around in Englind
Student ID 93683062
Student ID 93683094
I would like to visit, becouse i want to see different culture, meet new people and I love adventures.
I want to see every places, which are recommended in internet. I want to visit bigest city, to see history places.
I want to spend my money for transport, food, suvenirs.
Student ID 93683092
I want to go Mexico with my friends in the Summer because there are very beautiful island and beach.Second reason is I want to
meet my best friend ,because She Studies there.
First time I am going to my friend because the hotel is very expensive and my friend only lives there. I am going to visit ancient
captal and see famous building,historic town, museums, and go swimming and shopping.In the end I am going to visit are gallary
because I like painting pictures very mush. I am also going to have Mexico festival with my best friend.
I am not sure,but I will spend money on buying clothes, playing on the beach,having party,and I will buy presents about my parents.
Maybe I think,I shound eating in a restaurant.But I am going to save money.
Task 2 [A2]
Sam
Dear Sam,
I lived in a small town, although it was small but lovely.
People lived in my town are friendly and nice, they always help each other.
I think that’s the nicest part of my town. I hope you can come here.
By the way I’m not went out in evenings.
Love
Task 3 [B1]
Student ID 93680513
Hello Paul! Yes, I'm. Wow me too! I really enjoy comedies. But my favorite film isn't a comedy one. It's a horror one, called
Paranormal Activity. Have you ever seen it? It's really good and scary! I bought the DVD so I could watch it anytime!
Yes I'm! We could go to a shopping mall and see the new clothes that just arrived! But I think the boys woudn't like that!
Or we could make a long walk at the beach and see the ocean! And just relax a little.
What?! I didn't heard that! How can they do that? When I go to the cinema I really enjoy my bucket of hot popcorns! I think
they aren't going to do that, because that way they will loose money!
Student ID 93680511
Hey! I joined the club recently has you know. My favourite genre of films are the horror ones and my favourite film is Friday
the 13th. I watched it for the first time when i was a little kid and since there i love it!
What a great idea! There's an old film store in front of the cinema that sells really good films for a great price. We can go there
and check if there's something of our interest. I swear you won't be disappointed.
That's just stupid! Who doesn't love to watch a good film accompained by some really tasty bucket of popcorn? If they stop
selling them, people are you going to stay at home and watch a film because they can eat whatever they want to.
Student ID 93680516
It would really be a bad idea, i think. I go to the cinema to watch the movies but i must confess i also go there for the
delicious popcorns.
I'm sorry but already have something to do after the Saturday's film showing...
It would really be a bad idea, i think. I go to the cinema to watch the movies but i must confess i also go there for the
delicious popcorns.
Student ID 93680572
Really because i like the comedy movies also, and my favorete Film is The Mask, i watched more than 20 times also My favorte
actress is Jim Cary
I think we can go out for a restuart or having a walk around , or may be a cafeshop . OHH.. i Know a cafe name hipark i think
it is good.they sell a very good popcorn.
i think it will be very boring.
Student ID 93683074
Really I am going in a business trip , you know just for work. but i think i may stay for more time to go around Englind to see
and visit many places as much as i can. My business trip will take 1 week but iwill stay for another week.
Sure i am going to see the roal palce and to move around in London as much as i can ,if i have some free time i will try to move
around all england as much as i can.
for sure the roal family will be the same, but but what will be different the new molls that are built in south of london. this molls is
huge and large in this area. i believe i will make to much shopping.
Student ID 93683062
Just filled in a form, and submit relevant documents they required. It took me around 2 weeks to get the visa upon the date I
submitted the documents.
The first place to go is, of course, London. I am going to spend 3 days there, and then travel to Lake Area, and then Scotland,
and then back to London again.
I heard the landscape in Lake Area and Scotland is very much different from it is here. This is the main reason triggered my
trip to UK. As for London, I guess the thing most different from here would be the lifestyle. We will see’
Student ID 93683094
I just apply for visa and give them the document they want from me. The procedure for visa takes 1 month.
I have no plans. Just take the airplane and jump into adventures
I don't know. I will tell you when I come back. I think everything is different there.
Student ID 93683092
I had to apply my visa and buy plane tickets .But I found buying my cheap ticket on the Intenet for long time because it was
very expensive. Other friend bought their tickes for long time too.I took It for 2 weeks
My travel plans are meeting my best friend,going swimming and shopping and sightseeing ,visiting art gallary.In the end I am
going to go wine shops ,because my dad likes red wine very much. I think ,it is very important about visiting other one country.
I think ,their language is a littel different from English.Maybe there life, having foods,geting up, going to bed are different with
our county life.I am not sure ,Althogh they are Americans ,there psychology likes our country.
Task 4 [B2]
Student ID 93680513
Hello Mary! Have you heard the news? The main hall of our film club will be closed for painting and we have to see the films
on DVD in the lounge! The maximum of seats will be of 25 per showing! I don't think this is right, because they are a lot of
people in our club that want to watch the films! I'm really upset about this! I think they should rent a bigger room and that way
all of us would watch the films!Tell me what you think about this! XOXO Gabriela
Good afternoon, I'm a member of your film club. I heard that the main hall of the film club will be closed due to painting.
And the members will have to see the films on DVD in the lounge, with the maximum of seats being 25 per showing. I think
we should think in other solutions, like for example rent a bigger room and we all could fit there. Because a lot of members
are upset (including me) and don't want to watch the films there because we simply don't have any space. We understand
that it needed to be painted but we can always suggest other possible alternatives. Sincerely, Gabriela.
Student ID 93680511
Hey John, do you heard the news about the film club? They're going to close the main hall for painting, that's not right!
We need to get a place to watch our films and it has to be really big. I remembered once you've said that you had a house
here in the city that was completely empty. What do you think if we start doing our movie-marathons there?
Dear Film Club Manager Every single members of the film club have heard about the terrible news and we are shocked.
Don't you think we should get a better solution instead of showing the films in the lounge? It's a really small place and the
club has like 150 members or even more. Besides, we need a bigger screen than the one that's in the lounge. I talked to a
friend of mine and he is as much indignated as me. He has a house completely empty house and he agreed to borrow the
house until the main hall is finished. What do you think about the idea? We can take the chairs and the screen from our film
room and put them in there. I think it's perfect. Regards, Anthony
Student ID 93680516
Hi Sara! When I saw in the noticed that a maximum of 25 seats will be available for watch the films I was really sad because,
as I always get late, i don't think that, when I come to see the movie, any seats will be available for me. It's too bad but I don't
think i'm going there anymore.
Manager of the club, i don't think it was a really good idea to close the hall for painting, speacially in this time of the year. I think
you already know that, as the winter is coming, the hall would be warmer than the lounge and i don't believe people will be
uncomfortable just to see a film (doesn't matter how good it might be). I work until the hour the films start so, even if i go really
quickly over there, i always get a little late and, with only 25 seats available, i don't think i can have mine. In my opinion, you should
replace the hall for another place but the lounge. Somewhere warmer and, if possible, bigger. This way people would feel more
comfortable and, with more seats available, i would have more chances to find one for me when i get there. Thank you, Adriana.
Student ID 93680572
Dear Michael : How are You? how are things going? i have a bad news for you.the main hall is under renewing . and they are
going to change to into DVD in the Club with max. 25 chair this means that not all of us can meet together. ithink we have to
meet togther in cenima Metro every sunday night so we can see each other until the finis the big hall. or even let's see if any
of or friends have any other ideas.
Dear Sir: I am writing this mail just to tell how sorry i am because of closing the big hall i fell .me and alot of the members who
are meeting every weak in this hall and share alot of Fun and nice time there. it was a very unhappy thing for closing that hall.
I believe i have avery good idea for you.what if try to split the big hall in two area and you can work on one , and when you
finish move to other one. so always have an areato meet together.
Student ID 93683074
I have a problem with my visa. My name was simillar too much simillar to guy who is criminal. they though that i am that guy
becuse of these.you know should try to check the photos. they will know that i am not him.
Dear sir : I woulld to tell you about my problem. the embassy reject my visa because my name was similar to guy who made a
lot of crimes. i think if you try to check the photos you dicover that i am not that man.i think if you use computer program that
help to identiy the photos, it will help to much to check on the people, also if can make some investigation through our police
department,they will help you more and give you more data about me.you. i am good man who earn his money from his work
and i have never made any crimes.you should have check more pleas nect time . even with other people. your sincerly
Moataz Mohamed
Student ID 93683062
Can you imaging that the document I was told to prepare was not the correct one I was supposed to submit? I couldn’t
understand what happened, but the thing I do know is there will not be enough of time for me to get my visa before I leave for
UK, I couldn’t make the trip I have planned for months.
To Whom It May Concern: I was informed yesterday morning by Sisley Zhu of the failure in processing my visa to UK, and the
reason was that the annual income supporting was not the correct document I should submit. I couldn’t understand this because
the document check list I get from your office weeks ago indicates very clearly that the annual income supporting is one of the
correct documents. I therefore came to your office this morning with this document check list, however, I didn’t get any reasonable
answers from you, except for being asked to wait for the receptionist for almost one hour. Can you at least ensure that the
information you deliver to the customers is accurate and consistent? I would also grateful if you could get back to me to give me a
reasonable answer. Best regards, Jiang Lin
Student ID 93683094
my friends...I'm so sad. there are problems with my visa. The people from the embassy think that I don't have the money for my
travell and want give them a bank statement to prove my finance status.
Dear Embassador, I would like to make some complain and suggestions about services of the embassy. It took you so loog to take
a decision for my applly - I think you need to improve that.
Student ID 93683092
I have problems about applying visa because I want to stay there for 2 weeks but there only 7days.I think this dates are very short
about travelling .Unluckily this year a lot of people offer visa. What can I do?
Hello Mr (Mrs) Be I have problems with travelling dates.Before you said what I could travel for long times allow my deciding.I can't
understand why your saying is different with before.I think ,your moodying is the worst I've been offer my visa.NowI got very angry
with my husband.If you can help me ,I will thanks to you.I am really sorry about my bad complaining.If you allow my planning, I am
sure maybe I will have very nice and exciting travelling. Take care of yourself. 13 11 2011
Note: All additional tasks are from the Council of Europe’s document “Relating language examinations to
the Common European Framework of Reference for Languages: learning, teaching, assessment”.
This can be found online at the following address:
http://www.coe.int/t/dg4/education/elp/elp-reg/Source/Key_reference/exampleswriting_EN.pdf
Overall Descriptor
5 [C] Consistently high level of grammatical and lexical range and accuracy; errors are rare and difficult to spot.
Clear, effective pronunciation and intonation; varies intonation and sentence stress correctly to express finer
shades of meaning.
Fluent and spontaneous, with little or no sign of effort.
Clear, smoothly flowing, well-structured speech, with controlled use of organisational patterns, connectors and
cohesive devices.
4 [B2] Sufficient range and control of grammatical forms and lexis to express ideas without much conspicuous hesitation,
using some complex forms to do so. No mistakes lead to misunderstanding.
Has clear, effective pronunciation and intonation.
Stretches of language with fairly even tempo; can be hesitant when searching for patterns and expressions, fairly long
pauses possible.
Uses a limited number of cohesive devices to link utterances into clear, coherent discourse; may be some 'jumpiness'
in long turns.
3 [B1] Sufficient range and control of grammatical forms and lexis to get by, but there is hesitation, repetition and difficulty
with formulation. A reasonably accurate repertoire of frequently used 'routines', patterns and words associated with
more predictable situations, but major errors still occur when expressing more complex thoughts.
Pronunciation is intelligible though the accent means that occasional mispronunciations occur.
Keeps going comprehensibly; pausing for grammatical and lexical planning and repair is very evident in longer
stretches of production.
Links a series of shorter, discrete simple elements into a connected, linear sequence of points.
2 [A2] Control of basic grammatical forms and lexis, but may have to compromise the message and take time to formulate
structures. Uses some simple structures and lexis correctly, but still systematically makes basic mistakes (e.g. tends to
mix up tenses and forgets to mark agreement; sufficient vocabulary for the expression of basic communicative needs
only). Meaning clear.
Pronunciation is generally clear enough to be understood despite a noticeable accent and occasional difficulty for
the listener.
Constructs phrases on familiar topics despite very noticeable hesitation and false starts.
Links groups of words with simple connectors like 'and', 'but' and 'because'.
1 [A1] Very basic range of simple forms with only limited control of a few simple grammatical structures and sentence
patterns in a learned repertoire. Basic vocabulary of isolated words and phrases related to particular concrete
situations.
Pronunciation of a very limited range of words and phrases can be understood with some effort.
Manages very short, isolated utterances, with much pausing to search for expressions, to articulate less familiar
words, and to repair communication.
Little attempt to link words or groups of words; when linking does occur, uses very basic linear connectors like 'and' or 'then'.
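The overall descriptor above ties each band on the five-point speaking scale to a single CEFR label, with the top band reported simply as C rather than as C1 or C2. The short Python sketch below is illustrative only: the band-to-label pairs are taken from the descriptor headings, while the function and variable names are invented and do not represent the operational reporting logic.

# Illustrative sketch: band-to-CEFR pairs taken from the overall descriptor
# above; names and structure are hypothetical.
SPEAKING_BAND_TO_CEFR = {5: "C", 4: "B2", 3: "B1", 2: "A2", 1: "A1"}

def minimum_band_for(level: str) -> int:
    """Lowest overall speaking band whose descriptor carries the given CEFR label."""
    bands = [band for band, cefr in SPEAKING_BAND_TO_CEFR.items() if cefr == level]
    if not bands:
        raise ValueError(f"No overall band is labelled {level}")
    return min(bands)

print(minimum_band_for("B2"))  # 4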
Task 1
You will be asked 3 questions. Answer each question as fully as you can. You have a maximum of 30 seconds to answer each
question so don't worry if the computer stops you. You will hear this sound (beep) before each question. All your answers will be
recorded. The test will now begin. (3 second pause).
Task 2
In this part you will see a picture and answer three questions. Before each question
you will hear this sound (beep). You can talk for a maximum of 45 seconds for each
question. There are 5 marks for this task.
Task 3
Task 3a
You will see 2 pictures. Look at them and say what you see in the two pictures. You
only have 40 seconds to do this. At the end of this time, you will hear a sound (beep).
The test will now begin.
Task 3b
You should now compare something in these pictures. You have 1 minute for this
task. At the end of 1 minute you will hear a sound (beep). Here is your question.
What would it be like to work in these two places?
Task 3c
This is your last question about the pictures. You have 1 minute to answer. At the end
of this time you will hear this sound (beep). Here is your question.
Which of these places do you think it would be better to work in, and why?
Task 4
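The response times stated in the task instructions above can be read as a per-question timing configuration. The sketch below is illustrative only: the timings for Tasks 1 to 3 are taken from the instructions, Task 4 is omitted because its timing is not stated above, and the structure and names are invented.

# Illustrative timing summary of the speaking task instructions above.
# Only the numeric values come from the instructions; the names are hypothetical.
SPEAKING_TIMINGS_SECONDS = {
    "task1": [30, 30, 30],   # three questions, a maximum of 30 seconds each
    "task2": [45, 45, 45],   # three questions on one picture, 45 seconds each
    "task3": [40, 60, 60],   # 3a (describe), 3b (compare), 3c (final question)
}

def total_response_time(task: str) -> int:
    """Total candidate response time for one task, in seconds."""
    return sum(SPEAKING_TIMINGS_SECONDS[task])

print(total_response_time("task3"))  # 160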
APPENDIX 6:
TASK PARAMETERS EXPLAINED
Parameter Description
Purpose The requirements of the task. As with tests of other aspects of language ability this gives candidates
an opportunity to choose the most appropriate strategies and determine what information they are to
target in the text in comprehension activities. Facilitates goal setting and monitoring (key aspects of
cognitive validity).
Response format How candidates are expected to respond to the task (e.g. MCQ, SAF, matching, handwriting, writing on
computer etc.). Different formats can impact on performance.
Known criteria As with listening tests, this means letting candidates know how their performance will be assessed: informing
them about the rating criteria beforehand (e.g. in SAF items, whether spelling or grammar is relevant, as is the case in IELTS;
for writing, letting the test takers know about the assessment criteria before they attempt the task).
Weighting Goal setting can be affected if candidates are informed of differential weighting of items before test
performance begins. Items should only be weighted where there is compelling evidence that they are
more difficult and/or more central to the domain.
Order of items In reading comprehension tests, items will not appear in the same order as the information in the text
where students search read (i.e. for scanning) but may appear in any order for careful reading.
Time constraints Can relate either to pre-performance or to during-performance conditions. The latter is very important in the
testing of reading, as without a time element we cannot test skills such as skimming and scanning (i.e. without
this element all reading will be 'careful').
Discourse mode Includes the categories of genre, rhetorical task and patterns of exposition.
Channel In terms of input this can be written, visual (photo, artwork, etc.), graphical (charts, tables, etc.) or aural
(input from examiner, recorded medium, etc.). Output depends on the ability being tested.
Writer – reader relationship This can be an actual or invented relationship. Test takers are likely to react differently to a text
where the relative status of the writer is known – or may react in an unpredictable way where there is no
attempt to identify a possible relationship (i.e. the test developer cannot predict who the test taker may
have in mind as the writer and so loses a degree of control over the conditions).
Nature of information The degree of abstractness. Research suggests that more concrete topics/inputs are less difficult to
respond to than more abstract ones.
Content knowledge The same as background knowledge, which is very likely to impact on test task/item performance.
Linguistic
Lexical range, Structural range, Functional range These relate to the language of the input (usually expected to be set at a
level below that of the expected output) and to the language of the expected output. Described in terms of a curriculum
document or a language framework such as the CEFR.
Physical conditions, Uniformity of administration, Security All of these elements are taken into consideration in the
Information for Centres documents. Centres are routinely monitored to ensure that they are complying with the regulations.
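The parameters listed in this appendix can be thought of as the fields of a task specification record. The following sketch is purely illustrative: the field names mirror the parameter names above, but the class itself and the example values are invented to show how such a specification might be captured; it is not part of the Aptis item-writing documentation.

# Illustrative only: a hypothetical record grouping the task parameters
# described in this appendix. Field names mirror the parameters above;
# the example values are invented.
from dataclasses import dataclass

@dataclass
class TaskSpecification:
    purpose: str                     # requirements of the task (supports goal setting and monitoring)
    response_format: str             # e.g. MCQ, SAF, matching, writing on computer
    known_criteria: str              # what candidates are told about how performance is assessed
    weighting: str                   # whether and how items are differentially weighted
    order_of_items: str              # relationship between item order and text order
    time_constraints: str            # pre-performance and/or during-performance limits
    discourse_mode: str              # genre, rhetorical task, pattern of exposition
    channel: str                     # written, visual, graphical or aural input
    writer_reader_relationship: str  # actual or invented relationship assumed by the task
    nature_of_information: str       # degree of abstractness of the topic/input
    content_knowledge: str           # background knowledge assumed of the candidate
    lexical_range: str               # described in CEFR or curriculum terms
    structural_range: str
    functional_range: str

# A hypothetical reading task specification using the fields above.
example = TaskSpecification(
    purpose="Scan a short notice to locate specific factual details",
    response_format="MCQ",
    known_criteria="Candidates told that only comprehension is assessed",
    weighting="All items equally weighted",
    order_of_items="Items not in text order (search reading)",
    time_constraints="Strict during-performance time limit",
    discourse_mode="Public notice, expository",
    channel="Written input",
    writer_reader_relationship="Institution writing to the general public",
    nature_of_information="Concrete",
    content_knowledge="No specialist background knowledge assumed",
    lexical_range="A2-B1 (CEFR)",
    structural_range="A2-B1 (CEFR)",
    functional_range="A2-B1 (CEFR)",
)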
BRITISH COUNCIL
APTIS TECHNICAL REPORTS
Linking the Aptis Reporting
Scales to the CEFR
Barry O'Sullivan, British Council
www.britishcouncil.org/aptis