International Journal of Lexicography, Vol. 26 No. 2, pp. 219–233
doi:10.1093/ijl/ecs016. Advance access publication 21 August 2012
ONLINE ENGLISH LEARNERS’ DICTIONARIES AND MISSPELLINGS: ONE YEAR ON

Robert Lew: Adam Mickiewicz University in Poznań, Poland (rlew@amu.edu.pl)
Roger Mitton: Department of Computer Science and Information Systems, Birkbeck, University of London, UK (r.mitton@dcs.bbk.ac.uk)
Abstract

A previous study (Lew and Mitton 2011) found that the leading monolingual English learners’ dictionaries in their online versions coped poorly with misspelled words as search terms. This paper reports on a repeat of this study in 2012, which obtained similar results, though some changes from 2011 are noted. As in 2011, the performance of the dictionaries was compared with that of an experimental context-free spellchecker, and again the online dictionaries were found wanting. The new data were also subjected to a cluster analysis which showed how the dictionaries could be grouped based solely on their performance.
1. The 2011 study
If you want to look up a word in a paper dictionary, you need to know how it is
spelled, or at least how it starts; looking up newmoanya won’t find pneumonia.
An online dictionary, however, may incorporate a ‘did-you-mean’ function,
somewhat like a spellchecker. You type newmoanya and the dictionary
responds with a short list of suggestions, including, hopefully, pneumonia.
Assuming that your spelling problems are not extreme, you will spot pneumonia
in the list, click on it and get the information you wanted.
All the leading monolingual English learners’ dictionaries have an online
version, and all provide a ‘did-you-mean’ function. For the 2011 study, the
first author selected seven such dictionaries (see Table 1). He also assembled a
corpus of 200 misspellings by advanced learners (details in the Appendix). Each
of these misspellings was then keyed in to each of the dictionaries and a note
was made, each time, of where the required word appeared in the list of
suggestions.
Table 1: The seven dictionaries used in the study

Acronym         Title (and version)
LDOCE free      Longman Dictionary of Contemporary English, free online version
LDOCE premium   Longman Dictionary of Contemporary English, premium subscription version
MWALED          Merriam-Webster’s Advanced Learner’s English Dictionary
MEDO            Macmillan English Dictionary Online
CALD            Cambridge Advanced Learner’s Dictionary
OALD            Oxford Advanced Learner’s Dictionary
GoogleED        Google English Dictionary
Figure 1: Suggestions list in CALD for the target word temporary misspelled
as *tempori.
As an illustration of the procedure, consider Figure 1, taken from a test
lookup in CALD. The required word was temporary, and it was misspelled
as *tempori. The dictionary returned a list of ten suggestions. The top suggestion was temporise, which was not the required word. However, temporary was
found further down the list: in this case it was listed ninth. So, position 9 was
noted for this misspelling in CALD.
The procedure for GoogleED was slightly different because GoogleED
offered just one suggestion (if any) rather than a short list; what was recorded
in this case was simply whether this single suggestion was the required word
or not.
The main conclusions of the 2011 study, in brief, were that there was wide
variation in the performance of the dictionaries, that the general standard was
poor and that even the best performance was unimpressive. The best of the
group was the free version of LDOCE, which offered the required word in the
top ten of its suggestions for 77% of the misspellings; the corresponding figure
for the worst performer (OALD) was 52%.1
2. The 2012 study
Almost exactly one year after the collection of the 2011 data, the operation was repeated — same seven dictionaries, same 200 misspellings. The results are presented in Table 2 and Figure 2.

Table 2: Success rates for the seven dictionaries. Figures indicate the proportion of the 200 target words found in the respective positions in the suggestions list.

                Target word listed in position:
Dictionary      First   Top 3   Top 6   Top 10
LDOCE Free      51%     64%     74%     77%
MWALED          47%     57%     63%     66%
LDOCE Premium   50%     59%     60%     62%
CALD            35%     50%     55%     57%
MEDO            24%     43%     51%     54%
OALD            25%     43%     47%     51%
GoogleED        43%     (43%)   (43%)   (43%)
The percentages indicate what proportion of the 200 target words were found in the respective positions within the suggestions lists returned by the dictionaries; the table shows, for example, that LDOCE Free put the target word first in its list of suggestions for 51% of the misspellings. Top 3 means that the target was listed as first, second, or third, and so on. These figures are cumulative, so if the target was listed at the top of the list, it was automatically counted under all four categories (i.e. First, Top 3, Top 6, and Top 10). Figure 2 conveys the same results in graphic form.
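To make the cumulative counting concrete, here is a minimal Python sketch (ours; the paper does not describe the authors’ actual tooling, and the data layout is assumed):

    # ranks: for each misspelling, the rank of the target word in the
    # suggestions list, or None if the target was not offered at all.
    def success_rates(ranks, cutoffs=(1, 3, 6, 10)):
        n = len(ranks)
        return {c: sum(1 for r in ranks if r is not None and r <= c) / n
                for c in cutoffs}

    # Toy example: target ranked 1st, 9th, and absent in three lookups.
    print(success_rates([1, 9, None]))  # approx. {1: 0.33, 3: 0.33, 6: 0.33, 10: 0.67}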
A comparison of these figures with those from the earlier paper shows that
very little has changed since 2011. The only exception to this is that the profile
for CALD, which in 2011 was almost identical to that of MEDO, has clearly
improved. In 2012, CALD succeeded in placing the target word as the top
suggestion in 69 out of 200 cases, an increase from 51 a year earlier (statistically
significant at p < 0.05 by Fisher’s exact 2x2 test).
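The test itself can be reproduced along the following lines (a sketch using SciPy; the paper does not say what software was used):

    from scipy.stats import fisher_exact

    # 2x2 table of CALD first-place successes: 2012 (69/200) vs. 2011 (51/200).
    table = [[69, 200 - 69],
             [51, 200 - 51]]
    odds_ratio, p = fisher_exact(table)
    print(p)  # compare with the p < 0.05 reported above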
Figure 2: Performance of the seven dictionaries for all 200 misspellings. Bar sections indicate the number of target words ranked in the respective positions in the suggestions list (1st suggestion; 2nd or 3rd; 4th-6th; 7th-10th).
A more detailed comparison of the two years at the level of individual misspellings suggests that both CALD and OALD have done some work on their ‘did-you-mean’ function but that the improvements made by CALD have been, overall, more successful. The differences are shown in Figure 3. For each dictionary, we give three measures, based on the position of the target word in the list of suggestions. The bars on the left represent items showing improvement: in these cases the target is now placed higher in the list of suggestions than it was in January 2011. The middle bars stand for items with no change. The bars on the right indicate items where the target word has fallen down in the suggestions list.

Figure 3: How the seven dictionaries changed between January 2011 and January 2012. The three bars for each dictionary represent, from left to right: items for which results have improved; items with no change; and items that got worse.
It might seem strange that modifications to a ‘did-you-mean’ system can result in the performance getting worse. But a modification that has the right effect on target A can have the wrong effect on target B. To give a simple example, suppose you note that misspellings tend to retain the same first letter as the target, so you tweak your algorithm to give more weight to the first letter — suggestions with the same first letter are now preferred. So, orchestra, say, will be promoted up the list for *orkester. But by the same token certain will move down the list for *serten (i.e. other suggestions may be promoted above it). It is an exceptionally happy improvement that results in only the right suggestions being promoted.
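A toy Python sketch (our illustration, not any dictionary’s actual algorithm) makes the trade-off concrete:

    # Re-rank scored candidates, boosting those that share the misspelling's
    # first letter. Candidate words and base scores are invented.
    def rerank(misspelling, scored_candidates, first_letter_bonus=2.0):
        def adjusted(item):
            word, score = item
            return score + (first_letter_bonus if word[0] == misspelling[0] else 0.0)
        return sorted(scored_candidates, key=adjusted, reverse=True)

    # The bonus promotes orchestra for *orkester...
    print(rerank("orkester", [("worksheet", 5.0), ("orchestra", 4.0)]))
    # ...but by the same token it demotes certain for *serten.
    print(rerank("serten", [("certain", 5.0), ("serpent", 4.0)]))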
The corpus of 200 misspellings was taken from native speakers of three different languages — 100 misspellings from Polish speakers, 50 from Japanese and 50 from Finnish. We did not find any interesting differences in performance depending on whether the misspellings came from the Polish, Finnish, or Japanese subcorpus.1

2.1 Where the dictionaries failed
We commented in our earlier paper on the strangeness of some of the suggestions made by the dictionaries and speculated on how they had arrived at them.
We noted that OALD relied too heavily on substring matches. Since then, a
few items on OALD’s lists have switched places, and a couple of suggestions
have been replaced by more obscure ones (e.g. ecology is out, enology is in), but
the general problem persists. In the 2011 evaluation, OALD did not seem to
give much regard to the first letter, even though research has shown that people
generally get the first letter right (Yannakoudakis and Fawthrop 1983, Mitton
1996). For instance, the dictionary used to offer deferens for *referens (reference). In the new evaluation, this aspect has improved, and OALD now correctly guesses reference.
A particular oddity of the suggestions served up by MEDO in 2011, and
occasionally also OALD and CALD, was their tendency to offer words with an
–s at the end, such as recommends as first choice for *rekomend, even where
there was no indication in the misspelling that one was required. In 2011 all
three dictionaries offered citizens for *sitizen, forwards for *foward, repetitions
for *repetyszyn and even spaghettis for *spagetti. Compared with 2011, OALD
and especially CALD have improved for such cases, so now, of the three, only
MEDO favours citizens; OALD and CALD no longer offer forwards, nor does
CALD offer repetitions, and none of them offers spaghettis. The changes
compared to January 2011 are going in the right direction, but nevertheless the question remains why the other such cases still persist (and the plurals still often appear in second place, which they probably do not deserve). Perhaps it has something to do with the software for dictionary compilation and publication that all three use — they all use IDM PitchLeads as the online platform (Dominic Glennon, personal communication). However, as far as we know, LDOCE also uses the IDM system, and yet it does not exhibit the –s problem. (We shall explore empirically the degree of relatedness between the dictionaries in a later section.) Conversely, MEDO, OALD and CALD all place the singular university at the top of the list of suggestions for *univercitys (a misspelling of universities). For once, the plural would have been reasonable, and yet it is only given in second place.
MWALED’s offerings remain occasionally baffling, still proposing archdiocese for *ridicyles (a misspelling of ridiculous), and failing to correct *spagetti.
OALD largely retains its fondness for obscure and unhelpful suggestions; deferens, commis and Du Bois have gone but etyma, xylem, inf, umbrae, Tok Pisin
and Wat Tyler remain. Going a step further in unhelpfulness, GoogleED sometimes provides non-words as its suggestions. Even though four such bizarre
suggestions have disappeared since January 2011, the following remain: *sejfy
for *sejfty (safety), *sinirli for *sinsirli (sincerely), *bicikli for *beisikli (basically), and *identiti for *aidentiti (identity).
The –ing ending is still a cause of difficulty for these dictionaries. By January
2012, LDOCE Premium and OALD had joined GoogleED in being able to
correct *useing, but the others still insisted on, variously, unseeing, seeing, and
— strangest of all — useding. GoogleED, OALD and CALD can now cope
with *diging, but the others still offer, as their first suggestions, dining, dodging,
Diegan and dinging.
A new development, not noted in January 2011, is the tendency of
OALD and CALD to suggest phrasal verbs and compounds spelled as separate
words rather than the reasonable simplex forms. For example, both dictionaries place dining car at the top of the list for *dyning rather than the simple and
correct dining. *vater (for water) gets later on (CALD) and cater to (OALD) as
the best suggestion, and both dictionaries follow down the list with an assortment of compounds with water (water gun, water ice, etc.). For *szajning (a
misspelling of shining), CALD gives signing up and then, third on the list,
training bra. While it is true that multi-word items have in many ways been
the lexicographic underdog, prioritizing them in this way is a step too far in the
other direction.
3. Can the dictionaries do better? Comparing them with a spellchecker

In assessing the performance of the dictionaries as poor, are we perhaps asking for more than the current technology can provide? The challenge faced by a ‘did-you-mean’ function is much like the problem addressed by a spellchecker. There are differences: the online dictionary lacks the benefit of context but, on the other hand, it does not need to handle punctuation, numerals, obscure proper nouns and so on. But they have much in common, and in fact many spellcheckers find they can get a long way without making any use of the context (Kukich 1992, Deorowicz and Ciura 2005, Mitton 2009).

The second author has produced a context-free experimental spelling correction system (Mitton 1996, 2009). How does it cope with this corpus of advanced learners’ misspellings? (The original version of this spellchecker was designed for native speakers of English. A version has since been produced for Japanese speakers (Mitton and Okada 2007), but it was the original one that was used in this study, without any adaptations for speakers of other languages.)

3.1 How Mitton’s spellchecker works
When presented with a misspelling, the spellchecker begins by assembling a
collection of dictionary words — typically a few hundred — that somewhat
resemble the misspelling. It then takes each of these candidates and matches it
against the misspelling to assess how good a candidate it is. The string-matching algorithm is a version of the well-known ‘edit-distance’ algorithm
(Levenshtein 1966, Wagner and Fischer 1974, Véronis 1988).
The algorithm calculates the minimum number of editing operations
required to get from the candidate to the misspelling, where each editing operation consists of inserting a letter, or deleting a letter, or changing one letter
to another. For example, if the misspelling was *sakology and the candidate
was psychology, you could get from the candidate to the misspelling by deleting
the p, changing the y to an a and the c to a k, and deleting the h — a total of
four operations, therefore an edit-distance of four.
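For readers who would like the algorithm in concrete form, here is a standard dynamic-programming implementation in Python (a textbook version, not Mitton’s code):

    # dist[i][j] = minimum number of operations (insert, delete, change)
    # needed to turn source[:i] into target[:j].
    def edit_distance(source, target):
        m, n = len(source), len(target)
        dist = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            dist[i][0] = i
        for j in range(n + 1):
            dist[0][j] = j
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                same = source[i - 1] == target[j - 1]
                dist[i][j] = min(dist[i - 1][j] + 1,                       # delete
                                 dist[i][j - 1] + 1,                       # insert
                                 dist[i - 1][j - 1] + (0 if same else 1))  # change
        return dist[m][n]

    print(edit_distance("psychology", "sakology"))  # 4, as in the example above
    print(edit_distance("ecology", "sakology"))     # 3, as discussed below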
Merely counting the edit operations, however, only takes you so far.
Consider the candidate ecology. You can get from ecology to *sakology by
inserting an s, then changing the e to an a and the c to a k — an edit-distance
of three. So, simply on the basis of the number of operations, ecology would be
preferred to psychology. But this does not seem quite right.
The p of psychology is silent so it is not surprising that people sometimes
omit it; the y is relatively unstressed, and people often make mistakes over
unstressed vowels, and the ch, in this word, corresponds to the same sound as a
k. Taking all this into account, *sakology is not such a bad shot at psychology.
By contrast, if you were trying to write ecology, starting with an s would be a
strange thing to do.
We can accommodate this by assigning a cost to each editing operation, with more serious (i.e. less likely) operations having a higher cost. We might decide that the operations on psychology are relatively insignificant and assign a cost of just one to each of them. For ecology, we might, similarly, assign a cost of one to changing e to a and c to k, but a much higher cost, perhaps four, to the unlikely error of inserting an initial s. If we now adapt the algorithm so that it calculates the cost, rather than the number, of the editing operations, we come out with a cost of four for psychology and six for ecology, so we would present psychology higher up the list of suggestions.
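A sketch of this weighted variant follows (the cost table is invented for these two examples only; Mitton’s spellchecker derives its costs differently, as explained in the next paragraph):

    DEFAULT = 4.0  # an ordinary, 'surprising' operation
    CHEAP = 1.0    # a plausible error, as discussed above

    # Assumed costs for the psychology/ecology example.
    costs = {
        ("delete", "p"): CHEAP,       # silent p of psychology
        ("delete", "h"): CHEAP,       # the h of the ch digraph
        ("change", "y", "a"): CHEAP,  # unstressed vowel
        ("change", "e", "a"): CHEAP,
        ("change", "c", "k"): CHEAP,  # same sound in these words
    }

    # The same dynamic programme as before, but minimising total cost
    # rather than the number of operations.
    def weighted_distance(candidate, misspelling):
        m, n = len(candidate), len(misspelling)
        dist = [[0.0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            dist[i][0] = dist[i - 1][0] + costs.get(("delete", candidate[i - 1]), DEFAULT)
        for j in range(1, n + 1):
            dist[0][j] = dist[0][j - 1] + DEFAULT  # insertion
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                a, b = candidate[i - 1], misspelling[j - 1]
                change = 0.0 if a == b else costs.get(("change", a, b), DEFAULT)
                dist[i][j] = min(dist[i - 1][j] + costs.get(("delete", a), DEFAULT),
                                 dist[i][j - 1] + DEFAULT,
                                 dist[i - 1][j - 1] + change)
        return dist[m][n]

    print(weighted_distance("psychology", "sakology"))  # 4.0, four cheap operations
    print(weighted_distance("ecology", "sakology"))     # 6.0, costly initial insertion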
The dictionary inside Mitton’s spellchecker is primed with information about
appropriate costs to use in the string-matching. These are based partly on
pronunciation and partly on analyses of large corpora of misspellings. So the
spellchecker already knows, so to speak, that you might omit the t of mortgage
or the middle syllable of remember, that you might insert an s into latest
(*lastest), that you might begin phantom with an f, and so on.2
3.2 The spellchecker versus online dictionaries

Table 3 compares the success rates (in the same fashion as in Table 2) of Mitton’s experimental spellchecker with the best-performing online dictionary (LDOCE Free), and Figure 4 presents the same results graphically.
Mitton’s spellchecker was able to place the required word among the top ten
of its list of suggestions for 93% of the misspellings; the best dictionary in our
set, LDOCE Free, achieved a success rate of only 77%. The gap is even greater
if we consider the spellchecker’s ability to place the target word in the most
valuable top portion of the list of suggestions. Here the experimental spellchecker outperforms LDOCE Free by over 20 percentage points (both for First
and for Top 3). The shortcomings of the dictionary ‘did-you-mean’ functions
cannot be attributed to limitations of the technology.
Table 3: Success rates of the best-performing dictionary compared with Mitton’s experimental spellchecker, for all data

                Target word listed in position:
Dictionary      First   Top 3   Top 6   Top 10
Mitton          73%     87%     91%     93%
LDOCE Free      51%     64%     74%     77%

Figure 4: Performance of the best of the dictionary ‘did-you-mean’ functions compared with Mitton’s spellchecker.

4. Similarities between the dictionaries

In section 2.1 above we saw many examples of different dictionaries returning similar results, or indeed falling into the same traps. We hypothesized that some of the similarities may be due to the use of the same online dictionary platform, PitchLeads from IDM, which, to our knowledge, is used by the two LDOCE dictionaries, OALD, CALD, and MEDO. This would leave only two dictionaries in our sample, MWALED and GoogleED, not using it. But there are other factors that could determine the ranking of words, such as the wordlist or particular techniques of spellchecking. In this section we explore the similarities between the dictionaries in a more formalized way.
For the purposes of this study, the main parameter of interest is the position
of the intended word in a list of suggestions. Thus, if two dictionaries both
present the intended word at the top of the list, or if both list the word in the
same position (say, third), this means the two dictionaries perform identically.
Conversely, the greater the disparity between the ranks of the target word in
two suggestions lists, the farther apart the dictionaries are. In order to quantify
this measure, we computed pairwise Spearman’s rank-order correlation coefficients for the dictionaries. Mitton’s spellchecker was included, but GoogleED
was not, because this dictionary only offered at best a single suggestion rather
than a list, imposing a severe restriction on the range of possible values in an
analysis of ranks. The figures are provided in Table 4. (The table is symmetrical
about the diagonal since the correlation of A with B is, of course, the same as
the correlation of B with A.)
Table 4: Pairwise Spearman correlation coefficients for target word rank data (N = 200)

                 Mitton  LDOCE   LDOCE     MWALED  MEDO   CALD   OALD
                         Free    Premium
Mitton           1.00    0.44    0.48      0.32    0.39   0.38   0.29
LDOCE Free       0.44    1.00    0.69      0.41    0.49   0.51   0.51
LDOCE Premium    0.48    0.69    1.00      0.44    0.46   0.54   0.50
MWALED           0.32    0.41    0.44      1.00    0.41   0.37   0.32
MEDO             0.39    0.49    0.46      0.41    1.00   0.75   0.74
CALD             0.38    0.51    0.54      0.37    0.75   1.00   0.75
OALD             0.29    0.51    0.50      0.32    0.74   0.75   1.00

It is evident from the correlation coefficients that some dictionaries indeed exhibit greater affinity than others. LDOCE Free, for instance, correlates most highly with LDOCE Premium at 0.69. OALD is very close to both CALD and MEDO.
By computing complements to 1 of the correlation coefficients in Table 4, we
obtain a distance matrix that can be used as input in hierarchical clustering.
A cluster tree (dendrogram) from these data using the single-linkage approach
is given in Figure 5. The dictionary branches connect at different levels, and the
lower the linkage distance, the greater the connectedness.
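The computation can be reproduced along these lines (a sketch using SciPy, with stand-in random data; the real input would be the 200 observed target-word ranks per system):

    import numpy as np
    from scipy.stats import spearmanr
    from scipy.cluster.hierarchy import linkage, dendrogram
    from scipy.spatial.distance import squareform

    systems = ["Mitton", "LDOCE Free", "LDOCE Premium", "MWALED",
               "MEDO", "CALD", "OALD"]
    rng = np.random.default_rng(0)
    ranks = {s: rng.integers(1, 12, size=200) for s in systems}  # stand-in data

    corr, _ = spearmanr(np.column_stack([ranks[s] for s in systems]))
    dist = 1.0 - corr                # complement to 1, as in the text
    np.fill_diagonal(dist, 0.0)      # guard against rounding noise
    tree = linkage(squareform(dist, checks=False), method="single")
    dendrogram(tree, labels=systems, no_plot=True)  # or plot, for Figure 5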
Figure 5: A cluster tree for the dictionaries on word rank data, using single linkage (horizontal axis: linkage distance).

The dendrogram in Figure 5 reveals a well-defined three-way cluster made up of OALD, CALD and MEDO, and another one comprising the two versions of LDOCE. These five dictionaries join together at the next step, bearing testimony to the common software platform. MWALED and Mitton’s spellchecker remain relatively apart and increasingly distant from the core cluster.
5. Conclusion
Our conclusion from the 2011 study was that the online versions of the leading
monolingual English learners’ dictionaries were inadequate when it came to
correcting misspelled input from non-native users. On the basis of our repeat of
this study in 2012, the conclusion stands. Far too often, when challenged with a misspelling, the dictionaries are unable to include the required word in their list of suggestions, and, if they do include it, it often appears some way down the list. While the individual dictionaries vary substantially in performance, and granted that OALD and especially CALD have made some improvements, there is still far to go for even the best ones.

Many of the misspellings of the dictionary users will be the same as those made by native speakers, because the underlying cause is the same for both — the difficulties of English spelling. This is why Mitton’s spellchecker, designed for native speakers, performs reasonably well on the sample of misspellings used in this study. The ‘did-you-mean’ functions of the online dictionaries could improve by taking more account of this. But, in addition to that, non-native speakers tend to make errors of their own, because of the influence of their native language. This is likely to surface in at least two ways:

(1) non-native speakers interpret sounds of a foreign language through the filter of their native language phonological system (an effect known as categorical perception); and
(2) languages tend to have their own peculiar sound-to-spelling correspondences, which leads to the entrenchment of L1-specific spelling behaviour.
If a ‘did-you-mean’ function had information about the first language of the users, it could improve its suggestions further by taking account of these language-specific patterns. Beyond sensitivity to the language, one can envisage a yet more radical customization, where an adaptive version of a spellchecker would be capable of fine-tuning itself to the needs and problems of an individual user.
Another suggestion for the improvement of online dictionaries would be to
deal with real-word errors. These occur when someone types a word that is
listed in the dictionary but which is not the word they meant. A spellchecker
checking running text can sometimes detect these by using the context, but even
the ‘did-you-mean’ function of an online dictionary might guess that the input
string was a real-word error if the word was both very rare and similar in
spelling to some other more common words. For example, if a user typed
the rare word wold, it would be helpful if, in addition to displaying the entry
for wold, it treated the word as a misspelling and offered some suggestions —
world? wild? etc.
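A minimal sketch of such a heuristic (our illustration, with made-up frequencies, reusing the edit_distance function sketched in section 3.1):

    # Relative word frequencies are invented for this example.
    FREQ = {"wold": 2e-7, "world": 4e-4, "wild": 1e-4, "would": 2e-3, "weld": 5e-6}
    RARE = 1e-6  # assumed threshold for 'very rare'

    def real_word_suggestions(word, freq=FREQ, max_distance=1):
        # Only valid but very rare words are treated as possible errors.
        if word not in freq or freq[word] > RARE:
            return []
        # Offer much commoner words within a small edit distance.
        return sorted((w for w in freq
                       if w != word and freq[w] > 100 * freq[word]
                       and edit_distance(word, w) <= max_distance),
                      key=freq.get, reverse=True)

    print(real_word_suggestions("wold"))  # ['would', 'world', 'wild']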
The state of the art in spellchecking technology certainly permits the online
dictionaries to provide a better service than they currently do. With some investment from the publishers, there is no reason why they should not get closer
to fulfilling the promise of electronic dictionaries: efficient access to relevant
content.
Acknowledgements
The first author wishes to thank his student assistants, Marta Dabrowska and
Aleksandra Lasko, for their help in collecting the Polish corpus of misspellings.
Notes
1 Full details are provided in Lew and Mitton (2011).
2 Readers wanting more than this brief sketch are invited to consult Mitton (2009)
or, for more detail, Mitton (1996).
References
A. Online dictionaries tested

CALD. Cambridge Advanced Learner’s Dictionary. http://dictionary.cambridge.org/.
GoogleED. Google English Dictionary. At the time of collecting data: http://www.google.com/dictionary; at the time of writing the present version: http://www.google.com/search?q=%s&tbs=dfn:1 (where %s stands for the search term).
LDOCE Free. Longman Dictionary of Contemporary English. http://www.ldoceonline.com/.
LDOCE Premium. Longman Dictionary of Contemporary English. http://www.longmandictionariesonline.com/.
MEDO. Macmillan English Dictionary Online. http://www.macmillandictionary.com/.
MWALED. Merriam-Webster’s Advanced Learner’s English Dictionary. http://www.learnersdictionary.com/.
OALD. Oxford Advanced Learner’s Dictionary. http://www.oxfordadvancedlearnersdictionary.com/.

B. Other literature
Deorowicz, Sebastian and Marcin Ciura 2005. ‘Correcting Spelling Errors by Modelling
Their Causes’. International Journal of Applied Mathematics and Computer Science,
15.2: 275–285.
Kukich, Karen 1992. ‘Techniques for Automatically Correcting Words in Text’.
Computing Surveys, 24.4: 377–439.
Levenshtein, Vladimir 1966. ‘Binary Codes Capable of Correcting Deletions, Insertions
and Reversals’. Soviet Physics — Doklady, 10.8: 707–710.
Lew, Robert and Roger Mitton 2011. ‘Not the Word I Wanted? How Online English
Learners’ Dictionaries Deal with Misspelled Words’. In Iztok Kosem and
Karmen Kosem (eds), Electronic Lexicography in the 21st Century: New
Applications for New Users. Proceedings of eLex 2011, Bled, 10-12 November 2011.
Ljubljana: Trojina — Institute for Applied Slovene Studies, 165–174.
Mitton, Roger 1985. Birkbeck Spelling Error Corpus.
Mitton, Roger 1996. English Spelling and the Computer. Harlow: Longman.
Mitton, Roger 2009. ‘Ordering the Suggestions of a Spellchecker without Using
Context’. Natural Language Engineering, 15: 173–192.
Mitton, Roger and Takeshi Okada 2007. The Adaptation of an English Spellchecker for
Japanese Writers. Symposium on Second Language Writing. Nagoya, Japan.
Okada, Takeshi 2005. ‘A Corpus-Based Study of Spelling Errors of Japanese EFL
Writers with Reference to Errors Occurring in Word-Initial and Word-Final
Positions’. In Vivian Cook and Benedetta Bassetti (eds), Second Language Writing
Systems. Clevedon: Multilingual Matters, 164–183.
Suomi, Riitta 1984. Spelling Errors and Interference Errors in English Made by Finns and
Swedish-Speaking Finns in the 9th Form of Comprehensive School. MA Thesis, Abo
Akademi.
Véronis, Jean 1988. ‘Computerized Correction of Phonographic Errors’. Computers and
the Humanities, 22: 43–56.
Wagner, Robert A. and Michael J. Fischer 1974. ‘The String-to-String Correction
Problem’. Journal of the Association for Computing Machinery, 21.1: 168–173.
Yannakoudakis, Emmanuel J. and David Fawthrop 1983. ‘The Rules of Spelling Errors’.
Information Processing and Management, 19.2: 87–99.
APPENDIX

The corpus of misspellings
The corpus consists of 200 attempts at spelling English words by native speakers of languages from three different language families: Polish (100 items),
Japanese (50), and Finnish (50). A brief description of the three sets of misspellings follows, and a sample of ten items from each is given below.
Polish misspellings
The largest part of the corpus was made up of misspellings by Polish writers,
collected in 2010 by the first author, with the help of two student assistants as
experimenters.
The data were collected by way of oral elicitation. A set of English words
known to be frequently misspelled was taken from The 200 Most Commonly
Misspelled Words in English reported by Richard Nordquist at http://grammar.about.com/od/words/a/misspelled200.htm. One by one, the words from the list
were played back in audio form to one of two Polish learners of English in their
first year of college (one female from Szczecin University, one male from
Gdańsk University), using the built-in audio pronunciation capability of the
popular bilingual English-Polish dictionary Diki.pl, known for its decent audio
quality. Thus, a target word would be played back to the participant without
disclosing its spelling, and the participant would respond by typing the word
into the computer. The experimenter would wait until the participant indicated
that they were done, and then proceed to play back the next target word.
Participants had been instructed in the warm-up sessions to proceed as if
they were using an online dictionary to look up words they had just heard.
All the typed word-like strings were logged. Correctly spelled words were
subsequently removed, as well as trivial errors, such as obvious mistypings,
which in all likelihood would not have challenged the spellchecking algorithms
of the dictionaries, with the remaining strings yielding the Polish subcorpus of
100 misspellings.
Japanese misspellings

The 50 Japanese misspellings were taken from the SAMANTHA Error Corpus created by Takeshi Okada at Tohoku University, Japan (Okada 2005). Japanese university students were asked to write down a series of English words. For each one they were given its definition in Japanese and an approximate representation of the English pronunciation in the Japanese moraic (or, more loosely, syllabic) script katakana. For the present study, we confined our attention to those Japanese misspellings that contained more than one single-letter error (and thus would provide a greater challenge for spellcheckers), selecting, for each target word, the most common of these. Up to a point, though perhaps not as much as for the Polish sample, the elicitation technique used would be likely to produce misspellings influenced by the typical sequencing of letters and sounds in Japanese, as well as by spelling-to-sound correspondences in English.

Finnish misspellings
The Finnish data were collected by Suomi (1984) as part of her MA research.
The errors were taken from test papers written by 60 Finnish speakers, aged 15–16 years, who had had about 16 hours per week of English at school for six or
seven years. There were two tests. In the first, the students were presented with
a short written dialogue, mostly in English but with some sentences in Finnish;
they had to write their translations of these sentences. In the second, the
students listened to a short dialogue, in English, then wrote their answers, in
English, to questions (in Finnish) about the dialogue.
The set of Finnish misspellings is one of several included in the Birkbeck
spelling error corpus (Mitton 1985) available from the Oxford Text Archive.
(The data collected by Suomi also includes misspellings from native speakers of
Swedish, but, for the present study, only the data from native speakers of
Finnish were used.) Trivial errors were discarded, as they were in the case of
the Polish subcorpus. This resulted in a list of 50 misspellings.
SUBCORPUS   TARGET         MISSPELLING
PL          certain        serten
PL          easily         izli
PL          guarantee      garanti
PL          interfere      interfir
PL          interruption   interapsion
PL          library        lajbery
PL          psychology     sakology
PL          receive        reseve
PL          separate       sepret
PL          succeed        sukcid
JP          albatross      albatlos
JP          antenna        untena
JP          beautiful      butiful
JP          embarrass      enbarance
JP          enough         inaf
JP          gallery        garally
JP          graph          glaf
JP          laughter       lafter
JP          neglect        nigrect
JP          umbrella       umblera
FI          because        becourse
FI          colour         coulor
FI          delicious      delecous
FI          especially     espessially
FI          gasoline       gazolin
FI          good-bye       goodbay
FI          orchestra      orkester
FI          symphony       sinfony
FI          temperature    tempeture
FI          universities   univercitys