NACLO 2016 Round 1
Annual North American Computational Linguistics Olympiad 2016
www.nacloweb.org
Open Round
January 28, 2016
"Serious language puzzles that are surprisingly fun!"
-Will Shortz, Crossword editor of The New York Times and Puzzlemaster for NPR
Rules
1. The contest is three hours long and includes eight problems, labeled A to H.
2. Follow the facilitators' instructions carefully.
3. If you want clarification on any of the problems, talk to a facilitator. The facilitator
will consult with the jury before answering.
4. You may not discuss the problems with anyone except as described in items 3 & 12.
5. Each problem is worth a specified number of points, with a total of 100 points.
In this year's open round, no points will be given for explanations. Instead, make
sure to fill out all the answer boxes properly.
6. All your answers should be in the Answer Sheets at the end of this booklet. ONLY
THE ANSWER SHEETS WILL BE GRADED.
7. Write your name and registration number on each page of the Answer Sheets.
Here is an example:
Jessica Sawyer
#850
8. The top 10% of participants (approximately) across the continent in the open round
will be invited to the second round.
9. Each problem has been thoroughly checked by linguists and computer scientists as
well as students like you for clarity, accuracy, and solvability. Some problems are
more difficult than others, but all can be solved using ordinary reasoning and some
basic analytic skills. You don't need to know anything about linguistics or about
these languages in order to solve them.
10. If we have done our job well, very few people will solve all these problems completely in the time allotted. So, don't be discouraged if you don't finish everything.
11. DO NOT DISCUSS THE PROBLEMS UNTIL THEY HAVE BEEN POSTED
ONLINE! THIS MAY BE A COUPLE OF MONTHS AFTER THE END OF THE
CONTEST.
Oh, and have fun!
All material in this booklet © 2016, North American Computational Linguistics Olympiad and the authors of the individual problems. Please do not copy or distribute without permission.
NACLO 2016 Sites
As well as more than 120 high schools throughout the USA and Canada
English
A1. Translate the following sentences into Malay. Write your answers in the Answer Sheets.
a. Beauty is not a gift.
b. The rich girl is not a singer.
c. His wealth is not for the girl.
d. The man is not coming.
e. The gift from the singer is not beautiful.
[Figure: DAWG running from "start" to "end"; surviving letter labels: D, O]
This DAWG can recognize all three words, in that each word constitutes a valid path from the start symbol
to the end symbol, and no other sequence of letters forms such a path.
However, it is not correct to just merge any redundant letters like this, because inappropriately merged
letters will lead to incorrect words being recognized.
[Figure: DAWG running from "start" to "end"; surviving letter labels: O, X]
This DAWG correctly recognizes the letter sequences NITROGEN, HYDROGEN, and OXYGEN, but it also incorrectly recognizes the letter sequences NITROXYGEN and HYDROXYGEN, which were not intended.
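As an illustration only, the effect of over-merging can be sketched in Python. The graph encoding (one numbered node per letter square), the helper names `add_chain` and `recognized_words`, and the exact layout of both graphs are assumptions made for this sketch; the second graph is just one layout that recognizes the five sequences named above, not a transcription of the figure.

```python
def add_chain(nodes, name, letters, then):
    """Add one letter square per letter, linked in order.

    The last square links to every node id in `then`. Returns the id of the
    first square so that other chains (or the start symbol) can point at it.
    """
    ids = [f"{name}{i}" for i in range(len(letters))]
    for i, letter in enumerate(letters):
        successors = [ids[i + 1]] if i + 1 < len(ids) else list(then)
        nodes[ids[i]] = (letter, successors)
    return ids[0]


def recognized_words(start, nodes, end_ids):
    """Collect every letter sequence spelled by a start-to-end path."""
    words = set()

    def walk(node_id, prefix):
        letter, successors = nodes[node_id]
        word = prefix + letter
        if node_id in end_ids:
            words.add(word)
        for nxt in successors:
            walk(nxt, word)

    for node_id in start:
        walk(node_id, "")
    return words


def careful_dawg():
    # HYD and NIT share RO-GEN; OXY shares only GEN (assumed layout).
    nodes = {}
    gen = add_chain(nodes, "gen", "GEN", [])
    ro = add_chain(nodes, "ro", "RO", [gen])
    start = [
        add_chain(nodes, "hyd", "HYD", [ro]),
        add_chain(nodes, "nit", "NIT", [ro]),
        add_chain(nodes, "oxy", "OXY", [gen]),
    ]
    return start, nodes, {"gen2"}


def overmerged_dawg():
    # The O after R is also allowed to continue into X-Y (assumed layout),
    # which opens paths that were never intended.
    nodes = {}
    gen = add_chain(nodes, "gen", "GEN", [])
    xy = add_chain(nodes, "xy", "XY", [gen])
    ro = add_chain(nodes, "ro", "RO", [gen, xy])
    start = [
        add_chain(nodes, "hyd", "HYD", [ro]),
        add_chain(nodes, "nit", "NIT", [ro]),
        add_chain(nodes, "o", "O", [xy]),
    ]
    return start, nodes, {"gen2"}


INTENDED = {"NITROGEN", "HYDROGEN", "OXYGEN"}
good_words = recognized_words(*careful_dawg())
bad_words = recognized_words(*overmerged_dawg())
print(sorted(good_words - INTENDED))  # prints []
print(sorted(bad_words - INTENDED))   # prints ['HYDROXYGEN', 'NITROXYGEN']
```

Enumerating every path is practical here only because the graphs are tiny and acyclic; the point is that a single extra link is enough to admit unintended sequences.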
Answer these questions in the Answer Sheets.
B1. On the next page are three DAWGs that recognize a list of words in a category, the way that the
DAWGs above recognize a three-word list of chemical elements. Each DAWG recognizes a different category
of words.
These DAWGs are poorly-constructed, however, in that each one recognizes several incorrect letter sequences as well. We will give you the shape of the DAWG (but without letter labels) and the incorrect letter sequences; from this, deduce what words the DAWG was supposed to recognize and write these intended
words on your answer sheet. (For each part of this question, the number of intended words is the same as
the number of answer spaces you are given on your Answer Sheet.)
[Figure: DAWG shapes without letter labels, each running from "start" to "end"]
Unintended words: PANDA, GHAQ, IRANADA, CAN, RWANAMA, and many more
B2. For each of the DAWGs above, what would be the fewest number of letter squares needed to recognize
every intended word, and only the intended words? (For example, to recognize HYDROGEN, OXYGEN, and
NITROGEN, you need at least 14 squares. Any fewer than 14 squares and you would recognize an unintended
word like NITROXYGEN or OXYDROGEN. Do not include the start or end spaces in your counts.)
hobusega.
horse.comitative
Estonian nouns have a lot of different forms depending on their case and their number (singular or plural). The table below illustrates various forms for different nouns and adjectives; the exact meaning of these
different cases is not relevant to this problem. Be careful, however: exactly four of the forms in the table below are mistakes! Note that õ, ä, ö, and ü are vowels.
English       Genitive    Partitive   Adessive    Nominative  Genitive    Adessive
translation   Singular    Singular    Singular    Plural      Plural      Plural
house         maja        maja        majal       majad       majade      majadel
nest          pesa        pesa        pesal       pesad       pesade      pesadel
singer        laulja      laulja      lauljal     lauljad     lauljate    lauljatel
restaurant    söökla      sööklat     sööklal     sööklad     sööklate    sööklatel
name          nime        nime        nimel       nimed       nimede      nimedel
ice           jää         jääd        jääl        jääd        jääde       jäädel
summer        suve        suve        suvel       suved       suvede      suvedel
white         valge       valget      valgel      valged      valgete     valgetel
sister        õe          õde         õdel        õed         õdede       õdedel
road          tee         teed        teel        teed        teede       teedel
big           suur        suurt       suurel      suured      suurte      suurtel
yellow        kollase     kollaset    kollasel    kollased    kollaste    kollastel
man           mehe        meest       mehel       mehed       meeste      meestel
bean          oa          uba         oal         oad         ubade       ubadel
reason        põhjuse     põhjust     põhjusel    põhjused    põhjuste    põhjustel
story         loo         lugu        lool        lood        lugude      lugudel
island        saare       saart       saarel      saared      saarte      saartel
              Genitive    Partitive   Adessive    Nominative  Genitive    Adessive
              Singular    Singular    Singular    Plural      Plural      Plural
moon          (a)         kuud        kuul        (b)         (c)         (d)
human         inimese     (e)         (f)         (g)         inimeste    (h)
fish          (i)         kala        (j)         kalad       (k)         (l)
(F) Take One Tablet and Call Me in the Morning (1/2) [15 points]
Hittite is an extinct language that belongs to the Anatolian branch of the Indo-European language family. It
was spoken in the ancient Hittite Empire in the second millennium BCE. Hittite was written using a script, called
cuneiform, composed of many wedge shapes; an example of cuneiform unrelated to this problem is below.
The excerpt below is a (simplified) phonetic rendering of a cuneiform passage found on a tablet. You do not
need to know how the text is pronounced to solve this problem.
nata illuyankan
h attenaz ar kallita
kawa ezenan iyami
nuwa adanna akuwanna eh u
nata illuyanka qadu dumumeu
ar r nuza eter ekuer
nata palh an h mandan ekuer
neza ninkr
ne namma hattena kattanta
nmn pnzi h upaiyaa it
nu illuyankan ih imanta kallit
ima it nukn illuyankan
kuenta dingirmea kattii eer
Its translation into English:
And he called up the snake from the hole: "Behold the feast I'm making! Come to eat and to drink!" And the
snake came up with his sons. And they ate and drank. And they drank all the kettles. And they could no longer go down into the hole again. And Hupasiyas came and tied the snake with a rope. The Stormgod came and
killed the snake; and the gods were with him.
Answer the questions on the next page in the Answer Sheets provided.
[Pictures of signs (B) through (K)]
The particular variety of Cistercian sign language represented here is that of a monastery in the U.S.; signs in other communities
may vary from those presented here.
[Pictures of signs (M) through (W)]
i. a Cistercian monk
j. Cistercians
k. dormitory
l. (to) drink
m. England
n. ice
o. Iceland
p. Italy
q. milk
r. a nun
s. poetry
t. queen bee
u. snow
v. tree
w. wooden table
G2. Translate the following into Cistercian Sign Language. For each word, the answer will be a single sign;
write the number of that sign in your Answer Sheet.
a. baby
b. (to) pour
c. rain
d. tea
The Benedictines are another order of Christian monks (see pictures above on this page).
The Blessed Sacrament is a term used to refer to the bread used in a particular ritual.
Christmas is a holiday celebrating the birth of the Christian figure Jesus.
A second method is the bigram method. In this method, MARY first finds all the tokens that were used to
start a sentence in the text and randomly chooses one of these to start the sentence. Then she builds the rest
of the sentence by looking at the most recent token generated, finding all tokens that occur immediately
after that token in the text, and randomly choosing one of these. For example, if the most recently generated
token was "red", MARY would find all the tokens in the text that immediately follow "red", {hair,
curtains, as, ...}, and randomly choose one of these to be the next word. A sentence generated using the
bigram method might look like this:
Face your nose noisily after you saying stuff.
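The bigram procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not MARY's actual implementation: the three-sentence `TEXT` is invented for the example, tokens are split on whitespace, repeated followers are kept (so frequent continuations are chosen more often), and generation stops when a token has no recorded follower or after `max_tokens` tokens. The function names are hypothetical.

```python
import random
from collections import defaultdict

# Invented toy text for the sketch; not taken from the books.
TEXT = [
    "the red hair was long.",
    "the red curtains were closed.",
    "she saw red as a warning.",
]


def build_bigram_model(sentences):
    """Return (sentence-initial tokens, token -> list of following tokens)."""
    starters = []
    followers = defaultdict(list)
    for sentence in sentences:
        tokens = sentence.split()
        if not tokens:
            continue
        starters.append(tokens[0])
        for current, nxt in zip(tokens, tokens[1:]):
            followers[current].append(nxt)
    return starters, followers


def generate_bigram(starters, followers, rng, max_tokens=20):
    """Pick a random starter, then repeatedly pick a random follower."""
    token = rng.choice(starters)
    sentence = [token]
    # Assumed stopping rule: no known follower, or the length cap is reached.
    while len(sentence) < max_tokens and followers.get(token):
        token = rng.choice(followers[token])
        sentence.append(token)
    return " ".join(sentence)


starters, followers = build_bigram_model(TEXT)
print(followers["red"])  # prints ['hair', 'curtains', 'as']
print(generate_bigram(starters, followers, random.Random(0)))
```

Because each choice looks only one token back, the output is locally plausible but can drift, which is the behavior the sample sentence above shows.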
A third method is called the trigram method. This method is very similar to the bigram method, but uses the
previous two tokens (instead of the previous one) to decide what the next token will be. A sentence generated using the trigram method might look like this:
But Harry hardly noticed that six extra chairs.
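A matching sketch for the trigram method keys the lookup table on the previous two tokens instead of one. As before, this is only an illustration: the toy `TEXT` is invented, the function names are hypothetical, and starting from a randomly chosen sentence-initial pair is an assumption, since the description above does not say how the first two tokens are picked.

```python
import random
from collections import defaultdict

# Invented toy text for the sketch; not taken from the books.
TEXT = [
    "the red hair was long.",
    "the red curtains were closed.",
    "she saw red as a warning.",
]


def build_trigram_model(sentences):
    """Return (sentence-initial pairs, (token, token) -> following tokens)."""
    starts = []
    followers = defaultdict(list)
    for sentence in sentences:
        tokens = sentence.split()
        if len(tokens) < 2:
            continue
        starts.append((tokens[0], tokens[1]))
        for first, second, third in zip(tokens, tokens[1:], tokens[2:]):
            followers[(first, second)].append(third)
    return starts, followers


def generate_trigram(starts, followers, rng, max_tokens=20):
    """Start from a random initial pair, then extend using the last two tokens."""
    sentence = list(rng.choice(starts))
    while len(sentence) < max_tokens:
        options = followers.get((sentence[-2], sentence[-1]))
        if not options:  # assumed stopping rule: no known continuation
            break
        sentence.append(rng.choice(options))
    return " ".join(sentence)


starts, tri_followers = build_trigram_model(TEXT)
print(tri_followers[("the", "red")])  # prints ['hair', 'curtains']
print(generate_trigram(starts, tri_followers, random.Random(0)))
```

With two tokens of context there are fewer possible continuations at each step, so generated sentences tend to follow the source text more closely than bigram output does.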
The last method that MARY can use to generate sentences is called the Context Free method. This method
starts by taking each sentence in the text and generating a grammar tree, like the one below, for it.
S
  NP
    NNP  Sirius
    NNP  Black
  VP
    VBD  lent
    NP
      PRP  it
    PP
      TO  to
      NP
        PRP  me
H1. Below is a collection of sentences. Two of them are real sentences from the Harry Potter series. The
rest were generated using one of the methods above; each method generated at least two sentences. In the
answer sheets, write either "u" for unigram, "b" for bigram, "t" for trigram, or "c" for context-free to indicate
the method that most likely generated that sentence, or if you think the sentence was not automatically generated, write "r" for real.
a. Headmaster uninjured could that was Malfoy that badges
b. He bent over top of the water blushing furiously.
c. There were crouching in your bedroom.
d. He lived about a hundred wizards were closing.
e. Ron spooned iron bolts, keyholes, and a heavy wooden breadboard on to her
back and picked up a fistful.
f. "What?" said Harry.
g. Sorry! he said," said Mr. Malfoy's eyes.
h. Harry wasn't," said Dumbledore went slightly surprised.
i. years beginning at to annoyance spider!" just months Harry
j. You might have been an impostor.
k. They'll be the first to rise up in the Invisibility Cloak on," said
Professor Flitwick pressed a box into his bag.
Contest Booklet
REGISTRATION NUMBER
Name: ___________________________________________
Contest Site: ________________________________________
Site ID: ____________________________________________
City, State: _________________________________________
Grade: ______
Start Time: _________________________________________
End Time: __________________________________________
Please also make sure to write your registration number and your name on each page that you turn in.
SIGN YOUR NAME BELOW TO CONFIRM THAT YOU WILL NOT DISCUSS THESE PROBLEMS WITH ANYONE
UNTIL THEY HAVE BEEN OFFICIALLY POSTED ON THE NACLO WEBSITE IN APRIL.
Signature: __________________________________________________
YOUR NAME:
REGISTRATION #
b.
c.
2. a.
b.
c.
b.
c.
d.
e.
f.
h.
i.
j.
k.
l.
Althea
Inday
Janelle
Maria
Main Dish
Dessert
Drink
1. a.
b.
c.
d.
g.
h.
i.
j.
2.
3. a.
b.
c.
d.
e.
f.
g.
h.
e.
f.
b.
c.
d.
e.
f.
g.
2. a.
b.
c.
d.
e.
f.
g.
b.
c.
d.
e.
f.
g.
h.
i.
j.
k.
l.
m.
n.
o.
p.
q.
r.
s.
t.
u.
v.
w.
f.
g.
2. a.
b.
c.
d.
b.
c.
d.
i.
j.
k.
l.
e.
h.