Syntax: introduction
Roger Levy
UC San Diego
Department of Linguistics
February 1, 2008
Modeling Syntax
Today we’re going to cover:
◮ Basic idea of phrase structure
◮ Context-free grammars
◮ Dependency structure
◮ Syntactic Ambiguity
◮ Treebanks
◮ Investigating the data: tree search
... all with an eye toward eventually doing full-blown
syntactic parsing.
Phrase structure parsing
◮ Phrase structure organizes words into phrases, often
called constituents
◮ This organization is hierarchical and thus nested
◮ There are arguments in linguistics about the proper
representations
◮ For a given string there is usually ambiguity as to the
correct phrase structure
◮ This ambiguity often corresponds to semantic ambiguity
Example tree, in labeled bracketing:
(S (NPsg (DT The) (NN velocity)
         (PP (IN of) (NPsg (DT the) (JJ seismic) (NNS waves))))
   (VPsg (VBZ rises)))
Context-free grammars, formally
A context-free grammar (CFG) is a tuple (N, V, S, R)
such that:
◮ N is a finite set of non-terminal symbols;
◮ V is a finite set of terminal symbols;
◮ S ∈ N is the start symbol;
◮ R is a finite set of rules of the form X → α, where X ∈ N
and α is a sequence of symbols drawn from N ∪ V.
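The definition above maps directly onto a small data structure. The following is a minimal sketch, not a standard library API: the class name `CFG`, its field names, and the toy grammar S → a S | a are all invented for illustration, and the check that N and V are disjoint is a common convention rather than something the definition states.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset  # N
    terminals: frozenset     # V
    start: str               # S
    rules: tuple             # R: pairs (X, alpha), alpha a tuple of symbols

    def is_well_formed(self) -> bool:
        """Check the conditions of the definition (plus N, V disjoint)."""
        if self.start not in self.nonterminals:
            return False
        if self.nonterminals & self.terminals:  # assumed convention
            return False
        symbols = self.nonterminals | self.terminals
        return all(
            lhs in self.nonterminals and all(sym in symbols for sym in rhs)
            for lhs, rhs in self.rules
        )

# Hypothetical toy grammar: S -> a S | a
toy = CFG(frozenset({"S"}), frozenset({"a"}), "S",
          (("S", ("a", "S")), ("S", ("a",))))
# Same grammar with a rule that uses an undeclared symbol "b"
bad = CFG(frozenset({"S"}), frozenset({"a"}), "S",
          (("S", ("b",)),))

print(toy.is_well_formed())
print(bad.is_well_formed())
```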
Simple example of a CFG
◮ Take the non-terminal set N = {S, NP, VP, V}
(here the non-terminal V, for verb, is a different object
from the terminal set V)
◮ . . . and the terminal set V = {cats, dogs, meow, like}
◮ Let the start symbol be S
◮ . . . and the rule set be:
    S → NP VP
    VP → V
    VP → V NP
    NP → cats
    NP → dogs
    V → meow
    V → like
◮ This context-free grammar licenses a finite number of
tree-sentences (all unambiguous), including:
    (S (NP cats) (VP (V meow)))
    (S (NP dogs) (VP (V meow)))
    (S (NP cats) (VP (V like) (NP dogs)))
    (S (NP dogs) (VP (V meow) (NP dogs)))
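Because this grammar has no recursive rules, the full set of licensed trees can be listed by brute force. The sketch below encodes the rule set as a Python dictionary (the names `RULES`, `expand`, and `leaves` are illustrative choices) and expands every rule in every possible way; it would not terminate on a recursive grammar.

```python
from itertools import product

# Rule set of the example grammar: left-hand side -> list of right-hand sides
RULES = {
    "S": [("NP", "VP")],
    "VP": [("V",), ("V", "NP")],
    "NP": [("cats",), ("dogs",)],
    "V": [("meow",), ("like",)],
}

def expand(symbol):
    """Yield every tree rooted in `symbol` as a nested tuple (label, child, ...)."""
    if symbol not in RULES:  # terminal symbol: a bare word
        yield symbol
        return
    for rhs in RULES[symbol]:
        # Combine every choice of subtree for each right-hand-side symbol
        for children in product(*(list(expand(s)) for s in rhs)):
            yield (symbol, *children)

def leaves(tree):
    """Return the words of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in leaves(child)]

trees = list(expand("S"))
sentences = [" ".join(leaves(t)) for t in trees]
print(len(trees))           # number of licensed trees
print(len(set(sentences)))  # number of distinct strings
```

The two counts coincide (2 subjects × 6 verb phrases), which is what "all unambiguous" amounts to: no string receives more than one tree.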
Constituency tests
◮ How do linguists build a grammar?
◮ How do we determine what nodes go into a tree?
◮ Answer: classic constituency tests
Constituency tests
◮ pro-form substitution for noun phrases (NPs)
The children ate with a spoon ⇒ They ate with a spoon
⇒ The children ate with it
◮ Cleft formation
Pat ate with a spoon ⇒ It was with a spoon that Pat ate
Pat ate with a spoon ⇏ It was with a that Pat ate spoon
◮ Topicalization:
Pat ate with a spoon ⇒ A spoon, Pat ate with.
Pat ate with a spoon ⇏ Ate with, Pat (did) a spoon
◮ . . . there are other tests too.
Attachment ambiguity
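PP attached to the VP (the spoon is the instrument of eating):
(S (NP (Det The) (N children))
   (VP (V ate)
       (NP (Det the) (N cake))
       (PP (P with) (NP (Det a) (N spoon)))))
PP attached to the object NP (the candle belongs to the cake):
(S (NP (Det The) (N children))
   (VP (V ate)
       (NP (NP (Det the) (N cake))
           (PP (P with) (NP (Det a) (N candle))))))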
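The difference between the two analyses comes down to which node is the mother of the PP. A minimal sketch that reads labeled bracketings and reports the PP's mother follows; the function names `parse` and `parents_of` and the parenthesized input format are choices made for this illustration.

```python
def parse(s):
    """Parse '(S (NP ...) ...)' into nested lists [label, child, ...]."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                    # skip "("
        tree = [tokens[pos]]        # node label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                tree.append(node())
            else:
                tree.append(tokens[pos])  # a word
                pos += 1
        pos += 1                    # skip ")"
        return tree

    return node()

def parents_of(tree, label):
    """Labels of every node that has a daughter labelled `label`."""
    found = []
    for child in tree[1:]:
        if isinstance(child, list):
            if child[0] == label:
                found.append(tree[0])
            found.extend(parents_of(child, label))
    return found

spoon = parse("(S (NP (Det The) (N children)) (VP (V ate) "
              "(NP (Det the) (N cake)) (PP (P with) (NP (Det a) (N spoon)))))")
candle = parse("(S (NP (Det The) (N children)) (VP (V ate) "
               "(NP (NP (Det the) (N cake)) (PP (P with) (NP (Det a) (N candle))))))")

print(parents_of(spoon, "PP"))   # mother of the PP in the first tree
print(parents_of(candle, "PP"))  # mother of the PP in the second tree
```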
Dependency Structure
◮ Another major way of representing syntactic relations is
through word-word dependency structure:
After dinner, a musician who was hired for the wedding arrived
◮ This turns out to be homologous to context-free
phrase-structure trees for which each node has a head
daughter:
(S (PP (P After) (N dinner))
   (NP (NP (Det a) (N musician))
       (RC (WHNP who)
           (S (NP)
              (VP (V was)
                  (VP (V hired)
                      (PP (P for) (NP (Det the) (N wedding))))))))
   (VP (V arrived)))
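The correspondence can be made concrete: once every phrase has a designated head daughter, each non-head daughter's lexical head becomes a dependent of the phrase's lexical head. The sketch below applies this to a shortened version of the example, "After dinner, a musician arrived", with the relative clause left out for brevity. The head choices (VP heads S, P heads PP, N heads NP) are assumptions made for illustration, as are the helper names; words serve as node identifiers, so a sentence with repeated words would need indices instead.

```python
def leaf(tag, word):
    """A preterminal node: part-of-speech tag over a word."""
    return {"label": tag, "word": word}

def phrase(label, head, *children):
    """A phrasal node; `head` is the index of its head daughter."""
    return {"label": label, "head": head, "children": list(children)}

def lexical_head(node):
    """Follow head daughters down to a word."""
    if "word" in node:
        return node["word"]
    return lexical_head(node["children"][node["head"]])

def dependencies(node):
    """Return (head_word, dependent_word) pairs for the whole tree."""
    if "word" in node:
        return []
    deps = []
    head_word = lexical_head(node)
    for i, child in enumerate(node["children"]):
        if i != node["head"]:
            deps.append((head_word, lexical_head(child)))
        deps.extend(dependencies(child))
    return deps

# Head indices below are assumed, not taken from the slide
tree = phrase("S", 2,
              phrase("PP", 0, leaf("P", "After"), leaf("N", "dinner")),
              phrase("NP", 1, leaf("Det", "a"), leaf("N", "musician")),
              phrase("VP", 0, leaf("V", "arrived")))

for head_word, dependent in dependencies(tree):
    print(head_word, "->", dependent)
```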
Syntactic Ambiguities
◮ Prepositional phrases: They cooked the beans in the pot
on the stove with handles.
◮ Particle vs. preposition: A good pharmacist dispenses with
accuracy. The puppy tore up the staircase.
◮ Complement structures: The tourists objected to the guide
that they couldn’t hear. She knows you like the back of her
hand.
◮ Gerund vs. participial adjective: Visiting relatives can be
boring. Changing schedules frequently confused
passengers.
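The prepositional-phrase example shows how quickly attachment ambiguity compounds. The following sketch counts the parses of that sentence with a CKY-style chart. The grammar and lexicon are assumptions made for this illustration (in particular, "They" and "handles" are entered directly as NPs), so the counts describe this toy grammar and not English in general.

```python
from collections import defaultdict

# Assumed toy grammar in binary form: (parent, left daughter, right daughter)
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to a noun phrase
    ("PP", "P", "NP"),
]
# Assumed lexicon: one category per word
LEXICON = {
    "They": "NP", "cooked": "V", "the": "Det", "beans": "N", "pot": "N",
    "stove": "N", "handles": "NP", "in": "P", "on": "P", "with": "P",
}

def count_parses(words, root="S"):
    """Count the trees the toy grammar assigns to `words`."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, label) -> number of trees
    for i, word in enumerate(words):
        chart[i, i + 1, LEXICON[word]] += 1
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    chart[start, end, parent] += (
                        chart[start, mid, left] * chart[mid, end, right]
                    )
    return chart[0, n, root]

sentence = "They cooked the beans in the pot on the stove with handles".split()
for length in (4, 7, 10, 12):  # zero, one, two, three PPs
    print(" ".join(sentence[:length]), "=>", count_parses(sentence[:length]))
```

Under these assumptions the counts grow from 1 to 2 to 5 to 14 as PPs are added, which is why exhaustively listing parses stops being practical for long sentences.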
More syntactic ambiguities
◮ Modifier scope within NPs: impractical design
requirements; plastic cup holder
◮ Multiple gap constructions: The chicken is ready to eat.
The contractors are rich enough to sue.
◮ Coordination scope: Small rats and mice can squeeze into
holes or cracks in the wall.
Syntactic attachment underspecifies semantics
◮ Semantic role (more on this later!)
I cleaned the dishes in the sink
I cleaned the dishes with detergent
I cleaned the dishes with enthusiasm
◮ Scope ambiguity
Every person in the room speaks two languages
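The two readings of the scope example can be written out in first-order logic. This is one possible rendering, given as a sketch: the predicate names are invented here, and "two" is read as "at least two".

```latex
% Reading 1 (every > two): each person may speak a different pair of languages
\forall x\,\Bigl[\bigl(\mathrm{person}(x) \wedge \mathrm{in\_room}(x)\bigr) \rightarrow
  \exists y\,\exists z\,\bigl(y \neq z \wedge \mathrm{lang}(y) \wedge \mathrm{lang}(z)
  \wedge \mathrm{speaks}(x,y) \wedge \mathrm{speaks}(x,z)\bigr)\Bigr]

% Reading 2 (two > every): the same two languages are spoken by everyone
\exists y\,\exists z\,\Bigl[y \neq z \wedge \mathrm{lang}(y) \wedge \mathrm{lang}(z) \wedge
  \forall x\,\bigl(\bigl(\mathrm{person}(x) \wedge \mathrm{in\_room}(x)\bigr) \rightarrow
  \mathrm{speaks}(x,y) \wedge \mathrm{speaks}(x,z)\bigr)\Bigr]
```

Both formulas correspond to the same syntactic tree, which is the sense in which syntax underspecifies the semantics here.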
Treebank Sentences
To her peanuts and emeralds would have been just so much blubber.
(S (PP (TO To) (NP (PRP her)))
   (NP-SBJ (NNS peanuts) (CC and) (NNS emeralds))
   (VP (MD would)
       (VP (VB have)
           (VP (VBN been)
               (NP-PRD (RB just)
                       (ADJP (RB so) (JJ much))
                       (NN blubber)))))
   (. .))
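Treebank trees are typically stored as bracketed text, from which the words and part-of-speech tags can be read off directly. The sketch below does this for the tree above with a single regular expression; the bracketed string is a transcription of that tree, and the variable names are illustrative.

```python
import re

# Bracketed transcription of the example tree
TREE = (
    "(S (PP (TO To) (NP (PRP her))) "
    "(NP-SBJ (NNS peanuts) (CC and) (NNS emeralds)) "
    "(VP (MD would) (VP (VB have) (VP (VBN been) "
    "(NP-PRD (RB just) (ADJP (RB so) (JJ much)) (NN blubber))))) "
    "(. .))"
)

# A preterminal is "(TAG word)" with no nested brackets inside it
tagged = re.findall(r"\(([^\s()]+) ([^\s()]+)\)", TREE)
words = [word for _, word in tagged]
tags = [tag for tag, _ in tagged]

print(" ".join(words))
print(" ".join(tags))
```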
Exploring syntax: Tree search
◮ tgrep2
◮ tregex
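Both tools search trees using patterns built from structural relations, such as `A < B` (A immediately dominates B) and `A << B` (A dominates B). The sketch below is not tgrep2 or Tregex; it is a small Python stand-in for those two relations, with invented function names, run on the treebank sentence above.

```python
def parse(s):
    """Parse '(S (NP ...) ...)' into nested lists [label, child, ...]."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                    # skip "("
        tree = [tokens[pos]]        # node label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                tree.append(node())
            else:
                tree.append(tokens[pos])  # a word
                pos += 1
        pos += 1                    # skip ")"
        return tree

    return node()

def subtrees(tree):
    """Yield the tree and every phrasal or preterminal node below it."""
    yield tree
    for child in tree[1:]:
        if isinstance(child, list):
            yield from subtrees(child)

def immediately_dominates(tree, a, b):
    """Nodes labelled `a` with a daughter labelled `b` (compare `a < b`)."""
    return [t for t in subtrees(tree) if t[0] == a and any(
        isinstance(c, list) and c[0] == b for c in t[1:])]

def dominates(tree, a, b):
    """Nodes labelled `a` with a descendant labelled `b` (compare `a << b`)."""
    return [t for t in subtrees(tree) if t[0] == a and any(
        d[0] == b for c in t[1:] if isinstance(c, list) for d in subtrees(c))]

tree = parse(
    "(S (PP (TO To) (NP (PRP her))) "
    "(NP-SBJ (NNS peanuts) (CC and) (NNS emeralds)) "
    "(VP (MD would) (VP (VB have) (VP (VBN been) "
    "(NP-PRD (RB just) (ADJP (RB so) (JJ much)) (NN blubber))))) (. .))"
)

print(len(immediately_dominates(tree, "VP", "VP")))      # VP < VP
print(len(immediately_dominates(tree, "VP", "NP-PRD")))  # VP < NP-PRD
print(len(dominates(tree, "VP", "NP-PRD")))              # VP << NP-PRD
```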