How Flexible Idiom Is


Linguistics 2019; 57(4): 735–767

Christiane Fellbaum*
How flexible are idioms? A corpus-based
study
https://doi.org/10.1515/ling-2019-0015

Abstract: Idioms are a compelling subject of study for linguists, lexicographers
and psycholinguists due to their seemingly idiosyncratic status as lexical units
that pose challenges for integration into accepted grammatical frameworks. The
literature reveals much disagreement on the semantic compositionality, syntac-
tic flexibility and lexical variation of both specific idioms and idioms as a class.
We analyze some of the sources for the disparate analyses, which are most often
based on judgments of constructed rather than attested examples. Relying solely
on corpus data from English and German that shows a wide range of syntactic
and lexical variation independent of semantic compositionality, we argue that
speakers’ use of idioms is in fact compatible with the rules governing freely
composed language.1

Keywords: idioms, syntax, semantics, flexibility, corpus data

1 Introduction
Analyses of spoken and written language reveal a high percentage of Multi Word
Units (MWUs), both in terms of types and tokens (Jackendoff 1995; Moon 1998;
Cowie 1998). MWUs comprise a broad range of phrases, including idiosyncratic
collocations like brush one’s teeth and answer the door, formulae like Happy New
Year! and idioms like pull someone’s leg (for a typology of MWUs see Mel’čuk
1995; Langlotz 2006; Fellbaum 2015a; inter alia). Such “pre-fabricated” phrases
persist in the language, perhaps because they often refer to complex but com-
mon events and situations, and save speakers the effort to encode anew appro-
priate messages (Nunberg et al. 1994; Fellbaum 2007a).2

1 This article is based on a talk given at the BSGL 2015 meeting on idioms in Brussels (Fellbaum
2015b).
2 Domain-specific language, where the topics tend to be limited, may make even greater use of
“pre-fabricated” phrases. Kuiper (1996) studied the speech of sports commentators and auction-
eers. Noting that they are under pressure to talk fast when reporting in real time on rapidly

*Corresponding author: Christiane Fellbaum, Program in Linguistics, Green Hall
1-S-14, Princeton University, Princeton, NJ 08544, USA, E-mail: fellbaum@princeton.edu

Brought to you by | Universitaetsbibliothek Frankfurt/Main


Authenticated
Download Date | 7/24/19 2:31 PM
736 Christiane Fellbaum

Idioms, a sub-class of MWUs, are a compelling subject of investigation by
linguists, lexicographers and psycholinguists because of their pervasiveness and
universality, the challenge they pose to any definition of “word,” and their
resistance to clear-cut and uniform criteria for classification in terms of semantic
compositionality, syntactic flexibility and modifiability. There is little agreement
on the lexical status or the grammatical properties either of idioms as a class or
of specific idioms. There is also widespread disagreement on the acceptability of
idioms that differ syntactically and lexically from a neutral citation form. This
paper analyzes some of the sources for the disparate analyses and proposes a
broad, comprehensive view of idioms in view of attested data, focusing on
English and German.

1.1 Scope of this paper

We focus on verb phrase (VP) idioms of the form

(1) V (NP)* (PP)

where one or more complements of the verb are part of the idiom, though not
necessarily all. Examples include the English idioms hit a nerve, give somebody
the glad eye, and (not) look a gift horse in the mouth.3 The open argument position
that is not lexically filled by an idiom component is most often the subject. Many
idioms have in addition an open indirect object argument (read Y the riot act), or
an open object of a preposition (keep tabs on Y) or a possessive (get Y’s goat).
We consider both “plausible” and “implausible” idioms, exemplified by
smell a rat and fry one’s brains, respectively; the former, but not the latter,
have possible alternate non-idiomatic readings.4 Of particular interest are

evolving situations that they need to hold in short-term memory, Kuiper shows that such
speakers resort to a repertoire of formulaic language, which allows them to pack complex
information into standardized forms.
3 We will not consider Light Verb Constructions, Support Verb Constructions or Vague Action
Verbs like take a bow, make a phone call, have a drink and give a groan (Grimshaw and Mester
1988; Kearns 2002 [1988]; inter alia). We exclude sentences like the early bird gets the worm,
routine formulae like have a nice day, idioms that cannot be assigned to any phrasal category
such as say when, like father like son as well as lexically specified collocations such as answer
the door. See Fellbaum (2014) for a broad classification of idioms.
4 A manual classification of frequent German idiom candidates retrieved from a one billion-
word corpus showed that the literal meaning is intended about half the time on average, with
significant variation across individual idioms (Fellbaum 2007b).


partially or fully non-compositional idioms, whose meanings cannot be derived
from the meanings of their components nor easily guessed by speakers unfamiliar
with the idiom. We examine the syntactic and lexical flexibility of idioms
based on data retrieved from the Web (for English data) and a large corpus (for
German).

1.2 Canonical form and variations


Before investigating the syntactic and lexical variations of idioms we need to
establish what the basic, unvaried form is. Throughout, we use the term “cano-
nical” or “citation” form of a VP idiom to refer to the intuitively neutral one that
is found in dictionaries (give somebody the boot, have bats in one’s belfry, kick the
bucket, etc.) and that can be represented structurally as in (1). For the German
data, the standard citation form is verb-final.
Dictionaries list most idioms with the verb in the infinitive form and in the
active voice, with the open subject position most often filled by an Agent.
Occasionally, there are multiple citation forms. For example, a common
German idiom for which different dictionaries list alternative forms is shown
in (2) and (3); the second form here is the so-called lassen (let) passive.

(2) jemanden ins Bockshorn jagen
somebody into the buck’s horn chase
‘intimidate somebody’
(3) jemand laesst sich nicht ins Bockshorn jagen
somebody lets himself not into the buck’s horn chase
‘somebody doesn’t let himself be intimidated’

Corpus data suggest that the latter, with a Patient argument in the open subject
position, is by far the more frequent form of this idiom (192 tokens vs. 44 tokens
in the one billion word corpus, see Section 4.1.1 for more discussion of corpus-
based frequency).
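
The relative frequency of the two citation forms follows directly from these token counts. The following minimal Python sketch works through the arithmetic; the dictionary labels are ours and purely illustrative, and only the two counts (192 and 44) come from the corpus figures reported above.

```python
# Token counts for the two citation forms of "ins Bockshorn jagen",
# as reported in the text. The labels are illustrative, not corpus tags.
counts = {
    "lassen passive (3)": 192,
    "active (2)": 44,
}

total = sum(counts.values())

# Share of each form among all tokens of the idiom.
shares = {form: n / total for form, n in counts.items()}

# How many times more frequent the lassen passive is than the active form.
ratio = counts["lassen passive (3)"] / counts["active (2)"]

for form, share in shares.items():
    print(f"{form}: {counts[form]} of {total} tokens ({share:.1%})")
print(f"ratio passive/active: {ratio:.2f}")
```

On these counts, the lassen passive accounts for roughly four out of every five tokens of the idiom.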
Another hallmark of the canonical form is that it is neutral with respect to
information structure and does not require a contrast or assume prior reference
to (and thus givenness of) a constituent in the context in which the idiom is
used. In (4), an example of a non-neutral context, the beans are contrasted with
the beans that were spilled in the author’s previously published memoir, and
this contrast licenses the topicalization of the noun phrase in the second
sentence:


(4) The book is a collection of anecdotes and political opinions, rather than a
sequel to her best-selling memoir Spilling the Beans. The beans she spilled in
that book included an account of how her alcoholic father beat her as a
child.
https://www.telegraph.co.uk/news/6179497/Clarissa-Dickson-Wright-
They-dont-call-me-Krakatoa-for-nothing.html

A third characteristic of the canonical form of an idiom is that it is not subject to
different acceptability judgments across speakers. Only syntactic and lexical
variations are frequently subject to such disagreements.

1.3 Variations considered in this paper

The aim of this paper is to investigate whether, and to what extent, VP idioms
are in fact the kind of monolithic, inflexible structures that they are often
portrayed to be. To this end, we examine corpus data for syntactic and lexical
variations of familiar idioms.5
Operations that apply to entire idioms rather than idiom-internal constitu-
ents will not be further discussed in this paper. These include negation (William
F. Cody, who did not kick the bucket until 1917), questioning (When did he kick the
bucket?), variations in mood, tense and aspect (it would make your grandmother
very happy if you had kids before she kicks the bucket; the washing machine was
kicking the bucket when suddenly it started working just fine) and use of a modal
verb (80 tablets and he could not kick the bucket?). There is general agreement in
the literature that these operations are freely available to all idioms, regardless
of semantic compositionality (Stathi 2007; Schenk 1995; Mel’čuk 1995).6

1.3.1 Syntactic variations

We consider variations of the canonical form that result from the following
syntactic operations on idiom-internal components: topicalization, clefting, pas-
sivization, relativization, wh-questioning, ellipsis and pronominalization.

5 Our distinction between syntactic and lexical operations does not imply any theoretical
claims pertaining to different levels of grammatical analysis.
6 In fact, statements that aspectual variations are acceptable only when the aspect agrees with
that of the idiom’s meaning (Everaert 2010; Mel’čuk 1995, inter alia) are refuted by attested
examples like (14).


1.3.2 Lexical variations

We examine variations resulting from lexical operations such as adjectival
modification or compounding of an idiom-internal NP, substitution of a constituent
with another word or phrase, and zeugma, as discussed in Section 7. We
also include in the class of lexical operations the reversal of polarity, as when
the negation in the canonical (and far more frequent) form of idiomatic Negative
Polarity Items is reversed.
We will not include Ernst’s (1981) “external modification” among the lexical
variations we consider. In these cases, a noun in a non-compositional idiom is
modified by an adjective, as in Ernst’s example (5):

(5) This is grist for the linguistic mill

Ernst points out that the adjective here does not in fact modify the noun but
rather the entire VP and that it must therefore be considered semantically
external. (5) can be paraphrased roughly as “for linguists/linguistically, this is
grist for the mill.” The adjectival modification here does not necessarily entail
that the noun receives a semantic interpretation. A related case is (6), called
"metalinguistic" by Stathi (2007), where the adjective does not modify the noun
but rather comments on the linguistic (idiomatic) status of the phrase:

(6) An agenda of adventures that you were made to experience and a lifestyle
that you were meant to live before you kicked the proverbial bucket?
https://books.google.com/books?
id = 0XtKBQAAQBAJ&pg = PA51&lpg = PA51&q = %22you + kicked
+ the + proverbial + bucket%22&source = bl&ots = qPuGiS3pBg&sig
= rARQhJ0Sy8iNOInUNBpq0Oy2cW0&hl = en&sa = X&ved = 2ahUKEwj1j7D-
orvLeAhWJd98KHdMzC6EQ6AEwBXoECAgQAQ#v = onepage&q = %22you
%20kicked%20the%20proverbial%20bucket%22&f = false

Both external and metalinguistic noun modifications are available to all idioms,
regardless of compositionality.7

7 Nicolas (1995) provides a fine-grained semantic classification of external modification based
on Quirk et al. (1985). He explicitly excludes from consideration examples of Ernst’s “conjunction
modification”, where the modified noun is interpreted both figuratively and idiomatically,
as in …had such fun pulling his cross-gartered leg for so long. Nicolas also dismissed Ernst’s
“semantically internal” modification, as in Many people were eager to jump on the horse-drawn


2 Related work
Idioms have received attention from linguists, lexicologists, lexicographers, computational
linguists and psycholinguists. Among the analyses proposed by
linguists, we distinguish those that focus on the syntax of idioms, those that
consider syntax in conjunction with semantic compositionality, and those that
categorically deny any semantic compositionality.

2.1 Proposed accounts for syntactic flexibility of idioms in grammatical frameworks

Many idioms violate a basic principle of natural language, semantic compositionality,
and for this reason they are thought by some to constitute a particular
class of lexemes outside the “core” and belonging to the “periphery” of the
grammar (Di Sciullo and Williams 1987). When viewed as monolithic “long
words,” idioms may appear to be fixed strings that are constrained in their
syntactic behavior. But clearly not all idioms are created equal, and several
proposals have been made to account for syntactic variation found across
idioms.
Fraser (1970) proposes an implicational hierarchy of transformations, dis-
tinguishing five levels of frozenness. In the lexicon, idioms are marked with a
feature specifying their position in the hierarchy. Extraction operations are
ranked highest, and if a given idiom allows one kind of extraction of an
idiom-internal NP, such as passivization, it will also tolerate other NP extraction
operations. Idioms that can undergo extraction operations allow lower-ranked
operations, such as permutation of constituents and insertion of lexical material.
While undeniably elegant, Fraser’s analysis is based entirely on constructed
data, and both his specific data as well as the value of the hierarchy have been
challenged (Newmeyer 1974; inter alia). Corpus data furthermore refute his
account.
For example, Fraser states that the idiom kick the bucket is frozen and
cannot passivize, though there are plenty of attested examples, such as

Reagan bandwagon, where the adjective horse-drawn modifies the literal meaning of bandwa-
gon, although this noun is not interpreted literally within the idiom. Nicolas considers such
cases to be “word play,” external to the grammar of idioms. See Section 7 for a discussion of
word play.


(7) And no one here knows when the bell will toll or when the bucket will be
kicked
https://cherylcapaldotraylor.com/2016/02/29/take-the-leap/
https://books.google.com/books?id = YIdCmIrZZxEC&pg
= PA195&lpg = PA195&dq = %22the + bucket + will + be + kicked%
22&source = bl&ots = sXEqD5Mski&sig = qfKMK4D6c6jBPjQhMKHcIwMTn6-
k&hl = en&sa = X&ved = 2ahUKEwj-0sa7p97eAhVC3VMKHWrTAMo
Q6AEwCHoECAYQAQ#v = onepage&q = %22the%20bucket%20will%20be%
20kicked%22&f = false

According to Fraser, idioms that don’t allow passivization should be barred
from nominalizations; however, this does not hold, either, as shown by data
like (8):

(8) this, coupled with his diabetic cum hypertensive condition, was what led to his
kicking the bucket in the early hours of last Saturday, November 18, 2017
http://www.peacefmonline.com/pages/local/news/201711/336344.php?
storyid = 100&
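
The logic of an implicational hierarchy, and the way a single attested example can falsify a lexical marking, can be made concrete in a short sketch. The Python code below is an illustration under stated assumptions, not Fraser's formalization: the numeric ranks, the level 0 standing for "frozen", and the relative order of insertion and permutation are our own choices. Only the ranking of extraction above the other two operations, the treatment of passivization as extraction, and the frozen status claimed for kick the bucket come from the discussion above.

```python
from enum import IntEnum


class Operation(IntEnum):
    # Assumption: the numeric ranks are illustrative. The text only states
    # that extraction outranks permutation and insertion; the relative
    # order of those two is our guess.
    INSERTION = 1
    PERMUTATION = 2
    EXTRACTION = 3  # includes passivization of an idiom-internal NP


def predicted_operations(level):
    """Operations licensed for an idiom marked with the given level.

    Implicational logic: an idiom that allows an operation of some rank
    also allows every lower-ranked operation.
    """
    return {op for op in Operation if op <= level}


# Hypothetical lexicon marking: level 0 stands for "frozen".
LEXICON = {"kick the bucket": 0}

# Operations observed in attested data; example (7) is a passive.
ATTESTED = {"kick the bucket": {Operation.EXTRACTION}}


def counterexamples(idiom):
    """Attested operations that the idiom's lexical marking does not license."""
    return ATTESTED.get(idiom, set()) - predicted_operations(LEXICON[idiom])


print(counterexamples("kick the bucket"))
```

Under this encoding, any non-empty result from counterexamples signals a conflict between the marking and the data, which is the situation (7) creates for an idiom marked as frozen.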

Lebeaux (2000) attempts to integrate idioms into the core grammar and to
capture their behavior in terms of broad rules. He argues that idioms are
constructed like partial phrase markers similar to those characterizing certain
stages of language acquisition. Both can be accommodated in a “sub-gram-
mar” framework that is distinct from, but compatible with, the full grammar
that defines competence. Lebeaux distinguishes between a class of “pre-
merger” and a class of “post-merger” phrases. For example, pre-merger
idioms, including take advantage of, have a variable determiner (take no
advantage of) and are subject to syntactic operations like passivization
(advantage was taken of Jim); post-merger idioms like kick the bucket include
a definite determiner and cannot undergo syntactic operations like passive.
While Lebeaux’s proposal for an idiom grammar is interesting in that it
integrates language acquisition and adult grammar, it, too, is based on
constructed data that conflicts with attested data. Moreover, the claim that
in an idiom with a definite NP the Determiner is invariant is contradicted by
idioms like break the ice, which occurs freely with negation (break
no ice) or a demonstrative (break this ice):


(9) After talks, India and Pakistan break no ice on how to demilitarize the no-man’s
land above the Siachen glacier.
https://in.reuters.com/article/india-pakistan-events/timeline-flashpoints-and-
flare-ups-in-india-pakistan-ties-idINDEE83703C20120408

(10) How to break this ice and how not to spurt out unwanted topics and place
yourself in an awkward situation?
https://www.blinddate.com/blogs

And, contrary to Lebeaux’s prediction, passivization is possible in many idioms
with a definite NP, as seen in (7) and (11–13):

(11) "Thank God" said a Georgia representative, and the ice was broken.
https://www.google.com/search?q = %22the + ice + was + broken%22&ie = utf-
8&oe = utf-8&client = firefox-b-1-ab

(12) In this latest meeting between leaders of the two countries, the hatchet was buried
https://www.theeastafrican.co.ke/news/ea/Rwanda-France-relations-bury-
hatchet/4552908-4580862-format-xhtml-112t1ak/index.html

(13) But when a greedy nephew took her to court to get a piece of the pie, the beans
were spilled.
https://www.forbes.com/pictures/eiif45gkek/catherine-lozick/#4f96c9555f45

Accounting for the syntactic flexibility of idioms in purely structural terms does
not do justice to the data.
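
Variants such as those in (9)–(11) can be found with simple pattern searches over text. The Python sketch below is an illustration only and not the retrieval procedure used for this paper: the determiner list, the auxiliary list and the function name find_variants are our assumptions, and a realistic search would need fuller lists and manual filtering of literal uses.

```python
import re

# Assumption: a small, illustrative set of determiners.
DETERMINERS = r"(?:the|no|this|that)"

# Active use: an inflected form of "break", a determiner, then "ice".
ACTIVE = re.compile(
    rf"\bbr(?:eaks|eaking|eak|oken|oke)\s+(?P<det>{DETERMINERS})\s+ice\b",
    re.IGNORECASE,
)

# Passive use: determiner + "ice", an auxiliary, then the participle "broken".
# Assumption: only a handful of auxiliary forms are covered.
PASSIVE = re.compile(
    rf"\b(?P<det>{DETERMINERS})\s+ice\s+(?:is|was|will be|has been)\s+broken\b",
    re.IGNORECASE,
)


def find_variants(text):
    """Return (construction, determiner) pairs for each match in the text."""
    hits = [("active", m.group("det").lower()) for m in ACTIVE.finditer(text)]
    hits += [("passive", m.group("det").lower()) for m in PASSIVE.finditer(text)]
    return hits


# Sentences taken from examples (9), (10) and (11).
EXAMPLES = {
    9: "After talks, India and Pakistan break no ice on how to demilitarize "
       "the no-man’s land above the Siachen glacier.",
    10: "How to break this ice and how not to spurt out unwanted topics and "
        "place yourself in an awkward situation?",
    11: '"Thank God" said a Georgia representative, and the ice was broken.',
}

for number, sentence in EXAMPLES.items():
    print(number, find_variants(sentence))
```

Patterns of this kind match the surface string only; they cannot tell idiomatic from literal uses, so each hit still has to be classified by hand.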

2.2 Syntactic flexibility and semantic decomposition

Syntactic and lexical flexibility has been linked to semantic transparency. This is
an intuitively appealing approach, as it breaks down the hard boundary
between literal, freely composed and non-literal, possibly frozen language,
and could account straightforwardly for syntactic variations from the canonical
form. But this view is not universally accepted.
Sabban (1998) and Mel’čuk (1995) are among those who assert that all VP
idioms are non-compositional. Consequently, their morphosyntactic behavior is
that of simplex verbs and they can show variation only in the verb’s tense,
aspect and number (as far as it is compatible with the figurative meaning), as
well as negation and questioning of the entire VP.


Schenk (1995) categorically states that idiom chunks do not have meaning
but allows for some variation. He distinguishes two types of syntactic opera-
tions. The first does not affect meaning and can operate on meaningless expres-
sions such as idioms. These operations comprise raising, passivization and yes-
no-questioning.8 The second kind of syntactic operations distinguished by
Schenk apply to meaningful expressions only. They include topicalization, rais-
ing, control structures, clefting, pseudo-clefting, modification, relativization,
pronominalization and wh-questioning. Since Schenk considers all idiom components
to be semantically unanalyzable, idioms cannot appear in these syntactic
configurations. However, all data cited by Schenk are constructed, and
throughout this paper we will cite corpus examples showing that speakers
produce syntactically modified idioms in ways that Schenk would fail to predict.
Everaert (2010) considers the lexical representation of idioms within the
framework of generative grammar. He argues that syntactic flexibility is not
tied to semantic transparency. Rather, the properties of a given idiom compo-
nent are connected to those of all senses of the same word form in the lexicon,
and these senses are always available. For example, the lexical encodings of kick
and bucket in their use as idiom components share properties of the literal
meanings of these lexical items, and variations like passivization are licensed,
as they are for the non-idiomatic senses. However, Everaert’s analysis also
entails that the aspectual properties of kick are retained in the idiomatic use,
and he specifically dismisses structures like he kicked the bucket slowly as being as
ill-formed as he kicked the ball slowly. But speakers do produce such data,
indicating that the lexical entry of the idiom component kick is not simply
merged with non-idiomatic senses of that verb:

(14) Our computer here at home slowly, ever so slowly, kicked the bucket.
asksistermarymartha.blogspot.com/2008/11/its-alive.html

Abeillé (1995) argues that while some idioms are semantically decomposable,
decompositionality is not systematically associated with, and does not predict,
syntactic flexibility. Working within a Tree Adjoining Grammar (TAG), where
idioms are represented as elementary "frozen" trees associated with a semantic

8 Bargmann and Sailer (2015) similarly separate purely syntactic operations from those that are
semantically motivated. They take a crosslinguistic perspective on the syntactic flexibility of
non-decomposable idioms and argue that the German obligatory verb-second syntax in declarative
sentences allows non-referential nominal idiom chunks to be fronted in topicalization and
passivization while remaining “semantically neutral”. By contrast, in English such dislocated
NPs are claimed to be topics and thus topicalization and passivization for non-compositional
idioms is licensed only under the appropriate discourse conditions and information structure.


representation, Abeillé proposes that idioms follow the same syntactic rules as
corresponding non-idiomatic structures, a position that will be argued for, and
supported by corpus data, in this paper as well, though not within the grammatical
framework assumed by Abeillé.
A comprehensive proposal regarding the correlation between semantic
decompositionality and syntactic flexibility is made by Nunberg et al. (1994),
who examine a large number of English idioms and argue that the majority are
in fact semantically decomposable. To distinguish the semantic compositionality
of freely generated phrases like pull the rope and knot strings from idioms like
pull strings, they introduce the notion of “idiomatically combining expression,”
whose parts carry conventionalized meanings, specific to the idiom. Thus,
pull strings derives its meaning (roughly, “exploit personal contacts”) from
directly identifiable correspondences between its constituents and their idiom-
specific meanings (pull = exploit, strings = personal contacts). “Idiomatically
combining” refers to the fact that neither pull nor strings carry these meanings
outside of the idiom. Other idiomatically combining phrases are spill the beans
and let the cat out of the bag. Such decomposition of idiomatic phrases is
consistent with analyses that posit metaphorical status for idiom components
like cat and strings, within the context of specific idioms (Gibbs and Nayak 1989;
Glucksberg 1993; Geeraerts 1995). Nunberg et al. (1994) argue that the semantic
interpretation of idiom constituents allows syntactic operations on these consti-
tuents, such as topicalization, pronominalization and VP ellipsis.
Nunberg et al. (1994) state that in contrast to the constituents of idiomati-
cally combining expressions, the components of “idiosyncratic phrasal construc-
tions” like kick the bucket and saw logs are not semantically interpreted, though
the meaning of saw logs may be more intuitively apparent than that of kick the
bucket, as it suggests the kind of sounds a sleeper may make. Idiosyncratic
phrasal constructions, whose constituents are not metaphors, do not show syntactic
flexibility, according to Nunberg et al. However, structures that Nunberg
et al. rule out, such as the passivization of kick the bucket, are attested, indicat-
ing that semantic transparency is not sufficient to account for the flexibility of
idioms.
Recognizing that idioms are more flexible than often claimed, Kay et al.
(2012) propose a lexical theory of idioms, citing rich attested data. They conclude
that semantically compositional idioms (like let the cat out of the bag) are
flexible, and, conversely, that the constituents of inflexible idioms do not receive
a semantic interpretation. However, the data we retrieved from corpora and
report on in Sections 5.2, 6.1 and 7.2 show that speakers also modify idioms
that are not semantically compositional.


3 Why do acceptability judgments differ across speakers?
The rich literature on idioms focuses on relatively few idioms but it is full of
disagreements about the acceptability of data involving their syntactic flexibility
and lexical modification, even when judgments are not motivated by a broader
theory of grammar. We ask what the possible sources of these disagreements are.

3.1 Frequency of the canonical form

Corpus analyses show that the canonical or citation form is the most frequent and
thus the most familiar one. It may well reflect the way the idiom is represented
in speakers’ mental lexicon.9 Put differently, variations are relatively infrequent
and unfamiliar, hence speakers may reject them, especially when considered
outside of a context.

3.2 Different decomposition analyses may account for divergent analyses

Idiom components like the verbs and nouns in pull strings and spill the beans
readily lend themselves to semantic interpretation and paraphrases of the
idioms like ‘use personal connections’ and ‘reveal a secret.’ Tying such semantic
compositionality to syntactic flexibility and modification is intuitively convin-
cing. However, there is widespread disagreement about the compositionality of
many idioms and, relatedly, their flexibility. Speakers differ in the way they assign
meaning to idiom components and to entire idioms, and different paraphrases
and mappings of idiom components to metaphoric readings may account for the
divergent judgments of constructed data that one finds in the literature.
Gibbs (1995) makes an important point in arguing that speakers do not
access the same invariant, literal meaning when they encounter a word, and
that one cannot assume that idioms or their components have easily determined
literal meanings. Indeed, speakers do not always agree on the precise meaning
of an idiom or on the interpretation of idiom components, and this may affect
their acceptability judgments. For example, Abeillé and Schabes (1989) para-
phrase grist for someone’s mill as ‘help.’ This interpretation precludes separate

9 I thank an anonymous reviewer for making this point.


meanings for grist and mill, unlike a more specific interpretation of this idiom
that includes a reference to someone’s particular situation or agenda (the mill)
and the entity or event that has a favorable effect on it (the grist). Examples (15)
and (16) suggest such an interpretation:

(15) Each detail that leaks out becomes grist for the Democrats’ mill
https://www.washingtonpost.com/blogs/right-turn/wp/2017/06/19/how-
will-we-miss-congress-if-it-doesnt-go-away/?noredirect = on&utm_term = .
2f4b2c45e5a8

(16) That makes grist for the Democrats’ mill and they are grinding it night and day.
https://newspaperarchive.com/austin-daily-herald-sep-07-1957-p-15/

3.3 Semantic re-analysis

The distinction between composing (metaphorical) and non-composing idioms
is not always clear-cut. (15) and (16) above suggest that, in specific contexts,
speakers often interpret a constituent as referring and modify the canonical form
of the expression accordingly. For example, the idiom fall off the wagon means
‘resume the habit of drinking alcohol.’ Its constituents cannot be mapped in a
one-to-one fashion on the literal meaning, though the Prepositional Phrase on
the wagon could be interpreted as ‘in a dry/abstemious state.’ (17)–(20) show
that speakers extend the meaning of this noun to include other forms of aspira-
tional habitual behavior:

(17) I fell off the daily blog wagon.
http://sanguinaryblue.blogspot.com/

(18) Falling Off The Exercise Wagon:
https://www.realmomnutrition.com/falling-off-exercise-wagon/

(19) I fell off that wagon for a year or so, but drank decaf for 4 years before that.
https://twitter.com/MarkMaddenX/status/959100902329241602

(20) before I fell off that wagon and started smoking again
https://www.quora.com/What-happens-to-addicted-people-when-they-enter-
a-long-coma


The specific meanings of wagon here vary across speakers, and the noun
appears to have undergone a change from its constituency in the monolithic
VP to an independent metaphor referring to any unhealthy or undesirable habit.
Similar to (17)–(20), (21) and (22) are examples of semantic re-analysis where
the noun in face the music is assigned a meaning (an unpleasant situation) that
is interpreted relative to a specific context:

(21) There is some responsibility; he might have to face that music, that much is
sure, but not murder or manslaughter charges.
https://books.google.com/books?id = 3S0BAAAQBAJ&pg = PT223&lpg =
PT223&dqx = %22face + that + music%22&source = bl&ots = Rqx3iRMD
G2&sig = USMcw9NfMKW9uf1CLYf5bmZ5EMA&hl = en&sa = X&ved = 2ahU-
KEwimjqDhie7eAhXG3VMKHbgtDf44ChDoATAJegQIARAB#v = onepage&q-
= %22face%20that%20music%22&f = false

(22) Harold Wilson had to face this music in 1967, and Callaghan and Healey
needed the IMF to bail them out in 1976
https://www.terrafirma.com/an-alternative-perspective-article/items/its-
not-over-yet-the-implications-of-the-credit-crunch.html

Idioms like beat around the bush, sit on the fence, and be on one’s high horse are
considered non-composing; the entire VP refers to a specific situation, form of
behavior or attitude. Given a context where this situation, behavior or attitude
is known to the interlocutors, we find the nouns preceded by a demonstrative:

(23) He asked if I would go into psychology, and rather than beat around that
bush again, I said yes.
https://books.google.com/books?id = m5HrCQAAQBAJ&
pg = PT427&lpg = PT427&dq = %22rather + than + beat + around
+ that + bush + again%22&source = bl&ots = LM54CrPmhc&sig
= G4tp1GOEINhDUgFfGf8vkqyquJM&hl = en&sa = X&ved = 2ahUKEwivkb-
u0puveAhUQ0VMKHcMgAa0Q6AEwAHoECAAQAQ#v = onepage&q = %
22rather%20than%20beat%20around%20that%20bush%20again%
22&f = false

(24) Thinking About a New Home? Don’t Sit On That Fence Too Long!
https://www.facebook.com/Sharri.Abii.Realtor/photos/hey-youdont-sit-
on-that-fence-too-long-if-youre-considering-buying-a-house-in-20/
777224589135660/


(25) You had better come down from that high horse, and own up that you set the
Maud afire.
https://www.gutenberg.org/files/23351/23351-h/23351-h.htm

3.4 Context

Another source of disagreement among native speakers concerning the acceptability
of non-canonical forms of idioms may well be the lack of contexts in
which such forms can occur. Linguists who construct data in support of a theory
often fail to include appropriate syntactic or lexical contexts that would make
the data acceptable (Bargmann and Sailer 2015 are a notable exception). The
need for context is a question addressed by Tabossi et al. (2009). They perform
four experiments in support of the claim that general syntactic constraints of a
language apply to idioms, and that their syntactic behavior is not idiosyncratic
but rather systematically sensitive to context and pragmatics, just like non-
figurative language. Judgments about the acceptability of idioms in their non-
canonical form generally support this claim, even for invented idioms or idioms
unknown to speakers of certain dialects.
The findings of Tabossi et al. (2009) are important, as they are grounded in
experiments rather than driven by theory or intuition. However, there are
potential concerns with the data and method. First, their "invented" idioms
are very similar to existing English idioms–either outright translations or simple
lexical variations of English idioms. Several are arguably idiomatically compos-
ing phrases (in the sense of Nunberg et al. 1994), so their syntactic flexibility is
not surprising, as it might simply be the result of analogy with known idioms
and corresponding metaphoric interpretations of specific components. Second,
the syntactic operations that were most often accepted by the participants in the
experiments were adverb insertion and adjectival modification as examined by
Ernst (1981); both operations are semantically external and modify the idiom as
a whole rather than individual constituents. Such adjective modification in
particular has been shown to be generally applicable to all types of idioms
and does not contribute much to our understanding of the syntactic flexibility
of idioms.
To demonstrate the contextual effect on idiom variability, Tabossi et al.
(2009) embed the target idioms in constructed sentences that precede the
syntactically modified idioms. However, in the examples given, the context
does no more than facilitate the figurative, rather than the literal, reading; the
particular syntactic structure (e.g. topicalization) is not always motivated by
the information structure of the context. Noteworthy is the fact that the judgments
of the participants in the experiments were rated as “correct” when they agreed
with a majority of pre-classified judgments, and that there never was full or
nearly full agreement among the participants. This suggests that acceptability
judgments differ across speakers, at least for "invented" idioms and contexts.

4 Corpus data
Tabossi et al. (2009) importantly emphasize the need for context when accept-
ability judgments are elicited. To better support their claim that, given appro-
priate contextual embedding, syntactic flexibility is available to idioms just as it
is to freely composed phrases, we examine data attested in corpora. By doing so,
we do not ask how speakers judge a given structure but merely analyze what
speakers produce.

4.1 The corpora


We focus on attested German and English data. For the English data, we
consulted the Web, using Google. The German examples were mined from a
corpus, the Digitales Wörterbuch der Deutschen Sprache (Digital Dictionary of the
German Language), (Geyken 2007).10 This corpus, compiled between 2003 and
2007 at the Berlin-Brandenburg Academy of Sciences, included slightly over a
billion words of running text at the time when the data for this paper were
collected. The corpus comprises a core corpus and an extended corpus. The core
corpus contains approximately 100 million running words from 80,000 docu-
ments, balanced across time and genre (including literary works, newspaper
texts, scientific writings, and a wide range of non-fiction texts). The Digitales
Wörterbuch der Deutschen Sprache was designed to constitute the German
equivalent of the British National Corpus, a standard resource for linguists
working on English. The extended corpus contains over 900 million word
tokens. Unlike the core corpus, it is “opportunistic,” i.e. the texts are not
balanced but were acquired based on their ready availability. The bulk of the
data come from freely accessible German newspapers published between 1993
and 2007. Both corpora are parsed and part-of-speech tagged. They were queried

10 A URL will be provided for each English example, while the German data are all from the
corpus described in Geyken (2007) and accessible via http://kollokationen.bbaw.de/htm/idb_
de.html.


with a linguistic search engine designed for this purpose (Geyken and Sokirko
2007; Herold 2007).

4.1.1 The German idiom bank

Fellbaum (2007b) and Neumann et al. (2004) report on the creation of the
database of German idioms found in the corpus
(http://kollokationen.bbaw.de/htm/idb_de.html). A target list of 817 German
idioms was manually created and the corpus was searched with the goal of
extracting morphosyntactic and lexical variations.11 Importantly, the searches
did not target any pre-selected variations. Regular expressions written
explicitly for this purpose (Herold 2007) allowed for the retrieval of a
maximal number of variations of the target idioms. The queries focused on
those constituents of an idiom that are arguably its core lexemes (equivalent
to, for example, English bite and bullet) but allowed for lexical divergence
from the dictionary form such as compounds and semantically related words like
synonyms, as extracted from a thesaurus-like resource.
The regular expressions moreover were designed to capture all inflected and
derivational forms. Furthermore, the order of the lexemes was not specified, so
that structures like passive and topicalization were retrieved. Allowing for vari-
able distance between the lexemes also returned such variations as adjectival
modification of nouns.
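
The design just described can be illustrated with a minimal Python sketch. It does not reproduce the actual queries or the query language of the corpus search engine (Herold 2007); the patterns NOUN and VERB_FORMS, the function matches_idiom and the example sentences are all invented for illustration, and a real system would rely on a lemmatizer rather than a hand-written list of verb endings.

```python
import re

# Assumption: any token containing "horn" counts as a candidate noun,
# so compounds and variants of the noun are retrieved as well.
NOUN = re.compile(r"\b\w*horn\w*\b", re.IGNORECASE)

# Assumption: a hand-written approximation of inflected forms of "jagen".
VERB_FORMS = re.compile(
    r"\b(?:ge)?jag(?:en|e|st|t|te|test|ten|tet|end)?\b", re.IGNORECASE
)

def matches_idiom(sentence: str) -> bool:
    """Return True if a horn-noun and a form of 'jagen' co-occur.

    The two patterns are searched independently, so neither the order
    of the lexemes nor the distance between them is constrained.
    """
    return bool(NOUN.search(sentence)) and bool(VERB_FORMS.search(sentence))

# Invented example sentences, not corpus data.
examples = [
    "Er ließ sich nicht ins Bockshorn jagen.",   # canonical order
    "Ins Bockshorn wurde er nicht gejagt.",      # passive, noun first
    "Niemand jagte ihn ins Hasenhorn.",          # variant noun
    "Das Bockshorn lag auf dem Tisch.",          # noun without the verb
]
for sentence in examples:
    print(matches_idiom(sentence), sentence)
```

Because the two patterns are checked separately, the sketch retrieves passives and topicalized structures along with the canonical order, at the price of literal false positives that would still have to be sorted out by hand.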
An example is jemanden ins Bockshorn jagen (lit. chase somebody into the
buck’s horn, ‘intimidate’). The noun Bockshorn is considered a lexeme that does
not occur outside the idiom and that carries no meaning as an idiom constituent.
Searching for Bockshorn alone yielded 285 hits. Searching with the regular
expression

(26) *horn && jagen !@Bockshorn

leaves the verb unspecified and allows for variation in the noun, and produces
twenty-seven hits, some with variant spellings of the verb and the noun as well
as with a different, semantically similar verb. It also yielded a noun Hasenhorn

11 Fazly et al. (2009) and Zhu and Fellbaum (2015) represent efforts to automatically extract
idiomatic expressions from a corpus, based on statistical measures of co-occurrence of tokens.
By contrast, we proceeded from a predefined set of frequent and familiar English and German
Verb Phrase idioms.


(lit. ‘hare’s horn’), with the substitution of a context-specific semantically related
noun.
A second query, as in (27), yielded tokens with compounded variants of the
noun, such as Bockshornklee ‘buck’s horn clover’. This noun, which has a non-
figurative meaning referring to a plant, was mentioned in the same context; its
use in the idiom is thus a context-specific lexical variation. Another context-
specific variant was Gruselhorn, lit. ‘scary horn,’ which reflects the meaning of
the idiom (‘to intimidate/scare someone’).

(27) ((Bockshorn* && !@Bockshorn) || *bockshorn)

The idiom was found most often in the context of a negation (141 tokens vs. 36
affirmative tokens), suggesting its “canonical” status as a Negative Polarity
Item.
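
As a small worked illustration, the share of negated tokens can be computed directly from the two counts reported above. The variable names are ours; only the counts come from the text.

```python
# Token counts reported in the text for jemanden ins Bockshorn jagen.
negated = 141
affirmative = 36

total = negated + affirmative
negated_share = negated / total

print(total)                   # 177
print(f"{negated_share:.1%}")  # 79.7%
```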
The flexible search queries also allowed for variations of prepositions
that were part of the canonical structure. The retrieved tokens were manually
sorted into true positives (with an idiomatic reading) and false positives,
sequences containing one or more of the core constituents but with a literal
interpretation. Across all idioms, about half of the tokens received a literal
reading. There was variation: tokens with idiom-specific lexical items like
Bockshorn were, unsurprisingly, used idiomatically in most cases.
In a few cases, the linguist sorting the tokens concluded that both
idiomatic and literal readings were possible.
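
The manual sorting step can be pictured as a simple tally. The following is a minimal sketch with invented judgments, not data from the idiom bank; the label names idiomatic, literal and ambiguous and the function tally are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical manual judgments for the tokens retrieved for one idiom.
# These labels are invented for illustration and are not corpus data.
judgments = ["idiomatic", "idiomatic", "literal", "idiomatic", "ambiguous", "literal"]

def tally(labels):
    """Count each label and return (counts, share of idiomatic readings)."""
    counts = Counter(labels)
    total = sum(counts.values())
    idiomatic_share = counts["idiomatic"] / total if total else 0.0
    return counts, idiomatic_share

counts, share = tally(judgments)
print(counts["idiomatic"], counts["literal"], counts["ambiguous"])
print(f"{share:.0%} of the retrieved tokens are true positives")
```

Keeping a separate label for tokens that allow both readings avoids forcing them into either the true-positive or the false-positive class.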
For each target idiom, a manually created entry in the database shows
the kinds of attested variations and their frequency in the corpus. The
“canonical” form, following the structure given in (1) for most idioms, was
by far the most frequent in all cases. The frequency of non-canonical varia-
tions of a given idiom differed across the idioms, as did the number of
retrieved examples with a specific type of variation. Most variations for a
given idiom were in the single digits; for some idioms, no syntactically
specified variation was found, but equivalent variations were found for
other idioms with a similar syntactic canonical form. All data can be accessed
on the website http://kollokationen.bbaw.de/htm/idb_de.html.

5 German data: Representative examples


We report on a few representative cases of idiom variations found in the German
corpus.


5.1 Variation in decomposable idioms

We take Nunberg et al.’s (1994) observation of the systematic correlation
between the semantic transparency of idioms and their syntactic flexibility
to be uncontroversial. The distinction between literal and metaphoric inter-
pretation is often blurry, and interpreting an idiom component as a metaphor
likely licenses its syntactic freedom. While Nunberg et al.’s data are con-
structed, Moon’s (1998) large corpus-based investigation supports this
hypothesis with attested data. She cites examples of relativization (that is a
bullet on which the Arthur Golds of this world have steadfastedly failed to bite)
and pronominalization (if there is ice, Mr Clinton is breaking it); in each case,
the extracted noun phrase is a semantically interpretable metaphor (bullet,
ice); other examples include passives such as the hatchet was buried.
Semantically transparent constituents are also subject to number variation,
change in definiteness and negative polarity, adjectival modification and
lexical category change. Data from the German idiom bank show many
similar syntactic variations in German idioms, as well as clefting and
topicalization.

5.2 Variation in non-compositional idioms

The flexibility of metaphoric idiom constituents is both unsurprising and supported
by corpus data. More interesting, and subject to far more disagreement
found in work on idioms, are variations in non-compositional idioms. Examples
from the German idiom bank show the same wide range of syntactic and lexical
modification. We illustrate this with a frequently encountered German VP idiom,
given in its citation form in (28):

(28) kein Blatt vor den Mund nehmen
no leaf/sheet in front of the mouth take
take no leaf (or sheet) in front of one’s mouth
‘be outspoken, speak one’s mind, speak openly’

Like many idioms, this is a negative polarity item. Its origin–a very old custom
whereby actors in the theater covered their faces so as to remain anonymous and
protected from possible prosecution for using obscene or provocative language–
is unknown to everyday contemporary speakers. Mund ‘mouth’ is assigned a
meaning; it may refer to the mouth of the typical subjects of this VP (speakers)
or, metonymically, to their words. This noun can therefore be modified by an
adjective, as in (29):12

(29) Da nimmt der Redner schon einmal kein Blatt vor
There takes the speaker already once no sheet in front of
seinen republikfeindlichen Mund
his republic-hostile mouth
there the speaker places no sheet in front of his republic-hostile mouth
‘the speaker does not hide his hostility to the republic’

By contrast, no meaning can be readily assigned to Blatt in the context of the
idiom. This noun has several related but distinct literal meanings: ‘leaf [of a
plant],’ ‘sheet [of paper]’ and ‘newspaper, publication’ (the latter is likely unre-
lated to the origin of this phrase). Although it cannot be assigned a meaning
within the idiom, speakers produce syntactic variations involving Blatt.
(30)–(32) are examples of passivization, pseudo-clefting and
relativization:

(30) Bei BMW wird kein Blatt vor den Mund genommen
at BMW is no sheet in front of the mouth taken
at BMW no sheet is taken in front of the mouth
‘people at BMW speak out openly’

(31) Was der Sprecher nicht vor den Mund genommen
what the speaker not in front of the mouth taken
hat war ein Blatt
has was a sheet
what the speaker did not take before his mouth was a sheet

(32) Das Blatt, das er vor den Mund genommen hat…
the sheet that he in front of the mouth taken has
the sheet that he put in front of his mouth

In (33), the writer plays on the polysemy of Blatt: the pronoun in the idiom refers
back to the antecedent with the “newspaper” reading:

12 Note that the adjectival modification here is not of the external kind studied by Ernst (1981),
i.e. hostility to the republic strictly modifies the speaker (metonymically his mouth) and not the
entire sentence.


(33) Französisches Satireblatt nimmt keins vor den Mund: neue
French satirical paper takes none in front of the mouth: new
Karikaturen zu einem eher ungünstigen Moment
caricatures at a rather inconvenient moment
French satirical paper takes none before its mouth: new caricatures at a
rather inconvenient moment

In (34), the writer interpreted Blatt as ‘sheet’ and created the compound meaning
‘sheet of music’ in a musical context about the orchestra conductor Herbert
von Karajan, referring to his unrestrained performance:

(34) so nimmt der philharmonische Medizinalrat…kein Notenblatt
thus takes the philharmonic master no sheet music
vor den Mund
in front of the mouth
‘thus the philharmonic master…holds no sheet music in front of his mouth’

In (35), the polarity is reversed and the quantifier communicates a government
spokesman’s characteristic reluctance to speak out:

(35) Ein Regierungssprecher ist ein Mann, der sich
a government spokesman is a man who for-himself
100 Blätter vor den Mund nimmt
100 sheets in front of the mouth takes
‘a government spokesman is a man who puts 100 leaves before his mouth’

The quantification here does not entail a metaphoric reading of Blatt; the writer
expresses the opinion that a government speaker takes great care not to
speak too openly.

5.3 Idiom-specific lexemes

Some idiom components are not found outside of their use in idioms. An
example is gift horse in the composing phrase (not) look a gift horse in the
mouth. While the meaning of this compound noun is readily interpretable as
‘gift,’ corresponding to that of its first member, other lexemes are semantically
opaque constituents of non-composing idioms. An example is the common
German idiom in (36):


(36) ins Fettnäpfchen treten
into the little grease pot step
‘commit a gaffe/faux pas’

Fettnäpfchen, like gift horse, is an idiom-specific lexeme, and no corpus occurrences
were found outside this idiom. Unlike gift horse, Fettnäpfchen does not
lend itself to a metaphorical interpretation that contributes to the idiom’s mean-
ing. Yet corpus searches show numerous tokens where the noun is quantified or
modified:

(37) er trat in jedes Fettnäpfchen…
he stepped into every grease pot

(38) Berlusconi: ein Mann, viele Fettnäpfchen.
Berlusconi: one man, many grease pots

(39) Immer trat [er] ins bereitstehende
Always stepped [he] into the standing-by
Fettnäpfchen
grease pot
‘he committed a gaffe at every possible occasion’

Similarly, we find relativization, topicalization and quantification of Fettnäpfchen:

(40) Das Fettnäpfchen, in das [er] getreten ist, ist riesig
The grease pot into which [he] stepped has is huge
‘he committed a huge gaffe’

(41) Ins Fettnäpfchen trete ich bestimmt mal
Into the grease pot step I definitely at some point
‘I'll definitely commit a faux pas at some point’

Such data attest to variations on the “canonical” form of the idiom and the
modification of a non-referring constituent.

6 English data
We now turn to English data, focusing on non-compositional idioms, retrieved
from the Web.


6.1 Syntactic variations in non-compositional idioms

Kick the bucket is perhaps the English idiom par excellence, and it is cited in
many papers as the prototypical frozen idiom whose constituents do not receive
any semantic interpretation. Thus a frequently encountered claim is that the
bucket was kicked last night can only receive a literal interpretation (Nunberg
et al. 1994; Glucksberg 1993; inter alia), under the reasoning that neither the verb
nor the noun can be assigned a meaning as parts of the idiom. However, a Web
search yields examples of passivization such as these:

(42) And no one here knows when the bell will toll or when the bucket will be
kicked.
https://cherylcapaldotraylor.com/2016/02/29/take-the-leap/

(43) Live life to the fullest, you never know when the bucket will be kicked.
https://www.puff.com/forums/vb/general-cigar-discussion/84813-avo-le05s-
what-do-what-do-3.html

These naturally occurring examples clearly contradict the claim that this idiom
is blocked from the passive construction (Nunberg et al. 1994; Schenk 1995; inter
alia). (42) and (43) have the flavor of impersonal passives, which can be formed
with intransitive verbs and a semantically empty “dummy” subject, as in (44):

(44) There was a lot of dying from infectious disease
https://fatburningman.com/interview-with-robb-wolf-author-of-the-paleo-
solution-podcast-video/

No dummy subject is needed in (42)–(43), where the object of the verb occupies
the subject position. Like there, it is semantically empty.13

Similarly, a synonymous idiom can be passivized:


(45) a song that Amy Winehouse sang it before her clogs were popped?
http://blog.thecatsdiary.com/2012/05/14/

13 Tabossi et al. (2009) rule out the passive for the idiom miss the boat based on similar
reasoning. But numerous examples can be found on the Web, such as It seems the boat was
missed in not building a fertilizer plant right at the new sewage treatment plant. (https://cityroom.
blogs.nytimes.com/…/turn-piles-of-waste-into-piles-of-cash-city-asks). These have the same
impersonal passive flavor.


We also find the idiom in nominalized forms, with adjectival modifiers and
quantifiers, and as a member of a compound:

(46) Petra prays for her hateful hubby’s untimely kicking of the bucket
http://gone-and-forgotten.blogspot.com/2014/10/truly-gone-forgotten-
foes-gorna-lord-of.html

(47) I am young but have experienced more bucket kicking within my immediate
family and circle of family friends than I can shake a fist at
https://www.straight.com/confessions/1505/live-kicking-bucket

(48) here’s a short list of things I hope to continue to avoid from now until bucket
kicking time.
https://www.boothbayregister.com/article/my-non-bucket-list/7624

6.2 Variation of the determiner

The definite determiner is usually restricted to nouns with specific or
generic reference. Its use is strikingly frequent in English VP idioms, both
when the noun arguably refers (spill the beans, take the bull by the horns)
and when it does not (kick the bucket, hit the ceiling, shoot the breeze, chew
the fat).
In composing idioms, replacing the definite with another kind of determiner
or a possessive can preserve the idiomatic reading; this is consistent with the
often-made observation that referring nouns are generally available for morpho-
syntactic and lexical operations:

(49) Now he (Mr. H.) could also let a cat out of the bag – to show how the
opposition jumps
https://books.google.com/books?id = LqpCAQAAMAAJ&pg = PA639&lpg =
PA639&dq = %22let + a + cat + out + of + the + bag%22&source = bl&ots =
Fcm5pIrqvq&sig = I7lKDjB6kQqj6eyExunKBZtqnak&hl = en&sa = X&ved = 2-
ahUKEwjQu5PuqvLeAhWNrFMKHTiC70Q6AEwDHoECAMQAQ#v = onepag-
e&q = %22let%20a%20cat%20out%20of%20the%20bag%22&f = false

(50) Trump spilled his beans that he has no intention of releasing his tax returns
when he slipped up and said, “And only a fool would give a tax return…“
https://www.redstate.com/california_yankee/2016/04/01/trump-slipped-
spilled-beans-releasing-tax-returns/


The definite determiners are often replaced by the demonstrative determiners
that and those in non-composing idioms, where the NP does not refer:

(51) We are tired of you, so hurry up and kick that bucket
https://books.google.com/books?id = SHWHoFpISvIC&pg = PA27&lpg
= PA27&dq = %22hurry + up + and + kick + that + bucket%22&source
= bl&ots = EyP9u_kJit&sig = hu_gVT5rnwJajmmqGCH06xcnng8&hl = en&-
sa = X&ved = 2ahUKEwiVyOOsvfLeAhXnYN8KHQzPDJYQ6AEwAHoECAUQ-
AQ#v = onepage&q = %22hurry%20up%20and%20kick%20that%20bucket%
22&f = false

(52) You can study hard, burn that midnight oil, and attempt another A plus.
https://books.google.com/books?id = 9-ScBwAAQBAJ&pg
= PT145&lpg = PT145&dq = %22burn + that + midnight + oil, + and + attempt
%22&source = bl&ots = AoadDOh_yd&sig = t1NWImhdkTmjqg-
AcP0fKgQiF0E&hl = en&sa = X&ved = 2ahUKEwih-Pb5vfLeAhWig-
AKHeYKDF4Q6AEwAHoECAEQAQ#v = onepage&q = %22burn%20that%
20midnight%20oil%2C%20and%20attempt%22&f = false

(53) We had the hardest time trying to get that boy to hit those books. But, we
succeeded. He ended up going to engineering school.
https://books.google.com/books?id = MNprioytNPYC&pg = PA101&lpg
= PA101&dq = %22hit + those + books%22&source = bl&ots =
tlDgmkRuJv&sig = QDuHjD9u7_ZERo3y6yfebCfNcs8&hl = en&sa = X&ved = 2-
ahUKEwixicfnvvLeAhWBmOAKHSlRB7s4ChDoATAIegQIAhAB#v = onepag-
e&q = %22hit%20those%20books%22&f = false

The demonstratives here are not deictic but serve to establish a
common ground and a shared perspective among the speakers; this use of
demonstratives is explored in Lakoff (1974), though she did not consider the
kind of non-referring nouns we find in the idioms.

7 Lexical variation: Word play or systematic?


Sabban (1998) and Mel’čuk (1995) are among those who consider many uses of
idioms that deviate from the canonical form to be “word play” or “artistic
deformations,” not to be included in any theory of lexicology. No clear definition
of word play is offered, but the data given involve the substitution of
canonical idiom components with a lexical item that is specific to the context;
such substitutions typically achieve a humorous effect. These modifications
indicate that speakers have a representation of the canonical form,14 and that
the modification is deliberate. Importantly, the data show that the lexical sub-
stitutions are systematic, similar to paraphrases and puns.

7.1 Substitution of paradigmatically related words


Speakers substitute lexemes that are paradigmatically related to the literal
meaning of an idiom constituent, such as synonyms, hypernyms (superordinates)
and hyponyms (subordinates), as well as compound phrases.
Consider the Web examples below, based on the common idiom let the cat out of
the bag. Examples (54) and (55) show that the speaker accessed the concept ‘cat’ and
substituted context-appropriate words that are semantically related to the literal
meaning of cat, suggesting a semantic interpretation of that idiom component. The
substituted words refer to more specific cats, such as lion or tiger, and evoke secrets
(the figurative meaning) with similar properties (large, dangerous, aggressive):

(54) To say that Simone had let the cat out of the bag was an understatement.
She’d let a lion out of the bag.
https://books.google.com/books?id = fIcRCPvI1Y4C&pg = PA157&lpg
= PA157&dq = %22She%27d + let + a + lion + out + of + the + bag%
22&source = bl&ots = bp4JfjsEA3&sig = tS7G25acTKwgLyyDSHFcAQ52GI&hl =
en&sa = X&ved = 2ahUKEwiR2enWqd7eAhWLZd8KHa6_An4Q6AEwAHoECA-
AQAQ#v = onepage&q = %22She’d%20let%20a%20lion%20out%20of%20the
%20bag%22&f = false

(55) Barack Obama got himself in trouble when he let the tiger out of the bag
about how he wanted to “spread the wealth around.”
https://startthinkingright.wordpress.com/tag/we-are-all-socialists-now/

Further examples of substitutions with semantically related words are (56) and (57):

(56) Kelly Rowland appears to have let the kitten out of the bag. The recently
married singer has been the subject of pregnancy rumors for a while
https://www.eonline.com/news/550004/is-this-kelly-rowland-announcing-
that-she-s-pregnant-see-the-pic

14 I thank an anonymous reviewer for this observation.


(57) when you asked him to share his knowledge with you he let the Angora out of
the bag?
https://books.google.com/books?id = sYxFAQAAMAAJ&pg = PA15&lpg
= PA15&dq = %22he + let + the + Angora + out + of + the + bag%22&source
= bl&ots = oKTSBEiXcY&sig = 8WpFIuIwOHDdt3DrLXtxrPUC0qk&hl = en&-
sa = X&ved = 2ahUKEwivnvjMqt7eAhUSVd8KHbbhD1AQ6AEwAHoECAAQ-
AQ#v = onepage&q = %22he%20let%20the%20Angora%20out%20of%
20the%20bag%22&f = false

Speakers also clearly access the phonological form of the idiom constituents, as
the following examples from the Web, alluding to the secret sexual adventures
of the golfer Tiger Woods (whose first name is homonymous with the animal)
and the actor Tom Cruise, show:

(58) More importantly, now that the Tiger is out of the bag, what purpose does it
serve not to release an official photo? [of President Obama playing golf with
T.W.]
https://theweek.com/articles/467558/5-ways-looking-obamas-secret-golf-
game-tiger-woods

(59) Nevertheless, it’s quite a shame that someone let the Tomcat out of the bag
http://www.mtv.com/news/2760121/tom-cruise-steals-tropics-thunder/

The example below involves a different sense from the feline one:

(60) Let that sex kitten out of the bag. You deserve a good romp.
https://epdf.tips/house-of-lies.html

Such data suggest that speakers access the literal meanings of idiom constituents
along with their semantic and formal (phonological) properties and support propo-
sals by psycholinguists on independent grounds (Cutting and Bock 1997; inter alia).
To preserve the idiomatic meaning, the substitution of a “canonical” idiom con-
stituent with another lexical item must be restricted to semantically similar lexemes.
Cat is semantically similar to dog. Word association norms show that the
response rate of dog to the stimulus cat is 66.7%, and 55.1% of responses to the
stimulus dog are cat (Moss and Older 1996). The semantic similarity of this word
pair is also reflected in the high similarity of their semantic vectors, which reflect
shared contexts (for example, Boyd-Graber reports a similarity score of 0.9).15

15 http://www.umiacs.umd.edu/~jbg/teaching/CMSC_726/13b.pdf
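
The similarity of two semantic vectors is commonly measured as the cosine of the angle between them. The following minimal sketch shows that computation on invented toy vectors; the four context dimensions and all counts are hypothetical and are not intended to reproduce the score cited above.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented co-occurrence counts over four hypothetical context words
# (pet, fur, bark, purr); these numbers are not taken from any corpus.
cat = [4.0, 3.0, 0.0, 2.0]
dog = [4.0, 3.0, 2.0, 0.0]

print(round(cosine_similarity(cat, dog), 3))
```

Words that share most of their contexts, like the two toy vectors here, receive a score close to 1, while words with no shared contexts receive a score of 0.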


Thus it is not surprising that we find speakers substituting dog for cat in the
idiom, in contexts where dog in one of its literal meanings is present:

(61) The Training Cesar’s Way Clinics and Fundamentals of Dog Behavior and
Training will continue at the Dog Psychology Centers in California and Florida,
and we have some nice surprises coming down the line, one of which I’m
particularly proud of, although I can’t let the dog out of the bag yet.
https://www.cesarsway.com/cesar-millan/cesars-blog/bark-to-the-future

(62) Not to let the dog out of the bag, but some of the creations involve jalapeno
poppers, mac and cheese and some everlovin’ BLT action. A Split Rail craft
beer will be paired with each of the gourmet hot dogs to create a unique
dining experience.
https://www.manitoulin.ca/elliotts-fundraiser-feature-gourmet-hot-dogs-
cause/

The first sentence alludes to canines, which are semantically similar to felines, whereas
the second alludes to hot dogs and thus involves homonymy rather than semantic
similarity.
The constraints on lexical substitutions need further study. For example,
we could not find a case where beans was replaced with semantically
similar words like peas or legumes, but we came across the following
sentences:

(63) How do you avoid bringing chemicals and poisons into your home while
keeping it pest free? We live for this stuff so spill the (green) beans.
https://www.younghouselove.com/ants-in-my-pans/comment-page-3/

(64) Selena Gomez’s mother…breaks silence about her daughter’s kidney trans-
plant. Now that Selena Gomez has spilled the kidney beans, her mom has
something to say.
http://oceanup.com/2017/09/18/selena-gomezs-mother-mandy-teefey-
breaks-silence-about-her-daughters-kidney-transplant/now-that-selena-
gomez-has-spilled-the-kidney-beans-her-mom-has-something-to-say-
feature/

Note that kidney in (64) has two different meanings and the similarity is pho-
nological rather than semantic.


7.2 Lexical modification of non-referring idiom constituents

We also see lexical modifications of non-referring constituents. In (65), a near-synonym
is substituted for the common constituent bucket:

(65) For our darling little Willy’s kicked the pail.
https://mudcat.org/detail_pf.cfm?messages__Message_ID = 2803634

The German idiom sich auf die Strümpfe machen lit. make oneself onto one’s
stockings, ‘get going or moving’ is often found with Strümpfe replaced by Socken
‘socks’. This common idiom is an example of what Nunberg et al. call “idiosyn-
cratic phrasal,” somewhat similar to saw logs: the constituents do not receive a
semantic interpretation, though the meaning of the entire phrase could perhaps
be guessed. The lexical variation involves two nouns, Strümpfe (stockings) and
Socken (socks), whose literal meanings are so similar that they are interchange-
able in many contexts.
Another kind of modification of idiom components is seen in compounding,
as in the example below:

(66) Addison Graham is hardly the first porn actor to move to the relatively quiet
desert town of Palm Springs after saying au revoir to the industry, but unlike
most retired adult industry veterans who throw in the bath towel after long
extensive careers, Graham was done with porn after only two years of
working.
(https://thehissfit.com/blog/my-interview-with-former-porn-star-addison-
graham/)

Bath towel does not receive a figurative meaning in the use of the idiom here,
but alludes to the state of undress associated with porn actors.

7.3 Zeugma

Although discussions of “word play” in the literature do not include zeugma,
the conjunction of canonical idiom components with non-idiomatic components
within a single idiom, such structures are relevant here. Kramer (2006) analyzes
some of the examples found in the German corpus. She identifies cases of
“interphrasal” zeugma, in which a single verb shared by two VP idioms takes two
noun phrase complements, each belonging to a different idiom. An example is (67):

(67) Der Dumme, der in den sauren Apfel beissen muss
The loser who into the sour apple bite must
und nicht selten auch ins Gras, ist der Hund.
and not seldom also into the grass is the dog.
‘The loser, who has to bite into the sour apple and not seldom also into
the grass, is the dog.’

In den sauren Apfel beissen is similar to the English idiom bite the bullet,
i.e. ‘bear the negative consequences,’ and ins Gras beissen, meaning ‘die,’
corresponds to bite the dust; hence the sentence can be translated as ‘the
loser who has to bite the bullet, and not seldom the dust as well, is the
dog.’
Another frequent type of zeugma, labeled “transphrasal” by Kramer (2006),
extends over multiple sentences. Here, an idiom component and a free NP
are arguments of the same verb, which may receive two readings, one literal and
one as an idiom component. Such cases of zeugma are characteristic of
journalistic prose.

(68) Ein Essen ist wie ein Konzert. Der Wein kann darin
A meal is like a concert. The wine can in it
die erste Geige spielen oder nur ein Begleitinstrument.
the first violin play or only an accompanying instrument.
‘A meal is like a concert. The wine can play first fiddle or only [be] an
accompanying instrument.’

(69) Die Frage, ob Mosimann einen Einfluss auf
The question whether Mosimann an influence on
die englische Küche habe, wird vom Gastrojournalisten Paul
the English cuisine has is by the food journalist Paul
Levy mit einem klaren “Oh yes” beantwortet: Der Schweizer
Levy with a clear “Oh yes” answered: The Swiss [man]
hatte als einziger den Mut, den Stier
had as single [man] the courage the bull
beziehungsweise die britische Küche bei den Hörnern
or rather the British cuisine by the horns
zu packen.
to grab.
‘The question as to whether Mosimann is influencing English cuisine is answered
by the food journalist Paul Levy with a clear “Oh yes”: The Swiss [cook] was the
only one with the courage to take the bull, or rather British cuisine, by the
horns.’

(70) Und weil er den Hals nicht vollkriegen kann, beisst
And because he the neck not full get can, bites
er zuerst in einen zarten Fasan im Speckmantel und
he first into a tender pheasant in bacon coat and
danach ins Gras.
afterwards into the grass.
‘And because he cannot get enough, he first bites into a tender pheasant
wrapped in bacon and then into the grass.’
‘And because he cannot get enough, he first bites into a tender pheasant
wrapped in bacon and then the dust.’

We conclude that “word play” is humorous, context-dependent and ad hoc,
but not arbitrary (Fellbaum 2015b). The data show that lexical substitutions
and noun compounding are highly systematic. In all cases cited, speakers
access the literal meaning of an idiom component, whether or not it is a
metaphor and carries meaning within the idiom. Zeugma similarly shows that
both literal and figurative readings of idioms are accessed. In some cases of
zeugma, a constituent shared by two idioms, independent of whether it has
metaphoric status or not, may receive two different corresponding figurative
readings. This suggests that the shared component is accessed as a single
lexeme independent of, but linked to, several idioms. Importantly, in all
cases of lexical substitution and zeugma, the figurative meaning is preserved.
The data support findings by psycholinguists who argue for both a literal and
a figurative representation of idiom components in the mental lexicon
(Cutting and Bock 1997).
Because such “word play” is highly systematic, it cannot be dismissed as a
linguistic epiphenomenon and must instead be included in the study of idioms,
as argued in Fellbaum (2015b). The data presented here invite further investiga-
tion of the formal and lexical properties of “word play,” including the con-
straints on lexical substitution.

8 Summary and conclusion


We examined German and English corpus data for syntactic and lexical varia-
tions of common idioms after establishing several criteria for "variation." The
data indicate that none of the previously offered accounts for the flexibility of
idioms is sufficient to accommodate the range of variations produced by speak-
ers, and many idioms claimed to be invariant are in fact subject to modification.
Speakers may differ in their assignment of meaning to idiom components, but
semantic transparency does not account for attested variations in non-compo-
sable idioms. Lexical modifications, sometimes dismissed as mere word play,
are highly systematic. The data suggest that in appropriate contexts, speakers
produce idiom variations consistent with the rules of freely composed language.

References
Abeillé, Anne. 1995. The flexibility of French idioms: A representation with lexicalized tree
adjoining grammar. In Martin Everaert, Erik-Jan van der Linden, André Schenk & Rob
Schreuder (eds.), Idioms: Structural and psychological perspectives, 15–42. Hillsdale, NJ:
Lawrence Erlbaum.
Abeillé, Anne & Yves Schabes. 1989. Parsing idioms in lexicalized TAGs. In Proceedings of the
fourth conference on European chapter of the Association for Computational Linguistics,
1–9. Stroudsburg, PA: Association for Computational Linguistics.
Bargmann, Sascha & Manfred Sailer. 2015. Syntactic flexibility of non-decomposable idioms,
Abstract. 4th general meeting of PARSEME, Valletta, Malta, March 19–20.
https://typo.uni-konstanz.de/parseme/images/WG1-Volume-Outlines/BARGMANN-SAILER-outline.pdf
Cowie, Anthony. 1998. Phraseology: Theory, analysis, and applications. Oxford: Oxford
University Press.
Cutting, Cooper & Kathryn Bock. 1997. That’s why the cookie bounces: Syntactic and semantic
components of experimentally elicited idiom blends. Memory and Cognition 25(1). 57–71.
Di Sciullo, Anna Maria & Edwin Williams. 1987. On the definition of word (Linguistic Inquiry
Monograph 14). Cambridge, MA: MIT Press.
Ernst, Thomas. 1981. Grist for the linguistic mill: Idioms and “extra” adjectives. Journal of
Linguistic Research 1(3). 51–68.
Everaert, Martin. 2010. The lexical encoding of idioms. In Malka Rappaport Hovav, Edit Doron &
Ivy Sichel (eds.), Lexical semantics, syntax, and event structure, 76–98. Oxford: Oxford
University Press.
Fazly, Afsaneh, Paul Cook & Suzanne Stevenson. 2009. Unsupervised type and token identifi-
cation of idiomatic expressions. Computational Linguistics 35(1). 61–103.
Fellbaum, Christiane. 2007a. The ontological loneliness of idioms. In Andrea Schalley & Dieter
Zaefferer (eds.), Ontolinguistics, 419–434. Berlin & New York: Mouton de Gruyter.
Fellbaum, Christiane. 2007b. Introduction. In Christiane Fellbaum (ed.), Idioms and colloca-
tions: From corpus to electronic lexical resource, 1–19. Birmingham: Continuum Press.

Fellbaum, Christiane. 2014. Non-syntactic idioms and phrases. In Tibor Kiss & Artemis
Alexiadou (eds.), Handbook of syntax, 776–802. Berlin & Boston: De Gruyter.
Fellbaum, Christiane. 2015a. The treatment of multi-word units. In Philip Durkin (ed.), Oxford
handbook of lexicography, 411–425. Oxford: Oxford University Press.
Fellbaum, Christiane. 2015b. Is there a grammar of idioms? Paper presented at the Brussels
Conference on Generative Linguistics (BCGL 8: The grammar of idioms), 4–5 June 2015,
Brussels, Belgium.
Fraser, Bruce. 1970. Idioms within a transformational grammar. Foundations of Language 6.
22–42.
Geeraerts, Dirk. 1995. Specialization and reinterpretation in idioms. In Martin Everaert, Erik-Jan
van der Linden, André Schenk & Rob Schreuder (eds.), Idioms: Structural and psycholo-
gical perspectives, 57–73. Hillsdale, NJ: Lawrence Erlbaum.
Geyken, Alexander. 2007. The DWDS Corpus: A reference corpus for the German language of the
twentieth century. In Christiane Fellbaum (ed.), Idioms and collocations: From corpus to
electronic lexical resource, 23–39. Birmingham: Continuum Press.
Geyken, Alexander & Alexey Sokirko. 2007. Classifying NVGs/FVGs in an interactive parsing
process. In Christiane Fellbaum (ed.), Idioms and collocations: From corpus to electronic
lexical resource, 41–53. Birmingham: Continuum Press.
Gibbs, Raymond. 1995. Idiomaticity and human cognition. In Martin Everaert, Erik-Jan van der
Linden, André Schenk & Rob Schreuder (eds.), Idioms: Structural and psychological
perspectives, 97–116. Hillsdale, NJ: Lawrence Erlbaum.
Gibbs, Raymond & Nandini Nayak. 1989. Psycholinguistic studies on the syntactic behaviour of
idioms. Cognitive Psychology 21. 100–138.
Glucksberg, Sam. 1993. Idiom meaning and allusional content. In Cristina Cacciari & Patrizia
Tabossi (eds.), Idioms: Processing, structure, and interpretation, 3–26. Hillsdale, NJ:
Lawrence Erlbaum.
Grimshaw, Jane & Arnim Mester. 1988. Light verbs and theta-marking. Linguistic Inquiry 19.
205–232.
Herold, Axel. 2007. Corpus queries. In Christiane Fellbaum (ed.), Idioms and collocations: From
corpus to electronic lexical resource, 54–63. Birmingham: Continuum Press.
Jackendoff, Ray. 1995. The boundaries of the lexicon. In Martin Everaert, Erik-Jan van der
Linden, André Schenk & Rob Schreuder (eds.), Idioms: Structural and psychological
perspectives, 153–165. Hillsdale, NJ: Lawrence Erlbaum.
Kay, Paul, Ivan Sag & Dan Flickinger. 2012. A lexical theory of phrasal idioms. Unpublished
manuscript. Stanford, CA: Stanford University.
Kearns, Kate. 2002 [1988]. Light verbs in English.
http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=C4DE6737920C5946FB4D814BD1B8EB33?doi=10.1.1.132.29&rep=rep1&type=pdf
Kramer, Undine. 2006. Linguistic lightbulb moments: Zeugma in idioms. In Christiane Fellbaum
(ed.), International Journal of Lexicography 19(4). 370–395.
Kuiper, Koenraad. 1996. Smooth talkers. The linguistic performance of auctioneers and
sportscasters. Hillsdale, NJ: Lawrence Erlbaum.
Lakoff, Robin. 1974. Remarks on ‘this’ and ‘that’. Proceedings of the Chicago Linguistic Society
(CLS) 10. 345–356.
Langlotz, Andreas. 2006. Idiomatic creativity: A cognitive-linguistic model of idiom-represen-
tation and idiom-variation in English. Amsterdam & Philadelphia: John Benjamins.

Lebeaux, David. 2000. Language acquisition and the form of grammar. Amsterdam &
Philadelphia: John Benjamins.
Mel’čuk, Igor. 1995. Phrasemes in language and phraseology in linguistics. In Martin Everaert,
Erik-Jan van der Linden, André Schenk & Rob Schreuder (eds.), Idioms: Structural and
psychological perspectives, 167–232. Hillsdale, NJ: Lawrence Erlbaum.
Moon, Rosamund. 1998. Fixed expressions and idioms in English: A corpus-based approach
(Oxford Studies in Lexicography and Lexicology). Oxford: Clarendon Press.
Moss, Helen & Lianne Older. 1996. Birkbeck word association norms. East Sussex: Psychology
Press.
Neumann, Gerald, Christiane Fellbaum, Alexander Geyken, Axel Herold, Christiane Huemmer,
Fabian Koerner, Undine Kramer, Kerstin Krell, Alexander Sokirko, Diana Stantcheva &
Ekatherini Stathi. 2004. A corpus-based lexical resource of German idioms. Paper pre-
sented at the 20th International Conference on Computational Linguistics (COLING),
Geneva, Switzerland, August 23–27.
Newmeyer, Frederick. 1974. The regularity of idiom behavior. Lingua 34. 327–342.
Nicolas, Tim. 1995. Semantics of idiom modification. In Martin Everaert, Erik-Jan van der Linden,
André Schenk & Rob Schreuder (eds.), Idioms: Structural and psychological perspectives,
233–252. Hillsdale, NJ: Lawrence Erlbaum.
Nunberg, Geoffrey, Ivan Sag & Thomas Wasow. 1994. Idioms. Language 70. 491–538.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A comprehensive
grammar of the English language. London: Longman.
Sabban, Annette. 1998. Okkasionelle Variationen sprachlicher Schematismen: Eine Analyse
französischer und deutscher Presse- und Werbetexte. Tübingen: Narr.
Schenk, André. 1995. The syntactic behavior of idioms. In Martin Everaert, Erik-Jan van der
Linden, André Schenk & Rob Schreuder (eds.), Idioms: Structural and psychological
perspectives, 253–272. Hillsdale, NJ: Lawrence Erlbaum.
Stathi, Katerina. 2007. A corpus-based analysis of adjectival modification in German idioms. In
Christiane Fellbaum (ed.), Idioms and collocations: Corpus-based linguistic and lexico-
graphic studies, 81–108. Birmingham: Continuum Press.
Tabossi, Patrizia, Kinou Wolf & Sara Koterle. 2009. Idiom syntax: Idiosyncratic or principled?
Journal of Memory and Language 61. 77–96.
Zhu, Feng & Christiane Fellbaum. 2015. Quantifying fixedness and compositionality in Chinese
idioms. International Journal of Lexicography 28(3). 338–350.
