What AI Teaches Us About Good Writing
By Laura Hartenberger
JULY 25, 2023
Laura Hartenberger is a writer and lecturer for the University of California, Los Angeles
Writing Programs.
Generative AI tools like ChatGPT offer the seductive possibility that we can
optimize the laborious process of writing. But while ChatGPT can clearly reduce
the time and effort that writing demands, it cannot necessarily optimize writing quality. The
program produces highly competent prose that usually passes as human-
generated, but so far, the quality of its writing — beyond the novelty of being
authored by an algorithm — is mostly unremarkable.
At the University of California, Los Angeles, where I teach writing, the common
sentiment among faculty is: “Sure, ChatGPT can write — but it can’t write well.”
Some professors caution students against using the tool by appealing to their
egos: “You could use AI to cheat on your essay, but do you really want a C+?”
Others, recognizing that AI tools will shape the working world into which
students will graduate, are beginning to allow their use in constrained ways,
framing them as automated writing tutors or advanced grammar-checking tools.
But even AI enthusiasts tend to advise students to maintain authorial control by
editing any AI-generated output for accuracy, style and sophistication.
The flat, conventional feel that characterizes most AI-generated writing stems
from the predictive nature of the algorithm. Trained on vast databases of human
texts, from books to articles to internet content, programs such as ChatGPT, Bard,
Bing and Claude function like sophisticated autocomplete tools, identifying and
predicting phrase patterns, which makes their output feel somewhat predictable,
too.
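To make the mechanism concrete, here is a toy autocomplete in Python — a bigram model trained on a scrap of Dickens. It is a deliberately crude sketch of my own, nothing like ChatGPT’s actual architecture (real chatbots use neural networks trained on vastly larger corpora), but the core move is the same: predict each next word from the patterns of prior text.

    from collections import Counter, defaultdict

    # Training text: the model learns only which word tends to follow which.
    corpus = ("it was the best of times it was the worst of times "
              "it was the age of wisdom it was the age of foolishness").split()

    # Count, for each word, how often each successor follows it.
    successors = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        successors[prev][nxt] += 1

    # Generate by always choosing the most likely continuation: fluent,
    # grammatical and utterly predictable, which is the point.
    word, output = "it", ["it"]
    for _ in range(11):
        word = successors[word].most_common(1)[0][0]
        output.append(word)

    print(" ".join(output))
    # Prints: it was the age of times it was the age of times

Scale the corpus and the model up by many orders of magnitude and the prose becomes far more fluent, but it remains, at heart, a prediction of the probable.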
But does predictable writing necessarily mean bad writing? When we talk about
good writing, what exactly do we mean? As we explore new applications for large
language models and consider how well they can optimize our communication, AI
challenges us to reflect on the qualities we truly value in our prose. How do we
measure the caliber of writing, and how well does AI perform?
In school, we learn that good writing is clear, concise and grammatically correct
— but surely, it has other qualities, too. Perhaps the best writing also innovates in
form and content; or perhaps it evokes an emotional response in its readers; or
maybe it employs virtuosic syntax and sophisticated diction. Perhaps good
writing just has an ineffable spark, an aliveness, a know-it-when-you-see-it
quality. Or maybe good writing projects a strong sense of voice.
But then, what makes a strong voice, and why does ChatGPT’s voice so often fall
flat?
“The Elements of Style,” the classic reference book on writing by William Strunk
Jr. and E.B. White, lays out a series of concrete rules. To write well, the authors
say, you should abide by certain conventions, such as grouping your sentences
into single-topic paragraphs. You should adhere to certain grammatical rules,
like, “Do not join independent clauses by a comma.” You should “omit needless
words” and write in an efficient, organized, streamlined manner.
These rules take effort for any human writer — we all miss the occasional comma
splice, use a few more words than necessary or bury our main point in the middle
of a paragraph. ChatGPT, by comparison, rarely makes rhetorical moves that
stray from Strunk and White’s conventions unless instructed to do so, and the
speed with which it spews forth efficient, grammatically correct sentences is
impressive, unsettling and perhaps mildly humiliating to us error-prone human
writers. For teachers trying to catch cheating students, the total absence of typos
and grammatical flubs is often what raises suspicions.
But simply abiding by the rules doesn’t make excellent writing — it makes
conventional, unremarkable writing, the kind usually found in business reports,
policy memos and research articles. In his review of the AI-generated novel “Death of
an Author,” Dwight Garner describes the prose as having “the crabwise gait of a
Wikipedia entry.” Even when a user prompts ChatGPT to include specific
grammatical errors or to stray from certain norms, its writing tends to carry a
certain flatness. By design, the program regresses to a rhetorical mean, its
deviations mechanical whereas ours are organic.
That’s not to say that convention flattens prose. In fact, convention lies at the root
of much of the best writing — it’s rare to see acclaimed texts that stray
dramatically from grammatical and stylistic norms.
Structural convention also underlies much of what we call good writing. Most
prize-winning literature innovates within classic story arcs: Aristotle’s three-act
structure (beginning, middle and end); Freytag’s five-stage structure (exposition,
rising action, climax, falling action and resolution); or a screenwriter’s six
categories of dramatic conflict (conflict with self; with others; with society; with
nature; with the supernatural; and with the machine).
Indeed, the fact that AI, which is trained to detect and replicate underlying
patterns in our writing, can produce such coherent prose is a testament to just
how much we rely on convention, both at the sentence and structural level.
Oddly, ChatGPT is not very good at producing writing under Oulipian constraints
— the playful formal rules, like avoiding a particular letter, that the Oulipo group
of experimental writers used to spur invention — and it failed every formula I
tried. In response to my prompt, “Write a sentence that doesn’t use any words
containing the letter ‘E,’” it wrote:
“The big brown dog ran swiftly through the grassy field.”
In the full response, an overly enthusiastic “Sure!” and an affirmatory closing line
framed this sentence, making the sample feel like it was written by a mischievous
child hoping the reader won’t notice that three of the ten words use “E.” Whatever
limited sense of spark the passage has can be attributed to the AI’s failure to
adhere to the constraint, to the human-like energy that comes from its error.
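The failure is easy to verify mechanically. Here is a minimal Python check — my own illustration, not anything from the original exchange — that flags the offending words:

    def words_breaking_lipogram(sentence, banned="e"):
        # Return every word that contains the banned letter.
        return [w for w in sentence.split() if banned in w.lower()]

    sample = "The big brown dog ran swiftly through the grassy field."
    offenders = words_breaking_lipogram(sample)
    print(len(offenders), "of", len(sample.split()), "words:", offenders)
    # Prints: 3 of 10 words: ['The', 'the', 'field.']

Exactly the three violations noted above: a constraint that is trivial to check in code, yet one the model could not reliably satisfy while generating.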
The student essay is another form of constraint. To earn a passing grade, writers
must conform to its rules, but to excel they must innovate within and beyond
them. Students must use specific citation methods, adhere to conventions of
academic writing and ensure their responses fully address a prompt’s question; at
the same time, they must offer sufficiently interesting variations on the theme,
evidence of original thought and dynamic phrasing that commands attention — a
delicate balance most do not consistently achieve.
But good writers recognize that grammatical rules are dictated by problematic
power structures and are not independent measures of correctness. Good writers
assess the rhetorical context for their writing and make deliberate decisions about
where to conform and where to stray.
The opening line of Charles Dickens’ “A Tale of Two Cities” is a horrifically long,
repetitive sentence, but deliberately so:
It was the best of times, it was the worst of times, it was the age of wisdom, it was
the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it
was the season of Light, it was the season of Darkness, it was the spring of hope, it
was the winter of despair, we had everything before us, we had nothing before us,
we were all going direct to Heaven, we were all going direct the other way — in
short, the period was so far like the present period, that some of its noisiest
authorities insisted on its being received, for good or for evil, in the superlative
degree of comparison only.
The unwieldy length and repetitive nature of the sentence are its point. Its cascade
of contradictory clauses violates most basic writing advice, and its flagrant
comma splices override Strunk and White’s rules. But these moves mirror the
meaning of the sentence and establish rhythmic momentum, allowing Dickens to
characterize the period as a chaotic, contradictory, overflowing moment through
both the phrasing of the language and its meaning.
Literature and writing courses teach the art of “close reading,” or looking to
sentence-level elements like punctuation, diction and syntax, for insight into the
deeper meaning of the text. Perhaps a key mark of good writing is that it rewards
close reading with rich congruity between syntax and story — in other words,
that it is writing with intention.
When I command ChatGPT to “Write the opening to a novel set in the current
moment using a series of repeated clauses,” it writes:
In the midst of chaos, where uncertainty lurked around every corner, where hope
and despair danced an intricate tango, where time seemed both fluid and
stagnant, a new chapter began. A chapter etched with the echoes of a pandemic
that had reshaped the very fabric of our lives, where masks adorned faces like
silent guardians, where distance became the bitter mediator between loved ones,
where solitude embraced us with its relentless grip.
The result isn’t terrible, but it has an overly familiar feel; the “new chapter” and
“fabric of our lives” are tired phrases, and the repetition is not as musical or
energetic as the Dickens opening. Elevating the writing with textual layers and
nuanced opportunities for close reading would take word-by-word human
editing.
The seeming subjectivity of what counts as good writing is at the root of what
frustrates many students about graded writing assignments — and most folks who
compose things more intensive than a Slack message — but the subjectivity is the point.
Consider the Dickens sentence above: If the author were writing for an audience
of children rather than adults, he might have used simpler sentence structures; if
he were writing an op-ed meant to persuade rather than a novel meant to
entertain, he might have avoided the antitheses; and if he were writing a novel in
2023 rather than 1859, he might have used a different cadence and register.
It’s difficult to determine a text’s quality without considering the context in which
it was written. In an ambitious attempt to create a universal measure of good
writing, regardless of discipline or genre, the American Association of Colleges
and Universities (AAC&U) developed a rubric whose categories focus on
intentionality. It rewards writing that demonstrates “control” over syntax and
mechanics; “attention” (but not necessarily blind adherence) to genre and
disciplinary conventions; and “a thorough understanding of context, audience,
and purpose that is responsive to the assigned task(s) and focuses all elements of
the work.”
In other words, good writing isn’t about sophisticated sentences or complex ideas;
it’s about unifying all elements into a coherent whole. You can write a poignant,
lyrical, oblique sonnet about the rain, but if your purpose is to inform newspaper
readers about the weather forecast, that’s not good writing.
The chatbot can write passable essays for standardized tests because the purpose
and context of such prompts are so general — they have to be, so that human
test-takers can produce texts that can be compared and ranked in an equitable way.
But in a highly specific context like a novel or a letter, ChatGPT can’t know
enough to create sufficient nuance. Writing a prompt with all relevant
information would be nearly impossible, and suboptimal for a technology meant
to optimize our time. For creative, expressive, or exploratory writing tasks, using
ChatGPT is like supervising a bumbling assistant who needs painfully detailed,
step-by-step instructions that take more effort to explain than to simply do the
work yourself.
We often say that good writing has a strong sense of voice. Speaking voices can be
recognized from their tone and pitch, but what rhetorical features define a
writer’s voice on the page?
I sometimes ask students to underline selections from their drafts that they
believe represent their voice. Sometimes they notice patterns or tics, stylistic
quirks, a repeated word or sentence structure. Some highlight sections in which
they convey strong opinions or a particularly well-defined point of view.
Sometimes they label whole drafts as their voice — after all, they wrote them.
Others cannot find their voice at all — it was a class assignment, so they were
writing in the voice of their academic alter-ego. Those who lack confidence
sometimes point to grammatical errors as examples of their voice. Their wide-
ranging answers showcase how difficult it is to pin down what makes a distinctive
voice.
Can ChatGPT teach us anything about what makes writing sound like one person
versus another? The program is a masterful ventriloquist — its ability to imitate
style is one of its most impressive and delightful features. It does so by using
“unsupervised learning” to detect rhetorical patterns from its massive database of
various kinds of writing, without being told what to look for.
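A rough flavor of that unsupervised pattern detection can be sketched in a few lines of Python. This is a hand-rolled toy of my own, nowhere near how a language model actually learns, but the spirit is similar: given two unlabeled texts, let word statistics alone decide what distinguishes them.

    from collections import Counter

    def rel_freq(text):
        # Relative frequency of each word in a text.
        words = text.lower().split()
        return {w: c / len(words) for w, c in Counter(words).items()}

    def most_distinguishing(a, b, k=3):
        # No one tells the code which features matter; the frequency gap
        # between the two texts decides on its own. (Alphabetical pre-sort
        # keeps tie-breaking deterministic.)
        fa, fb = rel_freq(a), rel_freq(b)
        vocab = sorted(set(fa) | set(fb))
        return sorted(vocab, key=lambda w: abs(fa.get(w, 0) - fb.get(w, 0)),
                      reverse=True)[:k]

    trumpish = "believe me folks the best the best believe me folks tremendous"
    neutral = "the committee reviewed the data and published the findings"
    print(most_distinguishing(trumpish, neutral))
    # Prints: ['believe', 'best', 'folks']

A large language model does something distantly analogous, across billions of texts and millions of features at once.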
The frustrating part is that it can’t tell us precisely what it notices — it can only
deliver text that imitates these patterns, often with startling aptness. It can write
recognizably in the voice of any number of characters, real or imagined, historic
or contemporary, from Oprah to Jane Austen, Holden Caulfield to Matthew
McConaughey, and can emulate the style of texts from the Bible to a Fox News
comments section to a wedding toast.
When I input the prompt, “Write a speech about potatoes in the style of Donald
Trump,” ChatGPT’s response sounds like the script from a “Saturday Night Live”
sketch: “Folks, let me tell you, nobody loves potatoes more than me, believe me.
I’ve been eating them my whole life. Best thing you can put on your plate. And let
me tell you, our farmers, they grow the best potatoes. The best. They’re huge,
they’re beautiful, they’re red, white, and yellow.”
What exactly makes this language sound like Trump — the content? The syntax?
The colloquial diction — “folks” and “let me tell you”? The rhythm and repetition
of “They’re huge, they’re beautiful, they’re red”? All the above? What’s striking
about this example is that ChatGPT is not so much imitating Trump’s voice as
exaggerating its features into a caricature, almost as if the chatbot has picked up
on the man’s very essence.
William Zinsser, in his classic book, “On Writing Well,” explains that “we express
ourselves as we do” because of the “subconscious mind.” Perhaps ChatGPT’s deft
impressions show us that our language patterns reveal more about our character
than we might realize. And its facility at imitating style has implications for
copyright — to what extent should we view the rhetorical tendencies that make up
one’s writing voice as proprietary?
Some writers seem unsettled when faced with AI renditions of their own style.
Douglas Hofstadter, author of “Gödel, Escher, Bach: An Eternal Golden Braid,”
noted that GPT-4, when prompted to write in his voice, produced what he termed
a “Hofstadter façade,” or a series of “vague generalities that echo phrases in the
book” rather than a seemingly authentic replica of his writing style. And
songwriter Nick Cave called ChatGPT’s attempt to write lyrics in his style “a
grotesque mockery of what it is to be human.”
But the technology’s capacity for imitation will likely continue to improve: New
Yorker staff writer Kyle Chayka observed that ChatGPT was not very effective at
mimicking his writing voice, but the AI startup Writer created a bot, trained
on his own writing, that produced text in his voice that, while not perfect, was still
“unnervingly effective.” Chayka expressed mixed feelings about this capability:
“The robot has made me acutely self-conscious. I recognize my A.I. doppelganger,
and I don’t like it.”
When speaking as itself, the chatbot sounds neutral, unanimated, optimistic, but
not especially enthusiastic to be talking with you. It often opts for lists
sandwiched between a clear introduction and conclusion. Its default register is
something like a voice of the masses, averaged from its training data — but of
course, it’s not really the voice of the masses: the algorithm inherently
prioritizes the writing patterns of those who have published most often, letting
them dominate over underrepresented groups and writing styles.
But something critical is missing from its voice: a certain sense of connection. At
its core, writing is about creating intimacy between writer and reader. It’s a
relational act, not a one-sided performance, and its power is in the exchange of
ideas. It’s the closest we can get to inhabiting the mind of another human, the
closest to escaping our own egos.
“So what?” is the common refrain of writing teachers. “Why should your reader
care?” A key way that good writing achieves connection is by creating stakes, or
engaging the reader by showing them why your ideas matter.
Stakes are the parts that reach up off the page and out into the world to connect
with readers, to shift our interior state, to make us want to keep reading.
Without emotional stakes, even virtuosic texts can feel difficult, off-putting or
cold; the emotional payoff is low relative to the energy they take to read.
We appreciate literary prowess, but engaging the reader matters more — we seem
to want more than just spectacle from good writing. As readers, we need to feel
like the writer is paying attention to us, trying to connect. ChatGPT cannot build a
real connection with its reader — it can only imitate one.
Reading ChatGPT’s writing feels uncanny because there’s no driver at the wheel,
no real connection being built. While the machine can articulate stakes, it is
indifferent to them; it doesn’t care if we care, and somehow that diminishes its
power. Its writing tends not to move us emotionally; at best, it evokes a sense of
muted awe akin to watching a trained dog shake hands: Hey, look what it can
do.
Consider the sonnet about McDonald’s that ChatGPT produced when I prompted it
for one. The writing isn’t Pulitzer-worthy, but it has a certain energy that perhaps
stems from the surprise of seeing how the program tackled the prompt’s challenge —
you can’t help but feel like ChatGPT is in on the joke. It’s almost as if a personality
starts to form — a little cheeky, willing to embarrass itself to make us laugh.
“You silly humans,” the chatbot seems to be saying. “Using the greatest
technology of our generation to create funny memes. But okay, I’ll play along.”
These moments convey a sense of energy that makes it hard for me to believe I’m
not chatting with a sentient being. Perhaps having a strong voice simply means
writing in a way that makes you seem alive.
If this poem were written by a human, its voice probably wouldn’t have the same
strength — it might feel cheesy and oddly reverential of the fast-food chain. Once
we know ChatGPT generated the poem, however, its quality improves — we get
the feeling the technology is unwittingly commenting on our world, illuminating
the categories we use to understand it.
The McDonald’s sonnet isn’t interesting as a poem — it’s interesting as the output
of an algorithm programmed with knowledge of our writing and our world.
Perhaps AI-generated writing has the potential to be interesting or meaningful in
contexts where the chatbot’s lack of awareness and intentionality matters; when
the fact that the machine is not sentient amplifies the impact of its output; when
the writing is, in some sense, about AI-generated writing.
But AI-generated writing about AI-generated writing is a narrow niche and there
are limits to how long we’ll find it compelling.
Ethics Of Plagiarism
We tend to believe that good writing is original and thus advise writers to avoid
clichés — phrases used so often we’ve come to see them as unoriginal and
thoughtless. Clichés spring to mind too easily, careening along well-paved neural
pathways, whereas original phrases must be pulled from the quicksand of our
brains with significant effort.
We scorn clichés not because they’re bad descriptions — indeed, the reason they
linger is probably because they’re pretty decent — but because their familiarity is
off-putting, a sign of writerly laziness. Even when ChatGPT doesn’t use clichés, its
writing still carries echoes of them; there’s usually the sense that there might be a
fresher, more original way to say things.
Good writing, we believe, not only avoids phrases taken from the general
consciousness, but also avoids language taken from individual writers unless
acknowledged by quotation marks, and it credits others for their ideas with
citations.
We expect students to read widely and build arguments that use others’ texts as
support for their own. But to maintain so-called “academic integrity,” they must
do this using fresh language and draw explicit distinctions between their own
ideas and others’ — with the exception of information that is considered general
knowledge. But what is general knowledge in a world where virtually any
information is freely accessible online?
ChatGPT, in a sense, plagiarizes our voices as it parrots the writing it was trained
on. It tends not to cite the specific sources it synthesizes to craft its phrases, and
when it does, they are unreliable — the MLA Style Center website cautions writers
to “vet” any secondary sources that appear in AI-generated text, as the programs
have the occasional tendency to “hallucinate” false sources and provide
information of questionable accuracy. Given the opacity of the AI’s sources, a
student who tries to pass off AI-generated text as their own may be inadvertently
performing a multi-dimensional transgression, plagiarizing an AI that itself is
plagiarizing others.
The ethics of training AI on copyrighted materials are murky, too. Platforms like
Reddit are pushing back against AI developers’ use of their content, and Sarah
Silverman and other authors recently sued OpenAI for electronically ingesting
illegally uploaded versions of their books from the internet to use as training data
for ChatGPT. The Writers Guild of America, on strike since May, seeks to regulate
the use of AI, both by preventing human-authored scripts from becoming AI
training data and limiting AI tools in the writer’s room.
But if generative AI becomes as widely adopted as the Google search engine, will
authors still want to opt out of contributing to it, or will serving as a model for the
algorithm become a way to amplify their own literary influence, an honor akin to
being ranked at the top of a Google search result? Should we work to protect the
right to be excluded from AI training data, or the right to be included in it?
The relevance of our plagiarism norms is debatable, too — what’s the value we’re
trying to preserve by differentiating human writing from AI’s? Is it really
plagiarism not to cite an AI-generated phrase? Is plagiarism still the crime we
think it is?
And the question of originality troubles me: How can we tell which kinds of
writing are meant to be original and which are not? What exactly does originality
mean? More practically, why do so many workplaces ask us to produce such
unoriginal texts — and what kind of value do those texts produce?
Writing As Thought
I like to say that I’m not teaching my students how to write — I’m teaching them
how to think; how to be observant; how to question the systems around them;
how to interpret and build meaning; how to relate to others; how to understand
and differentiate themselves; how to become agents of change. But ChatGPT, by
producing competent writing with apparent thoughtlessness, threatens the idea
that critical thinking is the core of good writing.
With its startling ability to regenerate responses by paraphrasing the same ideas
in new words ad infinitum, it mocks the weight we put on paraphrasing to avoid
plagiarism. We task students with summarizing texts in their own words to
demonstrate their understanding of the material — but ChatGPT shows us that
it’s possible to explain others’ ideas without understanding them; to build
arguments from their content without metacognition.
Its revelation reverses how we tend to think writing works: First, you come up
with an idea. Second, you find the words to articulate it. ChatGPT inverts this
process. It begins with the words and builds its arguments and narratives from
language patterns, letting its ideas emerge from the text it uses to produce them.
We tend to view writing as hyper-personal, a conduit for our unique thoughts. But
ChatGPT, through its own training, reminds us that we learn to write through
imitation, the same way we learn to smile or eat or walk. Children grow up
speaking with the accent of their peers, not their parents; in the same way,
writing is a networked, communal act, inseparable from others’ writing.
We write in conversation with what we read, and good writing balances our own
words with others’. We summarize their ideas, using them as springboards and
support for our arguments. We take language from others, too, and not just as
quotations: The English language is a colonial artifact that swallows up other
languages. It’s full of stolen words and idioms and familiar, tired phrases —
things we say because others say them.
“Perhaps the ineffable spark of good writing and the spark of a
romantic connection are related — both involve a certain energy
exchange, a sense of connection across individual minds, a balance of
surprise and familiarity.”
Costs Of Optimization
I’ve heard writer-parents say having a baby is equivalent to losing two books’
worth of time. And time is worth money: Full-time writing is a privilege that few
can afford, with most writers stealing scraps of time between day jobs.
Optimizing the effort involved with writing is no small thing, either. The process
requires an unparalleled level of focus, and for many it ushers in feelings of
inadequacy and self-doubt. We take writing failures personally because we see
writing as thought, so failing to express ourselves well in writing can sting more
than failing at other forms of expression. As a result, many end up terrified of the blank
page, and AI becomes a tempting corrective.
Anyone who’s published knows that readership is a rare gift. Reading is work —
valuable work — but like writing, it requires exertion and takes time away from
other tasks. Many of us already feel saturated with content; we consume so much
information through screens that our daily attention spans feel fragile and
limited. There’s a certain respect we hold for writers who are careful not to
publish too much, who honor their readers enough to self-censor and share only
what’s really worth our attention.
And what will become of our own writing after reading so much AI-authored
prose? Will we begin to write more like ChatGPT in a linguistic mise en abyme?
Will we lose the sense that reading and writing offer a solution to loneliness, the
chance to connect deeply with another human’s inner world, given the growing
uncertainty about whether a human is even present behind a given text?
Perhaps something meaningful is lost when we use AI to reduce the time and
effort spent writing. Writing well takes practice, and I see the most significant
progression in students who spend the most time writing, reworking draft after
draft. To “essay” is to try — perhaps good writing is about trying, about process as
much as outcome.
But even if a reader can’t be sure whether a text had AI support, the writer knows,
and producing unassisted writing can feel deeply gratifying. We run marathons
and climb mountains for the sake of it, because they’re hard. Maybe the parts of
writing that feel so burdensome — the effort to think deeply, to sit still with our
thoughts, to articulate them and revise them until they say exactly what we want,
until we figure out what we’re trying to say at all — are the parts that we value
when we praise good writing.
Perhaps the time spent writing matters as much as having written; there is a
vague sense of being, in the moment of writing, the most authentic version of
yourself.
What makes good writing in a world with generative AI? Perhaps writing classes
of the future will lean into the subtle ways in which human writing surpasses AI-
generated writing and challenge students to write better than the machine.
Perhaps they will train students to be AI curators and remixers, teaching the
prompt-engineering skills they need to leverage the technology most effectively, in
preparation for the kind of sparkless, functional writing they will produce post-
graduation — contracts, reports, meeting minutes, instructional manuals.
Perhaps the college essay will be retired in favor of other assignments that
demonstrate knowledge, critical thinking and argumentation skills — speeches,
hands-on activities or multimedia creations. It seems likely we will continue to
teach students to read widely and study textual patterns and conventions closely
— the same way we train AI to write.
But perhaps ChatGPT also shows us that at a certain point, reading has
diminishing returns. Maybe we also need to be trained on other kinds of data in
order to write well, data that comes from being alive in the world over time, from
accumulating enough experience to differentiate our own voice from others.
Writing courses are different from other disciplines — they’re not so much about
transferring knowledge or conveying conceptual frameworks. Instead, they aim to
create a space in which students can practice differentiating themselves.
Generative AI complicates this mission, but it doesn’t terminate it. The division
between the words and ideas that belong to us versus others has always been
more ambiguous than we’d like to think, and ChatGPT blurs that line further.
Even if we use AI to make writing feel easier, we still need to do the hard, lifelong
work of becoming ourselves.
To write well, you need the specificity of perspective that comes from
communicating critically with others over an extended time. AI might make
writing faster, but figuring out who we are in relation to others cannot be
accelerated.
We may see writing as equal to thought, but it is also synonymous with power.
Allowing AI to write for us gives away our power and the opportunity to assert
control over the way we represent ourselves to the world.
The tension between the individual and the collective, between novelty and
familiarity, drives the arc of our lives. We are conceived from the text of others’
DNA and emerge with our own combinations. We grow up learning to imitate
those who raise us, then rebel against them as we battle to find ourselves amid the
influences of society. We use writing to differentiate ourselves, to respond to
others, weaving their words with our own, synthesizing their ideas, adding new
ones and exploring where we align and diverge.
Beyond mirroring elements of our voices, maybe AI also mirrors the tensions we
feel between ourselves and others. Perhaps large language models, if we interact
with them critically, will open new frames through which to explore the balance
between the ways we conform and the ways we break free, adding depth to the
mission of self-discovery that defines our lives.