This article was downloaded by: [Iowa State University]
On: 14 November 2011, At: 15:08
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954
Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
The Serials Librarian
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/wser20
The Half-Life Phenomenon
Michael Bugeja (a) & Daniela V. Dimitrova (b)
(a) Greenlee School of Journalism and Communication, Iowa State University of Science and Technology, Hamilton Hall 101, Ames, IA, 50011-0001, USA
(b) Greenlee School of Journalism and Communication, Iowa State University of Science and Technology, Hamilton Hall 117, Ames, IA, 50011-0001, USA
Available online: 22 Sep 2008
To cite this article: Michael Bugeja & Daniela V. Dimitrova (2006): The Half-Life Phenomenon, The Serials Librarian, 49:3, 115-123
To link to this article: http://dx.doi.org/10.1300/J123v49n03_10
The Half-Life Phenomenon:
Eroding Citations in Journals
Michael Bugeja
Daniela V. Dimitrova
ABSTRACT. The phenomenon of lapsed URLs, otherwise known as
“linkrot,” has been acknowledged since the 1990s; however, relatively
few studies have addressed the impact of linkrot on the footnote,
the foundation upon which scientific documentation is based. In this
summary of research to date, the authors focus on three top journals in
journalism and communication–Human Communication Research,
Journal of Communication, and Journalism & Mass Communication
Quarterly–testing some 416 online citations over four years. Of the total
416 citations, only 61% were still accessible. Additionally, 19% of the
online footnotes contained an error in the URL, and 63% did not provide
an access date in the published citation. Of those links that were still
active, only 58% matched the cited content. The authors also introduce their concept of “the half-life of Internet footnotes,” or the time
it takes for one-half of online citations in a journal to go dead, and
make recommendations to extend the online life of Internet-based footnotes.
[Article copies available for a fee from The Haworth Document Delivery Service: 1-800-HAWORTH. E-mail address: <docdelivery@haworthpress.com> Website: <http://www.HaworthPress.com> © 2005 by The Haworth Press, Inc. All rights reserved.]
Michael Bugeja is Professor and Director, Greenlee School of Journalism and Communication, Hamilton Hall 101, Iowa State University of Science and Technology,
Ames, IA 50011-0001 (E-mail: bugeja@iastate.edu).
Daniela V. Dimitrova is Assistant Professor, Greenlee School of Journalism and
Communication, Hamilton Hall 117, Iowa State University of Science and Technology, Ames, IA 50011-0001 (E-mail: danielad@iastate.edu).
The Serials Librarian, Vol. 49(3) 2005
Available online at http://www.haworthpress.com/web/SER
© 2005 by The Haworth Press, Inc. All rights reserved.
doi:10.1300/J123v49n03_10
THE SERIALS LIBRARIAN
KEYWORDS. Internet, footnotes, linkrot, journals
On March 18, 2005, the media reported new research in space and
cyberspace. Astronomers using the Spitzer Space Telescope detected
light from planets orbiting distant stars,1 and two Iowa State University
journalism professors released their findings about “the half-life of
Internet footnotes.”2 We are those professors. It is rare that research in
journalism and communication makes news. However, our year-long
study, as reported in The Chronicle of Higher Education, brought
world-wide attention to “the new unstable publishing medium” of the
Internet, responsible for the disappearance of footnotes over a four-year
period in some of the world’s most prestigious communication journals.3
The decay of online footnotes–not the footnotes per se, but the sites to
which they refer–was not news to librarians, who had a special name for
it: linkrot. Among the early chroniclers of linkrot (not necessarily associated with footnotes) was the Web Surveying Team at the Georgia Institute of Technology, which documented the erosion of online URLs in
a 1997 user survey.4 Subsequent studies did associate linkrot with
footnotes, including one that examined Internet references in three important U.S. scientific journals–New England Journal of Medicine, The
Journal of the American Medical Association and Science–and found 13% of
online references inaccessible after two years.5
Our study connected linkrot with something greater than inconvenience or even citation, noting that the footnote was the foundation of
modern scholarship and that lapses in online references would threaten
basic research, simply because the Internet was replacing the printing
press, upon which our scientific documentation has been based since
the Enlightenment. In his insightful book, The Footnote: A Curious History, Princeton historian Anthony Grafton writes:
Citations in scientific works–as a number of studies have
shown–do far more than identify the originators of ideas and the
sources of data. They reflect the intellectual styles of different national scientific communities, the pedagogical methods of different graduate programs, and the literary preferences of different
journal editors. They regularly refer not only to the precise sources
of scientists’ data, but also to larger theories and theoretical
schools with which the authors wish or hope to be associated.6
This quotation and others in Grafton’s book alerted us to the potential
scope of what seemed, at first, an annoying unreliability of online footnotes and what subsequently expanded over a year’s time to questions
about reliability and verification associated with scientific scholarship.
With these ramifications in mind, we set out to measure the half-life
phenomenon in some of the best journals in journalism and communication. We defined the term “half-life” as the estimated time it takes for
one-half of online footnotes in a journal to disintegrate. Our research
questions were basic:
RQ1: What is the frequency of use of Internet citations in articles
in Human Communication Research, Journal of Communication,
and Journalism & Mass Communication Quarterly?
RQ2: What are the general characteristics of the Internet sources?
• RQ2a: How many Internet sources are active?
• RQ2b: What are the most common domains?
• RQ2c: Are the sources hyperlinked correctly?
• RQ2d: How many provide retrieval dates?
• RQ2e: What factors can serve as predictors of online citations’ permanence?
We performed a content analysis of all articles published between
2000 and 2003 in the three selected journals. This yielded 416 online citations, with the following breakdown: 73 citations in 2000 articles, 92
in 2001, 118 in 2002, and 133 in 2003. Figures 1, 2 and 3 show the use
of online citations per journal. Of the total 416 citations, only 61% were
still accessible. Additionally, 19% of the online footnotes contained an
error in the URL, and 63% did not provide an access date in the published citation. Of those links that were still active, only 58% matched
the cited content. We also ascertained the stability of domain names in
these journals, finding that .org, .edu and .gov were the most reliable
(see Figure 4). Based on the four-year period and the three top journals
we examined, the average half-life for journalism and communication
Internet references was estimated to be 3.02 years. This means that it
will take about three years for half of these Internet citations to be no
longer accessible through the original URLs.
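The arithmetic behind a half-life figure can be pictured with a short sketch. It is offered only as an illustration and not as the estimation procedure used in the study, which this summary does not describe. The sketch assumes that links die at a constant proportional rate (exponential decay), so that a half-life follows from the fraction of links still alive after a known number of years. The function name `half_life_years` and the 70%-after-two-years input are hypothetical; the per-year citation counts are the ones reported above.

```python
import math


def half_life_years(surviving_fraction: float, elapsed_years: float) -> float:
    """Half-life implied by exponential decay (an assumed model).

    If a fraction s of links survives after t years and decay is
    exponential, then s = 0.5 ** (t / h), so h = t * ln(0.5) / ln(s).
    """
    if not 0.0 < surviving_fraction < 1.0:
        raise ValueError("surviving_fraction must be strictly between 0 and 1")
    if elapsed_years <= 0:
        raise ValueError("elapsed_years must be positive")
    return elapsed_years * math.log(0.5) / math.log(surviving_fraction)


# Citation counts per publication year, as reported in the text.
citations_per_year = {2000: 73, 2001: 92, 2002: 118, 2003: 133}
total = sum(citations_per_year.values())  # matches the reported total of 416

# Hypothetical input: a cohort of links of which 70% survive after 2 years.
example = half_life_years(0.70, 2.0)
```

Because the citations in the sample were published in four different years, a single surviving fraction hides differences in link age; an estimate of this kind is more meaningful when computed per publication year.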
The most frequently cited reason for a dead hyperlink was a message
that the page was not found–sometimes explicitly saying “404 Error” or
just “Page Not Found.” Some sites expanded the message to “The page
you requested could not be found.” Some Internet citations required a
FIGURE 1. Number of Internet citations in Human Communication Research.
[Bar chart: “Human Communication Research: Number of Online Citations per Year.” Horizontal axis: year, 2000 to 2003. Vertical axis: number of citations, 0 to 12. Series: “Citation worked” and “Citation didn’t work.”]
FIGURE 2. Number of Internet citations in Journal of Communication.
[Bar chart: “Journal of Communication: Number of Online Citations per Year.” Horizontal axis: year, 2000 to 2003. Vertical axis: number of citations, 0 to 30. Series: “Citation worked” and “Citation didn’t work.”]
FIGURE 3. Number of Internet citations in Journalism & Mass Communication Quarterly.
[Bar chart: “Journalism & Mass Communication Quarterly: Number of Online Citations per Year.” Horizontal axis: year, 2000 to 2003. Vertical axis: number of citations, 0 to 80. Series: “Citation worked” and “Citation didn’t work.”]
FIGURE 4. Stability of domain names in the three journals.
[Bar chart: “Online Citation by Top Level Domain.” Horizontal axis: domain (.com, .edu, .gov, .org, other). Vertical axis: number of citations, 0 to 120. Series: “Citation worked” and “Citation didn’t work.”]
user name and password, while some required subscriptions. Here is an
example: “The story you requested is available only to registered members. Registration is free and offers great benefits. Click here to register
if you are not a registered member of latimes.com.” Few of the online
citations retrieved a redirect page. A dead link hosted by a government
Web site, for example, brought up the following redirect message: “As
of May 31, 2002, www.nara.gov became www.archives.gov! Please
update your bookmarks. Wait 10 seconds, or click now to visit
www.archives.gov. Thank you!”
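The failure types described above can be summarized as a small set of categories. The sketch below is a hypothetical illustration, not the coding scheme used in the study: the category names, the function `classify_response`, and the phrase lists are assumptions chosen to mirror the messages quoted in the text. It takes an HTTP status code and the page text as plain inputs, so it makes no network requests.

```python
from enum import Enum


class LinkStatus(Enum):
    WORKED = "worked"
    NOT_FOUND = "page not found"
    LOGIN_REQUIRED = "registration or subscription required"
    REDIRECTED = "redirect notice"


# Phrases modeled on the messages quoted in the text (illustrative only).
NOT_FOUND_PHRASES = ("404 error", "page not found", "could not be found")
LOGIN_PHRASES = ("registered members", "click here to register", "subscription")
REDIRECT_PHRASES = ("update your bookmarks",)


def classify_response(status_code: int, page_text: str) -> LinkStatus:
    """Assign one hypothetical category to a fetched citation URL."""
    text = page_text.lower()
    if status_code == 404 or any(p in text for p in NOT_FOUND_PHRASES):
        return LinkStatus.NOT_FOUND
    if status_code in (401, 403) or any(p in text for p in LOGIN_PHRASES):
        return LinkStatus.LOGIN_REQUIRED
    if 300 <= status_code < 400 or any(p in text for p in REDIRECT_PHRASES):
        return LinkStatus.REDIRECTED
    return LinkStatus.WORKED
```

Phrase matching of this kind is crude: a working page that happens to mention a subscription would be miscategorized, and no automated check of this sort can say whether a live page still matches the cited content, which requires a human reader.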
In analyzing these findings, we came to realize that a double standard
seemed to exist in how scholars viewed paper-based references as opposed to Internet-based ones. Scholars working in previous eras would
be aghast at print citations referencing wrong pages, and we believe
that current-day scholars need to hold the Internet to the same standards
as printed works. Also associated with those standards is the specter
of purposeful deception. The content at a given URL often changes over time. Consider that the half-life of Internet
footnotes in our select journals is just over three years, resulting in
a substantial number of dead links. A URL can be fabricated and then
appear simply to have gone dead, beyond the reach of fact-checking; or a
URL can be accurate but now lead to different content, casting scholarship
in a false light. Worse, because lapsed URLs can be cited in subsequent research, errors can propagate, undermining
the very verification that is at the core of scholarship–
again, based on the printing press.
Upon reading a copy of our quantitative study, which earned best paper status in the Communication and Technology division of the 2005
International Communication Association, Anthony Grafton at Princeton wrote in a Feb. 23, 2005, e-mail that researchers are rapidly approaching an era in the digital age when the modern footnote will have
to be reconceptualized to suit the “new” printing press of the Web. According to Grafton, “One possibility would be via references to stable
online databases that are themselves preserved by reliable long-term institutions. Only a form of reference not liable to rapid decay or limited
by passwords would work. And of course we don’t know yet what platforms will be genuinely stable and durable.”7 Grafton also commented
on our scholarly tradition being anchored by the footnote and how this
tradition might be transformed with the new medium of the Internet.
“[S]cholarship is certainly anchored by its verifiability,” he wrote, “and
it’s hard to imagine a form that doesn’t have some equivalent to the
footnote. In the fields of humanistic scholarship that I work in, for the
moment, the lowering of costs associated with computer typesetting and
the transfer of labor to the author has preserved massive footnotes of the
old sort for another generation or so. But after that–who knows?”8
In any case, Grafton’s comments have influenced our scholarship,
forcing us to question basic aspects of the Internet, which provides instant access to hundreds of thousands of information resources, along with
search tools for fast data retrieval. Because
libraries, in particular, are purchasing online journals, or journals with
online editions, Internet-based references will continue to increase–a
research-related quandary compounded by a lack of guidelines in graduate schools, academic associations and libraries themselves. Grafton is
especially sensitive to the preservation of information as historical artifact. He notes that the works of historians “stride forward or totter backward on their footnotes,” and that historians have two basic tasks: “They
must examine all the sources relevant to the solution of a problem and
construct a new narrative or argument from them. The footnote proves
that both tasks have been carried out.”9 Our research indicates that the
Internet threatens both tasks over time. Moreover, the problems are only now
starting to mount, so librarians–especially serials librarians–
must start taking notice.
Serials librarians have been purchasing text/html versions of journal
articles because users want to copy, paste and otherwise manipulate that
data with a computer. The computer not only enables quick, easy online
data retrieval; its software also allows users to manipulate that data.
During the era of the printing press, manipulation of journals was a
crime typically perpetrated with a razor blade in the periodicals section.
In that allusion is the statement of the crisis at hand. The book is the ultimate fire-walled medium; the Internet is its opposite.
Thus, we have been recommending to scholars that the closer the medium is to the book, the more reliable the footnote. For instance, a pdf is
closer to the book than text/html because the former is essentially a
snapshot of paper, whereas text/html can be readily manipulated. As such, scholars
should beware of citing URLs for html formats, especially
when pdf formats are available from journals in library databanks. A
case in point: Journalism & Mass Communication Quarterly offers full
text versions of its contents in both html and pdf formats. Early in our
study one of our coders analyzed the html version of J&MC Quarterly
rather than the pdf version and found repeated formatting errors–an extra space between forward slashes (http://) and another extra space
where hyperlinks in the original wrapped from the right margin to the left.
The comparison between the two formats showed a 17% increase in
citation failure from pdf to html.
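The kind of formatting damage described above, stray spaces introduced when a URL is converted to html or wrapped across lines, can be illustrated with a short repair routine. This is a minimal sketch under stated assumptions, not a procedure used in the study: the function name `repair_url` and the example address are hypothetical, and the routine assumes that a genuine URL never contains whitespace, so every space or line break inside one is an artifact.

```python
import re


def repair_url(raw: str) -> str:
    """Remove whitespace that formatting or line wrapping inserted into a URL.

    Assumes the input is a single URL and nothing else; whitespace is not
    valid inside a URL, so all of it is treated as damage.
    """
    return re.sub(r"\s+", "", raw)


# Hypothetical damaged citation: a space after the slashes and a line wrap.
damaged = "http:// www.example.org/reports/\n2003/summary.html"
repaired = repair_url(damaged)
```

Stripping whitespace addresses only conversion damage. It cannot correct a URL that was mistyped in the published citation, nor one whose target has since moved or disappeared.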
In the end, Internet research is critical for scholarship because it allows continuous, convenient access to electronic databases and other resources, thus increasing the scope and breadth of that scholarship. That
is its allure as a medium. However, without the ability to fact-check stable citations, research may become only as reliable as opinion, and method as
unreplicable as art.
In future research we intend to compare online-only journals with
traditional journals, under the theory that scholars doing research on the
Internet will be particularly vulnerable to the half-life effect because
their work must refer to and document Web-based content. We will be
analyzing the Internet’s impact on other aspects of scholarship associated with the printing press, including indexes and bibliographies. Over
time we will analyze how research methods must be changed to account
for the Internet’s dynamic aspects and also will look into methods to
foster reliable archiving–a library topic if there ever was one–previously known as a book shelf.
In the interim the diffusion of online access–including databases and
digital journals–will undoubtedly create a snowball effect as more authors cite hypertext whose addresses will eventually lapse. Without cogent recommendations to offset the decay of Internet footnotes, scholars
will not be able to access the full array of citations. Journals will become
increasingly unreliable when future studies attempt replication, without
which validity cannot be fully ascertained, all of which will contribute
to the erosion of standards. And yet this phenomenon continues to be
under the radar of many academic programs and library associations,
even as Google.com publicizes plans to digitize entire libraries.
Granted, our research is not as far-reaching as detecting light from distant solar systems, the news that was announced on March 18 of this
year, along with our findings; but astronomers who document that cosmic
phenomenon using footnotes in digital journals may find that, over time, their
observations about light are extinguished on the Internet.
NOTES
1. Richard Harris, “Looking at Planets Beyond the Solar System,” National Public Radio’s Morning Edition; retrieved May 3, 2005, from http://www.npr.org/templates/story/story.php?storyId=4556714.
2. Scott Carlson, “Scholars Note Decay of Citations to Online References,” The
Chronicle of Higher Education, March 18, 2005, p. A30.
3. Carlson, p. A30.
4. “Problems Using the Web,” 1997. Retrieved May 3, 2005, from http://www.gvu.gatech.edu/user_surveys/survey-1997-10/graphs/use/Problems_Using_the_Web.html
5. Dellavalle, R., Drake, A., Graber, M., Heilig, L., Hester, E., Kuntzman, J., & Schilling, L., “Going, going, gone: Lost Internet references,” Science, October 2003,
pp. 787-788.
6. Anthony Grafton, The Footnote: A curious history (Cambridge, Mass: Harvard,
1997), p. 13.
7. Anthony Grafton, grafton@.Princeton.EDU, “Question about the Footnote from Iowa State,” 23 February 2005, personal e-mail to Michael Bugeja, <bugeja@iastate.edu> (23 February 2005).
8. Grafton, “Question about the Footnote from Iowa State.”
9. Grafton, pp. 4-5.