
The Half-Life Phenomenon

2006, The Serials Librarian

To cite this article: Michael Bugeja & Daniela V. Dimitrova (2006): The Half-Life Phenomenon, The Serials Librarian, 49:3, 115-123
To link to this article: http://dx.doi.org/10.1300/J123v49n03_10
Publisher: Routledge. Available online: 22 Sep 2008.
The Half-Life Phenomenon: Eroding Citations in Journals

Michael Bugeja
Daniela V. Dimitrova

ABSTRACT. The phenomenon of lapsed URLs, otherwise known as “linkrot,” has been acknowledged since the 1990s; however, relatively few studies have addressed the impact of linkrot on the footnote, the foundation upon which scientific documentation is based. In this summary of research to date, the authors focus on three top journals in journalism and communication–Human Communication Research, Journal of Communication, and Journalism & Mass Communication Quarterly–testing 416 online citations over four years. Of the total 416 citations, only 61% were still accessible. Additionally, 19% of the online footnotes contained an error in the URL, and 63% did not provide an access date in the published citation. Of those links that were still active, only 58% matched the cited content. The authors also introduce their concept of “the half-life of Internet footnotes,” or the time it takes for one-half of online citations in a journal to go dead, and make recommendations to extend the online life of Internet-based footnotes. [Article copies available for a fee from The Haworth Document Delivery Service: 1-800-HAWORTH. E-mail address: <docdelivery@haworthpress.com> Website: <http://www.HaworthPress.com> © 2005 by The Haworth Press, Inc. All rights reserved.]

Michael Bugeja is Professor and Director, Greenlee School of Journalism and Communication, Hamilton Hall 101, Iowa State University of Science and Technology, Ames, IA 50011-0001 (E-mail: bugeja@iastate.edu). Daniela V.
Dimitrova is Assistant Professor, Greenlee School of Journalism and Communication, Hamilton Hall 117, Iowa State University of Science and Technology, Ames, IA 50011-0001 (E-mail: danielad@iastate.edu).

The Serials Librarian, Vol. 49(3) 2005. Available online at http://www.haworthpress.com/web/SER. © 2005 by The Haworth Press, Inc. All rights reserved. doi:10.1300/J123v49n03_10

KEYWORDS. Internet, footnotes, linkrot, journals

On March 18, 2005, the media reported new research in space and cyberspace. Astronomers using the Spitzer Space Telescope detected light from planets orbiting distant stars,1 and two Iowa State University journalism professors released their findings about “the half-life of Internet footnotes.”2 We are those professors.

It is rare that research in journalism and communication makes news. However, our year-long study, as reported in The Chronicle of Higher Education, brought world-wide attention to “the new unstable publishing medium” of the Internet, responsible for the disappearance of footnotes over a four-year period in some of the world’s most prestigious communication journals.3

The decay of online footnotes–not the footnotes per se, but the sites to which they refer–was not news to librarians, who had a special name for it: linkrot. Among the early chroniclers of linkrot (not necessarily associated with footnotes) was the Web Surveying Team at the Georgia Institute of Technology, which documented the erosion of online URLs in a 1997 user survey.4 Subsequent studies did associate linkrot with footnotes, including one examining Internet references in three important U.S.
scientific journals–New England Journal of Medicine, The Journal of the American Medical Association and Science–with 13% of online references inaccessible after two years.5

Our study connected linkrot with something greater than inconvenience or even citation, noting that the footnote was the foundation of modern scholarship and that lapses in online references would threaten basic research, simply because the Internet was replacing the printing press, upon which our scientific documentation has been based since the Enlightenment. In his insightful book, The Footnote: A Curious History, Princeton historian Anthony Grafton writes:

Citations in scientific works–as a number of studies have shown–do far more than identify the originators of ideas and the sources of data. They reflect the intellectual styles of different national scientific communities, the pedagogical methods of different graduate programs, and the literary preferences of different journal editors. They regularly refer not only to the precise sources of scientists’ data, but also to larger theories and theoretical schools with which the authors wish or hope to be associated.6

This quotation and others in Grafton’s book alerted us to the potential scope of what seemed, at first, an annoying unreliability of online footnotes and what subsequently expanded over a year’s time to questions about reliability and verification associated with scientific scholarship.

With these ramifications in mind, we set out to measure the half-life phenomenon in some of the best journals in journalism and communication. We defined the term “half-life” as the estimated time it takes for one-half of online footnotes in a journal to disintegrate.
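As an illustration only, this definition can be turned into arithmetic if one assumes that links die at a constant proportional rate (exponential decay). That assumption, the function name `half_life_years`, and the sample inputs are made up for this sketch and are not taken from the study, which does not spell out its estimation method here.

```python
import math


def half_life_years(fraction_alive: float, years_elapsed: float) -> float:
    """Estimate the half-life of a set of links under exponential decay.

    Assumption (not from the study): the surviving fraction follows
    S(t) = 0.5 ** (t / half_life), so half_life = t * ln(2) / ln(1 / S).
    """
    if not 0.0 < fraction_alive < 1.0:
        raise ValueError("fraction_alive must be strictly between 0 and 1")
    if years_elapsed <= 0.0:
        raise ValueError("years_elapsed must be positive")
    return years_elapsed * math.log(2.0) / math.log(1.0 / fraction_alive)


# Illustrative inputs only: if a quarter of the links still work after
# four years, two half-lives have passed, so the estimate is about 2 years.
print(round(half_life_years(0.25, 4.0), 2))
```

A pooled sample of citations published in different years mixes links of different ages, so feeding it a single elapsed time would be a simplification; a per-year calculation would be closer to the definition.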
Our research questions were basic:

RQ1: What is the frequency of use of Internet citations in articles in Human Communication Research, Journal of Communication, and Journalism & Mass Communication Quarterly?

RQ2: What are the general characteristics of the Internet sources?
• RQ2a: How many Internet sources are active?
• RQ2b: What are the most common domains?
• RQ2c: Are the sources hyperlinked correctly?
• RQ2d: How many provide retrieval dates?
• RQ2e: What factors can serve as predictors of online citations’ permanence?

We performed a content analysis of all articles published between 2000 and 2003 in the three selected journals. This yielded 416 online citations, with the following breakdown: 73 citations in 2000 articles, 92 in 2001, 118 in 2002, and 133 in 2003. Figures 1, 2 and 3 show the use of online citations per journal.

Of the total 416 citations, only 61% were still accessible. Additionally, 19% of the online footnotes contained an error in the URL, and 63% did not provide an access date in the published citation. Of those links that were still active, only 58% matched the cited content. We also ascertained the stability of domain names in these journals, finding that .org, .edu and .gov were the most reliable (see Figure 4).

Based on the four-year period and the three top journals we examined, the average half-life for journalism and communication Internet references was estimated to be 3.02 years. This means that it will take about three years for half of these Internet citations to be no longer accessible through the original URLs.

FIGURE 1. Number of Internet citations in Human Communication Research. [Bar chart: number of online citations per year, 2000-2003, divided into “Citation worked” and “Citation didn’t work.”]
FIGURE 2. Number of Internet citations in Journal of Communication. [Bar chart: number of online citations per year, 2000-2003, divided into “Citation worked” and “Citation didn’t work.”]
FIGURE 3. Number of Internet citations in Journalism & Mass Communication Quarterly. [Bar chart: number of online citations per year, 2000-2003, divided into “Citation worked” and “Citation didn’t work.”]
FIGURE 4. Stability of domain names in the three journals. [Bar chart: online citations by top-level domain (.com, .edu, .gov, .org, other), divided into “Citation worked” and “Citation didn’t work.”]

The most frequent reason for a dead hyperlink was a message that the page was not found–sometimes explicitly saying “404 Error” or just “Page Not Found.” Some sites expanded the message to “The page you requested could not be found.” Some Internet citations required a user name and password, while some required subscriptions. Here is an example: “The story you requested is available only to registered members. Registration is free and offers great benefits. Click here to register if you are not a registered member of latimes.com.”

Few of the online citations retrieved a redirect page. A dead link hosted by a government Web site, for example, brought up the following redirect message: “As of May 31, 2002, www.nara.gov became www.archives.gov! Please update your bookmarks. Wait 10 seconds, or click now to visit www.archives.gov. Thank you!”

In analyzing these findings, we came to realize that a double standard seemed to exist in how scholars viewed paper-based references as opposed to Internet-based ones. Scholars working in previous eras would be aghast at print citations referencing wrong pages, and we came to believe that current-day scholars need to hold the Internet to the same standards as printed works.
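The failure modes described above (page not found, login or subscription walls, redirects) suggest a simple way to sort link checks into categories. What follows is a minimal sketch, not the coding scheme used in the study: the category names, the mapping from HTTP status codes, and the helper names `classify`, `check`, and `top_level_domain` are all assumptions. It relies only on Python's standard `urllib`.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def classify(status, requested_url, final_url):
    """Map an HTTP outcome to a hypothetical category label."""
    if status is None:
        return "unreachable"            # DNS failure, timeout, refused connection
    if status in (401, 403):
        return "login_or_subscription"  # access wall signalled by status code
    if status in (404, 410):
        return "not_found"
    if status >= 400:
        return "other_error"
    requested_host = urlsplit(requested_url).netloc.lower()
    final_host = urlsplit(final_url).netloc.lower()
    if requested_host != final_host:
        return "redirected"             # e.g. a site that moved to a new domain
    return "reachable"


def check(url, timeout=10.0):
    """Fetch a URL once and classify the result (performs a network request)."""
    request = urllib.request.Request(url, headers={"User-Agent": "linkrot-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(response.status, url, response.geturl())
    except urllib.error.HTTPError as err:   # must precede URLError (its parent)
        return classify(err.code, url, url)
    except (urllib.error.URLError, OSError, ValueError):
        return classify(None, url, url)


def top_level_domain(url):
    """Return '.org', '.edu', ... for tallying results by domain."""
    host = urlsplit(url).hostname or ""
    return "." + host.rsplit(".", 1)[-1] if "." in host else "other"


# Classifying an outcome without touching the network:
print(classify(200, "http://www.nara.gov/", "http://www.archives.gov/"))
print(top_level_domain("http://www.archives.gov/"))
```

A “reachable” result says nothing about whether the page still carries the cited content, which has to be judged separately. Many registration walls and redirect notices are also served with an ordinary 200 status, so a check based on status codes alone will undercount them.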
Also associated with those standards was the specter of purposeful deception. The content at a given URL often changes over time. Consider the fact that the half-life of Internet footnotes in our select journals is just over three years, resulting in a substantial number of dead links. URLs can be fabricated and then seem to have gone dead, with no means of fact-checking them; or those URLs can be accurate but refer to different content, casting scholarship in a false light. Worse, because those lapsed URLs can be cited in subsequent research, promulgation of inherent error is possible, undermining the very nature of verification that is at the core of scholarship–again, based on the printing press.

Upon reading a copy of our quantitative study, which earned best paper status in the Communication and Technology division of the International Communication Association in 2005, Anthony Grafton at Princeton wrote in a Feb. 23, 2005, e-mail that researchers are rapidly approaching an era in the digital age when the modern footnote will have to be reconceptualized to suit the “new” printing press of the Web. According to Grafton, “One possibility would be via references to stable online databases that are themselves preserved by reliable long-term institutions. Only a form of reference not liable to rapid decay or limited by passwords would work. And of course we don’t know yet what platforms will be genuinely stable and durable.”7

Grafton also commented on our scholarly tradition being anchored by the footnote and how this tradition might be transformed with the new medium of the Internet. “[S]cholarship is certainly anchored by its verifiability,” he wrote, “and it’s hard to imagine a form that doesn’t have some equivalent to the footnote. In the fields of humanistic scholarship that I work in, for the moment, the lowering of costs associated with computer typesetting and the transfer of labor to the author has preserved massive footnotes of the old sort for another generation or so. But after that–who knows?”8

In any case, Grafton’s comments have influenced our scholarship, forcing us to question basic aspects of the Internet, which provides instant access to hundreds of thousands of information resources, along with search tools for fast, convenient data retrieval. Because libraries, in particular, are purchasing online journals, or journals with online editions, Internet-based references will continue to increase–a research-related quandary compounded by lack of guidelines in graduate schools, academic associations and libraries themselves.

Grafton is especially sensitive to the preservation of information as historical artifact. He notes that works of history “stride forward or totter backward on their footnotes,” and that historians must accomplish two basic tasks: “They must examine all the sources relevant to the solution of a problem and construct a new narrative or argument from them. The footnote proves that both tasks have been carried out.”9 Our research indicates that the Internet threatens both tasks over time. Moreover, problems are only now starting to mount, so librarians–especially serials librarians–must start taking notice.

Serials librarians have been purchasing text/html versions of journal articles because users want to copy, paste and otherwise manipulate that data with a computer. The computer not only enables quick, easy online data retrieval; its software also allows users to manipulate that data. During the era of the printing press, manipulation of journals was a crime typically perpetrated with a razor blade in the periodicals section. In that allusion is the statement of the crisis at hand. The book is the ultimate fire-walled medium; the Internet is its opposite.
Thus, we have been recommending to scholars that the closer the medium is to the book, the more reliable the footnote. For instance, a pdf is closer to the book than text/html because the former is essentially a snapshot of paper. Text/html, by contrast, is easily manipulated. As such, scholars should beware of citing URLs for html formats, especially when pdf formats are available from journals in library databanks.

A case in point: Journalism & Mass Communication Quarterly offers full text versions of its contents in both html and pdf formats. Early in our study one of our coders analyzed the html version of J&MC Quarterly rather than the pdf version and found repeated formatting errors–an extra space between the forward slashes in “http://” and another extra space where hyperlinks in the original wrapped from the right margin to the left. The comparison between the two formats showed a 17% increase in citation failure from pdf to html.

In the end, Internet research is critical for scholarship because it allows continuous, convenient access to electronic databases and other resources, thus increasing the scope and breadth of that scholarship. That is its allure as a medium. However, without the ability to fact-check stable citations, research may become as reliable as opinion and method as unreplicable as art.

In future research we intend to compare online-only journals with traditional journals, under the theory that scholars doing research on the Internet will be particularly vulnerable to the half-life effect because their work must refer to and document Web-based content. We will be analyzing the Internet’s impact on other aspects of scholarship associated with the printing press, including indexes and bibliographies.
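The two formatting errors found in the html version (a space inside “http://” and a space where a hyperlink wrapped across lines) are mechanical enough to repair before a link is tested. The sketch below is one naive way to do so and is not a procedure described in the study; the function names and the example.org address are hypothetical.

```python
import re
from urllib.parse import urlsplit


def repair_url(raw: str) -> str:
    """Delete whitespace that line wrapping or html conversion put inside a URL."""
    return re.sub(r"\s+", "", raw)


def is_well_formed(url: str) -> bool:
    """Check only for an http(s) scheme, a host name, and no stray whitespace."""
    if re.search(r"\s", url):
        return False
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical damaged citation showing both kinds of extra space.
damaged = "http:/ /www.example.org/reports/ 2003/findings.html"
fixed = repair_url(damaged)
print(fixed)
print(is_well_formed(damaged), is_well_formed(fixed))
```

This only works when the URL has already been isolated from the surrounding citation; applied to a whole footnote it would glue the following words onto the address. A repaired URL is also only syntactically valid, and it may still point to a page that has moved or changed.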
Over time we will analyze how research methods must be changed to account for the Internet’s dynamic aspects and also will look into methods to foster reliable archiving–a library topic if there ever was one–previously known as a book shelf.

In the interim the diffusion of online access–including databases and digital journals–will undoubtedly create a snowball effect as more authors cite hypertext whose addresses will eventually lapse. Without cogent recommendations to offset the decay of Internet footnotes, scholars will not be able to access the full array of citations. Journals will become increasingly unreliable when future studies attempt replication, without which validity cannot be fully ascertained, all of which will contribute to the erosion of standards. And yet this phenomenon continues to be under the radar of many academic programs and library associations, even as Google.com publicizes plans to digitize entire libraries.

Granted, our research is not as far-reaching as detecting light from distant solar systems, the news that was announced on March 18 of this year, along with our findings; but astronomers documenting that cosmic phenomenon using footnotes in digital journals may find their observations about light extinguished over time on the Internet.

NOTES

1. Richard Harris, “Looking at Planets Beyond the Solar System,” National Public Radio’s Morning Edition; retrieved May 3, 2005, from http://www.npr.org/templates/story/story.php?storyId=4556714.
2. Scott Carlson, “Scholars Note Decay of Citations to Online References,” The Chronicle of Higher Education, March 18, 2005, p. A30.
3. Carlson, p. A30.
4. “Problems Using the Web,” 1997. Retrieved May 3, 2005, from http://www.gvu.gatech.edu/user_surveys/survey-1997-10/graphs/use/Problems_Using_the_Web.html
5. Drake A. Dellavalle and Graber, M., Heilig, L., Hester, E., Kuntzman, J., & Schilling, L., “Going, going, gone: Lost Internet references,” Science, October 2003, pp. 787-788.
6. Anthony Grafton, The Footnote: A Curious History (Cambridge, Mass.: Harvard, 1997), p. 13.
7. Anthony Grafton, grafton@.Princeton.EDU, “Question about the Footnote from Iowa State,” 23 February 2005, personal e-mail to Michael Bugeja, <bugeja@iastate.edu> (23 February 2005).
8. Grafton, “Question about the Footnote from Iowa State.”
9. Grafton, pp. 4-5.