Op-ed

Licensed for reuse? Citing open-access sources in Wikipedia articles

The views expressed in this op-ed are those of the author only; responses and critical commentary are invited in the comments section. The Signpost welcomes proposals for op-eds at our opinion desk.
This image of Xanthichthys ringens is sourced from an open-access scholarly article licensed for re-use. Should we make that reusability explicit when citing this source in Wikipedia articles?[1]

It is deeply ironic that two decades after the World Wide Web was started, largely to make it easier to share scholarly research, most of our past and present research publications are still hidden behind paywalls for private profit. The bitter twist is that the vast majority of this research is publicly funded, to the tune of hundreds of billions of dollars worldwide each year.

This has placed Wikipedia in an awkward position with respect to its verifiability policy: "all material in Wikipedia mainspace, including everything in articles, lists and captions, must be verifiable [so that] people reading and editing the encyclopedia can check that the information comes from a reliable source." Combined with the policy on identifying reliable sources, the paywall dilemma faced by editors and readers becomes clearer: "many Wikipedia articles rely on scholarly material. When available, academic and peer-reviewed publications, scholarly monographs, and textbooks are usually the most reliable sources." Moreover, none of the academic journals most cited on the English Wikipedia is open access until PLOS ONE breaks the drought at No. 22 on that list.

WP:PAYWALL advises: "Do not reject sources just because they are hard or costly to access". Yet one editor, commenting on a draft proposal that Wikipedia articles should preferentially cite open-access literature, wrote that "verifiability isn't an option if people are expected to pay in excess of $20 to view a single article ... over closed- or toll-access resources of equivalent scholarly quality". That draft proposal, started in 2007 when the English Wikipedia was half its current age, died quietly, like so many others.

But what if we could just mark references as being open, rather than preferentially citing them over closed ones? WikiProject Open Access is currently exploring the options, and the Workgroup on Open Access Metadata and Indicators (OAMI) at the National Information Standards Organization has been working on a set of recommendations for how to provide information about the use and re-use rights of scholarly articles. A draft version was released last week, and public comments are invited until 4 February.

These recommendations boil down to two metadata tags:

  • <free_to_read>, which signals whether and when a publication is available publicly without a requirement for payment or registration, and
  • <license_ref>, which points to a stable place on the web containing the licensing terms applicable to that publication.
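As a rough illustration of how a consumer might read these two tags, here is a minimal Python sketch. Only the element names <free_to_read> and <license_ref> come from the draft; the wrapping <article> element, the start_date attribute, the sample values and the function name read_access_metadata are assumptions made for the example, not part of the recommendations as described here.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical record. Only the two tag names are taken from the draft;
# the <article> wrapper, the start_date attribute and the values are
# illustrative assumptions.
SAMPLE = """
<article>
  <free_to_read start_date="2014-01-01"/>
  <license_ref>http://creativecommons.org/licenses/by/4.0/</license_ref>
</article>
"""


def read_access_metadata(xml_text, today):
    """Return whether the item is free to read on `today`, plus its license URIs."""
    root = ET.fromstring(xml_text.strip())

    # "Whether and when": the tag must be present, and any start date
    # must already have passed.
    free = False
    tag = root.find("free_to_read")
    if tag is not None:
        start = tag.get("start_date")
        free = start is None or date.fromisoformat(start) <= today

    # Each license_ref element carries the URI of the licensing terms.
    license_refs = [
        el.text.strip() for el in root.findall("license_ref") if el.text
    ]
    return {"free_to_read": free, "license_refs": license_refs}
```

Calling read_access_metadata(SAMPLE, date(2014, 1, 15)) on the sample record reports it as free to read and returns the single license URI; with a date before the assumed start date, the same record is reported as not yet free to read.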

The recommendations don't include:

  • a definition of the term open access;
  • specifications as to which licensing terms would be acceptable, or whether and how they should be version-controlled; and
  • suggestions for icons that may be suitable for signalling the content of the proposed tags.

Similar recommendations have been put forward in a more broadly scoped draft report from Jisc, the UK body that supports senior-high-school and higher education. The draft was released for public comment in September, and its final version is still being worked on. A related report from the Confederation of Open Access Repositories looked at components of license clauses in use by scholarly publishers.

One of the organisations involved in the NISO Workgroup is CrossRef, which is working on including the proposed tags in its metadata and making that information available through its API, in collaboration with the Directory of Open Access Journals. The Open Article Gauge, developed by Cottage Labs with support from the Public Library of Science (PLOS), already provides article-level information about licensing terms for a subset of the scholarly literature. PLOS has signalled an interest in implementing a system that would provide licensing information for references cited in articles published in its journals, which are among the best-known open-access journals.

The NISO document contains a scenario quite similar to searching for illustrations for use in Wikipedia articles:

The reference 1 (broken in the NISO document) refers to the November 2012 open-access report (part of the Wikimedia GLAM newsletter), which lists examples of such conflicting licensing statements and served as the basis for a more detailed analysis published and presented last October.

The icon used to signal the Attribution module in Creative Commons licenses.

It is the potential for these kinds of inconsistencies that motivated the NISO group to opt for signalling only the stable home (the URI) of the licensing terms and not individual use and re-use rights. Many publishers use licensing terms incompatible with Creative Commons licenses, and Wikipedia users might need legal assistance to understand their implications. This makes it difficult to see how signalling those terms (other than perhaps by way of {{closed access}} or {{subscription required}}) would bring any benefit to those users.

The case is different for Creative Commons licenses: their URI (e.g. http://creativecommons.org/licenses/by/4.0/) already signals re-use rights, which makes the <license_ref> tag easy to implement, and the corresponding <free_to_read> tag can always be set to "yes". Compatibility with the NISO recommendations would thus be ensured.
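To make concrete how a Creative Commons URI carries the re-use rights in its path, the following Python sketch splits such a URI into its license modules and version. The function name describe_cc_license and the returned structure are hypothetical; the pattern only covers the "by"-based license family and ignores older or special-purpose Creative Commons URIs.

```python
import re

# Matches URIs such as http://creativecommons.org/licenses/by-nc-sa/3.0/
# Only the "by"-based license family is covered; this is a simplifying
# assumption, not a complete description of Creative Commons URIs.
CC_LICENSE = re.compile(
    r"^https?://creativecommons\.org/licenses/"
    r"(?P<modules>by(?:-(?:nc|nd|sa))*)/(?P<version>\d+\.\d+)/?"
)

MODULE_NAMES = {
    "by": "Attribution",
    "nc": "NonCommercial",
    "nd": "NoDerivatives",
    "sa": "ShareAlike",
}


def describe_cc_license(uri):
    """Return the modules and version encoded in a CC license URI, or None."""
    match = CC_LICENSE.match(uri)
    if match is None:
        return None
    codes = match.group("modules").split("-")
    return {
        "modules": [MODULE_NAMES[code] for code in codes],
        "version": match.group("version"),
    }
```

For the example URI in the text, describe_cc_license("http://creativecommons.org/licenses/by/4.0/") yields the Attribution module and version 4.0, while a URI that is not a Creative Commons license gives None.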

On Wikimedia sites, a number of external link icons are already in use that act on certain elements of a URI—for example, a lock icon for HTTPS, as in https://www.eff.org/copyrightweek (which is this week, a period of action around copyright, organised by the Electronic Frontier Foundation). So having the CC BY icon displayed right next to external links that contain the string "http://creativecommons.org/licenses/by/" would be straightforward. Once the licensing information is available via the CrossRef API, a link to the appropriate CC URI could be added automatically to template-based references (e.g. by way of Citation bot, which was migrated to Wikimedia Labs last weekend).
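The idea of choosing an icon from elements of a link's URI can be sketched in a few lines of Python. This is an illustration of the matching logic only, not a description of how Wikimedia sites implement their link icons; the rule table, the icon labels "cc-by" and "lock", and the function name icons_for are all hypothetical.

```python
# Each rule pairs a test on the URL with a hypothetical icon label.
ICON_RULES = [
    # The string test proposed in the text for CC BY licensed sources.
    (lambda url: "http://creativecommons.org/licenses/by/" in url, "cc-by"),
    # The existing behaviour described in the text: a lock for HTTPS links.
    (lambda url: url.startswith("https://"), "lock"),
]


def icons_for(url):
    """Return the labels of all icons whose rule matches the given URL."""
    return [icon for matches, icon in ICON_RULES if matches(url)]
```

With these rules, the EFF link above would get the lock icon, a link to the CC BY 4.0 license text would get the CC BY icon, and a plain HTTP link to any other site would get none.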

Since Wikidata enabled phase I support for Wikisource on Tuesday, it would even be possible to link to the full text available from Wikisource (see also the Wikisource vision) and to the corresponding Wikidata entry, as demonstrated in the reference. Of course, there is room to economise on space, such as by linking the icons directly rather than adjacent bits of text, and if the article is covered on other Wikimedia platforms (e.g. Wikiquote, Wikinews, Wikispecies), the corresponding links could be included as well.

Currently, Wikidata items can be created for sources supporting statements on Wikidata, but the details of whether and how other sources (e.g. those supporting statements in a Wikipedia or Wikibooks page) are to be handled, or whether Citation bot should be ported to Wikidata, have yet to be worked out. Two taskforces have been created to work on this: one for books and one for periodicals.

Irrespective of the details, I think that if Wikipedia articles were to signal the openness of scholarly references they cite, this would go a long way towards raising awareness of open licensing among users of Wikimedia content, amplifying similar efforts by open-access publishers and even Google, whose image search by re-use rights (available since 2009) was simplified this week.


Another image that anyone is allowed to freely reuse, revise, remix, and redistribute for any purpose: Prognathodes aculeatus, out of a total of 202 files on Wikimedia Commons from the same source.[1]

References

  1. ^ a b Williams, J. T.; Carpenter, K. E.; Van Tassell, J. L.; Hoetjes, P.; Toller, W.; Etnoyer, P.; Smith, M. (2010). Gratwicke, Brian (ed.). "Biodiversity Assessment of the Fishes of Saba Bank Atoll, Netherlands Antilles". PLOS ONE. 5 (5): e10676. Bibcode:2010PLoSO...510676W. doi:10.1371/journal.pone.0010676. PMC 2873961. PMID 20505760. CC0 full text media metadata