Open Access, Open Data, and Open Educational Resources
Stop Link Rot: Use Perma.cc to Preserve Web References in Scholarship
As research becomes increasingly digital, it’s becoming more important to ensure a findable and unchanging scholarly record. Researchers are probably familiar with the digital object identifier (DOI), which in URL form provides a persistent link to articles (and more), and libraries and publishers provide redundant archiving to ensure scholarship is preserved for the long term.
However, it’s also important to make sure links (other than DOIs) in articles work, and to make sure web pages represent what the author saw when she cited them. Some journals are aware of these issues, and I’ve noticed a few authors who employ URLs from the Internet Archive’s Wayback Machine in their manuscripts. Thinking back to my last article, there are probably some links that won’t work five (or twenty) years from now, or links resolving to web pages that won’t accurately represent what I was referring to at the time.
These problems are usually referred to as link rot and reference rot. Link rot means a URL can no longer be found (your browser returns a 404 error); reference rot means the information cited at a URL has either disappeared or changed. Research has shown that 50% of the links in Supreme Court decisions from 1996 to 2010 had reference rot, that one in five articles suffers from reference rot, and that three out of four URI references lead to changed content. The authors of the last study say these problems raise “significant concerns regarding the long term integrity of the web-based scholarly record.” In this era of fact-checking and “fake news,” it’s more important than ever to stabilize the evidence base built in peer-reviewed articles.
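To make the distinction concrete, here is a minimal Python sketch of how a cited URL might be sorted into "link rot," "reference rot," or "intact." It is only an illustration: the function names are hypothetical, and comparing a byte-level fingerprint of the page is an assumption made for simplicity. Real pages often change their bytes (ads, timestamps) without changing the cited information, so a fingerprint mismatch is a crude signal, not proof of reference rot.

```python
import hashlib
from typing import Optional


def fingerprint(body: bytes) -> str:
    """Return a SHA-256 hex digest of the page body (hypothetical helper)."""
    return hashlib.sha256(body).hexdigest()


def classify_reference(
    status_code: int,
    body: Optional[bytes],
    fingerprint_at_citation: str,
) -> str:
    """Classify a cited URL from an HTTP status code and the current page body.

    status_code and body are whatever a fetch of the URL returned today;
    fingerprint_at_citation is the digest recorded when the page was cited.
    """
    # Link rot: the URL can no longer be found.
    if status_code == 404 or body is None:
        return "link rot"
    # Reference rot: the page resolves, but its content differs from what was cited.
    if fingerprint(body) != fingerprint_at_citation:
        return "reference rot"
    return "intact"


# Usage with made-up page content
cited_body = b"<p>The claim the author cited.</p>"
recorded = fingerprint(cited_body)
print(classify_reference(404, None, recorded))
print(classify_reference(200, b"<p>Rewritten page.</p>", recorded))
print(classify_reference(200, cited_body, recorded))
```

An archived copy sidesteps both cases, because the citation points at a fixed capture rather than at whatever the live URL serves today.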
To help address this problem, Virginia Tech’s University Libraries is pleased to announce that we are now a registrar for Perma.cc, a service to provide archiving of web pages for research purposes. Researchers at Virginia Tech will be able to archive, manage, and annotate an unlimited number of web pages with persistent shortlinks for citing, and will also receive local support.
Including a Perma.cc link in a citation or footnote may depend on the citation style you are using, but a general recommendation is to include the original URL, followed by “archived at” and the Perma.cc shortlink, for example:
36. Scott Althaus & Kalev Leetaru, Airbrushing History, American Style, Cline Center for Democracy (Nov. 25, 2008), http://www.clinecenter.illinois.edu/airbrushing_history, archived at http://perma.cc/G8PW-798L.
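For anyone assembling footnotes programmatically, the general recommendation above can be sketched as a small Python helper. The function name `cite_with_archive` is hypothetical, and the output follows only the "original URL, then archived at, then shortlink" pattern described here; your citation style may call for something different.

```python
def cite_with_archive(citation: str, original_url: str, perma_url: str) -> str:
    """Append the original URL and its Perma.cc shortlink to a citation.

    Trailing commas and spaces on the citation are dropped so the
    separators are not doubled.
    """
    return f"{citation.rstrip(', ')}, {original_url}, archived at {perma_url}."


# Usage with the example footnote from this post
footnote = cite_with_archive(
    "Scott Althaus & Kalev Leetaru, Airbrushing History, American Style, "
    "Cline Center for Democracy (Nov. 25, 2008)",
    "http://www.clinecenter.illinois.edu/airbrushing_history",
    "http://perma.cc/G8PW-798L",
)
print(footnote)
```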
If you click on the Perma.cc link above, you can see how the web page looks in archived form. In addition, the time of capture is recorded, there’s a link to the live page, and you can download the archive file (under “show record details”). Perma.cc is intended for non-commercial scholarly and research purposes that do not infringe or violate anyone’s copyright or other rights. Web pages to be archived should be freely available without payment or registration. Additionally, some web pages employ a “noarchive” restriction; Perma.cc still archives these pages but makes the archive private. In other words, the shortlink can be shared, but the archived page is visible only to the researcher who created it, or to others upon request.
There are some advantages to using Perma.cc over the Wayback Machine (and the Internet Archive is a supporting partner of Perma.cc). Perma.cc provides a more thorough, accurate capture in two forms: a web archive file (WARC) and a screenshot (PNG). Perma.cc also provides persistent shortlinks that are more convenient for citing, and it lets researchers manage their links (with folders, annotation, and public/private control).
Other features of the Perma.cc system include:
- Researchers will be added as organizations, and can add other users within that organization, such as lab members or collaborators
- A bookmarklet and extension are available for easy use in a browser (users must be logged in)
- Links can be deleted within 24 hours of creation
See more information about creating Perma.cc records and links, and check out the FAQ. Perma.cc is built by Harvard’s Library Innovation Lab, and in alignment with its focus on preservation, the service has a contingency plan and is also open source.
To get started, send an email to email@example.com and request an account (or to be added as an unlimited user if you’ve already signed up). You can also send questions and problems to this address, or you can use the Perma.cc contact form.