Archiving Quoted Web Resources

Picture of an archive
Image by Chris Stermitz from Pixabay

Quoting web resources is a hassle for several reasons:

  1. Web pages are not available anymore.
  2. Web pages have moved to another URL.
  3. Web pages change their content so that the cited reference is not correct anymore.

These problems weigh especially heavily in the humanities, where detailed content analysis of websites is a popular research method. Being able to refer to exact quotes is a matter of reproducibility and therefore crucial in science generally. This article presents some strategies and tools to overcome the challenges mentioned above.

Quote websites with the Wayback Machine

WebCite used to be a web service for circumventing link rot and changed content. It allowed users to archive online resources and returned a URL where the archived pages could be accessed. The service was often down and therefore notoriously unreliable, and as of July 14, 2019, it no longer accepts new archive requests.

Figure 1: Start page of the WebCite service, stating that new archiving requests are currently not accepted.

Luckily, the Wayback Machine¹, operated by the Internet Archive, offers a reliable alternative. Although there is a detailed how-to for using this service in the Wikipedia context, I have prepared my own how-to on using the Wayback Machine for the general public.

Figure 2: Start page of the Wayback Machine, a service by the Internet Archive
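
For readers who prefer to script the lookup instead of using the web form, the Internet Archive also offers a public availability endpoint that reports the closest archived snapshot of a URL. The following Python sketch builds such a query and reads the reply. The function names are illustrative, and the shape of the JSON reply is an assumption based on the publicly documented format, so treat this as a starting point rather than a definitive client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public availability endpoint of the Wayback Machine.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived URL from a JSON reply, or None if nothing is archived.

    Assumes the reply has the form
    {"archived_snapshots": {"closest": {"available": ..., "url": ...}}}.
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"]


def lookup(url, timestamp=None):
    """Ask the live service (needs network access) for the closest snapshot."""
    with urlopen(availability_query(url, timestamp), timeout=30) as response:
        return closest_snapshot(response.read().decode("utf-8"))
```

Calling `lookup("http://peeep.us", "20180813")` would contact the live service, so the result depends on what the archive holds at that moment.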

How to cite archived resources?

The Internet Archive asked the Modern Language Association (MLA) how to cite resources archived with the Wayback Machine. MLA Style is a prevalent system for documenting sources in scholarly writing.

MLA answered

that there is no established format for resources like the Wayback Machine, but it’s best to err on the side of more information. You should cite the webpage as you would normally, and then give the Wayback Machine information.

MLA also provided an example:

McDonald, R. C. “Basic Canary Care.” Robirda Online. 12 Sept. 2004. 18 Dec. 2006 [http://www.robirda.com/cancare.html]. Internet Archive. [http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare.html].

Note there are several additions to a standard bibliography:

  • Two dates: The first is the date of the archived page, followed by the date on which the page was retrieved.
  • Two URLs: The first is the original URL (which may no longer be available), followed by the archived URL from the Internet Archive.
  • Web service: Between the two URLs comes the ‘second’ author, that is, the name of the internet service that archived the resource and generated its URL.

According to MLA, both URLs shouldn’t be underlined in the bibliography.
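
The archived URL already contains two of these pieces: a 14-digit snapshot timestamp (year, month, day, hour, minute, second) followed by the original URL. A minimal Python sketch for splitting it might look like this. The function name is hypothetical, and the pattern assumes the plain `web.archive.org/web/<timestamp>/<original URL>` form seen in the examples here, not every variant the service may produce.

```python
import re
from datetime import datetime

# Assumed URL shape: https://web.archive.org/web/<14 digits>/<original URL>
WAYBACK_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def split_wayback_url(archived_url):
    """Return (snapshot time, original URL) for a Wayback Machine URL."""
    match = WAYBACK_PATTERN.match(archived_url.strip())
    if match is None:
        raise ValueError("not a recognised Wayback Machine URL: " + archived_url)
    timestamp, original_url = match.groups()
    # The timestamp is written as YYYYMMDDhhmmss.
    snapshot_time = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return snapshot_time, original_url


snapshot, original = split_wayback_url(
    "http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare.html"
)
print(snapshot.date(), original)
```

For the MLA example above, this yields a snapshot date of 9 October 2004 and the original address http://www.robirda.com/cancare.html.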

Let’s try another example. The archiving service Peeep.Us is not available anymore. The Wayback Machine gives us the archived URL https://web.archive.org/web/20180813205348/http://peeep.us:80/. If we compose the bibliography entry in the usual way, we get:

Nikolaev, Cyril. “Peeep.Us.” Save Snapshot of a Web Page Forever!, 13 Aug. 2018, https://web.archive.org/web/20180813205348/http://peeep.us:80/.

Naming an author for a web site may be questionable, but I do so whenever there is a reasonable basis (e.g., the copyright notice or the name of the institution that produces the web site).

Now we have to add the retrieval date, the original URL and the name of the archiving service:

Nikolaev, Cyril. “Peeep.Us.” Save Snapshot of a Web Page Forever!, 13 Aug. 2018, 22 Jul. 2019 [http://peeep.us]. Internet Archive. [https://web.archive.org/web/20180813205348/http://peeep.us:80/].
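
Assembling such an entry by hand is error-prone, so the pattern can be captured in a small helper. The Python sketch below simply mirrors the ordering used in the entry above. The function and parameter names are hypothetical, the dates are passed in as already formatted strings, and the layout is this article's reading of the MLA reply, not an official MLA template.

```python
def archived_citation(author, title, site, archive_date, retrieval_date,
                      original_url, archived_url, service="Internet Archive"):
    """Format a bibliography entry for an archived web page.

    Order: author, title, site, archive date, retrieval date,
    [original URL], archiving service, [archived URL].
    """
    return (
        f"{author}. “{title}.” {site}, {archive_date}, {retrieval_date} "
        f"[{original_url}]. {service}. [{archived_url}]."
    )


entry = archived_citation(
    author="Nikolaev, Cyril",
    title="Peeep.Us",
    site="Save Snapshot of a Web Page Forever!",
    archive_date="13 Aug. 2018",
    retrieval_date="22 Jul. 2019",
    original_url="http://peeep.us",
    archived_url="https://web.archive.org/web/20180813205348/http://peeep.us:80/",
)
print(entry)
```

Keeping the archiving service as a parameter with a default leaves room for other archives without changing the layout.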

Wakelet

In addition to the following Wakelet, there is also a community edition on my Wakelet homepage where you can add relevant links.


  1. To access this page, you must be registered with archive.org.

Page created: 2019-07-22 | Last modified: 2019-07-24