This week we dig into the procedures that power the NRS Web Continuity Service. We are a multi-faceted service, dealing with numerous stakeholders and subject areas. With that in mind, we need to ensure our processes are efficient and effective, to help us deliver a high-quality web archive.
But what do we mean by ‘high quality web archive’? In web archiving, quality can be related to three elements:
Completeness – how much of the captured website’s links, text, downloads etc. the crawler has been able to access and capture
Behaviour – how much of the navigational functionality within the captured website snapshot has been preserved, compared to the live site
Welcome to our blog! Over the course of a few weeks, we will take Open Book readers on a tour of NRS’s new Web Continuity Service. Web archiving and Web Continuity represent an exciting new era for archiving at NRS, providing a digital tool that directly supports our mission to,
“collect, preserve and produce information about Scotland’s people and history, and make it available to inform present and future generations.”
Stay tuned for bite-sized articles on how this new service operates, and how it will contribute to the development of Scotland’s national archive collection and support the Scottish Government’s transparency agenda.
Websites as archival public records and the ‘looking glass’ into government
Nowadays, when a member of the public wants to understand something about government, the first source they will likely check is an official government website (probably found via Google).
In this multi-channel era, government websites play a critical role in disseminating official, trusted information, so that the government remains accountable and transparent to the citizen.
As a result, government websites form an integral part of the public record. National archives, which capture, preserve and make available public records, are therefore taking steps to capture a representative record of this modern aspect of government. To do so, national archives are creating web archives. Web archives have been around for some time. Nevertheless, the process of web archiving is technically challenging: more on that in our next blog post.
If done well, web archiving has the potential to dramatically alter the way we record, preserve, and analyse the activities of our government and wider society.
Selecting and capturing government websites, evidencing how these change over time, and making the output of this archiving process clear, reusable and interoperable, can create a powerful ‘looking glass’ into modern official business. It can also do this in a scalable and consistent manner.
Furthermore, emerging research indicates that web archives may form the single most important contextual record for understanding society in the last twenty years, and will continue to do so. Here are some examples to ponder:
Do you want to understand how US institutions reacted to the September 11th attacks? The Library of Congress has a web archive collection dedicated to this, which is free to access, including this archived snapshot from the US Department of State, dated 12th September 2001.
Turning closer to home, the snapshots of the Scottish Parliament website, now captured by National Records of Scotland, provide a navigable resource for the business of Scotland’s devolved legislature, with time-stamped captured content available on Current Bills, MSPs, and special events and visitors to Holyrood.
Observant readers will quickly notice some unusual features of these archived pages: they all have arresting headers to show the user that the page is archived and when this occurred, and some of the original dynamic functionality, such as search, unfortunately may not work.
What is key, though, is that these archives have attempted to capture information from these websites as completely and accurately as possible.
In the next blog, we will explore the core technology behind web archiving, its technical challenges, and how archives (and NRS) are responding to this new era of collecting.