FAQs on web archiving
Search the UK Government Web Archive
You can search for a website by entering a web address and clicking on the Search button:
The Modernising Government White Paper sets a target that all Government services to the citizen and to business should be available online by 2005. The World Wide Web is increasingly becoming the principal means of interaction between Government, citizens and business, and The National Archives has a responsibility to collect and preserve websites as evidence of the changing nature of this interaction.
The National Archives is pioneering new work in the field of digital preservation, including the recent development of a cutting-edge digital archive system to provide for the long-term preservation of electronic public records. This project is part of the wide-ranging research and development being undertaken by its Digital Preservation department, to develop new techniques for archiving electronic records.
This project is being undertaken in order to inform future policy on the archiving of Government websites.
The National Archives is committed to preserving electronic public records using methods which represent both best practice and value for money. Accordingly, we are investigating a number of approaches to the archiving of websites. We will use this project to evaluate the Internet Archive's method, gain valuable experience in the collection, preservation and delivery of archived websites, and begin to develop a collection for public access. Our current investigations, including this project, will allow us to develop a longer-term policy on the best approach for archiving Government websites.
The European Archive is a non-profit organisation that was founded to build an 'Internet library,' with the purpose of offering permanent access for researchers, historians, and scholars to historical collections that exist in digital format.
The UK Web Archiving Consortium (UKWAC) is made up of 6 leading UK institutions, and aims to expand the lifespan of website materials from around 44 days (the same life expectancy as a housefly) to a century or more. The consortium launched a pilot project in 2004, to run for an initial period of two years. During the project, approximately 6,000 websites have been collected and archived. Each consortium member selects and captures.
The European Archive is part of one of the largest and longest-running web archiving services in the world. As such, it brings unique skills and experience to this project. By undertaking this project, we will be able to evaluate how successful its methodology is in this context, and develop a substantial and unique collection of websites. The National Archives is also using the UK Web Archiving Consortium approach to web archiving in parallel to this project.
Yes, access to the collection is completely free, from The National Archives website, the European Archive's website, the UK Web Archiving Consortium website, and from the public search rooms and cyber café at The National Archives in Kew.
Please see the list of archived websites.
A recent study suggested that there are currently approximately 2,500 separate UK Government websites. This project is being undertaken to help develop a policy on the archiving of these websites. The websites have been carefully chosen as a representative sample of the entire UK Government web domain.
The websites were selected by The National Archives in accordance with our Operational Selection Policy for websites, which is designed to reflect the overall functions of government. The websites were chosen to be representative of each of these functions. This provides a broad cross-section across UK Central Government.
The websites are collected using specially designed crawler software, which will retrieve and store each page of the site. This process is called harvesting. The sites are then catalogued and stored.
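As a rough illustration of what harvesting involves, the sketch below follows links breadth-first from a starting page and stores every page it reaches on the same host. It is a minimal example and not the crawler software used for this collection: the `harvest` function, the `fetch` callable, and the `www.example.gov.uk` pages are all hypothetical, and an in-memory dictionary stands in for a live website so that the example runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def harvest(start_url, fetch):
    """Breadth-first harvest of every reachable page on the starting host.

    `fetch` is a hypothetical callable returning the HTML for a URL,
    or None when the page cannot be retrieved.
    """
    host = urlparse(start_url).netloc
    stored = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        stored[url] = html  # keep a copy of the page
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            target, _fragment = urldefrag(urljoin(url, href))
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return stored


# Hypothetical three-page site used in place of a real one.
SAMPLE_SITE = {
    "http://www.example.gov.uk/": (
        '<a href="/about.html">About</a> '
        '<a href="http://other.example.org/">Elsewhere</a>'
    ),
    "http://www.example.gov.uk/about.html": (
        '<a href="/">Home</a> <a href="contact.html#top">Contact</a>'
    ),
    "http://www.example.gov.uk/contact.html": "<p>Contact us</p>",
}

archive = harvest("http://www.example.gov.uk/", SAMPLE_SITE.get)
print(sorted(archive))
```

Links to other hosts are deliberately not followed, so the harvest stays within the one website. A production crawler would also need to handle images, stylesheets, politeness delays and robots exclusion rules, none of which are shown here.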
The majority of the websites are stored in a format originally developed by the Internet Archive, now also available on the European Archive. The archived sites can be accessed using their Wayback Machine software. This provides access to the European Archive's collections: users enter the URL of the website they wish to view, and then select by date from the archived versions available. The websites can be harvested at varying intervals, to provide the flexibility to respond to changing circumstances. Currently, some of the websites are harvested every week, and the remainder are harvested every six months.
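The idea of choosing an archived version by date can be sketched as follows. This is only an illustration of the lookup from the user's point of view, not a description of how the Wayback Machine software is implemented: the `closest_capture` function, the `ARCHIVE_INDEX` dictionary, the example URL and the capture dates are all invented for the example.

```python
from datetime import date


def closest_capture(captures, wanted):
    """Return the capture date nearest to `wanted`.

    When two captures are equally close, the earlier one is returned.
    """
    if not captures:
        raise ValueError("no archived versions available")
    return min(sorted(captures), key=lambda d: abs((d - wanted).days))


# Hypothetical index: website address -> dates on which it was harvested.
ARCHIVE_INDEX = {
    "http://www.example.gov.uk/": [
        date(2004, 1, 5),
        date(2004, 7, 5),
        date(2005, 1, 3),
    ],
}

chosen = closest_capture(
    ARCHIVE_INDEX["http://www.example.gov.uk/"], date(2004, 6, 1)
)
print(chosen)
```

A site harvested weekly would simply have many more dates in its list than one harvested every six months, so a request for a given date is likely to land much closer to an actual capture.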
In addition, the websites in the growing collection archived by the UK Web Archiving Consortium are collected and stored using PANDAS software, originally developed by the National Library of Australia. These can be harvested as one-off 'snapshots', or at selected regular intervals from weekly to annually.
Copies of the websites collected are also stored at The National Archives, for long-term preservation by our Digital Preservation department.
The websites are publicly available on the World Wide Web, on The National Archives website, and also via the European Archive and UKWAC. Access is also available from the public search rooms and cyber café at The National Archives in Kew.
No, the only requirements to view the websites are an internet connection and a web browser.
This is the first UK archive of Government websites to be created, and one of the first archives of websites to be created in this country.
The National Archives is committed to providing public access to this collection. Because the European Archive and the UK Web Archiving Consortium are independent organisations, we cannot guarantee their long-term future or that of their websites. For this reason, we are acquiring a copy of all the data collected and preserving it in The National Archives. This will allow us to provide future access to the collection without any dependency on other agencies.
Some of the early archived websites were collected using experimental technology. The archiving software often collected only the top-level pages of a website, usually without images. In other cases, the website was collected only as a link from one of the websites in our core collection. This means that in some cases only one or two pages were gathered as a 'snapshot'.
There can be two reasons for this. Either the name of the department and its website address changed (for example, the Public Record Office is now The National Archives), or the website address alone changed (for example, the Department of Health website was 'www.doh.gov.uk' and is now 'www.dh.gov.uk'). Often there will be an overlap in covering dates; we list the most recent set of archived websites from the date they began under the new address.
Yes. Several records in Electronic Records Online take the form of websites. These were selected for permanent preservation in the Digital Archive, and form part of our main digital collection.