
The Archaeology Data Service, keeping the Grey Literature Library going

Welcome to another post on the Archaeology Data Service (ADS) Day of Archaeology blog, 2012.

If you want a quick introduction to the ADS and what we do, see last year’s post.

We have contributions from two members of ADS staff this year: one from Stuart Jeffrey, ADS Deputy Director (Access), and this one from Ray Moore, one of the ADS Digital Archivists.

Ray Moore

As a digital archivist at the Archaeology Data Service, my day-to-day activities involve accessioning the digital data and other outcomes of archaeological research that individuals and institutions deposit with us, developing a preservation programme for that data, and curating existing ADS collections.

Today, and indeed for the past week, I have spent much of my time working on the Grey Literature Library (or GLL). The GLL is an important resource for amateur and professional archaeologists working today, providing access to the many thousands of unpublished fieldwork reports, or grey literature, produced during the various assessments, surveys and fieldwork carried out throughout the country. These activities are recorded using OASIS (or Online AccesS to the Index of archaeological investigationS) and, after passing through a process of validation and checking, the reports produced in these projects arrive at the ADS. On first impressions, then, the digital archive may seem like an ‘end point’, a place where archaeological grey literature goes to die, but the ADS, through the GLL, makes these reports available to other archaeologists and the wider community, allowing the grey literature to inform future research. At the same time, as a digital archive, we take steps to preserve these reports so that future generations can continue to use the information that they contain; an important job, as many of these reports do not exist in a printed form.


Reports from the Grey Literature Library.

So what does digitally archiving a grey literature report entail? Initially all the grey literature reports must be transferred from OASIS to the ADS archive; this is the easiest part of the process. More often than not the report comes in Portable Document Format (or PDF), and while this is useful for sharing documents electronically it is pretty useless as a preservation format for archiving. One of my jobs is to convert these files into a special archival form of PDF, called PDF/A (the A standing for Archive). Sounds easy, but often it can take some work to get from PDF to PDF/A (my all-time record is 2 hours producing a 900 MB PDF/A file). These conversions must also be documented in the ADS Collection Management System so that other archivists can see what I did to the file to preserve it and its content. While OASIS collects metadata associated with the project, the ADS uses a series of tools to generate file-level metadata specific to the creation of the file, so that we can understand what the file is and how it was created. Only once these processes are complete can the file be transferred to the archive, with a version also added to the GLL so that people can download and read the report. With a throughput of some 500 to 600 reports per month, the difficulties of the task should become apparent; and all this alongside my other duties as a digital archivist. This month’s release includes an interesting report on The Olympic Park Waterways and Associated Built Heritage Structures which stood on the site now occupied by the Olympic Park. Anyway, I’d better get back to it!
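To give a flavour of what file-level metadata can look like, here is a minimal Python sketch. It is an illustration only and not the ADS tooling, which this post does not name: the function name `file_level_metadata` and the choice of fields (filename, size, SHA-256 checksum, last-modified time) are assumptions made for the example.

```python
import hashlib
import os
from datetime import datetime, timezone


def file_level_metadata(path):
    """Return a few basic file-level facts about one file.

    Hypothetical helper for illustration; the fields are an assumption,
    not the metadata schema used by any particular archive.
    """
    sha256 = hashlib.sha256()
    # Read in blocks so that very large reports do not need to fit in memory.
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            sha256.update(block)
    stat = os.stat(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "filename": os.path.basename(path),
        "size_bytes": stat.st_size,
        "sha256": sha256.hexdigest(),
        "modified_utc": modified.isoformat(),
    }
```

Recording a checksum like this before and after a conversion is one common way to show later that a stored file has not changed since it was archived.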

A day with the Archaeology Data Service


Welcome to the Archaeology Data Service (ADS) Day of Archaeology blog. Before we start looking at some of the nitty-gritty of our busy day, it might be useful to give a little bit of background on what we do, especially for those of you who maybe don’t know anything about us at all.

It’s not all trowels, beards and woolly jumpers: In lots of the other Day of Archaeology blogs you will be reading about archaeologists out in the field excavating, surveying, recording and so on. You’ll also read about the careful cleaning and analysis of artefacts that have been recovered: the pots, metalwork, skeletons and so on. This is often exciting and stimulating work, but it raises an important question: why is it being done? There are lots of good answers to this question that range from the very philosophical to the very practical. However, almost all the answers rely on the fact that the information that archaeologists create, the data they gather, will be around for everybody to reuse in the future. This can be said to apply to many disciplines, but it is especially important for archaeology because the process of excavating a site is of course the process of destroying it too! What remains after the site is excavated are the memories of the experience, the impressions of those affected by the site, and the ideas about the past that those involved in the work (and those watching it happen) have created through direct contact and through consideration of the material that has been recovered. After the project is over, the main connection back to the site, apart from memories and the physical remains considered important enough to keep in a museum, is the records that are generated throughout the archaeological process (sometimes called primary data) and the ideas about people in the past that these records have helped to inform (often called interpretation).

The King's Manor, York - where the ADS is based.


So it is important for archaeologists and all those with an interest in the past that these records are kept safe for the long term, especially because they can’t be recreated. At first glance this might seem like a straightforward problem, but it is a surprisingly complex one and has become more so in the last 25 years. This is because almost all archaeological information is now created in digital form and covers a huge range of data generation and recording techniques: databases, text documents, images, videos, sound recordings, aerial photographs, satellite images, laser scanning, digital mapping, sonar data, three-dimensional models and so on. It is often very surprising to discover that even with all this new technology, and sometimes because of it, the data created is really quite fragile and requires a lot of looking after. This is where the ADS comes in. The ADS are a digital archive with two main objectives: 1) to provide a safe place for those interested in keeping the results of their archaeological work available to others in the long term; 2) to explore new ways of making all these exciting results available, findable and usable to anyone and everyone over the internet.

There is lots more about the ADS and its history here.

So that’s the headlines. What does it mean in practice? Apart from these main objectives there are lots of other activities we undertake to support them, such as giving advice and creating guides to good practice, but you’ll read more about these activities in the sections below. Different people do different things at the ADS, so the sections below will detail a number of activities on or around the 29th July.

Stuart Jeffrey – Deputy Director (Access)


A busy day for me. Right now I’m concentrating on various European projects that the ADS are involved with. It’s important to remember that the national boundaries we work within today are a relatively new invention and people in the past wouldn’t recognise them, so to help people study human activity in the past it’s crucial to work with colleagues in other countries. Information on all the ADS research projects can be found under the ‘OUR RESEARCH’ pages on the main ADS website.

First things first though: a good big cup of coffee is in order to get me ready for the day! I also like to check activity on Twitter and see if we have any big collections coming up for release. My colleague Jen Mitcham and I normally have a check to see whether her ADS Facebook page or the ADS_Update Twitter account, which I run, has more new followers. Twitter is winning so far, but it can be a close-run thing.

It almost goes without saying that after the coffee and a short gloat over Twitter’s success, most of the morning will be spent on the computer dealing with emails, lots of emails. The ADS are involved in quite a number of projects with partners all over Europe and also in the USA, and keeping in touch with these colleagues is a very important part of my job. Today I have been writing a progress report for the CARARE project, which is about getting ADS 3D data into a big Europe-wide heritage search mechanism called Europeana.

Coffee break time! Then on to arranging exhibition space for a photographic exhibition on the diversity of archaeological practice, as part of a project called the Archaeology of Contemporary Europe (ACE). A couple of weeks ago I was escorting the photographer round the sites of York, including the stonemasons at the famous York Minster, the Jorvik Centre and the Hungate excavations by York Archaeological Trust (YAT).

After sandwiches for lunch and a quick walk round town (York is lovely in the summertime), my afternoon is split into two tasks. Firstly, I’m looking at progress on the development of some new features on the ADS website. If you are a regular user you will know it has recently been updated with a new design and lots of new features. We are working hard on trying to integrate the Imagebank (a free-to-use collection of archaeological images for teaching and learning) into our main search, ArchSearch. This means that when someone searches on, for example, Stonehenge, they get a series of good pictures to use in their results set as well as monument inventory records and archives relating to the site. Progress on this is good thanks to the hard work of the development team and others. Secondly, I have meetings with the ADS development team in the afternoon to discuss our plans for services. This means that, as well as the various ways of discovering data held by the ADS via our website, we are working to publish data as ‘services’ that can be consumed by other search mechanisms. This is quite a technical discussion, but it’s also quite exciting because we can see lots of potential for making our holdings more easily discoverable to wider and wider audiences, and in my job that’s what makes me really happy.

So after a long day I’ve got no dirt under my fingernails, and discovered no new sites, but I feel that it’s been a good and satisfying day working on ways to both keep archaeological data safe and to get it out to people who need it to continue their work or simply have an interest in our shared past.


Tim, one of our curatorial officers, ponders some worrisome floppy discs. Will the data be recoverable?


Jenny Mitcham (Curatorial Officer)


I work for the Archaeology Data Service as a digital archivist. I have an archaeology degree and did a couple of years digging in the UK before I decided that an office job was more my style. I am engaged in the very useful task of preserving the digital data that archaeologists create in the field (and the office).

At the ADS we know that in order to keep files safe and accessible long into the future, we need to migrate or refresh them to create newer versions to replace the old obsolete files (which will soon not be readable by modern software). To this end, I am currently working on one of the first large collections that was entrusted to us back in the very early days of the ADS. The resource I’m looking at is an archive of Council for British Archaeology (CBA) Research Reports. This is a run of reports dating back to 1955 which were no longer in print, so they were scanned and given to us in digital form to make more widely available on-line. The collection consists of some 100 reports and covers many different topics and themes within British archaeology. It has remained one of our most popular and well-used resources ever since we started making it available on-line in 2000.

The year 2000 was a long time ago in computer terms. The internet was quite different to how it is now and many people relied on very slow dial-up speeds. The decision was made at the time that people would not be able to download the CBA Research Reports in one go and would prefer to access them in small chunks of 3 or 4 pages per PDF file. This was all well and good at the time, but things have moved on since then and the majority of our users now have access to faster broadband speeds and would actually prefer to download the whole report as a single file.

The other issue with these original CBA Research Reports is that the files use quite an early version of the PDF standard (1.2) and, though they are not yet obsolete, some of them are throwing up error messages and they would all benefit from being refreshed.
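For anyone curious how a file's PDF version can be spotted, a PDF normally announces it in a header near the very start of the file, for example `%PDF-1.2`. The Python sketch below reads that header. It is an illustration under stated assumptions rather than the ADS workflow: the function names are hypothetical, and the threshold of 1.4 in `needs_refresh` is picked for the example only because PDF/A-1 is based on PDF 1.4. Note too that newer files can override the header version inside the document itself, which this sketch does not check.

```python
import re

# Matches a header such as b"%PDF-1.2" and captures the two version numbers.
_HEADER = re.compile(rb"%PDF-(\d+)\.(\d+)")


def pdf_header_version(path):
    """Return the (major, minor) version from the PDF header, or None if absent."""
    with open(path, "rb") as handle:
        head = handle.read(1024)  # the header sits at or near the start of the file
    match = _HEADER.search(head)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_refresh(path, minimum=(1, 4)):
    """Flag files with no readable header or a version below `minimum`.

    The default threshold is an assumption made for this example.
    """
    version = pdf_header_version(path)
    return version is None or version < minimum
```

Run over a folder of old report chunks, a check like this would give a quick list of which files are candidates for refreshing.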

The exciting job in store for me today is to turn all of these CBA Research Report chunks into full and complete PDF files (one file per report), to refresh them into a more up-to-date file format (the archival version of PDF) and also to update the web interface which people use to access these reports.

OK, so I know this isn’t the most exciting of posts (or exciting of days for me!) but it just highlights some of the essential and ongoing work that we have to carry out in order to make archaeological data available to anyone who wishes to access it, both now and into the future.


Kieron Niven (Curatorial Officer)

Kieron hard at work on the new Guides to Good Practice

As with other members of the ADS curatorial team, my day can be quite varied, ranging from archiving datasets and creating web pages right through to dealing with helpdesk queries coming in through our website or providing guidance and support to potential data depositors. Although I’m currently posted to helpdesk (we rotate this on a weekly basis and it’s been satisfyingly quiet this week!), my main activity today has revolved around finishing up major chapters of our new Guides to Good Practice. This has mostly been focussed on completing outstanding sections in the guide for marine survey data (looking at data from bathymetry, single and multibeam sonar, etc.) but I’ve also had a brief ‘catch up’ Skype call with the guides project partners in the U.S. at Digital Antiquity / Arizona State University. As a minor break to my predominantly ‘guides focussed’ day, I’ve also done some tweaking to the introduction and overview pages of a large laser scan project archive that we will be releasing imminently. The archive has come to us as part of the LEAPII project (a collaboration with Internet Archaeology to showcase projects featuring linked digital publications and archives) and contains laser scans of a number of objects from Amarna (Egypt). The really interesting thing, for me at least, is that we have data for each object at a number of different points in the laser scan lifecycle, e.g. individual point clouds from the scans, registered scans, meshes and (my favourite) 3D PDF files. This variety, I hope, will make it a really useful dataset for those interested in the process of laser scanning.