Digital Preservation

Team Digital Preservation

Hannah Smith, Digital Archives Officer at Historic Environment Scotland

My name is Hannah Smith and I work for Historic Environment Scotland (HES). I work within the collections section at HES, in the digital archive team, which consists of myself and the Digital Archive Manager. We have been actively collecting digital archive material since the 1990s, receiving both internally and externally generated material. We currently hold over 500,000 catalogued digital items in our collection, and that number will only continue to grow, so as an organisation we need to be equipped to safeguard our archive in order to preserve and promote our digital material.

Current Scottish planning guidance (accessed 22 July 2016) places emphasis on preservation in situ, but where this is not possible or appropriate it encourages recording or excavation, followed by the publication and archiving of that record. Preservation by record is a widely recognised concept within archaeology, but it can only be achieved if those archives have a place of deposit where they will be preserved for the future. Our role in ensuring that this is possible, and that it happens, is a vital part of this chain of archaeological responsibility, and we consider it as important as the excavation itself.

The priorities for HES digital archiving are to collect all primary material relating to archaeological and architectural fieldwork and excavation undertaken within Scotland and Scottish territorial waters. This remit covers an extremely diverse range of information types, including: textual reports; databases; geophysical survey; air photography; mapping (GIS) and topographic survey; buildings survey; visualisation reconstruction; and digital video and audio. Some of these data types pose challenges because of their complexity, for example 3D laser scans, a technology that is being used more and more for recording the built environment.

3D laser scan of Pencaitland Church © HES

Digital photograph over 3D laser scan © HES

As technology evolves and file formats become obsolete, we have to choose the best way to maintain access to the collections we hold. The only practical way to do this is to ‘migrate’ the file into a new format; however, with some file types we risk losing or, worse, altering some of the properties of that file. We therefore need to understand and define the significant properties of a file so that we know what constitutes acceptable loss and what crosses the line into unacceptable loss. We carefully consider these effects and experiment with different migration routes before settling on the best possible balance between minimal (or no) loss of information or functionality and ongoing accessibility of that information. We also ensure we maintain the original object in an unchanged state so that, should new possibilities emerge, we can take advantage of them.
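As a rough illustration of the idea above, the following Python sketch compares a set of significant properties before and after a migration, and checks that the original file still matches the checksum recorded when it was accessioned. It is not the HES workflow: the property names, the `acceptable_loss` set and the helper functions are all hypothetical, chosen only to show the distinction between acceptable and unacceptable loss.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_properties(before, after, acceptable_loss=frozenset()):
    """Sort each significant property into kept, acceptable or unacceptable loss.

    `before` and `after` are plain dicts mapping hypothetical property
    names to the values measured before and after migration.
    """
    report = {"kept": [], "acceptable": [], "unacceptable": []}
    for name, value in before.items():
        if after.get(name) == value:
            report["kept"].append(name)
        elif name in acceptable_loss:
            report["acceptable"].append(name)
        else:
            report["unacceptable"].append(name)
    return report


def original_unchanged(path, checksum_at_accession):
    """True if the original file still matches its accession checksum."""
    return sha256_of(path) == checksum_at_accession
```

For a laser scan one might, for instance, treat the point count and per-point colour as significant and an embedded thumbnail as acceptable loss; a migration route that reports anything under "unacceptable" would then be rejected in favour of another.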

To help explain what we do*, I’ve included this animated introduction to digital preservation, which HES Digital Archive Manager Emily Nimmo helped to create. *We don’t wear capes, but we still like to think of ourselves as Team Digital Preservation.

Most of my day-to-day work involves processing externally generated material into our trusted digital repository, which encompasses two areas: digital accessioning and digital cataloguing. We receive all types of digital media, and often still receive obsolete media.


5¼-inch floppy disk and Amstrad 3 inch disk © HES

It’s a very satisfying job to take the digital media and link the information to our relevant records and know the data is now safeguarded in our archive and available to the public, to researchers and to inform the management of the historic environment in the future.

Submerged wartime defences off Roan Head, Flotta © Orkney Research Centre for Archaeology

Archaeological evaluation © Cameron Archaeology

It also allows me to see all types of interesting archaeology from across Scotland every day – including cute little dogs on site. We come across all sorts of interesting material in this job and there’s never a dull day. We get to see little time capsules of archaeological events from all across Scotland, from working shots during an excavation to site diaries through to the final reports. I can live vicariously through commercial archaeologists from the comfort of my office.

Digital Archaeology isn’t just Scanning

Connor Rowe, Center for Digital Archaeology, Mukurtu CMS. Today is the Day of Archaeology, in which archaeologists around the world blog about this day in the life of an archaeologist. Now my background is in cultural anthropology and digital media, but I happen to work with a team of archaeologists at the Center for Digital Archaeology here at UC Berkeley, so I tend to jump on the archaeological wagon, especially when it intersects with the digital world. Hence my participation in #DayofArch 2013.

Browsing through digital heritage inside Mukurtu CMS.

My current project is Mukurtu CMS, an open-source digital archive originally intended for (and created by) indigenous communities to collect and share their (digital, digitized, and intangible) cultural heritage, on their own terms. It is built on Drupal 7, and attempts to remain community-based in its development process (yes, this is as hard as it sounds). We’ve been supported by generous NEH, IMLS, and university grants, which help us, first, eat, and, second, continue this project for little or no cost to interested communities (notwithstanding Congressional budget cuts…). These grants have allowed us to produce complementary tools, e.g., Mukurtu Mobile, an iOS (and soon, Android) app, and work on projects as varied as museum exhibits and school science curricula. My work consists primarily of community support, software and installation upkeep, and facilitation of internal and external communication. I also get to fly around the continents and help communities implement digital preservation workflows on site.

Pondering bugs in the Treehouse

Today, however, I am in our sunny Berkeley treehouse office, listening to the quiet chirping of birds, leaf blowers, and jack hammers (the archaeological offices surround BP’s new capital investment), staring at lines of code and trying, somewhat successfully, to fix a problem reported by a community using Mukurtu in New Zealand. Time zones make it a little difficult to collaborate in real time, but that adds to the sense that the work I’m doing is globally worthwhile. My work in this aspect of digital archaeology, what might be termed digital cultural heritage preservation and management, is a rewarding niche of archaeological work. It allows me to empower others, in the face of expectations of steep digital learning curves, to manage their own heritage and make sure that history is not lost, but rather shared. It allows me to build and learn code, while also paying attention to cultural relevancy. There is responsibility tied to certain knowledge, sacred stories, and ancestors. By building, maintaining, and supporting Mukurtu, I help communities retain control over how their heritage is distributed. As Kim of Team Mukurtu (below) would put it, “does all information want to be free?”

Team Mukurtu:
Kim Christen, Project Director and persona behind @mukurtu
Michael Ashley, Development Director and Chief Technology Officer of the Center for Digital Archaeology, @lifeisnotstill
Chacha Sikes, Lead Engineer, @chachasikes
and me, Connor Rowe, Service Manager, @mrthebutler

The Archaeology Data Service, keeping the Grey Literature Library going

Welcome to another post to the Archaeology Data Service (ADS) Day of Archaeology blog 2012.

If you want a quick introduction to the ADS and what we do, see last year’s post.

We have contributions from two members of staff from the ADS this year, one from Stuart Jeffrey, ADS Deputy Director (Access), and this one from Ray Moore, one of the ADS Digital Archivists.

Ray Moore

As a digital archivist at the Archaeology Data Service, my day-to-day activities involve accessioning the digital data and other outcomes of archaeological research that individuals and institutions deposit with us, developing a preservation programme for that data, and curating existing ADS collections.

Today, and indeed for the past week, I have spent much of my time working on the Grey Literature Library (or GLL). The GLL is an important resource for amateur and professional archaeologists working today, providing access to the many thousands of unpublished fieldwork reports, or grey literature, produced during the various assessments, surveys and fieldwork carried out throughout the country. These activities are recorded using OASIS (or Online AccesS to the Index of archaeological investigationS) and, after passing through a process of validation and checking, the reports produced in these projects arrive at the ADS. On first impressions, then, the digital archive may seem like an ‘end point’, a place where archaeological grey literature goes to die, but the ADS, through the GLL, makes these reports available to other archaeologists and the wider community, allowing the grey literature to inform future research. At the same time, as a digital archive we take steps to preserve these reports so that future generations can continue to use the information that they contain; an important job, as many of these reports do not exist in a printed form.

Grey Literature Reports

Reports from the Grey Literature Library.

So what does digitally archiving a grey literature report entail? Initially all the grey literature reports must be transferred from OASIS to the ADS archive; this is the easiest part of the process. More often than not the report comes as a Portable Document Format (or PDF) file, and while this is useful for sharing documents electronically it is pretty useless as a preservation format for archiving. One of my jobs is to convert these files into a special archival form of PDF, called PDF/A (the A standing for Archive). Sounds easy, but often it can take some work to get from PDF to PDF/A (my all-time record is 2 hours producing a 900 MB PDF/A file). These conversions must also be documented in the ADS’ Collection Management System so that other archivists can see what I did to preserve the file and its content. While OASIS collects metadata associated with the project, the ADS uses a series of tools to generate file-level metadata specific to the creation of the file, so that we can understand how, and with what, the file was created. Only once these processes are complete can the file be transferred to the archive, with a version also added to the GLL so that people can download and read the report. With a throughput of some 500 to 600 reports per month, the difficulties of the task should become apparent; and all this alongside my other duties as a digital archivist. This month’s release includes an interesting report on The Olympic Park Waterways and Associated Built Heritage Structures, which stood on the site now occupied by the Olympic Park. Anyway, I’d better get back to it!
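To give a flavour of what file-level metadata can look like, here is a minimal Python sketch that records a few basic facts about a single file. It is not the ADS toolchain: the `describe_file` helper and its field names are assumptions made for illustration, and it uses only the standard library, so it captures far less than a dedicated characterisation tool would.

```python
import hashlib
import mimetypes
import os
from datetime import datetime, timezone


def describe_file(path):
    """Collect basic file-level metadata for one deposited file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    stat = os.stat(path)
    # The type is guessed from the file extension only, so it may be None.
    mime_type, _ = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "file_name": os.path.basename(path),
        "size_bytes": stat.st_size,
        "sha256": digest.hexdigest(),
        "modified_utc": modified.isoformat(),
        "mime_type": mime_type,
    }
```

Running this once on the deposited PDF and once on the converted PDF/A file would give two records that could sit alongside the note describing the conversion, so that a later archivist can confirm which file is which.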

The Archaeology Data Service, Working to Keep Your Bits in Good Order

Welcome to the Archaeology Data Service (ADS) Day of Archaeology blog 2012.

If you want a quick introduction to the ADS and what we do, see last year’s post.

We have contributions from two members of staff from the ADS this year, one from Stuart Jeffrey, ADS Deputy Director (Access), and one from Ray Moore, one of the ADS Digital Archivists.

Stuart Jeffrey

Another busy day at the ADS today, with lots of looming deadlines and lots of work to be done. Since the last Day of Archaeology the ADS has continued to expand its collections and participate in more and more national and international projects, which is great news and certainly keeps us out of mischief. In terms of recognition for the ADS’s work, it’s actually been a very good year too: the ADS was a major part of the submission that won the University of York’s Department of Archaeology a Queen’s Anniversary Prize for Higher and Further Education, and we are also shortlisted for a BAA award for innovation (to be announced on 9th July, so fingers crossed!).

The project that is occupying most of my time today is the Economic Impact of the ADS project. The ADS is a free-to-access digital archive, but it’s really important to us, and to our funders, that we have a good idea of what the economic value of the ADS to the whole sector actually is, so we have embarked on a JISC-funded project to try to find out. It’s no easy task to put numbers on this kind of ‘value perception’. I’m preparing for a meeting in Oxford on Monday with John Houghton, the Professor of Economics (from CSES in Australia) who is carrying out the analysis for the project. This will be our first meeting since the on-line survey of users and depositors will have closed and I’m really looking forward to seeing the responses. (BTW it closes tonight, so if you want to participate there is probably a bit of time left; follow the project link above.)

A nice image from the ADS archive, Cloonsharragh, Ireland, Copyright Clive Ruggles, image taken from ADS ImageBank

Also today, I’m putting the finishing touches to a joint application, with Internet Archaeology, for an IfA HLF workplace learning bursary. We have hosted a couple of these in the past and have always enjoyed the experience of giving someone the opportunity to bring on their skills in a workplace environment. We also think there is still a skills gap in the archaeological workforce when it comes to digital data management, especially the complexities of digital archiving; managing data and understanding archiving should really be core skills for archaeologists.

I’d also like to mention the fact that the ADS are proud to support the Day of Archaeology. We’ve been really impressed with the response to the Day of Archaeology project in general and the way a ‘snapshot’ of archaeological activity has been built up covering all sectors including academic, commercial, fieldworkers, specialists, students and curators. As well as fulfilling its role of information sharing and community building amongst the profession, it is also clear that the snapshot created on this one day in 2012 could well become a valuable document for the historians of the archaeological discipline in the future. With this in mind, the ADS are keen to help archive these contributions for the long term. Everyone’s contributions today could well be part of a future research project in 2112!

Finally, as we near the end of the month it’s time for me to change the ‘featured collection’ section of the ADS front page. Ray has been busy archiving and validating a lot of Grey Literature reports (our total is now over 17,000, I think), and some of these relate to archaeological work done in advance of the construction work at the Olympic sites in London. Given that the Olympics are nearly upon us, it seems a good idea to make the major MoLAS report (533 pages!) on this work the featured collection for July; very topical. Topicality is not always something that is easy to manage when dealing with archaeological archives, but we like to give it a try.

Details of Ray’s Day to follow…….



A day with the Archaeology Data Service

Welcome to the Archaeology Data Service (ADS) Day of Archaeology blog. Before we start looking at some of the nitty-gritty of our busy day it might be useful to give a little bit of background on what we do, especially for those of you who maybe don’t know anything about us at all.

It’s not all trowels, beards and woolly jumpers: In lots of the other Day of Archaeology blogs you will be reading about archaeologists out in the field excavating, surveying, recording and so on. You’ll also read about the careful cleaning and analysis of artefacts that have been recovered: the pots, metalwork, skeletons and so on. This is often exciting and stimulating work, but it raises an important question: why is it being done? There are lots of good answers to this question, ranging from the very philosophical to the very practical. However, almost all the answers rely on the fact that the information that archaeologists create, the data they gather, will be around for everybody to reuse in the future. This can be said to apply to many disciplines, but it is especially important for archaeology because the process of excavating a site is of course the process of destroying it too! What remains after the site is excavated are the memories of the experience, the impressions of those affected by the site and the ideas about the past that those involved in the work – and those watching it happen – have created through direct contact and through consideration of the material that has been recovered. After the project is over, the main connection back to the site, apart from memories and the physical remains considered important enough to keep in a museum, is the records that are generated throughout the archaeological process (sometimes called primary data) and the ideas about people in the past that these records have helped to inform (often called interpretation).

The King's Manor, York - where the ADS is based.

So it is important for archaeologists and all those with an interest in the past that these records are kept safe for the long term, especially because they can’t be recreated. At first glance this might seem like a straightforward problem, but it is a surprisingly complex one and has become more so in the last 25 years. This is because almost all archaeological information is now created in digital form and covers a huge range of data generation and recording techniques: databases, text documents, images, videos, sound recordings, aerial photographs, satellite images, laser scanning, digital mapping, sonar data, three-dimensional models and so on. It is often very surprising to discover that even with all this new technology, and sometimes because of it, the data created is really quite fragile and requires a lot of looking after. This is where the ADS comes in. The ADS are a digital archive with two main objectives: 1) to provide a safe place for those interested in keeping the results of their archaeological work available to others in the long term; 2) to explore new ways of making all these exciting results available, findable and usable to anyone and everyone over the internet.

There is lots more about the ADS and its history here.

So that’s the headlines, what does it mean in practice? Apart from these main objectives there are lots of other activities we undertake to support them, such as giving advice and creating guides to good practice, but you’ll read more about these activities in the sections below. Different people do different things at the ADS so the sections below will detail a number of activities on or around the 29th July.

Stuart Jeffrey – Deputy Director (Access)


A busy day for me. Right now I’m concentrating on various European projects that the ADS are involved with. It’s important to remember that the national boundaries we work within today are a relatively new invention and people in the past wouldn’t recognise them, so to help people study human activity in the past it’s crucial to work with colleagues in other countries. Information on all the ADS research projects can be found under the ‘OUR RESEARCH’ pages on the main ADS website.

First things first though, a good big cup of coffee is in order to get me ready for the day! I also like to check activity on Twitter and see if we have any big collections coming up for release. My colleague Jen Mitcham and I normally check whether her ADS Facebook page or the ADS_Update Twitter account, which I run, has more new followers. Twitter is winning so far, but it can be a close-run thing.

It almost goes without saying that after the coffee and a short gloat over Twitter’s success most of the morning will be spent on the computer dealing with emails, lots of emails. The ADS are involved in quite a number of projects with partners all over Europe and also in the USA, and keeping in touch with these colleagues is a very important part of my job. Today I have been writing a progress report for the CARARE project, which is about getting ADS 3D data into a big Europe-wide heritage search mechanism called Europeana.

Coffee break time! – then on to arranging exhibition space for a photographic exhibition on the diversity of archaeological practice as part of a project called the Archaeology of Contemporary Europe (ACE). A couple of weeks ago I was escorting the photographer round the sights of York, including the stonemasons at the famous York Minster, the Jorvik Centre and the Hungate excavations by YAT.

After sandwiches for lunch and a quick walk round town (York is lovely in the summertime), my afternoon is split into two tasks. Firstly, I’m looking at progress on the development of some new features on the ADS website; if you are a regular user you will know it has recently been updated with a new design and lots of new features. We are working hard on trying to integrate the Imagebank (a free-to-use collection of archaeological images for teaching and learning) into our main search – ArchSearch. This means that when someone searches on, for example, Stonehenge, they get a series of good pictures to use in their results set as well as monument inventory records and archives relating to the site. Progress on this is good thanks to the hard work of the development team and others. Secondly, I have meetings with the ADS development team in the afternoon to discuss our plans for services – this means that as well as the various ways of discovering data held by the ADS via our website, we are working to publish data as ‘services’ that can be consumed by other search mechanisms. This is quite a technical discussion, but it’s also quite exciting because we can see lots of potential for making our holdings more easily discoverable to wider and wider audiences, and in my job that’s what makes me really happy.

So after a long day I’ve got no dirt under my fingernails, and discovered no new sites, but I feel that it’s been a good and satisfying day working on ways to both keep archaeological data safe and to get it out to people who need it to continue their work or simply have an interest in our shared past.

Tim, one of our curatorial officers, ponders some worrisome floppy discs. Will the data be recoverable?


Jenny Mitcham (Curatorial Officer)


I work for the Archaeology Data Service as a digital archivist. I have an archaeology degree and did a couple of years digging in the UK before I decided that an office job was more my style. I am engaged in the very useful task of preserving the digital data that archaeologists create in the field (and the office).

At the ADS we know that in order to keep files safe and accessible long into the future, we need to migrate or refresh them to create newer versions to replace the old obsolete files (which will soon not be readable by modern software). To this end, I am currently working on one of the first large collections that was entrusted to us back in the very early days of the ADS. The resource I’m looking at is an archive of Council for British Archaeology (CBA) Research Reports: a run of reports dating back to 1955 which were no longer in print, so they were scanned and given to us in digital form to make more widely available on-line. The collection consists of some 100 reports and covers many different topics and themes within British Archaeology. This has remained one of our most popular and well-used resources ever since we started making it available on-line in 2000.

The year 2000 was a long time ago in computer terms. The internet was quite different to how it is now and many people relied on very slow dial up speeds. The decision was made at the time that people would not be able to download the CBA Research Reports in one go and would prefer to access them in small chunks of 3 or 4 pages per pdf file. This was all well and good at the time but things have moved on since then and the majority of our users now have access to faster broadband speeds and would actually prefer to download the whole report as a single file.

The other issue with these original CBA Research Reports is that the files are quite an early version of the PDF standard (1.2) and though they are not yet obsolete, some of them are throwing up error messages and they would all benefit from being refreshed.
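A first check of this kind can be sketched in a few lines of Python, since a PDF file normally begins with a header such as `%PDF-1.2`. The `needs_refresh` helper and its threshold of version 1.4 are assumptions made for illustration, not ADS policy, and the header is only a rough guide because a later version can also be declared inside the file.

```python
import re

# Matches headers such as b"%PDF-1.2" near the start of a file.
PDF_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_version(path):
    """Return the (major, minor) version from a PDF header, or None."""
    with open(path, "rb") as handle:
        head = handle.read(1024)  # the header is expected near the start
    match = PDF_HEADER.search(head)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_refresh(path, oldest_ok=(1, 4)):
    """Flag files with no header or a version below a hypothetical threshold."""
    version = pdf_version(path)
    return version is None or version < oldest_ok
```

Files that are flagged, or that return no version at all, would be the ones to open and inspect by hand before any refresh is attempted.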

The exciting job in store for me today is to turn all of these CBA Research Report chunks into full and complete pdf files (one file per report), to refresh them into a more up-to-date file format (the archival version of pdf) and also to update the web interface which people use to access these reports.
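Before any chunks are joined, it helps to work out which files belong to which report, in what order, and whether any pages are missing. The Python sketch below does only that planning step. The file naming convention it parses (`<report id>_<first page>-<last page>.pdf`) is invented for the example and is not how the CBA chunks are actually named; the joining itself would be left to a PDF library or tool.

```python
import re
from collections import defaultdict

# Hypothetical naming convention: <report id>_<first page>-<last page>.pdf
CHUNK_NAME = re.compile(
    r"^(?P<report>[A-Za-z0-9]+)_(?P<first>\d+)-(?P<last>\d+)\.pdf$"
)


def _chunks_by_report(file_names):
    """Group (first page, last page, name) tuples by report, in page order."""
    groups = defaultdict(list)
    for name in file_names:
        match = CHUNK_NAME.match(name)
        if match is None:
            continue  # not a chunk file under the assumed convention
        groups[match.group("report")].append(
            (int(match.group("first")), int(match.group("last")), name)
        )
    return {report: sorted(chunks) for report, chunks in groups.items()}


def plan_merges(file_names):
    """Return, for each report, its chunk files in the order to be joined."""
    return {
        report: [name for _, _, name in chunks]
        for report, chunks in _chunks_by_report(file_names).items()
    }


def find_gaps(file_names):
    """Return page ranges missing between consecutive chunks of each report."""
    gaps = {}
    for report, chunks in _chunks_by_report(file_names).items():
        missing = []
        for (_, prev_last, _), (next_first, _, _) in zip(chunks, chunks[1:]):
            if next_first > prev_last + 1:
                missing.append((prev_last + 1, next_first - 1))
        if missing:
            gaps[report] = missing
    return gaps
```

Sorting on the page numbers as integers, rather than on the file names as text, keeps the order correct even where the numbers are not zero-padded.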

OK, so I know this isn’t the most exciting of posts (or exciting of days for me!) but it just highlights some of the essential and ongoing work that we have to carry out in order to make archaeological data available to anyone who wishes to access it, both now and into the future.


Kieron Niven (Curatorial Officer)

Kieron hard at work on the new Guides to Good practice

As with other members of the ADS curatorial team, my day can be quite varied, ranging from archiving datasets and creating web pages right through to dealing with helpdesk queries coming in through our website or providing guidance and support to potential data depositors. Although I’m currently posted to helpdesk (we rotate this on a weekly basis and it’s been satisfyingly quiet this week!), my main activity today has revolved around finishing up major chapters of our new Guides to Good Practice. This has mostly been focussed on completing outstanding sections in the guide for marine survey data (looking at data from bathymetry, single and multibeam sonar, etc.) but I’ve also had a brief ‘catch up’ Skype call with the guides project partners in the U.S. at Digital Antiquity / Arizona State University. As a minor break to my predominantly ‘guides focussed’ day I’ve also done some tweaking to the introduction and overview pages of a large laser scan project archive that we will be releasing imminently. The archive has come to us as part of the LEAPII project (a collaboration with Internet Archaeology to showcase projects featuring linked digital publications and archives) and contains laser scans of a number of objects from Amarna (Egypt). The really interesting thing – for me, at least – is that we have data for each object at a number of different points in the laser scan lifecycle, e.g. individual point clouds from the scans, registered scans, meshes and – my favourite – 3D PDF files. This variety, I hope, will make it a really useful dataset for those interested in the process of laser scanning.