digital archive

Team Digital Preservation

Hannah Smith, Digital Archives Officer at Historic Environment Scotland

My name is Hannah Smith and I work for Historic Environment Scotland. I work within the collections section at HES, in the digital archive team, which consists of myself and the Digital Archive Manager. We have been actively collecting digital archive material since the 1990s, receiving both internally and externally generated material. We currently hold over 500,000 catalogued digital items in our collection, and that number will only continue to grow. As an organisation we therefore need to be equipped to safeguard our archive so that we can preserve and promote our digital material.

Current Scottish planning guidance (http://www.gov.scot/Publications/2011/08/04132003/1, accessed 22 July 2016) places emphasis on preservation in situ, but where this is not possible or appropriate it encourages recording or excavation followed by the publication and archiving of that record. Preservation by record is a widely recognised concept within archaeology, but it can only be achieved if those archives have a place of deposit where they will be preserved for the future. Our role in ensuring that this is possible, and that it happens, is a vital part of this chain of archaeological responsibility, and we consider it as important as the excavation.

The priorities for HES digital archiving are to collect all primary material relating to archaeological and architectural fieldwork and excavation undertaken within Scotland and Scottish territorial waters. This remit covers an extremely diverse range of information types including: textual reports; databases; geophysical survey; air photography; mapping (GIS) and topographic survey; buildings survey; visualisation reconstruction; and digital video and audio. Some of these data types pose challenges because of their complexity. One example is 3D laser scanning, a technology that is being used more and more for recording the built environment.

3D laser scan of Pencaitland Church © HES

Digital photograph over 3D laser scan © HES

As technology evolves and file formats become obsolete, we have to choose the best way to maintain access to the collections we hold. The only practical way we have to do this is to ‘migrate’ the file into a new format; however, with some file types we risk losing or, worse, altering some of the properties of that file. We therefore need to understand and define the significant properties of a file so that we know what constitutes acceptable loss and what crosses the line into unacceptable loss. We carefully consider these effects and experiment with different migration routes before settling on the best possible balance between minimal or no loss of information or functionality and ongoing accessibility for that information. We also ensure we maintain the original object in an unchanged state so that, should new possibilities emerge, we can take advantage of them.

To help explain what we do*, I’ve included this animation about digital preservation, which HES Digital Archive Manager Emily Nimmo helped to create. *We don’t wear capes, but still like to think of ourselves as Team Digital Preservation.

Most of my day-to-day work involves processing externally generated material into our trusted digital repository, which covers two areas: digital accessioning and digital cataloguing. We receive all types of digital media, and we still often receive obsolete media.

5¼-inch floppy disk and Amstrad 3-inch disk © HES

It’s a very satisfying job to take the digital media and link the information to our relevant records and know the data is now safeguarded in our archive and available to the public, to researchers and to inform the management of the historic environment in the future.

Submerged wartime defences off Roan Head, Flotta © Orkney Research Centre for Archaeology

Archaeological evaluation © Cameron Archaeology

It also allows me to see all types of interesting archaeology from across Scotland every day, including cute little dogs on site. We come across all sorts of interesting material in this job and there’s never a dull day. We get to see little time capsules of archaeological events from all across Scotland, from working shots during an excavation to site diaries through to the final reports. I can live vicariously through commercial archaeologists from the comfort of my office.

Archiving Ipswich

Two years after posting about my work on the Silbury Hill digital archive, in ‘AN ADS DAY OF ARCHAEOLOGY’, I’m still busy working as a Digital Archivist with the ADS!

For the past few months, I have been working on the Ipswich Backlog Excavation Archive, deposited by Suffolk County Council, which covers 34 sites, excavated between 1974 and 1990.

Excavation at St Stephen’s Lane, Ipswich 1987-1988

To give a quick summary of the work so far, the data first needed to be accessioned into our systems, which involved all of the usual checks for viruses, removing spaces from file names, sorting the data into 34 separate collections, sifting out duplicates and so on. The archive packages were then created, which involved migrating the files to their preservation and dissemination formats and creating file-level metadata using DROID. The different representations of the files were linked together using object IDs in our database, and all of the archiving processes were documented before the coverage and location metadata were added to the individual site collections.
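As an illustration only, two of the accessioning checks mentioned above (removing spaces from file names and sifting out duplicates) could be sketched in Python as below. This is not the ADS tooling: the function names and sample file names are hypothetical, and duplicates are assumed to mean files with byte-identical content, detected by comparing SHA-256 digests.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def normalise_name(name: str) -> str:
    """Replace each run of whitespace in a file name with one underscore."""
    return "_".join(name.split())


def find_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group file names whose contents share the same SHA-256 digest.

    `files` maps a file name to its content; only groups containing
    more than one name (i.e. actual duplicates) are returned.
    """
    by_digest = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical sample data, held in memory to keep the sketch self-contained.
sample = {
    "site plan.txt": b"trench 1",
    "site plan copy.txt": b"trench 1",
    "context list.txt": b"0001, 0002",
}
renamed = {normalise_name(name): content for name, content in sample.items()}
duplicate_groups = find_duplicates(renamed)
```

In practice the contents would be read from disk, and one file from each duplicate group would be kept after a manual check.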

Though time-consuming, due to the quantity of data, this process was fairly simple as most of the file names were created consistently and contained the site code. Files that didn’t have descriptive names could be found in the site database and sorted according to the information there.

The next job was to create the interfaces; again, this was fairly simple for the individual sites as they were made using a template which retrieves the relevant information from our database allowing the pages to be consistent and easily updateable.

The Ipswich Backlog Excavation Archive called for a more innovative approach, however, to give users greater flexibility with regard to searching, so the depositors requested a map interface as well as a way to query information from their core database. The map interface was the most complex part of the process and involved a steep learning curve for me, as it relied on applications, software and code that I had not previously used, such as JavaScript, OpenLayers, GeoServer and QGIS. The resulting map allows the user to view the features excavated on the 34 sites and retrieve information such as feature type and period, as well as linking through to the project archive for that site.

OpenLayers map of Ipswich excavation sites.

So, as to what I’m up to today…

The next, and final, step is to create the page that queries the database. For the past couple of weeks I have been sorting the data from the core database into a form that will fit into the ADS object tables, cleaning and consolidating period, monument and subject terms and, where possible, matching them to recognised thesauri such as the English Heritage Monument Type Thesaurus.
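A minimal sketch of that kind of term clean-up, in Python, might look like the following. It is an assumption-laden illustration rather than the process actually used: the function name is hypothetical, the list of preferred terms is a made-up stand-in for a real thesaurus, and matching is limited to ignoring case and stray whitespace, so anything else is set aside for manual review.

```python
from typing import Dict, Iterable, List, Tuple


def consolidate_terms(raw_terms: Iterable[str],
                      preferred: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Match raw terms to preferred terms, ignoring case and extra whitespace.

    Returns (matched, unmatched): `matched` maps each raw term to its
    preferred form; `unmatched` lists raw terms needing manual review.
    """
    lookup = {term.casefold(): term for term in preferred}
    matched: Dict[str, str] = {}
    unmatched: List[str] = []
    for raw in raw_terms:
        key = " ".join(raw.split()).casefold()  # collapse whitespace, fold case
        if key in lookup:
            matched[raw] = lookup[key]
        else:
            unmatched.append(raw)
    return matched, unmatched


# Hypothetical terms for illustration; not taken from any real thesaurus.
preferred_terms = ["PIT", "DITCH", "POST HOLE"]
matched, unmatched = consolidate_terms(["  pit", "Ditch", "post  hole", "pitt"],
                                       preferred_terms)
```

Mis-spellings such as the hypothetical "pitt" are deliberately not guessed at; they end up in `unmatched` so that a person can decide which term was meant.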

Today will be a continuation of that process and hopefully, by the end of the day, all of the information required by the query pages will be added to our database tables so that I can begin to build that part of the interface next week. If all goes to plan, the user should be able to view specific files based on searches by period, monument/feature type, find type, context, site location etc. More specialist information, such as pottery identification, will be available directly from the core database tables, which will be available for download in their entirety. Fingers crossed that it does all go to plan!

So, that’s my Day of Archaeology 2015. Keep a look out for ADS announcements regarding the release of the Ipswich Backlog Excavation Archive sometime over the next few weeks, and check out the posts from my ADS colleagues Jo Gilham and Georgie Field!

An ADS Day of Archaeology

Here it is, my Day of Archaeology 2013 and after a routine check of my emails and the daily news I’m ready to begin!

Silbury Hill ©English Heritage

I am currently approaching the end of a year-long contract as a Digital Archivist at the Archaeology Data Service in York on an EH-funded project to prepare the Silbury Hill digital archive for deposition.

For a summary of the project, see the ADS newsletter, and for a more in-depth account of my work so far check out my blog from a couple of weeks ago: “The Silbury Hill Archive: the light at the end of the tunnel”.

Very briefly, though, my work has involved sifting through the digital data to retain only the information which is useful for the future, discarding duplicates or superfluous data; sorting the archive into a coherent structure and documenting every step of the process.

The data will be deposited with two archives: the images and graphics will go to English Heritage and the more technical data will be deposited with the ADS. As the English Heritage portion of the archive has been completed, it is time for the more technical stuff!

So, the plan for today is to continue with the work I have been doing for the past few days: sorting through the Silbury Hill database (created in Microsoft Access).

Originally, I had thought that the database would just need to be documented, but, like the rest of the archive, it seems to have grown fairly organically; though the overall structure seems sound it needs a bit of work to make it as functional as possible and therefore as useful as possible.

The main issue with the database is that there are a fair number of gaps in the data tables. The database seems to have been set up as a standard template with tables for site photography, contexts, drawings, samples, skeletal remains, artifact data etc., but some of these tables have not been populated and some are not relevant. The site photography and drawing records have not been entered, for example, meaning that any links from or to these tables would be worthless. The missing data for the 2007 works are present in the archive, but in separate Excel spreadsheets. There are also 2001 data files in simple text format, as the information was downloaded as text reports from English Heritage’s old archaeological database, DELILAH. The data has since been exported into Excel so, again to make the information more accessible, I’m adding the 2001 data to the 2007 database.

My work today, therefore, as it has been for the past couple of days, is to populate the empty database tables with the information from these spreadsheets and text files and resolve any errors or issues that cause the tables to lose their ‘referential integrity’, for example where a context number is referred to in one table but is missing from a linking table.

Silbury database relationship diagram ©English Heritage

So, this morning I started with the 2001 drawing records. Entering the data itself was fairly straightforward: just copying and pasting from the Excel spreadsheet into the Access tables, correcting spelling errors as I went. Some of the fields were controlled vocabulary fields, however, which meant going to the relevant glossary table and entering a new term so that the site data could be entered as it was in the field.

Once the main drawing table was completed, the linking table needed to be populated; again, this was done fairly simply through cutting and pasting from Excel.

The next step was the most time-consuming: checking the links between the tables. To do this I went to the relationship diagram, clicked on the relevant link and ticked the box marked ‘enforce referential integrity’. Where this didn’t work, it meant that a reference in one table was not matched in the linking table, so I had to go through the relevant fields and search for the entries that were not correct. The most common reason for these error messages was that an entry had been mis-typed in one of the tables.
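To make the idea concrete, the hunt for unmatched references can be sketched outside Access in a few lines of Python. This is only an illustration under stated assumptions, not how the Silbury database was actually checked: the function name, field names and sample records are all hypothetical, and each table is assumed to have been exported as a list of rows.

```python
from typing import Dict, Iterable, List


def find_orphans(linking_rows: Iterable[Dict[str, str]],
                 valid_keys: Iterable[str],
                 field: str) -> List[Dict[str, str]]:
    """Return the linking rows whose `field` value is absent from valid_keys."""
    valid = set(valid_keys)
    return [row for row in linking_rows if row[field] not in valid]


# Hypothetical sample data: known context numbers and a drawing-to-context
# linking table in which one context number has been mis-typed.
contexts = ["1001", "1002", "1003"]
drawing_links = [
    {"drawing": "D1", "context": "1001"},
    {"drawing": "D2", "context": "1030"},  # mis-typed; no such context exists
    {"drawing": "D3", "context": "1002"},
]
orphans = find_orphans(drawing_links, contexts, "context")
```

Each row returned is one that would stop referential integrity from being enforced, so the list doubles as a to-do list of entries to correct by hand.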

That took me up to lunchtime, so what about the afternoon?  More of the same: starting work on the sample records with the odd break for tea or a walk outside to save my eyes!

Although the process of updating the database has been fairly routine, it’s an interesting and valuable piece of work for me, as it is the first time I’ve ever really delved into the structure of a database and looked at the logic behind its design. I was fortunate in that I had attended the Database Design and Implementation module taught by Jo Gilham as part of the York University MSc in Archaeological Information Systems, which gave me a firm foundation for this work. Also very useful was the help provided by Vicky Crosby from English Heritage, who created the database and provided a lot of documentation in the first instance.

The next step, once the data has been entered, will be to remove any blank fields and tables, then to document the database using the ADS’ Guidelines for Depositors, and then to move on to the survey data and reports.

I’m looking forward to seeing it all deposited and released to a wider world for, hopefully, extensive re-use and research!

Cosmeston post-excavation morning report

Hello everyone,

This is Nicolle Grieve, 20, Cardiff University 3rd year student studying BA joint honours Ancient History and Archaeology.

Although I had originally entered university to study single honours Ancient History, the archaeology modules provided an opportunity to study the ancient Egyptians and a chance to get physically involved in the process of excavation.

Last year I took part in an excavation of a Neolithic site at Brodsworth. It was a brilliant experience and, although the work was hard and the thought of living in a tent for a month wasn’t appealing, I found I really enjoyed my time on excavation. I returned healthy (very tanned), I met a lot of great people and the Wednesday night BBQ was always something to look forward to.

This year, however, I wanted to experience the other side of excavation: the post-excavation work. At Cardiff University we are looking at what happens to the material found after excavations. As a group of six students we have looked at material found at Cosmeston. We have been sorting and marking the pottery found in each context. Each sherd of pottery is marked with the site code and context number (where it was found), so if lost or misplaced it can be reconnected with the area in which it was discovered.

My main job, though, has been writing up the Cosmeston context sheets and, following this, scanning in the photographs taken during work in the 1980s to cross-reference them with the catalogue and build a digital archive. This is very important because post-excavation is about organising the material and information and ensuring they are preserved. The completed digital archive not only makes the work of archaeologists studying the finds of the 1980s easier, but also allows us as archaeologists to find patterns within similar sites and links from which we can form theories. The overall process of post-excavation is the most time-consuming part of archaeology, but the final stage, cataloguing the information ready for publication, is in my opinion the most rewarding part.