Open data

A Day in the Digital Index of North American Archaeology


What is DINAA?

The Digital Index of North American Archaeology, or DINAA, applies open access principles to archaeological data created by governments and researchers in order to create a standardized data discovery tool (without using sensitive information like site coordinates). This supports a more complete understanding of the past, because data covering large areas, or areas separated by modern political boundaries, can be analyzed using the same terms in one data set. As the DINAA index grows, it will incorporate larger numbers of stable links to public data sets hosted throughout the Internet, and it can act as a kind of library search engine for primary archaeological data on architecture, fauna, flora, lithics, pottery … or anything!

What We Do

Each state in the U.S. has a State Historic Preservation Office, or SHPO, and each of these maintains its own database of archaeological sites in its state. These databases have been designed independently of each other, and often differ in data structure and vocabulary. DINAA uses definitions and organizational elements from these nearly comprehensive catalogs as its base data layer. We have created a system that allows these differing databases to become interoperable through translation to one or more standardized classifications. If the DINAA and each SHPO can talk to each other, the information from each state can be presented in one data set. A publicly accessible live map, seen below, is one of the products of this process. Click on the link or the map image to try your own query!

Live Map of mound sites listed in the DINAA as of April 15, 2014.

DINAA is an archaeological information tool for the Internet. Records for sites of interest can be browsed and used as a basis for further research. Maps can be exported as GeoJSON files for use in GIS software programs like QGIS and ArcGIS, allowing use by anyone through our open access policies. DINAA can be used by researchers to help identify broad areas of interest for their work, by educators who want to show students current maps of archaeological cultures, or for all sorts of important investigative or public activities. However, because of its sensitive data restrictions, DINAA is not built to conduct records checks for cultural resource management or other legal compliance activities. It is a public research and educational tool. Click on the map links or images to go to our query page and try it yourself!
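For readers who would rather script against an export than open it in a GIS program, here is a minimal sketch of reading a GeoJSON file with Python's standard library. The sample data and the property names ("site_type", "state") are assumptions made for illustration; a real DINAA export may use different fields.

```python
import json
from collections import Counter

# Hypothetical sample standing in for a GeoJSON export. The property names
# ("site_type", "state") are assumptions, not the real DINAA schema.
SAMPLE_EXPORT = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"site_type": "mound", "state": "State A"},
     "geometry": {"type": "Polygon", "coordinates":
       [[[-85.0, 35.0], [-84.8, 35.0], [-84.8, 35.2], [-85.0, 35.2], [-85.0, 35.0]]]}},
    {"type": "Feature",
     "properties": {"site_type": "mound", "state": "State B"},
     "geometry": {"type": "Polygon", "coordinates":
       [[[-84.8, 35.0], [-84.6, 35.0], [-84.6, 35.2], [-84.8, 35.2], [-84.8, 35.0]]]}},
    {"type": "Feature",
     "properties": {"site_type": "village", "state": "State A"},
     "geometry": {"type": "Polygon", "coordinates":
       [[[-85.0, 35.2], [-84.8, 35.2], [-84.8, 35.4], [-85.0, 35.4], [-85.0, 35.2]]]}}
  ]
}
"""


def count_by_property(geojson_text, prop):
    """Count the features in a GeoJSON FeatureCollection by one property value."""
    collection = json.loads(geojson_text)
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    return Counter(
        (feature.get("properties") or {}).get(prop, "unknown")
        for feature in collection["features"]
    )


counts = count_by_property(SAMPLE_EXPORT, "site_type")
print(counts.most_common())
```

The same file can be loaded directly into QGIS or ArcGIS; the script is only useful when a quick tally is all that is needed.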

On a typical day, much of the work involved with creating the DINAA consists of two tasks: obscuring site locations to prevent unauthorized access, followed by linking culture-history terms in individual state databases to a standardized terminology. Obscuring location data involves allocating sites to sectors on the map grid, where each sector is 20 km on a side (or 400 square km), and then removing all geographic coordinates and other sensitive data. This work, done by registered professional archaeologists ONLY, allows useful cultural and scientific information to be published publicly online while simultaneously protecting important site locations.
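The idea of allocating a site to a grid sector and then dropping its coordinates can be sketched in a few lines of Python. This is an illustration only, not the DINAA team's actual procedure: it assumes site positions are already in a projected coordinate system measured in meters, and the field names ("easting_m", "northing_m", "landowner") are hypothetical.

```python
import math

SECTOR_KM = 20.0  # sector edge length stated in the text (20 km x 20 km = 400 sq km)


def to_sector(easting_m, northing_m, sector_km=SECTOR_KM):
    """Return the (column, row) index of the grid sector containing a point."""
    size_m = sector_km * 1000.0
    return (math.floor(easting_m / size_m), math.floor(northing_m / size_m))


def obscure(record, sensitive=("easting_m", "northing_m", "landowner")):
    """Copy a site record, add its sector index, and drop the sensitive fields."""
    sector = to_sector(record["easting_m"], record["northing_m"])
    public = {key: value for key, value in record.items() if key not in sensitive}
    public["sector"] = sector
    return public


# Hypothetical site record; none of these values come from a real database.
site = {
    "site_id": "EX-001",
    "site_type": "mound",
    "easting_m": 512345.0,
    "northing_m": 3987654.0,
    "landowner": "private",
}

public_site = obscure(site)
print(public_site)
# {'site_id': 'EX-001', 'site_type': 'mound', 'sector': (25, 199)}
```

Only the sector index survives in the public record, so a site can be mapped to within a 400 square km cell but no more precisely than that.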

The next step is to relate each state’s unique terms to the standardized vocabulary used by the DINAA (based on the CIDOC-CRM ontology, which is an international standard for cultural heritage data). The DINAA team first creates a comprehensive list of all archaeological terms used within a source database. They then sift through the published archaeological literature on each state or region to find discrete definitions for each term. DINAA accumulates definitions for sites, rather than replacing them, and users can query the original definitions to compare with the newer DINAA definitions to ensure accuracy and continuity. Reference citations for each new definition are then recorded and added to the DINAA Zotero library, which is also available as a public resource online.
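A rough sketch of this kind of term linking, in Python, might look like the following. The crosswalk entries, state names, and field names are invented for illustration and do not come from any SHPO database or from the DINAA vocabulary itself. The point it shows is that the original term is kept alongside the standardized one, and that terms with no mapping are flagged for review rather than guessed.

```python
from collections import defaultdict

# Hypothetical crosswalk: (state, local term) -> standardized term.
CROSSWALK = {
    ("State A", "Late Woodland"): "Late Woodland",
    ("State B", "L. Woodland"): "Late Woodland",
    ("State B", "Miss."): "Mississippian",
}


def standardize(records, crosswalk):
    """Add a standardized period to each record without replacing the original.

    Returns the annotated records and a tally of terms with no mapping yet.
    """
    annotated = []
    unmapped = defaultdict(int)
    for record in records:
        key = (record["state"], record["period"])
        standard = crosswalk.get(key)
        if standard is None:
            unmapped[key] += 1  # leave for a person to research and define
        annotated.append({**record, "period_std": standard})
    return annotated, dict(unmapped)


# Hypothetical site records from two differently structured state databases.
records = [
    {"site_id": "A-1", "state": "State A", "period": "Late Woodland"},
    {"site_id": "B-1", "state": "State B", "period": "L. Woodland"},
    {"site_id": "B-2", "state": "State B", "period": "Archaic?"},
]

annotated, unmapped = standardize(records, CROSSWALK)
for row in annotated:
    print(row["site_id"], row["period"], "->", row["period_std"])
print("needs review:", unmapped)
```

Because both the local and the standardized terms are stored, a query on "Late Woodland" finds sites from both states while the source wording stays available for comparison.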


Screenshot of the DINAA Zotero Library


The word cloud above, created by DINAA team member Kelsey Noack Myers, demonstrates the variety of terms used across state archaeological databases. The size of the text for each term corresponds to the frequency with which it is used. Linking these categories across multiple states is a major challenge facing the project team, but the linked categories are being used to document where people were on the landscape during major time periods in the past.

What’s next?

Papers and posters about DINAA have inspired audiences at professional meetings over the last two years. Our team recently produced presentation materials for the 2014 Society for American Archaeology annual meeting (click here to access our papers, posters, slides, and a summary of our activities at the SAA meetings). An article in Literary and Linguistic Computing will be available this fall. Please follow our work or tweet us @DINAA_proj on Twitter, and visit our blog for updates. Team members are currently working on technical papers describing the construction of the index, and on research papers based on examining the combined dataset.

DINAA also gives back to the discipline of archaeology, acting as a focal point around which we can discuss “how” and “why” we record data in different ways. Project team members have hosted one workshop with 30 participants already this year, and are planning a second next month. Site file managers and other researchers from many states in Eastern North America are participating. DINAA is an open, community effort, and the support of many people and organizations is what makes it happen. Feel free to contact us!

In 2014 our initial NSF funding period is coming to a close. We are currently planning the next round of funding, which will help the DINAA grow to cover all US states and territories, as well as other North American nations.


 This Post Was Authored By the DINAA Team: R. Carl DeMuth, Kelsey Noack Myers, Joshua Wells (PI), David G. Anderson (PI), Eric Kansa (PI), Sarah Kansa (PI), Steve Yerka (PI), and Thad Bissett

Open School of Archaeological Data

Here we are! At the MAPPA Lab preparing the last details (not so little, sigh) for the first Open School of Archaeological Data.

This year we decided to offer a free opportunity to 10 scholars to work with archaeological open data. We’ll start next Monday (July 14th) and finish on Friday (July 18th). We received 37 applications. The quality was really high, so we admitted 4 more students to the school. We see this as a great responsibility: 14 archaeologists want to study how to find, download, use, reuse, and publish data in open formats.

We have an ambitious project: to create a new generation of Italian archaeologists, a collaborative generation able to work with a trowel and to share and manipulate data. We believe that archaeological data are public and expensive to produce, and for this reason they must be recycled.

We don’t want to teach, but to share our experience. We’ll have in front of us a small group of young scholars who already have strong curricula; they should share their experience among themselves and with us.

As a first step, we’ll transform data from native formats to more useful formats, for example using Tabula to liberate tables locked inside PDF files, or making web data extraction easy with software like . After the data mining, we will work with OpenRefine, a powerful tool for working with messy data: is there anyone who thinks that archaeological data aren’t messy? We will clean the data and transform them from one format into another; we’ll geocode tabular data starting from a simple address and analyse the spatial properties of archaeological data. In 2002, Wheatley and Gillings wrote that «Contrary to popular mythology, contemporary professional archaeologists may spend more time using GIS than a trowel». Using QGIS we’ll explore the archaeological data, and with the help of the mathematician Nevio Dubbini we’ll apply statistical, geostatistical and mathematical models to them.

Working with data is useful only if archaeologists are able to communicate their results to the archaeological community and, above all, to the wider community of citizens: archaeologists have a public role in modern society that data can reinforce. So, Francesca Anichini will lead us into the world of storytelling: how to visualise data via infographics, or through graphs that make it possible to explore networks and complex systems dynamically using Gephi, whilst Fabio Viola will talk about Datagamification (an exciting topic!).

Archaeologists are open data users, but also open data producers: Matteo Lorenzini will lead us into the world of metadata and Linked Open Data, whilst Francesca Anichini will make us aware of the ethical and legal aspects of opening archaeological data and of the importance of using licenses.

Is this geek archaeology? Maybe, but we are all living in a geek world!