ARCHAEOINFORMATICS

Report: The Pennsylvania State University

Archaeology and the other historical sciences depend heavily upon legacy data, because the record is finite and new data are difficult to acquire. Field archaeology destroys the physical resource as it generates new data, so archaeologists necessarily depend not only upon the data they acquire through their own efforts but also upon data acquired by other projects for different purposes. The use of legacy data entails several practical problems, including variable systems of weights and measures, incommensurate databases, and missing metadata. These data may take many forms, such as field notes, field forms, photographs, maps, remote sensing imagery, and the specialized data sets produced by instruments and methods such as magnetometers, resistivity surveys, and ground-penetrating radar. Laboratory analysis produces additional data, including GIS files, database files, word processor files, digital images, and PDF files. All of these must be made accessible to future researchers if legacy data are to be useful to them.

Archaeological collections are not themselves legacy data, but rather the sources of data. Conservation of sites and curation of collections are important, but they are not the focus of our efforts in archaeoinformatics. The continuing generation of data from curated collections and unprocessed assemblages will be much improved by the development of standards and protocols for recording data and metadata. However, standardization going forward will not improve access to legacy data already recorded. Those data exist largely in the form of incommensurate databases and published reports, voluminous but nearly inaccessible “gray” literature, and archival sources. In many cases the sites and collections from which they were generated no longer exist.

The Penn State team is working on one solution to part of this problem. We have created a search engine called ArchSeer, which can crawl published and unpublished text and, given reasonable queries, return relevant results at high precision. When coupled with projects to convert thousands of gray literature reports into PDF files with optical character recognition, the tool will open up a vast resource for archaeological researchers, agencies, and private individuals. With the cooperation and assistance of JSTOR, we are currently using 75 years of American Antiquity back issues to test and refine ArchSeer. When it is fully tested, we will release ArchSeer as open source software.
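For readers unfamiliar with the term, precision is the fraction of returned documents that are actually relevant to the query, and it is usually reported alongside recall, the fraction of all relevant documents that were returned. The following is a minimal sketch of how the two measures could be computed for a single test query. It is not taken from ArchSeer; the function name and the report identifiers are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: identifiers the search engine returned (hypothetical data)
    relevant:  identifiers a human judged relevant (hypothetical data)
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set

    # Guard against empty sets so the ratios are always defined.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Illustrative identifiers only; they do not refer to real reports.
retrieved = ["report-01", "report-02", "report-03", "report-04"]
relevant = ["report-02", "report-03", "report-04", "report-07", "report-09"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case three of the four returned reports are relevant, and three of the five relevant reports were found. Averaging such figures over many test queries is one common way to summarize the quality of a search tool.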

A second part of the Penn State contribution to the larger effort is image recognition. This involves two tasks: automatic recognition and registration of maps, and recognition of objects in photographs. The latter is built upon SIMPLIcity, image recognition software that allows us to find objects in published or unpublished photographs without the benefit of tagging or cataloging. The software facilitates automatic recognition of things such as arrow points, and with sufficient training the system should be able to recognize specific point types. One version of this software can already trace the outline of an object and correctly assign it to one of a set of defined types within that class of objects.
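To make the idea of classifying a traced outline concrete, the following is a minimal sketch under stated assumptions. It is not the SIMPLIcity method, whose internals are not described here. It assumes an outline is already available as a list of (x, y) vertices, reduces it to two simple shape measures (elongation and compactness), and assigns it to the nearest of a set of prototype types. The type names and prototype values are hypothetical.

```python
import math


def polygon_area(points):
    """Area of a closed outline via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths around the closed outline."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def shape_features(points):
    """Return (elongation, compactness) for an outline.

    elongation:  bounding-box height divided by width
    compactness: 4*pi*area / perimeter**2 (1.0 for a circle, lower otherwise)
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    perimeter = polygon_perimeter(points)
    if width == 0 or perimeter == 0:
        raise ValueError("degenerate outline")
    elongation = height / width
    compactness = 4 * math.pi * polygon_area(points) / perimeter ** 2
    return elongation, compactness


def classify(points, prototypes):
    """Assign the outline to the prototype with the closest feature vector.

    Features are compared unscaled here; a real system would normalize them.
    """
    features = shape_features(points)
    return min(prototypes, key=lambda name: math.dist(features, prototypes[name]))


# Hypothetical prototype types with made-up (elongation, compactness) values.
PROTOTYPES = {
    "broad_triangular": (1.5, 0.55),
    "narrow_lanceolate": (5.0, 0.40),
}

# A simple triangular outline standing in for a traced point.
outline = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
print(classify(outline, PROTOTYPES))
```

Two hand-picked measures are far cruder than what a trained recognition system would use, but the structure is the same: trace the outline, describe it numerically, and compare it against defined types within the class of objects.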