
Site File Databases and GIS Systems



Steven LeBlanc

In a recent SAA Bulletin [12(5):13,22], Jim Ebert raised the issue of what is the best approach to electronically manage archeological site records. While he points out the merits of using Geographic Information Systems (GIS) as an analytic tool, it would appear that he is proposing to use such systems as the primary data storage environment. As I have discussed this issue with him at some length, I am surprised that he still holds this view. I think he correctly sees, as I do, the benefits of using GIS systems for some analysis, but this is far different from using a GIS as a general purpose data repository. Because this issue is a very important one, I feel that it is useful to outline what I believe an overall data analytic environment should be like. While this will be directed toward site records, the logic is virtually identical to excavation or surface collection data.

As Ebert alludes to, experience has overwhelmingly demonstrated that one should always buy software instead of writing it oneself. As Ebert points out, many states use systems written from the ground up, which turns out to be a very expensive process when all costs are considered. However, I also believe that no one piece of software, or hardware, can do everything. Ebert seems to be correctly proposing that writing one's own site management system is a poor idea, but I believe the evidence is against him when he argues that an application based on a GIS system can adequately fill all the relevant needs. That is, I agree with Jim that most site management systems are not nearly as useful and efficient as they should be, but we disagree as to the reason. He seems to feel that they are flawed because they are not based on GIS systems, while I believe that their quality is limited because they have been custom-written.

From a technical perspective, Ebert argues that the simple flat-file data structure of a GIS is adequate for site records, but that the relational data model is too complex for such an application. I believe that the opposite is true and that this has been amply demonstrated. A relational data structure results in higher quality data with fewer inconsistencies and errors, and is more efficient in terms of data entry, data maintenance, and overall performance. Such a data structure has been overwhelmingly shown to be easy to use and maintain. In addition to a number of SHPO offices that successfully use a relational data structure, a very large number of anthropology museums also use it for their collection data. If anything, site data are more complex than collection data, and so the benefits of a relational data structure are even greater for site records. Probably one of the best examples is the New Mexico site system. New Mexico has probably utilized GIS as intensively as any SHPO system, yet their primary data structure is not a simple GIS database, but is instead a very sophisticated and expensive relational database. That is, the system is fundamentally a relational data structure with the GIS added on. I believe this to be the correct approach.

What is frequently misunderstood is that a database itself is not an application. A huge investment in design and programming is required to provide it with the structure and tools to do what is needed. The most obvious concerns are vocabulary control, data validation, proper searching, and reporting capability. The ability to handle images of site photos, forms, maps, and other documents can now be inexpensively integrated with computerized site records. Thus, the concept of what the database actually is should be very broad. It really consists of the underlying database engine itself, the application program, the basic data, and images. I think it is quite clear that the best environment for application development and complex site data is not a GIS, but instead an application that integrates them all.

These broad functions and many others constitute a good site record application. They are not inherent in any database, including any GIS database. To get the best functionality and the most cost-effective programs, the underlying database should be much more powerful than those integrated with a typical GIS. That does not mean that the GIS cannot be seen or designed as an "application" that accesses the same database as the basic application. However, a GIS does not have the specific functionality to handle a large and complex automated site file. Furthermore, I do not believe that spatial relations are the primary focus of site data research or use. Administrative use is extremely important, as is non-spatial statistical analysis. Finally, virtually all SHPO organizations and many others rightly integrate both historic structure and archeological site records. There are only minor roles for GIS regarding these records.
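As a minimal sketch of what vocabulary control and data validation can look like when the database engine enforces them, the following uses Python's built-in sqlite3 module. The table names, column names, site-type codes, and site identifiers are all hypothetical, chosen only for illustration; they do not describe any SHPO's actual schema.

```python
import sqlite3


def build_site_db():
    """Create an in-memory database with a controlled vocabulary of site types.

    All table names, columns, and codes here are hypothetical examples.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE site_type (
            code  TEXT PRIMARY KEY,
            label TEXT NOT NULL
        );
        CREATE TABLE site (
            site_id   TEXT PRIMARY KEY,
            type_code TEXT NOT NULL REFERENCES site_type(code),
            easting   REAL NOT NULL CHECK (easting >= 0),
            northing  REAL NOT NULL CHECK (northing >= 0)
        );
    """)
    conn.executemany(
        "INSERT INTO site_type VALUES (?, ?)",
        [("LITH", "Lithic scatter"), ("PUEB", "Pueblo")],
    )
    conn.commit()
    return conn


def add_site(conn, site_id, type_code, easting, northing):
    """Insert a site record; return False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO site VALUES (?, ?, ?, ?)",
                (site_id, type_code, easting, northing),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised for an unknown type code, a failed CHECK, or a duplicate site_id.
        return False
```

In this sketch a misspelled site type or a negative coordinate is refused at entry time rather than discovered later during analysis. A real application would add much more (reporting, searching, image handling), which is the point made above: the engine supplies the rules, but the application must still be built around them.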

At a more technical level, there is a question of how one integrates the GIS (or other statistical package) with the basic data. There are two approaches. The first is to use the query language capability of the GIS or statistical package to access the data directly. This is the approach taken by New Mexico. Virtually all good databases will allow such an approach and it is the best one if the data are simple enough. A second approach is to "export" the data to the application after they have been manipulated. At one level, this is very inefficient as it requires an additional step. It does, however, allow the data to be stored and manipulated in very complex ways before they are exported. Of course, those who use simple data structures feel that direct query is preferable, while those with complex data see the opposite. One cannot measure the validity of the two positions without considering the underlying data and the potential for data analysis.
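The two approaches can be contrasted in a small sketch, again using Python's built-in sqlite3 and csv modules. The schema, county names, and period names are hypothetical. The first function stands in for an analysis package querying the live tables; the second manipulates the data first (here, a join and a count) and then exports a flat table that a GIS or statistical package could read.

```python
import csv
import io
import sqlite3


def make_db():
    """Build a small hypothetical site database with a one-to-many component table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE site (
            site_id TEXT PRIMARY KEY, county TEXT, easting REAL, northing REAL
        );
        CREATE TABLE component (
            site_id TEXT REFERENCES site(site_id), period TEXT
        );
    """)
    conn.executemany(
        "INSERT INTO site VALUES (?, ?, ?, ?)",
        [("S-001", "A", 100.0, 200.0),
         ("S-002", "A", 150.0, 260.0),
         ("S-003", "B", 900.0, 900.0)],
    )
    conn.executemany(
        "INSERT INTO component VALUES (?, ?)",
        [("S-001", "Archaic"), ("S-001", "Historic"), ("S-002", "Archaic")],
    )
    conn.commit()
    return conn


def direct_query(conn, county):
    """Approach 1: the analysis tool queries the live tables directly."""
    cur = conn.execute(
        "SELECT site_id, easting, northing FROM site "
        "WHERE county = ? ORDER BY site_id",
        (county,),
    )
    return cur.fetchall()


def export_flat(conn):
    """Approach 2: join and summarize first, then export a flat CSV table."""
    rows = conn.execute("""
        SELECT s.site_id, s.easting, s.northing, COUNT(c.period) AS n_components
        FROM site AS s
        LEFT JOIN component AS c ON c.site_id = s.site_id
        GROUP BY s.site_id
        ORDER BY s.site_id
    """).fetchall()
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["site_id", "easting", "northing", "n_components"])
    writer.writerows(rows)
    return buf.getvalue()
```

The export step costs an extra pass, as noted above, but the receiving package only ever sees one simple row per site, however complex the underlying tables are.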

In summary, there are a number of agencies that use custom software for their basic site records. They allow for as much standardization as any other approach, and they serve both researcher and administrator. The relational data model has worked very well in these cases, as it does for data derived from excavation. Even though many agencies have yet to add GIS capability, they provide a successful model of how to manage site data. Other successful implementations are those that use a powerful relational database with a custom application and a GIS. SHPOs are better off looking to these examples instead of trying to build a complete solution based on a GIS as Ebert suggests.

Steven LeBlanc is with Questor Systems, Inc., of South Pasadena, Calif.



Jim Ebert, Eric Ingbar, and Roger Werner

A reader of Steven LeBlanc's reply to our article on research into methods for automating SHPO databases [SAA Bulletin 12(5):13,22] who went back and looked at the original news item would sense immediately that we are in substantial agreement. Unfortunately, there are some specific differences as well, most of which we think have semantic roots. Some of LeBlanc's critique points out general weaknesses of electronic data management, which we find apt but would extend somewhat further based on our knowledge of Geographic Information Systems (GIS) and other database management techniques and software gained through many years of collective experience in this area. Some of his other perceptions seem to stem from a misunderstanding, on his part, of just what Geographic Information Systems are, and how the data contained in SHPO archives across the country are actually used. Our knowledge of the nature and uses of SHPO archives is based on nearly a year of research involving direct mail, telephone, and face-to-face contact with SHPOs, their database managers, SHPO database users, and cultural resource managers in all but a few states, as reported in the SAA Bulletin article.

When we began considering how SHPO archives might best be converted to computerized databases, we contacted Steve because we thought the Questor Systems software, which he developed and markets, might have a role to play in organizing SHPO's data. The Questor Systems software is relationally organized about a lexicon of words that identify items; spatial relationships are quite appropriately not important in the system. Many state site forms contain lexical entries--text entries into blanks. These convey a great deal of important information that can probably never be adequately entered as encoded or standardized data into a computer, and for this reason SHPO databases will probably always at least include past site forms, possibly in document-imaged format. The lexicon system attempts to deal with the wealth of textual description found in site files, but we think it may be too flexible since it allows unnecessary ambiguity in what should be comparable observations. The lexicon model copes with cultural records in their past and present formats, but a fundamental question is whether recording archaeological observations should continue to depend upon idiosyncratic narrative.

Most of the respondents to our surveys felt that GIS would suit their needs better than an aspatial textual database system. A lexical tool may be a desirable part of such a system but need not lie at its heart to achieve the benefits of "relationality". There seem to be two main misunderstandings between LeBlanc and us: (1) the "relationality" of GIS and other database management techniques, and (2) the rationale behind and uses of SHPO archives.

GIS means Geographic Information Systems. We don't interpret GIS to mean some specific kind of software, but rather an approach, and in a real way even a philosophy, for managing digital data. In a GIS approach, the "relationality" of components of the database focuses first on geographic or spatial data as well as allowing other links between data items. Some off-the-shelf GIS software packages come with what LeBlanc calls "flat file" data managers (for the non-spatial data associated with spatial data in the GIS). Other tabular data managers included with higher-end GIS packages, such as the Info part of Arc/Info, can be customized by the user to be "non-spatially relational" with multiple layers of lookup tables, or scrapped entirely and easily replaced by well-known RDBMS (Relational Database Management System) engines (e.g., the numerous packages capable of using SQL). There is no basis for regarding GIS approaches to SHPO database management as non-relational.

It doesn't go the other way, however. While RDBMS software can of course store spatial coordinates, a non-GIS database manager cannot in an efficient or user-friendly way relate non-spatial data items using spatial relationships. For instance, even the best non-spatial relational database managers cannot create new data layers (i.e., a map with associated tabular data) by creating buffers around points, lines, or polygons; or by joining or splitting polygons or lines and then associating tabular data from both datasets with new polygons. A GIS-centered database management system is the most, and really the only, appropriate way to organize data that are primarily spatial.
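For readers unfamiliar with the term, the simplest kind of buffer operation can be sketched in a few lines of Python. This is only an illustration of what the operation computes, assuming planar coordinates in consistent units and hypothetical site identifiers. It covers only a circular buffer around a single point; it does not attempt line or polygon buffers, map projections, spatial indexing, or the creation of new map layers, which are the capabilities a GIS package supplies.

```python
import math


def sites_within_buffer(sites, center, radius):
    """Return the IDs of sites lying within `radius` of `center`.

    `sites` maps a site ID to an (x, y) pair in planar coordinates.
    Every site is tested in turn, so this is a brute-force scan.
    """
    cx, cy = center
    return sorted(
        site_id
        for site_id, (x, y) in sites.items()
        if math.hypot(x - cx, y - cy) <= radius
    )
```

Called with a proposed project location as `center` and a review distance as `radius`, the function returns the recorded sites that fall inside the buffer. Doing the same for every query against lines and polygons, and carrying the tabular data along with the result, is where hand-written code of this kind stops being practical.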

Contrary to what LeBlanc seems to feel, the most basic property of all data in SHPO archives is spatial location. It is nice to think that someday it will be easy to use SHPO archives for archaeological research (this will probably happen after they are converted to GIS-centered databases), but the reason these archives exist is to facilitate the fulfillment of laws and government policy that require the assessment of impacts to cultural resources on lands to be developed, mined, or otherwise disturbed. SHPO archives are universally organized around maps that show locations of cultural sites or properties, the survey projects that have been undertaken to find these sites, and the boundaries of real estate upon which assessments have been or must be made. Cultural resource managers and cultural resource firms search SHPO archives to find whether areas have been surveyed and whether sites or other cultural resources were found. Often voluminous non-spatial data are recorded for each site and project as well, but these data are primarily intended to be used to determine the one central, most important, non-spatial characteristic of sites the law requires: whether they are eligible for nomination to the National Register or whether they trigger other regulatory mandates.

Nowhere in our original article did we argue that "...a relational data model is too complex" for SHPO archives or any other purpose. LeBlanc feels that relational database structures have been "overwhelmingly shown to be easy to use and maintain." Almost every major database system (whether GIS or tabular) of which we are aware has required difficult, time-consuming labor to create. The successful systems are easy to use and maintain, but were not easy to create. Relational structures are important and are here to stay in data management. GIS should be considered a powerful tool in the relational toolbox, especially in applications such as SHPO archives that are "map-driven." The hard work of creating a records management system is made even more laborious if one refuses to use appropriate tools, such as GIS, because of a limited conception of what these tools can and cannot do.

We wonder if it is misunderstood by anyone that a database is not an application. Certainly, much thought must be invested in developing a database structure that fulfills the needs of database users while at the same time facilitating the constant updating and maintenance that SHPO databases require. This is exactly the focus of our ongoing research on SHPO data automation.

Jim Ebert is with Ebert & Associates, Inc., Albuquerque, N.M. Eric Ingbar is with Gnomon, Inc., Carson City, Nev. Roger Werner is with Archeological Services, Inc., Stockton, Calif.
