Prepared under the:

Local Government GIS Demonstration Grant

Supported by:
Local Government Records Management Improvement Fund
Local Government Records Services
State Archives and Records Administration

Project Team:
Erie County Water Authority
Mr. Paul Becker, Project Manager

National Center for Geographic Information and Analysis
State University of New York at Buffalo
Dr. Hugh Calkins, Project Director
Ms. Carmelle J. Ct
Ms. Christina Finneran

GIS Resource Group, Inc.
Mr. Graham Hayes, President
Mr. Thomas Murdoch, Vice-President

For More Information, Contact:
Local Government Technology Services
State Archives And Records Administration
9B38 Cultural Education Center
Albany, New York 12230
Phone: (518) 474-4372
Fax: (518) 473-4941

GIS DEVELOPMENT GUIDE Volume 1

Table of Contents
MANAGER'S OVERVIEW
    Introduction
    Geographic Information Systems: Definitions and Features
    Enterprise-wide GIS: The Corporate Database
    Policy Issues in GIS Development
    Management Issues in GIS Development
    Geographic Information Systems: The Development Cycle
    Tasks for GIS Development and Use
    Summary
    References
    Glossary
    Figures
        1   GIS Development Process
        2   Life-Cycle of a GIS Database

NEEDS ASSESSMENT
    Introduction
    Conducting a Needs Assessment
    Local Government Uses of GIS
    Data Used by Local Government
    Documenting GIS Needs
    Documenting an Activity-Type Use of the GIS
    Master Data List
    Conducting Interviews
    Preparing the Needs Assessment Report
    Summary
    Appendices
        A - GIS Application Description Forms
        B - Full-Page Sample of Master Data List
        C - Sample GIS Application Description
        D - Data Flow Diagraming Symbols
        E - Sample Application Descriptions and Summary Tables
    Figures
        1   GIS Application Descriptions
        2   Data Flow Diagram Example
        3   Master Data List
        4   Interviewing and Documenting Needs
        5   List of GIS Applications
        6   Table Summarizing Applications Example
        7   GIS Applications/Data Matrix
        8   GIS Functions List
        9   Compiling Results of Needs Assessment Example

CONCEPTUAL DESIGN OF THE GIS
    Part 1: Data Modeling
        Introduction
        Nature of Geographic Data
        Entity Relationship (E-R) Data Modeling
        Geographic Data Models
        Methodology for Modeling
        Developing a Spatial Data Model (Entity-Relationship Diagram)
        Summary of Conceptual Data Modeling
    Part 2: Spatial Data Standards and Metadata Requirements
        Metadata Tables
        Additional Reading
    Appendix A
    Figures
        1   GIS Development Process
        2   Life-Cycle of a GIS Database
        3   Entities
        4   Example of a Firm's Database
        5   Example of Simple E-R Diagrams
        6   Simple E-R Diagrams
        7   Spatial Relationships
        8   Entity Symbol for Spatial Objects
        9   Entity Relationship Symbols
        10  Diagramming a Spatial Relationship
        11  Example of Entity Relationship Diagram for Local Government

GIS DEVELOPMENT GUIDE: MANAGER'S OVERVIEW

1 INTRODUCTION
This guide is the first of a set of technical support documents to assist local governments in developing a GIS. The set of guides describes procedures and methods for planning the GIS, evaluating potential data sources, testing available hardware and software and planning for its acquisition, building the GIS database, developing GIS applications, and planning for the long-term maintenance of the GIS and its database. These guides are intended to provide advice on how best to accomplish the GIS development tasks for all levels of local government - from large, urbanized counties to small rural towns to special-purpose districts.

Realistically, large comprehensive GISs will be developed by the larger units of government (counties and cities), either individually or, most likely, as the leader in a cooperative multi-participant effort. Such efforts would involve the individual operating units within that government and/or the smaller units of local government within the common land area of the larger leading unit. Typically, we would expect to see county government taking the lead while also covering the interests of all other governmental units within the county. Occasionally, there will be situations where smaller units of government (a town, a special purpose district, or a limited purpose GIS application) may have to "go it alone" in developing the GIS.

These guidelines have been written mainly to address the first case - a county leading a consortium or cooperative effort. Thus, we would expect the GIS development team of a county to be the primary user of these guidelines, in the sense of actually performing the tasks outlined in each document. However, this does not mean the other participants in a GIS should stop reading these guidelines at this point. It is critically important for all expected participants in a cooperative GIS venture to fully understand the development process.
If a smaller unit of government is to reap the benefits of a county-level GIS, it must actively participate in the planning and development effort.

The procedures are applicable for use in the first-time creation of a GIS, for restructuring an ongoing GIS development project, and for the review and further development of an existing GIS. The subject matter of the guides identifies the necessary tasks in a GIS development program, describes appropriate methods to accomplish each task and, where applicable, provides examples and illustrations of documents or other products that result from each task.

The guidelines are designed for use by general-purpose local governments (city, county, town, or village), special purpose governments (utilities, school districts, etc.), and those who provide assistance to local governments (consultants, academic units, etc.). The guides address the technical steps required to create a GIS, the management tasks required to ensure successful development of the GIS, and the policy issues that should be considered for the effective use of the GIS.

The Role Of Management

Although GIS is often viewed as an arena for the technically sophisticated computer professional, the development of a successful government-based multi-participant GIS is very dependent on proper management participation and supervision. Normal, common-sense management practices are as necessary in a GIS project as in any other major undertaking. In fact, our experience has shown that the recommended management actions may be the most critical aspect of the GIS development process. GIS development is a process of technological innovation and requires management attention appropriate to this type of activity - active, as opposed to passive, management involvement in the project. Historically, much of the disillusionment and disappointment with GIS projects has stemmed not from a failure of the technical components of the GIS but rather from a lack of understanding of the process of technology innovation and a lack of realistic expectations among all parties associated with the project (GIS technicians, potential users, managers, and elected/appointed officials).

Applying The GIS Development Guides By Local Governments In New York State

The overall procedure contained in the GIS Development Guides is very comprehensive and can require considerable time, effort and dollars to complete. This raises the questions: Does all of this have to be done? What level of detail is appropriate? How can smaller governments (villages and towns, special purpose districts) or a single department in a larger jurisdiction get through this process?

Does everything have to be done? What level of detail?

Basically, yes. However, the steps in the GIS development process are frequently done in an iterative manner over an extended time period. Also, the steps are not completely independent of one another, so some back-and-forth does happen. It is often useful to make a "first-cut" run through the entire process, writing down what is already known and identifying the major questions that need to be answered. The person who will be managing the development process may be able to do this "first-cut" description in 1 to 2 days. This can be very helpful in getting a feel for the scope of the whole process, and it can then be used as a decision tool for continuing. The number of times the process is conducted, the amount of detail, and the resources needed to complete the study can be balanced in this way. If the intended implementation will be limited or small, the planning effort and documents can be sized accordingly. It is important, however, that each step be considered and completed at some level. The GIS Design software package that accompanies these guides provides a structure for, and makes it easy to record, the information developed during the planning process - application descriptions, data model, data dictionary, metadata, logical database design, and record retention information.

How can smaller units of local government, such as villages and small towns, complete a GIS plan?

The best situation for a village, small town, or even a smaller, rural county is to be a partner with a larger unit of government, a county, regional agency or utility company that is conducting and/or leading a GIS planning exercise. Participating in a regional GIS cooperative, or joining an existing one, will provide access to GIS technical expertise and spatial data created by other agencies. Additionally, if one is a partner in a larger group, the activities directed toward the evaluation and selection of the GIS hardware and software may not need to be completed. One would simply use the same GIS in use by the larger agency or group. Only the activities aimed at defining applications (uses) and identifying the needed data would need to be done by the smaller unit of government. In such a situation, the larger unit of government assumes the leadership role for the areawide GIS and should have the technical expertise to assist the smaller unit.

In situations where a larger effort does not exist, a village or town government may want to look at a GIS installation in a similar village or town elsewhere in the state. Given the similarities in local governments within the state, the adoption of the GIS plan of another unit is not unreasonable. That plan should be carefully reviewed by the intended participants in the GIS to ensure applicability. After modifying and validating the plan, a schedule for GIS hardware, software and data acquisition can be prepared consistent with available resources. If a good plan is prepared, there is no reason data acquisition (the most expensive part of a GIS) cannot be stretched over a long time period. Significant data are already available from state and federal agencies at reasonable cost. These data can form the initial GIS database, with locally generated data added later. A list of state and federal data sources is contained in the Survey of Available Data Guide.

Content Of This Guide

This guide presents an overview of the GIS development process. This process is presented as a sequence of steps conducted in a specific order. Each step is important in itself but, more importantly, the information needed to complete each subsequent step is assembled and organized in the steps that precede it. The underlying philosophy of the entire series of documents is to concentrate on the GIS data.
As well as being the most expensive part of any GIS, the data must be collected, stored, maintained, and archived under an integrated set of activities in order to ensure continued availability and utility to the initial users as well as future users, including the general public. Defining and documenting data elements from their initial definition in the needs assessment through to proper archiving of the GIS database according to state requirements is the constant theme of these guidelines.

2 GEOGRAPHIC INFORMATION SYSTEMS: DEFINITIONS AND FEATURES


Basic Definition Of A Geographic Information System (GIS)

A geographic information system (GIS) may be defined as "...a computer-based information system which attempts to capture, store, manipulate, analyze and display spatially referenced and associated tabular attribute data, for solving complex research, planning and management problems" (Fischer and Nijkamp, 1992). GISs have taken advantage of rapid developments in microprocessor technology over the past several decades to address the special challenges of storing and analyzing spatial data. Geographers have referred to GISs as simultaneously providing "...the telescope, the microscope, the computer and the Xerox machine" for geographic and regional analysis (Abler, 1987).

Unique Features Of A GIS - Why A Planning Process Is Needed

GIS belongs to the class of computer systems that require the building of large databases before they become useful. Unlike many micro-computer applications, where a user can begin work right after purchasing the hardware and software, a GIS requires that large spatial databases be created, appropriate hardware and software be purchased, applications be developed, and all components be installed, integrated and tested before users can begin to use it. These tasks are large and complex, so large, in fact, as to require substantial planning before any data, hardware or software is acquired. The focus of the GIS Development Guides is to describe the GIS planning process and to provide examples of how to accomplish the recommended planning tasks.

History Of Technology Innovations - GIS Is A Technology Innovation

It is useful to note that GIS is, at present, a technological innovation. The adoption of technological innovations (i.e., the development of a GIS for a local government) is not always a straightforward process, such as one might expect with the installation of something that is not new. Several problems are likely to occur, such as:

  - Staff not fully understanding the technology prior to extensive training
  - Development time estimates differing from actual task times
  - Greater uncertainty about costs
  - A greater likelihood that programmatic changes will be needed during the development phases

The significant management point here is that these are normal conditions in the adoption of a new technology. Management needs to anticipate that such events will happen and, when they do, take appropriate management actions.

The adoption of computer technology by an organization, whether for GIS or for other applications, introduces fundamental change into the organization's thinking about data. Prior information technology allowed data to be collected and related to activities and projects individually. Organized stores of data were the exception rather than common practice. This led to duplicate data collection and storage (as in different departments) and to the possibility of erroneous data existing in one or more locations. One of the goals of computer systems and database development is to eliminate redundant data collection and storage. The principle is that data should be collected only once and then accessed by all who need it. This not only reduces redundancy; it also allows for more accurate data and a greater understanding of how the same data are used by multiple departments. The necessary condition for successful computer system and database development is for different departments and agencies to cooperate in the development of the system. A database becomes an organization-wide resource and is created and managed according to a set of database principles.

ENTERPRISE-WIDE GIS: THE CORPORATE DATABASE

The role of a GIS in a local government setting is more than simply automating a few obvious tasks for the sake of efficiency. A local government (or several cooperating governments) should view the GIS project as an opportunity to introduce fundamental change into the way its business is conducted. As with the adoption of management and executive information systems in the business world, the adoption of GIS effectively reorganizes the data and information the government collects, maintains and uses to conduct its affairs. This can, and arguably should, lead to major changes in the institution that improve both the effectiveness and the efficiency of operations.

A key factor in the success of computer system adoption in the business world is the concept of the "enterprise" or "corporate" database. As implied by the name, the corporate database is a single, organization-wide data resource. The advantages of the corporate database are, first, that all users have immediate and easy access to up-to-date information and, second, that the construction of the database is done in the most efficient manner possible. Typically, the corporate database eliminates redundant collection and storage of information and the keeping of extra copies of data and extra reference lists by individual users. Here, we are recommending the use of the corporate database concept to integrate GIS data for all units of local government participating in a cooperative GIS program.

An effective corporate database does require cooperation on the part of all users, both for the collection and entry of data in the database and in developing applications in a shared data context. This may result in some individual applications or uses being less efficient; however, the overall benefits to the organization can easily outweigh these inefficiencies. Greater emphasis must, however, be placed on maintaining a high quality of data and services to users, mainly to offset the perceived loss of control that accompanies sharing an individual's data with another part of the organization.

The corporate database concept can be used in the governmental situation, either within a single unit of government or among several governmental entities in the same region. The benefits associated with the corporate database can be achieved if governmental units are willing to cooperate and share a multi-purpose regional GIS database. Such an arrangement has some technical requirements; however, establishing the corporate database is much more a question of policy, management cooperation, and coordination.

POLICY ISSUES IN GIS DEVELOPMENT

There are several policy issues that need to be addressed early in the GIS planning process.

GIS Project Management

The need for adequate management attention has already been mentioned in this document. As GIS is still an evolving new technology, the individuals involved (management, users, GIS staff) may have very different expectations for the project, some based on general perceptions of computing, which may or may not be correct. This, along with the long time period for developing the GIS, makes substantial management involvement in the project very important. Several factors associated with successful GIS projects are:

  - Emphasize the advantages of GIS to individual users and to the entire organization
  - Require a high level of competency from all participants
  - Ensure a high level of commitment from all management levels in the organization
  - Require participation in team building and teamwork within and between departments
  - Ensure minimum data quality and access for all users
  - Require the development team to set realistic expectations
  - Minimize the time between the user needs assessment and the availability of useful products
  - Develop a positive attitude toward change within the organization
  - Ensure the level of technology is appropriate for the intended uses
  - Conduct a highly visible pilot project that is successful

Data Sharing

The sharing of data among government agencies is a virtual necessity for a successful, long-term GIS. Not even the most affluent jurisdictions will be able to justify "going their own way" by neither taking advantage of the data available from other sources nor sharing their database with other governmental units. This, then, raises several questions that must be considered during the planning of the GIS:

  - What will be the source for each data item?
  - How will sharing be arranged (purchase, license, or other agreement)?
  - Who will own the data?
  - How will new GIS data be integrated with existing data files (legacy systems)?
  - Who will be responsible for updates to the data?
  - How will the cost of the data (creation and maintenance) be allocated?
  - Who will provide public access to the data?
  - Who will be responsible for data archiving and retention, both of the original and of copies?

These questions do not, at this time, have good answers. Currently, the Freedom of Information regulations require that all government data be made available to the public at minimal cost (the cost of making a copy of the data). No distinction is made on the basis of the format of the data (eye-readable or digital), the amount of data, or the intended use of the data. Thus, the question of sharing the cost of a GIS database cannot be addressed in general. If data can be obtained free from another agency, why enter into an agreement to pay for it? The answer is, of course, that the creating agency will not be able to sustain the GIS database under these circumstances. However, at this time, the set of state laws and regulations applicable to GIS data is not adequate to resolve cost issues and to facilitate regional data sharing cooperatives. New legislation will be required.

The New York State Temporary GIS Council did submit recommendations on these issues to the Legislature in March 1996. Additionally, the New York State Archives and Records Administration is currently preparing record management and retention schedules suitable for GIS data, both in individual agencies and for shared databases. The New York State Office of Real Property Services has been designated as the GIS representative on the Governor's Task Force for Information Resource Management. One of the charges given to the Task Force is to design a cohesive policy for the coordination of geographic information systems within New York, building on the work of the Temporary GIS Council. Further information should be available in late 1996 that should clarify the issues associated with arranging for data sharing among governments.

MANAGEMENT ISSUES IN GIS DEVELOPMENT

Expected Benefits From The GIS

Local government need for, and use of, a GIS falls into several categories: maintaining public records, responding to public inquiries for information, conducting studies and making recommendations to elected officials (decision-makers), and managing public facilities and services (utilities, garbage removal, transportation, etc.). The GIS tasks that meet these uses are:

  - Providing regular maps
  - Conducting spatial queries and displaying the results
  - Conducting complex spatial analyses

Many of these tasks are already done by local government, although by manual means. The GIS is able to perform these tasks much more efficiently. Some of the analytical tasks cannot be performed without a computer due to their size and complexity. In these cases, the GIS improves local government effectiveness by providing better information to planners and decision-makers.

Benefits from using a GIS fall into two categories: efficiency and effectiveness. Existing manual tasks done more efficiently by the GIS result in a substantial savings of staff time. In the local government context, the largest savings come from answering citizen inquiries of many types. Depending on the size of the government, savings from using the query function of a GIS can range from 2 person-years for a smaller town, to 5-8 person-years for a large town, to 10 or more person-years for a large county. Estimates of potential time savings can be derived by measuring the time to respond to a query manually and by GIS and multiplying the difference by the number of expected queries. This information is usually gathered during the Needs Assessment.

Effectiveness benefits are more difficult to estimate. The GIS may be used to accomplish several tasks that were not previously done due to their size and complexity (e.g., flow analysis in water and sewer systems, traffic analysis, etc.). As these are essentially new tasks, a comparison between manual and GIS methods is not possible. While not measurable, the benefits from these applications can be substantial. Generally categorized as better planning and better or more effective decision-making, these applications support more effective investment of government resources in physical infrastructure, where relatively small performance improvements can translate into large dollar savings. GIS also provides an effective way to communicate problems and solutions to the general public and other interested parties.

Resources Required To Develop A GIS

Developing a GIS involves investment in five areas: computer hardware, computer software, geographic data, procedures and trained staff. The acquisition of the computer hardware and software is often incorrectly viewed as the most expensive activity in a GIS program. Research, some of it conducted at the National Center for Geographic Information and Analysis at SUNY-Buffalo, has demonstrated that developing the geographic database (which includes some of the procedure and staff costs) can account for 60% to 80% of the GIS development costs. Continuing costs for operation and maintenance are also dominated by the data costs. Coordination of GIS programs, particularly among several local government agencies, can minimize the cost of database construction and maintenance, and can provide for the greatest use of the database, which gives maximum benefits from the investment.

Staffing Requirements For A GIS

Staffing for a GIS is a critical issue. In general, it is not easy to expand local government staff positions directly to fill the GIS need. There are three areas where expertise is needed:

  - Management of the GIS project (GIS project manager)
  - GIS database skills (usually called a database administrator)
  - Application development for database and users (a GIS software analyst)

Initial creation of the GIS database (digitizing) will require an appropriately sized clerical staff, dependent on the amount of data to be converted. Alternatives to staff expansion are consultants and data conversion firms. GIS database conversion is a front-end staff need that can easily be contracted out (good quality specifications need to be written for this task). If at all possible, the three functions of GIS manager, GIS software analyst and GIS database administrator should be fulfilled by staff personnel, either by hiring or by retraining existing professionals. When necessary, during the start-up phases of GIS development, the GIS analyst and database administrator functions can be done under consultancy arrangements, PROVIDED THAT A FULL-TIME GIS MANAGER IS AVAILABLE ON STAFF.

A second need is for training of users in general computing, database principles, and GIS use. These topics are covered in training courses offered by most GIS vendors; once the GIS software has been selected, its vendor is the best source for user training.

Management Decision Points in the GIS Development Program

The "decision" to develop a GIS is made incrementally. The information needed to determine the feasibility and desirability of developing a GIS is not available until several of the planning steps have been completed. The key decision points are:

  - Decision to investigate GIS for the organization - the initial decision to begin the process. This is an initial feasibility decision and is based on the likelihood that a GIS will be useful and effective. It is fairly important to identify the major participants at this point - both the departments within agencies and the group of agencies, particularly the key agencies that account for a majority of the uses and that will contribute most of the data.

  - Decision to proceed with detailed planning and design of the database - at this time, the applications, the data required, and the sources of the data have been identified. Applications can be prioritized and scheduled and the benefits stream determined. Also, the applications to be tested during the pilot study and the specific questions to be answered by the pilot study will have been determined. A preliminary decision will need to be made as to which GIS software will be used to conduct the pilot study.

  - Decision to acquire the GIS hardware and software - this decision follows the preparation of the detailed database plan, the pilot study and, if conducted, the benchmark test. This is the first point in the development process where the costs of the GIS can reasonably be estimated, the schedule for data conversion developed, and the target dates for users to begin use determined.

GEOGRAPHIC INFORMATION SYSTEMS: THE DEVELOPMENT CYCLE

Developing a GIS is more than simply buying the appropriate GIS hardware and software. The single most demanding part of the GIS development process is building the database. This task takes the longest time, costs the most money, and requires the most effort in terms of planning and management. Therefore the GIS development cycle presented here emphasizes database planning.

Most local governments will acquire the GIS hardware and software from a GIS vendor. Choosing the right GIS for a particular local government involves matching the GIS needs to the functionality of the commercial GIS. For many agencies, especially smaller local governments, choosing a GIS will require help from larger, more experienced agencies, knowledgeable university persons and qualified consultants. By completing selected tasks outlined in these guidelines, local governments can prepare themselves to interact effectively with, and use expertise from, these other groups.

The GIS development cycle starts with the needs assessment, where the GIS functions and the geographic data needed are identified. This information is obtained through interviewing potential GIS users. Subsequently, surveys of available hardware, software and data are conducted and, based on the information obtained, detailed GIS development plans are formulated.

It is important to involve potential users in all stages of GIS development. They benefit from this involvement in several ways:

Describing their needs to the GIS analysts
Learning what the GIS will be capable of accomplishing for them
Understanding the nature of the GIS development cycle - the time involved and the costs.

Potential users need to understand that there may be significant time lags between the first steps of Needs Assessment and the time when the GIS can actually be used. Mostly, this is due to the size of the database building task, which can take up to several years in a large jurisdiction.

In addition to understanding that database development takes substantial time, users and managers need to appreciate that GIS is a new technology and its adoption often involves some uncertainty that can cause time delays, ongoing restructuring of the development program, and the need to resolve unforeseen problems. This set of guideline documents describes the GIS development process in a way that will minimize problems, time delays, cost overruns, etc.; however, the occurrence of these situations cannot be completely avoided. The GIS project team and management simply have to be aware that some unforeseen events will happen. GIS development must be viewed as a process rather than a distinct project.

Estimating and planning for the cost of the GIS is a somewhat difficult task. First, it is necessary to recognize that the GIS database will likely be the single most costly item - if a local government develops all of the data itself from maps, etc., this cost can be as much as 70-80% of the total system cost. Thus, acquiring digital data from other GIS systems, government sources or the private sector can be very cost effective. Participating in, or organizing, a regional data sharing cooperative or district can also lead to reduced data costs. When planning for the GIS database, long-term data maintenance and retention costs must be estimated as well as the initial start-up costs. Cooperation between agencies with similar data needs may provide the most effective way to achieve long-term data maintenance, retention, and archiving.
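As a rough illustration of the 70-80% figure above, the following Python sketch back-calculates the total system cost and the database cost from a known non-database budget. The function name and the dollar amount are hypothetical, chosen only for illustration; they are not part of these guidelines, and real estimates depend on the detailed database plan.

```python
def estimate_total_cost(non_database_cost, database_share):
    """Return (total_cost, database_cost).

    non_database_cost: hardware, software, staff and other costs (assumed known).
    database_share: fraction of the total system cost taken by the database,
                    e.g. 0.70 to 0.80 when all data is developed in-house.
    """
    if not 0 <= database_share < 1:
        raise ValueError("database_share must be in the range [0, 1)")
    # If the database is `database_share` of the total, everything else
    # is the remaining (1 - database_share) of the total.
    total = non_database_cost / (1 - database_share)
    return total, total - non_database_cost


# Hypothetical example: 100,000 spent on everything except the database.
for share in (0.70, 0.80):
    total, database = estimate_total_cost(100_000, share)
    print(f"database share {share:.0%}: total {total:,.0f}, database {database:,.0f}")
```

The arithmetic shows why acquiring existing digital data matters: at an 80% share, every unit spent on hardware, software and staff implies roughly four units spent on the database.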

TASKS FOR GIS DEVELOPMENT AND USE

The GIS development cycle is a set of eleven steps starting with the needs assessment and ending with on-going use and maintenance of the GIS system. These steps are presented here as a logical progression with each step being completed prior to the initiation of the next step. While this view is logical, it is not the way the world always works. Some of the activities in the process may happen concurrently, may be approached in an iterative manner, or may need to be restructured depending on the size and character of the local government conducting the study and the resources available to plan for the GIS. The GIS development cycle is based on the philosophy that one first decides what the GIS should do and then, as a second activity, decides on how the GIS will accomplish each task. Under this philosophy, the needs are described first, available resources are inventoried second (data, hardware, software, staff, financial resources, etc.), preliminary designs are created and tested as a third major set of activities, and lastly the GIS hardware and software are acquired and the database is built.

[Figure 1: flow diagram of the GIS development process. Boxes: Needs Assessment; Conceptual Design; Available Data Survey; H/W & S/W Survey; Database Planning and Design; Database Construction; Pilot/Benchmark; Acquisition of GIS Hardware and Software; GIS System Integration; Application Development; GIS Use and Database Maintenance]

Figure 1 - GIS Development Process

Figure 1 shows the GIS development cycle, which is described in terms of 11 major activities. Prior to initiating these studies, the responsible staff in local governments should attend introductory GIS seminars and workshops, GIS conferences, and meetings of specific GIS users' groups, to obtain a broad overview of what GIS is and how others are using these systems. The 11 steps of the GIS development cycle are:

1. Needs Assessment
2. Conceptual Design of the GIS
3. Survey of Available Data
4. Survey of GIS Hardware and Software
5. Detailed Database Planning and Design
6. Database Construction
7. Pilot Study/Benchmark Test
8. Acquisition of GIS Hardware and Software
9. GIS System Integration
10. GIS Application Development
11. GIS Use and Maintenance
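The eleven steps listed above can be kept as a simple project checklist. The Python sketch below is one hypothetical way to do this; the names GIS_DEVELOPMENT_STEPS and remaining_steps are assumptions introduced here, not part of the guide. It reports which steps are still open, in the guide's order, without forcing the steps to be done strictly in sequence.

```python
# The eleven steps, in the order given in the guide.
GIS_DEVELOPMENT_STEPS = [
    "Needs Assessment",
    "Conceptual Design of the GIS",
    "Survey of Available Data",
    "Survey of GIS Hardware and Software",
    "Detailed Database Planning and Design",
    "Database Construction",
    "Pilot Study/Benchmark Test",
    "Acquisition of GIS Hardware and Software",
    "GIS System Integration",
    "GIS Application Development",
    "GIS Use and Maintenance",
]


def remaining_steps(completed):
    """Return the steps not yet completed, in the order listed in the guide."""
    unknown = set(completed) - set(GIS_DEVELOPMENT_STEPS)
    if unknown:
        raise ValueError(f"Unknown step(s): {sorted(unknown)}")
    return [step for step in GIS_DEVELOPMENT_STEPS if step not in completed]


# Hypothetical project state: two steps done, possibly out of order.
done = {"Needs Assessment", "Survey of Available Data"}
for step in remaining_steps(done):
    print("still to do:", step)
```

Tracking completion as a set rather than a position in the list reflects the point that activities may happen concurrently or iteratively.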

These tasks are one way of dividing up the entire set of activities that must be accomplished to build a successful GIS. While there are other ways of expressing and organizing these activities, this particular structure has been chosen because it emphasizes data development - data definition, data modeling, data documentation, data capture and storage, and data maintenance and retention.

The important point to be made here is not the order or structure of the tasks, but rather that, one way or another, all of these tasks must be completed to have a successful GIS. In some situations, different methods may be more appropriate than those presented in these guides, or a different level of detail may fit the particular situation of a unit of local government. No matter how simple or complex a given GIS environment is, all of the above tasks should be completed at an appropriate level of detail. In the specific guides of this set, examples of different levels of detail will be provided.

The starting point is the needs assessment. It is assumed that the local government has decided that a GIS may be justified and it is reasonable to expend the resources to further study the problem. A final assessment of the costs and benefits will not be made until several tasks have been completed and the nature and size of the resulting GIS can be estimated. In the process presented here, this final feasibility assessment is made as part of the detailed database planning and design activity. Each of the major portions of the development cycle identified and briefly described below is further described in a subsequent guideline document.

Needs Assessment

The GIS needs assessment is designed to produce two critical pieces of information:

The list of GIS functions that will be needed
A master list of geographic data.

These two information sets are extracted from a set of GIS application descriptions, a list of important data, and a description of management processes. Standard forms are used to document the results of user interviews. The information gained in the needs assessment activity goes directly into the Conceptual GIS Design activity.

Conceptual Design of the GIS System

The conceptual design of the GIS system is primarily an exercise in database design. It includes formal modeling (preparation of a data model) of the intended GIS database and the initial stages of the database planning activity. Database planning is the single most important activity in GIS development. It begins with the identification of the needed data and goes on to cover several other activities collectively termed the data life cycle - identification of data in the needs assessment, inclusion of the data in the data model, creation of the metadata, collection and entry of the data into the database, updating and maintenance, and, finally, retention according to the appropriate record retention schedule (Figure 2). A complete data plan facilitates all phases of data collection, maintenance and retention and, because everything is considered in advance, data issues do not become major problems that must be addressed after the fact with considerable difficulty and aggravation. The product of the conceptual design activity is a data model which rigorously defines the GIS database and supports the detailed database planning activity.
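The two needs assessment outputs described above, a list of GIS functions and a master list of geographic data, can be thought of as unions over the individual application descriptions. The Python sketch below illustrates this under stated assumptions: the dictionary layout, the function name summarize_needs, and the sample applications are all hypothetical and do not reproduce the standard forms used in these guidelines.

```python
def summarize_needs(application_descriptions):
    """Collect the GIS functions and data items named across all applications.

    Each application description is assumed to be a dict with the keys
    "name", "functions" and "data" (a hypothetical layout).
    Returns (function_list, master_data_list), each sorted and de-duplicated.
    """
    functions = set()
    data = set()
    for app in application_descriptions:
        functions.update(app["functions"])
        data.update(app["data"])
    return sorted(functions), sorted(data)


# Hypothetical application descriptions gathered from user interviews.
applications = [
    {"name": "Zoning query",
     "functions": ["map display", "query"],
     "data": ["parcels", "zoning districts"]},
    {"name": "Hydrant maintenance",
     "functions": ["map display", "buffer"],
     "data": ["hydrants", "parcels"]},
]

function_list, master_data_list = summarize_needs(applications)
print(function_list)
print(master_data_list)
```

Items needed by several applications, such as parcels in this example, appear once in the master data list.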

[Figure 2: flow diagram of the GIS database life cycle. Boxes: Data Objects Identified During Needs Assessment; Source Documents: Maps, Images, Air Photos, etc.; Preparation of Data Model; Match Needed Data to Available Data and Sources; Survey and Evaluation of Available Data; Prepare Detailed Database Plan; Create Initial Metadata; Map and Tabular Data Conversion; Add Record Retention Schedules to Metadata; Database QA/QC Editing; GIS Database; Continuing GIS Database Maintenance; Archives; Database Backups]

Figure 2 - Life Cycle of a GIS Database
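The data life cycle described in the conceptual design discussion can be sketched as an ordered set of stages. The Python example below is a minimal illustration; the class and member names are assumptions introduced here, and the six stages simply follow the wording of the text rather than every box in Figure 2.

```python
from enum import IntEnum


class DataLifeCycleStage(IntEnum):
    """Stages of the data life cycle, in the order given in the text."""
    IDENTIFIED_IN_NEEDS_ASSESSMENT = 1
    INCLUDED_IN_DATA_MODEL = 2
    METADATA_CREATED = 3
    COLLECTED_AND_ENTERED = 4
    UPDATED_AND_MAINTAINED = 5
    RETAINED_PER_SCHEDULE = 6


def next_stage(stage):
    """Return the stage that follows `stage`, or None after retention."""
    if stage is DataLifeCycleStage.RETAINED_PER_SCHEDULE:
        return None
    return DataLifeCycleStage(stage + 1)


# Walk a hypothetical data item through the whole life cycle.
stage = DataLifeCycleStage.IDENTIFIED_IN_NEEDS_ASSESSMENT
while stage is not None:
    print(stage.name)
    stage = next_stage(stage)
```

In practice the maintenance stage repeats for as long as the data is in use; the linear walk here only shows the order in which the stages are first reached.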

The conceptual design of the GIS also includes identification of the basic GIS architecture (type of hardware and GIS software), estimates of usage (derived from the Needs Assessment), and scoping the size of the GIS system. All of this is done with reference to the existing data processing environments (legacy systems) that must interface with the GIS. This guideline also includes a section on metadata and data standards.

Survey Of Available Data

A survey of available data can commence once needed data have been identified in the Needs Assessment. This task will inventory and document mapped, tabular and digital data within the local government as well as data available from other sources, such as federal, state, or other local governments and private sector organizations. The entries in this inventory may include other GIS systems within the local area from which some of the needed data may be obtained. If there exists an organized data sharing cooperative or other mechanism for government data sharing, it should be investigated at this time. There also exists the possibility that one or more of the commercial GIS database developers may be able to supply some of the needed data and should therefore be investigated. The documentation prepared at this point will be sufficient to evaluate each potential data source for use in the GIS. Information collected at this point will also form part of the metadata for the resulting GIS database.

Survey Of Available GIS Hardware And Software

Almost all local government GIS programs will rely on commercially available GIS software. As a result, a survey of the available GIS systems needs to be conducted. During this activity, the GIS functionality of each commercial GIS system can be documented for later evaluation.
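The inventory produced by the survey of available data can be matched against the needed data from the Needs Assessment. The Python sketch below shows one hypothetical way to record inventory entries and perform that match; the DataSource fields, the function name, and the sample providers are illustrative assumptions, not a prescribed inventory format.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """One entry in the available-data inventory (hypothetical fields)."""
    data_name: str   # the data item this source could supply, e.g. "parcels"
    provider: str    # the agency or firm holding the data
    form: str        # "map", "tabular" or "digital"


def match_needed_data(needed, inventory):
    """Map each needed data item to the inventory entries that could supply it.

    Items with an empty list have no known source and would have to be
    created or located elsewhere.
    """
    matches = {name: [] for name in needed}
    for source in inventory:
        if source.data_name in matches:
            matches[source.data_name].append(source)
    return matches


# Hypothetical inventory and needs.
inventory = [
    DataSource("parcels", "County Assessor", "map"),
    DataSource("roads", "State transportation agency", "digital"),
]
for name, sources in match_needed_data(["parcels", "hydrants"], inventory).items():
    providers = [s.provider for s in sources] or ["no source found"]
    print(name, "->", ", ".join(providers))
```

The same records could later seed the metadata for the GIS database, since they already describe where each data item came from.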
Detailed Database Design And Planning

The detailed database planning and design task includes the following activities: developing a logical or physical database design based on the data model prepared earlier, evaluating the potential data sources, estimating the quantities of geographic data, estimating the cost of building the GIS database and preparing the data conversion plan. Concurrent with the detailed planning for the database, pilot studies and/or benchmark testing that are desired can be executed. Information gained from these studies and tests will be needed to estimate the size of the equipment (disk space, main memory, etc.) and to determine how much application development will be necessary. Subsequently, plans for staffing, staff training, equipment acquisition and installation, and user training must be completed. After the preparation of all these plans, the entire cost of the GIS will be known and the final feasibility assessment can be made.

Pilot Study And Benchmark Tests

Pilot studies and benchmark tests are intended to demonstrate the functionality of the GIS software - simply put, what the commercial GIS from the vendor can do. These tests are useful to demonstrate to potential users and management what the GIS will do for them. Also, performance data of the GIS system can be determined.
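As an illustration of how pilot study results might feed the equipment sizing mentioned above, the Python sketch below extrapolates disk space from a sample of converted map sheets. It assumes that storage grows linearly with the number of sheets, which is a simplification; the function name, the overhead factor and all figures are hypothetical.

```python
def estimate_disk_space_mb(pilot_sheets, pilot_size_mb, total_sheets, overhead=0.0):
    """Extrapolate database size (in MB) from a pilot conversion.

    pilot_sheets:  number of map sheets converted in the pilot study
    pilot_size_mb: disk space those sheets occupied, in megabytes
    total_sheets:  number of sheets in the full conversion
    overhead:      extra fraction for indexes, backups, working space, etc.
                   (an assumed allowance, not a figure from the guide)
    """
    if pilot_sheets <= 0:
        raise ValueError("pilot_sheets must be positive")
    per_sheet_mb = pilot_size_mb / pilot_sheets
    return per_sheet_mb * total_sheets * (1 + overhead)


# Hypothetical pilot: 10 sheets took 25 MB; the full project has 400 sheets.
print(estimate_disk_space_mb(10, 25, 400, overhead=0.5), "MB")
```

A real estimate would also account for differences in feature density between sheets and for attribute tables, which a small pilot sample may not represent well.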

GIS Database Construction

Database construction (sometimes referred to as "database conversion") is the process of building the digital database from the source data - maps and tabular files. This process would have been planned during the previous activity and the main emphasis here is management of the activity and quality assurance/quality control of the converted data. The conversion process is often "contracted-out" and involves large quantities of source maps and documents. Close and effective management is the critical factor in successful data conversion.

GIS System Integration

Unlike many other computer applications, a GIS is not a "plug and play" type system. The several components of a GIS must be acquired according to well documented specifications. The database must be created in a careful and organized manner. Once all the individual components have been acquired, they must be integrated and tested. Users must be introduced to the system, trained as necessary, and provided with adequate assistance to begin use of the GIS. Parts of the GIS which may appear to work fine individually may not work properly when put together. The GIS system staff must resolve all the problems before users can access the GIS.

GIS Application Development

"Application" is a general term covering all things that "go on" in a GIS. First, there are "database applications." These are all the functions needed to create, edit, build, and maintain the database, and are usually carried out by the GIS systems staff. Some users may have responsibility for updating selected parts of the GIS database; however, the entire database should be under the control of a "database administrator." Other applications are termed "user applications." Contemporary GISs provide many simple applications as part of the initial software package (e.g., map display, query, etc.). More complex applications, or ones unique to a particular user, must be developed using a macro-programming language. Most GISs have a macro-programming language for this purpose (e.g., Arc Macro Language (AML) in ARC/INFO and Avenue in ArcView). The applications needing development by the GIS systems staff will have been described during the Needs Assessment on the GIS Application forms.

GIS System Use And Maintenance

After having described the rather large task of creating a GIS, we can now say that use and maintenance of the GIS and its database will likely require as much attention as was needed to initially build it. Most GIS databases are very dynamic, changing almost daily, and users will immediately think of additional applications that they would like to have developed. Formal procedures for all the maintenance and updating activities need to be created and followed by the GIS system staff and by all users to ensure continued successful operation of the GIS.


SUMMARY

This document has presented an overview of the GIS development process, with an emphasis on data and database issues. All of the tasks and issues identified in this document will be described in detail in the remaining eleven guidelines of this series. The procedures are presented as "guides," and not as a "cookbook recipe" which must be rigorously followed. Each of the major tasks in the GIS development process and the information generated within the task should be addressed in any specific project. The methods and forms used in this series can be used, or alternatives can be developed, as appropriate to the situation. The one matter to always keep in mind is that the GIS plan is a document to communicate user needs to a GIS analyst. The components of the plan must contain:

Descriptions of applications that are understandable to the user
A logical translation of user requirements to system specifications
Detailed specification suitable for system development

Following the recommendations in these guidelines cannot, unfortunately, guarantee success. Many factors outside the control of the GIS development team will affect the ultimate success of the GIS - success being defined as use of the GIS by satisfied users. However, the authors of these guidelines believe that attempting to develop a GIS without following these, or similar, procedures substantially raises the probability of an unsuccessful GIS project - either one that is not useful or one that substantially exceeds both cost and development time estimates. Finally, although presented here as an independent activity, GIS development must recognize and interface with other computer systems in local government, such as E911, police and fire dispatch, facilities management systems, etc. The GIS must not be viewed as independent of the other systems, but integrated with them, no matter how difficult, to form a true corporate database for local government.

REFERENCES

1) Fischer, Manfred M. and Nijkamp, Peter, "Geographic Information System, Spatial Modeling, and Policy Evaluation," Berlin & New York: Springer-Verlag, 1993, pg 42.
2) Abler, R.F., 1987, "The National Science Foundation National Center for Geographic Information and Analysis," International Journal of Geographical Information Systems, 1, no. 4, 303-326.

SUGGESTED READINGS

1. Antenucci, John C., et al., Geographic Information Systems: A Guide to the Technology, New York: Van Nostrand Reinhold, 1991 (ISBN 0-442-00756-6)
2. Aronoff, Stan, Geographic Information Systems: A Management Perspective, Ottawa: WDL Publications, 1989 (ISBN 0-921804-00-8)
3. Burrough, P.A., Principles of Geographical Information Systems for Land Resources Assessment, Oxford: Oxford University Press, 1986 (ISBN 0-19-854563-0; ISBN 0-19-854592-4 paperback)
4. Huxhold, William E., An Introduction to Urban Geographic Information Systems, Oxford: Oxford University Press, 1991 (ISBN 0-19-506534-4)
5. Korte, George B., A Practitioner's Guide: The GIS Book, Santa Fe: OnWord Press, 1992 (ISBN 0-934605-73-4)
6. Laurini, Robert and Derek Thompson, Fundamentals of Spatial Information Systems, London: Academic Press Limited (ISBN 0-12-438380-7)
7. Montgomery, Glenn E., and Harold C. Schuck, GIS Data Conversion Handbook, Fort Collins: GIS World, Inc. (ISBN 0-9625063-4-6)

GIS INFORMATION SOURCES

Scholarly journals

There are a number of scholarly journals that deal with GIS. These are published on an on-going basis.

Cartographica - Contact: Canadian Cartographic Association
Cartography and Geographic Information Systems - Contact: American Cartographic Association
International Journal of Geographical Information Systems - Contact: Keith Clark at CUNY Hunter College, New York City
URISA Journal - Contact: Urban and Regional Information Systems Association


Trade magazines

There are a number of trade magazines that are focused on GIS. They are:

GIS World
GIS World, Inc., 155 E. Boardwalk Drive, Suite 250, Fort Collins, CO 80525. Phone: 303-223-4848 Fax: 303-223-5700 Internet: info@gisworld.com

Business Geographics
GIS World, Inc., 155 E. Boardwalk Drive, Suite 250, Fort Collins, CO 80525. Phone: 303-223-4848 Fax: 303-223-5700 Internet: info@gisworld.com

Geo Info Systems
Advanstar Communications, 859 Williamette St., Eugene, OR 97401-6806. Phone: 541-343-1200 Fax: 541-344-3514 Internet: geoinfomag@aol.com WWW site: http://www.advanstar.com/geo/gis

GPS World
Advanstar Communications, 859 Williamette St., Eugene, OR 97401-6806. Phone: 541-343-1200 Fax: 541-344-3514 Internet: geoinfomag@aol.com WWW site: http://www.advanstar.com/geo/gis

Conference Proceedings

American Congress on Surveying and Mapping (ACSM)
5410 Grosvenor Lane, Bethesda, MD 20814. Phone: 301-493-0200 Fax: 301-493-8245

American Society for Photogrammetry and Remote Sensing (ASPRS) & (GIS/LIS)
5410 Grosvenor Lane, Bethesda, MD 20814. Phone: 301-493-0290 Fax: 301-493-0208

Association of American Geographers (AAG)
1710 Sixteenth St. N.W., Washington, D.C. 20009-3198. Phone: 202-234-1450 Fax: 202-234-2744

Automated Mapping/Facility Management International (AM/FM International)
14456 East Evans Ave., Aurora, CO 80014. Phone: 303-337-0513 Fax: 303-337-1001

Canadian Association of Geographers (CAG)
Burnside Hall, McGill University, Rue Sherbrooke St. W, Montreal, Quebec H3A 2K6. Phone: 514-398-4946 Fax: 514-398-7437

Canadian Institute of Geomatics (CIG)
206-1750 rue Courtwood Crescent, Ottawa, Ontario K2C 2B5. Phone: 613-224-9851 Fax: 613-224-9577

Urban And Regional Information Systems Association (URISA)
900 Second St. N.E., Suite 304, Washington, D.C. 20002. Phone: 202-289-1685 Fax: 202-842-1850

Glossary
Accuracy - Degree of conformity with a standard, or the degree of correctness attained in a measurement. Accuracy relates to the quality of a result. If accuracy is relative, the position of a point is defined in relation to another point. It is less expensive to build a GIS in the context of relative accuracy. If accuracy is absolute, the position of a point is defined by a coordinate system. Building a GIS in the context of absolute accuracy requires use of the global positioning system.
Accuracy Requirement - statement of how precise the desired results must be to support a particular application.
Adjoining Sheets - Maps that are adjacent to one another at the corners and on one or more sides.


Aerial - Relating to the air or atmosphere, being applicable in a descriptive sense to anything in space above the ground and within the atmosphere.
Aerial Photography - The method of taking photographs from an aerial platform (aircraft). (1.) Vertical photography, sometimes called orthophotography (see entry), is used for photogrammetric mapping and requires a high degree of accuracy. (2.) Oblique photography is used for general information, sometimes to verify certain attributes, but does not provide accurate measurements for photogrammetric mapping.
Aerial Survey - A survey utilizing aerial photography or remote sensing technology using other bands of the electromagnetic spectrum such as infrared, gamma or ultraviolet.
Algorithm - A set of instructions; ordered mathematical steps for solving a problem, like the instructions in a computer program.
Alignment - Relates to survey data transposed to maps. The correct position of a line or feature in relation to other lines or features. Also the correct placement of points along a straight line.
Alphanumeric - A combination of alphabetic letters, numbers and/or special characters. A mailing address is an alphanumeric listing.
Analog Data - Data represented in a continuous form, not readable by a computer.
Area - level of spatial measurement referring to a two-dimensional defined space; for example, a polygon on the earth as projected onto a horizontal plane.
Attribute - 1. A numeric, text, or image data field in a relational data base table that describes a spatial feature such as a point, line, node, area or cell. 2. A characteristic of a geographic feature described by numbers or characters, typically stored in tabular format, and linked to the feature by an identifier. For example, attributes of a well (represented by a point) might include depth, pump type, location, and gallons per minute.
AM/FM - Automated mapping/facilities management. A GIS designed primarily for engineering and utility purposes, AM/FM is a system that manages databases related to spatially distributed facilities.
Base Data - set of information that provides a baseline orientation for another layer of primary focus, e.g., roads, streams, and other data typically found on USGS topographic and/or planimetric maps.
Base Line - A surveyed line established with more than usual care upon which surveys are based.
Base Map - A map showing planimetric, topographic, geological, political, and/or cadastral information that may appear in many different types of maps. The base map information is drawn with other types of changing thematic information. Base map information may be as simple as major political boundaries, major hydrographic data, or major roads. The changing thematic information may be bus routes, population distribution, or caribou migration routes.
Base Station - a GPS receiver on a known location that may broadcast and/or collect correction information for GPS receivers on unknown locations.
Bench Mark - A relatively permanent point whose elevation above or below an adopted datum is known.
Beta Test - Hardware or software testing performed by users in a normal operating environment; follows alpha testing, which is generally done in the developer's facility.
Bezier - (computer graphics) A curve generated by a mathematical formula in CAD (see entry) programs that maintains continuity with other Bezier curves.
Binary - The fundamental principle behind digital computers. Binary means two; computer input is converted into binary numbers made up of 0 and 1 (see bit).
Bit - (computers) a binary digit with a value of either 1 or 0.
Block (Tax) - A group of municipal tax lots that can be isolated from other parcels by a boundary, usually a roadway, waterway or properly labeled lot line.
Boundary Line - A line along which two areas meet. In specific cases, the word "boundary" is sometimes omitted, as in "state line"; sometimes the word "line" is omitted, as in "international boundary", "county boundary", etc. The term "boundary line" is usually applied to boundaries between political territories, as "state boundary line", between two states. A boundary line between privately owned parcels of land is termed a property line by preference, or if a line of the United States public land surveys, is given the particular designation of that survey system, as section line, township line, etc.
BPS - Bits per second, the speed of data transfer.
Buffer - A zone of a given distance around a physical entity such as a point, line, or polygon.
CAD/CADD - (Computers) Computer-Aided Design/Computer-Aided Design and Drafting. Any system for computer-aided rather than manual drafting and design. Displays data spatially on a predefined coordinate grid system, allowing data from different sources to be connected and referenced by location. Speeds the conventional map development process in several ways: 1. Shapes, floor plans, etc. can be replicated from an electronic library rather than requiring every component to be drawn from scratch. 2. Plotters and terminal screens are faster and more accurate than manual drafting. 3. Portions of drawings can be edited, enlarged, etc. quickly. 4. Related information can be stored in files and added to drawings in layers.
CAD - (Communication) Computer-Aided Dispatching. Used with emergency vehicles, CAD can be very sophisticated. Online maps of a city can display emergency vehicles as moving dots on the map, their status (enroute to an emergency, awaiting a call, call completed, returning to base, etc.) indicated by different colors. (The acronym for computer-aided dispatch is sometimes confused with computer-aided design.)
Cadastre - a record of interests in land, encompassing both the nature and extent of interests. Generally, this means maps and other descriptions of land parcels as well as the identification of who owns certain legal rights to the land (such as ownership, liens, easements, mortgages, and other legal interests). Cadastral information often includes other descriptive information about land parcels.
Cadastral - Relating to the value, extent and ownership of land for tax purposes. Cadastral maps describe and record ownership. Also called property map.
Cadastral Survey - A survey relating to land boundaries and subdivisions, made to create units suitable for transfer or to define the limitations to title. Derived from "cadastre", and meaning register of the real property of a political subdivision with details of area, ownership, and value. The term cadastral survey is now used to designate the surveys for the identification and resurveys for the restoration of property lines; the term can also be applied properly to corresponding surveys outside the public lands, although such surveys are usually termed land surveys through preference. See also boundary, survey.
Cartographic (Planimetric) Features - Objects like trees or buildings shown on a map or chart.
Cartography - The technology of mapping or charting features of Earth's topography.
Centroid - The "center of gravity" or mathematically exact center of an irregular shaped polygon; often given as an x, y coordinate of a parcel of land.

Clearinghouse - a physical repository structure used to accumulate and disseminate digital data and information concerning that data. In the GIS context a clearinghouse can contain all or a portion of spatial, metadata and informational data.
Client - A software application that works on your behalf to extract some service from a server somewhere on the network. As a basic idea, think of your telephone as a client and the telephone company as a server.
COGO - Acronym for Coordinate Geometry achieved via a computer program.
Computer-aided Design or Drafting (CAD) - A group of computer software packages for creating graphic documents.
Control Point - A point in a network, identifiable in data or a photograph, with a given horizontal position and a known surface elevation. It is correlated with data in a data set or photograph.
Contour - An imaginary outline of points on the ground which are at the same altitude relative to mean sea level.
Contour Line - A line on a map or chart that connects points which are at the same elevation.


Contour Map - A map that defines topography (hypsography) by interpreting contour lines as relief.
Control - Also called ground control. A system of survey marks or objects called control points that have established positions and/or elevations verified by ground survey. The marks, or control points, serve as a reference correlating other data such as contour lines (see entry) determined from aerial surveys.
Conversion - 1. The translation of data from one format to another (e.g., TIGER to DXF; a map to digital files). 2. Data conversion when transferring data from one system to another (e.g., SUN to IBM).
Coordinate - The position of a point in space with respect to a Cartesian coordinate system (x, y and/or z values). In GIS, a coordinate often represents locations on the earth's surface relative to other locations.
Coordinate System - The system used to measure horizontal and vertical distances on a planimetric map. In a GIS, it is the system whose units and characteristics are defined by a map projection. A common coordinate system is used to spatially register geographic data for the same area. See map projection.
CRT - Cathode Ray Tube. A computer screen or monitor.
CTG - Center for Technology in Government
Data Capture - series of operations required to encode data in a computer-readable digital form (digitizing, scanning, etc.)
Data Dictionary - description of the information contained in a data base, e.g., format, definition, structure, and usage. It typically describes and defines the data elements of the data base and their interrelationships within the larger context of the data base.
Data Element - specific item of information appearing in a set of data, e.g., well site locations.
Data Model - 1. A generalized, user-defined view of the data related to applications. 2. A formal method for arranging data to mimic the behavior of the real world entities they represent. Fully developed data models describe data types, integrity rules for the data types, and operations on the data types. Some data models are triangulated irregular networks, images, and georelational or relational models for tabular data.
Data Quality - refers to the degree of excellence exhibited by the data in relation to the portrayal of the actual phenomena.
Data Sets - a collection of values that all pertain to a single subject.
Data Standardization - the process of achieving agreement on data definitions, representation, and structures to which all data layers and elements in an organization must conform.
Data Structure - organization of data, particularly the reference linkages among data elements.
Database - usually a computerized file or series of files of information, maps, diagrams, listings, location records, abstracts, or references on a particular subject or subjects organized by data sets and governed by a scheme of organization. "Hierarchical" and "relational" define two popular structural schemes in use in a GIS. For example, a GIS database includes data about the spatial location and shape of geographic entities as well as their attributes.
Database Management System (DBMS) - 1. The software for managing and manipulating the whole GIS including the graphic and tabular data. 2. Often used to describe the software for managing (e.g., input, verify, store, retrieve, query, and manipulate) the tabular information. Many GISs use a DBMS made by another software vendor, and the GIS interfaces with that software.
Datum - a mathematical reference framework for geodetic coordinates defined by the latitude and longitude of an initial point, the azimuth of a line from this point, and the parameters of the ellipsoid upon which the initial point is located.
DEC - Department of Environmental Conservation
Differential Correction - the method (usually done through post processing) of using two GPS receivers, one on a known location and one on an unknown location, using information from the one on the known location to correct the position of the unknown location.
Digital Accuracy - refers to the accuracy of digital spatial data capture.

Digital Elevation Model (DEM) - a file with terrain elevations recorded at the intersections of a fine grid and organized by quadrangle to be the digital equivalent of the elevation data on a topographic base map.
Digital Data - a form of representation in which distinct objects, or digits, are used to stand for something in the real world--temperature or time--so that counting and other operations can be performed precisely. Data represented digitally can be manipulated to produce a calculation, a sort, or some other computation. In digital electronic computers, two electrical states correspond to the 1s and 0s of binary numbers, which are manipulated by computer programs.
Digital Exchange Format (DXF) - 1. ASCII text files defined by Autodesk, Inc. (Sausalito, CA) at first for CAD, now showing up in third-party GIS software. 2. An intermediate file format for exchanging data from one software package to another, neither of which has a direct translation for the other but where both can read and convert DXF data files into their format. This often saves time and preserves accuracy of the data by not reautomating the original.
Digital Line Graph (DLG) - 1. In reference to data, the geographic and tabular data files obtained from the USGS for exchange of cartographic and associated tabular data files. Many non-DLG data may be formatted in DLG format. 2. In reference to format, the formal standards developed and published by the USGS for exchange of cartographic and associated tabular data files. Many non-DLG data may be formatted in DLG format.
Digital Map - A machine-readable representation of a geographic phenomenon stored for display or analysis by a digital computer; contrast with analog map.
Digital Orthophoto - A geographically correct digital image with the same accuracy as a vector digital map, but preserving the information content of the original photography.
Digital Orthophoto Quarter-Quad (DOQ) - a 3.75 minute square distortion-free image of the surface of the earth. The imagery has been geographically and photographically rectified to remove all distortion, and meets the requirements of the USGS.
Digital Terrain Model (DTM) - A computer graphics software technique for converting point elevation data into a terrain model displayed as a contour map, sometimes as a three-dimensional "hill and valley" grid view of the ground surface.
Digitize - A means of converting or encoding map data that are represented in analog form into digital information of x and y coordinates.
Digitized Terrain Data - Transposed elevation information from maps or photographs to X-Y-Z digital coordinates for storage on magnetic media.
Digitizer - A device used to capture planar coordinate data, usually as x and y coordinates, from existing analog maps for digital use within a computerized program such as a GIS. Also called a digitizing table.
Digitizing - refers to the process of manually converting an analog image or map or other graphic overlay into numerical format for use by a computer with the use of a digitizing table or tablet and tracing the input data with a cursor (see also scanning).
DIME - Dual Independent Map Encoding. Provides vector data such as streets to census data addresses. Superseded by Topologically Integrated Geographic Encoding and Referencing (see TIGER).
DIME File - A geographic base file produced by the U.S. Census Bureau with Dual Independent Map Encoding. Now being superseded by TIGER files (see below).
DLG - See Digital Line Graph
DOB - Division of the Budget
DOQ - See Digital Orthophoto Quarter-Quad
DOT - Department of Transportation
DTF - Department of Taxation and Finance
Edge Match - An editing procedure to ensure that all features crossing adjacent map sheets have the same edge locations, attribute descriptions, and feature classes.


Federal Information Processing Standards (FIPS) - official source within the federal government for information processing standards. They were developed by the Institute for Computer Sciences and Technology, at the National Institute of Standards and Technology (NIST), formerly the National Bureau of Standards.
Federal Geographic Data Committee (FGDC) - established by the Federal Office of Management and Budget, is responsible for the coordination of development, use, sharing, and dissemination of surveying, mapping, and related spatial data.
Fifth Generation Computer - A computer designed for applications of artificial intelligence (AI). Some elements of spatial data management, especially the CADD output side, are beginning to integrate AI computing.
FOIL - Freedom of Information Law
Format - 1. The pattern in which data are systematically arranged for use on a computer. 2. A file format is the specific design of how information is organized in the file. For example, DLG, DEM, and TIGER are geographic data sets in particular formats that are available for many parts of the United States.
File Transfer Protocol (FTP) - a standard protocol that defines how to transfer files from one computer to another.
Fortran - A high-level programming language and compiler originally designed to express math formulas. Developed in 1954 by IBM, it is still the most widely used language for scientific and engineering programming.
GBF/DIME - See Geographic Base File/Dual Independent Map Encoding
Geocode - The process of identifying a location as one or more x, y coordinates from another location description such as an address. For example, an address for a student can be matched against a TIGER street network to locate the student's home.
Geodetic Monumentation - a permanent structure that marks the location of a point taking into account the earth's curvature.
Geographic - Pertains to the study of the Earth and the locations of living things, humans and their effects.
Geographic Base File/Dual Independent Map Encoding (GBF/DIME) - A data exchange format developed by the US Census Bureau to convey information about block-face/street address ranges related to 1980 census tracts. These files provide a schematic map of a city's streets, address ranges, and geostatistical codes relating to the Census Bureau's tabular statistical data. See also TIGER, created for the 1990 census.
Geographic Database - Efficiently stored and organized spatial data and possibly related descriptive data.
Geographic Information Retrieval and Analysis (GIRAS) - Data files from the US Geological Survey. GIRAS files contain information for areas in the continental United States, including attributes for land use, land cover, political units, hydrologic units, census and county subdivisions, federal land ownership, and state land ownership. These data sets are available to the public in both analog and digital form.
Geographic Information System (GIS) - An organized collection of computer hardware, software, geographic data, and personnel designed to efficiently capture, store, update, manipulate, analyze, and display all forms of geographically referenced information. Certain complex spatial operations are possible with a GIS that would be very difficult, time-consuming, or impractical otherwise.
Geographic Object - A user-defined geographic phenomenon that can be modeled or represented using geographic data sets. Examples include streets, sewer lines, manhole covers, accidents, lot lines, and parcels.

Geographical Resource Analysis Support System (GRASS) - 1. A public-domain raster GIS modeling product of the US Army Corps of Engineers Construction Engineering Research Laboratory. 2. A raster data format that can be used as an exchange format between two GISs.
Georectify - the process of referencing points on an image to real-world coordinates.
Georeference - To establish the relationship between page coordinates on a paper map or manuscript and known real-world coordinates.

Geospatial - a term used to describe a class of data that has a geographic or spatial nature.
Geostationary Satellite - An earth satellite that remains in fixed position in sync with the earth's rotation.
GIS - Geographic information system. A computer system of hardware and software that integrates graphics with databases and allows for display, analysis, and modeling.
Grid-Cell Data - Grid-cell data entry places a uniform grid over a map area, and the area within the cell is labeled with one attribute or characteristic, such as elevation averaged over all points. Grid cells can be layered with differing types of information.
Global Positioning System (GPS) - a system developed by the U.S. Department of Defense based on 24 satellites orbiting the Earth. Inexpensive GPS receivers can accurately determine one's position on the Earth's surface.
Ground Truth - Information collected from a survey area as remote sensing data is being collected from the same area (see control).
Hierarchical - A way of classifying data, starting with the general and going to specific labels.
Hydrography - Topography pertaining to water and drainage features.
Hypsography - 1. The science or art of describing elevations of land surfaces with reference to a datum, usually sea level. 2. That part of topography dealing with relief or elevation of terrain.
Image - A graphic representation or description of an object that is typically produced by an optical or electronic device. Common examples include remotely sensed data such as satellite data, scanned data, and photographs. An image is stored as a raster data set of binary or integer values representing the intensity of reflected light, heat, or another range of values on the electromagnetic spectrum. Remotely sensed images are digital representations of the earth.
Imagery - a two dimensional digital representation of the earth's surface. Examples are a digital aerial photograph, a satellite scene, or an airborne radar scan.
Index - A specialized lookup table or structure within a database, used by an RDBMS or GIS to speed searches for tabular or geographic data.
Infrastructure - The fabric of human improvements to natural settings that permits a community, neighborhood, town, city, metropolis, region, state, etc., to function.
Initial Graphics Exchange Specification (IGES) - An interim standard format for exchanging graphics data among computer systems.

Internet - a system of linked computer networks, worldwide in scope, that facilitates data communication services such as remote login, file transfer, electronic mail, and newsgroups. The Internet is a way of connecting existing computer networks that greatly extends the reach of each participating system.
Internet Protocol (IP) - the most important of the protocols on which the Internet is based. It allows a packet to traverse multiple networks on the way to its final destination.
Interpolate - Applied to logical contouring by determining vertical distances between given spot elevations.
IT - Information Technology
Land Information System (LIS) - the sum of all the elements that systematically make information about land available to users including: the data, products, services, the operating procedures, equipment, software, and people.
Land Information System (LIS) - NJ State 45:8-28(e) - Any computer coded spatial database designed for multipurpose public use developed from or based on property boundaries.
Latitude - The north-south measurement parallel to the equator.
Layer - A logical set of thematic data, usually organized by subject matter.
Layers - refers to the various "overlays" of data each of which normally deals with one thematic topic. These overlays are registered to each other by the common coordinate system of the database.


Longitude - The angular distance, measured in degrees, east or west from the Greenwich meridian, or by the difference in time between two reference meridians on a globe or sphere.
Lot Number - A numerical parcel designation that, when combined with a block number, is unique to a single parcel of land within a given municipality.
Manual Digitizing - Conversion of an analog measurement into a digital form by using a manual device such as a calculator.

Map - A representation of a portion of the earth, usually drawn on a flat surface. (From Latin mappa, a napkin, sheet or cloth upon which maps were drawn.)
Map Projection - A mathematical model for converting locations on the earth's surface from spherical to planar coordinates, allowing flat maps to depict three dimensional features. Some map projections preserve the integrity of shape; others preserve accuracy of area, distance, or direction.
Map Units - The coordinate units in which the geographic data are stored, such as inches, feet, or meters or degrees, minutes and seconds.
Metadata - data describing a GIS database or data set including, but not limited to, a description of the data transfer medium, format, and contents; source lineage data; and any other applicable data processing algorithms or procedures.
NCGIA - National Center for Geographic Information and Analysis
Network Analysis - Addresses relationships between locations on a network. Used to calculate optimal routes and optimal locations for facilities.
NSGIC - National States Geographic Information Council
NSDI - National Spatial Data Infrastructure
OPRHP - Office of Parks, Recreation and Historic Preservation
ORPS - Office of Real Property Services
Orthophoto - A photograph of the earth's surface in which geographic distortion has been removed.
Overlay - A layer of data representing one aspect of related information.
Parcel - Generally refers to a piece of land that can be designated by number.
Photogrammetry - The system of gathering information about physical objects through aerial photography and satellite imagery.

Plane-Coordinate System - A system for determining location in which two groups of straight lines intersect at right angles and have as a point of origin a selected perpendicular intersection.
Planimetric Map - A map which presents the horizontal positions only for the features represented; distinguished from a topographic map by the omission of relief in measurable form. The natural features usually shown on a planimetric map include rivers, lakes and seas; mountains, valleys and plains; and forests, prairies, marshes and deserts. The cultural features include cities, farms, transportation routes and public-utility facilities; and political and private boundary lines. A planimetric map intended for special use may present only those features which are essential to the purpose to be served.
Plat - A scale diagram void of cultural, drainage and relief features, showing only land boundaries and subdivisions together with data essential to its legal description.
Plotter - Equipment that can plot a graphic file using multiple line weights and colors. Types available today are: pen, laser, and electrostatic plotters.
Point Data - level of spatial definition referring to an object that has no dimension, e.g., well or weather station.
Points - Items such as oil wells, utility poles, etc. Specific objects with exact location noted.
Polygon - A vector representation of an enclosed region, described by a sequential list of vertices or mathematical functions.

Positional Accuracy - term used in evaluating the overall reliability of the positions of cartographic features relative to their true position.
Precision - refers to the quality of the operation by which the result is obtained, as distinguished from accuracy.
Protocol - a definition for how computers will perform when talking to each other. Protocol definitions range from how bits are placed on a wire to the format of an electronic mail message. Standard protocols allow computers from different manufacturers to communicate; the computers can use completely different software, providing that the programs running on both ends agree on what the data means.
Quadrangle - A four-sided region, usually bounded by a pair of meridians and a pair of parallels.
Quality Control - process of taking steps to ensure the quality of data or operations is in keeping with standards set for the system.
Raster - A grid-type data format used to interpret gray-scale photographs and satellite imagery. Imagery is stored as dots or pixels, each with a different shade or density.
Raster Data - Machine-readable data that represent values usually stored for maps or images and organized sequentially by rows and columns. Each "cell" must be rectangular but not necessarily square, as with grid data.
RDBMS - See Relational Database Management System.
Rectified - referencing points, lines, and/or features of two dimensional images to real world geographic coordinates, to correct distortion in the image.
Rectify - The process by which an image or grid is converted from image coordinates to real-world coordinates. Rectification typically involves rotation and scaling of grid cells, and thus requires resampling of values.
Registration - the procedure used to bring two maps or data layers into concurrence via known ground location control points or the procedure of bringing a map or data layers into concurrence with the earth's surface.
Relational Database Management System (RDBMS) - A database management system with the ability to access data organized in tabular files that may be related together by a common field (item). An RDBMS has the capability to recombine the data items from different files, thus providing powerful tools for data usage.
Remote Sensing - Recording imagery or data and information from a distance. Photography is a form of remote sensing. Satellites provide a remote sensing platform for developing geology and soils analysis with sensors sensitive to various bands of the electromagnetic spectrum.
Resolution - 1. The accuracy at which the location and shape of map features can be depicted for a given map scale. For example, at a map scale of 1:63,360 (1 inch = 1 mile), it is difficult to represent areas smaller than 1/10 of a mile wide or 1/10 of a mile in length because they are only 1/10-inch wide or long on the map. In a larger scale map, there is less reduction, so feature resolution more closely matches real world features. As map scale decreases, resolution also diminishes because feature boundaries must be smoothed, simplified, or not shown at all. 2. The size of the smallest feature that can be represented in a surface. 3. The number of points in x and y in a grid (e.g., the resolution of a USGS one-degree DEM is 1,201 x 1,201 mesh points).
Rubber-sheet - A procedure to adjust the entities of a geographic data set in a non-uniform manner. From- and to-coordinates are used to define the adjustment.
SARA - State Archives and Records Administration
Scale - the relationship between a distance on a map and the corresponding distance on the earth. Often used in the form 1:24,000, which means that one unit of measurement on the map equals 24,000 of the same units on the earth's surface.


Scanner - A scanner is an optical device that recognizes dark and light dots on a surface and converts this recognition into a digital file. However, scanners generally do not create a map database in a logically correct format, so additional computer-aided manipulation and often manual editing are used to add intelligence required by a specific GIS platform.

Scanning - Also referred to as automated digitizing or scan digitizing. A process by which information originally in hard copy format (paper print, mylar transparencies, microfilm aperture cards) can be rapidly converted to digital raster form (pixels) using optical readers.
Schematic Map - A map prepared by electronically scanning or digitizing in which the lines are not dimensionally or positionally accurate.
SDTS - Spatial Data Transfer Standard
SED - State Education Department
SEMO - State Emergency Management Office
Server - software that allows a computer to offer a service to another computer. Other computers contact the server program by means of matching client software. Also a computer using server software.
Source Material - data of any type required for the production of mapping, charting, and geodesy products including, but not limited to, ground-control aerial and terrestrial photographs, sketches, maps, and charts; topographic, hydrographic, hypsographic, magnetic, geodetic, oceanographic, and meteorological information; intelligence documents; and written reports pertaining to natural and human-made features.
Spatial Data - data pertaining to the location of geographical entities together with their spatial dimensions. Spatial data are classified as point, line, area, or surface.
Spatial Index - A means of accelerating the drawing, spatial selection, and entity identification by generating geographic-based indexes. Usually based on an internal sequential numbering system.
Spatial Model - Analytical procedures applied with a GIS. There are three categories of spatial modeling functions that can be applied to geographic data objects within a GIS: (1) geometric models (such as calculation of Euclidean distance between objects, buffer generation, and area and perimeter calculation); (2) coincidence models (such as a polygon overlay); and (3) adjacency models (pathfinding, redistricting, and allocation). All three model categories support operations on geographic data objects such as points, lines, polygons, TINs, and grids. Functions are organized in a sequence of steps to derive the desired information for analysis.
Stakeholders - Any constituency in the environment that is affected by an organization's decisions and policies.
Standards - In computing, a set of rules or specifications which, taken together, define the architecture of a hardware device, program, or operating system.
State Plane Coordinate System - The plane-rectangular coordinate systems established by the United States Coast and Geodetic Survey (now known as National Ocean Survey), one for each state in the United States, for use in defining positions of geodetic stations in terms of plane-rectangular (X and Y) coordinates. Each state is covered by one or more zones, over each of which is placed a grid imposed upon a conformal map projection. The relationship between the grid and the map projection is established by mathematical analysis. Zones of limited east-west dimension and indefinite north-south extent have the transverse Mercator map projection as the base for the state coordinate system, whereas zones for which the above order of magnitude is reversed have the Lambert conformal conic map projection with two standard parallels. For a zone having a width of 158 statute miles, the greatest departure from exact scale (scale error) is 1 part in 10,000. Only adjusted positions on the North American Datum of 1927 and NAD 1983 may be properly transformed into plane coordinates on a state system. All such geodetic positions which are determined by the National Ocean Survey are transformed into state plane-rectangular coordinates on the proper grid, and are distributed by that bureau with the geodetic positions. State plane coordinates are extensively used in recording land surveys, and in many states such use has received approval by legislative enactment.
SUNY - State University of New York
System - A group of related or interdependent elements that function as a unit.
Tax Map - An accurate map of a municipal territory prepared for the purpose of taxation, showing, among other things, the location and width of streets, roads, avenues and each individual lot of land within the municipality.
Text Data - Information in a GIS system such as property owners' names and lot dimensions.
Thematic Layer - mapping categories, consisting of a single type of data such as population, water quality, or timber stands, intended to be used with base data.
Thematic Map - A map that illustrates one subject or topic either quantitatively or qualitatively.
Theme - A collection of logically organized geographic objects defined by the user. Examples include streets, wells, soils, and streams.
TIGER - Supersedes DIME files (see entry). See Topologically Integrated Geographic Encoding and Referencing.
Topographic Map - A map of land-surface features including drainage lines, roads, landmarks, and usually relief, or elevation.
Topologically Integrated Geographic Encoding and Referencing data (TIGER) - A format used by the US Census Bureau to support census programs and surveys. It is being used for the 1990 census. TIGER files contain street address ranges along lines and census tract/block boundaries. These descriptive data can be used to associate address information and census/demographic data to coverage features.
Topology - The spatial relationships between connecting or adjacent coverage features (e.g., arcs, nodes, polygons, and points). For example, the topology of an arc includes its from- and to-nodes and its left and right polygons. Topological relationships are built from simple elements into complex elements: points (simplest elements), arcs (sets of connected points), areas (sets of connected arcs), and routes (sets of sections that are arcs or portions of arcs). Redundant data (coordinates) are eliminated because an arc may represent a linear feature, part of the boundary of an area feature, or both. Topology is useful in GIS because many spatial modeling operations don't require coordinates, only topological information. For example, to find an optimal path between two points requires a list of which arcs connect to each other and the cost of traversing along each arc in each direction. Coordinates are only necessary to draw the path after it is calculated.
Transformation - The process of converting data from one coordinate system to another through translation, rotation, and scaling.
Transmission Control Protocol (TCP) - One of the protocols on which the Internet is based.
Vectors - Lines defined by "x", "y" and "z" coordinate endpoints. Roads, rivers, contour lines, etc. presented as vector lines.
Vector Data - A coordinate-based data structure commonly used to represent map features. Each linear feature is represented as a list of ordered x, y coordinates. Attributes are associated with the feature (as opposed to a raster data structure, which associates attributes with a grid cell). Traditional vector data structures include double-digitized polygons and arc-node models.
Vector Display - A vector display on a computer screen is produced by drawing vectors on the screen. A raster display, in contrast, is produced on a screen as rows of dots of "on" or "off" which produce the picture.
Wide Area Network (WAN) - a network that uses high-speed, long distance communications networks or satellites to connect computers over distances greater than those traversed by local area networks (LANs)--about 2 miles.
Workstations and Terminals - A workstation is a device or a combination of devices integrated to provide the user with graphic data entry, display, and manipulation. These devices are used for map digitizing and map-related applications, geographic analysis and ad hoc query. Most systems still use some type of inexpensive edit-query workstations or terminals to provide low-cost access to both maps and related data.
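
The Topology entry in this glossary notes that finding an optimal path requires only a list of which arcs connect to each other and the cost of traversing each arc in each direction; coordinates are needed only to draw the result. The following is a minimal sketch of that idea in Python. The arc table, node names, and costs are hypothetical, and Dijkstra's algorithm is our choice of technique for illustration; the guide does not name a particular method.

```python
import heapq

# Hypothetical arc table: (from_node, to_node, cost_forward, cost_backward).
# No coordinates are stored; only connectivity and traversal costs.
ARCS = [
    ("A", "B", 4.0, 4.0),
    ("A", "C", 1.0, 1.0),
    ("C", "B", 2.0, 5.0),  # cheaper to travel C->B than B->C
    ("B", "D", 3.0, 3.0),
]


def build_adjacency(arcs):
    """Turn the arc table into node -> [(neighbor, cost), ...]."""
    adjacency = {}
    for from_node, to_node, forward, backward in arcs:
        adjacency.setdefault(from_node, []).append((to_node, forward))
        adjacency.setdefault(to_node, []).append((from_node, backward))
    return adjacency


def optimal_path(arcs, start, goal):
    """Return (list_of_nodes, total_cost), or (None, inf) if unreachable."""
    adjacency = build_adjacency(arcs)
    best = {start: 0.0}      # lowest known cost to reach each node
    previous = {}            # node -> node it was reached from
    queue = [(0.0, start)]   # priority queue ordered by cost so far
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, arc_cost in adjacency.get(node, []):
            new_cost = cost + arc_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[goal]
```

Calling `optimal_path(ARCS, "A", "D")` on this sample network follows the arcs A-C, C-B, and B-D. Because each arc carries a separate cost per direction, the return trip from D to A may follow a different route.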


GIS DEVELOPMENT GUIDE: NEEDS ASSESSMENT

1 INTRODUCTION
A needs assessment is the first step in implementing a successful GIS within any local government. A needs assessment is a systematic look at how departments function and the spatial data needed to do their work. In addition to the final needs assessment report that is generated, intangible benefits are realized by an organization. Conducting a GIS needs assessment fosters cooperation and enhanced communication among departments, because they work together on a common technology and a new set of tools. Finally, the needs assessment activity itself serves as a learning tool through which potential users in each participating department learn about GIS and how it can serve the department.

A needs assessment is required if the local government will be adopting a GIS throughout the organization. Without a complete needs assessment, each department might proceed to adopt its own system and database, which may or may not be compatible with those of another department. The largest benefit for a local government adopting a GIS is to realize efficiencies from common "base data" and the sharing of data among departments.

At the conclusion of a needs assessment, an organization will have all of the information needed to plan the development of a GIS. This information can be grouped into the following categories:

Applications to be developed - In evaluating the responsibilities and work flow within a department, certain tasks are identified that can be done more efficiently or effectively in a GIS. These tasks will form the basis of GIS applications. Application descriptions prepared as part of the needs assessment will describe these tasks.

GIS functions required - For each application identified, certain GIS functions will be required. These will include standard operations such as query and display, spatial analysis functions such as routing, overlay analysis, and buffering, and possibly advanced analysis requiring special programming.

Data needed in the GIS database - Most departments in local government use data that have a spatial component. Many of these data are hardcopy maps or tabular data sets that have a spatial identifier such as addresses and zip codes or X-Y values (latitude-longitude, state plane coordinates, or other coordinate system). A needs assessment will identify how this information will be used by GIS applications.

Data maintenance procedures - By looking at the work flow and processes within and between departments, responsibility for data creation, updates and maintenance will become apparent.

Note : The needs assessment procedure refers to a local government and its departments as the organizational units. In a multi-agency GIS cooperative, the same activities described would be carried out by all participants, at the appropriate level of detail as determined by the role each participant would play in the resulting GIS cooperative.

Once all of this information is collected and analyzed for each department and published in a report, it can be used as a blueprint for implementing the GIS. The GIS coordinating group within the organization will use it to:

- Design the GIS database
- Identify GIS software that will meet the government's needs
- Prepare an implementation plan
- Start estimating the benefits and costs of a GIS

A common mistake in performing a needs assessment is to simply take an inventory of the maps and spatial data currently used in each department. There are two major problems with this approach. First, it does not allow the GIS coordinating group to evaluate how a GIS could be used to enhance the work of each department and the agency as a whole. By looking at the department functions and what the department does or produces, the GIS coordinating group and potential users develop an understanding of the role GIS can play in the organization. The existing data and maps do need to be inventoried and may well be used in building the GIS; however, such an inventory should be separate from the needs assessment.

The second major problem with the "data inventory" approach is that it tends to focus only on data internal to the organization. Local governments rely heavily on data from outside sources: federal agencies, state agencies, businesses, etc. The need for these data is better determined by looking at the potential GIS applications and how data will be used by each application. It can then be determined what data should be acquired from other sources.

2 CONDUCTING A NEEDS ASSESSMENT


The most significant aspect of a needs assessment is documenting the findings in a standard and structured manner. It is very important to adopt (or develop) a standard method for describing all of the GIS tasks, processes, and data included in the needs assessment. The forms provided with this guide are used in the needs assessment to identify three kinds of GIS requirements:

- GIS applications - tasks that the GIS performs when a user requires them, such as preparing a map, processing a query, or conducting some particular GIS analysis. GIS applications can be described using the five-page GIS Application Description forms included with this guide as Appendix A.
- GIS activities - situations where information needs to be kept on some activity or process important to the user, such as issuing building permits or conducting public health inspections. A GIS activity can be described using pages 1 and 4 of the GIS Application Description forms: the main application form and the data flow diagraming (DFD) form.
- GIS data - categories of spatial data that are important to keep but that do not appear in any GIS application or activity identified in an application description. A separate method must be developed to systematically record the need for such data. Data that is needed but falls into neither of the above categories can be entered directly into the master data list.


The main method used to collect the information entered onto the forms is the individual interview. Potential users of the GIS can be identified by management and by examination of the organization chart. A series of one-on-one interviews is the best way to identify the users' needs. During the interview, the user can usually identify documents that can provide additional information to the GIS analyst. The needs assessment activity is composed of two main parts:

- Interviewing potential GIS users and documenting their needs
- Compiling the results of the needs assessment into the master data list and the list of GIS functions. These two lists are used, respectively, to prepare the GIS data model and the GIS specifications (activities described under Conceptual Design).

The interview process should identify and describe all anticipated uses of the GIS. The next section briefly describes the major categories of GIS use, followed by a detailed description of how to complete the needs assessment forms.

3 LOCAL GOVERNMENT USES OF GIS


The use of geographic information systems by local government falls into five major categories:

- Browse
- Simple display (automated mapping)
- Query and display
- Map analysis
- Spatial modeling

Browse
This function is equivalent to the human act of reading a map to find particular features or patterns. Browsing usually leads to identification of items of interest and subsequent retrieval and manipulation by manual means. For single maps, or relatively small areas, the human brain is very efficient at browsing. However, as data volumes increase, automated methods are required to effectively extract and use information from the map.

Simple Display
This GIS function is the generation of a map or diagram by computer. Such maps and diagrams are often simple reproductions of the same maps used in a previous, manually oriented GIS environment. Examples of this type of use are preparation of a 1:1000-scale town map, a sketch of an approved site plan, maps of census data, etc.

Query and Display
This function supports the posing of specific questions to a geographic database, with the selection criteria usually being geographic in nature. A typical simple query would be: "draw a map of the location of all new residential units built during 1989." A more complex query might be: "draw a map of all areas within the town where the actual number of new residential units built in 1989 exceeds growth predictions." Such a query could be part of a growth management activity within the town. Queries may be regular, often-asked questions or ad hoc, specific-purpose questions. The ability to respond to a variety of questions is one of the most useful features of a GIS in its early stages of operation. In the long run, other more sophisticated applications of the GIS may have a higher value or benefit, but to achieve these benefits, users must be familiar with the GIS and its capabilities. Such familiarization is achieved by using the GIS for the simpler tasks of query and display.

Map Analysis (Map Overlay)
This involves using the analytical capabilities of a GIS to define relationships between layers of spatial data. Map analysis is the super-imposition of one map upon another to determine the characteristics of a particular site (e.g., combining a land use map with a map of flood prone areas to show potential residential areas at risk of flooding). Map analysis (often termed overlay or topological overlay) was one of the first real uses of GIS. Many government organizations, particularly those managing natural resources, need to combine data from different maps (vegetation, land use, soils, geology, ground water, etc.). The overlay function was developed to accomplish the super-imposition of maps in a computer. The data are represented as polygons, or areas, in the GIS database, with each type of data recorded on a separate "layer." The combination of layers is done by calculating the logical intersection of polygons on two or more map layers. In addition to combining multiple "layers" of polygon-type data, the map overlay function also permits the combination of point data with area data (point-in-polygon). This capability would be very useful in a town for combining street addresses (from the Assessor's files) with other data such as parcel outlines, census tracts, environmental areas, etc. Many facility siting problems, location decisions, and land evaluation studies have successfully used this procedure in the past.

Spatial Modeling
This application is the use of spatial models or other numerical analysis methods to calculate a value of interest. The calculation of flow in a sewer system is an example of spatial modeling. Spatial modeling is the most demanding use of a GIS and provides the greatest benefit. Most spatial modeling tasks are very difficult to perform by hand and are not usually done unless a computerized system, such as a GIS, is available. These models allow engineers and planners to evaluate alternate solutions to problems by asking "what if" type questions. A spatial model can predict the result expected from a decision or set of decisions. The quality of the result is only as good as the model, but the ability to test solutions before decisions have to be made usually provides very useful information to decision makers. Once again, this type of use of a GIS will evolve over time, as the GIS is implemented and used.

A closely related computer capability is a CAD (computer aided design) system. CAD systems are used to prepare detailed drawings and plans for engineering and planning applications. While CAD system functions are different from GIS functions, many commercial CAD products have some of the functionality normally found in a GIS. There are, however, significant differences between a CAD system and a GIS, mainly in the structure of the database. There may be some need for CAD-type capabilities in a particular local government, so this forms another category of use.

In general, geographic information in local government is used to:

- Respond to public inquiries
- Perform routine operations such as application reviews and permit approvals
- Provide information on the larger policy issues requiring action by the town board
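The simple query quoted under Query and Display above (new residential units built during 1989) amounts to an attribute filter followed by a display step. The sketch below is only an illustration of that idea; the `ResidentialUnit` record, its field names, and the sample coordinates are hypothetical and not part of this guide.

```python
from dataclasses import dataclass


@dataclass
class ResidentialUnit:
    # Hypothetical record layout, for illustration only.
    unit_id: str
    year_built: int
    x: float  # map coordinate (units assumed)
    y: float


def units_built_in(units, year):
    """Attribute query: keep the units whose year_built equals the query year."""
    return [u for u in units if u.year_built == year]


units = [
    ResidentialUnit("A-1", 1988, 1200.0, 450.0),
    ResidentialUnit("A-2", 1989, 1310.5, 472.0),
    ResidentialUnit("B-7", 1989, 980.0, 615.5),
]

selected = units_built_in(units, 1989)
# The (x, y) pairs are what a display step would plot on the map.
points_to_plot = [(u.x, u.y) for u in selected]
print(points_to_plot)
```

In a working GIS the filter would run against the database and the plotting would be done by the display software; the point here is only the split between the query step and the display step.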

These are typical local government activities which benefit from a geographic information system. The development of a GIS will facilitate the present geographic information handling tasks and should lead to the development of additional applications of benefit to the local government.

There are also other computer systems in local governments that perform GIS-like functions, such as Emergency 911, underground utility locator systems, school bus routing systems, etc. The variety and diversity of GIS applications are what make the definition of a GIS very difficult. Basically, any computer system in which the data have one or more spatial identifiers, or which performs spatial operations, can be classed as a GIS. For example, a system that contains street addresses and census tract codes and can place a given street address in the proper census tract is a GIS, whether or not map boundaries are part of the system. There are two important points here:

- A large proportion of local government data does have one or more spatial identifiers, and therefore has the potential of being part of a GIS.
- Other existing systems that hold GIS data or perform GIS-like functions must be integrated into the overall system design. A GIS should not be developed as a separate system.
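Placing an address in its census tract is an instance of the point-in-polygon operation mentioned under Map Analysis. A minimal sketch using the standard ray-casting test follows; the tract outlines, address coordinates, and function names are invented for illustration, and points lying exactly on a boundary are not treated specially.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order; the last vertex
    connects back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def tract_for_point(x, y, tracts):
    """Return the id of the first tract containing the point, or None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(x, y, polygon):
            return tract_id
    return None


# Hypothetical tract outlines and geocoded addresses.
tracts = {
    "tract_101": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "tract_102": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
address_points = {"12 Main St": (4.0, 5.0), "80 Elm Ave": (15.5, 2.0)}

assignments = {
    address: tract_for_point(x, y, tracts)
    for address, (x, y) in address_points.items()
}
print(assignments)
```

A system that stores only an address-to-tract lookup table reaches the same answer without any geometry, which is why the text counts it as a GIS even when no map boundaries are held.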

Whether a local government unit is considering or planning a "full, multi-purpose GIS" or is only interested in a limited or single-function system, the database planning and design considerations are the same. Only the magnitude of the analysis and design activities differs. Some GIS users believe that smaller and simpler applications, such as a school bus routing system, do not require a formal planning activity. There are, however, several reasons to conduct such a planning activity for the smaller applications:

- To ensure that the user requirements will be fully met
- To develop documentation, especially data documentation (metadata), needed to use and maintain the GIS
- To be in a position to participate in data sharing programs with other agencies as additional applications are developed
- To create a permanent record of the data and its use, to document agency plans and decisions, and to meet data retention and archiving requirements
- To use as a base for building a larger, multi-function GIS at some later date

The level of effort needed to complete a GIS plan can be kept commensurate with the scope and size of the intended GIS. Further, the GIS planning software tool that accompanies these guidelines provides an easy and convenient way to create the recommended documentation.

4 DATA USED BY LOCAL GOVERNMENT

There are many kinds of data used by local government that can be included in a GIS. Data in a GIS can be one of two types: spatial data and non-spatial data. Spatial data is data taken from maps, aerial photographs, satellite imagery, etc. It is composed of spatial entities, relationships between these entities, and attributes describing these entities. Non-spatial data is usually tabular data taken from tables, lists, etc. Most of the time, the non-spatial data will be linked to one or more spatial entities by keys (unique identifiers associated with both the spatial data and the non-spatial data). For example, the tax map represents the spatial data, while the real property inventory is non-spatial data that is linked to the entities (parcels) on the tax map. Spatial data is commonly represented by geometric objects (points, lines, and polygons). Non-spatial data containing a spatial reference is also considered spatial data. One of the most common forms of this type of data in local government is records and files referenced by street address.

Examples of local government data that have been used with GIS include:

- Tax parcels
- Real property inventories
- Infrastructure data
  - Water system
  - Sewer system
  - Electric
- Census data
- Land use maps
- Zoning maps
- Planimetrics
  - Right-of-way
  - Waterways (streams)
  - Building outlines
- Permit records

The operations required in a GIS must meet the data handling requirements of the spatial data as well as those of the non-spatial data. The most common use of a GIS in local government is a query based on attribute keys, with the results displayed in map form.
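The link between non-spatial records and spatial entities through a shared key can be sketched as follows. The example assumes the section-block-lot (SBL) number is the key joining the real property inventory to the tax map parcels; the SBL values, owner names, assessed values, and function name are made up for illustration.

```python
# Spatial side: parcel outlines from the tax map, keyed by SBL number.
parcel_polygons = {
    "55.01-2-14": [(0, 0), (50, 0), (50, 120), (0, 120)],
    "55.01-2-15": [(50, 0), (100, 0), (100, 120), (50, 120)],
}

# Non-spatial side: real property inventory records carrying the same key.
property_inventory = [
    {"sbl": "55.01-2-14", "owner_name": "A. Resident", "assessed_value": 98000},
    {"sbl": "55.01-2-15", "owner_name": "B. Resident", "assessed_value": 142000},
]


def parcels_matching(inventory, polygons, predicate):
    """Attribute query on the table, then key lookup of the geometry to draw."""
    return {
        record["sbl"]: polygons[record["sbl"]]
        for record in inventory
        if predicate(record) and record["sbl"] in polygons
    }


high_value = parcels_matching(
    property_inventory,
    parcel_polygons,
    lambda record: record["assessed_value"] > 100000,
)
print(list(high_value))
```

The query itself touches only the tabular data; the key is what turns the answer into something that can be drawn on a map.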


5 DOCUMENTING GIS NEEDS


The GIS needs are documented using the following forms (full-page copies of all forms are included in Appendix A):

- The GIS Application Description (5 pages), used to:
  - Describe products (mostly map displays) produced by the GIS
  - Describe activities supported by the GIS
- The Master Data List

Most GIS applications can be described using the GIS Application Description. Where these forms are not appropriate, any other systematic description of the need can be used. Different forms can be developed if more appropriate, as long as the same information is systematically recorded: the data required and the GIS functions needed to develop the GIS product.

GIS Application Description

The set of forms used to document a GIS application contains five pages:

Figure 1 - GIS Application Descriptions
Each page carries the heading "Name of Government - Geographic Information System Requirements Analysis," the identification fields (Application Identification #, Application Name, Department, Defined by), and the sign-off fields (Prepared by, Approved by, Date).

GIS Application Description (Page A-1)
Fields: Purpose and Description; Type of Application (Display, Query, Query & Display, Map Analysis, Spatial Model); Display/Map Scale; Query Key; Response Time; Frequency; Data Required: Features (entities) and Attributes.
Use to enter:
- Application identification
- Description of purpose
- Type of application, map scale, query key, frequency, and required response time
- Data needed by the application
  - Entities (features)
  - Attributes of entities

Map Display (Page A-2)
Fields: Screen; Hard Copy; Graphical Output Sample; Symbols/Legend.
Used to draw a sample of any maps to be produced by the application (including the legend showing symbols for each feature). This can be a hand sketch, although it should be drawn to the scale of the output desired.

Table Display (Page A-3)
Fields: Screen; Hard Copy; Report Layout/Format (headings, sub-headings, sub-totals/totals).
Used to show samples of any tables to be produced by the application (used only if tables are needed in the application). If any entries in the table involve complex calculations, these should be described using either a Data Flow Diagram (page 4) or other separate pages.

Data Flow Diagram (Page A-4)
Fields: Process Description; Data Flow Diagram or Flow Chart.
Used to draw a data flow diagram or flow chart when an application is complex. This chart is usually drawn by the GIS analyst or someone else familiar with the diagraming techniques, and is used to document complex calculations or descriptions of activities that will need GIS support.

Entity-Relationship Diagram (Page A-5)
Fields: Data Description; Entity-Relationship Diagram.
Used to draw an entity-relationship (E-R) diagram of the data used in the application. This drawing is usually done by the GIS analyst or someone else familiar with the E-R technique, and is only done for more complex GIS applications.
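For readers who plan to keep the completed forms in electronic form, page A-1 maps naturally onto a simple record. The sketch below is one possible layout, not the format used by the planning software that accompanies this guide; the class name and field names are assumptions, and the sample department and attribute values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The five application types printed on page A-1 of the form.
APPLICATION_TYPES = (
    "Display", "Query", "Query & Display", "Map Analysis", "Spatial Model",
)


@dataclass
class ApplicationDescription:
    application_id: str
    application_name: str
    department: str
    defined_by: str
    purpose: str
    application_type: str
    map_scale: str = ""
    query_key: str = ""
    response_time: str = ""
    frequency: str = ""
    # Data required: entity (feature) name -> list of attribute names
    data_required: Dict[str, List[str]] = field(default_factory=dict)

    def __post_init__(self):
        # Reject a type that is not one of the choices on the form.
        if self.application_type not in APPLICATION_TYPES:
            raise ValueError(f"unknown application type: {self.application_type}")


zoning_query = ApplicationDescription(
    application_id="1",
    application_name="Zoning Query",
    department="Planning",        # illustrative value
    defined_by="GIS analyst",     # illustrative value
    purpose="Report the zoning of a parcel identified by a resident.",
    application_type="Query & Display",
    frequency="85/day",
    data_required={
        "Parcel": ["section_block_lot#"],
        "Zoning": ["zoning_code"],
    },
)
print(sorted(zoning_query.data_required))
```

Holding the entity and attribute lists in a structured field is what later makes it easy to compile the master data list from many descriptions.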


6 DOCUMENTING AN ACTIVITY-TYPE USE OF THE GIS


Some GIS applications in local government do not involve the production of maps and tables. For example, a GIS may be used to record and store information about a building permit application, a subdivision plat, a site plan, etc. Many activities of local government are simply the processing of permits from individuals or firms. If any of these activities will also generate GIS data, they should be described for the needs assessment. Two techniques available for describing processes are flow charts and data flow diagrams. A completed application description for a local government activity of this type can be entered on pages 1 and 4 of the GIS Application Description forms. Page 4 - Data Flow Diagram would appear as follows:

Data Flow Diagram Example

[Data flow diagram. Participants: Town Resident, Assessor Office, Planning Department. Processes: Locate Parcel, Determine Zoning. Data: Zoning Map, SBL # Index, Tax Map.]

Figure 2 - Data Flow Diagram Example

This example shows a data flow diagram with three participants (town resident, planning department, and assessor's department) that use three parts of the database (zoning map, section-block-lot number index, and tax map) to answer a zoning inquiry. Appendix D contains a brief description of the data flow diagraming method.
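The zoning inquiry of Figure 2 can also be read as two processes chained together. The sketch below assumes one plausible order of steps (address to SBL number, SBL number to parcel location, location to zone); that order, the rectangular zones, and all of the sample data are assumptions made for illustration.

```python
# Hypothetical stand-ins for the three parts of the database in Figure 2.
sbl_index = {"100 Main St": "55.01-2-14"}   # street address -> SBL number
tax_map = {"55.01-2-14": (25.0, 60.0)}      # SBL number -> parcel location
zoning_map = {                               # zone -> (xmin, ymin, xmax, ymax)
    "R-1": (0, 0, 50, 120),
    "C-2": (50, 0, 100, 120),
}


def locate_parcel(address):
    """Process 'Locate Parcel': address -> SBL number -> map location."""
    sbl = sbl_index[address]
    return sbl, tax_map[sbl]


def determine_zoning(location):
    """Process 'Determine Zoning': find the zone rectangle containing the point."""
    x, y = location
    for zone, (xmin, ymin, xmax, ymax) in zoning_map.items():
        if xmin <= x < xmax and ymin <= y < ymax:
            return zone
    return None


def zoning_inquiry(address):
    """Answer a resident's inquiry by chaining the two processes."""
    sbl, location = locate_parcel(address)
    return {"address": address, "sbl": sbl, "zoning": determine_zoning(location)}


print(zoning_inquiry("100 Main St"))
```

Writing the flow out this way makes the data each process needs explicit, which is the same information the data flow diagram is meant to capture for the needs assessment.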


7 THE MASTER DATA LIST


The master data list is a composite of all data entities (features) and their attributes that have been entered in the data section of the GIS Application Description (Page 1). Other data identified by users as "needed," but not included in any application description, may be entered directly into the master data list.

Master Data List

Entity               Attributes                                      Spatial Object
-------------------  ----------------------------------------------  --------------
Street_segment       name, address_range                             Line
Street_intersection  street_names                                    Line
Parcel               section_block_lot#, owner_name, owner_address,  Polygon
                     site_address, area, depth, front_footage,
                     assessed_value, last_sale_date,
                     last_sale_price, size (owner_name,
                     owner_address, assessed_value as of previous
                     January 1st)
Building             building_ID, date_built, building_material,     Footprint
                     building_assessed_value
Occupancy            occupant_name, occupant_address,                None
                     occupancy_type_code
Street_segment       name, type, width, length, pavement_type        Polygon
Street_intersection  length, width, traffic_flow_conditions,         Polygon
                     intersecting_streets
Water_main           type, size, material, installation_date         Line
Valve                type, installation_date                         Node
Hydrant              type, installation_date, pressure,              Node
                     last_pressure_test_date
Service              name, address, type, invalid_indicator          None
Soil                 soil_code, area                                 Polygon
Wetland              wetland_code, area                              Polygon
Floodplain           flood_code, area                                Polygon
Traffic_zone         zone_ID#, area                                  Polygon
Census_tract         tract#, population                              Polygon
Water_District       name, ID_number                                 Polygon
Zoning               zoning_code, area                               Polygon

Figure 3 - Master Data List
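Compiling the master data list is essentially a merge of the entity and attribute lists from every application description, plus any data entered directly. A minimal sketch is shown below; the function name and the input layout (one dictionary of entity to attribute list per application) are assumptions, while the entity and attribute names are taken from Figure 3.

```python
def compile_master_data_list(application_data, direct_entries=None):
    """Merge entity/attribute lists into one composite list.

    application_data: one dict per application description, mapping
    entity name -> list of attribute names.
    direct_entries: optional dict of the same shape for data that users
    need but that appears in no application description.
    """
    master = {}
    for source in list(application_data) + [direct_entries or {}]:
        for entity, attributes in source.items():
            merged = master.setdefault(entity, [])
            for attribute in attributes:
                if attribute not in merged:  # record each attribute once
                    merged.append(attribute)
    return master


fire_flow_test = {"Hydrant": ["type", "pressure", "last_pressure_test_date"]}
water_line_map = {
    "Hydrant": ["type", "installation_date"],
    "Water_main": ["type", "size", "material"],
}
needed_but_unlisted = {"Wetland": ["wetland_code", "area"]}

master = compile_master_data_list(
    [fire_flow_test, water_line_map], needed_but_unlisted
)
for entity, attributes in master.items():
    print(entity, attributes)
```

Keeping the first-seen order of attributes is a convenience for review; a spatial object type per entity could be tracked the same way.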


8 CONDUCTING INTERVIEWS
Individual interviews are the most effective way of finding out from users their potential GIS applications. Before starting interviews, a briefing session for all potential users should be held. During this meeting, the interviewers should describe the entire needs assessment procedure to all participants. The main activities will be:

- Conduct "start-up" seminar or workshop
- Interview each potential user
- Prepare documentation (forms) for each application, etc.
- Review each application description with the user
- Obtain user approval of and sign-off for each application description

An introductory seminar or workshop with all potential users in attendance is useful to prepare the way for user interviews. At the beginning of a project, many users may not have much knowledge about GIS or how it might help them. Also, the interview team may be from outside the organization and may not be very familiar with the structure of the particular local government. The start-up seminar should address the following topics:

- Definitions:
  - What is a GIS?
  - How is a GIS used by local government? (Typical applications)
- Interview procedure to be followed:
  - What will the interviewee do?
  - What is expected from the interviewee?
  - Who approves the application descriptions?
  - How will the information from the application descriptions be used?
- Group discussion: It is often useful to have the group identify an initial set of GIS applications as candidates for further documentation. The discussion of possible applications between interviewers and users will start to reveal what is suitable for a GIS application. One or more applications can be described in the process by the group so everyone sees how the process will work.

It is preferable to interview users individually rather than in groups. This provides a better opportunity to explore the ideas of each person and also prevents other individuals from dominating any particular meeting. Group meetings easily lose focus on specific GIS applications and therefore do not provide the detailed information needed to adequately describe the GIS applications.

Conducting an interview is not an easy task. Some potential users may have a good grasp of GIS and how they might use one. Often, however, potential users do not have complete knowledge of the capabilities of a GIS and therefore may not be able to readily identify GIS applications. In these cases, the interviewer (GIS analyst) needs to help the user explore his/her job activities and responsibilities to identify GIS opportunities. The GIS analyst should usually begin an interview with a review of the procedure, then ask the user to identify and describe potential applications. When specific GIS applications cannot be easily identified, it is helpful if potential users describe, in general, their job functions and responsibilities and the role their department plays in the whole organization. From this discussion, the GIS analyst can usually identify potential GIS applications and then explore these for possible inclusion in the needs assessment.


[Diagram. The user's question "How can I use a GIS?" is shown with four kinds of need (need to make a map, need to answer a query, need to save important data, need to describe an activity) and three resulting actions (prepare Application Description, add to Master Data List, prepare Data Flow Diagram), illustrated with reduced copies of the GIS Application Description form, the Master Data List, and the Data Flow Diagram form.]

Figure 4 - Interviewing and Documenting Needs of a Potential GIS User


9 PREPARING THE NEEDS ASSESSMENT REPORT


The needs assessment report consists of the application descriptions, the master data list, and several summary tables. A list of all applications summarizing the type and frequency of use is the first table.

App #  Application Name              Type             Frequency
-----  ----------------------------  ---------------  -----------
1      Zoning Query                  Query & Display  85/day
2      Customer Phone Inquiry        Query & Display  100/day
3      Fire Dispatch Map             Query & Display  86/day
4      Fire Redistricting Map        Map Analysis     1/year
5      Crime Summary Map             Query & Display  12/month
6      Patrol Dispatch Map           Query & Display  133/day
7      Complaint Summary Map         Query & Display  624/year
8      Subdivision Development Map   Query & Display  No estimate
9      Counter Query Map             Query & Display  85/day
10     Land Use/Land Value           Map Display      1/year
11     Assessed Value Map            Query & Display  144/year
12     Grievance Map                 Query & Display  2500/year
13     Comparable Value Map          Query & Display  No estimate
14     Built/Vacant Map              Display          1/year
15     Water and Sewer Line Map      Query & Display  30/month
16     Hydrologic Profile Map        Spatial Model    1440/year
17     Sewer System Flow Analysis    Spatial Model    12/year
18     Emergency Repair Map          Query & Display  110/year
19     Storm Drainage Map            Spatial Model    700/year
20     Fire Flow Test Map            Spatial Model    260/year
21     Easement Map                  Query & Display  520/year
22     Zoning Map                    Query & Display  50/day
23     Floodplain Map                Query & Display  50/day
24     Youth League Residency Check  Query & Display  3500/year
25     Mosquito Control Area Map     Query & Display  50/year
26     Site Plan Approval Process    Query & Display  200/year
27     Census Data Map               Display          48/year
28     Population Density Map        Map Analysis     50/year
29     Land Use Inventory            Display          24/year
30     Retail Space Projection       Spatial Model    24/year
31     Office Space Projection       Spatial Model    12/year
32     Traffic Volume Map            Query & Display  24/year

Figure 5 - List of GIS Applications

This table contains selected GIS applications from the Town of Amherst, N.Y. needs assessment.



The data from the first table can be used to prepare tables summarizing applications by department and the frequency of applications by department.

GIS Application by Department by Type

Department     Display  Query & Display  Map Analysis  Spatial Model    Total
-------------  -------  ---------------  ------------  -------------  -------
Fire Dispatch
Police
Assessor
Engineering                                                                15
Building
Recreation
Highway                              10                                    11
Planning            10               12                                    28
Total               17               45                           10       76

GIS Application by Dept. by Frequency

Department     Display  Query & Display  Map Analysis  Spatial Model    Total
-------------  -------  ---------------  ------------  -------------  -------
Fire Dispatch                    94,170                          100   94,271
Police                           49,637                                49,637
Assessor                         23,894                                23,896
Engineering         18            2,049                        3,452    5,519
Building           250           25,000                                25,250
Recreation                        3,520                                 3,520
Highway                           1,475                           10    1,485
Planning           718            2,536            80             40    3,374
Total              988          202,281            81          3,602  206,952

[Cells are blank where the source shows no value or where the value did not survive reproduction.]

Figure 6 - Table Summarizing Applications Example
Numbers in these tables are from the Town of Amherst, N.Y. needs assessment and represent estimates of GIS use per year. These numbers will be used during the database Planning and Design phase to estimate the usage and benefits of the GIS. In this example, for the Town of Amherst, it is estimated that 2.5 minutes of staff time will be saved for each query, giving a total savings of about 4.04 staff-years per year (202,281 queries times 2.5 minutes, divided by 60 minutes per hour, divided by 2,088 hours per year).
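The staff-time estimate can be reproduced with a few lines of arithmetic. All three inputs below come from the text; only the variable names are introduced here.

```python
queries_per_year = 202_281        # Query & Display total from the frequency table
minutes_saved_per_query = 2.5     # estimated staff time saved per query
hours_per_staff_year = 2088       # working hours in one staff-year

hours_saved = queries_per_year * minutes_saved_per_query / 60
staff_years_saved = hours_saved / hours_per_staff_year

print(f"{hours_saved:.1f} hours saved per year")
print(f"{staff_years_saved:.2f} staff-years saved per year")
```

The same calculation can be repeated per department, or with a different savings estimate, to see how sensitive the benefit figure is to the 2.5-minute assumption.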

The last table relates GIS applications to the data used by each application.
Application/Data Item Matrix:

[Matrix. Columns (data items): Roads, Buildings, Land Parcels, Water Mains, Fire Hydrants, Wetland Area. Rows (applications): #1 Leak Detection Map, #2 Customer Service Report, #3 Pressure Test Map, #4 Hydraulic Model Analysis, #5 Work Crew Schedule. An X marks each data item used by an application.]

This matrix is very useful in planning and scheduling data conversion. If applications are prioritized, then data needed by high priority applications can be scheduled for conversion early in the conversion process. Also, if some data is not available for some reason, it is possible to determine the affected applications.

Figure 7 - GIS Applications/Data Matrix

The last step in compiling the needs assessment report is to extract the list of GIS functions needed from the application descriptions. This list will include the standard function types of display and query and display, plus any other functions included in a data flow diagram or flow chart. Typical examples of such GIS functions are: calculate distance between objects, determine the shortest path through a network, etc. Figure 8 is an example of a GIS functions list.
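The two uses of the application/data matrix described with Figure 7 (ordering data conversion by application priority, and finding the applications affected by missing data) can be sketched as follows. The application and data item names come from Figure 7, but the pairings between them, the priority order, and the function names are hypothetical.

```python
# application -> set of data items it uses (pairings are illustrative only)
matrix = {
    "#1 Leak Detection Map": {"Roads", "Water Mains"},
    "#2 Customer Service Report": {"Land Parcels", "Buildings"},
    "#3 Pressure Test Map": {"Water Mains", "Fire Hydrants"},
}

# Applications listed from highest to lowest priority (assumed ranking).
priority = [
    "#3 Pressure Test Map",
    "#1 Leak Detection Map",
    "#2 Customer Service Report",
]


def conversion_order(matrix, priority):
    """List data items so that high-priority applications are served first."""
    order = []
    for application in priority:
        for item in sorted(matrix[application]):
            if item not in order:
                order.append(item)
    return order


def affected_applications(matrix, missing_item):
    """Applications that cannot run if the given data item is unavailable."""
    return sorted(app for app, items in matrix.items() if missing_item in items)


print(conversion_order(matrix, priority))
print(affected_applications(matrix, "Water Mains"))
```

Storing the matrix as application-to-items sets keeps both questions to a single pass over the data.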


GIS Functions/Procedures List

GIS Functions Needed:

GIS Functions         Generic GIS        Candidate GISs
(from Applic. Desc.)  Functions          ARC/INFO     INTERGRAPH  SYSTEM 9
--------------------  -----------------  -----------  ----------  --------
DISPLAY               SCREEN DISPLAY     ARCPLOT      YES         YES
                      PLOTTER DISPLAY    ARCPLOT      YES         YES
                      GENERATE REPORT    INFO         YES         YES
QUERY                 ATTRIBUTE QUERY    INFO         YES         YES
SPATIAL QUERY         SPATIAL SEARCH     IDENTIFY     YES         YES
MAP ANALYSIS          OVERLAY            ARC          YES         YES
                      BUFFER             ARC          YES         YES
                      RECLASSIFY         ARC          YES         YES
SHORTEST PATH         SHORTEST PATH      NETWORK      NO          YES
ROUTE                 ROUTE              NETWORK      NO          YES
HYDRAULIC MODEL                          AML (macro)  YES         YES

In the source form, each candidate GIS column is divided into "Basic" and "Macro" sub-columns; entries above are basic capabilities unless marked as a macro.

Figure 8 - GIS Function List
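Extracting the function list from the application descriptions, as described before Figure 8, amounts to collecting each application's type together with any functions named in its data flow diagram. The sketch below assumes each description is held as a small dictionary; the `dfd_functions` key and the function assigned to the Emergency Repair Map are hypothetical.

```python
def compile_function_list(applications):
    """Collect the distinct GIS functions named across all applications."""
    functions = []
    for application in applications:
        names = [application["type"]] + application.get("dfd_functions", [])
        for name in names:
            key = name.upper()
            if key not in functions:  # list each function once
                functions.append(key)
    return functions


applications = [
    {"name": "Zoning Query", "type": "Query & Display"},
    {"name": "Census Data Map", "type": "Display"},
    {
        "name": "Emergency Repair Map",
        "type": "Query & Display",
        "dfd_functions": ["Shortest path"],  # assumed, for illustration
    },
]

print(compile_function_list(applications))
```

Each entry in the resulting list would then be matched against the generic functions and candidate systems, as in Figure 8.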



[Diagram. Reduced copies of the GIS Application Description form, the Data Flow Diagram form, and the Master Data List, with the flow labeled: Application Description, List of Important Data, Data Flow Diagram, Master Data List, List of GIS Functions.]

Figure 9 - Compiling Results of Needs Assessment Example

The list of GIS functions and the master data list will be used in subsequent tasks to design the database and prepare the GIS specifications.

Needs Assessment 49

10 SUMMARY
The procedure presented in the guideline for preparing a needs assessment is based on documenting GIS applications in a standard format. The components of this format are structured to facilitate communication between potential GIS users and the GIS analyst, and to provide specific and detailed information to the GIS analyst for designing the GIS. The first page of the application description is the most critical to the GIS analyst, as it contains the list of data and an indication of the GIS functionality required by the application. If additional information on the GIS functionality is needed, then a flow chart or data flow diagram can be developed (page 4 of the application description). For the potential user, the map display and report format describe the output he/she will receive. These pages should be sufficiently detailed for the user to approve, or sign off on, the correctness of the application description. It is, of course, very important that the entire GIS application description be internally consistent.

The entity-relationship diagram (page 5) is mainly useful in the next phase of the GIS design, Conceptual Design, where the data model for the entire system will be defined. If entity-relationship diagrams are prepared for individual applications, they will then be available for the Conceptual Design phase. Otherwise, these diagrams can be prepared during the Conceptual Design phase. Figure 9 is a diagrammatic representation of the flow of information from the elements of the application description to the master data list and the list of GIS functions.
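The compilation step that Figure 9 depicts can be sketched in a few lines of Python. This is only an illustration, not part of the guideline's procedure: the dictionary layout and the function name `compile_master_data_list` are assumptions, and the two application descriptions are abridged from applications 13 and 14 in Appendix E.

```python
# Two application descriptions, abridged from Appendix E (applications 13 and 14).
# The dictionary layout is a hypothetical encoding of the paper form.
applications = [
    {
        "id": 13,
        "name": "Land Use/Land Value Map",
        "features": {
            "Parcel": ("Polygon", ["depth", "front footage", "size",
                                   "last sale price", "land use code", "location"]),
            "Street (center line)": ("Line", ["name", "location"]),
        },
    },
    {
        "id": 14,
        "name": "Assessed Value Map",
        "features": {
            "Neighborhood": ("Polygon", ["name", "location"]),
            "Parcel": ("Polygon", ["assessed value", "location"]),
            "Street (double line)": ("Polygon", ["name", "location"]),
        },
    },
]


def compile_master_data_list(apps):
    """Merge per-application feature lists into one entry per entity."""
    master = {}
    for app in apps:
        for entity, (spatial_object, attributes) in app["features"].items():
            entry = master.setdefault(
                entity, {"spatial_object": spatial_object, "attributes": set()}
            )
            # The same entity should not be described with two spatial objects.
            if entry["spatial_object"] != spatial_object:
                raise ValueError(f"{entity}: conflicting spatial objects")
            entry["attributes"].update(attributes)
    return master


master_data_list = compile_master_data_list(applications)
for entity in sorted(master_data_list):
    entry = master_data_list[entity]
    print(entity, "|", entry["spatial_object"], "|", ", ".join(sorted(entry["attributes"])))
```

Each entity appears once in the result, with the union of the attributes requested by all applications, which is the form the master data list takes in Appendix B.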

Appendix Table of Contents

Appendix A - GIS Application Description Forms
GIS Application Description ..... A-1
Map Display ..... A-2
Table Display ..... A-3
Data Flow Diagram ..... A-4
Entity-Relationship Diagram ..... A-5

Appendix B
Master Data List ..... B-1

Appendix C - Sample GIS Application Descriptions
Customer Phone Inquiry, Erie County Water Authority ..... C-1
Erie County Map Guide, Erie County Public Works Dept. ..... C-4
Job Training Site Selection, Erie County Social Services Dept. ..... C-5

Appendix D
Data Flow Diagramming ..... D-1

Appendix E
List of Application Name, Type, & Frequency ..... E-1
Application Descriptions ..... E-2
Master Data List ..... E-17
Summary Table of Depts. & Counts of Application Type ..... E-21
Summary Table of Depts. & Annual Frequencies of Application Type ..... E-22

GIS Application Description

Name of Government Geographic Information System Requirements Analysis


Application Identification #: Application Name: Department: Defined by: Purpose and Description:

Type of Application: Display Query Query & Display Map Analysis Spatial Model

Display/Map Scale: Query Key: Response Time: Frequency:

Data Required: Features (entities):

Attributes:

Prepared by:

Approved by:

Date:

A-1

Map Display
Name of Government Geographic Information System Requirements Analysis
Application Identification #: Application Name: Department: Defined by:

Graphical Output Sample:

Screen:

Hard Copy:

Symbols/Legend
Prepared by: Approved by: Date:

A-2

Table Display
Name of Government Geographic Information System Requirements Analysis

Application Identification #: Application Name: Department: Defined by:

Report Layout/Format:

Screen:

Hard Copy:

HEADINGS SUB-HEADINGS

SUB-TOTALS/TOTALS:

Prepared by:

Approved by:

Date:

A-3

Data Flow Diagram


Name of Government Geographic Information System Requirements Analysis

Application Identification #: Application Name: Department: Defined by:

Process Description: Data Flow Diagram or Flow Chart

Prepared by:

Approved by:

Date:

A-4

Entity-Relationship Diagram
Name of Government Geographic Information System Requirements Analysis

Application Identification #: Application Name: Department: Defined by:

Data Description: Entity - Relationship Diagram

1 1 1 N

Prepared by:

Approved by:

Date:

A-5

Master Data List

Entity | Attributes | Spatial Object
Street_segment | name, address_range | Line
Street_intersection | street_names | Line
Parcel | section_block_lot#, owner_name, owner_address, site_address, area, depth, front_footage, assessed_value, last_sale_date, last_sale_price, size (owner_name, owner_address, assessed_value as of previous January 1st) | Polygon
Building | building_ID, date_built, building_material, building_assessed_value | Footprint
Occupancy | occupant_name, occupant_address, occupancy_type_code | None
Street_segment | name, type, width, length, pavement_type | Polygon
Street_intersection | length, width, traffic_flow_conditions, intersecting_streets | Polygon
Water_main | type, size, material, installation_date | Line
Valve | type, installation_date | Node
Hydrant | type, installation_date, pressure, last_pressure_test_date | Node
Service | name, address, type, invalid_indicator | None
Soil | Soil_code, area | Polygon
Wetland | wetland_code, area | Polygon
Floodplain | flood_code, area | Polygon
Traffic_zone | zone_ID#, area | Polygon
Census_tract | tract#, population | Polygon
Water_district | name, ID_number | Polygon
Zoning | zoning_code, area | Polygon

B-1

ECWA Geographic Information System


Erie County Water Authority Geographic Information System Requirements Analysis

Application Identification #: 19
Application Name: Customer Phone Inquiry
Department: Dispatch
Defined by: T. May

Purpose and Description: To respond to phone inquiries of: 1) "no water;" or 2) to take requests for service. (Reference Donohue #18)

Type of Application: Display Query Query & Display XX Map Analysis Spatial Model

Display/Map Scale: 1" = 100' Query Key: Address Response Time: 10 seconds Frequency: xx/day

Data Required:

Features (entities) | Attributes
ROW | Location (boundary), street name, street_address_range
Pipe | Location (line), size
Parcel | Location (boundary), address
Building | Location (footprint)
Services | Location (parcel), address, name, status
Projects | Location (boundary), current_in_progress
Work_orders | Location (parcel_by_address), current, date, work_order_number


Prepared by: Approved by:

Date:

C-1

ECWA Geographic Information System

Erie County Water Authority Geographic Information System Requirements Analysis


Application Identification #: 19
Application Name: Customer Phone Inquiry
Department: Dispatch
Defined by: T. May

Graphical Output Sample:

Screen: XX

Hard Copy:

[Sample map display not reproduced: numbered parcels along 51 ST AVE and Maple Road.]

Symbols/Legend: Parcel, ROW, Water Main, Service Connection, Building

Prepared by:

Approved by:

Date:

C-2

ECWA Geographic Information System

Erie County Water Authority Geographic Information System Requirements Analysis

Application Identification #: 19
Application Name: Customer Phone Inquiry
Department: Dispatch
Defined by: T. May

Report Layout/Format:

Screen: XX

Hard Copy:

Customer Service Inquiry

Name: J.J. Jones
Address: 1551 51st Ave.
Status: Active
Pipe Size: 4 in.
Project: None
Work Order: None

Prepared by:

Approved by:

Date:

C-3

Erie County Geographic Information System

County of Erie Geographic Information System Requirements Analysis


Application Identification #: 1
Application Name: Erie County Map Guide
Department: Public Works
Defined by: Roger Fik

Purpose and Description: To provide a general multi-purpose map of Erie County for public use.

Type of Application: Display X Query Query & Display Map Analysis Spatial Model

Display/Map Scale: 1"=2 miles Query Key: N/A Response Time: 5 to 10 minutes Frequency: Yearly publication

Data Required:

Features (entities) | Attributes
County Boundary | Location, Name (Line)
Townships | Location, Name (Polygon)
Cities | Location, Name (Polygon)
Villages | Location, Name (Polygon)
Communities | Location, Identifier (Node)
County Roads | Location, Name (Line)
City, Town, & Village Roads | Location, Name (Line)
State Highways | Location, Name (Line)
Interstate, State Thruway, & Expressways | Location, Name (Line)
Interstate Route Numbers | Route Identifier, Location
State Route Numbers | Route Identifier, Location
US Route Numbers | Route Identifier, Location
Reservations | Location, Name (Polygon)
State Parks | Location, Name (Polygon)
County Parks | Location, Name (Polygon)
County Forests | Location, Name (Polygon)
Streams, Rivers, & Creeks | Location, Name (Line)
Water Bodies | Location, Name (Polygon)
Airports | Location, Name (Node)
County Jurisdiction Designation | Location, Name (Node)

Prepared by: J. Volpe

Approved by:

Date: 3/15/94

C-4

Erie County Geographic Information Systems


County of Erie Geographic Information System Requirements Analysis

Application Identification #: 48
Application Name: Job Training Site Location
Department: Social Services
Defined by: Jeff Embury (C/O Jim Kubacki)

Purpose and Description: To provide trainees with an adequate training site while minimizing the distance they must travel to reach that site.

Type of Application: Display Query Query & Display X Map Analysis Spatial Model

Display/Map Scale: Multiple Query Key: Trainee Address Response Time: < 1 Minute Frequency: Daily

Data Required:

Features (entities) | Attributes
ROADS | XY_Location, Name, Address_Range
TRAINEE | Trainee_Address, Trainee_Name
TRAINING SITE | Site_Name, Site_Address, Site_Phone #
SUBWAY | XY_Location, Subway_Stop_Name, Subway_Stop_Location
BUS ROUTES | XY_Location, Busroute_Number, Bus_Stop_Name, Bus_Stop_Location

Prepared by:

Approved by:

Date:

C-5

Erie County Geographic Information System

County of Erie Geographic Information System Requirements Analysis


Application Identification #: 48
Application Name: Job Training Site Location
Department: Social Services
Defined by: Jeff Embury/Jim Kubacki

Graphical Output Sample:

Screen:

Hard Copy:

Symbols/Legend: Trainee, Job Training Site, Road, Bus, Subway

Prepared by: Eric Covino

Approved by:

Date: 5/5/94

C-6

Data Flow Diagram


Name of Government Geographic Information System Requirements Analysis

Application Identification #: 48
Application Name: Job Training Site Location
Department: Social Services
Defined by: Jeff Embury (C/O Jim Kubacki)

Process Description: Data Flow Diagram or Flow Chart


Start

Dept. of Social Services

Query Trainee Address

Overlay Training Sites

Choose Training Site

File of Trainee Placement

Prepared by:

Approved by:

Date:

C-7

Data Flow Diagram


Name of Government Geographic Information System Requirements Analysis
Application Identification #: 48
Application Name: Job Training Site Location
Department: Social Services
Defined by: Jeff Embury (C/O Jim Kubacki)

Process Description: Data Flow Diagram or Flow Chart

[Entity-relationship diagram not reproduced. Recoverable labels:
BUS ROUTE (Line; T, C): Bus Route No., Bus Stop Names, Bus Stop Coordinates, XY Location
ROAD (Line; T, C): XY Location, Name, Address Range
TRAINEE (Point; T, C): Trainee Address, Trainee Name
JOB SITE (Point; T, C): Site Address, Site Phone #
SUBWAY (Line; T, C): XY Location, Subway Stop Name, Subway Stop Coordinates
Relationships: On (twice); Address Match (twice)]

Prepared by:

Approved by:

Date:

C-8

Data Flow Diagramming


Data flow diagrams offer a standardized method of portraying the processes, data stores, and participants that make up a logical activity (a potential GIS application). Four symbols are used in a data flow diagram:

A square represents people, organizations, things, or sources or destinations of data or information.

A cylindrical shape represents a process or activity.

An open rectangle represents a data store, to which data can be added or from which data can be removed.

An arrow represents a data flow. Arrows can be annotated as necessary to describe the nature or content of the flow.
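The four symbols can also be written down as a small data structure, which makes a diagram easy to check. The sketch below is an assumption-laden illustration, not part of the guideline: it encodes the Job Training Site Location flow chart from page C-7, the classification of each box as an external party, process, or data store is inferred, the flow labels are hypothetical, and the rule that every flow must touch a process is a common data flow diagramming convention rather than something stated here.

```python
from dataclasses import dataclass

VALID_KINDS = {"external", "process", "store"}


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # "external" (square), "process", or "store" (open rectangle)

    def __post_init__(self):
        if self.kind not in VALID_KINDS:
            raise ValueError(f"unknown node kind: {self.kind}")


@dataclass(frozen=True)
class Flow:
    """An arrow between two nodes, optionally annotated with its content."""
    source: Node
    target: Node
    label: str = ""


def invalid_flows(flows):
    """Return flows that do not start or end at a process (assumed convention)."""
    return [f for f in flows if "process" not in (f.source.kind, f.target.kind)]


# Boxes from page C-7; their kinds are inferred, not stated in the source.
dept = Node("Dept. of Social Services", "external")
query = Node("Query Trainee Address", "process")
overlay = Node("Overlay Training Sites", "process")
choose = Node("Choose Training Site", "process")
placement = Node("File of Trainee Placement", "store")

flows = [
    Flow(dept, query, "trainee address"),      # labels are hypothetical
    Flow(query, overlay, "trainee location"),
    Flow(overlay, choose, "candidate sites"),
    Flow(choose, placement, "selected site"),
]

print(len(invalid_flows(flows)))
```

A flow drawn directly from the department to the placement file would be reported by `invalid_flows`, since no process would be shown transforming the data.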

D-1

List of Application Name, Type, and Frequency


Appl# | Application Name | Type | Frequency
11 | Subdivision Development Map | Query & Display | 1 per month
12 | Counter Query Map | Query | 50 per day
13 | Land Use/Land Value Map | Display | 1 per year
14 | Assessed Value Map | Query & Display | 3 per year
15 | Grievance Map | Query & Display | 1650 per year
16 | Comparable Property Map | Query & Display | 1 per month
17 | Built/Vacant Map | Display | 1 per year
19 | Sanitary Sewer Line Map | Query & Display | 2 per week
28 | Public Improvement Map | Query & Display | 10 per week
29 | Total Committed Flow Map | Spatial Model | 20 per week
36 | Storm Sewer Map | Display | 10 per day
37 | Youth League and Residency Check Map | Query | 1500 per year
41 | Optimal Snow Removal Route Map | Spatial Model | 10 per month
63 | Population Density Map | Browse | 50 per year
70 | Population Projection | Spatial Model | 4 per year

E-1

Geographic Information System Requirements Analysis


Application Identification #: 11
Application Name: Subdivision Development Map
Department: Assessor
Defined by: H. Williams

Purpose and Description: To monitor the progress of development of an approved subdivision (how many lots are built and the rate of the building).

Type of Application: Query & Display

Display/Map Scale: 100;200;400
Response Time:
Frequency: 1 per month

Data Required:

Feature | Spatial Object | Attribute
Parcel | Polygon | location
Street (double line) | Polygon | name, location
Subdivision | Polygon | name, location

Prepared by:

Approved by:

Date: 03-June-96

E-2

Geographic Information System Requirements Analysis


Application Identification #: 12
Application Name: Counter Query Map
Department: Assessor
Defined by: H. Williams

Purpose and Description: To provide a quick query of one or more parcels and the associated parcel data (mostly ARLM file data) for answering inquiries at the counter or over the telephone.

Type of Application: Query

Display/Map Scale: 50;100;200;400
Response Time:
Frequency: 1 per month

Data Required:

Feature | Spatial Object | Attribute
Building | Polygon | assessed value, building #
Parcel | Polygon | subdivision lot, SBL #, location
Street (center line) | Line | length, address range, name, location

Prepared by:

Approved by:

Date: 03-June-96

E-3

Geographic Information System Requirements Analysis


Application Identification #: 13
Application Name: Land Use/Land Value Map
Department: Assessor
Defined by: H. Williams

Purpose and Description: To produce a display of the value of land per square foot and/or front footage by land use type.

Type of Application: Display

Display/Map Scale: 200;400
Response Time: 1 week
Frequency: 1 per year

Data Required:

Feature | Spatial Object | Attribute
Parcel | Polygon | depth, front footage, size, last sale price, land use code, location
Street (center line) | Line | name, location

Prepared by:

Approved by:

Date: 03-June-96

E-4

Geographic Information System Requirements Analysis


Application Identification #: 14
Application Name: Assessed Value Map
Department: Assessor
Defined by: H. Williams

Purpose and Description: To produce a map showing the assessed values (by range) for a small area; or for designated neighborhoods.

Type of Application: Query & Display

Display/Map Scale: 400
Response Time: Interactive
Frequency: 3 per year

Data Required:

Feature | Spatial Object | Attribute
Neighborhood | Polygon | name, location
Parcel | Polygon | assessed value, location
Street (double line) | Polygon | name, location

Prepared by:

Approved by:

Date: 03-June-96

E-5

Geographic Information System Requirements Analysis


Application Identification #: 15
Application Name: Grievance Map
Department: Assessor
Defined by: H. Williams

Purpose and Description: To show assessed values of properties in the same area as a parcel where a grievance is filed.

Type of Application: Query & Display

Display/Map Scale: 100;200;400
Response Time: interactive
Frequency: 1650 per year

Data Required:

Feature | Spatial Object | Attribute
Parcel | Polygon | assessed value, location
Street (double line) | Polygon | name, location

Prepared by:

Approved by:

Date: 03-June-96

E-6

Geographic Information System Requirements Analysis


Application Identification #: 16
Application Name: Comparable Property Map
Department: Assessor
Defined by: H. Williams

Purpose and Description: Show the comparable properties selected to determine the assessed value of a given property.

Type of Application: Query & Display

Display/Map Scale: 100;200;400
Response Time:
Frequency: 1 per month

Data Required:

Feature | Spatial Object | Attribute
Parcel | Polygon | address, assessed value, location
Street (double line) | Polygon | name, location

Prepared by:

Approved by:

Date: 03-June-96

E-7

Geographic Information System Requirements Analysis


Application Identification #: 17
Application Name: Built/Vacant Map
Department: Assessor
Defined by: H. Williams

Purpose and Description: To display the built and vacant parcels.

Type of Application: Display

Display/Map Scale: 400
Response Time: Interactive
Frequency: 1 per year

Data Required:

Feature | Spatial Object | Attribute
Occupancy | Node | occupant type code, occupant address, occupant name
Parcel | Polygon | built/vacant code, location
Street (double line) | Polygon | name, location

Prepared by:

Approved by:

Date: 03-June-96

E-8

Geographic Information System Requirements Analysis


Application Identification #: 19
Application Name: Sanitary Sewer Line Map
Department: Assessor
Defined by: H. Williams

Purpose and Description: To show the location of sanitary sewer lines for the purpose of approving digging activities.

Type of Application: Query & Display

Display/Map Scale: 50;1000
Response Time: 5 min.
Frequency: 2 per week

Data Required:

Feature | Spatial Object | Attribute
Building footprint | Polygon | business name, building name, address
Manhole | Node | depth, location
Sanitary sewer line | Line | location
Sidewalk | Line | location
Storm sewer line | Line | location
Street (double line) | Polygon | name, location, address range
Wye hook ups (new only) | Node | distance from manholes, location

Prepared by:

Approved by:

Date: 03-June-96

E-9

Geographic Information System Requirements Analysis


Application Identification #: 28
Application Name: Public Improvement Map
Department: Engineering
Defined by: P. Bowers

Purpose and Description: To show facilities near a certain parcel for review of a public improvement permit or site plan.

Type of Application: Query & Display

Display/Map Scale: 100;200
Response Time: 30 sec
Frequency: 10 per week

Data Required:

Feature | Spatial Object | Attribute
Parcel | Polygon | location
Sanitary sewer | Line | location
Storm drainage | Line | location
Street (double line) | Polygon | curb location, pavement type, location
Water main | Line | installation date, material, size, type, location

Prepared by:

Approved by:

Date: 03-June-96

E-10

Geographic Information System Requirements Analysis


Application Identification #: 29
Application Name: Total Committed Flow Map
Department: Engineer
Defined by: P. Bowers

Purpose and Description: To keep track of the total committed flow of sanitary and storm sewers.

Type of Application: Spatial Model

Display/Map Scale: 1000
Response Time: 1 min
Frequency: 20 per week

Data Required:

Feature | Spatial Object | Attribute
Detention pond | Polygon | capacity, size, location
Ditches | Polygon | capacity, size, location
Monitoring point | Node | location
Sanitary sewer line | Line | capacity, size, location
Storm sewer line | Line | capacity, size, location
Street (center line) | Line | location

Prepared by:

Approved by:

Date: 03-June-96

E-11

Geographic Information System Requirements Analysis


Application Identification #: 36
Application Name: Storm Sewer Map
Department: Building Department
Defined by: T. Ketchum

Purpose and Description: To display the location of storm sewers.

Type of Application: Display

Display/Map Scale: 100;200
Response Time: 12 sec
Frequency: 10 per day

Data Required:

Feature | Spatial Object | Attribute
Contours | Line | location, elevation
Easement | Polygon | location, type
Manhole | Node | location, invert elevation, rim/surface elevation
Parcel | Polygon | location
Storm sewer line | Line | location
Street (double line) | Polygon | name, location

Prepared by:

Approved by:

Date: 03-June-96

E-12

Geographic Information System Requirements Analysis


Application Identification #: 37
Application Name: Youth League and Residency Check Map
Department: Recreation Department
Defined by: J. Bloom

Purpose and Description: To determine the appropriate league for a resident (by parcel) and discover non-resident applications.

Type of Application: Query

Display/Map Scale: 1000
Response Time: 30 sec
Frequency: 1500 per year

Data Required:

Feature | Spatial Object | Attribute
League | Polygon | type, location
Parcel | Polygon | owner address, owner name, land use, address, location

Prepared by:

Approved by:

Date: 03-June-96

E-13

Geographic Information System Requirements Analysis


Application Identification #: 41
Application Name: Optimal Snow Removal Route Map
Department: Highway Department
Defined by: F. Jurgens

Purpose and Description: To calculate the most efficient routes for snow removal and salting.

Type of Application: Spatial Model

Display/Map Scale: 1000
Response Time: 1 week
Frequency: 10 per month

Data Required:

Feature | Spatial Object | Attribute
Street (center line) | Line | length, class, width, location
Street intersections | Node | traffic flow conditions, street names
Traffic zone | Polygon | area, zone code

Prepared by:

Approved by:

Date: 03-June-96

E-14

Geographic Information System Requirements Analysis


Application Identification #: 63
Application Name: Population Density Map
Department: Planning
Defined by: G. Black

Purpose and Description: To browse population density by census tract, block group, or block.

Type of Application: Browse

Display/Map Scale: variable: 200 to 1000
Response Time: Interactive
Frequency: 1 per year

Data Required:

Feature | Spatial Object | Attribute
Census Block | Polygon | block #, population total, location
Census tract | Polygon | tract #, location
Parcel | Polygon | area, land use code, location
Street (center line) | Line | name, location

Prepared by:

Approved by:

Date: 03-June-96

E-15

Geographic Information System Requirements Analysis


Application Identification #: 70
Application Name: Population Projection
Department: Planning
Defined by: C. Brown

Purpose and Description: To estimate future population of the Town, by small area (census tract, block group, and possibly block).

Type of Application: Spatial Model

Display/Map Scale: 1000
Response Time: 1 day
Frequency: 4 per year

Data Required:

Feature | Spatial Object | Attribute
Census Block | Polygon | block #, size, location
Census tract | Polygon | tract #, size, location
Net migration | None | application #69
Wetland | Polygon | area, wetland code
Zoning | Polygon | area, zoning code

Prepared by:

Approved by:

Date: 03-June-96

E-16

Master Data List

Feature | Spatial Object | Attribute
Building | Polygon | assessed value, building #
Building footprint | Polygon | address, building name, business name
Census Block | Polygon | block #, location, population total, size
Census tract | Polygon | location, size, tract #
Contours | Line | elevation, location
Detention pond | Polygon | capacity, location, size
Ditches | Polygon | capacity, location, size
Easement | Polygon | location, type
League | Polygon | location, type
Manhole | Node | depth, invert elevation, location, rim/surface elevation
Monitoring point | Node | location

E-17

Master Data List Cont'd

Feature | Spatial Object | Attribute
Neighborhood | Polygon | location, name
Net migration | None | application #69
Occupancy | Node | occupant address, occupant name, occupant type code
Parcel | Polygon | address, area, assessed value, built/vacant code, depth, front footage, land use, land use code, last sale price, location, owner address, owner name, SBL #, size, subdivision lot
Sanitary sewer line | Line | capacity, location, size
Sidewalk | Line | location
Storm drainage | Line | location
Storm sewer line | Line | capacity, location, size

E-18

Master Data List Cont'd

Feature | Spatial Object | Attribute
Street (center line) | Line | class, length, location, name, width, address range
Street (double line) | Polygon | address range, curb location, location, name, pavement type
Street intersections | Node | street names, traffic flow conditions
Subdivision | Polygon | boundary, name
Traffic zone | Polygon | area, zone code
Water main | Line | installation date, location, material, size, type
Wetland | Polygon | area, wetland code
Wye hook ups (new only) | Node | distance from manholes, location
Zoning | Polygon | area, zoning code

E-19

GIS Application by Department by Type

Type | Assessor | Building Dept. | Engineering | Highway Dept. | Planning | Recreation Dept. | Total
Browse | 0 | 0 | 0 | 0 | 1 | 0 | 1
Display | 2 | 1 | 0 | 0 | 0 | 0 | 3
Query | 1 | 0 | 0 | 0 | 0 | 1 | 2
Query & Display | 4 | 0 | 2 | 0 | 0 | 0 | 6
Spatial Model | 0 | 0 | 1 | 1 | 1 | 0 | 3
Total | 7 | 1 | 3 | 1 | 2 | 1 | 15

GIS Application by Department by Frequency

Type | Assessor | Building Dept. | Engineering | Highway Dept. | Planning | Recreation Dept. | Total
Browse | 0 | 0 | 0 | 0 | 50 | 0 | 50
Display | 2 | 3650 | 0 | 0 | 0 | 0 | 3652
Query | 18250 | 0 | 0 | 0 | 0 | 1500 | 19750
Query & Display | 1677 | 0 | 624 | 0 | 0 | 0 | 2301
Spatial Model | 0 | 0 | 1040 | 120 | 4 | 0 | 1164
Total | 19929 | 3650 | 1664 | 120 | 54 | 1500 | 26917
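The annual figures in the frequency table can be reproduced from the list on page E-1. The sketch below is an illustration under stated assumptions: the conversion factors (365 days, 52 weeks, 12 months per year) are not given in the guide but are the ones that reproduce the printed totals, and application 19 is placed under Engineering because the summary tables count it there, although its description form lists the Assessor.

```python
from collections import defaultdict

# Assumed conversion factors; they reproduce the totals printed in the table.
PERIODS_PER_YEAR = {"day": 365, "week": 52, "month": 12, "year": 1}

# (application #, department, type, count, period), taken from page E-1.
# Application 19 is assigned to Engineering to match the summary tables.
APPLICATIONS = [
    (11, "Assessor", "Query & Display", 1, "month"),
    (12, "Assessor", "Query", 50, "day"),
    (13, "Assessor", "Display", 1, "year"),
    (14, "Assessor", "Query & Display", 3, "year"),
    (15, "Assessor", "Query & Display", 1650, "year"),
    (16, "Assessor", "Query & Display", 1, "month"),
    (17, "Assessor", "Display", 1, "year"),
    (19, "Engineering", "Query & Display", 2, "week"),
    (28, "Engineering", "Query & Display", 10, "week"),
    (29, "Engineering", "Spatial Model", 20, "week"),
    (36, "Building Dept.", "Display", 10, "day"),
    (37, "Recreation Dept.", "Query", 1500, "year"),
    (41, "Highway Dept.", "Spatial Model", 10, "month"),
    (63, "Planning", "Browse", 50, "year"),
    (70, "Planning", "Spatial Model", 4, "year"),
]


def annual_frequency_table(applications):
    """Sum annualized frequencies by (department, application type)."""
    table = defaultdict(int)
    for _app_id, department, app_type, count, period in applications:
        table[(department, app_type)] += count * PERIODS_PER_YEAR[period]
    return dict(table)


table = annual_frequency_table(APPLICATIONS)
grand_total = sum(table.values())
print(table[("Assessor", "Query")], grand_total)  # prints: 18250 26917
```

Estimates of this kind feed the usage figures used later when the size of the GIS is scoped.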

E-20

GIS DEVELOPMENT GUIDE: CONCEPTUAL DESIGN OF THE GIS

86 GIS Development Guide

PART 1 - DATA MODELING

1 INTRODUCTION
This guide describes data modeling in general, spatial data modeling in particular, and the setting of GIS specifications, and it introduces spatial data and metadata standards. These activities are collectively called conceptual design of the GIS (Figure 1). This activity takes the information developed during the Needs Assessment and places it in a structured format. The result of this activity will be a GIS data model and functional specifications for the GIS.

[Figure 1 diagram not reproduced. Boxes: Needs Assessment; Conceptual Design; Available Data Survey; H/W & S/W Survey; Database Planning and Design; Database Construction; Pilot/Benchmark; Acquisition of GIS Hardware and Software; GIS System Integration; Application Development; GIS Use and Database Maintenance.]

Figure 1 - GIS Development Process

Conceptual design is the first step in database design, where the contents of the intended database are identified and described. Database design is usually divided into three major activities:

Conceptual data modeling: identify data content and describe data at an abstract, or conceptual, level. This step is intended to describe what the GIS must do and does not deal with how the GIS will be implemented; the "how" question is the subject of logical and physical database design.

Logical database design: translation of the conceptual database model into the data model of a specific software system.

Physical database design: representation of the logical data model in the schema of the software.

[Figure 2 diagram not reproduced. Boxes: Data Objects Identified During Needs Assessment; Preparation of Data Model; Source Documents: Maps, Images, Air Photos, etc.; Survey and Evaluation of Available Data; Match Needed Data to Available Data and Sources; Prepare Detailed Database Plan; Create Initial Metadata; Map and Tabular Data Conversion; Add Record Retention Schedules to Metadata; Database QA/QC Editing; GIS Database; Continuing GIS Database Maintenance; Database Backups; Archives.]

Figure 2 - Life Cycle of a GIS Database

The conceptual design of the GIS is primarily an exercise in database design. Database planning is the single most important activity in GIS development. It begins with the identification of the needed data and goes on to cover several other activities collectively termed the data life cycle: identification of data in the needs assessment, inclusion of the data in the data model, creation of the metadata, collection and entry into the database, updating and maintenance, and, finally, retention according to the appropriate record retention schedule (Figure 2). A complete data plan facilitates all phases of data collection, maintenance, and retention. Because everything is considered in advance, data issues do not become major problems that must be addressed after the fact with considerable difficulty and aggravation.

The conceptual design of the GIS also includes identification of the basic GIS architecture (functions of hardware and GIS software), estimates of usage (derived from the needs assessment), and scoping the size of the GIS. All of this is done with reference to the existing data processing environments (legacy systems) that must interface with the GIS.

Preparing A GIS Data Model

A data model is a formal definition of the data required in a GIS. The data model can take one of several forms; the two used in this guideline are a structured list and an entity-relationship diagram. The purpose of the data model, and of the process of specifying the model, is to ensure that the data have been identified and described in a completely rigorous and unambiguous fashion and that both the user and the GIS analyst agree on the data definitions. The data model is then the formal specification for the entities, their attributes, and all relationships between the entities for the GIS.

Building a data model is not necessarily an easy task. Most professionals in local government will not have had experience in this task. The GIS analyst of the project is the individual who should either build the data model or acquire assistance, such as a qualified consultant, to complete this task.
If the opportunity exists for the GIS analyst to attend a database design course or seminar, this would enhance the analyst's ability to build the model and, more importantly, provide the knowledge needed to use the final data model in building the GIS. To the extent that data models prepared for other local governments match the needs of a particular GIS development program, or can be easily adapted, they can be modified for use as the data model. However, the GIS analyst must have a good understanding of the resulting model and how it is used to build and manage the GIS database.

The next sections of the guideline first discuss the nature of geographic data, then present the methodology used for data modeling, and lastly describe the development of a GIS data model from the information collected during the Needs Assessment. The example provided in the last section is an actual sample local government GIS data model and is suitable for direct use, with appropriate modification to specific situations.

NATURE OF GEOGRAPHIC DATA

Conceptual Design of the GIS 89 Geographic data describe entities which have a location. The geographic data includes the locationinformation and other information about the entity of interest. This other information will bereferred to as attributes of the entities. Historically several terms have been used to describe thedata in a GIS database, among them features, objects, or entities. The term feature derives fromcartography and is commonly used to identify "features shown on a map," while entity and object are terms from computer science used to identify the elements in a database. The normal dictionarydefinitions of these terms are: Object : a thing that can be seen or touched; material thing that occupies space Entity : a thing that has definite, individual existence in reality Feature : the make, shape, form or appearance of a person or thing A good GIS database design methodology requires the use of terms in a clear an unambiguousmanner. This guideline will use the term entity to represent objects or things to be included in thedatabase and attribute will be the term for representing the characteristics or measurements to berecorded for the entities. Other terms have commonly been used to describe the organization ofentities and attributes in a GIS, such as layer, coverage, base map, theme, and others. Each of thesewill usually refer to a collection of one or more entities organized in some useful way which isspecific to the GIS software in use. These terms will become important during the logical/physicaldatabase design activities where decisions about how the GIS data are to be stored in the GISdatabase are made. The conceptual database design activity is focused solely on specifying whatis to be included in the GIS database and should provide clear and unambiguous representation ofthe entire GIS database. In addition to a clear and concise definition of entities and their attributes, data modeling describes relationships between entities . 
An example of a relationship between an employee and a company would be "works for":

Employee - Works For - Company

Relationships may be bi-directional, thus:

Company - Has - Employees

An important aspect of a relationship is "cardinality," that is, whether the relationship is between only one of each entity or whether either entity may be more than one. For example, one company usually has many employees, whereas one employee works for only one company. The possible cardinalities are: one-to-one, one-to-many, and many-to-many. Thus:

                   --- Has --->
Company (One)                          (Many) Employees
                 <--- Work For ---

There are many variations of the notation used to express these facts. The notation recommended for local government will be described later.
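The company and employee example above can be written down as a small data structure. The following is only a minimal sketch: the `Relationship` class, its field names, and the cardinality codes ("1:1", "1:N", "N:1", "M:N") are illustrative assumptions, not notation defined by this guideline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A named relationship read from a source entity to a target entity."""
    source: str
    verb: str
    target: str
    cardinality: str  # "1:1", "1:N", "N:1" or "M:N", read from source to target

    def inverse(self, verb: str) -> "Relationship":
        """Return the same relationship read in the opposite direction."""
        flipped = {"1:1": "1:1", "1:N": "N:1", "N:1": "1:N", "M:N": "M:N"}
        return Relationship(self.target, verb, self.source, flipped[self.cardinality])


# One company has many employees ...
has = Relationship("Company", "has", "Employee", "1:N")
# ... so, read the other way, many employees work for one company.
works_for = has.inverse("works for")

print(works_for)
```

Reading the relationship in both directions from one stored fact mirrors the bi-directional statement in the text; a many-to-many relationship is unchanged when inverted.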

Geographic, or spatial, data differ from other "regular" data that are included in computer databases in how entities are defined and in the relationships between entities. Entity identification for spatial data includes the definition of a physical or abstract entity (e.g., a building) and the definition of a corresponding spatial entity (e.g., a polygon to represent the building footprint). This second entity does not exist for other types of computer databases. The existence of the corresponding spatial entity is one of the major factors that distinguishes GIS from other types of systems and is what makes it very important to utilize proper planning and design techniques when building a GIS. An example will be used to illustrate this difference.

Entity (Physical or Conceptual)            Spatial Entity
and Its Attributes                         and Attributes

Parcel (owner_name, owner_address)         Polygon (coordinates, topology)
Street_segment (name, type, width)         Line_segment (coordinates, topology)
Building (date_built, assessed_value)      Footprint (coordinates)
Soil-type (soil_code, area)                Polygon (coordinates, topology)
Landuse_area (land_use, code, area)        Grid_cell (coordinates)

Figure 3 - Examples of Physical or Conceptual Entities and Their Corresponding Spatial Entities
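The pairing shown in Figure 3 (a regular entity with attributes, plus a corresponding spatial entity) could be represented roughly as follows. This is a sketch under stated assumptions: the class names, the `has_topology` flag, and all coordinate and attribute values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SpatialEntity:
    """The geometric counterpart of a regular entity."""
    kind: str  # e.g. "polygon", "line_segment", "footprint", "grid_cell"
    coordinates: List[Tuple[float, float]] = field(default_factory=list)
    has_topology: bool = False


@dataclass
class Entity:
    """A regular entity, its attributes, and (optionally) its spatial entity."""
    name: str
    attributes: dict
    spatial: Optional[SpatialEntity] = None  # None when there is no spatial entity


def describe(entity: Entity) -> str:
    """Summarize which spatial entity, if any, corresponds to the entity."""
    if entity.spatial is None:
        return f"{entity.name}: no spatial entity"
    topo = "with topology" if entity.spatial.has_topology else "without topology"
    return f"{entity.name}: {entity.spatial.kind} ({topo})"


# Illustrative values only.
parcel = Entity(
    "Parcel",
    {"owner_name": "Apex Co.", "owner_address": None},
    SpatialEntity("polygon", [(0.0, 0.0), (40.0, 0.0), (40.0, 30.0), (0.0, 30.0)], True),
)
building = Entity(
    "Building",
    {"date_built": None, "assessed_value": None},
    SpatialEntity("footprint", [(5.0, 5.0), (20.0, 5.0), (20.0, 15.0), (5.0, 15.0)]),
)

print(describe(parcel))
print(describe(building))
```

Keeping the spatial entity as a field of the regular entity, rather than as a separate record, reflects the point that there is one entity with a spatial representation and not two independent objects.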

ENTITY-RELATIONSHIP (E-R) DATA MODELING

To start the discussion of entity-relationship modeling, two examples will be shown: first a regular database, and second a simple GIS database. The personnel database in any local government could have entities of employee, dependent and department. Relationships between these entities would be employee "works in" department and dependent "is a member of" the employee's family. Some of the attributes for each entity would be as follows:

Employee (name, age, sex, job title)
Dependent (name, age, relationship_to_employee, i.e., spouse, child, etc.)
Department (department_name, function, size)

A diagrammatic representation of the example would be as follows:

[Figure 4 diagram: Employee (Name, Age, Sex, Job Title) - Works in - Department (Name, Function, Size); Employee - Has - Dependents (Name, Age, Relationship to employee)]

Figure 4 - Example of a Firm's Database

An example of a simple spatial database would be as follows:

Entity      Attributes                                      Spatial Entity
Parcel      ID#, owner_name, owner_address, site_address    Polygon
Building    Building_name, height, floor_area               Footprint
Occupant    Occupant_name, unit_number                      None

The diagrammatic form of this spatial database would be as follows:


[Figure 5 diagram: Building (Building_Name, Height, Floor_area) - Located on - Parcel (ID #, Owner_Name, Owner_Address, Situs_Address); Building - Has - Occupant (Occupant_Name, Unit_Number)]

Figure 5 - Example of a Simple GIS Database

This example has been presented using two standard notational forms for conceptual database design: a relation (the entity name followed by a list of attributes) and an entity-relationship diagram showing entities, their attributes, and the relationships between entities. In Figure 5, there are two things to notice:

- The standard entity-relationship diagram has no provision for representing the corresponding spatial entity (point, line, polygon) of the data; and
- The representation of the attributes (ellipses) can be somewhat awkward due to different name lengths and the number of attributes to be shown.

These two notational forms will be used as the primary tools for conceptual database design in this guideline; however, modifications will be made to adequately represent GIS data. The next section will provide the formal definition of the basic entity-relationship data modeling method and the modifications needed to represent GIS data, followed by examples of GIS data entities and attributes typical for local government and then by a description of how to model these data using the modified entity-relationship data modeling technique.

Basic Entity-Relationship Modeling

The basic entity-relationship modeling approach is based on describing data in terms of the three parts noted above (Chen 1976):

- Entities
- Relationships between entities
- Attributes of entities or relationships

Each component has a graphic symbol, and there exists a set of rules for building a graph (i.e., an E-R model) of a database using the three basic symbols. Entities are represented as rectangles, relationships as diamonds, and attributes as ellipses. The normal relationships included in an E-R model are basically those of:

1. Belonging to;
2. Set and subset relationships;
3. Parent-child relationships; and
4. Component parts of an object.

The implementation rules for identifying entities, relationships, and attributes include an English language sentence structure analogy, where the nouns in a descriptive sentence identify entities, verbs identify relationships, and adjectives identify attributes. These rules have been defined by Chen (1983) as follows:

Rule 1: A common noun (such as person, chair) in English corresponds to an entity type in an E-R diagram.
Rule 2: A transitive verb in English corresponds to a relationship type in an E-R diagram.
Rule 3: An adjective in English corresponds to an attribute of an entity in an E-R diagram.

English statement: Mr. Joe Jones resides in the Park Avenue Apartments, which is located on land parcel #01-857-34 owned by the Apex Company.

Analysis: "Joe Jones," "Park Avenue Apartments," "land parcel" and "Apex Company" are nouns and therefore can be represented as entities "occupant," "building," "parcel," and "owner." "Resides," "located on" and "owned by" are transitive verbs (or verb phrases) and therefore define relationships.

A simple E-R diagram of this example is shown in Figure 6.
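The sentence analysis above can also be recorded as data before it is drawn. The sketch below hand-codes the result of applying the rules to the sample statement; it does not attempt any natural-language parsing, and the variable names are illustrative assumptions.

```python
# Noun phrases and verb phrases taken from the sample English statement.
statement_parts = [
    ("Joe Jones", "resides in", "Park Avenue Apartments"),
    ("Park Avenue Apartments", "located on", "land parcel #01-857-34"),
    ("land parcel #01-857-34", "owned by", "Apex Company"),
]

# Rule 1: each noun is an instance of an entity type.
entity_type = {
    "Joe Jones": "occupant",
    "Park Avenue Apartments": "building",
    "land parcel #01-857-34": "parcel",
    "Apex Company": "owner",
}

# Rule 2: each transitive verb (or verb phrase) becomes a relationship
# between the entity types of its subject and object.
relationships = [
    (entity_type[subject], verb, entity_type[obj])
    for subject, verb, obj in statement_parts
]
entities = sorted({e for subject, _, obj in relationships for e in (subject, obj)})

for subject, verb, obj in relationships:
    print(f"{subject} - {verb} - {obj}")
```

Each printed line corresponds to one rectangle-diamond-rectangle chain in the E-R diagram of the same statement.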

[Figure 6 diagram: Occupant (Name: Joe Jones) - Resides - Building; Building - Located on - Parcel (Owner_name: Apex Co.); Parcel - Owned by - Owner (Name: Apex Co.)]

Figure 6 - Simple E-R Diagram of the Previous Example

Many times it is possible to build different E-R diagrams for the same data. For example, instead of creating the entity "owner" in the above example, the owner's name could be an attribute of parcel (shaded areas of Figure 6). During the process of building an E-R diagram (i.e., the conceptual model) for a database, the analyst must make decisions as to whether something is best represented as an entity or as an attribute of some other entity.

The process of constructing an E-R diagram uncovers many inconsistencies or contradictions in the definition of entities, relationships, and attributes. Many of these are resolved as the initial E-R diagram is constructed, while others are resolved by performing a series of transformations on the diagram after its initial construction. The final E-R diagram should be totally free from definitional inconsistencies and contradictions. If properly constructed, an E-R diagram can be directly converted to the logical and physical database schema of a relational, hierarchical or network type database for implementation.

Unique Aspects of Geographic Data

In the simplest terms, we think of geographic data as existing on maps as points, lines and areas. Early GISs were designed to digitally encode these spatial objects and associate one or more feature codes with each spatial feature. Examples would be a map of land use polygons, a set of points showing well locations, or a map of a stream shown as line segments. For the purposes of plotting (redrawing the map), a simple data structure linking (x,y) coordinates to a feature code was sufficient.

Topology

A distinguishing feature of a modern GIS is that some spatial relationships between spatial entities will be coded in the database. This coding is termed topological coding. Topology is based on graph theory, where a diagram can be expressed as a set of nodes and links in a manner that shows logical relationships. Applied to a map, this concept is used to abstract the features shown on the map and to represent these features as nodes and arcs (points and lines). Nodes are the end-points of arcs, and areas are formed by a set of arcs. If the concept and definitions of topologic data structures are not familiar to the reader, the following readings are recommended:

- Geographic Information Systems: A Guide to the Technology, by John Antenucci, et al., pages 98-99.
- Fundamentals of Spatial Information Systems, by Robert Laurini and Derek Thompson, pages 206-211.
- ARC/INFO Data Model, Concepts, & Key Terms, by Environmental Systems Research Institute, Inc., pages 1-12 to 1-15.

Coordinate strings with associated feature codes but without topology were called "spaghetti" files because no relationship between any two coordinate strings was formally encoded in the database. For example, the GIS would not "know" whether two lines intersect or whether they have common end points. These relationships could be seen by the human eye if a plot were made or, alternatively, could be calculated (often a time-consuming process). Typical of this type of geographic data file are those produced by computer-aided drafting (CAD) systems, known as .dxf, .dwg, or .dgn files.
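The difference between a spaghetti file and topologically coded data can be illustrated with a small sketch. It assumes, for simplicity, that lines which meet share exactly identical end-point coordinates; the line identifiers, coordinates, and function names are made up for illustration and do not describe any particular GIS package.

```python
from collections import defaultdict

# Three "spaghetti" coordinate strings: nothing here says which lines meet.
lines = {
    "a": [(0, 0), (5, 0)],
    "b": [(5, 0), (5, 4)],
    "c": [(9, 9), (12, 9)],
}


def build_arc_node(coordinate_strings):
    """Derive nodes (arc end-points) and record which arcs meet at each node."""
    arcs_at_node = defaultdict(set)
    for arc_id, coords in coordinate_strings.items():
        arcs_at_node[coords[0]].add(arc_id)   # from-node
        arcs_at_node[coords[-1]].add(arc_id)  # to-node
    return arcs_at_node


def connected(arcs_at_node, arc1, arc2):
    """True if the two arcs share a node in the stored topology."""
    return any({arc1, arc2} <= arcs for arcs in arcs_at_node.values())


topology = build_arc_node(lines)
print(connected(topology, "a", "b"))
print(connected(topology, "a", "c"))
```

Once the arc-node table has been built, connectivity becomes a lookup rather than a coordinate calculation, which is the practical benefit of encoding topology in the database.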

GEOGRAPHIC DATA MODELS

The data models in most contemporary GISs are still based on the cartographic view. Other data models have begun to evolve, but are still very limited. Current and potential geographic data models include:

- The cartographic data model: points, lines and polygons (topologically encoded) with one, or only a few, attached attributes, such as a land use layer represented as polygons with an associated land use code;
- The extended attribute geographic data model: geometric objects as above but with many attributes, such as census tract data sets;
- The conceptual object/spatial data model: explicit recognition of user-defined objects, zero or more associated spatial objects, and sets of attributes for each defined object (example: user objects of land parcel, building, and occupant, each having its own set of attributes but with different associated spatial objects: polygon for land parcel, footprint for building, and no spatial object for occupant);
- Conceptual objects/complex spatial objects: multiple objects and multiple associated spatial objects (example: a street network with street segments having spatial representations of both line and polygon type and street intersections having spatial representations of both point and polygon type).

Current GISs are based on the cartographic and extended attribute data models. The trend to object-oriented computer systems and databases will require that GIS planners view their databases from an "object viewpoint."

Spatial Relationships

GISs also differ from other systems in that they include spatial relationships. These relationships are included in the GIS either by the topologic coding or by means of calculations based on the (x,y) coordinates. One common calculation is whether or not two lines intersect. Figure 7 shows the spatial relationships, associated descriptive verbs, and the common implementation of each relationship by a GIS.

Spatial Relationship   Descriptive Verbs                Common GIS Implementation   E-R Model Symbol
Connectivity           Connect, link                    Topology                    Elongated hexagon
Contiguity             Adjacent, abut                   Topology                    Elongated hexagon
Containment            Contained, containing, within    X,Y coord. operation        Double elongated hexagon
Proximity              Closest, nearest                 X,Y coord. operation        Double elongated hexagon
Coincidence            Coincident, coterminous          X,Y coord. operation        Double elongated hexagon

Figure 7 - Spatial Relationships in a GIS

Connectivity and contiguity are implemented through topology: the link-node structure for connectivity through networks and the arc-polygon structure for contiguity. Containment and proximity are implemented through x,y coordinates and related spatial operations: containment is determined using the point-, line-, and polygon-on-polygon overlay spatial operations, and proximity is determined by calculating the coordinate distance between two or more x,y coordinate locations. The spatial relationship of coincidence may be complete coincidence or partial coincidence. The polygon-on-polygon overlay operation in ARC/INFO calculates partial coincidence of polygons in two different coverages. The System 9 Geographic Information System recognizes coincident features through a "shared primitive" concept (the geometry of a point or line is stored only once and then referenced by all features sharing that piece of geometry). Future versions of commercial GISs will likely implement coincident features through either the "belonging to" database relationship or through x,y coordinates and related spatial operations, whichever is more efficient within the particular GIS.

In summary, there are three types of relationships that will be represented in a geographic database with an "object view" orientation:

- Normal database relationships, which are represented in a relational database by means of keys (primary and secondary)
- Spatial relationships represented in the GIS portion of the database by topology
- Spatial relationships that exist only after a calculation is made on the (x,y) coordinates
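The third type of relationship, one that exists only after a calculation on the (x,y) coordinates, can be illustrated with three short functions. These are textbook computational-geometry sketches chosen for illustration (a ray-casting containment test, a straight-line distance comparison, and an orientation-based crossing test); they are assumptions about how such operations might be computed, not the algorithms used by any particular GIS product.

```python
import math


def point_in_polygon(pt, polygon):
    """Containment: ray-casting test; polygon is a list of (x, y) vertices."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray through pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def nearest(pt, candidates):
    """Proximity: key of the candidate location closest to pt."""
    return min(candidates, key=lambda k: math.dist(pt, candidates[k]))


def segments_cross(p1, p2, p3, p4):
    """True if segment p1-p2 properly crosses p3-p4 (touching is ignored)."""
    def orient(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    d1, d2 = orient(p3, p4, p1), orient(p3, p4, p2)
    d3, d4 = orient(p1, p2, p3), orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


# Made-up coordinates for illustration.
parcel = [(0, 0), (10, 0), (10, 10), (0, 10)]
hydrants = {"hydrant_1": (0, 0), "hydrant_2": (5, 5)}

print(point_in_polygon((5, 5), parcel))
print(nearest((1, 1), hydrants))
print(segments_cross((0, 0), (4, 4), (0, 4), (4, 0)))
```

None of these results is stored anywhere; each is recomputed from coordinates on demand, in contrast to connectivity and contiguity, which are read from the stored topology.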

METHODOLOGY FOR MODELING

Modeling a geographic database using the E-R approach requires an expanded or extended concept for:

- Entity identification and definition; and
- Relationship types and alternate representational forms for spatial relationships.

There are three considerations in the identification and definition of entities in a geographic database:

Correct identification and definition of entities

Entities in a geographic database are defined as either discrete objects (e.g., a building, a bridge, a household, a business, etc.) or as abstract objects defined in terms of the space they occupy (e.g., a land parcel, a timber stand, a wetland, a soil type, a contour, etc.). In each of these cases we are dealing with entities in the sense of "things" which will have attributes and which will have spatial relationships between themselves. These "things" can be thought of as "regular" entities.

Defining a corresponding spatial entity for each "regular" entity

A corresponding spatial entity will be one of the spatial data types normally handled in a GIS, e.g., a point, line, area, volumetric unit, etc. The important distinction here is that we have a single entity, its spatial representation and a set of attributes; we do not have two separate objects (Figure 3 illustrates this concept). A limited and simple set of spatial entities may be used or, depending on the anticipated complexity of the implemented geographic information system, an expanded set of spatial entities may be appropriate. The corresponding spatial entity for the regular entity may be implied in the definition of the regular entity, as with abstract entities like a wetland, where the spatial entity would normally be a polygon, or a contour, where the spatial entity would be a line. Other regular entities may have a less obvious corresponding spatial entity. Depending on the GIS requirements, the cartographic display needs, the implicit map scale of the database and other factors, an entity may be reasonably represented by one of several corresponding spatial entities. For example, a city in a small-scale database could have a point as its corresponding spatial entity, while the same city would have a polygon as its corresponding spatial entity in a large-scale geographic database.

Recognize multiple instances of geographic entities, both multiple spatial instances and multiple temporal instances

Multipurpose (or corporate) geographic databases may need to accommodate multiple corresponding spatial entities for some of the regular entities included in the GIS. For example, the representation of an urban street system may require that each street segment (the length of street between two intersecting streets) be held in the GIS both as a single-line street network to support address geocoding, network-based transportation modeling, etc., and as a double-line (or polygon) street segment for cartographic display, or to be able to locate other entities within the street segment (such as a water line), etc. In each of these instances the "regular" entity is the street segment, although each instance may have a different set of attributes and different corresponding spatial entities. Also, there may be a need to explicitly recognize multiple temporal instances of regular entities. The simple case of multiple temporal instances will be where the corresponding spatial entity remains the same; however, future GISs will, in all likelihood, have to deal with multiple temporal instances where the corresponding spatial entity changes over time.

Three symbols are defined to represent entities: entity (simple); entity (multiple spatial representations); and entity (multiple time periods). The internal structure of the entity symbol contains the name of the entity and additional information indicating the corresponding spatial entity (point, line or polygon), a code indicating topology, and a code indicating encoding of the spatial entity by coordinates (Figure 8). The coordinate code is, at the present time, redundant in that all contemporary GISs represent spatial entities with x,y coordinates. However, it is possible that future geographic databases may include spatial entities where coordinates are not needed. Similarly, topological encoding is normally of only one type and can, for the present, be indicated by a simple code. However, different spatial topologies have been defined and may require different implementations in a GIS (Armstrong and Densham, 1990). In the future, the topology code may be expanded to represent a specific topologic structure particular to a GIS application.

[Figure 8 diagram: entity symbol for "Object (entity)" showing the regular object name, the associated spatial object type, and the codes G and T, with callouts for the topology indicator and the XY coordinate indicator]

Figure 8 - Entity Symbol for Spatial Objects

Modeling Spatial Relationships

The spatial relationships are defined by three relationship symbols (Figure 9). The traditional diamond symbol can be used for normal database relationships. An elongated hexagon and a double elongated hexagon are defined to represent spatial relationships. The elongated hexagon represents spatial relationships defined through topology (connectivity and contiguity), and the double elongated hexagon represents spatial relationships defined through x,y coordinates and related spatial operations (coincidence, containment and proximity). The appropriate "verbs" to include in the hexagonal symbols are the descriptors of the spatial relationships (as shown in Figure 7). The spatial operation will be implicitly defined by the relationship symbol (double hexagon), the spatial entity and the topology code. For example, a spatial relationship named "coincident" between entities named "wetlands" and "soils," both of which carry topologic codes and x,y coordinates, indicates the spatial operation of topological overlay. If this does not sufficiently define the spatial operation needed, the name of the spatial operation can be used to describe the relationship, such as shortest path, point-in-polygon, radial search, etc. Figure 9 shows all symbols needed to construct an Entity-Relationship diagram for a GIS database.
Basic E-R Symbology      E-R Symbology for Spatio-Temporal Data

Entity                   Entity: simple with corresponding spatial entity
                         Entity: multiple spatial representations
                         Entity: multiple temporal representations
Relationship             Relationship: represented in database
                         Relationship: represented by topology
                         Relationship: derived using spatial operation
Attribute                Attribute

Figure 9 - Extended Entity-Relationship Symbology for Designing GIS Databases (Source: Calkins, 1996)
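The choice of relationship symbol follows mechanically from how the relationship is implemented, so it can be expressed as a lookup. The sketch below transcribes the implementation column of Figure 7 and the symbol definitions given in the text; the dictionary and function names are illustrative assumptions, and any relationship not listed as spatial is treated as a normal database relationship.

```python
# How each spatial relationship is commonly implemented (from Figure 7).
IMPLEMENTATION = {
    "connectivity": "topology",
    "contiguity": "topology",
    "containment": "xy_operation",
    "proximity": "xy_operation",
    "coincidence": "xy_operation",
}

# Symbol used on the extended E-R diagram for each implementation type.
SYMBOL = {
    "database": "diamond",
    "topology": "elongated hexagon",
    "xy_operation": "double elongated hexagon",
}


def relationship_symbol(kind):
    """Return the diagram symbol for a relationship of the given kind."""
    return SYMBOL[IMPLEMENTATION.get(kind.lower(), "database")]


for kind in ("Contiguity", "Proximity", "Has"):
    print(kind, "->", relationship_symbol(kind))
```

A table like this can be useful when reviewing a draft diagram, since a relationship drawn with the wrong symbol implies the wrong kind of GIS operation.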


DEVELOPING A SPATIAL DATA MODEL (ENTITY-RELATIONSHIP DIAGRAM)

The information needed to develop the E-R diagram representing the spatial data model comes from the Needs Assessment activity as:

- The GIS application descriptions
- The master data list, which lists entities, corresponding spatial entities and attributes
- The list of functional capabilities (spatial operations)

The process of building the E-R diagram involves taking entities from the master data list one at a time and placing each one on the diagram. For each new entity, any relationship to any previously entered entity should be entered. Relationships are found by examining the Application Descriptions and determining if the GIS processes require a specified operation. For example, if an Application Description indicated that land parcels needed to be compared to a flood plain area, then a spatial relationship of "coincident area" (or topological overlay operation) should be defined between the two entities.

[Figure 10 diagram: Land Parcel (Polygon, G T) - Coincident Area - Flood Plain (Polygon, G T)]

Figure 10 - Diagramming a Spatial Relationship

As each entity is added to the E-R diagram, its list of attributes should be reviewed to confirm that each attribute is appropriate for the entity, does not duplicate any other attribute or entity, and can be rigorously defined for entry into the metadata (metadata is discussed in the next section of this guideline). Figure 11 is a sample E-R diagram for data commonly used by local government. This example contains 16 entities and 15 relationships. Attributes have not been included in the diagram in order to reduce the size of the diagram for inclusion in this document.

[Figure 11 diagram: E-R diagram relating the entities Zoning, Census Block, Census Tract, Traffic Zone, Soils, Floodplain, Wetlands, Parcel, Building, Occupant, Address, Street System, Street Segment, Intersection, Watermain, Valve, Hydrant, and Water Service Connection through the relationships Contains, Within, Coincident Area, Coincident Line, Adjacent, Abutting, Intersect, Link, Nearest, and Has. Spatial entity types shown are polygon, line, and node, with G and T codes.]

Figure 11 - Entity Relationship Diagram for Selected Local Government Data
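The diagram-building procedure described in this section (add entities one at a time, check their attributes, then record relationships found in the application descriptions) can be sketched as a small class. This is a hypothetical illustration: the `ERModel` class, its methods, and the attribute names are assumptions, and the duplicate check is a strict reading of the guidance that an attribute should not duplicate any other attribute or entity.

```python
class ERModel:
    """A minimal container for entities, attributes and relationships."""

    def __init__(self):
        self.entities = {}       # entity name -> set of attribute names
        self.relationships = []  # (entity_a, verb, entity_b)

    def add_entity(self, name, attributes=()):
        """Add an entity, rejecting attributes that duplicate existing names."""
        taken = set(self.entities)
        for attrs in self.entities.values():
            taken |= attrs
        duplicates = [a for a in attributes if a in taken]
        if duplicates:
            raise ValueError(f"duplicate attribute(s) for {name}: {duplicates}")
        self.entities[name] = set(attributes)

    def relate(self, entity_a, verb, entity_b):
        """Record a relationship between two entities already on the diagram."""
        for name in (entity_a, entity_b):
            if name not in self.entities:
                raise KeyError(name)
        self.relationships.append((entity_a, verb, entity_b))


model = ERModel()
model.add_entity("Land Parcel", ["parcel_id", "owner_name"])  # hypothetical attributes
model.add_entity("Flood Plain", ["flood_zone_code"])          # hypothetical attribute
# An application description calls for comparing parcels with the flood plain.
model.relate("Land Parcel", "coincident area", "Flood Plain")

print(model.relationships)
```

Rejecting a relationship to an entity that has not yet been entered enforces the one-at-a-time order suggested in the text.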

SUMMARY OF CONCEPTUAL DATA MODELING

The E-R diagram shown in Figure __ will be used to verify with the expected users the data content of the GIS and, by additional reference to the GIS needs analysis, the required spatial operations. Once verified by the users, the E-R representation can be mapped into a detailed database design (as will be described in the Database Planning and Design Guideline) where:

1) Each entity and its attributes map into:
   (a) One or more relational tables with appropriate primary and secondary keys (this assumes the desired level of normalization has been obtained);
   (b) The corresponding spatial entity for the "regular" entity. As most commercial GISs rely on fixed structures for the representation of geometric coordinates and topology, this step is simply reduced to ensuring that each corresponding spatial entity can be handled by the selected GIS package.

2) Each relationship maps into:
   (a) Regular relationships (diamond) executed by the relational database system's normal query structure. Again, appropriate keys and normalization are required for this mapping.
   (b) Spatial relationships implemented through spatial operations in the GIS. The functionality of each spatial relationship needs to be described and, if it is not a standard operation of the selected GIS, specifications for the indicated operation need to be written.
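The mapping of entities and regular relationships into relational tables can be illustrated with the parcel, building and occupant example used earlier. This is a sketch only: it uses Python's built-in sqlite3 module as a stand-in for whatever relational database is selected, and the table layout, key choices and column types are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parcel (
    parcel_id     TEXT PRIMARY KEY,
    owner_name    TEXT,
    owner_address TEXT,
    site_address  TEXT
);
CREATE TABLE building (
    building_name TEXT PRIMARY KEY,
    height        REAL,
    floor_area    REAL,
    parcel_id     TEXT REFERENCES parcel(parcel_id)        -- building "located on" parcel
);
CREATE TABLE occupant (
    occupant_name TEXT,
    unit_number   TEXT,
    building_name TEXT REFERENCES building(building_name), -- building "has" occupant
    PRIMARY KEY (occupant_name, building_name)
);
""")

# Rows taken from the sample English statement; unknown values are left NULL.
conn.execute("INSERT INTO parcel VALUES ('01-857-34', 'Apex Company', NULL, NULL)")
conn.execute("INSERT INTO building VALUES ('Park Avenue Apartments', NULL, NULL, '01-857-34')")
conn.execute("INSERT INTO occupant VALUES ('Joe Jones', NULL, 'Park Avenue Apartments')")

# A regular (diamond) relationship is executed as an ordinary join on keys.
row = conn.execute("""
    SELECT p.owner_name
    FROM occupant o
    JOIN building b ON b.building_name = o.building_name
    JOIN parcel p   ON p.parcel_id = b.parcel_id
    WHERE o.occupant_name = 'Joe Jones'
""").fetchone()
print(row)
```

Only the regular relationships appear as keys; the polygon and footprint geometry, and any spatial relationship between them, would be held and evaluated by the GIS package rather than by these tables.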


PART 2: SPATIAL DATA STANDARDS AND METADATA REQUIREMENTS

Introduction

Spatial data standards cover a variety of topics, including the definition of spatial data entities (including a formal data model), methods of representation of the spatial entities in a GIS, specifications for the transfer of spatial data between different organizations, and the definition of the attributes of the spatial entities and the values these attributes may assume. Metadata is "information about data," and should describe the characteristics of the data, such as identifying entities and attributes by their standard names, and provide information on such items as data accuracy, data sources and lineage, and data archiving provisions.

Much of the work on spatial data standards to date has been done under the auspices of the Federal Geographic Data Committee and only concerns federal spatial data directly. The relationship between the existing federal data standards and state and local spatial data standards has yet to be developed. Appendix A contains a list of current and pending reports on federal spatial data standards. Work towards New York State spatial data standards will be conducted under the proposed GIS Standing Committee of the Information Resources Management Task Force.

Metadata for Local Governments in New York State

Metadata can serve many purposes. Some of the more important functions of metadata are to:

- Provide a basic description of a dataset
- Provide information for data transfers to facilitate data sharing
- Provide information for entries into clearinghouses to catalogue the availability of data

The metadata structure and content for local government recommended in this guideline have been prepared according to the following criteria:

- First, and primarily, the metadata must serve as a documentation and data management tool for the data administrator in an agency or department.
- Secondly, the metadata must support the data manager and records management officer in a local agency in all aspects of data management, including data definition, source documentation, management and updating, and data archiving and retention requirements.
- Thirdly, the metadata must be able to generate and supply database descriptions for spatial data clearinghouses, such as the prototype New York State Spatial Data Clearinghouse developed under the GIS Demonstration Project conducted by the Center for Technology in Government, SUNY - Albany, and any relevant federal spatial data clearinghouses.

The following metadata information is a prototype for a New York State Local Government Spatial Metadata Standard. This metadata is represented in the set of tables listed below and has been implemented in Microsoft Access. A working copy of this metadata program is available to all local governments in the state. The structure of the software and information on how to use it are described in a user's guide available with the program. The content of the metadata tables is shown in the following lists.

Metadata Tables

1. Organization Information
   Name Of Organization; Department; Room/Suite #; Number And Street Names; City; State; Zip Code; Phone Number; Fax Number; Contact Person; Phone Number/Extension; Email Address; Organization Internet Address; Comments

2. Reference Information
   Filename; File Format; Availability; Cost; File Internet Address; Metadata Created By; Date Metadata Created; Metadata Updated By; Date Metadata Updated; Metadata Standard Name; Comments

3. Object/File Name Information
   Filename; Data Object Name

4. Data Object Information
   Distribution Filename (Same as Filename in Reference Information); Data Object Name; Type; Data Object Description; Spatial Object Type; Comments

5. Attribute Information
   Data Object Name; Data Attribute Name; Attribute Description; Attribute Filename; Codeset Name/Description; Measurement Units; Accuracy Description; Comments

6. Data Dictionary Information
   Data Object Name; Data Attribute Name; Data Type; Field Length; Required; Comments

7. Spatial Object Information
   Data Object Name; Spatial Object Type; Place Name; Projection Name/Description; HCS Name; HCS Datum; HCS X-Offset; HCS Y-Offset; HCS Xmin; HCS Xmax; HCS Ymin; HCS Ymax; HCS Units; HCS Accuracy Description; VCS Name; VCS Datum; VCS Zmin; VCS Zmax; VCS Units; VCS Accuracy Description; Comments

8. Source Document Information
   Data Object Name; Spatial Object Type; Source Document Name; Type; Scale; Date Document Created; Date Digitized/Scanned; Digitizing/Scanning Method Description; Accuracy Description; Comments

9. Lineage Information
   Data Object Name; Data Object 1; Data Object 2; Description of Spatial Operation and Parameters; Accuracy Description; Comments

10. Update Information
   Data Object Name; Update Frequency; Date; Updated By; Comments

11. Archive Information
   Data Object Name; Retention Class; Retention Period; Data Archived; Archived By; Date to be Destroyed

12. Source Documents
   Source Document Name; Source Document ID#; Source Organization; Type of Document; Number of Sheets (map, photo); Source Material (paper, mylar); Projection Name; Coordinate System; Date Created; Last Updated; Control/Accuracy (map, photo); Scale; Reviewed By; Review Date; Spatial Extent; File Format; Comments

13. Entities Contained in Source
   Source ID#; Entity Name; Spatial Entity; Estimated Volume of Spatial Entity; Symbol; Accuracy Description of Spatial Entity; Reviewed By; Review Date; Scrub Needed (yes/no); Comments

14. Attributes by Entity
   Source ID#; Entity Name; Attribute Description; Code Set Name; Accuracy Description of Attribute; Reviewed By; Review Date; Comments
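As an illustration of how one of these tables might be put to work, the sketch below models a single row of the Data Dictionary Information table (table 6) and uses it to check a candidate attribute value. It is not a description of the Microsoft Access implementation mentioned above; the class, the two data type codes ("text" and "number"), and the sample entry are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataDictionaryEntry:
    """One row of the Data Dictionary Information table."""
    data_object_name: str
    data_attribute_name: str
    data_type: str      # "text" or "number" in this sketch
    field_length: int
    required: bool
    comments: str = ""

    def accepts(self, value) -> bool:
        """Check a candidate value against type, length and required flag."""
        if value is None or value == "":
            return not self.required
        if self.data_type == "number":
            return isinstance(value, (int, float)) and len(str(value)) <= self.field_length
        return isinstance(value, str) and len(value) <= self.field_length


# Hypothetical entry for the owner_name attribute of the Parcel data object.
owner_name = DataDictionaryEntry("Parcel", "owner_name", "text", 30, True)

print(owner_name.accepts("Apex Company"))
print(owner_name.accepts(""))
```

Checks of this kind are one way the metadata can serve as a data management tool for the data administrator, which is the first criterion listed for the metadata design.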

Additional Reading
(The following material is quite technical, but is a good set of sources on conceptual database design.)

Armstrong, M.P. and P.J. Densham, 1990, Database organization strategies for spatial decision support systems, International Journal of Geographical Information Systems, vol. 4, no. 1, pp. 3-20.

Calkins, Hugh W., 1996, Entity Relationship Modeling of Spatial Data for Geographic Information Systems, International Journal of Geographical Information Systems, January 1996.

Chen, P.P., 1976, The entity-relationship model - toward a unified view of data, ACM Transactions on Database Systems, vol. 1, no. 1, March 1976, pp. 9-36.

Chen, P.P., 1983, English sentence structure and entity-relationship diagrams, Information Sciences, vol. 29, pp. 127-149.

Davis, C., et al., eds., 1983, Entity-Relationship Approach to Software Engineering, Amsterdam, Netherlands: Elsevier Science Publishers B.V.

Elmasri, R. and S.B. Navathe, 1989, Fundamentals of Database Systems, Redwood City, California: The Benjamin/Cummings Publishing Company, Inc.

Jajodia, S. and P. Ng, 1983, On representation of relational structures by entity-relationship diagrams, Entity-Relationship Approach to Software Engineering, P. Ng and R. Yeh (eds.), Amsterdam, Netherlands: Elsevier Science Publishers B.V., pp. 249-263.

Liskov, B. and S. Zilles, 1977, An introduction to formal specifications of data abstractions, Current Trends in Programming Methodology - Vol. 1: Software Specification and Design, R.T. Yeh (ed.), Prentice Hall, pp. 1-32.

Loucopoulos, P. and R. Zicari, 1992, Conceptual Modeling, Databases, and CASE: An integrated view of information systems development, New York: John Wiley & Sons, Inc.

Teorey, T.J. and J.P. Fry, 1982, Design of Database Structures, Englewood Cliffs, NJ: Prentice-Hall, Inc.

Ullman, J.D., 1988, Principles of Database and Knowledge-base Systems, 2 vols., Rockville, Maryland: Computer Science Press, Inc.


Appendix A
Developing Standards for Spatial Data and Metadata
Spatial data standards are needed in order to facilitate the exchange of spatial data between geographic information systems. We refer to data as "spatial" because the common factor is a geographic reference (a reference in space) which allows the data to be accessed through a GIS. In order to accomplish the goal of facilitating data exchange, spatial data standards should provide:

- Definitions of terms for spatial objects or features included in GIS;
- A structure (or format) for the exchange of spatial data;
- A method for describing the accuracy and lineage of the data; and
- The definition of metadata (the data that describes the spatial data).

The primary purpose of spatial data standards is to facilitate data sharing and exchange, hence the focus only on data issues. The Council concluded that it is not necessary to develop standards for GIS hardware or software at this time, as these standards are expected to evolve from groups such as the Open GIS Consortium, a non-profit trade association formed to implement the Open Geodata Interoperability Specification.

The Current Status of Standards


At present, spatial data standards exist only at the Federal government level. Under the Federal Geographic Data Committee, three standards documents have been prepared:

The Spatial Data Transfer Standard (SDTS - FIPS 173)
This standard defines a method for the exchange of spatial data between different GIS software systems. It also contains definitions of terms for the spatial objects of interest to Federal government agencies.

Content Standards for Digital Geospatial Metadata (proposed)
This standard defines the content for digital geospatial metadata, the information about spatial data that would be entered into a clearinghouse or repository to form a catalog of spatial data available to other users.

Cadastral Standards for the National Spatial Data Infrastructure (draft)
This is a draft standard for cadastral (land ownership) data, one of twelve theme standards documents under preparation.

The Federal Geographic Data Committee has also established a National Spatial Data Infrastructure (NSDI) for the purpose of coordinating geographic data acquisition and access. The mechanism for this will be a National Spatial Data Clearinghouse, a distributed network of geospatial data producers, managers, and users linked electronically. It is envisioned that this network of clearinghouses would contain information about available spatial data. Potential users would search the clearinghouse to find data of interest, consult the metadata for a description of that data, and then acquire the data from the distributing agency. Spatial data may be deposited directly with a clearinghouse or retained by the originator.

The Federal effort towards standards development started in 1981, and the National Spatial Data Infrastructure and Federal spatial data standards are still evolving at this time. The remaining subject area (theme) standards reports are scheduled for release during the Spring of 1996 (the themes are: base cartographic, bathymetric, cultural and demographic, geodetic, geologic, ground transportation, international boundaries, soils, vegetation, water, and wetlands). The table below shows the current status of federal spatial data standards development.

Implementation of the Federal geospatial data standards is through Executive Order 12906, signed by the President on April 11, 1994. The FGDC is directed to "...seek to involve State, local, and tribal governments in the development and implementation of the initiatives contained in this order." The Order provides that: "Federal agencies collecting or producing geospatial data, either directly or indirectly (e.g. through grants, partnerships, or contracts with other entities) shall ensure, prior to obligating funds for such activities, that data will be collected in a manner that meets all relevant standards adopted through the FGDC process."

Status of Federal Geographic Data Committee Standards

Currently in development:
National Spatial Data Accuracy Standard
Standards for Digital Orthoimagery
Draft Standards for Digital Elevation Data
Hydrographic and Bathymetric Accuracy Standard
Standards for Geodetic Control Networks
Transportation Network Profile for Spatial Data Transfer Standard
Transportation-related Spatial Feature Dictionary
Soils Data Transfer Standard
Vegetation Classification Standards
River Reach Standards and Spatial Feature Dictionary
Facility ID Code
Content Standard for Cultural and Demographic Data Metadata

Completed public review:


Cadastral Content Standard Federal Domain of Values for Data Content Standard Cadastral Collection Standard (Cadastral) Clearinghouse Metadata Profile (Cadastral) Classification of Wetlands and Deepwater Habitats of the United States

Source: Federal Geographic Data Committee Newsletter, November 1995.


GIS DEVELOPMENT GUIDE Volume II

Table of Contents
SURVEY OF AVAILABLE DATA
  Introduction
  Data Required
  Potential Sources of Data
  Describing and Evaluating Potential Data
  Reference

EVALUATING GIS HARDWARE AND SOFTWARE
  Introduction
  Sources of Information About GIS
    GIS Source Book
    Publications
    Trade Shows
    User Groups
  Selection Process
  Attachment A - User Groups

DATABASE PLANNING AND DESIGN
  Introduction
  Selecting Sources for the GIS Database
    Master Data List
    List of Surveyed Data Sources
  The Logical/Physical Design of the GIS Database
  Procedures for Building the GIS Database
  Procedures for Managing and Maintaining the Database
  GIS Data Sharing Cooperatives
  Matrix Example
  Figures
    1 - GIS Representation of Object and Associated Spatial Object
    2 - Example of Mapping of E-R Entity and Attribute List
    3 - E-R Representative of Elements of a Water Distribution System
    4 - Physical Design of Several Entities in a Single Layer
    5 - Standard Database Relationship with Primary & Secondary Keys
    6 - Guide to Data Conversion

DATABASE CONSTRUCTION
  Introduction
  Information Required to Support Data Conversion Process
  Data Conversion Technologies Available
  Data Conversion Contractors
  Data Conversion Processes
  Attribute Data Entry
  External Digital Data
  Accuracy and Final Acceptance Criteria
  Figures
    1 - Steps in Creating a Topologically Correct Vector Polygon Database
    2 - Guide to Data Conversion
    3 - GIS Data Model
    4 - Raster GIS Data
    5 - Vector GIS Data

PILOT STUDIES AND BENCHMARK TESTS
  Introduction
  Pilot Study: Proving the Concept
  Executing the Pilot Study
  Evaluating the Pilot Study
  Benchmark Tests: Competitive Evaluation
  Figures
    1 - Steps in Creating a Topologically Correct Vector Polygon Database
    2 - Guide to Data Conversion

GIS DEVELOPMENT GUIDE: SURVEY OF AVAILABLE DATA

INTRODUCTION

One of the most important elements of developing a GIS is finding and utilizing the appropriate data. The form of the data is critical to the overall database design and to the success of the analyses performed with the system. The quality of the results produced from GIS analyses and applications ultimately rests on the quality of the data used.

GIS data can be obtained in various formats from many different sources. Requirements for data quality, scale and level of completeness will depend upon the needs of the application. Once data requirements are developed, there is usually a plethora of data options from which the potential user can choose. One choice is whether to utilize government- or privately-developed data; cost will be a major difference in this case. Other choices may involve data currency, scale, accuracy and, depending upon the application, the data structure, platform specifications or even media format.

This guideline discusses available GIS data, including evaluating data requirements, the various types and sources of available GIS data, and potential datasets. It also discusses potential opportunities for data sharing.

DATA REQUIRED

Master Data List (from Needs Assessment)
One of the products available from a Needs Assessment is a Master Data List. Based upon descriptions of the tasks future GIS users will want to perform, a listing of the required data is developed. From the Needs Assessment you will have identified:
- the data entities
- the attributes associated with the entities
The Master Data List is used to prepare a database plan which includes:
- a logical/physical design of the GIS database
- procedures for building the GIS database
- procedures for managing and maintaining the database
In this guide, the procedures for identifying and documenting existing data will be described.

POTENTIAL SOURCES OF DATA

Types of Data
There are many different types of data which can be utilized by a GIS. Each data type has its own unique properties and potential for contributing to the overall quality and functionality of the GIS database. These data types are mapped data, tabular data listings, remotely sensed imagery, and scanned images. The following sections describe these data types.

Mapped Data/Map Series
Mapped data may refer to published maps found in an existing map series or collection. These maps should be logically classified based upon their data content (e.g., topographical, hydrological data). Maps which meet National Map Accuracy Standards are usually produced by federal or state government agencies. Paper maps, if not already in digital format, can be utilized in developing the database through vector tablet digitizing or scanning.

Mapped data can also refer to geographic data which has been digitized into the vector data structure. Vector map data may be found with or without real-world coordinate information and may or may not have topological relationships. Many organizations which digitized their map data in the past did so utilizing CAD (computer aided drafting), and thus were not able to establish topological relationships between their spatial elements. Today, software exists which allows CAD data to be quickly converted into topologically correct geographic data which can then be assigned coordinate data within a GIS. Many alternative sources of digital spatial data thus exist, in addition to the volumes of topologically correct geographic data available from local, state and federal governments.

Attribute Tables or Lists
Data tables and listings are a readily available form of GIS input and can be obtained from many different organizations and government agencies. They provide additional attributes which will be associated with spatial data elements. Tables and spatial elements are easily linked using primary relationship keys. Database, spreadsheet and ASCII-delimited text tables are among the import formats available in many GIS packages.
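The key-based linking of attribute tables to spatial elements described above can be sketched as follows. This is a minimal illustration only: the parcel identifiers, coordinates, column names and the function join_attributes are hypothetical and are not taken from any particular GIS package.

```python
import csv
import io

# Hypothetical spatial features: a parcel ID keyed to a centroid coordinate.
parcels = {
    "P-001": (1052340.0, 1098220.0),
    "P-002": (1052410.0, 1098195.0),
    "P-003": (1052475.0, 1098260.0),
}

# Hypothetical ASCII-delimited attribute table, e.g. exported from a spreadsheet.
attribute_text = """parcel_id,land_use,owner
P-001,Residential,Smith
P-002,Commercial,Jones
P-004,Vacant,Lee
"""


def join_attributes(features, delimited_text, key_field):
    """Attach attribute rows to the features that share the same key value."""
    reader = csv.DictReader(io.StringIO(delimited_text))
    rows = {row[key_field]: row for row in reader}
    joined = {}
    unmatched = []
    for key, geometry in features.items():
        if key in rows:
            joined[key] = {"geometry": geometry, **rows[key]}
        else:
            # Features with no attribute row are reported, not silently dropped.
            unmatched.append(key)
    return joined, unmatched


joined, unmatched = join_attributes(parcels, attribute_text, "parcel_id")
print(joined["P-002"]["land_use"])
print("Features without attributes:", unmatched)
```

Reporting the unmatched keys is a deliberate choice in this sketch: when evaluating a table as a data source, the number of features that fail to link is a quick indicator of how complete and consistent the key values are.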
Any organization that maintains a database or uses spreadsheets to organize its records is able to create digital listings. Tables and lists are available from almost any government organization, as long as the data does not involve privacy issues which would impede access.

Image Data (Remotely Sensed Images, Aerial Photos)
Image data is an excellent source of GIS input data. It mainly consists of remotely sensed images, which include both aerial photographs (in analog or digital format) and satellite images. Aerial photos are normally captured with analog cameras. These cameras produce photographs whose data can be very important in a GIS. Photographs, though not digital, can be digitized using a vector digitizing tablet, or they can be scanned and then input into the GIS as an image. In either case, the digital version will normally require rectification and re-scaling in order to correct the camera distortions common to most aerial photography. Until they are converted into a raster GIS format, basic raster images such as satellite imagery or scanned aerial photographs do not offer any topological connectivity or potential for GIS analysis.

Satellite imagery is captured in raster digital format. With the advent of an open display architecture, many GIS packages are able to integrate both raster and vector data in the same display. Remotely sensed image data is useful within an editing environment as a backdrop for both heads-up digitizing and updating of vector layers, for verification, or for conversion into raster GIS layers and subsequently into vector data layers.
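As a rough illustration of the re-scaling part of this process, a scanned image is commonly tied to map coordinates with a six-parameter affine transformation. The sketch below assumes such a transformation; the parameter values and the function name pixel_to_map are hypothetical, and the sketch does not correct camera or terrain distortion, which requires control points and, for orthophotos, an elevation model.

```python
def pixel_to_map(col, row, params):
    """Convert a pixel position (column, row) to map coordinates.

    params is (a, b, c, d, e, f) for the affine transformation
        x = a * col + b * row + c
        y = d * col + e * row + f
    """
    a, b, c, d, e, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical north-up image with 2 ft pixels whose upper-left pixel
# sits at map coordinate (1050000, 1100000). The negative e term
# reflects that image rows increase downward while northings increase upward.
params = (2.0, 0.0, 1050000.0, 0.0, -2.0, 1100000.0)

upper_left = pixel_to_map(0, 0, params)
sample = pixel_to_map(100, 50, params)
print(upper_left)
print(sample)
```

Once every pixel can be assigned a map coordinate in this way, the image can be drawn beneath vector layers that use the same coordinate system.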


Most remote sensing cameras allow for the capture of infrared images, separating light into bands of differing wavelengths which, together or alone, may show much more information than a normal camera reading only in the visible spectrum. Most GIS packages will display these images and allow different colors to be assigned to the various bands for effective display of the data. GIS packages today also allow these images to be rectified, warped, and geo-referenced as necessary so that they will be useful as scaled images. After such procedures, geo-referenced images can be overlaid with similarly geo-referenced vector data for effective display.

Scanned Images (Pictures, Diagrams)
Scanned raster images can be displayed in a GIS in the same way that satellite images are displayed. Any raster image, whether it be a scanned map, photograph or diagram, can be easily input into a GIS for display purposes. Integrating scanned images into a GIS display and converting raster data into raster GIS format are fairly routine capabilities for most high-end GIS packages. As discussed earlier, a GIS allows for the assignment of coordinates to raster image data. Scanned maps (as opposed to digitized vector representations) can be effective backgrounds upon which other GIS vector layers can be displayed. Scanned maps usually contain much valuable annotation which would be very time-consuming to duplicate in a vector environment. Including raster images enhances an application by providing the user with visual display data which can improve the user's understanding of the data. Scanned photographs are especially effective. In many GIS packages, links can be established between an image viewer, which displays scanned images, and vector geographic features so that when an event sequence is initiated (e.g.
selecting a vector feature), the raster image viewer appears with the specified scanned image.

Formats
There are three major formats in which GIS-usable data can be obtained: hardcopy/eye-readable format, analog image format, and fully digital format. Unique types of information can be accessed from each of these data formats.

Hardcopy (Paper, Linen or Film)/Eye-Readable
Hardcopy maps are easily accessed from a wide variety of organizations. Hardcopy maps, as a form of GIS source data, can be digitized on a digitizing tablet into vector GIS format, or scanned and then converted into raster GIS format. Although there are potential accuracy problems associated with capturing geographic features from paper and linen maps (distortions due to shrinkage or expansion of the media), much unique geographic data can only be found on these maps. One example arises when seeking geographic data for a certain time period: much of the readily available digital data may be only the most current, updated data for a region. To find geographic data from before 1970, for instance, the only choice may be to access a paper or linen map. Use a film copy of the source document where available, as this will be the most stable medium.

Accessing dated tabular information for the development of an attribute database may similarly require the use of paper documents. Organizations which have been in existence since before the advent of digital filing systems all had to keep their data in paper (hard-copy) format at one time. Some of these older records may have been converted into digital form at some point. In other cases, hard-copy documents may be the only versions of dated material. To conserve space and preserve the documents, many may have been copied onto microfiche.

Image (Picture)
Aerial photography is an abundant form of geographic data. Photogrammetry (aerial mapping) is a common way of creating an accurate and up-to-date land base. Aerial photos provide the raw data necessary for various planimetric and topographic mapping applications. Photographic images are a very rich data source in that many geographic features can be seen clearly on a photograph but may not appear on a paper map or in a vector digital file (e.g., a large clearing within a wooded area would not be differentiated on most paper maps, but it is clearly visible on the aerial photo). Aerial photography is available from many sources (e.g., USGS, DOT, county agencies). The federal government has recently developed the National Aerial Photography Program (NAPP), under which states that wish to have their counties flown may split the cost with the federal government. Many useful products are derived from the NAPP, including 1:12,000 hardcopy or softcopy orthophotographs. An orthophoto is a scanned aerial photograph which has been digitally rectified using control points and a digital elevation model. The digital versions are especially useful for GIS applications.
If the type of digital aerial photography needed is not available, organizations can issue a request for proposal to solicit bids for aerial mapping, although this can be very expensive.

Digital
Within the digital format, there are many different varieties of data available. These options are becoming as numerous as those currently available in paper maps. In terms of map graphics, there are two data structures which are readily integrated into today's GIS packages: raster and vector. Tabular data is most frequently found in digital format.

Forms of digital spatial data currently available in raster format include:
- Scanned maps and aerial photography
- Satellite imagery
- Digital orthophotography
- Digital elevation models

Forms of digital spatial data currently available in vector format include:
- Topological vector linework
- Non-topological vector linework
- Annotation layers
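The difference between non-topological and topological linework can be illustrated with a small sketch. Non-topological (CAD-style) linework is just a set of independent coordinate strings; a topological structure additionally records which lines meet at shared nodes. The function build_node_topology, the line identifiers and the coordinates below are hypothetical, and commercial GIS packages build considerably richer structures (for example, left and right polygons for each arc).

```python
from collections import defaultdict


def build_node_topology(lines, precision=3):
    """Derive shared nodes and line connectivity from plain linework.

    lines maps a line ID to its list of (x, y) vertices.
    Returns (nodes, connected), where nodes maps each endpoint coordinate
    to the set of line IDs ending there, and connected maps each line ID
    to the set of other line IDs it touches at an endpoint.
    """
    nodes = defaultdict(set)
    for line_id, vertices in lines.items():
        for x, y in (vertices[0], vertices[-1]):
            # Rounding snaps endpoints that differ only by tiny digitizing noise.
            nodes[(round(x, precision), round(y, precision))].add(line_id)

    connected = {line_id: set() for line_id in lines}
    for touching in nodes.values():
        for line_id in touching:
            connected[line_id] |= touching - {line_id}
    return dict(nodes), connected


# Hypothetical water main segments digitized as independent lines.
lines = {
    "main_1": [(0.0, 0.0), (100.0, 0.0)],
    "main_2": [(100.0, 0.0), (100.0, 80.0)],
    "main_3": [(100.0, 0.0), (200.0, 0.0)],
    "stub": [(300.0, 50.0), (350.0, 50.0)],
}

nodes, connected = build_node_topology(lines)
print(sorted(connected["main_1"]))
print(sorted(connected["stub"]))
```

In this sketch the isolated "stub" line has no neighbors, which is the kind of condition a topology build would flag for review during conversion of CAD data.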


Digital attribute data which can be input into a GIS includes file types associated with various software components: spreadsheet, database and word-processing. File formats which can be utilized include dBase, Excel, and ASCII-delimited text.

Government Sources
Government is the largest single source of geographic data. Data for almost any GIS application can be obtained through federal, state, or local governments. Data in paper, image or digital format can all be obtained through government resources. The following subsections give basic descriptions of the datasets which are available through some federal, state and regional/local government agencies.

Federal Data Sources
The federal government is an excellent source of geographic data. Two of the largest spatial databases which are national in coverage are the US Geological Survey's DLG (Digital Line Graph) database and the US Census Bureau's TIGER (Topologically Integrated Geographic Encoding and Referencing) database. Both databases contain vector data with point, line and area cartographic map features, and also have attribute data associated with these features. The TIGER database is particularly useful in that its attribute data also contains census demographic data which is associated with block groups and census tracts. This data is readily used today in a variety of analysis applications. Many companies have refined various government datasets, including TIGER, and these datasets offer enhanced attribute characteristics which increase the utility of the data. Unfortunately, problems with the positional accuracy of these datasets usually remain, as these are much more difficult to resolve. Satellite and digital orthophoto imagery, raster GIS datasets, and tabular datasets are also available from various data producing companies and government agencies.
The following information on federal agencies was taken from the Manual of Federal Geographic Data Products developed by the Federal Geographic Data Committee (FGDC). To contact the FGDC:

Federal Geographic Data Committee Secretariat
US Geological Survey
590 National Center
Reston, VA 22092
Phone 703-648-4533

The departments each have agencies and bureaus within them which offer listings of the types of data available (e.g., data structure, scale, software export format, source data, currency, and the applications the data can be used for) and of the agencies from which the data can be acquired. The reader is encouraged to consult this manual for further information regarding the geographic data products related to these organizations.

Federal Agency Data Product Codes:
A = Atmospheric
B = Boundaries
Ged = Geodetic
Gep = Geophysics
H = Hydrologic
L = Land Ownership
R = Remotely Sensed
S = Socioeconomic
Sub = Subsurface
Sur = Surface and Manmade Features
T = Topography

DEPARTMENT OF AGRICULTURE
  Agriculture Stabilization & Conservation Service: R
  Forest Service: B, H, L, Sur, T
  Soil Conservation Service: H, Sub, Sur
DEPARTMENT OF COMMERCE
  Bureau of the Census: B, S, H, Sur
  Bureau of Economic Analysis: B, S
  National Environmental Satellite Data & Info. Service: A, Ged, Gep, H, R, Sub, Sur, T
  National Ocean Service: Ged, H, R, Sub, Sur, T
  National Weather Service: A, R, T
DEPARTMENT OF DEFENSE
  Defense Mapping Agency: B, H, Sur, T
DEPARTMENT OF HEALTH & HUMAN SERVICES
  Centers for Disease Control: B, S
DEPARTMENT OF THE INTERIOR
  Bureau of Land Management: B, H, L, R
  Bureau of Mines: Sub
  Bureau of Reclamation: H, Sur
  Minerals Management Service: B, H, L
  National Park Service: B, H, Sur, T
  US Fish & Wildlife Service: H, Sur
  US Geological Survey: A, B, S, Ged, Gep, H, L, R, Sub, Sur, T
DEPARTMENT OF TRANSPORTATION
  Federal Highway Administration: Sur
INDEPENDENT AGENCIES
  Federal Emergency Management Agency: H
  National Aeronautics & Space Administration: H, L, R, Sub, Sur
  Tennessee Valley Authority: B, S, Ged, H, L, R, Sub, Sur, T
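When surveying potential sources, a listing such as the one above is most useful when it can be searched by data type. The following sketch shows one way to do that using a few of the agencies and codes listed; the names PRODUCT_CODES, AGENCY_PRODUCTS and agencies_offering are illustrative choices, and only a subset of the listing is included.

```python
# Product codes from the listing above.
PRODUCT_CODES = {
    "A": "Atmospheric",
    "B": "Boundaries",
    "Ged": "Geodetic",
    "Gep": "Geophysics",
    "H": "Hydrologic",
    "L": "Land Ownership",
    "R": "Remotely Sensed",
    "S": "Socioeconomic",
    "Sub": "Subsurface",
    "Sur": "Surface and Manmade Features",
    "T": "Topography",
}

# A subset of the agencies from the listing above, with their product codes.
AGENCY_PRODUCTS = {
    "Forest Service": ["B", "H", "L", "Sur", "T"],
    "Bureau of the Census": ["B", "S", "H", "Sur"],
    "Bureau of Land Management": ["B", "H", "L", "R"],
    "Bureau of Mines": ["Sub"],
    "US Geological Survey": ["A", "B", "S", "Ged", "Gep", "H", "L", "R", "Sub", "Sur", "T"],
    "Federal Emergency Management Agency": ["H"],
}


def agencies_offering(code):
    """Return the agencies (alphabetically) that list the given product code."""
    if code not in PRODUCT_CODES:
        raise KeyError(f"Unknown product code: {code}")
    return sorted(
        agency for agency, codes in AGENCY_PRODUCTS.items() if code in codes
    )


for code in ("L", "Sub"):
    print(PRODUCT_CODES[code], "->", agencies_offering(code))
```

Exact matching on whole codes matters here: a substring test would confuse "S" (Socioeconomic) with "Sub" and "Sur", which is why each agency's codes are kept as a list.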

National Spatial Data Infrastructure (NSDI)
There is a wealth of geographic data which can be accessed from federal and state agencies over the internet. Most federal agencies which deal with geographic data have File Transfer Protocol (FTP) servers storing various geographic datasets. These servers allow organizations to download digital data over the internet. One of the most populated servers is the US Geological Survey FTP server, which holds all of the USGS Digital Line Graph files (the USGS server FTP address can be found by calling the USGS at 1-800-USA-MAPS). The Census Bureau also has an FTP server which allows organizations to access portions of its TIGER/Line file database. Government FTP servers can be searched for on the internet using ARCHIE.

Many federal and state agencies and corporations which deal with geographic data have internet home pages which can be accessed on the World-Wide-Web. The US Geological Survey (USGS) home page (URL address: http://www.usgs.gov), like the USGS FTP server, contains a wealth of information about USGS geographic data and how it can be used. From the USGS home page it is possible to search for, view, and download USGS data. One can also obtain USGS Fact Sheets, general information on the USGS, educational resources, publications, research papers, and informational resources on other internet sites. Most federal agencies have their own home pages, which are structured similarly to the USGS home page. Most major GIS software vendors also have internet home pages. Environmental Systems Research Institute (ESRI), Inc. has an excellent home page (URL address: http://www.esri.com) which contains a wide assortment of useful information.

State Government Agencies
There are many New York State agencies which are good sources of GIS data. Three of these are the Department of Transportation, the Department of Environmental Conservation, and the Office of Real Property Services.

The New York State Department of Transportation (NYSDOT) offers data in paper and digital file formats. Paper topographic maps can be obtained at various scales. Most applicable to GIS needs, the NYSDOT has developed digital spatial files which are part of the New York State County Base Map Series. The Base Map files, though created with a CADD (Computer Aided Design and Drafting) system, have been designed for use in a GIS. The Department has developed a file structure which will allow for their conversion into a topological GIS format.
There are various data layers available within this database, including: Roads, Boundaries, Hydrography, Miscellaneous Transportation, and Names (NYSDOT, 1994). For further information, see Digital Files from the County Base Map Series from the NYSDOT.

The New York State Department of Environmental Conservation (NYSDEC) is another state organization which offers GIS data in varying formats. In 1990, the NYSDEC compiled an in-house inventory of its geographic data sources called the "Geographic Data Source Directory." The directory contains information on all of the DEC's geographic data sources with potential GIS applications. The DEC divided its data into the following categories: Air Resources, Construction Management, Fish and Wildlife, Hazardous Substances Regulation, Hazardous Waste Remediation, Lands and Forests, Law Enforcement, Management Planning and Information Systems Development, Marine Resources, Mineral Resources, Operations, Regulatory Affairs, Solid Waste, and Water (Warnecke et al., 1992). A copy of the directory is available from NYSDEC. Call your local office or the main office in Albany.

The New York State Office of Real Property Services (ORPS) has developed a database known as RPIS (Real Property Information System) which contains information on all tax parcels in the state. Each parcel record contains a coordinate representing the center point of the parcel and attribute information which includes unique land-based parcel identification numbers and descriptive information such as land use, locations, sales information, exemptions, and other parcel attributes. RPIS data is available to local assessors, real property assessment offices, corporations and the general public for a nominal fee.

The New York State Department of Health (DOH) uses GIS in its work analyzing and mapping environmental health risk areas and hazardous waste sites. The DOH has a database containing Census Bureau TIGER files and parcel maps. These GIS files can be acquired by the public.

Other agencies which have GIS databases and which may have data usable in a GIS include: the Adirondack Park Agency (APA); the Hudson River Valley Greenway; New York Metropolitan Transportation Council; the Office of Parks, Recreation and Historic Preservation; Department of Public Service; State Emergency Management Office; New York City Department of Environmental Protection (Hilla, 1995); and State Data Center Affiliates (various NYS Counties). Please note that these are all examples and are not intended to be an exhaustive list.

Regional and Local Governments
Many regional and local government agencies and organizations maintain GIS databases. These agencies may have data sharing arrangements with local companies and other municipalities. Information identifying which government agencies and companies have available GIS data layers may be found in regional or local GIS data directories. One such regional data directory developed within New York State is the Regional Directory of Geographic Data Sources for Genesee/Finger Lakes Counties. The directory identifies participating government agencies and companies which have GIS data layers, describes these layers, and provides the name, address and phone number of the person within each organization who can be contacted for further details or data sharing arrangements (GIS/SIG, 1995).

Private Data Firms
There are companies that will develop data for a local government.
These companies will develop programs based on contract data conversion or public/private partnerships. Contract data conversion firms are available for those organizations that wish to have custom geographic datasets developed. Usually, the client organization provides existing source data (e.g., paper maps) to the data development firm, which then converts the data into digital format. In a public/private partnership, the company works out an agreement with the local government under which it provides data conversion but also retains the ability to market, sell and/or use the digital data that was created. Public/private agreements are just emerging as a method for creating GIS databases cost effectively. When considering a public/private partnership, issues such as ownership, access, freedom of information requirements and long-term data maintenance must be addressed, as well as the sharing of the cost of building the database.

DESCRIBING AND EVALUATING POTENTIAL DATA


The next step is to survey the various departments within the local government, and other external sources, to determine what data is available for use in the GIS and what condition the data is in.

Metadata Documentation
The first step will be to document the data by developing metadata files for each database available. The metadata file serves two roles: 1) to develop information that will be used to evaluate the data for use in a GIS, and 2) to fulfill the metadata requirements for the data once it is used in a GIS. For each potential data source for the GIS database, the map series, photos, tabular files, etc. must be identified, reviewed, and evaluated for suitability for use in the GIS. Maps, photos, and remotely sensed data are the most likely sources and should be evaluated for:
- appropriate scale
- projection and coordinate system
- availability of geodetic control points
- areal coverage
- completeness and consistency across the entire area
- symbolization of entities (especially positional accuracy of a symbol, due either to the size of the symbol or to off-set placement on the map)
- quality of linework and symbols
- general readability and legibility for digitizing (labels)
- quality and stability of source material (paper/mylar)
- amount of manual editing needed prior to conversion
- edge match between map sheets
- existence and type of unique identifiers for each entity (entities shown in a map series often use so-called "intelligent" keys, where an identifier for an object contains the map sheet number and/or other embedded locational codes; in database design, it is much better to avoid "intelligent" keys of this type, particularly locational codes)
- positional and attribute accuracy

All of the above information needs to be documented for each potential data source. If a particular data source is then used to build part of the GIS database, some of this information will become part of the permanent metadata. The metadata software accompanying this guideline provides three tables for recording the basic metadata about a potential data source. The content of these tables is listed below. The first table contains information on the source document (or file); the second table describes each entity contained on a source document; and the third table describes each attribute of an entity. Once again, only the most basic entries have been included in the supporting software in order to keep the software simple and straightforward. A particular user may wish to expand the tables provided to meet his or her specific needs.
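As a rough illustration of how the three tables relate to one another, the sketch below models them as linked records in Python. It is not the metadata software that accompanies this guideline: the class names, the nesting of entities under a source document, and the `estimated_volume` helper are assumptions made for illustration, and only a few of the fields are included. The sample values are the Parcel Map entries shown in the tables that follow.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AttributeRecord:
    """One row of the 'Attributes By Entity' table (subset of fields)."""
    attribute_name: str
    description: str = ""
    code_set_name: str = "N/A"
    accuracy_description: str = "N/A"


@dataclass
class EntityRecord:
    """One row of the 'Entities Contained In Source' table (subset of fields)."""
    entity_name: str
    spatial_entity: str          # e.g. "Polygon", "Line", "Node"
    est_volume_per_sheet: int    # estimated number of entities on each sheet
    scrub_needed: bool = False
    attributes: List[AttributeRecord] = field(default_factory=list)


@dataclass
class SourceDocument:
    """One row of the 'Source Documents' table (subset of fields)."""
    source_id: int
    name: str
    organization: str
    doc_type: str
    number_of_sheets: int
    entities: List[EntityRecord] = field(default_factory=list)

    def estimated_volume(self, entity_name: str) -> int:
        """Rough total count of an entity across all sheets of this source."""
        for entity in self.entities:
            if entity.entity_name == entity_name:
                return entity.est_volume_per_sheet * self.number_of_sheets
        raise KeyError(entity_name)


# Sample entries taken from the Parcel Map example in this guide.
parcel_map = SourceDocument(
    source_id=1,
    name="Parcel Map",
    organization="Town of Amherst",
    doc_type="Map",
    number_of_sheets=200,
    entities=[
        EntityRecord(
            "Parcel", "Polygon", est_volume_per_sheet=126, scrub_needed=True,
            attributes=[AttributeRecord("SBL Number",
                                        "Section, Block, and Lot Number")],
        )
    ],
)

# 200 sheets x 126 parcels per sheet
print(parcel_map.estimated_volume("Parcel"))
```

Keeping the per-sheet volume with the entity record makes it easy to produce the kind of data volume estimate that is needed later for conversion cost and scheduling.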


[Figure 1 - Life Cycle of a GIS Database: Source Documents. Flowchart boxes: Data Objects Identified During Needs Assessment; Source Documents: Maps, Images, Air Photos, etc.; Preparation of Data Model; Match Needed Data to Available Data and Sources; Survey and Evaluation of Available Data; Prepare Detailed Database Plan; Create Initial Metadata; Map and Tabular Data Conversion; Add Record Retention Schedules to Metadata; Database QA/QC Editing; GIS Database; Continuing GIS Database Maintenance; Archives; Database Backups.]

The following lists the fields of the three tables that contain source data information:

Source Documents

Source Document Name: Parcel Map
Source ID #: 1
Source Organization: Town of Amherst
Type of Document: Map
Number of Sheets (map, photo, etc): 200
Source Material: Mylar
Projection Name: UTM
Coordinate System: State Plane
Date Created: 5-Oct-91
Last Updated: 8-Nov-95
Control Accuracy Map: National Map Accuracy Standard
Scale: Variable; 1" = 50 ft To 1" = 200 ft
Availability: Current
Reviewed By: Lee Stockholm
Review Date: 19-Dec-95
Spatial Extent: Town of Amherst
File Format: N/A
Comments:


Entities Contained In Source

Source ID #: 1
Entity Name: Parcel
Spatial Entity: Polygon
Estimate Volume Spatial Entity: 126 per map sheet
Symbol: None
Accuracy Description Spatial Entity: National Map Accuracy Standard
Reviewed By: Lee Stockholm
Review Date: 02-Jan-94
Scrub Needed: Yes
Comments:

Attributes By Entity

Source ID #: 1
Entity Name: Parcel
Attribute Name: SBL Number
Attribute Description: Section, Block, and Lot Number
Code Set Name: N/A
Accuracy Description Attribute: N/A
Reviewed By: John Henry
Review Date: 08-Feb-93
Comments:

Additional Criteria For Evaluating Potential Data Sources

As the survey is being conducted, it is important to consider the following issues about the data:

- Is the data current and what is its continuing availability?
- Is the data suitable for intended applications?
- Is the quality of the data appropriate for the type of applications needed? This should include both locational and attribute accuracy.
- Is the data cost effective?

FOR FURTHER INFORMATION:

The Manual of Federal Geographic Data Products, developed by the Federal Geographic Data Committee, is an excellent source of information on geographic datasets produced by agencies within the federal government. Organized by the agencies and bureaus within each federal department, it lists the types of data that are available (e.g., data structure, scale, software export format, source data, currency, and the applications the data can be used for) and the agencies from which they can be acquired. To order, contact:

Federal Geographic Data Committee Secretariat
US Geological Survey
590 National Center
Reston, VA 22092
Phone: 703-648-4533

New York State Department of Transportation data listing: Digital Files from the County Base Map Series.

Map Information Section
Mapping and Geographic Information Systems Bureau
New York State Department of Transportation
State Office Campus
Building 4, Room 105
Albany, New York 12232
Phone: (518) 457-3555

Example of a Regional Level GIS Data Directory: 1995 Regional Directory of Geographic Data Sources, developed by the GIS/SIG (Geographic Information Sharing/Special Interest Group) for New York State's Genesee/Finger Lakes Region counties. The directory lists the data sources available from local companies and local government agencies in the Genesee/Finger Lakes Region.

The International GIS Sourcebook, published by GIS World, Inc., is an annual publication that contains an excellent "Data Source Listings" chapter. It provides a wealth of information on companies that produce GIS datasets, along with descriptions of the data they produce. The chapter also lists the different types of spatial data produced by public agencies, with data availability and contacts.

REFERENCE

Hilla, Christine M. "The Revolution of Geographic Information Systems in Land Use and Environmental Planning in New York State," Environmental Law in New York, Vol. 6, No. 3, March 1995.

Montgomery, Glenn E. and Harold C. Schuch, 1993. GIS Data Conversion Handbook. Fort Collins, CO: GIS World, Inc., pp. 89-91.

NYSDOT (New York State Department of Transportation), Digital Files from the County Base Map Series, Mapping and Geographic Information Systems Bureau (1994).

Warnecke, L., J. Johnson, K. Marshall and R. Brown, State Geographic Information Activities Compendium, 294 Council of State Government (1991).


GIS DEVELOPMENT GUIDE: EVALUATING GIS HARDWARE AND SOFTWARE

INTRODUCTION

Purpose of Guide

A GIS is more than just hardware and software. It is a complex system with multiple components: Hardware, Software, People, Procedures and Data. The purpose of this guide is to focus on the hardware and software components of the system and how to acquire information on what is available. Deciding what hardware and software to use for your GIS is a difficult yet important task. It will make up the foundation on which you will build your system. There is no clear-cut formula to use to make the selection process easier. In this guideline we will give you suggestions that you can use to evaluate various systems and sources for additional information.

SOURCES OF INFORMATION ABOUT GIS

To develop an understanding of GIS, you will need to gather information about GIS systems. The following sampling of references is not a comprehensive listing; use it as a starting point and branch out from there.

GIS Source Book

The GIS Source Book is a good reference that provides a great deal of information about software vendors, trade associations, product specifications and more. It is published by:

GIS World, Inc.
155 E. Boardwalk Drive, Suite 250
Fort Collins, CO 80525
Phone: 303-223-4848
Fax: 303-223-5700
Internet: info@gisworld.com

Other Publications

Conference Proceedings

Each major GIS conference publishes the proceedings of its event. Contact the associations listed in Attachment A for information on how to obtain these documents.

Scholarly Journals

There are a number of scholarly journals that deal with GIS. These are published on an ongoing basis.

- Cartographica - Contact: Canadian Cartographic Association
- Cartography and Geographic Information Systems - Contact: American Cartographic Association
- International Journal of Geographical Information Systems - Contact: Keith Clark at CUNY Hunter College, New York City
- URISA Journal - Contact: Urban and Regional Information Systems Association

Trade Magazines

There are a number of trade magazines that focus on GIS. They are:

GIS World
GIS World, Inc.
155 E. Boardwalk Drive, Suite 250
Fort Collins, CO 80525
Phone: 303-223-4848
Fax: 303-223-5700
Internet: info@gisworld.com

Business Geographics
GIS World, Inc.
155 E. Boardwalk Drive, Suite 250
Fort Collins, CO 80525
Phone: 303-223-4848
Fax: 303-223-5700
Internet: info@gisworld.com

Geo Info Systems
Advanstar Communications
859 Willamette St.
Eugene, OR 97401-6806
Phone: 541-343-1200
Fax: 541-344-3514
Internet: geoinfomag@aol.com
WWW site: http://www.advanstar.com/geo/gis

GPS World
Advanstar Communications
859 Willamette St.
Eugene, OR 97401-6806
Phone: 541-343-1200
Fax: 541-344-3514
Internet: geoinfomag@aol.com
WWW site: http://www.advanstar.com/geo/gis

Association Newsletters

Many associations have newsletters that cover GIS topics and can be a good source of information. Contact the organizations listed in Attachment A for more information.

Books with vendor specific information

There are a number of books published about GIS and related topics. Here are some of the publishers:

Onword Press
2530 Camino Entrada
Santa Fe, NM 87505-4835
Phone: 505-474-5132
Fax: 505-474-5030

John Wiley & Sons, Inc.
605 Third Avenue
New York, NY 10158-0012

ESRI, Inc.
80 New York Street
Redlands, CA 92373-8100
Phone: 909-793-2853
Fax: 909-793-4801

GIS World, Inc.
155 E. Boardwalk Drive, Suite 250
Fort Collins, CO 80525
Phone: 303-223-4848
Fax: 303-223-5700
Internet: info@gisworld.com

Vendor Booths at Trade Shows

A wealth of information is available from vendor booths at trade shows, ranging from general product literature to white papers and technical journals. A trade show is also a good opportunity to gather a large amount of information on different companies in a short period of time.

User Groups

User groups are another source of valuable information and support. A number of user groups have formed to provide support and professional networking. GIS user groups are formed around a geographic region or by users of specific software products. New users are always welcome in these groups. A listing of user groups is contained in Attachment A.

Current Users

The best way to gauge a vendor is by talking to their installed sites. The information you get from these users will give you valuable insight into the type of company you will be working with. Ask the vendors you want to explore for a list of all of their users who are in your area or who are similar to your organization. Ask for contact names and phone numbers or e-mail addresses.


SELECTION PROCESS

Initially you will need to evaluate the software independently of hardware. The software will be selected based on the functionality it offers. Your hardware selection will be based on the GIS software you select and the operating system strategy your organization uses. You will need to test the hardware and software together to make sure the combination works as advertised. Hardware and software technology changes constantly, and in recent years it has been changing very quickly. It is easy to be intimidated by this, but do not let it stop your efforts. The important thing is to choose a product that has been proven in the marketplace and continues to have a clear development path. Avoid technology that is outdated, and avoid technology that is on the bleeding edge and has not been proven.

Software

Software is evaluated on functionality and performance. The Needs Assessment guide discussed the need to identify the required functionality. Here is where you will begin to use this information.

Functionality

What is important here is the ability of the software to do the things you need it to do in a straightforward manner. For example, if the intended users are relatively new to using computers, the software needs an easy-to-use graphical user interface (GUI). If the organization needs to develop specific applications, the software should have a programming language that allows the software to be modified or customized. In the Needs Assessment Guide, the final report contains tables and references to the functionality you will need. Use these in developing the overall functionality required for the system.

Standards

Standards are a way of making sure that there is a common denominator that all systems can use. They can take the form of data formats for importing and exporting data, guidelines used for developing software, or support for industry-developed standards that allow different applications to share data. Standards are generally developed by a neutral trade organization or in some cases are defined by the market. A group called Open GIS has formed for the GIS industry. This organization is developing standards for developers to use as they engineer software. Open GIS is made up of representatives from the software development companies.

Performance

The performance of the software is dependent on two factors: 1) how it is engineered and 2) the speed of the hardware it is running on. GIS software is complex and will use a large amount of the system resources (memory, disk, etc.). The more complex the software, the more resources it will need.


Performance will suffer if you have a minimally configured computer. Look at the developer's software specifications to see what configuration is needed to run the software. This will give you the minimum requirements. Follow this up by getting the recommended specifications from the developer or a user group. These recommendations will give you a more accurate idea of the type of configuration you will need.

Expandability

The software needs you have today will change over time. More than likely your system will need to expand. Is the software being evaluated able to provide networking capabilities? Will it share data with other applications? Will it grow as the organization's GIS grows? Evaluate software based on its ability to grow with you. This may mean that there are complementary products that can be used in conjunction with the package you are evaluating today, or that the developer has clearly defined plans for added functionality. Talk with other users to see if the developer has a good track record for providing these enhancements.

Licensing

GIS software is not purchased; it is licensed. There is normally a one-time license fee with an ongoing maintenance fee that provides you with the most current versions of the software as they are released. In large systems this will be spelled out in a licensing agreement with a corresponding maintenance agreement. For desktop software a shrink-wrap license is used, with subsequent releases being offered to existing users through a discounted upgrade. The maintenance fees and upgrade costs generally run between 15% and 30% of the initial license fee. The terms in most software packages spell out how the software can and cannot be used. Have the terms of the license reviewed by an attorney before signing. This can save hassles later as you are developing and using your system.

Hardware

When discussing hardware, there are terms and concepts that you need to understand. The following is a discussion of these.
However, GIS software selection drives the hardware requirements. Therefore, before launching a full-scale evaluation of hardware, select the GIS software you will be using. Hardware can be broken down into the following basic components:

- Operating System
- Processor
- Disk
- Memory
- Communications

Operating Systems

An operating system is the software that runs the computer hardware. It is this program that tells the computer what to do and how to do it. You may already be familiar with some of the operating systems that are on the market, such as Microsoft's Windows product or various brands of the UNIX operating system. It is important to have an operating system plan within your organization. The plan should take into account the departments that will be using the computer system, the type of network being used (or being planned), what operating systems are currently being used, how large the database is and what kind of technical support skills you have access to (in-house or contractor). The GIS will need to fit into your operating system plan. This will be important as you add other departments onto the system.

Processor

The processor or CPU (central processing unit) is the part of the computer that actually does the calculations or processes the instructions being sent to it. The most common term that describes the processor's capabilities is the clock speed. This is stated in terms of MHz (megahertz). The clock speed simply describes how many cycles per second the processor works. The higher the clock speed, the faster the processor. Another description of the processor's capability is how many bits it can access at one time. Many of the new processors are 32-bit processors. This means that the CPU can access or grab 32 bits of information during each cycle. Older computers such as 386 machines were 16-bit machines. Some manufacturers (such as Digital Equipment Corporation) offer 64-bit machines. These are very fast CPUs but are hampered by the lack of a 64-bit operating system that can take advantage of their speed. This is the direction the hardware industry seems to be heading.

Disk

The disk or hard drive is the device used to store the operating system and application software. It is also used to store data and images. In working with a GIS you will quickly find out that GIS uses a large amount of disk space. It is not uncommon to have multiple gigabytes of hard drive on a single end-user machine and 10 - 20 gigabytes on a central data server. Luckily the prices of hard drives have been coming down and will continue to be affordable.

Memory

Memory or random access memory (RAM) is used as a temporary storage space by the operating system and by the application software running on the computer. Most applications run better as the amount of memory increases, but only up to a point. Beyond that point, the performance gains taper off as additional memory is added. Most software developers can give you configuration data that indicates where this point is.

Communications

The trend in most systems today is to link up users throughout the organization on a network. This is an area in the computer industry that is advancing very rapidly. It is recommended that you retain a competent consultant who works with networks to give you detailed and current information.

In simple terms, a network is a connection between computers that allows information to be passed from computer to computer. In a typical organization, this is a local area network (LAN). To connect a computer to the network, it will need a network card for the wiring to plug into and network software that allows the computer to transmit and receive signals over the wiring. Of course, the physical network (wiring) is also needed. A small network within a department is inexpensive and allows the users to share network resources such as printers and database servers. The network can provide services like e-mail and disk sharing. It can also be the entryway into larger networks that go outside the building or campus your organization is located on. This is called a Wide Area Network (WAN). A WAN requires a more structured network architecture, but it gives users access to more resources.

Another important point to consider is developing access to the Internet. This specialized network is growing rapidly and provides an incredible amount of resources for a user. The Internet is a place to share ideas in a GIS forum, download data for use in the system, get technical support for a problem, get the latest information on a product from a vendor's home page or develop a home page of your own. The amount of information is overwhelming and too diverse to list in this guide. The point is that you should seriously consider getting a connection to the Internet. When planning your network, factor this into the equation.

Benchmarking a System

Benchmarking a GIS can be a very involved process. The level of effort needed for the benchmark should be proportional to the size and complexity of the overall system being developed. A benchmark is the process of testing various combinations of hardware and software and evaluating their functionality and performance. The benchmark is usually part of an RFP process and is only done with a limited number of selected vendors (i.e., those that have been shortlisted). Each combination is tested under similar conditions using a predefined data set that is indicative of your database. This data set should be used with all of the hardware/software configurations selected for evaluation. When the benchmark is completed, the organization will have results that can be used to objectively evaluate the systems.

Setting It Up

When putting a benchmark together, there is strength in numbers. Get a committee together. A committee will take the burden off of one person and give the process more objectivity. Have representation from all the interested departments and agencies within the organization. A working group of about 8-10 committee members is reasonable. The committee will develop the criteria that will be used to evaluate the systems. Use the Needs Assessment documentation as a reference for this. These criteria will form the basis of the benchmark. Develop a series of tasks that each vendor will need to complete during the benchmark. The tasks should be measurable (e.g., time taken, ease of use, whether the function can be done at all). Also prepare a form that each of the committee members will use to rate the tasks performed in the benchmark. In your benchmark you will not only be rating various aspects of the system; you will also be rating the vendor. Be sure to include some measurement for teamwork, communication, and technical skills of the vendor. It might be useful to work with a consultant who has experience setting up benchmarks or to get advice (and examples of documentation) from another local government that has recently completed a benchmark. Well in advance of the scheduled benchmarks, send out information that outlines the tasks the vendor will need to perform and any rules they will need to follow (how much time for set up, time given to perform various tasks, how many people can be present for the benchmark, etc.).
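One way to turn the committee's rating forms into a comparable result is a weighted average per vendor. The sketch below is only an illustration of that idea: the criterion names, the weights, the 1-5 rating scale, and the sample ratings are all hypothetical, and a real committee would derive its own criteria from the Needs Assessment documentation.

```python
from statistics import mean

# Hypothetical criteria and weights chosen by the committee (sum to 1.0).
WEIGHTS = {
    "functionality": 0.4,
    "performance": 0.3,
    "ease_of_use": 0.2,
    "vendor": 0.1,   # teamwork, communication, technical skills
}


def weighted_score(forms):
    """Combine rating forms from several committee members into one score.

    forms: list of dicts, one per committee member, mapping each
    criterion name to that member's rating on a 1-5 scale.
    """
    # Average each criterion across members, then apply the weights.
    averages = {c: mean(form[c] for form in forms) for c in WEIGHTS}
    return sum(WEIGHTS[c] * averages[c] for c in WEIGHTS)


# Made-up rating forms from two committee members for two vendors.
ratings = {
    "Vendor A": [
        {"functionality": 4, "performance": 3, "ease_of_use": 5, "vendor": 4},
        {"functionality": 4, "performance": 5, "ease_of_use": 3, "vendor": 2},
    ],
    "Vendor B": [
        {"functionality": 5, "performance": 2, "ease_of_use": 3, "vendor": 4},
        {"functionality": 3, "performance": 4, "ease_of_use": 3, "vendor": 4},
    ],
}

scores = {vendor: round(weighted_score(forms), 2)
          for vendor, forms in ratings.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for vendor in ranking:
    print(vendor, scores[vendor])
```

Agreeing on the weights before the benchmark sessions take place helps keep the evaluation objective, since no one yet knows which vendor a given weighting will favor.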

Vendor Support

The vendor you select will become an extended team member for your GIS, so there needs to be a good fit. The vendor will be a good source of support and information. All vendors provide some type of technical support. Ask current users how it has worked for them. If there have been problems in the past, do existing users see improvement? The GIS industry has been growing very fast over the last few years, so there are bound to be some growing pains. What you should be looking for is a vendor who listens to what you need and makes improvements based on user input.


Attachment A - User Groups


New York State

Western New York ARC/INFO Users Group (WNYARC) Buffalo area: Contact: Graham Hayes, GIS Resource Group, Inc., 716-655-5540
GIS/SIG Rochester Area: Contact: Scott Sherwood
Multi-County GIS Cooperative Statewide:
Tri-County GIS Users Group Southern Tier: Contact: Jennifer Fais
GISMO New York City: Contact: Jack Eichenbaum
Capital Region ARC/INFO User Group (CAPARC) Albany Area:
URISA New York State Chapter Contact - Lee Harrington, Professor, SUNY College of Environmental Science and Forestry, Syracuse, Phone: 315-470-6670, Fax: 315-470-6535
Long Island GIS (LIGIS) Contact: Joseph P. Jones


National
American Congress on Surveying and Mapping (ACSM)
5410 Grosvenor Lane
Bethesda, MD 20814
Phone: 301-493-0200
Fax: 301-493-8245

American Society for Photogrammetry and Remote Sensing (ASPRS) & (GIS/LIS)
5410 Grosvenor Lane
Bethesda, MD 20814
Phone: 301-493-0290
Fax: 301-493-0208

Association of American Geographers (AAG)
1710 Sixteenth St. N.W.
Washington, D.C. 20009-3198
Phone: 202-234-1450
Fax: 202-234-2744

Automated Mapping/Facility Management International (AM/FM International)
14456 East Evans Ave.
Aurora, CO 80014
Phone: 303-337-0513
Fax: 303-337-1001

Canadian Association of Geographers (CAG)
Burnside Hall, McGill University
Rue Sherbrooke St. W
Montreal, Quebec H3A 2K6
Phone: 514-398-4946
Fax: 514-398-7437

Canadian Institute of Geomatics (CIG)
206-1750 rue Courtwood Crescent
Ottawa, Ontario K2C 2B5
Phone: 613-224-9851
Fax: 613-224-9577

Urban And Regional Information Systems Association (URISA)
900 Second St. N.E., Suite 304
Washington, D.C. 20002
Phone: 202-289-1685
Fax: 202-842-1850


GIS DEVELOPMENT GUIDE: DATABASE PLANNING AND DESIGN

INTRODUCTION

The primary purpose of this phase of the GIS development process is to specify "how" the GIS will perform the required applications. Database planning and design involves defining how graphics will be symbolized (i.e., color, weight, size, symbols, etc.), how graphics files will be structured, how nongraphic attribute files will be structured, how file directories will be organized, how files will be named, how the project area will be subdivided geographically, how GIS products will be presented (e.g., map sheet layouts, report formats, etc.), and what management and security restrictions will be imposed on file access. This is done by completing the following activities:

- Select a source (document, map, digital file, etc.) for each entity and attribute included in the E-R diagram
- Set up the actual database design (logical/physical design)
- Define the procedures for converting data from source media to the database
- Define procedures for managing and maintaining the database

The database planning and design activity is conducted concurrently with the pilot study and/or benchmark activities. Clearly, actual procedures and the physical database design cannot be completed before specific GIS hardware and software have been selected, while at the same time GIS hardware and software selection cannot be finalized until the selected GIS can be shown to adequately perform the required functions on the data. Thus, these two activities (design and testing) need to be conducted concurrently and iteratively.

In many cases, neither database design nor hardware and software selection is an unconstrained activity. First, the overall environment within which the GIS will exist must be evaluated. If there are "legacy" systems (data, hardware or software) with which the new GIS must be compatible, then design choices may be limited. GIS hardware and software configurations and database organizations that are not compatible with the existing conditions should be eliminated from further consideration. Second, other constraints from an organizational perspective must be evaluated. It may, for example, be preferable to select a specific GIS or database structure because other agencies with whom data will be shared have adopted a particular system. Third, assuming that the intended GIS (whether it will be large or small) will be part of a corporate or shared database, the respective roles of each participant need to be evaluated. Clearly, greater flexibility of choice will exist for major players in a shared database (e.g., county, city, or regional unit of government) than for smaller players (town, village, or special purpose GIS applications). This does not mean that the latter must always go with the majority, but simply that the shared GIS environment must be realistically evaluated. In fact, one way for the smaller participants in a shared GIS to ensure their needs are considered is to fully document their needs and resources using the procedures recommended in these guidelines.

Finally, with the completion of both the database planning and design and the pilot study/benchmark activities, sufficiently detailed data volume estimates and GIS performance information will be known to calculate reliable cost estimates and prepare production schedules. This becomes the final feasibility check before major resources are committed to data conversion and GIS acquisition.


What is already known about the GIS requirement

Prior phases of the GIS development process should have produced the following information, which is needed at this time:

- A complete list of data, properly defined and checked for validity and consistency (from the master data list, E-R data model and metadata entries).
- A list of potential data sources (maps, aerial photos, tabular files, digital files, etc.) cataloged and evaluated for accuracy and completeness (from the available data survey). This inventory would also include all legacy data files, either within the agency or elsewhere, which must be maintained as part of the overall shared database.
- The list of functional capabilities required of the GIS (from needs assessment).

SELECTING SOURCES FOR THE GIS DATABASE

This activity involves matching each entity and its attributes to a source (map, document, photo, digital file). The information available for this task is as follows:

- List of entities and attributes from the conceptual design phase

Master Data List

Entity                Attributes                                          Spatial Object
--------------------  --------------------------------------------------  --------------
Street_segment        name, address_range                                 Line
Street_intersection   street_names                                        Line
Parcel                section_block_lot#, owner_name, owner_address,      Polygon
                      sites_address, area, depth, front_footage,
                      assessed_value, last_sale_date, last_sale_price,
                      size (owner_name, owner_address, assessed_value
                      as of previous January 1st)
Building              building_ID, date_built, building_material,         Footprint
                      building_assessed_value
Occupancy             occupant_name, occupant_address,                    None
                      occupancy_type_code
Street_segment        name, type, width, length, pavement_type            Polygon
Street_intersection   length, width, traffic_flow_conditions,             Polygon
                      intersecting_streets
Water_main            type, size, material, installation_date             Line
Valve                 type, installation_date                             Node
Hydrant               type, installation_date, pressure,                  Node
                      last_pressure_test_date
Service               name, address, type, invalid_indicator              None
Soil                  soil_code, area                                     Polygon
Wetland               wetland_code, area                                  Polygon
Floodplain            flood_code, area                                    Polygon
Traffic_zone          zone_ID#, area                                      Polygon
Census_tract          tract#, population                                  Polygon
Water_District        name, ID_number                                     Polygon
Zoning                zoning_code, area                                   Polygon

- The list of surveyed data sources from the Available Data Survey and their recorded characteristics in the metadata tables Source Documents, Entities Contained in Source, and Attributes by Entity.

Source Documents

Source Document Name: Parcel Map
Source ID #: 1
Source Organization: Town of Amherst
Type of Document: Map
Number of Sheets (map, photo, etc): 200
Source Material: Mylar
Projection Name: UTM
Coordinate System: State Plane
Date Created: 5-Oct-91
Last Updated: 8-Nov-95
Control Accuracy Map: National Map Accuracy Standard
Scale: Variable; 1" = 50 ft To 1" = 200 ft
Availability: Current
Reviewed By: Lee Stockholm
Review Date: 19-Dec-95
Spatial Extent: Town of Amherst
File Format: N/A
Comments:


Entities Contained In Source

Source ID #: 1
Entity Name: Parcel
Spatial Entity: Polygon
Estimate Volume Spatial Entity: 126 per map sheet
Symbol: None
Accuracy Description Spatial Entity: National Map Accuracy Standard
Reviewed By: Lee Stockholm
Review Date: 02-Jan-94
Scrub Needed: Yes
Comments:

Attributes By Entity

Source ID #: 1
Entity Name: Parcel
Attribute Name: SBL Number
Attribute Description: Section, Block, and Lot Number
Code Set Name: N/A
Accuracy Description Attribute: N/A
Reviewed By: John Henry
Review Date: 08-Feb-93
Comments:

If there is a choice between sources, that is, two or more sources are available for a particular entity attribute, then criteria for deciding between them will be needed. In general, these criteria will be:

- Accuracy of resulting data
- Cost of conversion from source to database
- Availability of the source for conversion
- Availability of a continuing flow of data for database maintenance.

Occasionally, alternative sources may result in different representations in the database, such as a vector representation versus a scanned image. In this situation, the ability of each representation to satisfy the requirements of the GIS applications will need to be evaluated. Once a source has been selected, the metadata tables that record source data information need to be completed as appropriate. These are:
- Data Object Information
- Attribute Information
- Spatial Object Information
- Source Document Information

To complete the accuracy information, the accuracy expected from the conversion process will need to be determined. This accuracy target will also be used later in the database construction phase by the quality control procedures. The metadata tables that need to be completed at this time are shown below:

Data Object Information


Data Object Name: Parcel
Type: Simple
Data Object Description: Land ownership parcel
Spatial Object Type: Polygon
Comments:

Attribute Information
Data Object Name: Parcel
Data Attribute Name: SBL Number
Attribute Description: Section, Block, and Lot Number
Attribute Filename: Parcel.PAT
Codeset Name/Description: N/A
Measurement Units: N/A
Accuracy Description: N/A
Comments:

Spatial Object Information

Data Object Name: Parcel
Spatial Object Type: Polygon
Place Name: Amherst
Projection Name/Description: UTM
HCS Name: State Plane Coordinate System
HCS Datum: NAD83
HCS X-offset: 1000000
HCS Y-offset: 800000
HCS Xmin: 25
HCS Xmax: 83
HCS Ymin: 42
HCS Ymax: 98
HCS Units: Feet
HCS Accuracy Description: National Map Accuracy Standard
VCS Name:
VCS Datum:
VCS Zmin: 0
VCS Zmax: 0
VCS Units:
VCS Accuracy Description:
Comments:

Source Document Information

Data Object Name: Parcel
Spatial Object Type: Polygon
Source Document Name: Parcel Map
Type: Map
Scale: Variable: 1" = 50 feet To 1" = 200 feet
Date Document Created: 17-Nov-89
Date Last Updated: 05-Oct-94
Date Digitized/Scanned: 24-Apr-95
Digitizing/Scanning Method Description: Manual digitized with Wild B8
Accuracy Description: 90% of all tested points within 2 feet
Comments:

For some of the above tables, information will be available for only some of the entries. The remaining entries will be completed later as the database is implemented. The examples shown are from the metadata portion of the GIS Design software package that accompanies these guidelines. This package is a Microsoft Access program that runs "stand-alone" (you do not need a copy of Microsoft Access) on a regular PC. Where the same information is needed for multiple tables, this information is only entered once. The information is then automatically transferred to the other tables where it is needed.

THE LOGICAL/PHYSICAL DESIGN OF THE GIS DATABASE

This activity involves converting the conceptual design to the logical/physical design of the GIS database (hereafter referred to as the physical design). The GIS software to be used dictates most of the physical database design. The structure and format of the data in a GIS, like ARC/INFO, Intergraph, MapInfo, System 9, etc. have already been determined by each vendor respectively. If one separates the conceptual entity and its attributes from the corresponding spatial entity and its geometric representation, it can be seen that the physical database design for the spatial entity has been completely defined by the vendor and the GIS designer does not need to do anything more for this part of the data. The attributes of the entities may, however, be held in a relational database management system linked to the GIS. If this is the case, the GIS analyst needs to design the relational tables for the attribute information. Figure 1 illustrates the split between the entity's attributes and the spatial information. This example is based on the ARC/INFO GIS and a relational database system.

[Diagram labels: Entity (Key, Attributes); Spatial Object (Key); Object (Key, Attributes); RDBMS Tables (Attributes); INFO (Coverage Name; Attributes; PAT, AAT, TIC, BND, ARC, etc.)]

Figure 1 - GIS Representation of Object and Associated Spatial Object

The translation from the entity representation in the E-R diagram to the physical design of the database for a single entity is shown in Figure 2:
Attribute List of Entity "Parcel"

Parcel [subdivision_block_lot#, owner_name, owner_address, situs_address, area, depth, front_footage, assessed_value, last_sale_date, last_sale_price (owner_name, owner_address, assessed_value as of Jan. 1 for last two years)]

E-R diagram labels: PARCEL (POLYGON, T); COINCIDENT AREA; COINCIDENT AREA

Oracle Tables
Parcel: Sub_bl_lot#, Owner_name, Owner_add, Situs_add, Depth, Front_footage, Assessed_value, Last_sale_date, Last_sale_price
Previous Values: Sub_bl_lot#, Year, Owner_name, Owner_address, Assessed_value

ARC/INFO Spatial Database Structure
Coverage: Parcel
INFO files: ARC, AAT, TIC, BND, PAT, etc.
PAT: Area, Perimeter, Poly ID #, Sub_bl_lot#

Figure 2 - Example of Mapping of E-R Entity and Attribute List into ARC/INFO & ORACLE Logical Database

Again, this example is based on ARC/INFO and the Oracle relational database system and shows how one entity from the E-R diagram would be represented in a single layer (coverage in ARC/INFO terms) and two Oracle tables. One entity from the E-R diagram will not always translate into a single layer; more complex representations will sometimes be needed. Generally this will involve two or more entities forming a single layer with, possibly, several relational database tables. For example, Figure 3 from the conceptual design guideline shows, in part, the following entities:

[Diagram labels: VALVE; WATER MAIN LINK; HYDRANT]

Figure 3 - E-R Representation of Elements of a Water Distribution System


[Diagram labels:
ORACLE TABLES: WATER MAIN (WATER MAIN ID #); VALVE (VALVE ID #); HYDRANT (HYDRANT ID #)
INFO: WATER SYSTEM layer with files ARC, TIC, BND, NAT, AAT, etc., carrying WATER MAIN ID #, VALVE ID #, HYDRANT ID #]

Figure 4 - Physical Design of Several Entities in a Single Layer and Three Relational Tables

In Figure 4, the water main segments, the valves and the fire hydrants have been placed together in one layer as line segments and two sets of nodes. However, each entity has its own relational table to record its respective attributes (see Table 1, page 2). The relationship is maintained by unique keys for each instance of each entity. Every entity shown on the E-R diagram must be translated to either a GIS layer, a relational table(s), or both, as indicated by the information to be included. In addition, every relationship of the type "relationship represented in database" (single line hexagon on the E-R diagram) must be implemented through the primary and secondary keys in the tables for the entities represented.

[Diagram labels: PARCEL (POLYGON, G, T) CONTAINS BUILDING (POLYGON, G)
PARCEL TABLE: PARCEL ID #; LAYER
BUILDING TABLE: BUILDING ID #, PARCEL ID #; LAYER]

Figure 5 - Standard Database Relationship with Primary and Secondary Keys

As shown in Figure 5, the entity "parcel" may "contain" the entity "building." The table for each entity would have its own primary key (ID#); however, the table for building must also have a secondary key (parcel ID#) to maintain the relationship in the database. The completed physical database design must account for all entities and their attributes, the spatial object with topology and coordinates as needed, and all relationships to be contained in the database. The remaining items on the E-R diagram, the two types of spatial relationships, must be accounted for in the list of functional capabilities; that is, the implied spatial operations must be possible in the chosen GIS software.
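The parcel/building key arrangement of Figure 5 can be sketched with any relational database. The example below is only an illustration, using Python's built-in sqlite3 module in place of the Oracle system named in the text; the table layout, column names other than the two IDs, and the sample rows are assumptions made for the example.

```python
import sqlite3


def build_parcel_building_db():
    """Create an in-memory database with a parcel table and a building table.

    Assumption: columns other than the ID keys are hypothetical placeholders.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE parcel ("
        " parcel_id INTEGER PRIMARY KEY,"   # primary key of parcel
        " owner_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE building ("
        " building_id INTEGER PRIMARY KEY,"  # primary key of building
        " parcel_id INTEGER NOT NULL REFERENCES parcel(parcel_id),"  # secondary key
        " date_built TEXT)"
    )
    # Hypothetical sample rows.
    conn.executemany(
        "INSERT INTO parcel VALUES (?, ?)",
        [(1, "A. Smith"), (2, "B. Jones")],
    )
    conn.executemany(
        "INSERT INTO building VALUES (?, ?, ?)",
        [(10, 1, "1950"), (11, 1, "1987"), (12, 2, "1975")],
    )
    return conn


def buildings_on_parcel(conn, parcel_id):
    """Return the IDs of all buildings whose secondary key points at parcel_id."""
    rows = conn.execute(
        "SELECT building_id FROM building WHERE parcel_id = ? ORDER BY building_id",
        (parcel_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_parcel_building_db()
    print(buildings_on_parcel(db, 1))
```

Because the building table carries the parcel ID, the "contains" relationship can be followed in either direction with an ordinary query, and the database can reject a building that refers to a parcel that does not exist.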

PROCEDURES FOR BUILDING THE GIS DATABASE

Developing a GIS database is frequently thought of as simply replicating a map in a computer. As can be inferred from the nature and detail of the activities recommended up to this point in these guidelines, building a GIS database involves much more than "replicating a map." While substantial portions of the GIS database will come from map source documents, many other sources may also be used, such as aerial photos, tabular files, and other digital data. Also, the "map" representation is only part of the GIS database. In addition to the map representation and relational tables, a GIS can hold scanned images (drawings, plans, photos), references to other objects, names and places, and derived views from the data.

The collection of data from diverse sources and its organization into a useful database requires development of procedures to cover the following major activities:

Getting the data, which may include acquiring existing data from both internal and external sources, evaluating and checking the source materials for completeness and quality, and/or creating new data by planning and conducting aerial or field surveys. Contemporary GIS projects attempt to rely on existing, rather than new, data due to the high cost of original data collection. However, existing data (maps and other forms) were usually created for some other purpose and thus have constraints for use in a GIS. This places much greater importance on evaluating and checking the suitability of source data for use in a GIS.

Fixing any problems in the data source. Because it has often focused only on map source documents, this activity has been called "map scrubbing." Depending on the technology to be used to convert the map graphic image into its digital form, the source documents will have to meet certain standards. Some conversion processes require the map to be almost perfect, while other processes attempt to automate all needed "fixes" to the map. What is required here is for the GIS analyst to specify, in detail, a procedure capable of converting the map documents into an acceptable digital file while accounting for all known problems in the map documents. This procedure should be tested in the pilot project and modified as needed.

Converting to digital data, the physical process of digitizing or scanning to produce digital files in the required format. The major decision here is whether to use an outside data conversion contractor or to do the conversion within the organization. In either case, specifications describing the nature of the digital files should be prepared. In addition to including the physical database design, specifications should describe the following:
- Accuracy requirements (completeness required, positional accuracy for spatial objects, allowable classification error rates for attributes).
- Quality control procedures that will be conducted to measure accuracy.
- Partitioning of the area covered by the GIS into working units (map sheets) and how these will be organized in the resulting database (including edge matching requirements).
- Document and digital file flow control, including logging procedures, naming conventions, and version control.

Change control. Most map series are not static but are updated on a periodic basis. Once a portion of the map has been sent to digitizing (or whatever process is used), a procedure must be in place to capture any updates to the map and enter these into the digital files.

Building the GIS database. Once digitizing has been completed, the sponsoring organization has a set of digital files, not an organized database (illustrated in Figure 5). The system integration process (a subsequent guideline document) must take all the digital files and set up the ultimate GIS database in a form that will be efficient for the users. The several considerations required for this process are covered under GIS Database Construction, GIS System Integration and GIS Maintenance and Use.
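A quality control procedure for positional accuracy can be stated as a simple test against a specification. The sketch below assumes the kind of requirement recorded in the Source Document Information example ("90% of all tested points within 2 feet"); the function names and the way test points are paired with reference positions are hypothetical and would be replaced by whatever the conversion specification actually calls for.

```python
import math


def positional_accuracy(test_points, tolerance_ft):
    """Return the fraction of test points within tolerance of their reference.

    test_points: list of ((x_digitized, y_digitized), (x_reference, y_reference)),
    all in feet. The pairing of digitized and reference positions is assumed
    to come from an independent check survey or a higher-accuracy source.
    """
    errors = [
        math.hypot(dx - rx, dy - ry)
        for (dx, dy), (rx, ry) in test_points
    ]
    within = sum(1 for err in errors if err <= tolerance_ft)
    return within / len(errors)


def meets_standard(test_points, tolerance_ft=2.0, required_fraction=0.90):
    """True when enough tested points fall inside the tolerance."""
    return positional_accuracy(test_points, tolerance_ft) >= required_fraction


if __name__ == "__main__":
    # Hypothetical check sample: nine points off by about 1.4 ft, one off by 3 ft.
    sample = [((x + 1.0, 1.0), (float(x), 0.0)) for x in range(9)]
    sample.append(((3.0, 0.0), (0.0, 0.0)))
    print(positional_accuracy(sample, 2.0), meets_standard(sample))
```

Writing the acceptance rule down in this form before conversion starts gives the contractor and the sponsoring organization the same measurable target for the quality control step.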

[Flowchart boxes: Identify Database Requirements; Identify Data to be Created; Identify Appropriate Data Sources; Develop Conceptual Database Design; Develop Physical Database Design; Procure Conversion Services; Identify Accuracy Requirements; Determine Conversion Strategy; Develop Data Conversion Work Plan; Commence Source Preparation and Scrub; Commence Other In-House Activities; Finalize Acceptance Criteria and QC Plan; Edit Delivered Data; Commence Database Maintenance; Develop Database Maintenance Procedures]

Figure 6 - Guide to Data Conversion/Database Creation - GIS Data Conversion Handbook

PROCEDURES FOR MANAGING AND MAINTAINING THE DATABASE

Because the physical world is constantly changing, the GIS database must be updated to reflect these changes. Once again, the credibility of the GIS database is at stake if the data is not current. Usually, the effort required to maintain the database is as much as, or more than, that required to create it. This ongoing maintenance work is usually assigned to in-house personnel as opposed to a contractor. The entire process should be planned well in advance. The equipment and personnel must be ready to take over the maintenance of the database when the data conversion effort and database building processes are complete.

Database maintenance requires two supporting efforts: ongoing user training and user support. Ongoing user training is needed to replace departing users with newly trained personnel. This will enable the data maintenance to be carried out on a continuous and timely basis. It is also important to offer advanced training to existing users to provide them with the opportunity to improve their skills and to make better use of the system.

GIS is a complicated technology, making operating problems inevitable. User support will help users solve these problems quickly. It can also include customizing the GIS software so that users can execute processing tasks more quickly and efficiently. User support is usually provided by in-house or contract programmers. It requires a knowledge of the operating system and macro programming language as well as experience in troubleshooting common command and file problems.

GIS DATA SHARING COOPERATIVES

The establishment of data sharing cooperatives within the public sector is a cost-effective means of database development and maintenance and is encouraged. Cooperative, multiparticipant database projects allow for data exchange and the opportunity to create new means for developing, maintaining, and accessing information. The sharing of data in the public sector, especially between government agencies and offices which are funded by the same financial resources, should be expected. It does not make fiscal sense for public funds to be used to develop two GIS databases of the same geographic area for two different agencies. Benefits of data sharing thus would include: the development of a much larger database for far less cost; the development of more efficient interaction between public agencies; and, through the use of a single, seamless database, the availability of more accurate information, since all agencies would share the same, up-to-date information. The following pages present a matrix which indicates, in general, opportunities for data sharing between municipal operating units/functions.

The goal of a data sharing strategy is to maximize the utility of data while minimizing the cost to the organization. It is important that all parties involved have clear and realistic expectations as well as common objectives to make the data sharing work. Under any circumstance, however, database management and maintenance will require us to redefine our relationships with those with whom we routinely exchange data, whether they are within an organization or part of a multiparticipant effort including outside agencies. Work flow and information flow must be reviewed and changed if necessary. Procedures and practices for the timely exchange and updating of data must be put in place and data quality standards adhered to, whether the data is hard copy which must be converted for inclusion or digital files which might be available for importing to our system. Systematic collection and integration of new and/or updated data must be employed in order to safeguard the initial investment, maintain the integrity of the database, and assure system reliability to meet functional needs.

GIS DEVELOPMENT GUIDE: DATABASE CONSTRUCTION

INTRODUCTION

Scope Of Database Construction

A database construction process is divided into two major activities:
- creation of digital files from maps, air photos, tables and other source documents;
- organization of the digital files into a GIS database.

This guideline document describes the first process, digital conversion, and the subsequent guideline entitled "GIS System Integration" deals with the organization of the digital files into a database. Figures 1 and 2 are two versions of the digital data conversion process (Burrough, 1986; and Montgomery and Schuch). Only the second half of Figure 2 describes the actual digital conversion process; the first half identifies previous planning activities. In both figures, the end products are digital data files which, if passed through quality control, are suitable for inclusion in the GIS database.
Steps in creating a topologically correct vector polygon database

[Flowchart boxes: FIELD DATA; NON-SPATIAL ATTRIBUTES; linked by unique identifiers; SPATIAL DATA; MANUAL DIGITIZING; SCANNING; INPUT TO TEXT FILE; DIGITIZE; SCAN AND VECTORIZE; VISUAL CHECK; CLEAN UP LINES AND JUNCTIONS; WEED OUT EXCESS COORDINATES; CORRECT FOR SCALE AND WARPING; CONSTRUCT POLYGONS; ADD UNIQUE IDENTIFIERS MANUALLY; LINK SPATIAL TO NON-SPATIAL DATA; TOPOLOGICALLY CORRECT VECTOR DATABASE OF POLYGONS]

Figure 1 - Source: Principles of Geographic Information Systems for Land Resources Assessment, Burrough, P.A., 1986

[Flowchart boxes: Identify Database Requirements; Identify Data to be Created; Identify Appropriate Data Sources; Develop Conceptual Database Design; Develop Physical Database Design; Procure Conversion Services; Identify Accuracy Requirements; Determine Conversion Strategy; Develop Data Conversion Work Plan; Commence Source Preparation and Scrub; Commence Other In-House Activities; Finalize Acceptance Criteria and QC Plan; Edit Delivered Data; Commence Database Maintenance; Develop Database Maintenance Procedures]

Figure 2 - Guide to Data Conversion. Source: Montgomery and Schuch

INFORMATION REQUIRED TO SUPPORT DATA CONVERSION PROCESS

Data Model

GIS technology employs computer software to link tabular databases to map graphics, allowing users to quickly visualize their data. This can be in the form of generating maps, on-line queries, producing reports, or performing spatial analysis. To briefly summarize the characteristics of GIS software and the data required for operations, we offer the following diagram:

[Diagram labels: GIS Data Model; Layers of Map Graphics; Graphic / Data links; Tabular Databases]

Figure 3 - GIS Data Model

GIS (Spatial) Data Formats

In digital form, GIS data is composed of two types: map graphics (layers) and tabular databases. Map graphics represent all of the features (entities) on a map as points, lines, areas, or pixels. Tabular databases contain the attribute information which describes the features (buildings, parcels, poles, transformers, etc.).

GIS data layers are created through the process of digitizing. The digitizing process produces the digital graphic features (point, line or area) and their geographical location. Tables can be created from most database files and can be loaded into a GIS from spreadsheet or database software programs like Excel, Access, FoxPro, Oracle, Sybase, etc. A common key must be established between the map graphics and the tabular database records to create a link. This link is usually defined during the scrubbing phase (data preparation) and created during data capture (digitizing). For parcel data, the parcel-id or SBL number (section, block and lot) is a good example of a common key. The map graphic (point or polygon) is assigned an SBL number as it is digitized. The database records are created with an SBL number and other attributes of the parcel (value, landuse, ownership, etc.).

Raster and Vector Format

GIS allows map or other visual data to be stored in either a raster or vector data structure. There are two types of raster or scanned image: 1) remotely sensed data from satellites; and 2) scanned drawings or pictures.

Satellite imagery partitions the earth's surface into a uniform set of grid cells called pixels. This type of GIS data is termed raster data. Most remote sensing devices record data from several wavelengths of the electromagnetic spectrum. These values can be interpreted to produce a "classified image" to give each pixel a value that represents conditions on the earth's surface (e.g., land use/land cover, temperature, etc.).

The second type of scanned image is a simple raster image where each pixel can be either black or white (on or off) or can have a set of values to represent colors. These scanned images can be displayed on computer screens as needed. Raster data is produced by scanning a map, drawing or photo. The result is an array of pixels (small, closely packed cells) which are either turned "on" or "off." A simple scanned image, for example in TIFF (Tagged Image File Format), cannot be used for GIS analysis and is used only for its display value. The "cells" of the digital version of the image do not have any actual geographical nature as they represent only the dimensions of the original analog version of the image. Raster data in its most basic form is purely graphical and has no intelligence or associated database records.

Raster GIS Data

Graphics Grid/Raster

2 2 2 2 5 5
2 2 2 5 5 5
2 2 3 5 5 5
2 3 3 5 5 5
3 3 3 5 5 5
3 3 3 3 5 5

Value Attribute Table

Cell Value    Real World Entity
2             Lake
3             Wooded
5             Built-up

Figure 4 - Raster Data (pixels)

Raster data can be enhanced to provide spatial analysis within a GIS. Pixels or cells represent measurable areas on the earth's surface and are linked to attribute information. These cells are assigned numeric values which correspond to the type of real-world entity which is represented at that location (e.g., cells containing value "2" may represent a lake, cells of value "3" may represent a particular wooded area, etc.).

Vector data represents map features in graphic elements known as points, lines and polygons (areas).
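Returning to the raster grid of Figure 4, the link between cell values and the value attribute table can be sketched in a few lines. The grid and the three class names below are taken from the figure; the cell size passed to `class_area` is a hypothetical value, since the figure does not state one.

```python
from collections import Counter

# Cell values copied from the grid in Figure 4.
GRID = [
    [2, 2, 2, 2, 5, 5],
    [2, 2, 2, 5, 5, 5],
    [2, 2, 3, 5, 5, 5],
    [2, 3, 3, 5, 5, 5],
    [3, 3, 3, 5, 5, 5],
    [3, 3, 3, 3, 5, 5],
]

# Value attribute table from Figure 4: cell value -> real-world entity.
VALUE_ATTRIBUTE_TABLE = {2: "Lake", 3: "Wooded", 5: "Built-up"}


def entity_at(grid, row, col):
    """Look up the real-world entity recorded for one cell."""
    return VALUE_ATTRIBUTE_TABLE[grid[row][col]]


def cell_counts(grid):
    """Count the cells belonging to each real-world entity."""
    counts = Counter(value for row in grid for value in row)
    return {VALUE_ATTRIBUTE_TABLE[value]: n for value, n in counts.items()}


def class_area(grid, entity, cell_size):
    """Area covered by one entity, given the ground length of a cell side.

    Assumption: cells are square; cell_size is supplied by the caller.
    """
    return cell_counts(grid).get(entity, 0) * cell_size * cell_size


if __name__ == "__main__":
    print(cell_counts(GRID))
    print(class_area(GRID, "Lake", 100.0))  # hypothetical 100 ft cells
```

Because every cell covers the same ground area, simple counting of cell values is enough to answer questions such as how much of the area is wooded, which is what makes classified raster data usable for analysis rather than display alone.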

Vector GIS Data

[Diagram labels: Vector GIS Polygon Layer; Polygon Attribute Table]

Polygon Number    Identity Attribute
1                 Lake
2                 Wooded
3                 Built-up

Figure 5 - Vector GIS Data

Vector graphics are represented as a single xy-coordinate or a series of xy-coordinates. Data is normally collected in this format by tracing map features on the actual source maps or photos with a stylus on a digitizing board. As the stylus passes over the feature, the operator activates the appropriate control for the computer to capture the xy-coordinates. The system stores the xy-coordinates within a file. Vector data can also be collected on-screen (called "heads-up" digitizing) by tracing a scanned image on the computer screen in a similar manner.
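As a minimal sketch of what "a series of xy-coordinates" makes possible, the functions below compute the length of a line and the area and perimeter of a polygon directly from stored vertices. The coordinate values are hypothetical, and the area calculation uses the standard shoelace formula, which is a common technique rather than anything specific to a particular GIS package.

```python
import math


def line_length(vertices):
    """Length of a line feature stored as an ordered list of (x, y) vertices."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(vertices, vertices[1:])
    )


def polygon_perimeter(vertices):
    """Perimeter of a polygon whose ring is closed implicitly (last -> first)."""
    return line_length(list(vertices) + [vertices[0]])


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


if __name__ == "__main__":
    # Hypothetical parcel outline, 100 ft by 50 ft.
    parcel = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0), (0.0, 50.0)]
    print(polygon_area(parcel), polygon_perimeter(parcel))
```

Values such as area and perimeter are typically derived from the digitized coordinates in this way rather than keyed in by hand.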

3 DATA CONVERSION TECHNOLOGIES AVAILABLE


Manual Digitizing

Manual digitizing involves the use of a digitizing tablet and a cursor tool called a puck, a plastic device holding a coil with a set of locator cross-hairs, to select and digitally encode points on a map. A trained operator securely mounts the source map upon the digitizing tablet and, using the cross-hairs on the digitizing puck, traces along each linear feature to be captured in the digital file. The tablet records the movement of the puck and captures the features' coordinates. The work is time-consuming and labor intensive. Concentration, skill and hand-eye coordination are crucial in order to maintain the positional accuracy and completeness of the map features.

Traditional data conversion efforts are based on producing a vector data file compiled by manually digitizing paper maps. Vector data provides a high degree of GIS functionality by associating attributes with map features, allowing graphic selections, spatial queries and other analytical uses of the data. Vector data also carries with it the highest costs for conversion. The industry average for a complete data conversion project to digitize parcel lines, dimensions and text is between $3.00 and $5.00 per parcel. The price is determined by the complexity and amount of data. To keep costs down, data can be selectively omitted from conversion (i.e., not all text and annotation will be captured). The resulting vector data can reproduce a useful, albeit more visually stark, version of the original map. A bare bones data conversion project can be conducted by digitizing only the linework from the tax maps. The minimum industry cost for digitizing parcel line work with a unique ID only is between $1.00 and $1.50 per parcel.

Scanning

Scanning converts lines and text on paper maps into a series of picture elements or pixels. The higher the resolution of the scanned image (more dots per inch), the smoother and more accurately defined the data will appear. As the dots per inch (DPI) increases, so does the file size. Most tax maps should be captured with a scan resolution of 300-400 DPI. One of the main advantages of scanning is that the user sees a digital image that looks identical to their paper maps, complete with notes, symbology, text style, coffee stains, etc. Scanning can replicate the visual nature of the original map at a fraction of the cost of digitizing. However, this low cost has a "price". The raster image is a dumb graphic: there is no intelligence associated with it, i.e., individual entities cannot be manipulated. Edge-matching and geo-referencing the images (associating the pixels with real world coordinates) improves the utility of the scanned images by providing a seamless view of the raster data in an image catalog. Scanned images require more disk space than an equivalent vector dataset, but the trade-off is that the raster scanning conversion process is faster and costs less than vector conversion.

Raster to Vector Conversion

Scanned data, in raster format, can be "vectorized" (converted into vector data) in many high-end GIS software packages or through a stand-alone data conversion package. Vectorizing involves running a scanned image through a conversion program. In the vectorization process, features which are represented as pixels are converted into a series of X,Y points and/or linear features with nodes and vertices. Once converted within a GIS environment, the data is in the same format as data created using a digitizing tablet and cursor. Many vectorized datasets require significant editing after conversion.
Hybrid Solution

Since both vector and raster datasets have decided advantages and disadvantages, a hybrid solution capitalizes on the best of both worlds. Overlaying vector format data with a geo-referenced backdrop image provides a powerful graphic display tool. The combined display solution could show the vector map features and their attributes (also available for GIS query), and an exact replica of the scanned source material, which may be a tax map or aerial photography. If needed, individual parcels, pavement edges, city blocks or entire maps can be vectorized from the geo-referenced scanned images. This process is called incremental conversion. It allows the county to convert scanned raster data to vector formatted data on an as-needed basis. There are many raster to vector conversion routines on the market, but it is important that the conversion take place in the same map coordinate system and data format as your existing data. The key advantage of the hybrid approach is this: even after full vectorization, the scanned images continue to provide a higher quality graphic image as a visual backdrop behind the vector data.

Entry of Attribute Data

Additional attribute data can be added to the database by joining a table which contains the new attributes to an existing table already in the GIS. To join these tables together a common field must be present. Most GIS software can then use the resulting table to display the new attributes linked to the entities. There are various sources for building an attribute database for a GIS, from CD-ROM telephone and business market listings with addresses to data which is maintained in various government databases in dBASE or other database formats.
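The table join described under Entry of Attribute Data can be sketched as follows. This is an illustration only: the function name, the use of an SBL-style string as the common field, and the sample records are all hypothetical, and a real GIS or database package would provide its own join operation.

```python
def join_by_key(features, records, key):
    """Attach attribute records to features that share the same key value.

    features, records: lists of dicts; key: name of the common field.
    Returns the joined rows, the key values with no matching record,
    and the fraction of features that found a match.
    """
    lookup = {rec[key]: rec for rec in records}
    joined = []
    unmatched = []
    for feat in features:
        rec = lookup.get(feat[key])
        if rec is None:
            unmatched.append(feat[key])
            joined.append(dict(feat))        # keep the feature, no new attributes
        else:
            joined.append({**feat, **rec})   # feature fields plus new attributes
    matched = len(features) - len(unmatched)
    match_rate = matched / len(features) if features else 0.0
    return joined, unmatched, match_rate


if __name__ == "__main__":
    # Hypothetical parcel features already in the GIS.
    parcels = [
        {"sbl": "10-1-1", "area": 5000.0},
        {"sbl": "10-1-2", "area": 7200.0},
        {"sbl": "10-1-3", "area": 6100.0},
        {"sbl": "10-1-4", "area": 4800.0},
    ]
    # Hypothetical external table carrying a new attribute.
    assessments = [
        {"sbl": "10-1-1", "assessed_value": 85000},
        {"sbl": "10-1-2", "assessed_value": 92000},
        {"sbl": "10-1-4", "assessed_value": 78000},
    ]
    rows, missing, rate = join_by_key(parcels, assessments, "sbl")
    print(missing, rate)
```

Reporting the unmatched keys alongside the joined rows gives an early indication of how well an external table lines up with the features already in the database.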

Acquisition of External Digital Data

The availability of existing digital data will have an effect upon the design of the database. Integrating existing databases with the primary GIS will require the establishment of common data keys and other unique identifiers. Issues of data location, data format, record match rates, and the overall value of integrating the external data should all be considered before deciding to purchase or acquire existing datasets.

GIS Hardware And Software Used in Digital Data Conversion

Most contemporary GIS software packages are structured to operate on computer workstations to accomplish digitizing and editing tasks. Four basic types of workstations can be identified:
- A digitizing station, a workstation which is connected to a precision digitizing tablet, which utilizes a high-resolution display terminal, and which also has all of the analysis functions necessary for querying, displaying and editing data.
- An editing workstation, which is used for conducting most of the QA/QC functions of the conversion process, having all the functionality of the digitizing station except for the ability to digitize data via a digitizing tablet.
- Graphic data review/tabular data input workstations, which are used for displaying and reviewing graphic data, and for entering the tabular attribute data associated with these features.
- X Terminals, the fourth type of workstation, which allow for graphic display and input of data utilizing the X Window System communications protocol.

With the increasing power of today's personal computers, many GIS analysis packages are being designed for PCs. As GIS data files are very large, PC-based GIS packages usually require a PC with minimum requirements including a 486 processor and 16 megabytes of RAM. Hard-drive disk space depends upon how large the datasets being used are. A safe minimum for hard-drive space with a PC is 500 megabytes. For most data conversion projects, much more hard-drive space will be needed in order to store data as they are converted. Tape storage hardware is also necessary in order to efficiently back up the many megabytes of files created in the conversion process. To provide an idea of the storage requirements for basic scanning conversion, the file size of one tax map in Tagged Image File Format (TIFF), scanned at a resolution of 500 dots per inch (dpi), can range anywhere from 1 to 3 megabytes.

Digitizing hardware requirements vary according to the conversion approach which is applied. For vector conversion, which is usually a manual digitizing process, a digitizing tablet will be necessary. Another piece of digitizing hardware, a scanner, is used to create raster images. Automatic digitization through the use of a scanner is a very popular approach for capturing data. Raster data can subsequently be transformed into vector data in most turn-key GIS packages through the use of raster-to-vector conversion algorithms.

After the conversion of map data into digital form, hardware will be needed for outputting digital data in hardcopy format. When handling a data conversion project, a necessary piece of output hardware is a pen or raster plotter. GIS software allows for the creation of plots at any view scale. The plotter, with its ability to draw on a variety of materials (including paper, mylar and vellum), allows for the creation of quality map plots. Most plotters usually have a minimum width of three feet. Vector and raster plotters are both available on the market. Vector pen plotters utilize various pens for the drawing of linear features on drawing media. Pen plotters can handle most plotting jobs, but they do not produce good results in area shading, such as in the production of choropleth maps. Raster plotters, on the other hand, are excellent in producing shading results. Raster plotters usually cost more than vector plotters, but are substantially more versatile and have better capabilities.

Other output devices for the creation of hardcopies of GIS data include:
- screen copy devices, used for copying screen contents onto paper without having to produce a plot file;
- computer FAX (facsimile) transmissions, often used in communications between conversion contractors and clients, which produce small letter-size plots; the fax transmission files (as raster images) can be saved and viewed later;
- printers, used to output tabular data which is derived from the GIS, which, if configured correctly, can also produce small letter-size plots.

Pilot Project/Benchmark Test Results
The pilot project is a very important activity that precedes the data conversion project. It gives you, the GIS software developer, and the data conversion contractor the opportunity to test and review the numerous steps involved in creating the database. Defining the pilot study area involves selecting a small geographic area with a high likelihood of success; that is, an area which can be completed in a relatively short period of time and which allows for the testing of all necessary project elements (conversion procedures, applications, database design). Test results obtained from the pilot project usually include assessments of: database content, conversion procedures, suitability of sources, database design, efficiency of prepared applications on datasets, the accuracy of final data, and cost estimates.

Identified Problems With Source Data
The pilot study involves testing the procedures and designs for the GIS and finding both successes and problems. It involves looking for problems that occur due to a lack of, or inadequacy in, source data. It is especially important to identify problems at the source data level, since it is usually easiest and cheapest to correct errors prior to data conversion. When evaluating the results of a pilot study, problems with digital data accuracy resulting from source data flaws are bound to arise. Usually, the source data used for a project are not in the format required for the best possible result. For example, problems may arise when the source data for a certain data layer consist of maps at various scales. These scale differences can create error when the digitized layers are joined into a single layer. Other problems arise when there are not enough control points on the map sheets to accurately register coverages while they are being digitized. At times, even adjacent large-scale source map sheets may have positional discrepancies between them. Such inconsistencies will be reflected in the corresponding digital data. Procedures for dealing with all known source data problems need to be specified prior to the start of data conversion.


DATA CONVERSION CONTRACTORS

Firms Available And Services Offered There are different types of firms which can handle GIS data conversion. There are some firms which specialize in GIS data conversion, and sub-contract out the services of other firms as needed. Some other firms which handle data conversion but do not particularly specialize in data conversion alone include: aerial mapping firms, engineering firms and GIS vendors. Various firms will offer standard data conversion services, but based upon their main type of work, may offer some unique services. For example, a firm specializing in GIS data conversion may have a wide variety of software options which the client company can choose from. Such a firm usually will have numerous digitizing workstations and a large staff, and be able to complete the project in a shorter period of time than other firms which do not particularly specialize in GIS data conversion. If needed, a specialized GIS data conversion company could subcontract services from another company. Aerial mapping firms can offer many specialized data conversion services associated with photogrammetry, which will not be available directly through a general data conversion contractor. Many aerial mapping firms now have considerable expertise with the creation of digital orthophoto images, rectified and scaled scans of aerial photography, which can be displayed and utilized with vector data. Engineering and surveying firms are well-equipped to deal with most data conversion projects, and will usually have a major civil engineering/surveying unit within the organization. These firms usually will focus upon certain aspects of GIS systems and approach conversion projects with stress upon the extent of construction detail, positional accuracy requirements, COGO input, scale requirements and database accuracy issues. At times, GIS software vendors will handle data conversion projects in order to test their software in benchmark studies and pilot projects. 
The main conversion services which are usually offered include: physical GIS database design and implementation, deed research, record compilation, scrubbing, digitizing, surveying, programming, and image development and registration.

Approximate Cost of Services

Outsourcing data conversion with data purchase/ownership

CONVERSION METHOD                                              PER-PARCEL COST
Manually digitized vector data (linework alone)                $1.20 / parcel
Manually digitized vector data (linework & annotation)         $5.00 / parcel
Vector data developed from the vectorization of
  scanned maps (linework & annotation)                         $3.00 / parcel
Raster image data (registered to a coordinate system)          $50.00 / map = $0.55 / parcel

Outsourcing data conversion and licensing data

CONVERSION METHOD                                              PER-PARCEL COST
Manually digitized vector data (linework & annotation)         $1.50 / parcel
(No cost estimates are available for raster data)

(Note: All of the above cost estimates are based upon average prices offered by various data conversion vendors)
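As a rough illustration of how the per-parcel figures above might be used in budgeting, the sketch below multiplies a parcel count by a rate and also derives the number of parcels per map implied by the raster figure ($50 per map equal to $0.55 per parcel). The parcel count of 20,000 is a hypothetical example, and the dictionary keys and function names are invented for illustration.

```python
from decimal import Decimal

# Per-parcel rates quoted in the cost tables above (US dollars).
RATES = {
    "manual_vector_linework": Decimal("1.20"),
    "raster_registered": Decimal("0.55"),
    "licensed_manual_vector": Decimal("1.50"),
}


def conversion_cost(parcel_count, method):
    """Estimated cost of converting parcel_count parcels by one method."""
    return RATES[method] * parcel_count


def implied_parcels_per_map(cost_per_map, cost_per_parcel):
    """Parcels per map sheet implied by a per-map and a per-parcel price."""
    return cost_per_map / cost_per_parcel


# Hypothetical jurisdiction with 20,000 parcels.
for method in RATES:
    print(method, conversion_cost(20000, method))

# The raster figures imply roughly 91 parcels on an average map sheet.
print(round(implied_parcels_per_map(Decimal("50"), Decimal("0.55"))))
```

Because the quoted figures are vendor averages, an estimate of this kind is best treated as a starting point for an RFP budget rather than as a firm price.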

Making Arrangements For External Data Conversion
There are a number of ways of obtaining the digital conversion of map data. Arrangements are usually made by developing a Request for Proposal (RFP) and then evaluating the proposals submitted by various conversion contractors. Criteria which are commonly used in selecting a conversion contractor include: the company's technical capability, the company's experience with data conversion, the company's range of services, location, personnel experience, and the overall technical plan of operation. All of these items are usually balanced against the organization's budget and the costs associated with the project.

DATA CONVERSION PROCESSES

Digital Conversion Of Mapped Data
Digital conversion of mapped data is a costly and time-consuming effort. The more closely the digital data reflects the source document, and the more attributes are associated with the map features, the higher the map utility but also the higher the cost of conversion. Because of the high cost of digitizing all graphic map features and text/graphic symbology, conversion efforts may compromise data functionality by limiting the number of features captured in order to keep costs down. The actual processes involved in the digital conversion of mapped data are usually the most involved and the most time-consuming of all. These two traits together explain why data conversion is usually the highest cost of implementing the GIS.

Planning The Data Conversion Process
The data conversion process needs to be planned effectively in order to minimize the chance of data conversion problems which can greatly disrupt the normal workflow of the organization. It is necessary to plan all of the physical processes which will be involved in data conversion and to develop time estimates for all work. These main processes include:

    Specifications
    Source map preparation
    Document flow control
    Supervision plans
    Problem resolution procedures

These procedures allow for the efficient conversion of mapped data. Guidelines for normal data capture procedures such as scanning and table digitizing should be developed to ensure that all data are consistently digitized. Particularly when an organization is conducting conversion in-house, a small amount of time invested in developing error prevention procedures will greatly benefit the organization by saving time in the correction/editing phase of the conversion. It is easier to prevent errors than to correct them after the actual digitizing has been conducted.

Data Conversion Specifications: Horizontal And Vertical Control, Projection; Coordinate System, Accuracy Requirements
Any discussion about data conversion should start with the topic of accuracy. We've all heard the expression, "Garbage In, Garbage Out." If the accuracy standards established early in a GIS conversion project cannot be met, the resulting GIS may be useless. In reality, however, when building a GIS and handling data conversion, we are faced with a variety of source documents which may each carry a different scale, resolution, quality and level of accuracy. Some source map data may be so questionable that it should not be loaded into the GIS. Extracting reliable data from the GIS later on will depend upon either converting data from reliable source documents or developing new data "from scratch."

Map projections affect the way that map features are displayed (as they affect the amount of visual distortion of the map) and the way map coordinates are distributed. Before any GIS graphic data layers will be ready for overlay functions, the layers must be referenced to a common geographic coordinate system. GIS software can display data in any number of projection systems, such as UTM (Universal Transverse Mercator), State Plane Coordinate Systems, and more. For scanned maps and aerial photos (which are simple non-GIS raster images) to be displayed effectively with vector data, the images need to be registered and rectified to the same coordinate system.

Specific requirements for map accuracy should be established at the beginning of a project. If a certain level of accuracy is desired, that level will have to be maintained through all later stages of the project. Procedures should be standardized in order to ensure the best and most consistent results possible.
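Registering a digitized layer to a common coordinate system is commonly done by fitting a transformation to control points whose real-world coordinates are known. The sketch below is one possible approach, a least-squares affine fit written in plain Python; it is not the routine used by any particular GIS package. The control point values (a sheet digitized in inches at an assumed 1 inch = 200 feet) and all function names are hypothetical.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        s = sum(a[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (a[i][3] - s) / a[i][i]
    return x


def fit_affine(table_pts, world_pts):
    """Least-squares affine fit from digitizer to real-world coordinates."""
    if len(table_pts) < 3 or len(table_pts) != len(world_pts):
        raise ValueError("need at least three matched control points")
    rows = [(x, y, 1.0) for x, y in table_pts]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in range(2):  # k = 0 fits easting, k = 1 fits northing
        atb = [sum(r[i] * w[k] for r, w in zip(rows, world_pts)) for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs


def apply_affine(coeffs, pt):
    """Transform one digitizer point into real-world coordinates."""
    x, y = pt
    return tuple(c[0] * x + c[1] * y + c[2] for c in coeffs)


def rms_error(coeffs, table_pts, world_pts):
    """Root-mean-square residual at the control points."""
    sq = 0.0
    for p, w in zip(table_pts, world_pts):
        ex, ey = apply_affine(coeffs, p)
        sq += (ex - w[0]) ** 2 + (ey - w[1]) ** 2
    return (sq / len(table_pts)) ** 0.5


# Hypothetical control points: sheet corners in inches and their
# real-world coordinates in feet.
table = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
world = [(1000000.0, 500000.0), (1002000.0, 500000.0),
         (1000000.0, 502000.0), (1002000.0, 502000.0)]
coeffs = fit_affine(table, world)
print(apply_affine(coeffs, (5.0, 5.0)))
print(rms_error(coeffs, table, world))
```

A large residual at the control points is a useful early warning that the control is inadequate or that the source sheet is distorted.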
Source Map Preparation (Pre-Digitizing Edits) Preparing the analog data that will be converted is an important first step. This needs to be done whether the data will be scanned or digitized, and whether you are outsourcing the work or

completing it in-house. This pre-processing is also referred to as scrubbing the data. The process involves coding the source document using unique IDs and/or using some method to highlight the data that should be captured from these documents. This makes it clear to the person performing the scanning or digitizing what they should be picking up. It will also be important later for performing quality control checks and to make sure that the digital data has a link to the attribute database needed for a GIS. Document Flow Control Without a clear system for monitoring and planning the flow of map (and attribute data) documents between the normal storage locations of map documents and those parties handling the actual data conversion, problems will usually arise in tracking the location of maps. When a large number of maps are being converted, it is important to maintain a full understanding between both the conversion contractor or in-house conversion staff, and the normal user group of the source documents about exactly which documents are being handled, and at what time. Source maps are delivered to the conversion group or contractor as a work packet, usually consisting of a manageable number of maps of a certain geographic region, which is pre-determined within the data conversion workplan. A scheme for tracking packets of source documents, as well as the resulting digital files is needed. This scheme should be able to track the digital file through the quality control processes. In addition to tracking the flow of documents and digital files through the entire data conversion process, a procedure needs to be established for handling updates to the data that occur during the conversion time period. This change control procedure may be quite similar to the final database maintenance plan, however, it must be in place before any of the data conversion processes are started. 
Also, this procedure will likely be very different from the previous manual map updating methods used and may involve substantial restructuring of tasks and responsibilities within the organization.

Supervision Plans (Particularly For Contract Conversion)
When planning the data conversion process, it is important that attention be given to the development of detailed plans for supervising the data conversion process. Supervisory plans allow the organization to distribute responsibility for the many different facets of the data conversion project. When data conversion has been contracted out, it is important that communication be maintained between the client company and the contractor. The development of specific variations of the normal administrative tools used for scheduling and budget control can be very useful (e.g., CPM/PERT scheduling procedures, Gantt charts, etc.).

Problem Resolution Procedures
In order to ensure the efficient progress of all aspects of the data conversion project, it is important to develop formal procedures for problem resolution. Editing procedures and data standards should be developed for such items as: major and minor positional accuracy problems; inaccurate rubber-sheeting, or map-joining/file-matching problems; attribute coding errors, etc. Procedures should also be created for events such as missing source data, scale and resolution issues, and even hardware and software system problems. Establishing such procedures and assigning responsibilities for resolution are extremely important, particularly when outside contractors are involved.

Converting The Data
As stated earlier, it is important to follow consistent pre-established procedures in the actual digitizing of the datasets. Consistently using a tested and approved set of conversion guidelines and procedures minimizes ambiguity in methods and allows for the most consistent product possible.

Reviewing Digital Data
The digital data review process involves three issues:

    data file format and format conversion problems
    data quality questions
    data updating and maintenance

The review process must be completed before the decision to rely on other digital data sources is made. Additionally, formal data sharing agreements should be made between the two organizations.

Quality Control (Accuracy) Checking Procedures
A quality assurance (QA) program is a crucial aspect of the GIS implementation process. To be successful in developing reliable QA methods, individual tasks must be worked out and documented in detail. Data acceptance criteria are a very important aspect of the conversion program, and can be a complex issue. A full analysis of accuracy and data content needs will facilitate the creation of documentation which may be utilized by the accuracy assessment team. A combination of automatic and manual data verification procedures is normally found in a complete QA program. The actual process normally involves validation of the data against the source material, evaluation of the data's utility within the database design, and an assessment of the data with regard to the standards established by the organization handling the conversion project. Automated procedures will normally require customized software in order to perform data checks. Most GIS packages today have their own macro programming languages which allow for the creation of customized programs. Some automated QA procedures include:

    checking that all features are represented according to conversion specifications (e.g., placed in the correct layer);
    checking that features requiring network connectivity are represented with logical relationships; for example, two different diameters of piping or two different gauges of wire must have a connecting device between them, which should be represented by a graphic feature with unique attributes;
    checking that relationships of connectivity are maintained between graphic features (Montgomery and Schuch, 143).

Manual quality control procedures normally involve creating and checking edit plots of vector data against source map data. QA requirements which will have to be met include:

    absolute/relative accuracy requirements for map features should be met, and all features specified on the source map should be included on the edit plot;
    map annotation should be in the required format (e.g., correct symbology, font, color, etc.), and text offsets should be within the specified distance and of correct orientation;
    plots of joined datasets should have adequate edge matching (Montgomery and Schuch, 145).

Final Correction Responsibilities
Quality control editing of the digitized product is a crucial step in preparing spatial feature data. After a data layer is initially digitized, an edit plot, a hard-copy printing of the digitized features, is produced. The edit plot is printed at the same scale as the source data and checked by overlaying the plot with the source map on a light table. This edit check allows for the detection of errors such as misaligned or missing features. Corrections may then be made by adding, deleting, or re-digitizing features. When digitizing on-screen, feature placement errors may be corrected by rubbersheeting the graphic features to fit the source data. Rubbersheeting is the process of stretching graphic features through the establishment of graphic movement links, each with a from-point (where the feature presently is located) and a to-point (where the feature should be placed). GIS graphic manipulation routines then move graphics according to these specified links.

File Matching Procedures (Edge Match, Logical Relationships Within Data, Etc.)
Files which are going to be spatially joined must first have adequate edge-matching alignment of their graphic features.
This entails a number of basic GIS graphic manipulation procedures: (1) coordinate transformation, which projects the data layer into its appropriate real-world coordinates; (2) rubbersheeting of the graphic features in one data file to accurately coincide with the adjacent graphic features in another file; (3) spatial joining, the combining of two or more data files into one seamless file spanning the geographic area of all files. Coordinate transformation is the process of establishing control points upon the digitized layer and defining real-world coordinates for those points. A GIS coordinate transformation routine is then used to transform the coordinates of all features on the data layer based upon those control point coordinates. Once transformed, spatially adjacent data layers may then be displayed simultaneously within their combined geographic extent. A determination may then be made as to the effectiveness and accuracy of the coordinates assigned to the data layers. If necessary, graphic features found in both data layers may be rubbersheeted to better align features which will need to be connected. For example, if the endpoint of a graphic feature representing a street centerline is not reasonably close to its corresponding starting point on the adjacent data layer, one or both of these graphic lines will have to be moved so that the graphic feature will connect. An alignment problem such as this can signal possible errors in the coordinate transformation and/or the source data. After features are accurately matched, the data files may be combined into a single data file. The combined data file will afterwards require editing and the development of new topological relationships in the new dataset. An example of one post-spatial join editing procedure is the

removal of graphic line-connection points called nodes, which may interfere with various elements of the attribute database.

Final Acceptance Criteria
Standards for appropriate quality assurance, and accuracy verification procedures in general, depend greatly upon the data sources, the design of the database for which data is being prepared, and the actual data conversion approaches applied. Acceptance of the joined digital map files depends upon the data's meeting certain criteria. Criteria usually relate to accuracy, such as whether the product meets National Map Accuracy Standards at the appropriate scale. Other criteria may relate to whether attributes, if they have been added, are in order. Most acceptance determinations should be based on whether the feature data meets standards of accuracy, completeness, topological consistency, and attribute data content.
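Where National Map Accuracy Standards are used as an acceptance criterion, the horizontal test can be sketched as follows. The thresholds reflect the published standard: no more than 10 percent of tested points may be in error by more than 1/30 inch at map scale for scales larger than 1:20,000, or by more than 1/50 inch for scales of 1:20,000 or smaller. The function names and the sample error values are hypothetical, and ground units are assumed to be feet.

```python
def nmas_tolerance_ft(scale_denominator):
    """Horizontal NMAS tolerance converted to ground feet.

    scale_denominator is the N in a map scale of 1:N.
    """
    map_inches = 1 / 30 if scale_denominator < 20000 else 1 / 50
    return map_inches * scale_denominator / 12  # inches to feet


def meets_nmas(errors_ft, scale_denominator):
    """True if no more than 10% of tested points exceed the tolerance."""
    tolerance = nmas_tolerance_ft(scale_denominator)
    over = sum(1 for e in errors_ft if e > tolerance)
    return over <= 0.10 * len(errors_ft)


# Tolerance for a 1:1,200 (1 inch = 100 feet) map and a 1:24,000 map.
print(round(nmas_tolerance_ft(1200), 2))
print(round(nmas_tolerance_ft(24000), 2))

# Hypothetical test of 20 well-defined points on a 1:1,200 map:
# two points exceed the tolerance (10%), so the sheet still passes.
sample_errors = [1.0] * 18 + [5.0] * 2
print(meets_nmas(sample_errors, 1200))
```

The test points should be well-defined features whose true positions come from a source of higher accuracy, such as a field survey.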

Building Main Database One of the final stages involved in developing a GIS database involves putting all the converted data together. Establishing one uniform database involves entering all attribute and feature data into a common database with an established workable file/directory structure, sometimes known as a data library. As the database is developed and data is ready for use, it can be released to the various data users for analysis. Once the database is designed, it then becomes important to maintain data accuracy and currency. If changes are made within the confines of the data layers, these changes must be defined and updates made to keep the integrity of the database. Subsequent guideline documents deal with data integration and database maintenance.

ATTRIBUTE DATA ENTRY

Source Documents There are a number of source documents which can be utilized as data for the attribute database. Many organizations are able to utilize their existing electronic database files and import this data directly into their GIS database. In the case of paper files relating to geographic areas, and attribute data existing on paper maps, this data will have to be manually entered into GIS attribute data files in the form of tables. Before this information is entered into a database, it must first be reviewed and edited. It is also important to have a procedural plan designed for the entry of this data in order to coordinate the flow of these source documents.

Pre-Entry Checking And Editing A review of GIS attribute source documents can oftentimes reveal an unorganized mass of maps, charts, tables, spreadsheets, and various textual documents. The checking and editing of source

documents is handled in the scrubbing phase of the project. Without a specific plan designed for the entry of these various data elements, it is highly likely that error will be introduced into the GIS database. It is crucial that all source documents are readable and properly formatted to allow for the most efficient entry of numerical and textual data. If the database conversion is being outsourced, and the contractor is unable to read the source data, the resulting database will be inaccurate, more costly, or both. It is recommended that a formal scrub manual, designed according to the database and application requirements, be developed to help facilitate the supplementing of source data and its entry into the database. Logical consistency is an important element for both graphic and attribute elements. Records and attributes which are related to graphic elements within a network system must maintain logical relationships. Document Flow Control An organization will typically have a multitude of different document formats which it will need in coding all of its GIS attribute data. It is crucial that tracking mechanisms be implemented in preparation for the key entry process. Usually duplication of source documents which will be used in the key entry process will not be feasible. As many source documents to be key entered are used on a regular basis within the organization, it will be important to develop guidelines for tracking these documents if they are needed during the process. Timing and coordination will be factors in planning document usage. Key Entry Process As stated earlier, some organizations will be able to enter much tabular data into the database simply by way of importing existing tables or files into the GIS, or relating tables which exist in their external DBMS. Normally, it will be necessary to enter attribute data into the system utilizing a keyboard. Many organizations choose to use lists when entering data from the keyboard. 
It is much more efficient during conversion to enter a 2- or 3-digit code which has a reference list associated with it. Typing a full description of the graphic into the text field takes longer and increases the chance of typographical error.

Digital File Flow Control
Numerous files will result from the key entry process. These files will need to be given proper names and directory locations in order to track and prepare the data logically for use within the GIS.

Quality Control Procedures
Most databases allow the user to specify the type of field for each data element: whether it is numeric, alphanumeric, date, etc., whether it has decimal places, and so on. This feature can help prevent mistakes, as the system will not allow entries other than those specified in advance. There are a number of automated and manual procedures which can be performed to check the quality of attribute data. Some customized programs may be required for the testing of some quality control criteria. Some attribute value validity checks which may be performed include: verifying that each record represents a graphic feature in the database, verifying that each feature has a tabular record with attributes associated with it, determining whether all attribute records are correct, and determining whether all attributes calculated by applications are correct based upon the input values and the corresponding formulas. The translation of obsolete record symbology into a GIS-usable format, according to conversion specifications, is one procedure which will have to be conducted manually (Montgomery and Schuch, 145). The responsibility for checking and maintaining automated quality control procedures can be placed in the hands of the staff responsible for actual data conversion. When outsourcing data conversion, one of the most time-consuming aspects of the project is the evaluation of converted data once it has been received from the vendor. Usually, automated routines are developed which can be utilized in evaluating the datasets and in determining whether the data fulfills all of the requirements and standards stated in the contract. This process can be simplified if the client company delivers automated quality control checking routines to the data conversion vendor. The vendor is then able to run these routines and to evaluate and edit the data so that it will meet requirements before it is even shipped to the client. Such a procedure saves valuable time and expense which would otherwise have been spent on quality control evaluation, shipping and business communication.

Change Control
Final editing procedures and data acceptance are based upon whether major revisions in the data will need to be performed. After data verification and quality assurance checks, it may be necessary to re-evaluate the database design, the technical specifications of the data, and the conversion procedures overall.
Ideally, the planning and design of the database will be sufficiently comprehensive and correct such that the logical/physical database design will not have to be modified. However, it is rare that a data conversion project will be able to push through to completion without some changes being necessary. Many conversion projects develop procedures which are used to identify, evaluate and then to approve or disapprove the final products. A form should be developed which is used to list desired changes which have been identified. The listing of desired changes is then evaluated in terms of both the volume of the data which has yet to be edited, and the amount of data which has already been converted. The conversion vendor will usually develop documentation which describes the estimated cost/savings which will be associated with the changes and final edits. Most organizations now accept the fact that changes will be a normal part of data conversion and change requests are usually expected. The challenge then lies in the methods by which change mechanisms are developed and agreed upon between client and vendor. Final Acceptance Criteria Acceptance criteria are the measures of data quality which are used to determine if the data conversion work has been performed according to requirements specified. In the case of outsourcing of conversion, these criteria will determine if the data has been prepared according to the contract specifications. If the data does not meet these specifications, the conversion contractor will be required to perform any necessary editing upon the data to reach acceptable standards. Acceptance criteria and standards may vary between organizations.

File Matching And Linking In most GIS packages which utilize relational database technology, the file matching and linking is a fairly simple process. Most GIS packages contain straight-forward procedures for joining and relating attribute files, which normally entails the selection of the unique identifying key between the graphic feature attribute table and any other data attribute tables. Once the identifier-link has been specified, the GIS software automatically establishes the relationship between the tables, and maintains the relationship between them.
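The linking step described above, together with the record-to-feature checks mentioned under Quality Control Procedures, can be sketched with an ordinary relational join. The example below uses Python's built-in sqlite3 module as a stand-in for the GIS package's own database facilities; the table names, column names and sample rows are all hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE parcel_features (parcel_id TEXT PRIMARY KEY, area_sqft REAL);
    CREATE TABLE assessment (parcel_id TEXT, owner TEXT, land_use TEXT);
    INSERT INTO parcel_features VALUES
        ('P-001', 8500), ('P-002', 12000), ('P-003', 9100);
    INSERT INTO assessment VALUES
        ('P-001', 'Smith', 'R1'), ('P-002', 'Jones', 'C2'),
        ('P-999', 'Unknown', 'R1');
""")

# Link the feature attribute table to the external table on the shared key.
joined = con.execute("""
    SELECT f.parcel_id, f.area_sqft, a.owner, a.land_use
    FROM parcel_features AS f
    JOIN assessment AS a ON a.parcel_id = f.parcel_id
    ORDER BY f.parcel_id
""").fetchall()

# Features that have no tabular record.
features_without_record = [row[0] for row in con.execute("""
    SELECT f.parcel_id
    FROM parcel_features AS f
    LEFT JOIN assessment AS a ON a.parcel_id = f.parcel_id
    WHERE a.parcel_id IS NULL
""")]

# Tabular records that match no graphic feature.
records_without_feature = [row[0] for row in con.execute("""
    SELECT a.parcel_id
    FROM assessment AS a
    LEFT JOIN parcel_features AS f ON f.parcel_id = a.parcel_id
    WHERE f.parcel_id IS NULL
""")]

print(joined)
print(features_without_record)
print(records_without_feature)
```

Running the two unmatched-record queries before accepting a delivery is a simple way to confirm that the identifying key was coded consistently on the maps and in the tables.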

EXTERNAL DIGITAL DATA

Sources Of Digital Data
Digital spatial and attribute data can be obtained from a variety of sources. Various companies today produce canned digital spatial datasets which are ready for use within a GIS environment. Utilizing an existing database is a good way to supplement data in the conversion process and is one of the best ways to save money on the cost of producing a database. Most federal, state, and local government agencies have data which is available to the public for minimal cost. Two of the largest spatial databases which are national in coverage are the U.S. Geological Survey's DLG (Digital Line Graph) database and the U.S. Census Bureau's TIGER (Topologically Integrated Geographic Encoding and Referencing) database. Both systems contain vector data with point, line and area cartographic map features, and also have attribute data associated with these features. The TIGER database is particularly useful in that its attribute data also contains valuable Bureau of the Census demographic data which is associated with block groups and census tracts. This data is used today in a variety of analysis applications. Many companies have refined various government datasets, including TIGER, and these datasets offer various enhancements in their attribute characteristics, which increase the utility of the data. Unfortunately, problems associated with the positional accuracy of these datasets usually remain and are much more difficult to resolve. Satellite and digital orthophoto imagery, raster GIS datasets, and tabular datasets are also available from various data-producing companies and government agencies.

Transfer Specifications
Many government agencies produce spatial data in their own unique formats. Many full-feature GIS packages have the ability to import government spatial datasets into data layers which are usable within their own environment. Some agencies or companies may produce their data in the most common data formats for government data (e.g., TIGER or DLG format). Such policies allow for easy transfer to various systems.

Quality Control Checks
Quality control checks on external datasets will be necessary. Many government datasets, although extensive in their geographic coverage and in the utility of the associated data, do not always have the most accurate or complete data, particularly in terms of positional accuracy. It is always advisable to be skeptical of a dataset's accuracy statement and compliance with standards, and to fully test and evaluate the data before purchasing it or incorporating it into the database. The various automated and manual quality control procedures discussed for assessing both cartographic feature and attribute characteristics should be utilized in a quality assurance evaluation of the external data.

ACCURACY AND FINAL ACCEPTANCE CRITERIA

Acceptance criteria determine the standards with which data must comply in order to be usable within the system. Graphic acceptance standards for external digital data may be grouped into three cartographic quality types: relative accuracy, absolute accuracy and graphic quality. Standards for GIS data will normally depend upon the accuracy required of the dataset. In the GIS environment, accuracy will depend upon the scale at which the data is digitized and the scale at which it is meant to be used.

Relative accuracy is a measure of the normal deviation in the distance between two objects on a map, as compared to their separation in the real world. It is normally stated as plus or minus a number of measurement units (normally inches or feet) by which a feature is displaced relative to its neighboring map features.

Absolute accuracy is a measure of the maximum deviation between the location of the digital map feature and its location in the real world. Many organizations base their absolute accuracy standards on National Map Accuracy Standards.

Graphic quality refers to the visual cartographic display quality of the data, and pertains to aspects such as the data's legibility on the display, the logical consistency of map graphic representations, and adherence to common graphic standards. Placement and legibility of annotation, linework, and other common map elements all fall under graphic quality.
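As an illustration of an absolute accuracy criterion, the sketch below applies the horizontal rule of the National Map Accuracy Standards: no more than 10 percent of tested well-defined points may be in error by more than 1/30 inch at publication scale for maps larger than 1:20,000, or 1/50 inch for maps at 1:20,000 or smaller. The function names and the sample error values are assumptions made for this example only.

```python
def nmas_tolerance_ft(scale_denominator):
    """Ground tolerance in feet for a map at 1:scale_denominator."""
    # 1/30 inch on the map for scales larger than 1:20,000, else 1/50 inch.
    map_inches = 1 / 30 if scale_denominator < 20000 else 1 / 50
    # Convert map inches to ground inches, then to feet.
    return map_inches * scale_denominator / 12


def passes_nmas(errors_ft, scale_denominator):
    """True if no more than 10% of tested points exceed the tolerance."""
    tolerance = nmas_tolerance_ft(scale_denominator)
    over = sum(1 for e in errors_ft if e > tolerance)
    return over <= 0.10 * len(errors_ft)


# Hypothetical check of ten surveyed points against a 1" = 200' (1:2,400) map.
sample_errors_ft = [1.2, 3.4, 0.8, 5.9, 2.2, 7.5, 4.1, 0.3, 6.0, 2.7]
print(round(nmas_tolerance_ft(2400), 2), passes_nmas(sample_errors_ft, 2400))
```

An organization that needs a tighter standard than NMAS would substitute its own tolerance; the structure of the test stays the same.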

Informational quality is another accuracy criterion that should be given close attention in building a database. Informational quality relates to the level of accuracy of both the map graphic features and their corresponding tabular attribute data. There are four basic categories for assessing these qualities:
  completeness
  correctness
  timeliness
  integrity


Together, these aspects of informational quality determine the extent to which the dataset will meet the basic requirements for data conversion acceptance.

Completeness is an assessment of the dataset's existing features against what should currently be located within the dataset. Completeness may relate to a number of digital map features: annotation symbols, textual annotation, linework. Completeness also relates to the attribute data, and whether all of the necessary attributes are accounted for. A typical minimum requirement for dataset completeness, when outsourcing conversion, is that not more than 1% of the required features and attributes will be missing from the digital dataset. For example, if only 72 of the 80 roads located within a geographic area are included on the map, then the map is only 90% complete, well short of such a requirement.

Correctness relates to the truth of the information contained in the dataset. If a map shows a number of roads, and the linework is positioned correctly but is not labeled correctly, there is a problem with correctness. Correctness applies both to map features and to attribute data: a dataset may be positionally accurate and complete, yet still carry the wrong label for an object. Correctness can be evaluated through automated or manual validation procedures. One way to assess correctness is to match one dataset against another source and check where the two agree. Every graphic and database feature has the potential for error.

Timeliness is another measure of informational quality, and it is a special form of correctness. Timeliness is based upon the currency of a dataset; if the dataset is not fully up-to-date, it should at least be no older than a specified age. The timeliness of a dataset is measured from the date the dataset arrives at the client's door. From that point on, it is the responsibility of the client organization to maintain the data and its currency.

The integrity of a dataset is a measure of its utility. Graphically, database integrity means that the dataset maintains its connectivity and topological consistency: all lines are connected, there are no line overshoots or undershoots, and all features on the display represent real-world features. To maintain database integrity, there should not be any missing or duplicate records or features.
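The completeness arithmetic and the duplicate-record check described above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the feature identifiers, the function names, and the 99% threshold constant (taken from the "not more than 1% missing" requirement) are assumptions for the example.

```python
def completeness(required_ids, delivered_ids):
    """Share of required features that are present in the delivered dataset."""
    required = set(required_ids)
    present = required & set(delivered_ids)
    return len(present) / len(required)


def duplicate_ids(delivered_ids):
    """Identifiers that occur more than once in the delivery."""
    seen, duplicates = set(), set()
    for feature_id in delivered_ids:
        if feature_id in seen:
            duplicates.add(feature_id)
        seen.add(feature_id)
    return sorted(duplicates)


MIN_COMPLETENESS = 0.99  # "not more than 1% missing"

# The road example from the text: 80 required roads, 72 delivered.
required_roads = [f"road-{n}" for n in range(1, 81)]
delivered_roads = required_roads[:72]

share = completeness(required_roads, delivered_roads)
print(f"{share:.0%} complete, acceptable: {share >= MIN_COMPLETENESS}")
```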


GIS DEVELOPMENT GUIDE: PILOT STUDIES AND BENCHMARK TESTS

INTRODUCTION

Prior to making a commitment to a new technology like GIS, it is important to test the concepts and physical designs for developing such a system within a local government. This can be done by performing a pilot study to determine whether GIS can be useful in the daily conduct of business and, if so, by conducting a benchmark test to determine the best hardware and software combination to meet specific needs. Numerous GIS pilot studies and benchmark tests have been conducted by local governments within the state and across the nation. However, decisions on deployment of GIS should not be based solely on the experience of others. Managers and end users respond best to relevant local data and actual applications, and will learn more readily if they have first-hand experience defining and conducting a pilot study or benchmark test in-house.

PILOT STUDY: PROVING THE CONCEPT

Planning a Pilot Study
A pilot study provides the opportunity for a local government to evaluate the feasibility of integrating a GIS into the day-to-day functions of its operating units. Implementing GIS is a major undertaking. A pilot study provides a limited but useful insight into what it will take to implement GIS within the organization. Proving the concept, measuring performance, and uncovering problems during a pilot study that runs concurrently with detailed system planning, database planning, and design is more beneficial than pressing forward with implementation without this knowledge. To maximize the usefulness of the pilot study, it must be planned and designed to match the organization's work flow, functions, and goals as described in the GIS needs assessment. The pilot study will be successful only if it has the support and involvement of upper management and staff from the outset. This involvement will provide the opportunity to evaluate the ability of management and staff to learn and adopt new technology.

Objectives of a Pilot Study
A pilot study is a focused test to prove the utility of GIS within a local government. It is neither a full GIS implementation nor simply a GIS demonstration, but rather a test of how GIS can be deployed within an organization to improve operations. It is the platform for testing preliminary design assumptions, data conversion strategies, and system applications. A properly planned and executed pilot study should:

  create a sample of the database
  test the quality of source documents
  test applications
  test data management and maintenance procedures
  estimate data volumes
  estimate costs for data conversion
  estimate costs for staff training

The pilot study should be limited to a small number of departments or GIS functions and a small geographic area. The pilot study should be application or function driven. Even though data conversion will take a major portion of the pilot study development time, it is the use of the data that is important. What the GIS can do with the data proves the functionality and feasibility of GIS in local government. The Needs Assessment document has identified applications, data required, sources of data, etc. In addition, a conceptual database design has been previously developed. Following is a list of procedures for carrying out a pilot study:
  select applications from needs assessment
  determine study area
  review conceptual database design
  determine conversion strategy
  develop physical database design
  procure conversion services and develop conversion work plan
  commence source preparation and scrubbing
  develop acceptance criteria and QC plan
  develop data management and maintenance procedures
  test application
  evaluate and quantify results
  prepare cost estimates

Selecting Applications to Include
Care must be taken to select a variety of applications appropriate to test the functional capabilities of GIS and the entire database structure. A review of the Needs Assessment report should provide a selection of applications that meet these requirements. Make sure to include data administration applications along with end user/operations applications. Data loading, backups, editing and QC routines have little user appeal, but they represent important functions that the organization will rely on daily to update and maintain the GIS database.

Selecting Data
Data to be tested in the pilot study can either be purchased from external sources or converted from in-house maps, photos, drawings, documents and databases. In either case, the data should represent the full mix and range of data expected to be included in the final database. It should include samples of archived or legacy system records and documents if they are planned to be included in the GIS in the future. All potential data types and formats should be considered for the pilot. This is the chance to test the whole process of integrating and managing data, together with the utility of the data in a GIS environment and different conversion and compression methods, before final decisions are made.

Spatial Extent of the Pilot Study
Selection of the study area should address several issues:
  Data density
  Representative sampling
  Seamless vs. sheet-wise conversion or storage
Choose an area (or areas) of interest that represents the range of data density and complexity. Make sure that all data entities to be tested exist in the area of interest. This will provide a representative dataset and allow the extrapolation of data volumes and conversion costs for the range of data over the entire conversion area. To measure hardware performance, the selected area should match the file or map sheet size the end user will normally work with. Be aware that even if the data is currently represented as single map sheets at a variety of scales, the GIS will store the data as a "seamless" dataset.

Preliminary Data Conversion Specifications
A set of data conversion specifications needs to be defined for each of the required data layers in the test datasets. The conversion specifications need to address:
  Accuracy
  Coverage
  Completeness
  Timeliness
  Correctness
  Credibility
  Validity
  Reliability
  Convenience
  Condition
  Readability
  Precedence
  Maintainability
  Metadata

The foundation of the GIS is derived from the conversion process which creates a topologically correct spatial database. The following diagram identifies in detail the steps necessary to create this database.


Steps in creating a topologically correct vector polygon database

  FIELD DATA is divided into NON-SPATIAL ATTRIBUTES and SPATIAL DATA, linked by unique identifiers.
  NON-SPATIAL ATTRIBUTES: INPUT TO TEXT FILE
  SPATIAL DATA: MANUAL DIGITIZING (DIGITIZE) or SCANNING (SCAN AND VECTORIZE), followed by:
    VISUAL CHECK
    CLEAN UP LINES AND JUNCTIONS
    WEED OUT EXCESS COORDINATES
    CORRECT FOR SCALE AND WARPING
    CONSTRUCT POLYGONS
    ADD UNIQUE IDENTIFIERS MANUALLY
  LINK SPATIAL TO NON-SPATIAL DATA
  TOPOLOGICALLY CORRECT VECTOR DATABASE OF POLYGONS

Figure 1 - Source: Principles of Geographic Information Systems for Land Resources Assessment, Burrough, P. A., 1986.

Selecting GIS Hardware and Software
To provide for continuity and to minimize added expense for total system development, select the most likely choice of hardware and software based on the database design specifications, and purchase or borrow what is necessary for the pilot study from the hardware and software vendors.

Selecting a Data Conversion Vendor
Even though this is only a pilot study, it also serves as a test of likely suppliers of hardware, software and data conversion services. Therefore, a reputable data conversion vendor should be selected to perform the work, and prior users of the vendor's services should be contacted to confirm the vendor's ability to meet expectations. It shouldn't matter what method the conversion vendor uses to convert the data. Be open to suggestions from the potential conversion vendors as to the most cost-effective methods to convert the data. As long as you get the data in the correct and usable format to satisfy your database plans, the method used for data conversion should not be an issue.

However, you will get much better results if the vendor has first-hand experience with the chosen GIS software and the data conversion takes place in the same GIS software package. There is always a chance of losing attributes or introducing coordinate precision errors when converting from one format to another.

Defining Criteria for Evaluating the Pilot Study
The pilot study performance must be evaluated in measurable terms. By its very name, a pilot study implies an initial investigation. An investigation implies a set of questions to ask and a set of answers to obtain. For clarity, the questions can be organized to match the major components of GIS, plus others as needed.

Database
  Were adequate source documents available and was their quality sufficient?
  How much effort was involved in "scrubbing" the data before conversion?
  How long did the conversion process take?
  Were there any problems or setbacks?
  Was supplemental data purchased and, if so, what was the cost?
  Did the data model work for each layer as defined?
  Was the data adequate (i.e. all data elements populated)?
  What errors were found in the data (closure, connectivity, accuracy, completeness, etc.)?

Applications
  Were the applications written as specified?
  Did the applications fit smoothly in the GIS or was a separate process invoked?
  Are the required functions built into the GIS or will applications need to be developed?
  Is the GIS customizable?
  How responsive and knowledgeable is the software developer's technical support staff?
  Were expectations met?

Management and Maintenance Procedures
  How will the data be updated, managed, and maintained in the future?
  Have all those who will contribute to the updating and maintenance been identified?
  Have data management and administration applications been developed and tested?
  Have data accuracy and security issues been addressed?
  Who will have permission to read, write, and otherwise access data?
  How will using GIS change information flow and work flow in the organization?

Costs


  How large a database will be created?
  What will be the required level of existing staff commitment during the data preparation and GIS construction process?
  What will be the cost for data conversion of in-house documents?
  What will be the cost for obtaining supplemental data from outside sources?
  How will GIS impact or interface with existing hardware and software?
  What new hardware, software and peripheral equipment is required?
  How much training of staff is required?
  Will additional staff with distinct GIS programming and analysis capabilities be required?

EXECUTING THE PILOT STUDY

Data Preparation (Scrubbing) and Delivery
Preparation of source documents representing the entire range of data to be included in the database must be completed before the conversion contractor can begin work. Data preparation includes improving the clarity of data for people outside the organization who are unfamiliar with internal practices. This pre-conversion process is referred to as "scrubbing." Scrubbing is used to identify and highlight features on maps that will be converted to a digital format. The process provides a unique opportunity to review or research the source and quality of the documents and data being used for conversion.

[Figure 2 diagram. Tasks shown:]
  Identify Database Requirements
  Identify Data to be Created
  Identify Appropriate Data Sources
  Develop Conceptual Database Design
  Develop Physical Database Design
  Procure Conversion Services
  Identify Accuracy Requirements
  Determine Conversion Strategy
  Develop Data Conversion Work Plan
  Commence Source Preparation and Scrub
  Commence Other In-House Activities
  Finalize Acceptance Criteria and QC Plan
  Edit Delivered Data
  Commence Database Maintenance
  Develop Database Maintenance Procedures

Figure 2 - Guide to Data Conversion. Source: GIS Data Conversion Handbook

Scrubbing is generally an internal process, but may also be performed by the conversion vendor. The conversion vendor will need to be trained on how to read your maps or drawings. The first map (or all maps) may need to be marked with highlighter pens and an attached symbol key to define what features need to be collected. At the same time the maps are marked up, coding sheets are filled out with the attributes of the features to be captured, and a unique ID number is assigned to both the feature and the coding sheet to create a relate key. This key is critical to connecting the attribute records to the correct map feature defined in Database Design.

The best key is a dumb, unique, sequential number that has no significance. The key should never be intelligent, that is, contain other information. The key should never be a value that has meaning or has the potential of changing. Don't use address, map sheet number, XY coordinates or date installed. These values are very important and should each have their own field in the database, but don't use them as the primary key. The reason is very simple. If you use a smart key like SBL number and you have to change the number, you run the risk of losing the connection to all other related tables that key on the SBL number. Make the change and the records no longer match. However, if the key is unique and has no meaning, it will never have to be changed. Street names change, numbers get transposed, features are discovered to be on the wrong map sheet or at the wrong XY coordinates. If such values are used as keys, a large defensive programming effort must be in place to guarantee the integrity of the intelligent key whenever corrections are made. Avoid the grief and use a dumb, unique key.

Coding sheets are only required if the attributes of the features are not readily available from the map document. For example, if all the required attributes for a feature are shown as annotations on the map (e.g. the size, material and slope for a sanitary sewer line), then a coding sheet is unnecessary. If additional research is required to find the installation date, contractor name, flow modeling parameters or video inspection survey, then a coding sheet needs to be filled out for each feature. Again, it is critical to create and maintain a unique key between the map feature and the attribute data on the coding sheet.

Once the data has been prepared for conversion, make copies of everything being sent out and make an inventory of the maps, coding sheets, photos, etc. that will be sent to the vendor. Ask the vendor to perform an inventory check on the receiving end to verify that a complete shipment arrived.

Change management is essential. If the manual maps or data will be continually updated in-house during the conversion process, keep careful records of which maps and/or features have changed since the maps were sent out. This is an important process that needs to be fully in place if the pilot study leads to a full GIS implementation.

When and Where to Set Up the Pilot Study
Expect the pilot study to have an impact on daily work. Choose participants for whom the pilot will not have a negative impact on the daily workload. Even if the GIS is to assist a mission-critical process like E911, conduct the pilot as a parallel effort; don't expect it to replace an existing system. At the same time, try to make the GIS a part of the daily workflow to test the integration potential.
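Returning to the relate key described under Data Preparation, the following sketch shows why a dumb sequential key survives a correction to a meaningful value. It uses Python's built-in sqlite3 module purely for illustration; the table names, column names, SBL values and address are hypothetical and do not come from any particular GIS package.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# parcel_id is a dumb, sequential key; SBL and address are ordinary fields.
con.execute(
    "CREATE TABLE parcel ("
    "parcel_id INTEGER PRIMARY KEY, sbl TEXT, address TEXT)"
)
con.execute(
    "CREATE TABLE inspection ("
    "inspection_id INTEGER PRIMARY KEY, "
    "parcel_id INTEGER REFERENCES parcel(parcel_id), note TEXT)"
)

con.execute(
    "INSERT INTO parcel (sbl, address) VALUES ('101.12-3-4', '12 Main St')"
)
parcel_id = con.execute("SELECT parcel_id FROM parcel").fetchone()[0]
con.execute(
    "INSERT INTO inspection (parcel_id, note) VALUES (?, ?)",
    (parcel_id, "hydrant flushed"),
)

# The SBL number turns out to be transposed and is corrected.
# Only one field in one table changes; the relate key is untouched.
con.execute(
    "UPDATE parcel SET sbl = '101.12-3-40' WHERE parcel_id = ?", (parcel_id,)
)

rows = con.execute(
    "SELECT p.sbl, i.note FROM inspection i "
    "JOIN parcel p ON p.parcel_id = i.parcel_id"
).fetchall()
print(rows)
```

Had the inspection table been keyed on the SBL value itself, the same correction would have required updating every related table, or the join would silently return nothing.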


To ensure some level of success of the pilot study, choose willing participants to act as the test bed/pilot study group. Make sure they understand the impact the pilot will have on the organization and the level of commitment expected from staff members. Use educational seminars to inform the employees about GIS technology and the purpose of the pilot study. Communicate very clearly what the objectives of the pilot study will be, what functions and datasets will be tested and which questions will be investigated. Describe the required feedback and the questionnaires or checklists that will be used. Above all else, communicate to keep staff informed and to control expectations.

Who Should Participate
A team representing a cross-section of the organization, including managers, supervisors, and operations staff, should be assembled for the pilot study. Choose the staff carefully to assure objective and thoughtful system evaluation. If possible, choose the same people who were involved in the needs assessment process. They will be more aware of GIS technology and may be eager to see the project move forward.

Testing and Evaluation Period
Hold a pilot team kickoff meeting with the conversion, software and hardware vendors present. Restate the objectives of the pilot study and the responsibilities of each party. Review the Needs Assessment and database design documents, and assess training requirements. Define communication protocol guidelines if necessary to keep key players communicating and resolving problems. Before the data arrives, install the software and/or hardware in the target department. Conduct user training to familiarize employees with the use of the GIS software. If employees are unfamiliar with computers, allow more time for training and familiarization. Once the data has been converted and delivered, have the conversion vendor or the software vendor load the data on the target machines.
Be sure that this step and all preparatory efforts are monitored and treated as a learning process for your staff. Begin a thorough investigation of the capabilities and limitations of the hardware and software. Keep user- and vendor-defined checklists beside the machines at all times. Have each user log their observations and impressions with each session. Make sure to note any change in performance as a function of time of day or workload. Also note whether the user's level of comfort has increased with time spent using the software. Log all calls to the data conversion, software and hardware vendors. Note the knowledge and skill of the call takers, their responsiveness, and the turn-around time from initial call to problem resolution. Some problems may be addressed on the phone; others may take days. If the call cannot be handled immediately, ask the outside technical support person for an estimated time to resolution.
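A call log of the kind described above only becomes useful once turn-around times are computed from it. The sketch below is one hypothetical way to do that; the record layout, vendor labels and timestamps are invented for the example.

```python
from datetime import datetime


def turnaround_hours(opened, resolved):
    """Elapsed hours from the initial call to problem resolution."""
    return (resolved - opened).total_seconds() / 3600


# Hypothetical log entries kept beside the pilot workstation.
calls = [
    {"vendor": "software",
     "opened": datetime(2000, 1, 3, 9, 0),
     "resolved": datetime(2000, 1, 3, 9, 30)},
    {"vendor": "conversion",
     "opened": datetime(2000, 1, 3, 14, 0),
     "resolved": datetime(2000, 1, 5, 14, 0)},
]

hours = [turnaround_hours(c["opened"], c["resolved"]) for c in calls]
for call, h in zip(calls, hours):
    print(call["vendor"], h)
```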

Obtaining Feedback From Participants


It is imperative that all individuals involved in the pilot study provide input before, during and after the pilot study. The best method to guarantee feedback from the participants is to have them help formulate the objectives of the pilot, the questionnaires and the checklists. Sample questions to address were listed earlier in this document. Augment these with questions from your own staff. Some questions can be answered with a yes/no checklist, some answers will be a dollar figure, and some will require a scoring system to rate aspects of system performance from satisfactory to poor or unacceptable. Other issues that may affect information flow, traditional procedures and work tasks will require participants to write essay responses or draw sketches of changes they would like to see in the user interface or in the map display. All responses should be compiled in such a way that they can be measured and rated numerically.

EVALUATING THE PILOT STUDY

What Information Should Be Derived From the Pilot Study
The first question to be addressed is whether the pilot study was a success. Success doesn't necessarily mean that the process went without a hitch. A successful pilot study can be fraught with problems, and can even end with GIS being rejected as a technology for the organization. The success of the pilot study should be measured by whether the goals and objectives defined for the pilot were achieved. Most issues listed below were covered in earlier portions of the document, but are summarized again.

Data Specific Issues
Many issues to be assessed in the pilot study are data specific and are related to data quality, volumes and conversion efforts.

Source Document Quality
Most first-time GIS users are so awestruck by seeing their maps on the computer screen or on colorful hard copy plots that they overlook the importance of reviewing the quality and usefulness of the source documents and the utility of the final product. Many original maps are so old and faded that they are unusable as source documents for creating a GIS dataset. Some municipal agencies have scrapped the existing maps and re-surveyed the entire town's street and utility infrastructure. This is not a cheap alternative, but digitizing bad maps is not a good investment.

Quality Control Needs
There is a danger in any data conversion project (even for a pilot study) that the vendor will perform the conversion and deliver the data to the client without an adequate quality control process in place. If the client is new to GIS, they may not be able to determine whether all the data is present, whether the data is layered correctly or whether all attributes are populated.

Because a GIS treats map features as spatially related, connected or closed features, GIS query and display functions can be used to identify features that are in error. By displaying each map layer one at a time, symbolized by the attributes of the features, item values that are out of range (blank, zero, or extreme values) will show up graphically on the maps in different colors or symbol patterns. Erroneous values should be reported to the conversion vendor immediately for resolution. The client may consider using a third-party GIS consulting firm to review the quality of the data and verify the map accuracy.

Data Availability
Before an attribute field is added to a coding sheet as a target for data capture, be sure the value is readily available and is important to the operation of the agency. Many data fields would be nice to have, but may not be cost effective. For example, a sidewalk and driveway inventory for a community would be a useful data layer to capture. However, if there are no existing maps showing sidewalk locations, using aerial photos and photogrammetry is a costly approach to capturing sidewalks and driveways. A cheaper alternative may be to create two single-digit fields in the street centerline attribute table to hold flags indicating the presence or absence of sidewalks on the left and right sides of the street. An operator looking at the GIS screen and air photos can assign the values to the flags without a large amount of effort. Based on these values, different line styles or colors can be used to symbolize the presence of sidewalks in a screen display or on hardcopy maps.

Pre-conversion Editing
Be sure to track and review the number of staff hours and the problems encountered during the pre-conversion scrubbing effort. These steps will undoubtedly be performed again during the full conversion, and now is the time to assess the impact on the organization.
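The attribute range check described under Quality Control Needs can also be run as a simple tabular test before anything is drawn on screen. The sketch below is a minimal illustration; the field name, the acceptable range and the sample sewer records are assumptions, not values from this guide.

```python
def flag_suspect(features, field, low, high):
    """Return the ids of features whose value is blank, zero or out of range."""
    suspect = []
    for feature in features:
        value = feature.get(field)
        if value is None or value == "" or value == 0 or not (low <= value <= high):
            suspect.append(feature["id"])
    return suspect


# Hypothetical sanitary sewer records delivered by a conversion vendor.
sewer_lines = [
    {"id": 1, "diameter_in": 8},
    {"id": 2, "diameter_in": 0},     # zero
    {"id": 3, "diameter_in": None},  # blank
    {"id": 4, "diameter_in": 800},   # extreme value, probably a keying error
    {"id": 5, "diameter_in": 12},
]

suspects = flag_suspect(sewer_lines, "diameter_in", low=4, high=120)
print(suspects)
```

The resulting list of ids is what would be reported back to the conversion vendor for resolution.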
Data Volumes
Data volumes and disk space are important issues to evaluate in the pilot study. The pilot by design covers a small area of interest. Use the same ratios described under Refined GIS Cost Estimates to extrapolate data volumes for the entire GIS implementation effort. Data volume is not only a disk space issue. There are inherent problems associated with managing large datasets. Large files take more computer resources to manipulate, back up, restore, copy, convert, etc. A tiling scheme (i.e. breaking the data into smaller packets for storage and manipulation) should be investigated in the pilot study as a future solution for full implementation.
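One simple form of tiling scheme is a regular grid, in which each feature is filed under the grid cell that contains it. The sketch below assumes point features in a projected coordinate system measured in feet and a hypothetical 5,000-foot tile size; real GIS packages provide their own tiling or indexing mechanisms, which this does not attempt to reproduce.

```python
from collections import defaultdict


def tile_key(x, y, tile_size):
    """Column and row of the grid tile that contains the point (x, y)."""
    return (int(x // tile_size), int(y // tile_size))


def assign_tiles(points, tile_size):
    """Group point ids by the tile they fall in."""
    tiles = defaultdict(list)
    for point_id, x, y in points:
        tiles[tile_key(x, y, tile_size)].append(point_id)
    return dict(tiles)


# Hypothetical manhole locations.
manholes = [
    ("mh-1", 120.0, 80.0),
    ("mh-2", 5120.0, 80.0),
    ("mh-3", 4999.9, 4999.9),
    ("mh-4", 5000.0, 5000.0),
]

tiles = assign_tiles(manholes, tile_size=5000.0)
for key in sorted(tiles):
    print(key, tiles[key])
```

Counting features per tile in the pilot area also gives a quick picture of how data density varies, which feeds directly into the volume estimates.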

Assessing the Adequacy of the Data Conversion Specifications
Data conversion specifications give the conversion vendor and the client organization a set of guidelines on what layers, features and attributes should be captured, at what precision and level of accuracy, and in what format the data is to be delivered. Best intentions and reality need to meet in the pilot study to evaluate the expectations and the level of effort (costs) involved in converting the target dataset. Ask the conversion vendor for feedback on the clarity of the specifications. Do the specs make sense? Some vendors, holding to the adage that "the customer is always right," will not question your specifications and will do whatever you ask no matter how inefficient the process. Others will openly suggest alternative approaches and will seek clarifications. Note the kinds of questions they present and be open to changes early in the process.

Evaluation of Logical Data Model and Applications
Not only should the quality of the data conversion and the GIS software be reviewed in the pilot, but, just as important, the logical data model needs to be reviewed. The logical data model describes how map features are defined (points, lines, polygons, annotations) and the relationships between these map features and related database tables. Running applications against the data model will allow measurement of response time, which is a function of data organization. The bottom line is whether the data model makes sense for all the applications being addressed in the pilot and whether it will be useful in the full implementation. Ask the conversion and software vendors to explain the organizational structure of the GIS data model. Ask what the advantages, disadvantages and tradeoffs are for the model used in the pilot, and whether the same structure would work comparably in a full implementation. Look carefully for shortcuts or data model changes made to get a dataset to work in the pilot. A model may work very well for a demo on a small dataset, yet be unwieldy in a large implementation.
GIS Hardware and Software Performance
Test the GIS running under a variety of scenarios ranging from single to multiple users performing simple to complex tasks. Ask your software vendor to write a simple macro to simulate multiple users running a series of large database queries. Test the performance of query and display user applications while data administration functions are running. Were the users able to learn to use the system and perform useful work?

Refined GIS Cost Estimates
By requiring the conversion vendor to keep detailed logs of conversion times for each data layer and feature type by map sheet, the client organization can extrapolate from the pilot data conversion to a cost for full conversion. One approach that has worked well in the past is to use parcel density as an indicator of manmade features. For example, if you compute a series of ratios of the number of buildings, light poles, miles of pavement edge, manholes, hydrants, and other features against the number of parcels in the pilot area, you can estimate with reasonable confidence the number of manmade features in the remainder of the GIS implementation area. The Office of Real Property Services has a low cost ($50 / town) parcel centroid database in a GIS format that can be used as a guide for parcel density. Unfortunately, physical features like streams, ponds, contours, wooded areas, wetlands, etc., do not have a direct correlation to parcels. In fact, there seems to be an inverse relationship between parcel density and the number of physical features. The point to be learned is that the pilot study should provide an indication of costs for a full-featured, full-function GIS implementation effort.
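The parcel-density extrapolation described above amounts to computing a features-per-parcel ratio in the pilot area and multiplying it by the parcel count of the full implementation area. A minimal sketch follows; all counts are invented for illustration, and the function names are hypothetical.

```python
def feature_ratios(pilot_counts, pilot_parcels):
    """Features per parcel for each manmade feature type in the pilot area."""
    return {name: count / pilot_parcels for name, count in pilot_counts.items()}


def extrapolate(ratios, total_parcels):
    """Estimated feature counts for the whole implementation area."""
    return {name: round(ratio * total_parcels) for name, ratio in ratios.items()}


# Hypothetical pilot area: 500 parcels and the features captured within it.
pilot_counts = {"buildings": 450, "hydrants": 25, "manholes": 60}
ratios = feature_ratios(pilot_counts, pilot_parcels=500)

# Hypothetical full implementation area of 20,000 parcels.
estimate = extrapolate(ratios, total_parcels=20000)
print(estimate)
```

Multiplying each estimated count by the per-feature conversion time or cost logged by the vendor during the pilot gives a first-cut budget for the manmade layers. Physical features such as streams and wetlands need a different basis, as noted above.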

Analyzing User Feedback
Tally the number of positive responses to yes/no questions, compute an average score for user satisfaction, and compile the essay responses for content and tone. Review the compiled results with all team members and management. Interview team members about any questions that drew unclear or strong responses to gain more insight. From the response scorecards and comments, develop an overall score for user satisfaction and for completion of goals and objectives.
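
As a rough sketch of the tally described above, the following Python summarizes a small set of questionnaire responses. The response structure, the 1 to 5 satisfaction scale and the sample answers are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical questionnaire results; each dict is one user's response sheet.
responses = [
    {"yes_no": [True, True, False], "satisfaction": 4, "comment": "Queries were fast."},
    {"yes_no": [True, False, False], "satisfaction": 3, "comment": "Menus were confusing."},
    {"yes_no": [True, True, True], "satisfaction": 5, "comment": ""},
]

def summarize(responses):
    """Tally yes answers per question, average satisfaction, and collect comments."""
    n_questions = len(responses[0]["yes_no"])
    yes_counts = [sum(r["yes_no"][q] for r in responses) for q in range(n_questions)]
    avg_satisfaction = mean(r["satisfaction"] for r in responses)
    comments = [r["comment"] for r in responses if r["comment"]]
    return yes_counts, avg_satisfaction, comments

yes_counts, avg_satisfaction, comments = summarize(responses)
print(yes_counts, avg_satisfaction, comments)
```

The essay comments are only collected here; reading them for content and tone remains a manual step.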

BENCHMARK TESTS: COMPETITIVE EVALUATION

The purpose of a benchmark is to evaluate the performance and functionality of different data conversion methods, hardware and software configurations in a controlled environment. Each software package can be compared in the same hardware environment or one software package can be compared across different hardware platforms. By defining a uniform set of functions to be performed against a standard dataset, key advantages and disadvantages of the different configurations can be compared fairly and objectively.

Planning a Benchmark Test
As with any successful project, a detailed, well thought out plan needs to be devised. It should be noted that performing a benchmark takes a large amount of effort by both the local government agency and the vendors taking part. Few firms can afford to devote large amounts of staff time and computing resources competing in benchmark tests for free. Keep that in mind as you design the benchmark to focus the tests on key issues that can be readily compared. If the benchmark will be extensive, associated costs may be incurred.

Objectives for the Test
A benchmark provides an opportunity to evaluate the claims of advanced technology and high performance presented by the marketing/sales force of competing data conversion, hardware and GIS software vendors. The objectives of the benchmark should be defined clearly and communicated to all parties involved. Suggested objectives for each of the different types of benchmarks include testing:

Conversion Methods
- Cost effective procedures
- Sound methodology
- Quality control measures
- Compliance with conversion specifications

Hardware
- Computing performance
- Conformance to standards
- Network compatibility and interoperability
- Future growth plans and downward compatibility

Software
- Conformance to standards
- Computing speed / performance
- GIS functionality (standard and advanced)
- Can the software run on your existing hardware system?
- Ease of use - menu interface, on-line help, map generation, etc.
- Ease of customization for non-standard functions
- Licensing and maintenance costs

This list of objectives is not all-inclusive and should only be used as a guideline or a starting point for your organization to design a benchmark study.

Preparing Ground Rules
Based on the defined objectives, all parties involved should be aware of what will be tested, how they will be judged and what criteria will be used as a measure (e.g. low cost, high performance, good service, quality, accuracy, etc.).
- Tests to be performed should be as fair as possible
- The exact same information and datasets should be given to all vendors
- A reasonable time frame should be provided to perform the work
- No vendor should be given preferential treatment over any other, and clarifications of intent should be offered to all
- Tests should be quantitatively measurable
- Hardware tests should use comparably equipped or comparably priced machines
- Software tests should be performed on the same hardware and operating system

Create scoring sheets for each aspect of the test. For subjective tests, like ease of use, have each user rate their satisfaction/dissatisfaction with the results of each phase using a numeric rank-order scheme. This won't eliminate bias but will allow impressions and opinions to be compared. For objective tests, like machine performance, record the clock speed, disk space requirements, number of button clicks, error messages, response time, etc. for each test conducted.

Preparing the Test Specifications (Preliminary Request for Proposals or RFP)
The test specifications need to outline the type of test to be conducted (conversion, hardware or software); objectives of the test; detailed description of the test; measures for compliance; and a time frame for completion.
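
One possible shape for the scoring sheets described above is sketched below in Python. The field names, the 1 to 5 rating scale and the vendor results are assumptions for illustration only; the point is that subjective ratings and objective measurements are recorded per vendor and per test so they can be compared afterwards.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class ScoreSheet:
    """One sheet per vendor per test; field names are illustrative."""
    vendor: str
    test: str
    ratings: list = field(default_factory=list)  # subjective, 1 (poor) to 5 (good)
    response_seconds: float = 0.0                # objective, measured
    error_messages: int = 0                      # objective, counted

    def mean_rating(self):
        return mean(self.ratings)

sheets = [
    ScoreSheet("Vendor A", "map generation", [4, 3, 5], response_seconds=12.5, error_messages=1),
    ScoreSheet("Vendor B", "map generation", [3, 3, 4], response_seconds=9.0, error_messages=0),
]

# Keep subjective and objective results side by side rather than merging them.
for s in sheets:
    print(s.vendor, s.test, round(s.mean_rating(), 2), s.response_seconds, s.error_messages)

fastest = min(sheets, key=lambda s: s.response_seconds).vendor
best_rated = max(sheets, key=lambda s: s.mean_rating()).vendor
print(fastest, best_rated)
```

In this invented example the fastest vendor is not the best rated one, which is the kind of tradeoff the evaluation team has to weigh.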


Selecting the Participants and Location
In order to conduct a benchmark, you need knowledgeable participants (both internal and external). The internal participants should be knowledgeable regarding the topic to be tested (data conversion, hardware or software). Selecting external participants is more involved. Situations range from not knowing any vendors to invite to having to limit the number of vendors. The smaller the number of participants, the easier the final selection process will be for the local government agency. The Request for Qualifications (RFQ) process can be used to filter or pre-qualify potential participants. GIS is a specialized field and not every business involved with computers is qualified. Several factors should be considered when selecting vendors for a benchmark test:
- Are they knowledgeable about local government agency operations?
- Are they a well known company?
- Are they technically qualified?
- Are they experienced and do they have a successful track record?
- Are they financially sound, insured or bonded?
- Are they going to be around 5 years down the road?
- Are they local or do they have a local representative?
- Would their previous clients hire them again?

If the RFQ and/or the RFP are written clearly and succinctly, the process will filter the participants and only those companies that specialize in the subject in question will respond. The benchmark can occur either at the client's site or the vendor's offices. Some tests, like data conversion, are best conducted at the vendor site to minimize relocating staff and equipment for a test. Hardware and software benchmarks are commonly conducted at both the vendor and client site. The initial data loading, customization and testing is performed at the vendor site. Once the operations are stable, the client is invited to view the results at the vendor site, or the system is transported to the client site.

Preparing the Data
For a data conversion benchmark, provide each vendor with a marked up (scrubbed) set of maps, documents and coding sheets as described in the pilot study section above. If possible, provide the data conversion vendor with an example dataset from the pilot study which shows the appropriate data layering, tolerances and attributes to be captured. If no dataset is available, provide clear specifications for how the data should appear when complete. Specify what data format (*.dxf, *.e00, *.mif, tar, zip, etc.) and what type and size of media (1/4-inch, 8mm or 4mm tapes) you want the data delivered in.

For a hardware or software benchmark, provide a sample dataset which contains all possible layers for inclusion in the GIS. The data could be purchased, converted during the pilot study or could be the results from a data conversion benchmark noted above. Provide sufficient documentation with the data to describe the use of the data, the organizational structure and contents.

Scheduling The Benchmark Test
Once the benchmark has been defined and agreed to by the participants, set a time for the testing to occur. Schedule a start date and a duration. Unless you specifically want to use company responsiveness as part of the test (i.e. how fast can they respond to a problem), don't require an immediate start date or extremely short time frame. There is no need to cause undue panic and stress; you want a good test.

Transmitting Application Specifications And Data To Participants
Before transmitting maps, documents or data to any vendor, make an inventory and backup copies of all items. Either specify to the vendors that the data will be provided in a single data format on a specific media, or make arrangements to provide the data in a format they can read. Be sure to test the readability of the tape or disk on a target machine in your office before sending the data out. Once the data has been verified as complete and readable, make two copies of the tapes or diskettes, one to send and one to keep as a recoverable backup for documentation of the delivery. Provide detailed instructions as to the contents of the tapes or disks and how to extract the data. List phone numbers of responsible persons should problems arise with delivery or data extraction. Ask the vendor to perform an inventory at the receiving end to acknowledge receipt of the data or documents.

On-Site Arrangements
If the tests are to be conducted at your site, make sure you have the authorization and backing of management and all personnel to be involved. Provide plenty of advance notice and time to set up. If you are conducting hardware tests you have to decide if more than one vendor's machines will be present at the same time for comparative testing. With both machines set up in the same room, you can conduct the exact same tests in "real time" and visually compare the results, but this will require more setup space and logistic leeway in the schedule to accommodate multiple vendors. Make sure you have a suitable environment for equipment with adequate power, air conditioning and security. Also make sure you have all required utility software in place, including tools to read and write compressed files from tape and virus detection software. If you are performing software tests, make sure you have two or more machines with the exact same hardware and operating system configurations. If you can't have multiple machines, be sure to back up the current operating system files and restore them before testing each software package to ensure a fair test of disk space requirements, resource usage and functionality. Always use the same datasets for each test.

Identifying Deficiencies In Specifications
Even when the tests are well thought out and carefully followed, you will probably wish you had performed additional tests during the benchmark. If shortcomings are discovered early on and they do not involve major changes in direction, additional tests could be incorporated. Be sure to notify the local management, staff and vendor participants of the change in objectives.
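
The inventory and verification steps described under Transmitting Application Specifications And Data To Participants can be partly automated when the delivery is a set of files. The sketch below is an assumption about how that might be done today, not a procedure from the original guide: it records each file's size and SHA-256 digest before sending, so the same listing can be rebuilt and compared at the receiving end. The function names and the sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

def build_manifest(folder):
    """Map each file's relative path to (size in bytes, SHA-256 digest)."""
    root = Path(folder)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()  # fine for a sketch; stream large files in practice
            manifest[path.relative_to(root).as_posix()] = (len(data), hashlib.sha256(data).hexdigest())
    return manifest

def compare_manifests(sent, received):
    """Return (missing, changed) file names as seen at the receiving end."""
    missing = sorted(set(sent) - set(received))
    changed = sorted(name for name in sent if name in received and sent[name] != received[name])
    return missing, changed

# Small demonstration using a temporary folder and an invented file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "parcels.dxf").write_text("0\nSECTION\n")
    sent = build_manifest(tmp)
    (Path(tmp) / "parcels.dxf").write_text("corrupted")  # simulate damage in transit
    received = build_manifest(tmp)

print(compare_manifests(sent, received))
```

Keeping the manifest with the retained backup copy also documents exactly what was delivered.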


Defining benchmark criteria


Data Conversion Issues
A standard set of tests needs to be performed to evaluate the results of a data conversion benchmark. Overlaying checkplots with the source documents on a light table is a straightforward but time consuming way to compare the conversion results. Suggestions made in the Pilot Study section of this document outline methods for using GIS query and display functions to determine if all the data is present and layered correctly and if attribute values are within range. Displaying map features by attributes will highlight errors or items out of range in different colors or symbol patterns.

GIS Software Performance
Software tests can be classified into 2 groups - capabilities and performance. Capabilities tests determine whether the software can perform a specific task (e.g. convert DXF files, register image data, access external databases, read AutoCAD drawings, etc.). Performance tests deal with how well or how fast the software performs the selected task. How fast can be measured with a stopwatch; how well is open to interpretation. The operating system on the machines in question will be a big factor in how GIS software will perform. GIS software written to run on a 32 bit operating system will not perform as well in a 16 bit environment without workarounds. Likewise, a 16 bit application will run faster on a 32 bit machine, but will not run as well as 32 bit software on a 32 bit operating system like UNIX, Windows 95 or Windows NT.

Hardware Performance
The goal is to find the fastest, cheapest hardware that fits your budget. Take advantage of computer magazine reviews of hardware. They conduct standard benchmark tests involving word processing, spreadsheets and graphics packages. The test results won't be GIS specific, but will show the overall performance of a given computer. Oddly enough, two computers with seemingly identical hardware specifications (clock speed, memory, and disk space) can perform very differently based on internal wiring, graphics acceleration and chip configurations.

Evaluating Benchmark Results
If the questions were formulated clearly, and the results were recorded honestly, evaluating the results of the benchmark should be a process of simple addition. Essay responses and comments will have to be followed up with further tests to clarify any problems or differences encountered.
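
Where the software under test can be driven from a script, the stopwatch measurement mentioned under GIS Software Performance can be approximated in code. The following Python sketch times a task several times and reports the median; `sample_query` is a stand-in workload invented for the example and would be replaced by a call into the GIS being benchmarked.

```python
import time
from statistics import median

def time_task(task, repeats=5):
    """Run task() several times and return the median elapsed seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return median(timings)

def sample_query():
    # Stand-in workload; replace with a call into the GIS under test.
    return sum(i * i for i in range(100_000))

elapsed = time_task(sample_query)
print(f"median elapsed: {elapsed:.4f} s")
```

Using the median of repeated runs reduces the effect of one unusually slow or fast run, which helps keep comparisons between configurations fair.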

[Table - Departments and Functions which will Utilize GIS. Cell values not recoverable.
Departments and functions: Sanitation; Bldg. & Plumbing Inspection; Planning Board; Conservation Advisory Council; Assessor; Youth and Recreation Svces; Senior Services; Traffic Safety; Zoning Board; Police; Fire/Disaster Coordination; Community Development; Facilities Management; Highway; Clerk; Engineering; Sewer Maintenance.
General Description of GIS Map Products and Applications: Sanitary Sewers; Pump Stations & Force Mains; Water Lines; Storm Sewers; Lighting District Boundaries; Easements; Street Map; Streams & Ditches; Soils & Rock; Wetlands; Woodlands; Archaeological Sites; Hazardous Materials Sites; Critical Environmental Zones; Drainage Basins; Tributary Areas; Sewer Flow Analysis; Sewer Capacity Analysis; Scheduled Repair Work; Emergency Repair Work; Dispatch Route Selection; Building Types; Crimes; Fires; Subdivisions.
Type of Use: Query/Display; Spatial Model; Map Analysis; CAD/Display.]

GIS DEVELOPMENT GUIDE Volume III

Table of Contents

ACQUISITION OF GIS HARDWARE & SOFTWARE
  Introduction
  GIS Hardware and Software Acquisition
  Steps in the GIS Acquisition Process
  Evaluation of Proposals
  GIS Delivery and Installation Plan
  Sample Hardware Specifications
  Network and Communications Specifications
  Software Specifications
  GIS Database Structure
  Summary

GIS SYSTEM INTEGRATION
  Introduction
  GIS System Components
  System Testing
  User Training
  Figures
    1 - Database Integration
    2 - Library Structure to Support Editing
    3 - System Integration

GIS APPLICATION DEVELOPMENT
  Introduction
  Why Applications are Needed
  Categories of Applications
  Database Applications
  Figure
    1 - Life Cycle of a GIS Database

GIS USE & MAINTENANCE
  Introduction
  User Support and Service
  Data Maintenance Procedures
  Examples
  Figure
    1 - Overview of GIS Maintenance

GIS DEVELOPMENT GUIDE: ACQUISITION OF HARDWARE AND SOFTWARE

INTRODUCTION

This guide describes the first of the four steps of the GIS development process (Figure 1) that deal with the actual assembly of the GIS and its subsequent operation.

[Figure 1 - GIS Development Process. Flowchart steps: Needs Assessment; Conceptual Design; Available Data Survey; H/W & S/W Survey; Database Planning and Design; Database Construction; Pilot/Benchmark; Acquisition of GIS Hardware and Software; GIS System Integration; Application Development; GIS Use and Database Maintenance.]

All of the necessary planning, design and testing should have been completed during the execution of the previous seven steps of the GIS development process. The remaining steps and their main purposes are as follows:

GIS Hardware and Software Acquisition - includes the final selection of the hardware and software (by competitive bid in response to a Request for Proposals - RFP, as necessary); the delivery and installation of the hardware and software; and all necessary renovation of space, wiring, and environmental remodeling.

GIS System Integration - bringing the final database and the hardware and software together and testing their combined operation.

GIS Application Development - preparing applications identified in the Needs Assessment which require additional programming using the GIS macro language or other supporting programming languages.

GIS Use and Maintenance - starting use of the GIS and institution of database, hardware and software maintenance programs. Further application development and user training are also continuing needs.

GIS HARDWARE AND SOFTWARE ACQUISITION

This step is the actual purchase of the GIS - hardware and software. The GIS to be acquired is usually subject to competitive bid by the interested vendors. The single most critical part of this process is the preparation of an adequate (and detailed) Request for Proposals (RFP). Acquiring the components for your GIS is an important step. Use all of the information you have gathered up to this point to produce a document telling prospective bidders what you need. The document should clearly communicate your needs and how bidders should respond to the RFP. During this phase remain objective. Keep as much politicking as possible out of the selection process. You should be looking for the best value for your money, not the lowest cost.

STEPS IN THE GIS ACQUISITION PROCESS

Evaluation Team
The evaluation team should be made up of interested staff from departments involved in implementing GIS within the local government. These individuals need to be objective and not have predefined ideas of what system they want. They need to be action oriented and willing to put in the time to do the job right. A successful RFP process involves a great deal of hard work and coordination, and you will need people on the committee who can help accomplish this. Once a draft RFP has been developed, have an objective third party look at it. You want it as complete and readable as possible. This can be another local government (maybe one of the ones who supplied you a copy of theirs) or a consultant helping you with the RFP process (make sure the consultant is not planning on bidding on the project).

Preparation of Request for Proposal (RFP)
The RFP document is used to communicate your needs to potential bidders. It will also tell bidders how you want them to respond to the RFP. Be as specific as possible in defining what you need for your GIS. Provide detailed descriptions of the functionality, services and support you are looking for. It is recommended that you do not use specific brand names of software and hardware products in your RFP specifications, since this will limit the number of potential bidders you can choose from. There will be situations where specific products are needed. An example is when your organization has a policy in place for using a type of operating system or has already standardized and developed data sets for use in a particular software package. Otherwise, focus on what you want the system to do. You will not get what you need unless you specify it clearly in the RFP.

In your RFP, tell the bidders how you want them to respond. Provide examples of what you want: define how pricing should be structured, use standardized forms if appropriate, and clearly state the criteria for evaluating the responses. You will receive responses that are more consistent and easier to evaluate if you define the response guidelines in the RFP. To get started, contact other local governments who have recently developed similar RFPs. Use these as a guide. It would be a good idea to contact the person responsible for evaluating the responses. Ask them what worked and what didn't work with the RFP. Adjust your RFP accordingly. Also adjust the scope of your RFP to fit your needs. If you are a small village, don't use an RFP developed by a larger city (or vice versa); you will not get what you need and the potential bidders will be confused or misdirected.

Distribution of RFP
You will want your RFP to go to qualified bidders. The best way to find them is to go to trade shows or GIS user group meetings and ask around. Again, try to stay objective. Don't get misled by flashy demos or excessive hype. Talk to other local governments and get recommendations of companies they think are qualified to respond to your RFP. Another method might be to post a notice in GIS trade journals (both regional and national). Be prepared for a large number of companies inquiring about your project. This method is better used for large, expensive projects.

Bidders Meeting
A bidders meeting should be scheduled within a week or two of the RFP being sent out. Make sure the time and location are in the RFP. This meeting is used to get feedback from the bidders and to clarify anything not clearly stated in the RFP. It is always an interesting experience to have a number of competitors gathered together in one room. Bidders will be reluctant to ask any questions that might give away their bidding strategy to their competitors, so do not be surprised if few questions are raised at the meeting. To get things going, have a short prepared statement or presentation that outlines the history of the project and the requirements of the RFP. It is important to ask the bidders to submit written questions to you within a specified period of time. It is also recommended that all written questions and your responses be compiled and sent back to all bidders. This will provide consistency and fairness in the process. The purpose of this meeting is to communicate to all bidders what you need and how you want them to respond.

Answering questions
In addition to the written responses from the bidders meeting, you will need to provide some mechanism for answering ad-hoc questions from bidders. The best way to do this is to require that all questions be faxed or e-mailed to a specific person and to provide a response within 24 hours. It would be impractical for your organization to provide these ad-hoc questions and answers to all bidders. It would be a good policy to take questions up to the submission date for proposals. After that date no correspondence between a bidder and people involved with the selection process should be allowed.

Deadline for submission
Establish a deadline for submission. All responses must be in by the specified time and at the specified location in order to be considered. Set your deadline to be a few hours before the close of business. Inevitably a bidder will get stuck in traffic or a courier will be delayed. This will give you a little cushion and allow you time to check in responses while still allowing you to go home at a reasonable time.

EVALUATION OF PROPOSALS

Evaluating proposals should be done by the RFP committee with all the members using the same criteria as listed in the RFP. This process should be documented in case a protest arises. If you have been specific in defining your GIS needs and in defining how bidders needed to respond, the evaluation process should be straightforward.

Sample Questions - Has the bidder:
- Proven they can meet all of the functionality needed?
- Provided pricing that can be compared with other responses?
- Described the types of services and support in an understandable way?
- Provided references and related experience for you to check on?

Criteria for Evaluation
It is important that this process be documented in case a protest is submitted or to explain why a proposal was not accepted. Each of the criteria needs to be measurable or quantifiable.

Functional capabilities
In the Needs Assessment phase GIS functionality was identified and documented. This documentation of functionality should be defined in the RFP and used for this evaluation. Develop a checklist of the various functions and have each committee member fill out the checklist for each proposal.
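
A minimal sketch of how the completed functional checklists might be tallied is shown below. The function list, proposal names and marks are invented for illustration; in practice the functions would come from the Needs Assessment documentation cited in the RFP.

```python
# Hypothetical checklist: each committee member marks whether a proposal
# demonstrates each required function (True) or not (False).
required_functions = ["DXF import", "address matching", "buffer analysis"]

checklists = {
    "Proposal 1": [
        {"DXF import": True, "address matching": True, "buffer analysis": False},
        {"DXF import": True, "address matching": True, "buffer analysis": True},
    ],
    "Proposal 2": [
        {"DXF import": True, "address matching": False, "buffer analysis": False},
        {"DXF import": True, "address matching": False, "buffer analysis": True},
    ],
}

def coverage(member_sheets, functions):
    """Fraction of (member, function) cells marked as met."""
    met = sum(sheet[f] for sheet in member_sheets for f in functions)
    return met / (len(member_sheets) * len(functions))

def disputed(member_sheets, functions):
    """Functions on which committee members disagree."""
    return [f for f in functions if len({sheet[f] for sheet in member_sheets}) > 1]

for name, sheets in checklists.items():
    print(name, round(coverage(sheets, required_functions), 2), disputed(sheets, required_functions))
```

Listing the disputed functions gives the committee a short agenda of items to resolve, and the saved checklists serve as documentation if a protest is submitted.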

Vendor Support
Without proper support any system is doomed to failure. Part of the evaluation is to understand the type of support being offered. What kind of response time is being offered and what are the standards? Will the vendor provide answers to a problem within 24 hours of a call? Will they provide on-site or factory service for hardware problems? Make sure you are comfortable with the level of service being offered.

Cost / Maintenance Fee
There are a lot of ways to state the price of a proposal. It is recommended that you be as specific as possible in the RFP and bidders meeting about how the price should be structured. The more pricing can be itemized in the proposal, the easier it will be to compare the responses to each other. A suggestion is to develop a pricing form for each bidder to fill out and include with their proposal. As a minimum, have separate pricing for software, hardware, services and support. More detail for each of these sections would be nice; just don't get too carried away.

Interviews / Benchmark Test (see Benchmark Test Guide)
After the RFP committee has evaluated the written proposals, a short list of bidders should be agreed upon. Any proposals that are not in compliance with the RFP or do not rank high in the evaluation should be eliminated from consideration; the remaining bidders comprise the short list. Some marginally qualified bidders may need to be eliminated as well to keep the short list of bidders a manageable size. These short list bidders will be invited to an interview and/or a benchmark. During this process you will be evaluating the bidder on:
- Ability to interact with your organization
- Technical ability
- Ability to communicate effectively
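
The itemized pricing form suggested under Cost / Maintenance Fee lends itself to a simple side-by-side comparison. The sketch below uses the four minimum categories named there; the bidder names and dollar amounts are invented for illustration.

```python
# Hypothetical pricing forms; amounts are invented for illustration.
CATEGORIES = ("software", "hardware", "services", "support")

pricing_forms = {
    "Bidder A": {"software": 40000, "hardware": 55000, "services": 20000, "support": 9000},
    "Bidder B": {"software": 35000, "hardware": 60000, "services": 25000, "support": 12000},
}

def totals(forms):
    """Total price per bidder across the four required categories."""
    return {bidder: sum(form[c] for c in CATEGORIES) for bidder, form in forms.items()}

def lowest_by_category(forms):
    """Bidder with the lowest price in each category."""
    return {c: min(forms, key=lambda b: forms[b][c]) for c in CATEGORIES}

print(totals(pricing_forms))
print(lowest_by_category(pricing_forms))
```

A comparison like this only shows where the prices differ. It does not by itself identify the best value, which also depends on functionality and support.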

Selecting a Proposal
Once the Interview / Benchmark is completed, the RFP committee members should compile all of their evaluations independently and then meet as a group. This meeting should review all of the proposals and begin to focus on which proposal to select. At this meeting questions may arise that need to be answered in more detail. Take the time to get these answers from the bidder before a selection is made (generally a phone call will work, but sometimes a follow up interview is needed if practical). Once all of the committee's questions are answered, it should move quickly to making a selection and notifying the bidders. At this point a contract needs to be put in place that defines the scope of work outlined in the RFP. This contract needs to be executed before any further phase of GIS implementation is started.

GIS DELIVERY AND INSTALLATION PLAN

Once you have selected a vendor(s) for your system you will need to coordinate the delivery and setup of all of the components. There are many resources to call on to do this, the most obvious being the vendor. They should have demonstrated that they have some level of expertise with GIS and can help you get up and running quickly. It is a good investment to buy their services to install and set up the system for you. These services can be contracted for on a time-and-material basis or under a scope-of-service contract.

The most effective means of describing how to prepare the RFP is to do so by example. The remainder of this guideline consists of selected parts from an actual RFP, presented here to illustrate the scope, content, and level of detail needed. A properly prepared RFP increases the chances that the vendor responses will be most appropriate to the needs of the local government.

SAMPLE HARDWARE SPECIFICATIONS

Specifications for a system configuration to support Geographical Information System (GIS) development and operational applications follow. The system configuration consists of various devices that will be networked together to support data capture, storage, processing and display in both digital and hard copy forms, including:
- mapping/analysis workstations (2)
- color laser printer (1)
- black and white laser printer (1)
- cartridge tape drive (1)
- color raster plotter (1)

The proposal shall include technical and functional capabilities of the devices offered to meet these specifications. The following information should be provided for each device:
- manufacturer
- model number
- capabilities/configuration of each device in comparison to the device specifications
- documentation provided with the device (i.e., manuals)
- warranty included in the purchase price
- the nature and duration of user support services included in the purchase price, such as maintenance agreements, user support and service, and the average time period between requests for user support and on-site technical service, if available.

GIS Workstations
The Mapping/Analysis workstations will support a wide range of GIS activities, including database development, database quality control, user application development, database maintenance and all GIS applications supported by fully functional GIS software such as cartographic production, geographic database queries, and advanced geographic analysis using both spatial and attribute information. One of the GIS workstations must support high capacity data storage and multi-user GIS processing, and should perform all GIS operations and applications within acceptable user response times.

General Specifications for Workstations:
- Mass storage may be configured within the workstations' cabinetry and/or as external drives
- The workstations should be configured with a single high resolution (1280 x 1024 or greater) color monitor with a diagonal screen dimension of at least 19"
- All devices shall include a keyboard and a pointing device such as a mouse
- Each GIS workstation should be network-ready, and should be capable of connecting to a local area Ethernet network and supporting a minimum transmission speed of 10 megabits per second (Mbps)
- Multi-user, multi-tasking operating system supporting logical security measures such as user name/password validation and user access privileges
- The devices should support virtual memory operations, either through dedicated hardware controller(s) or through software (operating system) functions
- Descriptions of options for upgrading speed and performance through the addition or replacement of boards or other components in the existing cabinetry of the workstations should be provided

Specific Details of Workstations:
Both workstations should support the following hardware specifications:
- The workstation should include a minimum of a 32-bit processor supporting both 64-bit address and data buses
- The CPUs should operate at a minimum of 75 MHz clock speed and/or have enough processing speed and capacity to support other intelligent GIS client devices. These will consist of X-Stations or PCs. The workstations may have multiple CPUs on board
- The devices should include at least 128 MB (megabytes) of main memory and shall support 32 MB memory modules and be expandable to at least 256 MB
- The devices should be configured with mass storage disk drive(s) for direct access of data and software functions. They will have a minimum of 3 GB of mass storage each
- The workstations will be configured with a quad-speed CD-ROM drive that will facilitate the installation of upgrades to the operating system, installation and upgrade of application software, and user access and review of systems and application documentation
- The devices should also be equipped with one 1.44 or 2.88 MB floppy drive each
- The server must support multi-user/multi-tasking operations and must concurrently support both server and host workstation functions

Vendors shall describe options for upgrading the speed and performance of the server and mass storage capacity through the addition or replacement of boards or other components in the existing cabinetry. Also, Vendors shall describe options for increased performance and mass storage that involve connection of devices external to the existing cabinetry.

Small-Format Color Printer

One (1) color printer will be used for the production of color hard copy graphic plots and nongraphic report generation. The color laser printer should meet the specifications described below, or equivalent:

- Minimum of 300 dots per inch (dpi) resolution
- Minimum 100 sheet paper tray
- Minimum of 4 MB memory onboard with capacity for memory upgrades
- Support for letter, legal size, and 11"x17" paper sizes
- Built-in postscript compatibility
- Serial and parallel interface

Sample hard copy outputs from the proposed device(s) shall be included with the proposal.

Cartridge Tape Drive

The system should include one (1) 4mm DAT tape subsystem for the Planning and Zoning Department. The tape should have a capacity of not less than 5 GB. The tape subsystem will provide a mechanism for performing system and data back-ups.

Large-Format Color Raster Plotter

A color raster plotter shall be included for the production of high quality, large format cartographic products. This device must provide a high-volume color plotting capacity. The plotter shall support 36" x 42" plots and produce color plots at a minimum resolution of 300 dots per inch (dpi). The plotter shall be compatible with the proposed LAN hardware and communications protocols and must be accessible by all workstations on the LAN. A sample hard copy output from the proposed plotter(s) shall be included in the proposal. Additionally, the plotter shall meet the specifications described below:

- Capable of supporting true color plotting
- Minimum of 8 MB memory onboard with capacity for memory upgrades
- Support for all paper sizes, A through E size
- Built-in postscript compatibility
- Serial and parallel interface

Provide four (4) replacement paper rolls with the printer. The paper should be a high quality glossy bond.

NETWORK AND COMMUNICATIONS SPECIFICATIONS

There is a requirement to connect new hardware in two departments. Existing software consists of Intergraph's I-Dispatcher, an emergency response dispatch system. Requirements for each level of communications are outlined in the section below. Vendors shall state the level of compliance and provide a description and cost quotation for all hardware and software components needed to meet the requirements at each level of data communications. Vendors should include in the cost proposal the cost of any specialized hardware devices that will be required to implement the proposed communication network.

Network Processing Requirements

Network processing requirements are as follows:

- Storage of data which is accessible by users on the network by specifying particular files, collections of features or attributes, and geographic areas
- Access security to allow assignment of different levels of access rights to portions of the GIS database by user name or physical device
- Ability to support query workstations on the network, directly connected to the server, or connected through remote communication lines so that network users can have access to these devices and vice versa
- Ability to allow database queries directly from workstations on the network without the need to download data to workstations
- Ability to allow network-wide access to plotters and printers, all with print/plot queues for generating hard copies

Network Management and Monitoring Capabilities

The proposed physical network should also be able to perform the following network management functions:

- Access to data on remote nodes by reference to the node, disk, directory, and file
- Access to programs on remote nodes by similar reference
- Assignment of logical names or aliases for programs or data locations on remote nodes
- Control of peripheral devices from any node on the network
- Passing of mail messages across nodes
- Program-to-program communications across nodes
- Monitoring of traffic and errors on the network

The proposal shall include all cabling and devices required to implement all data communication connections, utilizing existing facilities.

Network Speed and Capacity

The proposed system must operate at a minimum raw data speed of 10 megabits per second. The Proposer shall provide information about the upper limit in numbers of mapping/analysis/query workstations that can be supported without major degradation in response time or error rates on the proposed network.

Transactions and Data Exchange with Existing Systems

Initially, the GIS network will not support on-line links with the existing IBM mainframe. Access to data residing on the mainframe will be accomplished by downloading data onto 9-track tapes and then re-writing this data onto current industry standard media such as 4mm data tapes or CDs.

SOFTWARE SPECIFICATIONS

Software Component Overview

The GIS software components shall fully support and exploit the capabilities of the proposed hardware platform and shall provide full functionality for entry, editing, maintenance, analysis, display, and hard copy output of both graphics and tabular data on a continuous and interactive basis. For purposes of this procurement, software component capabilities have been grouped into the following functional categories:

- Database structure
- User interface
- Data entry
- Data editing/maintenance
- Data query and analysis
- Data display/output
- Application development
- Operating system requirements

Data Editing And Maintenance

The proposer shall describe the tools and capabilities of the proposed system to modify and manipulate spatial and attribute data in the GIS for the following categories:

- Interactive Graphic Editing
- Attribute Editing
- File Copying
- Deletion of Features
- Edit Controls
- Rubber Sheeting
- Coordinate Registration and Transformations
- Quality Control/Error Detection
- Merging, Extraction, Edge Matching of Data
- Data Transactions, including the capabilities of the proposed system to translate data into and out of the following formats:
  - GFIS to Proposed System Format (specify how attribute data is addressed)
  - AutoCad DXF (specify how attribute data is addressed)
  - AutoCad DWG (specify how attribute data is addressed)
  - Intergraph IGDS (specify how attribute data is addressed)
  - USGS DLG and DEM
  - TIGER Line Files
  - ArcInfo Export Files
  - Exchange data with KVS Computer Assisted Mass Appraisal (CAMA) System

Data Query And Analysis

The proposed software shall support the following data query and analysis capabilities:

- Graphic Data Query
- Area/Perimeter/Distance Calculation
- Attribute Data Query
- Spatial Aggregation
- Buffer Analysis
- Address Matching
- Polygon Overlay Analysis
- Linear Network Analysis
- Area Districting and Zoning

Data Display/Output

The proposed software shall support the following data display and output capabilities:

- Graphic Display
- Tabular Display
- Raster Image Display/Production
- Vector Map Overlay
- Hard Copy Map Production
- Hard Copy Report Production
- Map Plot/Display Relationship with Scale
- Graph/Chart Production
- Interactive Map Composition

Application Development The Proposer shall propose one or more software components that, in a well integrated manner, provide the following capabilities and features: Menu Design and Custom Application Development Programming Features Supporting High-Level (4 GL) Programming Subroutine Libraries

Basic Operating System Requirements

The operating system component of the software shall be the primary operating system of the proposed hardware platform and shall provide all of the traditional features of current operating systems as described below:

- Multi-user Support
- Multi-tasking, Multi-threading Support
- Security Management
- File Management
- Memory Management
- Database Backups
- Error Monitoring/Disaster Recovery
- System Diagnostics
- Anti-viral Protection
- Electronic Mail (E-Mail)

Network Management Functions

The proposed system shall provide capabilities for monitoring and managing all data and devices on the GIS network as one unified system and support the following capabilities:

- Multi-user Database Access and Maintenance
- Monitoring of Network Activity
- Network Problem Diagnostics
- Print and Plot Management

GIS DATABASE STRUCTURE

Database Model

A GIS database model defines the nature and usage of spatial (geographic) data within a database. The proposed software shall support a spatial data model that is capable of creating, managing, and manipulating data sets, defined on the basis of spatial coordinates and associated attribute data sets.

- Feature Types: The data model shall support multiple feature types including point, node, line, polygon, and text features.
- Data Storage: Features shall be stored as double precision x and y coordinates.
- Data Types: The data model shall support multiple graphic and nongraphic data types.
- Database Organization: Vendors shall describe strategies for organizing data into logical groups on the basis of data themes, and shall describe the capabilities of the data model for supporting simple and complex feature types.

Topological Data Structures

The geographic data model shall support the creation and maintenance of topological data. Topology shall be created through execution of a software function to structure graphic data sets. Vendors shall describe the ability of the proposed data model to support logical polygons, networks, and user-defined topological structures.

Design

Software capabilities that support large-scale engineering and design activities should be outlined, as well as specific engineering functions and appropriate modules.

Raster Image Data

The Proposer shall describe support for storage of raster map images (e.g., scanned bluelines, orthophotos) and for raster scanned documents.

Continuous Geographic Database

The geographic data model shall support the creation and storage of a continuous geographic database.

Relational Database Management System

The Proposer shall recommend a relational database management system (RDBMS) that will be able to maintain a minimum of 30,000 records of parcel ownership information in a single table and shall provide functionality for updating database content, queries, and production of reports. The recommended RDBMS must be either a part of the GIS software or have a direct access capability.


SUMMARY

The RFP sections presented above give a good example of the scope of topics and level of detail needed. This particular RFP did not present a conceptual data model for consideration by the vendors, but rather specified general characteristics for the GIS data model required. An actual conceptual data model, rather than its general characteristics, could be more useful to vendors, and thus more productive for the user's organization.

GIS DEVELOPMENT GUIDE: GIS SYSTEM INTEGRATION

INTRODUCTION

At this point in the GIS development process the GIS hardware and software have been acquired and data conversion is complete (or a substantial portion has been finished). Different components of the hardware and software may have been purchased separately. It is now necessary to put all the pieces together, test them to make sure they work as expected, and to initiate all procedures necessary to use the GIS.

GIS SYSTEM COMPONENTS

GIS Software

Vendors will usually install and test their software. Acceptance criteria (often the performance measures used during the pilot study or benchmark test) will be needed, and the vendors must meet these criteria before you relieve them of their obligation to you. Check the functionality of the program(s) to ensure that you received what you expected. The vendor should fix any problems that arise, in either software functionality or performance, before you indicate acceptance of the software. Check not only that the main GIS software works, but also that it works in relation to the other software programs that are part of your "total system," which also includes all legacy databases, software, and hardware. In addition to acceptable performance for each individual piece of software, make sure all software works together. Once the total system is your responsibility and problems arise, it can be very difficult to determine the part of the system causing the trouble. Although not nearly as common as in the past, the first response of a vendor can still be "blame the other guy!" Make the vendors responsible for providing you with one integrated system. Remember - they are the experts. Do not allow anything to be left up to you to check or test. If you are uncomfortable about something or do not understand how something works, talk to the vendor representative and get an explanation. Additionally, technical support is extremely valuable. All contracts should include on-site technical support and then on-going phone support after the installation is complete.

GIS Hardware

Implementing your hardware system is much the same as implementing your software, and the two must occur simultaneously. Contract with the vendor to install and test the hardware components. As with the software, choose acceptance criteria for the hardware and operating system. Check functionality and performance of the hardware and have the vendor resolve any problems. Make sure the hardware is able to support the software, database, and network as required. Technical support, both on-site and by telephone, should have been included in the contract with the hardware vendor.


Database

Integrating and testing hardware and software components are fairly well-defined processes and vendors have good experience with these tasks. However, dealing with larger and more complex databases has not been nearly as common in the GIS area. Therefore, adequate procedures and vendor experience may be lacking. There are two processes which remain basically user responsibility:

- building a master database or library (database integration)
- integrating the database with the GIS hardware and software
Figure 1 - Database Integration [flowchart; box labels: Verify delivery of data; Load to staging area; Run QC routines or visually verify; Determine errors; Acceptable level; Not acceptable; Contact vendor; Check out data to editing workspace; Use edit tools to update data; Notify checker when done; Use checker tools to verify; Errors?; Only a few?; Fix by checker; Return to edit; Accept & commit to library; Check back to clean-up library; Check in master library; Accept & run library loader program]


Figure 1 illustrates the steps of building the master database from the converted data files (the product of the digitizing or scanning process). The overall process deals with quality control checking, other editing procedures, correction procedures, checking corrections for accuracy, and finally placing the data file into the master database (or library). It is assumed that the organization of data entities into logical groups (i.e., layers) has been defined during the previously completed logical/physical database design activity. Processing to enter data into the master database may involve restructuring the content of the digital/scanned files from data conversion into the final database structure, usually combining entities that may have been digitized separately. Other database building processes that must be accomplished within the activities shown in figure 1 are:

- linking GIS layers to attribute tables
- edgematching between areas used in digitizing and repartitioning the spatial extent into the final organization
- initialization of all database related procedures needed for both establishing the database and its continued maintenance

Procedural components needed to complete the database include those on the following list. Many of these procedures will have been defined, at least initially, during database design and/or the pilot study and benchmark activities. The procedures are:

- naming convention for all files (covering versions, status, etc.)
- definition of error conditions
- definition of accuracy requirements
- quality control routines
- manual editing procedures
- checking procedures (verification of corrections)
- error recording (flags associated with data or other error/accuracy information recorded in the database)

Figure 2 - Library Structure to Support Editing [diagram; library areas: Raw, digitized data files; Edited files ready for checking; Completed (checked) files ready for master database]
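The three library areas named in figure 2 can be thought of as states that a data file passes through. The following is a minimal sketch of that idea in Python, not a feature of any particular GIS product. The names `Stage`, `DataFile`, `submit_for_checking`, and `check` are hypothetical, and the rule that any error found by the checker returns the file to the editing area is a simplifying assumption (figure 1 also allows the checker to fix a small number of errors directly).

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    RAW = "raw"        # raw, digitized data files (being edited)
    EDITED = "edited"  # edited files ready for checking
    MASTER = "master"  # completed (checked) files ready for master database


@dataclass
class DataFile:
    name: str
    stage: Stage = Stage.RAW
    errors: list = field(default_factory=list)


def submit_for_checking(f: DataFile) -> None:
    """Editor signals that edits are done; the file moves to the checking area."""
    if f.stage is not Stage.RAW:
        raise ValueError(f"{f.name} is not in the raw/editing area")
    f.stage = Stage.EDITED


def check(f: DataFile, found_errors: list) -> None:
    """Checker commits a clean file to the master area or returns it for editing."""
    if f.stage is not Stage.EDITED:
        raise ValueError(f"{f.name} has not been submitted for checking")
    f.errors = list(found_errors)
    # Assumption: any remaining error sends the file back to the editor.
    f.stage = Stage.RAW if f.errors else Stage.MASTER
```

A file that fails checking goes back to the editing area and must be resubmitted, so nothing reaches the master library without passing a check.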


The second major process is the integration of the database and all other system components (figure 3).

Figure 3 - System Integration [diagram; labels: Hardware; Software; Network; Editing Delivered Data; Database system integration; System Integration]

SYSTEM TESTING

Once the installations are complete, you need to test your integrated system. Test, for example, how the software programs work together, how the network is running, whether the computers slow down when complex functions are requested or when all workstations are running simultaneously, and whether data retrieval is quick enough. This process should continue for at least a week, if not more. It is important to experiment with the system on multiple days, with different processes running, and with different numbers of people accessing the data. Ask your staff to document any problems and report these to the vendor. See that resolutions are provided back to you in a timely manner. Utilize technical support lines and keep in mind that the vendors are responsible for following through on what they told you would work.
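One way to make this kind of testing repeatable is to time the same operation with different numbers of simultaneous users and record the slowest response. The sketch below uses only the Python standard library. The function `sample_query` is a hypothetical placeholder that merely sleeps; in practice it would be replaced by a real operation against your own system, and the user counts shown are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(task):
    """Run one task and return its elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    task()
    return time.perf_counter() - start


def load_test(task, users):
    """Run `task` once per simulated user, concurrently; return all timings."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        return list(pool.map(timed_call, [task] * users))


def sample_query():
    # Hypothetical stand-in for a real GIS operation (e.g., a parcel query).
    time.sleep(0.01)


if __name__ == "__main__":
    for users in (1, 4, 8):
        timings = load_test(sample_query, users)
        print(f"{users} users: slowest response {max(timings):.3f} s")
```

Running the same script on several days and keeping the printed results gives staff a simple record to attach to any problem report sent to the vendor.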

USER TRAINING

Most hardware and software vendors offer classes to teach new users about their products. You can usually include vendor instruction as part of your contract with them. User groups often offer information sessions on software products where you can learn valuable information. Whichever route you choose, proper instruction is important and is a step that should not be disregarded.


GIS DEVELOPMENT GUIDE: GIS APPLICATION DEVELOPMENT

INTRODUCTION

Over time, as users become more experienced with GIS, they require more complex applications. The initial Needs Assessment will contain some applications of a complex nature; however, the majority of initial applications will be straightforward, using the basic functionality that is part of every commercial GIS (e.g., query, display). The more complex applications usually are not supported by the basic functions of a GIS but must be programmed using the GIS macro language or another programming language. This guideline identifies several categories of applications that must be prepared by users and describes how overall requirements change over time.

WHY APPLICATIONS ARE NEEDED


Sales brochures, live demos, and journal articles touting the impressive and extensive array of GIS capabilities create the impression that application development is a non-issue. The vendors, it would seem, have already developed fully functional, out-of-the-box, meet-your-business-needs GIS software. GIS can and should do anything and everything. So why are we talking about application development? Applications are the icing on the GIS layer cake: the highest level of customizable software. The underlying "cake" provides the functionality common to all user disciplines. Commercial GIS packages tend to focus on the common or basic applications - the "cake." When it comes to specialized uses, application development fills the need for additional functionality. Though there is a great deal of commonality in the basic spatial query and display functions, there is still a need for other advanced applications. We need additional applications because needs differ between organizations. Commercial GIS development is driven by market pressure. The software vendors only respond to what makes economic sense for their market share. What's important to your organization may not be important to others. Because of this, there are no truly "off-the-shelf" applications that will match all of your needs. You either have to adapt your uses to their data model and functionality or develop applications to fit your use environment.

CATEGORIES OF APPLICATIONS

Application development does not mean rewriting the GIS software; it means building custom applications to meet specific needs. An application may be as simple as a set of preferences that is stored for each user group or individual and run as a macro at startup time. Or it may be a very complex query that selects a group of layers, identifies features of interest based on attribute ranges, creates variable width buffers, performs a series of overlays, and produces a hard copy map. In either case, an application is required to convert the user's ideas into a usable, stable product.
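To make the "complex query" case concrete, the steps it names (select by attribute range, buffer by a variable width, overlay against another layer) can be sketched in a few lines. This is an illustration only: a real application would normally be written in the GIS macro language and operate on real layers, whereas this Python sketch assumes point features held in plain dictionaries. The layer names, attributes, and buffer rule are all hypothetical.

```python
import math


def select_by_range(features, attr, low, high):
    """Keep features whose attribute value lies in [low, high]."""
    return [f for f in features if low <= f[attr] <= high]


def features_within(center, radius, features):
    """Return ids of point features no farther than `radius` from `center`."""
    cx, cy = center
    return [f["id"] for f in features
            if math.hypot(f["xy"][0] - cx, f["xy"][1] - cy) <= radius]


def run_application(sources, targets, attr, low, high, radius_for):
    """Select sources by attribute range, buffer each one, overlay on targets."""
    result = {}
    for src in select_by_range(sources, attr, low, high):
        radius = radius_for(src)  # variable-width buffer
        result[src["id"]] = features_within(src["xy"], radius, targets)
    return result


# Hypothetical sample data: storage tanks and nearby wells.
tanks = [
    {"id": "T1", "xy": (0, 0), "gallons": 500},
    {"id": "T2", "xy": (100, 0), "gallons": 5000},
    {"id": "T3", "xy": (0, 100), "gallons": 50},
]
wells = [
    {"id": "W1", "xy": (30, 0)},
    {"id": "W2", "xy": (160, 0)},
]

report = run_application(
    tanks, wells, "gallons", 100, 10000,
    radius_for=lambda t: 50 if t["gallons"] < 1000 else 100,
)
```

Packaging the steps behind one function is what turns a sequence of manual GIS operations into a repeatable application that any user in the group can run.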


Figure 1 - Life Cycle of a GIS Database [flowchart; box labels: Data Objects Identified During Needs Assessment; Source Documents: Maps, Images, Air Photos, etc.; Preparation of Data Model; Survey and Evaluation of Available Data; Match Needed Data to Available Data and Sources; Prepare Detailed Database Plan; Create Initial Metadata; Map and Tabular Data Conversion; Add Record Retention Schedules to Metadata; Database QA/QC Editing; GIS Database; Continuing GIS Database Maintenance; Database Backups; Archives]


DATABASE APPLICATIONS

Applications are not restricted to user-defined needs. One of the shortcomings of the needs assessment methodology presented earlier is its focus on only end-user query, analysis, and display requirements. Collective needs, particularly those related to system-wide functions, are not identified by individual users. The most important of these are the data administration functions for maintaining the quality and integrity of the database, such as quality control, verification, editing, back-up routines, and security. Database applications fall into the following categories:

- database set-up (described as part of GIS System Integration)
- database management
- database maintenance
- data archiving and retention

Figure 1 again shows the database life-cycle. Each step identified in figure 1 needs to be fully defined, as appropriate to the specifics of the GIS program. The main point here is not how these steps are completed but rather to identify all of the necessary steps and to emphasize the importance of planning and executing each one. The MSAccess metadata software tool accompanying these guides sets forth a structure for creating documentation needed for the management and maintenance of the GIS database. Table definitions for the metadata tool are in the appendix to the GIS Use and Maintenance guideline.

Formal Specifications for Advanced Applications

The documentation of applications in the Needs Assessment guideline describes methods suitable for preparing full and formal specifications for all applications. However, most applications in a new GIS will be of the simpler, more basic type (display, query, map overlay). These applications will likely be satisfied by the normal functionality that is included in most commercial GIS. More complex applications, either database or spatial analysis, will require development using the GIS macro programming language. For these applications the process of preparing formal specifications, similar to what any large programming project uses, should be followed. The techniques recommended in the Needs Assessment guideline are data modeling by application (E-R technique) and data flow diagramming. These techniques are suitable to provide an overview of a complex application. Additional techniques should be used, as appropriate, including:

- structural analysis and programming
- rapid prototyping

As the application development needs increase, there will be a need for additional staff with the appropriate programming skills and experience using the macro programming language of the GIS.


GIS DEVELOPMENT GUIDE: GIS USE AND MAINTENANCE

INTRODUCTION

The last step in GIS implementation is to put the system to use. With system integration and testing complete and at least some applications available for use, the system can be released to users. Two broad categories of activity must be in place at this time:

- user support and service
- system maintenance (database, hardware, software)

While we are describing the activities here, it should be noted that most of what is recommended in this final guideline should have been defined during the detailed database design step. So, if you are reading these documents for the first time and have yet to begin an in-depth system planning activity, you should add everything that follows to the Database Planning and Design and Pilot Study/Benchmark steps. One final comment - usually substantial time passes between the initiation of the needs assessment and the time a GIS is ready to use. A lot will change during this time period. The GIS design activity is in itself a change agent - users will understand more about a GIS and its associated technology after the needs assessment is concluded and will consequently expect more. The applications originally identified, plus all subsequent derived information, will change; the available GIS hardware and software will change; and the underlying computer technology will change. So basically, while you, the GIS designer, are trying to come to a set of definitive decisions to implement the GIS, everything is constantly changing. The best you will be able to do is to monitor all areas of possible change, at best a difficult task, and to decide on the GIS with the knowledge that the maintenance phase will have to accommodate substantial change. Any and all procedures we have discussed as "maintenance" in these guidelines will need to be put in place immediately after the corresponding document is created or decision is made.

USER SUPPORT AND SERVICE

User support falls into the following categories:

- basic orientation in GIS in preparation for the needs assessment
- continued briefings during the planning, design, and implementation phases
- user training courses as needed in computing, general purpose software, databases, GIS, and spatial analysis
- user involvement and evaluation during pilot study and benchmark tests
- user training in specific application use
- technical support service while GIS is in use
- user feedback procedure to identify system enhancements - GIS functions/applications and database
- data error/problem reporting and resolution procedures
- user feedback on data accuracy and system performance
- user involvement in decisions on all system upgrades - data, software, and hardware

It is difficult to identify which of the above is most important. This will vary by situation and over time. However, the first major source of user dissatisfaction is the time period between the needs assessment, where expectations are raised, and the first operational use of the system. This dissatisfaction can be strong enough to create a temptation to develop quick-and-easy applications for early use, to take short-cuts in database development, or to extend a pilot study into actual use. Such a situation cannot always be avoided; however, any premature use of this type will likely lead to more user dissatisfaction in the long term.

GIS System and Database Maintenance

The structure of this task is shown in figure 1. Three driving components of maintenance and change are: system enhancements, database expansion, and routine system maintenance (updates). Figure 1 indicates the type of change that may occur in each component and identifies the benefits and costs associated with the on-going GIS maintenance activity. As users can be negatively affected by changes, major enhancements or expansions need to be subjected to user review, even if the change is only internal to the GIS and on the surface would not affect users.

DATA MAINTENANCE PROCEDURES

Managing Existing Data

Backup / Restore

A reliable backup system is necessary for any database. Should anything happen to your hardware (e.g., the file server disk drive crashes), you will be able to restore your backup data to another machine and be operational again quickly without losing the database. Determine a schedule for regular backups of the system. This can be done daily, weekly, or monthly depending on the size of the database and the amount of change being made to it. If your staff only makes edits once a week, a weekly backup should be enough. However, if changes are constantly being made, a daily backup is important. If you have a large dataset that would be time consuming to back up every day, consider backing up only part of the database daily and then doing a full backup once per week.
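The daily-partial, weekly-full schedule suggested above is easy to encode so that an automated job can decide which kind of backup to run. The following is a minimal sketch using Python's standard `datetime` module. The function name `backup_type` and the choice of Sunday as the full-backup day are assumptions for illustration; the actual backup commands depend on your own software and are not shown.

```python
from datetime import date


def backup_type(day: date, full_backup_weekday: int = 6) -> str:
    """Return 'full' on the chosen weekday (0=Monday ... 6=Sunday), else 'partial'."""
    return "full" if day.weekday() == full_backup_weekday else "partial"


if __name__ == "__main__":
    # A scheduled job could call this once per day and branch on the result.
    print(backup_type(date.today()))
```

An organization whose staff edits only once a week could skip the partial backups entirely and run only the branch that returns "full".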


Figure 1 - Overview of GIS Maintenance [diagram centered on "GIS and Database" with three components, plus benefits and costs]

- System Enhancement Requests (Committee Review): Additional Functionality; Hardware and Software Upgrades; New Technology (GPS); Interface with Additional Systems
- Database Expansion (Committee Review): Additional Attributes; New Entities; Expanded Spatial Extent
- Routine System Maintenance: Problem/Error Resolution (Bug Fixes); Database Updating
- Benefits: User satisfaction (GIS can do more); Additional sharing (data and other); Improved performance
- Costs: Dollar cost of enhancement; GIS staff retraining; More for GIS staff to manage & maintain; User retraining; System down-time

Granting Access To Data

Often GIS applications call for users to display and/or analyze the data only, without editing it. By granting read-only access to these types of users, you eliminate any chance of data being deleted or otherwise altered by them. If you have other users who edit data, such as supervisors or trained technicians, grant them read and write permissions to the data. Data access can usually be handled by the GIS application, by the database software, and/or by network (if you are running one) software security operations.

Another important function to consider in data maintenance is transaction maintenance. This type of application registers items in the database such as when a record was updated, by whom, and from what source the changes came. A history log is kept on each record, and old records being updated can be sent to an archive file. This step may seem unnecessary in the beginning, but as the database grows an application such as this will be of great value. If there are problems or questions with data, you will know exactly whom to ask about its accuracy and quality.

Records Management And Retention

Four important questions should be looked at with regard to management and retention: what to keep, how long to keep it, how to keep it, and how often to keep it. The New York State Archives and Records Administration is currently developing additional guidelines to regulate and define records management and retention policies for GIS in local government. Until these guidelines are completed, please see the Local Government Records Technical Information Series No. 39 pamphlet for more assistance. A record in a GIS is difficult to define. It can include: data in the database, maps, aerial photographs, data dictionaries, and metadata. To help determine how long to keep your data, obtain a retention schedule from SARA. These schedules are used for hardcopy data retention but can be modified and used for your purposes. Electronic media is generally used for data storage. Again, SARA is developing regulations on this, so it would be best to contact them for guidance.

Reviewing Current Data For Potential Errors and Changes

Develop a system for QC of the data. Most likely the dataset will be too large to check everything. Determine what will be checked and what degree of accuracy you require. Several things you should look for are described below.

Incompleteness

Begin by checking to make sure all the layers of data that should be in the database are there. Also, make sure no layers are repeated. Define a process for checking some of the individual features of each layer. Determine if there is any missing data and make sure data is not repeated in more than one layer.
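The layer-level completeness check described above amounts to comparing the list of layers actually present against the list that should be present. A minimal sketch in Python follows. The function name `check_layers` and the layer names used in the usage note are hypothetical, and how the list of present layers is obtained depends on your GIS software.

```python
from collections import Counter


def check_layers(expected, present):
    """Compare the layers found in the database with the expected layer list."""
    counts = Counter(present)
    missing = sorted(set(expected) - set(counts))
    repeated = sorted(name for name, n in counts.items() if n > 1)
    unexpected = sorted(set(counts) - set(expected))
    return {"missing": missing, "repeated": repeated, "unexpected": unexpected}
```

For example, if the expected layers are parcels, roads, and zoning, and the database holds parcels, roads (twice), and hydrants, the check reports zoning as missing, roads as repeated, and hydrants as unexpected. The same pattern can be applied to feature identifiers within a layer to look for missing or duplicated features.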

Errors

There are two types of errors you should be concerned with: positional and attribute. Positional errors are defined as absolute or relative. Relative accuracy is a measure of the maximum deviation between the interval between two objects on a map and the corresponding interval between the actual objects in the field. For example, a measurement on a map from a water valve to the street centerline must be within a certain relative accuracy requirement to be accepted. Relative accuracy does not relate to a reference grid, and the correct geographic position of the object is not relevant. Absolute accuracy is a measure of the maximum deviation between the location where a feature is shown on the map and its true location on the surface of the earth (Montgomery and Schuch, 132-133). Attribute errors are problems with the information recorded about a feature, not with where it is located.

Topological Errors

Many GIS software packages are equipped to find topological errors in your dataset. Use available tools, or develop your own, to detect the following types of errors: closure (unclosed polygons), connectivity (unconnected arcs that should be connected), and coincident features. Coincident features (shared arcs) are difficult to locate; what appears to be one arc between two features may turn out to be two arcs, one on top of the other. This should be corrected because it can result in sliver polygons (small gaps between two polygons).

Detecting Change and Identifying Sources for Updates

For a local government, internal sources for data updates include building permits issued, real estate transactions, subdivisions proposed or approved by the town council, and zoning changes. This is all important information you might want to include in your GIS. External sources of data updates might include aerial photo surveys, subdivision contractor drawings, the Department of Transportation, the U.S. Postal Service, the Office of Real Property Services, and state and federal agencies (e.g., environmental groups, soil surveys, and the Coast Guard).

Collection of New Information

Once you have determined that there are new pieces of information you want to capture in your GIS, you must decide how you will collect them. One option is data conversion. It can be expensive; however, you know what the accuracy and quality of the data will be, and you will get the information when you want it. A second option is to purchase the data. Many of the sources listed in the section above will have digital data they would be willing to sell; consider signing a contract to receive any updates they make. A third option is to work data collection into the staff's daily routine. This makes data collection take longer, but it does not disrupt workflow and it costs less. Determine which field crew or staff would be able to capture the data without it becoming a burden on their job, and decide which people know the most about the data you are attempting to capture.
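
The difference between absolute and relative positional accuracy, as defined under Errors above, can be made concrete with a short calculation. The sketch below is in Python and assumes planar coordinates in a single unit (for example, feet); the function names and the idea of passing points as (x, y) pairs are assumptions for illustration, not part of any GIS package.

```python
import math


def distance(a, b):
    """Planar distance between two (x, y) points in the same units."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def absolute_error(map_point, true_point):
    """Deviation between where a feature is mapped and its true ground location."""
    return distance(map_point, true_point)


def relative_error(map_a, map_b, field_a, field_b):
    """Deviation between a map interval and the same interval measured in the field.

    Only the two intervals are compared, so a uniform shift of both
    features does not affect the result.
    """
    return abs(distance(map_a, map_b) - distance(field_a, field_b))


def within_tolerance(error, tolerance):
    """True if an error is acceptable under a tolerance chosen by the organization."""
    return error <= tolerance
```

If a water valve and the street centerline are both plotted 5 feet away from their true positions in the same direction, each has an absolute error of 5 feet, yet the relative error between them is zero because the map interval still matches the field interval. This is why a dataset can pass a relative accuracy requirement while failing an absolute one.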

Applying the Edits and Tracking Changes

Editing the database can become a tedious task. However, it is important to data integrity that the edits are made accurately and consistently. All changes should be tracked, as described above, in a way that will allow you to determine when the records were updated, by whom, and what level of confidence the data was assigned. When necessary, a history log can be displayed for each record, showing all changes to the data. Archiving data is a good way to keep out-of-date information from cluttering the system while allowing easy recall should something be wrong with the updates or new data.

Verifying the Corrections

Develop a QC process, or use the procedure you've already implemented, to check the corrections made. You will not want to verify every change, but you could select a random sample of records and confirm that the corrections were made correctly.

Updating the Master Database

Once edits are made and you've verified that they were applied correctly, the master database can be updated. If edits are being made on a daily basis, the master database may be updated on a daily basis as well, but be sure not to skip the verification step.

Distributing the Updates to Users

This will depend on the technology being used. Some users will have access to a modem and can dial up and download any edits you make. Other users will have to receive the data on a tape or disk. Determine a schedule and plan for distributing edits to your users that best suits your organization.
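
The tracking and verification steps above can be illustrated with a small sketch. It is written in Python and keeps everything in plain dictionaries and lists purely for illustration; a real system would rely on the GIS or database software. All names here (apply_edit, history_for, sample_for_verification, and the record fields) are hypothetical.

```python
import random
from datetime import datetime


def apply_edit(database, history, archive, record_id, new_values,
               editor, source, confidence):
    """Apply an edit, archive the old record, and note the change in the history log."""
    old = database.get(record_id)
    if old is not None:
        # Keep the out-of-date version so it can be recalled if the edit is wrong.
        archive.append({"record_id": record_id, "values": dict(old)})
    database[record_id] = {**(old or {}), **new_values}
    history.append({
        "record_id": record_id,
        "updated_at": datetime.now().isoformat(timespec="seconds"),
        "updated_by": editor,
        "source": source,
        "confidence": confidence,
    })


def history_for(history, record_id):
    """Return every logged change for one record, oldest first."""
    return [entry for entry in history if entry["record_id"] == record_id]


def sample_for_verification(history, sample_size, seed=None):
    """Pick a random sample of edited record IDs to check by hand."""
    edited_ids = sorted({entry["record_id"] for entry in history})
    rng = random.Random(seed)
    return rng.sample(edited_ids, min(sample_size, len(edited_ids)))
```

Each call to apply_edit leaves three traces: the updated record, an archived copy of the previous values, and a log entry stating when, by whom, from what source, and with what confidence rating the change was made. The sample drawn by sample_for_verification would be checked before the master database is updated.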

Montgomery, Glenn E. and Harold C. Schuch, GIS Data Conversion Handbook, GIS World, Inc. and UGC Consulting, Fort Collins, 1993.


Appendix A


OrganizationInfo
Name of organization/agency: Town of Amherst
Department: Assessor
Room/Suite #:
Number and Street Name: 5525 Main Street
City: Amherst
State: NY
Zip Code: 14221
Phone Number: (716)555-8888
Fax Number: (716)555-7444
Contact Person: John Henry
Phone Number/Extension: (716)555-8888 x777
Email Address: jhenry@assr.amh.gov
Organization Internet Address: general@assr.amh.gov
Comments:

CONTINUE Inputing For This Organization

Input NEXT Organization

EXIT Database


ReferenceInfo
Organization: Town of Amherst
Filename: ParcelMap
File Format: ARC/INFO
Availability: yes
Cost: $15.00
File Internet Address: ftp.assr.amh.gov/pub
Metadata Created By: Lee Stockholm
Date Metadata Created: 07-May-95
Metadata Updated By: Laura Hiffel
Date Metadata Updated: 15-Feb-96
Metadata Standard Name: New York State Local Government Standard
Comments:

CONTINUE Inputing For This File

Input NEXT File

SAVE and Input Next Organization


DataObjectInfo
Data Object Name: Parcel
Type: Simple
Data Object Description: Land ownership parcel
Spatial Object Type: Polygon
Comments:

CONTINUE with Spatial Info

CONTINUE with Attribute Info

CONTINUE with Lineage Info

CONTINUE with Update Info

Input NEXT Data Object

EXIT Data Object Info


SpatialObjectInfo
Data Object Name: Parcel
Spatial Object Type: Polygon
Place Name: Amherst
Projection Name/Description: UTM
HCS Name: State plane Coordinate System
HCS Datum: NAD83
HCS X-offset: 1000000
HCS Y-offset: 800000
HCS Xmin: 25
HCS Xmax: 83
HCS Ymin: 42
HCS Ymax: 98
HCS Units: Feet
HCS Accuracy Description: National Map Accuracy Standard
VCS Name:
VCS Datum:
VCS Zmin: 0
VCS Zmax: 0
VCS Units:
VCS Accuracy Description:
Comments:

CONTINUE with Source Document

GO BACK to Data Object Form


SourceDocumentInfo
Data Object Name: Parcel
Spatial Object Type: Polygon
Source Document Name: Parcel Map
Type: Map
Scale: Variable: 1" = 50 feet to 1" = 200 feet
Date Document Created: 17-Nov-89
Date Last Updated: 05-Oct-94
Date Digitized/Scanned: 24-Apr-95
Digitizing/Scanning Method Description: Manual digitized with Wilde B8
Accuracy Description: 90% of all tested points within 2 feet
Comments:

GO BACK to Data Object Form


AttributeInfo
Data Object Name: Parcel
Data Attribute Name: SBL Number
Attribute Description: Section, Block, and Lot Number
Attribute Filename: Parcel,PAT
Codeset Name/Description: N/A
Measurement Units: N/A
Accuracy Description: N/A
Comments:

CONTINUE with Data Dictionary

GO BACK to Data Object Form


DataDictionaryInfo
Data Object Name: Parcel
Data Attribute Name: SBL Number
Data Type: Character
Field Length: 15
Required: Yes
Comments:

GO BACK to Data Object Form
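
A data dictionary entry such as the one on the DataDictionaryInfo form above (SBL Number: Character, field length 15, required) can also be used to check attribute values during editing. The following Python sketch shows one possible way to do this; the dictionary layout, the function name, and the sample SBL value mentioned afterward are assumptions for illustration only.

```python
# Data dictionary entry taken from the sample form; the key names are hypothetical.
SBL_ENTRY = {
    "data_object": "Parcel",
    "attribute": "SBL Number",
    "data_type": "Character",
    "field_length": 15,
    "required": True,
}


def validate_value(entry, value):
    """Return a list of problems found when checking a value against its entry."""
    problems = []
    if value is None or value == "":
        # An empty value is only a problem when the attribute is required.
        if entry["required"]:
            problems.append("required value is missing")
        return problems
    if entry["data_type"] == "Character":
        if not isinstance(value, str):
            problems.append("value is not character data")
        elif len(value) > entry["field_length"]:
            problems.append(
                f"value exceeds field length {entry['field_length']}"
            )
    return problems
```

An invented value such as "68.05-2-14" would pass this check, while an empty value or a string longer than 15 characters would be reported. Only the Character type is handled here; other data types would need their own rules.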


Lineage
Data Object Name: Parcel
Data Object1: N/A
Data Object2: N/A
Description of Spatial Operation and Parameters: N/A
Accuracy Description: N/A
Comments:

GO BACK to Data Object Form


UpdateInfo
Data Object Name: Parcel
Update Frequency: Annual
Date: 01-Jan-96
Updated By: Lee Stockholm
Comments:

CONTINUE with Archive Info

GO BACK to Data Object Form


ArchiveInfo
Data Object Name: Parcel
Retention Class: A
Retention Period: Permanent
Date Archived: 31-Dec-95
Archived By: Lee Stockholm
Date to be Destroyed:

GO BACK to Data Object Form
