To implement the SDSS and support decision-making, we have to look first into the decision-making process. The steps of basic decision-making include:
Developing a GIS Database
The GIS database stores all raw data, processed data, and models. The ability to capture, retrieve, and manipulate complex spatial data can be the key to successful decision making. Each data set of any particular variable or criterion forms a data layer in the database. Because the data are collected from various sources, they must be referenced to a common geographic coordinate system before the layers can be overlaid. The GIS database in this study used the Universal Transverse Mercator (UTM) projection as its coordinate system. For a small area like South Fork, different projection systems differ little in appearance and calculation. UTM was chosen because most of the data, such as the TM images, streams, and mine sites, were originally projected to UTM. Keeping them in UTM requires no further effort and avoids the unnecessary distortion and loss of information that a projection transformation would introduce. The common map unit for the UTM projection is the meter. The data are stored in either vector or raster format, depending on the particular layer's properties.
Vector vs. Raster
Vector-based data represent geographic features similar to the way maps do. Points represent geographic features too small to be depicted as lines or areas; lines represent geographic features too narrow to be depicted as areas; and areas represent homogeneous geographic features. A Cartesian (x, y) coordinate system references real-world locations. In a vector-based data model, each location is recorded as a single x, y coordinate. Points are recorded as a single coordinate. Lines are recorded as a series of ordered x, y coordinates. Areas are recorded as a series of x, y coordinates defining line segments that enclose an area within a polygon, hence the term polygon, meaning ‘many-sided figure’.
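The vector data model described above can be illustrated with a minimal Python sketch. The coordinate values and the names mine_shaft, stream, and soil_unit are hypothetical and serve only to show how points, lines, and areas might be held as x, y coordinates; they are not taken from the study's database.

```python
import math
from typing import List, Tuple

Coordinate = Tuple[float, float]  # (x, y) in map units, e.g. UTM meters

# A point is recorded as a single coordinate (hypothetical values).
mine_shaft: Coordinate = (500120.0, 4290350.0)

# A line is recorded as an ordered series of coordinates.
stream: List[Coordinate] = [(0.0, 0.0), (60.0, 30.0), (150.0, 30.0)]

# An area is recorded as a ring of coordinates enclosing a polygon
# (the closing segment back to the first vertex is implied).
soil_unit: List[Coordinate] = [(0.0, 0.0), (90.0, 0.0), (90.0, 60.0), (0.0, 60.0)]


def line_length(line: List[Coordinate]) -> float:
    """Sum the lengths of the segments between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))


def polygon_area(ring: List[Coordinate]) -> float:
    """Area of a simple polygon by the shoelace formula."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0
```

Because the map unit is the meter, line_length returns meters and polygon_area returns square meters.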
Two file formats are used to store vector-based data in ESRI standards: the shapefile and the ArcInfo coverage. A shapefile is a simple, non-topological format for storing the geometric location and attribute information of geographic features. It is one of the spatial data formats that can be used in ArcView. The shapefile format defines the geometry and attributes of geographically referenced features in as many as five files with specific file extensions that should be stored in the same project workspace. These file extensions are:
.shp - the file that stores the feature geometry.
.shx - the file that stores the index of the feature geometry.
.dbf - the dBASE file that stores the attribute information of features. When a shapefile is added as a theme to a view, this file is displayed as a feature table.
.sbn and .sbx - the files that store the spatial index of the features. These two files may not exist until you perform theme on theme selection, spatial join, or create an index on a theme's Shape field.
Raster-based systems, like vector-based systems, store geographic data, but they view and store surfaces differently. Vector systems define an object and proceed to define its characteristics and attributes, one of which is the x, y coordinate location. The raster-based data model is more like a photograph than a map: it is a regular grid of dots (called cells, or pixels) filled with values. In fact, when a picture is stored in a computer, the raster data model is used (ESRI, 1997).
Raster-based systems divide the world into discrete uniform units called cells. Every cell represents a certain specified portion of the earth, such as a square kilometer, hectare or square meter. Each cell is given a value to correspond to the feature or characteristic that is located at or describes the site, such as a drainage basin, soil type, or residential classification. Location is not defined as an attribute but is inherent in the storage structure.
The cell is the primary spatial entity within a grid. Each cell is square, has the same size as other cells in the grid and contains a numeric value representing the spatial variable at that location. Cell values can be 32-bit integer or real (floating-point) numbers.
The uniform cells are organized into a Cartesian matrix consisting of rows and columns. A row identifies all cells equidistant from the top or bottom boundary of a grid. Columns identify all cells equidistant from the left or right boundary of the grid. Each Cartesian matrix is called a grid. Every cell in a grid has a unique row and column identifier.
Each grid represents a spatial variable. While vector features are stored as a series of x, y coordinates and topological relationships, grid cells are stored as rows and columns.
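The statement that location is inherent in the storage structure can be sketched in Python as follows. The Grid class, its field names, and the sample extent are assumptions made for illustration and do not reproduce the ESRI grid format; the sketch only shows how a row and column index can be derived from a map coordinate, and the reverse, given the grid origin and cell size.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Grid:
    """A hypothetical square-cell grid; values[row][col] holds the cell value."""
    x_min: float        # x coordinate of the left edge of the grid
    y_max: float        # y coordinate of the top edge of the grid
    cell_size: float    # cell width and height in map units
    values: List[List[float]]

    def cell_of(self, x: float, y: float) -> Tuple[int, int]:
        """Return the (row, column) of the cell containing map location (x, y)."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)  # rows count down from the top
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[0])):
            raise ValueError("location lies outside the grid")
        return row, col

    def center_of(self, row: int, col: int) -> Tuple[float, float]:
        """Return the map coordinates of the center of cell (row, column)."""
        x = self.x_min + (col + 0.5) * self.cell_size
        y = self.y_max - (row + 0.5) * self.cell_size
        return x, y
```

No coordinate is stored per cell; the position follows from the row and column alone, which is why a grid needs only its origin, cell size, and the matrix of values.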
The data needed for the implementation of SDSS for coal mine reclamation include soils, streams and lakes, mine-sites, land cover, and topographic maps, as well as Digital Elevation Models (DEM).
The soil layer in this study was digitized from county soil survey maps. The physical, chemical, and mineralogical soil properties were correlated to the geographic locations. The attributes included here are soil series, texture, organic matter content, structure, permeability, and pH. Since soils are distributed as areas, this layer was stored in vector format. Soil properties were the main source of input for the RUSLE model, and they were also used to generate an acidity map.
The stream layer was provided by the Indiana Geological Survey (IGS). It includes the streams and lakes in the area. Since the stream layer is used primarily for reference of location, no attributes are needed. This layer was stored in vector format.
The mine site layer was provided by IGS also. The attributes associated with mines are mining periods and mining methods. This layer, too, was stored in vector format.
The land cover layer was generated by interpreting remote sensing data, and it serves as an input to the Revised Universal Soil Loss Equation (RUSLE) model. Two sets of Landsat Thematic Mapper (TM) data acquired over the study area in May and July of 1993 are the primary source. Each set of TM images contains the values from seven spectral bands that range from the visible to the infrared. One of the most common methods of information extraction from remotely sensed data is multispectral classification. Traditionally, there are two major approaches to multispectral classification: unsupervised and supervised classification.
In an unsupervised classification, the computer performs clustering on the image while searching for natural groupings of the spectral properties of pixels. Generally there are two steps in an unsupervised classification procedure: building clusters and assigning pixels to each cluster. According to the user input, the computer first reads through the data set and generates clusters, with a mean vector associated with each cluster. Then using the minimum distance algorithm, the computer program operates on the whole data set on a pixel by pixel basis and assigns each pixel to a cluster. This procedure requires only a few initial inputs from the user. It is the user’s responsibility to categorize the clusters into meaningful information classes (Robinove, 1981). A good variety of algorithms have been developed for clustering procedures.
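The two steps described above, building clusters and assigning pixels by minimum distance, can be sketched in Python. This is a simple k-means-style iteration chosen for illustration; it is not necessarily the clustering algorithm used in this study, and the function names, the initial means, and the two-band pixel values in the usage note are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Pixel = Tuple[float, ...]  # one value per spectral band


def nearest_cluster(pixel: Pixel, means: Sequence[Pixel]) -> int:
    """Index of the cluster whose mean vector is closest (minimum distance rule)."""
    return min(range(len(means)), key=lambda k: math.dist(pixel, means[k]))


def build_clusters(pixels: List[Pixel], initial_means: List[Pixel],
                   iterations: int = 10) -> List[Pixel]:
    """Step 1: refine the cluster mean vectors from user-supplied starting means."""
    means = [tuple(float(v) for v in m) for m in initial_means]
    for _ in range(iterations):
        members: List[List[Pixel]] = [[] for _ in means]
        for p in pixels:
            members[nearest_cluster(p, means)].append(p)
        new_means = []
        for k, group in enumerate(members):
            if group:
                # Mean of each band over the pixels currently in this cluster.
                new_means.append(tuple(sum(band) / len(group) for band in zip(*group)))
            else:
                new_means.append(means[k])  # keep the mean of an empty cluster
        if new_means == means:
            break
        means = new_means
    return means


def classify(pixels: List[Pixel], means: Sequence[Pixel]) -> List[int]:
    """Step 2: assign every pixel to the nearest cluster, pixel by pixel."""
    return [nearest_cluster(p, means) for p in pixels]
```

For example, six two-band pixels forming a dark group and a bright group, started from means (0, 0) and (100, 100), are separated into clusters 0 and 1. The cluster numbers carry no meaning by themselves; naming them as information classes remains the user's task.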
The advantages of unsupervised classification are (Campbell 1996):
Depending on their knowledge of the study area, users can choose unsupervised or supervised classification. In some cases, both are used in the same study. Besides these two major approaches, there are also other procedures, such as texture analysis and neural networks.
Image classification is not the ultimate goal. Post-classification processing is equally important for obtaining a good result. During the post-classification procedure, the refinement of classes and the assessment of accuracy are conducted with the aid of field data and data from other sources.
In this study, the final classification map was analyzed and merged into nine distinct categories. Depending on the coverage percentage of each category, a coverage factor value was assigned for the calculation of the Revised Universal Soil Loss Equation (RUSLE).
The Digital Elevation Model (DEM) was acquired from the U.S. Geological Survey (USGS) 7.5 minute DEM data, and was clipped by the study area boundary. The 7.5 minute DEM data files are digital representations of cartographic information in raster form. DEMs consist of a sampled array of elevations for a number of ground positions at regularly spaced intervals. Each 7.5 minute DEM is based on a 30 by 30 meter data spacing within the Universal Transverse Mercator (UTM) projection. It provides the same coverage as the standard USGS 7.5 minute map series.
Topographic maps in this study were the digital version of USGS 7.5 minute topographic maps. These maps had no analytical role in the spatial decision support system: they were not criteria for analysis, nor were they input for any criteria calculations. Serving as a backdrop for the other layers, the topographic maps provided field data for better land use/land cover classification and assisted in visualizing the location information of all the other thematic layers. This layer was stored as a TIFF image file with a world file attached to it to identify the location.
All spatial analysis was carried out in a raster format. The resolution was set at 30 by 30 meters to match the resolution of both TM and USGS DEM data. Since this study deals with regional environmental management, this resolution is reasonably appropriate. All the vector data therein were converted to a raster format for analysis.
The Soil Erosion Rate Model
The erosion rate for a given site results from the combination of many physical and management variables. The Revised Universal Soil Loss Equation (RUSLE) is an erosion model designed to predict the long-term average annual soil loss carried away by runoff from specific field slopes in specified cropping and management areas. Widespread use of this equation has substantiated its usefulness and validity for this purpose. It also works well for nonagricultural conditions (Renard et al. 1997).
Along with the RUSLE model, USDA has developed a computer program and three databases for the calculation of six factors. The CITY Database contains information on climate; the CROP Database holds the parameters defining the characteristics of vegetative growth and residue; and the OPERATIONS Database defines the effects of field operations on the soil, crop, and residues. Some of the values can be derived directly from the databases, while others have to be calculated using data from the GIS database.
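RUSLE is commonly written as the product of its six factors, A = R * K * L * S * C * P, where A is the average annual soil loss, R the rainfall-runoff erosivity, K the soil erodibility, L the slope length, S the slope steepness, C the cover-management factor, and P the support practice factor. The Python sketch below applies this product cell by cell to factor grids. The function names are assumptions for illustration, and any factor values supplied to them would have to come from the databases or the GIS layers; none are taken from this study.

```python
from typing import List

FactorGrid = List[List[float]]  # one factor value per grid cell


def rusle_soil_loss(r: float, k: float, l: float, s: float, c: float, p: float) -> float:
    """Average annual soil loss A for one cell as the product of the six factors."""
    return r * k * l * s * c * p


def soil_loss_grid(r: FactorGrid, k: FactorGrid, l: FactorGrid,
                   s: FactorGrid, c: FactorGrid, p: FactorGrid) -> FactorGrid:
    """Apply the RUSLE product to six factor grids of identical shape."""
    return [
        [rusle_soil_loss(*cell) for cell in zip(*rows)]   # one cell from each factor
        for rows in zip(r, k, l, s, c, p)                 # one row from each factor
    ]
```

Because the model is a simple product, a zero in any factor grid yields zero predicted soil loss in that cell, and halving the cover-management factor halves the prediction.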
Multi-Criteria Decision-Making Analysis
Although there is a fairly extensive literature on decision making in the decision science which links management science, research and regional science fields, there is a broadly divergent use of terminology (Rosenthal 1985; Belton 1990). Therefore, before multi-criteria decision-making analysis is applied, some terms have to be defined.
Considering the nature of reclamation and that the file format is in raster form, the decision model selected for this ordering of reclamation proceedings is a combination of weighted summations (Voogd, 1983).
The three main components in Multi-Criteria Analysis (MCA) in a GIS context are criterion scores, criterion weights, and evaluation.
Because the criteria are measured on different scales, the factors must be standardized before they are linearly combined, and transformed, if necessary, so that all factor maps are positively correlated with the reclamation priority. A variety of procedures for standardization were reviewed by Voogd (1983), typically using the minimum and maximum values as scaling points. The simplest and most common is a linear scaling such as:
X = (R-Rmin)/(Rmax-Rmin) * standardized_range [5]
Where R is raw score.
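Equation [5] can be sketched in Python as below. The function name standardize and the sample scores in the usage note are hypothetical; the sketch takes Rmin and Rmax from the data themselves, which is the default scaling the equation describes.

```python
from typing import List


def standardize(raw_scores: List[float], standardized_range: float = 1.0) -> List[float]:
    """Linear scaling of equation [5]: X = (R - Rmin) / (Rmax - Rmin) * standardized_range."""
    r_min, r_max = min(raw_scores), max(raw_scores)
    if r_max == r_min:
        raise ValueError("all raw scores are equal; the scaling is undefined")
    return [(r - r_min) / (r_max - r_min) * standardized_range for r in raw_scores]
```

For example, raw scores of 2, 4, and 6 scaled to a standardized range of 10 map to 0, 5, and 10.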
A critical issue in the standardization of factors is the choice of the end points. Research has suggested that blindly using a linear scaling (or indeed any other scaling) between the default minimum and maximum values of the image is not reasonable. In setting these critical points for the standardizing function, it is important to consider their inherent meaning. Depending on the primary issue, the critical points may need to be set differently, even for the same data. For example, if we feel that industrial development should be placed as far away from a nature reserve as possible, it would be dangerous to implement this without careful consideration. If the map covers a range of 100 km from the reserve, then the farthest point away from the reserve would be given a value of 1.0. Using a linear function, a location 5 km from the reserve would then have a standardized value of only 0.05. Suppose, however, that the primary issue is pollution from polluted streams, for which a distance of only 1 km would be just as good as a distance of 100 km. The standardized score at 5 km should then really have been 1.0. If a Multi-Criteria Analysis (MCA) were undertaken using the blind linear scaling, locations within a few tens of kilometers would be severely devalued when in fact they might be quite good. In this case, the recommended critical points for the scaling would be 0 and 1 km. In developing standardized factors, careful consideration should therefore be given to the inherent meaning of the end points chosen.
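The effect of the end points can be made concrete with a short Python sketch. The function standardize_between is a hypothetical helper that applies the same linear scaling between explicitly chosen critical points and holds values beyond them at the ends of the range; the clamping is an assumption about how out-of-range values would be treated.

```python
def standardize_between(raw: float, low: float, high: float,
                        standardized_range: float = 1.0) -> float:
    """Linear scaling between chosen critical points, clamped outside them."""
    if high == low:
        raise ValueError("the critical points must differ")
    clamped = min(max(raw, low), high)   # values beyond the end points saturate
    return (clamped - low) / (high - low) * standardized_range


# A site 5 km away, scaled blindly over the full 0 to 100 km extent of the map.
blind_score = standardize_between(5.0, 0.0, 100.0)

# The same site, scaled between the meaningful critical points of 0 and 1 km.
informed_score = standardize_between(5.0, 0.0, 1.0)
```

With the blind end points the site scores about 0.05, while with critical points of 0 and 1 km it scores 1.0, which reproduces the contrast drawn in the example above.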
The development of criterion weights for Multi-Criteria Analysis (MCA) can be achieved through a variety of techniques (von Winterfeldt and Edwards 1986). One of the most promising is that of pairwise comparison developed by Saaty (1977) in the context of a decision making process. In MCA using a weighted linear combination, the weights need to sum to one. Saaty’s technique derives weights from the matrix of pairwise comparisons between the criteria. This method is very useful when direct evaluation of criterion weights is difficult, especially with an increasing number of criteria. However, in this research, only three criteria were chosen, and it is not difficult to obtain direct evaluation. Therefore Saaty’s pairwise technique was not employed.
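Although the pairwise technique was not employed here, the way weights can be derived from a comparison matrix is sketched below for reference. Saaty's method takes the weights from the principal eigenvector of the matrix; the sketch approximates that vector by power iteration. The function name and the three-criterion matrix in the usage note are hypothetical and do not represent judgments made in this study.

```python
from typing import List


def pairwise_weights(matrix: List[List[float]], iterations: int = 100) -> List[float]:
    """Approximate the principal eigenvector of a pairwise comparison matrix.

    matrix[i][j] states how many times more important criterion i is than
    criterion j, so matrix[j][i] is expected to be its reciprocal.
    The returned weights are normalized to sum to one.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    return weights
```

For a perfectly consistent matrix such as [[1, 2, 4], [1/2, 1, 2], [1/4, 1/2, 1]], the weights come out in the ratio 4 : 2 : 1, that is, 4/7, 2/7, and 1/7.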
Evaluation takes place after the criteria maps with scores and weights have been developed, to combine the information from the various criteria. Using the Weighted Linear Combination (WLC), it is fairly simple to multiply each criterion map by its weight and then sum the results. Since the weights sum to one, the resulting map should have the same range of values as that of the standardized criteria maps.
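A minimal Python sketch of the Weighted Linear Combination on raster criterion maps follows. The function name and the small sample maps in the usage note are hypothetical; the check that the weights sum to one mirrors the requirement stated above.

```python
from typing import List

CriterionMap = List[List[float]]  # standardized scores, one per grid cell


def weighted_linear_combination(criterion_maps: List[CriterionMap],
                                weights: List[float]) -> CriterionMap:
    """Multiply each criterion map by its weight and sum the results cell by cell."""
    if len(criterion_maps) != len(weights):
        raise ValueError("one weight is required per criterion map")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("the weights must sum to one")
    n_rows = len(criterion_maps[0])
    n_cols = len(criterion_maps[0][0])
    return [
        [sum(w * m[row][col] for m, w in zip(criterion_maps, weights))
         for col in range(n_cols)]
        for row in range(n_rows)
    ]
```

For two single-row maps [[0, 10]] and [[10, 10]] with equal weights of 0.5, the result is [[5, 10]]. The output stays within the 0 to 10 range of the inputs because the weights sum to one.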
Implementation of the SDSS
The reclamation priorities of coal mine sites are primarily determined by pollution intensity, soil erosion rates, soil properties, and a coal mine site’s proximity to streams. To build an SDSS, all the data must be placed into distinct decision making categories. In this SDSS, three criteria are selected: soil erosion rate, soil acidity, and the proximity of a site to streams.
It is not an easy task to develop a weighting scheme for decision making in mine reclamation. Since the priority of reclamation is decided by the attractiveness of the sites according to their values on each criterion, aggregating those values can generate the rankings. Therefore, the linear method is adopted in this study. The value of each criterion layer is normalized to a value from 0 to 10 (VCi) depending on the statistics of the original data set or on the decision-makers’ preference. Each layer is then given a weight (Wi) according to its importance. The output value for each site is, then, the sum of each criterion value multiplied by its weight.
For site j, the output value is
RPj = Σ ( VCij * Wij ) (i = 0..2) [6]
The erosion rate priority is positively related to the reclamation priorities. This means a high erosion rate gets a high priority value. High erosion rate soil has a great potential to be carried away into a stream or channel by running water and, therefore, to degrade the water quality and cause sediment pollution.
Soil acidity can be another major source of pollution. Highly acidic soil may cause severe problems for environmental management and for re-vegetation because of its low nutrient content.
Proximity to streams is the most important factor for reclamation as pollutants are usually carried by water, so it should be weighted the most highly in the weighting scheme. This factor is followed by the soil erosion rate factor, and soil acidity factor. The closer the site is to streams, the higher is the risk that it will spread pollutants from the abandoned sites.
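The ranking of sites by the three criteria can be sketched in Python as follows. The weights of 0.5, 0.3, and 0.2 are hypothetical values that only respect the ordering given above (proximity, then erosion rate, then acidity), and the 1000 m cutoff in proximity_score is likewise an assumption. Erosion and acidity scores are taken to be already normalized to 0 to 10, with higher values meaning higher priority.

```python
from typing import Dict, List

# Hypothetical weights that sum to one and follow the stated order of importance.
WEIGHTS: Dict[str, float] = {"proximity": 0.5, "erosion": 0.3, "acidity": 0.2}


def proximity_score(distance_m: float, cutoff_m: float = 1000.0) -> float:
    """Score from 0 to 10 that rises as a site gets closer to a stream."""
    bounded = min(max(distance_m, 0.0), cutoff_m)
    return 10.0 * (1.0 - bounded / cutoff_m)


def reclamation_priority(scores: Dict[str, float],
                         weights: Dict[str, float] = WEIGHTS) -> float:
    """Sum of each criterion value multiplied by its weight for one site."""
    return sum(scores[name] * weight for name, weight in weights.items())


def rank_sites(sites: Dict[str, Dict[str, float]]) -> List[str]:
    """Site names ordered from highest to lowest reclamation priority."""
    return sorted(sites, key=lambda name: reclamation_priority(sites[name]), reverse=True)
```

Under these assumed weights, a site 100 m from a stream with moderate erosion and acidity scores can outrank a more erodible, more acidic site 900 m away, which is the behavior the weighting order is meant to produce.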
User Interface
The user interface is designed to allow the decision-maker to step through a process that will result in the calculation and display of a map of weighted reclamation priorities. To use this system, the user has to have a knowledge of, or a preference for, each factor. After the user inputs the weight of each variable, the system calculates the final result and generates maps according to the user’s preference. In addition, the user can add new data sets to the system in order to add a new criterion, such as land owner preference, to the multiple criteria list. The weighting scheme can also be modified through the interface.
Because the task at hand was essentially one of dealing with spatial data, GIS functions were crucial in building the application system. The implementation of the interface combined four major parts in its structure: a main frame or programming platform; the GIS functions; visualization tools for analyzing results; and a communications capability among all of these.
Borland Delphi by Inprise is a popular package of programming software similar to Microsoft Visual Basic and Visual C++. Compared to its peers, Delphi has the best performance/input ratio. Visual Basic, which is predominantly a programming platform, uses very simple syntax in coding and is easy to use, but the compiled Visual Basic program file is usually large and runs slowly. Visual C++, on the other hand, is more complicated in concept and syntax, though the compiled file runs faster. Delphi lies between Visual Basic and Visual C++ with respect to speed, size, and programmer’s effort. Another advantage of using Delphi for this research is that several applications have already been built with MapObjects, which makes it possible to proceed directly to building applications without spending time familiarizing oneself with the programming platform.
GIS functions have to be extracted from GIS software. With two or three GIS packages and components at hand, a developing environment had to be chosen among the available options. The available GIS software included ArcInfo by ESRI, ArcView by ESRI and MapObjects by ESRI.
ArcInfo is by far the most powerful GIS software in the GIS world with respect to the functions it provides for both raster and vector analyses. There is no doubt that ArcInfo could fulfill the task of building a spatial decision support system. The shortcomings of using ArcInfo to develop an SDSS are:
ArcView is another available GIS package by ESRI. It is much easier to use than ArcInfo because of its Windows interface. The functions for vector analysis and processing in ArcView are well developed. An advantage of ArcView is that it allows multiple windows to display different themes or overlays, which makes it very easy for users to conduct side-by-side comparisons. However, it lacks the capabilities for raster analysis and grid modeling. Therefore, ArcView alone could not fulfill an important requirement of this research: implementing a spatial decision support system that relies mostly on grid analysis.
MapObjects is a new product by ESRI using GIS componentware technology. The portability and flexibility of MapObjects for use with other industry-standard packages distinguish it from both the ArcInfo and the ArcView packages. Wrapping the most commonly used GIS functions, it provides a means for users outside the GIS field to display and analyze geographic information within the framework of a familiar programming environment. In implementing the spatial decision support system for this research, MapObjects was chosen to display multiple windows by using the "Map" object provided by the MapObjects program. Although MapObjects has the capability to display raster or image layers, it is unable to support operations on raster data.
Therefore, for this study, Delphi was selected as the programming platform, ArcInfo as the main GIS function source, and MapObjects as the visualization tool. The only step left was to choose a method to connect these three parts. There was no difficulty in embedding MapObjects in the Delphi program because MapObjects is an ActiveX control. The remaining question was how to gain access to the ArcInfo functions within the Delphi environment. The Open Development Environment (ODE) by ESRI provides an environment that allows standard programming languages to call ArcInfo functions through ActiveX controls. The main ArcInfo components, such as ArcEdit, ArcPlot, and Grid, are encapsulated in the controls. By embedding the controls in the program, the application can easily access the commands and functions.
The Open Development Environment (ODE) by ESRI was used as the primary means to develop a user interface to access the ArcInfo GIS functions. This was accomplished through the use of Rapid Application Development (RAD) tools to take advantage of the flexible programming environment, which has easy access to any ArcInfo functions. Since there is a limitation in accessing ArcInfo through the ODE controls designed for Windows NT, MapObjects, which is a GIS component developed by ESRI, was used for displaying multiple windows. Three major steps were required to implement the fully functioning graphic user interface: interface design, function calls, and interface and function interaction.