1. Methodology
Reclamation of abandoned coal mine sites is an extremely complex process. Many physical, economic, social, and political factors affect the final outcome. Some of the non-physical variables cannot be controlled, and there is no easy way to quantify them. The spatial decision support system (SDSS) was designed to support decision making in the context of natural resource management. Data from various sources were collected, processed, and stored in the GIS database. A user interface was provided so that the user can add further information relevant to coal mine reclamation as needed. The SDSS was integrated with the GIS database, with a soil erosion model, and with multi-criteria decision-making (MCDM) methods (Figure 3-1).

To implement the SDSS and support decision-making, we have to look first into the decision-making process. The steps of basic decision-making include:

    1. Define the alternatives.
This step identifies the problems and the goals to be accomplished through decision-making. In reclamation planning, the problem is that there are too many abandoned sites and too little funding to reclaim them all. The goal is to select the most urgent sites and reclaim them using the limited resources.
    2. List the comparison criteria.
Comparison criteria are the available variables that might affect or relate to the problem and need to be taken into consideration. For simple decision making, the criteria can be listed in a table. For example, a customer choosing a computer by examining different brands first decides which components or criteria matter, such as memory, speed, size, and hard drive; then acquires the information; and finally lists it in a table to make comparisons. Reclamation, however, is a far more complex process involving spatial aspects. Criteria can range from physical to social to political data. The data therefore have to be organized more systematically; a simple table is not sufficient for this kind of problem. Because a GIS is designed to organize spatial data, it is well suited to processes involving spatially distributed data.
    3. Weigh the importance of each criterion.
Not all criteria are created equal; each has its own weight. Weights also make calculation possible: qualitative data cannot be calculated or compared directly, so weighting is especially valuable for them. There are three ways to assign a weight to a criterion. One way is to give a particular number to each criterion. Another is to set the total weight of all the criteria to 100 and assign a percentage to each criterion according to its relative importance in the whole criteria set. The third way is more complicated and needs an algorithm to generate the final list of weights. Pairwise comparison belongs to this category: each criterion in a pair is given a weight relative to the other, and an algorithm is applied to these pairwise judgments to calculate the weight set for all the criteria.
    4. Rate each alternative on each criterion.
The data within each criterion are described by a range of values if they are quantitative, or by a distinct set of categories if they are qualitative. Each alternative is rated on a criterion by assigning a value or score according to its original value or category, depending on the data type. There are several approaches to rating a range of values. For instance, a linear method can be used for normally distributed data. Another approach is to classify the range of values into several categories and to assign a value to each category. In this case, all the alternatives belonging to a given category receive the same value.
    5. Perform the comparison.
With all the criteria weighted and the alternatives rated on each criterion, the total score can be calculated in a straightforward manner. By comparing the final scores of all the alternatives, a decision-maker can choose the most practical one according to preferences and existing conditions. If the results are not satisfactory, the weighting and rating scheme can be redone. With a computer program, this process is easy to repeat. Decision support systems (DSS) provide the environment and tools for decision-makers to go through all the steps without having to understand the system's internals. Decision-makers do not have to write the code for the programs; they only need to select criteria, weigh them, and rate them, using the functions that the DSS provides. The DSS takes care of the rest: storing the data, organizing the data and models, calculating the scores, and presenting the result in a readable form. In a spatial decision support system for reclamation, a GIS database is used to store and manipulate the data, a model base is used to manage the models for calculation, and multiple criteria analysis is used to form the decision matrix and produce the eventual result.
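
The five steps can be illustrated with a small sketch. It reuses the computer-buying example from step 2; the brand names, weights, and ratings are invented for illustration and are not data from this study.

```python
# Hypothetical weights for three criteria; they are chosen to sum to 1.
weights = {"memory": 0.4, "speed": 0.4, "hard_drive": 0.2}

# Hypothetical ratings (0-10) of each alternative on each criterion.
ratings = {
    "brand_a": {"memory": 8, "speed": 6, "hard_drive": 5},
    "brand_b": {"memory": 5, "speed": 9, "hard_drive": 7},
    "brand_c": {"memory": 7, "speed": 7, "hard_drive": 9},
}


def total_score(rating, weights):
    """Weighted sum of one alternative's ratings."""
    return sum(weights[c] * rating[c] for c in weights)


def rank_alternatives(ratings, weights):
    """Return (name, score) pairs sorted from highest to lowest total score."""
    scores = {name: total_score(r, weights) for name, r in ratings.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_alternatives(ratings, weights)
for name, score in ranking:
    print(f"{name}: {score:.1f}")
```

Changing an entry in `weights` and rerunning the ranking corresponds to redoing the weighting scheme when the first result is not satisfactory.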

Developing a GIS Database

The GIS database stores all raw data, processed data, and models. The ability to capture, retrieve, and manipulate complex spatial data can be the key to successful decision making. Each data set of a particular variable or criterion forms a data layer in the database. Because the data were collected from various sources, they must be referenced to a common geographic coordinate system before the layers can be overlaid. The GIS database in this study used the Universal Transverse Mercator (UTM) projection as its coordinate system. For a small area like South Fork, different projection systems differ little in appearance and calculation. UTM was chosen because most of the data, such as the Thematic Mapper (TM) images, streams, and mine sites, were originally projected to UTM. Keeping them in UTM requires no further effort and avoids unnecessary distortion and loss of information from projection transformation. The common map unit for the UTM projection is the meter. The data are stored in either vector or raster format, depending on the particular layer's properties.

Vector vs. Raster

Vector-based data represent geographic features similar to the way maps do. Points represent geographic features too small to be depicted as lines or areas; lines represent geographic features too narrow to be depicted as areas; and areas represent homogeneous geographic features. A Cartesian (x, y) coordinate system references real-world locations. In a vector-based data model, each location is recorded as an x, y coordinate pair. Points are recorded as a single coordinate pair. Lines are recorded as a series of ordered x, y coordinates. Areas are recorded as a series of x, y coordinates defining line segments that enclose an area, hence the term polygon, meaning ‘many-sided figure’.
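
As a minimal sketch of this data model, the three feature types can be written as plain coordinate lists. The coordinates are hypothetical map units (meters), and closing the polygon ring by repeating its first vertex is a convention assumed here rather than something stated in the text.

```python
import math

# A point is a single (x, y) pair.
point = (500.0, 200.0)

# A line is an ordered series of (x, y) pairs.
line = [(0.0, 0.0), (30.0, 40.0), (30.0, 100.0)]

# An area is a series of (x, y) pairs whose segments enclose a polygon.
# The ring is closed here by repeating the first vertex at the end.
polygon = [(0.0, 0.0), (60.0, 0.0), (60.0, 30.0), (0.0, 30.0), (0.0, 0.0)]


def line_length(coords):
    """Sum of the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Area of a closed ring by the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2.0


print(line_length(line))      # length in map units
print(polygon_area(polygon))  # area in square map units
```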

Two file formats are used to store vector-based data in ESRI standards: the shapefile and the ArcInfo coverage. The shapefile is a simple, non-topological format for storing the geometric location and attribute information of geographic features. It is one of the spatial data formats that can be used in ArcView. The shapefile format defines the geometry and attributes of geographically referenced features in as many as five files with specific file extensions, which should be stored in the same project workspace. These file extensions are:

.shp - the file that stores the feature geometry.

.shx - the file that stores the index of the feature geometry.

.dbf - the dBASE file that stores the attribute information of features. When a shapefile is added as a theme to a view, this file is displayed as a feature table.

.sbn and .sbx - the files that store the spatial index of the features. These two files may not exist until you perform theme on theme selection, spatial join, or create an index on a theme's Shape field.
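
The file list above can be checked mechanically. The sketch below treats .shp, .shx, and .dbf as required and .sbn and .sbx as optional, which follows the note that the last two may not exist; the function name and the file names in `workspace` are hypothetical.

```python
import os

REQUIRED = (".shp", ".shx", ".dbf")  # geometry, geometry index, attributes
OPTIONAL = (".sbn", ".sbx")          # spatial index, created on demand


def shapefile_components(base_name, file_names):
    """Report which parts of the shapefile named base_name are present."""
    found = set()
    for name in file_names:
        stem, ext = os.path.splitext(name)
        if stem.lower() == base_name.lower():
            found.add(ext.lower())
    return {
        "missing_required": [e for e in REQUIRED if e not in found],
        "optional_present": [e for e in OPTIONAL if e in found],
    }


# Hypothetical workspace listing.
workspace = ["streams.shp", "streams.shx", "streams.dbf", "mines.shp", "mines.dbf"]
print(shapefile_components("streams", workspace))
print(shapefile_components("mines", workspace))
```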

An ArcInfo coverage is a digital version of a map and forms the basic unit of vector data storage in ArcInfo. A coverage contains two kinds of information about geographic objects: geographic information and attribute information. It stores geographic features as primary features (such as arcs, nodes, polygons, and label points) and secondary features (such as tics, map extent, links, and annotation). Associated feature attribute tables describe and store attributes of the geographic features. A coverage usually represents a single theme such as soils, streams, or land use. Unlike a shapefile, which is stored as several files, a coverage is physically stored as a directory containing several internally formatted ArcInfo files for tics, boundaries, annotation, and attributes.

Raster-based systems, like vector-based systems, store geographic data, but they view and store surfaces differently. Vector systems define an object and proceed to define its characteristics and attributes, one of which is the x, y coordinate location. The raster-based data model is more like a photograph than a map: it is a regular grid of dots (called cells, or pixels) filled with values. In fact, when a picture is stored in a computer, the raster data model is used (ESRI, 1997).

Raster-based systems divide the world into discrete uniform units called cells. Every cell represents a certain specified portion of the earth, such as a square kilometer, hectare or square meter. Each cell is given a value to correspond to the feature or characteristic that is located at or describes the site, such as a drainage basin, soil type, or residential classification. Location is not defined as an attribute but is inherent in the storage structure.

The cell is the primary spatial entity within a grid. Each cell is square, has the same size as other cells in the grid and contains a numeric value representing the spatial variable at that location. Cell values can be 32-bit integer or real (floating-point) numbers.

The uniform cells are organized into a Cartesian matrix consisting of rows and columns. A row identifies all cells equidistant from the top or bottom boundary of a grid. Columns identify all cells equidistant from the left or right boundary of the grid. Each Cartesian matrix is called a grid. Every cell in a grid has a unique row and column identifier.

Each grid represents a spatial variable. While vector features are stored as a series of x, y coordinates and topological relationships, grid cells are stored as rows and columns.
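
Because location is implied by the storage structure, converting between map coordinates and cell indices is simple arithmetic. The sketch below assumes that row 0 is the top row, that the grid is anchored at its upper-left corner, and that cells are 30 m square; the UTM-like corner coordinates are invented.

```python
CELL_SIZE = 30.0  # meters; assumed square cells


def make_grid(n_rows, n_cols, fill=0):
    """A grid is a list of rows, each row a list of cell values."""
    return [[fill] * n_cols for _ in range(n_rows)]


def xy_to_cell(x, y, x_min, y_max, cell_size=CELL_SIZE):
    """Row and column of the cell containing map location (x, y)."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


def cell_center(row, col, x_min, y_max, cell_size=CELL_SIZE):
    """Map coordinates of the center of cell (row, col)."""
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return x, y


# Hypothetical upper-left corner of a 3-row by 4-column grid.
X_MIN, Y_MAX = 500000.0, 4300000.0
grid = make_grid(3, 4)
row, col = xy_to_cell(500045.0, 4299980.0, X_MIN, Y_MAX)
grid[row][col] = 7  # store a value for the spatial variable at that cell
print(row, col, cell_center(row, col, X_MIN, Y_MAX))
```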

The data needed for the implementation of SDSS for coal mine reclamation include soils, streams and lakes, mine-sites, land cover, and topographic maps, as well as Digital Elevation Models (DEM).

The soil layer in this study was digitized from county soil survey maps. The physical, chemical, and mineralogical soil properties were correlated to the geographic locations. The attributes included here are soil series, texture, organic matter content, structure, permeability, and pH. Since soil units are distributed as areas, this layer was stored in vector format. Soil properties were the main source of input for the Revised Universal Soil Loss Equation (RUSLE) model and were also used to generate an acidity map.

The stream layer was provided by the Indiana Geological Survey (IGS). It includes the streams and lakes in the area. Since the stream layer is used primarily for reference of location, no attributes are needed. This layer was stored in vector format.

The mine site layer was provided by IGS also. The attributes associated with mines are mining periods and mining methods. This layer, too, was stored in vector format.

The land cover layer was generated by interpreting remote sensing data, and it serves as an input to the Revised Universal Soil Loss Equation (RUSLE) model. Two sets of Landsat Thematic Mapper (TM) data acquired over the study area in May and July of 1993 are the primary source. Each set of TM images contains values from seven spectral bands that range from the visible to the infrared. One of the most common methods of extracting information from remotely sensed data is multispectral classification. Traditionally, there are two major approaches to multispectral classification: unsupervised and supervised classification.

In an unsupervised classification, the computer performs clustering on the image while searching for natural groupings of the spectral properties of pixels. Generally there are two steps in an unsupervised classification procedure: building clusters and assigning pixels to each cluster. According to the user input, the computer first reads through the data set and generates clusters, with a mean vector associated with each cluster. Then using the minimum distance algorithm, the computer program operates on the whole data set on a pixel by pixel basis and assigns each pixel to a cluster. This procedure requires only a few initial inputs from the user. It is the user’s responsibility to categorize the clusters into meaningful information classes (Robinove, 1981). A good variety of algorithms have been developed for clustering procedures.
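
The assignment step can be sketched as follows, assuming that the cluster means are already available and that distance is measured as Euclidean distance in spectral space. The two-band pixel values and the means are invented; a real TM pixel would carry seven band values.

```python
import math


def assign_to_clusters(pixels, cluster_means):
    """Label each pixel with the index of its nearest cluster mean."""
    labels = []
    for pixel in pixels:
        distances = [math.dist(pixel, mean) for mean in cluster_means]
        labels.append(distances.index(min(distances)))
    return labels


def cluster_mean(pixels, labels, cluster):
    """Mean vector of the pixels currently assigned to one cluster."""
    members = [p for p, label in zip(pixels, labels) if label == cluster]
    n_bands = len(members[0])
    return tuple(sum(p[b] for p in members) / len(members) for b in range(n_bands))


# Hypothetical two-band pixel values and initial cluster means.
pixels = [(10, 12), (11, 13), (50, 48), (52, 51)]
means = [(10.0, 10.0), (50.0, 50.0)]

labels = assign_to_clusters(pixels, means)
print(labels)
print(cluster_mean(pixels, labels, 0))
```

The labels are only cluster numbers; deciding that cluster 0 is, say, forest remains the user's task, as noted above.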

The advantages of unsupervised classification are (Campbell 1996):

The disadvantages are:

In a supervised classification, on the other hand, the process uses samples with known properties and identities to classify unknown pixels. The samples are located in training areas for which the user has obtained both spectral and informational data. The result therefore relies heavily on the quality of the samples. Compared with unsupervised classification, supervised classification gives users more control over the process. First, the classification is tied to the user-defined classes, so the unpredictable classes that unsupervised classification may generate can be avoided. Second, users can skip the step of matching natural classes to information classes. Furthermore, users can easily detect errors by examining the training data.

Despite these advantages over unsupervised classification, supervised classification also has some disadvantages. The main shortcoming is that the classes proposed by the users might not match the natural classes, and some unique classes might be overlooked. Second, the sample data may not be representative of the whole image, so natural classes that do not occur in the training areas may be missed. Finally, selecting training data is extremely time-consuming, and the user might encounter difficulties in matching training areas obtained from other sources to the image.

Depending on their knowledge of the study area, users can choose unsupervised or supervised classification. In some cases, both are used in the same study. Besides these two major approaches, there are also other procedures, such as texture analysis and neural networks.

Image classification is not the ultimate goal. Post-classification processing is equally important for obtaining a good result. During post-classification, the classes are refined and accuracy is assessed with the aid of field data and data from other sources.

In this study, the final classification map was analyzed and merged into nine distinct categories. Depending on the coverage percentage of each category, a coverage factor value was assigned for the calculation of the Revised Universal Soil Loss Equation (RUSLE).

The Digital Elevation Model (DEM) was acquired from the U.S. Geological Survey (USGS) 7.5-minute DEM data and was clipped to the study area boundary. The 7.5-minute DEM data files are digital representations of cartographic information in raster form. DEMs consist of a sampled array of elevations for a number of ground positions at regularly spaced intervals. Each 7.5-minute DEM is based on a 30 by 30 meter data spacing within the Universal Transverse Mercator (UTM) projection. It provides the same coverage as the standard USGS 7.5-minute map series.

Topographic maps in this study were the digital version of USGS 7.5-minute topographic maps. These maps had no analytical role in the spatial decision support system: they were neither criteria for analysis nor inputs to any criterion calculation. Serving as a backdrop for other layers, the topographic maps provided field data for better land use/land cover classification and assisted in visualizing the locations of all the other thematic layers. This layer was stored as a TIFF image file with an accompanying world file to identify its location.

All spatial analysis was carried out in raster format. The resolution was set at 30 by 30 meters to match the resolution of both the TM and the USGS DEM data. Since this study deals with regional environmental management, this resolution is appropriate. All the vector data were converted to raster format for analysis.
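
One simple way to convert a polygon layer to a 30 m grid is to test the center of each cell against the polygon. The sketch below uses a ray-casting point-in-polygon test; it illustrates the idea only and is not the conversion routine used in this study, which was performed with the GIS software. The polygon, the cell value, and the grid extent are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, closed or open."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Count edges that a horizontal ray from (x, y) to the right crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, value, x_min, y_max, n_rows, n_cols, cell_size=30.0, nodata=0):
    """Assign value to every cell whose center falls inside the polygon."""
    grid = []
    for row in range(n_rows):
        y = y_max - (row + 0.5) * cell_size
        grid.append([
            value if point_in_polygon(x_min + (col + 0.5) * cell_size, y, ring) else nodata
            for col in range(n_cols)
        ])
    return grid


# Hypothetical 60 m by 60 m soil polygon rasterized onto a 3 by 3 grid.
soil_polygon = [(0.0, 0.0), (60.0, 0.0), (60.0, 60.0), (0.0, 60.0)]
soil_grid = rasterize(soil_polygon, 7, x_min=0.0, y_max=90.0, n_rows=3, n_cols=3)
for grid_row in soil_grid:
    print(grid_row)
```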

The Soil Erosion Rate Model

The erosion rate for a given site results from the combination of many physical and management variables. The Revised Universal Soil Loss Equation (RUSLE) is an erosion model designed to predict the long-term average annual soil loss carried away by runoff from specific field slopes in specified cropping and management areas. Widespread use of this equation has substantiated its usefulness and validity for this purpose. It also works well for nonagricultural conditions (Renard et al. 1997).

Along with the RUSLE model, USDA has developed a computer program and three databases for the calculation of the six factors. The CITY Database contains information on climate; the CROP Database holds the parameters defining the characteristics of vegetative growth and residue; and the OPERATIONS Database defines the effects of field operations on the soil, crop, and residues. Some of the values can be derived directly from these databases, while others have to be calculated using data from the GIS database.
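
The text does not write the equation out. In its standard published form, RUSLE multiplies six factors, A = R * K * L * S * C * P (rainfall erosivity, soil erodibility, slope length, slope steepness, cover management, and support practice). The sketch below applies that product to single values and then cell by cell; all factor values are invented for illustration, and the units of A depend on the units chosen for R and K.

```python
def rusle_soil_loss(r, k, l, s, c, p=1.0):
    """Average annual soil loss A as the product of the six RUSLE factors."""
    return r * k * l * s * c * p


def rusle_grid(r, k_grid, ls_grid, c_grid, p=1.0):
    """Apply RUSLE cell by cell; ls_grid holds the combined L * S factor."""
    return [
        [r * k * ls * c * p for k, ls, c in zip(k_row, ls_row, c_row)]
        for k_row, ls_row, c_row in zip(k_grid, ls_grid, c_grid)
    ]


# Hypothetical factor values for a single site.
site_loss = rusle_soil_loss(r=170.0, k=0.3, l=1.2, s=1.5, c=0.1, p=1.0)
print(round(site_loss, 2))

# Hypothetical 1 by 2 factor grids sharing one rainfall erosivity value.
loss_grid = rusle_grid(170.0, [[0.3, 0.2]], [[1.8, 1.0]], [[0.1, 0.5]])
print(loss_grid)
```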

Multi-Criteria Decision-Making Analysis

Although there is a fairly extensive literature on decision making in decision science, which links the management science, research, and regional science fields, terminology is used in broadly divergent ways (Rosenthal 1985; Belton 1990). Therefore, before multi-criteria decision-making analysis is applied, some terms have to be defined.

    1. Decision is a choice between alternatives.
    2. Criterion is some basis for a decision that can be measured and evaluated. It is the evidence upon which an individual piece of data can be assigned to a decision set.
    3. Multi-criteria analysis: to meet a specific objective, it is frequently the case that several criteria will need to be analyzed. Such procedures are called multi-criteria analysis or evaluation (Voogd 1983; Carver 1991; Zeleny 1982).
In this study, MCDM methods were integrated with a GIS to provide a means of placing reclamation proceedings in priority order based on a variety of choice criteria and on the importance (weight) a decision-maker attaches to each. In a GIS, Multi-Criteria Analysis (MCA) is most commonly achieved by one of two procedures: Boolean overlay and weighted linear combination (WLC). In a Boolean overlay, all criteria are reduced to logical statements of suitability and then combined by means of one or more logical operators such as AND, NOT, and OR. In a WLC, continuous criteria are standardized to a common numeric range and then combined by means of a weighted average. The result is a continuous map of suitability to which the decision maker may apply a threshold to reach a final decision. Because these two procedures make very different statements about how criteria should be evaluated, they frequently lead to different results.

Boolean overlay is a very extreme form of decision making. If the criteria are combined with a logical AND, a location has to meet every criterion to be included in the decision set; if even one criterion is not met, the location is excluded. Only locations whose worst quality passes the test can succeed in such a procedure. If a logical OR is used, on the other hand, a location is included if only a single criterion is met. This is a rather optimistic strategy that carries some risk.

In WLC, criteria are combined by applying a weight to each and then summing the results to yield a suitability map. The advantage of WLC is that it allows factors to trade off their qualities: a very poor quality can be compensated for by a number of very strong qualities. This operator represents neither an AND nor an OR; it lies somewhere in between (Bonissone and Deckher 1986) and therefore does not take an extreme form of decision-making. Each approach has its own advantages and disadvantages, and it is hard to determine which is better without investigating the case at hand. Because of the features of the two procedures and the ease with which they can be implemented, Boolean overlay dominates vector approaches to MCA, while weighted linear combination dominates solutions in raster systems.
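
The contrast between Boolean overlay and WLC can be seen on a few locations. In this sketch the standardized scores, the suitability threshold of 0.5, and the weights are all hypothetical.

```python
THRESHOLD = 0.5            # hypothetical cut-off for "suitable" on one criterion
WEIGHTS = (0.5, 0.3, 0.2)  # hypothetical weights, summing to 1

# Hypothetical standardized scores (0-1) on three criteria for four locations.
locations = {
    "a": (0.9, 0.8, 0.1),
    "b": (0.6, 0.6, 0.6),
    "c": (0.2, 0.3, 0.9),
    "d": (0.1, 0.2, 0.3),
}


def boolean_and(scores, threshold=THRESHOLD):
    """Included only if every criterion passes the threshold."""
    return all(s >= threshold for s in scores)


def boolean_or(scores, threshold=THRESHOLD):
    """Included if at least one criterion passes the threshold."""
    return any(s >= threshold for s in scores)


def wlc(scores, weights=WEIGHTS):
    """Weighted linear combination of standardized scores."""
    return sum(w * s for w, s in zip(weights, scores))


and_set = [name for name, s in locations.items() if boolean_and(s)]
or_set = [name for name, s in locations.items() if boolean_or(s)]
wlc_scores = {name: wlc(s) for name, s in locations.items()}
print(and_set, or_set, wlc_scores)
```

Location "a" fails the AND overlay because of one weak criterion, yet it receives the highest WLC score because its strong criteria compensate for the weak one.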

Considering the nature of reclamation and that the file format is in raster form, the decision model selected for this ordering of reclamation proceedings is a combination of weighted summations (Voogd, 1983).

The three main components in Multi-Criteria Analysis (MCA) in a GIS context are criterion scores, criterion weights, and evaluation.

Because the criteria are measured on different scales, the factors must be standardized before they are combined linearly and, if necessary, transformed so that all factor maps are positively correlated with the reclamation priority. A variety of standardization procedures were reviewed by Voogd (1983), typically using the minimum and maximum values as scaling points. The simplest and most common is a linear scaling such as:

X = (R-Rmin)/(Rmax-Rmin) * standardized_range [5]

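where R is the raw score, Rmin and Rmax are the minimum and maximum raw scores used as scaling points, and standardized_range is the width of the standardized scale.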

A critical issue in the standardization of factors is the choice of the end points. Research has suggested that blindly applying a linear scaling (or indeed any other scaling) between the minimum and maximum values of the image is not reasonable. In setting these critical points for the standardizing function, it is important to consider their inherent meaning; depending on the issue at hand, different critical points may be needed even for the same data. For example, suppose industrial development should be placed as far away from a nature reserve as possible. It would be dangerous to implement this without careful consideration. If the map covers a range of 100 km from the reserve, the point farthest from the reserve would be given a value of 1.0, and under a linear function a location 5 km from the reserve would have a standardized value of only 0.05. Now suppose the underlying concern is pollution carried by streams, for which a distance of only 1 km would be as good as 100 km. The standardized score at 5 km should then really be 1.0. If a Multi-Criteria Analysis (MCA) were undertaken using blind linear scaling, locations within a few tens of kilometers would be severely devalued when in fact they might be quite good. In this case, the recommended critical points for the scaling are 0 and 1 km. In developing standardized factors, careful consideration should be given to the inherent meaning of the end points chosen.
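
The nature reserve example can be reproduced numerically. The sketch below implements the linear scaling with explicit end points; clamping values that fall outside the end points is an assumption made here so that every distance beyond the upper critical point scores the maximum.

```python
def linear_scale(raw, low, high, standardized_range=1.0):
    """Linear scaling between chosen end points, clamped outside them."""
    if high == low:
        raise ValueError("end points must differ")
    fraction = (raw - low) / (high - low)
    fraction = max(0.0, min(1.0, fraction))
    return fraction * standardized_range


# Blind scaling over the full 0-100 km extent of the map.
blind = linear_scale(5.0, low=0.0, high=100.0)

# Scaling with critical points at 0 and 1 km, as recommended in the example.
informed = linear_scale(5.0, low=0.0, high=1.0)

print(blind, informed)
```

The same 5 km location scores 0.05 under blind scaling and 1.0 once the end points reflect the distance at which the concern stops mattering.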

The development of criterion weights for Multi-Criteria Analysis (MCA) can be achieved through a variety of techniques (von Winterfeldt and Edwards 1986). One of the most promising is the pairwise comparison technique developed by Saaty (1977) in the context of a decision-making process. For MCA using a weighted linear combination, the weights need to sum to one. Saaty’s technique derives the weights from a matrix of pairwise comparisons between the criteria. This method is very useful when direct evaluation of criterion weights is difficult, especially as the number of criteria increases. In this research, however, only three criteria were chosen, and direct evaluation was not difficult. Therefore Saaty’s pairwise technique was not employed.
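
Although the technique was not employed here, a short sketch shows how weights can be derived from a pairwise comparison matrix. In Saaty's approach the weights are taken from the principal eigenvector of the matrix; the code approximates it by power iteration, which is one common way to compute it and is an implementation choice made for this sketch. The comparison values are hypothetical.

```python
def pairwise_weights(matrix, iterations=100):
    """Approximate the principal eigenvector by power iteration.

    matrix[i][j] states how many times more important criterion i is
    than criterion j. The returned weights are normalized to sum to 1.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [
            sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)
        ]
        total = sum(product)
        weights = [value / total for value in product]
    return weights


# Hypothetical judgments: criterion 0 is twice as important as criterion 1
# and four times as important as criterion 2; reciprocals fill the rest.
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

derived = pairwise_weights(comparisons)
print([round(w, 3) for w in derived])
```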

Evaluation takes place after the criteria maps with scores and weights have been developed, to combine the information from the various criteria. Using the Weighted Linear Combination (WLC), it is fairly simple to multiply each criterion map by its weight and then sum the results. Since the weights sum to one, the resulting map should have the same range of values as that of the standardized criteria maps.

Implementation of the SDSS

The reclamation priorities of coal mine sites are primarily determined by pollution intensity, soil erosion rates, soil properties, and a coal mine site’s proximity to streams. To build an SDSS, all the data must be placed into distinct decision-making categories. In this SDSS, three criteria are selected: soil erosion rate, soil acidity, and the proximity of a site to streams.

It is not an easy task to develop a weighting scheme for decision making in mine reclamation. Because the reclamation priority of a site is determined by its values on each criterion, aggregating those values can generate the rankings. Therefore, the linear method is adopted in this study. The value of each criterion layer is normalized to a value from 0 to 10 (VCi), depending on the statistics of the original data set or on the decision-makers’ preference. Each layer is then given a weight (Wi) according to its importance. The output value for each site is then the sum of each criterion value multiplied by its weight.

For site j, the output value is

RPj = Σ ( VCij * Wi ) (i = 0..2) [6]

The erosion rate is positively related to the reclamation priority: a high erosion rate receives a high priority value. Soil with a high erosion rate has great potential to be carried into a stream or channel by running water and therefore to degrade water quality and cause sediment pollution.

Soil acidity can be another major source of pollution. Highly acidic soil may cause severe problems for environmental management and for re-vegetation because of its low nutrient content.

Proximity to streams is the most important factor for reclamation because pollutants are usually carried by water, so it should receive the highest weight in the weighting scheme. It is followed by the soil erosion rate factor and then the soil acidity factor. The closer a site is to streams, the higher the risk that pollutants will spread from it.
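
Applied cell by cell, the weighted sum for the three criteria can be sketched as below. The weights only respect the ordering described above (stream proximity, then erosion rate, then acidity); their actual values, like the 2 by 2 layers, are hypothetical. Each layer is assumed to be already normalized to 0-10, with higher values meaning higher priority, so the proximity layer scores cells near streams highest.

```python
# Hypothetical weights that respect the stated order of importance.
WEIGHTS = {"stream_proximity": 0.5, "erosion_rate": 0.3, "acidity": 0.2}

# Hypothetical 2 by 2 criterion layers, each normalized to 0-10.
layers = {
    "stream_proximity": [[9, 3], [6, 0]],
    "erosion_rate": [[4, 9], [6, 10]],
    "acidity": [[6, 8], [2, 10]],
}


def priority_map(layers, weights):
    """Reclamation priority per cell: sum over criteria of value times weight."""
    names = list(weights)
    n_rows = len(layers[names[0]])
    n_cols = len(layers[names[0]][0])
    return [
        [
            sum(weights[name] * layers[name][row][col] for name in names)
            for col in range(n_cols)
        ]
        for row in range(n_rows)
    ]


rp = priority_map(layers, WEIGHTS)
for rp_row in rp:
    print([round(value, 1) for value in rp_row])
```

Because the weights sum to one, the output stays on the same 0-10 scale as the input layers.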

User Interface

The user interface is designed to allow the decision-maker to step through a process that results in the calculation and display of a map of weighted reclamation priorities. To use this system, the user has to have knowledge of, or a preference for, each factor. After the user enters the weight of each variable, the system calculates the final result and generates maps according to the user’s preference. In addition, the user can add new data sets to the system in order to add a new criterion, such as land owner preference, to the multiple criteria list. The weighting scheme can also be modified through the interface.

Because the task at hand was essentially one of dealing with spatial data, GIS functions were clearly crucial in building the application system. The implementation of the interface combined four major parts in its structure: a main frame or programming platform; the GIS functions; visualization tools for analyzing results; and a communications capability among all of these.

Borland Delphi by Inprise is a popular programming package similar to Microsoft Visual Basic and Visual C++. Compared to its peers, Delphi offers the best ratio of performance to programming effort. Visual Basic uses very simple syntax and is easy to use, but the compiled Visual Basic program file is usually large and runs slowly. Visual C++, on the other hand, is more complicated in concept and syntax, though the compiled file runs faster. Delphi lies between Visual Basic and Visual C++ with respect to speed, size, and programmer’s effort. Another advantage of using Delphi for this research is that several applications had already been built with MapObjects, which made it possible to proceed directly to building the application without spending time becoming familiar with the programming platform.

GIS functions have to be extracted from GIS software. With two or three GIS packages and components at hand, a developing environment had to be chosen among the available options. The available GIS software included ArcInfo by ESRI, ArcView by ESRI and MapObjects by ESRI.

ArcInfo is by far the most powerful GIS software in the GIS world with respect to the functions it provides for both raster and vector analyses. There is no doubt that ArcInfo could fulfill the task of building a spatial decision support system. The shortcomings of using ArcInfo to develop an SDSS are:

    1. ArcInfo Macro Language (AML) has to be used, which might limit the future modification from other persons who are not familiar with AML.
    2. The interface is quite simple, but it is not up to the Windows program standard. This might cause confusion for users who are familiar only with the traditional Windows interface.
    3. The display of maps is somewhat more complicated in the ArcInfo environment than in the GIS components environment. It requires executing some unique commands or functions to set up each display window.
    4. The AML program running in the ArcInfo environment requires a quite large amount of memory, and it requires many internal commands or functions that cause the application to run slowly.
Despite the disadvantages of building the whole application using this software, ArcInfo provides the necessary GIS functions that the spatial decision support system requires to conduct its operations. If the functions are not used within the ArcInfo framework, there must be some additional software tool to gain access to the functions from outside ArcInfo.

ArcView is another available GIS package by ESRI. It is much easier to use than ArcInfo because of its Windows interface. The functions for vector analysis and processing in ArcView are well developed. An advantage of ArcView is that it allows multiple windows to display different themes or overlays, which makes side-by-side comparison very easy. However, it lacks the capability to do raster analysis and grid modeling. Therefore, ArcView alone could not fulfill an important requirement of this research, that of implementing a spatial decision support system that relies mostly on grid analysis.

MapObjects is a new product by ESRI using GIS componentware technology. Its portability and flexibility for use with other industry-standard packages distinguish it from both ArcInfo and ArcView. Wrapping the most commonly used GIS functions, it provides a means for users outside the GIS field to display and analyze geographic information within the framework of a familiar programming environment. In implementing the spatial decision support system for this research, MapObjects was chosen to display multiple windows by using the "Map" object that it provides. Although MapObjects can display raster or image layers, it cannot support operations on raster data.

Therefore, for this study, Delphi was selected as the programming platform, ArcInfo as the main source of GIS functions, and MapObjects as the visualization tool. The only step left was to choose a method to connect these three parts. There was no difficulty in embedding MapObjects in the Delphi program because MapObjects is simply an ActiveX control. The remaining question was how to gain access to the ArcInfo functions within the Delphi environment. The Open Development Environment (ODE) by ESRI provides an environment that allows standard programming languages to call ArcInfo functions through ActiveX controls. The main ArcInfo components, such as ArcEdit, ArcPlot, and Grid, are encapsulated in the controls. By embedding the controls in the program, the application can easily access the commands and functions.

The Open Development Environment (ODE) by ESRI was used as the primary means to develop a user interface to access the ArcInfo GIS functions. This was accomplished through the use of Rapid Application Development (RAD) tools, which take advantage of a flexible programming environment with easy access to any ArcInfo function. Since there is a limitation in accessing ArcInfo through the ODE controls designed for Windows NT, MapObjects, which is a GIS component developed by ESRI, was used for displaying multiple windows. Three major steps were required to implement the fully functioning graphic user interface: interface design, function calls, and interaction between interface and functions.

The interface is the graphic display of the system on the screen that allows easy viewing by users. Normally it shows the entry points to the main functions the system provides. There are several considerations in interface design, such as the colors, shapes, and layout of the components. Function calls are the underlying processes that are invisible to the users. These are at the core of an SDSS: modeling and computing are conducted by function calls. Interaction takes place when a user enters information into the system. The system accepts the user’s request and calls the necessary functions to generate results according to the particular user’s preference.