1. Results and Discussion
 

Using correct data is the most important step in the implementation of any spatial decision support system. Even a perfect system cannot provide good decision support without decent data. Therefore, the data need to be as error free as possible to ensure reasonable results. The quality of data relies on two main factors: accuracy and currency. Data accuracy indicates how well the data reflect the real world; it involves the data acquisition methods, data resolution, data format, and data units. Data currency determines how timely the data are for a particular research project. More recent data are not always better. For example, if the data are used in a research project that focuses on the past, they must represent the conditions at that time.

Data Preparation and Description

For this research, data were acquired from various sources and were of different formats and coordinate systems. In order to analyze all the data in a single system, they had to be properly digitized into that system. ArcView 3.1 was adopted for digitization and feature editing, and ArcInfo 7.2 was used for projection transformation. The raw data collected for this research included the following:

    1. General boundary
To extract the data for all the layers within the study area, a rectangular boundary polygon was created. This rectangle served as a general outline for the study area and was used to clip all the layers that extended beyond its edges. The specific area for analysis, which was of irregular shape, was determined by the distribution of the mine sites within the general geographic vicinity.
    2. Soils
The study area lies on the border of Pike County and Warrick County. The revision dates of the soil survey maps of these two counties are 10 years apart: the Pike County soil survey was completed in 1989, and the Warrick County soil survey was completed in 1979. There were ongoing mining activities in the study area during the gap between 1979 and 1989. Also, although the geographic locations of the soil map boundaries of the two counties match perfectly, the soil attributes were discontinuous along their border.

For those mining areas in Warrick County that were mined between 1979 and 1989, the soil information recorded on the survey map was merely the soil series and properties that existed before mining began. To extract useful soil information, then, the data have to reflect either the current conditions or at least the conditions after mining activities began. To compensate for this situation, and to produce one continuous soil map, an assumption was made that the areas mined during that particular period were covered with a single, distinct soil series.

First, soil map sheets from the two counties were placed side by side, and the study area boundary was drawn on them. With the maps registered in a Universal Transverse Mercator (UTM) projection, the soil polygons within the boundary were digitized and edited. During the editing phase, the map of mine sites was overlaid on the digital soil map. The polygons in the areas of Warrick County that were affected after 1979 were deleted from the soil map, and the mine site boundary polygons were added in their place. After all the polygons were edited, soil attributes were added to each of them. The information in the general soil map includes the soil series name, the acidity (pH value), and the K value. In the soil survey maps, the soil reaction (pH value) was recorded as a range of values; however, to use the pH value in calculations, a single value must be stored for each soil series, so the average of the recorded range was taken as the value for that series. The attributes for the modified polygons in Warrick County were then imported from similar areas mined during the same period in Pike County.

The soil layer covers all the soil properties needed in this research. Acidity was a primary criterion in the multiple criteria analysis, and the K values were the input for the Revised Universal Soil Loss Equation (RUSLE) model. It would be time-consuming and resource-intensive to extract a particular property each time it was accessed in the program; it is much better to have each data set ready as a layer in the database. For this purpose, both an acidity map (Figure 4-1) and a K value map (Figure 4-2) were generated from the soil map using the region functions provided by the Environmental Systems Research Institute (ESRI) software.

    3. Digital Elevation Model (DEM)
The digital elevation model (DEM) records information about the terrain; the values in a DEM are elevations above sea level. The USGS 7.5-minute DEM, which has a 30-meter by 30-meter ground resolution, was used for this research. The file was stored in USGS DEM format, which could not be processed directly in ArcInfo or ArcView; however, an import module in ArcView allowed the USGS DEM file to be converted to a grid file. The DEM (Figure 4-3) was used mainly to calculate the LS factor for the erosion rate calculation in the RUSLE model.

    4. Stream
The streams and lakes layer (Figure 4-4) was acquired from the Indiana Geological Survey (IGS). Since pollutants are carried by water and relocated along the waterways, the stream layer was extremely important for evaluating the potential pollution caused by abandoned mine sites. This layer, originally stored in shapefile format, was clipped by the boundary shapefile and then converted to an ArcInfo coverage using the ArcInfo file format conversion function, so that the program could access it directly.

    5. Mine sites
Mine site layer data, collected and prepared by the IGS, recorded not only the mining areas but also the mining periods. This layer was the main source for delineating the analysis boundary; the intersection of this layer and the general boundary was outlined as the primary analysis area.
    6. Topographic maps
Topographic maps were scanned from the USGS 7.5-minute topographic quadrangles as digital images. However, the geographic information on the maps was not kept with the image files when they were scanned, so a geo-reference file had to be created for this purpose. A world file is a plain text file that records the geographic location of an image file. It holds six parameters: the pixel size in the x direction, two rotation terms, the (negative) pixel size in the y direction, and the map coordinates of the upper-left pixel. Topographic maps provided general information about the area and served as a background layer. They also were a source of verification for the land use and land cover classification from the satellite images. (A small sketch of how a world file is written and applied follows this list.)
    7. Aerial photograph
The aerial photograph (Figure 4-5), which was downloaded from the Microsoft TerraServer website at http://microsoft.terraserver.com, was in JPEG image format. Since the projection and coordinate system information was lost when the image was saved in JPEG format, a world file was created to recover it, using control points on the image and on the topographic map, which was already geo-referenced. The images were provided by the U.S. Geological Survey (USGS) as Digital Orthophoto Quadrangles (DOQs) taken over the United States. The photos were originally obtained from the National Aerial Photography Program (NAPP); each photo covers about one-fourth of a standard 7.5-minute USGS topographic quadrangle map. The photos were scanned into digital format and put through enhancement processes, such as orthorectification, to remove distortions caused by camera angles. This layer served as detailed reference data both for the classification of the satellite imagery and for comparison with other layers. The images for the study area were collected on March 24, 1992.
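
As an illustration of the world-file mechanism described above, the following Python sketch writes a six-parameter world file and uses it to convert image pixel positions to map coordinates. The pixel size and upper-left coordinates are assumed values for illustration, not those of the actual scanned sheets.

```python
# Minimal sketch: writing and reading a six-line world file (e.g., .tfw/.jgw).
# The numeric values below are assumed for illustration only.

def write_world_file(path, pixel_size_x, pixel_size_y, upper_left_x, upper_left_y,
                     rotation_x=0.0, rotation_y=0.0):
    """Write the six world-file parameters in their standard order."""
    lines = [
        pixel_size_x,        # A: pixel size in the x direction (map units per pixel)
        rotation_y,          # D: rotation term (0 if the image is not rotated)
        rotation_x,          # B: rotation term (0 if the image is not rotated)
        -abs(pixel_size_y),  # E: pixel size in the y direction (negative; rows run downward)
        upper_left_x,        # C: x coordinate of the center of the upper-left pixel
        upper_left_y,        # F: y coordinate of the center of the upper-left pixel
    ]
    with open(path, "w") as f:
        f.write("\n".join(f"{v:.6f}" for v in lines) + "\n")

def pixel_to_map(world, col, row):
    """Convert an image (col, row) position to map coordinates."""
    a, d, b, e, c, f = world
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

if __name__ == "__main__":
    # Assumed UTM values for a hypothetical scanned quadrangle.
    write_world_file("scan.jgw", 2.4, 2.4, 437000.0, 4231000.0)
    with open("scan.jgw") as f:
        params = [float(line) for line in f]
    print(pixel_to_map(params, 0, 0))      # center of the upper-left pixel
    print(pixel_to_map(params, 100, 200))  # an interior pixel
```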

The shapefile is the default file format for ArcView, and the coverage is the default format for ArcInfo. To reduce the amount of calculation in the program, all the layers generated from ArcView in shapefile format were converted to ArcInfo coverages so that the ArcInfo functions could access the data directly. This conversion was conducted in ArcInfo using the ‘ShapeArc’ command.

Land Use/Land Cover Classification

The land use/land cover classification map was used to extract the surface coverage data needed to calculate the C value of the Revised Universal Soil Loss Equation (RUSLE) model. The data selected were two sets of Thematic Mapper (TM) images, one acquired in May 1993 and the other in July 1993. The main purpose of image classification is to cluster similar pixels on the image into natural classes; extracting information from a satellite image is a process of analyzing the spectral features recorded on the image with image processing techniques. Spectral patterns of some surface objects may be similar, but patterns of spectral reflectance vary over time, so multi-temporal data can provide additional information to differentiate otherwise similar objects. Because of the lack of field data, an unsupervised classification method was adopted.

The Iterative Self-Organizing Data Analysis Technique (ISODATA) algorithm from ERDAS Imagine software (ERDAS 1997) was used as the clustering method. It uses a minimum spectral distance formula to form clusters. Analysts have control over the process by setting parameters such as the number of final classes, the maximum number of iterations, and the convergence threshold. The maximum number of iterations is the maximum number of times the ISODATA utility will recluster the data; it prevents the process from running too long. The convergence threshold is the maximum percentage of pixels whose cluster assignments can go unchanged between iterations; it prevents the ISODATA process from running indefinitely. ISODATA begins with arbitrary cluster means or an existing signature set; the distance of every data point from each cluster center is calculated, and each point is allocated to the cluster with the closest mean. Each time the clustering repeats, the means of the clusters are recalculated and shifted, and the new cluster means are used for the next iteration. The whole process is repeated until the maximum number of iterations has been performed, the maximum percentage of unchanged pixels between two iterations has been reached, or the maximum number of classes has been achieved. In this classification, the number of final classes was set to twenty. Based on the spectral separabilities between classes, and on observation of topographic maps and aerial photographs, the twenty classes were then merged into nine categories (Figure 4-6): lake/stream, sparsely vegetated strip mine, strip mine forest, mixed forest, mature forest, openland (grassland), vegetated strip mine, crop (winter wheat), and crop (corn).
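
The clustering loop described above can be sketched in Python as follows. This is a simplified, minimum-distance (k-means-style) illustration of the iteration and stopping rules, not the ERDAS Imagine implementation; the band values are randomly generated stand-ins rather than the actual TM data.

```python
import numpy as np

def isodata_like(pixels, n_classes=20, max_iter=20, convergence=0.95, seed=0):
    """Cluster pixel spectra by minimum spectral distance.

    Stops when the maximum number of iterations is reached or when the
    fraction of pixels whose cluster assignment is unchanged between two
    iterations reaches the convergence threshold.
    """
    rng = np.random.default_rng(seed)
    # Start from arbitrary cluster means drawn from the data.
    means = pixels[rng.choice(len(pixels), n_classes, replace=False)]
    labels = np.full(len(pixels), -1)

    for _ in range(max_iter):
        # Assign each pixel to the cluster with the closest mean.
        dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)

        unchanged = np.mean(new_labels == labels)
        labels = new_labels
        # Recalculate (shift) the cluster means for the next iteration.
        for k in range(n_classes):
            if np.any(labels == k):
                means[k] = pixels[labels == k].mean(axis=0)
        if unchanged >= convergence:
            break
    return labels, means

if __name__ == "__main__":
    # Synthetic stand-in for multi-temporal TM pixels (rows = pixels, columns = bands).
    data = np.random.default_rng(1).random((5000, 6))
    labels, means = isodata_like(data, n_classes=20)
    print("pixels per class:", np.bincount(labels, minlength=20))
```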

The classification map was not used directly in the system; it was used to derive coverage data for the RUSLE calculations. The cover management factor C in the model is mainly determined by subfactors related to land coverage, such as the canopy cover subfactor and the surface cover subfactor. As the coverage subfactor versus coverage percentage curve (Renard et al. 1997) indicates, the relationship between the coverage subfactor and the coverage percentage is negative and close to linear: a higher coverage percentage yields a lower C value. Following this trend, coverage factor values were assigned to each category generalized from the classification map (Table 4-1); a small lookup sketch follows the table.
Table 4-1. Coverage Factor Values of Surface Classes
Category Coverage factor value
Lake/Stream 1.00
Sparsely vegetated strip mine 0.70
Strip mine forest 0.30
Mixed forest 0.25
Mature forest 0.20
Openland (Grassland) 0.40
Vegetated strip mine 0.60
Crop (Winter wheat) 0.40
Crop (Corn) 0.40
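
To illustrate how the values in Table 4-1 can be turned into a C-value grid, the following Python sketch applies a simple lookup to a classified grid. The class codes (1 to 9) and the small example array are assumptions for illustration, not the actual classification output or the method used in the system.

```python
import numpy as np

# Coverage factor (C) value for each generalized class from Table 4-1.
# Class codes 1-9 are assumed here purely for illustration.
C_VALUES = {
    1: 1.00,   # Lake/Stream
    2: 0.70,   # Sparsely vegetated strip mine
    3: 0.30,   # Strip mine forest
    4: 0.25,   # Mixed forest
    5: 0.20,   # Mature forest
    6: 0.40,   # Openland (Grassland)
    7: 0.60,   # Vegetated strip mine
    8: 0.40,   # Crop (Winter wheat)
    9: 0.40,   # Crop (Corn)
}

def classes_to_c_grid(class_grid):
    """Replace each class code in the grid with its C factor value."""
    lut = np.zeros(max(C_VALUES) + 1)
    for code, c in C_VALUES.items():
        lut[code] = c
    return lut[class_grid]

if __name__ == "__main__":
    # A tiny assumed classification grid (class codes per pixel).
    demo = np.array([[5, 5, 3], [2, 7, 1], [8, 9, 6]])
    print(classes_to_c_grid(demo))
```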
 

System Description

System building, in addition to data collection, is another important part of this study. To provide useful support, the spatial decision support system must be both user friendly and functionally rich.

The SDSS application was built in a multiple document interface (MDI) environment (Figure 4-7). An MDI application comprises a main window and many child, or subordinate, windows. It allows users to open child windows within the main application window, and more than one document or child window can be open within a single parent window. Another advantage of the MDI design is that the child windows can share the toolbars and menus of their parent.

The main menu provides access to typical Windows functions. The "File" menu contains functions that operate on files, such as opening a child window and printing a map; the user can open a new child window and import any layer for display, such as a shapefile, coverage, or image file. From the "View" menu, the user can navigate the map and load the area of interest by zooming in, zooming out, and panning. The "Modeling" menu is the main function of this system. It covers all the models used in the system, including erosion rate, stream buffer, normalization, and weight matrix information. Each model is represented by a tab form, which works like a page of a notebook on the modeling window and provides fields for user input. By selecting this item, a user is able to interact with the system by entering preferences in a user-friendly environment. The "Windows" menu controls the layout of child windows within the frame of their parent window; the windows can be tiled, cascaded, or auto-arranged.

Figure 4-7. Main Form of SDSS Application

Buffering Stream Proximity

The tab window for the stream proximity calculation comprises four parts (Figure 4-8): a scale bar for values, a scale bar for distance to the stream, a text box, and a display panel. A user can assign values to ranges of distance to the streams simply by sliding the scale bar. With a click of the "Set" button, the user's selection is shown in the small text box below. The "Preview" button sends the user's assignment to a temporary file, calls the ArcInfo functions to calculate the buffer zones, and displays the result in the display panel.

The buffer function of ArcInfo was integrated into the program to calculate buffer zones; Figure 4-9 illustrates how the function works. The buffer function creates buffer polygons around specified input coverage features. The output is a polygon coverage with a new attribute, "inside", attached to it. After each buffer function call, the output coverage is converted to a grid for further calculation. If only one buffer zone is needed, it is straightforward to use the buffer function to generate it. However, most of the time multiple buffer zones are requested, and the buffer function alone cannot handle this; in that case other techniques need to be incorporated.

The buffer function, given a buffer distance for the stream layer, generates a coverage with three values for the attribute "inside": 100 for pixels inside the buffer zone, 0 for pixels outside the buffer zone, and 1 for pixels in the "donuts" enclosed by pixels inside the buffer zone. When more than one buffer distance is defined, the buffer function calculates only the last defined buffer zone correctly: a value of 100 is assigned to pixels in the last defined buffer zone, 0 to pixels outside it, and 1 to pixels in the donuts (outside the polygons). To remedy this shortcoming of the buffer function, the following scheme was introduced.

For example, suppose the user's selection is set as shown in Table 4-2.

First, D1 (meters) was used as the buffering parameter, and a grid file G1 was generated from the buffer function, with values of 0, 1, or 100 according to the distance of each pixel from the streams. Another buffer function was then run on the original stream file using D2 (meters) as its input, generating a grid file G2 with values of 0, 1, or 100. The next step was to operate on G1 and G2 to produce the final grid file with the values V1, V2, or 0, depending on whether a pixel lay within D1 of the streams, between D1 and D2, or beyond D2. All the possible cases are listed in Table 4-3, and the condition command was used to generate the output grid accordingly (a minimal sketch of this conditional combination follows the tables).

Table 4-2. Illustration of Assignment of Value to Distance of Streams
Distance to streams (meters)    Normalized value
0 – D1    V1
D1 – D2    V2
Table 4-3. Conditional Operation on Two Buffer Grids
G1 pixel value    G2 pixel value    Output value
100    Any    V1
0 or 1    100    V2
0 or 1    0 or 1    0
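
As a concrete illustration of the conditional rules in Table 4-3, the following Python sketch combines two buffer grids into the final output grid. The small arrays stand in for the outputs of the two ArcInfo buffer runs, and the values V1 and V2 are arbitrary; the actual system performs this step with the ArcInfo condition command.

```python
import numpy as np

def combine_buffers(g1, g2, v1, v2):
    """Apply the conditional rules of Table 4-3 to two buffer grids.

    g1 -- buffer grid for distance D1 (pixel values 0, 1, or 100)
    g2 -- buffer grid for distance D2 (pixel values 0, 1, or 100)
    Returns a grid with v1 inside D1, v2 between D1 and D2, and 0 beyond D2.
    """
    out = np.zeros_like(g1, dtype=float)
    out[(g1 != 100) & (g2 == 100)] = v2   # between D1 and D2 from the streams
    out[g1 == 100] = v1                   # within D1 of the streams
    return out

if __name__ == "__main__":
    # Assumed example outputs of two buffer runs (100 = inside, 0 = outside, 1 = "donut").
    g1 = np.array([[100, 100, 0], [1, 0, 0], [0, 0, 0]])
    g2 = np.array([[100, 100, 100], [100, 100, 1], [100, 0, 0]])
    print(combine_buffers(g1, g2, v1=10, v2=6))
```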
Figure 4-8. Tab Form for Stream Proximity
Figure 4-9. Buffer Function

Erosion Rate Calculation

The erosion rate window (Figure 4-10) contains four buttons for conducting its functions. The "Calculate LS" button calls the function that runs the ArcInfo Macro Language (AML) scripts (Appendix A) and generates the grid file of LS values. The "Statistics" button shows statistics of the erosion rate values, such as the maximum, minimum, and standard deviation; these values can be used when normalization is needed. The "Calculate A" button operates on all the input files for the RUSLE calculation and exports the result to an output file of erosion rate values. The "Preview" button displays the output erosion rate file.

The RUSLE model expects five input factors: a rainfall erosivity factor (R), a soil erodibility factor (K), a slope length and steepness factor (LS), a cover and management factor (C), and a conservation (support) practice factor (P).
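
For reference, RUSLE estimates the average annual soil loss A as the product of these five factors:

A = R × K × LS × C × P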

The R value was acquired from the CITY database that came with the RUSLE program. The R value is regionally distributed, so it was a constant for the whole study area; the value for the study area is 200 (hundred ft·tonf·in)/(acre·h·yr).

The K value was extracted from the general soil map that had been digitized from the county soil survey maps. Using a region function in ArcView, the soil map was categorized by K value, and the polygons with the same K value were merged into one region. This file was saved as an ArcInfo coverage and then converted to raster format for calculation. The unit for the K value is (ton·acre·h)/(hundred ft·tonf·in).

The P value was not taken into account in this study, since most of the study area has not been restored to cropland and therefore no conservation practices have been applied.

The C value primarily reflects the surface coverage conditions of the soil. This layer was generated from the land use/land cover classification extracted from the remotely sensed data.
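
The per-pixel calculation, as used here with a constant R and with P omitted (treated as 1), can be sketched as follows. The small arrays are assumed placeholders for the actual K, LS, and C grid layers.

```python
import numpy as np

def erosion_rate(r_value, k_grid, ls_grid, c_grid, p_grid=None):
    """Compute A = R * K * LS * C * P on a per-pixel basis.

    R is a single regional constant in this study; P is omitted (treated as 1)
    because no conservation practices were applied in the study area.
    """
    a = r_value * k_grid * ls_grid * c_grid
    if p_grid is not None:
        a = a * p_grid
    return a

if __name__ == "__main__":
    R = 200.0                                   # regional rainfall erosivity factor
    K = np.array([[0.32, 0.32], [0.28, 0.24]])  # assumed soil erodibility values
    LS = np.array([[1.4, 0.9], [2.1, 0.6]])     # assumed slope length/steepness values
    C = np.array([[0.30, 0.25], [0.70, 0.20]])  # cover values from Table 4-1
    print(erosion_rate(R, K, LS, C))            # tons/acre/year per pixel
```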

Figure 4-10. Tab Form for Erosion Rate

Acidity

The window for acidity (Figure 4-11) has the same structure as the tab form for proximity to streams. Two scale bars allow the user to assign normalized values to pH ranges, and the "Set" button saves the assignment to a text file for the final calculation. The "Preview" button reads the text file that stores the settings and then reclassifies the acidity file to generate the normalized pH value grid.

Figure 4-11. Tab Form for Acidity

Weight Matrix

On the weight matrix form (Figure 4-12), three scale bars allow users to enter their preferences, or weights, for each of the three variables: erosion rate, proximity to streams, and acidity. After the weights are set, the current weight matrix is shown in the text box. With a click of the "Preview" button, the program calculates the final result according to the weight matrix and reclassifies it so that the output values are normalized on a scale of 1 to 10 for reclamation priority.
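
A minimal sketch of the weighted combination and the final rescaling to the 1 to 10 priority scale is shown below. The system's exact rescaling is not specified beyond the 1 to 10 range, so the linear rescale and the small grids here are assumptions for illustration only.

```python
import numpy as np

def priority_map(erosion, proximity, acidity, w_erosion, w_proximity, w_acidity):
    """Combine three normalized criterion grids with user weights and
    rescale the result to a 1-10 reclamation priority scale (assumed linear)."""
    combined = w_erosion * erosion + w_proximity * proximity + w_acidity * acidity
    lo, hi = combined.min(), combined.max()
    if hi == lo:
        return np.ones_like(combined)
    return 1.0 + 9.0 * (combined - lo) / (hi - lo)

if __name__ == "__main__":
    # Assumed normalized criterion grids (values already on a 1-10 scale).
    erosion   = np.array([[9.0, 4.0], [2.0, 7.0]])
    proximity = np.array([[10.0, 6.0], [1.0, 6.0]])
    acidity   = np.array([[8.0, 3.0], [2.0, 9.0]])
    print(priority_map(erosion, proximity, acidity, 0.5, 0.3, 0.2))
```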

Figure 4-12. Tab Form for Weight Matrix

Normalization (Reclassification)

Normalization is a very important step in multiple criteria analysis. It reassigns new values to the original values according to the requirements of the problem, usually within a fixed numeric range. This step ensures that every criterion is treated equally before any weight or preference is applied to it.

The highest priority value for reclamation in this research was set to 10; therefore, all the criterion values had to be reclassified into values between 1 and 10 before the final calculation could be made, with higher values reflecting higher priority for reclamation. In the case of proximity to streams and acidity, the user assigns a normalized value to each range of the original values while interacting with the system; higher values should be assigned to areas that are close to streams and have low pH values. All the settings are then saved in a text file, from which the ArcInfo reclassify function reads and operates on the grid file to generate a normalized output grid.
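
The text-file-driven reclassification might be sketched as follows. The settings format (one "lower upper value" triple per line) is an assumption for illustration and is not the actual file layout read by the ArcInfo reclassify function.

```python
import numpy as np

def load_ranges(settings_text):
    """Parse 'lower upper value' triples, one range per line."""
    ranges = []
    for line in settings_text.strip().splitlines():
        lower, upper, value = (float(x) for x in line.split())
        ranges.append((lower, upper, value))
    return ranges

def reclassify(grid, ranges, default=0.0):
    """Assign each pixel the normalized value of the range that contains it
    (ranges are assumed to be non-overlapping)."""
    out = np.full(grid.shape, default)
    for lower, upper, value in ranges:
        out[(grid >= lower) & (grid < upper)] = value
    return out

if __name__ == "__main__":
    # Assumed settings: pH ranges mapped to normalized priority values.
    settings = "0 4.5 10\n4.5 5.5 7\n5.5 7.0 3\n"
    ph_grid = np.array([[4.2, 4.8], [6.1, 5.4]])
    print(reclassify(ph_grid, load_ranges(settings)))
```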

In the case of the erosion rate, the values calculated from the RUSLE model are not easy to predict and are not normalized, so a way to conduct the normalization is needed. On the erosion rate tab form, the user has access to the statistics of the erosion rate result and can thereby get a general idea of how the values are distributed. A linear algorithm is used for the normalization. There are two ways to set the maximum and minimum values for this stretch of the data: the default maximum and minimum from the statistics can be used, or the user can choose them after examining the statistics.
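
A linear stretch of the erosion rate values onto the 1 to 10 scale might look like the following sketch; the formula is an assumed implementation of the linear algorithm described above, with the maximum and minimum taken either from the statistics or from the user.

```python
import numpy as np

def normalize_erosion(a_grid, a_min=None, a_max=None):
    """Linearly stretch erosion rate values onto a 1-10 priority scale.

    If a_min/a_max are not supplied, the grid's own statistics are used;
    values outside the chosen range are clipped to the ends of the scale.
    """
    a_min = a_grid.min() if a_min is None else a_min
    a_max = a_grid.max() if a_max is None else a_max
    scaled = 1.0 + 9.0 * (a_grid - a_min) / (a_max - a_min)
    return np.clip(scaled, 1.0, 10.0)

if __name__ == "__main__":
    # Assumed erosion rates (tons/acre/year) for a few pixels.
    a = np.array([[0.5, 12.0], [35.0, 60.0]])
    print(normalize_erosion(a))                     # min/max taken from the statistics
    print(normalize_erosion(a, a_min=0, a_max=50))  # user-chosen range
```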

Visualization

All the analytical operations are conducted behind the scenes; therefore, the visualization and presentation of results is another important component of the system. Displaying the data allows the user to view the geographical distribution of the attributes of a particular layer, and overlaying two or more layers in a single display panel or window provides more detailed information for assessing the results. Multiple display windows give the user the ability to make comparisons: if the user has two or more priority maps resulting from different weight settings for some of the criteria, he can compare the results on the screen, and he also has the option of calling up other data to assist in the comparison. In spite of the shortcomings of MapObjects in dealing with grid layers, the MapObjects control was used to provide the display panels. All grid layers were converted to the Windows bitmap image format, which MapObjects can handle, in order to be presented on the screen.

System Requirements

To run the spatial decision support system properly, the following requirements have to be met.

Hardware: PC compatible, 32 MB RAM

Software: Windows 95 or NT platform (32-bit operating system)

ArcInfo library and license

MapObjects Dynamic Link Libraries (DLLs)

Regarding the physical storage of data and applications, two considerations need to be taken into account: the directory structure and the naming convention. To simplify the system, a fixed directory structure was created so that the user does not have to specify the data path when running the system; users must therefore copy the entire structure. Since the system accesses this fixed directory structure for all the data files and program files, the structure must be intact to ensure that the system finds the correct data and functions (AMLs). The naming convention is critical for users who want to replace the existing data files with their own custom data files: if any of the data file names are changed, the system will not function normally.

All the data, AMLs, and application files must be saved under the same directory, the "working directory". In the system, users have the option of changing the system path to the working directory. The files and directories under the working directory should include the following (a small verification sketch follows the list):

    1. DEM directory for digital elevation model data used to generate LS values for the erosion rate model,
    2. K_Value directory for K values used in the erosion rate model,
    3. C_Value directory for C values used in the erosion rate model,
    4. Stream directory for the streams and lakes layer,
    5. PhGrid directory for pH values,
    6. sl.aml, fil.aml, dn_slope.aml, high_pts.aml, s_length.aml, and ls.aml (Appendix A) for the LS value calculation algorithm. These files are written in ArcInfo Macro Language (AML) to call the ArcInfo functions, and they are invoked internally by the system.
    7. SDSS.exe, the executable file of the application.
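
As an illustration of the fixed directory structure, the following sketch checks that a working directory contains the expected items. It is not part of the delivered application, and the working directory path is an assumption; the names simply mirror the list above.

```python
import os

# Directories and files the application expects under the working directory
# (names mirror the list above).
REQUIRED_DIRS = ["DEM", "K_Value", "C_Value", "Stream", "PhGrid"]
REQUIRED_FILES = ["sl.aml", "fil.aml", "dn_slope.aml", "high_pts.aml",
                  "s_length.aml", "ls.aml", "SDSS.exe"]

def check_working_directory(path):
    """Report any missing directories or files in the working directory."""
    missing = []
    for d in REQUIRED_DIRS:
        if not os.path.isdir(os.path.join(path, d)):
            missing.append(d + os.sep)
    for f in REQUIRED_FILES:
        if not os.path.isfile(os.path.join(path, f)):
            missing.append(f)
    return missing

if __name__ == "__main__":
    missing = check_working_directory(r"C:\SDSS")   # assumed working directory
    print("missing items:", missing or "none")
```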
Examples

Two scenarios were created to test the system and generate priority maps for the study area in the South Fork Patoka River watershed. The parameters were given as follows:

Erosion rate: linear normalization using the statistics maximum and minimum

Acidity: normalized values assigned to pH ranges

Proximity to streams: normalized values assigned to distance ranges

Weights (erosion rate, acidity, proximity to streams): weight set W1 and weight set W2
Discussion

In the first phase of the system design, only ArcInfo and Delphi were chosen as the tools to build the system, since ArcInfo has the GIS functions and display capabilities and Delphi is capable of building standard Windows programs. Unlike the final design with tab forms, each criterion was assigned a separate window for calculation and display. As application development progressed, some difficulties were encountered. First, ArcInfo license conflicts occurred when more than one ArcInfo function call was requested from different program units. A program unit is a file containing the code associated with a window. For example, the calculation of the LS factor for the RUSLE model requires a call to the GRID component of ArcInfo; after this call is completed, the calculation of acidity requires calling GRID again from another window. At that point, an error would occur indicating a GRID license conflict. This problem was caused by the Open Development Environment (ODE), which allows only one license for each module to be invoked from one program. To be able to use the ArcInfo functions whenever the application requested them, a single window containing all the calculations that require calls to ArcInfo was created, and each calculation was separated into its own tab form on that window.

Another problem with the initial system design was display re-entry. The original idea was to use the ArcPlot, Grid, and ArcEdit controls on the window, which provide not only functions but also display tools. Unfortunately, each control could be used only once in a program, which limited the number of display windows. In cases where two criteria had to use Grid controls for calculation and analysis, there was no way to display both results at the same time. To compensate for this limitation of data presentation, MapObjects was introduced into the spatial decision support system as the main display tool; with MapObjects there was no limit on the number of display windows.

The major technical problems encountered in building the system were mainly caused by the license limitations of using the Open Development Environment (ODE) to access the ArcInfo functionality.