Title: Neural Network Classifiers for GIS Data: Improved Search Strategies
Date: 25 July 1999
Authors: Gordon German
Artificial neural networks have several advantages when used as classifiers of complex geographic and remotely-sensed datasets. They normally require no assumptions about the data distribution and can be trained with relatively small sample sets. Further, they are robust classifiers that require little data preparation prior to use. However, the selection of a suitable architecture and the subsequent lengthy training time of the network have often been perceived as obstacles to the acceptance of such classifiers. The author has previously presented a methodology for selecting an architecture and reducing the training time by specific manipulation of the network parameters prior to the training phase.
The training of these classifiers typically involves the use of a minimisation routine to search for an acceptable minimum-error position in a highly multidimensional weight space. Varying these weights changes the positions of the separating hyperplanes in attribute-space, which are generated by the network's hidden layer. Such a mechanistic, "goal-seeking" search (the algorithm generally seeks the best global outcome regardless of the path required to attain it) is successful in many applications, but specific knowledge of the form of the solution can be used to improve its performance. Specifically, the solution requires the hyperplanes to model class surface boundaries in attribute-space. Despite this, the minimisation routine makes no use of prior information on class separability, which is available from the sample sets and could be used to provide a gross positioning of the hyperplanes prior to training. The previously developed methodology partially addresses this. The network is initialised with regard to the class spread in attribute-space and the calculated redundancy available in the network. The method first positions the separating hyperplanes generated by the network in a sub-optimal, linearly separable configuration prior to training. It then assigns additional hyperplanes to classes displaying more complex separation boundaries, enabling a more efficient build-up of piecewise-linear separating surfaces, reducing the training time and increasing the chance of convergence on a suitable minimum.
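The idea of a gross positioning of hyperplanes from the sample sets can be illustrated with a small sketch. The abstract does not state the actual initialisation rule, so the rule used here is an assumption: each class receives one hyperplane whose normal points from the centroid of all other samples towards the class centroid, and which passes through the midpoint between the two. The function names `centroid`, `init_hyperplanes` and `side`, and the toy samples, are hypothetical.

```python
# Illustrative sketch only: the abstract does not specify the initialisation rule.
# Assumption: one hyperplane per class, placed midway between that class's
# centroid and the centroid of all remaining samples. Needs at least two classes.

def centroid(points):
    """Mean of a list of equal-length attribute vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def init_hyperplanes(samples, labels):
    """Return {class: (w, b)} with w.x + b > 0 on the class-centroid side."""
    planes = {}
    for c in sorted(set(labels)):
        inside = [x for x, y in zip(samples, labels) if y == c]
        outside = [x for x, y in zip(samples, labels) if y != c]
        m_in, m_out = centroid(inside), centroid(outside)
        # Normal vector points from the other samples towards class c.
        w = [a - b for a, b in zip(m_in, m_out)]
        # The plane passes through the midpoint of the two centroids.
        mid = [(a + b) / 2 for a, b in zip(m_in, m_out)]
        b = -sum(wi * mi for wi, mi in zip(w, mid))
        planes[c] = (w, b)
    return planes


def side(plane, x):
    """Signed distance-like value of x relative to the hyperplane (w, b)."""
    w, b = plane
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Toy two-attribute sample set with hypothetical class names.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
           [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
labels = ["water", "water", "water", "forest", "forest", "forest"]
planes = init_hyperplanes(samples, labels)
```

In a network, each `(w, b)` pair would seed the incoming weights and bias of one hidden unit before the minimisation routine starts. Classes with more complex boundaries would need additional hyperplanes, which this sketch does not attempt.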
Even so, it is often the case, especially when dealing with complex geographical datasets, that hyperplanes already in an optimal position (and thereby contributing minimal local error) are perturbed during training in an effort to reduce the overall global error. Hence, classification performance does not increase monotonically as training progresses. This reduces the efficiency of the training process and can limit the classification performance attainable. In an extension of the prior methodology, hyperplanes that already produce adequate separation in attribute-space are "frozen" in position prior to, or during, training. This allows the network to focus on areas of poor separability and to spawn additional hyperplanes if required. Results obtained on several complex datasets show a significant improvement in classification performance, as well as a reduction in training time.
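One way to picture the freezing of well-placed hyperplanes is as a mask applied to the weight update. The following is a minimal sketch under stated assumptions, not the author's procedure: the adequacy test (fraction of samples on the correct side), the 0.95 threshold, the function names, and the hand-written hyperplanes and gradients are all hypothetical.

```python
# Illustrative sketch only: the freezing criterion and threshold are assumptions.

def separation_accuracy(plane, samples, labels, target):
    """Fraction of samples the hyperplane places on the correct side for `target`."""
    w, b = plane
    correct = 0
    for x, y in zip(samples, labels):
        activation = sum(wi * xi for wi, xi in zip(w, x)) + b
        # Correct if the sample is on the positive side exactly when it is in `target`.
        if (activation > 0) == (y == target):
            correct += 1
    return correct / len(samples)


def choose_frozen(planes, samples, labels, threshold=0.95):
    """Mark hyperplanes whose separation already meets the assumed threshold."""
    return {c: separation_accuracy(p, samples, labels, c) >= threshold
            for c, p in planes.items()}


def masked_step(planes, grads, frozen, lr=0.1):
    """One gradient-descent step that leaves frozen hyperplanes untouched."""
    updated = {}
    for c, (w, b) in planes.items():
        if frozen[c]:
            updated[c] = (w, b)  # frozen: ignore the gradient entirely
            continue
        gw, gb = grads[c]
        updated[c] = ([wi - lr * gi for wi, gi in zip(w, gw)], b - lr * gb)
    return updated


# Toy data: the "water" plane already separates well, the "forest" plane does not.
samples = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]]
labels = ["water", "water", "forest", "forest"]
planes = {"water": ([-1.0, -1.0], 3.0), "forest": ([1.0, 0.0], 0.5)}
grads = {"water": ([1.0, 1.0], 1.0), "forest": ([1.0, 0.0], 10.0)}

frozen = choose_frozen(planes, samples, labels)
new_planes = masked_step(planes, grads, frozen)
```

Here only the poorly placed hyperplane moves, so the update concentrates on the region of poor separability. Spawning additional hyperplanes, which the text also mentions, is outside the scope of this sketch.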
IV International Conference on GeoComputation, Mary Washington College, Fredericksburg, VA, USA, 25-28 July 1999.