Artificial neural networks have proven very useful and powerful in
learning systems. Unfortunately, neural nets are a kind of black box:
they provide output without any explanation of how this output was
computed. This paper examines two approaches to incorporating
explanation facilities in neural nets. The approaches discussed
are the extension of the neural net with a fuzzy logic
structure and the processing of weights with genetic algorithms.
Artificial neural networks (ANNs) are powerful machine learning
constructs. One of their key advantages is their robustness and
ability to deal with erroneous or missing data. Unfortunately, no
explanation is provided as to how a certain output was generated.
This is the primary obstacle to more widespread use of ANNs. It is
the purpose of this paper to examine two possible ways of providing
an explanation facility for neural networks.
Both of the discussed approaches deal with providing explanation
facilities for back-propagation networks. There are other explanatory
mechanisms not covered in this discussion which are useful to more
complex types of neural networks such as connectionist semantic
The algorithm discussed in [HNY96] uses fuzzy logic to
provide an explanation facility for neural networks. Narazaki et
al. describe the application of fuzzy logic to a logical formula as
the extension of a truth value to a real number. For example, given
the height of a person, traditional sets let us classify the individual
only as either tall or not tall; there is no separate description
for people who are somewhat tall. Fuzzy sets remedy this by providing
degrees of membership [PT92]. This is illustrated in the figure below.
[Figure: Illustration of fuzzy set]
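The contrast between a traditional set and a fuzzy set can be sketched in a few lines of Python. The 180 cm threshold, the ramp from 160 cm to 190 cm, and the function names are assumptions chosen for illustration only; they are not taken from [PT92].

```python
def tall_crisp(height_cm: float, threshold_cm: float = 180.0) -> bool:
    """Traditional set: a person is either tall or not tall."""
    return height_cm >= threshold_cm


def tall_fuzzy(height_cm: float,
               lower_cm: float = 160.0,
               upper_cm: float = 190.0) -> float:
    """Fuzzy set: degree of membership in 'tall', between 0.0 and 1.0.

    The linear ramp between lower_cm and upper_cm is an assumed shape;
    other membership functions are equally possible.
    """
    if height_cm <= lower_cm:
        return 0.0
    if height_cm >= upper_cm:
        return 1.0
    return (height_cm - lower_cm) / (upper_cm - lower_cm)


if __name__ == "__main__":
    for h in (150.0, 175.0, 179.0, 200.0):
        print(h, tall_crisp(h), tall_fuzzy(h))
```

With these assumed bounds, a person of 175 cm is not tall in the traditional sense but is tall to degree 0.5 in the fuzzy sense.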
At the heart of the algorithm lies a grouping of training instances
into disjoint subsets based on their sensitivity
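patterns. The sensitivity pattern of an input pattern $x$ is
given as follows:
\[ S(x) = \left( \operatorname{sgn}\frac{\partial f(x)}{\partial x_1}, \ldots, \operatorname{sgn}\frac{\partial f(x)}{\partial x_n} \right) \]
where $f$ is the output of the neural network and $n$ is the size of the input
vector. These patterns provide the fuzzy structure needed for the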
rule production and can be generated using the same input vectors used
to train the network.
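As a rough illustration, the grouping step might be sketched as follows. The sketch assumes that the sensitivity pattern records, for each input variable, the sign of the change in the network output as that input increases, and it estimates this sign by central finite differences. The function `net` is a hypothetical stand-in for a trained network, and all names are illustrative rather than taken from [HNY96].

```python
import math
from collections import defaultdict


def net(x):
    """Hypothetical stand-in for a trained network (two inputs, one output)."""
    return 1.0 / (1.0 + math.exp(-(x[0] * x[0] - x[1])))


def sensitivity_pattern(f, x, eps=1e-5):
    """Sign of the output change with respect to each input (assumed definition)."""
    pattern = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        slope = (f(up) - f(down)) / (2 * eps)
        pattern.append(0 if slope == 0 else (1 if slope > 0 else -1))
    return tuple(pattern)


def group_by_sensitivity(f, instances):
    """Group instances into disjoint subsets that share a sensitivity pattern."""
    groups = defaultdict(list)
    for x in instances:
        groups[sensitivity_pattern(f, x)].append(x)
    return dict(groups)


if __name__ == "__main__":
    training_inputs = [(1.0, 0.5), (2.0, 1.0), (-1.0, 0.5)]
    for pattern, members in group_by_sensitivity(net, training_inputs).items():
        print(pattern, members)
```

Because the subsets are keyed by the pattern itself, they are disjoint by construction, and the same input vectors used for training can be reused here.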
After this, the algorithm proceeds to partition the
obtained subsets further into smaller disjoint subsets.
This partitioning is done subject to the following condition:
- all points which fall on the segment between a point in
the subset and the center of the subset are part of the subset,
i.e. the sensitivity pattern for each such point
matches that of the subset it is in.
The resulting disjoint subsets are then used to calculate a closed
interval by projecting the instances in each subset onto each input
variable. The interval is representative of a monotonic region of the
input instances, i.e. all instances which fall in this region are part
of the subset used for finding the region.
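The projection step can be illustrated with a short sketch. It assumes each subset is a list of equal-length numeric tuples and takes the closed interval for an input variable to be the minimum and maximum of the projected values; the function names are hypothetical.

```python
def project_to_intervals(subset):
    """Project a subset of instances onto each input variable.

    Returns one closed interval (low, high) per input variable.
    """
    n_inputs = len(subset[0])
    return [
        (min(x[i] for x in subset), max(x[i] for x in subset))
        for i in range(n_inputs)
    ]


def in_region(x, intervals):
    """True if every coordinate of x lies inside its closed interval."""
    return all(low <= value <= high for value, (low, high) in zip(x, intervals))


if __name__ == "__main__":
    subset = [(1.0, 0.5), (2.0, 1.0), (1.5, 0.2)]
    intervals = project_to_intervals(subset)
    print(intervals)
    print(in_region((1.2, 0.7), intervals), in_region((3.0, 0.7), intervals))
```

Each interval then reads naturally as one condition of a rule, for example "the first input lies between 1.0 and 2.0".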
At this point a rule can be generated by using the center of the
It is interesting to note that while this algorithm does provide an
explanation facility via rule extraction, it does not provide the same
accuracy as, for example, a decision tree. According to
[HNY96], the algorithm is an approximation, since there is a
trade-off between readability and accuracy. The authors give the
following example: we say ``Birds fly'', knowing that there is at least
one bird (e.g. a penguin) that does not fly. In this example, accuracy gives
way to the simplicity of the knowledge description to a certain
degree, and the occurrence of exceptions is allowed.
Genetic algorithms can also be the basis for an explanation facility
in neural networks [ED91]. This type of algorithm attempts
to model biological systems with respect to
evolution [NS92]. The primary operators used by genetic
algorithms are reproduction, crossover and
mutation. Reproduction is used to populate the mating
pool. Crossover describes the sharing of genetic information to
produce the next generation. The last operator is mutation, which
randomly alters genetic material; it preserves diversity, so that good
genetic material is not permanently lost while selection disposes of
unwanted material. The overall search of the
genetic algorithm is guided by a fitness function, which
measures the goodness of a given solution.
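The three operators can be sketched as a minimal genetic algorithm over bit strings. Everything specific here is an assumption made for illustration: the bit-string encoding, fitness-proportionate selection, single-point crossover, the parameter values, and the name `evolve`. None of it is taken from [ED91] or [NS92].

```python
import random


def evolve(fitness, n_genes, pop_size=30, generations=40,
           crossover_rate=0.7, mutation_rate=0.01, seed=0):
    """Minimal genetic algorithm over bit strings.

    Assumes fitness(individual) >= 0, n_genes >= 2 and an even pop_size.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Reproduction: fill the mating pool in proportion to fitness.
        weights = [fitness(ind) + 1e-9 for ind in population]
        pool = rng.choices(population, weights=weights, k=pop_size)
        next_gen = []
        for i in range(0, pop_size, 2):
            a, b = pool[i][:], pool[i + 1][:]
            # Crossover: the two parents exchange their tails.
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_genes - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            next_gen += [a, b]
        # Mutation: flip individual bits with a small probability.
        for ind in next_gen:
            for g in range(n_genes):
                if rng.random() < mutation_rate:
                    ind[g] = 1 - ind[g]
        population = next_gen
    return max(population, key=fitness)


if __name__ == "__main__":
    # Toy fitness: the number of ones in the bit string.
    print(evolve(sum, n_genes=12))
```

The toy fitness function (counting ones) is only a placeholder; in the setting discussed here the fitness would be derived from the neural network.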
Genetic algorithms can be used to find points on the decision
hyper-surface of the input space, which provides the separation of one
classification from another. Eberhart [Ebe92] does not
offer an actual mechanism for rule extraction. However, it is clear
that once the input space is better mapped, only a partitioning process
is needed to group the points. The groupings can then be used to
extract rules. This process is notably similar to the later steps in
the fuzzy set algorithm.
The points on the decision hyper-surface are obtained by using an
initially randomized genetic algorithm population as the input for the
network. The weight matrix of the neural net is used as the fitness
function governing the genetic operations. The input provided by the
genetic algorithm is then propagated through the network in an attempt
to find the activation value of the decision hyper-surface. This value
is many times and serves to distinguish between
classifications. During this process the weights in the neural net
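One way to picture this search is the sketch below, in which a fixed network scores candidate inputs and the genetic algorithm favours inputs whose output lies close to a chosen decision value. The decision value of 0.5, the stand-in network `net`, tournament selection, and all parameter values are assumptions of this sketch, not details taken from [Ebe92].

```python
import math
import random


def net(x):
    """Hypothetical stand-in for a trained network; its weights stay fixed."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 2.0 * x[1])))


def surface_fitness(x, target=0.5):
    """Higher when the network output is closer to the assumed decision value."""
    return 1.0 - abs(net(x) - target)


def tournament(rng, population):
    """Reproduction by tournament: keep the fitter of two random individuals."""
    a, b = rng.sample(population, 2)
    return a if surface_fitness(a) >= surface_fitness(b) else b


def find_surface_points(pop_size=40, generations=60, seed=1):
    """Evolve input vectors towards the decision hyper-surface."""
    rng = random.Random(seed)
    # Initially randomized population of candidate network inputs.
    population = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        next_gen = []
        while len(next_gen) < pop_size:
            a = tournament(rng, population)[:]
            b = tournament(rng, population)[:]
            if rng.random() < 0.7:   # crossover: swap the second coordinate
                a[1], b[1] = b[1], a[1]
            for child in (a, b):
                if rng.random() < 0.1:   # mutation: perturb one coordinate
                    child[rng.randrange(2)] += rng.gauss(0.0, 0.1)
                next_gen.append(child)
        population = next_gen[:pop_size]
    return sorted(population, key=surface_fitness, reverse=True)


if __name__ == "__main__":
    for point in find_surface_points()[:5]:
        print(point, net(point))
```

The returned points are ordered from closest to farthest from the assumed decision value, so the head of the list approximates points on the decision hyper-surface.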
An additional point of interest is that this technique may also be
used to generate additional training samples for partially trained
neural nets when only a few training samples are available. This is done
by presenting the instances close to the activation value to an
expert. After the expert classifies an instance, it can be used as
part of the training set.
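A minimal sketch of this query step follows. The decision value of 0.5, the tolerance of 0.1, and both function names are hypothetical; the expert's label is simply passed in as an argument.

```python
def select_queries(instances, outputs, target=0.5, tolerance=0.1):
    """Pick the instances whose network output lies close to the decision value."""
    return [x for x, y in zip(instances, outputs) if abs(y - target) <= tolerance]


def add_labelled(training_set, instance, expert_label):
    """Add an instance to the training set once the expert has classified it."""
    training_set.append((instance, expert_label))


if __name__ == "__main__":
    instances = [(0, 0), (1, 1), (2, 2)]
    outputs = [0.48, 0.90, 0.55]   # assumed network outputs for the instances
    training_set = []
    for candidate in select_queries(instances, outputs):
        add_labelled(training_set, candidate, expert_label="class A")
    print(training_set)
```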
While the genetic algorithm did not provide an explicit explanation
facility, it was apparent, as stated earlier, that a mechanism for
rule extraction could be created fairly easily. It also seems that
the combination of the two algorithms may yield even better results.
Given that the fuzzy set approach provides an elegant means of
partitioning the input space and the genetic algorithms could provide
additional input instances to make this partitioning more precise, it
should be possible to obtain generally useful rules.
Explanation in artificial neural networks.
International Journal on Man-Machine Studies, 37:335-355,
[Ebe92]
R. C. Eberhart.
The role of genetic algorithms in neural network query-based learning
and explanation facilities.
In International Workshop on Combinations of Genetic Algorithms
and Neural Networks (COGANN-92), pages 169-183, 1992.
[ED91]
R. C. Eberhart and R. W. Dobbins.
Designing neural network explanation facilities using genetic
In Proceedings of 1991 IEEE International Joint Conference on
Neural Networks, volume 2, pages 1758-1763, New York, NY, Nov 1991. IEEE.
[HNY96]
Hiroshi Narazaki, Toshihiko Watanabe, and Masaki Yamamoto.
Reorganizing knowledge in neural networks: An explanatory mechanism
for neural networks in data classification problems.
IEEE Transactions on Systems, Man, and Cybernetics,
26(1):107-117, February 1996.
[NS92]
Kendall E. Nygard and Susan M. Schilling.
Metastrategies for heuristic search.
In Proceedings of the Small College Computing Symposium, pages
213-222. SCCS, 1992.
[PT92]
Peter Pacini and Andrew Thorson.
Fuzzy logic primer.
Technical report, Togai InfraLogic, Inc., 1992.