Self-Organizing Maps (SOM) is an unsupervised learning method that uses vector quantization to allow visualization and clustering of data. It is unsupervised because the only input (besides the parameters of the algorithm) is the data itself. In a classification problem, for example, the labels for each data point are not available. One interesting property of the resulting maps is that it is topographically ordered, that is neurons that are closer to each other are more similar than more distant ones. This makes them useful for visualization.
The simplest form of SOM consists of a two dimensional rectangular or hexagonal grid of neurons, each of which represents a point in an N dimensional space. The dimensionality of this point is the same as the input data. Upon creation, each neuron is initialized to a random point. After training, the training data is projected onto this two dimensional neuronal network, with points in the data space that occur more frequently being represented by more nodes.
The simplest SOM works like so: For each training data, a node from the existing network that is closest (according to some distance measure, in FANN Euclidean distance) is assigned as the winning node. This node’s vector is then made closer to the training point by some fraction, determined by the learning rate. In addition, a neighborhood function that determines the winning node’s neighbors is given. These neighboring nodes are then modified to be closer to the training data point as well, but by a lesser amoung. Nodes that are further from the winning node are changed less than those that are closer.
There are several parameters for the SOM: The width and height (number of nodes in each direction), topology (hexagonal versus rectangular), neighborhood function (gaussian or linear), the function determining the decay of the learning rate (linear or inverse), and the starting learning rate.
In SOMs, the network to be trained is fixed throughout. There have been several dynamic variants where the new nodes are added or deleted based on the error characteristics of the nodes and network. One such variant, called Growing Neural Gas (GNG), has been implemented. This variant adds and deletes nodes based on the local and global error characteristics. See the reference documents for a description. The GNG algorithm has several important parameters, including the maximum number of nodes to add, the maximum age of a node before it is deleted, and scaling factors for the reduction of error of nodes that are moved. See the gng_params structure for a complete listing of modifiable parameters.