Essentials of the self-organizing map

doi:10.1016/j.neunet.2012.09.018

Neural Networks

Volume 37, January 2013, Pages 52-65

https://doi.org/10.1016/j.neunet.2012.09.018

Abstract

The self-organizing map (SOM) is an automatic data-analysis method. It is widely applied to clustering problems and data exploration in industry, finance, natural sciences, and linguistics. The most extensive applications, exemplified in this paper, can be found in the management of massive textual databases and in bioinformatics. The SOM is related to the classical vector quantization (VQ), which is used extensively in digital signal processing and transmission. As in VQ, the SOM represents a distribution of input data items using a finite set of models. In the SOM, however, these models are automatically associated with the nodes of a regular (usually two-dimensional) grid in an orderly fashion, such that more similar models become associated with nodes that are adjacent in the grid, whereas less similar models are situated farther away from each other in the grid. This organization, a kind of similarity diagram of the models, makes it possible to gain insight into the topographic relationships of data, especially of high-dimensional data items. If the data items belong to certain predetermined classes, the models (and the nodes) can be calibrated according to these classes. An unknown input item is then classified according to the node whose model is most similar to it in the metric used in the construction of the SOM. A new finding introduced in this paper is that an input item can be represented even more accurately by a linear mixture of a few best-matching models. This becomes possible through a least-squares fitting procedure in which the coefficients of the linear mixture of models are constrained to nonnegative values.
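The classification step described in the abstract can be illustrated with a minimal sketch. It assumes squared Euclidean distance as the metric and uses a hypothetical, hand-written 2×2 map with hypothetical class labels; none of these values or function names come from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def best_matching_node(x, models):
    """Return the grid node whose model vector is closest to x."""
    return min(models, key=lambda node: sq_dist(x, models[node]))

def classify(x, models, labels):
    """Classify x by the calibrated label of its best-matching node."""
    return labels[best_matching_node(x, models)]

# Hypothetical, already-trained 2x2 map: grid node (row, col) -> model vector.
models = {
    (0, 0): [0.0, 0.0],
    (0, 1): [0.0, 1.0],
    (1, 0): [1.0, 0.0],
    (1, 1): [1.0, 1.0],
}
# Hypothetical calibration: each node carries the class of the data it matched.
labels = {(0, 0): "A", (0, 1): "A", (1, 0): "B", (1, 1): "B"}

print(classify([0.9, 0.2], models, labels))
```

In a real map the models would come from training and the labels from calibration data; only the lookup logic is shown here.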

Section snippets

Brain maps

It has been known for over a hundred years that various cortical areas of the brain are specialized to different modalities of cognitive functions. However, it was not until the work of, e.g., Mountcastle (1957) and Hubel and Wiesel (1962) that certain single neural cells in the brain were found to respond selectively to some specific sensory stimuli. These cells often form local assemblies, in which their topographic location corresponds to some feature value of a specific stimulus in an orderly fashion.

The classical vector quantization (VQ)

The implementation of optimally tuned feature-sensitive filters by competitive learning was actually demonstrated in abstract form much earlier in signal processing, namely in the classical vector quantization (VQ), the basic idea of which was introduced (in scalar form) by Lloyd (1957), and (in vector form) by Forgy (1965). In fact, the optimal quantization of a vector space dates back to 1850; it is called the Dirichlet tessellation in two- and three-dimensional spaces and the Voronoi tessellation in

Motivation of the SOM

Around 1981–82 this author introduced a new nonlinearly projecting mapping, called the self-organizing map (SOM), which otherwise resembles the VQ, but in which, additionally, the models (corresponding to the codebook vectors in the VQ) become spatially, globally ordered (Kohonen, 1982a, 1982b, 1990, 2001).

The SOM models are associated with the nodes of a regular, usually two-dimensional grid (Fig. 1). The SOM algorithm constructs the models such that:

More similar

The original, stepwise recursive SOM algorithm

The original formulation of the SOM algorithm resembles a gradient-descent procedure. It must be emphasized, however, that this version of the algorithm was introduced heuristically, in an attempt to implement the general learning principle given in Section 3.1. This basic form has not yet been shown to be derivable from any energy function. An approximate and purely formal, but not very rigorous, derivation follows from the stochastic approximation method (Robbins & Monro, 1951); it was applied
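For orientation, a minimal sketch of one stepwise update follows. It assumes the commonly cited form m_i(t+1) = m_i(t) + α(t) h_ci(t) [x(t) − m_i(t)], where c is the index of the best-matching model; this excerpt does not spell the formula out. The one-dimensional chain of nodes, the Gaussian neighborhood function, the function name `som_step`, and the decay schedules for `alpha` and `sigma` are illustrative assumptions, not the author's settings.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def som_step(models, x, alpha, sigma):
    """One stepwise update on a 1-D chain of nodes (modifies models in place).

    Finds the winner c (best-matching model), then moves every model toward x,
    weighted by a Gaussian neighborhood of the grid distance |i - c|.
    """
    c = min(range(len(models)), key=lambda i: sq_dist(x, models[i]))
    for i, m in enumerate(models):
        h = math.exp(-((i - c) ** 2) / (2.0 * sigma ** 2))
        for d in range(len(m)):
            m[d] += alpha * h * (x[d] - m[d])
    return c

rng = random.Random(0)
models = [[rng.random()] for _ in range(5)]  # five scalar models on a chain
steps = 2000
for t in range(steps):
    frac = t / steps
    alpha = 0.5 * (1.0 - frac)        # assumed linearly decaying learning rate
    sigma = 2.0 * (1.0 - frac) + 0.3  # assumed shrinking neighborhood radius
    som_step(models, [rng.random()], alpha, sigma)

print([round(m[0], 3) for m in models])
```

Because each update is a convex combination of a model and an input, the models stay inside the range spanned by the initial values and the data; whether the chain ends up ordered depends on the schedules and the random sequence.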

Main application areas of the SOM

Before looking into the details, one may be interested in knowing the justification of the SOM method. Briefly, by the end of the year 2005 we had documented 7768 scientific publications that analyze, develop, or apply the SOM; cf. Kaski, Kangas et al. (1998), Oja, Kaski et al. (2003) and Pöllä et al. (2009). The following short list gives the main application areas:

1. Statistical methods at large
   (a) exploratory data analysis
   (b) statistical analysis and organization of texts
2. Industrial analyses, control,

Approximation of an input data item by a linear mixture of models

An analysis hitherto generally unknown is introduced in this section; cf. also Kohonen (2007) and Kohonen (2008). The purpose is to extend the use of the SOM by showing that, instead of a single winner model, a set of several models can together approximate the input data item more accurately. It should be emphasized that we do not mean $k$ winners that are rank-ordered according to their matching. Instead, the input data item is
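The idea of a nonnegative linear mixture can be sketched as follows. The example minimizes ||x − Σ k_i m_i||² subject to k_i ≥ 0 using plain projected gradient descent, chosen here only because it fits in a few lines; it is not claimed to be the fitting procedure used in the paper. The function name `nonneg_mixture`, the toy models, and the iteration count are assumptions.

```python
def nonneg_mixture(x, models, iters=500):
    """Fit x ~ sum_i k[i] * models[i] with all k[i] >= 0.

    Projected gradient descent on 0.5 * ||sum_i k[i] * models[i] - x||^2:
    take a gradient step, then clip negative coefficients to zero.
    Assumes at least one model is nonzero.
    """
    n = len(models)
    # Step size 1 / trace(M^T M); the trace bounds the largest eigenvalue
    # of M^T M from above, so this step is small enough to converge.
    step = 1.0 / sum(v * v for m in models for v in m)
    k = [0.0] * n
    for _ in range(iters):
        fit = [sum(k[i] * models[i][d] for i in range(n)) for d in range(len(x))]
        resid = [f - xd for f, xd in zip(fit, x)]
        grad = [sum(md * rd for md, rd in zip(m, resid)) for m in models]
        k = [max(0.0, ki - step * g) for ki, g in zip(k, grad)]
    return k

# Toy example: three hypothetical models; the first points away from x,
# so its coefficient is held at zero by the nonnegativity constraint.
models = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
k = nonneg_mixture([-1.0, 2.0], models)
print([round(v, 6) for v in k])
```

The constraint is what makes the result interpretable as a mixture: models that would need a negative weight simply drop out, leaving only a few contributing models.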

Discussion

The self-organizing map (SOM) principle has been used extensively as an analytical and visualization tool in exploratory data analysis. It has had plenty of practical applications, ranging from industrial process control and financial analyses to the management of very large document collections. New, very promising applications exist in bioinformatics. The largest applications so far have been in the management and retrieval of textual documents, of which this paper contains two large-scale

Acknowledgments

The author is indebted to all of his collaborators who over the years have implemented the SOM program packages and applications. Dr. Merja Oja has kindly provided the picture and associated material about the more recent HERV studies.

References (95)

S. Amari. Topographic organization of nerve fields. Bulletin of Mathematical Biology (1980)

B. Fritzke. Growing cell structures—a self-organizing network for unsupervised and supervised learning. Neural Networks (1994)

B. Hammer et al. Recursive self-organizing network models. Neural Networks (2004)

S. Kaski et al. WEBSOM—self-organizing maps of document collections. Neurocomputing (1998)

T. Kohonen. Median strings. Pattern Recognition Letters (1985)

T. Kohonen. Self-organizing maps: optimization approaches

T. Kohonen. Self-organizing neural projections. Neural Networks (2006)

T. Kohonen et al. How to make large self-organizing maps for nonvectorial data. Neural Networks (2002)

M.M. Merzenich et al. Topographic reorganization of somatosensory cortical areas 3b and 1 in adult monkeys following restricted deafferentation. Neuroscience (1983)

M. Anderberg. Cluster analysis for applications (1973)

C.M. Bishop et al. GTM: the generative topographic mapping. Neural Computation (1998)

Y. Cheng. Convergence and ordering of Kohonen’s Batch map. Neural Computation (1997)

M. Cottrell et al. Étude d’un processus d’auto-organization. Annales de l’Institut Henri Poincaré (1987)

Cottrell, M., Fort, J.C., & Pagés, G. (1997). Theoretical aspects of the SOM algorithm. In Proceedings of the WSOM 97,...

G. Deboeck et al. Visual explorations in finance with self-organizing maps (1998)

S. Deerwester et al. Indexing by latent semantic analysis. Journal of the American Society for Information Science (1990)

G.L. Dirichlet. Über die Reduktion der positiven quadratischen Formen mit drei unbestimmten ganzen Zahlen. Journal für die Reine und Angewandte Mathematik (1850)

E.W. Forgy. Cluster analysis of multivariate data: efficiency vs. interpretability of classifications. Biometrics (1965)

A. Gersho. On the structure of vector quantizers. IEEE Transactions on Information Theory (1979)

R.M. Gray. Vector quantization. IEEE ASSP Magazine (1984)

S. Grossberg. On the development of feature detectors in the visual cortex with applications to learning and reaction–diffusion systems. Biological Cybernetics (1976)

J. Hartigan. Clustering algorithms (1975)

T.M. Heskes et al. Error potential for self-organization

D.H. Hubel et al. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. Journal of Physiology (1962)

A.K. Jain et al. Algorithms for clustering of data (1988)

S. Kaski. Dimensionality reduction by random mapping

S. Kaski et al. Bibliography of self-organizing map (SOM) papers: 1981–1997. Neural Computing Surveys (1998)

T. Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics (1982)

T. Kohonen. Clustering, taxonomy, and topological maps of patterns

T. Kohonen. Self-organization and associative memory (1989)

T. Kohonen. The self-organizing map. Proceedings of the IEEE (1990)

T. Kohonen. Emergence of invariant-feature detectors in self organization

T. Kohonen. Emergence of invariant-feature detectors in the adaptive-subspace self organizing map. Biological Cybernetics (1996)

T. Kohonen. Self-organizing maps (2001)

Kohonen, T. (2005). Pointwise organizing projections. In Proceedings of the WSOM05, 5th workshop on self-organizing...

Kohonen, T. (2007). Description of input patterns by linear mixtures of SOM models. In WSOM 2007 CD-ROM proceedings,...

T. Kohonen. Data management by self-organizing maps

Kohonen, T., Hynninen, J., Kangas, J., & Laaksonen, J. (1996). The self-organizing map program package, report A31....

T. Kohonen et al. Self organization of a massive document collection. IEEE Transactions on Neural Networks (2000)

T. Kohonen et al. Self-organized formation of various invariant-feature filters in the adaptive-subspace SOM. Neural Computation (1997)

T. Kohonen et al. Engineering applications of the self-organizing map. Proceedings of the IEEE (1996)

T. Kohonen et al. Contextually self-organized maps of Chinese words

J.B. Kruskal et al.

K. Lagus et al. Keyword selection method for characterizing text document maps

J. Lampinen et al. Self-organizing maps for spatial and temporal AR models

C.L. Lawson et al. Solving least-squares problems (1974)
