
Lecture 7
Self-Organizing Maps (SOM)

Prof. Zhen Ni
Assistant Professor
Department of Electrical Engineering and Computer Science
J. Lohr College of Engineering
South Dakota State University
223 Daktronics Engineering Hall, Box 2222
Brookings, SD 57007

Outline

• What is SOM?
• Touring SOM through equations and examples
• More SOM examples and variations
• SOM applications in the real world
• References


Note

Most of the lecture slides are adapted from the following resources:

[1] S. Sayad, "Self-Organizing Maps (SOM)."
http://chem-eng.utoronto.ca/~datamining/Presentations/SOM.pdf

[2] AI-Junkie, "SOM tutorial."
http://www.ai-junkie.com/ann/som/som1.html

[3] S. Haykin, Neural Networks: A Comprehensive Foundation (2nd Edition). Prentice Hall PTR, 1999.

[4] S. Haykin, et al. Neural Networks and Learning Machines. Vol. 3. New York: Prentice Hall, 2009.

History

o Proposed by T. Kohonen in the early 1980s
 Kohonen, Teuvo. "Self-organized formation of topologically correct feature maps." Biological Cybernetics 43.1 (1982): 59-69.

o Became increasingly popular throughout the 1990s

o Its simplicity and effectiveness have made the SOM successful in various real-world applications
 Engineering, biology, chemistry, geoscience, financial markets, etc.

o A well-written book on SOM (cited over 15,000 times):
 Kohonen, Teuvo. Self-Organizing Maps. Vol. 30. Springer, 2001.


What is SOM?

• So far we have looked at networks with supervised training techniques, in which there is a target output for each input pattern, and the network learns to produce the required outputs.

• We now turn to unsupervised training, in which the networks learn to form their own classifications of the training data without external help. To do this we have to assume that class membership is broadly defined by the input patterns sharing common features, and that the network will be able to identify those features across the range of input patterns.

What is SOM?

• Also known as a self-organizing feature map (SOFM) or Kohonen map

• An unsupervised artificial neural network (ANN)
– Topographic maps acquired through learning
– Visualization of multi-dimensional data
– One of the most popular data clustering and analysis tools, with real-world and cross-disciplinary applications

Topographic Maps

• Neurobiological studies indicate that different sensory inputs (motor, visual, auditory, etc.) are mapped onto corresponding areas of the cerebral cortex in an orderly fashion.

• This form of map, known as a topographic map, has two important properties:
– 1. At each stage of representation, or processing, each piece of incoming information is kept in its proper context/neighbourhood.
– 2. Neurons dealing with closely related pieces of information are kept close together so that they can interact via short synaptic connections.

• Our interest is in building artificial topographic maps that learn through self-organization in a neurobiologically inspired manner.


Organization of the Mapping

We have points x in the input space mapping to points I(x) in the output space.

Each point I in the output space maps back to a corresponding weight vector w(I) in the input space.

Visualizing the Self-Organization Process (1)

Visualizing the Self-Organization Process (2)

Visualizing the Self-Organization Process (3)

Visualizing the Self-Organization Process (4)

Visualizing the Self-Organization Process

The SOM algorithm


Components of Self-Organization
• Four major steps in the self-organization procedure
1. Initialization:
 All the weights of neurons are initialized.
2. Competition:
 For each input pattern, compute the respective values of a discriminant
function for each neuron node;
 The particular neuron with the smallest value of the discriminant function
is declared the winner.
3. Cooperation:
 The winner determines a topological neighborhood of excited neurons,
whose degree of excitation is determined by their distance to the winner.
4. Adaptation:
 The excited neurons adjust their weights to reduce the discriminant
function for each input pattern.


Overview: the general process

1. Initialize the weights for each node on the map;

2. Sample a training data point for the SOM;

3. Find the Best Matching Unit (BMU) for the input instance, based on the distance between the data and the weights of the nodes;

4. Calculate the radius of the neighborhood around the BMU;

5. Update the weights of the nodes according to their distances to the BMU;

6. Repeat from step 2 for enough iterations, until convergence (see the sketch below).
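To make these six steps concrete, here is a minimal NumPy sketch of the whole loop. It assumes a rectangular lattice stored as an (H, W, d) weight array, a Gaussian kernel, and exponential decays; the grid size, decay constants, and function name are illustrative choices, not fixed by the slides.

import numpy as np

def train_som(data, grid_h=10, grid_w=10, n_iters=10000,
              sigma0=5.0, L0=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: random initial weights, same dimension as the input data.
    W = rng.random((grid_h, grid_w, data.shape[1]))
    # Lattice coordinates of every node (used for output-space distances).
    yy, xx = np.mgrid[0:grid_h, 0:grid_w]
    lam_sigma = n_iters / np.log(sigma0)  # time factor for the radius
    lam_L = float(n_iters)                # time factor for the learning rate
    for t in range(n_iters):
        # Step 2: sample one training vector (randomly, with replacement).
        x = data[rng.integers(len(data))]
        # Step 3: the BMU is the node whose weight vector is closest to x
        # (Euclidean distance in the input space).
        dist2 = np.sum((W - x) ** 2, axis=2)
        bi, bj = np.unravel_index(np.argmin(dist2), dist2.shape)
        # Step 4: the neighborhood radius shrinks over time.
        sigma = sigma0 * np.exp(-t / lam_sigma)
        # Step 5: a Gaussian kernel over lattice distances to the BMU,
        # scaled by a decaying learning rate, pulls weights toward x.
        lat2 = (yy - bi) ** 2 + (xx - bj) ** 2
        theta = np.exp(-lat2 / (2 * sigma ** 2))
        L = L0 * np.exp(-t / lam_L)
        W += (theta * L)[:, :, None] * (x - W)
        # Step 6: repeat until the iteration budget is exhausted.
    return W

For instance, train_som(np.random.rand(1000, 3)) fits a 10-by-10 map to 1000 random 3-D points.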

Initialization

• Nodes are assigned random initial weights
 How to choose the range?
 Is normalization required?

• The weights, W, have the same dimension as the input data

• Weights can also be initialized with an eigenvalue-based linear approach (e.g., spanning the data's principal components)

(Figure: a random initialization of the weights, illustrated in different colors.)
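One common answer to the range and normalization questions above is to draw each initial weight uniformly within the observed per-feature range of the data. A small sketch (the function name is illustrative):

import numpy as np

def init_weights(data, grid_h, grid_w, seed=0):
    # Uniform random weights inside each feature's observed [min, max] range,
    # so the map starts inside the data cloud even without normalization.
    rng = np.random.default_rng(seed)
    lo, hi = data.min(axis=0), data.max(axis=0)
    return rng.uniform(lo, hi, size=(grid_h, grid_w, data.shape[1]))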


Data Sampling

• Can be done sequentially or randomly

• Why sequentially?
– It reduces the training process to a single source of randomness (the initial weights)

• How to implement random sampling?
– Hint: generate a random integer index

• For random sampling, we usually sample with replacement
– How do we sample with replacement, and why? (See the sketch below.)
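A short sketch contrasting the two schemes (data and the iteration counter t are illustrative names):

import numpy as np

rng = np.random.default_rng(0)
data = np.random.rand(100, 3)   # illustrative training matrix
n = len(data)

for t in range(1000):
    x_seq = data[t % n]             # sequential: the initial weights are now
                                    # the only source of randomness
    x_rnd = data[rng.integers(n)]   # random, with replacement: a random integer
                                    # index; every sample stays available at
                                    # every iteration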

Four Factors that Matter in the Self-Organization

• The winning neuron, or Best Matching Unit (BMU)

• The neighborhood size of the BMU

• The learning rate of the SOM

• The lateral influence of the BMU


Competition: Finding the BMU

• Idea: the current Best Matching Unit (BMU) is the node whose weight vector is closest to the current input data

• Calculate the Euclidean distance between the input x and each node's weight vector w: d(x, w) = ||x − w|| = sqrt(Σj (xj − wj)²)

• Note: this distance is calculated in the input space; we will see a different, lattice distance in the output space in the next slide

(Figure: a Best Matching Unit (BMU) with its neighborhood.)
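A vectorized sketch of this competition step, assuming the (H, W, d) weight array used earlier:

import numpy as np

def find_bmu(W, x):
    # Squared Euclidean distances in the *input* space, one per node.
    dist2 = np.sum((W - x) ** 2, axis=2)
    # Lattice coordinates (row, col) of the node closest to the input x.
    return np.unravel_index(np.argmin(dist2), dist2.shape)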

Updating the Weights: Neighborhood Size

• Idea: the influence of the BMU will decrease over time, as other nodes have been trained and loaded with input information

• In the output space, the neighborhood size shrinks over time: σ(t) = σ0 · exp(−t / λσ)
 σ0: initial neighborhood size;
 λσ: time factor for σ


Updating the Weights: Neighborhood Size

• Idea: the radius of the BMU's influence will decrease over time, as all other nodes are already trained by previous inputs

• Assume we keep the same BMU in this example

(Figure: a 2D hexagonal SOM grid in the output space; once training starts, the neighborhood of the BMU shrinks over time.)

Updating the Weights: The Kernel

• Idea: weights of nodes that are closer to the BMU are subject to greater influence

• The lattice distances between the nodes and the BMU are calculated in the output space

• A neighborhood function (kernel) is used to determine the influence of the BMU within the neighborhood
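The slides do not pin down a specific kernel; the Gaussian used in the AI-Junkie tutorial cited earlier, Θ = exp(−d² / (2σ²)), is the usual choice. A sketch over lattice distances:

import numpy as np

def neighborhood_kernel(bmu, grid_h, grid_w, sigma):
    # Lattice (output-space) distances from every node to the BMU.
    yy, xx = np.mgrid[0:grid_h, 0:grid_w]
    lat2 = (yy - bmu[0]) ** 2 + (xx - bmu[1]) ** 2
    # Gaussian influence: 1 at the BMU, decaying with lattice distance.
    return np.exp(-lat2 / (2.0 * sigma ** 2))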


Updating the Weights: The Kernel

• Idea: weights of nodes that are closer to the BMU are subject to greater influence

• Assume we keep the same BMU in this example

(Figure: a 2D hexagonal SOM grid in the output space; once training starts, the lateral influence of the BMU also shrinks over time.)

Updating the Weights: Neighborhood Size and Kernel Combined

• Idea: combine the BMU's neighborhood size and its lateral influence

• Assume we keep the same BMU and learning rate in this example

(Figure: a 2D hexagonal SOM grid in the output space; once training starts, both the neighborhood size and the lateral influence of the BMU shrink over time.)


Updating the Weights: Learning Rate

• Idea: the learning rate should decrease over time

• Learning rate L(t) = L0 · exp(−t / λL)
 L0: initial learning rate;
 λL: time factor for L

• λL divides training into a rough learning stage and a fine-tuning stage

(Figure: the learning rate L(t) over 10,000 iterations, for L0 = 0.1 and λL = 1000.)
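A few lines reproduce the curve in the figure:

import numpy as np

L0, lam_L = 0.1, 1000.0
t = np.arange(1, 10001)
L = L0 * np.exp(-t / lam_L)    # L(t) = L0 * exp(-t / lambda_L)
print(L[999])                  # t = 1000: about 0.1 * exp(-1) ≈ 0.037,
                               # already past the rough learning stage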

The Learning Rate: A 2D Example

Consider an SOM drawn in the input space (nodes represent 2D neuron weights rather than locations on the lattice). Given the current input x0, the SOM weights are drawn closer to x0; larger learning rates (e.g., L0 = 0.8 vs. 0.5 vs. 0.1) pull them further from their original locations, and the BMU itself also moves for any non-zero learning rate.

(Figure: an 11-by-6 2D hexagonal SOM grid in the input space.)


Updating the Weights

• The final formula of the weight update:

  w(t+1) = w(t) + Θ(t) · L(t) · (x(t) − w(t))

where L(t) is the learning rate, Θ(t) is the neighborhood kernel value for the node being updated, and x(t) is the current input.

(Figure: a Best Matching Unit (BMU) with its neighborhood.)
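In code, the update is a single broadcasted step over the whole grid, reusing the kernel and learning-rate pieces sketched earlier (names are illustrative):

import numpy as np

def update_weights(W, x, theta, L):
    # w(t+1) = w(t) + theta * L * (x - w), applied to every node at once;
    # nodes far from the BMU have theta near 0 and barely move.
    return W + (theta * L)[:, :, None] * (x - W)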

Example: The learning process of an SOM

(Figure: snapshots of the training process, marking the input data, the Best Matching Unit, and the weight updates.)

• From left to right, the 2D SOM grid tends to approximate the data distribution in the input space

Source: http://en.wikipedia.org/wiki/Self-organizing_map


Result presentation: SOM lattice

• Plot the SOM neurons in the input space according to their weight vectors

(Figure, Example 3: the data and the original map, and the result after 300 steps of sequential training; SOM training result visualized in the input space.)

A 1-D SOM example

Approximate an S-shaped data distribution with 20 SOM nodes arranged in a line.


A 2-D SOM example


A 3-D SOM example

(Figure: the data, the map after initialization, and the map after training, with the BMU highlighted.)

Less intuitive/illustrative as the dimension of the output space increases.


Variations: The learning process of an adaptive SOM


(from one of the previous examples)


SOM Applications



SOM Applications
 Data Analysis (Engineering Applications)
o Computational Intelligence analysis
 Data visualization
 Data clustering

Hagenauer, J. and Helbich, M. (2013). "Contextual neural gas for spatial clustering and analysis." International Journal of Geographical Information Science, 27(2): 251–266.

SOM Applications
 Data Analysis (Engineering Applications)
o Communication networks
 Kaski, Samuel, et al. "WEBSOM– self-organizing maps of document
collections." Neurocomputing 21.1 (1998): 101-117.
o Chemical Engineering
 Melssen, W. J., et al. "Using artificial neural networks for solving chemical
problems: Part II. Kohonen self-organising feature maps and Hopfield networks."
Chemometrics and intelligent laboratory systems 23.2 (1994): 267-291.
o Geoinformation Systems (GIS)
 Ji, C. Y. "Land-use classification of remotely sensed data using Kohonen self-
organizing feature map neural networks." Photogrammetric Engineering and
Remote Sensing 66.12 (2000): 1451-1460.
 Bação, Fernando, Victor Lobo, and Marco Painho. "The self-organizing map, the
Geo-SOM, and relevant variants for geosciences." Computers & Geosciences 31.2
(2005): 155-163.



SOM Applications
o Power System analysis
 Yan, Jun, et al. "Multi-Contingency Cascading Analysis of Smart Grid Based on Self-Organizing Map." Information Forensics and Security, IEEE Transactions on, 8.4 (2013).
 Thang, K. F., et al. "Analysis of power transformer dissolved gas data using the self-
organizing map." Power Delivery, IEEE Transactions on 18.4 (2003)

A visualization and security analysis for Texas Grid with SOM, Yan, Jun, et al (2013)


SOM Applications
 Data Analysis (Non-engineering fields)
o Biology
 Ideker, Trey, et al. "Integrated genomic and proteomic analyses of a systematically
perturbed metabolic network." Science 292.5518 (2001): 929-934.
 Tamayo, Pablo, et al. "Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation." Proceedings of the National Academy of Sciences 96.6 (1999): 2907-2912.
 Fankhauser, Niklaus, and Pascal Mäser. "Identification of GPI anchor attachment
signals by a Kohonen self-organizing map." Bioinformatics 21.9 (2005): 1846-1852.
 Biomedical study
 Chen, Dar-Ren, Ruey-Feng Chang, and Yu-Len Huang. "Breast cancer diagnosis
using self-organizing map for sonography." Ultrasound in medicine & biology 26.3
(2000): 405-411.
 Andrade, M. A., et al. "Evaluation of secondary structure of proteins from UV circular dichroism spectra using an unsupervised learning neural network." Protein Engineering 6.4 (1993): 383-390.
 Lagerholm, Martin, et al. "Clustering ECG complexes using Hermite functions and self-organizing maps." Biomedical Engineering, IEEE Transactions on 47.7 (2000): 838-848.


SOM Applications
 Data Analysis (Non-engineering fields)
o Finance
– Kiviluoto, Kimmo. "Predicting bankruptcies with the self-organizing map."
Neurocomputing 21.1 (1998): 191-201.
– Kaski, Samuel, Janne Sinkkonen, and Jaakko Peltonen. "Bankruptcy analysis with
self-organizing maps in learning metrics." Neural Networks, IEEE Transactions
on 12.4 (2001): 936-947.
– Eklund, Tomas, et al. "Using the self-organizing map as a visualization tool in
financial benchmarking." Information Visualization 2.3 (2003): 171-181.
o Ecology
– Giraudel, J. L., and S. Lek. "A comparison of self-organizing map algorithm and
some conventional statistical methods for ecological community ordination."
Ecological Modelling 146.1 (2001): 329-339.


SOM Resources

• Tutorial
– Kohonen's Self Organizing Feature Maps
http://www.ai-junkie.com/ann/som/som1.html

• Toolbox
– SOM Toolbox for Matlab
http://www.cis.hut.fi/somtoolbox/
– WEKA SOM package
https://sourceforge.net/projects/wekann/files/SelfOrganizingMap/



Summary
• Self-Organizing Map
– The algorithm
• Initialization, competition, cooperation, and adaptation
• BMU, neighborhood size, learning rate, and kernel function
– Examples
• The Iris example
• SOM variations
– SOM applications and resources


Reference
The lecture notes in this lecture are adapted from and based on the following sources:

 S. Haykin, "Neural Networks: A Comprehensive Foundation," Prentice Hall, 2nd edition, 1999, ISBN: 0-13-273350-1.
 Kohonen, Teuvo. Self-Organizing Maps. Vol. 30. Springer, 2001.
 Kohonen, Teuvo, et al. "Engineering applications of the self-organizing map." Proceedings of the IEEE 84.10 (1996): 1358-1384.
 Oja, Erkki, and Samuel Kaski, eds. Kohonen Maps. Elsevier Science, 1999.
 Vesanto, Juha, et al. "Self-organizing map in Matlab: the SOM toolbox." Proceedings of the Matlab DSP Conference. Vol. 99. 1999.
 Vesanto, Juha, and Esa Alhoniemi. "Clustering of the self-organizing map." Neural Networks, IEEE Transactions on 11.3 (2000): 586-600.
 Ultsch, Alfred. "Self-organizing neural networks for visualisation and classification." Information and Classification. Springer Berlin Heidelberg, 1993. 307-313.
 http://en.wikipedia.org/wiki/Self-organizing_map
 http://www.ai-junkie.com/ann/som/som1.html
 http://www.cis.hut.fi/projects/somtoolbox/theory/somalgorithm.shtml
 https://sourceforge.net/projects/wekann/files/SelfOrganizingMap/
 http://www.eicstes.org/EICSTES_PDF/PAPERS/The%20Self-Organizing%20Map%20(Kohonen).pdf
 http://www.cs.bham.ac.uk/~jxb/NN/l16.pdf
 http://users.ics.aalto.fi/sami/thesis/thesis.html
