Skip to content

Commit 0028fb4

Browse files
committed
DOC: put math at back, simplify formulation
1 parent 3fbd098 commit 0028fb4

File tree

1 file changed

+27
-26
lines changed

1 file changed

+27
-26
lines changed

doc/modules/clustering.rst

+27-26
Original file line numberDiff line numberDiff line change
@@ -222,11 +222,34 @@ values from other pairs. This updating happens iteratively until convergence,
222222
at which point the final exemplars are chosen, and hence the final clustering
223223
is given.
224224

225-
Affinity Propagation has a number of advantages over other algorithms. In many
226-
experiments it is shown to produce a lower error than other algorithms,
227-
specifically k-means, and also works without any parameters, choosing the
228-
number of clusters based on the data provided.
225+
.. figure:: ../auto_examples/cluster/images/plot_affinity_propagation_1.png
226+
:target: ../auto_examples/cluster/plot_affinity_propagation.html
227+
:align: center
228+
:scale: 50
229+
230+
231+
Affinity Propagation can be interesting as it chooses the number of
232+
clusters based on the data provided. For this purpose, the two important
233+
parameters are the `preference`, which controls how many examplars are
234+
used, and the `damping` factor.
235+
236+
The main drawback of Affinity Propagation is its complexity. The
237+
algorithm has a time complexity of the order :math:`O(N^2 T)`, where `N`
238+
is the number of samples and `T` is the number of iterations until
239+
convergence. Further, the memory complexity is of the order
240+
:math:`O(N^2)` if a dense similarity matrix is used, but reducible if a
241+
sparse similarity matrix is used. This makes Affinity Propagation most
242+
appropriate for small to medium sized datasets.
243+
244+
.. topic:: Examples:
245+
246+
* :ref:`example_cluster_plot_affinity_propagation.py`: Affinity
247+
Propagation on a synthetic 2D datasets with 3 classes.
248+
249+
* :ref:`example_applications_plot_stock_market.py` Affinity Propagation on
250+
Financial time series to find groups of companies
229251

252+
**Algorithm description:**
230253
The messages sent between points belong to one of two categories. The first is
231254
the responsibility `r(i, k)`, which is the accumulated evidence that sample `k`
232255
should be the exemplar for sample `i`. The second is the availability `a(i, k)`
@@ -253,28 +276,6 @@ availability of sample `k` to be the exemplar of sample `i` is given by:
253276
To begin with, all values for `r` and `a` are set to zero, and the calculation
254277
of each iterates until convergence.
255278

256-
While effective, Affinity Propagation has some disadvantages. The most pressing
257-
is its complexity. The algorithm has a time complexity of the order
258-
:math:`O(N^2 T)`, where `N` is the number of samples and `T` is the number of
259-
iterations until convergence. Further, the space complexity is of the order
260-
:math:`O(N^2)` if a dense similarity matrix is used, but reducible if a sparse
261-
similarity matrix is used. This makes Affinity Propagation most appropriate for
262-
small to medium sized datasets.
263-
264-
.. figure:: ../auto_examples/cluster/images/plot_affinity_propagation_1.png
265-
:target: ../auto_examples/cluster/plot_affinity_propagation.html
266-
:align: center
267-
:scale: 50
268-
269-
.. topic:: Examples:
270-
271-
* :ref:`example_cluster_plot_affinity_propagation.py`: Affinity
272-
Propagation on a synthetic 2D datasets with 3 classes.
273-
274-
* :ref:`example_applications_plot_stock_market.py` Affinity Propagation on
275-
Financial time series to find groups of companies
276-
277-
278279
.. _mean_shift:
279280

280281
Mean Shift

0 commit comments

Comments
 (0)