scikit-learn · qinhanmin2014 · Mar 4, 2019 · Mar 4, 2019 · Mar 4, 2019
diff --git a/doc/modules/clustering.rst b/doc/modules/clustering.rst
@@ -91,12 +91,6 @@ Overview of clustering methods
      - Non-flat geometry, uneven cluster sizes
      - Distances between nearest points
 
-   * - :ref:`OPTICS <optics>`
-     - minimum cluster membership
-     - Very large ``n_samples``, large ``n_clusters``
-     - Non-flat geometry, uneven cluster sizes, variable cluster density 
-     - Distances between points
-
    * - :ref:`Gaussian mixtures <mixture>`
      - many
      - Not scalable
@@ -802,11 +796,6 @@ by black points below.
     be used (e.g. with sparse matrices). This matrix will consume n^2 floats.
     A couple of mechanisms for getting around this are:
 
-    - Use :ref:`OPTICS <optics>` clustering in conjunction with the
-      `extract_dbscan` method. OPTICS clustering also calculates the full
-      pairwise matrix, but only keeps one row in memory at a time (memory
-      complexity n).
-
     - A sparse radius neighborhood graph (where missing entries are presumed to
       be out of eps) can be precomputed in a memory-efficient way and dbscan
       can be run over this with ``metric='precomputed'``.  See
@@ -825,106 +814,6 @@ by black points below.
    In Proceedings of the 2nd International Conference on Knowledge Discovery
    and Data Mining, Portland, OR, AAAI Press, pp. 226–231. 1996
 
-.. _optics:
-
-OPTICS
-======
-
-The :class:`OPTICS` algorithm shares many similarities with the
-:class:`DBSCAN` algorithm, and can be considered a generalization of
-DBSCAN that relaxes the ``eps`` requirement from a single value to a value
-range. The key difference between DBSCAN and OPTICS is that the OPTICS
-algorithm builds a *reachability* graph, which assigns each sample both a
-``reachability_`` distance, and a spot within the cluster ``ordering_``
-attribute; these two attributes are assigned when the model is fitted, and are
-used to determine cluster membership. If OPTICS is run with the default value
-of *inf* set for ``max_eps``, then DBSCAN style cluster extraction can be
-performed in linear time for any given ``eps`` value using the
-``extract_dbscan`` method. Setting ``max_eps`` to a lower value will result
-in shorter run times, and can be thought of as the maximum cluster object size
-(in diameter) that OPTICS will be able to extract.
-
-.. |optics_results| image:: ../auto_examples/cluster/images/sphx_glr_plot_optics_001.png
-        :target: ../auto_examples/cluster/plot_optics.html
-        :scale: 50
-
-.. centered:: |optics_results|
-
-The *reachability* distances generated by OPTICS allow for variable density
-extraction of clusters within a single data set. As shown in the above plot,
-combining *reachability* distances and data set ``ordering_`` produces a
-*reachability plot*, where point density is represented on the Y-axis, and
-points are ordered such that nearby points are adjacent. 'Cutting' the
-reachability plot at a single value produces DBSCAN like results; all points
-above the 'cut' are classified as noise, and each time that there is a break
-when reading from left to right signifies a new cluster. The default cluster
-extraction with OPTICS looks at changes in slope within the graph to guess at
-natural clusters. There are also other possibilities for analysis on the graph
-itself, such as generating hierarchical representations of the data through
-reachability-plot dendrograms. The plot above has been color-coded so that
-cluster colors in planar space match the linear segment clusters of the
-reachability plot-- note that the blue and red clusters are adjacent in the
-reachability plot, and can be hierarchically represented as children of a
-larger parent cluster.
-
-.. topic:: Examples:
-
-     * :ref:`sphx_glr_auto_examples_cluster_plot_optics.py`
-
-
-.. topic:: Comparison with DBSCAN
-
-    The results from OPTICS ``extract_dbscan`` method and DBSCAN are not quite
-    identical. Specifically, while *core_samples* returned from both OPTICS
-    and DBSCAN are guaranteed to be identical, labeling of periphery and noise
-    points is not. This is in part because the first sample processed by
-    OPTICS will always have a reachability distance that is set to ``inf``,
-    and will thus generally be marked as noise rather than periphery. This
-    affects adjacent points when they are considered as candidates for being
-    marked as either periphery or noise. While this effect is quite local to
-    the starting point of the dataset and is unlikely to be noticed on even
-    moderately large datasets, it is worth also noting that non-core boundry
-    points may switch cluster labels on the rare occasion that they are
-    equidistant to a competeing cluster due to how the graph is read from left
-    to right when assigning labels. 
-
-    Note that for any single value of ``eps``, DBSCAN will tend to have a
-    shorter run time than OPTICS; however, for repeated runs at varying ``eps``
-    values, a single run of OPTICS may require less cumulative runtime than
-    DBSCAN. It is also important to note that OPTICS output can be unstable at
-    ``eps`` values very close to the initial ``max_eps`` value. OPTICS seems
-    to produce near identical results to DBSCAN provided that ``eps`` passed to
-    ``extract_dbscan`` is a half order of magnitude less than the inital
-    ``max_eps`` that was used to fit; using a value close to ``max_eps``
-    will throw a warning, and using a value larger will result in an exception. 
-
-.. topic:: Computational Complexity
-
-    Spatial indexing trees are used to avoid calculating the full distance
-    matrix, and allow for efficient memory usage on large sets of samples.
-    Different distance metrics can be supplied via the ``metric`` keyword.
-
-    For large datasets, similar (but not identical) results can be obtained via
-    `HDBSCAN <https://hdbscan.readthedocs.io>`_. The HDBSCAN implementation is
-    multithreaded, and has better algorithmic runtime complexity than OPTICS--
-    at the cost of worse memory scaling. For extremely large datasets that
-    exhaust system memory using HDBSCAN, OPTICS will maintain *n* (as opposed
-    to *n^2* memory scaling); however, tuning of the ``max_eps`` parameter
-    will likely need to be used to give a solution in a reasonable amount of
-    wall time.
-
-.. topic:: References:
-
- *  "OPTICS: ordering points to identify the clustering structure."
-    Ankerst, Mihael, Markus M. Breunig, Hans-Peter Kriegel, and Jörg Sander.
-    In ACM Sigmod Record, vol. 28, no. 2, pp. 49-60. ACM, 1999.
-
- *  "Automatic extraction of clusters from hierarchical clustering
-    representations."
-    Sander, Jörg, Xuejie Qin, Zhiyong Lu, Nan Niu, and Alex Kovarsky.
-    In Advances in knowledge discovery and data mining,
-    pp. 75-87. Springer Berlin Heidelberg, 2003. 
-
 .. _birch:
 
 Birch