From 271f33935830a87dfcf9d724de53ef6ac15c8430 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Fri, 30 Jun 2017 12:38:33 +1000 Subject: [PATCH 01/19] DOC cleaning up what's new for 0.19 --- doc/whats_new.rst | 311 ++++++++++++++++++++++++++++------------------ 1 file changed, 189 insertions(+), 122 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index d367c627c27c4..ec8709a3c8e66 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -10,6 +10,28 @@ Version 0.19 **In Development** +Highlights +---------- + +TODO: + +This release includes a number of great new features including Local Outlier Factor for anomaly detection, QuantileTransformer for robust feature transformation, and ClassifierChain to simply account for dependencies between classes in multilabel problems. + +Pipeline caching makes grid search over pipelines including slow transformations much more efficient. + +Multinomial logistic regression with L1 loss. + +?Rewrite of TSNE + +Multi-metric grid search and cross validation + +Major deprecations +------------------ + +TODO + +We have deprecated RandomizedLasso and RandomizedLogisticRegression and LSHForest because they weren't appropriate or up to standards. We have deprecated a number of utilities no longer necessary now that we require Scipy 0.13.3 and Numpy 1.8.2 at a minimum. + Changed models -------------- @@ -19,6 +41,7 @@ occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures. * :class:`sklearn.ensemble.IsolationForest` (bug fix) + * TODO Details are listed in the changelog below. @@ -31,32 +54,17 @@ Changelog New features ............ - - Added :class:`multioutput.ClassifierChain` for multi-label - classification. By `Adam Kleczewski `_. +Configuration - Validation that input data contains no NaN or inf can now be suppressed using :func:`config_context`, at your own risk. This will save on runtime, and may be particularly useful for prediction time. :issue:`7548` by `Joel Nothman`_. - - Added the :class:`neighbors.LocalOutlierFactor` class for anomaly - detection based on nearest neighbors. - :issue:`5279` by `Nicolas Goix`_ and `Alexandre Gramfort`_. - - - The new solver ``'mu'`` implements a Multiplicate Update in - :class:`decomposition.NMF`, allowing the optimization of all - beta-divergences, including the Frobenius norm, the generalized - Kullback-Leibler divergence and the Itakura-Saito divergence. - :issue:`5295` by `Tom Dupre la Tour`_. - - - Added the :class:`model_selection.RepeatedKFold` and - :class:`model_selection.RepeatedStratifiedKFold`. - :issue:`8120` by `Neeraj Gangwar`_. +Classifiers and regressors - - Added :func:`metrics.mean_squared_log_error`, which computes - the mean square error of the logarithmic transformation of targets, - particularly useful for targets with an exponential trend. - :issue:`7655` by :user:`Karan Desai `. + - Added :class:`multioutput.ClassifierChain` for multi-label + classification. By `Adam Kleczewski `_. - Added solver ``'saga'`` that implements the improved version of Stochastic Average Gradient, in :class:`linear_model.LogisticRegression` and @@ -65,6 +73,12 @@ New features during the first epochs of ridge and logistic regression. :issue:`8446` by `Arthur Mensch`_. +Other estimators + + - Added the :class:`neighbors.LocalOutlierFactor` class for anomaly + detection based on nearest neighbors. + :issue:`5279` by `Nicolas Goix`_ and `Alexandre Gramfort`_. 
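A minimal usage sketch for the :class:`neighbors.LocalOutlierFactor` entry above; the toy data, ``n_neighbors=2`` and the attributes shown are illustrative assumptions rather than part of the changelog entry::

    import numpy as np
    from sklearn.neighbors import LocalOutlierFactor

    # Three nearby points plus one obvious outlier, purely for illustration.
    X = np.array([[-1.1], [0.2], [0.3], [101.0]])
    lof = LocalOutlierFactor(n_neighbors=2)
    labels = lof.fit_predict(X)            # -1 marks outliers, 1 marks inliers
    scores = lof.negative_outlier_factor_  # lower values mean more abnormal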
+ - Added :class:`preprocessing.QuantileTransformer` class and :func:`preprocessing.quantile_transform` function for features normalization based on quantiles. @@ -72,47 +86,33 @@ New features :user:`Guillaume Lemaitre `, `Olivier Grisel`_, `Raghav RV`_, :user:`Thierry Guillemot `, and `Gael Varoquaux`_. + - The new solver ``'mu'`` implements a Multiplicate Update in + :class:`decomposition.NMF`, allowing the optimization of all + beta-divergences, including the Frobenius norm, the generalized + Kullback-Leibler divergence and the Itakura-Saito divergence. + :issue:`5295` by `Tom Dupre la Tour`_. + +Model selection and evaluation + + - Added :func:`metrics.mean_squared_log_error`, which computes + the mean square error of the logarithmic transformation of targets, + particularly useful for targets with an exponential trend. + :issue:`7655` by :user:`Karan Desai `. + - Added :func:`metrics.dcg_score` and :func:`metrics.ndcg_score`, which compute Discounted cumulative gain (DCG) and Normalized discounted cumulative gain (NDCG). :issue:`7739` by :user:`David Gasquez `. -Enhancements -............ - - - :func:`metrics.matthews_corrcoef` now support multiclass classification. - :issue:`8094` by :user:`Jon Crall `. - - Update Sphinx-Gallery from 0.1.4 to 0.1.7 for resolving links in - documentation build with Sphinx>1.5 :issue:`8010`, :issue:`7986` by - :user:`Oscar Najera ` - - :class:`multioutput.MultiOutputRegressor` and :class:`multioutput.MultiOutputClassifier` - now support online learning using `partial_fit`. - issue: `8053` by :user:`Peng Yu `. - - :class:`pipeline.Pipeline` allows to cache transformers - within a pipeline by using the ``memory`` constructor parameter. - :issue:`7990` by :user:`Guillaume Lemaitre `. - - - :class:`decomposition.PCA`, :class:`decomposition.IncrementalPCA` and - :class:`decomposition.TruncatedSVD` now expose the singular values - from the underlying SVD. They are stored in the attribute - ``singular_values_``, like in :class:`decomposition.IncrementalPCA`. + - Added the :class:`model_selection.RepeatedKFold` and + :class:`model_selection.RepeatedStratifiedKFold`. + :issue:`8120` by `Neeraj Gangwar`_. - - :class:`cluster.MiniBatchKMeans` and :class:`cluster.KMeans` - now uses significantly less memory when assigning data points to their - nearest cluster center. :issue:`7721` by :user:`Jon Crall `. - - Added ``classes_`` attribute to :class:`model_selection.GridSearchCV`, - :class:`model_selection.RandomizedSearchCV`, :class:`grid_search.GridSearchCV`, - and :class:`grid_search.RandomizedSearchCV` that matches the ``classes_`` - attribute of ``best_estimator_``. :issue:`7661` and :issue:`8295` - by :user:`Alyssa Batula `, :user:`Dylan Werner-Meier `, - and :user:`Stephen Hoover `. +Enhancements +............ - - Relax assumption on the data for the - :class:`kernel_approximation.SkewedChi2Sampler`. Since the Skewed-Chi2 - kernel is defined on the open interval :math:`(-skewedness; +\infty)^d`, - the transform function should not check whether ``X < 0`` but whether ``X < - -self.skewedness``. :issue:`7573` by :user:`Romain Brault `. +Trees and ensembles - The ``min_weight_fraction_leaf`` constraint in tree construction is now more efficient, taking a fast path to declare a node a leaf if its weight @@ -120,33 +120,16 @@ Enhancements different from previous versions where ``min_weight_fraction_leaf`` is used. :issue:`7441` by :user:`Nelson Liu `. 
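The :class:`preprocessing.QuantileTransformer` entry earlier in this section can be sketched as follows; the lognormal toy feature and the ``output_distribution='normal'`` option are assumptions made for illustration::

    import numpy as np
    from sklearn.preprocessing import QuantileTransformer

    rng = np.random.RandomState(0)
    X = rng.lognormal(size=(1000, 1))   # a heavy-tailed synthetic feature
    qt = QuantileTransformer(output_distribution='normal', random_state=0)
    X_mapped = qt.fit_transform(X)      # roughly Gaussian after mapping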
- - Added ``average`` parameter to perform weights averaging in - :class:`linear_model.PassiveAggressiveClassifier`. :issue:`4939` - by :user:`Andrea Esuli `. - - - Custom metrics for the :mod:`sklearn.neighbors` binary trees now have - fewer constraints: they must take two 1d-arrays and return a float. - :issue:`6288` by `Jake Vanderplas`_. - - :class:`ensemble.GradientBoostingClassifier` and :class:`ensemble.GradientBoostingRegressor` now support sparse input for prediction. :issue:`6101` by :user:`Ibraim Ganiev `. - - Added ``shuffle`` and ``random_state`` parameters to shuffle training - data before taking prefixes of it based on training sizes in - :func:`model_selection.learning_curve`. - :issue:`7506` by :user:`Narine Kokhlikyan `. - - - Added ``norm_order`` parameter to :class:`feature_selection.SelectFromModel` - to enable selection of the norm order when ``coef_`` is more than 1D. - :issue:`6181` by :user:`Antoine Wendlinger `. - - - Added ``sample_weight`` parameter to :meth:`pipeline.Pipeline.score`. - :issue:`7723` by :user:`Mikhail Korobov `. + - :class:`ensemble.VotingClassifier` now allow changing estimators by using + :meth:`ensemble.VotingClassifier.set_params`. Estimators can also be + removed by setting it to `None`. + :issue:`7674` by :user:`Yichuan Liu `. - - ``check_estimator`` now attempts to ensure that methods transform, predict, etc. - do not set attributes on the estimator. - :issue:`7533` by :user:`Ekaterina Krivich `. +Linear, kernelized and related models - :class:`linear_model.SGDClassifier`, :class:`linear_model.SGDRegressor`, :class:`linear_model.PassiveAggressiveClassifier`, @@ -157,10 +140,9 @@ Enhancements a ``n_iter_`` attribute, with actual number of iterations before convergence. By `Tom Dupre la Tour`_. - - For sparse matrices, :func:`preprocessing.normalize` with ``return_norm=True`` - will now raise a ``NotImplementedError`` with 'l1' or 'l2' norm and with - norm 'max' the norms returned will be the same as for dense matrices. - :issue:`7771` by `Ang Lu `_. + - Added ``average`` parameter to perform weight averaging in + :class:`linear_model.PassiveAggressiveClassifier`. :issue:`4939` + by :user:`Andrea Esuli `. - :class:`linear_model.RANSACRegressor` no longer throws an error when calling ``fit`` if no inliers are found in its first iteration. @@ -168,73 +150,118 @@ Enhancements attributes, ``n_skips_*``. :issue:`7914` by :user:`Michael Horrell `. - - :func:`model_selection.cross_val_predict` now returns output of the - correct shape for all values of the argument ``method``. - :issue:`7863` by :user:`Aman Dalmia `. - - - Fix a bug where :class:`feature_selection.SelectFdr` did not - exactly implement Benjamini-Hochberg procedure. It formerly may have - selected fewer features than it should. - :issue:`7490` by :user:`Peng Meng `. - - - Added ability to set ``n_jobs`` parameter to :func:`pipeline.make_union`. - A ``TypeError`` will be raised for any other kwargs. :issue:`8028` - by :user:`Alexander Booth `. - - - Added type checking to the ``accept_sparse`` parameter in - :mod:`sklearn.utils.validation` methods. This parameter now accepts only - boolean, string, or list/tuple of strings. ``accept_sparse=None`` is deprecated - and should be replaced by ``accept_sparse=False``. - :issue:`7880` by :user:`Josh Karnofsky `. - - - :class:`model_selection.GridSearchCV`, :class:`model_selection.RandomizedSearchCV` - and :func:`model_selection.cross_val_score` now allow estimators with callable - kernels which were previously prohibited. 
:issue:`8005` by `Andreas Müller`_ . - - - Added ability to use sparse matrices in :func:`feature_selection.f_regression` - with ``center=True``. :issue:`8065` by :user:`Daniel LeJeune `. + - Relax assumption on the data for the + :class:`kernel_approximation.SkewedChi2Sampler`. Since the Skewed-Chi2 + kernel is defined on the open interval :math:`(-skewedness; +\infty)^d`, + the transform function should not check whether ``X < 0`` but whether ``X < + -self.skewedness``. :issue:`7573` by :user:`Romain Brault `. - - Add ``sample_weight`` parameter to :func:`metrics.cohen_kappa_score`. - :issue:`8335` by :user:`Victor Poughon `. + - Custom metrics for the :mod:`neighbors` binary trees now have + fewer constraints: they must take two 1d-arrays and return a float. + :issue:`6288` by `Jake Vanderplas`_. - In :class:`gaussian_process.GaussianProcessRegressor`, method ``predict`` is a lot faster with ``return_std=True``. :issue:`8591` by :user:`Hadrien Bertrand `. - - Added ability to use sparse matrices in :func:`feature_selection.f_regression` - with ``center=True``. :issue:`8065` by :user:`Daniel LeJeune `. - - - :class:`ensemble.VotingClassifier` now allow changing estimators by using - :meth:`ensemble.VotingClassifier.set_params`. Estimators can also be - removed by setting it to `None`. - :issue:`7674` by :user:`Yichuan Liu `. - - - Prevent cast from float32 to float64 in + - Memory usage enhancement: Prevent cast from float32 to float64 in :class:`linear_model.LogisticRegression` when using newton-cg solver. :issue:`8835` by :user:`Joan Massich `. - - Prevent cast from float32 to float64 in + - Memory usage enhancement: Prevent cast from float32 to float64 in :class:`linear_model.Ridge` when using svd, sparse_cg, cholesky or lsqr solvers :class:`sklearn.linear_model.Ridge` when using svd, sparse_cg, cholesky or lsqr solvers by :user:`Joan Massich `, :user:`Nicolas Cordier ` - - Add ``max_train_size`` parameter to :class:`model_selection.TimeSeriesSplit` - :issue:`8282` by :user:`Aman Dalmia `. +Decomposition, manifold learning and clustering - - Make it possible to load a chunk of an svmlight formatted file by - passing a range of bytes to :func:`datasets.load_svmlight_file`. - :issue:`935` by :user:`Olivier Grisel `. + - :class:`cluster.MiniBatchKMeans` and :class:`cluster.KMeans` + now use significantly less memory when assigning data points to their + nearest cluster center. :issue:`7721` by :user:`Jon Crall `. + + - :class:`decomposition.PCA`, :class:`decomposition.IncrementalPCA` and + :class:`decomposition.TruncatedSVD` now expose the singular values + from the underlying SVD. They are stored in the attribute + ``singular_values_``, like in :class:`decomposition.IncrementalPCA`. + +Preprocessing and feature selection + + - Added ``norm_order`` parameter to :class:`feature_selection.SelectFromModel` + to enable selection of the norm order when ``coef_`` is more than 1D. + :issue:`6181` by :user:`Antoine Wendlinger `. + + - Added ability to use sparse matrices in :func:`feature_selection.f_regression` + with ``center=True``. :issue:`8065` by :user:`Daniel LeJeune `. - Small performance improvement to n-gram creation in :mod:`feature_extraction.text` by binding methods for loops and special-casing unigrams. :issue:`7567` by `Jaye Doepke ` +Model evaluation and meta-estimators + + - :class:`pipeline.Pipeline` allows to cache transformers + within a pipeline by using the ``memory`` constructor parameter. + :issue:`7990` by :user:`Guillaume Lemaitre `. 
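A hedged sketch of the transformer caching described in the :class:`pipeline.Pipeline` entry above; the temporary cache directory and the PCA/SVC steps are arbitrary choices for illustration::

    from tempfile import mkdtemp
    from sklearn.decomposition import PCA
    from sklearn.pipeline import Pipeline
    from sklearn.svm import SVC

    cachedir = mkdtemp()
    # Fitted transformers are cached in ``cachedir``, so refitting with
    # unchanged PCA parameters (e.g. during a grid search over the SVC step)
    # can reuse the cached transformation instead of recomputing it.
    pipe = Pipeline([('reduce_dim', PCA()), ('clf', SVC())], memory=cachedir)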
+ + - Added ``sample_weight`` parameter to :meth:`pipeline.Pipeline.score`. + :issue:`7723` by :user:`Mikhail Korobov `. + + - Added ability to set ``n_jobs`` parameter to :func:`pipeline.make_union`. + A ``TypeError`` will be raised for any other kwargs. :issue:`8028` + by :user:`Alexander Booth `. + + - :class:`model_selection.GridSearchCV`, :class:`model_selection.RandomizedSearchCV` + and :func:`model_selection.cross_val_score` now allow estimators with callable + kernels which were previously prohibited. :issue:`8005` by `Andreas Müller`_ . + + - :func:`model_selection.cross_val_predict` now returns output of the + correct shape for all values of the argument ``method``. + :issue:`7863` by :user:`Aman Dalmia `. + + - Added ``shuffle`` and ``random_state`` parameters to shuffle training + data before taking prefixes of it based on training sizes in + :func:`model_selection.learning_curve`. + :issue:`7506` by :user:`Narine Kokhlikyan `. + - Speed improvements to :class:`model_selection.StratifiedShuffleSplit`. :issue:`5991` by :user:`Arthur Mensch ` and `Joel Nothman`_. + - :class:`multioutput.MultiOutputRegressor` and :class:`multioutput.MultiOutputClassifier` + now support online learning using `partial_fit`. + issue: `8053` by :user:`Peng Yu `. + + - Add ``max_train_size`` parameter to :class:`model_selection.TimeSeriesSplit` + :issue:`8282` by :user:`Aman Dalmia `. + +Metrics + + - :func:`metrics.matthews_corrcoef` now support multiclass classification. + :issue:`8094` by :user:`Jon Crall `. + + - Add ``sample_weight`` parameter to :func:`metrics.cohen_kappa_score`. + :issue:`8335` by :user:`Victor Poughon `. + +Miscellaneous + + - :func:`utils.check_estimator` now attempts to ensure that methods transform, predict, etc. + do not set attributes on the estimator. + :issue:`7533` by :user:`Ekaterina Krivich `. + + - Added type checking to the ``accept_sparse`` parameter in + :mod:`sklearn.utils.validation` methods. This parameter now accepts only + boolean, string, or list/tuple of strings. ``accept_sparse=None`` is deprecated + and should be replaced by ``accept_sparse=False``. + :issue:`7880` by :user:`Josh Karnofsky `. + + - Make it possible to load a chunk of an svmlight formatted file by + passing a range of bytes to :func:`datasets.load_svmlight_file`. + :issue:`935` by :user:`Olivier Grisel `. + Bug fixes ......... +TODO + - :func:`metrics.average_precision_score` no longer linearly interpolates between operating points, and instead weighs precisions by the change in recall since the last operating point, as per the @@ -443,6 +470,38 @@ Bug fixes :class:`decomposition.IncrementalPCA`. :issue:`9105` by `Hanmin Qin `_. +Trees and ensembles +Linear, kernelized and related models +Decomposition, manifold learning and clustering +Preprocessing and feature selection + + - For sparse matrices, :func:`preprocessing.normalize` with ``return_norm=True`` + will now raise a ``NotImplementedError`` with 'l1' or 'l2' norm and with + norm 'max' the norms returned will be the same as for dense matrices. + :issue:`7771` by `Ang Lu `_. + + - Fix a bug where :class:`feature_selection.SelectFdr` did not + exactly implement Benjamini-Hochberg procedure. It formerly may have + selected fewer features than it should. + :issue:`7490` by :user:`Peng Meng `. 
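The ``max_train_size`` option added to :class:`model_selection.TimeSeriesSplit` above can be illustrated with a small sketch; the toy array and ``n_splits=3`` are illustrative assumptions::

    import numpy as np
    from sklearn.model_selection import TimeSeriesSplit

    X = np.arange(10).reshape(10, 1)
    tscv = TimeSeriesSplit(n_splits=3, max_train_size=4)
    for train_index, test_index in tscv.split(X):
        # The training window never grows beyond four samples.
        print(train_index, test_index)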
+ + +Model evaluation and meta-estimators +Metrics +Miscellaneous + + - Added ``classes_`` attribute to :class:`model_selection.GridSearchCV`, + :class:`model_selection.RandomizedSearchCV`, :class:`grid_search.GridSearchCV`, + and :class:`grid_search.RandomizedSearchCV` that matches the ``classes_`` + attribute of ``best_estimator_``. :issue:`7661` and :issue:`8295` + by :user:`Alyssa Batula `, :user:`Dylan Werner-Meier `, + and :user:`Stephen Hoover `. + + - Update Sphinx-Gallery from 0.1.4 to 0.1.7 for resolving links in + documentation build with Sphinx>1.5 :issue:`8010`, :issue:`7986` by + :user:`Oscar Najera ` + + API changes summary ------------------- @@ -526,11 +585,11 @@ API changes summary - The ``n_topics`` parameter of :class:`decomposition.LatentDirichletAllocation` has been renamed to ``n_components`` and will be removed in version 0.21. - :issue:`8922` by :user:`Attractadore` + :issue:`8922` by :user:`Attractadore`. - :class:`cluster.bicluster.SpectralCoclustering` and :class:`cluster.bicluster.SpectralBiclustering` now accept ``y`` in fit. - :issue:`6126` by :user:ldirer + :issue:`6126` by :user:`Laurent Direr `. - :class:`neighbors.LSHForest` has been deprecated and will be removed in 0.21 due to poor performance. @@ -574,6 +633,14 @@ API changes summary :issue:`8174` by :user:`Tahar Zanouda `, `Alexandre Gramfort`_ and `Raghav RV`_. +Trees and ensembles +Linear, kernelized and related models +Decomposition, manifold learning and clustering +Preprocessing and feature selection +Model evaluation and meta-estimators +Metrics +Miscellaneous + .. _changes_0_18_1: From b504e9e4f6e22eee1d08bfbb87c21a6fa9c88154 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Fri, 30 Jun 2017 14:59:19 +1000 Subject: [PATCH 02/19] More cleaning up --- doc/whats_new.rst | 315 ++++++++++++++++++++++++---------------------- 1 file changed, 162 insertions(+), 153 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index ec8709a3c8e66..6d6d175154fce 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -23,6 +23,9 @@ Multinomial logistic regression with L1 loss. ?Rewrite of TSNE +Fix longstanding implementation erorr in average_precision_score + + Multi-metric grid search and cross validation Major deprecations @@ -226,6 +229,9 @@ Model evaluation and meta-estimators - Speed improvements to :class:`model_selection.StratifiedShuffleSplit`. :issue:`5991` by :user:`Arthur Mensch ` and `Joel Nothman`_. + - Add ``shuffle`` parameter to :func:`model_selection.train_test_split`. + :issue:`8845` by :user:`themrmax ` + - :class:`multioutput.MultiOutputRegressor` and :class:`multioutput.MultiOutputClassifier` now support online learning using `partial_fit`. issue: `8053` by :user:`Peng Yu `. @@ -260,110 +266,32 @@ Miscellaneous Bug fixes ......... -TODO - - - :func:`metrics.average_precision_score` no longer linearly - interpolates between operating points, and instead weighs precisions - by the change in recall since the last operating point, as per the - `Wikipedia entry `_. - (`#7356 `_). By - :user:`Nick Dingwall ` and `Gael Varoquaux`_. - - - Fixed a bug in :class:`covariance.MinCovDet` where inputting data - that produced a singular covariance matrix would cause the helper method - ``_c_step`` to throw an exception. - :issue:`3367` by :user:`Jeremy Steward ` +Trees and ensembles - Fixed a bug where :class:`ensemble.IsolationForest` uses an an incorrect formula for the average path length :issue:`8549` by `Peter Wang `_. 
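A minimal sketch of the new ``shuffle`` parameter of :func:`model_selection.train_test_split` mentioned above; the data and ``test_size`` are toy choices::

    import numpy as np
    from sklearn.model_selection import train_test_split

    X = np.arange(10).reshape(5, 2)
    y = np.arange(5)
    # shuffle=False keeps the original ordering, e.g. for time-ordered data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.4, shuffle=False)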
- - Fixed a bug where :class:`cluster.DBSCAN` gives incorrect - result when input is a precomputed sparse matrix with initial - rows all zero. :issue:`8306` by :user:`Akshay Gupta ` - - Fixed a bug where :class:`ensemble.AdaBoostClassifier` throws ``ZeroDivisionError`` while fitting data with single class labels. :issue:`7501` by :user:`Dominik Krzeminski `. - - Fixed a bug when :func:`datasets.make_classification` fails - when generating more than 30 features. :issue:`8159` by - :user:`Herilalaina Rakotoarison `. - - - Fixed a bug where :func:`model_selection.BaseSearchCV.inverse_transform` - returns ``self.best_estimator_.transform()`` instead of - ``self.best_estimator_.inverse_transform()``. - :issue:`8344` by :user:`Akshay Gupta `. - - - Fixed same issue in :func:`grid_search.BaseSearchCV.inverse_transform` - :issue:`8846` by :user:`Rasmus Eriksson ` - - - Fixed a bug where :class:`linear_model.RandomizedLasso` and - :class:`linear_model.RandomizedLogisticRegression` breaks for - sparse input. :issue:`8259` by :user:`Aman Dalmia `. - - - Fixed a bug where :func:`linear_model.RANSACRegressor.fit` may run until - ``max_iter`` if finds a large inlier group early. :issue:`8251` by :user:`aivision2020`. - - - Fixed a bug where :class:`sklearn.naive_bayes.MultinomialNB` and :class:`sklearn.naive_bayes.BernoulliNB` - failed when `alpha=0`. :issue:`5814` by :user:`Yichuan Liu ` and - :user:`Herilalaina Rakotoarison `. - - - Fixed a bug where :func:`datasets.make_moons` gives an - incorrect result when ``n_samples`` is odd. - :issue:`8198` by :user:`Josh Levy `. - - - Fixed a bug where :class:`linear_model.LassoLars` does not give - the same result as the LassoLars implementation available - in R (lars library). :issue:`7849` by :user:`Jair Montoya Martinez `. - - - Some ``fetch_`` functions in :mod:`sklearn.datasets` were ignoring the - ``download_if_missing`` keyword. :issue:`7944` by :user:`Ralf Gommers `. - - Fixed a bug in :class:`ensemble.GradientBoostingClassifier` and :class:`ensemble.GradientBoostingRegressor` where a float being compared to ``0.0`` using ``==`` caused a divide by zero error. issue:`7970` by :user:`He Chen `. - - Fix a bug regarding fitting :class:`cluster.KMeans` with a sparse - array X and initial centroids, where X's means were unnecessarily being - subtracted from the centroids. :issue:`7872` by :user:`Josh Karnofsky `. - - - Fix estimators to accept a ``sample_weight`` parameter of type - ``pandas.Series`` in their ``fit`` function. :issue:`7825` by - `Kathleen Chen`_. - - - Fixed a bug where :class:`ensemble.IsolationForest` fails when - ``max_features`` is less than 1. - :issue:`5732` by :user:`Ishank Gulati `. - - - Fix a bug where :class:`ensemble.VotingClassifier` raises an error - when a numpy array is passed in for weights. :issue:`7983` by - :user:`Vincent Pham `. - - - Fix a bug in :class:`decomposition.LatentDirichletAllocation` - where the ``perplexity`` method was returning incorrect results because - the ``transform`` method returns normalized document topic distributions - as of version 0.18. :issue:`7954` by :user:`Gary Foreman `. - - Fix a bug where :class:`ensemble.GradientBoostingClassifier` and :class:`ensemble.GradientBoostingRegressor` ignored the ``min_impurity_split`` parameter. :issue:`8006` by :user:`Sebastian Pölsterl `. - - Fixes to the input validation in :class:`covariance.EllipticEnvelope`. - :issue:`8086` by `Andreas Müller`_. 
- - - Fix output shape and bugs with n_jobs > 1 in - :class:`decomposition.SparseCoder` transform and - :func:`decomposition.sparse_encode` - for one-dimensional data and one component. - This also impacts the output shape of :class:`decomposition.DictionaryLearning`. - :issue:`8086` by `Andreas Müller`_. + - Fixed oob_score in :class:`ensemble.BaggingClassifier`. + :issue:`8936` by :user:`mlewis1729 ` - - Several fixes to input validation in - :class:`multiclass.OutputCodeClassifier` - :issue:`8086` by `Andreas Müller`_. + - Fixed a bug where :class:`ensemble.IsolationForest` fails when + ``max_features`` is less than 1. + :issue:`5732` by :user:`Ishank Gulati `. - Fix a bug where :class:`ensemble.gradient_boosting.QuantileLossFunction` computed @@ -371,108 +299,107 @@ TODO wrong values when calling ``__call__``. :issue:`8087` by :user:`Alexis Mignon ` - - Fix :func:`multioutput.MultiOutputClassifier.predict_proba` to - return a list of 2d arrays, rather than a 3d array. In the case where - different target columns had different numbers of classes, a `ValueError` - would be raised on trying to stack matrices with different dimensions. - :issue:`8093` by :user:`Peter Bull `. + - Fix a bug where :class:`ensemble.VotingClassifier` raises an error + when a numpy array is passed in for weights. :issue:`7983` by + :user:`Vincent Pham `. - - Fix a bug where :func:`linear_model.LassoLars.fit` sometimes - left `coef_` as a list, rather than an ndarray. - :issue:`8160` by :user:`CJ Carey `. + - Fixed a bug where :func:`tree.export_graphviz` raised an error + when the length of features_names does not match n_features in the decision + tree. :issue:`8512` by :user:`Li Li `. - - Fix a bug where :class:`feature_extraction.FeatureHasher` - mandatorily applied a sparse random projection to the hashed features, - preventing the use of - :class:`feature_extraction.text.HashingVectorizer` in a - pipeline with :class:`feature_extraction.text.TfidfTransformer`. - :issue:`7513` by :user:`Roman Yurchak `. +Linear, kernelized and related models - - Fix a bug in cases where ``numpy.cumsum`` may be numerically unstable, - raising an exception if instability is identified. :issue:`7376` and - :issue:`7331` by `Joel Nothman`_ and :user:`yangarbiter`. + - Fixed a bug where :func:`linear_model.RANSACRegressor.fit` may run until + ``max_iter`` if it finds a large inlier group early. :issue:`8251` by :user:`aivision2020`. - - Fix a bug where :meth:`base.BaseEstimator.__getstate__` - obstructed pickling customizations of child-classes, when used in a - multiple inheritance context. - :issue:`8316` by :user:`Holger Peters `. + - Fixed a bug where :class:`sklearn.naive_bayes.MultinomialNB` and :class:`sklearn.naive_bayes.BernoulliNB` + failed when `alpha=0`. :issue:`5814` by :user:`Yichuan Liu ` and + :user:`Herilalaina Rakotoarison `. - - Fix a bug in :func:`metrics.classification._check_targets` - which would return ``'binary'`` if ``y_true`` and ``y_pred`` were - both ``'binary'`` but the union of ``y_true`` and ``y_pred`` was - ``'multiclass'``. :issue:`8377` by `Loic Esteve`_. + - Fixed a bug where :class:`linear_model.LassoLars` does not give + the same result as the LassoLars implementation available + in R (lars library). :issue:`7849` by :user:`Jair Montoya Martinez `. 
+ + - Fixed a bug in :class:`linear_model.RandomizedLasso`, + :class:`linear_model.Lars`, :class:`linear_model.LassoLars`, + :class:`linear_model.LarsCV` and :class:`linear_model.LassoLarsCV`, + where the parameter ``precompute`` were not used consistently across + classes, and some values proposed in the docstring could raise errors. + :issue:`5359` by `Tom Dupre la Tour`_. + - Fix a bug where :func:`linear_model.LassoLars.fit` sometimes + left `coef_` as a list, rather than an ndarray. + :issue:`8160` by :user:`CJ Carey `. - Fix :func:`linear_model.BayesianRidge.fit` to return ridge parameter `alpha_` and `lambda_` consistent with calculated coefficients `coef_` and `intercept_`. :issue:`8224` by :user:`Peter Gedeck `. - - Fixed a bug in :class:`manifold.TSNE` where it stored the incorrect - ``kl_divergence_``. :issue:`6507` by :user:`Sebastian Saeger `. - - Fixed a bug in :class:`svm.OneClassSVM` where it returned floats instead of integer classes. :issue:`8676` by :user:`Vathsala Achar `. - - Fixed a bug where :func:`tree.export_graphviz` raised an error - when the length of features_names does not match n_features in the decision - tree. :issue:`8512` by :user:`Li Li `. - - - Fixed a bug in :class:`manifold.TSNE` affecting convergence of the - gradient descent. :issue:`8768` by :user:`David DeTomaso `. + - Fix AIC/BIC criterion computation in :class:`linear_model.LassoLarsIC`. + :issue:`9022` by `Alexandre Gramfort`_ and :user:`Mehmet Basbug `. - Fixed a memory leak in our LibLinear implementation. :issue:`9024` by :user:`Sergei Lebedev ` - - Fixed improper scaling in :class:`cross_decomposition.PLSRegression` - with ``scale=True``. :issue:`7819` by :user:`jayzed82 `. - - - Fixed oob_score in :class:`ensemble.BaggingClassifier`. - :issue:`8936` by :user:`mlewis1729 ` - - - Add ``shuffle`` parameter to :func:`model_selection.train_test_split`. - :issue:`8845` by :user:`themrmax ` - - - Fix AIC/BIC criterion computation in :class:`linear_model.LassoLarsIC`. - :issue:`9022` by `Alexandre Gramfort`_ and :user:`Mehmet Basbug `. - Fix bug where stratified CV splitters did not work with :class:`linear_model.LassoCV`. :issue:`8973` by :user:`Paulo Haddad `. - - Fixed a bug in :class:`linear_model.RandomizedLasso`, - :class:`linear_model.Lars`, :class:`linear_model.LassoLars`, - :class:`linear_model.LarsCV` and :class:`linear_model.LassoLarsCV`, - where the parameter ``precompute`` were not used consistently across - classes, and some values proposed in the docstring could raise errors. - :issue:`5359` by `Tom Dupre la Tour`_. - - - Fixed a bug where :func:`model_selection.validation_curve` - reused the same estimator for each parameter value. - :issue:`7365` by :user:`Aleksandr Sandrovskii `. - - - :class:`multiclass.OneVsOneClassifier`'s ``partial_fit`` now ensures all - classes are provided up-front. :issue:`6250` by - :user:`Asish Panda `. - - - Fixed an integer overflow bug in :func:`metrics.confusion_matrix` and - hence :func:`metrics.cohen_kappa_score`. :issue:`8354`, :issue:`7929` - by `Joel Nothman`_ and :user:`Jon Crall `. - - Fixed a bug in :class:`gaussian_process.GaussianProcessRegressor` when the standard deviation and covariance predicted without fit would fail with a unmeaningful error by default. :issue:`6573` by :user:`Quazi Marufur Rahman ` and `Manoj Kumar`_. 
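The :class:`gaussian_process.GaussianProcessRegressor` fix above (predicting before ``fit`` now raises a meaningful error) and the faster ``return_std=True`` path noted among the enhancements can be sketched as below; the default kernel and the toy data are assumptions for illustration::

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    X = np.linspace(0, 10, 20).reshape(-1, 1)
    y = np.sin(X).ravel()
    gpr = GaussianProcessRegressor().fit(X, y)
    # return_std=True also returns the predictive standard deviation.
    mean, std = gpr.predict(X, return_std=True)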
+Decomposition, manifold learning and clustering + + - Fix a bug in :class:`decomposition.LatentDirichletAllocation` + where the ``perplexity`` method was returning incorrect results because + the ``transform`` method returns normalized document topic distributions + as of version 0.18. :issue:`7954` by :user:`Gary Foreman `. + + - Fix output shape and bugs with n_jobs > 1 in + :class:`decomposition.SparseCoder` transform and + :func:`decomposition.sparse_encode` + for one-dimensional data and one component. + This also impacts the output shape of :class:`decomposition.DictionaryLearning`. + :issue:`8086` by `Andreas Müller`_. + - Fixed the implementation of `explained_variance_` in :class:`decomposition.PCA`, :class:`decomposition.RandomizedPCA` and :class:`decomposition.IncrementalPCA`. :issue:`9105` by `Hanmin Qin `_. -Trees and ensembles -Linear, kernelized and related models -Decomposition, manifold learning and clustering + - Fixed a bug where :class:`cluster.DBSCAN` gives incorrect + result when input is a precomputed sparse matrix with initial + rows all zero. :issue:`8306` by :user:`Akshay Gupta ` + + - Fix a bug regarding fitting :class:`cluster.KMeans` with a sparse + array X and initial centroids, where X's means were unnecessarily being + subtracted from the centroids. :issue:`7872` by :user:`Josh Karnofsky `. + + - Fixes to the input validation in :class:`covariance.EllipticEnvelope`. + :issue:`8086` by `Andreas Müller`_. + + - Fixed a bug in :class:`covariance.MinCovDet` where inputting data + that produced a singular covariance matrix would cause the helper method + ``_c_step`` to throw an exception. + :issue:`3367` by :user:`Jeremy Steward ` + + - Fixed a bug in :class:`manifold.TSNE` affecting convergence of the + gradient descent. :issue:`8768` by :user:`David DeTomaso `. + + - Fixed a bug in :class:`manifold.TSNE` where it stored the incorrect + ``kl_divergence_``. :issue:`6507` by :user:`Sebastian Saeger `. + + - Fixed improper scaling in :class:`cross_decomposition.PLSRegression` + with ``scale=True``. :issue:`7819` by :user:`jayzed82 `. + Preprocessing and feature selection - For sparse matrices, :func:`preprocessing.normalize` with ``return_norm=True`` @@ -485,10 +412,24 @@ Preprocessing and feature selection selected fewer features than it should. :issue:`7490` by :user:`Peng Meng `. + - Fixed a bug where :class:`linear_model.RandomizedLasso` and + :class:`linear_model.RandomizedLogisticRegression` breaks for + sparse input. :issue:`8259` by :user:`Aman Dalmia `. + + - Fix a bug where :class:`feature_extraction.FeatureHasher` + mandatorily applied a sparse random projection to the hashed features, + preventing the use of + :class:`feature_extraction.text.HashingVectorizer` in a + pipeline with :class:`feature_extraction.text.TfidfTransformer`. + :issue:`7513` by :user:`Roman Yurchak `. + Model evaluation and meta-estimators -Metrics -Miscellaneous + + - Fixed a bug where :func:`model_selection.BaseSearchCV.inverse_transform` + returns ``self.best_estimator_.transform()`` instead of + ``self.best_estimator_.inverse_transform()``. + :issue:`8344` by :user:`Akshay Gupta ` and :user:`Rasmus Eriksson `. - Added ``classes_`` attribute to :class:`model_selection.GridSearchCV`, :class:`model_selection.RandomizedSearchCV`, :class:`grid_search.GridSearchCV`, @@ -497,6 +438,69 @@ Miscellaneous by :user:`Alyssa Batula `, :user:`Dylan Werner-Meier `, and :user:`Stephen Hoover `. 
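The ``classes_`` attribute added to the search estimators above can be used as sketched here; the iris data, the base estimator and the parameter grid are arbitrary illustrative choices::

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV

    X, y = load_iris(return_X_y=True)
    search = GridSearchCV(LogisticRegression(), {'C': [0.1, 1.0]})
    search.fit(X, y)
    # Mirrors search.best_estimator_.classes_ once the search is fitted.
    print(search.classes_)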
+ - Fixed a bug where :func:`model_selection.validation_curve` + reused the same estimator for each parameter value. + :issue:`7365` by :user:`Aleksandr Sandrovskii `. + + - Several fixes to input validation in + :class:`multiclass.OutputCodeClassifier` + :issue:`8086` by `Andreas Müller`_. + + - :class:`multiclass.OneVsOneClassifier`'s ``partial_fit`` now ensures all + classes are provided up-front. :issue:`6250` by + :user:`Asish Panda `. + + - Fix :func:`multioutput.MultiOutputClassifier.predict_proba` to + return a list of 2d arrays, rather than a 3d array. In the case where + different target columns had different numbers of classes, a `ValueError` + would be raised on trying to stack matrices with different dimensions. + :issue:`8093` by :user:`Peter Bull `. + + +Metrics + + - :func:`metrics.average_precision_score` no longer linearly + interpolates between operating points, and instead weighs precisions + by the change in recall since the last operating point, as per the + `Wikipedia entry `_. + (`#7356 `_). By + :user:`Nick Dingwall ` and `Gael Varoquaux`_. + + - Fix a bug in :func:`metrics.classification._check_targets` + which would return ``'binary'`` if ``y_true`` and ``y_pred`` were + both ``'binary'`` but the union of ``y_true`` and ``y_pred`` was + ``'multiclass'``. :issue:`8377` by `Loic Esteve`_. + + - Fixed an integer overflow bug in :func:`metrics.confusion_matrix` and + hence :func:`metrics.cohen_kappa_score`. :issue:`8354`, :issue:`7929` + by `Joel Nothman`_ and :user:`Jon Crall `. + +Miscellaneous + + - Fixed a bug when :func:`datasets.make_classification` fails + when generating more than 30 features. :issue:`8159` by + :user:`Herilalaina Rakotoarison `. + + - Fixed a bug where :func:`datasets.make_moons` gives an + incorrect result when ``n_samples`` is odd. + :issue:`8198` by :user:`Josh Levy `. + + - Some ``fetch_`` functions in :mod:`sklearn.datasets` were ignoring the + ``download_if_missing`` keyword. :issue:`7944` by :user:`Ralf Gommers `. + + - Fix estimators to accept a ``sample_weight`` parameter of type + ``pandas.Series`` in their ``fit`` function. :issue:`7825` by + `Kathleen Chen`_. + + - Fix a bug in cases where ``numpy.cumsum`` may be numerically unstable, + raising an exception if instability is identified. :issue:`7376` and + :issue:`7331` by `Joel Nothman`_ and :user:`yangarbiter`. + + - Fix a bug where :meth:`base.BaseEstimator.__getstate__` + obstructed pickling customizations of child-classes, when used in a + multiple inheritance context. + :issue:`8316` by :user:`Holger Peters `. + - Update Sphinx-Gallery from 0.1.4 to 0.1.7 for resolving links in documentation build with Sphinx>1.5 :issue:`8010`, :issue:`7986` by :user:`Oscar Najera ` @@ -505,6 +509,11 @@ Miscellaneous API changes summary ------------------- + - The ``non_negative`` parameter in :class:`feature_extraction.FeatureHasher` + has been deprecated, and replaced with a more principled alternative, + ``alternate_sign``. + :issue:`7565` by :user:`Roman Yurchak `. + - Ensure that estimators' attributes ending with ``_`` are not set in the constructor but only in the ``fit`` method. 
Most notably, ensemble estimators (deriving from :class:`ensemble.BaseEnsemble`) From 24c742bde0a5752628c09f57eb8b363b2852e220 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Fri, 30 Jun 2017 15:10:07 +1000 Subject: [PATCH 03/19] More cleaning up --- doc/whats_new.rst | 156 ++++++++++++++++++++++++---------------------- 1 file changed, 80 insertions(+), 76 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index 6d6d175154fce..c73ee6e4f78dd 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -423,7 +423,6 @@ Preprocessing and feature selection pipeline with :class:`feature_extraction.text.TfidfTransformer`. :issue:`7513` by :user:`Roman Yurchak `. - Model evaluation and meta-estimators - Fixed a bug where :func:`model_selection.BaseSearchCV.inverse_transform` @@ -456,7 +455,6 @@ Model evaluation and meta-estimators would be raised on trying to stack matrices with different dimensions. :issue:`8093` by :user:`Peter Bull `. - Metrics - :func:`metrics.average_precision_score` no longer linearly @@ -509,21 +507,23 @@ Miscellaneous API changes summary ------------------- - - The ``non_negative`` parameter in :class:`feature_extraction.FeatureHasher` - has been deprecated, and replaced with a more principled alternative, - ``alternate_sign``. - :issue:`7565` by :user:`Roman Yurchak `. +Trees and ensembles - - Ensure that estimators' attributes ending with ``_`` are not set - in the constructor but only in the ``fit`` method. Most notably, - ensemble estimators (deriving from :class:`ensemble.BaseEnsemble`) - now only have ``self.estimators_`` available after ``fit``. - :issue:`7464` by `Lars Buitinck`_ and `Loic Esteve`_. + - Gradient boosting base models are no longer estimators. By `Andreas Müller`_. - - All checks in ``utils.estimator_checks``, in particular - :func:`utils.estimator_checks.check_estimator` now accept estimator - instances. Most other checks do not accept - estimator classes any more. :issue:`9019` by `Andreas Müller`_. + - All tree based estimators now accept a ``min_impurity_decrease`` + parameter in lieu of the ``min_impurity_split``, which is now deprecated. + The ``min_impurity_decrease`` helps stop splitting the nodes in which + the weighted impurity decrease from splitting is no longer alteast + ``min_impurity_decrease``. :issue:`8449` by `Raghav RV`_. + +Linear, kernelized and related models + + - :class:`neighbors.LSHForest` has been deprecated and will be + removed in 0.21 due to poor performance. + :issue:`8996` by `Andreas Müller`_. + +Decomposition, manifold learning and clustering - Deprecate the ``doc_topic_distr`` argument of the ``perplexity`` method in :class:`decomposition.LatentDirichletAllocation` because the @@ -531,20 +531,30 @@ API changes summary needed for the perplexity calculation. :issue:`7954` by :user:`Gary Foreman `. - - Replace attribute ``named_steps`` ``dict`` to :class:`utils.Bunch` - in :class:`pipeline.Pipeline` to enable tab completion in interactive - environment. In the case conflict value on ``named_steps`` and ``dict`` - attribute, ``dict`` behavior will be prioritized. - :issue:`8481` by :user:`Herilalaina Rakotoarison `. + - The ``n_topics`` parameter of :class:`decomposition.LatentDirichletAllocation` + has been renamed to ``n_components`` and will be removed in version 0.21. + :issue:`8922` by :user:`Attractadore`. - - The :func:`multioutput.MultiOutputClassifier.predict_proba` - function used to return a 3d array (``n_samples``, ``n_classes``, - ``n_outputs``). 
In the case where different target columns had different - numbers of classes, a `ValueError` would be raised on trying to stack - matrices with different dimensions. This function now returns a list of - arrays where the length of the list is ``n_outputs``, and each array is - (``n_samples``, ``n_classes``) for that particular output. - :issue:`8093` by :user:`Peter Bull `. + - :class:`cluster.bicluster.SpectralCoclustering` and + :class:`cluster.bicluster.SpectralBiclustering` now accept ``y`` in fit. + :issue:`6126` by :user:`Laurent Direr `. + +Preprocessing and feature selection + + - :class:`feature_selection.SelectFromModel` now has a ``partial_fit`` + method only if the underlying estimator does. By `Andreas Müller`_. + + - :class:`feature_selection.SelectFromModel` now validates the ``threshold`` + parameter and sets the ``threshold_`` attribute during the call to + ``fit``, and no longer during the call to ``transform```, by `Andreas + Müller`_. + + - The ``non_negative`` parameter in :class:`feature_extraction.FeatureHasher` + has been deprecated, and replaced with a more principled alternative, + ``alternate_sign``. + :issue:`7565` by :user:`Roman Yurchak `. + +Model evaluation and meta-estimators - Deprecate the ``fit_params`` constructor input to the :class:`model_selection.GridSearchCV` and @@ -557,52 +567,42 @@ API changes summary :func:`model_selection.cross_val_predict`. :issue:`2879` by :user:`Stephen Hoover `. - - The ``decision_function`` output shape for binary classification in - :class:`multiclass.OneVsRestClassifier` and - :class:`multiclass.OneVsOneClassifier` is now ``(n_samples,)`` to conform - to scikit-learn conventions. :issue:`9100` by `Andreas Müller`_. - - - Gradient boosting base models are no longer estimators. By `Andreas Müller`_. - - - :class:`feature_selection.SelectFromModel` now validates the ``threshold`` - parameter and sets the ``threshold_`` attribute during the call to - ``fit``, and no longer during the call to ``transform```, by `Andreas - Müller`_. - - - :class:`feature_selection.SelectFromModel` now has a ``partial_fit`` - method only if the underlying estimator does. By `Andreas Müller`_. + - In version 0.21, the default behavior of splitters that use the + ``test_size`` and ``train_size`` parameter will change, such that + specifying ``train_size`` alone will cause ``test_size`` to be the + remainder. :issue:`7459` by :user:`Nelson Liu `. - :class:`multiclass.OneVsRestClassifier` now has a ``partial_fit`` method only if the underlying estimator does. By `Andreas Müller`_. - - Estimators with both methods ``decision_function`` and ``predict_proba`` - are now required to have a monotonic relation between them. The - method ``check_decision_proba_consistency`` has been added in - **sklearn.utils.estimator_checks** to check their consistency. - :issue:`7578` by :user:`Shubham Bhardwaj ` + - The ``decision_function`` output shape for binary classification in + :class:`multiclass.OneVsRestClassifier` and + :class:`multiclass.OneVsOneClassifier` is now ``(n_samples,)`` to conform + to scikit-learn conventions. :issue:`9100` by `Andreas Müller`_. - - In version 0.21, the default behavior of splitters that use the - ``test_size`` and ``train_size`` parameter will change, such that - specifying ``train_size`` alone will cause ``test_size`` to be the - remainder. :issue:`7459` by :user:`Nelson Liu `. + - The :func:`multioutput.MultiOutputClassifier.predict_proba` + function used to return a 3d array (``n_samples``, ``n_classes``, + ``n_outputs``). 
In the case where different target columns had different + numbers of classes, a `ValueError` would be raised on trying to stack + matrices with different dimensions. This function now returns a list of + arrays where the length of the list is ``n_outputs``, and each array is + (``n_samples``, ``n_classes``) for that particular output. + :issue:`8093` by :user:`Peter Bull `. - - All tree based estimators now accept a ``min_impurity_decrease`` - parameter in lieu of the ``min_impurity_split``, which is now deprecated. - The ``min_impurity_decrease`` helps stop splitting the nodes in which - the weighted impurity decrease from splitting is no longer alteast - ``min_impurity_decrease``. :issue:`8449` by `Raghav RV`_. + - Replace attribute ``named_steps`` ``dict`` to :class:`utils.Bunch` + in :class:`pipeline.Pipeline` to enable tab completion in interactive + environment. In the case conflict value on ``named_steps`` and ``dict`` + attribute, ``dict`` behavior will be prioritized. + :issue:`8481` by :user:`Herilalaina Rakotoarison `. - - The ``n_topics`` parameter of :class:`decomposition.LatentDirichletAllocation` - has been renamed to ``n_components`` and will be removed in version 0.21. - :issue:`8922` by :user:`Attractadore`. +Metrics - - :class:`cluster.bicluster.SpectralCoclustering` and - :class:`cluster.bicluster.SpectralBiclustering` now accept ``y`` in fit. - :issue:`6126` by :user:`Laurent Direr `. +Miscellaneous - - :class:`neighbors.LSHForest` has been deprecated and will be - removed in 0.21 due to poor performance. - :issue:`8996` by `Andreas Müller`_. + - Deprecate the ``y`` parameter in `transform` and `inverse_transform`. + The method should not accept ``y`` parameter, as it's used at the prediction time. + :issue:`8174` by :user:`Tahar Zanouda `, `Alexandre Gramfort`_ + and `Raghav RV`_. - SciPy >= 0.13.3 and NumPy >= 1.8.2 are now the minimum supported versions for scikit-learn. The following backported functions in @@ -637,18 +637,22 @@ API changes summary - ``utils.stats.rankdata`` - ``neighbors.approximate.LSHForest`` - - Deprecate the ``y`` parameter in `transform` and `inverse_transform`. - The method should not accept ``y`` parameter, as it's used at the prediction time. - :issue:`8174` by :user:`Tahar Zanouda `, `Alexandre Gramfort`_ - and `Raghav RV`_. + - Estimators with both methods ``decision_function`` and ``predict_proba`` + are now required to have a monotonic relation between them. The + method ``check_decision_proba_consistency`` has been added in + **sklearn.utils.estimator_checks** to check their consistency. + :issue:`7578` by :user:`Shubham Bhardwaj ` -Trees and ensembles -Linear, kernelized and related models -Decomposition, manifold learning and clustering -Preprocessing and feature selection -Model evaluation and meta-estimators -Metrics -Miscellaneous + - All checks in ``utils.estimator_checks``, in particular + :func:`utils.estimator_checks.check_estimator` now accept estimator + instances. Most other checks do not accept + estimator classes any more. :issue:`9019` by `Andreas Müller`_. + + - Ensure that estimators' attributes ending with ``_`` are not set + in the constructor but only in the ``fit`` method. Most notably, + ensemble estimators (deriving from :class:`ensemble.BaseEnsemble`) + now only have ``self.estimators_`` available after ``fit``. + :issue:`7464` by `Lars Buitinck`_ and `Loic Esteve`_. .. 
_changes_0_18_1: From 187ee22942c1f0168d08e0236e1d45ba4ca598dd Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Sat, 1 Jul 2017 20:37:14 +1000 Subject: [PATCH 04/19] Deprecations --- doc/modules/classes.rst | 14 ++++++++++++-- doc/whats_new.rst | 13 ++++++------- 2 files changed, 18 insertions(+), 9 deletions(-) diff --git a/doc/modules/classes.rst b/doc/modules/classes.rst index 5399e27ef4d08..09dd288a85dc0 100644 --- a/doc/modules/classes.rst +++ b/doc/modules/classes.rst @@ -723,8 +723,6 @@ Kernels: linear_model.PassiveAggressiveClassifier linear_model.PassiveAggressiveRegressor linear_model.Perceptron - linear_model.RandomizedLasso - linear_model.RandomizedLogisticRegression linear_model.RANSACRegressor linear_model.Ridge linear_model.RidgeClassifier @@ -1391,6 +1389,18 @@ Recently deprecated =================== +To be removed in 0.21 +--------------------- + +.. autosummary:: + :toctree: generated/ + :template: deprecated_class.rst + + linear_model.RandomizedLasso + linear_model.RandomizedLogisticRegression + neighbors.LSHForest + + To be removed in 0.20 --------------------- diff --git a/doc/whats_new.rst b/doc/whats_new.rst index c73ee6e4f78dd..04a70f07c7247 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -1,4 +1,4 @@ -.. currentmodule:: sklearn + ..currentmodule:: sklearn =============== @@ -28,12 +28,11 @@ Fix longstanding implementation erorr in average_precision_score Multi-metric grid search and cross validation -Major deprecations ------------------- - -TODO - -We have deprecated RandomizedLasso and RandomizedLogisticRegression and LSHForest because they weren't appropriate or up to standards. We have deprecated a number of utilities no longer necessary now that we require Scipy 0.13.3 and Numpy 1.8.2 at a minimum. +Note also that we have deprecated RandomizedLasso, +RandomizedLogisticRegression and LSHForest because they weren't +appropriate or up to standards. We have deprecated a number of +utilities no longer necessary now that we require Scipy 0.13.3 and +Numpy 1.8.2 at a minimum. Changed models -------------- From 37df822a79cca2427e1f141b6d1945540a204dd9 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Sat, 1 Jul 2017 20:45:40 +1000 Subject: [PATCH 05/19] Clean up merge --- doc/whats_new.rst | 42 +++++++++++++++++++----------------------- 1 file changed, 19 insertions(+), 23 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index 9ba019e59e7d6..5a8b9f2395d40 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -152,12 +152,6 @@ Linear, kernelized and related models attributes, ``n_skips_*``. :issue:`7914` by :user:`Michael Horrell `. - - Relax assumption on the data for the - :class:`kernel_approximation.SkewedChi2Sampler`. Since the Skewed-Chi2 - kernel is defined on the open interval :math:`(-skewedness; +\infty)^d`, - the transform function should not check whether ``X < 0`` but whether ``X < - -self.skewedness``. :issue:`7573` by :user:`Romain Brault `. - - Custom metrics for the :mod:`neighbors` binary trees now have fewer constraints: they must take two 1d-arrays and return a float. :issue:`6288` by `Jake Vanderplas`_. @@ -199,6 +193,16 @@ Preprocessing and feature selection :mod:`feature_extraction.text` by binding methods for loops and special-casing unigrams. :issue:`7567` by `Jaye Doepke ` + - Relax assumption on the data for the + :class:`kernel_approximation.SkewedChi2Sampler`. 
Since the Skewed-Chi2 + kernel is defined on the open interval :math:`(-skewedness; +\infty)^d`, + the transform function should not check whether ``X < 0`` but whether ``X < + -self.skewedness``. :issue:`7573` by :user:`Romain Brault `. + + - Made default kernel parameters kernel-dependent in + :class:`kernel_approximation.Nystroem`. + :issue:`5229` by :user:`mth4saurabh` and `Andreas Müller`_. + Model evaluation and meta-estimators - :class:`pipeline.Pipeline` allows to cache transformers @@ -472,6 +476,10 @@ Metrics hence :func:`metrics.cohen_kappa_score`. :issue:`8354`, :issue:`7929` by `Joel Nothman`_ and :user:`Jon Crall `. + - Fixed passing of ``gamma`` parameter to the ``chi2`` kernel in + :func:`metrics.pairwise_kernels` :issue:`5211` by :user:`nrhine1`, + :user:`mth4saurabh` and `Andreas Müller`_. + Miscellaneous - Fixed a bug when :func:`datasets.make_classification` fails @@ -501,18 +509,6 @@ Miscellaneous - Update Sphinx-Gallery from 0.1.4 to 0.1.7 for resolving links in documentation build with Sphinx>1.5 :issue:`8010`, :issue:`7986` by :user:`Oscar Najera ` - - Made default kernel parameters kernel-dependent in :class:`kernel_approximation.Nystroem` - :issue:`5229` by :user:`mth4saurabh` and `Andreas Müller`_. - - - Fixed passing of ``gamma`` parameter to the ``chi2`` kernel in - :func:`metrics.pairwise_kernels` :issue:`5211` by :user:`nrhine1`, - :user:`mth4saurabh` and `Andreas Müller`_. - - - Fixed a bug in :class:`gaussian_process.GaussianProcessRegressor` - when the standard deviation and covariance predicted without fit - would fail with a unmeaningful error by default. - :issue:`6573` by :user:`Quazi Marufur Rahman ` and - `Manoj Kumar`_. API changes summary @@ -565,6 +561,11 @@ Preprocessing and feature selection ``alternate_sign``. :issue:`7565` by :user:`Roman Yurchak `. + - :class:`linear_model.RandomizedLogisticRegression`, + and :class:`linear_model.RandomizedLasso` have been deprecated and will + be removed in version 0.21. + :issue: `8995` by :user:`Ramana.S `. + Model evaluation and meta-estimators - Deprecate the ``fit_params`` constructor input to the @@ -646,8 +647,6 @@ Miscellaneous - ``utils.random.choice`` - ``utils.sparsetools.connected_components`` - ``utils.stats.rankdata`` - - ``neighbors.approximate.LSHForest`` - - ``linear_model.randomized_l1`` - Estimators with both methods ``decision_function`` and ``predict_proba`` are now required to have a monotonic relation between them. The @@ -1391,9 +1390,6 @@ Model evaluation and meta-estimators the parameter ``n_labels`` is renamed to ``n_groups``. :issue:`6660` by `Raghav RV`_. - - The :mod:`sklearn.linear_model.randomized_l1` is deprecated. - :issue: `8995` by :user:`Ramana.S `. - Code Contributors ----------------- Aditya Joshi, Alejandro, Alexander Fabisch, Alexander Loginov, Alexander From ce535801af9a11c253180f1ce0908aa655963a2c Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Tue, 4 Jul 2017 13:53:07 +1000 Subject: [PATCH 06/19] Update --- doc/whats_new.rst | 36 ++++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index c2819595c83bc..ac6297c3be2d9 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -1,4 +1,4 @@ - ..currentmodule:: sklearn +.. currentmodule:: sklearn =============== @@ -23,6 +23,11 @@ Multinomial logistic regression with L1 loss. ?Rewrite of TSNE +:class:`semi_supervised.LabelSpreading` and +:class:`semi_supervised.LabelPropagation` have had substantial fixes. 
+Propagation was previously broekn. Spreading should now function better +with respect to parameters. + Fix longstanding implementation erorr in average_precision_score @@ -42,7 +47,9 @@ parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures. - * :class:`sklearn.ensemble.IsolationForest` (bug fix) + * :class:`ensemble.IsolationForest` (bug fix) + * :class:`semi_supervised.LabelSpreading` (bug fix) + * :class:`semi_supervised.LabelPropagation` (bug fix) * TODO Details are listed in the changelog below. @@ -312,6 +319,12 @@ Trees and ensembles Linear, kernelized and related models + - Fix :class:`semi_supervised.BaseLabelPropagation` to correctly implement + ``LabelPropagation`` and ``LabelSpreading`` as done in the referenced + papers. :issue:`9239` + by :user:`Andre Ambrosio Boechat `, :user:`Utkarsh Upadhyay + `, and `Joel Nothman`_. + - Fixed a bug where :func:`linear_model.RANSACRegressor.fit` may run until ``max_iter`` if it finds a large inlier group early. :issue:`8251` by :user:`aivision2020`. @@ -523,20 +536,6 @@ Trees and ensembles The ``min_impurity_decrease`` helps stop splitting the nodes in which the weighted impurity decrease from splitting is no longer alteast ``min_impurity_decrease``. :issue:`8449` by `Raghav RV`_. - - Fixed the implementation of `explained_variance_` - in :class:`decomposition.PCA`, - :class:`decomposition.RandomizedPCA` and - :class:`decomposition.IncrementalPCA`. - :issue:`9105` by `Hanmin Qin `_. - - - Fix :class:`semi_supervised.BaseLabelPropagation` to correctly implement - ``LabelPropagation`` and ``LabelSpreading`` as done in the referenced - papers. :class:`semi_supervised.LabelPropagation` now always does hard - clamping. Its ``alpha`` parameter has no effect and is - deprecated to be removed in 0.21. :issue:`6727` :issue:`3550` issue:`5770` - by :user:`Andre Ambrosio Boechat `, :user:`Utkarsh Upadhyay - `, and `Joel Nothman`_. - Linear, kernelized and related models @@ -544,6 +543,11 @@ Linear, kernelized and related models removed in 0.21 due to poor performance. :issue:`8996` by `Andreas Müller`_. + - The ``alpha`` parameter of :class:`semi_supervised.LabelPropagation` now + has no effect and is deprecated to be removed in 0.21. :issue:`9239` + by :user:`Andre Ambrosio Boechat `, :user:`Utkarsh Upadhyay + `, and `Joel Nothman`_. + Decomposition, manifold learning and clustering - Deprecate the ``doc_topic_distr`` argument of the ``perplexity`` method From 89d5ea9111ea34ce965788d2bce7948769164c86 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Wed, 5 Jul 2017 11:12:33 +1000 Subject: [PATCH 07/19] TODOs to prose and minor changes --- doc/whats_new.rst | 73 ++++++++++++++++++++++++++--------------------- 1 file changed, 40 insertions(+), 33 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index ac6297c3be2d9..b16a91d02aefe 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -15,29 +15,32 @@ Highlights TODO: -This release includes a number of great new features including Local Outlier Factor for anomaly detection, QuantileTransformer for robust feature transformation, and ClassifierChain to simply account for dependencies between classes in multilabel problems. 
+This release includes a number of great new features including +:class:`neighbors.LocalOutlierFactor` for anomaly detection, +:class:`preprocessing.QuantileTransformer` for robust feature +transformation, and :class:`multioutput.ClassifierChain` to simply +account for dependencies between classes in multilabel problems. We +have some new algorithms in existing estimators, such as +multiplicative update in :class:`decomposition.NMF` and multinomial +:class:`linear_model.LogisticRegression` with L1 loss. + +You can learn faster. The new option to cache transformations in +:class:`pipeline.Pipeline` makes grid search over pipelines including +slow transformations much more efficient. + +And you can predict faster. If you're sure you know what you're doing, +you can turn off some validation using :func:`config_context`. -Pipeline caching makes grid search over pipelines including slow transformations much more efficient. - -Multinomial logistic regression with L1 loss. - -?Rewrite of TSNE +Multi-metric grid search and cross validation +We've made some important fixes too. +TODO: ?Rewrite of TSNE +We've fixed a longstanding implementation erorr in :func:`metrics.average_precision_score`. :class:`semi_supervised.LabelSpreading` and :class:`semi_supervised.LabelPropagation` have had substantial fixes. Propagation was previously broekn. Spreading should now function better with respect to parameters. -Fix longstanding implementation erorr in average_precision_score - - -Multi-metric grid search and cross validation - -Note also that we have deprecated RandomizedLasso, -RandomizedLogisticRegression and LSHForest because they weren't -appropriate or up to standards. We have deprecated a number of -utilities no longer necessary now that we require Scipy 0.13.3 and -Numpy 1.8.2 at a minimum. Changed models -------------- @@ -159,10 +162,6 @@ Linear, kernelized and related models attributes, ``n_skips_*``. :issue:`7914` by :user:`Michael Horrell `. - - Custom metrics for the :mod:`neighbors` binary trees now have - fewer constraints: they must take two 1d-arrays and return a float. - :issue:`6288` by `Jake Vanderplas`_. - - In :class:`gaussian_process.GaussianProcessRegressor`, method ``predict`` is a lot faster with ``return_std=True``. :issue:`8591` by :user:`Hadrien Bertrand `. @@ -173,9 +172,15 @@ Linear, kernelized and related models - Memory usage enhancement: Prevent cast from float32 to float64 in :class:`linear_model.Ridge` when using svd, sparse_cg, cholesky or lsqr solvers - :class:`sklearn.linear_model.Ridge` when using svd, sparse_cg, cholesky or lsqr solvers + :class:`linear_model.Ridge` when using svd, sparse_cg, cholesky or lsqr solvers by :user:`Joan Massich `, :user:`Nicolas Cordier ` +Other predictors + + - Custom metrics for the :mod:`neighbors` binary trees now have + fewer constraints: they must take two 1d-arrays and return a float. + :issue:`6288` by `Jake Vanderplas`_. + Decomposition, manifold learning and clustering - :class:`cluster.MiniBatchKMeans` and :class:`cluster.KMeans` @@ -264,7 +269,7 @@ Miscellaneous :issue:`7533` by :user:`Ekaterina Krivich `. - Added type checking to the ``accept_sparse`` parameter in - :mod:`sklearn.utils.validation` methods. This parameter now accepts only + :mod:`utils.validation` methods. This parameter now accepts only boolean, string, or list/tuple of strings. ``accept_sparse=None`` is deprecated and should be replaced by ``accept_sparse=False``. :issue:`7880` by :user:`Josh Karnofsky `. 
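As a rough illustration of the ``accept_sparse`` behaviour described in the entry
above (the toy matrix and the list of formats are only an example, not part of the
changelog)::

    from scipy import sparse
    from sklearn.utils import check_array

    X = sparse.random(10, 5, density=0.3, format='csc', random_state=0)

    # A list/tuple of strings whitelists sparse formats; the input is only
    # converted when it is not already in one of the listed formats.
    X_checked = check_array(X, accept_sparse=['csr', 'csc'])

    # accept_sparse=False (the replacement for the deprecated
    # accept_sparse=None) rejects sparse input with a TypeError.
    try:
        check_array(X, accept_sparse=False)
    except TypeError as exc:
        print(exc)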
@@ -319,16 +324,10 @@ Trees and ensembles Linear, kernelized and related models - - Fix :class:`semi_supervised.BaseLabelPropagation` to correctly implement - ``LabelPropagation`` and ``LabelSpreading`` as done in the referenced - papers. :issue:`9239` - by :user:`Andre Ambrosio Boechat `, :user:`Utkarsh Upadhyay - `, and `Joel Nothman`_. - - Fixed a bug where :func:`linear_model.RANSACRegressor.fit` may run until ``max_iter`` if it finds a large inlier group early. :issue:`8251` by :user:`aivision2020`. - - Fixed a bug where :class:`sklearn.naive_bayes.MultinomialNB` and :class:`sklearn.naive_bayes.BernoulliNB` + - Fixed a bug where :class:`naive_bayes.MultinomialNB` and :class:`naive_bayes.BernoulliNB` failed when `alpha=0`. :issue:`5814` by :user:`Yichuan Liu ` and :user:`Herilalaina Rakotoarison `. @@ -371,6 +370,14 @@ Linear, kernelized and related models :issue:`6573` by :user:`Quazi Marufur Rahman ` and `Manoj Kumar`_. +Other predictors + + - Fix :class:`semi_supervised.BaseLabelPropagation` to correctly implement + ``LabelPropagation`` and ``LabelSpreading`` as done in the referenced + papers. :issue:`9239` + by :user:`Andre Ambrosio Boechat `, :user:`Utkarsh Upadhyay + `, and `Joel Nothman`_. + Decomposition, manifold learning and clustering - Fix a bug in :class:`decomposition.LatentDirichletAllocation` @@ -503,7 +510,7 @@ Miscellaneous incorrect result when ``n_samples`` is odd. :issue:`8198` by :user:`Josh Levy `. - - Some ``fetch_`` functions in :mod:`sklearn.datasets` were ignoring the + - Some ``fetch_`` functions in :mod:`datasets` were ignoring the ``download_if_missing`` keyword. :issue:`7944` by :user:`Ralf Gommers `. - Fix estimators to accept a ``sample_weight`` parameter of type @@ -537,7 +544,7 @@ Trees and ensembles the weighted impurity decrease from splitting is no longer alteast ``min_impurity_decrease``. :issue:`8449` by `Raghav RV`_. -Linear, kernelized and related models +Other predictors - :class:`neighbors.LSHForest` has been deprecated and will be removed in 0.21 due to poor performance. @@ -636,7 +643,7 @@ Miscellaneous - SciPy >= 0.13.3 and NumPy >= 1.8.2 are now the minimum supported versions for scikit-learn. The following backported functions in - :mod:`sklearn.utils` have been removed or deprecated accordingly. + :mod:`utils` have been removed or deprecated accordingly. :issue:`8854` and :issue:`8874` by :user:`Naoya Kanai ` Removed in 0.19: @@ -669,7 +676,7 @@ Miscellaneous - Estimators with both methods ``decision_function`` and ``predict_proba`` are now required to have a monotonic relation between them. The method ``check_decision_proba_consistency`` has been added in - **sklearn.utils.estimator_checks** to check their consistency. + **utils.estimator_checks** to check their consistency. :issue:`7578` by :user:`Shubham Bhardwaj ` - All checks in ``utils.estimator_checks``, in particular From 736e93feda98f3ea77fa56c9c1012d92cccf8c18 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Wed, 5 Jul 2017 21:14:27 +1000 Subject: [PATCH 08/19] Changed models and minor fixes --- doc/modules/pipeline.rst | 2 ++ doc/whats_new.rst | 49 ++++++++++++++++++++++++---------------- 2 files changed, 31 insertions(+), 20 deletions(-) diff --git a/doc/modules/pipeline.rst b/doc/modules/pipeline.rst index b098ec04a999a..4356b3fe8d640 100644 --- a/doc/modules/pipeline.rst +++ b/doc/modules/pipeline.rst @@ -124,6 +124,8 @@ i.e. if the last estimator is a classifier, the :class:`Pipeline` can be used as a classifier. 
If the last estimator is a transformer, again, so is the pipeline. +.. _pipeline_cache: + Caching transformers: avoid repeated computation ------------------------------------------------- diff --git a/doc/whats_new.rst b/doc/whats_new.rst index b16a91d02aefe..1eb642b01d55d 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -13,8 +13,6 @@ Version 0.19 Highlights ---------- -TODO: - This release includes a number of great new features including :class:`neighbors.LocalOutlierFactor` for anomaly detection, :class:`preprocessing.QuantileTransformer` for robust feature @@ -24,12 +22,12 @@ have some new algorithms in existing estimators, such as multiplicative update in :class:`decomposition.NMF` and multinomial :class:`linear_model.LogisticRegression` with L1 loss. -You can learn faster. The new option to cache transformations in -:class:`pipeline.Pipeline` makes grid search over pipelines including -slow transformations much more efficient. - +You can learn faster. The :ref:`new option to cache transformations +` in :class:`pipeline.Pipeline` makes grid search over +pipelines including slow transformations much more efficient. And you can predict faster. If you're sure you know what you're doing, -you can turn off some validation using :func:`config_context`. +you can turn off validating that the input is finite using +:func:`config_context`. Multi-metric grid search and cross validation @@ -38,9 +36,8 @@ TODO: ?Rewrite of TSNE We've fixed a longstanding implementation erorr in :func:`metrics.average_precision_score`. :class:`semi_supervised.LabelSpreading` and :class:`semi_supervised.LabelPropagation` have had substantial fixes. -Propagation was previously broekn. Spreading should now function better -with respect to parameters. - +Propagation was previously broken. Spreading should now correctly +respect its alpha parameter. Changed models -------------- @@ -53,7 +50,18 @@ random sampling procedures. * :class:`ensemble.IsolationForest` (bug fix) * :class:`semi_supervised.LabelSpreading` (bug fix) * :class:`semi_supervised.LabelPropagation` (bug fix) - * TODO + * tree based models where ``min_weight_fraction_leaf`` is used (enhancement) + * :class:`ensemble.GradientBoostingClassifier` and + :class:`ensemble.GradientBoostingRegressor` where ``min_impurity_split`` is used (bug fix) + * gradient boosting with :class:`ensemble.gradient_boosting.QuantileLossFunction` (bug fix) + * :class:`linear_model.RANSACRegressor` (bug fix) + * :class:`linear_model.LassoLars` (bug fix) + * :class:`linear_model.LassoLarsIC` (bug fix) + * :class:`cluster.KMeans` with sparse X and initial centroids given (bug fix) + * :class:`manifold.TSNE` (bug fix) + * :class:`cross_decomposition.PLSRegression` + with ``scale=True`` (bug fix) + * :class:`feature_selection.SelectFdr` (bug fix) Details are listed in the changelog below. @@ -136,8 +144,8 @@ Trees and ensembles now support sparse input for prediction. :issue:`6101` by :user:`Ibraim Ganiev `. - - :class:`ensemble.VotingClassifier` now allow changing estimators by using - :meth:`ensemble.VotingClassifier.set_params`. Estimators can also be + - :class:`ensemble.VotingClassifier` now allows changing estimators by using + :meth:`ensemble.VotingClassifier.set_params`. An estimator can also be removed by setting it to `None`. :issue:`7674` by :user:`Yichuan Liu `. 
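A rough sketch of replacing or dropping a sub-estimator as described in the entry
above (the dataset and estimator names are only an example)::

    from sklearn.datasets import load_iris
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)
    eclf = VotingClassifier(estimators=[('lr', LogisticRegression()),
                                        ('nb', GaussianNB())])

    # Drop the naive Bayes sub-estimator by name; only the remaining
    # estimators take part in the vote.
    eclf.set_params(nb=None)
    eclf.fit(X, y)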
@@ -146,7 +154,7 @@ Linear, kernelized and related models - :class:`linear_model.SGDClassifier`, :class:`linear_model.SGDRegressor`, :class:`linear_model.PassiveAggressiveClassifier`, :class:`linear_model.PassiveAggressiveRegressor` and - :class:`linear_model.Perceptron` now expose a ``max_iter`` and + :class:`linear_model.Perceptron` now expose ``max_iter`` and ``tol`` parameters, to handle convergence more precisely. ``n_iter`` parameter is deprecated, and the fitted estimator exposes a ``n_iter_`` attribute, with actual number of iterations before @@ -213,11 +221,11 @@ Preprocessing and feature selection - Made default kernel parameters kernel-dependent in :class:`kernel_approximation.Nystroem`. - :issue:`5229` by :user:`mth4saurabh` and `Andreas Müller`_. + :issue:`5229` by :user:`Saurabh Bansod ` and `Andreas Müller`_. Model evaluation and meta-estimators - - :class:`pipeline.Pipeline` allows to cache transformers + - :class:`pipeline.Pipeline` is now able to cache transformers within a pipeline by using the ``memory`` constructor parameter. :issue:`7990` by :user:`Guillaume Lemaitre `. @@ -301,7 +309,7 @@ Trees and ensembles ``min_impurity_split`` parameter. :issue:`8006` by :user:`Sebastian Pölsterl `. - - Fixed oob_score in :class:`ensemble.BaggingClassifier`. + - Fixed ``oob_score`` in :class:`ensemble.BaggingClassifier`. :issue:`8936` by :user:`mlewis1729 ` - Fixed a bug where :class:`ensemble.IsolationForest` fails when @@ -444,7 +452,7 @@ Preprocessing and feature selection preventing the use of :class:`feature_extraction.text.HashingVectorizer` in a pipeline with :class:`feature_extraction.text.TfidfTransformer`. - :issue:`7513` by :user:`Roman Yurchak `. + :issue:`7565` by :user:`Roman Yurchak `. Model evaluation and meta-estimators @@ -497,8 +505,9 @@ Metrics by `Joel Nothman`_ and :user:`Jon Crall `. - Fixed passing of ``gamma`` parameter to the ``chi2`` kernel in - :func:`metrics.pairwise_kernels` :issue:`5211` by :user:`nrhine1`, - :user:`mth4saurabh` and `Andreas Müller`_. + :func:`metrics.pairwise_kernels` :issue:`5211` by + :user:`Nick Rhinehart `, + :user:`Saurabh Bansod ` and `Andreas Müller`_. Miscellaneous From 8d1fff83dfde3898bf0e60e0a0fba0a7bf7ce5bc Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Wed, 5 Jul 2017 23:28:37 +1000 Subject: [PATCH 09/19] sort --- doc/whats_new.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index 1eb642b01d55d..664ba8c585782 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -47,21 +47,21 @@ parameters, may produce different models from the previous version. This often occurs due to changes in the modelling logic (bug fixes or enhancements), or in random sampling procedures. 
- * :class:`ensemble.IsolationForest` (bug fix) - * :class:`semi_supervised.LabelSpreading` (bug fix) - * :class:`semi_supervised.LabelPropagation` (bug fix) - * tree based models where ``min_weight_fraction_leaf`` is used (enhancement) + * :class:`cluster.KMeans` with sparse X and initial centroids given (bug fix) + * :class:`cross_decomposition.PLSRegression` + with ``scale=True`` (bug fix) * :class:`ensemble.GradientBoostingClassifier` and :class:`ensemble.GradientBoostingRegressor` where ``min_impurity_split`` is used (bug fix) * gradient boosting with :class:`ensemble.gradient_boosting.QuantileLossFunction` (bug fix) + * :class:`ensemble.IsolationForest` (bug fix) + * :class:`feature_selection.SelectFdr` (bug fix) * :class:`linear_model.RANSACRegressor` (bug fix) * :class:`linear_model.LassoLars` (bug fix) * :class:`linear_model.LassoLarsIC` (bug fix) - * :class:`cluster.KMeans` with sparse X and initial centroids given (bug fix) * :class:`manifold.TSNE` (bug fix) - * :class:`cross_decomposition.PLSRegression` - with ``scale=True`` (bug fix) - * :class:`feature_selection.SelectFdr` (bug fix) + * :class:`semi_supervised.LabelSpreading` (bug fix) + * :class:`semi_supervised.LabelPropagation` (bug fix) + * tree based models where ``min_weight_fraction_leaf`` is used (enhancement) Details are listed in the changelog below. From acc4e311d30a7f7bfdb478ff4d306a91a1c4ccc7 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Thu, 6 Jul 2017 07:43:54 +1000 Subject: [PATCH 10/19] Merge in 0.18.2 docs --- doc/whats_new.rst | 27 ++++++++++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index 664ba8c585782..c8759d15fd493 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -700,12 +700,12 @@ Miscellaneous :issue:`7464` by `Lars Buitinck`_ and `Loic Esteve`_. -.. _changes_0_18_1: +.. _changes_0_18_2: -Version 0.18.1 +Version 0.18.2 ============== -**November 11, 2016** +**June 20, 2017** .. topic:: Last release with Python 2.6 support @@ -713,6 +713,27 @@ Version 0.18.1 Later versions of scikit-learn will require Python 2.7 or above. +Changelog +--------- + + - Fixes for compatibility with NumPy 1.13.0: :issue:`7946` :issue:`8355` by + `Loic Esteve`_. + + - Minor compatibility changes in the examples :issue:`9010` :issue:`8040` + :issue:`9149`. + +Code Contributors +----------------- +Aman Dalmia, Loic Esteve, Nate Guerin, Sergei Lebedev + + +.. _changes_0_18_1: + +Version 0.18.1 +============== + +**November 11, 2016** + Changelog --------- From 8c275997ec2e5b18fccabc85bf534f79bdfdb388 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Thu, 6 Jul 2017 14:41:49 +1000 Subject: [PATCH 11/19] Missing entry from 0.18 logs --- doc/whats_new.rst | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index c8759d15fd493..53679a601d2a6 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -1445,6 +1445,11 @@ Model evaluation and meta-estimators the parameter ``n_labels`` is renamed to ``n_groups``. :issue:`6660` by `Raghav RV`_. + - Error and loss names for ``scoring`` parameters are now prefixed by + ``'neg_'``, such as ``neg_mean_squared_error``. The unprefixed versions + are deprecated and will be removed in version 0.20. + :issue:`7261` by :user:`Tim Head `. 
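A small sketch of the prefixed loss names mentioned above (the estimator and the
synthetic data are illustrative only)::

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    X, y = make_regression(n_samples=200, n_features=5, noise=1.0,
                           random_state=0)

    # 'neg_mean_squared_error' follows the convention that higher scores are
    # better, so the returned values are negative (or zero).
    scores = cross_val_score(Ridge(), X, y,
                             scoring='neg_mean_squared_error', cv=5)
    print(scores)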
+ Code Contributors ----------------- Aditya Joshi, Alejandro, Alexander Fabisch, Alexander Loginov, Alexander From 0b8b79f215d2de44899683088cb7dbfa1da69a98 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Thu, 6 Jul 2017 18:21:54 +1000 Subject: [PATCH 12/19] Optimistically add some features to highlights --- doc/whats_new.rst | 36 ++++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index 53679a601d2a6..c2c7ea1edf232 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -13,7 +13,7 @@ Version 0.19 Highlights ---------- -This release includes a number of great new features including +We are excited to release a number of great new features including :class:`neighbors.LocalOutlierFactor` for anomaly detection, :class:`preprocessing.QuantileTransformer` for robust feature transformation, and :class:`multioutput.ClassifierChain` to simply @@ -22,22 +22,26 @@ have some new algorithms in existing estimators, such as multiplicative update in :class:`decomposition.NMF` and multinomial :class:`linear_model.LogisticRegression` with L1 loss. -You can learn faster. The :ref:`new option to cache transformations -` in :class:`pipeline.Pipeline` makes grid search over -pipelines including slow transformations much more efficient. -And you can predict faster. If you're sure you know what you're doing, -you can turn off validating that the input is finite using -:func:`config_context`. - -Multi-metric grid search and cross validation - -We've made some important fixes too. -TODO: ?Rewrite of TSNE -We've fixed a longstanding implementation erorr in :func:`metrics.average_precision_score`. -:class:`semi_supervised.LabelSpreading` and +You can also learn faster. For instance, the :ref:`new option to cache +transformations ` in :class:`pipeline.Pipeline` makes grid +search over pipelines including slow transformations much more efficient. And +you can predict faster: if you're sure you know what you're doing, you can turn +off validating that the input is finite using :func:`config_context`. + +Cross validation is now able to return the results from multiple metric +evaluations. The new :func:`model_selection.cross_validate` can return many +scores on the test data as well as training set performance and timings, and we +have extended the ``scoring`` and ``refit`` parameters for grid/randomized +search :ref:`to handle multiple metrics `. + +We've made some important fixes too. We've fixed a longstanding implementation +erorr in :func:`metrics.average_precision_score`, so please be cautious with +prior results reported from that function. A number of errors in the +:class:`manifold.TSNE` implementation have been fixed, particularly in the +default Barnes-Hut approximation. :class:`semi_supervised.LabelSpreading` and :class:`semi_supervised.LabelPropagation` have had substantial fixes. -Propagation was previously broken. Spreading should now correctly -respect its alpha parameter. +Propagation was previously broken. Spreading should now correctly respect its +alpha parameter. 
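A hedged sketch of the multiple-metric evaluation mentioned in the highlights above
(the estimator, data and metric names are only an example)::

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_validate
    from sklearn.svm import LinearSVC

    X, y = load_iris(return_X_y=True)

    # Several scores are computed from the same CV splits; the result is a
    # dict of arrays with keys such as 'test_accuracy' and 'train_f1_macro',
    # plus fit and score times.
    results = cross_validate(LinearSVC(random_state=0), X, y,
                             scoring=['accuracy', 'f1_macro'],
                             return_train_score=True)
    print(sorted(results))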
Changed models -------------- From eb05651456bef04da4bdaafc9bbe86c9f3651d0b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Lo=C3=AFc=20Est=C3=A8ve?= Date: Thu, 6 Jul 2017 11:39:23 +0200 Subject: [PATCH 13/19] Forgotten user directive --- doc/whats_new.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index c2c7ea1edf232..20484bea497a1 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -215,7 +215,7 @@ Preprocessing and feature selection - Small performance improvement to n-gram creation in :mod:`feature_extraction.text` by binding methods for loops and - special-casing unigrams. :issue:`7567` by `Jaye Doepke ` + special-casing unigrams. :issue:`7567` by :user:`Jaye Doepke ` - Relax assumption on the data for the :class:`kernel_approximation.SkewedChi2Sampler`. Since the Skewed-Chi2 From 5cc2b2837686f35d7e6f4830f1db40f1ede60272 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Lo=C3=AFc=20Est=C3=A8ve?= Date: Thu, 6 Jul 2017 11:44:40 +0200 Subject: [PATCH 14/19] Fix alignment --- doc/whats_new.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index 20484bea497a1..9f0a72233fb0d 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -376,7 +376,7 @@ Linear, kernelized and related models :class:`linear_model.LassoCV`. :issue:`8973` by :user:`Paulo Haddad `. - - Fixed a bug in :class:`gaussian_process.GaussianProcessRegressor` + - Fixed a bug in :class:`gaussian_process.GaussianProcessRegressor` when the standard deviation and covariance predicted without fit would fail with a unmeaningful error by default. :issue:`6573` by :user:`Quazi Marufur Rahman ` and From 9643fc468dcb546bd49fc3c27b57b29d813fdfe5 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Fri, 7 Jul 2017 14:54:28 +1000 Subject: [PATCH 15/19] Cleaning up for Andy's comments --- doc/whats_new.rst | 50 +++--- sklearn/manifold/tests/test_t_sne.py | 235 +++++++++++++++++---------- 2 files changed, 173 insertions(+), 112 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index c2c7ea1edf232..717ab8d894ce6 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -15,12 +15,12 @@ Highlights We are excited to release a number of great new features including :class:`neighbors.LocalOutlierFactor` for anomaly detection, -:class:`preprocessing.QuantileTransformer` for robust feature -transformation, and :class:`multioutput.ClassifierChain` to simply -account for dependencies between classes in multilabel problems. We -have some new algorithms in existing estimators, such as -multiplicative update in :class:`decomposition.NMF` and multinomial -:class:`linear_model.LogisticRegression` with L1 loss. +:class:`preprocessing.QuantileTransformer` for robust feature transformation, +and the :class:`multioutput.ClassifierChain` meta-estimator to simply account +for dependencies between classes in multilabel problems. We have some new +algorithms in existing estimators, such as multiplicative update in +:class:`decomposition.NMF` and multinomial +:class:`linear_model.LogisticRegression` with L1 loss (use ``solver='saga'``). You can also learn faster. For instance, the :ref:`new option to cache transformations ` in :class:`pipeline.Pipeline` makes grid @@ -40,8 +40,8 @@ prior results reported from that function. A number of errors in the :class:`manifold.TSNE` implementation have been fixed, particularly in the default Barnes-Hut approximation. 
:class:`semi_supervised.LabelSpreading` and :class:`semi_supervised.LabelPropagation` have had substantial fixes. -Propagation was previously broken. Spreading should now correctly respect its -alpha parameter. +LabelPropagation was previously broken. LabelSpreading should now correctly +respect its alpha parameter. Changed models -------------- @@ -150,7 +150,7 @@ Trees and ensembles - :class:`ensemble.VotingClassifier` now allows changing estimators by using :meth:`ensemble.VotingClassifier.set_params`. An estimator can also be - removed by setting it to `None`. + removed by setting it to ``None``. :issue:`7674` by :user:`Yichuan Liu `. Linear, kernelized and related models @@ -260,7 +260,7 @@ Model evaluation and meta-estimators :issue:`8845` by :user:`themrmax ` - :class:`multioutput.MultiOutputRegressor` and :class:`multioutput.MultiOutputClassifier` - now support online learning using `partial_fit`. + now support online learning using ``partial_fit``. issue: `8053` by :user:`Peng Yu `. - Add ``max_train_size`` parameter to :class:`model_selection.TimeSeriesSplit` @@ -340,7 +340,7 @@ Linear, kernelized and related models ``max_iter`` if it finds a large inlier group early. :issue:`8251` by :user:`aivision2020`. - Fixed a bug where :class:`naive_bayes.MultinomialNB` and :class:`naive_bayes.BernoulliNB` - failed when `alpha=0`. :issue:`5814` by :user:`Yichuan Liu ` and + failed when ``alpha=0``. :issue:`5814` by :user:`Yichuan Liu ` and :user:`Herilalaina Rakotoarison `. - Fixed a bug where :class:`linear_model.LassoLars` does not give @@ -350,17 +350,17 @@ Linear, kernelized and related models - Fixed a bug in :class:`linear_model.RandomizedLasso`, :class:`linear_model.Lars`, :class:`linear_model.LassoLars`, :class:`linear_model.LarsCV` and :class:`linear_model.LassoLarsCV`, - where the parameter ``precompute`` were not used consistently across + where the parameter ``precompute`` was not used consistently across classes, and some values proposed in the docstring could raise errors. :issue:`5359` by `Tom Dupre la Tour`_. - Fix a bug where :func:`linear_model.LassoLars.fit` sometimes - left `coef_` as a list, rather than an ndarray. + left ``coef_`` as a list, rather than an ndarray. :issue:`8160` by :user:`CJ Carey `. - Fix :func:`linear_model.BayesianRidge.fit` to return - ridge parameter `alpha_` and `lambda_` consistent with calculated - coefficients `coef_` and `intercept_`. + ridge parameter ``alpha_`` and ``lambda_`` consistent with calculated + coefficients ``coef_`` and ``intercept_``. :issue:`8224` by :user:`Peter Gedeck `. - Fixed a bug in :class:`svm.OneClassSVM` where it returned floats instead of @@ -404,7 +404,7 @@ Decomposition, manifold learning and clustering This also impacts the output shape of :class:`decomposition.DictionaryLearning`. :issue:`8086` by `Andreas Müller`_. - - Fixed the implementation of `explained_variance_` + - Fixed the implementation of ``explained_variance_`` in :class:`decomposition.PCA`, :class:`decomposition.RandomizedPCA` and :class:`decomposition.IncrementalPCA`. @@ -484,10 +484,10 @@ Model evaluation and meta-estimators classes are provided up-front. :issue:`6250` by :user:`Asish Panda `. - - Fix :func:`multioutput.MultiOutputClassifier.predict_proba` to - return a list of 2d arrays, rather than a 3d array. In the case where - different target columns had different numbers of classes, a `ValueError` - would be raised on trying to stack matrices with different dimensions. 
+ - Fix :func:`multioutput.MultiOutputClassifier.predict_proba` to return a + list of 2d arrays, rather than a 3d array. In the case where different + target columns had different numbers of classes, a ``ValueError`` would be + raised on trying to stack matrices with different dimensions. :issue:`8093` by :user:`Peter Bull `. Metrics @@ -561,7 +561,7 @@ Other predictors - :class:`neighbors.LSHForest` has been deprecated and will be removed in 0.21 due to poor performance. - :issue:`8996` by `Andreas Müller`_. + :issue:`9078` by :user:`Laurent Direr `. - The ``alpha`` parameter of :class:`semi_supervised.LabelPropagation` now has no effect and is deprecated to be removed in 0.21. :issue:`9239` @@ -602,7 +602,7 @@ Preprocessing and feature selection - :class:`linear_model.RandomizedLogisticRegression`, and :class:`linear_model.RandomizedLasso` have been deprecated and will be removed in version 0.21. - :issue: `8995` by :user:`Ramana.S `. + :issue:`8995` by :user:`Ramana.S `. Model evaluation and meta-estimators @@ -633,7 +633,7 @@ Model evaluation and meta-estimators - The :func:`multioutput.MultiOutputClassifier.predict_proba` function used to return a 3d array (``n_samples``, ``n_classes``, ``n_outputs``). In the case where different target columns had different - numbers of classes, a `ValueError` would be raised on trying to stack + numbers of classes, a ``ValueError`` would be raised on trying to stack matrices with different dimensions. This function now returns a list of arrays where the length of the list is ``n_outputs``, and each array is (``n_samples``, ``n_classes``) for that particular output. @@ -645,11 +645,9 @@ Model evaluation and meta-estimators attribute, ``dict`` behavior will be prioritized. :issue:`8481` by :user:`Herilalaina Rakotoarison `. -Metrics - Miscellaneous - - Deprecate the ``y`` parameter in `transform` and `inverse_transform`. + - Deprecate the ``y`` parameter in ``transform`` and ``inverse_transform``. The method should not accept ``y`` parameter, as it's used at the prediction time. :issue:`8174` by :user:`Tahar Zanouda `, `Alexandre Gramfort`_ and `Raghav RV`_. 
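To illustrate the shape change described in the ``predict_proba`` entry above (the
random data and the base estimator are illustrative only)::

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.multioutput import MultiOutputClassifier

    rng = np.random.RandomState(0)
    X = rng.rand(20, 4)
    # Two target columns with different numbers of classes.
    Y = np.column_stack([rng.randint(2, size=20), rng.randint(3, size=20)])

    clf = MultiOutputClassifier(RandomForestClassifier(random_state=0))
    clf.fit(X, Y)

    # One (n_samples, n_classes_i) array per output, instead of a single
    # 3d array that could not accommodate differing class counts.
    proba = clf.predict_proba(X)
    print([p.shape for p in proba])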
diff --git a/sklearn/manifold/tests/test_t_sne.py b/sklearn/manifold/tests/test_t_sne.py index 52c056a5adadf..8b9c9d6a76862 100644 --- a/sklearn/manifold/tests/test_t_sne.py +++ b/sklearn/manifold/tests/test_t_sne.py @@ -10,6 +10,7 @@ from sklearn.utils.testing import assert_array_equal from sklearn.utils.testing import assert_array_almost_equal from sklearn.utils.testing import assert_less +from sklearn.utils.testing import assert_greater from sklearn.utils.testing import assert_raises_regexp from sklearn.utils.testing import assert_in from sklearn.utils.testing import skip_if_32bit @@ -140,20 +141,26 @@ def test_binary_search_neighbors(): # Test that when we use all the neighbors the results are identical k = n_samples - neighbors_nn = np.argsort(distances, axis=1)[:, :k].astype(np.int64) - P2 = _binary_search_perplexity(distances, neighbors_nn, + neighbors_nn = np.argsort(distances, axis=1)[:, 1:k].astype(np.int64) + distances_nn = np.array([distances[k, neighbors_nn[k]] + for k in range(n_samples)]) + P2 = _binary_search_perplexity(distances_nn, neighbors_nn, desired_perplexity, verbose=0) - assert_array_almost_equal(P1, P2, decimal=4) + P_nn = np.array([P1[k, neighbors_nn[k]] for k in range(n_samples)]) + assert_array_almost_equal(P_nn, P2, decimal=4) # Test that the highest P_ij are the same when few neighbors are used - for k in np.linspace(80, n_samples, 10): + for k in np.linspace(80, n_samples, 5): k = int(k) topn = k * 10 # check the top 10 *k entries out of k * k entries neighbors_nn = np.argsort(distances, axis=1)[:, :k].astype(np.int64) - P2k = _binary_search_perplexity(distances, neighbors_nn, + distances_nn = np.array([distances[k, neighbors_nn[k]] + for k in range(n_samples)]) + P2k = _binary_search_perplexity(distances_nn, neighbors_nn, desired_perplexity, verbose=0) idx = np.argsort(P1.ravel())[::-1] P1top = P1.ravel()[idx][:topn] + idx = np.argsort(P2k.ravel())[::-1] P2top = P2k.ravel()[idx][:topn] assert_array_almost_equal(P1top, P2top, decimal=2) @@ -175,6 +182,8 @@ def test_binary_perplexity_stability(): P = _binary_search_perplexity(distances.copy(), neighbors_nn.copy(), 3, verbose=0) P1 = _joint_probabilities_nn(distances, neighbors_nn, 3, verbose=0) + # Convert the sparse matrix to a dense one for testing + P1 = P1.toarray() if last_P is None: last_P = P last_P1 = P1 @@ -193,9 +202,9 @@ def test_gradient(): alpha = 1.0 distances = random_state.randn(n_samples, n_features).astype(np.float32) - distances = distances.dot(distances.T) + distances = np.abs(distances.dot(distances.T)) np.fill_diagonal(distances, 0.0) - X_embedded = random_state.randn(n_samples, n_components) + X_embedded = random_state.randn(n_samples, n_components).astype(np.float32) P = _joint_probabilities(distances, desired_perplexity=25.0, verbose=0) @@ -233,21 +242,17 @@ def test_trustworthiness(): def test_preserve_trustworthiness_approximately(): # Nearest neighbors should be preserved approximately. random_state = check_random_state(0) - # The Barnes-Hut approximation uses a different method to estimate - # P_ij using only a number of nearest neighbors instead of all - # points (so that k = 3 * perplexity). As a result we set the - # perplexity=5, so that the number of neighbors is 5%. 
n_components = 2 methods = ['exact', 'barnes_hut'] - X = random_state.randn(100, n_components).astype(np.float32) + X = random_state.randn(50, n_components).astype(np.float32) for init in ('random', 'pca'): for method in methods: - tsne = TSNE(n_components=n_components, perplexity=50, + tsne = TSNE(n_components=n_components, perplexity=25, learning_rate=100.0, init=init, random_state=0, method=method) X_embedded = tsne.fit_transform(X) - T = trustworthiness(X, X_embedded, n_neighbors=1) - assert_almost_equal(T, 1.0, decimal=1) + t = trustworthiness(X, X_embedded, n_neighbors=1) + assert_greater(t, 0.9) def test_optimization_minimizes_kl_divergence(): @@ -255,7 +260,7 @@ def test_optimization_minimizes_kl_divergence(): random_state = check_random_state(0) X, _ = make_blobs(n_features=3, random_state=random_state) kl_divergences = [] - for n_iter in [200, 250, 300]: + for n_iter in [250, 300, 350]: tsne = TSNE(n_components=2, perplexity=10, learning_rate=100.0, n_iter=n_iter, random_state=0) tsne.fit_transform(X) @@ -280,13 +285,16 @@ def test_fit_csr_matrix(): def test_preserve_trustworthiness_approximately_with_precomputed_distances(): # Nearest neighbors should be preserved approximately. random_state = check_random_state(0) - X = random_state.randn(100, 2) - D = squareform(pdist(X), "sqeuclidean") - tsne = TSNE(n_components=2, perplexity=2, learning_rate=100.0, - metric="precomputed", random_state=0, verbose=0) - X_embedded = tsne.fit_transform(D) - assert_almost_equal(trustworthiness(D, X_embedded, n_neighbors=1, - precomputed=True), 1.0, decimal=1) + for i in range(3): + X = random_state.randn(100, 2) + D = squareform(pdist(X), "sqeuclidean") + tsne = TSNE(n_components=2, perplexity=2, learning_rate=100.0, + early_exaggeration=2.0, metric="precomputed", + random_state=i, verbose=0) + X_embedded = tsne.fit_transform(D) + t = trustworthiness(D, X_embedded, n_neighbors=1, + precomputed=True) + assert t > .95 def test_early_exaggeration_too_small(): @@ -310,10 +318,32 @@ def test_non_square_precomputed_distances(): tsne.fit_transform, np.array([[0.0], [1.0]])) +def test_non_positive_precomputed_distances(): + # Precomputed distance matrices must be positive. + bad_dist = np.array([[0., -1.], [1., 0.]]) + for method in ['barnes_hut', 'exact']: + tsne = TSNE(metric="precomputed", method=method) + assert_raises_regexp(ValueError, "All distances .*precomputed.*", + tsne.fit_transform, bad_dist) + + +def test_non_positive_computed_distances(): + # Computed distance matrices must be positive. + def metric(x, y): + return -1 + + tsne = TSNE(metric=metric, method='exact') + X = np.array([[0.0, 0.0], [1.0, 1.0]]) + assert_raises_regexp(ValueError, "All distances .*metric given.*", + tsne.fit_transform, X) + + def test_init_not_available(): # 'init' must be 'pca', 'random', or numpy array. + tsne = TSNE(init="not available") m = "'init' must be 'pca', 'random', or a numpy array" - assert_raises_regexp(ValueError, m, TSNE, init="not available") + assert_raises_regexp(ValueError, m, tsne.fit_transform, + np.array([[0.0], [1.0]])) def test_init_ndarray(): @@ -332,10 +362,29 @@ def test_init_ndarray_precomputed(): def test_distance_not_available(): # 'metric' must be valid. 
- tsne = TSNE(metric="not available") + tsne = TSNE(metric="not available", method='exact') assert_raises_regexp(ValueError, "Unknown metric not available.*", tsne.fit_transform, np.array([[0.0], [1.0]])) + tsne = TSNE(metric="not available", method='barnes_hut') + assert_raises_regexp(ValueError, "Metric 'not available' not valid.*", + tsne.fit_transform, np.array([[0.0], [1.0]])) + + +def test_method_not_available(): + # 'nethod' must be 'barnes_hut' or 'exact' + tsne = TSNE(method='not available') + assert_raises_regexp(ValueError, "'method' must be 'barnes_hut' or ", + tsne.fit_transform, np.array([[0.0], [1.0]])) + + +def test_angle_out_of_range_checks(): + # check the angle parameter range + for angle in [-1, -1e-6, 1 + 1e-6, 2]: + tsne = TSNE(angle=angle) + assert_raises_regexp(ValueError, "'angle' must be between 0.0 - 1.0", + tsne.fit_transform, np.array([[0.0], [1.0]])) + def test_pca_initialization_not_compatible_with_precomputed_kernel(): # Precomputed distance matrices must be square matrices. @@ -345,6 +394,48 @@ def test_pca_initialization_not_compatible_with_precomputed_kernel(): tsne.fit_transform, np.array([[0.0], [1.0]])) +def test_n_components_range(): + # barnes_hut method should only be used with n_components <= 3 + tsne = TSNE(n_components=4, method="barnes_hut") + assert_raises_regexp(ValueError, "'n_components' should be .*", + tsne.fit_transform, np.array([[0.0], [1.0]])) + + +def test_early_exaggeration_used(): + # check that the ``early_exaggeration`` parameter has an effect + random_state = check_random_state(0) + n_components = 2 + methods = ['exact', 'barnes_hut'] + X = random_state.randn(25, n_components).astype(np.float32) + for method in methods: + tsne = TSNE(n_components=n_components, perplexity=1, + learning_rate=100.0, init="pca", random_state=0, + method=method, early_exaggeration=1.0) + X_embedded1 = tsne.fit_transform(X) + tsne = TSNE(n_components=n_components, perplexity=1, + learning_rate=100.0, init="pca", random_state=0, + method=method, early_exaggeration=10.0) + X_embedded2 = tsne.fit_transform(X) + + assert not np.allclose(X_embedded1, X_embedded2) + + +def test_n_iter_used(): + # check that the ``n_iter`` parameter has an effect + random_state = check_random_state(0) + n_components = 2 + methods = ['exact', 'barnes_hut'] + X = random_state.randn(25, n_components).astype(np.float32) + for method in methods: + for n_iter in [251, 500]: + tsne = TSNE(n_components=n_components, perplexity=1, + learning_rate=0.5, init="random", random_state=0, + method=method, early_exaggeration=1.0, n_iter=n_iter) + tsne.fit_transform(X) + + assert tsne.n_iter_final == n_iter - 1 + + def test_answer_gradient_two_points(): # Test the tree with only a single set of children. # @@ -418,7 +509,13 @@ def _run_answer_test(pos_input, pos_output, neighbors, grad_output, pij_input = squareform(pij_input).astype(np.float32) grad_bh = np.zeros(pos_output.shape, dtype=np.float32) - _barnes_hut_tsne.gradient(pij_input, pos_output, neighbors, + from scipy.sparse import csr_matrix + P = csr_matrix(pij_input) + + neighbors = P.indices.astype(np.int64) + indptr = P.indptr.astype(np.int64) + + _barnes_hut_tsne.gradient(P.data, pos_output, neighbors, indptr, grad_bh, 0.5, 2, 1, skip_num_points=0) assert_array_almost_equal(grad_bh, grad_output, decimal=4) @@ -439,7 +536,7 @@ def test_verbose(): sys.stdout = old_stdout assert("[t-SNE]" in out) - assert("Computing pairwise distances" in out) + assert("nearest neighbors..." 
in out) assert("Computed conditional probabilities" in out) assert("Mean sigma" in out) assert("Finished" in out) @@ -481,10 +578,15 @@ def test_64bit(): methods = ['barnes_hut', 'exact'] for method in methods: for dt in [np.float32, np.float64]: - X = random_state.randn(100, 2).astype(dt) + X = random_state.randn(50, 2).astype(dt) tsne = TSNE(n_components=2, perplexity=2, learning_rate=100.0, - random_state=0, method=method) - tsne.fit_transform(X) + random_state=0, method=method, verbose=0) + X_embedded = tsne.fit_transform(X) + effective_type = X_embedded.dtype + + # tsne cython code is only single precision, so the output will + # always be single precision, irrespectively of the input dtype + assert effective_type == np.float32 def test_barnes_hut_angle(): @@ -499,10 +601,10 @@ def test_barnes_hut_angle(): random_state = check_random_state(0) distances = random_state.randn(n_samples, n_features) distances = distances.astype(np.float32) - distances = distances.dot(distances.T) + distances = abs(distances.dot(distances.T)) np.fill_diagonal(distances, 0.0) params = random_state.randn(n_samples, n_components) - P = _joint_probabilities(distances, perplexity, False) + P = _joint_probabilities(distances, perplexity, verbose=0) kl, gradex = _kl_divergence(params, P, degrees_of_freedom, n_samples, n_components) @@ -510,58 +612,19 @@ def test_barnes_hut_angle(): bt = BallTree(distances) distances_nn, neighbors_nn = bt.query(distances, k=k + 1) neighbors_nn = neighbors_nn[:, 1:] - Pbh = _joint_probabilities_nn(distances, neighbors_nn, - perplexity, False) - kl, gradbh = _kl_divergence_bh(params, Pbh, neighbors_nn, - degrees_of_freedom, n_samples, - n_components, angle=angle, - skip_num_points=0, verbose=False) + distances_nn = np.array([distances[i, neighbors_nn[i]] + for i in range(n_samples)]) + assert np.all(distances[0, neighbors_nn[0]] == distances_nn[0]),\ + abs(distances[0, neighbors_nn[0]] - distances_nn[0]) + Pbh = _joint_probabilities_nn(distances_nn, neighbors_nn, + perplexity, verbose=0) + kl, gradbh = _kl_divergence_bh(params, Pbh, degrees_of_freedom, + n_samples, n_components, angle=angle, + skip_num_points=0, verbose=0) + + P = squareform(P) + Pbh = Pbh.toarray() assert_array_almost_equal(Pbh, P, decimal=5) - assert_array_almost_equal(gradex, gradbh, decimal=5) - - -def test_quadtree_similar_point(): - # Introduce a point into a quad tree where a similar point already exists. - # Test will hang if it doesn't complete. 
- Xs = [] - - # check the case where points are actually different - Xs.append(np.array([[1, 2], [3, 4]], dtype=np.float32)) - # check the case where points are the same on X axis - Xs.append(np.array([[1.0, 2.0], [1.0, 3.0]], dtype=np.float32)) - # check the case where points are arbitrarily close on X axis - Xs.append(np.array([[1.00001, 2.0], [1.00002, 3.0]], dtype=np.float32)) - # check the case where points are the same on Y axis - Xs.append(np.array([[1.0, 2.0], [3.0, 2.0]], dtype=np.float32)) - # check the case where points are arbitrarily close on Y axis - Xs.append(np.array([[1.0, 2.00001], [3.0, 2.00002]], dtype=np.float32)) - # check the case where points are arbitrarily close on both axes - Xs.append(np.array([[1.00001, 2.00001], [1.00002, 2.00002]], - dtype=np.float32)) - - # check the case where points are arbitrarily close on both axes - # close to machine epsilon - x axis - Xs.append(np.array([[1, 0.0003817754041], [2, 0.0003817753750]], - dtype=np.float32)) - - # check the case where points are arbitrarily close on both axes - # close to machine epsilon - y axis - Xs.append(np.array([[0.0003817754041, 1.0], [0.0003817753750, 2.0]], - dtype=np.float32)) - - for X in Xs: - counts = np.zeros(3, dtype='int64') - _barnes_hut_tsne.check_quadtree(X, counts) - m = "Tree consistency failed: unexpected number of points at root node" - assert_equal(counts[0], counts[1], m) - m = "Tree consistency failed: unexpected number of points on the tree" - assert_equal(counts[0], counts[2], m) - - -def test_index_offset(): - # Make sure translating between 1D and N-D indices are preserved - assert_equal(_barnes_hut_tsne.test_index2offset(), 1) - assert_equal(_barnes_hut_tsne.test_index_offset(), 1) @skip_if_32bit @@ -569,8 +632,8 @@ def test_n_iter_without_progress(): # Use a dummy negative n_iter_without_progress and check output on stdout random_state = check_random_state(0) X = random_state.randn(100, 2) - tsne = TSNE(n_iter_without_progress=-1, verbose=2, - random_state=1, method='exact') + tsne = TSNE(n_iter_without_progress=-1, verbose=2, learning_rate=1e8, + random_state=1, method='exact', n_iter=300) old_stdout = sys.stdout sys.stdout = StringIO() @@ -616,7 +679,7 @@ def test_min_grad_norm(): start_grad_norm = line.find('gradient norm') if start_grad_norm >= 0: line = line[start_grad_norm:] - line = line.replace('gradient norm = ', '') + line = line.replace('gradient norm = ', '').split(' ')[0] gradient_norm_values.append(float(line)) # Compute how often the gradient norm is smaller than min_grad_norm From 08fc42c7b73faac7bd4c57b96913745d568969a3 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Fri, 7 Jul 2017 14:59:14 +1000 Subject: [PATCH 16/19] Mention beta_loss=0 speedup --- doc/whats_new.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index 974fd48f7409d..71fb0612e70f2 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -204,6 +204,9 @@ Decomposition, manifold learning and clustering from the underlying SVD. They are stored in the attribute ``singular_values_``, like in :class:`decomposition.IncrementalPCA`. + - :class:`decomposition.NMF` now faster when ``beta_loss=0``. + :issue:`9277` by :user:`hongkahjun`. 
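A brief sketch of fitting :class:`decomposition.NMF` with a non-default
beta-divergence, as in the speed-up noted above (the data, ``n_components`` and
iteration budget are illustrative only)::

    import numpy as np
    from sklearn.decomposition import NMF

    rng = np.random.RandomState(0)
    # Strictly positive data: beta_loss=0 (Itakura-Saito) cannot handle zeros.
    X = rng.rand(40, 10) + 1e-3

    # beta_loss values other than the Frobenius norm require the
    # multiplicative-update solver.
    nmf = NMF(n_components=5, solver='mu', beta_loss=0, init='random',
              max_iter=500, random_state=0)
    W = nmf.fit_transform(X)
    H = nmf.components_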
+ Preprocessing and feature selection - Added ``norm_order`` parameter to :class:`feature_selection.SelectFromModel` From 6aa9fd0c87b8bb422c200da5f1b6a6751a5c5a8e Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Tue, 11 Jul 2017 09:56:53 +1000 Subject: [PATCH 17/19] Update --- doc/whats_new.rst | 51 ++++++++++++++++++++++++----------------------- 1 file changed, 26 insertions(+), 25 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index 4295fc71ab7cb..dbc725d6d4c78 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -78,28 +78,6 @@ Changelog New features ............ -Configuration - - :class:`model_selection.GridSearchCV` and - :class:`model_selection.RandomizedSearchCV` now support simultaneous - evaluation of multiple metrics. Refer to the - :ref:`multimetric_grid_search` section of the user guide for more - information. :issue:`7388` by `Raghav RV`_ - - - Added the :func:`model_selection.cross_validate` which allows evaluation - of multiple metrics. This function returns a dict with more useful - information from cross-validation such as the train scores, fit times and - score times. - Refer to :ref:`multimetric_cross_validation` section of the userguide - for more information. :issue:`7388` by `Raghav RV`_ - - - Added :class:`multioutput.ClassifierChain` for multi-label - classification. By `Adam Kleczewski `_. - - - Validation that input data contains no NaN or inf can now be suppressed - using :func:`config_context`, at your own risk. This will save on runtime, - and may be particularly useful for prediction time. :issue:`7548` by - `Joel Nothman`_. - Classifiers and regressors - Added :class:`multioutput.ClassifierChain` for multi-label @@ -133,6 +111,19 @@ Other estimators Model selection and evaluation + - :class:`model_selection.GridSearchCV` and + :class:`model_selection.RandomizedSearchCV` now support simultaneous + evaluation of multiple metrics. Refer to the + :ref:`multimetric_grid_search` section of the user guide for more + information. :issue:`7388` by `Raghav RV`_ + + - Added the :func:`model_selection.cross_validate` which allows evaluation + of multiple metrics. This function returns a dict with more useful + information from cross-validation such as the train scores, fit times and + score times. + Refer to :ref:`multimetric_cross_validation` section of the userguide + for more information. :issue:`7388` by `Raghav RV`_ + - Added :func:`metrics.mean_squared_log_error`, which computes the mean square error of the logarithmic transformation of targets, particularly useful for targets with an exponential trend. @@ -147,6 +138,12 @@ Model selection and evaluation :class:`model_selection.RepeatedStratifiedKFold`. :issue:`8120` by `Neeraj Gangwar`_. +Miscellaneous + + - Validation that input data contains no NaN or inf can now be suppressed + using :func:`config_context`, at your own risk. This will save on runtime, + and may be particularly useful for prediction time. :issue:`7548` by + `Joel Nothman`_. Enhancements ............ @@ -372,6 +369,10 @@ Linear, kernelized and related models classes, and some values proposed in the docstring could raise errors. :issue:`5359` by `Tom Dupre la Tour`_. + - Fix inconsistent results between :class:`linear_model.RidgeCV` + and :class:`linear_model.Ridge` when using ``normalize=True`` + by `Alexandre Gramfort`_. + - Fix a bug where :func:`linear_model.LassoLars.fit` sometimes left ``coef_`` as a list, rather than an ndarray. :issue:`8160` by :user:`CJ Carey `. 
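A short sketch of suppressing input validation as described in the Miscellaneous
entry above (the estimator and data are only an example; skipping the check is at
the caller's own risk)::

    import numpy as np
    import sklearn
    from sklearn.linear_model import LogisticRegression

    rng = np.random.RandomState(0)
    X = rng.rand(1000, 20)
    y = (X[:, 0] > 0.5).astype(int)

    clf = LogisticRegression().fit(X, y)

    # Skip the check for NaN/inf at prediction time to save runtime.
    with sklearn.config_context(assume_finite=True):
        predictions = clf.predict(X)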
@@ -564,9 +565,9 @@ Miscellaneous - Add ``data_home`` parameter to :func:`sklearn.datasets.fetch_kddcup99` by `Loic Esteve`_. - - Fix inconsistent results between :class:`linear_model.RidgeCV` - and :class:`linear_model.Ridge` when using ``normalize=True`` - by `Alexandre Gramfort`_. + - Several minor issues were fixed with thanks to the alerts of + [lgtm.com](http://lgtm.com). :issue:`9278` by :user:`Jean Helie `, + among others. API changes summary ------------------- From 09878d4c082c0968f35ab5ec6a5f239911c5bfb5 Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Thu, 13 Jul 2017 08:51:38 +1000 Subject: [PATCH 18/19] Clean up new what's new entries --- doc/whats_new.rst | 42 +++++++++++++++++------------------------- 1 file changed, 17 insertions(+), 25 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index d9e7331df4a4f..432bdd46475e9 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -66,8 +66,6 @@ random sampling procedures. * :class:`semi_supervised.LabelSpreading` (bug fix) * :class:`semi_supervised.LabelPropagation` (bug fix) * tree based models where ``min_weight_fraction_leaf`` is used (enhancement) - * :class:`sklearn.ensemble.IsolationForest` (bug fix) - * :class:`sklearn.manifold.TSNE` (bug fix) Details are listed in the changelog below. @@ -221,6 +219,15 @@ Decomposition, manifold learning and clustering - :class:`decomposition.NMF` now faster when ``beta_loss=0``. :issue:`9277` by :user:`hongkahjun`. + - Memory improvements for method ``barnes_hut`` in :class:`manifold.TSNE` + :issue:`7089` by :user:`Thomas Moreau ` and `Olivier Grisel`_. + + - Optimization schedule improvements for Barnes-Hut :class:`manifold.TSNE` + so the results are closer to the one from the reference implementation + `lvdmaaten/bhtsne `_ by :user:`Thomas + Moreau ` and `Olivier Grisel`_. + + Preprocessing and feature selection - Added ``norm_order`` parameter to :class:`feature_selection.SelectFromModel` @@ -307,21 +314,6 @@ Miscellaneous passing a range of bytes to :func:`datasets.load_svmlight_file`. :issue:`935` by :user:`Olivier Grisel `. - - Small performance improvement to n-gram creation in - :mod:`feature_extraction.text` by binding methods for loops and - special-casing unigrams. :issue:`7567` by `Jaye Doepke ` - - - Speed improvements to :class:`model_selection.StratifiedShuffleSplit`. - :issue:`5991` by :user:`Arthur Mensch ` and `Joel Nothman`_. - - - Memory improvements for method barnes_hut in :class:`manifold.TSNE` - :issue:`7089` by :user:`Thomas Moreau ` and `Olivier Grisel`_. - - - Optimization schedule improvements for so the results are closer to the - one from the reference implementation - `lvdmaaten/bhtsne `_ by - :user:`Thomas Moreau ` and `Olivier Grisel`_. - Bug fixes ......... @@ -428,6 +420,14 @@ Other predictors Decomposition, manifold learning and clustering + - Fixed the implementation of :class:`manifold.TSNE`: + - ``early_exageration`` parameter had no effect and is now used for the + first 250 optimization iterations. + - Fixed the ``InsersionError`` reported in :issue:`8992`. + - Improve the learning schedule to match the one from the reference + implementation `lvdmaaten/bhtsne `_. + by :user:`Thomas Moreau ` and `Olivier Grisel`_. + - Fix a bug in :class:`decomposition.LatentDirichletAllocation` where the ``perplexity`` method was returning incorrect results because the ``transform`` method returns normalized document topic distributions @@ -586,14 +586,6 @@ Miscellaneous [lgtm.com](http://lgtm.com). 
:issue:`9278` by :user:`Jean Helie `, among others. - - Fixed the implementation of :class:`manifold.TSNE`: - - ``early_exageration`` parameter had no effect and is now used for the - first 250 optimization iterations. - - Fixed the ``InsersionError`` reported in :issue:`8992`. - - Improve the learning schedule to match the one from the reference - implementation `lvdmaaten/bhtsne `_. - by :user:`Thomas Moreau ` and `Olivier Grisel`_. - API changes summary ------------------- From 0312156db192c34de428f591dc89c2891b01680e Mon Sep 17 00:00:00 2001 From: Joel Nothman Date: Thu, 13 Jul 2017 15:10:48 +1000 Subject: [PATCH 19/19] DOC Add changes missed from what's new And other minor things. This took lots of effort which I would have not committed where I not home sick... --- doc/whats_new.rst | 171 +++++++++++++++++++++++++++++++++++----------- 1 file changed, 131 insertions(+), 40 deletions(-) diff --git a/doc/whats_new.rst b/doc/whats_new.rst index 432bdd46475e9..21eb3478dbc1b 100644 --- a/doc/whats_new.rst +++ b/doc/whats_new.rst @@ -56,7 +56,7 @@ random sampling procedures. with ``scale=True`` (bug fix) * :class:`ensemble.GradientBoostingClassifier` and :class:`ensemble.GradientBoostingRegressor` where ``min_impurity_split`` is used (bug fix) - * gradient boosting with :class:`ensemble.gradient_boosting.QuantileLossFunction` (bug fix) + * gradient boosting ``loss='quantile'`` (bug fix) * :class:`ensemble.IsolationForest` (bug fix) * :class:`feature_selection.SelectFdr` (bug fix) * :class:`linear_model.RANSACRegressor` (bug fix) @@ -145,6 +145,10 @@ Miscellaneous and may be particularly useful for prediction time. :issue:`7548` by `Joel Nothman`_. + - Added a test to ensure parameter listing in docstrings match the + function/class signature. :issue:`9206` by `Alexandre Gramfort`_ and + `Raghav RV`_. + Enhancements ............ @@ -165,6 +169,9 @@ Trees and ensembles removed by setting it to ``None``. :issue:`7674` by :user:`Yichuan Liu `. + - :func:`tree.export_graphviz` now shows configurable number of decimal + places. :issue:`8698` by :user:`Guillaume Lemaitre `. + Linear, kernelized and related models - :class:`linear_model.SGDClassifier`, :class:`linear_model.SGDRegressor`, @@ -174,7 +181,7 @@ Linear, kernelized and related models ``tol`` parameters, to handle convergence more precisely. ``n_iter`` parameter is deprecated, and the fitted estimator exposes a ``n_iter_`` attribute, with actual number of iterations before - convergence. By `Tom Dupre la Tour`_. + convergence. :issue:`5036` by `Tom Dupre la Tour`_. - Added ``average`` parameter to perform weight averaging in :class:`linear_model.PassiveAggressiveClassifier`. :issue:`4939` @@ -190,14 +197,17 @@ Linear, kernelized and related models is a lot faster with ``return_std=True``. :issue:`8591` by :user:`Hadrien Bertrand `. - - Memory usage enhancement: Prevent cast from float32 to float64 in - :class:`linear_model.LogisticRegression` when using newton-cg - solver. :issue:`8835` by :user:`Joan Massich `. + - Added ``return_std`` to ``predict`` method of + :class:`linear_model.ARDRegression` and + :class:`linear_model.BayesianRidge`. + :issue:`7838` by :user:`Sergey Feldman `. 
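A quick sketch of the new ``return_std`` option mentioned above (the synthetic data
is illustrative only)::

    import numpy as np
    from sklearn.linear_model import BayesianRidge

    rng = np.random.RandomState(0)
    X = rng.rand(50, 3)
    y = X.dot([1.0, 2.0, -1.0]) + 0.1 * rng.randn(50)

    reg = BayesianRidge().fit(X, y)
    # Also return the standard deviation of the predictive distribution
    # at each query point.
    y_mean, y_std = reg.predict(X, return_std=True)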
- - Memory usage enhancement: Prevent cast from float32 to float64 in - :class:`linear_model.Ridge` when using svd, sparse_cg, cholesky or lsqr solvers - :class:`linear_model.Ridge` when using svd, sparse_cg, cholesky or lsqr solvers - by :user:`Joan Massich `, :user:`Nicolas Cordier ` + - Memory usage enhancements: Prevent cast from float32 to float64 in: + :class:`linear_model.MultiTaskElasticNet`; + :class:`linear_model.LogisticRegression` when using newton-cg solver; and + :class:`linear_model.Ridge` when using svd, sparse_cg, cholesky or lsqr + solvers. :issue:`8835`, :issue:`8061` by :user:`Joan Massich ` and :user:`Nicolas + Cordier ` and :user:`Thierry Guillemot`. Other predictors @@ -205,6 +215,11 @@ Other predictors fewer constraints: they must take two 1d-arrays and return a float. :issue:`6288` by `Jake Vanderplas`_. + - ``algorithm='auto`` in :mod:`neighbors` estimators now chooses the most + appropriate algorithm for all input types and metrics. :issue:`9145` by + :user:`Herilalaina Rakotoarison ` and :user:`Reddy Chinthala + `. + Decomposition, manifold learning and clustering - :class:`cluster.MiniBatchKMeans` and :class:`cluster.KMeans` @@ -215,6 +230,7 @@ Decomposition, manifold learning and clustering :class:`decomposition.TruncatedSVD` now expose the singular values from the underlying SVD. They are stored in the attribute ``singular_values_``, like in :class:`decomposition.IncrementalPCA`. + :issue:`7685` by :user:`Tommy Löfstedt ` - :class:`decomposition.NMF` now faster when ``beta_loss=0``. :issue:`9277` by :user:`hongkahjun`. @@ -227,6 +243,10 @@ Decomposition, manifold learning and clustering `lvdmaaten/bhtsne `_ by :user:`Thomas Moreau ` and `Olivier Grisel`_. + - Memory usage enhancements: Prevent cast from float32 to float64 in + :class:`decomposition.PCA` and + :func:`decomposition.randomized_svd_low_rank`. + :issue:`9067` by `Raghav RV`_. Preprocessing and feature selection @@ -257,6 +277,10 @@ Model evaluation and meta-estimators within a pipeline by using the ``memory`` constructor parameter. :issue:`7990` by :user:`Guillaume Lemaitre `. + - :class:`pipeline.Pipeline` steps can now be accessed as attributes of its + ``named_steps`` attribute. :issue:`8586` by :user:`Herilalaina + Rakotoarison `. + - Added ``sample_weight`` parameter to :meth:`pipeline.Pipeline.score`. :issue:`7723` by :user:`Mikhail Korobov `. @@ -264,9 +288,11 @@ Model evaluation and meta-estimators A ``TypeError`` will be raised for any other kwargs. :issue:`8028` by :user:`Alexander Booth `. - - :class:`model_selection.GridSearchCV`, :class:`model_selection.RandomizedSearchCV` - and :func:`model_selection.cross_val_score` now allow estimators with callable - kernels which were previously prohibited. :issue:`8005` by `Andreas Müller`_ . + - :class:`model_selection.GridSearchCV`, + :class:`model_selection.RandomizedSearchCV` and + :func:`model_selection.cross_val_score` now allow estimators with callable + kernels which were previously prohibited. + :issue:`8005` by `Andreas Müller`_ . - :func:`model_selection.cross_val_predict` now returns output of the correct shape for all values of the argument ``method``. @@ -277,6 +303,9 @@ Model evaluation and meta-estimators :func:`model_selection.learning_curve`. :issue:`7506` by :user:`Narine Kokhlikyan `. + - :class:`model_selection.StratifiedShuffleSplit` now works with multioutput + multiclass (or multilabel) data. :issue:`9044` by `Vlad Niculae`_. + - Speed improvements to :class:`model_selection.StratifiedShuffleSplit`. 
:issue:`5991` by :user:`Arthur Mensch ` and `Joel Nothman`_.

@@ -285,11 +314,14 @@ Model evaluation and meta-estimators

   - :class:`multioutput.MultiOutputRegressor` and :class:`multioutput.MultiOutputClassifier`
     now support online learning using ``partial_fit``.
-     issue: `8053` by :user:`Peng Yu `.
+     :issue:`8053` by :user:`Peng Yu `.

   - Add ``max_train_size`` parameter to :class:`model_selection.TimeSeriesSplit`
     (see the usage sketch below). :issue:`8282` by :user:`Aman Dalmia `.

+   - More clustering metrics are now available through :func:`metrics.get_scorer`
+     and ``scoring`` parameters. :issue:`8117` by `Raghav RV`_.
+
 Metrics

   - :func:`metrics.matthews_corrcoef` now supports multiclass classification.
@@ -300,25 +332,31 @@ Metrics

 Miscellaneous

-   - :func:`utils.check_estimator` now attempts to ensure that methods transform, predict, etc.
-     do not set attributes on the estimator.
+   - :func:`utils.check_estimator` now attempts to ensure that methods
+     transform, predict, etc. do not set attributes on the estimator.
     :issue:`7533` by :user:`Ekaterina Krivich `.

   - Added type checking to the ``accept_sparse`` parameter in
-     :mod:`utils.validation` methods. This parameter now accepts only
-     boolean, string, or list/tuple of strings. ``accept_sparse=None`` is deprecated
-     and should be replaced by ``accept_sparse=False``.
+     :mod:`utils.validation` methods. This parameter now accepts only boolean,
+     string, or list/tuple of strings. ``accept_sparse=None`` is deprecated and
+     should be replaced by ``accept_sparse=False``.
     :issue:`7880` by :user:`Josh Karnofsky `.

   - Make it possible to load a chunk of an svmlight formatted file by
     passing a range of bytes to :func:`datasets.load_svmlight_file`.
     :issue:`935` by :user:`Olivier Grisel `.

+   - :class:`dummy.DummyClassifier` and :class:`dummy.DummyRegressor`
+     now accept non-finite features. :issue:`8931` by :user:`Attractadore`.
+
 Bug fixes
 .........

 Trees and ensembles

+   - Fixed a memory leak in trees when using ``criterion='mae'``.
+     :issue:`8002` by `Raghav RV`_.
+
   - Fixed a bug where :class:`ensemble.IsolationForest` uses an incorrect
     formula for the average path length.
     :issue:`8549` by `Peter Wang `_.

   - Fixed a bug where :class:`ensemble.AdaBoostClassifier` throws
     ``ZeroDivisionError`` while fitting data with single class labels.
     :issue:`7501` by :user:`Dominik Krzeminski `.

-   - Fixed a bug in :class:`ensemble.GradientBoostingClassifier`
-     and :class:`ensemble.GradientBoostingRegressor`
-     where a float being compared to ``0.0`` using ``==`` caused a divide by zero
-     error. issue:`7970` by :user:`He Chen `.
+   - Fixed a bug in :class:`ensemble.GradientBoostingClassifier` and
+     :class:`ensemble.GradientBoostingRegressor` where a float being compared
+     to ``0.0`` using ``==`` caused a divide by zero error. :issue:`7970` by
+     :user:`He Chen `.

   - Fix a bug where :class:`ensemble.GradientBoostingClassifier` and
     :class:`ensemble.GradientBoostingRegressor` ignored the
     ``min_impurity_split`` parameter.
     :issue:`8006` by :user:`Sebastian Pölsterl `.

   - Fixed ``oob_score`` in :class:`ensemble.BaggingClassifier`.
-     :issue:`8936` by :user:`mlewis1729 `
+     :issue:`8936` by :user:`Michael Lewis `.
+
+   - Fixed excessive memory usage in prediction for random forest estimators.
+     :issue:`8672` by :user:`Mike Benfield `.
+
+   - Fixed a bug where ``sample_weight`` as a list broke random forests in Python 2.
+     :issue:`8068` by :user:`xor`.

   - Fixed a bug where :class:`ensemble.IsolationForest` fails when
     ``max_features`` is less than 1.
     :issue:`5732` by :user:`Ishank Gulati `.
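
To illustrate the ``max_train_size`` option on
:class:`model_selection.TimeSeriesSplit` referenced earlier in this list, a
minimal sketch (the toy array below is ours, chosen only for illustration)::

    import numpy as np
    from sklearn.model_selection import TimeSeriesSplit

    X = np.arange(6).reshape(-1, 1)

    # Without max_train_size the training window keeps growing with each
    # split; with max_train_size=2 it is capped at the two most recent samples.
    tscv = TimeSeriesSplit(n_splits=3, max_train_size=2)
    for train_idx, test_idx in tscv.split(X):
        print(train_idx, test_idx)
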
- - Fix a bug where - :class:`ensemble.gradient_boosting.QuantileLossFunction` computed - negative errors for negative values of ``ytrue - ypred`` leading to - wrong values when calling ``__call__``. + - Fix a bug where gradient boosting with ``loss='quantile'`` computed + negative errors for negative values of ``ytrue - ypred`` leading to wrong + values when calling ``__call__``. :issue:`8087` by :user:`Alexis Mignon ` - Fix a bug where :class:`ensemble.VotingClassifier` raises an error @@ -361,11 +404,13 @@ Trees and ensembles Linear, kernelized and related models - Fixed a bug where :func:`linear_model.RANSACRegressor.fit` may run until - ``max_iter`` if it finds a large inlier group early. :issue:`8251` by :user:`aivision2020`. + ``max_iter`` if it finds a large inlier group early. :issue:`8251` by + :user:`aivision2020`. - - Fixed a bug where :class:`naive_bayes.MultinomialNB` and :class:`naive_bayes.BernoulliNB` - failed when ``alpha=0``. :issue:`5814` by :user:`Yichuan Liu ` and - :user:`Herilalaina Rakotoarison `. + - Fixed a bug where :class:`naive_bayes.MultinomialNB` and + :class:`naive_bayes.BernoulliNB` failed when ``alpha=0``. :issue:`5814` by + :user:`Yichuan Liu ` and :user:`Herilalaina Rakotoarison + `. - Fixed a bug where :class:`linear_model.LassoLars` does not give the same result as the LassoLars implementation available @@ -378,8 +423,8 @@ Linear, kernelized and related models classes, and some values proposed in the docstring could raise errors. :issue:`5359` by `Tom Dupre la Tour`_. - - Fix inconsistent results between :class:`linear_model.RidgeCV` - and :class:`linear_model.Ridge` when using ``normalize=True`` + - Fix inconsistent results between :class:`linear_model.RidgeCV` and + :class:`linear_model.Ridge` when using ``normalize=True``. :issue:`9302` by `Alexandre Gramfort`_. - Fix a bug where :func:`linear_model.LassoLars.fit` sometimes @@ -471,6 +516,15 @@ Decomposition, manifold learning and clustering - Fixed improper scaling in :class:`cross_decomposition.PLSRegression` with ``scale=True``. :issue:`7819` by :user:`jayzed82 `. + - :class:`cluster.bicluster.SpectralCoclustering` and + :class:`cluster.bicluster.SpectralBiclustering` ``fit`` method conforms + with API by accepting ``y`` and returning the object. :issue:`6126`, + :issue:`7814` by :user:`Laurent Direr ` and :user:`Maniteja + Nandana `. + + - Fix bug where :mod:`mixture` ``sample`` methods did not return as many + samples as requested. :issue:`7702` by :user:`Levi John Wolf `. + Preprocessing and feature selection - For sparse matrices, :func:`preprocessing.normalize` with ``return_norm=True`` @@ -494,6 +548,10 @@ Preprocessing and feature selection pipeline with :class:`feature_extraction.text.TfidfTransformer`. :issue:`7565` by :user:`Roman Yurchak `. + - Fix a bug where :class:`feature_selection.mutual_info_regression` did not + correctly use ``n_neighbors``. :issue:`8181` by :user:`Guillaume Lemaitre + `. + Model evaluation and meta-estimators - Fixed a bug where :func:`model_selection.BaseSearchCV.inverse_transform` @@ -512,6 +570,9 @@ Model evaluation and meta-estimators reused the same estimator for each parameter value. :issue:`7365` by :user:`Aleksandr Sandrovskii `. + - :func:`model_selection.permutation_test_score` now works with Pandas + types. :issue:`5697` by :user:`Stijn Tonk `. + - Several fixes to input validation in :class:`multiclass.OutputCodeClassifier` :issue:`8086` by `Andreas Müller`_. @@ -545,7 +606,7 @@ Metrics by `Joel Nothman`_ and :user:`Jon Crall `. 
   - Fixed passing of ``gamma`` parameter to the ``chi2`` kernel in
     :func:`metrics.pairwise.pairwise_kernels` :issue:`5211` by
     :user:`Nick Rhinehart `,
     :user:`Saurabh Bansod ` and `Andreas Müller`_.

@@ -579,8 +640,11 @@ Miscellaneous
     documentation build with Sphinx>1.5 :issue:`8010`, :issue:`7986` by
     :user:`Oscar Najera `

-   - Add ``data_home`` parameter to
-     :func:`sklearn.datasets.fetch_kddcup99` by `Loic Esteve`_.
+   - Add ``data_home`` parameter to :func:`sklearn.datasets.fetch_kddcup99`.
+     :issue:`9289` by `Loic Esteve`_.
+
+   - Fix dataset loaders using Python 3 version of makedirs to also work in
+     Python 2. :issue:`9284` by :user:`Sebastin Santy `.

   - Several minor issues were fixed with thanks to the alerts of
     `lgtm.com <http://lgtm.com>`_. :issue:`9278` by :user:`Jean Helie `,

@@ -599,12 +663,24 @@ Trees and ensembles
     the weighted impurity decrease from splitting is no longer at least
     ``min_impurity_decrease``. :issue:`8449` by `Raghav RV`_.

+Linear, kernelized and related models
+
+   - ``n_iter`` parameter is deprecated in :class:`linear_model.SGDClassifier`,
+     :class:`linear_model.SGDRegressor`,
+     :class:`linear_model.PassiveAggressiveClassifier`,
+     :class:`linear_model.PassiveAggressiveRegressor` and
+     :class:`linear_model.Perceptron` (see the usage sketch below). By `Tom Dupre la Tour`_.
+
 Other predictors

   - :class:`neighbors.LSHForest` has been deprecated and will be
     removed in 0.21 due to poor performance.
     :issue:`9078` by :user:`Laurent Direr `.

+   - :class:`neighbors.NearestCentroid` no longer purports to support
+     ``metric='precomputed'``, which now raises an error. :issue:`8515` by
+     :user:`Sergul Aydore `.
+
   - The ``alpha`` parameter of :class:`semi_supervised.LabelPropagation` now
     has no effect and is deprecated to be removed in 0.21. :issue:`9239`
     by :user:`Andre Ambrosio Boechat `, :user:`Utkarsh Upadhyay

@@ -622,9 +698,12 @@ Decomposition, manifold learning and clustering
     has been renamed to ``n_components`` and will be removed in version 0.21.
     :issue:`8922` by :user:`Attractadore`.

-   - :class:`cluster.bicluster.SpectralCoclustering` and
-     :class:`cluster.bicluster.SpectralBiclustering` now accept ``y`` in fit.
-     :issue:`6126` by :user:`Laurent Direr `.
+   - :meth:`decomposition.SparsePCA.transform`'s ``ridge_alpha`` parameter is
+     deprecated in preference for the class parameter.
+     :issue:`8137` by :user:`Naoya Kanai `.
+
+   - :class:`cluster.DBSCAN` now has a ``metric_params`` parameter.
+     :issue:`8139` by :user:`Naoya Kanai `.

 Preprocessing and feature selection

@@ -633,7 +712,7 @@ Preprocessing and feature selection

   - :class:`feature_selection.SelectFromModel` now validates the ``threshold``
     parameter and sets the ``threshold_`` attribute during the call to
-     ``fit``, and no longer during the call to ``transform```, by `Andreas
+     ``fit``, and no longer during the call to ``transform``. By `Andreas
     Müller`_.

   - The ``non_negative`` parameter in :class:`feature_extraction.FeatureHasher`

@@ -664,6 +743,11 @@ Model evaluation and meta-estimators
     specifying ``train_size`` alone will cause ``test_size`` to be the remainder.
     :issue:`7459` by :user:`Nelson Liu `.

+   - :class:`multiclass.OneVsRestClassifier` now has ``partial_fit``,
+     ``decision_function`` and ``predict_proba`` methods only when the
+     underlying estimator does. :issue:`7812` by `Andreas Müller`_ and
+     :user:`Mikhail Korobov `.
+
   - :class:`multiclass.OneVsRestClassifier` now has a ``partial_fit`` method
     only if the underlying estimator does. By `Andreas Müller`_.
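
As a quick sketch of the ``n_iter`` deprecation noted under "Linear,
kernelized and related models" above: new code passes ``max_iter`` and
``tol`` instead, and reads the actual number of epochs from the fitted
``n_iter_`` attribute (toy data and parameter values chosen only for
illustration)::

    from sklearn.datasets import make_classification
    from sklearn.linear_model import SGDClassifier

    X, y = make_classification(n_samples=200, random_state=0)

    # max_iter bounds the number of passes over the data, while tol controls
    # the convergence-based stopping criterion.
    clf = SGDClassifier(max_iter=1000, tol=1e-3, random_state=0).fit(X, y)
    print(clf.n_iter_)  # epochs actually run before stopping
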
@@ -878,6 +962,13 @@ Bug fixes
     parameter setting on the split produced by the first ``split`` call
     to the cross-validation splitter.
     :issue:`7660` by `Raghav RV`_.

+   - Fixed a bug where :meth:`preprocessing.MultiLabelBinarizer.fit_transform`
+     returned an invalid CSR matrix.
+     :issue:`7750` by :user:`CJ Carey `.
+
+   - Fixed a bug where :func:`metrics.pairwise.cosine_distances` could return a
+     small negative distance. :issue:`7732` by :user:`Artsion `.
+
 API changes summary
 -------------------