
MNT release 0.21.0 #13804


Merged: 27 commits (May 9, 2019)
Commits (27)
a2bf6de
Update version to 0.21.0; set release for Thursday
jnothman May 6, 2019
18b4f55
BLD Fixes Cython cimport errors (#13754)
thomasjpfan May 1, 2019
51d25d7
Fix spacing and formatting inconsistencies (#13747)
Scowley4 May 1, 2019
0ec12eb
DOC Updating PolynomialFeatures.Transform docstring (#13755)
AWNystrom May 1, 2019
0da8f75
DOC: trivial rst fix (#13765)
GaelVaroquaux May 2, 2019
4ec2d7c
DOC: fix class ref (#13766)
GaelVaroquaux May 2, 2019
8a99f1d
MNT Cleaning for fast partial dependence computation (#13738)
NicolasHug May 2, 2019
be80dc4
[MRG] DOC: Fix unusual phrasing in svm.SVC (#13774)
bharatr21 May 4, 2019
47ca768
[MRG] DOC Added version information for PCA.singular_values_ (#13776)
olegstikhin May 4, 2019
2f2523b
DOC add example to IsotonicRegression class (#13768)
veerlosar May 4, 2019
7d182ca
ENH Ridge with solver SAG/SAGA does not cast to float64 (#13302)
massich May 4, 2019
4e34ea9
Fixed documentation for mean_precision_prior. Smaller->Larger (#13764)
AraiKensuke May 5, 2019
9f2dbb8
DOC Fix more formatting inconsistencies (#13787)
Scowley4 May 5, 2019
2dfb24c
DOC Fix note range in contributing.html (#13722)
May 5, 2019
3665df3
[MRG] MAINT: add fixture for init and clean-up with matplotlib (#13708)
glemaitre May 6, 2019
f1995b2
FIX Allow to disable estimator and passing weight in Voting estimator…
glemaitre May 6, 2019
f688e28
API use 'drop' to disable estimators in voting (#13780)
glemaitre May 7, 2019
03eb27e
DOC Fix typo. (#13813)
May 7, 2019
3cdeb44
DOC clarified hamming loss docstrings (#13760)
XavierSATTLER May 8, 2019
7b80bb7
ENH handle sparse x and intercept in _RidgeGCV (#13350)
jeromedockes May 8, 2019
10b8fba
TST Make test_ridge_regression_dtype_stability less random (#13816)
ogrisel May 8, 2019
b34096e
API Make IterativeImputer experimental (#13824)
jnothman May 8, 2019
44d1e65
STY unused imports
jnothman May 9, 2019
d1c8121
DOC update roadmap (#13809)
NicolasHug May 9, 2019
1f45e1e
DOC Fix reference (#13841)
thomasjpfan May 9, 2019
134c543
MNT Update release date to 10 May
jnothman May 9, 2019
39c74cd
DOC Remove experimental tag from ColumnTransformer (#13835)
qinhanmin2014 May 9, 2019
1 change: 1 addition & 0 deletions azure-pipelines.yml
@@ -22,6 +22,7 @@ jobs:
SCIPY_VERSION: '0.17.0'
CYTHON_VERSION: '*'
PILLOW_VERSION: '4.0.0'
MATPLOTLIB_VERSION: '1.5.1'
# later version of joblib are not packaged in conda for Python 3.5
JOBLIB_VERSION: '0.12.3'
COVERAGE: 'true'
2 changes: 1 addition & 1 deletion build_tools/azure/install.cmd
@@ -11,7 +11,7 @@ IF "%PYTHON_ARCH%"=="64" (
call deactivate
@rem Clean up any left-over from a previous build
conda remove --all -q -y -n %VIRTUALENV%
conda create -n %VIRTUALENV% -q -y python=%PYTHON_VERSION% numpy scipy cython pytest wheel pillow joblib
conda create -n %VIRTUALENV% -q -y python=%PYTHON_VERSION% numpy scipy cython matplotlib pytest wheel pillow joblib

call activate %VIRTUALENV%
) else (
4 changes: 2 additions & 2 deletions doc/conf.py
@@ -263,9 +263,9 @@
'sphx_glr_plot_compare_methods_001.png': 349}


# enable experimental module so that the new GBDTs estimators can be
# enable experimental module so that experimental estimators can be
# discovered properly by sphinx
from sklearn.experimental import enable_hist_gradient_boosting # noqa
from sklearn.experimental import * # noqa


def make_carousel_thumbs(app, exception):
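The wildcard import in the conf.py change works because each ``enable_*`` module registers its estimator as an import side effect. A stdlib-only sketch of that pattern (all names below are stand-ins for illustration, not scikit-learn internals):

```python
import types

# stand-in for the stable sklearn.impute namespace
impute = types.ModuleType("impute")

class IterativeImputer:
    """Stand-in for an experimental estimator."""

def _enable_iterative_imputer(pkg):
    # the real enable_* modules run equivalent code at import time, so
    # `from sklearn.experimental import *` activates every gate at once
    pkg.IterativeImputer = IterativeImputer

_enable_iterative_imputer(impute)
assert hasattr(impute, "IterativeImputer")
```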
126 changes: 73 additions & 53 deletions doc/developers/contributing.rst
@@ -195,67 +195,67 @@ The preferred way to contribute to scikit-learn is to fork the `main
repository <https://github.com/scikit-learn/scikit-learn/>`__ on GitHub,
then submit a "pull request" (PR):

1. `Create an account <https://github.com/join>`_ on
   GitHub if you do not already have one.

2. Fork the `project repository
   <https://github.com/scikit-learn/scikit-learn>`__: click on the 'Fork'
   button near the top of the page. This creates a copy of the code under
   your GitHub user account. For more details on how to fork a repository
   see `this guide <https://help.github.com/articles/fork-a-repo/>`_.

3. Clone your fork of the scikit-learn repo from your GitHub account to your
   local disk::

      $ git clone git@github.com:YourLogin/scikit-learn.git
      $ cd scikit-learn

4. Install the library in editable mode::

      $ pip install --editable .

   For more details about advanced installation, see the
   :ref:`install_bleeding_edge` section.

5. Create a branch to hold your development changes::

      $ git checkout -b my-feature

   and start making changes. Always use a ``feature`` branch. It's good
   practice to never work on the ``master`` branch!

   .. note::

      In the above setup, your ``origin`` remote repository points to
      ``YourLogin/scikit-learn.git``. If you wish to fetch/merge from the
      main repository instead of your forked one, you will need to add
      another remote to use instead of ``origin``. If we choose the name
      ``upstream`` for it, the command will be::

         $ git remote add upstream https://github.com/scikit-learn/scikit-learn.git

      In order to fetch the new remote and base your work on its latest
      changes, you can::

         $ git fetch upstream
         $ git checkout -b my-feature upstream/master

6. Develop the feature on your feature branch on your computer, using Git
   to do the version control. When you're done editing, add changed files
   using ``git add`` and then ``git commit``::

      $ git add modified_files
      $ git commit

   to record your changes in Git, then push the changes to your GitHub
   account with::

      $ git push -u origin my-feature

7. Follow `these
   <https://help.github.com/articles/creating-a-pull-request-from-a-fork>`_
   instructions to create a pull request from your fork. This will send an
   email to the committers. You may want to consider sending an email to
   the mailing list for more visibility.

.. note::

@@ -626,7 +626,7 @@ reviewing pull requests, you may find :ref:`this tip
.. _testing_coverage:

Testing and improving test coverage
------------------------------------
-----------------------------------

High-quality `unit testing <https://en.wikipedia.org/wiki/Unit_testing>`_
is a corner-stone of the scikit-learn development process. For this
@@ -641,22 +641,42 @@ the corresponding subpackages.

We expect code coverage of new features to be at least around 90%.

For guidelines on how to use ``pytest`` efficiently, see the
:ref:`pytest_tips`.

Writing matplotlib related tests
................................

Test fixtures ensure that a set of tests run with the appropriate
initialization and cleanup. The scikit-learn test suite implements a fixture
which can be used with ``matplotlib``.

``pyplot``
   The ``pyplot`` fixture should be used when a test function deals with
   ``matplotlib``. ``matplotlib`` is a soft dependency and is not required.
   This fixture is in charge of skipping the tests if ``matplotlib`` is not
   installed. In addition, figures created during the tests will be
   automatically closed once the test function has been executed.

To use this fixture in a test function, pass it as an argument::

    def test_requiring_mpl_fixture(pyplot):
        # you can now safely use matplotlib

Workflow to improve test coverage
.................................

To test code coverage, you need to install the `coverage
<https://pypi.org/project/coverage/>`_ package in addition to pytest.

1. Run ``make test-coverage``. The output lists, for each file, the line
   numbers that are not tested.

2. Find a low hanging fruit: looking at which lines are not tested,
   write or adapt a test specifically for these lines.

3. Loop.

Developers web site
-------------------
1 change: 1 addition & 0 deletions doc/modules/classes.rst
@@ -471,6 +471,7 @@ Samples generator
:toctree: generated/

experimental.enable_hist_gradient_boosting
experimental.enable_iterative_imputer


.. _feature_extraction_ref:
9 changes: 9 additions & 0 deletions doc/modules/impute.rst
@@ -105,7 +105,16 @@ of ``y``. This is done for each feature in an iterative fashion, and then is
repeated for ``max_iter`` imputation rounds. The results of the final
imputation round are returned.

.. note::

This estimator is still **experimental** for now: the predictions
and the API might change without any deprecation cycle. To use it,
you need to explicitly import ``enable_iterative_imputer``.

::

>>> import numpy as np
>>> from sklearn.experimental import enable_iterative_imputer
>>> from sklearn.impute import IterativeImputer
>>> imp = IterativeImputer(max_iter=10, random_state=0)
>>> imp.fit([[1, 2], [3, 6], [4, 8], [np.nan, 3], [7, np.nan]]) # doctest: +NORMALIZE_WHITESPACE
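As the note in the impute docs says, the estimator only becomes importable after the explicit experimental import; a minimal runnable sketch reusing the snippet's training data (the ``transform`` call and test data are added here for illustration):

```python
import numpy as np
# without this import, `from sklearn.impute import IterativeImputer`
# raises ImportError while the estimator is experimental
from sklearn.experimental import enable_iterative_imputer  # noqa
from sklearn.impute import IterativeImputer

imp = IterativeImputer(max_iter=10, random_state=0)
imp.fit([[1, 2], [3, 6], [4, 8], [np.nan, 3], [7, np.nan]])
X_test = [[np.nan, 2], [6, np.nan], [np.nan, 6]]
# every NaN entry is replaced by a round-robin regression estimate
filled = imp.transform(X_test)
```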
21 changes: 14 additions & 7 deletions doc/modules/linear_model.rst
@@ -136,17 +136,24 @@ Setting the regularization parameter: generalized Cross-Validation
------------------------------------------------------------------

:class:`RidgeCV` implements ridge regression with built-in
cross-validation of the alpha parameter. The object works in the same way
as GridSearchCV except that it defaults to Generalized Cross-Validation
(GCV), an efficient form of leave-one-out cross-validation::

>>> import numpy as np
>>> from sklearn import linear_model
>>> reg = linear_model.RidgeCV(alphas=[0.1, 1.0, 10.0], cv=3)
>>> reg.fit([[0, 0], [0, 0], [1, 1]], [0, .1, 1]) # doctest: +SKIP
RidgeCV(alphas=[0.1, 1.0, 10.0], cv=3, fit_intercept=True, scoring=None,
normalize=False)
>>> reg.alpha_ # doctest: +SKIP
0.1
>>> reg = linear_model.RidgeCV(alphas=np.logspace(-6, 6, 13))
>>> reg.fit([[0, 0], [0, 0], [1, 1]], [0, .1, 1]) # doctest: +NORMALIZE_WHITESPACE
RidgeCV(alphas=array([1.e-06, 1.e-05, 1.e-04, 1.e-03, 1.e-02, 1.e-01, 1.e+00, 1.e+01,
1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06]),
cv=None, fit_intercept=True, gcv_mode=None, normalize=False,
scoring=None, store_cv_values=False)
>>> reg.alpha_
0.01

Specifying the value of the `cv` attribute will trigger the use of
cross-validation with `GridSearchCV`, for example `cv=10` for 10-fold
cross-validation, rather than Generalized Cross-Validation.
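The ``cv`` switch described above can be sketched with synthetic data (illustrative only; the exact ``alpha_`` selected depends on the data):

```python
import numpy as np
from sklearn import linear_model

rng = np.random.RandomState(0)
X = rng.randn(20, 2)
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.randn(20)

alphas = np.logspace(-3, 3, 7)
# cv=None (default): efficient generalized (leave-one-out) cross-validation
gcv = linear_model.RidgeCV(alphas=alphas).fit(X, y)
# integer cv: k-fold cross-validation via GridSearchCV under the hood
kfold = linear_model.RidgeCV(alphas=alphas, cv=5).fit(X, y)
```

Both select an ``alpha_`` from the supplied grid; only the selection procedure differs.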

.. topic:: References

19 changes: 0 additions & 19 deletions doc/roadmap.rst
@@ -128,7 +128,6 @@ bottom.

#. Improved tools for model diagnostics and basic inference

* partial dependence plots :issue:`5653`
* alternative feature importances implementations (e.g. methods or wrappers)
* better ways to handle validation sets when fitting
* better ways to find thresholds / create decision rules :issue:`8614`
@@ -144,19 +143,6 @@ bottom.
:issue:`6929`
* Callbacks or a similar system would facilitate logging and early stopping

#. Use scipy BLAS Cython bindings

* This will make it possible to get rid of our partial copy of suboptimal
Atlas C-routines. :issue:`11638`
* This should speed up the Windows and Linux wheels

#. Allow fine-grained parallelism in cython

* Now that we do not use fork-based multiprocessing in joblib anymore it's
possible to use the prange / openmp thread management which makes it
possible to have very efficient thread-based parallelism at the Cython
level. Example with K-Means: :issue:`11950`

#. Distributed parallelism

* Joblib can now plug onto several backends, some of them can distribute the
@@ -240,9 +226,6 @@ Subpackage-specific goals
:mod:`sklearn.ensemble`

* a stacking implementation
* a binned feature histogram based and thread parallel implementation of
decision trees to compete with the performance of state of the art gradient
boosting like LightGBM.

:mod:`sklearn.model_selection`

@@ -269,5 +252,3 @@

* Performance issues with `Pipeline.memory`
* see "Everything in Scikit-learn should conform to our API contract" above
* Add a verbose option :issue:`10435`
