REL scikit-learn 1.3.2 #27603

Merged: 12 commits, Oct 23, 2023
6 changes: 0 additions & 6 deletions .github/workflows/wheels.yml
@@ -70,8 +70,6 @@ jobs:
- os: windows-latest
python: 312
platform_id: win_amd64
-# TODO: remove when Python 3.12 is released
-prerelease: "True"

# Linux 64 bit manylinux2014
- os: ubuntu-latest
@@ -97,8 +95,6 @@ jobs:
python: 312
platform_id: manylinux_x86_64
manylinux_image: manylinux2014
-# TODO: remove when Python 3.12 is released
-prerelease: "True"

# MacOS x86_64
- os: macos-latest
@@ -116,8 +112,6 @@ jobs:
- os: macos-latest
python: 312
platform_id: macosx_x86_64
-# TODO: remove when Python 3.12 is released
-prerelease: "True"

# MacOS arm64
# The wheel for the latest Python version is built and tested on
4 changes: 0 additions & 4 deletions build_tools/cirrus/arm_wheel.yml
@@ -25,8 +25,6 @@ macos_arm64_wheel_task:
# is actually tested on Cirrus CI.
- env:
CIBW_BUILD: cp312-macosx_arm64
-# TODO: remove when Python 3.12 is released
-CIBW_PRERELEASE_PYTHONS: True

conda_script:
- curl -L --retry 10 -o ~/mambaforge.sh https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
@@ -78,8 +76,6 @@ linux_arm64_wheel_task:
CIBW_TEST_SKIP: "*_aarch64"
- env:
CIBW_BUILD: cp312-manylinux_aarch64
-# TODO: remove when Python 3.12 is released
-CIBW_PRERELEASE_PYTHONS: True

cibuildwheel_script:
- apt install -y python3 python-is-python3
4 changes: 3 additions & 1 deletion doc/templates/index.html
@@ -169,7 +169,9 @@ <h4 class="sk-landing-call-header">News</h4>
<li><strong>On-going development:</strong>
<a href="https://scikit-learn.org/dev/whats_new.html"><strong>What's new</strong> (Changelog)</a>
</li>
-<li><strong>September 2023.</strong> scikit-learn 1.3.1 is available for download (<a href="whats_new/v1.3.html#version-1-3-1">Changelog</a>).
+<li><strong>October 2023.</strong> scikit-learn 1.3.2 is available for download (<a href="whats_new/v1.3.html#version-1-3-2">Changelog</a>).
+</li>
+<li><strong>September 2023.</strong> scikit-learn 1.3.1 is available for download (<a href="whats_new/v1.3.html#version-1-3-1">Changelog</a>).
</li>
<li><strong>June 2023.</strong> scikit-learn 1.3.0 is available for download (<a href="whats_new/v1.3.html#version-1-3-0">Changelog</a>).
</li>
41 changes: 41 additions & 0 deletions doc/whats_new/v1.3.rst
@@ -2,6 +2,47 @@

.. currentmodule:: sklearn

+.. _changes_1_3_2:
+
+Version 1.3.2
+=============
+
+**October 2023**
+
+Changelog
+---------
+
+:mod:`sklearn.datasets`
+.......................
+
+- |Fix| All dataset fetchers now accept `data_home` as any object that implements
+the :class:`os.PathLike` interface, for instance, :class:`pathlib.Path`.
+:pr:`27468` by :user:`Yao Xiao <Charlie-XIAO>`.
+
+:mod:`sklearn.decomposition`
+............................
+
+- |Fix| Fixes a bug in :class:`decomposition.KernelPCA` by forcing the output of
+the internal :class:`preprocessing.KernelCenterer` to be a default array. When the
+arpack solver is used, it expects an array with a `dtype` attribute.
+:pr:`27583` by :user:`Guillaume Lemaitre <glemaitre>`.
+
+:mod:`sklearn.metrics`
+......................
+
+- |Fix| Fixes a bug for metrics using `zero_division=np.nan`
+(e.g. :func:`~metrics.precision_score`) within a parallel loop
+(e.g. :func:`~model_selection.cross_val_score`) where the singleton for `np.nan`
+will be different in the sub-processes.
+:pr:`27573` by :user:`Guillaume Lemaitre <glemaitre>`.
+
+:mod:`sklearn.tree`
+...................
+
+- |Fix| Do not leak data via non-initialized memory in decision tree pickle files and make
+the generation of those files deterministic. :pr:`27580` by :user:`Loïc Estève <lesteve>`.
+
+
.. _changes_1_3_1:

Version 1.3.1
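
The `sklearn.metrics` changelog entry above concerns a NaN sentinel crossing process boundaries. The following is a minimal sketch of the underlying Python behaviour, not scikit-learn's actual fix: `math.nan` stands in for `np.nan`, the pickle round trip stands in for sending an argument to a worker process, and `is_nan_sentinel` is a hypothetical helper introduced only for illustration.

```python
import math
import pickle


def is_nan_sentinel(value):
    """Hypothetical helper: detect a NaN sentinel by value, not by identity."""
    return isinstance(value, float) and math.isnan(value)


# Simulate shipping the sentinel to a worker process via a pickle round trip.
received = pickle.loads(pickle.dumps(math.nan))

identity_check = received is math.nan     # fragile: relies on object identity
equality_check = received == math.nan     # NaN never compares equal to anything
robust_check = is_nan_sentinel(received)  # value-based, works in any process

print(identity_check, equality_check, robust_check)
```

Under these assumptions, only the value-based check recognises the sentinel after the round trip, which is why identity comparisons against a NaN constant are unreliable once sub-processes are involved.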
2 changes: 1 addition & 1 deletion sklearn/__init__.py
@@ -38,7 +38,7 @@
# Dev branch marker is: 'X.Y.dev' or 'X.Y.devN' where N is an integer.
# 'X.Y.dev0' is the canonical version of 'X.Y.dev'
#
-__version__ = "1.3.1"
+__version__ = "1.3.2"


# On OSX, we can get a runtime error due to multiple OpenMP libraries loaded
14 changes: 10 additions & 4 deletions sklearn/covariance/tests/test_graphical_lasso.py
@@ -24,7 +24,12 @@
)


-def test_graphical_lasso(random_state=0):
+def test_graphical_lassos(random_state=1):
+"""Test the graphical lasso solvers.
+
+This check is unstable for some random seeds, where the covariances found with the
+"cd" and "lars" solvers differ (4 cases / 100 tries).
+"""
# Sample data from a sparse multivariate normal
dim = 20
n_samples = 100
@@ -46,10 +51,11 @@ def test_graphical_lasso(random_state=0):
costs, dual_gap = np.array(costs).T
# Check that the costs always decrease (doesn't hold if alpha == 0)
if not alpha == 0:
-assert_array_less(np.diff(costs), 0)
+# use 1e-12 since the cost can be exactly 0
+assert_array_less(np.diff(costs), 1e-12)
# Check that the 2 approaches give similar results
-assert_array_almost_equal(covs["cd"], covs["lars"], decimal=4)
-assert_array_almost_equal(icovs["cd"], icovs["lars"], decimal=4)
+assert_allclose(covs["cd"], covs["lars"], atol=1e-4)
+assert_allclose(icovs["cd"], icovs["lars"], atol=1e-4)

# Smoke test the estimator
model = GraphicalLasso(alpha=0.25).fit(X)
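
The tolerance change in this test can be illustrated without NumPy. The sketch below is an assumption-laden stand-in, not the test itself: the cost values are made up, and `strictly_decreasing` and `non_increasing` are hypothetical helpers that mimic comparing successive differences against `0` and against `1e-12`.

```python
def strictly_decreasing(costs):
    """Old-style check: every successive difference must be strictly below 0."""
    return all(b - a < 0 for a, b in zip(costs, costs[1:]))


def non_increasing(costs, tol=1e-12):
    """Relaxed check: a difference of exactly 0 passes, thanks to a tiny tolerance."""
    return all(b - a < tol for a, b in zip(costs, costs[1:]))


# Illustrative (made-up) cost sequence that plateaus once the solver has converged.
costs = [3.0, 1.5, 1.0, 1.0]

print(strictly_decreasing(costs))
print(non_increasing(costs))
```

With a plateau at the end of the sequence, the strict comparison rejects a run that has simply converged, while the tolerance-based comparison accepts it and would still reject any genuine increase in cost.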
4 changes: 2 additions & 2 deletions sklearn/datasets/_base.py
@@ -57,7 +57,7 @@ def get_data_home(data_home=None) -> str:
----------
data_home : str or path-like, default=None
The path to scikit-learn data directory. If `None`, the default path
-is `~/sklearn_learn_data`.
+is `~/scikit_learn_data`.

Returns
-------
@@ -84,7 +84,7 @@ def clear_data_home(data_home=None):
----------
data_home : str or path-like, default=None
The path to scikit-learn data directory. If `None`, the default path
-is `~/sklearn_learn_data`.
+is `~/scikit_learn_data`.
"""
data_home = get_data_home(data_home)
shutil.rmtree(data_home)
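
As a rough illustration of the documented default, the sketch below resolves a data directory to a plain string. It is a hypothetical `resolve_data_home`, not the library's `get_data_home`: the `env` parameter and the `SCIKIT_LEARN_DATA` override are assumptions made for the example, and the function deliberately does not touch the filesystem.

```python
import os
from pathlib import Path


def resolve_data_home(data_home=None, env=None):
    """Hypothetical sketch: return the data directory as a ``str``."""
    env = os.environ if env is None else env
    if data_home is None:
        # Assumption: an environment variable may override the default location.
        data_home = env.get(
            "SCIKIT_LEARN_DATA", os.path.join("~", "scikit_learn_data")
        )
    # os.fspath accepts str as well as any os.PathLike object (e.g. pathlib.Path).
    return os.path.expanduser(os.fspath(data_home))


default_home = resolve_data_home(env={})
custom_home = resolve_data_home(Path("some") / "cache")

print(default_home)
print(custom_home)
```

Passing `env={}` keeps the example deterministic, so the default always ends in `scikit_learn_data`, the directory name the corrected docstrings refer to.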
6 changes: 3 additions & 3 deletions sklearn/datasets/_california_housing.py
@@ -23,7 +23,7 @@

import logging
import tarfile
-from os import makedirs, remove
+from os import PathLike, makedirs, remove
from os.path import exists

import joblib
@@ -53,7 +53,7 @@

@validate_params(
{
-"data_home": [str, None],
+"data_home": [str, PathLike, None],
"download_if_missing": ["boolean"],
"return_X_y": ["boolean"],
"as_frame": ["boolean"],
@@ -76,7 +76,7 @@ def fetch_california_housing(

Parameters
----------
-data_home : str, default=None
+data_home : str or path-like, default=None
Specify another download and cache folder for the datasets. By default
all scikit-learn data is stored in '~/scikit_learn_data' subfolders.

4 changes: 2 additions & 2 deletions sklearn/datasets/_covtype.py
@@ -65,7 +65,7 @@

@validate_params(
{
-"data_home": [str, None],
+"data_home": [str, os.PathLike, None],
"download_if_missing": ["boolean"],
"random_state": ["random_state"],
"shuffle": ["boolean"],
@@ -98,7 +98,7 @@ def fetch_covtype(

Parameters
----------
-data_home : str, default=None
+data_home : str or path-like, default=None
Specify another download and cache folder for the datasets. By default
all scikit-learn data is stored in '~/scikit_learn_data' subfolders.

4 changes: 2 additions & 2 deletions sklearn/datasets/_kddcup99.py
@@ -50,7 +50,7 @@
@validate_params(
{
"subset": [StrOptions({"SA", "SF", "http", "smtp"}), None],
-"data_home": [str, None],
+"data_home": [str, os.PathLike, None],
"shuffle": ["boolean"],
"random_state": ["random_state"],
"percent10": ["boolean"],
@@ -92,7 +92,7 @@ def fetch_kddcup99(
To return the corresponding classical subsets of kddcup 99.
If None, return the entire kddcup 99 dataset.

-data_home : str, default=None
+data_home : str or path-like, default=None
Specify another download and cache folder for the datasets. By default
all scikit-learn data is stored in '~/scikit_learn_data' subfolders.

10 changes: 5 additions & 5 deletions sklearn/datasets/_lfw.py
@@ -10,7 +10,7 @@

import logging
from numbers import Integral, Real
-from os import listdir, makedirs, remove
+from os import PathLike, listdir, makedirs, remove
from os.path import exists, isdir, join

import numpy as np
@@ -234,7 +234,7 @@ def _fetch_lfw_people(

@validate_params(
{
-"data_home": [str, None],
+"data_home": [str, PathLike, None],
"funneled": ["boolean"],
"resize": [Interval(Real, 0, None, closed="neither"), None],
"min_faces_per_person": [Interval(Integral, 0, None, closed="left"), None],
@@ -272,7 +272,7 @@ def fetch_lfw_people(

Parameters
----------
-data_home : str, default=None
+data_home : str or path-like, default=None
Specify another download and cache folder for the datasets. By default
all scikit-learn data is stored in '~/scikit_learn_data' subfolders.

@@ -431,7 +431,7 @@ def _fetch_lfw_pairs(
@validate_params(
{
"subset": [StrOptions({"train", "test", "10_folds"})],
-"data_home": [str, None],
+"data_home": [str, PathLike, None],
"funneled": ["boolean"],
"resize": [Interval(Real, 0, None, closed="neither"), None],
"color": ["boolean"],
@@ -480,7 +480,7 @@ def fetch_lfw_pairs(
official evaluation set that is meant to be used with a 10-folds
cross validation.

-data_home : str, default=None
+data_home : str or path-like, default=None
Specify another download and cache folder for the datasets. By
default all scikit-learn data is stored in '~/scikit_learn_data'
subfolders.
6 changes: 3 additions & 3 deletions sklearn/datasets/_olivetti_faces.py
@@ -13,7 +13,7 @@
# Copyright (c) 2011 David Warde-Farley <wardefar at iro dot umontreal dot ca>
# License: BSD 3 clause

-from os import makedirs, remove
+from os import PathLike, makedirs, remove
from os.path import exists

import joblib
@@ -36,7 +36,7 @@

@validate_params(
{
-"data_home": [str, None],
+"data_home": [str, PathLike, None],
"shuffle": ["boolean"],
"random_state": ["random_state"],
"download_if_missing": ["boolean"],
@@ -67,7 +67,7 @@ def fetch_olivetti_faces(

Parameters
----------
-data_home : str, default=None
+data_home : str or path-like, default=None
Specify another download and cache folder for the datasets. By default
all scikit-learn data is stored in '~/scikit_learn_data' subfolders.

6 changes: 3 additions & 3 deletions sklearn/datasets/_openml.py
@@ -749,7 +749,7 @@ def _valid_data_column_names(features_list, target_columns):
"name": [str, None],
"version": [Interval(Integral, 1, None, closed="left"), StrOptions({"active"})],
"data_id": [Interval(Integral, 1, None, closed="left"), None],
-"data_home": [str, None],
+"data_home": [str, os.PathLike, None],
"target_column": [str, list, None],
"cache": [bool],
"return_X_y": [bool],
@@ -769,7 +769,7 @@ def fetch_openml(
*,
version: Union[str, int] = "active",
data_id: Optional[int] = None,
-data_home: Optional[str] = None,
+data_home: Optional[Union[str, os.PathLike]] = None,
target_column: Optional[Union[str, List]] = "default-target",
cache: bool = True,
return_X_y: bool = False,
@@ -815,7 +815,7 @@ def fetch_openml(
dataset. If data_id is not given, name (and potential version) are
used to obtain a dataset.

-data_home : str, default=None
+data_home : str or path-like, default=None
Specify another download and cache folder for the data sets. By default
all scikit-learn data is stored in '~/scikit_learn_data' subfolders.

6 changes: 3 additions & 3 deletions sklearn/datasets/_rcv1.py
@@ -10,7 +10,7 @@

import logging
from gzip import GzipFile
-from os import makedirs, remove
+from os import PathLike, makedirs, remove
from os.path import exists, join

import joblib
@@ -74,7 +74,7 @@

@validate_params(
{
-"data_home": [str, None],
+"data_home": [str, PathLike, None],
"subset": [StrOptions({"train", "test", "all"})],
"download_if_missing": ["boolean"],
"random_state": ["random_state"],
@@ -111,7 +111,7 @@ def fetch_rcv1(

Parameters
----------
-data_home : str, default=None
+data_home : str or path-like, default=None
Specify another download and cache folder for the datasets. By default
all scikit-learn data is stored in '~/scikit_learn_data' subfolders.

6 changes: 3 additions & 3 deletions sklearn/datasets/_species_distributions.py
@@ -39,7 +39,7 @@

import logging
from io import BytesIO
-from os import makedirs, remove
+from os import PathLike, makedirs, remove
from os.path import exists

import joblib
@@ -136,7 +136,7 @@ def construct_grids(batch):


@validate_params(
-{"data_home": [str, None], "download_if_missing": ["boolean"]},
+{"data_home": [str, PathLike, None], "download_if_missing": ["boolean"]},
prefer_skip_nested_validation=True,
)
def fetch_species_distributions(*, data_home=None, download_if_missing=True):
@@ -146,7 +146,7 @@ def fetch_species_distributions(*, data_home=None, download_if_missing=True):

Parameters
----------
-data_home : str, default=None
+data_home : str or path-like, default=None
Specify another download and cache folder for the datasets. By default
all scikit-learn data is stored in '~/scikit_learn_data' subfolders.
