
ENH Adds class_names to tree.export_text #25387


Merged

19 commits merged on Feb 8, 2023

Conversation


@mokwilliam mokwilliam commented Jan 12, 2023

Reference Issues/PRs

closes #20576
closes #19824

What does this implement/fix? Explain your changes.

I've posted a message on the issue: #20576 (comment).

To summarize, the goal is to add a class_names argument to the signature of the export_text function so that the user can choose the class name(s). This PR does not change the display format of the tree.

Any other comments?

I'll copy/paste the comment I've posted on the issue #20576.


Explanation

Here's the initialization of my test:

from sklearn.datasets import load_iris
from sklearn import tree

# Load iris dataset
iris = load_iris()

# Create a decision tree classifier
clf_DT = tree.DecisionTreeClassifier(random_state=0)

# Build decision tree classifier from iris dataset
clf_DT = clf_DT.fit(iris.data, iris.target)

As a reminder, the signature of the function is as follows (https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/tree/_export.py#L922):

def export_text(
    decision_tree,
    *,
    feature_names=None,
    max_depth=10,
    spacing=3,
    decimals=2,
    show_weights=False,
)

Of course, if I try to pass the class_names argument, I get the following error:

TypeError: export_text() got an unexpected keyword argument 'class_names'

Without that argument, we get:

feature_names = iris['feature_names']
r = tree.export_text(clf_DT, feature_names=feature_names, max_depth=2)
print(r)
|--- petal width (cm) <= 0.80
|   |--- class: 0
|--- petal width (cm) >  0.80
|   |--- petal width (cm) <= 1.75
|   |   |--- petal length (cm) <= 4.95
|   |   |   |--- truncated branch of depth 2
|   |   |--- petal length (cm) >  4.95
|   |   |   |--- truncated branch of depth 3
|   |--- petal width (cm) >  1.75
|   |   |--- petal length (cm) <= 4.85
|   |   |   |--- truncated branch of depth 2
|   |   |--- petal length (cm) >  4.85
|   |   |   |--- class: 2

After changing the source code (see "Changes in the code" below), we have:

class_names0 = None 
# Then: class_names = decision_tree.classes_

class_names1 = [0, "1"] 
# Then: class_names = decision_tree.classes_
# len(class_names1) != decision_tree.n_classes_

class_names2 = ["class0", "class1", "class2"] 
# Then: class_names = class_names2
# Reason: len(class_names2) == decision_tree.n_classes_

c_names = [class_names0, class_names1, class_names2]

print("===> After change <===")
for i, class_name in enumerate(c_names):
    r = tree.export_text(clf_DT, class_names=class_name, feature_names=feature_names, max_depth=2)
    print(r)
===> After change <===
|--- petal width (cm) <= 0.80
|   |--- class: 0
|--- petal width (cm) >  0.80
|   |--- petal width (cm) <= 1.75
|   |   |--- petal length (cm) <= 4.95
|   |   |   |--- truncated branch of depth 2
|   |   |--- petal length (cm) >  4.95
|   |   |   |--- truncated branch of depth 3
|   |--- petal width (cm) >  1.75
|   |   |--- petal length (cm) <= 4.85
|   |   |   |--- truncated branch of depth 2
|   |   |--- petal length (cm) >  4.85
|   |   |   |--- class: 2

|--- petal width (cm) <= 0.80
|   |--- class: 0
|--- petal width (cm) >  0.80
|   |--- petal width (cm) <= 1.75
|   |   |--- petal length (cm) <= 4.95
|   |   |   |--- truncated branch of depth 2
|   |   |--- petal length (cm) >  4.95
|   |   |   |--- truncated branch of depth 3
|   |--- petal width (cm) >  1.75
|   |   |--- petal length (cm) <= 4.85
|   |   |   |--- truncated branch of depth 2
|   |   |--- petal length (cm) >  4.85
|   |   |   |--- class: 2

|--- petal width (cm) <= 0.80
|   |--- class: class0
|--- petal width (cm) >  0.80
|   |--- petal width (cm) <= 1.75
|   |   |--- petal length (cm) <= 4.95
|   |   |   |--- truncated branch of depth 2
|   |   |--- petal length (cm) >  4.95
|   |   |   |--- truncated branch of depth 3
|   |--- petal width (cm) >  1.75
|   |   |--- petal length (cm) <= 4.85
|   |   |   |--- truncated branch of depth 2
|   |   |--- petal length (cm) >  4.85
|   |   |   |--- class: class2

I hope this explanation is clear. I can make further changes if needed to fine-tune.

Changes in the code

Before change

def export_text(
    decision_tree,
    *,
    feature_names=None,
    max_depth=10,
    spacing=3,
    decimals=2,
    show_weights=False,
):
    check_is_fitted(decision_tree)
    tree_ = decision_tree.tree_
    if is_classifier(decision_tree):
        class_names = decision_tree.classes_

After change

def export_text(
    decision_tree,
    *,
    feature_names=None,
    class_names=None,  # add class_names argument
    max_depth=10,
    spacing=3,
    decimals=2,
    show_weights=False,
):
    """
    ...
    class_names : list of arguments, default=None
        Names of each of the target classes in ascending numerical order.
        Only relevant for classification and not supported for multi-output.
    ...
    """
    check_is_fitted(decision_tree)
    tree_ = decision_tree.tree_
    if is_classifier(decision_tree):
        if (  # length of class_names must be equal to the number of classes given by the tree
            class_names is not None
            and len(class_names) == len(decision_tree.classes_)
        ):
            class_names = class_names
        else:  # by default, we leave decision_tree.classes_
            class_names = decision_tree.classes_

@@ -943,6 +944,10 @@ def export_text(
A list of length n_features containing the feature names.
If None generic names will be used ("feature_0", "feature_1", ...).

class_names : list of arguments, default=None
Contributor

@crispinlogan crispinlogan Jan 13, 2023

Could this be list of str instead of list of arguments? Like the parameter above, feature_names.

Contributor Author

I also wondered about this, but when you print the tree, each element is converted directly to a string by the _add_leaf() function:

def _add_leaf(value, class_name, indent):
    val = ""
    is_classification = isinstance(decision_tree, DecisionTreeClassifier)
    if show_weights or not is_classification:
        val = ["{1:.{0}f}, ".format(decimals, v) for v in value]
        val = "[" + "".join(val)[:-2] + "]"
    if is_classification:
        val += " class: " + str(class_name)
    export_text.report += value_fmt.format(indent, "", val)

So I figured that any element in the list, of any type, is converted to a string.
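A tiny pure-Python sketch of that conversion (note: `leaf_label` is a hypothetical stand-in for the relevant line of `_add_leaf`, not actual scikit-learn code):

```python
def leaf_label(class_name):
    # Hypothetical stand-in: mirrors the `str(class_name)` conversion in
    # `_add_leaf`, so ints, floats, and strings all render the same way.
    return " class: " + str(class_name)

assert leaf_label(0) == " class: 0"
assert leaf_label("class0") == " class: class0"
assert leaf_label(2.5) == " class: 2.5"
```

So whether the caller passes integers or strings, the report ends up with a textual label either way.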

@glemaitre glemaitre changed the title [MRG] New feature: class_names argument in tree.export_text() ENH add class_names in tree.export_text Jan 13, 2023
Member

@glemaitre glemaitre left a comment

We will need unit tests to check the behaviour of each new option:

  • check that we delegate to decision_tree.classes_ when passing None
  • check that we raise a warning and create generic numerical strings when "numeric" is passed
  • check that we overwrite the class names by passing a list
  • check that we raise an error with the wrong number of items in the list

@@ -986,7 +991,10 @@ def export_text(
    check_is_fitted(decision_tree)
    tree_ = decision_tree.tree_
    if is_classifier(decision_tree):
        class_names = decision_tree.classes_
        if class_names is not None and len(class_names) == len(decision_tree.classes_):
Member

Actually, we are breaking backward compatibility here, even if one would like to have a sensible default.

So if we want to preserve the previous behaviour and avoid introducing parameters that need to be deprecated later, I would propose the following:

        if class_names == "numeric":
            class_name = np.argmax(value)
        elif class_names is None:
            class_name = decision_tree.classes_[np.argmax(value)]
        else:
            class_name = class_names[np.argmax(value)]

Then, by default, class_names="numeric". Once we introduce the parameter, we can deprecate "numeric" and default to None.
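To make the proposed dispatch concrete, here is a NumPy-free sketch; `pick_class_name` and its arguments are invented names for this illustration, not scikit-learn API:

```python
def pick_class_name(value, fitted_classes, class_names="numeric"):
    # value: per-class sample counts at a leaf node (a plain list here).
    idx = max(range(len(value)), key=value.__getitem__)  # np.argmax equivalent
    if class_names == "numeric":
        return idx  # legacy behaviour: the positional index itself
    if class_names is None:
        return fitted_classes[idx]  # delegate to decision_tree.classes_
    return class_names[idx]  # user-supplied names

counts = [0, 48, 2]
assert pick_class_name(counts, [-1, 0, 1], "numeric") == 1
assert pick_class_name(counts, [-1, 0, 1], None) == 0
assert pick_class_name(counts, None, ["neg", "zero", "pos"]) == "zero"
```

The three branches correspond exactly to the three accepted values of class_names.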

Member

Here for the validation, we would have something like:

if class_names == "numeric":
    warnings.warn(
        "The option `class_names='numeric'` is deprecated in 1.3 and will be removed "
        "in 1.5. Set `class_names=None`, the classes as seen by `decision_tree` during "
        "`fit` will be used instead.",
        FutureWarning,
    )
elif class_names is not None and len(class_names) != len(decision_tree.classes_):
    raise ValueError(
        "When `class_names` is not None, it should be a list containing as many "
        f"items as `decision_tree.classes_`. Got {len(class_names)} while the tree "
        f"was fitted with {len(decision_tree.classes_)} classes."
    )
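For illustration, the same validation can be written as a standalone function that runs without scikit-learn; `validate_class_names` is a hypothetical helper name, as the real checks live inline in `export_text`:

```python
import warnings

def validate_class_names(class_names, fitted_classes):
    # Hypothetical standalone version of the proposed validation.
    if isinstance(class_names, str) and class_names == "numeric":
        warnings.warn(
            "The option `class_names='numeric'` is deprecated in 1.3 and will "
            "be removed in 1.5. Set `class_names=None` instead.",
            FutureWarning,
        )
        return list(range(len(fitted_classes)))  # generic numeric names
    if class_names is not None and len(class_names) != len(fitted_classes):
        raise ValueError(
            "When `class_names` is not None, it should be a list containing "
            f"as many items as `decision_tree.classes_`. Got {len(class_names)} "
            f"while the tree was fitted with {len(fitted_classes)} classes."
        )
    return list(fitted_classes) if class_names is None else list(class_names)

assert validate_class_names(None, [-1, 1]) == [-1, 1]
assert validate_class_names(["neg", "pos"], [-1, 1]) == ["neg", "pos"]
```

A mismatched list raises the ValueError, while "numeric" emits the FutureWarning and falls back to positional indices.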

Comment on lines 947 to 949
class_names : list of arguments, default=None
    Names of each of the target classes in ascending numerical order.
    Only relevant for classification and not supported for multi-output.
Member

Suggested change

class_names : list of arguments, default=None
    Names of each of the target classes in ascending numerical order.
    Only relevant for classification and not supported for multi-output.

class_names : "numeric", list or None, default="numeric"
    Names of each of the target classes in ascending numerical order.
    Only relevant for classification and not supported for multi-output.

    - if `None`, the class names are delegated to `decision_tree.classes_`;
    - if `"numeric"`, the class names are generic names representing numerical
      numbers (e.g. `["0", "1", ...]`);
    - if a list, the number of items should be the same as in
      `decision_tree.classes_` and will be used.

    .. versionadded:: 1.3
        `class_names` was added in version 1.3.

    .. deprecated:: 1.3
        The `"numeric"` option is deprecated and will be replaced by `None`. Thus,
        `decision_tree.classes_` will be used by default.

@mokwilliam
Contributor Author

Thank you for the reviews! I've made new changes according to what you've written.
Hope it's better!

@glemaitre
Member

We still need the tests as mentioned in my previous comment.

@glemaitre
Member

Interestingly, it seems that we broke backward compatibility, as shown by this failure:

>       assert export_text(clf) == expected_report
E       AssertionError: assert '|--- feature...-- class: 1\n' == '|--- feature...-- class: 1\n'
E           |--- feature_1 <= 0.00
E         - |   |--- class: -1
E         ?                 -
E         + |   |--- class: 1
E           |--- feature_1 >  0.00
E           |   |--- class: 1

Member

@glemaitre glemaitre left a comment

We will also need to have an entry inside the changelog doc/whats_new/v1.3.rst.

@mokwilliam
Contributor Author

We will need unit tests to check the behaviour of each new option:

  • check that we delegate to decision_tree.classes_ when passing None
  • check that we raise a warning and create generic numerical strings when "numeric" is passed
  • check that we overwrite the class names by passing a list
  • check that we raise an error with the wrong number of items in the list

Test: check that we raise an error with the wrong number of items in the list

def test_export_text_errors():
    clf = DecisionTreeClassifier(max_depth=2, random_state=0)
    clf.fit(X, y)

    err_msg = "max_depth bust be >= 0, given -1"
    with pytest.raises(ValueError, match=err_msg):
        export_text(clf, max_depth=-1)
    err_msg = "feature_names must contain 2 elements, got 1"
    with pytest.raises(ValueError, match=err_msg):
        export_text(clf, feature_names=["a"])
    err_msg = (
        "When `class_names` is not None, it should be a list containing as"
        " many items as `decision_tree.classes_`. Got 1 while"
        " the tree was fitted with 2 classes."
    )
    with pytest.raises(ValueError, match=err_msg):
        export_text(clf, class_names=["a"])
    err_msg = "decimals must be >= 0, given -1"
    with pytest.raises(ValueError, match=err_msg):
        export_text(clf, decimals=-1)
    err_msg = "spacing must be > 0, given 0"
    with pytest.raises(ValueError, match=err_msg):
        export_text(clf, spacing=0)

@glemaitre
Member

You need to put the test in the associated test file, test_export.py.

Member

@glemaitre glemaitre left a comment

You need to write a new test to check that we raise a warning when passing "numeric".

Also, I would now expect the different tests to raise the warning. We should modify them to use class_names=None. However, we should check the behaviour of "numeric" at the same time as we check that we raise the warning.

@mokwilliam
Contributor Author

mokwilliam commented Jan 20, 2023

I took the liberty of making some changes and tests:

  • in the export_text function, it didn't recognize the list given for class_names:
        expected_report = dedent(
            """
        |--- feature_1 <= 0.00
        |   |--- class: a
        |--- feature_1 >  0.00
        |   |--- class: b
        """
        ).lstrip()
>       assert export_text(clf, class_names=["a", "b"]) == expected_report
E       AssertionError: assert '|--- feature...-- class: 1\n' == '|--- feature...-- class: b\n'
E           |--- feature_1 <= 0.00
E         - |   |--- class: a
E         ?                 ^
E         + |   |--- class: -1
E         ?                 ^^
E           |--- feature_1 >  0.00
E         - |   |--- class: b...
E
E         ...Full output truncated (4 lines hidden), use '-vv' to show

=> So I've decided to re-add the following line, and the test above now passes.

class_names = class_names  # when class_names is not None and len(class_names) == len(decision_tree.classes_)

  • I've added this line because, if "numeric" is passed, the class names should be generic names representing numbers:

class_names = range(decision_tree.n_classes_)

@glemaitre glemaitre self-requested a review January 23, 2023 09:11
Member

@glemaitre glemaitre left a comment

Sorry for the misleading review. I made the suggestions to remove the "numeric" option, which is not necessary.

Could you also add an entry in the changelog in doc/whats_new/v1.3.rst to acknowledge the new parameter? It will be an enhancement.

@@ -15,6 +15,7 @@
from numbers import Integral

import numpy as np
import warnings
Member

Can you move this import under `from numbers import Integral`?

@@ -1042,6 +1080,7 @@ def print_tree_recurse(node, depth):
        value = tree_.value[node][0]
    else:
        value = tree_.value[node].T[0]

Member

You can revert this change.

Member

If it was not introduced by black

@@ -986,7 +1005,26 @@ def export_text(
    check_is_fitted(decision_tree)
    tree_ = decision_tree.tree_
    if is_classifier(decision_tree):
        class_names = decision_tree.classes_
Member

Uhm, it seems that I misunderstood something when reading the documentation at first. We already use decision_tree.classes_, so we don't need "numeric" or any deprecation (which is good news).

Sorry to have led things this way. We will need to modify (remove) the code :)

Comment on lines 1008 to 1016

    if class_names == "numeric":
        warnings.warn(
            "The option `class_names='numeric'` is deprecated in 1.3 and will be"
            " removed in 1.5. Set `class_names=None`, the classes as seen by"
            " `decision_tree` during `fit` will be used instead.",
            FutureWarning,
        )
        class_names = range(decision_tree.n_classes_)
    elif class_names is not None:
Member

Suggested change

    if class_names == "numeric":
        warnings.warn(
            "The option `class_names='numeric'` is deprecated in 1.3 and will be"
            " removed in 1.5. Set `class_names=None`, the classes as seen by"
            " `decision_tree` during `fit` will be used instead.",
            FutureWarning,
        )
        class_names = range(decision_tree.n_classes_)
    elif class_names is not None:

    if class_names is not None:

Comment on lines 948 to 963
class_names : "numeric", list or None, default="numeric"
    Names of each of the target classes in ascending numerical order.
    Only relevant for classification and not supported for multi-output.

    - if `None`, the class names are delegated to `decision_tree.classes_`;
    - if `"numeric"`, the class names are generic names representing numerical
      numbers (e.g. `["0", "1", ...]`);
    - if a list, the number of items should be the same as in
      `decision_tree.classes_` and will be used.

    .. versionadded:: 1.3
        `class_names` was added in version 1.3.

    .. deprecated:: 1.3
        The `"numeric"` option is deprecated and will be replaced by `None`. Thus,
        `decision_tree.classes_` will be used by default.
Member

Suggested change

class_names : "numeric", list or None, default="numeric"
    Names of each of the target classes in ascending numerical order.
    Only relevant for classification and not supported for multi-output.

    - if `None`, the class names are delegated to `decision_tree.classes_`;
    - if `"numeric"`, the class names are generic names representing numerical
      numbers (e.g. `["0", "1", ...]`);
    - if a list, the number of items should be the same as in
      `decision_tree.classes_` and will be used.

    .. versionadded:: 1.3
        `class_names` was added in version 1.3.

    .. deprecated:: 1.3
        The `"numeric"` option is deprecated and will be replaced by `None`. Thus,
        `decision_tree.classes_` will be used by default.

class_names : list, default="numeric"
    Names of each of the target classes in ascending numerical order.
    Only relevant for classification and not supported for multi-output.

    - if `None`, the class names are delegated to `decision_tree.classes_`;
    - if a list, the number of items should be the same as in
      `decision_tree.classes_` and will be used.

    .. versionadded:: 1.3

Comment on lines 927 to 928
    class_names="numeric",
    max_depth=10,
Member

Suggested change

    class_names="numeric",
    max_depth=10,

    class_names=None,
    max_depth=10,

Comment on lines 350 to 361
def test_export_text_warnings():
    clf = DecisionTreeClassifier(max_depth=2, random_state=0)
    clf.fit(X, y)
    warn_msg = (
        "The option `class_names='numeric'` is deprecated in 1.3 and will be"
        " removed in 1.5. Set `class_names=None`, the classes as seen by"
        " `decision_tree` during `fit` will be used instead."
    )
    with pytest.warns(FutureWarning, match=warn_msg):
        export_text(clf, class_names="numeric")


Member

Suggested change

def test_export_text_warnings():
    clf = DecisionTreeClassifier(max_depth=2, random_state=0)
    clf.fit(X, y)
    warn_msg = (
        "The option `class_names='numeric'` is deprecated in 1.3 and will be"
        " removed in 1.5. Set `class_names=None`, the classes as seen by"
        " `decision_tree` during `fit` will be used instead."
    )
    with pytest.warns(FutureWarning, match=warn_msg):
        export_text(clf, class_names="numeric")

We don't need this test anymore.

Comment on lines 402 to 404
assert export_text(clf, class_names=None, max_depth=0) == expected_report
# testing that the rest of the tree is truncated
assert export_text(clf, max_depth=10) == expected_report
assert export_text(clf, class_names=None, max_depth=10) == expected_report
Member

Suggested change
assert export_text(clf, class_names=None, max_depth=0) == expected_report
# testing that the rest of the tree is truncated
assert export_text(clf, max_depth=10) == expected_report
assert export_text(clf, class_names=None, max_depth=10) == expected_report
assert export_text(clf, max_depth=0) == expected_report
# testing that the rest of the tree is truncated
assert export_text(clf, max_depth=10) == expected_report

Basically, it is this test that indicated to me that we were doing it wrong, since class_names=None appears to match the previous behaviour.

Comment on lines 405 to 414

    expected_report = dedent(
        """
        |--- feature_1 <= 0.00
        |   |--- class: 0
        |--- feature_1 >  0.00
        |   |--- class: 1
        """
    ).lstrip()
    assert export_text(clf, class_names="numeric") == expected_report
Member

Suggested change

    expected_report = dedent(
        """
        |--- feature_1 <= 0.00
        |   |--- class: 0
        |--- feature_1 >  0.00
        |   |--- class: 1
        """
    ).lstrip()
    assert export_text(clf, class_names="numeric") == expected_report

We don't need this case anymore.

@@ -392,7 +421,29 @@ def test_export_text():
        |   |--- class: 1
        """
    ).lstrip()
    assert export_text(clf, feature_names=["a", "b"]) == expected_report
    assert (
Member

You can remove the class_names=None since this will indeed be the default.

@mokwilliam
Contributor Author

Uhm, it seems that I misunderstood something when reading the documentation at first. We already use decision_tree.classes_, so we don't need "numeric" or any deprecation (which is good news).

Sorry to have led things this way. We will need to modify (remove) the code :)

That's fine! I'm making my first contribution to scikit-learn, and I'm learning a lot.

To point out something: when I run pytest sklearn/tree/tests/test_export.py with this code, all tests pass.

    if is_classifier(decision_tree):
        if class_names is not None:
            if len(class_names) != len(decision_tree.classes_):
                raise ValueError(
                    "When `class_names` is not None, it should be a list containing as"
                    " many items as `decision_tree.classes_`. Got"
                    f" {len(class_names)} while the tree was fitted with"
                    f" {len(decision_tree.classes_)} classes."
                )
            else:
                class_names = class_names
        else:
            class_names = decision_tree.classes_

But if I combine the two if statements into one, it doesn't recognize the list passed as class_names.

    if is_classifier(decision_tree):
        if class_names is not None and len(class_names) != len(decision_tree.classes_):
            raise ValueError(
                "When `class_names` is not None, it should be a list containing as"
                " many items as `decision_tree.classes_`. Got"
                f" {len(class_names)} while the tree was fitted with"
                f" {len(decision_tree.classes_)} classes."
            )
        else:
            class_names = decision_tree.classes_

This gives:

        expected_report = dedent(
            """
        |--- feature_1 <= 0.00
        |   |--- class: a
        |--- feature_1 >  0.00
        |   |--- class: b
        """
        ).lstrip()
>       assert export_text(clf, class_names=["a", "b"]) == expected_report
E       AssertionError: assert '|--- feature...-- class: 1\n' == '|--- feature...-- class: b\n'
E           |--- feature_1 <= 0.00
E         - |   |--- class: a
E         ?                 ^
E         + |   |--- class: -1
E         ?                 ^^
E           |--- feature_1 >  0.00
E         - |   |--- class: b...
E
E         ...Full output truncated (4 lines hidden), use '-vv' to show

sklearn\tree\tests\test_export.py:414: AssertionError

@glemaitre glemaitre self-requested a review January 23, 2023 11:19
@glemaitre
Member

        if class_names is not None and len(class_names) != len(decision_tree.classes_):
            raise ValueError(...)
        else:
            class_names = decision_tree.classes_

Indeed, here you will fall into the else branch when passing a valid array, since the right-hand side of the `and` operator will be False.
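The pitfall can be reproduced in isolation. `resolve_buggy` below is a hypothetical minimal stand-in for the combined-condition version, not scikit-learn code:

```python
def resolve_buggy(class_names, fitted_classes):
    # With the combined condition, a *valid* list makes the whole test False,
    # so control falls into `else` and the caller's list is discarded.
    if class_names is not None and len(class_names) != len(fitted_classes):
        raise ValueError("length mismatch")
    else:
        class_names = fitted_classes
    return class_names

# The user's names are silently replaced by the fitted classes:
assert resolve_buggy(["a", "b"], [-1, 1]) == [-1, 1]
```

This is why the valid-list case needs its own branch (or the None case handled first).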

@mokwilliam
Contributor Author

mokwilliam commented Jan 23, 2023

Yep, and after raising an error, we don't need an else statement. So it should be:

    if is_classifier(decision_tree):
        if class_names is not None:
            if len(class_names) != len(decision_tree.classes_):
                raise ValueError(
                    "When `class_names` is not None, it should be a list containing as"
                    " many items as `decision_tree.classes_`. Got"
                    f" {len(class_names)} while the tree was fitted with"
                    f" {len(decision_tree.classes_)} classes."
                )
            class_names = class_names
        else:
            class_names = decision_tree.classes_

Comment on lines 998 to 1008
    if class_names is not None:
        if len(class_names) != len(decision_tree.classes_):
            raise ValueError(
                "When `class_names` is not None, it should be a list containing as"
                " many items as `decision_tree.classes_`. Got"
                f" {len(class_names)} while the tree was fitted with"
                f" {len(decision_tree.classes_)} classes."
            )
        class_names = class_names
    else:
        class_names = decision_tree.classes_
Member

Actually, the most compact way would be the following.

Suggested change

    if class_names is not None:
        if len(class_names) != len(decision_tree.classes_):
            raise ValueError(
                "When `class_names` is not None, it should be a list containing as"
                " many items as `decision_tree.classes_`. Got"
                f" {len(class_names)} while the tree was fitted with"
                f" {len(decision_tree.classes_)} classes."
            )
        class_names = class_names
    else:
        class_names = decision_tree.classes_

    if class_names is not None and len(class_names) != len(decision_tree.classes_):
        raise ValueError(
            "When `class_names` is not None, it should be a list containing as"
            " many items as `decision_tree.classes_`. Got"
            f" {len(class_names)} while the tree was fitted with"
            f" {len(decision_tree.classes_)} classes."
        )
    elif class_names is None:
        class_names = decision_tree.classes_

- |Enhancement| Adds a `class_names` parameter to
  :func:`tree.export_text`. This allows specifying a name for each target
  class in ascending numerical order.
  :pr:`25387` by :user:`William M <Akbeeh>`, :user:`Guillaume Lemaitre <glemaitre>`, and
Member

You can remove my name here. This is fine to not have it.

:pr:`25387` by :user:`William M <Akbeeh>`, :user:`Guillaume Lemaitre <glemaitre>`, and
:user:`crispinlogan <crispinlogan>`.

:mod:`sklearn.tree`
Member

I think that you mixed up the entry here. Could you make sure to have sklearn.tree in the right alphabetical order and your entry just below it?

Member

@glemaitre glemaitre left a comment

LGTM. Thanks @Akbeeh

@mokwilliam
Contributor Author

Thank you for all the comments. It was informative and I learned from it !

@glemaitre glemaitre added the Waiting for Second Reviewer First reviewer is done, need a second one! label Jan 31, 2023
@glemaitre
Member

Adding a tag to indicate that this PR is waiting for another approval.

Member

@thomasjpfan thomasjpfan left a comment

Thank you for the PR @Akbeeh !

Comment on lines 404 to 412
    expected_report = dedent(
        """
        |--- feature_1 <= 0.00
        |   |--- class: a
        |--- feature_1 >  0.00
        |   |--- class: b
        """
    ).lstrip()
    assert export_text(clf, class_names=["a", "b"]) == expected_report
Member

Nit: To make it a little more different compared to the test above:

Suggested change

    expected_report = dedent(
        """
        |--- feature_1 <= 0.00
        |   |--- class: a
        |--- feature_1 >  0.00
        |   |--- class: b
        """
    ).lstrip()
    assert export_text(clf, class_names=["a", "b"]) == expected_report

    expected_report = dedent(
        """
        |--- feature_1 <= 0.00
        |   |--- class: cat
        |--- feature_1 >  0.00
        |   |--- class: dog
        """
    ).lstrip()
    assert export_text(clf, class_names=["cat", "dog"]) == expected_report

    - if `None`, the class names are delegated to `decision_tree.classes_`;
    - if a list, the number of items should be the same as in
      `decision_tree.classes_` and will be used.

Member

This needs a .. versionadded:: 1.3 directive to indicate that the parameter was added in 1.3.

Comment on lines 998 to 1006
    if class_names is not None and len(class_names) != len(decision_tree.classes_):
        raise ValueError(
            "When `class_names` is not None, it should be a list containing as"
            " many items as `decision_tree.classes_`. Got"
            f" {len(class_names)} while the tree was fitted with"
            f" {len(decision_tree.classes_)} classes."
        )
    elif class_names is None:
        class_names = decision_tree.classes_
Member

Nit: This order of checking is a little clearer to me:

Suggested change

    if class_names is not None and len(class_names) != len(decision_tree.classes_):
        raise ValueError(
            "When `class_names` is not None, it should be a list containing as"
            " many items as `decision_tree.classes_`. Got"
            f" {len(class_names)} while the tree was fitted with"
            f" {len(decision_tree.classes_)} classes."
        )
    elif class_names is None:
        class_names = decision_tree.classes_

    if class_names is None:
        class_names = decision_tree.classes_
    elif len(class_names) != len(decision_tree.classes_):
        raise ValueError(
            "When `class_names` is a list, it should contain as"
            " many items as `decision_tree.classes_`. Got"
            f" {len(class_names)} while the tree was fitted with"
            f" {len(decision_tree.classes_)} classes."
        )

Also, I think it is less wordy to directly state "When class_names is a list". (The test needs to be updated as well.)

@mokwilliam
Copy link
Contributor Author

Thanks for the remarks, I've made the changes. Hope it fulfills your requirements!

@thomasjpfan thomasjpfan left a comment


Minor comment, otherwise LGTM!

mokwilliam and others added 2 commits February 8, 2023 21:00
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@thomasjpfan thomasjpfan changed the title from "ENH add class_names in tree.export_text" to "ENH Adds class_names to tree.export_text" on Feb 8, 2023
@thomasjpfan thomasjpfan merged commit aae5c83 into scikit-learn:main Feb 8, 2023
Vincent-Maladiere pushed a commit to Vincent-Maladiere/scikit-learn that referenced this pull request Feb 14, 2023
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
AdarshPrusty7 added a commit to AdarshPrusty7/GSGP that referenced this pull request Mar 6, 2023
* ENH Adds `class_names` to `tree.export_text` (scikit-learn#25387)

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Labels
module:tree, Waiting for Second Reviewer (First reviewer is done, need a second one!)
Projects
None yet
4 participants