ENH Add support for feature names in monotonic_cst #24855
Merged: ogrisel merged 26 commits into scikit-learn:main from ogrisel:monotonic_cst-feature-names on Nov 15, 2022
825400e
Add support for feature names in monotonic_cst
ogrisel e660969
docstring format
ogrisel 3b15660
docstring format
ogrisel cecf92e
Fix indentation in docstring?
ogrisel 1cd97d0
More docstring tweaking
ogrisel 44c33d3
Add a test for the nominal case
ogrisel 73d8a37
Test error messages
ogrisel 7ed48f5
Changelog entry
ogrisel 498ca43
Update example
ogrisel 0641514
Merge branch 'main' into monotonic_cst-feature-names
ogrisel 9f37dd7
Docstring tweak for sphinx?
ogrisel c754b80
More indentation tweaking
ogrisel 5eb8174
Update the regressors' docstring
ogrisel a523aaf
Fix docstring formating and phrasing
ogrisel f929848
Apply suggestions from code review
ogrisel 86c4cc7
Fix undefined variable
ogrisel 5ac2617
Exclude invalid values in ]-1, 1[
ogrisel 36dc2b7
Report number of unexpected feature names
ogrisel 766d1f8
Update sklearn/ensemble/_hist_gradient_boosting/grower.py
ogrisel 1925f0b
Link to example from docstring
ogrisel 7b2b3c3
Add missing test case to increase coverage
ogrisel fa814d0
Fix ref to example section
ogrisel afd2fa6
Merge branch 'main' into monotonic_cst-feature-names
ogrisel 161e87c
Update sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_…
ogrisel 5c6a7ee
Cosmetic change in error message
ogrisel 90060da
Apply suggestions from code review
ogrisel
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -23,6 +23,7 @@ | |
check_is_fitted, | ||
check_consistent_length, | ||
_check_sample_weight, | ||
_check_monotonic_cst, | ||
) | ||
from ...utils._param_validation import Interval, StrOptions | ||
from ...utils._openmp_helpers import _openmp_effective_n_threads | ||
|
@@ -91,7 +92,7 @@ class BaseHistGradientBoosting(BaseEstimator, ABC): | |
"max_depth": [Interval(Integral, 1, None, closed="left"), None], | ||
"min_samples_leaf": [Interval(Integral, 1, None, closed="left")], | ||
"l2_regularization": [Interval(Real, 0, None, closed="left")], | ||
"monotonic_cst": ["array-like", None], | ||
"monotonic_cst": ["array-like", dict, None], | ||
"interaction_cst": [Iterable, None], | ||
"n_iter_no_change": [Interval(Integral, 1, None, closed="left")], | ||
"validation_fraction": [ | ||
|
@@ -369,6 +370,7 @@ def fit(self, X, y, sample_weight=None): | |
self._random_seed = rng.randint(np.iinfo(np.uint32).max, dtype="u8") | ||
|
||
self._validate_parameters() | ||
monotonic_cst = _check_monotonic_cst(self, self.monotonic_cst) | ||
|
||
# used for validation in predict | ||
n_samples, self._n_features = X.shape | ||
|
@@ -664,7 +666,7 @@ def fit(self, X, y, sample_weight=None): | |
n_bins_non_missing=self._bin_mapper.n_bins_non_missing_, | ||
has_missing_values=has_missing_values, | ||
is_categorical=self.is_categorical_, | ||
monotonic_cst=self.monotonic_cst, | ||
monotonic_cst=monotonic_cst, | ||
interaction_cst=interaction_cst, | ||
max_leaf_nodes=self.max_leaf_nodes, | ||
max_depth=self.max_depth, | ||
|
@@ -1259,16 +1261,27 @@ class HistGradientBoostingRegressor(RegressorMixin, BaseHistGradientBoosting): | |
.. versionchanged:: 1.2 | ||
Added support for feature names. | ||
|
||
monotonic_cst : array-like of int of shape (n_features), default=None | ||
Indicates the monotonic constraint to enforce on each feature. | ||
- 1: monotonic increase | ||
- 0: no constraint | ||
- -1: monotonic decrease | ||
monotonic_cst : array-like of int of shape (n_features) or dict, default=None | ||
Monotonic constraint to enforce on each feature are specified using the | ||
following integer values: | ||
|
||
- 1: monotonic increase | ||
- 0: no constraint | ||
- -1: monotonic decrease | ||
I had to remove the indentation of the bullet list to avoid a warning for the old version of sphinx... |
||
|
||
If a dict with str keys, map feature to monotonic constraints by name. | ||
If an array, the features are mapped to constraints by position. See | ||
:ref:`monotonic_cst_features_names` for a usage example. | ||
||
|
||
The constraints are only valid for binary classifications and hold | ||
over the probability of the positive class. | ||
Read more in the :ref:`User Guide <monotonic_cst_gbdt>`. | ||
|
||
.. versionadded:: 0.23 | ||
|
||
.. versionchanged:: 1.2 | ||
Accept dict of constraints with feature names as keys. | ||
|
||
interaction_cst : iterable of iterables of int, default=None | ||
Specify interaction constraints, the sets of features which can | ||
interact with each other in child node splits. | ||
|
@@ -1596,18 +1609,27 @@ class HistGradientBoostingClassifier(ClassifierMixin, BaseHistGradientBoosting): | |
.. versionchanged:: 1.2 | ||
Added support for feature names. | ||
|
||
monotonic_cst : array-like of int of shape (n_features), default=None | ||
Indicates the monotonic constraint to enforce on each feature. | ||
- 1: monotonic increase | ||
- 0: no constraint | ||
- -1: monotonic decrease | ||
monotonic_cst : array-like of int of shape (n_features) or dict, default=None | ||
Monotonic constraint to enforce on each feature are specified using the | ||
following integer values: | ||
|
||
- 1: monotonic increase | ||
- 0: no constraint | ||
- -1: monotonic decrease | ||
|
||
If a dict with str keys, map feature to monotonic constraints by name. | ||
If an array, the features are mapped to constraints by position. See | ||
:ref:`monotonic_cst_features_names` for a usage example. | ||
|
||
The constraints are only valid for binary classifications and hold | ||
over the probability of the positive class. | ||
Read more in the :ref:`User Guide <monotonic_cst_gbdt>`. | ||
|
||
.. versionadded:: 0.23 | ||
|
||
.. versionchanged:: 1.2 | ||
Accept dict of constraints with feature names as keys. | ||
|
||
interaction_cst : iterable of iterables of int, default=None | ||
Specify interaction constraints, the sets of features which can | ||
interact with each other in child node splits. | ||
|
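The diff above routes `self.monotonic_cst` through `_check_monotonic_cst` before handing the result to the tree grower, and the docstring says a dict maps constraints by feature name while an array maps them by position. The following is only a rough sketch of that kind of normalization, not scikit-learn's implementation: `resolve_monotonic_cst` is a hypothetical helper, and the choice to treat names missing from the dict as unconstrained (0) is an assumption made for this illustration.

```python
def resolve_monotonic_cst(feature_names, monotonic_cst):
    """Hypothetical helper: return one constraint (-1, 0 or 1) per feature.

    This is an illustrative sketch, not the private scikit-learn function.
    """
    n_features = len(feature_names)
    if monotonic_cst is None:
        # No constraint requested for any feature.
        return [0] * n_features

    if isinstance(monotonic_cst, dict):
        # Dict form: keys are feature names, values are constraints.
        unexpected = sorted(set(monotonic_cst) - set(feature_names))
        if unexpected:
            raise ValueError(
                f"monotonic_cst contains {len(unexpected)} unexpected "
                f"feature name(s): {unexpected}"
            )
        # Assumption of this sketch: unlisted features get no constraint.
        resolved = [monotonic_cst.get(name, 0) for name in feature_names]
    else:
        # Array-like form: constraints are matched to features by position.
        resolved = list(monotonic_cst)
        if len(resolved) != n_features:
            raise ValueError(
                f"monotonic_cst has {len(resolved)} entries, "
                f"expected {n_features}"
            )

    # Only -1, 0 and 1 are meaningful; anything in between is rejected.
    invalid = [value for value in resolved if value not in (-1, 0, 1)]
    if invalid:
        raise ValueError(f"monotonic_cst must only contain -1, 0 or 1, got {invalid}")
    return resolved
```

Both input forms end up as the same positional list, which is why downstream code such as the grower only needs to handle one representation.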
I reduced the number of samples to make the plot less crowded while conveying the same intuition; this also makes the example run faster.