[MRG] Monotonic constraints for GBDT #15582

Merged
merged 63 commits into from
Mar 24, 2020
Changes from all commits
63 commits
7eb7827
WIP
NicolasHug Nov 6, 2019
698636b
more WIP
NicolasHug Nov 6, 2019
e62fe14
original tests OK except for splitting
NicolasHug Nov 6, 2019
ad0e9f1
some tests
NicolasHug Nov 7, 2019
cec48bc
more tests
NicolasHug Nov 7, 2019
0e84854
comments
NicolasHug Nov 7, 2019
583f2a1
cleaned splitter tests and ignored the warm start ones
NicolasHug Nov 7, 2019
e818904
Merge branch 'master' of github.com:scikit-learn/scikit-learn into mo…
NicolasHug Nov 7, 2019
887ca02
WIP
NicolasHug Nov 8, 2019
086a766
WIP
NicolasHug Nov 8, 2019
1605276
fouund bug, will fix later
NicolasHug Nov 8, 2019
b56538c
now only shrink at after tree is grown
NicolasHug Nov 8, 2019
e823883
more tests
NicolasHug Nov 9, 2019
5d943c1
tests
NicolasHug Nov 9, 2019
cf502cc
Some cleaning
NicolasHug Nov 9, 2019
165490e
small optimization for best bin finding
NicolasHug Nov 9, 2019
f6e9ad8
cleaning
NicolasHug Nov 9, 2019
e7913f5
used enum type for constraint
NicolasHug Nov 9, 2019
2ad5d1a
flake8
NicolasHug Nov 9, 2019
7d524ed
comments
NicolasHug Nov 9, 2019
b818001
Merge branch 'master' of github.com:scikit-learn/scikit-learn into mo…
NicolasHug Nov 9, 2019
8629171
cleaned diff
NicolasHug Nov 9, 2019
84e3e14
Added example
NicolasHug Nov 9, 2019
c727d9e
pep8
NicolasHug Nov 10, 2019
de83163
use rand instead of random
NicolasHug Nov 10, 2019
c078df4
renaming
NicolasHug Nov 11, 2019
70827d3
simplify example code
NicolasHug Nov 11, 2019
d2035b4
example
NicolasHug Nov 11, 2019
0d0264b
added init parameter
NicolasHug Nov 11, 2019
88fe797
pep8
NicolasHug Nov 11, 2019
2d19106
fixed test
NicolasHug Nov 11, 2019
b8a1e0d
some UG + example
NicolasHug Nov 11, 2019
2cf348e
dont support for multiclass
NicolasHug Nov 11, 2019
e3c227b
pep
NicolasHug Nov 11, 2019
a8ababf
Merge branch 'master' of github.com:scikit-learn/scikit-learn into mo…
NicolasHug Nov 13, 2019
e25e8f8
addressed comments
NicolasHug Nov 13, 2019
2f09ade
minimal comment
NicolasHug Nov 16, 2019
ad77286
Merge branch 'master' of github.com:scikit-learn/scikit-learn into mo…
NicolasHug Feb 23, 2020
866b1f0
Merge branch 'master' of github.com:scikit-learn/scikit-learn into mo…
NicolasHug Feb 25, 2020
943c648
Merge branch 'master' of github.com:scikit-learn/scikit-learn into mo…
NicolasHug Mar 14, 2020
a195d3a
minor simplification
NicolasHug Mar 14, 2020
1c424dc
Apply suggestions from code review
NicolasHug Mar 19, 2020
39e1d88
Merge branch 'master' of github.com:scikit-learn/scikit-learn into mo…
NicolasHug Mar 19, 2020
b5faf91
Merge branch 'monotonic_constraints' of github.com:NicolasHug/scikit-…
NicolasHug Mar 19, 2020
496dca4
Update sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_…
NicolasHug Mar 19, 2020
6fe3aa7
pep8
NicolasHug Mar 19, 2020
7bcc42a
Merge branch 'monotonic_constraints' of github.com:NicolasHug/scikit-…
NicolasHug Mar 19, 2020
aa37a1f
avoid dfs and parse array instead
NicolasHug Mar 19, 2020
b885617
used assert_allclose
NicolasHug Mar 19, 2020
15cbbb4
Added comment about dfs
NicolasHug Mar 19, 2020
77ae16d
Update sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_…
NicolasHug Mar 20, 2020
39ed017
Cap current node value when computing loss
NicolasHug Mar 20, 2020
0b835fb
Avoid some interactions
NicolasHug Mar 21, 2020
f8ba277
Merge branch 'master' of github.com:scikit-learn/scikit-learn into mo…
NicolasHug Mar 22, 2020
9b785fa
Added whatsnew
NicolasHug Mar 22, 2020
57fde2a
Put back scoring default to 'loss' (bad merge probably)
NicolasHug Mar 22, 2020
83dba40
Use fast way when there's no constraints
NicolasHug Mar 22, 2020
abede37
Merge branch 'master' of github.com:scikit-learn/scikit-learn into mo…
NicolasHug Mar 23, 2020
f0135d8
Never compute root's value, we don't need it
NicolasHug Mar 23, 2020
1f0e056
typo
NicolasHug Mar 23, 2020
d953457
Acutally set it in constructor
NicolasHug Mar 23, 2020
c945f8e
Added test for single node trees
NicolasHug Mar 23, 2020
20d4bd6
pep8
NicolasHug Mar 23, 2020
47 changes: 46 additions & 1 deletion doc/modules/ensemble.rst
@@ -897,7 +897,7 @@ based on permutation of the features.
Histogram-Based Gradient Boosting
=================================

Scikit-learn 0.21 introduces two new experimental implementations of
Scikit-learn 0.21 introduced two new experimental implementations of
gradient boosting trees, namely :class:`HistGradientBoostingClassifier`
and :class:`HistGradientBoostingRegressor`, inspired by
`LightGBM <https://github.com/Microsoft/LightGBM>`__ (See [LightGBM]_).
@@ -1050,6 +1050,51 @@ multiplying the gradients (and the hessians) by the sample weights. Note that
the binning stage (specifically the quantiles computation) does not take the
weights into account.

.. _monotonic_cst_gbdt:

Monotonic Constraints
---------------------

Depending on the problem at hand, you may have prior knowledge indicating
that a given feature should in general have a positive (or negative) effect
on the target value. For example, all else being equal, a higher credit
score should increase the probability of getting approved for a loan.
Monotonic constraints allow you to incorporate such prior knowledge into the
model.

A positive monotonic constraint is a constraint of the form:

:math:`x_1 \leq x_1' \implies F(x_1, x_2) \leq F(x_1', x_2)`,
where :math:`F` is the predictor with two features.

Similarly, a negative monotonic constraint is of the form:

:math:`x_1 \leq x_1' \implies F(x_1, x_2) \geq F(x_1', x_2)`.

Note that monotonic constraints only constrain the output "all else being
equal". Indeed, the following relation **is not enforced** by a positive
constraint: :math:`x_1 \leq x_1' \implies F(x_1, x_2) \leq F(x_1', x_2')`.

You can specify a monotonic constraint on each feature using the
`monotonic_cst` parameter. For each feature, a value of 0 indicates no
constraint, while -1 and 1 indicate a negative and positive constraint,
respectively::

>>> from sklearn.experimental import enable_hist_gradient_boosting # noqa
>>> from sklearn.ensemble import HistGradientBoostingRegressor

... # positive, negative, and no constraint on the 3 features
>>> gbdt = HistGradientBoostingRegressor(monotonic_cst=[1, -1, 0])
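
The effect of the constraint can be checked directly: with a positive
constraint on the first feature, the predictions must be non-decreasing in
that feature when all other features are held fixed. A minimal sketch of
such a check (assuming NumPy and a small synthetic dataset; this snippet is
an illustration, not part of the estimator's API)::

import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(1000, 3)
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=1000)
gbdt.fit(X, y)

# sweep the first (positively constrained) feature over a grid while
# keeping the other two features fixed
X_sweep = np.tile(X[:1], (100, 1))
X_sweep[:, 0] = np.linspace(0, 1, 100)
preds = gbdt.predict(X_sweep)
assert np.all(np.diff(preds) >= 0)  # should hold with monotonic_cst=[1, -1, 0]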

In a binary classification context, imposing a monotonic constraint means
that the feature is supposed to have a positive / negative effect on the
probability of belonging to the positive class. Monotonic constraints are
not supported in a multiclass context.
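
For instance, a minimal sketch with a synthetic binary target, followed by
the error raised when the target is multiclass (the data and variable names
below are purely illustrative)::

from sklearn.experimental import enable_hist_gradient_boosting  # noqa
from sklearn.ensemble import HistGradientBoostingClassifier
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(1000, 2)
y_binary = (X[:, 0] - X[:, 1]
            + rng.normal(scale=0.1, size=1000) > 0).astype(int)

clf = HistGradientBoostingClassifier(monotonic_cst=[1, -1])
clf.fit(X, y_binary)

# the probability of the positive class should be non-decreasing in the
# first feature, all else being equal
X_sweep = np.tile(X[:1], (100, 1))
X_sweep[:, 0] = np.linspace(0, 1, 100)
proba = clf.predict_proba(X_sweep)[:, 1]
assert np.all(np.diff(proba) >= 0)

# with a multiclass target, passing monotonic_cst raises a ValueError
# at fit time
y_multi = rng.randint(0, 3, size=1000)
try:
    HistGradientBoostingClassifier(monotonic_cst=[1, -1]).fit(X, y_multi)
except ValueError as exc:
    print(exc)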

.. topic:: Examples:

* :ref:`sphx_glr_auto_examples_ensemble_plot_monotonic_constraints.py`

Low-level parallelism
---------------------

5 changes: 5 additions & 0 deletions doc/whats_new/v0.23.rst
@@ -184,6 +184,11 @@ Changelog
samples in the training set. :pr:`14516` by :user:`Johann Faouzi
<johannfaouzi>`.

- |Feature| :class:`ensemble.HistGradientBoostingClassifier` and
:class:`ensemble.HistGradientBoostingRegressor` now support monotonic
constraints, useful when features are supposed to have a positive/negative
effect on the target. :pr:`15582` by `Nicolas Hug`_.

- |Fix| Fixed a bug in :class:`ensemble.BaggingClassifier`,
:class:`ensemble.BaggingRegressor` and :class:`ensemble.IsolationForest`
where the attribute `estimators_samples_` did not generate the proper indices
70 changes: 70 additions & 0 deletions examples/ensemble/plot_monotonic_constraints.py
@@ -0,0 +1,70 @@
"""
=====================
Monotonic Constraints
=====================

This example illustrates the effect of monotonic constraints on a gradient
boosting estimator.

We build an artificial dataset where the target value is in general
positively correlated with the first feature (with some random and
non-random variations), and in general negatively correlated with the second
feature.

By imposing a positive (increasing) or negative (decreasing) constraint on
the features during the learning process, the estimator is able to properly
follow the general trend instead of being subject to the variations.

This example was inspired by the `XGBoost documentation
<https://xgboost.readthedocs.io/en/latest/tutorials/monotonic.html>`_.
"""
from sklearn.experimental import enable_hist_gradient_boosting # noqa
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.inspection import plot_partial_dependence
import numpy as np
import matplotlib.pyplot as plt


print(__doc__)

rng = np.random.RandomState(0)

n_samples = 5000
f_0 = rng.rand(n_samples) # positive correlation with y
f_1 = rng.rand(n_samples) # negative correlation with y
X = np.c_[f_0, f_1]
noise = rng.normal(loc=0.0, scale=0.01, size=n_samples)
y = (5 * f_0 + np.sin(10 * np.pi * f_0) -
5 * f_1 - np.cos(10 * np.pi * f_1) +
noise)

fig, ax = plt.subplots()


# Without any constraint
gbdt = HistGradientBoostingRegressor()
gbdt.fit(X, y)
disp = plot_partial_dependence(
gbdt, X, features=[0, 1],
line_kw={'linewidth': 4, 'label': 'unconstrained'},
ax=ax)

# With positive and negative constraints
gbdt = HistGradientBoostingRegressor(monotonic_cst=[1, -1])
gbdt.fit(X, y)

plot_partial_dependence(
gbdt, X, features=[0, 1],
feature_names=('First feature\nPositive constraint',
'Second feature\nNegative constraint'),
line_kw={'linewidth': 4, 'label': 'constrained'},
ax=disp.axes_)

for f_idx in (0, 1):
disp.axes_[0, f_idx].plot(X[:, f_idx], y, 'o', alpha=.3, zorder=-1)
disp.axes_[0, f_idx].set_ylim(-6, 6)

plt.legend()
fig.suptitle("Monotonic constraints illustration")

plt.show()
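
As a complement to the plots, the constraint can also be checked numerically
on the fitted constrained model, for example by sweeping one feature over a
grid while holding the other fixed (a sketch, not part of the original
example):

# Sketch: verify monotonicity of the constrained model's predictions.
# `gbdt` is the constrained estimator fitted above with monotonic_cst=[1, -1].
grid = np.linspace(0, 1, 200)

X_sweep = np.column_stack([grid, np.full_like(grid, 0.5)])
assert np.all(np.diff(gbdt.predict(X_sweep)) >= 0)  # non-decreasing in f_0

X_sweep = np.column_stack([np.full_like(grid, 0.5), grid])
assert np.all(np.diff(gbdt.predict(X_sweep)) <= 0)  # non-increasing in f_1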
6 changes: 6 additions & 0 deletions sklearn/ensemble/_hist_gradient_boosting/common.pxd
@@ -30,3 +30,9 @@ cdef packed struct node_struct:
unsigned int depth
unsigned char is_leaf
X_BINNED_DTYPE_C bin_threshold


cpdef enum MonotonicConstraint:
NO_CST = 0
POS = 1
NEG = -1
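
The enum values intentionally mirror the user-facing `monotonic_cst` codes
(0 for no constraint, 1 for positive, -1 for negative). A rough Python-level
illustration of that correspondence (illustrative only, not the actual
Cython grower code):

from enum import IntEnum

class MonotonicConstraint(IntEnum):
    NO_CST = 0   # no constraint on the feature
    POS = 1      # predictions non-decreasing in the feature
    NEG = -1     # predictions non-increasing in the feature

# a user-facing specification such as monotonic_cst=[1, -1, 0]
# maps directly onto these codes, one entry per feature
user_cst = [1, -1, 0]
codes = [MonotonicConstraint(v) for v in user_cst]
assert codes == [MonotonicConstraint.POS,
                 MonotonicConstraint.NEG,
                 MonotonicConstraint.NO_CST]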
43 changes: 32 additions & 11 deletions sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py
@@ -29,8 +29,9 @@ class BaseHistGradientBoosting(BaseEstimator, ABC):
@abstractmethod
def __init__(self, loss, learning_rate, max_iter, max_leaf_nodes,
max_depth, min_samples_leaf, l2_regularization, max_bins,
warm_start, early_stopping, scoring, validation_fraction,
n_iter_no_change, tol, verbose, random_state):
monotonic_cst, warm_start, early_stopping, scoring,
validation_fraction, n_iter_no_change, tol, verbose,
random_state):
self.loss = loss
self.learning_rate = learning_rate
self.max_iter = max_iter
@@ -39,6 +40,7 @@ def __init__(self, loss, learning_rate, max_iter, max_leaf_nodes,
self.min_samples_leaf = min_samples_leaf
self.l2_regularization = l2_regularization
self.max_bins = max_bins
self.monotonic_cst = monotonic_cst
self.warm_start = warm_start
self.early_stopping = early_stopping
self.scoring = scoring
@@ -82,6 +84,12 @@ def _validate_parameters(self):
raise ValueError('max_bins={} should be no smaller than 2 '
'and no larger than 255.'.format(self.max_bins))

if self.monotonic_cst is not None and self.n_trees_per_iteration_ != 1:
raise ValueError(
'monotonic constraints are not supported for '
'multiclass classification.'
)

def fit(self, X, y, sample_weight=None):
"""Fit the gradient boosting model.

@@ -352,12 +360,12 @@ def fit(self, X, y, sample_weight=None):

# Build `n_trees_per_iteration` trees.
for k in range(self.n_trees_per_iteration_):

grower = TreeGrower(
X_binned_train, gradients[k, :], hessians[k, :],
n_bins=n_bins,
n_bins_non_missing=self.bin_mapper_.n_bins_non_missing_,
has_missing_values=has_missing_values,
monotonic_cst=self.monotonic_cst,
max_leaf_nodes=self.max_leaf_nodes,
max_depth=self.max_depth,
min_samples_leaf=self.min_samples_leaf,
@@ -790,6 +798,11 @@ class HistGradientBoostingRegressor(RegressorMixin, BaseHistGradientBoosting):
Features with a small number of unique values may use less than
``max_bins`` bins. In addition to the ``max_bins`` bins, one more bin
is always reserved for missing values. Must be no larger than 255.
monotonic_cst : array-like of int of shape (n_features), default=None
Indicates the monotonic constraint to enforce on each feature. -1, 1
and 0 respectively correspond to a negative constraint, positive
constraint and no constraint. Read more in the :ref:`User Guide
<monotonic_cst_gbdt>`.
warm_start : bool, optional (default=False)
When set to ``True``, reuse the solution of the previous call to fit
and add more estimators to the ensemble. For results to be valid, the
@@ -867,16 +880,18 @@ class HistGradientBoostingRegressor(RegressorMixin, BaseHistGradientBoosting):
def __init__(self, loss='least_squares', learning_rate=0.1,
max_iter=100, max_leaf_nodes=31, max_depth=None,
min_samples_leaf=20, l2_regularization=0., max_bins=255,
warm_start=False, early_stopping='auto', scoring='loss',
validation_fraction=0.1, n_iter_no_change=10, tol=1e-7,
monotonic_cst=None, warm_start=False, early_stopping='auto',
scoring='loss', validation_fraction=0.1,
n_iter_no_change=10, tol=1e-7,
verbose=0, random_state=None):
super(HistGradientBoostingRegressor, self).__init__(
loss=loss, learning_rate=learning_rate, max_iter=max_iter,
max_leaf_nodes=max_leaf_nodes, max_depth=max_depth,
min_samples_leaf=min_samples_leaf,
l2_regularization=l2_regularization, max_bins=max_bins,
warm_start=warm_start, early_stopping=early_stopping,
scoring=scoring, validation_fraction=validation_fraction,
monotonic_cst=monotonic_cst, early_stopping=early_stopping,
warm_start=warm_start, scoring=scoring,
validation_fraction=validation_fraction,
n_iter_no_change=n_iter_no_change, tol=tol, verbose=verbose,
random_state=random_state)

@@ -978,6 +993,11 @@ class HistGradientBoostingClassifier(BaseHistGradientBoosting,
Features with a small number of unique values may use less than
``max_bins`` bins. In addition to the ``max_bins`` bins, one more bin
is always reserved for missing values. Must be no larger than 255.
monotonic_cst : array-like of int of shape (n_features), default=None
Indicates the monotonic constraint to enforce on each feature. -1, 1
and 0 respectively correspond to a negative constraint, positive
constraint and no constraint. Read more in the :ref:`User Guide
<monotonic_cst_gbdt>`.
warm_start : bool, optional (default=False)
When set to ``True``, reuse the solution of the previous call to fit
and add more estimators to the ensemble. For results to be valid, the
@@ -1058,17 +1078,18 @@ class HistGradientBoostingClassifier(BaseHistGradientBoosting,

def __init__(self, loss='auto', learning_rate=0.1, max_iter=100,
max_leaf_nodes=31, max_depth=None, min_samples_leaf=20,
l2_regularization=0., max_bins=255, warm_start=False,
early_stopping='auto', scoring='loss',
l2_regularization=0., max_bins=255, monotonic_cst=None,
warm_start=False, early_stopping='auto', scoring='loss',
validation_fraction=0.1, n_iter_no_change=10, tol=1e-7,
verbose=0, random_state=None):
super(HistGradientBoostingClassifier, self).__init__(
loss=loss, learning_rate=learning_rate, max_iter=max_iter,
max_leaf_nodes=max_leaf_nodes, max_depth=max_depth,
min_samples_leaf=min_samples_leaf,
l2_regularization=l2_regularization, max_bins=max_bins,
warm_start=warm_start, early_stopping=early_stopping,
scoring=scoring, validation_fraction=validation_fraction,
monotonic_cst=monotonic_cst, warm_start=warm_start,
early_stopping=early_stopping, scoring=scoring,
validation_fraction=validation_fraction,
n_iter_no_change=n_iter_no_change, tol=tol, verbose=verbose,
random_state=random_state)
