Skip to content

documentation of attributes of HistGradientBoostingClassifier except … #16283

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

judithabk6
Copy link
Contributor

contributes to #14312
adds documentation for attributes of HistGradientBoostingClassifier, except classes_ (see #16277)

@NicolasHug attribute bin_mapper_ is of private class _BinMapper but is exposed, is that ok?

@NicolasHug
Copy link
Member

Thanks for the PR @judithabk6

However my impression is that all of these attributes should instead be private rather than public. WDYT @ogrisel?

@@ -923,13 +923,24 @@ class HistGradientBoostingClassifier(BaseHistGradientBoosting,

Attributes
----------
bin_mapper_ : BinMapper object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this could be private.

@@ -923,13 +923,24 @@ class HistGradientBoostingClassifier(BaseHistGradientBoosting,

Attributes
----------
bin_mapper_ : BinMapper object
Transformer that maps a dataset into integer-valued bins.
do_early_stopping_ : bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be public since it is derived from an input parameter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very useful, also obsolete once #14516 is merged

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed this is obsolete now :)

Transformer that maps a dataset into integer-valued bins.
do_early_stopping_ : bool
Specifies if early stopping is used during the training.
loss_ : callable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be public since it is derived from an input parameter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree, the losses are all private objects

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The effective loss name could be stored as a public attribute.

Specifies if early stopping is used during the training.
loss_ : callable
Loss function used by the algorithm.
n_features : int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be private

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Private for now and replaced by n_features_in_ later.

n_iter_ : int
The number of estimators as selected by early stopping (if
n_iter_no_change is not None). Otherwise it corresponds to max_iter.
n_trees_per_iteration_ : int
The number of tree that are built at each iteration. This is equal to 1
for binary classification, and to ``n_classes`` for multiclass
classification.
scorer_ : callable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be public

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a very useful public attribute. What are the benefit of exposing it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, we could also keep it private. The only other classes that have a public scorer_ attribute are the SearchCV classes and for those it's actually a list of scorer objects to deal with multi-metrics things.

Let's keep it private.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed we can probably close this PR and instead make those attributes private by moving the underscore at the start in the name of the attributes.

Transformer that maps a dataset into integer-valued bins.
do_early_stopping_ : bool
Specifies if early stopping is used during the training.
loss_ : callable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The effective loss name could be stored as a public attribute.

Specifies if early stopping is used during the training.
loss_ : callable
Loss function used by the algorithm.
n_features : int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Private for now and replaced by n_features_in_ later.

n_iter_ : int
The number of estimators as selected by early stopping (if
n_iter_no_change is not None). Otherwise it corresponds to max_iter.
n_trees_per_iteration_ : int
The number of tree that are built at each iteration. This is equal to 1
for binary classification, and to ``n_classes`` for multiclass
classification.
scorer_ : callable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, we could also keep it private. The only other classes that have a public scorer_ attribute are the SearchCV classes and for those it's actually a list of scorer objects to deal with multi-metrics things.

Let's keep it private.

@cmarmo cmarmo added the Needs Decision Requires decision label Feb 16, 2020
@cmarmo
Copy link
Contributor

cmarmo commented Mar 28, 2020

@ogrisel, @glemaitre, @NicolasHug , is a consensus there about making all the attributes in this PR private? If so, @judithabk6 are you ok in proposing a new PR in that direction? Feel free to ask for clarifications if you need some. Thanks for your patience!

@NicolasHug
Copy link
Member

Yes let's make everything private except for do_early_stopping_ since it's useful now that we have early_stopping='auto' by default

@cmarmo cmarmo removed the Needs Decision Requires decision label Mar 28, 2020
@cmarmo cmarmo added Easy Well-defined and straightforward way to resolve good first issue Easy with clear instructions to resolve help wanted Stalled labels Jun 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Easy Well-defined and straightforward way to resolve good first issue Easy with clear instructions to resolve module:ensemble Sprint Stalled
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants