Skip to content

[MRG+2] DOC adding separate fit() methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor #7824

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 9, 2016

Conversation

dalmia
Copy link
Contributor

@dalmia dalmia commented Nov 4, 2016

Reference Issue

Fixes #7809

What does this implement/fix? Explain your changes.

Added a separate fit function for DecisionTreeRegressor and DecisionTreeClassifier that calls super to have separate docstrings for both.

sample_weight=sample_weight,
check_input=check_input,
X_idx_sorted=X_idx_sorted)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are failing because self is not returned here and also for DecisionTreeRegressor. I suppose you missed to return self in here. Also since the tests are failing, I think it might be good to run flake8 / pep8 checker so that code is pep compliant.

@dalmia
Copy link
Contributor Author

dalmia commented Nov 4, 2016

@maniteja123 Thank you for pointing that out. My bad. Also, I ran the pep8 test, but it was showing a few errors mostly on lines that I had not changed, like 'E501 line too long (83 > 79 characters)'. So, for the time being I felt it's better left untouched. However, let me know if you intend to have a discussion on that.

@maniteja123
Copy link
Contributor

Hi @dalmia , you are right that the pep8 errors not related to your changes are better left untouched. But looking at the logs here, I thought it are related to the indentation in your changes.

Copy link
Member

@raghavrv raghavrv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of having a separate fit method is to avoid generic docstrings and have specific ones for regression / classification. Both the docstrings seem identical. Kindly make it more specific...

@@ -731,6 +731,53 @@ def __init__(self,
min_impurity_split=min_impurity_split,
presort=presort)

def fit(self, X, y, sample_weight=None, check_input=True,
X_idx_sorted=None):
"""Build a decision tree classifier from the training set (X, y).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect indentation?

They should be aligned like

def fit(self, X, y, ...
        X_idx_sorted):
    """Build ...

    Parameters...
    """

"""

super(DecisionTreeRegressor,self).fit(
X=X,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

X, y, ...

@raghavrv raghavrv changed the title [MRG] DOC adding separate fit() functions for DecisionTreeClassifier and De… [WIP] DOC adding separate fit() methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor Nov 4, 2016
@dalmia dalmia closed this Nov 4, 2016
@dalmia dalmia reopened this Nov 4, 2016
@dalmia dalmia changed the title [WIP] DOC adding separate fit() methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor [MRG] DOC adding separate fit() methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor Nov 4, 2016
@dalmia
Copy link
Contributor Author

dalmia commented Nov 4, 2016

@raghavrv I modified the generic docstrings for 'y' and 'sample_weight' to a more specific one. Please let me know if I there is any more issue.

@dalmia
Copy link
Contributor Author

dalmia commented Nov 4, 2016

@maniteja123 Actually, in my code editor, I used tabs and that messed up everything.

"""

super(DecisionTreeClassifier,self).fit(
X, y, sample_weight, check_input, X_idx_sorted)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if you misconstrued... I meant X and y alone without the keywords and the rest with keywords

X, y,
sampl...=sampl...

@dalmia
Copy link
Contributor Author

dalmia commented Nov 4, 2016

@raghavrv No problem!

@dalmia
Copy link
Contributor Author

dalmia commented Nov 4, 2016

@raghavrv Please let me know if there is any more issues.

@raghavrv raghavrv changed the title [MRG] DOC adding separate fit() methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor [MRG + 1] DOC adding separate fit() methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor Nov 4, 2016
@raghavrv
Copy link
Member

raghavrv commented Nov 4, 2016

@amueller 2nd look?

@raghavrv raghavrv added this to the 0.18.1 milestone Nov 6, 2016
@raghavrv
Copy link
Member

raghavrv commented Nov 7, 2016

@jnothman Would you have time for a quick review of this + Merge

@dalmia
Copy link
Contributor Author

dalmia commented Nov 7, 2016

Hi @raghavrv, would you also be available for a final review here :
#7818

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be removing the docstring from BaseDecisionTree.fit? Otherwise LGTM.

to a sparse ``csc_matrix``.

y : array-like, shape = [n_samples] or [n_samples, n_outputs]
The target values (class labels).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might as well specify "as integers or strings"

@dalmia
Copy link
Contributor Author

dalmia commented Nov 8, 2016

I feel we shouldn't remove the docstring from BaseDecisionTree.fit. As a user using it for the 1st time, it would be helpful to understand how it generalizes to both regressor and classifier. Please let me know what you think.

@nelson-liu
Copy link
Contributor

Indeed, but BaseDecisionTree doesn't show up in the docs, either. I'd personally be in favor of removing it if it is going to be split into these two separate fits.

@dalmia
Copy link
Contributor Author

dalmia commented Nov 8, 2016

Yes that's true. I'll make the change then.

@jnothman
Copy link
Member

jnothman commented Nov 8, 2016

I'd like someone else to give their opinion on removing docs from the base class.

@jnothman jnothman changed the title [MRG + 1] DOC adding separate fit() methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor [MRG+2] DOC adding separate fit() methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor Nov 8, 2016
@amueller
Copy link
Member

amueller commented Nov 9, 2016

LGTM

@amueller amueller merged commit f32d257 into scikit-learn:master Nov 9, 2016
amueller pushed a commit to amueller/scikit-learn that referenced this pull request Nov 9, 2016
…sionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)

* DOC adding separate fit() functions

* DOC adding keywords to arguments of super()

* DOC removing trailing whitespaces

* DOC specifying the type of class labels

* DOC removing docstring from BaseDecisionTree.fit
@dalmia dalmia deleted the 7809 branch November 16, 2016 13:03
sergeyf pushed a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017
…sionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)

* DOC adding separate fit() functions

* DOC adding keywords to arguments of super()

* DOC removing trailing whitespaces

* DOC specifying the type of class labels

* DOC removing docstring from BaseDecisionTree.fit
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
…sionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)

* DOC adding separate fit() functions

* DOC adding keywords to arguments of super()

* DOC removing trailing whitespaces

* DOC specifying the type of class labels

* DOC removing docstring from BaseDecisionTree.fit
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 27, 2017
* tag '0.18.1': (144 commits)
  skip tree-test on 32bit
  do the warning test as we do it in other places.
  Replase assert_equal by assert_almost_equal in cosine test
  version bump 0.18.1
  fix merge conflict mess in whatsnew
  add the python2.6 warning to 0.18.1
  fix learning_curve test that I messed up in cherry-picking the "reentrant cv" PR.
  sync whatsnew with master
  [MRG] TST Ensure __dict__ is unmodified by predict, transform, etc (scikit-learn#7553)
  FIX scikit-learn#6420: Cloning decision tree estimators breaks criterion objects (scikit-learn#7680)
  Add whats new entry for scikit-learn#6282 (scikit-learn#7629)
  [MGR + 2] fix selectFdr bug (scikit-learn#7490)
  fixed whatsnew cherry-pick mess (somewhat)
  [MRG + 2] FIX LogisticRegressionCV to correctly handle string labels (scikit-learn#5874)
  [MRG + 2] Fixed parameter setting in SelectFromModel (scikit-learn#7764)
  [MRG+2] DOC adding separate `fit()` methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)
  Fix docstring typo (scikit-learn#7844) n_features --> n_components
  [MRG + 1] DOC adding :user: role to whats_new (scikit-learn#7818)
  [MRG+1] label binarizer not used consistently in CalibratedClassifierCV (scikit-learn#7799)
  DOC : fix docstring of AIC/BIC in GMM
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 27, 2017
* releases: (144 commits)
  skip tree-test on 32bit
  do the warning test as we do it in other places.
  Replase assert_equal by assert_almost_equal in cosine test
  version bump 0.18.1
  fix merge conflict mess in whatsnew
  add the python2.6 warning to 0.18.1
  fix learning_curve test that I messed up in cherry-picking the "reentrant cv" PR.
  sync whatsnew with master
  [MRG] TST Ensure __dict__ is unmodified by predict, transform, etc (scikit-learn#7553)
  FIX scikit-learn#6420: Cloning decision tree estimators breaks criterion objects (scikit-learn#7680)
  Add whats new entry for scikit-learn#6282 (scikit-learn#7629)
  [MGR + 2] fix selectFdr bug (scikit-learn#7490)
  fixed whatsnew cherry-pick mess (somewhat)
  [MRG + 2] FIX LogisticRegressionCV to correctly handle string labels (scikit-learn#5874)
  [MRG + 2] Fixed parameter setting in SelectFromModel (scikit-learn#7764)
  [MRG+2] DOC adding separate `fit()` methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)
  Fix docstring typo (scikit-learn#7844) n_features --> n_components
  [MRG + 1] DOC adding :user: role to whats_new (scikit-learn#7818)
  [MRG+1] label binarizer not used consistently in CalibratedClassifierCV (scikit-learn#7799)
  DOC : fix docstring of AIC/BIC in GMM
  ...

 Conflicts:  removed
	sklearn/externals/joblib/__init__.py
	sklearn/externals/joblib/_parallel_backends.py
	sklearn/externals/joblib/testing.py
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 27, 2017
* dfsg: (144 commits)
  skip tree-test on 32bit
  do the warning test as we do it in other places.
  Replase assert_equal by assert_almost_equal in cosine test
  version bump 0.18.1
  fix merge conflict mess in whatsnew
  add the python2.6 warning to 0.18.1
  fix learning_curve test that I messed up in cherry-picking the "reentrant cv" PR.
  sync whatsnew with master
  [MRG] TST Ensure __dict__ is unmodified by predict, transform, etc (scikit-learn#7553)
  FIX scikit-learn#6420: Cloning decision tree estimators breaks criterion objects (scikit-learn#7680)
  Add whats new entry for scikit-learn#6282 (scikit-learn#7629)
  [MGR + 2] fix selectFdr bug (scikit-learn#7490)
  fixed whatsnew cherry-pick mess (somewhat)
  [MRG + 2] FIX LogisticRegressionCV to correctly handle string labels (scikit-learn#5874)
  [MRG + 2] Fixed parameter setting in SelectFromModel (scikit-learn#7764)
  [MRG+2] DOC adding separate `fit()` methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)
  Fix docstring typo (scikit-learn#7844) n_features --> n_components
  [MRG + 1] DOC adding :user: role to whats_new (scikit-learn#7818)
  [MRG+1] label binarizer not used consistently in CalibratedClassifierCV (scikit-learn#7799)
  DOC : fix docstring of AIC/BIC in GMM
  ...
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
…sionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)

* DOC adding separate fit() functions

* DOC adding keywords to arguments of super()

* DOC removing trailing whitespaces

* DOC specifying the type of class labels

* DOC removing docstring from BaseDecisionTree.fit
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
…sionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)

* DOC adding separate fit() functions

* DOC adding keywords to arguments of super()

* DOC removing trailing whitespaces

* DOC specifying the type of class labels

* DOC removing docstring from BaseDecisionTree.fit
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
…sionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)

* DOC adding separate fit() functions

* DOC adding keywords to arguments of super()

* DOC removing trailing whitespaces

* DOC specifying the type of class labels

* DOC removing docstring from BaseDecisionTree.fit
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Explicit docstrings for DecisionTreeClassifier.fit and DecisionTreeRegressor.fit
6 participants