Skip to content

[MRG+1] remove n_nonzero_coefs from attr of LassoLarsCV + clean up call hierarchy #9004

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 10, 2017

Conversation

agramfort
Copy link
Member

Reference Issue

Fixes #8475

What does this implement/fix? Explain your changes.

introduce a private _fit method in Lars to fit with explicit params without
having to hack the instance attributes.

It also fixes a call to Lars.fit that was ignoring the max_iter parameter.

Any other comments?

Nope

@agramfort agramfort changed the title remove n_nonzero_coefs from attr of LassoLarsCV + clean up call hierarchy [WIP] remove n_nonzero_coefs from attr of LassoLarsCV + clean up call hierarchy Jun 6, 2017
@@ -370,11 +370,11 @@ def lars_path(X, y, Xy=None, Gram=None, max_iter=500,
corr_eq_dir = np.dot(Gram[:n_active, n_active:].T,
least_squares)

g1 = arrayfuncs.min_pos((C - Cov) / (AA - corr_eq_dir + tiny))
g1 = arrayfuncs.min_pos((C - Cov) / (AA - corr_eq_dir + tiny32))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tiny was not enough for #8475

and I think that 1.1754944e-38 is small enough (rather than 2.2250738585072014e-308)

alpha = 0. # n_nonzero_coefs parametrization takes priority
max_iter = self.n_nonzero_coefs
else:
max_iter = self.max_iter
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has been moved to Lars.fit to avoid the magic of hacking instance attributes

@@ -1145,10 +1154,14 @@ def fit(self, X, y):
# Now compute the full model
# it will call a lasso internally when self if LassoLarsCV
# as self.method == 'lasso'
Lars.fit(self, X, y)
Lars._fit(self, X, y, max_iter=self.max_iter, alpha=best_alpha,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now self.max_iter is not ignored

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it the case that previously the call to Lars.fit reverted to a call to lasso_lars with alpha_min=0, while now alpha_min=best_alpha?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great! so we save a bit of computation that we didn't need to do previously.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe just say self._fit now?

@property
@deprecated("Attribute alpha is deprecated in 0.18 and "
"will be removed in 0.20. See 'alpha_' instead")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a hack to be able to call Lars.fit but the CV classes should have an alpha_ but no alpha param

@@ -402,6 +403,7 @@ def test_lars_cv():
lars_cv.fit(X, y)
np.testing.assert_array_less(old_alpha, lars_cv.alpha_)
old_alpha = lars_cv.alpha_
assert_false(hasattr(lars_cv, 'n_nonzero_coefs'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a test for the ignored max_iter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -195,7 +195,6 @@ def lars_path(X, y, Xy=None, Gram=None, max_iter=500,
sys.stdout.write('.')
sys.stdout.flush()

tiny = np.finfo(np.float).tiny # to avoid division by 0 warning
tiny32 = np.finfo(np.float32).tiny # to avoid division by 0 warning
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the record tiny32 was introduced here 294d4b6

@codecov
Copy link

codecov bot commented Jun 6, 2017

Codecov Report

Merging #9004 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9004      +/-   ##
==========================================
+ Coverage   95.91%   95.92%   +<.01%     
==========================================
  Files         331      331              
  Lines       59851    59960     +109     
==========================================
+ Hits        57409    57516     +107     
- Misses       2442     2444       +2
Impacted Files Coverage Δ
sklearn/linear_model/least_angle.py 96.05% <100%> (+0.15%) ⬆️
sklearn/linear_model/tests/test_least_angle.py 100% <100%> (ø) ⬆️
sklearn/feature_selection/from_model.py 91.66% <0%> (-2.37%) ⬇️
sklearn/decomposition/dict_learning.py 93.23% <0%> (-0.23%) ⬇️
sklearn/decomposition/truncated_svd.py 97.91% <0%> (-0.09%) ⬇️
sklearn/feature_extraction/text.py 96.02% <0%> (-0.03%) ⬇️
sklearn/decomposition/tests/test_dict_learning.py 100% <0%> (ø) ⬆️
sklearn/feature_selection/rfe.py 97.45% <0%> (ø) ⬆️
sklearn/feature_selection/tests/test_from_model.py 100% <0%> (ø) ⬆️
sklearn/tests/test_multioutput.py 100% <0%> (ø) ⬆️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f4b230c...ea66a5b. Read the comment docs.

@agramfort agramfort force-pushed the fix_lasso_lars_cv_attr branch from ea66a5b to fde74da Compare June 7, 2017 08:15
@agramfort agramfort changed the title [WIP] remove n_nonzero_coefs from attr of LassoLarsCV + clean up call hierarchy [MRG] remove n_nonzero_coefs from attr of LassoLarsCV + clean up call hierarchy Jun 7, 2017
@agramfort
Copy link
Member Author

good to go on my end

appveyor error is an appveyor outage so is unrelated

"""
X, y = check_X_y(X, y, y_numeric=True, multi_output=True)
def _fit(self, X, y, max_iter, alpha, fit_path, Xy=None):
"""Aux method to fit the model using X, y as training data"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's private, but I think it should still say "Auxiliary" :P

@vene
Copy link
Member

vene commented Jun 8, 2017

Other than the two extremely minor comments this LGTM. The regression test indeed fails on master and passes here. Clear improvement in terms of the code.

@vene vene changed the title [MRG] remove n_nonzero_coefs from attr of LassoLarsCV + clean up call hierarchy [MRG+1] remove n_nonzero_coefs from attr of LassoLarsCV + clean up call hierarchy Jun 8, 2017
@agramfort agramfort force-pushed the fix_lasso_lars_cv_attr branch from 002c95a to 70338a0 Compare June 8, 2017 08:40
self.copy_X = copy_X
self.positive = positive
# XXX : we don't use super(LarsCV, self).__init__
# to avoid setting n_nonzero_coefs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it tells us that we have the wrong inheritance diagram. But... I don't bother.

@GaelVaroquaux
Copy link
Member

LGTM. Merging

@GaelVaroquaux GaelVaroquaux merged commit 32b88d8 into scikit-learn:master Jun 10, 2017
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
…ll hierarchy (scikit-learn#9004)

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit

* cleanup

* fix deprecation warning + clarify warning

* add test

* pep8

* adddress comments
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
…ll hierarchy (scikit-learn#9004)

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit

* cleanup

* fix deprecation warning + clarify warning

* add test

* pep8

* adddress comments
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
…ll hierarchy (scikit-learn#9004)

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit

* cleanup

* fix deprecation warning + clarify warning

* add test

* pep8

* adddress comments
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
…ll hierarchy (scikit-learn#9004)

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit

* cleanup

* fix deprecation warning + clarify warning

* add test

* pep8

* adddress comments
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
…ll hierarchy (scikit-learn#9004)

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit

* cleanup

* fix deprecation warning + clarify warning

* add test

* pep8

* adddress comments
AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
…ll hierarchy (scikit-learn#9004)

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit

* cleanup

* fix deprecation warning + clarify warning

* add test

* pep8

* adddress comments
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
…ll hierarchy (scikit-learn#9004)

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit

* cleanup

* fix deprecation warning + clarify warning

* add test

* pep8

* adddress comments
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
…ll hierarchy (scikit-learn#9004)

* FIX : remove n_nonzero_coefs from attr of LassoLarsCV + clean up call to Lars._fit

* cleanup

* fix deprecation warning + clarify warning

* add test

* pep8

* adddress comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LassoLarsCV chatty but not that helpful.
4 participants