-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG+1] Read-only data compatibility for Lasso #4775
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG+1] Read-only data compatibility for Lasso #4775
Conversation
transform_alpha=0.001, random_state=0, n_jobs=-1) | ||
code = dico.fit(X).transform(X) | ||
assert_array_almost_equal(np.dot(code, dico.components_), X, decimal=2) | ||
X.flags.writeable = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This last line is useless as X
will be garbage collected. Please remove it.
Sorry I had not realised that you were sharing the same X as in other tests. I would be cleaner to do:
X_readonly = X.copy()
X_readonly.flags.writeable = False
and use that instead of X
in the fit and transform calls.
I edited the title of this PR as a |
n_components = 12 | ||
X.flags.writeable = False | ||
dico = DictionaryLearning(n_components, transform_algorithm='lasso_cd', | ||
transform_alpha=0.001, random_state=0, n_jobs=-1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we use n_jobs=-1
in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this so that the test only assert that dict_learning works on memory mapped ro arrays
|
||
if __name__ == '__main__': | ||
import nose | ||
nose.runmodule() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not add such boilerplate. Use nosetests sklearn/decomposition/tests/test_dict_learning.py
to run the tests of a specific test module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, these lines still exist in some tests file though
Also can you please squash the commits of this PR? Such intermediate commits have not historical values per se. |
If we need help with squashing please feel free to ask. |
73c98dd
to
6094264
Compare
LGTM. I think we should write a new test in |
@@ -59,6 +61,13 @@ def test_dict_learning_reconstruction_parallel(): | |||
code = dico.transform(X) | |||
assert_array_almost_equal(np.dot(code, dico.components_), X, decimal=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8: you need 2 empty lines between top-level functions
Does the change to the cython code have any impact to run time? |
@arthurmensch could you please rebase this on top of master? it's no longer mergeable according to github. I think the Instead to test the fix, please add a unittest that calls into the private cython API directly with |
976e28e
to
bc33901
Compare
Please remove the last commit, I think this should be addressed in a separate PR. In the mean time write a unittest that uses the private API directly as I did here: #4684 |
Can you please squash those commits together? |
Copy from joblib.pool (for independance)""" | ||
try: | ||
if os.path.exists(folder_path): | ||
shutil.rmtree(folder_path) # This can fail under windows, but will succeed when called by atexit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pep8
6869512
to
caec867
Compare
Done |
Please squash commits that have trivial commit messages such as "Fix". |
9683cef
to
9a55eeb
Compare
@@ -74,7 +73,7 @@ | |||
|
|||
print('Learning the dictionary...') | |||
t0 = time() | |||
dico = MiniBatchDictionaryLearning(n_components=100, alpha=1, n_iter=500) | |||
dico = MiniBatchDictionaryLearning(n_components=100, alpha=1, n_iter=500, batch_size=100, n_jobs=4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will break on windows, right? Let's not do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I left it, I used it for tests. I don't understand why it would break on Windows though, could you explain ?
LGTM apart from minor comments. |
8dc0aa7
to
e3435a1
Compare
e3435a1
to
fead69a
Compare
as far as I know, using multiprocessing requires you to wrap your call into a |
Travis test failure from cec3bf9. Merging. |
…infix [MRG+1] Read-only data compatibility for Lasso
Should fix issue #4772. Benchmarking the changes (calling cd_fast within a lasso regression) suggest we do not reduce performance compared to current master
I added a regression test that fails on current master (setting a read only flag on design matrix)