Skip to content

[MRG+1] FIX out of bounds array access in SAGA #9376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 17, 2017

Conversation

jnothman
Copy link
Member

@jnothman jnothman commented Jul 16, 2017

Fixes #9351 (random inconsistencies between SAGA and liblinear LogisticRegression on OS X).

To hedge my bets that this is the right fix, I'll ping @TomDLT, @arthurmensch, @dsullivan7.

@agramfort
Copy link
Member

did you spot this by setting boundscheck=True?

@jnothman
Copy link
Member Author

I did not. I found it by narrowing in using old-school diffs over printed debug output.

@jnothman
Copy link
Member Author

Is there a way to disable the nogil feature for a compilation so that exceptions will be propagated?

@jnothman
Copy link
Member Author

I commented on the boundscheck at #9351 (comment)

@agramfort
Copy link
Member

I confirm that I don't get the error in the branch after running 10 times the test. It fails on master on my mac.

@agramfort agramfort changed the title [MRG] FIX out of bounds array access in SAGA [MRG+1] FIX out of bounds array access in SAGA Jul 16, 2017
@jnothman
Copy link
Member Author

I've tried it well over 10 times, if that adds any weight. :)

And even disabling the nogil, I'm not getting an exception raised with boundscheck=True :\

@ogrisel
Copy link
Member

ogrisel commented Jul 16, 2017

To get the exceptions to propagate:

  • change the prototype of:
cdef void lagged_update(...) nogil:

to:

cdef int lagged_update(...) nogil except -1:

and update the implementation to return 0.

  • wrap the code that raises the exception with "with gil":
with gil:
    raise ValueError(the_msg)

@ogrisel
Copy link
Member

ogrisel commented Jul 16, 2017

I also confirm that fixes the issue on my mac. The diff looks good, however I am not familiar enough with the algorithm to tell whether the fix is correct. @arthurmensch can you confirm?

@jnothman
Copy link
Member Author

Well, I'm sure that the successful runs are those when cumulative_sums[-1] happened to be 0. Which is why I would get behaviours like an entire run of 1000 trials would succeed, then a run of 1000 trials would fail 1/3 times (due to whatever memory the process was allocated and, I presume, garbage collection, because in this case I found it was precisely every third run that failed).

@jnothman
Copy link
Member Author

jnothman commented Jul 16, 2017 via email

@arthurmensch
Copy link
Contributor

arthurmensch commented Jul 17, 2017 via email

@jnothman
Copy link
Member Author

jnothman commented Jul 17, 2017 via email

- cumulative_sums[lagged_ind - 1])
prox_step = (cumulative_sums_prox[lagged_ind]
- cumulative_sums_prox[lagged_ind - 1])
if lagged_ind > 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should read >= 1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Whoops. I think we were all a bit too enthusiastic to fix this.

- cumulative_sums[lagged_ind - 1])
prox_step = (cumulative_sums_prox[lagged_ind]
- cumulative_sums_prox[lagged_ind - 1])
if lagged_ind > 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise there was indeed a problem there, I wonder how it went silently although wraparound = False, any idea ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right: those are pointers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wraparound, etc, flags do not apply at all to pointers.

But the basic issue is that O/Ses allocate memory on the heap differently. Most of it tends to be zero (if only because we use 64 bit int arrays to store small numbers). So this just worked when the array happened to be allocated next to a zero ... which always happened on Travis and AppVeyor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We choose pointers because apparently they're a little more efficient than memoryviews, and much better where we want to handle read-only memmaps. But they're much less safe!

@jnothman
Copy link
Member Author

Fixed. Good for you?

@arthurmensch
Copy link
Contributor

LGTM. I hope there is no other errors like this... I wondered why we used pointer there in the first place, but it's true that it allows to handle read-only memmap. Thanks !

@ogrisel
Copy link
Member

ogrisel commented Jul 17, 2017

Thanks very much for the fix, merging.

@ogrisel ogrisel merged commit 055d17b into scikit-learn:master Jul 17, 2017
@ogrisel
Copy link
Member

ogrisel commented Jul 17, 2017

I will do the backport to 0.19.X.

ogrisel pushed a commit that referenced this pull request Jul 17, 2017
* FIX out of bounds array access in SAGA

* FIX Fix boundary for condition
@ogrisel
Copy link
Member

ogrisel commented Jul 17, 2017

Done.

yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 27, 2017
Release 0.19b2

* tag '0.19b2': (808 commits)
  Preparing 0.19b2
  [MRG+1] FIX out of bounds array access in SAGA (scikit-learn#9376)
  FIX make test_importances pass on 32 bit linux
  Release 0.19b1
  DOC remove 'in dev' header in whats_new.rst
  DOC typos in whats_news.rst [ci skip]
  [MRG] DOC cleaning up what's new for 0.19 (scikit-learn#9252)
  FIX t-SNE memory usage and many other optimizer issues (scikit-learn#9032)
  FIX broken link in gallery and bad title rendering
  [MRG] DOC Replace \acute by prime (scikit-learn#9332)
  Fix typos (scikit-learn#9320)
  [MRG + 1 (rv) + 1 (alex) + 1] Add a check to test the docstring params and their order (scikit-learn#9206)
  DOC Residual sum vs. regression sum (scikit-learn#9314)
  [MRG] [HOTFIX] Fix capitalization in test and hence fix failing travis at master (scikit-learn#9317)
  More informative error message for classification metrics given regression output (scikit-learn#9275)
  [MRG] COSMIT Remove unused parameters in private functions (scikit-learn#9310)
  [MRG+1] Ridgecv normalize (scikit-learn#9302)
  [MRG + 2] ENH Allow `cross_val_score`, `GridSearchCV` et al. to evaluate on multiple metrics (scikit-learn#7388)
  Add data_home parameter to fetch_kddcup99 (scikit-learn#9289)
  FIX makedirs(..., exists_ok) not available in Python 2 (scikit-learn#9284)
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 27, 2017
* releases: (808 commits)
  Preparing 0.19b2
  [MRG+1] FIX out of bounds array access in SAGA (scikit-learn#9376)
  FIX make test_importances pass on 32 bit linux
  Release 0.19b1
  DOC remove 'in dev' header in whats_new.rst
  DOC typos in whats_news.rst [ci skip]
  [MRG] DOC cleaning up what's new for 0.19 (scikit-learn#9252)
  FIX t-SNE memory usage and many other optimizer issues (scikit-learn#9032)
  FIX broken link in gallery and bad title rendering
  [MRG] DOC Replace \acute by prime (scikit-learn#9332)
  Fix typos (scikit-learn#9320)
  [MRG + 1 (rv) + 1 (alex) + 1] Add a check to test the docstring params and their order (scikit-learn#9206)
  DOC Residual sum vs. regression sum (scikit-learn#9314)
  [MRG] [HOTFIX] Fix capitalization in test and hence fix failing travis at master (scikit-learn#9317)
  More informative error message for classification metrics given regression output (scikit-learn#9275)
  [MRG] COSMIT Remove unused parameters in private functions (scikit-learn#9310)
  [MRG+1] Ridgecv normalize (scikit-learn#9302)
  [MRG + 2] ENH Allow `cross_val_score`, `GridSearchCV` et al. to evaluate on multiple metrics (scikit-learn#7388)
  Add data_home parameter to fetch_kddcup99 (scikit-learn#9289)
  FIX makedirs(..., exists_ok) not available in Python 2 (scikit-learn#9284)
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 27, 2017
* dfsg: (808 commits)
  Preparing 0.19b2
  [MRG+1] FIX out of bounds array access in SAGA (scikit-learn#9376)
  FIX make test_importances pass on 32 bit linux
  Release 0.19b1
  DOC remove 'in dev' header in whats_new.rst
  DOC typos in whats_news.rst [ci skip]
  [MRG] DOC cleaning up what's new for 0.19 (scikit-learn#9252)
  FIX t-SNE memory usage and many other optimizer issues (scikit-learn#9032)
  FIX broken link in gallery and bad title rendering
  [MRG] DOC Replace \acute by prime (scikit-learn#9332)
  Fix typos (scikit-learn#9320)
  [MRG + 1 (rv) + 1 (alex) + 1] Add a check to test the docstring params and their order (scikit-learn#9206)
  DOC Residual sum vs. regression sum (scikit-learn#9314)
  [MRG] [HOTFIX] Fix capitalization in test and hence fix failing travis at master (scikit-learn#9317)
  More informative error message for classification metrics given regression output (scikit-learn#9275)
  [MRG] COSMIT Remove unused parameters in private functions (scikit-learn#9310)
  [MRG+1] Ridgecv normalize (scikit-learn#9302)
  [MRG + 2] ENH Allow `cross_val_score`, `GridSearchCV` et al. to evaluate on multiple metrics (scikit-learn#7388)
  Add data_home parameter to fetch_kddcup99 (scikit-learn#9289)
  FIX makedirs(..., exists_ok) not available in Python 2 (scikit-learn#9284)
  ...
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
* FIX out of bounds array access in SAGA

* FIX Fix boundary for condition
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
* FIX out of bounds array access in SAGA

* FIX Fix boundary for condition
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
* FIX out of bounds array access in SAGA

* FIX Fix boundary for condition
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
* FIX out of bounds array access in SAGA

* FIX Fix boundary for condition
AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
* FIX out of bounds array access in SAGA

* FIX Fix boundary for condition
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
* FIX out of bounds array access in SAGA

* FIX Fix boundary for condition
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
* FIX out of bounds array access in SAGA

* FIX Fix boundary for condition
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unstable LogisticRegression with saga solver l1 penalty under macOS
4 participants