Skip to content

TST Improve assert_argkmin_results_quasi_equality error message #27281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jjerphan
Copy link
Member

@jjerphan jjerphan commented Sep 3, 2023

Reference Issues/PRs

Fixes #27126.
Towards #25888.

Updated scope of this PR

This changes the way we check neighbor results, notably for test_pairwise_distances_argkmin with float32 so that:

Furthermore, the tests parameterization has been updated to make it faster to run all the tests with all the seeds while still exploring various problem sizes.

Signed-off-by: Julien Jerphanion <git@jjerphan.xyz>

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@github-actions
Copy link

github-actions bot commented Sep 3, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 4a0592a. Link to the linter CI: here

@ogrisel
Copy link
Member

ogrisel commented Sep 11, 2023

@jjerphan I pushed a commit to run test_pairwise_distances_argkmin with "all random seeds" so that we can see if the improved error message help us understand what's going on with #27126.

@jjerphan
Copy link
Member Author

Feel free to adapt this PR at will.

@ogrisel
Copy link
Member

ogrisel commented Sep 11, 2023

I ran one of the failing tests of #27126 locally using:

SKLEARN_TESTS_GLOBAL_RANDOM_SEED="45" pytest sklearn/metrics/tests/test_pairwise_distances_reduction.py -v -k "test_pairwise_distances_argkmin and cityblock" -x  --pdb

the error message was:

E               AssertionError: Neighbors indices are not matching when rounding distances at
E                 3 significant digits derived from rtol=1.0e-04.
E                 
E                 Query vector index         : 61
E                 Reference ordered distances: [159450.703125 160197.78125  160429.53125  161349.21875  161953.109375
E                  162041.515625 162137.984375 162665.90625  162700.25     163778.59375 ]
E                 Computed ordered distances : [159450.65128034 160197.77867103 160429.4601412  161349.25246161
E                  161953.08416814 162041.40858293 162138.01407665 162665.88381121
E                  162700.24564451 163778.55885714]
E                 Norm. abs. ord. dist. diff.: 0.03828817754983902
E                 Rounded distance           : 164000.0
E                 Neighbors group rank       : 5
E                 Reference neighbors indices: [16]
E                 Computed neighbors indices : [88]
E               assert {16} == {88}
E                 Extra items in the left set:
E                 16
E                 Extra items in the right set:
E                 88
E                 Full diff:
E                 - {88}
E                 + {16}

but I also wanted to see the distances and the matching indices arrays, so I used the debugger:

(Pdb) p ref_indices_row
array([39, 48, 21, 56, 46, 93, 59, 54, 45, 16])
(Pdb) p indices_row
array([39, 48, 21, 56, 46, 93, 59, 54, 45, 88])
(Pdb) p ref_dist_row
array([159450.703125, 160197.78125 , 160429.53125 , 161349.21875 ,
       161953.109375, 162041.515625, 162137.984375, 162665.90625 ,
       162700.25    , 163778.59375 ])
(Pdb) p dist_row
array([159450.65128034, 160197.77867103, 160429.4601412 , 161349.25246161,
       161953.08416814, 162041.40858293, 162138.01407665, 162665.88381121,
       162700.24564451, 163778.55885714])

The distances for the last neighbor are indeed very close and the difference happens at the 8th digits which might be expected for float32 rounding errors.

Using pdb I went up in the stack to compare the distances in the reference distance matrix computed by cdist:

(Pdb) p dist_matrix[61, 16]
163778.57524980605
(Pdb) p dist_matrix[61, 88]
163778.55885714293

so those are indeed close. Let's have a look at the next neighbors in the reference cdist matrix:

(Pdb) p dist_matrix.argsort(axis=1)[61, :15]
array([39, 48, 21, 56, 46, 93, 59, 54, 45, 88, 16, 15, 17, 73,  7])

so 88 and 16 are indeed the closest candidate for rank k=10 according to cdist.

So we should probably find a way to tolerate those cases where the Cython code returns neighbors swaps within the range of rounding errors of the chosen dtype.

@ogrisel
Copy link
Member

ogrisel commented Sep 11, 2023

The CI has failed because of the way I formatted the commit message. Let me push another empty commit with the right message structure.

@ogrisel
Copy link
Member

ogrisel commented Sep 11, 2023

@jjerphan the last commit message makes it possible to reproduce all the failures with the extra info of this PR.

I suppose we should simplify the code of assert_argkmin_results_quasi_equality. Instead of the complex bucketing strategy, we could provide the full reference distance row (without sorting and truncation) and check that the top k computed neighbors are:

  • in the right order based on the Cython computed distances,
  • have Cython distance values that match the reference cdist distance within tolerance (irrespective of the reference ordering),
  • such that there exist no other element in the reference distance row of the query that is significantly smaller most distant top-k neighbor found by Cython.

This should be probably more robust while being both strict enough and easier to debug in case of failure. WDYT?

@jjerphan
Copy link
Member Author

We could do that yes.

I think the current yet complex assertion checks that the ordering between both algorithms is identical (up to a bucketing for acceptable numerical imprecision of float32). Is there a way we could check that with what you propose? If I understand your first item correctly, it does not propose to assert that but only the correct ordering of the index w.r.t their distances, is this right?

@ogrisel
Copy link
Member

ogrisel commented Sep 12, 2023

Is there a way we could check that with what you propose? If I understand your first item correctly, it does not propose to assert that but only the correct ordering of the index w.r.t their distances, is this right?

Yes but since the computed distances for each neighbor should match the reference computed distance within round-off errors and furthermore that there should not exist any significantly closer neighbor based on the reference distances, the approximate ordering should also be satisfied as a result.

@ogrisel
Copy link
Member

ogrisel commented Sep 12, 2023

Let me try to give it a shot.

@ogrisel
Copy link
Member

ogrisel commented Sep 13, 2023

@jjerphan I pushed two commits to implement what I had in mind with even stronger tests for the assertion function. This also fixes some xfail edge cases (at least on my machine).

I decided to focus this PR to the argkmin tests for now but we could do a follow-up PR to do a similar treatment for the radius neighbors tests and further simplify this test file.

Once the last commit CI is green, I will push a new empty commit with message [azure parallel] [all random seeds] test_pairwise_distances_argkmin to re-run will all seeds on the CI.

@ogrisel
Copy link
Member

ogrisel commented Sep 13, 2023

The code coverage report reveals that the tests for the checker could be improved. I will do that tomorrow. In the mean time let me push the all random seed commit.

@ogrisel ogrisel force-pushed the tst/improve-error-message-assert_argkmin_results_quasi_equality branch from d4161f7 to 04f7bdd Compare September 14, 2023 06:22
@ogrisel
Copy link
Member

ogrisel commented Sep 15, 2023

@jjerphan finally this PR is ready for reviewing.

It should reduce the average CI time quite significantly (see the CI reports of the second to last commit).

@jjerphan
Copy link
Member Author

Thanks!

I do not really have time to review it now, but I trust you.

(I find that having me approve this PR is a bit weird since I opened it but I, in fact, have not contributed to it much but you.)

@ogrisel
Copy link
Member

ogrisel commented Sep 16, 2023

With the recent merges to accelerate conda install and reduce time of the slowest test, combined with this PR, the macOS CI is almost fast again!

image

Copy link
Member Author

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not reviewed it entirely, but I am trusting you for coming up with better assertions than mine.

LGTM for the sake of a faster CI and better error messages.

(I can't approve this PR, but I could if you reopen a PR with this branch.)

@ogrisel
Copy link
Member

ogrisel commented Sep 17, 2023

I have not reviewed it entirely, but I am trusting you for coming up with better assertions than mine.

Let's wait for you to get a but of time to have a quick look, at least on the updated tests and their messages. If the test cases and the messages make sense to you, then this should be good.

I can't approve this PR.

You can just say so in a comment :)

def test_assert_argkmin_results_quasi_equality():
rtol = 1e-7
eps = 1e-7
def test_assert_compatible_argkmin_results():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, this is the main entry point for the review.

rtol = 1e-7
eps = 1e-7
@pytest.mark.parametrize("check_sorted", [True, False])
def test_assert_compatible_radius_results(check_sorted):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this is a second entry point point to the review.

Copy link
Member Author

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️ LGTM. Thank you, @ogrisel.

Here are some comments which need not be treated and a question.

Edit: I would not block this PR if this fixes errors on the CI which have been there for too long.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only nitpicks. I am fine with the new tests.

indices_row_a,
indices_row_b,
threshold,
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would also be nice to have a docstring if the code below is already explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 7fb4261.

Comment on lines 181 to 185

for neighbor_rank in range(n_neighbors):
rounded_dist = relative_rounding(
ref_dist_row[neighbor_rank],
n_significant_digits=n_significant_digits,
)
reference_neighbors_groups[rounded_dist].add(ref_indices_row[neighbor_rank])
effective_neighbors_groups[rounded_dist].add(indices_row[neighbor_rank])

# Asserting equality of groups (sets) for each distance
msg = (
f"Neighbors indices for query {query_idx} are not matching "
f"when rounding distances at {n_significant_digits} significant digits "
f"derived from rtol={rtol:.1e}"
# Check that any neighbor with distances below the rounding error threshold have
# matching indices.
threshold = (1 - rtol) * np.maximum(
np.max(dist_row_a), np.max(dist_row_b)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to scratch my head here. rtol is a bit different from the other usage. I don't know if this is siginificant enough to be mentioned (having the docstring in the assert_no_missing_neighbors will help already.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope that 7fb4261 helps.

# checking the results is expensive for large result sets), yielding 0 most
# of the time would make the test useless.
if precomputed_dists is None and metric is None:
raise ValueError("Either metric or dists must be provided")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently this is not covered.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_non_trivial_radius is a test helper function. It's not meant to be used in the library so I think it's fine (in particular it is not accounted in the test coverage report). This exception is just there to help maintainers use it correctly if they refactor the tests again in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, can it be:

assert (
    precomputed_dists is not None or metric is not None
), "Either metric or dists must be provided"

@ogrisel
Copy link
Member

ogrisel commented Sep 27, 2023

Thanks for the review @glemaitre.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Green tick for #27281 (review).

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good on my side.

@glemaitre glemaitre merged commit b06a099 into scikit-learn:main Sep 28, 2023
lesteve pushed a commit to lesteve/scikit-learn that referenced this pull request Sep 28, 2023
…ikit-learn#27281)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
…ikit-learn#27281)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

⚠️ CI failed on Ubuntu_Atlas.ubuntu_atlas ⚠️
4 participants