
[CI] Fix test failures in the nightly job #19114


Closed · alfaro96 wants to merge 3 commits from the fix_nightly_ci branch

Conversation

@alfaro96 (Member) commented Jan 5, 2021

What does this implement/fix? Explain your changes.

This PR fixes the remaining failures in the nightly job.

@thomasjpfan (Member) commented:

Can you provide a link to the failing test?

If we need to turn off xdist, we can set `PYTEST_XDIST_VERSION: none` in azure-pipelines.yml.

@alfaro96 (Member, Author) commented Jan 5, 2021

> Can you provide a link to the failing test?
>
> If we need to turn off xdist, we can set `PYTEST_XDIST_VERSION: none` in azure-pipelines.yml.

The link to the failing test is here. Indeed, I was just testing whether the problem was related to the parallel testing (and it is).

I have turned off xdist with `PYTEST_XDIST_VERSION: 'none'`, but we still need to investigate the failing test.

@thomasjpfan (Member) commented Jan 5, 2021

Ah, it's the `fetch_20newsgroups_vectorized_fxt` one. This fails because the `fetch_*` functions are not thread-safe, so when multiple processes try to download the dataset at the same time, it can break. I think the "simple fix" would be to fetch all the data before pytest runs whenever we need the network for testing.

Edit: I had a solution somewhere in #17553, which I pulled out to resolve the issue directly.
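
For illustration, here is a minimal sketch of that pre-fetch idea: trigger the downloads once, in the main pytest process, before pytest-xdist spawns its workers. The use of a conftest.py hook, the `SKLEARN_SKIP_NETWORK_TESTS` gate, and the single fetcher shown are assumptions for the sketch, not necessarily what the actual fix ended up doing.

```python
# conftest.py -- minimal sketch of the pre-fetch idea, not the actual change.
# The SKLEARN_SKIP_NETWORK_TESTS gate and the single fetcher used here are
# assumptions for illustration.
import os

from sklearn.datasets import fetch_20newsgroups_vectorized


def pytest_configure(config):
    # pytest_configure runs in the controller process before pytest-xdist
    # spawns its workers (and again inside each worker, where the data is
    # then already cached on disk), so the download happens once instead of
    # several workers racing to write the same cache files.
    if os.environ.get("SKLEARN_SKIP_NETWORK_TESTS", "1") == "0":
        fetch_20newsgroups_vectorized(subset="all")
```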

@ogrisel (Member) commented Jan 6, 2021

I prefer the solution of #19118 (pre-downloading the datasets sequentially to avoid issues with pytest-xdist).

@alfaro96 (Member, Author) commented Jan 6, 2021

> I prefer the solution of #19118 (pre-downloading the datasets sequentially to avoid issues with pytest-xdist).

+1 for @thomasjpfan's solution. Nevertheless, we still need to solve the remaining test failure.

@ogrisel (Member) commented Jan 13, 2021

> +1 for @thomasjpfan's solution. Nevertheless, we still need to solve the remaining test failure.

Indeed.

There is still a remaining failure in `sklearn.metrics.cluster.tests.test_supervised.test_exactly_zero_info_score`, which looks like a real regression triggered by `adjusted_mutual_info_score` and deserves investigation.
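
For context, the invariant being exercised is roughly the following (a hand-written sketch, not the actual test body): for labelings that share no information, the normalized and adjusted scores should come out as exactly 0.0, with no floating-point noise.

```python
# Rough sketch of the property test_exactly_zero_info_score checks; the exact
# label constructions and score functions in the real test may differ.
import numpy as np

from sklearn.metrics import adjusted_mutual_info_score, normalized_mutual_info_score

# One constant labeling vs. an all-distinct labeling: the mutual information
# between them is exactly zero, so the scores must be exactly 0.0.
labels_a = np.ones(100, dtype=int)
labels_b = np.arange(100)

assert normalized_mutual_info_score(labels_a, labels_b) == 0.0
assert adjusted_mutual_info_score(labels_a, labels_b) == 0.0
```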

@ogrisel (Member) commented Jan 13, 2021

Closing this PR now that #19118 is merged, in favor of a dedicated issue to track the remaining test failure: #19165.

@ogrisel ogrisel closed this Jan 13, 2021
@alfaro96 alfaro96 deleted the fix_nightly_ci branch January 13, 2021 16:24