Fix most doctests on numpy-dev with doctest ellipsis #10143


Closed
wants to merge 2 commits

Conversation

@lesteve (Member) commented Nov 15, 2017

Should remove most errors discussed in #10074.

@lesteve (Member, Author) commented Nov 15, 2017

Travis log from the Cron job on my fork: https://travis-ci.org/lesteve/scikit-learn/builds/302533632.

Note there will still be at least one error remaining because of numpy/numpy#10026.

@lesteve (Member, Author) commented Nov 15, 2017

Travis log from the Cron job on my fork: https://travis-ci.org/lesteve/scikit-learn/builds/302533632.

The only failure is the expected one:

____________ [doctest] sklearn.metrics.pairwise.manhattan_distances ____________
498         If sum_over_features is False shape is
499         (n_samples_X * n_samples_Y, n_features) and D contains the
500         componentwise L1 pairwise-distances (ie. absolute difference),
501         else shape is (n_samples_X, n_samples_Y) and D contains
502         the pairwise L1 distances.
503 
504     Examples
505     --------
506     >>> from sklearn.metrics.pairwise import manhattan_distances
507     >>> manhattan_distances([[3]], [[3]])#doctest:+ELLIPSIS
Expected:
    array([[ 0.]])
Got:
    array([[0.]])

@qinhanmin2014 (Member) left a comment

The sacrifice seems high, e.g., 0.849 -> 0.8..., which makes me hesitant to approve, though it seems we need to do so to pass the tests against numpy dev.

1.0
>>> completeness_score([0, 1, 2, 3], [0, 0, 1, 1]) # doctest: +ELLIPSIS
0.99999999...
Member

Do we have a better way here? 0.99999999... seems a bit magical to serve as an example for ordinary users.

Member

Is this a bug or a feature?

Member Author

Do we have a better way here? 0.99999999... seems a bit magical to serve as an example for ordinary users.

Not that I know of.

Is this a bug or a feature?

It is kind of a feature: numpy floats now print and repr the same as Python floats.
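
To illustrate what changed (the exact numpy 1.13 digits below are indicative, quoted from memory rather than taken from this PR):

import numpy as np

x = np.float64(0.1)

# numpy <= 1.13: scalar repr used 17 significant digits,
#   repr(x)  ->  '0.10000000000000001'
# numpy >= 1.14: scalar repr uses the shortest string that round-trips,
# i.e. the same as the built-in Python float:
#   repr(x)  ->  '0.1'
print(repr(x))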

@lesteve (Member, Author) commented Nov 15, 2017

The sacrifice seems high, e.g., 0.849 -> 0.8..., which makes me hesitant to approve, though it seems we need to do so to pass the tests against numpy dev.

0.849 -> 0.8... is because on numpy dev you get 0.85 and on numpy 1.13 you get 0.849999999. There is no way to cover that in doctests while keeping the precision (suggestions welcome!). astropy has a custom doctest runner that compares floats, and there was some talk about implementing a more generic one in numpy, but I don't think this has happened.
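
To make that concrete, here is a small self-contained check using doctest's own matcher (the 17-digit numpy 1.13 value is illustrative, not copied from a real run):

import doctest

checker = doctest.OutputChecker()
flags = doctest.ELLIPSIS

# An expectation loosened to the shared prefix matches both the numpy 1.13
# repr and the numpy dev repr of the same score...
for got in ("0.84999999999999998\n", "0.85\n"):
    assert checker.check_output("0.8...\n", got, flags)

# ...while a more precise expectation only matches one of them, which is why
# the docstring value has to degrade from 0.849... to 0.8...
assert checker.check_output("0.849...\n", "0.84999999999999998\n", flags)
assert not checker.check_output("0.849...\n", "0.85\n", flags)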

The killer argument for me is this: we do not and we should not rely on doctests to test scikit-learn thoroughly. The doctests are present in docstrings (and rst files) to kind of inform the user of the expected result and check that we do get the expected result.

This means that we should spend the minimal amount of time to get something working in both numpy 1.13 and numpy 1.14.

The best solution I see: when numpy 1.14 is released, we update all the docstrings to work with numpy 1.14 (without legacy mode printing of course) and we just skip the doctests in numpy <= 1.13 (for example using conftest.py).
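
A rough sketch of what that conftest.py hook could look like (an illustration of the idea only, not code from this PR or from scikit-learn's actual conftest):

# conftest.py
import numpy as np
import pytest

# Docstrings updated for the numpy 1.14 repr would fail on older numpy,
# so skip every collected doctest item when an older numpy is installed.
_np_version = tuple(int(x) for x in np.__version__.split('.')[:2])


def pytest_collection_modifyitems(config, items):
    if _np_version >= (1, 14):
        return
    skip = pytest.mark.skip(reason="doctest output written for numpy >= 1.14")
    for item in items:
        # Compare by class name to avoid importing the private
        # _pytest.doctest.DoctestItem class.
        if type(item).__name__ == "DoctestItem":
            item.add_marker(skip)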

@lesteve (Member, Author) commented Nov 15, 2017

We should probably wait to see how numpy/numpy#10029 unfolds before merging. If numpy decides that np.set_printoptions(legacy=True) should make numpy dev (aka 1.14) print/repr numpy scalars as in numpy 1.13, we do not need this PR at all.

@jnothman (Member) commented Nov 15, 2017 via email

Can use string formatting if it comes to that...

@lesteve (Member, Author) commented Nov 16, 2017

Can use string formatting if it comes to that...

Yeah, I thought about that. I would do that as a last resort though: it makes the docstring (or the code in the rst file) a bit harder to follow.
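
For context, the string-formatting workaround would look roughly like this (a sketch, not code from this PR; the printed value is the 0.99999999... score above rounded to four decimals):

>>> from sklearn.metrics.cluster import completeness_score
>>> score = completeness_score([0, 1, 2, 3], [0, 0, 1, 1])
>>> print("{:.4f}".format(score))  # fixed formatting is stable across numpy versions
1.0000

The extra assignment and print are exactly the overhead that makes the docstring harder to read.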

@NelleV (Member) commented Nov 16, 2017

Before fixing anything related to numpy's development version, we should contact the numpy developers: it has happened in the past that some projects "fixed" their code for numpy-dev too early, when in fact a regression had been introduced on the master branch (see, I believe, @njsmith's BIDS presentation).

@lesteve (Member, Author) commented Nov 16, 2017

Before fixing anything related to numpy's development version, we should contact the numpy developers

Agreed, that's what we generally do, unless it is very obvious that the problem is entirely our fault (ignoring a DeprecationWarning until numpy turns it into an error comes to mind). In this particular case I opened numpy/numpy#10026 and numpy/numpy#10029.

@lesteve (Member, Author) commented Nov 20, 2017

Closing this one, because numpy has fixed the issues I opened. On a slightly related note, I am investigating the failures in the latest Cron job. It feels like conftest.py is not taken into account (probably because we run the tests from a different folder than the checkout), but I need to investigate a bit more to be 100% sure.

@lesteve closed this Nov 20, 2017