Skip to content

[MRG + 1] Remove pilutil from examples #10527

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Feb 12, 2018

Conversation

jotasi
Copy link

@jotasi jotasi commented Jan 24, 2018

Reference Issues/PRs

See discussion in #10502

What does this implement/fix?

#10502 replaced scipy.misc.imresize, which was deprecated by SciPy, with the local copy sklearn.externals._pilutil.imresize. As pointed out by @jnothman and agreed on in the linked discussion, this usage of a private module in an example is non-ideal. So this PR replaces it again by a combination of scipy.ndimage.filters.gaussian_filter and skimage.image.rescale. The new dependency (scikit-image) is added to the README and to circleCI as pointed out by @lesteve in his instructions on how to proceed (Thank you for these, btw).

Any other comments?

  • The results differ slightly from the older version. I'll post a comparison later (or tomorrow).
  • To avoid the overwriting of face from a function to images of different resolution, I introduced new variables orig_face, smoothened_face, and rescaled_face. If preferred, I can also change it back, though.
  • Different from what was discussed in [MRG] Removed ref. to deprecated imread/imresize in docs #10502, I did not use skimage.filters.gaussian. This wrapper around scipy.ndimage.filters.gaussian_filter changed names from skimage.filter.gaussian_filter to skimage.filters.gaussian_filter in version 0.11 and then to skimage.filters.gaussian in 0.12, while skimage.filters.gaussian_filter was deprecated and is removed in the current dev version 0.14. As the resulting convolution of try: except ImportError: blocks did not seem right for a minor part of a simple example, I went with scipy.ndimage.filters.gaussian_filter instead.
  • So far I have checked compatibility with the minimum scikit-image version (0.9.3) in the README only in a local docker container. I'll add a link to the circleCI-artifacts as soon as the builds are done.
  • BTW, while adding scikit-image I saw that the versions in circleCI (SciPy=0.14, Matplotlib=1.3) do not match the minima in the README (SciPy>=0.13.3, Matplotlib>=1.1.1). Is this difference intentional or should this be updated in the README at some point?
  • Finally, both examples raise a warning on my machine:
    matplotlib/contour.py:967: UserWarning: The following kwargs were not used by contour: 'contours'
    plot_face_segmentation.py also raises:
    sklearn/utils/graph.py:115: FutureWarning: Conversion of the second argument of issubdtype from int to np.signedinteger is deprecated. In future, it will be treated as np.int64 == np.dtype(int).type.
    Should I remove the offending parts as well?

@jnothman
Copy link
Member

You really need to stop referring to skimage as scikit-learn here. It's just too funny. Above you propose we need a new dependency on our own library...

@jotasi
Copy link
Author

jotasi commented Jan 24, 2018

Oh, you're obviously right, shame on me!

@jotasi
Copy link
Author

jotasi commented Jan 25, 2018

Here are the links to the artifacts for python 2 (plot_face_segmentation.py and plot_face_ward_segmentation.py) and python 3 (plot_face_segmentation.py and plot_face_ward_segmentation.py). The links to the examples in the development documentation can be found here and here. I also put example plots of the differences produced on my local machine below, comparing them to the dev-docs. Differences arise for example depending on when the conversion to floats is done and there is also a slight dependence on the boundary conditions. Should I try to tweak the resizing/smoothing to get the results to match more closely or are the results good enough the way they are now?

The resulting differences between before and after this PR on my local machine for plot_face_ward_segmentation.py are similar to those between artifacts and docs:
Now:
for_gh_ward_now
Dev-doc:
image

For plot_face_segmentation.py, the situation is similar. Big differences only arise to the stable doc and that seems to be due to the fix in #9062. Nonetheless, there are still some less pronounced differences:

Now:
for_gh_kmeans_now
Dev-doc:
image

Now:
for_gh_discretize_now
Dev-doc:
image

Update:
Edited out the comparisons to the images before and after #9062 and rather compared the images 1-to-1 with the dev-docs, as suggested by @lesteve. Changed the text accordingly.

@lesteve
Copy link
Member

lesteve commented Jan 29, 2018

BTW, while adding scikit-image I saw that the versions in circleCI (SciPy=0.14, Matplotlib=1.3) do not match the minima in the README (SciPy>=0.13.3, Matplotlib>=1.1.1). Is this difference intentional or should this be updated in the README at some point?

Good catch I updated the README.md for matplotlib and pushed in master (sorry you'll need to fix the conflicts). At the moment we decided to use the versions from Ubuntu 14.04 as our minimal dependencies. We could specify the numpy and scipy version in .circleci/config.yml, if you could do that in a separate PR, that would be great!

For the image comparison you need to compare to the dev doc and not stable. To make it easier to compare, this is better if you post one image for this PR and the same image from the dev doc and then go on to the next image. If you could edit your previous post, that'd be great.

About the image difference, I don't know. Maybe @glemaitre and @jmargeta have some idea since they work on #9062.

Other comments:

  • it's fine using scipy.ndimage.filters.gaussian_filter
  • not sure about the warnings, it looks like they should be addressed. If you could do that in a separate PR or open an issue, this would be great.

@glemaitre
Copy link
Member

About the image difference, I don't know. Maybe @glemaitre and @jmargeta have some idea since they work on #9062.

yep this is due to the fix. I work better now :) The last example (with kmeans) shows the fix. We get cluster which are more consistent.

@jmargeta
Copy link
Contributor

I second what @glemaitre says.

@jotasi, it is always good to see it being reproduced, thank you!
The differences not only in graph construction, but also random seed initialization and image type can indeed cause noticeable differences in the boundaries.

It leads me to question whether this image is actually a good example for the technique for a image segmentation / superpixels task. In contrast to papers using similar techniques I do not find the current results particularly convincing (e.g. boundaries are jagged and do not follow the real objects' boundaries). Wouldn't an image with crisper boundaries show the potential more clearly?

@lesteve
Copy link
Member

lesteve commented Jan 29, 2018

I fixed the conflicts. I think this PR should focus on removing sklearn.externals._pilutil usages from the examples. Whether there is better image we can use for this example can be done in a separate issue/PR (and if we go ahead and merge this PR, we can take use one of the scikit-image image)

@jotasi
Copy link
Author

jotasi commented Jan 30, 2018

Thanks for taking care of the conflicts @lesteve, and thanks everyone for the feedback. I've updated my comment above now comparing the 3 images 1-to-1 with the dev-docs. I'll add separate issues/PRs for the following things:

  • Adding numpy/scipy versions to circleci
  • Fixing the warnings raised in the example
  • Checking for a better image

@jotasi jotasi changed the title [WIP] Remove pilutil from examples [MRG] Remove pilutil from examples Jan 30, 2018
@@ -26,26 +26,30 @@

import numpy as np
import scipy as sp
from scipy.ndimage.filters import gaussian_filter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that as well. The problem with it is that it changed both name and module at different times since version 0.9.3 (which is the default in Ubuntu trusty) and thus one would need multiple try:... except ImportError:... which did not seem right for a simple example to me. (See the 3rd additional comment in opening post of the PR).
Do you agree that I then should keep SciPys version or should I change it anyway?

face = imresize(face, 0.10) / 255.
# Applying a Gaussian filter for smoothing prior to down-scaling
# reduces aliasing artifacts.
smoothened_face = gaussian_filter(orig_face/255., sigma=(1/0.10-1)/2.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually you don't need that:

http://scikit-image.org/docs/dev/api/skimage.transform.html#skimage.transform.rescale

You can turn anti_aliasing=True to get the same effect.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is only in the current dev-version. In the docs of the current stable version (0.13.1) it is not listed, yet. One might want to keep this in mind and change in the (rather far) future, though.
(See also discussion here)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. In this case rewrite it as:

smoothened_face = gaussian_filter(orig_face, sigma=4.5)

face = imresize(face, 0.10) / 255.
# Applying a Gaussian filter for smoothing prior to down-scaling
# reduces aliasing artifacts.
smoothened_face = gaussian_filter(orig_face/255., sigma=(1/0.10-1)/2.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my previous comment

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@jotasi
Copy link
Author

jotasi commented Feb 8, 2018

Thanks for reviewing @glemaitre. I've explained my reasoning to your comments inline. Please let me know if you want to change this anyway (or something else).

@glemaitre
Copy link
Member

I see the point regarding the filtering. Let's keep scipy since that scikit-image is not released.

except ImportError:
face = sp.face(gray=True)
orig_face = sp.face(gray=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to avoid using the / 255, let's use make the following

from skimage import img_as_float64

orig_face = img_as_float64(sp.face(gray=True))

# Applying a Gaussian filter for smoothing prior to down-scaling
# reduces aliasing artifacts.
smoothened_face = gaussian_filter(orig_face/255., sigma=(1/0.10-1)/2.)
rescaled_face = rescale(smoothened_face, 0.10, mode="reflect")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.10 -> 0.1

except ImportError:
face = sp.face(gray=True)
orig_face = sp.face(gray=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comments

@jotasi
Copy link
Author

jotasi commented Feb 8, 2018

I've implemented your suggestions. My original reasoning for sigma=(1/0.10-1)/2. was that this is the way sigma will be computed by scikit-image's rescale with antialiasing=True in the future but I agree that this is not really helpful without an explanation and as this not the point of the example, I've dropped it. Concerning using skimage.img_as_float64, also good point! Btw. I chose skimage.img_as_float instead, as the former is again only available in the dev-version and skimage.img_as_float produces float64 anyway (see the docs). I hope that this covers everything. Otherwise, please let me know!

@glemaitre
Copy link
Member

Concerning using skimage.img_as_float64, also good point! Btw. I chose skimage.img_as_float instead, as the former is again only available in the dev-version and skimage.img_as_float produces float64 anyway (see the docs). I hope that this covers everything. Otherwise, please let me know!

Good point. I should really check the stable documentation :)

So this PR LGTM.

@jotasi could you open some issue with the points mentioned there: #10527 (comment)

@glemaitre glemaitre changed the title [MRG] Remove pilutil from examples [MRG + 1] Remove pilutil from examples Feb 8, 2018
@jotasi
Copy link
Author

jotasi commented Feb 8, 2018

I've opened an issue for the last point. I'll submit a PR for that if this PR gets merged. The other two should already be fixed by #10557 and #10569 respectively.

@jotasi
Copy link
Author

jotasi commented Feb 8, 2018

Sorry, that was not really true. Actually, the first point

Adding numpy/scipy versions to circleci

is not really fixed, as the PR had to be reverted. It is not possible to simply add the minimum version of SciPy mentioned in the README (0.13.3) to circleCI, as one of the examples actually fails to work with this version (works only for >0.14.0 as explained in this comment in the discussion to #10557). Should I open an issue for that as well as even though it does not seem to be easily fixable?

@lesteve
Copy link
Member

lesteve commented Feb 12, 2018

I pushed some minor tweaks, I will merge when CIs are green.

@lesteve
Copy link
Member

lesteve commented Feb 12, 2018

Circle is green and I checked that the examples looked fine. Merging thanks a lot @jotasi!

@lesteve lesteve merged commit 7f0e433 into scikit-learn:master Feb 12, 2018
@jotasi jotasi deleted the remove__pilutil_from_examples branch February 15, 2018 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants