
[MRG] Example Bag-of-Visual-Words #6509


Closed
wants to merge 17 commits

Conversation

glemaitre
Member

This is a first draft of the example illustrating the BOVW using scikit-learn.

@amueller
Member

amueller commented Mar 8, 2016

Using k-means++ often means that the initialization takes much longer than the actual algorithm (for large n_clusters in particular).
I wouldn't do cross-validation, and I'd see if you can make it faster by using fewer words or fewer iterations while getting comparable performance.
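A quick way to see the trade-off described here is to time the two seeding strategies when building the codebook. A minimal sketch on random stand-in patches (the shapes and counts below are illustrative, not taken from the PR):

```python
import time

import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)
X = rng.rand(5000, 81)  # stand-in for flattened 9x9 patches

# Compare k-means++ seeding against plain random seeding.
for init in ("k-means++", "random"):
    start = time.time()
    vocab = MiniBatchKMeans(n_clusters=80, init=init, n_init=3,
                            random_state=0).fit(X)
    print(init, vocab.inertia_, time.time() - start)
```

On real patch data the inertia gap between the two inits is often small for a visual vocabulary, while the seeding cost difference grows with `n_clusters`.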

@@ -0,0 +1,126 @@
"""TU Darmstadt dataset.
Member

maybe I'll just add this to the example, and not put it in the datasets folder. This is not really a widely used dataset. and once it is in the public api, it is hard to get rid of.

@glemaitre
Member Author

I moved the dataset into example as suggested.

Regarding the parameters, I decreased the number of words to 80. I also limited the maximum number of patches extracted to 20,000 per image (maybe we will classify grass instead of cow), which is about a fourth to a fifth of the possible total number of patches.

The results seem appropriate, with a large decrease in computation time, which is starting to be suitable.

Classification performed - the confusion matrix obtained is:
[[23  0  0]
 [ 1 19  0]
 [ 0  0 23]]
It took 85.6792821884 seconds.
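The per-image cap can be sketched with `extract_patches_2d` from `sklearn.feature_extraction.image`; the 9x9 patch size matches the PR, while the image size and cap below are scaled-down stand-ins:

```python
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.RandomState(42)
image = rng.rand(100, 150)  # stand-in grayscale image

# Randomly sample at most max_patches 9x9 patches from the image,
# instead of enumerating every possible patch position.
patches = extract_patches_2d(image, (9, 9),
                             max_patches=2000, random_state=rng)
print(patches.shape)  # (2000, 9, 9)
```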

@GaelVaroquaux
Member

GaelVaroquaux commented Mar 9, 2016 via email

@glemaitre
Member Author

Using 50 words drops the accuracy a bit, but the time goes down to about 35 seconds.

Classification performed - the confusion matrix obtained is:
[[19  4  0]
 [ 0 20  0]
 [ 0  0 23]]
It took 33.9749679565 seconds.

Also, I am taking advantage of my 4 available threads (i7 2620M). I don't know whether that should be taken into account.

@glemaitre
Member Author

@GaelVaroquaux @amueller

What further improvements/changes should I take care of?

Cheers,

@glemaitre
Member Author

@amueller @GaelVaroquaux

What else should I look at?

@amueller
Member

Sorry for the slow reply.
Please fix Python 3 compatibility and rename the example to `plot_bovw.py` (the `plot_` prefix is the important part; I just thought "example" was a bit redundant in the examples folder ;).

Also add a header to the example and some text explaining what's happening.
Thanks!

@glemaitre
Member Author

OK, I am up to date now with the requirements. I'll do that over the weekend and ping once done.

@glemaitre
Member Author

@amueller Shall I remove the timer?

@glemaitre
Member Author

@amueller @GaelVaroquaux Do you see any other additional improvements to bring?

Should tudarmstadt.py be moved into the sklearn.datasets module?

I decided to decrease the number of patches to get under 10 seconds of processing.
I got the following confusion matrix, which I think is fine for an example:

[[21  2  0]
 [ 4 16  0]
 [ 0  0 23]]

We are confusing cars and motorbikes, but we get the cows right ;D
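For reference, the matrices quoted in this thread are produced by `sklearn.metrics.confusion_matrix` (rows are true classes, columns are predicted classes). A toy three-class illustration:

```python
from sklearn.metrics import confusion_matrix

# Three classes; one class-0 sample and two class-1 samples collide,
# mirroring the car/motorbike confusion above.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2]
print(confusion_matrix(y_true, y_pred))
# [[1 1 0]
#  [0 2 0]
#  [0 0 2]]
```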

Bag of Visual Words
===================

An illustration of a Bag of Visual Words approach for image recognition
Member

Maybe explain that this is not how one would actually solve this problem, but that it's simplified so as not to use any vision library and to run fast enough ;)

@amueller
Member

Can you check the pep8 errors please?

@glemaitre
Member Author

Can you check the pep8 errors please?

@amueller I have the following errors since the docstring comes first.

plot_bovw.py:25:1: E402 module level import not at top of file
plot_bovw.py:26:1: E402 module level import not at top of file
plot_bovw.py:28:1: E402 module level import not at top of file
plot_bovw.py:29:1: E402 module level import not at top of file
plot_bovw.py:30:1: E402 module level import not at top of file
plot_bovw.py:31:1: E402 module level import not at top of file
plot_bovw.py:32:1: E402 module level import not at top of file
plot_bovw.py:33:1: E402 module level import not at top of file
plot_bovw.py:34:1: E402 module level import not at top of file
plot_bovw.py:36:1: E402 module level import not at top of file
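For context, gallery examples put the rST docstring (and often a `print(__doc__)` call) before the imports, which is what makes flake8 report E402; a per-line `# noqa: E402` marker is one common way to silence it. A minimal layout sketch (not the actual `plot_bovw.py`):

```python
"""Bag of Visual Words (layout sketch, not the real example file)."""
# A statement before the imports, as gallery examples often have:
print(__doc__)

import numpy as np  # noqa: E402  (import deliberately not at top of file)

print(np.zeros(3).sum())  # 0.0
```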

from sklearn.metrics import confusion_matrix
from sklearn.externals.joblib import Parallel, delayed

from tudarmstadt import fetch_tu_darmstadt
Member

This needs to be moved to sklearn/datasets

@glemaitre
Member Author

@raghavrv I moved it. But let me recall what @amueller mentioned earlier:

maybe I'll just add this to the example, and not put it in the datasets folder. This is not really a widely used dataset. and once it is in the public api, it is hard to get rid of.

That's why we kept it originally in the example folder. So let me know what is the best.

@raghavrv
Member

That's why we kept it originally in the example folder. So let me know what is the best.

Argh. My bad! I did not notice that comment... Could you very kindly revert please?

@glemaitre
Member Author

That comment had been hidden since I previously addressed it. I'll bring it up again if it pops up after reverting.

@glemaitre
Member Author

@amueller ping

@lesteve
Member

lesteve commented Dec 1, 2016

That's why we kept it originally in the example folder. So let me know what is the best.

Hmmm, but then it kind of defeats the implicit contract of an example, which is being able to copy and paste it into an IPython session and have it run without a hitch.

Not sure what is the best way of doing this (the dataset is 42MB so too big to be included in the scikit-learn repo). Here are a few alternatives I can think of:

  • put the fetcher code in the example
  • keep it like this but explain in the example docstring that you need to copy locally the fetcher file (potentially linking to its example HTML URL)
  • create a sklearn.example_datasets for datasets that are only used in the examples

Not convinced by any of these alternatives, so better suggestions welcome!

@lesteve
Member

lesteve commented Dec 1, 2016

Another alternative: download the fetcher file tudarmstadt.py from the example gallery "Download Python file" URL and exec it inside plot_bovw.py. That keeps the example simple, but it is maybe a little too much magic ...

@GaelVaroquaux
Member

I think that it needs either to be added inside the example or moved to the datasets module.

If we move it to the datasets module, do you foresee other examples built upon it?

@glemaitre
Member Author

I think that there is room for additional examples using this dataset and core methods from sklearn.
I would probably focus on examples proposed in the PASCAL challenge.

@amueller
Member

amueller commented Dec 6, 2016

I don't think we'll have other examples use visual datasets (either this one or PASCAL VOC), because we don't have the tools to extract the right features, and because everything runs waaaay too long for a scikit-learn example.

@amueller
Member

amueller commented Dec 6, 2016

So I'd rather not have it in the datasets folder: it's not very relevant to current computer vision, and it's not really helpful for showing anything else in scikit-learn. But the example file is already quite long...

# Define the parameters in use afterwards
patch_size = (9, 9)
max_patches = 100
n_jobs = -1
Member

We can't really use n_jobs=-1 in an example, can we? We have n_jobs=3 and n_jobs=4 in some examples, though I'm not sure that is a good idea.

Member Author

I modified to be on a single core.

patch_arr = patch_arr.reshape((patch_arr.shape[0] * patch_arr.shape[1],
                               patch_arr.shape[2]))
# Build a PCA dictionary
dict_PCA = PCA(n_components=n_components, random_state=rng)
Member

Do the results get much worse / are much slower if you work on the raw pixels instead of PCA components?

Member Author

The results are equivalent. I removed the PCA since it shortens the example, and this is still a texton approach.

@glemaitre
Member Author

I don't think we'll have other examples use visual datasets (either this or PASCAL VOC) because we don't have the tools to extract the right features - and because everything runs waaaay to long for a scikit-learn example.

Fair enough. With this example, the only interesting point to me is the parallel between the Bag of Words and Bag of Visual Words.
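To make that parallel concrete, here is a minimal sketch of the pipeline on synthetic stand-in images (the real example uses `fetch_tu_darmstadt` and a classifier on top): sample patches, cluster them into a visual vocabulary, then describe each image by its histogram of visual words, just as Bag of Words describes a text by its word counts.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.RandomState(0)
images = [rng.rand(50, 50) for _ in range(6)]  # stand-in "images"
n_words, patch_size = 8, (9, 9)

# 1) Sample a capped number of patches from each image and flatten them.
patches = [extract_patches_2d(im, patch_size, max_patches=200,
                              random_state=rng).reshape(-1, 81)
           for im in images]

# 2) Learn the visual vocabulary on the pooled patches.
vocab = MiniBatchKMeans(n_clusters=n_words, n_init=3,
                        random_state=0).fit(np.vstack(patches))

# 3) Encode each image as a normalized histogram of visual-word counts.
def bovw_histogram(flat_patches):
    words = vocab.predict(flat_patches)
    hist = np.bincount(words, minlength=n_words).astype(float)
    return hist / hist.sum()

X = np.array([bovw_histogram(p) for p in patches])
print(X.shape)  # (6, 8)
```

`X` can then be fed to any scikit-learn classifier, which is the step the example performs on the TU Darmstadt classes.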

@glemaitre
Member Author

@amueller are the last changes fine with you?

@glemaitre glemaitre changed the title [WIP] Example Bag-of-Visual-Words [MRG] Example Bag-of-Visual-Words Jan 12, 2017
@glemaitre
Member Author

@amueller @GaelVaroquaux is this PR still of interest for merging or it should be closed?

@glemaitre
Member Author

I am closing this PR. It seems that scikit-learn already has plenty of examples.

@glemaitre glemaitre closed this Feb 23, 2017
6 participants