[MRG] Example Bag-of-Visual-Words #6509
Using k-means++ often means that the initialization takes much longer than the actual algorithm (for large n_clusters in particular).
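To make the comparison concrete, here is a minimal sketch (not code from this PR) that times a full `KMeans` fit with `init="k-means++"` against `init="random"`. The data is random noise standing in for flattened 9x9 patches, and the sizes, `max_iter`, and the helper name `fit_seconds` are assumptions chosen only so the snippet runs quickly. It times the whole fit, not the initialization alone, so treat the numbers as a rough indication.

```python
import time

import numpy as np
from sklearn.cluster import KMeans

# Assumption: random data standing in for flattened 9x9 image patches.
rng = np.random.RandomState(0)
X = rng.rand(2000, 81)


def fit_seconds(init):
    """Fit KMeans with the given init strategy and return (model, seconds)."""
    km = KMeans(n_clusters=80, init=init, n_init=1, max_iter=10,
                random_state=0)
    start = time.perf_counter()
    km.fit(X)
    return km, time.perf_counter() - start


km_pp, t_pp = fit_seconds("k-means++")
km_rand, t_rand = fit_seconds("random")
print("k-means++: %.3f s, random: %.3f s" % (t_pp, t_rand))
```

Which strategy wins depends on the data and on the number of clusters, so it is worth timing on the real patches before switching.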
@@ -0,0 +1,126 @@ | |||
"""TU Darmstadt dataset. |
Maybe I'll just add this to the example, and not put it in the datasets folder. This is not really a widely used dataset, and once it is in the public API, it is hard to get rid of.
I moved the dataset into the example as suggested. Regarding the parameters, I decreased the number of words to 80. I also limited the maximum number of patches extracted to 20,000 per image (maybe we will classify grass instead of cows), which is about 1/4 to 1/5 of the total possible number of patches. The results seem appropriate, with a large decrease in computation time, which is starting to be acceptable.
|
It took 85.6792821884 seconds.
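For readers following the thread, the two parameters discussed here (number of visual words, maximum patches per image) fit into the pipeline roughly as follows. This is a hedged sketch on synthetic images, not the example file from this PR: the image sizes, `n_words = 20`, and the helper names `image_patches` and `bovw_histogram` are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.image import extract_patches_2d

# Assumption: small synthetic grayscale images instead of the real dataset.
rng = np.random.RandomState(0)
images = [rng.rand(32, 32) for _ in range(6)]

patch_size = (9, 9)
max_patches = 100   # cap on the patches sampled per image
n_words = 20        # size of the visual vocabulary (hypothetical value)


def image_patches(image):
    """Sample at most max_patches patches and flatten each to a vector."""
    patches = extract_patches_2d(image, patch_size,
                                 max_patches=max_patches, random_state=0)
    return patches.reshape(len(patches), -1)


per_image = [image_patches(img) for img in images]

# Learn the codebook (the "visual words") on patches from all images.
codebook = MiniBatchKMeans(n_clusters=n_words, n_init=3, random_state=0)
codebook.fit(np.vstack(per_image))


def bovw_histogram(patches):
    """Count how often each visual word occurs, normalized to sum to 1."""
    words = codebook.predict(patches)
    counts = np.bincount(words, minlength=n_words)
    return counts / counts.sum()


X_bovw = np.array([bovw_histogram(p) for p in per_image])
print(X_bovw.shape)
```

Lowering `max_patches` or `n_words` shrinks both the clustering and the encoding cost, which is the trade-off between run time and accuracy being tuned in this discussion.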
That's looking great! If you manage to get this below 1 min (you're
almost there), we can have this as one of our standard examples.
|
Using 50 words drops the accuracy a bit, but the run time is 35 sec.
Also, I am taking advantage of my 4 available threads (i7 2620M). I don't know if it has to be taken into account. |
What further improvements/changes should I take care of? Cheers, |
What else should I look at? |
Sorry for the slow reply. Also add a header to the example and some text explaining what's happening.
OK, I am up to date with the requirements now. I'll do that over the weekend and ping once done.
6309899 to 8e46d38
@amueller Shall I remove the timer? |
@amueller @GaelVaroquaux Do you see any other improvements to make? Should I decrease the number of patches to get under 10 sec of processing?
We are confusing cars and motorbikes, but we get the cows right ;D
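As an illustration of how that kind of confusion shows up, here is a toy sketch with `confusion_matrix`. The labels and predictions below are invented for the illustration and are not the results from this PR.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical toy labels, NOT the actual results of the example.
labels = ["car", "cow", "motorbike"]
y_true = ["car", "car", "car", "cow", "cow",
          "motorbike", "motorbike", "motorbike"]
y_pred = ["car", "motorbike", "car", "cow", "cow",
          "car", "motorbike", "car"]

# Rows are true classes, columns are predicted classes, in `labels` order.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)
```

In this toy case the off-diagonal counts sit only in the car and motorbike rows, while the cow row is purely diagonal.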
Bag of Visual Words | ||
=================== | ||
|
||
An illustration of a Bag of Visual Words approach for image recognition |
Maybe explain that this is not how one would actually solve this problem, but that it's simplified to not use any vision library and to run fast enough ;)
Can you check the pep8 errors please? |
@amueller I get the following errors since the doc comes first.
|
from sklearn.metrics import confusion_matrix | ||
from sklearn.externals.joblib import Parallel, delayed | ||
|
||
from tudarmstadt import fetch_tu_darmstadt |
This needs to be moved to sklearn/datasets
@raghavrv I moved it. But I will recall what @amueller mentioned earlier.
That's why we originally kept it in the example folder. So let me know which is best.
Argh. My bad! I did not notice that comment... Could you very kindly revert please? |
3851d39 to 3c853eb
It had been hidden since I had previously addressed it. I remembered it when it popped up again after reverting.
@amueller ping |
Hmmm but then it kind of defeats the implicit contract of an example which is being able to copy and paste it in an IPython session and have it running without any hitch. Not sure what is the best way of doing this (the dataset is 42MB so too big to be included in the scikit-learn repo). Here are a few alternatives I can think of:
Not convinced by any of these alternatives, so better suggestions welcome! |
Another alternative: you download the fetcher file tudarmstadt.py from the example gallery "Download Python file" URL and exec it inside plot_bovw.py. Keeps the example simple, maybe a little bit too much magic ...
I think that it needs either to be added inside the example or moved to the datasets module. If we move it to the datasets module, do you foresee other examples built upon it? |
I think that there is room for additional examples using this dataset and core methods from |
I don't think we'll have other examples use visual datasets (either this or PASCAL VOC) because we don't have the tools to extract the right features - and because everything runs waaaay too long for a scikit-learn example.
So therefore I'd rather not have it in the datasets folder - it's not very relevant to current computer vision, and it's not really helpful to show anything else in scikit-learn. But the example file is already quite long... |
# Define the parameters in use afterwards | ||
patch_size = (9, 9) | ||
max_patches = 100 | ||
n_jobs = -1 |
We can't really do that in an example, can we? We have n_jobs=3 and n_jobs=4 in some examples, though I'm not sure if that is a good idea.
I modified it to run on a single core.
patch_arr = patch_arr.reshape((patch_arr.shape[0] * patch_arr.shape[1], | ||
patch_arr.shape[2])) | ||
# Build a PCA dictionary | ||
dict_PCA = PCA(n_components=n_components, random_state=rng) |
Do the results get much worse, or does it get much slower, if you work on the raw pixels instead of PCA components?
The results are equivalent. I removed the PCA since it shortens the example, and this is still a texton approach.
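The two variants being compared can be sketched as below: clustering the raw flattened patches directly versus clustering their PCA projection first. This is only an illustration on random data. The sizes, `n_components=10`, and the cluster count are assumptions, and it makes no claim about which variant is more accurate on the real images.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.decomposition import PCA

# Assumption: random data standing in for flattened 9x9 patches.
rng = np.random.RandomState(0)
patches = rng.rand(500, 81)

# Variant 1: assign visual words directly from raw pixel values.
words_raw = MiniBatchKMeans(n_clusters=20, n_init=3,
                            random_state=0).fit_predict(patches)

# Variant 2: project onto a few PCA components, then cluster.
reduced = PCA(n_components=10, random_state=0).fit_transform(patches)
words_pca = MiniBatchKMeans(n_clusters=20, n_init=3,
                            random_state=0).fit_predict(reduced)

print(patches.shape, reduced.shape)
```

Dropping the PCA step removes one estimator from the example at the cost of clustering in the full 81-dimensional patch space.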
Fair enough. With this example, the only interesting point to me is the parallel between the Bag of Words and Bag of Visual Words. |
c9d04f9 to c5f13ac
@amueller are the last changes fine with you? |
@amueller @GaelVaroquaux Is this PR still of interest for merging, or should it be closed?
I am closing that PR. It seems that |
This is a first draft of the example illustrating the BOVW using scikit-learn.