Add documentation for the new API #133
Conversation
Hi, I just pushed an outline of the documentation so that we can discuss it here.
doc/outline.md
- Supervised Metric Learning: (add links to examples/images from examples at the right place in the description)
  - Problem setting
  - Input data (+ see Preprocessor section)
This is only going to cover simple usage of the preprocessor, with more details in the Preprocessor section, right?
Yes, something like: input data: a 2D array-like X (or 1D array-like of indices, see preprocessor section)
doc/outline.md
- Examples/Tutorials:
  - One example with faces (prediction if same/different person)
  - One example of grid search to compare different algorithms (mmc, itml etc)
So the goal is to do the following in order:
- grid search for each metric learning algorithm
- grid search for a classifier (same classification algorithm)
- and then compare the results using classification metrics?
Right? Or are you only comparing the visualization after the metric transformation (if so, maybe this could be combined with the "Data visualisation" section)?
I was thinking of a grid search only for Weakly Supervised Algorithms like MMC, the score being the accuracy of predicting whether a pair is positive/negative, since this case is the most unusual for users accustomed to classic train/test sets of points rather than pairs. But indeed we could also run a grid search on pipelines (SupervisedMetricLearner, KNN) for instance, like a regular classifier, with a classification accuracy score.
And I don't know if you were referring to it in 1., but maybe for Supervised Learners we could also do a GridSearch with the scoring of Weakly Supervised Algorithms (we would split the points X in the traditional way, but on the test set we would sample some pairs and then evaluate the accuracy of `score_pairs`). I don't know whether this kind of scoring is a usual option for Supervised Metric Learners? @bellet any opinion on this?
And I agree that it would be interesting to see the impact of hyperparameter tuning on data visualization: maybe this could make a nice example in the data visualization section indeed.
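To make the first idea concrete, here is a rough sketch of what such a grid search could look like (the toy random pairs and the parameter grid are purely illustrative, and it assumes the pair classifier exposes an accuracy-like score on labeled pairs, as described above):

import numpy as np
from sklearn.model_selection import GridSearchCV
from metric_learn import MMC

# toy data: 20 pairs of 5-dimensional points with +1/-1 pair labels
rng = np.random.RandomState(42)
pairs = rng.randn(20, 2, 5)
y_pairs = rng.choice([1, -1], size=20)

# grid search directly on the weakly supervised learner; each fold is a
# split of the labeled pairs, and the score is the pair-label accuracy
grid = GridSearchCV(MMC(), param_grid={'max_iter': [50, 100]}, cv=3)
grid.fit(pairs, y_pairs)
print(grid.best_params_)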
doc/outline.md
- Problem setting
- Input data (+ See Preprocessor section)
- What you can do after fit (predict/score, transform...)
- Scikit-learn compatibility (compatible with grid search + link to example of grid search)
It might also be good to show a sklearn Pipeline
Yes, I agree, since a pipeline (Metric Learner, KNN), for instance, represents a major use case of metric learning for classification.
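As a rough illustration of that use case (NCA, KNN and iris are just stand-ins here, not a prescribed setup):

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from metric_learn import NCA

X, y = load_iris(return_X_y=True)

# learn a metric on the data, then classify with KNN in the transformed space
pipe = Pipeline([('metric_learner', NCA()), ('knn', KNeighborsClassifier())])
pipe.fit(X, y)
print(pipe.score(X, y))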
Hi, I just added what could be a minimal documentation for a first merge, including minimal descriptions of the preprocessor and the recent changes in the API of Weakly Supervised Metric Learners. I just need to polish the style a bit (align the text, format the examples sections well, try to make the sandwich example plot something) before merging, I think, but the content should not change, so you can already have a look at it. Basically I copied what was already present in the doc/docstrings into the new sections of the new doc, as we said @bellet and @nvauquie.
@perimosocordiae @terrytangyuan maybe you could review this PR? Then we could merge it as soon as possible.
Looks good to me!
Overall looks good. I left a few comments, but I think we can merge this pretty soon.
doc/preprocessor.rst
>>> for img_path in arr:
>>>     result.append(X[int(img_path[3:5])])
>>>     # transforms 'img01.png' into X[1]
>>> return np.array(result)
This example could be simplified a bit:
import numpy as np
from matplotlib.pyplot import imread  # or any other image reader
from metric_learn import NCA

def find_images(file_paths):
    # each file contains a small image to use as an input datapoint
    return np.row_stack([imread(f).ravel() for f in file_paths])

nca = NCA(preprocessor=find_images)
nca.fit(['img01.png', 'img00.png', 'img02.png'], [1, 0, 1])
Yes, and your snippet with the imread etc. is actually what we would really do in a real use case.
Done
doc/preprocessor.rst
>>> for img_path in arr:
>>>     result.append(X[int(img_path[3:5])])
>>>     # transforms 'img01.png' into X[1]
>>> return np.array(result)
You can re-use `find_images` from the other example above here.
Done
doc/preprocessor.rst
>>> y_pairs = np.array([1, -1])
>>>
>>> mmc = MMC(preprocessor=X)
>>> mmc.fit(pairs, y_pairs)
Maybe be explicit here that the `X` array is not used at all in this case.
I agree, I replaced it with a "not implemented yet" preprocessor, and wrote that the code would work anyway.
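Concretely, the snippet could look something like this (values loosely based on the quoted example; the point is only that the preprocessor is bypassed when the pairs are already given as a 3D array of points):

import numpy as np
from metric_learn import MMC

def preprocessor_wip(indices):
    # would normally fetch the points corresponding to `indices`
    raise NotImplementedError("This preprocessor does nothing yet.")

# the pairs are given directly as a 3D array of points,
# so the preprocessor is never called
pairs = np.array([[[1.2, 3.2], [2.3, 5.5]],
                  [[2.1, 0.6], [4.3, 3.3]]])
y_pairs = np.array([1, -1])

mmc = MMC(preprocessor=preprocessor_wip)
mmc.fit(pairs, y_pairs)  # works even though the preprocessor raises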
examples/README.txt
Examples
========

Below is a gallery of example of metric-learn use cases.
...a gallery of example metric-learn use cases.
Thanks, done
outline.md
@@ -0,0 +1,45 @@
documentation outline:
This looks more suitable for the wiki, or maybe a master issue for improving docs.
yes, maybe you can put this in a wiki
Thanks, yes, I forgot it was still here. That's right, I'll open a page on the wiki for that.
Thanks for the review @terrytangyuan and @perimosocordiae!
doc/preprocessor.rst
>>> from metric_learn import MMC
>>> def preprocessor_wip(array):
>>>     return NotImplementedError("This preprocessor does nothing yet.")
raise instead of return
Right, thanks
Thanks, looks good! I left some minor comments.
Additional remarks:
- are the docstrings up to date? For instance it does not show `preprocessor` as an argument of init
- code examples in the doc for weakly supervised algorithms use the supervised version! so this should be changed
- the previous code examples could be used in the supervised section to document the supervised version of the weakly supervised algorithms (can add later)
- like you said it would be nice to show the figures for the sandwich examples (but that can also be done later)
- finally, there is an empty "module contents" section in the package overview part
Quick start
===========
maybe just add a sentence or two to briefly describe what the code snippet does (compute cross validation score of NCA on iris dataset)
Agreed, I've added a quick description and also modified the example, since it actually didn't work (for now we don't have a scoring for cross-validation on supervised metric learners), so I updated the example with a pipeline nca + knn.
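For reference, the updated quick start could look roughly like this (exact estimators and parameters in the actual doc may differ):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from metric_learn import NCA

X, y = load_iris(return_X_y=True)

# cross-validate an (NCA -> KNN) pipeline: the metric is learned on each
# training fold, and KNN accuracy is measured on the corresponding test fold
scores = cross_val_score(make_pipeline(NCA(), KNeighborsClassifier()), X, y, cv=5)
print(scores.mean())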
doc/introduction.rst
machine learning dedicated to automatically constructing optimal distance
metrics.

This package contains efficient Python implementations of several popular
emphasize the scikit-learn compatible aspect
You're right, done
Each metric learning algorithm supports the following methods:

- ``fit(...)``, which learns the model.
missing a few other generic methods, like `score_pairs` etc?
Indeed, thanks. I've added `score_pairs` and replaced `transformer()` with the new `transformer_from_metric(metric)`. I didn't put `_prepare_inputs` since it is private, nor `check_preprocessor`, because it is public for now but should probably rather be private?
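For the record, a minimal sketch of how `score_pairs` would be used (assuming, as discussed above, that it takes a 3D array of pairs and returns the learned distance of each pair, i.e. the Euclidean distance after `transform`):

import numpy as np
from sklearn.datasets import load_iris
from metric_learn import NCA

X, y = load_iris(return_X_y=True)
nca = NCA()
nca.fit(X, y)

# distances between two pairs of points under the learned metric
pairs = np.array([[X[0], X[1]], [X[0], X[2]]])
print(nca.score_pairs(pairs))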
doc/preprocessor.rst
Array-like
----------
You can specify ``preprocessor=X`` where ``X`` is an array-like containing the
dataset of points. In this case, the estimator will be able to take as
the fit method of the estimator?
I agree it may confuse users to say "the estimator will". I'll also say "predict etc", since preprocessors can also be used for prediction/scoring etc
done
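Concretely, the array-like case would look something like this (made-up points; the point is that with `preprocessor=X`, both fit-time and prediction/scoring-time methods can be fed indices into `X` instead of raw points):

import numpy as np
from metric_learn import MMC

# the dataset of points that the indices refer to
X = np.array([[1.2, 3.2], [2.3, 5.5], [2.1, 0.6], [4.3, 3.3]])

# pairs are given as pairs of indices into X;
# the preprocessor fetches the corresponding points
pairs_of_indices = np.array([[0, 1], [2, 3]])
y_pairs = np.array([1, -1])

mmc = MMC(preprocessor=X)
mmc.fit(pairs_of_indices, y_pairs)
# scoring-time methods accept the same index-based format
print(mmc.score_pairs(pairs_of_indices))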
Preprocessor
============

Estimators in metric-learn all have a ``preprocessor`` option at instantiation.
maybe briefly explain the default behavior (when `preprocessor=None`)
I agree, done
doc/supervised.rst
.. rubric:: References:

`Information-theoretic Metric Learning <http://machinelearning.wustl.edu/mlpapers/paper_files/icml2007_DavisKJSD07.pdf>`_ Jason V. Davis, et al.
this is not the right reference (this is for ITML)
Thanks, didn't see that
Done
doc/weakly_supervised.rst
``preprocessor``, which will go fetch and form the tuples. This allows to
give more general indicators than just indices from an array (for instance
paths in the filesystem, name of records in a database etc...) See section
:ref:`preprocessor` for more details on how to use the preprocessor.
link to section does not seem to work (at least not on the version I compiled locally)
Thanks, that's because I forgot to put the right formatting for referencing sections :p
Done
doc/weakly_supervised.rst
1. ITML
-------

Information Theoretic Metric Learning, Kulis et al., ICML 2007
This is Davis et al.
Indeed, thanks, done
doc/weakly_supervised.rst
.. todo:: add more details on `_Supervised` classes

1. ITML
the numbering of the section for each algorithm has an issue (e.g., shows 3.4.1. 1. ITML)
Thanks, done
doc/supervised.rst
.. todo:: Covariance is unsupervised, so its doc should not be here.

:class:`Covariance` does not "learn" anything, rather it calculates
it would be nice to have these references link to the docstring page
Thanks, indeed I used the wrong way to reference them
Done
Thanks for all the changes! Two additional minor comments ;-)
doc/introduction.rst
the domain.
Distance metric learning (or simply, metric learning) is the sub-field of
machine learning dedicated to automatically constructing optimal distance
metrics.
I suggest two changes here:
- "optimal" is not great (wrt what is it optimal?). I would instead say "to automatically construct task-specific distance metrics from (weakly) supervised data".
- the new quick start example below makes me realize that we should clearly mention the relation between distance metric learning and embedding/representation learning. We could add a sentence like:
"The learned distance metric often corresponds to a Euclidean distance in a new embedding space, hence distance metric learning can be seen as a form of representation learning."
- "optimal" is not great (wrt what is it optimal?). I would instead say "to automatically construct task-specific distance metrics from (weakly) supervised data".
I agree, done
- the new quick start example below makes me realize that we should clearly mention the relation between distance metric learning and embedding/representation learning. We could add a sentence like:
"The learned distance metric often corresponds to a Euclidean distance in a new embedding space, hence distance metric learning can be seen as a form of representation learning."
I agree, for now I just added this sentence, but indeed we should emphasize this more, maybe in the future with some section in the docs, and also maybe this will get clearer with examples of metric learners used as transformers (examples of dimensionality reduction for instance)
doc/introduction.rst
:math:`D`-dimensional learned metric space :math:`X L^{\top}`,
in which standard Euclidean distances may be used.
- ``transform(X)``, which applies the aforementioned transformation.
- ``score_pairs`` which returns the similarity of pairs of points.
Maybe clarify the inputs to `score_pairs`, and "similarity of pairs of points" --> "distance between pairs of points"?
That's right, done
Thanks for the review @bellet! To answer your remarks:
This should be updated now that I merged back the updated
I agree, I'll update them, for instance with this kind of example:

from sklearn.datasets import fetch_lfw_pairs
from metric_learn import MMC

dataset = fetch_lfw_pairs()
pairs, y = dataset.pairs, dataset.target
pairs = pairs.reshape(pairs.shape[0], 2, -1)
y = 2*y - 1  # we want +1 to indicate similar pairs and -1 dissimilar pairs

mmc = MMC()
mmc.fit(pairs, y)

Or the same but maybe with a smaller artificial dataset.
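For instance, a smaller artificial stand-in could be something like this (random data, purely illustrative):

import numpy as np
from metric_learn import MMC

# a small artificial stand-in for the LFW pairs: 30 pairs of 10-dimensional points
rng = np.random.RandomState(0)
pairs = rng.randn(30, 2, 10)
y = rng.choice([1, -1], size=30)  # +1: similar pair, -1: dissimilar pair

mmc = MMC()
mmc.fit(pairs, y)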
I agree, I've added a simple example with
Yes, I'm still trying to understand why it's failing :p
Yes, it was already there in the previous docs, but I'm still trying to understand how this modules and submodules part is generated. @perimosocordiae maybe an idea on that?
Looks good, but I would indeed use a smaller dataset
Maybe this should be put in the supervised section instead? Linking "weakly-supervised algorithm" in the text to the appropriate section, and maybe renaming "_Supervised version" to "Supervised versions of weakly-supervised algorithms".
maybe the last reply in sphinx-doc/sphinx#3177?
I've updated the docs with some toy dataset
Ah yes, I hadn't fully read your previous comment
I've updated the examples in the weakly supervised section as you said
Turns out it comes from the file
Turns out the problem with
I guess we should be ready to merge! :)
It's likely that we aren't using good parameter settings for SDML in that example. We should file a bug to investigate further, though.
Docs are merged! Now we can merge
For now there is nothing added, but I'll update it soon with some draft of documentation.