Add documentation for the new API #133


wdevazelhes
Member

For now there is nothing added but I'll update it soon with some draft of documentation

@wdevazelhes
Member Author

Hi, I just pushed an outline of documentation so that we can discuss it here

doc/outline.md Outdated

- Supervised Metric Learning: (add links to examples/images from examples at the right place in the description)
- Problem setting
- Input data (+ see Preprocessor section)
Member

This is only going to show simple usage of the preprocessor, and then the details will be discussed in the preprocessor section, right?

Member Author

Yes, something like: input data: a 2D array-like X (or 1D array-like of indices, see preprocessor section)
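A minimal NumPy sketch of the indexing convention described here (the `form_points` helper is hypothetical, for illustration only, not metric-learn API): when a preprocessor array `X` is given, a 1D array-like of indices is resolved into the corresponding rows of `X`:

```python
import numpy as np

# Hypothetical helper illustrating the convention: given the dataset X,
# a 1D array-like of indices is turned into the 2D array of points.
def form_points(X, indices):
    return np.asarray(X)[np.asarray(indices)]

X = np.array([[0.0, 1.0],
              [2.0, 3.0],
              [4.0, 5.0]])

points = form_points(X, [2, 0])  # rows 2 and 0 of X
```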

doc/outline.md Outdated

- Examples/Tutorials:
- One example with faces (prediction if same/different person)
- One example of grid search to compare different algorithms (mmc, itml etc)
Member

So the goal is to do the following in order:

  1. grid search for each metric learning algorithm
  2. grid search for a classifier (same classification algorithm)
  3. and then compare the results using classification metrics?

Right? Or are you only comparing the visualization after metric transformation (if so, maybe this could be combined with "Data visualisation" section)?

Member Author

I was thinking of a grid search only for Weakly Supervised Algorithms like MMC, the score being the accuracy on predicting whether a pair is positive/negative, since this case is the most unusual for users accustomed to classic train/test sets of points rather than pairs. But indeed we could also run a grid search on pipelines (SupervisedMetricLearner, KNN) for instance, like a regular classifier, with a classification accuracy score.
And I don't know if you were referring to it in 1., but maybe for Supervised Learners we could also do a GridSearch with the scoring of Weakly Supervised Algorithms (we would split the points X traditionally, but on the test set we would sample some pairs and then evaluate the accuracy of score_pairs). I don't know whether this kind of scoring is a traditional option for Supervised Metric Learners? @bellet any opinion on this?
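A rough sketch of the pair-accuracy scoring described above, in plain NumPy with a fixed identity Mahalanobis matrix standing in for a learned one; the function names and threshold here are hypothetical, not metric-learn API:

```python
import numpy as np

def score_pairs(M, pairs):
    # Mahalanobis distance for each pair; pairs has shape (n_pairs, 2, n_features)
    diff = pairs[:, 0] - pairs[:, 1]
    return np.sqrt(np.einsum('ij,jk,ik->i', diff, M, diff))

def pair_accuracy(M, pairs, y_pairs, threshold):
    # predict +1 (similar) when the distance is below the threshold, else -1
    pred = np.where(score_pairs(M, pairs) < threshold, 1, -1)
    return np.mean(pred == y_pairs)

M = np.eye(2)  # stand-in for a matrix learned by e.g. MMC
pairs = np.array([[[0., 0.], [0.1, 0.]],   # a similar pair
                  [[0., 0.], [5., 5.]]])   # a dissimilar pair
acc = pair_accuracy(M, pairs, np.array([1, -1]), threshold=1.0)
```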

And I agree that it would be interesting to see the impact of hyperparameter tuning on data visualization: maybe this could make an example in the data visualization section indeed

doc/outline.md Outdated
- Problem setting
- Input data (+ See Preprocessor section)
- What you can do after fit (predict/score, transform...)
- Scikit-learn compatibility (compatible with grid search + link to example of grid search)
Member

It might also be good to show a sklearn Pipeline

Member Author

Yes, I agree, since the pipeline (Metric Learner, KNN), for instance, represents a major use case of metric learning for classification
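As a sketch of that use case, using scikit-learn's own NeighborhoodComponentsAnalysis as a stand-in for a metric-learn estimator (metric-learn's classes follow the same scikit-learn API, so the pipeline and grid search work the same way):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)

# pipeline (metric learner, KNN): learn a metric, then classify in that metric
pipe = Pipeline([('nca', NeighborhoodComponentsAnalysis(random_state=42)),
                 ('knn', KNeighborsClassifier())])

# grid search over both steps' hyperparameters, like any sklearn estimator
grid = GridSearchCV(pipe, {'nca__n_components': [2, 3],
                           'knn__n_neighbors': [3, 5]}, cv=3)
grid.fit(X, y)
```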

@bellet mentioned this pull request Dec 14, 2018
@wdevazelhes
Member Author

Hi, I just added what could be a minimal documentation for a first merge, including minimal descriptions of the preprocessor and the recent changes in the API of Weakly Supervised Metric Learners. I just need to polish the style a bit (align the text, format the examples sections well, try to make the sandwich example plot something) before merging, I think, but the content should not change, so you can already have a look at it. Basically I copied what was already present in the doc/docstrings into the new sections of the new doc, as we discussed @bellet and @nvauquie

@wdevazelhes
Member Author

@perimosocordiae @terrytangyuan maybe you could review this PR? Then we could merge it as soon as possible into new_api, so that we can then merge new_api into master (cf. discussion in #117 (comment))

Member

@terrytangyuan left a comment

Looks good to me!

Contributor

@perimosocordiae left a comment

Overall looks good. I left a few comments, but I think we can merge this pretty soon.

...     for img_path in arr:
...         result.append(X[int(img_path[3:5])])
...         # transforms 'img01.png' into X[1]
...     return np.array(result)
Contributor

This example could be simplified a bit:

def find_images(file_paths):
    # each file contains a small image to use as an input datapoint
    return np.row_stack([imread(f).ravel() for f in file_paths])

nca = NCA(preprocessor=find_images)
nca.fit(['img01.png', 'img00.png', 'img02.png'], [1, 0, 1])

Member Author

Yes, and your snippet with imread etc. is actually what we would really do in a real use case

Member Author

Done

...     for img_path in arr:
...         result.append(X[int(img_path[3:5])])
...         # transforms 'img01.png' into X[1]
...     return np.array(result)
Contributor

You can re-use find_images from the other example above here.

Member Author

Done

>>> y_pairs = np.array([1, -1])
>>>
>>> mmc = MMC(preprocessor=X)
>>> mmc.fit(pairs, y_pairs)
Contributor

Maybe be explicit here that the X array is not used at all in this case.

Member Author

I agree, I replaced it with a "not implemented yet" preprocessor, and wrote that the code would work anyway

Examples
========

Below is a gallery of example of metric-learn use cases.
Contributor

...a gallery of example metric-learn use cases.

Member Author

Thanks, done

outline.md Outdated
@@ -0,0 +1,45 @@
documentation outline:
Contributor

This looks more suitable for the wiki, or maybe a master issue for improving docs.

Member

yes, maybe you can put this in a wiki

Member Author

Thanks, yes I forgot to remove it here. That's right, I'll open a page on the wiki for that

@wdevazelhes
Member Author

Thanks for the review @terrytangyuan and @perimosocordiae !


>>> from metric_learn import MMC
>>> def preprocessor_wip(array):
...     return NotImplementedError("This preprocessor does nothing yet.")
Contributor

raise instead of return

Member Author

Right, thanks

Member

@bellet left a comment

Thanks, looks good! I left some minor comments.

Additional remarks:

  • are the docstrings up to date? For instance they do not show preprocessor as an argument of __init__
  • code examples in the doc for weakly supervised algorithms use the supervised version! so this should be changed
  • the previous code examples could be used in the supervised section to document the supervised version of the weakly supervised algorithms (can add later)
  • like you said it would be nice to show the figures for the sandwich examples (but that can also be done later)
  • finally, there is an empty "module contents" section in the package overview part


Quick start
===========

Member

maybe just add a sentence or two to briefly describe what the code snippet does (compute cross validation score of NCA on iris dataset)

Member Author

Agreed, I've added a quick description and also modified the example since it didn't actually work (for now we don't have a scoring for cross-validation on supervised metric learners), so I updated the example to use a pipeline of NCA + KNN
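The quick-start example described here can be sketched as follows (again with scikit-learn's NeighborhoodComponentsAnalysis standing in for metric-learn's NCA, which shares the scikit-learn estimator interface):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)

# compute the cross-validation score of an NCA + KNN pipeline on iris
pipe = make_pipeline(NeighborhoodComponentsAnalysis(random_state=42),
                     KNeighborsClassifier(n_neighbors=3))
scores = cross_val_score(pipe, X, y, cv=5)
```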

machine learning dedicated to automatically constructing optimal distance
metrics.

This package contains efficient Python implementations of several popular
Member

emphasize the scikit-learn compatible aspect

Member Author

You're right, done


Each metric learning algorithm supports the following methods:

- ``fit(...)``, which learns the model.
Member

missing a few other generic methods, like score_pairs etc?

Member Author

Indeed, thanks. I've added score_pairs and replaced transformer() with the new transformer_from_metric(metric). I didn't put _prepare_inputs since it is private, nor check_preprocessor because, although it is public for now, it should probably rather be private?

Array-like
----------
You can specify ``preprocessor=X`` where ``X`` is an array-like containing the
dataset of points. In this case, the estimator will be able to take as
Member

the fit method of the estimator?

Member Author

I agree it may confuse users to say "the estimator will". I'll also say "predict etc", since preprocessors can also be used for prediction/scoring etc

Member Author

done

Preprocessor
============

Estimators in metric-learn all have a ``preprocessor`` option at instantiation.
Member

maybe briefly explain default behavior (when preprocessor=None)

Member Author

I agree, done


.. rubric:: References:

`Information-theoretic Metric Learning <http://machinelearning.wustl.edu/mlpapers/paper_files/icml2007_DavisKJSD07.pdf>`_ Jason V. Davis, et al.
Member

this is not the right reference (this is for ITML)

Member Author

Thanks, didn't see that
Done

``preprocessor``, which will go fetch and form the tuples. This allows giving
more general indicators than just indices into an array (for instance
paths in the filesystem, names of records in a database, etc.). See section
:ref:`preprocessor` for more details on how to use the preprocessor.
Member

link to section does not seem to work (at least not on the version I compiled locally)

Member Author

Thanks, that's because I forgot to put the right formatting for referencing sections :p
Done

1. ITML
-------

Information Theoretic Metric Learning, Kulis et al., ICML 2007
Member

This is Davis et al.

Member Author

Indeed, thanks, done


.. todo:: add more details on `_Supervised` classes

1. ITML
Member

the numbering of the section for each algorithm has an issue (e.g., shows 3.4.1. 1. ITML)

Member Author

Thanks, done


.. todo:: Covariance is unsupervised, so its doc should not be here.

:class:`Covariance` does not "learn" anything, rather it calculates
Member

it would be nice to have these references to link to the docstring page

Member Author

Thanks, indeed I used the wrong way to reference them
Done

Member

@bellet left a comment

Thanks for all the changes! Two additional minor comments ;-)

the domain.
Distance metric learning (or simply, metric learning) is the sub-field of
machine learning dedicated to automatically constructing optimal distance
metrics.
Member

I suggest two changes here:

  • "optimal" is not great (wrt what is it optimal?). I would instead say "to automatically construct task-specific distance metrics from (weakly) supervised data".
  • the new quick start example below makes me realize that we should clearly mention the relation between distance metric learning and embedding/representation learning. We could add a sentence like:
    "The learned distance metric often corresponds to a Euclidean distance in a new embedding space, hence distance metric learning can be seen as a form of representation learning."

Member Author

  • "optimal" is not great (wrt what is it optimal?). I would instead say "to automatically construct task-specific distance metrics from (weakly) supervised data".

I agree, done

  • the new quick start example below makes me realize that we should clearly mention the relation between distance metric learning and embedding/representation learning. We could add a sentence like:
    "The learned distance metric often corresponds to a Euclidean distance in a new embedding space, hence distance metric learning can be seen as a form of representation learning."

I agree; for now I just added this sentence, but indeed we should emphasize this more, maybe in the future with a dedicated section in the docs. This may also get clearer with examples of metric learners used as transformers (examples of dimensionality reduction, for instance)
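For instance, a metric learner used as a transformer for dimensionality reduction could look like this (sketched with scikit-learn's NeighborhoodComponentsAnalysis; metric-learn's transformers expose the same fit/transform interface):

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import NeighborhoodComponentsAnalysis

X, y = load_iris(return_X_y=True)

# learn a metric whose transformation embeds the data in 2D,
# e.g. for visualization or as a representation-learning step
nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=42)
X_embedded = nca.fit_transform(X, y)
```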

:math:`D`-dimensional learned metric space :math:`X L^{\top}`,
in which standard Euclidean distances may be used.
- ``transform(X)``, which applies the aforementioned transformation.
- ``score_pairs`` which returns the similarity of pairs of points.
Member

Maybe clarify the inputs to score_pairs, and "similarity of pairs of points" --> "distance between pairs of points"?

Member Author

That's right, done

@wdevazelhes
Member Author

Thanks for the review @bellet ! To answer your remarks:

Thanks, looks good! I left some minor comments.
Additional remarks:
are the docstrings up to date? For instance it does not show preprocessor as an argument of init

This should be updated now that I merged back the updated new_api_design into the PR

code examples in the doc for weakly supervised algorithms use the supervised version! so this should be changed

I agree, I'll update them, for instance with this kind of example:

from sklearn.datasets import fetch_lfw_pairs
from metric_learn import MMC

dataset = fetch_lfw_pairs()
pairs, y = dataset.pairs, dataset.target
pairs = pairs.reshape(pairs.shape[0], 2, -1)
y = 2*y - 1  # we want +1 to indicate similar pairs and -1 dissimilar pairs
mmc = MMC()
mmc.fit(pairs, y)

Or the same but maybe with a smaller artificial dataset

the previous code examples could be used in the supervised section to document the supervised version of the weakly supervised algorithms (can add later)

I agree, I've added a simple example with MMC. Indeed, in the future it could be good to add more documentation on how the constraints are created from a labeled dataset. Issue #135 should keep track of that.

like you said it would be nice to show the figures for the sandwich examples (but that can also be done later)

Yes, I'm still trying to understand why it's failing :p

finally, there is an empty "module contents" section in the package overview part

Yes, it was already there in the previous docs, but I'm still trying to understand how this modules and submodules part is generated. @perimosocordiae, any idea on that?

@bellet
Member

bellet commented Dec 20, 2018

I agree, I'll update them, for instance with this kind of examples:

from sklearn.datasets import fetch_lfw_pairs
from metric_learn import MMC

dataset = fetch_lfw_pairs()
pairs, y = dataset.pairs, dataset.target
pairs = pairs.reshape(pairs.shape[0], 2, -1)
y = 2*y - 1  # we want +1 to indicate similar pairs and -1 dissimilar pairs
mmc = MMC()
mmc.fit(pairs, y)

Looks good, but I would indeed use a smaller dataset

I agree, I've added a simple example with MMC. Indeed in the future it could be good to add more documentation on how the constraints are created from a labeled dataset. issue #135 should keep track on that.

Maybe this should be put in the supervised section instead? Linking "weakly-supervised algorithm" in the text to the appropriate section, and maybe rename "_Supervised version" into "Supervised versions of weakly-supervised algorithms"

@bellet
Member

bellet commented Dec 20, 2018

finally, there is an empty "module contents" section in the package overview part

Yes, it was already there in the previous docs, but I'm still trying to understand how this modules and submodules part is generated. @perimosocordiae maybe an idea on that ?

maybe last reply in sphinx-doc/sphinx#3177 ?
otherwise it looks like "module contents" will show the docstring of __init__.py, and there is an option to show this before the submodules:
sphinx-doc/sphinx#2190

@wdevazelhes
Member Author

Looks good, but I would indeed use a smaller dataset

I've updated the docs with some toy dataset

@wdevazelhes
Member Author

Maybe this should be put in the supervised section instead? Linking "weakly-supervised algorithm" in the text to the appropriate section, and maybe rename "_Supervised version" into "Supervised versions of weakly-supervised algorithms"

Ah yes, I hadn't fully read your previous comment
I agree, done

@wdevazelhes
Member Author

I've updated the examples in the weakly supervised section as you said

@wdevazelhes
Member Author

finally, there is an empty "module contents" section in the package overview part

Yes, it was already there in the previous docs, but I'm still trying to understand how this modules and submodules part is generated. @perimosocordiae maybe an idea on that ?

maybe last reply in sphinx-doc/sphinx#3177 ?
otherwise it looks like "module contents" will show the docstring of __init__.py, and there is an option to show this before the submodules:
sphinx-doc/sphinx#2190

Turns out it comes from the file metric_learn.rst. I changed it manually, but I guess it was generated at the beginning with some command like sphinx-apidoc as in the link you gave @bellet
So I don't know whether my manual modification matches what sphinx-apidoc would generate with the right flag --implicit-namespaces from sphinx-doc/sphinx#3177 (comment), but maybe this is fine for now, and later we could think about the process of generating the rst files?

@wdevazelhes
Member Author

Turns out the problem with sandwich.py was that it needs to be named plot_sandwich.py to be executed: https://sphinx-gallery.readthedocs.io/en/latest/getting_started.html#structure-the-examples-folder
Now it works

@wdevazelhes
Member Author

I guess we should be ready to merge ! :)

@bellet
Member

bellet commented Dec 20, 2018

[figure: sphx_glr_plot_sandwich_001]
Quick question on the example: why does SDML seem to not do anything?

@perimosocordiae
Contributor

It's likely that we aren't using good parameter settings for SDML in that example. We should file a bug to investigate further, though.

@perimosocordiae merged commit 073451a into scikit-learn-contrib:new_api_design Dec 20, 2018
@perimosocordiae
Contributor

Docs are merged! Now we can merge new_api_design into master.

@wdevazelhes deleted the doc/add_documentation branch January 3, 2019 09:23