Skip to content

New Feature: Added NCA as first algorithm in metric_learning module #5276

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from
Closed

New Feature: Added NCA as first algorithm in metric_learning module #5276

wants to merge 5 commits into from

Conversation

terrytangyuan
Copy link

See #3213. I've integrated NCA with scikit-learn in this PR.

@terrytangyuan
Copy link
Author

Anyone knows what I have missed? I got this from Travis:

FAIL: sklearn.tests.test_common.test_root_import_all_completeness
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/scikit-learn/scikit-learn/testvenv/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/scikit-learn/scikit-learn/sklearn/tests/test_common.py", line 144, in test_root_import_all_completeness
    assert_in(modname, sklearn.__all__)
AssertionError: 'metric_learning' not found in ['calibration', 'cluster', 'covariance', 'cross_decomposition', 'cross_validation', 'datasets', 'decomposition', 'dummy', 'ensemble', 'externals', 'feature_extraction', 'feature_selection', 'gaussian_process', 'grid_search', 'isotonic', 'kernel_approximation', 'kernel_ridge', 'lda', 'learning_curve', 'linear_model', 'manifold', 'metrics', 'mixture', 'multiclass', 'naive_bayes', 'neighbors', 'neural_network', 'pipeline', 'preprocessing', 'qda', 'random_projection', 'semi_supervised', 'svm', 'tree', 'discriminant_analysis', 'clone']

It looks like I need to add this metric_learning into somewhere? Any help would be appreciated.

@jnothman
Copy link
Member

It looks like I need to add this metric_learning into somewhere?

Modify __all__ in sklearn/__init__.py.

return [g, gradg]


class NCA(object):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please inherit from sklearn.base.BaseEstimator

@jnothman
Copy link
Member

Huh. Wait, this is being worked on in #4789, and was the subject of GSoC funding... If you can see benefits of your implementation over that one, please comment there. Otherwise I think you should expect that implementation to take precedence.

The :mod:`sklearn.metric_learning` module implements metric learning models.

The algorithms that have been implemented are:
Relevant Components Analysis (RCA)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RCA?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo! I was reading something else at the same time...thanks!
On Sep 16, 2015 4:19 AM, "B@rmaley.exe" notifications@github.com wrote:

In sklearn/metric_learning/init.py
#5276 (comment)
:

@@ -0,0 +1,8 @@
+"""
+The :mod:sklearn.metric_learning module implements metric learning models.
+
+The algorithms that have been implemented are:
+Relevant Components Analysis (RCA)

RCA?


Reply to this email directly or view it on GitHub
https://github.com/scikit-learn/scikit-learn/pull/5276/files#r39603872.

@terrytangyuan
Copy link
Author

@Barmaley-exe @jnothman Still failing after the changes. Any ideas?

@artsobolev
Copy link
Contributor

@terrytangyuan you can take a look at Travis' output to find failed tests. You can also see the failing test's code to understand what's going on.

@terrytangyuan
Copy link
Author

@Barmaley-exe I got this

  File "/home/travis/build/scikit-learn/scikit-learn/sklearn/metric_learning/NCA.py", line 219, in fit
    res = opt.minimize(fun=self.objective,
AttributeError: 'module' object has no attribute 'minimize'

which is not expected since it's imported from here import scipy.optimize as opt and it passed on my local pc.

BTW, do you suggest that I keep working on this or help you finish yours (I'd love to help)? Yours seems to have a problem of not improving the classification accuracy? The main goal is to get metric learning on this package since it's extremely useful.

@artsobolev
Copy link
Contributor

@terrytangyuan I'm not sure why this exception arises. One possibility is that sklearn supports SciPy >= 0.9, and minimize was added in 0.11

My personal opinion is that my PR is much more developed, so we'd be better off finishing that. Main problem is that it's not well-tested, and it'd benefit from more documentation and examples.

@terrytangyuan
Copy link
Author

Okay. Let's finish that instead of making a new one then. Can you update your existing code? How should I make changes on your PR? I assume there's a better way than fork your forked repo.

@terrytangyuan terrytangyuan deleted the metriclearn branch September 16, 2015 21:57
@mblondel
Copy link
Member

In my opinion, the point of metric learning is that it can leverage supervised information in the form of pairwise similarity or dissimilarity labels. If you are going to generate those labels from just (X, y), one is probably better off just using a classifier. So before we add NCA to scikit-learn, I really want to see empirical evidence that it is worth it both in terms of accuracy and training time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants