make ProjectionToHashMixin and GaussianRandomProjectionHash private? #8029
Comments
I think the module needs some dedicated work, something I once thought I might put into it but have not. We have not facilitated its integration as a substitute for NN methods. Other ANN approaches also seem more successful in the general case (especially when you can learn something about the distribution of your data, i.e. when you can assume samples are identically distributed). LSH is good when identical distribution is a poor assumption, and it has been prominent in fast textual retrieval, using hash functions particularly suited to n-grams. Yes, I think we should get someone dedicated to work on it, or abandon it.
As long as the hash is not really configurable, those can certainly be private.
I think banking on you or Olivier to invest a lot of time in this issue is not a good idea. I remember that there was some effort to work on this, but I think that didn't work out so well? We could do what we did for HMMs, which was basically to say "this is abandonware, use at own risk", or simply say "this is not fast enough and will be removed".
So
Presumably
These two classes are not really documented and seem more like implementation details.
Given that LSHForest is not great, I'm not sure how much work we should invest in the module, though.
@jnothman @ogrisel is there a path forward with LSHForest? (in other words, should we deprecate the whole thing?)
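For readers unfamiliar with the kind of hash under discussion, here is a minimal sketch of sign random projection hashing, the general technique that the name GaussianRandomProjectionHash suggests. It is not scikit-learn's implementation: the function names `make_projection` and `hash_rows`, the default of 32 bits, and the bit-packing scheme are all assumptions made for illustration.

```python
import numpy as np


def make_projection(n_features, n_bits=32, seed=0):
    """Draw a Gaussian projection matrix (hypothetical helper, not sklearn API)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_features, n_bits))


def hash_rows(X, projection):
    """Hash each row of X to an unsigned integer built from projection signs.

    Each bit records whether the row falls on the positive side of one
    random hyperplane; n_bits must be at most 64 to fit in a uint64.
    """
    bits = (np.asarray(X, dtype=float) @ projection) > 0
    n_bits = bits.shape[1]
    # Bit i contributes 2**i to the packed hash value.
    weights = np.uint64(1) << np.arange(n_bits, dtype=np.uint64)
    return (bits.astype(np.uint64) * weights).sum(axis=1, dtype=np.uint64)
```

Because only the signs of the projections are kept, rows that point in similar directions tend to share many bits, and rescaling a row by a positive constant leaves its hash unchanged. A hash like this has little to configure beyond the bit count and the seed, which fits the view in the thread that such classes read as implementation details.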