Meta-estimator for semi-supervised learning #1243

Closed
amueller opened this issue Oct 16, 2012 · 15 comments · Fixed by #11682

Comments
@amueller
Member

Using self-taught learning, it is possible to turn any estimator into a semi-supervised one.
Not that hard to do.

@GaelVaroquaux
Member

Do we want to keep this in the 1.0 milestone? @amueller: you opened the issue, what's your feeling?

@jnothman
Member

I took a look at the ICML'07 paper (Raina et al.) introducing this term. I assume you are interested in implementing the specific technique they introduce (or some variant on it), rather than the broader class of solutions to the problem they pose.

Although it is not a constraint of their general problem formulation, their technique more-or-less involves fitting a transformer on a lot of unlabelled data, then applying the transformation before classification. So it merely comes down to something like:

from sklearn.base import BaseEstimator
from sklearn.utils import safe_mask


class SelfTaughtLearner(BaseEstimator):
    def __init__(self, transformer, estimator):
        self.transformer = transformer
        self.estimator = estimator

    def fit(self, X, y):
        # Samples labelled -1 are treated as unlabelled.
        mask = y == -1
        # Fit the transformer on the unlabelled data only.
        self.transformer.fit(X[safe_mask(X, mask)])
        # Fit the estimator on the transformed labelled data.
        Xt = self.transformer.transform(X[safe_mask(X, ~mask)])
        self.estimator.fit(Xt, y[~mask])
        return self

    def predict(self, X):
        Xt = self.transformer.transform(X)
        return self.estimator.predict(Xt)

I note that this would be a nice framework for many scikit-learn dimensionality reduction techniques (including feature agglomeration).

(Presumably, this should include support for out-of-core learning of the transformer, as there can be lots of unlabelled data. One annoyance of the current semi-supervised API is that selecting portions where y == -1 necessarily involves a copy, hence invalidating use of memmaps to avoid the out-of-core problem. If we required unlabelled portions to be at the beginning/end of the data, we could slice without copy.)
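As a side note, the copy-versus-view distinction above can be illustrated with a minimal NumPy sketch (the file name and shapes here are made up for illustration):

import numpy as np

# Boolean (fancy) indexing always materialises a copy in memory,
# whereas a plain slice of a memmap stays a view backed by the file.
X = np.memmap("unlabelled.dat", dtype=np.float64, mode="w+", shape=(1000, 10))
mask = np.zeros(1000, dtype=bool)
mask[:200] = True

X_fancy = X[mask]   # numpy.ndarray: a fresh in-memory copy
X_slice = X[:200]   # numpy.memmap: a view, no copy

print(type(X_fancy), type(X_slice))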

@ogrisel
Member

ogrisel commented Dec 12, 2013

At which point is the transformer involved in this?

@jnothman
Member

Sorry. I failed to write what I meant. I've fixed the code snippet now.

@jnothman
Member

So is this considered a useful helper to demonstrate transfer-type semi-supervised learning?

@amueller removed this from the 1.0 milestone Mar 5, 2015
@amueller
Member Author

amueller commented Mar 5, 2015

Just to add to the (old) discussion above: often the lines in the fit would have a for-loop around them, as far as I know.
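For instance, a rough sketch of such a loop for self-training might look like the following (the function name, the confidence threshold, and the reliance on predict_proba are all illustrative assumptions, not a proposed API):

import numpy as np
from sklearn.base import clone

def self_training_fit(estimator, X, y, threshold=0.9, max_iter=10):
    # Samples labelled -1 are treated as unlabelled.
    y = np.copy(y)
    for _ in range(max_iter):
        labeled = y != -1
        est = clone(estimator).fit(X[labeled], y[labeled])
        if np.all(labeled):
            break
        proba = est.predict_proba(X[~labeled])
        confident = proba.max(axis=1) >= threshold
        if not np.any(confident):
            break
        # Adopt the most confident predictions as new labels and refit.
        new_labels = est.classes_[proba.argmax(axis=1)]
        unlabeled_idx = np.flatnonzero(~labeled)
        y[unlabeled_idx[confident]] = new_labels[confident]
    return est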

@amueller
Member Author

I was actually referring to "self-training" aka "self-learning".

@chkoar
Contributor

chkoar commented Jul 28, 2015

I would go for this

@amueller
Member Author

@chrsrds sure, go ahead :)
The main thing would be to show how this can be useful in practice, though. We don't have great semi-supervised datasets in sklearn at the moment. Maybe working with digits (or MNIST?) and dropping some labels would be interesting?

@chkoar
Contributor

chkoar commented Jul 29, 2015

Maybe working with digits (or MNIST?) and dropping some labels would be interesting?

Right. One option is to keep a few labels per class (an arbitrary number or percentage) in the training dataset and drop the rest. Then we compare the accuracy of the self-trained model against a supervised model trained on only the labelled examples.
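A sketch of that protocol on the digits dataset (the 30-labels-per-class figure and the logistic-regression baseline are arbitrary choices for illustration):

import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Keep 30 labelled examples per class; mark everything else unlabelled (-1).
rng = np.random.RandomState(0)
y_semi = np.full_like(y_train, -1)
for label in np.unique(y_train):
    idx = np.flatnonzero(y_train == label)
    keep = rng.choice(idx, size=min(30, len(idx)), replace=False)
    y_semi[keep] = label

# Supervised baseline trained on the labelled subset only; a self-trained
# model would be evaluated on the same held-out test split for comparison.
labeled = y_semi != -1
baseline = LogisticRegression(max_iter=1000).fit(X_train[labeled], y_semi[labeled])
print("supervised baseline accuracy:", baseline.score(X_test, y_test))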

@amueller
Member Author

Exactly, and maybe compare against label propagation and label spreading, too (though I am not convinced by our implementation).
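Continuing the sketch above, label spreading could be run on the same split for comparison (the kernel and n_neighbors values are arbitrary):

from sklearn.semi_supervised import LabelSpreading

# LabelSpreading treats -1 as the unlabelled marker, matching y_semi above.
spread = LabelSpreading(kernel="knn", n_neighbors=7).fit(X_train, y_semi)
print("label spreading accuracy:", spread.score(X_test, y_test))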

@maniteja123
Contributor

Hello everyone, this is definitely new to me, but if no one is working on it, I would like to try implementing it. I have understood the idea to the best of my ability and tried a version based on the above discussion here. I have never directly implemented any algorithm in semi-supervised learning, so kindly pardon my mistakes. I understand that all of you are busy, but if you can guide me at your convenience, I will try to work on this. If you prefer that I first complete my pending PRs, I will happily do so. Thanks.

@chkoar
Contributor

chkoar commented Mar 10, 2016

I was actually referring to "self-training" aka "self-learning".

I think that @amueller is referring to the Self-Training (a.k.a. Bootstrapping) algorithm
http://www.vinartus.net/spa/03c-v7.pdf

Unfortunately I did not have time for docstrings, narrative documentation, and writing tests.
If anyone has time to collaborate with me, I will be glad to open a WIP PR.

@maniteja123
Contributor

Thanks for pointing out the paper. I just saw the discussion above and probably misunderstood the complexity of the algorithm. Sorry. I will first read the paper carefully, and if it is within my ability, I would be glad to contribute as much as I can.

@maniteja123
Contributor

Hi, I have recently come across this exercise and also read the related ICML paper. I was hoping to work on this if there is sufficient interest and it is within my capabilities to implement. Please let me know if you have any suggestions, or anything else I could refer to in order to better understand the algorithm. Thanks.
