What about Gaussian Mixture Regression? #6073

Closed
belevtsoff opened this issue Dec 20, 2015 · 11 comments
Comments

@belevtsoff

It's strange that scikit-learn doesn't have a GMM-based regression model. Is there a reason for this? If it's simply because no one has had the time or interest to do it, I can put up a PR.

belevtsoff changed the title from "Gaussian Mixture Regression" to "What about Gaussian Mixture Regression?" on Dec 20, 2015

@agramfort
Member

agramfort commented Dec 21, 2015 via email

@belevtsoff
Author

@agramfort It's tough to find a standalone reference, but Section II.A of Toda 2007 gives a good review. The method is quite popular for spectral mapping. The idea is to model the joint pdf p(x, y) with a GMM and then predict y as the expectation of the conditional distribution p(y | x), i.e. \hat{y} = E[y | x]. In essence, this is a smooth piecewise-linear transformation (approximately linear around the cluster centroids). Here's an example:
[Image: gmr]
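
For readers who want to try the idea described above, here is a minimal sketch built on scikit-learn's `GaussianMixture`. It is not code from this thread: the helper name `gmr_predict`, its `n_in` argument, and the toy sine data are assumptions made for illustration, and it assumes `covariance_type="full"` with the input columns placed first in the joint data. Each component contributes its Gaussian conditional mean, and the contributions are weighted by how responsible that component is for x.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture


def gmr_predict(gmm, X, n_in):
    """Return E[y | x] under a GaussianMixture fitted on the joint [x, y].

    Hypothetical helper: assumes covariance_type="full" and that the first
    n_in columns of the training data were the inputs x.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_components = X.shape[0], gmm.n_components
    n_out = gmm.means_.shape[1] - n_in
    log_resp = np.empty((n_samples, n_components))
    cond_means = np.empty((n_samples, n_components, n_out))
    for k in range(n_components):
        mu_x, mu_y = gmm.means_[k, :n_in], gmm.means_[k, n_in:]
        S = gmm.covariances_[k]
        S_xx, S_yx = S[:n_in, :n_in], S[n_in:, :n_in]
        # Per-component conditional mean: mu_y + S_yx S_xx^{-1} (x - mu_x)
        cond_means[:, k, :] = mu_y + np.linalg.solve(S_xx, (X - mu_x).T).T @ S_yx.T
        # Unnormalised log responsibility of component k given x alone
        log_resp[:, k] = np.log(gmm.weights_[k]) + multivariate_normal.logpdf(X, mu_x, S_xx)
    # Normalise responsibilities in a numerically stable way
    log_resp -= log_resp.max(axis=1, keepdims=True)
    resp = np.exp(log_resp)
    resp /= resp.sum(axis=1, keepdims=True)
    # Responsibility-weighted average of the component-wise conditional means
    return np.einsum("nk,nko->no", resp, cond_means)


# Toy data (an assumption for illustration): a noisy sine curve
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(x) + 0.1 * rng.normal(size=x.shape)
joint = np.hstack([x, y])

gmm = GaussianMixture(n_components=5, covariance_type="full", random_state=0).fit(joint)
y_hat = gmr_predict(gmm, x, n_in=1)
```

With a single component the prediction reduces to an ordinary linear fit, which is a convenient sanity check; more components give the smooth piecewise-linear behaviour described above.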

@agramfort
Member

agramfort commented Dec 21, 2015 via email

@belevtsoff
Author

@agramfort Alright then. Once it's ready, it can also be discussed on the mailing list.

@amueller
Member

Closing for now. Seems kinda related to GPs? But without a solid reference and a usage example, it's probably a "no". Feel free to post to scikit-learn-contrib.

@stnorton

I realize this is closed, but there is a reference to this in Section 11.2.4 of Murphy (2012), Machine Learning: A Probabilistic Perspective. As a usage example, see Imai and Tingley (2011) in the American Journal of Political Science.

GMM regressions are fairly commonly used in the social sciences, and are well implemented in R (package flexmix).

@amueller
Member

@stnorton I don't think we want to implement everything mentioned in Murphy's book.
This is quite easy to implement with sklearn right now, right?

Is there a reason this is commonly used in social sciences?

@stnorton

@amueller It's difficult enough to discourage sklearn's use for mixture regressions, especially given that it's so easy in R. There are also no examples of implementation available online (a search for an example for a seminar I'm teaching led me to this thread), leading to the need to reinvent the wheel each time.

Mixture regressions are often used in social science for running models with heterogeneous treatment effects where the source of the heterogeneity isn't observed.
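
To make the use case above concrete, here is a minimal EM sketch for a mixture of linear regressions with Gaussian noise, where each latent group has its own coefficients. It is an illustration under stated assumptions, not flexmix's algorithm and not a scikit-learn API: the function name `fit_mixture_of_regressions`, its parameters, and the simulated data are all hypothetical. EM can settle in a local optimum, so in practice several random restarts are common.

```python
import numpy as np


def fit_mixture_of_regressions(X, y, n_components=2, n_iter=200, seed=0):
    """EM for a mixture of linear regressions (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = np.hstack([np.ones((n, 1)), X])  # design matrix with intercept column
    # Start from random soft assignments of samples to components
    resp = rng.dirichlet(np.ones(n_components), size=n)
    for _ in range(n_iter):
        # M-step: mixing weights, then weighted least squares per component
        weights = np.clip(resp.mean(axis=0), 1e-12, None)
        coefs = np.empty((n_components, A.shape[1]))
        sigmas = np.empty(n_components)
        for k in range(n_components):
            w = resp[:, k]
            Aw = A * w[:, None]
            gram = A.T @ Aw + 1e-8 * np.eye(A.shape[1])  # tiny ridge for stability
            coefs[k] = np.linalg.solve(gram, Aw.T @ y)
            resid = y - A @ coefs[k]
            sigmas[k] = np.sqrt((w * resid**2).sum() / (w.sum() + 1e-12) + 1e-12)
        # E-step: posterior probability of each component for each sample
        resid = y[:, None] - A @ coefs.T
        log_p = np.log(weights) - np.log(sigmas) - 0.5 * (resid / sigmas) ** 2
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
    return weights, coefs, sigmas, resp


# Simulated data (an assumption): two unobserved groups with different effects
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(-2, 2, size=(n, 1))
group = rng.integers(0, 2, size=n)        # latent group, never shown to the model
slopes = np.where(group == 0, 2.0, -1.0)  # heterogeneous effect of x on y
y = 1.0 + slopes * x[:, 0] + 0.3 * rng.normal(size=n)

weights, coefs, sigmas, resp = fit_mixture_of_regressions(x, y)
```

Each row of `coefs` holds the intercept and slope for one latent group, and `resp` gives the estimated group membership probabilities per sample.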

Obviously, it's not necessary to cater to all research communities, but as Python becomes more popular for social scientists, this would be a nice feature for that research community to have.

@amueller
Member

I agree it would be good for the research community to have. I think we'd be happy to add an example, or you could add a package to scikit-learn-contrib.
Also see the FAQ.
Having something "in R" is the equivalent of having it "on PyPI", not "in scikit-learn".

I guess it's something like

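from sklearn.linear_model import LinearRegression
from sklearn.mixture import GaussianMixture
probs = GaussianMixture().fit(X).predict_proba(X)  # shape (n_samples, n_components)
# scale a copy of X by each component's responsibility, then flatten per sample
expanded = (X[:, :, None] * probs[:, None, :]).reshape(X.shape[0], -1)
lr = LinearRegression().fit(expanded, y)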

?
(the reshape is untested ;)
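
A runnable variant of the sketch above, with the broadcast written out explicitly so the shapes line up: `X` is `(n_samples, n_features)` and the responsibilities are `(n_samples, n_components)`. The V-shaped toy data, the choice of four components, and the extra step of appending the responsibilities as per-component intercepts are assumptions added here for illustration; they are not part of the original suggestion.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.mixture import GaussianMixture

# Toy data (an assumption): a V-shaped target that one straight line cannot fit
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.abs(X[:, 0]) + 0.1 * rng.normal(size=400)

gm = GaussianMixture(n_components=4, random_state=0).fit(X)
probs = gm.predict_proba(X)  # (n_samples, n_components)

# One copy of the features per component, each scaled by its responsibility:
# (n, d, 1) * (n, 1, k) -> (n, d, k), flattened to (n, d * k)
expanded = (X[:, :, None] * probs[:, None, :]).reshape(X.shape[0], -1)
# Assumption beyond the sketch above: also pass the responsibilities themselves
# so that every component gets its own intercept.
expanded = np.hstack([expanded, probs])

plain_r2 = LinearRegression().fit(X, y).score(X, y)
gated_r2 = LinearRegression().fit(expanded, y).score(expanded, y)
print(f"plain R^2: {plain_r2:.3f}, responsibility-weighted R^2: {gated_r2:.3f}")
```

Because the responsibilities sum to one for each sample, the plain linear model is a special case of the expanded one, so its training fit can only match or improve on the plain fit. Note that the mixture here is fitted on `X` alone, which differs from modelling the joint density of x and y as described earlier in the thread.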

@jnothman
Member

jnothman commented Sep 26, 2018 via email

@AlexanderFabisch
Member

A little bit late, but I made this library about 5 years ago because GMR would not fit perfectly into sklearn's estimator interface: https://github.com/AlexanderFabisch/gmr

[Image: gmr]

I could move it to scikit-learn-contrib. I don't know if it covers all use cases of GMR though. It was sufficient for my application.
