Scaling kills DPGMM [was: mixture.DPGMM not fitting to data] #2454
This looks pretty bad :-/
My explanation for this is: the model assumes an N(0, 1) prior on the means [and also a fixed prior on the covariance], which is not reasonable for your data. To make this work, the data should be scaled to have zero mean and unit variance. Then the result would be much more sensible. I have too little experience with this kind of model to say what a good solution would be.
PS: any Bayesian should feel free to hit me and implement the hierarchical approach.
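A minimal sketch of the scaling step suggested above, using only the Python standard library. The mixture parameters are the ones from the original report; the sample size, the seed, and the helper names `standardize` and `unstandardize` are illustrative assumptions and not part of scikit-learn.

```python
import random
import statistics


def standardize(xs):
    """Return (scaled values, mean, std) so the scaled data has mean 0 and variance 1."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return [(x - mu) / sigma for x in xs], mu, sigma


def unstandardize(values, mu, sigma):
    """Map values in scaled units (e.g. fitted component means) back to original units."""
    return [v * sigma + mu for v in values]


rng = random.Random(0)
# Hypothetical sample from the reported mixture: N(1000, 500^2) and N(2000, 600^2).
data = [rng.gauss(1000, 500) for _ in range(500)]
data += [rng.gauss(2000, 600) for _ in range(500)]

scaled, mu, sigma = standardize(data)

# In scaled units the true component means lie well within the range
# that an N(0, 1) prior on the means considers plausible.
true_means_scaled = [(m - mu) / sigma for m in (1000, 2000)]
print(true_means_scaled)

# Means estimated on the scaled data would be mapped back the same way.
print(unstandardize(true_means_scaled, mu, sigma))
```

The model would then be fitted on `scaled` rather than `data`, and any means it reports converted back with `unstandardize`.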
Thinking about it, I'm not sure if 1000 samples shouldn't be enough to overcome the prior... hum...
The derivation of the mean update at http://scikit-learn.org/dev/modules/dp-derivation.html#the-updates is quite different from the one given in Bishop's or Murphy's book. In particular, in the books the variational mean parameters don't depend on the variational precision parameters, whereas in the derivation in the docs they do (which is odd).
I am not very attached to our implementation. It has given us a lot of
Closing: the new Dirichlet process GMM rewrite has been merged into master. It is not affected by this bug.
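For reference, a hedged sketch of fitting the same kind of data with the rewritten estimator, assuming it is the one exposed as `sklearn.mixture.BayesianGaussianMixture` with `weight_concentration_prior_type="dirichlet_process"`. The sample size, the seed, `n_components=10`, and `max_iter=500` are illustrative assumptions, not values taken from this thread.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(0)
# Hypothetical sample from the reported mixture: N(1000, 500^2) and N(2000, 600^2).
X = np.concatenate(
    [rng.normal(1000, 500, 500), rng.normal(2000, 600, 500)]
).reshape(-1, 1)

model = BayesianGaussianMixture(
    n_components=10,  # upper bound on the number of components
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500,
    random_state=0,
)
model.fit(X)

# List components from heaviest to lightest; unused ones get small weights.
order = np.argsort(model.weights_)[::-1]
print(model.means_[order].ravel())
print(model.weights_[order])
```

As far as its documentation describes, this estimator derives its default mean and covariance priors from the data, so the fitted means should come out in the data's own units without manual rescaling.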
I am trying out the Gaussian mixture models in the package. I tried to model a mixture with two components, G(1000, 500^2) and G(2000, 600^2). The following is the code:
And I got the following means for the components:
[[ 0.13436485]
[ 0.13199086]
[ 0.11750537]
[ 0.10560644]
[ 0.12162311]
[ 0.00204134]
[ 0.12058521]
[ 0.11997703]
[ 0.11944384]
[ 0.11890694]]
It seems the model does not fit the data properly. Is this a bug, or have I got something wrong in how I applied the model?
Thanks.
Fan