[MRG+2] Modification of GaussianMixture class. #7123
Conversation
The purpose here is to prepare the integration of BayesianGaussianMixture.
    for init in range(n_init):
        self._print_verbose_msg_init_beg(init)

        if do_init:
            self._initialize_parameters(X)
        current_log_likelihood, resp = self._e_step(X)
        self.lower_bound_ = np.infty
shouldn't you document the new attribute?
Indeed, sorry for that mistake.
I didn't know `np.infty` is also `np.inf`... Interesting...
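For reference, a quick check confirms that `np.infty` is simply an alias for `np.inf` (note that recent NumPy releases have removed `np.infty`, keeping only `np.inf`):

```python
import numpy as np

# np.infty is just another name for the same object as np.inf
# (in NumPy versions that still expose it).
assert np.infty is np.inf
print(np.infty)  # inf
```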
Shouldn't it be `self.lower_bound_ = -np.infty`?
It has been merged but I re-ask the question: shouldn't it be `-np.infty`?
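As a rough illustration (not the actual scikit-learn code) of why `-np.inf` is the natural starting value when the lower bound is being maximized, here is a minimal sketch of a loop that keeps the best result over several initializations; `run_em` is a hypothetical callable:

```python
import numpy as np

def fit_best_of_n_init(run_em, n_init):
    """Minimal sketch: run EM n_init times and keep the run with the
    highest lower bound. run_em() is assumed to return (lower_bound, params)."""
    best_lower_bound = -np.inf  # any finite lower bound beats this
    best_params = None
    for _ in range(n_init):
        lower_bound, params = run_em()
        # Starting from +inf instead, this comparison would never be True
        # and the best run would be silently discarded.
        if lower_bound > best_lower_bound:
            best_lower_bound, best_params = lower_bound, params
    return best_lower_bound, best_params
```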
ping @tguillemot
that's it for me
@@ -136,7 +136,7 @@ def _initialize_parameters(self, X):
        ----------
        X : array-like, shape (n_samples, n_features)
        """
-       n_samples = X.shape[0]
+       n_samples, _ = X.shape
Why this change?
Sorry if this was suggested before...
It's just that I prefer it this way, and it also checks that the shape of X is a 2-tuple.
Sorry for these small modifications.
Okay. Thx!
It also implicitly checks that X.ndim == 2. I do the same in my code.
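To make that implicit check concrete, a small illustrative snippet: tuple unpacking of `X.shape` fails for anything that is not 2-D, while `X.shape[0]` silently accepts a 1-D array:

```python
import numpy as np

X_2d = np.zeros((10, 3))
n_samples, _ = X_2d.shape   # fine: shape is (10, 3)

X_1d = np.zeros(10)
print(X_1d.shape[0])        # 10 -- a 1-D array slips through unnoticed
try:
    n_samples, _ = X_1d.shape
except ValueError as exc:
    # shape is (10,), so unpacking into two names raises immediately
    print("caught:", exc)
```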
@@ -563,6 +553,9 @@ class GaussianMixture(BaseMixture):

    n_iter_ : int
        Number of step used by the best fit of EM to reach the convergence.

    lower_bound_ : float
lower_bound_ -> best_log_likelihood_ ?
In fact, for GMM this is the best log-likelihood, but not for VBGMM, where it is the lower bound.
I've chosen lower_bound_ because it was the most understandable name.
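For context, a small usage example (assuming the attribute names discussed in this PR, as they later shipped in scikit-learn) showing how the attribute is read after fitting:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
# Two well-separated 2-D blobs, 100 points each.
X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 5])

gm = GaussianMixture(n_components=2, n_init=3, random_state=0).fit(X)
# For GaussianMixture this is the best per-sample log-likelihood found over
# the n_init runs; for the variational model it is a true lower bound.
print(gm.lower_bound_, gm.n_iter_)
```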
Done.
@ngoix This PR is related to the Bayesian Gaussian Mixture; if you have some time, could you review it?
one more +1 and we're good here...
    for n_iter in range(self.max_iter):
-       prev_log_likelihood = current_log_likelihood
+       prev_log_likelihood = self.lower_bound_
prev_log_likelihood -> prev_lower_bound ?
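For clarity, a rough sketch (not the actual implementation) of the inner EM loop with the suggested `prev_lower_bound` naming; the hypothetical `estimator` is assumed to expose `_e_step`, `_m_step` and `_compute_lower_bound` with the semantics discussed here:

```python
import numpy as np

def em_run(estimator, X, max_iter, tol):
    """Minimal sketch of a single EM run tracking the lower bound."""
    lower_bound = -np.inf
    for n_iter in range(1, max_iter + 1):
        prev_lower_bound = lower_bound  # renamed from prev_log_likelihood
        log_prob_norm, log_resp = estimator._e_step(X)
        estimator._m_step(X, log_resp)
        # Evaluate the lower bound after the M-step, as fixed in this PR.
        lower_bound = estimator._compute_lower_bound(log_resp, log_prob_norm)
        if abs(lower_bound - prev_lower_bound) < tol:
            return n_iter, lower_bound, True   # converged
    return max_iter, lower_bound, False        # hit max_iter without converging
```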
While testing the code, I realized that there is a problem which was already present before this PR:
    def _estimate_log_weights(self):
        return np.log(self.weights_)

    def _compute_lower_bound(self, _, log_prob_norm):
        return log_prob_norm
In `_set_parameters` below, compute `self.covariances_` from `self.precisions_`.
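A minimal sketch of what that could look like for the 'full' covariance case, assuming the convention that each precision matrix is `P @ P.T` with `P` the stored Cholesky factor (a hypothetical helper, not the PR's exact code):

```python
import numpy as np
from scipy import linalg

def covariances_from_precisions_cholesky(precisions_cholesky):
    """Rebuild covariance_k = inv(P_k @ P_k.T) for each component k."""
    covariances = np.empty_like(precisions_cholesky)
    for k, prec_chol in enumerate(precisions_cholesky):
        precision = np.dot(prec_chol, prec_chol.T)
        covariances[k] = linalg.inv(precision)
    return covariances
```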
That's all from me!
@ngoix Indeed, you're right. I missed that. Thanks.
@agramfort merge?
@amueller Can you merge, please?
Thanks @tguillemot
@agramfort @ngoix Thanks
* Modification of GaussianMixture class. The purpose here is to prepare the integration of BayesianGaussianMixture.
* Fix comments.
* Modification of the Docstring.
* Add license and author.
* Fix review and add tests for init.
This PR is meant to simplify the integration of the BayesianGaussianMixture class (#6651). I've simplified the GaussianMixture class by integrating a function which computes the determinant of the Cholesky decomposition of the precision matrix (which will be useful for BayesianGaussianMixture). I've also corrected a bug in the EM process: normally, the lower bound is computed after the M-step, not after the E-step. It's not a problem for GaussianMixture, but it creates problems for BayesianGaussianMixture.

@agramfort @ogrisel @amueller Can you have a look?
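As a rough sketch of the kind of helper described above (full-covariance case only, and not necessarily the exact function added in this PR): the log-determinant of a triangular Cholesky factor is the sum of the logs of its diagonal entries.

```python
import numpy as np

def log_det_precision_cholesky(precisions_chol):
    """precisions_chol: array of shape (n_components, n_features, n_features),
    each slice a triangular Cholesky factor of a precision matrix.
    The determinant of a triangular matrix is the product of its diagonal."""
    n_components, n_features, _ = precisions_chol.shape
    diagonals = precisions_chol.reshape(n_components, -1)[:, ::n_features + 1]
    return np.sum(np.log(diagonals), axis=1)
```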