Clustering Mixture
Jing Gao
SUNY Buffalo
Outline
• Basics
– Motivation, definition, evaluation
• Methods
– Partitional
– Hierarchical
– Density-based
– Mixture model
– Spectral methods
• Advanced topics
– Clustering ensemble
– Clustering in MapReduce
– Semi-supervised clustering, subspace clustering, co-clustering, etc.
Using Probabilistic Models for Clustering
Gaussian Distribution
• Probability density function f(x) is a function of x, given μ and σ:

  N(x | μ, σ²) = (1/√(2πσ²)) exp(−(x − μ)² / (2σ²))

• Changing σ increases or decreases the spread
[Figure: univariate Gaussian density curve centered at μ with spread σ]
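As a quick illustration of the density above, a minimal sketch in Python (the function name `gaussian_pdf` is my own, not from the slides):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density N(x | mu, sigma^2)."""
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / math.sqrt(2 * math.pi * sigma ** 2))

# Peak height at the mean shrinks as sigma grows (the spread increases)
print(gaussian_pdf(0.0, 0.0, 1.0))  # ≈ 0.3989
print(gaussian_pdf(0.0, 0.0, 2.0))  # ≈ 0.1995
```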
Likelihood
• Define the likelihood as a function of μ and σ, given the data x₁, x₂, …, xₙ:

  L(μ, σ²) = ∏ᵢ₌₁ⁿ N(xᵢ | μ, σ²)
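To make the "function of the parameters, given the data" viewpoint concrete, a small sketch (the data values and the `likelihood` helper are illustrative, not from the slides):

```python
import math

def likelihood(xs, mu, sigma2):
    """Likelihood L(mu, sigma2): product of N(x_i | mu, sigma2) over the data."""
    l = 1.0
    for x in xs:
        l *= (math.exp(-(x - mu) ** 2 / (2 * sigma2))
              / math.sqrt(2 * math.pi * sigma2))
    return l

xs = [1.8, 2.1, 2.0, 2.3, 1.9]  # data held fixed; parameters vary

# Parameters near where the data concentrate yield a higher likelihood
print(likelihood(xs, 2.0, 0.1))
print(likelihood(xs, 5.0, 0.1))
```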
Gaussian Distribution
• Multivariate Gaussian:

  N(x | μ, Σ), with mean vector μ and covariance matrix Σ

• Log likelihood:

  L(μ, Σ) = Σᵢ₌₁ⁿ ln N(xᵢ | μ, Σ) = −(1/2) Σᵢ₌₁ⁿ [ (xᵢ − μ)ᵀ Σ⁻¹ (xᵢ − μ) + ln|Σ| ] + const
Maximum Likelihood Estimate
• MLE
  – Find the model parameters μ, σ that maximize the log likelihood L(μ, σ)
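For a single Gaussian the MLE has a closed form (the sample mean and sample variance); a sketch verifying that these beat nearby parameter settings, with synthetic data of my own choosing:

```python
import math
import random

def log_likelihood(xs, mu, sigma2):
    """Log likelihood of the data under N(mu, sigma2)."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma2))

random.seed(0)
xs = [random.gauss(5.0, 2.0) for _ in range(1000)]

# Closed-form MLE for a Gaussian: sample mean and (biased) sample variance
mu_hat = sum(xs) / len(xs)
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / len(xs)

# The MLE attains a higher log likelihood than perturbed parameters
assert log_likelihood(xs, mu_hat, sigma2_hat) >= log_likelihood(xs, mu_hat + 0.5, sigma2_hat)
assert log_likelihood(xs, mu_hat, sigma2_hat) >= log_likelihood(xs, mu_hat, sigma2_hat * 1.5)
```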
Gaussian Mixture
• Mixture of K Gaussian components:

  p(x) = Σₖ₌₁ᴷ πₖ N(x | μₖ, Σₖ)

• Parameters to be estimated: mixing weights πₖ, means μₖ, covariances Σₖ
Gaussian Mixture
• To generate a data point:
  – first pick one of the K components with probability πₖ
  – then draw a sample from that component's distribution
• Each data point is generated by one of the K components; a latent variable indicating that component is associated with each point
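The generative process above can be sketched as follows (the 2-component mixture parameters are hypothetical, chosen for illustration):

```python
import random

# Hypothetical 2-component mixture: mixing weights, means, standard deviations
weights = [0.3, 0.7]
means = [-2.0, 3.0]
stds = [0.5, 1.0]

def sample_gmm(n):
    """Draw n points: pick component k with probability weights[k] (the latent
    variable), then sample from that component's Gaussian."""
    points, labels = [], []
    for _ in range(n):
        k = random.choices(range(len(weights)), weights=weights)[0]
        points.append(random.gauss(means[k], stds[k]))
        labels.append(k)
    return points, labels

random.seed(0)
pts, zs = sample_gmm(10000)
# Empirical component frequencies approach the mixing weights
print(zs.count(0) / len(zs))  # ≈ 0.3
```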
Expectation-Maximization (EM) Algorithm
• E-step: compute the responsibility of component k for each point xᵢ

  γᵢₖ = πₖ N(xᵢ | μₖ, Σₖ) / Σⱼ πⱼ N(xᵢ | μⱼ, Σⱼ)

• M-step (parameter update): re-estimate the parameters using the responsibilities

  μₖ = Σᵢ γᵢₖ xᵢ / Σᵢ γᵢₖ
  Σₖ = Σᵢ γᵢₖ (xᵢ − μₖ)(xᵢ − μₖ)ᵀ / Σᵢ γᵢₖ
  πₖ = (1/n) Σᵢ γᵢₖ

• Iterate E-step and M-step until the log likelihood converges
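A minimal sketch of EM for a univariate mixture, assuming quantile-based initialization and a small variance floor of my own choosing (not prescribed by the slides):

```python
import math
import random

def em_gmm(xs, k, iters=50):
    """Minimal EM for a univariate Gaussian mixture (illustrative sketch)."""
    n = len(xs)
    srt = sorted(xs)
    mus = [srt[(2 * j + 1) * n // (2 * k)] for j in range(k)]  # quantile init
    var0 = sum((x - sum(xs) / n) ** 2 for x in xs) / n
    sig2 = [var0] * k
    pis = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibilities gamma[i][j] = P(component j | x_i)
        gamma = []
        for x in xs:
            w = [pis[j] / math.sqrt(2 * math.pi * sig2[j])
                 * math.exp(-(x - mus[j]) ** 2 / (2 * sig2[j])) for j in range(k)]
            s = sum(w)
            gamma.append([wj / s for wj in w])
        # M-step: weighted mean, variance, and mixing weight per component
        for j in range(k):
            nj = sum(g[j] for g in gamma)
            mus[j] = sum(gamma[i][j] * xs[i] for i in range(n)) / nj
            sig2[j] = sum(gamma[i][j] * (xs[i] - mus[j]) ** 2
                          for i in range(n)) / nj + 1e-9  # variance floor
            pis[j] = nj / n
    return pis, mus, sig2

# Demo on synthetic data from two well-separated components
random.seed(0)
data = [random.gauss(-5, 1) for _ in range(300)] + \
       [random.gauss(5, 1) for _ in range(300)]
pis, mus, sig2 = em_gmm(data, 2)
```

The E-step soft-assigns each point, and the M-step re-fits each component to its weighted share of the data, exactly mirroring the update equations above.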
EM Algorithm
• Relation to K-means
  – Consider a GMM in which all components share a common covariance σ²I
  – As σ² → 0, the responsibilities become hard 0/1 assignments to the nearest mean, and EM reduces to K-means
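The soft-to-hard transition can be seen directly in the responsibilities; a sketch assuming equal mixing weights and two fixed means of my own choosing:

```python
import math

def responsibilities(x, mus, sigma2):
    """Posterior component probabilities for point x, assuming equal mixing
    weights and a common variance sigma2 shared by all components."""
    w = [math.exp(-(x - m) ** 2 / (2 * sigma2)) for m in mus]
    s = sum(w)
    return [wi / s for wi in w]

mus = [0.0, 4.0]
x = 1.5  # closer to the first mean

print(responsibilities(x, mus, 1.0))   # soft assignment, ≈ [0.88, 0.12]
print(responsibilities(x, mus, 0.1))   # nearly hard, ≈ [1.0, 0.0]
```

As the shared variance shrinks, each point's responsibility concentrates on its nearest mean, which is exactly the hard assignment step of K-means.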
Mixture Model
• Strengths
  – Gives probabilistic cluster assignments
  – Has a probabilistic interpretation
  – Can handle clusters with varying sizes, variances, etc.
• Weaknesses
  – Initialization matters
  – Appropriate distributions must be chosen
  – Overfitting issues
Take-away Message
• Probabilistic clustering
• Maximum likelihood estimate
• Gaussian mixture model for clustering
• EM algorithm that alternately assigns points to clusters and estimates model parameters
• Strengths and weaknesses