
Clustering

Lecture 5: Mixture Model

Jing Gao
SUNY Buffalo

Outline
• Basics
– Motivation, definition, evaluation
• Methods
– Partitional
– Hierarchical
– Density-based
– Mixture model
– Spectral methods
• Advanced topics
– Clustering ensemble
– Clustering in MapReduce
– Semi-supervised clustering, subspace clustering, co-clustering, etc.

Using Probabilistic Models for Clustering

• Hard vs. soft clustering


– Hard clustering: Every point belongs to exactly one cluster
– Soft clustering: Every point belongs to several clusters, each to a certain degree
• Probabilistic clustering
– Each cluster is mathematically represented by a parametric distribution
– The entire data set is modeled by a mixture of these distributions

Gaussian Distribution

(Figure: a bell curve f(x) centered at μ with spread σ.)

• Changing μ shifts the distribution left or right
• Changing σ increases or decreases the spread
• The probability density function f(x) is a function of x given μ and σ:

N(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right)
Likelihood

(Figure: two candidate Gaussian curves f(x) over the same data points on the x-axis.)

• Which Gaussian distribution is more likely to have generated the data?
• Define the likelihood as a function of μ and σ given x_1, x_2, …, x_n:

L(\mu, \sigma) = \prod_{i=1}^{n} N(x_i \mid \mu, \sigma^2)
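A sketch of the likelihood viewed as a function of the parameters, computed in log space for numerical stability; `log_likelihood` and the sample data are assumptions for illustration, reusing `gaussian_pdf` from the previous sketch:

```python
import numpy as np

def log_likelihood(x, mu, sigma):
    # sum over data points of log N(x_i | mu, sigma^2)
    return np.sum(np.log(gaussian_pdf(x, mu, sigma)))

x = np.array([1.2, 0.8, 1.1, 0.9])
# a Gaussian centered near the data is more likely to have generated it
print(log_likelihood(x, mu=1.0, sigma=0.2) > log_likelihood(x, mu=3.0, sigma=0.2))  # True
```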
Gaussian Distribution

• Multivariate Gaussian:

N(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right)

where μ is the mean vector and Σ is the covariance matrix
• Log likelihood

L(\mu, \Sigma) = \sum_{i=1}^{n} \ln N(x_i \mid \mu, \Sigma) = \sum_{i=1}^{n} \left( -\frac{1}{2} (x_i - \mu)^T \Sigma^{-1} (x_i - \mu) - \frac{1}{2} \ln |\Sigma| \right) + \text{const}
Maximum Likelihood Estimate

• MLE
– Find the model parameters μ, Σ that maximize the log likelihood L(μ, Σ)

• MLE for Gaussian (closed form):

\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \hat{\Sigma} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{\mu})(x_i - \hat{\mu})^T
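A sketch of the closed-form MLE for a multivariate Gaussian, assuming the rows of `X` are data points; note the 1/n (biased) covariance estimator, which is what maximizes the likelihood:

```python
import numpy as np

def gaussian_mle(X):
    n = X.shape[0]
    mu = X.mean(axis=0)                # mu_hat = (1/n) * sum_i x_i
    centered = X - mu
    Sigma = centered.T @ centered / n  # Sigma_hat = (1/n) * sum_i (x_i - mu)(x_i - mu)^T
    return mu, Sigma
```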
Gaussian Mixture

• Linear combination of Gaussians:

p(x) = \sum_{k=1}^{K} \pi_k \, N(x \mid \mu_k, \Sigma_k)

where the mixing weights satisfy 0 \le \pi_k \le 1 and \sum_{k=1}^{K} \pi_k = 1

– Parameters to be estimated: \pi_k, \mu_k, \Sigma_k for k = 1, …, K
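A sketch of evaluating the mixture density p(x) = Σ_k π_k N(x | μ_k, Σ_k) row-wise over a data matrix; the helper name is an assumption, and SciPy is used for the component densities:

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_density(X, pi, mus, sigmas):
    # p(x) = sum_k pi_k * N(x | mu_k, Sigma_k), evaluated for every row of X
    return sum(pi[k] * multivariate_normal.pdf(X, mean=mus[k], cov=sigmas[k])
               for k in range(len(pi)))
```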
Gaussian Mixture
• To generate a data point:
– first pick one of the K components, with probability π_k
– then draw a sample from that component's distribution
• Each data point is generated by one of the K components; a latent variable indicating the component is associated with each point
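A sketch of the generative process just described: draw a latent component z with probabilities π, then sample from that component's Gaussian; all parameter values here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
pi = np.array([0.5, 0.3, 0.2])                  # mixing weights (sum to 1)
mus = np.array([[0.0, 0.0], [4.0, 4.0], [-4.0, 4.0]])
sigmas = np.array([np.eye(2)] * 3)

def sample_gmm(n):
    z = rng.choice(len(pi), size=n, p=pi)       # latent component per point
    X = np.array([rng.multivariate_normal(mus[k], sigmas[k]) for k in z])
    return X, z

X, z = sample_gmm(500)
```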
Gaussian Mixture

• Maximize the log likelihood:

\ln p(X \mid \pi, \mu, \Sigma) = \sum_{i=1}^{n} \ln \left( \sum_{k=1}^{K} \pi_k \, N(x_i \mid \mu_k, \Sigma_k) \right)

• Without knowing the values of the latent variables, we have to maximize this incomplete log likelihood directly
Expectation-Maximization (EM) Algorithm

• E-step: for given parameter values we can compute the expected values of the latent variables, i.e., the responsibilities of each component for each data point:

\gamma(z_{ik}) = \frac{\pi_k \, N(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, N(x_i \mid \mu_j, \Sigma_j)}

– Note that \gamma(z_{ik}) \in [0, 1] instead of \{0, 1\}, but we still have \sum_{k=1}^{K} \gamma(z_{ik}) = 1
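A sketch of the E-step: each entry gamma[i, k] is the responsibility of component k for point i, and each row sums to 1; the function name is an assumption:

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, pi, mus, sigmas):
    # unnormalized responsibilities: pi_k * N(x_i | mu_k, Sigma_k)
    gamma = np.column_stack([
        pi[k] * multivariate_normal.pdf(X, mean=mus[k], cov=sigmas[k])
        for k in range(len(pi))
    ])
    return gamma / gamma.sum(axis=1, keepdims=True)  # normalize rows to sum to 1
```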
Expectation-Maximization (EM) Algorithm

• M-step: maximize the expected complete log likelihood

• Parameter update, with N_k = \sum_{i=1}^{n} \gamma(z_{ik}):

\mu_k = \frac{1}{N_k} \sum_{i=1}^{n} \gamma(z_{ik}) \, x_i, \qquad \Sigma_k = \frac{1}{N_k} \sum_{i=1}^{n} \gamma(z_{ik}) (x_i - \mu_k)(x_i - \mu_k)^T, \qquad \pi_k = \frac{N_k}{n}
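A sketch of the M-step updates above, re-estimating π_k, μ_k, Σ_k from the responsibilities; shapes assume X is (n, d) and gamma is (n, K):

```python
import numpy as np

def m_step(X, gamma):
    n, K = gamma.shape
    Nk = gamma.sum(axis=0)             # N_k = sum_i gamma_ik
    pi = Nk / n                        # pi_k = N_k / n
    mus = (gamma.T @ X) / Nk[:, None]  # mu_k = (1/N_k) sum_i gamma_ik x_i
    sigmas = []
    for k in range(K):
        d = X - mus[k]
        sigmas.append((gamma[:, k, None] * d).T @ d / Nk[k])  # weighted covariance
    return pi, mus, np.array(sigmas)
```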
EM Algorithm

• Iterate the E-step and M-step until the log likelihood of the data no longer increases
– Converges to a local optimum
– May need to restart the algorithm with different initial parameter guesses (as in K-means)

• Relation to K-means
– Consider a GMM in which all components share a common covariance \Sigma_k = \epsilon I
– As \epsilon \to 0, the responsibilities become hard assignments and the two methods coincide
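Putting the two steps together, a sketch of the full EM loop with a log-likelihood convergence check; it reuses the hypothetical `e_step`, `m_step`, and `gmm_density` helpers from the earlier sketches:

```python
import numpy as np

def fit_gmm(X, pi, mus, sigmas, max_iter=100, tol=1e-6):
    prev_ll = -np.inf
    for _ in range(max_iter):
        gamma = e_step(X, pi, mus, sigmas)  # E-step: compute responsibilities
        pi, mus, sigmas = m_step(X, gamma)  # M-step: update parameters
        ll = np.sum(np.log(gmm_density(X, pi, mus, sigmas)))
        if ll - prev_ll < tol:              # stop when the likelihood plateaus
            break
        prev_ll = ll
    return pi, mus, sigmas
```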
K-means vs GMM

• K-means
– Objective function: minimize the sum of squared Euclidean distances
– Can be optimized by an EM-style algorithm (E-step: assign points to clusters; M-step: optimize cluster centers)
– Performs hard assignment during the E-step
– Assumes spherical clusters with equal probability for each cluster

• GMM
– Objective function: maximize the log-likelihood
– Optimized by the EM algorithm (E-step: compute posterior probabilities of membership; M-step: optimize parameters)
– Performs soft assignment during the E-step
– Can model non-spherical clusters
– Can generate clusters with different probabilities
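To see the contrast in practice, a short sketch using scikit-learn (a library choice of this note, not prescribed by the lecture): KMeans returns one hard label per point, while GaussianMixture exposes soft posterior memberships:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X = np.random.default_rng(0).normal(size=(300, 2))  # toy data

hard = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)  # one label per point
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
soft = gmm.predict_proba(X)  # (n, 3) posterior memberships; each row sums to 1
```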
Mixture Model

• Strengths
– Give probabilistic cluster assignments
– Have probabilistic interpretation
– Can handle clusters with varying sizes, variance etc.
• Weaknesses
– Initialization matters
– Requires choosing an appropriate distribution family
– Prone to overfitting
Take-away Message

• Probabilistic clustering
• Maximum likelihood estimate
• Gaussian mixture model for clustering
• EM algorithm that alternates between assigning points to clusters and estimating model parameters
• Strengths and weaknesses
