# GMM: Gaussian Mixture Model

2022-01-27 05:03:07


# Gaussian Mixture Model (GMM)

The Gaussian mixture model (GMM) is a common generative model.

## Mixture Model

A mixture model represents an overall distribution that contains $K$ sub-distributions. In other words, a mixture model describes the probability distribution of the observed data in the population as a mixture of $K$ component distributions. A mixture model does not require the observed data to indicate which sub-distribution each point came from in order to compute the probability of the data under the overall distribution.

## Gaussian Mixture Model

We know that the $n$-dimensional Gaussian distribution has the following probability density function:

$p(x\mid \theta)=\dfrac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left(-\dfrac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right)$
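This density is straightforward to compute directly. A minimal NumPy sketch (the function name `gaussian_pdf` is my own, not from the text):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of an n-dimensional Gaussian N(mu, Sigma) evaluated at x."""
    n = mu.shape[0]
    diff = x - mu
    # normalization constant (2*pi)^(n/2) * |Sigma|^(1/2)
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff) / norm
```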

The Gaussian mixture model (Gaussian Mixture Model, GMM) is an extension of the single Gaussian probability density function; a GMM can smoothly approximate a density distribution of almost any shape.

A Gaussian mixture model can be regarded as composed of $K$ single Gaussian models; which of these $K$ sub-models generated each sample is the latent variable of the mixture.

We can therefore write the probability density function of a Gaussian mixture model as a weighted sum of its $K$ single Gaussian probability density functions.

Suppose our sample data $X$ consists of $N$ samples $x_1,x_2,...,x_N$. Let $\alpha_k$ denote the weight of the $k$-th single Gaussian model (with $\sum_{k=1}^{K}\alpha_k=1$), and let $G(x|\theta_k)$ denote the probability density function of a single Gaussian. Then:

$p(x|\theta)=\sum_{k=1}^{K}\alpha_kG(x|\theta_k)$
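A sketch of this weighted sum using SciPy's `multivariate_normal` (the helper name `gmm_pdf` is hypothetical):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_pdf(x, alphas, mus, sigmas):
    """Mixture density p(x|theta) = sum_k alpha_k * G(x | mu_k, Sigma_k)."""
    return sum(
        a * multivariate_normal.pdf(x, mean=m, cov=s)
        for a, m, s in zip(alphas, mus, sigmas)
    )
```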

Obviously, the parameters of a GMM are the set $\theta$:

$\theta = (\tilde{\mu}_k,\tilde{\Sigma}_k,\tilde{\alpha}_k)$

Notice that the value of $K$ must be fixed in advance, and this choice matters, just as $k$ must be chosen beforehand in $k$-means.

## Parameter estimation

From learning the multi-dimensional Gaussian distribution, we know that $\theta$ can be estimated by maximum likelihood, with likelihood function $L(\theta)=p(x|\theta)$:

$\hat{\theta} = \arg\max_{\theta} L(\theta)$

For a GMM, we assume each sample is drawn independently, so the likelihood is a product over the $N$ samples. Since the probability of a single point is very small, the product quickly becomes tiny and can easily cause floating-point underflow, so we work with the log-likelihood instead:

$L(\theta)=\prod_{j=1}^{N}p(x_j|\theta) \\ \log L(\theta)=\sum_{j=1}^{N}\log\sum_{k=1}^{K}\alpha_kG(x_j|\theta_k)$
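The log-likelihood can be computed stably with a log-sum-exp trick; a sketch under the same notation (`gmm_log_likelihood` is my own name):

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

def gmm_log_likelihood(X, alphas, mus, sigmas):
    """log L(theta) = sum_j log sum_k alpha_k * G(x_j | mu_k, Sigma_k)."""
    # log_probs[j, k] = log alpha_k + log G(x_j | mu_k, Sigma_k)
    log_probs = np.stack([
        np.log(a) + multivariate_normal.logpdf(X, mean=m, cov=s)
        for a, m, s in zip(alphas, mus, sigmas)
    ], axis=1)                                   # shape (N, K)
    return logsumexp(log_probs, axis=1).sum()    # sum over the N samples
```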

## Solving with the EM algorithm

First, randomly initialize a set of parameters:

$\theta=(\mu_k,\Sigma_k,\alpha_k),\;k=1,2,...,K$

E step:

The E stands for Expectation: given the current model parameters, we take the expectation of the latent variables, i.e. compute the posterior probability that each sample belongs to each sub-model, as follows:

$\gamma_{j,k}=\dfrac{\alpha_k G(x_j|\mu_k,\Sigma_k)}{\sum_{k=1}^{K}\alpha_k G(x_j|\mu_k,\Sigma_k)}$

$\gamma_{j,k}$ is the probability that sample $x_j$ belongs to the $k$-th sub-Gaussian model.
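The E step above might be sketched like this (`e_step` is my own name; `gamma[j, k]` matches $\gamma_{j,k}$):

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, alphas, mus, sigmas):
    """Responsibilities gamma[j, k] = alpha_k G(x_j) / sum_k' alpha_k' G(x_j)."""
    # weighted component densities, shape (N, K)
    dens = np.stack([
        a * multivariate_normal.pdf(X, mean=m, cov=s)
        for a, m, s in zip(alphas, mus, sigmas)
    ], axis=1)
    return dens / dens.sum(axis=1, keepdims=True)  # each row sums to 1
```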

M step:

Now that we have $\gamma_{j,k}$, we can use maximum likelihood to estimate the parameters for the next iteration:

$\mu_k=\dfrac{\sum_{j=1}^{N}(\gamma_{j,k}x_j)}{\sum_{j=1}^{N}\gamma_{j,k}},\; k=1,2,...,K \\ \Sigma_k=\dfrac{\sum_{j=1}^{N}\gamma_{j,k}(x_j-\mu_k)(x_j-\mu_k)^T}{\sum_{j=1}^{N}\gamma_{j,k}},\; k=1,2,...,K \\ \alpha_k = \dfrac{\sum_{j=1}^{N}\gamma_{j,k}}{N},\; k=1,2,...,K$
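The update formulas above can be sketched as follows, assuming `gamma` is the `(N, K)` array of responsibilities from the E step (`m_step` is my own name):

```python
import numpy as np

def m_step(X, gamma):
    """Re-estimate (alpha_k, mu_k, Sigma_k) from responsibilities gamma (N, K)."""
    Nk = gamma.sum(axis=0)                 # effective sample count per component
    alphas = Nk / X.shape[0]               # alpha_k = sum_j gamma_jk / N
    mus = (gamma.T @ X) / Nk[:, None]      # weighted means
    sigmas = []
    for k in range(gamma.shape[1]):
        diff = X - mus[k]
        # weighted covariance: sum_j gamma_jk (x_j - mu_k)(x_j - mu_k)^T / Nk
        sigmas.append((gamma[:, k, None] * diff).T @ diff / Nk[k])
    return alphas, mus, np.stack(sigmas)
```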

Repeat the E step and the M step until convergence.

Note that the EM algorithm is guaranteed to converge, but not necessarily to the global maximum; it may find only a local maximum. A common remedy is to run the iteration from several different random initializations and keep the result with the best likelihood.
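In practice, scikit-learn's `GaussianMixture` implements this restart strategy through its `n_init` parameter; a small sketch on synthetic two-cluster data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# two well-separated clusters of 200 points each
X = np.vstack([
    rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(200, 2)),
    rng.normal(loc=[3.0, 0.0], scale=0.5, size=(200, 2)),
])

# n_init=5 runs EM from 5 random initializations and keeps the best fit
gmm = GaussianMixture(n_components=2, n_init=5, random_state=0).fit(X)
print(gmm.weights_)  # mixture weights alpha_k, roughly [0.5, 0.5]
```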

References

1. Li Hang, *Statistical Learning Methods*