Maximize Likelihood and KL divergence
Lower Bound
VAE lower bound of $\log P_\theta(x)$
ELBO (Evidence Lower Bound)
Maximizing the likelihood of the observed data $x$:
$P(z)$: a normal distribution (the prior over the latent variable $z$).
$P(x|z) = N(x; \mu(z), \sigma(z))$, where $\mu(z)$ and $\sigma(z)$ are unknown and will be estimated.
Loss: $L = \sum_i \log P(x^{(i)})$, i.e. the log-likelihood of the observed samples, which is to be maximized.
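Note that each $P(x^{(i)})$ is a marginal over the latent variable $z$, which is what makes direct maximization of this objective hard:

$$P\big(x^{(i)}\big) = \int_z P(z)\, P\big(x^{(i)} \mid z\big)\, dz$$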
It is straightforward to figure out the following identity for $\log P(x)$:
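One standard way to spell this step out, using the fact that $\int_z q(z \mid x)\,dz = 1$ for any distribution $q(z \mid x)$ over $z$:

$$
\begin{aligned}
\log P(x) &= \int_z q(z \mid x) \log P(x)\, dz \\
          &= \int_z q(z \mid x) \log \frac{P(z, x)}{P(z \mid x)}\, dz \\
          &= \int_z q(z \mid x) \log \left( \frac{P(z, x)}{q(z \mid x)} \cdot \frac{q(z \mid x)}{P(z \mid x)} \right) dz \\
          &= \int_z q(z \mid x) \log \frac{P(z, x)}{q(z \mid x)}\, dz + KL\big[q(z \mid x)\,\big\|\,P(z \mid x)\big]
\end{aligned}
$$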
Then the lower bound $L_b$ can be derived as follows:
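A sketch, assuming the standard ELBO decomposition: since $KL\big[q(z \mid x)\,\|\,P(z \mid x)\big] \ge 0$, the first integral above is a lower bound on $\log P(x)$, and with $P(z, x) = P(x \mid z)\, P(z)$ it splits into two terms:

$$
\begin{aligned}
\log P(x) \;\ge\; L_b &= \int_z q(z \mid x) \log \frac{P(x \mid z)\, P(z)}{q(z \mid x)}\, dz \\
 &= -\,KL\big[q(z \mid x)\,\big\|\,P(z)\big] + \int_z q(z \mid x) \log P(x \mid z)\, dz
\end{aligned}
$$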
$q(z|x)$ is also a normal distribution, which is estimated by a neural network: $q(z|x) = N(z; \mu'(x), \sigma'(x))$. In other words, the mean and variance of $z$ are given by two functions $\mu'(\cdot)$ and $\sigma'(\cdot)$, which will be estimated by the output of a neural network.
Maximizing this lower bound requires minimizing $KL[q(z|x)\,\|\,P(z)]$ and maximizing the second term, both of which will be connected with neural networks.
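As an illustration, a minimal PyTorch-style sketch of such an encoder network; the module name `Encoder`, the layer sizes, and `latent_dim` are hypothetical choices, not taken from the original:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Estimates q(z|x) = N(z; mu'(x), sigma'(x)) with a small MLP."""

    def __init__(self, input_dim=784, hidden_dim=256, latent_dim=20):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.mu = nn.Linear(hidden_dim, latent_dim)       # mu'(x)
        self.log_var = nn.Linear(hidden_dim, latent_dim)  # log sigma'(x)^2 (log-variance, for numerical stability)

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.log_var(h)
```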
Minimize the first term
$KL[q(z|x)\,\|\,P(z)]$
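With Gaussian $q(z|x)$ and a normal prior, this KL term has a closed form. A standard per-dimension result, assuming $P(z) = N(0, 1)$ and writing $\mu' = \mu'(x)$, $\sigma'^2 = \sigma'(x)^2$ (the assumption of a standard normal prior is not stated explicitly above):

$$
KL\big[N(\mu', \sigma'^2)\,\big\|\,N(0, 1)\big] = \frac{1}{2}\left( \mu'^2 + \sigma'^2 - \log \sigma'^2 - 1 \right)
$$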
Maximize the second term
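The second term is $\int_z q(z \mid x) \log P(x \mid z)\, dz = E_{q(z \mid x)}[\log P(x \mid z)]$, which is typically approximated by sampling $z = \mu'(x) + \sigma'(x) \cdot \epsilon$ with $\epsilon \sim N(0, I)$ (the reparameterization trick) and treating $\log P(x \mid z)$ as a reconstruction term. A minimal sketch of how both terms combine into a trainable loss, reusing the hypothetical `Encoder` above together with an equally hypothetical `Decoder`; the unit-variance Gaussian $P(x|z)$ is an assumption that turns the second term into an MSE-style reconstruction error:

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Estimates mu(z) of P(x|z); with fixed unit variance, log P(x|z) becomes an MSE term."""

    def __init__(self, latent_dim=20, hidden_dim=256, output_dim=784):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, output_dim))

    def forward(self, z):
        return self.net(z)

def vae_loss(x, encoder, decoder):
    mu, log_var = encoder(x)                    # mu'(x), log sigma'(x)^2
    eps = torch.randn_like(mu)                  # epsilon ~ N(0, I)
    z = mu + torch.exp(0.5 * log_var) * eps     # reparameterization trick
    x_hat = decoder(z)
    # Second term (to maximize): E_q[log P(x|z)]; for unit-variance Gaussian P(x|z),
    # this is -0.5 * ||x - x_hat||^2 up to an additive constant.
    recon = 0.5 * ((x - x_hat) ** 2).sum(dim=1)
    # First term (to minimize): KL(q(z|x) || N(0, I)), summed over latent dimensions.
    kl = 0.5 * (mu ** 2 + log_var.exp() - log_var - 1).sum(dim=1)
    return (recon + kl).mean()                  # minimizing this maximizes the lower bound L_b
```

Training then minimizes `vae_loss` over minibatches with any standard optimizer, which is how "maximizing the lower bound" becomes a neural-network training objective.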
VAE and GMM
Problem in VAE