VAE: optimizes only a lower bound (ELBO) on log p(x), which results in blurry means
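As a reminder of why the bound exists: the ELBO is a reconstruction term minus a KL term. A minimal numpy sketch (assuming a diagonal-Gaussian encoder and a unit-variance Gaussian decoder; the function names are my own):

```python
import numpy as np

def gaussian_kl(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def elbo(x, x_recon, mu, logvar):
    # Unit-variance Gaussian decoder => reconstruction term is a scaled L2 error.
    # The L2 term is exactly what pushes reconstructions toward blurry means.
    recon = -0.5 * np.sum((x - x_recon) ** 2)
    return recon - gaussian_kl(mu, logvar)
```

With a perfect reconstruction and a posterior equal to the prior, the ELBO is 0; any mismatch only lowers it, so maximizing it maximizes a lower bound on log p(x).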

Normalizing flow: exact log p(x), but requires a restricted (invertible) architecture

Diffusion: learns the score function \nabla_x \log p_t(x) (zero divergence)

QUESTION: what is the ODE in the first slide

QUESTION: connecting score with noise

QUESTION: revert by tracing

Deterministic probability-flow ODE (Yang Song et al.): dx = \left[f(x, t) - \frac{1}{2} g(t)^2 \nabla_x \log p_t(x)\right] dt # QUESTION: why is this the function
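A toy sketch of integrating this ODE backward with Euler steps, in a 1-D case where the score is known in closed form (data ~ N(0,1) plus VE noise sigma(t) = t, so p_t = N(0, 1 + t^2) and g(t)^2 = d(sigma^2)/dt = 2t; function names are my own):

```python
import numpy as np

def score(x, t):
    # Analytic score of p_t = N(0, 1 + t^2)
    return -x / (1.0 + t**2)

def probability_flow_euler(x_T, T=10.0, n_steps=1000):
    # Integrate dx = -1/2 g(t)^2 * score(x, t) dt from t = T down to t ~ 0,
    # with g(t)^2 = 2t for the VE schedule sigma(t) = t (and f = 0).
    x, dt = x_T, -T / n_steps
    for i in range(n_steps):
        t = T + i * dt
        x = x + (-0.5) * (2.0 * t) * score(x, t) * dt
    return x
```

The exact solution here is x(0) = x(T) / sqrt(1 + T^2), so starting from x(T) = sqrt(101) the trajectory should land near 1 — deterministic denoising with no sampling noise.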

Langevin dynamics: add noise at the end of each step # QUESTION: does the noise correct numerical error? why Langevin
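A minimal sketch of the Langevin update x_{k+1} = x_k + step * score(x_k) + sqrt(2 * step) * noise — the injected noise is what makes the chain sample the full distribution rather than collapse to a mode (toy 1-D target N(0,1) with its exact score; names are my own):

```python
import numpy as np

def langevin_sample(score, x0, step=0.01, n_steps=5000, seed=0):
    # Unadjusted Langevin dynamics: gradient step on log-density + Gaussian noise
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * score(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x

# Target N(0, 1) has score(x) = -x; start far away at x = 5
samples = langevin_sample(lambda x: -x, np.full(2000, 5.0))
```

After enough steps the chain forgets its initialization and the samples match the target's mean and standard deviation.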

QUESTION: do we fix t = 1000 when warping t?

Heun step: higher-order method vs. plain Euler steps??
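The difference can be sketched on a simple ODE (dx/dt = -x, exact solution e^{-t}): Heun takes an Euler predictor step, then averages the slopes at both ends, giving second-order accuracy (a generic numerical-methods sketch, not a specific sampler implementation):

```python
import numpy as np

def euler_step(f, x, t, dt):
    return x + dt * f(x, t)

def heun_step(f, x, t, dt):
    # Predictor (Euler), then trapezoidal corrector: 2nd-order accurate
    x_pred = x + dt * f(x, t)
    return x + 0.5 * dt * (f(x, t) + f(x_pred, t + dt))

def integrate(step, f, x0, t0, t1, n):
    x, dt = x0, (t1 - t0) / n
    for i in range(n):
        x = step(f, x, t0 + i * dt, dt)
    return x

f = lambda x, t: -x                      # exact solution: exp(-t)
exact = np.exp(-1.0)
err_euler = abs(integrate(euler_step, f, 1.0, 0.0, 1.0, 20) - exact)
err_heun = abs(integrate(heun_step, f, 1.0, 0.0, 1.0, 20) - exact)
```

The payoff in diffusion samplers is fewer function evaluations for the same accuracy, at two score evaluations per step.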

cascaded diffusion models

QUESTION: why resampling and out-of-distribution VAE? because the VAE is trained separately? VQ-GAN, VQ-VAE

QUESTION: DeepFloyd is pixel-space image diffusion (3 cascaded models); LDM is just easier to code

QUESTION: why does in-painting need a separately trained model?

Basics of GAN

Generative models are hard because the loss function is hard to specify

L2 loss averages over modes (e.g. colors), so it can't generate sharp, colorful outputs
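A tiny numpy demonstration of the mode-averaging effect: if the same input can map to two equally likely "colors" (-1 and +1), the L2-optimal prediction is their blurry average 0, not either valid output:

```python
import numpy as np

# Bimodal targets: half the examples are -1, half are +1
targets = np.array([-1.0, 1.0] * 500)

# Sweep constant predictions and pick the one minimizing mean squared error
candidates = np.linspace(-1.5, 1.5, 301)
l2 = [np.mean((targets - c) ** 2) for c in candidates]
best = candidates[int(np.argmin(l2))]
```

`best` comes out at 0 — the mean of the two modes — which is exactly why pure L2 regression produces gray, blurry images.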

classification loss

perceptual loss

Measures of distance between distributions:

FID (a W2 distance between Gaussian fits of Inception features) -> GitHub: clean-fid
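The underlying formula is the Fréchet (W2) distance between two Gaussians: ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 (cov1 cov2)^{1/2}). A self-contained numpy sketch (not the clean-fid implementation; the matrix square root is computed via the symmetric form to stay eigendecomposition-friendly):

```python
import numpy as np

def sqrtm_psd(A):
    # Matrix square root of a symmetric PSD matrix via eigendecomposition
    w, v = np.linalg.eigh(A)
    return (v * np.sqrt(np.clip(w, 0, None))) @ v.T

def frechet_distance(mu1, cov1, mu2, cov2):
    # ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 (cov1 cov2)^{1/2});
    # Tr((cov1 cov2)^{1/2}) is computed as Tr((cov2^{1/2} cov1 cov2^{1/2})^{1/2})
    s2 = sqrtm_psd(cov2)
    covmean = sqrtm_psd(s2 @ cov1 @ s2)
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(cov1 + cov2 - 2 * covmean))
```

In real FID the moments are estimated from Inception-v3 activations of real vs. generated images; clean-fid standardizes the image resizing that otherwise makes scores incomparable.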

JSD (symmetric, but worse than W2: it saturates for disjoint supports)
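Why "worse than W2": for distributions with disjoint supports, JSD is stuck at log 2 no matter how far apart they are, so it gives no useful gradient, while W2 keeps growing with the distance. A discrete numpy sketch:

```python
import numpy as np

def jsd(p, q):
    # Jensen-Shannon divergence between discrete distributions (in nats)
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Two disjoint point masses on a grid
grid = np.arange(10)
p = (grid == 0).astype(float)
q_near = (grid == 1).astype(float)   # 1 step away
q_far = (grid == 9).astype(float)    # 9 steps away
```

`jsd(p, q_near)` and `jsd(p, q_far)` are both exactly log 2, whereas the W2 distances would be 1 and 9.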

KL

Loss in practice: don't use the original classification (cross-entropy) loss. Hinge loss or least-squares loss + R1 penalty work well, and W2 is great. Paper: "Are GANs Created Equal? A Large-Scale Study"
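A numpy sketch of two of these pieces — the discriminator hinge loss and the R1 penalty (squared gradient norm of D at real samples). Autograd is avoided by using a linear discriminator whose input gradient is just its weight vector; function names are my own:

```python
import numpy as np

def d_hinge_loss(real_logits, fake_logits):
    # Hinge GAN loss for the discriminator:
    # penalize real logits below +1 and fake logits above -1
    return (np.mean(np.maximum(0.0, 1.0 - real_logits))
            + np.mean(np.maximum(0.0, 1.0 + fake_logits)))

def r1_penalty(grad_real, gamma=10.0):
    # R1: 0.5 * gamma * E[ ||grad_x D(x)||^2 ] over REAL samples only
    return 0.5 * gamma * np.mean(np.sum(grad_real**2, axis=1))

# Linear discriminator D(x) = x @ w, so grad_x D = w for every sample
w = np.array([0.3, -0.4])
x_real = np.random.default_rng(0).standard_normal((8, 2))
grad_real = np.tile(w, (8, 1))
total = d_hinge_loss(x_real @ w, np.zeros(8)) + r1_penalty(grad_real)
```

In a real training loop `grad_real` would come from autograd; the R1 term regularizes D around the data manifold and is what stabilizes the non-saturating/hinge losses in practice.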

StyleGAN: AdaIN replaces Batch Normalization; nowadays people use cross-attention instead
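A sketch of AdaIN on NCHW feature maps: instance-normalize each (sample, channel) map over its spatial dimensions, then re-scale and shift with style-derived parameters (in StyleGAN these come from a learned affine map of the style vector w; here they are passed in directly):

```python
import numpy as np

def adain(x, style_scale, style_bias, eps=1e-5):
    # x: (N, C, H, W); style_scale, style_bias: (N, C)
    mu = x.mean(axis=(2, 3), keepdims=True)       # per-(sample, channel) mean
    sigma = x.std(axis=(2, 3), keepdims=True)     # per-(sample, channel) std
    x_norm = (x - mu) / (sigma + eps)             # instance normalization
    return style_scale[:, :, None, None] * x_norm + style_bias[:, :, None, None]

x = np.random.default_rng(1).standard_normal((2, 3, 4, 4))
out = adain(x, 2.0 * np.ones((2, 3)), np.ones((2, 3)))
```

Unlike BatchNorm, the statistics never mix across the batch, so each sample's style fully controls its own per-channel mean and variance.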

QUESTION: how to compute FID for a conditional model

QUESTION: is the upsampler in diffusion a GAN?
Transformers aren't directly compatible with GANs (L2 attention is used instead); GigaGAN uses filter banks