Autoencoder: encoder compresses the input into a latent code, decoder reconstructs the input from that code
Denoising Autoencoder: autoencoder, but the input is corrupted (e.g. with noise) and the target is the clean original
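Variational Autoencoder: autoencoder whose encoder outputs a distribution (mean and variance) over latents instead of a single point; trained with a reconstruction loss plus a KL term that pulls the latents toward a prior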
VQ-VAE (Vector Quantization): maps each continuous encoder output to its nearest vector in a learned codebook, i.e. a discrete set of vectors
VQ-GAN: VQ-VAE trained with an added GAN (discriminator) loss, with a Transformer modeling the sequence of codebook indices
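
The vector quantization step in the VQ-VAE entry above can be sketched in a few lines. This is only an illustration of the nearest-codebook lookup, not a full model: the names `quantize` and `squared_distance`, the toy codebook, and the toy encoder outputs are all made up for this example, and the training part (learning the codebook, passing gradients through the lookup) is left out.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    vectors: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Replace each continuous vector with its nearest codebook entry.

    Returns the discrete indices and the corresponding codebook vectors.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for v in vectors:
        # Pick the index of the codebook vector closest to v.
        idx = min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        indices.append(idx)
        quantized.append(list(codebook[idx]))
    return indices, quantized


# Toy data (hypothetical): 3 codebook entries, 3 encoder outputs, dimension 2.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
encoder_outputs = [[0.1, -0.2], [0.9, 1.3], [-0.8, 1.7]]

indices, quantized = quantize(encoder_outputs, codebook)
print(indices)    # discrete codes, one per encoder output
print(quantized)  # the codebook vectors a decoder would receive
```

The list of indices is the discrete representation; in a VQ-GAN-style setup, that index sequence is what the Transformer would model.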
Problem: given a generated image, which training images contributed to its generation, and with what score for each?
Solution: build a dataset by fine-tuning 1000 models with 1000 images, then train a CLIP-like model on that dataset to score training images against a generated image.
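
A rough sketch of what "CLIP-like" could mean here, assuming a CLIP-style symmetric contrastive loss over cosine similarities between generated-image embeddings and training-image embeddings. The notes do not specify the loss, the encoders, or the temperature, so everything below (function names, the 0.07 temperature, the toy 2-D embeddings) is an assumption for illustration only.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity_matrix(
    a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Entry [i][j] is the cosine similarity between a[i] and b[j]."""
    a_n = [normalize(v) for v in a]
    b_n = [normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, v)) for v in b_n] for u in a_n]


def cross_entropy_diagonal(logits: Sequence[Sequence[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_style_loss(
    generated_emb: Sequence[Sequence[float]],
    training_emb: Sequence[Sequence[float]],
    temperature: float = 0.07,  # assumed value, not taken from the notes
) -> float:
    """Symmetric contrastive loss; pair i is (generated_emb[i], training_emb[i])."""
    sims = cosine_similarity_matrix(generated_emb, training_emb)
    logits = [[s / temperature for s in row] for row in sims]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(transposed))


def rank_training_images(
    generated: Sequence[float], training_emb: Sequence[Sequence[float]]
) -> List[int]:
    """Training-image indices sorted by similarity to one generated image."""
    scores = cosine_similarity_matrix([generated], training_emb)[0]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Toy embeddings (hypothetical): matched pairs share a direction.
generated = [[1.0, 0.0], [0.0, 1.0]]
matched_training = [[2.0, 0.0], [0.0, 3.0]]
mismatched_training = [[0.0, 3.0], [2.0, 0.0]]

loss_matched = clip_style_loss(generated, matched_training)
loss_mismatched = clip_style_loss(generated, mismatched_training)
ranking = rank_training_images(generated[0], matched_training)
```

With a loss of this kind, the similarity between a generated image and each training image can be read off as the attribution score, and `rank_training_images` orders the training images by that score.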