Deriving adversarial training from the variational form of $f$-divergences
In the previous post, we defined the general principle of generative modeling as learning a model distribution $P_\theta$ that approximates the true data distribution $P_X$ by minimizing a divergence:
\[\theta^{*} = \arg\min_\theta D(P_X \,\|\, P_\theta).\]But in practice, both $p_X(x)$ and $p_\theta(x)$ are intractable.
We only have samples: a dataset $x_1, \dots, x_n$ drawn from $P_X$, and samples we can generate from $P_\theta$.
So, how can we minimize a divergence without knowing either density?
Let’s start with a basic idea from probability theory: by the law of large numbers, the empirical average of $h(x)$ over i.i.d. samples $x_1, \dots, x_n \sim P_X$ converges to the expectation $\mathbb{E}_{x\sim P_X}[h(x)]$. This means integrals over $p_X$ can be approximated by finite sample averages:
\[\int h(x)p_X(x)\,dx \;\approx\; \frac{1}{n}\sum_{i=1}^n h(x_i).\]This is the foundation of Monte Carlo estimation used across all generative learning paradigms (GANs, VAEs, diffusion models), which optimize expectations rather than explicit densities.
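A minimal numeric sketch of this idea, using a standard normal as a stand-in for $P_X$ and $h(x) = x^2$ (whose true expectation under $\mathcal{N}(0,1)$ is $1$):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw n i.i.d. samples from P_X (here, a standard normal as a stand-in for data).
n = 100_000
samples = rng.standard_normal(n)

# Monte Carlo estimate of E[h(x)] for h(x) = x**2; the true value is Var(x) = 1.
h = lambda x: x ** 2
estimate = np.mean(h(samples))
print(estimate)  # close to 1.0
```

No density ever appears: the estimate uses only the samples, which is exactly the property the variational reformulation below will exploit.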
Recall that for a convex function $f:\mathbb{R}_{+} \to \mathbb{R}$ with $f(1)=0$,
\[D_f(P_X\|P_\theta) = \int p_\theta(x)\,f\!\left(\frac{p_X(x)}{p_\theta(x)}\right)\,dx.\]Since this involves unknown densities, we need a reformulation that depends only on expectations — quantities we can estimate from samples.
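As a concrete instance of this definition, taking $f(u) = u\log u$ (convex, with $f(1)=0$) recovers the KL divergence. A quick numeric check on two small discrete distributions (the specific probability values are arbitrary, chosen only for illustration):

```python
import numpy as np

# KL divergence is the f-divergence with f(u) = u * log(u).
f = lambda u: u * np.log(u)

# Two discrete distributions standing in for p_X and p_theta.
p_x     = np.array([0.5, 0.3, 0.2])
p_theta = np.array([0.4, 0.4, 0.2])

# D_f(P_X || P_theta) = sum_x p_theta(x) * f(p_X(x) / p_theta(x))
d_f = np.sum(p_theta * f(p_x / p_theta))

# Direct KL: sum_x p_X(x) * log(p_X(x) / p_theta(x))
kl = np.sum(p_x * np.log(p_x / p_theta))
print(d_f, kl)  # the two agree
```

Note that computing $D_f$ this way required both densities explicitly, which is precisely what we do not have in practice.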
Define the convex conjugate $f^*(t) = \sup_u \big(ut - f(u)\big)$. By the Fenchel–Young inequality, \(ut \le f(u) + f^*(t), \quad \forall\,u,t,\) with equality when $t = f'(u)$.
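Continuing the KL example: for $f(u) = u\log u$ one can solve the supremum in closed form to get $f^*(t) = e^{t-1}$. A sketch verifying the inequality, and that equality holds exactly at $t = f'(u) = \log u + 1$:

```python
import numpy as np

# For f(u) = u*log(u), the conjugate is f*(t) = sup_u (u*t - f(u)) = exp(t - 1).
f      = lambda u: u * np.log(u)
f_star = lambda t: np.exp(t - 1)

u = 2.0
# Fenchel–Young: u*t <= f(u) + f*(t) for every t ...
for t in [-1.0, 0.0, 1.0, 2.0]:
    assert u * t <= f(u) + f_star(t) + 1e-12

# ... with equality at t = f'(u) = log(u) + 1.
t_star = np.log(u) + 1
gap = f(u) + f_star(t_star) - u * t_star
print(gap)  # ~0
```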
We apply the conjugate in its dual form, $f(u) = \sup_t \big(ut - f^*(t)\big)$, to rewrite $f$ inside the divergence:
\[D_f(P_X\|P_\theta) = \int p_\theta(x)\,\sup_t\Big(t\,\frac{p_X(x)}{p_\theta(x)} - f^*(t)\Big)\,dx.\]Replacing the pointwise supremum with a single test function $T(x)$ can only decrease the value, which yields the variational representation:
\[\boxed{ D_f(P_X\|P_\theta) \ge \sup_T \Big(\mathbb{E}_{x\sim P_X}[T(x)] - \mathbb{E}_{x\sim P_\theta}[f^*(T(x))]\Big). }\]The inequality appears because the space of test functions $T$ we optimize over may not include the exact optimum $T^*(x)=f'\!\big(\frac{p_X(x)}{p_\theta(x)}\big)$.
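Crucially, the right-hand side involves only expectations, so it can be estimated from samples alone via Monte Carlo. A minimal sketch on the discrete KL example (densities are used here only to construct the optimal $T^*$ for checking tightness; the bound itself is computed purely from samples, as a neural discriminator would in an f-GAN):

```python
import numpy as np

rng = np.random.default_rng(0)

# KL as an f-divergence: f(u) = u*log(u), with conjugate f*(t) = exp(t - 1).
f_star = lambda t: np.exp(t - 1)

# Discrete toy distributions over the support {0, 1, 2}.
p_x     = np.array([0.5, 0.3, 0.2])
p_theta = np.array([0.4, 0.4, 0.2])

n = 200_000
x_data  = rng.choice(3, size=n, p=p_x)      # "real" samples from P_X
x_model = rng.choice(3, size=n, p=p_theta)  # "generated" samples from P_theta

# Monte Carlo estimate of the variational bound for a test function T
# (here just a vector of values over the support).
def bound(T):
    return np.mean(T[x_data]) - np.mean(f_star(T[x_model]))

# An arbitrary suboptimal T gives a strict lower bound ...
T_bad = np.zeros(3)
# ... while the optimal T*(x) = f'(p_X/p_theta) = log(p_X/p_theta) + 1 is tight.
T_opt = np.log(p_x / p_theta) + 1

kl = np.sum(p_x * np.log(p_x / p_theta))
print(bound(T_bad), bound(T_opt), kl)  # bound(T_opt) ≈ kl
```

Adversarial training replaces the exhaustive sup over $T$ with an inner maximization over a parametric family (the discriminator), while the generator minimizes the resulting bound.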