Toolbox

T1: Total Variation Distance

This post contains 749 words · 3 min read · by Xianbin

Total variation distance is useful in many areas of probability. This post collects some of its basic properties.

Definition

Consider two distributions $\mu, \nu$ (probability measures on $E$). The total variation distance between $\mu$ and $\nu$ is defined as follows.

$$\lVert \mu - \nu \rVert_{\text{tvd}} = \sup_{A \subseteq E} \lvert \mu(A) - \nu(A) \rvert$$

Since $\sum_{x\in E} \mu(x) = \sum_{x\in E} \nu(x) = 1$, we have

$$\sum_{x :\, \mu(x) \geq \nu(x)} \big(\mu(x) - \nu(x)\big) = \sum_{x :\, \nu(x) > \mu(x)} \big(\nu(x) - \mu(x)\big),$$

and hence

$$\lVert \mu - \nu \rVert_{\text{tvd}} = \frac{1}{2} \sum_{x\in E} \lvert \mu(x) - \nu(x) \rvert.$$
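As a quick sanity check, the supremum definition and the half-$\ell_1$ formula can be compared numerically. This is a minimal Python sketch; the two distributions over $E = \{0, 1, 2\}$ are made up for illustration:

```python
from itertools import chain, combinations

# Two made-up distributions on E = {0, 1, 2}
mu = {0: 0.5, 1: 0.3, 2: 0.2}
nu = {0: 0.2, 1: 0.3, 2: 0.5}
E = list(mu)

# Definition: sup over all subsets A of E of |mu(A) - nu(A)|
subsets = chain.from_iterable(combinations(E, r) for r in range(len(E) + 1))
tvd_sup = max(abs(sum(mu[x] for x in A) - sum(nu[x] for x in A))
              for A in subsets)

# Identity: half of the l1 distance between the point masses
tvd_l1 = 0.5 * sum(abs(mu[x] - nu[x]) for x in E)

print(tvd_sup, tvd_l1)  # both are (up to float rounding) 0.3
```

Brute-forcing all $2^{|E|}$ subsets is only feasible for tiny $E$, which is exactly why the half-$\ell_1$ formula is the one used in practice.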

Let $B := \{x : \mu(x) \geq \nu(x)\}$. Note that $A = B$ attains the supremum in the definition, so $\mu(B) - \nu(B) = \lVert \mu - \nu \rVert_{\text{tvd}}$.

Let $X \sim \mu$ and $Y \sim \nu$.

$$\mathbb{P}(X \neq Y) \geq \mathbb{P}(X \in B,\, Y \in \bar B) = \mathbb{P}(X \in B) - \mathbb{P}(X \in B,\, Y \in B) \geq \mathbb{P}(X \in B) - \mathbb{P}(Y \in B) = \mu(B) - \nu(B) = \lVert \mu - \nu \rVert_{\text{tvd}}.$$

So for any coupling $(X, Y)$ of $\mu$ and $\nu$, $\mathbb{P}(X \neq Y) \geq \lVert \mu - \nu \rVert_{\text{tvd}}$.
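The coupling inequality can be checked empirically: drawing $X$ and $Y$ independently is one particular coupling, so the mismatch probability $\mathbb{P}(X \neq Y)$ should come out at least as large as the total variation distance. A hedged Python sketch, reusing the same made-up distributions as above:

```python
import random

random.seed(0)

# The same made-up distributions on E = {0, 1, 2}
mu = {0: 0.5, 1: 0.3, 2: 0.2}
nu = {0: 0.2, 1: 0.3, 2: 0.5}

# Half-l1 formula for the total variation distance (= 0.3 here)
tvd = 0.5 * sum(abs(mu[x] - nu[x]) for x in mu)

# Sample (X, Y) from the *independent* coupling of mu and nu
n = 100_000
xs = random.choices(list(mu), weights=list(mu.values()), k=n)
ys = random.choices(list(nu), weights=list(nu.values()), k=n)
mismatch = sum(x != y for x, y in zip(xs, ys)) / n

# The bound P(X != Y) >= ||mu - nu||_tvd holds for any coupling
print(mismatch, ">=", tvd)
```

For the independent coupling, $\mathbb{P}(X \neq Y) = 1 - \sum_x \mu(x)\nu(x) = 0.71$ here, comfortably above the lower bound of $0.3$; a maximal coupling would drive the mismatch probability down to exactly the total variation distance.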