Toolbox

T1: Total Variation Distance

Reminder: This post contains 749 words · 3 min read · by Xianbin

Total variation distance is very useful in many areas. This post shows some useful properties.

Definition

Consider two distributions \(\mu, \nu\) (probabilities on \(E\)), the total variation distance between \(\mu\) and \(\nu\) is as follows.

\[\lVert \mu - \nu \rVert_{\text{tvd}}= \textup{sup}_{A \subset E}\lvert \mu(A) - \nu(A) \rvert\]

Since \(\sum_{x\in E} \mu(x) = \sum_{x\in E} \nu(x) = 1\), then we have

\[\sum_{\mu(x) \geq \nu(x)}\mu(x) - \nu(x) = \sum_{\nu(x) \leq \mu(x)}\nu(x) - \mu(x)\] \[\lVert \mu - \nu \rVert_{\text{tvd}} = \frac{1}{2} \sum_{x\in E} \lvert \mu(x) - \nu(x) \rvert\]

Let \(B: = \{x : \mu(x) \geq \nu(x) \}\)

Let \(X\sim \mu, Y \sim \nu\).

\[\mathbb{P}(X \neq Y ) \geq P(X\in B, Y \in \bar B) = \\ \mathbb{P}(X\in B) - \mathbb{P}(X\in B, Y\in B) \geq \\ \mathbb{P}(X\in B) - \mathbb{P}(Y\in B) = \mu(B) - \nu(B) = \lVert \mu - \nu \rVert_{\text{tvd}}\]