by Xianbin
Discrepancy is an important technique for proving lower bounds in communication complexity.
Definitions
We give the definition with respect to communication complexity.
Let \(f: X \times Y \to \{0,1\}\) be a function, let \(\mu\) be a distribution over \(X \times Y\), and let \(R \subseteq X \times Y\) be a rectangle.
\[disc_\mu(f, R) = \lvert \mu(R\cap f^{-1}(0)) - \mu(R\cap f^{-1}(1))\rvert\]
Intuitively, it measures the gap between the \(\mu\)-mass of the zeros and the \(\mu\)-mass of the ones inside the rectangle \(R\).
We define
\[disc_\mu(f) = \max_{R} disc_\mu(f,R)\]
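To make the definition concrete, here is a small brute-force sketch in Python (the function `disc`, the helper `subsets`, and the two-bit equality example are illustrative choices of mine, not from the original notes): it enumerates every rectangle \(A \times B\) of a tiny function and returns the largest discrepancy under a given distribution \(\mu\).

```python
from itertools import chain, combinations

def subsets(universe):
    """All subsets of a finite set (exponentially many -- only for toy examples)."""
    s = list(universe)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def disc(f, X, Y, mu):
    """max over rectangles A x B of |mu(R ∩ f^{-1}(0)) - mu(R ∩ f^{-1}(1))|."""
    best = 0.0
    for A in subsets(X):
        for B in subsets(Y):
            # mu-mass of zeros minus mu-mass of ones inside the rectangle A x B
            gap = sum(mu[(x, y)] * (1 if f(x, y) == 0 else -1)
                      for x in A for y in B)
            best = max(best, abs(gap))
    return best

# Toy example: equality of two 2-bit inputs under the uniform distribution.
X = Y = range(4)
mu = {(x, y): 1.0 / 16 for x in X for y in Y}
eq = lambda x, y: 0 if x == y else 1
print(disc(eq, X, Y, mu))
```

Of course, enumerating all \(2^{\lvert X \rvert + \lvert Y \rvert}\) rectangles is only feasible for toy instances; for actual lower bounds one bounds \(disc_\mu(f)\) analytically, e.g. with the eigenvalue method below.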
\(\textbf{Theorem}\). \(D^\epsilon_\mu(f) \geq \log \frac{1-2\epsilon}{disc_\mu(f)}\)
\(\textbf{Proof}\). Let us pause for a moment: how would we prove such a statement?
Suppose \(D^\epsilon_\mu(f) = c\). Then we need to prove
\[disc_\mu(f) \geq \frac{1-2\epsilon}{2^c}\]
That is, we must show that
\[\max_{R} \lvert \mu(R\cap f^{-1}(0)) - \mu(R\cap f^{-1}(1))\rvert \geq \frac{1-2\epsilon}{2^c}\]
We only need to lower bound \(disc_\mu(f)\), using the fact that a protocol achieving cost \(D^\epsilon_\mu(f) = c\) errs with probability at most \(\epsilon\) over \((x,y) \sim \mu\). Recall that a deterministic protocol \(\Pi\) has error \(\epsilon\) with respect to \(\mu\) if
\[P[f(x,y) = \Pi(x,y)] \geq 1-\epsilon;\]
and
\[P[f(x,y) \neq \Pi(x,y)] \leq \epsilon;\]
Therefore, the “discrepancy” of the protocol satisfies
\[\big \lvert P[f(x,y) = \Pi(x,y)] - P[f(x,y) \neq \Pi(x,y)] \big \rvert \geq 1-2\epsilon\]
Since the protocol \(\Pi\) communicates at most \(c\) bits, its transcripts partition \(X \times Y\) into \(t \leq 2^c\) rectangles \(R_1, \ldots, R_t\), and on each rectangle the protocol outputs a fixed value. Hence
\[P[\Pi(x,y) = f(x,y)] = \sum_{i=1}^{t}P[(x,y)\in R_i \land f(x,y) = e_i]\]
where \(e_i\in \{0,1\}\) is the value that \(\Pi\) outputs on the rectangle \(R_i\).
Then, we obtain that
\[\big \lvert \sum_{i=1}^{t}P[(x,y)\in R_i \land f(x,y) = e_i] - \sum_{i=1}^{t}P[(x,y)\in R_i \land f(x,y) = (1-e_i)]\big \rvert \geq 1-2\epsilon\]
Since there are \(t \leq 2^c\) terms, by the triangle inequality and averaging there must exist some \(i\) such that
\[\big \lvert P[(x,y)\in R_i \land f(x,y) = e_i] - P[(x,y)\in R_i \land f(x,y) = (1-e_i)]\big \rvert \geq \frac{1-2\epsilon}{t} \geq \frac{1-2\epsilon}{2^c}\]
Since \(P[(x,y)\in R_i \land f(x,y) = e_i] = \mu(R_i\cap f^{-1}(e_i))\), this means
\[\big \lvert \mu(R_i\cap f^{-1}(e_i)) - \mu(R_i\cap f^{-1}(1-e_i)) \big \rvert \geq \frac{1-2\epsilon}{2^c}\]
and because \(\{e_i, 1-e_i\} = \{0,1\}\), this is the same as
\[\big \lvert \mu(R_i\cap f^{-1}(0)) - \mu(R_i\cap f^{-1}(1)) \big \rvert \geq \frac{1-2\epsilon}{2^c},\]
so \(disc_\mu(f) \geq \frac{1-2\epsilon}{2^c}\), which is exactly what we needed to prove.
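To see what the theorem buys us: if for some \(\mu\) one can show \(disc_\mu(f) \leq 2^{-n/2}\) (which is exactly what the eigenvalue method below gives for the inner product function \(IP(x,y) = \sum_i x_i y_i \bmod 2\) under the uniform distribution), then
\[D^\epsilon_\mu(f) \geq \log\frac{1-2\epsilon}{2^{-n/2}} = \frac{n}{2} + \log(1-2\epsilon),\]
which is \(\Omega(n)\) for every constant \(\epsilon < 1/2\).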
The Eigenvalue Method
Let us introduce the sign matrix.
\[M_f[x,y] = (-1)^{f(x,y)} =
\begin{cases}
-1 && f(x,y) = 1;\\
1 && f(x,y) = 0.
\end{cases}\]
When \(f\) is symmetric, i.e., \(f(x,y) = f(y,x)\), the matrix \(M_f\) is symmetric, and we will use the following property (checked numerically in the sketch below):
- \(\lVert M_f \rVert\), the spectral norm, equals the largest absolute value of an eigenvalue of \(M_f\).
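This property is easy to check numerically. The short sketch below (the choice of \(f\), inner product on \(n=3\) bits, and all variable names are mine, for illustration only) builds the sign matrix with NumPy and compares its spectral norm with the largest absolute eigenvalue:

```python
import numpy as np

n = 3
N = 2 ** n
# A symmetric Boolean function: inner product mod 2 of the bit strings x and y.
f = lambda x, y: bin(x & y).count("1") % 2

# Sign matrix M_f: +1 where f = 0, -1 where f = 1.
M = np.array([[(-1) ** f(x, y) for y in range(N)] for x in range(N)], dtype=float)

spectral_norm = np.linalg.norm(M, 2)                    # largest singular value
largest_abs_eig = np.abs(np.linalg.eigvalsh(M)).max()   # M is symmetric, so eigvalsh applies
print(spectral_norm, largest_abs_eig)                   # the two values coincide (= 2^{n/2} here)
```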
\(\textbf{Lemma}\). Let \(f\) be a symmetric Boolean function, i.e., \(f(x,y) = f(y,x)\). Then, for every rectangle \(A \times B\), we have
\[\text{disc}(f, A\times B) \leq 2^{-2n}\lambda_{max}\sqrt{\lvert A \rvert \cdot \lvert B \rvert}.\]
where \(n = \lvert x \rvert = \lvert y \rvert\) is the input size, \(\lambda_{max}\) is the largest absolute value of an eigenvalue of \(M_f\), and the discrepancy is measured with respect to the uniform distribution over \(X \times Y\).
\(\textbf{Proof}\).
(Picture the rectangle \(A \times B\) inside the \(2^n \times 2^n\) matrix \(M_f\): \(A\) selects a set of rows and \(B\) selects a set of columns.)
Let \(\textbf{1}_A, \textbf{1}_B \in \{0,1\}^{2^n}\) be the characteristic vectors of the sets \(A\) and \(B\); that is, \(\textbf{1}_A\) (resp. \(\textbf{1}_B\)) indicates which rows (resp. columns) belong to the rectangle.
Then, we have
\[\sum_{a\in A, b\in B} M_f[a,b] = \textbf{1}_A^\intercal M_f \textbf{1}_B\]
Note also that \(\lVert \textbf{1}_A \rVert^2 = \sum_{a\in A} 1 = \lvert A \rvert\), and similarly \(\lVert \textbf{1}_B \rVert^2 = \lvert B \rvert\).
Since \(M_f\) is symmetric, by the spectral theorem its eigenvectors \(v_1, \ldots, v_{2^n}\) form an orthonormal basis of \(\mathbb{R}^{2^n}\). Let \(\lambda_i\) be the corresponding eigenvalues. We have
\[M_f v_i = \lambda_i v_i\]
Since the \(v_i\) form an orthonormal basis, we can expand the characteristic vectors of \(A\) and \(B\) in this basis (if you do not understand this, please see this post):
\[\textbf{1}_A = \sum \alpha_i v_i, \\
\textbf{1}_B = \sum \beta_i v_i\]
Under the uniform distribution every entry of the matrix has mass \(2^{-2n}\), and \(M_f\) equals \(+1\) on \(f^{-1}(0)\) and \(-1\) on \(f^{-1}(1)\), so
\[2^{2n}\text{disc}(f, A\times B) = \Big \lvert \sum_{a\in A, b\in B} M_f[a,b] \Big \rvert = \lvert \textbf{1}_A^\intercal M_f \textbf{1}_B \rvert \\
=\left \lvert \big(\sum_i \alpha_i v_i\big)^\intercal M_f \sum_i \beta_i v_i\right\rvert\\
=\left \lvert \big(\sum_i \alpha_i v_i\big)^\intercal \sum_i \beta_i \lambda_i v_i\right\rvert\\
=\left \lvert \sum_i \alpha_i \beta_i \lambda_i \right \rvert \\
\leq \lambda_{max} \sum_i \lvert \alpha_i \beta_i \rvert,\]
where the last step uses \(\lvert \lambda_i \rvert \leq \lambda_{max}\) together with the triangle inequality.
By Parseval’s identity (see this post), we have
\[\sum \alpha_i^2 = \lVert \textbf{1}_A \rVert^2 = \lvert A \rvert \\
\sum \beta_i^2 = \lvert B \rvert\]
Then, by the Cauchy–Schwarz inequality,
\[\sum_i \lvert \alpha_i \beta_i \rvert \leq \sqrt{\sum_i \alpha_i^2} \sqrt{\sum_i \beta_i^2} = \sqrt{\lvert A \rvert \cdot \lvert B \rvert}\]
Combining the two bounds and dividing by \(2^{2n}\) completes the proof.
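As a sanity check of the lemma (again just a sketch: the function, the value of \(n\), and the random sampling of rectangles are arbitrary illustrative choices), the snippet below computes \(\lambda_{max}\) for the inner product function and verifies the inequality on random rectangles \(A \times B\):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
N = 2 ** n
ip = lambda x, y: bin(x & y).count("1") % 2   # inner product mod 2, a symmetric function
M = np.array([[(-1) ** ip(x, y) for y in range(N)] for x in range(N)], dtype=float)

lam_max = np.abs(np.linalg.eigvalsh(M)).max()  # largest absolute eigenvalue (= 2^{n/2} here)

for _ in range(1000):
    A = rng.random(N) < 0.5                    # indicator vector of a random row set A
    B = rng.random(N) < 0.5                    # indicator vector of a random column set B
    disc_AB = abs(A.astype(float) @ M @ B.astype(float)) / N ** 2
    bound = lam_max * np.sqrt(A.sum() * B.sum()) / N ** 2
    assert disc_AB <= bound + 1e-12            # the lemma's inequality holds
print("lambda_max =", lam_max)
```

Since every eigenvalue of this particular \(M_f\) has absolute value \(2^{n/2}\) and \(\sqrt{\lvert A \rvert \cdot \lvert B \rvert} \leq 2^n\), the lemma gives \(\text{disc}(IP, A \times B) \leq 2^{-2n} \cdot 2^{n/2} \cdot 2^n = 2^{-n/2}\), which, together with the theorem above, yields the \(\Omega(n)\) lower bound mentioned earlier.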
Reference
[1]. CS 2429 - Foundations of Communication Complexity, Toniann Pitassi
[2]. T1: A Simple Question 2: Parseval’s Inequality