Toolbox

T1: Discrepancy (1)

Reminder: This post contains 4188 words · 12 min read · by Xianbin

Discrepancy is a very important technique for lower bound proof in communication complexity.

Definitions

We give the definition with respect to communication complexity.

Given a function \(f: X \times Y \to \{0,1\}\), and let \(\mu\) be a distribution over \(X \times Y\). Let \(R \subseteq X \times Y\) be a rectangle.

\[disc_\mu(f, R) = \lvert \mu(R\cap f^{-1}(0) )- \mu(R\cap f^{-1}(1)\rvert\]

Basically, it means the gap of the number of zeros and ones in the rectangle \(R\).

We define

\[disc_\mu(f) = \max_{R} disc_\mu(f,R)\]

\(\textbf{Theorem}\). \(D^\epsilon_\mu(f) \geq \log \frac{1-2\epsilon}{disc_\mu(f)}\)

\(\textbf{Proof}\). Let us pause for a few moments. How to prove such a thing?

Suppose \(D^\epsilon_\mu(f) = c\), then, we need to prove

\[disc_\mu(f) \geq \frac{1-2\epsilon}{2^c}\]

That is to prove

\[\max_{R} \lvert \mu(R\cap f^{-1}(0) )- \mu(R\cap f^{-1}(1)\rvert \geq \frac{1-2\epsilon}{2^c}\]

We only need to compute the value of \(disc_\mu(f)\) under the condition that the error of the protocol with \(D^\epsilon_\mu(f)\) at most \(\epsilon\). Recall that we say that a deterministic protocol \(\Pi\) has error \(\epsilon\), which means that

\[P[f(x,y) = \Pi(x,y)] \geq 1-\epsilon;\]

and

\[P[f(x,y) \neq \Pi(x,y)] \leq \epsilon;\]

Therefore, the “discrepancy” is,

\[\big \lvert P[f(x,y) = \Pi(x,y)] - P[f(x,y) \neq \Pi(x,y)] \big \rvert \geq 1-2\epsilon\]

Since the length of the protocol \(\Pi\) is \(c\), there are at most \(2^c\) rectangles partitioned in \(X \times Y\).

\[P[\Pi(x,y) = f(x,y)] = \sum_{i=1}^{t}P[(x,y)\in R_i \land f(x,y) = e_i]\]

where \(e_i\in \{0,1\}\).

Then, we obtain that

\[\big \lvert \sum_{i=1}^{t}P[(x,y)\in R_i \land f(x,y) = e_i] - \sum_{i=1}^{t}P[(x,y)\in R_i \land f(x,y) = (1-e_i)]\big \rvert \geq 1-2\epsilon\] \[\big \lvert P[(x,y)\in R_i \land f(x,y) = e_i] - P[(x,y)\in R_i \land f(x,y) = (1-e_i)]\big \rvert \geq \frac{1-2\epsilon}{\sum_{i=1}^{t}} \geq \frac{1-2\epsilon}{2^c}\]

Then,

\[\big \lvert \mu(R_i\cap f^{-1}(e_i)) - \mu(R_i\cap f^{-1}(1-e_i)) \big \rvert \geq \frac{1-2\epsilon}{2^c}\]

Then,

\[\big \lvert \mu(R_i\cap f^{-1}(e_i)) - \mu(R_i\cap f^{-1}(1-e_i)) \big \rvert = \big \lvert \mu(R_i\cap f^{-1}(0)) - \mu(R_i\cap f^{-1}(1)) \big \rvert \geq \frac{1-2\epsilon}{2^c}\]

So, it is proved.

The Eigenvalue Method

Let us introduce the sign matrix.

\[M_f[x,y] = (-1)^{f(x,y)} = \begin{cases} -1 && f(x,y) = 1;\\ 1 && f(x,y) = 0. \end{cases}\]

Because \(S_f\) is symmetric, there are some properties:

  1. \(\lVert M_f \rVert\) is the absolute value fo the largest eigenvalue of \(M_f\).

\(\textbf{Lemma}\). Let \(f\) be a symmetric boolean function such that \(f(x,y) = f(y,x)\), we have

\[\text{disc}(f, A\times B) \leq 2^{-2n}\lambda_{max}\sqrt{\lvert A \rvert \cdot \lvert B \rvert}.\]

where \(n = \lvert x \rvert = \lvert y \rvert\) is the input size, and \(\lambda_{max}\) is the largest eigenvalue of the matrix \(M_f\).

\(\textbf{Proof}\).

\[A \left ( \begin{matrix} 1 & 0 & \cdots & 0\\ 1 & 0 & \cdots & 1\\ \vdots &&&\vdots \\ 1 & 0 & \cdots & 1 \end{matrix} \right)\\ B\]

Let \(\textbf{1}_A, \textbf{1}_B \in \{0,1\}^{2^n}\) be the characteristic vectors of the sets \(A\) and \(B\). Intuitively speaking, \(\textbf{1}_A (\textbf{1}_B)\) indicates which rows (columns) have elements.

Then, we have

\[\sum_{a\in A, b\in B} M_{ab} = \textbf{1}_A^\intercal M_f \textbf{1}_B\]

We define \(\lvert A \rvert = \sum_{a\in A} 1^2\)

Since \(M_f\) is symmetric, by the spectral theorem, its eigenvectors \(v_1, \ldots, v_{2^n}\) are orthonormal basic for \(\mathbb{R}^{2^n}\). Let \(\lambda_i\) be corresponding eigenvalues. We have

\[M_f v_i = \lambda_i v_i\]

Since \(R\) is a rectangle, we can expand the characteristic vectors of \(A\) and \(B\) in the basis: (if you do not understand this, please see this post).

\[\textbf{1}_A = \sum \alpha_i v_i, \\ \textbf{1}_B = \sum \beta_i v_i\]

Then, we have

\[2^{2n}\text{disc}(f, A\times B) = \sum \lvert M_{ab} \rvert = \lvert \textbf{1}_A M_f \textbf{1}_B \rvert \\ =\left \lvert (\sum \alpha_i v_i)^\intercal M_f \sum \beta_i v_i\right\rvert\\ =\left \lvert (\sum \alpha_i v_i)^\intercal \sum \beta_i \lambda_i v_i\right\rvert\\ =\left \lvert \sum \alpha_i \beta_i \lambda_i \right \rvert \\ \leq \lambda_{max} \left \lvert \sum \alpha_i \beta_i \right \rvert\]

By Parseval’s identity (see this post), we have

\[\sum \alpha_i^2 = \lVert \textbf{1}_A \rVert^2 = \lvert A \rvert \\ \sum \beta_i^2 = \lvert B \rvert\]

Then,

\[\left \lvert \sum \alpha_i \beta_i \right \rvert \leq \sqrt{\left \lvert \sum \alpha_i^2\right \rvert} \sqrt{\left \lvert \sum \beta_i^2\right \rvert} = \sqrt{\lvert A \rvert \cdot \lvert B \rvert}\]

So, we finish the proof.

Reference

[1]. CS 2429 - Foundations of Communication Complexity, Toniann Pitassi

[2]. T1: A Simple Question 2: Parseval’s Inequality