Toolbox

T1: Discrepancy (1)

by Xianbin

Discrepancy is an important technique for proving lower bounds in communication complexity.

Definitions

We give the definition with respect to communication complexity.

Let $f: X \times Y \to \{0,1\}$ be a function, let $\mu$ be a distribution over $X \times Y$, and let $R \subseteq X \times Y$ be a rectangle. The discrepancy of $f$ on $R$ with respect to $\mu$ is

$$\mathrm{disc}_\mu(f, R) = \bigl\lvert \mu(R\cap f^{-1}(0)) - \mu(R\cap f^{-1}(1)) \bigr\rvert$$

Roughly speaking, it measures the imbalance between the $\mu$-mass of zeros and the $\mu$-mass of ones inside the rectangle $R$.

We define

$$\mathrm{disc}_\mu(f) = \max_{R} \mathrm{disc}_\mu(f,R)$$
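To make the definition concrete, here is a minimal sketch in Python (assuming only the standard library; the 2-bit equality function and the uniform distribution are illustrative choices, not from the post) that brute-forces $\mathrm{disc}_\mu(f)$ over all rectangles.

```python
from itertools import product

# Toy example: X = Y = {0,1}^2, f = 2-bit equality, mu = uniform distribution.
n = 2
X = Y = list(product([0, 1], repeat=n))
f = lambda x, y: 1 if x == y else 0
u = 1 / (len(X) * len(Y))                      # uniform mass of each pair (x, y)

def disc_rect(A, B):
    """disc_mu(f, A x B) = |mu(R ∩ f^-1(0)) - mu(R ∩ f^-1(1))|."""
    zeros = sum(u for x in A for y in B if f(x, y) == 0)
    ones  = sum(u for x in A for y in B if f(x, y) == 1)
    return abs(zeros - ones)

def powerset(S):
    """All 2^|S| subsets of S, as lists."""
    return [[s for s, keep in zip(S, bits) if keep]
            for bits in product([0, 1], repeat=len(S))]

# disc_mu(f) = max over all rectangles A x B (feasible only for tiny inputs).
disc = max(disc_rect(A, B) for A in powerset(X) for B in powerset(Y))
print("disc_mu(f) =", disc)                    # 0.5 for 2-bit equality
```

Enumerating all $2^{|X|}\cdot 2^{|Y|}$ rectangles is of course only feasible for tiny inputs; the point is just to see the definition in action.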

$\textbf{Theorem}$. $D^\epsilon_\mu(f) \geq \log \frac{1-2\epsilon}{\mathrm{disc}_\mu(f)}$.

$\textbf{Proof}$. Let us pause for a moment: how would we prove such a statement?

Suppose $D^\epsilon_\mu(f) = c$. Then we need to prove that

$$\mathrm{disc}_\mu(f) \geq \frac{1-2\epsilon}{2^c}$$

That is, we need to show that

$$\max_{R} \bigl\lvert \mu(R\cap f^{-1}(0)) - \mu(R\cap f^{-1}(1)) \bigr\rvert \geq \frac{1-2\epsilon}{2^c}$$

We only need to bound $\mathrm{disc}_\mu(f)$ using the fact that an optimal protocol achieving $D^\epsilon_\mu(f)$ has error at most $\epsilon$ over $\mu$. Recall that a deterministic protocol $\Pi$ has error $\epsilon$ with respect to $\mu$ if

$$P[f(x,y) = \Pi(x,y)] \geq 1-\epsilon$$

and

$$P[f(x,y) \neq \Pi(x,y)] \leq \epsilon$$

where the probability is taken over $(x,y) \sim \mu$.

Therefore, the gap between the probability of being correct and the probability of being wrong satisfies

$$\bigl\lvert P[f(x,y) = \Pi(x,y)] - P[f(x,y) \neq \Pi(x,y)] \bigr\rvert \geq 1-2\epsilon$$

Since the protocol $\Pi$ communicates at most $c$ bits, it partitions $X \times Y$ into $t \leq 2^c$ rectangles $R_1, \ldots, R_t$, and $\Pi$ outputs a fixed value on each rectangle. Hence

$$P[\Pi(x,y) = f(x,y)] = \sum_{i=1}^{t}P[(x,y)\in R_i \land f(x,y) = e_i]$$

where $e_i \in \{0,1\}$ is the value that $\Pi$ outputs on the rectangle $R_i$.

Then, we obtain that

$$\Bigl\lvert \sum_{i=1}^{t}P[(x,y)\in R_i \land f(x,y) = e_i] - \sum_{i=1}^{t}P[(x,y)\in R_i \land f(x,y) = 1-e_i]\Bigr\rvert \geq 1-2\epsilon$$

By the triangle inequality and averaging, there exists an index $i$ such that

$$\bigl\lvert P[(x,y)\in R_i \land f(x,y) = e_i] - P[(x,y)\in R_i \land f(x,y) = 1-e_i]\bigr\rvert \geq \frac{1-2\epsilon}{t} \geq \frac{1-2\epsilon}{2^c}$$

Since $(x,y)$ is distributed according to $\mu$, this means

$$\bigl\lvert \mu(R_i\cap f^{-1}(e_i)) - \mu(R_i\cap f^{-1}(1-e_i)) \bigr\rvert \geq \frac{1-2\epsilon}{2^c}$$

Since $\{e_i, 1-e_i\} = \{0,1\}$,

$$\bigl\lvert \mu(R_i\cap f^{-1}(e_i)) - \mu(R_i\cap f^{-1}(1-e_i)) \bigr\rvert = \bigl\lvert \mu(R_i\cap f^{-1}(0)) - \mu(R_i\cap f^{-1}(1)) \bigr\rvert \geq \frac{1-2\epsilon}{2^c}$$

Hence $\mathrm{disc}_\mu(f) \geq \mathrm{disc}_\mu(f, R_i) \geq \frac{1-2\epsilon}{2^c}$, which completes the proof. $\blacksquare$
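As a sanity check on the statement, rather than part of the proof, the sketch below (same toy setup as before: 2-bit equality under the uniform distribution, with an illustrative $\epsilon = 1/8$) plugs a brute-forced $\mathrm{disc}_\mu(f)$ into the bound $\log\frac{1-2\epsilon}{\mathrm{disc}_\mu(f)}$, with the logarithm taken base 2.

```python
from itertools import product
from math import log2

# Toy setup: 2-bit equality under the uniform distribution.
n = 2
X = Y = list(product([0, 1], repeat=n))
f = lambda x, y: 1 if x == y else 0
u = 1 / (len(X) * len(Y))           # uniform mass of each input pair

def disc_rect(A, B):
    zeros = sum(u for x in A for y in B if f(x, y) == 0)
    ones  = sum(u for x in A for y in B if f(x, y) == 1)
    return abs(zeros - ones)

def powerset(S):
    return [[s for s, keep in zip(S, bits) if keep]
            for bits in product([0, 1], repeat=len(S))]

disc = max(disc_rect(A, B) for A in powerset(X) for B in powerset(Y))

eps = 1 / 8                         # illustrative error parameter
lower_bound = log2((1 - 2 * eps) / disc)
print(f"disc_mu(f) = {disc}, theorem gives D >= {lower_bound:.3f} bits")
```

For equality the discrepancy is a constant, so the method only gives a trivial bound here; the technique is powerful for functions whose discrepancy is exponentially small, such as inner product.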

The Eigenvalue Method

Let us introduce the sign matrix.

$$M_f[x,y] = (-1)^{f(x,y)} = \begin{cases} -1 & f(x,y) = 1;\\ 1 & f(x,y) = 0. \end{cases}$$

When $f$ is symmetric (i.e., $f(x,y) = f(y,x)$), the matrix $M_f$ is symmetric, which gives the following property:

  1. The spectral norm $\lVert M_f \rVert$ is the largest absolute value of an eigenvalue of $M_f$ (see the numerical sketch below).
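A minimal numerical sketch of this property (assuming NumPy; the inner-product function is only a convenient symmetric example, not from the post):

```python
import numpy as np
from itertools import product

# Toy symmetric function: inner product mod 2, f(x, y) = <x, y> mod 2,
# which indeed satisfies f(x, y) = f(y, x).
n = 3
inputs = list(product([0, 1], repeat=n))
N = len(inputs)                                   # N = 2^n

M = np.array([[(-1) ** (sum(a * b for a, b in zip(x, y)) % 2)
               for y in inputs] for x in inputs], dtype=float)   # sign matrix M_f

eigenvalues = np.linalg.eigvalsh(M)               # real eigenvalues (M is symmetric)
spectral_norm = np.linalg.norm(M, 2)              # largest singular value of M
print(np.isclose(spectral_norm, np.abs(eigenvalues).max()))      # prints True
```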

$\textbf{Lemma}$. Let $f$ be a symmetric Boolean function, i.e., $f(x,y) = f(y,x)$. Then for every rectangle $A \times B$,

$$\mathrm{disc}(f, A\times B) \leq 2^{-2n}\lambda_{\max}\sqrt{\lvert A \rvert \cdot \lvert B \rvert},$$

where $n = \lvert x \rvert = \lvert y \rvert$ is the input length, $\lambda_{\max}$ is the largest absolute value of an eigenvalue of $M_f$, and the discrepancy is taken with respect to the uniform distribution over $X \times Y$.

$\textbf{Proof}$.

(Illustration: the rectangle $A \times B$ picks out the submatrix of $M_f$ whose rows are indexed by $A$ and whose columns are indexed by $B$.)

Let $\mathbf{1}_A, \mathbf{1}_B \in \{0,1\}^{2^n}$ be the characteristic vectors of the sets $A$ and $B$. Intuitively, $\mathbf{1}_A$ (resp. $\mathbf{1}_B$) indicates which rows (resp. columns) belong to the rectangle.

Then, we have

$$\sum_{a\in A,\, b\in B} M_f[a,b] = \mathbf{1}_A^\intercal M_f \mathbf{1}_B$$
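A quick numerical check of this identity (a sketch assuming NumPy; the random matrix and the index sets $A$ and $B$ are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8                                     # stand-in for the 2^n rows/columns
M = rng.choice([-1, 1], size=(N, N))      # stand-in for the sign matrix M_f

A = [0, 2, 5]                             # row indices in A
B = [1, 2, 6, 7]                          # column indices in B
one_A = np.zeros(N); one_A[A] = 1         # characteristic vector of A
one_B = np.zeros(N); one_B[B] = 1         # characteristic vector of B

lhs = sum(M[a, b] for a in A for b in B)  # entrywise sum over the rectangle
rhs = one_A @ M @ one_B                   # bilinear form 1_A^T M_f 1_B
print(np.isclose(lhs, rhs))               # prints True
```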

Note that $\lvert A \rvert = \sum_{a\in A} 1^2 = \lVert \mathbf{1}_A \rVert^2$, and similarly $\lvert B \rvert = \lVert \mathbf{1}_B \rVert^2$.

Since $M_f$ is symmetric, by the spectral theorem its eigenvectors $v_1, \ldots, v_{2^n}$ form an orthonormal basis of $\mathbb{R}^{2^n}$. Let $\lambda_i$ be the corresponding eigenvalues, so that

$$M_f v_i = \lambda_i v_i$$

We can expand the characteristic vectors of $A$ and $B$ in this eigenbasis (if you do not understand this, please see this post):

$$\mathbf{1}_A = \sum_i \alpha_i v_i, \qquad \mathbf{1}_B = \sum_i \beta_i v_i$$

Then, we have

$$\begin{aligned}
2^{2n}\,\mathrm{disc}(f, A\times B) &= \Bigl\lvert \sum_{a\in A,\, b\in B} M_f[a,b] \Bigr\rvert = \bigl\lvert \mathbf{1}_A^\intercal M_f \mathbf{1}_B \bigr\rvert \\
&= \Bigl\lvert \Bigl(\sum_i \alpha_i v_i\Bigr)^\intercal M_f \Bigl(\sum_i \beta_i v_i\Bigr) \Bigr\rvert \\
&= \Bigl\lvert \Bigl(\sum_i \alpha_i v_i\Bigr)^\intercal \Bigl(\sum_i \beta_i \lambda_i v_i\Bigr) \Bigr\rvert \\
&= \Bigl\lvert \sum_i \alpha_i \beta_i \lambda_i \Bigr\rvert \\
&\leq \lambda_{\max} \sum_i \lvert \alpha_i \beta_i \rvert
\end{aligned}$$

By Parseval’s identity (see this post), we have

$$\sum_i \alpha_i^2 = \lVert \mathbf{1}_A \rVert^2 = \lvert A \rvert, \qquad \sum_i \beta_i^2 = \lVert \mathbf{1}_B \rVert^2 = \lvert B \rvert$$

Then, by the Cauchy–Schwarz inequality,

$$\sum_i \lvert \alpha_i \beta_i \rvert \leq \sqrt{\sum_i \alpha_i^2}\,\sqrt{\sum_i \beta_i^2} = \sqrt{\lvert A \rvert \cdot \lvert B \rvert}$$

Combining the two bounds gives $2^{2n}\,\mathrm{disc}(f, A\times B) \leq \lambda_{\max}\sqrt{\lvert A\rvert \cdot \lvert B\rvert}$, which finishes the proof. $\blacksquare$
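To see the lemma in action, here is a sketch (assuming NumPy; the inner-product function and the randomly chosen sets $A$ and $B$ are illustrative choices) comparing $\mathrm{disc}(f, A\times B)$ under the uniform distribution with the bound $2^{-2n}\lambda_{\max}\sqrt{\lvert A\rvert \cdot \lvert B\rvert}$.

```python
import numpy as np
from itertools import product

# Toy symmetric function: inner product mod 2 on n-bit inputs.
n = 3
inputs = list(product([0, 1], repeat=n))
N = len(inputs)                                    # N = 2^n

M = np.array([[(-1) ** (sum(a * b for a, b in zip(x, y)) % 2)
               for y in inputs] for x in inputs], dtype=float)   # sign matrix M_f
lam_max = np.abs(np.linalg.eigvalsh(M)).max()      # largest |eigenvalue| of M_f

rng = np.random.default_rng(1)
A = np.flatnonzero(rng.integers(0, 2, N))          # random subset of rows
B = np.flatnonzero(rng.integers(0, 2, N))          # random subset of columns

# Under the uniform distribution, disc(f, A x B) = 2^{-2n} |sum of M_f over A x B|.
disc_AB = abs(M[np.ix_(A, B)].sum()) / N ** 2
bound = lam_max * np.sqrt(len(A) * len(B)) / N ** 2   # 2^{-2n} lambda_max sqrt(|A||B|)
print(disc_AB <= bound + 1e-12)                    # prints True
```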

References

[1] Toniann Pitassi. CS 2429 - Foundations of Communication Complexity.

[2] T1: A Simple Question 2: Parseval's Inequality.