Discrepancy is a very important technique for lower bound proof in communication complexity.
Definitions
We give the definition with respect to communication complexity.
Given a function f:X×Y→{0,1}, and let μ be a distribution over X×Y. Let R⊆X×Y be a rectangle.
discμ(f,R)=∣μ(R∩f−1(0))−μ(R∩f−1(1)∣
Basically, it means the gap of the number of zeros and ones in the rectangle R.
We define
discμ(f)=Rmaxdiscμ(f,R)
Theorem. Dμϵ(f)≥logdiscμ(f)1−2ϵ
Proof. Let us pause for a few moments. How to prove such a thing?
Suppose Dμϵ(f)=c, then, we need to prove
discμ(f)≥2c1−2ϵ
That is to prove
Rmax∣μ(R∩f−1(0))−μ(R∩f−1(1)∣≥2c1−2ϵ
We only need to compute the value of discμ(f) under the condition that the error of the protocol with Dμϵ(f) at most ϵ. Recall that we say that a deterministic protocol Π has error ϵ, which means that
P[f(x,y)=Π(x,y)]≥1−ϵ;
and
P[f(x,y)=Π(x,y)]≤ϵ;
Therefore, the “discrepancy” is,
∣∣P[f(x,y)=Π(x,y)]−P[f(x,y)=Π(x,y)]∣∣≥1−2ϵ
Since the length of the protocol Π is c, there are at most 2c rectangles partitioned in X×Y.
Because Sf is symmetric, there are some properties:
∥Mf∥ is the absolute value fo the largest eigenvalue of Mf.
Lemma. Let f be a symmetric boolean function such that f(x,y)=f(y,x), we have
disc(f,A×B)≤2−2nλmax∣A∣⋅∣B∣.
where n=∣x∣=∣y∣ is the input size, and λmax is the largest eigenvalue of the matrix Mf.
Proof.
A⎝⎛11⋮1000⋯⋯⋯01⋮1⎠⎞B
Let 1A,1B∈{0,1}2n be the characteristic vectors of the sets A and B. Intuitively speaking, 1A(1B) indicates which rows (columns) have elements.
Then, we have
a∈A,b∈B∑Mab=1A⊺Mf1B
We define ∣A∣=∑a∈A12
Since Mf is symmetric, by the spectral theorem, its eigenvectors v1,…,v2n are orthonormal basic for R2n. Let λi be corresponding eigenvalues. We have
Mfvi=λivi
Since R is a rectangle, we can expand the characteristic vectors of A and B in the basis: (if you do not understand this, please see this post).