by Xianbin · 6 min read
Problem Definition (INDEXING)
Alice has a string $S \in \{0,1\}^n$ while Bob has an index $i \in [n]$. The goal is for Bob to output $S_i$, i.e., the $i$-th element of $S$ (indices start from 1). Communication is one-way: Alice sends a single message to Bob, and Bob does not get to send Alice his index (otherwise the problem is trivial).
It is known that the one-way communication complexity of INDEXING is $\Omega(n)$.
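To make the model concrete, here is a minimal Python sketch of the one-way game (the names `alice_message` and `bob_output` are illustrative, not from the source). It shows the trivial protocol in which Alice simply sends all of $S$, which costs $n$ bits; the lower bound below says that, up to constants, no protocol can do better.

```python
# A minimal sketch of the one-way INDEXING game (names are illustrative).
# Alice sees S in {0,1}^n; Bob sees an index i (1-based) plus the single
# message Alice sent, and must output S_i.

def alice_message(S):
    """Trivial protocol: Alice sends her whole string -- n bits."""
    return tuple(S)

def bob_output(message, i):
    """Bob reads off the i-th bit (indices start from 1)."""
    return message[i - 1]

S = [1, 0, 1, 1, 0]
assert all(bob_output(alice_message(S), i) == S[i - 1] for i in range(1, len(S) + 1))
```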
The First Proof
Imagine the following scenario:
You know $Y$ and want to estimate a correlated variable $X$. What can you do, and what is the probability of error?
Intuitively, if $H(X \mid Y)$ is very small, then knowing $Y$ almost determines $X$, so the error probability should be small.
Fano’s inequality quantifies this observation.
Fano’s Inequality. Given a Markov chain $X_0 \to Y \to X_1$, let $p_e = P(X_1 \neq X_0)$. Then, we have
$$H_2(p_e) + p_e \log\left(|\mathrm{supp}(X_0)| - 1\right) \ge H(X_0 \mid X_1) \ge H(X_0 \mid Y),$$
where $H_2(p_e)$ is the binary entropy, i.e., $H_2(p_e) = p_e \log \frac{1}{p_e} + (1 - p_e) \log \frac{1}{1 - p_e}$.
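As a quick sanity check, here is a small numerical example (my own choice for illustration, not from the source): $X_0$ is a uniform bit, $Y$ is $X_0$ flipped with probability $p$, and $X_1 = Y$ is the maximum-likelihood guess. In this case $p_e = p$ and $H(X_0 \mid Y) = H_2(p)$, so Fano’s bound holds with equality.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Toy Markov chain X0 -> Y -> X1: X0 is a uniform bit, Y flips it with
# probability p, and X1 = Y (the maximum-likelihood estimate when p < 1/2).
p = 0.11
p_e = p                    # P(X1 != X0)
H_X0_given_Y = h2(p)       # for this channel with a uniform input, H(X0 | Y) = H2(p)
support_size = 2           # |supp(X0)|

lhs = h2(p_e) + p_e * math.log2(support_size - 1)
print(lhs, ">=", H_X0_given_Y, lhs >= H_X0_given_Y - 1e-12)
```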
For simplicity, let us consider a uniform distribution on $S \in \{0,1\}^n$.
Let $\Pi$ be the one-way transcript, consisting of $\ell$ bits. Here is a Markov chain: $S \to \Pi \to S'$, where $S$ is the string and $S'$ is Bob’s estimate of $S$ (a vector whose $i$-th entry $S'_i$ is Bob’s answer on index $i$).
The information revealed by $\Pi$ is
$$I(\Pi ; S) = H(S) - H(S \mid \Pi) = n - H(S \mid \Pi),$$
and by the chain rule,
$$H(S \mid \Pi) = \sum_{i=1}^{n} H(S_i \mid S_{<i}, \Pi).$$
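The identity $I(\Pi ; S) = H(S) - H(S \mid \Pi)$ can be checked numerically on a toy distribution. Below is a sketch (the example transcript $\Pi = S_1$ and all function names are mine, chosen only to illustrate the bookkeeping): $S$ is uniform on $\{0,1\}^2$ and the hypothetical one-bit message reveals exactly one bit of information.

```python
import math
from itertools import product
from collections import defaultdict

def entropy(dist):
    """Shannon entropy (in bits) of a distribution given as {outcome: prob}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[keep(outcome)] += p
    return dict(out)

# Toy example: S uniform on {0,1}^2, and a hypothetical one-bit transcript Pi = S_1.
joint = {(s, s[0]): 1 / 4 for s in product((0, 1), repeat=2)}   # {(S, Pi): prob}

H_S_and_Pi = entropy(joint)
H_Pi = entropy(marginal(joint, lambda o: o[1]))
H_S = entropy(marginal(joint, lambda o: o[0]))

H_S_given_Pi = H_S_and_Pi - H_Pi        # H(S | Pi) via the chain rule
I_Pi_S = H_S - H_S_given_Pi             # I(Pi ; S) = H(S) - H(S | Pi)
print(H_S, H_S_given_Pi, I_Pi_S)        # 2.0 1.0 1.0
```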
How to use Fano’s inequality?
From $S \to \Pi \to S'$, we know that, conditioned on $\Pi$, $S$ and $S'$ are independent. In particular, conditioned on $\Pi$, $S_i$ and $S'_i$ are independent. Then, we have a Markov chain $S_i \to \Pi \to S'_i$.
By Fano’s inequality, with $p_e = P(S'_i \neq S_i)$ and $|\mathrm{supp}(S_i)| = 2$, we have
$$H_2(p_e) + p_e \log(2 - 1) \ge H(S_i \mid \Pi).$$
That is,
$$H_2(p_e) \ge H(S_i \mid \Pi).$$
If the protocol errs on each index with probability at most $1/3$, then $p_e \le 1/3$ and hence $H_2(p_e) \le H_2(1/3) < 1$. Then, we have
$$H(S_i \mid \Pi) \le H_2(1/3) < 1.$$
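A quick numerical check of this constant (a one-liner using Python’s standard math module):

```python
import math

h2 = lambda p: -p * math.log2(p) - (1 - p) * math.log2(1 - p)
print(h2(1 / 3))   # ~0.918 < 1, so below we can take c = H2(1/3)
```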
Summing the per-index bound over all $i$, and using the fact that conditioning does not increase entropy,
$$\sum_{i=1}^{n} H(S_i \mid S_{<i}, \Pi) \le \sum_{i=1}^{n} H(S_i \mid \Pi) \le cn,$$
where $c = H_2(1/3) < 1$.
Then, we have
$$H(S \mid \Pi) \le cn.$$
So, we have
$$I(\Pi ; S) = n - H(S \mid \Pi) \ge (1 - c)n = \Omega(n).$$
Also, we know that
$$I(\Pi ; S) \le H(\Pi) \le \sum_{i=1}^{\ell} H(\Pi[i]) \le \ell.$$
Then, we obtain that
$$\ell \ge \Omega(n).$$
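The argument above is information-theoretic and handles randomized protocols. For intuition only, here is a much weaker deterministic, zero-error version of the same phenomenon, written as a small Python sketch (the encoding `enc` and all names are hypothetical): if Alice’s message has fewer than $n$ bits, two distinct strings must collide, and any decoder is wrong on the index where they differ.

```python
from itertools import product

def adversarial_pair(enc, n):
    """Find two strings in {0,1}^n with the same message that differ at some
    index i; any deterministic decoder then answers (message, i) wrongly for
    at least one of them."""
    seen = {}
    for s in product((0, 1), repeat=n):
        msg = enc(s)
        if msg in seen:
            t = seen[msg]
            i = next(j for j in range(n) if s[j] != t[j])
            return s, t, i + 1        # report a 1-based index, as in the text
        seen[msg] = s
    return None  # no collision: enc already uses at least 2^n distinct messages

# Hypothetical 2-bit summary of a 3-bit string: its first bit and its parity.
enc = lambda s: (s[0], sum(s) % 2)
print(adversarial_pair(enc, 3))
```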