Reminder: This post contains 2106 words
· 7 min read
· by Xianbin
The augmented indexing problem is a variant of the indexing problem, with applications to proving lower bounds. This post is based on [1].
Definition
Alice has a string \(S\) while Bob has an index \(j\), the prefix \(S_{<j}\), and a check bit \(c\in \{0,1\}\). The goal is to let Bob output \(S_j\). Namely, Bob needs to learn the \(j\)-th element of \(S\) (indices start from 1) without giving Alice the index (otherwise, the problem is trivial).
Compared to the indexing problem, Bob now additionally knows the prefix of \(S\) before the \(j\)-th element, i.e., \(S_{<j}\).
We use AInd to denote the augmented indexing problem. The goal is to output \(\text{AInd}(S, j, c) := S_j \oplus c\).
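As a minimal sketch, here is the AInd function itself (just the target value, not a communication protocol; the function name and 1-based indexing convention follow the definition above):

```python
def aind(S, j, c):
    """Return AInd(S, j, c) = S_j XOR c, where j is a 1-based index."""
    return S[j - 1] ^ c

# Bob also holds the prefix S_{<j}; it does not change the target value,
# only what a communication protocol may exploit.
S = [1, 0, 1, 1, 0]
print(aind(S, 3, 0))  # S_3 = 1
print(aind(S, 3, 1))  # S_3 XOR 1 = 0
```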
Here, we need to measure how much information Bob learns about Alice's input and how much Alice learns about Bob's. This quantity is called the internal information cost:
\[IC(\Pi) = I(\Pi : A \mid B) + I(\Pi : B \mid A),\]
where \(A\) and \(B\) denote Alice's and Bob's inputs, respectively.
Over a distribution \(\mu\) on \(\{0, 1\}^n \times [n] \times \{0, 1\}\), we use \(IC^A(\Pi)\) to denote Alice's cost, i.e., \(I(\Pi : S \mid S_{<J}, J, C, R)\), and \(IC^B(\Pi)\) to denote Bob's cost, i.e., \(I(\Pi : J, C \mid S, R)\), where \(R\) is the public coin.
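Conditional mutual information terms like those above can be computed by brute force on small joint distributions. The sketch below (function name is my own) does this for a toy one-bit protocol in which the transcript is simply Alice's bit, so the transcript reveals 1 bit about Alice's input and 0 bits about Bob's:

```python
from collections import defaultdict
from math import log2

def cond_mutual_info(joint):
    """I(A; B | C) in bits, from a joint pmf given as {(a, b, c): prob}."""
    p_c, p_ac, p_bc = defaultdict(float), defaultdict(float), defaultdict(float)
    for (a, b, c), p in joint.items():
        p_c[c] += p
        p_ac[(a, c)] += p
        p_bc[(b, c)] += p
    # I(A;B|C) = sum p(a,b,c) log2[ p(a,b,c) p(c) / (p(a,c) p(b,c)) ]
    return sum(p * log2(p * p_c[c] / (p_ac[(a, c)] * p_bc[(b, c)]))
               for (a, b, c), p in joint.items() if p > 0)

# Toy protocol: Alice's bit A and Bob's bit B are independent uniform
# bits, and the transcript Pi equals A.
# I(Pi; A | B): keys are (pi, a, b) with pi = a.
print(cond_mutual_info({(a, a, b): 0.25 for a in (0, 1) for b in (0, 1)}))  # -> 1.0
# I(Pi; B | A): keys are (pi, b, a) with pi = a.
print(cond_mutual_info({(a, b, a): 0.25 for a in (0, 1) for b in (0, 1)}))  # -> 0.0
```

So for this toy protocol the internal information cost is \(1 + 0 = 1\) bit.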
Average Argument
\(\textbf{Lemma 1}\). Consider a family of functions \(f_1, \ldots, f_L: D\to \mathbb{R}^+\) and parameters \(c_1,\ldots,c_L > 0\), where \(L>0\) is an integer and \(D\) is a finite domain. Let \(Z\) be a random variable over \(D\). If
\[\forall i\in [L],\ \mathbb{E}[f_i(Z)] \leq c_i,\]
then there exists \(z\in D\) such that
\[\forall i\in[L],\ f_i(z) \leq L c_i.\]
What does this lemma mean? It means that if a set of functions on a domain all have a small average value, then there exists a single element of the domain at which all of them are simultaneously small.
It is not hard to prove this tiny lemma. Let \(g(z):=\sum_i f_i(z)/c_i\). By linearity of expectation, \(\mathbb{E}[g(Z)]\leq L\), so there must be a \(z\) with \(g(z) \leq L\). Since every term of \(g\) is nonnegative, this gives \(f_i(z)/c_i \leq L\), i.e., \(f_i(z) \leq L c_i\), for every \(i\).
Quite simple but useful. We will see why we will need this lemma soon.
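A quick numeric sanity check of the averaging argument (the setup here is my own: random functions on a small domain, \(Z\) uniform, and \(c_i\) set to the exact averages so the hypothesis holds):

```python
import random

random.seed(0)
L, D = 5, 50
# f[i][z] plays the role of f_i(z); values are arbitrary positive reals.
f = [[random.random() for _ in range(D)] for _ in range(L)]
# c_i = E[f_i(Z)] for Z uniform on D, so E[f_i(Z)] <= c_i trivially holds.
c = [sum(fi) / D for fi in f]

def g(z):
    """The combined function from the proof: g(z) = sum_i f_i(z)/c_i."""
    return sum(f[i][z] / c[i] for i in range(L))

# E[g(Z)] = L, so the minimizer z* satisfies g(z*) <= L, and since every
# term is nonnegative, f_i(z*) <= L * c_i for all i simultaneously.
z_star = min(range(D), key=g)
assert g(z_star) <= L
assert all(f[i][z_star] <= L * c[i] for i in range(L))
print("witness z =", z_star, "satisfies all", L, "bounds")
```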
Theorem 1
\(\textbf{Theorem 1}\). If \(\Pi\) is a randomized protocol for \(\text{AInd}\) with two-sided error at most \(1/\log^2 n\), then either \(IC^A(\Pi) = \Omega(n)\) or \(IC^B(\Pi) = \Omega(1)\).
In other words, either Alice reveals \(\Omega(n)\) bits of information about her input, or Bob reveals \(\Omega(1)\) bits about his.
To be continued…
Reference
[1]. Chakrabarti, A., Cormode, G., Kondapally, R. and McGregor, A., 2013. Information cost tradeoffs for augmented index and streaming language recognition. SIAM Journal on Computing, 42(1), pp.61-83.