Grocery on Distributed Algorithms

C1: Yao's minimax principle


Terminology

There are two ways to measure the probability that a randomized algorithm (protocol) makes an error: worst-case and average-case.

\(\textbf{worst-case}\). For every input \((x,y)\), we have \(\mathbb{P}_{R,R_a,R_b}[\Pi_{R,R_a,R_b}(x,y) \text{ is false}] \leq \epsilon\), where \(R, R_a, R_b\) denote the public coins, Alice's private coins, and Bob's private coins, respectively.

\(\textbf{average-case}\). Given a distribution \(\mu\) on the inputs, we say that a protocol has error \(\epsilon\) with respect to \(\mu\) if the probability that the protocol errs is at most \(\epsilon\) when the inputs are sampled from \(\mu\), i.e.,

\[\mathbb{P}_{R,R_a,R_b,\,(x,y)\sim\mu}[\Pi_{R,R_a,R_b}(x,y) \text{ is false}] \leq \epsilon.\]
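To make the two definitions concrete, here is a minimal Python sketch (my own illustration, not from any standard library) that estimates both error measures for the classic public-coin inner-product protocol for EQUALITY. The names `eq_protocol` and `error_on_pair` and the parameters `n`, `k` are mine. For this particular protocol the two measures happen to nearly coincide, since every unequal pair errs with the same probability; the point is only how each quantity is measured.

```python
import random

def eq_protocol(x, y, coins):
    """Public-coin protocol for EQUALITY: for each shared random string r,
    Alice sends <x, r> mod 2 and Bob compares it with <y, r> mod 2.
    Output 'equal' iff every check passes."""
    for r in coins:
        if sum(a * b for a, b in zip(x, r)) % 2 != \
           sum(a * b for a, b in zip(y, r)) % 2:
            return False  # a mismatch certifies x != y
    return True           # wrong with prob. <= 2^{-k} when x != y

def error_on_pair(x, y, n, k, trials=20_000):
    """Monte-Carlo estimate of the error probability on one fixed pair."""
    bad = 0
    for _ in range(trials):
        coins = [[random.getrandbits(1) for _ in range(n)] for _ in range(k)]
        if eq_protocol(x, y, coins) != (x == y):
            bad += 1
    return bad / trials

n, k = 8, 2

# Worst-case view: the error must be small for EVERY fixed input pair.
x = [1] * n
y = x[:-1] + [0]  # an unequal pair
print("error on a fixed unequal pair:", error_on_pair(x, y, n, k))

# Average-case view: the inputs themselves are drawn from a distribution mu
# (here mu = uniform over all pairs) before the coins are tossed.
bad, trials = 0, 20_000
for _ in range(trials):
    x = [random.getrandbits(1) for _ in range(n)]
    y = [random.getrandbits(1) for _ in range(n)]
    coins = [[random.getrandbits(1) for _ in range(n)] for _ in range(k)]
    if eq_protocol(x, y, coins) != (x == y):
        bad += 1
print("error under uniform mu:", bad / trials)
```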

Theorem 1

\(\textbf{Theorem 1}\) (Yao’s minimax principle). Let \(f: X\times Y \to Z\) and \(\epsilon > 0\). Then \(\max_{\mu}D_{\mu, \epsilon}(f) = R_{\epsilon}^{pub}(f)\) where the maximum is over all distributions \(\mu\) on \(X \times Y\).

\(\textbf{What does Theorem 1 really say?}\) It says that the worst-case public-coin randomized communication complexity of a function \(f\) with error at most \(\epsilon\) equals the maximum, over all distributions \(\mu\), of the deterministic communication complexity of \(f\) with error at most \(\epsilon\) under \(\mu\).

This suggests a technique for proving lower bounds on randomized protocols: find a hard distribution \(\mu\) and then prove that no deterministic protocol with low communication can compute \(f\) with error at most \(\epsilon\) under \(\mu\).
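In symbols, this recipe is just Theorem 1 read from left to right: to prove \(R_{\epsilon}^{pub}(f) \geq c\), it suffices to exhibit one distribution \(\mu\) with \(D_{\mu,\epsilon}(f) \geq c\), since

\[R_{\epsilon}^{pub}(f) \;=\; \max_{\mu'}D_{\mu', \epsilon}(f) \;\geq\; D_{\mu, \epsilon}(f) \;\geq\; c.\]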

\(\textbf{Proof}.\) To prove the above equality, we only need to prove inequality 1) \(\max_{\mu}D_{\mu, \epsilon}(f) \leq R_{\epsilon}^{pub}(f)\) and inequality 2) \(\max_{\mu}D_{\mu, \epsilon}(f) \geq R_{\epsilon}^{pub}(f)\).

We start with inequality 1). Fix any distribution \(\mu\), and let \(\Pi\) be a public-coin protocol with worst-case error \(\epsilon\) and cost \(R_{\epsilon}^{pub}(f)\). Averaged over the coins, the error of \(\Pi\) under \(\mu\) is at most \(\epsilon\), so some fixing of the coins yields a deterministic protocol of the same cost with error at most \(\epsilon\) under \(\mu\). Hence \(D_{\mu, \epsilon}(f) \leq R_{\epsilon}^{pub}(f)\) for every \(\mu\).
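Spelling out the averaging step: worst-case error \(\epsilon\) means the error probability is at most \(\epsilon\) on every fixed pair, so

\[\mathbb{E}_{R,R_a,R_b}\Big[\Pr_{(x,y)\sim\mu}\big[\Pi_{R,R_a,R_b}(x,y)\text{ is false}\big]\Big] \;=\; \Pr_{R,R_a,R_b,\,(x,y)\sim\mu}\big[\Pi_{R,R_a,R_b}(x,y)\text{ is false}\big] \;\leq\; \epsilon,\]

and a random variable that is at most \(\epsilon\) in expectation takes a value at most \(\epsilon\) under some fixing \((r,r_a,r_b)\) of the coins; \(\Pi_{r,r_a,r_b}\) is the desired deterministic protocol.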

For inequality 2), we need von Neumann's minimax theorem.

Theorem 2

\(\textbf{Theorem 2}\) (von Neumann's minimax theorem). Let \(M\) be an \(m\times n\) matrix with real entries. Let \(R_M\) denote the set of \(1\times m\) vectors with nonnegative entries such that \(\sum_i x_i = 1\), and let \(C_M\) denote the set of \(n\times 1\) vectors with nonnegative entries such that \(\sum_i y_i = 1\). Then, we have

\[\min_{x\in R_M}\max_{y\in C_M}x M y = \max_{y\in C_M}\min_{x\in R_M}xMy.\]
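As a quick numerical sanity check of Theorem 2 (my own sketch, assuming `numpy` and `scipy` are available; the \(3\times 3\) example matrix is mine, not from the post), the two sides of the minimax identity can be computed as a pair of linear programs:

```python
import numpy as np
from scipy.optimize import linprog

# Success matrix: rows = inputs, columns = deterministic protocols;
# M[i, j] = 1 iff protocol j answers input i correctly.
M = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)
m, n = M.shape

# Column player (protocol designer): maximize v s.t. (M y)_i >= v, sum y = 1.
# Variables z = (y_1, ..., y_n, v); linprog minimizes, so the objective is -v.
res_max = linprog(
    c=np.r_[np.zeros(n), -1.0],
    A_ub=np.c_[-M, np.ones(m)],            # v - (M y)_i <= 0
    b_ub=np.zeros(m),
    A_eq=np.r_[np.ones(n), 0.0].reshape(1, -1),
    b_eq=[1.0],
    bounds=[(0, None)] * n + [(None, None)],
)

# Row player (adversary): minimize u s.t. (x M)_j <= u, sum x = 1.
res_min = linprog(
    c=np.r_[np.zeros(m), 1.0],
    A_ub=np.c_[M.T, -np.ones(n)],          # (x M)_j - u <= 0
    b_ub=np.zeros(n),
    A_eq=np.r_[np.ones(m), 0.0].reshape(1, -1),
    b_eq=[1.0],
    bounds=[(0, None)] * m + [(None, None)],
)

print("max_y min_x xMy =", -res_max.fun)   # 2/3
print("min_x max_y xMy =", res_min.fun)    # 2/3
```

Both programs report a value of \(2/3\): no single column is correct on all inputs, but an equal mixture of the three columns is correct with probability \(2/3\) against every input, which matches the best the adversary can force.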

Now, let \(M\) be an \(m\times n\) matrix whose rows are indexed by inputs and whose columns are indexed by deterministic protocols \(\Pi\) (the core of a protocol is a string, so the protocols of bounded cost can be enumerated as columns). Set \(M_{i,j} = 1\) if the \(j\)-th protocol answers the \(i\)-th input correctly, and \(M_{i,j} = 0\) otherwise. When \(x\) is concentrated on row \(i\) and \(y\) on column \(j\), \(xMy\) is exactly \(M_{i,j}\); in general, \(xMy\) is a success probability.
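Unwinding the matrix product makes this explicit: for distributions \(x\) over rows and \(y\) over columns,

\[xMy \;=\; \sum_{i=1}^{m}\sum_{j=1}^{n} x_i\, M_{i,j}\, y_j \;=\; \Pr_{i\sim x,\; j\sim y}\big[\text{the } j\text{-th protocol is correct on the } i\text{-th input}\big].\]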

\(\color{blue}{\text{Notice that we assume that, for each input distribution,}}\) there is a (column) protocol that answers correctly with probability at least \(1-\epsilon\). (Technically speaking, this is not an extra assumption. If two columns (protocols) are identical, we can simply remove one of them, since it is redundant; after this, every column is a distinct protocol. For each input distribution, some protocol answers correctly with probability at least \(1-\epsilon\), and we map these protocols to the columns.)

Let \(c = \max_{\mu}D_{\mu, \epsilon}(f)\); that is, for every input distribution \(x\) there is a deterministic protocol of cost at most \(c\) with error at most \(\epsilon\) under \(x\). Taking the columns of \(M\) to be the deterministic protocols of cost at most \(c\), this means \(\min_{x\in R_M}\max_{y\in C_M}x M y \geq 1-\epsilon\): whichever input distribution the first player chooses, the second player has a protocol (column) that succeeds with probability at least \(1-\epsilon\). By von Neumann's minimax theorem, \(\max_{y\in C_M}\min_{x\in R_M}xMy\geq 1-\epsilon\) as well: there is a single distribution \(y^*\) over these deterministic protocols, i.e., a public-coin randomized protocol of cost at most \(c\), that succeeds with probability at least \(1-\epsilon\) on every input. Hence \(R_{\epsilon}^{pub}(f) \leq c = \max_{\mu}D_{\mu, \epsilon}(f)\), and inequality 2) is proved.
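In one line, inequality 2) is the chain

\[1-\epsilon \;\leq\; \min_{x\in R_M}\max_{y\in C_M}xMy \;=\; \max_{y\in C_M}\min_{x\in R_M}xMy \;=\; \min_{i}\,(My^{*})_{i},\]

where \(y^{*}\) is an optimal mixture of deterministic protocols: a public-coin protocol of cost at most \(c\) that is correct with probability at least \(1-\epsilon\) on every input.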