Toolbox

T1: Basic Techniques in Property Testing

Reminder: This post contains 4643 words · 14 min read · by Xianbin

Yao’s minimax principle in Property Testing

For basics on Yao’s minimax principle, see this post.

Theorem 1\textbf{Theorem 1}. Let P\mathcal{P} be a property over a function f:[n]Rf: [n] \to R. Suppose for nZ+n\in \mathbb{Z}^+, and ϵ(0,1)\epsilon \in (0,1), there exists a distribution DD over functions such that, for any deterministic algorithm A\mathcal{A} that makes at most q(n,ϵ)q(n,\epsilon) queries to the function, it holds

PfD[A errs in testing P on (n,ϵ,f)]>1/3P_{f \sim \mathcal{D}}\left[\mathcal{A} \text{ errs in testing } \mathcal{P}\text{ on } (n,\epsilon, f)\right] > 1/3

Then, the query complexity to test P\mathcal{P} on an error parameter ϵ\epsilon and a function f:[n]Rf:[n] \to R is more than q(n,ϵ)q(n,\epsilon).

Notice that if we only care about non-adaptive/one sided eror algorithms, the lower bounds hold to non-adaptive/one sided error testing.

Corollary 1\textbf{Corollary 1}. Let p1=PfD1[A(n,ϵ,f)=1]1η1p_1 = P_{f\in D_1}[A(n,\epsilon, f) = 1] \geq 1-\eta_1 be the probability that fD1f\sim D_1 satisfies P\mathcal{P}. Let p2=PfD1[A(n,ϵ,f)=0]1η0p_2 = P_{f\in D_1}[A(n,\epsilon, f) = 0] \geq 1-\eta_0 be the probability that fD2f\sim D_2 ϵ\epsilon-far from P\mathcal{P}. For any deterministic algorithm A\mathcal{A} that makes at most q(n,ϵ)q(n,\epsilon) queries to the input function, we have

p1p2η2.\lvert p_1 - p_2 \rvert \leq \eta_2.

If i=13ηi<1/3\sum_{i=1}^3 \eta_i < 1/3, then the query complexity of testing P\mathcal{P} on an error parameter ϵ\epsilon and a function f:[n]Rf: [n] \to R is more than q(n,ϵ)q(n,\epsilon).

Applications

1. Testing all-one strings.

Let Lone={1n:nZ+}{0,1}L_{one} = \{1^n: n\in \mathbb{Z}^{+} \}\subseteq \{0,1\}^* be the language of all-one strings.

Theorem\textbf{Theorem}. Any tester for LoneL_{one} requires Ω(ϵ1)\Omega(\epsilon^{-1}) queries.

Proof\textbf{Proof}. Let D1D_1 be the function distribution which we always sample the all-one string. Next, we construct D0D_0.

pt1

Divide [n][n] into 1/ϵ1/\epsilon blocks and flip 1 into 0 in one of the blocks. Then, we obtain 1/ϵ1/\epsilon blocks shown above, i.e., b1,,bϵb_1,\ldots,b_\epsilon. Let D0D_0 be the uniform distribution over the above blocks.

We can see that D1D_1 satisfies LoneL_{one} and D0D_0 is ϵ\epsilon-far from LoneL_{one}. To apply Corollary 1, we need to prove that the gap between the probability is small. Let AA denote a deterministic algorithm that queries at most o(1/ϵ)o(1/\epsilon) indexes. Let p1=PsD1[A(n,ϵ,s)=1]p_1 = P_{s\in D_1}[A(n,\epsilon, s)=1] and p2=PsD0[A(n,ϵ,s)=1]p_2 = P_{s\in D_0}[A(n,\epsilon, s)=1].

Now matter how smart the algorithm AA is, each block has a probability ϵ\epsilon having 0.

Then,

p1=1;p2=PsD0[A(n,ϵ,s)=1](1ϵ)S>2/3p_1 = 1;\\ p_2 = P_{s\in D_0}[A(n,\epsilon, s)=1] \geq (1-\epsilon)^{\lvert S\rvert} > 2/3

where SS is the set of queried indexes, S=o(1/ϵ)\lvert S \rvert = o(1/\epsilon). (The reason why we set S\lvert S \rvert is that we will show the lower bound is ω(S)=Θ(ϵ))\omega(\lvert S\rvert) = \Theta(\epsilon)).

Then,

p1p2<1/3\lvert p_1 - p_2 \rvert < 1/3

By Corollary 1, it is proved.

2. Testing Monotonicity on a Hypercube

Theorem\textbf{Theorem}. For any one-sided error non-adaptive tester for monotonicity of functions f:{0,1}{0,1}f:\{0,1\}\to \{0,1\}, it requires Ω(n)\Omega(\sqrt{n}) quires.

Proof\textbf{Proof}. For this problem, the key is to understand the conceptions. For monotonicity, what is the ϵ\epsilon-far from the monotonicity? It means that there are at least ϵ\epsilon proportion of inputs have violations, i.e., for x<yx < y, f(x)f(y)f(x) \geq f(y).

Next, we define violating pairs and violating edges. A pair (x,y){0,1}n×{0,1}n(x,y) \in \{0,1\}^n \times \{0, 1\}^n is a violating pair if xyx \leq y and f(x)>f(y)f(x) > f(y). Further, (x,y)(x,y) is a violating edge\textit{violating edge} if x+1=y\lVert x \rVert + 1 = \lVert y\rVert.

Since our goal is to prove the lower bounds, we need to create two distributions that are not easy to distinguish.

Notice that in this problem, we need to check by at least two queries and then we may know the result. Also, recall the definition of monotonicity, if x<yx<y, then x<y\lVert x \rVert < \lVert y \rVert (x=xi\lVert x \rVert = \sum \lvert x_i \rvert).

fj(x1,,xn)={1x<n/2+n0x<n/2n1xjmiddle-case.f_j(x_1,\ldots,x_n) = \begin{cases} 1 && \lVert x \rVert < n/2 + \sqrt{n}\\ 0 && \lVert x \rVert < n/2 - \sqrt{n} \\ 1 - x_j && \text{middle-case}. \end{cases}

Now, consider the following edge

(a,b)(a,b)

where a=(x1,,xj1,0,xj+1,,xn)a = (x_1,\ldots, x_{j-1},0, x_{j+1},\ldots,x_n) and b=(x1,,xj1,1,xj+1,,xn)b = (x_1,\ldots, x_{j-1},1, x_{j+1},\ldots,x_n)

We can see that (a,b)(a,b) is a violating edge if and only if a,ba, b are in the middle-case. Let us consider the size of middle-case. Since for each bit, there are only two possibilities, 0 or 1. We can consider that it is a random process in which 0 or 1 has probability 1/2. Then, by the Chernoff Bound, there are at least some ϵn\epsilon n violating edges, which means that each fjf_j is ϵ\epsilon-far from being monotone.

Next, we use the old trick. Let DD be the uniform distribution over all fjf_j.

Let AA be a deterministic non-adaptive tester for monotonicity of functions f:{0,1}n{0,1}f: \{0,1\}^n\to \{0,1\} with query complexity qq.

Lemma\textbf{Lemma}. The probability that AA detects a violating pair from DD is at most O(q/n)O(q/\sqrt{n}).

By Corollary 1, the gap between two probabilities is at most O(q/n)O(q/\sqrt{n}). By setting q/n=ϵq/\sqrt{n} = \epsilon', we can show that q=Ω(n)q = \Omega(\sqrt{n}).

Reference

[1]. Bhattacharyya, A. and Yoshida, Y., 2022. Property Testing: Problems and Techniques. Springer Nature.