Averaging Argument

In computational complexity theory and cryptography, averaging argument is a standard argument for proving theorems. It usually allows us to convert probabilistic polynomial-timealgorithms into non-uniform polynomial-size circuits. (Wikipedia)

The following example can be regarded as something like Pigeonhole Principle.

Pigeons: Please do not GuGu when you are going on a date.

Example 1

If every person likes at least $1/3$ of the books in a library, then the library has a book, which at least $1/3$ of people like.

Proof: Suppose there are $N$ people in total and $B$ books in the library. Let make people leave a mark on the book they like, then there are at least $NB/3$ marks. Presume there are no such book that has at least $M/3$ books on it, then the total number of marks would be less than $NB/3$, contradicted.

When we deal with problems solved by the pigeonhole principle, we see the total number of pigeons and pigeonholes. As for the problems solved by averaging argument, we usually don’t see the number of pigeonholes or the number of pigeons or both of them. In this kind of case, we should keep the number of these things in mind, or we say we presume them.

As you can see, you case dealing with probability is always the latter case, like the example.

Let’s introduce the first formal discription of averaging argument.

Averaging Argument

Let $X$ and $Y$ be sets, and $p$ be a prediction on $X\times Y$. Let $\alpha$ be a real number in the interval $[0,1]$. If for each $x$ in $X$, there exists at least $\alpha|Y|$ in $Y$ that satisfy $p(x,y)$. Then, there exists a $y$ in $Y$ such that there exist at least $\alpha|X|$ elements $x$ in $X$ that satisfy $p(x,y)$.

Proof: We can construct a proof of the same style as the proof of Example 1.

Keep in mind that this argument can be generalized to continuous groups as well as probabilistic situations.

Averaging Argument

Let $f$ be a function. If we have a circuit $C$ such that $C(x,y)=f(x)$ with probability at least $\rho$ where $x$ is chosen at random and $y$ is chosen independently from some distribution $\mathcal Y$ over $\{0,1\}^m$. Then there exists a single string $y_0\in\{0,1\}^m$ such that $\Pr[C(x,y_0)]>\rho$.

Proof: Keep the invisible pigeonholes in mind, the proof might get simple. But we can also prove it in probabilistic style.

… We need a probabilistic proof here.

Have you seen that if there is a problem which is in $\mathbf{RP}$, then we can use a circuit to solve it and $C(\cdot, y_0)$ is the circuit we need. Of course, for different length of input, different circuit is needed. So we have $\mathbf{RP}\in \mathbf P_{/\text{poly}}$.

The proof of $\mathbf{BPP}\in\mathbf{P}_{/\text{poly}}$ is a little more complicate than this, but still bases on the same idea. The proof will be given in the computational complexity.

The basic idea of Averaging Argument can be used somewhere else, like in the situation about a collection of some $x$ in the theorem Averaging Argument. Where the set size can be get by solving inequality.

The next exam is quite important, which will be used in the proof of Goldreich-Levin Theorem.

Example 2

Let $f$ be a function $f:\{0,1\}^n\to \{0,1\}^{l(n)}$ and $f^\prime :\{0,1\}^n \times\{0,1\}^n\to \times\{0,1\}^{l(n)}\times \times\{0,1\}^n$ defines as
$$
f^\prime(x,r)=(f(x),r)
$$
where $|x|=|r|$. Let $\mathsf{gl}(x,r)=\bigoplus_{i=1}^n x_i\cdot r_i$.

Suppose there exists a $\mathsf{PPT}$ algorithm $\mathcal A$ that running time $t$ and satisfy
$$
\Pr\limits_{
x\overset $\leftarrow\{0,1\}^n \\
r\overset $\leftarrow\{0,1\}^n
} [\mathcal A(f(x),r)=\mathsf{gl}(x,r)]\geq\frac 1 2+\varepsilon(n)
$$
Then there exists a set $\mathcal S_n\in\{0,1\}^n$ of size at least $\frac{\varepsilon(n)}2\cdot 2^n$ such that for every $x\in \mathcal S_n$ it holds that
$$
\Pr\limits_{x\overset $\leftarrow \mathcal S_n\ r\overset $ \leftarrow \{0,1\}^n}[\mathcal A(f(x),r)=\mathsf{gl}(x,r)]\geq \frac 1 2+\frac{\varepsilon (n)} 2
$$

Proof: Still we keep in mind the inversible pigeonholes. We have “1” pigeonholes. “1” means if we choose a pigeonhole uniformly at random from all the pigeonholes then we have probability “1” that we chosen from the set of all pigeonholes… This is the verbose, since there are cases that the probability is less than “1”.

We assume we have probability $p$ to choose from $\mathcal S_n$ and then we have probability $(1-p)$ to choose from $\overline{\mathcal S_n}=\{0,1\}^n-\mathcal S_n$. We want to evaluate the least size of $\mathcal S_n$, so we make the things we wish($\Pr[\mathcal A(f(x),r)=\mathsf{gl}(x,r)]$) to happen as frequent as posible, that is the probability $1$. As for cases in $\overline{\mathcal S_n}$, by the assumption made it has to happen with probability as least $\frac 1 2+\frac{\varepsilon (n)} 2$, then we have
$$
\frac 1 2 \leq \Pr\limits_{r\overset $ \leftarrow \{0,1\}^n}[\mathcal A(f(x),r)=\mathsf{gl}(x,r)] \leq p\times 1+(1-p)\times (\frac 1 2+\frac{\varepsilon (n)} 2)\leq p+\frac 1 2+\frac{\varepsilon(n)}{2}
$$
By solving the inequality
$$
p\geq \frac{\varepsilon (n)}2
$$
Then we have
$$
|\mathcal S_n|\geq \frac{\varepsilon(n)}2\cdot 2^n
$$

Exercise

Prove Example 2 by using Markov Inequality.