One-Way Function

A one-way function is a function that is easy to compute but hard to invert. Here "easy" and "hard" refer to bounds on computational complexity: evaluation takes polynomial time, while inversion is infeasible for every efficient algorithm. The existence of one-way functions is a necessary condition for the existence of most modern cryptographic schemes.

Some Information about the Course

In this course, we mainly talk about the foundations of cryptography. The course covers the conditions for the existence of cryptography, the provable security of cryptosystems, and general methods for constructing cryptosystems. Basic knowledge of probability theory and computational complexity is necessary for this course.

This series of notes mainly follows Goldreich's book Foundations of Cryptography, which is usually regarded as hard to read. Actually, with some basic knowledge of computational complexity and some familiarity with the standard "arguments", the book is not that difficult.

Furthermore, since Goldreich's book was finished many years ago, newer research results are not included in it; I have added results from recent years to these notes. In fact, this is not the set of notes for a single course: it is a set of notes for several cryptographic courses and textbooks, and some of the content goes beyond any of the courses the writer has taken.

Thanks to Mr. Yu and Mrs. Liu at SJTU for giving us wonderful cryptographic courses.

Some Information about the Symbols

Do not be confused by the symbol $\to$. Though in mathematics $\to$ usually sits between the domain and the range of a function, in this series of articles it often denotes the output of an algorithm. Algorithms differ from functions once we consider complexity, and algorithms can be probabilistic: the outputs can differ even when the inputs are the same. Because of this, $=$ is meaningless here, so we borrow the assignment operators $\to$ and $\leftarrow$ from pseudocode to denote the output of an algorithm.

We are not going to write too many conditions inside a probability formula; we sometimes indicate them by writing distributions under the symbol $\Pr$.

The notation for a probability distribution such as $U_n$ has two meanings. First, it denotes the distribution itself. E.g., a random variable $X$ sampled from $U_n$ is denoted by $X\leftarrow U_n$, and for each string $x\in\{0,1\}^n$ we have
$$
\Pr[U_n=x]=\frac 1 {2^n}
$$
In other situations, this kind of notation can represent an anonymous random variable sampled from the distribution it denotes. We must make it clear that within the same math block, the same distribution symbol $D$ always represents the same anonymous random variable sampled from $D$. For different anonymous random variables sampled from the same distribution $D$, we use superscripts to distinguish them, e.g. $D^{(1)}, D^{(2)},\cdots,D^{(m)}$.

Some Information about Abbreviations and Terms

We presume that "negligible" is well known. A function $p: \mathbb N\to [0,1]$ is called overwhelming if $1-p$ is negligible.

"PPT" stands for probabilistic polynomial time: it means the algorithm runs on a probabilistic Turing machine in polynomial time. The error probability is not part of "PPT"; it will usually be stated separately.

One-Way Function

Strong One-way Function

The existence of cryptography relies on $\mathbf P\neq \mathbf{NP}$. But $\mathbf P\neq \mathbf{NP}$ is not enough: cryptography asks for a stronger precondition, namely the existence of (strong) one-way functions.

Strong One-way Function (OWF)

A function $f:\{0,1\}^\ast\to \{0,1\}^\ast$ is a strong one-way function if it is

  1. Easy to compute: There is a deterministic polynomial-time algorithm $\mathsf{EvalF}$ such that for all $n$ and all $x\in\{0,1\}^n$ we have $\mathsf{EvalF}(x) = f(x)$.
  2. Hard to invert: No probabilistic polynomial-time algorithm can invert the function with probability better than negligible.

The second property can be restated more formally as

For every PPT algorithm $\mathcal {A}$, every $c>0$, and all sufficiently large $n$,
$$
\Pr[\mathcal A(1^n,f(U_n))\in f^{-1}(f(U_n))]\leq\frac{1}{n^c}
$$

When you see a distribution being used as a random variable in a probability formula, it means you first sample once from that distribution and use the result as a random variable. This is easy to understand: we do it only because we do not want to give the random variable an explicit name.

! Another thing to pay attention to is that $f$ is defined for all $n$. In other words, $f$ is uniform.

As shown in the definition, a strong one-way function cannot be inverted efficiently even on average. That is, an algorithm might happen to invert the function on some instances, but inversion is hard on average.
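To make the definition concrete, here is a minimal Python sketch of the inversion experiment, using SHA-256 as a stand-in candidate (nothing here proves SHA-256 is one-way; the adversary interface and the winning condition are the point):

```python
import hashlib
import secrets

def f(x: bytes) -> bytes:
    """A stand-in candidate for a one-way function (illustrative only)."""
    return hashlib.sha256(x).digest()

def inversion_experiment(adversary, n: int) -> bool:
    """One run of the experiment behind the definition: the adversary wins
    if it outputs ANY preimage of f(U_n), not necessarily x itself."""
    x = secrets.token_bytes(n)        # x <- U_n (n bytes here, for simplicity)
    y = f(x)
    z = adversary(n, y)               # A(1^n, f(U_n))
    return f(z) == y                  # success iff z is in f^{-1}(f(U_n))

def random_guess_adversary(n: int, y: bytes) -> bytes:
    """A trivial adversary; it wins only with negligible probability."""
    return secrets.token_bytes(n)
```

$f$ is strongly one-way exactly when every PPT adversary wins this experiment with probability smaller than $1/n^c$ for every $c$ and all sufficiently large $n$.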

Theorem

If strong one-way function exists, then $\mathbf{P}\neq \mathbf{NP}$.

Proof: Suppose $\mathbf{P}=\mathbf{NP}$ and there exists a one-way function $f:\{0,1\}^\ast\to \{0,1\}^\ast$. Consider the language
$$
L=\{\langle x^\ast,y\rangle\mid \exists x:x^\ast\sqsubset x \text{ and } f(x)=y\}
$$
$L$ is the collection of tuples $\langle x^\ast, y\rangle$ such that $x^\ast$ is a prefix of some $x$ with $f(x)=y$. Clearly $x$ itself is a certificate, so $L\in \mathbf{NP}$. But since $\mathbf{P}=\mathbf{NP}$, for every $y$ we can recover a preimage $x$ bit by bit by querying a polynomial-time decision procedure $\mathcal O_L$ for $L$. To invert $y$, we first ask whether $\langle 0, y\rangle\in L$. If the answer is no, the first bit must be $1$ and we next ask $\langle 10, y\rangle$; otherwise the first bit is $0$ and we ask $\langle 00, y\rangle$. After $n$ rounds we obtain a full preimage, so $f$ can be inverted in deterministic polynomial time, contradicting one-wayness.
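The following sketch shows the bit-by-bit recovery, assuming (hypothetically) that $\mathbf P=\mathbf{NP}$ gives us a polynomial-time decision procedure `in_L(prefix, y)` for $L$; the brute-force oracle below is an exponential-time stand-in that makes the sketch runnable for tiny $n$:

```python
from itertools import product

def invert(f, in_L, y: str, n: int) -> str:
    """Recover some n-bit preimage of y, one oracle query per bit."""
    prefix = ""
    for _ in range(n):
        # Does some preimage of y extend prefix + "0"? If not, the bit is 1.
        prefix += "0" if in_L(prefix + "0", y) else "1"
    return prefix

def make_brute_force_oracle(f, n: int):
    """Exponential-time stand-in for the assumed P = NP decision procedure."""
    def in_L(prefix: str, y: str) -> bool:
        rest = n - len(prefix)
        return any(f(prefix + "".join(bits)) == y
                   for bits in product("01", repeat=rest))
    return in_L

def toy_f(s: str) -> str:
    """Counts the 1-bits; NOT one-way, just a runnable demo."""
    return format(sum(map(int, s)), "04b")

oracle = make_brute_force_oracle(toy_f, 8)
y = toy_f("10110010")
assert toy_f(invert(toy_f, oracle, y, 8)) == y
```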

However, there is another kind of one-way function, which sounds weaker than the previously defined one.

Weak One-Way Function

Weak One-Way Function

A function $f:\{0,1\}^\ast\to \{0,1\}^\ast$ is a weak one-way function if it is

  1. Easy to compute: There is a deterministic polynomial-time algorithm $\mathsf{EvalF}$ such that for all $n$ and all $x\in\{0,1\}^n$ we have $\mathsf{EvalF}(x) = f(x)$.

  2. Slightly hard to invert: There exists a polynomial $p(\cdot)$ such that for every PPT algorithm $\mathcal A$ and all sufficiently large $n$'s,
    $$
    \Pr[\mathcal A(f(U_n),1^n)\notin f^{-1}(f(U_n))]>\frac 1{p(n)}
    $$

See, it is not "very hard" to invert a weak one-way function. But pay attention that every algorithm shares the same $p(\cdot)$ in the definition: every algorithm fails with probability at least $1/p(n)$, so its success probability is bounded above by $1-1/p(n)$. We can regard this as a "gap".

If you are familiar with the PCP theorem, you might suspect that there should be some way to amplify this gap. Indeed, we can construct a strong one-way function from a given weak one-way function.

One-Way Function Defined for Some Lengths

We can also define one-way functions for some lengths only. Let $I\subseteq \mathbb N$ be a set of natural numbers. We define the successor of $n$ in $I$ as the minimal number in $I$ that is larger than $n$, denoted by $s_I(n)$.

E.g. Let $I= \{2,3,7,9,10,12,14,\cdots\}$, then $s_I(8)=9$.

$I$ is called polynomial-time-enumerable if there exists a polynomial-time Turing machine that on input $1^n$ outputs $1^{s_I(n)}$.

To modify the definition of strong/weak one-way functions, change "every $n$" to "every $n\in I$", and we get the definition of strong/weak one-way functions for some lengths.

Fact: A one-way function for a polynomial-time-enumerable $I$ can easily be transformed into an ordinary one-way function defined for all lengths.

Theorem

Let $f(\cdot)$ be a one-way function defined for lengths in a polynomial-time-enumerable set $I$. Let $x=x^\prime x^{\prime\prime}$, where $x^\prime$ is the longest prefix of $x$ with length in $I$. Define
$$
g(x)=f(x^\prime)x^{\prime\prime}
$$
Then $g(x)$ is a one-way function.

Proof: The proof is simple and omitted here, but we review its basic idea. The technique used is reduction: suppose there is an algorithm $\mathcal A$ that inverts $g(\cdot)$ efficiently; then we can construct an efficient algorithm $\mathcal A^\prime$ that invokes $\mathcal A$ polynomially many times and inverts $f(\cdot)$ efficiently. A sketch of the construction itself is given below.

! This is the basic idea for strong one-way functions. For weak one-way functions, the idea is similar.
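A minimal sketch of the theorem's construction, where `f` is assumed one-way on lengths in $I$ and `I_lengths` is a hypothetical finite list of the relevant lengths:

```python
def g(f, I_lengths, x: str) -> str:
    """g(x) = f(x') x'', where x' is the longest prefix of x with |x'| in I.
    Assumes some length in I_lengths is <= len(x)."""
    k = max(l for l in I_lengths if l <= len(x))   # |x'| = k
    return f(x[:k]) + x[k:]                        # f(x') followed by x''
```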

Length-Regular & Length-Preserving One-Way Function

The following two definitions apply to both strong and weak one-way functions.

Length-Regular One-Way function

A one-way function is length-regular if for every $x_1, x_2$ with $|x_1|=|x_2|$, it holds that $|f(x_1)|=|f(x_2)|$.

Length-Preserving One-Way function

A one-way function is length-preserving if for every $x$, it holds that $|x|=|f(x)|$.

Given a one-way function, we can construct a length-preserving one-way function in the following two steps. Given a one-way function $f(\cdot)$, we first construct a length-regular function $g$ by appending a pad of the form $10^\ast$:
$$
g(x)\doteq f(x)10^{p(|x|)-|f(x)|}
$$
where $p(\cdot)$ is a polynomial satisfying $p(|x|)\geq|f(x)|$ for all $x$ (such a polynomial exists because $f$ is computable in polynomial time); it determines how many 0's we append. Then we transform $g$ into a length-preserving one-way function $g^\prime$:
$$
g^\prime(x^\prime x^{\prime\prime})\doteq g(x^\prime), \quad\text{where } |x^\prime x^{\prime\prime}|=p(|x^\prime|)+1
$$
This completes the construction; the one-wayness of $g^\prime$ can again be proved by reduction. A code sketch of both steps follows.
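A sketch of the two steps in Python, assuming `f` maps bit strings to bit strings and `p` is a polynomial with $p(|x|)\geq|f(x)|$ (the bound $n^2$ below is just a placeholder assumption):

```python
def p(n: int) -> int:
    """Assumed polynomial upper bound on |f(x)| for n-bit x."""
    return n * n

def g(f, x: str) -> str:
    """Step 1 (length-regular): pad f(x) with 1 0^* so |g(x)| = p(|x|) + 1."""
    y = f(x)
    return y + "1" + "0" * (p(len(x)) - len(y))

def g_prime(f, x: str) -> str:
    """Step 2 (length-preserving): parse x = x' x'' with |x| = p(|x'|) + 1,
    ignore x'', and output g(x'), which has exactly |x| bits.
    Assumes |x| = p(k) + 1 for some k."""
    k = next(k for k in range(len(x) + 1) if p(k) + 1 == len(x))
    return g(f, x[:k])
```

Note that $g^\prime$ is only defined on inputs whose length has the form $p(k)+1$; this is where the treatment of one-way functions for some lengths above comes in.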

However, the previous technique does not preserve the 1-1 property of a one-way function: for a 1-1 one-way function $f$, the first step does yield a 1-1 length-regular one-way function, but the second step breaks the 1-1 property.

Non-uniformly One-Way Function

Since $\mathbf{BPP}\subseteq \mathbf{P}_{/\text{poly}}$, as we know, a one-way function might still be inverted efficiently by a non-uniform algorithm. If we design a cryptosystem based on a one-way function, it might still be broken by non-uniform algorithms. How strong should the definition of a cryptosystem be? It is up to you. You might even define security to mean that the scheme cannot be efficiently analyzed by quantum computers…

A non-uniformly one-way function is a stronger version of a one-way function: it is a one-way function with non-uniform one-wayness, meaning that it cannot be inverted efficiently by any non-uniform algorithm. Notice that non-uniform algorithms are deterministic when represented by circuit families.

Non-uniformly One-Way Function

A function $f:\{0,1\}^\ast\to \{0,1\}^\ast$ is a non-uniformly one-way function if it is

  1. Easy to compute: There is a deterministic polynomial-time algorithm $\mathsf{EvalF}$ such that for all $n$ and all $x\in\{0,1\}^n$ we have $\mathsf{EvalF}(x) = f(x)$.

  2. Hard to invert: For every $poly$-size circuit family $\{C_n\}$, every polynomial $p(\cdot)$, and all sufficiently large $n$, it holds that
    $$
    \Pr[C_n(f(U_n))\in f^{-1}(f(U_n))]<\frac 1 {p(n)}
    $$

Look: to compute the function, a deterministic polynomial-time algorithm suffices, but not even a non-uniform circuit family of $poly$-size can invert it.

Weak One-Way Functions Imply Strong Ones

Theorem

Weak one-way functions exist if and only if strong one-way functions exist.

Not every weak one-way function is a strong one-way function; refer to Goldreich's textbook for a counterexample. The $\Leftarrow$ direction is trivial, so we focus on proving the $\Rightarrow$ direction.

Let $f$ be a weak one-way function with the polynomial $p(\cdot)$ guaranteed by the definition. We construct
$$
g(x_1,\cdots,x_{t(n)})\doteq (f(x_1),\cdots,f(x_{t(n)}))
$$

where $t(n)=n\cdot p(n)$ .
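The construction in code, as a sketch: `f` is the weak one-way function and `p` its polynomial, and inputs are lists of $t(n)$ blocks of length $n$.

```python
def t(n: int, p) -> int:
    """Number of copies: t(n) = n * p(n)."""
    return n * p(n)

def g(f, p, xs: list) -> list:
    """g(x_1, ..., x_t) = (f(x_1), ..., f(x_t)) for t = t(n), n = |x_i|."""
    n = len(xs[0])
    assert len(xs) == t(n, p), "g expects exactly t(n) blocks of length n"
    return [f(x) for x in xs]
```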

Naive Proof

It is obvious that $g$ is easy to compute. As for inversion, one might think the success probability of inverting $g$ is less than $(1-\frac{1}{p(n)})^{n\cdot p(n)}\approx e^{-n}$. This is not right, since it is based on the assumption that an algorithm inverting $g$ works by inverting every $f(x_i)$ independently, which cannot be justified.

Correct Proof

The construction is right, but the naive argument is wrong; a detailed correct proof is given in the following. Suppose we have an algorithm $\mathcal B$ that inverts $g$ efficiently, that is, for some polynomial $q(\cdot)$,
$$
\Pr[\mathcal B(g(U_m))\in g^{-1}(g(U_m))]>\frac 1{q(m)}
$$
Since inputs of $g$ built from $n$-bit blocks have length $m=n\cdot t(n)=n^2p(n)$, this reads
$$
\Pr[\mathcal B(g(U_{n^2p(n)}))\in g^{-1}(g(U_{n^2p(n)}))]>\frac{1}{q(n^2p(n))}
$$

Constructing an Independent Inversion Procedure

The basic idea is to use the inverting algorithm several times, turning the entangled inversion into independent inversions via a procedure $I$ (a code sketch appears after the comments below). $I$ works as follows:

Procedure $I$

Input: $y$

  1. $n\leftarrow |y|$
  2. for $i=1$ to $t(n)$ do
    1. $x_1\leftarrow U_n, x_2\leftarrow U_n,\cdots,x_{t(n)}\leftarrow U_n$
    2. $z_1,z_2,\cdots,z_{t(n)}\leftarrow \mathcal B(f(x_1), \cdots, f(x_{i-1}), y,f(x_{i+1}),\cdots,f(x_{t(n)}))$
    3. if $f(z_i)=y$ then return $z_i$
  3. end for

Some comments:

"return $x$" means writing $x$ on the output tape and halting, for a Turing machine (since algorithms and procedures can be regarded as Turing machines).

The basic idea here is: if we try to invert some $f(x_i)$ in the output of $g$, the target value is tangled with the other $f(x_j)$'s, so we try to put $y=f(x_i)$ into every position to get the "best chance". By running procedure $I$ on each $f(x_i)$ separately, we invert them independently.
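A sketch of procedure $I$ and the outer algorithm $\mathcal A$ (defined in the next part), assuming a length-preserving `f`, an inverter `B` for $g$ that takes and returns lists of $n$-bit strings, and the polynomials `p`, `q` from the proof:

```python
import secrets

def rand_bits(n: int) -> str:
    return format(secrets.randbits(n), "0{}b".format(n))

def procedure_I(f, B, p, y: str):
    """Plant y in each of the t(n) slots of an otherwise random g-image."""
    n = len(y)                       # |y| = n since f is length-preserving
    t = n * p(n)
    for i in range(t):
        xs = [rand_bits(n) for _ in range(t)]
        ys = [f(x) for x in xs]
        ys[i] = y                    # position i carries the real challenge
        zs = B(ys)                   # B's candidate preimage of the tuple
        if f(zs[i]) == y:
            return zs[i]             # success: z_i inverts y
    return None

def A(f, B, p, q, y: str) -> str:
    """Invoke I up to a(n) = 2 n^2 p(n) q(n^2 p(n)) times, then guess."""
    n = len(y)
    a = 2 * n * n * p(n) * q(n * n * p(n))
    for _ in range(a):
        z = procedure_I(f, B, p, y)
        if z is not None:
            return z
    return rand_bits(n)              # last resort: a uniformly random guess
```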

Averaging Arguments for Successful Inversions

We define another polynomial $a(n)\doteq 2n^2\cdot p(n)\cdot q(n^2p(n))$, and we are going to invoke $I$ up to $a(n)$ times. It is easy to see why by rewriting $a(n)$ as $a(n)=2n\cdot t(n)\cdot q(m)$. Following the previous discussion, we construct an inverting algorithm $\mathcal A$ for $f(\cdot)$ that invokes procedure $I$ up to $a(n)$ times on an input $y=f(x)$.

$\mathcal A(y)$ halts as soon as some invocation of $I(y)$ halts with an output. If every invocation of $I(y)$ halts without an output, $\mathcal A$ simply outputs a random guess for a preimage of $y$.

The first question to ask is: with how much confidence should we believe that $I$ inverts its input $y=f(x)$ correctly? Since $\mathcal B$ is a good algorithm for inverting $g$, it inverts $g$ correctly with probability at least $\frac 1 {q(m)}$. Recall that iterating over $i=1$ to $t(n)$ just helps to get the best chance of inverting $y=f(x)$; a basic averaging argument then suggests that, for an average input, $I$ inverts $y$ with probability better than $\frac 1 {q(m)\,t(n)}$, where $m=n^2p(n)$. It is then clear that we must run $I$ about $q(m)t(n)=n\cdot p(n)\cdot q(n^2p(n))$ times to reach a good enough success probability; this is where $a$ comes from.

As shown in the article Averaging Argument, we divide the strings $x\in\{0,1\}^n$ into "good ones" and "bad ones". We denote the collection of good ones by $S_n$, defined as
$$
S_n\doteq\left\{x\in\{0,1\}^n \,\middle|\, \Pr[I(f(x))\in f^{-1}(f(x))]>\frac{n}{a(n)}\right\}
$$
where the probability is over a single call to $I$. The rest of the strings are the "bad ones". To understand why strings in $S_n$ are "good ones", the amplification argument for $\mathbf{RP}$ might help; a detailed explanation is given in the next part.

Now we focus on the size of $S_n$. Since $S_n$ collects all the good ones, we might guess that if the input $f(x_1),\cdots,f(x_{t(n)})$ of $g$ satisfies that all $x_i$ come from $S_n$, then it will be easy to invert; otherwise it will be hard to invert. Following this idea, we define
$$
s(n)\doteq \Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))]
$$
which by our assumption satisfies $s(n)>\frac{1}{q(n\cdot t(n))}$ (recall $n\cdot t(n)=n^2p(n)$), and separate it into two parts:
$$
s_1(n)\doteq \Pr [\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(\exists i: U^{(i)}_n\notin S_n)]
$$
and
$$
s_2(n)\doteq \Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(\forall i: U^{(i)}_n\in S_n)]
$$
It is clear that $s(n)=s_1(n)+s_2(n)$.

Probability Bound for $s_1(n)$

Though the derivation is quite long, it is really simple once we recall "try to put $y=f(x_i)$ into every position to get the 'best chance'". We could even immediately guess that
$$
s_1(n)\leq\sum^{n\cdot p(n)}_{i=1}\max_{x\notin S_n}\{\Pr[I(f(x))\in f^{-1}(f(x))]\}
$$
The detailed derivation is as follows:
$$
\begin{align}
s_1(n)
&=\Pr[\exists i:\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(U^{(i)}_n\notin S_n)] \\
& \leq \sum^{n\cdot p(n)}_{i=1}\Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(U^{(i)}_n\notin S_n)] \\
&= \sum^{n\cdot p(n)}_{i=1}\sum_{x\notin S_n}\Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\wedge(U^{(i)}_n=x)] \\
&= \sum^{n\cdot p(n)}_{i=1}\sum_{x\notin S_n}\Pr[U^{(i)}_n=x]\cdot\Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\mid U^{(i)}_n=x] \\
&\leq \sum^{n\cdot p(n)}_{i=1} \max_{x\notin S_n}\{\Pr[\mathcal B(g(U_{n\cdot t(n)}))\in g^{-1}(g(U_{n\cdot t(n)}))\mid U^{(i)}_n=x]\} \\
&\leq \sum^{n\cdot p(n)}_{i=1} \max_{x\notin S_n}\{\Pr[I(f(x))\in f^{-1}(f(x))]\} \\
&\leq n\cdot p(n)\cdot \frac{n}{a(n)} \\
&= \frac{n^2\cdot p(n)}{a(n)}
\end{align}
$$

Here the second-to-last inequality follows from the definition of $S_n$ (bad $x$'s satisfy $\Pr[I(f(x))\in f^{-1}(f(x))]\leq \frac n{a(n)}$), and note that $\frac{n^2\cdot p(n)}{a(n)}=\frac 1{2q(n^2p(n))}$.

Next, we prove that $|S_n|>(1-\frac{1}{2p(n)})\cdot 2^n$.

Notice that
$$
\frac 1{q(n^2 p(n))}<s(n)=s_1(n)+s_2(n)\leq \frac{1}{2q(n^2p(n))}+s_2(n)
$$
If we assume $|S_n|\leq (1-\frac 1 {2p(n)})\cdot 2^n$, then
$$
\begin{align}
s_2(n) &\leq \Pr[\forall i:U^{(i)}_n\in S_n]\\
&\leq (1-\frac{1}{2p(n)})^{n\cdot p(n)}\\
&< \frac 1 {2^{n/2}} < \frac{n^2\cdot p(n)}{a(n)}
\end{align}
$$
holds for sufficiently large $n$.

This yields the contradiction
$$
\frac 1{q(n^2p(n))}<s(n)=s_1(n)+s_2(n)<\frac{1}{2q(n^2p(n))}+\frac{1}{2q(n^2p(n))}=\frac 1{q(n^2p(n))}
$$
Hence the assumption is false, and we conclude
$$
|S_n|>(1-\frac 1 {2p(n)})\cdot 2^n
$$

Lower Bound on the Success Probability of $\mathcal A$

We now bound the success probability of $\mathcal A$. For every $x\in S_n$, the failure probability (over the $a(n)$ independent invocations of $I$) can be bounded by
$$
\Pr[\mathcal{A}(f(x))\notin f^{-1}(f(x))]<\left(1-\frac{n}{a(n)}\right)^{a(n)}\leq e^{-n}<\frac 1{2^n}
$$
Now it is easy to bound the success probability of $\mathcal A$:
$$
\begin{align}
&\quad\,\Pr[\mathcal A(f(U_n))\in f^{-1}(f(U_n))] \\
&\geq \Pr[\mathcal A(f(U_n))\in f^{-1}(f(U_n)) \wedge(U_n\in S_n)] \\
&= \Pr[\mathcal A(f(U_n))\in f^{-1}(f(U_n))\mid U_n\in S_n]\cdot\Pr[U_n\in S_n] \\
&\geq (1-2^{-n})\cdot\left(1-\frac 1 {2p(n)}\right) > 1-\frac 1{p(n)}
\end{align}
$$
This contradicts the guarantee that every PPT algorithm fails to invert $f$ with probability at least $\frac 1{p(n)}$, which completes the proof.

Summary for One-Way Function

As we have learned, there is a universal method for constructing a strong one-way function from a weak one, so we will simply say "one-way function" in future discussions.

Furthermore, we have seen that the existence of one-way functions for polynomial-time-enumerable sets implies the existence of one-way functions defined for all lengths, and the latter implies the existence of length-regular and length-preserving one-way functions. We may use these attributes without explicit declaration.

Universal One-Way Function

We can define a function
$$
f_{\text{uni}}(\llcorner\boldsymbol M\lrcorner,x)
\doteq(\llcorner\boldsymbol M\lrcorner,\boldsymbol M(x))
$$
where $\boldsymbol M$ is a Turing machine and $\llcorner\boldsymbol M\lrcorner$ is its description. This is called a universal one-way function; it is computable on a universal TM with a step counter that cuts each simulation off after polynomially many steps. It is interesting that if any one-way function $g$ exists at all, then $f_{\text{uni}}$ is (weakly) one-way.

Furthermore, combining $f_{\text{uni}}$ with the weak-to-strong amplification above, we can construct a polynomial-time computable strong one-way function. A minimal sketch is given below.
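The sketch below shows the shape of $f_{\text{uni}}$, with a hypothetical `simulate(desc, x, steps)` standing in for a universal TM that runs the machine described by `desc` on input `x` for at most `steps` steps (returning a fixed symbol on timeout):

```python
def f_uni(simulate, desc: str, x: str):
    """f_uni(<M>, x) = (<M>, M(x)), cut off after |x|^2 steps (an assumed
    polynomial bound) so that f_uni itself is polynomial-time computable.
    `simulate` is a hypothetical step-counted universal TM."""
    return (desc, simulate(desc, x, len(x) ** 2))
```

Intuitively, if some machine computes a one-way function within the step bound, then the inputs whose first component encodes that machine already make $f_{\text{uni}}$ hard to invert on a noticeable fraction of inputs, which is exactly weak one-wayness.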

One-Way Function Family

Well, if we put several one-way functions together, we get a collection of one-way functions. This is not very useful unless there is a single evaluation algorithm that computes all of them and halts in polynomial time. If such an algorithm exists, then with some extra attributes the collection becomes quite useful. Based on this idea, we define one-way function families.

One-Way Function Family

A collection of functions $\{f_i:D_i\to\{0,1\}^\ast \mid i\in I\}$ is called strongly (resp., weakly) one-way if there exist three PPT algorithms $\mathcal G,\mathcal S,\mathcal E$ satisfying

  1. The generation algorithm $\mathcal G$: given the input $1^n$, outputs a random index $i$ from $I_n=I\cap \{0,1\}^n$.
  2. The sampling algorithm $\mathcal S$: given the input $1^n$ and an index $i$, outputs a random value $x$ from $D_i$.
  3. The evaluation algorithm $\mathcal E$: given the input $1^n, i, x$, always outputs $f_i(x)$ correctly.

and one-wayness:

  1. For every PPT algorithm $\mathcal A$, every polynomial $p(\cdot)$ and all sufficiently large $n$'s, it holds that

$$
\Pr\limits_{i\leftarrow \mathcal G(1^n),\ x\leftarrow\mathcal S(1^n,i)}[\mathcal A(i,f_{i}(x))\in f^{-1}_{i}(f_{i}(x))]<\frac 1 {p(n)}
$$

The definition is straightforward, but a few points deserve attention.

As for the index set $I$: in order to sample efficiently from $I\cap \{0,1\}^n$ for all sufficiently large $n$, the size of $I$ has to be noticeable, meaning $|I\cap \{0,1\}^n|\geq 2^n/\text{poly}(n)$. The generation algorithm $\mathcal G$ does not have to sample uniformly at random from $I\cap \{0,1\}^n$. Likewise, the output of the sampling algorithm $\mathcal S$ on input $i$ need not be distributed uniformly over $D_i$.

Did you notice something odd? The definition of "one-way function family" does not require each function in the family to be a one-way function! Furthermore, inversion is only required to be hard with respect to the distributions induced by $\mathcal G$ and $\mathcal S$!

Here we recall the amplification for $\mathbf{RP}$ again: by repeating $\mathcal G$ or $\mathcal S$ polynomially many times, we can always obtain a generation or sampling algorithm that succeeds with overwhelming probability (see the sketch below). We will use this in the definition of trapdoor permutation families.
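A minimal sketch of this repetition trick (all names here are illustrative):

```python
def amplify(attempt, is_valid, tries: int):
    """Repeat a randomized `attempt` that succeeds with noticeable
    probability; the failure probability (1 - 1/poly)^tries becomes
    negligible after polynomially many tries, as in the RP amplification."""
    for _ in range(tries):
        out = attempt()
        if is_valid(out):
            return out
    return None  # reached only with negligible probability
```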

A Problem

Goldreich says "without loss of generality, $\mathcal E$ can be deterministic"; but then why does $\mathcal E$ evaluate correctly only with overwhelming probability in the definition of trapdoor permutation families? Why are the two definitions different?

Trapdoor One-Way Permutations

Goldreich does not discuss non-permutation trapdoor one-way functions, so we are going to skip them (they are quite simple as well).

The idea of a trapdoor, I think, comes from the certificate of an $\mathbf{NP}$ problem: with its trapdoor, the one-way function is easy to invert. This makes it sound as if at least one inversion (since every image might have multiple preimages) of such a one-way function is an $\mathbf{NP}$ language. The trapdoor might sound more like advice, since it is the same for all inputs; but for one-way function families, I feel it is natural to regard trapdoors as certificates.

For OWP families (one-way permutation families) with trapdoors, we are going to use the definition most convenient for cryptography.

Trapdoor Permutation Family

Let $I\subseteq \{0,1\}^\ast$ and define $I_n=I\cap\{0,1\}^n$. A trapdoor permutation family with indices in $I$ is a set of functions $\left\{f_i:D_i\to D_i\mid i\in I\right\}$ such that each $f_i$ is a permutation on the corresponding $D_i$. Such a collection is called a trapdoor permutation family if there exist four PPT algorithms $\mathcal G, \mathcal S, \mathcal E, \mathcal E^{-1}$ such that the following five conditions hold:

  1. Index and trapdoor generation $\mathcal G$:
    $$
    \Pr[\mathcal G(1^n)\in I_n\times \{0,1\}^\ast]>1-2^{-n}
    $$

  2. Domain sampling $\mathcal S$: for every $n\in\mathbb N$ and $i\in I_n$

    1. $\Pr[\mathcal S(i)\in D_i]>1-2^{-n}$

    2. If the output of $\mathcal S$ is in $D_i$, then it is distributed uniformly at random over $D_i$, that is
      $$
      \Pr[\mathcal S(i)=x|\mathcal{S}(i)\in D_i]=\frac1{|D_i|}
      $$

  3. Efficient evaluation $\mathcal E$, for every $n\in\mathbb N,i\in I_n$, and $x\in D_i$
    $$
    \Pr[\mathcal E(i,x)=f_i(x)]>1-2^{-n}
    $$

  4. Hard to invert without the trapdoor: for every PPT algorithm $\mathcal A$, every polynomial $p(\cdot)$, and all sufficiently large $n$,
    $$
    \Pr\limits_{(i,t)\leftarrow\mathcal G(1^n),\ x\leftarrow \mathcal S(i)}[\mathcal A(i,f_i(x))=x]<\frac 1{p(n)}
    $$

  5. Easy to invert with the trapdoor: for every $n\in\mathbb N$, every $(i,t)$ in the range of $\mathcal G(1^n)$, and every $x\in D_i$,
    $$
    \Pr[\mathcal E^{-1}(t,f_i(x))=x]>1-2^{-n}
    $$

(Since each $f_i$ is a permutation, outputting $x$ itself is the same as finding a preimage.)
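RSA is the canonical candidate for such a family. The toy sketch below instantiates the four algorithms with fixed tiny primes (a stand-in assumption; real generation samples large random primes, and this instance is completely insecure):

```python
import secrets
from math import gcd

P, Q = 65537, 65539          # toy primes; a real G samples large random ones

def G(n: int = None):
    """Index/trapdoor generation: index i = (N, e), trapdoor t = d."""
    N, e = P * Q, 17
    phi = (P - 1) * (Q - 1)
    d = pow(e, -1, phi)      # e * d = 1 (mod phi(N))
    return (N, e), d

def S(i):
    """Sample (essentially) uniformly from D_i = Z_N^*."""
    N, _ = i
    while True:
        x = secrets.randbelow(N)
        if gcd(x, N) == 1:
            return x

def E(i, x: int) -> int:
    """f_i(x) = x^e mod N -- a permutation of Z_N^*."""
    N, e = i
    return pow(x, e, N)

def E_inv(t, i, y: int) -> int:
    """With the trapdoor d, inversion is easy: y^d mod N."""
    N, _ = i
    return pow(y, t, N)

i, trap = G()
x = S(i)
assert E_inv(trap, i, E(i, x)) == x
```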

Hard-Core Predicates

As we know, it is hard to invert a trapdoor one-way function without the trapdoor. However, "hard" here only means it is hard to recover an exact preimage of a value. Even so, we may still learn partial information about the preimage, which makes it insecure to construct PKEs directly from one-way permutations.

Luckily, it turns out that some bits related to the preimage are truly hard to predict; we call such bits "hard-core". Using the difficulty of predicting the hard-core of a one-way permutation, we can construct PKEs that are secure enough.

Hard-Core Predicate

A PPT computable predicate $b:\{0,1\}^\ast\to\{0,1\}$ is called a hard-core of a function $f$ if for every PPT algorithm $\mathcal A$, every polynomial $p(\cdot)$ and all sufficiently large $n$, it holds that
$$
\Pr[\mathcal A(f(U_n))=b(U_n)]<\frac{1}{2}+\frac 1 {p(n)}
$$

Don't be confused by the word "predicate"; it just means a function with range $\{0,1\}$. As we can deduce from the definition, a 1-1 function with a hard-core must be hard to invert: otherwise, since $b$ is PPT computable, we could predict $b$ by first inverting the function and then evaluating $b$ on the preimage.

We can also define hard-core for a one-way function family.

Hard-Core Predicate for One-Way Function Family

For a one-way function family $(\mathcal G,\mathcal S ,\mathcal E)$, a polynomial-time algorithm $\mathcal B:\{0,1\}^\ast\times \{0,1\}^\ast\to \{0,1\}$ is called a hard-core of the family if for every PPT algorithm $\mathcal A$, every polynomial $p(\cdot)$, and all sufficiently large $n$'s,
$$
\Pr\limits_{i\leftarrow \mathcal G(1^n)\ x\leftarrow \mathcal S(1^n,i)}[\mathcal A(i,f_i(x))=\mathcal B(i,x)]<\frac 12+\frac 1{p(n)}
$$

Hard-Core for Any One-Way Function

In this section, we prove that "where there is a one-way function, there is a hard-core". The construction of the hard-core is based on a very simple idea: generalizing the XOR lemma from statistical indistinguishability to computational indistinguishability.

Goldreich-Levin Theorem

Let $f$ be an arbitrary strong one-way function, and let $g$ be defined as $g(x,r)=(f(x),r)$, where $|x|=|r|$. Then the predicate $b(x,r)=\bigoplus_{i=1}^{|x|}x_i r_i$ (the inner product of $x$ and $r$ modulo 2) is a hard-core of the function $g$.

Well, the proof of this theorem is quite long, but it is not difficult since the basic idea is easy. Assuming there is some algorithm $\mathcal A$ that predicts $b$ with non-negligible advantage, we construct an algorithm $\mathcal B$ which invokes $\mathcal A$ polynomially many times and recovers $x\in f^{-1}(y)$ bit by bit. We first prove a weaker theorem in order to build intuition.

Weaker Version: Constant Success Probability Greater Than $3/4$

We show how to recover one bit of $x$ by invoking a hard-core-predicting algorithm $\mathcal A$ that satisfies
$$
\Pr[\mathcal A(f(x),r)=b(x,r)]>\frac34+\frac{1}{2p(n)}
$$
where $p(n)$ is a polynomial.

Since $r$ is uniformly random, $r\oplus e_i$ is also uniformly random, where $e_i$ is the $n$-dimensional binary vector with $1$ in the $i$-th component and $0$ elsewhere. We can use $\mathcal A(f(x),r \oplus e_i)$ to predict $b(x,r\oplus e_i)$, and the prediction is correct with probability $> \frac34+\frac 1{2p(n)}$.

Whenever both invocations of $\mathcal A$ answer correctly, we have
$$
\begin{align}
\mathcal A(f(x),r)\oplus \mathcal A(f(x),r\oplus e_i) &= b(x,r)\oplus b(x,r\oplus e_i) \\
&= b(x,e_i) \\
& = x_i
\end{align}
$$
We can therefore construct the algorithm $\mathcal B(f(x),r)=\mathcal A(f(x),r)\oplus \mathcal A(f(x),r\oplus e_i)$; since each invocation errs with probability less than $\frac14-\frac1{2p(n)}$, a union bound gives
$$
\Pr[\mathcal B(f(x),r)=x_i]>\frac 1 2 +\frac 1 {p(n)}
$$
So we can recover one bit of $x$ with a noticeable advantage. By invoking $\mathcal B$ repeatedly with independent $r$'s and taking a majority vote (Chernoff bound), we recover every bit of $x$ with high enough probability; a code sketch follows.
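A sketch of this weak version, assuming a predictor `A(y, r)` whose success probability over random $r$ exceeds $3/4$ by a noticeable margin (bit strings are encoded as Python ints):

```python
import secrets

def inner_product_bit(x: int, r: int) -> int:
    """b(x, r) = XOR of x_i * r_i = parity of the bitwise AND."""
    return bin(x & r).count("1") % 2

def recover_bit(A, y, i: int, n: int, trials: int) -> int:
    """Majority vote over A(y, r) XOR A(y, r ^ e_i) for fresh random r."""
    e_i = 1 << i
    ones = 0
    for _ in range(trials):
        r = secrets.randbits(n)
        ones += A(y, r) ^ A(y, r ^ e_i)
    return int(ones > trials // 2)

def recover_x(A, y, n: int, trials: int = 1000) -> int:
    """Each trial is right with probability 1/2 + 1/p(n), so the majority
    is right w.h.p. (Chernoff); a union bound covers all n bits."""
    return sum(recover_bit(A, y, i, n, trials) << i for i in range(n))
```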

Proof of Goldreich-Levin Theorem

In the actual theorem, however, $\mathcal A$'s advantage can be an arbitrarily small noticeable quantity rather than a constant, and the previous method no longer works: invoking $\mathcal A$ twice doubles the error probability, which destroys any advantage below $1/4$.

I think Goldreich introduces his intuition nicely, so I quote his book here, with the symbols slightly adjusted.

What is required is an alternative way of using the algorithm $\mathcal A$, a way that does not double the original error probability of $\mathcal A$. The key idea is to generate the $r$'s in a way that requires applying algorithm $\mathcal A$ only once per each $r$ (and $i$), instead of twice. Specifically, we shall use algorithm $\mathcal A$ to obtain a "guess" for $b(x,r\oplus e_i)$ and obtain $b(x,r)$ in a different way. The good news is that the error probability is no longer doubled, since we use $\mathcal A$ only to get a "guess" of $b(x,r\oplus e_i)$. The bad news is that we still need to know $b(x,r)$; this would be fine if we needed it for only one $r$ (or logarithmically in $|x|$ many $r$'s), but the problem is that we need to know (and hence guess) the values of $b(x,r)$ for polynomially many $r$'s. An obvious way of guessing these $b(x,r)$'s yields an exponentially vanishing success probability. Instead, we generate these polynomially many $r$'s such that, on one hand, they are "sufficiently random," whereas, on the other hand, we can guess all the $b(x,r)$'s with noticeable success probability. Specifically, generating the $r$'s in a particular pairwise-independent manner will satisfy both (seemingly contradictory) requirements.

Back to the proof, with some explanation

Suppose there is an algorithm $\mathcal A$ such that
$$
\Pr\limits_{x\leftarrow U_n,\ r\leftarrow U_n}[\mathcal A(f(x),r)=b(x,r)]>\frac12+\frac1{p(n)}
$$
where $p(\cdot)$ is a polynomial and $|x|=n$.

We define a "good" subset $S_n$ of $\{0,1\}^n$:
$$
S_n\doteq\left\{x\in\{0,1\}^n \,\middle|\, \Pr\limits_{r\leftarrow U_n}[\mathcal A(f(x),r)=b(x,r)]>\frac 12+\frac1{2p(n)}\right\}
$$
By the averaging argument, $|S_n|>\frac1{2p(n)}\cdot 2^n$.

We set $l\doteq \lceil \log_2(2n\cdot p(n)^2+1)\rceil$. We construct an algorithm $\mathcal B$ that, on input $y=f(x)$, proceeds as follows:

  1. Sample $s^1,\cdots, s^l\in\{0,1\}^n$ and $\sigma^1,\cdots, \sigma^l\in\{0,1\}$ uniformly at random.

  2. For every non-empty set $J\subseteq [l]$, compute $r^J=\bigoplus_{j\in J} s^j$ and a bit $\rho^J=\bigoplus_{j\in J}\sigma^j$.

  3. For every $i\in [n]$ and every non-empty $J \subseteq [l]$, compute
    $$
    z^J_i\leftarrow \rho^J\oplus \mathcal A(y,r^J\oplus e_i)
    $$

  4. For every $i\in[n]$, set $z_i$ to be the majority of the $z^J_i$ values.

  5. Output $z=z_1\cdots z_n$.

Some comments are in order. $\sigma^j$ is regarded as a guess for $b(x, s^j)$. This might sound weird, since $\sigma^j$ is sampled uniformly at random; it really is just a guess, and the probability that all guesses are right can be computed:
$$
\Pr[\forall j\in[l]:\sigma^j=b(x,s^j)]= 2^{-l}>\frac 1 {4n\cdot p(n)^2}
$$
Note that whenever all the $\sigma^j$'s are correct, $\rho^J=b(x,r^J)$ for every $J$, since $b(x,\cdot)$ is linear: $b(x,r\oplus s)=b(x,r)\oplus b(x,s)$.
The tricky technique here is that we do not try to verify the guesses. Instead, we use "democratic" (majority) voting.

The basic idea: conditioned on the guesses being right, we can predict each bit $x_i$ from $f(x)$ with probability better than $1/2$ using $\rho^J\oplus \mathcal A(f(x),r^J\oplus e_i)$. The $r^J$'s are not mutually independent, so the Chernoff bound does not apply; luckily, the $r^J$'s we sampled are pairwise independent, and Chebyshev's inequality is enough.

We define a new random variable $\zeta^J_i$, with $\zeta^J_i = 1$ if and only if $r^J$ guesses $x_i$ correctly, i.e., $b(x,r^J)\oplus \mathcal A(y,r^J\oplus e_i)=x_i$. Since $b(x,r^J)\oplus b(x,r^J\oplus e_i)=x_i$, it follows that $\zeta^J_i =1$ if and only if $\mathcal A(y,r^J\oplus e_i)=b(x,r^J\oplus e_i)$. Let $m\doteq 2^l-1\geq 2n\cdot p(n)^2$ be the number of non-empty sets $J$. For $x\in S_n$ and each fixed $i$, we have
$$
\begin{align}
\Pr\left[\sum_{J} \zeta^J_i\leq \frac m 2\right] &\leq \Pr\left[\left|\sum_{J}\zeta^J_i-\left(\frac12+\frac1{2p(n)}\right)\cdot m\right|\geq \frac1{2p(n)}\cdot m\right] \\
&\leq \frac{m\cdot \text{Var}[\zeta^J_i]}{\left(\frac1{2p(n)}\cdot m\right)^2} \\
&\leq \frac{m\cdot \text{Var}[\zeta^J_i]}{\left(\frac1{2p(n)}\right)^2\cdot m\cdot(2n\cdot p(n)^2)} \\
&\leq \frac 1{2n}
\end{align}
$$
where the last step uses $\text{Var}[\zeta^J_i]\leq\frac14$. Now it is time to evaluate the probability that $z_1\cdots z_n=x_1\cdots x_n$. For $x\in S_n$, conditioned on all the $\sigma^j$'s being correct (which happens with probability greater than $\frac1{4n\cdot p(n)^2}$), a union bound over the $n$ bits gives
$$
\begin{align}
\Pr[z_1\cdots z_n=x_1\cdots x_n\mid x\in S_n]&\geq \left(1- \sum^n_{i=1}\frac1{2n}\right)\cdot\frac{1}{4n\cdot p(n)^2}=\frac 1{8n\cdot p(n)^2}
\end{align}
$$
Each time we get an output $z$, we simply test whether $f(z)=y$; if the output is wrong, we run algorithm $\mathcal B$ again. Within polynomially many invocations, the success probability for $x\in S_n$ gets up to, say, $2/3$; since $S_n$ contains a noticeable fraction of all inputs, this contradicts the one-wayness of $f$. A code sketch of a single attempt of $\mathcal B$ is given below.
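Finally, a sketch of one attempt of algorithm $\mathcal B$, with bit strings as ints, `A(y, r)` the assumed predictor, and `l` as above; repeating it polynomially many times and testing each output against $y$ gives the inverter:

```python
import secrets
from itertools import combinations

def gl_attempt(A, y, n: int, l: int) -> int:
    """One run of B: guess b(x, s^j) for l seeds, expand to 2^l - 1
    pairwise-independent r^J by XOR, and majority-vote each bit of x."""
    s = [secrets.randbits(n) for _ in range(l)]       # seeds s^1..s^l
    sigma = [secrets.randbits(1) for _ in range(l)]   # guesses for b(x, s^j)
    Js = [J for k in range(1, l + 1) for J in combinations(range(l), k)]
    r, rho = {}, {}
    for J in Js:
        rJ = pJ = 0
        for j in J:
            rJ ^= s[j]
            pJ ^= sigma[j]       # = b(x, r^J) whenever all guesses are right
        r[J], rho[J] = rJ, pJ
    x = 0
    for i in range(n):
        e_i = 1 << i
        votes = sum(rho[J] ^ A(y, r[J] ^ e_i) for J in Js)
        if 2 * votes > len(Js):  # majority of the z_i^J values
            x |= e_i
    return x
```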

More

Exercise

Please don’t ask me for solutions, because I’m solving them, too. I would like to give some hints for difficult problems after I’ve done them.

  1. Prove that if one-way functions exist, then $\mathbf{NP}\setminus\mathbf{BPP}\neq \emptyset$.
  2. Prove that in the proof of the Goldreich-Levin theorem, for any distinct non-empty $I,J\subseteq [l]$, $r^I$ and $r^J$ are independent.