New Lookup Argument

rkm0959 · October 26, 2022, 3:04am

New Lookup Argument

Authors: rkm0959 (Gyumin Roh), Wei Dai, Mark (majabbour), Andrew He

This work started from the problems Haichen Shen and Ying Tong presented at the ETH x ZK Research Workshop in Stanford. We thank them and many others from the research workshop for discussing the base of the results below together. We also thank Yan Zhang for helping us in reviewing this document.

In this post, we present a new lookup argument and analyze its efficiency, alongside with some comparisons with the PLOOKUP.

We will fix a field \mathbb{F} with a multiplicative subgroup of order N and its generator \omega.
Denote \mathcal{L}_i(x) as the Lagrange basis polynomials where \mathcal{L}_i(\omega^j) = 1 if i = j and 0 otherwise.

Representations of Multisets

Consider an n-vector \vec a = (a_0, \ldots, a_{n-1}) \in \mathbb F^n with n \le N. We identify \vec a with the multi-set \mathcal A = \{a_i\}_{i=0,\ldots, n- 1} and consider two representations of \vec a and A:

The standard representation of A is the polynomial

A(X) = \sum^{n-1}_{i=0} a_i \cdot \mathcal{L}_i(X)\

The roots representation of A is the polynomial
Z_{A}(X) = \prod_{i=0}^{n-1} (X - a_i) = \prod_{i=0}^{n-1} (X - A(\omega^i))\,.
The inverse roots representation of A is the rational polynomial
W_A(X) = \sum_{a \in A} \frac{1}{X - a}

New Lookup Argument

Consider two multisets A = \{a_0, a_1, \cdots, a_{N-1} \} and B = \{b_0, b_1, \cdots , b_{n-1}\}.

If the size of A is less than N, we can pad it with some values in B.

Claim: Every element of \mathcal A is in \mathcal B iff R = Z_B \cdot W_A is a polynomial of degree less than n.

Proof sketch: Suppose there is some element of \mathcal A, say a, not in \mathcal B, then clearly there is a term (X - a)^{-1} in the summands of the expression for R(X), which means that R(X) is not a low degree polynomial. On the other hand, suppose every element of \mathcal A is in \mathcal B, then it is straight forward to check that Z_B \cdot W_A is a polynomial of degree n - 1.

Therefore, we can create a lookup argument that checks a single polynomial identity

Z_B(X) \cdot W_A(X) = R(X)

We explicitly write out the entire lookup protocol. We assume that the lookup table B is fixed and known by both prover and verifier. We’ll first discuss the interactive version.

We assume that Z_B and its KZG commitment are precomputed.

Step 1

The prover computes the standard representation of A and commits to it as [A(x)]_1.
The prover also computes R(X) and commits to it as [R(x)]_1.

Step 2

The verifier sends the random challenge \gamma.

Step 3

The prover computes y = W_A(\gamma). To prove this, we use a grand product argument.
The grand product polynomial Z is constructed as

(Z(\omega X) - Z(X) + y/N)\cdot(\gamma - A(X)) = 1

Here, we see that our recursive formula for Z(\omega^k) gives

Z(\omega^k) = Z(1) - \frac{yk}{N} + \sum_{i=0}^{k-1} \frac{1}{\gamma - a_i}

for each 0 \le k < N and the final check forces

Z(\omega^N) = Z(1) - y + \sum_{i=0}^{N-1} \frac{1}{\gamma - a_i}

which is what we wanted.

The quotient polynomial we want is

t(X) = \frac{1}{Z_H(X)} \left((Z(\omega X) - Z(X) + y/N) (\gamma - A(X)) - 1 \right)

which is degree at most N.

The prover sends y, the KZG commitment of Z(X), KZG commitment of t(X), and the evaluation for Z_B(\gamma).

Step 4

The verifier sends another challenge \delta for the grand product argument.

Step 5

The prover sends the KZG evaluation proofs of Z_B(\gamma), R(\gamma), Z(\delta), Z(\omega \delta), t(\delta), A(\delta). Here, R(\gamma) = Z_B(\gamma) y should be checked.
Maller’s optimization and the protocol from 2020/081 can be used here.

Step 6

The verifier checks the quotient polynomial identities with the given KZG proofs.

For the non-interactive version, the challenges are computed via Fiat-Shamir.

Prover Time

We will consider a single FFT evaluating a polynomial with degree less than N at N roots of unity as “a single FFT”. This means that multiplying two polynomials of degree no more than N, which requires evaluating both at $2N$th roots of unity, multiplying the evaluations and turning it back to coefficient form, requires 6 FFTs.

We assume that the KZG commitments to all Lagrange basis are precomputed, which can be done in \mathcal{O}(N \log N) time by algorithms, for example, [FK20].

First, we explain that the computation of R(X) takes \mathcal{O}(n \log^2 n) field operations. We note that multiplying n linear polynomials can be done with Divide & Conquer along with FFT in \mathcal{O}(n \log^2 n).

Therefore, we can precompute Z_B(X) in \mathcal{O}(n \log^2 n). Also, the sum

\sum_{a \in A} \frac{1}{X - a}

can be represented in \mathcal{O}(N) time

\sum_{i=0}^{n-1} \frac{cnt_i}{X - b_i}

where cnt_i is the number of times b_i appear in A.

This can be written as a quotient of two polynomials of degree no more than n in \mathcal{O}(n \log^2 n) field operations with a similar Divide & Conquer and FFT.

Let’s take a look at the computations of prover.

The prover needs to compute

KZG commitment to A(X)
- Assuming that A is given, this is a MSM of size N.
- They need to compute A(X) itself, which is one FFT.
KZG commitment to R(X)
- Computing R(X) takes \mathcal{O}(n \log^2 n) field operations.
- Computing the KZG commitment is a MSM of size n.
KZG commitment to Z(X)
- Computing the KZG commitment is a MSM of size N.
- Computing Z itself is one FFT as well.
KZG commitment to t(X)
- Computation of t(X) requires one polynomial multiplication between Z(\omega X) - Z(X) and \gamma - A(X), both of degree no more than N, 6 FFTs.
- After computing t(X), the rest is a multi-scalar-multiplication of size N.

The prover also needs to provide KZG proofs to R(\gamma), Z_B(\gamma), Z(\delta), Z(\omega \delta), and

(Z(\omega \delta) - Z(\delta) + y/N) (\gamma - A(X)) - 1 - t(X) Z_H(\delta)

at X = \delta, which should evaluate to 0.

These four distinct polynomials are all of degree at most N. Using the protocol from https://eprint.iacr.org/2020/081.pdf, we need to send

2 G_1 elements to verifier
at most 2N+1 G_1-exponentiation
7 G_1-exponentiation and 2 pairings by verifier

All in all, the prover cost is

3 MSMs of size N and 1 MSM of size n
8 FFTs of size N
one Divide & Conquer + FFT computation of cost \mathcal{O}(n \log^2 n) field operations
one protocol of 2020/081 with degree no more than N over 4 distinct polys

The proof consists of [A(x)]_1, [R(x)]_1, y, [Z(x)]_1, Z_B(\gamma), [t(X)]_1 and the proofs from the polynomial evaluation protocol in 2020/081 (2 G_1 elements)

which is 6 G_1 elements and 2 field elements.

A quick look back at PLOOKUP (PLONKUP Version)

In PLONKUP, with \mathbb{F} of order N and roots of unity \omega, there are

The query vector f = (f_0, \cdots , f_{N-1})
The table vector t = (t_0, \cdots , t_{N-1})

and s = (f, t) with s sorted w.r.t. t. Note that the query/tables are both size N.

In PLONKUP, the prover divides s into alternating halves h_1, h_2, i.e.

h_1 = (s_0, s_2, \cdots , s_{2N-2})
h_2 = (s_1, s_3, \cdots , s_{2N-1})

The grand product here is, with random \epsilon, \delta,

Z(\omega X) = Z(X) \frac{(1+\delta)(\epsilon + f(X))(\epsilon(1+\delta) + t(X) + \delta t(\omega X))}{(\epsilon(1+\delta) + h_1(X) + \delta h_2(X)) (\epsilon(1+\delta) + h_2(X) + \delta h_1(X\omega))}

Therefore, the prover needs to KZG commit f, h_1, h_2. The prover also needs to commit to the grand product polynomial and the quotient polynomial.

The prover needs to compute

KZG commitments to f(X), h_1(X), h_2(X)
- Each of these can be computed with a MSM of size N
- You also need an FFT to compute each polynomials.
KZG commitment to Z(X)
- The KZG commitment calculations is a MSM of size N
- You also need an FFT to compute the polynomials.
KZG commitment to q(X)
- This requires two multiplications of three polynomials of degree N, and the KZG commitment requires two multi-scalar-multiplication of size N.
- This requires 32 FFTs (there may be better ways) and two MSM of size N.
KZG proofs to f(\delta), t(\delta), t(\omega \delta), Z(\omega \delta), h_1(\omega \delta), h_2(\delta) and the linearization polynomial.
- These are all degree at most N, and the protocol from 2020/081 applies.
- There are 6 different polynomials, and two evaluation points.

Again, note that f, h_1, h_2, Z has degree \approx N and q has degree \approx 2N.

All in all, the prover cost is

6 MSMs of size N
36 FFTs of size N
one protocol of 2020/081 with degree no more than N over 6 distinct polys

The proof consists of [f(x)]_1, [h_1(x)]_1, [h_2(x)], [Z(x)]_1, [q_{low}(X)]_1, [q_{high}(X)]_1, [t(X)]_1 and the proofs from the polynomial evaluation protocol in 2020/081 (2 G_1 elements)

which is 9 G_1 elements

Therefore, the new lookup argument may be more efficient in the prover side in the regime N \gg n, i.e. where the \mathcal{O}(n \log^2 n) doesn’t become the bottleneck.

A quick look back at Halo2’s Lookup

https://zcash.github.io/halo2/design/proving-system/lookup.html

Here, with queries A and lookup table S, they provide a permutation A' and S' and check if

(A'(X) - S'(X))(A'(X) - A'(\omega^{-1} X)) = 0

along with the permutation check with grand product

Z_{i+1} = Z_i \cdot \frac{(A_i + \beta)(S_i + \gamma)}{(A_i' + \beta)(S_i' + \gamma)}

Therefore, we need to do (assuming S is fixed)

KZG commitments to A, A', S'
- each of them require MSM of size N and FFT of size N
KZG commitment to Z
- it requires an MSM of size N and FFT of size N
Quotient Polynomial
- it requires one multiplication of two polynomials of degree less than N, and two multiplication of three polynomials less than N, so 6 + 16 * 2 = 38 FFTs. The KZG commitment requires two MSM of size N.

Therefore, it requires at least 6 MSM of size N and 42 FFTs of size N.

Supporting Vector Lookups

Assume that there is a size n table of length t vectors. We’ll denote it B_0, B_1, \cdots, B_{t-1}.

We wish to prove that (A_{0, i}, A_{1, i}, \cdots , A_{t-1, i}) = (B_{0, j}, B_{1, j}, \cdots, B_{t-1, j}) for some j for all i.

Unlike the standard representation, the roots representation is not linear if we combine each columns with a random linear combination. Therefore, to compute the equivalent value for Z_B(\gamma), we would need another grand product argument.

We would need to consider

R(X) = \prod_{i=0}^{n-1} \left(X - \sum_{k=0}^{t-1} \alpha^k B_{k,i} \right) \cdot \left( \sum_{i=0}^{N-1} \frac{1}{X - \sum_{k=0}^{t-1} \alpha^k A_{k, i}} \right)

and commit to it.

To prove y = W_A(\gamma), the grand product polynomial Z is constructed as

(Z(\omega X) - Z(X) + y/N)\cdot \left(\gamma - \sum_{i=0}^{t-1} \alpha^i A_i(X) \right) = 1

The prover now also needs to prove and compute w = Z_B(\gamma). This is also done with a grand product. The grand product polynomial Z^* is constructed as

Z^*(1) = 1 \\ Z^*(\omega X) = Z^*(X) \left(\gamma - \sum_{i=0}^{t-1} \alpha^i B_i(X) \right) \\ Z^*(\omega^n) = w

The rest is similar to the original protocol, but we don’t need to open Z_B(\gamma).

Prover Time in Vector Lookup Case

Here, we will simply consider the variant where the prover also needs to provide proof for the evaluation of Z_B(\gamma) with a grand product argument.

We will assume that N \gg n, as this is the case where our new lookup is expected to outperform PLONKUP.

The additional element is the grand product polynomial Z^*, and the quotient polynomial requiring

\frac{1}{Z_H(X)}\left( (Z^*(X) - 1) \mathcal{L}_0(X) + \alpha (Z^*(\omega X) - Z^*(X) (\gamma - B(X)))(X - \omega^{N-1}) + \alpha^2(Z^*(X) - w) \mathcal{L}_n(X)\right)

Computing Z^* requires one FFT and one MSM of size N.
t still has degree of at most N, and 6 additional FFT are required.
- there are some division by linear polynomials
We need Z^*(\delta), Z^*(\omega \delta)'s proofs as well, but no need for Z_B(\gamma)'s proof.
- The number of polynomials for 2020/081 remains the same

All in all, the prover cost is

4 MSMs of size N and 1 MSM of size n
15 FFT of size N
one Divide & Conquer + FFT computation of cost \mathcal{O}(n \log^2 n) field operations
one protocol of 2020/081 with degree no more than N over 4 distinct polys

The proof consists of [A(x)]_1, [R(x)]_1, y, w, [Z(x)]_1, [Z^*(x)]_1, [t(X)]_1 and the proofs from the polynomial evaluation protocol in 2020/081 (2 G_1 elements)

which is 7 G_1 elements and 2 field elements.

This is also still better than PLONKUP if N \gg n.

Appendix: Alternate Derivation of New Lookup Argument

For distinct a_i and e_i \ge 1, if

P(X) = \prod_{i=0}^{n-1} (X - a_i)^{e_i}

it can be proved that

\gcd(P(X), P'(X)) = \prod_{i=0}^{n-1} (X - a_i)^{e_i - 1}

and therefore

\frac{P(X)}{\gcd(P(X), P'(X))} = \prod_{i=0}^{n-1} (X - a_i)

Now looking at the lookup argument, the claim that a \in B for each a \in A is equivalent to

Z_B \equiv 0 \bmod \left( \frac{Z_A}{\gcd(Z_A, Z_A')} \right)

which is equivalent to

Z_B Z_A' \equiv 0 \pmod{Z_A}

i.e.

Z_B \cdot \frac{Z_A'}{Z_A} = Z_B \cdot \sum_{a \in A} \frac{1}{X - a} = Z_B \cdot W_A

is a polynomial. This is the base of our new lookup argument.

rkm0959 · October 26, 2022, 3:16am

So with the new paper https://eprint.iacr.org/2022/1447.pdf (flookup) uploaded, here are some differences AFAIK.

Denote n as the table size and m as the number of times we are looking it up.

flookup

requires O(n \log^2 n) group operations as preprocessing
then requires O(m \log^2 m) field ops for proving
has strengths when n / \log n \gg m

this idea

requires O(n \log^2 n + m \log m) field ops for proving
the constant in O(m \log m) part than PLONKUP, has strengths when m \gg n

rkm0959 · December 28, 2022, 1:16am

so it seems that the inverse roots form is indeed really strong!