Discussions on Lookups in Groth16: Ultragroth

yush · August 3, 2024, 2:17am

@levs57 posted a construction of how one would add lookups to groth16, called UltraGroth. The above linked hackmd has the main details. This thread is for sharing additional discussion between nx, yush, and levs.

nx:

i had some questions after reading ur notes. Sorry if they seem silly, it’s been a while since i’ve messed with groth16

so from what i understand, the main idea here is that we split public and private inputs into multiple rounds, such that private inputs for a round cannot depend on inputs from later rounds?

i’m not really sure why this can’t be done with vanilla groth16. i think it would help if i had an explicit example

levs:

ok, this is almost vanilla groth16, but with slightly modified crs

it has multiple gammas (and I think also multiple deltas if you want multi-round public inputs, though I think it is not necessary)

if you just did it with a single delta (i.e. communicate multiple commitments which allegedly correspond to rounds), I’d be unable to see the integrity - i.e. you’d be able to arbitrarily subtract something from commitment to 1st round, and add the same point to 2nd…

which is not good

groth16 lets recall how it works first

we have R1CS witness w_i

and we compute A = [\alpha]_0 + \sum A_i w_i

B = [\beta]_1 + \sum B_i w_i

and C is kinda funky

we have 3 sets of polynomials, and they correspond to rows in matrix repr of R1CS

L_i (x), R_i(x), O_i(x)

“left, right, output”

each C_i is actually (\beta A_i + \alpha B_i + O_i) / delta

and there is also subset of things treated as “public”

and for these C_i-s are denoted IC_i -s

and have different formula

(\beta A_i + \alpha B_i - O_i) / gamma

next, in C there is also some additional stuff related to quotient polynomial

because R1CS equation says that A(x) B(x) = C(x) on our evaluation subset, right?

so it is actually A(x) B(x) = C(x) + Z(x) H(x)

lets denote h_i coefficients of H(x)

so I want to say that A_i = L_i(\tau), B_i = R_i(\tau)

and C_i is kinda funky.

and I forgot signs of course

anyway, into C we also inject terms related to H

so we also have Z_k = \tau^k Z(\tau)/delta

and hence sum h_k Z_k sort of represents term Z(x) H(x), and 1/\delta “marks” it so it is grouped into C

so lets inspect verification eq

e(A, B) = e([\alpha]_1, [\beta]_2) + e(C, \delta) + e(IC, \gamma)

lets ignore public stuff for a moment

e(A, B) opens up into e(\alpha, beta), plus products of A-terms by beta, plus products of B-terms by alpha (all clearly cancelled in RHS)

and actual interesting stuff, which is exactly A(\tau) B(\tau)

and it gets cancelled by C(\tau)/delta * \delta

ok, so now lets try to understand why prover can’t cheat too much

we can do it in algebraic group model, which basically means that we just pretend that these elements are like indeterminates, and every value that prover produces is linear combination of things that it has in common reference string

first, lets observe that lhs must have alpha and beta to produce \alpha \beta term on the right

moreover, because crs doesnt contain [beta]_1 or [alpha]_2, A actually must contain alpha, and B must contain beta

lets also linearly rescale them in a way that A= alpha + stuff, B = beta + stuff

then observe that stuff thats multiplied by alpha can only be cancelled by terms in C, similar to stuff in beta

that will enforce that witness coefficients that are involved in A, B and C are all equal

and then, there will be additional terms related to delta, don’t exactly remember how they are cancelled, but it is actually written in my text

so now lets see what gamma does, functionally

it sort of splits C into 2 parts, such that the union of their coefficients (these are 2 non-intersecting vectors) is the same as coefficients of A, or B

but they can not get intermixed with each other

in groth16, it is used to separate into public and private inputs

but we can use the same principle to separate into just few “rounds” - separate subsets of the circuit, such that the commitment to them is exposed in the proof

now we can hash them, and pass data into public inputs, i.e. use them as actual challenge rounds

which is exactly my proposal

the question of zero knowledge is a bit more subtle than in normal Groth16, which is even rerandomizable - i.e. given a proof, I can make a new proof which is equidistributed in space of all valid proofs

but indeed we can find blinding factors that properly hide everything

There is also a way to get to the same construction from other side, basically starting “modularly”, saying - lets use groth16-style construction for linear combinations, and then somehow interface into plonk, which supports rounds.

Then, you can think what functionally works as commitment to data (so you can pass it to plonk ) and apparently find that now you don’t need to pass it, and can just make rounds inside of groth16.

Considering public inputs - I think we can have a single round of public inputs, i.e. multiple gammas, but single delta.

Whats important, though, is that Fiat-Shamir must include actual public inputs. It can be achieved by creating a private signal for each public signal (in a round corresponding to this public signal “real round”), and then applying the constraint that makes them equal.

In reality, public inputs in round 0 are real public inputs, and all others are challenges.

nx:

it’s still not super clear to me what the exact workflow is for a prover here. like say i want to prove some lookup is correct, what do i need to compute and what is public vs private

levs:

oh, considering lookup it is quite easy, we only need 2 rounds

0-th round contains all your interesting data, and lookup argument itself (there are a bunch of them, but they all reduce to some sort of randomized check over some array, for example standard halo2 lookup eventually checks permutation argument, which says that \prod (x_i - t) = \prod (y_i - t) for a random t)

now, t is a challenge, and evaluation of this thing is allocated in 1st round

so prover flow is as follows

compute witness to 0-th round

compute C of the 0-th round

(apply blinding factors here, too, actually)

hash it to obtain t

now compute full witness

t is also substituted as public input in its rightful place

cool, we are now ready to do normal groth16 proof

(except C is like, split)

now verifier does following checks:

<A, B> = <\alpha, \beta> + <C0, \delta_0> + <C1, \delta_1> + <IC, \gamma>
IC.t = Hash(C0) (recall IC are public inputs, so they are sent plainly with the proof)

nx:

is it reasonable to think of this as like “separate” groth16 proofs kinda “chained” together

levs:

nah, more like a single proof with more “unmixable” groups than just private and public inputs

chained thing could be done too, it is just like… worse

i.e. you could imagine 2 groth16 proofs with some sort of interface between them

but to actually move data between them, you need commitment, which, in turn, actually solves the problem already

i.e. “C actually commits to witness”

nx:

i see

is it also technically possible to not reveal the hash challenge and prove it was correctly computed in circuit

so that it just ends up being a normal groth16 proof?

levs:

let me think, thats interesting

I guess no, you will need to somehow pass C0 (parsed as collection of scalar field elements) back into the circuit, still

if not t, you will need to at least pass C0

normal groth proof doesn’t have access to itself in a circuit

I need verifier to pass it as public input

nx:

can’t i do a hash over all the earlier inputs or smth

levs:

you need to hash commitment to private data

you could technically hash EVERYTHING thats in C0

but that would defeat the purpose, i.e. we are like abusing the verifier to give us hash of commitment of stuff to C0

to avoid doing it in circuit

(similar to what happens in plonk, exactly)

instead of doing in-circuit hashing, ask verifier to hash the commitment - it is cheaper

levs:

maybe an instructive thing is Plonk + linchecks

you can make an argument, simpler than plonk

that proves the following: for a fixed matrix A (trusted setup data), and committed vector v, it holds that Av = 0

and this can be combined with normal PlonK (even without copy constraints, because we can simulate copy constraints as particular case of lincheck)

so this achieves (in a bit more involved way, but still) the same combo - we have both lookups, and free linear combinations

this even has certain advantages, i.e. you can use gates of higher degree if you use plonkish argument then

and lincheck is quite easy to do - you make normal KZG commitments, and for each basis element e_i you also make something like

e_i / delta + A(e_i) / gamma

and you publish [delta]_2

but do not publish anything related to gamma

so only way to cancel them is make linear combination \sum s_i e_i such that A(\sum s_i e_i) = 0

and then for kzg commitment C (normal one) you ask to find an element C’ such that <C, [1]_2> = <C’, [\delta]_2>

this is like small and really stupid version of groth which is only capable of checking linear equations ;))

nx:

so if this were to be implemented in circom groth16,

r1cs doesn’t need to change, u just need some extra metadata in ur .r1cs file storing what “rounds” each wire belongs to
the witness generation gets a little funky since ur challenges are H(C_i). In the sense that u kind of need to compute ur proof and ur witness together, rather than being able to do them separately

levs:

indeed, exactly

slightly different format of CRS (normal groth16 crs can be used, but insecure of course, you need different deltas)

considering circom specifically, there are 2 different approaches to syntax that can be taken

you can make “squeeze challenge” primitive, which shifts round to the next - and all signals are allocated to next round after that
it is very simple and clean, but non-modular - components now become very annoying to use…

second option is allowing user to declare rounds manually, but then they can write something which will not execute (because they made a cycle and require some info depending on a challenge in earlier rounds) - which can be easily detected, though, so the program just will not compile, similar to circular dependency in unassigned signals

yush:

btw whats the main diff between this scheme and bellings? https://ethresear.ch/t/grolup-plookup-for-r1cs/14307

levs:

this is heavier, akin to modular scheme I mentioned

i.e. it is like interfacing from snark to plonk (thats the content of legosnark paper)