Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/frontier/10-snarks-starks/sage/10d-groth16-overview.ipynb
483 views
unlisted
Kernel: SageMath 10.5

Notebook 10d: Groth16 Overview

Module 10. SNARKs and STARKs


Motivating Question. We have the QAP equation A(x)B(x)C(x)=H(x)Z(x)A(x) \cdot B(x) - C(x) = H(x) \cdot Z(x). But the verifier can't check this directly without learning the witness. How do we evaluate polynomials "in the exponent", hiding the witness inside elliptic curve points while still allowing verification? Groth16 does this using pairings and a trusted setup, producing a proof that's just 3 curve points (< 200 bytes) no matter how big the circuit.


Prerequisites. You should be comfortable with:

  • QAP construction: polynomials, vanishing polynomial, H(x)H(x) (Notebook 10c)

  • Bilinear pairings: e(aP,bQ)=e(P,Q)abe(aP, bQ) = e(P, Q)^{ab} (Module 07)

  • Elliptic curve scalar multiplication (Module 06)

Learning objectives. By the end of this notebook you will be able to:

  1. Understand the trusted setup ceremony and the "toxic waste" problem.

  2. See how polynomial evaluations are encoded as elliptic curve points.

  3. Walk through a simplified Groth16-style proof and verification.

  4. Appreciate why the proof is constant-size and why the trusted setup is necessary.

1. The Idea: Evaluate in the Exponent

Bridge from Notebook 10c. The QAP check is: does A(τ)B(τ)C(τ)=H(τ)Z(τ)A(\tau) \cdot B(\tau) - C(\tau) = H(\tau) \cdot Z(\tau) for a secret τ\tau? Recall from 10c that the Schwartz-Zippel lemma guarantees this check at a single random point τ\tau catches any cheating prover with overwhelming probability, so we only need one evaluation point. The problem: if the verifier learns A(τ)A(\tau) as a field element, the witness leaks. The solution: encode A(τ)A(\tau) as a curve point [A(τ)]1=A(τ)G1[A(\tau)]_1 = A(\tau) \cdot G_1 and use a pairing (Module 07) to check the multiplication without revealing the scalar:

e([A(τ)]1,[B(τ)]2)=e(G1,G2)A(τ)B(τ)e([A(\tau)]_1, [B(\tau)]_2) = e(G_1, G_2)^{A(\tau) \cdot B(\tau)}

This is exactly what Groth16 does. The secret τ\tau is generated during a trusted setup and then destroyed (the "toxic waste").

ComponentPlain QAP (10c)Groth16
Evaluation pointRandom τ\tau known to verifierτ\tau hidden in CRS, then destroyed
Polynomial valuesField elements A(τ),B(τ),A(\tau), B(\tau), \ldotsCurve points [A(τ)]1,[B(τ)]2[A(\tau)]_1, [B(\tau)]_2
Checking ABA \cdot BField multiplicationPairing: e([A]1,[B]2)e([A]_1, [B]_2)
Proof sizeAll polynomial values3 curve points (constant!)
Zero-knowledgeNone (witness is visible)Yes (randomized proof)
# Setup: we'll use a toy pairing-friendly curve # (Same supersingular curve as Module 07) p = 467 E = EllipticCurve(GF(p), [1, 0]) # y² = x³ + x n = 13 # subgroup order k = 2 # embedding degree cofactor = E.cardinality() // n F2 = GF(p^k, 'a') E_ext = E.change_ring(F2) # Generators while True: G1 = cofactor * E.random_point() if G1 != E(0) and n * G1 == E(0): break G1_ext = E_ext(G1) cofactor_ext = E_ext.cardinality() // n while True: G2 = cofactor_ext * E_ext.random_point() if G2 != E_ext(0) and n * G2 == E_ext(0): if G2.weil_pairing(G1_ext, n) != 1: break print(f"Curve: y² = x³ + x over F_{p}") print(f"Subgroup order: n = {n}") print(f"G1 = {G1}") print(f"G2 = {G2}") # Also set up the QAP field (small for demonstration) F_qap = GF(n) # QAP computations mod n R.<X> = PolynomialRing(F_qap)

2. The Trusted Setup ("Ceremony")

A trusted party (or a multi-party computation) generates a secret τZ/nZ\tau \in \mathbb{Z}/n\mathbb{Z} and publishes curve points encoding powers of τ\tau:

CRS={[τi]1=τiG1}i=0d,{[τi]2=τiG2}i=0d\text{CRS} = \{[\tau^i]_1 = \tau^i \cdot G_1\}_{i=0}^{d}, \quad \{[\tau^i]_2 = \tau^i \cdot G_2\}_{i=0}^{d}

Then τ\tau is destroyed, this is the toxic waste. If anyone learns τ\tau, they can forge proofs.

Misconception alert. "The trusted setup means you have to trust one person." Not in practice, multi-party computation (MPC) ceremonies ensure that as long as any single participant destroys their share, the toxic waste is unrecoverable. Zcash's Powers of Tau ceremony had hundreds of participants.

# Trusted setup: generate tau and compute CRS tau = ZZ(F_qap.random_element()) # secret! while tau == 0: tau = ZZ(F_qap.random_element()) max_degree = 4 # enough for our toy example # Powers of tau as curve points CRS_G1 = [ZZ(F_qap(tau^i)) * G1 for i in range(max_degree + 1)] # [G1, τG1, τ²G1, ...] CRS_G2 = [ZZ(F_qap(tau^i)) * G2 for i in range(max_degree + 1)] # [G2, τG2, τ²G2, ...] print(f"=== Trusted Setup ===") print(f"Secret tau = {tau} (TOXIC WASTE, must be destroyed!)") print(f"\nCRS (G1 points):") for i, pt in enumerate(CRS_G1): print(f" [τ^{i}]₁ = τ^{i}·G1 = {pt}") print(f"\nCRS (G2 points):") for i, pt in enumerate(CRS_G2): print(f" [τ^{i}]₂ = τ^{i}·G2 = {pt}") print(f"\nAnyone can USE these points but cannot recover τ from them.") print(f"(Recovering τ from τ·G would require solving the discrete log problem.)")

Checkpoint 1. The CRS encodes τ\tau "in the exponent", as curve points. Given [τi]1[\tau^i]_1, you can compute [f(τ)]1[f(\tau)]_1 for any polynomial ff by taking linear combinations: [f(τ)]1=f0[1]1+f1[τ]1+f2[τ2]1+[f(\tau)]_1 = f_0 [1]_1 + f_1 [\tau]_1 + f_2 [\tau^2]_1 + \cdots. But you can never learn τ\tau itself.

3. Evaluating Polynomials on Curve Points

Given a polynomial f(x)=a0+a1x+a2x2+f(x) = a_0 + a_1 x + a_2 x^2 + \cdots and the CRS, anyone can compute [f(τ)]1[f(\tau)]_1 without knowing τ\tau:

[f(τ)]1=a0[1]1+a1[τ]1+a2[τ2]1+[f(\tau)]_1 = a_0 [1]_1 + a_1 [\tau]_1 + a_2 [\tau^2]_1 + \cdots
def eval_poly_on_curve(poly, CRS_points, E_zero): """ Evaluate polynomial on CRS curve points. Returns [f(tau)]_G = sum(coeff_i * [tau^i]_G) """ coeffs = poly.list() # [a_0, a_1, a_2, ...] result = E_zero # point at infinity for i, c in enumerate(coeffs): if c != 0: result = result + ZZ(c) * CRS_points[i] return result # Example: f(X) = 3X² + 2X + 1 f_test = R(1 + 2*X + 3*X^2) print(f"f(X) = {f_test}") print(f"f(τ) = f({tau}) = {f_test(F_qap(tau))}") # Evaluate on curve points (without knowing tau!) f_tau_G1 = eval_poly_on_curve(f_test, CRS_G1, E(0)) # Compare with direct computation (using tau, cheating for verification) f_tau_direct = ZZ(f_test(F_qap(tau))) * G1 print(f"\n[f(τ)]₁ via CRS: {f_tau_G1}") print(f"[f(τ)]₁ directly: {f_tau_direct}") print(f"Match? {f_tau_G1 == f_tau_direct}") print(f"\nWe computed [f(τ)]₁ without knowing τ = {tau}!")

4. Simplified Groth16 Proof

Let's build a simplified SNARK for our running example f(x)=x3+x+5f(x) = x^3 + x + 5.

Setup: We have the QAP from Notebook 10c. Now the prover computes:

  • [A(τ)]1[A(\tau)]_1, encoding of the "left" polynomial evaluation

  • [B(τ)]2[B(\tau)]_2, encoding on G2G_2

  • [H(τ)]1[H(\tau)]_1, encoding of the quotient polynomial

Verification: The verifier checks: e([A(τ)]1,[B(τ)]2)=e([C(τ)]1,G2)e([H(τ)]1,[Z(τ)]2)e([A(\tau)]_1, [B(\tau)]_2) = e([C(\tau)]_1, G_2) \cdot e([H(\tau)]_1, [Z(\tau)]_2)

This is the pairing-based version of A(τ)B(τ)=C(τ)+H(τ)Z(τ)A(\tau) \cdot B(\tau) = C(\tau) + H(\tau) \cdot Z(\tau).

# Build the QAP (reusing tools from 10c, simplified) # f(x) = x³ + x + 5, two multiplication constraints # Simplified: A(X)·B(X) - C(X) = H(X)·Z(X) # We'll work with a concrete example where we know all values # x_input = 3, so f(3) = 35 x_input = F_qap(3) # For this demo, define the QAP polynomials directly # (In practice these come from R1CS interpolation) # Simplified: A(τ) = 3 (our x value), B(τ) = 3, result is 9 = w1 # Let's directly demonstrate the pairing check concept # Suppose A(τ) = a, B(τ) = b, C(τ) = c, H(τ) = h, Z(τ) = z # such that a * b = c + h * z a_val = ZZ(F_qap(5)) # some QAP evaluation b_val = ZZ(F_qap(7)) # some QAP evaluation hz_val = ZZ(F_qap(a_val * b_val % n)) # h*z such that a*b = c + h*z c_val = ZZ(F_qap(0)) # for simplicity # So a*b mod n = 35 mod 13 = 9, and c + h*z = 0 + 9 = 9 ✓ # Encode as curve points proof_A = a_val * G1 # [A(τ)]₁ proof_B = b_val * G2 # [B(τ)]₂ proof_C = c_val * G1 # [C(τ)]₁ proof_HZ = hz_val * G1 # [H(τ)·Z(τ)]₁ print(f"=== Proof Elements ===") print(f"[A(τ)]₁ = {proof_A}") print(f"[B(τ)]₂ = {proof_B}") print(f"[C(τ)]₁ = {proof_C}") print(f"[H·Z(τ)]₁ = {proof_HZ}") print(f"\nProof = 3 curve points (in real Groth16)") print(f"Size: constant regardless of circuit complexity!")

5. Pairing-Based Verification

The verifier checks: e([A]1,[B]2)=e([C]1+[HZ]1,G2)e([A]_1, [B]_2) = e([C]_1 + [H \cdot Z]_1, G_2)

Why does this work?

  • LHS: e(aG1,bG2)=e(G1,G2)abe(a \cdot G_1, b \cdot G_2) = e(G_1, G_2)^{ab}

  • RHS: e((c+hz)G1,G2)=e(G1,G2)c+hze((c + hz) \cdot G_1, G_2) = e(G_1, G_2)^{c + hz}

  • Equal when ab=c+hzab = c + hz, which is exactly the QAP equation!

# Pairing-based verification proof_A_ext = E_ext(proof_A) proof_C_ext = E_ext(proof_C) proof_HZ_ext = E_ext(proof_HZ) G1_ext_pt = E_ext(G1) # LHS: e([A]₁, [B]₂) lhs = proof_A_ext.weil_pairing(proof_B, n) # RHS: e([C]₁ + [HZ]₁, G₂) rhs_point = proof_C_ext + proof_HZ_ext rhs = rhs_point.weil_pairing(G2, n) print(f"=== Pairing Verification ===") print(f"LHS: e([A]₁, [B]₂) = {lhs}") print(f"RHS: e([C+HZ]₁, G₂) = {rhs}") print(f"Equal? {lhs == rhs}") print(f"\nThe verifier checked a·b = c + h·z using pairings,") print(f"without ever learning a={a_val}, b={b_val}, c={c_val}, hz={hz_val}!")
# Demonstrate soundness: wrong proof fails print("=== Wrong Proof ===") fake_a = ZZ(F_qap(6)) # wrong value fake_A = fake_a * G1 fake_A_ext = E_ext(fake_A) lhs_fake = fake_A_ext.weil_pairing(proof_B, n) print(f"Fake LHS: e([6·G1], [B]₂) = {lhs_fake}") print(f"Real RHS: e([C+HZ]₁, G₂) = {rhs}") print(f"Equal? {lhs_fake == rhs}") print(f"\nForged proof REJECTED!")

Checkpoint 2. The pairing translates the polynomial identity check into an elliptic curve equation. The verifier computes a constant number of pairings (3 in real Groth16), regardless of the circuit size. This is why Groth16 verification is so fast, it takes ~5ms even for circuits with millions of gates.

6. The Toxic Waste Problem

If an attacker knows τ\tau, they can forge proofs for any statement.

# With tau, attacker can forge proofs! print("=== Toxic Waste Attack ===") print(f"Attacker knows tau = {tau}") # Attacker wants to prove a*b = c + hz for arbitrary values # They can compute any [f(τ)]₁ directly fake_statement_a = ZZ(F_qap(11)) fake_statement_b = ZZ(F_qap(3)) fake_ab = ZZ(F_qap(fake_statement_a * fake_statement_b)) forged_A = fake_statement_a * G1 forged_B = fake_statement_b * G2 forged_HZ = fake_ab * G1 forged_C = E(0) # zero # Verify the forged proof forged_A_ext = E_ext(forged_A) forged_HZ_ext = E_ext(forged_HZ) lhs_forged = forged_A_ext.weil_pairing(forged_B, n) rhs_forged = forged_HZ_ext.weil_pairing(G2, n) print(f"Forged proof passes? {lhs_forged == rhs_forged}") print(f"\n⚠ This is why tau MUST be destroyed!") print(f"In practice, multi-party computation ensures no single party knows tau.")

7. Real Groth16: The Full Picture

The actual Groth16 protocol is more complex. Here's the high-level structure:

PhaseWhat HappensOutput
SetupGenerate CRS from τ,α,β,γ,δ\tau, \alpha, \beta, \gamma, \deltaProving key, Verification key
ProveProver computes π=([A]1,[B]2,[C]1)\pi = ([A]_1, [B]_2, [C]_1) using witness3 curve points
VerifyVerifier checks pairing equationAccept/Reject

Groth16 Pairing Equation (Simplified)

e([A]1,[B]2)=e([α]1,[β]2)e(publicsi[Li]1,[γ]2)e([C]1,[δ]2)e([A]_1, [B]_2) = e([\alpha]_1, [\beta]_2) \cdot e(\sum_{\text{public}} s_i [L_i]_1, [\gamma]_2) \cdot e([C]_1, [\delta]_2)

Where:

  • α,β,γ,δ\alpha, \beta, \gamma, \delta are additional toxic waste parameters

  • LiL_i encode the QAP polynomials for public inputs

  • The proof randomization uses blinding factors for zero-knowledge

Proof Size and Verification Cost

MetricGroth16
Proof size2 G1G_1 points + 1 G2G_2 point = 192 bytes (BLS12-381)
Verification3 pairings + 1 multi-scalar multiplication
Verification time~5 ms
Proving timeO(nlogn)O(n \log n) where nn = circuit size

8. SNARK Properties

Groth16 is a zk-SNARK, let's unpack the acronym:

LetterPropertyMeaning
zkZero-KnowledgeProof reveals nothing about the witness
SSuccinctProof is constant-size (192 bytes)
NNon-interactiveProver sends one message to verifier
ARARgumentComputationally sound (not information-theoretically)
KKnowledgeProver must "know" a valid witness (extractability)
# Size comparison print("=== Proof Size Comparison ===") for gates in [100, 10000, 1000000, 100000000]: witness_bytes = gates * 32 # each wire value is 32 bytes proof_bytes = 192 # always 192 bytes! print(f"{gates:>15,} | {witness_bytes:>12,} B | {proof_bytes} B") print(f"\nThe proof is ALWAYS 192 bytes, that's succinctness!") print(f"A 100M-gate circuit (ZK-EVM scale) still has a tiny proof.")

Checkpoint 3. Groth16 achieves the smallest proof size of any known SNARK (192 bytes on BLS12-381). The trade-off is the trusted setup, if the toxic waste is compromised, soundness breaks. This trade-off motivates transparent proof systems like STARKs (next notebooks).

9. Groth16 in the Wild

ApplicationWhat's provedCircuit size
ZcashTransaction validity (spend authority, value balance)~100K gates
FilecoinStorage proofs (PoRep)~150M gates
Tornado CashMerkle tree membership (deposit/withdrawal)~30K gates
zkSync (v1)Batch of transfer validity~100K gates

Crypto foreshadowing. Groth16 needs a circuit-specific trusted setup, changing the circuit requires a new ceremony. Universal SNARKs (PLONK, Marlin) require only a single "universal" setup. STARKs (next notebooks) eliminate the trusted setup entirely.

10. Exercises

Exercise 1 (Worked): CRS Polynomial Evaluation

Problem. Given the CRS, compute [f(τ)]1[f(\tau)]_1 for f(X)=X2+2X+3f(X) = X^2 + 2X + 3 without knowing τ\tau. Then verify by comparing with the direct computation.

Solution:

# Exercise 1: Worked solution f_ex1 = R(3 + 2*X + X^2) # 3 + 2X + X² print(f"f(X) = {f_ex1}") # Using CRS (without knowing tau) f_tau_crs = eval_poly_on_curve(f_ex1, CRS_G1, E(0)) # Direct (cheating, using tau for verification) f_tau_val = ZZ(f_ex1(F_qap(tau))) f_tau_direct = f_tau_val * G1 print(f"\nVia CRS: [f(τ)]₁ = {f_tau_crs}") print(f"Direct: [f(τ)]₁ = {f_tau_direct}") print(f"Match? {f_tau_crs == f_tau_direct}") print(f"\nf(τ) = f({tau}) = {f_tau_val} (verifier doesn't see this!)")

Exercise 2 (Guided): Pairing Verification

Problem. Create curve points [a]1[a]_1 and [b]2[b]_2 for a=4,b=9a = 4, b = 9. Verify that e([a]1,[b]2)=e([ab]1,G2)e([a]_1, [b]_2) = e([ab]_1, G_2) using the Weil pairing.

Fill in the TODOs:

# Exercise 2: fill in the TODOs # TODO 1: Create the curve points # pt_a = ZZ(4) * G1 # [a]₁ # pt_b = ZZ(9) * G2 # [b]₂ # pt_ab = ZZ(F_qap(4 * 9)) * G1 # [ab]₁ # TODO 2: Compute pairings # lhs_ex = E_ext(pt_a).weil_pairing(pt_b, n) # e([a]₁, [b]₂) # rhs_ex = E_ext(pt_ab).weil_pairing(G2, n) # e([ab]₁, G₂) # TODO 3: Compare # print(f"e([a]₁, [b]₂) = {lhs_ex}") # print(f"e([ab]₁, G₂) = {rhs_ex}") # print(f"Equal? {lhs_ex == rhs_ex}")

Exercise 3 (Independent): Forge with Toxic Waste

Problem.

  1. Suppose you know τ\tau. Create a valid-looking proof for the statement 7×8=47 \times 8 = 4 (which is false mod 13, since 56mod13=456 \bmod 13 = 4... wait, that's actually true!).

  2. Instead, prove 7×8=37 \times 8 = 3 (which IS false mod 13). Can you make the pairing check pass if you know τ\tau?

  3. Explain why not knowing τ\tau prevents this forgery.

# Exercise 3: write your solution here

Summary

ConceptKey Fact
Trusted setupGenerate CRS from toxic waste τ\tau; τ\tau must be destroyed
CRSCurve points [τi]1,[τi]2[\tau^i]_1, [\tau^i]_2, encode powers of τ\tau
Polynomial evaluation[f(τ)]1=fi[τi]1[f(\tau)]_1 = \sum f_i [\tau^i]_1, no need to know τ\tau
Pairing checke([A]1,[B]2)=e(G1,G2)A(τ)B(τ)e([A]_1, [B]_2) = e(G_1, G_2)^{A(\tau) \cdot B(\tau)}, multiplicative check
Proof size3 curve points = 192 bytes (constant!)
Toxic waste riskKnowing τ\tau allows forging any proof

Groth16 is the gold standard for compact zk-SNARKs: tiny proofs, fast verification, but requires a trusted setup. In the next notebook, we'll explore FRI, the key ingredient of STARKs, which eliminate the trusted setup entirely.


Next: 10e: FRI Protocol