GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/frontier/10-snarks-starks/sage/10f-starks-vs-snarks.ipynb
⁴⁸³ views

unlisted

Kernel: SageMath 10.5

Notebook 10f: STARKs vs SNARKs

Module 10: SNARKs and STARKs

Motivating Question. We've now seen two ways to prove computation: Groth16 (trusted setup, pairings, constant-size proofs) and FRI (transparent, hash-based, larger proofs). Which is "better"? The answer depends on what you value: proof size, trust assumptions, quantum resistance, or prover speed. This notebook puts them side by side so you can reason about the trade-offs yourself.

Prerequisites. You should be comfortable with:

The SNARK pipeline: circuit → R1CS → QAP → Groth16 proof (Notebooks 10a–10d)
The FRI protocol: folding, Merkle commitments, proximity testing (Notebook 10e)

Learning objectives. By the end of this notebook you will be able to:

Contrast the SNARK and STARK pipelines step by step.
Quantify the trade-offs in proof size, verification cost, and prover time.
Understand the trust model difference and why it matters.
Evaluate which proof system fits different real-world scenarios.
Know where the field is heading (universal SNARKs, folding schemes).

1. Two Pipelines, One Goal

Bridge from Notebook 10e. We've built both pipelines from the ground up. Now let's put them side by side.

Both SNARKs and STARKs prove the same thing: "I know a witness $w$ such that $C(x, w) = 0$ " where $C$ is an arithmetic circuit. The difference is how they prove it.

SNARK Pipeline (Groth16):                  STARK Pipeline:
┌──────────────────────┐                   ┌──────────────────────┐
│  Arithmetic Circuit  │                   │  Arithmetic Circuit  │
└──────────┬───────────┘                   └──────────┬───────────┘
           │ flatten                                  │ flatten
           ▼                                          ▼
┌──────────────────────┐                   ┌──────────────────────┐
│   R1CS (A, B, C)     │                   │  AIR Constraints     │
└──────────┬───────────┘                   └──────────┬───────────┘
           │ interpolate                              │ compose
           ▼                                          ▼
┌──────────────────────┐                   ┌──────────────────────┐
│   QAP polynomials    │                   │ Composition poly     │
└──────────┬───────────┘                   └──────────┬───────────┘
           │ evaluate on CRS                          │ evaluate on domain
           ▼                                          ▼
┌──────────────────────┐                   ┌──────────────────────┐
│  Pairing check       │                   │  FRI (hash-based)    │
│  (trusted setup)     │                   │  (transparent)       │
└──────────┬───────────┘                   └──────────┬───────────┘
           │                                          │
           ▼                                          ▼
     3 curve points                          Merkle roots + queries
     (192 bytes)                             (~50-200 KB)

In [ ]:

# Let's quantify these differences with concrete numbers
# We'll use our toy examples from earlier notebooks

# Setup fields
F_snark = GF(13)   # toy field for SNARK demo (from 10d)
F_stark = GF(257)  # toy field for STARK demo (from 10e)

print("=== Field Comparison ===")
print(f"SNARK field (toy): F_{13}, needs pairing-friendly curve")
print(f"STARK field (toy): F_257, just needs large 2-adic subgroup")
print()
print("In practice:")
print(f"  Groth16 uses BLS12-381: p ≈ 2^381 (must support pairings)")
print(f"  STARKs use Goldilocks: p = 2^64 - 2^32 + 1 (fast arithmetic, no pairings needed)")
print(f"  Or Baby Bear: p = 2^31 - 2^27 + 1 (even faster, used in RISC Zero)")

2. Trust Model: The Fundamental Divide

The deepest difference between SNARKs and STARKs is the trust model.

Aspect	SNARK (Groth16)	STARK
Setup	Trusted, toxic waste $\tau$ must be destroyed	Transparent, no secrets at all
What if setup is compromised?	Attacker can forge proofs for ANY statement	N/A, there is no setup
Mitigation	MPC ceremony (many participants)	Not needed
Setup scope	Per-circuit, new circuit needs new ceremony	N/A
Cryptographic assumption	Pairing-based (knowledge of exponent)	Collision-resistant hash function

In [ ]:

# Demonstrate the trust difference
# SNARK: the CRS contains hidden structure
# STARK: the "setup" is just choosing a hash function

import hashlib

print("=== SNARK Setup (Groth16) ===")
print("Secret: τ = 7 (TOXIC WASTE)")
print("CRS: [G₁, τG₁, τ²G₁, ...], structured reference string")
print("If τ leaks → anyone can forge proofs")
print("If circuit changes → need entirely new ceremony")
print()

print("=== STARK Setup ===")
print("Hash function: SHA-256 (or Poseidon for efficiency)")
print("That's it. No secrets. No ceremony.")
print("Any circuit can use the same hash function.")
print()

# The "S" in STARK stands for "Scalable" AND the proof system is "Transparent"
print("=== Acronym Breakdown ===")
print("SNARK: Succinct Non-interactive ARgument of Knowledge")
print("STARK: Scalable Transparent ARgument of Knowledge")
print()
print("Key difference in name: 'Succinct' (constant size) vs 'Scalable' (polylog size)")
print("                        vs 'Transparent' (no trusted setup)")

Checkpoint 1. The trust model difference is not just theoretical. Zcash held a real ceremony with 6 participants in 2016 (Sprout) and a much larger one with 87 participants for Sapling. If all participants colluded or were compromised, they could mint fake ZEC. STARKs avoid this risk entirely, but at the cost of larger proofs.

3. Proof Size: Constant vs Polylogarithmic

This is where Groth16 shines, the proof is always 192 bytes, regardless of circuit size. STARK proofs grow logarithmically.

In [ ]:


def groth16_proof_size(n_gates):
    """Groth16 proof size: always 192 bytes (2 G1 + 1 G2 on BLS12-381)."""
    return 192

def stark_proof_size(n_gates, blowup=4, n_queries=80, hash_size=32):
    """
    Approximate STARK proof size.
    - log(n) folding rounds, each with a Merkle root (32 bytes)
    - n_queries query paths, each with log(n) hashes per round
    """
    domain_size = n_gates * blowup
    n_rounds = max(1, int(log2(n_gates)))
    # Merkle roots for each round
    roots = n_rounds * hash_size
    # Query responses: each query needs a Merkle authentication path per round
    path_length = int(log2(domain_size))
    queries = n_queries * n_rounds * path_length * hash_size
    # Final polynomial + field elements
    final = 256  # approximate
    return roots + queries + final


for n in [2**10, 2**14, 2**18, 2**22, 2**26]:
    g16 = groth16_proof_size(n)
    stark = stark_proof_size(n)
    ratio = stark / g16
    
    def fmt_size(b):
        if b < 1024:
            return f"{b} B"
        elif b < 1024**2:
            return f"{b/1024:.1f} KB"
        else:
            return f"{b/1024**2:.1f} MB"
    
    print(f"{n:>15,} | {fmt_size(g16)} | {fmt_size(stark)} | {ratio:>7.0f}x")

print(f"\nGroth16 is ALWAYS 192 bytes.")
print(f"STARKs grow as O(log²(n)), bigger, but still small relative to the computation.")

Misconception alert. "STARK proofs are huge." In absolute terms, a ~100 KB proof is still tiny compared to the computation being proved (which might involve millions of operations). The real question is whether the proof fits cheaply on-chain, and here Groth16's 192 bytes is hard to beat.

4. Verification Time

Groth16 verification is dominated by 3 pairing computations, constant regardless of circuit size. STARK verification involves checking Merkle paths, which grows logarithmically.

In [ ]:

def groth16_verify_ops(n_gates, n_public):
    """Approximate Groth16 verification operations."""
    pairings = 3  # always 3 pairings
    msm = n_public  # multi-scalar multiplication for public inputs
    return {'pairings': pairings, 'msm_size': msm, 'total_heavy_ops': pairings}

def stark_verify_ops(n_gates, n_queries=80, blowup=4):
    """Approximate STARK verification operations."""
    domain = n_gates * blowup
    n_rounds = max(1, int(log2(n_gates)))
    path_length = int(log2(domain))
    hashes = n_queries * n_rounds * path_length  # Merkle path verification
    field_ops = n_queries * n_rounds  # folding checks
    return {'hashes': hashes, 'field_ops': field_ops, 'n_rounds': n_rounds}


for n in [2**10, 2**16, 2**20, 2**24]:
    g = groth16_verify_ops(n, n_public=5)
    s = stark_verify_ops(n)
    print(f"{n:>15,} | {g['pairings']} | {s['hashes']:>15,} | {s['n_rounds']}")

print(f"\nGroth16: always 3 pairings (~5 ms on modern hardware).")
print(f"STARK: more operations, but each is a fast hash (~1-2 ms per 1000 hashes).")
print(f"In practice: Groth16 verify ≈ 5 ms, STARK verify ≈ 10-50 ms.")

5. Prover Time: Where STARKs Win

While STARKs lose on proof size, they can win on prover time, especially for large circuits. The key reason: STARK arithmetic uses smaller fields (e.g., 64-bit Goldilocks vs 381-bit BLS12-381) and doesn't need expensive elliptic curve operations.

In [ ]:

def groth16_prover_ops(n_gates):
    """Approximate Groth16 prover cost (dominated by MSMs on curve)."""
    # Multi-scalar multiplications: O(n log n) curve operations
    # Each curve op is expensive (381-bit field, EC arithmetic)
    return n_gates * int(log2(max(n_gates, 2)))

def stark_prover_ops(n_gates, blowup=4):
    """Approximate STARK prover cost (dominated by FFTs + hashing)."""
    domain = n_gates * blowup
    # FFTs on small field + Merkle tree construction
    return domain * int(log2(max(domain, 2)))


for n in [2**10, 2**16, 2**20, 2**24]:
    g_ops = groth16_prover_ops(n)
    s_ops = stark_prover_ops(n)
    note = ""
    if s_ops > g_ops:
        note = f"But STARK ops are ~10x cheaper (64-bit vs 381-bit)"
    print(f"{n:>12,} | {g_ops:>15,} | {s_ops:>15,} | {note}")

print(f"\nGroth16 ops: EC scalar multiplications on BLS12-381 (expensive per-op)")
print(f"STARK ops: field multiplications + hashes on 64-bit field (cheap per-op)")
print(f"\nRule of thumb: STARKs have ~5-10x more ops, but each op is ~10-50x cheaper.")
print(f"Net result: STARK proving can be faster for large circuits.")

Checkpoint 2. The prover cost trade-off is subtle. Groth16 has fewer operations but each one involves 381-bit elliptic curve arithmetic. STARKs have more operations but use 64-bit field arithmetic and hashing. For circuits with > $2^{20}$ gates, STARKs often prove faster in wall-clock time.

6. Quantum Resistance

This is a critical forward-looking concern:

System	Broken by quantum computer?	Why?
Groth16	Yes	Relies on discrete log and pairing assumptions, broken by Shor's algorithm
STARKs	No	Relies only on collision-resistant hash functions, resistant to known quantum attacks

A sufficiently powerful quantum computer could:

Recover the discrete log, breaking the CRS
Forge Groth16 proofs at will

STARKs remain secure because:

Grover's algorithm only gives a quadratic speedup for hash collision finding
Doubling the hash output size restores security (e.g., SHA-256 → 128-bit post-quantum security)

In [ ]:

# Quantum security comparison
print("=== Post-Quantum Security ===")
print()
print("Groth16 (BLS12-381):")
print(f"  Classical security: ~128 bits")
print(f"  Quantum security:   ~0 bits (Shor's algorithm breaks DLP in polynomial time)")
print()
print("STARK (SHA-256):")
print(f"  Classical security: ~128 bits (collision resistance)")
print(f"  Quantum security:   ~85 bits (Grover reduces collision resistance by ~1/3)")
print(f"  → Use SHA-384 or SHA-512 for 128-bit post-quantum security")
print()
print("STARK (Poseidon hash, algebraic hash):")
print(f"  Designed for STARK-friendliness (fewer constraints in arithmetic circuit)")
print(f"  Still hash-based → still post-quantum secure")
print()
print("Timeline concern: 'Harvest now, decrypt later' attacks.")
print("Proofs generated today with Groth16 could be forged in the future.")
print("STARKs provide forward security against quantum adversaries.")

7. The Full Comparison

Here's the comprehensive side-by-side comparison:

In [ ]:

# Comprehensive comparison table
comparison = [
    ("Trust model",       "Trusted setup (CRS)",            "Transparent (no setup)"),
    ("Setup scope",       "Per-circuit",                     "Universal"),
    ("Crypto assumption", "Pairing + knowledge-of-exponent", "Collision-resistant hash"),
    ("Quantum safe?",     "No",                              "Yes"),
    ("Proof size",        "192 B (constant)",                "~50-200 KB (O(log²n))"),
    ("Verify time",       "~5 ms (3 pairings)",              "~10-50 ms (hash checks)"),
    ("Prover time",       "Slower (big-field EC ops)",       "Faster (small-field FFTs)"),
    ("Recursion",         "Efficient (small proof)",         "Harder (large proof to verify in-circuit)"),
    ("On-chain cost",     "Very cheap (~200K gas)",          "More expensive (~1M gas)"),
    ("Maturity",          "Mature (Zcash since 2016)",       "Newer (StarkNet 2021+)"),
]

for prop, snark, stark in comparison:

8. Real-World Deployments

Let's look at who uses what and why.

In [ ]:

deployments = [
    ("Zcash",        "Groth16",   "Privacy coin",       "192 B proofs critical for on-chain space"),
    ("Filecoin",     "Groth16",   "Storage proofs",     "~150M gate circuits, compact proofs on-chain"),
    ("StarkNet",     "STARK",     "ZK-rollup",          "No trusted setup, quantum resistance"),
    ("RISC Zero",    "STARK",     "General ZK-VM",      "Proves arbitrary RISC-V execution"),
    ("zkSync Era",   "SNARK+STARK", "ZK-rollup",       "STARK proves execution, SNARK wraps for on-chain"),
    ("Polygon zkEVM","SNARK",     "ZK-rollup",          "PLONK-based (universal SNARK, one-time setup)"),
    ("Scroll",       "SNARK",     "ZK-rollup",          "KZG-based polynomial commitment"),
    ("Mina",         "SNARK",     "Succinct blockchain", "Recursive SNARKs keep chain at 22 KB"),
]

for project, system, use_case, reason in deployments:

print(f"\nNotice: zkSync uses BOTH, STARK for proving, then wraps in a SNARK for on-chain verification.")
print(f"This hybrid approach gets the best of both worlds.")

Checkpoint 3. The "STARK proves, SNARK verifies" hybrid is increasingly popular. The prover gets STARK's fast proving and transparency, while the on-chain verifier gets SNARK's compact proofs. This is sometimes called a "STARK-to-SNARK wrapper" or "proof composition."

9. Beyond Groth16: Universal SNARKs

Groth16's per-circuit trusted setup is a major limitation. Universal SNARKs like PLONK and Marlin need only a single setup ceremony that works for any circuit up to a given size.

Property	Groth16	PLONK	Marlin	STARK
Setup	Per-circuit	Universal	Universal	None
Proof size	192 B	~400 B	~800 B	~100 KB
Verification	3 pairings	~10 pairings	~10 pairings	Hash-based
Prover	MSM	FFT + MSM	FFT + MSM	FFT + Hash
Quantum safe	No	No	No	Yes

In [ ]:

# Proof size spectrum visualization (text-based)
systems = [
    ("Groth16",  192,      "●"),
    ("PLONK",    400,      "●"),
    ("Marlin",   800,      "●"),
    ("Halo2",    5000,     "●"),
    ("STARK",    100000,   "●"),
]

print("=== Proof Size Spectrum ===")
print("(log scale, approximate)")
print()

max_log = log2(200000)
bar_width = 50

for name, size, marker in systems:
    log_size = log2(size)
    bar_pos = int(bar_width * log_size / max_log)
    bar = " " * bar_pos + marker
    trusted = "(trusted)" if name in ["Groth16", "PLONK", "Marlin"] else "(transparent)"
    
    if size < 1024:
        size_str = f"{size} B"
    else:
        size_str = f"{size/1024:.0f} KB"
    
    print(f"{name} |{bar} {size_str} {trusted}")

print(f" |{'─' * bar_width}")
print("  Small                              Large")print()
print("Trend: there's a spectrum from tiny+trusted to larger+transparent.")
print("The field is converging toward efficient transparent systems.")

10. The Frontier: Where the Field is Heading

The SNARK/STARK divide is narrowing. Key trends:

Folding schemes (Nova, Supernova, HyperNova):

Incrementally verify computation without full proof at each step
Very efficient for iterative/recursive computations
Can work with both SNARK and STARK backends

Lookup arguments (Plookup, LogUp):

Dramatically reduce constraint count for operations like range checks, hash functions
Used in both PLONK-based SNARKs and STARKs

Circle STARKs (Stwo):

Use the circle group instead of multiplicative subgroups
Work over Mersenne primes (M31 = $2^{31} - 1$ ) for ultra-fast arithmetic
Even faster proving with comparable proof sizes

Proof aggregation and recursion:

Combine many proofs into one
STARK → SNARK wrappers for on-chain efficiency
Recursive verification enables constant-size blockchain (Mina)

Crypto foreshadowing. Module 11 (Homomorphic Encryption) and Module 12 (MPC) explore the other pillars of advanced cryptography. Together with ZK proofs, these form the foundation of programmable privacy: systems where computation is verifiable, encrypted, and distributed.

In [ ]:

# Decision tree for choosing a proof system
print("=== Which Proof System Should You Use? ===")
print()
print("Start here:")
print("")
print("  Need post-quantum security?")
print("  ├── YES → Use STARK")
print("  └── NO")
print("       │")
print("       Need smallest possible proof?")
print("       ├── YES → Groth16 (if you accept per-circuit setup)")
print("       │         PLONK  (if you want universal setup)")
print("       └── NO")
print("            │")
print("            Need fast proving?")
print("            ├── YES → STARK (small-field FFTs)")
print("            └── NO")
print("                 │")
print("                 Willing to accept trusted setup?")
print("                 ├── YES → PLONK (good all-around)")
print("                 └── NO  → STARK")
print()
print("In practice, many projects use hybrid: STARK prove + SNARK verify on-chain.")

11. Exercises

Exercise 1 (Worked): Compute Verification Cost

Problem. For a circuit with $2^{20}$ gates, compute the approximate number of verification operations for both Groth16 and STARKs. Which is faster?

Solution:

In [ ]:

# Exercise 1: Worked solution
n = 2^20
print(f"Circuit size: 2^20 = {n:,} gates")

# Groth16
g = groth16_verify_ops(n, n_public=10)
print(f"\nGroth16 verification:")
print(f"  Pairings: {g['pairings']}")
print(f"  Public input MSM: {g['msm_size']} point multiplications")
print(f"  Total: 3 pairings ≈ 5 ms")

# STARK
s = stark_verify_ops(n, n_queries=80)
print(f"\nSTARK verification:")
print(f"  Hash operations: {s['hashes']:,}")
print(f"  Field operations: {s['field_ops']:,}")
print(f"  Folding rounds: {s['n_rounds']}")
print(f"  Estimate: ~{s['hashes'] // 1000}K hashes ≈ {s['hashes'] // 100000 + 5} ms")

print(f"\nConclusion: Groth16 is ~5-10x faster to verify.")
print(f"But both are fast enough for practical use (<100 ms).")

Exercise 2 (Guided): Proof Size Break-Even

Problem. At what circuit size does a STARK proof exceed 1 MB? How does this compare to the Groth16 proof at the same circuit size?

Fill in the TODOs:

In [ ]:

# Exercise 2: fill in the TODOs

# TODO 1: Search for the circuit size where STARK proof exceeds 1 MB
# target = 1024 * 1024  # 1 MB in bytes
# for exp in range(10, 40):
#     n = 2^exp
#     size = stark_proof_size(n)
#     if size > target:
#         print(f"STARK proof exceeds 1 MB at 2^{exp} = {n:,} gates")
#         print(f"  Proof size: {size / 1024:.1f} KB")
#         break

# TODO 2: Compare with Groth16 at that circuit size
# g_size = groth16_proof_size(n)
# print(f"  Groth16 at same circuit size: {g_size} bytes (still constant!)")

# TODO 3: Compute the ratio
# print(f"  Ratio: STARK/Groth16 = {size / g_size:.0f}x")

Exercise 3 (Independent): Design a Proof System

Problem. You're building a ZK application. For each scenario, argue which proof system you'd choose and why:

A privacy coin where every transaction needs an on-chain proof and gas costs matter.
A government voting system that must remain secure against future quantum computers.
A ZK-rollup processing 10,000 transactions per batch where proving speed is the bottleneck.
A recursive blockchain (like Mina) where each block verifies the previous block's proof.

For each, state your choice (Groth16 / PLONK / STARK / hybrid) and justify it with specific metrics from this notebook.

In [ ]:

# Exercise 3: write your analysis here

Summary

Dimension	SNARK (Groth16)	STARK
Best at	Smallest proofs, fastest verification	No trust, quantum resistance, fast proving
Worst at	Trusted setup, quantum vulnerability	Large proofs, on-chain cost
Ideal for	On-chain verification, recursion	Computation integrity, long-term security
Real use	Zcash, Filecoin, Mina	StarkNet, RISC Zero
Hybrid	\multicolumn{2}{c}{STARK prove + SNARK verify (zkSync)}

Key takeaways:

There is no single "best" proof system, the right choice depends on your constraints.
The field is converging: universal SNARKs, STARK-SNARK hybrids, and folding schemes are blurring the lines.
Quantum resistance will become increasingly important as quantum computers mature.
Both SNARKs and STARKs rest on the same foundation: encoding computation as polynomial equations and proving those equations hold.

This completes Module 10. You've gone from arithmetic circuits all the way to understanding the two major families of zero-knowledge proofs.

Next module: Module 11: Homomorphic Encryption