Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/frontier/09-commitments-sigma-protocols/sage/09c-sigma-protocols-intuition.ipynb
483 views
unlisted
Kernel: SageMath 10.5

Sigma Protocols: The 3-Move Proof

Module 09 | 09-commitments-sigma-protocols

Commit-challenge-response, completeness, special soundness, honest-verifier zero-knowledge

The Impossible Question

Can you PROVE you know a password without ever typing it? Without revealing ANY information about it?

This sounds impossible. If you prove something, surely the proof itself leaks information? But sigma protocols do exactly this, they let a prover convince a verifier of knowledge while revealing nothing beyond the truth of the statement.

This notebook builds the intuition for how such a proof works. By the end, you will understand the elegant 3-move dance that makes zero-knowledge proofs possible, and you will run one yourself in SageMath.

Objectives

By the end of this notebook you will be able to:

  1. Describe the 3-move structure (commit, challenge, response) of a sigma protocol.

  2. State and demonstrate the three defining properties: completeness, special soundness, and honest-verifier zero-knowledge (HVZK).

  3. Execute a concrete sigma protocol (prove knowledge of a discrete log) in SageMath.

  4. Construct a simulator that creates fake transcripts indistinguishable from real ones.

  5. Extract a witness from two accepting transcripts with the same commitment but different challenges.

Prerequisites

  • Completion of Pedersen Commitments.

  • Familiarity with cyclic groups and discrete logarithms (Modules 05-06).


Bridge from 09b: In the Pedersen commitment notebook, you saw how to commit to a value mm using C=gmhrC = g^m h^r, hiding mm perfectly while binding you to it. Sigma protocols use a very similar idea in their first move: the prover sends a commitment to some randomness. The key insight is that this commitment structure, combined with a challenge-response mechanism, can prove knowledge of a secret without ever revealing it. The Pedersen commitment you already understand is, in fact, the algebraic engine inside many sigma protocols.

The Ali Baba Cave: An Analogy

Before we touch any math, let us build intuition with a famous analogy.

Imagine a cave shaped like a ring, with a single entrance. Deep inside, the ring splits into two paths (left and right) that reconnect at a locked door. Only someone who knows the magic word can open the door and pass through.

Entrance | / \ Left / \ Right | | | [D] | D = locked door | | \ / \ /

The protocol:

  1. Commit: The prover (Peggy) walks into the cave and randomly chooses the left or right path. The verifier (Victor) waits outside and does not see which path she took.

  2. Challenge: Victor shouts a random demand: "Come out from the LEFT" or "Come out from the RIGHT."

  3. Response: If Peggy knows the magic word, she can always comply, if she is on the wrong side, she opens the door and walks through.

Why this works:

  • Completeness: If Peggy knows the word, she succeeds every time.

  • Soundness: If Peggy does NOT know the word, she can only comply if she happened to choose the right path, a 50% chance. After nn rounds, a fake prover succeeds with probability 1/2n1/2^n, which is negligible.

  • Zero-knowledge: Victor learns nothing! A skeptical friend watching a VIDEO of the protocol cannot distinguish it from a staged video where Peggy and Victor colluded (Victor told Peggy the challenge in advance). This is the simulation argument, if you can fake it, it reveals nothing.

Key idea: Zero-knowledge does not mean the proof is somehow "weak." It means the verifier gains NO information beyond the single bit: "Peggy knows the secret." Not a partial clue. Not a single bit of the secret itself. Nothing.

Common Misconception: "Zero-knowledge means the verifier cannot compute the secret."

It is much stronger than that. Zero-knowledge means the verifier cannot learn anything at all, not even a single bit of information about the secret. The verifier's view of the interaction is identical to something it could have generated entirely on its own, without ever talking to the prover. This is a profound guarantee: the interaction is literally worthless to the verifier as a source of information.

The 3-Move Structure

A sigma protocol for a relation RR is an interactive proof with exactly three messages:

StepDirectionNameContent
1Prover \to VerifierCommitment (aa)Prover commits to fresh randomness
2Verifier \to ProverChallenge (ee)Verifier sends a random challenge
3Prover \to VerifierResponse (zz)Prover responds using witness + challenge

The verifier then runs a verification equation on (a,e,z)(a, e, z) and accepts or rejects.

Why "sigma"? The communication pattern, down, back up diagonally, then down again, looks like the Greek letter Σ\Sigma (sigma) when drawn as a message sequence diagram:

Prover Verifier |--- a (commit) --->| \ | | > looks like Σ |<-- e (challenge) -| / | | / |--- z (response) ->| /

A Concrete Sigma Protocol: Proof of Knowledge of Discrete Log

Let us make this concrete. Consider the most fundamental sigma protocol:

Statement: "I know a secret xx such that y=gxy = g^x in a cyclic group."

The prover wants to convince the verifier that they know xx (the witness) without revealing xx.

Protocol:

  1. Commit: Prover picks random kk, sends R=gkR = g^k

  2. Challenge: Verifier sends random cc

  3. Response: Prover computes s=k+cx(modq)s = k + c \cdot x \pmod{q} and sends ss

Verification: Verifier checks that gs=Rycg^s = R \cdot y^c

Why does this equation hold? Expand the right side: Ryc=gk(gx)c=gkgxc=gk+cx=gsR \cdot y^c = g^k \cdot (g^x)^c = g^k \cdot g^{xc} = g^{k + cx} = g^s \checkmark

Let us implement this step by step.

# ── Setup: a cyclic group of prime order ────────────────────────────── # We work in Z_p^* with a prime-order subgroup of order q. # For pedagogical clarity, we use a small safe prime: p = 2q + 1. # Find a safe prime: p = 2q + 1 where both p and q are prime q = next_prime(10^15) # A 15-digit prime (subgroup order) p = 2 * q + 1 while not is_prime(p): q = next_prime(q + 1) p = 2 * q + 1 print(f"q = {q} (subgroup order, {len(str(q))} digits)") print(f"p = {p} (field prime, {len(str(p))} digits)") # Find a generator of the order-q subgroup # Any element h with h^q = 1 and h != 1 generates the subgroup h_candidate = Mod(2, p) g = h_candidate^((p - 1) // q) # Map into the subgroup assert g != 1, "Bad generator" assert g^q == 1, "g does not have order q" print(f"g = {g} (generator of order-q subgroup)") print(f"\nSanity check: g^q mod p = {g^q} (should be 1)")
# ── The Prover's secret ─────────────────────────────────────────────── # The prover knows x; the public value is y = g^x. x = randint(2, int(q) - 1) # Prover's SECRET witness y = g^x # PUBLIC statement print(f"Secret x = {x} (only the prover knows this!)") print(f"Public y = g^x = {y}") print(f"\nThe prover wants to prove: 'I know x such that y = g^x'") print(f"WITHOUT revealing x.")
# ── The 3-Move Protocol ────────────────────────────────────────────── # Step 1: COMMIT. Prover picks random k, sends R = g^k k = randint(2, int(q) - 1) # Random nonce (ephemeral secret) R = g^k # Commitment print("Step 1 (Prover -> Verifier):") print(f" Prover picks random k and sends R = g^k = {R}") print() # Step 2: CHALLENGE. Verifier sends random c c = randint(1, int(q) - 1) # Random challenge print("Step 2 (Verifier -> Prover):") print(f" Verifier sends random challenge c = {c}") print() # Step 3: RESPONSE. Prover computes s = k + c*x mod q s = Mod(k + c * x, q) # Response print("Step 3 (Prover -> Verifier):") print(f" Prover computes s = k + c*x mod q = {s}") print() # ── Verification ───────────────────────────────────────────────────── lhs = g^Integer(s) # g^s rhs = R * y^c # R * y^c print("Verification:") print(f" g^s = {lhs}") print(f" R * y^c = {rhs}") print(f" Equal? {lhs == rhs}") assert lhs == rhs, "Proof failed!" print("\nThe proof ACCEPTS. The verifier is convinced.")

Checkpoint: Verify by Hand

Before reading further, convince yourself why the verification works:

gs=gk+cx=gkgcx=gk(gx)c=Rycg^s = g^{k + cx} = g^k \cdot g^{cx} = g^k \cdot (g^x)^c = R \cdot y^c

Question: What happens if the prover uses s=k+cx+1s = k + cx + 1 instead? Would the proof still verify? Why or why not?

Click for answer

No! Then gs=gk+cx+1=Rycgg^s = g^{k+cx+1} = R \cdot y^c \cdot g, which is NOT equal to RycR \cdot y^c (since g1g \neq 1). The verification equation would fail. The response ss must be computed exactly as k+cxmodqk + cx \bmod q.

Property 1: Completeness

Completeness: An honest prover (who knows xx) can ALWAYS convince an honest verifier.

This is the most basic requirement, if the protocol is correct, it should actually work. We already verified this for one run above. Let us confirm it works for many runs.

# ── Completeness: Run the protocol 1000 times ──────────────────────── # An honest prover should succeed EVERY time. num_trials = 1000 all_passed = True for trial in range(num_trials): # Prover commits k_i = randint(2, int(q) - 1) R_i = g^k_i # Verifier challenges c_i = randint(1, int(q) - 1) # Prover responds s_i = Mod(k_i + c_i * x, q) # Verify if g^Integer(s_i) != R_i * y^c_i: all_passed = False print(f"FAILURE at trial {trial}!") break if all_passed: print(f"Completeness verified: {num_trials}/{num_trials} trials passed.") print("An honest prover ALWAYS convinces an honest verifier.")

Property 2: Special Soundness

Special soundness: Given TWO accepting transcripts (R,c1,s1)(R, c_1, s_1) and (R,c2,s2)(R, c_2, s_2) with the same commitment RR but different challenges c1c2c_1 \neq c_2, anyone can extract the witness xx.

This is the proof of knowledge property. It means the prover must actually "know" xx, they cannot bluff, because if they could answer two different challenges for the same commitment, the secret can be extracted.

The extraction: From the two transcripts:

  • s1=k+c1x(modq)s_1 = k + c_1 \cdot x \pmod{q}

  • s2=k+c2x(modq)s_2 = k + c_2 \cdot x \pmod{q}

Subtract: s1s2=(c1c2)x(modq)s_1 - s_2 = (c_1 - c_2) \cdot x \pmod{q}

Therefore: x=s1s2c1c2(modq)x = \frac{s_1 - s_2}{c_1 - c_2} \pmod{q}

Since qq is prime, c1c2≢0(modq)c_1 - c_2 \not\equiv 0 \pmod{q} has an inverse, so we can always extract xx.

# ── Special Soundness: Extract the witness from two transcripts ────── # The prover uses the SAME commitment R (same k) but faces two challenges k_sound = randint(2, int(q) - 1) R_sound = g^k_sound # Transcript 1: challenge c1 c1 = randint(1, int(q) - 1) s1 = Mod(k_sound + c1 * x, q) print(f"Transcript 1: (R, c1={c1}, s1={s1})") print(f" Verify: g^s1 == R * y^c1 ? {g^Integer(s1) == R_sound * y^c1}") # Transcript 2: same R, different challenge c2 c2 = randint(1, int(q) - 1) while c2 == c1: c2 = randint(1, int(q) - 1) s2 = Mod(k_sound + c2 * x, q) print(f"Transcript 2: (R, c2={c2}, s2={s2})") print(f" Verify: g^s2 == R * y^c2 ? {g^Integer(s2) == R_sound * y^c2}") # ── EXTRACT the secret! ────────────────────────────────────────────── x_extracted = Mod(Integer(s1) - Integer(s2), q) / Mod(c1 - c2, q) print(f"\n--- Extraction ---") print(f"x_extracted = (s1 - s2) / (c1 - c2) mod q") print(f"x_extracted = {x_extracted}") print(f"x_actual = {x}") print(f"Match? {Integer(x_extracted) == x}") assert Integer(x_extracted) == x, "Extraction failed!" print("\nThe witness has been EXTRACTED from two transcripts!") print("This proves that anyone who can answer two challenges MUST know x.")

Why Special Soundness Implies Security

Think about what this means for a cheating prover (who does NOT know xx):

  1. A cheater commits to RR.

  2. The verifier sends a random cc.

  3. If the cheater could produce valid responses for TWO different challenges (same RR), we could extract xx. But the cheater does not know xx!

  4. Therefore, the cheater can respond correctly to at most one challenge value.

  5. Since the challenge is random, the cheater succeeds with probability at most 1/q1/q, negligible.

The critical point: the challenge must come AFTER the commitment. If the cheater sees cc before committing, they can cheat (as we will see in the simulator).

Property 3: Honest-Verifier Zero-Knowledge (HVZK)

HVZK: There exists a simulator that, WITHOUT knowing the witness xx, can produce transcripts (R,c,s)(R, c, s) that are statistically indistinguishable from real protocol transcripts.

This is the most subtle and profound property. Let us understand it through the simulation argument.

The Simulation Argument

Zero-knowledge means: "The verifier learns NOTHING because they could have generated the transcript themselves."

How? The simulator works backwards:

  1. Pick a random challenge cc and random response ss.

  2. Compute the commitment as R=gsycR = g^s \cdot y^{-c}.

  3. Output the transcript (R,c,s)(R, c, s).

Check: Does this transcript verify? gs=?Ryc=(gsyc)yc=gsg^s \stackrel{?}{=} R \cdot y^c = (g^s \cdot y^{-c}) \cdot y^c = g^s \checkmark

Yes! The simulated transcript passes verification, and the simulator never knew xx!

# ── The Simulator: Fake transcripts WITHOUT knowing x ──────────────── def simulate_transcript(g, y, q): """ Produce a valid-looking transcript (R, c, s) WITHOUT knowing x. The trick: choose c and s first, then compute R backwards. """ # Step 1: Pick random c and s c_sim = randint(1, int(q) - 1) s_sim = randint(1, int(q) - 1) # Step 2: Compute R = g^s * y^(-c) (work backwards!) R_sim = g^s_sim * y^(-c_sim) return R_sim, c_sim, s_sim # Generate a simulated transcript R_fake, c_fake, s_fake = simulate_transcript(g, y, q) # Verify it! lhs_fake = g^s_fake rhs_fake = R_fake * y^c_fake print("Simulated transcript (NO knowledge of x was used!):") print(f" R = {R_fake}") print(f" c = {c_fake}") print(f" s = {s_fake}") print(f"\n Verification: g^s == R * y^c ? {lhs_fake == rhs_fake}") assert lhs_fake == rhs_fake print("\nThe FAKE transcript passes verification!") print("The simulator never knew x, yet produced a valid proof.")

Are Simulated Transcripts Really Indistinguishable?

Let us generate many real and simulated transcripts and compare their distributions. In a real transcript:

  • cc is uniform random in Zq\mathbb{Z}_q

  • s=k+cxs = k + cx where kk is uniform random, so ss is also uniform random in Zq\mathbb{Z}_q

  • R=gkR = g^k is a uniform random group element

In a simulated transcript:

  • cc is uniform random in Zq\mathbb{Z}_q (chosen directly)

  • ss is uniform random in Zq\mathbb{Z}_q (chosen directly)

  • R=gsycR = g^s y^{-c} is determined by cc and ss

Both distributions are identical: in both cases, (R,c,s)(R, c, s) is a uniformly random accepting transcript. No algorithm can tell them apart, not even with unlimited computational power.

# ── Compare real vs simulated transcripts ───────────────────────────── # We collect the challenge and response values (mod a small number) # to visualize their distributions. num_samples = 2000 bucket_mod = 50 # Reduce to small range for histogram comparison real_c_buckets = [0] * bucket_mod real_s_buckets = [0] * bucket_mod sim_c_buckets = [0] * bucket_mod sim_s_buckets = [0] * bucket_mod for _ in range(num_samples): # Real transcript k_r = randint(2, int(q) - 1) c_r = randint(1, int(q) - 1) s_r = Integer(Mod(k_r + c_r * x, q)) real_c_buckets[c_r % bucket_mod] += 1 real_s_buckets[s_r % bucket_mod] += 1 # Simulated transcript _, c_s, s_s = simulate_transcript(g, y, q) sim_c_buckets[c_s % bucket_mod] += 1 sim_s_buckets[s_s % bucket_mod] += 1 # Display comparison expected_per_bucket = num_samples / bucket_mod real_c_deviation = max(abs(b - expected_per_bucket) for b in real_c_buckets) sim_c_deviation = max(abs(b - expected_per_bucket) for b in sim_c_buckets) real_s_deviation = max(abs(b - expected_per_bucket) for b in real_s_buckets) sim_s_deviation = max(abs(b - expected_per_bucket) for b in sim_s_buckets) print(f"Distribution comparison ({num_samples} samples, {bucket_mod} buckets):") print(f"Expected count per bucket: {expected_per_bucket:.1f}") print(f"") print(f"Challenge values (c mod {bucket_mod}):") print(f" Real max deviation: {real_c_deviation:.1f}") print(f" Simulated max deviation: {sim_c_deviation:.1f}") print(f"") print(f"Response values (s mod {bucket_mod}):") print(f" Real max deviation: {real_s_deviation:.1f}") print(f" Simulated max deviation: {sim_s_deviation:.1f}") print(f"") print("Both distributions are statistically indistinguishable.") print("This is WHY the protocol is zero-knowledge: the verifier's view") print("could have been generated without the prover's help.")

The Profound Implication

Think about what we just showed:

  1. The verifier interacts with the prover and gets a transcript (R,c,s)(R, c, s).

  2. But the verifier could have generated an identically distributed transcript by itself, using the simulator.

  3. Therefore, the transcript contains zero information that the verifier did not already have.

  4. The verifier learns exactly one thing: the statement is true (the prover knows xx).

This is why it is called zero-knowledge: the knowledge transferred is zero. The verifier is convinced, but gains nothing it could use to (for example) convince someone else, or learn even one bit of xx.

Note: We proved honest-verifier zero-knowledge (HVZK), meaning the verifier follows the protocol (picks cc randomly). Full zero-knowledge, which handles malicious verifiers, requires additional techniques (see notebook 09e on the Fiat-Shamir transform).

What Happens When a Cheater Tries?

Let us see what happens when someone who does NOT know xx tries to run the protocol.

# ── A cheating prover who does NOT know x ───────────────────────────── # Strategy 1: The cheater picks random R and random s, hopes for the best. num_cheat_attempts = 1000 cheat_successes = 0 for _ in range(num_cheat_attempts): # Cheater picks random R (doesn't know k such that R = g^k, # well, they do know k, but they don't know x) k_cheat = randint(2, int(q) - 1) R_cheat = g^k_cheat # Verifier sends challenge c_cheat = randint(1, int(q) - 1) # Cheater must produce s such that g^s = R * y^c # Without x, they can only guess, pick random s s_cheat = randint(1, int(q) - 1) if g^s_cheat == R_cheat * y^c_cheat: cheat_successes += 1 print(f"Cheater success rate: {cheat_successes}/{num_cheat_attempts}") print(f"Expected: ~0 (probability 1/q = 1/{q} per attempt)") print(f"\nA cheater who doesn't know x cannot produce valid proofs!")

Checkpoint: The Cheater's Dilemma

Question: Could a cheater succeed by choosing ss BEFORE the challenge cc arrives?

Think about it: the cheater knows gg, yy, RR (which they chose), and ss (which they chose). They need gs=Rycg^s = R \cdot y^c. But they do not know cc yet!

  • If they fix ss first, then gsg^s and RR are fixed, so yc=gs/Ry^c = g^s / R is fixed. This pins down exactly one value of cc. With probability 1/q1/q, the verifier happens to pick that exact cc.

  • Alternatively, the cheater could pick cc first and compute a valid (R,s)(R, s) pair. But then they would need the verifier to send exactly that cc, again probability 1/q1/q.

Conclusion: No matter what strategy the cheater uses, they succeed with probability at most 1/q1/q.

The Three Properties Together

Let us summarize the three properties of our sigma protocol:

PropertyWhat it saysWhat we showed
CompletenessHonest prover always convinces honest verifier1000/1000 trials passed
Special SoundnessTwo transcripts (same RR, different cc) reveal xxExtracted xx from two transcripts
HVZKSimulator produces indistinguishable transcripts without xxSimulated transcripts pass verification

These three properties together give us something remarkable: a proof system where:

  • Honest provers always succeed (completeness)

  • Cheating provers always fail (soundness)

  • The verifier learns nothing beyond the truth of the statement (zero-knowledge)

Putting It All Together: Reusable Functions

Let us package the protocol into clean, reusable functions.

# ── Clean implementation of the sigma protocol ──────────────────────── def sigma_prove(g, x, q): """Prover's side: commit and prepare to respond. Returns: (R, k) where R is the commitment and k is the nonce.""" k = randint(2, int(q) - 1) R = g^k return R, k def sigma_challenge(q): """Verifier's side: produce a random challenge.""" return randint(1, int(q) - 1) def sigma_respond(k, c, x, q): """Prover's side: compute the response.""" return Integer(Mod(k + c * x, q)) def sigma_verify(g, y, R, c, s): """Verifier's side: check the proof.""" return g^s == R * y^c def sigma_simulate(g, y, q): """Simulator: produce a valid transcript WITHOUT knowing x.""" c = randint(1, int(q) - 1) s = randint(1, int(q) - 1) R = g^s * y^(-c) return R, c, s def sigma_extract(s1, s2, c1, c2, q): """Extract witness from two accepting transcripts.""" return Integer(Mod(s1 - s2, q) / Mod(c1 - c2, q)) # ── Demo ────────────────────────────────────────────────────────────── print("=== Full Protocol Run ===") R_demo, k_demo = sigma_prove(g, x, q) c_demo = sigma_challenge(q) s_demo = sigma_respond(k_demo, c_demo, x, q) result = sigma_verify(g, y, R_demo, c_demo, s_demo) print(f"Proof accepted: {result}") print("\n=== Simulation ===") R_sim, c_sim, s_sim = sigma_simulate(g, y, q) result_sim = sigma_verify(g, y, R_sim, c_sim, s_sim) print(f"Simulated proof accepted: {result_sim}") print("\n=== Extraction ===") # Two proofs with same k (same R), different challenges R_ext, k_ext = sigma_prove(g, x, q) c_ext1, c_ext2 = sigma_challenge(q), sigma_challenge(q) while c_ext1 == c_ext2: c_ext2 = sigma_challenge(q) s_ext1 = sigma_respond(k_ext, c_ext1, x, q) s_ext2 = sigma_respond(k_ext, c_ext2, x, q) x_ext = sigma_extract(s_ext1, s_ext2, c_ext1, c_ext2, q) print(f"Extracted x: {x_ext}") print(f"Actual x: {x}") print(f"Match: {x_ext == x}")

Crypto Foreshadowing: The protocol we just built is essentially Schnorr's protocol, the most important sigma protocol in cryptography. In the next notebook (09d), we will see Schnorr's protocol in full detail, including its use as a digital signature scheme. Schnorr signatures are used in:

  • Bitcoin's Taproot upgrade (BIP 340), the biggest protocol change in Bitcoin's history

  • Ed25519, the most widely deployed signature scheme on the internet

  • Every modern zero-knowledge proof system builds on sigma protocols as a foundation

The Fiat-Shamir transform (notebook 09e) will show how to make sigma protocols non-interactive, turning our 3-message conversation into a single message that anyone can verify.


Exercises

Exercise 1: Verify a Given Transcript (Fully Worked)

Problem: You are given the following transcript from a sigma protocol. Verify whether the proof is valid.

  • Public parameters: group with generator gg and order qq (as set up above)

  • Public statement: y=gxy = g^x (as set up above)

  • Transcript: (R,c,s)(R, c, s) generated by a real prover

# ── Exercise 1: Fully Worked ───────────────────────────────────────── # # Generate a transcript for the student to verify: k_ex1 = randint(2, int(q) - 1) R_ex1 = g^k_ex1 c_ex1 = randint(1, int(q) - 1) s_ex1 = Integer(Mod(k_ex1 + c_ex1 * x, q)) print("Given transcript:") print(f" R = {R_ex1}") print(f" c = {c_ex1}") print(f" s = {s_ex1}") print(f" y = {y} (public key)") print(f" g = {g} (generator)") print() # SOLUTION: # Step 1: Compute the left-hand side: g^s lhs_ex1 = g^s_ex1 print(f"Step 1: Compute g^s = {lhs_ex1}") # Step 2: Compute the right-hand side: R * y^c rhs_ex1 = R_ex1 * y^c_ex1 print(f"Step 2: Compute R * y^c = {rhs_ex1}") # Step 3: Compare print(f"Step 3: g^s == R * y^c ? {lhs_ex1 == rhs_ex1}") print() if lhs_ex1 == rhs_ex1: print("ACCEPT: The proof is valid.") else: print("REJECT: The proof is invalid.") # Step 4: Now verify a TAMPERED transcript (change s by 1) print("\n--- Now try a tampered transcript (s+1 instead of s) ---") s_tampered = s_ex1 + 1 lhs_tampered = g^s_tampered rhs_tampered = R_ex1 * y^c_ex1 print(f"g^(s+1) == R * y^c ? {lhs_tampered == rhs_tampered}") print("REJECT: Tampering with s breaks the proof.")

Exercise 2: Build a Simulator (Guided)

Problem: Write a simulator that produces valid-looking transcripts WITHOUT access to the secret xx. You only have access to the public values gg, yy, qq.

Hints:

  1. Choose cc and ss randomly first.

  2. Compute RR so that the verification equation gs=Rycg^s = R \cdot y^c holds.

  3. Verify your transcript passes the check.

# ── Exercise 2: Guided ─────────────────────────────────────────────── # # Fill in the blanks to build a simulator. # You should NOT use the variable x anywhere in this cell! def my_simulator(g, y, q): # Step 1: Choose random c and s c = randint(1, int(q) - 1) s = randint(1, int(q) - 1) # Step 2: Compute R so that g^s = R * y^c # Rearranging: R = ? R = ______ # <-- FILL IN: Express R in terms of g, s, y, c return R, c, s # Test your simulator: # R_test, c_test, s_test = my_simulator(g, y, q) # print(f"Transcript: R={R_test}, c={c_test}, s={s_test}") # print(f"Valid? {sigma_verify(g, y, R_test, c_test, s_test)}")

Exercise 3: Soundness Extraction Attack (Independent)

Problem: Suppose a careless prover reuses the same random nonce kk in two different protocol runs (same RR, different challenges). You intercept both transcripts. Extract the prover's secret key xx.

Here are two real transcripts generated with the same nonce kk:

# ── Exercise 3: Independent ────────────────────────────────────────── # # A careless prover reused nonce k. You have two transcripts: k_reused = randint(2, int(q) - 1) R_reused = g^k_reused c_a = randint(1, int(q) - 1) s_a = Integer(Mod(k_reused + c_a * x, q)) c_b = randint(1, int(q) - 1) while c_b == c_a: c_b = randint(1, int(q) - 1) s_b = Integer(Mod(k_reused + c_b * x, q)) print("Intercepted transcripts (same R!):") print(f" Transcript A: (R, c_a={c_a}, s_a={s_a})") print(f" Transcript B: (R, c_b={c_b}, s_b={s_b})") print(f" Both verify correctly: A={sigma_verify(g, y, R_reused, c_a, s_a)}, B={sigma_verify(g, y, R_reused, c_b, s_b)}") print() print("YOUR TASK: Extract the secret x from these two transcripts.") print("Hint: s_a = k + c_a * x (mod q) and s_b = k + c_b * x (mod q)") print() # YOUR CODE HERE: # x_recovered = ??? # print(f"Recovered x = {x_recovered}") # print(f"Verify: g^x_recovered == y ? {g^x_recovered == y}")

Real-world connection: This is not a theoretical exercise. In 2010, hackers extracted Sony's PlayStation 3 ECDSA private key because Sony reused the same nonce kk across multiple signatures. The same extraction formula you just used (or will use) is exactly what broke PS3 security. Never reuse nonces.

Summary

ConceptKey idea
3-move structureProver commits (RR), verifier challenges (cc), prover responds (ss). The flow resembles the letter Σ\Sigma
CompletenessAn honest prover who knows the witness xx always produces a valid proof
Special soundnessTwo accepting transcripts with the same commitment but different challenges allow extraction of the witness
HVZKA simulator can produce valid-looking transcripts without knowing xx, so the verifier learns nothing from the interaction
Simulation argumentZero-knowledge means "the verifier could have generated the transcript themselves." If a transcript can be faked, it cannot contain information
Nonce reuse is catastrophicReusing the random nonce kk allows anyone to extract the secret key

Next: The Schnorr Protocol, the sigma protocol we built here, formalized as a full identification scheme and digital signature.