GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/frontier/09-commitments-sigma-protocols/sage/09c-sigma-protocols-intuition.ipynb
⁴⁸³ views

unlisted

Kernel: SageMath 10.5

Sigma Protocols: The 3-Move Proof

Module 09 | 09-commitments-sigma-protocols

Commit-challenge-response, completeness, special soundness, honest-verifier zero-knowledge

The Impossible Question

Can you PROVE you know a password without ever typing it? Without revealing ANY information about it?
This sounds impossible. If you prove something, surely the proof itself leaks information? But sigma protocols do exactly this, they let a prover convince a verifier of knowledge while revealing nothing beyond the truth of the statement.

This notebook builds the intuition for how such a proof works. By the end, you will understand the elegant 3-move dance that makes zero-knowledge proofs possible, and you will run one yourself in SageMath.

Objectives

By the end of this notebook you will be able to:

Describe the 3-move structure (commit, challenge, response) of a sigma protocol.
State and demonstrate the three defining properties: completeness, special soundness, and honest-verifier zero-knowledge (HVZK).
Execute a concrete sigma protocol (prove knowledge of a discrete log) in SageMath.
Construct a simulator that creates fake transcripts indistinguishable from real ones.
Extract a witness from two accepting transcripts with the same commitment but different challenges.

Prerequisites

Completion of Pedersen Commitments.
Familiarity with cyclic groups and discrete logarithms (Modules 05-06).

Bridge from 09b: In the Pedersen commitment notebook, you saw how to commit to a value $m$ using $C = g^m h^r$ , hiding $m$ perfectly while binding you to it. Sigma protocols use a very similar idea in their first move: the prover sends a commitment to some randomness. The key insight is that this commitment structure, combined with a challenge-response mechanism, can prove knowledge of a secret without ever revealing it. The Pedersen commitment you already understand is, in fact, the algebraic engine inside many sigma protocols.

The Ali Baba Cave: An Analogy

Before we touch any math, let us build intuition with a famous analogy.

Imagine a cave shaped like a ring, with a single entrance. Deep inside, the ring splits into two paths (left and right) that reconnect at a locked door. Only someone who knows the magic word can open the door and pass through.

        Entrance
           |
          / \
    Left /   \ Right
        |     |
        | [D] |    D = locked door
        |     |
         \   /
          \ /

The protocol:

Commit: The prover (Peggy) walks into the cave and randomly chooses the left or right path. The verifier (Victor) waits outside and does not see which path she took.
Challenge: Victor shouts a random demand: "Come out from the LEFT" or "Come out from the RIGHT."
Response: If Peggy knows the magic word, she can always comply, if she is on the wrong side, she opens the door and walks through.

Why this works:

Completeness: If Peggy knows the word, she succeeds every time.
Soundness: If Peggy does NOT know the word, she can only comply if she happened to choose the right path, a 50% chance. After $n$ rounds, a fake prover succeeds with probability $1/2^n$ , which is negligible.
Zero-knowledge: Victor learns nothing! A skeptical friend watching a VIDEO of the protocol cannot distinguish it from a staged video where Peggy and Victor colluded (Victor told Peggy the challenge in advance). This is the simulation argument, if you can fake it, it reveals nothing.

Key idea: Zero-knowledge does not mean the proof is somehow "weak." It means the verifier gains NO information beyond the single bit: "Peggy knows the secret." Not a partial clue. Not a single bit of the secret itself. Nothing.

Common Misconception: "Zero-knowledge means the verifier cannot compute the secret."
It is much stronger than that. Zero-knowledge means the verifier cannot learn anything at all, not even a single bit of information about the secret. The verifier's view of the interaction is identical to something it could have generated entirely on its own, without ever talking to the prover. This is a profound guarantee: the interaction is literally worthless to the verifier as a source of information.

The 3-Move Structure

A sigma protocol for a relation $R$ is an interactive proof with exactly three messages:

Step	Direction	Name	Content
1	Prover $\to$ Verifier	Commitment ( $a$ )	Prover commits to fresh randomness
2	Verifier $\to$ Prover	Challenge ( $e$ )	Verifier sends a random challenge
3	Prover $\to$ Verifier	Response ( $z$ )	Prover responds using witness + challenge

The verifier then runs a verification equation on $(a, e, z)$ and accepts or rejects.

Why "sigma"? The communication pattern, down, back up diagonally, then down again, looks like the Greek letter $\Sigma$ (sigma) when drawn as a message sequence diagram:

  Prover          Verifier
    |--- a (commit) --->|      \
    |                   |       > looks like Σ
    |<-- e (challenge) -|      /
    |                   |     /
    |--- z (response) ->|    /

A Concrete Sigma Protocol: Proof of Knowledge of Discrete Log

Let us make this concrete. Consider the most fundamental sigma protocol:

Statement: "I know a secret $x$ such that $y = g^x$ in a cyclic group."

The prover wants to convince the verifier that they know $x$ (the witness) without revealing $x$ .

Protocol:

Commit: Prover picks random $k$ , sends $R = g^k$
Challenge: Verifier sends random $c$
Response: Prover computes $s = k + c \cdot x \pmod{q}$ and sends $s$

Verification: Verifier checks that $g^s = R \cdot y^c$

Why does this equation hold? Expand the right side: $R \cdot y^c = g^k \cdot (g^x)^c = g^k \cdot g^{xc} = g^{k + cx} = g^s \checkmark$

Let us implement this step by step.

In [ ]:

# ── Setup: a cyclic group of prime order ──────────────────────────────
# We work in Z_p^* with a prime-order subgroup of order q.
# For pedagogical clarity, we use a small safe prime: p = 2q + 1.

# Find a safe prime: p = 2q + 1 where both p and q are prime
q = next_prime(10^15)           # A 15-digit prime (subgroup order)
p = 2 * q + 1
while not is_prime(p):
    q = next_prime(q + 1)
    p = 2 * q + 1

print(f"q = {q}  (subgroup order, {len(str(q))} digits)")
print(f"p = {p}  (field prime, {len(str(p))} digits)")

# Find a generator of the order-q subgroup
# Any element h with h^q = 1 and h != 1 generates the subgroup
h_candidate = Mod(2, p)
g = h_candidate^((p - 1) // q)   # Map into the subgroup
assert g != 1, "Bad generator"
assert g^q == 1, "g does not have order q"

print(f"g = {g}  (generator of order-q subgroup)")
print(f"\nSanity check: g^q mod p = {g^q}  (should be 1)")

In [ ]:

# ── The Prover's secret ───────────────────────────────────────────────
# The prover knows x; the public value is y = g^x.

x = randint(2, int(q) - 1)       # Prover's SECRET witness
y = g^x                           # PUBLIC statement

print(f"Secret x = {x}  (only the prover knows this!)")
print(f"Public y = g^x = {y}")
print(f"\nThe prover wants to prove: 'I know x such that y = g^x'")
print(f"WITHOUT revealing x.")

In [ ]:

# ── The 3-Move Protocol ──────────────────────────────────────────────

# Step 1: COMMIT. Prover picks random k, sends R = g^k
k = randint(2, int(q) - 1)       # Random nonce (ephemeral secret)
R = g^k                           # Commitment
print("Step 1 (Prover -> Verifier):")
print(f"  Prover picks random k and sends R = g^k = {R}")
print()

# Step 2: CHALLENGE. Verifier sends random c
c = randint(1, int(q) - 1)       # Random challenge
print("Step 2 (Verifier -> Prover):")
print(f"  Verifier sends random challenge c = {c}")
print()

# Step 3: RESPONSE. Prover computes s = k + c*x mod q
s = Mod(k + c * x, q)            # Response
print("Step 3 (Prover -> Verifier):")
print(f"  Prover computes s = k + c*x mod q = {s}")
print()

# ── Verification ─────────────────────────────────────────────────────
lhs = g^Integer(s)                # g^s
rhs = R * y^c                     # R * y^c
print("Verification:")
print(f"  g^s     = {lhs}")
print(f"  R * y^c = {rhs}")
print(f"  Equal?  {lhs == rhs}")
assert lhs == rhs, "Proof failed!"
print("\nThe proof ACCEPTS. The verifier is convinced.")

Checkpoint: Verify by Hand

Before reading further, convince yourself why the verification works:

g^s = g^{k + cx} = g^k \cdot g^{cx} = g^k \cdot (g^x)^c = R \cdot y^c

Question: What happens if the prover uses $s = k + cx + 1$ instead? Would the proof still verify? Why or why not?

Click for answer

No! Then $g^s = g^{k+cx+1} = R \cdot y^c \cdot g$ , which is NOT equal to $R \cdot y^c$ (since $g \neq 1$ ). The verification equation would fail. The response $s$ must be computed exactly as $k + cx \bmod q$ .

Property 1: Completeness

Completeness: An honest prover (who knows $x$ ) can ALWAYS convince an honest verifier.

This is the most basic requirement, if the protocol is correct, it should actually work. We already verified this for one run above. Let us confirm it works for many runs.

In [ ]:

# ── Completeness: Run the protocol 1000 times ────────────────────────
# An honest prover should succeed EVERY time.

num_trials = 1000
all_passed = True

for trial in range(num_trials):
    # Prover commits
    k_i = randint(2, int(q) - 1)
    R_i = g^k_i
    
    # Verifier challenges
    c_i = randint(1, int(q) - 1)
    
    # Prover responds
    s_i = Mod(k_i + c_i * x, q)
    
    # Verify
    if g^Integer(s_i) != R_i * y^c_i:
        all_passed = False
        print(f"FAILURE at trial {trial}!")
        break

if all_passed:
    print(f"Completeness verified: {num_trials}/{num_trials} trials passed.")
    print("An honest prover ALWAYS convinces an honest verifier.")

Property 2: Special Soundness

Special soundness: Given TWO accepting transcripts $(R, c_1, s_1)$ and $(R, c_2, s_2)$ with the same commitment $R$ but different challenges $c_1 \neq c_2$ , anyone can extract the witness $x$ .

This is the proof of knowledge property. It means the prover must actually "know" $x$ , they cannot bluff, because if they could answer two different challenges for the same commitment, the secret can be extracted.

The extraction: From the two transcripts:

$s_1 = k + c_1 \cdot x \pmod{q}$
$s_2 = k + c_2 \cdot x \pmod{q}$

Subtract: $s_1 - s_2 = (c_1 - c_2) \cdot x \pmod{q}$

Therefore: $x = \frac{s_1 - s_2}{c_1 - c_2} \pmod{q}$

Since $q$ is prime, $c_1 - c_2 \not\equiv 0 \pmod{q}$ has an inverse, so we can always extract $x$ .

In [ ]:

# ── Special Soundness: Extract the witness from two transcripts ──────

# The prover uses the SAME commitment R (same k) but faces two challenges
k_sound = randint(2, int(q) - 1)
R_sound = g^k_sound

# Transcript 1: challenge c1
c1 = randint(1, int(q) - 1)
s1 = Mod(k_sound + c1 * x, q)
print(f"Transcript 1:  (R, c1={c1}, s1={s1})")
print(f"  Verify: g^s1 == R * y^c1 ? {g^Integer(s1) == R_sound * y^c1}")

# Transcript 2: same R, different challenge c2
c2 = randint(1, int(q) - 1)
while c2 == c1:
    c2 = randint(1, int(q) - 1)
s2 = Mod(k_sound + c2 * x, q)
print(f"Transcript 2:  (R, c2={c2}, s2={s2})")
print(f"  Verify: g^s2 == R * y^c2 ? {g^Integer(s2) == R_sound * y^c2}")

# ── EXTRACT the secret! ──────────────────────────────────────────────
x_extracted = Mod(Integer(s1) - Integer(s2), q) / Mod(c1 - c2, q)

print(f"\n--- Extraction ---")
print(f"x_extracted = (s1 - s2) / (c1 - c2) mod q")
print(f"x_extracted = {x_extracted}")
print(f"x_actual    = {x}")
print(f"Match?        {Integer(x_extracted) == x}")

assert Integer(x_extracted) == x, "Extraction failed!"
print("\nThe witness has been EXTRACTED from two transcripts!")
print("This proves that anyone who can answer two challenges MUST know x.")

Why Special Soundness Implies Security

Think about what this means for a cheating prover (who does NOT know $x$ ):

A cheater commits to $R$ .
The verifier sends a random $c$ .
If the cheater could produce valid responses for TWO different challenges (same $R$ ), we could extract $x$ . But the cheater does not know $x$ !
Therefore, the cheater can respond correctly to at most one challenge value.
Since the challenge is random, the cheater succeeds with probability at most $1/q$ , negligible.

The critical point: the challenge must come AFTER the commitment. If the cheater sees $c$ before committing, they can cheat (as we will see in the simulator).

Property 3: Honest-Verifier Zero-Knowledge (HVZK)

HVZK: There exists a simulator that, WITHOUT knowing the witness $x$ , can produce transcripts $(R, c, s)$ that are statistically indistinguishable from real protocol transcripts.

This is the most subtle and profound property. Let us understand it through the simulation argument.

The Simulation Argument

Zero-knowledge means: "The verifier learns NOTHING because they could have generated the transcript themselves."

How? The simulator works backwards:

Pick a random challenge $c$ and random response $s$ .
Compute the commitment as $R = g^s \cdot y^{-c}$ .
Output the transcript $(R, c, s)$ .

Check: Does this transcript verify? $g^s \stackrel{?}{=} R \cdot y^c = (g^s \cdot y^{-c}) \cdot y^c = g^s \checkmark$

Yes! The simulated transcript passes verification, and the simulator never knew $x$ !

In [ ]:

# ── The Simulator: Fake transcripts WITHOUT knowing x ────────────────

def simulate_transcript(g, y, q):
    """
    Produce a valid-looking transcript (R, c, s) WITHOUT knowing x.
    
    The trick: choose c and s first, then compute R backwards.
    """
    # Step 1: Pick random c and s
    c_sim = randint(1, int(q) - 1)
    s_sim = randint(1, int(q) - 1)
    
    # Step 2: Compute R = g^s * y^(-c)  (work backwards!)
    R_sim = g^s_sim * y^(-c_sim)
    
    return R_sim, c_sim, s_sim


# Generate a simulated transcript
R_fake, c_fake, s_fake = simulate_transcript(g, y, q)

# Verify it!
lhs_fake = g^s_fake
rhs_fake = R_fake * y^c_fake

print("Simulated transcript (NO knowledge of x was used!):")
print(f"  R = {R_fake}")
print(f"  c = {c_fake}")
print(f"  s = {s_fake}")
print(f"\n  Verification: g^s == R * y^c ? {lhs_fake == rhs_fake}")
assert lhs_fake == rhs_fake
print("\nThe FAKE transcript passes verification!")
print("The simulator never knew x, yet produced a valid proof.")

Are Simulated Transcripts Really Indistinguishable?

Let us generate many real and simulated transcripts and compare their distributions. In a real transcript:

$c$ is uniform random in $\mathbb{Z}_q$
$s = k + cx$ where $k$ is uniform random, so $s$ is also uniform random in $\mathbb{Z}_q$
$R = g^k$ is a uniform random group element

In a simulated transcript:

$c$ is uniform random in $\mathbb{Z}_q$ (chosen directly)
$s$ is uniform random in $\mathbb{Z}_q$ (chosen directly)
$R = g^s y^{-c}$ is determined by $c$ and $s$

Both distributions are identical: in both cases, $(R, c, s)$ is a uniformly random accepting transcript. No algorithm can tell them apart, not even with unlimited computational power.

In [ ]:

# ── Compare real vs simulated transcripts ─────────────────────────────
# We collect the challenge and response values (mod a small number)
# to visualize their distributions.

num_samples = 2000
bucket_mod = 50   # Reduce to small range for histogram comparison

real_c_buckets = [0] * bucket_mod
real_s_buckets = [0] * bucket_mod
sim_c_buckets  = [0] * bucket_mod
sim_s_buckets  = [0] * bucket_mod

for _ in range(num_samples):
    # Real transcript
    k_r = randint(2, int(q) - 1)
    c_r = randint(1, int(q) - 1)
    s_r = Integer(Mod(k_r + c_r * x, q))
    real_c_buckets[c_r % bucket_mod] += 1
    real_s_buckets[s_r % bucket_mod] += 1
    
    # Simulated transcript
    _, c_s, s_s = simulate_transcript(g, y, q)
    sim_c_buckets[c_s % bucket_mod] += 1
    sim_s_buckets[s_s % bucket_mod] += 1

# Display comparison
expected_per_bucket = num_samples / bucket_mod

real_c_deviation = max(abs(b - expected_per_bucket) for b in real_c_buckets)
sim_c_deviation  = max(abs(b - expected_per_bucket) for b in sim_c_buckets)
real_s_deviation = max(abs(b - expected_per_bucket) for b in real_s_buckets)
sim_s_deviation  = max(abs(b - expected_per_bucket) for b in sim_s_buckets)

print(f"Distribution comparison ({num_samples} samples, {bucket_mod} buckets):")
print(f"Expected count per bucket: {expected_per_bucket:.1f}")
print(f"")
print(f"Challenge values (c mod {bucket_mod}):")
print(f"  Real max deviation:      {real_c_deviation:.1f}")
print(f"  Simulated max deviation: {sim_c_deviation:.1f}")
print(f"")
print(f"Response values (s mod {bucket_mod}):")
print(f"  Real max deviation:      {real_s_deviation:.1f}")
print(f"  Simulated max deviation: {sim_s_deviation:.1f}")
print(f"")
print("Both distributions are statistically indistinguishable.")
print("This is WHY the protocol is zero-knowledge: the verifier's view")
print("could have been generated without the prover's help.")

The Profound Implication

Think about what we just showed:

The verifier interacts with the prover and gets a transcript $(R, c, s)$ .
But the verifier could have generated an identically distributed transcript by itself, using the simulator.
Therefore, the transcript contains zero information that the verifier did not already have.
The verifier learns exactly one thing: the statement is true (the prover knows $x$ ).

This is why it is called zero-knowledge: the knowledge transferred is zero. The verifier is convinced, but gains nothing it could use to (for example) convince someone else, or learn even one bit of $x$ .

Note: We proved honest-verifier zero-knowledge (HVZK), meaning the verifier follows the protocol (picks $c$ randomly). Full zero-knowledge, which handles malicious verifiers, requires additional techniques (see notebook 09e on the Fiat-Shamir transform).

What Happens When a Cheater Tries?

Let us see what happens when someone who does NOT know $x$ tries to run the protocol.

In [ ]:

# ── A cheating prover who does NOT know x ─────────────────────────────
# Strategy 1: The cheater picks random R and random s, hopes for the best.

num_cheat_attempts = 1000
cheat_successes = 0

for _ in range(num_cheat_attempts):
    # Cheater picks random R (doesn't know k such that R = g^k, 
    # well, they do know k, but they don't know x)
    k_cheat = randint(2, int(q) - 1)
    R_cheat = g^k_cheat
    
    # Verifier sends challenge
    c_cheat = randint(1, int(q) - 1)
    
    # Cheater must produce s such that g^s = R * y^c
    # Without x, they can only guess, pick random s
    s_cheat = randint(1, int(q) - 1)
    
    if g^s_cheat == R_cheat * y^c_cheat:
        cheat_successes += 1

print(f"Cheater success rate: {cheat_successes}/{num_cheat_attempts}")
print(f"Expected: ~0 (probability 1/q = 1/{q} per attempt)")
print(f"\nA cheater who doesn't know x cannot produce valid proofs!")

Checkpoint: The Cheater's Dilemma

Question: Could a cheater succeed by choosing $s$ BEFORE the challenge $c$ arrives?

Think about it: the cheater knows $g$ , $y$ , $R$ (which they chose), and $s$ (which they chose). They need $g^s = R \cdot y^c$ . But they do not know $c$ yet!

If they fix $s$ first, then $g^s$ and $R$ are fixed, so $y^c = g^s / R$ is fixed. This pins down exactly one value of $c$ . With probability $1/q$ , the verifier happens to pick that exact $c$ .
Alternatively, the cheater could pick $c$ first and compute a valid $(R, s)$ pair. But then they would need the verifier to send exactly that $c$ , again probability $1/q$ .

Conclusion: No matter what strategy the cheater uses, they succeed with probability at most $1/q$ .

The Three Properties Together

Let us summarize the three properties of our sigma protocol:

Property	What it says	What we showed
Completeness	Honest prover always convinces honest verifier	1000/1000 trials passed
Special Soundness	Two transcripts (same $R$ , different $c$ ) reveal $x$	Extracted $x$ from two transcripts
HVZK	Simulator produces indistinguishable transcripts without $x$	Simulated transcripts pass verification

These three properties together give us something remarkable: a proof system where:

Honest provers always succeed (completeness)
Cheating provers always fail (soundness)
The verifier learns nothing beyond the truth of the statement (zero-knowledge)

Putting It All Together: Reusable Functions

Let us package the protocol into clean, reusable functions.

In [ ]:

# ── Clean implementation of the sigma protocol ────────────────────────

def sigma_prove(g, x, q):
    """Prover's side: commit and prepare to respond.
    Returns: (R, k) where R is the commitment and k is the nonce."""
    k = randint(2, int(q) - 1)
    R = g^k
    return R, k

def sigma_challenge(q):
    """Verifier's side: produce a random challenge."""
    return randint(1, int(q) - 1)

def sigma_respond(k, c, x, q):
    """Prover's side: compute the response."""
    return Integer(Mod(k + c * x, q))

def sigma_verify(g, y, R, c, s):
    """Verifier's side: check the proof."""
    return g^s == R * y^c

def sigma_simulate(g, y, q):
    """Simulator: produce a valid transcript WITHOUT knowing x."""
    c = randint(1, int(q) - 1)
    s = randint(1, int(q) - 1)
    R = g^s * y^(-c)
    return R, c, s

def sigma_extract(s1, s2, c1, c2, q):
    """Extract witness from two accepting transcripts."""
    return Integer(Mod(s1 - s2, q) / Mod(c1 - c2, q))

# ── Demo ──────────────────────────────────────────────────────────────
print("=== Full Protocol Run ===")
R_demo, k_demo = sigma_prove(g, x, q)
c_demo = sigma_challenge(q)
s_demo = sigma_respond(k_demo, c_demo, x, q)
result = sigma_verify(g, y, R_demo, c_demo, s_demo)
print(f"Proof accepted: {result}")

print("\n=== Simulation ===")
R_sim, c_sim, s_sim = sigma_simulate(g, y, q)
result_sim = sigma_verify(g, y, R_sim, c_sim, s_sim)
print(f"Simulated proof accepted: {result_sim}")

print("\n=== Extraction ===")
# Two proofs with same k (same R), different challenges
R_ext, k_ext = sigma_prove(g, x, q)
c_ext1, c_ext2 = sigma_challenge(q), sigma_challenge(q)
while c_ext1 == c_ext2:
    c_ext2 = sigma_challenge(q)
s_ext1 = sigma_respond(k_ext, c_ext1, x, q)
s_ext2 = sigma_respond(k_ext, c_ext2, x, q)
x_ext = sigma_extract(s_ext1, s_ext2, c_ext1, c_ext2, q)
print(f"Extracted x: {x_ext}")
print(f"Actual x:    {x}")
print(f"Match: {x_ext == x}")

Crypto Foreshadowing: The protocol we just built is essentially Schnorr's protocol, the most important sigma protocol in cryptography. In the next notebook (09d), we will see Schnorr's protocol in full detail, including its use as a digital signature scheme. Schnorr signatures are used in:

Bitcoin's Taproot upgrade (BIP 340), the biggest protocol change in Bitcoin's history
Ed25519, the most widely deployed signature scheme on the internet
Every modern zero-knowledge proof system builds on sigma protocols as a foundation

The Fiat-Shamir transform (notebook 09e) will show how to make sigma protocols non-interactive, turning our 3-message conversation into a single message that anyone can verify.

Exercises

Exercise 1: Verify a Given Transcript (Fully Worked)

Problem: You are given the following transcript from a sigma protocol. Verify whether the proof is valid.

Public parameters: group with generator $g$ and order $q$ (as set up above)
Public statement: $y = g^x$ (as set up above)
Transcript: $(R, c, s)$ generated by a real prover

In [ ]:

# ── Exercise 1: Fully Worked ─────────────────────────────────────────
#
# Generate a transcript for the student to verify:
k_ex1 = randint(2, int(q) - 1)
R_ex1 = g^k_ex1
c_ex1 = randint(1, int(q) - 1)
s_ex1 = Integer(Mod(k_ex1 + c_ex1 * x, q))

print("Given transcript:")
print(f"  R = {R_ex1}")
print(f"  c = {c_ex1}")
print(f"  s = {s_ex1}")
print(f"  y = {y}  (public key)")
print(f"  g = {g}  (generator)")
print()

# SOLUTION:
# Step 1: Compute the left-hand side:  g^s
lhs_ex1 = g^s_ex1
print(f"Step 1: Compute g^s = {lhs_ex1}")

# Step 2: Compute the right-hand side:  R * y^c
rhs_ex1 = R_ex1 * y^c_ex1
print(f"Step 2: Compute R * y^c = {rhs_ex1}")

# Step 3: Compare
print(f"Step 3: g^s == R * y^c ? {lhs_ex1 == rhs_ex1}")
print()
if lhs_ex1 == rhs_ex1:
    print("ACCEPT: The proof is valid.")
else:
    print("REJECT: The proof is invalid.")

# Step 4: Now verify a TAMPERED transcript (change s by 1)
print("\n--- Now try a tampered transcript (s+1 instead of s) ---")
s_tampered = s_ex1 + 1
lhs_tampered = g^s_tampered
rhs_tampered = R_ex1 * y^c_ex1
print(f"g^(s+1) == R * y^c ? {lhs_tampered == rhs_tampered}")
print("REJECT: Tampering with s breaks the proof.")

Exercise 2: Build a Simulator (Guided)

Problem: Write a simulator that produces valid-looking transcripts WITHOUT access to the secret $x$ . You only have access to the public values $g$ , $y$ , $q$ .

Hints:

Choose $c$ and $s$ randomly first.
Compute $R$ so that the verification equation $g^s = R \cdot y^c$ holds.
Verify your transcript passes the check.

In [ ]:

# ── Exercise 2: Guided ───────────────────────────────────────────────
#
# Fill in the blanks to build a simulator.
# You should NOT use the variable x anywhere in this cell!

def my_simulator(g, y, q):
    # Step 1: Choose random c and s
    c = randint(1, int(q) - 1)
    s = randint(1, int(q) - 1)
    
    # Step 2: Compute R so that g^s = R * y^c
    #         Rearranging: R = ?
    R = ______  # <-- FILL IN: Express R in terms of g, s, y, c
    
    return R, c, s

# Test your simulator:
# R_test, c_test, s_test = my_simulator(g, y, q)
# print(f"Transcript: R={R_test}, c={c_test}, s={s_test}")
# print(f"Valid? {sigma_verify(g, y, R_test, c_test, s_test)}")

Exercise 3: Soundness Extraction Attack (Independent)

Problem: Suppose a careless prover reuses the same random nonce $k$ in two different protocol runs (same $R$ , different challenges). You intercept both transcripts. Extract the prover's secret key $x$ .

Here are two real transcripts generated with the same nonce $k$ :

In [ ]:

# ── Exercise 3: Independent ──────────────────────────────────────────
#
# A careless prover reused nonce k. You have two transcripts:

k_reused = randint(2, int(q) - 1)
R_reused = g^k_reused

c_a = randint(1, int(q) - 1)
s_a = Integer(Mod(k_reused + c_a * x, q))

c_b = randint(1, int(q) - 1)
while c_b == c_a:
    c_b = randint(1, int(q) - 1)
s_b = Integer(Mod(k_reused + c_b * x, q))

print("Intercepted transcripts (same R!):")
print(f"  Transcript A: (R, c_a={c_a}, s_a={s_a})")
print(f"  Transcript B: (R, c_b={c_b}, s_b={s_b})")
print(f"  Both verify correctly: A={sigma_verify(g, y, R_reused, c_a, s_a)}, B={sigma_verify(g, y, R_reused, c_b, s_b)}")
print()
print("YOUR TASK: Extract the secret x from these two transcripts.")
print("Hint: s_a = k + c_a * x (mod q) and s_b = k + c_b * x (mod q)")
print()

# YOUR CODE HERE:
# x_recovered = ???
# print(f"Recovered x = {x_recovered}")
# print(f"Verify: g^x_recovered == y ? {g^x_recovered == y}")

Real-world connection: This is not a theoretical exercise. In 2010, hackers extracted Sony's PlayStation 3 ECDSA private key because Sony reused the same nonce $k$ across multiple signatures. The same extraction formula you just used (or will use) is exactly what broke PS3 security. Never reuse nonces.

Summary

Concept	Key idea
3-move structure	Prover commits ( $R$ ), verifier challenges ( $c$ ), prover responds ( $s$ ). The flow resembles the letter $\Sigma$
Completeness	An honest prover who knows the witness $x$ always produces a valid proof
Special soundness	Two accepting transcripts with the same commitment but different challenges allow extraction of the witness
HVZK	A simulator can produce valid-looking transcripts without knowing $x$ , so the verifier learns nothing from the interaction
Simulation argument	Zero-knowledge means "the verifier could have generated the transcript themselves." If a transcript can be faked, it cannot contain information
Nonce reuse is catastrophic	Reusing the random nonce $k$ allows anyone to extract the secret key

Next: The Schnorr Protocol, the sigma protocol we built here, formalized as a full identification scheme and digital signature.