Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/frontier/09-commitments-sigma-protocols/sage/09b-pedersen-commitments.ipynb
483 views
unlisted
Kernel: SageMath 10.5

Pedersen Commitments

Module 09b | Commitments and Sigma Protocols

Can you commit to a secret so that even an all-powerful adversary learns nothing about it?

Motivating Question: In 09a we saw hash-based commitments: C=H(mr)C = H(m \| r). They are computationally hiding, an efficient adversary can't determine mm from CC. But an adversary with unlimited computing power could brute-force every possible (m,r)(m', r') and find which message was committed. Can we do better?

Pedersen commitments can. They achieve perfect hiding: even with infinite computational power, the commitment reveals absolutely zero information about mm. The price? Binding becomes computational rather than perfect, it relies on the hardness of the discrete logarithm problem (Module 05).

Objectives

By the end of this notebook you will be able to:

  1. Set up Pedersen commitment parameters (p,q,g,h)(p, q, g, h) and explain why logg(h)\log_g(h) must be unknown

  2. Commit to a message mm with randomness rr and verify the opening

  3. Prove that Pedersen commitments are perfectly hiding by finding alternative openings

  4. Explain why breaking binding requires solving the DLP

  5. Use the homomorphic property to add commitments without opening them

  6. Extend to vector Pedersen commitments for committing to multiple values at once

Prerequisites

  • Completion of 09a: Commitment Schemes (hiding and binding properties)

  • Familiarity with the discrete logarithm problem (05a)

  • Modular exponentiation in prime-order groups (01d)

The Pedersen Setup

The Pedersen commitment scheme works in a prime-order subgroup of Zp\mathbb{Z}_p^*. The setup requires:

  1. A large prime pp such that p=2q+1p = 2q + 1 for another prime qq (a safe prime)

  2. Two generators gg and hh of the subgroup of order qq

  3. Crucially: nobody knows logg(h)\log_g(h), the discrete log of hh with respect to gg

Why must logg(h)\log_g(h) be unknown? We'll see shortly that knowing it would let you break binding.

For pedagogical clarity, we'll start with small primes. In practice, qq would be at least 256 bits.

# === Pedersen Setup === # We need a safe prime p = 2q + 1 so that we have a subgroup of prime order q. # Step 1: Find a safe prime def find_safe_prime(bits=32): """Find a safe prime p = 2q + 1 where q is also prime.""" while True: q = random_prime(2^bits, lbound=2^(bits-1)) p = 2*q + 1 if is_prime(p): return p, q p, q = find_safe_prime(32) print(f'Safe prime: p = {p}') print(f'Sophie Germain prime: q = {q}') print(f'Subgroup order: q = {q}') print(f'Check: p = 2q + 1? {p == 2*q + 1}') # Step 2: Find generators of the order-q subgroup # Any element a where a^q = 1 and a != 1 is a generator of the order-q subgroup def find_generator(p, q): """Find a generator of the order-q subgroup of Z_p*.""" while True: a = Mod(randint(2, p-2), p) g = a^2 # squaring maps into the order-q subgroup if g != 1: return g g = find_generator(p, q) h = find_generator(p, q) # independent generator, nobody knows log_g(h) print(f'\nGenerator g = {g}') print(f'Generator h = {h}') print(f'Verify: g^q = {g^q} (should be 1)') print(f'Verify: h^q = {h^q} (should be 1)')

Checkpoint: Why do we compute g = a^2 to get a generator of the order-qq subgroup? Think about it: Zp\mathbb{Z}_p^* has order p1=2qp - 1 = 2q. Squaring any element sends it into the unique subgroup of order qq (the quadratic residues). As long as g1g \neq 1, it generates this entire subgroup because qq is prime.

Commit and Reveal

The Pedersen commitment to message mZqm \in \mathbb{Z}_q with randomness rZqr \in \mathbb{Z}_q is:

C=gmhrmodpC = g^m \cdot h^r \mod p
  • Commit phase: Choose random rZqr \leftarrow \mathbb{Z}_q, compute CC, send CC to verifier

  • Reveal phase: Send (m,r)(m, r) to verifier, who checks C=?gmhrC \stackrel{?}{=} g^m \cdot h^r

# === Pedersen Commit and Reveal === def pedersen_commit(m, r, g, h, p): """Compute Pedersen commitment C = g^m * h^r mod p.""" return (g^m * h^r) def pedersen_verify(C, m, r, g, h, p): """Verify that C opens to (m, r).""" return C == g^m * h^r # Alice commits to the message m = 42 m = 42 r = randint(1, int(q) - 1) # random blinding factor C = pedersen_commit(m, r, g, h, p) print(f'Message: m = {m}') print(f'Randomness: r = {r}') print(f'Commitment: C = {C}') # Later, Alice reveals (m, r) and Bob verifies valid = pedersen_verify(C, m, r, g, h, p) print(f'\nVerification: {valid}') # Trying to open with a wrong message fails wrong_m = 99 valid_wrong = pedersen_verify(C, wrong_m, r, g, h, p) print(f'Wrong message verification: {valid_wrong}')

Perfect Hiding

Here is the key insight that makes Pedersen commitments special.

Claim: For any commitment CC and any target message mm', there exists exactly one rr' such that C=gmhrC = g^{m'} \cdot h^{r'}.

Why? If C=gmhrC = g^m \cdot h^r, then we need gmhr=gmhrg^m \cdot h^r = g^{m'} \cdot h^{r'}, which gives hrr=gmmh^{r' - r} = g^{m - m'}. Setting r=r+(mm)logg(h)1r' = r + (m - m') \cdot \log_g(h)^{-1}... but wait, we don't know logg(h)\log_g(h)! The point is that such an rr' exists mathematically.

This means the commitment CC is consistent with every possible message. An adversary looking at CC gains literally zero information about mm. This is information-theoretic hiding, it holds even against computationally unbounded adversaries.

Let's verify this concretely. Since we're in a teaching setting, we can compute logg(h)\log_g(h) for our small parameters and demonstrate the alternative opening.

# === Demonstrating Perfect Hiding === # For our small group, we CAN compute log_g(h) to show alternative openings exist. # Compute the discrete log (only feasible for small parameters!) log_g_h = discrete_log(h, g) print(f'log_g(h) = {log_g_h} (we cheat and compute this for demonstration)') print(f'Verify: g^{log_g_h} = {g^log_g_h} = h = {h}? {g^log_g_h == h}') print() # Original commitment print(f'Original: C = g^{m} * h^{r}') print(f'C = {C}') print() # For any alternative message m', find r' such that C = g^m' * h^r' # We need: g^m * h^r = g^m' * h^r' # So: h^(r'-r) = g^(m-m'), meaning r' = r + (m - m') * log_g(h)^(-1) mod q for m_prime in [0, 1, 100, 9999]: # r' = r + (m - m') * (log_g_h)^(-1) mod q r_prime = Mod(r + (m - m_prime) * inverse_mod(log_g_h, q), q) C_check = g^(m_prime) * h^(int(r_prime)) print(f" m' = {m_prime}, r' = {r_prime} => C = {C_check} match: {C_check == C}") print() print('Every message is equally consistent with C. PERFECT hiding!')

Checkpoint: We just showed that commitment CC can be "opened" to m=0,1,100,9999m' = 0, 1, 100, 9999 (and any other value). Convince yourself: why does this prove that seeing CC gives an adversary no information about mm? The key is that the distribution of CC is identical regardless of which mm was committed, because rr was chosen uniformly at random.

Computational Binding

If Pedersen commitments are perfectly hiding, how can they be binding at all?

Common mistake: "If the commitment is perfectly hiding, how can it be binding at all?" The resolution: binding is computational, not information-theoretic. An adversary with unlimited computation could break binding. But an efficient adversary cannot, because doing so requires solving the DLP.

Why breaking binding implies solving the DLP:

Suppose a cheating committer finds two openings (m1,r1)(m_1, r_1) and (m2,r2)(m_2, r_2) with m1m2m_1 \neq m_2 for the same commitment CC:

gm1hr1=gm2hr2g^{m_1} \cdot h^{r_1} = g^{m_2} \cdot h^{r_2}

Then gm1m2=hr2r1g^{m_1 - m_2} = h^{r_2 - r_1}, which gives:

logg(h)=m1m2r2r1modq\log_g(h) = \frac{m_1 - m_2}{r_2 - r_1} \mod q

So breaking binding directly reveals logg(h)\log_g(h), it solves the DLP!

# === Breaking Binding = Solving the DLP === # If someone finds two valid openings, they've solved the DLP. # Suppose we somehow have two openings for the same C: m1, r1 = m, r # original opening # Use the alternative opening we computed above (m'=100) m2 = 100 r2 = int(Mod(r + (m - m2) * inverse_mod(log_g_h, q), q)) # Verify both open to the same C print(f'Opening 1: m={m1}, r={r1} => C = {pedersen_commit(m1, r1, g, h, p)}') print(f'Opening 2: m={m2}, r={r2} => C = {pedersen_commit(m2, r2, g, h, p)}') print(f'Same C? {pedersen_commit(m1, r1, g, h, p) == pedersen_commit(m2, r2, g, h, p)}') print() # Extract log_g(h) from the two openings recovered_log = Mod((m1 - m2) * inverse_mod(int(Mod(r2 - r1, q)), q), q) print(f'Recovered log_g(h) = {recovered_log}') print(f'Actual log_g(h) = {log_g_h}') print(f'Match: {int(recovered_log) == int(Mod(log_g_h, q))}') print() print('Breaking binding <=> Solving the DLP!')

The Hiding-Binding Duality

In 09a we learned that commitments have two security properties: hiding and binding. Here is a fundamental result:

SchemeHidingBinding
Hash commitment H(mr)H(m | r)ComputationalPerfectly binding (collision-free)
Pedersen gmhrg^m h^rPerfectComputational (DLP-hard)

A deep theorem states: no commitment scheme can be both perfectly hiding AND perfectly binding. Intuitively:

  • Perfect hiding means every CC is consistent with every mm, so alternative openings exist

  • Perfect binding means only one valid opening exists, so CC leaks some information

You must choose one to be perfect and the other computational. Pedersen chooses perfect hiding because in many cryptographic protocols (zero-knowledge proofs, confidential transactions), the prover needs the strongest possible privacy guarantee.

The Homomorphic Property

This is the feature that makes Pedersen commitments a cornerstone of modern cryptography.

Given two commitments:

  • C1=gm1hr1C_1 = g^{m_1} \cdot h^{r_1}

  • C2=gm2hr2C_2 = g^{m_2} \cdot h^{r_2}

Their product is: C1C2=gm1hr1gm2hr2=gm1+m2hr1+r2=Commit(m1+m2,  r1+r2)C_1 \cdot C_2 = g^{m_1} h^{r_1} \cdot g^{m_2} h^{r_2} = g^{m_1 + m_2} \cdot h^{r_1 + r_2} = \text{Commit}(m_1 + m_2,\; r_1 + r_2)

Commitments can be added without opening them! Nobody needs to know m1m_1 or m2m_2 individually, the product of the commitments is a valid commitment to the sum of the messages.

# === Homomorphic Addition of Pedersen Commitments === # Alice commits to m1, Bob commits to m2 m1 = 17 r1 = randint(1, int(q) - 1) C1 = pedersen_commit(m1, r1, g, h, p) m2 = 25 r2 = randint(1, int(q) - 1) C2 = pedersen_commit(m2, r2, g, h, p) print(f'C1 = Commit({m1}, r1) = {C1}') print(f'C2 = Commit({m2}, r2) = {C2}') print() # Multiply the commitments C_product = C1 * C2 # This should equal Commit(m1 + m2, r1 + r2) C_sum = pedersen_commit(m1 + m2, (r1 + r2) % int(q), g, h, p) print(f'C1 * C2 = {C_product}') print(f'Commit({m1}+{m2}, r1+r2) = {C_sum}') print(f'Equal? {C_product == C_sum}') print() print(f'We computed a commitment to {m1} + {m2} = {m1+m2} WITHOUT knowing the individual values!')

Checkpoint: What happens if you raise a commitment to a scalar kk? We get Ck=(gmhr)k=gkmhkr=Commit(km,kr)C^k = (g^m h^r)^k = g^{km} \cdot h^{kr} = \text{Commit}(km, kr). Pedersen commitments are homomorphic under scalar multiplication too! Together with addition, this makes them linearly homomorphic.

# === Scalar Multiplication of Commitments === k = 3 C_scaled = C1^k C_direct = pedersen_commit(k * m1, (k * r1) % int(q), g, h, p) print(f'C1^{k} = {C_scaled}') print(f'Commit({k}*{m1}, {k}*r1) = {C_direct}') print(f'Equal? {C_scaled == C_direct}') print() # Linear combination: 3*m1 + 5*m2 a, b = 3, 5 C_lincomb = C1^a * C2^b C_lincomb_direct = pedersen_commit( (a*m1 + b*m2) % int(q), (a*r1 + b*r2) % int(q), g, h, p ) print(f'{a}*C1 + {b}*C2 (multiplicatively) = {C_lincomb}') print(f'Commit({a}*{m1}+{b}*{m2}, ...) = {C_lincomb_direct}') print(f'Equal? {C_lincomb == C_lincomb_direct}') print(f'\nWe computed Commit({a*m1 + b*m2}) without knowing m1 or m2!')

Visualizing the Commitment Distribution

Let's verify perfect hiding visually. We commit to two different messages m=0m = 0 and m=1m = 1 many times (with fresh random rr each time) and plot the distribution of the resulting commitments. If hiding is perfect, the two distributions should be indistinguishable.

# === Visualizing Perfect Hiding === # Commit to m=0 and m=1 many times and compare distributions # Use a smaller group for visualization p_vis = 2 * 227 + 1 # 455 = 2*227+1, and 227 is prime assert is_prime(p_vis) and is_prime(227) q_vis = 227 # Find generators of the order-227 subgroup g_vis = Mod(2, p_vis)^2 # square to get into subgroup while g_vis^q_vis != 1 or g_vis == 1: g_vis = Mod(randint(2, p_vis - 2), p_vis)^2 h_vis = Mod(randint(2, p_vis - 2), p_vis)^2 while h_vis^q_vis != 1 or h_vis == 1 or h_vis == g_vis: h_vis = Mod(randint(2, p_vis - 2), p_vis)^2 N = 500 commits_m0 = [int(g_vis^0 * h_vis^randint(0, q_vis - 1)) for _ in range(N)] commits_m1 = [int(g_vis^1 * h_vis^randint(0, q_vis - 1)) for _ in range(N)] # Plot as histograms p1 = histogram(commits_m0, bins=50, color='blue', alpha=0.5, legend_label='m=0') p2 = histogram(commits_m1, bins=50, color='red', alpha=0.5, legend_label='m=1') show(p1 + p2, title='Commitment distributions for m=0 vs m=1 (should overlap)', figsize=(10, 4))

Both histograms should look uniformly distributed across the group, identical distributions regardless of the committed message. This is perfect hiding in action: the commitment value carries zero information about mm.

Vector Pedersen Commitments

What if you want to commit to multiple values (m1,m2,,mn)(m_1, m_2, \ldots, m_n) at once? The vector Pedersen commitment generalizes the scheme:

C=g1m1g2m2gnmnhrmodpC = g_1^{m_1} \cdot g_2^{m_2} \cdot \ldots \cdot g_n^{m_n} \cdot h^r \mod p

where g1,g2,,gn,hg_1, g_2, \ldots, g_n, h are independent generators (nobody knows any discrete log relation between them).

This is much more efficient than making nn separate commitments: a single group element commits to an entire vector.

# === Vector Pedersen Commitment === def find_n_generators(n, p, q): """Find n independent generators of the order-q subgroup.""" generators = [] seen = set() while len(generators) < n: candidate = Mod(randint(2, int(p) - 2), p)^2 if candidate != 1 and candidate^q == 1 and int(candidate) not in seen: generators.append(candidate) seen.add(int(candidate)) return generators # Commit to a vector of 4 values n_vals = 4 gens = find_n_generators(n_vals, p, q) # g_1, ..., g_4 h_vec = find_generator(p, q) # blinding generator # Message vector and randomness messages = [10, 20, 30, 40] r_vec = randint(1, int(q) - 1) # Compute vector commitment: C = g_1^m_1 * g_2^m_2 * ... * g_n^m_n * h^r C_vec = prod(gi^mi for gi, mi in zip(gens, messages)) * h_vec^r_vec print(f'Message vector: {messages}') print(f'Vector commitment: C = {C_vec}') # Verify C_verify = prod(gi^mi for gi, mi in zip(gens, messages)) * h_vec^r_vec print(f'Verification: {C_verify == C_vec}') print() # The homomorphic property extends to vectors! messages2 = [5, 10, 15, 20] r_vec2 = randint(1, int(q) - 1) C_vec2 = prod(gi^mi for gi, mi in zip(gens, messages2)) * h_vec^r_vec2 # Component-wise sum msg_sum = [(m1 + m2) % int(q) for m1, m2 in zip(messages, messages2)] r_sum = (r_vec + r_vec2) % int(q) C_sum_vec = prod(gi^mi for gi, mi in zip(gens, msg_sum)) * h_vec^r_sum print(f'Vector 1: {messages}') print(f'Vector 2: {messages2}') print(f'Sum: {msg_sum}') print(f'C1 * C2 = {C_vec * C_vec2}') print(f'C(sum) = {C_sum_vec}') print(f'Homomorphic? {C_vec * C_vec2 == C_sum_vec}')

Exercises

Exercise 1 (Worked)

Task: Set up Pedersen parameters with the safe prime p=23p = 23 (so q=11q = 11). Find generators gg and hh of the order-11 subgroup. Commit to m=7m = 7 with r=3r = 3. Then find an alternative opening (m,r)(m', r') with m=2m' = 2 that produces the same commitment.

This exercise reinforces the perfect hiding property by working through a small concrete example where we can compute logg(h)\log_g(h) by hand.

# Exercise 1. Worked Solution # Step 1: Setup p_ex = 23 # safe prime: 23 = 2*11 + 1 q_ex = 11 # subgroup order # Find a generator: square a random element g_ex = Mod(2, p_ex)^2 # 4 print(f'g = {g_ex}') print(f'g^11 = {g_ex^11} (should be 1)') # List the order-q subgroup generated by g subgroup = [g_ex^i for i in range(q_ex)] print(f'Subgroup: {sorted([int(x) for x in subgroup])}') # Pick h as another generator (different from g) h_ex = Mod(3, p_ex)^2 # 9 print(f'h = {h_ex}') print(f'h^11 = {h_ex^11} (should be 1)') print() # Step 2: Commit to m=7 with r=3 m_ex = 7 r_ex = 3 C_ex = g_ex^m_ex * h_ex^r_ex print(f'C = g^{m_ex} * h^{r_ex} = {g_ex^m_ex} * {h_ex^r_ex} = {C_ex}') print() # Step 3: Find log_g(h) by brute force (small group!) log_g_h_ex = None for i in range(q_ex): if g_ex^i == h_ex: log_g_h_ex = i break print(f'log_g(h) = {log_g_h_ex}') print(f'Check: g^{log_g_h_ex} = {g_ex^log_g_h_ex} = h? {g_ex^log_g_h_ex == h_ex}') print() # Step 4: Find r' such that C = g^2 * h^r' (i.e., m'=2) m_prime = 2 # r' = r + (m - m') * log_g(h)^{-1} mod q inv_log = inverse_mod(log_g_h_ex, q_ex) r_prime = Mod(r_ex + (m_ex - m_prime) * inv_log, q_ex) print(f"Alternative opening: m' = {m_prime}, r' = {r_prime}") # Verify C_check = g_ex^m_prime * h_ex^int(r_prime) print(f"g^{m_prime} * h^{int(r_prime)} = {C_check}") print(f'Same as original C = {C_ex}? {C_check == C_ex}') print() print('Both (m=7, r=3) and (m=2, r={}) open to the same commitment!'.format(int(r_prime))) print('This is why Pedersen commitments are PERFECTLY hiding.')

Exercise 2 (Guided)

Task: Demonstrate the homomorphic property with a concrete application. Alice commits to her age m1m_1 and Bob commits to his age m2m_2. Without either revealing their age, compute a commitment to the sum of their ages and verify it.

Use the parameters p,q,g,hp, q, g, h from the main setup above.

Hints:

  1. Create two commitments C1C_1 and C2C_2 with different messages and randomness

  2. Compute Csum=C1C2C_{\text{sum}} = C_1 \cdot C_2

  3. Verify that CsumC_{\text{sum}} opens with message m1+m2m_1 + m_2 and randomness r1+r2modqr_1 + r_2 \mod q

# Exercise 2. Guided (fill in the TODOs) # Alice's age (secret) alice_age = 28 r_alice = randint(1, int(q) - 1) C_alice = pedersen_commit(alice_age, r_alice, g, h, p) print(f'Alice publishes her commitment: C_alice = {C_alice}') # Bob's age (secret) bob_age = 34 r_bob = randint(1, int(q) - 1) C_bob = pedersen_commit(bob_age, r_bob, g, h, p) print(f'Bob publishes his commitment: C_bob = {C_bob}') print() # TODO: Compute the combined commitment C_sum = C_alice * C_bob # C_sum = ??? # TODO: What message and randomness does C_sum commit to? # combined_age = ??? # combined_r = ??? # TODO: Verify by computing Commit(combined_age, combined_r) directly # C_direct = pedersen_commit(???, ???, g, h, p) # TODO: Print whether C_sum == C_direct # print(f'Homomorphic addition works? {???}')

Exercise 3 (Independent)

Task: Implement a "balance proof" system. A bank commits to each customer's balance using Pedersen commitments. Given n=5n = 5 customer balances, show that the bank can prove the total deposits equal a claimed amount TT by:

  1. Computing individual commitments Ci=gbihriC_i = g^{b_i} h^{r_i} for each balance bib_i

  2. Computing the product Ctotal=CiC_{\text{total}} = \prod C_i (homomorphic sum)

  3. Opening CtotalC_{\text{total}} with m=bim = \sum b_i and r=rimodqr = \sum r_i \mod q to prove the total

Choose your own balances and verify the entire flow. Bonus: what happens if the bank tries to lie about the total? Show that the verification fails.

# Exercise 3. Your code here

Summary

ConceptKey idea
Pedersen setupPrime-order group with generators gg and hh where logg(h)\log_g(h) is unknown
CommitmentC=gmhrC = g^m \cdot h^r, opened by revealing (m,r)(m, r)
Perfectly hidingFor any CC and any mm', there exists rr' with C=gmhrC = g^{m'} h^{r'}, so the commitment leaks zero information, even to an unbounded adversary
Computationally bindingFinding two valid openings implies solving the DLP
Hiding-binding dualityYou cannot have both perfect hiding and perfect binding. Pedersen trades perfect binding for perfect hiding (opposite of hash commitments)
HomomorphicCommit(m1)Commit(m2)=Commit(m1+m2)\text{Commit}(m_1) \cdot \text{Commit}(m_2) = \text{Commit}(m_1 + m_2), so commitments can be added without opening
Vector commitmentsC=g1m1gnmnhrC = g_1^{m_1} \cdots g_n^{m_n} \cdot h^r commits to an entire vector in one group element

Crypto foreshadowing: The homomorphic property of Pedersen commitments is the foundation of Bulletproofs range proofs and confidential transactions in Monero and Mimblewimble. When you send cryptocurrency in these systems, the transaction amounts are hidden inside Pedersen commitments, and miners can verify that inputs equal outputs without seeing any amounts, using exactly the homomorphic addition we demonstrated above. In 09c, we'll see how Pedersen commitments combine with sigma protocols to build zero-knowledge proofs.

Next: Sigma Protocols: Intuition