Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/frontier/09-commitments-sigma-protocols/sage/09a-commitment-schemes.ipynb
483 views
unlisted
Kernel: SageMath 10.5

Commitment Schemes

Module 09a | Commitments and Sigma Protocols

The cryptographic sealed envelope: bind yourself to a value without revealing it.

Motivating question: Alice and Bob want to play "odds and evens" over the internet. Each picks a number (say 0 or 1). If the sum is even, Alice wins; if odd, Bob wins. But here's the problem: if Alice announces first, Bob sees her choice and picks accordingly. If Bob goes first, Alice cheats the same way. Can they play fairly WITHOUT a trusted third party?

By the end of this notebook, you'll have a complete answer, and you'll see that the tool behind it, the commitment scheme, is one of the most fundamental primitives in modern cryptography.

Objectives

By the end of this notebook you will be able to:

  1. Explain the sealed-envelope analogy and the two phases of a commitment (commit, reveal)

  2. State the two security properties, hiding and binding, and explain why they matter

  3. Implement a hash-based commitment scheme in SageMath

  4. Understand the fundamental duality: why you cannot have both perfect hiding and perfect binding

  5. Apply commitments to solve fair coin-flipping and other protocols

Prerequisites

You should have completed Modules 01–08 (modular arithmetic through pairings). This notebook requires:

  • Comfort with hash functions (conceptually)

  • Basic familiarity with SageMath / Python syntax

Bridge from earlier modules: In Modules 01–08, we built encryption (AES, RSA), key exchange (Diffie–Hellman, ECDH), and digital signatures (ECDSA). These primitives protect confidentiality and authenticity. Commitments are a different kind of primitive — they don't encrypt a message or prove who sent it. Instead, they let you bind yourself to a value without revealing it. This seemingly simple idea unlocks an entirely new world: zero-knowledge proofs, fair protocols, and verifiable computation.

The Sealed Envelope Analogy

A commitment scheme is the digital version of a sealed envelope:

Physical WorldCryptographic World
Write message on paperChoose message mm
Seal it in an opaque envelopeCompute commitment C=Commit(m,r)C = \text{Commit}(m, r)
Hand the sealed envelope to a friendSend CC to the verifier
Later: open the envelope to reveal the messageReveal mm and rr; verifier checks C=?Commit(m,r)C \stackrel{?}{=} \text{Commit}(m, r)

Two phases:

  1. Commit phase: The committer locks in a value and sends the commitment. The value is hidden.

  2. Reveal phase (opening): The committer reveals the original value and any randomness. The verifier checks consistency.

The scheme must satisfy two security properties:

  • Hiding: The sealed envelope is opaque — seeing CC reveals nothing about mm

  • Binding: The sealed envelope is tamper-proof — the committer cannot open it to a different message mmm' \neq m

Hash-Based Commitment

The simplest commitment scheme uses a cryptographic hash function:

C=H(mr)C = H(m \,\|\, r)

where mm is the message and rr is a random nonce (the "randomness" or "blinding factor").

  • Commit: Pick random rr, compute C=H(mr)C = H(m \,\|\, r), send CC

  • Reveal: Send (m,r)(m, r). The verifier computes H(mr)H(m \,\|\, r) and checks it equals CC

Why does this work?

  • Hiding comes from the random rr: even if mm is known to be one of a few values, the hash output looks uniformly random because rr adds entropy

  • Binding comes from collision resistance: the committer cannot find a different (m,r)(m', r') such that H(mr)=H(mr)H(m' \,\|\, r') = H(m \,\|\, r)

Misconception alert: "A commitment scheme is just a hash." No! The hash is ONE implementation. The concept is more general — any scheme satisfying hiding + binding is a commitment. We'll see Pedersen commitments (next notebook) that use discrete logarithms instead of hashes, and they have very different security properties.

import hashlib import os def hash_commit(message: bytes, randomness: bytes) -> str: """Commit to a message using SHA-256: C = H(m || r)""" return hashlib.sha256(message + randomness).hexdigest() def hash_verify(commitment: str, message: bytes, randomness: bytes) -> bool: """Verify that (message, randomness) opens to the given commitment.""" return hashlib.sha256(message + randomness).hexdigest() == commitment # --- Commit phase --- m = b'heads' # Alice's secret choice r = os.urandom(32) # 256 bits of randomness C = hash_commit(m, r) print(f'Message: {m}') print(f'Randomness: {r.hex()}') print(f'Commitment: {C}') print() # --- Reveal phase --- print(f'Verification: {hash_verify(C, m, r)}') print(f'Tampered msg: {hash_verify(C, b"tails", r)}')

Checkpoint: Before running the next cell, predict: if Alice tries to open her commitment to b'tails' using a different random nonce rr', will she succeed? Why or why not?

# Can Alice cheat? She committed to 'heads' but wants to claim 'tails' # She would need to find r' such that H('tails' || r') = C # Let's try a brute-force search (with a small nonce space, for illustration) cheating_attempts = 0 found_cheat = False for trial in range(100000): r_fake = trial.to_bytes(32, 'big') if hash_commit(b'tails', r_fake) == C: print(f'Cheat found after {trial} attempts!') found_cheat = True break cheating_attempts += 1 if not found_cheat: print(f'No cheat found after {cheating_attempts} attempts.') print('Finding a collision in SHA-256 would take ~2^128 operations.') print('The commitment is BINDING: Alice cannot change her choice.')

Why Randomness Is Essential

What if we skip the randomness and just compute C=H(m)C = H(m)? This is a common beginner mistake. Let's see what goes wrong.

# BAD commitment: no randomness C_bad = hashlib.sha256(b'heads').hexdigest() print(f'Commitment (no randomness): {C_bad}') print() # The verifier knows Alice chose either 'heads' or 'tails'. # Can the verifier figure out which one? candidates = [b'heads', b'tails'] for candidate in candidates: h = hashlib.sha256(candidate).hexdigest() match = ' <-- MATCH!' if h == C_bad else '' print(f'H({candidate.decode():5s}) = {h[:20]}...{match}') print() print('Without randomness, the verifier can test all possible messages.') print('HIDING is completely broken when the message space is small!')

The Two Security Properties in Detail

Let's be precise about what hiding and binding mean:

Hiding

Given a commitment CC, an adversary cannot determine which message mm was committed to.

  • Computationally hiding: No efficient algorithm can distinguish commitments to different messages. (The hash commitment achieves this — as long as the adversary can't invert SHA-256.)

  • Perfectly hiding (information-theoretic): Even with unlimited computational power, the adversary learns nothing about mm from CC. (The hash commitment does NOT achieve this — there exists exactly one preimage.)

Binding

The committer cannot find two different messages that produce the same commitment.

  • Computationally binding: No efficient algorithm can find mmm \neq m' with Commit(m,r)=Commit(m,r)\text{Commit}(m, r) = \text{Commit}(m', r'). (The hash commitment achieves this via collision resistance.)

  • Perfectly binding (information-theoretic): It is impossible (not just hard) to find a collision. (The hash commitment achieves this! For a fixed CC, there is at most one valid opening — well, technically the function is injective on its domain.)

So the hash commitment is: computationally hiding + perfectly binding.

The Fundamental Duality

Here is a deep and surprising fact:

Theorem (informal): No commitment scheme can be both perfectly hiding and perfectly binding.

Why? Suppose a scheme is perfectly hiding. Then for every commitment CC, every message mm in the message space could have produced CC (with some randomness rr). But that means the committer can find an alternative opening — so the scheme cannot be perfectly binding!

Conversely, if a scheme is perfectly binding, then each CC has a unique opening, so an unbounded adversary could (in principle) find it — breaking perfect hiding.

This gives us two families:

TypeHidingBindingExample
Hash commitmentComputationalPerfect (effectively)C=H(mr)C = H(m \,|\, r)
Pedersen commitmentPerfectComputationalC=gmhrC = g^m h^r (next notebook)

The choice depends on your application. For zero-knowledge proofs, we usually want perfect hiding (the Pedersen style), because the prover's secret must remain information-theoretically hidden.

Crypto foreshadowing: Pedersen commitments (next notebook, 09b) achieve perfect hiding using the discrete log assumption. They are additively homomorphic — you can add committed values without opening them. This property is the foundation of Bulletproofs, confidential transactions in blockchains, and the commitment schemes inside SNARKs (Module 10).

# Demonstrating the duality: a "toy" perfectly-hiding scheme # In a perfectly hiding scheme, every commitment C can be opened to ANY message. # Let's show this with a toy example using one-time pads. # Toy scheme: Commit(m, r) = m XOR r, where r is uniform random # This is perfectly hiding: C = m XOR r is uniform regardless of m # But NOT binding: given C, the committer can claim ANY m' by setting r' = m' XOR C import secrets m_real = 42 r = secrets.randbelow(256) # random byte C = m_real ^^ r # XOR commitment (^^ is XOR in SageMath) print(f'Committed value: m = {m_real}') print(f'Randomness: r = {r}') print(f'Commitment: C = {C}') print() # Now the dishonest committer wants to open to m' = 99 m_fake = 99 r_fake = m_fake ^^ C # compute r' so that m' XOR r' = C print(f'Fake opening: m\' = {m_fake}, r\' = {r_fake}') print(f'Verification: m\' XOR r\' = {m_fake ^^ r_fake} == C = {C}? {m_fake ^^ r_fake == C}') print() print('Perfectly hiding (C is uniform), but NOT binding (committer can cheat)!') print('This illustrates the duality: you cannot have both.')

Commitment as a Protocol

Let's formalize the two-phase protocol with a complete interactive simulation. Alice commits, sends the commitment to Bob, and later reveals.

Phase 1 (Commit): Alice ---[ C ]----> Bob (Bob stores C) Phase 2 (Reveal): Alice ---[ m, r ]---> Bob (Bob checks H(m||r) == C)
import hashlib, os class Committer: """Alice: commits to a value, later reveals it.""" def commit(self, message: bytes): self._m = message self._r = os.urandom(32) self._C = hashlib.sha256(self._m + self._r).hexdigest() return self._C # send only the commitment def reveal(self): return self._m, self._r # open the envelope class Verifier: """Bob: receives commitment, later verifies the opening.""" def receive_commitment(self, C: str): self._C = C print(f'[Bob] Received commitment: {C[:32]}...') def verify(self, message: bytes, randomness: bytes) -> bool: C_check = hashlib.sha256(message + randomness).hexdigest() valid = (C_check == self._C) print(f'[Bob] Received opening: m = {message}') print(f'[Bob] Recomputed hash matches commitment? {valid}') return valid # --- Run the protocol --- alice = Committer() bob = Verifier() print('=== Phase 1: Commit ===') C = alice.commit(b'I choose 7') bob.receive_commitment(C) print() print('=== Phase 2: Reveal ===') m, r = alice.reveal() result = bob.verify(m, r) print(f'\nProtocol outcome: {"ACCEPT" if result else "REJECT"}')

Application: Fair Coin Flipping

Now let's solve the motivating question! Alice and Bob can play odds-and-evens fairly using commitments:

  1. Alice picks her bit a{0,1}a \in \{0, 1\} and sends C=Commit(a,r)C = \text{Commit}(a, r) to Bob

  2. Bob picks his bit b{0,1}b \in \{0, 1\} and sends bb to Alice (in the clear)

  3. Alice reveals (a,r)(a, r). Bob verifies the commitment.

  4. The "coin flip" result is aba \oplus b (XOR)

Why is this fair?

  • Bob can't cheat: when he sends bb in step 2, he doesn't know aa (hiding)

  • Alice can't cheat: she committed to aa before seeing bb (binding)

import hashlib, os, secrets def commit(bit: int) -> tuple: """Returns (commitment, message_bytes, randomness).""" m = str(bit).encode() r = os.urandom(32) C = hashlib.sha256(m + r).hexdigest() return C, m, r def verify(C: str, m: bytes, r: bytes) -> bool: return hashlib.sha256(m + r).hexdigest() == C print('=== Fair Coin Flip Protocol ===\n') # Step 1: Alice commits alice_bit = secrets.randbelow(2) C, m_alice, r_alice = commit(alice_bit) print(f'Step 1: Alice commits (bit hidden). Sends C = {C[:24]}...') # Step 2: Bob chooses his bit (he can't see Alice's) bob_bit = secrets.randbelow(2) print(f'Step 2: Bob chooses b = {bob_bit} and sends it to Alice.') # Step 3: Alice reveals print(f'Step 3: Alice reveals a = {alice_bit}, r = {r_alice.hex()[:16]}...') valid = verify(C, m_alice, r_alice) print(f' Bob verifies: {"VALID" if valid else "INVALID (Alice cheated!)"}') # Step 4: Result result = alice_bit ^^ bob_bit winner = 'Alice' if result == 0 else 'Bob' print(f'\nResult: a XOR b = {alice_bit} XOR {bob_bit} = {result}') print(f'Winner: {winner}!') print(f'\nNeither party could cheat: hiding protected Bob, binding protected Alice.')

Checkpoint: Think about this: what if Alice refuses to reveal in step 3? The protocol stalls, but does anyone learn anything unfair? What about if the commitment were NOT hiding — could Bob cheat in step 2? Make sure you can answer both before proceeding.

Other Applications

Commitments appear everywhere in cryptography:

ApplicationHow commitments help
Sealed-bid auctionsEach bidder commits to their bid. All commitments are collected, then all are opened simultaneously. No one can adjust their bid after seeing others'.
Zero-knowledge proofsThe prover commits to intermediate values. The verifier challenges. The commitment ensures the prover can't "backtrack."
Coin flippingAs we just saw — commitments enable fair randomness without trusted third parties.
BlockchainCommit-reveal schemes prevent front-running in DeFi, enable private voting, and more.

The pattern is always the same: commit first, then interact, then reveal. The commitment creates a temporal barrier that prevents cheating.

Exercises

Exercise 1: Sealed-Bid Auction (Worked)

Three bidders each commit to a secret bid. After all commitments are collected, the bids are revealed and the highest bidder wins. Implement this protocol and verify all commitments.

import hashlib, os # --- Sealed-Bid Auction Protocol --- # Phase 1: Each bidder commits to a secret bid bidders = { 'Alice': 150, 'Bob': 220, 'Carol': 180, } commitments = {} secrets_store = {} # In practice, each bidder keeps their own secrets print('=== Phase 1: Commit ===\n') for name, bid in bidders.items(): m = str(bid).encode() r = os.urandom(32) C = hashlib.sha256(m + r).hexdigest() commitments[name] = C secrets_store[name] = (m, r) print(f'{name} commits: C = {C[:32]}...') print('\nAll commitments collected. No one can change their bid now.\n') # Phase 2: Reveal all bids and verify print('=== Phase 2: Reveal ===\n') verified_bids = {} for name in bidders: m, r = secrets_store[name] C_check = hashlib.sha256(m + r).hexdigest() valid = (C_check == commitments[name]) bid_value = int(m.decode()) verified_bids[name] = bid_value status = 'VALID' if valid else 'INVALID' print(f'{name} reveals bid = {bid_value} [{status}]') # Determine winner winner = max(verified_bids, key=verified_bids.get) print(f'\nWinner: {winner} with bid {verified_bids[winner]}!') print('\nKey insight: binding ensures no bidder could adjust after seeing others\' commitments.')

Exercise 2: Broken Commitment — What Goes Wrong? (Guided)

Alice uses a bad commitment scheme: she commits to her message WITHOUT randomness, i.e., C=H(m)C = H(m) instead of C=H(mr)C = H(m \,\|\, r).

Bob knows Alice's message is one of: 'yes', 'no', or 'maybe'.

Your task: Write code that lets Bob figure out Alice's message from the commitment alone.

Hints:

  • Bob can compute H(m)H(m) for each candidate message

  • Compare each result to Alice's commitment

  • Which security property is broken? Hiding or binding?

import hashlib # Alice's (bad) commitment: no randomness! alice_secret = b'maybe' C_bad = hashlib.sha256(alice_secret).hexdigest() print(f'Alice sends commitment: {C_bad[:32]}...') print() # Bob knows the message is one of these: candidates = [b'yes', b'no', b'maybe'] # TODO: Bob's attack # For each candidate message, compute its hash and compare to C_bad. # Print which message Alice committed to. # # Your code here: # for candidate in candidates: # h = hashlib.sha256(______).hexdigest() # if h == ______: # print(f'Bob discovered Alice\'s message: {______}') # # After finding the answer, write a comment explaining: # Which property is broken? Why does randomness fix it?

Exercise 3: Commit-and-Reveal Voting (Independent)

Design and implement a simple yes/no vote among 5 voters using hash commitments. Your protocol should:

  1. Have each voter commit to their vote (b'yes' or b'no') with proper randomness

  2. Collect all commitments before any reveals

  3. Verify each opening and tally the votes

  4. Print the final result

Challenge questions (answer in comments):

  • What happens if a voter refuses to reveal? How would you handle this?

  • Does this scheme protect voter privacy after the reveal phase? Why or why not?

  • How might you modify this to keep votes private even after tallying? (Hint: think about homomorphic commitments — coming in 09b)

import hashlib, os # Exercise 3: Your implementation here # # Suggested structure: # 1. Define voters and their secret votes # 2. Commit phase: each voter commits # 3. Reveal phase: each voter reveals, verify each commitment # 4. Tally and announce result

Summary

ConceptKey idea
Two phasesCommit (seal the value), then Reveal (open and verify)
HidingThe commitment reveals nothing about the message
BindingThe committer cannot change the message after committing
Hash commitmentC=H(mr)C = H(m \,|\, r) is simple, computationally hiding, and perfectly binding
Randomness is essentialWithout rr, an adversary can brute-force small message spaces and break hiding
Fundamental dualityYou cannot have both perfect hiding and perfect binding, you must choose one to be computational
ApplicationsFair coin flipping, sealed-bid auctions, zero-knowledge proofs, and more

Looking ahead: In the next notebook, we move from hash-based commitments to Pedersen commitments — an algebraic scheme that achieves perfect hiding using the discrete log assumption. Pedersen commitments are additively homomorphic: you can add two commitments and get a commitment to the sum, without opening either one. This property is the foundation of Bulletproofs, confidential transactions, and the commitment schemes inside modern zero-knowledge proof systems (SNARKs, STARKs).

Next: Pedersen Commitments