CoCalc -- 12a-secret-sharing-shamir.ipynb

GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/frontier/12-mpc/sage/12a-secret-sharing-shamir.ipynb
⁴⁸³ views

unlisted

Kernel: SageMath 10.5

Module 12: Multi-Party Computation

Motivating Question. You want to split a cryptographic key among 5 people so that any 3 of them can reconstruct it, but any 2 (or fewer) learn absolutely nothing. Is this possible? Shamir's Secret Sharing (1979) solves this elegantly using polynomial interpolation over finite fields.

Prerequisites. You should be comfortable with:

Modular arithmetic and finite fields (Module 01)
Polynomial evaluation and interpolation (Module 02)

Learning objectives. By the end of this notebook you will be able to:

Construct a $(t, n)$ secret sharing scheme using random polynomials.
Generate shares by polynomial evaluation.
Reconstruct the secret using Lagrange interpolation.
Verify the threshold property: $t-1$ shares reveal nothing.
Perform basic operations on secret-shared data.

1. The Idea: Hide a Secret in a Polynomial

Bridge from Module 01. In Module 01, we worked with polynomials over finite fields and saw that a degree- $(t-1)$ polynomial is uniquely determined by $t$ points. Shamir's scheme uses this fact: hide the secret as the constant term $f(0)$ of a random polynomial, and distribute evaluations $f(1), f(2), \ldots, f(n)$ as shares.

The $(t, n)$ threshold scheme:

Secret $s \in \mathbb{F}_p$
Pick random $a_1, \ldots, a_{t-1} \in \mathbb{F}_p$
Define $f(x) = s + a_1 x + a_2 x^2 + \ldots + a_{t-1} x^{t-1}$
Share $i$ : the point $(i, f(i))$ for $i = 1, 2, \ldots, n$
Any $t$ shares determine $f$ uniquely (and thus $s = f(0)$ )
Any $t-1$ shares reveal nothing about $s$

In [ ]:


# Work over a prime field
p = 1009  # prime
F = GF(p)
R.<x> = PolynomialRing(F)

# Our secret
secret = F(42)

# Threshold parameters
t = 3  # threshold: need 3 shares to reconstruct
n = 5  # total shares

# Build a random degree-(t-1) polynomial with f(0) = secret
coeffs = [secret] + [F(randint(1, p-1)) for _ in range(t-1)]
f = R(coeffs)

print(f"=== Shamir Secret Sharing ===")
print(f"Field: F_{p}")
print(f"Threshold: ({t}, {n})")
print(f"Secret: s = {secret}")
print(f"Random polynomial: f(x) = {f}")
print(f"Degree: {f.degree()} (= t-1 = {t-1})")
print(f"Check: f(0) = {f(0)} = secret ✓")

In [ ]:

# Visualize: the polynomial passes through (0, secret)
# and we evaluate at x = 1, 2, ..., n
print("Polynomial evaluations:")
print(f"  f(0) = {f(0)}  ← SECRET (not given to anyone)")
for i in range(1, n+1):
    print(f"  f({i}) = {f(F(i))}  ← Share {i}")

print(f"\nA degree-{t-1} polynomial needs exactly {t} points to reconstruct.")
print(f"With {n} shares, any {t} of them suffice.")

Checkpoint 1. The secret $s$ is $f(0)$ . The shares are $(1, f(1)), (2, f(2)), \ldots, (n, f(n))$ . Since a degree- $(t-1)$ polynomial is determined by $t$ points, any $t$ shares can reconstruct $f$ (and thus $s$ ), but $t-1$ shares leave $f(0)$ completely undetermined.

In [ ]:

def shamir_share(secret, t, n, field):
    """Split a secret into n shares with threshold t.
    
    Returns list of (x_i, y_i) pairs.
    """
    p = field.order()
    R_local.<x_local> = PolynomialRing(field)
    # Random polynomial: f(0) = secret, degree = t-1
    coeffs = [field(secret)] + [field(randint(1, p-1)) for _ in range(t-1)]
    f = R_local(coeffs)
    # Evaluate at x = 1, 2, ..., n
    shares = [(field(i), f(field(i))) for i in range(1, n+1)]
    return shares, f

# Generate shares
shares, poly = shamir_share(42, t=3, n=5, field=F)

print(f"Secret: 42")
print(f"Polynomial: f(x) = {poly}")
print(f"\nShares distributed to 5 parties:")
for i, (xi, yi) in enumerate(shares):
    print(f"  Party {i+1}: ({xi}, {yi})")

3. Reconstruction via Lagrange Interpolation

Given $t$ points $(x_1, y_1), \ldots, (x_t, y_t)$ , Lagrange interpolation recovers the unique polynomial of degree $\leq t-1$ passing through them:

f(x) = \sum_{i=1}^{t} y_i \prod_{j \neq i} \frac{x - x_j}{x_i - x_j}

We only need $f(0)$ , so we can compute just:

s = f(0) = \sum_{i=1}^{t} y_i \prod_{j \neq i} \frac{-x_j}{x_i - x_j}

The products $\lambda_i = \prod_{j \neq i} \frac{-x_j}{x_i - x_j}$ are called Lagrange coefficients.

In [ ]:

def lagrange_coeffs(xs, field):
    """Compute Lagrange coefficients for evaluation at x=0."""
    n = len(xs)
    lambdas = []
    for i in range(n):
        num = field(1)
        den = field(1)
        for j in range(n):
            if i != j:
                num *= -xs[j]
                den *= (xs[i] - xs[j])
        lambdas.append(num / den)
    return lambdas

def shamir_reconstruct(shares, field):
    """Reconstruct the secret from t or more shares."""
    xs = [s[0] for s in shares]
    ys = [s[1] for s in shares]
    lambdas = lagrange_coeffs(xs, field)
    secret = sum(l * y for l, y in zip(lambdas, ys))
    return secret

# Reconstruct from exactly t = 3 shares
subset = shares[:3]  # Parties 1, 2, 3
recovered = shamir_reconstruct(subset, F)
print(f"Using shares from parties 1, 2, 3:")
for xi, yi in subset:
    print(f"  ({xi}, {yi})")
print(f"Reconstructed secret: {recovered}")
print(f"Correct? {recovered == F(42)}")

In [ ]:

# Try ALL possible subsets of size t

print(f"Reconstructing from every possible subset of {t} shares:")
print()
for combo in Combinations(range(n), t):
    subset = [shares[i] for i in combo]
    parties = [i+1 for i in combo]
    recovered = shamir_reconstruct(subset, F)
    print(f"  Parties {parties}: secret = {recovered} {'✓' if recovered == F(42) else '✗'}")

print(f"\nAll {len(list(Combinations(range(n), t)))} subsets of size {t} recover the correct secret!")

Checkpoint 2. Any $t$ shares reconstruct the secret correctly, regardless of which $t$ parties participate. This is the beauty of the threshold property.

4. The Threshold Property: $t-1$ Shares Reveal Nothing

This is the security property of Shamir's scheme. With $t-1$ shares, the secret could be any value in $\mathbb{F}_p$ , each equally likely.

Why? A degree- $(t-1)$ polynomial has $t$ free coefficients. With $t-1$ equations (shares), one degree of freedom remains, exactly the secret $f(0)$ .

In [ ]:

# Demonstrate: with t-1 = 2 shares, ANY secret is consistent
two_shares = shares[:2]  # Only parties 1 and 2
print(f"Suppose an adversary captures 2 shares (< threshold {t}):")
for xi, yi in two_shares:
    print(f"  ({xi}, {yi})")
print()

# For each possible secret value, there exists a degree-2 polynomial
# passing through those 2 shares with f(0) = that value
print(f"For every possible secret s ∈ F_{p}, there's a valid polynomial:")
print()
x1, y1 = two_shares[0]
x2, y2 = two_shares[1]

for s_guess in [0, 1, 42, 100, 500, 999]:
    s_guess = F(s_guess)
    # Find a degree-2 polynomial through (0, s_guess), (x1, y1), (x2, y2)
    R_demo.<xd> = PolynomialRing(F)
    pts = [(F(0), s_guess), (x1, y1), (x2, y2)]
    f_guess = R_demo.lagrange_polynomial(pts)
    print(f"  If s = {s_guess}: f(x) = {f_guess} (passes through both shares ✓)")

print(f"\nThe adversary has no way to distinguish s=42 from any other value.")
print(f"This is information-theoretic security, not computational!")

Misconception alert. " $t-1$ shares narrow down the secret to a smaller set." No, they narrow it down to all possible values in $\mathbb{F}_p$ , each equally likely. This is perfect secrecy, like a one-time pad.

5. Operations on Shares

A remarkable property: we can perform some computations directly on the shares without reconstructing the secret.

Addition: If parties hold shares of $s_1$ and $s_2$ , they can locally add their shares to get shares of $s_1 + s_2$ . This works because polynomial addition is linear:

f_1(i) + f_2(i) = (f_1 + f_2)(i)

and $f_1(0) + f_2(0) = s_1 + s_2$ .

In [ ]:

# Share two secrets and add them
s1, s2 = 42, 17
shares1, _ = shamir_share(s1, t=3, n=5, field=F)
shares2, _ = shamir_share(s2, t=3, n=5, field=F)

# Each party adds their shares locally
shares_sum = [(shares1[i][0], shares1[i][1] + shares2[i][1]) for i in range(5)]

# Reconstruct the sum
recovered_sum = shamir_reconstruct(shares_sum[:3], F)

print(f"s1 = {s1}, s2 = {s2}")
print(f"Expected sum: {(s1 + s2) % p}")
print(f"\nEach party adds shares locally (no communication needed):")
for i in range(5):
    print(f"  Party {i+1}: {shares1[i][1]} + {shares2[i][1]} = {shares_sum[i][1]}")
print(f"\nReconstructed sum: {recovered_sum}")
print(f"Correct? {recovered_sum == F(s1 + s2)}")

In [ ]:

# Scalar multiplication also works locally
scalar = F(7)
shares_scaled = [(shares1[i][0], scalar * shares1[i][1]) for i in range(5)]
recovered_scaled = shamir_reconstruct(shares_scaled[:3], F)

print(f"s = {s1}, scalar = {scalar}")
print(f"Expected: {scalar} × {s1} = {scalar * F(s1)}")
print(f"Reconstructed: {recovered_scaled}")
print(f"Correct? {recovered_scaled == scalar * F(s1)}")
print(f"\nAddition and scalar multiplication work on shares, Shamir is linear!")
print(f"But multiplication of two secrets requires interaction (see SPDZ, Notebook 12e).")

6. What Can Go Wrong?

Shamir's basic scheme assumes an honest dealer who correctly distributes shares from a single polynomial. What if the dealer cheats?

In [ ]:

# A dishonest dealer gives inconsistent shares
honest_shares, _ = shamir_share(42, t=3, n=5, field=F)

# Tamper with party 3's share
cheating_shares = list(honest_shares)
cheating_shares[2] = (honest_shares[2][0], honest_shares[2][1] + F(100))  # add 100

print("=== Dishonest Dealer ===")
print(f"Party 3's share tampered: {honest_shares[2][1]} → {cheating_shares[2][1]}")
print()

# Different subsets now give different secrets!
for combo in combinations(range(5), 3):
    subset = [cheating_shares[i] for i in combo]
    parties = [i+1 for i in combo]
    recovered = shamir_reconstruct(subset, F)
    has_3 = 3 in parties
    correct = recovered == F(42)
    marker = "✗ (includes tampered share)" if has_3 and not correct else "✓" if correct else "✗"
    print(f"  Parties {parties}: secret = {recovered} {marker}")

print(f"\nSubsets without party 3 give the correct secret.")
print(f"Subsets with party 3 give wrong results.")
print(f"Solution: Verifiable Secret Sharing (VSS), parties can check share consistency.")

Crypto foreshadowing. Shamir's scheme is the foundation of MPC. In the SPDZ protocol (Notebook 12e), parties hold Shamir shares of values and perform computations using Beaver triples for multiplication, with MACs to detect cheating.

7. Exercises

Exercise 1 (Worked): Different Thresholds

Problem. Create a $(2, 7)$ sharing of the secret $s = 100$ over $\mathbb{F}_{1009}$ . Verify that any 2 shares reconstruct correctly and that 1 share reveals nothing.

Solution:

In [ ]:

# Exercise 1: Worked solution
shares_ex, poly_ex = shamir_share(100, t=2, n=7, field=F)
print(f"(2, 7) sharing of s = 100")
print(f"Polynomial: f(x) = {poly_ex} (degree 1 = line!)")
print()

# Any 2 shares reconstruct
for combo in [(0,1), (2,5), (0,6)]:
    subset = [shares_ex[i] for i in combo]
    rec = shamir_reconstruct(subset, F)
    parties = [i+1 for i in combo]
    print(f"Parties {parties}: reconstructed = {rec} {'✓' if rec == F(100) else '✗'}")

print(f"\nWith t=2, the polynomial is a LINE through (0, 100).")
print(f"One share gives one point on the line, every line through")
print(f"that point is equally likely, so every secret is consistent.")

Exercise 2 (Guided): Weighted Voting

Problem. Share the secret $s = 777$ with a $(3, 5)$ scheme. Compute $2s + 50$ using only share operations (no reconstruction). Verify the result.

Fill in the TODOs:

In [ ]:

# Exercise 2: fill in the TODOs

# TODO 1: Share s = 777 with (3, 5) threshold
# shares_ex2, _ = shamir_share(777, t=3, n=5, field=F)

# TODO 2: Multiply each share by 2 (scalar multiplication)
# shares_2s = [(shares_ex2[i][0], F(2) * shares_ex2[i][1]) for i in range(5)]

# TODO 3: Add constant 50, this is the same as adding shares of 50
# where all parties know the constant
# Hint: For a constant c, the "shares" are (i, c * lambda_i) where lambda_i
# are the Lagrange coefficients. But simpler: share c with polynomial f(x)=c
# (zero randomness), so share i = (i, c)... but that's only for t=1!
# For general t: share c with threshold t and add share-wise.
# shares_50, _ = shamir_share(50, t=3, n=5, field=F)
# shares_result = [(shares_2s[i][0], shares_2s[i][1] + shares_50[i][1]) for i in range(5)]

# TODO 4: Reconstruct and verify
# result = shamir_reconstruct(shares_result[:3], F)
# print(f"Reconstructed: {result}")
# print(f"Expected: 2*777 + 50 = {(2*777 + 50) % p}")

Exercise 3 (Independent): Threshold Choice

Problem.

In a 10-party system, what threshold $t$ ensures security against up to 4 corrupted parties?
Create a $(t, 10)$ sharing of your chosen secret and verify the threshold property.
What is the maximum number of corrupted parties a $(t, n)$ Shamir scheme can tolerate? How does this relate to the degree of the polynomial?

In [ ]:

# Exercise 3: write your solution here

Summary

Concept	Key Fact
$(t, n)$ scheme	$n$ shares, any $t$ reconstruct, $t-1$ reveal nothing
Polynomial	Degree $t-1$ , secret = $f(0)$ , shares = $f(1), \ldots, f(n)$
Reconstruction	Lagrange interpolation at $x = 0$
Security	Information-theoretic: $t-1$ shares leave all secrets equally likely
Linearity	Addition and scalar multiplication work directly on shares
Multiplication	Requires interaction between parties (degree doubles)

Shamir's secret sharing is the workhorse of threshold cryptography and MPC. Its linearity makes addition cheap (local), while multiplication requires clever protocols like Beaver triples.

Next: 12b: Additive Secret Sharing

1. The Idea: Hide a Secret in a Polynomial

3. Reconstruction via Lagrange Interpolation

4. The Threshold Property: $t-1$ Shares Reveal Nothing

5. Operations on Shares

6. What Can Go Wrong?

7. Exercises

Exercise 1 (Worked): Different Thresholds

Exercise 2 (Guided): Weighted Voting

Exercise 3 (Independent): Threshold Choice

Summary

Product

Resources

Company

Notebook 12a: Shamir Secret Sharing

1. The Idea: Hide a Secret in a Polynomial

2. Sharing: Distributing the Shares

3. Reconstruction via Lagrange Interpolation

4. The Threshold Property: t−1t-1t−1 Shares Reveal Nothing

5. Operations on Shares

6. What Can Go Wrong?

7. Exercises

Exercise 1 (Worked): Different Thresholds

Exercise 2 (Guided): Weighted Voting

Exercise 3 (Independent): Threshold Choice

Summary

4. The Threshold Property: $t-1$ Shares Reveal Nothing