GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/05-discrete-log-diffie-hellman/sage/05c-diffie-hellman-key-exchange.ipynb
⁴⁸³ views

unlisted

Kernel: SageMath 10.5

Notebook 05c: Diffie-Hellman Key Exchange

Module 05. The Discrete Logarithm and Diffie-Hellman

Motivating Question. Alice and Bob want to agree on a shared secret key, but they can only communicate over a public channel that Eve can monitor. Every message they send, Eve sees. How can they possibly end up with a shared secret that Eve does not know? Whitfield Diffie and Martin Hellman showed in 1976 that the discrete logarithm problem makes this possible.

Prerequisites. You should be comfortable with:

The discrete logarithm problem (notebook 05a)
Primitive roots and generators (notebook 05b)
Modular exponentiation (Module 01/04)

Learning objectives. By the end of this notebook you will be able to:

Describe the Diffie-Hellman protocol step by step.
Simulate a complete key exchange in SageMath.
Explain why the protocol is correct ( $g^{ab} = g^{ba}$ ) and why Eve cannot easily compute the shared secret.
Identify the limitations of basic DH (no authentication, man-in-the-middle).
Understand parameter selection: safe primes and group choice.

1. The Protocol

The Diffie-Hellman key exchange has three phases:

Phase 1: Public Parameters

Alice and Bob publicly agree on:

A large prime $p$
A generator $g$ of $\mathbb{Z}/p\mathbb{Z}^*$ (or of a large prime-order subgroup)

Phase 2: Exchange

Step	Alice	Public Channel	Bob
1	Picks secret $a \xleftarrow{$} \{2, \ldots, p-2\}$		Picks secret $b \xleftarrow{$} \{2, \ldots, p-2\}$
2	Computes $A = g^a \bmod p$	$A \longrightarrow$
3		$\longleftarrow B$	Computes $B = g^b \bmod p$

Phase 3: Shared Secret

Alice computes	Bob computes
$s = B^a = (g^b)^a = g^{ab}$	$s = A^b = (g^a)^b = g^{ab}$

Both arrive at the same value $s = g^{ab} \bmod p$ .

In [ ]:

# ============================================================
# Complete Diffie-Hellman key exchange simulation
# ============================================================

# Phase 1: Public parameters
p = next_prime(10^20)
g = Mod(primitive_root(p), p)
print("=== Phase 1: Public Parameters ===")
print(f"  p = {p}")
print(f"  g = {g}")

# Phase 2: Each side picks a secret and publishes g^secret
a = randint(2, p - 2)   # Alice's secret
b = randint(2, p - 2)   # Bob's secret
A = g^a                  # Alice's public value
B = g^b                  # Bob's public value

print("\n=== Phase 2: Exchange ===")
print(f"  Alice's secret a = {a}")
print(f"  Alice sends    A = g^a = {A}")
print(f"  Bob's secret   b = {b}")
print(f"  Bob sends      B = g^b = {B}")

# Phase 3: Both compute the shared secret
s_alice = B^a    # Alice computes B^a = g^{ba}
s_bob   = A^b    # Bob computes A^b = g^{ab}

print("\n=== Phase 3: Shared Secret ===")
print(f"  Alice computes B^a = {s_alice}")
print(f"  Bob computes   A^b = {s_bob}")
print(f"  Match? {s_alice == s_bob}")

Checkpoint 1. Why does $B^a = A^b$ ? Write out the algebraic steps. (Answer: $B^a = (g^b)^a = g^{ba} = g^{ab} = (g^a)^b = A^b$ , using the commutativity of exponents in an abelian group.)

2. What Eve Sees

Eve observes $p, g, A = g^a, B = g^b$ on the public channel. To find the shared secret $s = g^{ab}$ , she would need to either:

Recover $a$ from $A = g^a$ (solve the DLP), or
Recover $b$ from $B = g^b$ (solve the DLP), or
Compute $g^{ab}$ directly from $g^a$ and $g^b$ without knowing $a$ or $b$ (the Computational Diffie-Hellman problem, covered in 05d).

All three are believed to be computationally infeasible for large primes.

In [ ]:

# Eve's view: she knows p, g, A, B but not a or b
print("=== Eve's View ===")
print(f"  p = {p}")
print(f"  g = {g}")
print(f"  A = g^a = {A}")
print(f"  B = g^b = {B}")
print(f"  Shared secret s = g^ab = ???")
print()

# For a SMALL prime, Eve can solve the DLP by brute force
p_tiny = 23
g_tiny = Mod(5, p_tiny)
a_tiny, b_tiny = 7, 15
A_tiny = g_tiny^a_tiny
B_tiny = g_tiny^b_tiny
s_tiny = g_tiny^(a_tiny * b_tiny)

print(f"Toy example: p={p_tiny}, g={int(g_tiny)}, A={int(A_tiny)}, B={int(B_tiny)}")

# Eve solves DLP by brute force
for x in range(p_tiny - 1):
    if g_tiny^x == A_tiny:
        a_eve = x
        break
s_eve = B_tiny^a_eve
print(f"Eve recovers a = {a_eve} (true a = {a_tiny})")
print(f"Eve computes s = B^a = {int(s_eve)} (true s = {int(s_tiny)})")
print(f"\nBut for p with 20+ digits, brute force is completely infeasible!")

Misconception alert. "DH sends the shared secret over the channel." No! The shared secret $g^{ab}$ is never transmitted. Alice sends $g^a$ and Bob sends $g^b$ . Neither of these reveals $g^{ab}$ (under the CDH assumption). The "magic" is that both parties can independently compute the same value.

3. Step-by-Step with Small Numbers

Let us trace a complete exchange with $p = 23, g = 5$ so we can verify every step by hand.

In [ ]:

# Fully traced DH with small numbers
p = 23
g = Mod(5, p)
a, b = 6, 15

print(f"Public: p = {p}, g = {int(g)}")
print()

# Alice
A = g^a
print(f"Alice: a = {a}")
print(f"  A = g^a = {int(g)}^{a} mod {p}")
print(f"  A = {int(g)}^{a} = {int(g)^a} = {int(g)^a} mod {p} = {int(A)}")
print()

# Bob
B = g^b
print(f"Bob: b = {b}")
print(f"  B = g^b = {int(g)}^{b} mod {p} = {int(B)}")
print()

# Shared secret
s_alice = B^a
s_bob = A^b
print(f"Alice computes: s = B^a = {int(B)}^{a} mod {p} = {int(s_alice)}")
print(f"Bob computes:   s = A^b = {int(A)}^{b} mod {p} = {int(s_bob)}")
print(f"\nDirect check: g^(ab) = {int(g)}^{a*b} mod {p} = {int(g^(a*b))}")
assert s_alice == s_bob == g^(a*b)

Checkpoint 2. Redo the exchange with $p = 23, g = 5, a = 3, b = 10$ . Compute $A, B$ , and $s$ by hand (use the power table from 05a if needed), then verify with SageMath.

In [ ]:

# Checkpoint 2, verify your hand computation
p = 23; g = Mod(5, p); a = 3; b = 10
A = g^a; B = g^b; s = A^b
print(f"A = {int(A)}, B = {int(B)}, s = {int(s)}")
assert B^a == A^b

4. Visualising the Key Space

Let us visualise what the possible shared secrets look like for a small prime. For each pair $(a, b)$ , the shared secret is $g^{ab} \bmod p$ .

In [ ]:

p = 23
g = Mod(5, p)

# Build the matrix of shared secrets
n = p - 1
secrets = matrix(ZZ, n, n)
for i in range(n):
    for j in range(n):
        secrets[i, j] = int(g^((i+1)*(j+1)))

# Plot as a heatmap using SageMath's matrix_plot
G = matrix_plot(secrets, cmap='viridis', origin='lower', colorbar=True)
G.show(figsize=8, title=f'Shared secret $g^{{ab}} \\mod {p}$',
       axes=True)

print("The heatmap looks random, there's no pattern an eavesdropper can exploit.")

5. Parameter Selection

For DH to be secure, the parameters $(p, g)$ must be chosen carefully:

Parameter	Requirement	Why
$p$	Large (2048+ bits)	DLP difficulty scales with $p$
$p$	Preferably a safe prime $p = 2q + 1$	Limits subgroup attacks
$g$	Generator of prime-order subgroup of order $q$	DLP has no small-subgroup shortcuts
$a, b$	Random, uniform in $\{2, \ldots, q-1\}$	Predictable exponents = broken DH

Bridge from Module 04. Remember that in RSA, we needed $p$ and $q$ to be large primes. Here the requirement is similar but different: we need $p$ to be large and $p - 1$ to have a large prime factor (ideally $p - 1 = 2q$ with $q$ prime). This constrains the structure of $\mathbb{Z}/p\mathbb{Z}^*$ in a way that resists the Pohlig-Hellman attack (notebook 05f).

In [ ]:

# Generate a safe prime and demonstrate DH with it
def find_safe_prime(bits):
    """Find a safe prime p = 2q + 1 with approximately `bits` bits."""
    while True:
        q = random_prime(2^bits, lbound=2^(bits-1))
        p = 2*q + 1
        if is_prime(p):
            return p, q

p, q = find_safe_prime(64)
print(f"Safe prime: p = {p}")
print(f"  p - 1 = 2 * q where q = {q}")
print(f"  p has {p.nbits()} bits, q has {q.nbits()} bits")
print(f"  is_prime(q)? {is_prime(q)}")

# Find a generator of the order-q subgroup (quadratic residues)
# Any g = h^2 mod p with g != 1 generates the subgroup of order q
g = Mod(primitive_root(p), p)^2   # square to get into order-q subgroup
print(f"\n  g = {g} (order q = {multiplicative_order(g)})")
assert multiplicative_order(g) == q

# DH exchange
a = randint(2, q - 1)
b = randint(2, q - 1)
A, B = g^a, g^b
s_alice, s_bob = B^a, A^b
print(f"\nDH exchange successful: {s_alice == s_bob}")

6. Man-in-the-Middle Attack

Basic DH provides no authentication. An active attacker (Mallory) can intercept and substitute her own values:

Alice	Channel	Mallory	Channel	Bob
$A = g^a$	$\rightarrow$	intercepts $A$ , sends $M_1 = g^{m_1}$	$\rightarrow$	receives $M_1$
receives $M_2$	$\leftarrow$	intercepts $B$ , sends $M_2 = g^{m_2}$	$\leftarrow$	$B = g^b$

Now Alice shares key $g^{am_2}$ with Mallory, and Bob shares key $g^{bm_1}$ with Mallory. Mallory can decrypt, read, re-encrypt, and forward all messages.

In [ ]:

# Man-in-the-middle demonstration
p = next_prime(10^20)
g = Mod(primitive_root(p), p)

# Honest parties
a = randint(2, p - 2)
b = randint(2, p - 2)
A = g^a
B = g^b

# Mallory intercepts and substitutes
m1 = randint(2, p - 2)
m2 = randint(2, p - 2)
M1 = g^m1   # Mallory sends to Bob (pretending to be Alice)
M2 = g^m2   # Mallory sends to Alice (pretending to be Bob)

# Alice thinks she shares a key with Bob, but it's with Mallory
key_alice = M2^a         # Alice computes (g^m2)^a = g^{a*m2}
key_mallory_alice = A^m2  # Mallory computes (g^a)^m2 = g^{a*m2}

# Bob thinks he shares a key with Alice, but it's with Mallory  
key_bob = M1^b           # Bob computes (g^m1)^b = g^{b*m1}
key_mallory_bob = B^m1    # Mallory computes (g^b)^m1 = g^{b*m1}

print("=== Man-in-the-Middle ===")
print(f"Alice's key:           {key_alice}")
print(f"Mallory's key (Alice): {key_mallory_alice}")
print(f"Match? {key_alice == key_mallory_alice}")
print()
print(f"Bob's key:             {key_bob}")
print(f"Mallory's key (Bob):   {key_mallory_bob}")
print(f"Match? {key_bob == key_mallory_bob}")
print()
print(f"Alice and Bob share a key? {key_alice == key_bob}")
print("Mallory can read everything!")

Crypto foreshadowing. Real protocols solve this with authenticated DH:
TLS 1.3: the server signs its DH share with its certificate's private key.
Signal (X3DH): uses long-term identity keys to authenticate the DH exchange.
IKEv2 (IPsec): authenticates with pre-shared keys or certificates.
The DH exchange provides confidentiality; authentication is layered on top.

7. From Shared Secret to Encryption Key

The raw DH shared secret $g^{ab} \bmod p$ is a group element, not a uniform bitstring suitable as an encryption key. In practice, it is passed through a key derivation function (KDF) like HKDF:

\text{key} = \text{HKDF}(g^{ab})

This extracts the entropy into a fixed-length, pseudorandom key.

In [ ]:

import hashlib

# Simple key derivation: hash the shared secret
p = next_prime(10^20)
g = Mod(primitive_root(p), p)
a, b = randint(2, p-2), randint(2, p-2)
s = (g^b)^a

# Convert to bytes and hash
s_bytes = int(s).to_bytes((int(s).bit_length() + 7) // 8, 'big')
key = hashlib.sha256(s_bytes).hexdigest()

print(f"Raw shared secret: {s}")
print(f"Derived 256-bit key: {key}")
print(f"\n(In real TLS, HKDF-Expand is used instead of raw SHA-256)")

Checkpoint 3. Why can't Alice and Bob just use $g^{ab} \bmod p$ directly as an AES key? Think about two issues: (1) the value has $\lceil \log_2 p \rceil$ bits but might not be uniformly distributed, and (2) AES needs exactly 128 or 256 bits.

8. Exercises

Exercise 1 (Worked): Complete DH Exchange

Problem. Perform a DH exchange with $p = 43, g = 3, a = 8, b = 17$ . Compute $A, B$ , and the shared secret.

Solution.

$A = 3^8 \bmod 43 = 6561 \bmod 43 = 25$
$B = 3^{17} \bmod 43 = 129140163 \bmod 43 = 34$
Alice: $s = B^a = 34^8 \bmod 43$
Bob: $s = A^b = 25^{17} \bmod 43$

Both should give the same answer.

In [ ]:

# Exercise 1, verification
p = 43; g = Mod(3, p); a = 8; b = 17
A = g^a; B = g^b
s_alice = B^a; s_bob = A^b
print(f"A = g^a = {int(A)}")
print(f"B = g^b = {int(B)}")
print(f"Alice: s = B^a = {int(s_alice)}")
print(f"Bob:   s = A^b = {int(s_bob)}")
print(f"Match? {s_alice == s_bob}")
print(f"Direct: g^(ab) = g^{a*b} = {int(g^(a*b))}")

Exercise 2 (Guided): Eve's Brute-Force Attack

Problem. Given public values $p = 101, g = 2, A = 57, B = 37$ :

Solve the DLP to find Alice's secret $a$ (i.e., find $a$ such that $2^a \equiv 57 \pmod{101}$ ).
Use $a$ to compute the shared secret $s = B^a \bmod 101$ .
Verify by also solving for $b$ and checking $A^b = B^a$ .

Hint: Use discrete_log() or a brute-force loop.

In [ ]:

# Exercise 2, fill in the TODOs
p = 101
g = Mod(2, p)
A = Mod(57, p)
B = Mod(37, p)

# TODO 1: Find Alice's secret a
# a = discrete_log(A, g)
# print(f"Alice's secret: a = {a}")
# print(f"Check: g^a = {int(g^a)}")

# TODO 2: Compute shared secret
# s = B^a
# print(f"Shared secret: s = {int(s)}")

# TODO 3: Verify via Bob's secret
# b = discrete_log(B, g)
# print(f"Bob's secret: b = {b}")
# print(f"A^b = {int(A^b)}")
# print(f"Match? {B^a == A^b}")

Exercise 3 (Independent): Multi-Party DH

Problem. Can DH be extended to three parties (Alice, Bob, Charlie)?

Research or design a protocol where three parties arrive at a common shared secret $g^{abc}$ using only public exchanges.
Implement it in SageMath with $p = 101, g = 2$ .
How many rounds of communication are needed? (Hint: it requires 2 rounds, not 1.)

In [ ]:

# Exercise 3, write your solution here

Summary

Concept	Key Fact
DH protocol	Alice sends $g^a$ , Bob sends $g^b$ ; both compute $s = g^{ab}$
Correctness	$(g^b)^a = g^{ba} = g^{ab} = (g^a)^b$
Security	Relies on DLP/CDH hardness: Eve sees $g^a, g^b$ but cannot compute $g^{ab}$
No authentication	Vulnerable to man-in-the-middle without signatures or certificates
Parameters	Use safe primes $p = 2q + 1$ and generator of order- $q$ subgroup
Key derivation	Hash $g^{ab}$ through KDF before using as symmetric key

We have built the DH protocol from scratch. But how hard is it really for Eve to compute $g^{ab}$ from $g^a$ and $g^b$ ? The next notebook formalises this with the CDH and DDH hardness assumptions.

Next: 05d. Computational Hardness: CDH and DDH