GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/05-discrete-log-diffie-hellman/break/pohlig-hellman-smooth-order.ipynb
⁴⁸³ views

unlisted

Kernel: SageMath 10.0

Break: Pohlig-Hellman Attack on Smooth-Order DH

Module 05 | Breaking Weak Parameters

When $p - 1$ is smooth, an eavesdropper can recover both DH secret exponents.

Why This Matters

The Pohlig-Hellman attack (explored in 05f for abstract DLP) becomes devastating when applied to a full Diffie-Hellman key exchange. If the group order $p - 1$ is smooth (factors into only small primes), an eavesdropper who sees the public keys $A = g^a$ and $B = g^b$ can:

Recover Alice's secret exponent $a$ from $A$
Recover Bob's secret exponent $b$ from $B$
Compute the shared secret $g^{ab}$

Unlike the small subgroup attack, this is a passive attack --- Eve only needs to eavesdrop, not modify any messages.

The Scenario

Alice and Bob perform a DH exchange with a prime $p$ where $p - 1$ is smooth. Eve intercepts both public keys $A$ and $B$ and recovers the shared secret.

We'll use $p = 7841$ where $p - 1 = 7840 = 2^5 \cdot 5 \cdot 7^2$ . The largest prime factor is just 7.

In [ ]:

# === Step 1: Set up DH with a smooth-order prime ===

p = 7841
print(f'p = {p}')
print(f'Is p prime? {is_prime(p)}')
print(f'p - 1 = {p - 1}')
print(f'p - 1 = {factor(p - 1)}')
print()

# Smoothness: the largest prime factor
facts = list(factor(p - 1))
largest_prime = max(q for q, _ in facts)
print(f'Largest prime factor of p-1: {largest_prime}')
print(f'This is VERY smooth. The DLP will be easy.')
print()

# Generator
g = primitive_root(p)
print(f'Generator g = {g}')
print(f'Order of g = {Mod(g, p).multiplicative_order()}')

In [ ]:

# === Step 2: Alice and Bob perform DH ===

# Alice's secret
a = 4231
A = power_mod(g, a, p)

# Bob's secret
b = 6017
B = power_mod(g, b, p)

# True shared secret
shared_secret = power_mod(g, a * b, p)
assert shared_secret == power_mod(A, b, p)
assert shared_secret == power_mod(B, a, p)

print(f"Alice's secret:  a = {a}")
print(f"Alice's public:  A = g^a mod p = {A}")
print(f"Bob's secret:    b = {b}")
print(f"Bob's public:    B = g^b mod p = {B}")
print(f"Shared secret:   s = g^(ab) mod p = {shared_secret}")
print()
print('--- Eve sees: g, p, A, B ---')
print('--- Eve wants: the shared secret s ---')

Step 3: Eve Applies Pohlig-Hellman to Recover $a$

The Pohlig-Hellman algorithm:

Factor $p - 1 = q_1^{e_1} \cdot q_2^{e_2} \cdots q_k^{e_k}$
For each prime power $q_i^{e_i}$ : project $g$ and $A$ into the subgroup of order $q_i^{e_i}$ , and solve the DLP there by brute force
Combine results using the Chinese Remainder Theorem

Since each $q_i^{e_i}$ is small, each sub-DLP is trivially solvable.

In [ ]:

# === Step 3: Pohlig-Hellman attack on A to recover a ===

def pohlig_hellman(g, target, p):
    """Recover x such that g^x = target mod p using Pohlig-Hellman."""
    order = p - 1
    facts = list(factor(order))
    
    remainders = []
    moduli = []
    total_work = 0
    
    for q, e in facts:
        qe = q^e
        
        # Project into subgroup of order q^e
        g_sub = power_mod(g, order // qe, p)
        t_sub = power_mod(target, order // qe, p)
        
        # Brute-force DLP in this small subgroup
        found = False
        for x in range(qe):
            total_work += 1
            if power_mod(g_sub, x, p) == t_sub:
                remainders.append(x)
                moduli.append(qe)
                found = True
                break
        assert found, f'DLP not found in subgroup of order {qe}'
    
    recovered = CRT(remainders, moduli)
    return recovered, total_work, list(zip(remainders, moduli))

# Recover Alice's secret a
a_recovered, work_a, partials_a = pohlig_hellman(g, A, p)

print('=== Recovering Alice\'s secret a ===')
print()
print('Partial results (CRT components):')
for x, m in partials_a:
    print(f'  a ≡ {x} (mod {m})')
print()
print(f'CRT solution: a = {a_recovered}')
print(f'Actual a:         {a % (p - 1)}')
print(f'Match: {a_recovered == a % (p - 1)}')
print(f'Total work: {work_a} exponentiations')

In [ ]:

# === Step 4: Full attack --- recover BOTH secrets and compute shared secret ===

# Recover Bob's secret b
b_recovered, work_b, partials_b = pohlig_hellman(g, B, p)

print('=== Recovering Bob\'s secret b ===')
for x, m in partials_b:
    print(f'  b ≡ {x} (mod {m})')
print(f'CRT solution: b = {b_recovered}')
print(f'Actual b:         {b % (p - 1)}')
print(f'Match: {b_recovered == b % (p - 1)}')
print()

# Eve computes the shared secret
eve_secret = power_mod(g, a_recovered * b_recovered, p)
print('=== Eve computes the shared secret ===')
print(f'Eve\'s computation: g^(a*b) mod p = {eve_secret}')
print(f'True shared secret:              = {shared_secret}')
print(f'Match: {eve_secret == shared_secret}')
print()
print(f'Total work: {work_a + work_b} exponentiations')
print(f'Brute force would require up to: {2 * (p - 1)} exponentiations')
print(f'Speedup: {2 * (p - 1) / (work_a + work_b):.0f}x')

Cost Analysis: Smooth vs. Safe Primes

The Pohlig-Hellman attack cost is $\sum q_i^{e_i}$ --- the sum of the prime power factors of $p - 1$ . Compare:

Smooth $p - 1$ : all factors small $\Rightarrow$ attack is cheap
Safe $p = 2q + 1$ : $p - 1 = 2q$ with $q$ prime $\Rightarrow$ attack costs $\approx q \approx p/2$

For the smooth prime, the attack cost grows as $O(\log p)$ (essentially free). For a safe prime, the attack cost is $O(p)$ (no better than brute force).

In [ ]:

# === Cost comparison ===

# Our smooth prime
smooth_cost = sum(q^e for q, e in factor(p - 1))
print(f'Smooth prime p = {p}:')
print(f'  p - 1 = {factor(p - 1)}')
print(f'  Pohlig-Hellman cost: {smooth_cost}')
print(f'  Ratio cost/(p-1):   {smooth_cost / (p-1):.4f}')
print()

# A safe prime of similar size
p_safe = 7879  # 7879 is prime, (7879-1)/2 = 3939... let's check
# Find an actual safe prime near 7841
for candidate in range(7841, 8500):
    if is_prime(candidate) and is_prime((candidate - 1) // 2):
        p_safe = candidate
        break

safe_cost = sum(q^e for q, e in factor(p_safe - 1))
print(f'Safe prime p = {p_safe}:')
print(f'  p - 1 = {factor(p_safe - 1)}')
print(f'  Pohlig-Hellman cost: {safe_cost}')
print(f'  Ratio cost/(p-1):   {safe_cost / (p_safe-1):.4f}')
print()
print(f'Smooth prime attack is {safe_cost / smooth_cost:.0f}x cheaper!')

The Fix: Use Safe Primes or Standardized Groups

Two defenses:

Safe primes: $p = 2q + 1$ with $q$ prime. Then $p - 1 = 2q$ has only one large factor, making Pohlig-Hellman useless.
Standardized groups: Use the groups from RFC 3526 (IKE) or RFC 7919 (TLS). These are carefully chosen safe primes with vetted generators.

Real-world note: In 2015, researchers showed that many deployed systems used common DH primes. The Logjam attack demonstrated that if a nation-state precomputed the discrete log for a single 1024-bit prime, they could passively decrypt a significant fraction of Internet traffic. This drove the move to 2048-bit groups and safe primes.

Exercises

Bigger smooth prime: Try $p = 55441$ where $p - 1 = 2^4 \cdot 3^2 \cdot 5 \cdot 7 \cdot 11$ . Run the full attack. How does the cost scale?
Baby-step giant-step on the largest subgroup: In the safe prime case, you can apply BSGS to the subgroup of order $q$ . What is the BSGS cost? (Hint: $O(\sqrt{q})$ .)
Partially smooth: What if $p - 1 = 2 \cdot 3 \cdot 5 \cdot q$ where $q$ is a large prime? How much does Pohlig-Hellman help? How much of the secret does Eve learn?

Summary

Aspect	Detail
Attack type	Passive eavesdropping (no message modification)
Prerequisite	$p - 1$ is smooth (factors into small primes)
What Eve learns	Both secret exponents $a$ and $b$ , and the shared secret
Cost	$\sum q_i^{e_i}$ where $p - 1 = \prod q_i^{e_i}$
Fix	Safe primes ( $p = 2q + 1$ ) or standardized groups

Key takeaways:

Pohlig-Hellman is a passive attack: Eve only eavesdrops.
The attack decomposes the DLP into independent sub-problems via CRT.
Attack cost depends on the largest prime factor of $p - 1$ , not on $p$ itself.
Safe primes ensure $p - 1$ has a large prime factor, rendering the attack useless.
Real-world standards mandate safe primes for exactly this reason.

Back to Module 05: Discrete Log and Diffie-Hellman

Break: Pohlig-Hellman Attack on Smooth-Order DH

Why This Matters

The Scenario

Step 3: Eve Applies Pohlig-Hellman to Recover $a$

Cost Analysis: Smooth vs. Safe Primes

The Fix: Use Safe Primes or Standardized Groups

Exercises

Summary

Product

Resources

Company

Break: Pohlig-Hellman Attack on Smooth-Order DH

Why This Matters

The Scenario

Step 3: Eve Applies Pohlig-Hellman to Recover aaa

Cost Analysis: Smooth vs. Safe Primes

The Fix: Use Safe Primes or Standardized Groups

Exercises

Summary

Step 3: Eve Applies Pohlig-Hellman to Recover $a$