GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/01-modular-arithmetic-groups/python/break-smooth-order-attack.ipynb
⁴⁸³ views

unlisted

Kernel: Python 3

In [ ]:

# This notebook has two versions:
#   Python (this file) -- runs in browser via JupyterLite, no install needed
#   SageMath (../sage/) -- richer algebra system, needs local install or Codespaces
#
# Both versions cover the same material. Choose whichever works for you.

import sys, os
sys.path.insert(0, os.path.join('..', '..', '..', 'shared'))

from cryptolab import Mod, crt, factor, is_prime, power_mod, primitive_root

Break: Smooth-Order Attack

Module 01 | Breaking Weak Parameters

When $p-1$ has only small prime factors, the discrete log problem becomes easy.

Why This Matters

Diffie-Hellman key exchange relies on the discrete logarithm problem (DLP) being hard: given $g$ , $p$ , and $A = g^a \bmod p$ , find $a$ .

But hardness depends critically on the structure of the group. If $p - 1$ factors into only small primes (we call $p - 1$ smooth), an attacker can:

Solve the DLP independently in each small-prime subgroup (cheap!)
Combine the partial results using the Chinese Remainder Theorem

This is the Pohlig-Hellman algorithm (1978). It reduces a hard DLP into many trivially small DLPs.

The Scenario

Alice uses Diffie-Hellman with a deliberately bad prime where $p - 1$ is smooth (has only small factors). She thinks her secret exponent is safe because $p$ is prime.

Your job: recover Alice's secret key using the Pohlig-Hellman attack.

In [ ]:

# === The Setup: A prime with a dangerously smooth order ===

p = 2521  # A prime number
print(f'p = {p}')
print(f'Is p prime? {is_prime(p)}')
print()

# The critical weakness: look at the factorization of p - 1
print(f'p - 1 = {p - 1}')
print(f'p - 1 = {factor(p - 1)}')
print()
print('Every prime factor is at most 9. This is VERY smooth.')

In [ ]:

# === Alice's key pair ===

g = primitive_root(p)
print(f'Generator g = {g}  (a primitive root mod {p})')
print(f'Order of g = {Mod(g, p).multiplicative_order()}')
print()

# Alice's secret exponent
secret = 1337
A = power_mod(g, secret, p)

print(f"Alice's public key:  A = g^secret mod p = {A}")
print(f"Alice's secret key:  secret = {secret}  (we pretend we don't know this)")
print()
print(f'Attacker sees: g = {g}, p = {p}, A = {A}')
print(f'Attacker wants: secret such that g^secret ≡ A (mod p)')

Step 1: Factor the Group Order

The group $\mathbb{Z}/p\mathbb{Z}^*$ has order $p - 1 = 2520$ .

2520 = 2^3 \cdot 3^2 \cdot 5 \cdot 7

The largest prime power is just $9 = 3^2$ . This means we never need to brute-force more than 9 possibilities in any subgroup. Compare this to brute-forcing all 2520!

In [ ]:

# Factor the group order
order = p - 1
factored = list(factor(order))

print(f'Group order: {order}')
print(f'Factorization: {factored}')
print()

# The prime powers we'll work with
prime_powers = [(q, e, q**e) for q, e in factored]
for q, e, qe in prime_powers:
    print(f'  {q}^{e} = {qe}  →  brute force at most {qe} values')

print(f'\nTotal brute-force work: {sum(qe for _, _, qe in prime_powers)} operations')
print(f'Naive brute force:     {order} operations')
print(f'Speedup factor:        {order / sum(qe for _, _, qe in prime_powers):.1f}x')

Step 2: Solve the DLP in Each Subgroup

For each prime power $q^e$ dividing $p - 1$ , we project both $g$ and $A$ into the subgroup of order $q^e$ :

g_{q^e} = g^{(p-1)/q^e} \bmod p \qquad A_{q^e} = A^{(p-1)/q^e} \bmod p

Now $g_{q^e}$ has order $q^e$ , and we need to find $x$ such that $g_{q^e}^x \equiv A_{q^e} \pmod{p}$ .

Since $q^e$ is small, we can just brute force this.

In [ ]:

# Pohlig-Hellman: solve the DLP in each small subgroup
partial_logs = []

for q, e in factored:
    qe = q**e
    
    # Project into subgroup of order q^e
    g_qe = power_mod(g, (p - 1) // qe, p)
    A_qe = power_mod(A, (p - 1) // qe, p)
    
    print(f'--- Subgroup of order {q}^{e} = {qe} ---')
    print(f'  g_{qe} = g^{(p-1)//qe} mod p = {g_qe}')
    print(f'  A_{qe} = A^{(p-1)//qe} mod p = {A_qe}')
    
    # Brute force the small DLP
    for x in range(qe):
        if power_mod(g_qe, x, p) == A_qe:
            partial_logs.append((x, qe))
            print(f'  Found: secret ≡ {x} (mod {qe})')
            break
    print()

Step 3: Combine with the Chinese Remainder Theorem

We now have a system of congruences:

\text{secret} \equiv x_1 \pmod{q_1^{e_1}}, \quad \text{secret} \equiv x_2 \pmod{q_2^{e_2}}, \quad \ldots

Since the moduli $q_i^{e_i}$ are pairwise coprime and their product is $p - 1$ , the CRT gives us a unique solution modulo $p - 1$ .

In [ ]:

# Combine partial results with CRT
remainders = [x for x, qe in partial_logs]
moduli = [qe for x, qe in partial_logs]

print('System of congruences:')
for x, qe in partial_logs:
    print(f'  secret ≡ {x} (mod {qe})')

recovered = crt(remainders, moduli)

print(f'\n=== CRT Solution ===')
print(f'Recovered secret: {recovered}')
print(f'Actual secret:    {secret}')
print(f'Match: {recovered == secret}')

In [ ]:

# Verify: does g^recovered ≡ A (mod p)?
print(f'Verification: g^{recovered} mod p = {power_mod(g, recovered, p)}')
print(f'Alice\'s public key A          = {A}')
print(f'Match: {power_mod(g, recovered, p) == A}')

Cost Comparison

How much work did the attacker actually do?

In [ ]:

# Cost analysis
brute_force_cost = p - 1  # Worst case: try all exponents
pohlig_hellman_cost = sum(q**e for q, e in factored)  # Sum of subgroup sizes

print('=== Attack Cost Comparison ===')
print(f'Brute force (worst case):   {brute_force_cost} exponentiations')
print(f'Pohlig-Hellman:             {pohlig_hellman_cost} exponentiations')
print(f'Speedup:                    {brute_force_cost / pohlig_hellman_cost:.1f}x faster')
print()
print('For a real-world smooth prime with p ~ 2^1024:')
print('Brute force:        ~2^1024 operations (impossible)')
print('Pohlig-Hellman:     ~sum of small factors (trivial!)')
print()
print('The attack scales with the LARGEST prime factor of p-1,')
print('not with p itself. Smooth p-1 = broken DH.')

The Fix: Safe Primes

A safe prime is a prime $p$ such that $q = (p-1)/2$ is also prime. Then $p - 1 = 2q$ , which has only two prime factors: $2$ and the large prime $q$ .

The Pohlig-Hellman attack can only extract the secret modulo $2$ (trivial) and modulo $q$ (requires solving a DLP of size $q$ , essentially as hard as the original problem).

In [ ]:

# Compare: a safe prime near the same size
# Find a safe prime: p such that (p-1)/2 is also prime
p_safe = 2543  # Let's find one programmatically
for candidate in range(2521, 3000):
    if is_prime(candidate) and is_prime((candidate - 1) // 2):
        p_safe = candidate
        break

print(f'Safe prime: p = {p_safe}')
print(f'p - 1 = {p_safe - 1} = {factor(p_safe - 1)}')
print()

q_safe = (p_safe - 1) // 2
print(f'Pohlig-Hellman subgroup sizes: 2 and {q_safe}')
print(f'Largest subgroup DLP: {q_safe} (almost as hard as the full DLP!)')
print()

# Compare attack costs
pohlig_safe_cost = 2 + q_safe
print(f'Smooth prime (p={p}):  Pohlig-Hellman cost = {pohlig_hellman_cost}')
print(f'Safe prime (p={p_safe}): Pohlig-Hellman cost = {pohlig_safe_cost}')
print(f'\nWith the safe prime, the attack gives you almost NOTHING.')
print(f'You learn 1 bit (secret mod 2) and still face a DLP of size {q_safe}.')

Exercise: Try It Yourself

Change the secret: Set secret to a different value and re-run the attack. Does it still work? Why?
Bigger smooth prime: Try p = 55441 where $p - 1 = 55440 = 2^4 \cdot 3^2 \cdot 5 \cdot 7 \cdot 11$ . How does the attack cost change?
Partially smooth: What if $p - 1$ has one large factor and several small ones? How much does the attacker learn?

Summary

	Smooth $p-1$	Safe prime $p = 2q+1$
Factorization of $p-1$	Many small primes	$2 \times$ (large prime)
Pohlig-Hellman cost	Sum of small factors	$\approx p/2$
DLP difficulty	Easy	Hard

Key takeaways:

The DLP is only as hard as the largest prime factor of the group order.
The Pohlig-Hellman algorithm decomposes the DLP into independent sub-problems.
Safe primes ( $p = 2q + 1$ , $q$ prime) defend against this attack.
Modern DH standards (RFC 3526, RFC 7919) mandate safe primes for exactly this reason.

Back to Module 01: Modular Arithmetic & Groups