Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/05-discrete-log-diffie-hellman/sage/05f-pohlig-hellman.ipynb
483 views
unlisted
Kernel: SageMath 10.5

Notebook 05f: The Pohlig-Hellman Algorithm

Module 05. The Discrete Logarithm and Diffie-Hellman


Motivating Question. BSGS solves the DLP in O(n)O(\sqrt{n}) time for any group of order nn. But what if nn factors into small primes, say n=24335=2160n = 2^4 \cdot 3^3 \cdot 5 = 2160? Could we somehow solve the DLP in each small prime-power subgroup separately and then combine the results? The Pohlig-Hellman algorithm does exactly this, reducing the DLP in a group of order nn to DLPs in groups of order equal to each prime factor of nn. The cost is O(iei(logn+qi))O(\sum_i e_i(\log n + \sqrt{q_i})) where n=qiein = \prod q_i^{e_i}, dramatically faster when all qiq_i are small.


Prerequisites. You should be comfortable with:

  • The Chinese Remainder Theorem (Module 04d)

  • Baby-step giant-step (notebook 05e)

  • Subgroups and element order (notebook 05b)

Learning objectives. By the end of this notebook you will be able to:

  1. Explain why smooth group orders make the DLP easy.

  2. Project a DLP instance into prime-order subgroups.

  3. Solve sub-DLPs and combine with CRT.

  4. Implement Pohlig-Hellman from scratch.

  5. Explain why safe primes resist this attack.

1. Smooth Numbers and Vulnerable Groups

Definition. An integer nn is BB-smooth if all its prime factors are B\le B.

If the group order n=p1n = p - 1 is smooth, the DLP can be broken efficiently. Let us compare two primes of similar size.

# A smooth-order group vs a safe-prime group p_smooth = 433 # p-1 = 432 = 2^4 * 3^3 p_safe = 467 # p-1 = 466 = 2 * 233 (233 is prime) print(f"Smooth: p = {p_smooth}, p-1 = {p_smooth-1} = {factor(p_smooth-1)}") print(f" Largest prime factor: {max(q for q, _ in factor(p_smooth-1))}") print() print(f"Safe prime: p = {p_safe}, p-1 = {p_safe-1} = {factor(p_safe-1)}") print(f" Largest prime factor: {max(q for q, _ in factor(p_safe-1))}") print() print(f"Pohlig-Hellman cost (smooth): O(sqrt(2) + sqrt(3)) = O(tiny)") print(f"Pohlig-Hellman cost (safe): O(sqrt(233)) = O({isqrt(233)})")

Checkpoint 1. If p1=210355372=259,459,200p - 1 = 2^{10} \cdot 3^5 \cdot 5^3 \cdot 7^2 = 259,459,200, what is the largest prime factor? What would the BSGS cost be for the full group vs the largest subgroup?

2. The Key Idea: Projection into Subgroups

Suppose n=G=q1e1q2e2qkekn = |G| = q_1^{e_1} \cdot q_2^{e_2} \cdots q_k^{e_k}. We want to find xx such that gx=hg^x = h.

Step 1: Project. For each prime power qieiq_i^{e_i}, define: gi=gn/qiei,hi=hn/qieig_i = g^{n / q_i^{e_i}}, \quad h_i = h^{n / q_i^{e_i}}

Then gig_i has order qieiq_i^{e_i}, and gixi=hig_i^{x_i} = h_i where xi=xmodqieix_i = x \bmod q_i^{e_i}.

Step 2: Solve. Solve each small DLP: gixi=hig_i^{x_i} = h_i in a group of order qieiq_i^{e_i}.

Step 3: Combine. Use the Chinese Remainder Theorem to recover xx from {xi}\{x_i\}.


Bridge from Module 04. This is a direct application of the CRT from notebook 04d! There we used CRT to solve systems of congruences. Here we use it to combine DLP solutions from different subgroups.

# Demonstrate the projection step p = 433 # p-1 = 432 = 2^4 * 3^3 n = p - 1 g = Mod(primitive_root(p), p) x_secret = 137 h = g^x_secret print(f"p = {p}, n = p-1 = {n} = {factor(n)}") print(f"Secret x = {x_secret}") print(f"g = {int(g)}, h = g^x = {int(h)}") print() # Project into each prime-power subgroup for q, e in factor(n): qi_ei = q^e exp = n // qi_ei gi = g^exp hi = h^exp xi = x_secret % qi_ei print(f"Prime power {q}^{e} = {qi_ei}:") print(f" g_i = g^({n}/{qi_ei}) = g^{exp} = {int(gi)} [order {multiplicative_order(gi)}]") print(f" h_i = h^{exp} = {int(hi)}") print(f" x mod {qi_ei} = {xi}") print(f" Check: g_i^{xi} = {int(gi^xi)} == h_i? {gi^xi == hi}") print()

3. Solving Sub-DLPs

Each projected DLP is in a group of order qieiq_i^{e_i}. For prime qiq_i (when ei=1e_i = 1), we just run BSGS in a group of order qiq_i. For prime powers (ei>1e_i > 1), we can further decompose using a "lifting" technique, but for simplicity we'll use BSGS on the full prime-power subgroup.

def baby_step_giant_step(g, h, n): """Solve g^x = h in a group of order n.""" m = isqrt(n) + 1 table = {} gj = g^0 for j in range(m): table[gj] = j gj = gj * g g_inv_m = g^(-m) gamma = h for i in range(m): if gamma in table: return (i * m + table[gamma]) % n gamma = gamma * g_inv_m return None # Solve each sub-DLP p = 433 n = p - 1 g = Mod(primitive_root(p), p) x_secret = 137 h = g^x_secret print(f"Solving sub-DLPs for x = {x_secret} mod {n} = {factor(n)}") print() residues = [] # (x_i, q_i^e_i) pairs for CRT for q, e in factor(n): qi_ei = q^e exp = n // qi_ei gi = g^exp hi = h^exp xi = baby_step_giant_step(gi, hi, qi_ei) residues.append((xi, qi_ei)) print(f" x mod {qi_ei} ({q}^{e}) = {xi} " f"[BSGS in group of order {qi_ei}, cost O(sqrt({qi_ei})) = O({isqrt(qi_ei)+1})]") print(f"\nResidues for CRT: {residues}")

Misconception alert. "Pohlig-Hellman always makes the DLP easy." No! It only helps when the group order has small prime factors. If nn is prime (or has a large prime factor), Pohlig-Hellman reduces to a single BSGS in a group of that large order, no speedup.

4. CRT Combination

We now have xxi(modqiei)x \equiv x_i \pmod{q_i^{e_i}} for each prime power. The CRT gives us the unique x[0,n)x \in [0, n).

# Combine using CRT remainders = [r for r, m in residues] moduli = [m for r, m in residues] x_recovered = CRT_list(remainders, moduli) print(f"CRT input:") for r, m in residues: print(f" x ≡ {r} (mod {m})") print(f"\nCRT output: x = {x_recovered}") print(f"Secret was: x = {x_secret}") print(f"Match? {x_recovered == x_secret}") print(f"Verify: g^{x_recovered} = {int(g^x_recovered)} == h = {int(h)}? {g^x_recovered == h}")

5. Complete Implementation

def pohlig_hellman(g, h, n): """ Solve g^x = h in a group of order n using Pohlig-Hellman. Requires: n is known and g has order n. """ residues = [] moduli = [] for q, e in factor(n): qi_ei = q^e exp = n // qi_ei gi = g^exp # project generator hi = h^exp # project target xi = baby_step_giant_step(gi, hi, qi_ei) residues.append(xi) moduli.append(qi_ei) return CRT_list(residues, moduli) # Test on several examples p = 433 g = Mod(primitive_root(p), p) print(f"p = {p}, p-1 = {factor(p-1)}") for x_test in [0, 1, 42, 137, 431]: h = g^x_test x_found = pohlig_hellman(g, h, p - 1) print(f" x = {x_test}: Pohlig-Hellman returns {x_found}, correct? {x_found == x_test}")

Checkpoint 2. What happens if you run pohlig_hellman on a safe prime like p=467p = 467 (where p1=2233p - 1 = 2 \cdot 233)? What are the subgroup orders? Is there a speedup over plain BSGS?

# Checkpoint 2, safe prime p = 467 g = Mod(primitive_root(p), p) x_test = 200 h = g^x_test print(f"p = {p}, p-1 = {factor(p-1)}") print(f"Subgroups: order 2 and order 233") print(f"BSGS in order-233 subgroup: O(sqrt(233)) = O({isqrt(233)+1}) operations") print(f"BSGS on full group: O(sqrt(466)) = O({isqrt(466)+1}) operations") print(f"Speedup: only ~{isqrt(466)//isqrt(233)}x. Pohlig-Hellman barely helps!") print() x_found = pohlig_hellman(g, h, p - 1) print(f"Result: x = {x_found}, correct? {x_found == x_test}")

6. Performance Comparison

Let us compare Pohlig-Hellman, BSGS, and brute force on smooth vs safe primes.

# Find a smooth prime and a safe prime of similar size # Smooth: p-1 has only small factors # We'll use p = 2^16 * 3^2 * 5 + 1 = 2949121? Let's find one def find_smooth_prime(target_bits, B): """Find a prime p near 2^target_bits where p-1 is B-smooth.""" small_primes = list(primes(2, B + 1)) for _ in range(10000): n = 1 while n.nbits() < target_bits: n *= small_primes[randint(0, len(small_primes)-1)] if is_prime(n + 1): return n + 1 return None p_smooth = find_smooth_prime(20, 7) if p_smooth is None: p_smooth = 746497 # 746496 = 2^7 * 3^2 * 11 * 59, fallback # Safe prime of similar size p_safe = next_prime(p_smooth) while not is_prime((p_safe - 1) // 2): p_safe = next_prime(p_safe + 1) print(f"Smooth prime: p = {p_smooth}, p-1 = {factor(p_smooth - 1)}") print(f"Safe prime: p = {p_safe}, p-1 = {factor(p_safe - 1)}") print() for label, p_test in [("Smooth", p_smooth), ("Safe", p_safe)]: g = Mod(primitive_root(p_test), p_test) x = randint(2, p_test - 2) h = g^x start = walltime() x_ph = pohlig_hellman(g, h, p_test - 1) t_ph = (walltime() - start) * 1000 start = walltime() x_bsgs = baby_step_giant_step(g, h, p_test - 1) t_bsgs = (walltime() - start) * 1000 print(f"{label}: PH = {t_ph:.2f} ms, BSGS = {t_bsgs:.2f} ms, " f"PH speedup = {t_bsgs/t_ph:.1f}x" if t_ph > 0 else f"{label}: both fast")

7. Why Safe Primes Resist Pohlig-Hellman

For a safe prime p=2q+1p = 2q + 1 with qq prime:

  • p1=2qp - 1 = 2q has only two prime factors: 22 and qq.

  • The DLP in the order-22 subgroup is trivial (only check 2 elements).

  • The DLP in the order-qq subgroup requires O(q)O(\sqrt{q}) work.

  • Since qp/2q \approx p/2, the overall cost is O(p/2)O(\sqrt{p/2}), essentially no better than generic BSGS.

This is why safe primes are used for Diffie-Hellman parameters in practice (e.g., RFC 3526).


Crypto foreshadowing. The same principle applies to elliptic curves (Module 06): we choose curves where the group order is prime (or nearly prime), so Pohlig-Hellman provides no advantage. A curve with a smooth group order would be cryptographically useless.

# Visualise: DLP difficulty vs smoothness of p-1 primes_list = list(primes(100, 1000)) largest_factors = [] for p in primes_list: largest_factors.append(max(q for q, _ in factor(p - 1))) G = point(list(zip(primes_list, largest_factors)), size=20, alpha=0.7, color='blue') G += plot((x-1)/2, (x, 100, 1000), color='red', linestyle='--', alpha=0.5, legend_label='$(p-1)/2$ (safe prime line)') G.show(figsize=[12, 5], axes_labels=['Prime p', 'Largest prime factor of p-1'], title='Largest prime factor of $p-1$: higher = more resistant to Pohlig-Hellman', gridlines=True) print("Primes near the red line are safe primes (most resistant).") print("Primes near the bottom have smooth p-1 (most vulnerable).")

Checkpoint 3. Among primes p<100p < 100, which one has the smoothest p1p - 1 (smallest largest prime factor)? Which is a safe prime?

8. Exercises

Exercise 1 (Worked): Pohlig-Hellman by Hand

Problem. Solve 2x41(mod97)2^x \equiv 41 \pmod{97} using Pohlig-Hellman.

Solution. p1=96=253p - 1 = 96 = 2^5 \cdot 3.

  1. Subgroup of order 3232 (252^5): project g=296/32=23g' = 2^{96/32} = 2^3, h=413h' = 41^3. Solve gx1=hg'^{x_1} = h'.

  2. Subgroup of order 33: project g=296/3=232g' = 2^{96/3} = 2^{32}, h=4132h' = 41^{32}. Solve gx2=hg'^{x_2} = h'.

  3. CRT: xx1(mod32)x \equiv x_1 \pmod{32}, xx2(mod3)x \equiv x_2 \pmod{3}.

# Exercise 1, verification p = 97; g = Mod(2, p); h = Mod(41, p) n = 96 print(f"p = {p}, n = {n} = {factor(n)}") # Find secret for reference x_true = discrete_log(h, g) print(f"True answer: x = {x_true}") print() # Pohlig-Hellman x_ph = pohlig_hellman(g, h, n) print(f"Pohlig-Hellman: x = {x_ph}") print(f"Match? {x_ph == x_true}") print(f"Verify: 2^{x_ph} mod 97 = {int(g^x_ph)}")

Exercise 2 (Guided): Attack on a Smooth-Order Group

Problem. A careless implementer uses p=22035+1=15728641p = 2^{20} \cdot 3 \cdot 5 + 1 = 15728641 for Diffie-Hellman (this is prime!).

  1. Factor p1p - 1 and identify why this is vulnerable.

  2. Generate a DH exchange (pick random a,ba, b, compute A=ga,B=gbA = g^a, B = g^b).

  3. As Eve, use Pohlig-Hellman to recover Alice's secret aa from A=gaA = g^a.

  4. Compute the shared secret.

Hint: p - 1 = 15728640 = 2^20 * 3 * 5. The largest prime factor is just 5!

# Exercise 2, fill in the TODOs p = 15728641 assert is_prime(p) # TODO 1: Factor p-1 and show it's smooth # print(f"p-1 = {p-1} = {factor(p-1)}") # TODO 2: Set up DH # g = Mod(primitive_root(p), p) # a = randint(2, p-2); b = randint(2, p-2) # A = g^a; B = g^b # true_secret = B^a # TODO 3: Eve recovers a using Pohlig-Hellman # a_eve = pohlig_hellman(g, A, p - 1) # print(f"Eve recovered a = {a_eve}, true a = {a}") # TODO 4: Eve computes shared secret # eve_secret = B^a_eve # print(f"Eve's shared secret: {eve_secret}") # print(f"True shared secret: {true_secret}") # print(f"Match? {eve_secret == true_secret}")

Exercise 3 (Independent): Measuring Smoothness Vulnerability

Problem.

  1. Write a function pohlig_hellman_cost(n) that estimates the total BSGS cost: qenqe\sum_{q^e \| n} \lceil\sqrt{q^e}\rceil.

  2. For primes pp in the range [106,106+1000][10^6, 10^6 + 1000], compute this cost and the "full BSGS" cost p1\lceil\sqrt{p-1}\rceil.

  3. Plot the ratio. Which primes are most vulnerable? Can you identify the safe primes?

# Exercise 3, write your solution here

Summary

ConceptKey Fact
Smooth orderIf n=p1n = p-1 has only small prime factors, DLP is easy
ProjectionMap DLP into subgroups: gi=gn/qieig_i = g^{n/q_i^{e_i}}, hi=hn/qieih_i = h^{n/q_i^{e_i}}
Sub-DLPSolve gixi=hig_i^{x_i} = h_i in group of order qieiq_i^{e_i} via BSGS
CRTCombine xxi(modqiei)x \equiv x_i \pmod{q_i^{e_i}} to recover xx
CostO(qiei)O(\sum \sqrt{q_i^{e_i}}), fast when all qiq_i are small
DefenceUse safe primes p=2q+1p = 2q+1 so p1p-1 has a large prime factor

This completes Module 05! We have built a full picture of the discrete logarithm: from definition (05a), through generators (05b), to the Diffie-Hellman protocol (05c), its theoretical security (05d), and the two main attacks (05e, 05f).


Next module: Module 06. Elliptic Curves