Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/04-number-theory-rsa/sage/04e-rsa-key-generation.ipynb
483 views
unlisted
Kernel: SageMath 10.5

Primality Testing

Module 04 | 04-number-theory-rsa | Notebook 04e

Trial division, Fermat test, Carmichael numbers, Miller-Rabin, is_prime(), random_prime()

Objectives

By the end of this notebook you will be able to:

  1. Explain why finding large primes is essential for RSA key generation.

  2. Implement trial division and understand its computational limits.

  3. Describe the Fermat primality test and identify its weakness (Carmichael numbers).

  4. Implement and analyze the Miller-Rabin probabilistic primality test.

  5. Use SageMath's is_prime(), is_pseudoprime(), and random_prime() for cryptographic prime generation.

Prerequisites

  • Completion of The Chinese Remainder Theorem.

  • Fermat's Little Theorem from notebook 04c: if pp is prime and gcd(a,p)=1\gcd(a, p) = 1, then ap11(modp)a^{p-1} \equiv 1 \pmod{p}.

  • Modular exponentiation via power_mod(a, e, n).

Motivation: Why Primality Testing Matters

The Question: Is 2891=6189700196426901374495621112^{89} - 1 = 618970019642690137449562111 prime?

Trial division would require testing divisors up to 2891244.52.5×1013\sqrt{2^{89}-1} \approx 2^{44.5} \approx 2.5 \times 10^{13}. At a billion divisions per second, that is roughly 7 hours. For a 1024-bit number (the size RSA actually uses), trial division would take longer than the age of the universe. Is there a faster way?

Bridge from 04c: In notebook 04c we proved Fermat's Little Theorem: ap11(modp)a^{p-1} \equiv 1 \pmod{p} for prime pp. We used it as a structural fact about primes. Now we flip the logic and use it as a test: if an1≢1(modn)a^{n-1} \not\equiv 1 \pmod{n}, then nn cannot be prime. This simple contrapositive is the foundation of every fast primality test.

Crypto foreshadowing: RSA key generation needs two large primes pp and qq, each typically 1024 bits. The standard algorithm is: generate a random odd number, test primality, repeat. By the Prime Number Theorem, roughly 1 in every 710 odd 1024-bit numbers is prime, so efficient testing is critical.


1. Trial Division

The simplest primality test: to check whether nn is prime, test whether any integer dd with 2dn2 \le d \le \lfloor\sqrt{n}\rfloor divides nn.

Why n\sqrt{n} suffices: If n=abn = a \cdot b with aba \le b, then ana \le \sqrt{n}. So if no divisor up to n\sqrt{n} exists, nn is prime.

Complexity: O(n)O(\sqrt{n}) divisions. For an kk-bit number, that is O(2k/2)O(2^{k/2}), exponential in the bit-length.

def trial_division(n): """Return True if n is prime, using trial division up to sqrt(n).""" if n < 2: return False if n == 2: return True if n % 2 == 0: return False d = 3 while d * d <= n: if n % d == 0: return False d += 2 return True # Test on some small numbers test_values = [2, 3, 4, 15, 17, 561, 1009, 1729] for n in test_values: result = trial_division(n) print(f'trial_division({n}) = {result}')
# How slow is trial division on larger numbers? # A 20-digit prime p20 = next_prime(10^19) print(f'Testing p = {p20} ({p20.nbits()} bits)') start = walltime() result = trial_division(p20) elapsed = walltime() - start print(f'Result: {result}, Time: {elapsed:.4f} seconds') # A 30-digit prime, already noticeably slower p30 = next_prime(10^29) print(f'\nTesting p = {p30} ({p30.nbits()} bits)') start = walltime() result = trial_division(p30) elapsed = walltime() - start print(f'Result: {result}, Time: {elapsed:.4f} seconds') print('\nFor a 1024-bit prime, trial division is completely infeasible.')

2. The Fermat Primality Test

Idea (from 04c): Fermat's Little Theorem says: if pp is prime and gcd(a,p)=1\gcd(a,p) = 1, then ap11(modp).a^{p-1} \equiv 1 \pmod{p}.

Contrapositive: If an1≢1(modn)a^{n-1} \not\equiv 1 \pmod{n} for some aa with gcd(a,n)=1\gcd(a,n)=1, then nn is definitely composite.

Algorithm:

  1. Pick a random base aa with 2an22 \le a \le n-2.

  2. Compute an1modna^{n-1} \bmod n.

  3. If the result is not 1, output "composite." Otherwise, output "probably prime."

Repeat kk times with different random bases for higher confidence.

def fermat_test(n, k=10): """Fermat primality test with k random bases. Returns 'composite' or 'probably prime'.""" if n < 2: return 'composite' if n <= 3: return 'probably prime' for _ in range(k): a = randint(2, n - 2) if power_mod(a, n - 1, n) != 1: return 'composite' return 'probably prime' # Test on known primes and composites print(f'fermat_test(17) = {fermat_test(17)}') print(f'fermat_test(100) = {fermat_test(100)}') print(f'fermat_test(1009) = {fermat_test(1009)}') print(f'fermat_test(1000) = {fermat_test(1000)}') # Fast even on large numbers! big_prime = next_prime(2^256) print(f'\nfermat_test(next_prime(2^256)) = {fermat_test(big_prime)}')

The Fermat test is blazingly fast: power_mod uses repeated squaring, so it runs in O(klog2n)O(k \cdot \log^2 n) time, polynomial in the bit-length! Compare that to trial division's exponential cost.

But there is a trap...


3. Carmichael Numbers: When Fermat Fails

Misconception alert: "If an11(modn)a^{n-1} \equiv 1 \pmod{n} for some base aa, then nn is probably prime."

This is dangerously wrong. Carmichael numbers are composites where an11(modn)a^{n-1} \equiv 1 \pmod{n} for every base aa with gcd(a,n)=1\gcd(a,n)=1. The Fermat test will say "probably prime" no matter how many rounds you run!

Definition: A composite number nn is a Carmichael number if an11(modn)a^{n-1} \equiv 1 \pmod{n} for all aa with gcd(a,n)=1\gcd(a,n) = 1.

Korselt's criterion: nn is a Carmichael number if and only if nn is square-free and, for every prime pp dividing nn, we have (p1)(n1)(p-1) \mid (n-1).

The smallest Carmichael number is 561=31117561 = 3 \cdot 11 \cdot 17.

# Verify: 561 = 3 * 11 * 17 is a Carmichael number n = 561 print(f'561 = {factor(561)}') print(f'Is 561 prime? {is_prime(561)}') print() # Check Korselt's criterion: (p-1) | (n-1) for each prime factor p for p in [3, 11, 17]: print(f' (p-1) = {p-1}, (n-1) = {n-1}, (p-1) | (n-1)? {(n-1) % (p-1) == 0}') print() # The Fermat test is fooled for EVERY base coprime to 561 fooled_count = 0 for a in range(2, 561): if gcd(a, 561) == 1: if power_mod(a, 560, 561) == 1: fooled_count += 1 coprime_count = euler_phi(561) print(f'Bases coprime to 561: {coprime_count}') print(f'Bases where a^560 = 1 (mod 561): {fooled_count}') print(f'Fermat test is fooled for ALL coprime bases!')
# The Fermat test calls 561 "probably prime" every single time for trial in range(5): print(f' fermat_test(561, k=20) = {fermat_test(561, k=20)}') print() print('The Fermat test CANNOT detect Carmichael numbers.') print('We need a stronger test.')

Note: Carmichael numbers are rare but they are infinite (Alford-Granville-Pomerance, 1994). The first few are: 561, 1105, 1729, 2465, 2821, 6601, 8911, ...

For cryptography, "rare but infinite" is not good enough. We need a test with a provable error bound.


4. The Miller-Rabin Primality Test

Miller-Rabin strengthens the Fermat test by exploiting an additional property of primes.

Key insight: If pp is an odd prime and x21(modp)x^2 \equiv 1 \pmod{p}, then x±1(modp)x \equiv \pm 1 \pmod{p}. That is, the only square roots of 1 modulo a prime are 11 and 1-1. (This follows because Z/pZ\mathbb{Z}/p\mathbb{Z} is a field.)

Setup: Write n1=2sdn - 1 = 2^s \cdot d where dd is odd. Then: an1=a2sd=(ad)2sa^{n-1} = a^{2^s \cdot d} = \left(a^d\right)^{2^s}

We compute the sequence: ad,  a2d,  a4d,  ,  a2sd=an1a^d,\; a^{2d},\; a^{4d},\; \ldots,\; a^{2^s d} = a^{n-1}

Each term is the square of the previous one. If nn is prime, then:

  • Either ad1(modn)a^d \equiv 1 \pmod{n}, or

  • a2rd1(modn)a^{2^r d} \equiv -1 \pmod{n} for some 0r<s0 \le r < s.

If neither condition holds, nn is definitely composite.

Error bound: If nn is composite, at least 3/4 of all bases a{2,,n2}a \in \{2, \ldots, n-2\} will witness this. So each round has error probability 1/4\le 1/4, and kk rounds give error probability 4k\le 4^{-k}.

def miller_rabin_single(n, a): """Single round of Miller-Rabin with base a. Returns True if n is 'probably prime' for this base, False if n is definitely composite.""" # Write n-1 = 2^s * d with d odd d = n - 1 s = 0 while d % 2 == 0: d //= 2 s += 1 # Compute a^d mod n x = power_mod(a, d, n) # Case 1: a^d = 1 (mod n) if x == 1 or x == n - 1: return True # Square repeatedly: check a^(2^r * d) for r = 1, ..., s-1 for r in range(1, s): x = power_mod(x, 2, n) if x == n - 1: return True if x == 1: # Found a non-trivial square root of 1, composite! return False # a^(n-1) != 1 mod n, or we never hit -1 return False def miller_rabin(n, k=20): """Miller-Rabin primality test with k random bases. Error probability <= 4^(-k).""" if n < 2: return 'composite' if n == 2 or n == 3: return 'probably prime' if n % 2 == 0: return 'composite' for _ in range(k): a = randint(2, n - 2) if not miller_rabin_single(n, a): return 'composite' return 'probably prime' # Basic tests print(f'miller_rabin(17) = {miller_rabin(17)}') print(f'miller_rabin(100) = {miller_rabin(100)}') print(f'miller_rabin(1009) = {miller_rabin(1009)}')

Miller-Rabin catches Carmichael numbers

Checkpoint: Before running the cell below, predict: will Miller-Rabin correctly identify 561 as composite? (Recall that Fermat was fooled every time.)

# Miller-Rabin on Carmichael numbers carmichaels = [561, 1105, 1729, 2465, 2821, 6601] for c in carmichaels: result = miller_rabin(c, k=10) print(f'miller_rabin({c}) = {result} (actual: composite, factors = {factor(c)})') print() print('Miller-Rabin detects ALL of them!')
# Let's see WHY Miller-Rabin catches 561 # 561 - 1 = 560 = 2^4 * 35, so s=4, d=35 n = 561 s, d = 4, 35 print(f'n - 1 = {n-1} = 2^{s} * {d}') print(f'Check: 2^{s} * {d} = {2^s * d}\n') # Trace the Miller-Rabin sequence for base a=2 a = 2 x = power_mod(a, d, n) print(f'a = {a}') print(f'a^d mod n = {a}^{d} mod {n} = {x}') for r in range(1, s + 1): x_prev = x x = power_mod(x, 2, n) print(f'a^(2^{r}*d) mod n = {x} (squared {x_prev})') print(f'\nFinal value a^(n-1) mod n = {power_mod(a, n-1, n)}') print('The Fermat test sees 1 and says "probably prime".') print('But Miller-Rabin sees that the sequence reached 1 without') print('passing through -1 (i.e., 560) first, a non-trivial square') print('root of 1 exists, proving n is composite!')

5. Probabilistic vs Deterministic Testing

TestTypeComplexityWeakness
Trial divisionDeterministicO(n)O(\sqrt{n})Way too slow for large nn
Fermat testProbabilisticO(klog2n)O(k \log^2 n)Carmichael numbers
Miller-RabinProbabilisticO(klog2n)O(k \log^2 n)Error 4k\le 4^{-k} (no known weakness)
AKS (2002)DeterministicO~(log6n)\tilde{O}(\log^6 n)Polynomial, but too slow in practice

In practice: Miller-Rabin with k=40k = 40 rounds gives error probability 4401024\le 4^{-40} \approx 10^{-24}. This is far smaller than the probability of a hardware error corrupting the computation. Cryptographic libraries universally use Miller-Rabin.

SageMath's approach:

  • is_pseudoprime(n): Uses Miller-Rabin. Fast, probabilistic.

  • is_prime(n): Uses a combination of methods to give a proven result (for numbers within practical range). Slower but deterministic.

# SageMath's built-in primality functions n = next_prime(2^512) print(f'Testing n = next_prime(2^512)') print(f'n has {n.nbits()} bits\n') # is_pseudoprime: fast probabilistic (Miller-Rabin) start = walltime() result = is_pseudoprime(n) t1 = walltime() - start print(f'is_pseudoprime(n) = {result} ({t1:.6f} sec)') # is_prime: proven result start = walltime() result = is_prime(n) t2 = walltime() - start print(f'is_prime(n) = {result} ({t2:.6f} sec)') # random_prime: generate a random prime up to a bound p = random_prime(2^1024) print(f'\nrandom_prime(2^1024): {p.nbits()}-bit prime generated') print(f' Value: {p}')

6. Generating RSA Primes

Crypto foreshadowing: Here is how RSA key generation actually works:

  1. Pick a random odd number nn of the desired bit-length (e.g., 1024 bits).

  2. Test if nn is prime (using Miller-Rabin).

  3. If not, try n+2n + 2, n+4n + 4, ... (or just pick a new random number).

  4. Repeat until a prime is found.

By the Prime Number Theorem, the density of primes near NN is about 1/ln(N)1/\ln(N). For 1024-bit numbers, ln(21024)710\ln(2^{1024}) \approx 710, so on average we test about 355 random odd numbers before finding a prime.

# Simulate RSA prime generation: how many candidates before finding a prime? def generate_prime_counting_attempts(bits): """Generate a prime of the given bit-length, counting attempts.""" attempts = 0 while True: # Random odd number with exactly 'bits' bits candidate = randint(2^(bits-1), 2^bits - 1) | 1 # ensure odd attempts += 1 if is_pseudoprime(candidate): return candidate, attempts # Generate several 512-bit primes and report attempt counts print('Generating 512-bit primes (RSA-1024 needs two of these):\n') total_attempts = 0 num_trials = 5 for i in range(num_trials): start = walltime() p, attempts = generate_prime_counting_attempts(512) elapsed = walltime() - start total_attempts += attempts print(f' Trial {i+1}: found prime after {attempts} attempts ({elapsed:.3f} sec)') print(f'\nAverage attempts: {total_attempts / num_trials:.1f}') print(f'Expected by PNT: ~{round(log(2^512).n() / 2)}')

Exercises

Exercise 1: Trace Miller-Rabin by Hand (Fully Worked)

Problem: Apply the Miller-Rabin test to n=221=13×17n = 221 = 13 \times 17 with base a=174a = 174.

Solution:

Step 1: Write n1=220=22×55n - 1 = 220 = 2^2 \times 55, so s=2s = 2 and d=55d = 55.

Step 2: Compute admodn=17455mod221a^d \bmod n = 174^{55} \bmod 221.

# Exercise 1: Fully worked Miller-Rabin trace n = 221 # = 13 * 17 a = 174 s, d = 2, 55 print(f'n = {n} = {factor(n)}') print(f'n - 1 = {n-1} = 2^{s} * {d}') print(f'Verify: 2^{s} * {d} = {2^s * d}\n') # Step 2: Compute a^d mod n x0 = power_mod(a, d, n) print(f'x_0 = a^d mod n = {a}^{d} mod {n} = {x0}') # Step 3: Square repeatedly x1 = power_mod(x0, 2, n) print(f'x_1 = x_0^2 mod n = {x0}^2 mod {n} = {x1}') print(f'\nAnalysis:') print(f' x_0 = {x0} (not 1 and not {n-1})') print(f' x_1 = {x1} = n - 1? {"Yes" if x1 == n-1 else "No"}') if x0 != 1 and x0 != n - 1 and x1 != n - 1: print(f'\nConclusion: n = {n} is COMPOSITE (Miller-Rabin witness found: a = {a})') else: print(f'\nConclusion: a = {a} is not a witness; n passes this round.') # Verify with our function print(f'\nVerify: miller_rabin_single({n}, {a}) = {miller_rabin_single(n, a)}')

Exercise 2: Fermat vs Miller-Rabin on Carmichael Numbers (Guided)

Problem: The number 1729=7×13×191729 = 7 \times 13 \times 19 is a Carmichael number (also famous as the Hardy-Ramanujan "taxicab" number). Your tasks:

  1. Verify Korselt's criterion: check that (p1)(n1)(p-1) \mid (n-1) for each prime factor pp.

  2. Show that the Fermat test is fooled by testing 20 random bases.

  3. Show that Miller-Rabin catches it: find a witness base.

# Exercise 2: Guided, fill in the TODOs n = 1729 print(f'n = {n} = {factor(n)}') print() # Part 1: Verify Korselt's criterion print('Part 1: Korselt\'s criterion') for p in [7, 13, 19]: # TODO: Check whether (p-1) divides (n-1) and print the result # Hint: use (n-1) % (p-1) == 0 pass print() # Part 2: Show Fermat is fooled print('Part 2: Fermat test') # TODO: Run fermat_test(n, k=20) several times and print results # What do you observe? print() # Part 3: Find a Miller-Rabin witness print('Part 3: Miller-Rabin witness search') # TODO: Loop over bases a = 2, 3, 4, ... and find the first a # where miller_rabin_single(n, a) returns False # Hint: the loop should break as soon as you find a witness

Exercise 3: Error Probability Experiment (Independent)

Problem: Miller-Rabin theory says: for a composite nn, at most 1/41/4 of bases in {2,,n2}\{2, \ldots, n-2\} are non-witnesses (i.e., they fail to detect that nn is composite).

Pick the composite number n=341=11×31n = 341 = 11 \times 31 (a base-2 Fermat pseudoprime).

  1. For every base aa from 2 to 339, run miller_rabin_single(341, a) and count how many return True (non-witnesses).

  2. Compute the fraction of non-witnesses. Is it 1/4\le 1/4?

  3. Repeat for n=561n = 561. Is the 1/41/4 bound satisfied?

No starter code provided. Write your solution from scratch.

# Exercise 3: Your solution here

Summary

ConceptKey idea
Trial divisionSimple and correct, but exponential complexity O(2k/2)O(2^{k/2}) in bit-length kk makes it useless for cryptographic sizes
Fermat testUses the contrapositive of Fermat's Little Theorem: if an1≢1a^{n-1} \not\equiv 1, then nn is composite. Fast, but fatally flawed by Carmichael numbers
Carmichael numbersComposites (like 561, 1105, 1729) that fool the Fermat test for every coprime base
Miller-Rabin testStrengthens Fermat by checking for non-trivial square roots of 1. Each round catches a composite with probability 3/4\ge 3/4, so kk rounds give error 4k\le 4^{-k}
SageMath toolsis_prime() (deterministic), is_pseudoprime() (Miller-Rabin), and random_prime() for generating primes
RSA prime generationPick random odd numbers and test with Miller-Rabin until a prime is found. The Prime Number Theorem guarantees this happens quickly

Next: RSA Encryption and Decryption, we put primes to work, building the full RSA cryptosystem.