GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/04-number-theory-rsa/sage/04e-rsa-key-generation.ipynb
⁴⁸³ views

unlisted

Kernel: SageMath 10.5

Primality Testing

Module 04 | 04-number-theory-rsa | Notebook 04e

Trial division, Fermat test, Carmichael numbers, Miller-Rabin, is_prime(), random_prime()

Objectives

By the end of this notebook you will be able to:

Explain why finding large primes is essential for RSA key generation.
Implement trial division and understand its computational limits.
Describe the Fermat primality test and identify its weakness (Carmichael numbers).
Implement and analyze the Miller-Rabin probabilistic primality test.
Use SageMath's is_prime(), is_pseudoprime(), and random_prime() for cryptographic prime generation.

Prerequisites

Completion of The Chinese Remainder Theorem.
Fermat's Little Theorem from notebook 04c: if $p$ is prime and $\gcd(a, p) = 1$ , then $a^{p-1} \equiv 1 \pmod{p}$ .
Modular exponentiation via power_mod(a, e, n).

Motivation: Why Primality Testing Matters

The Question: Is $2^{89} - 1 = 618970019642690137449562111$ prime?
Trial division would require testing divisors up to $\sqrt{2^{89}-1} \approx 2^{44.5} \approx 2.5 \times 10^{13}$ . At a billion divisions per second, that is roughly 7 hours. For a 1024-bit number (the size RSA actually uses), trial division would take longer than the age of the universe. Is there a faster way?

Bridge from 04c: In notebook 04c we proved Fermat's Little Theorem: $a^{p-1} \equiv 1 \pmod{p}$ for prime $p$ . We used it as a structural fact about primes. Now we flip the logic and use it as a test: if $a^{n-1} \not\equiv 1 \pmod{n}$ , then $n$ cannot be prime. This simple contrapositive is the foundation of every fast primality test.

Crypto foreshadowing: RSA key generation needs two large primes $p$ and $q$ , each typically 1024 bits. The standard algorithm is: generate a random odd number, test primality, repeat. By the Prime Number Theorem, roughly 1 in every 710 odd 1024-bit numbers is prime, so efficient testing is critical.

1. Trial Division

The simplest primality test: to check whether $n$ is prime, test whether any integer $d$ with $2 \le d \le \lfloor\sqrt{n}\rfloor$ divides $n$ .

Why $\sqrt{n}$ suffices: If $n = a \cdot b$ with $a \le b$ , then $a \le \sqrt{n}$ . So if no divisor up to $\sqrt{n}$ exists, $n$ is prime.

Complexity: $O(\sqrt{n})$ divisions. For an $k$ -bit number, that is $O(2^{k/2})$ , exponential in the bit-length.

In [ ]:

def trial_division(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

# Test on some small numbers
test_values = [2, 3, 4, 15, 17, 561, 1009, 1729]
for n in test_values:
    result = trial_division(n)
    print(f'trial_division({n}) = {result}')

In [ ]:

# How slow is trial division on larger numbers?

# A 20-digit prime
p20 = next_prime(10^19)
print(f'Testing p = {p20} ({p20.nbits()} bits)')

start = walltime()
result = trial_division(p20)
elapsed = walltime() - start
print(f'Result: {result}, Time: {elapsed:.4f} seconds')

# A 30-digit prime, already noticeably slower
p30 = next_prime(10^29)
print(f'\nTesting p = {p30} ({p30.nbits()} bits)')

start = walltime()
result = trial_division(p30)
elapsed = walltime() - start
print(f'Result: {result}, Time: {elapsed:.4f} seconds')

print('\nFor a 1024-bit prime, trial division is completely infeasible.')

2. The Fermat Primality Test

Idea (from 04c): Fermat's Little Theorem says: if $p$ is prime and $\gcd(a,p) = 1$ , then $a^{p-1} \equiv 1 \pmod{p}.$

Contrapositive: If $a^{n-1} \not\equiv 1 \pmod{n}$ for some $a$ with $\gcd(a,n)=1$ , then $n$ is definitely composite.

Algorithm:

Pick a random base $a$ with $2 \le a \le n-2$ .
Compute $a^{n-1} \bmod n$ .
If the result is not 1, output "composite." Otherwise, output "probably prime."

Repeat $k$ times with different random bases for higher confidence.

In [ ]:

def fermat_test(n, k=10):
    """Fermat primality test with k random bases.
    Returns 'composite' or 'probably prime'."""
    if n < 2:
        return 'composite'
    if n <= 3:
        return 'probably prime'
    for _ in range(k):
        a = randint(2, n - 2)
        if power_mod(a, n - 1, n) != 1:
            return 'composite'
    return 'probably prime'

# Test on known primes and composites
print(f'fermat_test(17)   = {fermat_test(17)}')
print(f'fermat_test(100)  = {fermat_test(100)}')
print(f'fermat_test(1009) = {fermat_test(1009)}')
print(f'fermat_test(1000) = {fermat_test(1000)}')

# Fast even on large numbers!
big_prime = next_prime(2^256)
print(f'\nfermat_test(next_prime(2^256)) = {fermat_test(big_prime)}')

The Fermat test is blazingly fast: power_mod uses repeated squaring, so it runs in $O(k \cdot \log^2 n)$ time, polynomial in the bit-length! Compare that to trial division's exponential cost.

But there is a trap...

3. Carmichael Numbers: When Fermat Fails

Misconception alert: "If $a^{n-1} \equiv 1 \pmod{n}$ for some base $a$ , then $n$ is probably prime."
This is dangerously wrong. Carmichael numbers are composites where $a^{n-1} \equiv 1 \pmod{n}$ for every base $a$ with $\gcd(a,n)=1$ . The Fermat test will say "probably prime" no matter how many rounds you run!

Definition: A composite number $n$ is a Carmichael number if $a^{n-1} \equiv 1 \pmod{n}$ for all $a$ with $\gcd(a,n) = 1$ .

Korselt's criterion: $n$ is a Carmichael number if and only if $n$ is square-free and, for every prime $p$ dividing $n$ , we have $(p-1) \mid (n-1)$ .

The smallest Carmichael number is $561 = 3 \cdot 11 \cdot 17$ .

In [ ]:

# Verify: 561 = 3 * 11 * 17 is a Carmichael number
n = 561
print(f'561 = {factor(561)}')
print(f'Is 561 prime? {is_prime(561)}')
print()

# Check Korselt's criterion: (p-1) | (n-1) for each prime factor p
for p in [3, 11, 17]:
    print(f'  (p-1) = {p-1},  (n-1) = {n-1},  (p-1) | (n-1)? {(n-1) % (p-1) == 0}')

print()

# The Fermat test is fooled for EVERY base coprime to 561
fooled_count = 0
for a in range(2, 561):
    if gcd(a, 561) == 1:
        if power_mod(a, 560, 561) == 1:
            fooled_count += 1

coprime_count = euler_phi(561)
print(f'Bases coprime to 561: {coprime_count}')
print(f'Bases where a^560 = 1 (mod 561): {fooled_count}')
print(f'Fermat test is fooled for ALL coprime bases!')

In [ ]:

# The Fermat test calls 561 "probably prime" every single time
for trial in range(5):
    print(f'  fermat_test(561, k=20) = {fermat_test(561, k=20)}')

print()
print('The Fermat test CANNOT detect Carmichael numbers.')
print('We need a stronger test.')

Note: Carmichael numbers are rare but they are infinite (Alford-Granville-Pomerance, 1994). The first few are: 561, 1105, 1729, 2465, 2821, 6601, 8911, ...
For cryptography, "rare but infinite" is not good enough. We need a test with a provable error bound.

4. The Miller-Rabin Primality Test

Miller-Rabin strengthens the Fermat test by exploiting an additional property of primes.

Key insight: If $p$ is an odd prime and $x^2 \equiv 1 \pmod{p}$ , then $x \equiv \pm 1 \pmod{p}$ . That is, the only square roots of 1 modulo a prime are $1$ and $-1$ . (This follows because $\mathbb{Z}/p\mathbb{Z}$ is a field.)

Setup: Write $n - 1 = 2^s \cdot d$ where $d$ is odd. Then: $a^{n-1} = a^{2^s \cdot d} = \left(a^d\right)^{2^s}$

We compute the sequence: $a^d,\; a^{2d},\; a^{4d},\; \ldots,\; a^{2^s d} = a^{n-1}$

Each term is the square of the previous one. If $n$ is prime, then:

Either $a^d \equiv 1 \pmod{n}$ , or
$a^{2^r d} \equiv -1 \pmod{n}$ for some $0 \le r < s$ .

If neither condition holds, $n$ is definitely composite.

Error bound: If $n$ is composite, at least 3/4 of all bases $a \in \{2, \ldots, n-2\}$ will witness this. So each round has error probability $\le 1/4$ , and $k$ rounds give error probability $\le 4^{-k}$ .

In [ ]:

def miller_rabin_single(n, a):
    """Single round of Miller-Rabin with base a.
    Returns True if n is 'probably prime' for this base,
    False if n is definitely composite."""
    # Write n-1 = 2^s * d with d odd
    d = n - 1
    s = 0
    while d % 2 == 0:
        d //= 2
        s += 1

    # Compute a^d mod n
    x = power_mod(a, d, n)

    # Case 1: a^d = 1 (mod n)
    if x == 1 or x == n - 1:
        return True

    # Square repeatedly: check a^(2^r * d) for r = 1, ..., s-1
    for r in range(1, s):
        x = power_mod(x, 2, n)
        if x == n - 1:
            return True
        if x == 1:
            # Found a non-trivial square root of 1, composite!
            return False

    # a^(n-1) != 1 mod n, or we never hit -1
    return False


def miller_rabin(n, k=20):
    """Miller-Rabin primality test with k random bases.
    Error probability <= 4^(-k)."""
    if n < 2:
        return 'composite'
    if n == 2 or n == 3:
        return 'probably prime'
    if n % 2 == 0:
        return 'composite'

    for _ in range(k):
        a = randint(2, n - 2)
        if not miller_rabin_single(n, a):
            return 'composite'
    return 'probably prime'


# Basic tests
print(f'miller_rabin(17)   = {miller_rabin(17)}')
print(f'miller_rabin(100)  = {miller_rabin(100)}')
print(f'miller_rabin(1009) = {miller_rabin(1009)}')

Miller-Rabin catches Carmichael numbers

Checkpoint: Before running the cell below, predict: will Miller-Rabin correctly identify 561 as composite? (Recall that Fermat was fooled every time.)

In [ ]:

# Miller-Rabin on Carmichael numbers
carmichaels = [561, 1105, 1729, 2465, 2821, 6601]
for c in carmichaels:
    result = miller_rabin(c, k=10)
    print(f'miller_rabin({c}) = {result}  (actual: composite, factors = {factor(c)})')

print()
print('Miller-Rabin detects ALL of them!')

In [ ]:

# Let's see WHY Miller-Rabin catches 561
# 561 - 1 = 560 = 2^4 * 35, so s=4, d=35
n = 561
s, d = 4, 35
print(f'n - 1 = {n-1} = 2^{s} * {d}')
print(f'Check: 2^{s} * {d} = {2^s * d}\n')

# Trace the Miller-Rabin sequence for base a=2
a = 2
x = power_mod(a, d, n)
print(f'a = {a}')
print(f'a^d mod n = {a}^{d} mod {n} = {x}')

for r in range(1, s + 1):
    x_prev = x
    x = power_mod(x, 2, n)
    print(f'a^(2^{r}*d) mod n = {x}   (squared {x_prev})')

print(f'\nFinal value a^(n-1) mod n = {power_mod(a, n-1, n)}')
print('The Fermat test sees 1 and says "probably prime".')
print('But Miller-Rabin sees that the sequence reached 1 without')
print('passing through -1 (i.e., 560) first, a non-trivial square')
print('root of 1 exists, proving n is composite!')

5. Probabilistic vs Deterministic Testing

Test	Type	Complexity	Weakness
Trial division	Deterministic	$O(\sqrt{n})$	Way too slow for large $n$
Fermat test	Probabilistic	$O(k \log^2 n)$	Carmichael numbers
Miller-Rabin	Probabilistic	$O(k \log^2 n)$	Error $\le 4^{-k}$ (no known weakness)
AKS (2002)	Deterministic	$\tilde{O}(\log^6 n)$	Polynomial, but too slow in practice

In practice: Miller-Rabin with $k = 40$ rounds gives error probability $\le 4^{-40} \approx 10^{-24}$ . This is far smaller than the probability of a hardware error corrupting the computation. Cryptographic libraries universally use Miller-Rabin.

SageMath's approach:

is_pseudoprime(n): Uses Miller-Rabin. Fast, probabilistic.
is_prime(n): Uses a combination of methods to give a proven result (for numbers within practical range). Slower but deterministic.

In [ ]:

# SageMath's built-in primality functions

n = next_prime(2^512)
print(f'Testing n = next_prime(2^512)')
print(f'n has {n.nbits()} bits\n')

# is_pseudoprime: fast probabilistic (Miller-Rabin)
start = walltime()
result = is_pseudoprime(n)
t1 = walltime() - start
print(f'is_pseudoprime(n) = {result}  ({t1:.6f} sec)')

# is_prime: proven result
start = walltime()
result = is_prime(n)
t2 = walltime() - start
print(f'is_prime(n)       = {result}  ({t2:.6f} sec)')

# random_prime: generate a random prime up to a bound
p = random_prime(2^1024)
print(f'\nrandom_prime(2^1024): {p.nbits()}-bit prime generated')
print(f'  Value: {p}')

6. Generating RSA Primes

Crypto foreshadowing: Here is how RSA key generation actually works:

Pick a random odd number $n$ of the desired bit-length (e.g., 1024 bits).
Test if $n$ is prime (using Miller-Rabin).
If not, try $n + 2$ , $n + 4$ , ... (or just pick a new random number).
Repeat until a prime is found.

By the Prime Number Theorem, the density of primes near $N$ is about $1/\ln(N)$ . For 1024-bit numbers, $\ln(2^{1024}) \approx 710$ , so on average we test about 355 random odd numbers before finding a prime.

In [ ]:

# Simulate RSA prime generation: how many candidates before finding a prime?

def generate_prime_counting_attempts(bits):
    """Generate a prime of the given bit-length, counting attempts."""
    attempts = 0
    while True:
        # Random odd number with exactly 'bits' bits
        candidate = randint(2^(bits-1), 2^bits - 1) | 1  # ensure odd
        attempts += 1
        if is_pseudoprime(candidate):
            return candidate, attempts

# Generate several 512-bit primes and report attempt counts
print('Generating 512-bit primes (RSA-1024 needs two of these):\n')
total_attempts = 0
num_trials = 5
for i in range(num_trials):
    start = walltime()
    p, attempts = generate_prime_counting_attempts(512)
    elapsed = walltime() - start
    total_attempts += attempts
    print(f'  Trial {i+1}: found prime after {attempts} attempts ({elapsed:.3f} sec)')

print(f'\nAverage attempts: {total_attempts / num_trials:.1f}')
print(f'Expected by PNT:  ~{round(log(2^512).n() / 2)}')

Exercises

Exercise 1: Trace Miller-Rabin by Hand (Fully Worked)

Problem: Apply the Miller-Rabin test to $n = 221 = 13 \times 17$ with base $a = 174$ .

Solution:

Step 1: Write $n - 1 = 220 = 2^2 \times 55$ , so $s = 2$ and $d = 55$ .

Step 2: Compute $a^d \bmod n = 174^{55} \bmod 221$ .

In [ ]:

# Exercise 1: Fully worked Miller-Rabin trace
n = 221  # = 13 * 17
a = 174
s, d = 2, 55

print(f'n = {n} = {factor(n)}')
print(f'n - 1 = {n-1} = 2^{s} * {d}')
print(f'Verify: 2^{s} * {d} = {2^s * d}\n')

# Step 2: Compute a^d mod n
x0 = power_mod(a, d, n)
print(f'x_0 = a^d mod n = {a}^{d} mod {n} = {x0}')

# Step 3: Square repeatedly
x1 = power_mod(x0, 2, n)
print(f'x_1 = x_0^2 mod n = {x0}^2 mod {n} = {x1}')

print(f'\nAnalysis:')
print(f'  x_0 = {x0} (not 1 and not {n-1})')
print(f'  x_1 = {x1} = n - 1?  {"Yes" if x1 == n-1 else "No"}')

if x0 != 1 and x0 != n - 1 and x1 != n - 1:
    print(f'\nConclusion: n = {n} is COMPOSITE (Miller-Rabin witness found: a = {a})')
else:
    print(f'\nConclusion: a = {a} is not a witness; n passes this round.')

# Verify with our function
print(f'\nVerify: miller_rabin_single({n}, {a}) = {miller_rabin_single(n, a)}')

Exercise 2: Fermat vs Miller-Rabin on Carmichael Numbers (Guided)

Problem: The number $1729 = 7 \times 13 \times 19$ is a Carmichael number (also famous as the Hardy-Ramanujan "taxicab" number). Your tasks:

Verify Korselt's criterion: check that $(p-1) \mid (n-1)$ for each prime factor $p$ .
Show that the Fermat test is fooled by testing 20 random bases.
Show that Miller-Rabin catches it: find a witness base.

In [ ]:

# Exercise 2: Guided, fill in the TODOs
n = 1729
print(f'n = {n} = {factor(n)}')
print()

# Part 1: Verify Korselt's criterion
print('Part 1: Korselt\'s criterion')
for p in [7, 13, 19]:
    # TODO: Check whether (p-1) divides (n-1) and print the result
    # Hint: use (n-1) % (p-1) == 0
    pass

print()

# Part 2: Show Fermat is fooled
print('Part 2: Fermat test')
# TODO: Run fermat_test(n, k=20) several times and print results
# What do you observe?

print()

# Part 3: Find a Miller-Rabin witness
print('Part 3: Miller-Rabin witness search')
# TODO: Loop over bases a = 2, 3, 4, ... and find the first a
#       where miller_rabin_single(n, a) returns False
# Hint: the loop should break as soon as you find a witness

Exercise 3: Error Probability Experiment (Independent)

Problem: Miller-Rabin theory says: for a composite $n$ , at most $1/4$ of bases in $\{2, \ldots, n-2\}$ are non-witnesses (i.e., they fail to detect that $n$ is composite).

Pick the composite number $n = 341 = 11 \times 31$ (a base-2 Fermat pseudoprime).

For every base $a$ from 2 to 339, run miller_rabin_single(341, a) and count how many return True (non-witnesses).
Compute the fraction of non-witnesses. Is it $\le 1/4$ ?
Repeat for $n = 561$ . Is the $1/4$ bound satisfied?

No starter code provided. Write your solution from scratch.

In [ ]:

# Exercise 3: Your solution here

Summary

Concept	Key idea
Trial division	Simple and correct, but exponential complexity $O(2^{k/2})$ in bit-length $k$ makes it useless for cryptographic sizes
Fermat test	Uses the contrapositive of Fermat's Little Theorem: if $a^{n-1} \not\equiv 1$ , then $n$ is composite. Fast, but fatally flawed by Carmichael numbers
Carmichael numbers	Composites (like 561, 1105, 1729) that fool the Fermat test for every coprime base
Miller-Rabin test	Strengthens Fermat by checking for non-trivial square roots of 1. Each round catches a composite with probability $\ge 3/4$ , so $k$ rounds give error $\le 4^{-k}$
SageMath tools	`is_prime()` (deterministic), `is_pseudoprime()` (Miller-Rabin), and `random_prime()` for generating primes
RSA prime generation	Pick random odd numbers and test with Miller-Rabin until a prime is found. The Prime Number Theorem guarantees this happens quickly

Next: RSA Encryption and Decryption, we put primes to work, building the full RSA cryptosystem.