GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/04-number-theory-rsa/break/fermat-factorization-close-primes.ipynb
⁴⁸³ views

unlisted

Kernel: SageMath 10.0

Break: Fermat Factorization When Primes Are Close

Module 04 | Breaking Weak Parameters

Factor $n$ when $|p - q|$ is small using Fermat's method.

Why This Matters

RSA's security rests on the difficulty of factoring $n = pq$ . But how $p$ and $q$ are chosen matters enormously.

If $p$ and $q$ are close together --- say, chosen from the same narrow range, or generated by a buggy random number generator --- then $n$ can be factored almost instantly using a method Pierre de Fermat described in 1643.

This is not a theoretical curiosity. Real-world RSA keys have been broken this way:

ROCA vulnerability (2017): Infineon smart cards generated primes from a restricted set, making them factorable.
Poorly seeded PRNGs in embedded devices can produce near-identical primes.

The Scenario: Close Primes

We generate $n = p \cdot q$ where $p$ and $q$ are deliberately chosen to be close together. The attacker sees only $n$ and must factor it.

In [ ]:

# === Generate n = p * q with close primes ===

# Start from a base value and find two primes near it
set_random_seed(42)
base = 2^80  # ~80-bit primes, so n ~ 160 bits

# Find p: first prime above base
p = next_prime(base + ZZ.random_element(1000))
# Find q: a prime very close to p
q = next_prime(p + ZZ.random_element(10, 5000))

n = p * q

print(f'p = {p}')
print(f'q = {q}')
print(f'n = p * q = {n}')
print()
print(f'p has {p.nbits()} bits')
print(f'q has {q.nbits()} bits')
print(f'n has {n.nbits()} bits')
print()
print(f'Gap: |p - q| = {abs(p - q)}')
print(f'sqrt(n)       = {isqrt(n)}')
print(f'|p - q| / sqrt(n) = {RR(abs(p-q)) / RR(isqrt(n)):.6e}')
print()
print('The primes are extremely close relative to their size.')
print('An attacker sees only n. Can they factor it?')

Step 1: Fermat's Idea

Fermat observed that any odd composite $n$ can be written as a difference of two squares:

n = a^2 - b^2 = (a - b)(a + b)

where $a = (p + q)/2$ and $b = (p - q)/2$ (assuming $p > q$ ).

The algorithm:

Start with $a = \lceil\sqrt{n}\rceil$
Compute $b^2 = a^2 - n$
If $b^2$ is a perfect square, then $n = (a - b)(a + b)$ and we're done
Otherwise, increment $a$ and repeat

When $p \approx q$ , we have $a \approx \sqrt{n}$ , so the algorithm terminates after very few iterations.

In [ ]:

# === Verify Fermat's representation ===

# The true values (attacker doesn't know these)
a_true = (p + q) // 2
b_true = (p - q) // 2  # assuming p > q; use abs if needed
if p < q:
    b_true = (q - p) // 2

print(f'True factorization: n = {p} * {q}')
print()
print(f'a = (p + q) / 2 = {a_true}')
print(f'b = |p - q| / 2 = {b_true}')
print(f'a^2 - b^2 = {a_true^2 - b_true^2}')
print(f'n          = {n}')
print(f'Match: {a_true^2 - b_true^2 == n}')
print()
print(f'(a - b) = {a_true - b_true} = {"p" if a_true - b_true == p else "q"}')
print(f'(a + b) = {a_true + b_true} = {"q" if a_true + b_true == q else "p"}')
print()
print(f'ceil(sqrt(n)) = {isqrt(n) + (1 if isqrt(n)^2 < n else 0)}')
print(f'a_true        = {a_true}')
print(f'Distance from sqrt(n) to a: {a_true - isqrt(n)} iterations needed')

Step 2: Implement Fermat's Factorization

In [ ]:

# === Fermat's factorization algorithm ===

def fermat_factor(n, max_iterations=10^7):
    """Factor n using Fermat's method. Returns (p, q, iterations)."""
    a = isqrt(n)
    if a * a == n:
        return a, a, 0  # Perfect square
    a += 1  # Start with ceil(sqrt(n))
    
    for i in range(max_iterations):
        b_squared = a * a - n
        b = isqrt(b_squared)
        
        if b * b == b_squared:  # Is b^2 a perfect square?
            p_found = a + b
            q_found = a - b
            return p_found, q_found, i + 1
        
        a += 1
    
    return None, None, max_iterations  # Failed

# Run the attack!
t0 = walltime()
p_found, q_found, iterations = fermat_factor(n)
t1 = walltime()

print(f'=== FERMAT FACTORIZATION ===')
print(f'Iterations: {iterations}')
print(f'Time: {(t1-t0)*1000:.2f} ms')
print()
print(f'Found: p = {p_found}')
print(f'Found: q = {q_found}')
print(f'p * q = n? {p_found * q_found == n}')
print()

# Sort so we can compare
factors_found = sorted([p_found, q_found])
factors_true = sorted([p, q])
print(f'Matches original primes: {factors_found == factors_true}')

Step 3: How Many Iterations?

The number of iterations Fermat's method needs is approximately:

\text{iterations} \approx \frac{(p - q)^2}{4\sqrt{n}}

When $|p - q|$ is small, this is tiny. When $|p - q| \approx \sqrt{n}$ (as it should be for secure RSA), the number of iterations is $\approx \sqrt{n}/4$ , which is as hard as trial division.

In [ ]:

# === Cost analysis: iterations vs gap size ===

# Generate several n values with different gaps
print(f'{"Gap |p-q|":>20s}  {"Iterations":>12s}  {"Time (ms)":>12s}  {"Predicted":>12s}')

base_p = next_prime(2^60)
gaps = [10, 100, 1000, 10000, 100000, 1000000]

for gap in gaps:
    test_p = base_p
    test_q = next_prime(base_p + gap)
    test_n = test_p * test_q
    actual_gap = abs(test_p - test_q)
    
    t0 = walltime()
    _, _, iters = fermat_factor(test_n)
    t1 = walltime()
    
    predicted = Integer(actual_gap)^2 // (4 * isqrt(test_n))
    print(f'{actual_gap:>20d}  {iters:>12d}  {(t1-t0)*1000:>12.2f}  {predicted:>12d}')

print()
print('As the gap doubles, iterations roughly quadruple (quadratic in gap).')
print('For secure RSA, the gap should be ~ sqrt(n), making iterations ~ sqrt(n).')

In [ ]:

# === Visualization: iterations vs gap ===

data_points = []
base_p = next_prime(2^50)

for gap_exp in range(1, 14):
    gap = 2^gap_exp
    test_q = next_prime(base_p + gap)
    test_n = base_p * test_q
    _, _, iters = fermat_factor(test_n, max_iterations=10^6)
    if iters < 10^6:
        data_points.append((gap, iters))

if data_points:
    P = list_plot(data_points, scale='loglog', plotjoined=True, marker='o',
                  axes_labels=['|p - q|', 'Iterations'],
                  title='Fermat Factorization: Iterations vs Prime Gap')
    show(P, figsize=6)

The Fix: Ensure $|p - q|$ Is Large

The FIPS 186-4 standard for RSA key generation requires:

|p - q| > 2^{n_{\text{len}}/2 - 100}

where $n_{\text{len}}$ is the bit length of $n$ . For a 2048-bit RSA key, this means:

|p - q| > 2^{924}

This makes Fermat's method require approximately $2^{924}$ iterations --- completely infeasible.

In practice, generating $p$ and $q$ independently and uniformly from the right-sized range almost always satisfies this bound. The primes are only dangerously close if the random number generator is broken.

In [ ]:

# === Safe vs unsafe: compare factoring times ===

# UNSAFE: primes close together
p_unsafe = next_prime(2^80)
q_unsafe = next_prime(p_unsafe + 1000)
n_unsafe = p_unsafe * q_unsafe

t0 = walltime()
_, _, iters_unsafe = fermat_factor(n_unsafe)
t_unsafe = walltime() - t0

# SAFE: primes far apart (independently generated)
p_safe = random_prime(2^81, lbound=2^80)
q_safe = random_prime(2^81, lbound=2^80)
while abs(p_safe - q_safe) < 2^40:  # Ensure they're far apart
    q_safe = random_prime(2^81, lbound=2^80)
n_safe = p_safe * q_safe

t0 = walltime()
_, _, iters_safe = fermat_factor(n_safe, max_iterations=10^6)
t_safe = walltime() - t0

print('=== UNSAFE key (close primes) ===')
print(f'|p - q| = {abs(p_unsafe - q_unsafe)}')
print(f'Fermat iterations: {iters_unsafe}')
print(f'Time: {t_unsafe*1000:.2f} ms')
print()
print('=== SAFE key (independent primes) ===')
print(f'|p - q| = {abs(p_safe - q_safe)}')
print(f'Fermat iterations: {iters_safe} (hit limit = {iters_safe >= 10^6})')
print(f'Time: {t_safe*1000:.2f} ms (gave up after 10^6 iterations)')
print()
print('Properly generated primes make Fermat factorization hopeless.')

Exercises

Vary the gap: Generate keys with gaps of $|p - q| = 2^k$ for $k = 5, 10, 15, 20, 25$ . Plot the number of iterations. Does it match the predicted quadratic relationship?
Larger keys: Try Fermat factorization on 256-bit and 512-bit RSA keys with close primes. How does the constant factor scale with key size?
Comparison with trial division: For a given $n$ with close primes, compare the speed of Fermat's method vs. trial division (trying all primes up to $\sqrt{n}$ ). When is each method faster?

Summary

Component	Role in the Attack
$n = a^2 - b^2$	Fermat's difference-of-squares representation
$a = \lceil\sqrt{n}\rceil$	Starting point, close to $(p+q)/2$ when primes are close
Perfect square test	Check if $a^2 - n$ is a perfect square at each step
Close primes	Make the starting $a$ very close to the true $(p+q)/2$

Key takeaways:

Fermat's factorization method (1643) factors $n = pq$ in $O((p-q)^2 / \sqrt{n})$ steps.
When $|p - q|$ is small, factoring is trivial --- a few iterations suffice.
FIPS 186-4 mandates $|p - q| > 2^{n_{\text{len}}/2 - 100}$ to prevent this.
Independent, uniform prime generation naturally produces primes that are far apart.
The lesson: RSA security depends not just on key size, but on how the primes are generated.

Back to Module 04: Number Theory and RSA

Break: Fermat Factorization When Primes Are Close

Why This Matters

The Scenario: Close Primes

Step 1: Fermat's Idea

Step 2: Implement Fermat's Factorization

Step 3: How Many Iterations?

The Fix: Ensure $|p - q|$ Is Large

Exercises

Summary

Product

Resources

Company

Break: Fermat Factorization When Primes Are Close

Why This Matters

The Scenario: Close Primes

Step 1: Fermat's Idea

Step 2: Implement Fermat's Factorization

Step 3: How Many Iterations?

The Fix: Ensure ∣p−q∣|p - q|∣p−q∣ Is Large

Exercises

Summary

The Fix: Ensure $|p - q|$ Is Large