Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/04-number-theory-rsa/sage/04a-divisibility-gcd-euclid.ipynb
483 views
unlisted
Kernel: SageMath 10.5

Notebook 04a: Divisibility, GCD, and the Euclidean Algorithm

Module 04. Number Theory and RSA


Motivating Question. Computing gcd(1000000007,  999999937)\gcd(1000000007,\; 999999937) by listing all divisors of each number would require checking up to a billion candidates for each. Is there a shortcut that finds the answer in just a handful of steps?

The Euclidean algorithm, over 2300 years old, answers yes. It is one of the oldest algorithms still in daily use, and it sits at the heart of RSA key generation.


Prerequisites. You should be comfortable with:

  • Modular arithmetic and residues (Module 01)

  • The idea of units in Z/nZ\mathbb{Z}/n\mathbb{Z} (Module 01/02): an element aa is a unit iff it has a multiplicative inverse mod nn

Learning objectives. By the end of this notebook you will be able to:

  1. State the definition of divisibility and verify its key properties.

  2. Compute the GCD of two integers using the Euclidean algorithm by hand and in SageMath.

  3. Explain why the algorithm works (the key invariant).

  4. Relate coprimality (gcd(a,n)=1\gcd(a,n)=1) to the existence of modular inverses.

  5. Appreciate why the Euclidean algorithm is efficient: O(log(min(a,b)))O(\log(\min(a,b))) steps.

1. Divisibility

Definition. Let a,bZa, b \in \mathbb{Z} with a0a \neq 0. We say aa divides bb, written aba \mid b, if there exists an integer kk such that b=ka.b = k \cdot a.

Equivalently, bb is a multiple of aa, or aa is a factor (divisor) of bb.

Examples:

  • 3123 \mid 12 because 12=4312 = 4 \cdot 3.

  • 707 \mid 0 because 0=070 = 0 \cdot 7. (Every nonzero integer divides 0!)

  • 5135 \nmid 13 because there is no integer kk with 13=5k13 = 5k.

Key properties:

PropertyStatementWhy?
Reflexiveaaa \mid aBecause a=1aa = 1 \cdot a
TransitiveIf aba \mid b and bcb \mid c then aca \mid cc=lb=l(ka)=(lk)ac = lb = l(ka) = (lk)a
Antisymmetric (on positives)If aba \mid b and bab \mid a with a,b>0a,b > 0 then a=ba = bb=kab = ka and a=lba = lb, so a=lkaa = lka, giving lk=1lk=1, so l=k=1l=k=1
# SageMath's divides() method: a.divides(b) checks whether a | b print("3 | 12?", ZZ(3).divides(12)) # True print("7 | 0 ?", ZZ(7).divides(0)) # True (every nonzero integer divides 0) print("5 | 13?", ZZ(5).divides(13)) # False # List all positive divisors of 60 n = 60 print(f"\nPositive divisors of {n}: {divisors(n)}") print(f"Number of divisors: {len(divisors(n))}")

Checkpoint 1. Before running the cell below, predict: does 1212 divide 144144? Does 1212 divide 150150? What are the divisors of 2828?

# Check your predictions print("12 | 144?", ZZ(12).divides(144)) print("12 | 150?", ZZ(12).divides(150)) print("Divisors of 28:", divisors(28))

2. Common Divisors and the GCD

Definition. A common divisor of integers aa and bb is any integer dd such that dad \mid a and dbd \mid b.

The greatest common divisor gcd(a,b)\gcd(a, b) is the largest positive common divisor.

Example. Let a=48a = 48, b=18b = 18.

  • Divisors of 48: {1,2,3,4,6,8,12,16,24,48}\{1, 2, 3, 4, 6, 8, 12, 16, 24, 48\}

  • Divisors of 18: {1,2,3,6,9,18}\{1, 2, 3, 6, 9, 18\}

  • Common divisors: {1,2,3,6}\{1, 2, 3, 6\}

  • gcd(48,18)=6\gcd(48, 18) = 6

This "list all divisors" approach works for small numbers. But what about gcd(1000000007,999999937)\gcd(1000000007, 999999937)? Listing a billion divisors is not practical. We need something smarter.

# The brute-force way: find GCD by listing all divisors a, b = 48, 18 div_a = set(divisors(a)) div_b = set(divisors(b)) common = sorted(div_a & div_b) # set intersection print(f"Divisors of {a}: {sorted(div_a)}") print(f"Divisors of {b}: {sorted(div_b)}") print(f"Common divisors: {common}") print(f"gcd({a}, {b}) = {max(common)}") print(f"\nSageMath built-in: gcd({a}, {b}) = {gcd(a, b)}")

3. The Euclidean Algorithm

3.1 The Key Insight

The Euclidean algorithm rests on a single, beautiful fact:

Theorem. For any integers a,ba, b with b>0b > 0: gcd(a,b)=gcd(b,  amodb).\gcd(a, b) = \gcd(b, \; a \bmod b).

Why does this work? Write a=qb+ra = qb + r where r=amodbr = a \bmod b. We claim that aa and bb have exactly the same set of common divisors as bb and rr.

  • Forward direction: Suppose dad \mid a and dbd \mid b. Then d(aqb)=rd \mid (a - qb) = r. So dbd \mid b and drd \mid r.

  • Backward direction: Suppose dbd \mid b and drd \mid r. Then d(qb+r)=ad \mid (qb + r) = a. So dad \mid a and dbd \mid b.

Since the set of common divisors is identical, the greatest common divisor is the same. And crucially, r<br < b, so the numbers get strictly smaller each step.

3.2 The Algorithm

Repeatedly apply gcd(a,b)=gcd(b,amodb)\gcd(a,b) = \gcd(b, a \bmod b) until the remainder is 0. The last nonzero remainder is the GCD.


Misconception alert. "The Euclidean algorithm tries all possible divisors." No! It never tests a single divisor. It uses division (specifically, the remainder) to shrink the problem at each step. That is why it is so fast.

3.3 Worked Example: gcd(252,105)\gcd(252, 105)

Let us trace every step by hand first:

StepEquationRemainder
1252=2105+42252 = 2 \cdot 105 + 424242
2105=242+21105 = 2 \cdot 42 + 212121
342=221+042 = 2 \cdot 21 + 000

The last nonzero remainder is 21\mathbf{21}, so gcd(252,105)=21\gcd(252, 105) = 21.

Notice how the chain of equalities shows: gcd(252,105)=gcd(105,42)=gcd(42,21)=gcd(21,0)=21.\gcd(252, 105) = \gcd(105, 42) = \gcd(42, 21) = \gcd(21, 0) = 21.

Only 3 steps, no trial division needed!

# A traced Euclidean algorithm that shows every step def traced_gcd(a, b): """ Compute gcd(a, b) via the Euclidean algorithm, printing each division step. """ print(f"Computing gcd({a}, {b})") step = 1 while b != 0: q, r = a // b, a % b print(f"Step {step}: {a} = {q} * {b} + {r}") a, b = b, r step += 1 print(f"GCD = {a} (last nonzero remainder)") return a # Trace the worked example traced_gcd(252, 105);

Checkpoint 2. Before running the next cell, trace gcd(546,182)\gcd(546, 182) by hand on paper. How many steps does it take? What is the GCD?

# Check your hand trace traced_gcd(546, 182);

3.4 Why It's Fast: O(log(min(a,b)))O(\log(\min(a,b))) Steps

The Euclidean algorithm is remarkably efficient. The key observation is that every two consecutive steps reduce the larger number by at least half:

After two steps, the pair (a,b)(a, b) is replaced by a pair where both numbers are strictly less than bb, and specifically amodb<b/2a \bmod b < b/2 after at most two remainder operations.

This means the number of steps is at most 2log2(min(a,b))+12 \cdot \lfloor \log_2(\min(a,b)) \rfloor + 1, i.e., logarithmic in the input size.

For a 1000-digit number, the algorithm needs at most about 6600 steps, not billions!

The worst case is consecutive Fibonacci numbers. Let us verify experimentally.

# Count the number of steps for various inputs def gcd_step_count(a, b): """Return (gcd, number_of_steps) for the Euclidean algorithm.""" steps = 0 while b != 0: a, b = b, a % b steps += 1 return a, steps # Typical case print("Typical cases:") for (a, b) in [(252, 105), (1000000007, 999999937), (2**100, 3**63)]: g, s = gcd_step_count(a, b) print(f" gcd({a}, {b}) = {g} [{s} steps]") # Worst case: consecutive Fibonacci numbers print("\nWorst case (consecutive Fibonacci numbers):") for k in [10, 20, 30, 50]: fk = fibonacci(k) fk1 = fibonacci(k + 1) g, s = gcd_step_count(fk1, fk) print(f" gcd(F_{k+1}, F_{k}) = gcd({fk1}, {fk}) = {g} [{s} steps]")

Notice that for Fibonacci inputs, the number of steps equals k1k-1: the algorithm peels off one Fibonacci number per step, which is the slowest possible shrinkage. Any other input shrinks faster.

Let us now answer the motivating question.

# Answering the motivating question traced_gcd(1000000007, 999999937);

4. Coprimality and Modular Inverses

Definition. Two integers aa and bb are coprime (or relatively prime) if gcd(a,b)=1\gcd(a, b) = 1.

Bridge from Modules 01 and 02. Remember that in Module 01, we saw that the units (invertible elements) of Z/nZ\mathbb{Z}/n\mathbb{Z} are exactly those residues aa with gcd(a,n)=1\gcd(a, n) = 1. In Module 02, we formalised this: coprimality determines unit status in the ring Z/nZ\mathbb{Z}/n\mathbb{Z}.

Now we can state this precisely:

Theorem. aa has a multiplicative inverse modulo nn if and only if gcd(a,n)=1\gcd(a, n) = 1.

This is why coprimality matters for cryptography. In the next notebook (04b), we will see that the Extended Euclidean Algorithm does not just check coprimality, it actually computes the inverse.


Crypto foreshadowing. In RSA, we pick two large primes p,qp, q, set n=pqn = pq, and compute φ(n)=(p1)(q1)\varphi(n) = (p-1)(q-1). The public exponent ee must satisfy gcd(e,φ(n))=1\gcd(e, \varphi(n)) = 1 so that ee has an inverse dd modulo φ(n)\varphi(n). That inverse dd is the private key. The extended Euclidean algorithm (notebook 04b) is what actually computes dd.

# Coprimality and units in Z/nZ n = 15 print(f"Units of Z/{n}Z (elements coprime to {n}):") units = [a for a in range(1, n) if gcd(a, n) == 1] print(f" {units}") print(f" (There are {len(units)} units)") print(f"\nNon-units of Z/{n}Z (elements sharing a factor with {n}):") non_units = [a for a in range(1, n) if gcd(a, n) != 1] for a in non_units: print(f" gcd({a}, {n}) = {gcd(a, n)}, not invertible") # Verify: each unit really does have an inverse R = Zmod(n) print(f"\nVerification, each unit has an inverse mod {n}:") for a in units: inv_a = R(a)^(-1) print(f" {a} * {inv_a} = {a * int(inv_a)}{(a * int(inv_a)) % n} (mod {n})")

Checkpoint 3. Consider n=12n = 12. Which elements of {1,2,,11}\{1, 2, \ldots, 11\} are coprime to 12? How many units does Z/12Z\mathbb{Z}/12\mathbb{Z} have? Predict first, then run the cell below.

# Check your prediction n = 12 units_12 = [a for a in range(1, n) if gcd(a, n) == 1] print(f"Units of Z/{n}Z: {units_12}") print(f"Count: {len(units_12)}") print(f"(This count is Euler's totient: phi({n}) = {euler_phi(n)})")

5. SageMath GCD and LCM Toolkit

SageMath provides several built-in functions for divisibility and GCD computations. Let us survey the most useful ones.

# gcd and lcm, basics a, b = 252, 105 print(f"gcd({a}, {b}) = {gcd(a, b)}") print(f"lcm({a}, {b}) = {lcm(a, b)}") # Fundamental identity: gcd(a,b) * lcm(a,b) = |a * b| print(f"\nFundamental identity check:") print(f" gcd * lcm = {gcd(a,b)} * {lcm(a,b)} = {gcd(a,b) * lcm(a,b)}") print(f" |a * b| = {abs(a * b)}") print(f" Equal? {gcd(a,b) * lcm(a,b) == abs(a*b)}") # GCD of more than two numbers print(f"\ngcd(12, 18, 24) = {gcd([12, 18, 24])}") # xgcd returns (g, s, t) where g = s*a + t*b (preview of 04b!) g, s, t = xgcd(252, 105) print(f"\nExtended GCD preview: xgcd(252, 105) = ({g}, {s}, {t})") print(f" Check: {s}*252 + {t}*105 = {s*252 + t*105}")

6. Exercises

Exercise 1 (Worked): Trace the Euclidean Algorithm

Problem. Compute gcd(270,192)\gcd(270, 192) by hand, showing each division step. Verify with SageMath.

Solution.

StepEquationRemainder
1270=1192+78270 = 1 \cdot 192 + 787878
2192=278+36192 = 2 \cdot 78 + 363636
378=236+678 = 2 \cdot 36 + 666
436=66+036 = 6 \cdot 6 + 000

So gcd(270,192)=6\gcd(270, 192) = 6.

# Exercise 1, verification traced_gcd(270, 192) print() print(f"SageMath confirms: gcd(270, 192) = {gcd(270, 192)}")

Exercise 2 (Guided): Coprimality Check for RSA

Problem. In a toy RSA setup, suppose p=11p = 11, q=23q = 23, so n=pq=253n = pq = 253 and φ(n)=(p1)(q1)=220\varphi(n) = (p-1)(q-1) = 220. A common choice for the public exponent is e=65537e = 65537. Check whether gcd(e,φ(n))=1\gcd(e, \varphi(n)) = 1 so that ee is a valid RSA exponent.

Then find two values of ee in {2,3,,20}\{2, 3, \ldots, 20\} that are coprime to 220220, and two that are not.

Hint: Use gcd() in a list comprehension to filter.

# Exercise 2, fill in the TODOs p, q = 11, 23 n = p * q phi_n = (p - 1) * (q - 1) print(f"n = {n}, phi(n) = {phi_n}") # TODO 1: Check if e = 65537 is coprime to phi_n e = 65537 # print(f"gcd({e}, {phi_n}) = ???") # print(f"Valid RSA exponent? ???") # TODO 2: Find all e in {2, ..., 20} coprime to phi_n # coprime_candidates = [e for e in range(2, 21) if ???] # print(f"Values coprime to {phi_n}: {coprime_candidates}") # TODO 3: Find all e in {2, ..., 20} NOT coprime to phi_n # not_coprime = [e for e in range(2, 21) if ???] # print(f"Values NOT coprime to {phi_n}: {not_coprime}")

Exercise 3 (Independent): Worst-Case Analysis

Problem.

  1. Write a function gcd_with_steps(a, b) that returns a tuple (gcd_value, step_count, step_list) where step_list is a list of strings showing each division step.

  2. Use it to compute gcd(F25,F24)\gcd(F_{25}, F_{24}) where FkF_k is the kk-th Fibonacci number.

  3. Compare the step count to the theoretical bound 2log2(min(a,b))+12 \lfloor \log_2(\min(a,b)) \rfloor + 1. Is the Fibonacci case close to the bound?

# Exercise 3, write your solution here

Summary

ConceptKey Fact
Divisibilityaba \mid b means b=kab = ka; reflexive, transitive, antisymmetric (on positives)
GCDgcd(a,b)\gcd(a,b) = largest positive common divisor
Euclidean algorithmUses gcd(a,b)=gcd(b,amodb)\gcd(a,b) = \gcd(b, a \bmod b) repeatedly; terminates when remainder = 0
CorrectnessAny common divisor of a,ba,b also divides amodba \bmod b (and vice versa)
EfficiencyO(log(min(a,b)))O(\log(\min(a,b))) steps, worst case is Fibonacci inputs
Coprimalitygcd(a,n)=1\gcd(a,n) = 1 iff aa is a unit (has inverse) in Z/nZ\mathbb{Z}/n\mathbb{Z}

We now have the tool to check coprimality. But for RSA, we need to compute the inverse d=e1modφ(n)d = e^{-1} \bmod \varphi(n). That requires the Extended Euclidean Algorithm, which is the subject of the next notebook.


Next: 04b. The Extended Euclidean Algorithm and Modular Inverses