Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/frontier/11-homomorphic-encryption/sage/11a-what-is-fhe.ipynb
483 views
unlisted
Kernel: SageMath 10.5

Notebook 11a: What Is Fully Homomorphic Encryption?

Module 11: Homomorphic Encryption


Motivating Question. Can you compute on data you can't see? Imagine a hospital that wants a cloud server to run a machine-learning model on patient records, but the server should never see the records. Fully Homomorphic Encryption (FHE) makes this possible: you encrypt the data, the server computes on the ciphertexts, and the result decrypts to the correct answer. No one but the key holder ever sees the plaintext.


Prerequisites. You should be comfortable with:

  • Modular arithmetic (Module 01)

  • The Learning With Errors (LWE) problem (Module 08)

Learning objectives. By the end of this notebook you will be able to:

  1. Explain the FHE dream: Enc(a)Enc(b)=Enc(a+b)\text{Enc}(a) \oplus \text{Enc}(b) = \text{Enc}(a + b).

  2. Build a toy symmetric encryption scheme with additive homomorphism.

  3. Observe noise growth as operations accumulate and understand why it limits computation depth.

  4. Describe bootstrapping as the key idea that makes FHE fully homomorphic.

  5. Place FHE in context with a timeline of the four generations.

1. The FHE Dream

Bridge from Module 08. In Module 08 we saw that LWE gives us encryption that's hard to break even for quantum computers. Now we'll exploit the algebraic structure of LWE ciphertexts: adding two ciphertexts adds the underlying messages.

Homomorphic encryption lets you compute on ciphertexts:

Enc(a)Enc(b)=Enc(a+b)\text{Enc}(a) \boxplus \text{Enc}(b) = \text{Enc}(a + b)Enc(a)Enc(b)=Enc(ab)\text{Enc}(a) \boxtimes \text{Enc}(b) = \text{Enc}(a \cdot b)

If a scheme supports both addition and multiplication on ciphertexts, it's fully homomorphic, you can evaluate any function, because addition + multiplication = any arithmetic circuit.

# The dream in pseudocode print("=== The FHE Dream ===") print() print("Alice has secret data: a = 7, b = 3") print("Alice encrypts: c_a = Enc(7), c_b = Enc(3)") print("Alice sends c_a, c_b to Cloud (Cloud sees only ciphertext)") print() print("Cloud computes on ciphertexts:") print(" c_sum = c_a ⊕ c_b (homomorphic addition)") print(" c_prod = c_a ⊗ c_b (homomorphic multiplication)") print("Cloud sends c_sum, c_prod back to Alice") print() print("Alice decrypts:") print(" Dec(c_sum) = 10 (= 7 + 3 ✓)") print(" Dec(c_prod) = 21 (= 7 × 3 ✓)") print() print("Cloud computed the CORRECT answers without ever seeing 7 or 3!")

2. A Toy Symmetric HE Scheme

Let's build the simplest possible homomorphic encryption scheme to see the idea in action.

Secret key: sZqs \in \mathbb{Z}_q (a random secret modulus-qq integer)

Encrypt message mm (small integer, mq|m| \ll q):

  • Choose random aZqa \in \mathbb{Z}_q and small noise e{B,,B}e \in \{-B, \ldots, B\}

  • Output ciphertext (a,  b=as+m+e(modq))(a, \; b = a \cdot s + m + e \pmod{q})

Decrypt ciphertext (a,b)(a, b):

  • Compute v=bas(modq)v = b - a \cdot s \pmod{q} (centered in [q/2,q/2)[-q/2, q/2))

  • Return vv (this equals m+em + e, and if e|e| is small, we recover mm approximately)

q = 1000003 # a large prime modulus B = 5 # noise bound def centered_mod(x, q): """Reduce x modulo q into the range [-q/2, q/2).""" r = x % q if r > q // 2: r -= q return r def keygen(q): """Generate a random secret key.""" return randint(1, q - 1) def encrypt(m, s, q, B): """Encrypt message m with secret key s.""" a = randint(0, q - 1) e = randint(-B, B) b = (a * s + m + e) % q return (a, b, e) # returning e for educational purposes def decrypt(ct, s, q): """Decrypt ciphertext (a, b) with secret key s.""" a, b = ct[0], ct[1] v = centered_mod(b - a * s, q) return v # Generate key and encrypt s = keygen(q) print(f"Secret key: s = {s}") print(f"Modulus: q = {q}") print(f"Noise bound: B = {B}") m = 42 ct = encrypt(m, s, q, B) print(f"\nPlaintext: m = {m}") print(f"Ciphertext: (a={ct[0]}, b={ct[1]})") print(f"Noise: e = {ct[2]}") decrypted = decrypt(ct, s, q) print(f"\nDecrypted: {decrypted}") print(f"Original + noise: {m} + {ct[2]} = {m + ct[2]}") print(f"Match? {decrypted == m + ct[2]}")

Checkpoint 1. Decryption recovers m+em + e (message plus noise). Since ee is tiny compared to qq, we can round to recover mm exactly, as long as the noise stays small. This "noise budget" is the central challenge of FHE.

3. Homomorphic Addition

The magic: if we add two ciphertexts component-wise, the result encrypts the sum of the messages!

ct1=(a1,  a1s+m1+e1)\text{ct}_1 = (a_1, \; a_1 s + m_1 + e_1)ct2=(a2,  a2s+m2+e2)\text{ct}_2 = (a_2, \; a_2 s + m_2 + e_2)ct1+ct2=(a1+a2,  (a1+a2)s+(m1+m2)+(e1+e2))\text{ct}_1 + \text{ct}_2 = (a_1 + a_2, \; (a_1 + a_2)s + (m_1 + m_2) + (e_1 + e_2))

This is a valid encryption of m1+m2m_1 + m_2 with noise e1+e2e_1 + e_2.

def he_add(ct1, ct2, q): """Homomorphic addition: add ciphertexts component-wise.""" a_new = (ct1[0] + ct2[0]) % q b_new = (ct1[1] + ct2[1]) % q noise_new = ct1[2] + ct2[2] # for tracking return (a_new, b_new, noise_new) # Encrypt two messages m1, m2 = 17, 25 ct1 = encrypt(m1, s, q, B) ct2 = encrypt(m2, s, q, B) print(f"m1 = {m1}, noise e1 = {ct1[2]}") print(f"m2 = {m2}, noise e2 = {ct2[2]}") # Homomorphic addition (no secret key needed!) ct_sum = he_add(ct1, ct2, q) decrypted_sum = decrypt(ct_sum, s, q) print(f"\nHomomorphic addition (done WITHOUT the secret key):") print(f" Dec(ct1 + ct2) = {decrypted_sum}") print(f" Expected: m1 + m2 = {m1 + m2}") print(f" Accumulated noise: e1 + e2 = {ct_sum[2]}") print(f" Decrypted = message + noise = {m1 + m2} + {ct_sum[2]} = {m1 + m2 + ct_sum[2]}") print(f" Match? {decrypted_sum == m1 + m2 + ct_sum[2]}")
# Chain many additions to see noise grow messages = [randint(1, 50) for _ in range(20)] cts = [encrypt(m, s, q, B) for m in messages] # Sum all ciphertexts ct_running = cts[0] running_sum = messages[0] for i in range(1, len(messages)): ct_running = he_add(ct_running, cts[i], q) running_sum += messages[i] dec = decrypt(ct_running, s, q) noise = ct_running[2] correct = (dec == running_sum + noise) print(f"\nAfter {len(messages)-1} additions: noise = {ct_running[2]}, still tiny vs q = {q}") print(f"Addition noise grows LINEARLY: each addition adds at most ±{B}.")

4. The Noise Problem: Why Multiplication Is Hard

Addition adds noise linearly (e1+e2e_1 + e_2). But multiplication makes noise grow multiplicatively. In a proper LWE-based scheme, multiplying two ciphertexts with noise e1,e2e_1, e_2 produces noise roughly e1e2e_1 \cdot e_2, which grows exponentially with the number of multiplications.

Let's simulate this with a simple noise-growth model.

# Simulate noise growth for addition vs multiplication initial_noise = 3 # |e| ≤ 3 noise_budget = q // 4 # decryption fails when noise exceeds this print(f"Initial noise: ±{initial_noise}") print(f"Noise budget (max tolerable noise): ±{noise_budget}") print() # Addition: noise grows linearly print("=== Noise Growth: Addition ===") noise_add = initial_noise for i in range(1, 21): noise_add += initial_noise bar = "█" * min(50, int(50 * noise_add / noise_budget)) status = "✓" if noise_add < noise_budget else "✗ OVERFLOW" if i in [1, 5, 10, 15, 20]: print(f" {i} adds: noise ≤ {noise_add} |{bar}| {status}") print() # Multiplication: noise grows exponentially (simplified model) print("=== Noise Growth: Multiplication ===") noise_mul = initial_noise for i in range(1, 21): noise_mul = noise_mul * initial_noise # simplified: noise roughly multiplies bar = "█" * min(50, int(50 * min(noise_mul, noise_budget) / noise_budget)) status = "✓" if noise_mul < noise_budget else "✗ OVERFLOW" if i in [1, 2, 3, 4, 5, 10]: print(f" {i} muls: noise ≤ {noise_mul} |{bar}| {status}") if noise_mul >= noise_budget: print(f" ... noise exceeded budget after {i} multiplications!") break print(f"\nAddition: can do ~{noise_budget // initial_noise} operations") print(f"Multiplication: can only do ~{int(log(noise_budget, initial_noise))} operations") print(f"This is the fundamental challenge of FHE.")

Misconception alert. "FHE means unlimited computation on ciphertexts." Without bootstrapping, you can only do a limited number of operations before the noise overwhelms the message. A scheme that supports limited operations is called leveled FHE. True ("pure") FHE requires bootstrapping to reset the noise.

5. Bootstrapping: The Key to "Fully" Homomorphic

In 2009, Craig Gentry had a breakthrough idea: evaluate the decryption circuit homomorphically.

The trick:

  1. You have a noisy ciphertext ct\text{ct} that's about to overflow.

  2. Encrypt the secret key ss under a new key ss': publish Encs(s)\text{Enc}_{s'}(s).

  3. Homomorphically evaluate Dec(ct,s)\text{Dec}(\text{ct}, s) using Encs(s)\text{Enc}_{s'}(s).

  4. The result is Encs(m)\text{Enc}_{s'}(m), the same message encrypted under ss', with fresh (low) noise!

This "refreshes" the ciphertext, resetting the noise budget. You can then continue computing.

ct (noisy, key s) ──→ Homomorphic Dec ──→ ct' (fresh, key s') ↑ ↑ noise ≈ budget noise ≈ small
# Conceptual demonstration of bootstrapping # (We can't implement real bootstrapping in a toy scheme, but we can show the idea) print("=== Bootstrapping Concept ===") print() print("Without bootstrapping (leveled FHE):") print(" Enc(m) → op → op → op → op → op → ✗ noise overflow!") print() print("With bootstrapping (fully homomorphic):") print(" Enc(m) → op → op → [bootstrap] → op → op → [bootstrap] → op → ...") print(" ↑ noise reset ↑ noise reset") print() print("Each bootstrap is expensive (~1 second per operation in early schemes),") print("but it lets you compute INDEFINITELY.") print() # Simulate noise with bootstrapping noise = initial_noise bootstrap_threshold = noise_budget // 2 # bootstrap when noise gets high bootstrap_cost = initial_noise * 2 # fresh noise after bootstrap n_bootstraps = 0 for i in range(1, 15): noise = noise * initial_noise # multiplication (exponential growth) if noise >= bootstrap_threshold: n_bootstraps += 1 noise = bootstrap_cost # reset noise! else: print(f"\nCompleted 14 multiplications with {n_bootstraps} bootstraps.") print(f"Without bootstrapping, we could only do ~{int(log(noise_budget, initial_noise))}.")

Checkpoint 2. Bootstrapping is what transforms a somewhat homomorphic scheme (limited operations) into a fully homomorphic one (unlimited operations). The requirement is that the scheme can homomorphically evaluate its own decryption circuit, Gentry called this "squashing" the decryption circuit.

6. Four Generations of FHE

FHE has evolved rapidly since Gentry's 2009 breakthrough:

GenYearSchemesKey InnovationPerformance
1st2009GentryBootstrapping! First proof that FHE is possibleMinutes per gate
2nd2011–12BGV, BFVModulus switching replaces expensive bootstrappingSeconds per circuit
3rd2013GSWApproximate eigenvector technique, simpler designImproved asymptotic
4th2017CKKSApproximate arithmetic for real numbersPractical for ML
# Timeline visualization timeline = [ (1978, "RSA", "Rivest asks: can we compute on encrypted data?"), (2009, "Gentry", "First FHE construction! (ideal lattices + bootstrapping)"), (2011, "BGV", "Brakerski-Gentry-Vaikuntanathan: modulus switching"), (2012, "BFV", "Brakerski/Fan-Vercauteren: scale-invariant noise management"), (2013, "GSW", "Gentry-Sahai-Waters: approximate eigenvector, no key-switching"), (2016, "TFHE", "Chillotti et al.: gate-by-gate bootstrapping in <1ms"), (2017, "CKKS", "Cheon-Kim-Kim-Song: approximate arithmetic for ML"), (2020, "Libraries", "Microsoft SEAL, IBM HElib, PALISADE go production"), (2024, "Hardware", "DARPA DPRIVE, Intel/DARPA FHE accelerators"), ] print("=== FHE Timeline ===") print() for year, name, desc in timeline: bar_pos = (year - 1978) // 2 marker = "●" if year >= 2009 else "○" print(f" {year} {marker} {name}: {desc}") print() print("The field went from 'impossible?' (1978) to 'theoretically possible' (2009)") print("to 'practically useful for some applications' (2020+).")

7. FHE vs Other Privacy Technologies

FHE is one of several approaches to privacy-preserving computation:

TechnologyWhat it doesTrust modelPerformance
FHECompute on encrypted dataSingle key holderSlow (10-1000x overhead)
MPCMultiple parties compute togetherThreshold trustModerate overhead, communication-heavy
ZK ProofsProve computation was correctProver knows witnessFast verification
TEE (e.g., SGX)Hardware-isolated computationTrust hardware vendorNear-native speed

These are complementary, not competing:

  • FHE + ZK: FHE keeps inputs encrypted while computing; ZK proves the computation was done correctly. Together they give verifiable computation on private data.

  • FHE + MPC: distributed key generation, threshold decryption

  • FHE vs ZK: ZK hides the witness but the computation is public; FHE hides the data but the computation is known to the server. They solve different privacy problems.

  • All three together: verifiable computation on private, distributed data

# Performance comparison (approximate orders of magnitude) print("=== Performance Overhead (approximate) ===") print() operations = [ ("Integer addition", "1 ns", "1 μs", "~1,000x"), ("Integer multiply", "1 ns", "10 μs", "~10,000x"), ("AES evaluation", "1 μs", "1 s", "~1,000,000x"), ("ML inference", "10 ms", "10 min", "~60,000x"), ("DB query (simple)", "1 ms", "10 s", "~10,000x"), ] for op, plain, fhe, overhead in operations: print() print("FHE is MUCH slower than plaintext computation.") print("But for privacy-critical applications, the overhead is worth it.") print("And the gap is closing fast, 2024 FHE is >1000x faster than 2011 FHE.")

Checkpoint 3. FHE is not a silver bullet, it's orders of magnitude slower than plaintext computation. But for applications where the alternative is not computing at all (because the data is too sensitive), even 10,000x overhead is acceptable. The field is rapidly closing the performance gap.

Crypto foreshadowing. The next notebook explores partially homomorphic schemes (RSA, Paillier) that support only one operation. These are much faster and already widely deployed. We'll then build up to the full BGV, BFV, and CKKS schemes in subsequent notebooks.

8. Exercises

Exercise 1 (Worked): Maximum Additions

Problem. Using our toy scheme with q=1000003q = 1000003 and B=5B = 5, how many homomorphic additions can we perform before the accumulated noise exceeds q/4q/4?

Solution:

# Exercise 1: Worked solution q_ex = 1000003 B_ex = 5 budget = q_ex // 4 print(f"Modulus: q = {q_ex}") print(f"Noise bound per ciphertext: B = {B_ex}") print(f"Noise budget: q/4 = {budget}") print() # Each addition adds at most B noise # After k additions: noise ≤ (k+1) * B max_additions = budget // B_ex - 1 print(f"After k additions of fresh ciphertexts:") print(f" Total noise ≤ (k + 1) × B = (k + 1) × {B_ex}") print(f" Constraint: (k + 1) × {B_ex} < {budget}") print(f" k < {budget // B_ex} - 1 = {max_additions}") print(f"\nWe can perform up to {max_additions:,} additions safely.") print(f"That's a LOT of additions, addition is 'cheap' in noise budget.")

Exercise 2 (Guided): Noise Growth Experiment

Problem. Encrypt 100 random messages with our toy scheme. Sum them homomorphically and verify the result matches the plaintext sum.

Fill in the TODOs:

# Exercise 2: fill in the TODOs # TODO 1: Generate 100 random messages between 1 and 100 # msgs = [randint(1, 100) for _ in range(100)] # TODO 2: Encrypt all messages # encrypted = [encrypt(m, s, q, B) for m in msgs] # TODO 3: Sum all ciphertexts homomorphically # ct_total = encrypted[0] # for i in range(1, len(encrypted)): # ct_total = he_add(ct_total, encrypted[i], q) # TODO 4: Decrypt and compare with plaintext sum # dec_total = decrypt(ct_total, s, q) # true_sum = sum(msgs) # noise = ct_total[2] # print(f"Plaintext sum: {true_sum}") # print(f"Decrypted (with noise): {dec_total}") # print(f"Total noise: {noise}") # print(f"Recovered message: {dec_total - noise}") # print(f"Match? {dec_total - noise == true_sum}")

Exercise 3 (Independent): Noise Budget Analysis

Problem.

  1. In a scheme where multiplication noise grows as e1e2e_1 \cdot e_2, starting from initial noise B=2B = 2, how many sequential multiplications can you perform before exceeding a budget of 2402^{40}?

  2. If each bootstrapping reduces noise back to B=2B = 2 but costs the equivalent of 100 multiplications in wall-clock time, how much does bootstrapping increase the total computation time for a circuit of depth 20?

  3. Why might CKKS (approximate arithmetic) tolerate more noise than BFV (exact arithmetic)?

# Exercise 3: write your solution here

Summary

ConceptKey Fact
Homomorphic encryptionCompute on ciphertexts: Enc(a)Enc(b)=Enc(a+b)\text{Enc}(a) \oplus \text{Enc}(b) = \text{Enc}(a+b)
NoiseEvery operation adds noise; too much noise → decryption fails
Addition noiseGrows linearly: e1+e2e_1 + e_2 (cheap)
Multiplication noiseGrows multiplicatively: e1e2\sim e_1 \cdot e_2 (expensive)
Leveled FHESupports a bounded number of operations (no bootstrapping)
BootstrappingHomomorphically decrypt to refresh noise → unlimited operations
Fully homomorphicLeveled + bootstrapping = compute any circuit

FHE is the most ambitious goal in cryptography: compute anything on encrypted data. The field has gone from theoretical curiosity (2009) to practical deployment (2020s). In the next notebooks, we'll build concrete schemes step by step.


Next: 11b: Partially Homomorphic Schemes