Path: blob/main/frontier/08-lattices-post-quantum/sage/08d-learning-with-errors.ipynb
Learning With Errors
Module 08 | 08-lattices-post-quantum
LWE definition, noise, search vs decision
Objectives
By the end of this notebook you will be able to:
Define the Learning With Errors (LWE) problem and explain why adding noise to a linear system transforms it from trivially solvable to conjecturally hard.
Construct LWE instances in SageMath and experimentally verify that Gaussian elimination fails in the presence of noise.
Distinguish between Search-LWE and Decision-LWE, and explain their relationship.
Analyze how the parameters control security, and connect LWE hardness to the lattice problems (SVP/CVP) studied in earlier notebooks.
Foreshadow how LWE underpins Kyber (ML-KEM) and other post-quantum schemes.
Prerequisites
Completion of The LLL Algorithm.
Familiarity with lattices, SVP/CVP, and the fact that LLL provides approximate solutions but cannot solve worst-case SVP/CVP exactly.
Comfort with matrices and vectors over $\mathbb{Z}_q$ (modular arithmetic from Module 01).
Motivating Question
Solving a system of linear equations is easy --- Gaussian elimination does it in polynomial time. What if I add a tiny bit of noise to every equation? Suddenly it becomes one of the hardest problems in mathematics.
This is the central miracle of the Learning With Errors problem. A system $A\mathbf{s} = \mathbf{b}$ is trivial. A system $A\mathbf{s} + \mathbf{e} = \mathbf{b}$ (where $\mathbf{e}$ is "small") appears to be almost the same thing --- yet no known algorithm can solve it efficiently when the parameters are chosen correctly.
Bridge from 08c: In the previous notebook, you saw that LLL can find short lattice vectors and approximately solve CVP. You might hope that LLL could strip away the noise in LWE. We will see that for properly chosen parameters, even LLL (and its stronger variants like BKZ) cannot recover the secret. This is precisely why LWE is the foundation of post-quantum cryptography.
1. Setup: Linear Systems Without Noise
Let us start with something familiar. We pick a secret vector $\mathbf{s} \in \mathbb{Z}_q^n$ and a random matrix $A \in \mathbb{Z}_q^{m \times n}$, and compute $\mathbf{b} = A\mathbf{s} \bmod q$.
This is just a system of $m$ linear equations in $n$ unknowns over a finite field (when $q$ is prime). Gaussian elimination solves it instantly.
No surprise: a linear system over $\mathbb{Z}_q$ is easy. Gaussian elimination recovers $\mathbf{s}$ exactly.
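The noiseless case can be reproduced in plain Python (the notebook itself uses SageMath; this sketch needs only the standard library). The helper `solve_mod` is a hypothetical name for a small Gaussian elimination routine over $\mathbb{Z}_q$, and the matrix is fixed to a Vandermonde matrix so the system is guaranteed invertible:

```python
# Noiseless linear system over Z_q: Gaussian elimination recovers s exactly.
# Plain-Python sketch; the notebook itself uses SageMath.

def solve_mod(A, b, q):
    """Solve the square system A x = b (mod q) by Gaussian elimination, q prime."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] % q)  # nonzero pivot
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, q)  # modular inverse exists since q is prime
        M[col] = [x * inv % q for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] % q:
                f = M[r][col]
                M[r] = [(x - f * y) % q for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

q, n = 97, 4
s = [17, 42, 5, 88]                                              # the secret
A = [[pow(i, j, q) for j in range(n)] for i in range(1, n + 1)]  # Vandermonde: invertible
b = [sum(a * x for a, x in zip(row, s)) % q for row in A]
print(solve_mod(A, b, q) == s)  # True: exact recovery without noise
```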
Now let us see what happens when we add noise.
2. The LWE Problem: Adding Noise
Definition (LWE). Fix parameters $n$ (dimension), $q$ (modulus), and a noise distribution $\chi$ (typically a discrete Gaussian with standard deviation $\sigma$). The LWE problem is:
Given $(A, \mathbf{b})$ with $A \in \mathbb{Z}_q^{m \times n}$ uniformly random, $\mathbf{b} = A\mathbf{s} + \mathbf{e} \bmod q$, and $\mathbf{e} \leftarrow \chi^m$ (each entry is small), find $\mathbf{s} \in \mathbb{Z}_q^n$.
The key point: $\mathbf{e}$ is small relative to $q$. Each entry of $\mathbf{e}$ is typically within a few standard deviations of zero with high probability, while $q$ can be much larger.
Checkpoint
Predict before running the next cell: If we apply Gaussian elimination to the noisy system $A\mathbf{s} + \mathbf{e} = \mathbf{b}$, will we recover $\mathbf{s}$?
Think about it: the system is $A\mathbf{s} + \mathbf{e} = \mathbf{b}$, but Gaussian elimination "thinks" it is solving $A\mathbf{x} = \mathbf{b}$. It will find some solution $\mathbf{x}$, but will it be $\mathbf{s}$?
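To check the prediction, here is a minimal plain-Python sketch (the notebook uses SageMath) with rounded-Gaussian errors. The `solve_mod` helper is the same hypothetical Gaussian elimination routine as before:

```python
# With noise, Gaussian elimination returns a wrong "solution": it exactly
# solves A x = b, but b no longer equals A s.
import random

def solve_mod(A, b, q):
    """Solve the square system A x = b (mod q) by Gaussian elimination, q prime."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] % q)
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, q)
        M[col] = [x * inv % q for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] % q:
                f = M[r][col]
                M[r] = [(x - f * y) % q for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

random.seed(1)
q, n, sigma = 97, 4, 3.0
s = [17, 42, 5, 88]
A = [[pow(i, j, q) for j in range(n)] for i in range(1, n + 1)]  # invertible Vandermonde
e = [round(random.gauss(0, sigma)) for _ in range(n)]
if all(v == 0 for v in e):
    e[0] = 1          # guard: make sure at least one equation is actually noisy
b = [(sum(a * x for a, x in zip(row, s)) + e[i]) % q for i, row in enumerate(A)]
x = solve_mod(A, b, q)
print(x == s)  # False: elimination solved the WRONG (noiseless) system exactly
```

Since $A$ is invertible and $\mathbf{e} \neq \mathbf{0}$, the answer is $\mathbf{x} = \mathbf{s} + A^{-1}\mathbf{e} \neq \mathbf{s}$.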
This is the core insight:
| System | Difficulty |
|---|---|
| $A\mathbf{s} = \mathbf{b}$ | Trivial (Gaussian elimination) |
| $A\mathbf{s} + \mathbf{e} = \mathbf{b}$ | Conjectured hard (even for quantum computers) |

A "tiny" perturbation transforms the problem from polynomial to (conjectured) exponential.
Misconception alert: "LWE is just solving noisy equations, so just round to remove the noise." Rounding works in very low dimensions (try it for $n = 1$!), but in high dimensions the errors accumulate through the matrix operations. Gaussian elimination amplifies the noise catastrophically --- by the time you finish back-substitution, the errors have grown to fill all of $\mathbb{Z}_q$, leaving you with a random vector.
3. Visualizing the Noise
Let us see what LWE "looks like." We will generate many LWE samples and plot the residuals $b_i - \langle \mathbf{a}_i, \mathbf{s} \rangle \bmod q$. Without noise, these are all zero. With noise, they form a cluster around zero.
The scatter plot shows that each noisy observation $b_i$ is close to the true value $\langle \mathbf{a}_i, \mathbf{s} \rangle \bmod q$, but not exactly equal. The red dashed line is the noiseless case. The blue dots scatter around it --- that scatter IS the LWE noise.
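The computation behind the plot can be sketched without any plotting library: the residuals, once centered into $(-q/2, q/2]$, are exactly the noise values $e_i$ (a plain-Python sketch; the notebook uses SageMath):

```python
# Residuals b_i - <a_i, s> mod q, centered into (-q/2, q/2], recover the
# noise vector e exactly (as long as every |e_i| < q/2).
import random

random.seed(2)
q, n, m, sigma = 97, 4, 12, 3.0
s = [random.randrange(q) for _ in range(n)]
A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
e = [round(random.gauss(0, sigma)) for _ in range(m)]
b = [(sum(a * x for a, x in zip(row, s)) + e[i]) % q for i, row in enumerate(A)]

def center(x, q):
    """Map x mod q into the symmetric range (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x

residuals = [center(b[i] - sum(a * x for a, x in zip(A[i], s)), q) for i in range(m)]
print(residuals == e)  # True: the residuals ARE the noise
```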
4. Decision-LWE: Can You Tell Noise from Random?
There is a second, equally important formulation of LWE:
Decision-LWE. Given $(A, \mathbf{b})$, distinguish between:
$\mathbf{b} = A\mathbf{s} + \mathbf{e} \bmod q$ (LWE samples), and
$\mathbf{b} \in \mathbb{Z}_q^m$ (uniformly random).
If the noise is large enough and $n$ is large enough, the LWE samples "look random" --- no efficient algorithm can tell them apart from uniform.
Let us build a distinguisher and see when it works and when it fails.
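As a warm-up, here is a toy statistic in plain Python (the notebook uses SageMath): the average magnitude of the centered residuals. For illustration this sketch peeks at the secret $\mathbf{s}$; a real Decision-LWE adversary is not given $\mathbf{s}$, which is exactly what makes the problem hard.

```python
# Toy statistic: average |centered residual|. Small for LWE samples, roughly
# q/4 for uniform ones. We peek at s here purely for illustration.
import random

random.seed(3)
q, n, m, sigma = 97, 8, 200, 3.0

def center(x, q):
    x %= q
    return x - q if x > q // 2 else x

def avg_residual(A, b, s, q):
    return sum(abs(center(bi - sum(a * x for a, x in zip(row, s)), q))
               for row, bi in zip(A, b)) / len(b)

s = [random.randrange(q) for _ in range(n)]
A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
b_lwe = [(sum(a * x for a, x in zip(row, s)) + round(random.gauss(0, sigma))) % q
         for row in A]
b_uni = [random.randrange(q) for _ in range(m)]

print(avg_residual(A, b_lwe, s, q))  # small: on the order of sigma
print(avg_residual(A, b_uni, s, q))  # large: on the order of q/4
```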
Takeaway: When $\sigma$ is small relative to $q$, the LWE distribution has detectable structure and a simple distinguisher works. When $\sigma$ is chosen appropriately (large enough to mask the structure, but small enough that decryption still works), LWE samples become indistinguishable from random.
This is the Decision-LWE assumption: for appropriate parameters, no polynomial-time algorithm can distinguish $(A, A\mathbf{s} + \mathbf{e})$ from $(A, \mathbf{u})$ with non-negligible advantage.
5. Search-LWE: Recovering the Secret
Search-LWE asks: given $(A, \mathbf{b} = A\mathbf{s} + \mathbf{e} \bmod q)$, find $\mathbf{s}$.
A classical result due to Regev (2005) shows that Search-LWE and Decision-LWE are polynomially equivalent when $q$ is polynomial in $n$. So if you can decide, you can search, and vice versa.
Let us try brute force search in a tiny instance to see the structure, then observe how quickly it becomes infeasible.
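A brute-force search can be sketched in plain Python (the notebook uses SageMath): score every candidate secret by the size of its centered residuals and keep the best. The $q^n$ search space is the point --- it explodes immediately.

```python
# Brute-force Search-LWE in a tiny instance: try all q**n candidate secrets
# and keep the one whose centered residuals are smallest.
import itertools
import random

random.seed(4)
q, n, m = 17, 2, 25
s = [random.randrange(q) for _ in range(n)]
A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
e = [random.choice([-1, 0, 1]) for _ in range(m)]   # very small noise
b = [(sum(a * x for a, x in zip(row, s)) + e[i]) % q for i, row in enumerate(A)]

def center(x, q):
    x %= q
    return x - q if x > q // 2 else x

def score(cand):
    """Total magnitude of centered residuals for a candidate secret."""
    return sum(abs(center(bi - sum(a * x for a, x in zip(row, cand)), q))
               for row, bi in zip(A, b))

best = min(itertools.product(range(q), repeat=n), key=score)
print(list(best) == s)  # True here, but the search space has q**n candidates
```

Already at $n = 40$, $q = 97$ this loop would need more iterations than atoms in the observable universe.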
6. Connection to Lattices: LWE as CVP
Why is LWE a "lattice" problem? Consider the $q$-ary lattice
$$\Lambda_q(A) = \{ \mathbf{y} \in \mathbb{Z}^m : \mathbf{y} \equiv A\mathbf{x} \pmod{q} \text{ for some } \mathbf{x} \in \mathbb{Z}^n \}.$$
The vector $A\mathbf{s} \bmod q$ is a lattice point, and $\mathbf{b} = A\mathbf{s} + \mathbf{e}$ is a point near the lattice (displaced by the small error $\mathbf{e}$). So solving LWE is equivalent to solving a Closest Vector Problem (CVP) on this lattice: find the lattice point closest to $\mathbf{b}$.
We already know from 08b-08c that CVP is hard in general, and LLL only gives approximate solutions. LWE inherits this hardness.
Let us verify this connection concretely.
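The displacement claim can be checked numerically in plain Python (the notebook uses SageMath): measured with centered coordinate-wise differences modulo $q$, the squared distance from $\mathbf{b}$ to the lattice point $A\mathbf{s} \bmod q$ is exactly $\|\mathbf{e}\|^2$.

```python
# CVP view of LWE: b sits at distance ||e|| from the lattice point A*s mod q.
import random

random.seed(5)
q, n, m, sigma = 97, 4, 20, 3.0
s = [random.randrange(q) for _ in range(n)]
A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
e = [round(random.gauss(0, sigma)) for _ in range(m)]
lattice_point = [sum(a * x for a, x in zip(row, s)) % q for row in A]
b = [(lattice_point[i] + e[i]) % q for i in range(m)]

def center(x, q):
    x %= q
    return x - q if x > q // 2 else x

dist_sq = sum(center(bi - pi, q) ** 2 for bi, pi in zip(b, lattice_point))
print(dist_sq == sum(ei ** 2 for ei in e))  # True: the displacement is exactly e
```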
7. Parameter Space: When Is LWE Hard?
LWE security depends on three parameters:
| Parameter | Role | Effect on security |
|---|---|---|
| $n$ | Dimension | Larger $n$ = harder (attack cost exponential in $n$) |
| $q$ | Modulus | Must be large enough for correctness, but not too large |
| $\sigma$ | Noise width | Larger noise = harder to solve, but too large breaks decryption |

The critical ratio is $\sigma / q$: if $\sigma$ is too small relative to $q$, the noise is negligible and the system is easy. If $\sigma$ is too large, decryption errors become likely.
Regev's reduction (2005): Worst-case lattice problems (like GapSVP) reduce to average-case LWE when $\alpha q > 2\sqrt{n}$, where $\alpha = \sigma / q$ is the relative noise rate. This is why LWE is so compelling: breaking any LWE instance (even a random one) is as hard as solving worst-case lattice problems.
Let us experimentally see how the dimension $n$ affects the difficulty of a lattice attack.
As $n$ increases, the LLL-based attack starts to fail. For the parameters used in real-world schemes (like Kyber, where $n = 256$ per polynomial), the lattice dimension is in the hundreds or thousands, and no known algorithm --- classical or quantum --- can solve LWE.
Crypto foreshadowing: Kyber (ML-KEM), the NIST-selected post-quantum key encapsulation mechanism, is built on Module-LWE --- a structured variant of LWE where the matrix has a special block structure using polynomial rings. The noise IS the security: without it, Kyber would be trivially breakable by linear algebra. The next notebooks (08e, 08f) show how the ring structure makes this efficient enough for real-world use.
Exercises
Exercise 1: Noise Threshold for Gaussian Elimination (Worked)
Goal: Experimentally find the noise level at which Gaussian elimination stops recovering the secret.
Setup: Fix the dimension $n$, the modulus $q$, and the number of samples $m$. For increasing noise widths $\sigma$, generate LWE instances and try to solve with Gaussian elimination. Measure the fraction of correct recoveries.
Interpretation: At $\sigma = 0$, Gaussian elimination always succeeds. As soon as $\sigma$ grows past zero, the success rate drops sharply. Even tiny noise (relative to $q$) is enough to destroy exact linear algebra. This is the fundamental principle behind LWE security.
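One possible worked sketch in plain Python (the notebook uses SageMath; concrete values $n = 4$, $q = 97$ and the `success_rate` helper are illustrative choices, not the notebook's own):

```python
# Exercise 1 sketch: success rate of Gaussian elimination vs. noise width sigma.
import random

def solve_mod(A, b, q):
    """Gaussian elimination over Z_q for a square system, q prime."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] % q)
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, q)
        M[col] = [x * inv % q for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] % q:
                f = M[r][col]
                M[r] = [(x - f * y) % q for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def success_rate(sigma, trials=30, n=4, q=97):
    """Fraction of trials where elimination recovers s despite the noise."""
    A = [[pow(i, j, q) for j in range(n)] for i in range(1, n + 1)]  # invertible
    wins = 0
    for _ in range(trials):
        s = [random.randrange(q) for _ in range(n)]
        e = [round(random.gauss(0, sigma)) for _ in range(n)]
        b = [(sum(a * x for a, x in zip(row, s)) + e[i]) % q
             for i, row in enumerate(A)]
        wins += solve_mod(A, b, q) == s
    return wins / trials

random.seed(6)
for sigma in [0.0, 0.5, 1.0, 2.0]:
    print(sigma, success_rate(sigma))   # rate is 1.0 at sigma=0, then collapses
```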
Exercise 2: Build a Better Distinguisher (Guided)
The naive distinguisher from Section 4 uses average residual magnitude. A better approach uses the chi-squared statistic: if the residuals are LWE noise, they cluster near zero; if random, they are uniform over $\mathbb{Z}_q$.
Task: Complete the function below to implement a chi-squared distinguisher.
Hints:
Compute residuals $r_i = b_i - \langle \mathbf{a}_i, \hat{\mathbf{s}} \rangle \bmod q$, where $\hat{\mathbf{s}}$ is the Gaussian elimination "solution."
Bin the signed residuals into a histogram.
Compare against the uniform distribution using chi-squared: $\chi^2 = \sum_k (O_k - E_k)^2 / E_k$.
A large $\chi^2$ means the distribution is NOT uniform, suggesting LWE structure.
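A sketch of the statistic itself (hint steps 2-4), tested on synthetic residuals; wiring it to the Gaussian elimination solution from step 1 is left as the exercise. This is one possible plain-Python shape, not the notebook's reference solution:

```python
# Chi-squared statistic: bin signed residuals (centered in (-q/2, q/2]) and
# compare the histogram against the uniform expectation. Large chi^2 => not uniform.
import random

def chi_squared(residuals, q, nbins=8):
    """sum_k (O_k - E_k)^2 / E_k over nbins equal-width bins of (-q/2, q/2]."""
    counts = [0] * nbins
    for r in residuals:
        idx = (r + q // 2) * nbins // q   # map the centered range onto bins 0..nbins-1
        counts[min(idx, nbins - 1)] += 1
    expected = len(residuals) / nbins
    return sum((c - expected) ** 2 / expected for c in counts)

random.seed(7)
q, m = 97, 400
concentrated = [round(random.gauss(0, 3.0)) for _ in range(m)]  # LWE-like noise
uniform = [random.randrange(q) - q // 2 for _ in range(m)]      # uniform residuals
print(chi_squared(concentrated, q))  # large: mass piles into the middle bins
print(chi_squared(uniform, q))       # small: close to the uniform expectation
```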
Exercise 3: LWE Parameter Exploration (Independent)
Task: Write a complete experiment that explores the "security frontier" of LWE.
For each dimension $n$ and noise width $\sigma$ in ranges of your choosing, with $q$ the smallest prime above your chosen bound:
Generate 20 LWE instances.
For each, attempt to recover $\mathbf{s}$ using the LLL + Babai approach from Section 6.
Record the success rate.
Produce a heatmap (or table) showing success rate as a function of $(n, \sigma)$.
Answer these questions:
At what point does the LLL attack stop working?
How does this relate to the ratio $\sigma / q$?
What is the minimum $n$ for which the attack never succeeds (for any $\sigma$ tested)?
Summary
| Concept | Key idea |
|---|---|
| Noise transforms difficulty | Without noise, $A\mathbf{s} = \mathbf{b}$ is trivially solvable. With noise, $A\mathbf{s} + \mathbf{e} = \mathbf{b}$ becomes conjecturally hard, even for quantum computers. |
| Decision vs. Search LWE | Distinguishing LWE from random and finding $\mathbf{s}$ are polynomially equivalent. Both are as hard as worst-case lattice problems via Regev's reduction. |
| LWE as a lattice problem | Solving LWE corresponds to finding the closest lattice point (CVP) in a $q$-ary lattice. LLL-based attacks work for toy parameters but fail as $n$ grows. |
| Parameter balance | The triple $(n, q, \sigma)$ must be balanced. Too little noise means insecure, too much noise means decryption fails. The sweet spot gives both security and correctness. |
| Foundation for post-quantum | Schemes like Kyber (ML-KEM) are built on structured variants of LWE. The noise is not a bug, it IS the security. |
Next: Ring-LWE --- where we add algebraic structure to make LWE practical.