Path: blob/main/frontier/11-homomorphic-encryption/connect/encrypted-databases.ipynb
483 views
Connect: Encrypted Database Queries
Module 11 | Real-World Connections
A cloud database stores encrypted records. Using Paillier, we compute sums and averages without the server ever seeing a single plaintext value.
Introduction
Consider a company that stores employee salary data in the cloud. The cloud provider should never see individual salaries, but HR needs to run queries like:
What is the total payroll?
What is the average salary?
How many employees earn above a threshold? (harder!)
With homomorphic encryption, the database stores only ciphertexts. The server computes query results on ciphertexts, and only the client (key holder) can decrypt the answer.
In this notebook, we'll build a toy encrypted database using Paillier and demonstrate which queries are easy (additive) and which are hard (comparison-based).
Step 1: Set Up the Encrypted Database
Each employee's salary is encrypted with Paillier before being stored. The server holds only ciphertexts.
Step 2: Query 1 --- Total Payroll (Homomorphic Sum)
The server computes the total payroll by multiplying all Paillier ciphertexts (which adds the plaintexts). No decryption key is needed for this step.
Step 3: Query 2 --- Average Salary
The average is just the total divided by the count. The server can compute the encrypted sum, and the client divides after decryption. Alternatively, the server can use scalar multiplication to scale the sum if needed.
Step 4: Query 3 --- Weighted Sum (Bonus Allocation)
Suppose each employee gets a bonus that is a percentage of their salary. HR wants to know the total bonus cost: where is the bonus rate (public) and is the salary (encrypted).
Paillier supports this: .
Step 5: Why Comparisons Are Hard
Now suppose HR asks: "How many employees earn more than 70?"
This requires comparing encrypted values, which Paillier cannot do. Comparison is fundamentally a nonlinear operation --- it requires bit decomposition and many multiplications, which demands full FHE (BGV/CKKS) with enough multiplicative depth.
Real Deployments
Encrypted databases are not just theory. Several real systems exist:
| System | Approach | What It Supports |
|---|---|---|
| CryptDB (MIT, 2011) | Layered encryption: different schemes for different queries | Equality, order, sum |
| Microsoft SEAL | BFV/CKKS library | Arbitrary polynomial queries |
| Google Private Join and Compute | Paillier + MPC | Aggregation on joined datasets |
| Enveil ZeroReveal | Commercial FHE platform | Encrypted search and analytics |
| TFHE (Zama) | Fast bootstrapping | Gate-by-gate encrypted computation |
CryptDB's insight: different query types need different levels of encryption. For equality checks, use deterministic encryption (fast but leaks equality). For sums, use Paillier. For arbitrary queries, use full FHE. CryptDB "peels off" encryption layers as needed --- a practical trade-off between privacy and functionality.
Concept Map
| Module 11 Concept | Database Application |
|---|---|
| Additive HE (Paillier) | SUM, AVERAGE, weighted aggregation |
| Full FHE (BGV/BFV) | Arbitrary SQL on encrypted data |
| CKKS | Approximate queries, ML on encrypted data |
| TFHE | Fast Boolean comparisons, encrypted WHERE clauses |
| Noise management | Limits the complexity of the SQL query |
| Bootstrapping | Enables arbitrarily complex queries |
| Scalar multiplication | Weighted queries (e.g., bonus calculations) |
Summary
| Aspect | Detail |
|---|---|
| Problem | Cloud stores data it should never see |
| Paillier | Enables SUM, AVG, weighted sums on encrypted columns |
| Limitation | Comparisons and joins need full FHE |
| CryptDB | Practical system using layered encryption for different queries |
| Trade-off | More powerful queries require more expensive FHE schemes |
The math from Module 11 --- Paillier's additive homomorphism, BGV's noise management, CKKS's approximate arithmetic --- directly determines which database queries are possible on encrypted data and how fast they run.