Path: blob/main/frontier/11-homomorphic-encryption/connect/fhe-private-ml.ipynb
Connect: FHE in Privacy-Preserving Machine Learning
Module 11 | Real-World Connections
Hospitals want cloud ML on patient data without exposing it. FHE makes this possible: encrypt the features, send ciphertexts to the cloud, run the model homomorphically, decrypt only the result.
Introduction
A hospital has patient health data (blood pressure, cholesterol, BMI, etc.) and wants a cloud provider to run a diagnostic ML model on this data. The problem: sending raw patient data to the cloud violates privacy regulations (HIPAA, GDPR).
FHE solution:
Hospital encrypts each patient's features with FHE.
Cloud receives only ciphertexts --- never sees the raw data.
Cloud evaluates the ML model homomorphically on the ciphertexts.
Cloud returns encrypted predictions.
Hospital decrypts to get the prediction --- cloud never learned anything.
In this notebook, we'll implement this workflow using Paillier encryption (from Notebook 11b), which supports addition and scalar multiplication --- enough for linear models.
Step 1: Set Up Paillier Encryption
We reuse the Paillier implementation from Notebook 11b. Paillier gives us:
Enc(a) * Enc(b) = Enc(a + b) (homomorphic addition)
Enc(a)^k = Enc(k * a) (scalar multiplication)
These two operations are exactly what we need for linear models: y = w_1 x_1 + w_2 x_2 + ... + w_n x_n + b.
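A minimal pure-Python sketch of Paillier illustrates both properties. The primes here are demo-sized (far too small to be secure) and the helper names are this sketch's own, not necessarily those from Notebook 11b:

```python
import math
import random

def paillier_keygen(p=10007, q=10009):
    """Demo-sized primes; real deployments use primes of ~1536 bits each."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)          # Carmichael's lambda(n)
    mu = pow(lam, -1, n)                  # valid shortcut because we pick g = n + 1
    return n, (lam, mu, n)

def paillier_encrypt(n, m):
    n2 = n * n
    r = random.randrange(1, n)            # random blinding factor with gcd(r, n) = 1
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    # With g = n + 1 we have g^m = 1 + m*n  (mod n^2)
    return (1 + m * n) * pow(r, n, n2) % n2

def paillier_decrypt(priv, c):
    lam, mu, n = priv
    n2 = n * n
    L = (pow(c, lam, n2) - 1) // n        # the "L function": L(x) = (x - 1) / n
    return L * mu % n

# Homomorphic properties in action
n, priv = paillier_keygen()
c1, c2 = paillier_encrypt(n, 20), paillier_encrypt(n, 22)
assert paillier_decrypt(priv, c1 * c2 % (n * n)) == 42   # Enc(a) * Enc(b) -> a + b
assert paillier_decrypt(priv, pow(c1, 3, n * n)) == 60   # Enc(a)^k -> k * a
```

Note that ciphertexts are multiplied to add plaintexts, and exponentiated to scale them, which is exactly the pattern the cloud will use below.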
Step 2: Encrypt Patient Data
We have 5 patients, each with 3 features (blood pressure, cholesterol, BMI). The hospital encrypts all features before sending them to the cloud.
Note: In real systems, features would be scaled to integers. We use small integers here for clarity.
Step 3: Cloud Computes the Linear Model Homomorphically
The cloud has the ML model weights (these are public --- only the data is private):
Using Paillier:
Weight each feature: Enc(x_i)^{w_i} = Enc(w_i x_i) (scalar multiplication)
Sum the weighted features: Enc(w_1 x_1) * Enc(w_2 x_2) * ... = Enc(w_1 x_1 + w_2 x_2 + ...) (addition)
Add the bias: Enc(w_1 x_1 + w_2 x_2 + ...) * Enc(b) = Enc(w_1 x_1 + w_2 x_2 + ... + b)
Step 4: Hospital Decrypts the Results
Only the hospital (key holder) can decrypt the predictions. Let's verify they match the cleartext computation.
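The whole workflow can be sketched end to end in one self-contained cell. Compact Paillier helpers are inlined so it runs standalone; the patient values and model weights are illustrative, not from a real model:

```python
import math
import random

# --- Minimal Paillier (demo-sized primes; insecure, for illustration only) ---
P, Q = 10007, 10009
N = P * Q
N2 = N * N
LAM = math.lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)                               # works because g = N + 1

def enc(m):
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (1 + m * N) * pow(r, N, N2) % N2

def dec(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

# --- Step 2 (hospital): encrypt each patient's features ---
patients = [[120, 200, 25], [140, 240, 31], [110, 180, 22],
            [160, 280, 35], [130, 210, 27]]        # BP, cholesterol, BMI (illustrative)
encrypted = [[enc(x) for x in row] for row in patients]

# --- Step 3 (cloud): evaluate the linear model on ciphertexts only ---
weights, bias = [2, 1, 3], 10                      # public model, illustrative values

def cloud_predict(enc_row):
    acc = enc(bias)                                # Enc(b)
    for c, w in zip(enc_row, weights):
        acc = acc * pow(c, w, N2) % N2             # Enc(s) * Enc(x)^w = Enc(s + w*x)
    return acc

enc_preds = [cloud_predict(row) for row in encrypted]

# --- Step 4 (hospital): decrypt and verify against the cleartext model ---
preds = [dec(c) for c in enc_preds]
expected = [sum(w * x for w, x in zip(weights, row)) + bias for row in patients]
assert preds == expected
```

The cloud only ever touches `encrypted` and `enc_preds`; the secret values `LAM` and `MU` are used exclusively on the hospital side.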
Limitations: Paillier vs Full FHE
Paillier only supports addition and scalar multiplication (linear operations). This is sufficient for:
Linear regression
Weighted sums and averages
Simple statistics (mean, variance with a trick)
But real ML models need nonlinear operations:
Neural networks need activation functions (ReLU, sigmoid)
Decision trees need comparisons
Polynomial regression needs multiplication of encrypted values
For these, you need full FHE (BGV, BFV, or CKKS).
CKKS for ML: Approximate Arithmetic
The CKKS scheme (Cheon-Kim-Kim-Song, 2017) was designed specifically for approximate computation --- exactly what ML needs. Key features:
| Feature | CKKS | BFV/BGV |
|---|---|---|
| Message type | Real/complex numbers | Integers mod t (the plaintext modulus) |
| Arithmetic | Approximate (small error tolerated) | Exact |
| Suited for | Neural networks, statistics | Counting, voting, exact queries |
| Noise handling | Noise becomes part of approximation | Noise must stay below threshold |
CKKS enables encrypted inference on neural networks with polynomial activation function approximations (e.g., approximate ReLU with a low-degree polynomial).
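To see why a low-degree polynomial is a workable stand-in for an activation, compare the sigmoid with a degree-3 polynomial in plain Python (no FHE here; the coefficients are an illustrative low-degree fit, not taken from a specific paper):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def poly_sigmoid(x):
    # Degree-3 stand-in: only additions and multiplications,
    # so it could be evaluated homomorphically under CKKS.
    # Illustrative coefficients, reasonable on roughly [-4, 4].
    return 0.5 + 0.197 * x - 0.004 * x ** 3

xs = [i / 10 for i in range(-40, 41)]
max_err = max(abs(sigmoid(x) - poly_sigmoid(x)) for x in xs)
print(f"max |sigmoid - poly| on [-4, 4]: {max_err:.3f}")
```

The approximation error stays small on a bounded input range, which is why encrypted inference pipelines normalize inputs before applying polynomial activations; outside that range the polynomial diverges badly.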
Production deployments:
CryptoNets (Microsoft Research): first encrypted neural network inference (2016)
nGraph-HE: Intel's framework for encrypted deep learning
Concrete ML (Zama): compiles scikit-learn and PyTorch models to FHE
Concept Map
| Module 11 Concept | ML Application |
|---|---|
| Paillier (additive HE) | Linear regression, weighted sums, averages |
| BGV/BFV (integer FHE) | Decision trees, exact classification |
| CKKS (approximate FHE) | Neural networks, floating-point ML |
| Noise budget | Limits the depth of the ML model (number of layers) |
| Bootstrapping | Enables arbitrarily deep neural networks |
| Scalar multiplication | Applying model weights to encrypted features |
| Homomorphic addition | Summing weighted features (dot product) |
Summary
| Aspect | Detail |
|---|---|
| Problem | Cloud ML on sensitive data violates privacy |
| Solution | Encrypt data with FHE, compute model homomorphically |
| Paillier | Supports linear models (addition + scalar multiply) |
| CKKS | Supports neural networks (approximate floating-point FHE) |
| Trade-off | 10,000x--1,000,000x slowdown vs. cleartext computation |
| Reality | Production systems exist (SEAL, Concrete ML, nGraph-HE) |
FHE for ML is the "holy grail" of privacy-preserving computation: the cloud provides compute power, the hospital keeps data private, and the patient gets a correct diagnosis. The math from Module 11 --- additive homomorphism, noise budgets, bootstrapping --- is what makes this possible.