Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/04-number-theory-rsa/connect/rsa-oaep-padding.ipynb
483 views
unlisted
Kernel: SageMath 10.0

Connect: RSA-OAEP Padding

Module 04 | Real-World Connections

Why textbook RSA is never used in practice and how OAEP fixes it.

Introduction

Textbook RSA --- c=memodnc = m^e \bmod n --- is mathematically elegant but completely unsuitable for real-world encryption. It has two fatal flaws:

  1. Deterministic: the same message always encrypts to the same ciphertext

  2. Malleable: an attacker can manipulate ciphertexts to produce related plaintexts

Every real RSA deployment uses padding to fix these problems. The modern standard is OAEP (Optimal Asymmetric Encryption Padding), specified in PKCS#1 v2.2.

This notebook demonstrates why textbook RSA fails and how OAEP fixes each flaw.

Problem 1: Deterministic Encryption

Textbook RSA is a deterministic function: given the same mm, ee, and nn, it always produces the same cc.

This means an attacker can:

  • Detect when the same message is sent twice

  • Build a dictionary mapping known plaintexts to ciphertexts

  • Distinguish between two candidate messages (breaks IND-CPA security)

# === Setup: generate RSA key pair === set_random_seed(2024) p = random_prime(2^50, lbound=2^49) q = random_prime(2^50, lbound=2^49) n = p * q phi_n = (p - 1) * (q - 1) e = 65537 d = inverse_mod(e, phi_n) print(f'RSA key pair (n ~ {n.nbits()} bits):') print(f' Public: (n={n}, e={e})') print(f' Private: d={d}')
# === Demonstration: deterministic encryption leaks equality === # Suppose the plaintext is a vote: 0 = "No", 1 = "Yes" m_yes = 1 m_no = 0 c_yes = power_mod(m_yes, e, n) c_no = power_mod(m_no, e, n) print('Encrypted votes (textbook RSA):') print(f' "Yes" (m=1): c = {c_yes}') print(f' "No" (m=0): c = {c_no}') print() # Alice votes alice_vote = power_mod(1, e, n) bob_vote = power_mod(1, e, n) carol_vote = power_mod(0, e, n) print('Intercepted encrypted votes:') print(f' Alice: {alice_vote}') print(f' Bob: {bob_vote}') print(f' Carol: {carol_vote}') print() # Attacker can tell who voted the same way! print('Attacker observes:') print(f' Alice and Bob have the SAME ciphertext: {alice_vote == bob_vote}') print(f' Carol has a DIFFERENT ciphertext: {alice_vote != carol_vote}') print() print('Without decrypting, the attacker knows Alice and Bob voted the same way.') print('With only 2 possible messages, the attacker can try both and learn everything:') print(f' Encrypt(1) = {c_yes} matches Alice -> Alice voted Yes') print(f' Encrypt(0) = {c_no} matches Carol -> Carol voted No')

Problem 2: Malleability (Chosen Ciphertext Attack)

Textbook RSA is multiplicatively homomorphic:

Enc(m1)Enc(m2)=m1em2e=(m1m2)e=Enc(m1m2)\text{Enc}(m_1) \cdot \text{Enc}(m_2) = m_1^e \cdot m_2^e = (m_1 \cdot m_2)^e = \text{Enc}(m_1 \cdot m_2)

An attacker who sees c=memodnc = m^e \bmod n can compute:

  • c=2ecmodn=(2m)emodnc' = 2^e \cdot c \bmod n = (2m)^e \bmod n --- an encryption of 2m2m!

Without knowing mm, the attacker can manipulate the ciphertext to produce a related plaintext. This enables devastating chosen-ciphertext attacks.

# === Demonstration: multiplicative malleability === m = 42 # Secret message c = power_mod(m, e, n) print(f'Original: m = {m}, c = Enc({m}) = {c}') print() # Attacker computes c' = 2^e * c mod n c_prime = (power_mod(2, e, n) * c) % n # Decrypt c' to see what we get m_prime = power_mod(c_prime, d, n) print(f'Attacker computes: c\' = 2^e * c mod n = {c_prime}') print(f'Decryption of c\': m\' = {m_prime}') print(f'Is m\' = 2 * m? {m_prime} = 2 * {m} = {2*m}? {m_prime == 2*m}') print() # More generally: attacker can multiply by any known factor factor_val = 7 c_scaled = (power_mod(factor_val, e, n) * c) % n m_scaled = power_mod(c_scaled, d, n) print(f'Multiply by {factor_val}: Enc({factor_val}) * Enc({m}) mod n') print(f'Decrypts to: {m_scaled} = {factor_val} * {m} = {factor_val * m}? {m_scaled == factor_val * m}') print() print('The attacker can manipulate ciphertexts without knowing the plaintext.') print('This breaks any protocol that assumes ciphertexts are tamper-proof.')

The Bleichenbacher Attack (1998)

Daniel Bleichenbacher demonstrated a practical attack against RSA PKCS#1 v1.5 encryption padding. The attacker sends millions of modified ciphertexts to a server and observes whether the server reports a padding error. By exploiting the multiplicative malleability, each response leaks a small amount of information about the plaintext, eventually recovering it entirely.

This attack affected real SSL/TLS implementations and is the primary motivation for OAEP.

OAEP: Optimal Asymmetric Encryption Padding

OAEP (Bellare and Rogaway, 1994) pads the message with randomness using a Feistel-like structure before applying RSA:

Input: message m, random seed r Step 1: X = m || 0...0 XOR G(r) (G is a hash/mask-generation function) Step 2: Y = r XOR H(X) (H is another hash function) Step 3: padded = X || Y Step 4: c = padded^e mod n (standard RSA)

To decrypt:

  1. Compute padded=cdmodn\text{padded} = c^d \bmod n

  2. Split into XYX \| Y

  3. Recover r=YH(X)r = Y \oplus H(X)

  4. Recover m00=XG(r)m \| 0\ldots0 = X \oplus G(r)

  5. Check the zero padding; reject if invalid

# === Toy OAEP implementation === import hashlib import os def mgf(seed_bytes, length): """Simplified mask generation function (MGF1 with SHA-256).""" output = b'' counter = 0 while len(output) < length: c_bytes = counter.to_bytes(4, 'big') output += hashlib.sha256(seed_bytes + c_bytes).digest() counter += 1 return output[:length] def xor_bytes(a, b): """XOR two byte strings of equal length.""" return bytes(x ^^ y for x, y in zip(a, b)) def oaep_pad(message_bytes, key_size_bytes, seed=None): """Simplified OAEP padding.""" hash_len = 16 # Use 16 bytes for our toy version # Pad message with zeros to fill the data block db_len = key_size_bytes - hash_len - 1 padding_len = db_len - len(message_bytes) - 1 assert padding_len >= 0, 'Message too long' # Data block: [zero padding] [0x01] [message] db = bytes(padding_len) + bytes([0x01]) + message_bytes # Random seed if seed is None: seed = os.urandom(hash_len) # Feistel-like structure db_mask = mgf(seed, db_len) # G(r) masked_db = xor_bytes(db, db_mask) # X = DB xor G(r) seed_mask = mgf(masked_db, hash_len) # H(X) masked_seed = xor_bytes(seed, seed_mask) # Y = r xor H(X) # Final padded message: 0x00 || masked_seed || masked_db padded = bytes([0x00]) + masked_seed + masked_db return padded, seed def oaep_unpad(padded_bytes, key_size_bytes): """Simplified OAEP unpadding.""" hash_len = 16 db_len = key_size_bytes - hash_len - 1 # Split assert padded_bytes[0] == 0x00 masked_seed = padded_bytes[1:1+hash_len] masked_db = padded_bytes[1+hash_len:] # Reverse Feistel seed_mask = mgf(masked_db, hash_len) seed = xor_bytes(masked_seed, seed_mask) db_mask = mgf(seed, db_len) db = xor_bytes(masked_db, db_mask) # Find the 0x01 separator sep_idx = db.index(0x01) message = db[sep_idx+1:] return message # Test the OAEP padding key_bytes = (n.nbits() + 7) // 8 msg = b'Hello!' padded, seed_used = oaep_pad(msg, key_bytes) print(f'Message: {msg}') print(f'Key size: {key_bytes} bytes') print(f'Random seed: {seed_used.hex()}') print(f'Padded (hex): {padded.hex()}') print(f'Padded length: {len(padded)} bytes') print() # Verify unpadding recovered_msg = oaep_unpad(padded, key_bytes) print(f'Unpadded: {recovered_msg}') print(f'Match: {recovered_msg == msg}')
# === OAEP defeats determinism === def rsa_oaep_encrypt(message_bytes, e, n): """Encrypt with RSA-OAEP.""" key_bytes = (n.nbits() + 7) // 8 padded, _ = oaep_pad(message_bytes, key_bytes) m_int = Integer(int.from_bytes(padded, 'big')) return power_mod(m_int, e, n) # Encrypt the same message multiple times msg = b'Vote: Yes' c1 = rsa_oaep_encrypt(msg, e, n) c2 = rsa_oaep_encrypt(msg, e, n) c3 = rsa_oaep_encrypt(msg, e, n) print(f'Same message encrypted 3 times with RSA-OAEP:') print(f' c1 = {c1}') print(f' c2 = {c2}') print(f' c3 = {c3}') print() print(f'All different? c1 != c2: {c1 != c2}, c1 != c3: {c1 != c3}, c2 != c3: {c2 != c3}') print() print('Each encryption uses fresh randomness, so the same message') print('produces a DIFFERENT ciphertext every time.') print('The attacker can no longer detect when the same message is sent twice.')
# === OAEP defeats malleability === # Encrypt a message with OAEP msg = b'secret' key_bytes = (n.nbits() + 7) // 8 padded_bytes, _ = oaep_pad(msg, key_bytes) m_int = Integer(int.from_bytes(padded_bytes, 'big')) c = power_mod(m_int, e, n) print(f'Original ciphertext: c = {c}') print() # Attacker tries to multiply: c' = 2^e * c mod n c_modified = (power_mod(2, e, n) * c) % n m_modified_int = power_mod(c_modified, d, n) m_modified_bytes = int(m_modified_int).to_bytes(key_bytes, 'big') print(f'Attacker computes: c\' = 2^e * c mod n = {c_modified}') print(f'Decrypted integer: {m_modified_int}') print(f'Decrypted bytes (hex): {m_modified_bytes.hex()}') print() # Try to unpad - the Feistel structure will produce garbage try: result = oaep_unpad(m_modified_bytes, key_bytes) print(f'Unpadded: {result}') print('WARNING: Unpadding unexpectedly succeeded (unlikely with toy params)') except Exception as ex: print(f'OAEP unpadding FAILED: {ex}') print() print('The Feistel structure in OAEP means that multiplying the ciphertext') print('by 2^e does NOT multiply the underlying message by 2.') print('Instead, it corrupts the padding structure, and the recipient') print('rejects the ciphertext as invalid.')

Concept Map: Module 04 and OAEP

Module 04 ConceptRSA-OAEP Application
RSA encryption (memodnm^e \bmod n)Applied to the OAEP-padded message, not raw plaintext
RSA decryption (cdmodnc^d \bmod n)Recovers padded message; OAEP unpadding extracts plaintext
CRT (Notebook 04d)RSA-CRT optimization: compute cdmodpc^d \bmod p and cdmodqc^d \bmod q separately
Euler's theorem (Notebook 04c)Still guarantees (me)dm(m^e)^d \equiv m; OAEP adds security on top

OAEP transforms textbook RSA (which is a trapdoor permutation) into a proper encryption scheme with provable security against chosen-ciphertext attacks.

Summary

PropertyTextbook RSARSA-OAEP
Deterministic?Yes (same mm always gives same cc)No (fresh randomness each time)
Malleable?Yes (c=recc' = r^e \cdot c encrypts rmr \cdot m)No (Feistel structure prevents manipulation)
IND-CPA secure?NoYes
IND-CCA2 secure?NoYes (in the random oracle model)
Used in practice?NeverYes (PKCS#1 v2.2, RFC 8017)

Key takeaways:

  • Textbook RSA is a mathematical building block, not a cryptosystem.

  • Determinism lets an attacker detect equal plaintexts and perform dictionary attacks.

  • Malleability lets an attacker manipulate ciphertexts without knowing the plaintext.

  • OAEP fixes both problems by padding the message with randomness through a Feistel-like structure.

  • Every real RSA encryption uses OAEP (or is being migrated to it).

  • The underlying RSA math from Module 04 is unchanged --- OAEP adds a layer of security on top.


Back to Module 04: Number Theory and RSA