Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
duyuefeng0708
GitHub Repository: duyuefeng0708/Cryptography-From-First-Principle
Path: blob/main/foundations/03-galois-fields-aes/sage/03e-aes-mixcolumns-as-field-ops.ipynb
483 views
unlisted
Kernel: SageMath 10.5

AES MixColumns as Field Operations

Module 03e | Galois Fields and AES

Matrix multiplication over GF(256), the diffusion layer of AES.

Question: SubBytes provides confusion (nonlinearity). But confusion alone isn't enough, an attacker could analyze each byte independently. AES needs diffusion: changing one input byte should affect many output bytes. How?

MixColumns achieves this by treating each 4-byte column as a vector over GF(256) and multiplying by a carefully chosen matrix. One changed input byte cascades through the entire column.

Objectives

By the end of this notebook you will be able to:

  1. Describe the AES state as a 4×4 matrix of GF(256) elements

  2. Perform the MixColumns matrix-vector multiplication over GF(256)

  3. Verify the MDS (Maximum Distance Separable) property of the MixColumns matrix

  4. Implement InvMixColumns using the inverse matrix

  5. Explain why MixColumns provides optimal diffusion

Bridge from 03d

In 03d you built the S-box, AES's nonlinear byte substitution. Now we add the linear mixing step. Together, SubBytes (nonlinear, per-byte) and MixColumns (linear, per-column) create the confusion-diffusion architecture that Shannon identified as essential for strong ciphers.

MixColumns operates on 4-byte columns using matrix multiplication over GF(256), the same field you've been working with since 03c.

Setup

# Setup: GF(256) with AES polynomial R.<x> = GF(2)[] F.<a> = GF(2^8, modulus=x^8 + x^4 + x^3 + x + 1) def byte_to_gf(b): return sum(GF(2)((b >> i) & 1) * a^i for i in range(8)) def gf_to_byte(elem): p = elem.polynomial() return sum(int(p[i]) << i for i in range(8)) def xtime(b): result = b << 1 if result & 0x100: result ^^= 0x11B return result & 0xFF def gf256_mul(a, b): result = 0 temp = a for i in range(8): if b & (1 << i): result ^^= temp temp = xtime(temp) return result print('GF(256) utilities loaded.')

The AES State Matrix

AES processes 16 bytes (128 bits) at a time, arranged as a 4×4 matrix of GF(256) elements. Bytes fill column-by-column:

(b0b4b8b12b1b5b9b13b2b6b10b14b3b7b11b15)\begin{pmatrix} b_0 & b_4 & b_8 & b_{12} \\ b_1 & b_5 & b_9 & b_{13} \\ b_2 & b_6 & b_{10} & b_{14} \\ b_3 & b_7 & b_{11} & b_{15} \end{pmatrix}

MixColumns operates on each column independently.

The MixColumns Matrix

Each column [c0,c1,c2,c3]T[c_0, c_1, c_2, c_3]^T is left-multiplied by the fixed matrix:

M=(02030101010203010101020303010102)M = \begin{pmatrix} \texttt{02} & \texttt{03} & \texttt{01} & \texttt{01} \\ \texttt{01} & \texttt{02} & \texttt{03} & \texttt{01} \\ \texttt{01} & \texttt{01} & \texttt{02} & \texttt{03} \\ \texttt{03} & \texttt{01} & \texttt{01} & \texttt{02} \end{pmatrix}

where the entries are GF(256) elements. This is a circulant matrix, each row is a cyclic shift of the first.

# The MixColumns matrix over GF(256) M_bytes = [ [0x02, 0x03, 0x01, 0x01], [0x01, 0x02, 0x03, 0x01], [0x01, 0x01, 0x02, 0x03], [0x03, 0x01, 0x01, 0x02] ] # As a SageMath matrix over GF(256) M = matrix(F, [[byte_to_gf(b) for b in row] for row in M_bytes]) print('MixColumns matrix M (over GF(256)):') print(M) print() print(f'Is invertible? {M.is_invertible()}') print(f'Determinant: {M.det()} (= 0x{gf_to_byte(M.det()):02X})') print() # The matrix uses only 0x01, 0x02, 0x03: print('Only three GF(256) constants appear:') print(' 0x01 = identity (multiply by 1 = no-op)') print(' 0x02 = xtime (shift left, conditional XOR with 0x1B)') print(' 0x03 = 0x02 + 0x01 = xtime(b) XOR b')

MixColumns in Action

Checkpoint: Before running the next cell, compute the first output byte by hand. Given column [0xDB,0x13,0x53,0x45]T[\texttt{0xDB}, \texttt{0x13}, \texttt{0x53}, \texttt{0x45}]^T, the first output byte is: 02DB031301530145\texttt{02} \cdot \texttt{DB} \oplus \texttt{03} \cdot \texttt{13} \oplus \texttt{01} \cdot \texttt{53} \oplus \texttt{01} \cdot \texttt{45}

# MixColumns on a single column def mix_column(col_bytes): """Apply MixColumns to a 4-byte column. Returns 4 bytes.""" result = [0] * 4 for i in range(4): for j in range(4): result[i] ^^= gf256_mul(M_bytes[i][j], col_bytes[j]) return result # AES spec test vector (from FIPS 197, Appendix B) col = [0xDB, 0x13, 0x53, 0x45] out = mix_column(col) print(f'Input column: [{" ".join(f"0x{b:02X}" for b in col)}]') print(f'Output column: [{" ".join(f"0x{b:02X}" for b in out)}]') print() # Show the computation step by step for the first output byte print('First output byte breakdown:') terms = [] for j in range(4): product = gf256_mul(M_bytes[0][j], col[j]) terms.append(product) print(f' 0x{M_bytes[0][j]:02X} * 0x{col[j]:02X} = 0x{product:02X}') print(f' XOR all: 0x{terms[0]:02X} ⊕ 0x{terms[1]:02X} ⊕ 0x{terms[2]:02X} ⊕ 0x{terms[3]:02X} = 0x{out[0]:02X}') print() # Verify using SageMath matrix multiplication col_vec = vector(F, [byte_to_gf(b) for b in col]) sage_out = M * col_vec sage_bytes = [gf_to_byte(e) for e in sage_out] print(f'SageMath verification: [{" ".join(f"0x{b:02X}" for b in sage_bytes)}]') print(f'Match: {out == sage_bytes}')

The Diffusion Property

Why this specific matrix? Because it is MDS (Maximum Distance Separable). This means: if you change kk input bytes, at least 5k5 - k output bytes change. The minimum number of active bytes (input + output) is always 5\geq 5.

# Demonstrating diffusion: change one byte, all four outputs change col_a = [0xDB, 0x13, 0x53, 0x45] col_b = [0xDB, 0x13, 0x53, 0x46] # changed last byte by 1 bit out_a = mix_column(col_a) out_b = mix_column(col_b) print('Diffusion: change one input byte, watch the output') print(f'Input A: [{" ".join(f"0x{b:02X}" for b in col_a)}]') print(f'Input B: [{" ".join(f"0x{b:02X}" for b in col_b)}]') print(f' (only byte 3 changed: 0x45 → 0x46)') print() print(f'Output A: [{" ".join(f"0x{b:02X}" for b in out_a)}]') print(f'Output B: [{" ".join(f"0x{b:02X}" for b in out_b)}]') diffs = sum(1 for i in range(4) if out_a[i] != out_b[i]) print(f' {diffs} out of 4 output bytes changed!') print() # MDS property: check all submatrices are invertible print('=== MDS verification ===') is_mds = True for size in range(1, 5): for rows in Combinations(range(4), size): for cols in Combinations(range(4), size): sub = M[list(rows), list(cols)] if sub.det() == 0: print(f' Singular submatrix! rows={rows}, cols={cols}') is_mds = False print(f'All square submatrices are invertible? {is_mds}') print('This confirms M is MDS, maximum diffusion from any input change.')

Common mistake: "MixColumns is just matrix multiplication, so it's linear and therefore weak." MixColumns IS linear over GF(256), that's by design. The nonlinearity comes from SubBytes. AES's security comes from alternating nonlinear (SubBytes) and linear (MixColumns) layers. Neither alone is secure; together they create an avalanche effect.

InvMixColumns

# Inverse MixColumns matrix M_inv = M.inverse() M_inv_bytes = [[gf_to_byte(M_inv[i,j]) for j in range(4)] for i in range(4)] print('InvMixColumns matrix (over GF(256)):') for row in M_inv_bytes: print(f' [{" ".join(f"0x{b:02X}" for b in row)}]') print() print('Notice: the inverse uses 0x0E, 0x0B, 0x0D, 0x09') print('These are larger constants, InvMixColumns is more expensive.') print() # Verify: M_inv * M = identity product = M_inv * M print(f'M_inv * M = identity? {product == matrix.identity(F, 4)}') print() # Verify roundtrip def inv_mix_column(col_bytes): result = [0] * 4 for i in range(4): for j in range(4): result[i] ^^= gf256_mul(M_inv_bytes[i][j], col_bytes[j]) return result col = [0xDB, 0x13, 0x53, 0x45] mixed = mix_column(col) recovered = inv_mix_column(mixed) print(f'Original: [{" ".join(f"0x{b:02X}" for b in col)}]') print(f'Mixed: [{" ".join(f"0x{b:02X}" for b in mixed)}]') print(f'Recovered: [{" ".join(f"0x{b:02X}" for b in recovered)}]') print(f'Roundtrip correct: {col == recovered}')

Checkpoint: The inverse MixColumns matrix uses the constants 0x0E, 0x0B, 0x0D, 0x09. These are larger because M1M^{-1} has more complex GF(256) entries. This is why AES decryption is slightly slower than encryption, InvMixColumns requires more multiplications.

MixColumns as Polynomial Multiplication

There's an elegant alternative view: MixColumns is polynomial multiplication modulo x4+1x^4 + 1 in the ring GF(256)[x]/(x4+1)[x]/(x^4 + 1).

# MixColumns as polynomial multiplication S.<y> = PolynomialRing(F) Q = S.quotient(y^4 + 1, 'z') z = Q.gen() # The MixColumns polynomial c(x) = 03*x^3 + 01*x^2 + 01*x + 02 c_poly = byte_to_gf(0x03)*z^3 + byte_to_gf(0x01)*z^2 + byte_to_gf(0x01)*z + byte_to_gf(0x02) print(f'MixColumns polynomial c(x) = {c_poly}') print() # Column as polynomial: b(x) = b3*x^3 + b2*x^2 + b1*x + b0 col = [0xDB, 0x13, 0x53, 0x45] b_poly = sum(byte_to_gf(col[i]) * z^i for i in range(4)) print(f'Column polynomial b(x) = {b_poly}') print() # Multiply mod (x^4 + 1) result_poly = c_poly * b_poly print(f'c(x) * b(x) mod (x^4 + 1) = {result_poly}') print() # Extract bytes lift = result_poly.lift() result_bytes = [gf_to_byte(lift[i]) if i <= lift.degree() else 0 for i in range(4)] print(f'Result bytes: [{" ".join(f"0x{b:02X}" for b in result_bytes)}]') print(f'Matrix method: [{" ".join(f"0x{b:02X}" for b in mix_column(col))}]') print(f'Match: {result_bytes == mix_column(col)}')

Exercises

Exercise 1 (Worked)

Apply MixColumns to the column [0x63,0x53,0xE0,0x8C]T[\texttt{0x63}, \texttt{0x53}, \texttt{0xE0}, \texttt{0x8C}]^T, showing the GF(256) multiplications for each output byte.

# Exercise 1 (Worked), MixColumns step by step col = [0x63, 0x53, 0xE0, 0x8C] print(f'Input: [{" ".join(f"0x{b:02X}" for b in col)}]') print() for i in range(4): print(f'Output byte {i}:') terms = [] for j in range(4): product = gf256_mul(M_bytes[i][j], col[j]) terms.append(product) # Show how multiply-by-02 and multiply-by-03 work if M_bytes[i][j] == 0x02: detail = f'xtime(0x{col[j]:02X})' elif M_bytes[i][j] == 0x03: detail = f'xtime(0x{col[j]:02X}) ⊕ 0x{col[j]:02X}' else: detail = f'0x{col[j]:02X}' print(f' 0x{M_bytes[i][j]:02X} × 0x{col[j]:02X} = 0x{product:02X} ({detail})') result = 0 for t in terms: result ^^= t print(f' XOR: 0x{result:02X}') print() out = mix_column(col) print(f'Result: [{" ".join(f"0x{b:02X}" for b in out)}]')

Exercise 2 (Guided)

Verify the MDS property: show that every square submatrix of MM (of any size 1×1, 2×2, 3×3, 4×4) has a nonzero determinant over GF(256).

# Exercise 2 (Guided), Verify MDS property print('Checking all square submatrices of M:') for size in range(1, 5): count = 0 invertible = 0 for rows in Combinations(range(4), size): for cols in Combinations(range(4), size): count += 1 sub = M[list(rows), list(cols)] # TODO: check if determinant is nonzero # Hint: sub.det() != 0 in GF(256) pass # TODO: increment invertible if det != 0 # TODO: print results for this size # Expected: ALL submatrices are invertible print(f' {size}×{size}: {count} submatrices, TODO invertible') # Question: why does MDS imply that changing k input bytes # forces at least (5-k) output bytes to change?

Exercise 3 (Independent)

  1. Apply MixColumns to the column [0x01,0x00,0x00,0x00]T[\texttt{0x01}, \texttt{0x00}, \texttt{0x00}, \texttt{0x00}]^T. What do you get? (Hint: this extracts a column of MM, which one?)

  2. Apply MixColumns twice to any column. Is M2M^2 also MDS? Compute M2M^2 over GF(256) and check.

  3. The polynomial c(x)=03x3+01x2+01x+02c(x) = \texttt{03}x^3 + \texttt{01}x^2 + \texttt{01}x + \texttt{02} must be invertible modulo x4+1x^4 + 1 for InvMixColumns to exist. Verify this and find the inverse polynomial.

# Exercise 3 (Independent), Your code here

Summary

ConceptKey idea
MixColumnsMultiplies each 4-byte column by a fixed 4×44 \times 4 matrix over GF(256)
Efficient constantsThe matrix uses only 0x01, 0x02, and 0x03, so multiplication needs just XOR and xtime
MDS propertyChanging kk input bytes forces at least 5k5 - k output bytes to change, giving optimal diffusion
InvMixColumnsUses the inverse matrix with constants 0x09, 0x0B, 0x0D, 0x0E, making decryption slightly more expensive
Linear by designMixColumns is linear over GF(256). The nonlinearity comes from SubBytes, and the combination of both provides security
Polynomial viewEquivalently, MixColumns is multiplication by a fixed polynomial mod x4+1x^4 + 1 in GF(256)[x][x]

Crypto foreshadowing: In notebook 03f, we assemble all four AES round operations, SubBytes, ShiftRows, MixColumns, and AddRoundKey, into a complete AES round. You'll see how confusion (SubBytes) and diffusion (ShiftRows + MixColumns) combine with key mixing (AddRoundKey) to create a cipher that has resisted attack for over 25 years.

Next: Full AES Round, putting SubBytes, ShiftRows, MixColumns, and AddRoundKey together.