GitHub Repository: gmcninch-tufts/2024-Sp-Math190
Path: blob/main/course-contents/2024-02-14--notes-ECC--intro.md
⁹⁰⁹ views

---

title: The notion of a code and some examples
date: 2024-02-14

---

\newcommand{\F}{\mathbb{F}} \newcommand{\Z}{\mathbb{Z}} \newcommand{\e}{\mathbf{e}} \newcommand{\bb}{\mathbf{b}} \newcommand{\Null}{\operatorname{Null}}

The idea

We want to transmit information over a potentially noisy channel. So we want to encode our information in some way that permits us to later decode it even in the presence of transmission errors.

We want to exploit algebra to create our codes. We will use as our alphabet a finite field ParseError: KaTeX parse error: Undefined control sequence: \F at position 5: K = \̲F̲_q

Some recollections about finite fields

We pause to recall some information about finite fields. We plan to return to this topic.

First of all, any finite field $K$ contains a copy of the field ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_p = \Z/p\Z for some prime number $p>0$ . Moreover, $K$ can then be viewed as a finite-dimensional vector space over ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_p; as such, if ParseError: KaTeX parse error: Undefined control sequence: \F at position 11: d = \dim_{\̲F̲_p} K we see that $\#K = |K| = p^d$ .

Thus every finite field has order a power of some prime number $p$ .

On the other hand, for any prime power $q = p^d$ there is a finite field ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_q which is unique up to isomorphism.

For example, ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_9 = \F_3[i] = … where ParseError: KaTeX parse error: Undefined control sequence: \F at position 18: …2 = -1 = 2 \in \̲F̲_3. In fact, ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_9 \simeq \F_3[… (quotient of the polynomial ring ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_3[T] by the principal ideal generated by the minimal polynomial $T^2 + 1$ of the element ParseError: KaTeX parse error: Undefined control sequence: \F at position 7: i \in \̲F̲_9. This isomorphism identifies $i$ with the coset $T + \langle T^2 + 1 \rangle$ ).

Codewords as vectors

We are going to study what are known as linear codes $C$ . A linear code $C$ is a linear subspace ParseError: KaTeX parse error: Undefined control sequence: \F at position 13: C \subseteq \̲F̲_q^n for some natural number $n$ .

Thus a code word is a vector $\mathbf{v} \in C$ , and we can write $\mathbf{v} = (v_1, v_2, \cdots, v_n)$ for elements $v_i$ in our alphabet ParseError: KaTeX parse error: Undefined control sequence: \F at position 5: k = \̲F̲_q.

We write $k$ for the dimension of $C$ ; i.e. ParseError: KaTeX parse error: Undefined control sequence: \F at position 11: k = \dim_{\̲F̲_q} C. We say that $C$ is an $[n,k]$ -code, or more precisely an $[n,k]_q$ -code.

Specifying a code via a generator matrix.

Let $C$ be an $[n,k]_q$ -code, and choose a basis $b_1,\cdots,b_k$ for $C$ as ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_q-vector space.

Since ParseError: KaTeX parse error: Undefined control sequence: \F at position 11: C \subset \̲F̲_q^n = \F_q^{1 …, we view elements of $C$ as $1 \times n$ row vectors.

Now form the matrix $k \times n$ matrix $G$ whose rows are the $1 \times n$ vectors $\bb_1,\bb_2,\cdots,\bb_k$ :

G = \begin{bmatrix} \bb_1 \\ \bb_2 \\ \vdots \\ \bb_k \end{bmatrix}.

$G$ is known as a generator matrix for the code $C$ .

Notice that we recover $C$ from $G$ as the image of the linear transformation ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_q^k \to \F_q^n determined by right-multiplication with $G$ :

ParseError: KaTeX parse error: Undefined control sequence: \F at position 42: …\mathbf{x} \in \̲F̲_q^{1 \times n}…

Standard form

Let us write ParseError: KaTeX parse error: Undefined control sequence: \e at position 1: \̲e̲_1,\e_2,\cdots,… for the standard basis for ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_q^n.

Lemma: : Let $C$ be an $[n,k]_q$ -code, and let $G$ be a generator matrix for $C$ .

 a. After possibly replacing the basis $\e_1,\cdots,\e_n$ with the basis
    $\e_{\sigma(1)},\cdots,\e_{\sigma(n)}$ for some $\sigma \in S_n$, we may suppose 
	that the first $k$-columns of the $k\times n$ matrix are linearly independent.

 b. If the conclusion a. holds, then $C$ has a generator matrix of
    the form $$G' = \left [ \begin{array}{l|l} \mathbf{I}_k & A
    \end{array} \right ]$$ for some $k \times (n-k)$ matrix $A$,
    where $\mathbf{I}_k$ denotes the $k \times k$ identity matrix.

Proof of the Lemma: : a. By construction, $G$ has $k$ linearly independent rows (its rows are a basis for $C$ ). Since $G$ is $k \times n$ it follows that the rank of $G$ is equal to $k$ .

   Since the *row rank* of a matrix is equal to the *column rank*
   of the matrix, it follows that $G$ has $k$ linearly independent
   columns. After possibly re-ordering the order of these columns,
   we may arrange that $G$ has the required form.

b. Suppose that the first $k$-columns of $G$ are linearly
   independent and consider the projection mapping $$\pi:\F_q^n
   \to \F_q^k \quad \text{given by the rule} \quad
   \pi(x_1,\cdots,x_n) = (x_1, \cdots,x_k).$$
	
   Write $\b_1,\b_2,\cdots,b_k \in \F_q^{1 \times n}$ for the *rows* of $G$.	
   Since the first $k$-columns of $G$ are linearly independent,
   the rank of the $k\times k$ matrix  whose rows are the $1 \times k$-vectors
   $\pi(\b_1), \pi(\b_2), \cdots, \pi(\b_k)$ is equal to $k$.
   
   This shows that the dimension of $\pi(C)$ is $\ge k$, and hence
   that $\pi(C) = \F_q^k$. In other words, the restriction
   $\pi_{\mid C}:C \to \F_q^k$ of $\pi$ to $C$ is *surjective*.
   
   In view of the surjectivity, for $i = 1,2,\cdots,k$ we may choose a vector
   $\b_i' \in \pi^{-1}(\e_i) \cap C$.
   
   Let $G'$ whose rows are the vectors $\b_i'$: $$G' =
   \begin{bmatrix} \b_1' \\ \b_2' \\ \vdots \\ \b_k'
   \end{bmatrix}.$$ Since 
   $$\begin{bmatrix} \pi(\b_1') \\
   \pi(\b_2') \\ 
   \vdots \\
   \pi(\b_k') 
   \end{bmatrix} = \mathbf{I}_k,$$ $G'$ has the required
   properties.

We say that a ( $k \times n$ ) generator matrix $G$ for the $[n,k]_q$ -code $C$ is in standard form if it has the form $G = \left [ \begin{array}{l|l} \mathbf{I}_k & A \end{array} \right ]$ for some $k \times (n-k)$ matrix $A$ ; thus the Lemma asserts that (after possibly changing the ordering of the coordinates on ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_q^n) every code $C$ has a generator matrix in standard form.

Check matrices

The preceding discussion described the subspace $C$ by giving generators for the vector space. In contrast, we may also specify a subspace using linear equations.

So: let $C$ be an $[n,k]_q$ -code.

Consider the quotient linear mapping $x \mapsto x + C$ : ParseError: KaTeX parse error: Undefined control sequence: \F at position 1: \̲F̲_q^n \to \F_q^n…

There is an $(n-k) \times n$ matrix $H$ which represents this linear mapping; then ParseError: KaTeX parse error: Undefined control sequence: \F at position 24: …l(H) = \{x \in \̲F̲_q^{1 \times n}… i.e. we see that $C$ is the null space for some $(n-k) \times n$ matrix $H$ .

We say that such a matrix is a check matrix for $C$ . For a vector ParseError: KaTeX parse error: Undefined control sequence: \F at position 7: x \in \̲F̲_q^{1 \times n}, we can check membership in $C$ using $H$ : we have $x \in C \iff Hx^T = 0$ .

Proposition: : Suppose that $C$ is an $[n,k]_q$ -code and that $G = \left [ \begin{array}{l|l} \mathbf{I}_k & A \end{array} \right ]$ is a generator matrix for $C$ in standard form. Then the $(n-k) \times n$ matrix $H = \left [ \begin{array}{l|l} -A^T & \mathbf{I}_{n-k} \end{array} \right ]$ is a check matrix for $C$ .

Proof: : We observe that $H \cdot G^T = \left [ \begin{array}{l|l} -A^T & \mathbf{I}_{n-k} \end{array} \right ] \left [ \begin{array}{l|l} \mathbf{I}_k \\ \hline A^T \end{array} \right ] = -A^T \cdot \mathbf{I}_k + \mathbf{I}_{n-k} \cdot A^T = -A^T + A^T = \mathbf{0}.$

Since the rows of $G$ are a basis for $C$, this shows that
$Hx^T = 0$ for every $x \in C$, i.e. $C  \subset \Null(H)$.

Now, $H$ clearly has rank $(n-k)$, so $\dim C = \dim \Null(H)$ and hence $C = \Null(H)$.
This completes the proof.

Weights and distance

For a vector ParseError: KaTeX parse error: Undefined control sequence: \F at position 31: …cdots,v_n) \in \̲F̲_q^n = \F_q^{1 … we define $\operatorname{weight}(v)$ to be the number of non-zero entries; i.e. $\operatorname{weight}(v) = \# \{ i \mid v_i \ne 0\}.$

For two vectors ParseError: KaTeX parse error: Undefined control sequence: \F at position 9: v,w \in \̲F̲_q^n the distance between $v$ and $w$ is defined to be $\operatorname{dist}(v,w) = \operatorname{weight}(v-w).$

Thus $\operatorname{dist}(v,w)$ represents the number of coordinates in which $v$ and $w$ differ.

For a subspace ParseError: KaTeX parse error: Undefined control sequence: \F at position 11: C \subset \̲F̲_q^n - i.e. a code - we define the minimal distance of $C$ to be $d = \min \{ \operatorname{dist}(v,w) \mid v,w \in C, v \ne w\}.$

The following lemma is immediate:

Lemma: : $d = \min \{ \operatorname{weight}(v) \mid v \in C, v \ne 0\}.$

If $d$ is the minimal distance of the $[n,k]_q$ code $C$ , we say that $C$ is an $[n,k,d]_q$ -code.

Example

We investigate an example using SageMath.

Let's create the field k having 3 elements, and the standard vector space V=k^9

K = GF(27);
V = VectorSpace(k,9)
print(k)
print(V)

=> 
Finite Field in z3 of size 3^3
Vector space of dimension 9 over Finite Field in z3 of size 3^3

Now let's create a certain 3 dimensional subspace C of V -- a $[9,3]_3$ code -- essentially by giving its generator matrix.

C = V.subspace([V([1,0,0,1,1,0,1,1,2]),
                V([0,1,0,1,0,1,1,2,1]),
                V([0,0,1,0,1,1,2,1,1])])
C
=>
Vector space of degree 9 and dimension 3 over Finite Field in z3 of size 3^3
Basis matrix:
[1 0 0 1 1 0 1 1 2]
[0 1 0 1 0 1 1 2 1]
[0 0 1 0 1 1 2 1 1]

In order to manipulate the generator matrix as a matrix, we create the MatrixSpace of the right dimensions, and coerce the basis of C into a matrix:

MM = MatrixSpace(K,3,9)
G = MM.matrix(C.basis())
G
=> 
[1 0 0 1 1 0 1 1 2]
[0 1 0 1 0 1 1 2 1]
[0 0 1 0 1 1 2 1 1]

We investigate the weights of non-zero vectors in C:

# count the non-zero entries in a vector
def weight(v):
    r = [x for x in v if x != 0]
    return len(r)

# we now find the minimum of the weight of v for non-zero vectors v in C
min([ weight(v) for v in C if v != 0 ])
=> 6

This shows that $C$ is a $[9,3,6]_3$ -code.

Notice that the generator matrix G is in standard form. Let's extract from G the matrix A which is the 3 x 6 matrix for which G = [ I₃| A ]

A = MatrixSpace(K,3,6).matrix([b[3:9] for b in G])
A
=> 
[1 1 0 1 1 2]
[1 0 1 1 2 1]
[0 1 1 2 1 1]

We can now form the 6 × 9 check matrix H = [ -A.T | I₆] as above.\

# form the 6x6 identity matrix
i6=MatrixSpace(K,6,6).one()

# we construct H via *block_matrix*
H=block_matrix([[-A.transpose(),i6]],
               subdivide=False)        

H
=>
[2 2 0 1 0 0 0 0 0]
[2 0 2 0 1 0 0 0 0]
[0 2 2 0 0 1 0 0 0]
[2 2 1 0 0 0 1 0 0]
[2 1 2 0 0 0 0 1 0]
[1 2 2 0 0 0 0 0 1]

We can confirm that H * G.T == 0 and that H has rank 6:

H * G.T == 0
=>
True

rank(H)
=>
6

And indeed, if we use SAGE to check the right_kernel of the matrix H, we get exactly the subspace C.

H.right_kernel() == C
=>
True

So H is indeed a check-matrix for the code C.

Bibliography

::: {.refs} :::