
Math 208 Interactive Notebooks © 2024 by Soham Bhosale, Sara Billey, Herman Chau, Zihan Chen, Isaac Hartin Pasco, Jennifer Huang, Snigdha Mahankali, Clare Minnerath, and Anna Willis is licensed under CC BY-ND 4.0

Image: ubuntu2204
Kernel: SageMath 10.1

Chapter 8: Orthogonality

Originally created by Clare Minnerath

This Jupyter notebook is part of a linear algebra tutorial created by Sara Billey's 2024 WXML group. It follows Chapter 8 of Holt's Linear Algebra with Applications [1], which explores orthogonality and new ways to decompose a matrix into more easily understood chunks. It is intended as a supplement to the UW Math 208 course. Readers are encouraged to explore beyond the scope of this notebook as interested.

Note: This notebook primarily uses Sage's linear algebra libraries. However, the Singular Value Decomposition (SVD) application to image compression utilizes Python's NumPy library. Also, methods such as Gram-Schmidt and QR factorization are available as built-in Sage functions, but they are written out in this tutorial so they can be compared with the text's algorithms.

1. Holt, J. (2017). Linear Algebra with Applications.
2. Cat image from Clare Minnerath: image of family cat Egg.
3. Dog image from Sara Billey: image of a painting by Sara's uncle of her dog Valentine.

Two vectors in $\mathbb{R}^n$ are orthogonal if their dot product is equal to 0. The magnitude (or norm) of a vector is given by $||x|| = \sqrt{x \cdot x}$.

u = vector([2,-1,5,-2])
v = vector([3,2,-4,0])
w = vector([2,9,6,4])

# computing dot products
print("u\u2022v = ", u.dot_product(v))
print("v\u2022w = ", v.dot_product(w), "\n")
print("||v|| = ", v.norm())
u•v = -16
v•w = 0

||v|| = sqrt(29)
zero = vector([0,0,0])
u = vector([2,-1,-3])
v = vector([3,2,-4])
w = vector([2,9,6])

# compute dot products and visualize normalized vectors
print("u\u2022v = ", u.dot_product(v))
print("v\u2022w = ", v.dot_product(w))

# normalizing scales the vectors to magnitude 1
plot = (arrow(zero, u.normalized(), color="red")
        + arrow(zero, v.normalized(), color="purple")
        + arrow(zero, w.normalized()))
plot.show()
u•v = 16
v•w = 0
Graphics3d Object

Since the Pythagorean Theorem only holds for right triangles, it follows that $||u+v||^2 = ||u||^2 + ||v||^2$ if and only if $u \cdot v = 0$. The Triangle Inequality, $||u+v|| \le ||u|| + ||v||$, holds for all vectors $u$ and $v$.

# .norm() computes the magnitude (norm) of the vector
print("Pythagorean Theorem: ", (v+w).norm()^2, "=", v.norm()^2 + w.norm()^2)
print("Triangle Inequality: ", round((u+v).norm(), 3), "<", round(u.norm() + v.norm(), 3))
Pythagorean Theorem: 150 = 150
Triangle Inequality: 8.66 < 9.127

An orthogonal basis, $\{v_1,\dots,v_k\}$, is a basis for a subspace such that all of the basis vectors are orthogonal ($v_i \cdot v_j = 0$ whenever $i \ne j$). Given such a basis, any vector, $s$, in the subspace can be expressed as $s = c_1v_1 + \cdots + c_kv_k$ where $c_i = \frac{v_i \cdot s}{||v_i||^2}$. Recalling that $\frac{v \cdot s}{||v||^2}v = \text{proj}_v s$, we can also express the formula as $s = \text{proj}_{v_1}s + \text{proj}_{v_2}s + \cdots + \text{proj}_{v_k}s$.

u_1 = vector([1,2,0,-1])
u_2 = vector([5,2,4,9])
u_3 = vector([-2,2,-3,2])

# check that the dot products of all pairs of vectors in the basis equal 0
if u_1.dot_product(u_2) == u_2.dot_product(u_3) == u_1.dot_product(u_3) == 0:
    print("The set {u_1, u_2, u_3} is an orthogonal basis for span{u_1, u_2, u_3}.")
The set {u_1, u_2, u_3} is an orthogonal basis for span{u_1, u_2, u_3}.
s = vector([19,4,17,51])  # s is in span{u_1,u_2,u_3}

# find coefficients to express s as a linear combination of {u_1, u_2, u_3}
c_1 = s.dot_product(u_1)/u_1.norm()^2
c_2 = s.dot_product(u_2)/u_2.norm()^2
c_3 = s.dot_product(u_3)/u_3.norm()^2

if s == c_1*u_1 + c_2*u_2 + c_3*u_3:
    print("As expected: ", s, " = ", c_1, u_1, " + ", c_2, u_2, " + ", c_3, u_3)
As expected: (19, 4, 17, 51) = -4 (1, 2, 0, -1) + 5 (5, 2, 4, 9) + 1 (-2, 2, -3, 2)
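
Equivalently, the same decomposition can be written with the projection form of the formula above. A minimal check, assuming s, u_1, u_2, u_3 from the cell above (the small proj helper is not part of the original notebook):

# proj_v x, as defined earlier in this chapter
proj = lambda v, x: (x.dot_product(v)/v.norm()^2)*v
s == proj(u_1, s) + proj(u_2, s) + proj(u_3, s)  # should evaluate to True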

The Gram-Schmidt Process is a method for transforming a basis for a subspace into an orthogonal basis for that subspace. The following function performs the algorithm; more details can be found in section 8.2 of [1].

def Gram_Schmidt(B):
    """
    This function takes in a basis, B, for a subspace and outputs a corresponding
    orthogonal basis for the subspace using the Gram-Schmidt process.
    (Note: This function does not handle errors)

    Inputs:
    -------
    B : list of vectors

    Outputs:
    --------
    O : list of vectors
    """
    O = []
    for i, v_i in enumerate(B):
        # subtract the projections onto the vectors already orthogonalized (projection formula)
        orth_v = v_i - sum([(v_i.dot_product(O[j])/O[j].norm()^2)*O[j] for j in range(i)])
        O.append(orth_v)
    return O
B = [vector([1,0,1,1]), vector([0,2,0,3]), vector([-3,-1,1,5])]
Gram_Schmidt(B)

# try this yourself using a larger basis that would be too difficult to compute by hand!
# B = []
[(1, 0, 1, 1), (-1, 2, -1, 2), (-3, -3, 1, 2)]
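
As a quick sanity check (not part of the original notebook), the vectors returned by Gram_Schmidt should be pairwise orthogonal:

O = Gram_Schmidt(B)
# every pair of distinct output vectors should have dot product 0
all(O[i].dot_product(O[j]) == 0 for i in range(len(O)) for j in range(i+1, len(O)))  # should be True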

An orthogonal matrix is a square matrix whose columns are unit length and mutually orthogonal, as are its rows. A matrix, $A$, is orthogonally diagonalizable if there exists an orthogonal matrix, $P$, and a diagonal matrix, $D$, such that $A = PDP^{-1}$. The Spectral Theorem from section 8.3 [1] tells us that a matrix is orthogonally diagonalizable if and only if the matrix is symmetric. What a result! The following function takes a symmetric matrix and outputs corresponding orthogonal and diagonal matrices. In particular, this function will make use of the previous Gram-Schmidt process.

def orthogonal_diagonalization(A):
    """
    This function takes in a symmetric matrix, A, and outputs diagonal and
    orthogonal matrices which orthogonally diagonalize A.

    Inputs:
    -------
    A : matrix (symmetric)

    Outputs:
    --------
    matrix : a diagonal matrix
    matrix : an orthogonal matrix
    """
    if A.transpose() != A:
        raise ValueError("orthogonal_diagonalization only accepts symmetric matrices")
    O = []
    # sort to look at eigenvectors from largest to smallest eigenvalue;
    # this ordering will be important when using SVD
    for _, eigen_vectors, multiplicity in sorted(A.eigenvectors_right(), reverse=True):
        if multiplicity > 1:
            # if an eigenvalue has multiplicity, we need to find an
            # orthogonal basis for its multiple eigenvectors
            O.extend(Gram_Schmidt(eigen_vectors))
        else:
            O.extend(eigen_vectors)
    # normalize the basis vectors
    norm_O = [v.normalized() for v in O]
    return matrix.diagonal(sorted(A.eigenvalues(), reverse=True)), matrix(norm_O).transpose()
A = matrix(SR, [[1,3,-3,-3],[3,-3,3,-1],[-3,3,1,-3],[-3,-1,-3,-3]])
D, P = orthogonal_diagonalization(A)

if A == P*D*P.inverse():
    print(A, " is orthogonally diagonalized by the orthogonal matrix\n\n", P, " and the diagonal matrix\n\n", D)

# try this yourself with your own symmetric matrix!
# A = matrix([[]])
[ 1  3 -3 -3]
[ 3 -3  3 -1]
[-3  3  1 -3]
[-3 -1 -3 -3]  is orthogonally diagonalized by the orthogonal matrix

[ 1/2*sqrt(2)          1/2            0          1/2]
[           0          1/2  1/2*sqrt(2)         -1/2]
[-1/2*sqrt(2)          1/2            0          1/2]
[           0         -1/2  1/2*sqrt(2)          1/2]  and the diagonal matrix

[ 4  0  0  0]
[ 0  4  0  0]
[ 0  0 -4  0]
[ 0  0  0 -8]
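
Since $P$ is orthogonal, $P^{-1} = P^T$, so the factorization can equivalently be checked as $A = PDP^T$. A quick sanity check (not part of the original notebook, assuming D and P from the cell above):

# the columns of P should be orthonormal, and P^T should work as P^{-1}
print((P.transpose()*P).simplify_full() == identity_matrix(4))
print((P*D*P.transpose()).simplify_full() == A)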

In the more general case where we have an $n \times m$ matrix $A$ with linearly independent columns, we can factorize it as $A = QR$ where $Q$ is an $n \times m$ matrix with orthonormal columns and $R$ is an $m \times m$ upper triangular matrix with positive diagonal entries. This method is known as QR Factorization. The following function takes a matrix with linearly independent columns and outputs the corresponding orthonormal and upper triangular matrices. See section 8.3 [1] for more details.

def QR_factorization(A):
    """
    This function takes in a matrix, A, with linearly independent columns and outputs
    orthonormal and upper triangular matrices which QR factorize A.

    Inputs:
    -------
    A : matrix (linearly independent columns)

    Outputs:
    --------
    matrix : matrix with orthonormal columns
    matrix : upper triangular matrix with positive diagonal
    """
    # normalize the orthogonal basis produced by Gram-Schmidt
    norm_O = [v.normalized() for v in Gram_Schmidt(A.columns())]
    # building a matrix from these vectors places them as rows, so this is Q^T
    Q_transpose = matrix(norm_O)
    return Q_transpose.transpose(), Q_transpose*A
# Note: matrix A uses the same vectors as the previous Gram-Schmidt example.
A = matrix([[1,0,-3],[0,2,-1],[1,0,1],[1,3,5]])
Q, R = QR_factorization(A)

if A == Q*R:
    print(A, " is QR factorized by the orthonormal matrix\n\n", Q, " and the upper triangular matrix with positive diagonal\n\n", R)

# try this yourself with your own matrix!
# A = matrix([])
[ 1  0 -3]
[ 0  2 -1]
[ 1  0  1]
[ 1  3  5]  is QR factorized by the orthonormal matrix

[  1/3*sqrt(3) -1/10*sqrt(10) -3/23*sqrt(23)]
[            0   1/5*sqrt(10) -3/23*sqrt(23)]
[  1/3*sqrt(3) -1/10*sqrt(10)  1/23*sqrt(23)]
[  1/3*sqrt(3)   1/5*sqrt(10)  2/23*sqrt(23)]  and the upper triangular matrix with positive diagonal

[ sqrt(3)  sqrt(3)  sqrt(3)]
[       0 sqrt(10) sqrt(10)]
[       0        0 sqrt(23)]
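
The if-statement above already verifies $A = QR$. As an additional check (not part of the original notebook, assuming Q from the cell above), the columns of $Q$ should be orthonormal:

# Q^T Q should be the 3 x 3 identity matrix
print((Q.transpose()*Q).simplify_full() == identity_matrix(3))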

An even more general factorization is singular value decomposition (SVD). SVD can be applied to any $n \times m$ matrix and decomposes the matrix into $U \Sigma V^T$ where $U$ is an $n \times n$ orthogonal matrix, $\Sigma$ is an $n \times m$ diagonal matrix padded with 0's, and $V$ is an $m \times m$ orthogonal matrix. More specific details on the decomposition can be found in section 8.4 [1]. The following function implements SVD for any $n \times m$ matrix.
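
In outline, the construction used below works as follows: the columns of $V$ are orthonormal eigenvectors of the symmetric matrix $A^TA$, the singular values on the diagonal of $\Sigma$ are $\sigma_i = \sqrt{\lambda_i}$ where the $\lambda_i$ are the corresponding (nonnegative) eigenvalues taken in decreasing order, and the first columns of $U$ are $u_i = \frac{1}{\sigma_i}Av_i$, extended to an orthonormal basis of $\mathbb{R}^n$ when needed.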

def SVD(B):
    """
    This function takes in any n x m matrix, B, and outputs an orthogonal n x n matrix,
    a padded diagonal matrix, and an orthogonal m x m matrix which factorize B.

    Inputs:
    -------
    B : matrix

    Outputs:
    --------
    matrix : n x n orthogonal matrix
    matrix : diagonal matrix padded to be n x m
    matrix : m x m orthogonal matrix
    """
    # apply the algorithm on a matrix with at least as many rows as columns
    if B.nrows() < B.ncols():
        A = B.transpose()
    else:
        A = B
    # eigen-decompose A^T A; its eigenvalues are the squared singular values
    D, V = orthogonal_diagonalization(A.transpose()*A)
    root_D = D.principal_square_root()
    # pad the diagonal matrix with zero rows
    S = root_D.stack(matrix(0, nrows=A.nrows()-A.ncols(), ncols=A.ncols()))
    # the first columns of U are A*v_i/sigma_i; the remaining columns come from
    # the left kernel of A (the orthogonal complement of the column space)
    U = []
    for i, s in enumerate(S.diagonal()):
        U.append((A*V.column(i)/s).normalized())
    for v in A.kernel().basis():
        U.append(v.normalized())
    # return based on whether the function ran on the transpose
    if A == B:
        return matrix(U).transpose(), S, V.transpose()
    else:
        return V, S.transpose(), matrix(U)
A = matrix([[1,2],[2,0],[0,2]])
U, S, V_transpose = SVD(A)
print(A, " = A\n")
print(U*S*V_transpose, " = USV^T\n")

B = matrix([[1,2,1,0],[2,0,1,1]])
U, S, V_transpose = SVD(B)
print(B, " = B\n")
print(U*S*V_transpose, " = USV^T\n")

# try this yourself with your own matrix!
# C = matrix([])
[1 2]
[2 0]
[0 2]  = A

[1 2]
[2 0]
[0 2]  = USV^T

[1 2 1 0]
[2 0 1 1]  = B

[1 2 1 0]
[2 0 1 1]  = USV^T

SVD has applications in image compression. We can express an $n \times m$ matrix, $A$, as the sum $A = \alpha_1 u_1 v_1 + \alpha_2 u_2 v_2 + \cdots + \alpha_n u_n v_n$ where $u_i$ is a column vector of $U$, $v_i$ is a row vector of $V^T$, and the $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$ come from the diagonal of $\Sigma$ in the SVD decomposition of $A$. The product used between $u_i$ and $v_i$ in the expansion is the outer product. Since the $\alpha_i$ are decreasing, taking the sum with only the first $k$ terms can still give us a good approximation of the original matrix. See section 8.4 [1] for more details. We show the results of this application below utilizing Python's NumPy library.
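
The cells below build these truncated sums term by term with explicit loops. The same computation could be wrapped in a small helper like the following sketch (rank_k_approx is not part of the original notebook; it assumes U, S, V_T as returned by np.linalg.svd):

import numpy as np

def rank_k_approx(U, S, V_T, k):
    # sum of the first k outer-product terms alpha_i * (u_i v_i)
    approx = np.zeros((U.shape[0], V_T.shape[1]))
    for i in range(k):
        approx += S[i] * np.outer(U[:, i], V_T[i, :])
    return approx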

# import image into an array with appropriate libraries
from PIL import Image
import numpy as np
from matplotlib.pyplot import imshow

img = Image.open("egg.png").convert("L")
egg_array = np.array(img)
# Use numpy's SVD algorithm to compute U, S, V_T
U, S, V_T = np.linalg.svd(egg_array, full_matrices=True)
# For 100, 50, 10 terms used, compute the compressed image
arr_100 = np.zeros(shape=egg_array.shape)
for i in range(100):
    arr_100 += np.multiply.outer(U[:, i], V_T[i, :])*S[i]

arr_50 = np.zeros(shape=egg_array.shape)
for i in range(50):
    arr_50 += np.multiply.outer(U[:, i], V_T[i, :])*S[i]

arr_10 = np.zeros(shape=egg_array.shape)
for i in range(10):
    arr_10 += np.multiply.outer(U[:, i], V_T[i, :])*S[i]

print("Values for alpha at 1000, 100, 50, and 10th terms respectively: ", round(S[1000],3), round(S[100],3), round(S[50],3), round(S[10],3))
Values for alpha at 1000, 100, 50, and 10th terms respectively: 26.811 2119.213 3628.81 11689.741
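
One way to quantify how good these approximations are (not part of the original notebook) is the relative error in the Frobenius norm, which np.linalg.norm computes by default for matrices; this assumes egg_array and arr_100 from the cells above:

# relative reconstruction error of the rank-100 approximation
rel_err_100 = np.linalg.norm(egg_array - arr_100) / np.linalg.norm(egg_array)
print(round(rel_err_100, 4))
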
print("Original:") imshow(egg_array, cmap="gray")
Original:
<matplotlib.image.AxesImage object at 0x7fbd3d495410>
Image in a Jupyter notebook
print("SVD compressed 100:") imshow(arr_100, cmap="gray")
SVD compressed 100:
<matplotlib.image.AxesImage object at 0x7fbd3d50d2d0>
Image in a Jupyter notebook
print("SVD compressed 50:") imshow(arr_50, cmap="gray")
SVD compressed 50:
<matplotlib.image.AxesImage object at 0x7fbd3c38d2d0>
Image in a Jupyter notebook
print("SVD compressed 10:") imshow(arr_10, cmap="gray")
SVD compressed 10:
<matplotlib.image.AxesImage object at 0x7fbd3d509410>
Image in a Jupyter notebook
img = Image.open("val_gray.png").convert("L") val_array = np.array(img)
# Use numpy's SVD algorithm to compute U, S, V_T
U, S, V_T = np.linalg.svd(val_array, full_matrices=True)

# For 50, 20, 10, 5 terms used, compute the compressed image
arr_50 = np.zeros(shape=val_array.shape)
for i in range(50):
    arr_50 += np.multiply.outer(U[:, i], V_T[i, :])*S[i]

arr_20 = np.zeros(shape=val_array.shape)
for i in range(20):
    arr_20 += np.multiply.outer(U[:, i], V_T[i, :])*S[i]

arr_10 = np.zeros(shape=val_array.shape)
for i in range(10):
    arr_10 += np.multiply.outer(U[:, i], V_T[i, :])*S[i]

arr_5 = np.zeros(shape=val_array.shape)
for i in range(5):
    arr_5 += np.multiply.outer(U[:, i], V_T[i, :])*S[i]

print("Values for alpha at 200, 50, 20, 10, and 5th terms respectively: ", round(S[200],3), round(S[50],3), round(S[20],3), round(S[10],3), round(S[5],3))
Values for alpha at 200, 50, 20, 10, and 5th terms respectively: 3.022 165.809 379.725 626.37 1215.939
print("Original:") imshow(val_array, cmap="gray")
Original:
<matplotlib.image.AxesImage object at 0x7fbd3c43a5d0>
Image in a Jupyter notebook
print("SVD compressed 50:") imshow(arr_50, cmap="gray")
SVD compressed 50:
<matplotlib.image.AxesImage object at 0x7fbd3d5225d0>
Image in a Jupyter notebook
print("SVD compressed 20:") imshow(arr_20, cmap="gray")
SVD compressed 20:
<matplotlib.image.AxesImage object at 0x7fbd3be0d2d0>
Image in a Jupyter notebook
print("SVD compressed 10:") imshow(arr_10, cmap="gray")
SVD compressed 10:
<matplotlib.image.AxesImage object at 0x7fbd3be68390>
Image in a Jupyter notebook
print("SVD compressed 5:") imshow(arr_5, cmap="gray")
SVD compressed 5:
<matplotlib.image.AxesImage object at 0x7fbd39577390>
Image in a Jupyter notebook