CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutSign UpSign In

Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place.

| Download

Chapter 3 (Linear algebra)

Views: 26
Visibility: Unlisted (only visible to those who know the link)
Image: ubuntu2004-eol
Kernel: SageMath 9.8

Chapter 3 - A short introduction to Matrix Calculus and Linear algebra


3.1. Matrix and vector manipulation via Sage
3.2. Solving linear systems via Sage
3.3. Eignevalues and eigenvectors
3.4. Diagonalization  
3.5. Quadratic forms
3.6. Linear programming
3.7. Markov processes and other applications
3.8. The Matplotlib library in SageMath

Linear algebra is a fundamental branch of mathematics dealing with vectors, vector spaces, and linear transformations.

It provides powerful tools for solving systems of linear equations, analyzing geometric transformations, and understanding the properties of matrices.

Hence it is also a fundamental tool for solving a wide range of problems in science, engineering, and economics.

The motivation for studying matrix calculus lies in its ability to simplify complex mathematical operations involving matrices, making it easier to analyze and solve problems that arise in real-world applications.

By using matrix calculus, we can express a large number of calculations in a compact and efficient form, which allows us to perform computations that would otherwise be difficult or impossible.

Furthermore, the applications of matrix calculus are vast, including image processing, optimization, machine learning, and data analysis, making it an essential tool for modern-day mathematicians and scientists.

In this chapter we will explore the practical aspects of linear algebra using the Sage mathematical software system.

Sage offers a comprehensive set of tools for performing computations related to linear algebra, such as matrix operations, solving linear systems, calculating eigenvalues and eigenvectors, and much more.

We will also explore Sage tools for Linear Optimization problems and Markov Processess.

We first proceed with basics from matrix calculus and preliminaries from linear algebra.


3.1. Matrix and vector manipulation via Sage

To introduce a matrix in Sage there are many alternatives. For instance we can use the matrix function matrix{\tt{matrix}}, as in our first examples below.

Most of the known operations between matrices correspond to built-in commands in Sage. Let us list some of them:

  1. transpose ATA^T of a matrix AA - transpose(A){\tt{transpose(A)}} or A.transpose( ){\tt{A.transpose( \ )}} or A.T{\tt{A.T}}.

  2. determinant of a square matrix AA - det(A){\tt{det(A)}} or A.det( ){\tt{A.det( \ )}}

  3. trace of a square matrix AA - A.trace( ){\tt{A.trace( \ )}}

  4. inverse of AA - inverse(A){\tt{inverse(A)}} or A.inverse( ){\tt{A.inverse( \ )}}

  5. Rank of AA - rank(A){\tt{rank(A)}}

\bullet To introduce a vector we use the vector{\tt{vector}} function (see below).

\bullet The standard scalar product (dot product) of two vectors u,w=wTu\langle u, w\rangle=w^Tu is given by u.dot_product(w){\tt{u.dot\_product(w)}}.

Exercise

a) Introduce in Sage the matrices (2212)\begin{pmatrix} 2 & 2 \\ 1 & 2\end{pmatrix} and (1331)\begin{pmatrix} 1 & 3 \\ 3 & 1\end{pmatrix}.

b) Next use Sage to compute the matrices A+BA+B, 2A2A, ABAB, A2A^2, ATA^{T}, A1A^{-1}, B1B^{-1}.

c) Confirm using the bool{\tt{bool}} command that the given matrices A1A^{-1}, B1B^{-1} are correct.

d) Compute the determinant, the trace and the rank of AA.

Solution:

A=matrix(RR, [[2, 2], [1, 2]]) B=matrix(RR, [[1, 3], [3, 1]]) show(A);show(B)

(2.000000000000002.000000000000001.000000000000002.00000000000000)\displaystyle \left(\begin{array}{rr} 2.00000000000000 & 2.00000000000000 \\ 1.00000000000000 & 2.00000000000000 \end{array}\right)

(1.000000000000003.000000000000003.000000000000001.00000000000000)\displaystyle \left(\begin{array}{rr} 1.00000000000000 & 3.00000000000000 \\ 3.00000000000000 & 1.00000000000000 \end{array}\right)

If we want to avoid the decimal presentation of the entries, we can avoid to indicate the field, and simply type the following:

A=matrix([[2, 2], [1, 2]]) B=matrix([[1, 3], [3, 1]]) show(A);show(B)

(2212)\displaystyle \left(\begin{array}{rr} 2 & 2 \\ 1 & 2 \end{array}\right)

(1331)\displaystyle \left(\begin{array}{rr} 1 & 3 \\ 3 & 1 \end{array}\right)

b) For the matrix operations we can now proceed as usual:

# Addition A+B pretty_print(A+B)

(3543)\displaystyle \left(\begin{array}{rr} 3 & 5 \\ 4 & 3 \end{array}\right)

# Scalar multiplication D = 2*A show(D)

(4424)\displaystyle \left(\begin{array}{rr} 4 & 4 \\ 2 & 4 \end{array}\right)

# Matrix multiplication E = A*B show(E)

(8875)\displaystyle \left(\begin{array}{rr} 8 & 8 \\ 7 & 5 \end{array}\right)

# transpose # ONE CAN USE ALSO THE SHORTER CODE A.T show(A.transpose())

(23328827349981812027751233326)\displaystyle \left(\begin{array}{rrrr} -23 & -32 & -8 & 8 \\ -27 & -34 & -9 & 9 \\ 81 & 81 & 20 & -27 \\ -75 & -123 & -33 & 26 \end{array}\right)

or type

show(transpose(A))

(2122)\displaystyle \left(\begin{array}{rr} 2 & 1 \\ 2 & 2 \end{array}\right)

# get Nth power of a matrix show(A**2)

(6846)\displaystyle \left(\begin{array}{rr} 6 & 8 \\ 4 & 6 \end{array}\right)

#Inverse show(A.inverse())

(11121)\displaystyle \left(\begin{array}{rr} 1 & -1 \\ -\frac{1}{2} & 1 \end{array}\right)

show(B.inverse())

(18383818)\displaystyle \left(\begin{array}{rr} -\frac{1}{8} & \frac{3}{8} \\ \frac{3}{8} & -\frac{1}{8} \end{array}\right)

c) Let us verify that indeed the given matrice are the inverses of A,BA, B, respectively:

Ainv=matrix([[1, -1], [-1/2, 1]]) Binv=matrix([[-1/8, 3/8], [3/8, -1/8]]) print(bool(A*Ainv==identity_matrix(RR, 2))) print(bool(Ainv*A==identity_matrix(RR, 2))) print(bool(B*Binv==identity_matrix(RR, 2))) print(bool(Binv*B==identity_matrix(RR, 2)))
True True True True

d) For the determinant, trace and rank of AA, which are all numbers, we get:

# Determinant detA = A.det() detA
2

or simply type the following:

det(A)
2
# Trace A.trace()
4
# rank of a matrix print(rank(A))
2

Exercise

Use the command echelon_form(){\tt{echelon\_form( )}} to transform the matrix AA given below into its echelon form and count the number of nonzero rows to get the rank. Then verify your answer using the command rank( ){\tt{rank( \ )}}.

A=(123456789)A=\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}

Solution:

A = matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) B = A.echelon_form() show(B) show(B.rank()) bool(B.rank()==A.rank())

(123036000)\displaystyle \left(\begin{array}{rrr} 1 & 2 & 3 \\ 0 & 3 & 6 \\ 0 & 0 & 0 \end{array}\right)

2\displaystyle 2

True

Exercise

Introduce the vectors v=(1,0,2)v=(1, 0, 2) and w=(1,2,1)w=(1, 2, 1) of R3\mathbb{R}^3.

Next compute the vector u=3vwu=3v-w and the dot products v,w\langle v, w\rangle an d u,v\langle u, v\rangle.

Solution:

v = vector([1,0,2]) print(f'v = {v}')
v = (1, 0, 2)

Or simply type

v=vector([1, 0, 2]); show(v)

(1,0,2)\displaystyle \left(1,\,0,\,2\right)

Similarly for ww:

w = vector([1,2,1])
u=3*v - w; show(u)

(2,2,5)\displaystyle \left(2,\,-2,\,5\right)

# dot product of vectors print(v*w) # alternative syntax v.dot_product(w)
3
3
u.dot_product(v)
12

As a confirmation one may use the bool{\tt{bool}} function, as follows:

bool(u.dot_product(v)==u*v)
True

Exercise

Given two vectors v=(2,3,1)v = (2, 3, -1) and w=(1,4,2)w = (-1, 4, 2), compute their inner product.

Solution:

Exercise

Plot the following vectors in Sage: u=(1,2,3),  v=(3,1,4),  u+v,  uv. u=(1, 2, 3), \ \ v=(-3, 1, 4), \ \ u+v, \ \ u-v.

Solution:

v = vector([2, 3, -1]) w = vector([-1, 4, 2]) inner_product = v.dot_product(w) print("The inner product of the vectors w and w is a real number, namely the number:", inner_product)
The inner product of the vectors w and w is a real number, namely the number: 8

To plot vectors of R3\mathbb{R}^3 one can use the command arrow3d{\tt{arrow3d}}. Let us illustrate its implementation with an example:

Exercise

Plot the vectors u=(1,2,3)Tu=(1, 2, 3)^{T}, v=(3,1,4)Tv=(-3, 1, 4)^{T}, u+vu+v and uvu-v in SageMath.

Solution:

nul=vector([0, 0, 0]) u=vector([1, 2, 3]) v=vector([-3, 1, 4]) vec_u=arrow3d(nul, u, color="red") vec_v=arrow3d(nul, v, color="blue") vec_3=arrow3d(nul, u+v, color="green") vec_4=arrow3d(nul, u-v) l1=line3d((u, u+v), color="black", linestyle="--") l2=line3d((v, u+v), color="black", linestyle="--") show(vec_u+vec_v+vec_3+vec_4+l1+l2, axes=True, frame=True)

Exercise

Introduce the vectors u=(1,1,0)Tu=(-1, 1, 0)^{T} and v=(1,0,1)Tv=(-1, 0, 1)^{T} of R3\mathbb{R}^3 and compute their cross product u×vu\times v. Next plot the vectors uu, vv and u+vu+v.

Instructions: We want the three axes to be visible in the figure and also dashed boundary lines for the three vectors for a better visualization.

Solution:

# Introduce the vectors u = vector([-1, 1, 0]) v = vector([-1, 0, 1]) # compute and dispaly their product cross_product = u.cross_product(v) print("The cross product of u and v is the vector", cross_product) # Create a plot with a small black dot at the origin p = point3d([0, 0, 0], color='black', size=5) # Smaller dot for the origin # Add the axes p += arrow3d((0, 0, 0), (2, 0, 0), color='cyan', linestyle="-", width=0.3) # x-axis p += arrow3d((0, 0, 0), (0, 2, 0), color='steelblue', width=0.3) # y-axis p += arrow3d((0, 0, 0), (0, 0, 2), color='purple', width=0.3) # z-axis # Add vectors to the plot p += arrow3d((0, 0, 0), u, color='blue', width=0.6, legend_label='u = (-1, 1, 0)') p += arrow3d((0, 0, 0), v, color='red', width=0.6, legend_label='v = (-1, 0, 1)') p += arrow3d((0, 0, 0), cross_product, color='green', width=1, legend_label='u x v') # Add parallelograms for better visualization of the cross product p += polygon3d([ [0, 0, 0], u, u+v, v], color='lightblue', opacity=0.3) p += polygon3d([ [0, 0, 0], u, u+cross_product, cross_product], color='lightgreen', opacity=0.3) p += polygon3d([ [0, 0, 0], v, v+cross_product, cross_product], color='lightcoral', opacity=0.3) # Add dashed boundary lines for better visualization p += line3d([(-1, 0, 0), (1, 0, 0)], color='gray', linestyle='dashed') # x bounds p += line3d([(0, -1, 0), (0, 1, 0)], color='gray', linestyle='dashed') # y bounds p += line3d([(0, 0, -1), (0, 0, 1)], color='gray', linestyle='dashed') # z bounds # Customize the plot to allow interaction with fixed zoom and axis labels p.show( axes=True, #aspect_ratio=[1, 1, 1], figsize=10, axes_labels=['x', 'y', 'z'], #gridlines=True, # viewer='tachyon', # Enable interactive viewing # zoom=0.8 # Fixed zoom level )

Exercise

For vectors v=(1,2,2)v = (1, 2, 2) and w=(2,1,1)w = (2, -1, 1), calculate the angle θ\theta between them (in radians)

Solution:

v = vector([1, 2, 2]) w = vector([2, -1, 1]) cos_theta = v.dot_product(w) / (v.norm() * w.norm()) angle_theta = arccos(cos_theta).n() print("The angle θ between v and w is given by", angle_theta, "radians")
The angle θ between v and w is given by 1.29515352757863 radians

Exercise

Write a short program in Sage checking if the vectors v=(4,2,1)v = (4, -2, 1) and w=(2,1,8)w = (2, 1, -8) are orthogonal.

Solution:

v = vector([4, -2, 1]) w = vector([2, 1, -8]) is_orthogonal = v.dot_product(w) == 0 print("Are v and w orthogonal?", is_orthogonal)
Are v and w orthogonal? False

Exercise

Find the projection of the vector v=(3,4,0)v = (3, 4, 0) onto w=(1,0,0)w = (1, 0, 0).

Solution:

v = vector([3, 4, 0]) w = vector([1, 0, 0]) projection_v_on_w = (v.dot_product(w) / w.norm()^2) * w print("The projection of v onto w is the vector", projection_v_on_w)
The projection of v onto w is the vector (3, 0, 0)

Exercise

Introduce the matrix MM given below and next compute its transpose, its determinant and its trace:

M=(012123234).M=\begin{pmatrix} 0 & 1 & 2 \\ 1 & 2 & 3 \\ 2 & 3 & 4 \end{pmatrix}\,.

Solution:

An alternative way to introduce a square matrix in Sage goes as follows:

# alternative way M = matrix(QQ, 3, 3, lambda i, j: i+j) show(M)

(012123234)\displaystyle \left(\begin{array}{rrr} 0 & 1 & 2 \\ 1 & 2 & 3 \\ 2 & 3 & 4 \end{array}\right)

# transpose show(M.transpose()) # determinant print(f'det M = {M.det()}') # trace print(f'tr M = {M.trace()}')

(012123234)\displaystyle \left(\begin{array}{rrr} 0 & 1 & 2 \\ 1 & 2 & 3 \\ 2 & 3 & 4 \end{array}\right)

det M = 0 tr M = 6

Notice for the transpose of a matrix MM we can use the shortcut M.TM.T

show(M.T)

(012123234)\displaystyle \left(\begin{array}{rrr} 0 & 1 & 2 \\ 1 & 2 & 3 \\ 2 & 3 & 4 \end{array}\right)

Exercise

For the matrices M1,M2M_1, M_2 given below and the vector k=(1,2)k=(1, 2) on R2\mathbb{R}^2 compute the following:

M3:=M1M2,M3k,kM3,k2M_3:=M_1M_2, \quad M_3k, \quad kM_3, \quad \|k\|^2
M1 = matrix([[1, 2], [3, 4]]) M2 = matrix([[2, 3], [4, 5]]) M3 = M1 * M2 k = vector([1,2]) show(M3) show(M3*k) show(k*M3) show(k*k)

(10132229)\displaystyle \left(\begin{array}{rr} 10 & 13 \\ 22 & 29 \end{array}\right)

(36,80)\displaystyle \left(36,\,80\right)

(54,71)\displaystyle \left(54,\,71\right)

5\displaystyle 5

Or for the norm u\|u\| of uu one can directly type

norm(k)
sqrt(5)

and hence k2=(5)2=5\|k\|^2=(\sqrt 5)^2=5.

Exercise

Recall that the kernel of the matrix is the space of vectors satisfying

Ax=0.Ax = 0.

For this, in Sage one should use the command A.left_kernel(){\tt{A.left\_kernel()}}.

Apply this method to find the kernel of the matrix A=(111211).A=\begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 1\end{pmatrix}.

A = matrix([[1,1],[1,2],[1,1]]) show(A.left_kernel())

RowSpanZ(101)\displaystyle \renewcommand{\Bold}[1]{\mathbf{#1}}\mathrm{RowSpan}_{\Bold{Z}}\left(\begin{array}{rrr} 1 & 0 & -1 \end{array}\right)

Remark

Be aware that in Sage the built-in function A.kernel( ){\tt{A.kernel( \ )}} provides the right kernel of AA, that is those vectors uu satisfying uA=0uA=0.

3.2. Solving linear systems via Sage

Sage can used very succefully to solve linear systems.

In particular, suppose we have a matrix AA and a vector YY and we want to find the solution of the equation AX=Y. AX = Y\,.

To solve this task in Sage, a method relies on the solve_right(){\tt{solve\_right()}} function. Let us describe an example.

Exercise

Solve the system Ax=bAx = b, where A=(1234)A=\begin{pmatrix} 1 & 2 \\ 3 & 4\end{pmatrix} and bb is the vector (56)\begin{pmatrix} 5 \\ 6\end{pmatrix} on R2\mathbb{R}^2.

Solution:

# Solve the system Ax = b, where A is a matrix and b is a vector # This solves the system Ax = b and returns the solution vector x. A = Matrix([[1, 2], [3, 4]]) b = vector([5, 6]) x = A.solve_right(b) x
(-4, 9/2)

We could also type

show(A\b)

(4,92)\displaystyle \left(-4,\,\frac{9}{2}\right)

As another example: Use the same AA but replace the constant vector bb by the vector (13,8)(13, 8).

Y = vector([13,8]) A.solve_right(Y)
(-18, 31/2)

or as we said an alternative we could type:

# alternative syntax A \ Y
(-18, 31/2)

Matrix inverse method for solving linear systems

Another way to solve linear systems is by finding the inverse of the matrix A. SAGE has a built-in function "A.inverse()" that computes the inverse of a matrix

A = matrix([[2, 3, 1], [4, 5, 3], [2, 7, 5]]) b = vector([1, 2, 1]) Ainv = A.inverse() x = Ainv*b x
(1/2, 0, 0)

Gaussian elimination

This is a common method for solving linear systems, and SAGE has built-in functions to perform it. The command "A.rref()" computes the row-reduced echelon form of the matrix A, which is useful for finding solutions

A = matrix([[2, 3, 1], [4, 5, 3], [2, 7, 5]]) b = vector([1, 2, 1]) M = A.augment(b, subdivide=True) rrefM = M.rref() show(rrefM)

(1001201000010)\displaystyle \left(\begin{array}{rrr|r} 1 & 0 & 0 & \frac{1}{2} \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right)

LU factorization

Another method for solving linear systems is LU factorization. This involves factoring the matrix A into the product of two matrices, L and U, where L is lower triangular and U is upper triangular. SAGE has a built-in function A.LU( ){\tt{A.LU( \ )}}, that computes this factorization.

A = matrix([[2, 3, 1], [4, 5, 3], [2, 7, 5]]) b = vector([1, 2, 1]) lu = A.LU() show(lu)

((001100010),(100121012191),(453092720089))\displaystyle \left(\left(\begin{array}{rrr} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{array}\right), \left(\begin{array}{rrr} 1 & 0 & 0 \\ \frac{1}{2} & 1 & 0 \\ \frac{1}{2} & \frac{1}{9} & 1 \end{array}\right), \left(\begin{array}{rrr} 4 & 5 & 3 \\ 0 & \frac{9}{2} & \frac{7}{2} \\ 0 & 0 & -\frac{8}{9} \end{array}\right)\right)

Exercise

Find whether or not the following matrix is invertible:

(3212412422412348)\begin{pmatrix} 3 & 2 & -1 & 2\\ 4 & 1 & 2 & -4\\ -2 & 2 & 4 & 1\\ 2 & 3 & -4 & 8\\ \end{pmatrix}

Solution:

To determine whether or not the matrix is invertible, we can calculate its determinant. If the determinant is nonzero, then the matrix is invertible.

# define the matrix A = matrix([[3, 2, -1, 2], [4, 1, 2, -4], [-2, 2, 4, 1], [2, 3, -4, 8]]) # compute the determinant det_A = A.det() det_A
0

Since the determinant of the given matrix is 0, the matrix is not invertible

Exercise

Solve the system of simultaneous linear equations:

x1+2x2+3x3=2,x_1 + 2x_2 + 3x_3 = 2,2x13x2x3=3,2x_1 − 3x_2 − x_3 = −3,3x1+x2+2x3=3−3x_1 + x_2 + 2x_3 = −3
Solution:
# Define the coefficients of the system of equations A = matrix([[1, 2, 3], [2, -3, -1], [-3, 1, 2]]) # Define the constant terms b = vector([2, -3, -3]) # Solve the system of equations x = A.solve_right(b) # Print the solution print("The solution to the system of equations is: ", x)
The solution to the system of equations is: (1, 2, -1)

Exercise

Solve the matrix equation

X.(2513)=(4621)X. \begin{pmatrix} 2 & 5\\ 1 & 3 \end{pmatrix} = \begin{pmatrix} 4 & -6\\ 2 & 1 \end{pmatrix}
Solution:
# define matrices A = matrix([[2, 5], [1, 3]]) B = matrix([[4, -6], [2, 1]]) # get inverse of A A_inv = A.inverse() # compute X using the formula X = B · A^(-1) X = B * A_inv print("The solution to the matrix equation X · A = B is") show(X)
The solution to the matrix equation X · A = B is

(183258)\displaystyle \left(\begin{array}{rr} 18 & -32 \\ 5 & -8 \end{array}\right)

Exercise

Determine whether or not the vectors (1, 2, 3, 1), (1, 0, −1, 1), (2, 1, −1, 3) and (0, 0, 3, 2) are linearly independent.

Solution:

To determine whether or not the vectors are linearly independent, we can form a matrix with these vectors as its columns, and then compute the rank of the matrix.

# define the matrix with columns as vectors A = matrix([[1, 1, 2, 0], [2, 0, 1, 0], [3, -1, -1, 3], [1, 1, 3, 2]]) # compute rank of the matrix rank_A = A.rank() print("The rank of the matrix A is", rank_A)
The rank of the matrix A is 4

Since the rank of the matrix is equal to the number of columns, the columns of the matrix are linearly independent.

Therefore, the vectors (1, 2, 3, 1), (1, 0, −1, 1), (2, 1, −1, 3), and (0, 0, 3, 2) are linearly independent.

Exercise

Determine the number of solutions for the system

4x1+2x212x3=0,4x_1 + 2x_2 - 12x_3 = 0,5x1+2x2x3=0,5x_1 + 2x_2 − x_3 = 0,2x1x2+6x3=4−2x_1 - x_2 + 6x_3 = 4
Solution:

To determine the number of solutions of the system, we will find out the rank to see if the equations are linearly independent or not. We do this as follows:

# Define the coefficient matrix A and the right-hand side vector b A = matrix(QQ, [[4, 2, -12], [5, 2, -1], [-2, -1, 6]]) b = vector(QQ, [0, 0, 4]) # Construct the augmented matrix [A|b] AB = A.augment(b, subdivide=True) # Check the rank of A and [A|b] rank_A = A.rank() rank_AB = AB.rank() # Determine the number of solutions based on the rank if rank_A == rank_AB: if rank_A == A.ncols(): print("The system has a unique solution.") else: print("The system has infinitely many solutions.") else: print("The system has no solution.")
The system has no solution.

3.3. Eignevalues and eigenvectors

Sage can also compute the eigenvalues and eigenvectors of a matrix. Here's an example:

# Compute the eigenvalues and eigenvectors of a matrix A A = Matrix(SR, [[1, 2], [3, 4]]) eigens = A.eigenvalues() show(eigens) eigenvectors = A.eigenvectors_right() show(eigenvectors)

[1233+52,1233+52]\displaystyle \left[-\frac{1}{2} \, \sqrt{33} + \frac{5}{2}, \frac{1}{2} \, \sqrt{33} + \frac{5}{2}\right]

[(1233+52,[(1,1433+34)],1),(1233+52,[(1,1433+34)],1)]\displaystyle \left[\left(-\frac{1}{2} \, \sqrt{33} + \frac{5}{2}, \left[\left(1,\,-\frac{1}{4} \, \sqrt{33} + \frac{3}{4}\right)\right], 1\right), \left(\frac{1}{2} \, \sqrt{33} + \frac{5}{2}, \left[\left(1,\,\frac{1}{4} \, \sqrt{33} + \frac{3}{4}\right)\right], 1\right)\right]

A = matrix(QQ, [[2,1],[1,2]]) A.eigenvalues()
[3, 1]
for (lam, v, m) in A.eigenvectors_left(): print(f'eigenvalue {lam} of multiplicity {m} with eigenvectors = {v}')
eigenvalue 3 of multiplicity 1 with eigenvectors = [ (1, 1) ] eigenvalue 1 of multiplicity 1 with eigenvectors = [ (1, -1) ]

Exercise

Consider the following matrix: M=(123045106) M = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 1 & 0 & 6 \end{pmatrix} .

  1. Compute the eigenvalues of the matrix MM.

  2. Find the eigenvectors for MM.

  3. Calculate the characteristic polynomial of MM.

Solution:

# Define the matrix M M = Matrix([[1, 2, 3], [0, 4, 5], [1, 0, 6]]) # 1. Calculate eigenvalues of M eigenvalues = M.eigenvalues() # 2. Calculate eigenvectors of M eigenvectors = M.eigenvectors_right() # 3. Calculate the characteristic polynomial of M char_poly = M.characteristic_polynomial() # Display the results "Solution: Eigenvalues of M =", eigenvalues, ", Eigenvectors of M =", eigenvectors, ", Characteristic polynomial of M =", char_poly
('Solution: Eigenvalues of M =', [1.088496685924976?, 2.870538719197560?, 7.040964594877465?], ', Eigenvectors of M =', [(1.088496685924976?, [(1, 0.3496538174332899?, -0.2036036496472015?)], 1), (2.870538719197560?, [(1, 1.414585083746477?, -0.3195438160984645?)], 1), (7.040964594877465?, [(1, 1.579511098820234?, 0.9606474657456659?)], 1)], ', Characteristic polynomial of M =', x^3 - 11*x^2 + 31*x - 22)

3.4. Diagonalization

In SageMath the is a direct method to create diagonalizable matrices and then use them to understand the procedure of diagonalization.

This can be done by using appropriatelly the command random_matrix{\tt{random\_matrix}} including the option diagonalizable{\tt{diagonalizable}}, as in the examples below:

A=random_matrix(QQ, 4, 4, "diagonalizable") show(A)

(432431110100512411159178145)\displaystyle \left(\begin{array}{rrrr} 43 & 24 & 3 & 111 \\ 0 & 1 & 0 & 0 \\ 51 & 24 & 11 & 159 \\ -17 & -8 & -1 & -45 \end{array}\right)

A=random_matrix(QQ, 3, 3, "diagonalizable") show(A)

(25848438123128)\displaystyle \left(\begin{array}{rrr} 25 & 84 & 84 \\ -3 & -8 & -12 \\ -3 & -12 & -8 \end{array}\right)

Note that any time executing the previous code Sage will print a different diagonalizable matrix.

We can use such matrices to make experiments on the diagonalization of a matrices. Recall that a n×nn\times n AA matrix is diagonalizable iff AA is similar to a diagonal matrix DD. Hence in this case we can specify matrices D,PD, P such that AP=PDAP=PD, or equivalently P1AP=DP^{-1}AP=D, where DD is the diagonal matrix whose entries are the eigenvalues of AA, while PP is an invertible n×nn\times n matrix consisting of the eigenvectors of AA .

Let us illustrate the relation P1AP=DP^{-1}AP=D for a diagonalizable matrix AA using the random_matrix{\tt{random\_matrix}} command, as above.

Exercise

Use the random_matrix{\tt{random\_matrix}} command, as above, to illustrate the relation AP=PDAP=PD for a 4×44\times 4 diagonalizable matrix. Moreover, combine the charpoly{\tt{charpoly}} command with the factor{\tt{factor}} command to present the factorized expression of the characteristic polynomial of AA. Repeat for a 5×55\times 5 matrix BB.

Solution:
A=random_matrix(QQ, 4, 4, "diagonalizable") show(A) eigen=A.eigenvalues() print(eigen) eigv=A.eigenvectors_right() show(eigv) chaA=A.charpoly().factor() show("The characteristic polynomial of A, factorized, is given by", chaA) #chaaA=A.characteristic_polynomial().factor() #show(chaaA) eigenm=A.eigenmatrix_right() # returns pair (D, P) with A*P=P*D show(eigenm) D, P=A.eigenmatrix_right() show("The diagonal matrix D is given by:", D) show("The matrix P is given by:", P) #verify the relation AP=PD A*P==P*D

(37281814605036243028192151490)\displaystyle \left(\begin{array}{rrrr} -37 & -28 & -18 & 14 \\ 60 & 50 & 36 & -24 \\ -30 & -28 & -19 & 2 \\ -15 & -14 & -9 & 0 \end{array}\right)

[8, -1, -6, -7]

[(8,[(1,2,1,12)],1),(1,[(1,2,32,12)],1),(6,[(1,32,1,12)],1),(7,[(1,43,23,13)],1)]\displaystyle \left[\left(8, \left[\left(1,\,-2,\,1,\,\frac{1}{2}\right)\right], 1\right), \left(-1, \left[\left(1,\,-2,\,\frac{3}{2},\,\frac{1}{2}\right)\right], 1\right), \left(-6, \left[\left(1,\,-\frac{3}{2},\,1,\,\frac{1}{2}\right)\right], 1\right), \left(-7, \left[\left(1,\,-\frac{4}{3},\,\frac{2}{3},\,\frac{1}{3}\right)\right], 1\right)\right]

The characteristic polynomial of A, factorized, is given by(x8)(x+1)(x+6)(x+7)\displaystyle \verb|The|\verb| |\verb|characteristic|\verb| |\verb|polynomial|\verb| |\verb|of|\verb| |\verb|A,|\verb| |\verb|factorized,|\verb| |\verb|is|\verb| |\verb|given|\verb| |\verb|by| (x - 8) \cdot (x + 1) \cdot (x + 6) \cdot (x + 7)

((8000010000600007),(111122324313212312121213))\displaystyle \left(\left(\begin{array}{rrrr} 8 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -6 & 0 \\ 0 & 0 & 0 & -7 \end{array}\right), \left(\begin{array}{rrrr} 1 & 1 & 1 & 1 \\ -2 & -2 & -\frac{3}{2} & -\frac{4}{3} \\ 1 & \frac{3}{2} & 1 & \frac{2}{3} \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{3} \end{array}\right)\right)

The diagonal matrix D is given by:(8000010000600007)\displaystyle \verb|The|\verb| |\verb|diagonal|\verb| |\verb|matrix|\verb| |\verb|D|\verb| |\verb|is|\verb| |\verb|given|\verb| |\verb|by:| \left(\begin{array}{rrrr} 8 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -6 & 0 \\ 0 & 0 & 0 & -7 \end{array}\right)

The matrix P is given by:(111122324313212312121213)\displaystyle \verb|The|\verb| |\verb|matrix|\verb| |\verb|P|\verb| |\verb|is|\verb| |\verb|given|\verb| |\verb|by:| \left(\begin{array}{rrrr} 1 & 1 & 1 & 1 \\ -2 & -2 & -\frac{3}{2} & -\frac{4}{3} \\ 1 & \frac{3}{2} & 1 & \frac{2}{3} \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{3} \end{array}\right)

True

An example with 5×55\times 5 diagonalizable matrix:

B=random_matrix(QQ, 5, 5, "diagonalizable") show(B) eigen=B.eigenvalues() print(eigen) eigv=B.eigenvectors_right() show(eigv) chaB=B.charpoly().factor() show("The characteristic polynomial of B, factorized, is given by", chaB) eigenm=B.eigenmatrix_right() # returns pair (D, P) with B*P=P*D show(eigenm) D, P=B.eigenmatrix_right() show("The diagonal matrix D is given by:", D) show("The matrix P is given by:", P) #verify the relation BP=PD B*P==P*D

(3366297133418153976343114551604169110659)\displaystyle \left(\begin{array}{rrrrr} -33 & 6 & -6 & -29 & 71 \\ -33 & -4 & 18 & -15 & -39 \\ -7 & 6 & -34 & -31 & 145 \\ 51 & -6 & 0 & 41 & -69 \\ 11 & 0 & -6 & 5 & 9 \end{array}\right)

[7, -4, -4, -10, -10]

[(7,[(1,3,1,3,1)],1),(4,[(1,0,74,34,14),(0,1,34,14,14)],2),(10,[(1,0,1,1,0),(0,1,43,13,13)],2)]\displaystyle \left[\left(7, \left[\left(1,\,3,\,-1,\,-3,\,-1\right)\right], 1\right), \left(-4, \left[\left(1,\,0,\,\frac{7}{4},\,-\frac{3}{4},\,\frac{1}{4}\right), \left(0,\,1,\,-\frac{3}{4},\,-\frac{1}{4},\,-\frac{1}{4}\right)\right], 2\right), \left(-10, \left[\left(1,\,0,\,1,\,-1,\,0\right), \left(0,\,1,\,-\frac{4}{3},\,-\frac{1}{3},\,-\frac{1}{3}\right)\right], 2\right)\right]

The characteristic polynomial of B, factorized, is given by(x7)(x+4)2(x+10)2\displaystyle \verb|The|\verb| |\verb|characteristic|\verb| |\verb|polynomial|\verb| |\verb|of|\verb| |\verb|B,|\verb| |\verb|factorized,|\verb| |\verb|is|\verb| |\verb|given|\verb| |\verb|by| (x - 7) \cdot (x + 4)^{2} \cdot (x + 10)^{2}

((700000400000400000100000010),(1101030101174341433341411311414013))\displaystyle \left(\left(\begin{array}{rrrrr} 7 & 0 & 0 & 0 & 0 \\ 0 & -4 & 0 & 0 & 0 \\ 0 & 0 & -4 & 0 & 0 \\ 0 & 0 & 0 & -10 & 0 \\ 0 & 0 & 0 & 0 & -10 \end{array}\right), \left(\begin{array}{rrrrr} 1 & 1 & 0 & 1 & 0 \\ 3 & 0 & 1 & 0 & 1 \\ -1 & \frac{7}{4} & -\frac{3}{4} & 1 & -\frac{4}{3} \\ -3 & -\frac{3}{4} & -\frac{1}{4} & -1 & -\frac{1}{3} \\ -1 & \frac{1}{4} & -\frac{1}{4} & 0 & -\frac{1}{3} \end{array}\right)\right)

The diagonal matrix D is given by:(700000400000400000100000010)\displaystyle \verb|The|\verb| |\verb|diagonal|\verb| |\verb|matrix|\verb| |\verb|D|\verb| |\verb|is|\verb| |\verb|given|\verb| |\verb|by:| \left(\begin{array}{rrrrr} 7 & 0 & 0 & 0 & 0 \\ 0 & -4 & 0 & 0 & 0 \\ 0 & 0 & -4 & 0 & 0 \\ 0 & 0 & 0 & -10 & 0 \\ 0 & 0 & 0 & 0 & -10 \end{array}\right)

The matrix P is given by:(1101030101174341433341411311414013)\displaystyle \verb|The|\verb| |\verb|matrix|\verb| |\verb|P|\verb| |\verb|is|\verb| |\verb|given|\verb| |\verb|by:| \left(\begin{array}{rrrrr} 1 & 1 & 0 & 1 & 0 \\ 3 & 0 & 1 & 0 & 1 \\ -1 & \frac{7}{4} & -\frac{3}{4} & 1 & -\frac{4}{3} \\ -3 & -\frac{3}{4} & -\frac{1}{4} & -1 & -\frac{1}{3} \\ -1 & \frac{1}{4} & -\frac{1}{4} & 0 & -\frac{1}{3} \end{array}\right)

True

Exercise

Consider the matrix given below: A=(2100121001210012). A = \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix}.

  1. Compute the eigenvalues of the matrix AA.

  2. Find the corresponding eigenvectors for AA.

  3. Calculate the characteristic polynomial of AA.

  4. Determine if the matrix AA is diagonalizable. If it is, find a diagonal matrix DD and an invertible matrix PP such that A=PDP1A = P \cdot D \cdot P^{-1}.

Solution:

# Define the matrix A A = Matrix(QQ, [[2, -1, 0, 0], [-1, 2, -1, 0], [0, -1, 2, -1], [0, 0, -1, 2]]) # 1. Calculate the eigenvalues of A eigenvalues = A.eigenvalues() show("The eigenvalues of the matrix A are:", eigenvalues) # 2. Calculate the eigenvectors of A eigenvectors = A.eigenvectors_right() show("The eigenvectors of the matrix A are:", eigenvectors) # 3. Calculate the characteristic polynomial of A char_poly = A.characteristic_polynomial() show("The characteristic polynomial of the matrix A is:", char_poly) # 4. Determine if A is diagonalizable and find the diagonal form if possible is_diagonalizable = A.is_diagonalizable() if is_diagonalizable: # If A is diagonalizable, find P and D such that A = P * D * P^-1 P, D = A.eigenmatrix_right() "The matrix A is diagonalizable." "The diagonal matrix D is:", D "The matrix P of eigenvectors is:", P else: "The matrix A is not diagonalizable." print("Is the matrix A diagonalizable?", is_diagonalizable)

The eigenvalues of the matrix A are:[1.381966011250106?,3.618033988749895?,0.3819660112501051?,2.618033988749895?]\displaystyle \verb|The|\verb| |\verb|eigenvalues|\verb| |\verb|of|\verb| |\verb|the|\verb| |\verb|matrix|\verb| |\verb|A|\verb| |\verb|are:| \left[1.381966011250106?, 3.618033988749895?, 0.3819660112501051?, 2.618033988749895?\right]

The eigenvectors of the matrix A are:[(1.381966011250106?,[(1,0.618033988749895?,0.618033988749895?,1)],1),(3.618033988749895?,[(1,1.618033988749895?,1.618033988749895?,1)],1),(0.3819660112501051?,[(1,1.618033988749895?,1.618033988749895?,1)],1),(2.618033988749895?,[(1,0.618033988749895?,0.618033988749895?,1)],1)]\displaystyle \verb|The|\verb| |\verb|eigenvectors|\verb| |\verb|of|\verb| |\verb|the|\verb| |\verb|matrix|\verb| |\verb|A|\verb| |\verb|are:| \left[\left(1.381966011250106?, \left[\left(1,\,0.618033988749895?,\,-0.618033988749895?,\,-1\right)\right], 1\right), \left(3.618033988749895?, \left[\left(1,\,-1.618033988749895?,\,1.618033988749895?,\,-1\right)\right], 1\right), \left(0.3819660112501051?, \left[\left(1,\,1.618033988749895?,\,1.618033988749895?,\,1\right)\right], 1\right), \left(2.618033988749895?, \left[\left(1,\,-0.618033988749895?,\,-0.618033988749895?,\,1\right)\right], 1\right)\right]

The characteristic polynomial of the matrix A is:x48x3+21x220x+5\displaystyle \verb|The|\verb| |\verb|characteristic|\verb| |\verb|polynomial|\verb| |\verb|of|\verb| |\verb|the|\verb| |\verb|matrix|\verb| |\verb|A|\verb| |\verb|is:| x^{4} - 8 x^{3} + 21 x^{2} - 20 x + 5

Is the matrix A diagonalizable? False

Exercise

Consider the following matrix: A=(444444444). A=\begin{pmatrix} -4 & 4 & 4\\ 4 & -4 & 4 \\ 4 & 4 & -4 \end{pmatrix}.

  1. Show that AA is symmetric, A=ATA=A^{T}.

  2. Find the characteristic polynomial of AA and its eigenvalues.

  3. Is the matrix A is diagonalizable?

Solution:

A=matrix(QQ, [[-4, 4, 4], [4, -4, 4], [4, 4, -4]]) show(A) print(A==A.T) chi_A=A.characteristic_polynomial() print(chi_A.factor()) print(A.eigenvalues()) #A.diagonalization() show(A.eigenmatrix_right()) D,P=A.eigenmatrix_right() show(D,P) A*P==P*D

(444444444)\displaystyle \left(\begin{array}{rrr} -4 & 4 & 4 \\ 4 & -4 & 4 \\ 4 & 4 & -4 \end{array}\right)

True (x - 4) * (x + 8)^2 [4, -8, -8]

((400080008),(110101111))\displaystyle \left(\left(\begin{array}{rrr} 4 & 0 & 0 \\ 0 & -8 & 0 \\ 0 & 0 & -8 \end{array}\right), \left(\begin{array}{rrr} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & -1 & -1 \end{array}\right)\right)

(400080008)(110101111)\displaystyle \left(\begin{array}{rrr} 4 & 0 & 0 \\ 0 & -8 & 0 \\ 0 & 0 & -8 \end{array}\right) \left(\begin{array}{rrr} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & -1 & -1 \end{array}\right)

True

Hence the matrix is diagonalizable, as the diagonal matrix DD and the matrix PP satisfy AP=PDAP=PD.

3.5. Quadratic forms

Let us now shortly review how one can treat quadratic forms in Sage. Let us begin with the following quadratic form in R^3:

q(x,y,z)=3x2+2xy+y2+2xz+2yz+z2q(x, y, z) = 3x^2 + 2xy + y^2 + 2xz + 2yz + z^2

To define this quadratic form in SAGE, we can use the QuadraticForm{\tt{QuadraticForm}} command, as follows:

Q = QuadraticForm(ZZ, 3, [3, 2, 1, 2, 2, 1]) show(Q)

Quadraticformin3variablesoverIntegerRingwithcoefficients:[321221]\displaystyle Quadratic form in 3 variables over Integer Ring with coefficients: \newline\left[ \begin{array}{ccc}3 & 2 & 1 & * & 2 & 2 & * & * & 1 & \end{array} \right]

Here, ZZ denotes the base ring (in this case, the integers), 3 denotes the number of variables (x, y, z), and [3, 1, 1, 2, 2, 1] specifies the coefficients of the quadratic form in the order x2,xy,y2,xz,yz,z2x^2, xy, y^2, xz, yz, z^2.

To find the matrix of a quadratic form we can use the matrix(){\tt{matrix()}} function, as follows:

Qmat=Q.matrix(); show(Qmat)

(621242122)\displaystyle \left(\begin{array}{rrr} 6 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 2 \end{array}\right)

Exercise

Consider the quadratic form Q(x,y)=x22xy+3y2Q(x, y) = x^2 - 2xy + 3y^2. Find the matrix corresponding to QQ and next determine the rank and signature of this quadratic form.

Solution:

Q = QuadraticForm(ZZ, 2, [3, -2, 3]) Qm= Q.matrix() show("The matrix of the quadratic form Q is given by:", Qm) eigenvalues = Qm.eigenvalues() sign=(sum(1 for e in eigenvalues if e > 0), sum(1 for e in eigenvalues if e < 0), sum(1 for e in eigenvalues if e == 0)) show("The sign of the quadratic form Q is the triple:", sign)

The matrix of the quadratic form Q is given by:(6226)\displaystyle \verb|The|\verb| |\verb|matrix|\verb| |\verb|of|\verb| |\verb|the|\verb| |\verb|quadratic|\verb| |\verb|form|\verb| |\verb|Q|\verb| |\verb|is|\verb| |\verb|given|\verb| |\verb|by:| \left(\begin{array}{rr} 6 & -2 \\ -2 & 6 \end{array}\right)

The sign of the quadratic form Q is the triple:(2,0,0)\displaystyle \verb|The|\verb| |\verb|sign|\verb| |\verb|of|\verb| |\verb|the|\verb| |\verb|quadratic|\verb| |\verb|form|\verb| |\verb|Q|\verb| |\verb|is|\verb| |\verb|the|\verb| |\verb|triple:| \left(2, 0, 0\right)

r=rank(Qm) show(r) show("The rank of the quadratic form Q is the rank of its matrix, hence it equals:", r)

2\displaystyle 2

The rank of the quadratic form Q is the rank of its matrix, hence it equals:2\displaystyle \verb|The|\verb| |\verb|rank|\verb| |\verb|of|\verb| |\verb|the|\verb| |\verb|quadratic|\verb| |\verb|form|\verb| |\verb|Q|\verb| |\verb|is|\verb| |\verb|the|\verb| |\verb|rank|\verb| |\verb|of|\verb| |\verb|its|\verb| |\verb|matrix,|\verb| |\verb|hence|\verb| |\verb|it|\verb| |\verb|equals:| 2

Exercise for practice

Find the matrix, the rank and the signature of the quadratic form on R3\R^3 given by f(x1,x2,x3)=x1x2x22+5x2x312x32f(x_1,x_2, x_3) = -x_1x_2 - x_2^2 + 5x_2x_3 - \frac{1}{2}x_3^2.

3.6. Linear programming

Linear programming builds upon notions from linear algebra discussed in chapter 2 and is a dynamic tool for solving linear optimization problems subject to linear constraints. In a linear programming problem (LP problem in short) we seek for a vector on some Euclidean space maximizing (or minimizing) the value of some linear functional, among all vectors satisfying a given system of linear constraints that govern the process. The idea of maximizing or minimizing a linear function subject to linear constraints arises naturally in many fields. For instance we may want to maximize profits, minimize costs, etc.

In SAGE, we'll use the Mixed Integer Linear Programming (MILP) package for LPLP-problems

Let us consider a simple LPLP-problem.

Exercise

Maximize x+3y+zx + 3y + z

Subject to:

  • x+2y4x + 2y ≤ 4

  • 5zy85z - y ≤ 8

  • x,y,z0x, y, z ≥ 0

Solution:

We can solve this problem using SAGE MixedIntegerLinearProgram() object by

  1. defining the variables

  2. defining the objective function

  3. setting the constraints

  4. Solving the problem with .solve() method

# define the variables and the objective function: p = MixedIntegerLinearProgram(maximization=True) # if maximization set to True the MILP maximizes the result, else it minimizes v = p.new_variable(real=True, nonnegative=True) x, y, z = v['x'], v['y'], v['z'] p.set_objective(x + 3*y + z)
# set the constraints p.add_constraint(x + 2*y <= 4) p.add_constraint(5*z - y <= 8)
p.show()
Maximization: x_0 + 3.0 x_1 + x_2 Constraints: x_0 + 2.0 x_1 <= 4.0 - x_1 + 5.0 x_2 <= 8.0 Variables: x_0 is a continuous variable (min=0.0, max=+oo) x_1 is a continuous variable (min=0.0, max=+oo) x_2 is a continuous variable (min=0.0, max=+oo)
# The solve method returns by default the optimal value reached by the objective function print('Maximized result:', round(p.solve(), 2))
Maximized result: 8.0

We can read the optimal assignment found by the solver for the variables through the .get_values({variable}) method of MILP obj

print('Best value for x:', round(p.get_values(x), 2)) print('Best value for y:', round(p.get_values(y), 2)) print('Best value for z:', round(p.get_values(z), 2))
Best value for x: 0.0 Best value for y: 2.0 Best value for z: 2.0

Exercise for practice

Minimize 3x1x22x3−3x_1 − x_2 − 2x_3 under the conditions:

  • x1x2+x34x_1 − x_2 + x_3 ≥ −4,

  • 2x1+x332x_1 + x_3 ≤ 3,

  • x1+x2+3x38x_1 + x_2 + 3x_3 ≤ 8

with xi0x_i ≥ 0 for i=1,2,3i = 1, 2, 3

Exercise

A company produces two products, XX and YY, using two machines, AA and BB. The production of XX requires 11 hour of machine AA and 22 hours of machine BB, while the production of YY requires 22 hours of machine AA and 11 hour of machine BB. The company has 4040 hours of machine AA and 3030 hours of machine BB available per week. The profit per unit of product XX is 5$5$ and the profit per unit of product Y is 4$4$. How many units of each product should the company produce to maximize profit?

Solution:

We can model this problem using LP by defining the decision variables and the objective function:

# define the variables and the objective function: p = MixedIntegerLinearProgram() v = p.new_variable(real=True, nonnegative=True) # set the variables to be the number of units of each product x, y = v['x'], v['y'] # objective - profit to be maximized p.set_objective(5*x + 4*y)
# Set the constraints - total hours should not exceed the available machine hours specified # Machine A p.add_constraint(x + 2*y <= 40) # Machine B p.add_constraint(2*x + y <= 30) # You can't produce less then 0 of each product p.add_constraint(x >= 0) p.add_constraint(y >= 0)
# optimal value reached by the objective function - maximum profit for this problem print('Profit:', round(p.solve(), 2))
Profit: 100.0
# print optimal values for x, y found by the solver res_x = round(p.get_values(x), 2) res_y = round(p.get_values(y), 2) print('Sold products X:', res_x) print('Sold products Y:', res_y)
Sold products X: 6.67 Sold products Y: 16.67

Thus the optimal number to maximize profits is 66 units of product XX and 1616 units of product YY

We can check it with the graphical method result

x = var("x") y = var('y') ( point((res_x, res_y), color='red', size=50) +text('({x},{y})'.format(x=res_x, y=res_y), (res_x+6, res_y+5), color='black') +implicit_plot(x + 2*y - 40, (x, -50, 50), (y, -20, 50)) # constraint 1 +implicit_plot(2*x + y - 30, (x, -50, 50), (y, -20, 50)) # constraint 2 +implicit_plot(x-0, (x, -50, 50), (y, -20, 50), color='black') # axis Y and constraint 3 +implicit_plot(y-0, (x, -50, 50), (y, -20, 50), color='black') # axis X and constraint 4 )
Image in a Jupyter notebook

Exercise

An investor wants to invest in three companies: Technology (T), Healthcare (H), and Energy (E). She has $10,000 to invest and wants to minimize risk. However, the government regulates the investments and made a law where you should invest at least 1000$ to Healtcare sector and your investments in Technology can't exceed 4000$.

Solution:

This problem has an infinite amount of solutions, let's check this with the help of SAGE

from sage.numerical.mip import MixedIntegerLinearProgram p = MixedIntegerLinearProgram(maximization=True) v = p.new_variable(real=True, nonnegative=True) T, H, E = v['T'], v['H'], v['E'] # Objective Function p.set_objective(T + H + E) # Constraints p.add_constraint(T + H + E == 10000) p.add_constraint(H >= 1000) p.add_constraint(T <= 4000) # Solving p.solve() # Output print("Investment in Technology:", p.get_values(T)) print("Investment in Healthcare:", p.get_values(H)) print("Investment in Energy:", p.get_values(E))
Investment in Technology: 4000.0 Investment in Healthcare: 6000.0 Investment in Energy: 0.0
p = MixedIntegerLinearProgram(maximization=True) v = p.new_variable(real=True, nonnegative=True) E, T, H = v['E'], v['T'], v['H'] # Objective Function p.set_objective(T + H + E) # Constraints p.add_constraint(T + H + E == 10000) p.add_constraint(H >= 1000) p.add_constraint(T <= 4000) # Solving p.solve() # Output print("Investment in Technology:", p.get_values(T)) print("Investment in Healthcare:", p.get_values(H)) print("Investment in Energy:", p.get_values(E))
Investment in Technology: 0.0 Investment in Healthcare: 1000.0 Investment in Energy: 9000.0
p = MixedIntegerLinearProgram(maximization=True) v = p.new_variable(real=True, nonnegative=True) H, T, E = v['H'], v['T'], v['E'] # Objective Function p.set_objective(T + H + E) # Constraints p.add_constraint(T + H + E == 10000) p.add_constraint(H >= 1000) p.add_constraint(T <= 4000) # Solving p.solve() # Output print("Investment in Technology:", p.get_values(T)) print("Investment in Healthcare:", p.get_values(H)) print("Investment in Energy:", p.get_values(E))
Investment in Technology: 0.0 Investment in Healthcare: 10000.0 Investment in Energy: 0.0

As you can see, there's an infinite amount of solutions

Exercise

A manufacturing company produces two types of products, A and B, using two machines, M1 and M2. The company aims to fulfill specific production requirements while efficiently utilizing its resources. Interestingly, the company doesn't have a preference for producing more of one product over the other. Hence, the problem leads to an infinite number of optimal solutions.

Given:

  • Machine M1 has 20 hours available.

  • Machine M2 has 15 hours available.

  • Product A requires 1 hour on M1 and 1 hour on M2 per unit.

  • Product B requires 1 hour on M1 and 1 hour on M2 per unit.

  • The company must produce at least 10 units of both products A and B.

Objective Function (Minimize): f(A,B)=A+B f(A, B) = A + B

Subject to Constraints:

  1. A+B<=20 A + B <= 20 (Hours on machine M1)

  2. A+B<=15A + B <= 15 (Hours on machine M2)

  3. A>=10A >=10 (Minimum production of product A)

  4. B>=10 B >= 10 (Minimum production of product B)

  5. A,B>=0 A, B >= 0

Since the company is indifferent between products A and B, any combination of A and B that satisfies the constraints is considered optimal. Thus, the linear program has infinite optimal solutions, reflecting the flexibility in production planning.

Solution:

p = MixedIntegerLinearProgram(maximization=False) v = p.new_variable(real=True, nonnegative=True) A,B = v['A'], v['B'] # Objective Function p.set_objective(A + B) # Constraints p.add_constraint(A + B <= 20) # Hours on machine M1 p.add_constraint(A + B <= 15) # Hours on machine M2 p.add_constraint(A >= 10) # Minimum production of product A p.add_constraint(B >= 10) # Minimum production of product B try: # Solving p.solve() # Output print("Product A production:", p.get_values(A)) print("Product B production:", p.get_values(B)) print("Total production:", p.get_objective_value()) except: print('No feasible solutions')
No feasible solutions

Exercise

A food factory wants to produce three products: Pasta (P), Pizza (Z), and Salad (S) while minimizing costs. Factory spends 3$ on Pasta, 4$ on Pizza and 2$ on Salads.

Constrains:

  1. P+Z+S600P+Z+S≤600 (Maximum production capacity)

  2. 2P+3Z+S10002P+3Z+S≥1000 (Minimum flour requirement)

  3. PZ+S500P−Z+S≥500 (Demand for more Pasta)

  4. Z+2S300Z+2S≤300 (Demand for less Pizza and Salad)

  5. 0P,Z,S3000≤P,Z,S≤300

Solution:

p = MixedIntegerLinearProgram(maximization=False) v = p.new_variable(real=True, nonnegative=True) P, Z, S = v['P'], v['Z'], v['S'] # Objective Function p.set_objective(3 * P + 4 * Z + 2 * S) # Constraints p.add_constraint(P + Z + S <= 600) p.add_constraint(2 * P + 3 * Z + S >= 1000) p.add_constraint(P - Z + S >= 500) p.add_constraint(Z + 2 * S <= 300) p.add_constraint(P <= 300) p.add_constraint(Z <= 300) p.add_constraint(S <= 300) # Solving try: p.solve() print("Pasta production:", p.get_values(P)) print("Pizza production:", p.get_values(Z)) print("Salad production:", p.get_values(S)) print("Total cost:", p.get_objective_value()) except: print("No solution found")
No solution found

Exercise

An insurance company has a capital of 100000$100000$ which aims to invest in two different ways: the investment of type XX and the investment of type YY. These types of investments give an annual income of 1010% and of 1515%, respectively. However, there is a limitation by the government, which requires that at least 2525% of the capitals, should be invested in the XX-type investment. On the other hand, the policy of the company requires the ratio between the capital used for the YY-type investment and the capital used for the X-type of investment, not be greater than 1.5. How the company should invest its capital?

Solution:

# define the variables and the objective function: p = MixedIntegerLinearProgram(maximization=True) # for maximization v = p.new_variable(real=True, nonnegative=True) # set the variables to be the number of units of each product x, y = v['x'], v['y'] # objective - income to be maximized p.set_objective(1.1*x + 1.15*y)
# You have only 100_000 dolars p.add_constraint(x+y <= 100_000) # Government limitation on 25% to X investment p.add_constraint(x >= 100_000*0.25) # Investments in Y should not be larger then 1.5 investments in X p.add_constraint(y <= 1.5*x) # Your investments can't less then 0 p.add_constraint(x >= 0) p.add_constraint(y >= 0)
print('Maximum income:',round(p.solve(), 2))
Maximum income: 113000.0
# print optimal values for x, y found by the solver print('X-type investments:',round(p.get_values(x), 2)) print('Y-type investments:',round(p.get_values(y), 2))
X-type investments: 40000.0 Y-type investments: 60000.0

3.7. Markov processes and other applications. (optional)

Markov processes are a type of stochastic process where the future state depends only on the current state, and not on any past states. ... In this tutorial, we will explore how to create and analyze Markov processes using SAGE.

Formally, the Markov property states that the conditional probability distribution for the system at the next step (and in fact at all future steps) given its current state depends only on the current state of the system, and not additionally on the state of the system at previous steps:

P(Xn+1X1,X2,,Xn)=P(Xn+1Xn)P(X_{n+1} |X_1,X_2,…,X_n)=P(X_{n+1}|X_{n})

The possible values of XiX_i or the set of all states of the system form a countable set 𝕏 called the state space of the chain. The changes of state of the system are called transitions, and the probabilities associated with various state-changes are called transition probabilities.

Since the system changes randomly, it is generally impossible to predict the exact state of the system in the future. However, the statistical and probabilistic properties of the system's future can be predicted. In many applications it is these statistical properties that are important.

Markov chains are often depicted by a weighted directed graph, where the edges are labeled by the probabilities of going from one state to the other states. This is called the flow diagram or transition probability diagram. The transition probability matrix P encodes the probabilities associated with state-changes or "jumps" from one state to another in the state-space 𝕏. If 𝕏 is labelled by 0,1,2,{0,1,2,…} then the i,j-th entry in the matrix P corresponds to the probability of going from state i to state j in one time-step.

P=[P0,0P0,1P0,2P1,0P1,1P1,2P2,0P2,1P2,2]P = \begin{bmatrix} P_{0,0} &P_{0, 1} & P_{0,2}&\dots \\ P_{1,0} &P_{1, 1} & P_{1,2}&\dots \\ P_{2,0} &P_{2, 1} & P_{2,2}&\dots \\ \dots& \dots & \dots & \ddots \end{bmatrix}

The state of the system at the n-th time step is described by a state probability vector: Pn=(P0n,P1n,P2n,)P^{n}=(P^{n}_0,P^{n}_1,P^{n}_2,…)

Thus, PinP^{n}_{i} is the probability you will find the Markov chain at state i𝕏i \in𝕏 at time-step n and Pi0P^{0}_i is called the initial probability vector at the convenient initial time 0.

The state space 𝕏 and transition probability matrix P completely characterizes a Markov chain.

Random Walks and Random Graphs in SageMath

Let's get introduced to simple random walks in SageMath.

set_random_seed(0) v = [randint(0,1) for _ in range(10)] v
[0, 0, 0, 1, 0, 0, 0, 1, 0, 1]
vv = []; nn = 0 @interact def foo(pts = checkbox(True, "Show points"), refresh = checkbox(False, "New random walk every time"), steps = (50,(10..500))): # We cache the walk in the global variable vv, so that # checking or unchecking the points checkbox doesn't change # the random walk. html("<h2>%s steps</h2>"%steps) global vv if refresh or len(vv) == 0: s = 0; v = [(0,0)] for i in range(steps): s += random() - 0.5 v.append((i, s)) vv = v elif len(vv) != steps: # Add or subtract some points s = vv[-1][1]; j = len(vv) for i in range(steps - len(vv)): s += random() - 0.5 vv.append((i+j,s)) v = vv[:steps] else: v = vv L = line(v, rgbcolor='#4a8de2') if pts: L += points(v, pointsize=10, rgbcolor='red') show(L, xmin=0, figsize=[8,3])

A 3D Random Walk

(originally by William Stein)

@interact def rwalk3d(n=(50,1000), frame=True): pnt = [0.,0.,0.] v = [copy(pnt)] for i in range(n): pnt[0] += random()-0.5 pnt[1] += random()-0.5 pnt[2] += random()-0.5 v.append(copy(pnt)) show(line3d(v,color='black'),aspect_ratio=[1,1,1],frame=frame,figsize=[6,6])

3.8. The Matplotlib library in SageMath

Matplotlib is a powerful and versatile Python library used for creating static, animated, and interactive visualizations. It is widely adopted for plotting graphs and charts in both scientific and data analytics contexts. With its simple interface, users can generate line plots, bar charts, histograms, scatter plots, and more complex visualizations like 3D plots and other. Matplotlib integrates seamlessly with various environments, including SageMath, making it an essential tool for visualizing mathematical computations and data. It also offers extensive customization for plot aesthetics, allowing users to control everything from colors and styles to axes and legends.

In particular, in SageMath, Matplotlib is used for a wide range of applications, particularly for visualizing mathematical data and functions. Below we desdcribe some key applications:

Application A:  Graphing Functions

You can use Matplotlib to plot various types of mathematical functions, such as linear, polynomial, trigonometric, and exponential functions. This is helpful for studying the behavior of functions and analyzing their properties, such as roots, maxima, minima, and inflection points. Let us describe an example using the sine function.

import matplotlib.pyplot as plt import numpy as np x = np.linspace(-2 * np.pi, 2 * np.pi, 100) y = np.sin(x) plt.plot(x, y) plt.title('Sine Function') plt.show()
Image in a Jupyter notebook
Application B: 3D Surface Plots

Matplotlib, when combined with numpy in SageMath, allows the visualization of 3D surfaces such as paraboloids, hyperboloids, or any user-defined surface. This is particularly useful for geometric and calculus problems.

from mpl_toolkits.mplot3d import Axes3D import numpy as np import matplotlib.pyplot as plt x = np.linspace(-5, 5, 100) y = np.linspace(-5, 5, 100) X, Y = np.meshgrid(x, y) Z = X**2 + Y**2 fig = plt.figure() ax = fig.add_subplot(111, projection='3d') ax.plot_surface(X, Y, Z, cmap='viridis') plt.show()
Image in a Jupyter notebook
Application C: Statistical Data Visualization

Matplotlib is ideal for visualizing statistical data, such as histograms, scatter plots, and box plots. In SageMath, this is useful for analyzing large data sets or distributions in various fields such as statistics, data science, and machine learning.

import numpy as np import matplotlib.pyplot as plt data = np.random.randn(1000) plt.hist(data, bins=30, edgecolor='black') plt.title('Histogram of Random Data') plt.show()
Image in a Jupyter notebook
Application D: Contour and Heatmaps

These are used to visualize 2D data that represents a function’s behavior over a grid. In SageMath, they are commonly applied in calculus, differential equations, and optimization problems to understand level sets and gradient flows.

import matplotlib.pyplot as plt import numpy as np x = np.linspace(-5, 5, 100) y = np.linspace(-5, 5, 100) X, Y = np.meshgrid(x, y) Z = np.sin(np.sqrt(X**2 + Y**2)) plt.contour(X, Y, Z, 20, cmap='coolwarm') plt.title('Contour Plot') plt.show()
Image in a Jupyter notebook
Application E: Parametric and Polar Plots

Matplotlib is helpful in SageMath for visualizing parametric equations, polar coordinates, and vector fields, which are often used in advanced calculus and physics problems.

Let us first describe a simple example:

import numpy as np import matplotlib.pyplot as plt # Parametric equations for a circle t = np.linspace(0, 2 * np.pi, 100) x = np.cos(t) y = np.sin(t) # Plot the parametric curve plt.plot(x, y) plt.title('Parametric Plot of a Circle') plt.xlabel('x') plt.ylabel('y') plt.axis('equal') # Ensure equal scaling plt.show()
Image in a Jupyter notebook
The rose curve (r = sin(3θ))
import numpy as np import matplotlib.pyplot as plt # Polar coordinates theta = np.linspace(0, 2 * np.pi, 100) r = np.sin(3 * theta) # Plot the polar curve plt.polar(theta, r) plt.title('Polar Plot: Rose Curve') plt.show()
Image in a Jupyter notebook

Further examples are presented below:

import numpy as np import matplotlib.pyplot as plt theta = np.linspace(0, 2 * np.pi, 100) r = 1 + np.sin(theta) plt.polar(theta, r) plt.title('Polar Plot') plt.show()
Image in a Jupyter notebook
import numpy as np import matplotlib.pyplot as plt # Polar coordinates for a spiral theta = np.linspace(0, 4 * np.pi, 100) r = theta # Plot the spiral plt.polar(theta, r) plt.title('Polar Plot: Spiral') plt.show()
Image in a Jupyter notebook

Application F: Smoothing Data

The pandas package enables reading csv files (datasets) that can be visualized using Matplotlib package.

Moving average is a commonly used technique in time series analysis for smoothing data and removing noise. It involves taking the average of a certain number of consecutive data points, called the window size or span. For example, a 7-day moving average of a daily time series would be the average of the past week of data with the kernel (17,,17)\left (\tfrac{1}{7},\dots ,\tfrac{1}{7} \right )^\top. Other kernels can be used (think about how to do a weighted average). Common kernel shapes include simple symmetric forms like box or triangular shapes. The np package includes convolve function for this purpose.

Tak a look at the following example that reads a COVID-19 dataset:

import pandas as pd import matplotlib.pyplot as plt import numpy as np # Read data from CSV file T = pd.read_csv("testy-pcr-antigenni.csv")

The first column, "datum," contains dates (indexed from 0), and the eighth column, "incidence_pozitivni," contains the number of people who tested positive for COVID-19 on the corresponding day.

print(T.datum) print(T.datum.head())
0 2020-09-01 1 2020-09-02 2 2020-09-03 3 2020-09-04 4 2020-09-05 ... 188 2021-03-08 189 2021-03-09 190 2021-03-10 191 2021-03-11 192 2021-03-12 Name: datum, Length: 193, dtype: object 0 2020-09-01 1 2020-09-02 2 2020-09-03 3 2020-09-04 4 2020-09-05 Name: datum, dtype: object
print(T.iloc[int(0):int(5),int(0)]) print(T.iloc[int(0):int(5),int(7)])
0 2020-09-01 1 2020-09-02 2 2020-09-03 3 2020-09-04 4 2020-09-05 Name: datum, dtype: object 0 499 1 645 2 675 3 797 4 504 Name: incidence_pozitivni, dtype: int64

Convolution, in this context, is essentially the dot product of a segment of the time series, equal in length to the kernel, with the kernel itself.

# Plotting the graph plt.figure() plt.plot(T.iloc[:, int(0)], T.iloc[:, int(7)], label='Original Data') # Plotting original data # Calculating the convolution convolved_data = np.convolve(T.iloc[:, int(7)], np.ones(7)/7, 'same') # 7-day average # Adding convolved data to the plot plt.plot(T.iloc[:, int(0)], convolved_data, 'r', label='7-Day Average') # Plotting convolved data # Adding legend and displaying the plot plt.legend() xtick_positions = T.iloc[::int(14), int(0)] # Select every 14th value for x-ticks xtick_labels = T.iloc[::int(14), int(0)].astype(str) # Convert them to strings for labeling plt.xticks(xtick_positions, xtick_labels, rotation=45) # Rotate labels if necessary plt.show()
Image in a Jupyter notebook

Use daily time series and compute the moving average with 7-day window to get trend. By subtracting the trend from the time series you can get the week component.

Take the smoothed data at the beginning of the epidemic outbreak and check for an exponential trend.

Application G: Similarity Measure

In the context of similarity recognition, we usually use the dot product of normalized vectors – the cosine similarity – to measure the similarity between two vectors quantifying objects:

similarity(a,b)=abab=i=1naibii=1nai2i=1nbi2\text{similarity}(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}| |\mathbf{b}|} = \frac{\sum_{i=1}^n a_i b_i}{\sqrt{\sum_{i=1}^n a_i^2}\sqrt{\sum_{i=1}^n b_i^2}}

You can create a heatmap to visualize the results when comparing multiple objects with each other.

The following example shows code using Matplotlib to create a matrix visualization for 5 different vectors that describe 5 students with scores ranging from 0 to 5 across 10 courses. The similarity measure helps to identify a threshold between good and poor performance.

import numpy as np import matplotlib.pyplot as plt # Sample data: 5 students with scores from 0 to 5 in 10 courses students_marks = np.array([ [4, 5, 3, 4, 1, 5, 3, 4, 4, 5], # Student 1 [3, 4, 4, 5, 2, 4, 3, 3, 4, 4], # Student 2 [5, 1, 5, 3, 4, 2, 5, 5, 3, 2], # Student 3 [0, 1, 0, 0, 3, 0, 2, 3, 1, 4], # Student 4 [4, 3, 4, 5, 4, 4, 3, 5, 4, 3] # Student 5 ]) # Normalize the vectors (rows) norms = np.linalg.norm(students_marks, axis=1, keepdims=True) # Calculate the norms of each student's marks vector normalized_vectors = students_marks / norms # Normalize each student's marks vector # Initialize an empty similarity matrix similarity_matrix = np.zeros((len(students_marks), len(students_marks))) # Calculate cosine similarity manually for i in range(len(normalized_vectors)): for j in range(len(normalized_vectors)): similarity_matrix[i, j] = np.dot(normalized_vectors[i], normalized_vectors[j]) # Plotting the heatmap plt.figure(figsize=(8, 6)) plt.imshow(similarity_matrix, cmap='coolwarm', interpolation='nearest') plt.colorbar(label='Cosine Similarity') plt.xticks(ticks=np.arange(len(students_marks)), labels=[f'Student {i+1}' for i in range(len(students_marks))]) plt.yticks(ticks=np.arange(len(students_marks)), labels=[f'Student {i+1}' for i in range(len(students_marks))]) plt.title('Cosine Similarity Heatmap of Students based on Scores') plt.xlabel('Students') plt.ylabel('Students') plt.show()
Image in a Jupyter notebook

Compare COVID-19 incidences (eighth column) with positive PCR tests with symptoms ("PCR_pozit_sympt," thirteenth column) by plotting the data.

Include additional time series for comparison.

Use the cosine similarity measure. Experiment with various features such as raw data, trends, and weekly variance. Remember to normalize the data.

Application H: Image as Data, Compression using SVD

Images are often represented as multidimensional arrays of pixel values, where each pixel can contain information such as intensity or color (RGB). The PIL (Pillow) package is commonly used to load, manipulate, and convert images between formats, while NumPy provides powerful tools to perform mathematical operations on these pixel arrays efficiently. Matplotlib is used to visualize the images and any derived data, enabling a clear and intuitive understanding of the image processing results.

If you have problems with reading jpg file, upgrade the Pillow package running pip install --upgrade pillow

from PIL import Image import numpy as np import matplotlib.pyplot as plt # Load the image img = Image.open("garf.jpg") # Display the original image plt.figure() plt.imshow(img) plt.title('Original Image') plt.axis('off') plt.show() # Convert the image to grayscale img_gray = img.convert('L') # Convert the grayscale image to a NumPy array and normalize it to [0, 1] img_array = np.array(img_gray, dtype=np.float64) / 255.0 # Display the grayscale and normalized image plt.figure() plt.imshow(img_array, cmap='gray') plt.title('Grayscale and Normalized Image') plt.axis('off') plt.show()
Image in a Jupyter notebookImage in a Jupyter notebook

Singular Value Decomposition (SVD)

is a factorization method used to decomposes a matrix into three matrices.

Let A\mathbf{A} be an m×nm \times n matrix with rank rr. SVD factorizes A\mathbf{A} as A=UΣV\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^\top , where U\mathbf{U} is an m×mm \times m orthonormal matrix, Σ\boldsymbol{\Sigma} is an m×nm \times n diagonal matrix, and V\mathbf{V} is an n×nn \times n orthonormal matrix. The diagonal entries of Σ\boldsymbol{\Sigma}, denoted by σi\sigma_i, are called the singular values of A\mathbf{A}, and they are ordered in non-increasing order: σ1σ2σr>0.\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0.

The columns of U\mathbf{U} and V\mathbf{V} are called the left and right singular vectors of A\mathbf{A}, respectively. The left singular vectors are the eigenvectors of AA\mathbf{AA}^\top, and the right singular vectors are the eigenvectors of AA\mathbf{A}^\top\mathbf{A}. Moreover, the SVD can be written by columns ui\mathbf{u}_i and vi\mathbf{v}_i (left and right singular vectors) as A=i=1rσiuivi.\mathbf{A} = \sum_{i=1}^r \sigma_i \mathbf{u}_i \mathbf{v}_i^\top.

Since the diagonal entries of Σ\boldsymbol{\Sigma} are arranged in descending order, we can truncate this sum to obtain a low rank approximation of A\mathbf{A}. Omitting the smaller singular values effectively reduces the rank of A\mathbf{A}.

The truncated SVD is then Ak=UkΣkVk\mathbf{A}_k = \mathbf{U}_k \boldsymbol{\Sigma}_k \mathbf{V}_k^\top, where Uk\mathbf{U}_k and Vk\mathbf{V}_k are the first kk columns of U\mathbf{U} and V\mathbf{V}, respectively, and Σk\boldsymbol{\Sigma}_k is obtained by keeping only the first kk singular values on the diagonal of Σ\boldsymbol{\Sigma}. This approximation has rank kk and is the best rank-kk approximation of A\mathbf{A} in the sense that it minimizes the Frobenius norm (the Frobenius norm of matrix A\mathbf{A} is defined as the square root of the sum of the squared absolute values of its elements.) of the difference between A\mathbf{A} and any matrix of rank at most kk.

Low rank approximations by SVD are widely used in many areas, as they allow us to represent the original matrix data with fewer parameters, while retaining most of its important features. Here are some examples:

  • Image compression: In image processing, large matrices are often used to represent images. Truncated SVD can be used to approximate these matrices, which enables image compression without significant loss of quality.

  • Collaborative filtering: Recommender systems use matrix factorization techniques based on truncated SVD to predict user preferences for items in a large database. By approximating the user-item rating matrix with a low-rank SVD approximation, these systems can recommend items that a user is likely to enjoy based on their past preferences.

  • Data analysis: Truncated SVD can be used to analyze large datasets and extract useful information. For example, in genetics, truncated SVD can be used to identify genes that are most strongly associated with a particular trait.

  • Latent semantic analysis: In natural language processing, truncated SVD is used in latent semantic analysis (LSA) to identify relationships between words and documents. By approximating the term-document matrix with a low-rank SVD approximation, LSA can identify underlying themes or topics in a large corpus of text.

  • Network analysis: Truncated SVD can be used to analyze large graphs or networks. For example, in social network analysis, truncated SVD can be used to identify clusters or communities of individuals with similar interests or behavior.

If you have problems with reading jpg file, upgrade the Pillow package running pip install --upgrade pillow

# Performing SVD on the normalized image U, S, V = np.linalg.svd(img_array, full_matrices=False) S_diag = np.diag(S) # Display singular values plt.figure() plt.plot(S, linewidth=2) plt.title('Singular Values') plt.xlabel('Index') plt.ylabel('Value') plt.show() # Display cumulative percentage of stored information plt.figure() plt.plot(np.cumsum(S) / np.sum(S), linewidth=2) plt.title('Cumulative Percentage of Singular Values') plt.xlabel('Index') plt.ylabel('Percentage') plt.show() # Dimension reduction using truncation of smaller singular values k = 50 # Choose the number of singular values to keep U_k = U[:, :k] S_k = S_diag[:k, :k] V_k = V[:k, :] # Reconstruct the image with reduced dimensions img_k = np.dot(U_k, np.dot(S_k, V_k)) # Display the reconstructed image plt.figure() plt.imshow(img_k, cmap='gray') plt.title(f'Compressed Image, k = {k}') plt.axis('off') plt.show()
Image in a Jupyter notebookImage in a Jupyter notebookImage in a Jupyter notebook

Find the percentage difference by subtracting the images. Cross-validate it with the cumulative percentage of stored information.

Can you find the appropriate value of kk for a given threshold of the percentage of stored information?

The RGB data is stored in a matrix with 3 indices, where the red channel is stored in layer 0 with indices [horizontal pixel size, vertical pixel size, 0]. Separate the RGB Garfield image into its color channels and compress each color channel separately. Then concatenate them back together using np.stack. along the axis=2 (third layer). To maintain a valid range of [0,1][0, 1], SageMath offers clipping by default, but it is not actually needed.