Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place.
Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place.
Chapter 3 (Linear algebra)
Chapter 3 - A short introduction to Matrix Calculus and Linear algebra
3.1. Matrix and vector manipulation via Sage
3.2. Solving linear systems via Sage
3.3. Eignevalues and eigenvectors
3.4. Diagonalization
3.5. Quadratic forms
3.6. Linear programming
3.7. Markov processes and other applications
3.8. The Matplotlib library in SageMath
Linear algebra is a fundamental branch of mathematics dealing with vectors, vector spaces, and linear transformations.
It provides powerful tools for solving systems of linear equations, analyzing geometric transformations, and understanding the properties of matrices.
Hence it is also a fundamental tool for solving a wide range of problems in science, engineering, and economics.
The motivation for studying matrix calculus lies in its ability to simplify complex mathematical operations involving matrices, making it easier to analyze and solve problems that arise in real-world applications.
By using matrix calculus, we can express a large number of calculations in a compact and efficient form, which allows us to perform computations that would otherwise be difficult or impossible.
Furthermore, the applications of matrix calculus are vast, including image processing, optimization, machine learning, and data analysis, making it an essential tool for modern-day mathematicians and scientists.
In this chapter we will explore the practical aspects of linear algebra using the Sage mathematical software system.
Sage offers a comprehensive set of tools for performing computations related to linear algebra, such as matrix operations, solving linear systems, calculating eigenvalues and eigenvectors, and much more.
We will also explore Sage tools for Linear Optimization problems and Markov Processess.
We first proceed with basics from matrix calculus and preliminaries from linear algebra.
3.1. Matrix and vector manipulation via Sage
To introduce a matrix in Sage there are many alternatives. For instance we can use the matrix function , as in our first examples below.
Most of the known operations between matrices correspond to built-in commands in Sage. Let us list some of them:
transpose of a matrix - or or .
determinant of a square matrix - or
trace of a square matrix -
inverse of - or
Rank of -
To introduce a vector we use the function (see below).
The standard scalar product (dot product) of two vectors is given by .
Exercise
a) Introduce in Sage the matrices and .
b) Next use Sage to compute the matrices , , , , , , .
c) Confirm using the command that the given matrices , are correct.
d) Compute the determinant, the trace and the rank of .
Solution:
If we want to avoid the decimal presentation of the entries, we can avoid to indicate the field, and simply type the following:
b) For the matrix operations we can now proceed as usual:
or type
c) Let us verify that indeed the given matrice are the inverses of , respectively:
d) For the determinant, trace and rank of , which are all numbers, we get:
or simply type the following:
Exercise
Use the command to transform the matrix given below into its echelon form and count the number of nonzero rows to get the rank. Then verify your answer using the command .
Solution:
Exercise
Introduce the vectors and of .
Next compute the vector and the dot products an d .
Solution:
Or simply type
Similarly for :
As a confirmation one may use the function, as follows:
Exercise
Given two vectors and , compute their inner product.
Solution:
Exercise
Plot the following vectors in Sage:
Solution:
To plot vectors of one can use the command . Let us illustrate its implementation with an example:
Exercise
Plot the vectors , , and in SageMath.
Solution:
Exercise
Introduce the vectors and of and compute their cross product . Next plot the vectors , and .
Instructions: We want the three axes to be visible in the figure and also dashed boundary lines for the three vectors for a better visualization.
Solution:
Exercise
For vectors and , calculate the angle between them (in radians)
Solution:
Exercise
Write a short program in Sage checking if the vectors and are orthogonal.
Solution:
Exercise
Find the projection of the vector onto .
Solution:
Exercise
Introduce the matrix given below and next compute its transpose, its determinant and its trace:
Solution:
An alternative way to introduce a square matrix in Sage goes as follows:
Notice for the transpose of a matrix we can use the shortcut
Exercise
For the matrices given below and the vector on compute the following:
Or for the norm of one can directly type
and hence .
Exercise
Recall that the kernel of the matrix is the space of vectors satisfying
For this, in Sage one should use the command .
Apply this method to find the kernel of the matrix
Remark
Be aware that in Sage the built-in function provides the right kernel of , that is those vectors satisfying .
3.2. Solving linear systems via Sage
Sage can used very succefully to solve linear systems.
In particular, suppose we have a matrix and a vector and we want to find the solution of the equation
To solve this task in Sage, a method relies on the function. Let us describe an example.
Exercise
Solve the system , where and is the vector on .
Solution:
We could also type
As another example: Use the same but replace the constant vector by the vector .
or as we said an alternative we could type:
Matrix inverse method for solving linear systems
Another way to solve linear systems is by finding the inverse of the matrix A. SAGE has a built-in function "A.inverse()" that computes the inverse of a matrix
Gaussian elimination
This is a common method for solving linear systems, and SAGE has built-in functions to perform it. The command "A.rref()" computes the row-reduced echelon form of the matrix A, which is useful for finding solutions
LU factorization
Another method for solving linear systems is LU factorization. This involves factoring the matrix A into the product of two matrices, L and U, where L is lower triangular and U is upper triangular. SAGE has a built-in function , that computes this factorization.
Exercise
Find whether or not the following matrix is invertible:
Solution:
To determine whether or not the matrix is invertible, we can calculate its determinant. If the determinant is nonzero, then the matrix is invertible.
Since the determinant of the given matrix is 0, the matrix is not invertible
Exercise
Solve the system of simultaneous linear equations:
Solution:
Exercise
Solve the matrix equation
Solution:
Exercise
Determine whether or not the vectors (1, 2, 3, 1), (1, 0, −1, 1), (2, 1, −1, 3) and (0, 0, 3, 2) are linearly independent.
Solution:
To determine whether or not the vectors are linearly independent, we can form a matrix with these vectors as its columns, and then compute the rank of the matrix.
Since the rank of the matrix is equal to the number of columns, the columns of the matrix are linearly independent.
Therefore, the vectors (1, 2, 3, 1), (1, 0, −1, 1), (2, 1, −1, 3), and (0, 0, 3, 2) are linearly independent.
Exercise
Determine the number of solutions for the system
Solution:
To determine the number of solutions of the system, we will find out the rank to see if the equations are linearly independent or not. We do this as follows:
3.3. Eignevalues and eigenvectors
Sage can also compute the eigenvalues and eigenvectors of a matrix. Here's an example:
Exercise
Consider the following matrix: .
Compute the eigenvalues of the matrix .
Find the eigenvectors for .
Calculate the characteristic polynomial of .
Solution:
3.4. Diagonalization
In SageMath the is a direct method to create diagonalizable matrices and then use them to understand the procedure of diagonalization.
This can be done by using appropriatelly the command including the option , as in the examples below:
Note that any time executing the previous code Sage will print a different diagonalizable matrix.
We can use such matrices to make experiments on the diagonalization of a matrices. Recall that a matrix is diagonalizable iff is similar to a diagonal matrix . Hence in this case we can specify matrices such that , or equivalently , where is the diagonal matrix whose entries are the eigenvalues of , while is an invertible matrix consisting of the eigenvectors of .
Let us illustrate the relation for a diagonalizable matrix using the command, as above.
Exercise
Use the command, as above, to illustrate the relation for a diagonalizable matrix. Moreover, combine the command with the command to present the factorized expression of the characteristic polynomial of . Repeat for a matrix .
Solution:
An example with diagonalizable matrix:
Exercise
Consider the matrix given below:
Compute the eigenvalues of the matrix .
Find the corresponding eigenvectors for .
Calculate the characteristic polynomial of .
Determine if the matrix is diagonalizable. If it is, find a diagonal matrix and an invertible matrix such that .
Solution:
Exercise
Consider the following matrix:
Show that is symmetric, .
Find the characteristic polynomial of and its eigenvalues.
Is the matrix A is diagonalizable?
Solution:
Hence the matrix is diagonalizable, as the diagonal matrix and the matrix satisfy .
3.5. Quadratic forms
Let us now shortly review how one can treat quadratic forms in Sage. Let us begin with the following quadratic form in R^3:
To define this quadratic form in SAGE, we can use the command, as follows:
Here, ZZ denotes the base ring (in this case, the integers), 3 denotes the number of variables (x, y, z), and [3, 1, 1, 2, 2, 1] specifies the coefficients of the quadratic form in the order .
To find the matrix of a quadratic form we can use the function, as follows:
Exercise
Consider the quadratic form . Find the matrix corresponding to and next determine the rank and signature of this quadratic form.
Solution:
Exercise for practice
Find the matrix, the rank and the signature of the quadratic form on given by .
3.6. Linear programming
Linear programming builds upon notions from linear algebra discussed in chapter 2 and is a dynamic tool for solving linear optimization problems subject to linear constraints. In a linear programming problem (LP problem in short) we seek for a vector on some Euclidean space maximizing (or minimizing) the value of some linear functional, among all vectors satisfying a given system of linear constraints that govern the process. The idea of maximizing or minimizing a linear function subject to linear constraints arises naturally in many fields. For instance we may want to maximize profits, minimize costs, etc.
In SAGE, we'll use the Mixed Integer Linear Programming (MILP) package for -problems
Let us consider a simple -problem.
Exercise
Maximize
Subject to:
Solution:
We can solve this problem using SAGE MixedIntegerLinearProgram()
object by
defining the variables
defining the objective function
setting the constraints
Solving the problem with
.solve()
method
We can read the optimal assignment found by the solver for the variables through the .get_values({variable})
method of MILP obj
Exercise for practice
Minimize under the conditions:
,
,
with for
Exercise
A company produces two products, and , using two machines, and . The production of requires hour of machine and hours of machine , while the production of requires hours of machine and hour of machine . The company has hours of machine and hours of machine available per week. The profit per unit of product is and the profit per unit of product Y is . How many units of each product should the company produce to maximize profit?
Solution:
We can model this problem using LP by defining the decision variables and the objective function:
Thus the optimal number to maximize profits is units of product and units of product
We can check it with the graphical method result
Exercise
An investor wants to invest in three companies: Technology (T), Healthcare (H), and Energy (E). She has $10,000 to invest and wants to minimize risk. However, the government regulates the investments and made a law where you should invest at least 1000$ to Healtcare sector and your investments in Technology can't exceed 4000$.
Solution:
This problem has an infinite amount of solutions, let's check this with the help of SAGE
As you can see, there's an infinite amount of solutions
Exercise
A manufacturing company produces two types of products, A and B, using two machines, M1 and M2. The company aims to fulfill specific production requirements while efficiently utilizing its resources. Interestingly, the company doesn't have a preference for producing more of one product over the other. Hence, the problem leads to an infinite number of optimal solutions.
Given:
Machine M1 has 20 hours available.
Machine M2 has 15 hours available.
Product A requires 1 hour on M1 and 1 hour on M2 per unit.
Product B requires 1 hour on M1 and 1 hour on M2 per unit.
The company must produce at least 10 units of both products A and B.
Objective Function (Minimize):
Subject to Constraints:
(Hours on machine M1)
(Hours on machine M2)
(Minimum production of product A)
(Minimum production of product B)
Since the company is indifferent between products A and B, any combination of A and B that satisfies the constraints is considered optimal. Thus, the linear program has infinite optimal solutions, reflecting the flexibility in production planning.
Solution:
Exercise
A food factory wants to produce three products: Pasta (P), Pizza (Z), and Salad (S) while minimizing costs. Factory spends 3$ on Pasta, 4$ on Pizza and 2$ on Salads.
Constrains:
(Maximum production capacity)
(Minimum flour requirement)
(Demand for more Pasta)
(Demand for less Pizza and Salad)
Solution:
Exercise
An insurance company has a capital of which aims to invest in two different ways: the investment of type and the investment of type . These types of investments give an annual income of and of , respectively. However, there is a limitation by the government, which requires that at least of the capitals, should be invested in the -type investment. On the other hand, the policy of the company requires the ratio between the capital used for the -type investment and the capital used for the X-type of investment, not be greater than 1.5. How the company should invest its capital?
Solution:
3.7. Markov processes and other applications. (optional)
Markov processes are a type of stochastic process where the future state depends only on the current state, and not on any past states. ... In this tutorial, we will explore how to create and analyze Markov processes using SAGE.
Formally, the Markov property states that the conditional probability distribution for the system at the next step (and in fact at all future steps) given its current state depends only on the current state of the system, and not additionally on the state of the system at previous steps:
The possible values of or the set of all states of the system form a countable set 𝕏 called the state space of the chain. The changes of state of the system are called transitions, and the probabilities associated with various state-changes are called transition probabilities.
Since the system changes randomly, it is generally impossible to predict the exact state of the system in the future. However, the statistical and probabilistic properties of the system's future can be predicted. In many applications it is these statistical properties that are important.
Markov chains are often depicted by a weighted directed graph, where the edges are labeled by the probabilities of going from one state to the other states. This is called the flow diagram or transition probability diagram. The transition probability matrix P encodes the probabilities associated with state-changes or "jumps" from one state to another in the state-space 𝕏. If 𝕏 is labelled by then the i,j-th entry in the matrix P corresponds to the probability of going from state i to state j in one time-step.
The state of the system at the n-th time step is described by a state probability vector:
Thus, is the probability you will find the Markov chain at state at time-step n and is called the initial probability vector at the convenient initial time 0.
The state space 𝕏 and transition probability matrix P completely characterizes a Markov chain.
Random Walks and Random Graphs in SageMath
Let's get introduced to simple random walks in SageMath.
A 3D Random Walk
(originally by William Stein)
3.8. The Matplotlib library in SageMath
Matplotlib is a powerful and versatile Python library used for creating static, animated, and interactive visualizations. It is widely adopted for plotting graphs and charts in both scientific and data analytics contexts. With its simple interface, users can generate line plots, bar charts, histograms, scatter plots, and more complex visualizations like 3D plots and other. Matplotlib integrates seamlessly with various environments, including SageMath, making it an essential tool for visualizing mathematical computations and data. It also offers extensive customization for plot aesthetics, allowing users to control everything from colors and styles to axes and legends.
In particular, in SageMath, Matplotlib is used for a wide range of applications, particularly for visualizing mathematical data and functions. Below we desdcribe some key applications:
Application A: Graphing Functions
You can use Matplotlib to plot various types of mathematical functions, such as linear, polynomial, trigonometric, and exponential functions. This is helpful for studying the behavior of functions and analyzing their properties, such as roots, maxima, minima, and inflection points. Let us describe an example using the sine function.
Application B: 3D Surface Plots
Matplotlib, when combined with numpy in SageMath, allows the visualization of 3D surfaces such as paraboloids, hyperboloids, or any user-defined surface. This is particularly useful for geometric and calculus problems.
Application C: Statistical Data Visualization
Matplotlib is ideal for visualizing statistical data, such as histograms, scatter plots, and box plots. In SageMath, this is useful for analyzing large data sets or distributions in various fields such as statistics, data science, and machine learning.
Application D: Contour and Heatmaps
These are used to visualize 2D data that represents a function’s behavior over a grid. In SageMath, they are commonly applied in calculus, differential equations, and optimization problems to understand level sets and gradient flows.
Application E: Parametric and Polar Plots
Matplotlib is helpful in SageMath for visualizing parametric equations, polar coordinates, and vector fields, which are often used in advanced calculus and physics problems.
Let us first describe a simple example:
The rose curve (r = sin(3θ))
Further examples are presented below:
Application F: Smoothing Data
The pandas package enables reading csv files (datasets) that can be visualized using Matplotlib package.
Moving average is a commonly used technique in time series analysis for smoothing data and removing noise. It involves taking the average of a certain number of consecutive data points, called the window size or span. For example, a 7-day moving average of a daily time series would be the average of the past week of data with the kernel . Other kernels can be used (think about how to do a weighted average). Common kernel shapes include simple symmetric forms like box or triangular shapes. The np package includes convolve function for this purpose.
Tak a look at the following example that reads a COVID-19 dataset:
The first column, "datum," contains dates (indexed from 0), and the eighth column, "incidence_pozitivni," contains the number of people who tested positive for COVID-19 on the corresponding day.
Convolution, in this context, is essentially the dot product of a segment of the time series, equal in length to the kernel, with the kernel itself.
Use daily time series and compute the moving average with 7-day window to get trend. By subtracting the trend from the time series you can get the week component.
Take the smoothed data at the beginning of the epidemic outbreak and check for an exponential trend.
Application G: Similarity Measure
In the context of similarity recognition, we usually use the dot product of normalized vectors – the cosine similarity – to measure the similarity between two vectors quantifying objects:
You can create a heatmap to visualize the results when comparing multiple objects with each other.
The following example shows code using Matplotlib to create a matrix visualization for 5 different vectors that describe 5 students with scores ranging from 0 to 5 across 10 courses. The similarity measure helps to identify a threshold between good and poor performance.
Compare COVID-19 incidences (eighth column) with positive PCR tests with symptoms ("PCR_pozit_sympt," thirteenth column) by plotting the data.
Include additional time series for comparison.
Use the cosine similarity measure. Experiment with various features such as raw data, trends, and weekly variance. Remember to normalize the data.
Application H: Image as Data, Compression using SVD
Images are often represented as multidimensional arrays of pixel values, where each pixel can contain information such as intensity or color (RGB). The PIL (Pillow) package is commonly used to load, manipulate, and convert images between formats, while NumPy provides powerful tools to perform mathematical operations on these pixel arrays efficiently. Matplotlib is used to visualize the images and any derived data, enabling a clear and intuitive understanding of the image processing results.
If you have problems with reading jpg file, upgrade the Pillow package running pip install --upgrade pillow
Singular Value Decomposition (SVD)
is a factorization method used to decomposes a matrix into three matrices.
Let be an matrix with rank . SVD factorizes as , where is an orthonormal matrix, is an diagonal matrix, and is an orthonormal matrix. The diagonal entries of , denoted by , are called the singular values of , and they are ordered in non-increasing order:
The columns of and are called the left and right singular vectors of , respectively. The left singular vectors are the eigenvectors of , and the right singular vectors are the eigenvectors of . Moreover, the SVD can be written by columns and (left and right singular vectors) as
Since the diagonal entries of are arranged in descending order, we can truncate this sum to obtain a low rank approximation of . Omitting the smaller singular values effectively reduces the rank of .
The truncated SVD is then , where and are the first columns of and , respectively, and is obtained by keeping only the first singular values on the diagonal of . This approximation has rank and is the best rank- approximation of in the sense that it minimizes the Frobenius norm (the Frobenius norm of matrix is defined as the square root of the sum of the squared absolute values of its elements.) of the difference between and any matrix of rank at most .
Low rank approximations by SVD are widely used in many areas, as they allow us to represent the original matrix data with fewer parameters, while retaining most of its important features. Here are some examples:
Image compression: In image processing, large matrices are often used to represent images. Truncated SVD can be used to approximate these matrices, which enables image compression without significant loss of quality.
Collaborative filtering: Recommender systems use matrix factorization techniques based on truncated SVD to predict user preferences for items in a large database. By approximating the user-item rating matrix with a low-rank SVD approximation, these systems can recommend items that a user is likely to enjoy based on their past preferences.
Data analysis: Truncated SVD can be used to analyze large datasets and extract useful information. For example, in genetics, truncated SVD can be used to identify genes that are most strongly associated with a particular trait.
Latent semantic analysis: In natural language processing, truncated SVD is used in latent semantic analysis (LSA) to identify relationships between words and documents. By approximating the term-document matrix with a low-rank SVD approximation, LSA can identify underlying themes or topics in a large corpus of text.
Network analysis: Truncated SVD can be used to analyze large graphs or networks. For example, in social network analysis, truncated SVD can be used to identify clusters or communities of individuals with similar interests or behavior.
If you have problems with reading jpg file, upgrade the Pillow package running pip install --upgrade pillow
Find the percentage difference by subtracting the images. Cross-validate it with the cumulative percentage of stored information.
Can you find the appropriate value of for a given threshold of the percentage of stored information?
The RGB data is stored in a matrix with 3 indices, where the red channel is stored in layer 0 with indices [horizontal pixel size, vertical pixel size, 0]. Separate the RGB Garfield image into its color channels and compress each color channel separately. Then concatenate them back together using np.stack. along the axis=2 (third layer). To maintain a valid range of , SageMath offers clipping by default, but it is not actually needed.