import sympy as sy
sy.init_printing()
import matplotlib.pyplot as plt
plt.style.use('ggplot')
import numpy as np

Similarity

If $A = PBP^{-1}$, we say $A$ is similar to $B$. Decomposing $A$ into $PBP^{-1}$ is also called a similarity transformation, where $P$ is an invertible matrix.

If $n\times n$ matrices $A$ and $B$ are similar, they have the same eigenvalues.
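
As a quick sanity check in SymPy (the matrices $B$ and $P$ below are arbitrary example choices, with $P$ invertible), we can construct $A = PBP^{-1}$ and confirm that $A$ and $B$ have the same eigenvalues:

# A minimal check that similar matrices share eigenvalues
# (B and P are arbitrary example matrices; P must be invertible)
B = sy.Matrix([[2, 1],
               [0, 3]])
P = sy.Matrix([[1, 1],
               [1, 2]])
A = P * B * P.inv()        # A is similar to B by construction
print(A.eigenvals())       # eigenvalues 2 and 3, each with multiplicity 1
print(B.eigenvals())       # same eigenvalues as A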

Here are some reasons why we need similarity transformations:

  1. Invariant Properties: Similar matrices share many important properties, such as eigenvalues, determinant, trace, and rank. This invariance makes similarity transformations useful for simplifying problems without changing their fundamental characteristics.

  2. Changing Basis: Similarity transformations can be interpreted as changing the basis in which the linear transformation represented by the matrix is expressed. This change of basis can make the problem easier to understand or solve.

  3. Applications: Similarity transformations are widely used in various applications, including diagonalization, canonical forms, and simplifying differential equations.

Diagonalization, which we will explain below, is a special case of a similarity transformation.

Diagonalizable Matrix and Special Decomposition

Let $A$ be an $n\times n$ matrix. If there exists an $n\times n$ invertible matrix $P$ and a diagonal matrix $D$ such that

$$A=PDP^{-1}$$

then matrix $A$ is called a diagonalizable matrix.

Furthermore, the columns of $P$ are linearly independent eigenvectors of $A$, and the corresponding eigenvalues lie on the principal diagonal of $D$. In other words, $A$ is diagonalizable if and only if it has $n$ linearly independent eigenvectors, i.e. the dimensions of its eigenspaces sum to $n$.

Let's show why this equation holds. Define $P$ and $D$

$$P = \left[\begin{array}{llll} v_{1} & v_{2} & \cdots & v_{n} \end{array}\right], \qquad D = \left[\begin{array}{cccc} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_{2} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_{n} \end{array}\right]$$

where $v_i,\ i = 1, 2, \ldots, n$ is an eigenvector of $A$ and $\lambda_i,\ i = 1, 2, \ldots, n$ is the corresponding eigenvalue of $A$.

AP=A[v1v2vn]=[Av1Av2Avn]AP = A\left[\begin{array}{llll} {v}_{1} & {v}_{2} & \cdots & {v}_{n} \end{array}\right]=\left[\begin{array}{llll} A {v}_{1} & A {v}_{2} & \cdots & A {v}_{n} \end{array}\right]
PD=[v1v2vn][λ1000λ2000λn]=[λ1v1λ2v2λnvn]P D=\left[\begin{array}{llll} {v}_{1} & {v}_{2} & \cdots & {v}_{n} \end{array}\right]\left[\begin{array}{cccc} \lambda_{1} & 0 & \cdots & 0 \\ 0 & \lambda_{2} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_{n} \end{array}\right]=\left[\begin{array}{lllll} \lambda_{1} {v}_{1} & \lambda_{2} {v}_{2} & \cdots & \lambda_{n} {v}_{n} \end{array}\right]

We know that $Av_i = \lambda_i v_i$, i.e.

$$AP = PD$$

Since the columns of $P$ are $n$ linearly independent eigenvectors, $P$ is invertible, so

$$A = PDP^{-1}$$
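
A quick numerical sanity check of this identity (the matrix $M$ below is an arbitrary example; note that np.linalg.eig returns the eigenvectors as the columns of its second output, which plays the role of $P$):

# Verify MP = PD and M = P D P^{-1} numerically for an arbitrary example matrix M
M = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigenvalues, P_num = np.linalg.eig(M)   # columns of P_num are eigenvectors of M
D_num = np.diag(eigenvalues)
print(np.allclose(M @ P_num, P_num @ D_num))                  # True: MP = PD
print(np.allclose(M, P_num @ D_num @ np.linalg.inv(P_num)))   # True: M = P D P^{-1}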

Strictly speaking, if $A$ is symmetric, i.e. $A=A^T$, the decomposition above is called a spectral decomposition: the similar matrix $D$ holds all the eigenvalues on its diagonal, and $P$ can be chosen to be an orthogonal matrix, meaning any two of its columns are perpendicular (and each column has unit length), so that $P^{-1} = P^T$. Therefore the decomposition can be rewritten as

$$A = PDP^{T}$$
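
A minimal sketch of the spectral decomposition with NumPy, assuming the arbitrary symmetric example matrix $S$ below; np.linalg.eigh is specialized for symmetric matrices and returns an orthogonal eigenvector matrix:

# Spectral decomposition S = P D P^T for a symmetric example matrix
S = np.array([[1.0, 3.0],
              [3.0, -5.0]])
eigenvalues, P_s = np.linalg.eigh(S)    # eigh assumes a symmetric input
D_s = np.diag(eigenvalues)
print(np.allclose(P_s.T @ P_s, np.eye(2)))   # True: P is orthogonal
print(np.allclose(S, P_s @ D_s @ P_s.T))     # True: S = P D P^T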

We can show why eigenvectors corresponding to distinct eigenvalues are orthogonal. Set up two equations

$$A \mathbf{v}_1=\lambda_1 \mathbf{v}_1, \qquad A \mathbf{v}_2=\lambda_2 \mathbf{v}_2$$

Take the transpose of the first equation and multiply by $\mathbf{v}_2$:

$$\mathbf{v}_1^T A^T \mathbf{v}_2=\lambda_1 \mathbf{v}_1^T \mathbf{v}_2$$

Since $A$ is symmetric $\left(A=A^T\right)$:

$$\mathbf{v}_1^T A \mathbf{v}_2=\lambda_1 \mathbf{v}_1^T \mathbf{v}_2$$

Substitute $A \mathbf{v}_2=\lambda_2 \mathbf{v}_2$ into the equation:

$$\mathbf{v}_1^T\left(\lambda_2 \mathbf{v}_2\right)=\lambda_1 \mathbf{v}_1^T \mathbf{v}_2$$

Simplify to get:

$$\lambda_2 \mathbf{v}_1^T \mathbf{v}_2=\lambda_1 \mathbf{v}_1^T \mathbf{v}_2 \quad\Longrightarrow\quad \left(\lambda_2-\lambda_1\right) \mathbf{v}_1^T \mathbf{v}_2=0$$

Since $\lambda_1 \neq \lambda_2$, this implies that $\mathbf{v}_1^T \mathbf{v}_2=0$, meaning $\mathbf{v}_1$ and $\mathbf{v}_2$ are orthogonal.

A correlation or covariance matrix is a good example: it is square and symmetric, so all of its eigenvalues are real.
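
For instance, here is a small sketch with randomly generated data (the sample size and number of variables are arbitrary choices): the sample covariance matrix is symmetric, and its eigenvalues come out real and non-negative:

# Covariance matrix of random data: symmetric, with real non-negative eigenvalues
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))     # 100 observations of 3 variables
C = np.cov(X, rowvar=False)       # 3x3 sample covariance matrix
evals, evecs = np.linalg.eigh(C)
print(np.allclose(C, C.T))        # True: C is symmetric
print(evals)                      # real, non-negative eigenvalues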

Spectral Decomposition Visualization

Let's visualize the spectral decomposition: $P$ and $P^T$ act as rotations because they are orthogonal, and $D$ acts as a scaling because it is diagonal.

# Define matrix A
A = np.array([[1, 3],
              [3, -5]])

# Calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

# Plot eigenvectors
fig, ax = plt.subplots(figsize=(15, 4), nrows=1, ncols=4)

for i in range(2):
    ax[0].quiver(0, 0, eigenvectors[:, i][0], eigenvectors[:, i][1],
                 angles='xy', scale_units='xy', scale=1, color=['r', 'b'][i])
ax[0].set_xlim(-2, 2)
ax[0].set_ylim(-2, 2)
ax[0].set_title('Eigenvectors of $A$')

for i in range(4):
    ax[i].set_xlabel('x')
    ax[i].set_ylabel('y')
    ax[i].set_aspect('equal')

ax[1].quiver(0, 0, 1, 0, angles='xy', scale_units='xy', scale=1, color=['r'])
ax[1].quiver(0, 0, 0, 1, angles='xy', scale_units='xy', scale=1, color=['b'])
ax[1].set_xlim(-2, 2)
ax[1].set_ylim(-2, 2)
ax[1].set_title('$P^T$')

ax[2].quiver(0, 0, eigenvalues[0], 0, angles='xy', scale_units='xy', scale=1, color=['r'])
ax[2].quiver(0, 0, 0, eigenvalues[1], angles='xy', scale_units='xy', scale=1, color=['b'])
ax[2].set_xlim(-7, 7)
ax[2].set_ylim(-7, 7)
ax[2].set_title('$DP^T$')

temp = np.array([[eigenvalues[0], 0],
                 [0, eigenvalues[1]]])
temp1 = temp @ eigenvectors
ax[3].quiver(0, 0, temp1[:, 0][0], temp1[:, 0][1], angles='xy', scale_units='xy', scale=1, color=['r'])
ax[3].quiver(0, 0, temp1[:, 1][0], temp1[:, 1][1], angles='xy', scale_units='xy', scale=1, color=['b'])
ax[3].set_xlim(-7, 7)
ax[3].set_ylim(-7, 7)
ax[3].set_title('$PDP^T$')

plt.show()
[Output: four-panel quiver plot titled 'Eigenvectors of $A$', '$P^T$', '$DP^T$', and '$PDP^T$']

Diagonalizing a Matrix

Consider a matrix

$$A=\left[\begin{array}{rrr} 1 & 3 & 3 \\ -3 & -5 & -3 \\ 3 & 3 & 1 \end{array}\right]$$

We seek to diagonalize the matrix $A$.

Following these steps:

  1. Compute the eigenvalues of $A$.

  2. Compute the eigenvectors of $A$.

  3. Construct $P$ from the eigenvectors.

  4. Construct $D$ from the eigenvalues corresponding to the columns of $P$.

A = sy.Matrix([[1, 3, 3],
               [-3, -5, -3],
               [3, 3, 1]])
eig = A.eigenvects()
eig

$$\displaystyle \left[ \left( -2, \ 2, \ \left[ \left[\begin{matrix}-1\\1\\0\end{matrix}\right], \ \left[\begin{matrix}-1\\0\\1\end{matrix}\right]\right]\right), \ \left( 1, \ 1, \ \left[ \left[\begin{matrix}1\\-1\\1\end{matrix}\right]\right]\right)\right]$$

Reminder: the return value takes the form [(eigenval, multiplicity, eigenspace), ...].

Construct $P$

P = sy.zeros(3, 3)
P[:, 0] = eig[0][2][0]
P[:, 1] = eig[0][2][1]
P[:, 2] = eig[1][2][0]
P

$$\displaystyle \left[\begin{matrix}-1 & -1 & 1\\1 & 0 & -1\\0 & 1 & 1\end{matrix}\right]$$

Construct $D$

D = sy.diag(eig[0][0], eig[0][0], eig[1][0])
D

$$\displaystyle \left[\begin{matrix}-2 & 0 & 0\\0 & -2 & 0\\0 & 0 & 1\end{matrix}\right]$$

We can verify that $PDP^{-1}=A$ holds:

P * D * P.inv() == A
True

Of course, we don't need to go through this process step by step; there is a diagonalize method in SymPy.

P, D = A.diagonalize()
P

$$\displaystyle \left[\begin{matrix}-1 & -1 & 1\\1 & 0 & -1\\0 & 1 & 1\end{matrix}\right]$$

D

$$\displaystyle \left[\begin{matrix}-2 & 0 & 0\\0 & -2 & 0\\0 & 0 & 1\end{matrix}\right]$$

We obtain the same results as in the previous separate steps.

Sometimes we just want to test whether a matrix is diagonalizable; in that case, use is_diagonalizable in SymPy.

A.is_diagonalizable()
True

If $A$ is symmetric, eigenvectors corresponding to distinct eigenvalues are orthogonal:

$$Av_1 = \lambda_1v_1 \quad \text{and} \quad Av_2 = \lambda_2v_2$$

$$v_2 \cdot Av_1 = \lambda_1\, v_1 \cdot v_2, \qquad v_1 \cdot Av_2 = \lambda_2\, v_2 \cdot v_1$$

$$v_1 \cdot Av_2 = A^Tv_1 \cdot v_2 = Av_1 \cdot v_2 = \lambda_1\, v_1 \cdot v_2$$

Comparing the two expressions for $v_1 \cdot Av_2$ gives $(\lambda_2 - \lambda_1)\, v_1 \cdot v_2 = 0$, so $v_1 \cdot v_2 = 0$ whenever $\lambda_1 \neq \lambda_2$.
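
We can verify this on a small symmetric example in SymPy (the matrix $Q$ below is an arbitrary illustration with two distinct eigenvalues, taking one eigenvector per eigenvalue):

# Eigenvectors of a symmetric matrix with distinct eigenvalues are orthogonal
Q = sy.Matrix([[2, 1],
               [1, 2]])
vects = [triple[2][0] for triple in Q.eigenvects()]   # one eigenvector per eigenvalue
print(vects[0].dot(vects[1]))                         # 0: the eigenvectors are orthogonal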

How Diagonalization Simplifies the Solution Process

Original System: The original system of differential equations is coupled, meaning that each equation involves multiple variables. For example:

$$\frac{d \mathbf{x}}{d t}=A \mathbf{x}$$

where $A$ is a matrix and $\mathbf{x}$ is a vector of variables.

By finding the eigenvalues and eigenvectors of the matrix $A$, we can decompose it into $P D P^{-1}$, where $D$ is a diagonal matrix of eigenvalues and $P$ is the matrix of eigenvectors:

$$A=P D P^{-1}$$

We introduce a change of variables using the eigenvector matrix $P$:

$$\mathbf{x}=P \mathbf{y}$$

Substituting this into the original differential equation gives:

$$\frac{d(P \mathbf{y})}{d t}=A(P \mathbf{y})$$

Simplifying, we get:

$$P \frac{d \mathbf{y}}{d t}=P D P^{-1}(P \mathbf{y})$$

Since $P^{-1} P=I$:

$$P \frac{d \mathbf{y}}{d t}=P D \mathbf{y}$$

Multiplying both sides by $P^{-1}$:

$$\frac{d \mathbf{y}}{d t}=D \mathbf{y}$$

The transformed system $\frac{d \mathbf{y}}{d t}=D \mathbf{y}$ is now decoupled because $D$ is a diagonal matrix. This means that each differential equation in the system involves only a single variable $y_i$ and can be solved independently:

$$\frac{d y_i}{d t}=\lambda_i y_i$$

where $\lambda_i$ are the eigenvalues of $A$.

Solving the Decoupled System: The decoupled differential equations are simple first-order linear differential equations, which have straightforward solutions:

$$y_i(t)=c_i e^{\lambda_i t}$$

where $c_i$ are constants determined by the initial conditions. Finally, we transform the solution back to the original variables using the eigenvector matrix $P$:

$$\mathbf{x}=P \mathbf{y}$$

This gives us the solution to the original system.
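
A minimal numerical sketch of this whole procedure (the coefficient matrix, initial condition, and evaluation time below are arbitrary example choices): diagonalize $A$, solve each decoupled equation as $y_i(t)=c_i e^{\lambda_i t}$, and map back with $\mathbf{x}=P\mathbf{y}$.

# Solve dx/dt = A x via diagonalization: x(t) = P diag(exp(lambda_i t)) P^{-1} x(0)
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])       # example coefficient matrix (eigenvalues -1 and -2)
x0 = np.array([1.0, 0.0])          # example initial condition

eigenvalues, P = np.linalg.eig(A)
c = np.linalg.solve(P, x0)         # c = P^{-1} x(0): initial values in the y-coordinates

def x(t):
    # y_i(t) = c_i * exp(lambda_i * t), then transform back with x = P y
    return P @ (c * np.exp(eigenvalues * t))

print(x(0.0))   # recovers x0
print(x(1.0))   # solution at t = 1

Each component of $\mathbf{y}$ evolves independently; all of the coupling is handled by the single change of basis $P$.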