import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import pandas as pd
import sympy as sy
sy.init_printing()
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

LU Factorization

LU factorization is valued for its computational efficiency: it mainly facilitates solving systems of linear equations, although for a single solve its flop count (number of floating-point operations) offers no savings over plain Gaussian elimination.

Nonetheless, LU factorization is particularly handy if you are computing multiple solutions of $Ax = b$.

One example: if you have a set of right-hand sides $\{b_i,\ i \in \mathbb{Z}\}$ to substitute into the system, such that $$Ax=b_1,\quad Ax=b_2,\quad Ax=b_3,\quad Ax=b_4, \quad \ldots$$ we only decompose $A$ once, whereas the Gauss-Jordan algorithm has to redo the row operations for every augmented matrix $[A\,|\,b_i]$.
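As a minimal sketch of this reuse pattern, SciPy's lu_factor/lu_solve pair factorizes $A$ once and then solves cheaply for each new right-hand side (using the np/sp imports above; the $b_i$ here are made up for illustration):

A = np.array([[9., 3., 6.], [3., 4., 6.], [0., 8., 8.]])
lu, piv = sp.linalg.lu_factor(A)              # factorize A once
for b in (np.array([1., 2., 3.]), np.array([4., 5., 6.])):
    x = sp.linalg.lu_solve((lu, piv), b)      # cheap triangular solves per b
    print(x, np.allclose(A @ x, b))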

LU Algorithm

We aim to decompose a matrix $A$ into $L$ and $U$, which represent a lower and an upper triangular matrix respectively, where $L$ must have $1$'s on its principal diagonal:

$$A = LU$$

For instance,

$$A=\underbrace{\left[\begin{array}{cccc} 1 & 0 & 0 & 0 \\ * & 1 & 0 & 0 \\ * & * & 1 & 0 \\ * & * & * & 1 \end{array}\right]}_{L}\underbrace{\left[\begin{array}{ccccc} \blacksquare & * & * & * & * \\ 0 & \blacksquare & * & * & * \\ 0 & 0 & 0 & \blacksquare & * \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]}_{U}$$

The feasibility of the decomposition hinges on whether $A$ can be converted into an upper triangular matrix by a series of row operations, i.e.

$$E_p \cdots E_2E_1 A = U$$

Then

$$A = (E_p \cdots E_2E_1)^{-1}U = LU$$

where

$$L = (E_p \cdots E_2E_1)^{-1}$$

We will hand-calculate an example here:

$$A = \begin{bmatrix} 9 & 3 & 6\\ 3 & 4 & 6\\ 0 & 8 & 8 \end{bmatrix}$$

$$\underbrace{\begin{bmatrix} 1 & 0 & 0\\ -\frac{1}{3} & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}}_{E_1} \underbrace{\begin{bmatrix} 9 & 3 & 6\\ 3 & 4 & 6\\ 0 & 8 & 8 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 9 & 3 & 6\\ 0 & 3 & 4\\ 0 & 8 & 8 \end{bmatrix}}_{E_1A}$$

$$\underbrace{\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -\frac{8}{3} & 1 \end{bmatrix}}_{E_2} \underbrace{\begin{bmatrix} 9 & 3 & 6\\ 0 & 3 & 4\\ 0 & 8 & 8 \end{bmatrix}}_{E_1A} = \underbrace{\begin{bmatrix} 9 & 3 & 6\\ 0 & 3 & 4\\ 0 & 0 & -\frac{8}{3} \end{bmatrix}}_{E_2E_1A=U}$$

$$L^{-1} = E_2E_1 = \underbrace{\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -\frac{8}{3} & 1 \end{bmatrix}}_{E_2} \underbrace{\begin{bmatrix} 1 & 0 & 0\\ -\frac{1}{3} & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}}_{E_1} = \begin{bmatrix} 1 & 0 & 0\\ -\frac{1}{3} & 1 & 0\\ \frac{8}{9} & -\frac{8}{3} & 1 \end{bmatrix}$$

Or we can compute $E_1^{-1}$ and $E_2^{-1}$ directly, because $$L = E_1^{-1}E_2^{-1}$$

Construct augmented matrices for $E_1$ and $E_2$: $$[E_1| I]=\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0\\ -\frac{1}{3} & 1 & 0 & 0 & 1 & 0\\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0\\ 0 & 1 & 0 & \frac{1}{3} & 1 & 0\\ 0 & 0 & 1 & 0 & 0 & 1 \end{array}\right] =[I|E_1^{-1}]$$

$$[E_2| I]= \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0\\ 0 & 1 & 0 & 0 & 1 & 0\\ 0 & -\frac{8}{3} & 1 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0\\ 0 & 1 & 0 & 0 & 1 & 0\\ 0 & 0 & 1 & 0 & \frac{8}{3} & 1 \end{array}\right] =[I|E_2^{-1}]$$

And we get the inverses of $E_1$ and $E_2$: $$E_1^{-1} = \left[\begin{array}{ccc} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0\\ 0 & 0 & 1 \end{array}\right],\qquad E_2^{-1} = \left[\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0\\ 0 & \frac{8}{3} & 1 \end{array}\right]$$

$$L = E_1^{-1}E_2^{-1}= \left[\begin{array}{ccc} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0\\ 0 & 0 & 1 \end{array}\right] \left[\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0\\ 0 & \frac{8}{3} & 1 \end{array}\right] = \left[\begin{array}{ccc} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0\\ 0 & \frac{8}{3} & 1 \end{array}\right]$$

The final result of the LU decomposition: $$A = LU = \underbrace{\left[\begin{array}{ccc} 1 & 0 & 0 \\ \frac{1}{3} & 1 & 0\\ 0 & \frac{8}{3} & 1 \end{array}\right]}_{L} \underbrace{\begin{bmatrix} 9 & 3 & 6\\ 0 & 3 & 4\\ 0 & 0 & -\frac{8}{3} \end{bmatrix}}_{E_2E_1A=U}$$
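As a quick numerical sanity check (not part of the derivation), the hand-computed factors multiply back to $A$:

L = np.array([[1, 0, 0], [1/3, 1, 0], [0, 8/3, 1]])
U = np.array([[9, 3, 6], [0, 3, 4], [0, 0, -8/3]])
L @ U  # reproduces A = [[9, 3, 6], [3, 4, 6], [0, 8, 8]]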

We can take a look at SciPy's results.

A = np.array([[9, 3, 6], [3, 4, 6], [0, 8, 8]]); A
array([[9, 3, 6], [3, 4, 6], [0, 8, 8]])
P, L, U = sp.linalg.lu(A)
P
L
U
array([[1., 0., 0.], [0., 0., 1.], [0., 1., 0.]])
array([[1. , 0. , 0. ], [0. , 1. , 0. ], [0.33333333, 0.375 , 1. ]])
array([[9., 3., 6.], [0., 8., 8.], [0., 0., 1.]])
P@L@U # this is A
array([[9., 3., 6.], [3., 4., 6.], [0., 8., 8.]])

It is easy to notice that the SciPy lu function gives us more than $L$ and $U$: it also returns $P$, which is a permutation matrix. Theoretically, we don't need row exchanges to convert $A$ into $U$, but in some situations we must swap rows beforehand, otherwise the decomposition will fail to materialize.

Thus SciPy uses the $PLU$ decomposition to make the procedure always possible: $$A = PLU$$
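As an aside, sp.linalg.lu can also fold the permutation into the left factor: passing permute_l=True returns the product $PL$ and $U$ as two outputs instead of three.

PL, U2 = sp.linalg.lu(A, permute_l=True)  # PL is P @ L
np.allclose(PL @ U2, A)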

Also $P = P^{-1}$. Why? It's easier to analyze with an augmented matrix: a row-switching elementary matrix is its own inverse.

$$[P| I]=\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0 & 1 & 0 \end{array}\right]=[I| P^{-1}]=[P^{-1}|I]$$
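We can confirm this numerically with the $P$ returned by SciPy above:

np.allclose(P @ P, np.eye(3))  # True: the row swap undoes itself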

With this knowledge in hand, what we are decomposing is actually

$$PA = LU$$
$$\begin{align} E_2E_1E_0A &= \underbrace{\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -\frac{3}{8} & 1 \end{bmatrix}}_{E_2} \underbrace{\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -\frac{1}{3} & 0 & 1 \end{bmatrix}}_{E_1} \underbrace{\begin{bmatrix} 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0 \end{bmatrix}}_{E_0} \begin{bmatrix} 9 & 3 & 6\\ 3 & 4 & 6\\ 0 & 8 & 8 \end{bmatrix}\\ &= \underbrace{\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -\frac{3}{8} & 1 \end{bmatrix}}_{E_2} \underbrace{\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ -\frac{1}{3} & 0 & 1 \end{bmatrix}}_{E_1} \underbrace{\begin{bmatrix} 9 & 3 & 6\\ 0 & 8 & 8\\ 3 & 4 & 6 \end{bmatrix}}_{E_0A}\\ &= \underbrace{\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -\frac{3}{8} & 1 \end{bmatrix}}_{E_2} \underbrace{\begin{bmatrix} 9 & 3 & 6\\ 0 & 8 & 8\\ 0 & 3 & 4 \end{bmatrix}}_{E_1E_0A}\\ &= \underbrace{\begin{bmatrix} 9 & 3 & 6\\ 0 & 8 & 8\\ 0 & 0 & 1 \end{bmatrix}}_{E_2E_1E_0A} = U \end{align}$$

Note that this $U$ matches the one SciPy returned above.

Rearranging, we can see $P$ and $L$: $$A = \underbrace{E_0^{-1}}_{P} \underbrace{(E_1^{-1}E_2^{-1})}_{L}U$$
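We can replay these elementary row operations in NumPy and confirm they reproduce SciPy's factors (a sketch using the matrices derived above):

E0 = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])      # swap rows 2 and 3
E1 = np.array([[1, 0, 0], [0, 1, 0], [-1/3, 0, 1]])   # row3 -= (1/3) * row1
E2 = np.array([[1, 0, 0], [0, 1, 0], [0, -3/8, 1]])   # row3 -= (3/8) * row2
U_manual = E2 @ E1 @ E0 @ A                           # matches SciPy's U
L_manual = np.linalg.inv(E1) @ np.linalg.inv(E2)      # matches SciPy's L
np.allclose(E0 @ L_manual @ U_manual, A)              # A = P L U, with P = E0^{-1} = E0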

Solving a Linear System Using LU Factorization

Solve the linear system: $$\begin{align} 3x_1-7x_2 -2x_3+2x_4&=-9\\ -3x_1+5x_2 +x_3 &=5\\ 6x_1-4x_2 -5x_4&=7\\ -9x_1+5x_2 -5x_3+12x_4&=11 \end{align}$$ In matrix form: $$\underbrace{\left[\begin{array}{rrrr} 3 & -7 & -2 & 2 \\ -3 & 5 & 1 & 0 \\ 6 & -4 & 0 & -5 \\ -9 & 5 & -5 & 12 \end{array}\right]}_{A} \left[\begin{array}{r} x_1 \\ x_2 \\ x_3 \\ x_4 \end{array}\right] = \underbrace{\left[\begin{array}{r} -9 \\ 5 \\ 7 \\ 11 \end{array}\right]}_{b}$$ Perform $LU$ decomposition:

$$\underbrace{\left[\begin{array}{rrrr} 3 & -7 & -2 & 2 \\ -3 & 5 & 1 & 0 \\ 6 & -4 & 0 & -5 \\ -9 & 5 & -5 & 12 \end{array}\right]}_{A} =\underbrace{\left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 2 & -5 & 1 & 0 \\ -3 & 8 & 3 & 1 \end{array}\right]}_{L}\underbrace{\left[\begin{array}{rrrr} 3 & -7 & -2 & 2 \\ 0 & -2 & -1 & 2 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & -1 \end{array}\right]}_{U}$$
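Again, a quick sanity check that the hand-derived factors multiply back to $A$:

L = np.array([[1, 0, 0, 0], [-1, 1, 0, 0], [2, -5, 1, 0], [-3, 8, 3, 1]])
U = np.array([[3, -7, -2, 2], [0, -2, -1, 2], [0, 0, -1, 1], [0, 0, 0, -1]])
L @ U  # reproduces A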

Replacing $A$ by $LU$, we get $L(Ux) = b$; now solve this pair of equations:

$$Ly = b\\ Ux = y$$

Construct the augmented matrix $[L\,|\,b]$:

$$\underbrace{\left[\begin{array}{rrrr} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 2 & -5 & 1 & 0 \\ -3 & 8 & 3 & 1 \end{array}\right]}_{L} \underbrace{\left[\begin{array}{r} y_1 \\ y_2 \\ y_3 \\ y_4 \end{array}\right]}_{y} = \underbrace{\left[\begin{array}{r} -9 \\ 5 \\ 7 \\ 11 \end{array}\right]}_{b}$$

$$\left[\begin{array}{rrrr|r} 1 & 0 & 0 & 0 & -9 \\ -1 & 1 & 0 & 0 & 5\\ 2 & -5 & 1 & 0 & 7\\ -3 & 8 & 3 & 1 & 11 \end{array}\right]\sim \left[\begin{array}{rrrr|r} 1 & 0 & 0 & 0 & -9 \\ 0 & 1 & 0 & 0 & -4\\ 0 & 0 & 1 & 0 & 5\\ 0 & 0 & 0 & 1 & 1 \end{array}\right]$$

$$\left[\begin{array}{r} y_1 \\ y_2 \\ y_3 \\ y_4 \end{array}\right]= \left[\begin{array}{r} -9 \\ -4 \\ 5 \\ 1 \end{array}\right]$$

Next we solve $Ux = y$; to show this in NumPy:

U = np.array([[3, -7, -2, 2],
              [0, -2, -1, 2],
              [0, 0, -1, 1],
              [0, 0, 0, -1]])
y = np.array([-9, -4, 5, 1])
np.linalg.solve(U, y)
array([ 3., 4., -6., -1.])
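Note that np.linalg.solve does not exploit the triangular structure. SciPy's solve_triangular does, and it expresses the two-step scheme directly (a sketch reusing the $L$ and $U$ above and defining $b$):

b = np.array([-9, 5, 7, 11])
L = np.array([[1, 0, 0, 0], [-1, 1, 0, 0], [2, -5, 1, 0], [-3, 8, 3, 1]])
y = sp.linalg.solve_triangular(L, b, lower=True)   # forward substitution: Ly = b
x = sp.linalg.solve_triangular(U, y, lower=False)  # back substitution: Ux = y
x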

If the process is correct, this is the solution of the system; we can verify it by invoking np.linalg.solve() on the original $A$ and $b$.

A = np.array([[3, -7, -2, 2],
              [-3, 5, 1, 0],
              [6, -4, 0, -5],
              [-9, 5, -5, 12]])
b = np.array([-9, 5, 7, 11])
np.linalg.solve(A, b)
array([ 3., 4., -6., -1.])

The results are the same: $LU$ decomposition works!

Cholesky Factorization

Cholesky decomposition is analogous to taking the square root of a matrix. A common example is the decomposition of a covariance matrix $\Sigma$, where the lower triangular factor $L$ is what imposes the correlation structure, as we show below. $$\Sigma = L L^T$$

Suppose we have a vector of uncorrelated random variables $\mathbf{z}$. Each element of $\mathbf{z}$ is a standard normal random variable with zero mean and unit variance; in mathematical terms, $\mathbf{z} \sim \mathcal{N}(0, I)$, where $I$ is the identity matrix. $$\mathbf{z}=\left(\begin{array}{c} z_1 \\ z_2 \\ \vdots \\ z_n \end{array}\right)$$

We can transform the uncorrelated vector $\mathbf{z}$ into a correlated vector $\mathbf{x}$: $$x = Lz$$

To understand this, we examine the covariance matrix of the transformed vector $x$: $$\text{Cov}(x) = \text{Cov}(Lz) = L\,\text{Cov}(z)\,L^T = L I L^T = L L^T = \Sigma$$ This shows that $\text{Cov}(x)$ has the desired covariance structure.

Let's see a numerical example.

# use antithetic variates to reduce variance: pair each draw z with its mirror -z
n = 1000
z = np.random.normal(size=(n, 3))
z2 = -z
z = np.concatenate((z, z2), axis=0)

# target covariance matrix and its Cholesky factor
cov_matrix = np.array([[1, 0.8, 0.7],
                       [0.8, 1, 0.8],
                       [0.7, 0.8, 1]])
L = np.linalg.cholesky(cov_matrix)

# impose the correlation structure, then check the sample covariance
z = z @ L.T
np.cov(z.T)
array([[0.99424759, 0.80479653, 0.71424917], [0.80479653, 1.00962669, 0.81794008], [0.71424917, 0.81794008, 1.02650048]])
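The sample covariance above is close to the target; the Cholesky factor itself reconstructs it exactly:

np.allclose(L @ L.T, cov_matrix)  # True: L @ L.T equals Sigma by construction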