Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
restrepo
GitHub Repository: restrepo/ComputationalMethods
Path: blob/master/material/linear-algebra-diagonalization.ipynb
934 views
Kernel: Python 3 (ipykernel)

Matrix diagonalization

Open In Colab

Ortogonal matrices

alt

x=xcosθ+ysinθy=ycosθxsinθ.\begin{align} x'=&x\cos\theta+y\sin\theta \nonumber\\ y'=&y\cos\theta-x\sin\theta\,. \end{align}

En forma matricial, la transformación del sistema de coordenas inicial al sistema de coordenados rotado por un ángulo θ\theta está dado por (xy)=(cosθsinθsinθcosθ)(xy).\begin{align} \begin{pmatrix} x'\\ y'\\ \end{pmatrix}= \begin{pmatrix} \cos\theta & \sin\theta\\ -\sin\theta& \cos\theta\\ \end{pmatrix} \begin{pmatrix} x\\ y\\ \end{pmatrix}. \end{align}

Theorem 1

If A\boldsymbol{A} is Hermitic, e.g: A=A,\begin{align} \boldsymbol{A}^\dagger =\boldsymbol{A}\,, \end{align} Then exists an unitary matrix V\boldsymbol{V} e.g: V 1=V,\begin{align} \boldsymbol{V}^{\ -1}=\boldsymbol{V}^\dagger \,, \end{align} such that VAV=Adiag,\begin{equation} \boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{V}=\boldsymbol{A}_{\text{diag}}\,, \end{equation} where Adiag=diag(λ1,λ2,λn)\boldsymbol{A}_{\text{diag}}=\operatorname{diag}(\lambda_1,\lambda_2,\ldots \lambda_n) is the diagonalized mass matrix.

Corollary 1

If A\boldsymbol{A} is symmetric, e.g: AT=A,\begin{align} \boldsymbol{A}^{\operatorname{T}} =\boldsymbol{A}\,, \end{align} Then exists an ortogonal matrix V\boldsymbol{V}, e.g: V1=VT,\begin{align} \boldsymbol{V}^{-1}=\boldsymbol{V}^{\operatorname{T}} \,, \end{align} such that VTAV=Adiag,\begin{equation} \boldsymbol{V}^{\operatorname{T}}\boldsymbol{A}\,\boldsymbol{V}=\boldsymbol{A}_{\text{diag}}\,, \end{equation} where Adiag=diag(λ1,λ2,λn)\boldsymbol{A}_{\text{diag}}=\operatorname{diag}(\lambda_1,\lambda_2,\ldots \lambda_n) is the diagonalized mass matrix.


Eigenvector problem

Note that each eigenvector, corresponding to each column of the matrix, is associated to each eigenvalue V=[V1V2ViVjVn]. \boldsymbol{V}=\begin{bmatrix} \boldsymbol{V}_1 \vdots \boldsymbol{V}_2\cdots \vdots \boldsymbol{V}_i \vdots \boldsymbol{V}_j\vdots\cdots \vdots \boldsymbol{V}_n \end{bmatrix}. where Vi\boldsymbol{V}_i is the ii-th column of the matrix V\boldsymbol{V}.

In this way, if we interchange the iji\leftrightarrow j columns of the diagonalization matrix V\boldsymbol{V}, the order of the eigenvalues also change [V1V2VjViVn]A[V1V2VjViVn]=diag(λ1,λ2,,λj,λi,λn). \begin{bmatrix} \boldsymbol{V}_1 \vdots \boldsymbol{V}_2\cdots \vdots \boldsymbol{V}_j \vdots \boldsymbol{V}_i\vdots\cdots \vdots \boldsymbol{V}_n \end{bmatrix}^\dagger \boldsymbol{A} \begin{bmatrix} \boldsymbol{V}_1 \vdots \boldsymbol{V}_2\cdots \vdots \boldsymbol{V}_j \vdots \boldsymbol{V}_i\vdots\cdots \vdots \boldsymbol{V}_n \end{bmatrix}=\operatorname{diag}(\lambda_1,\lambda_2,\ldots,\lambda_j,\lambda_i,\ldots \lambda_n). This property is very important because usually the diagonalizationn alghoritm gives not the desired ordering of the eigenvalues and eigenvectors. It is recommended to use the np.c_ method for the eigenvector reoirdering

However, the Theorem only guarantees existence. Tor really calculate the diagonalization matrix we must establish the eigenvector problem:

We can use the unitary propery to write VVAV=VAdiagAV=VAdiag,\begin{align} \boldsymbol{V}\boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{V}=&\boldsymbol{V}\boldsymbol{A}_{\text{diag}}\\ \boldsymbol{A}\,\boldsymbol{V}=&\boldsymbol{V}\boldsymbol{A}_{\text{diag}}\,, \end{align} or A[V1V2Vn]=diag(λ1,λ2,λn)[V1V2Vn]. \boldsymbol{A}\begin{bmatrix} \boldsymbol{V}_1 \vdots \boldsymbol{V}_2\cdots \vdots \boldsymbol{V}_n \end{bmatrix} =\operatorname{diag}(\lambda_1,\lambda_2\ldots,\lambda_n)\begin{bmatrix} \boldsymbol{V}_1 \vdots \boldsymbol{V}_2\cdots \vdots \boldsymbol{V}_n \end{bmatrix}. Therefore, the eigenvalue equation is just: ParseError: KaTeX parse error: Expected & or \\ or \cr or \end at end of input: \begin{align} % \boldsymbol{A}\,\boldsymbol{V}_i=&\lambda_i\boldsymbol{V}_i \nonumber\\ \boldsymbol{A}\,\boldsymbol{V}_i-\lambda_i\boldsymbol{V}_i=&\boldsymbol{0} \nonumber\\ (\boldsymbol{A}-\lambda_i \,\boldsymbol{I})\boldsymbol{V}_i=&\boldsymbol{0}\,, %ParseError: KaTeX parse error: Expected 'EOF', got '\end' at position 2: \̲e̲n̲d̲{align} where Vi\boldsymbol{V}_i is the ii-th column of the matrix V\boldsymbol{V}, and I\boldsymbol{I} is the identity matrix.

To avoid the trivial solution Vi=0\boldsymbol{V}_i=\boldsymbol{0}, we require that AλiI\boldsymbol{A}-\lambda_i\, \boldsymbol{I} does not have an inverse, or equivalently det(AλiI)=0.\det( \boldsymbol{A}-\lambda_i\, \boldsymbol{I})=0\,.

Example

From arXiv:2010.06458:

In the three flavor neutrino oscillation, the neutrino flavor states να(α=e,μ,τ)\left|\nu_{\alpha}\right\rangle(\alpha=e, \mu, \tau) are linear superposition of mass eigenstates νj(j=1,2,3):να=ΣjUαjνj,\left|\nu_{j}\right\rangle(j=1,2,3):\left|\nu_{\alpha}\right\rangle=\Sigma_{j} U_{\alpha j}\left|\nu_{j}\right\rangle, where UαjU_{\alpha j} are the elements of the lepton mixing matrix known as PMNS (Pontecorvo-Maki-Nakagawa-Sakita) matrix such that (νeνμντ)=(Ue1Ue2Ue3Uμ1Uμ2Uμ3Uτ1Uτ2Uτ3)(ν1ν2ν3) \left(\begin{array}{l} \left|\nu_{e}\right\rangle \\ \left|\nu_{\mu}\right\rangle \\ \left|\nu_{\tau}\right\rangle \end{array}\right)=\left(\begin{array}{lll} U_{e 1} & U_{e 2} & U_{e 3} \\ U_{\mu 1} & U_{\mu 2} & U_{\mu 3} \\ U_{\tau 1} & U_{\tau 2} & U_{\tau 3} \end{array}\right)\left(\begin{array}{l} \left|\nu_{1}\right\rangle \\ \left|\nu_{2}\right\rangle \\ \left|\nu_{3}\right\rangle \end{array}\right) The time evolution follows να(t)=ΣjeiEjtUαjνj,\left|\nu_{\alpha}(t)\right\rangle=\Sigma_{j} e^{-i E_{j} t} U_{\alpha j}\left|\nu_{j}\right\rangle, where EjE_{j} is the energy associated with the mass eigenstates νi\left|\nu_{i}\right\rangle This is a superposition state.

The observables associated to the three neutrinos are the entries of the 3×33\times 3 unitary matrix U=(U1U2U3)\boldsymbol{U}=\begin{pmatrix}\boldsymbol{U}_1\vdots\boldsymbol{U}_2\vdots\boldsymbol{U}_3\end{pmatrix}, and the eigenvalues associated to each eigenvector U1m1\boldsymbol{U}_1\to m_1, U2m2\boldsymbol{U}_2\to m_2, U3m3\boldsymbol{U}_3\to m_3. The normal ordering is m1<m2<m3m_1<m_2<m_3. The unitary matrix can be parameterized in terms of three mixing angles, θ23\theta_{23} θ13\theta_{13}, θ12\theta_{12}, and a complex phase, δCP\delta_{\text{CP}}, such that U=(1000c23s230s23c23)(c130s13eiδCP010s13eiδCP0c13)(c12s120s12c120001), \boldsymbol{U}=\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & c_{23} & s_{23} \\ 0 & -s_{23} & c_{23} \end{array}\right) \cdot\left(\begin{array}{ccc} c_{13} & 0 & s_{13} e^{-i \delta_{\mathrm{CP}}} \\ 0 & 1 & 0 \\ -s_{13} e^{i \delta_{\mathrm{CP}}} & 0 & c_{13} \end{array}\right) \cdot\left(\begin{array}{ccc} c_{12} & s_{12} & 0 \\ -s_{12} & c_{12} & 0 \\ 0 & 0 & 1 \end{array}\right), where cijcosθijc_{i j} \equiv \cos \theta_{i j} and sijsinθijs_{i j} \equiv \sin \theta_{i j}. Thus, we can write U\boldsymbol{U} as U=(c12c13s12c13s13eiδCPs12c23c12s13s23eiδCPc12c23s12s13s23eiδCPc13s23s12s23c12s13c23eiδCPc12s23s12s13c23eiδCPc13c23) \boldsymbol{U}=\left(\begin{array}{ccc}c_{12} c_{13} & s_{12} c_{13} & s_{13} e^{-i \delta_{\mathrm{CP}}} \\ -s_{12} c_{23}-c_{12} s_{13} s_{23} e^{i \delta_{\mathrm{CP}}} & c_{12} c_{23}-s_{12} s_{13} s_{23} e^{i \delta_{\mathrm{CP}}} & c_{13} s_{23} \\ s_{12} s_{23}-c_{12} s_{13} c_{23} e^{i \delta_{\mathrm{CP}}} & -c_{12} s_{23}-s_{12} s_{13} c_{23} e^{i \delta_{\mathrm{CP}}} & c_{13} c_{23}\end{array}\right) so that U1=(Ue1Uμ1Uτ1)=(c12c13s12c23c12s13s23eiδCPs12s23c12s13c23eiδCP),U2=(Ue2Uμ2Uτ2)=(s12c13c12c23s12s13s23eiδCPc12s23s12s13c23eiδCP),U3=(Ue3Uμ3Uτ3)=(s13eiδCPc13s23c13c23) \boldsymbol{U}_1=\begin{pmatrix}U_{e1}\\ U_{\mu 1}\\ U_{\tau 1}\end{pmatrix}=\begin{pmatrix} c_{12} c_{13} \\ -s_{12} c_{23}-c_{12} s_{13} s_{23} e^{i \delta_{\mathrm{CP}}} \\ s_{12} s_{23}-c_{12} s_{13} c_{23} e^{i \delta_{\mathrm{CP}}} \end{pmatrix},\qquad \boldsymbol{U}_2=\begin{pmatrix}U_{e2}\\ U_{\mu 2}\\ U_{\tau 2}\end{pmatrix}=\begin{pmatrix} s_{12} c_{13} \\ c_{12} c_{23}-s_{12} s_{13} s_{23} e^{i \delta_{\mathrm{CP}}} \\ -c_{12} s_{23}-s_{12} s_{13} c_{23} e^{i \delta_{\mathrm{CP}}} \end{pmatrix},\qquad \boldsymbol{U}_3=\begin{pmatrix}U_{e3}\\ U_{\mu 3}\\ U_{\tau 3}\end{pmatrix}=\begin{pmatrix} s_{13} e^{-i \delta_{\mathrm{CP}}} \\ c_{13} s_{23} \\ c_{13} c_{23} \end{pmatrix}

After decades of experimental efforts with thousands of millions of dollars in investment and two recent Nobel prizes, most of the parameters are already measured (see PDG22020 ):

IMAGE

where Δmij2=mi2mj2\Delta m^2_{ij}=m^2_i-m^2_j is the squared mass difference between eigenvalues ii and jj; and eV\text{eV} is really eV/c2\text{eV}/c^2 where cc is the speed of light in vacuum in natural units with c=1c=1, and 1 eV=1.6021766208(98)×1019 J1\ \text{eV}=1.602\,176\,6208(98)\times 10^{-19}\ \text{J}.

To a better measurement of δCP\delta_{CP} for example, a new large experiment called DUNE, and with a cost of around $1500$1\,500 million of dollars, is in construction in the United States IMAGE

Theorem 2: Singular value decomposition (SVD)

See SVD (where the Hermetique-conjugate is denoted with "*" instead that with "")

A general complex matrix A\boldsymbol{A} can be diagonalized by a bi-diagonal transformation such that VAU=Adiag,\begin{equation} \boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{U}=\boldsymbol{A}_{\text{diag}}\,, \end{equation} where Adiag=diag(λ1,λ2,λn)\boldsymbol{A}_{\text{diag}}=\operatorname{diag}(\lambda_1,\lambda_2,\ldots \lambda_n) is the diagonalized mass matrix.

Demostration

Adiag=VAUAdiagAdiag=VAU(VAU)=VAU(UAV)=V(AA)V.\begin{align} \boldsymbol{A}_{\text{diag}}=&\boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{U}\\ \boldsymbol{A}_{\text{diag}}\boldsymbol{A}_{\text{diag}}^\dagger=&\boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{U} \left(\boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{U}\right)^\dagger\\ =&\boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{U} \left( \boldsymbol{U}^\dagger \boldsymbol{A}^\dagger\,\boldsymbol{V}\right)\\ =&\boldsymbol{V}^\dagger \left( \boldsymbol{A} \boldsymbol{A}^\dagger\,\right) \boldsymbol{V}\,.\\ \end{align}

Since AA\boldsymbol{A} \boldsymbol{A}^\dagger is hermitic and AA\boldsymbol{A} \boldsymbol{A}^\dagger is diagonal, then in fact there exists an unitary matrix V\boldsymbol{V}. Similarly there exists an unitary matrix U\boldsymbol{U} which diagonalizes AA\boldsymbol{A}^\dagger \boldsymbol{A}

Scipy implementation

The implementation in scipy is based in the inverted relation VAU=AdiagVVAUU=VAdiagUA=VAdiagU.\begin{align} \boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{U}=&\boldsymbol{A}_{\text{diag}}\\ \boldsymbol{V}\boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{U}\boldsymbol{U}^\dagger=&\boldsymbol{V}\boldsymbol{A}_{\text{diag}}\boldsymbol{U}^\dagger\\ \boldsymbol{A}=&\boldsymbol{V}\boldsymbol{A}_{\text{diag}}\boldsymbol{U}^\dagger\,. \end{align}

The implementation is trough the module scipy.linalg.svd:

V,Adiag,Udagger=linalg.svd(A)

Eigenvectors and eigenvalues

If we make U=[U1U2Un],V=[V1V2Vn] \boldsymbol{U}=[\boldsymbol{U}_1\vdots\boldsymbol{U}_2\vdots\cdots\vdots\boldsymbol{U}_n], \qquad \boldsymbol{V}=[\boldsymbol{V}_1\vdots\boldsymbol{V}_2\vdots\cdots\vdots\boldsymbol{V}_n]

We know that there exists a bi-diagonal transformación such that ParseError: KaTeX parse error: Expected & or \\ or \cr or \end at end of input: …gin{equation} % \boldsymbol{A}\,\boldsymbol{U}=\boldsymbol{V}\boldsymbol{A}_{\text{diag}} \end{equation}$$ ParseError: KaTeX parse error: Expected & or \\ or \cr or \end at end of input: …ldsymbol{V}_i % \end{equation}$$ not sum upon ii. Here

  • λi\lambda_i are called eigenvalues

  • ViV_i and UiU_i are the eigenvectors

We can use this to check the proper order of the eigenvalues

Transformation of a linear system

We start again with the matrix equation, capitol bold letters denotes matrices AX=B,\begin{equation} \boldsymbol{A}\boldsymbol{X}=\boldsymbol{B}\,, \end{equation} where A\boldsymbol{A} is an n×nn \times n matrix.

We know that there exists a bi-diagonal transformación such that VAU=Adiag\begin{equation} \boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{U}=\boldsymbol{A}_{\text{diag}} \end{equation} So, by doing standard operations we have VAUUX=VB(VAU)(UX)=VBAdiagX=B,\begin{align} \boldsymbol{V}^\dagger \boldsymbol{A} \boldsymbol{U} \boldsymbol{U}^\dagger \boldsymbol{X}=& \boldsymbol{V}^\dagger \boldsymbol{B}\\ \left( \boldsymbol{V}^\dagger \boldsymbol{A} \boldsymbol{U} \right) \left( \boldsymbol{U}^\dagger \boldsymbol{X}\right)=& \boldsymbol{V}^\dagger \boldsymbol{B}\\ \boldsymbol{A}_{\text{diag}} \boldsymbol{X}'=&\boldsymbol{B}'\,, \end{align} where X=UX,B=VB,\begin{align} \boldsymbol{X}'=& \boldsymbol{U}^\dagger \boldsymbol{X}\,, & \boldsymbol{B}'=& \boldsymbol{V}^\dagger \boldsymbol{B}\,, \end{align} or X=UX,B=VB.\begin{align} \boldsymbol{X}=& \boldsymbol{U}\boldsymbol{X}'\,, & \boldsymbol{B}=& \boldsymbol{V}\boldsymbol{B}'\,. \end{align}

If Adiag=diag(λ1,λ2,λn)\boldsymbol{A}_{\text{diag}}=\operatorname{diag}(\lambda_1,\lambda_2,\ldots \lambda_n), X=(x1x2xn)T\boldsymbol{X}=\begin{pmatrix}x_1 & x_2 &\cdots & x_n\end{pmatrix}^{\operatorname{T}} and B=(b1b2bn)T\boldsymbol{B}=\begin{pmatrix}b_1& b_2 &\cdots & b_n\end{pmatrix}^{\operatorname{T}}, the solution of the system is given by λixi=bi.\begin{equation} \lambda_i x'_i=b_i'\,. \end{equation}

Note that X=Adiag1B,\begin{align} \boldsymbol{X}'=&\boldsymbol{A}_{\text{diag}}^{-1}\boldsymbol{B}'\,, \end{align} and the final solution is X=UX=UAdiag1VB,\begin{align} \boldsymbol{X}=&U \boldsymbol{X}'\\ =&U\boldsymbol{A}_{\text{diag}}^{-1}V^\dagger \boldsymbol{B}\,, \end{align} Therefore A1=UAdiag1V\boldsymbol{A}^{-1}=U\boldsymbol{A}_{\text{diag}}^{-1}V^\dagger

Example

A suitable way to introduce this method is applying it to some basic problem. To do so, let's take the result of the Example 1:

[540473035][x1x2x3]=[102]\begin{bmatrix} 5 & -4 & 0 \\ -4 & 7 & -3 \\ 0 & -3 & 5 \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}

As the matrix is symmetric U=V\boldsymbol{U}=\boldsymbol{V} and V=VT\boldsymbol{V}^\dagger=\boldsymbol{V}^{\operatorname{T}}

import numpy as np
M1=np.array([[5,-4,0], [-4,7,-3], [0,-3,5]])
A=M1 A
array([[ 5, -4, 0], [-4, 7, -3], [ 0, -3, 5]])

Check if all eigenvalues are different from zero:

np.linalg.det(A)
49.99999999999999
B=np.c_[ [1,0,-2] ] B
array([[ 1], [ 0], [-2]])

Also as

B=np.array([[1],[0],[-2]]) #or B=np.reshape( [1,0,-2],(3,1) )

Theorem 1 in numpy

WARNING: Only works for Hermitic matrices!

λ,V=np.linalg.eig( A )
A_diag=np.diag(λ) A_diag
array([[11.09901951, 0. , 0. ], [ 0. , 0.90098049, 0. ], [ 0. , 0. , 5. ]])
V
array([[-5.07191124e-01, -6.18673713e-01, -6.00000000e-01], [ 7.73342141e-01, -6.33988906e-01, 1.91548674e-16], [-3.80393343e-01, -4.64005285e-01, 8.00000000e-01]])
(V@V.transpose()).round(15).astype(int)
array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

We first check the proper order of the diagonalization

(V.transpose()@A@V).round(14)
array([[11.09901951, 0. , 0. ], [ 0. , 0.90098049, 0. ], [ 0. , 0. , 5. ]])
(V.transpose().dot(A)).dot(V).round(14)
array([[11.09901951, 0. , 0. ], [ 0. , 0.90098049, 0. ], [ 0. , 0. , 5. ]])

Since

V
array([[-5.07191124e-01, -6.18673713e-01, -6.00000000e-01], [ 7.73342141e-01, -6.33988906e-01, 1.91548674e-16], [-3.80393343e-01, -4.64005285e-01, 8.00000000e-01]])

The final solution is:

A_diag_inv=np.diag(1/λ) A_diag_inv
array([[0.09009805, 0. , 0. ], [0. , 1.10990195, 0. ], [0. , 0. , 0.2 ]])

check with np.linalg.inv(A_diag)

X=V@A_diag_inv@V.transpose()@B X
array([[ 0.04], [-0.2 ], [-0.52]])

Activity: Usar np.lingalg.solve

np.linalg.solve(A,B)
array([[ 0.04], [-0.2 ], [-0.52]])

We can now check some properties.

VTAV=Adiag,\begin{equation} \boldsymbol{V}^{\operatorname{T}}\boldsymbol{A}\,\boldsymbol{V}=\boldsymbol{A}_{\text{diag}}\,, \end{equation}[V1V2VjViVn]A[V1V2VjViVn]=diag(λ1,λ2,,λj,λi,λn).\begin{bmatrix} \boldsymbol{V}_1 \vdots \boldsymbol{V}_2\cdots \vdots \boldsymbol{V}_j \vdots \boldsymbol{V}_i\vdots\cdots \vdots \boldsymbol{V}_n \end{bmatrix}^\dagger \boldsymbol{A} \begin{bmatrix} \boldsymbol{V}_1 \vdots \boldsymbol{V}_2\cdots \vdots \boldsymbol{V}_j \vdots \boldsymbol{V}_i\vdots\cdots \vdots \boldsymbol{V}_n \end{bmatrix}=\operatorname{diag}(\lambda_1,\lambda_2,\ldots,\lambda_j,\lambda_i,\ldots \lambda_n).

For the case 3×33\times 3

  • Obtain θ12\theta_{12} for λ1<λ2<λ3|\lambda_1|<|\lambda_2|<|\lambda_3| and, in the proper order U=(c12c13s12c13s13eiδCPs12c23c12s13s23eiδCPc12c23s12s13s23eiδCPc13s23s12s23c12s13c23eiδCPc12s23s12s13c23eiδCPc13c23) \boldsymbol{U}=\left(\begin{array}{ccc}c_{12} c_{13} & s_{12} c_{13} & s_{13} e^{-i \delta_{\mathrm{CP}}} \\ -s_{12} c_{23}-c_{12} s_{13} s_{23} e^{i \delta_{\mathrm{CP}}} & c_{12} c_{23}-s_{12} s_{13} s_{23} e^{i \delta_{\mathrm{CP}}} & c_{13} s_{23} \\ s_{12} s_{23}-c_{12} s_{13} c_{23} e^{i \delta_{\mathrm{CP}}} & -c_{12} s_{23}-s_{12} s_{13} c_{23} e^{i \delta_{\mathrm{CP}}} & c_{13} c_{23}\end{array}\right)

import numpy as np A=np.array( [[5,-4,0], [-4,7,-3], [0,-3,5]]) λ,V=np.linalg.eig( A ) A_diag=np.diag(λ) λ
array([11.09901951, 0.90098049, 5. ])
V
array([[-5.07191124e-01, -6.18673713e-01, -6.00000000e-01], [ 7.73342141e-01, -6.33988906e-01, 1.91548674e-16], [-3.80393343e-01, -4.64005285e-01, 8.00000000e-01]])

Extract the first eigenvector

np.c_[ V[: ,0] ]
array([[-0.50719112], [ 0.77334214], [-0.38039334]])

associated with the eigenvalue

A_diag[0,0],λ[0]
(11.099019513592784, 11.099019513592784)
V0=np.c_[ V[:,0] ] V1=np.c_[ V[:,1] ] V2=np.c_[ V[:,2] ] display(V0) display(V1) V2
array([[-0.50719112], [ 0.77334214], [-0.38039334]])
array([[-0.61867371], [-0.63398891], [-0.46400528]])
array([[-6.00000000e-01], [ 1.91548674e-16], [ 8.00000000e-01]])

Check: AVi=λiVi A V_i=\lambda_i V_i

Which means the eigenvalue associated to the "operator" AA acting on the eigenvector V1V_1

print(f'{A@V0} =\n{λ[0]*V0}')
[[-5.62932419] [ 8.58333952] [-4.22199314]] = [[-5.62932419] [ 8.58333952] [-4.22199314]]
print(f'{A@V1} =\n{λ[1]*V1}')
[[-0.55741294] [-0.57121163] [-0.41805971]] = [[-0.55741294] [-0.57121163] [-0.41805971]]
print(f'{(A@V2).round(14)} =\n{(λ[2]*V2).round(14)}')
[[-3.] [ 0.] [ 4.]] = [[-3.] [ 0.] [ 4.]]

The diagonalization matrix can be rebuild from the eigenvectors

V
array([[-5.07191124e-01, -6.18673713e-01, -6.00000000e-01], [ 7.73342141e-01, -6.33988906e-01, 1.91548674e-16], [-3.80393343e-01, -4.64005285e-01, 8.00000000e-01]])

is rebuild with

V=np.c_[ V0,V1,V2]

or with: np.hstack((V0,V1,V2))

V
array([[-5.07191124e-01, -6.18673713e-01, -6.00000000e-01], [ 7.73342141e-01, -6.33988906e-01, 1.91548674e-16], [-3.80393343e-01, -4.64005285e-01, 8.00000000e-01]])
( V.transpose()@A@V).round(14)
array([[11.09901951, 0. , 0. ], [ 0. , 0.90098049, 0. ], [ 0. , 0. , 5. ]])

Note that a sign of an egigenvalues can be changed:

V=np.c_[ V0,-V1,V2]

Eigenvector reordering

We can use this to check the proper order of the eigenvalues.

The order of eigenvalues can now be changed by changing the order of the eigenvectors and redifining the diagonalization matrix. For example, from small to large.

Then the proper order in the eigenvalues can be obtained

U=np.c_[ V1,V2,V0] ( U.transpose()@A@U).round(14)
array([[ 0.90098049, 0. , 0. ], [ 0. , 5. , 0. ], [ 0. , 0. , 11.09901951]])
U
array([[-6.18673713e-01, -6.00000000e-01, -5.07191124e-01], [-6.33988906e-01, 1.91548674e-16, 7.73342141e-01], [-4.64005285e-01, 8.00000000e-01, -3.80393343e-01]])

Once in the proper order, the mixing angles can be obtained U1=(Ue1Uμ1Uτ1)=(c12c13s12c23c12s13s23eiδCPs12s23c12s13c23eiδCP),U2=(Ue2Uμ2Uτ2)=(s12c13c12c23s12s13s23eiδCPc12s23s12s13c23eiδCP),U3=(Ue3Uμ3Uτ3)=(s13eiδCPc13s23c13c23), \boldsymbol{U}_1=\begin{pmatrix}U_{e1}\\ U_{\mu 1}\\ U_{\tau 1}\end{pmatrix}=\begin{pmatrix} c_{12} c_{13} \\ -s_{12} c_{23}-c_{12} s_{13} s_{23} e^{i \delta_{\mathrm{CP}}} \\ s_{12} s_{23}-c_{12} s_{13} c_{23} e^{i \delta_{\mathrm{CP}}} \end{pmatrix},\qquad \boldsymbol{U}_2=\begin{pmatrix}U_{e2}\\ U_{\mu 2}\\ U_{\tau 2}\end{pmatrix}=\begin{pmatrix} s_{12} c_{13} \\ c_{12} c_{23}-s_{12} s_{13} s_{23} e^{i \delta_{\mathrm{CP}}} \\ -c_{12} s_{23}-s_{12} s_{13} c_{23} e^{i \delta_{\mathrm{CP}}} \end{pmatrix},\qquad \boldsymbol{U}_3=\begin{pmatrix}U_{e3}\\ U_{\mu 3}\\ U_{\tau 3}\end{pmatrix}=\begin{pmatrix} s_{13} e^{-i \delta_{\mathrm{CP}}} \\ c_{13} s_{23} \\ c_{13} c_{23} \end{pmatrix}, so that tanθ12=Ue2Ue1 \tan\theta_{12}=\frac{U_{e2}}{U_{e1}}

θ_12=np.arctan( U[0,1]/U[0,0] )
θ_12, θ_12*180/np.pi
(0.7700763823614476, 44.122126612013574)

Implementation of an algorithm for reordering

λ
array([11.09901951, 0.90098049, 5. ])
np.sort( np.abs(λ))
array([ 0.90098049, 5. , 11.09901951])

Returns the indices that would sort an array.

index=np.abs(λ).argsort() index
array([1, 2, 0])

can be implemented in general with a comprehension list

V
array([[-5.07191124e-01, 6.18673713e-01, -6.00000000e-01], [ 7.73342141e-01, 6.33988906e-01, 1.91548674e-16], [-3.80393343e-01, 4.64005285e-01, 8.00000000e-01]])

For rebuild

np.c_[ tuple( [ np.c_[V[:,i]] for i in range(3) ] ) ]
array([[-5.07191124e-01, 6.18673713e-01, -6.00000000e-01], [ 7.73342141e-01, 6.33988906e-01, 1.91548674e-16], [-3.80393343e-01, 4.64005285e-01, 8.00000000e-01]])

And for reorder to index

U=np.c_[ tuple( [ np.c_[V[:,i]] for i in np.abs(λ).argsort() ] ) ] U
array([[ 6.18673713e-01, -6.00000000e-01, -5.07191124e-01], [ 6.33988906e-01, 1.91548674e-16, 7.73342141e-01], [ 4.64005285e-01, 8.00000000e-01, -3.80393343e-01]])

or:

n=3 U=np.hstack( [ np.reshape( V[:,i], (n,1) ) for i in index ] )
( U.transpose()@A@U).round(14)
array([[ 0.90098049, -0. , -0. ], [-0. , 5. , 0. ], [-0. , 0. , 11.09901951]])

Implementation as function

def argeig(A): λ,V=np.linalg.eig(A) λ=np.array([λ[i] for i in np.abs(λ).argsort()]) V=np.c_[ tuple( [ np.c_[V[:,i]] for i in np.abs(λ).argsort() ] ) ] return λ,V
λ,V=argeig(A) λ,V
(array([ 0.90098049, 5. , 11.09901951]), array([[-5.07191124e-01, -6.18673713e-01, -6.00000000e-01], [ 7.73342141e-01, -6.33988906e-01, 1.91548674e-16], [-3.80393343e-01, -4.64005285e-01, 8.00000000e-01]]))
#1. λ,V=argeig(A) λ,V Out[1]: (array([ 0.90098049, 5. , 11.09901951]), array([[-6.18673713e-01, -6.00000000e-01, -5.07191124e-01], [-6.33988906e-01, 1.91548674e-16, 7.73342141e-01], [-4.64005285e-01, 8.00000000e-01, -3.80393343e-01]]))
#2. ( V.transpose()@A@V).round(14) Out[1]: array([[ 0.90098049, 0. , 0. ], [ 0. , 5. , 0. ], [ 0. , 0. , 11.09901951]])
#3. print( np.linalg.det(A- λ[0]*np.identity(3) ) ) print( np.linalg.det(A- λ[1]*np.identity(3) ) ) np.linalg.det(A- λ[2]*np.identity(3) )

Activity: Build a function that diagonalize symmetric matrices with the eigenvalues in increasing order in the eigenvalues as a replacement of np.linalg.eig

def argeig(A): l,V=np.linalg.eig(A) .... return argl, argV

General matrix

import numpy as np from scipy import linalg

Example: Consider the following general real matrix without any specific symmetry

A=np.array([[ 6.666746440000001 , 3.1312125300000013], [-0.23343504999999923, 5.890289299999999 ]])

In this case we need to make the bidiogalization process of Theorem 2 with the Singular value decomposition (SVD) implemented in Scipy

V,diag,Udagger=linalg.svd(A)
U=Udagger.transpose().conjugate()
A_diag=(V.transpose().conjugate()@A@U).round(14) A_diag
array([[ 8., -0.], [ 0., 5.]])

This is important to stablish that the eigenvectors are determined until ordering and permutations

V
array([[-0.8660254, -0.5 ], [-0.5 , 0.8660254]])
U
array([[-0.70710678, -0.70710678], [-0.70710678, 0.70710678]])

For a general matrix they are just ortogonal matrices

(U.transpose()@U).round(14)
array([[1., 0.], [0., 1.]])

We define below a 2×22\times 2 orthogonal matrix:

def orthogonal(θ): return np.array( [[np.cos(θ) ,np.sin(θ)], [-np.sin(θ),np.cos(θ)]] )
π=np.pi Vp=orthogonal(π/3) Vp
array([[ 0.5 , 0.8660254], [-0.8660254, 0.5 ]])

Note that, for V=[V1V2]V=[V_1\vdots V_2], after

  • Interchange V1V2V_1 \to V_2 and V2V1V_2 \to V_1

  • VVV \to -V,

we get VV'

Up=orthogonal(π/4) Up
array([[ 0.70710678, 0.70710678], [-0.70710678, 0.70710678]])

Note that after

  • Interchange U1U2U_1 \to U_2 and U2U1U_2 \to U_1

  • UUU \to -U,

we get UU'.

Since the required trasformations in VV and UU are the same, the Theorem 2 is unnafected, only the order in the eigenvalues change to the normal ordering. In Fact

A_diag=(Vp.transpose()@A@Up).round(8) A_diag
array([[ 5., -0.], [ 0., 8.]])

Activity: Solve the system Ax=B \boldsymbol{A} \boldsymbol{x}=\boldsymbol{B} for the previous A \boldsymbol{A} matrix and B=[14]\boldsymbol{B}=\begin{bmatrix} 1\\ -4\\ \end{bmatrix}

Interpretations of Theorem 2

Single Diagonalization matrix

In Theorem 2 establishes a unique set of a matrices: AA with its eigenvalues matrix, AdiagA_{\rm diag}, and their eigenvectors matrices UU and VV. However, for a fixed set of eigenvalues and eigenvectors we can have several possibilities of AA' matrices. For example, if we fix U=1U'=\boldsymbol{1} VA=Adiag,\begin{equation} {V'}^\dagger \boldsymbol{A'}=\boldsymbol{A}_{\text{diag}}\,, \end{equation} so that A=VAdiag\begin{equation} \boldsymbol{A}'=V'\boldsymbol{A}_{\text{diag}} \end{equation}

Therefore: Under this conditions, for a fixed set of eigenvalues and eigenvectors (associated to the matrix VV') there is a unique AA' matrix

Example Let V=UTVV'=U^{\operatorname{T}} V in the previous example. Find the matrix AA' which gives rise to the same eigenvalues

Adiag=np.array( [[5,0], [0,8]] )
Vp=orthogonal(np.pi/4).transpose()@orthogonal(np.pi/3) Vp
array([[ 0.96592583, 0.25881905], [-0.25881905, 0.96592583]])
Ap=Vp@Adiag Ap
array([[ 4.82962913, 2.07055236], [-1.29409523, 7.72740661]])
(Vp.transpose()@Ap).round(14)
array([[5., 0.], [0., 8.]])

Finally, we can use the full procedure of hermitic matrices to obtain the bidiagonal matrices U,VU,V

Example: Diagonalize the matrix AA by using the Theorem 1 instead

A=np.array([[ 6.66674644, 3.13121253], [-0.23343505, 5.8902893 ]])
A@A.transpose()
array([[54.25 , 16.88749537], [16.88749537, 34.74999996]])
A.transpose()@A
array([[44.50000002, 19.50000001], [19.50000001, 44.49999995]])

→ is obtained with <ALT GR>+i or, similarly: ↓←

λ2,V=np.linalg.eig( A@A.transpose() ) print(λ2,'→',np.sqrt(λ2)) #V=np.c_[ -V[:,0],V[:,0] ] V
[63.99999999 24.99999997] → [8. 5.]
array([[ 0.8660254, -0.5 ], [ 0.5 , 0.8660254]])
λ2p,U=np.linalg.eig( A.transpose()@A ) print(np.sqrt(λ2)) U
[8. 5.]
array([[ 0.70710678, -0.70710678], [ 0.70710678, 0.70710678]])
(V.transpose()@A@U ).round(14)
array([[ 8., -0.], [ 0., 5.]])

Actividad: Make the algorithm of reordering for theorem 2

Mixed terms

Let: X=[BW]\begin{align} X' = \begin{bmatrix} B \\ W \\ \end{bmatrix} \end{align}

Consider the quadratic equation XTMX=[BW][M11M12M12M22][BW]=[BW][M11B+M12WM12B+M22W]=B(M11B+M12W)+W(M12B+M22W)=M11B2+2M12BW+M22W2.\begin{align} X^{\prime\operatorname{T}} M X^\prime=& \begin{bmatrix} B & W \end{bmatrix} \begin{bmatrix} M_{11} & M_{12} \\ M_{12} & M_{22} \\ \end{bmatrix} \begin{bmatrix} B \\ W \\ \end{bmatrix}\\ =& \begin{bmatrix} B & W \end{bmatrix} \begin{bmatrix} M_{11}B + M_{12}W \\ M_{12}B + M_{22}W \\ \end{bmatrix}\\ =& B( M_{11}B + M_{12}W)+ W ( M_{12}B + M_{22}W )\\ =&M_{11} B^2 + 2M_{12} BW+ M_{22} W^2\,. \end{align} The quadratic equation is in terms of: M11M_{11}, M12M_{12} y M22M_{22}

We can simplify this expression if we change to a new basis X=[AZ] X=\begin{bmatrix} A\\ Z \end{bmatrix} in which MM is diagonal, in such a case the crossed term would disappear. The rotation from XXX'\to X is defined by X=[BW][cosθsinθsinθcosθ][AZ]=VX\begin{align} X'= \begin{bmatrix} B\\ W \end{bmatrix}\equiv \begin{bmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} A\\ Z \end{bmatrix}= V X \end{align} where VV is the rotation matrix V=[cosθsinθsinθcosθ]. V=\begin{bmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{bmatrix}. Therefore X[AZ]=VTXXT=XTV,\begin{align} X\equiv \begin{bmatrix} A\\ Z \end{bmatrix}=V^{\operatorname{T}} X' \to X^{\operatorname{T}}= X^{\prime\operatorname{T}} V\,, \end{align}

In the new basis XTMX=XTVVTMVVTX=(XTV)(VTMV)(VTX)=XTMdiagX=λ1A2+λ2Z2.\begin{align} X^{\prime\operatorname{T}} M X^\prime=&X^{\prime\operatorname{T}}V V^{\operatorname{T}} M V V^{\operatorname{T}} X^\prime\\ =&(X^{\prime\operatorname{T}}V) (V^{\operatorname{T}} M V) (V^{\operatorname{T}} X^\prime)\\ =& X^{\operatorname{T}} M_{\text{diag}} X \\ =& \lambda_1 A^2+\lambda_2 Z^2\,. \end{align}

such that λ1λ2|\lambda_1|\le|\lambda_2|, where MdiagVTMV=[λ100λ2],\begin{align} M_{\text{diag}}\equiv V^{\operatorname{T}} M V=\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \\ \end{bmatrix}\,, \end{align}

In this basis, the quadratic equation is in terms of eigenvalues and mixing angle, θ\theta. Therefore, there are not longer mixed terms.

The diagonalization of quadratic equations can be straightforwardly generalized to nn-th degree equations in terms of n×nn\times n matrices

import numpy as np g =0.64996 gp=0.35523 v=246.22046 #GeV

Example: Electroweak interactions

To understand the electromagnetic and weak fundamental interactions, the mathematical formulation need to be done in one basis where the photon field, denoted with a symbol AA, is still not well defined. Instead, the field BB, the precursor of AA, appears along with the weak field WW, the precursor of the electroweak field ZZ. In the mathematical basis we have then X=[BW],\begin{align} X' = \begin{bmatrix} B \\ W \\ \end{bmatrix}, \end{align} and there, the symmetric mass matrix is calculated as (Details here: PDF)

M=(v**2/4)*np.array([[gp**2,-g*gp], [-g*gp, g**2]])
M
array([[ 1912.52692086, -3499.32718938], [-3499.32718938, 6402.67629426]])

Checking that the determinant is zero

np.linalg.det(M).round(8)
0.0

This imply that one egivanlue is zero

np.linalg.eigvals(M).round(12)
array([ 0. , 8315.20321512])

which means that the matrix rank, the number of non-zero eigenvalues, is 1

np.linalg.matrix_rank(M)
1

To make the change to the phyisical basis, X=[AZ] X=\begin{bmatrix} A\\ Z \end{bmatrix} through the rotation matrix V=[cosθWsinθWsinθWcosθW], V=\begin{bmatrix} \cos\theta_W & \sin\theta_W\\ -\sin\theta_W & \cos\theta_W \end{bmatrix}, the following transformation need to be established X=[BW][cosθWsinθWsinθWcosθW][AZ]=VX,\begin{align} X'= \begin{bmatrix} B\\ W \end{bmatrix}\equiv \begin{bmatrix} \cos\theta_W & \sin\theta_W\\ -\sin\theta_W & \cos\theta_W \end{bmatrix} \begin{bmatrix} A\\ Z \end{bmatrix}= V X\,, \end{align} such that VV is the diagonalization matrix of MM

Mdiag=VTMVM_{\text{diag}}=V^{\operatorname{T}} M V

with the normal ordering: λ1λ2|\lambda_1|\le|\lambda_2|, and

  • VV → Diagonalization matrix

  • VV → Ortogonal matrix

  • VV → Rotation matrix

To obtain the eigensystem, we use

λ,V=np.linalg.eig(M)

which in fact satisfy

( V.transpose()@M@V).round(12)
array([[ -0. , 0. ], [ 0. , 8315.20321512]])

Since the first (zero) eigenvalue is the one associated to AA, we can interpret directly the rotation matrix without changing the order of the eigenvectors

V
array([[-0.87749437, 0.47958694], [-0.47958694, -0.87749437]])

Note that in fact, the absolute value of the first element of the first eigenvector is the larger one and corresponds to the component along the BB-axis.

Therefore

θ_W=np.arcsin( V[0,1] ) θ_W
0.5001839211647364

The eigenvalue associated to ZZ is

mZ=np.sqrt(λ[1]) mZ #GeV
91.18773610041056

corresponding to the ZZ mass in units of GeV/c2\text{GeV}/c^2. As a reference, the proton mass is approximately 1 GeV/c21\ \text{GeV}/c^2

The physical observable associated to the weak mixing angle, θW\theta_W, is (see PDF, along with the Z0Z^0 boson mass, mZm_Z)

np.sin(θ_W)**2
0.23000362966286086

Corolary 1 Theorem II

For a unique unitary n×nn\times n matrix V\boldsymbol{V} and a set of unique of eigenvalues λ1,λ2,λn\lambda_1,\lambda_2,\ldots\lambda_n there exists an infite set of mass matrices A\boldsymbol{A}, associated to the infinite sets of matrices U\boldsymbol{U}, such that VAU=Adiag, \boldsymbol{V}^\dagger\boldsymbol{A}\,\boldsymbol{U}=\boldsymbol{A}_{\text{diag}}\,, where Adiag=diag(λ1,λ2,λn)\boldsymbol{A}_{\text{diag}}=\operatorname{diag}(\lambda_1,\lambda_2,\ldots \lambda_n) is the diagonalized mass matrix. In particular, if A\boldsymbol{A} is hermitic then there is only one possibility for U=V\boldsymbol{U}=\boldsymbol{V} and therefore A\boldsymbol{A} is unique.

Corolary 2 Theorem II

An unique hermitic mass matrix, A\boldsymbol{A}, can be generated from an infinite set of arbitrary matrices Y\boldsymbol{Y}, such that A=YY\boldsymbol{A}=\boldsymbol{Y}^{\dagger}\boldsymbol{Y}

Example: Neutrino mass matrix

Returning back to the neutrino mixing discussion, it is worth noticing that in the mathematical basis N=(νeνμντ) N=\left(\begin{array}{l} \left|\nu_{e}\right\rangle \\ \left|\nu_{\mu}\right\rangle \\ \left|\nu_{\tau}\right\rangle \end{array}\right) the mass matrix, Mν\mathcal{M}_\nu is non-diagonal. However, if we assume that it is hermitic, then an unitary tranformation (rotation in the symmetric case), U\boldsymbol{U} ,can be defined to diagonal basis as (νeνμντ)=U(ν1ν2ν3), \left(\begin{array}{l} \left|\nu_{e}\right\rangle \\ \left|\nu_{\mu}\right\rangle \\ \left|\nu_{\tau}\right\rangle \end{array}\right)= \boldsymbol{U}\left(\begin{array}{l} \left|\nu_{1}\right\rangle \\ \left|\nu_{2}\right\rangle \\ \left|\nu_{3}\right\rangle \end{array}\right), which is identified as the diagonalization matrix UMνU=diag(m1,m2,m3). \boldsymbol{U}^\dagger \mathcal{M}_\nu \boldsymbol{U}=\operatorname{diag}(m_1,m_2,m_3).

According to the corrolary there is a set of infinite matrices complatible with an unique U\boldsymbol{U} and the eigenvalues m1,m2,m3m_1,m_2,m_3

Casas-Ibarra parameterization

Consider a n×nn\times n symmetric matrix AA. We can assumme without lost of generality that this can be generated from a matrix YY such that A=YTY A=Y^{\operatorname{T}}Y Theorem 1 gurantees that exists an ortogonal matrix UU such that UTAU=UTYTYU=Dλ U^{\operatorname{T}} A U=U^{\operatorname{T}} Y^{\operatorname{T}}Y U=D_\lambda where Dλ=Adiag=diag(λ1,λ2,,λn) D_{\lambda}=A_{\text{diag}}=\operatorname{diag}\left(\lambda_1,\lambda_2,\ldots,\lambda_n\right) where λi\lambda_i are the eigenvalues of AA. Therefore YTY=UDλUT=UDλDλUT\begin{align} Y^{\operatorname{T}}Y =&U D_\lambda U^{\operatorname{T}}\\ =&U D_{\sqrt{\lambda}} D_{\sqrt{\lambda}} U^{\operatorname{T}}\\ \end{align} where Dλ=diag(λ1,λ2,λn) D_{\sqrt{\lambda}}=\operatorname{diag}\left(\sqrt{\lambda_1},\sqrt{\lambda_2},\ldots \sqrt{\lambda_n}\right) Therefore, exists an ortogonal arbitrary matrix RR, such that YTY=UDλRTRDλUT Y^{\operatorname{T}}Y =U D_{\sqrt{\lambda}}R^{\operatorname{T}}R D_{\sqrt{\lambda}} U^{\operatorname{T}}

In this way, the matrix YY can be parameterized in terms of RR as Y=RDλUT Y=R D_{\sqrt{\lambda}} U^{\operatorname{T}}

  1. By using the previous equations, build a 2×22\times 2 YY matrix with the following conditions

  • RR is an orthogonal matrix with a rotation angle as a random number between (0,2π)(0,2\pi). Use your identification number as the seed of the random number generator.

  • The eigenvalues are λ1=2\lambda_1=2 and λ2=4\lambda_2=4.

  • UU is a diagonalization matrix with mixing angle π/4\pi/4

  1. Build the matrix AA and check that has the proper eigenvalues and eigenvectors

import numpy as np #np.random.seed(98554575) θ=np.random.uniform(0,2*np.pi) def orthogonal(θ): return np.array( [[ np.cos(θ), np.sin(θ)], [ -np.sin(θ), np.cos(θ) ]] )
λ1=2; λ2=4 R=orthogonal(θ) U=orthogonal(np.pi/4) D_sqrtλ=np.diag( np.sqrt([λ1,λ2]) ) print('R=') R
R=
array([[-0.97721326, -0.21225986], [ 0.21225986, -0.97721326]])

From the equation for YY

Y=np.dot( np.dot( R,D_sqrtλ), U.transpose() ) Y
array([[-1.27739403, 0.67703248], [-1.16972838, -1.5942481 ]])

The symmetric matrix is

#Unique mass matrix A=np.dot( Y.transpose(), Y) A
array([[3., 1.], [1., 3.]])
λ,Unew=np.linalg.eig(A) print('Eigenvalues={}'.format(λ)) print('U=') Unew
Eigenvalues=[4. 2.] U=
array([[ 0.70710678, -0.70710678], [ 0.70710678, 0.70710678]])

After the reordering of the eigenvalues, we would obtain the original UU and therefore

(U.transpose()@A@U).round(14)
array([[2., 0.], [0., 4.]])

In this way, there is an infinity set of symmetric matrices AA which have the same eigenvalues and eigenvectors

THDM CP-even masses and mixing

Some times, the convention to identify the large projections in the non-diagonal basis.

In this case we have the rotation between an interaction basis and a physical basis defined as (R1R2)=(cosαsinαsinαcosα)(Hh). \begin{pmatrix} R_1\\ R_2\\ \end{pmatrix}= \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \\ \end{pmatrix} \begin{pmatrix} H\\ h\\ \end{pmatrix}. Since the diagonalization of a symmetric 2×22\times 2 matrix can be obtained analitically, here we build a 2×22\times 2 symmetric matrix from the eigenvalues, mHm_H and mhm_h, and the rotation angle, α\alpha, and check with the numerical results by using a benchmark point:

Benchmark point BP9

Here, the eigenvalue mhm_h is defined as the one with the eigenvector with the larger projection upon R2R_2

import numpy as np from scipy import optimize G_F=1.1663787E-5 #GeV^-2 v=1/np.sqrt(np.sqrt(2.)*G_F) # GeV #********* BP9 ************ tanβ=2 sinβ_α=0.6 # sin(β-α) m_h = 125 # GeV/c^2 (c→1) m_H = 300 # GeV/c^2 m_A = 300 # GeV/c^2 m_Hp= 400 # GeV/c^2 m2_12=100**2 # (GeV/c^2)^2 #***********************

The formula for the mass matrix in terms of the previous parameters can be found in [hep-ph/0207010,arXiv:1507.00933]. We assumme here that λ6=λ7=0\lambda_6=\lambda_7=0. See Appendix D of [hep-ph/0207010] for full formulas

β=np.arctan(tanβ) α=β-np.arcsin(sinβ_α) # <= np.pi/2~1.57 print('α=',α) λ1 =(m_H**2*np.cos(α)**2+m_h**2*np.sin(α)**2-m2_12*np.tan(β))/(v**2*np.cos(β)**2) λ_2 =(m_H**2*np.sin(α)**2+m_h**2*np.cos(α)**2-m2_12/np.tan(β))/(v**2*np.sin(β)**2) λ3 =((m_H**2-m_h**2)*np.cos(α)*np.sin(α)+2*m_Hp**2*np.sin(β)*np.cos(β)-m2_12)/(v**2*np.sin(β)*np.cos(β)) λ4 =((m_A**2-2*m_Hp**2)*np.sin(β)*np.cos(β)+m2_12)/(v**2*np.sin(β)*np.cos(β)) λ5 =(m2_12-m_A**2*np.sin(β)*np.cos(β))/(v**2*np.sin(β)*np.cos(β)) λ345=λ3+λ4+λ5 λ345
α= 0.46364760900080604
1.63919914780058
def M2(β,λ1,λ_2,λ5,λ345,m2_12,v=246.2): mA2=m2_12/(np.sin(β)*np.cos(β))-λ5*v**2 M11=λ1*v**2*np.cos(β)**2+(mA2+λ5*v**2)*np.sin(β)**2 M12=(λ345*v**2-(mA2+λ5*v**2))*np.sin(β)*np.cos(β) M22=λ_2*v**2*np.sin(β)**2+(mA2+λ5*v**2)*np.cos(β)**2 return np.array( [[M11, M12], [M12, M22]])

In this way, from the BP point, we can build the 2×22\times2 symmetric mass M\mathcal{M}:

ℳ2=M2(β,λ1,λ_2,λ5,λ345,m2_12,v=v) ℳ2
array([[75125., 29750.], [29750., 30500.]])
m2,V=np.linalg.eig(ℳ2)

By using the proper basis (mH200mh2)=(cosαsinαsinαcosα)(M112M122M122M222)(cosαsinαsinαcosα). \left(\begin{array}{cc} m_{H}^{2} & 0 \\ 0 & m_{h}^{2} \end{array}\right)=\left(\begin{array}{rr} \cos{\alpha} & \sin{\alpha} \\ -\sin{\alpha} & \cos{\alpha} \end{array}\right)\left(\begin{array}{ll} \mathcal{M}_{11}^{2} & \mathcal{M}_{12}^{2} \\ \mathcal{M}_{12}^{2} & \mathcal{M}_{22}^{2} \end{array}\right)\left(\begin{array}{lr} \cos{\alpha} & -\sin{\alpha} \\ \sin{\alpha} & \cos{\alpha} \end{array}\right). Therefore U=(cosαsinαsinαcosα). U=\left(\begin{array}{lr} \cos{\alpha} & -\sin{\alpha} \\ \sin{\alpha} & \cos{\alpha} \end{array}\right). Since π/2απ/2-\pi/2\le\alpha\le\pi/2 then cosα0\cos\alpha\ge 0. In this way, we must obtain α\alpha from tanα\tan\alpha to avoid ambiguities with a global sign in the second eigenvector tanα=sinαcosα=U01U11. \tan\alpha=\frac{\sin\alpha}{\cos\alpha}=-\frac{U_{01}}{U_{11}}\,.

The eigenvector with large projection in the second component along R2R_2 ( Interaction basis (R1,R2)(R_1,R_2) → Mass basis (H0,h0)(H^0,h^0)) is the associated with the standard model Higgs mass, mhm_h

V
array([[ 0.89442719, -0.4472136 ], [ 0.4472136 , 0.89442719]])
if np.abs(V[0,1])<=np.abs(V[1,1]): H=0; h=1 else: H=1; h=0 (H,h)
(0, 1)
U=np.c_[ V[:,H],V[:,h] ] U
array([[ 0.89442719, -0.4472136 ], [ 0.4472136 , 0.89442719]])
M2diag=np.dot( np.dot( U.transpose(),ℳ2), U).round(9) mH,mh=np.sqrt(M2diag[0,0]),np.sqrt(M2diag[1,1]) mH,mh
(300.0, 125.0)
tanα=-U[0,1]/U[1,1] α=np.arctan(tanα) if α>=-np.pi/2 and α<=np.pi/2: print(α) else: print('Bad α range')
0.4636476090008061
np.cos(β-α)
0.8
np.sin(β-α)
0.5999999999999999

Analytical diagonalization from [arXiv:1507.00933]

Δ=np.sqrt( (ℳ2[0,0]-ℳ2[1,1])**2+4*ℳ2[0,1]**2 )
np.sqrt( [0.5*( ℳ2[0,0]+ℳ2[1,1]+Δ ), 0.5*( ℳ2[0,0]+ℳ2[1,1]-Δ ) ])
array([300., 125.])
Cosα,Sinα=( np.sqrt((Δ+ℳ2[0,0]-ℳ2[1,1])/(2*Δ)), np.sqrt(2)*ℳ2[0,1]/np.sqrt(Δ*(Δ+ℳ2[0,0]-ℳ2[1,1])) ) print('U=') np.array([[Cosα,-Sinα],[Sinα,Cosα]])
U=
array([[ 0.89442719, -0.4472136 ], [ 0.4472136 , 0.89442719]])
print('cos(β-α)={}'.format(np.cos( β- np.arctan( Sinα/Cosα) )))
cos(β-α)=0.8

Activity

Let M=[1234] \boldsymbol{M}=\begin{bmatrix} 1 & 2\\ 3 & 4\\ \end{bmatrix} Choose the diagonalization method (A/B)

  • A:

np.linalg.eig(M)
  • B:

from scipy import linalg linalg.svd(M)
from activities import * x=input('A or B?\n') activities(x,"7.1")