
Lecture 9: Symbolic computation

Authors:

  • David Klemmer

  • Edwin Genoud-Prachex

  • Sean Tulin

Objectives:

By the end of this lesson, you will be able to:

  • Use the SymPy library to perform symbolic computations, including basic algebraic manipulations, matrix manipulation, and calculus

Introduction to SymPy

SymPy is a Python library for performing symbolic computations. It provides functionality similar to Mathematica, Maple, and Matlab, with the advantage that SymPy is free. Symbolic manipulation is not only used in research; it can also be very helpful in your coursework for checking theoretical calculations.

This tutorial will give you a quick introduction to different functionalities of SymPy:

  • Defining variables and functions

  • Calculus of functions (differentiation and integration)

  • Solving ordinary differential equations

  • Algebraic and matrix manipulations

  • Defining tensors

For more information, we refer the reader to the (highly recommended and well-written) official documentation. In particular, the examples here cover only real-valued computations; for complex analysis, see the documentation.

Getting started

First, we import the SymPy library. (We will also use numpy and matplotlib.pyplot in the examples below.)

The next step is to define the variables we will use for our calculations to follow. Here we use t, x, y, z.

import sympy as sp
import numpy as np
import matplotlib.pyplot as plt

t, x, y, z = sp.symbols("t x y z")
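As a side note (not part of the original lesson): depending on your environment, SymPy output may not render as typeset math automatically. The standard SymPy function sp.init_printing() enables the best available pretty-printing.

# Optional: enable the nicest available output
# (LaTeX in Jupyter, Unicode pretty-printing in a terminal).
sp.init_printing()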

With these variables, we can define mathematical expressions. SymPy has many elementary functions that are built-in and from which more complicated functions can be constructed.

For example, let's define the function

f(x) = \cos(x) + 1

In SymPy, this is defined as follows.

f = sp.cos(x) + 1
f

\cos(x) + 1

We can replace or substitute one variable by another. For example, say we want to replace x \to y in the above expression, yielding

f(x) \to f(y) = \cos(y) + 1

This is accomplished by substitution as follows.

f.subs(x, y)

\cos(y) + 1

We can use this feature to evaluate a function at a given point. For example, suppose we want to calculate f(0):

f(x=0) = \cos(0) + 1 = 2

We perform a substitution replacing x \to 0, as follows.

f.subs(x, 0)

2

We can also perform a more complicated substitution to replace x by a function. For example, let's replace x by a function of y and z, i.e., x \to y z^2. That is, we want:

f(x) \to f(y z^2) = \cos(y z^2) + 1

f.subs(x, y * z**2)

\cos(y z^2) + 1

Note that these operations do not change the definition of f itself. If you want to keep the result of a symbolic manipulation, you need to assign it to a new name.

print(f)  # Still the original f
g = f.subs(x, y * z**2)
print(g)

cos(x) + 1
cos(y*z**2) + 1
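As a side note (an addition beyond the original lesson), subs also accepts a list of (old, new) pairs, so several substitutions can be performed in one call:

# Several substitutions at once: set y = 1 and z = t in g = cos(y*z**2) + 1.
g.subs([(y, 1), (z, t)])  # -> cos(t**2) + 1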

Converting to numerical functions

The easiest way to convert a SymPy expression to an expression that can be numerically evaluated is to use the lambdify function.

To convert our symbolic function f(x) to a numerical one, we do the following:

f_num = sp.lambdify(x,f)

Note that we need to pass our symbolic function f as well as the variable that will be evaluated numerically, namely x. Now we have a numerical function f_num that can be evaluated in the same way as any ordinary Python function.


Now, let's define an array of x values and evaluate the function at those values numerically.

x_arr = np.arange(0., 10.5, 0.5)
f_num(x_arr)
array([2. , 1.87758256, 1.54030231, 1.0707372 , 0.58385316, 0.19885638, 0.0100075 , 0.06354331, 0.34635638, 0.7892042 , 1.28366219, 1.70866977, 1.96017029, 1.97658763, 1.75390225, 1.34663532, 0.85449997, 0.3979881 , 0.08886974, 0.00282784, 0.16092847])

SymPy also includes a built-in function sp.pprint() that can provide a nicer output.

sp.pprint(f_num(x_arr))
[2. 1.87758256 1.54030231 1.0707372 0.58385316 0.19885638 0.0100075 0.06354331 0.34635638 0.7892042 1.28366219 1.70866977 1.96017029 1.97658763 1.75390225 1.34663532 0.85449997 0.3979881 0.08886974 0.00282784 0.16092847]

We can plot these numerical values in the usual way.

plt.plot(x_arr, f_num(x_arr))
plt.xlabel(r"$x$")
plt.ylabel(r"$f(x)$")
plt.show()

[Plot of f(x) = \cos(x) + 1 for 0 \le x \le 10]
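As a side note (an addition beyond the original examples), lambdify also accepts a tuple of symbols for functions of several variables. A minimal sketch using g = cos(y*z**2) + 1 from above:

# g depends on y and z, so pass both symbols; their order fixes the call signature.
g_num = sp.lambdify((y, z), g)
g_num(1.0, 2.0)  # evaluates cos(1.0 * 2.0**2) + 1 ≈ 0.346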

Algebraic manipulations

SymPy has a built-in function sp.simplify() that can simplify mathematical expressions. For example, we can show that

\cos^2(x) + \sin^2(x) = 1

expr = sp.cos(x)**2 + sp.sin(x)**2
sp.simplify(expr)

1

As another example, we can perform this simplification:

\frac{x^3 - 5x}{x^2 + xy} = \frac{x}{x}\,\frac{x^2 - 5}{x + y} = \frac{x^2 - 5}{x + y}

expr = (x**3 - 5 * x) / (x**2 + y * x)
sp.simplify(expr)

\frac{x^2 - 5}{x + y}

SymPy can expand a polynomial with sp.expand(). Take the following example:

(x + y)^3 = x^3 + 3x^2 y + 3x y^2 + y^3

poly = (x + y)**3
expanded_poly = sp.expand(poly)
expanded_poly

x^3 + 3 x^2 y + 3 x y^2 + y^3

SymPy can factorize a polynomial with sp.factor(). From the previous example, we have:

sp.factor(expanded_poly)

(x + y)^3

Differentiation

SymPy can compute derivatives of functions. Consider the example

\frac{\text{d}}{\text{d}t} \cos(t) = -\sin(t)

In SymPy, this is done as follows using sp.diff(f,t) where f is the function we want to differentiate with respect to t.

sp.diff(sp.cos(t), t)

-\sin(t)

Differentiation can also be performed using the syntax f.diff(t). Consider this example:

\frac{\text{d}}{\text{d}t} \cos^2(t) = -2\sin(t)\cos(t)

We can perform this as:

f = sp.cos(t)**2
f.diff(t)

-2\sin(t)\cos(t)

For taking a higher-order derivative, the syntax is f.diff(t,n) where n is the number of derivatives taken with respect to t. As an example, suppose we want to compute

\frac{\text{d}^2}{\text{d}t^2} \cos^2(t) = 2\left(\sin^2(t) - \cos^2(t)\right)

f.diff(t, 2)

2\left(\sin^2(t) - \cos^2(t)\right)

Of course, this is equivalent to taking two first derivatives.

f.diff(t).diff(t)

2\sin^2(t) - 2\cos^2(t)
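As a side note (beyond the original examples), SymPy can also represent a derivative without evaluating it, via sp.Derivative; calling .doit() then carries out the differentiation:

# An unevaluated derivative object, useful for displaying the operation symbolically.
D = sp.Derivative(sp.cos(t)**2, t)
D.doit()  # -> -2*sin(t)*cos(t)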

For multivariate functions, the same function diff can take partial derivatives, including higher-order derivatives with respect to different variables.

For example, we consider the function f(t, x) = (t + x)^3 and we will show that

\frac{\partial^2 f}{\partial t\, \partial x} = \frac{\partial^2 f}{\partial x\, \partial t}

f = (t + x)**3
# Differentiate first with respect to t, then with respect to x
f.diff(t, x)

6(t + x)

# Differentiate first with respect to x, then with respect to t
f.diff(x, t)

6(t + x)

Integration

There are two kinds of integrals: definite and indefinite. To calculate the indefinite integral (i.e., the antiderivative or primitive of a function)

\int f(t) \; \text{d}t

we use the syntax sp.integrate(f, t), where f is the function and t is the integration variable. (Note that SymPy does not include the arbitrary constant of integration in its results.) As an example, let's compute

\int \cos(t) \; \text{d}t = \sin(t)

sp.integrate(sp.cos(t), t)

\sin(t)

Next, we compute a definite integral. The syntax is sp.integrate(f,(t,a,b)) to compute

\int_a^b f(t) \; \text{d}t

As an example, we can compute

\int_0^\infty \text{e}^{-t} \, \text{d}t

sp.integrate(sp.exp(-t), (t, 0, sp.oo))  # sp.oo is SymPy's symbol for infinity

1

Similar to differentiation, there is an alternative syntax form where integrate is called directly as a method of the function itself, f.integrate(t). As an example, we compute

\int \text{e}^{-t} \; \text{d}t = -\text{e}^{-t}

f = sp.exp(-t)
f.integrate(t)

-\text{e}^{-t}

Here is an example that does not converge:

\int_{-\infty}^{\infty} \text{e}^{-t} \, \text{d}t

f.integrate((t, -sp.oo, sp.oo))

\infty

Next, we turn to multivariate integrals. For example, we calculate the volume of a sphere of radius r by performing the triple integral

V = \int_0^r \text{d}r' \int_0^\pi \text{d}\theta \int_0^{2\pi} \text{d}\varphi \; r'^2 \sin(\theta) = \frac{4}{3}\pi r^3

# First define new variables for spherical coordinates
# (a raw string keeps Python from treating the \t in "\theta" as a tab escape)
r, rprime, theta, phi = sp.symbols(r"r, r', \theta, \phi")
# Define integrand
integrand = rprime**2 * sp.sin(theta)
# Perform integral
V = integrand.integrate((rprime, 0, r), (phi, 0, 2*sp.pi), (theta, 0, sp.pi))
V

\frac{4 \pi r^{3}}{3}

By substituting a given value for the radius, we get the value for a physical volume. For example, taking r = 1 gives the volume of the unit sphere.

unit_sphere = V.subs(r, 1)  # r = 1
unit_sphere

\frac{4 \pi}{3}

We can convert the exact value into a numerical value using the method evalf().

unit_sphere.evalf()

4.18879020478639
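As a side note (not in the original lesson), evalf also accepts the desired number of significant digits as an argument:

# Evaluate the volume of the unit sphere to 30 significant digits.
unit_sphere.evalf(30)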

Ordinary differential equations

Next, we can see how to solve ordinary differential equations (ODEs). Here we consider ODEs that can be expressed in the form

a_0(t)\, f(t) + a_1(t)\, \frac{\text{d}f(t)}{\text{d}t} + \dots + a_n(t)\, \frac{\text{d}^n f(t)}{\text{d}t^n} + b(t) = 0

where f(t) is the function we want to solve for, and a_i(t) for i = 0, \dots, n and b(t) are arbitrary differentiable functions.

The steps are as follows.

  1. Define the symbol f to be a function (as opposed to a symbolic variable). Passing cls=sp.Function declares that the symbol belongs to the class of functions:

f = sp.symbols("f", cls=sp.Function)

  2. Define the ODE. This is done by defining an equation using sp.Eq(L, R), which represents the equation L = R.

  3. Solve the ODE using sp.dsolve().

Let's show all the steps for the simple ODE

\frac{\text{d}f(t)}{\text{d}t} - f(t) = 0

# Define the function
f = sp.symbols("f", cls=sp.Function)
# Define the ODE
ODE = sp.Eq(f(t).diff(t) - f(t), 0)
# Solve the ODE
sp.dsolve(ODE, f(t))

f(t) = C_1 e^{t}

Here is another example, a second-order ODE describing the simple harmonic oscillator:

\frac{\text{d}^2 f(t)}{\text{d}t^2} + \omega_0^2\, f(t) = 0

Remember that if your ODE contains a constant symbolic parameter, you must first define it as a symbol.

omega = sp.symbols(r"\omega_0")
ODE = sp.Eq(f(t).diff(t, 2) + omega**2 * f(t), 0)
sp.dsolve(ODE, f(t))

f(t) = C_1 e^{-i \omega_0 t} + C_2 e^{i \omega_0 t}
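As a side note (an addition beyond the original lesson), sp.dsolve can also apply initial conditions through its ics keyword, which fixes the integration constants. A minimal sketch imposing f(0) = 1 and f'(0) = 0 on the oscillator above:

# Initial conditions: f(0) = 1 and f'(0) = 0.
sp.dsolve(ODE, f(t), ics={f(0): 1, f(t).diff(t).subs(t, 0): 0})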

Matrix manipulation with SymPy

First, we define a matrix:

M = sp.Matrix([[1, 2], [3, 4]])
M

\left[\begin{matrix}1 & 2\\3 & 4\end{matrix}\right]

We can get the individual entries of a matrix using the usual syntax for array indexing and slicing. For example, M[i,j] is the entry in the i-th row and j-th column, and M[:,i] is the i-th column. As usual in Python, index counting starts at 0.

# First row
row = M[0, :]
row

\left[\begin{matrix}1 & 2\end{matrix}\right]

# Second element of first row
row[1]

2

# Or more directly
M[0, 1]

2

You can define a column matrix (or column vector):

c = sp.Matrix([x, y])
c

\left[\begin{matrix}x\\y\end{matrix}\right]

And a row matrix (or row vector):

r = sp.Matrix([[x, y]])
r

\left[\begin{matrix}x & y\end{matrix}\right]

To multiply matrices, you can use the * symbol. Here are some examples:

M * M

\left[\begin{matrix}7 & 10\\15 & 22\end{matrix}\right]

r * c  # This is a 1x1 matrix, i.e., effectively a single number

\left[\begin{matrix}x^{2} + y^{2}\end{matrix}\right]

c * r # This is a 2x2 matrix

\left[\begin{matrix}x^{2} & x y\\x y & y^{2}\end{matrix}\right]

Other matrix operations are as follows:

M + M # matrix addition

\left[\begin{matrix}2 & 4\\6 & 8\end{matrix}\right]

sp.Transpose(M) # transposition

\left(\left[\begin{matrix}1 & 2\\3 & 4\end{matrix}\right]\right)^{T}

If you want to see it explicitly, you should use the method as_explicit():

sp.Transpose(M).as_explicit()

\left[\begin{matrix}1 & 3\\2 & 4\end{matrix}\right]

sp.det(M) # Determinant

-2

sp.det(sp.Transpose(M)) # Determinant of the transpose

-2

M.inv() # Inverse of a matrix

\left[\begin{matrix}-2 & 1\\\frac{3}{2} & -\frac{1}{2}\end{matrix}\right]

We can check that a matrix multiplied by its inverse is the identity matrix.

M.inv() * M

\left[\begin{matrix}1 & 0\\0 & 1\end{matrix}\right]

We can compute the eigenvalues and eigenvectors of a matrix. The method eigenvals() returns the eigenvalues and the multiplicity of each eigenvalue. (For the example here, the multiplicity is always 1.)

The method eigenvects() additionally returns the eigenvectors (as column vectors).

Here are some examples.

M.eigenvals()
{5/2 - sqrt(33)/2: 1, 5/2 + sqrt(33)/2: 1}
M.eigenvects()
[(5/2 - sqrt(33)/2, 1, [Matrix([
[-2/(-3/2 + sqrt(33)/2)],
[                     1]])]),
 (5/2 + sqrt(33)/2, 1, [Matrix([
[-2/(-sqrt(33)/2 - 3/2)],
[                     1]])])]
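As a side note (beyond the original examples), SymPy can also diagonalize a matrix directly: M.diagonalize() returns matrices P and D with M = P D P^{-1}, where the columns of P are eigenvectors and D carries the eigenvalues on its diagonal.

# P holds the eigenvectors as columns; D is diagonal with the eigenvalues.
P, D = M.diagonalize()
D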

General Relativity and tensor algebra

Einstein's theory of General Relativity (GR) is described in terms of many different tensors. Here we will show how SymPy can be useful for algebraic computations involving tensors, which in GR can be somewhat complicated and tedious to work out by hand.

What are tensors? They are objects with components labeled by indices. You are familiar with one-index objects (vectors) and two-index objects (matrices), but there can be more complicated objects with even more indices to label their components.

Tensor algebra is the process of multiplying tensors together by multiplying their components along certain axes. Familiar examples include the inner product (dot product) between vectors and usual matrix multiplication.

To begin, we define a spacetime vector to include time t along with the usual three Cartesian spatial directions

x^\mu = (t, x, y, z)

Here, \mu is the index and takes values \mu = 0, 1, 2, 3, i.e., x^0 = t, x^1 = x, etc. Even though x^\mu is, strictly speaking, a component of the vector, it is customary to refer to x^\mu as the vector itself.

Matrix algebra with the metric tensor

The fundamental object in GR is the metric tensor g_{\mu\nu}, which describes the curvature of spacetime. Since it has two indices, we can represent it as a matrix. Here we will assume the metric tensor can be written as the following matrix.

g_{\mu\nu} = \left(\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & -f^2(t) & 0 & 0 \\ 0 & 0 & -f^2(t) & 0 \\ 0 & 0 & 0 & -f^2(t) \end{matrix}\right)

This form is useful in cosmology for describing a homogeneous and isotropic expanding Universe.

The metric tensor encodes the geometry of the curved spacetime manifold. This is often expressed in terms of the (squared) spacetime distance ds^2 between two nearby spacetime points, x^\mu = (t, x, y, z) and x^\mu + dx^\mu = (t + dt, x + dx, y + dy, z + dz):

ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu = dt^2 - f^2(t)\left(dx^2 + dy^2 + dz^2\right)

In GR, tensors with upper and lower indices are different objects. An upper index is referred to as a contravariant index, while a lower index is a covariant index. To convert one into the other (referred to as raising or lowering indices), we multiply by the metric tensor: we multiply by g_{\mu\nu} to lower an index and by g^{\mu\nu} to raise an index. Since raising and then lowering an index (or vice versa) gives back the original tensor, g_{\mu\nu} and g^{\mu\nu} must be matrix inverses of one another.

# Metric tensor g_{mu nu} with two lower indices (denoted ll)
g_ll = sp.Matrix([[1, 0, 0, 0],
                  [0, -f(t)**2, 0, 0],
                  [0, 0, -f(t)**2, 0],
                  [0, 0, 0, -f(t)**2]])
g_ll

\left[\begin{matrix}1 & 0 & 0 & 0\\0 & -f^{2}(t) & 0 & 0\\0 & 0 & -f^{2}(t) & 0\\0 & 0 & 0 & -f^{2}(t)\end{matrix}\right]

# Metric tensor g^{mu nu} with two upper indices (denoted uu)
# Take the inverse
g_uu = g_ll.inv()
g_uu

\left[\begin{matrix}1 & 0 & 0 & 0\\0 & -\frac{1}{f^{2}(t)} & 0 & 0\\0 & 0 & -\frac{1}{f^{2}(t)} & 0\\0 & 0 & 0 & -\frac{1}{f^{2}(t)}\end{matrix}\right]

We can easily check that these matrices are inverses by (matrix) multiplying them together. We get the 4 \times 4 identity matrix.

g_uu * g_ll

\left[\begin{matrix}1 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{matrix}\right]

For other tensor calculations, we need a more general approach beyond matrix multiplication.

First, we have the general tensor product sp.tensorproduct(), which multiplies all the elements of two tensors in all possible ways. For example, g_{\mu\nu} has 4 \times 4 components and g^{\alpha\beta} has 4 \times 4 components (we need unique labels for the indices here). So the tensor product

g_{\mu\nu}\, g^{\alpha\beta}

is an object with four indices that has 4 \times 4 \times 4 \times 4 components.

# Tensor product will yield a 4 x 4 x 4 x 4 component object.
prod = sp.tensorproduct(g_ll, g_uu)
prod

[Output: a 4 \times 4 array of 4 \times 4 blocks, where the (\mu, \nu) block is g_{\mu\nu}\, g^{\alpha\beta}. The (0,0) block is g^{\alpha\beta} itself, the remaining diagonal blocks are -f^2(t)\, g^{\alpha\beta} = \text{diag}(-f^2(t), 1, 1, 1), and all off-diagonal blocks vanish.]

Next, we have tensor contraction. In the example above, suppose we set the indices \nu = \alpha and sum over \nu:

\sum_{\nu = 0}^{3} g_{\mu\nu}\, g^{\nu\beta}

Now, this is nothing more than the usual matrix multiplication.

We are left with two free indices, \mu and \beta, so this object is another 4 \times 4 matrix. Since summing over indices reduces the number of free indices left, this is referred to as contracting indices.

Here is a bit of important notation. It is customary to omit the summation sign and write the above as

g_{\mu\nu}\, g^{\nu\beta}

with the understanding that repeated indices (one raised, one lowered) are summed over, known as Einstein's summation convention.

We can perform this operation using sp.tensorcontraction(). First, we compute the tensor product as above. Then we contract the second index (\nu) and the third index (\alpha) with each other. For the above example, we have

prod = sp.tensorproduct(g_ll, g_uu)
sp.tensorcontraction(prod, (1, 2))

Here (1,2) labels the indices to contract, i.e., the second and third indices since as usual Python counting starts at 0.

In this example, we again obtain the identity matrix.

\left[\begin{matrix}1 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 1 & 0\\0 & 0 & 0 & 1\end{matrix}\right]

The contravariant metric times the covariant metric gives the four-dimensional unit matrix:

g^{\mu\alpha} g_{\alpha\nu} = \delta^\mu_\nu = \left(\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix}\right)

Raising and lowering indices

The contravariant form of the spacetime position vector is x^\mu = (t, x, y, z). The covariant form is obtained by multiplying with the metric tensor to lower the index. That is,

x_\mu = g_{\mu\nu}\, x^\nu

Note that the repeated index \nu is summed over according to Einstein's summation convention (i.e., there is an omitted \sum_{\nu=0}^{3} in front). This operation is just a 4 \times 4 matrix multiplying a 4-component vector to yield another 4-component vector.

Similarly, raising an index is accomplished by

x^\mu = g^{\mu\nu}\, x_\nu

In this way, indices are raised and lowered by contracting with the metric tensor. Let's see how to perform these operations.

# Define x with upper index (u)
x_u = sp.Array([t, x, y, z])
x_u

\left[\begin{matrix}t & x & y & z\end{matrix}\right]

# Compute x with lower index (l)
# Step 1: compute tensor product
prod = sp.tensorproduct(g_ll, x_u)
# Step 2: contraction
x_l = sp.tensorcontraction(prod, (1, 2))
x_l

\left[\begin{matrix}t & -x f^{2}(t) & -y f^{2}(t) & -z f^{2}(t)\end{matrix}\right]
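Since lowering an index with g_{\mu\nu} is just a matrix acting on a vector (as noted above), we can cross-check this result with ordinary matrix multiplication. A minimal sketch, assuming the objects defined above:

# The same contraction, written as a 4x4 matrix times a column vector.
g_ll * sp.Matrix([t, x, y, z])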

Performing the same steps to raise the index returns the original position vector.

prod = sp.tensorproduct(g_uu, x_l)
sp.tensorcontraction(prod, (1, 2))

\left[\begin{matrix}t & x & y & z\end{matrix}\right]

Tensor products can be arbitrarily complicated. For example, consider the product of three metric tensors, which is an object with six indices.

g^{\alpha\beta}\, g_{\gamma\mu}\, g_{\delta\nu}

If we contract \alpha = \gamma (the first and third indices) and \beta = \delta (the second and fourth indices), we have

g^{\alpha\beta} g_{\alpha\mu} g_{\beta\nu} = g_{\mu\nu}

That is, we start with g^{\alpha\beta} and lower its two indices using the metric tensor twice. We can check that this works explicitly.

prod = sp.tensorproduct(g_uu, g_ll, g_ll)
sp.tensorcontraction(prod, (0, 2), (1, 4))

\left[\begin{matrix}1 & 0 & 0 & 0\\0 & -f^{2}(t) & 0 & 0\\0 & 0 & -f^{2}(t) & 0\\0 & 0 & 0 & -f^{2}(t)\end{matrix}\right]

This is the same as the metric tensor with two lower indices, given above.

g_ll

\left[\begin{matrix}1 & 0 & 0 & 0\\0 & -f^{2}(t) & 0 & 0\\0 & 0 & -f^{2}(t) & 0\\0 & 0 & 0 & -f^{2}(t)\end{matrix}\right]

Derivatives of tensors

In GR, tensors are in general not fixed numbers but functions of the spacetime coordinates. That is, the components of a tensor can be functions of t, x, y, z. For example, in our discussion above, we assumed that the metric tensor was a function of t.

Here we consider derivatives of tensors. In general, one needs to consider all partial derivatives

\frac{\partial}{\partial x^\mu} = \left( \frac{\partial}{\partial t}, \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right)

This is often abbreviated as \partial_\mu.

Here, since we only introduce dependence on time t, we need only consider the time derivative \partial/\partial t, often abbreviated as \partial_0 or \partial_t. The other derivatives \partial_x, \partial_y, \partial_z will yield zero.

As an example, let's take the derivative of the metric tensor

\partial_\alpha\, g_{\mu\nu}

We have three indices, so we will get a 3-index object back. All of the components with \alpha \ne 0 will be zero.

g_ll.diff(x_u) # the metric only depends on t, so all derivatives by x, y or z are 0

[Output: a 4-component array of 4 \times 4 matrices. The \alpha = 0 entry is \partial_t g_{\mu\nu} = \text{diag}\!\left(0,\, -2 f(t)\tfrac{d}{dt}f(t),\, -2 f(t)\tfrac{d}{dt}f(t),\, -2 f(t)\tfrac{d}{dt}f(t)\right); the \alpha = 1, 2, 3 entries are zero matrices.]

If we just want the time derivative, we can compute \partial_t\, g_{\mu\nu} in two equivalent ways (i.e., setting \alpha = 0).

g_ll.diff(x_u[0])

\left[\begin{matrix}0 & 0 & 0 & 0\\0 & -2 f(t)\frac{d}{dt}f(t) & 0 & 0\\0 & 0 & -2 f(t)\frac{d}{dt}f(t) & 0\\0 & 0 & 0 & -2 f(t)\frac{d}{dt}f(t)\end{matrix}\right]

g_ll.diff(t)

\left[\begin{matrix}0 & 0 & 0 & 0\\0 & -2 f(t)\frac{d}{dt}f(t) & 0 & 0\\0 & 0 & -2 f(t)\frac{d}{dt}f(t) & 0\\0 & 0 & 0 & -2 f(t)\frac{d}{dt}f(t)\end{matrix}\right]

We can also compute a second derivative, e.g., \partial_t^2\, g_{\mu\nu}:

g_ll.diff(x_u[0], 2)

\left[\begin{matrix}0 & 0 & 0 & 0\\0 & -2\left(f(t)\frac{d^{2}}{dt^{2}}f(t) + \left(\frac{d}{dt}f(t)\right)^{2}\right) & 0 & 0\\0 & 0 & -2\left(f(t)\frac{d^{2}}{dt^{2}}f(t) + \left(\frac{d}{dt}f(t)\right)^{2}\right) & 0\\0 & 0 & 0 & -2\left(f(t)\frac{d^{2}}{dt^{2}}f(t) + \left(\frac{d}{dt}f(t)\right)^{2}\right)\end{matrix}\right]