This is arguably one of the most significant applications of linear algebra. We will gradually build up intuition before delving into the multivariate normal distribution.
Univariate Normal Distribution
The probability density function (PDF) of the univariate normal distribution is defined as:

$$f(x;\mu,\sigma^2)=\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Here, $f(x;\mu,\sigma^2)$ denotes the PDF where the random variable is $x$, and the parameters are the mean $\mu$ and variance $\sigma^2$. The semicolon notation is not indicative of a conditional relationship, which is typically represented as $f(x\mid\mu,\sigma^2)$.
It's important to note that the exponent $-\frac{(x-\mu)^2}{2\sigma^2}$ is a quadratic function of $x$. This is essentially the univariate counterpart of the quadratic forms discussed in previous chapters.
If we assign values to $\mu$ and $\sigma^2$, we can plot the quadratic function and its exponential.
The constant $\frac{1}{\sqrt{2\pi\sigma^2}}$ at the beginning serves as a normalizing factor, ensuring that the integral of the entire function equals $1$:

$$\int_{-\infty}^{\infty}f(x;\mu,\sigma^2)\,dx=1$$
The simplest method to plot a univariate normal PDF is to use the sp.stats.norm.pdf() function from Scipy, where we can directly input the values of $\mu$ and $\sigma$.
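A minimal sketch of this approach is below; the values $\mu=0$ and $\sigma=1$ are illustrative assumptions, not the notebook's original choices.

```python
import numpy as np
import scipy as sp
import scipy.stats  # ensures sp.stats is available
import matplotlib.pyplot as plt

mu, sigma = 0, 1  # assumed example values: mean and standard deviation

x = np.linspace(mu - 4 * sigma, mu + 4 * sigma, 200)
pdf = sp.stats.norm.pdf(x, loc=mu, scale=sigma)  # scale is the standard deviation

plt.plot(x, pdf)
plt.title(r'$N(\mu=0,\ \sigma^2=1)$')
plt.show()
```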
Multivariate Normal Distribution (MND)
The probability density function (PDF) of a multivariate normal distribution is given by:

$$f(\mathbf{x};\boldsymbol{\mu},\Sigma)=\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)$$
Understanding this PDF requires refreshing some foundational knowledge of random vectors, which we review below.
Expectation and Covariance Matrix
Consider a random $n$-vector $\mathbf{x}$; its expectation is defined as

$$E[\mathbf{x}]=\begin{pmatrix}E[x_1]\\ \vdots\\ E[x_n]\end{pmatrix}=\boldsymbol{\mu}$$

The variance of $\mathbf{x}$ is a covariance matrix, denoted as

$$\operatorname{Var}(\mathbf{x})=\Sigma=E\left[(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^T\right]$$
Linear Combination of Normal Distributions
As anticipated, a linear combination of a sequence of normally distributed variables also yields a normal distribution. Consider a random vector $\mathbf{z}$:

$$\mathbf{z}=(z_1,z_2,\ldots,z_n)^T$$

where each $z_i$ is independently and identically distributed (iid) with mean $0$ and variance $1$. Given any full-rank matrix $A$, a random normal vector $\mathbf{x}$ can be expressed as:

$$\mathbf{x}=A\mathbf{z}+\boldsymbol{\mu}$$

This implies that each $x_i$ for $i=1,\ldots,n$ is a linear combination of $z_1,\ldots,z_n$. Since $\operatorname{Var}(\mathbf{z})=I$, the variance of $\mathbf{x}$ is given by:

$$\operatorname{Var}(\mathbf{x})=\operatorname{Var}(A\mathbf{z}+\boldsymbol{\mu})=A\operatorname{Var}(\mathbf{z})A^T=AA^T=\Sigma$$
The covariance matrix $\Sigma$ is a positive semi-definite matrix. For any $n$-vector $\mathbf{a}$, we have:

$$\mathbf{a}^T\Sigma\,\mathbf{a}=\operatorname{Var}(\mathbf{a}^T\mathbf{x})\geq 0$$
This also indicates that all eigenvalues of $\Sigma$ are non-negative. For further details, refer to Chapter 17, specifically the section on positive definite matrices.
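A quick numerical check of both properties is sketched below; the covariance matrix used is a hypothetical example.

```python
import numpy as np

# Hypothetical symmetric covariance matrix for illustration
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

print(np.linalg.eigvalsh(Sigma))  # eigenvalues: all non-negative for a valid covariance

rng = np.random.default_rng(0)
a = rng.standard_normal(2)        # an arbitrary vector a
print(a @ Sigma @ a >= 0)         # True: the quadratic form is non-negative
```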
Inverse and Positive Definite
If a matrix is both positive definite and symmetric, all its eigenvalues are strictly greater than $0$. This property is crucial because eigenvalues reflect the matrix's characteristics in various dimensions. However, encountering an eigenvalue of $0$ implies a significant shift:

$$A\mathbf{x}=0\cdot\mathbf{x}=\mathbf{0}$$
This equation shows that a nontrivial solution $\mathbf{x}\neq\mathbf{0}$ (an eigenvector) is mapped by the matrix to the zero vector. The existence of such a nontrivial solution indicates that the matrix lacks full rank and is therefore non-invertible.
Consequently, the presence of a zero eigenvalue in a positive definite, symmetric matrix contradicts its defining properties. Thus, if $\Sigma$ is indeed positive definite, all of its eigenvalues are positive, which guarantees that $\Sigma$ is invertible. This invertibility is a key attribute, as it confirms the matrix's transformation can be uniquely reversed.
Inverse and Symmetry
Continuing from the previous section, where $\Sigma$ is symmetric and positive definite (hence invertible), we know that $\Sigma\Sigma^{-1}=I$. By taking the transpose of this equation, we obtain:

$$(\Sigma\Sigma^{-1})^T=(\Sigma^{-1})^T\Sigma^T=(\Sigma^{-1})^T\Sigma=I$$

This demonstrates that $(\Sigma^{-1})^T=\Sigma^{-1}$, indicating that $\Sigma^{-1}$ is also a symmetric matrix. Next, we aim to show that $\Sigma^{-1}$ is positive definite.
Consider an eigenvalue equation for $\Sigma$:

$$\Sigma\mathbf{v}=\lambda\mathbf{v}$$

Applying $\Sigma^{-1}$ to both sides, we get:

$$\Sigma^{-1}\Sigma\mathbf{v}=\lambda\Sigma^{-1}\mathbf{v}\quad\Longrightarrow\quad\Sigma^{-1}\mathbf{v}=\frac{1}{\lambda}\mathbf{v}$$

This derivation proves that if $\Sigma$ has an eigenvalue $\lambda$, then $\Sigma^{-1}$ has $\frac{1}{\lambda}$ as its corresponding eigenvalue. Given that $\lambda>0$ (since $\Sigma$ is positive definite), it follows that $\frac{1}{\lambda}>0$. Therefore, $\Sigma^{-1}$ is also positive definite, maintaining the property of positive definiteness through inversion.
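The sketch below checks this numerically: the inverse is symmetric, and its eigenvalues are reciprocals of the original ones. The matrix is the same hypothetical example as above.

```python
import numpy as np

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])  # hypothetical symmetric positive definite matrix
Sigma_inv = np.linalg.inv(Sigma)

print(np.allclose(Sigma_inv, Sigma_inv.T))         # True: inverse is symmetric
print(np.sort(np.linalg.eigvalsh(Sigma)))          # eigenvalues lambda_i of Sigma
print(np.sort(1 / np.linalg.eigvalsh(Sigma_inv)))  # reciprocals of Sigma^{-1}'s eigenvalues match
```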
Bivariate Normal Distribution
In the probability density function (PDF) of the Multivariate Normal Distribution (MND), the argument of the exponential function is $-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})$, representing a quadratic form. Here, $\Sigma$ denotes the covariance matrix, which is symmetric and positive semi-definite. Consequently, its inverse $\Sigma^{-1}$ shares these properties, so $(\mathbf{x}-\boldsymbol{\mu})^T\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\geq 0$ for any vector $\mathbf{x}$.
The presence of a negative sign results in a negative semi-definite quadratic form, mathematically expressed as:

$$-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\leq 0$$
To illustrate this concept with a simple bivariate example, consider the quadratic form with a diagonal covariance matrix:

$$(\mathbf{x}-\boldsymbol{\mu})^T\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})=\frac{(x_1-\mu_1)^2}{\sigma_1^2}+\frac{(x_2-\mu_2)^2}{\sigma_2^2}$$

Assigning specific values to $\boldsymbol{\mu}$ and $\Sigma$, we can visualize the quadratic form and its exponential, providing insight into the shape and behavior of the Multivariate Normal Distribution's density function.
Also, the most convenient way of producing a bivariate normal distribution is to use the Scipy multivariate normal function, sp.stats.multivariate_normal.
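A minimal sketch of that approach follows; the mean vector and covariance values are illustrative assumptions.

```python
import numpy as np
import scipy as sp
import scipy.stats
import matplotlib.pyplot as plt

mu = np.array([0.0, 0.0])           # assumed mean vector
Sigma = np.array([[1.0, 0.0],
                  [0.0, 2.0]])      # assumed diagonal covariance matrix

# Evaluate the joint PDF on a grid and draw filled contours
x1, x2 = np.meshgrid(np.linspace(-4, 4, 100), np.linspace(-4, 4, 100))
pos = np.dstack((x1, x2))           # shape (100, 100, 2): (x1, x2) pairs
pdf = sp.stats.multivariate_normal(mean=mu, cov=Sigma).pdf(pos)

plt.contourf(x1, x2, pdf)
plt.show()
```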
Since we have expanded the bivariate quadratic form, substituting back into the MVN PDF gives:

$$f(x_1,x_2)=\frac{1}{2\pi\sigma_1\sigma_2}\exp\left(-\frac{(x_1-\mu_1)^2}{2\sigma_1^2}-\frac{(x_2-\mu_2)^2}{2\sigma_2^2}\right)=\frac{1}{\sqrt{2\pi}\,\sigma_1}e^{-\frac{(x_1-\mu_1)^2}{2\sigma_1^2}}\cdot\frac{1}{\sqrt{2\pi}\,\sigma_2}e^{-\frac{(x_2-\mu_2)^2}{2\sigma_2^2}}$$

We find that a bivariate normal distribution with a diagonal covariance matrix can be decomposed into a product of two univariate normal distributions!
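The decomposition can be verified numerically, as in the sketch below; the parameter values and test point are arbitrary assumptions.

```python
import numpy as np
import scipy as sp
import scipy.stats

mu1, mu2, s1, s2 = 1.0, -1.0, 1.5, 0.5   # assumed means and standard deviations
point = np.array([0.3, 0.7])             # arbitrary evaluation point

# Joint bivariate PDF with diagonal covariance
joint = sp.stats.multivariate_normal(
    mean=[mu1, mu2], cov=np.diag([s1**2, s2**2])).pdf(point)

# Product of the two univariate marginal PDFs
product = (sp.stats.norm.pdf(point[0], mu1, s1) *
           sp.stats.norm.pdf(point[1], mu2, s2))

print(np.isclose(joint, product))        # True: joint PDF equals the product
```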
Covariance Matrix
The covariance matrix is the most important factor shaping the distribution. Using the Scipy multivariate normal random generator, we can develop some intuition about the covariance matrix.
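A sketch of such an experiment is below; the three diagonal covariance matrices are assumed examples chosen to contrast the spread along each axis.

```python
import numpy as np
import scipy as sp
import scipy.stats
import matplotlib.pyplot as plt

mu = [0, 0]
covs = [np.diag([1, 1]), np.diag([4, 1]), np.diag([1, 4])]  # assumed examples

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, cov in zip(axes, covs):
    # Draw 500 random samples for each covariance matrix
    samples = sp.stats.multivariate_normal.rvs(mean=mu, cov=cov, size=500)
    ax.scatter(samples[:, 0], samples[:, 1], s=5)
    ax.set_xlim(-8, 8)
    ax.set_ylim(-8, 8)
plt.show()
```

Larger diagonal entries stretch the cloud of samples along the corresponding axis.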
Notice that the plots above all use a diagonal covariance matrix, i.e.

$$\Sigma=\begin{pmatrix}\sigma_1^2 & 0\\ 0 & \sigma_2^2\end{pmatrix}$$
To understand the covariance matrix beyond intuition, we need to analyze its eigenvectors and eigenvalues. But we know that a diagonal matrix has all of its eigenvalues on the principal diagonal, which here are $\sigma_1^2$ and $\sigma_2^2$.
Isocontours
Isocontours are simply contour lines. They are projections of points of equal height onto the $(x_1,x_2)$-plane. Let's show what they look like analytically.
We know the eigenvalues, and intuitively we also sense that they are connected with the shape of the isocontours, as demonstrated by the scatter plots above. We will now derive an equation for an isocontour.
We saw in the previous section that

$$f(x_1,x_2)=\frac{1}{2\pi\sigma_1\sigma_2}\exp\left(-\frac{(x_1-\mu_1)^2}{2\sigma_1^2}-\frac{(x_2-\mu_2)^2}{2\sigma_2^2}\right)$$

In order to get the equation of a contour, we set the density equal to a constant $c$:

$$c=\frac{1}{2\pi\sigma_1\sigma_2}\exp\left(-\frac{(x_1-\mu_1)^2}{2\sigma_1^2}-\frac{(x_2-\mu_2)^2}{2\sigma_2^2}\right)$$

Defining

$$r_1=\sigma_1\sqrt{2\log\left(\frac{1}{2\pi\sigma_1\sigma_2 c}\right)},\qquad r_2=\sigma_2\sqrt{2\log\left(\frac{1}{2\pi\sigma_1\sigma_2 c}\right)}$$

it follows that

$$\frac{(x_1-\mu_1)^2}{r_1^2}+\frac{(x_2-\mu_2)^2}{r_2^2}=1$$

This is the equation of an ellipse, where $r_1$ and $r_2$ are the major/minor semi-radii.
The equation can be solved for $x_2$ as

$$x_2=\mu_2\pm r_2\sqrt{1-\frac{(x_1-\mu_1)^2}{r_1^2}}$$

However, written this way it becomes a function of $x_1$ that is only valid for plotting half of the ellipse at a time; the square root forces a choice of sign.
Also note that

$$r_1\propto\sigma_1,\qquad r_2\propto\sigma_2$$

which means the singular values of the covariance matrix (here $\sigma_1^2$ and $\sigma_2^2$) determine the shape of the distribution. This also explains why we use $\sigma$ in the notation: it is essentially a standard deviation.
Obviously, an explicit function is not the best tool for plotting an ellipse. We have another tool, the parametric function, specially designed for graphs which can't be conveniently plotted as a single function:

$$x_1=\mu_1+r_1\cos\theta,\qquad x_2=\mu_2+r_2\sin\theta$$

where $\theta\in[0,2\pi]$.
Then let's plot both the ellipse and random draws of the corresponding bivariate normal distribution.
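A sketch of this plot follows; the means, standard deviations, and contour height $c$ are illustrative assumptions.

```python
import numpy as np
import scipy as sp
import scipy.stats
import matplotlib.pyplot as plt

mu1, mu2, s1, s2 = 0.0, 0.0, 2.0, 1.0      # assumed parameters
c = 0.02                                    # assumed contour height

# Semi-radii from the derivation above
r = np.sqrt(2 * np.log(1 / (2 * np.pi * s1 * s2 * c)))
r1, r2 = s1 * r, s2 * r

# Parametric ellipse
theta = np.linspace(0, 2 * np.pi, 200)
x1 = mu1 + r1 * np.cos(theta)
x2 = mu2 + r2 * np.sin(theta)

# Random draws from the corresponding bivariate normal
samples = sp.stats.multivariate_normal.rvs(
    mean=[mu1, mu2], cov=np.diag([s1**2, s2**2]), size=500)

plt.scatter(samples[:, 0], samples[:, 1], s=5, alpha=0.5)
plt.plot(x1, x2, 'r')
plt.show()
```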
As a side note, some people might prefer using a heatmap for the distribution.
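For instance, a minimal heatmap sketch with plt.hist2d (sample size and bin count are arbitrary choices):

```python
import numpy as np
import scipy as sp
import scipy.stats
import matplotlib.pyplot as plt

samples = sp.stats.multivariate_normal.rvs(
    mean=[0, 0], cov=np.diag([4, 1]), size=5000)  # assumed parameters

plt.hist2d(samples[:, 0], samples[:, 1], bins=50)  # 2-D histogram as a heatmap
plt.colorbar()
plt.show()
```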
Covariance Matrix with Nonzero Covariance
We have mostly seen cases where covariance matrices are diagonal; what if they are symmetric but not diagonal? Let's take a look at the graph.
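A sketch of such a graph is below; the covariance matrix values are illustrative assumptions (symmetric, with positive determinant).

```python
import numpy as np
import scipy as sp
import scipy.stats
import matplotlib.pyplot as plt

# Assumed symmetric covariance matrix with nonzero off-diagonal entries
Sigma = np.array([[2.0, 1.2],
                  [1.2, 1.0]])
samples = sp.stats.multivariate_normal.rvs(mean=[0, 0], cov=Sigma, size=1000)

plt.scatter(samples[:, 0], samples[:, 1], s=5, alpha=0.5)
plt.axis('equal')   # equal aspect ratio makes the tilt visible
plt.show()
```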
It is clear that the covariance decides the rotation angle. Recall the rotation matrix, a linear transformation operator:

$$R=\begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}$$

The rotation matrix is closely connected with the covariance matrix.
Next we will plot circles with parametric functions, then transform them by the covariance matrix.
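A minimal sketch of that transformation is below, using the same assumed covariance matrix as above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Unit circle drawn parametrically: 2 x 200 array of points
theta = np.linspace(0, 2 * np.pi, 200)
circle = np.vstack((np.cos(theta), np.sin(theta)))

Sigma = np.array([[2.0, 1.2],
                  [1.2, 1.0]])     # assumed covariance matrix
transformed = Sigma @ circle       # transform every point on the circle

plt.plot(circle[0], circle[1], label='unit circle')
plt.plot(transformed[0], transformed[1], label='after $\\Sigma$')
plt.axis('equal')
plt.legend()
plt.show()
```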
We can see that the covariance matrix functions like a rotation matrix.
Quadratic Form of Normal Distribution
If $\mathbf{x}$ is a random normal vector, the linear transformation $A\mathbf{x}$ is also normally distributed, but we would like to know more about its parameters. Taking the expectation and variance respectively,

$$E[A\mathbf{x}]=A\,E[\mathbf{x}]=A\boldsymbol{\mu}$$

$$\operatorname{Var}(A\mathbf{x})=A\operatorname{Var}(\mathbf{x})A^T=A\Sigma A^T$$

where $A$ is a deterministic matrix.

If $\mathbf{x}\sim N(\boldsymbol{\mu},\Sigma)$, then $A\mathbf{x}\sim N(A\boldsymbol{\mu},A\Sigma A^T)$.
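A Monte Carlo sketch of this result is below; $A$, $\boldsymbol{\mu}$, and $\Sigma$ are illustrative assumptions.

```python
import numpy as np
import scipy as sp
import scipy.stats

mu = np.array([1.0, -2.0])                 # assumed mean vector
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])             # assumed covariance matrix
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])                 # assumed deterministic matrix

x = sp.stats.multivariate_normal.rvs(mean=mu, cov=Sigma, size=200_000)
y = x @ A.T                                # y = A x, applied to each sample row

print(y.mean(axis=0), A @ mu)              # sample mean vs. theoretical A mu
print(np.cov(y.T))                         # sample covariance of y
print(A @ Sigma @ A.T)                     # theoretical A Sigma A^T
```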