Copyright 2018 The TensorFlow Probability Authors.
Licensed under the Apache License, Version 2.0 (the "License");
A [copula](https://en.wikipedia.org/wiki/Copula_(probability_theory)) is a classical approach for capturing the dependence between random variables. More formally, a copula is a multivariate distribution $C(U_1, U_2, \ldots, U_n)$ such that marginalizing gives $U_i \sim \text{Uniform}(0, 1)$.
Copulas are interesting because we can use them to create multivariate distributions with arbitrary marginals. This is the recipe:
1. The Probability Integral Transform turns an arbitrary continuous R.V. $X$ into a uniform one, $F_X(X)$, where $F_X$ is the CDF of $X$.
2. Given a copula (say bivariate) $C(U, V)$, $U$ and $V$ have uniform marginal distributions.
3. Now, given our R.V.s of interest $X, Y$, create a new distribution $C'(X, Y) = C(F_X(X), F_Y(Y))$. The marginals for $X$ and $Y$ are the ones we desired.
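The first step of the recipe can be checked numerically. The sketch below uses only Python's standard library (not TensorFlow Probability), and the Exponential distribution with rate 2 is an arbitrary choice made for illustration: pushing draws of $X$ through their own CDF should give values that look Uniform(0, 1), with mean close to 1/2 and variance close to 1/12.

```python
import math
import random

random.seed(0)

def exponential_cdf(x, rate=2.0):
    """CDF of an Exponential(rate) random variable: F_X(x) = 1 - exp(-rate * x)."""
    return 1.0 - math.exp(-rate * x)

# Draw X ~ Exponential(2), then push each draw through its own CDF.
xs = [random.expovariate(2.0) for _ in range(20000)]
us = [exponential_cdf(x) for x in xs]

# If U = F_X(X) is Uniform(0, 1), its mean is 1/2 and its variance is 1/12.
mean_u = sum(us) / len(us)
var_u = sum((u - mean_u) ** 2 for u in us) / len(us)
print(mean_u, var_u)
```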
Marginals are univariate and thus may be easier to measure and/or model. A copula enables starting from marginals yet also achieving arbitrary correlation between dimensions.
Gaussian Copula
To illustrate how copulas are constructed, consider the case of capturing dependence according to multivariate Gaussian correlations. A Gaussian Copula is one given by $C(u_1, u_2, \ldots, u_n) = \Phi_\Sigma(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \ldots, \Phi^{-1}(u_n))$, where $\Phi_\Sigma$ represents the CDF of a MultivariateNormal with covariance $\Sigma$ and mean 0, and $\Phi^{-1}$ is the inverse CDF of the standard normal.
Applying the normal's inverse CDF warps the uniform dimensions to be normally distributed. Applying the multivariate normal's CDF then squashes the distribution to be marginally uniform and with Gaussian correlations.
Thus, the Gaussian Copula is a distribution over the unit hypercube $[0, 1]^n$ with uniform marginals.
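A minimal sampling sketch of the bivariate case, again in plain Python rather than TensorFlow Probability: draw correlated standard normals, then push each coordinate through the standard normal CDF. The function name `sample_gaussian_copula` and the value `rho=0.8` are choices made for this illustration only.

```python
import math
import random
from statistics import NormalDist

random.seed(0)
std_normal = NormalDist()  # standard normal; provides cdf() and inv_cdf()

def sample_gaussian_copula(rho, n):
    """Draw n points (u1, u2) from a bivariate Gaussian copula with correlation rho."""
    samples = []
    for _ in range(n):
        # Independent standard normals.
        e1, e2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
        # Correlate them: z1 and z2 have unit variance and correlation rho.
        z1 = e1
        z2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * e2
        # Push each coordinate through the standard normal CDF.
        samples.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return samples

def pearson(a, b):
    """Sample Pearson correlation of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

copula_samples = sample_gaussian_copula(rho=0.8, n=20000)
u1 = [u for u, _ in copula_samples]
u2 = [v for _, v in copula_samples]
print(sum(u1) / len(u1), sum(u2) / len(u2), pearson(u1, u2))
```

Each coordinate should have mean near 1/2, as a uniform marginal requires, while the two coordinates remain strongly positively correlated.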
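Defined as such, the Gaussian Copula can be implemented with `tfd.TransformedDistribution` and an appropriate `Bijector`. That is, we transform a `MultivariateNormal` by applying the `Normal` distribution's CDF to each coordinate, implemented by the `tfb.NormalCDF` bijector. The bijector's inverse direction is the `Normal` inverse CDF, which is what maps a point of the unit hypercube back to normal space when evaluating densities.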
Below, we implement a Gaussian Copula with one simplifying assumption: that the covariance is parameterized by a Cholesky factor (hence a covariance for `MultivariateNormalTriL`). (One could use other `tf.linalg.LinearOperators` to encode different matrix-free assumptions.)
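As a plain-Python illustration of what such a transformed distribution computes (a sketch, not the TensorFlow Probability implementation), the copula log-density follows from the change-of-variables formula: map $u$ back to normal space with the inverse CDF, evaluate the multivariate normal log-density there, and subtract the log-densities of the standard normal at each coordinate. The function name `gaussian_copula_log_prob` is hypothetical, and only the 2x2 case is handled. The result is a copula density in the strict sense only when $L L^\top$ has unit diagonal.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()

def gaussian_copula_log_prob(u, scale_tril):
    """Log-density at u = (u1, u2) in (0, 1)^2 for a 2x2 Cholesky factor.

    scale_tril is lower triangular: [[l11, 0], [l21, l22]] with l11, l22 > 0.
    """
    (l11, _), (l21, l22) = scale_tril
    # Inverse transform: map the uniforms back to normal space.
    z1, z2 = std_normal.inv_cdf(u[0]), std_normal.inv_cdf(u[1])
    # Solve L w = z by forward substitution.
    w1 = z1 / l11
    w2 = (z2 - l21 * w1) / l22
    # log N(z; 0, L L^T) for a 2-dimensional normal.
    log_mvn = -math.log(2.0 * math.pi) - math.log(l11 * l22) - 0.5 * (w1 ** 2 + w2 ** 2)
    # Sum of standard normal log-densities; this is minus the log-Jacobian of z(u).
    log_phi = sum(-0.5 * math.log(2.0 * math.pi) - 0.5 * z * z for z in (z1, z2))
    return log_mvn - log_phi

rho = 0.8
scale_tril = [[1.0, 0.0], [rho, math.sqrt(1.0 - rho ** 2)]]
print(gaussian_copula_log_prob((0.9, 0.9), scale_tril))
print(gaussian_copula_log_prob((0.9, 0.1), scale_tril))
```

With an identity factor the log-density is zero everywhere, which is the independent (uniform) copula. With positive correlation, points where both coordinates are large receive more density than points where they disagree.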
The power of such a model, however, comes from combining it with the Probability Integral Transform, which lets us use the copula on arbitrary R.V.s. In this way, we can specify arbitrary marginals and use the copula to stitch them together.
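We start with a model:

$$\begin{align*}
X &\sim \text{Kumaraswamy}(a, b) \\
Y &\sim \text{Gumbel}(\mu, \beta)
\end{align*}$$

and use the copula to get a bivariate R.V. $Z$, which has marginals Kumaraswamy and Gumbel.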
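For reference, both marginals have closed-form CDFs and inverse CDFs, which is what makes them convenient to use with the Probability Integral Transform. The parameter names $a, b$ (Kumaraswamy shape parameters) and $\mu, \beta$ (Gumbel location and scale) are the conventional ones and are assumed here:

```latex
\begin{align*}
F_X(x) &= 1 - (1 - x^a)^b, & F_X^{-1}(u) &= \left(1 - (1 - u)^{1/b}\right)^{1/a}, & 0 < x < 1, \\
F_Y(y) &= \exp\!\left(-e^{-(y - \mu)/\beta}\right), & F_Y^{-1}(u) &= \mu - \beta \ln(-\ln u), & y \in \mathbb{R}.
\end{align*}
```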
We'll start by plotting the product distribution generated by those two R.V.s. This serves as a comparison point for when we apply the copula.
Joint Distribution with Different Marginals
Now we use a Gaussian copula to couple the distributions together, and plot that. Again, our tool of choice is `TransformedDistribution`, applying the appropriate `Bijector` to obtain the chosen marginals.
Specifically, we use a `Blockwise` bijector, which applies different bijectors to different parts of the vector (and is still a bijective transformation).
Now we can define the Copula we want. Given a list of target marginals (encoded as bijectors), we can easily construct a new distribution that uses the copula and has the specified marginals.
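The construction can be sketched without TensorFlow Probability by treating each target marginal as an inverse CDF and applying a different one to each coordinate of a copula sample, which plays the role of the blockwise transformation. All names and parameter values below (`sample_coupled`, Kumaraswamy(2, 5), Gumbel(0, 1), correlation 0.8) are hypothetical choices for this illustration, and only the bivariate case is handled.

```python
import math
import random
from statistics import NormalDist

random.seed(0)
std_normal = NormalDist()
EPS = 1e-12  # keeps uniforms strictly inside (0, 1) so the logs below stay finite

def kumaraswamy_quantile(u, a=2.0, b=5.0):
    """Inverse CDF of Kumaraswamy(a, b); the default parameters are hypothetical."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)

def gumbel_quantile(u, loc=0.0, scale=1.0):
    """Inverse CDF of Gumbel(loc, scale); the default parameters are hypothetical."""
    return loc - scale * math.log(-math.log(u))

def sample_coupled(rho, quantiles, n):
    """Draw n pairs whose i-th coordinate follows the marginal given by quantiles[i]."""
    pairs = []
    for _ in range(n):
        e1, e2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
        # Correlated standard normals with correlation rho.
        z = (e1, rho * e1 + math.sqrt(1.0 - rho ** 2) * e2)
        # Gaussian copula sample: uniform marginals, Gaussian dependence.
        u = [min(max(std_normal.cdf(zi), EPS), 1.0 - EPS) for zi in z]
        # A different inverse CDF for each coordinate.
        pairs.append(tuple(q(ui) for q, ui in zip(quantiles, u)))
    return pairs

coupled = sample_coupled(0.8, [kumaraswamy_quantile, gumbel_quantile], n=20000)
print(coupled[:3])
```

Passing the marginals as a list keeps the dependence structure (the copula) separate from the marginals, so either can be swapped without touching the other.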
Now let's actually use this Gaussian Copula. We'll use a Cholesky factor of $\begin{bmatrix}1 & 0\\ \rho & \sqrt{1-\rho^2}\end{bmatrix}$, which corresponds to variances 1 and correlation $\rho$ for the multivariate normal.
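To see why this factor gives unit variances and correlation $\rho$, write $L$ for the lower-triangular matrix with rows $(1, 0)$ and $(\rho, \sqrt{1-\rho^2})$ and multiply it by its transpose:

```latex
\begin{align*}
L L^\top
&= \begin{bmatrix} 1 & 0 \\ \rho & \sqrt{1-\rho^2} \end{bmatrix}
   \begin{bmatrix} 1 & \rho \\ 0 & \sqrt{1-\rho^2} \end{bmatrix} \\
&= \begin{bmatrix} 1 & \rho \\ \rho & \rho^2 + (1-\rho^2) \end{bmatrix}
 = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}.
\end{align*}
```

The diagonal entries are the variances, both equal to 1, so the off-diagonal covariance $\rho$ is also the correlation.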
We'll look at a few cases:
Finally, let's verify that we actually get the marginals we want.
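One simple way to check marginals, sketched here in plain Python under assumed parameters (Kumaraswamy(2, 5), Gumbel(0, 1), correlation -0.8), is to compare the empirical CDF of each coordinate against the target CDF using the Kolmogorov-Smirnov statistic. The helper `ks_statistic` is written for this illustration. A small statistic means the coupled samples still follow the intended marginals.

```python
import math
import random
from statistics import NormalDist

random.seed(1)
std_normal = NormalDist()
rho, n, eps = -0.8, 5000, 1e-12

def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of samples and a reference cdf."""
    ordered = sorted(samples)
    m = len(ordered)
    return max(
        max((i + 1) / m - cdf(x), cdf(x) - i / m) for i, x in enumerate(ordered)
    )

def kumaraswamy_cdf(x, a=2.0, b=5.0):
    return 1.0 - (1.0 - x ** a) ** b

def gumbel_cdf(y, loc=0.0, scale=1.0):
    return math.exp(-math.exp(-(y - loc) / scale))

xs, ys = [], []
for _ in range(n):
    e1, e2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    # Gaussian copula sample with correlation rho, clamped away from 0 and 1.
    u1 = min(max(std_normal.cdf(e1), eps), 1.0 - eps)
    u2 = min(max(std_normal.cdf(rho * e1 + math.sqrt(1.0 - rho ** 2) * e2), eps), 1.0 - eps)
    # Inverse CDFs of Kumaraswamy(2, 5) and Gumbel(0, 1).
    xs.append((1.0 - (1.0 - u1) ** (1.0 / 5.0)) ** (1.0 / 2.0))
    ys.append(-math.log(-math.log(u2)))

ks_x = ks_statistic(xs, kumaraswamy_cdf)
ks_y = ks_statistic(ys, gumbel_cdf)
print(ks_x, ks_y)
```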
Conclusion
And there we go! We've demonstrated that we can construct Gaussian Copulas using the `Bijector` API.
More generally, writing bijectors using the `Bijector` API and composing them with a distribution can create rich families of distributions for flexible modelling.