Path: blob/master/site/en-snapshot/probability/examples/Bayesian_Switchpoint_Analysis.ipynb
25118 views
Copyright 2019 The TensorFlow Probability Authors.
Licensed under the Apache License, Version 2.0 (the "License");
Bayesian Switchpoint Analysis
This notebook reimplements and extends the Bayesian “Change point analysis” example from the pymc3 documentation.
Prerequisites
Dataset
The dataset is from here. Note, there is another version of this example floating around, but it has “missing” data – in which case you’d need to impute missing values. (Otherwise your model will not ever leave its initial parameters because the likelihood function will be undefined.)
Probabilistic Model
The model assumes a “switch point” (e.g. a year during which safety regulations changed), and Poisson-distributed disaster rate with constant (but potentially different) rates before and after that switch point.
The actual disaster count is fixed (observed); any sample of this model will need to specify both the switchpoint and the “early” and “late” rate of disasters.
Original model from pymc3 documentation example:
However, the mean disaster rate has a discontinuity at the switchpoint , which makes it not differentiable. Thus it provides no gradient signal to the Hamiltonian Monte Carlo (HMC) algorithm – but because the prior is continuous, HMC’s fallback to a random walk is good enough to find the areas of high probability mass in this example.
As a second model, we modify the original model using a sigmoid “switch” between e and l to make the transition differentiable, and use a continuous uniform distribution for the switchpoint . (One could argue this model is more true to reality, as a “switch” in mean rate would likely be stretched out over multiple years.) The new model is thus:
In the absence of more information we assume as parameters for the priors. We’ll run both models and compare their inference results.
The above code defines the model via JointDistributionSequential distributions. The disaster_rate
functions are called with an array of [0, ..., len(years)-1]
to produce a vector of len(years)
random variables – the years before the switchpoint
are early_disaster_rate
, the ones after late_disaster_rate
(modulo the sigmoid transition).
Here is a sanity-check that the target log prob function is sane:
HMC to do Bayesian inference
We define the number of results and burn-in steps required; the code is mostly modeled after the documentation of tfp.mcmc.HamiltonianMonteCarlo. It uses an adaptive step size (otherwise the outcome is very sensitive to the step size value chosen). We use values of one as the initial state of the chain.
This is not the full story though. If you go back to the model definition above, you’ll note that some of the probability distributions are not well-defined on the whole real number line. Therefore we constrain the space that HMC shall examine by wrapping the HMC kernel with a TransformedTransitionKernel that specifies the forward bijectors to transform the real numbers onto the domain that the probability distribution is defined on (see comments in the code below).
Run both models in parallel:
Visualize the result
We visualize the result as histograms of samples of the posterior distribution for the early and late disaster rate, as well as the switchpoint. The histograms are overlaid with a solid line representing the sample median, as well as the 95%ile credible interval bounds as dashed lines.