Bayesian Inference with Markov Chain Monte Carlo (MCMC)
Introduction
Bayesian inference provides a principled framework for updating our beliefs about unknown parameters in light of observed data. Unlike frequentist approaches that treat parameters as fixed but unknown quantities, Bayesian inference treats parameters as random variables with associated probability distributions.
Theoretical Foundation
Bayes' Theorem
The cornerstone of Bayesian inference is Bayes' theorem, which relates the posterior distribution to the prior and likelihood:

$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}$$

where:

$P(\theta \mid D)$ is the posterior distribution — our updated belief about the parameter $\theta$ after observing data $D$
$P(D \mid \theta)$ is the likelihood — the probability of observing the data given the parameter
$P(\theta)$ is the prior distribution — our initial belief about the parameter before seeing data
$P(D)$ is the marginal likelihood or evidence, the normalizing constant
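As a concrete illustration (a standard conjugate example, not the model used later in this notebook), consider estimating the bias $\theta$ of a coin from $k$ heads in $n$ flips with a $\mathrm{Beta}(\alpha, \beta)$ prior:

$$P(\theta \mid k, n) \propto \underbrace{\theta^{k}(1-\theta)^{n-k}}_{\text{likelihood}}\; \underbrace{\theta^{\alpha-1}(1-\theta)^{\beta-1}}_{\text{prior}} \;\;\Rightarrow\;\; \theta \mid k, n \sim \mathrm{Beta}(\alpha + k,\; \beta + n - k)$$

Such closed-form posteriors are the exception; for most models the posterior has no analytic form, which motivates the sampling methods below.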
The Computational Challenge
The marginal likelihood often involves intractable integrals, especially in high-dimensional parameter spaces:

$$P(D) = \int P(D \mid \theta)\, P(\theta)\, d\theta$$
This is where Markov Chain Monte Carlo (MCMC) methods become essential — they allow us to sample from the posterior distribution without explicitly computing the normalizing constant.
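Concretely, the acceptance decisions made by the samplers introduced below depend only on ratios of posterior densities, in which the evidence $P(D)$ cancels:

$$\frac{P(\theta^* \mid D)}{P(\theta_t \mid D)} = \frac{P(D \mid \theta^*)\, P(\theta^*)\, /\, P(D)}{P(D \mid \theta_t)\, P(\theta_t)\, /\, P(D)} = \frac{P(D \mid \theta^*)\, P(\theta^*)}{P(D \mid \theta_t)\, P(\theta_t)}$$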
MCMC: The Metropolis-Hastings Algorithm
Core Idea
MCMC constructs a Markov chain whose stationary distribution is the target posterior distribution $P(\theta \mid D)$. The Metropolis-Hastings algorithm is a general-purpose MCMC method.
Algorithm
Given current state $\theta_t$:

Propose a candidate state $\theta^*$ from the proposal distribution $q(\theta^* \mid \theta_t)$
Compute the acceptance probability:

$$\alpha = \min\left(1,\; \frac{P(\theta^* \mid D)\, q(\theta_t \mid \theta^*)}{P(\theta_t \mid D)\, q(\theta^* \mid \theta_t)}\right)$$

Accept or Reject:
Draw $u \sim \mathrm{Uniform}(0, 1)$
If $u \le \alpha$, set $\theta_{t+1} = \theta^*$ (accept)
Otherwise, set $\theta_{t+1} = \theta_t$ (reject)
Random Walk Metropolis
A common choice is the symmetric random walk proposal:

$$\theta^* = \theta_t + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma_{\text{prop}}^2)$$

Since $q(\theta^* \mid \theta_t) = q(\theta_t \mid \theta^*)$, the acceptance ratio simplifies to:

$$\alpha = \min\left(1,\; \frac{P(\theta^* \mid D)}{P(\theta_t \mid D)}\right)$$
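A minimal NumPy sketch of a random-walk Metropolis sampler, assuming a user-supplied unnormalized log-posterior; the function name, its arguments, and the use of log densities are illustrative choices here:

```python
import numpy as np

def random_walk_metropolis(log_post, init, n_samples, step_size, rng=None):
    """Random-walk Metropolis over an unnormalized log-posterior.

    log_post  : callable returning log P(theta | D) up to an additive constant
    init      : starting parameter vector
    n_samples : number of MCMC iterations
    step_size : standard deviation of the Gaussian proposal
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.atleast_1d(np.asarray(init, dtype=float))
    log_p = log_post(theta)
    samples = np.empty((n_samples, theta.size))
    n_accept = 0

    for i in range(n_samples):
        # Symmetric Gaussian proposal centred at the current state
        proposal = theta + step_size * rng.standard_normal(theta.size)
        log_p_prop = log_post(proposal)

        # Accept with probability min(1, P(proposal|D) / P(current|D)),
        # evaluated in log space for numerical stability
        if np.log(rng.uniform()) < log_p_prop - log_p:
            theta, log_p = proposal, log_p_prop
            n_accept += 1
        samples[i] = theta

    return samples, n_accept / n_samples
```

A common rule of thumb is to tune step_size so the acceptance rate lands roughly in the 20–50% range; rates near 0% or 100% usually indicate poor mixing.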
Convergence Properties
Under mild regularity conditions, the Markov chain satisfies:
Irreducibility: Any state can be reached from any other state
Aperiodicity: The chain does not cycle deterministically
Ergodicity: Time averages converge to ensemble averages
Practical Example: Inferring Parameters of a Normal Distribution
We will demonstrate Bayesian inference by estimating the mean and standard deviation of a normal distribution from observed data.
Model Specification
Likelihood:

$$x_i \mid \mu, \sigma \sim \mathcal{N}(\mu, \sigma^2), \qquad i = 1, \dots, N$$

Priors: weakly informative priors $p(\mu)$ and $p(\sigma)$, with $p(\sigma)$ supported on $\sigma > 0$

Joint Likelihood:

$$P(x_{1:N} \mid \mu, \sigma) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$$

Log-posterior (up to a constant):

$$\log P(\mu, \sigma \mid x_{1:N}) = -N \log \sigma \;-\; \frac{1}{2\sigma^2} \sum_{i=1}^{N} (x_i - \mu)^2 \;+\; \log p(\mu) \;+\; \log p(\sigma) \;+\; \text{const}$$
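Continuing the sketch above, one possible log-posterior and sampling run might look as follows. The synthetic data, the $\mathcal{N}(0, 10^2)$ prior on $\mu$, the half-normal prior on $\sigma$, and the tuning values are illustrative assumptions, and the sampler is the random_walk_metropolis sketch defined earlier:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic data; the "true" parameters exist only to generate this example
true_mu, true_sigma = 2.0, 1.5
data = rng.normal(true_mu, true_sigma, size=100)

def log_posterior(params, x=data):
    """Unnormalized log-posterior for (mu, sigma) under the normal model."""
    mu, sigma = params
    if sigma <= 0:
        return -np.inf  # respect the positivity constraint on sigma
    # Log-likelihood: sum of normal log-densities
    log_lik = np.sum(stats.norm.logpdf(x, loc=mu, scale=sigma))
    # Placeholder weakly informative priors (assumed here for illustration)
    log_prior = (stats.norm.logpdf(mu, loc=0.0, scale=10.0)
                 + stats.halfnorm.logpdf(sigma, scale=10.0))
    return log_lik + log_prior

# Draw samples with the random-walk Metropolis sketch from above
samples, accept_rate = random_walk_metropolis(
    log_posterior, init=[0.0, 1.0], n_samples=20_000, step_size=0.2
)
burn_in = 5_000
posterior = samples[burn_in:]
print(f"acceptance rate: {accept_rate:.2f}")
print("posterior means (mu, sigma):", posterior.mean(axis=0))
```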
Diagnostics and Convergence Assessment
Effective Sample Size (ESS)
The effective sample size accounts for autocorrelation in the MCMC chain:

$$\text{ESS} = \frac{N}{1 + 2 \sum_{k=1}^{\infty} \rho_k}$$

where $\rho_k$ is the autocorrelation at lag $k$.
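A rough, self-contained ESS estimate can be obtained by truncating the autocorrelation sum at the first non-positive lag; this is one common heuristic, and libraries such as ArviZ provide more careful estimators:

```python
import numpy as np

def effective_sample_size(chain):
    """Estimate ESS for a 1-D chain as N / (1 + 2 * sum of autocorrelations).

    The sum is truncated at the first lag with non-positive autocorrelation,
    a simple (not the only) truncation rule.
    """
    chain = np.asarray(chain, dtype=float)
    n = chain.size
    centered = chain - chain.mean()
    # Autocorrelation at lags 0..n-1, normalized by the lag-0 value
    acf = np.correlate(centered, centered, mode="full")[n - 1:]
    acf = acf / acf[0]
    tau = 1.0
    for k in range(1, n):
        if acf[k] <= 0:
            break
        tau += 2.0 * acf[k]
    return n / tau

# Example: ESS of the mu component of the chain drawn above
# print(effective_sample_size(posterior[:, 0]))
```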
Posterior Predictive Check
A crucial validation step is comparing the posterior predictive distribution with the observed data:

$$P(\tilde{x} \mid x_{1:N}) = \int P(\tilde{x} \mid \theta)\, P(\theta \mid x_{1:N})\, d\theta$$
We approximate this by sampling from the posterior and generating predictions.
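A sketch of this check using the posterior draws from the run above: for each sampled $(\mu, \sigma)$ we simulate a replicated data set and compare a summary statistic against the observed value; the sample mean is an illustrative choice of statistic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Replicated data sets: for each selected posterior draw (mu, sigma),
# simulate a data set of the same size as the observed one
n_rep = 500
idx = rng.choice(len(posterior), size=n_rep, replace=False)
replicates = np.array([
    rng.normal(mu, sigma, size=len(data)) for mu, sigma in posterior[idx]
])

# Compare the test statistic (sample mean) between replicates and observed data
obs_stat = data.mean()
rep_stats = replicates.mean(axis=1)
ppp = np.mean(rep_stats >= obs_stat)  # posterior predictive p-value
print(f"observed mean: {obs_stat:.3f}")
print(f"posterior predictive p-value (mean statistic): {ppp:.2f}")
```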
Conclusion
This notebook demonstrated the fundamentals of Bayesian inference using Markov Chain Monte Carlo (MCMC):
Bayes' theorem provides the theoretical foundation for updating beliefs given data
Metropolis-Hastings algorithm enables sampling from complex posterior distributions
Convergence diagnostics (trace plots, ESS, autocorrelation) ensure reliable inference
Posterior predictive checks validate model fit
Key Takeaways
MCMC methods bypass the need to compute intractable normalizing constants
Proposal distribution tuning affects acceptance rate and mixing efficiency
Discarding an initial burn-in period removes samples drawn before the chain has converged
The posterior distribution fully characterizes parameter uncertainty
Further Reading
Gelman, A., et al. (2013). Bayesian Data Analysis, 3rd ed.
Brooks, S., et al. (2011). Handbook of Markov Chain Monte Carlo
Robert, C. & Casella, G. (2004). Monte Carlo Statistical Methods