We consider the eczema medical trial data set again. This time we will compare 2 models to see which one explains the observed data better.
Model 1: All studies have the same probability of success.
Model 2: A hierarchical model where the probability of success in each study is drawn from a beta prior distribution with unknown parameters $\alpha$ and $\beta$.
| Study | Treatment group | Control group |
|---|---|---|
| Di Rienzo 2014 | 20 / 23 | 9 / 15 |
| Galli 1994 | 10 / 16 | 11 / 18 |
| Kaufman 1974 | 13 / 16 | 4 / 10 |
| Qin 2014 | 35 / 45 | 21 / 39 |
| Sanchez 2012 | 22 / 31 | 12 / 29 |
| Silny 2006 | 7 / 10 | 0 / 10 |
| Totals | 107 / 141 | 57 / 121 |
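For the calculations that follow it can help to have the table available as R objects. The snippet below is only a sketch: the names `studies`, `treatment` and `control` are illustrative choices, and each table entry is read as (recoveries) / (patients).

```r
# Successes (s) and group sizes (n) per study, copied from the table above.
studies <- c("Di Rienzo 2014", "Galli 1994", "Kaufman 1974",
             "Qin 2014", "Sanchez 2012", "Silny 2006")

treatment <- data.frame(s = c(20, 10, 13, 35, 22, 7),
                        n = c(23, 16, 16, 45, 31, 10),
                        row.names = studies)
control <- data.frame(s = c(9, 11, 4, 21, 12, 0),
                      n = c(15, 18, 10, 39, 29, 10),
                      row.names = studies)

# Failures are the patients who did not recover.
treatment$f <- treatment$n - treatment$s
control$f <- control$n - control$s

# Column sums should reproduce the "Totals" row of the table.
print(colSums(treatment))
print(colSums(control))
```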
Model 1:
For each group (treatment and control), all 6 studies have the same fixed, but unknown, probability of success, $\theta_t, \theta_c \in [0, 1]$.
The data follow a binomial distribution in each study, conditioned on the probability of success: $\theta_t$ for treatment or $\theta_c$ for control.
The priors over $\theta_t$ and $\theta_c$ are uniform.
These assumptions lead to the following model.
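Likelihood: $\prod_{i=1}^{6} \text{Binomial}(s_i \,|\, \theta, n_i)$, where $s_i$ is the number of successful recoveries, $f_i$ is the number of failures (did not recover), and $n_i = s_i + f_i$ is the number of patients in study $i$.
Prior: $\text{Beta}(\theta \,|\, 1, 1)$ for both $\theta_t$ and $\theta_c$.
Posterior for treatment group: $\text{Beta}(\theta_t \,|\, 108, 35)$.
Posterior for control group: $\text{Beta}(\theta_c \,|\, 58, 65)$.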
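The posterior parameters follow from the conjugate update: add the total number of successes to the first prior parameter and the total number of failures to the second. A minimal check in R, assuming the totals from the table and a Beta(1, 1) prior (the function name `posterior_params` is illustrative):

```r
# Conjugate beta-binomial update: prior Beta(a0, b0) plus pooled counts.
posterior_params <- function(total_s, total_n, a0 = 1, b0 = 1) {
  c(alpha = a0 + total_s, beta = b0 + (total_n - total_s))
}

post_treatment <- posterior_params(107, 141)
post_control <- posterior_params(57, 121)
print(post_treatment)
print(post_control)

# Posterior mean and a central 95% interval for each group.
post_summary <- function(p) {
  c(mean = unname(p["alpha"] / (p["alpha"] + p["beta"])),
    lower = qbeta(0.025, p["alpha"], p["beta"]),
    upper = qbeta(0.975, p["alpha"], p["beta"]))
}
print(post_summary(post_treatment))
print(post_summary(post_control))
```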
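Since we have closed-form solutions for the posteriors, we can calculate the marginal likelihood by rearranging Bayes' equation: (marginal likelihood) = (likelihood) x (prior) / (posterior).

$$P(\text{data}) = \frac{\left[\prod_{i=1}^{6} \binom{n_i}{s_i}\, \theta^{s_i} (1-\theta)^{f_i}\right] \left[\frac{1}{B(\alpha_0, \beta_0)}\, \theta^{\alpha_0 - 1} (1-\theta)^{\beta_0 - 1}\right]}{\frac{1}{B(\alpha_1, \beta_1)}\, \theta^{\alpha_1 - 1} (1-\theta)^{\beta_1 - 1}}$$

where $\alpha_0$ and $\beta_0$ are the parameters of the prior, and $\alpha_1 = \alpha_0 + \sum_i s_i$ and $\beta_1 = \beta_0 + \sum_i f_i$ are the parameters of the posterior beta distribution.
Since all factors involving $\theta$ cancel out, we are just left with the normalization constants of the likelihood, the prior and the posterior:

$$P(\text{data}) = \left[\prod_{i=1}^{6} \binom{n_i}{s_i}\right] \frac{B(\alpha_1, \beta_1)}{B(\alpha_0, \beta_0)}$$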
We usually compute the log of the marginal likelihood since the results can vary over many orders of magnitude.
Task 1:
Take the log of the marginal likelihood above.
Complete the R function below to calculate the log marginal likelihood.
You can use the built-in function `lbeta(a, b)` to compute $\log B(a, b)$.
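One possible way to complete Task 1 is sketched below. Taking the log of the Model 1 marginal likelihood gives $\log P(\text{data}) = \sum_i \log \binom{n_i}{s_i} + \log B(\alpha_1, \beta_1) - \log B(\alpha_0, \beta_0)$. The function name and argument names are illustrative, not the ones from the original exercise stub, and `lchoose` is used for the log binomial coefficients.

```r
# Log marginal likelihood of Model 1 for one group (treatment or control).
# s, n: vectors of successes and patients per study.
# alpha0, beta0: parameters of the beta prior (uniform prior by default).
log_marginal_likelihood_model1 <- function(s, n, alpha0 = 1, beta0 = 1) {
  f <- n - s
  alpha1 <- alpha0 + sum(s)  # posterior parameters from the conjugate update
  beta1 <- beta0 + sum(f)
  sum(lchoose(n, s)) + lbeta(alpha1, beta1) - lbeta(alpha0, beta0)
}

# Usage with the counts from the table.
lml_treatment <- log_marginal_likelihood_model1(
  s = c(20, 10, 13, 35, 22, 7), n = c(23, 16, 16, 45, 31, 10))
lml_control <- log_marginal_likelihood_model1(
  s = c(9, 11, 4, 21, 12, 0), n = c(15, 18, 10, 39, 29, 10))
print(c(treatment = lml_treatment, control = lml_control))
```

A quick sanity check: with a uniform prior and a single study of $n$ patients, every outcome $s \in \{0, \ldots, n\}$ has marginal probability $1 / (n + 1)$, so the function should return $-\log(n + 1)$ in that case.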
Model 2:
For each group (treatment and control), each of the 6 studies has a different probability of success.
Each probability of success is drawn from a beta prior with unknown parameters $\alpha$ and $\beta$.
Since $\alpha$ and $\beta$ are unknown, we put a broad hyperprior on them: we choose the Gamma(2, 0.5) distribution, which is shown below.
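To get a feel for how broad this hyperprior is, its density can be evaluated in R. This sketch assumes that Gamma(2, 0.5) means shape 2 and rate 0.5 (so the mean is 4); if the second parameter is meant as a scale, replace `rate = 0.5` with `scale = 0.5`.

```r
# Density of the assumed Gamma(shape = 2, rate = 0.5) hyperprior.
hyperprior_density <- function(x) dgamma(x, shape = 2, rate = 0.5)

# A few summary values under the shape/rate assumption.
hyper_mean <- 2 / 0.5                          # shape / rate
hyper_interval <- qgamma(c(0.025, 0.975), shape = 2, rate = 0.5)
print(hyper_mean)
print(hyper_interval)

# Plot only when run interactively, to avoid writing files in batch mode.
if (interactive()) {
  curve(hyperprior_density, from = 0, to = 20,
        xlab = "alpha (or beta)", ylab = "density")
}
```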
These assumptions lead to the following model:
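Likelihood: $\prod_{i=1}^{6} \text{Binomial}(s_i \,|\, \theta_i, n_i)$, where $s_i$ is the number of successful recoveries, $f_i$ is the number of failures (did not recover), and $n_i = s_i + f_i$ is the number of patients in study $i$. Note that each study has its own $\theta_i$, whereas Model 1 had the same $\theta$ for all 6 studies.
Prior: $\prod_{i=1}^{6} \text{Beta}(\theta_i \,|\, \alpha, \beta)$.
Hyperprior: $P(\alpha, \beta) = \text{Gamma}(\alpha \,|\, 2, 0.5)\, \text{Gamma}(\beta \,|\, 2, 0.5)$.
This model has 8 parameters (for each of the treatment and control groups), namely $\alpha$, $\beta$, and $\theta_1, \ldots, \theta_6$.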
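Reading the model top-down as a generative process can make the hierarchy clearer: first draw the hyperparameters, then one success probability per study, then the data. The sketch below simulates one such draw for the treatment group sizes; it assumes Gamma(2, 0.5) is parameterized by shape and rate, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

n <- c(23, 16, 16, 45, 31, 10)  # treatment group sizes from the table

# Hyperprior: alpha and beta are drawn independently (shape/rate assumed).
alpha <- rgamma(1, shape = 2, rate = 0.5)
beta <- rgamma(1, shape = 2, rate = 0.5)

# Prior: one probability of success per study.
theta <- rbeta(length(n), alpha, beta)

# Likelihood: simulated number of recoveries in each study.
s_sim <- rbinom(length(n), size = n, prob = theta)

print(round(c(alpha = alpha, beta = beta), 2))
print(round(theta, 2))
print(s_sim)
```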
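Since the posterior does not have a closed-form analytical solution, we have to calculate the marginal likelihood by integrating out all of the parameters in the model.

$$P(\text{data}) = \int_0^\infty \!\! \int_0^\infty \int_0^1 \!\! \cdots \! \int_0^1 P(\alpha, \beta) \prod_{i=1}^{6} \text{Binomial}(s_i \,|\, \theta_i, n_i)\, \text{Beta}(\theta_i \,|\, \alpha, \beta)\; d\theta_1 \cdots d\theta_6\, d\alpha\, d\beta$$

This looks like a crazy 8-dimensional integral, but we can actually integrate out the $\theta_i$ analytically, leaving a 2-dimensional integral over $\alpha$ and $\beta$.
First, note that $P(\alpha, \beta)$ does not contain $\theta_i$, so we can move it outside of the $\theta_i$ integrals.

$$P(\text{data}) = \int_0^\infty \!\! \int_0^\infty P(\alpha, \beta) \left[ \int_0^1 \!\! \cdots \! \int_0^1 \prod_{i=1}^{6} \text{Binomial}(s_i \,|\, \theta_i, n_i)\, \text{Beta}(\theta_i \,|\, \alpha, \beta)\; d\theta_1 \cdots d\theta_6 \right] d\alpha\, d\beta$$

Next, since there are no factors containing two different $\theta_i$ variables, we can rearrange the integrals and the products (the bracketed part) like this:

$$P(\text{data}) = \int_0^\infty \!\! \int_0^\infty P(\alpha, \beta) \prod_{i=1}^{6} \left[ \int_0^1 \text{Binomial}(s_i \,|\, \theta_i, n_i)\, \text{Beta}(\theta_i \,|\, \alpha, \beta)\; d\theta_i \right] d\alpha\, d\beta$$

Note that we cannot always swap products and integrals.
Since the beta distribution is conjugate to the binomial, the bracketed integrals above can be evaluated analytically (much like we did for Model 1), to get

$$P(\text{data}) = \int_0^\infty \!\! \int_0^\infty P(\alpha, \beta) \prod_{i=1}^{6} \binom{n_i}{s_i} \frac{B(\alpha + s_i, \beta + f_i)}{B(\alpha, \beta)}\; d\alpha\, d\beta$$

Finally, move all the factors that do not depend on $\alpha$ or $\beta$ out of the integrals.

$$P(\text{data}) = \left[ \prod_{i=1}^{6} \binom{n_i}{s_i} \right] \int_0^\infty \!\! \int_0^\infty P(\alpha, \beta) \prod_{i=1}^{6} \frac{B(\alpha + s_i, \beta + f_i)}{B(\alpha, \beta)}\; d\alpha\, d\beta$$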
Unfortunately we cannot evaluate the remaining integrals analytically, so we resort to a numerical calculation.
Task 2:
Read up about the `adaptIntegrate(f, lowerLimit, upperLimit)` function in the `cubature` package (see the function documentation below). How would you define a function `f` so that `adaptIntegrate(f, c(0, 0), c(Inf, Inf))` evaluates the 2-dimensional integral over $\alpha$ and $\beta$ above?
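One possible shape for such a function is sketched below. It takes a length-2 vector `x = c(alpha, beta)` and returns the hyperprior density times the product of beta-function ratios, computed on the log scale for numerical stability. The factory name `make_integrand` is illustrative, the Gamma(2, 0.5) hyperprior is assumed to use shape and rate, and the `adaptIntegrate` call is shown only as a comment in the form given in the task.

```r
# Build the integrand f(x), x = c(alpha, beta), for one group's data.
make_integrand <- function(s, n, shape = 2, rate = 0.5) {
  f_count <- n - s  # failures per study
  function(x) {
    alpha <- x[1]
    beta <- x[2]
    # The integrand is taken to be 0 outside the support of the hyperprior.
    if (alpha <= 0 || beta <= 0) return(0)
    log_hyperprior <- dgamma(alpha, shape = shape, rate = rate, log = TRUE) +
      dgamma(beta, shape = shape, rate = rate, log = TRUE)
    log_ratios <- sum(lbeta(alpha + s, beta + f_count) - lbeta(alpha, beta))
    exp(log_hyperprior + log_ratios)
  }
}

s <- c(20, 10, 13, 35, 22, 7)    # treatment successes from the table
n <- c(23, 16, 16, 45, 31, 10)   # treatment group sizes from the table
f <- make_integrand(s, n)
print(f(c(2, 1)))  # integrand evaluated at alpha = 2, beta = 1

# With the cubature package installed, the call from the task would be:
#   result <- cubature::adaptIntegrate(f, c(0, 0), c(Inf, Inf))
#   log_marginal <- sum(lchoose(n, s)) + log(result$integral)
```

The binomial coefficients are left out of `f` because they do not depend on $\alpha$ or $\beta$; they are added back on the log scale after the integration.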