Bayesian decision theory
From beliefs to actions
Probabilistic inference lets us compute the distribution $p(h|x)$ over the possible states of nature $h \in \mathcal{H}$; this is called our belief state. But ultimately we must convert beliefs into actions.
The optimal way to make decisions under uncertainty is to use decision theory.
If the decision maker or agent picks action $a \in \mathcal{A}$ when the true state of nature is $h \in \mathcal{H}$, then we assume they incur a loss of $\ell(h, a)$.
Since the state is usually hidden, we must compute the expected loss or risk for each action:
$$ R(a|x) = \mathbb{E}_{p(h|x)}\left[\ell(h, a)\right] = \sum_{h \in \mathcal{H}} \ell(h, a) \, p(h|x) $$
The optimal policy $\pi^*(x)$ specifies which action to take for each possible observation $x$ so as to minimize this risk:
$$ \pi^*(x) = \operatorname*{argmin}_{a \in \mathcal{A}} \mathbb{E}_{p(h|x)}\left[\ell(h, a)\right] $$
Equivalently, letting $U(h, a) = -\ell(h, a)$ be the utility of taking action $a$ in state $h$, we can maximize the expected utility:
$$ \pi^*(x) = \operatorname*{argmax}_{a \in \mathcal{A}} \mathbb{E}_{p(h|x)}\left[U(h, a)\right] $$
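To make this concrete, here is a minimal sketch of risk minimization for a discrete state space; the belief state and loss matrix below are made-up numbers for illustration, not from the example that follows.

```python
import numpy as np

# Minimal sketch of risk minimization (hypothetical numbers).
belief = np.array([0.3, 0.7])         # p(h|x) for states h = 0, 1
loss = np.array([[0.0, 1.0],          # loss[h, a] = l(h, a)
                 [5.0, 0.0]])

risk = belief @ loss                  # R(a|x) = sum_h p(h|x) l(h, a)
pi_star = np.argmin(risk)             # optimal action
print(risk, pi_star)                  # [3.5 0.3] -> action 1
```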
Example: Optimal policy for treating COVID patients
Suppose we are a doctor who can either do nothing or give an expensive and painful drug to a patient.
Suppose the patient either has or does not have COVID, and is either young or old.
Suppose the loss function is as shown below, where the units are Quality Adjusted Life Years. This matrix encodes the assumption that a young person is likely to live much longer than an old person, and that the drug is so bad that it is like losing 8 (quality-adjusted) years of life. (We will vary these assumptions later.)
We assume the age is a visible variable, but the COVID status is hidden.
The doctor can use Bayes' rule to infer the belief state $p(\text{covid}|\text{test})$, where $\text{test}$ is the outcome of a diagnostic test.
Given the belief state, we can compute the optimal policy, which is shown below.
We see that we should only give the drug to young people who test positive.
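A sketch of how this policy can be computed is shown below. The drug cost of 8 QALY and the test sensitivity of 0.875 are stated in the text; the disease prior of 0.1, the test specificity of 0.975, and the untreated losses (60 QALY for a young patient, 10 for an old one) are assumed values, chosen to be consistent with the policies described here.

```python
# Sketch of the treatment policy. The drug cost (8 QALY) and test
# sensitivity (0.875) are from the text; the prior p(covid) = 0.1, the
# specificity 0.975, and the untreated losses (60 QALY if young, 10 if
# old) are ASSUMED values, chosen to reproduce the stated policy.

def posterior_covid(test_pos, prior=0.1, sens=0.875, spec=0.975):
    """p(covid = 1 | test result), via Bayes' rule."""
    like1 = sens if test_pos else 1 - sens       # p(test | covid = 1)
    like0 = 1 - spec if test_pos else spec       # p(test | covid = 0)
    return like1 * prior / (like1 * prior + like0 * (1 - prior))

def optimal_action(age, test_pos, drug_cost=8.0, sens=0.875):
    """Minimize posterior expected loss over {nothing, drug}."""
    p = posterior_covid(test_pos, sens=sens)
    loss_if_sick = 60.0 if age == "young" else 10.0  # assumed QALY losses
    risk = {"nothing": p * loss_if_sick, "drug": drug_cost}
    return min(risk, key=risk.get)

for age in ("young", "old"):
    for test_pos in (False, True):
        print(age, test_pos, optimal_action(age, test_pos))
# -> only (young, test positive) gets the drug
```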
Sensitivity to assumptions
If we reduce the cost of the drug from 8 to 5, we get the new optimal policy shown below.
Now we see that we should give the drug to anyone who tests positive.
If we keep the cost of the drug at 8, but increase the sensitivity of the test from 0.875 to 0.975, we get the new optimal policy shown below.
Now we see that we should give the drug to anyone who tests positive, since the chance that they actually have COVID is higher.
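Both scenarios can be checked by re-using the hypothetical `optimal_action` function from the sketch above:

```python
# Old patient with a positive test, under the two modified assumptions:
print(optimal_action("old", True, drug_cost=5.0))  # -> 'drug'
print(optimal_action("old", True, sens=0.975))     # -> 'drug'
```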
Decision-theoretic classification
In classification problems, the unknown state of nature is the "true" class label, $h = y \in \{1, \ldots, C\}$.
We typically assume that the set of possible actions is to pick one of the labels, $a = \hat{y} \in \{1, \ldots, C\}$.
We often assume a zero-one loss matrix $\ell(y, \hat{y})$, as follows (shown for the binary case):
$$ \begin{array}{c|cc} & \hat{y} = 0 & \hat{y} = 1 \\ \hline y = 0 & 0 & 1 \\ y = 1 & 1 & 0 \end{array} $$
We can write this more concisely as follows:
$$ \ell_{01}(y, \hat{y}) = \mathbb{I}(y \neq \hat{y}) $$
In this case, the posterior expected loss is
$$ R(\hat{y}|x) = p(\hat{y} \neq y \,|\, x) = 1 - p(y = \hat{y} \,|\, x) $$
Hence the action that minimizes the expected loss is to choose the most probable label:
$$ \pi(x) = \operatorname*{argmax}_{y \in \mathcal{Y}} p(y|x) $$
This corresponds to the mode of the posterior distribution, also known as the maximum a posteriori or MAP estimate.
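A quick numerical check of this equivalence, using a made-up posterior over three classes:

```python
import numpy as np

# Check that minimizing expected 0-1 loss picks the posterior mode,
# using a made-up posterior over C = 3 classes.
posterior = np.array([0.2, 0.5, 0.3])            # p(y|x)
loss01 = 1.0 - np.eye(3)                         # l(y, yhat) = I(y != yhat)
risk = posterior @ loss01                        # R(yhat|x) = 1 - p(yhat|x)
assert np.argmin(risk) == np.argmax(posterior)   # MAP estimate wins
print(risk)                                      # [0.8 0.5 0.7]
```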
Classification with a "reject" option
In some applications (e.g., medicine, finance), we have an asymmetric loss, where the cost of a false positive is much higher than the cost of a false negative.
In such settings, it is useful to allow for an additional reject action, in which we refuse to classify if we are uncertain. Such inputs can then be handled by some other system (e.g., a human).
We define the loss function as follows:
$$ \ell(h, a) = \left\{ \begin{array}{ll} 0 & \mbox{if $h = a$ and $a \in \{1, \ldots, C\}$} \\ \lambda_r & \mbox{if $a = 0$} \\ \lambda_e & \mbox{otherwise} \end{array} \right. $$
where $\lambda_r$ is the cost of the reject action (choosing $a = 0$), and $\lambda_e$ is the cost of a misclassification error.
It is easy to show that the optimal policy is as follows:
$$ \pi(x) = \left\{ \begin{array}{ll} y^* & \mbox{if $p(y^* | x) \geq 1 - \frac{\lambda_r}{\lambda_e}$} \\ \mbox{reject} & \mbox{otherwise} \end{array} \right. $$
where $y^* = \operatorname*{argmax}_{y \in \{1, \ldots, C\}} p(y|x)$ is the MAP estimate.
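A minimal sketch of this rule, assuming illustrative costs $\lambda_r = 1$ and $\lambda_e = 10$ (so we reject whenever the MAP probability falls below $1 - \lambda_r/\lambda_e = 0.9$):

```python
import numpy as np

# Sketch of the reject rule with illustrative costs lambda_r = 1 and
# lambda_e = 10, i.e. reject whenever max_y p(y|x) < 1 - 1/10 = 0.9.
def classify_with_reject(posterior, lam_r=1.0, lam_e=10.0):
    y_star = int(np.argmax(posterior))           # MAP estimate
    if posterior[y_star] >= 1.0 - lam_r / lam_e:
        return y_star
    return "reject"

print(classify_with_reject(np.array([0.05, 0.95])))  # -> 1
print(classify_with_reject(np.array([0.4, 0.6])))    # -> 'reject'
```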