Think Bayes
This notebook presents example code and exercise solutions for Think Bayes.
Copyright 2018 Allen B. Downey
MIT License: https://opensource.org/licenses/MIT
The Lincoln index problem
A few years ago my occasional correspondent John D. Cook wrote an excellent blog post about the Lincoln index, which is a way to estimate the number of errors in a document (or program) by comparing results from two independent testers.
http://www.johndcook.com/blog/2010/07/13/lincoln-index/
Here's his presentation of the problem:
"Suppose you have a tester who finds 20 bugs in your program. You want to estimate how many bugs are really in the program. You know there are at least 20 bugs, and if you have supreme confidence in your tester, you may suppose there are around 20 bugs. But maybe your tester isn't very good. Maybe there are hundreds of bugs. How can you have any idea how many bugs there are? There's no way to know with one tester. But if you have two testers, you can get a good idea, even if you don't know how skilled the testers are."
Then he presents the Lincoln index, an estimator "described by Frederick Charles Lincoln in 1930," where Wikipedia's use of "described" is a hint that the index is another example of Stigler's law of eponymy.
"Suppose two testers independently search for bugs. Let k1 be the number of errors the first tester finds and k2 the number of errors the second tester finds. Let c be the number of errors both testers find. The Lincoln Index estimates the total number of errors as k1 * k2 / c"
I changed his notation to be consistent with mine.
So if the first tester finds 20 bugs, the second finds 15, and they find 3 in common, we estimate that there are about 100 bugs.
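The arithmetic is easy to check in Python. This is only a minimal sketch: the function name `lincoln_index` and the decision to raise an error when the testers share no bugs are assumptions made for illustration, not something taken from Cook's post.

```python
def lincoln_index(k1, k2, c):
    """Estimate the total number of bugs from two independent testers.

    k1: number of bugs found by the first tester
    k2: number of bugs found by the second tester
    c:  number of bugs found by both testers
    """
    if c == 0:
        # With no overlap the ratio k1 * k2 / c is undefined.
        # Raising here is an illustrative choice, not part of the estimator.
        raise ValueError("the estimate is undefined when c is 0")
    return k1 * k2 / c


estimate = lincoln_index(k1=20, k2=15, c=3)
print(estimate)  # 100.0
```

Note that the estimate depends heavily on `c`: with the same `k1` and `k2`, an overlap of 6 instead of 3 would cut the estimate in half.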
Of course, whenever I see something like this, the idea that pops into my head is that there must be a (better) Bayesian solution! And there is.