WEBVTT
Kind: captions
Language: en
00:00:00.000 --> 00:00:04.080
Some of you may have heard this paradoxical
fact about medical tests. It's very commonly
00:00:04.080 --> 00:00:08.720
used to introduce the topic of Bayes rule in
probability. The paradox is that you could
00:00:08.720 --> 00:00:14.160
take a test which is highly accurate, in the sense
that it gives correct results to a large majority
00:00:14.160 --> 00:00:19.680
of the people taking it, and yet under the right
circumstances when assessing the probability that
00:00:19.680 --> 00:00:25.680
your particular test result is correct, you can
still land on a very low number. Arbitrarily low
00:00:25.680 --> 00:00:31.600
in fact. In short, an accurate test is
not necessarily a very predictive test.
00:00:32.880 --> 00:00:37.840
When people think about math and formulas they
don't often think of it as a design process.
00:00:37.840 --> 00:00:42.160
I mean maybe in the case of notation it's easy
to see that different choices are possible,
00:00:42.160 --> 00:00:46.560
but when it comes to the structure of the
formulas themselves and how we use them
00:00:46.560 --> 00:00:49.600
that's something that people
typically view as fixed.
00:00:50.560 --> 00:00:55.040
In this video you and I will dig into this
paradox, but instead of using it to talk about the
00:00:55.040 --> 00:01:00.400
usual version of Bayes rule, I'd like to motivate
an alternate version, an alternate design choice.
00:01:01.360 --> 00:01:05.280
Now what's up on the screen is a little
bit abstract, which makes it difficult to
00:01:05.280 --> 00:01:09.200
justify that there really is a substantive
difference here, especially when I haven't
00:01:09.200 --> 00:01:12.960
explained either one yet. To see what I'm
talking about though we should really start
00:01:12.960 --> 00:01:17.920
by spending some time thinking a little more concretely
and just laying out what exactly this paradox is.
00:01:24.080 --> 00:01:28.720
Picture one thousand women and suppose that one
percent of them have breast cancer. And let's
00:01:28.720 --> 00:01:33.440
say they all undergo a certain breast cancer
screening and that nine of those with cancer
00:01:33.440 --> 00:01:38.560
correctly get positive results and there's one
false negative. And then suppose that among the
00:01:38.560 --> 00:01:46.000
remainder without cancer 89 get false positives
and 901 correctly get negative results. So if
00:01:46.000 --> 00:01:50.240
all you know about a woman is that she does
the screening and she gets a positive result,
00:01:50.240 --> 00:01:54.560
you don't have information about symptoms or
anything like that, you know that she's either one
00:01:54.560 --> 00:02:01.120
of these 9 true positives or one of these 89 false
positives. So the probability that she's in the
00:02:01.120 --> 00:02:07.840
cancer group given the test result is 9 divided
by (9 + 89) which is approximately 1 in 11.
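The counting argument just described can be sketched in a few lines of Python (a minimal illustration, not from the video; the counts are the ones given above):

```python
# Sample population from the example: 1,000 women, 1% prevalence.
true_positives = 9     # of the 10 with cancer, 9 test positive
false_negatives = 1
false_positives = 89   # of the 990 without cancer
true_negatives = 901

# Positive Predictive Value: true positives / all positives
ppv = true_positives / (true_positives + false_positives)
print(round(ppv, 3))   # 0.092, approximately 1 in 11
```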
00:02:08.880 --> 00:02:13.760
In medical parlance you would call this the
"Positive Predictive Value" of the test, or PPV.
00:02:14.320 --> 00:02:18.960
The number of true positives divided by
the total number of positive test results.
00:02:18.960 --> 00:02:23.280
You can see where the name comes from: to
what extent does a positive test result
00:02:23.280 --> 00:02:25.280
actually predict that you have the disease.
00:02:26.560 --> 00:02:30.400
Now hopefully, as I've presented it this
way where we're thinking concretely about
00:02:30.400 --> 00:02:34.720
a sample population, all of this makes
perfect sense. But where it comes across
00:02:34.720 --> 00:02:38.640
as counterintuitive is if you just
look at the accuracy of the test,
00:02:38.640 --> 00:02:43.120
present it to people as a statistic, and then ask
them to make judgments about their test result.
00:02:43.760 --> 00:02:49.440
Test accuracy is not actually one number but
two. First you ask how often the test is correct
00:02:49.440 --> 00:02:55.120
on those with the disease, this is known as
the test sensitivity. As in how sensitive is
00:02:55.120 --> 00:03:01.280
it to detecting the presence of the disease. In
our example the test sensitivity is 9 in 10, or 90%.
00:03:02.160 --> 00:03:05.760
Another way to say the same fact would
be to say the false negative rate is 10%.
00:03:06.640 --> 00:03:11.440
And then a separate not-necessarily-related
number is how often it's correct for those
00:03:11.440 --> 00:03:16.960
without the disease, which is known as the
test specificity. As in, are positive results
00:03:16.960 --> 00:03:22.160
caused specifically by the disease or are there
confounding triggers giving false positives?
00:03:22.880 --> 00:03:29.120
In our example the specificity is about 91%. Or
another way to say the same fact would be to say
00:03:29.120 --> 00:03:36.320
the false positive rate is 9%. So the paradox
here is that in one sense the test is over 90%
00:03:36.320 --> 00:03:40.400
accurate, it gives correct results to
over 90% of the patients who take it.
00:03:40.960 --> 00:03:45.760
And yet if you learn that someone gets a
positive result without any added information,
00:03:45.760 --> 00:03:49.520
there's actually only a 1 in 11 chance
that that particular result is accurate.
00:03:50.400 --> 00:03:54.800
This is a bit of a problem because of all of
the places for math to be counter-intuitive
00:03:54.800 --> 00:04:01.360
medical tests are one area where it matters a lot.
In 2006 and 2007 the psychologist Gerd Gigerenzer
00:04:01.360 --> 00:04:06.240
gave a series of statistics seminars to practicing
gynecologists and he opened with the following
00:04:06.240 --> 00:04:12.080
example: A 50 year old woman, no symptoms
participates in a routine mammography screening.
00:04:12.080 --> 00:04:16.960
She tests positive, is alarmed and wants to know
from you whether she has breast cancer for certain
00:04:16.960 --> 00:04:21.680
or what her chances are. Apart from the screening
result you know nothing else about this woman.
00:04:22.400 --> 00:04:25.760
In that seminar the doctors were then told
that the prevalence of breast cancer for
00:04:25.760 --> 00:04:30.880
women of this age is about 1%, and then to
suppose that the test sensitivity is 90%
00:04:31.440 --> 00:04:36.400
and that its specificity is 91%. You might
notice these are exactly the same numbers
00:04:36.400 --> 00:04:39.520
from the example that you and I just
looked at, this is where I got them,
00:04:39.520 --> 00:04:44.800
so having already thought it through you and
I know the answer, it's about 1 in 11. However
00:04:44.800 --> 00:04:48.640
the doctors in this session were not primed
with the suggestion to picture a concrete
00:04:48.640 --> 00:04:53.280
sample of one thousand individuals the way that
you and I had. All they saw were these numbers.
00:04:53.840 --> 00:04:58.480
They were then asked: "How many women who
test positive actually have breast cancer?
00:04:58.480 --> 00:05:01.680
What is the best answer?", and they
were presented with these four choices.
00:05:02.320 --> 00:05:07.760
In one of the sessions over half the doctors
present said that the correct answer was 9 in 10,
00:05:07.760 --> 00:05:12.160
which is way off. Only a fifth
of them gave the correct answer,
00:05:12.160 --> 00:05:15.280
which is worse than what it would have
been if everybody had randomly guessed!
00:05:16.400 --> 00:05:21.280
It might seem a little extreme to be calling
this a paradox. I mean it's just a fact, it's
00:05:21.280 --> 00:05:26.640
not something intrinsically self-contradictory.
But as these seminars with Gigerenzer show,
00:05:26.640 --> 00:05:32.240
people (including doctors) definitely find it
counterintuitive that a test with high accuracy
00:05:32.240 --> 00:05:34.160
can give you such a low predictive value.
00:05:35.280 --> 00:05:40.880
We might call this a "veridical paradox", which
refers to facts that are provably true but which
00:05:40.880 --> 00:05:46.160
nevertheless can feel false when phrased a certain
way. It's sort of the softest form of a paradox,
00:05:46.160 --> 00:05:51.840
saying more about human psychology than about
logic. The question is how we can combat this.
00:05:53.920 --> 00:05:58.160
Where we're going with this by the way is that I
want you to be able to look at numbers like this
00:05:58.160 --> 00:06:02.720
and quickly estimate in your head that it means
the predictive value of a positive test should
00:06:02.720 --> 00:06:08.960
be around 1 in 11. Or if I changed things and
asked what if it was 10% of the population who had
00:06:08.960 --> 00:06:14.000
breast cancer, you should be able to quickly turn
around and say that the final answer would be a
00:06:14.000 --> 00:06:20.800
little over 50%. Or if I said imagine a really low
prevalence something like 0.1% of patients having
00:06:20.800 --> 00:06:26.480
cancer, you should again quickly estimate that the
predictive value of the test is around 1 in 100,
00:06:26.480 --> 00:06:30.480
that 1 in 100 of those with positive test
results in that case would have cancer.
00:06:31.440 --> 00:06:33.760
Or let's say we go back to the 1% prevalence,
00:06:33.760 --> 00:06:37.600
but I make the test more accurate, I tell
you to imagine the specificity is 99%.
00:06:38.720 --> 00:06:42.800
There you should be able to relatively quickly
estimate that the answer is a little less than
00:06:42.800 --> 00:06:47.840
50%. The hope is that you're doing all of
this with minimal calculations in your head.
00:06:48.640 --> 00:06:52.320
Now the goals of quick calculations might feel
very different from the goals of addressing
00:06:52.320 --> 00:06:57.040
whatever misconception underlies this paradox,
but they actually go hand-in-hand. Let me show
00:06:57.040 --> 00:07:01.520
you what I mean. On the side of addressing
misconceptions, what would you tell to the
00:07:01.520 --> 00:07:06.880
people in that seminar who answered 9 in 10?
What fundamental misconception are they revealing?
00:07:08.000 --> 00:07:11.600
What I might tell them is that in much the
same way that you shouldn't think of tests
00:07:11.600 --> 00:07:14.960
as telling you deterministically
whether you have a disease,
00:07:14.960 --> 00:07:18.480
you shouldn't even think of them as telling
you your chances of having a disease.
00:07:19.360 --> 00:07:24.320
Instead, the healthy view of what tests
do is that they *update* your chances.
00:07:25.840 --> 00:07:30.320
In our example before taking the test a
patient's chances of having cancer were 1 in 100.
00:07:30.960 --> 00:07:35.680
In Bayesian terms we call this the "prior
probability". The effect of this test was
00:07:35.680 --> 00:07:42.000
to update that prior by almost an order of
magnitude, up to around 1 in 11. The accuracy of
00:07:42.000 --> 00:07:46.640
a test is telling us about the strength of this
updating, it's not telling us a final answer.
00:07:47.520 --> 00:07:51.520
What does this have to do with quick
approximations? Well a key number for
00:07:51.520 --> 00:07:56.400
those approximations is something called the
Bayes factor, and the very act of defining
00:07:56.400 --> 00:08:01.280
this number serves to reinforce this central
lesson about reframing what it is the tests do.
00:08:02.080 --> 00:08:06.160
You see, one of the things that makes test
statistics so very confusing is that there are
00:08:06.160 --> 00:08:10.480
at least four numbers that you'll hear associated
with them. For those with the disease there's the
00:08:10.480 --> 00:08:14.480
sensitivity and the false negative rate, and then
for those without there's the specificity in the
00:08:14.480 --> 00:08:18.560
false positive rate. And none of these numbers
actually tell you the thing you want to know!
00:08:19.440 --> 00:08:24.720
Luckily if you want to interpret a positive test
result you can pull out just one number to focus
00:08:24.720 --> 00:08:30.080
on from all this. Take the sensitivity divided
by the false positive rate. In other words how
00:08:30.080 --> 00:08:36.080
much more likely are you to see the positive test
result with cancer versus without. In our example
00:08:36.080 --> 00:08:41.680
this number is 10. This is the Bayes factor,
also sometimes called the likelihood ratio.
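As a quick sketch of this definition (the variable names here are my own, not from the video), using the numbers from the running example:

```python
# Bayes factor: how much more likely a positive result is with the
# disease than without it (sensitivity / false positive rate).
sensitivity = 0.90
false_positive_rate = 0.09
bayes_factor = sensitivity / false_positive_rate
print(round(bayes_factor, 2))   # 10.0
```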
00:08:42.880 --> 00:08:46.000
A very handy rule of thumb is
that to update a small prior,
00:08:46.000 --> 00:08:51.280
or at least to approximate the answer, you simply
multiply it by the Bayes factor. So in our example
00:08:51.280 --> 00:08:56.080
where the prior was 1 in 100, you would estimate
that the final answer should be around 1 in 10,
00:08:56.080 --> 00:09:01.360
which is in fact slightly above the true correct
answer. So based on this rule of thumb, if I
00:09:01.360 --> 00:09:07.040
asked you what would happen if the prior from our
example was instead 1 in 1,000, you could quickly
00:09:07.040 --> 00:09:11.040
estimate that the effect of the test should
be to update those chances to around 1 in 100.
00:09:12.160 --> 00:09:15.680
And in fact take a moment to check yourself
by thinking through a sample population.
00:09:16.560 --> 00:09:22.240
In this case you might picture 10,000 patients
where only 10 of them really have cancer. Then
00:09:22.240 --> 00:09:27.840
based on that 90% sensitivity we would expect
9 of those cancer cases to give true positives.
00:09:29.200 --> 00:09:34.880
And on the other side, a 91% specificity means
that 9% of those without cancer are getting
00:09:34.880 --> 00:09:40.240
false positives, so we'd expect nine percent
of the remaining patients, which is around 900,
00:09:40.240 --> 00:09:46.480
to give false positive results. Here, with such
a low prevalence, the false positives really do
00:09:46.480 --> 00:09:51.360
dominate the true positives, so the probability
that a randomly chosen positive case from this
00:09:51.360 --> 00:09:56.960
population actually has cancer is only around one
percent, just like the rule of thumb predicted.
00:09:58.880 --> 00:10:03.440
Now, this rule of thumb clearly cannot work for
higher priors. For example, that would predict
00:10:03.440 --> 00:10:09.200
that a prior of 10% gets updated all the way
to 100% certainty, but that can't be right.
00:10:09.840 --> 00:10:14.800
In fact take a moment to think through what the
answer should be again using a sample population.
00:10:14.800 --> 00:10:20.720
Maybe this time we picture 10 out of 100 having
cancer. Again, based on the 90% sensitivity of
00:10:20.720 --> 00:10:26.160
the test we'd expect 9 of those cancer cases to
get positive results, but what about the false
00:10:26.160 --> 00:10:34.880
positives? How many do we expect there? About 9%
of the remaining 90, which is about 8. So upon seeing
00:10:34.880 --> 00:10:40.000
a positive test result it tells you that you're
either one of these 9 true positives or one of the
00:10:40.000 --> 00:10:46.960
8 false positives. So this means the chances are
a little over 50%, roughly 9 out of 17, or 53%.
00:10:48.240 --> 00:10:52.880
At this point, having dared to dream that Bayesian
updating could look as simple as multiplication,
00:10:52.880 --> 00:10:55.520
you might tear down your hopes
and pragmatically acknowledge
00:10:55.520 --> 00:10:57.600
sometimes life is just more complicated than that.
00:10:59.840 --> 00:11:04.400
Except, it's not. This rule of thumb
turns into a precise mathematical fact
00:11:04.400 --> 00:11:08.400
as long as we shift away from talking about
probabilities to instead talking about
00:11:08.400 --> 00:11:14.560
odds. If you've ever heard someone talk about the
chances of an event being "1-to-1" or "2-to-1",
00:11:14.560 --> 00:11:19.760
things like that, you already know about odds.
With probability we're taking the ratio of the
00:11:19.760 --> 00:11:25.600
number of positive cases out of all possible
cases, right? Things like "1 in 5" or "1 in 10".
00:11:25.600 --> 00:11:31.840
With odds what you do is take the ratio of all
positive cases to all negative cases. You commonly
00:11:31.840 --> 00:11:36.160
see odds written with a colon to emphasize the
distinction, but it's still just a fraction,
00:11:36.160 --> 00:11:42.400
just a number. So an event with a 50% probability
would be described as having one-to-one odds.
00:11:42.960 --> 00:11:49.680
A 10% probability is the same as 1-to-9 odds.
An 80% probability is the same as 4-to-1 odds,
00:11:49.680 --> 00:11:55.120
you get the point. It's the same information, it
still describes the chances of a random event,
00:11:55.120 --> 00:11:58.160
but is presented a little differently,
like a different unit system.
00:11:58.960 --> 00:12:04.000
Probabilities are constrained between 0
and 1, with even chances sitting at 0.5,
00:12:04.560 --> 00:12:09.440
but odds range from 0 up to infinity with
even chances sitting at the number 1.
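Converting between the two "unit systems" is a one-line operation in each direction; a small sketch (the helper names are mine):

```python
def probability_to_odds(p):
    # e.g. 0.5 -> 1.0 (1-to-1 odds), 0.1 -> 1/9 (1-to-9 odds)
    return p / (1 - p)

def odds_to_probability(odds):
    # e.g. 1.0 -> 0.5, 4.0 (4-to-1 odds) -> 0.8
    return odds / (1 + odds)

print(probability_to_odds(0.5))   # 1.0
print(odds_to_probability(4.0))   # 0.8
```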
00:12:12.000 --> 00:12:16.000
The beauty here is that a
completely-accurate-not-even-approximating-things
00:12:16.000 --> 00:12:21.920
way to frame Bayes rule is to say: Express your
prior using odds, then just multiply by the Bayes
00:12:21.920 --> 00:12:27.360
factor. Think about what the prior odds are really
saying, it's the number of people with cancer
00:12:27.360 --> 00:12:32.080
divided by the number without it. Here, let's just
write that down as a normal fraction for a moment
00:12:32.080 --> 00:12:37.280
so we can multiply it. When you filter down just
to those with positive test results the number of
00:12:37.280 --> 00:12:42.960
people with cancer gets scaled down by
the probability of seeing a positive test result
00:12:42.960 --> 00:12:48.480
given that someone has cancer, and then similarly
the number of people without cancer also gets
00:12:48.480 --> 00:12:53.280
scaled down, this time by the probability of
seeing a positive test result in that case.
00:12:53.920 --> 00:12:59.360
So the ratio between these two counts, the
new odds upon seeing the test, looks just
00:12:59.360 --> 00:13:05.840
like the prior odds, except multiplied by this
term here, which is exactly the Bayes factor.
00:13:07.920 --> 00:13:13.360
Look back at our example where the Bayes factor
was 10. And as a reminder this came from the 90%
00:13:13.360 --> 00:13:18.720
sensitivity divided by the 9% false positive rate;
how much more likely are you to see a positive
00:13:18.720 --> 00:13:25.920
result with cancer versus without. If the prior
is 1%, expressed as odds this looks like 1-to-99.
00:13:27.040 --> 00:13:32.080
So by our rule this gets updated to
10-to-99, which if you want you could
00:13:32.080 --> 00:13:37.200
convert back to a probability. It would be
10 divided by (10 + 99), or about 1 in 11.
00:13:38.000 --> 00:13:43.440
If instead the prior was 10%, which was the
example that tripped up our rule of thumb earlier,
00:13:43.440 --> 00:13:50.000
expressed as odds this looks like 1-to-9. By
our simple rule this gets updated to 10-to-9,
00:13:50.000 --> 00:13:54.240
which you can already read off pretty
intuitively. It's a little above even chances,
00:13:54.240 --> 00:13:58.880
a little above 1-to-1. If you prefer you
can convert it back to a probability,
00:13:58.880 --> 00:14:04.560
you would write it as 10 out of 19, or
about 53%. And indeed that is what we
00:14:04.560 --> 00:14:07.200
already found by thinking things
through with a sample population.
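Both worked examples can be rechecked with the odds form directly (a sketch; the helper name is mine):

```python
def odds_to_probability(odds):
    return odds / (1 + odds)

bayes_factor = 10   # 90% sensitivity / 9% false positive rate

# 1% prior: odds of 1-to-99, updated to 10-to-99
print(round(odds_to_probability((1 / 99) * bayes_factor), 3))   # 0.092, about 1 in 11

# 10% prior: odds of 1-to-9, updated to 10-to-9
print(round(odds_to_probability((1 / 9) * bayes_factor), 3))    # 0.526, about 53%
```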
00:14:08.160 --> 00:14:11.920
Let's say we go back to the 1% prevalence,
but I make the test more accurate.
00:14:11.920 --> 00:14:16.080
Now what if I told you to imagine that the
false positive rate was only 1% instead of 9%.
00:14:17.200 --> 00:14:20.640
What that would mean is that our
Bayes factor is 90 instead of 10,
00:14:20.640 --> 00:14:25.200
the test is doing more work for us. In this
case, with the more accurate test it gets
00:14:25.200 --> 00:14:32.160
updated to 90-to-99, which is a little less than
even chances, something a little under 50%. To be
00:14:32.160 --> 00:14:36.720
more precise you could make the conversion back
to probability and work out that it's around 48%,
00:14:37.520 --> 00:14:41.360
but honestly if you're just going for a
gut feel, it's fine to stick with the odds.
00:14:42.960 --> 00:14:46.320
Do you see what I mean about how just
defining this number helps to combat
00:14:46.320 --> 00:14:51.760
potential misconceptions? For anybody who's a
little hasty in connecting test accuracy directly
00:14:51.760 --> 00:14:56.240
to your probability of having a disease, it's
worth emphasizing that you could administer
00:14:56.240 --> 00:15:00.080
the same test with the same accuracy
to multiple different patients
00:15:00.080 --> 00:15:04.320
who all get the same exact result, but if
they're coming from different contexts,
00:15:04.320 --> 00:15:09.040
that result can mean wildly different
things. However the one thing that does
00:15:09.040 --> 00:15:14.480
stay constant in every case is the factor by
which each patient's prior odds get updated.
00:15:16.000 --> 00:15:19.760
And by the way this whole time we've been using
the prevalence of the disease, which is the
00:15:19.760 --> 00:15:25.200
proportion of people in a population who have it,
as a substitute for the prior, the probability of
00:15:25.200 --> 00:15:30.800
having it before you see a test. However that's
not necessarily the case! If there are other known
00:15:30.800 --> 00:15:36.000
factors, things like symptoms, or in the case of
a contagious disease things like known contacts,
00:15:36.000 --> 00:15:39.840
those also factor into the prior, and they
could potentially make a huge difference.
00:15:40.560 --> 00:15:44.640
As another side note, so far we've only
talked about positive test results,
00:15:44.640 --> 00:15:48.640
but way more often you would be seeing
a negative test result. The logic there
00:15:48.640 --> 00:15:52.640
is completely the same but the Bayes factor
that you compute is going to look different.
00:15:52.640 --> 00:15:57.600
Instead you look at the probability of seeing
this negative test result with the disease versus
00:15:57.600 --> 00:16:02.480
without the disease. So in our cancer example
this would have been the 10% false negative
00:16:02.480 --> 00:16:09.520
rate divided by the 91% specificity, or about
1 in 9. In other words seeing a negative test
00:16:09.520 --> 00:16:14.480
result in that example would reduce your
prior odds by about an order of magnitude.
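The negative-result version of the computation, again with the example's numbers (names are mine):

```python
# Bayes factor for a negative result:
# P(negative | cancer) / P(negative | no cancer)
false_negative_rate = 0.10
specificity = 0.91
negative_bayes_factor = false_negative_rate / specificity
print(round(negative_bayes_factor, 3))   # 0.11, about 1 in 9
```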
00:16:15.760 --> 00:16:20.800
When you write it all out as a formula, here's how
it looks. It says your odds of having a disease
00:16:20.800 --> 00:16:25.680
given a test result equals your odds
before taking the test, the prior odds,
00:16:25.680 --> 00:16:30.800
times the Bayes factor. Now let's contrast this
with the usual way that Bayes rule is written,
00:16:30.800 --> 00:16:32.240
which is a bit more complicated.
00:16:32.800 --> 00:16:36.240
In case you haven't seen it before it's
essentially just what we were doing with sample
00:16:36.240 --> 00:16:41.360
populations, but you wrap it all up symbolically.
Remember how every time we were counting the
00:16:41.360 --> 00:16:45.680
number of true positives and then dividing it
by the sum of the true positives and the false
00:16:45.680 --> 00:16:51.040
positives? We do just that, except instead of
talking about absolute amounts we talk of each
00:16:51.040 --> 00:16:56.640
term as a proportion. So the proportion of true
positives in the population comes from the prior
00:16:56.640 --> 00:17:00.880
probability of having the disease multiplied
by the probability of seeing a positive test
00:17:00.880 --> 00:17:06.240
result in that case, and then we copy that term
down again into the denominator and then the
00:17:06.240 --> 00:17:11.760
proportion of false positives comes from the prior
probability of not having the disease times the
00:17:11.760 --> 00:17:16.960
probability of a positive test in that case. If
you want you could also write this down with words
00:17:16.960 --> 00:17:20.800
instead of symbols, if terms like sensitivity
and false positive rate are more comfortable.
00:17:21.680 --> 00:17:25.120
This is one of those formulas where once you
say it out loud it seems like a bit much,
00:17:25.120 --> 00:17:28.480
but it really is no different from what
we were doing with sample populations.
00:17:29.040 --> 00:17:32.080
If you wanted to make the whole thing
look simpler you often see this entire
00:17:32.080 --> 00:17:38.240
denominator written just as the probability of
seeing a positive test result overall. While that
00:17:38.240 --> 00:17:42.880
does make for a really elegant little expression,
if you intend to use this for calculations,
00:17:42.880 --> 00:17:46.480
it's a little disingenuous because in
practice every single time you do this
00:17:46.480 --> 00:17:50.480
you need to break down that denominator into
two separate parts, breaking down the cases.
00:17:51.440 --> 00:17:56.000
So taking this more honest representation of it,
let's compare our two versions of Bayes rule.
00:17:56.560 --> 00:18:00.480
And again maybe it looks nicer if we use the
words sensitivity and false positive rate,
00:18:00.480 --> 00:18:04.640
if nothing else it helps emphasize which parts
of the formula are coming from statistics about
00:18:04.640 --> 00:18:08.320
the test accuracy. I mean this actually
emphasizes one thing I really like about
00:18:08.320 --> 00:18:12.240
the framing with odds and a Bayes factor,
which is that it cleanly factors out the
00:18:12.240 --> 00:18:15.760
parts that have to do with the prior and the
parts that have to do with the test accuracy.
00:18:16.480 --> 00:18:21.360
But over in the usual formula all of those are
very intermingled together. And this has a very
00:18:21.360 --> 00:18:25.760
practical benefit, it's really nice if you want
to swap out different priors and easily see their
00:18:25.760 --> 00:18:30.240
effects. This is what we were doing earlier.
But with the other formula, to do that you
00:18:30.240 --> 00:18:35.200
have to recompute everything each time, you can't
leverage a pre-computed Bayes factor the same way.
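This contrast can be made concrete in code (a sketch; function names are mine, not from the video). The odds form reuses a Bayes factor computed once, while the usual form mixes the prior into every term:

```python
def posterior_usual(prior, sensitivity, false_positive_rate):
    # Usual form: P(disease | +) =
    #   prior * sensitivity / (prior * sensitivity + (1 - prior) * false_positive_rate)
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * false_positive_rate
    return true_pos / (true_pos + false_pos)

def posterior_odds_form(prior, bayes_factor):
    # Odds form: posterior odds = prior odds * Bayes factor
    odds = (prior / (1 - prior)) * bayes_factor
    return odds / (1 + odds)

bayes_factor = 0.90 / 0.09   # computed once, reusable for any prior
for prior in (0.001, 0.01, 0.10):
    print(round(posterior_usual(prior, 0.90, 0.09), 4),
          round(posterior_odds_form(prior, bayes_factor), 4))
    # the two columns always agree
```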
00:18:36.000 --> 00:18:39.360
The odds framing also makes things really
nice if you want to do multiple different
00:18:39.360 --> 00:18:43.680
Bayesian updates based on multiple pieces of
evidence. For example, let's say you took not
00:18:43.680 --> 00:18:48.240
one test but two. Or you wanted to think about
how the presence of symptoms plays into it.
00:18:48.880 --> 00:18:53.200
For each piece of new evidence you see you
always ask the question: How much more likely
00:18:53.200 --> 00:18:57.680
would you be to see that with the disease
versus without the disease? Each answer to
00:18:57.680 --> 00:19:01.920
that question gives you a new Bayes factor,
a new thing that you multiply by your odds.
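Chained updates are just repeated multiplication; a hypothetical illustration (assuming the pieces of evidence are independent, which real tests may not be):

```python
def update_odds(prior_odds, *bayes_factors):
    # Each piece of evidence multiplies the odds by its own Bayes factor.
    odds = prior_odds
    for factor in bayes_factors:
        odds *= factor
    return odds

# Two positive tests, each with Bayes factor 10, from 1-to-99 prior odds:
posterior_odds = update_odds(1 / 99, 10, 10)   # 100-to-99, just over even chances
print(round(posterior_odds / (1 + posterior_odds), 3))   # 0.503
```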
00:19:02.960 --> 00:19:06.960
Beyond just making calculations easier, there's
something I really like about attaching a number
00:19:06.960 --> 00:19:12.080
to test accuracy that doesn't even look like a
probability. I mean, if you hear that a test has
00:19:12.080 --> 00:19:17.600
for example, a 9% false positive rate, that's
just such a disastrously ambiguous phrase!
00:19:17.600 --> 00:19:22.400
It's so easy to misinterpret it to mean there's a
9% chance that your positive test result is false.
00:19:23.040 --> 00:19:26.480
But imagine if instead the number that
we heard tacked on to test results
00:19:26.480 --> 00:19:32.080
was that the Bayes factor for a positive test
result is, say, 10. There's no room to confuse
00:19:32.080 --> 00:19:37.120
that for your probability of having a disease,
the entire framing of what a Bayes factor is
00:19:37.120 --> 00:19:41.440
is that it's something that acts on a prior,
it forces your hand to acknowledge the prior
00:19:41.440 --> 00:19:45.360
as something that's separate entirely and
highly necessary to drawing any conclusion.
00:19:47.440 --> 00:19:52.160
All that said, the usual formula is definitely
not without its merits. If you view it not simply
00:19:52.160 --> 00:19:56.800
as something to plug numbers into, but as an
encapsulation of the sample population idea
00:19:56.800 --> 00:20:00.720
that we've been using throughout, you could
very easily argue that that's actually much
00:20:00.720 --> 00:20:04.960
better for your intuition. After all, it's
what we were routinely falling back on in
00:20:04.960 --> 00:20:08.960
order to check ourselves that the Bayes factor
computation even made sense in the first place.
00:20:11.680 --> 00:20:15.520
Like any design decision there is
no clear-cut objective best here.
00:20:15.520 --> 00:20:19.360
But it's almost certainly the case that
giving serious consideration to that question
00:20:19.360 --> 00:20:21.680
will lead you to a better
understanding of Bayes rule.
00:20:30.080 --> 00:20:33.200
Also since we're on the topic
of kind of paradoxical things,
00:20:33.200 --> 00:20:38.000
a friend of mine Matt Cook recently wrote a book
all about paradoxes. I actually contributed a
00:20:38.000 --> 00:20:42.320
small chapter to it with thoughts on the question
of whether math is invented or discovered, and the
00:20:42.320 --> 00:20:46.400
book as a whole is this really nice collection
of thought-provoking paradoxical things ranging
00:20:46.400 --> 00:21:01.840
from philosophy to math and physics. You can of
course find all the details in the description.