# A Minor Insight Into Bayes’ Theorem

By definition, the probability of event A given event B is

$P(A|B) = \frac{P(A,B)}{P(B)}.$
And so we can play the following game:
$P(A,B)=P(B,A)$
(This just means that the probability of A and B is the same as the probability of B and A.)  This is equivalent to
$P(A|B)P(B)=P(B|A)P(A).$

Now we can divide both sides by P(A), and we get Bayes’ Theorem:

$P(B|A) = \frac{P(A|B) P(B)}{P(A)}.$
Furthermore, if the alternatives B' form a partition (exactly one of them is true), then P(A) can be written as a sum over them, and we can write Bayes’ theorem this way:
$P(B|A) = \frac{P(A|B)P(B)}{\sum_{B'}P(A|B')P(B')}.$
In my experience, this is usually how Bayes’ theorem is presented.
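As a rough sketch of the partitioned form, the denominator can be computed by summing P(A|B')P(B') over every alternative B'. The function name `posterior_over_partition`, the labels `B1` to `B3`, and all the numbers below are made up for illustration.

```python
def posterior_over_partition(priors, likelihoods):
    """Return P(B'|A) for every alternative B' in a partition.

    priors:      dict mapping each B' to P(B')   (should sum to 1)
    likelihoods: dict mapping each B' to P(A|B')
    """
    # Denominator of Bayes' theorem: P(A) = sum over B' of P(A|B') P(B')
    evidence = sum(likelihoods[b] * priors[b] for b in priors)
    # Numerator P(A|B') P(B') for each alternative, divided by P(A)
    return {b: likelihoods[b] * priors[b] / evidence for b in priors}


# Hypothetical numbers, chosen only for illustration.
priors = {"B1": 0.5, "B2": 0.3, "B3": 0.2}
likelihoods = {"B1": 0.1, "B2": 0.4, "B3": 0.8}

posteriors = posterior_over_partition(priors, likelihoods)
for name, value in posteriors.items():
    print(name, round(value, 4))
```

Because the same P(A) divides every numerator, the resulting values P(B'|A) sum to 1.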
But let’s go back two equations, to the version with P(A) in the denominator.  I’m going to write it slightly differently:
$P(B|A) = P(A|B) \frac{P(B)}{P(A)}.$
This form makes it obvious that we simply have to multiply P(A|B) by the ratio P(B)/P(A) to get P(B|A).  Let’s go through an example.
Suppose a patient shows up at the emergency room with a symptom: a migraine headache.  We know that P( migraine | brain bleed ) is very high; let’s give it a fake number: 0.98.  That seems bad.  Basically, anyone who has a brain bleed has a migraine.
But what we really want to know is P( brain bleed | migraine ).  How do we get this number?  We simply multiply P( migraine | brain bleed ) by the ratio of brain bleeds to migraines in the population.  That’s what Bayes’ rule says!  You take the likelihood P(A|B) and multiply it by P(B)/P(A)!
Let’s say that a migraine shows up on any given day in 1% of the population.  But a brain bleed shows up on any given day in 0.000001% of the population.  (I’m totally making these numbers up.)  Then we would multiply 98% by 0.000001/1 to get the probability that it’s a brain bleed, which would yield a probability of 0.000098%, or roughly one in a million.
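The arithmetic for this example can be checked with a few lines of Python. This is only a sketch: the helper name `posterior` is made up here, and the inputs are the same invented numbers as in the text, converted from percentages to fractions.

```python
def posterior(likelihood, prior_b, prior_a):
    """Bayes' theorem in ratio form: P(B|A) = P(A|B) * (P(B) / P(A))."""
    return likelihood * (prior_b / prior_a)


# Made-up numbers from the example, expressed as fractions.
p_migraine_given_bleed = 0.98   # P( migraine | brain bleed )
p_bleed = 0.000001 / 100        # 0.000001% of the population
p_migraine = 1 / 100            # 1% of the population

p_bleed_given_migraine = posterior(p_migraine_given_bleed, p_bleed, p_migraine)

print(f"as a fraction:   {p_bleed_given_migraine:.2e}")
print(f"as a percentage: {p_bleed_given_migraine * 100:.6f}%")
```

Keeping everything as fractions until the final print avoids mixing up percentages and probabilities, which is an easy slip when the numbers are this small.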