Bayes' theorem

Bayes' theorem is a mathematical equation used in probability and statistics ot calculate conditional probability. Intuitively, it is used to calculate the probability of an event, based upon it's association with another event.

\begin{equation*} P(A|B)= P(B|A)\frac{P(A)}{P(B)} \end{equation*}

\(P(A|B)\) is the probability of A occuring, given that B is true. \(P(A)\) and \(P(B)\) are the probabilities of A and B occuring independently of each other.


Let's say we want to have the probability of having the flu if a child has a running nose.

  • \(A\) is the event that a child has a flu, and the probability of this event is 20%. \(P(A)=0.2\).

  • \(B\) is the event that a child has a running nose, and the probability of this event is 50%. \(P(B)=0.5\).

  • Research indicates that 20% of children have a running nose (B) due to the flu (A). \(P(B|A)=0.2\)

When we insert these values into the theorem:

\begin{equation*} P(\text{Flu}|\text{RunnyNose}) = P(\text{RunnyNose}|\text{Flu}) \frac{P(\text{Flu})}{P(\text{RunnyNose})} \end{equation*}
\begin{equation*} P(\text{Flu}|\text{RunnyNose}) = 0.2 \times \frac{0.2}{0.5} \end{equation*}
\begin{equation*} P(\text{Flu}|\text{RunnyNose}) = 0.08 \end{equation*}

Therefore, there is an 8% probability that a child who has the flu when they have the running nose. This indicates that it is unlikely that a random patient with running nose has the flu since the probability is only 0.08.

Impact of sensitivity and specificity

Bayes' theorem demonstrates the effect of false positives and false negatives.


This is the true positive rate (TPR). It is the measure of the proportion of correctly identified positives. A sensitive test rarely misses a "positive".


This is the true negative rate (TNR). It is the measure of the proportion of correctly identified negatives. A specific test rarely misses a "negative".

A prefect test is 100% sensitive and specific; however, most tests have a minimum error called the Bayes Error Rate. Below is the Bayes' Theorem re-written in a form that is usually used to solve the accuracy of medical tests.

\begin{equation*} P(A|X)=\frac{P(X|A) P(A)}{P(X|A)P(A)+P(X|\sim A)P(\sim A)} \end{equation*}

In real-world applications, a trade-off is based between sensitivity (TPR) and specificity (TNR) depending on whether it is more important to not miss a positive result, or whether its better to not label a negative result as positive.


Comments powered by Disqus