As part of the strategy to control COVID-19, many governments carry out random sampling of the population to look for asymptomatic cases.

Imagine that you are randomly chosen for a test of COVID-19. The test result is “positive”, that is, it says that you have the virus. You also know that the test sometimes fails, giving either a *false positive* or a *false negative*. Then the question is **what is the probability that you have COVID-19 given that the test said “positive”?**

Let’s assume that:

- There are \(10^{5}\) people tested
- The test has a *precision* of 99%
- The *prevalence* of COVID in the population is 0.1%
- The people to test are chosen randomly from the population

Since this context will be the same in all cases, we will not write it explicitly.

| | Test- | Test+ | Total |
|---|---|---|---|
| COVID- | . | . | . |
| COVID+ | . | . | . |
| Total | . | . | . |

COVID reality in the rows and test results in the columns

| | Test- | Test+ | Total |
|---|---|---|---|
| COVID- | . | . | . |
| COVID+ | . | . | . |
| Total | . | . | 1e+05 |

We will fill this matrix in the following slides

A large population size helps us see small values.

| | Test- | Test+ | Total |
|---|---|---|---|
| COVID- | . | . | 99900 |
| COVID+ | . | . | 100 |
| Total | . | . | 1e+05 |

Prevalence is the percentage of the population that has COVID.

In other words, it is the probability of \(\text{COVID}_+\): \[
\begin{aligned}
ℙ(\text{COVID}_+) & =0.1\% = 0.001\\
ℙ(\text{COVID}_-) & =99.9\%=0.999
\end{aligned}
\]
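As a quick check of the row totals, the prevalence can be turned into expected counts. This is a minimal Python sketch, assuming the \(10^{5}\) tested people and the 0.1% prevalence stated in the assumptions; the variable names are illustrative and not part of the original material.

```python
# Assumptions taken from the text: 10^5 people tested, prevalence of 0.1%.
n_tested = 100_000
prevalence = 0.001  # P(COVID+)

# Expected number of people with and without COVID (the row totals).
covid_pos = round(n_tested * prevalence)
covid_neg = n_tested - covid_pos

print(covid_pos, covid_neg)  # 100 99900
```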

| | Test- | Test+ | Total |
|---|---|---|---|
| COVID- | . | . | 99900 |
| COVID+ | . | 99 | 100 |
| Total | . | . | 1e+05 |

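*Precision* is the probability of a correct diagnosis. For people who have COVID this is \[ℙ(\text{test}_+ \vert \text{COVID}_+)=0.99\] (this conditional probability is what the summary at the end calls *sensitivity*). We fill the box corresponding to \((\text{test}_+, \text{COVID}_+)\): \[ℙ(\text{test}_+, \text{COVID}_+)=ℙ(\text{test}_+ \vert \text{COVID}_+)\cdot ℙ(\text{COVID}_+)=0.99\cdot 0.001=0.00099\] that is, 99 of the \(10^{5}\) people tested.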

| | Test- | Test+ | Total |
|---|---|---|---|
| COVID- | 98901 | . | 99900 |
| COVID+ | . | 99 | 100 |
| Total | . | . | 1e+05 |

In this case the precision for negative cases is the same: \[ℙ(\text{test}_- \vert \text{COVID}_-)=0.99\] We fill the box corresponding to \((\text{test}_-, \text{COVID}_-)\): \[ℙ(\text{test}_-, \text{COVID}_-)=ℙ(\text{test}_- \vert \text{COVID}_-)\cdot ℙ(\text{COVID}_-)=0.99\cdot 0.999=0.98901\] that is, 98901 people.

| | Test- | Test+ | Total |
|---|---|---|---|
| COVID- | 98901 | 999 | 99900 |
| COVID+ | 1 | 99 | 100 |
| Total | . | . | 1e+05 |

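A misdiagnosis is the complement of a correct diagnosis: \[ℙ(\text{test}_- \vert \text{COVID}_+)=1-ℙ(\text{test}_+ \vert \text{COVID}_+)=0.01\] We combine them in the same way as before: \[ℙ(\text{test}_-, \text{COVID}_+)=ℙ(\text{test}_- \vert \text{COVID}_+)\cdot ℙ(\text{COVID}_+)=0.01\cdot 0.001=0.00001\] that is, 1 person. The same reasoning for false positives gives \[ℙ(\text{test}_+, \text{COVID}_-)=ℙ(\text{test}_+ \vert \text{COVID}_-)\cdot ℙ(\text{COVID}_-)=0.01\cdot 0.999=0.00999\] that is, 999 people.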
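The four inner cells can be reproduced with the same product rule. The following Python sketch assumes the numbers stated in the text (99% probability of a correct diagnosis for both groups, 0.1% prevalence, \(10^{5}\) people); the helper `joint_count` and the variable names are hypothetical, introduced only for illustration.

```python
# Assumptions from the text.
n_tested = 100_000
prevalence = 0.001   # P(COVID+)
p_correct = 0.99     # P(test+ | COVID+) and P(test- | COVID-)

def joint_count(p_test_given_covid, p_covid, n=n_tested):
    """Expected count for one cell: n * P(test | COVID) * P(COVID)."""
    # round() absorbs floating-point noise such as 1 - 0.99 != 0.01 exactly.
    return round(n * p_test_given_covid * p_covid)

true_pos = joint_count(p_correct, prevalence)            # test+, COVID+
false_neg = joint_count(1 - p_correct, prevalence)       # test-, COVID+
true_neg = joint_count(p_correct, 1 - prevalence)        # test-, COVID-
false_pos = joint_count(1 - p_correct, 1 - prevalence)   # test+, COVID-

# Same layout as the table: rows COVID-, COVID+; columns Test-, Test+.
print(true_neg, false_pos)   # 98901 999
print(false_neg, true_pos)   # 1 99
```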

| | Test- | Test+ | Total |
|---|---|---|---|
| COVID- | 98901 | 999 | 99900 |
| COVID+ | 1 | 99 | 100 |
| Total | 98902 | 1098 | 1e+05 |

We sum and fill the empty boxes

1098 people got a positive test, but only 99 of them have COVID: \[ℙ(\text{COVID}_+ \vert \text{test}_+)=\frac{99}{1098} \approx 9.02\%\]
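The same value can be obtained directly from the probabilities, without building the table, by dividing the joint probability of (test+, COVID+) by the total probability of a positive test (Bayes' theorem). This is a minimal Python sketch under the assumptions of this example; the function name and parameter names are illustrative choices, not part of the original material.

```python
def posterior_covid_given_positive(prevalence, p_pos_given_covid, p_neg_given_healthy):
    """P(COVID+ | test+) via Bayes' theorem."""
    # Joint probabilities of the two ways to get a positive test.
    p_pos_and_covid = p_pos_given_covid * prevalence
    p_pos_and_healthy = (1 - p_neg_given_healthy) * (1 - prevalence)
    # Divide by the total probability of a positive test.
    return p_pos_and_covid / (p_pos_and_covid + p_pos_and_healthy)

p = posterior_covid_given_positive(
    prevalence=0.001,
    p_pos_given_covid=0.99,
    p_neg_given_healthy=0.99,
)
print(f"{p:.2%}")  # 9.02%
```

Because the healthy group is so much larger than the infected group, even a 1% false positive rate produces about ten times more false positives than true positives.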

| | Yes | No | Test |
|---|---|---|---|
| True | True Positive | False Negative | All True |
| False | False Positive | True Negative | All False |
| Reality | Detected | Not detected | All cases |

Other values that can be calculated

- Sensitivity, specificity
- Precision, Recall
- F-index
- Matthews correlation coefficient (MCC)

“All the truth” \[\textrm{Sensitivity}=\frac{\textrm{True Positives}}{\textrm{All True}}\]

“Nothing but the truth” \[\textrm{Specificity}=\frac{\textrm{True Negatives}}{\textrm{All False}}\]

\[\textrm{Accuracy}=\frac{\textrm{True Positives}+\textrm{True Negatives}}{\textrm{All Cases}}\]

\[\textrm{Precision}=\frac{\textrm{True Positives}}{\textrm{Detected}}\]

\[\textrm{Recall}=\frac{\textrm{True Positives}}{\textrm{All True}}\]

\[\frac{1}{\textrm{F-index}}=\frac{1}{2}\left(\frac{1}{\textrm{Precision}}+\frac{1}{\textrm{Recall}}\right)\]
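To see these definitions at work, they can be applied to the counts of the COVID example. The Python sketch below is illustrative only: the variable names are assumptions, and the MCC line uses the standard textbook definition of the Matthews correlation coefficient, which is listed above but whose formula is not given in this material.

```python
from math import sqrt

# Counts from the COVID example.
tp, fn = 99, 1        # COVID+ row: detected, not detected
fp, tn = 999, 98901   # COVID- row: wrongly detected, correctly not detected

sensitivity = tp / (tp + fn)                 # True Positives / All True
specificity = tn / (tn + fp)                 # True Negatives / All False
accuracy = (tp + tn) / (tp + tn + fp + fn)   # correct results / All Cases
precision = tp / (tp + fp)                   # True Positives / Detected
recall = tp / (tp + fn)                      # same ratio as sensitivity
f_index = 2 / (1 / precision + 1 / recall)   # harmonic mean of the two

# Standard MCC definition (an assumption here, not taken from the text).
mcc = (tp * tn - fp * fn) / sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

for name, value in [
    ("Sensitivity", sensitivity),
    ("Specificity", specificity),
    ("Accuracy", accuracy),
    ("Precision", precision),
    ("Recall", recall),
    ("F-index", f_index),
    ("MCC", mcc),
]:
    print(f"{name:12s} {value:.4f}")
```

With these counts, sensitivity, specificity and accuracy all come out at 0.99, while precision is only about 0.09, the same 99/1098 obtained for the probability of having COVID given a positive test.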