Calculating the chance of a false positive COVID-19 test

ANALYSIS

BY LEN CABRERA

Ohio Governor Mike DeWine’s very public false positive COVID-19 test this week has a lot of people asking about the accuracy of the testing and the chances of getting a false positive.

An article by Maureen Ferren (PhD geneticist and associate professor of biology at Rochester Institute of Technology) addresses the importance of knowing the accuracy of COVID-19 tests to determine the rate of false positives. She states some examples without explaining the math. First, she gives the shocking revelation that if the test is 95% sensitive and 95% specific and only 5% of the population is infected, then a positive test only has a 50% chance of identifying an infected person. Her second example says that if 5% of the population is infected, a 99% sensitive and 99% specific test still has a 16% chance of false positive results.

Those calculations come from flipping conditional probabilities using Bayes’ Theorem. Bayes’ Theorem can frequently provide counterintuitive results like Dr. Ferren’s first example. We can get the same result (50% false positives) with a 90% sensitive and 90% specific test with 10% of the population infected. This case is easier to illustrate with a picture before getting into the math. First, some definitions are required (assuming you already know some basic probability).

To say a test is 90% sensitive means that it will correctly identify 90% of the people who are infected (i.e., correct positives). So if 10 infected people are tested, 9 of the tests will return positive results. In mathematical terms, the probability of a positive test given an infected person is 0.9: Pr(+ test | infected) = 0.9.

To say a test is 90% specific means that it will correctly identify 90% of people who are not infected with that specific virus (i.e., correct negatives). So if 10 healthy people are tested (or people with a different virus), 9 of the tests will return negative results. In mathematical terms, Pr(- test | not infected) = 0.9. Since there are only two possible results for the test (+ or -), the sum of their probabilities is equal to 1, so this can also be written as Pr(+ test | not infected) = 1 - Pr(- test | not infected) = 0.1.

Determining the sensitivity and specificity of a test requires knowing whether the person being tested is infected or not. But if you give a test to an unknown person, the conditional probabilities are reversed. The chance that a person who has a positive test is actually infected is given by Pr(infected | + test). Similarly, a false positive means a person is not infected but has a positive test, so the chance of a false positive is Pr(not infected | + test).

A picture will illustrate the math behind flipping the conditional probabilities. Let’s look at 100 people and assume that 10% of the population is infected. 10 of those 100 will be infected and 90 will not. Of the infected people, 9 will test positive and 1 will not. Of the 90 people who are not infected, 9 will test positive and 81 will test negative. That means if someone has a positive test, there is an equal chance that the person is infected or not: 9/18 = 0.5.
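The head count in that example can be reproduced in a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the inputs are the 100 people, 10% infection rate, and 90% sensitivity and specificity used above.

```python
# Count-based version of the 100-person example (inputs taken from the text).
population = 100
infection_rate = 0.10   # 10% of people are infected
sensitivity = 0.90      # Pr(+ test | infected)
specificity = 0.90      # Pr(- test | not infected)

infected = round(population * infection_rate)        # 10 people
not_infected = population - infected                 # 90 people

true_positives = round(infected * sensitivity)       # infected and test positive
false_negatives = infected - true_positives          # infected but test negative
true_negatives = round(not_infected * specificity)   # healthy and test negative
false_positives = not_infected - true_negatives      # healthy but test positive

all_positives = true_positives + false_positives
share_false = false_positives / all_positives
print(f"{false_positives} of {all_positives} positive tests are false: {share_false:.2f}")
# prints: 9 of 18 positive tests are false: 0.50
```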

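For the general case, computing the chance of a false positive starts with the definition of conditional probability:

$$\Pr(\text{not infected} \mid \text{+ test}) = \frac{\Pr(\text{+ test and not infected})}{\Pr(\text{+ test})}$$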

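None of the terms on the right side are known directly, but they can be derived from the test’s sensitivity (Pr(+ test | infected)) and specificity (Pr(- test | not infected)) and the infection rate (Pr(infected)). Since there are only two events (infected and not infected) and two outcomes (+ test and - test), Bayes’ Theorem can compute false positives like this:

$$\Pr(\text{not infected} \mid \text{+ test}) = \frac{[1 - \Pr(\text{- test} \mid \text{not infected})]\,[1 - \Pr(\text{infected})]}{[1 - \Pr(\text{- test} \mid \text{not infected})]\,[1 - \Pr(\text{infected})] + \Pr(\text{+ test} \mid \text{infected})\,\Pr(\text{infected})}$$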

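For the curious, I’ll derive that monstrosity at the end of the article. Here’s how the formula works for the example pictured above:

$$\Pr(\text{not infected} \mid \text{+ test}) = \frac{(1 - 0.9)(1 - 0.1)}{(1 - 0.9)(1 - 0.1) + (0.9)(0.1)} = \frac{0.09}{0.18} = 0.5$$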

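Here’s how the formula works with Dr. Ferren’s second example, 99% sensitivity and specificity with a 5% infection rate:

$$\Pr(\text{not infected} \mid \text{+ test}) = \frac{(1 - 0.99)(1 - 0.05)}{(1 - 0.99)(1 - 0.05) + (0.99)(0.05)} = \frac{0.0095}{0.059} \approx 0.16$$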
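For readers who would rather check the arithmetic than take it on faith, the formula fits in a short Python function. This is a minimal sketch; the function name `false_positive_probability` and its parameter names are illustrative choices, not something from Dr. Ferren’s article, and it assumes the inputs give at least some positive tests so the denominator is not zero.

```python
def false_positive_probability(sensitivity, specificity, infection_rate):
    """Return Pr(not infected | + test) using Bayes' Theorem."""
    # Pr(+ test and not infected) = (1 - specificity) * (1 - Pr(infected))
    false_pos = (1 - specificity) * (1 - infection_rate)
    # Pr(+ test and infected) = sensitivity * Pr(infected)
    true_pos = sensitivity * infection_rate
    # Share of all positive tests that come from people who are not infected
    return false_pos / (false_pos + true_pos)

print(round(false_positive_probability(0.90, 0.90, 0.10), 2))  # 0.5
print(round(false_positive_probability(0.95, 0.95, 0.05), 2))  # 0.5
print(round(false_positive_probability(0.99, 0.99, 0.05), 2))  # 0.16
```

The three calls correspond to the pictured example and to Dr. Ferren’s first and second examples.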

So how many false positives do we have? If there is a small infection rate (as suggested by the cumulative 1.5% case/population ratio in the U.S.; 2.5% in Florida), even very accurate tests can result in a lot of false positives. Picturing all the possibilities is difficult in two dimensions given three parameters, but if we assume sensitivity equals specificity, a simple table can show the chance of false positives for various levels of accuracy and infection rates.
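A table of that kind can be generated with a short Python script. Treat this as a sketch under assumptions: the function name `false_positive_share` is hypothetical, and the accuracy levels and infection rates in the grid are illustrative picks rather than the values from any published table.

```python
def false_positive_share(accuracy, infection_rate):
    """Pr(not infected | + test) when sensitivity == specificity == accuracy."""
    false_pos = (1 - accuracy) * (1 - infection_rate)  # Pr(+ test and not infected)
    true_pos = accuracy * infection_rate               # Pr(+ test and infected)
    return false_pos / (false_pos + true_pos)

# Illustrative grid values (assumptions chosen for this sketch).
accuracies = [0.90, 0.95, 0.99]
infection_rates = [0.01, 0.025, 0.05, 0.10]

# Header row: one column per infection rate.
header = "accuracy" + "".join(f"{rate:>10.1%}" for rate in infection_rates)
print(header)

# One row per accuracy level, each cell showing the chance of a false positive.
for accuracy in accuracies:
    cells = "".join(
        f"{false_positive_share(accuracy, rate):>10.1%}" for rate in infection_rates
    )
    print(f"{accuracy:>8.0%}" + cells)
```

Reading across a row shows how quickly the chance of a false positive climbs as the infection rate falls, even when the test itself does not change.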

You can play with the inputs directly with this tool by the British Medical Journal.

Even if there are only 10% false positives, that means nearly 500,000 people in the U.S. (over 52,000 in Florida) have had their lives and rights suspended for nothing. According to the FDA’s SARS-CoV-2 rRT-PCR fact sheet, a false positive results in:

“a recommendation for isolation of the patient, monitoring of household or other close contacts for symptoms, patient isolation that might limit contact with family or friends and may increase contact with other potentially COVID-19 patients, limits in the ability to work, the delayed diagnosis and treatment for the true infection causing the symptoms, unnecessary prescription of a treatment or therapy, or other unintended adverse effects.”


Deriving the formula to find false positives:

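Using 1 – specificity (i.e., Pr(+ test | not infected)), we can get the numerator by using the definition of conditional probability:

$$\Pr(\text{+ test} \mid \text{not infected}) = \frac{\Pr(\text{+ test and not infected})}{\Pr(\text{not infected})}$$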

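By rearranging those terms, we get:

$$\Pr(\text{+ test and not infected}) = \Pr(\text{+ test} \mid \text{not infected})\,\Pr(\text{not infected})$$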

The denominator is a little trickier. It requires splitting all the positive tests between those who are and are not infected using joint probabilities:

$$\Pr(\text{+ test}) = \Pr(\text{+ test and not infected}) + \Pr(\text{+ test and infected})$$

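The first term we just found using 1 – specificity. We can use the same technique (rearranging terms of conditional probability) with the sensitivity (i.e., Pr(+ test | infected)) to get the second term:

$$\Pr(\text{+ test and infected}) = \Pr(\text{+ test} \mid \text{infected})\,\Pr(\text{infected})$$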

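Combining all that gives:

$$\Pr(\text{not infected} \mid \text{+ test}) = \frac{\Pr(\text{+ test} \mid \text{not infected})\,\Pr(\text{not infected})}{\Pr(\text{+ test} \mid \text{not infected})\,\Pr(\text{not infected}) + \Pr(\text{+ test} \mid \text{infected})\,\Pr(\text{infected})}$$

Substituting Pr(+ test | not infected) = 1 - Pr(- test | not infected) and Pr(not infected) = 1 - Pr(infected) gives the formula used earlier in the article.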
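The derivation can also be followed step by step in Python, using exact fractions so that no rounding creeps in. This is a sketch with illustrative variable names, run on the 90% sensitive, 90% specific test with a 10% infection rate from the first example.

```python
from fractions import Fraction

# Inputs from the article's first example.
sensitivity = Fraction(9, 10)   # Pr(+ test | infected)
specificity = Fraction(9, 10)   # Pr(- test | not infected)
pr_infected = Fraction(1, 10)   # Pr(infected)

# Numerator: Pr(+ test and not infected)
#   = Pr(+ test | not infected) * Pr(not infected)
pr_pos_and_not_infected = (1 - specificity) * (1 - pr_infected)

# Second term of the denominator: Pr(+ test and infected)
#   = Pr(+ test | infected) * Pr(infected)
pr_pos_and_infected = sensitivity * pr_infected

# Denominator: all positive tests, split by infection status.
pr_positive = pr_pos_and_not_infected + pr_pos_and_infected

# Pr(not infected | + test)
pr_false_positive = pr_pos_and_not_infected / pr_positive
print(pr_false_positive)  # 1/2
```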
