Testing decisions in the pandemic: how do we use imperfect tests for COVID-19 (SARS-CoV-2) to make clinical decisions?

Bell, Katy JL; Stanaway, Fiona; Irwig, Les M; Horvath, Andrea R; Teixeira-Pinto, Armando; Loy, Clement

This is a preprint version of an article submitted for publication in the Medical Journal of Australia. Changes may be made before final publication. Suggested citation: Bell KJL, Stanaway F, Horvath AR, Irwig LM, Teixeira-Pinto A, Loy C. Testing decisions in the Pandemic: How do we use imperfect tests for COVID-19 (SARS-CoV-2) to make clinical decisions? Med J Aust 2020; https://www.mja.com.au/journal/2020/testing-decisions-pandemic-how-do-we-use-imperfect-tests-covid-19-sars-cov-2-make [Preprint, 2 October 2020].

Abstract

This is the second article of a series on how to use diagnostic test accuracy evidence. In this article, we explain how diagnostic test accuracy estimates may be applied in clinical practice, using the example of coronavirus disease 2019 (COVID-19). Using clinical scenarios, we guide the reader through the process of applying estimates of diagnostic accuracy to update the pre-test probability of disease to a post-test probability, and of using that probability to make clinical decisions about their patient.

A key recommendation for controlling the coronavirus disease 2019 (COVID-19) pandemic is to “test, test, test!”(1). If we were able to test everyone using a test that was both 100% sensitive and 100% specific for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, we would have no false-positives (people without the infection wrongly identified as infected) and no false-negatives (people with the infection who are missed). With this perfect test we could identify, isolate and treat all infected individuals away from uninfected individuals. Contact tracing could identify further individuals in the incubation period for quarantine and testing as needed.

However, as we saw in the first article in this series(2), tests are rarely 100% sensitive and 100% specific, and usually there is a trade-off between the two. Typically, a highly specific test (Sp) has few false-positives, so a positive (P) result effectively rules in the diagnosis (SpPin), but this is traded off against a greater risk of false-negatives. In contrast, a highly sensitive test (Sn) has few false-negatives, so a negative (N) result effectively rules out the diagnosis (SnNout), but this is traded off against a greater risk of false-positives. Although these are useful ‘rules of thumb’, in order to make clinical decisions we usually need a more defined probability of disease in an individual patient. In this second article, we use the clinical example of COVID-19 to guide the reader through how to apply estimates of diagnostic test accuracy in clinical practice. The process of making a diagnosis can be thought of as a continuous process of estimating and re-estimating the probability of disease. Thought of in this way, each item of history and examination, and each laboratory and imaging test, is an individual test in itself, with each result increasing or reducing the probability of disease.

For example, suppose a 26-year-old man in Sydney presents with a blocked nose, cough, and fever in late March 2020. To help you determine whether this could be COVID-19 rather than something else (e.g. another viral respiratory pathogen), you might ask him about contact with anyone with COVID-19, shortness of breath, recent travel, and where he lives (some areas have a higher risk of community transmission). The answers to these and other questions would give you an initial pre-test probability for COVID-19. You would then readjust the disease probability after further questions (e.g. loss of smell or taste, duration of symptoms, past medical history), physical examination (if undertaken), and test results such as SARS-CoV-2 tests, blood tests (e.g. full blood count, C-reactive protein), and chest imaging tests. If we know the “test” accuracy for each of these steps then we can continuously recalculate the probability of disease.

Likelihood Ratios

The conceptual approach outlined above, updating disease probability for a patient once we have new information from a diagnostic test, can be operationalised by calculating likelihood ratios for the test and using Fagan’s nomogram(3) to estimate the post-test probability (Figure 1 - available in PDF)(4). A likelihood ratio is the ratio of the probability of a given test result in people who truly have the disease to the probability of the same test result in people who truly do not have the disease. In the simplest case of a dichotomous test (positive or negative for SARS-CoV-2), we can calculate a positive likelihood ratio to decide on the implications of a positive test result for our patient, and a negative likelihood ratio for the implications of a negative test result.

Box 1 (available in PDF) outlines some relevant formulae for these calculations, and interested readers may enjoy reading more about how Bayes’ theorem applies to diagnostic tests(4).

When a likelihood ratio is greater than 1, the probability of disease after receiving this test result (post-test probability) is increased from what it was before the test (pre-test probability). The higher the likelihood ratio, the more powerful the test is at shifting the probability of disease upwards from pre- to post-test, and the better the test is at ruling in disease. Conversely when the likelihood ratio is less than 1, then the probability of disease is decreased from what it was before doing the test. The lower the negative likelihood ratio, the more powerful the test is at shifting the probability of disease downwards from pre- to post-test, and the better the test is at ruling out disease.
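The calculation that Fagan’s nomogram performs graphically can be sketched in a few lines of Python. This is a minimal illustration using the standard formulae for a dichotomous test (positive LR = sensitivity / (1 - specificity); negative LR = (1 - sensitivity) / specificity; post-test odds = pre-test odds × LR). The function names and the sensitivity, specificity and pre-test values below are illustrative assumptions, not estimates from this article.

```python
import math


def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (positive LR, negative LR) for a dichotomous test."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    false_positive_rate = 1.0 - specificity
    # A perfectly specific test has no false positives, so LR+ is unbounded.
    lr_positive = math.inf if false_positive_rate == 0 else sensitivity / false_positive_rate
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, multiply by the LR, convert back to probability."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    if math.isinf(post_test_odds):
        return 1.0
    return post_test_odds / (1.0 + post_test_odds)


# Illustrative (hypothetical) accuracy values and pre-test probability.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.80, specificity=0.95)
pre_test = 0.10
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"After a positive result: {post_test_probability(pre_test, lr_pos):.1%}")
print(f"After a negative result: {post_test_probability(pre_test, lr_neg):.1%}")
```

A likelihood ratio of exactly 1 leaves the probability unchanged, which is why a test with likelihood ratios close to 1 adds little to the clinical picture.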

Decision thresholds

The decision to request a test for a patient needs to take into consideration not just the accuracy of the test, but also how the test results will help us make clinical decisions. To do this, the first step is to define decision thresholds for different types of management. We use tests to move the probability of disease from one side of these decision thresholds to the other. These thresholds may be set by considering the possible impacts that each course of action may have, and by weighing up the benefits and harms of each approach. For example, we may use increasingly higher post-test probability thresholds of SARS-CoV-2 infection to recommend: self-isolation while symptomatic with a respiratory infection; strict quarantine for 14 days; diagnostic testing in people who have contact with vulnerable members of the community (e.g. healthcare workers, aged care workers); diagnostic testing in people who do not have such contact; diagnosis of SARS-CoV-2 infection with strict isolation, and tracing and quarantining/testing all contacts; and diagnosis of COVID-19 with consideration of treatment and/or of enrolling in a trial of experimental treatment (Box 2 - available in PDF).
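The idea of ordered decision thresholds can be sketched as a simple lookup from a probability of infection to a management option. The cut-off values and the shortened list of actions below are placeholders chosen only to show the mechanics; they are assumptions and are not the thresholds given in Box 2. In practice the thresholds would be set by weighing the benefits and harms of each action.

```python
from bisect import bisect_right

# Hypothetical (lower bound of probability, action) pairs in ascending order.
# These values are placeholders for illustration only.
HYPOTHETICAL_THRESHOLDS = [
    (0.00, "self-isolate while symptomatic"),
    (0.02, "request a diagnostic test"),
    (0.50, "manage as SARS-CoV-2 infection: strict isolation and contact tracing"),
]


def recommended_action(probability: float, thresholds=HYPOTHETICAL_THRESHOLDS) -> str:
    """Return the action whose threshold is the highest one at or below the probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    cutoffs = [cutoff for cutoff, _ in thresholds]
    index = bisect_right(cutoffs, probability) - 1
    return thresholds[index][1]


def result_changes_management(pre_test: float, post_test: float) -> bool:
    """A test result is decision-relevant only if it moves the patient across a threshold."""
    return recommended_action(pre_test) != recommended_action(post_test)


print(recommended_action(0.01))
print(recommended_action(0.06))
print(result_changes_management(pre_test=0.06, post_test=0.03))
```

The second function captures the point made above: a test result that shifts the probability without crossing a threshold does not change what we do.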

Pre-test probability

The next step after setting decision thresholds is to decide on our patient’s pre-test probability. Often it is not something that we consciously think about, even though we may intuitively use it to make clinical decisions. Although we normally rely on this “clinical intuition” (our best guess of the probability of disease from our clinical experience), this may be less useful in the setting of a new disease like COVID-19. To explicitly estimate the pre-test probability for a patient, we might use the prevalence of disease in a similar clinical cohort from the literature, or from our local database. We would need to make sure the characteristics of the study cohort used for the prevalence estimate (the denominator of people at risk of infection included in the calculation) match those of the clinical cohort for our patient. This should include the specific time and place the estimates are based on and, importantly, the criteria used for testing.

COVID-19 diagnostic test accuracy

We now move to finding and applying the evidence on diagnostic test accuracy. After a brief literature search, we identify several test accuracy studies for detecting the SARS-CoV-2 virus with reverse transcriptase polymerase chain reaction (RT-PCR) molecular testing. As we have outlined previously(2), we may use the acronym ‘RAM’ to decide if a diagnostic accuracy study is likely to be valid and applicable in our own setting: Representative (covers a similar spectrum of patients); Ascertainment (a reference standard is used to verify all test results); Measurement (an accurate reference standard is applied independently and without knowledge of the test result). There is generally a high risk of bias in available diagnostic accuracy studies on SARS-CoV-2/COVID-19 tests(5, 6) as well as important applicability issues(7). Nevertheless, we find a paper that reports on 1014 patients presenting to one hospital in Wuhan, China, who underwent both chest computed tomography (CT) and RT-PCR tests for the evaluation of possible COVID-19(8). Using a reference standard that combines all clinical information and all test results available in the paper, we calculate sensitivity, specificity, and likelihood ratios for RT-PCR and chest CT (Table 1 (available in PDF); RT-PCR estimates for sensitivity are similar to others’(5, 6, 9)). This retrospective hospital-based study may overestimate diagnostic accuracy for our intended test use in the community; however, the study appears to have a lower risk of bias than many others published to date.
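As a sketch of how accuracy estimates like those in Table 1 are derived, the code below computes sensitivity, specificity and likelihood ratios from the four cells of a 2×2 table of index test result against reference standard. The class name and the counts are hypothetical and chosen for easy arithmetic; they are not the counts from the Wuhan study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwoTable:
    """Index test result cross-classified against the reference standard."""
    true_positives: int   # test positive, disease present
    false_positives: int  # test positive, disease absent
    false_negatives: int  # test negative, disease present
    true_negatives: int   # test negative, disease absent

    def sensitivity(self) -> float:
        return self.true_positives / (self.true_positives + self.false_negatives)

    def specificity(self) -> float:
        return self.true_negatives / (self.true_negatives + self.false_positives)

    def positive_likelihood_ratio(self) -> float:
        # P(test positive | disease) / P(test positive | no disease)
        return self.sensitivity() / (1.0 - self.specificity())

    def negative_likelihood_ratio(self) -> float:
        # P(test negative | disease) / P(test negative | no disease)
        return (1.0 - self.sensitivity()) / self.specificity()


# Hypothetical counts for illustration only.
table = TwoByTwoTable(true_positives=80, false_positives=5,
                      false_negatives=20, true_negatives=95)
print(f"Sensitivity = {table.sensitivity():.2f}")
print(f"Specificity = {table.specificity():.2f}")
print(f"LR+ = {table.positive_likelihood_ratio():.2f}")
print(f"LR- = {table.negative_likelihood_ratio():.2f}")
```

This sketch does not guard against empty rows or a specificity of exactly 1 (which would make the positive likelihood ratio undefined), and it gives point estimates only; a real appraisal would also report confidence intervals.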

Post-test probability

Back to the 26-year-old man in Sydney presenting with a blocked nose, cough, and fever in late March 2020. He tells us that he lives in Bondi, and using available prevalence data, we estimate that his pre-test probability is around 6%. His pre-test probability is above the testing threshold of 2% (general population) and so we collect a nasopharyngeal swab, request an RT-PCR test for the SARS-CoV-2 virus, and recommend that he self-isolates at home. Two days later we receive the RT-PCR results, which are negative. He has a positive result for another respiratory virus; however, we are aware that co-infection is possible and that this alone does not rule out COVID-19(10). Using Fagan’s nomogram and the negative LR we calculated, we revise the probability of COVID-19 downwards to 3%, which is still above the testing threshold (blue line in Figure 1 - available in PDF). A second negative PCR is needed before the probability of disease is sufficiently low to rule out COVID-19 and stop testing (but he still needs to self-isolate while symptomatic). In the scenario where his housemate (a close contact) has COVID-19, more than two negative PCR results are needed to rule out the disease and a chest CT may be helpful (see scenarios 1C-1G in Table 2 - available in PDF). The process of test interpretation in this and three other clinical scenarios is presented in Table 2, along with the key concepts illustrated.
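The arithmetic behind this scenario can be sketched as repeated odds updates. The 6% pre-test probability and the 2% testing threshold are taken from the scenario above. The negative likelihood ratio of 0.5 is an assumption chosen because it approximately reproduces the shift from 6% to 3%; the value we actually calculated is reported in Table 1 of the PDF. The sketch also assumes that repeated RT-PCR results are independent given the patient’s true infection status, which is a simplification and is probably optimistic.

```python
def update_probability(probability: float, likelihood_ratio: float) -> float:
    """Bayesian update on the odds scale: post-test odds = pre-test odds x LR."""
    odds = probability / (1.0 - probability)
    post_odds = odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


def probabilities_until_below(pre_test: float, negative_lr: float,
                              threshold: float, max_tests: int = 10) -> list:
    """Return the probability before testing and after each successive negative result,
    stopping once it falls below the threshold (or after max_tests results)."""
    trajectory = [pre_test]
    while trajectory[-1] >= threshold and len(trajectory) <= max_tests:
        trajectory.append(update_probability(trajectory[-1], negative_lr))
    return trajectory


PRE_TEST_PROBABILITY = 0.06   # from the scenario
TESTING_THRESHOLD = 0.02      # from the scenario
ASSUMED_NEGATIVE_LR = 0.5     # assumption, not the value from Table 1

trajectory = probabilities_until_below(
    PRE_TEST_PROBABILITY, ASSUMED_NEGATIVE_LR, TESTING_THRESHOLD)
for number_of_negatives, probability in enumerate(trajectory):
    print(f"After {number_of_negatives} negative result(s): {probability:.1%}")
# After 0 negative result(s): 6.0%
# After 1 negative result(s): 3.1%
# After 2 negative result(s): 1.6%
```

Under these assumptions the first negative result leaves the probability above the 2% threshold and the second takes it below, matching the reasoning above. Starting from a higher pre-test probability, as in the close-contact scenario, the same loop requires more negative results before the threshold is crossed.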

Conclusion

Tests for COVID-19, like tests for other diseases, are not perfect and will sometimes miss people who are infected and falsely identify others who are not infected. Despite this, by using likelihood ratios to estimate the post-test probability of disease, we can make the important clinical decisions needed to provide patient-centred care while also preventing spread of the pandemic.

References

1. World Health Organization. WHO Director-General's opening remarks at the media briefing on COVID-19 - 16 March 2020 [press release]. https://www.who.int/dg/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---16-march-2020 (accessed August 2020).
2. Bell KJ, Macaskill P, Loy C. Test accuracy and potential sources of bias in diagnostic test evaluation. Med J Aust. 2020;212:10-13.e1.
3. Fagan T. Nomogram for Bayes's Theorem. N Engl J Med. 1975;293:257.
4. Jaeschke R, Guyatt GH, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? The Evidence-Based Medicine Working Group. JAMA. 1994;271:703-7.
5. Watson J, Whiting PF, Brush JE. Interpreting a covid-19 test result. BMJ. 2020;369:m1808.
6. Arevalo-Rodriguez I, Buitrago-Garcia D, Simancas-Racines D, Zambrano-Achig P, del Campo R, Ciapponi A, et al. False-negative results of initial RT-PCR assays for COVID-19: a systematic review. medRxiv. 2020; DOI: https://doi.org/10.1101/2020.04.16.20066787
7. Bossuyt PM. Testing COVID-19 tests faces methodological challenges. J Clin Epidemiol. 2020; DOI: https://doi.org/10.1016/j.jclinepi.2020.06.037
8. Ai T, Yang Z, Hou H, Zhan C, Chen C, Lv W, et al. Correlation of Chest CT and RT-PCR Testing in Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases. Radiology. 2020;200642.
9. Woloshin S, Patel N, Kesselheim AS. False Negative Tests for SARS-CoV-2 Infection - Challenges and Implications. N Engl J Med. 2020;383:e38.
10. Kim D, Quinn J, Pinsky B, Shah NH, Brown I. Rates of Co-infection Between SARS-CoV-2 and Other Respiratory Pathogens. JAMA. 2020;323:2085-2086.
11. Deeks JJ, Altman DG. Diagnostic tests 4: likelihood ratios. BMJ. 2004;329(7458):168.

Katy JL Bell¹
Fiona Stanaway¹
Les M Irwig¹
Andrea R Horvath²
Armando Teixeira-Pinto¹,³
Clement Loy¹,⁴

1 The University of Sydney
2 Prince of Wales Hospital and Community Health Services
3 Westmead Millennium Institute for Medical Research
4 Westmead Hospital

Competing interests: No relevant disclosures
