For many of us, the following scenario is all too familiar.
Me: What’s your plan for the lady in bay 6 who had the fall?
Registrar: She’s going home with antibiotics for a UTI.
Me: Did she have symptoms of a UTI?
Registrar: No, but her urine dipstick was positive.
There’s something strangely reassuring about test results that lures many doctors and other medical practitioners into a false sense of security. This is also true of the public, who often regard tests as the gold standard and are more reassured by a test result than by a clinical opinion. Tests seem to offer a definitive answer, whereas clinical decisions feel less so.
So why, then, were we never taught in medical school to lead with tests? Diagnostic reasoning always started with History, Examination, then Differential Diagnosis, and tests were only suggested as an adjunct to a clinical diagnosis. There appears to be a disconnect between what we were taught and what we practise. It seems we have learned to distrust our clinical opinion and will gladly defer our decision making to a test. It’s easy, right? Let’s just do a troponin on all chest pain; surely if it’s positive the cause must be an acute coronary syndrome (ACS)?
In my experience, very few practitioners would discharge a patient with chest pain and a raised troponin, even if they did not believe the patient was having an ACS before the result came back. Yet doing the troponin in the first place does not follow our traditional medical school teaching. Perhaps, then, universities are getting it wrong?
The issue appears to lie in a misunderstanding of the value of tests. We seem to assume the results are definitive because they give quantitative answers. “The Hb is 156” or “the troponin is positive” sounds much more compelling than the more qualitative “my opinion is …”. This results in a misplaced trust in tests, both publicly and professionally, which can easily be demonstrated using some simple probability theory.
Have you ever asked yourself, “How accurate is this test, and how often does it get it wrong?” The “accuracy” of a test is often presented as its sensitivity; for example, the sensitivity of a urine dipstick is around 85%. Knowing this figure, if I dipped your urine and it was positive, what is the likelihood that you have a UTI?
Many would answer 85% – but they would be wrong.
Sensitivity is the measure of how likely you would be to get a positive result from the dipstick if you had a UTI (a true positive). That is, it is the probability of a positive test given that you have the disease: P(T|D).
This is completely different from the probability that you have the disease given a positive test, P(D|T), which is known as the Positive Predictive Value (PPV). The PPV is what we are interested in as clinicians using tests to make diagnoses.
It’s easy to demonstrate how poor the PPV can be with a test that has a sensitivity of 85%.
Let’s imagine 100 people selected randomly from the community. I’m going to test every one of them for a UTI using my 85% sensitive urine dipstick test. Before I do the test, I know that the prevalence (the proportion of people in the community with a UTI) is about 1%, so 1 person in this sample of 100 people will likely have a UTI – we can mark this person as red.
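Now, when we test his/her urine there is an 85% chance the dipstick will be positive (a true positive). But when we test the others, who do not have the disease, there is a chance of a false positive. Strictly, the false positive rate is set by the test’s specificity (100% – specificity), not its sensitivity; for simplicity, this example assumes the specificity is also 85%, giving a 15% false positive rate. That will be about 15 of the remaining 99 people who will also test positive; I’ll mark them as green.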
This means the test will be positive in about 15 people who don’t have the disease and only 1 who has the disease. We can easily see how a positive dipstick with 85% sensitivity has only about a 6% (1 in 16) chance of predicting disease (PPV). Despite this shockingly poor PPV, some Emergency Departments will routinely dipstick the urine of every patient presenting.
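The head count above can be reproduced in a few lines of Python. This is only an illustrative sketch: the function name `expected_counts` is made up for this example, and the 85% specificity is an assumption chosen to match the 15% false positive rate used in the text, not a measured figure for urine dipsticks.

```python
def expected_counts(population, prevalence, sensitivity, specificity):
    """Expected true positives, false positives and PPV when everyone is tested."""
    diseased = population * prevalence           # people who really have a UTI
    healthy = population - diseased              # people who do not
    true_pos = diseased * sensitivity            # diseased people who test positive
    false_pos = healthy * (1 - specificity)      # healthy people who test positive
    ppv = true_pos / (true_pos + false_pos)      # P(D|T): share of positives that are real
    return true_pos, false_pos, ppv


# 100 people, 1% prevalence, 85% sensitivity, and an ASSUMED 85% specificity
tp, fp, ppv = expected_counts(100, 0.01, 0.85, 0.85)
print(f"true positives: {tp:.2f}, false positives: {fp:.2f}, PPV: {ppv:.1%}")
```

Without rounding to whole people, the expected PPV comes out at a little over 5%, in line with the rough 1 in 16 figure above.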
The difference between sensitivity and PPV depends on the prior probability of having the disease (the prevalence in this example) and a bit of mathematics called Bayes’ theorem. Without going into the mathematics, it’s important to know how this prior probability affects the result.
Let’s say that we don’t randomly test urine, but instead filter the people we test into a cohort who are likely to have a UTI. We can do this using the clinical skills taught in university: History & Examination. Now if we have 80% clinical suspicion based on these skills, we increase the prior probability to 80%. When we do the maths this time, we find that the PPV from the test rises to 96%. Suddenly the test has more significance, and I now feel much happier about prescribing those antibiotics.
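For anyone who wants to check the 96% figure, here is a minimal Python sketch of the Bayes calculation. The function name `ppv` and the intermediate prior values are hypothetical choices for illustration, and the specificity is again assumed to be 85% (the same 15% false positive rate used in the community example).

```python
def ppv(prior, sensitivity=0.85, specificity=0.85):
    """Bayes' theorem: P(D|T) = P(T|D)P(D) / [P(T|D)P(D) + P(T|not D)P(not D)]."""
    true_pos = prior * sensitivity                 # P(T|D) * P(D)
    false_pos = (1 - prior) * (1 - specificity)    # P(T|not D) * P(not D)
    return true_pos / (true_pos + false_pos)


# How the same dipstick performs as clinical suspicion (the prior) rises
for prior in (0.01, 0.10, 0.50, 0.80):
    print(f"prior {prior:.0%} -> PPV {ppv(prior):.0%}")
```

With a prior of 80% this gives a PPV of about 96%, while the same test applied at a 1% prior gives only about 5%. The test has not changed; only the population it is applied to has.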
It would seem, then, that the lecturers in our medical schools were correct all along: tests should only be used as an adjunct to clinical diagnosis, not instead of it. This illustrates that indiscriminate screening tests, in the Emergency Department and elsewhere, are likely to produce far more false positives than most practitioners realise, and false positives can lead to misdiagnosis and significant patient harm.
The moral of the story, then, is this: tests are a poor substitute for clinical skills, and they have value only if we know when to use them.