Defining Biomarker Performance and Clinical Validity
In the evaluation of biomarkers, three questions can be asked: what is the analytical validity of the marker, what is its clinical validity, and does the marker have clinical utility? In most cases, clinical validity is expressed in terms of the marker's accuracy: the degree to which it can be used to correctly identify diseased patients or, more generally, patients with the target condition. Diagnostic accuracy is evaluated in studies in which the biomarker values are compared with the outcome of the clinical reference standard in the same patients. The results of diagnostic accuracy studies can be summarized, reported, and interpreted in several ways. In this paper we summarize and present the available measures, which we classify as error-based measures, information-based measures, and measures of the strength of the association. Clinical validity is linked to clinical utility. If the target condition is well defined and associated with unequivocal downstream management decisions, clinical validity, when defined in comparative terms, may sometimes act as a surrogate outcome measure for clinical utility.
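As an illustration of the three classes of measures named above, the following minimal sketch computes a few commonly used accuracy statistics from a 2x2 table of biomarker result against reference standard. The assignment of each statistic to a class (sensitivity, specificity, and predictive values as error-based; likelihood ratios as information-based; the diagnostic odds ratio as a measure of association) is an assumption made for this example, as are the class name `TwoByTwo`, the function name `accuracy_measures`, and the counts, which are hypothetical and not taken from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a hypothetical diagnostic accuracy study."""
    tp: int  # marker positive, target condition present
    fp: int  # marker positive, target condition absent
    fn: int  # marker negative, target condition present
    tn: int  # marker negative, target condition absent


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return common accuracy statistics for a 2x2 table.

    Assumes all four cells are non-zero; no continuity correction is applied.
    """
    diseased = t.tp + t.fn
    non_diseased = t.fp + t.tn

    # Error-based measures (assumed grouping)
    sensitivity = t.tp / diseased
    specificity = t.tn / non_diseased
    ppv = t.tp / (t.tp + t.fp)
    npv = t.tn / (t.tn + t.fn)

    # Information-based measures (assumed grouping)
    lr_positive = sensitivity / (t.fp / non_diseased)
    lr_negative = (t.fn / diseased) / specificity

    # Strength of association (assumed grouping)
    diagnostic_odds_ratio = (t.tp * t.tn) / (t.fp * t.fn)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
        "diagnostic_odds_ratio": diagnostic_odds_ratio,
    }


# Hypothetical counts: 100 patients with and 100 without the target condition.
example = accuracy_measures(TwoByTwo(tp=80, fp=10, fn=20, tn=90))
```

Note that the predictive values depend on the proportion of patients with the target condition in the study group, whereas the diagnostic odds ratio equals the ratio of the positive to the negative likelihood ratio.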