Decision Trees for Predicting Risk of Mortality using Routinely Collected Data

Logistic Regression is widely regarded as the standard method for predicting clinical outcomes, particularly the risk of mortality. This paper proposes the Decision Tree method as an alternative for problems that are commonly solved with Logistic Regression. The Biochemistry and Haematology Outcome Model (BHOM) dataset, obtained from Portsmouth NHS Hospital and covering 1 January to 31 December 2001, was divided into four subsets. One subset was used as training data to generate a model with each method, and each model was then applied to the remaining three subsets as testing data. The performance of the models from both methods was compared using calibration (the χ² test) and discrimination (area under the ROC curve, or c-index). The experiments showed that both methods achieve reasonable c-index values. However, in some cases the calibration statistic (χ²) was quite high, which indicates poor agreement between predicted and observed mortality. Having conducted these experiments and examined the advantages and disadvantages of each method, we conclude that Decision Trees are a worthy alternative to Logistic Regression in the area of Data Mining.
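The two evaluation measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the default of ten equal-size risk bands, and the Hosmer-Lemeshow-style form of the χ² statistic are assumptions made for illustration, since the text does not specify how the statistic was grouped. The inputs are observed outcomes (1 = died, 0 = survived) and predicted risks from any model, whether a Decision Tree or a Logistic Regression.

```python
from typing import Sequence


def c_index(y_true: Sequence[int], risk: Sequence[float]) -> float:
    """Discrimination: area under the ROC curve via pairwise concordance.

    Counts the fraction of (death, survivor) pairs in which the death
    received the higher predicted risk; tied risks count as one half.
    """
    pos = [r for y, r in zip(y_true, risk) if y == 1]
    neg = [r for y, r in zip(y_true, risk) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def calibration_chi2(y_true: Sequence[int], risk: Sequence[float],
                     bands: int = 10) -> float:
    """Calibration: Hosmer-Lemeshow-style chi-square (assumed form).

    Patients are sorted by predicted risk and split into equal-size bands;
    observed deaths are compared with expected deaths in each band.
    """
    pairs = sorted(zip(risk, y_true))
    n = len(pairs)
    chi2 = 0.0
    for b in range(bands):
        group = pairs[b * n // bands:(b + 1) * n // bands]
        if not group:
            continue
        size = len(group)
        expected = sum(r for r, _ in group)
        observed = sum(y for _, y in group)
        # Combines the "died" and "survived" cells of the band into one term.
        denom = expected * (1.0 - expected / size)
        if denom > 0:
            chi2 += (observed - expected) ** 2 / denom
    return chi2


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the BHOM dataset.
    outcomes = [0, 0, 1, 0, 1, 1]
    risks = [0.1, 0.2, 0.3, 0.4, 0.7, 0.9]
    print("c-index:", c_index(outcomes, risks))
    print("chi-square:", calibration_chi2(outcomes, risks, bands=2))
```

A c-index near 1 means the model ranks deaths above survivors, while 0.5 is no better than chance. A large χ² relative to the number of bands suggests that predicted and observed deaths disagree, which is the situation the abstract reports for some test sets.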



