Decision Trees for Predicting Risk of Mortality using Routinely Collected Data
Logistic Regression is widely regarded as the gold standard
method for predicting clinical outcomes, especially risk of
mortality. In this paper, the Decision Tree method is proposed as an
alternative for problems that are commonly solved with Logistic
Regression. The Biochemistry and Haematology Outcome Model (BHOM)
dataset, obtained from Portsmouth NHS Hospital and covering
1 January to 31 December 2001, was divided into four subsets. One
subset was used as training data to generate a model, and the
resulting model was then applied to the other three subsets as
testing datasets. The performance of the models from both methods
was then compared using calibration (the χ² test, or chi-test) and
discrimination (area under the ROC curve, or c-index). The
experiments showed that both methods achieve reasonable results in
terms of the c-index. However, in some cases the calibration value
(χ²) was quite high. After conducting the experiments and
investigating the advantages and disadvantages of each method, we
conclude that Decision Trees can be seen as a worthy alternative to
Logistic Regression in the area of Data Mining.
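As a rough illustration of the two evaluation measures named in the abstract, the sketch below computes a c-index and a grouped χ² calibration statistic from predicted risks and observed outcomes. It is not the authors' implementation. The function names, the choice of equal-sized risk groups (a Hosmer-Lemeshow-style grouping) and the toy values are assumptions made for this example only.

```python
from typing import Sequence


def c_index(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination: fraction of (death, survivor) pairs in which the
    death received the higher predicted risk; ties count as one half."""
    deaths = [p for y, p in zip(y_true, y_prob) if y == 1]
    survivors = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    score = 0.0
    for d in deaths:
        for s in survivors:
            if d > s:
                score += 1.0
            elif d == s:
                score += 0.5
    return score / (len(deaths) * len(survivors))


def calibration_chi2(y_true: Sequence[int], y_prob: Sequence[float],
                     n_groups: int = 10) -> float:
    """Calibration: sort patients by predicted risk, split them into
    n_groups bands (an assumed grouping), and sum (O - E)^2 / E over
    observed and expected deaths and survivors in each band."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    chi2 = 0.0
    for g in range(n_groups):
        band = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not band:
            continue
        expected_deaths = sum(p for p, _ in band)
        observed_deaths = sum(y for _, y in band)
        size = len(band)
        cells = ((observed_deaths, expected_deaths),
                 (size - observed_deaths, size - expected_deaths))
        for observed, expected in cells:
            if expected > 0:  # skip empty cells to avoid division by zero
                chi2 += (observed - expected) ** 2 / expected
    return chi2


# Toy values, made up for illustration (1 = died, 0 = survived).
outcomes = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(c_index(outcomes, risks))  # 0.75
print(calibration_chi2(outcomes, risks, n_groups=2))
```

Under these assumptions, a c-index near 1 means deaths are ranked above survivors, while a large χ² means predicted and observed death counts disagree within the risk bands. The same two functions could be applied to the predictions of either a Decision Tree or a Logistic Regression model on each testing dataset.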
@article{"International Journal of Information, Control and Computer Sciences:49390",
  author = "Tessy Badriyah and Jim S. Briggs and Dave R. Prytherch",
  title = "Decision Trees for Predicting Risk of Mortality using Routinely Collected Data",
  abstract = "Logistic Regression is widely regarded as the gold standard
method for predicting clinical outcomes, especially risk of
mortality. In this paper, the Decision Tree method is proposed as an
alternative for problems that are commonly solved with Logistic
Regression. The Biochemistry and Haematology Outcome Model (BHOM)
dataset, obtained from Portsmouth NHS Hospital and covering
1 January to 31 December 2001, was divided into four subsets. One
subset was used as training data to generate a model, and the
resulting model was then applied to the other three subsets as
testing datasets. The performance of the models from both methods
was then compared using calibration (the χ² test, or chi-test) and
discrimination (area under the ROC curve, or c-index). The
experiments showed that both methods achieve reasonable results in
terms of the c-index. However, in some cases the calibration value
(χ²) was quite high. After conducting the experiments and
investigating the advantages and disadvantages of each method, we
conclude that Decision Trees can be seen as a worthy alternative to
Logistic Regression in the area of Data Mining.",
  keywords = "Decision Trees, Logistic Regression, clinical outcome, risk of mortality.",
  volume = "6",
  number = "2",
  pages = "156-4",
}