Cirrhosis Mortality Prediction as Classification Using Frequent Subgraph Mining

In this work, we use machine learning and data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis were collected at a single medical center. We apply several machine learning models to predict one-year mortality over a comprehensive feature space that includes demographic information, comorbidities, clinical procedures, and laboratory tests. We also use a temporal pattern mining technique called Frequent Subgraph Mining (FSM). The Model for End-Stage Liver Disease (MELD) score serves as the comparator for mortality prediction. All of our models outperform the MELD-score model by a statistically significant margin, with an average 10% improvement in the area under the curve (AUC). FSM by itself does not improve the model significantly, because the temporal information it needs is sparse, but FSM combined with ensemble learning further improves model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at higher risk of mortality. Our work applies modern machine learning algorithms and data analysis methods to the prediction of one-year mortality in cirrhotic patients and builds a model that predicts one-year mortality significantly more accurately than the MELD score. We have also tested the potential of FSM and provide a new perspective on the importance of clinical features.
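To make the comparison against MELD concrete, the following minimal sketch shows how a MELD score and an AUC could be computed. It assumes the widely used UNOS form of the MELD formula (without sodium); whether the study used exactly this variant is an assumption, and the function names `meld_score` and `auc` are hypothetical, chosen only for illustration.

```python
import math


def meld_score(bilirubin_mg_dl, inr, creatinine_mg_dl):
    """Classic MELD score (assumed UNOS form, no sodium term).

    Lab values below 1.0 are raised to 1.0, creatinine is capped at
    4.0 mg/dL, and the final score is capped at 40.
    """
    bili = max(bilirubin_mg_dl, 1.0)
    inr_floor = max(inr, 1.0)
    crea = min(max(creatinine_mg_dl, 1.0), 4.0)
    raw = (3.78 * math.log(bili)
           + 11.2 * math.log(inr_floor)
           + 9.57 * math.log(crea)
           + 6.43)
    return min(round(raw), 40)


def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Returns the fraction of (positive, negative) pairs in which the
    positive case (label 1, e.g. died within one year) has the higher
    score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Under these assumptions, the AUC of the MELD baseline is `auc` applied to the patients' MELD scores and observed one-year outcomes, and the same function applied to a model's predicted probabilities gives the figure it is compared against.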
