Machine Learning-Based Clinical Decision Support System for Automatic Diagnosis of COVID-19 based on Clinical Data

Mohammad Reza Afrash; Leila Erfannia; Morteza Amrae; Nahid Mehrabi; Saeed Jelvay; Raoof Nopour; Mostafa Shanbehzadeh

doi:10.18502/jbe.v8i1.10407

Mohammad Reza Afrash School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
Leila Erfannia Department of Health Information Technology, Faculty of Paramedical, Zahedan University of Medical Sciences, Zahedan, Iran.
Morteza Amrae Department of Health Information Technology, School of Allied Medical Sciences, Lorestan University of Medical Sciences, Khorramabad, Iran.
Nahid Mehrabi Department of Health Information Technology, Aja University of Medical Sciences, Tehran, Iran.
Saeed Jelvay Instructor of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran.
Raoof Nopour Department of Health Information Technology and Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran.
Mostafa Shanbehzadeh Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran.

Keywords: COVID-19; Coronavirus; Machine learning; Artificial intelligence; Decision Support Systems.

Abstract

Introduction: Accurate, real-time detection and effective prognosis of COVID-19 are necessary to deliver the best possible care for patients and, in turn, to reduce the pressure on the healthcare system. This paper therefore aims to present an algorithm for selecting the best features from the dataset, to develop machine learning (ML) models that predict COVID-19, and to identify the best-performing algorithm.

Methods: In this developmental study, clinical data from 1703 COVID-19 and non-COVID-19 patients, drawn from a single-center registry between February 9, 2020, and December 20, 2020, were used. The Minimum Redundancy Maximum Relevance (mRMR) feature selection algorithm identified the most relevant variables. The selected features were then fed into several data mining methods: K-Nearest Neighbors, AdaBoost Classifier, Decision Tree, HistGradient Boosting Classifier, and Support Vector Machine. Ten-fold cross-validation and six performance evaluation metrics were used to evaluate and compare these algorithms, and the best model was then implemented.
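To make the feature selection step concrete, the following is a minimal sketch of greedy mRMR using the mutual information difference criterion (relevance to the class label minus mean redundancy with the features already chosen). It assumes discrete or pre-discretized features. The function names, the toy feature names, and the toy data are hypothetical illustrations, not the registry variables or the implementation used in the study.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def mrmr_select(features, target, k):
    """Greedy mRMR selection (difference criterion).

    features: dict mapping a feature name to its list of discrete values
    target:   list of class labels, same length as each feature column
    k:        number of features to keep
    """
    # Relevance: mutual information between each feature and the label.
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            # Redundancy: mean mutual information with already-selected features.
            redundancy = sum(
                mutual_information(features[name], features[s])
                for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic toy data (hypothetical, for illustration only).
target = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "fever":      [1, 1, 1, 0, 0, 0, 0, 0],
    "fever_copy": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate: fully redundant
    "cough":      [1, 1, 0, 1, 1, 0, 0, 0],
}
selected = mrmr_select(features, target, k=2)
print(selected)
```

In this toy example the duplicated column is as relevant as the original, but it is penalized for redundancy once the original has been picked, so a less relevant but more complementary feature is chosen second. That trade-off is the reason to prefer mRMR over ranking features by relevance alone.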

Results: Of the 34 included features, 11 were selected as the most important. Among the ML algorithms, the AdaBoost classifier performed best, with mean accuracy = 92.9%, mean specificity = 89.3%, mean sensitivity = 94.2%, mean F-measure = 91.6%, mean Kappa = 94.3%, and mean ROC = 92.1%.
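As a reminder of how the threshold-based metrics reported above relate to one another, the sketch below derives accuracy, sensitivity, specificity, F-measure, and Cohen's kappa from the four cells of a binary confusion matrix. The function name and the counts are hypothetical and are not taken from the study. The ROC value is omitted because it requires predicted scores rather than counts. In a 10-fold setting, one would typically compute these per fold and average them.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: agreement beyond what the marginal totals would give by chance.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```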

Conclusion: The empirical results show that the AdaBoost model outperformed the other classification models. It was therefore used to develop our Clinical Decision Support System (CDSS) interface for discriminating positive COVID-19 cases from negative ones.