Comparison of Different Decision Tree Algorithms for Classification of Retinopathy Patients in Yazd City, Central Part of Iran

  • Amin Karami Master Degree Shahid Sadoughi University of Medical Sciences and Health Services
  • Mohsen Askarishahi PhD Shahid Sadoughi University of Medical Sciences and Health Services
  • Nasim Namiranian PhD Shahid Sadoughi University of Medical Sciences and Health Services
Keywords: Retinopathy, Diabetes, Decision Tree, Yazd, Data Mining

Abstract

Introduction: Diabetes is one of the most common diseases caused by metabolic disorders and results from impaired insulin secretion or function. Its prevalence is increasing rapidly. This study aimed to compare the performance of different decision tree algorithms in diagnosing diabetic retinopathy, using a database of diabetic patients referred to the Yazd Diabetes Research Center.

Method: This analytical cross-sectional study included 2613 patients who visited the research and treatment center in Yazd City. Their demographic information was collected in the first stage. The patients were then examined by the nursing team, and each patient's information form was completed by the responsible nurse. Next, descriptive indicators (mean, mode, median, variance, frequency, and percentage of missing data) were examined. Four classification models were compared: CHAID, classification and regression tree (C&R, also known as CART), QUEST, and C5.0. The authors evaluated the performance of these four models using three statistical criteria: accuracy, sensitivity, and specificity. A gains chart was used for a more precise comparison of the models. SPSS Modeler version 18.0 was used for data processing and modeling. The significance level was set at 5%.
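The three evaluation criteria and the gains chart named in the Method can be illustrated with a minimal Python sketch. This is not the authors' SPSS Modeler workflow. The function names, the coding of retinopathy as 1 and no retinopathy as 0, and the toy labels and scores are all assumptions made for illustration. The sketch also assumes both classes are present, so no denominator is zero.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, assuming 1 = retinopathy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, sensitivity, and specificity as percentages."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": 100.0 * tp / (tp + fn),  # true positive rate
        "specificity": 100.0 * tn / (tn + fp),  # true negative rate
    }


def cumulative_gains(y_true: Sequence[int], scores: Sequence[float], n_bins: int = 10) -> list[float]:
    """Percentage of all positives captured in the top k/n_bins of cases,
    with cases ranked by predicted score in descending order."""
    ranked = [t for _, t in sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)]
    total_pos = sum(ranked)
    n = len(ranked)
    gains = []
    for k in range(1, n_bins + 1):
        cutoff = round(k * n / n_bins)
        gains.append(100.0 * sum(ranked[:cutoff]) / total_pos)
    return gains


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    y_true = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.1, 0.85, 0.35]
    print(classification_metrics(y_true, y_pred))
    print(cumulative_gains(y_true, scores, n_bins=5))
```

A model whose gains curve rises faster captures more true retinopathy cases among its highest-scored patients. This is why a gains chart can separate models whose accuracy values are close.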

Result: Among the demographic and clinical variables, BMI, duration of disease, type of drug used, age, hypertension, gender, cholesterol, and hemoglobin A1c were entered in the final model, with retinopathy as the dependent variable. The obtained criteria were: accuracy 71.75%, sensitivity 75.60%, and specificity 57.14% for the CART model; accuracy 65.84%, sensitivity 65.86%, and specificity 65.76% for the QUEST model; accuracy 69.33%, sensitivity 67.35%, and specificity 76.81% for the CHAID model; and accuracy 73.27%, sensitivity 79.65%, and specificity 49.05% for the C5.0 model.

Conclusion: Based on accuracy, sensitivity, specificity, and a comparison of the gains charts for the four algorithms, the CHAID algorithm showed the best performance. The authors therefore suggest this algorithm for further research.

Published
2022-10-16