Impact of the Power of Adaptive Weight on Penalized Logistic Regression: Application to Adipocytic Tumors Classification

Narumol Sudjai; Monthira Duangsaphon; Chandhanarat Chandhanayingyong

doi:10.18502/jbe.v10i3.17922

Narumol Sudjai, Department of Orthopaedic Surgery, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
Monthira Duangsaphon, Department of Mathematics and Statistics, Faculty of Science and Technology, Thammasat University, Pathum Thani, Thailand
Chandhanarat Chandhanayingyong, Department of Orthopaedic Surgery, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand


Keywords: High-dimensional sparse data; Machine-learning; Multicollinearity; Penalized logistic regression; Penalty function; Power of adaptive weight; Initial weight

Abstract

Introduction: MRI-based texture features in adipocytic tumors can serve as non-invasive predictive biomarkers that provide precise outcomes for decision-making. The power of the adaptive weight and the initial weight are important parameters of the adaptive Lasso. This study aimed to compare the impact of the initial weight together with the power of the adaptive weight for the adaptive Lasso under high-dimensional sparse data with multicollinearity.
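
For orientation, the two parameters can be read off the usual adaptive Lasso formulation for logistic regression. The notation below (\lambda, \gamma, w_j, and the initial estimate \tilde{\beta}) is an assumption based on the standard formulation, not notation taken from this paper, and the intercept is omitted for brevity:

```latex
\hat{\beta} = \arg\min_{\beta} \left\{ -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i x_i^{\top}\beta - \log\left(1 + e^{x_i^{\top}\beta}\right) \right] + \lambda \sum_{j=1}^{p} w_j \lvert \beta_j \rvert \right\},
\qquad w_j = \frac{1}{\lvert \tilde{\beta}_j \rvert^{\gamma}}
```

Here \tilde{\beta} is the initial estimate that supplies the initial weight (for example, a Lasso fit), and \gamma > 0 is the power of the adaptive weight. Under this reading, a square-root weight would correspond to \gamma = 1/2.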

Methods: All independent variables in the Monte Carlo simulation were generated using a Toeplitz correlation structure. The performance of the initial weight together with the power of the adaptive weight in the penalized approaches was evaluated using the mean of the predicted mean squared error (MPMSE) in the simulation study and the area under the receiver operating characteristic curve (AUC), precision, recall, F1-score, and classification accuracy of the models in the real-data applications.
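
The simulation design described above can be illustrated with a minimal Python sketch. It is not the authors' code. The correlation form rho**|i-j|, the sample sizes, the true coefficients, the tuning value `lam`, the small constant `eps`, and all function names are assumptions made for illustration. The weighted L1 problem is solved with a plain proximal-gradient loop, chosen only to keep the example self-contained.

```python
import numpy as np

def toeplitz_corr(p, rho):
    """Toeplitz correlation matrix with entries rho**|i-j| (assumed form)."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def simulate(n, p, rho, beta, rng):
    """Draw correlated Gaussian predictors and a binary response from a logistic model."""
    chol = np.linalg.cholesky(toeplitz_corr(p, rho))
    X = rng.standard_normal((n, p)) @ chol.T
    prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
    y = rng.binomial(1, prob)
    return X, y

def weighted_l1_logistic(X, y, lam, weights, n_iter=1000):
    """Proximal gradient for mean logistic loss + lam * sum_j weights[j] * |beta_j|."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the curvature of the loss.
    step = 4.0 * n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (1.0 / (1.0 + np.exp(-(X @ beta))) - y) / n
        z = beta - step * grad
        # Soft-thresholding with a coefficient-specific threshold.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam * weights, 0.0)
    return beta

def adaptive_lasso(X, y, lam, gamma, eps=1e-8):
    """Adaptive Lasso with a Lasso initial estimate and weight power gamma."""
    init = weighted_l1_logistic(X, y, lam, np.ones(X.shape[1]))
    # eps (an assumption) keeps weights finite where the initial estimate is zero.
    weights = 1.0 / (np.abs(init) + eps) ** gamma
    return weighted_l1_logistic(X, y, lam, weights)

rng = np.random.default_rng(0)
p = 200                                # more predictors than observations
beta_true = np.zeros(p)
beta_true[:3] = [1.5, -1.0, 2.0]       # sparse signal (hypothetical values)
X, y = simulate(100, p, 0.7, beta_true, rng)
for gamma in (0.5, 1.0, 2.0):
    beta_hat = adaptive_lasso(X, y, lam=0.05, gamma=gamma)
    print(f"gamma={gamma}: {np.count_nonzero(beta_hat)} nonzero coefficients")
```

In a full Monte Carlo study this fit would be repeated over many simulated data sets, with `lam` chosen by cross-validation rather than fixed.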

Results: In the simulation study, the smallest MPMSE was obtained with the square root of the adaptive Lasso combined with an initial weight from the Lasso. In the real-data application, this approach distinguished intramuscular lipomas from well-differentiated liposarcomas with high performance: the AUC, accuracy, precision, recall, and F1-score were 0.935, 0.928, 0.919, 0.921, and 0.925, respectively, for the model based on the penalized logistic regression classifier, and 0.946, 0.935, 0.932, 0.934, and 0.930, respectively, for the model based on the support vector machine classifier. Both the simulation study and the real-data application indicated that the square root of the adaptive Lasso combined with the Lasso-based initial weight was the best option under high-dimensional sparse data with multicollinearity.

Conclusion: Our findings showed that the power of the adaptive weight in the penalty function and the initial weight can affect the classification accuracy of a machine-learning model. In practice, if these parameters are chosen appropriately, the resulting models perform well.