Classifying Substance Abuse Tendencies Using the Naive Bayes Algorithm
Abstract
Introduction: Uncertainty in human life often arises from incomplete knowledge of past events or of circumstances that have not yet occurred. The Naive Bayes classification technique, rooted in conditional probability, offers a hypothesis-driven approach to linking two random events and calculating posterior probabilities. Substance addiction remains a critical issue, particularly among patients treated in community mental health centers, and calls for effective predictive methods that support early identification and intervention.
Methods: This study employed the Naive Bayes algorithm to classify substance addiction tendencies in patients. Data on all 205 patients registered at the Giresun Province Prof. Dr. A. Ilhan Özdemir State Hospital Community Mental Health Center were obtained from the database. To enhance prediction accuracy, feature selection was conducted using the Information Value (IV) method. Ten patient attributes were analyzed: gender, education level, marital status, income status, urban status, living alone, family disease, relationship with family and environment, activity status, and age. Features with strong or medium predictive power were prioritized for the model. Accuracy, recall, precision, and F1 score were used as the model's evaluation metrics.
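The Information Value screening described in the Methods can be sketched as follows. This is a minimal illustration, not the study's code: the function name `information_value`, the additive smoothing term, and the toy counts are assumptions introduced here, and the thresholds mentioned in the comments (roughly 0.1 to 0.3 for medium and 0.3 to 0.5 for strong predictive power) are a commonly cited rule of thumb rather than cut-offs stated by the authors.

```python
import math


def information_value(counts, smoothing=0.5):
    """Information Value of one categorical feature.

    counts maps each category to (n_events, n_non_events), e.g. the number
    of patients with and without substance addiction in that category.
    smoothing is an assumed additive constant that avoids log(0) when a
    category has no events or no non-events.
    """
    k = len(counts)
    total_events = sum(e for e, _ in counts.values()) + smoothing * k
    total_non_events = sum(n for _, n in counts.values()) + smoothing * k

    iv = 0.0
    for events, non_events in counts.values():
        p_event = (events + smoothing) / total_events
        p_non_event = (non_events + smoothing) / total_non_events
        # Weight of evidence for this category.
        woe = math.log(p_event / p_non_event)
        # Each term is non-negative, so IV >= 0.
        iv += (p_event - p_non_event) * woe
    return iv


# Hypothetical counts for illustration only (not data from the study).
toy_feature = {"group_a": (30, 20), "group_b": (10, 40)}
iv = information_value(toy_feature)

# Rule-of-thumb reading (assumed): ~0.1-0.3 medium, ~0.3-0.5 strong.
print(f"IV = {iv:.3f}")
```

In a workflow like the one described, each of the ten attributes would be scored this way and only those above the chosen threshold would be passed to the Naive Bayes classifier.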
Results: Four features showed strong or medium predictive power according to their IV and were related to substance abuse: gender (0.45), education level (0.20), income status (0.17), and relationship with family and environment (0.17). The Naive Bayes algorithm indicated that males (78%) are approximately four times more likely than females (22%) to develop substance addiction. Patients with education levels ranging from primary to high school were more prone to addiction than those with college-level education or higher. Additionally, those under state protection exhibited a higher likelihood (39%) of substance abuse than patients with other income statuses. Finally, individuals with poor or neutral relationships with family and their environment were more susceptible to addiction (30%). The model achieved a recall of 75%, a precision of 65%, an F1 score of 70%, and an accuracy of 76%, indicating acceptable classification performance.
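As a quick consistency check on the reported metrics, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two values given above. The helper name `f1_from_precision_recall` is an assumption made for this sketch; only the 65% precision and 75% recall come from the abstract.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the Results.
f1 = f1_from_precision_recall(precision=0.65, recall=0.75)
print(f"F1 = {f1:.2f}")
```

The recomputed value rounds to the 70% reported in the Results.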
Conclusion: The Naive Bayes algorithm effectively classified substance addiction tendencies in hospitalized patients, highlighting gender, education level, income status, and relational dynamics as key predictive factors. These findings underscore the importance of targeted interventions tailored to at-risk populations, which can improve early detection and management strategies in community mental health settings.