https://publish.kne-publishing.com/index.php/jbe/issue/feed Journal of Biostatistics and Epidemiology 2024-02-13T09:26:33+00:00 Nahid Gavili n.gavili@knowledgee.com Open Journal Systems <p><strong data-stringify-type="bold">All the manuscripts should be submitted through the Journal Primary Website at <a href="https://jbe.tums.ac.ir/index.php/jbe/about/submissions">https://jbe.tums.ac.ir/index.php/jbe/about/submissions</a></strong></p> https://publish.kne-publishing.com/index.php/jbe/article/view/14619 A Wegner's Granulomatosis Risk Prediction Model Based on Machine Learning Algorithms 2024-02-13T09:21:04+00:00 Jaleh Shoshtarian Malak none@none.com Samira Alsaeidi none@none.com Fatemeh Haji Ali Asgari none@none.com Fahimeh Khedmatkon none@none.com <p><strong>Introduction: </strong>Prediction of Wegener's granulomatosis diagnosis and relapse is a complex process. In this study, we applied machine learning algorithms to predict Wegener's granulomatosis relapse.</p> <p><strong>Methods: </strong>In this research, 189 patients admitted to Amiralam Hospital were studied and followed for approximately 2 years. Patient features included demographics, organ involvement, symptoms, and other clinical data. Different popular machine learning algorithms were applied for predicting Wegener's granulomatosis relapse, including Support Vector Machines, Random Forest, Gradient Boosting, and XGBoost algorithms. The prediction model performance was measured for the different candidate prediction algorithms using accuracy, precision, recall, and F1-measure. The selected prediction model performance was calculated based on different relapse rates and major relapse occurrence according to Birmingham Vasculitis Activity Score (BVAS) fields.</p> <p><strong>Results: </strong>Applying different machine learning algorithms, the XGBoost algorithm performed the best. The results indicated that the prediction model's performance increased when calculating higher relapse rate possibilities. 
The XGBoost model had 82% accuracy while predicting more than one relapse rate and 92% accuracy in predicting more than twice the relapse rate. We also calculated SHAP values for the prediction model. The results indicated that Cr, BVAS, lymphocyte percentage, vitamin D, nose involvement, alkaline phosphatase, diagnosis age, white blood cell count, erythrocyte sedimentation rate, and initial nose presentation are the 10 most important features according to SHAP value.</p> <p><strong>Conclusion: </strong>In this study, we developed a Wegener's granulomatosis relapse prediction model using machine learning algorithms. We achieved reasonable precision and recall for early prediction and decision-making regarding Wegener's granulomatosis relapse.</p> 2023-12-31T13:19:53+00:00 Copyright (c) 2023 Journal of Biostatistics and Epidemiology https://publish.kne-publishing.com/index.php/jbe/article/view/14622 The Effective Coverage of Maternal and Child Primary Health-Care-Services and its Relationship with Health-Expenditures: An Analysis at Sub-National-Level in Iran 2024-02-13T09:23:53+00:00 Elham Abdalmaleki none@none.com Zhaleh Abdi none@none.com Saharnaz Sazgarnejad none@none.com Bahar Haghdoost none@none.com Elham Ahmadnezhad none@none.com <p><strong>Introduction:</strong> The primary health care (PHC) approach is widely acknowledged as a fundamental element in achieving universal health coverage (UHC) goals. Consequently, numerous countries have undertaken efforts to restructure their health systems based on PHC principles. This study aims to evaluate geographic disparities in essential maternal and child indicators provided at the PHC level, focusing on both crude and effective coverage. 
Additionally, it seeks to explore the association between effective coverage and health expenditures within the national and sub-national contexts of Iran.</p> <p><strong>Methods: </strong>This research employed a secondary analysis approach to investigate the spatial distribution of maternal and child health (MCH) indicators in Iran's provinces, utilizing the latest available data from the 2010 Demographic Health Survey (DHS). To provide a comprehensive understanding of MCH indicators, the study calculated composite indicators, crude, and effective coverage. The provinces' situations were compared using the median cut-off method. Additionally, the study examined the association between coverage indicators and total health expenditure per-capita.</p> <p><strong>Results: </strong>At the national level, the crude and the effective composite coverage were 89.56% and 77.22%, respectively. Also, the medians of composite crude and effective service coverage in the provinces were 90.25% and 77.62%, respectively. There was no significant difference between urban and rural areas.</p> <p><strong>Conclusion: </strong>This study has revealed a notable difference between the crude and effective service coverage of the selected MCH indicators. While the coverage of maternal services was generally higher than that of child services, there were significant geographic disparities in the coverage of key indicators of MCH services across provinces. Despite the provision of free services in rural areas, their coverage was not higher than that of urban areas. These findings suggest that PHC services in Iran are still far from achieving the desired coverage and UHC goals. 
Policymakers and stakeholders need to focus on addressing the gaps in effective coverage and geographic disparities to improve access to essential maternal and child health services and achieve UHC in Iran.</p> 2024-01-02T03:29:08+00:00 Copyright (c) 2024 Journal of Biostatistics and Epidemiology https://publish.kne-publishing.com/index.php/jbe/article/view/14623 Common Study Designs of Nutrition Clinical Trials: Review of the Basic Elements and the Pros and Cons 2024-02-13T09:24:28+00:00 Parvin Mirmiran none@none.com Hanieh Malmir none@none.com Zahra Bahadoran none@none.com <p><strong>Introduction: </strong>Nutrition Clinical Trials (NCTs) are pivotal in establishing causal links between nutritional interventions and chronic diseases. This review comprehensively examines prevalent clinical trial designs, emphasizing their strengths and limitations. The goal is to provide insights into the selection and optimization of these designs for dietary intervention studies.</p> <p><strong>Methods: </strong>Various study designs in NCTs are explored, including quasi-experimental designs, double-blind randomized placebo-controlled trials for nutrient/functional foods supplementation, community-based lifestyle interventions, pragmatic nutrition interventions, and field trial projects. The characteristics, advantages, and challenges of each design are discussed. Real examples are presented to illustrate how these designs can be tailored and optimized for dietary intervention studies.</p> <p><strong>Results: </strong>Parallel randomized clinical trials are acknowledged as the gold standard, despite requiring substantial sample sizes and having inherent limitations. Cross-over NCTs emerge as valuable for assessing temporary treatment effects while mitigating potential confounders and interpatient variability. However, they may not be suitable for acute diseases and progressive disorders, and attrition rates can be higher. 
Multi-arm randomized designs offer increased study power with a lower sample size but necessitate more intricate design, analysis, and result reporting.</p> <p><strong>Conclusion: </strong>In conclusion, each study design in NCTs comes with its set of strengths and limitations. The selection of an appropriate design should consider determinants and common considerations to provide robust evidence for establishing cause-and-effect associations or assessing the safety and efficacy of food products in nutrition research. This comprehensive understanding aids researchers in making informed choices when planning and conducting nutrition clinical trials.</p> 2024-01-02T03:32:22+00:00 Copyright (c) 2024 Journal of Biostatistics and Epidemiology https://publish.kne-publishing.com/index.php/jbe/article/view/14624 Estimation of HIV Prevalence among the Female Population in South India: A Bayesian Approach 2024-02-13T09:24:51+00:00 Elangovan Arumugum none@none.com Vasna Joshua none@none.com <p><strong>Introduction: </strong>The HIV Sentinel Surveillance (HSS) conducted by the National AIDS Control Organization (NACO) is the predominant data source for HIV estimations in India. While the HSS targets the key populations at risk of HIV infection, the National Family Health Survey (NFHS) measures the community-based HIV prevalence. Improvised HIV estimates in India were attributed to the HIV prevalence data obtained from the NACO-HSS and NFHS.</p> <p><strong>Methods: </strong>Bayesian analysis was performed to determine the state-level prevalence of HIV among females in seven South Indian States. The analysis involved plotting the prior, likelihood, and posterior distributions, facilitating a visual assessment of the data. The HIV prevalence among females calculated from the NFHS (2015-16) survey data was used for prior distributions. HIV prevalence among pregnant women obtained from the HIV Sentinel Surveillance 2019 was used for likelihood. 
Bayesian analysis was performed using the R programming language (RStudio 2022.02.0). A posterior probability distribution was obtained from the prior distribution and the likelihood by applying Bayes' theorem. Graphical representation was achieved through R's plotting functions. Kerala and Pondicherry were not included in the analysis due to zero or very low prevalence reported in both NFHS and HSS.</p> <p><strong>Results: </strong>The Bayesian estimates of HIV prevalence among females were 0.38% [95% CI: 0.29 - 0.47] in Andhra Pradesh, 0.28% [95% CI: 0.23 - 0.35] in Karnataka, 0.27% [95% CI: 0.20 - 0.34] in Odisha, 0.27% [95% CI: 0.19 - 0.36] in Telangana, and 0.19% [95% CI: 0.15 - 0.24] in Tamil Nadu.</p> <p><strong>Conclusion: </strong>Bayesian techniques present a versatile and robust strategy for modelling and analysing HIV-related data, offering a flexible and powerful approach to data analysis.</p> 2024-01-02T03:38:57+00:00 Copyright (c) 2024 Journal of Biostatistics and Epidemiology https://publish.kne-publishing.com/index.php/jbe/article/view/14625 The Prevalence of Human Papilloma Virus Infection and Its High Risk Genotypes among Healthy Women in 28 Provinces in Iran; A Systematic Review and Meta-Analysis 2024-02-13T09:25:12+00:00 Mojgan Akbarzadeh-Jahromi none@none.com Negar Taheri none@none.com Babak Dashtdar none@none.com Nasim Taheri none@none.com Fatemeh Abiri none@none.com Marjan Zare none@none.com <p><strong>Introduction:</strong> Human Papilloma Virus infection (HPV) high-risk genotypes are responsible for up to 70% of invasive cervical cancers. This study aimed to determine the national and provincial prevalence of total HPV and its high-risk genotypes, including HPV genotype 16 (HPV16), HPV genotype 18 (HPV18), and HPV genotypes other than 16 and 18 (HPV other genotypes), among healthy Iranian women.</p> <p><strong>Methods:</strong> Iran, with 28 provinces, is located at latitude 32° 00' north and longitude 53° 00' east. 
All Persian and English studies reporting HPV infection based on cervical specimens were selected by searching the PubMed, Magiran, Scopus, and Irandoc databases and the Google Scholar search engine. Sample size and event rates were used to compute the overall event rates and 95% confidence interval (95% C.I). The fixed or random effects model, heterogeneity indices including the Q-statistic (p-value), and the degree of heterogeneity (I<sup>2</sup>) were reported. The search was done up to February 29, 2022. Comprehensive Meta-analysis 2.2.064 and ArcGIS 10.8.2 software tools were used at a significance level of &lt;0.05.</p> <p><strong>Results: </strong>The meta-analysis included nineteen studies with 258839 participants. The national meta-analysis resulted in a total HPV prevalence of 0.025 (95% C.I 0.016, 0.039); those of HPV16, HPV18, and HPV other genotypes were 0.032 (95% C.I 0.019, 0.051), 0.028 (95% C.I 0.019, 0.040), and 0.048 (95% C.I 0.033, 0.069), respectively. The provincial meta-analysis showed that the total HPV prevalence was highest in Zanjan and Kerman (0.323 and 0.240, respectively); that of HPV16 was highest in Boushehr and Khozestan (0.298 and 0.253, respectively); that of HPV18 was highest in Tehran (0.089) and that of HPV other genotypes was highest in Khozestan (0.542).</p> <p><strong>Conclusion: </strong>The current results would help policymakers and health managers emphasize further implementation of screening strategies and health services in needier areas such as Zanjan, Kerman, Khozestan, and Tehran.</p> 2024-01-02T03:48:10+00:00 Copyright (c) 2024 Journal of Biostatistics and Epidemiology https://publish.kne-publishing.com/index.php/jbe/article/view/14626 Count Data Regression Modelling: An Application to Monkeypox Confirmed Cases 2024-02-13T09:25:38+00:00 Divya Vijithaswan Nair none@none.com Rujuta Hadaye none@none.com <p><strong>Introduction: </strong>Alongside COVID-19, some countries also faced an increase in the number of cases 
due to the Monkeypox virus. The main aim of this research was to investigate whether count data regression models can be fitted to predict the daily incidence of confirmed Monkeypox cases.</p> <p><strong>Methods: </strong>In this study, we used two traditional count regression models, the Poisson regression model and the negative binomial regression model, with identity and logarithmic link functions. Since our data were overdispersed, the negative binomial regression model with a logarithmic link function fitted better than the other models. The parameters were estimated using SPSS, version 23.0.</p> <p><strong>Results: </strong>The negative binomial regression model with a logarithmic link function fits the Monkeypox case data well. The model shows that the majority of the countries, including Brazil, Canada, France, Germany, Peru, Spain, the United Kingdom, and the United States of America, show a significant decrease in the number of cases over time. The prediction line plotted from this model predicts the daily Monkeypox cases reported by the different countries well.</p> <p><strong>Conclusion: </strong>From our study, we concluded that the count data regression model can be used widely to predict the incidence of any disease. 
Canada and Brazil have the largest and smallest slope coefficients, respectively, indicating the maximum and minimum decrease in the expected number of cases confirmed each day.</p> 2024-01-02T04:10:37+00:00 Copyright (c) 2024 Journal of Biostatistics and Epidemiology https://publish.kne-publishing.com/index.php/jbe/article/view/14627 Determinants of Hospital Stay Duration Post-Colorectal Surgery 2024-02-13T09:25:59+00:00 Gideon Addo none@none.com Paul Poku Sampene Ossei none@none.com Bismark Amponsah Yeboah none@none.com William Gilbert Ayibor none@none.com Raphael Doh-Nani none@none.com Seidu Mohammed none@none.com Michael Obuobi none@none.com Roselyn Assor Appau none@none.com <p><strong>Introduction: </strong>Hospital length of stay (LOS) remains a vital metric for assessing patient outcomes and healthcare resource utilization. Given the substantial financial impact of diagnosing and treating colorectal anomalies, coupled with an increased susceptibility to postoperative complications, it is crucial to understand the factors affecting LOS following colorectal surgery. Our primary objective was to investigate the preoperative, intraoperative, and postoperative risk factors that have a substantial influence on LOS following a colorectal procedure.</p> <p><strong>Methods: </strong>This study analyzed data from a retrospective study of adults who underwent various colorectal surgeries (colostomy, ileostomy, small bowel resection, etc.) at the Cleveland Clinic Foundation (January 2005 - December 2014). Predictor variables were categorized into preoperative (patient demographics, medical history, comorbidities, lifestyle factors), intraoperative, and postoperative factors. LOS was grouped into short-term (SLOS) (≤ 7 days), medium-term (MLOS) (8-30 days), and long-term (LLOS) (&gt; 30 days) stays. 
Multinomial logistic regression models assessed predictor effects on LOS.</p> <p><strong>Results: </strong>Among the 7874 patients, 50.7% were females, with a minimum age of 20 years. SLOS was observed in 61.1%, MLOS in 37.6%, and LLOS in 1.3% of patients. Advanced age correlated with prolonged LOS, possibly due to age-related health challenges like weak immune systems. Coagulopathy and fluid and electrolyte disorders raised MLOS and LLOS risk, likely due to complications like significant bleeding and electrolyte imbalances. Surgery duration predicted longer LOS, elevating LLOS and MLOS by 52% and 42%. Postoperative infections were associated with extended stays, possibly due to subsequent interventions, monitoring and recovery delays.</p> <p><strong>Conclusion: </strong>Our study revealed that key preoperative predictors of LOS included age, coagulopathy, fluid and electrolyte disorders, severe weight loss, and drug abuse. Notably, intraoperative factors such as surgical approach (open vs laparoscopic) and surgery duration, alongside postoperative complications including superficial and serious infections, significantly influenced LOS. By incorporating these insights into preoperative planning, clinicians could potentially develop tailored interventions to mitigate risk factors and enhance postoperative recovery, thus potentially reducing LOS and improving patient outcomes.</p> 2024-01-02T04:25:28+00:00 Copyright (c) 2024 Journal of Biostatistics and Epidemiology https://publish.kne-publishing.com/index.php/jbe/article/view/14628 Addressing Heteroscedasticity in Correlated Binary Data: A Bayesian Mixed Effects Location Scale Approach 2024-02-13T09:26:20+00:00 Parisa Rezanejad-Asl none@none.com Farid Zayeri none@none.com Abbas Hajifathali none@none.com <p><strong>Introduction:</strong> The mixed effects logistic regression model is a common model for analysing correlated binary data such as longitudinal data. 
The between and within subject variances are typically considered to be homogeneous, but longitudinal data often show heterogeneity in these variances. This study proposes a Bayesian mixed effects location scale model to accommodate heteroscedasticity in binary data analysis.</p> <p><strong>Methods: </strong>This study was carried out in two stages: first, a simulation study was used to evaluate the accuracy of the proposed model with the Bayesian approach, and then the proposed model was applied to real data. In the simulation study, the data were generated from the mixed effects location scale model with different correlations between the random location effect and random scale effect and different sample sizes. To evaluate the accuracy of the estimates, the Root Mean Square Error, bias and Coverage Probability were calculated, and the deviance information criterion was used to select the appropriate model. Finally, we used this model to analyse uric acid (UA) levels of patients with haematological disorders.</p> <p><strong>Results: </strong>The simulation results show the accuracy of model parameter estimates as well as the correlation between random location and scale effects. They also show that if a random scale effect is present in the data, it should be accounted for in the model. Otherwise, the model is forced to assign the within subject variation due to these subject random effects to the error term. The results for the real data are also in line with this. The odds of having normal UA levels increase by 26% per week. Due to the positive value of the covariance parameter, patients with a higher mean UA level show higher variation in UA levels. 
Furthermore, the significance of the covariates in the between subject and within subject variance models, as well as the significance of the random scale variance, indicates heterogeneity across subjects.</p> <p><strong>Conclusion: </strong>The Bayesian mixed effects location scale model provides a useful tool for analysing correlated binary data with heteroscedasticity because it accounts for data correlation and models the mean and variance simultaneously. Furthermore, it improves the accuracy of statistical inference in longitudinal studies compared to classic mixed effects models.</p> 2024-01-02T04:28:18+00:00 Copyright (c) 2024 Journal of Biostatistics and Epidemiology https://publish.kne-publishing.com/index.php/jbe/article/view/14629 Quantile Regression in Survival Analysis: Comparing Check-Based Modeling and the Minimum Distance Approach 2024-02-13T09:26:33+00:00 Fereshteh Mokhtarpour none@none.com Mostafa Hosseini none@none.com Akram Yazdani none@none.com Mehdi Yaseri none@none.com <p><strong>Introduction: </strong>Quantile regression is a valuable alternative for survival data analysis, enabling flexible evaluations of covariate effects on survival outcomes with intuitive interpretations. It offers practical computation and reliability. However, challenges arise when applying quantile regression to censored data, particularly for upper quantiles. The minimum distance approach, utilizing dual-kernel estimation and the inverse cumulative distribution function, shows promise in addressing these challenges, especially with higher-dimensional covariates.</p> <p><strong>Methods:</strong> This study contrasts two methods within the realm of quantile linear regression for survival analysis: check-based modeling and the minimum distance approach. 
Effectiveness is assessed across various scenarios through comprehensive simulation.</p> <p><strong>Results:</strong> The simulation results showed that using the quantile regression model with the minimum distance approach reduces the percentage of root mean square error in parameter estimation compared to the quantile regression models based on the check loss function. Additionally, a larger sample size and reduced censoring percentage led to decreased root mean square error in parameter estimation.</p> <p><strong>Conclusion: </strong>The research highlights the benefits of using the minimum distance approach for quantile regression. It reduces errors, improves model predictions, captures patterns, and optimizes parameters even with complete data. However, this approach has limitations. The accuracy of estimated quantiles can be influenced by the choice of distance metric and weighting scheme. The assumption of independence between censoring mechanism and survival time may not hold in real-world scenarios. Additionally, dealing with large datasets can be computationally complex.</p> 2024-01-02T04:32:46+00:00 Copyright (c) 2024 Journal of Biostatistics and Epidemiology