Identification of Potential Biomarkers for Osteoarthritis
Abstract
Background: We aimed to identify biomarkers associated with osteoarthritis (OA) and evaluate their predictive capabilities.
Methods: Four synovial tissue datasets (GSE1919, GSE12021, GSE55235, GSE55457) and one peripheral blood mononuclear cell (PBMC) dataset (GSE48556) were obtained. GSE55235 and GSE55457 were merged to conduct differential expression analysis and train machine learning algorithms. Predictive models were trained using a subset of genes and then validated on the other datasets. In addition, the PBMC dataset was used to train predictive models using the same subset of genes, with the synovial tissue datasets serving as validation datasets. Finally, immune infiltration analysis was performed on the merged synovial tissue dataset using CIBERSORT.
Results: RPA3, LAMA5, SAT1, and UCP2 were used to train machine learning algorithms. Models trained on the merged synovial tissue dataset performed well on the synovial tissue validation datasets but less well on the PBMC dataset, where they achieved high sensitivity but only moderate specificity. However, models trained on the PBMC dataset exhibited high sensitivity and specificity in the four external validation datasets. SAT1 had the greatest impact on model performance. Immune infiltration analysis revealed significant differences in the estimated abundance of several immune cell types, such as mast cells, between the OA and control groups. In general, the four genes showed moderate to strong correlations with mast cells.
Conclusion: While promising, our findings point to the need for further studies to validate biomarkers and improve the models' predictive power across diverse sample types.
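For readers less familiar with the two metrics reported in the Results, the following minimal sketch shows how sensitivity and specificity are computed from true and predicted labels. It is illustrative only and is not the authors' pipeline: the function name `sensitivity_specificity`, the label coding (1 = OA, 0 = control), and the toy labels are assumptions introduced here, not data from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding (hypothetical, not from the study): 1 = OA, 0 = control.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # OA called OA
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # OA called control
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # control called control
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # control called OA

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one OA and one control sample")

    sensitivity = tp / (tp + fn)  # fraction of OA samples detected
    specificity = tn / (tn + fp)  # fraction of controls correctly excluded
    return sensitivity, specificity


# Toy labels (made up for illustration): every OA sample is detected,
# but half of the controls are misclassified as OA.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The toy example mirrors the pattern described for the PBMC validation: a model can detect every OA sample while still misclassifying a substantial share of controls, which is why both metrics are reported.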