Identifying Emerging Trends in Scientific Texts Using TF-IDF Algorithm: A Case Study of Medical Librarianship and Information Articles

  • Meisam Dastani, PhD Candidate in Knowledge and Information Science, Payame Noor University, Tehran, Iran
  • Afshin Mousavi Chelak, Associate Professor, Department of Knowledge and Information Science, Payame Noor University, Tehran, Iran
  • Soraya Ziaei, Associate Professor, Department of Knowledge and Information Science, Payame Noor University, Tehran, Iran
  • Faeze Delghandi, Assistant Professor, Department of Knowledge and Information Science, Payame Noor University, Tehran, Iran
Keywords: Librarianship and Information; Medical; Analysis; Keyword; Text Mining; TF-IDF

Abstract

Background: As the number of articles published across scientific fields grows, identifying publishing trends and emerging keywords in the texts of these articles has become essential.

Objectives: The present study identified and analyzed the keywords used in published articles on medical librarianship and information.

Methods: This exploratory, descriptive study applied text mining techniques to librarianship and information articles published in the field's specialized journals from 1964 to 2019. The TF-IDF weighting algorithm was used to identify the most important keywords in the articles, and the text mining algorithms were implemented in Python.
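
For illustration, the following is a minimal sketch of how TF-IDF weights can be aggregated into a per-term ranking. It is not the authors' implementation: the toy corpus, the function name `tfidf_term_weights`, the `log(N/df) + 1` form of the inverse document frequency, and the choice to sum weights across documents are all assumptions. The input is assumed to be already tokenized and stemmed, as the truncated keyword forms reported in the abstract suggest.

```python
import math
from collections import Counter


def tfidf_term_weights(docs):
    """Return {term: TF-IDF weight summed over all documents}.

    `docs` is a list of token lists (hypothetical, pre-stemmed input).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    totals = Counter()
    for doc in docs:
        counts = Counter(doc)
        for term, count in counts.items():
            tf = count / len(doc)  # relative term frequency
            # Assumed IDF variant; other smoothing schemes are common.
            idf = math.log(n_docs / df[term]) + 1.0
            totals[term] += tf * idf
    return dict(totals)


# Hypothetical toy corpus, not data from the study.
docs = [
    "library catalog book library".split(),
    "patient inform library".split(),
    "patient intervent".split(),
]

weights = tfidf_term_weights(docs)
# Rank terms from highest to lowest aggregated weight.
ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
for term, weight in ranked[:3]:
    print(f"{term}: {weight:.3f}")
```

Summing over documents rewards terms that are both frequent within documents and present in many of them; averaging or taking the maximum would give a different ranking.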

Results: The TF-IDF results indicated that the words “Library”, “Patient”, and “Inform”, with weights of 95.087, 65.796, and 63.386, respectively, were the most important keywords in the published articles on medical librarianship and information. The words “Catalog”, “Book”, and “Journal” were the most important keywords in articles published between 1960 and 1970, whereas “Patient”, “Bookstore”, and “Intervent” were the most important keywords in articles on medical librarianship and information published from 2015 to 2020. The words “Blockchain”, “Telerehabilit”, “Instagram”, “WeChat”, and “Comic” were new keywords observed in articles on medical librarianship and information between 2015 and 2020.
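
One simple way to surface keywords that are new to a period, sketched here under stated assumptions, is to compare the vocabulary of that period with the vocabulary of all earlier years. The abstract does not describe how new keywords were detected, so the set-difference approach, the function name `new_terms`, and the sample records below are hypothetical.

```python
def new_terms(records, start_year, end_year):
    """Return terms seen in [start_year, end_year] but in no earlier year.

    `records` is an iterable of (year, tokens) pairs (hypothetical input).
    """
    earlier, recent = set(), set()
    for year, tokens in records:
        if year < start_year:
            earlier.update(tokens)
        elif year <= end_year:
            recent.update(tokens)
        # Records after end_year are ignored.
    return recent - earlier


# Hypothetical records, not data from the study.
records = [
    (1965, ["catalog", "book", "journal"]),
    (1998, ["library", "patient", "inform"]),
    (2017, ["patient", "blockchain", "library"]),
    (2019, ["wechat", "inform"]),
]

print(sorted(new_terms(records, 2015, 2020)))
```

A term counts as new only if it never appears before the start year, so a single early occurrence excludes it; a frequency threshold would be a more tolerant alternative.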

Conclusions: The keywords used in articles on medical librarianship and information have not been consistent over time and have changed from one period to the next. The field has shifted in response to the needs of society as information technologies have emerged and grown.

Published: 2021-05-23
Section: Articles