Controlled Vocabularies and Information Retrieval: 1918 Pandemic’s Scientific Literature as an Example

Controlled vocabularies are broadly recognized as an important aid to information retrieval. There is also a standing demand that editors and databases introduce controlled vocabularies effectively into their procedures for indexing scientific literature. This matters in particular because information retrieval is regarded as a key step in conducting systematic literature reviews. Hence a first question emerges: are controlled vocabularies currently being taken into account? In addition, subject searching in catalogs is complex, mainly because of the dichotomy between author-supplied keywords and keywords drawn from controlled vocabularies. Finally, there is some demand to unify health-related terminology in order to facilitate the use of, and research into, medical history. With these considerations in mind, this paper focuses on controlled vocabularies in the health field and their role in storing, classifying, and retrieving relevant literature. The objective is to determine what role health-related controlled vocabularies play in indexing and retrieving research literature in databases such as Web of Science (WoS) and Scopus. This exploratory research is grounded in two research questions: 1) Which terms are included in specific controlled vocabularies of the health field? and 2) How are papers indexed in relevant databases so that they can be easily retrieved, considering author keywords versus health-specific controlled vocabularies? The research takes as its case study the controlled vocabularies related to health and the scientific interest in the 1918 flu pandemic, also known mistakenly as the ‘Spanish flu’. This interest has been fostered by the emergence in the early 21st century of epidemics of pneumonic diseases caused by viruses. Searches about and with controlled vocabularies are conducted on the WoS and Scopus databases. The first results of this work in progress are surprising.
There are several controlled vocabularies for the health field, and in each of them the terms recorded and preferred for the ‘1918 pandemic’ are identified. In summary, ‘Spanish influenza epidemic’ and ‘Spanish flu’ are recorded as non-preferred terms. The preferred terms are ‘influenza’ or ‘influenza pandemic, 1918-1919’. Although the controlled vocabularies are clear in their choice, most of the literature about the ‘1918 pandemic’ is retrievable by either ‘Spanish’ or ‘1918’ (a disjunctive search), and the dominant word for retrieving literature is ‘Spanish’ rather than ‘1918’. This is surprising given the existence of suitable controlled vocabularies for health topics and the current World Health Organization guidelines on the naming of diseases, which point to other preferred terms. A first conclusion is that controlled vocabularies are not being applied effectively in a field such as health, and consequently in WoS and Scopus. This research opens further questions about the role that controlled vocabularies play in the instructions that journals give to authors.
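
The distinction between preferred and non-preferred terms, and the disjunctive ‘Spanish’ or ‘1918’ search, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the mapping table is a hypothetical two-entry vocabulary built only from the terms named above, not an export of any real thesaurus, and the three sample titles stand in for database records. It does not reproduce the searches carried out on WoS or Scopus.

```python
# Hypothetical mini-vocabulary: non-preferred (entry) term -> preferred term.
# Built only from the terms named in the abstract; not a real thesaurus export.
PREFERRED = {
    "spanish flu": "influenza pandemic, 1918-1919",
    "spanish influenza epidemic": "influenza pandemic, 1918-1919",
}

# The two words used for the disjunctive ('Spanish' OR '1918') search.
SEARCH_WORDS = ("spanish", "1918")


def to_preferred(term):
    """Return the preferred term for an entry term, or the normalized term itself."""
    key = term.strip().lower()
    return PREFERRED.get(key, key)


def matched_by(title):
    """Return the set of search words that occur in a title (case-insensitive)."""
    text = title.lower()
    return {word for word in SEARCH_WORDS if word in text}


def tally(titles):
    """Count how many titles each search word retrieves."""
    counts = {word: 0 for word in SEARCH_WORDS}
    for title in titles:
        for word in matched_by(title):
            counts[word] += 1
    return counts


# Illustrative sample only; far too small to support any conclusion.
SAMPLE_TITLES = [
    "On the Centenary of the Spanish Flu: Being Prepared for the next Pandemic",
    "1918 influenza: the mother of all pandemics",
    "The 1918 'Spanish' flu: pearls from swine?",
]

if __name__ == "__main__":
    print(to_preferred("Spanish flu"))
    print(tally(SAMPLE_TITLES))
```

In this sketch a record indexed under the preferred term would be found regardless of the author's wording, whereas the plain word match depends entirely on whether ‘Spanish’ or ‘1918’ happens to appear in the title.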




