Controlled Vocabularies and Information Retrieval: 1918 Pandemic’s Scientific Literature as an Example

The importance of controlled vocabularies for information retrieval is broadly recognized. There is also a standing demand that editors and databases introduce controlled vocabularies effectively into their procedures for indexing scientific literature. This is especially important because information retrieval is a key step in conducting systematic literature reviews. Hence, a first question emerges: are controlled vocabularies currently being used? Moreover, subject searching in catalogs is complex, mainly because of the dichotomy between author-supplied keywords and keywords drawn from controlled vocabularies. Finally, there is some demand to unify health-related terminology in order to facilitate the exploitation of and research on medical history. Considering these issues, this paper focuses on controlled vocabularies in the health field and their role in storing, classifying, and retrieving relevant literature. The objective is to determine what role health-related controlled vocabularies play in indexing and retrieving research literature in databases such as Web of Science (WoS) and Scopus. This exploratory research is therefore grounded in two research questions: 1) Which terms are included in specific controlled vocabularies of the health field? and 2) How are papers indexed in relevant databases so that they can be easily retrieved, considering author keywords versus health-specific controlled vocabularies? The research takes as its field of study the controlled vocabularies related to health and the scientific interest in the 1918 flu pandemic, also known mistakenly as the ‘Spanish flu’. This interest has been fostered by the emergence in the early 21st century of epidemics of pneumonic diseases caused by viruses. Searches about and with controlled vocabularies are conducted on the WoS and Scopus databases. The first results of this work in progress are surprising.
There are different controlled vocabularies for the health field, in which the collected and preferred terms for the ‘1918 pandemic’ are identified. In summary, ‘Spanish influenza epidemic’ and ‘Spanish flu’ are recorded as non-preferred terms. The preferred terms are ‘influenza’ and ‘influenza pandemic, 1918-1919’. Although the controlled vocabularies are clear in their choice, most of the literature on the ‘1918 pandemic’ is retrievable by searching for either ‘Spanish’ or ‘1918’, and the dominant word for retrieving literature is ‘Spanish’ rather than ‘1918’. This is surprising given the existence of suitable controlled vocabularies for health topics and the modern World Health Organization guidelines on the naming of diseases, which point to other preferred terms. A first conclusion is that controlled vocabularies are not being used effectively in a field such as health, and consequently in WoS and Scopus. This research opens further questions about the role that controlled vocabularies play in the instructions that journals give to authors.

Twitter Sentiment Analysis during the Lockdown on New Zealand

Sentiment analysis is one of the most common tasks in natural language processing (NLP). It can be used to mine the feeling expressed in text about various events. Twitter is regarded as a reliable data source for sentiment analysis studies because people used social media on a broad scale to receive and exchange different types of information during the COVID-19 epidemic. Processing such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to examine how sentiment differed in a single geographic region during lockdown at two different times. A total of 1162 tweets related to the COVID-19 pandemic lockdown were analyzed, collected using the keyword hashtags 'lockdown' and 'COVID-19'. The first sample covered March 23, 2020 to April 23, 2020, and the second sample, for the following year, covered March 1, 2021 to April 4, 2021. NLP, a form of artificial intelligence, was used to calculate the sentiment value of all the tweets with the AFINN lexicon sentiment analysis method. The findings revealed that sentiment in both periods of the region's lockdown was positive in the samples of this study, which are specific to the geographical area of New Zealand. Future research is suggested to apply a machine learning sentiment method such as Crystal Feel and to extend the sample by using more tweets collected over a longer period of time.
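
The lexicon-based scoring mentioned above can be pictured with a minimal Python sketch. It assumes an AFINN-style word list in which each word carries an integer valence between -5 and +5; the tiny lexicon, its valences, and the function names below are illustrative assumptions, not the real AFINN list or the authors' code.

```python
import re

# Toy lexicon in the AFINN style: each word maps to an integer valence
# between -5 and +5. These entries are illustrative, NOT the real AFINN list.
TOY_LEXICON = {"good": 3, "safe": 1, "happy": 3,
               "bad": -3, "worried": -3, "stuck": -2}

def tweet_score(text, lexicon=TOY_LEXICON):
    """Sum the valences of all lexicon words found in the tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(w, 0) for w in words)

def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def mean_score(tweets, lexicon=TOY_LEXICON):
    """Average score over a sample of tweets (0.0 for an empty sample)."""
    if not tweets:
        return 0.0
    return sum(tweet_score(t, lexicon) for t in tweets) / len(tweets)
```

Comparing `mean_score` for the 2020 sample with `mean_score` for the 2021 sample would give one simple way to contrast the two lockdown periods; words missing from the lexicon contribute nothing to the score.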

Information Literacy among Faculty and Students of Medical Colleges of Haryana, Punjab and Chandigarh

With the availability of diverse printed and electronic literature and websites on medical and health-related information, it is impossible for medical professionals to get the information they seek in the shortest possible time. Information literacy is the only solution to these problems, and it is therefore recognized as an important aspect of medical education. The present study attempts to assess the information literacy skills of the faculty and students at medical colleges of Haryana, Punjab and Chandigarh. The scope of the study was confined to 12 selected medical colleges of three States (Haryana, Punjab, and Chandigarh). The findings are based on data collected through 1018 questionnaires completed by respondents at the medical colleges. Online medical websites (such as WebMD, eMedicine and Mayo Clinic) were frequently used by 63.43% of the respondents of Chandigarh, which is slightly more than in Haryana (61%) and Punjab (55.65%). In addition, 30.86% of the respondents of Chandigarh, 27.41% of Haryana and 27.05% of Punjab were familiar with the controlled vocabulary tool; 25.14% of the respondents of Chandigarh, 23.80% of Punjab and 23.17% of Haryana were familiar with Boolean operators; 33.05% of the respondents of Punjab, 28.19% of Haryana and 25.14% of Chandigarh were familiar with the use and importance of keywords when searching an electronic database; and 51.43% of the respondents of Chandigarh, 44.52% of Punjab and 36.29% of Haryana were able to make effective use of the retrieved information. For accessing information in electronic format, 47.74% of the respondents rated their skills as high, while the majority of respondents (76.13%) were unfamiliar with a basic search technique, namely the Boolean operators used for searching information in an online database.
On the basis of the findings, it was suggested that a comprehensive training program based on medical professionals' information needs be organized frequently. It was also suggested that information literacy be included as a subject in the health science curriculum so as to make medical professionals information literate and independent lifelong learners.

Measuring Text-Based Semantics Relatedness Using WordNet

Measuring semantic similarity between texts means calculating their semantic relatedness using various techniques. Our web application, Measuring Relatedness of Concepts (MRC), allows a user to input two text corpora and obtain a semantic similarity percentage between them using WordNet. The application computes semantic relatedness in five stages: Preprocessing (extracts keywords from the content), Feature Extraction (classifies words into parts of speech), Synonyms Extraction (retrieves synonyms for each keyword), Measuring Similarity (measures similarity using the keywords and synonyms) and Visualization (presents the similarity measure graphically). The user can therefore also measure similarity on the basis of features. The end result is a percentage score together with the word(s) that form the basis of the similarity between the two texts, obtained with different tools on the same platform. In future work, we look forward to a 'Web as a live corpus' application that provides a simpler and more user-friendly tool to compare documents and extract useful information.
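
The preprocessing, synonym extraction and similarity stages listed above might be sketched in Python as follows. This is a simplified illustration under stated assumptions: a small hand-written synonym table stands in for WordNet, the similarity measure is a Jaccard overlap expressed as a percentage, and the part-of-speech and visualization stages are omitted. None of these choices is confirmed by the abstract.

```python
import re

STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on"}

# Hypothetical stand-in for WordNet synonym lookups.
TOY_SYNONYMS = {
    "car": {"automobile", "auto"},
    "automobile": {"car", "auto"},
    "fast": {"quick", "rapid"},
    "quick": {"fast", "rapid"},
}

def keywords(text):
    """Preprocessing: lowercase, tokenize, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}

def expand(words, synonyms=TOY_SYNONYMS):
    """Synonym extraction: add the known synonyms of every keyword."""
    expanded = set(words)
    for w in words:
        expanded |= synonyms.get(w, set())
    return expanded

def similarity(text_a, text_b):
    """Return (percentage score, shared words) for two texts.

    ASSUMPTION: the score is the Jaccard overlap of the expanded
    keyword sets, scaled to 0..100.
    """
    a = expand(keywords(text_a))
    b = expand(keywords(text_b))
    if not a or not b:
        return 0.0, set()
    shared = a & b
    return 100.0 * len(shared) / len(a | b), shared
```

With this toy table, "The car is fast" and "A quick automobile" expand to the same word set, so they score 100%, and the shared words returned alongside the score show which terms drove the match.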

Managing Business Processes in the Age of Digital Transformation: A Literature Review

Today, digital transformation is one of the leading topics occupying the attention of researchers and business experts. Organizational success is most often reflected in the successful management of business processes. Given the growing market for digital innovations and its ever-increasing impact on business, organizations need to be prepared for the organizational changes that come with the digital era. In order to maintain their competitive advantage in the global market, organizations must adapt their processes to the new conditions of digitalization. The main goal of this study is to point out the link between digital transformation and the business process management concept. To contribute to the scientific field that explores this potential relation, a literature review was conducted. Papers were searched within the Business Process Management Journal using keywords related to the term 'digital transformation'. The selected papers were analyzed according to topic, type of publication, year of publication, keywords, etc. The results reveal a growing number of papers on the topic of digital transformation published in the Business Process Management Journal, but a lack of case studies. This paper contributes to extending the academic literature in this important, yet insufficiently researched, scientific field that links the two strong concepts of digital transformation and business process management.

Analyzing Keyword Networks for the Identification of Correlated Research Topics

The production and publication of scientific works have increased significantly in recent years, with the Internet being the main channel for accessing and distributing them. As a result, there is growing interest in understanding how scientific research has evolved, and in using this knowledge to encourage research groups to become more productive. The objective of this work is therefore to explore repositories containing data from scientific publications and to characterize the keyword networks of these publications, in order to identify the most relevant keywords and to highlight those that have the greatest impact on the network. To do this, the keywords of each article in the study repository are extracted and used to build the network, after which several social network analysis metrics are applied to identify the most prominent keywords.
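
As an illustration of the procedure described above, the following Python sketch builds a keyword co-occurrence network from per-article keyword lists and ranks keywords by degree centrality. The abstract does not say which metrics were used, so degree centrality is an assumed example, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_network(articles):
    """Undirected co-occurrence network: keywords are nodes, and an edge
    weight counts how many articles list both keywords. Keywords that never
    co-occur with another keyword do not appear in the network."""
    weights = defaultdict(int)
    for kws in articles:
        for a, b in combinations(sorted(set(kws)), 2):
            weights[(a, b)] += 1
    return dict(weights)

def degree_centrality(weights):
    """Fraction of the other nodes each keyword is directly linked to."""
    neighbours = defaultdict(set)
    for a, b in weights:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {k: 0.0 for k in neighbours}
    return {k: len(v) / (n - 1) for k, v in neighbours.items()}
```

For example, with the three keyword lists `["ir", "ranking"]`, `["ir", "indexing"]` and `["ir", "ranking", "nlp"]`, the keyword `ir` is linked to every other node and so has centrality 1.0.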

3D-Vehicle Associated Research Fields for Smart City via Semantic Search Approach

This paper presents 15-year trends for scientific studies in a scientific database considering the words '3D' and 'vehicle'. The two words are used to find their associated publications in the IEEE scholar database. Each keyword is entered individually for the years 2002, 2012, and 2016 to identify the subjects preferred by researchers in those years. After searching and listing, we classified closely related research fields. The three years (2002, 2012, and 2016) were investigated to trace progress over specified time intervals: the first interval, 2002-2012, is taken as the period of initial progress, and the second, 2012-2016, as a period of fast development. The results are helpful for understanding scholars’ research field preferences over the period studied. This information will be valuable for smart city research involving 3D and vehicle-related issues.

Analysis of the Topics of Research of Brazilian Researchers Acting in the Areas of Engineering

The production and publication of scientific works have increased significantly in recent years, with the Internet being the main channel for accessing and disseminating them. In view of this, researchers from several areas of knowledge have carried out studies on scientific production data in order to analyze phenomena and trends in science. Understanding how research has evolved can, for example, serve as a basis for building science policies that promote further advances and stimulate research groups to become more productive. In this context, the objective of this work is to analyze the main research topics investigated over the careers of Brazilian researchers working in the areas of engineering, in order to map scientific knowledge and identify prominent topics. To this end, the frequency of and relationships among the keywords of the scientific articles registered in the curricula of each selected researcher on the Lattes Platform are studied with the aid of bibliometric analysis.

Lecture Video Indexing and Retrieval Using Topic Keywords

In this paper, we propose a framework to help users search for and retrieve the portions of a lecture video that interest them. This is achieved by temporally segmenting and indexing the lecture video using topic keywords. For this purpose, we use the transcribed text of the video together with documents relevant to the video topic extracted from the web. The keywords for indexing are found by applying the non-negative matrix factorization (NMF) topic modeling technique to the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end times of the portions of the video covering a particular topic. This time information is stored in an index table along with the topic keyword, and the table is used to retrieve the specific portions of the video for a query provided by the user.
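
The index table described above can be illustrated with a short Python sketch. It assumes the topic keywords have already been obtained (for example from an NMF topic model, which is not reproduced here) and that the transcript is available as timed segments; the data layout and function names are hypothetical.

```python
def build_index(segments, topic_keywords):
    """Build {keyword: [(start_sec, end_sec), ...]} from a timed transcript.

    segments: list of (start_sec, end_sec, text) tuples in time order.
    topic_keywords: lowercase single-word keywords, assumed to be given.
    Back-to-back segments that mention the same keyword are merged.
    """
    index = {kw: [] for kw in topic_keywords}
    for start, end, text in segments:
        words = set(text.lower().split())
        for kw in topic_keywords:
            if kw in words:
                spans = index[kw]
                if spans and spans[-1][1] == start:
                    spans[-1] = (spans[-1][0], end)  # extend previous span
                else:
                    spans.append((start, end))
    return index

def retrieve(index, query):
    """Return the time spans stored for a query keyword (empty if unknown)."""
    return index.get(query.lower(), [])
```

A query for a keyword then returns the start and end times of every portion of the lecture in which that topic keyword occurs, which is the role the abstract assigns to the index table.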

Keyword Network Analysis on the Research Trends of Life-Long Education for People with Disabilities in Korea

The purpose of this study is to examine research trends in life-long education for people with disabilities using keyword network analysis. For this purpose, 151 papers were selected from 594 papers retrieved with keywords such as 'people with disabilities' and 'life-long education' from the Korean Education and Research Information Service. The keyword network was constructed by extracting and coding the keywords used in the titles of the selected papers. The frequency, degree centrality, and betweenness centrality of the extracted keywords were analyzed using the keyword network. The results of the keyword network analysis are as follows. First, the main keywords that appeared frequently in studies of life-long education for people with disabilities were 'people with disabilities', 'life-long education', 'developmental disabilities', 'current situations', and 'development'. Research on life-long education for people with disabilities has focused on the current status of life-long education and on program development. Second, the keyword network analysis and visualization showed that keywords with a high frequency of occurrence also generally have high degree centrality and betweenness centrality. The keyword network diagram confirmed that research trends in life-long education for people with disabilities are centered on six prominent keywords. Based on these results, it is argued that life-long education for people with disabilities needs to expand its subjects and supporting areas in the future, and that research needs to be extended into more detailed and specific areas.

Development of a Technology Assessment Model by Patents and Customers' Review Data

Recent years have seen an increasing number of patent disputes due to excessive competition in the global market and a reduced technology life-cycle; this has increased the risk of investment in technology development. While many global companies have started developing methodologies to identify promising technologies and assess them for decision-making, the existing methodologies still have some limitations. Post hoc assessments of new technologies are not being performed, especially to determine whether the suggested technologies turned out to be promising. For example, in existing quantitative patent analysis, a patent’s citation information has served as an important metric for quality assessment, but this analysis cannot be applied to recently registered patents because such information accumulates over time. Therefore, we propose a new technology assessment model that can replace citation information and positively affect technological development, based on post hoc analysis of the patents for promising technologies. Additionally, we collect customer reviews on a target technology to extract keywords that show the customers’ needs, and we determine how many of these keywords are covered by the new technology. Finally, we construct a portfolio (based on a technology assessment from patent information) and a customer-based marketability assessment (based on review data), and we use them to visualize the characteristics of the new technologies.

Stop Texting While Learning: A Meta-Analysis of Social Networks Use and Academic Performances

Teachers and university lecturers face an unsolved problem: students’ multitasking behaviors during class time, such as texting or playing games. It is important to identify the most powerful predictors of students’ educational performance. Meta-analysis was used to analyze research articles published with the keywords 'multitasking', 'class performance', and 'texting'. We selected 14 research articles published during 2008-2013 from online databases, and four articles met the predetermined inclusion criteria. The effect size of each pair of variables was used as the dependent variable. The findings revealed that students’ expectancy and value regarding the use of social networking sites (SNSs) is the best significant predictor of their educational performance, followed by their motivation and ability in using SNSs, prior educational performance, SNS usage behaviors in class, and personal characteristics, respectively. Future studies should adopt a longitudinal design to better understand the effect of multitasking in the classroom.

Adapting Tools for Text Monitoring and for Scenario Analysis Related to the Field of Social Disasters

Humanity is faced more and more often with social disasters, which in turn can generate new accidents and catastrophes. To mitigate their consequences, it is important to obtain the earliest possible signals about events that are occurring or may occur and to prepare corresponding scenarios that could be applied. Our research focuses on two problems in this domain: identifying signals that an accident has occurred or may occur, and mitigating some consequences of disasters. To solve the first problem, methods for selecting and processing texts from the Internet are developed. Information in Romanian is of special interest to us. Obtaining these tools involves several steps, divided into a preparatory stage and a processing stage. During the first stage, we manually collected over 724 news articles and classified them into 10 categories of social disasters. The collection comprises more than 150 thousand words. Using this information, a controlled vocabulary of more than 300 keywords was developed, which will help in classifying and identifying texts related to the field of social disasters. To solve the second problem, the Petri net formalism is used. We deal with the problem of evacuating inhabitants in good time. Analysis methods such as the reachability or coverability tree and the invariants technique will be used to determine dynamic properties of the modeled systems. To perform a case study of the properties of an evacuation system extended with time, the analysis modules of PIPE, such as Generalized Stochastic Petri Nets (GSPN) Analysis, Simulation, State Space Analysis, and Invariant Analysis, have been used. These modules helped us obtain the average number of persons located in the rooms and other quantitative properties and characteristics related to the system's dynamics.
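
The reachability analysis mentioned above can be pictured with a toy Python sketch of an untimed place/transition net. It is not the authors' evacuation model and does not use PIPE; the places, transitions, and function names are hypothetical, and the enumeration only terminates for bounded nets.

```python
from collections import deque

def enabled(marking, pre):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in pre.items())

def fire(marking, pre, post):
    """Return the marking obtained by firing a transition once."""
    new = dict(marking)
    for p, n in pre.items():
        new[p] = new.get(p, 0) - n
    for p, n in post.items():
        new[p] = new.get(p, 0) + n
    return new

def reachable_markings(initial, transitions):
    """Breadth-first enumeration of reachable markings.

    initial: {place: tokens}; it must list every place of the net.
    transitions: {name: (pre, post)} with pre/post as {place: weight}.
    Markings are returned as tuples ordered by sorted place name.
    """
    places = sorted(initial)

    def key(m):
        return tuple(m.get(p, 0) for p in places)

    seen = {key(initial)}
    queue = deque([initial])
    while queue:
        m = queue.popleft()
        for pre, post in transitions.values():
            if enabled(m, pre):
                nxt = fire(m, pre, post)
                if key(nxt) not in seen:
                    seen.add(key(nxt))
                    queue.append(nxt)
    return seen
```

For instance, two persons in a room, with one transition moving a person from the room to a corridor and another from the corridor to the outside, give six reachable markings, including the one where both persons are outside.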

Technologic Information about Photovoltaic Applied in Urban Residences

Among renewable energy sources, solar energy is the one that has stood out. Solar radiation can be used as a thermal energy source and can also be converted into electricity by means of effects on certain materials, as in thermoelectric and photovoltaic panels. These panels are often used to generate energy in homes, buildings, arenas, etc., and have low pollutant emissions. A technological prospecting study was therefore performed to find patents related to the use of photovoltaic panels in urban residences. The patent search was carried out on ESPACENET, combining the keywords 'photovoltaic' and 'home' in the title and abstract fields, and found 136 patent documents for the period 1994-2015. The years 2009, 2010, 2011, 2012, 2013 and 2014 had the highest numbers of applicants, with 11, 13, 23, 29, 15 and 21, respectively. Regarding the countries filing for this technology, China leads with 67 patent filings, followed by Japan with 38 patent applications. Among the applicants, 50% are companies, 44% are individual inventors and only 6% are universities. As for International Patent Classification (IPC) codes, the most frequent classification in the results was H02J3/38, which covers arrangements for feeding a single network in parallel by two or more generators, converters or transformers. Among all sections, Section H (Electricity) stands out with 70% of the patents.

Algorithm for Information Retrieval Optimization

When using Information Retrieval Systems (IRS), users often present search queries made of ad-hoc keywords. It is then up to the IRS to obtain a precise representation of the user’s information need and the context of the information. This paper investigates the optimization of IRS so that results match individual information needs in order of relevance. The study addresses the development of algorithms that optimize the ranking of documents retrieved from IRS, and describes a Document Ranking Optimization (DROPT) algorithm for information retrieval (IR) in an Internet-based or designated-database environment. As the volume of information available online and in designated databases grows continuously, ranking algorithms can play a major role in the presentation of search results. In this paper, a DROPT technique for documents retrieved from a corpus is developed with respect to document index keywords and the query vectors. This is based on calculating the weight (

The Effect of Treated Waste-Water on Compaction and Compression of Fine Soil

The main objective of this paper is to study the effect of treated waste-water (TWW) on the compaction and compressibility properties of fine soil. Two types of fine (clayey) soil were selected for this study and classified as CH and CL soils. Compaction and compressibility properties such as optimum water content, maximum dry unit weight, consolidation index, swell index, maximum past pressure and volume change were evaluated using both tap water and treated waste-water. The use of treated waste-water was found to affect all of these properties. The maximum dry unit weight increased for both soils, and the optimum water content decreased by as much as 13.6% for the highly plastic soil. A significant effect was observed in the swell index and swelling pressure of the soils. The swell index decreased by as much as 42% and 33% for the highly plastic and low plastic soils, respectively, when TWW was used. Additionally, the swelling pressure decreased by as much as 16% for both soil types. The results indicate that the use of treated waste-water has a positive effect on the compaction and compression properties of clay soil and show promise for the use of this water in engineering applications.
Keywords: consolidation, proctor compaction, swell index, treated waste-water, volume change.

Negative Pressures of Ca. -20 MPa for Water Enclosed into a Metal Berthelot Tube under a Vacuum Condition

Negative pressures in liquids are expected to contribute to many kinds of technology. Nevertheless, experiments that subject liquids of not too small a volume to negative pressures remain difficult, because such liquids tend to cavitate easily. In order to remove cavitation nuclei, an apparatus for enclosing water in a metal Berthelot tube under vacuum conditions was developed. Using the apparatus, negative pressures for water reached ca. -20 MPa. This is the highest value for water in metal Berthelot tubes. The results were explained by a traditional crevice model.

An Open Source Advertisement System

An online advertisement system and its implementation for the Yioop open source search engine are presented. This system supports both selling advertisements and displaying them within search results. Advertisements are sold through a system that auctions off daily impressions for keyword searches. This is an open, ascending price auction system in which all accepted bids receive a fraction of the auctioned day’s impressions. New bids in our system are required to be at least one half of the sum of all previous bids, which ensures that the number of accepted bids is logarithmic in the total ad spend on a keyword for a day. The mechanics of creating an advertisement, attaching keywords to it, and adding it to an advertisement inventory are described. The algorithm used to go from accepted bids for a keyword to the ads displayed at search time is also presented. We discuss properties of our system and compare it to existing auction systems and systems for selling online advertisements.
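
The bid-acceptance rule stated above can be sketched in a few lines of Python. The rule that a new bid must be at least half the sum of previous accepted bids comes from the abstract; the class name, method names, and the choice to split impressions in proportion to bid size are assumptions made for illustration and are not taken from Yioop.

```python
class KeywordAuction:
    """Accepted bids for one keyword on one day (illustrative sketch)."""

    def __init__(self):
        self.bids = []  # accepted (bidder, amount) pairs, in arrival order

    def total(self):
        """Sum of all accepted bids so far."""
        return sum(amount for _, amount in self.bids)

    def min_next_bid(self):
        """A new bid must be at least half the sum of previous bids."""
        return self.total() / 2

    def place_bid(self, bidder, amount):
        """Accept the bid if it is positive and meets the minimum."""
        if amount <= 0 or amount < self.min_next_bid():
            return False
        self.bids.append((bidder, amount))
        return True

    def impression_shares(self):
        """ASSUMPTION: each bidder's share is proportional to the amount bid."""
        total = self.total()
        shares = {}
        for bidder, amount in self.bids:
            shares[bidder] = shares.get(bidder, 0.0) + amount / total
        return shares
```

Under this rule every accepted bid raises the running total by a factor of at least 1.5, so for a fixed smallest first bid the number of accepted bids can grow only logarithmically with the total spend, which matches the property claimed in the abstract.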

Adaptive Kalman Filter for Fault Diagnosis of Linear Parameter-Varying Systems

Fault diagnosis of a Linear Parameter-Varying (LPV) system using an adaptive Kalman filter is proposed. The LPV model comprises scheduling parameters and emulator parameters. The scheduling parameters are chosen such that they are capable of tracking variations in the system model that result from changes in the operating regime. The emulator parameters, on the other hand, simulate variations in the subsystems during the identification phase and have a negligible effect during the operational phase. The nominal model and the influence vectors, which are the gradients of the feature vector with respect to the emulator parameters, are identified off-line from a number of experiments in which the emulator parameters are perturbed. A Kalman filter is designed using the identified nominal model. As the system varies, the Kalman filter model is adapted using the scheduling variables. The residual is employed for fault diagnosis. The proposed scheme is successfully evaluated on a simulated system as well as on a physical process control system.
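
The idea of using the Kalman filter residual for fault diagnosis can be illustrated with a scalar Python sketch. It assumes a one-state model x[k+1] = a(theta)*x[k] + w, y[k] = x[k] + v, where the coefficient a depends on a scheduling variable theta through a user-supplied function; the model, the threshold test, and all names are illustrative assumptions rather than the scheme proposed in the abstract.

```python
import math

def kalman_step(x_est, p_est, y, a, q, r):
    """One predict/update cycle for x[k+1] = a*x[k] + w, y[k] = x[k] + v.

    q and r are the process and measurement noise variances.
    Returns (new estimate, new variance, residual, residual variance).
    """
    x_pred = a * x_est
    p_pred = a * p_est * a + q
    residual = y - x_pred          # innovation used for diagnosis
    s = p_pred + r                 # predicted variance of the residual
    gain = p_pred / s
    x_new = x_pred + gain * residual
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new, residual, s

def detect_faults(measurements, schedule, a_of, q=0.01, r=0.1, n_sigma=3.0):
    """Flag samples whose residual exceeds n_sigma standard deviations.

    a_of is a hypothetical function mapping the scheduling variable to the
    model coefficient, so the filter model adapts as the schedule changes.
    """
    x, p = 0.0, 1.0
    flags = []
    for y, theta in zip(measurements, schedule):
        x, p, res, s = kalman_step(x, p, y, a_of(theta), q, r)
        flags.append(abs(res) > n_sigma * math.sqrt(s))
    return flags
```

A residual that stays within its predicted spread suggests the adapted model still explains the measurements, while a large residual, such as a sudden jump in the measured output, is flagged as a possible fault.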