Abstract: The present study aims primarily at investigating words of peace (lexemes of peace) in the formal speeches of the Egyptian president Abdulfattah El-Sisi in a two-year span of time, from 2018 to 2019. This paper attempts to shed light not only on the contextual use of the antonyms, war and peace, but also it underpins quantitative analysis through the current methods of corpus linguistics. As such, the researchers have deployed a corpus-based approach in collecting, encoding, and processing 30 presidential speeches over the stated period (23,411 words and 25,541 tokens in total). Further, semantic fields and collocational networkzs are identified and compared statistically. Results have shown a significant propensity of adopting peace, including its relevant collocation network, textually and therefore, ideationally, at the expense of war concept which in most cases surfaces euphemistically through the noun conflict. The president has not justified the action of war with an honorable cause or a valid reason. Such results, so far, have indicated a positive sociopolitical mindset the Egyptian president possesses and moreover, reveal national and international fair dealing on arising issues.
Abstract: This proposal aims for semantic enrichment between
glossaries using the Simple Knowledge Organization System (SKOS)
vocabulary to discover synonyms, hyponyms and hyperonyms
semiautomatically, in Brazilian Portuguese, generating new semantic
relationships based on WordNet. To evaluate the quality of this
proposed model, experiments were performed by the use of two sets
containing new relations, being one generated automatically and the
other manually mapped by the domain expert. The applied evaluation
metrics were precision, recall, f-score, and confidence interval. The
results obtained demonstrate that the applied method in the field of
Oil Production and Extraction (E&P) is effective, which suggests that
it can be used to improve the quality of terminological mappings.
The procedure, although adding complexity in its elaboration, can be
reproduced in others domains.
Abstract: Measuring semantic similarity between texts is calculating semantic relatedness between texts using various techniques. Our web application (Measuring Relatedness of Concepts-MRC) allows user to input two text corpuses and get semantic similarity percentage between both using WordNet. Our application goes through five stages for the computation of semantic relatedness. Those stages are: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into Parts-of-Speech), Synonyms Extraction (retrieves synonyms against each keyword), Measuring Similarity (using keywords and synonyms, similarity is measured) and Visualization (graphical representation of similarity measure). Hence the user can measure similarity on basis of features as well. The end result is a percentage score and the word(s) which form the basis of similarity between both texts with use of different tools on same platform. In future work we look forward for a Web as a live corpus application that provides a simpler and user friendly tool to compare documents and extract useful information.
Abstract: Measuring semantic similarity between texts is calculating semantic relatedness between texts using various techniques. Our web application (Measuring Relatedness of Concepts-MRC) allows user to input two text corpuses and get semantic similarity percentage between both using WordNet. Our application goes through five stages for the computation of semantic relatedness. Those stages are: Preprocessing (extracts keywords from content), Feature Extraction (classification of words into Parts-of-Speech), Synonyms Extraction (retrieves synonyms against each keyword), Measuring Similarity (using keywords and synonyms, similarity is measured) and Visualization (graphical representation of similarity measure). Hence the user can measure similarity on basis of features as well. The end result is a percentage score and the word(s) which form the basis of similarity between both texts with use of different tools on same platform. In future work we look forward for a Web as a live corpus application that provides a simpler and user friendly tool to compare documents and extract useful information.
Abstract: This article presents a comparative study evaluating and comparing the quality of machine translation (MT) output of Chinese gastronomy nomenclature. Chinese gastronomic culture is experiencing an increased international acknowledgment nowadays. The nomenclature of Chinese gastronomy not only reflects a specific aspect of culture, but it is related to other areas of society such as philosophy, traditional medicine, etc. Chinese dish names are composed of several types of cultural references, such as ingredients, colors, flavors, culinary techniques, cooking utensils, toponyms, anthroponyms, metaphors, historical tales, among others. These cultural references act as one of the biggest difficulties in translation, in which the use of translation techniques is usually required. Regarding the lack of Chinese food-related translation studies, especially in Chinese-Spanish translation, and the current massive use of MT, the quality of the MT output of Chinese dish names is questioned. Fifty Chinese dish names with different types of cultural components were selected in order to complete this study. First, all of these dish names were translated by three different MT tools (Google Translate, Baidu Translate and Bing Translator). Second, a questionnaire was designed and completed by 12 Chinese online users (Chinese graduates of a Hispanic Philology major) in order to find out user preferences regarding the collected MT output. Finally, human translation techniques were observed and analyzed to identify what translation techniques would be observed more often in the preferred MT proposals. The result reveals that the MT output of the Chinese gastronomy nomenclature is not of high quality. It would be recommended not to trust the MT in occasions like restaurant menus, TV culinary shows, etc. However, the MT output could be used as an aid for tourists to have a general idea of a dish (the main ingredients, for example). Literal translation turned out to be the most observed technique, followed by borrowing, generalization and adaptation, while amplification, particularization and transposition were infrequently observed. Possibly because that the MT engines at present are limited to relate equivalent terms and offer literal translations without taking into account the whole context meaning of the dish name, which is essential to the application of those less observed techniques. This could give insight into the post-editing of the Chinese dish name translation. By observing and analyzing translation techniques in the proposals of the machine translators, the post-editors could better decide which techniques to apply in each case so as to correct mistakes and improve the quality of the translation.
Abstract: Sentiment analysis is a recent field of study that computationally assesses the emotional nature of a body of text. To assess its test-validity, sentiment analysis was carried out on the emotional corpus of text from a personal 15-day mood diary. Self-reported mood scores varied more or less accurately with daily mood evaluation score given by the software. On further assessment, it was found that while sentiment analysis was good at assessing ‘global’ mood, it was not able to ‘locally’ identify and differentially score synonyms of various emotional words. It is further critiqued for treating the intensity of an emotion as universal across cultures. Finally, the software is shown not to account for emotional complexity in sentences by treating emotions as strictly positive or negative. Hence, it is posited that a better output could be two (positive and negative) affect scores for the same body of text.
Abstract: Vehicular Ad hoc NETwork (VANET) is a kind of Mobile Ad hoc NETwork (MANET). It allows the vehicles to communicate with one another as well as with nearby Road Side Units (RSU) and Regional Trusted Authorities (RTA). Vehicles communicate through On-Board Units (OBU) in which privacy has to be assured which will avoid the misuse of private data. A secure authentication framework for VANETs is proposed in which Public Key Cryptography (PKC) based adaptive pseudonym scheme is used to generate self-generated pseudonyms. Self-generated pseudonyms are used instead of real IDs for privacy preservation and non-repudiation. The ID-Based Signature (IBS) and ID-Based Online/Offline Signature (IBOOS) schemes are used for authentication. IBS is used to authenticate between vehicle and RSU whereas IBOOS provides authentication among vehicles. Security attacks like impersonation attack in the network are resolved and the attacking nodes are rejected from the network, thereby ensuring secure communication among the vehicles in the network. Simulation results shows that the proposed system provides better authentication in VANET environment.
Abstract: In the field of Quran Studies known as GHAREEB AL QURAN (The study of the meanings of strange words and structures in Holy Quran), it is difficult to distinguish some pragmatic meanings from conceptual meanings. One who wants to study this subject may need to look for a common usage between any two words or more; to understand general meaning, and sometimes may need to look for common differences between them, even if there are synonyms (word sisters).
Some of the distinguished scholars of Arabic linguistics believe that there are no synonym words, they believe in varieties of meaning and multi-context usage. Based on this viewpoint, our method was designedto look for synonyms of a word, then the differences that distinct the word and their synonyms.
There are many available books that use such a method e.g. synonyms books, dictionaries, glossaries, and some books on the interpretations of strange vocabulary of the Holy Quran, but it is difficult to look up words in these written works.
For that reason, we proposed a logical entity, which we called Differences Matrix (DM).
DM groups the synonyms words to extract the relations between them and to know the general meaning, which defines the skeleton of all word synonyms; this meaning is expressed by a word of its sisters.
In Differences Matrix, we used the sisters(words) as titles for rows and columns, and in the obtained cells we tried to define the row title (word) by using column title (her sister), so the relations between sisters appear, the expected result is well defined groups of sisters for each word. We represented the obtained results formally, and used the defined groups as a base for building the ontology of the Holy Quran synonyms.
Abstract: The most important property of the Gene Ontology is
the terms. These control vocabularies are defined to provide
consistent descriptions of gene products that are shareable and
computationally accessible by humans, software agent, or other
machine-readable meta-data. Each term is associated with
information such as definition, synonyms, database references, amino
acid sequences, and relationships to other terms. This information has
made the Gene Ontology broadly applied in microarray and
proteomic analysis. However, the process of searching the terms is
still carried out using traditional approach which is based on keyword
matching. The weaknesses of this approach are: ignoring semantic
relationships between terms, and highly depending on a specialist to
find similar terms. Therefore, this study combines semantic similarity
measure and genetic algorithm to perform a better retrieval process
for searching semantically similar terms. The semantic similarity
measure is used to compute similitude strength between two terms.
Then, the genetic algorithm is employed to perform batch retrievals
and to handle the situation of the large search space of the Gene
Ontology graph. The computational results are presented to show the
effectiveness of the proposed algorithm.
Abstract: Transliteration is frequently used especially in writing geographic denominations, personal names (onyms) etc. Proper names (onyms) of all languages must sound similarly in translated works as well as in scientific projects and works written in mother tongue, because we can get introduced with the nation, its history, culture, traditions and other spiritual values through the onyms of that nation. Therefore it is necessary to systematize the different transliterations of onyms of foreign languages. This paper is dedicated to the problem of making the project of transliterating Kazakh onyms into Arabic. In order to achieve this goal we use scientific or practical types of transliteration. Because in this type of transliteration provides easy reading writing source language's texts in the target language without any diacritical symbols, it is limited by the target language's alphabetic system.
Abstract: Automated operations based on voice commands will become more and more important in many applications, including robotics, maintenance operations, etc. However, voice command recognition rates drop quite a lot under non-stationary and chaotic noise environments. In this paper, we tried to significantly improve the speech recognition rates under non-stationary noise environments. First, 298 Navy acronyms have been selected for automatic speech recognition. Data sets were collected under 4 types of noisy environments: factory, buccaneer jet, babble noise in a canteen, and destroyer. Within each noisy environment, 4 levels (5 dB, 15 dB, 25 dB, and clean) of Signal-to-Noise Ratio (SNR) were introduced to corrupt the speech. Second, a new algorithm to estimate speech or no speech regions has been developed, implemented, and evaluated. Third, extensive simulations were carried out. It was found that the combination of the new algorithm, the proper selection of language model and a customized training of the speech recognizer based on clean speech yielded very high recognition rates, which are between 80% and 90% for the four different noisy conditions. Fourth, extensive comparative studies have also been carried out.
Abstract: Discussion and development of principles of the
uniform nation formation within the limits of the Kazakhstan state
obviously became one of the most pressing questions of the day. The
fact that this question has not been solved "from above" as many
other questions has caused really brisk discussion, shows us increase
of civil consciousness in Kazakhstan society, and also the actuality of
this theme which can be carried in the category of fatal questions. In
any sense, nation building has raised civil society to a much higher
level. It would be better to begin with certain definitions. First is the
word "nation". The second is the "state". Both of these terms are very
closely connected with each other, so that in English language they
are in general synonyms. In Russian more shades of these terms
exist. For example in Kazakhstan the citizens of the country
irrespective of nationality (but mainly with reference to non-kazakhs)
are called «kazakhstanians», while the name of the title nation is
\"Kazakhs\". The same we can see in Russia, where, for example, the
Chechen or the Yakut –are \"Rossiyane\" which means “the citizens
of Russian Federation, but not \"Russians\".
The paper was written under the research project “Islam in modern
Kazakhstan: the nature and outcome of the religious revival”.
Abstract: When designing information systems that deal with
large amount of domain knowledge, system designers need to consider
ambiguities of labeling termsin domain vocabulary for navigating
users in the information space. The goal of this study is to develop a
methodology for system designers to label navigation items, taking
account of ambiguities stems from synonyms or polysemes of labeling
terms. In this paper, we propose a method for concept labeling based
on mappings between domain ontology andthesaurus, and report
results of an empirical evaluation.
Abstract: Ontology Matching is a task needed in various applica-tions, for example for comparison or merging purposes. In literature,many algorithms solving the matching problem can be found, butmost of them do not consider instances at all. Mappings are deter-mined by calculating the string-similarity of labels, by recognizinglinguistic word relations (synonyms, subsumptions etc.) or by ana-lyzing the (graph) structure. Due to the facts that instances are oftenmodeled within the ontology and that the set of instances describesthe meaning of the concepts better than their meta information,instances should definitely be incorporated into the matching process.In this paper several novel instance-based matching algorithms arepresented which enhance the quality of matching results obtainedwith common concept-based methods. Different kinds of formalismsare use to classify concepts on account of their instances and finallyto compare the concepts directly.KeywordsInstances, Ontology Matching, Semantic Web
Abstract: Advent enhancements in the field of computing have
increased massive use of web based electronic documents. Current
Copyright protection laws are inadequate to prove the ownership for
electronic documents and do not provide strong features against
copying and manipulating information from the web. This has
opened many channels for securing information and significant
evolutions have been made in the area of information security.
Digital Watermarking has developed into a very dynamic area of
research and has addressed challenging issues for digital content.
Watermarking can be visible (logos or signatures) and invisible
(encoding and decoding). Many visible watermarking techniques
have been studied for text documents but there are very few for web
based text. XML files are used to trade information on the internet
and contain important information. In this paper, two invisible
watermarking techniques using Synonyms and Acronyms are
proposed for XML files to prove the intellectual ownership and to
achieve the security. Analysis is made for different attacks and
amount of capacity to be embedded in the XML file is also noticed.
A comparative analysis for capacity is also made for both methods.
The system has been implemented using C# language and all tests are
made practically to get the results.