Data Gathering and Analysis for Arabic Historical Documents

This paper introduces a new dataset, and the methodology used to generate it, based on a wide range of historical Arabic documents containing clean data with simple and homogeneous page layouts. The experiments are carried out on printed and handwritten documents obtained from major libraries such as the Qatar Digital Library, the British Library and the Library of Congress. We have gathered and annotated 150 archival document images from different locations and time periods, drawn from documents of the 17th to 19th centuries. The dataset comprises differing page layouts and degradations that challenge text line segmentation methods. Ground truth is produced using the Aletheia tool by PRImA and stored in an XML representation, in the PAGE (Page Analysis and Ground truth Elements) format. The dataset will be made readily available to researchers worldwide for research into the challenges posed by historical Arabic documents, such as geometric correction.

Interactive, Topic-Oriented Search Support by a Centroid-Based Text Categorisation

Centroid terms are single words that semantically and topically characterise text documents and so may serve as their very compact representation in automatic text processing. In the present paper, centroids are used to measure the relevance of text documents with respect to a given search query. Thus, a new graph-based paradigm for searching texts in large corpora is proposed and evaluated against keyword-based methods. The first, promising experimental results demonstrate the usefulness of the centroid-based search procedure. In particular, it is shown that the routing of search queries in interactive and decentralised search systems can be greatly improved by applying this approach. A detailed discussion of further fields of application completes this contribution.

An Efficient Fall Detection Method for Elderly Care System

Fall detection is one of the challenging problems in elderly care systems. In this paper, an efficient method is proposed to identify falls in such systems using a correlation factor and the Motion History Image (MHI). The proposed method is tested on the URF (University of Rzeszow Fall detection) dataset and evaluated using sensitivity, specificity, precision and classification accuracy. It is compared with other recent methods. The experimental results show that the proposed method achieves 1.5% higher sensitivity than the other methods.
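
The four evaluation measures named above can be computed from confusion-matrix counts. The following is a minimal sketch, not the authors' evaluation code; the function name and the example counts are hypothetical, and a "positive" is assumed to mean a detected fall.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard measures from confusion-matrix counts.

    tp/fn: falls detected / falls missed; tn/fp: non-falls correctly
    rejected / non-falls wrongly flagged (assumed labelling convention).
    """
    total = tp + fp + tn + fn
    return {
        # Share of real falls that were detected (also called recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of non-fall events correctly classified as non-falls.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # Share of raised alarms that were real falls.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of all events classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=9, fp=1, tn=8, fn=2)
```

A gain in sensitivity, as reported in the abstract, corresponds to fewer missed falls (a smaller `fn`) for the same number of real falls.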

Duration Patterns of English by Native British Speakers and Mandarin ESL Speakers

This study is intended to describe and analyze the effects of polysyllabic shortening and word or phrase boundary on the duration patterns of spoken utterances by Mandarin learners of English in comparison with native speakers of English. To investigate the relative contribution of these effects, two production experiments were conducted. The study included 11 native British English speakers and 20 Mandarin learners of English who were asked to produce four sets of tokens consisting of a monosyllabic base form, disyllabic and trisyllabic words derived from the base by the addition of suffixes, and a set of short sentences with a particular combination of phrase size, stress pattern, and boundary location. The duration of words and segments was measured, and results from the data analysis suggest that the amount of polysyllabic shortening and the effect of word or phrase position are likely to contribute to the Chinese accent of Mandarin ESL speakers. This study sheds light on research on the duration patterns of language by demonstrating the effect of duration-related factors on the foreign accent of Mandarin ESL speakers. It can also benefit both L2 learners and language teachers by increasing their sensitivity to the duration differences and difficulties experienced by L2 learners of English. An understanding of the amount of polysyllabic shortening and the effect of position in words and phrases on syllable duration can also help L2 teachers establish priorities for teaching pronunciation to ESL learners.

Cybersecurity Protection Structures: The Case of Lesotho

The Internet brings increasing use of Information and Communications Technology (ICT) services and facilities. Consequently, new computing paradigms emerge to provide services over the Internet. Although these services bring several benefits, they also inherit several risks from the Internet, such as cybercrime, identity theft and malware. To thwart these risks, this paper proposes a holistic approach involving multidisciplinary interactions. It combines top-down and bottom-up elements to deal with cybersecurity concerns in developing countries. These concerns include, among others, regulatory and legislative areas, cyber awareness, research and development, and technical dimensions. The main focus areas are highlighted, and the paper concludes by combining all relevant solutions into a proposed cybersecurity model to help developing countries foster a cyber-safe environment and instill and promote a culture of cybersecurity.

Using ε Value in Describe Regular Languages by Using Finite Automata, Operation on Languages and the Changing Algorithm Implementation

This paper introduces nondeterministic finite automata with ε value (ε-NFA), which are used to perform some operations on languages. A program is created to implement the algorithm that converts an ε-NFA to a deterministic finite automaton (DFA). The program is written in the C++ programming language. It reads the FA 5-tuple from a text file and classifies the automaton as a DFA, an NFA or an ε-NFA. For a DFA, the program reads the string w and decides whether it is accepted or rejected. The tracking path for an accepted string is saved by the program. For an NFA or ε-NFA, the program converts the automaton to a DFA to enable tracking and to decide whether the string w belongs to the regular language.
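
The conversion described above is commonly done with ε-closures and the subset construction. The sketch below illustrates that textbook technique in Python rather than the paper's C++ program; the data layout (a dictionary keyed by state and symbol, with the empty string standing for ε) and all function names are assumptions made for illustration.

```python
from collections import deque

EPS = ""  # assumed marker for an ε-transition


def epsilon_closure(states, delta):
    """Return all states reachable from `states` using only ε-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def enfa_to_dfa(alphabet, delta, start, accepting):
    """Subset construction: each DFA state is a frozenset of ε-NFA states."""
    start_set = epsilon_closure({start}, delta)
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            moved = set()
            for q in current:
                moved.update(delta.get((q, symbol), ()))
            target = epsilon_closure(moved, delta)
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return start_set, dfa_delta, dfa_accepting


def accepts(dfa, word):
    """Run the DFA on `word`; return (accepted, path of visited DFA states)."""
    start, dfa_delta, dfa_accepting = dfa
    state = start
    path = [state]
    for symbol in word:
        # Unknown symbols lead to the empty (dead) state.
        state = dfa_delta.get((state, symbol), frozenset())
        path.append(state)
    return state in dfa_accepting, path


# Example ε-NFA for the language a*b*: q0 loops on 'a', moves to q1 by ε,
# and q1 loops on 'b'.
delta = {("q0", "a"): {"q0"}, ("q0", EPS): {"q1"}, ("q1", "b"): {"q1"}}
dfa = enfa_to_dfa("ab", delta, "q0", {"q1"})
accepted, path = accepts(dfa, "aabb")
```

Returning the visited states alongside the verdict mirrors the idea of saving a tracking path for the input string w.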

Comparison of Machine Learning Models for the Prediction of System Marginal Price of Greek Energy Market

The Greek Energy Market is structured as a mandatory pool where the producers make their bid offers on a day-ahead basis. The System Operator solves an optimization routine aiming at the minimization of the cost of produced electricity. The solution of the optimization problem leads to the calculation of the System Marginal Price (SMP). Accurate forecasts of the SMP can lead to increased profits and more efficient portfolio management from the producer's perspective. The aim of this study is to provide a comparative analysis of various machine learning models, such as artificial neural networks and neuro-fuzzy models, for the prediction of the SMP of the Greek market. Machine learning algorithms are favored in prediction problems since they can capture and simulate the volatilities of complex time series.

Machine Translation Analysis of Chinese Dish Names

This article presents a comparative study evaluating the quality of machine translation (MT) output of Chinese gastronomy nomenclature. Chinese gastronomic culture is gaining increasing international recognition. The nomenclature of Chinese gastronomy not only reflects a specific aspect of culture but is also related to other areas of society such as philosophy and traditional medicine. Chinese dish names are composed of several types of cultural references, such as ingredients, colors, flavors, culinary techniques, cooking utensils, toponyms, anthroponyms, metaphors and historical tales, among others. These cultural references are among the biggest difficulties in translation and usually require the use of translation techniques. Given the lack of Chinese food-related translation studies, especially in Chinese-Spanish translation, and the current massive use of MT, the quality of the MT output of Chinese dish names is open to question. Fifty Chinese dish names with different types of cultural components were selected for this study. First, all of these dish names were translated by three different MT tools (Google Translate, Baidu Translate and Bing Translator). Second, a questionnaire was designed and completed by 12 Chinese online users (Chinese graduates of a Hispanic Philology major) in order to find out user preferences regarding the collected MT output. Finally, human translation techniques were observed and analyzed to identify which translation techniques were observed most often in the preferred MT proposals. The results reveal that the MT output of the Chinese gastronomy nomenclature is not of high quality. It is advisable not to rely on MT in contexts such as restaurant menus and TV culinary shows. However, the MT output could be used as an aid for tourists to get a general idea of a dish (the main ingredients, for example).
Literal translation turned out to be the most observed technique, followed by borrowing, generalization and adaptation, while amplification, particularization and transposition were infrequently observed. This is possibly because current MT engines are limited to matching equivalent terms and offering literal translations without taking into account the overall contextual meaning of the dish name, which is essential to applying the less frequently observed techniques. This could give insight into the post-editing of Chinese dish name translations. By observing and analyzing translation techniques in the proposals of the machine translators, post-editors could better decide which techniques to apply in each case so as to correct mistakes and improve the quality of the translation.

Performance Analysis of Search Medical Imaging Service on Cloud Storage Using Decision Trees

Telemedicine services use a large amount of data, most of which are diagnostic images in Digital Imaging and Communications in Medicine (DICOM) and Health Level Seven (HL7) formats. Metadata is generated from each image to support its identification. This study presents the use of decision trees to optimize information search processes for diagnostic images hosted on a cloud server. To analyze server performance, the following quality of service (QoS) metrics are evaluated: delay, bandwidth, jitter, latency and throughput, in five test scenarios comprising a total of 26 experiments during the loading and downloading of DICOM images hosted on the telemedicine group server of the Universidad Militar Nueva Granada, Bogotá, Colombia. By applying decision trees as a data mining technique and comparing them with sequential search, it was possible to evaluate the search times for diagnostic images on the server. The results show that using the metadata in decision trees substantially improves search times, optimizes computational resources and improves request management in the telemedicine image service. In the experiments carried out, search efficiency increased by 45% relative to sequential search, given that false positives are avoided in the management and acquisition of the information when a diagnostic image is downloaded. It is concluded that, for diagnostic image services in telemedicine, the decision tree technique guarantees accessibility and robustness in the acquisition and manipulation of medical images, to the benefit of diagnoses and medical procedures for patients.

Routing Medical Images with Tabu Search and Simulated Annealing: A Study on Quality of Service

In telemedicine, the image repository service is important for increasing the accuracy of the diagnostic support given to medical personnel. This study compares two routing algorithms with respect to quality of service (QoS) in order to analyze performance when loading and/or downloading medical images. It focuses on comparing the performance of Tabu Search with that of other heuristic and metaheuristic algorithms that improve QoS in telemedicine services in Colombia. For this purpose, the Tabu Search and Simulated Annealing heuristic algorithms were chosen for their high usability in this type of application; QoS is measured using the following metrics: delay, throughput, jitter and latency. In addition, routing tests were carried out on ten images of 40 MB in Digital Imaging and Communications in Medicine (DICOM) format. The tests ran for ten minutes under different traffic conditions, for a total of 25 tests, from a server at the Universidad Militar Nueva Granada (UMNG) in Bogotá, Colombia, to a remote user at the Universidad de Santiago de Chile (USACH), Chile. The results show that Tabu Search delivers better QoS performance than Simulated Annealing, optimizing the routing of medical images, a basic requirement for offering diagnostic image services in telemedicine.

Performance Analysis of M-Ary Pulse Position Modulation in Multihop Multiple Input Multiple Output-Free Space Optical System over Uncorrelated Gamma-Gamma Atmospheric Turbulence Channels

The performance of a Decode and Forward (DF) multihop Free Space Optical (FSO) scheme that deploys a Multiple Input Multiple Output (MIMO) configuration and adopts M-ary Pulse Position Modulation (MPPM) coding is investigated under the Gamma-Gamma (GG) statistical distribution. We have obtained both exact and estimated values of the Symbol-Error Rate (SER). A closed-form formula for the Probability Density Function (PDF) is given for the designed system. Thanks to the DF multihop MIMO FSO configuration and MPPM signaling, atmospheric turbulence is combated; hence the transmitted signal quality is improved.

Single Valued Neutrosophic Hesitant Fuzzy Rough Set and Its Application

In this paper, we propose the notion of a single valued neutrosophic hesitant fuzzy rough set by combining the single valued neutrosophic hesitant fuzzy set and the rough set. This combination is a powerful tool for dealing with uncertainty, granularity and incompleteness of knowledge in information systems. We present the definition and some basic properties of the proposed model. Finally, we give a general approach, apply it to a decision-making problem in disease diagnosis, and demonstrate its effectiveness with a numerical example.

Mutual Authentication for Sensor-to-Sensor Communications in IoT Infrastructure

The emergence of the Internet of Things has made sensors ubiquitous in human life, so that data are collected, processed and transmitted by these sensors at all times. To establish a secure connection, the first challenge is authentication between sensors. For authentication to be done properly, however, certain features are required: anonymity, untraceability and being lightweight are among the issues that need to be considered. In this paper, we evaluate existing authentication protocols and analyze the security vulnerabilities found in them. An improved lightweight authentication protocol for sensor-to-sensor communications is then presented, which uses a hash function and logical operators. The analysis of the protocol shows that the security requirements are met and that the protocol is resistant to various attacks. Finally, it is argued that, because the number of computationally costly functions is reduced, the protocol is lighter than before.

Improving the Optoacoustic Signal by Monitoring the Changes of Coupling Medium

In this paper, we discuss the coupling medium in optoacoustic imaging. The coupling medium is placed between the scanned object and the ultrasound transducers. Water at varying temperature was used as the coupling medium. The water temperature was gradually varied from 25 to 40 degrees. The heating was carried out with care to avoid bubble formation. As the temperature increases, a rise in the photoacoustic signal is observed through an unfocused transducer with a frequency of 2.25 MHz. The temperature rise is monitored using an NTC thermistor, and the values in degrees are calculated using an embedded evaluation kit. The temperature is also transmitted to a PC over a serial link. All these processes are synchronized using a trigger signal from the laser source.

A Mean–Variance–Skewness Portfolio Optimization Model

Portfolio optimization is one of the most important topics in finance. This paper proposes a mean–variance–skewness (MVS) portfolio optimization model. Traditionally, the portfolio optimization problem is solved by using the mean–variance (MV) framework. In this study, we formulate the proposed model as a three-objective optimization problem, where the portfolio's expected return and skewness are maximized whereas the portfolio risk is minimized. For solving the proposed three-objective portfolio optimization model we apply an adapted version of the non-dominated sorting genetic algorithm (NSGAII). Finally, we use a real dataset from FTSE-100 for validating the proposed model.
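
The three objectives can be made concrete with a small sketch. It is only an illustration under stated assumptions: skewness is taken here as the third central moment of the portfolio return series (the abstract does not say which definition is used), moments are population moments, and the function names and sample returns are hypothetical. The `dominates` helper shows the Pareto comparison on which non-dominated sorting is based; it is not an implementation of NSGA-II itself.

```python
def portfolio_moments(weights, returns):
    """Return (mean, variance, third central moment) of portfolio returns.

    `returns` is a list of periods; each period lists one return per asset.
    """
    series = [sum(w * r for w, r in zip(weights, period)) for period in returns]
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / n
    third_moment = sum((x - mean) ** 3 for x in series) / n
    return mean, variance, third_moment


def dominates(a, b):
    """True if objective vector a Pareto-dominates b.

    Each vector is (mean, variance, skewness): mean and skewness are
    maximized, variance (risk) is minimized.
    """
    mean_a, var_a, skew_a = a
    mean_b, var_b, skew_b = b
    no_worse = mean_a >= mean_b and var_a <= var_b and skew_a >= skew_b
    strictly_better = mean_a > mean_b or var_a < var_b or skew_a > skew_b
    return no_worse and strictly_better


# Hypothetical two-asset example with three return periods.
sample_returns = [[0.02, 0.00], [0.00, 0.02], [0.04, 0.02]]
equal_weight = portfolio_moments([0.5, 0.5], sample_returns)
first_asset_only = portfolio_moments([1.0, 0.0], sample_returns)
is_better = dominates(equal_weight, first_asset_only)
```

A multi-objective genetic algorithm would evaluate many weight vectors in this way and keep those that no other candidate dominates.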

A Combined Cipher Text Policy Attribute-Based Encryption and Timed-Release Encryption Method for Securing Medical Data in Cloud

The biggest problem in cloud computing is securing outsourced data. A cloud environment cannot be considered trusted. The problem becomes more challenging when outsourced data sources are managed by multiple outsourcers with different access rights. Several methods have been proposed to protect data confidentiality against the cloud service provider and to support fine-grained data access control. We propose a method that combines Cipher Text Policy Attribute-based Encryption (CP-ABE) and Timed-release encryption (TRE) to secure and control medical data storage in a public cloud.

A Domain Specific Modeling Language Semantic Model for Artefact Orientation

Since the process of transforming user requirements into modeling constructs is not well supported by domain-specific frameworks, it has become necessary to integrate domain requirements with specific architectures to achieve an integrated, customizable solution space via artifact orientation. Domain-specific modeling language specifications of model-driven engineering technologies focus more on requirements within a particular domain, and can be tailored to help the domain expert express domain concepts effectively. Modeling processes based on domain-specific language formalisms are highly volatile because of their dependence on domain concepts or on the process models used. A capable solution is offered by artifact orientation, which stresses the results rather than a strict dependence on complicated platforms for model creation and development. Based on this premise, domain-specific methods for producing artifacts, which need not take into account the complexity and variability of platforms for model definitions, can be integrated to support customizable development. In this paper, we discuss methods for the integration capabilities and necessities within a common structure and semantics that contribute a metamodel for artifact orientation, leading to a reusable software layer with a concrete syntax capable of determining design intents from domain experts. The concepts forming the language formalism are established from models explained within the oil and gas pipelines industry.

An Approach to Measure Snow Depth of Winter Accumulation at Basin Scale Using Satellite Data

Snow depth estimation and monitoring studies have been carried out for decades using empirical relationships or extrapolation of point measurements made in the field. With the development of advanced satellite-based remote sensing techniques, a modified approach is proposed in the present study to estimate the winter accumulated snow depth at basin scale. Snow depth can be assessed by differencing Digital Elevation Models (DEMs) generated at the beginning and end of the winter season, and the approach can be tested for the regions of interest (Himalayan and polar regions), accounting for winter accumulation (solid precipitation). The proposed approach is based on the existing geodetic method used for glacier mass balance estimation. By considering only satellite datasets acquired at the beginning and end of the winter season, it is possible to estimate the change in depth or thickness of the snow accumulated during the winter, since it takes one year for snow to transform into firn (snow that has survived one summer, or one-year-old snow).
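
The DEM differencing step can be sketched as a per-cell subtraction of two co-registered elevation grids. This is a simplified illustration, not the study's processing chain: the grids are plain nested lists in metres, the no-data value of -9999.0 and the function names are assumptions, and co-registration and error handling are left out.

```python
NODATA = -9999.0  # assumed no-data marker; real products define their own


def snow_depth_grid(dem_start, dem_end, nodata=NODATA):
    """Subtract the start-of-winter DEM from the end-of-winter DEM, cell by cell.

    Both grids must have the same shape and be co-registered.
    Cells that are no-data in either grid yield None.
    """
    depth = []
    for row_start, row_end in zip(dem_start, dem_end):
        depth_row = []
        for z_start, z_end in zip(row_start, row_end):
            if z_start == nodata or z_end == nodata:
                depth_row.append(None)
            else:
                depth_row.append(z_end - z_start)
        depth.append(depth_row)
    return depth


def mean_depth(depth):
    """Average the accumulated depth over all valid cells of the basin."""
    valid = [d for row in depth for d in row if d is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical 2 x 2 elevation grids in metres.
dem_autumn = [[100.0, 101.0], [102.0, NODATA]]
dem_spring = [[101.5, 102.0], [102.5, 104.0]]
depth = snow_depth_grid(dem_autumn, dem_spring)
basin_mean = mean_depth(depth)
```

The same differencing idea underlies the geodetic method for glacier mass balance, with the two acquisition dates chosen to bracket the winter accumulation period.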

Bug Localization on Single-Line Bugs of Apache Commons Math Library

Software bug localization is one of the most costly tasks in program repair. Therefore, there is high demand for automated bug localization techniques that can guide programmers to the locations of bugs with little human intervention. Spectrum-based bug localization aims to help software developers find bugs rapidly by examining abstractions of program traces to produce a ranked list of the most likely buggy modules. Using the Apache Commons Math library project, we study the diagnostic accuracy of our spectrum-based bug localization metric. Our results show that the superior performance of a specific similarity coefficient, used to inspect the program spectra, is most effective in localizing single-line bugs.
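
A spectrum-based ranking can be sketched as follows. The abstract does not name the similarity coefficient it uses, so the Ochiai coefficient is assumed here purely as a well-known example; the data layout (one set of executed line numbers per test) and the function name are likewise hypothetical.

```python
import math


def ochiai_ranking(coverage, outcomes):
    """Rank lines by Ochiai suspiciousness, most suspicious first.

    coverage: list of sets, the line ids executed by each test.
    outcomes: list of bools, True if the corresponding test passed.
    """
    total_failed = sum(1 for passed in outcomes if not passed)
    all_lines = set().union(*coverage)
    scores = {}
    for line in all_lines:
        failed_cov = sum(
            1 for cov, passed in zip(coverage, outcomes) if line in cov and not passed
        )
        passed_cov = sum(
            1 for cov, passed in zip(coverage, outcomes) if line in cov and passed
        )
        # Ochiai: ef / sqrt(total_failed * (ef + ep)).
        denom = math.sqrt(total_failed * (failed_cov + passed_cov))
        scores[line] = failed_cov / denom if denom else 0.0
    # Sort by descending score, breaking ties by line id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical spectra: one failing test and two passing tests.
coverage = [{1, 2, 3}, {1, 2}, {1}]
outcomes = [False, True, True]
ranking = ochiai_ranking(coverage, outcomes)
```

In this toy example line 3 is executed only by the failing test, so it ranks first; a single-line bug is localized well when the faulty line ends up at or near the top of such a list.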

A Study of the Assistant Application for Tourists Taking Metros

With the proliferation and development of mobile devices, various mobile apps have appeared to satisfy people’s needs. The metro, being convenient, punctual and economical, is one of the most popular modes of transportation in cities. Yet various factors still cause inconveniences that affect tourists’ riding experience. The aim of this study is to help tourists shorten the time spent purchasing tickets and to provide them with clear metro information, direct navigation and detailed schedules, as well as a way to collect metro cards as souvenirs. The study collects data in three phases: observation, survey and test. Data collected from a total of 106 tourists in Wuhan metro stations are discussed. The results reflect tourists’ demands when they take the metro and indicate the feasibility of using mobile technology to improve passengers’ experience.