Natural Language News Generation from Big Data

In this paper, we introduce an NLG application for the automatic creation of ready-to-publish texts from big data. The resulting fully automatic generated news stories have a high resemblance to the style in which the human writer would draw up such a story. Topics include soccer games, stock exchange market reports, and weather forecasts. Each generated text is unique. Readyto-publish stories written by a computer application can help humans to quickly grasp the outcomes of big data analyses, save timeconsuming pre-formulations for journalists and cater to rather small audiences by offering stories that would otherwise not exist. 

Studies of Rule Induction by STRIM from the Decision Table with Contaminated Attribute Values from Missing Data and Noise — In the Case of Critical Dataset Size —

STRIM (Statistical Test Rule Induction Method) has been proposed as a method to effectively induct if-then rules from the decision table which is considered as a sample set obtained from the population of interest. Its usefulness has been confirmed by simulation experiments specifying rules in advance, and by comparison with conventional methods. However, scope for future development remains before STRIM can be applied to the analysis of real-world data sets. The first requirement is to determine the size of the dataset needed for inducting true rules, since finding statistically significant rules is the core of the method. The second is to examine the capacity of rule induction from datasets with contaminated attribute values created by missing data and noise, since real-world datasets usually contain such contaminated data. This paper examines the first problem theoretically, in connection with the rule length. The second problem is then examined in a simulation experiment, utilizing the critical size of dataset derived from the first step. The experimental results show that STRIM is highly robust in the analysis of datasets with contaminated attribute values, and hence is applicable to real-world data

An Automatic Bayesian Classification System for File Format Selection

This paper presents an approach for the classification of an unstructured format description for identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection with just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format indentification method employs file format classifier and associated configurations to support digital preservation experts with an estimation of required file format. Our goal is to make use of a format specification knowledge base aggregated from a different Web sources in order to select file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert, the file format for his institution. The proposed methods facilitate the selection of file format and the quality of a digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises most common terms that are characteristic for all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results including probabilities are presented in the evaluation section.

Classification of Attaks over Cloud Environment

The security of cloud services is the concern of cloud service providers. In this paper, we will mention different classifications of cloud attacks referred by specialized organizations. Each agency has its classification of well-defined properties. The purpose is to present a high-level classification of current research in cloud computing security. This classification is organized around attack strategies and corresponding defenses.

Scalable Cloud-Based LEO Satellite Constellation Simulator

Distributed applications deployed on LEO satellites and ground stations require substantial communication between different members in a constellation to overcome the earth coverage barriers imposed by GEOs. Applications running on LEO constellations suffer the earth line-of-sight blockage effect. They need adequate lab testing before launching to space. We propose a scalable cloud-based network simulation framework to simulate problems created by the earth line-of-sight blockage. The framework utilized cloud IaaS virtual machines to simulate LEO satellites and ground stations distributed software. A factorial ANOVA statistical analysis is conducted to measure simulator overhead on overall communication performance. The results showed a very low simulator communication overhead. Consequently, the simulation framework is proposed as a candidate for testing LEO constellations with distributed software in the lab before space launch.

Wireless Transmission of Big Data Using Novel Secure Algorithm

This paper presents a novel algorithm for secure, reliable and flexible transmission of big data in two hop wireless networks using cooperative jamming scheme. Two hop wireless networks consist of source, relay and destination nodes. Big data has to transmit from source to relay and from relay to destination by deploying security in physical layer. Cooperative jamming scheme determines transmission of big data in more secure manner by protecting it from eavesdroppers and malicious nodes of unknown location. The novel algorithm that ensures secure and energy balance transmission of big data, includes selection of data transmitting region, segmenting the selected region, determining probability ratio for each node (capture node, non-capture and eavesdropper node) in every segment, evaluating the probability using binary based evaluation. If it is secure transmission resume with the two- hop transmission of big data, otherwise prevent the attackers by cooperative jamming scheme and transmit the data in two-hop transmission.

Implicit Force Control of a Position Controlled Robot – A Comparison with Explicit Algorithms

This paper investigates simple implicit force control algorithms realizable with industrial robots. A lot of approaches already published are difficult to implement in commercial robot controllers, because the access to the robot joint torques is necessary or the complete dynamic model of the manipulator is used. In the past we already deal with explicit force control of a position controlled robot. Well known schemes of implicit force control are stiffness control, damping control and impedance control. Using such algorithms the contact force cannot be set directly. It is further the result of controller impedance, environment impedance and the commanded robot motion/position. The relationships of these properties are worked out in this paper in detail for the chosen implicit approaches. They have been adapted to be implementable on a position controlled robot. The behaviors of stiffness control and damping control are verified by practical experiments. For this purpose a suitable test bed was configured. Using the full mechanical impedance within the controller structure will not be practical in the case when the robot is in physical contact with the environment. This fact will be verified by simulation.

A Pattern Recognition Neural Network Model for Detection and Classification of SQL Injection Attacks

Thousands of organisations store important and confidential information related to them, their customers, and their business partners in databases all across the world. The stored data ranges from less sensitive (e.g. first name, last name, date of birth) to more sensitive data (e.g. password, pin code, and credit card information). Losing data, disclosing confidential information or even changing the value of data are the severe damages that Structured Query Language injection (SQLi) attack can cause on a given database. It is a code injection technique where malicious SQL statements are inserted into a given SQL database by simply using a web browser. In this paper, we propose an effective pattern recognition neural network model for detection and classification of SQLi attacks. The proposed model is built from three main elements of: a Uniform Resource Locator (URL) generator in order to generate thousands of malicious and benign URLs, a URL classifier in order to: 1) classify each generated URL to either a benign URL or a malicious URL and 2) classify the malicious URLs into different SQLi attack categories, and a NN model in order to: 1) detect either a given URL is a malicious URL or a benign URL and 2) identify the type of SQLi attack for each malicious URL. The model is first trained and then evaluated by employing thousands of benign and malicious URLs. The results of the experiments are presented in order to demonstrate the effectiveness of the proposed approach.

Advances in Artificial Intelligence Using Speech Recognition

This research study aims to present a retrospective study about speech recognition systems and artificial intelligence. Speech recognition has become one of the widely used technologies, as it offers great opportunity to interact and communicate with automated machines. Precisely, it can be affirmed that speech recognition facilitates its users and helps them to perform their daily routine tasks, in a more convenient and effective manner. This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence. Recent researches have revealed the fact that speech recognition is found to be the utmost issue, which affects the decoding of speech. In order to overcome these issues, different statistical models were developed by the researchers. Some of the most prominent statistical models include acoustic model (AM), language model (LM), lexicon model, and hidden Markov models (HMM). The research will help in understanding all of these statistical models of speech recognition. Researchers have also formulated different decoding methods, which are being utilized for realistic decoding tasks and constrained artificial languages. These decoding methods include pattern recognition, acoustic phonetic, and artificial intelligence. It has been recognized that artificial intelligence is the most efficient and reliable methods, which are being used in speech recognition.

Reduction of Impulsive Noise in OFDM System Using Adaptive Algorithm

The Orthogonal Frequency Division Multiplexing (OFDM) with high data rate, high spectral efficiency and its ability to mitigate the effects of multipath makes them most suitable in wireless application. Impulsive noise distorts the OFDM transmission and therefore methods must be investigated to suppress this noise. In this paper, a State Space Recursive Least Square (SSRLS) algorithm based adaptive impulsive noise suppressor for OFDM communication system is proposed. And a comparison with another adaptive algorithm is conducted. The state space model-dependent recursive parameters of proposed scheme enables to achieve steady state mean squared error (MSE), low bit error rate (BER), and faster convergence than that of some of existing algorithm.

A Smart Monitoring System for Preventing Gas Risks in Indoor

In this paper, we propose a system for preventing gas risks through the use of wireless communication modules and intelligent gas safety appliances. Our system configuration consists of an automatic extinguishing system, detectors, a wall-pad, and a microcomputer controlled micom gas meter to monitor gas flow and pressure as well as the occurrence of earthquakes. The automatic fire extinguishing system checks for both combustible gaseous leaks and monitors the environmental temperature, while the detector array measures smoke and CO gas concentrations. Depending on detected conditions, the micom gas meter cuts off an inner valve and generates a warning, the automatic fire-extinguishing system cuts off an external valve and sprays extinguishing materials, or the sensors generate signals and take further action when smoke or CO are detected. Information on intelligent measures taken by the gas safety appliances and sensors are transmitted to the wall-pad, which in turn relays this as real time data to a server that can be monitored via an external network (BcN) connection to a web or mobile application for the management of gas safety. To validate this smart-home gas management system, we field-tested its suitability for use in Korean apartments under several scenarios.

Characteristic Study on Conventional and Soliton Based Transmission System

Here, we study the characteristic feature of conventional (ON-OFF keying) and soliton based transmission system. We consider 20Gbps transmission system implemented with Conventional Single Mode Fiber (C-SMF) to examine the role of Gaussian pulse which is the characteristic of conventional propagation and Hyperbolic-secant pulse which is the characteristic of soliton propagation in it. We note the influence of these pulses with respect to different dispersion lengths and soliton period in conventional and soliton system respectively and evaluate the system performance in terms of Quality factor. From the analysis, we could prove that the soliton pulse has the consistent performance even for long distance without dispersion compensation than the conventional system as it is robust to dispersion. For the length of transmission of 200Km, soliton system yielded Q of 33.958 while the conventional system totally exhausted with Q=0.

Developing a Web-Based Workflow Management System in Cloud Computing Platforms

Cloud computing is the innovative and leading information technology model for enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort. In this paper, we aim at the development of workflow management system for cloud computing platforms based on our previous research on the dynamic allocation of the cloud computing resources and its workflow process. We took advantage of the HTML5 technology and developed web-based workflow interface. In order to enable the combination of many tasks running on the cloud platform in sequence, we designed a mechanism and developed an execution engine for workflow management on clouds. We also established a prediction model which was integrated with job queuing system to estimate the waiting time and cost of the individual tasks on different computing nodes, therefore helping users achieve maximum performance at lowest payment. This proposed effort has the potential to positively provide an efficient, resilience and elastic environment for cloud computing platform. This development also helps boost user productivity by promoting a flexible workflow interface that lets users design and control their tasks' flow from anywhere.

All Optical Wavelength Conversion Based On Four Wave Mixing in Optical Fiber

We have designed wavelength conversion based on four wave mixing in an optical fiber at 10 Gb/s. The power of converted signal increases with increase in signal power. The converted signal power is investigated as a function of input signal power and pump power. On comparison of converted signal power at different value of input signal power, we observe that best converted signal power is obtained at -2 dBm input signal power for both up conversion as well as for down conversion. Further, FWM efficiency, quality factor is observed for increase in input signal power and optical fiber length.

An Approach to Flatten the Gain of Fiber Raman Amplifiers with Multi-Pumping

The effects of the pumping wavelength and their power on the gain flattening of a fiber Raman amplifier (FRA) are investigated. The multi-wavelength pumping scheme is utilized to achieve gain flatness in FRA. It is proposed that gain flatness becomes better with increase in number of pumping wavelengths applied. We have achieved flat gain with 0.27 dB fluctuation in a spectral range of 1475-1600 nm for a Raman fiber length of 10 km by using six pumps with wavelengths with in the 1385-1495 nm interval. The effect of multi-wavelength pumping scheme on gain saturation in FRA is also studied. It is proposed that gain saturation condition gets improved by using this scheme and this scheme is more useful for higher spans of Raman fiber length.

New Approach for Minimizing Wavelength Fragmentation in Wavelength-Routed WDM Networks

Wavelength Division Multiplexing (WDM) is the dominant transport technology used in numerous high capacity backbone networks, based on optical infrastructures. Given the importance of costs (CapEx and OpEx) associated to these networks, resource management is becoming increasingly important, especially how the optical circuits, called “lightpaths”, are routed throughout the network. This requires the use of efficient algorithms which provide routing strategies with the lowest cost. We focus on the lightpath routing and wavelength assignment problem, known as the RWA problem, while optimizing wavelength fragmentation over the network. Wavelength fragmentation poses a serious challenge for network operators since it leads to the misuse of the wavelength spectrum, and then to the refusal of new lightpath requests. In this paper, we first establish a new Integer Linear Program (ILP) for the problem based on a node-link formulation. This formulation is based on a multilayer approach where the original network is decomposed into several network layers, each corresponding to a wavelength. Furthermore, we propose an efficient heuristic for the problem based on a greedy algorithm followed by a post-treatment procedure. The obtained results show that the optimal solution is often reached. We also compare our results with those of other RWA heuristic methods

Tool for Analysing the Sensitivity and Tolerance of Mechatronic Systems in Matlab GUI

The article deals with the tool in Matlab GUI form that is designed to analyse a mechatronic system sensitivity and tolerance. In the analysed mechatronic system, a torque is transferred from the drive to the load through a coupling containing flexible elements. Different methods of control system design are used. The classic form of the feedback control is proposed using Naslin method, modulus optimum criterion and inverse dynamics method. The cascade form of the control is proposed based on combination of modulus optimum criterion and symmetric optimum criterion. The sensitivity is analysed on the basis of absolute and relative sensitivity of system function to the change of chosen parameter value of the mechatronic system, as well as the control subsystem. The tolerance is analysed in the form of determining the range of allowed relative changes of selected system parameters in the field of system stability. The tool allows to analyse an influence of torsion stiffness, torsion damping, inertia moments of the motor and the load and controller(s) parameters. The sensitivity and tolerance are monitored in terms of the impact of parameter change on the response in the form of system step response and system frequency-response logarithmic characteristics. The Symbolic Math Toolbox for expression of the final shape of analysed system functions was used. The sensitivity and tolerance are graphically represented as 2D graph of sensitivity or tolerance of the system function and 3D/2D static/interactive graph of step/frequency response.

Knowledge Representation Based On Interval Type-2 CFCM Clustering

This paper is concerned with knowledge representation and extraction of fuzzy if-then rules using Interval Type-2 Context-based Fuzzy C-Means clustering (IT2-CFCM) with the aid of fuzzy granulation. This proposed clustering algorithm is based on information granulation in the form of IT2 based Fuzzy C-Means (IT2-FCM) clustering and estimates the cluster centers by preserving the homogeneity between the clustered patterns from the IT2 contexts produced in the output space. Furthermore, we can obtain the automatic knowledge representation in the design of Radial Basis Function Networks (RBFN), Linguistic Model (LM), and Adaptive Neuro-Fuzzy Networks (ANFN) from the numerical input-output data pairs. We shall focus on a design of ANFN in this paper. The experimental results on an estimation problem of energy performance reveal that the proposed method showed a good knowledge representation and performance in comparison with the previous works.

A Quantitative Study of the Evolution of Open Source Software Communities

Typically, virtual communities exhibit the well-known phenomenon of participation inequality, which means that only a small percentage of users is responsible of the majority of contributions. However, the sustainability of the community requires that the group of active users must be continuously nurtured with new users that gain expertise through a participation process. This paper analyzes the time evolution of Open Source Software (OSS) communities, considering users that join/abandon the community over time and several topological properties of the network when modeled as a social network. More specifically, the paper analyzes the role of those users rejoining the community and their influence in the global characteristics of the network.

A Long Tail Study of eWOM Communities

Electronic Word-Of-Mouth (eWOM) communities represent today an important source of information in which more and more customers base their purchasing decisions. They include thousands of reviews concerning very different products and services posted by many individuals geographically distributed all over the world. Due to their massive audience, eWOM communities can help users to find the product they are looking for even if they are less popular or rare. This is known as the long tail effect, which leads to a larger number of lower-selling niche products. This paper analyzes the long tail effect in a well-known eWOM community and defines a tool for finding niche products unavailable through conventional channels.