Abstract: In this paper, we presented our innovative way of determining the driving context for a driving assistance system. We invoke the fusion of all parameters that describe the context of the environment, the vehicle and the driver to obtain the driving context. We created a training set that stores driving situation patterns and from which the system consults to determine the driving situation. A machine-learning algorithm predicts the driving situation. The driving situation is an input to the fission process that yields the action that must be implemented when the driver needs to be informed or assisted from the given the driving situation. The action may be directed towards the driver, the vehicle or both. This is an ongoing work whose goal is to offer an alternative driving assistance system for safe driving, green driving and comfortable driving. Here, ontologies are used for knowledge representation.
Abstract: This aims of this paper is to forecast the electricity spot prices. First, we focus on modeling the conditional mean of the series so we adopt a generalized fractional -factor Gegenbauer process (k-factor GARMA). Secondly, the residual from the -factor GARMA model has used as a proxy for the conditional variance; these residuals were predicted using two different approaches. In the first approach, a local linear wavelet neural network model (LLWNN) has developed to predict the conditional variance using the Back Propagation learning algorithms. In the second approach, the Gegenbauer generalized autoregressive conditional heteroscedasticity process (G-GARCH) has adopted, and the parameters of the k-factor GARMA-G-GARCH model has estimated using the wavelet methodology based on the discrete wavelet packet transform (DWPT) approach. The empirical results have shown that the k-factor GARMA-G-GARCH model outperform the hybrid k-factor GARMA-LLWNN model, and find it is more appropriate for forecasts.
Abstract: In electromagnetic imaging, because of the diffraction limited system, the pixel values could change slowly near the edge of the image targets and they also change with the location in the same target. Using traditional digital image segmentation methods to segment electromagnetic gradient images could result in lots of errors because of this change in pixel values. To address this issue, this paper proposes a novel image segmentation and extraction algorithm based on Mean-Shift and dictionary learning. Firstly, the preliminary segmentation results from adaptive bandwidth Mean-Shift algorithm are expanded, merged and extracted. Then the overlap rate of the extracted image block is detected before determining a segmentation region with a single complete target. Last, the gradient edge of the extracted targets is recovered and reconstructed by using a dictionary-learning algorithm, while the final segmentation results are obtained which are very close to the gradient target in the original image. Both the experimental results and the simulated results show that the segmentation results are very accurate. The Dice coefficients are improved by 70% to 80% compared with the Mean-Shift only method.
Abstract: This paper illustrates an application of granular computing approach, namely rough set theory in data mining. The paper outlines the formalism of granular computing and elucidates the mathematical underpinning of rough set theory, which has been widely used by the data mining and the machine learning community. A real-world application is illustrated, and the classification performance is compared with other contending machine learning algorithms. The predictive performance of the rough set rule induction model shows comparative success with respect to other contending algorithms.
Abstract: Energy production optimization has been traditionally very important for utilities in order to improve resource consumption. However, load forecasting is a challenging task, as there are a large number of relevant variables that must be considered, and several strategies have been used to deal with this complex problem. This is especially true also in microgrids where many elements have to adjust their performance depending on the future generation and consumption conditions. The goal of this paper is to present a solution for short-term load forecasting in microgrids, based on three machine learning experiments developed in R and web services built and deployed with different components of Cortana Intelligence Suite: Azure Machine Learning, a fully managed cloud service that enables to easily build, deploy, and share predictive analytics solutions; SQL database, a Microsoft database service for app developers; and PowerBI, a suite of business analytics tools to analyze data and share insights. Our results show that Boosted Decision Tree and Fast Forest Quantile regression methods can be very useful to predict hourly short-term consumption in microgrids; moreover, we found that for these types of forecasting models, weather data (temperature, wind, humidity and dew point) can play a crucial role in improving the accuracy of the forecasting solution. Data cleaning and feature engineering methods performed in R and different types of machine learning algorithms (Boosted Decision Tree, Fast Forest Quantile and ARIMA) will be presented, and results and performance metrics discussed.
Abstract: In this paper, we study a distributed control algorithm
for the problem of unknown area coverage by a network of robots.
The coverage objective is to locate a set of targets in the area and
to minimize the robots’ energy consumption. The robots have no
prior knowledge about the location and also about the number of the
targets in the area. One efficient approach that can be used to relax
the robots’ lack of knowledge is to incorporate an auxiliary learning
algorithm into the control scheme. A learning algorithm actually
allows the robots to explore and study the unknown environment
and to eventually overcome their lack of knowledge. The control
algorithm itself is modeled based on game theory where the network
of the robots use their collective information to play a non-cooperative
potential game. The algorithm is tested via simulations to verify its
performance and adaptability.
Abstract: This paper presents an approach for optimal cyber security decisions to protect instances of a federated Internet of Things (IoT) platform in the cloud. The presented solution implements the repeated Stackelberg Security Game (SSG) and a model called Stochastic Human behaviour model with AttRactiveness and Probability weighting (SHARP). SHARP employs the Subjective Utility Quantal Response (SUQR) for formulating a subjective utility function, which is based on the evaluations of alternative solutions during decision-making. We augment the repeated SSG (including SHARP and SUQR) with a reinforced learning algorithm called Naïve Q-Learning. Naïve Q-Learning belongs to the category of active and model-free Machine Learning (ML) techniques in which the agent (either the defender or the attacker) attempts to find an optimal security solution. In this way, we combine GT and ML algorithms for discovering optimal cyber security policies. The proposed security optimization components will be validated in a collaborative cloud platform that is based on the Industrial Internet Reference Architecture (IIRA) and its recently published security model.
Abstract: In this paper, we present a low cost design for a smart glove that can perform sign language recognition to assist the speech impaired people. Specifically, we have designed and developed an Assistive Hand Gesture Interpreter that recognizes hand movements relevant to the American Sign Language (ASL) and translates them into text for display on a Thin-Film-Transistor Liquid Crystal Display (TFT LCD) screen as well as synthetic speech. Linear Bayes Classifiers and Multilayer Neural Networks have been used to classify 11 feature vectors obtained from the sensors on the glove into one of the 27 ASL alphabets and a predefined gesture for space. Three types of features are used; bending using six bend sensors, orientation in three dimensions using accelerometers and contacts at vital points using contact sensors. To gauge the performance of the presented design, the training database was prepared using five volunteers. The accuracy of the current version on the prepared dataset was found to be up to 99.3% for target user. The solution combines electronics, e-textile technology, sensor technology, embedded system and machine learning techniques to build a low cost wearable glove that is scrupulous, elegant and portable.
Abstract: Feature Selection is significant in order to perform constructive classification in the area of cancer diagnosis. However, a large number of features compared to the number of samples makes the task of classification computationally very hard and prone to errors in microarray gene expression datasets. In this paper, we present an innovative method for selecting highly informative gene subsets of gene expression data that effectively classifies the cancer data into tumorous and non-tumorous. The hybrid gene selection technique comprises of combined Mutual Information and Fisher score to select informative genes. The gene selection is validated by classification using Support Vector Machine (SVM) which is a supervised learning algorithm capable of solving complex classification problems. The results obtained from improved Mutual Information and F-Score with SVM as a classifier has produced efficient results.
Abstract: An active islanding detection method using disturbance signal injection with intelligent controller is proposed in this study. First, a DC\AC power inverter is emulated in the distributed generator (DG) system to implement the tracking control of active power, reactive power outputs and the islanding detection. The proposed active islanding detection method is based on injecting a disturbance signal into the power inverter system through the d-axis current which leads to a frequency deviation at the terminal of the RLC load when the utility power is disconnected. Moreover, in order to improve the transient and steady-state responses of the active power and reactive power outputs of the power inverter, and to further improve the performance of the islanding detection method, two probabilistic fuzzy neural networks (PFNN) are adopted to replace the traditional proportional-integral (PI) controllers for the tracking control and the islanding detection. Furthermore, the network structure and the online learning algorithm of the PFNN are introduced in detail. Finally, the feasibility and effectiveness of the tracking control and the proposed active islanding detection method are verified with experimental results.
Abstract: Most of self-tuning fuzzy systems, which are
automatically constructed from learning data, are based on the
steepest descent method (SDM). However, this approach often
requires a large convergence time and gets stuck into a shallow
local minimum. One of its solutions is to use fuzzy rule modules
with a small number of inputs such as DIRMs (Double-Input Rule
Modules) and SIRMs (Single-Input Rule Modules). In this paper,
we consider a (generalized) DIRMs model composed of double
and single-input rule modules. Further, in order to reduce the
redundant modules for the (generalized) DIRMs model, pruning and
generative learning algorithms for the model are suggested. In order
to show the effectiveness of them, numerical simulations for function
approximation, Box-Jenkins and obstacle avoidance problems are
performed.
Abstract: MicroRNAs are small non-coding RNA found in
many different species. They play crucial roles in cancer such as
biological processes of apoptosis and proliferation. The identification
of microRNA-target genes can be an essential first step towards to
reveal the role of microRNA in various cancer types. In this paper,
we predict miRNA-target genes for lung cancer by integrating
prediction scores from miRanda and PITA algorithms used as a
feature vector of miRNA-target interaction. Then, machine-learning
algorithms were implemented for making a final prediction. The
approach developed in this study should be of value for future studies
into understanding the role of miRNAs in molecular mechanisms
enabling lung cancer formation.
Abstract: Advances in spatial and spectral resolution of satellite
images have led to tremendous growth in large image databases. The
data we acquire through satellites, radars, and sensors consists of
important geographical information that can be used for remote
sensing applications such as region planning, disaster management.
Spatial data classification and object recognition are important tasks
for many applications. However, classifying objects and identifying
them manually from images is a difficult task. Object recognition is
often considered as a classification problem, this task can be
performed using machine-learning techniques. Despite of many
machine-learning algorithms, the classification is done using
supervised classifiers such as Support Vector Machines (SVM) as the
area of interest is known. We proposed a classification method,
which considers neighboring pixels in a region for feature extraction
and it evaluates classifications precisely according to neighboring
classes for semantic interpretation of region of interest (ROI). A
dataset has been created for training and testing purpose; we
generated the attributes by considering pixel intensity values and
mean values of reflectance. We demonstrated the benefits of using
knowledge discovery and data-mining techniques, which can be on
image data for accurate information extraction and classification from
high spatial resolution remote sensing imagery.
Abstract: In this paper, a learning algorithm using neuronal networks to improve the roll stability and prevent the rollover in a single unit heavy vehicle is proposed. First, LQR control to keep balanced normalized rollovers, between front and rear axles, below the unity, then a data collected from this controller is used as a training basis of a neuronal regulator. The ANN controller is thereafter applied for the nonlinear side force model, and gives satisfactory results than the LQR one.
Abstract: The aim of this paper is to propose a general
framework for storing, analyzing, and extracting knowledge from
two-dimensional echocardiographic images, color Doppler images,
non-medical images, and general data sets. A number of high
performance data mining algorithms have been used to carry out this
task. Our framework encompasses four layers namely physical
storage, object identification, knowledge discovery, user level.
Techniques such as active contour model to identify the cardiac
chambers, pixel classification to segment the color Doppler echo
image, universal model for image retrieval, Bayesian method for
classification, parallel algorithms for image segmentation, etc., were
employed. Using the feature vector database that have been
efficiently constructed, one can perform various data mining tasks
like clustering, classification, etc. with efficient algorithms along
with image mining given a query image. All these facilities are
included in the framework that is supported by state-of-the-art user
interface (UI). The algorithms were tested with actual patient data
and Coral image database and the results show that their performance
is better than the results reported already.
Abstract: People, throughout the history, have made estimates
and inferences about the future by using their past experiences.
Developing information technologies and the improvements in the
database management systems make it possible to extract useful
information from knowledge in hand for the strategic decisions.
Therefore, different methods have been developed. Data mining by
association rules learning is one of such methods. Apriori algorithm,
one of the well-known association rules learning algorithms, is not
commonly used in spatio-temporal data sets. However, it is possible
to embed time and space features into the data sets and make Apriori
algorithm a suitable data mining technique for learning spatiotemporal
association rules. Lake Van, the largest lake of Turkey, is a
closed basin. This feature causes the volume of the lake to increase or
decrease as a result of change in water amount it holds. In this study,
evaporation, humidity, lake altitude, amount of rainfall and
temperature parameters recorded in Lake Van region throughout the
years are used by the Apriori algorithm and a spatio-temporal data
mining application is developed to identify overflows and newlyformed
soil regions (underflows) occurring in the coastal parts of
Lake Van. Identifying possible reasons of overflows and underflows
may be used to alert the experts to take precautions and make the
necessary investments.
Abstract: The critical concern of satellite operations is to ensure
the health and safety of satellites. The worst case in this perspective
is probably the loss of a mission, but the more common interruption
of satellite functionality can result in compromised mission
objectives. All the data acquiring from the spacecraft are known as
Telemetry (TM), which contains the wealth information related to the
health of all its subsystems. Each single item of information is
contained in a telemetry parameter, which represents a time-variant
property (i.e. a status or a measurement) to be checked. As a
consequence, there is a continuous improvement of TM monitoring
systems to reduce the time required to respond to changes in a
satellite's state of health. A fast conception of the current state of the
satellite is thus very important to respond to occurring failures.
Statistical multivariate latent techniques are one of the vital learning
tools that are used to tackle the problem above coherently.
Information extraction from such rich data sources using advanced
statistical methodologies is a challenging task due to the massive
volume of data. To solve this problem, in this paper, we present a
proposed unsupervised learning algorithm based on Principle
Component Analysis (PCA) technique. The algorithm is particularly
applied on an actual remote sensing spacecraft. Data from the
Attitude Determination and Control System (ADCS) was acquired
under two operation conditions: normal and faulty states. The models
were built and tested under these conditions, and the results show that
the algorithm could successfully differentiate between these
operations conditions. Furthermore, the algorithm provides
competent information in prediction as well as adding more insight
and physical interpretation to the ADCS operation.
Abstract: In this paper, we present an optimization technique or
a learning algorithm using the hybrid architecture by combining the
most popular sequence recognition models such as Recurrent Neural
Networks (RNNs) and Hidden Markov models (HMMs). In order to
improve the sequence/pattern recognition/classification performance
by applying a hybrid/neural symbolic approach, a gradient descent
learning algorithm is developed using the Real Time Recurrent
Learning of Recurrent Neural Network for processing the knowledge
represented in trained Hidden Markov Models. The developed hybrid
algorithm is implemented on automata theory as a sample test beds
and the performance of the designed algorithm is demonstrated and
evaluated on learning the deterministic finite state automata.
Abstract: Human beings have the ability to make logical
decisions. Although human decision - making is often optimal, it is
insufficient when huge amount of data is to be classified. Medical
dataset is a vital ingredient used in predicting patient’s health
condition. In other to have the best prediction, there calls for most
suitable machine learning algorithms. This work compared the
performance of Artificial Neural Network (ANN) and Decision Tree
Algorithms (DTA) as regards to some performance metrics using
diabetes data. WEKA software was used for the implementation of
the algorithms. Multilayer Perceptron (MLP) and Radial Basis
Function (RBF) were the two algorithms used for ANN, while
RegTree and LADTree algorithms were the DTA models used. From
the results obtained, DTA performed better than ANN. The Root
Mean Squared Error (RMSE) of MLP is 0.3913 that of RBF is
0.3625, that of RepTree is 0.3174 and that of LADTree is 0.3206
respectively.
Abstract: Today, there is a large number of political transcripts
available on the Web to be mined and used for statistical analysis,
and product recommendations. As the online political resources are
used for various purposes, automatically determining the political
orientation on these transcripts becomes crucial. The methodologies
used by machine learning algorithms to do an automatic classification
are based on different features that are classified under categories
such as Linguistic, Personality etc. Considering the ideological
differences between Liberals and Conservatives, in this paper, the
effect of Personality traits on political orientation classification is
studied. The experiments in this study were based on the correlation
between LIWC features and the BIG Five Personality traits. Several
experiments were conducted using Convote U.S. Congressional-
Speech dataset with seven benchmark classification algorithms. The
different methodologies were applied on several LIWC feature sets
that constituted by 8 to 64 varying number of features that are
correlated to five personality traits. As results of experiments,
Neuroticism trait was obtained to be the most differentiating
personality trait for classification of political orientation. At the same
time, it was observed that the personality trait based classification
methodology gives better and comparable results with the related
work.