Abstract: This study applies the sequential panel selection
method (SPSM) proposed by Chortareas and Kapetanios
(2009) to investigate the time-series properties of energy
consumption in 50 US states from 1963 to 2009. SPSM involves the
classification of the entire panel into a group of stationary series and
a group of non-stationary series to identify how many and which
series in the panel are stationary processes. Empirical results obtained
through SPSM with the panel KSS unit root test developed by Ucar
and Omay (2009) combined with a Fourier function indicate that
energy consumption in all 50 US states is stationary. The results
of this study have important policy implications for the 50 US states.
Abstract: In this paper, a wavelet-based neural network (WNN) classifier for recognizing EEG signals is implemented and tested on three sets of EEG signals (healthy subjects, patients with epilepsy, and patients with epileptic syndrome during a seizure). First, the Discrete Wavelet Transform (DWT) with Multi-Resolution Analysis (MRA) is applied to decompose the EEG signal at the resolution levels of its components (δ, θ, α, β and γ), and Parseval's theorem is employed to extract the percentage distribution of energy features of the EEG signal at the different resolution levels. Second, a neural network (NN) classifies these extracted features to identify the EEG type according to the percentage distribution of energy features. The performance of the proposed algorithm has been evaluated on a total of 300 EEG signals. The results show that the proposed classifier is able to recognize and classify EEG signals efficiently.
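The energy-distribution feature extraction described in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: a Haar wavelet and a fixed number of decomposition levels are assumptions (the abstract does not state the mother wavelet). Because the orthonormal DWT preserves signal energy (Parseval's theorem), sub-band energies can be read directly off the wavelet coefficients.

```python
import numpy as np

def haar_dwt_energy_features(signal, levels=4):
    """Decompose a 1-D signal with a Haar DWT and return the percentage
    of total energy in each sub-band; by Parseval's theorem the
    orthonormal transform preserves the signal's total energy."""
    a = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        a = a[: len(a) // 2 * 2]  # drop an odd trailing sample if present
        # Haar analysis: pairwise sums (approximation) and differences (detail)
        approx = (a[0::2] + a[1::2]) / np.sqrt(2.0)
        detail = (a[0::2] - a[1::2]) / np.sqrt(2.0)
        energies.append(np.sum(detail ** 2))
        a = approx
    energies.append(np.sum(a ** 2))  # final approximation band
    total = sum(energies)
    return [100.0 * e / total for e in energies]  # percentage distribution
```

The returned vector (one entry per detail band plus the final approximation) would then be fed to the classifier.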
Abstract: This paper addresses the problems encountered by conventional distance relays when protecting double-circuit transmission lines. The problems arise principally from the mutual coupling between the two circuits under different fault conditions; this mutual coupling is highly nonlinear in nature. An adaptive protection scheme based on an artificial neural network (ANN) is proposed for such lines. An ANN can classify the nonlinear relationship between measured signals by identifying the different patterns of the associated signals. A key point of the present work is that only current signals measured at the local end are used to detect and classify faults in a double-circuit transmission line with double-end infeed. The adaptive protection scheme is tested under a specific fault type while varying fault location, fault resistance, fault inception angle, and remote-end infeed. Once the neural network is trained adequately, it performs precisely under different system parameters and conditions. The test results clearly show that faults are detected and classified within a quarter cycle; the proposed adaptive protection technique is therefore well suited for double-circuit transmission line fault detection and classification. Performance studies show that the proposed neural-network-based module can improve the performance of conventional fault selection algorithms.
Abstract: Radio Frequency Identification (RFID), first introduced
during World War II, has revolutionized the world with its numerous
benefits and plethora of implementations in diverse areas ranging
from manufacturing to agriculture to healthcare to hotel management.
This work reviews the current research in this area with emphasis
on applications for supply chain management, and develops a
taxonomic framework to classify the literature that will enable swift
and easy content analysis and help identify areas for future
research.
Abstract: Surface metrology with image processing is a challenging task with wide applications in industry. Surface roughness can be evaluated using a texture classification approach. An important aspect is the appropriate selection of features that characterize the surface. We propose an effective combination of features for multi-scale and multi-directional analysis of engineering surfaces. The features include the standard deviation, kurtosis, and the Canny edge detector. We apply the method by analyzing the surfaces with the Discrete Wavelet Transform (DWT) and the Dual-Tree Complex Wavelet Transform (DT-CWT). We use the Canberra distance metric for similarity comparison between the surface classes. Our database includes surface textures manufactured by three machining processes, namely milling, casting and shaping. The comparative study shows that DT-CWT outperforms DWT, giving a correct classification performance of 91.27% with the Canberra distance metric.
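The similarity comparison and two of the statistical features can be sketched as follows. This is a minimal sketch: `canberra` and `subband_features` are illustrative helper names, and the Canny edge feature is omitted since it operates on the image rather than on wavelet coefficients.

```python
import numpy as np

def canberra(u, v):
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|),
    skipping terms where both entries are zero."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    denom = np.abs(u) + np.abs(v)
    mask = denom > 0
    return float(np.sum(np.abs(u - v)[mask] / denom[mask]))

def subband_features(coeffs):
    """Standard deviation and kurtosis of one wavelet sub-band,
    two of the features named in the abstract."""
    c = np.asarray(coeffs, float)
    mu, sd = c.mean(), c.std()
    kurt = np.mean((c - mu) ** 4) / (sd ** 4) if sd > 0 else 0.0
    return [sd, kurt]
```

A query surface would be assigned to the class whose stored feature vector gives the smallest Canberra distance.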
Abstract: This research compares the percentage of correct classification of the Empirical Bayes (EB) method with that of the classical method when data are constructed as near-normal, short-tailed and long-tailed symmetric, and short-tailed and long-tailed asymmetric. The study is performed using a conjugate prior for the normal distribution with known mean and unknown variance. The hyper-parameters estimated by the EB method are substituted into the posterior predictive probability and used to predict new observations. Data are generated, consisting of a training set and a test set, with sample sizes of 100, 200 and 500 for binary classification. The results show that the EB method exhibited improved performance over the classical method in all situations under study.
Abstract: In recent years, rapid advances in information technology software and hardware, along with a digital imaging revolution in the medical domain, have facilitated the generation and storage of large collections of images by hospitals and clinics. Searching these large image collections effectively and efficiently poses significant technical challenges and raises the necessity of constructing intelligent retrieval systems. Content-Based Image Retrieval (CBIR) consists of retrieving the most visually similar images to a given query image from a database of images [5]. Medical CBIR applications pose unique challenges but at the same time offer many new opportunities: while one can easily understand news or sports videos, a medical image is often completely incomprehensible to untrained eyes.
Abstract: Intelligent systems based on machine learning
techniques, such as classification and clustering, are gaining widespread
popularity in real world applications. This paper presents work on
developing a software system for predicting crop yield, for example
oil-palm yield, from climate and plantation data. At the core of our
system is a method for unsupervised partitioning of data for finding
spatio-temporal patterns in climate data using kernel methods which
are well suited to complex data. This work draws inspiration
from the notion that a non-linear data transformation into some high
dimensional feature space increases the possibility of linear
separability of the patterns in the transformed space. Therefore, it
simplifies exploration of the associated structure in the data. Kernel
methods implicitly perform a non-linear mapping of the input data
into a high dimensional feature space by replacing the inner products
with an appropriate positive definite function. In this paper we
present a robust weighted kernel k-means algorithm incorporating
spatial constraints for clustering the data. The proposed algorithm
can effectively handle noise, outliers and auto-correlation in the
spatial data, for effective and efficient data analysis by exploring
patterns and structures in the data, and thus can be used for
predicting oil-palm yield by analyzing various factors affecting the
yield.
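The clustering step described above can be illustrated with a bare-bones weighted kernel k-means. This is a sketch under assumptions: the paper's spatial-constraint and robustness terms are omitted, `weights` stands in for the per-point weighting, and the kernel matrix is assumed to be precomputed (e.g. an RBF kernel, realizing the implicit non-linear mapping via inner products).

```python
import numpy as np

def weighted_kernel_kmeans(K, k, weights, init=None, iters=20, seed=0):
    """Weighted kernel k-means on a precomputed kernel matrix K.
    Each point i carries a weight w_i (e.g. down-weighting noisy
    observations); squared distances to cluster means are evaluated
    in the implicit feature space via the kernel trick."""
    n = K.shape[0]
    w = np.asarray(weights, float)
    if init is None:
        labels = np.random.default_rng(seed).integers(0, k, n)
    else:
        labels = np.asarray(init)
    for _ in range(iters):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            idx = labels == c
            wc = w[idx]
            sw = wc.sum()
            if sw == 0:
                continue  # empty cluster: leave its distances at infinity
            # ||phi(x_i) - m_c||^2, dropping the constant K_ii term
            cross = K[:, idx] @ wc / sw
            within = wc @ K[np.ix_(idx, idx)] @ wc / sw ** 2
            dist[:, c] = -2.0 * cross + within
        new = dist.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels
```

Replacing the kernel or the weight vector changes the geometry of the partition without altering the algorithm itself.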
Abstract: This research proposes a methodology for patent-citation-based technology input-output analysis by applying patent information to input-output analysis, which was originally developed for studying dependencies among industries. For this analysis, a technology relationship matrix and its components, as well as input and technology inducement coefficients, are constructed using patent information. A technology inducement coefficient is then calculated by normalizing the degree of citation from certain IPCs (International Patent Classification codes) to different IPCs or to the same IPCs. Finally, we construct a Dependency Structure Matrix (DSM) based on the technology inducement coefficients to suggest a useful application of this methodology.
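The normalization step can be illustrated as below. This is a sketch under an assumption: the abstract says only that citation degrees are normalized, and a row-wise (per citing class) normalization is one natural reading, not necessarily the paper's exact formula.

```python
import numpy as np

def inducement_coefficients(citation_counts):
    """Turn a patent-citation count matrix (rows: citing IPC class,
    columns: cited IPC class) into coefficients by dividing each row
    by its total, so entries give the share of citations flowing from
    one IPC class to another (or back to itself on the diagonal)."""
    C = np.asarray(citation_counts, float)
    row_sums = C.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # leave all-zero rows as zeros
    return C / row_sums
```

The resulting matrix could then serve as the basis for the Dependency Structure Matrix.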
Abstract: This paper presents a complete discrete statistical framework based on a novel vector quantization (VQ) front-end process. The new VQ approach performs an optimal distribution of VQ codebook components over HMM states. This technique, which we name distributed vector quantization (DVQ) of hidden Markov models, succeeds in unifying the acoustic micro-structure and phonetic macro-structure when the HMM parameters are estimated. The DVQ technique is implemented in two variants. The first variant uses the K-means algorithm (K-means-DVQ) to optimize the VQ, while the second exploits the classification behavior of neural networks (NN-DVQ) for the same purpose. The proposed variants are compared with an HMM-based baseline system in experiments on the recognition of specific Arabic consonants. The results show that the distributed vector quantization technique increases the performance of the discrete HMM system.
Abstract: Data mining uses a variety of techniques, each of which is useful for some particular task. It is important to have a deep understanding of each technique and be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user find a strategy to optimize classification by adding, moving or deleting a neuron in order to change the number of classes. The tool is also able to automatically suggest a strategy for optimizing the number of classes. The tool is used to classify macroeconomic data that report the most developed countries' imports and exports. It is possible to classify the countries based on their economic behaviour and use an ad hoc tool to characterize the commercial behaviour of a country in a selected class from the analysis of the positive and negative features that contribute to class formation.
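The core update rule of a Kohonen network can be sketched as below. This is an illustration, not the tool described above: the grid size, decay schedules, and Gaussian neighbourhood are assumed parameters.

```python
import numpy as np

def train_som(data, grid_shape=(3, 3), epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Minimal Kohonen self-organizing map: each grid node holds a
    weight vector; for every sample, the best-matching unit (BMU) and
    its grid neighbours are pulled toward the sample, with learning
    rate and neighbourhood radius decaying over epochs."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, float)
    rows, cols = grid_shape
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1).astype(float)
    for epoch in range(epochs):
        lr = lr0 * np.exp(-epoch / epochs)
        sigma = sigma0 * np.exp(-epoch / epochs)
        for x in data:
            d = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(d.argmin(), d.shape)
            # Gaussian neighbourhood on the grid, centred at the BMU
            g = np.exp(-np.sum((coords - np.array(bmu, float)) ** 2, axis=2)
                       / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
    return weights

def classify(weights, x):
    """Assign a sample to the grid node with the nearest weight vector."""
    d = np.linalg.norm(weights - np.asarray(x, float), axis=2)
    return np.unravel_index(d.argmin(), d.shape)
```

Adding, moving, or deleting a neuron, as the tool allows, corresponds to editing the grid of weight vectors between training runs.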
Abstract: Human identification at a distance has recently gained
growing interest from computer vision researchers. Gait recognition
aims essentially to address this problem by identifying people based
on the way they walk [1]. Gait recognition involves three steps:
preprocessing, feature extraction, and classification. This paper
focuses on the classification step, which is essential to increase the
CCR (Correct Classification Rate). A Multilayer Perceptron (MLP) is
used in this work. Neural networks imitate the human brain to perform
intelligent tasks [3]. They can represent complicated relationships
between input and output and acquire knowledge about these
relationships directly from the data [2]. In this paper we apply an
MLP NN to 11 views in our database and
compare the CCR values for these views. Experiments are performed
with the NLPR databases, and the effectiveness of the proposed
method for gait recognition is demonstrated.
Abstract: There are several approaches to handling multiclass classification. Aside from one-against-one (OAO) and one-against-all (OAA), hierarchical classification techniques are also commonly used. A binary classification tree is a hierarchical classification structure that breaks a k-class problem down into binary sub-problems, each solved by a binary classifier. In each node, a set of classes is divided into two subsets. A good class partition should group similar classes together. Many algorithms measure similarity in terms of the distance between class centroids; classes are grouped together by a clustering algorithm when the distances between their centroids are small. In this paper, we present a binary classification tree with tuned observation-based clustering (BCT-TOB) that finds a class partition by performing clustering on observations instead of class centroids. A merging step is introduced to merge any insignificant class split. Experiments show that the performance of BCT-TOB is comparable to that of other algorithms.
Abstract: The proposed system identifies the species of a wood
using the textural features present in its bark. Each wood species
has its own unique patterns in its bark, which enables the proposed
system to identify it accurately. Automatic wood recognition systems
have not yet been well established, mainly due to a lack of research in
this area and the difficulty of obtaining a wood database. In our work,
a wood recognition system has been designed based on pre-processing
techniques, feature extraction, and correlation of the features of the
wood species for their classification. Texture classification is a problem
that has been studied and tested using different methods due to its
valuable usage in various pattern recognition problems, such as wood
recognition and rock classification. The most popular technique used
for textural classification is the Gray-Level Co-occurrence Matrix
(GLCM). The features extracted from the enhanced images using the
GLCM are correlated, and this correlation determines the classification
of the various wood species. The results show a high recognition
accuracy, indicating that the techniques used are suitable to be
implemented for commercial purposes.
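GLCM-based texture description can be sketched as follows. This is an illustration, not the paper's pipeline: the displacement, number of gray levels, and the particular Haralick-style features shown are assumed choices.

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=8):
    """Gray-level co-occurrence matrix for one displacement (dx, dy),
    normalized to a joint probability distribution. `image` must be an
    integer array already quantized to `levels` gray levels."""
    img = np.asarray(image)
    P = np.zeros((levels, levels))
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            i2, j2 = i + dy, j + dx
            if 0 <= i2 < h and 0 <= j2 < w:
                P[img[i, j], img[i2, j2]] += 1
    return P / P.sum()

def glcm_features(P):
    """A few common texture features derived from a GLCM."""
    levels = P.shape[0]
    i, j = np.indices((levels, levels))
    return {
        "contrast": np.sum(P * (i - j) ** 2),
        "energy": np.sum(P ** 2),
        "homogeneity": np.sum(P / (1.0 + np.abs(i - j))),
    }
```

Feature vectors computed this way for each bark image would then be correlated across species to assign a class.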
Abstract: This paper describes an optimal approach for feature
subset selection to classify the leaves based on Genetic Algorithm
(GA) and Kernel-based Principal Component Analysis (KPCA). Due
to high complexity in the selection of the optimal features, the
classification has become a critical task to analyse the leaf image
data. Initially the shape, texture and colour features are extracted
from the leaf images. These extracted features are optimized through
the separate functioning of GA and KPCA. This approach performs
an intersection operation over the subsets obtained from the
optimization process. Finally, the most common matching subset is
forwarded to train the Support Vector Machine (SVM). Our
experimental results show that applying GA and KPCA for feature
subset selection with an SVM classifier is computationally effective
and improves the accuracy of the classifier.
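The intersection step described above can be sketched directly. `intersect_subsets` and `select_features` are illustrative helper names, not the paper's code; features are represented by their column indices.

```python
import numpy as np

def intersect_subsets(ga_subset, kpca_subset):
    """Keep only the features selected by BOTH the GA search and the
    KPCA-based optimization."""
    return sorted(set(ga_subset) & set(kpca_subset))

def select_features(X, subset):
    """Restrict a (samples x features) matrix to the chosen columns
    before handing it to the SVM trainer."""
    return np.asarray(X)[:, list(subset)]
```

The reduced matrix, rather than the full feature set, is what the SVM is trained on.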
Abstract: The behavior of Radial Basis Function (RBF) networks depends greatly on how the center points of the basis functions are selected. In this work we investigate the use of instance reduction techniques, originally developed to reduce the storage requirements of instance-based learners, for this purpose. Five instance reduction techniques were used to determine the sets of center points, and RBF networks were trained using these sets. The performance of the RBF networks is studied in terms of classification accuracy and training time. The results were compared with two other radial basis function networks: RBF networks that use all instances of the training set as center points (RBF-ALL) and Probabilistic Neural Networks (PNN). The former achieves high classification accuracy, and the latter requires short training time. Results showed that RBF networks trained using sets of centers located by noise-filtering techniques (ALLKNN and ENN), rather than pure reduction techniques, produce the best results in terms of classification accuracy; these networks require less training time than RBF-ALL and achieve higher classification accuracy than PNN. Thus, using ALLKNN and ENN to select center points gives a better combination of classification accuracy and training time. Our experiments also show that using the reduced sets to train the networks is beneficial, especially in the presence of noise in the original training sets.
Abstract: The purpose of this paper is to demonstrate the ability
of a genetic programming (GP) algorithm to evolve a team of data
classification models. The GP algorithm used in this work is
“multigene” in nature, i.e. there are multiple tree structures (genes)
that are used to represent team members. Each team member assigns
a data sample to one of a fixed set of output classes. A majority vote,
determined using the mode (highest occurrence) of classes predicted
by the individual genes, is used to determine the final class
prediction. The algorithm is tested on a binary classification problem.
For the case study investigated, compact classification models are
obtained with comparable accuracy to alternative approaches.
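The majority-vote combination of gene outputs described above can be sketched as below. The tie-breaking rule (the first class to reach the winning count wins) is an assumption, since the abstract does not specify one.

```python
from collections import Counter

def team_predict(gene_predictions):
    """Combine the classes predicted by the individual genes (team
    members) for one sample via a majority vote: the final prediction
    is the mode (highest occurrence) of the predicted classes."""
    counts = Counter(gene_predictions)
    return counts.most_common(1)[0][0]
```

Applied per sample, this turns a list of per-gene class labels into a single team prediction.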
Abstract: This paper details the application of a genetic
programming framework for induction of useful classification rules
from a database of income statements, balance sheets, and cash flow
statements for North American public companies. Potentially
interesting classification rules are discovered. Anomalies in the
discovery process merit further investigation of the application of
genetic programming to the dataset for the problem domain.
Abstract: Array signal processing involves signal enumeration and source localization. It centers on the ability to fuse temporal and spatial information, captured by sampling signals emitted from a number of sources at the sensors of an array, in order to carry out a specific estimation task: estimating source characteristics (mainly the localization of the sources) and/or array characteristics (mainly the array geometry). Array signal processing uses sensors organized in patterns, or arrays, to detect signals and to determine information about them. Beamforming is a general signal processing technique used to control the directionality of the reception or transmission of a signal; using beamforming, the majority of the received signal energy can be directed along a chosen direction. Multiple Signal Classification (MUSIC) is a highly popular eigenstructure-based method for estimating the direction of arrival (DOA) with high resolution. This paper examines the effect of missing sensors on DOA estimation. The accuracy of MUSIC-based DOA estimation is degraded significantly both by missing sensors among the receiving array elements and by unequal channel gain and phase errors of the receiver.
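The MUSIC estimator named above can be sketched for a uniform linear array. The array geometry and half-wavelength spacing are assumptions for illustration; removing sensors from the array (the paper's subject) would change the steering vectors and degrade the spectrum.

```python
import numpy as np

def music_spectrum(R, n_sources, n_sensors, angles_deg, d=0.5):
    """MUSIC pseudospectrum for a uniform linear array with element
    spacing d (in wavelengths). R is the sensor covariance matrix; the
    noise subspace is spanned by the eigenvectors of the smallest
    eigenvalues, and DOAs appear as peaks of 1 / ||E_n^H a(theta)||^2."""
    eigvals, eigvecs = np.linalg.eigh(R)
    # eigh returns ascending eigenvalues: the first columns span noise space
    En = eigvecs[:, : n_sensors - n_sources]
    spectrum = []
    for theta in np.deg2rad(angles_deg):
        k = np.arange(n_sensors)
        a = np.exp(-2j * np.pi * d * k * np.sin(theta))  # steering vector
        denom = np.linalg.norm(En.conj().T @ a) ** 2
        spectrum.append(1.0 / denom)
    return np.array(spectrum)
```

Scanning `angles_deg` over a grid and taking the largest peaks yields the DOA estimates.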
Abstract: This study introduces a new method for detecting,
sorting, and localizing spikes from multiunit EEG recordings. The
method combines the wavelet transform, which localizes distinctive
spike features, with the Super-Paramagnetic Clustering (SPC) algorithm,
which allows automatic classification of the data without assumptions
such as low variance or Gaussian distributions. Moreover, the method
is capable of setting amplitude thresholds for spike detection. The
method is evaluated on several real EEG data sets, in which the
spikes are detected, clustered, and their occurrence times identified.
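The amplitude-threshold detection step can be sketched as follows. The MAD-based noise estimate (median(|x|)/0.6745) and the threshold multiplier are common choices in spike-sorting work, but they are assumptions here, since the abstract does not state how the thresholds are set.

```python
import numpy as np

def detect_spikes(signal, thresh_mult=4.0, refractory=30):
    """Amplitude-threshold spike detection: the threshold is derived
    from a robust noise estimate (median absolute deviation scaled by
    0.6745), and a refractory window in samples suppresses duplicate
    detections of a single spike."""
    x = np.asarray(signal, float)
    noise_sigma = np.median(np.abs(x)) / 0.6745
    thresh = thresh_mult * noise_sigma
    spike_times, last = [], -refractory
    for t, v in enumerate(np.abs(x)):
        if v > thresh and t - last >= refractory:
            spike_times.append(t)
            last = t
    return spike_times
```

The windows of samples around the returned times would then be passed to the wavelet feature extraction and SPC clustering stages.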