Gait Biometric for Person Re-Identification

Biometric identification recognizes a person by unique features such as fingerprints, iris, ear, and voice, which need the subject's permission and physical contact. Gait biometrics identifies a person by their unique gait, using features extracted from motion. Its main advantage is that it can identify a person's gait at a distance, without any physical contact. In this work, gait biometrics is used for person re-identification. A person walking naturally is compared with the same person walking with a bag, a coat, and a case, recorded using long-wave infrared, short-wave infrared, medium-wave infrared, and visible cameras. The videos are recorded in rural and urban environments. Pre-processing includes human detection using You Only Look Once, background subtraction, silhouette extraction, and synthesis of the Gait Entropy Image by averaging the silhouettes. The moving features are extracted from the Gait Entropy Energy Image. The extracted features are reduced in dimensionality by Principal Component Analysis and recognized using different classifiers. The comparative results show that Linear Discriminant Analysis outperforms the other classifiers, with 95.8% for the visible camera in the rural dataset and 94.8% for long-wave infrared in the urban dataset.
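
The silhouette-averaging step can be illustrated with a minimal sketch. It assumes the silhouettes are already aligned binary masks of equal size, and it assumes the common definition of a gait entropy image as the per-pixel Shannon entropy of the averaged silhouette; the abstract does not give the exact formula, and the function names here are hypothetical.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Average a stack of binary silhouettes, shape (frames, height, width), pixel-wise."""
    stack = np.asarray(silhouettes, dtype=float)
    return stack.mean(axis=0)

def gait_entropy_image(silhouettes, eps=1e-12):
    """Per-pixel Shannon entropy of the averaged silhouette (assumed definition)."""
    p = gait_energy_image(silhouettes)   # fraction of frames in which the pixel is foreground
    q = 1.0 - p
    # eps keeps log2 finite for pixels that are always (or never) foreground
    return -(p * np.log2(p + eps) + q * np.log2(q + eps))
```

Pixels that change across the gait cycle (limbs) receive high entropy, while static pixels (torso, background) receive values near zero.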

Performance Analysis of Traffic Classification with Machine Learning

Network security plays an essential role in the ICT environment because the number of malicious users keeps growing in education, business, and other ICT-related realms. Network security violations are typically described and examined centrally through a security event management system. Firewalls, Intrusion Detection Systems (IDS), and Intrusion Prevention Systems are becoming essential for monitoring or preventing potential violations, attack incidents, and imminent threats. In this system, firewall rules are set only where the system policies require them. The datasets used in this system are derived from the testbed environment. DoS and PortScan traffic is applied in the testbed, which implements a firewall and an IDS. Network traffic in the testbed environment is classified as normal or attack using six machine learning classification methods. The testbed must be exercised with DoS and PortScan traffic to obtain the datasets. The dataset is based on CICIDS2017, with some features added. This system tested 26 features from the applied dataset. The system aims to reduce the false positive rate and to improve accuracy in the implemented testbed design. The system also demonstrates good performance by selecting important features and by comparing against an existing dataset using machine learning classifiers.

Rank-Based Chain-Mode Ensemble for Binary Classification

In machine learning, ensembles are commonly employed to improve performance over multiple base classifiers. However, true predictions are often canceled out by false ones during consensus due to a phenomenon called the “curse of correlation”, which manifests as strong interference among the predictions produced by the base classifiers. In addition, existing practices still cannot effectively mitigate the problem of imbalanced classification. Based on an analysis of our experimental results, we conclude that the two problems are caused by inherent deficiencies in the consensus approach. Therefore, we create an enhanced ensemble algorithm that adopts a specially designed rank-based chain-mode consensus to overcome the two problems. To evaluate the proposed ensemble algorithm, we employ the well-known benchmark data set NSL-KDD (the improved version of the KDDCup99 dataset produced by the University of New Brunswick) to compare the proposed algorithm with 8 common ensemble algorithms. In particular, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in improvement in accuracy and reliability over the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus proves to be a more effective ensemble solution than the traditional consensus approach: it outperforms the 8 ensemble algorithms by 20% on almost all compared metrics, which include accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve.

Amelioration of Cardiac Arrythmias Classification Performance Using Artificial Neural Network, Adaptive Neuro-Fuzzy and Fuzzy Inference Systems Classifiers

This paper aims to make a scientific contribution to biomedical diagnosis systems for cardiac arrhythmia; more precisely, it studies the amelioration of cardiac arrhythmia classification performance using artificial neural network, adaptive neuro-fuzzy, and fuzzy inference system classifiers. The purpose of this amelioration is to enable cardiologists to make reliable diagnoses through automatic cardiac arrhythmia analyses and classifications based on high-confidence classifiers. In this study, six classes of the most commonly encountered arrhythmias are considered: the Right Bundle Branch Block, the Left Bundle Branch Block, the Ventricular Extrasystole, the Auricular Extrasystole, the Atrial Fibrillation, and the Normal Cardiac rate beat. From the parameters extracted from the electrocardiogram (ECG), we constructed a matrix (360x360) serving as an input data sample for the classifiers based on neural networks and a matrix (1x6) for the classifier based on fuzzy logic. By varying three parameters (the quality of the neural network learning, the data size, and the quality of the input parameters), the automatic classification yielded the following correct classification rates: 83.6% using the fuzzy logic based classifier, 99.7% using the neural network based classifier, and 99.8% using the adaptive neuro-fuzzy based classifier. These results are based on signals containing at least 360 cardiac cycles. Based on the comparative analysis of the three arrhythmia classifiers, the classifiers based on neural networks exhibit better performance.

Foot Recognition Using Deep Learning for Knee Rehabilitation

Foot recognition can be applied in many medical fields, such as gait pattern analysis and knee exercises for patients in rehabilitation. Generally, a camera-based foot recognition system is intended to capture a patient image in a controlled room and background, and to recognize the foot in limited views. However, such a system can be inconvenient for monitoring knee exercises at home. To overcome these problems, this paper proposes a deep learning method using Convolutional Neural Networks (CNNs) for foot recognition. The results are compared with a traditional classification method using LBP and HOG features with kNN and SVM classifiers. According to the results, the deep learning method recognizes foot images from online databases with better accuracy, but with higher complexity, than the traditional classification method.

Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is in transition from clinician oriented to technology oriented. Many people around the world die of cancer because the disease was not diagnosed at an early stage. Nowadays, computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper carries out a comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree), and Artificial Neural Networks (ANN). The evaluation is based on the standard evaluation metrics Precision (P), Recall (R), F1-score, and Accuracy. The experimental results show that ANN achieved the highest accuracy (99.4%) when tested with the breast cancer dataset. On the other hand, when these ML classifiers were tested with the cervical cancer dataset, the Ensemble (Bagged Tree) technique gave better accuracy (93.1%) than the other classifiers.

Investigation of Wave Atom Sub-Bands via Breast Cancer Classification

This paper investigates successful sub-bands of wave atom transform via classification of mammograms, when the coefficients of sub-bands are used as features. A computer-aided diagnosis system is constructed by using wave atom transform, support vector machine and k-nearest neighbor classifiers. Two-class classification is studied in detail using two data sets, separately. The successful sub-bands are determined according to the accuracy rates, coefficient numbers, and sensitivity rates.

Evaluation of Classification Algorithms for Road Environment Detection

Accurate road environment information is needed for applications such as road maintenance and virtual 3D city modeling. Mobile laser scanning (MLS) efficiently produces dense point clouds of huge areas, from which the road and its environment can be modeled in detail. Objects such as buildings, cars, and trees are an important part of road environments. Different methods have been developed for detecting such objects, but accuracy is still lacking due to problems of illumination, environmental changes, and multiple objects with the same features. In this work, different classifiers, namely Multiclass SVM, kNN, and Multiclass LDA, are compared for road environment detection. Finally, kNN with LBP features achieved a classification accuracy of 93.3%, higher than the other classifiers.

Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Sentiment analysis and opinion mining have become emerging topics of research in recent years, but most of the work is focused on data in the English language. Comprehensive research and analysis that consider multiple languages, machine translation techniques, and different classifiers are essential. This paper presents a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: one classifies text without language translation, and the other translates the testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or whether building language-specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and the accuracy of various classifiers for multilingual sentiment analysis are also discussed in this study.

Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Attribute or feature selection is one of the basic strategies for improving the performance of data classification tasks while reducing the complexity of classifiers, and it is particularly fundamental when the number of attributes is relatively high. Its application to unsupervised classification is restricted to a limited number of experiments in the literature. Evolutionary computation has already proven itself to be a very effective choice for consistently reducing the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We present a feature selection wrapper model composed of a multi-objective evolutionary algorithm, the clustering method Expectation-Maximization (EM), and the classifier C4.5 for the unsupervised classification of data extracted from a psychological test named BASC-II (Behavior Assessment System for Children - II ed.) with two objectives: maximizing the likelihood of the clustering model and maximizing the accuracy of the obtained classifier. We present a methodology to integrate feature selection for unsupervised classification, model evaluation, decision making (to choose the most satisfactory model according to an a posteriori process in a multi-objective context), and testing. We compare the performance of the classifiers obtained by the multi-objective evolutionary algorithms ENORA and NSGA-II, and the best solution is then validated by the psychologists who collected the data.

Evaluation of Ensemble Classifiers for Intrusion Detection

One of the major developments in machine learning in the past decade is the ensemble method, which builds a highly accurate classifier by combining many moderately accurate component classifiers. In this research work, new ensemble classification methods are proposed, a homogeneous ensemble classifier using bagging and a heterogeneous ensemble classifier using arcing, and their performance is analyzed in terms of accuracy. A classifier ensemble is designed using Radial Basis Function (RBF) and Support Vector Machine (SVM) as base classifiers. The feasibility and benefits of the proposed approaches are demonstrated on standard intrusion detection datasets. The main originality of the proposed approach lies in its three main parts: a preprocessing phase, a classification phase, and a combining phase. A wide range of comparative experiments is conducted on standard intrusion detection datasets. The performance of the proposed homogeneous and heterogeneous ensemble classifiers is compared with that of other standard homogeneous and heterogeneous ensemble methods. The standard homogeneous ensemble methods include Error Correcting Output Codes (ECOC) and Dagging; the heterogeneous ensemble methods include majority voting and stacking. The proposed ensemble methods provide a significant improvement in accuracy over individual classifiers: the proposed bagged RBF and SVM perform significantly better than ECOC and Dagging, and the proposed hybrid RBF-SVM performs significantly better than voting and stacking. In addition, heterogeneous models exhibit better results than homogeneous models on standard intrusion detection datasets.
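
As a rough illustration of the homogeneous (bagging) idea, the sketch below trains several copies of one base learner on bootstrap resamples and combines them by majority vote. The base learner is a hypothetical nearest-centroid stand-in chosen for brevity, not the RBF or SVM classifiers used in the paper, and all function names are assumptions.

```python
import numpy as np

class NearestCentroid:
    """Hypothetical stand-in base learner: predicts the class of the closest centroid."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # distances has shape (n_samples, n_classes)
        distances = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[distances.argmin(axis=1)]

def bagging_fit(X, y, n_estimators=11, seed=0):
    """Train each base learner on a bootstrap resample (drawn with replacement)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(X), size=len(X))
        models.append(NearestCentroid().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Combine the base learners by majority vote; ties go to the smallest label."""
    X = np.asarray(X, dtype=float)
    votes = np.array([m.predict(X) for m in models])   # (n_estimators, n_samples)
    winners = []
    for column in votes.T:
        labels, counts = np.unique(column, return_counts=True)
        winners.append(labels[counts.argmax()])
    return np.array(winners)
```

A heterogeneous ensemble would differ mainly in that the list of models mixes different learner types instead of repeating one.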

Breast Cancer Survivability Prediction via Classifier Ensemble

This paper presents a classifier ensemble approach for predicting the survivability of breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components: a feature selection component and a classifier ensemble component. The feature selection component divides the features in the SEER database into four groups. It then tries to find the most important features among the four groups, those that maximize the weighted average F-score of a given classification algorithm. The ensemble component uses three different classifiers, each of which models a different set of features from SEER obtained through the feature selection module. On top of them, another classifier gives the final decision based on the output decisions and confidence scores of the underlying classifiers. Different classification algorithms have been examined; the best setup found uses the decision tree, Bayesian network, and Naïve Bayes algorithms for the underlying classifiers and Naïve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same SEER data (period of 1973-2002). It gives an 87.39% weighted average F-score, compared to 85.82% and 81.34% for the other published systems. When the data size is increased to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held-out unseen test set.
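
For readers unfamiliar with the metric, here is a minimal sketch of a weighted average F-score. It assumes the usual reading in which each class's F1 is weighted by that class's share of the true labels; the abstract does not spell out the weighting, and the function name is hypothetical.

```python
from collections import Counter

def weighted_f_score(y_true, y_pred):
    """Average per-class F1, weighting each class by its share of the true labels."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        score += (count / total) * f1
    return score
```

Weighting by class share keeps a rare class from dominating the average, which matters when the outcome classes are imbalanced.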

Performance Comparison of Situation-Aware Models for Activating Robot Vacuum Cleaner in a Smart Home

We assume an IoT-based smart-home environment where the on-off status of each electrical appliance, including the room lights, can be recognized in real time by monitoring and analyzing the smart meter data. At any moment in such an environment, we can recognize what the household or the user is doing by referring to the status data of the appliances. In this paper, we focus on a smart-home service that activates a robot vacuum cleaner at the right time by recognizing the user situation, which requires a situation-aware model that can distinguish the situations that allow vacuum cleaning (Yes) from those that do not (No). As candidate models, we learn a few classifiers, such as naïve Bayes, decision tree, and logistic regression, that can map the appliance-status data into Yes and No situations. Our training and test data are obtained from simulations of user behaviors, in which a sequence of user situations such as cooking, eating, dish washing, and so on is generated, with the status of the relevant appliances changing in accordance with the situation changes. During the simulation, both the situation transition and the resulting appliance status are determined stochastically. To compare the performance of these classifiers, we obtain their learning curves for different types of users through simulations. The result of our empirical study reveals that naïve Bayes achieves slightly better classification accuracy than the other compared classifiers.
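
The mapping from appliance status to a Yes/No situation can be sketched with a small Bernoulli naïve Bayes classifier over on/off flags. This is an illustrative assumption about how such a model might look, not the authors' implementation; the function names and the Laplace smoothing parameter are hypothetical.

```python
import math

def train_bernoulli_nb(rows, labels, alpha=1.0):
    """rows: lists of 0/1 appliance flags; labels: e.g. "Yes" or "No" per row."""
    n_features = len(rows[0])
    model = {}
    for c in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == c]
        log_prior = math.log(len(members) / len(rows))
        # Laplace-smoothed probability that each appliance is on, given class c
        p_on = [(sum(r[j] for r in members) + alpha) / (len(members) + 2 * alpha)
                for j in range(n_features)]
        model[c] = (log_prior, p_on)
    return model

def predict(model, row):
    """Return the class with the highest log posterior for one status vector."""
    def log_posterior(c):
        log_prior, p_on = model[c]
        return log_prior + sum(math.log(p if x else 1.0 - p) for x, p in zip(row, p_on))
    return max(model, key=log_posterior)
```

For example, with two flags (stove on, lights on), rows such as `[1, 0]` labelled "No" and `[0, 1]` labelled "Yes" would teach the model that cooking blocks cleaning.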

Evaluation of Robust Feature Descriptors for Texture Classification

Texture is an important characteristic in real and synthetic scenes. Texture analysis plays a critical role in inspecting surfaces and provides important techniques in a variety of applications. Although several descriptors have been presented to extract texture features, the development of object recognition is still a difficult task due to the complex aspects of texture. Recently, many robust and scale-invariant image features such as SIFT, SURF, and ORB have been successfully used in image retrieval and object recognition. In this paper, we compare the performance of texture classification using these feature descriptors with k-means clustering. Different classifiers, including K-NN, Naive Bayes, Back Propagation Neural Network, Decision Tree, and Kstar, were applied to three texture image sets: UIUCTex, KTH-TIPS, and Brodatz. Experimental results reveal that SIFT has the best average accuracy rate on UIUCTex and KTH-TIPS, while SURF has the advantage on the Brodatz texture set. The BP neural network works best in test set classification among all the classifiers used.

Activity Recognition by Smartphone Accelerometer Data Using Ensemble Learning Methods

As smartphones are equipped with various sensors, many studies have focused on using these sensors to create valuable applications. Human activity recognition is one such application, motivated by various welfare applications such as support for the elderly, measurement of calorie consumption, analysis of lifestyle and exercise patterns, and so on. One of the challenges in using smartphone sensors for activity recognition is that the number of sensors should be minimized to save battery power. In this paper, we show that a fairly accurate classifier can be built that can distinguish ten different activities by using data from only a single sensor, i.e., the smartphone accelerometer. The approach that we adopt to deal with this twelve-class problem uses various methods. The features used for classifying these activities include not only the magnitude of the acceleration vector at each time point, but also the maximum, the minimum, and the standard deviation of the vector magnitude within a time window. The experiments compared the performance of four kinds of basic multi-class classifiers and the performance of four kinds of ensemble learning methods based on three kinds of basic multi-class classifiers. The results show that the method with the highest accuracy is ECOC based on random forest.
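
The features described above can be sketched as follows. The sketch assumes three-axis accelerometer samples and non-overlapping fixed-length windows; the window length, the windowing scheme, and the function name are assumptions, since the abstract does not specify them.

```python
import numpy as np

def window_features(acc, window):
    """acc: array of shape (n_samples, 3) holding x, y, z acceleration.

    Returns the per-sample vector magnitude and, for each non-overlapping
    window, the maximum, minimum, and standard deviation of that magnitude.
    """
    acc = np.asarray(acc, dtype=float)
    magnitude = np.linalg.norm(acc, axis=1)
    n_windows = len(magnitude) // window   # a trailing partial window is dropped
    features = []
    for i in range(n_windows):
        segment = magnitude[i * window:(i + 1) * window]
        features.append([segment.max(), segment.min(), segment.std()])
    return magnitude, np.array(features)
```

Using the magnitude rather than the raw axes makes the features independent of how the phone is oriented in a pocket or hand.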

Lipschitz Classifiers Ensembles: Usage for Classification of Target Events in C-OTDR Monitoring Systems

This paper introduces an original method for guaranteed estimation of the accuracy of an ensemble of Lipschitz classifiers. The solution is obtained as a finite closed set of alternative hypotheses that contains the object of classification with a probability not less than a specified value. Thus, the classification is represented by a set of hypothetical classes. In this case, the smaller the cardinality of the discrete set of hypothetical classes, the higher the classification accuracy. Experiments have shown that if the cardinality of the classifier ensemble is increased, the cardinality of this set of hypothetical classes is reduced. The problem of guaranteed estimation of the accuracy of an ensemble of Lipschitz classifiers is relevant to multichannel classification of target events in C-OTDR monitoring systems. Results of the practical use of the suggested approach for accuracy control in C-OTDR monitoring systems are presented.

The Optimization of Decision Rules in Multimodal Decision-Level Fusion Scheme

This paper introduces an original method for parametric optimization of the structure of a multimodal decision-level fusion scheme, which combines the partial solutions of the classification task obtained from an assembly of mono-modal classifiers. As a result, a multimodal fusion classifier with the minimum total error rate is obtained.

Spike Sorting Method Using Exponential Autoregressive Modeling of Action Potentials

Neurons in the nervous system communicate with each other by producing electrical signals called spikes. To investigate the physiological function of the nervous system, it is essential to study the activity of neurons by detecting and sorting spikes in the recorded signal. In this paper, a method is proposed for the spike sorting problem based on nonlinear modeling of spikes using an exponential autoregressive model. A genetic algorithm is used for model parameter estimation, and selected model coefficients are used as features for sorting. For optimal selection of model coefficients, a self-organizing feature map is used. The results show that modeling spikes with the nonlinear autoregressive model outperforms its linear counterpart. The features extracted from the coefficients of the exponential autoregressive model are also better than wavelet-based features and yield more compact and well-separated clusters. For spikes that differ only in small-scale structure, where principal component analysis fails to produce separated clouds in the feature space, the proposed method obtains well-separated clusters, which removes the need for complex classifiers.
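
To make the model class concrete, the sketch below computes a one-step prediction from an exponential autoregressive model. It assumes the standard form x_t = sum_i (a_i + b_i * exp(-gamma * x_{t-1}^2)) * x_{t-i} + e_t; the abstract does not state the exact variant or model order used, and the genetic-algorithm parameter estimation is not shown. The function and parameter names are hypothetical.

```python
import numpy as np

def expar_predict(history, a, b, gamma):
    """One-step prediction of an exponential AR model of order p = len(a).

    Each lag coefficient is a_i + b_i * exp(-gamma * x_{t-1}**2), so the
    effective dynamics depend on the amplitude of the most recent sample.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    p = len(a)
    lags = np.asarray(history[-p:], dtype=float)[::-1]   # x_{t-1}, ..., x_{t-p}
    coefficients = a + b * np.exp(-gamma * lags[0] ** 2)
    return float(coefficients @ lags)
```

With gamma equal to zero the model collapses to a linear autoregressive model with coefficients a + b, which is the linear counterpart the abstract compares against.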

The Use of Classifiers in Image Analysis of Oil Wells Profiling Process and the Automatic Identification of Events

Different strategies and tools are available in the oil and gas industry for detecting and analyzing tension and possible fractures in borehole walls. Most of these techniques are based on manual observation of the captured borehole images. While this strategy may be feasible and convenient with small images and little data, it may become difficult and prone to errors when large databases of images must be processed. Moreover, the patterns may differ across the image area, depending on many characteristics (drilling strategy, rock components, rock strength, etc.). In this work we propose the inclusion of data-mining classification strategies in order to create a knowledge database of the segmented curves. After some time spent using the system and manually marking the parts of borehole images that correspond to tension regions and breakout areas, these classifiers allow the system to indicate and suggest new candidate regions automatically, with higher accuracy. We suggest the use of different classifier methods in order to achieve different knowledge dataset configurations.

Empirical and Indian Automotive Equity Portfolio Decision Support

A brief review of the empirical studies on the methodology of stock market decision support indicates that they are at the threshold of validating the accuracy of traditional models and of fuzzy, artificial neural network, and decision tree models. Many researchers have attempted to compare these models using various data sets worldwide. However, the research community is still on the way to conclusive confidence in the models that have emerged. This paper uses automotive sector stock prices from the National Stock Exchange (NSE), India, and analyzes them for intra-sectoral support for stock market decisions. The study identifies the significant variables, and their lags, that affect stock prices using OLS analysis and decision tree classifiers.