An Effective Noise Resistant FM Continuous-Wave Radar Vital Sign Signal Detection Method

To address the problem that the FM continuous-wave (FMCW) radar extracts human vital sign signals which are susceptible to noise interference and low reconstruction accuracy, a detection scheme for the sign signals is proposed. Firstly, an improved complete ensemble empirical modal decomposition with adaptive noise (ICEEMDAN) algorithm is applied to decompose the radar-extracted thoracic signals to obtain several intrinsic modal functions (IMF) with different spatial scales, and then the IMF components are optimized by a backpropagation (BP) neural network improved by immune genetic algorithm (IGA). The simulation results show that this scheme can effectively separate the noise, accurately extract the respiratory and heartbeat signals and improve the reconstruction accuracy and signal to-noise ratio of the sign signals.

Multi-Sensor Target Tracking Using Ensemble Learning

Multiple classifier systems combine several individual classifiers to deliver a final classification decision. However, an increasingly controversial question is whether such systems can outperform the single best classifier, and if so, what form of multiple classifiers system yields the most significant benefit. Also, multi-target tracking detection using multiple sensors is an important research field in mobile techniques and military applications. In this paper, several multiple classifiers systems are evaluated in terms of their ability to predict a system’s failure or success for multi-sensor target tracking tasks. The Bristol Eden project dataset is utilised for this task. Experimental and simulation results show that the human activity identification system can fulfil requirements of target tracking due to improved sensors classification performances with multiple classifier systems constructed using boosting achieving higher accuracy rates.

Cirrhosis Mortality Prediction as Classification Using Frequent Subgraph Mining

In this work, we use machine learning and data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. Our work applies modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

A Review and Comparative Analysis on Cluster Ensemble Methods

Clustering is an unsupervised learning technique for aggregating data objects into meaningful classes so that intra cluster similarity is maximized and inter cluster similarity is minimized in data mining. However, no single clustering algorithm proves to be the most effective in producing the best result. As a result, a new challenging technique known as the cluster ensemble approach has blossomed in order to determine the solution to this problem. For the cluster analysis issue, this new technique is a successful approach. The cluster ensemble's main goal is to combine similar clustering solutions in a way that achieves the precision while also improving the quality of individual data clustering. Because of the massive and rapid creation of new approaches in the field of data mining, the ongoing interest in inventing novel algorithms necessitates a thorough examination of current techniques and future innovation. This paper presents a comparative analysis of various cluster ensemble approaches, including their methodologies, formal working process, and standard accuracy and error rates. As a result, the society of clustering practitioners will benefit from this exploratory and clear research, which will aid in determining the most appropriate solution to the problem at hand.

An Optimal Control Method for Reconstruction of Topography in Dam-Break Flows

Modeling dam-break flows over non-flat beds requires an accurate representation of the topography which is the main source of uncertainty in the model. Therefore, developing robust and accurate techniques for reconstructing topography in this class of problems would reduce the uncertainty in the flow system. In many hydraulic applications, experimental techniques have been widely used to measure the bed topography. In practice, experimental work in hydraulics may be very demanding in both time and cost. Meanwhile, computational hydraulics have served as an alternative for laboratory and field experiments. Unlike the forward problem, the inverse problem is used to identify the bed parameters from the given experimental data. In this case, the shallow water equations used for modeling the hydraulics need to be rearranged in a way that the model parameters can be evaluated from measured data. However, this approach is not always possible and it suffers from stability restrictions. In the present work, we propose an adaptive optimal control technique to numerically identify the underlying bed topography from a given set of free-surface observation data. In this approach, a minimization function is defined to iteratively determine the model parameters. The proposed technique can be interpreted as a fractional-stage scheme. In the first stage, the forward problem is solved to determine the measurable parameters from known data. In the second stage, the adaptive control Ensemble Kalman Filter is implemented to combine the optimality of observation data in order to obtain the accurate estimation of the topography. The main features of this method are on one hand, the ability to solve for different complex geometries with no need for any rearrangements in the original model to rewrite it in an explicit form. On the other hand, its achievement of strong stability for simulations of flows in different regimes containing shocks or discontinuities over any geometry. Numerical results are presented for a dam-break flow problem over non-flat bed using different solvers for the shallow water equations. The robustness of the proposed method is investigated using different numbers of loops, sensitivity parameters, initial samples and location of observations. The obtained results demonstrate high reliability and accuracy of the proposed techniques.

Rank-Based Chain-Mode Ensemble for Binary Classification

In the field of machine learning, the ensemble has been employed as a common methodology to improve the performance upon multiple base classifiers. However, the true predictions are often canceled out by the false ones during consensus due to a phenomenon called “curse of correlation” which is represented as the strong interferences among the predictions produced by the base classifiers. In addition, the existing practices are still not able to effectively mitigate the problem of imbalanced classification. Based on the analysis on our experiment results, we conclude that the two problems are caused by some inherent deficiencies in the approach of consensus. Therefore, we create an enhanced ensemble algorithm which adopts a designed rank-based chain-mode consensus to overcome the two problems. In order to evaluate the proposed ensemble algorithm, we employ a well-known benchmark data set NSL-KDD (the improved version of dataset KDDCup99 produced by University of New Brunswick) to make comparisons between the proposed and 8 common ensemble algorithms. Particularly, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in terms of the improvements toward the accuracy and reliability upon the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus is proved to be a more effective ensemble solution than the traditional consensus approach, which outperforms the 8 ensemble algorithms by 20% on almost all compared metrices which include accuracy, precision, recall, F1-score and area under receiver operating characteristic curve.

Design of an Ensemble Learning Behavior Anomaly Detection Framework

Data assets protection is a crucial issue in the cybersecurity field. Companies use logical access control tools to vault their information assets and protect them against external threats, but they lack solutions to counter insider threats. Nowadays, insider threats are the most significant concern of security analysts. They are mainly individuals with legitimate access to companies information systems, which use their rights with malicious intents. In several fields, behavior anomaly detection is the method used by cyber specialists to counter the threats of user malicious activities effectively. In this paper, we present the step toward the construction of a user and entity behavior analysis framework by proposing a behavior anomaly detection model. This model combines machine learning classification techniques and graph-based methods, relying on linear algebra and parallel computing techniques. We show the utility of an ensemble learning approach in this context. We present some detection methods tests results on an representative access control dataset. The use of some explored classifiers gives results up to 99% of accuracy.

River Stage-Discharge Forecasting Based on Multiple-Gauge Strategy Using EEMD-DWT-LSSVM Approach

This study presented hybrid pre-processing approach along with a conceptual model to enhance the accuracy of river discharge prediction. In order to achieve this goal, Ensemble Empirical Mode Decomposition algorithm (EEMD), Discrete Wavelet Transform (DWT) and Mutual Information (MI) were employed as a hybrid pre-processing approach conjugated to Least Square Support Vector Machine (LSSVM). A conceptual strategy namely multi-station model was developed to forecast the Souris River discharge more accurately. The strategy used herein was capable of covering uncertainties and complexities of river discharge modeling. DWT and EEMD was coupled, and the feature selection was performed for decomposed sub-series using MI to be employed in multi-station model. In the proposed feature selection method, some useless sub-series were omitted to achieve better performance. Results approved efficiency of the proposed DWT-EEMD-MI approach to improve accuracy of multi-station modeling strategies.

Predictive Semi-Empirical NOx Model for Diesel Engine

Accurate prediction of NOx emission is a continuous challenge in the field of diesel engine-out emission modeling. Performing experiments for each conditions and scenario cost significant amount of money and man hours, therefore model-based development strategy has been implemented in order to solve that issue. NOx formation is highly dependent on the burn gas temperature and the O2 concentration inside the cylinder. The current empirical models are developed by calibrating the parameters representing the engine operating conditions with respect to the measured NOx. This makes the prediction of purely empirical models limited to the region where it has been calibrated. An alternative solution to that is presented in this paper, which focus on the utilization of in-cylinder combustion parameters to form a predictive semi-empirical NOx model. The result of this work is shown by developing a fast and predictive NOx model by using the physical parameters and empirical correlation. The model is developed based on the steady state data collected at entire operating region of the engine and the predictive combustion model, which is developed in Gamma Technology (GT)-Power by using Direct Injected (DI)-Pulse combustion object. In this approach, temperature in both burned and unburnt zone is considered during the combustion period i.e. from Intake Valve Closing (IVC) to Exhaust Valve Opening (EVO). Also, the oxygen concentration consumed in burnt zone and trapped fuel mass is also considered while developing the reported model.  Several statistical methods are used to construct the model, including individual machine learning methods and ensemble machine learning methods. A detailed validation of the model on multiple diesel engines is reported in this work. Substantial numbers of cases are tested for different engine configurations over a large span of speed and load points. Different sweeps of operating conditions such as Exhaust Gas Recirculation (EGR), injection timing and Variable Valve Timing (VVT) are also considered for the validation. Model shows a very good predictability and robustness at both sea level and altitude condition with different ambient conditions. The various advantages such as high accuracy and robustness at different operating conditions, low computational time and lower number of data points requires for the calibration establishes the platform where the model-based approach can be used for the engine calibration and development process. Moreover, the focus of this work is towards establishing a framework for the future model development for other various targets such as soot, Combustion Noise Level (CNL), NO2/NOx ratio etc.

A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling

Datasets or collections are becoming important assets by themselves and now they can be accepted as a primary intellectual output of a research. The quality and usage of the datasets depend mainly on the context under which they have been collected, processed, analyzed, validated, and interpreted. This paper aims to present a collection of program educational objectives mapped to student’s outcomes collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time consuming process. In addition, it requires experts in the area, which are mostly not available. It has been shown the operational settings under which the collection has been produced. The collection has been cleansed, preprocessed, some features have been selected and preliminary exploratory data analysis has been performed so as to illustrate the properties and usefulness of the collection. At the end, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to other experimented multi-label classification methods. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.

Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Performance Assessment of Multi-Level Ensemble for Multi-Class Problems

Many supervised machine learning tasks require decision making across numerous different classes. Multi-class classification has several applications, such as face recognition, text recognition and medical diagnostics. The objective of this article is to analyze an adapted method of Stacking in multi-class problems, which combines ensembles within the ensemble itself. For this purpose, a training similar to Stacking was used, but with three levels, where the final decision-maker (level 2) performs its training by combining outputs from the tree-based pair of meta-classifiers (level 1) from Bayesian families. These are in turn trained by pairs of base classifiers (level 0) of the same family. This strategy seeks to promote diversity among the ensembles forming the meta-classifier level 2. Three performance measures were used: (1) accuracy, (2) area under the ROC curve, and (3) time for three factors: (a) datasets, (b) experiments and (c) levels. To compare the factors, ANOVA three-way test was executed for each performance measure, considering 5 datasets by 25 experiments by 3 levels. A triple interaction between factors was observed only in time. The accuracy and area under the ROC curve presented similar results, showing a double interaction between level and experiment, as well as for the dataset factor. It was concluded that level 2 had an average performance above the other levels and that the proposed method is especially efficient for multi-class problems when compared to binary problems.

Long Wavelength Coherent Pulse of Sound Propagating in Granular Media

A mechanical wave or vibration propagating through granular media exhibits a specific signature in time. A coherent pulse or wavefront arrives first with multiply scattered waves (coda) arriving later. The coherent pulse is micro-structure independent i.e. it depends only on the bulk properties of the disordered granular sample, the sound wave velocity of the granular sample and hence bulk and shear moduli. The coherent wavefront attenuates (decreases in amplitude) and broadens with distance from its source. The pulse attenuation and broadening effects are affected by disorder (polydispersity; contrast in size of the granules) and have often been attributed to dispersion and scattering. To study the effect of disorder and initial amplitude (non-linearity) of the pulse imparted to the system on the coherent wavefront, numerical simulations have been carried out on one-dimensional sets of particles (granular chains). The interaction force between the particles is given by a Hertzian contact model. The sizes of particles have been selected randomly from a Gaussian distribution, where the standard deviation of this distribution is the relevant parameter that quantifies the effect of disorder on the coherent wavefront. Since, the coherent wavefront is system configuration independent, ensemble averaging has been used for improving the signal quality of the coherent pulse and removing the multiply scattered waves. The results concerning the width of the coherent wavefront have been formulated in terms of scaling laws. An experimental set-up of photoelastic particles constituting a granular chain is proposed to validate the numerical results.

A Distributed Mobile Agent Based on Intrusion Detection System for MANET

This study is about an algorithmic dependence of Artificial Neural Network on Multilayer Perceptron (MPL) pertaining to the classification and clustering presentations for Mobile Adhoc Network vulnerabilities. Moreover, mobile ad hoc network (MANET) is ubiquitous intelligent internetworking devices in which it has the ability to detect their environment using an autonomous system of mobile nodes that are connected via wireless links. Security affairs are the most important subject in MANET due to the easy penetrative scenarios occurred in such an auto configuration network. One of the powerful techniques used for inspecting the network packets is Intrusion Detection System (IDS); in this article, we are going to show the effectiveness of artificial neural networks used as a machine learning along with stochastic approach (information gain) to classify the malicious behaviors in simulated network with respect to different IDS techniques. The monitoring agent is responsible for detection inference engine, the audit data is collected from collecting agent by simulating the node attack and contrasted outputs with normal behaviors of the framework, whenever. In the event that there is any deviation from the ordinary behaviors then the monitoring agent is considered this event as an attack , in this article we are going to demonstrate the  signature-based IDS approach in a MANET by implementing the back propagation algorithm over ensemble-based Traffic Table (TT), thus the signature of malicious behaviors or undesirable activities are often significantly prognosticated and efficiently figured out, by increasing the parametric set-up of Back propagation algorithm during the experimental results which empirically shown its effectiveness  for the ratio of detection index up to 98.6 percentage. Consequently it is proved in empirical results in this article, the performance matrices are also being included in this article with Xgraph screen show by different through puts like Packet Delivery Ratio (PDR), Through Put(TP), and Average Delay(AD).

Optimizing Approach for Sifting Process to Solve a Common Type of Empirical Mode Decomposition Mode Mixing

Empirical mode decomposition (EMD), a new data-driven of time-series decomposition, has the advantage of supposing that a time series is non-linear or non-stationary, as is implicitly achieved in Fourier decomposition. However, the EMD suffers of mode mixing problem in some cases. The aim of this paper is to present a solution for a common type of signals causing of EMD mode mixing problem, in case a signal suffers of an intermittency. By an artificial example, the solution shows superior performance in terms of cope EMD mode mixing problem comparing with the conventional EMD and Ensemble Empirical Mode decomposition (EEMD). Furthermore, the over-sifting problem is also completely avoided; and computation load is reduced roughly six times compared with EEMD, an ensemble number of 50.

Potential Climate Change Impacts on the Hydrological System of the Harvey River Catchment

Climate change is likely to impact the Australian continent by changing the trends of rainfall, increasing temperature, and affecting the accessibility of water quantity and quality. This study investigates the possible impacts of future climate change on the hydrological system of the Harvey River catchment in Western Australia by using the conceptual modelling approach (HBV mode). Daily observations of rainfall and temperature and the long-term monthly mean potential evapotranspiration, from six weather stations, were available for the period (1961-2015). The observed streamflow data at Clifton Park gauging station for 33 years (1983-2015) in line with the observed climate variables were used to run, calibrate and validate the HBV-model prior to the simulation process. The calibrated model was then forced with the downscaled future climate signals from a multi-model ensemble of fifteen GCMs of the CMIP3 model under three emission scenarios (A2, A1B and B1) to simulate the future runoff at the catchment outlet. Two periods were selected to represent the future climate conditions including the mid (2046-2065) and late (2080-2099) of the 21st century. A control run, with the reference climate period (1981-2000), was used to represent the current climate status. The modelling outcomes show an evident reduction in the mean annual streamflow during the mid of this century particularly for the A1B scenario relative to the control run. Toward the end of the century, all scenarios show a relatively high reduction trends in the mean annual streamflow, especially the A1B scenario, compared to the control run. The decline in the mean annual streamflow ranged between 4-15% during the mid of the current century and 9-42% by the end of the century.

Event Related Potentials in Terms of Visual and Auditory Stimuli

Event-related potential (ERP) is one of the useful tools for investigating cognitive reactions. In this study, the potential of ERP components detected after auditory and visual stimuli was examined. Subjects were asked to respond upon stimuli that were of three categories; Target, Non-Target and Standard stimuli. The ERP after stimulus was measured. In the experiment of visual evoked potentials (VEPs), the subjects were asked to gaze at a center point on the monitor screen where the stimuli were provided by the reversal pattern of the checkerboard. In consequence of the VEP experiments, we observed consistent reactions. Each peak voltage could be measured when the ensemble average was applied. Visual stimuli had smaller amplitude and a longer latency compared to that of auditory stimuli. The amplitude was the highest with Target and the smallest with Standard in both stimuli.

Orchestra Course Outcomes in Terms of Values Education

Music education aims to bring up individuals most appropriately and to advanced levels as a balanced whole physically, cognitively, affectively, and kinesthetically while making a major contribution to the physical and spiritual development of the individual. The most crucial aim of music education, an influential education medium per se, is to make music be loved; yet, among its educational aims are concepts such as affinity, friendship, goodness, philanthropy, responsibility, and respect all extremely crucial bringing up individuals as a balanced whole. One of the most essential assets of the music education is the training of making music together, solidifying musical knowledge and enabling the acquisition of cooperation. This habit requires internalization of values like responsibility, patience, cooperativeness, respect, self-control, friendship, and fairness. If musicians lack these values, the ensemble will become after some certain time a cacophony. In this qualitative research, the attitudes of music teacher candidates in orchestra/chamber music classes will be examined in terms of values.

Evaluation of Ensemble Classifiers for Intrusion Detection

One of the major developments in machine learning in the past decade is the ensemble method, which finds highly accurate classifier by combining many moderately accurate component classifiers. In this research work, new ensemble classification methods are proposed with homogeneous ensemble classifier using bagging and heterogeneous ensemble classifier using arcing and their performances are analyzed in terms of accuracy. A Classifier ensemble is designed using Radial Basis Function (RBF) and Support Vector Machine (SVM) as base classifiers. The feasibility and the benefits of the proposed approaches are demonstrated by the means of standard datasets of intrusion detection. The main originality of the proposed approach is based on three main parts: preprocessing phase, classification phase, and combining phase. A wide range of comparative experiments is conducted for standard datasets of intrusion detection. The performance of the proposed homogeneous and heterogeneous ensemble classifiers are compared to the performance of other standard homogeneous and heterogeneous ensemble methods. The standard homogeneous ensemble methods include Error correcting output codes, Dagging and heterogeneous ensemble methods include majority voting, stacking. The proposed ensemble methods provide significant improvement of accuracy compared to individual classifiers and the proposed bagged RBF and SVM performs significantly better than ECOC and Dagging and the proposed hybrid RBF-SVM performs significantly better than voting and stacking. Also heterogeneous models exhibit better results than homogeneous models for standard datasets of intrusion detection. 

Breast Cancer Survivability Prediction via Classifier Ensemble

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.