Abstract: Gradient boosting methods have been proven to be a very important strategy. Many successful machine learning solutions were developed using the XGBoost and its derivatives. The aim of this study is to investigate and compare the efficiency of three gradient methods. Home credit dataset is used in this work which contains 219 features and 356251 records. However, new features are generated and several techniques are used to rank and select the best features. The implementation indicates that the LightGBM is faster and more accurate than CatBoost and XGBoost using variant number of features and records.
Abstract: This study solves a phylogeny problem by using modified wild dog pack optimization. The least squares error is considered as a cost function that needs to be minimized. Therefore, in each iteration, new distance matrices based on the constructed trees are calculated and used to select the alpha dog. To test the suggested algorithm, ten homologous genes are selected and collected from National Center for Biotechnology Information (NCBI) databanks (i.e., 16S, 18S, 28S, Cox 1, ITS1, ITS2, ETS, ATPB, Hsp90, and STN). The data are divided into three categories: 50 taxa, 100 taxa and 500 taxa. The empirical results show that the proposed algorithm is more reliable and accurate than other implemented methods.
Abstract: Genome rearrangement is an important area in computational biology and bioinformatics. The basic problem in genome rearrangements is to compute the edit distance, i.e., the minimum number of operations needed to transform one genome into another. Unfortunately, unsigned genome rearrangement problem is NP-hard. In this study an improved ant colony optimization algorithm to approximate the edit distance is proposed. The main idea is to convert the unsigned permutation to signed permutation and evaluate the ants by using Kaplan algorithm. Two new operations are added to the standard ant colony algorithm: Replacing the worst ants by re-sampling the ants from a new probability distribution and applying the crossover operations on the best ants. The proposed algorithm is tested and compared with the improved breakpoint reversal sort algorithm by using three datasets. The results indicate that the proposed algorithm achieves better accuracy ratio than the previous methods.
Abstract: Intrusion detection is a mechanism used to protect a
system and analyse and predict the behaviours of system users. An
ideal intrusion detection system is hard to achieve due to
nonlinearity, and irrelevant or redundant features. This study
introduces a new anomaly-based intrusion detection model. The
suggested model is based on particle swarm optimisation and
nonlinear, multi-class and multi-kernel support vector machines.
Particle swarm optimisation is used for feature selection by applying
a new formula to update the position and the velocity of a particle;
the support vector machine is used as a classifier. The proposed
model is tested and compared with the other methods using the KDD
CUP 1999 dataset. The results indicate that this new method achieves
better accuracy rates than previous methods.
Abstract: To improve the classification rate of the face
recognition, features combination and a novel non-linear kernel are
proposed. The feature vector concatenates three different radius of
local binary patterns and Gabor wavelet features. Gabor features are
the mean, standard deviation and the skew of each scaling and
orientation parameter. The aim of the new kernel is to incorporate
the power of the kernel methods with the optimal balance between
the features. To verify the effectiveness of the proposed method,
numerous methods are tested by using four datasets, which are
consisting of various emotions, orientations, configuration,
expressions and lighting conditions. Empirical results show the
superiority of the proposed technique when compared to other
methods.