A Multi-Population Differential Evolution with Adaptive Mutation and Local Search for Global Optimization

This paper presents a multi population Differential Evolution (DE) with adaptive mutation and local search for global optimization, named AMMADE in order to better coordinate the cooperation between the populations and the rational use of resources. In AMMADE, the population is divided based on the Euclidean distance sorting method at each generation to appropriately coordinate the cooperation between subpopulations and the usage of resources, such that the best-performed subpopulation will get more computing resources in the next generation. Further, an adaptive local search strategy is employed on the best-performed subpopulation to achieve a balanced search. The proposed algorithm has been tested by solving optimization problems taken from CEC2014 benchmark problems. Experimental results show that our algorithm can achieve a competitive or better result than related methods. The results also confirm the significance of devised strategies in the proposed algorithm.

Variable vs. Fixed Window Width Code Correlation Reference Waveform Receivers for Multipath Mitigation in Global Navigation Satellite Systems with Binary Offset Carrier and Multiplexed Binary Offset Carrier Signals

This paper compares the multipath mitigation performance of code correlation reference waveform receivers with variable and fixed window width, for binary offset carrier and multiplexed binary offset carrier signals typically used in global navigation satellite systems. In the variable window width method, such width is iteratively reduced until the distortion on the discriminator with multipath is eliminated. This distortion is measured as the Euclidean distance between the actual discriminator (obtained with the incoming signal), and the local discriminator (generated with a local copy of the signal). The variable window width have shown better performance compared to the fixed window width. In particular, the former yields zero error for all delays for the BOC and MBOC signals considered, while the latter gives rather large nonzero errors for small delays in all cases. Due to its computational simplicity, the variable window width method is perfectly suitable for implementation in low-cost receivers.

A Minimum Spanning Tree-Based Method for Initializing the K-Means Clustering Algorithm

The traditional k-means algorithm has been widely used as a simple and efficient clustering method. However, the algorithm often converges to local minima for the reason that it is sensitive to the initial cluster centers. In this paper, an algorithm for selecting initial cluster centers on the basis of minimum spanning tree (MST) is presented. The set of vertices in MST with same degree are regarded as a whole which is used to find the skeleton data points. Furthermore, a distance measure between the skeleton data points with consideration of degree and Euclidean distance is presented. Finally, MST-based initialization method for the k-means algorithm is presented, and the corresponding time complexity is analyzed as well. The presented algorithm is tested on five data sets from the UCI Machine Learning Repository. The experimental results illustrate the effectiveness of the presented algorithm compared to three existing initialization methods.

Electricity Generation from Renewables and Targets: An Application of Multivariate Statistical Techniques

Renewable energy is referred to as "clean energy" and common popular support for the use of renewable energy (RE) is to provide electricity with zero carbon dioxide emissions. This study provides useful insight into the European Union (EU) RE, especially, into electricity generation obtained from renewables, and their targets. The objective of this study is to identify groups of European countries, using multivariate statistical analysis and selected indicators. The hierarchical clustering method is used to decide the number of clusters for EU countries. The conducted statistical hierarchical cluster analysis is based on the Ward’s clustering method and squared Euclidean distances. Hierarchical cluster analysis identified eight distinct clusters of European countries. Then, non-hierarchical clustering (k-means) method was applied. Discriminant analysis was used to determine the validity of the results with data normalized by Z score transformation. To explore the relationship between the selected indicators, correlation coefficients were computed. The results of the study reveal the current situation of RE in European Union Member States.

Speaker Identification by Atomic Decomposition of Learned Features Using Computational Auditory Scene Analysis Principals in Noisy Environments

Speaker recognition is performed in high Additive White Gaussian Noise (AWGN) environments using principals of Computational Auditory Scene Analysis (CASA). CASA methods often classify sounds from images in the time-frequency (T-F) plane using spectrograms or cochleargrams as the image. In this paper atomic decomposition implemented by matching pursuit performs a transform from time series speech signals to the T-F plane. The atomic decomposition creates a sparsely populated T-F vector in “weight space” where each populated T-F position contains an amplitude weight. The weight space vector along with the atomic dictionary represents a denoised, compressed version of the original signal. The arraignment or of the atomic indices in the T-F vector are used for classification. Unsupervised feature learning implemented by a sparse autoencoder learns a single dictionary of basis features from a collection of envelope samples from all speakers. The approach is demonstrated using pairs of speakers from the TIMIT data set. Pairs of speakers are selected randomly from a single district. Each speak has 10 sentences. Two are used for training and 8 for testing. Atomic index probabilities are created for each training sentence and also for each test sentence. Classification is performed by finding the lowest Euclidean distance between then probabilities from the training sentences and the test sentences. Training is done at a 30dB Signal-to-Noise Ratio (SNR). Testing is performed at SNR’s of 0 dB, 5 dB, 10 dB and 30dB. The algorithm has a baseline classification accuracy of ~93% averaged over 10 pairs of speakers from the TIMIT data set. The baseline accuracy is attributable to short sequences of training and test data as well as the overall simplicity of the classification algorithm. The accuracy is not affected by AWGN and produces ~93% accuracy at 0dB SNR.

Selecting the Best Sub-Region Indexing the Images in the Case of Weak Segmentation Based On Local Color Histograms

Color Histogram is considered as the oldest method used by CBIR systems for indexing images. In turn, the global histograms do not include the spatial information; this is why the other techniques coming later have attempted to encounter this limitation by involving the segmentation task as a preprocessing step. The weak segmentation is employed by the local histograms while other methods as CCV (Color Coherent Vector) are based on strong segmentation. The indexation based on local histograms consists of splitting the image into N overlapping blocks or sub-regions, and then the histogram of each block is computed. The dissimilarity between two images is reduced, as consequence, to compute the distance between the N local histograms of the both images resulting then in N*N values; generally, the lowest value is taken into account to rank images, that means that the lowest value is that which helps to designate which sub-region utilized to index images of the collection being asked. In this paper, we make under light the local histogram indexation method in the hope to compare the results obtained against those given by the global histogram. We address also another noteworthy issue when Relying on local histograms namely which value, among N*N values, to trust on when comparing images, in other words, which sub-region among the N*N sub-regions on which we base to index images. Based on the results achieved here, it seems that relying on the local histograms, which needs to pose an extra overhead on the system by involving another preprocessing step naming segmentation, does not necessary mean that it produces better results. In addition to that, we have proposed here some ideas to select the local histogram on which we rely on to encode the image rather than relying on the local histogram having lowest distance with the query histograms.

Generation of Photo-Mosaic Images through Block Matching and Color Adjustment

Mosaic refers to a technique that makes image by gathering lots of small materials in various colors. This paper presents an automatic algorithm that makes the photo-mosaic image using photos. The algorithm is composed of 4 steps: partition and feature extraction, block matching, redundancy removal and color adjustment. The input image is partitioned in the small block to extract feature. Each block is matched to find similar photo in database by comparing similarity with Euclidean difference between blocks. The intensity of the block is adjusted to enhance the similarity of image by replacing the value of light and darkness with that of relevant block. Further, the quality of image is improved by minimizing the redundancy of tiles in the adjacent blocks. Experimental results support that the proposed algorithm is excellent in quantitative analysis and qualitative analysis.

Adaptive WiFi Fingerprinting for Location Approximation

WiFi has become an essential technology that is widely used nowadays. It is famous due to its convenience to be used with mobile devices. This is especially true for Internet users worldwide that use WiFi connections. There are many location based services that are available nowadays which uses Wireless Fidelity (WiFi) signal fingerprinting. A common example that is gaining popularity in this era would be Foursquare. In this work, the WiFi signal would be used to estimate the user or client’s location. Similar to GPS, fingerprinting method needs a floor plan to increase the accuracy of location estimation. Still, the factor of inconsistent WiFi signal makes the estimation defer at different time intervals. Given so, an adaptive method is needed to obtain the most accurate signal at all times. WiFi signals are heavily distorted by external factors such as physical objects, radio frequency interference, electrical interference, and environmental factors to name a few. Due to these factors, this work uses a method of reducing the signal noise and estimation using the Nearest Neighbour based on past activities of the signal to increase the signal accuracy up to more than 80%. The repository yet increases the accuracy by using Artificial Neural Network (ANN) pattern matching. The repository acts as the server cum support of the client side application decision. Numerous previous works has adapted the methods of collecting signal strengths in the repository over the years, but mostly were just static. In this work, proposed solutions on how the adaptive method is done to match the signal received to the data in the repository are highlighted. With the said approach, location estimation can be done more accurately. Adaptive update allows the latest location fingerprint to be stored in the repository. Furthermore, any redundant location fingerprints are removed and only the updated version of the fingerprint is stored in the repository. How the location estimation of the user can be predicted would be highlighted more in the proposed solution section. After some studies on previous works, it is found that the Artificial Neural Network is the most feasible method to deploy in updating the repository and making it adaptive. The Artificial Neural Network functions are to do the pattern matching of the WiFi signal to the existing data available in the repository.

Iris Recognition Based On the Low Order Norms of Gradient Components

Iris pattern is an important biological feature of human body; it becomes very hot topic in both research and practical applications. In this paper, an algorithm is proposed for iris recognition and a simple, efficient and fast method is introduced to extract a set of discriminatory features using first order gradient operator applied on grayscale images. The gradient based features are robust, up to certain extents, against the variations may occur in contrast or brightness of iris image samples; the variations are mostly occur due lightening differences and camera changes. At first, the iris region is located, after that it is remapped to a rectangular area of size 360x60 pixels. Also, a new method is proposed for detecting eyelash and eyelid points; it depends on making image statistical analysis, to mark the eyelash and eyelid as a noise points. In order to cover the features localization (variation), the rectangular iris image is partitioned into N overlapped sub-images (blocks); then from each block a set of different average directional gradient densities values is calculated to be used as texture features vector. The applied gradient operators are taken along the horizontal, vertical and diagonal directions. The low order norms of gradient components were used to establish the feature vector. Euclidean distance based classifier was used as a matching metric for determining the degree of similarity between the features vector extracted from the tested iris image and template features vectors stored in the database. Experimental tests were performed using 2639 iris images from CASIA V4-Interival database, the attained recognition accuracy has reached up to 99.92%.

Application of l1-Norm Minimization Technique to Image Retrieval

Image retrieval is a topic where scientific interest is currently high. The important steps associated with image retrieval system are the extraction of discriminative features and a feasible similarity metric for retrieving the database images that are similar in content with the search image. Gabor filtering is a widely adopted technique for feature extraction from the texture images. The recently proposed sparsity promoting l1-norm minimization technique finds the sparsest solution of an under-determined system of linear equations. In the present paper, the l1-norm minimization technique as a similarity metric is used in image retrieval. It is demonstrated through simulation results that the l1-norm minimization technique provides a promising alternative to existing similarity metrics. In particular, the cases where the l1-norm minimization technique works better than the Euclidean distance metric are singled out.

MIBiClus: Mutual Information based Biclustering Algorithm

Most of the biclustering/projected clustering algorithms are based either on the Euclidean distance or correlation coefficient which capture only linear relationships. However, in many applications, like gene expression data and word-document data, non linear relationships may exist between the objects. Mutual Information between two variables provides a more general criterion to investigate dependencies amongst variables. In this paper, we improve upon our previous algorithm that uses mutual information for biclustering in terms of computation time and also the type of clusters identified. The algorithm is able to find biclusters with mixed relationships and is faster than the previous one. To the best of our knowledge, none of the other existing algorithms for biclustering have used mutual information as a similarity measure. We present the experimental results on synthetic data as well as on the yeast expression data. Biclusters on the yeast data were found to be biologically and statistically significant using GO Tool Box and FuncAssociate.

Combining Similarity and Dissimilarity Measurements for the Development of QSAR Models Applied to the Prediction of Antiobesity Activity of Drugs

In this paper we study different similarity based approaches for the development of QSAR model devoted to the prediction of activity of antiobesity drugs. Classical similarity approaches are compared regarding to dissimilarity models based on the consideration of the calculation of Euclidean distances between the nonisomorphic fragments extracted in the matching process. Combining the classical similarity and dissimilarity approaches into a new similarity measure, the Approximate Similarity was also studied, and better results were obtained. The application of the proposed method to the development of quantitative structure-activity relationships (QSAR) has provided reliable tools for predicting of inhibitory activity of drugs. Acceptable results were obtained for the models presented here.

Customer Segmentation in Foreign Trade based on Clustering Algorithms Case Study: Trade Promotion Organization of Iran

The goal of this paper is to segment the countries based on the value of export from Iran during 14 years ending at 2005. To measure the dissimilarity among export baskets of different countries, we define Dissimilarity Export Basket (DEB) function and use this distance function in K-means algorithm. The DEB function is defined based on the concepts of the association rules and the value of export group-commodities. In this paper, clustering quality function and clusters intraclass inertia are defined to, respectively, calculate the optimum number of clusters and to compare the functionality of DEB versus Euclidean distance. We have also study the effects of importance weight in DEB function to improve clustering quality. Lastly when segmentation is completed, a designated RFM model is used to analyze the relative profitability of each cluster.

Using a Semantic Self-Organising Web Page-Ranking Mechanism for Public Administration and Education

In the proposed method for Web page-ranking, a novel theoretic model is introduced and tested by examples of order relationships among IP addresses. Ranking is induced using a convexity feature, which is learned according to these examples using a self-organizing procedure. We consider the problem of selforganizing learning from IP data to be represented by a semi-random convex polygon procedure, in which the vertices correspond to IP addresses. Based on recent developments in our regularization theory for convex polygons and corresponding Euclidean distance based methods for classification, we develop an algorithmic framework for learning ranking functions based on a Computational Geometric Theory. We show that our algorithm is generic, and present experimental results explaining the potential of our approach. In addition, we explain the generality of our approach by showing its possible use as a visualization tool for data obtained from diverse domains, such as Public Administration and Education.

Assessment of Time-Lapse in Visible and Thermal Face Recognition

Although face recognition seems as an easy task for human, automatic face recognition is a much more challenging task due to variations in time, illumination and pose. In this paper, the influence of time-lapse on visible and thermal images is examined. Orthogonal moment invariants are used as a feature extractor to analyze the effect of time-lapse on thermal and visible images and the results are compared with conventional Principal Component Analysis (PCA). A new triangle square ratio criterion is employed instead of Euclidean distance to enhance the performance of nearest neighbor classifier. The results of this study indicate that the ideal feature vectors can be represented with high discrimination power due to the global characteristic of orthogonal moment invariants. Moreover, the effect of time-lapse has been decreasing and enhancing the accuracy of face recognition considerably in comparison with PCA. Furthermore, our experimental results based on moment invariant and triangle square ratio criterion show that the proposed approach achieves on average 13.6% higher in recognition rate than PCA.

Using Spectral Vectors and M-Tree for Graph Clustering and Searching in Graph Databases of Protein Structures

In this paper, we represent protein structure by using graph. A protein structure database will become a graph database. Each graph is represented by a spectral vector. We use Jacobi rotation algorithm to calculate the eigenvalues of the normalized Laplacian representation of adjacency matrix of graph. To measure the similarity between two graphs, we calculate the Euclidean distance between two graph spectral vectors. To cluster the graphs, we use M-tree with the Euclidean distance to cluster spectral vectors. Besides, M-tree can be used for graph searching in graph database. Our proposal method was tested with graph database of 100 graphs representing 100 protein structures downloaded from Protein Data Bank (PDB) and we compare the result with the SCOP hierarchical structure.

Image Segmentation Based on Graph Theoretical Approach to Improve the Quality of Image Segmentation

Graph based image segmentation techniques are considered to be one of the most efficient segmentation techniques which are mainly used as time & space efficient methods for real time applications. How ever, there is need to focus on improving the quality of segmented images obtained from the earlier graph based methods. This paper proposes an improvement to the graph based image segmentation methods already described in the literature. We contribute to the existing method by proposing the use of a weighted Euclidean distance to calculate the edge weight which is the key element in building the graph. We also propose a slight modification of the segmentation method already described in the literature, which results in selection of more prominent edges in the graph. The experimental results show the improvement in the segmentation quality as compared to the methods that already exist, with a slight compromise in efficiency.

Enhanced Traveling Salesman Problem Solving by Genetic Algorithm Technique (TSPGA)

The well known NP-complete problem of the Traveling Salesman Problem (TSP) is coded in genetic form. A software system is proposed to determine the optimum route for a Traveling Salesman Problem using Genetic Algorithm technique. The system starts from a matrix of the calculated Euclidean distances between the cities to be visited by the traveling salesman and a randomly chosen city order as the initial population. Then new generations are then created repeatedly until the proper path is reached upon reaching a stopping criterion. This search is guided by a solution evaluation function.

Standard Deviation of Mean and Variance of Rows and Columns of Images for CBIR

This paper describes a novel and effective approach to content-based image retrieval (CBIR) that represents each image in the database by a vector of feature values called “Standard deviation of mean vectors of color distribution of rows and columns of images for CBIR". In many areas of commerce, government, academia, and hospitals, large collections of digital images are being created. This paper describes the approach that uses contents as feature vector for retrieval of similar images. There are several classes of features that are used to specify queries: colour, texture, shape, spatial layout. Colour features are often easily obtained directly from the pixel intensities. In this paper feature extraction is done for the texture descriptor that is 'variance' and 'Variance of Variances'. First standard deviation of each row and column mean is calculated for R, G, and B planes. These six values are obtained for one image which acts as a feature vector. Secondly we calculate variance of the row and column of R, G and B planes of an image. Then six standard deviations of these variance sequences are calculated to form a feature vector of dimension six. We applied our approach to a database of 300 BMP images. We have determined the capability of automatic indexing by analyzing image content: color and texture as features and by applying a similarity measure Euclidean distance.

Comparison of Parameterization Methods in Recognizing Spoken Arabic Digits

This paper proposes evaluation of sound parameterization methods in recognizing some spoken Arabic words, namely digits from zero to nine. Each isolated spoken word is represented by a single template based on a specific recognition feature, and the recognition is based on the Euclidean distance from those templates. The performance analysis of recognition is based on four parameterization features: the Burg Spectrum Analysis, the Walsh Spectrum Analysis, the Thomson Multitaper Spectrum Analysis and the Mel Frequency Cepstral Coefficients (MFCC) features. The main aim of this paper was to compare, analyze, and discuss the outcomes of spoken Arabic digits recognition systems based on the selected recognition features. The results acqired confirm that the use of MFCC features is a very promising method in recognizing Spoken Arabic digits.