Statistical Measures and Optimization Algorithms for Gene Selection in Lung and Ovarian Tumor

Microarray technology is widely used to study disease diagnosis through gene expression levels. The main shortcoming of gene expression data is that it includes thousands of genes but only a small number of samples. Many methods and techniques have been proposed for tumor classification using microarray gene expression data. Feature (gene) selection methods can be used to identify the genes that are directly involved in the classification and to eliminate irrelevant genes. In this paper, statistical measures such as T-Statistics, Signal-to-Noise Ratio (SNR) and F-Statistics are used to rank the genes. The ranked genes are then used for classification. The Particle Swarm Optimization (PSO) algorithm and the Shuffled Frog Leaping (SFL) algorithm are used to find the significant genes among the top-m ranked genes. The Naïve Bayes Classifier (NBC) is used to classify the samples based on the significant genes. The proposed work is applied to Lung and Ovarian datasets. The experimental results show that the proposed method achieves 100% accuracy in all three datasets, and the results are compared with previous works.
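
The ranking step described in this abstract can be illustrated with a minimal sketch. It assumes a two-class problem, a Welch-style t-statistic, and the common SNR definition (difference of class means over the sum of class standard deviations); the abstract does not state which variants were used, and the function name `rank_genes` and its parameters are hypothetical.

```python
import numpy as np

def rank_genes(X, y, m=10, eps=1e-12):
    """Rank genes for a two-class problem (illustrative sketch).

    X: (n_samples, n_genes) expression matrix; y: array of 0/1 labels.
    Returns per-gene t and SNR scores plus the indices of the top-m genes
    under each score, ordered by decreasing absolute value.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    sd_a, sd_b = a.std(axis=0, ddof=1), b.std(axis=0, ddof=1)
    # Welch-style t-statistic (assumed variant; eps guards against zero variance)
    t = (mu_a - mu_b) / np.sqrt(sd_a**2 / len(a) + sd_b**2 / len(b) + eps)
    # Signal-to-noise ratio: difference of means over sum of standard deviations
    snr = (mu_a - mu_b) / (sd_a + sd_b + eps)
    top_t = np.argsort(-np.abs(t))[:m]
    top_snr = np.argsort(-np.abs(snr))[:m]
    return t, snr, top_t, top_snr
```

In a pipeline like the one described, the top-m indices would then be handed to the search stage (PSO or SFL), which is not sketched here.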

Inflation and Unemployment Rates as Indicators of the Transition European Union Countries Monetary Policy Orientation

Numerous studies carried out in developed western democratic countries have shown that the ideological framework of the governing party has a significant influence on monetary policy. An executive authority formed by a left-wing party gives a higher weight to reducing unemployment, and the central bank implements a more expansionary monetary policy. A right-wing governing party, on the other hand, considers monetary stability more important than reducing unemployment, and in such a political framework the main macroeconomic objective becomes reducing the inflation rate. The political framework conditions in the transition countries that are new European Union (EU) members are still highly specific relative to the other EU member countries. This paper focuses on the question of whether the same monetary policy principles hold in these transition countries as in the developed western democratic EU member countries. The database consists of inflation and unemployment rates for 11 transition EU member countries covering the period from 2001 to 2012. The essential information for each of these 11 countries and for each year of the observed period is the right or left political orientation of the ruling party. In this paper we use t-statistics to test our hypothesis that inflation and unemployment differ between right and left political orientations of the governing party. Descriptive statistics are used to explore the influence of different countries, years and political orientations. Inflation and unemployment should be strongly negatively correlated over time, which is tested using the Pearson correlation coefficient. Depending on whether the governing authority consists of left or right politically oriented parties, the monetary authorities will adjust their policy, placing the higher priority on either lower inflation or unemployment reduction.
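
The two tests named in this abstract can be sketched as follows. This is an illustration only: the abstract does not say which form of the t-test was applied, so the unequal-variance (Welch) form is an assumption, and the function names are hypothetical.

```python
import numpy as np

def welch_t(a, b):
    """Two-sample t-statistic with unequal variances (assumed variant).

    a, b: e.g. inflation rates in country-years under left-wing and
    right-wing governments. Returns (t, approximate degrees of freedom).
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    va, vb = a.var(ddof=1) / a.size, b.var(ddof=1) / b.size
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va**2 / (a.size - 1) + vb**2 / (b.size - 1))
    return float(t), float(df)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    return float(np.corrcoef(x, y)[0, 1])
```

A strongly negative `pearson_r` between a country's yearly inflation and unemployment series would be consistent with the trade-off the abstract expects.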

Performance Degradation for the GLR Test-Statistics for Spatial Signal Detection

Antenna arrays are widely used in modern radio systems, in sonar and in communications. Detecting a useful signal against background noise is commonly based on the generalized likelihood ratio test (GLRT) method. There is a large number of problem formulations, which depend on the information known a priori. In this work, in contrast to the majority of previously solved problems, only the difference between the spatial properties of the signal and the noise is used for detection. We analyze the influence of the degree of signal non-coherence and of noise inhomogeneity on the performance characteristics of different GLRT statistics. The signal and the noise are described by means of their spatial covariance matrices C, for cases with different amounts of known information. The partially coherent signal is simulated as a plane wave with a random angle of incidence relative to the normal. Background noise is simulated as a random process with a uniform distribution function in each element. The results of the investigation of the degradation of performance characteristics for the different cases are presented in this work.
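
The simulation setup described in this abstract can be sketched roughly as below. Everything beyond "plane wave with a random angle of incidence" and "uniformly distributed noise in each element" is an assumption made for illustration: a uniform linear array with half-wavelength spacing, a uniformly distributed angle, a random carrier phase per snapshot, and unit-power complex noise. The function names are hypothetical.

```python
import numpy as np

def steering_matrix(theta, n_elem, d_over_lambda=0.5):
    """Plane-wave responses of a uniform linear array, one column per angle.

    theta is measured from the array normal, in radians.
    """
    k = np.arange(n_elem)[:, None]
    return np.exp(2j * np.pi * d_over_lambda * k * np.sin(theta)[None, :])

def simulate(n_elem, n_snap, angle_spread, snr_db, rng):
    """Return (signal, noise) snapshot matrices of shape (n_elem, n_snap)."""
    # A new random incidence angle per snapshot makes the signal partially coherent
    theta = angle_spread * rng.uniform(-1.0, 1.0, size=n_snap)
    phase = np.exp(2j * np.pi * rng.uniform(size=n_snap))
    amp = 10.0 ** (snr_db / 20.0)
    signal = amp * phase[None, :] * steering_matrix(theta, n_elem)
    # Uniformly distributed noise, scaled to unit power per element
    shape = (n_elem, n_snap)
    noise = rng.uniform(-1, 1, shape) + 1j * rng.uniform(-1, 1, shape)
    noise *= np.sqrt(1.5)
    return signal, noise

def sample_cov(x):
    """Sample spatial covariance matrix C of the snapshots in x."""
    return x @ x.conj().T / x.shape[1]
```

With `angle_spread=0` the signal covariance has rank one (fully coherent); widening the spread spreads energy over more eigenvalues, which is one way to picture the non-coherence studied here.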

Active Segment Selection Method in EEG Classification Using Fractal Features

A BCI (Brain Computer Interface) is a communication machine that translates brain messages into computer commands. With the help of computer programs, these machines can recognize imagined tasks. Feature extraction is an important stage of EEG classification that can affect both the accuracy and the computation time of signal processing. In this study we process the signal in three steps: active segment selection, fractal feature extraction, and classification. One of the great challenges in BCI applications is to improve classification accuracy and computation time together. In this paper, we use Student's 2D sample t-statistics on continuous wavelet transforms for active segment selection in order to reduce the computation time. In the next step, features are extracted using well-known fractal dimension estimators of the signal, namely the Katz and Higuchi fractal dimensions. In the classification stage we use the ANFIS (Adaptive Neuro-Fuzzy Inference System), FKNN (Fuzzy K-Nearest Neighbors), LDA (Linear Discriminant Analysis), and SVM (Support Vector Machines) classifiers. We found that the active segment selection method reduces the computation time, and that fractal dimension features with ANFIS analysis on the selected active segments perform best among the investigated methods in EEG classification.
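
One of the two fractal features mentioned here, the Katz fractal dimension, can be sketched in a few lines. This is a commonly used amplitude-only variant (distances are measured along the amplitude axis only), which may differ from the exact formulation used in the paper; the function name is hypothetical.

```python
import numpy as np

def katz_fd(x):
    """Katz fractal dimension of a 1-D signal (amplitude-only variant).

    FD = log10(n) / (log10(n) + log10(d / L)), where L is the total curve
    length, d the largest distance from the first sample, and n the number
    of steps. A constant signal (L == 0) is not handled.
    """
    x = np.asarray(x, dtype=float)
    steps = np.abs(np.diff(x))
    L = steps.sum()                  # total length of the curve
    d = np.max(np.abs(x - x[0]))     # farthest excursion from the first sample
    n = steps.size                   # number of steps along the curve
    return float(np.log10(n) / (np.log10(n) + np.log10(d / L)))
```

A straight ramp gives a dimension of 1, while an irregular signal gives a larger value, which is what makes the measure usable as a feature for each selected segment.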

Dimensionality Reduction of PSSM Matrix and its Influence on Secondary Structure and Relative Solvent Accessibility Predictions

State-of-the-art methods for secondary structure (Porter, Psi-PRED, SAM-T99sec, Sable) and solvent accessibility (Sable, ACCpro) predictions use evolutionary profiles represented by the position specific scoring matrix (PSSM). It has been demonstrated that evolutionary profiles are the most important features in the feature space for these predictions. Unfortunately, applying the PSSM matrix leads to high-dimensional feature spaces that may create problems with parameter optimization and generalization. Several recently published studies suggested that applying feature extraction to the PSSM matrix may result in improvements in secondary structure predictions. However, none of the top-performing methods considered here utilizes dimensionality reduction to improve generalization. In the present study, we used simple and fast methods for feature selection (t-statistics, information gain) that allow us to decrease the dimensionality of the PSSM matrix by 75% and improve generalization in the case of secondary structure prediction compared to the Sable server.
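
Information-gain feature selection of the kind mentioned in this abstract can be sketched as follows. The equal-width binning of continuous scores, the number of bins, and the function names are assumptions made for illustration; `keep=0.25` simply mirrors the 75% reduction quoted above.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(feature, labels, n_bins=4):
    """H(labels) - H(labels | binned feature), using equal-width bins."""
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels)
    edges = np.histogram_bin_edges(feature, bins=n_bins)
    bins = np.digitize(feature, edges[1:-1])   # bin index per sample
    h_cond = 0.0
    for b in np.unique(bins):
        mask = bins == b
        h_cond += mask.mean() * entropy(labels[mask])
    return entropy(labels) - h_cond

def select_top_fraction(X, y, keep=0.25, n_bins=4):
    """Keep the fraction of columns with the highest information gain."""
    scores = np.array([information_gain(X[:, j], y, n_bins)
                       for j in range(X.shape[1])])
    k = max(1, int(round(keep * X.shape[1])))
    return np.sort(np.argsort(-scores)[:k]), scores
```

The selected column indices would then be used to slice the profile features before training the predictor.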

Efficiency of Different GLR Test-statistics for Spatial Signal Detection

In this work the characteristics of spatial signal detection from an antenna array are investigated for various sample sizes. Cases with varying amounts of prior information about the received signal and the background noise are considered. Only the spatial difference between the signal and the noise is used. The performance characteristics and detection curves are presented. All test statistics are obtained on the basis of the generalized likelihood ratio (GLR). The results obtained are valid for both short and long samples.
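
The abstract does not list the individual test statistics. As one illustration of a GLR statistic that relies only on spatial structure, the sketch below uses the textbook sphericity test: the null hypothesis is spatially white noise of unknown power, the alternative is an arbitrary covariance, and the snapshots are assumed to be complex Gaussian. This choice is an assumption and is not taken from the paper; the function name is hypothetical.

```python
import numpy as np

def sphericity_glr(cov, n_snap):
    """Negative log GLR for H0: cov = sigma^2 * I versus H1: cov arbitrary.

    cov: sample spatial covariance matrix (Hermitian, positive definite,
    which requires at least as many snapshots as array elements).
    Larger values are stronger evidence of spatial structure, i.e. a signal.
    """
    lam = np.linalg.eigvalsh(cov)   # real eigenvalues of the Hermitian matrix
    m = lam.size
    # n * (m * log(arithmetic mean) - sum(log eigenvalues)), always >= 0
    return float(n_snap * (m * np.log(lam.mean()) - np.log(lam).sum()))
```

The statistic is zero when all eigenvalues are equal and grows as they spread apart; a detection threshold would normally be set from the desired false-alarm rate, which is not sketched here.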

Performance Analysis of Genetic Algorithm with kNN and SVM for Feature Selection in Tumor Classification

Tumor classification is a key area of research in the field of bioinformatics. Microarray technology is commonly used to study disease diagnosis through gene expression levels. The main drawback of gene expression data is that it contains thousands of genes and very few samples. Feature selection methods are used to select the informative genes from the microarray, and these methods considerably improve the classification accuracy. In the proposed method, a Genetic Algorithm (GA) is used for effective feature selection. Informative genes are identified based on the T-Statistics, Signal-to-Noise Ratio (SNR) and F-Test values. The initial candidate solutions of the GA are obtained from the top-m informative genes. The classification accuracy of the k-Nearest Neighbor (kNN) method is used as the fitness function for the GA. In this work, kNN and Support Vector Machine (SVM) are used as the classifiers. The experimental results show that the proposed work is suitable for effective feature selection. With the help of the selected genes, the GA-kNN method achieves 100% accuracy in 4 datasets and the GA-SVM method achieves it in 5 out of 10 datasets. The GA combined with kNN and SVM is demonstrated to be an accurate method for microarray-based tumor classification.
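
The GA-with-kNN-fitness idea in this abstract can be sketched as below. The sketch assumes that `X` already contains only the top-m ranked genes and that labels are 0/1; the leave-one-out evaluation, tournament selection, one-point crossover, bit-flip mutation, elitism and all parameter values are illustrative choices, not the paper's settings, and the function names are hypothetical.

```python
import numpy as np

def loo_knn_accuracy(X, y, k=3):
    """Leave-one-out accuracy of a Euclidean kNN classifier (0/1 labels, odd k)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a sample may not vote for itself
    nn = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest neighbours
    pred = (y[nn].mean(axis=1) > 0.5).astype(int)   # majority vote
    return float((pred == y).mean())

def ga_select(X, y, n_select=5, pop_size=20, n_gen=15, p_mut=0.05, k=3, seed=0):
    """Search for a boolean gene mask that maximises kNN accuracy."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]

    def fitness(mask):
        return loo_knn_accuracy(X[:, mask], y, k) if mask.any() else 0.0

    # Initial population: random subsets of the (pre-ranked) genes
    pop = np.zeros((pop_size, n_feat), dtype=bool)
    for ind in pop:
        ind[rng.choice(n_feat, size=n_select, replace=False)] = True

    history = []
    for _ in range(n_gen):
        fit = np.array([fitness(ind) for ind in pop])
        pop = pop[np.argsort(-fit)]        # fittest individual first
        history.append(float(fit.max()))
        children = [pop[0].copy()]         # elitism: keep the best unchanged

        def pick():
            a, b = rng.integers(pop_size, size=2)
            return pop[min(a, b)]          # binary tournament on sorted pop

        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.integers(1, n_feat)  # one-point crossover
            child = np.concatenate([p1[:cut], p2[cut:]])
            child ^= rng.random(n_feat) < p_mut   # bit-flip mutation
            children.append(child)
        pop = np.array(children)

    fit = np.array([fitness(ind) for ind in pop])
    best = int(np.argmax(fit))
    return pop[best], float(fit[best]), history
```

Swapping `loo_knn_accuracy` for an SVM-based score would give the GA-SVM counterpart; that variant is not shown.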