Predicting Protein-Protein Interactions from Protein Sequences Using Phylogenetic Profiles

In this study, a high-accuracy protein-protein interaction prediction method is developed. The importance of the proposed method is that it uses only the sequence information of proteins when predicting interactions. The method extracts phylogenetic profiles of proteins from their sequence information. By combining the phylogenetic profiles of two proteins through checking the existence of homologs in different species, and fitting this combined profile to a statistical model, it is possible to make predictions about the interaction status of the two proteins. For this purpose, we apply a collection of pattern recognition techniques to the dataset of combined phylogenetic profiles of protein pairs. Support Vector Machines, Feature Extraction using ReliefF, Naive Bayes Classification, K-Nearest Neighbor Classification, Decision Trees, and Random Forest Classification are the methods we applied to find the classification method that best predicts the interaction status of protein pairs. Random Forest Classification outperformed all other methods, with a prediction accuracy of 76.93%.
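
The combination step can be illustrated with a minimal sketch. The abstract does not state how the two profiles are encoded once combined, so the four-valued encoding below, the function name `combine_profiles`, and the toy profiles are all assumptions made for illustration.

```python
def combine_profiles(profile_a, profile_b):
    """Combine two binary phylogenetic profiles position by position.

    Each profile holds 1 if a homolog exists in that species, else 0.
    Hypothetical encoding (not taken from the paper):
    0 = neither protein has a homolog, 1 = only A, 2 = only B, 3 = both.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same species")
    return [a + 2 * b for a, b in zip(profile_a, profile_b)]


# Toy profiles over five species (made-up values).
profile_a = [1, 0, 1, 1, 0]
profile_b = [1, 1, 0, 1, 0]
combined = combine_profiles(profile_a, profile_b)
print(combined)  # [3, 2, 1, 3, 0]
```

A vector like `combined` would then serve as one row of the feature matrix passed to a classifier such as a random forest.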

Texture Feature Extraction using Slant-Hadamard Transform

Classification of random and natural textures is still one of the biggest challenges in the field of image processing and pattern recognition. In this paper, texture feature extraction using the Slant-Hadamard Transform (SHT) was studied and compared to other signal processing-based texture classification schemes. A parametric SHT was also introduced and employed for feature extraction from natural textures. We showed that a subtly modified parametric SHT can outperform the ordinary Walsh-Hadamard transform and the discrete cosine transform. Experiments were carried out on a subset of Vistex random natural texture images using a kNN classifier.
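
For orientation, the sketch below implements the ordinary Walsh-Hadamard transform that the paper uses as a baseline. It is not the slant or parametric variant studied in the paper, and the feature definition in `wht_features` (absolute 2-D coefficients of a square block) is a hypothetical choice.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform (Hadamard order, unnormalized)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly stage: combine elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def wht_features(block):
    """Hypothetical texture features: |2-D WHT coefficients| of a square block."""
    rows = [fwht(r) for r in block]                # transform each row
    cols = [fwht(list(c)) for c in zip(*rows)]     # then each column
    return [[abs(v) for v in r] for r in zip(*cols)]


print(fwht([1, 0, 1, 0]))  # [2, 2, 0, 0]
```

A flat block concentrates all energy in the first coefficient, while textured blocks spread it across the higher-order coefficients, which is what makes them usable as features.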

Effects of Hidden Unit Sizes and Autoregressive Features in Mental Task Classification

Classification of electroencephalogram (EEG) signals extracted during mental tasks is a technique that is actively pursued for Brain Computer Interface (BCI) designs. In this paper, we compared the classification performances of univariate autoregressive (AR) and multivariate autoregressive (MAR) models for representing EEG signals that were extracted during different mental tasks. A Multilayer Perceptron (MLP) neural network (NN) trained by the backpropagation (BP) algorithm was used to classify these features into the different categories representing the mental tasks. Classification performances were also compared across different mental task combinations and 2 sets of hidden units (HU): 2 to 10 HU in steps of 2 and 20 to 100 HU in steps of 20. Five different mental tasks from 4 subjects were used in the experimental study, and combinations of 2 different mental tasks were studied for each subject. Three different feature extraction methods of 6th order were used to extract features from these EEG signals: AR coefficients computed with Burg's algorithm (ARBG), AR coefficients computed with the stepwise least squares algorithm (ARLS), and MAR coefficients computed with the stepwise least squares algorithm. The best results were obtained with 20 to 100 HU using ARBG. It is concluded that i) it is important to choose suitable mental tasks for different individuals for a successful BCI design, ii) higher numbers of HU are more suitable, and iii) ARBG is the most suitable feature extraction method.
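
As a rough illustration of the ARBG features, the sketch below estimates AR coefficients with the textbook lattice form of Burg's recursion. It is not the authors' implementation. The function name `burg_ar`, the sign convention, and the toy signal are assumptions, and the code assumes the signal is not identically zero.

```python
def burg_ar(signal, order):
    """Estimate AR coefficients [1, a1, ..., ap] with Burg's lattice recursion.

    Assumed sign convention: x[t] + a1*x[t-1] + ... + ap*x[t-p] = e[t].
    """
    n = len(signal)
    if order >= n:
        raise ValueError("order must be smaller than the signal length")
    forward = [float(v) for v in signal]   # forward prediction errors
    backward = list(forward)               # backward prediction errors
    coeffs = [1.0]
    for m in range(order):
        # Reflection coefficient minimizing forward + backward error power.
        num = -2.0 * sum(forward[i] * backward[i - 1] for i in range(m + 1, n))
        den = sum(forward[i] ** 2 + backward[i - 1] ** 2 for i in range(m + 1, n))
        k = num / den
        # Levinson-style update of the AR polynomial.
        padded = coeffs + [0.0]
        coeffs = [padded[i] + k * padded[m + 1 - i] for i in range(m + 2)]
        # Update both error sequences for the next stage.
        new_forward, new_backward = forward[:], backward[:]
        for i in range(m + 1, n):
            new_forward[i] = forward[i] + k * backward[i - 1]
            new_backward[i] = backward[i - 1] + k * forward[i]
        forward, backward = new_forward, new_backward
    return coeffs


# Toy signal: a decaying exponential (made-up data, not EEG).
toy = [0.5 ** t for t in range(20)]
print(burg_ar(toy, 1))
```

With `order=6`, the six coefficients after the leading 1 would form the feature vector for one EEG channel segment.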

Computer Aided Detection on Mammography

A typical definition of Computer Aided Diagnosis (CAD) found in the literature is: a diagnosis made by a radiologist using the output of a computerized scheme for automated image analysis as a diagnostic aid. The expression Computer Aided Detection (CAD or CADe) is also often found: this definition emphasizes the intent of CAD to support rather than substitute the human observer in the analysis of radiographic images. In this article we illustrate the application of CAD systems and the aim of these definitions. Commercially available CAD systems use computerized algorithms for identifying suspicious regions of interest. This paper describes general CAD systems as expert systems consisting of the following components: segmentation / detection, feature extraction, and classification / decision making. As an example, this work presents the realization of a Computer-Aided Detection system that is able to assist the radiologist in identifying types of mammary tumor lesions. Furthermore, this prototype station uses a GRID configuration to work on a large distributed database of digitized mammographic images.

A new Adaptive Approach for Histogram based Mouth Segmentation

The segmentation of mouth and lips is a fundamental problem in facial image analysis. In this paper we propose a method for lip segmentation based on the rg-color histogram. Statistical analysis shows that the rg color space is optimal for purely color-based segmentation. Initially, a rough adaptive threshold selects a histogram region that ensures that all pixels in that region are skin pixels. Based on those pixels, we build a Gaussian model that represents the skin pixel distribution and is used to obtain a refined, optimal threshold. We do not incorporate shape or edge information. In experiments, we show the performance of our lip pixel segmentation method compared to the ground truth of our dataset and a conventional watershed algorithm.
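
The two-step idea (a rough skin region, then a Gaussian model that yields a refined threshold) can be sketched as follows. The rg-chromaticity conversion is standard. The refinement rule (mean plus `k` standard deviations on the r channel only) and the value of `k` are assumptions, since the abstract does not give the actual rule.

```python
import statistics


def rg_chromaticity(rgb):
    """Convert an (R, G, B) pixel to normalized (r, g) chromaticity."""
    r, g, b = rgb
    total = r + g + b
    if total == 0:
        return (0.0, 0.0)  # black pixel: chromaticity undefined, use zeros
    return (r / total, g / total)


def fit_gaussian(values):
    """Return (mean, standard deviation) of a 1-D sample."""
    return statistics.fmean(values), statistics.pstdev(values)


def refine_threshold(skin_pixels, k=2.0):
    """Hypothetical refinement: r-chromaticity above mean + k*std is non-skin."""
    r_values = [rg_chromaticity(p)[0] for p in skin_pixels]
    mean, std = fit_gaussian(r_values)
    return mean + k * std


# Made-up skin samples selected by a rough first threshold.
skin = [(180, 120, 100), (170, 115, 95), (190, 125, 105)]
threshold = refine_threshold(skin)
print(threshold)
```

Pixels whose r-chromaticity exceeds `threshold` would then be treated as lip candidates, reflecting that lips are redder than the surrounding skin.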

Javanese Character Recognition Using Hidden Markov Model

The Hidden Markov Model (HMM) is a stochastic method that has been used in various signal processing and character recognition tasks. This study proposes to use HMM to recognize Javanese characters from a number of different handwritings, whereby HMM is used to optimize the number of states and feature extraction. An accuracy of 85.7% is obtained as the best result with a 16-state vertical model using pure HMM. This initial result is satisfactory for prompting further research.

Automatic Detection of Syllable Repetition in Read Speech for Objective Assessment of Stuttered Disfluencies

Automatic detection of syllable repetition is one of the important parameters in assessing stuttered speech objectively. The existing method, which uses an artificial neural network (ANN), requires high levels of agreement as a prerequisite before attempting to train and test ANNs to separate fluent and nonfluent speech. We propose an automatic detection method for syllable repetition in read speech for objective assessment of stuttered disfluencies. It uses a novel approach with four stages: segmentation, feature extraction, score matching, and decision logic. Feature extraction is implemented using the well-known Mel-frequency cepstral coefficients (MFCC). Score matching is done using Dynamic Time Warping (DTW) between the syllables. The decision logic is implemented by a perceptron based on the score given by score matching. Although many methods are available for segmentation, in this paper it is done manually. Here, the assessment by human judges of the read speech of 10 adults who stutter is described using the corresponding method, and the result was 83%.
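
The score matching stage can be sketched with a textbook DTW implementation. Each syllable is assumed to be a sequence of feature vectors (for example, MFCC frames). The Euclidean frame distance and the function names are illustrative choices, not details taken from the paper.

```python
import math


def frame_distance(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two non-empty sequences of feature vectors."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best alignment cost of the first i and first j frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b
                                 cost[i][j - 1],      # stretch seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Two made-up 1-D "syllables": the second repeats one frame of the first.
syllable_1 = [[0.0], [1.0], [2.0]]
syllable_2 = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(syllable_1, syllable_2))  # 0.0
```

A low DTW score between adjacent syllables suggests a repetition. In the paper that decision is made by a perceptron rather than by a fixed threshold.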

Learning User Keystroke Patterns for Authentication

Keystroke authentication is a new access control system to identify legitimate users via their typing behavior. In this paper, machine learning techniques are adapted for keystroke authentication. Seven learning methods are used to build models to differentiate user keystroke patterns. The selected classification methods are Decision Tree, Naive Bayesian, Instance Based Learning, Decision Table, One Rule, Random Tree and K-star. Among these methods, three are studied in more detail. The results show that machine learning is a feasible alternative for keystroke authentication. Compared to the conventional Nearest Neighbour method used in recent research, learning methods, especially Decision Tree, can be more accurate. In addition, the experimental results reveal that 3-Grams are more accurate than 2-Grams and 4-Grams for feature extraction. Also, combinations of attributes tend to result in higher accuracy.

Improved Text-Independent Speaker Identification using Fused MFCC and IMFCC Feature Sets based on Gaussian Filter

A state-of-the-art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC), modeled on the human auditory system, have been used as a standard acoustic feature set for speech-related applications. In a recent contribution by the authors, it has been shown that the Inverted Mel-Frequency Cepstral Coefficients (IMFCC) are a useful feature set for SI, containing complementary information present in the high-frequency region. This paper introduces the Gaussian-shaped filter (GF) for calculating MFCC and IMFCC in place of the typical triangular-shaped bins. The objective is to introduce a higher amount of correlation between subband outputs. The performance of both MFCC and IMFCC improves with GF over the conventional triangular filter (TF) based implementation, individually as well as in combination. With GMM as the speaker modeling paradigm, the performance of the proposed GF-based MFCC and IMFCC in individual and fused mode has been verified on two standard databases, YOHO (Microphone Speech) and POLYCOST (Telephone Speech), each of which has more than 130 speakers.
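
The difference between the two filter shapes can be sketched for a single filter. The band edges and the choice of `sigma` as a quarter of the band width are hypothetical, since the abstract does not specify how the Gaussian width is set.

```python
import math


def triangular_weight(f, left, center, right):
    """Weight of a triangular filter at frequency f (zero outside the band)."""
    if left < f <= center:
        return (f - left) / (center - left)
    if center < f < right:
        return (right - f) / (right - center)
    return 0.0


def gaussian_weight(f, center, sigma):
    """Weight of a Gaussian-shaped filter at frequency f (never exactly zero)."""
    return math.exp(-0.5 * ((f - center) / sigma) ** 2)


# Hypothetical filter: band 900-1100 Hz centered at 1000 Hz.
left, center, right = 900.0, 1000.0, 1100.0
sigma = (right - left) / 4.0  # assumed width, not taken from the paper

for f in (950.0, 1000.0, 1100.0, 1150.0):
    print(f, triangular_weight(f, left, center, right),
          gaussian_weight(f, center, sigma))
```

Because the Gaussian tails extend past the band edges, neighboring filters share more of the spectrum, which is one way to obtain the higher correlation between subband outputs that the abstract mentions.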

A New Method for Image Classification Based on Multi-level Neural Networks

In this paper, we propose a supervised method for color image classification based on a multilevel sigmoidal neural network (MSNN) model. In this method, images are classified into five categories, i.e., "Car", "Building", "Mountain", "Farm" and "Coast". This classification is performed without any segmentation processes. To verify the learning capabilities of the proposed method, we compare our MSNN model with the traditional Sigmoidal Neural Network (SNN) model. Results of comparison have shown that the MSNN model performs better than the traditional SNN model in the context of training run time and classification rate. Both color moments and multi-level wavelets decomposition technique are used to extract features from images. The proposed method has been tested on a variety of real and synthetic images.
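
The color-moment features can be sketched as below, using one common definition: mean, standard deviation, and the signed cube root of the third central moment per channel. Whether the paper uses exactly these three moments is an assumption, and the pixel values are made up.

```python
import math


def channel_moments(channel):
    """Return (mean, standard deviation, skewness) of one color channel."""
    n = len(channel)
    mean = sum(channel) / n
    variance = sum((v - mean) ** 2 for v in channel) / n
    third = sum((v - mean) ** 3 for v in channel) / n
    # Signed cube root keeps the skewness in the same units as the pixels.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, math.sqrt(variance), skew


def color_moments(pixels):
    """Concatenate the three moments of R, G and B into a 9-value vector."""
    features = []
    for channel in zip(*pixels):  # regroup (r, g, b) tuples by channel
        features.extend(channel_moments(channel))
    return features


# Made-up image flattened to a list of (R, G, B) pixels.
pixels = [(10, 200, 30), (20, 180, 30), (30, 220, 30), (40, 200, 30)]
print(color_moments(pixels))
```

Such a fixed-length vector can be fed to a neural network directly, consistent with the abstract's point that no segmentation is needed.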

A Universal Model for Content-Based Image Retrieval

In this paper, a novel approach for generalized image retrieval based on semantic contents is presented. It uses a combination of three feature extraction methods, namely color, texture, and edge histogram descriptor. There is a provision to add new features in the future for better retrieval efficiency. Any combination of these methods that is more appropriate for the application can be used for retrieval. This is provided through the User Interface (UI) in the form of relevance feedback. The image properties in this work are analyzed using computer vision and image processing algorithms. For color, the histogram of the images is computed; for texture, co-occurrence matrix based entropy, energy, etc., are calculated; and for edge density, the Edge Histogram Descriptor (EHD) is found. For retrieval of images, a novel idea based on a greedy strategy is developed to reduce the computational complexity. The entire system was developed using AForge.Imaging (an open source product), MATLAB .NET Builder, C#, and Oracle 10g. The system was tested with Coral Image database containing 1000 natural images and achieved better results.
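
The texture features named above can be illustrated with a small gray-level co-occurrence matrix. The pixel offset (one step to the right), the base-2 logarithm, and the tiny test image are assumptions chosen for illustration.

```python
import math
from collections import Counter


def cooccurrence(image, dx=1, dy=0):
    """Normalized co-occurrence probabilities of gray-level pairs.

    Counts pairs (pixel, neighbor at offset (dx, dy)); the offset is an
    illustrative choice.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[(image[y][x], image[ny][nx])] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


def energy(probs):
    """Sum of squared probabilities (high for uniform textures)."""
    return sum(p * p for p in probs.values())


def entropy(probs):
    """Shannon entropy in bits (high for irregular textures)."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


checker = [[0, 1], [1, 0]]  # made-up 2x2 image with two gray levels
probs = cooccurrence(checker)
print(energy(probs), entropy(probs))  # 0.5 1.0
```

In a retrieval system, such values would sit alongside the color histogram and the edge histogram descriptor in each image's feature vector.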

Local Steerable Pyramid Binary Pattern Sequence LSPBPS for Face Recognition Method

In this paper, the problem of face recognition under variable illumination conditions is considered. Most of the works in the literature exhibit good performance under strictly controlled acquisition conditions, but performance drops drastically when changes in pose and illumination occur, so a number of approaches have recently been proposed to deal with such variability. The aim of this work is to introduce an efficient local appearance feature extraction method based on the steerable pyramid (SP) for face recognition. Local information is extracted from SP sub-bands using LBP (Local Binary Pattern). The underlying statistics allow us to reduce the required amount of data to be stored. The experiments carried out on different face databases confirm the effectiveness of the proposed approach.
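
The LBP step can be sketched with the basic 3x3 operator. In the paper it is applied to steerable-pyramid sub-bands, while the sketch below runs on a plain 2-D array. The neighbor ordering and the `>=` comparison follow one common convention and are assumptions here.

```python
def lbp_code(image, y, x):
    """Basic 8-neighbor LBP code of the pixel at (y, x)."""
    center = image[y][x]
    # Neighbors visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if image[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            hist[lbp_code(image, y, x)] += 1
    return hist


patch = [[9, 9, 9],
         [0, 5, 0],
         [0, 0, 0]]  # made-up 3x3 patch: only the top row exceeds the center
print(lbp_code(patch, 1, 1))  # 7
```

Storing a histogram per region instead of the raw sub-band coefficients is one way the amount of stored data can be reduced, in line with the abstract's remark about the underlying statistics.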

Practical Method for Digital Music Matching Robust to Various Sound Qualities

In this paper, we propose a practical digital music matching system that is robust to variation in sound qualities. The proposed system is subdivided into two parts: client and server. The client part consists of the input, preprocessing and feature extraction modules. The preprocessing module, including the music onset module, revises the value gap occurring on the time axis between identical songs of different formats. The proposed method uses delta-grouped Mel frequency cepstral coefficients (MFCCs) to extract music features that are robust to changes in sound quality. According to the number of sound quality formats (SQFs) used, a music server is constructed with a feature database (FD) that contains different sub feature databases (SFDs). When the proposed system receives a music file, the selection module selects an appropriate SFD from a feature database; the selected SFD is subsequently used by the matching module. In this study, we used 3,000 queries for matching experiments in three cases with different FDs. In each case, we used 1,000 queries constructed by mixing 8 SQFs and 125 songs. The success rate of music matching improved from 88.6% when using a single SFD to 93.2% when using quadruple SFDs. Through this experiment, we showed that the proposed method is robust to various sound qualities.

Detecting and Tracking Vehicles in Airborne Videos

In this work, we present an automatic vehicle detection system for airborne videos using combined features. We propose a pixel-wise classification method for vehicle detection using Dynamic Bayesian Networks. Although classification is performed pixel-wise, relations among neighboring pixels in a region are preserved in the feature extraction process. The main novelty of the detection scheme is that the extracted combined features comprise not only pixel-level information but also region-level information. Afterwards, the detected vehicles are tracked using an efficient Kalman filter with dynamic particle sampling. Experiments were conducted on a wide variety of airborne videos. We do not assume prior information about camera heights, orientation, or target object sizes in the proposed framework. The results demonstrate the flexibility and good generalization abilities of the proposed method on a challenging dataset.

Frame Texture Classification Method (FTCM) Applied on Mammograms for Detection of Abnormalities

Texture classification is an important image processing task with a broad application range. Many different techniques for texture classification have been explored. Using sparse approximation as a feature extraction method for texture classification is a relatively new approach, and Skretting et al. recently presented the Frame Texture Classification Method (FTCM), showing very good results on classical texture images. As an extension of that work, the FTCM is here tested on a real-world application, namely detection of abnormalities in mammograms. Some extensions to the original FTCM that are useful in some applications are implemented: two different smoothing techniques and a vector augmentation technique. Both detection of microcalcifications (as a primary detection technique and as a last stage of a detection scheme) and detection of soft tissue lesions in mammograms are explored. All the results are interesting, and the results using FTCM on regions of interest as the last stage in a detection scheme for microcalcifications are especially promising.

Comparison of MFCC and Cepstral Coefficients as a Feature Set for PCG Biometric Systems

Heart sound is an acoustic signal, and many techniques used nowadays for human recognition tasks borrow speech recognition techniques. One popular choice for feature extraction from acoustic signals is the Mel Frequency Cepstral Coefficients (MFCC), which map the signal onto a non-linear Mel-Scale that mimics human hearing. However, the Mel-Scale is almost linear in the frequency region of heart sounds and thus should produce results similar to the standard cepstral coefficients (CC). In this paper, MFCC is investigated to see if it produces superior results for a PCG-based human identification system compared to CC. Results show that the MFCC system is still superior to CC despite linear filter-banks in the lower frequency range, giving up to 95% correct recognition rate for MFCC and 90% for CC. Further experiments show that the high recognition rate is due to the implementation of filter-banks and not to Mel-Scaling.
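
The near-linearity claim can be checked numerically with a widely used Mel-scale formula, mel = 2595 * log10(1 + f / 700). Whether the paper uses this exact variant is an assumption, and the probe frequencies below are illustrative rather than taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel using a common textbook formula."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_slope(f_hz, step=1.0):
    """Approximate local slope of the Mel-Scale (mel per Hz) at f_hz."""
    return (hz_to_mel(f_hz + step) - hz_to_mel(f_hz)) / step


# Compare how much the slope changes across a low band and a high band
# of equal width (band edges are illustrative assumptions).
low_change = mel_slope(20.0) / mel_slope(220.0)
high_change = mel_slope(4000.0) / mel_slope(4200.0)
print(low_change, high_change)
print(mel_slope(100.0), mel_slope(4000.0))
```

The slope is much larger at low frequencies than at high ones, so a mel-spaced filter-bank resolves the low-frequency band more finely. How close to linear the mapping is within the heart-sound band depends on the band limits assumed.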

A Case Study on Appearance Based Feature Extraction Techniques and Their Susceptibility to Image Degradations for the Task of Face Recognition

Over the past decades, automatic face recognition has become a highly active research area, mainly due to the countless application possibilities in both the private and the public sector. Numerous algorithms have been proposed in the literature to cope with the problem of face recognition. Nevertheless, a group of methods commonly referred to as appearance based have emerged as the dominant solution to the face recognition problem. Many comparative studies concerned with the performance of appearance based methods have already been presented in the literature, not rarely with inconclusive and often with contradictory results. No consensus has been reached within the scientific community regarding the relative ranking of the efficiency of appearance based methods for the face recognition task, let alone regarding their susceptibility to appearance changes induced by various environmental factors. To tackle these open issues, this paper assesses the performance of the three dominant appearance based methods (principal component analysis, linear discriminant analysis and independent component analysis) and compares them on equal footing (i.e., with the same preprocessing procedure, with optimized parameters for the best possible performance, etc.) in face verification experiments on the publicly available XM2VTS database. In addition to the comparative analysis on the XM2VTS database, ten degraded versions of the database are also employed in the experiments to evaluate the susceptibility of the appearance based methods to various image degradations which can occur in "real-life" operating conditions. Our experimental results suggest that linear discriminant analysis ensures the most consistent verification rates across the tested databases.

ANN-Based Classification of Indirect Immuno Fluorescence Images

In this paper we address the issue of classifying the fluorescent intensity of a sample in Indirect Immuno-Fluorescence (IIF). Since IIF is by its very nature a subjective, semi-quantitative test, we discuss a strategy to reliably label the image data set by using the diagnoses performed by different physicians. Then, we discuss image pre-processing, feature extraction and selection. Finally, we propose two ANN-based classifiers that can separate intrinsically dubious samples and whose error tolerance can be flexibly set. Measured performance shows error rates of less than 1%, which makes the method a candidate for use in daily medical practice, either to perform pre-selection of cases to be examined or to act as a second reader.

Face Recognition Using Morphological Shared-weight Neural Networks

We introduce an algorithm based on the morphological shared-weight neural network (MSNN). Being nonlinear and translation-invariant, the MSNN can be used to create better generalization during face recognition. Feature extraction is performed on grayscale images using hit-miss transforms that are independent of gray-level shifts. The output is then learned by interacting with the classification process. The feature extraction and classification networks are trained together, allowing the MSNN to simultaneously learn feature extraction and classification for a face. For evaluation, we test for robustness under variations in gray levels and noise while varying the network's configuration to optimize recognition efficiency and processing time. Results show that the MSNN performs better for grayscale image pattern classification than ordinary neural networks.

Protein-Protein Interaction Detection Based on Substring Sensitivity Measure

Detecting protein-protein interactions is a central problem in computational biology, and aberrant interactions of this kind may have been implicated in a number of neurological disorders. As a result, the prediction of protein-protein interactions has recently received considerable attention from biologists around the globe. Computational tools that are capable of effectively identifying protein-protein interactions are much needed. In this paper, we propose a method to detect protein-protein interactions based on a substring similarity measure. Two protein sequences may interact by means of the similarities of the substrings they contain. When applied to the currently available protein-protein interaction data for the yeast Saccharomyces cerevisiae, the proposed method delivered reasonable improvement over the existing ones.