Incorporating Lexical-Semantic Knowledge into Convolutional Neural Network Framework for Pediatric Disease Diagnosis

The utilization of electronic medical record (EMR) data to establish the disease diagnosis model has become an important research content of biomedical informatics. Deep learning can automatically extract features from the massive data, which brings about breakthroughs in the study of EMR data. The challenge is that deep learning lacks semantic knowledge, which leads to impracticability in medical science. This research proposes a method of incorporating lexical-semantic knowledge from abundant entities into a convolutional neural network (CNN) framework for pediatric disease diagnosis. Firstly, medical terms are vectorized into Lexical Semantic Vectors (LSV), which are concatenated with the embedded word vectors of word2vec to enrich the feature representation. Secondly, the semantic distribution of medical terms serves as Semantic Decision Guide (SDG) for the optimization of deep learning models. The study evaluates the performance of LSV-SDG-CNN model on four kinds of Chinese EMR datasets. Additionally, CNN, LSV-CNN, and SDG-CNN are designed as baseline models for comparison. The experimental results show that LSV-SDG-CNN model outperforms baseline models on four kinds of Chinese EMR datasets. The best configuration of the model yielded an F1 score of 86.20%. The results clearly demonstrate that CNN has been effectively guided and optimized by lexical-semantic knowledge, and LSV-SDG-CNN model improves the disease classification accuracy with a clear margin.

Local Spectrum Feature Extraction for Face Recognition

This paper presents two techniques, local feature extraction using image spectrum and low frequency spectrum modelling using GMM to capture the underlying statistical information to improve the performance of face recognition system. Local spectrum features are extracted using overlap sub block window that are mapped on the face image. For each of this block, spatial domain is transformed to frequency domain using DFT. A low frequency coefficient is preserved by discarding high frequency coefficients by applying rectangular mask on the spectrum of the facial image. Low frequency information is non- Gaussian in the feature space and by using combination of several Gaussian functions that has different statistical properties, the best feature representation can be modelled using probability density function. The recognition process is performed using maximum likelihood value computed using pre-calculated GMM components. The method is tested using FERET datasets and is able to achieved 92% recognition rates.

Efficient and Effective Gabor Feature Representation for Face Detection

We here propose improved version of elastic graph matching (EGM) as a face detector, called the multi-scale EGM (MS-EGM). In this improvement, Gabor wavelet-based pyramid reduces computational complexity for the feature representation often used in the conventional EGM, but preserving a critical amount of information about an image. The MS-EGM gives us higher detection performance than Viola-Jones object detection algorithm of the AdaBoost Haar-like feature cascade. We also show rapid detection speeds of the MS-EGM, comparable to the Viola-Jones method. We find fruitful benefits in the MS-EGM, in terms of topological feature representation for a face.

Speaker Identification Using Admissible Wavelet Packet Based Decomposition

Mel Frequency Cepstral Coefficient (MFCC) features are widely used as acoustic features for speech recognition as well as speaker recognition. In MFCC feature representation, the Mel frequency scale is used to get a high resolution in low frequency region, and a low resolution in high frequency region. This kind of processing is good for obtaining stable phonetic information, but not suitable for speaker features that are located in high frequency regions. The speaker individual information, which is non-uniformly distributed in the high frequencies, is equally important for speaker recognition. Based on this fact we proposed an admissible wavelet packet based filter structure for speaker identification. Multiresolution capabilities of wavelet packet transform are used to derive the new features. The proposed scheme differs from previous wavelet based works, mainly in designing the filter structure. Unlike others, the proposed filter structure does not follow Mel scale. The closed-set speaker identification experiments performed on the TIMIT database shows improved identification performance compared to other commonly used Mel scale based filter structures using wavelets.

Palmprint based Cancelable Biometric Authentication System

A cancelable palmprint authentication system proposed in this paper is specifically designed to overcome the limitations of the contemporary biometric authentication system. In this proposed system, Geometric and pseudo Zernike moments are employed as feature extractors to transform palmprint image into a lower dimensional compact feature representation. Before moment computation, wavelet transform is adopted to decompose palmprint image into lower resolution and dimensional frequency subbands. This reduces the computational load of moment calculation drastically. The generated wavelet-moment based feature representation is used to generate cancelable verification key with a set of random data. This private binary key can be canceled and replaced. Besides that, this key also possesses high data capture offset tolerance, with highly correlated bit strings for intra-class population. This property allows a clear separation of the genuine and imposter populations, as well as zero Equal Error Rate achievement, which is hardly gained in the conventional biometric based authentication system.