Photograph Based Pair-matching Recognition of Human Faces

In this paper, a novel system recognition of human faces without using face different color photographs is proposed. It mainly in face detection, normalization and recognition. Foot method of combination of Haar-like face determined segmentation and region-based histogram stretchi (RHST) is proposed to achieve more accurate perf using Haar. Apart from an effective angle norm side-face (pose) normalization, which is almost a might be important and beneficial for the prepr introduced. Then histogram-based and photom normalization methods are investigated and ada retinex (ASR) is selected for its satisfactory illumin Finally, weighted multi-block local binary pattern with 3 distance measures is applied for pair-mat Experimental results show its advantageous perfo with PCA and multi-block LBP, based on a principle.

Complex-Valued Neural Network in Image Recognition: A Study on the Effectiveness of Radial Basis Function

A complex valued neural network is a neural network, which consists of complex valued input and/or weights and/or thresholds and/or activation functions. Complex-valued neural networks have been widening the scope of applications not only in electronics and informatics, but also in social systems. One of the most important applications of the complex valued neural network is in image and vision processing. In Neural networks, radial basis functions are often used for interpolation in multidimensional space. A Radial Basis function is a function, which has built into it a distance criterion with respect to a centre. Radial basis functions have often been applied in the area of neural networks where they may be used as a replacement for the sigmoid hidden layer transfer characteristic in multi-layer perceptron. This paper aims to present exhaustive results of using RBF units in a complex-valued neural network model that uses the back-propagation algorithm (called 'Complex-BP') for learning. Our experiments results demonstrate the effectiveness of a Radial basis function in a complex valued neural network in image recognition over a real valued neural network. We have studied and stated various observations like effect of learning rates, ranges of the initial weights randomly selected, error functions used and number of iterations for the convergence of error on a neural network model with RBF units. Some inherent properties of this complex back propagation algorithm are also studied and discussed.

Enhanced Clustering Analysis and Visualization Using Kohonen's Self-Organizing Feature Map Networks

Cluster analysis is the name given to a diverse collection of techniques that can be used to classify objects (e.g. individuals, quadrats, species etc). While Kohonen's Self-Organizing Feature Map (SOFM) or Self-Organizing Map (SOM) networks have been successfully applied as a classification tool to various problem domains, including speech recognition, image data compression, image or character recognition, robot control and medical diagnosis, its potential as a robust substitute for clustering analysis remains relatively unresearched. SOM networks combine competitive learning with dimensionality reduction by smoothing the clusters with respect to an a priori grid and provide a powerful tool for data visualization. In this paper, SOM is used for creating a toroidal mapping of two-dimensional lattice to perform cluster analysis on results of a chemical analysis of wines produced in the same region in Italy but derived from three different cultivators, referred to as the “wine recognition data" located in the University of California-Irvine database. The results are encouraging and it is believed that SOM would make an appealing and powerful decision-support system tool for clustering tasks and for data visualization.

Development of a Pipeline Monitoring System by Bio-mimetic Robots

To explore pipelines is one of various bio-mimetic robot applications. The robot may work in common buildings such as between ceilings and ducts, in addition to complicated and massive pipeline systems of large industrial plants. The bio-mimetic robot finds any troubled area or malfunction and then reports its data. Importantly, it can not only prepare for but also react to any abnormal routes in the pipeline. The pipeline monitoring tasks require special types of mobile robots. For an effective movement along a pipeline, the movement of the robot will be similar to that of insects or crawling animals. During its movement along the pipelines, a pipeline monitoring robot has an important task of finding the shapes of the approaching path on the pipes. In this paper we propose an effective solution to the pipeline pattern recognition, based on the fuzzy classification rules for the measured IR distance data.

A Modified Speech Enhancement Using Adaptive Gain Equalizer with Non linear Spectral Subtraction for Robust Speech Recognition

In this paper we present an enhanced noise reduction method for robust speech recognition using Adaptive Gain Equalizer with Non linear Spectral Subtraction. In Adaptive Gain Equalizer method (AGE), the input signal is divided into a number of subbands that are individually weighed in time domain, in accordance to the short time Signal-to-Noise Ratio (SNR) in each subband estimation at every time instant. Instead of focusing on suppression the noise on speech enhancement is focused. When analysis was done under various noise conditions for speech recognition, it was found that Adaptive Gain Equalizer method algorithm has an obvious failing point for a SNR of -5 dB, with inadequate levels of noise suppression for SNR less than this point. This work proposes the implementation of AGE when coupled with Non linear Spectral Subtraction (AGE-NSS) for robust speech recognition. The experimental result shows that out AGE-NSS performs the AGE when SNR drops below -5db level.

Validation Testing for Temporal Neural Networks for RBF Recognition

A neuron can emit spikes in an irregular time basis and by averaging over a certain time window one would ignore a lot of information. It is known that in the context of fast information processing there is no sufficient time to sample an average firing rate of the spiking neurons. The present work shows that the spiking neurons are capable of computing the radial basis functions by storing the relevant information in the neurons' delays. One of the fundamental findings of the this research also is that when using overlapping receptive fields to encode the data patterns it increases the network-s clustering capacity. The clustering algorithm that is discussed here is interesting from computer science and neuroscience point of view as well as from a perspective.

Face Localization and Recognition in Varied Expressions and Illumination

In this paper, we propose a robust scheme to work face alignment and recognition under various influences. For face representation, illumination influence and variable expressions are the important factors, especially the accuracy of facial localization and face recognition. In order to solve those of factors, we propose a robust approach to overcome these problems. This approach consists of two phases. One phase is preprocessed for face images by means of the proposed illumination normalization method. The location of facial features can fit more efficient and fast based on the proposed image blending. On the other hand, based on template matching, we further improve the active shape models (called as IASM) to locate the face shape more precise which can gain the recognized rate in the next phase. The other phase is to process feature extraction by using principal component analysis and face recognition by using support vector machine classifiers. The results show that this proposed method can obtain good facial localization and face recognition with varied illumination and local distortion.

Segmentation Problems and Solutions in Printed Degraded Gurmukhi Script

Character segmentation is an important preprocessing step for text recognition. In degraded documents, existence of touching characters decreases recognition rate drastically, for any optical character recognition (OCR) system. In this paper we have proposed a complete solution for segmenting touching characters in all the three zones of printed Gurmukhi script. A study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis. Structural properties of the Gurmukhi characters are used for defining the categories. New algorithms have been proposed to segment the touching characters in middle zone, upper zone and lower zone. These algorithms have shown a reasonable improvement in segmenting the touching characters in degraded printed Gurmukhi script. The algorithms proposed in this paper are applicable only to machine printed text. We have also discussed a new and useful technique to segment the horizontally overlapping lines.

Search Engine Module in Voice Recognition Browser to Facilitate the Visually Impaired in Virtual Learning (MGSYS VISI-VL)

Nowadays, web-based technologies influence in people-s daily life such as in education, business and others. Therefore, many web developers are too eager to develop their web applications with fully animation graphics and forgetting its accessibility to its users. Their purpose is to make their web applications look impressive. Thus, this paper would highlight on the usability and accessibility of a voice recognition browser as a tool to facilitate the visually impaired and blind learners in accessing virtual learning environment. More specifically, the objectives of the study are (i) to explore the challenges faced by the visually impaired learners in accessing virtual learning environment (ii) to determine the suitable guidelines for developing a voice recognition browser that is accessible to the visually impaired. Furthermore, this study was prepared based on an observation conducted with the Malaysian visually impaired learners. Finally, the result of this study would underline on the development of an accessible voice recognition browser for the visually impaired.

Application of Genetic Algorithms to Feature Subset Selection in a Farsi OCR

Dealing with hundreds of features in character recognition systems is not unusual. This large number of features leads to the increase of computational workload of recognition process. There have been many methods which try to remove unnecessary or redundant features and reduce feature dimensionality. Besides because of the characteristics of Farsi scripts, it-s not possible to apply other languages algorithms to Farsi directly. In this paper some methods for feature subset selection using genetic algorithms are applied on a Farsi optical character recognition (OCR) system. Experimental results show that application of genetic algorithms (GA) to feature subset selection in a Farsi OCR results in lower computational complexity and enhanced recognition rate.

Comparing Arabic and Latin Handwritten Digits Recognition Problems

A comparison between the performance of Latin and Arabic handwritten digits recognition problems is presented. The performance of ten different classifiers is tested on two similar Arabic and Latin handwritten digits databases. The analysis shows that Arabic handwritten digits recognition problem is easier than that of Latin digits. This is because the interclass difference in case of Latin digits is smaller than in Arabic digits and variances in writing Latin digits are larger. Consequently, weaker yet fast classifiers are expected to play more prominent role in Arabic handwritten digits recognition.

Performance Comparison and Evaluation of AdaBoost and SoftBoost Algorithms on Generic Object Recognition

SoftBoost is a recently presented boosting algorithm, which trades off the size of achieved classification margin and generalization performance. This paper presents a performance evaluation of SoftBoost algorithm on the generic object recognition problem. An appearance-based generic object recognition model is used. The evaluation experiments are performed using a difficult object recognition benchmark. An assessment with respect to different degrees of label noise as well as a comparison to the well known AdaBoost algorithm is performed. The obtained results reveal that SoftBoost is encouraged to be used in cases when the training data is known to have a high degree of noise. Otherwise, using Adaboost can achieve better performance.

A New Approach to Face Recognition Using Dual Dimension Reduction

In this paper a new approach to face recognition is presented that achieves double dimension reduction, making the system computationally efficient with better recognition results and out perform common DCT technique of face recognition. In pattern recognition techniques, discriminative information of image increases with increase in resolution to a certain extent, consequently face recognition results change with change in face image resolution and provide optimal results when arriving at a certain resolution level. In the proposed model of face recognition, initially image decimation algorithm is applied on face image for dimension reduction to a certain resolution level which provides best recognition results. Due to increased computational speed and feature extraction potential of Discrete Cosine Transform (DCT), it is applied on face image. A subset of coefficients of DCT from low to mid frequencies that represent the face adequately and provides best recognition results is retained. A tradeoff between decimation factor, number of DCT coefficients retained and recognition rate with minimum computation is obtained. Preprocessing of the image is carried out to increase its robustness against variations in poses and illumination level. This new model has been tested on different databases which include ORL , Yale and EME color database.

Combining Bagging and Boosting

Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base-classifiers. Boosting algorithms are considered stronger than bagging on noisefree data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using a voting methodology of bagging and boosting ensembles with 10 subclassifiers in each one. We performed a comparison with simple bagging and boosting ensembles with 25 sub-classifiers, as well as other well known combining methods, on standard benchmark datasets and the proposed technique was the most accurate.

Integrating Low and High Level Object Recognition Steps

In pattern recognition applications the low level segmentation and the high level object recognition are generally considered as two separate steps. The paper presents a method that bridges the gap between the low and the high level object recognition. It is based on a Bayesian network representation and network propagation algorithm. At the low level it uses hierarchical structure of quadratic spline wavelet image bases. The method is demonstrated for a simple circuit diagram component identification problem.

View-Point Insensitive Human Pose Recognition using Neural Network

This paper proposes view-point insensitive human pose recognition system using neural network. Recognition system consists of silhouette image capturing module, data driven database, and neural network. The advantages of our system are first, it is possible to capture multiple view-point silhouette images of 3D human model automatically. This automatic capture module is helpful to reduce time consuming task of database construction. Second, we develop huge feature database to offer view-point insensitivity at pose recognition. Third, we use neural network to recognize human pose from multiple-view because every pose from each model have similar feature patterns, even though each model has different appearance and view-point. To construct database, we need to create 3D human model using 3D manipulate tools. Contour shape is used to convert silhouette image to feature vector of 12 degree. This extraction task is processed semi-automatically, which benefits in that capturing images and converting to silhouette images from the real capturing environment is needless. We demonstrate the effectiveness of our approach with experiments on virtual environment.

Real-Time Vision-based Korean Finger Spelling Recognition System

Finger spelling is an art of communicating by signs made with fingers, and has been introduced into sign language to serve as a bridge between the sign language and the verbal language. Previous approaches to finger spelling recognition are classified into two categories: glove-based and vision-based approaches. The glove-based approach is simpler and more accurate recognizing work of hand posture than vision-based, yet the interfaces require the user to wear a cumbersome and carry a load of cables that connected the device to a computer. In contrast, the vision-based approaches provide an attractive alternative to the cumbersome interface, and promise more natural and unobtrusive human-computer interaction. The vision-based approaches generally consist of two steps: hand extraction and recognition, and two steps are processed independently. This paper proposes real-time vision-based Korean finger spelling recognition system by integrating hand extraction into recognition. First, we tentatively detect a hand region using CAMShift algorithm. Then fill factor and aspect ratio estimated by width and height estimated by CAMShift are used to choose candidate from database, which can reduce the number of matching in recognition step. To recognize the finger spelling, we use DTW(dynamic time warping) based on modified chain codes, to be robust to scale and orientation variations. In this procedure, since accurate hand regions, without holes and noises, should be extracted to improve the precision, we use graph cuts algorithm that globally minimize the energy function elegantly expressed by Markov random fields (MRFs). In the experiments, the computational times are less than 130ms, and the times are not related to the number of templates of finger spellings in database, as candidate templates are selected in extraction step.

3D Star Skeleton for Fast Human Posture Representation

In this paper, we propose an improved 3D star skeleton technique, which is a suitable skeletonization for human posture representation and reflects the 3D information of human posture. Moreover, the proposed technique is simple and then can be performed in real-time. The existing skeleton construction techniques, such as distance transformation, Voronoi diagram, and thinning, focus on the precision of skeleton information. Therefore, those techniques are not applicable to real-time posture recognition since they are computationally expensive and highly susceptible to noise of boundary. Although a 2D star skeleton was proposed to complement these problems, it also has some limitations to describe the 3D information of the posture. To represent human posture effectively, the constructed skeleton should consider the 3D information of posture. The proposed 3D star skeleton contains 3D data of human, and focuses on human action and posture recognition. Our 3D star skeleton uses the 8 projection maps which have 2D silhouette information and depth data of human surface. And the extremal points can be extracted as the features of 3D star skeleton, without searching whole boundary of object. Therefore, on execution time, our 3D star skeleton is faster than the “greedy" 3D star skeleton using the whole boundary points on the surface. Moreover, our method can offer more accurate skeleton of posture than the existing star skeleton since the 3D data for the object is concerned. Additionally, we make a codebook, a collection of representative 3D star skeletons about 7 postures, to recognize what posture of constructed skeleton is.

Mouse Pointer Tracking with Eyes

In this article, we expose our research work in Human-machine Interaction. The research consists in manipulating the workspace by eyes. We present some of our results, in particular the detection of eyes and the mouse actions recognition. Indeed, the handicaped user becomes able to interact with the machine in a more intuitive way in diverse applications and contexts. To test our application we have chooses to work in real time on videos captured by a camera placed in front of the user.

Constructing of Classifier for Face Recognition on the Basis of the Conjugation Indexes

In this work the opportunity of construction of the qualifiers for face-recognition systems based on conjugation criteria is investigated. The linkage between the bipartite conjugation, the conjugation with a subspace and the conjugation with the null-space is shown. The unified solving rule is investigated. It makes the decision on the rating of face to a class considering the linkage between conjugation values. The described recognition method can be successfully applied to the distributed systems of video control and video observation.