Classification of the Latin Alphabet as Pattern on ARToolkit Markers for Augmented Reality Applications

Augmented reality is a technique used to insert virtual objects into real scenes. One of the most widely used libraries in the area is ARToolkit. It is based on the recognition of markers in the form of squares with a pattern inside. This pattern, which is mostly textual, is a source of confusion. In this paper, we present the results of a classification of Latin characters used as patterns on ARToolkit markers, to determine which are the most distinguishable.

Reconstruction of the Most Energetic Modes in a Fully Developed Turbulent Channel Flow with Density Variation

Proper orthogonal decomposition (POD) is used to reconstruct spatio-temporal data of a fully developed turbulent channel flow with density variation at a Reynolds number of 150, based on the friction velocity and the channel half-width, and a Prandtl number of 0.71. To apply POD to this flow, the flow field (velocities, density, and temperature) is scaled by the corresponding root-mean-square (rms) values so that it becomes dimensionless. A five-vector POD problem is solved numerically. The second-order moments of velocity, temperature, and density reconstructed from the POD eigenfunctions compare favorably with the original Direct Numerical Simulation (DNS) data.
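
The scaling and reconstruction steps can be illustrated with a minimal sketch. It is not the authors' solver: it assumes the data are arranged as one row per variable and grid point and one column per snapshot, extracts only the leading mode by power iteration, and uses hypothetical function names (rms_scale, leading_pod_mode, reconstruct).

```python
import math

def rms_scale(rows):
    """Divide each row (one variable at one point) by its rms over all snapshots."""
    scaled = []
    for row in rows:
        rms = math.sqrt(sum(x * x for x in row) / len(row)) or 1.0
        scaled.append([x / rms for x in row])
    return scaled

def leading_pod_mode(rows, iterations=200):
    """Leading eigenvector of the correlation matrix rows * rows^T (power iteration)."""
    n = len(rows)
    corr = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
            for i in range(n)]
    mode = [1.0 / math.sqrt(n)] * n  # assumed not orthogonal to the leading mode
    for _ in range(iterations):
        image = [sum(corr[i][j] * mode[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in image))
        mode = [x / norm for x in image]
    eigenvalue = sum(mode[i] * corr[i][j] * mode[j]
                     for i in range(n) for j in range(n))
    energy_fraction = eigenvalue / sum(corr[i][i] for i in range(n))
    return mode, energy_fraction

def reconstruct(rows, mode):
    """Rank-one reconstruction: project every snapshot onto the mode."""
    n_snapshots = len(rows[0])
    coefficients = [sum(mode[i] * rows[i][t] for i in range(len(rows)))
                    for t in range(n_snapshots)]
    return [[m * c for c in coefficients] for m in mode]
```

The energy fraction returned by leading_pod_mode indicates how much of the scaled field's second-order content the single mode captures; a practical study would retain several modes.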

Realtime Lip Contour Tracking For Audio-Visual Speech Recognition Applications

Detection and tracking of the lip contour is an important issue in speechreading. While there are solutions for lip tracking once a good contour initialization in the first frame is available, the problem of finding such an initialization has not yet been solved automatically and is still done manually. We have developed a new tracking solution for lip contour detection that uses only a few landmarks (15 to 25) and applies the well-known Active Shape Models (ASM). The proposed method is a new LMS-like adaptive scheme based on an autoregressive (AR) model fitted to the landmark variations in successive video frames. Moreover, we propose an extra motion compensation model to address more general cases in lip tracking. Computer simulations demonstrate a fair match between the true and the estimated spatial pixels. Significant improvements over the well-known LMS approach have been obtained, as measured by a defined Frobenius norm index.
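
As a point of reference for the LMS baseline mentioned above, the following sketch shows a plain LMS update of AR coefficients that predict one landmark coordinate from its previous values. It is not the proposed LMS-like scheme, whose details are not given here; the function name, the model order, and the step size are assumptions.

```python
def lms_ar_predict(series, order=2, step=0.05):
    """One-step-ahead AR prediction whose weights are adapted by plain LMS."""
    weights = [0.0] * order
    predictions = []
    for t in range(order, len(series)):
        past = series[t - order:t][::-1]  # most recent sample first
        predicted = sum(w * x for w, x in zip(weights, past))
        error = series[t] - predicted
        # LMS update: move the weights along the instantaneous error gradient.
        weights = [w + step * error * x for w, x in zip(weights, past)]
        predictions.append(predicted)
    return predictions, weights
```

In a tracker, series would hold one coordinate of one landmark over successive frames, and the step size would have to be small relative to the signal power for the update to remain stable.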

Pattern Recognition as an Internalized Motor Programme

A new conceptual architecture for low-level neural pattern recognition is presented. The key ideas are that the brain implements support vector machines and that support vectors are represented as memory patterns in competitive queuing memories. A binary classifier is built from two competitive queuing memories holding positive and negative valence training examples, respectively. The support vector machine classification function is calculated in synchronized evaluation cycles. The kernel is computed by bisymmetric feed-forward networks fed by sensory input and by competitive queuing memories traversing the complete sequence of support vectors. Temporal summation generates the output classification. It is speculated that the perception apparatus in the brain reuses structures that have evolved to enable fluent execution of prepared action sequences, so that pattern recognition is built on internalized motor programmes.
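
The computation described above corresponds to the usual support vector machine decision function, evaluated one support vector at a time. The sketch below is only an illustration of that arithmetic: the two lists stand in for the positive and negative memories, and the RBF kernel, the weights, and the function names are assumptions.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel; an assumed choice, any valid kernel could be used."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def classify(x, positive_memory, negative_memory, bias=0.0, kernel=rbf_kernel):
    """Each memory is a sequence of (weight, support_vector) pairs.

    The loops mimic one evaluation cycle per stored support vector, and the
    running total plays the role of the summation stage.
    """
    total = bias
    for weight, support_vector in positive_memory:
        total += weight * kernel(support_vector, x)
    for weight, support_vector in negative_memory:
        total -= weight * kernel(support_vector, x)
    return 1 if total >= 0 else -1
```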

Combining Bagging and Boosting

Bagging and boosting are among the most popular resampling ensemble methods that generate and combine a diversity of classifiers using the same learning algorithm for the base classifiers. Boosting algorithms are considered stronger than bagging on noise-free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, in this work we built an ensemble using a voting methodology of bagging and boosting ensembles with 10 sub-classifiers in each one. We compared it with simple bagging and boosting ensembles with 25 sub-classifiers, as well as with other well-known combining methods, on standard benchmark datasets, and the proposed technique was the most accurate.
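
The idea of voting between a bagging ensemble and a boosting ensemble can be sketched as follows. Several choices here are assumptions rather than the paper's setup: the base learner is a decision stump on one-dimensional inputs, boosting is AdaBoost, labels are -1 and +1, and the two ensembles are combined by adding their normalised vote scores.

```python
import math
import random

def train_stump(points, labels, weights):
    """Return (weighted error, threshold, polarity) of the best 1-D stump."""
    best = None
    for threshold in sorted(set(points)):
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(points, labels, weights)
                        if (polarity if x >= threshold else -polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def stump_predict(stump, x):
    _, threshold, polarity = stump
    return polarity if x >= threshold else -polarity

def bagging(points, labels, n_members, rng):
    """Train each stump on a bootstrap resample; all members vote equally."""
    n = len(points)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        stump = train_stump([points[i] for i in idx],
                            [labels[i] for i in idx], [1.0 / n] * n)
        members.append((1.0, stump))
    return members

def adaboost(points, labels, n_members):
    """Reweight the samples after each round; members vote with weight alpha."""
    n = len(points)
    weights = [1.0 / n] * n
    members = []
    for _ in range(n_members):
        stump = train_stump(points, labels, weights)
        error = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        members.append((alpha, stump))
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(points, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return members

def ensemble_score(members, x):
    """Weighted vote of one ensemble, normalised to the range [-1, 1]."""
    total = sum(alpha for alpha, _ in members)
    return sum(alpha * stump_predict(s, x) for alpha, s in members) / total

def combined_predict(bagged, boosted, x):
    """Vote between the two ensembles by adding their normalised scores."""
    return 1 if ensemble_score(bagged, x) + ensemble_score(boosted, x) >= 0 else -1
```

With 10 members in each ensemble, combined_predict uses 20 base classifiers in total, which is the budget the comparison against 25-member single ensembles refers to.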

Rough Set Based Intelligent Welding Quality Classification

The knowledge base of welding defect recognition is essentially incomplete. As a consequence, the recognition results do not always reflect the actual situation, which in turn affects the classification of welding quality. This paper studies a rough set based method to reduce this influence and improve the classification accuracy. First, a rough set model of intelligent welding quality classification is built, and both condition and decision attributes are specified. Next, groups of representative multiple compound defects are chosen from the defect library and classified correctly to form the decision table. Finally, the redundant information in the decision table is reduced and the optimal decision rules are obtained. With this method, we are able to reclassify misclassified defects to the right quality level. Compared with ordinary methods, this method has higher accuracy and better robustness.
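
The reduction step can be illustrated with a small sketch that drops a condition attribute whenever the decision table stays consistent without it. This greedy pass is a simplification, not necessarily the paper's reduction procedure, and the attribute names in the usage note are hypothetical.

```python
def is_consistent(rows, attributes):
    """True if rows that agree on all given attributes also agree on the decision."""
    seen = {}
    for conditions, decision in rows:
        key = tuple(conditions[a] for a in attributes)
        if seen.setdefault(key, decision) != decision:
            return False
    return True

def reduce_attributes(rows, attributes):
    """Greedily remove attributes that are not needed to keep the table consistent."""
    kept = list(attributes)
    for attribute in attributes:
        trial = [a for a in kept if a != attribute]
        if is_consistent(rows, trial):
            kept = trial
    return kept
```

For a table whose decision depends only on, say, "porosity" and "crack", an unrelated attribute such as "operator" would be removed, and the decision rules would then be read off the reduced table. The result of a greedy pass can depend on the order in which attributes are tried.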

Automatic Recognition of an Unknown and Time-Varying Number of Simultaneous Environmental Sound Sources

The present work addresses the problem of automatic enumeration and recognition of an unknown and time-varying number of environmental sound sources using a single microphone. The assumption made is that the recorded sound is a realization of sound sources belonging to a group of audio classes that is known a priori. We describe two variations of the same principle, which is to calculate the distance between the current unknown audio frame and all possible combinations of the classes that are assumed to span the sound scene. We concentrate on categorizing environmental sound sources, such as birds and insects, in the task of monitoring the biodiversity of a specific habitat.
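
The principle of comparing a frame against every combination of known classes can be sketched as below. The details are assumptions made for illustration: each class is represented by a single template feature vector, a combination is modelled as the sum of its templates, and the distance is Euclidean.

```python
from itertools import combinations

def best_combination(frame, class_templates, max_sources=3):
    """Return (classes, distance) for the class combination closest to the frame."""
    best = None
    names = sorted(class_templates)
    for count in range(1, max_sources + 1):
        for combo in combinations(names, count):
            # Assumed mixing model: active sources add up in feature space.
            mixture = [sum(class_templates[name][i] for name in combo)
                       for i in range(len(frame))]
            distance = sum((f - m) ** 2 for f, m in zip(frame, mixture)) ** 0.5
            if best is None or distance < best[1]:
                best = (combo, distance)
    return best
```

The length of the returned tuple gives the estimated number of simultaneous sources for that frame, so enumeration and recognition come out of the same search. The number of combinations grows quickly with the number of classes, which is why max_sources is capped.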

Ottoman Script Recognition Using Hidden Markov Model

In this study, an OCR system for segmentation, feature extraction, and recognition of Ottoman script has been developed using handwritten characters. Recognizing handwritten characters is a difficult process. The segmentation and feature extraction stages are based on geometrical feature analysis, followed by the chain code transformation of the main strokes of each character. The output of segmentation is a set of well-defined segments that can be fed into any classification approach. The classes of the main strokes are identified through a left-right Hidden Markov Model (HMM).
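
A chain code transformation of a stroke can be sketched as follows. The sketch assumes the stroke is already available as an ordered list of 8-connected pixel coordinates and uses the common 8-direction Freeman numbering with the y axis pointing up; the system described above may use a different convention.

```python
# Freeman 8-direction codes, counter-clockwise starting from "east".
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}

def chain_code(stroke):
    """Convert consecutive (x, y) stroke points into a list of direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not 8-connected")
        codes.append(DIRECTIONS[step])
    return codes
```

The resulting code sequence is the kind of discrete observation sequence a left-right HMM can be trained on.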

Motion Recognition Based On Fuzzy WP Feature Extraction Approach

This paper is concerned with motion recognition based on a fuzzy wavelet packet (WP) feature extraction approach applied to Vicon physical action data sets. For this purpose, we use an efficient fuzzy mutual-information-based WP transform for feature extraction. This method estimates the required mutual information using a novel approach based on fuzzy membership functions. The physical action data set includes 10 normal and 10 aggressive physical actions that measure human activity. The data were collected from 10 subjects using the Vicon 3D tracker. Among the various activities, the experiments use running, sitting, and walking as the physical activity motions. The experimental results show that the presented feature extraction approach achieves good recognition performance.

Integrating Low and High Level Object Recognition Steps

In pattern recognition applications, low-level segmentation and high-level object recognition are generally considered as two separate steps. The paper presents a method that bridges the gap between low-level and high-level object recognition. It is based on a Bayesian network representation and a network propagation algorithm. At the low level it uses a hierarchical structure of quadratic spline wavelet image bases. The method is demonstrated on a simple circuit diagram component identification problem.

An Advanced Method for Speech Recognition

In this paper, in view of the deficiencies of the available speech recognition techniques, an advanced method is presented that is able to classify speech signals with high accuracy (98%) in minimum time. In the presented method, the recorded signal is first preprocessed; this stage includes denoising with Mel-frequency cepstral analysis and feature extraction using discrete wavelet transform (DWT) coefficients. These features are then fed to a multilayer perceptron (MLP) network for classification. Finally, after training of the neural network, effective features are selected with the UTA algorithm.

View-Point Insensitive Human Pose Recognition using Neural Network

This paper proposes a view-point insensitive human pose recognition system using a neural network. The recognition system consists of a silhouette image capturing module, a data-driven database, and a neural network. The system has three advantages. First, it can capture multiple view-point silhouette images of a 3D human model automatically; this automatic capture module helps reduce the time-consuming task of database construction. Second, we develop a large feature database to provide view-point insensitivity in pose recognition. Third, we use a neural network to recognize human pose from multiple views, because a given pose has similar feature patterns across models, even though each model has a different appearance and view-point. To construct the database, we create 3D human models using 3D manipulation tools. The contour shape is used to convert each silhouette image into a feature vector of degree 12. This extraction task is processed semi-automatically, which has the benefit that capturing images in a real environment and converting them to silhouette images is unnecessary. We demonstrate the effectiveness of our approach with experiments in a virtual environment.

Real-Time Vision-based Korean Finger Spelling Recognition System

Finger spelling is the art of communicating by signs made with the fingers, and has been introduced into sign language to serve as a bridge between sign language and verbal language. Previous approaches to finger spelling recognition fall into two categories: glove-based and vision-based. The glove-based approach recognizes hand posture more simply and accurately than the vision-based approach, yet its interfaces require the user to wear a cumbersome device and carry a load of cables that connect the device to a computer. In contrast, vision-based approaches provide an attractive alternative to the cumbersome interface, and promise more natural and unobtrusive human-computer interaction. Vision-based approaches generally consist of two steps, hand extraction and recognition, which are processed independently. This paper proposes a real-time vision-based Korean finger spelling recognition system that integrates hand extraction into recognition. First, we tentatively detect a hand region using the CAMShift algorithm. Then the fill factor and aspect ratio, computed from the width and height estimated by CAMShift, are used to choose candidates from the database, which reduces the number of matching operations in the recognition step. To recognize the finger spelling, we use dynamic time warping (DTW) based on modified chain codes, which is robust to scale and orientation variations. In this procedure, since accurate hand regions, without holes and noise, must be extracted to improve precision, we use a graph cuts algorithm that globally minimizes an energy function expressed by Markov random fields (MRFs). In the experiments, the computation times are less than 130 ms and do not depend on the number of finger spelling templates in the database, as candidate templates are selected in the extraction step.
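
The matching step can be illustrated with a textbook dynamic time warping routine over chain code sequences. The paper's modification of the chain codes is not described here, so the sketch assumes plain 8-direction codes compared with a circular difference; the function names are hypothetical.

```python
def chain_code_cost(a, b):
    """Circular difference between two 8-direction codes (0 and 7 are neighbours)."""
    diff = abs(a - b) % 8
    return min(diff, 8 - diff)

def dtw_distance(first, second, cost=chain_code_cost):
    """Classic DTW: cheapest monotone alignment of two sequences."""
    inf = float("inf")
    n, m = len(first), len(second)
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(first[i - 1], second[j - 1])
            table[i][j] = step + min(table[i - 1][j],      # stretch second
                                     table[i][j - 1],      # stretch first
                                     table[i - 1][j - 1])  # advance both
    return table[n][m]

def closest_template(query, templates):
    """Name of the candidate template with the smallest DTW distance to the query."""
    return min(templates, key=lambda name: dtw_distance(query, templates[name]))
```

Because DTW lets one sequence stretch against the other, contours traced at different scales (and therefore with different code lengths) can still align at low cost. In the system described above, templates would hold only the candidates preselected by fill factor and aspect ratio.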

On Preprocessing of Speech Signals

Preprocessing of speech signals is considered a crucial step in the development of a robust and efficient speech or speaker recognition system. In this paper, we present some popular statistical outlier-detection based strategies to segregate the silence/unvoiced part of the speech signal from the voiced portion. The proposed methods are based on the 3σ edit rule and the Hampel identifier, and are compared with conventional techniques: (i) short-time energy (STE) based methods, and (ii) distribution based methods. The results obtained after applying the proposed strategies to some test voice signals are encouraging.
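
Both rules can be stated in a few lines. The sketch below applies them to a generic list of values (for speech, these could be per-frame energies, which is an assumption about the feature used); the threshold k = 3 and the constant 1.4826, which makes the median absolute deviation comparable to a standard deviation for Gaussian data, are the conventional choices.

```python
import statistics

def three_sigma_outliers(values, k=3.0):
    """3-sigma edit rule: flag values more than k standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [abs(v - mean) > k * std for v in values]

def hampel_outliers(values, k=3.0):
    """Hampel identifier: same test with the median and scaled MAD as robust estimates."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    scale = 1.4826 * mad
    return [abs(v - median) > k * scale for v in values]
```

On short records the 3σ rule can miss an obvious outlier because the outlier itself inflates the mean and standard deviation, whereas the median-based Hampel identifier is much less affected.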

3D Star Skeleton for Fast Human Posture Representation

In this paper, we propose an improved 3D star skeleton technique, a skeletonization suited to human posture representation that reflects the 3D information of the posture. Moreover, the proposed technique is simple and can therefore be performed in real time. Existing skeleton construction techniques, such as the distance transformation, the Voronoi diagram, and thinning, focus on the precision of the skeleton information. Those techniques are therefore not applicable to real-time posture recognition, since they are computationally expensive and highly susceptible to boundary noise. Although a 2D star skeleton was proposed to address these problems, it is limited in describing the 3D information of the posture. To represent human posture effectively, the constructed skeleton should take the 3D information of the posture into account. The proposed 3D star skeleton contains 3D data of the human body and targets human action and posture recognition. It uses 8 projection maps that hold the 2D silhouette information and depth data of the human surface. The extremal points can be extracted as the features of the 3D star skeleton without searching the whole boundary of the object. Therefore, in execution time, our 3D star skeleton is faster than the "greedy" 3D star skeleton, which uses all boundary points on the surface. Moreover, our method can offer a more accurate skeleton of the posture than the existing star skeleton, since the 3D data of the object is taken into account. Additionally, we build a codebook, a collection of representative 3D star skeletons for 7 postures, to recognize the posture of a constructed skeleton.

Biologically Inspired Artificial Neural Cortex Architecture and its Formalism

The paper attempts to elucidate the columnar structure of the cortex by answering the following questions. (1) Why do cortical neurons with similar interests tend to be vertically arrayed, forming what are known as cortical columns? (2) How can the cortex as a whole be described in concise mathematical terms? (3) How can efficient digital models of the cortex be designed?

Mouse Pointer Tracking with Eyes

In this article, we present our research work in human-machine interaction. The research consists in manipulating the workspace with the eyes. We present some of our results, in particular the detection of the eyes and the recognition of mouse actions. With this approach, a handicapped user is able to interact with the machine in a more intuitive way in diverse applications and contexts. To test our application, we have chosen to work in real time on videos captured by a camera placed in front of the user.

Shift Invariant Support Vector Machines Face Recognition System

In this paper, we present a new method for incorporating global shift invariance in support vector machines. Unlike other approaches, which incorporate a feature extraction stage, we first scale the image and then classify it using the modified support vector machine classifier. Shift invariance is achieved by replacing the dot products between patterns used by the SVM classifier with the maximum cross-correlation value between them. Unlike the normal approach, in which the patterns are treated as vectors, in our approach the patterns are treated as matrices (or images). Cross-correlation is computed using computationally efficient techniques such as the fast Fourier transform. The method has been tested on the ORL face database. The tests indicate that this method can improve the recognition rate of an SVM classifier.
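
The replacement of the dot product can be illustrated directly. The sketch below computes the maximum cross-correlation between two equally sized images by trying every shift explicitly; it assumes circular (wrap-around) shifts, which is what an unpadded FFT-based correlation would produce, and it favours clarity over the efficiency an FFT gives.

```python
def max_cross_correlation(a, b):
    """Largest dot product between image a and any circular shift of image b."""
    rows, cols = len(a), len(a[0])
    best = float("-inf")
    for dy in range(rows):
        for dx in range(cols):
            value = sum(a[y][x] * b[(y + dy) % rows][(x + dx) % cols]
                        for y in range(rows) for x in range(cols))
            best = max(best, value)
    return best
```

For an image and a shifted copy of itself, this value equals the dot product of the image with itself, so the similarity no longer depends on where the pattern sits in the frame.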

Enhancing Human-Computer Interaction and Feedback in Touchscreen Icon

In order to enhance the usability of the human-computer interface (HCI) on the touchscreen, this study explored the optimal tactile depth and the effect of visual cues on the user's tendency to touch touchscreen icons. The experimental program was designed on the touchscreen in this study. Results indicated that the ratio of the icon size to the tactile depth was 1:0.106. There were significant effects of experienced users and novices on the tactile feedback depth (p < 0.01). In addition, the results showed that the visual cues provided feedback that helped guide the user to touch icons accurately and increased the capture efficiency for a tactile recognition field. This tactile recognition field was 18.6 mm in length. There was consistency between the experienced users and novices under the visual cue effects. Finally, the study developed an applied design with touch feedback for touchscreen icons.

Constructing of Classifier for Face Recognition on the Basis of the Conjugation Indexes

In this work, the possibility of constructing classifiers for face recognition systems based on conjugation criteria is investigated. The linkage between the bipartite conjugation, the conjugation with a subspace, and the conjugation with the null space is shown. A unified decision rule is investigated, which decides on the assignment of a face to a class by considering the linkage between conjugation values. The described recognition method can be successfully applied to distributed video control and video surveillance systems.