Abstract: The present work addresses the problem of automatically enumerating and recognizing an unknown and time-varying number of environmental sound sources using a single microphone. We assume that the recorded sound is a realization of sound sources belonging to a group of audio classes that is known a priori. We describe two variations of the same principle, which is to calculate the distance between the current unknown audio frame and all possible combinations of the classes that are assumed to span the sound scene. We concentrate on categorizing environmental sound sources, such as birds and insects, for the task of monitoring the biodiversity of a specific habitat.
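The principle of comparing a frame against every combination of known classes can be illustrated with a minimal sketch. It is not the authors' method: the prototype vectors, the additive mixing of prototypes, the Euclidean distance, and the name `best_combination` are all assumptions made for illustration.

```python
from itertools import combinations
import math


def best_combination(frame, class_models, max_sources=None):
    """Return the subset of class names whose summed prototype is closest to frame."""
    names = sorted(class_models)
    max_sources = max_sources or len(names)
    best_names, best_dist = (), math.inf
    for k in range(1, max_sources + 1):
        for combo in combinations(names, k):
            # Assumption: a mixture is the element-wise sum of its sources' prototypes.
            mixture = [sum(vals) for vals in zip(*(class_models[n] for n in combo))]
            dist = math.dist(frame, mixture)
            if dist < best_dist:
                best_names, best_dist = combo, dist
    return best_names, best_dist


# Hypothetical 4-bin spectral prototypes for three audio classes.
models = {
    "bird": [0.0, 1.0, 4.0, 1.0],
    "insect": [0.0, 0.0, 1.0, 5.0],
    "rain": [2.0, 2.0, 2.0, 2.0],
}
frame = [0.1, 1.0, 5.1, 5.9]  # roughly bird + insect
names, dist = best_combination(frame, models)
print(names, round(dist, 3))
```

The number of detected sources is simply the size of the winning subset, so enumeration and recognition fall out of the same search. The search grows exponentially with the number of classes, which is why `max_sources` is offered as a cap.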
Abstract: In this study, an OCR system for the segmentation,
feature extraction and recognition of Ottoman script has been
developed using handwritten characters. Recognizing handwritten
characters is a difficult task. The segmentation and
feature extraction stages are based on geometrical feature analysis,
followed by the chain code transformation of the main strokes of
each character. The output of segmentation is a set of well-defined segments
that can be fed into any classification approach. The classes of the main
strokes are identified using a left-right hidden Markov model
(HMM).
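The chain code transformation of a stroke can be sketched as follows. This assumes the standard 8-direction Freeman code with the y axis pointing up and a stroke given as a list of 8-connected pixel coordinates; the abstract does not specify the exact coding the authors use, and `chain_code` is a hypothetical name.

```python
# Map each unit step (dx, dy) to a Freeman direction code, counter-clockwise from east.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(points):
    """Convert consecutive 8-connected points into a list of direction codes."""
    return [
        DIRECTIONS[(x1 - x0, y1 - y0)]
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


stroke = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
codes = chain_code(stroke)
print(codes)
```

The resulting symbol sequence is the kind of discrete observation sequence a left-right HMM can consume.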
Abstract: This paper is concerned with motion recognition based on a fuzzy wavelet packet (WP) feature extraction approach applied to Vicon physical action data sets. For this purpose, we use an efficient fuzzy mutual-information-based WP transform for feature extraction. This method estimates the required mutual information using a novel approach based on fuzzy membership functions. The physical action data set includes 10 normal and 10 aggressive physical actions that characterize human activity. The data were collected from 10 subjects using the Vicon 3D tracker. The experiments use running, sitting, and walking as the physical activity motions, selected from among the various activities. The experimental results show that the presented feature extraction approach achieves good recognition performance.
Abstract: In pattern recognition applications, low-level
segmentation and high-level object recognition are generally
considered as two separate steps. This paper presents a method that
bridges the gap between low-level segmentation and high-level object
recognition. It is based on a Bayesian network representation and a
network propagation algorithm. At the low level it uses a hierarchical
structure of quadratic spline wavelet image bases. The method is
demonstrated on a simple circuit diagram component identification
problem.
Abstract: In this paper, in consideration of the deficiencies of the
available speech recognition techniques, an advanced method
is presented that is able to classify speech signals with high
accuracy (98%) in minimal time. In the presented method, the
recorded signal is first preprocessed; this stage includes
denoising with Mel Frequency Cepstral Analysis and feature
extraction using discrete wavelet transform (DWT) coefficients. Then
these features are fed to a Multilayer Perceptron (MLP) network for
classification. Finally, after training the neural network, the effective
features are selected with the UTA algorithm.
Abstract: Lurking behavior is common in information-seeking oriented communities. Converting lurkers into contributors can help virtual communities obtain competitive advantages. Based on the ecological cognition framework, this study proposes a model to examine the antecedents of lurking behavior in information-seeking oriented virtual communities. This study argues that the desires for emotional support, information support, performance-approach, performance-avoidance, mastery-approach, mastery-avoidance, ability trust, benevolence trust, and integrity trust each affect lurking behavior. This study offers an approach to understanding the determinants of lurking behavior in online contexts.
Abstract: This paper proposes a view-point insensitive human
pose recognition system using a neural network. The recognition system
consists of a silhouette image capturing module, a data-driven database,
and a neural network. The advantages of our system are as follows. First, it is
possible to capture multiple view-point silhouette images of a 3D human
model automatically. This automatic capture module helps to
reduce the time-consuming task of database construction. Second, we
develop a large feature database to offer view-point insensitivity in pose
recognition. Third, we use a neural network to recognize human pose
from multiple views, because a given pose yields similar
feature patterns for every model, even though each model has a different
appearance and view-point. To construct the database, we create 3D human models
using 3D manipulation tools. Contour shape is used to convert each silhouette
image to a 12-dimensional feature vector. This extraction task is performed
semi-automatically, which has the benefit that capturing images in a real
environment and converting them to silhouette images is
unnecessary. We demonstrate the effectiveness of our approach with
experiments in a virtual environment.
Abstract: Finger spelling is the art of communicating by signs
made with the fingers, and has been introduced into sign language to serve
as a bridge between sign language and verbal language.
Previous approaches to finger spelling recognition fall into
two categories: glove-based and vision-based approaches. The
glove-based approach recognizes hand posture more simply and more accurately
than the vision-based approach, yet its interfaces require the user to
wear a cumbersome device and carry a load of cables that connect the
device to a computer. In contrast, the vision-based approaches provide
an attractive alternative to the cumbersome interface, and promise
more natural and unobtrusive human-computer interaction. The
vision-based approaches generally consist of two steps, hand
extraction and recognition, and the two steps are processed independently.
This paper proposes a real-time vision-based Korean finger spelling
recognition system that integrates hand extraction into recognition.
First, we tentatively detect a hand region using the CAMShift algorithm.
Then the fill factor and aspect ratio, computed from the width and height
estimated by CAMShift, are used to choose candidates from the database,
which reduces the number of matches in the recognition step. To
recognize the finger spelling, we use DTW (dynamic time warping)
based on modified chain codes, to be robust to scale and orientation
variations. In this procedure, since accurate hand regions, without
holes and noise, should be extracted to improve the precision, we use
a graph cuts algorithm that globally minimizes the energy function
expressed by Markov random fields (MRFs). In the
experiments, the computation times are less than 130 ms, and the
times are not related to the number of finger spelling templates in the
database, as candidate templates are selected in the extraction step.
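The DTW matching step can be sketched as below. This is only an illustration under assumptions: it uses plain 8-direction chain codes (0 to 7) with a cyclic local cost, whereas the abstract refers to modified chain codes whose definition is not given, and the function names are hypothetical.

```python
def chain_code_cost(a, b):
    """Cyclic distance between two 8-direction chain codes (0..7)."""
    d = abs(a - b) % 8
    return min(d, 8 - d)


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two chain-code sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = chain_code_cost(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


template = [0, 0, 2, 2, 4, 4, 6, 6]                # a square contour
stretched = [0, 0, 0, 2, 2, 2, 4, 4, 4, 6, 6, 6]   # same shape, larger scale
other = [0, 1, 2, 3, 4, 5, 6, 7]                   # a different contour
print(dtw_distance(template, stretched), dtw_distance(template, other))
```

Because DTW lets one symbol align with several, the larger copy of the same contour matches the template at zero cost, which is the kind of scale robustness the abstract describes.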
Abstract: Preprocessing of speech signals is considered a crucial step in the development of a robust and efficient speech or speaker recognition system. In this paper, we present some popular statistical outlier-detection-based strategies to separate the silence/unvoiced part of the speech signal from the voiced portion. The proposed methods are based on the 3σ edit rule and the Hampel identifier, which are compared with the conventional techniques: (i) short-time energy (STE) based methods, and (ii) distribution-based methods. The results obtained after applying the proposed strategies to some test voice signals are encouraging.
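The two outlier rules named above can be sketched on a toy sequence of per-frame energies. The data, the threshold `k = 3`, and the idea of flagging voiced frames as outliers against a silence background are assumptions for illustration; the abstract does not describe the authors' exact procedure. The constant 1.4826 is the usual factor that scales the median absolute deviation to a standard-deviation estimate for Gaussian data.

```python
import statistics


def three_sigma_outliers(values, k=3.0):
    """Flag values more than k standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [abs(v - mean) > k * std for v in values]


def hampel_outliers(values, k=3.0):
    """Flag values more than k scaled MADs from the median (Hampel identifier)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    return [abs(v - med) > k * scale for v in values]


# Hypothetical per-frame energies: eight silence frames, then two voiced frames.
energies = [0.9, 1.1, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 9.0, 10.0]
sigma_flags = three_sigma_outliers(energies)
hampel_flags = hampel_outliers(energies)
print(sigma_flags)
print(hampel_flags)
```

On this toy data the two loud frames inflate the mean and standard deviation enough that the 3σ rule flags nothing, while the median-based Hampel identifier flags exactly those two frames.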
Abstract: In this paper, we propose an improved 3D star skeleton
technique, which is a skeletonization suitable for human posture representation
and reflects the 3D information of human posture.
Moreover, the proposed technique is simple and can therefore be performed
in real time. The existing skeleton construction techniques, such as
distance transformation, Voronoi diagrams, and thinning, focus on the
precision of skeleton information. Therefore, those techniques are not
applicable to real-time posture recognition, since they are computationally
expensive and highly susceptible to boundary noise. Although
a 2D star skeleton was proposed to address these problems,
it is limited in its ability to describe the 3D information of the
posture. To represent human posture effectively, the constructed skeleton
should consider the 3D information of the posture. The proposed 3D
star skeleton contains 3D data of the human body, and focuses on human action
and posture recognition. Our 3D star skeleton uses 8 projection
maps that hold 2D silhouette information and depth data of the human
surface. The extremal points can be extracted as the features of the 3D
star skeleton without searching the whole boundary of the object. Therefore,
in terms of execution time, our 3D star skeleton is faster than the "greedy" 3D
star skeleton, which uses all the boundary points on the surface. Moreover,
our method can offer a more accurate posture skeleton than the
existing star skeleton, since the 3D data of the object is taken into account.
Additionally, we build a codebook, a collection of representative 3D
star skeletons for 7 postures, to recognize which posture a constructed
skeleton represents.
Abstract: The paper attempts to elucidate the columnar structure
of the cortex by answering the following questions. (1) Why do
cortical neurons with similar interests tend to be vertically arrayed,
forming what are known as cortical columns? (2) How can the
cortex as a whole be described in concise mathematical terms? (3) How can
efficient digital models of the cortex be designed?
Abstract: In this article, we present our research work in
human-machine interaction. The research consists of manipulating
the workspace with the eyes. We present some of our results, in particular
the detection of the eyes and the recognition of mouse actions. As a result, a
handicapped user is able to interact with the machine in a more
intuitive way in diverse applications and contexts. To test our
application, we have chosen to work in real time on videos captured
by a camera placed in front of the user.
Abstract: In this paper, we present a new method for
incorporating global shift invariance in support vector machines.
Unlike other approaches which incorporate a feature extraction stage,
we first scale the image and then classify it by using the modified
support vector machines classifier. Shift invariance is achieved by
replacing dot products between patterns used by the SVM classifier
with the maximum cross-correlation value between them. Unlike the
normal approach, in which the patterns are treated as vectors, in our
approach the patterns are treated as matrices (or images). Cross-correlation
is computed by using computationally efficient
techniques such as the fast Fourier transform. The method has been
tested on the ORL face database. The tests indicate that this method
can improve the recognition rate of an SVM classifier.
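The idea of replacing the dot product with the maximum cross-correlation value can be sketched as follows. Several assumptions apply: shifts are circular (wrap-around), the correlation is computed by direct summation rather than by the fast Fourier transform mentioned in the abstract (the FFT would compute the same values faster), and the function names are hypothetical. No SVM training is shown.

```python
def max_cross_correlation(a, b):
    """Maximum circular cross-correlation between two equally sized 2D patterns."""
    rows, cols = len(a), len(a[0])
    best = float("-inf")
    for dr in range(rows):
        for dc in range(cols):
            # Dot product of a with b shifted by (dr, dc), wrapping at the borders.
            total = sum(
                a[r][c] * b[(r + dr) % rows][(c + dc) % cols]
                for r in range(rows)
                for c in range(cols)
            )
            best = max(best, total)
    return best


def shift(img, dr, dc):
    """Circularly shift a 2D pattern down by dr rows and right by dc columns."""
    rows, cols = len(img), len(img[0])
    return [
        [img[(r - dr) % rows][(c - dc) % cols] for c in range(cols)]
        for r in range(rows)
    ]


image = [
    [0, 0, 0, 0],
    [0, 5, 1, 0],
    [0, 2, 3, 0],
    [0, 0, 0, 0],
]
moved = shift(image, 1, 2)
plain_dot = sum(x * y for ra, rb in zip(image, moved) for x, y in zip(ra, rb))
kernel_same = max_cross_correlation(image, image)
kernel_moved = max_cross_correlation(image, moved)
print(plain_dot, kernel_same, kernel_moved)
```

The ordinary dot product between the pattern and its shifted copy drops to zero here, while the maximum cross-correlation is unchanged by the shift, which is the invariance the method relies on.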
Abstract: In order to enhance the usability of the human-computer interface (HCI) on touchscreens, this study explored the optimal tactile depth and the effect of visual cues on the user's tendency to touch touchscreen icons. The experimental program was implemented on a touchscreen. Results indicated that the ratio of the icon size to the tactile depth was 1:0.106. The tactile feedback depth differed significantly between experienced users and novices (p < 0.01). In addition, the results showed that the visual cues provided feedback that helped guide the user to touch icons accurately and increased the capture efficiency for a tactile recognition field. This tactile recognition field was 18.6 mm in length. Experienced users and novices behaved consistently under the visual cue effects. Finally, the study developed an applied design with touch feedback for touchscreen icons.
Abstract: In this work, the possibility of constructing
classifiers for face-recognition systems based on conjugation criteria
is investigated. The linkage between bipartite conjugation,
conjugation with a subspace and conjugation with the null-space
is shown. A unified decision rule is investigated. It decides
on the assignment of a face to a class by considering the linkage
between conjugation values. The described recognition method can
be successfully applied to distributed systems for video control
and video surveillance.
Abstract: Hypernetworks are a generalized graph structure
representing higher-order interactions between variables. We present a
method for self-organizing hypernetworks to learn an associative
memory of sentences and to recall the sentences from this memory.
This learning method is inspired by the "mental chemistry" model of
cognition and the "molecular self-assembly" technology in
biochemistry. Simulation experiments are performed on a corpus of
natural-language dialogues of approximately 300K sentences
collected from TV drama captions. We report on the sentence
completion performance as a function of the order of word-interaction
and the size of the learning corpus, and discuss the plausibility of this
architecture as a cognitive model of language learning and memory.
Abstract: We provide a supervised speech-independent voice recognition technique in this paper. In the feature extraction stage we propose a mel-cepstral based approach. Our feature vector classification method uses a special nonlinear metric, derived from the Hausdorff distance for sets, and a minimum mean distance classifier.
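A set-to-set distance combined with a minimum mean distance classifier can be sketched as below. The sketch uses the plain Hausdorff distance with a Euclidean ground metric, not the special nonlinear metric the abstract derives from it, and the two-dimensional "feature vectors", speaker names, and function names are hypothetical stand-ins for mel-cepstral features.

```python
import math


def directed_hausdorff(set_a, set_b):
    """Largest distance from a point of set_a to its nearest point in set_b."""
    return max(min(math.dist(a, b) for b in set_b) for a in set_a)


def hausdorff(set_a, set_b):
    """Symmetric Hausdorff distance between two finite sets of vectors."""
    return max(directed_hausdorff(set_a, set_b), directed_hausdorff(set_b, set_a))


def classify(sample, training):
    """Pick the speaker whose reference sets have the smallest mean distance to sample."""
    def mean_distance(speaker):
        refs = training[speaker]
        return sum(hausdorff(sample, ref) for ref in refs) / len(refs)
    return min(training, key=mean_distance)


# Hypothetical training data: each speaker has two utterances, each a set of vectors.
training = {
    "alice": [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.2), (1.0, 0.1)]],
    "bob": [[(5.0, 5.0), (6.0, 5.0)], [(5.0, 5.5), (6.0, 6.0)]],
}
sample = [(0.1, 0.0), (0.9, 0.1)]
speaker = classify(sample, training)
print(speaker)
```

Treating each utterance as an unordered set of vectors means utterances of different lengths can be compared directly, without aligning them frame by frame.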
Abstract: Automatic Vehicle Identification (AVI) has many
applications in traffic systems (highway electronic toll collection, red
light violation enforcement, border and customs checkpoints, etc.).
License Plate Recognition is an effective form of AVI. In
this study, a smart and simple algorithm is presented for a vehicle
license plate recognition system. The proposed algorithm consists of
three major parts: extraction of the plate region, segmentation of
characters and recognition of plate characters. For extracting the
plate region, edge detection algorithms and smearing algorithms are
used. In the segmentation part, smearing algorithms, filtering and some
morphological algorithms are used. Finally, statistics-based
template matching is used for recognition of plate characters. The
performance of the proposed algorithm has been tested on real
images. Based on the experimental results, we noted that our
algorithm shows superior performance in car license plate
recognition.
Abstract: The fuzzy fingerprint vault is a recently developed cryptographic construct based on the polynomial reconstruction problem that secures critical data with fingerprint data. However, previous approaches are not applicable to fingerprints having only a few minutiae, since they use a fixed polynomial degree without considering the number of fingerprint minutiae. To solve this problem, we use an adaptive polynomial degree that takes into account the number of minutiae extracted from each user. Also, we apply multiple polynomials to avoid the possible degradation of security in a simple solution (i.e., using a low-degree polynomial). Based on the experimental results, our method makes a possible attack 2192 times more difficult than using a low-degree polynomial, and can also verify users having only a few minutiae.
Abstract: This article presents a simple way to implement programmed voice commands for interfacing with commercial digital and analogue input/output PCI cards used in robotics and automation applications. Robots and automation equipment can "listen" to voice commands and perform several different tasks, approaching human behavior and improving the human-machine interfaces for the automation industry. Since most PCI digital and analogue input/output cards are sold with several DLLs included (for use with different programming languages), it is possible to add speech recognition capability using a standard speech recognition engine compatible with the programming languages used. In this work, a Visual Basic 6 (the world's most popular language) application was created that listens to several voice commands and is capable of communicating directly with several standard 128 digital I/O PCI cards, used to control complete automation systems with up to (number of boards used) x 128 sensors and/or actuators.