Abstract: The inherent skin patterns formed at the joints on the
outer surface of the finger are referred to as the finger knuckle-print.
It can be used to identify a person uniquely because the finger
knuckle-print is rich in texture. In a biometric system, the region of
interest is the input to the feature extraction algorithm. In this paper,
local and global features are extracted separately. The Fast Discrete
Orthonormal Stockwell Transform is used to extract the local
features. The global feature is obtained by increasing the size of the
Fast Discrete Orthonormal Stockwell Transform to infinity. The two
features are fused to increase the recognition accuracy. A matching
distance is calculated for each feature individually. The two distances
are then merged to obtain the final matching distance. The
proposed scheme gives better performance in terms of equal error
rate and correct recognition rate.
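The merging of the two matching distances can be sketched as follows. The abstract does not state the fusion rule, so the weighted sum, the weight `w`, and the function name `fuse_distances` are assumptions made purely for illustration.

```python
def fuse_distances(d_local: float, d_global: float, w: float = 0.5) -> float:
    """Merge two matching distances with a weighted sum.

    The weighted-sum rule and the default weight are assumptions;
    the abstract only says that the two distances are merged.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * d_local + (1.0 - w) * d_global
```

With `w = 1.0` only the local distance is used, and with `w = 0.0` only the global one; in practice such a weight would be tuned on held-out data.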
Abstract: The 3D body movement signals captured during
human-human conversation include clues not only to the content of
people’s communication but also to their culture and personality.
This paper is concerned with automatic extraction of this information
from body movement signals. For the purpose of this research, we
collected a novel corpus from 27 subjects and arranged them into groups
according to their culture. We arranged each group into pairs, and
each pair conversed about different topics.
A state-of-the-art recognition system is applied to the problems of
person, culture, and topic recognition. We borrowed modeling,
classification, and normalization techniques from speech recognition.
We used Gaussian Mixture Modeling (GMM) as the main technique
for building our three systems, obtaining 77.78%, 55.47%, and
39.06% accuracy from the person, culture, and topic recognition systems
respectively. In addition, we combined the above GMM systems with
Support Vector Machines (SVM) to obtain 85.42%, 62.50%, and
40.63% accuracy for person, culture, and topic recognition
respectively.
Although direct comparison among these three recognition
systems is difficult, it seems that our person recognition system
performs best for both GMM and GMM-SVM, suggesting that inter-subject
differences (i.e. the subjects’ personality traits) are a major
source of variation. When removing these traits from culture and
topic recognition systems using the Nuisance Attribute Projection
(NAP) and the Intersession Variability Compensation (ISVC)
techniques, we obtained 73.44% and 46.09% accuracy from culture
and topic recognition systems respectively.
Abstract: The study of the electrical signals produced by the neural
activity of the human brain is called electroencephalography (EEG). In this
paper, we propose an automatic and efficient EEG signal
classification approach. The proposed approach is used to classify the
EEG signal into two classes: epileptic seizure or not. In the proposed
approach, we start with extracting the features by applying Discrete
Wavelet Transform (DWT) in order to decompose the EEG signals
into sub-bands. These features, extracted from details and
approximation coefficients of DWT sub-bands, are used as input to
Principal Component Analysis (PCA). Classification is based on
reducing the feature dimension using PCA and deriving the support vectors
using a Support Vector Machine (SVM). The experiments are
performed on a real, standard dataset. A very high
classification accuracy is obtained.
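The sub-band decomposition step can be illustrated with a minimal sketch. The abstract does not name the wavelet or the statistics taken from each sub-band, so the Haar wavelet, the two features per sub-band (mean absolute value and energy), and the function names are assumptions. The PCA and SVM stages are not shown.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def subband_features(signal, levels=3):
    """Mean absolute value and energy per sub-band (assumed feature choice).

    Order: detail level 1, detail level 2, ..., final approximation.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features += [sum(abs(c) for c in detail) / len(detail),
                     sum(c * c for c in detail)]
    features += [sum(abs(c) for c in approx) / len(approx),
                 sum(c * c for c in approx)]
    return features
```

Because the Haar transform used here is orthonormal, the sub-band energies add up to the energy of the input segment, which is a convenient sanity check on the decomposition.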
Abstract: Speech enhancement is a long-standing problem with
numerous applications like teleconferencing, VoIP, hearing aids and
speech recognition. The motivation behind this research work is to
obtain a clean speech signal of higher quality by applying the optimal
noise cancellation technique. Real-time adaptive filtering algorithms
seem to be the best candidates among all categories of speech
enhancement methods. In this paper, we propose a speech
enhancement method based on Recursive Least Squares (RLS)
adaptive filtering of speech signals. Experiments were performed on
noisy data prepared by adding AWGN, babble, and pink
noise to clean speech samples at -5 dB, 0 dB, 5 dB, and 10 dB SNR
levels. We then compare the noise cancellation performance of the
proposed RLS algorithm with that of the existing NLMS algorithm in
terms of Mean Squared Error (MSE), Signal-to-Noise Ratio (SNR), and
SNR loss. Based on the performance evaluation, the proposed RLS
algorithm was found to be the better noise cancellation
technique for speech signals.
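For orientation, the textbook RLS update can be sketched in a few lines. This is a generic illustration, not the authors' implementation: the filter order, forgetting factor `lam`, regularization `delta`, and the toy system-identification demo are all assumptions.

```python
import random


def rls_filter(reference, desired, order=2, lam=0.99, delta=0.01):
    """Run the standard RLS recursion; return (a priori errors, final weights)."""
    w = [0.0] * order
    # Inverse correlation matrix, initialised to I / delta.
    P = [[(1.0 / delta if i == j else 0.0) for j in range(order)]
         for i in range(order)]
    errors = []
    for n in range(len(desired)):
        # Most recent `order` reference samples, newest first, zero-padded.
        u = [reference[n - k] if n - k >= 0 else 0.0 for k in range(order)]
        Pu = [sum(P[i][j] * u[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(u[i] * Pu[i] for i in range(order))
        gain = [v / denom for v in Pu]
        e = desired[n] - sum(w[i] * u[i] for i in range(order))
        w = [w[i] + gain[i] * e for i in range(order)]
        # P is symmetric, so u^T P equals (P u)^T.
        P = [[(P[i][j] - gain[i] * Pu[j]) / lam for j in range(order)]
             for i in range(order)]
        errors.append(e)
    return errors, w


def identify_demo(n=300, seed=0):
    """Toy check: recover a known 2-tap filter from noise-free data."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    h = [0.5, -0.3]
    d = [h[0] * x[i] + (h[1] * x[i - 1] if i > 0 else 0.0) for i in range(n)]
    return rls_filter(x, d, order=2)[1]
```

In a noise cancellation setting, `reference` would typically be a noise reference and `desired` the noisy speech, so that the error sequence serves as the enhanced signal; whether the paper uses exactly this configuration is not stated in the abstract.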
Abstract: The increasing demand for gallium, indium, and
rare-earth elements for the production of electronics, e.g. solid-state
lighting, photovoltaics, integrated circuits, and liquid crystal
displays, will exceed the worldwide supply according to current
forecasts. Recycling systems to reclaim these materials are not yet in
place, which challenges the sustainability of these technologies. This
paper proposes a multispectral imaging system as a basis for a
vision-based recognition system for valuable components of electronics
waste. Multispectral imaging is intended to enhance the contrast of
images of printed circuit boards (single components as well as labels)
for further analysis, such as optical character recognition and
recognition of entire printed circuit boards. The results show that
higher contrast is achieved in the near infrared than in ultraviolet and
visible light.
Abstract: Iris codes contain bits with different entropy. This
work investigates different strategies to reduce the size of iris
code templates with the aim of reducing storage requirements and
computational demand in the matching process. In addition to simple
subsampling schemes, a binary multi-resolution representation as
used in the JBIG hierarchical coding mode is assessed. We find that
iris code template size can be reduced significantly while maintaining
recognition accuracy. We also propose a two-stage identification
approach that uses small iris code templates in a pre-selection
stage and full-resolution templates for final identification, and
shows promising recognition behaviour.
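The two-stage idea can be sketched with plain bit lists. This is a simplified illustration under several assumptions: regular subsampling with a fixed `step`, a fixed shortlist size, and no occlusion masks or rotation compensation, all of which a real iris matcher would need. The function names are hypothetical.

```python
def hamming(a, b):
    """Fractional Hamming distance between equal-length bit sequences."""
    if len(a) != len(b):
        raise ValueError("templates must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def subsample(code, step):
    """Keep every `step`-th bit of an iris code."""
    return code[::step]


def identify(probe, gallery, step=4, shortlist=2):
    """Two-stage search: rank by subsampled codes, then rescore the shortlist.

    `gallery` maps an identity label to its full-resolution bit list.
    """
    small_probe = subsample(probe, step)
    ranked = sorted(
        gallery,
        key=lambda name: hamming(small_probe, subsample(gallery[name], step)),
    )
    candidates = ranked[:shortlist]
    return min(candidates, key=lambda name: hamming(probe, gallery[name]))
```

The pre-selection stage compares only `1/step` of the bits per gallery entry, and the full-resolution comparison runs on the shortlist alone, which is where the savings in matching cost would come from.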
Abstract: In this study, the data loss tolerance of a Support Vector Machine (SVM) based activity recognition model and its multi-activity classification performance when data are received over a lossy wireless sensor network are examined. First, the classification algorithm is evaluated in terms of resilience to random data loss using 3D acceleration sensor data for sitting, lying, walking, and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Second, the effect of differentiated quality-of-service performance on activity recognition success is measured with activity data acquired from a multi-hop wireless sensor network, which introduces high data loss. The effect of the number of nodes on reliability and multi-activity classification success is demonstrated in a simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on the activity detection success rate of an SVM-based classification algorithm has not been studied before.
Abstract: In this study, we propose a novel technique for acoustic
echo suppression (AES) during speech recognition under barge-in
conditions. Conventional AES methods based on spectral subtraction
apply fixed weights to the estimated echo path transfer function
(EPTF) at the current signal segment and to the EPTF estimated until
the previous time interval. However, the effects of echo path changes
should be considered for eliminating the undesired echoes. We
describe a new approach that adaptively updates weight parameters in
response to abrupt changes in the acoustic environment due to
background noise or double-talk. Furthermore, we devised a voice
activity detector and an initial time-delay estimator for barge-in speech
recognition in communication networks. The initial time delay is
estimated using a log-spectral distance measure as well as
cross-correlation coefficients. The experimental results show that the
developed techniques can be successfully applied in barge-in speech
recognition systems.
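The cross-correlation part of the initial time-delay estimate can be sketched as below. This covers only a plain, unnormalized cross-correlation over non-negative lags; the log-spectral distance measure mentioned in the abstract is not shown, and the function name and `max_lag` parameter are assumptions.

```python
def estimate_delay(reference, observed, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    `reference` is the far-end signal and `observed` the signal in which
    a delayed copy of it is expected to appear.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(
            reference[i] * observed[i + lag]
            for i in range(len(reference))
            if i + lag < len(observed)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

For example, if `observed` is `reference` preceded by three zero samples, the function returns a lag of 3.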
Abstract: In this paper, we present a robust algorithm to recognize
text extracted from grocery product images captured by mobile phone
cameras. Recognition of such text is challenging since text in grocery
product images varies in size, orientation, style, and illumination,
and can suffer from perspective distortion.
Pre-processing is performed to make the characters scale and
rotation invariant. Since text degradations cannot be appropriately
defined using well-known geometric transformations such
as translation, rotation, affine transformation, and shearing, we
use the black pixels of the whole character as our feature vector.
Classification is performed with a minimum distance classifier
using the maximum likelihood criterion, which delivers a
promising Character Recognition Rate (CRR) of 89%. We
achieve a considerably higher Word Recognition Rate (WRR) of
99% when using lower-level linguistic knowledge about product
words during the recognition process.
Abstract: The paper describes a Chinese shadow play animation
system based on Kinect. Users, without any professional training, can
manipulate the shadow characters with their body movements to
complete a shadow play performance and, if they wish, obtain a video
of it by giving the record command to the system. In our
system, Kinect is responsible for capturing human movement and
voice command data. A gesture recognition module is used to control
the change of shadow play scenes. After packaging the data from
Kinect and the recognition result from the gesture recognition module,
VRPN transmits them to the server side. Finally, the server side uses
the information to control the motion of the shadow characters and
video recording. This system not only achieves human-computer
interaction but also enables interaction between people. It brings an
entertaining experience to users and is easy to operate for all ages.
More importantly, its application to Chinese shadow
play helps protect the art of shadow play animation.
Abstract: In this paper, a hybrid modeling algorithm based on Fuzzy
C-Means clustering with an Expectation Maximization-Gaussian
Mixture Model is proposed for continuous Tamil speech
recognition. Speech sentences from various speakers are used
for the training and testing phases, and objective measures are compared
between the proposed and existing continuous speech recognition
algorithms. From the simulated results, it is observed that the proposed
algorithm improves the recognition accuracy and F-measure by up to 3%
compared to the existing algorithms for speech signals from
various speakers. In addition, it reduces the Word Error Rate, Error
Rate and Error by up to 4% compared to the existing
algorithms. In all aspects, the proposed hybrid modeling for Tamil
speech recognition provides significant improvements for
speech-to-text conversion in various applications.
Abstract: The Smart Help for persons with disability (PWD) is
part of the project SMARTDISABLE, which aims to develop relevant
solutions that provide an adequate workplace
environment for PWD. It would support the needs of PWD through a
smart help that allows them to access relevant information and
communicate with others effectively and flexibly, and a smart editor
that assists them in their daily work. It will assist PWD in knowledge
processing and creation and help them be productive at the
workplace. The technical work of the project involves design of a
technological scenario for Ambient Intelligence (AmI) based
assistive technologies at the workplace, consisting of an integrated
universal smart solution that suits many different impairment
conditions and is designed to empower physically disabled
persons (PDP) with the capability to access and effectively use
ICTs in order to execute knowledge-rich working tasks with
minimum effort and a sufficient level of comfort. The proposed
technology solution for PWD will support voice recognition along
with a normal keyboard and mouse to control the smart help and smart
editor, with a dynamic auto-display interface that satisfies the
requirements of different PWD groups. In addition, a smart help will
provide intelligent intervention based on the behavior of PWD to
guide them and warn them about possible misbehavior. PWD can
communicate with others using Voice over IP controlled by voice
recognition. Moreover, Auto Emergency Help Response would be
supported to assist PWD in case of emergency. The proposed
technology solution is intended to make PWD effective and flexible
in the work environment by letting them use their voice to conduct
their tasks. The proposed solution aims to provide
favorable outcomes that assist PWD at the workplace, along with the
opportunity to participate in the PWD assistive technology innovation
market, which is still small but rapidly growing, and to raise
their quality of life at the workplace to a level comparable to that of
their colleagues. Finally, the proposed smart help solution is applicable
in all workplace settings, including offices, manufacturing, and hospitals.
Abstract: The paper presents combined automatic speech
recognition (ASR) of English and machine translation (MT) for
the English-Croatian and Croatian-English language pairs in the
domain of business correspondence. The first part presents results of
training a commercial ASR system on English data sets, enriched
by error analysis. The second part presents results of machine
translation performed by a free online tool for the English-Croatian
and Croatian-English language pairs. Human evaluation in terms of
usability is conducted, and internal consistency is calculated by
Cronbach's alpha coefficient, enriched by error analysis. Automatic
evaluation is performed by WER (Word Error Rate) and PER
(Position-independent word Error Rate) metrics, followed by
investigation of Pearson’s correlation with human evaluation.
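The two automatic metrics can be sketched as follows. WER is the word-level edit distance divided by the reference length. For PER there are several variants in the literature; the bag-of-words form used here is one common formulation and is an assumption, as the abstract does not give the exact definition. Whitespace tokenization is likewise an assumption.

```python
from collections import Counter


def wer(reference, hypothesis):
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def per(reference, hypothesis):
    """Position-independent error rate (bag-of-words variant, assumed)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    # Multiset intersection counts words shared regardless of position.
    matches = sum((Counter(ref) & Counter(hyp)).values())
    return (max(len(ref), len(hyp)) - matches) / len(ref)
```

Because PER ignores word order, a hypothesis that contains the right words in the wrong order is penalized by WER but not by PER, so the gap between the two gives a rough indication of reordering errors.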
Abstract: This paper presents general results on the Java source
code snippet detection problem. We propose a tool that uses
graph and subgraph isomorphism detection. A number of solutions
for these tasks have been proposed in the literature. However,
although these solutions are very fast, they compare only
constant static trees. Our solution allows an input sample to be entered
dynamically with the Scripthon language while preserving an
acceptable speed. We used several optimizations to achieve a very low
number of comparisons in the matching algorithm.
Abstract: The paper presents the results of clustering with
Kohonen self-organizing maps (SOM) applied to the analysis of an array
of Raman spectra of multi-component solutions of inorganic salts, in
order to determine the types of salts present in the solution. It is
demonstrated that the use of SOM is a promising method for solving
clustering and classification problems in the spectroscopy of
multi-component objects, as attributing a pattern to a cluster may be
used to recognize the component composition of the object.
Abstract: The article is devoted to the problem of political
discourse and its reflection in mass cognition. It
describes myth as one of the main features of political
discourse. The dominance of the expressive and emotional
component in myth is shown. Precedent phenomena play an
important role in distinguishing myth from the linguistic point of
view. Precedent phenomena, which are characterized by their fame
and recognition, reflect linguistic cognition. Four types of myths
are observed: master myths, foundation myths, sustaining myths, and
eschatological myths. The myths about the national idea
are characterized by national specificity. The main aim of
political discourse that employs myths is to influence mass
consciousness in order to motivate the addressee to certain actions so
that the target purpose is reached through unity of forces.
Abstract: One of the major goals of Spoken Dialog Systems
(SDS) is to understand what the user utters.
In the SDS domain, the Spoken Language Understanding (SLU)
module classifies user utterances by means of predefined
conceptual knowledge. The SLU module is able to recognize only the
meanings previously included in its knowledge base. Due to the vastness
of that knowledge, storing the information is a very expensive
process.
Updating and managing the knowledge base are time-consuming
and error-prone processes because of the rapidly growing number of
entities like proper nouns and domain-specific nouns. This paper
proposes a solution to the problem of Named Entity Recognition
(NER) applied to an SDS domain. The proposed solution attempts to
automatically recognize the meaning associated with an utterance by
using the PANKOW (Pattern based Annotation through Knowledge
On the Web) method at runtime.
The proposed method extracts information from the Web to
extend the SLU knowledge module and reduce the development
effort. In particular, the Google Search Engine is used to extract
information from the Facebook social network.
Abstract: In this paper, the issue of dimensionality reduction in
finger vein recognition systems is investigated using kernel Principal
Component Analysis (KPCA). One aspect of KPCA is finding the
most appropriate kernel function for finger vein recognition, as there
are several kernel functions which can be used within PCA-based
algorithms. In this paper, however, another side of PCA-based
algorithms, particularly KPCA, is investigated. The
dimension of the feature vector in PCA-based algorithms is
important, especially in real-world applications
of such algorithms. A fixed dimension of the
feature vector has to be set to reduce the dimension of the input and
output data and extract the features from them. A classifier is then
applied to classify the data and make the final decision. We
analyze KPCA (polynomial, Gaussian, and Laplacian kernels) in detail in
this paper and investigate the optimal feature extraction dimension in
finger vein recognition using KPCA.
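The three kernel families named above can be written down in their usual textbook forms. The parameterization is an assumption (the abstract gives none): `degree` and `c` for the polynomial kernel, `sigma` for the other two, and the Euclidean norm inside the Laplacian kernel, which some libraries replace with the L1 norm. The eigen-decomposition step of KPCA is not shown.

```python
import math


def polynomial_kernel(x, y, degree=2, c=1.0):
    """k(x, y) = (x . y + c) ** degree"""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree


def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))"""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def laplacian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y|| / sigma), with the Euclidean norm (assumed)."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-dist / sigma)


def gram_matrix(samples, kernel):
    """Kernel (Gram) matrix K[i][j] = kernel(samples[i], samples[j])."""
    return [[kernel(a, b) for b in samples] for a in samples]
```

KPCA would then center this Gram matrix and keep its leading eigenvectors; the number of eigenvectors kept is the feature dimension whose choice the abstract investigates.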
Abstract: In the past, the most widely adopted light
source was the incandescent light bulb, but with the appearance of LED
light sources, traditional light sources have gradually been replaced by
LEDs because of their numerous superior characteristics. However,
many of the standards do not apply to LEDs, as the two light sources
have different characteristics. This increases the significance of
studies on LEDs. As a Kansei design study investigating the visual
glare produced by traffic arrows implemented with LEDs, this study
conducted a semantic analysis of the styles of traffic arrows used
domestically and internationally. The results can help
reduce driver misrecognition that results in failure to arrive
at the destination or in traffic accidents. This study started with a
literature review and a survey of the status quo before conducting
experiments that were divided into two parts. The first part involved a
screening experiment of arrow samples, where cluster analysis was
conducted to choose five representative samples of LED displays. The
second part was a semantic experiment on the display of arrows using
LEDs, where the five representative samples and the selected ten
adjectives were incorporated. Analysis of the results with
Quantification Theory Type I showed that, among the
components of arrows, fletching was the most significant factor that
influenced the adjectives. A “no fletching” design was
more abstract and vague. It lacked the ability to convey the intended
message and might bear negative psychological connotations including
“dangerous,” “forbidden,” and “unreliable.” The arrow design
consisting of “> shaped fletching” was found to be more concrete and
definite, showing positive connotations including “safe,” “cautious,”
and “reliable.” When a stimulus was placed at a farther distance, the
glare could be significantly reduced; moreover, the visual evaluation
scores would be higher. In contrast, if the fletching and the shaft
had a similar proportion, the stimuli received higher
evaluations at a closer distance. The above results can be
applied to the design of traffic arrows that convey information
definitely and rapidly. In addition, drivers’ safety could be enhanced
by understanding the cause of glare and improving visual
recognizability.
Abstract: In this paper, we propose a method that allows faster and more accurate detection of traffic lights by a vision sensor during driving. DGPS is used to obtain the physical location of a traffic light; only the traffic light area at this location is then extracted from the vision sensor image to ascertain whether the signal is in operation and to determine its form. This method can solve the problem in existing research where low visibility at night or reflection under bright light makes it difficult to recognize the form of a traffic light, thus making driving unstable. We compared our success rate of traffic light recognition in day and night road environments. Compared to previous research, it showed similar performance during the day but a 50% improvement at night.