Abstract: This study proposes an emotional speech recognition system
for smartphone applications that can be combined with 3G mobile
communications and social networks to provide users and their groups
with more interaction and care. The study developed a mechanism that
uses support vector machines (SVM) to recognize emotions in speech,
namely happiness, anger, sadness and a normal state.
The mechanism uses a hierarchical classifier to adjust the weights of
acoustic features and divides various parameters into the categories of
energy and frequency for training. In this study, 28 commonly used
acoustic features, including pitch and volume, were used for
training. In addition, a time-frequency parameter obtained by
continuous wavelet transforms was also used to identify the accent and
intonation in a sentence during the recognition process. The Berlin
Database of Emotional Speech was used by dividing the speech into
male and female data sets for training. According to the experimental
results, the accuracies on the male and female test sets increased by
4.6% and 5.2%, respectively, when the time-frequency parameter was
used for classifying happy and angry emotions. For the classification of all
emotions, the average accuracy, including male and female data, was
63.5% for the test set and 90.9% for the whole data set.
Abstract: In this paper we report the model adopted by SySRA, our
continuous speech recognition system for the Arabic language, and the
results obtained so far. The system uses the Arabdic-10 database, a
manually segmented corpus of words for the Arabic language. Phonetic
decoding is performed by an expert system whose knowledge base is
expressed as production rules. This expert system transforms a vocal
signal into a phonetic lattice. The higher level of the system
recognizes the lattice thus obtained and renders it as written
sentences (orthographic form). This level begins with the lexical
analyzer, which is none other than the recognition module. We
subjected this analyzer to a set of spectrograms obtained by dictating
twenty sentences in Arabic. The recognition rate for these sentences
is about 70%, which is, to our knowledge, the best result for the
recognition of the Arabic language. The test set consists of twenty
sentences from four speakers who did not take part in the training.
Abstract: In this paper, the main principles of a text-to-speech synthesis system are presented. Problems that arise when developing speech synthesis systems are described. The approaches used and their application in speech synthesis systems for the Azerbaijani language are shown.
Abstract: This paper discusses cued speech recognition methods in
videoconferencing. Cued speech is a specific gesture language that is
used for communication between deaf people. We define the criteria for
sentence intelligibility according to the answers of test subjects
(deaf people). In our tests we use 30 sample videos coded with the
H.264 codec at various bit rates and various speeds of cued speech.
Additionally, we define the criteria for consonant sign
recognizability in the single-handed finger alphabet (dactyl) by
analogy with acoustics. We use another 12 sample videos coded with the
H.264 codec at various bit rates in four different video formats. To
interpret the results we apply the standard scale for subjective video
quality evaluation and the percentage evaluation of intelligibility
used in acoustics. From the results we derive minimum coded bit-rate
recommendations for every spatial resolution.
Abstract: The computer, among the most important inventions of the twentieth century, has become an increasingly important component of our everyday lives. Computer games have also become increasingly popular, owing to their realistic virtual environments, audio and visual features, and the roles they offer players. The present study investigates the metaphors students have for computer games, in an effort to fill a gap in the literature. To determine middle school students’ metaphorical images of the concept ‘computer game’, students were asked to complete the sentence ‘Computer game is like/similar to … because …’. The metaphors created by the students were grouped into six categories, based on the source of the metaphor. Ordered by the number of metaphors they included, these categories were ‘computer game as a means of entertainment’, ‘computer game as a beneficial means’, ‘computer game as a basic need’, ‘computer game as a source of evil’, ‘computer game as a means of withdrawal’, and ‘computer game as a source of addiction’.
Abstract: Because a one-to-one word translator cannot translate the
pragmatic aspects of Javanese, the parallel text alignment model
described here uses a phrase pair combination. The algorithm aligns
the parallel text automatically from the beginning to the end of each
sentence. Even though the phrase pair combination outperforms the
previous algorithm, it is still inefficient: recording all possible
combinations consumes more space in the database and is time
consuming. The original algorithm is modified by applying the edit
distance coefficient to improve data-storage efficiency. As a result,
data-storage consumption is reduced by 90%, and the learning period is
shortened as well (42 s).
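
The abstract does not define its edit distance coefficient, so the following is only a minimal sketch under stated assumptions: it uses the standard Levenshtein distance and normalizes it by the length of the longer string. The names `edit_distance` and `similarity` are illustrative and do not come from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed coefficient in [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest
```

One plausible use, again an assumption, is to skip storing a candidate phrase pair when its similarity to an already stored pair exceeds a chosen threshold, which would reduce the number of recorded combinations.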
Abstract: Correctly classifying the gender of names is an important
task in Korean-English machine translation. When a sentence is
composed of two or more clauses and only one subject is given as a
proper noun, the gender of that proper noun must be found to translate
the sentence correctly, because a singular pronoun has a gender in
English but not in Korean. Thus, in Korean-English machine
translation, the gender of a proper noun should be determined. More
generally, this task can be expanded into the gender classification of
general Korean names. This paper proposes a statistical method for
this problem. By treating a name as just a sequence of syllables,
statistics for each name can be obtained from a collection of names.
An evaluation shows that the proposed method improves accuracy over a
simple lookup in the collection: the accuracy of the lookup method is
64.11%, while that of the proposed method is 81.49%. This implies that
the proposed method is better suited to the gender classification of
Korean names.
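
The abstract does not say which statistic is computed over the syllables. As a hedged illustration only, the sketch below assumes a naive Bayes model with add-one smoothing in which each Hangul character counts as one syllable. The class name, the labels and the four toy names are invented for the example and are not data from the paper.

```python
import math
from collections import Counter, defaultdict


class SyllableGenderModel:
    """Naive Bayes over syllables with add-one smoothing (an assumed model)."""

    def __init__(self):
        self.syllable_counts = defaultdict(Counter)  # label -> syllable counts
        self.label_counts = Counter()                # label -> number of names
        self.vocabulary = set()

    def train(self, names_with_labels):
        for name, label in names_with_labels:
            self.label_counts[label] += 1
            for syllable in name:  # one Hangul character is one syllable
                self.syllable_counts[label][syllable] += 1
                self.vocabulary.add(syllable)

    def log_score(self, name, label):
        total_names = sum(self.label_counts.values())
        score = math.log(self.label_counts[label] / total_names)
        total = sum(self.syllable_counts[label].values())
        vocab_size = len(self.vocabulary) + 1  # extra slot for unseen syllables
        for syllable in name:
            count = self.syllable_counts[label][syllable]
            score += math.log((count + 1) / (total + vocab_size))
        return score

    def classify(self, name):
        return max(self.label_counts,
                   key=lambda label: self.log_score(name, label))


# Toy training collection, invented for illustration only.
toy_names = [("지훈", "M"), ("민준", "M"), ("서연", "F"), ("지영", "F")]
model = SyllableGenderModel()
model.train(toy_names)
```

The point of such a model is that a name absent from the collection, such as "서영" here, can still be scored from its syllables, whereas a plain lookup would return nothing.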
Abstract: Automatic keyphrase extraction is useful in efficiently
locating specific documents in online databases. While several
techniques have been introduced over the years, improvement in
accuracy has been minimal. This research examines attribute scores for
author-supplied keyphrases to better understand how the scores affect
the accuracy rate of automatic keyphrase extraction. Five attributes
are chosen for examination: Term Frequency, First Occurrence, Last
Occurrence, Phrase Position in Sentences, and Term Cohesion
Degree. The results show that First Occurrence is the most reliable
attribute. Term Frequency, Last Occurrence and Term Cohesion
Degree display a wide range of variation but are still usable with
suggested tweaks. Only Phrase Position in Sentences shows a totally
unpredictable pattern. The results imply that the commonly used
ranking approach, which directly takes the top-ranked phrases from
the candidate keyphrase list as the keyphrases, may not be reliable.
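
To make three of the five attributes concrete, here is a minimal sketch that computes Term Frequency, First Occurrence and Last Occurrence for one candidate phrase. The abstract gives no formulas, so the normalization (word offset divided by document length) and the function name `phrase_attributes` are assumptions; Phrase Position in Sentences and Term Cohesion Degree are not covered.

```python
def phrase_attributes(words, phrase):
    """Return term frequency and relative first/last occurrence of a phrase.

    Positions are word offsets divided by the document length (an assumed
    normalization). Returns None if the phrase never occurs.
    """
    n = len(phrase)
    positions = [i for i in range(len(words) - n + 1)
                 if words[i:i + n] == phrase]
    if not positions:
        return None
    return {
        "term_frequency": len(positions),
        "first_occurrence": positions[0] / len(words),
        "last_occurrence": positions[-1] / len(words),
    }


# Invented sample text, for illustration only.
text = ("keyphrase extraction helps search and automatic keyphrase "
        "extraction ranks candidate phrases")
words = text.split()
attrs = phrase_attributes(words, ["keyphrase", "extraction"])
```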
Abstract: This paper introduces an automatic voice classification
system for the diagnosis of individual constitution based on Sasang
Constitutional Medicine (SCM) in Traditional Korean Medicine
(TKM). To develop this algorithm, we used the voices of
309 female speakers and extracted a total of 134 speech features from
the voice data consisting of 5 sustained vowels and one sentence. The
classification system, based on a rule-based algorithm derived from
a nonparametric statistical method, produces three types of decisions:
reserved, positive and negative. In conclusion, 71.5% of the
voice data were diagnosed by this system, of which 47.7% were
correct positive decisions and 69.7% were correct negative decisions.
Abstract: The volume of information keeps increasing, and companies are so overloaded with information that they may lose track of the information they actually need. Scanning through each lengthy document is time consuming, so most information seekers prefer a shorter version of the document that contains only the gist. In this paper, we therefore implement a text summarization system that produces a summary containing the gist of oil and gas news articles. The summaries are intended to provide oil and gas companies with important information for monitoring their competitors' behaviour and thereby help them formulate business strategies. The system integrates a statistical approach with three underlying concepts: keyword occurrences, the title of the news article and the location of the sentence. The generated summaries were compared with human-generated summaries from an oil and gas company. Precision and recall are used to evaluate the accuracy of the generated summaries. Based on the experimental results, the system is able to produce an effective summary with an average recall of 83% at a compression rate of 25%.
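
The three concepts named above (keyword occurrences, article title, sentence location) can be sketched as a simple extractive scorer. This is an illustration under assumptions rather than the system's actual method: the equal weighting, the `1 / (index + 1)` location score, the keyword cut-off and the function names are all invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def summarize(title, sentences, compression=0.25, top_keywords=5):
    """Pick the highest-scoring sentences and return them in original order."""
    # Keywords: the most frequent words longer than three letters (assumed).
    counts = Counter(w for s in sentences for w in tokenize(s) if len(w) > 3)
    keywords = {w for w, _ in counts.most_common(top_keywords)}
    title_words = set(tokenize(title))

    scores = []
    for index, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        keyword_score = sum(1 for t in tokens if t in keywords)
        title_score = sum(1 for t in set(tokens) if t in title_words)
        location_score = 1.0 / (index + 1)  # earlier sentences score higher
        scores.append(keyword_score + title_score + location_score)

    keep = max(1, math.ceil(len(sentences) * compression))
    best = sorted(range(len(sentences)),
                  key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(best)]
```

With `compression=0.25`, a four-sentence article is reduced to its single best-scoring sentence, which mirrors the 25% compression rate reported above.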
Abstract: Hypernetworks are a generalized graph structure
representing higher-order interactions between variables. We present a
method for self-organizing hypernetworks to learn an associative
memory of sentences and to recall the sentences from this memory.
This learning method is inspired by the “mental chemistry” model of
cognition and the “molecular self-assembly” technology in
biochemistry. Simulation experiments are performed on a corpus of
natural-language dialogues of approximately 300K sentences
collected from TV drama captions. We report on the sentence
completion performance as a function of the order of word-interaction
and the size of the learning corpus, and discuss the plausibility of this
architecture as a cognitive model of language learning and memory.
Abstract: This paper discusses a method of modeling text for Polish.
The method aims to transform continuous input text into a text
consisting of sentences in so-called canonical form, which is
characterized by, among other things, a complete structure and the
absence of anaphora and ellipsis. The transformation is lossless with
respect to the content of the text being transformed. The modeling
method was developed for the Thetos system, which translates written
Polish texts into Polish Sign Language. We believe that the method
can also be used in various applications that deal with natural
language, e.g. in a text summary generator for Polish.
Abstract: This paper presents EART (Extract Association Rules from
Text), a system for discovering association rules from collections of
unstructured documents. The EART system handles text only, not images
or figures. EART discovers association rules amongst the keywords
labeling the collection of textual documents. The main characteristic
of EART is that the system integrates XML technology (to transform
unstructured documents into structured documents) with an Information
Retrieval scheme (TF-IDF) and a Data Mining technique for association
rule extraction. EART relies on word features to extract association
rules. It consists of four phases: a structure phase, an index phase,
a text mining phase and a visualization phase. Our work analyzes the
keywords in the extracted association rules in terms of the
co-occurrence of the keywords in one sentence in the original text
and the existence of the keywords in one sentence without
co-occurrence. Experiments were carried out on a collection of
scientific documents selected from MEDLINE that are related to the
outbreak of the H5N1 avian influenza virus.
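
As a rough illustration of the text mining phase, the sketch below derives pairwise association rules from per-document keyword sets using the usual support and confidence measures. It assumes that keyword selection (for example by TF-IDF) has already happened. The thresholds, the function name and the four toy keyword sets are invented for the example and are not taken from EART or from MEDLINE.

```python
from itertools import combinations


def association_rules(keyword_sets, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for keyword pairs."""
    total = len(keyword_sets)

    def support(items):
        # Fraction of documents whose keyword set contains all the items.
        return sum(1 for ks in keyword_sets if items <= ks) / total

    vocabulary = sorted(set().union(*keyword_sets))
    rules = []
    for a, b in combinations(vocabulary, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Toy keyword sets, invented for illustration only.
docs = [
    {"h5n1", "outbreak", "poultry"},
    {"h5n1", "outbreak"},
    {"h5n1", "vaccine"},
    {"outbreak", "poultry"},
]
rules = association_rules(docs)
```

On this toy input the only rule that passes both thresholds is `poultry -> outbreak`, with support 0.5 and confidence 1.0.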
Abstract: In this paper we present a computational model for pronominal anaphora resolution in Turkish. The model is based on Hobbs’ Naïve Algorithm [4, 5, 6], which exploits only the surface syntax of sentences in a given text.
Abstract: There are multiple reasons to expect that detecting the
word order errors in a text will be a difficult problem, and detection
rates reported in the literature are in fact low. Although grammatical
rules constructed by computational linguists improve the performance of
grammar checkers in word order diagnosis, the repair task is still
very difficult. This paper presents an approach for repairing word
order errors in English text by reordering words in a sentence and
choosing the version that maximizes the number of trigram hits
according to a language model. The novelty of this method concerns
the use of an efficient confusion matrix technique for reordering the
words. The comparative advantage of this method is that it works with
a large set of words and avoids the laborious and costly process of
collecting word order errors for creating error patterns.
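
The selection criterion, choosing the reordering with the most trigram hits, can be sketched as follows. The sketch deliberately replaces the paper's confusion matrix technique, which the abstract does not describe, with a brute-force search over all permutations, so it is only practical for short sentences. The two-line corpus and the function names are invented for the example.

```python
from itertools import permutations


def trigrams(words):
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def repair_word_order(sentence, known_trigrams):
    """Return the permutation of the words with the most trigram hits.

    Brute force over all permutations; a stand-in for the paper's
    confusion matrix technique, which is not reproduced here.
    """
    words = sentence.split()
    best, best_hits = words, -1
    for candidate in permutations(words):
        hits = sum(1 for t in trigrams(candidate) if t in known_trigrams)
        if hits > best_hits:
            best, best_hits = list(candidate), hits
    return " ".join(best)


# Tiny stand-in for a language model: the set of trigrams seen in a corpus.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
known = {t for line in corpus for t in trigrams(line.split())}
fixed = repair_word_order("cat the on sat mat the", known)
```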
Abstract: The paper presents the design concept of a unit-selection
text-to-speech synthesis system for the Slovenian language.
Due to its modular and upgradable architecture, the system can be
used in a variety of speech user interface applications, ranging from
carrier-grade server voice portal applications and desktop user
interfaces to specialized embedded devices.
Since memory and processing power requirements are important
factors for a possible implementation in embedded devices, lexica
and speech corpora need to be reduced. We describe a simple and
efficient implementation of a greedy subset selection algorithm that
extracts a compact subset of high-coverage text sentences. An
experiment on a reference text corpus showed that the subset
selection algorithm produced a compact sentence subset with low
redundancy.
The adequacy of the spoken output was evaluated with several
subjective tests recommended by the International
Telecommunication Union (ITU).
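
A greedy subset selection of the kind mentioned above can be sketched as a set-cover heuristic: repeatedly pick the sentence that adds the most units not yet covered. The abstract does not state which units the system covers, so character n-grams stand in for phonetic units such as diphones here. That choice and the function name are assumptions.

```python
def greedy_sentence_subset(sentences, unit_size=2):
    """Greedily pick sentences until every unit in the corpus is covered.

    Units are character n-grams, an assumed stand-in for phonetic units.
    """
    def units(sentence):
        text = sentence.lower()
        return {text[i:i + unit_size]
                for i in range(len(text) - unit_size + 1)}

    sentence_units = {s: units(s) for s in sentences}
    uncovered = set().union(*sentence_units.values())
    selected = []
    while uncovered:
        # Pick the sentence that covers the most still-uncovered units.
        best = max(sentences, key=lambda s: len(sentence_units[s] & uncovered))
        selected.append(best)
        uncovered -= sentence_units[best]
    return selected
```

For the toy input `["abc", "bcd", "abcd", "xy"]` the function keeps only `"abcd"` and `"xy"`, since the other two sentences add no new units.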
Abstract: A word recognition architecture based on a network
of neural associative memories and hidden Markov models has been
developed. The input stream, composed of subword units such as
word-internal triphones consisting of diphones and triphones, is provided
to the network of neural associative memories by hidden Markov
models. The word recognition network derives words from this input
stream. The architecture can handle ambiguities at the subword-unit
level and can also add new words to the vocabulary during operation.
The architecture is implemented to perform the word recognition task
in a language processing system for understanding simple command
sentences such as “bot show apple”.
Abstract: This paper deals with automatic sentence modality
recognition in French. In this work, only prosodic features are
considered. The sentences are recognized according to the three
following modalities: declarative, interrogative and exclamatory
sentences. This information will be used to animate a talking head for
deaf and hearing-impaired children. We first statistically study a real
radio corpus in order to assess the feasibility of the automatic
modeling of sentence types. Then, we test two sets of prosodic
features as well as two different classifiers and their combination. We
further focus our attention on question recognition, as this modality
is certainly the most important one for the target application.
Abstract: This paper focuses on the use of project work as a
pretext for applying the conventions of writing, that is, the
correctness of mechanics, usage, and sentence formation, in a
content-based class at a Rajabhat University. Its aim was to explore
the extent of the student teachers’ academic achievement in the basic
writing features, measured against a 70% attainment target, after the
use of project work. Organizing work around an agreed theme, in which
the students reproduce language provided by texts and instructors, is
expected to improve the correctness of students’ writing conventions.
The sample comprised 38 fourth-year English major students. Data were
collected by means of an achievement test and students’ written work.
The scores on the summative achievement test were analyzed by mean
score, standard deviation, and percentage. It was found that the
student teachers achieved more in mechanics and usage and less in
sentence formation. The students benefited from exposure to texts
while conducting the project; however, their automaticity in knowing
how and when to form phrases and clauses into simple/complex sentences
had room for improvement.
Abstract: Major Depressive Disorder is a burden on medical
expenditure in Taiwan, as it is around the world. Major Depressive
Disorder can be divided into different categories according to
previous human activities. With machine learning, emotion expressed
in textual language can be classified in advance, which can help
medical diagnosis recognize the variation in Major Depressive
Disorder automatically. Association language increments are the
characteristics and relationships that can be discovered among the
words in a sentence. Classification suffers from an
overlapping-category problem. In this paper, we aim to improve
classification performance by avoiding the overlapping-category
problem. We present an approach, called Association Language Features
by its Category (ALFC), that discovers words in a sentence that occur
together with high frequency and do not overlap between categories.
Experimental results show that ALFC distinguishes Major Depressive
Disorder categories well and achieves better performance. We also
compare the approach with a baseline and with mutual information,
which use single words alone or a correlation measure.