Abstract: In this study, the use of a silicon NAM (Non-Audible
Murmur) microphone in automatic speech recognition is presented.
NAM microphones are special acoustic sensors, which are attached
behind the talker's ear and can capture not only normal (audible)
speech, but also very quietly uttered speech (non-audible murmur).
As a result, NAM microphones can be applied in automatic speech
recognition systems when privacy is desired in human-machine communication.
Moreover, NAM microphones show robustness against
noise and they might be used in special systems (speech recognition,
speech conversion etc.) for sound-impaired people. Using a small
amount of training data and adaptation approaches, 93.9% word
accuracy was achieved for a 20k Japanese vocabulary dictation
task. Non-audible murmur recognition in noisy environments is also
investigated. In this study, further analysis of the NAM speech has
been made using distance measures between hidden Markov model
(HMM) pairs. Using a metric distance, the reduced spectral space of
NAM speech is shown; however, the locations of the different NAM
phonemes are similar to the locations of the phonemes
of normal speech, and the NAM sounds are well discriminated.
Promising results obtained with nonlinear features are also reported,
especially under noisy conditions.
Abstract: The paper presents the design concept of a unit-selection
text-to-speech synthesis system for the Slovenian language.
Due to its modular and upgradable architecture, the system can be
used in a variety of speech user interface applications, ranging from
carrier-grade server voice portal applications and desktop user
interfaces to specialized embedded devices.
Since memory and processing power requirements are important
factors for a possible implementation in embedded devices, lexica
and speech corpora need to be reduced. We describe a simple and
efficient implementation of a greedy subset selection algorithm that
extracts a compact subset of high-coverage text sentences. The
experiment on a reference text corpus showed that the subset
selection algorithm produced a compact sentence subset with little
redundancy.
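The greedy subset selection described above can be illustrated with a minimal sketch. The abstract does not state which units are covered or how ties are broken, so the following is an assumption-laden example rather than the authors' implementation: `greedy_select` and `char_bigrams` are hypothetical names, and letter bigrams stand in for whatever phonetic units (for example diphones) a real system would count.

```python
def char_bigrams(sentence):
    """Hypothetical stand-in for a phonetic unit extractor: letter bigrams."""
    text = sentence.lower()
    return [text[i:i + 2] for i in range(len(text) - 1)]


def greedy_select(sentences, unit_fn=char_bigrams, target_coverage=1.0):
    """Greedily pick sentences until the requested share of distinct units is covered.

    Returns the indices of the chosen sentences and the set of covered units.
    """
    units_per_sentence = [set(unit_fn(s)) for s in sentences]
    all_units = set().union(*units_per_sentence)
    covered = set()
    chosen = []
    remaining = set(range(len(sentences)))

    while remaining and len(covered) < target_coverage * len(all_units):
        # Pick the sentence that adds the most uncovered units
        # (ties are broken in favour of the lowest index).
        best = max(remaining, key=lambda i: (len(units_per_sentence[i] - covered), -i))
        gain = units_per_sentence[best] - covered
        if not gain:
            break  # nothing new can be covered
        chosen.append(best)
        covered |= gain
        remaining.discard(best)

    return chosen, covered


corpus = ["abc", "bcd", "abcd", "ab"]
chosen, covered = greedy_select(corpus)
print(chosen, sorted(covered))
```

In this toy corpus a single sentence already covers every unit, which is the kind of compactness the greedy criterion aims for; a real corpus would also need to weight units by frequency, which is not modelled here.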
The adequacy of the spoken output was evaluated with several
subjective tests recommended by the International
Telecommunication Union (ITU).
Abstract: This paper presents the findings of two experiments that were performed on the Redundancy in Wireless Connection Model (RiWC) using the 802.11b standard. The experiments were simulated using OPNET 11.5 Modeler software. The first was aimed at finding the maximum number of simultaneous Voice over Internet Protocol (VoIP) users the model would support under the G.711 and G.729 codec standards when the packetization interval was 10 milliseconds (ms). The second experiment examined the model's VoIP user capacity using the G.729 codec standard along with background traffic, using the same packetization interval as in the first experiment. To determine the capacity of the model in each experiment, we checked three metrics: jitter, delay and data loss. When background traffic was added, we checked the response time in addition to the previous three metrics. The findings of the first experiment indicated that the maximum number of simultaneous VoIP users the model was able to support with the G.711 codec was 5, which is consistent with recent research findings. When using the G.729 codec, the model was able to support up to 16 VoIP users; similar experiments in current literature have indicated a maximum of 7 users. The second experiment demonstrated that, in the presence of background traffic, the maximum number of VoIP users the model was able to support was 12.
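To make the codec comparison above more concrete, the per-call packet arithmetic for a 10 ms packetization interval can be sketched as follows. The codec bit rates (64 kbit/s for G.711, 8 kbit/s for G.729) and the 40-byte RTP/UDP/IPv4 header are standard figures; the function name `voip_stream` is hypothetical. The sketch deliberately ignores 802.11b MAC and physical-layer overhead, so it is not a capacity estimate and does not reproduce the simulation results.

```python
RTP_UDP_IP_HEADER_BYTES = 12 + 8 + 20  # RTP + UDP + IPv4 headers


def voip_stream(codec_bitrate_bps, packet_interval_ms):
    """Return (payload bytes, packets per second, IP-layer bit rate) for one direction of a call."""
    payload_bytes = codec_bitrate_bps * packet_interval_ms // 8000
    packets_per_second = 1000 // packet_interval_ms
    ip_packet_bytes = payload_bytes + RTP_UDP_IP_HEADER_BYTES
    ip_rate_bps = ip_packet_bytes * 8 * packets_per_second
    return payload_bytes, packets_per_second, ip_rate_bps


for name, bitrate in (("G.711", 64_000), ("G.729", 8_000)):
    payload, pps, rate = voip_stream(bitrate, 10)
    print(f"{name}: {payload} B payload, {pps} packets/s, {rate / 1000:.0f} kbit/s at the IP layer")
```

With 10 ms packets, G.729 carries only 10 payload bytes behind 40 header bytes, so the number of packets per second, rather than the codec bit rate alone, is what loads a shared wireless channel.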
Abstract: This article discusses the creation of small packet networks for energy companies using high-voltage power line carrier (PLC) equipment capable of IP traffic transmission. The main idea is to create converged PLC links between substations and dispatching centers where packet data and voice are transmitted in one data flow. The article contains a description of the basic concept of the network, an evaluation of voice traffic transmission parameters, and a discussion of header compression techniques in relation to PLC links. The results show that converged packet PLC links can be very useful in the construction of small packet networks between substations in remote locations, such as deposits or sparsely populated areas.
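Why header compression matters on a narrow link can be shown with a small calculation. The abstract does not name a specific compression scheme, codec, or link rate, so all numbers below are illustrative assumptions: a 20-byte voice payload every 20 ms, a 40-byte uncompressed RTP/UDP/IPv4 header, and a 4-byte compressed header of the kind compressed RTP can achieve. The helper `stream_rate_bps` is a hypothetical name, and link-layer framing is ignored.

```python
def stream_rate_bps(payload_bytes, header_bytes, packet_interval_ms):
    """Bit rate of one voice stream, counting payload plus the given header size."""
    packets_per_second = 1000 / packet_interval_ms
    return (payload_bytes + header_bytes) * 8 * packets_per_second


# Assumed example values, not taken from the article.
PAYLOAD_BYTES = 20       # e.g. an 8 kbit/s codec with 20 ms packets
FULL_HEADER_BYTES = 40   # RTP (12) + UDP (8) + IPv4 (20)
COMPRESSED_HEADER_BYTES = 4

plain = stream_rate_bps(PAYLOAD_BYTES, FULL_HEADER_BYTES, 20)
compressed = stream_rate_bps(PAYLOAD_BYTES, COMPRESSED_HEADER_BYTES, 20)
saving = 1 - compressed / plain

print(f"uncompressed: {plain / 1000:.1f} kbit/s")
print(f"compressed:   {compressed / 1000:.1f} kbit/s")
print(f"saving:       {saving:.0%}")
```

Under these assumptions the headers account for two thirds of the uncompressed stream, which is why compression is attractive when voice and data share one low-rate flow.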
Abstract: Electronic commerce is growing rapidly with on-line
sales already heading for hundreds of billions of dollars per year. Due to
the huge amount of money transferred every day, an increased
security level is required. In this work we present the architecture of
an intelligent speaker verification system, which is able to accurately
verify the registered users of an e-commerce service using only their
voices as an input. According to the proposed architecture, a
transaction-based e-commerce application should be complemented
by a biometric server where the customer's unique set of speech models
(voiceprint) is stored. The verification procedure asks the
user to pronounce a personalized sequence of digits; the speech is
captured and voice features are extracted at the client side and then
sent back to the biometric server. The biometric server uses pattern
recognition to decide whether the received features match the stored
voiceprint of the customer the user claims to be, and accordingly grants
verification. The proposed architecture can provide e-commerce
applications with a higher degree of certainty regarding the identity
of a customer, and prevent impostors from executing fraudulent
transactions.
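The server-side decision described above can be illustrated with a generic likelihood-ratio test. The abstract does not specify the pattern recognition method, the feature type, or the model form, so everything below is an assumption made for illustration: the voiceprint and a background (impostor) model are each a single diagonal Gaussian given as `(mean, variance)` lists, and `avg_log_likelihood` and `verify` are hypothetical names.

```python
import math


def avg_log_likelihood(frames, mean, var):
    """Average per-frame log-likelihood of feature frames under a diagonal Gaussian."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)


def verify(frames, voiceprint, background, threshold=0.0):
    """Accept the identity claim if the log-likelihood ratio exceeds the threshold."""
    score = avg_log_likelihood(frames, *voiceprint) - avg_log_likelihood(frames, *background)
    return score > threshold, score


# Toy two-dimensional models (assumed values).
voiceprint = ([1.0, 1.0], [1.0, 1.0])
background = ([0.0, 0.0], [1.0, 1.0])

accepted, score = verify([[1.0, 1.0], [0.9, 1.1]], voiceprint, background)
print(accepted, round(score, 3))
```

The threshold trades false acceptances against false rejections; a deployed system would tune it on held-out data and use richer models than a single Gaussian.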
Abstract: Concatenative speech synthesis is a method that can
produce natural-sounding speech with high speaker individuality
by using a large speech corpus. Based on this method, in
this paper, we propose a voice conversion method whose converted
speech has high individuality and naturalness. We also conducted
two subjective evaluation experiments to assess the individuality and
sound quality of the converted speech. The results confirmed the
following three facts: (a) the proposed method converts the
individuality of speakers well, (b) incorporating the unit selection
framework (especially the join cost) of concatenative speech synthesis into
conventional voice conversion improves the sound quality of the
converted speech, and (c) the proposed method is robust against
gender differences between the source speaker and the target speaker.
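The role of the join cost mentioned in (b) can be illustrated with a minimal unit selection sketch. This is a generic dynamic-programming search, not the authors' system: each candidate unit is a hypothetical tuple `(value, left_edge, right_edge)` of scalar features, the target cost is the absolute difference between a unit's value and the target, and the join cost is the absolute mismatch between adjacent unit edges.

```python
def select_units(targets, candidates, join_weight=1.0):
    """Choose one candidate per position minimising target cost plus weighted join cost.

    targets:    list of target feature values
    candidates: candidates[t] is a list of (value, left_edge, right_edge) tuples
    Returns (list of chosen candidate indices, total cost).
    """
    # best[t][j] = (cumulative cost, back-pointer) for candidate j at position t
    best = [[(abs(u[0] - targets[0]), None) for u in candidates[0]]]
    for t in range(1, len(targets)):
        row = []
        for u in candidates[t]:
            target_cost = abs(u[0] - targets[t])
            cost, prev = min(
                (best[t - 1][k][0] + join_weight * abs(p[2] - u[1]), k)
                for k, p in enumerate(candidates[t - 1])
            )
            row.append((cost + target_cost, prev))
        best.append(row)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = [j]
    for t in range(len(targets) - 1, 0, -1):
        j = best[t][j][1]
        path.append(j)
    path.reverse()
    return path, total


targets = [1.0, 2.0]
candidates = [
    [(1.0, 0.0, 5.0), (1.2, 0.0, 1.0)],  # first unit matches the target but joins badly
    [(2.0, 1.0, 0.0)],
]
print(select_units(targets, candidates, join_weight=0.0))
print(select_units(targets, candidates, join_weight=1.0))
```

With the join cost switched off the search picks the unit closest to the target; with it switched on, the slightly worse match is preferred because it concatenates smoothly, which is the effect on sound quality that the abstract attributes to the join cost.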
Abstract: The migration-environment nexus has attracted increasing interest from the social research field in recent years. While directly connected to human security issues, this theme has spread through the media into the public sphere. Therefore, it is important to observe how the discussions over environmentally induced migrations developed from their scientific basis to media attention, passing through some political voices, and in which ways these messages might be interpreted within the broader public discourses. To this end, an analysis of press entries published between 2004 and 2010 in three of the main Portuguese newspapers is presented, reflecting in particular upon the events, protagonists, topics, geographical attributions and terms/expressions used to define those who migrate due to environmental degradation or disasters.
Abstract: Complex offshore facilities have their
own communications requirements. Nevertheless, the development and
real-world application of new communications technology pose
tremendous problems for new technology users, developers and
implementers. Traditional systems engineering cannot
develop a new technology effectively because it does not consider
the dynamics of the process. This paper focuses on the design of a
holistic model that represents the dynamics of new communication
technology development within the offshore industry. The model shows
the behavior of technology development efforts. Furthermore,
implementing this model yields new and useful insights into
policy option analysis for developing a new communications
technology in the offshore industry.
Abstract: A novel approach to speech coding using a hybrid architecture is presented. The advantages of parametric and perceptual coding methods are combined in order to create a speech coding algorithm that assures better signal quality than a traditional CELP parametric codec. Two approaches are discussed. The first is based on the selection of voiced signal components, which are encoded using a parametric algorithm, unvoiced components, which are encoded perceptually, and transients, which remain unencoded. The second approach uses perceptual encoding of the residual signal in the CELP codec. The algorithm applied for precise transient selection is described. The signal quality achieved using the proposed hybrid codec is compared to the quality of some standard speech codecs.
Abstract: In this paper, we propose a chaotic cipher system consisting of Improved Volterra Filters and a mapping that is created from an actual voice by using a Radial Basis Function Network. To be practical, the system is assumed to use a digital communication line, such as the Internet, to maintain parameter matching between the transmitter and receiver sides. Therefore, in order to withstand attacks from outside, it is necessary to make the internal state more complex and to increase the coefficient sensitivity. In this paper, we validate the robustness of the proposed method from three perspectives: "Chaotic properties", "Randomness", and "Coefficient sensitivity".