Using Teager Energy Cepstrum and HMM Distances in Automatic Speech Recognition and Analysis of Unvoiced Speech

Year: 2009 Volume: 3 Issue: 11 1935 - 1941 Pages

Authors:
Panikos Heracleous

Abstract: This study presents the use of a silicon NAM (Non-Audible Murmur) microphone in automatic speech recognition. NAM microphones are special acoustic sensors that are attached behind the talker's ear and can capture not only normal (audible) speech but also very quietly uttered speech (non-audible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones are robust against noise, and they might be used in special systems (speech recognition, speech conversion, etc.) for sound-impaired people. Using a small amount of training data and adaptation approaches, 93.9% word accuracy was achieved for a 20k-vocabulary Japanese dictation task. Non-audible murmur recognition in noisy environments is also investigated. The NAM speech is further analyzed using distance measures between hidden Markov model (HMM) pairs. A metric distance shows that NAM speech occupies a reduced spectral space; however, the locations of the NAM phonemes are similar to those of the normal-speech phonemes, and the NAM sounds are well discriminated. Promising results with nonlinear features are also reported, especially under noisy conditions.
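
The abstract does not define its nonlinear features, but the title names the Teager energy. As a hedged illustration only, the sketch below implements the standard discrete Teager-Kaiser energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. Whether the paper uses exactly this form, and how it derives a cepstrum from it, is not stated here; the tone amplitude, frequency, and sampling rate are assumed values chosen for the example.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The first and last samples have no two-sided neighbours, so the
    result is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input: a pure tone (amplitude, frequency, and sampling rate
# are assumptions for this example, not values from the paper).
AMPLITUDE = 0.5
OMEGA = 2 * math.pi * 200 / 8000  # 200 Hz tone sampled at 8 kHz
tone = [AMPLITUDE * math.cos(OMEGA * n) for n in range(64)]

energy = teager_energy(tone)

# For a pure tone A*cos(OMEGA*n) the operator is constant: A^2 * sin(OMEGA)^2.
expected = (AMPLITUDE * math.sin(OMEGA)) ** 2
```

For a pure tone the operator returns a constant that depends on both amplitude and frequency, which is why it is often described as tracking the energy of the source rather than of the waveform alone.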

Slovenian Text-to-Speech Synthesis for Speech User Interfaces

Year: 2007 Volume: 1 Issue: 11 1586 - 1590 Pages

Abstract: The paper presents the design concept of a unit-selection text-to-speech synthesis system for the Slovenian language. Due to its modular and upgradable architecture, the system can be used in a variety of speech user interface applications, ranging from carrier-grade server voice portal applications and desktop user interfaces to specialized embedded devices. Since memory and processing power requirements are important factors for a possible implementation in embedded devices, lexica and speech corpora need to be reduced. We describe a simple and efficient implementation of a greedy subset selection algorithm that extracts a compact subset of high-coverage text sentences. An experiment on a reference text corpus showed that the subset selection algorithm produced a compact sentence subset with little redundancy. The adequacy of the spoken output was evaluated with several subjective tests recommended by the International Telecommunication Union (ITU).
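
A greedy subset selection of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which units are covered, so character bigrams stand in for whatever phonetic units a real system would count, and the function names and toy corpus are hypothetical.

```python
def units(sentence):
    """Stand-in coverage units: character bigrams of the lowercased sentence.

    Assumption: a real TTS corpus reduction would likely count phonetic
    units (for example diphones) instead.
    """
    text = sentence.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def greedy_subset(sentences, max_sentences=None):
    """Repeatedly pick the sentence that adds the most uncovered units.

    Returns the chosen sentence indices (in selection order) and the set
    of units they cover. Stops when no sentence adds anything new or when
    max_sentences is reached.
    """
    remaining = {i: units(s) for i, s in enumerate(sentences)}
    covered = set()
    chosen = []
    while remaining and (max_sentences is None or len(chosen) < max_sentences):
        # Ties are broken in favour of the lower sentence index.
        best = max(remaining, key=lambda i: (len(remaining[i] - covered), -i))
        gain = remaining[best] - covered
        if not gain:
            break  # every remaining sentence is fully redundant
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, covered


# Toy corpus (hypothetical): the third sentence is redundant once the
# second has been selected, so it is never chosen.
corpus = ["abab", "abcd", "cd", "xyz"]
chosen, covered = greedy_subset(corpus)
```

In this toy run the redundant sentence "cd" is skipped while full unit coverage is still reached, which is the behaviour the abstract describes as a compact subset with little redundancy.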

VoIP and Database Traffic Co-existence over IEEE 802.11b WLAN with Redundancy

Year: 2007 Volume: 1 Issue: 2 146 - 151 Pages

Abstract: This paper presents the findings of two experiments that were performed on the Redundancy in Wireless Connection Model (RiWC) using the 802.11b standard. The experiments were simulated using OPNET 11.5 Modeler software. The first was aimed at finding the maximum number of simultaneous Voice over Internet Protocol (VoIP) users the model would support under the G.711 and G.729 codec standards when the packetization interval was 10 milliseconds (ms). The second experiment examined the model's VoIP user capacity using the G.729 codec standard along with background traffic, using the same packetization interval as in the first experiment. To determine the capacity of the model in each experiment, we checked three metrics: jitter, delay and data loss. When background traffic was added, we checked the response time in addition to the previous three metrics. The findings of the first experiment indicated that, with the G.711 codec, the maximum number of simultaneous VoIP users the model was able to support was 5, which is consistent with recent research findings. When using the G.729 codec, the model was able to support up to 16 VoIP users; similar experiments in the current literature have indicated a maximum of 7 users. The findings of the second experiment demonstrated that the maximum number of VoIP users the model was able to support in the presence of background traffic was 12.
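
To put the 10 ms packetization interval in context, the sketch below computes the per-call, one-direction load at the IP layer for the two codecs. The codec bit rates (64 kbit/s for G.711, 8 kbit/s for G.729) and the RTP, UDP, and IPv4 header sizes are standard values; the function name is hypothetical. 802.11b MAC and physical-layer overhead is deliberately left out, so these numbers do not by themselves predict the user capacities reported above.

```python
# Uncompressed RTP (12) + UDP (8) + IPv4 (20) headers, in bytes.
HEADER_BYTES = 12 + 8 + 20

# Nominal codec bit rates in bits per second.
CODEC_BITRATE_BPS = {"G.711": 64_000, "G.729": 8_000}


def per_call_load(codec, packet_interval_ms=10):
    """Return (payload bytes per packet, IP-layer bit rate) for one direction."""
    packets_per_second = 1000 / packet_interval_ms
    payload_bytes = CODEC_BITRATE_BPS[codec] / 8 * packet_interval_ms / 1000
    ip_rate_bps = (payload_bytes + HEADER_BYTES) * 8 * packets_per_second
    return payload_bytes, ip_rate_bps


g711_payload, g711_rate = per_call_load("G.711")
g729_payload, g729_rate = per_call_load("G.729")
```

At this interval a G.729 packet carries 10 bytes of speech behind 40 bytes of headers, so the packet rate, rather than the codec bit rate, tends to dominate the cost of each call.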

Power Line Carrier Equipment Supporting IP Traffic Transmission in the Enterprise Networks of Energy Companies

Year: 2012 Volume: 6 Issue: 10 1112 - 1119 Pages

Authors:
M. S. Anton Merkulov

Abstract: This article discusses issues in creating small packet networks for energy companies using high-voltage power line carrier (PLC) equipment that supports IP traffic transmission. The main idea is to create converged PLC links between substations and dispatching centers, where packet data and voice are transmitted in one data flow. The article contains a description of the basic concept of the network, an evaluation of voice traffic transmission parameters, and a discussion of header compression techniques in relation to PLC links. The results show that converged packet PLC links can be very useful in the construction of small packet networks between substations in remote locations, such as deposits or sparsely populated areas.

Intelligent Speaker Verification based Biometric System for Electronic Commerce Applications

Year: 2008 Volume: 2 Issue: 8 811 - 815 Pages

Abstract: Electronic commerce is growing rapidly, with on-line sales already heading for hundreds of billions of dollars per year. Due to the huge amount of money transferred every day, an increased security level is required. In this work we present the architecture of an intelligent speaker verification system, which is able to accurately verify the registered users of an e-commerce service using only their voices as input. According to the proposed architecture, a transaction-based e-commerce application should be complemented by a biometric server where the customer's unique set of speech models (voiceprint) is stored. The verification procedure asks the user to pronounce a personalized sequence of digits; the speech is captured and voice features are extracted at the client side, and the features are then sent back to the biometric server. The biometric server uses pattern recognition to decide whether the received features match the stored voiceprint of the customer the user claims to be, and grants verification accordingly. The proposed architecture can provide e-commerce applications with a higher degree of certainty regarding the identity of a customer, and prevent impostors from executing fraudulent transactions.

High-Individuality Voice Conversion Based on Concatenative Speech Synthesis

Year: 2007 Volume: 1 Issue: 11 1580 - 1585 Pages

Abstract: Concatenative speech synthesis is a method that can produce natural-sounding speech with the high individuality of a speaker by using a large speech corpus. Based on this method, we propose a voice conversion method whose converted speech has high individuality and naturalness. We also conducted two subjective evaluation experiments to assess the individuality and sound quality of the converted speech. From the results, the following three facts have been confirmed: (a) the proposed method can convert the individuality of speakers well, (b) introducing the unit selection framework (especially the join cost) of concatenative speech synthesis into conventional voice conversion improves the sound quality of the converted speech, and (c) the proposed method is robust against gender differences between a source speaker and a target speaker.
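
The unit selection framework with a join cost mentioned in (b) is commonly formulated as a shortest-path search over candidate units. The sketch below is a generic dynamic-programming version under stated assumptions, not the authors' method: the abstract gives no cost definitions, so the target cost, the join cost, the 0.5 weight, and the toy pitch values are all hypothetical.

```python
def select_units(candidates, target_cost, join_cost):
    """Pick one unit per target position, minimising the summed costs.

    candidates:  list with one list of candidate units per target position
    target_cost: function (position, unit) -> mismatch with the target
    join_cost:   function (previous_unit, unit) -> concatenation mismatch
    Returns (total_cost, chosen_units).
    """
    # best[j] = (cost of the cheapest path ending in candidate j, that path)
    best = [(target_cost(0, u), [u]) for u in candidates[0]]
    for t in range(1, len(candidates)):
        new_best = []
        for u in candidates[t]:
            cost, path = min(
                ((prev_cost + join_cost(prev_path[-1], u), prev_path)
                 for prev_cost, prev_path in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(t, u), path + [u]))
        best = new_best
    return min(best, key=lambda item: item[0])


# Toy example (all values hypothetical): units are (label, pitch in Hz).
target_pitch = [100, 110, 120]
toy_candidates = [
    [("a1", 100), ("a2", 130)],
    [("b1", 112), ("b2", 140)],
    [("c1", 150), ("c2", 121)],
]


def pitch_target_cost(position, unit):
    return abs(unit[1] - target_pitch[position])


def pitch_join_cost(previous_unit, unit):
    return 0.5 * abs(previous_unit[1] - unit[1])


total_cost, chosen_units = select_units(
    toy_candidates, pitch_target_cost, pitch_join_cost
)
```

The join cost penalises pitch jumps between neighbouring units, so the search prefers a smooth sequence even when an individual unit is not the closest match to its target.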

The Portuguese Press Portrait of “Environmental Refugees”

Year: 2012 Volume: 6 Issue: 4 386 - 391 Pages

Authors:
Inês Vieira

Abstract: The migration-environment nexus has gained increased interest from the social research field in recent years. While directly connected to human security issues, this theme has spread through the media to the public sphere. Therefore, it is important to observe how the discussions of environmentally induced migration developed from their scientific basis to media attention, passing through some political voices, and in which ways these messages might be interpreted within broader public discourses. To this end, an analysis of press entries between 2004 and 2010 in three of the main Portuguese newspapers is presented, reflecting especially upon the events, protagonists, topics, geographical attributions and terms/expressions used to define those who migrate due to environmental degradation or disasters.

Voice Over IP Technology Development in Offshore Industry: System Dynamics Approach

Year: 2008 Volume: 2 Issue: 8 958 - 968 Pages

Abstract: Nowadays, complex offshore facilities have their own communications requirements. Nevertheless, the development and real-world application of new communications technology pose tremendous problems for new technology users, developers and implementers. Traditional systems engineering cannot develop a new technology effectively because it does not consider the dynamics of the process. This paper focuses on the design of a holistic model that represents the dynamics of new communication technology development within the offshore industry. The model shows the behavior of technology development efforts. Furthermore, implementing this model yields new and useful insights into policy option analysis for developing a new communications technology in the offshore industry.

High Quality Speech Coding using Combined Parametric and Perceptual Modules

Year: 2008 Volume: 2 Issue: 7 1311 - 1316 Pages

Abstract: A novel approach to speech coding using a hybrid architecture is presented. The advantages of parametric and perceptual coding methods are combined in order to create a speech coding algorithm that assures better signal quality than a traditional CELP parametric codec. Two approaches are discussed. The first is based on separating the signal into voiced components that are encoded using a parametric algorithm, unvoiced components that are encoded perceptually, and transients that remain unencoded. The second approach uses perceptual encoding of the residual signal in the CELP codec. The algorithm applied for precise transient selection is described. The signal quality achieved using the proposed hybrid codec is compared with the quality of some standard speech codecs.

Study on the Evaluation of the Chaotic Cipher System Using the Improved Volterra Filters and the RBFN Mapping

Year: 2012 Volume: 6 Issue: 11 1239 - 1243 Pages

Abstract: In this paper, we propose a chaotic cipher system consisting of Improved Volterra Filters and a mapping that is created from an actual voice by using a Radial Basis Function Network. To be practical, the system is assumed to use a digital communication line, such as the Internet, to maintain parameter matching between the transmitter and receiver sides. Therefore, in order to withstand attacks from outside, it is necessary to complicate the internal state and improve the coefficient sensitivity. In this paper, we validate the robustness of the proposed method from three perspectives: "Chaotic properties", "Randomness", and "Coefficient sensitivity".
