Design, Manufacture and Test of a Solar Powered Audible Bird Scarer

The most common birds living in Turkey are crows (Corvus corone), pigeons (Columba livia), sparrows (Passer domesticus), starlings (Sturnus vulgaris) and blackbirds (Turdus merula). These birds damage agricultural areas and foul human living areas. To drive them away, different materials and methods are used, such as chemicals, treatments, colored lights, flashes and audible scarers. Many studies of chemical methods can be found in the literature; however, few works on audible bird scarers have been reported. Therefore, a solar powered audible bird scarer was designed, manufactured and tested in this experimental investigation. First, to determine how sensitive these birds are to an audible scarer, a series of preliminary studies was conducted. These studies showed that crows are the most resistant to the audible bird scarer compared with pigeons, sparrows, starlings and blackbirds; the solar powered audible bird scarer was therefore tested on crows. The scarer was tested for about one month during April-May 2007. Eighteen well-known predator sounds (voices or calls) were selected for the test, taken from Eleonora's falcon (Falco eleonorae), the rough-legged buzzard (Buteo lagopus), the golden eagle (Aquila chrysaetos), Montagu's harrier (Circus pygargus) and the pygmy owl (Glaucidium passerinum). The results showed that the birds' reaction changed depending on the predator sound type, the camouflage of the scarer, the sound quality and volume, and the loudspeaker play and pause periods within one application. In addition, the sound of the rough-legged buzzard (Buteo lagopus) was the most effective on crows, and the scarer was sufficiently efficient.

Enhancement of a 3D Sound Using Psychoacoustics

Generally, to create 3D sound with binaural systems, head-related transfer functions (HRTFs), which encode the information of the sounds arriving at our ears, are used. However, the characteristics of the HRTF can weaken some three-dimensional effects in the cone of confusion between the front and back directions. In this paper, we propose a new method based on psychoacoustic theory that reduces this confusion in sound image localization. In the method, the HRTF spectrum characteristic is enhanced using the energy ratio of the bark bands. Informal listening tests show that the proposed method improves front-back sound localization much more than conventional methods.
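
As a rough, hypothetical sketch of the kind of processing involved (the function names and the gain rule are illustrative assumptions, not the paper's implementation), the bark-band energy ratio of a front and a back HRTF spectrum could be computed and turned into per-band enhancement gains like this:

```python
import math

def hz_to_bark(f):
    """Traunmueller/Zwicker-style bark scale approximation."""
    return 13.0 * math.atan(0.00076 * f) + 3.5 * math.atan((f / 7500.0) ** 2)

def bark_band_energies(spectrum, sample_rate, n_bands=24):
    """Sum squared magnitudes of a half spectrum (bins 0..N/2) into bark bands."""
    n_bins = len(spectrum)
    nyquist = sample_rate / 2.0
    energies = [0.0] * n_bands
    for i, mag in enumerate(spectrum):
        f = i * nyquist / (n_bins - 1)
        band = min(int(hz_to_bark(f)), n_bands - 1)
        energies[band] += mag * mag
    return energies

def enhancement_gains(front_spec, back_spec, sample_rate, alpha=0.5):
    """Per-band gains from the front/back bark-band energy ratio.

    Bands where the front HRTF carries more energy than the back one get a
    gain above 1; alpha controls the enhancement strength (an assumption).
    """
    ef = bark_band_energies(front_spec, sample_rate)
    eb = bark_band_energies(back_spec, sample_rate)
    return [((f + 1e-12) / (b + 1e-12)) ** alpha for f, b in zip(ef, eb)]
```

Applying such gains to the front HRTF spectrum exaggerates exactly the spectral cues that distinguish front from back, which is the intuition behind the bark-band energy ratio.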

Voice Disorders Identification Using Hybrid Approach: Wavelet Analysis and Multilayer Neural Networks

This paper presents a new strategy for the identification and classification of pathological voices using a hybrid method based on the wavelet transform and neural networks. After speech acquisition from a patient, the speech signal is analysed to extract acoustic parameters such as pitch, formants, jitter and shimmer. The results obtained are compared with normal, standard values using a programmable database. Sounds are collected from normal people and patients, and then classified into two different categories. The speech database consists of several pathological and normal voices collected from the national hospital "Rabta-Tunis". The speech processing algorithm is conducted in supervised mode to discriminate normal from pathological voices and then to classify neural and vocal pathologies (Parkinson's, Alzheimer's, laryngeal disorders, dyslexia...). Several simulation results are presented as a function of the disease and compared with the clinical diagnosis in order to obtain an objective evaluation of the developed tool.
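
For reference, jitter and shimmer, two of the acoustic parameters mentioned, are commonly defined as the mean cycle-to-cycle variation of pitch periods and of peak amplitudes. A minimal sketch of the standard "local" percentage definitions (a generic textbook formulation, not necessarily the exact measure used in the paper):

```python
def jitter_percent(periods):
    """Local jitter: mean absolute difference between consecutive pitch
    periods, relative to the mean period, in percent."""
    diffs = [abs(periods[i] - periods[i - 1]) for i in range(1, len(periods))]
    mean_p = sum(periods) / len(periods)
    return 100.0 * (sum(diffs) / len(diffs)) / mean_p

def shimmer_percent(amplitudes):
    """Local shimmer: the same measure applied to cycle peak amplitudes."""
    diffs = [abs(amplitudes[i] - amplitudes[i - 1]) for i in range(1, len(amplitudes))]
    mean_a = sum(amplitudes) / len(amplitudes)
    return 100.0 * (sum(diffs) / len(diffs)) / mean_a
```

Elevated jitter and shimmer relative to the database's normal values are typical indicators of pathological voice quality.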

An Approach for Blind Source Separation using the Sliding DFT and Time Domain Independent Component Analysis

The ''cocktail party problem'' is well known as one of the human auditory abilities: we can recognize a specific sound we want to listen to even if many undesirable sounds or noises are mixed with it. Blind source separation (BSS) based on independent component analysis (ICA) is one of the methods by which a particular signal can be separated from mixed signals under simple hypotheses. In this paper, we propose an online approach to blind source separation using the sliding DFT and time-domain independent component analysis. The proposed method reduces computational complexity compared with conventional methods and can be applied to parallel processing using, for example, digital signal processors (DSPs). We evaluate this method and show its effectiveness.
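
The sliding DFT at the heart of the proposal updates each frequency bin recursively in O(1) per sample instead of recomputing an N-point transform at every step. A minimal sketch of that recurrence (illustrative only, not the authors' code):

```python
import cmath

def dft_bin(frame, k):
    """Direct DFT of bin k for a frame of length N (used to initialise)."""
    N = len(frame)
    return sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))

def sliding_dft(signal, N, k):
    """Track bin k of an N-point DFT over a window sliding one sample at a time.

    Update rule: X_new = (X_old - oldest_sample + newest_sample) * exp(j*2*pi*k/N),
    which costs O(1) per sample instead of O(N) for a full recomputation.
    Returns one bin value per window position.
    """
    w = cmath.exp(2j * cmath.pi * k / N)
    X = dft_bin(signal[:N], k)          # initialise with one direct DFT
    outputs = [X]
    for n in range(N, len(signal)):
        X = (X - signal[n - N] + signal[n]) * w
        outputs.append(X)
    return outputs
```

Because each bin is updated independently, the bins can be distributed across parallel processing units, which is the property the abstract exploits for DSP implementations.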

Bangla Vowel Characterization Based on Analysis by Synthesis

Bangla vowel characterization determines the spectral properties of Bangla vowels for efficient synthesis as well as recognition of Bangla vowels. In this paper, Bangla vowels in isolated words have been analyzed based on a speech production model within the framework of analysis-by-synthesis. This has led to the extraction of spectral parameters of the production model in order to produce different Bangla vowel sounds. The real and synthetic spectra are compared, and a weighted square error has been computed along with the error in the formant bandwidths for efficient representation of Bangla vowels. The extracted features produced a good representation of the targeted Bangla vowels. Such a representation also plays an essential role in low bit rate speech coding and vocoders.

Using an HMM-based Classifier Adapted to Background Noises with Improved Sound Features for Audio Surveillance Applications

Discrimination between different classes of environmental sounds is the goal of our work. A sound recognition system offers concrete potential for surveillance and security applications. The paper's first contribution to this research field is a thorough investigation of the applicability of state-of-the-art audio features to environmental sound recognition. Additionally, a set of novel features obtained by combining the basic parameters is introduced. The quality of the investigated features is evaluated with an HMM-based classifier, to which particular attention is given. In fact, we propose a multi-style training system based on HMMs: one recognizer is trained on a database including different levels of background noise and is used as a universal recognizer for every environment. To enhance the system's robustness by reducing environmental variability, we explore different adaptation algorithms, including Maximum Likelihood Linear Regression (MLLR), Maximum A Posteriori (MAP) adaptation and the MAP/MLLR algorithm that combines the two. Experimental evaluation shows that a rather good recognition rate can be reached, even under heavy noise degradation, when the system is fed the appropriate set of features.
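
As an illustration of the adaptation step, MAP adaptation of a Gaussian mean interpolates between the prior mean (from the multi-style-trained model) and the statistics of the new environment's data. A toy sketch of the mean update only (the relevance factor tau and this simplified form are assumptions, not the paper's exact formulation):

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP adaptation of a Gaussian mean vector with relevance factor tau.

    mu_map = (tau * mu_prior + sum(frames)) / (tau + n):
    with few adaptation frames the prior dominates; with many frames the
    new data dominates, smoothly shifting the model toward the environment.
    """
    n = len(frames)
    dims = len(prior_mean)
    sums = [sum(f[d] for f in frames) for d in range(dims)]
    return [(tau * m + s) / (tau + n) for m, s in zip(prior_mean, sums)]
```

MLLR, by contrast, estimates a shared affine transform of the means, which helps when adaptation data is too scarce to update each Gaussian individually; the MAP/MLLR combination mentioned in the abstract applies MLLR first and then refines with MAP.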

A Neural Model of Object Naming

One astonishing capability of humans is to recognize thousands of different objects visually and to learn the semantic association between those objects and the words referring to them. This work is an attempt to build a computational model of this capacity, simulating the process by which infants learn to recognize objects and words through exposure to visual stimuli and vocal sounds. One of the main facts shaping the brain of a newborn is that lights and colors come from entities in the world. Gradually the visual system learns which light sensations belong to the same entities, despite large changes in appearance. This experience is common to humans and several other mammals, such as non-human primates. But only humans can recognize a huge variety of objects, most manufactured by humans themselves, and make use of sounds to identify and categorize them. The aim of this model is to reproduce these processes in a biologically plausible way, by reconstructing the essential hierarchy of cortical circuits along the visual and auditory neural pathways.

Comparison of Fricative Vocal Tract Transfer Functions Derived using Two Different Segmentation Techniques

The acoustic and articulatory properties of fricative speech sounds are being studied using magnetic resonance imaging (MRI) and acoustic recordings from a single subject. Area functions were derived from a complete set of axial and coronal MR slices using two different methods: the Mermelstein technique and the Blum transform. Area functions derived from the two techniques were shown to differ significantly in some cases. Such differences will lead to different acoustic predictions and it is important to know which is the more accurate. The vocal tract acoustic transfer function (VTTF) was derived from these area functions for each fricative and compared with measured speech signals for the same fricative and same subject. The VTTFs for /f/ in two vowel contexts and the corresponding acoustic spectra are derived here; the Blum transform appears to show a better match between prediction and measurement than the Mermelstein technique.

Traditional Thai Musical Instrument for Tablet Computers – Ranaad Ek

This paper proposes an architectural and graphical user interface (GUI) design of a traditional Thai musical instrument application for tablet computers, for practicing the "Ranaad Ek", a trough-resonated keyboard percussion instrument. The application provides percussion methods for the player that are as realistic as a physical instrument. The application consists of two playing modes. The first is a free playing mode, in which a player can freely multi-touch the wooden bars to produce instrument sounds. The second is a practicing mode that guides the player to follow the percussion strokes and rhythms of practice songs. The application meets its requirements and specifications.

Comparison of MFCC and Cepstral Coefficients as a Feature Set for PCG Biometric Systems

Heart sound is an acoustic signal, and many techniques used nowadays for human recognition tasks borrow from speech recognition. One popular choice for feature extraction from acoustic signals is the Mel Frequency Cepstral Coefficients (MFCC), which map the signal onto a non-linear Mel scale that mimics human hearing. However, the Mel scale is almost linear in the frequency region of heart sounds and thus should produce results similar to the standard cepstral coefficients (CC). In this paper, MFCC is investigated to see if it produces superior results for a PCG based human identification system compared to CC. Results show that the MFCC system is still superior to CC despite the near-linear filter banks in the lower frequency range, giving up to 95% correct recognition rate for MFCC and 90% for CC. Further experiments show that the high recognition rate is due to the implementation of filter banks and not to Mel scaling.
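
To see why the comparison is interesting, note that the mel mapping is nearly linear at heart-sound frequencies (below a few hundred hertz) but strongly compressive higher up. A small sketch of the standard mel curve and a plain real-cepstrum computation (illustrative background, not the paper's pipeline):

```python
import cmath
import math

def hz_to_mel(f):
    """Standard mel mapping; nearly linear below ~1 kHz, logarithmic above."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def real_cepstrum(frame):
    """Real cepstrum of a frame: inverse DFT of the log magnitude spectrum.

    The standard CC feature set is built from the low-order ("low quefrency")
    values of this sequence. O(N^2) direct transforms keep the sketch simple.
    """
    N = len(frame)
    spec = [sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]
    log_mag = [math.log(abs(X) + 1e-12) for X in spec]
    return [sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)).real / N for n in range(N)]
```

Since heart sounds occupy roughly the bottom 200 Hz, mel filter banks there are spaced almost uniformly, which is exactly the abstract's point: any MFCC advantage must come from the filter-bank smoothing itself, not from the warping.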

Development System for Emotion Detection Based on Brain Signals and Facial Images

Detection of human emotions has many potential applications. One application is to quantify audience attentiveness in order to evaluate the acoustic quality of a concert hall, complementing the subjective audio preference ratings collected from the audience. To obtain a fair evaluation of acoustic quality, this research proposes a system for multimodal emotion detection: one modality is based on brain signals measured with an electroencephalogram (EEG), and the second is sequences of facial images. In the experiment, a customized audio signal consisting of normal and disordered sounds was played in order to stimulate positive or negative emotional feedback from the volunteers. EEG signals from the temporal lobes (electrodes T3 and T4) were used to measure the brain response, and sequences of facial images were used to monitor facial expression while the volunteers listened to the audio signal. From the EEG signal, features were extracted from changes in the brain waves, particularly the alpha and beta waves. Facial expression features were extracted by analyzing motion images: an advanced optical flow method detects the most active facial muscles as the face moves from a neutral to another emotional expression, represented as vector flow maps. To simplify the detection of the emotional state, the vector flow maps are transformed into a compass mapping that represents the major directions and velocities of facial movement. The results showed that the power of the beta wave increases when the disordered sound stimulation is given, although each volunteer gave different emotional feedback. Based on the features derived from the facial images, the optical flow compass mapping is promising as additional information for deciding on the emotional feedback.
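
The compass-mapping step can be pictured as quantizing flow vectors into a handful of directional bins weighted by velocity. A toy sketch (the bin count and magnitude weighting are assumptions, not the authors' exact mapping):

```python
import math

def compass_map(flow_vectors, n_dirs=8):
    """Quantize optical-flow vectors into compass-direction bins.

    Each (dx, dy) vector votes into one of n_dirs angular sectors, weighted
    by its magnitude, so the histogram summarizes the major directions and
    velocities of facial movement in a fixed-size descriptor.
    """
    hist = [0.0] * n_dirs
    sector = 2.0 * math.pi / n_dirs
    for dx, dy in flow_vectors:
        mag = math.hypot(dx, dy)
        if mag == 0.0:
            continue                      # static pixels carry no vote
        ang = math.atan2(dy, dx) % (2.0 * math.pi)
        hist[int(ang / sector) % n_dirs] += mag
    return hist
```

Collapsing a dense flow field into such a histogram is what makes per-frame emotion decisions tractable compared with reasoning over raw vector flow maps.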

The Main Principles of Text-to-Speech Synthesis System

In this paper, the main principles of a text-to-speech synthesis system are presented. The problems which arise when developing a speech synthesis system are described. The approaches used and their application in speech synthesis systems for the Azerbaijani language are shown.

Multiclass Support Vector Machines for Environmental Sound Classification Using log-Gabor Filters

In this paper we propose a robust environmental sound classification approach based on spectrogram features derived from log-Gabor filters. This approach includes two methods. In the first method, the spectrograms are passed through an appropriate log-Gabor filter bank; the outputs are averaged and undergo an optimal feature selection procedure based on a mutual information criterion. The second method uses the same steps but applies them only to three patches extracted from each spectrogram. To investigate the accuracy of the proposed methods, we conduct experiments using a large database containing 10 environmental sound classes. The classification results based on multiclass Support Vector Machines show that the second method is the more efficient, with an average classification accuracy of 89.62%.
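
A log-Gabor filter is a Gaussian on a log-frequency axis, which, unlike an ordinary Gabor filter, has no DC response. A hypothetical one-dimensional sketch of such a bank applied to one spectrogram column (the filter parameters and the averaging are illustrative assumptions, not the paper's configuration):

```python
import math

def log_gabor(f, f0, sigma_ratio=0.65):
    """Log-Gabor frequency response: a Gaussian on a log-frequency axis.

    Undefined at f = 0, where the response is set to 0, so the filter has
    no DC component by construction.
    """
    if f <= 0.0:
        return 0.0
    return math.exp(-(math.log(f / f0) ** 2) / (2.0 * math.log(sigma_ratio) ** 2))

def filter_bank_features(spectrum_column, centre_bins):
    """Average log-Gabor responses for one spectrogram column.

    One averaged output per filter centre; in the paper's scheme such
    outputs would then pass through mutual-information feature selection
    before the multiclass SVM.
    """
    n = len(spectrum_column)
    feats = []
    for f0 in centre_bins:
        acc = sum(log_gabor(i, f0) * spectrum_column[i] for i in range(n))
        feats.append(acc / n)
    return feats
```

The patch-based second method would apply the same bank to three local regions of the spectrogram instead of whole columns, preserving time-frequency locality.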

The Design and Implementation of Classifying Bird Sounds

The purpose of this Classifying Bird Sounds (chip notes) project is to reduce unwanted noise from recorded bird-sound chip notes, to design a scheme to detect differences and similarities between recorded chip notes, and to classify bird-sound chip notes. Technologies for determining the similarity of sound waves have been used in communication, sound engineering and wireless sound applications for many years. Our research focuses on the similarity of chip notes, which are the sounds from different birds. The program we use is written in Microsoft C#.
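
One classical building block for detecting similarities between recorded waveforms is the peak of the normalized cross-correlation. A toy sketch of that measure (a generic stand-in, not necessarily the similarity scheme the project implements):

```python
import math

def normalized_xcorr_peak(a, b, max_lag=8):
    """Peak normalized cross-correlation between two clips over small lags.

    Values near 1.0 mean the two chip notes are close to time-shifted
    copies of each other; values near 0 mean little waveform similarity.
    """
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    a = [v - ma for v in a]                      # remove DC offsets
    b = [v - mb for v in b]
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0.0 or nb == 0.0:
        return 0.0                               # a silent clip matches nothing
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        dot = sum(a[i] * b[i + lag] for i in range(len(a))
                  if 0 <= i + lag < len(b))
        best = max(best, dot / (na * nb))
    return best
```

Searching over a small lag range makes the comparison tolerant of the slight timing differences between two recordings of the same chip note.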

Trispectral Analysis of Voiced Sounds in Defective Audition and Tracheotomy Cases

This paper presents the cepstral and trispectral analysis of speech signals produced by normal men, men with defective audition (deaf, profoundly deaf) and others who have undergone a tracheotomy; the trispectral analysis is based on parametric (autoregressive, AR) methods using the fourth-order cumulant. These analyses are used to detect and compare the pitch and the formants of the corresponding voiced sounds (vowels \a\, \i\ and \u\). The first results appear promising: after several experiments, it seems there is no deformation of the spectrum, as one might have supposed at the beginning. However, these pathologies influence the two characteristics differently: defective audition affects the formants, whereas tracheotomy affects the fundamental frequency (pitch).

Automatic Recognition of an Unknown and Time-Varying Number of Simultaneous Environmental Sound Sources

The present work addresses the problem of automatic enumeration and recognition of an unknown and time-varying number of simultaneous environmental sound sources using a single microphone. The assumption made is that the recorded sound is a realization of sound sources belonging to a group of audio classes that is known a priori. We describe two variations of the same principle, which is to calculate the distance between the current unknown audio frame and all possible combinations of the classes that are assumed to span the sound scene. We concentrate on categorizing environmental sound sources, such as birds, insects, etc., for the task of monitoring the biodiversity of a specific habitat.
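
The combinatorial principle can be sketched naively: model every subset of classes by combining per-class prototypes and pick the subset closest to the observed frame, which yields both the number and the identity of active sources. A toy illustration (summing prototype feature vectors and using Euclidean distance are assumptions; the paper's actual class models and distance differ):

```python
from itertools import combinations

def best_combination(frame, prototypes, max_active=3):
    """Pick the class combination whose combined prototype best matches `frame`.

    `prototypes` maps class name -> feature vector. Every non-empty subset
    of up to `max_active` classes is modelled (naively) as the sum of its
    prototypes; the subset with the smallest Euclidean distance to the
    observed frame wins.
    """
    names = list(prototypes)
    best, best_d = None, float("inf")
    for r in range(1, max_active + 1):
        for combo in combinations(names, r):
            model = [sum(prototypes[c][i] for c in combo)
                     for i in range(len(frame))]
            d = sum((a - b) ** 2 for a, b in zip(frame, model)) ** 0.5
            if d < best_d:
                best, best_d = combo, d
    return best, best_d
```

The cost grows with the number of subsets, which is why the assumption of an a-priori known, reasonably small class group is essential to the approach.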

Single Input ANC for Suppression of Breath Sound

Auscultation sound includes various sounds generated in the chest. The adaptive noise canceller (ANC) is a useful technique for biomedical signals, but it is not directly suitable for auscultation sound, because the ANC needs two input channels, a primary and a reference signal, whereas a stethoscope provides just one input sound. Therefore, this paper proposes a Single Input ANC (SIANC) for suppression of the breath sound in a cardiac auscultation sound. For the SIANC, a reference generation system is proposed that includes a heart sound detector, a controller and a reference generator. By experiment and comparison, it was confirmed that the proposed SIANC is efficient for heart sound enhancement and is independent of variations in the heartbeat.

Vocal Communication in the Sooty-headed Bulbul, Pycnonotus aurigaster

Studies of vocal communication in the Sooty-headed Bulbul were carried out from January to December 2011. Vocal recordings and behavioral observations were made in natural habitats at several localities in Lampang, Thailand. After editing, high-quality cuts of the recordings were analyzed with the Avisoft-SASLab Pro (version 4.40) software. More than one thousand element repertoires in five groups were found within two vocal structures: short sounds with a single element, and phrases composed of elements; the frequency ranged from 1-10 kHz. Most phrases were composed of 2 to 5 elements that were often dissimilar in structure; however, these phrases were not as complex as song phrases. The elements and phrases were combined to form many patterns. The species used ten types of calls: alert, alarm, aggressive, begging, contact, courtship, distress, excitement, flying and invitation. Alert and contact calls were used more frequently than the other calls. Aggressive, alarm and distress calls could be used for interspecific communication with some other bird species in the same habitats.

Explorations in the Role of Emotion in Moral Judgment

Recent theorizing about the cognitive process of moral judgment has focused on the role of intuitions and emotions, marking a departure from the previous emphasis on conscious, step-by-step reasoning. My study investigated how being in a disgusted mood state affects moral judgment. Participants were induced into a disgusted mood state by listening to disgusting sounds and reading disgusting descriptions. Results show that, compared to controls who had not been induced to feel disgust, they are more likely to endorse actions that are emotionally aversive but maximize utilitarian return. The result is analyzed using the 'emotion-as-information' approach to decision making and is consistent with the view that emotions play an important role in determining moral judgment.

Sounds Alike Name Matching for Myanmar Language

A personal name matching system is a core task in national citizen databases, text and web mining, information retrieval, online library systems, e-commerce and record linkage systems, which has necessitated wide-ranging research on name matching. Traditional name matching methods are suitable for English and other Latin-based languages. Asian languages that have no word boundary, such as Myanmar, still require a sounds-alike matching system in Unicode-based applications. Hence we propose a matching algorithm to obtain analogous sounds-alike (phonetic) patterns suited to Myanmar character spelling. According to the nature of Myanmar characters, we consider word boundary fragmentation and character collation. Thus we use a pattern conversion algorithm which builds words into fragmented and collated patterns. We create Myanmar sounds-alike phonetic groups to help in the phonetic matching. The experimental results show a fragmentation accuracy of 99.32% and a processing time of 1.72 ms.
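
The idea of matching via phonetic groups can be sketched generically: map characters that sound alike to a shared group label and compare the resulting keys. A toy illustration with a made-up Latin-letter grouping (the actual system operates on Unicode Myanmar text with its own phonetic groups, fragmentation and collation rules):

```python
def phonetic_key(word, groups):
    """Map each character to its sounds-alike group id and collapse repeats.

    `groups` is a dict from character to group label; characters sharing a
    label are treated as phonetically equivalent, and characters outside
    any group (here, vowels) are ignored, soundex-style.
    """
    key = []
    for ch in word.lower():
        g = groups.get(ch)
        if g is not None and (not key or key[-1] != g):
            key.append(g)
    return "".join(key)

# Hypothetical Latin-letter grouping, purely for demonstration.
TOY_GROUPS = {"b": "1", "p": "1", "f": "1", "v": "1",
              "c": "2", "k": "2", "q": "2", "g": "2",
              "d": "3", "t": "3",
              "m": "4", "n": "4",
              "l": "5", "r": "5"}

def sounds_alike(a, b):
    """Two names match when their phonetic keys are identical."""
    return phonetic_key(a, TOY_GROUPS) == phonetic_key(b, TOY_GROUPS)
```

Because keys are short and precomputable, a database can index names by key and retrieve all sounds-alike candidates in one lookup, which is what makes this practical for record linkage at citizen-database scale.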