Voice Features as the Diagnostic Marker of Autism

The aim of the study is to determine the acoustic features of the voice and speech of children with autism spectrum disorders (ASD) as a possible additional diagnostic criterion. The participants in the study were 95 children with ASD aged 5-16 years, 150 typically developing (TD) children, and 103 adults who listened to the children’s speech samples. Three types of speech analysis were performed: spectrographic, perceptual (by listeners), and automatic recognition. In the speech of children with ASD, the pitch values, the pitch range, and the frequency and intensity of the third (emotional) formant, which lead to the “atypical” spectrogram of vowels, are higher than the corresponding parameters in the speech of TD children. High values of the vowel articulation index (VAI) are specific to the speech signals of children with ASD. These acoustic features can be considered a diagnostic marker of autism. The ability of human listeners and of automatic recognition to identify the psychoneurological state of children from their speech is determined.
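As a rough illustration of the vowel articulation index mentioned above, the sketch below uses the formulation commonly given in the literature, VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/). The abstract does not state which formulation the study used, so this is an assumption, and the formant values are invented placeholders, not data from the study.

```python
def vowel_articulation_index(f1_a, f2_a, f1_i, f2_i, f1_u, f2_u):
    """Return the VAI from F1/F2 (in Hz) of the corner vowels /a/, /i/, /u/.

    Assumed formulation: (F2i + F1a) / (F1i + F1u + F2u + F2a).
    """
    denominator = f1_i + f1_u + f2_u + f2_a
    if denominator <= 0:
        raise ValueError("formant frequencies must be positive")
    return (f2_i + f1_a) / denominator


# Hypothetical formant values in Hz (illustrative only, not from the study).
example_vai = vowel_articulation_index(
    f1_a=900, f2_a=1500, f1_i=400, f2_i=2800, f1_u=450, f2_u=1000
)

# A more centralized (reduced) vowel space yields a smaller index.
reduced_vai = vowel_articulation_index(
    f1_a=700, f2_a=1500, f1_i=500, f2_i=2200, f1_u=500, f2_u=1300
)
```

Under this formulation, a larger index corresponds to corner vowels that lie further apart on the F1-F2 plane.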

Uvulars Alternation in Hasawi Arabic: A Harmonic Serialism Approach

This paper investigates a phonological phenomenon in Hasawi Arabic: the alternation between the uvular consonants [q] and [ʁ]. This dialect is spoken in Alahsa city, which is located in the Eastern Province of Saudi Arabia. To the best of our knowledge, no research has systematically studied this phenomenon in the Hasawi Arabic dialect, so this paper fills a gap in the literature on this understudied dialect. A large part of the data is extracted from several interviews the author conducted with 10 participants, all native speakers of the dialect, and is complemented by additional forms from social media. The latter method of data collection adds to the significance of the research. The data are analyzed in Harmonic Serialism Optimality Theory (HS-OT), a version of the Optimality Theoretic (OT) framework, which holds that linguistic forms are the outcome of the interaction among violable universal constraints, and a recent development of OT into a model that accounts for linguistic variation in harmonic derivational steps. This alternation is assumed to be phonologically unconditioned and in free variation in other varieties of Arabic in the area. The goal of this paper is to investigate whether the alternation is free or governed, what governs the alternation between [q] and [ʁ], and whether it is phonological or whether other linguistic constraints are at work. The results show that the [q] and [ʁ] alternation is not free and that it results from different assimilation processes. Positional, segmental-sequence, and vowel-adjacency factors are at work in Hasawi Arabic.

The Role of Ideophones: Phonological and Morphological Characteristics in Literature

Many Asian languages, such as Korean and Japanese, are well known for their wide use of sound-symbolic words, or ideophones, a characteristic that greatly enriches their lexicons. Ideophones are a class of words that use sound symbolism to express aspects, states, emotions, or conditions that can be experienced through the senses, such as shape, color, smell, action, or movement. Ideophones have particular characteristics in terms of sound symbolism and morphology that distinguish them from other words. Their phonological characteristics are vowel ablaut (vowel gradation) and consonant mutation. Korean distinguishes light vowels and dark vowels, and the meaning changes slightly depending on the type of vowel used. Consonant mutation, also known as consonant ablaut, contributes to the level of intensity, emphasis, and volume of an expression. In addition to these phonological characteristics, there is one main morphological singularity, reduplication, which carries the meaning of continuity, repetition, intensity, emphasis, and plurality. All these characteristics play an important role in both linguistics and literature, as they enhance the intended meaning with semantic detail, expressiveness, and rhythm. After presenting the phonological and morphological characteristics of Korean ideophones, this study analyzes the ideophones used in a single paragraph of a Korean novel, where they add subtle detail to the meaning of the words and advance the expressiveness and rhythm of the text. The results of this analysis evidence the important role that ideophones play in literature.

Perceptual and Ultrasound Articulatory Training Effects on English L2 Vowels Production by Italian Learners

The American English contrast /ɑ-ʌ/ (cop-cup) is difficult for Italian learners to produce, since they realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively, due to differences in the phonetic-phonological systems and in grapheme-to-phoneme conversion rules. In this paper, we try to answer the following research questions: Can a short training improve the production of English /ɑ-ʌ/ by Italian learners? Is a perceptual training better than an articulatory (ultrasound, US) training? We therefore compare a perceptual training with a US articulatory one to observe: 1) the effects of short trainings on L2-/ɑ-ʌ/ productions; 2) whether the US articulatory training improves pronunciation more than the perceptual training. Nine Salento-Italian monolingual adults participated in this pilot study: 3 subjects performed a 1-hour perceptual training (ES-P); 3 subjects performed a 1-hour US training (ES-US); and 3 control subjects did not receive any training (CS). Verbal instructions about the phonetic properties of L2-/ɑ-ʌ/ and L1-/ɔ-a/ and their differences (representation on the F1-F2 plane) were provided during both trainings. After these instructions, the ES-P group performed an identification training based on the High Variability Phonetic Training procedure, while the ES-US group performed the articulatory training by means of US videos of tongue gestures in L2-/ɑ-ʌ/ production and a dynamic view of their own tongue movements and position, obtained with a probe under the chin. The acoustic data were analyzed and the first three formants were calculated. Independent t-tests were run to compare: 1) /ɑ-ʌ/ in pre- vs. post-test respectively; 2) /ɑ-ʌ/ in pre- and post-test vs. L1-/a-ɔ/ respectively. Results show that in the pre-test all speakers realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively. Unlike the CS and ES-P groups, the ES-US group in the post-test differentiates the L2 vowels from those produced in the pre-test as well as from the L1 vowels, although only one ES-US subject produces both L2 vowels accurately. The articulatory training seems more effective than the perceptual one, since it moves vowel production toward the L2 targets and away from the similar L1 vowels.

Absence of Developmental Change in Epenthetic Vowel Duration in Japanese Speakers’ English

This study examines developmental change in the production of epenthetic vowels by Japanese learners of English in relation to the acquisition of L2 English speech rhythm. Seventy-two Japanese learners of English in the J-AESOP corpus were divided into lower- and higher-level learners according to their proficiency score and the frequency of vowel epenthesis. Three learners were excluded because no vowel epenthesis was observed in their utterances. The analysis of their read English speech data showed no statistically significant difference between lower- and higher-level learners, implying the absence of any developmental change in the duration of epenthetic vowels. This result, together with the findings of previous studies, will be discussed in relation to the transfer of L1 phonology and the manifestation of L2 English rhythm.

Co-Articulation between Consonant and Vowel in Cantonese Syllables

This study investigates C-V and V-C co-articulation in Cantonese monosyllables of the CV, VC, or CVC structure, with C = one of the three stop consonants [p, t, k] and V = one of the three corner vowels [i, a, u]. Five repetitions of each test syllable on a randomized list were elicited from young adult Cantonese speakers in their early 20s. An EMA AG500 system was used to record synchronized audio signals and articulatory data at three locations on the tongue (tongue tip, tongue middle, and tongue back) and the positions of the upper and lower lips during the test syllables. The main findings, based on the articulatory data collected from two male Cantonese speakers, are as follows: (i) For the syllable-initial [p-], strong co-articulation is observed when [p-] precedes the high vowel [i] or [u], but not the low vowel [a]. The syllable-final [-p] is strongly co-articulated with the preceding vowel, even when the vowel is [a]. (ii) The co-articulation between the initial [t-] and the following vowel of any type is weak. In the syllable-final position, the degree of co-articulatory resistance of [-t] is also large when it follows the vowel [u], but [-t] is largely co-articulated with the preceding vowel when the vowel is [i] or [a]. (iii) The strength of co-articulation differs according to the vowel that the initial [k-] precedes. Co-articulation is stronger between [k-] and [i] than between [k-] and [u], and it is much reduced between [k-] and [a]. However, in the syllable-final position, there is strong co-articulation between [-k] and the preceding vowel [a]. (iv) Among the three stop consonants in the syllable-initial position, the degree of co-articulatory resistance (CR) decreases in the order [t-] > [k-] > [p-], and the degree of CR is reduced for all three stops in the syllable-final position. In general, the data on co-articulation between consonant and vowel in the Cantonese monosyllables are similar to those reported for other languages in previous studies.

Speech Enhancement of Vowels Based on Pitch and Formant Frequency

Numerous signal-processing-based speech enhancement systems have been proposed to improve intelligibility in the presence of noise. Traditionally, studies of neural vowel encoding have focused on the representation of formants (peaks in vowel spectra) in the discharge patterns of the population of auditory-nerve (AN) fibers. A method is presented for recoding high-frequency speech components into a low-frequency region to increase audibility for listeners with hearing loss. The purpose of the paper is to enhance the formants of the speech based on the Kaiser window. The pitch and formants of the signal are estimated using autocorrelation, zero-crossing, and magnitude difference functions. The formant enhancement stage aims to restore the representation of formants at the level of the midbrain. A low-complexity implementation of the system is developed in MATLAB.
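The three pitch cues named above (autocorrelation, zero crossing, and magnitude difference) can be sketched as follows. This is a minimal textbook-style illustration in plain Python rather than the paper's MATLAB implementation; the function names, the 60-400 Hz search range, and the synthetic 200 Hz test tone are all assumptions made for the example.

```python
import math


def zero_crossing_count(frame):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))


def amdf(frame, lag):
    """Average magnitude difference between the frame and a lagged copy."""
    n_terms = len(frame) - lag
    return sum(abs(frame[n] - frame[n + lag]) for n in range(n_terms)) / n_terms


def autocorrelation_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate pitch (Hz) as the lag with the largest autocorrelation."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)

    def acf(lag):
        return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

    best_lag = max(range(lag_min, lag_max + 1), key=acf)
    return sample_rate / best_lag


# Synthetic 200 Hz tone: 50 ms at 8 kHz, with a small phase offset so that
# no sample falls exactly on a zero crossing.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE + 0.3) for n in range(400)]

pitch_hz = autocorrelation_pitch(frame, SAMPLE_RATE)
crossings = zero_crossing_count(frame)
```

For a periodic frame, the autocorrelation peaks and the magnitude difference dips at the lag equal to one pitch period, while the zero-crossing count gives a cheap cue that is commonly used to separate voiced from unvoiced frames.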

The Syllabic Scrutiny of Word Stress in Najdi Saudi Arabic

This study provides a brief description of stress in the Najdi Arabic dialect as well as in Modern Standard Arabic. Beyond the analysis of stress patterns, this paper also attempts to deal with three important phenomena that affect stress, namely epenthesis (insertion), vowel shortening, and consonant (glottal stop) deletion.

Prediction of Writer Using Tamil Handwritten Document Image Based on Pooled Features

A Tamil handwritten document is taken as the key source of data for identifying the writer. Tamil is a classical language with 247 characters, including compound characters, consonants, vowels, and a special character. Most Tamil characters are multifaceted in nature. Handwriting is a unique feature of an individual. Writers may change their handwriting according to their frame of mind, which poses a serious challenge in identifying the writer. A new discriminative model with pooled features of handwriting is proposed and implemented using a support vector machine. The classification models based on RBF and polynomial kernels are reported to achieve 100% prediction accuracy.

Bangla Vowel Characterization Based on Analysis by Synthesis

Bangla vowel characterization determines the spectral properties of Bangla vowels for efficient synthesis as well as recognition of Bangla vowels. In this paper, Bangla vowels in isolated words are analyzed based on a speech production model within the framework of Analysis-by-Synthesis. This has led to the extraction of spectral parameters for the production model in order to produce different Bangla vowel sounds. The real and synthetic spectra are compared, and a weighted square error is computed along with the error in the formant bandwidths for efficient representation of Bangla vowels. The extracted features produce a good representation of the targeted Bangla vowels. Such a representation also plays an essential role in low-bit-rate speech coding and vocoders.

Comparison of Fricative Vocal Tract Transfer Functions Derived using Two Different Segmentation Techniques

The acoustic and articulatory properties of fricative speech sounds are being studied using magnetic resonance imaging (MRI) and acoustic recordings from a single subject. Area functions were derived from a complete set of axial and coronal MR slices using two different methods: the Mermelstein technique and the Blum transform. Area functions derived from the two techniques were shown to differ significantly in some cases. Such differences will lead to different acoustic predictions and it is important to know which is the more accurate. The vocal tract acoustic transfer function (VTTF) was derived from these area functions for each fricative and compared with measured speech signals for the same fricative and same subject. The VTTFs for /f/ in two vowel contexts and the corresponding acoustic spectra are derived here; the Blum transform appears to show a better match between prediction and measurement than the Mermelstein technique.

Improved Zero Text Watermarking Algorithm against Meaning Preserving Attacks

The Internet is largely composed of textual content, and a huge volume of digital content is circulated over it daily. The ease of sharing and reproducing information has made it difficult to preserve the author's copyright. Digital watermarking emerged after 1993 as a solution to the problem of copyright protection for plain text. In this paper, we propose a zero text watermarking algorithm based on the occurrence frequency of non-vowel ASCII characters and words for copyright protection of plain text. The embedding algorithm uses the frequency of non-vowel ASCII characters and words to generate a specialized author key. The extraction algorithm uses this key to extract the watermark and hence identify the original copyright owner. Experimental results illustrate the effectiveness of the proposed algorithm on text subjected to meaning-preserving attacks performed by five independent attackers.
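To make the zero-watermarking idea concrete, the toy sketch below derives a key from the text itself and leaves the text unmodified. It is not the authors' algorithm, whose details the abstract does not give: the letter profile, the XOR step, and all function names are assumptions made for illustration, and the word-frequency part of the scheme is left out.

```python
from collections import Counter
import string

VOWELS = set("aeiou")


def non_vowel_profile(text, top_n=5):
    """Return the top_n most frequent non-vowel ASCII letters as a string."""
    letters = [
        c for c in text.lower()
        if c in string.ascii_lowercase and c not in VOWELS
    ]
    counts = Counter(letters)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return "".join(letter for letter, _ in ranked[:top_n])


def _xor_with_profile(data, text, top_n):
    profile = non_vowel_profile(text, top_n).encode("ascii")
    if not profile:
        raise ValueError("text contains no non-vowel ASCII letters")
    return bytes(b ^ profile[i % len(profile)] for i, b in enumerate(data))


def generate_key(text, watermark, top_n=5):
    """Combine the watermark with the text's profile; the text is unchanged."""
    return _xor_with_profile(watermark.encode("utf-8"), text, top_n)


def extract_watermark(text, key, top_n=5):
    """Recover the watermark from a (possibly attacked) text and the key."""
    return _xor_with_profile(key, text, top_n).decode("utf-8", errors="replace")


original = "The quick brown fox jumps over the lazy dog"
key = generate_key(original, "Alice")

# A reordering attack keeps every letter count, so the profile is unchanged.
reordered = "over the lazy dog the quick brown fox jumps"
recovered = extract_watermark(reordered, key)
```

Because the key depends only on letter frequencies in this sketch, attacks that reorder words leave extraction intact, whereas heavy insertion or deletion would change the profile and corrupt the recovered watermark.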

The Algorithm of Semi-Automatic Thai Spoonerism Words for Bi-Syllable

The purposes of this research are to study and develop an algorithm for Thai spoonerism words using a semi-automatic computer program: on the input side, the syllables are already separated, and on the spoonerism side, the developed algorithm is applied. The algorithm establishes rules and mechanisms for bi-syllable Thai spoonerism by analyzing the elements of the syllables, namely cluster consonant, vowel, intonation mark, and final consonant. The study finds that bi-syllable Thai spoonerism has one mechanism, namely transposition of the vowel, intonation mark, and final consonant of the two syllables while keeping the initial consonant and cluster (if any). These rules and mechanisms were then used to develop Thai spoonerism software in PHP. In a performance test, the program performed bi-syllable Thai spoonerism correctly for 99% of the words used in the test; the remaining 1% were faults, because the words obtained from spoonerism may not be spelled in conformity with Thai grammar and a Thai spoonerism can have more than one answer.
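The single transposition mechanism described above can be sketched as follows. This is an illustration in Python, not the authors' PHP software; it assumes that the vowel, intonation mark, and final consonant swap between the two syllables while each initial consonant or cluster stays in place. The `Syllable` type and the romanized placeholder syllables are hypothetical, and input is taken as already segmented, as in the semi-automatic setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syllable:
    initial: str  # initial consonant or consonant cluster
    vowel: str
    tone: str     # intonation (tone) mark, "" if none
    final: str    # final consonant, "" if none


def spoonerize(first, second):
    """Swap vowel, tone mark, and final consonant; keep each initial in place."""
    new_first = Syllable(first.initial, second.vowel, second.tone, second.final)
    new_second = Syllable(second.initial, first.vowel, first.tone, first.final)
    return new_first, new_second


# Romanized placeholder syllables (hypothetical, not real Thai words).
syllable_a = Syllable(initial="kr", vowel="a", tone="", final="p")
syllable_b = Syllable(initial="m", vowel="ii", tone="2", final="n")

swapped_a, swapped_b = spoonerize(syllable_a, syllable_b)
```

Applying `spoonerize` to its own output restores the original pair, which is a convenient sanity check for the transposition rule. The sketch does not check whether the result is a valid Thai spelling, which is the source of the faults the abstract reports.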

Trispectral Analysis of Voiced Sounds Defective Audition and Tracheotomisian Cases

This paper presents the cepstral and trispectral analysis of speech signals produced by normal men, men with defective audition (deaf, profoundly deaf), and men affected by tracheotomy. The trispectral analysis is based on parametric (autoregressive, AR) methods using the fourth-order cumulant. These analyses are used to detect and compare the pitches and the formants of the corresponding voiced sounds (vowels /a/, /i/ and /u/). The first results appear promising: after several experiments, there seems to be no deformation of the spectrum, contrary to what one might have supposed at the beginning. However, these pathologies influence the two characteristics differently: defective audition influences the formants, whereas tracheotomy influences the fundamental frequency (pitch).

Automatic Voice Classification System Based on Traditional Korean Medicine

This paper introduces an automatic voice classification system for the diagnosis of individual constitution based on Sasang Constitutional Medicine (SCM) in Traditional Korean Medicine (TKM). To develop this algorithm, we used the voices of 309 female speakers and extracted a total of 134 speech features from voice data consisting of 5 sustained vowels and one sentence. The classification system, based on a rule-based algorithm derived from a nonparametric statistical method, presents 3 types of decisions: reserved, positive, and negative. In conclusion, 71.5% of the voice data were diagnosed by this system, of which 47.7% were correct positive decisions and 69.7% were correct negative decisions.

Formant Tracking Linear Prediction Model using HMMs for Noisy Speech Processing

This paper presents a formant-tracking linear prediction (FTLP) model for speech processing in noise. The main focus of this work is the detection of formant trajectories based on Hidden Markov Models (HMMs), for improved formant estimation in noise. The approach proposed in this paper provides a systematic framework for modelling and utilizing a time sequence of peaks that satisfies continuity constraints on the parameters; the within peaks are modelled by the LP parameters. The estimation of the formant-tracking LP model is composed of three stages: (1) a pre-cleaning multi-band spectral subtraction stage to reduce the effect of residual noise on the formants; (2) an estimation stage in which an initial estimate of the LP model of speech is obtained for each frame; and (3) a formant classification stage using probability models of formants and Viterbi decoders. The evaluation results for the estimation of the formant-tracking LP model, tested in a Gaussian white noise background, demonstrate that the proposed combination of the initial noise reduction stage with formant tracking and variable-order LPC analysis results in a significant reduction in errors and distortions. The performance was evaluated with noisy natural vowels extracted from international French and English vocabulary speech signals at an SNR of 10 dB. In each case, the estimated formants are compared to reference formants.

Initialization Method of Reference Vectors for Improvement of Recognition Accuracy in LVQ

The initial values of reference vectors have a significant influence on recognition accuracy in LVQ. Several existing techniques, such as SOM and k-means, can be used to set the initial values of reference vectors, and each has provided some positive results. However, those results are not sufficient for improving recognition accuracy. This study proposes an ACO-based method for initializing reference vectors, with the aim of achieving higher recognition accuracy than that obtained through conventional methods. Moreover, we demonstrate the effectiveness of the proposed method by applying it to the wine data and English vowel data and comparing its results with those of conventional methods.
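One way to see why the initial reference vectors matter is to look at a single LVQ1 update, sketched below in its standard textbook form. Each training sample moves only its nearest reference vector, so vectors that start in poor positions may rarely be selected. The function names and the learning rate are assumptions for illustration, and the proposed ACO-based initializer is not shown because the abstract does not describe it.

```python
def nearest_index(reference_vectors, x):
    """Index of the reference vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(reference_vectors)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(reference_vectors[i], x)),
    )


def lvq1_step(reference_vectors, labels, x, x_label, alpha=0.1):
    """Apply one LVQ1 update in place and return the index of the updated vector.

    The nearest reference vector moves toward x if its label matches x_label,
    and away from x otherwise.
    """
    i = nearest_index(reference_vectors, x)
    sign = 1.0 if labels[i] == x_label else -1.0
    reference_vectors[i] = [
        w + sign * alpha * (v - w) for w, v in zip(reference_vectors[i], x)
    ]
    return i


# Two hypothetical reference vectors with class labels "a" and "b".
refs = [[0.0, 0.0], [1.0, 1.0]]
ref_labels = ["a", "b"]

hit = lvq1_step(refs, ref_labels, x=[0.2, 0.0], x_label="a")   # attract
miss = lvq1_step(refs, ref_labels, x=[0.9, 1.0], x_label="a")  # repel
```

In this sketch the first sample pulls the matching vector slightly toward it, while the second pushes the mismatched vector away; an initializer (SOM, k-means, or the proposed ACO-based method) only decides where `refs` starts.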