Abstract: The aim of the study is to determine the acoustic features of voice and speech of children with autism spectrum disorders (ASD) as a possible additional diagnostic criterion. The participants in the study were 95 children with ASD aged 5-16 years, 150 typically development (TD) children, and 103 adults – listening to children’s speech samples. Three types of experimental methods for speech analysis were performed: spectrographic, perceptual by listeners, and automatic recognition. In the speech of children with ASD, the pitch values, pitch range, values of frequency and intensity of the third formant (emotional) leading to the “atypical” spectrogram of vowels are higher than corresponding parameters in the speech of TD children. High values of vowel articulation index (VAI) are specific for ASD children’s speech signals. These acoustic features can be considered as diagnostic marker of autism. The ability of humans and automatic recognition of the psychoneurological state of children via their speech is determined.
Abstract: This paper investigates a phonological phenomenon, which exhibits variation ‘alternation’ in terms of the uvular consonants [q] and [ʁ] in Hasawi Arabic. This dialect is spoken in Alahsa city, which is located in the Eastern province of Saudi Arabia. To the best of our knowledge, no such research has systematically studied this phenomenon in Hasawi Arabic dialect. This paper is significant because it fills the gap in the literature about this alternation phenomenon in this understudied dialect. A large amount of the data is extracted from several interviews the author has conducted with 10 participants, native speakers of the dialect, and complemented by additional forms from social media. The latter method of collecting the data adds to the significance of the research. The analysis of the data is carried out in Harmonic Serialism Optimality Theory (HS-OT), a version of the Optimality Theoretic (OT) framework, which holds that linguistic forms are the outcome of the interaction among violable universal constraints, and in the recent development of OT into a model that accounts for linguistic variation in harmonic derivational steps. This alternation process is assumed to be phonologically unconditioned and in free variation in other varieties of Arabic dialects in the area. The goal of this paper is to investigate whether this phenomenon is in free variation or governed, what governs this alternation between [q] and [ʁ] and whether the alternation is phonological or other linguistic constraints are in action. The results show that the [q] and [ʁ] alternation is not free and it occurs due to different assimilation processes. Positional, segmental sequence and vowel adjacency factors are in action in Hasawi Arabic.
Abstract: Many Asian languages, such as Korean and Japanese, are well-known for their wide use of sound symbolic words or ideophones. This is a very particular characteristic which enriches its lexicon hugely. Ideophones are a class of sound symbolic words that utilize sound symbolism to express aspects, states, emotions, or conditions that can be experienced through the senses, such as shape, color, smell, action or movement. Ideophones have very particular characteristics in terms of sound symbolism and morphology, which distinguish them from other words. The phonological characteristics of ideophones are vowel ablaut or vowel gradation and consonant mutation. In the case of Korean, there are light vowels and dark vowels. Depending on the type of vowel that is used, the meaning will slightly change. Consonant mutation, also known as consonant ablaut, contributes to the level of intensity, emphasis, and volume of an expression. In addition to these phonological characteristics, there is one main morphological singularity, which is reduplication and it carries the meaning of continuity, repetition, intensity, emphasis, and plurality. All these characteristics play an important role in both linguistics and literature as they enhance the meaning of what is trying to be expressed with incredible semantic detail, expressiveness, and rhythm. The following study will analyze the ideophones used in a single paragraph of a Korean novel, which add incredible yet subtle detail to the meaning of the words, and advance the expressiveness and rhythm of the text. The results from analyzing one paragraph from a novel, after presenting the phonological and morphological characteristics of Korean ideophones, will evidence the important role that ideophones play in literature.
Abstract: The American English contrast /ɑ-ʌ/ (cop-cup) is difficult to be produced by Italian learners since they realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively, due to differences in phonetic-phonological systems and also in grapheme-to-phoneme conversion rules. In this paper, we try to answer the following research questions: Can a short training improve the production of English /ɑ-ʌ/ by Italian learners? Is a perceptual training better than an articulatory (ultrasound - US) training? Thus, we compare a perceptual training with an US articulatory one to observe: 1) the effects of short trainings on L2-/ɑ-ʌ/ productions; 2) if the US articulatory training improves the pronunciation better than the perceptual training. In this pilot study, 9 Salento-Italian monolingual adults participated: 3 subjects performed a 1-hour perceptual training (ES-P); 3 subjects performed a 1-hour US training (ES-US); and 3 control subjects did not receive any training (CS). Verbal instructions about the phonetic properties of L2-/ɑ-ʌ/ and L1-/ɔ-a/ and their differences (representation on F1-F2 plane) were provided during both trainings. After these instructions, the ES-P group performed an identification training based on the High Variability Phonetic Training procedure, while the ES-US group performed the articulatory training, by means of US video of tongue gestures in L2-/ɑ-ʌ/ production and dynamic view of their own tongue movements and position using a probe under their chin. The acoustic data were analyzed and the first three formants were calculated. Independent t-tests were run to compare: 1) /ɑ-ʌ/ in pre- vs. post-test respectively; /ɑ-ʌ/ in pre- and post-test vs. L1-/a-ɔ/ respectively. Results show that in the pre-test all speakers realize L2-/ɑ-ʌ/ as L1-/ɔ-a/ respectively. Contrary to CS and ES-P groups, the ES-US group in the post-test differentiates the L2 vowels from those produced in the pre-test as well as from the L1 vowels, although only one ES-US subject produces both L2 vowels accurately. The articulatory training seems more effective than the perceptual one since it favors the production of vowels in the correct direction of L2 vowels and differently from the similar L1 vowels.
Abstract: This study examines developmental change in the production of epenthetic vowels by Japanese learners of English in relation to acquisition of L2 English speech rhythm. Seventy-two Japanese learners of English in the J-AESOP corpus were divided into lower- and higher-level learners according to their proficiency score and the frequency of vowel epenthesis. Three learners were excluded because no vowel epenthesis was observed in their utterances. The analysis of their read English speech data showed no statistical difference between lower- and higher-level learners, implying the absence of any developmental change in durations of epenthetic vowels. This result, together with the findings of previous studies, will be discussed in relation to the transfer of L1 phonology and manifestation of L2 English rhythm.
Abstract: This study investigates C-V and V-C co-articulation in Cantonese monosyllables of the CV, VC or CVC structure, with C = one of the three stop consonants [p, t, k] and V = one of the three corner vowels [i, a, u]. Five repetitions of each test syllable on a randomized list were elicited from Cantonese young adult speakers in their early-20s. A research tool, EMA AG500, was used to record the synchronized audio signals and articulatory data at three different locations of the tongue – tongue tip, tongue middle, and tongue back – and the positions of the upper and lower lips during the test syllables. The main findings based on the articulatory data collected from two male Cantonese speakers are as follows: (i) For the syllable-initial [p-], strong co-articulation is observed when [p-] preceding the high vowel [i] or [u], but not the low vowel [a]. As for the syllable-final [-p], it is strongly co-articulated with the preceding vowel, even when the vowel is [a]. (ii) The co-articulation between the initial [t-] and the following vowel of any type is weak. In the syllable-final position, the degree of co-articulatory resistance of [-t] is also large when following the vowel [u], but [-t] is largely co-articulated with the preceding vowel when the vowel is [i] or [a]. (iii) The strength of co-articulation differs when the initial [k-] precedes the different types of vowel. A stronger co-articulation between [k-] and [i] than between [k-] and [u], and the strength of co-articulation is much reduced between [k-] and [a]. However, in the syllable-final position, there is strong co-articulation between [-k] and the preceding vowel [a]. (iv) Among the three types of stop consonants in the syllable-initial position, the decreasing degree of co-articulatory resistance (CR) is [t-] > [k-] > [p-], and the degree of CR is reduced during all three types of stop in the syllable-final position. In general, the data on co-articulation between consonant and vowel in the Cantonese monosyllables are similar to those in other languages reported in previous studies.
Abstract: Numerous signal processing based speech enhancement systems have been proposed to improve intelligibility in the presence of noise. Traditionally, studies of neural vowel encoding have focused on the representation of formants (peaks in vowel spectra) in the discharge patterns of the population of auditory-nerve (AN) fibers. A method is presented for recording high-frequency speech components into a low-frequency region, to increase audibility for hearing loss listeners. The purpose of the paper is to enhance the formant of the speech based on the Kaiser window. The pitch and formant of the signal is based on the auto correlation, zero crossing and magnitude difference function. The formant enhancement stage aims to restore the representation of formants at the level of the midbrain. A MATLAB software’s are used for the implementation of the system with low complexity is developed.
Abstract: This study will provide a brief description of the stress in Najdi Arabic dialect as well as Modern Standard Arabic. Beyond the analysis of stress patterns, this paper will also attempt to deal with two important phenomena that affect stress, namely epenthesis/insertion, vowel shortening, and consonant (the glottal stop) deletion.
Abstract: Tamil handwritten document is taken as a key source
of data to identify the writer. Tamil is a classical language which has
247 characters include compound characters, consonants, vowels and
special character. Most characters of Tamil are multifaceted in
nature. Handwriting is a unique feature of an individual. Writer may
change their handwritings according to their frame of mind and this
place a risky challenge in identifying the writer. A new
discriminative model with pooled features of handwriting is proposed
and implemented using support vector machine. It has been reported
on 100% of prediction accuracy by RBF and polynomial kernel based
classification model.
Abstract: Tamil handwritten document is taken as a key source of data to identify the writer. Tamil is a classical language which has 247 characters include compound characters, consonants, vowels and special character. Most characters of Tamil are multifaceted in nature. Handwriting is a unique feature of an individual. Writer may change their handwritings according to their frame of mind and this place a risky challenge in identifying the writer. A new discriminative model with pooled features of handwriting is proposed and implemented using support vector machine. It has been reported on 100% of prediction accuracy by RBF and polynomial kernel based classification model.
Abstract: Bangla Vowel characterization determines the spectral properties of Bangla vowels for efficient synthesis as well as recognition of Bangla vowels. In this paper, Bangla vowels in isolated word have been analyzed based on speech production model within the framework of Analysis-by-Synthesis. This has led to the extraction of spectral parameters for the production model in order to produce different Bangla vowel sounds. The real and synthetic spectra are compared and a weighted square error has been computed along with the error in the formant bandwidths for efficient representation of Bangla vowels. The extracted features produced good representation of targeted Bangla vowel. Such a representation also plays essential role in low bit rate speech coding and vocoders.
Abstract: The acoustic and articulatory properties of fricative speech sounds are being studied using magnetic resonance imaging (MRI) and acoustic recordings from a single subject. Area functions were derived from a complete set of axial and coronal MR slices using two different methods: the Mermelstein technique and the Blum transform. Area functions derived from the two techniques were shown to differ significantly in some cases. Such differences will lead to different acoustic predictions and it is important to know which is the more accurate. The vocal tract acoustic transfer function (VTTF) was derived from these area functions for each fricative and compared with measured speech signals for the same fricative and same subject. The VTTFs for /f/ in two vowel contexts and the corresponding acoustic spectra are derived here; the Blum transform appears to show a better match between prediction and measurement than the Mermelstein technique.
Abstract: Internet is largely composed of textual contents and a
huge volume of digital contents gets floated over the Internet daily.
The ease of information sharing and re-production has made it
difficult to preserve author-s copyright. Digital watermarking came
up as a solution for copyright protection of plain text problem after
1993. In this paper, we propose a zero text watermarking algorithm
based on occurrence frequency of non-vowel ASCII characters and
words for copyright protection of plain text. The embedding
algorithm makes use of frequency non-vowel ASCII characters and
words to generate a specialized author key. The extraction algorithm
uses this key to extract watermark, hence identify the original
copyright owner. Experimental results illustrate the effectiveness of
the proposed algorithm on text encountering meaning preserving
attacks performed by five independent attackers.
Abstract: The purposes of this research are to study and develop
the algorithm of Thai spoonerism words by semi-automatic computer
programs, that is to say, in part of data input, syllables are already
separated and in part of spoonerism, the developed algorithm is
utilized, which can establish rules and mechanisms in Thai
spoonerism words for bi-syllables by utilizing analysis in elements of
the syllables, namely cluster consonant, vowel, intonation mark and
final consonant. From the study, it is found that bi-syllable Thai
spoonerism has 1 case of spoonerism mechanism, namely
transposition in value of vowel, intonation mark and consonant of
both 2 syllables but keeping consonant value and cluster word (if
any).
From the study, the rules and mechanisms in Thai spoonerism
word were applied to develop as Thai spoonerism word software,
utilizing PHP program. the software was brought to conduct a
performance test on software execution; it is found that the program
performs bi-syllable Thai spoonerism correctly or 99% of all words
used in the test and found faults on the program at 1% as the words
obtained from spoonerism may not be spelling in conformity with
Thai grammar and the answer in Thai spoonerism could be more than
1 answer.
Abstract: This paper presents the cepstral and trispectral
analysis of a speech signal produced by normal men, men with
defective audition (deaf, deep deaf) and others affected by
tracheotomy, the trispectral analysis based on parametric methods
(Autoregressive AR) using the fourth order cumulant. These
analyses are used to detect and compare the pitches and the formants
of corresponding voiced sounds (vowel \a\, \i\ and \u\). The first
results appear promising, since- it seems after several experimentsthere
is no deformation of the spectrum as one could have supposed
it at the beginning, however these pathologies influenced the two
characteristics:
The defective audition influences to the formants contrary to the
tracheotomy, which influences the fundamental frequency (pitch).
Abstract: This paper introduces an automatic voice classification
system for the diagnosis of individual constitution based on Sasang
Constitutional Medicine (SCM) in Traditional Korean Medicine
(TKM). For the developing of this algorithm, we used the voices of
309 female speakers and extracted a total of 134 speech features from
the voice data consisting of 5 sustained vowels and one sentence. The
classification system, based on a rule-based algorithm that is derived
from a non parametric statistical method, presents 3 types of decisions:
reserved, positive and negative decisions. In conclusion, 71.5% of the
voice data were diagnosed by this system, of which 47.7% were
correct positive decisions and 69.7% were correct negative decisions.
Abstract: This paper presents a formant-tracking linear prediction
(FTLP) model for speech processing in noise. The main focus of this
work is the detection of formant trajectory based on Hidden Markov
Models (HMM), for improved formant estimation in noise. The
approach proposed in this paper provides a systematic framework for
modelling and utilization of a time- sequence of peaks which satisfies
continuity constraints on parameter; the within peaks are modelled
by the LP parameters. The formant tracking LP model estimation
is composed of three stages: (1) a pre-cleaning multi-band spectral
subtraction stage to reduce the effect of residue noise on formants
(2) estimation stage where an initial estimate of the LP model of
speech for each frame is obtained (3) a formant classification using
probability models of formants and Viterbi-decoders. The evaluation
results for the estimation of the formant tracking LP model tested
in Gaussian white noise background, demonstrate that the proposed
combination of the initial noise reduction stage with formant tracking
and LPC variable order analysis, results in a significant reduction in
errors and distortions. The performance was evaluated with noisy
natual vowels extracted from international french and English vocabulary
speech signals at SNR value of 10dB. In each case, the
estimated formants are compared to reference formants.
Abstract: Initial values of reference vectors have significant influence on recognition accuracy in LVQ. There are several existing techniques, such as SOM and k-means, for setting initial values of reference vectors, each of which has provided some positive results. However, those results are not sufficient for the improvement of recognition accuracy. This study proposes an ACO-used method for initializing reference vectors with an aim to achieve recognition accuracy higher than those obtained through conventional methods. Moreover, we will demonstrate the effectiveness of the proposed method by applying it to the wine data and English vowel data and comparing its results with those of conventional methods.