Performance Evaluation of Acoustic-Spectrographic Voice Identification Method in Native and Non-Native Speech

This paper examines how the acoustic-spectrographic voice identification method performs on non-native speech. Performance is evaluated by comparing the results of analysing recordings of native-language speech with those of recordings of foreign-language speech. Our research is based on the Tajik and Russian speech of native Tajik speakers, a choice motivated by the criminal situation surrounding drug trafficking. We propose a pilot experiment that represents a first attempt to enter the field.

Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features

Emotion recognition is a challenging problem that remains open from the perspective of both intelligent systems and psychology. In this paper, both voice features and facial features are used to build an emotion recognition system. Support Vector Machine classifiers are built using raw data from video recordings. The results obtained for emotion recognition are given, and the validity and expressiveness of different emotions are discussed. The classifiers built from facial data only, from voice data only, and from the combination of both are compared. The need for a better combination of the information from facial expressions and voice data is argued.

Leading, Teaching and Learning “in the Middle”: Experiences, Beliefs, and Values of Instructional Leaders, Teachers, and Students in Finland, Germany, and Canada

Through the exploration of the lived experiences, beliefs and values of instructional leaders, teachers and students in Finland, Germany and Canada, we investigated the factors which contribute to developmentally responsive, intellectually engaging middle-level learning environments for early adolescents. Student-centred leadership dimensions, effective instructional practices and student agency were examined through the lens of current policy and research on middle-level learning environments emerging from the Canadian province of Manitoba. Consideration of these three research perspectives in the context of early adolescent learning, placed against an international backdrop, provided a previously undocumented perspective on leading, teaching and learning in the middle years. Aligning with a social constructivist, qualitative research paradigm, the study incorporated collective case study methodology, along with constructivist grounded theory methods of data analysis. Data were collected through semi-structured individual and focus group interviews and document review, as well as direct and participant observation. Three case study narratives were developed to share the rich stories of study participants, who had been selected using maximum variation and intensity sampling techniques. Interview transcript data were coded using processes from constructivist grounded theory. A cross-case analysis yielded a conceptual framework highlighting key factors that were found to be significant in the establishment of developmentally responsive, intellectually engaging middle-level learning environments. Seven core categories emerged from the cross-case analysis as common to all three countries. 
Within the visual conceptual framework (which depicts the interconnected nature of leading, teaching and learning in middle-level learning environments), these seven core categories were grouped into Essential Factors (student agency, voice and choice), Contextual Factors (instructional practices; school culture; engaging families and the community), Synergistic Factors (instructional leadership) and Cornerstone Factors (education as a fundamental cultural value; preservice, in-service and ongoing teacher development). In addition, sub-factors emerged from recurring codes in the data and identified specific characteristics and actions found in developmentally responsive, intellectually engaging middle-level learning environments. Although this study focused on 12 schools in Finland, Germany and Canada, it informs the practice of educators working with early adolescent learners in middle-level learning environments internationally. The authentic voices of early adolescent learners are the most important resource educators have to gauge if they are creating effective learning environments for their students. Ongoing professional dialogue and learning is essential to ensure teachers are supported in their work and develop the pedagogical practices needed to meet the needs of early adolescent learners. It is critical to balance consistency, coherence and dependability in the school environment with the necessary flexibility in order to support the unique learning needs of early adolescents. Educators must intentionally create a school culture that unites teachers, students and their families in support of a common purpose, as well as nurture positive relationships between the school and its community. A large, urban school district in Canada has implemented a school cohort-based model to begin to bring developmentally responsive, intellectually engaging middle-level learning environments to scale.

An Investigation of Community Radio Broadcasting in Phutthamonthon District, Nakhon Pathom, Thailand

This study aims to explore and compare the current condition of community radio stations in Phutthamonthon district, Nakhon Pathom province, Thailand, as well as the challenges they face. Qualitative research tools, including in-depth interviews, documentary analysis, focus group interviews, and observation, are used to examine the content, programming, and management structure of three community radio stations currently in operation within the district. Research findings indicate that the management and operational approaches adopted by the two non-profit stations included in the study, Salaya Pattana and Voice of Dhamma, are more structured and effective than those of the for-profit Tune Radio. Salaya Pattana, backed by the Faculty of Engineering, Mahidol University, and the charity-funded Voice of Dhamma are comparatively free from political and commercial influence and are able to provide more relevant and consistent community-oriented content to meet the real demand of the audience. Tune Radio, on the other hand, has to rely solely on financial support from political factions and business groups, which heavily influence its content.

Application Quality Function Deployment (QFD) Tool in Design of Aero Pumps Based on System Engineering

Quality Function Deployment (QFD) was developed in 1960 in Japan and introduced in 1983 in America and Europe. This paper presents a real application of the technique, describing how QFD is applied in the design and production of aero fuel pumps. When designing a product within a system engineering process, the first step is to identify customer needs and then translate them into engineering parameters. Since each change in design after the production process leads to additional labour costs and an increased risk to product quality, QFD can benefit sales by meeting customer expectations. Once the needs have been properly identified, the use of the QFD tool can improve communication and reduce deviation in the design and production phases, ultimately yielding products with the defined technical attributes.

The Most Secure Smartphone Operating System: A Survey

In recent years, mobile phone technology has undergone a fundamental revolution, from merely providing voice and short message services to becoming an essential part of our lives, connecting to networks and various app stores for downloading software apps for almost every activity in our lives, from finding locations to banking and from getting news updates to downloading HD videos. This progress in the smartphone industry has modernized our way of living and made it far more convenient. With the addition of significant features such as multi-core processors, multitasking, large storage space, Bluetooth, WiFi, large screens and cameras, the smartphone has become our personal computer. With this evolution, security threats have also increased. In the literature, different threats related to smartphones have been highlighted, and various precautions and solutions have been proposed to protect the smartphone, which carries all the private data of a user. In this paper, a survey is carried out to determine the most secure and the least secure smartphone operating systems among the most popular smartphones in use today.

Futuristic Black Box Design Considerations and Global Networking for Real Time Monitoring of Flight Performance Parameters

The aim of this research paper is to conceptualize, discuss, analyze and propose alternative design methodologies for a futuristic black box for flight safety. The proposal also includes global networking concepts for real-time surveillance and monitoring of flight performance parameters, including GPS parameters. It is expected that this proposal will serve as a failsafe real-time diagnostic tool for accident investigation and for locating debris in real time. In this paper, an attempt is made to improve existing flight data recording techniques and the design considerations for a futuristic flight data recorder (FDR), to overcome the trauma of not being able to locate the black box. Since modern communications and information technologies with large bandwidth are available, coupled with faster computer processing techniques, the failsafe recording technique attempted in this paper is feasible. Furthermore, data fusion and data warehousing technologies are available for exploitation.

Saving Lives: Alternative Approaches to Reducing Gun Violence

This paper highlights an innovative and nontraditional violence prevention program that is making a noticeable impact in what was once one of the country’s most violent communities. With unique and tailored strategies, the Operation Peacemaker Fellowship, established in Richmond, California, combines components of evidence-based practices with a community-oriented focus on relationships and mentoring to fill a gap in services and increase community safety. In an effort to highlight these unique strategies and provide a blueprint for other communities with violent crime problems, the authors of this paper hope to clearly delineate how one community is moving forward with vanguard approaches to invest in the lives of young men who once were labeled their community’s most violent, even most deadly, youth. The impact of this program is evidenced through the fellows’ own voices as they illuminate the experience of being in the Fellowship. In interviews, fellows describe how participating in this program has transformed their lives and the lives of those they love. The authors of this article spent more than two years researching this Fellowship program in order to conduct an evaluation of it and, ultimately, to demonstrate how this program is a testament to the power of relationships and love combined with evidence-based practices, consequently enriching the lives of youth and the community that embraces them.

An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Speaker Identification (SI) is the task of establishing the identity of an individual based on his or her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though the performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still room for improvement. In this paper, a Closed-Set Text-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficients (MFCC) for feature extraction and a suitable combination of Vector Quantization (VQ) and Gaussian Mixture Models (GMM), together with the Expectation Maximization (EM) algorithm, for speaker modeling. The use of a Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and statistical modeling of background noise in the pre-processing step of feature extraction yields a better and more robust automatic speaker identification system. In addition, using the Linde-Buzo-Gray (LBG) clustering algorithm to initialize the GMM, whose underlying parameters are then estimated in the EM step, improved the convergence rate and the system's performance. The system also uses a relative index as a confidence measure when the GMM and VQ identification results contradict each other. Simulations carried out on the voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.
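
The short-time-energy stage of a voice activity detector such as the one mentioned above can be illustrated with a minimal Python sketch. This is not the authors' hybrid detector: the frame length, hop size, threshold factor, and the assumption that the first few frames contain only background noise are all illustrative choices, and the function names are hypothetical.

```python
import math

def short_time_energy(samples, frame_len, hop):
    """Return the mean squared amplitude of each analysis frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def detect_voice(samples, frame_len=160, hop=80, noise_frames=5, factor=4.0):
    """Flag frames whose energy exceeds a multiple of the estimated noise floor.

    Assumption: the first `noise_frames` frames contain background noise only.
    """
    energies = short_time_energy(samples, frame_len, hop)
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    threshold = factor * noise_floor
    return [e > threshold for e in energies]

# Synthetic input: a low-level signal followed by a louder 200 Hz tone at 8 kHz.
quiet = [0.01 * math.sin(0.7 * n) for n in range(800)]
loud = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
flags = detect_voice(quiet + loud)
```

Frames flagged True would be passed on to feature extraction; a fuller detector would also update the noise estimate over time rather than fixing it from the opening frames.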

Development of a Mobile Image-Based Reminder Application to Support Tuberculosis Treatment in Africa

This paper presents the design, development and evaluation of an application prototype developed to support tuberculosis (TB) patients’ treatment adherence. The system makes use of graphics and voice reminders as opposed to text messaging to encourage patients to follow their medication routine. To evaluate the effect of the prototype application, participants were given mobile phones on which the reminder system was installed. Thirty-eight people, including TB health workers and patients from Zanzibar, Tanzania, participated in the evaluation exercises. The results indicate that the participants found the mobile image-based application useful for supporting TB treatment. All participants understood and interpreted the intended meaning of every image correctly. The study findings revealed that the use of a mobile visual-based application may have potential benefit in supporting TB patients (both literate and illiterate) in their treatment processes.

Voices and Pictures from an Online Course and a Face to Face Course

In light of technological development and its introduction into the field of education, an online course was designed in parallel to the 'conventional' course for teaching 'Qualitative Research Methods'. This study aimed to characterize learning-teaching processes in a 'Qualitative Research Methods' course studied in two different frameworks. Moreover, its objective was to explore the difference between the culture of a physical learning environment and that of online learning. The research monitored four learner groups, a total of 72 students, for two years, with two groups from the two course frameworks each year. The courses were obligatory for M.Ed. students at an academic college of education and were given by one female lecturer. The research was conducted using the qualitative method as a case study in order to attain insights about occurrences in the actual contexts and sites in which they transpire. The research tools were an open-ended questionnaire and reflections in the form of vignettes (meaningful short pictures) administered to all students, as well as an interview with the lecturer. The tools facilitated not only triangulation but also the collection of data consisting of voices and pictures of teaching and learning. The most prominent findings are differences between the two courses in the features of change in the learning environment culture for the acquisition of content and qualitative research tools. They were manifested in teaching methods, illustration aids, the lecturer's profile and the students' profile.

Eyeball Motion Controlled Wheelchair Using IR Sensors

This paper presents the ‘Eye Ball Motion Controlled Wheelchair using IR Sensors’ for elderly and differently abled people. In this eye-tracking-based technology, three proximity infrared (IR) sensor modules are mounted on an eye frame to trace the movement of the iris. Since the IR sensors detect only white objects, a unique sequence of digital bits is generated for each eye movement. These signals are then processed by a microcontroller IC (PIC18F452) to control the motors of the wheelchair. The potential and efficiency of previously developed rehabilitation systems that variously use head motion, chin control, sip-n-puff control, voice recognition, and EEG signals have also been explored in detail. They were found to be inconvenient because of either limited usability or high cost. After multiple regression analyses, the proposed design was developed as a cost-effective, flexible and streamlined alternative for people who have trouble adopting conventional assistive technologies.
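
The idea of turning three sensor readings into a drive command can be sketched in a few lines of Python. The abstract does not give the actual bit patterns or command set, so the mapping below is purely hypothetical: it assumes each sensor reads 1 over the white sclera and 0 over the iris, and that any unrecognised pattern should stop the chair.

```python
# Hypothetical mapping from (left, centre, right) sensor bits to a command.
# Assumption: a sensor reads 1 over the white sclera and 0 over the iris.
COMMANDS = {
    (1, 0, 1): "FORWARD",  # iris in front of the centre sensor
    (0, 1, 1): "LEFT",     # iris in front of the left sensor
    (1, 1, 0): "RIGHT",    # iris in front of the right sensor
}

def decode(left, centre, right):
    """Return the drive command for one sample of the three sensor bits."""
    # Any pattern not listed above (for example a closed eye) stops the chair.
    return COMMANDS.get((left, centre, right), "STOP")

command = decode(1, 0, 1)
```

Defaulting to a stop command for unknown patterns is a deliberately conservative choice for a mobility device; on the microcontroller the same lookup would typically be a small table indexed by the three input pins.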

Mikrophonie I (1964) by Karlheinz Stockhausen - Between Idea and Auditory Image

Background in music analysis: Traditionally, when we think about a composer’s sketches, the chances are that we are thinking in terms of the working out of detail, rather than the evolution of an overall concept. Since music is a “time art,” it follows that questions of form cannot be entirely detached from considerations of time. One could say that composers tend either to regard time as a space that is filled gradually and partly intuitively, or to look for a specific strategy to occupy it. One thing that sheds light on Stockhausen’s compositional thinking is his frequent use of “form schemas,” often a single-page representation of the entire structure of a piece. Background in music technology: Sonic Visualiser (SV) is a program used to study musical recordings. It is an open source application for viewing, analyzing, and annotating music audio files. It contains a number of visualisation tools, which are designed with useful default parameters for musical analysis. Additionally, SV supports the Vamp plugin format, which provides analyses such as structural segmentation. Aims: The aim of the paper is to show how SV may be used to obtain a better understanding of a specific musical work, and how the compositional strategy affects musical structures and musical surfaces. It is known that “traditional” music analytic methods do not reveal the interrelationships between the musical surface (which is perceived) and the underlying musical/acoustical structure. Main Contribution: Stockhausen dealt with the most diverse musical problems by the most varied methods. One characteristic that never ceased to be at the center of his thought and works was the quest for a new balance founded upon an acute connection between speculation and intuition.
In the case of Mikrophonie I (1964) for tam-tam and 6 players, Stockhausen makes a distinction between the “connection scheme,” which indicates the ground rules underlying all versions, and the “form scheme,” which is associated with a particular version. The preface to the published score includes both the connection scheme and a single instance of a form scheme, which is what one can hear on the CD recording. In the current study, insight into the compositional strategy chosen by Stockhausen was compared with the auditory image, that is, with the perceived musical surface. Stockhausen’s musical work is analyzed in terms of both melodic/voice and timbre evolution. Implications: The current study shows how musical structures determine the musical surface. The general assumption is that, while listening to music, we can extract basic kinds of musical information from musical surfaces. It is shown that interactive strategies of musical structure analysis can offer a very fruitful way of looking directly into certain structural features of music.

Independent Encryption Technique for Mobile Voice Calls

The legality of spying by some countries or agencies on the personal phone calls of the public has become a hot topic of discussion among many social groups. Such spying is widely considered an invasion of privacy. It may be justified if it singles out specific cases, but spying without limits is unacceptable. This paper discusses the need for a simple and lightweight technique to secure mobile voice calls that is also independent of any encryption standard or library. It then presents and tests an encryption algorithm based on a frequency scrambling technique, showing a fair and delay-free process that can be used to protect phone calls from such spying.
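
To make the notion of frequency scrambling concrete, the Python sketch below shows spectral inversion, one of the simplest frequency scramblers. It is not the algorithm evaluated in the paper, which the abstract does not specify; the function name and the test tone are illustrative assumptions. Negating every other sample moves a component at frequency f to fs/2 - f, works sample by sample with no buffering delay, and is undone by applying the same operation again.

```python
import math

def invert_spectrum(samples):
    """Flip the spectrum of a sampled signal by negating every other sample.

    Multiplying x[n] by (-1)**n shifts the spectrum by half the sampling
    rate, so a component at frequency f moves to fs/2 - f. Applying the
    function twice restores the input.
    """
    return [-x if n % 2 else x for n, x in enumerate(samples)]

fs = 8000  # assumed telephone-band sampling rate in Hz
tone = [math.cos(2 * math.pi * 500 * n / fs) for n in range(64)]
scrambled = invert_spectrum(tone)      # the 500 Hz tone now sits at 3500 Hz
restored = invert_spectrum(scrambled)  # descrambling is the same operation
```

Plain inversion offers very little secrecy on its own; practical scramblers usually permute several frequency bands under a key, but the same principle of rearranging spectral content without an external cryptographic library applies.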

Motivating the Independent Learner at the Arab Open University, Kuwait

Academicians at the Arab Open University (AOU) have always voiced their concern about the efficacy of the blended learning process. Based on 75% independent study and 25% face-to-face tutorials, it poses the challenge of students' predisposition to adjust. Accustomed to the psychology of traditional educational systems, AOU students cannot be easily weaned from being spoon-fed. Hence they lack the motivation to plunge into self-study. To better involve AOU students in the learning practices, it is imperative to diagnose the factors that impede or increase their motivation. This is done through an empirical study grounded in observations and tested hypotheses and aimed at monitoring and optimizing the students’ learning outcomes. Recommendations of the research will follow the findings.

Hallucinatory Activity in Schizophrenia: The Relationship with Childhood Memories, Submissive Behavior, Social Comparison, and Depression

Auditory hallucinations are among the most invalidating and distressing experiences reported by patients diagnosed with schizophrenia, leading to feelings of powerlessness and helplessness towards their illness. In more severe cases, these auditory hallucinations can take the form of commanding voices, which are often related to high suicidality rates in these patients. Several authors propose that the meanings attributed to the hallucinatory experience, rather than characteristics like form and content, can be determinant in patients’ reactions to hallucinatory activity, particularly in the case of voice-hearing experiences. In this study, 48 patients diagnosed with paranoid schizophrenia presenting auditory hallucinations were studied. Multiple regression analyses were computed to study the influence of several developmental aspects, such as family and social dynamics, bullying, depression, and sociocognitive variables on the auditory hallucinations, on patients’ attributions and relationships with their voices, and on the resulting invalidation of hallucinatory experience. Overall, results showed how relationships with voices can mirror several aspects of interpersonal relationships with others, and how self-schemas, depression and actual social relationships help shape the voice-hearing experience. Early experiences of victimization and submission help predict the attributions of omnipotence of the voices, and increased hostility from parents seems to increase the malevolence of the voices, suggesting that socio-cognitive factors can significantly contribute to the etiology and maintenance of auditory hallucinations. Understanding the characteristics of auditory hallucinations and the relationships patients establish with their voices can allow the development of more promising therapeutic interventions that can be more effective in decreasing the invalidation caused by this devastating mental illness.

A Simple Adaptive Atomic Decomposition Voice Activity Detector Implemented by Matching Pursuit

A simple adaptive voice activity detector (VAD) is implemented using Gabor and gammatone atomic decomposition of speech for high Gaussian noise environments. Matching pursuit is used for atomic decomposition, and is shown to achieve optimal speech detection capability at high data compression rates for low signal-to-noise ratios. The most active dictionary elements found by matching pursuit are used for the signal reconstruction so that the algorithm adapts to the individual speaker's dominant time-frequency characteristics. Speech has a high peak-to-average ratio, enabling the matching pursuit greedy heuristic of highest inner products to isolate high energy speech components in high noise environments. Gabor and gammatone atoms are both investigated with identical logarithmically spaced center frequencies and similar bandwidths. The algorithm performs equally well for both Gabor and gammatone atoms, with no statistically significant differences. The algorithm achieves 70% accuracy at a 0 dB SNR, 90% accuracy at a 5 dB SNR and 98% accuracy at a 20 dB SNR, using 30 dB SNR as a reference for voice activity.
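
The greedy step of matching pursuit described above can be sketched in plain Python. This is a minimal illustration rather than the paper's detector: the toy Gabor dictionary (three Gaussian-windowed cosines with logarithmically spaced frequencies), the signal length, and all function names are assumptions made for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def gabor_atom(length, center, freq, width):
    """Unit-norm Gaussian-windowed cosine; freq is in cycles per sample."""
    atom = [
        math.exp(-0.5 * ((n - center) / width) ** 2)
        * math.cos(2 * math.pi * freq * (n - center))
        for n in range(length)
    ]
    return normalize(atom)

def matching_pursuit(signal, dictionary, n_iter):
    """Repeatedly pick the atom with the largest absolute inner product
    with the residual and subtract its projection from the residual."""
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        products = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        coeff = products[best]
        picks.append((best, coeff))
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return picks, residual

# Toy dictionary with logarithmically spaced center frequencies (assumed values).
N = 64
freqs = [0.05 * 2 ** k for k in range(3)]
dictionary = [gabor_atom(N, 32, f, 8.0) for f in freqs]

# A signal built from a single atom is recovered in one iteration.
signal = [3.0 * a for a in dictionary[1]]
picks, residual = matching_pursuit(signal, dictionary, n_iter=1)
```

In a detector along the lines of the abstract, the indices and coefficients in `picks` would identify the most active atoms, and the energy they capture relative to the residual could feed the speech/non-speech decision.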

Environmentally Adaptive Acoustic Echo Suppression for Barge-in Speech Recognition

In this study, we propose a novel technique for acoustic echo suppression (AES) during speech recognition under barge-in conditions. Conventional AES methods based on spectral subtraction apply fixed weights to the echo path transfer function (EPTF) estimated for the current signal segment and to the EPTF estimated up to the previous time interval. However, the effects of echo path changes should be considered when eliminating the undesired echoes. We describe a new approach that adaptively updates the weight parameters in response to abrupt changes in the acoustic environment due to background noise or double-talk. Furthermore, we devised a voice activity detector and an initial time-delay estimator for barge-in speech recognition in communication networks. The initial time delay is estimated using a log-spectral distance measure as well as cross-correlation coefficients. The experimental results show that the developed techniques can be successfully applied in barge-in speech recognition systems.
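
Estimating a time delay from cross-correlation coefficients, as mentioned above, can be illustrated with a short Python sketch. It covers only the cross-correlation part, not the log-spectral distance measure, and it is not the authors' estimator: the function name, the lag search range, and the synthetic signals are assumptions for the example.

```python
import math
import random

def estimate_delay(reference, observed, max_lag):
    """Return the lag in samples (0..max_lag) that maximizes the normalized
    cross-correlation coefficient between `reference` and `observed`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(observed) - lag)
        ref = reference[:n]
        obs = observed[lag:lag + n]
        energy = sum(x * x for x in ref) * sum(y * y for y in obs)
        if energy == 0:
            continue  # skip lags where one segment is silent
        score = sum(x * y for x, y in zip(ref, obs)) / math.sqrt(energy)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Synthetic check: the observed signal is an attenuated copy of the
# reference, delayed by 7 samples.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(200)]
observed = [0.0] * 7 + [0.5 * x for x in reference]
delay = estimate_delay(reference, observed, max_lag=20)
```

Normalizing by the segment energies makes the score insensitive to the attenuation of the echo path, which is why the coefficient rather than the raw correlation is compared across lags.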

Interactive Shadow Play Animation System

The paper describes a Chinese shadow play animation system based on Kinect. Users, without any professional training, can manipulate the shadow characters through their body movements to complete a shadow play performance and, if they wish, obtain a video of it by giving the record command to the system. In our system, Kinect is responsible for capturing human movement and voice command data. The gesture recognition module is used to control changes of the shadow play scenes. After packaging the data from Kinect and the recognition results from the gesture recognition module, VRPN transmits them to the server side. Finally, the server side uses this information to control the motion of the shadow characters and the video recording. The system not only achieves human-computer interaction but also enables interaction between people. It brings an entertaining experience to users and is easy to operate for all ages. Even more importantly, the use of Chinese shadow play as the application background contributes to the protection of the art of shadow play animation.

Smart Help at the Workplace for Persons with Disabilities (SHW-PWD)

Smart Help for persons with disabilities (PWD) is part of the SMARTDISABLE project, which aims to develop relevant solutions that provide an adequate workplace environment for PWD. It would support the needs of PWD through a smart help, which allows them to access relevant information and communicate with others effectively and flexibly, and a smart editor, which assists them in their daily work. It will assist PWD in knowledge processing and creation and help them to be productive at the workplace. The technical work of the project involves the design of a technological scenario for Ambient Intelligence (AmI)-based assistive technologies at the workplace, consisting of an integrated universal smart solution that suits many different impairment conditions and is designed to empower physically disabled persons (PDP) to access and effectively use ICTs in order to execute knowledge-rich working tasks with minimum effort and a sufficient level of comfort. The proposed technology solution for PWD will support voice recognition along with a normal keyboard and mouse to control the smart help and smart editor, with a dynamic auto-display interface that satisfies the requirements of different PWD groups. In addition, the smart help will provide intelligent intervention based on the behavior of PWD to guide them and warn them about possible misbehavior. PWD can communicate with others using Voice over IP controlled by voice recognition. Moreover, an Auto Emergency Help Response would be supported to assist PWD in case of emergency. The proposed technology solution is intended to make PWD highly effective and flexible in the work environment, using voice to conduct their tasks.
The proposed solution aims to provide favorable outcomes that assist PWD at the workplace, with the opportunity to participate in the PWD assistive technology innovation market, which is still small but rapidly growing, as well as to raise their quality of life at the workplace to that of their colleagues without disabilities. Finally, the proposed smart help solution is applicable in all workplace settings, including offices, manufacturing sites, hospitals, etc.