Hybrid Modeling Algorithm for Continuous Tamil Speech Recognition

In this paper, Fuzzy C-Means clustering with Expectation Maximization-Gaussian Mixture Model based hybrid modeling algorithm is proposed for Continuous Tamil Speech Recognition. The speech sentences from various speakers are used for training and testing phase and objective measures are between the proposed and existing Continuous Speech Recognition algorithms. From the simulated results, it is observed that the proposed algorithm improves the recognition accuracy and F-measure up to 3% as compared to that of the existing algorithms for the speech signal from various speakers. In addition, it reduces the Word Error Rate, Error Rate and Error up to 4% as compared to that of the existing algorithms. In all aspects, the proposed hybrid modeling for Tamil speech recognition provides the significant improvements for speechto- text conversion in various applications.

Probabilistic Graphical Model for the Web

The world wide web network is a network with a complex topology, the main properties of which are the distribution of degrees in power law, A low clustering coefficient and a weak average distance. Modeling the web as a graph allows locating the information in little time and consequently offering a help in the construction of the research engine. Here, we present a model based on the already existing probabilistic graphs with all the aforesaid characteristics. This work will consist in studying the web in order to know its structuring thus it will enable us to modelize it more easily and propose a possible algorithm for its exploration.

GCM Based Fuzzy Clustering to Identify Homogeneous Climatic Regions of North-East India

The North-eastern part of India, which receives heavier rainfall than other parts of the subcontinent, is of great concern now-a-days with regard to climate change. High intensity rainfall for short duration and longer dry spell, occurring due to impact of climate change, affects river morphology too. In the present study, an attempt is made to delineate the North-eastern region of India into some homogeneous clusters based on the Fuzzy Clustering concept and to compare the resulting clusters obtained by using conventional methods and nonconventional methods of clustering. The concept of clustering is adapted in view of the fact that, impact of climate change can be studied in a homogeneous region without much variation, which can be helpful in studies related to water resources planning and management. 10 IMD (Indian Meteorological Department) stations, situated in various regions of the North-east, have been selected for making the clusters. The results of the Fuzzy C-Means (FCM) analysis show different clustering patterns for different conditions. From the analysis and comparison it can be concluded that nonconventional method of using GCM data is somehow giving better results than the others. However, further analysis can be done by taking daily data instead of monthly means to reduce the effect of standardization.

Altered Network Organization in Mild Alzheimer's Disease Compared to Mild Cognitive Impairment Using Resting-State EEG

Brain functional networks based on resting-state EEG data were compared between patients with mild Alzheimer’s disease (mAD) and matched patients with amnestic subtype of mild cognitive impairment (aMCI). We integrated the time–frequency cross mutual information (TFCMI) method to estimate the EEG functional connectivity between cortical regions and the network analysis based on graph theory to further investigate the alterations of functional networks in mAD compared with aMCI group. We aimed at investigating the changes of network integrity, local clustering, information processing efficiency, and fault tolerance in mAD brain networks for different frequency bands based on several topological properties, including degree, strength, clustering coefficient, shortest path length, and efficiency. Results showed that the disruptions of network integrity and reductions of network efficiency in mAD characterized by lower degree, decreased clustering coefficient, higher shortest path length, and reduced global and local efficiencies in the delta, theta, beta2, and gamma bands were evident. The significant changes in network organization can be used in assisting discrimination of mAD from aMCI in clinical.

A Review: Comparative Study of Enhanced Hierarchical Clustering Protocols in WSN

Recent advances in wireless networking technologies introduce several energy aware routing protocols in sensor networks. Such protocols aim to extend the lifetime of network by reducing the energy consumption of nodes. Many researchers are looking for certain challenges that are predominant in the grounds of energy consumption. One such protocol that addresses this energy consumption issue is ‘Cluster based hierarchical routing protocol’. In this paper, we intend to discuss some of the major hierarchical routing protocols adhering towards sensor networks. Furthermore, we examine and compare several aspects and characteristics of few widely explored hierarchical clustering protocols, and its operations in wireless sensor networks (WSN). This paper also presents a discussion on the future research topics and the challenges of hierarchical clustering in WSNs.

Wavelet Based Residual Method of Detecting GSM Signal Strength Fading

In this paper, GSM signal strength was measured in order to detect the type of the signal fading phenomenon using onedimensional multilevel wavelet residual method and neural network clustering to determine the average GSM signal strength received in the study area. The wavelet residual method predicted that the GSM signal experienced slow fading and attenuated with MSE of 3.875dB. The neural network clustering revealed that mostly -75dB, -85dB and -95dB were received. This means that the signal strength received in the study is a weak signal.

Issue Reorganization Using the Measure of Relevance

The need to extract R&D keywords from issues and use them to retrieve R&D information is increasing rapidly. However, it is difficult to identify related issues or distinguish them. Although the similarity between issues cannot be identified, with an R&D lexicon, issues that always share the same R&D keywords can be determined. In detail, the R&D keywords that are associated with a particular issue imply the key technology elements that are needed to solve a particular issue. Furthermore, the relationship among issues that share the same R&D keywords can be shown in a more systematic way by clustering them according to keywords. Thus, sharing R&D results and reusing R&D technology can be facilitated. Indirectly, redundant investment in R&D can be reduced as the relevant R&D information can be shared among corresponding issues and the reusability of related R&D can be improved. Therefore, a methodology to cluster issues from the perspective of common R&D keywords is proposed to satisfy these demands.

A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

An extensive amount of work has been done in data clustering research under the unsupervised learning technique in Data Mining during the past two decades. Moreover, several approaches and methods have been emerged focusing on clustering diverse data types, features of cluster models and similarity rates of clusters. However, none of the single clustering algorithm exemplifies its best nature in extracting efficient clusters. Consequently, in order to rectify this issue, a new challenging technique called Cluster Ensemble method was bloomed. This new approach tends to be the alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate the diverse clustering solutions in such a way to attain accuracy and also to improve the eminence the individual clustering algorithms. Due to the massive and rapid development of new methods in the globe of data mining, it is highly mandatory to scrutinize a vital analysis of existing techniques and the future novelty. This paper shows the comparative analysis of different cluster ensemble methods along with their methodologies and salient features. Henceforth this unambiguous analysis will be very useful for the society of clustering experts and also helps in deciding the most appropriate one to resolve the problem in hand.

Fuzzy C-Means Clustering for Biomedical Documents Using Ontology Based Indexing and Semantic Annotation

Search is the most obvious application of information retrieval. The variety of widely obtainable biomedical data is enormous and is expanding fast. This expansion makes the existing techniques are not enough to extract the most interesting patterns from the collection as per the user requirement. Recent researches are concentrating more on semantic based searching than the traditional term based searches. Algorithms for semantic searches are implemented based on the relations exist between the words of the documents. Ontologies are used as domain knowledge for identifying the semantic relations as well as to structure the data for effective information retrieval. Annotation of data with concepts of ontology is one of the wide-ranging practices for clustering the documents. In this paper, indexing based on concept and annotation are proposed for clustering the biomedical documents. Fuzzy c-means (FCM) clustering algorithm is used to cluster the documents. The performances of the proposed methods are analyzed with traditional term based clustering for PubMed articles in five different diseases communities. The experimental results show that the proposed methods outperform the term based fuzzy clustering.

Automatic Facial Skin Segmentation Using Possibilistic C-Means Algorithm for Evaluation of Facial Surgeries

Human face has a fundamental role in the appearance of individuals. So the importance of facial surgeries is undeniable. Thus, there is a need for the appropriate and accurate facial skin segmentation in order to extract different features. Since Fuzzy CMeans (FCM) clustering algorithm doesn’t work appropriately for noisy images and outliers, in this paper we exploit Possibilistic CMeans (PCM) algorithm in order to segment the facial skin. For this purpose, first, we convert facial images from RGB to YCbCr color space. To evaluate performance of the proposed algorithm, the database of Sahand University of Technology, Tabriz, Iran was used. In order to have a better understanding from the proposed algorithm; FCM and Expectation-Maximization (EM) algorithms are also used for facial skin segmentation. The proposed method shows better results than the other segmentation methods. Results include misclassification error (0.032) and the region’s area error (0.045) for the proposed algorithm.

Towards Assessment of Indicators Influence on Innovativeness of Countries' Economies: Selected Soft Computing Approaches

The aim of this paper is to assess the influence of several indicators determining innovativeness of countries' economies by applying selected soft computing methods. Such methods enable us to identify correlations between indicators for period 2006-2010. The main attention in the paper is focused on selecting proper computer tools for solving this problem. As a tool supporting identification, the X-means clustering algorithm, the Apriori rules generation algorithm as well as Self-Organizing Feature Maps (SOMs) have been selected. The paper has rather a rudimentary character. We briefly describe usefulness of the selected approaches and indicate some challenges for further research.

Performance Analysis of Cluster Based Dual Tired Network Model with INTK Security Scheme in a Wireless Sensor Network

A dual tiered network model is designed to overcome the problem of energy alert and fault tolerance. This model minimizes the delay time and overcome failure of links. Performance analysis of the dual tiered network model is studied in this paper where the CA and LS schemes are compared with DEO optimal. We then evaluate  the Integrated Network Topological Control and Key Management (INTK) Schemes, which was proposed to add security features of the wireless sensor networks. Clustering efficiency, level of protections, the time complexity is some of the parameters of INTK scheme that were analyzed. We then evaluate the Cluster based Energy Competent n-coverage scheme (CEC n-coverage scheme) to ensure area coverage for wireless sensor networks.

Growing Self Organising Map Based Exploratory Analysis of Text Data

Textual data plays an important role in the modern world. The possibilities of applying data mining techniques to uncover hidden information present in large volumes of text collections is immense. The Growing Self Organizing Map (GSOM) is a highly successful member of the Self Organising Map family and has been used as a clustering and visualisation tool across wide range of disciplines to discover hidden patterns present in the data. A comprehensive analysis of the GSOM’s capabilities as a text clustering and visualisation tool has so far not been published. These functionalities, namely map visualisation capabilities, automatic cluster identification and hierarchical clustering capabilities are presented in this paper and are further demonstrated with experiments on a benchmark text corpus.

A New Approach for Network Reconfiguration Problem in Order to Deviation Bus Voltage Minimization with Regard to Probabilistic Load Model and DGs

Recently, distributed generation technologies have received much attention for the potential energy savings and reliability assurances that might be achieved as a result of their widespread adoption. The distribution feeder reconfiguration (DFR) is one of the most important control schemes in the distribution networks, which can be affected by DGs. This paper presents a new approach to DFR at the distribution networks considering wind turbines. The main objective of the DFR is to minimize the deviation of the bus voltage. Since the DFR is a nonlinear optimization problem, we apply the Adaptive Modified Firefly Optimization (AMFO) approach to solve it. As a result of the conflicting behavior of the single- objective function, a fuzzy based clustering technique is employed to reach the set of optimal solutions called Pareto solutions. The approach is tested on the IEEE 32-bus standard test system.

Transaction Costs in Institutional Environment and Entry Mode Choice

In the study presented institutional context is discussed in terms of companies’ entry mode choice. In contrary to many previous analyses, instead of using one or two aggregated variables, a set of eleven determinants is used to establish equity and non-equity internationalization friendly conditions. Based on secondary data, 140 countries are analyzed and grouped into clusters revealing similar framework. The range of the economies explored is wide as it covers all regions distinguished by The World Bank. The results can prove a useful alternative for operationalization of institutional variables in further research concerning entry modes or strategic management in international markets.

Comparative Study on Swarm Intelligence Techniques for Biclustering of Microarray Gene Expression Data

Microarray gene expression data play a vital in biological processes, gene regulation and disease mechanism. Biclustering in gene expression data is a subset of the genes indicating consistent patterns under the subset of the conditions. Finding a biclustering is an optimization problem. In recent years, swarm intelligence techniques are popular due to the fact that many real-world problems are increasingly large, complex and dynamic. By reasons of the size and complexity of the problems, it is necessary to find an optimization technique whose efficiency is measured by finding the near optimal solution within a reasonable amount of time. In this paper, the algorithmic concepts of the Particle Swarm Optimization (PSO), Shuffled Frog Leaping (SFL) and Cuckoo Search (CS) algorithms have been analyzed for the four benchmark gene expression dataset. The experiment results show that CS outperforms PSO and SFL for 3 datasets and SFL give better performance in one dataset. Also this work determines the biological relevance of the biclusters with Gene Ontology in terms of function, process and component.

Diversity Analysis of a Quinoa (Chenopodium quinoa Willd.) Germplasm during Two Seasons

The present work has been carried out to evaluate the diversity of a collection of 78 quinoa accessions developed through recurrent selection from Andean germplasm introduced to Morocco in the winter of 2000. Twenty-three quantitative and qualitative characters were used for the evaluation of genetic diversity and the relationship between the accessions, and also for the establishment of a core collection in Morocco. Important variation was found among the accessions in terms of plant morphology and growth behavior. Data analysis showed positive correlation of the plant height, the plant fresh and the dry weight with the grain yield, while days to flowering was found to be negatively correlated with grain yield. The first four PCs contributed 74.76% of the variability; the first PC showed significant variation with 42.86% of the total variation, PC2 with 15.37%, PC3 with 9.05% and PC4 contributed 7.49% of the total variation. Plant size, days to grain filling and days to maturity are correlated to the PC1; and seed size, inflorescence density and mildew resistance are correlated to the PC2. Hierarchical cluster analysis rearranged the 78 quinoa accessions into four main groups and ten sub-clusters. Clustering was found in associations with days to maturity and also with plant size and seed-size traits.

Pattern Recognition Using Feature Based Die-Map Clusteringin the Semiconductor Manufacturing Process

Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.

Face Recognition Based On Vector Quantization Using Fuzzy Neuro Clustering

A face recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame. A lot of algorithms have been proposed for face recognition. Vector Quantization (VQ) based face recognition is a novel approach for face recognition. Here a new codebook generation for VQ based face recognition using Integrated Adaptive Fuzzy Clustering (IAFC) is proposed. IAFC is a fuzzy neural network which incorporates a fuzzy learning rule into a competitive neural network. The performance of proposed algorithm is demonstrated by using publicly available AT&T database, Yale database, Indian Face database and a small face database, DCSKU database created in our lab. In all the databases the proposed approach got a higher recognition rate than most of the existing methods. In terms of Equal Error Rate (ERR) also the proposed codebook is better than the existing methods.

A Robust Method for Finding Nearest-Neighbor using Hexagon Cells

In pattern clustering, nearest neighborhood point computation is a challenging issue for many applications in the area of research such as Remote Sensing, Computer Vision, Pattern Recognition and Statistical Imaging. Nearest neighborhood computation is an essential computation for providing sufficient classification among the volume of pixels (voxels) in order to localize the active-region-of-interests (AROI). Furthermore, it is needed to compute spatial metric relationships of diverse area of imaging based on the applications of pattern recognition. In this paper, we propose a new methodology for finding the nearest neighbor point, depending on making a virtually grid of a hexagon cells, then locate every point beneath them. An algorithm is suggested for minimizing the computation and increasing the turnaround time of the process. The nearest neighbor query points Φ are fetched by seeking fashion of hexagon holistic. Seeking will be repeated until an AROI Φ is to be expected. If any point Υ is located then searching starts in the nearest hexagons in a circular way. The First hexagon is considered be level 0 (L0) and the surrounded hexagons is level 1 (L1). If Υ is located in L1, then search starts in the next level (L2) to ensure that Υ is the nearest neighbor for Φ. Based on the result and experimental results, we found that the proposed method has an advantage over the traditional methods in terms of minimizing the time complexity required for searching the neighbors, in turn, efficiency of classification will be improved sufficiently.