Enhanced Clustering Analysis and Visualization Using Kohonen's Self-Organizing Feature Map Networks

Cluster analysis is the name given to a diverse collection of techniques that can be used to classify objects (e.g., individuals, quadrats, species). While Kohonen's Self-Organizing Feature Map (SOFM), or Self-Organizing Map (SOM), networks have been successfully applied as a classification tool in various problem domains, including speech recognition, image data compression, image or character recognition, robot control and medical diagnosis, their potential as a robust substitute for clustering analysis remains relatively unresearched. SOM networks combine competitive learning with dimensionality reduction by smoothing the clusters with respect to an a priori grid, and they provide a powerful tool for data visualization. In this paper, SOM is used to create a toroidal mapping of a two-dimensional lattice to perform cluster analysis on the results of a chemical analysis of wines produced in the same region of Italy but derived from three different cultivars, referred to as the “wine recognition data” and located in the University of California-Irvine database. The results are encouraging, and it is believed that SOM would make an appealing and powerful decision-support tool for clustering tasks and for data visualization.
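The idea of a toroidal two-dimensional lattice can be illustrated with a minimal sketch. It is not the paper's implementation: the map size, the linear decay schedules, the Gaussian neighborhood and every function name below are assumptions chosen for illustration. The only point it makes is that lattice distances wrap around both edges, so no unit sits on a border.

```python
import math
import random

def toroidal_distance_sq(a, b, rows, cols):
    """Squared lattice distance between cells a and b with wrap-around edges."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    dr = min(dr, rows - dr)  # wrap vertically
    dc = min(dc, cols - dc)  # wrap horizontally
    return dr * dr + dc * dc

def best_matching_unit(weights, x):
    """Return the lattice cell whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda cell: sum((w - v) ** 2 for w, v in zip(weights[cell], x)),
    )

def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on a torus; all hyperparameters are illustrative."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # assumed floor of 0.5
            bmu = best_matching_unit(weights, x)
            for cell, w in weights.items():
                d2 = toroidal_distance_sq(cell, bmu, rows, cols)
                h = math.exp(-d2 / (2.0 * sigma * sigma))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

Inputs are assumed to be scaled to a common range beforehand; for chemical measurements on very different scales, as in the wine data, some such normalization would normally be needed before training.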

Denoising based on Wavelets and Deblurring via Self-Organizing Map for Synthetic Aperture Radar Images

This work deals with unsupervised image deblurring. We present a new deblurring procedure for images provided by low-resolution synthetic aperture radar (SAR) or simply by multimedia, in the presence of multiplicative (speckle) or additive noise, respectively. The method we propose is a two-step process. First, we use an original technique for noise reduction in the wavelet domain. Then, a Kohonen self-organizing map (SOM) is trained directly on the denoised image to remove the blur. This technique has been successfully applied to real SAR images, and simulation results are presented to demonstrate the effectiveness of the proposed algorithms.

Self-Assembling Hypernetworks for Cognitive Learning of Linguistic Memory

Hypernetworks are a generalized graph structure representing higher-order interactions between variables. We present a method for self-organizing hypernetworks to learn an associative memory of sentences and to recall the sentences from this memory. This learning method is inspired by the “mental chemistry” model of cognition and the “molecular self-assembly” technology in biochemistry. Simulation experiments are performed on a corpus of natural-language dialogues of approximately 300K sentences collected from TV drama captions. We report on the sentence completion performance as a function of the order of word-interaction and the size of the learning corpus, and discuss the plausibility of this architecture as a cognitive model of language learning and memory.

Performance Comparison of Particle Swarm Optimization with Traditional Clustering Algorithms used in Self-Organizing Map

The self-organizing map (SOM) is a well-known data reduction technique used in data mining. Through data visualization, it can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOM, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of an adaptive heuristic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOM. The application of our method to several standard data sets demonstrates its feasibility. The PSO algorithm uses the so-called U-matrix of the SOM to determine cluster boundaries; the results of this novel automatic method compare very favorably to boundary detection through the traditional algorithms, namely k-means and hierarchical approaches, that are normally used to interpret the output of SOM.
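For readers unfamiliar with the U-matrix, the following sketch shows one common way to compute it: each cell holds the average distance between a unit's code vector and those of its lattice neighbors, so high values mark likely cluster boundaries. It does not reproduce the PSO boundary search described above. The rectangular, non-wrapping lattice, the 4-neighborhood and the function name `u_matrix` are assumptions made for this illustration.

```python
import math

def u_matrix(codebook):
    """Mean Euclidean distance from each unit to its 4-connected neighbors.

    codebook[r][c] is the code vector of the unit at row r, column c.
    Assumes a rectangular lattice without wrap-around.
    """
    rows, cols = len(codebook), len(codebook[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(codebook[r][c], codebook[nr][nc]))
            # A 1x1 map has no neighbors; leave its value at 0.0.
            out[r][c] = sum(dists) / len(dists) if dists else 0.0
    return out

# Toy 1x4 map: two units near (0, 0) and two near (5, 0).
toy = [[[0.0, 0.0], [0.0, 0.0], [5.0, 0.0], [5.0, 0.0]]]
print(u_matrix(toy))  # prints [[0.0, 2.5, 2.5, 0.0]]
```

In the toy map the two middle units carry the large values, which is where a boundary between the two groups of code vectors would be drawn.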

AudioMine: Medical Data Mining in Heterogeneous Audiology Records

We report on the results of a pilot study in which a data-mining tool was developed for mining audiology records. The records were heterogeneous in that they contained numeric, categorical and textual data. The tools developed are designed to detect associations between any field in the records and any other field. The techniques employed were the statistical chi-squared test and self-organizing maps, an unsupervised neural learning approach.
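As an illustration of how a chi-squared test can flag an association between two categorical fields, the sketch below computes Pearson's statistic from a contingency table of counts. The table values and the function name `chi_squared` are hypothetical and are not taken from the audiology records.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts.

    table[i][j] is the number of records with category i of the first field
    and category j of the second field. Assumes every row and column total
    is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count under independence
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical counts for two two-valued fields.
stat, dof = chi_squared([[10, 20], [20, 10]])
print(round(stat, 3), dof)  # prints 6.667 1
```

With one degree of freedom, the 5% critical value is about 3.841, so a statistic of 6.667 would suggest that the two hypothetical fields are associated.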

Estimating an Optimal Neighborhood Size in the Spherical Self-Organizing Feature Map

This article presents a short discussion on optimum neighborhood size selection in a spherical self-organizing feature map (SOFM). The majority of the literature on SOFMs has addressed the issue of selecting optimal learning parameters for Cartesian-topology SOFMs. However, the use of a spherical SOFM suggests that the learning aspects of Cartesian-topology SOFMs do not translate directly. This article presents an approach for estimating the neighborhood size of a spherical SOFM based on the data. It adopts the L-curve criterion, previously suggested for choosing the regularization parameter in problems of linear equations whose right-hand side is contaminated with noise. Simulation results are presented on two artificial 4D data sets of the coupled Hénon-Ikeda map.

Mapping Paddy Rice Agriculture using Multi-temporal FORMOSAT-2 Images

Most paddy rice fields in East Asia are small parcels, and the weather conditions during the growing season are usually cloudy. FORMOSAT-2 multi-spectral images have an 8-meter resolution and one-day recurrence, which makes them ideal for mapping paddy rice fields in East Asia. To map rice fields, this study first determined the transplanting and the most active tillering stages of paddy rice and then used multi-temporal images to distinguish the growing characteristics of paddy rice from those of other ground covers. The unsupervised ISODATA (iterative self-organizing data analysis techniques) and supervised maximum likelihood methods were both used to discriminate paddy rice fields, with training areas automatically derived from ten-year cultivation parcels in Taiwan. Besides the original bands in the multi-spectral images, we also generated the normalized difference vegetation index and experimented with object-based pre-classification and post-classification. This paper discusses the results of different image classification methods in an attempt to find a precise and automatic solution to mapping paddy rice in Taiwan.
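The normalized difference vegetation index mentioned above is conventionally defined as (NIR - Red) / (NIR + Red), computed per pixel from the near-infrared and red bands. The sketch below applies that standard formula to small nested lists. The reflectance values and function names are hypothetical, and returning 0.0 when both bands are zero is an assumption made to avoid division by zero.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # assumed convention for pixels with no signal
    return (nir - red) / total

def ndvi_image(nir_band, red_band):
    """Apply ndvi() element-wise to two equally sized 2D bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]

# Hypothetical reflectances: first pixel vegetated, second pixel not.
print(ndvi_image([[0.5, 0.2]], [[0.1, 0.2]]))
```

Computing this index for images from several dates gives a per-pixel time series, which is one way the changes between transplanting and tillering stages can be compared across ground covers.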

A Growing Neural Gas Approach for Evaluating Quality of Software Modules

Predicting software quality during the development life cycle of a software project helps the development organization make efficient use of available resources to produce a product of the highest quality. A “whether a module is faulty or not” approach can be used to predict the quality of a software module. A number of software quality prediction models described in the literature are based on genetic algorithms, artificial neural networks and other data mining algorithms. One promising approach to quality prediction is based on clustering techniques. Most quality prediction models based on clustering make use of the K-means, Mixture-of-Gaussians, Self-Organizing Map, Neural Gas and fuzzy K-means algorithms. All of these techniques require a predefined structure; that is, the number of neurons or clusters must be known before the clustering process starts. Growing Neural Gas, in contrast, does not require the number of neurons or the topology of the structure to be determined in advance: it starts with a minimal structure of neurons that is incremented during training until it reaches a user-defined maximum number of clusters. Hence, in this work we use Growing Neural Gas as the underlying clustering algorithm to produce an initial set of labeled clusters from the training data set; this set of clusters is then used to predict the quality of the software modules in the test data set. The best testing results show 80% accuracy in evaluating the quality of software modules. Hence, the proposed technique can be used by programmers to evaluate the quality of modules during software development.

Building Relationship Network for Machine Analysis from Wear Debris Measurements

The main focus of this paper is the integration of system process information, obtained through an image processing system, with an evolving knowledge database to improve the accuracy and predictability of wear debris analysis. The objective is to intelligently automate the analysis of wear particles using classification via self-organizing maps. This is achieved using relationship measurements among corresponding attributes of various wear debris measurements. Finally, a visualization technique is proposed that helps the viewer understand and use these relationships to enable accurate diagnostics.

Evaluation of Service Continuity in a Self-organizing IMS

The NGN (Next Generation Network), which can provide advanced multimedia services over an all-IP based network, has been the subject of much attention for years. While there have been tremendous efforts to develop its architecture and protocols, especially for IMS, which is a key technology of the NGN, it is far from being widely deployed. Moreover, efforts to create an advanced signaling infrastructure that meets many requirements have resulted in a large number of functional components and interactions between those components. Carriers are therefore exploring effective ways to deploy IMS while offering value-added services. As one such approach, we have proposed a self-organizing IMS. A self-organizing IMS enables IMS functional components and the corresponding physical nodes to adapt dynamically and automatically to the situation, such as network load and available system resources, while continuing IMS operation. To realize this, service continuity for users is an important requirement when a reconfiguration occurs during operation. In this paper, we propose a mechanism that provides service continuity to users, focus on its implementation, and describe a performance evaluation in terms of the number of control signaling messages and the processing time during reconfiguration.