Dynamic Clustering using Particle Swarm Optimization with Application in Unsupervised Image Classification

A new dynamic clustering approach (DCPSO), based on Particle Swarm Optimization, is proposed. This approach is applied to unsupervised image classification. The proposed approach automatically determines the "optimum" number of clusters and simultaneously clusters the data set with minimal user interference. The algorithm starts by partitioning the data set into a relatively large number of clusters to reduce the effects of initial conditions. Using binary particle swarm optimization, the "best" number of clusters is selected. The centers of the chosen clusters are then refined via the K-means clustering algorithm. The experiments conducted show that the proposed approach generally found the "optimum" number of clusters on the tested images.
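
The final refinement step mentioned above can be illustrated with a minimal sketch. It assumes plain Lloyd-style K-means on small tuples of numbers; the function name `kmeans_refine`, the iteration cap, and the toy data are illustrative assumptions, not the authors' implementation, and the binary PSO selection step is not shown.

```python
def kmeans_refine(points, centers, iterations=10):
    """Refine already-chosen cluster centers with plain K-means (illustrative sketch)."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = []
        for center, group in zip(centers, groups):
            if group:
                new_centers.append(tuple(sum(axis) / len(group) for axis in zip(*group)))
            else:
                new_centers.append(center)  # keep a center that attracted no points
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers


# Hypothetical toy data: two well-separated groups and two rough starting centers.
toy_points = [(0, 0), (0, 2), (10, 0), (10, 2)]
refined = kmeans_refine(toy_points, [(1, 1), (8, 1)])
```

Here the starting centers stand in for whatever centers the selection step has kept; the sketch only moves them to the means of the points they attract.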

Real Time Approach for Data Placement in Wireless Sensor Networks

The issue of real-time and reliable report delivery is extremely important for making effective decisions in a real-world, mission-critical Wireless Sensor Network (WSN) based application. Sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable and real-time approach for data placement and for achieving data integrity using self-organized sensor clusters. Instead of storing information in individual cluster heads as suggested in some protocols, our architecture stores the information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network we propose to use Action and Relay Stations (ARS). To reduce the average energy dissipation of sensor nodes, the data is sent to the nearest ARS rather than to the base station. We have designed our architecture so as to achieve greater energy savings and enhanced availability and reliability.

The Influence of Local Export Externalities and Firm International Experience on Export Performance

This research analyzes the role that knowledge about foreign markets plays in increasing firms' exports in clustered spaces. We consider two interrelated sources of knowledge: firms' direct experience and indirect experience from other clustered firms (export externalities). In particular, it is proposed that firms would improve their export performance by accessing export externalities if they have some previous direct experience that allows them to identify, understand and exploit them. We also propose that this positive influence of previous direct experience on export externalities holds only up to a point, after which it becomes negative, creating an inverted "U" shape. Empirical evidence gathered among wine producers located in La Rioja tends to confirm that firms benefiting from export externalities increase their export performance if they have export experience across several years and countries. While this relationship becomes less relevant as firms gain more experience, we could not confirm the existence of a curvilinear relationship in its influence on export externalities and export performance.

Grid-based Supervised Clustering - GBSC

This paper presents a supervised clustering algorithm, namely Grid-Based Supervised Clustering (GBSC), which is able to identify clusters of any shape and size without presuming any canonical form for the data distribution. The GBSC needs no prespecified number of clusters, is insensitive to the order of the input data objects, and is capable of handling outliers. Built on a combination of grid-based clustering and density-based clustering, and assisted by the downward closure property of density used in bottom-up subspace clustering, the GBSC can notably reduce its search space to avoid running into memory limits during its execution. On two-dimensional synthetic datasets, the GBSC correctly identifies clusters with different shapes and sizes. The GBSC also outperforms five other supervised clustering algorithms in experiments on some UCI datasets.

Information Gain Ratio Based Clustering for Investigation of Environmental Parameters Effects on Human Mental Performance

Clustering methods developed in data mining theory can be successfully applied to the investigation of different kinds of dependencies between environmental conditions and human activities. It is known that environmental parameters such as temperature, relative humidity, atmospheric pressure and illumination have significant effects on human mental performance. To investigate the effect of these parameters, a data mining technique of clustering using entropy and Information Gain Ratio (IGR) K(Y/X) = (H(X)–H(Y/X))/H(Y) is used, where H(Y)=-ΣPi ln(Pi). This technique allows the boundaries of clusters to be adjusted. It is shown that the IGR grows monotonically with the degree of connectivity between two variables. This approach has some advantages over, for example, correlation analysis, owing to its relatively lower sensitivity to the shape of functional dependencies. A variant of an algorithm implementing the proposed method is also presented, together with some analysis of the above problem of environmental effects. It is shown that the proposed method converges in a finite number of steps.
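
The entropy H(Y) = -ΣPi ln(Pi) and a gain ratio built on it can be sketched as follows. This is a hedged illustration only: the numerator below uses the commonly seen form H(Y) - H(Y|X), which is an assumption, since the formula as printed in the abstract reads H(X) - H(Y/X). The function names and the toy labels are hypothetical.

```python
from collections import Counter
from math import log


def entropy(values):
    """H(Y) = -sum(P_i * ln(P_i)) over the observed categories."""
    n = len(values)
    return -sum((c / n) * log(c / n) for c in Counter(values).values())


def conditional_entropy(y, x):
    """H(Y|X) = sum over x-values of P(x) * H(Y | X = x)."""
    n = len(x)
    total = 0.0
    for x_value, count in Counter(x).items():
        subset = [yi for xi, yi in zip(x, y) if xi == x_value]
        total += (count / n) * entropy(subset)
    return total


def gain_ratio(y, x):
    """Assumed form: (H(Y) - H(Y|X)) / H(Y); returns 0.0 when Y is constant."""
    h_y = entropy(y)
    if h_y == 0:
        return 0.0
    return (h_y - conditional_entropy(y, x)) / h_y


# Hypothetical discretized data: a temperature cluster label and a performance label.
temperature = ["low", "low", "high", "high"]
performance = ["bad", "bad", "good", "good"]
ratio = gain_ratio(performance, temperature)
```

With this assumed form the ratio is 1 when X fully determines Y and 0 when the two are independent, which matches the monotonic behaviour the abstract describes.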

A Computer Aided Detection (CAD) System for Microcalcifications in Mammograms - MammoScan μCaD

Clusters of microcalcifications in mammograms are an important sign of breast cancer. This paper presents a complete Computer Aided Detection (CAD) scheme for automatic detection of clustered microcalcifications in digital mammograms. The proposed system, MammoScan μCaD, consists of three main steps. First, all potential microcalcifications are detected using a method for feature extraction, VarMet, and adaptive thresholding. This also yields a number of false detections. The goal of the second step, Classifier level 1, is to remove everything but microcalcifications. The last step, Classifier level 2, uses learned dictionaries and sparse representations as a texture classification technique to distinguish single, benign microcalcifications from clustered microcalcifications, in addition to removing some remaining false detections. The system is trained and tested on true digital data from Stavanger University Hospital, and the results are evaluated by radiologists. The overall results are promising, with a sensitivity > 90 % and a low false detection rate (approx. 1 unwanted detection per image, or 0.3 false detections per image).

Eco-innovation and Economic Performance in Industrial Clusters: Evidence from Italy

The article investigates whether there is a correlation between eco-innovation and economic performance within industrial districts. The case analyzed in this article is based on a study entitled "Eco-Districts", which covered a sample of 54 Italian industrial clusters and compiled a list of the most eco-efficient districts at the national level. After selecting two districts, this study assesses their economic performance over the last three years through the analysis of trends in four indicators. The results show that only in some cases is there a connection between eco-innovation and economic performance.

Multi-Agent Systems for Intelligent Clustering

Intelligent systems are required in order to quickly and accurately analyze enormous quantities of data in the Internet environment. In intelligent systems, information extraction processes can be divided into supervised learning and unsupervised learning. This paper investigates intelligent clustering by unsupervised learning. Intelligent clustering is a clustering system that determines the clustering model for data analysis and evaluates the results by itself. Such a system can build a clustering model more rapidly, objectively and accurately than a human analyst. The methodology for the automatic intelligent clustering system is a multi-agent system that comprises a clustering agent and a cluster performance evaluation agent. Each agent exchanges information about clusters with the other agent, and the system determines the optimal number of clusters from this information. Experiments using data sets from the UCI Machine Learning Repository are performed in order to demonstrate the validity of the system.

Global and Local Structure of Supported Pd Catalysts

The supported Pd catalysts were analyzed by X-ray diffraction and X-ray absorption spectroscopy in order to determine their global and local structure. The average particle size of the supported Pd catalysts was determined by the X-ray diffraction method. One of the main purposes of the present contribution is to understand the specific role of the Pd particle size determined by X-ray diffraction and that of the support oxide. Based on X-ray absorption fine structure spectroscopy analysis, we consider that the whole local structure of the investigated samples is distorted with respect to the atomic number, but the distances between atoms are almost the same as for a standard Pd sample. Due to the strong modifications of the Pd cluster local structure, the metal-support interface may influence the electronic properties of the metal clusters and thus their reactivity for absorption of the reactant molecules.

Practical Applications and Connectivity Algorithms in Future Wireless Sensor Networks

Like any sentient organism, a smart environment relies first and foremost on sensory data captured from the real world. The sensory data come from sensor nodes of different modalities deployed at different locations, forming a Wireless Sensor Network (WSN). Embedding smart sensors in humans has been a research challenge due to the limitations of these sensors, ranging from computational capabilities to limited power. In this paper, we first propose a practical WSN application that will enable blind people to see what their neighboring partners can see. The challenge is that the actual mapping between the input images and brain patterns is too complex and not well understood. We also study the connectivity problem in 3D/2D wireless sensor networks and propose efficient distributed algorithms to achieve the required connectivity of the system. We provide a new connectivity algorithm, CDCA, to connect disconnected parts of a network using cooperative diversity. Through simulations, we analyze the connectivity gains and energy savings provided by this novel form of cooperative diversity in WSNs.

Determining Cluster Boundaries Using Particle Swarm Optimization

The self-organizing map (SOM) is a well-known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. The PSO algorithm utilizes the U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and through the k-means algorithm.

A Predictive Rehabilitation Software for Cerebral Palsy Patients

Young patients suffering from Cerebral Palsy face difficult choices concerning major surgeries. The diagnosis made by surgeons can be complex, and the patient's decision on whether or not to undergo such surgery requires considerable reflection. The proposed software, which combines prediction of surgeries and of post-surgery kinematic values with a 3D model representing the patient, is an innovative tool helpful to both patients and medical professionals. Beginning with the analysis and classification, into 3 separate clusters, of kinematic values from a database extracted from gait analysis, it is possible to determine close similarity between patients. The surgery best adapted to improve a patient's gait is then predicted using a suitably preconditioned neural network. Finally, a 3D model of the patient, based on kinematic value analysis, is animated using the post-surgery kinematic vectors characterizing the closest patient selected from the patient clustering.

Faculty-Industry R&D Joint Ventures: Barriers VS Incentives for Developing Nations

This research article aims to identify the gains of university-industry (U-I) collaborations and to explore the hurdles that obstruct attaining these gains. U-I collaborations have gained great importance in the USA since 1980 due to their application in all fields of life. U-I collaboration is a bilateral process in which academia is a proactive member in forming such alliances. Universities want to strengthen their academic base with technical expertise. U-I collaboration is becoming an essential route to achieving innovative goals in this century. Many developed nations have set successful examples that show this phenomenon to be a catalyst for reducing the costs, effort and personnel of R&D projects. This study explores the range of U-I collaboration incentives in the light of success stories from developed countries. Many universities in the USA, UK, Canada and various European countries have entered into numerous collaborative agreements with enterprises. A long list of strategic and short-term R&D projects has been executed in developed countries to accomplish their intended purposes. Due to a lack of intent, genuine research and a research-oriented environment, this field has not grown well in developing countries. During the last decade, a new wave of research has induced the institutes of developing countries, especially in Pakistan, to promote an R&D culture. The Higher Education Commission (HEC) has initiated many projects and funding support for universities that intend to collaborate with industry. Findings show that rapid innovation, overcoming technological complexities and an articulated intellectual base are the major incentives that steer both partners to establish faculty-industry alliances. Ever-changing technologies, concerns about intellectual property, different research environments and cultures, research relevance (basic or applied), exposure differences and diversity of knowledge (bookish or practical) are the main barriers to establishing and retaining joint ventures. The findings also indicate a dire need to support and enhance cooperation between academia and industry to promote highly coordinated research behaviors. The author proposes a roadmap for developing countries to promote R&D clusters among faculty and industry to deal with technological challenges and innovation complexities. Based on our research findings, a Model for R&D Collaboration for developing countries is also proposed to promote an articulated R&D environment. If developing countries follow this approach, rapid innovations can be achieved with limited R&D budgets.

Application of a New Hybrid Optimization Algorithm on Cluster Analysis

Clustering techniques have received attention in many areas including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points that are close to one another. The K-means algorithm is one of the most widely used techniques for clustering. However, K-means has two shortcomings: it depends on the initial state and converges to local optima, and global solutions of large problems cannot be found with a reasonable amount of computational effort. Many studies in clustering have been carried out to overcome the local optima problem. This paper presents an efficient hybrid evolutionary optimization algorithm based on combining Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N objects into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with those of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handling data clustering.

Towards Clustering of Web-based Document Structures

Methods for organizing web data into groups in order to analyze web-based hypertext data and facilitate data availability are very important given the number of documents available online. The task of clustering web-based document structures therefore has many applications, e.g., improving information retrieval on the web, better understanding of user navigation behavior, improving the servicing of web users' requests, and increasing web information accessibility. In this paper we investigate a new approach for clustering web-based hypertexts on the basis of their graph structures. The hypertexts are represented as so-called generalized trees, which are more general than the usual directed rooted trees, e.g., DOM-Trees. As an important preprocessing step, we measure the structural similarity between the generalized trees on the basis of a similarity measure d. Then, we apply agglomerative clustering to the obtained similarity matrix in order to create clusters of hypertext graph patterns representing navigation structures. In the present paper we run our approach on a data set of hypertext structures and obtain good results in Web Structure Mining. Furthermore, we outline the application of our approach in Web Usage Mining as future work.

Memory Leak Detection in Distributed System

Memory leaks often cause valuable system memory to be wasted and denied to other processes, thereby affecting computational performance. If an application's memory usage exceeds the virtual memory size, it can lead to a system crash. Current memory leak detection techniques for clusters are reactive and display the memory leak information after the execution of the process (they detect a memory leak only after it occurs). This paper presents a Dynamic Memory Monitoring Agent (DMMA) technique. The DMMA framework is a dynamic memory leak detection technique that detects a memory leak while the application is executing. When DMMA identifies a memory leak in any process in the cluster, it informs the end users so that they can take corrective action, and it also submits the affected process to a healthy node in the system, thus providing reliable service to the user. DMMA maintains information about the memory consumption of executing processes and, based on this information and critical states, can improve the reliability and efficiency of cluster computing.

Using Morphological and Microsatellite (SSR) Markers to Assess the Genetic Diversity in Alfalfa (Medicago sativa L.)

Utilization of diverse germplasm is needed to enhance the genetic diversity of cultivars. The objective of this study was to evaluate the genetic relationships of 98 alfalfa germplasm accessions using morphological traits and SSR markers. Of the 98 tested populations, 81 were local populations originating in Europe and 17 were introduced from the USA, Australia, New Zealand and Canada. Three primers generated 67 polymorphic bands. The average polymorphic information content (PIC) was very high (> 0.90) over all three primer combinations used. Cluster analysis using the Unweighted Pair Group Method with Arithmetic Means (UPGMA) and Jaccard's coefficient grouped the accessions into 2 major clusters with 4 sub-clusters, with no correlation between genetic and morphological diversity. The SSR analysis clearly indicated that even with three polymorphic primers, a reliable estimate of genetic diversity could be obtained.
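
As a small illustration of the Jaccard coefficient used in the cluster analysis above, the sketch below compares band presence/absence profiles. The 0/1 profiles and accession names are made-up placeholders, not data from the study, and the UPGMA step that would consume the resulting distances is not shown.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient of two 0/1 band profiles: shared bands / bands present in either."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two profiles with no bands at all are treated as identical (a convention chosen here).
    return both / either if either else 1.0


# Hypothetical presence (1) / absence (0) of five polymorphic bands in three accessions.
profiles = {
    "accession_A": [1, 1, 0, 1, 0],
    "accession_B": [1, 0, 0, 1, 1],
    "accession_C": [0, 0, 1, 0, 0],
}

# Pairwise Jaccard distances (1 - similarity), the usual input to a clustering step.
names = sorted(profiles)
distances = {
    (p, q): 1.0 - jaccard_similarity(profiles[p], profiles[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}
```

Shared absences do not count toward similarity under this coefficient, which is why it is a common choice for band presence/absence data.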

Customer Segmentation in Foreign Trade based on Clustering Algorithms Case Study: Trade Promotion Organization of Iran

The goal of this paper is to segment countries based on the value of exports from Iran over the 14 years ending in 2005. To measure the dissimilarity among the export baskets of different countries, we define the Dissimilarity Export Basket (DEB) function and use this distance function in the K-means algorithm. The DEB function is defined based on the concepts of association rules and the value of export group-commodities. In this paper, a clustering quality function and cluster intraclass inertia are defined to, respectively, calculate the optimum number of clusters and compare the functionality of DEB versus Euclidean distance. We also study the effects of importance weights in the DEB function on improving clustering quality. Lastly, when segmentation is completed, a designated RFM model is used to analyze the relative profitability of each cluster.

DCBOR: A Density Clustering Based on Outlier Removal

Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well-known single-link clustering algorithm. We refer to this algorithm as DCBOR. The proposed algorithm alleviates the chain effect by removing the outliers from the given dataset, so it provides outlier detection and data clustering simultaneously. The algorithm does not need to update the distance matrix, since it merges the k nearest objects in one step and the cluster continues to grow as long as possible under a specified condition. The algorithm consists of two phases: in the first phase, it removes the outliers from the input dataset; in the second phase, it performs the clustering process. The algorithm discovers clusters of different shapes, sizes and densities and requires only one input parameter, which represents a threshold for outlier points. The value of the input parameter ranges from 0 to 1. The algorithm supports the user in determining an appropriate value for it. We have tested this algorithm on different datasets that contain outliers and clusters connected by chains of dense points, and the algorithm discovers the correct clusters. The results of our experiments demonstrate the effectiveness and the efficiency of DCBOR.

Influence of Ambiguity Cluster on Quality Improvement in Image Compression

Image coding based on clustering provides immediate access to targeted features of interest in a high-quality decoded image. This approach is useful for intelligent devices, as well as for multimedia content-based description standards. The result of image clustering cannot be precise in some positions, especially on pixels with edge information, which produce ambiguity among the clusters. Even with a good enhancement operator based on PDE, the quality of the decoded image will depend strongly on the clustering process. In this paper, we introduce an ambiguity cluster in image coding to represent pixels with vagueness properties. The presence of such a cluster allows some details inherent to edges, as well as to uncertain pixels, to be preserved. It is also very useful during the decoding phase, in which an anisotropic diffusion operator, such as Perona-Malik, enhances the quality of the restored image. This work also offers a comparative study to demonstrate the effectiveness of a fuzzy clustering technique in detecting the ambiguity cluster without losing much of the essential image information. Several experiments have been carried out to demonstrate the usefulness of the ambiguity concept in image compression. The coding results and the performance of the proposed algorithms are discussed in terms of the peak signal-to-noise ratio and the quantity of ambiguous pixels.
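
The peak signal-to-noise ratio used as a quality measure above can be computed as in the following minimal sketch. It assumes 8-bit grayscale images given as flat lists of pixel values with a peak of 255; the function name and the sample pixels are illustrative, not taken from the paper.

```python
from math import log10


def psnr(original, decoded, peak=255):
    """PSNR in dB: 10 * log10(peak^2 / MSE), for two equal-length pixel lists."""
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return float("inf")  # identical images: no error to measure
    return 10 * log10(peak ** 2 / mse)


# Hypothetical 2x2 image flattened to a list, and a slightly distorted decoding of it.
original_pixels = [52, 55, 61, 59]
decoded_pixels = [54, 55, 60, 58]
quality_db = psnr(original_pixels, decoded_pixels)
```

A higher value means the decoded image is closer to the original; identical images give an infinite PSNR under this definition.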