Abstract: A new dynamic clustering approach (DCPSO), based
on Particle Swarm Optimization, is proposed. This approach is
applied to unsupervised image classification. The proposed approach
automatically determines the "optimum" number of clusters and
simultaneously clusters the data set with minimal user interference.
The algorithm starts by partitioning the data set into a relatively large
number of clusters to reduce the effects of initial conditions. Using
binary particle swarm optimization, the "best" number of clusters is
selected. The centers of the chosen clusters are then refined via the K-means
clustering algorithm. The experiments conducted show that
the proposed approach generally found the "optimum" number of
clusters on the tested images.
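The select-then-refine step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the example pool of candidate centers and the fixed binary mask are all assumptions. In DCPSO the mask would be the position of a binary PSO particle scored by a cluster validity index; that search loop is omitted here.

```python
def assign(points, centers):
    """Return, for each point, the index of its nearest center (squared Euclidean)."""
    return [min(range(len(centers)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centers[j])))
            for pt in points]

def kmeans_refine(points, centers, iters=20):
    """Refine the given centers with plain Lloyd (K-means) iterations."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers

def select_and_refine(points, pool, mask):
    """Keep the pool centers whose mask bit is 1, then refine them with K-means.

    `mask` is a hypothetical stand-in for a binary PSO particle.
    """
    chosen = [c for c, bit in zip(pool, mask) if bit]
    return kmeans_refine(points, chosen)

# Hypothetical data: two well-separated groups and a pool of four candidates.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
pool = [(0, 0), (5, 5), (10, 10), (20, 20)]
centers = select_and_refine(points, pool, [1, 0, 1, 0])
print(centers)
```

Starting from a deliberately large pool and letting the mask switch centers off is what allows the number of clusters to be chosen by the search rather than fixed in advance.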
Abstract: Real-time and reliable report delivery is extremely important for making effective decisions in real-world, mission-critical applications based on Wireless Sensor Networks (WSNs). Sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable, real-time approach to data placement and data integrity using self-organized sensor clusters. Instead of storing information in individual cluster heads, as suggested in some protocols, our architecture stores the information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network we propose to use Action and Relay Stations (ARS). To reduce the average energy dissipation of sensor nodes, data is sent to the nearest ARS rather than to the base station. We have designed our architecture to achieve greater energy savings, enhanced availability and reliability.
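The rule of sending data to the nearest ARS can be illustrated with a short sketch. The station names, the 2D coordinates and the use of Euclidean distance as a proxy for transmission cost are assumptions made for illustration; the abstract does not specify them.

```python
import math

def nearest_station(node, stations):
    """Return the (name, position) pair of the station closest to `node`."""
    return min(stations.items(), key=lambda kv: math.dist(node, kv[1]))

# Hypothetical layout: three Action and Relay Stations in a plane.
ars = {"ARS-1": (0, 0), "ARS-2": (50, 50), "ARS-3": (100, 0)}
name, position = nearest_station((40, 45), ars)
print(name)  # ARS-2
```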
Abstract: This research analyzes the role that knowledge
about foreign markets plays in increasing firms' exports in clustered
spaces. We consider two interrelated sources of knowledge: firms'
direct experience and indirect experience from other clustered firms
(export externalities). In particular, it is proposed that firms
improve their export performance by accessing export externalities
if they have some previous direct experience that allows them to
identify, understand and exploit them. We also propose that this
positive influence of previous direct experience on export
externalities holds only up to a point, beyond which it becomes negative,
creating an inverted "U" shape. Empirical evidence gathered among
wine producers located in La Rioja tends to confirm that firms that
benefit from export externalities increase their export performance if
they have export experience across several years and countries. While this
relationship becomes less relevant as they accumulate more
experience, we could not confirm the existence of a curvilinear
relationship in their influence on export externalities and export
performance.
Abstract: This paper presents a supervised clustering algorithm,
namely Grid-Based Supervised Clustering (GBSC), which is able to
identify clusters of any shape and size without presuming any
canonical form for the data distribution. The GBSC needs no prespecified
number of clusters, is insensitive to the order of the input
data objects, and is capable of handling outliers. Built on a
combination of grid-based clustering and density-based clustering,
with the help of the downward closure property of density
used in bottom-up subspace clustering, the GBSC can notably reduce
its search space and so avoid memory limitations during its
execution. On two-dimensional synthetic datasets, the GBSC
identifies clusters with different shapes and sizes correctly. The GBSC
also outperforms five other supervised clustering algorithms in
experiments on some UCI datasets.
Abstract: Clustering methods developed in
data mining theory can be successfully applied to the investigation of
different kinds of dependencies between environmental
conditions and human activities. It is known that environmental
parameters such as temperature, relative humidity, atmospheric
pressure and illumination have significant effects on human
mental performance. To investigate the effect of these parameters, a data
mining clustering technique using entropy and the Information Gain
Ratio (IGR) K(Y/X) = (H(X)–H(Y/X))/H(Y) is used, where
H(Y)=-ΣPi ln(Pi). This technique allows adjusting the boundaries of
clusters. It is shown that the information gain ratio (IGR) grows
monotonically with the degree of connectivity
between two variables. This approach has some advantages
over, for example, correlation analysis due to its relatively
lower sensitivity to the shape of functional dependencies. A variant of an
algorithm implementing the proposed method, with some analysis of
the above problem of environmental effects, is also presented. It is
shown that the proposed method converges in a finite number of steps.
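The entropy-based ratio used in this abstract can be sketched for discrete (already binned) variables as follows. One assumption should be stated plainly: the abstract prints the numerator as H(X)–H(Y/X), whereas this sketch uses the conventional information gain H(Y)–H(Y|X), normalized by H(Y). Whether that matches the authors' intended definition is not confirmed by the text. The function names and sample data are illustrative.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy H = -sum(p_i * ln(p_i)) of a discrete sample."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())

def conditional_entropy(y, x):
    """H(Y|X) = sum over x of P(x) * H(Y | X = x)."""
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    return sum(len(g) / len(x) * entropy(g) for g in groups.values())

def gain_ratio(y, x):
    """ASSUMED form: (H(Y) - H(Y|X)) / H(Y); returns 0.0 when H(Y) is 0."""
    hy = entropy(y)
    return (hy - conditional_entropy(y, x)) / hy if hy > 0 else 0.0

# Hypothetical binned data: X fully determines Y in the first case,
# and tells nothing about Y in the second.
dependent = gain_ratio([0, 0, 1, 1], ["low", "low", "high", "high"])
independent = gain_ratio([0, 0, 1, 1], ["a", "b", "a", "b"])
print(dependent, independent)
```

Under this definition the ratio lies between 0 (X carries no information about Y) and 1 (X determines Y), regardless of the shape of the dependency, which is the property the abstract contrasts with correlation analysis.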
Abstract: Clusters of microcalcifications in mammograms are an
important sign of breast cancer. This paper presents a complete
Computer Aided Detection (CAD) scheme for automatic detection of
clustered microcalcifications in digital mammograms. The proposed
system, MammoScan μCaD, consists of three main steps. First,
all potential microcalcifications are detected using a method for
feature extraction, VarMet, and adaptive thresholding. This step also
produces a number of false detections. The goal of the second step,
Classifier level 1, is to remove everything but microcalcifications.
The last step, Classifier level 2, uses learned dictionaries and sparse
representations as a texture classification technique to distinguish
single, benign microcalcifications from clustered microcalcifications,
in addition to removing some remaining false detections. The system
is trained and tested on true digital data from Stavanger University
Hospital, and the results are evaluated by radiologists. The overall
results are promising, with a sensitivity > 90 % and a low false
detection rate (approx. 1 unwanted per image, or 0.3 false per image).
Abstract: This article investigates whether there is a correlation between eco-innovation and economic performance within industrial districts. The case analyzed is based on a study of a sample of 54 Italian industrial clusters, entitled "Eco-Districts", which compiled a list of the most eco-efficient districts at the national level. After selecting two districts, this study assesses their economic performance over the last three years by analyzing trends in four indicators. The results show that only in some cases is there a connection between eco-innovation and economic performance.
Abstract: Intelligent systems are required in order to quickly and accurately analyze enormous quantities of data in the Internet environment. In intelligent systems, information extraction processes can be divided into supervised learning and unsupervised learning. This paper investigates intelligent clustering by unsupervised learning. Intelligent clustering is a clustering system that determines the clustering model for data analysis and evaluates the results by itself. Such a system can build a clustering model more rapidly, objectively and accurately than a human analyst. The methodology for the automatic intelligent clustering system is a multi-agent system comprising a clustering agent and a cluster performance evaluation agent. Each agent exchanges information about clusters with the other, and the system determines the optimal number of clusters from this information. Experiments using data sets from the UCI Machine Learning Repository are performed in order to demonstrate the validity of the system.
Abstract: The supported Pd catalysts were analyzed by X-ray
diffraction and X-ray absorption spectroscopy in order to determine
their global and local structure. The average particle size of the
supported Pd catalysts was determined by X-ray diffraction method.
One of the main purposes of the present contribution is to focus on
understanding the specific role of the Pd particle size determined by
X-ray diffraction and that of the support oxide. Based on X-ray
absorption fine structure spectroscopy analysis, we consider that the
whole local structure of the investigated samples is distorted
with respect to the atomic number, but the distances between atoms are
almost the same as in the standard Pd sample. Due to the strong
modifications of the Pd cluster local structure, the metal-support
interface may influence the electronic properties of metal clusters
and thus their reactivity for absorption of the reactant molecules.
Abstract: Like any sentient organism, a smart environment
relies first and foremost on sensory data captured from the real
world. The sensory data come from sensor nodes of different
modalities deployed at different locations, forming a Wireless Sensor
Network (WSN). Embedding smart sensors in humans has been a
research challenge due to the limitations imposed by these sensors
from computational capabilities to limited power. In this paper, we
first propose a practical WSN application that will enable blind
people to see what their neighboring partners can see. The challenge
is that the actual mapping from input images to brain patterns
is too complex and not well understood. We also study the
connectivity problem in 3D/2D wireless sensor networks and propose
efficient distributed algorithms to achieve the required
connectivity of the system. We provide a new connectivity algorithm,
CDCA, to connect disconnected parts of a network using cooperative
diversity. Through simulations, we analyze the connectivity gains
and energy savings provided by this novel form of cooperative
diversity in WSNs.
Abstract: The self-organizing map (SOM) is a well-known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. The PSO algorithm uses the U-matrix of the SOM to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and to the k-means algorithm.
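For readers unfamiliar with the U-matrix mentioned in this abstract, the sketch below computes one from a grid of code vectors: each entry is the mean distance between a map unit's code vector and those of its grid neighbours, so high values mark likely cluster boundaries. The rectangular grid, the 4-neighbourhood and the function name are assumptions; the PSO boundary search itself is not shown.

```python
import math

def u_matrix(codebook):
    """Mean distance from each unit's code vector to its 4-connected neighbours.

    `codebook[r][c]` is the code vector of the unit at row r, column c.
    Assumes a rectangular map with at least two units.
    """
    rows, cols = len(codebook), len(codebook[0])
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = [math.dist(codebook[r][c], codebook[nr][nc])
                     for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                     if 0 <= nr < rows and 0 <= nc < cols]
            u[r][c] = sum(dists) / len(dists)
    return u

# Hypothetical 1 x 4 map with two groups of similar code vectors.
codebook = [[(0.0,), (0.1,), (5.0,), (5.1,)]]
u = u_matrix(codebook)
print(u[0])
```

In this toy map the two middle units sit on the boundary between the groups, so their U-matrix values are much larger than those of the outer units.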
Abstract: Young patients suffering from Cerebral Palsy
face difficult choices concerning heavy surgeries. The diagnosis made
by surgeons can be complex and, on the other hand, the patient's decision
on whether to undergo such surgery requires considerable
reflection. The proposed software, which combines prediction of
surgeries and post-surgery kinematic values with a 3D model
representing the patient, is an innovative tool helpful to both patients
and medical professionals. Starting from the analysis and
classification into 3 separate clusters of kinematic values from a database
of gait analyses, it is possible to identify closely
similar patients. The surgery best suited to
improving a patient's gait is then predicted by a suitably
preconditioned neural network. Finally, a 3D model of the patient, based
on the analysis of kinematic values, is animated using the post-surgery
kinematic vectors of the closest patient selected by
the clustering.
Abstract: This research article aims to identify the gains of
University-Industry (U-I) collaborations and to explore the
hurdles that stand in the way of attaining these
gains. University-Industry collaborations have attained great
importance in the USA since 1980 due to their application in all fields of
life. U-I collaboration is a bilateral process in which academia is a
proactive member in forming such alliances. Universities want to
improve their academic base with the technicalities of technobabbles.
U-I collaboration is becoming an essential route for achieving
innovative goals in this century. Many developed nations have set
successful examples to prove this phenomenon as a catalyst to reduce
costs, efforts and personnel for R&D projects. This study explores
the extent of U-I collaboration incentives in the light of success
stories from developed countries. Many universities in the USA, the UK,
Canada and various European Countries have been engaged with
enterprises for numerous collaborative agreements. A long list of
strategic and short term R&D projects has been executed in
developed countries to accomplish their intended purposes. Due to
a lack of intent, genuine research and a research-oriented
environment, this field has not grown well in
developing countries. During the last decade, a new wave of research
has induced institutes in developing countries, especially in Pakistan,
to promote an R&D culture. The Higher Education Commission (HEC)
has initiated many projects and funding programs for universities
that intend to collaborate with industry.
Findings show that rapid innovation, overcoming technological
complexities and an articulated intellectual base are the major incentives
that steer both partners to establish faculty-industry alliances.
Ever-changing technologies, concerns about intellectual property,
different research environments and cultures, research relevance (basic
or applied), differences in exposure and diversity of knowledge
(bookish or practical) are the main barriers to establishing and retaining joint
ventures. The findings also indicate a dire need to support and
enhance cooperation between academia and industry to promote highly
coordinated research behaviors. The author proposes a roadmap for
developing countries to promote R&D clusters among faculty and
industry to deal with technological challenges and innovation
complexities. Based on our research findings, a Model for R&D
Collaboration for developing countries is also proposed to
promote an articulated R&D environment. If developing countries
follow this approach, rapid innovation can be achieved with
limited R&D budgets.
Abstract: Clustering techniques have received attention in many areas including engineering, medicine, biology and data mining. The purpose of clustering is to group together data points that are close to one another. The K-means algorithm is one of the most widely used techniques for clustering. However, K-means has two shortcomings: it depends on the initial state and converges to local optima, and global solutions of large problems cannot be found with a reasonable amount of computational effort. Many studies on clustering have been carried out to overcome the local optima problem. This paper presents an efficient hybrid evolutionary optimization algorithm based on combining Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), called PSO-ACO, for optimally clustering N objects into K clusters. The new PSO-ACO algorithm is tested on several data sets, and its performance is compared with those of ACO, PSO and K-means clustering. The simulation results show that the proposed evolutionary optimization algorithm is robust and suitable for handling data clustering.
Abstract: Methods for organizing web data into groups in order
to analyze web-based hypertext data and facilitate data availability
are very important given the number of documents available
online. The task of clustering web-based document structures therefore
has many applications, e.g., improving information retrieval on the
web, better understanding of user navigation behavior, improving the
servicing of web users' requests, and increasing web information accessibility.
In this paper we investigate a new approach for clustering web-based
hypertexts on the basis of their graph structures. The hypertexts will
be represented as so-called generalized trees, which are more general
than usual directed rooted trees, e.g., DOM-Trees. As an important
preprocessing step we measure the structural similarity between the
generalized trees on the basis of a similarity measure d. Then,
we apply agglomerative clustering to the obtained similarity matrix
in order to create clusters of hypertext graph patterns representing
navigation structures. In the present paper we will run our approach
on a data set of hypertext structures and obtain good results in
Web Structure Mining. Furthermore we outline the application of
our approach in Web Usage Mining as future work.
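The step of applying agglomerative clustering to a similarity matrix can be sketched as follows. The abstract does not state the linkage criterion or the stopping rule, so average linkage and a fixed target number of clusters are assumptions here, as are the function name and the toy matrix. The structural similarity measure d itself is not reproduced.

```python
def agglomerate(sim, n_clusters):
    """Average-linkage agglomerative clustering on a symmetric similarity matrix.

    `sim[i][j]` is the similarity of items i and j (higher means more alike).
    Assumes 1 <= n_clusters <= len(sim). Returns a list of index lists.
    """
    clusters = [[i] for i in range(len(sim))]

    def linkage(a, b):
        # Mean pairwise similarity between the members of two clusters.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the most similar pair of clusters.
        _, p, q = max((linkage(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

# Hypothetical similarities between four hypertext graphs.
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
groups = agglomerate(sim, 2)
print(groups)  # [[0, 1], [2, 3]]
```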
Abstract: Due to memory leaks, valuable system memory is often
wasted and denied to other processes, thereby affecting
computational performance. If an application's memory usage
exceeds the virtual memory size, it can lead to a system crash. Current
memory leak detection techniques for clusters are reactive and
display the memory leak information after the execution of the
process (they detect a memory leak only after it occurs).
This paper presents a Dynamic Memory Monitoring Agent
(DMMA) technique. The DMMA framework is a dynamic memory leak
detection technique that detects a memory leak while the application is in
its execution phase. When a memory leak in any process in the cluster is
identified, DMMA informs the end users to enable
them to take corrective actions, and it also submits the affected
process to a healthy node in the system, thus providing reliable service
to the user. DMMA maintains information about the memory
consumption of executing processes and, based on this information
and critical states, DMMA can improve the reliability and
efficiency of cluster computing.
Abstract: Utilization of diverse germplasm is needed to enhance
the genetic diversity of cultivars. The objective of this study was to
evaluate the genetic relationships of 98 alfalfa germplasm accessions
using morphological traits and SSR markers. Of the 98 tested
populations, 81 were local populations originating in Europe and 17 were
introduced from the USA, Australia, New Zealand and Canada. Three primers
generated 67 polymorphic bands. The average polymorphic
information content (PIC) was very high (> 0.90) across all three
primer combinations used. Cluster analysis using the Unweighted Pair Group
Method with Arithmetic Means (UPGMA) and Jaccard's coefficient
grouped the accessions into 2 major clusters with 4 sub-clusters, with
no correlation between genetic and morphological diversity. The SSR
analysis clearly indicated that even with three polymorphic primers,
reliable estimation of genetic diversity could be obtained.
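Jaccard's coefficient, as used in this abstract, compares two accessions by their band presence/absence profiles: the number of bands present in both, divided by the number present in at least one. The sketch below is illustrative only; the profiles are invented, and returning 1.0 for two all-absent profiles is a convention chosen here, not something the abstract specifies.

```python
def jaccard(a, b):
    """Jaccard similarity of two equal-length binary band profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0  # convention for two empty profiles

def similarity_matrix(profiles):
    """Pairwise Jaccard similarities, e.g. as input to UPGMA (average linkage)."""
    return [[jaccard(p, q) for q in profiles] for p in profiles]

# Hypothetical presence (1) / absence (0) scores for five bands.
acc_a = [1, 1, 0, 1, 0]
acc_b = [1, 0, 0, 1, 1]
print(jaccard(acc_a, acc_b))  # 0.5
```

Shared absences do not count toward similarity, which is why Jaccard's coefficient is a common choice for dominant or band-scored marker data.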
Abstract: The goal of this paper is to segment countries
based on the value of exports from Iran during the 14 years ending in 2005.
To measure the dissimilarity among the export baskets of different
countries, we define a Dissimilarity Export Basket (DEB) function and
use this distance function in the K-means algorithm. The DEB function
is defined based on the concepts of association rules and the
value of exported commodity groups. In this paper, a clustering quality
function and cluster intraclass inertia are defined to, respectively,
calculate the optimum number of clusters and compare the
performance of DEB versus Euclidean distance. We also study
the effect of importance weights in the DEB function on
clustering quality. Lastly, when segmentation is completed, a
designated RFM model is used to analyze the relative profitability of
each cluster.
Abstract: Data clustering is an important data exploration technique
with many applications in data mining. We present an enhanced
version of the well-known single link clustering algorithm. We will
refer to this algorithm as DCBOR. The proposed algorithm alleviates
the chain effect by removing the outliers from the given dataset,
so it provides outlier detection and data clustering
simultaneously. The algorithm does not need to update the distance
matrix, since it merges the k nearest
objects in one step and a cluster continues to grow as long as possible
under a specified condition. The algorithm consists of two phases:
in the first phase, it removes the outliers from the input dataset; in
the second phase, it performs the clustering process. The algorithm
discovers clusters of different shapes, sizes and densities and requires
only one input parameter, which represents a threshold for
outlier points. The value of the input parameter ranges from 0 to
1. The algorithm supports the user in determining an appropriate
value for it. We have tested this algorithm on different datasets that
contain outliers and clusters connected by chains of dense points,
and the algorithm discovers the correct clusters. The results of
our experiments demonstrate the effectiveness and the efficiency of
DCBOR.
Abstract: Image coding based on clustering provides immediate
access to targeted features of interest in a high quality decoded
image. This approach is useful for intelligent devices, as well as for
multimedia content-based description standards. The result of image
clustering cannot be precise in some positions especially on pixels
with edge information which produce ambiguity among the clusters.
Even with a good enhancement operator based on PDE, the quality of
the decoded image will highly depend on the clustering process. In
this paper, we introduce an ambiguity cluster in image coding to
represent pixels with vagueness properties. The presence of such a
cluster allows preserving some details inherent to edges as well as to
uncertain pixels. It will also be very useful during the decoding phase
in which an anisotropic diffusion operator, such as Perona-Malik,
enhances the quality of the restored image. This work also offers a
comparative study to demonstrate the effectiveness of a fuzzy
clustering technique in detecting the ambiguity cluster without losing
much of the essential image information. Several experiments have been
carried out to demonstrate the usefulness of the ambiguity concept in
image compression. The coding results and the performance of the
proposed algorithms are discussed in terms of the peak signal-to-noise
ratio and the quantity of ambiguous pixels.
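The peak signal-to-noise ratio used to report the coding results is the standard measure PSNR = 10 log10(peak^2 / MSE). A minimal sketch for grayscale images stored as nested lists follows; the 8-bit peak value of 255 and the sample pixel values are assumptions.

```python
import math

def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    diffs = [(o - d) ** 2
             for row_o, row_d in zip(original, decoded)
             for o, d in zip(row_o, row_d)]
    mse = sum(diffs) / len(diffs)
    # Identical images have zero error, i.e. infinite PSNR.
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical 2 x 2 image and a decoded version with two pixels off by 10.
original = [[100, 100], [100, 100]]
decoded = [[110, 90], [100, 100]]
value = psnr(original, decoded)
print(round(value, 2))
```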