Abstract: Asset management of urban infrastructures faces a multitude of challenges that need to be overcome to obtain a reliable measurement of performances. Predicting the performance of slowly aging systems is one of those challenges, which helps the asset manager to investigate specific failure modes and to undertake the appropriate maintenance and rehabilitation interventions to avoid catastrophic failures as well as to optimize the maintenance costs. This article presents a methodology for modeling the deterioration of slowly degrading assets based on an operating history. It consists of extracting degradation profiles by grouping together assets that exhibit similar degradation sequences using an unsupervised classification technique derived from artificial intelligence. The obtained clusters are used to build the performance prediction models. This methodology is applied to a sample of a stormwater drainage culvert dataset.
Abstract: Abstract—Attribute or feature selection is one of the basic
strategies to improve the performances of data classification tasks,
and, at the same time, to reduce the complexity of classifiers,
and it is a particularly fundamental one when the number
of attributes is relatively high. Its application to unsupervised
classification is restricted to a limited number of experiments in
the literature. Evolutionary computation has already proven itself
to be a very effective choice to consistently reduce the number
of attributes towards a better classification rate and a simpler
semantic interpretation of the inferred classifiers. We present a feature
selection wrapper model composed by a multi-objective evolutionary
algorithm, the clustering method Expectation-Maximization (EM),
and the classifier C4.5 for the unsupervised classification of data
extracted from a psychological test named BASC-II (Behavior
Assessment System for Children - II ed.) with two objectives:
Maximizing the likelihood of the clustering model and maximizing
the accuracy of the obtained classifier. We present a methodology
to integrate feature selection for unsupervised classification, model
evaluation, decision making (to choose the most satisfactory model
according to a a posteriori process in a multi-objective context), and
testing. We compare the performance of the classifier obtained by the
multi-objective evolutionary algorithms ENORA and NSGA-II, and
the best solution is then validated by the psychologists that collected
the data.
Abstract: DNA Barcode provides good sources of needed
information to classify living species. The classification problem has
to be supported with reliable methods and algorithms. To analyze
species regions or entire genomes, it becomes necessary to use the
similarity sequence methods. A large set of sequences can be
simultaneously compared using Multiple Sequence Alignment which
is known to be NP-complete. However, all the used methods are still
computationally very expensive and require significant computational
infrastructure. Our goal is to build predictive models that are highly
accurate and interpretable. In fact, our method permits to avoid the
complex problem of form and structure in different classes of
organisms. The empirical data and their classification performances
are compared with other methods. Evenly, in this study, we present
our system which is consisted of three phases. The first one, is called
transformation, is composed of three sub steps; Electron-Ion
Interaction Pseudopotential (EIIP) for the codification of DNA
Barcodes, Fourier Transform and Power Spectrum Signal Processing.
Moreover, the second phase step is an approximation; it is
empowered by the use of Multi Library Wavelet Neural Networks
(MLWNN). Finally, the third one, is called the classification of DNA
Barcodes, is realized by applying the algorithm of hierarchical
classification.
Abstract: An automatic moment-based texture segmentation approach is proposed in this paper. First, we describe the related work in this computer vision domain. Our texture feature extraction, the first part of the texture recognition process, produces a set of moment-based feature vectors. For each image pixel, a texture feature vector is computed as a sequence of area moments. Then, an automatic pixel classification approach is proposed. The feature vectors are clustered using an unsupervised classification algorithm, the optimal number of clusters being determined using a measure based on validation indexes. From the resulted pixel classes one determines easily the desired texture regions of the image.
Abstract: This paper presents an optimal and unsupervised satellite image segmentation approach based on Pearson system and k-Means Clustering Algorithm Initialization. Such method could be considered as original by the fact that it utilised K-Means clustering algorithm for an optimal initialisation of image class number on one hand and it exploited Pearson system for an optimal statistical distributions- affectation of each considered class on the other hand. Satellite image exploitation requires the use of different approaches, especially those founded on the unsupervised statistical segmentation principle. Such approaches necessitate definition of several parameters like image class number, class variables- estimation and generalised mixture distributions. Use of statistical images- attributes assured convincing and promoting results under the condition of having an optimal initialisation step with appropriated statistical distributions- affectation. Pearson system associated with a k-means clustering algorithm and Stochastic Expectation-Maximization 'SEM' algorithm could be adapted to such problem. For each image-s class, Pearson system attributes one distribution type according to different parameters and especially the Skewness 'β1' and the kurtosis 'β2'. The different adapted algorithms, K-Means clustering algorithm, SEM algorithm and Pearson system algorithm, are then applied to satellite image segmentation problem. Efficiency of those combined algorithms was firstly validated with the Mean Quadratic Error 'MQE' evaluation, and secondly with visual inspection along several comparisons of these unsupervised images- segmentation.
Abstract: This paper regards the phenomena of intensive suburbanization and urbanization in Olomouc city and in Olomouc region in general for the period of 1986–2009. A Remote Sensing approach that involves tracking of changes in Land Cover units is proposed to quantify the urbanization state and trends in temporal and spatial aspects. It actually consisted of two approaches, Experiment 1 and Experiment 2 which implied two different image classification solutions in order to provide Land Cover maps for each 1986–2009 time split available in the Landsat image set. Experiment 1 dealt with the unsupervised classification, while Experiment 2 involved semi- supervised classification, using a combination of object-based and pixel-based classifiers. The resulting Land Cover maps were subsequently quantified for the proportion of urban area unit and its trend through time, and also for the urban area unit stability, yielding the relation of spatial and temporal development of the urban area unit. Some outcomes seem promising but there is indisputably room for improvements of source data and also processing and filtering.
Abstract: Data mining uses a variety of techniques each of which
is useful for some particular task. It is important to have a deep
understanding of each technique and be able to perform sophisticated
analysis. In this article we describe a tool built to simulate a variation
of the Kohonen network to perform unsupervised clustering and
support the entire data mining process up to results visualization. A
graphical representation helps the user to find out a strategy to
optimize classification by adding, moving or delete a neuron in order
to change the number of classes. The tool is able to automatically
suggest a strategy to optimize the number of classes optimization, but
also support both tree classifications and semi-lattice organizations of
the classes to give to the users the possibility of passing from one
class to the ones with which it has some aspects in common.
Examples of using tree and semi-lattice classifications are given to
illustrate advantages and problems. The tool is applied to classify
macroeconomic data that report the most developed countries- import
and export. It is possible to classify the countries based on their
economic behaviour and use the tool to characterize the commercial
behaviour of a country in a selected class from the analysis of
positive and negative features that contribute to classes formation.
Possible interrelationships between the classes and their meaning are
also discussed.
Abstract: An unsupervised classification algorithm is derived
by modeling observed data as a mixture of several mutually
exclusive classes that are each described by linear combinations of
independent non-Gaussian densities. The algorithm estimates the
data density in each class by using parametric nonlinear functions
that fit to the non-Gaussian structure of the data. This improves
classification accuracy compared with standard Gaussian mixture
models. When applied to textures, the algorithm can learn basis
functions for images that capture the statistically significant structure
intrinsic in the images. We apply this technique to the problem of
unsupervised texture classification and segmentation.
Abstract: Data mining uses a variety of techniques each of which is useful for some particular task. It is important to have a deep understanding of each technique and be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user to find out a strategy to optmize classification by adding, moving or delete a neuron in order to change the number of classes. The tool is also able to automatically suggest a strategy for number of classes optimization.The tool is used to classify macroeconomic data that report the most developed countries? import and export. It is possible to classify the countries based on their economic behaviour and use an ad hoc tool to characterize the commercial behaviour of a country in a selected class from the analysis of positive and negative features that contribute to classes formation.
Abstract: The clustering ensembles combine multiple partitions
generated by different clustering algorithms into a single clustering
solution. Clustering ensembles have emerged as a prominent method
for improving robustness, stability and accuracy of unsupervised
classification solutions. So far, many contributions have been done to
find consensus clustering. One of the major problems in clustering
ensembles is the consensus function. In this paper, firstly, we
introduce clustering ensembles, representation of multiple partitions,
its challenges and present taxonomy of combination algorithms.
Secondly, we describe consensus functions in clustering ensembles
including Hypergraph partitioning, Voting approach, Mutual
information, Co-association based functions and Finite mixture
model, and next explain their advantages, disadvantages and
computational complexity. Finally, we compare the characteristics of
clustering ensembles algorithms such as computational complexity,
robustness, simplicity and accuracy on different datasets in previous
techniques.