Abstract: The exponential growth of medical image databases poses new challenges for clinical routine in maintaining patient history, diagnosis, treatment and monitoring. With the advent of data mining and machine learning techniques, it is possible to automate clinical diagnosis or to assist physicians with it. This research proposes a medical image classification framework based on data mining techniques. It involves feature extraction, feature selection, feature discretization and classification. In the classification phase, the performance of the traditional k-nearest neighbor (kNN) classifier is improved by using a feature weighting scheme and distance-weighted voting instead of simple majority voting. Feature weights are calculated using the interestingness measures used in association rule mining. Experiments on retinal fundus images show that the proposed framework improves the classification accuracy of traditional kNN from 78.57% to 92.85%.
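The classification phase described above can be illustrated with a short sketch. It is not the authors' implementation: the function name `weighted_knn_predict`, the weighted Euclidean distance, and the inverse-distance vote `1 / (d + eps)` are assumptions made for illustration, and the feature weights are passed in as given rather than derived from association-rule interestingness measures.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_X, train_y, query, feature_weights, k=3, eps=1e-9):
    """Classify `query` using feature-weighted distances and distance-weighted votes.

    Assumption: weights scale each squared feature difference, and each of the
    k nearest neighbours votes with strength 1 / (distance + eps).
    """
    # Feature-weighted Euclidean distance from the query to every training sample.
    distances = []
    for x, label in zip(train_X, train_y):
        d = math.sqrt(
            sum(w * (a - b) ** 2 for w, a, b in zip(feature_weights, x, query))
        )
        distances.append((d, label))

    # Keep the k closest neighbours.
    distances.sort(key=lambda pair: pair[0])

    # Closer neighbours contribute larger votes than distant ones.
    votes = defaultdict(float)
    for d, label in distances[:k]:
        votes[label] += 1.0 / (d + eps)
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Hypothetical toy data: one close "A" sample and two distant "B" samples.
    X = [[0.0, 0.0], [2.0, 0.0], [2.2, 0.0]]
    y = ["A", "B", "B"]
    print(weighted_knn_predict(X, y, [0.1, 0.0], [1.0, 1.0], k=3))
```

In the toy data, simple majority voting over the three neighbours would favour "B", whereas the distance-weighted vote favours the single very close "A" sample. Setting a feature's weight to zero removes that feature from the distance entirely, which is how an uninformative feature can be suppressed.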
Abstract: A feature weighting and selection method is proposed
which uses the structure of a weightless neuron and exploits the
principles that govern the operation of Genetic Algorithms and
Evolution. Features are coded onto chromosomes in a novel way
which allows weighting information regarding the features to be
directly inferred from the gene values. The proposed method is
significant in that it addresses several problems of existing
algorithms for feature selection and weighting while also offering
advantages such as speed, simplicity and suitability for
real-time systems.
Abstract: We present the results of a comparative study of several
techniques from the literature for relevance feedback in the
short-term learning setting. Only one of the methods considered, the
K-nearest neighbors algorithm (KNN), belongs to the data mining
field; the rest belong purely to the information retrieval field
and fall under three major axes: query shifting, feature weighting,
and optimization of the parameters of the similarity metric. As a
contribution beyond the comparison, we propose a new version of the
KNN algorithm, referred to as incremental KNN, which differs from
the original version in that the rating of the current target image
is influenced not only by the seeds but also by the images already
rated. The results presented here were obtained from experiments
conducted on the Wang database for one iteration, using color
moments in the RGB space. This compact descriptor is adequate for
the efficiency required by interactive systems. The results
indicate that the proposed algorithm performs well; it even
outperforms a wide range of techniques available in the literature.
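To illustrate why the color moments descriptor is compact, the sketch below computes the first three moments per RGB channel, giving nine numbers per image. The abstract does not state which moments were used, so the choice of mean, standard deviation and the cube root of the third central moment is an assumption, as are the function names and the representation of an image as a list of `(r, g, b)` tuples.

```python
import math


def _signed_cbrt(value):
    """Real cube root that keeps the sign of `value`."""
    return math.copysign(abs(value) ** (1.0 / 3.0), value)


def color_moments(pixels):
    """Return a 9-dimensional descriptor: three moments for each RGB channel.

    `pixels` is a non-empty list of (r, g, b) tuples (hypothetical input format).
    """
    n = len(pixels)
    descriptor = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        # Order per channel: mean, standard deviation, skewness-like term.
        descriptor.extend([mean, math.sqrt(variance), _signed_cbrt(third)])
    return descriptor


if __name__ == "__main__":
    # A uniformly colored image has zero spread and zero skew in every channel.
    print(color_moments([(10, 20, 30)] * 4))
```

Because the descriptor length is fixed at nine values regardless of image size, comparing two images reduces to comparing two short vectors, which suits interactive retrieval.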
Abstract: One of the approaches enabling people with amputated
limbs to establish some sort of interface with the real world includes
the utilization of the myoelectric signal (MES) from the remaining
muscles of those limbs. The MES can be used as a control input to a
multifunction prosthetic device. In this control scheme, known as
myoelectric control, a pattern recognition approach is usually used
to discriminate between MES recordings that belong to different
classes of forearm movement. Since the MES is recorded using
multiple channels, the feature vector size can become very large. In
order to reduce the computational cost and enhance the generalization
capability of the classifier, a dimensionality reduction method is
needed to identify an informative yet moderately sized feature set. This
paper proposes a new fuzzy version of the well-known Fisher's
Linear Discriminant Analysis (LDA) feature projection technique.
Furthermore, based on the fact that certain muscles might contribute
more to the discrimination process, a novel feature weighting scheme
is also presented by employing Particle Swarm Optimization (PSO)
for estimating the weight of each feature. The new method, called
PSOFLDA, is tested on real MES datasets and compared with other
techniques to demonstrate its superiority.
Abstract: This paper presents a text clustering system based on a k-means type subspace clustering algorithm for clustering large, high-dimensional and sparse text data. The algorithm adds a new step to the k-means clustering process that automatically calculates the weights of keywords in each cluster, so that the important words of a cluster can be identified by their weight values. To support understanding and interpretation of the clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. Two methods are used to extract the representative words. The candidate words are first selected according to their weights calculated by the new algorithm. Then, the candidates are fed to WordNet to identify the set of nouns and to consolidate synonyms and hyponyms. Experimental results show that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting words to represent the topics of the clusters.
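The additional weighting step can be sketched as follows. The abstract does not give the update formula, so this example assumes an entropy-style rule in which a term's weight in a cluster decays exponentially with its within-cluster dispersion; the function name `cluster_keyword_weights` and the parameter `gamma` are hypothetical, and only the weighting step is shown, with the cluster assignments taken as given.

```python
import math


def cluster_keyword_weights(docs, labels, n_clusters, gamma=1.0):
    """Compute one weight per term per cluster from within-cluster dispersion.

    Assumed rule (not taken from the abstract):
        weight[l][j] = exp(-D[l][j] / gamma) / sum_t exp(-D[l][t] / gamma)
    where D[l][j] is the sum of squared deviations of term j around the
    centre of cluster l. `docs` is a list of term-frequency vectors.
    """
    n_terms = len(docs[0])
    weights = []
    for cluster in range(n_clusters):
        members = [d for d, lab in zip(docs, labels) if lab == cluster]
        if not members:
            # Empty cluster: fall back to uniform weights.
            weights.append([1.0 / n_terms] * n_terms)
            continue
        centre = [sum(d[j] for d in members) / len(members) for j in range(n_terms)]
        dispersion = [
            sum((d[j] - centre[j]) ** 2 for d in members) for j in range(n_terms)
        ]
        # Terms that vary little inside the cluster receive larger weights.
        scores = [math.exp(-disp / gamma) for disp in dispersion]
        total = sum(scores)
        weights.append([s / total for s in scores])
    return weights


if __name__ == "__main__":
    # Hypothetical toy corpus with two terms and two clusters.
    docs = [[3, 0], [3, 4], [0, 1], [4, 1]]
    labels = [0, 0, 1, 1]
    for row in cluster_keyword_weights(docs, labels, n_clusters=2):
        print(row)
```

Under this assumed rule, a term that is consistently absent from a cluster also has zero dispersion and would receive a high weight, so a practical system would need an additional criterion (for example, requiring a non-zero cluster centre for the term) before treating high-weight terms as keyword candidates.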