A Study of Gaps in CBMIR Using Different Methods and Prospective

In recent years, rapid advances in software and hardware in the field of information technology, along with a digital imaging revolution in the medical domain, have facilitated the generation and storage of large collections of images by hospitals and clinics. Searching these large image collections effectively and efficiently poses significant technical challenges and makes it necessary to construct intelligent retrieval systems. Content-based image retrieval (CBIR) consists of retrieving the images most visually similar to a given query image from a database of images [5]. Medical CBIR applications pose unique challenges but at the same time offer many new opportunities. On one hand, while one can easily understand news or sports videos, a medical image is often completely incomprehensible to untrained eyes.

A Software Framework for Predicting Oil-Palm Yield from Climate Data

Intelligent systems based on machine learning techniques, such as classification and clustering, are gaining widespread popularity in real-world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data that finds spatio-temporal patterns in climate data using kernel methods, which are well suited to complex data. This work is inspired by the notion that a non-linear transformation of the data into a high-dimensional feature space increases the possibility that the patterns are linearly separable in the transformed space, which simplifies exploration of the associated structure in the data. Kernel methods implicitly perform a non-linear mapping of the input data into a high-dimensional feature space by replacing the inner products with an appropriate positive definite function. In this paper we present a robust weighted kernel k-means algorithm incorporating spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in the spatial data, supports effective and efficient data analysis by exploring patterns and structures in the data, and thus can be used for predicting oil-palm yield by analyzing various factors affecting the yield.
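
The kernel-space clustering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the Gaussian (RBF) kernel, the `gamma` value, the per-point `weights`, and all function names are hypothetical choices, and the spatial constraints and robustness terms mentioned in the abstract are omitted. The point it shows is that distances to cluster centres in feature space can be computed from kernel values alone, without ever forming the mapped vectors.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel (an assumed choice; any positive definite kernel works)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def weighted_kernel_kmeans(points, weights, labels, k, gamma=1.0, max_iter=50):
    """Plain weighted kernel k-means, starting from the given initial labels.

    The squared feature-space distance from point i to the weighted centre
    of cluster c is
        K[i][i] - 2 * sum_j w_j K[i][j] / s + sum_{j,l} w_j w_l K[j][l] / s^2
    where j, l range over the members of c and s is their total weight.
    """
    n = len(points)
    K = [[rbf_kernel(points[i], points[j], gamma) for j in range(n)]
         for i in range(n)]
    labels = list(labels)
    for _ in range(max_iter):
        members = [[j for j in range(n) if labels[j] == c] for c in range(k)]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c in range(k):
                if not members[c]:
                    continue  # skip empty clusters
                s = sum(weights[j] for j in members[c])
                cross = sum(weights[j] * K[i][j] for j in members[c]) / s
                within = sum(weights[j] * weights[l] * K[j][l]
                             for j in members[c] for l in members[c]) / (s * s)
                d = K[i][i] - 2.0 * cross + within
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    print(weighted_kernel_kmeans(pts, [1.0] * 6, [0, 1, 0, 1, 0, 1], k=2))
```

A spatially constrained variant would add a penalty to `d` based on the labels of neighbouring locations; the form of that penalty is not given in the abstract, so it is left out here.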

Initialization Method of Reference Vectors for Improvement of Recognition Accuracy in LVQ

The initial values of the reference vectors have a significant influence on recognition accuracy in LVQ. There are several existing techniques, such as SOM and k-means, for setting the initial values of reference vectors, each of which has provided some positive results. However, the resulting improvements in recognition accuracy are not sufficient. This study proposes an ACO-based method for initializing reference vectors with the aim of achieving higher recognition accuracy than that obtained through conventional methods. Moreover, we demonstrate the effectiveness of the proposed method by applying it to the wine data and English vowel data and comparing its results with those of conventional methods.
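
To make the role of the initial reference vectors concrete, the following is a minimal sketch of the basic LVQ1 update rule. It is an illustration only: the function names, the learning rate `lr`, and the toy data are assumptions, and the ACO-based initialization proposed in the paper is not reproduced. Because each step only nudges the single nearest reference vector, the starting positions largely determine which region each vector ends up representing.

```python
def nearest(refs, x):
    """Index of the reference vector closest to x (squared Euclidean distance)."""
    return min(range(len(refs)),
               key=lambda i: sum((r - a) ** 2 for r, a in zip(refs[i][0], x)))


def lvq1_train(refs, samples, lr=0.1, epochs=20):
    """LVQ1: attract the nearest reference vector if its label matches the
    sample's label, repel it otherwise. `refs` is a list of (vector, label)."""
    refs = [(list(v), label) for v, label in refs]
    for _ in range(epochs):
        for x, y in samples:
            v, label = refs[nearest(refs, x)]
            sign = 1.0 if label == y else -1.0
            for d in range(len(v)):
                v[d] += sign * lr * (x[d] - v[d])
    return refs


def classify(refs, x):
    """Label of the nearest reference vector."""
    return refs[nearest(refs, x)][1]


if __name__ == "__main__":
    data = [((0.0, 0.0), "a"), ((0.2, 0.1), "a"),
            ((1.0, 1.0), "b"), ((0.9, 1.1), "b")]
    init = [((0.4, 0.4), "a"), ((0.6, 0.6), "b")]  # hypothetical initial values
    trained = lvq1_train(init, data)
    print(classify(trained, (0.1, 0.1)), classify(trained, (1.0, 0.9)))
```

Any initialization scheme (SOM, k-means, or the proposed method) would simply supply a different `init` list to the same training loop.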

Mining Genes Relations in Microarray Data Combined with Ontology in Colon Cancer Automated Diagnosis System

The MATCH project [1] entails the development of an automatic diagnosis system that aims to support the treatment of colon cancer by discovering mutations that occur in tumour suppressor genes (TSGs) and contribute to the development of cancerous tumours. The system is based on a) colon cancer clinical data and b) biological information that will be derived by data mining techniques from genomic and proteomic sources. The core mining module will consist of popular, well-tested hybrid feature extraction methods and new combined algorithms designed especially for the project. Elements of rough sets, evolutionary computing, cluster analysis, self-organizing maps and association rules will be used to discover the annotations between genes and their influence on tumours [2]-[11]. The methods used to process the data have to address its high complexity, potential inconsistency and missing values. They must integrate all the useful information necessary to answer the expert's question. For this purpose, the system has to learn from data, or allow a domain specialist to specify interactively, the part of the knowledge structure it needs to answer a given query. The program should also take into account the importance or rank of the particular parts of the data it analyses, and adjust the algorithms used accordingly.

Discovering Complex Regularities by Adaptive Self Organizing Classification

Data mining uses a variety of techniques, each of which is useful for some particular task. It is important to have a deep understanding of each technique and to be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network that performs unsupervised clustering and supports the entire data mining process up to the visualization of results. A graphical representation helps the user find a strategy to optimize classification by adding, moving or deleting a neuron in order to change the number of classes. The tool is also able to automatically suggest a strategy for optimizing the number of classes. The tool is used to classify macroeconomic data that report the imports and exports of the most developed countries. It is possible to classify the countries based on their economic behaviour and to use an ad hoc tool to characterize the commercial behaviour of a country in a selected class by analysing the positive and negative features that contribute to the formation of classes.

Possibilistic Clustering Technique-Based Traffic Light Control for Handling Emergency Vehicle

Traffic lights guard against congestion by reducing traffic jams and organizing the traffic flow. Nevertheless, the increasing congestion level in public road networks is a growing problem in many countries. Using Intelligent Transportation Systems to give emergency vehicles a green light at intersections can reduce driver confusion, reduce conflicts, and improve emergency response times. Nowadays, wireless sensor network technology can solve many problems and can offer good management of the crossroad. In this paper, we develop a new approach based on clustering and graphical possibilistic fusion modeling. The proposed model is elaborated in three phases. The first consists of decomposing the environment into clusters, followed by the intra-cluster and inter-cluster fusion processes. Finally, we show experimental results obtained by simulation that demonstrate the efficiency of our proposed approach.

Keywords: Traffic light, Wireless sensor network, Controller, Possibilistic network/Bayesian network.

Binary Classification Tree with Tuned Observation-based Clustering

There are several approaches for handling multiclass classification. Aside from one-against-one (OAO) and one-against-all (OAA), hierarchical classification is also commonly used. A binary classification tree is a hierarchical classification structure that breaks down a k-class problem into binary sub-problems, each solved by a binary classifier. In each node, a set of classes is divided into two subsets. A good class partition should group similar classes together. Many algorithms measure similarity in terms of the distance between class centroids: classes are grouped together by a clustering algorithm when the distances between their centroids are small. In this paper, we present a binary classification tree with tuned observation-based clustering (BCT-TOB) that finds a class partition by performing clustering on observations instead of class centroids. A merging step is introduced to merge any insignificant class split. The experiment shows that the performance of BCT-TOB is comparable to that of other algorithms.

Unsupervised Clustering Methods for Identifying Rare Events in Anomaly Detection

Increasing detection rates and reducing false positive rates are important problems in intrusion detection systems (IDS). Although preventative techniques such as access control and authentication attempt to keep intruders out, these can fail, and intrusion detection has been introduced as a second line of defence. Rare events are events that occur very infrequently, and detecting them is a common problem in many domains. In this paper we propose an intrusion detection method that combines rough sets and fuzzy clustering. Rough set theory is used to reduce the amount of data and remove redundancy. Fuzzy c-means clustering allows objects to belong to several clusters simultaneously, with different degrees of membership. Our approach allows us not only to recognize known attacks but also to detect suspicious activity that may be the result of a new, unknown attack. The experimental results on the Knowledge Discovery and Data Mining (KDD Cup 1999) dataset show that the method is efficient and practical for intrusion detection systems.
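
The graded memberships that fuzzy c-means assigns can be illustrated with a short sketch of the standard algorithm. This is not the authors' implementation: the fuzzifier `m = 2`, the initial centres, the toy one-dimensional data, and the function name are assumptions, and the rough-set data reduction step is not shown. Each point receives one membership per cluster, and the memberships of a point sum to 1.

```python
def fuzzy_c_means(points, centers, m=2.0, iters=100, tol=1e-9):
    """Standard fuzzy c-means from the given initial centres.

    Returns (centers, u), where u[k][i] is the membership of point k in
    cluster i as computed in the last membership update.
    """
    centers = [list(c) for c in centers]
    n, n_clusters, dim = len(points), len(centers), len(points[0])
    power = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Membership update: closer centres receive larger memberships.
        u = []
        for x in points:
            d = [max(sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, 1e-12)
                 for c in centers]
            u.append([1.0 / sum((d[i] / d[j]) ** power for j in range(n_clusters))
                      for i in range(n_clusters)])
        # Centre update: mean of the points weighted by membership ** m.
        new_centers = []
        for i in range(n_clusters):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            new_centers.append([sum(w[k] * points[k][j] for k in range(n)) / total
                                for j in range(dim)])
        shift = max(abs(a - b)
                    for nc, oc in zip(new_centers, centers)
                    for a, b in zip(nc, oc))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


if __name__ == "__main__":
    data = [(0.0,), (0.2,), (0.1,), (4.0,), (4.2,), (4.1,)]
    centers, u = fuzzy_c_means(data, [(1.0,), (3.0,)])
    print(centers)
    print([round(v, 3) for v in u[0]])
```

In an anomaly detection setting, a record whose memberships are low for every well-populated cluster would be one natural candidate for a rare event; how the paper makes that decision is not stated in the abstract.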

A Text Clustering System based on k-means Type Subspace Clustering and Ontology

This paper presents a text clustering system based on a k-means type subspace clustering algorithm for clustering large, high-dimensional and sparse text data. In this algorithm, a new step is added to the k-means clustering process to automatically calculate the weights of keywords in each cluster, so that the important words of a cluster can be identified by their weight values. To aid understanding and interpretation of the clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. The representative words are extracted in two steps. The candidate words are first selected according to their weights as calculated by our new algorithm. Then, the candidates are fed to WordNet to identify the set of nouns and to consolidate synonyms and hyponyms. Experimental results show that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms, e.g., Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting words that represent the topics of the clusters.

A 3D Approach for Extraction of the Coronary Artery and Quantification of the Stenosis

Segmentation and quantification of stenosis is an important task in assessing coronary artery disease. One of the main challenges is measuring the real diameter of curved vessels. Moreover, uncertainty in the segmentation of different tissues in a narrow vessel is an important issue that affects accuracy. This paper proposes an algorithm to extract coronary arteries and measure the degree of stenosis. A Markovian fuzzy clustering method is applied to model the uncertainty arising from the partial volume effect. The algorithm comprises segmentation, centreline extraction, estimation of the plane orthogonal to the centreline, and measurement of the degree of stenosis. To evaluate accuracy and reproducibility, the approach has been applied to a vascular phantom and the results are compared with the real diameter. The results for 10 patient datasets have been visually judged by a qualified radiologist. The results reveal the superiority of the proposed method over the conventional thresholding method (CTM) on both datasets.

A New Method for Detection of Artificial Objects and Materials from Long Distance Environmental Images

The article presents a new method for detecting artificial objects and materials in images of environmental (non-urban) terrain. Our approach uses the hue and saturation (or Cb and Cr) components of the image as the input to a segmentation module based on the mean shift method. The clusters obtained as the output of this stage are processed by a decision-making module in order to find the regions of the image with a significant possibility of representing a human. Although this method will detect various non-natural objects, it is primarily intended and optimized for the detection of humans, i.e. for search and rescue purposes in non-urban terrain where, in normal circumstances, non-natural objects should not be present. Real-world images are used for the evaluation of the method.
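
The segmentation stage can be illustrated with a minimal flat-kernel mean shift over (hue, saturation) pairs. This is a sketch under stated assumptions rather than the authors' module: the `bandwidth`, the normalization of both components to [0, 1], the toy pixel values, and the function name are hypothetical, hue is treated as an ordinary coordinate (its circular nature is ignored), and the decision-making module is not shown.

```python
def mean_shift(points, bandwidth, iters=50, tol=1e-6):
    """Flat-kernel mean shift: move each point to the mean of the samples
    within `bandwidth` until it stops moving, then group nearby modes.

    Returns (labels, centers), one label per input point.
    """
    h2 = bandwidth ** 2
    modes = []
    for p in points:
        x = list(p)
        for _ in range(iters):
            nbrs = [q for q in points
                    if sum((a - b) ** 2 for a, b in zip(x, q)) <= h2]
            if not nbrs:
                break  # numerical edge case: no samples left in the window
            new_x = [sum(q[d] for q in nbrs) / len(nbrs) for d in range(len(x))]
            moved = max(abs(a - b) for a, b in zip(new_x, x))
            x = new_x
            if moved < tol:
                break
        modes.append(x)
    # Modes closer than half the bandwidth are treated as the same cluster.
    labels, centers = [], []
    for mode in modes:
        for i, c in enumerate(centers):
            if sum((a - b) ** 2 for a, b in zip(mode, c)) <= h2 / 4.0:
                labels.append(i)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return labels, centers


if __name__ == "__main__":
    # Hypothetical (hue, saturation) samples: vegetation-like and jacket-like.
    pixels = [(0.30, 0.60), (0.32, 0.62), (0.31, 0.58), (0.29, 0.61),
              (0.05, 0.90), (0.06, 0.92), (0.04, 0.91)]
    labels, centers = mean_shift(pixels, bandwidth=0.1)
    print(labels, centers)
```

Unlike k-means, mean shift does not need the number of clusters in advance, which suits scenes where the number of distinct colour regions is unknown.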

EEG Spikes Detection, Sorting, and Localization

This study introduces a new method for detecting, sorting, and localizing spikes from multiunit EEG recordings. The method combines the wavelet transform, which localizes distinctive spike features, with the Super-Paramagnetic Clustering (SPC) algorithm, which allows automatic classification of the data without assumptions such as low variance or Gaussian distributions. Moreover, the method is capable of setting amplitude thresholds for spike detection. The method is applied to several real EEG data sets, in which the spikes are detected and clustered and their times are determined.

Gas Sensing Properties of SnO2 Thin Films Modified by Ag Nanoclusters Synthesized by SILD Method

The effect of SnO2 surface modification by Ag nanoclusters, synthesized by the SILD method, on the operating characteristics of thin film gas sensors was studied, and models for the promotional role of Ag additives were discussed. It was found that the above-mentioned approach can be used to improve both the sensitivity and the rate of response of SnO2-based gas sensors to CO and H2. At the same time, the presence of Ag clusters on the surface of SnO2 depressed the sensor response to ozone.

How the Iranian Free-Style Wrestlers Know and Think about Doping? – A Knowledge and Attitude Study

Nowadays, doping is an intricate dilemma. Wrestling is a nationally popular sport in Iran, and because of its power-demanding characteristics the prevalence of doping may be high. We therefore aimed to assess the knowledge of and attitudes toward doping among club wrestlers. In a cross-sectional study, 426 wrestlers were studied using a researcher-made questionnaire. The researchers selected the clubs by randomized cluster sampling and distributed the questionnaire among the wrestlers. Knowledge of the wrestlers in the three categories of doping definitions, recognition of prohibited drugs, and side effects was poor or moderate in 70.8%, 95.8% and 99.5%, respectively. Wrestlers have poor knowledge of doping. Furthermore, they believe some unfavorable myths. It seems necessary to design a comprehensive educational program for all athletes and coaches.

Benchmarking: Performance on ALPS and Formosa Clusters

This paper presents the benchmarking results and performance evaluation of different clusters built at the National Center for High-Performance Computing in Taiwan. The performance of the processor, memory subsystem and interconnect is a critical factor in the overall performance of high-performance computing platforms. The evaluation compares different system architectures and software platforms. Most supercomputers use HPL to benchmark their system performance, in accordance with the requirements of the TOP500 List. In this paper we consider system factors that affect benchmark performance, such as processor and memory performance. We hope this work will provide useful information for the future development and construction of cluster systems.

Self-Organization of Clusters Having Locally Distributed Patterns for Highly Synchronized Inputs

Many experimental results suggest that more precise spike timing is significant in neural information processing. We construct a self-organization model using the spatiotemporal patterns, where Spike-Timing Dependent Plasticity (STDP) tunes the conduction delays between neurons. We show that, for highly synchronized inputs, the fluctuation of conduction delays causes globally continuous and locally distributed firing patterns through the self-organization.

Decision Tree-based Feature Ranking using Manhattan Hierarchical Cluster Criterion

The study of feature selection is gaining importance because it reduces classification cost in terms of time and computational load. One way to search for essential features is via the decision tree, which acts as an intermediate feature space inducer from which essential features are chosen. In decision tree-based feature selection, some studies use the decision tree as a feature ranker with a direct threshold measure, while others retain the decision tree but use a pruning condition that acts as a threshold mechanism for choosing features. This paper proposes a threshold measure using the Manhattan Hierarchical Cluster distance for use in feature ranking, in order to choose relevant features as part of the feature selection process. The result is promising, and the method can be improved in the future by including test cases with a higher number of attributes.

Construction of cDNA Library and EST Analysis of Tenebrio molitor Larvae

To further advance research on immune-related genes from T. molitor, we constructed a cDNA library and analyzed expressed sequence tag (EST) sequences from 1,056 clones. After removing vector sequences and quality checking with the Phred program (trim_alt 0.05 (P-score>20)), 1,039 sequences were generated. The average insert length was 792 bp. In addition, we identified 162 clusters, 167 contigs and 391 contigs after the clustering and assembly process using the TGICL package. EST sequences were searched against the NCBI nr database by local BLAST (blastx, E

Model Order Reduction of Discrete-Time Systems Using Fuzzy C-Means Clustering

A computationally simple approach to model order reduction for single-input single-output (SISO), linear time-invariant discrete systems modeled in the frequency domain is proposed in this paper. The denominator of the reduced-order model is determined using fuzzy C-means clustering, while the numerator parameters are found by matching the time moments and Markov parameters of the high-order system.

Socio-Demographic Status and Arrack Drinking Patterns among Muslim, Hindu, Santal and Oraon Communities in Rasulpur Union, Bangladesh: A Cross-Cultural Perspective

Arrack is a form of alcoholic beverage or liquor that is produced from palm or date juice and is commonly consumed by the lower social class of all religious and ethnic communities in the north-western villages of Bangladesh. The purpose of the study was to compare arrack drinking patterns associated with socio-demographic status among the Muslim, Hindu, Santal, and Oraon communities in the Rasulpur union of Bangladesh. A total of 391 respondents (Muslim n = 109, Hindu n = 103, Santal n = 89, Oraon n = 90) selected by cluster random sampling were interviewed using the ADP (Arrack Drinking Pattern) questionnaire. The results of the Pearson chi-square test revealed that arrack drinking patterns differed significantly among the drinkers of the Muslim, Hindu, Santal, and Oraon communities. In addition, Spearman's bivariate correlation coefficients revealed that the socio-demographic characteristics of the communities' drinkers were significantly associated, both positively and negatively, with arrack drinking patterns in the Rasulpur union, Bangladesh. The study suggests that further cross-cultural research should be conducted on the consequences of arrack drinking patterns for the communities' drinkers.