Grid-based Supervised Clustering - GBSC

This paper presents a supervised clustering algorithm, namely Grid-Based Supervised Clustering (GBSC), which is able to identify clusters of any shapes and sizes without presuming any canonical form for data distribution. The GBSC needs no prespecified number of clusters, is insensitive to the order of the input data objects, and is capable of handling outliers. Built on the combination of grid-based clustering and density-based clustering, under the assistance of the downward closure property of density used in bottom-up subspace clustering, the GBSC can notably reduce its search space to avoid the memory confinement situation during its execution. On two-dimension synthetic datasets, the GBSC can identify clusters with different shapes and sizes correctly. The GBSC also outperforms other five supervised clustering algorithms when the experiments are performed on some UCI datasets.

Information Gain Ratio Based Clustering for Investigation of Environmental Parameters Effects on Human Mental Performance

Methods of clustering which were developed in the data mining theory can be successfully applied to the investigation of different kinds of dependencies between the conditions of environment and human activities. It is known, that environmental parameters such as temperature, relative humidity, atmospheric pressure and illumination have significant effects on the human mental performance. To investigate these parameters effect, data mining technique of clustering using entropy and Information Gain Ratio (IGR) K(Y/X) = (H(X)–H(Y/X))/H(Y) is used, where H(Y)=-ΣPi ln(Pi). This technique allows adjusting the boundaries of clusters. It is shown that the information gain ratio (IGR) grows monotonically and simultaneously with degree of connectivity between two variables. This approach has some preferences if compared, for example, with correlation analysis due to relatively smaller sensitivity to shape of functional dependencies. Variant of an algorithm to implement the proposed method with some analysis of above problem of environmental effects is also presented. It was shown that proposed method converges with finite number of steps.

A Brain Inspired Approach for Multi-View Patterns Identification

Biologically human brain processes information in both unimodal and multimodal approaches. In fact, information is progressively abstracted and seamlessly fused. Subsequently, the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has exponentially produced various sources of data, which could be likened to being the state of multimodality in human brain. Therefore, this is an inspiration to develop a methodology for exploring multimodal data and further identifying multi-view patterns. Specifically, we propose a brain inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. A structurally adaptive neural network is deployed to implement the proposed model. Furthermore, the acquisition of multi-view patterns with the proposed model is demonstrated and discussed with some experimental results.

A Computer Aided Detection (CAD) System for Microcalcifications in Mammograms - MammoScan mCaD

Clusters of microcalcifications in mammograms are an important sign of breast cancer. This paper presents a complete Computer Aided Detection (CAD) scheme for automatic detection of clustered microcalcifications in digital mammograms. The proposed system, MammoScan μCaD, consists of three main steps. Firstly all potential microcalcifications are detected using a a method for feature extraction, VarMet, and adaptive thresholding. This will also give a number of false detections. The goal of the second step, Classifier level 1, is to remove everything but microcalcifications. The last step, Classifier level 2, uses learned dictionaries and sparse representations as a texture classification technique to distinguish single, benign microcalcifications from clustered microcalcifications, in addition to remove some remaining false detections. The system is trained and tested on true digital data from Stavanger University Hospital, and the results are evaluated by radiologists. The overall results are promising, with a sensitivity > 90 % and a low false detection rate (approx 1 unwanted pr. image, or 0.3 false pr. image).

Eco-innovation and Economic Performance in Industrial Clusters: Evidence from Italy

The article aims to investigate the presence of a correlation between eco-innovation and economic performance within industrial districts. The case analyzed in this article is based on a study concerning a sample of 54 Italian industrial clusters entitled "Eco-Districts" that has compiled a list of the most eco-efficient districts at the national level. After selecting two districts, this study assesses the economic performance of the last three years through the analysis of trends in four indicators. The results show that only in some cases there is a connection between eco innovation and economic performance.

Web Traffic Mining using Neural Networks

With the explosive growth of data available on the Internet, personalization of this information space become a necessity. At present time with the rapid increasing popularity of the WWW, Websites are playing a crucial role to convey knowledge and information to the end users. Discovering hidden and meaningful information about Web users usage patterns is critical to determine effective marketing strategies to optimize the Web server usage for accommodating future growth. The task of mining useful information becomes more challenging when the Web traffic volume is enormous and keeps on growing. In this paper, we propose a intelligent model to discover and analyze useful knowledge from the available Web log data.

Multi-Agent Systems for Intelligent Clustering

Intelligent systems are required in order to quickly and accurately analyze enormous quantities of data in the Internet environment. In intelligent systems, information extracting processes can be divided into supervised learning and unsupervised learning. This paper investigates intelligent clustering by unsupervised learning. Intelligent clustering is the clustering system which determines the clustering model for data analysis and evaluates results by itself. This system can make a clustering model more rapidly, objectively and accurately than an analyzer. The methodology for the automatic clustering intelligent system is a multi-agent system that comprises a clustering agent and a cluster performance evaluation agent. An agent exchanges information about clusters with another agent and the system determines the optimal cluster number through this information. Experiments using data sets in the UCI Machine Repository are performed in order to prove the validity of the system.

A Modified Fuzzy C-Means Algorithm for Natural Data Exploration

In Data mining, Fuzzy clustering algorithms have demonstrated advantage over crisp clustering algorithms in dealing with the challenges posed by large collections of vague and uncertain natural data. This paper reviews concept of fuzzy logic and fuzzy clustering. The classical fuzzy c-means algorithm is presented and its limitations are highlighted. Based on the study of the fuzzy c-means algorithm and its extensions, we propose a modification to the cmeans algorithm to overcome the limitations of it in calculating the new cluster centers and in finding the membership values with natural data. The efficiency of the new modified method is demonstrated on real data collected for Bhutan-s Gross National Happiness (GNH) program.

Global and Local Structure of Supported Pd Catalysts

The supported Pd catalysts were analyzed by X-ray diffraction and X-ray absorption spectroscopy in order to determine their global and local structure. The average particle size of the supported Pd catalysts was determined by X-ray diffraction method. One of the main purposes of the present contribution is to focus on understanding the specific role of the Pd particle size determined by X-ray diffraction and that of the support oxide. Based on X-ray absorption fine structure spectroscopy analysis we consider that the whole local structure of the investigated samples are distorted concerning the atomic number but the distances between atoms are almost the same as for standard Pd sample. Due to the strong modifications of the Pd cluster local structure, the metal-support interface may influence the electronic properties of metal clusters and thus their reactivity for absorption of the reactant molecules.

Practical Applications and Connectivity Algorithms in Future Wireless Sensor Networks

Like any sentient organism, a smart environment relies first and foremost on sensory data captured from the real world. The sensory data come from sensor nodes of different modalities deployed on different locations forming a Wireless Sensor Network (WSN). Embedding smart sensors in humans has been a research challenge due to the limitations imposed by these sensors from computational capabilities to limited power. In this paper, we first propose a practical WSN application that will enable blind people to see what their neighboring partners can see. The challenge is that the actual mapping between the input images to brain pattern is too complex and not well understood. We also study the connectivity problem in 3D/2D wireless sensor networks and propose distributed efficient algorithms to accomplish the required connectivity of the system. We provide a new connectivity algorithm CDCA to connect disconnected parts of a network using cooperative diversity. Through simulations, we analyze the connectivity gains and energy savings provided by this novel form of cooperative diversity in WSNs.

Signature Recognition and Verification using Hybrid Features and Clustered Artificial Neural Network(ANN)s

Signature represents an individual characteristic of a person which can be used for his / her validation. For such application proper modeling is essential. Here we propose an offline signature recognition and verification scheme which is based on extraction of several features including one hybrid set from the input signature and compare them with the already trained forms. Feature points are classified using statistical parameters like mean and variance. The scanned signature is normalized in slant using a very simple algorithm with an intention to make the system robust which is found to be very helpful. The slant correction is further aided by the use of an Artificial Neural Network (ANN). The suggested scheme discriminates between originals and forged signatures from simple and random forgeries. The primary objective is to reduce the two crucial parameters-False Acceptance Rate (FAR) and False Rejection Rate (FRR) with lesser training time with an intension to make the system dynamic using a cluster of ANNs forming a multiple classifier system.

Determining Cluster Boundaries Using Particle Swarm Optimization

Self-organizing map (SOM) is a well known data reduction technique used in data mining. Data visualization can reveal structure in data sets that is otherwise hard to detect from raw data alone. However, interpretation through visual inspection is prone to errors and can be very tedious. There are several techniques for the automatic detection of clusters of code vectors found by SOMs, but they generally do not take into account the distribution of code vectors; this may lead to unsatisfactory clustering and poor definition of cluster boundaries, particularly where the density of data points is low. In this paper, we propose the use of a generic particle swarm optimization (PSO) algorithm for finding cluster boundaries directly from the code vectors obtained from SOMs. The application of our method to unlabeled call data for a mobile phone operator demonstrates its feasibility. PSO algorithm utilizes U-matrix of SOMs to determine cluster boundaries; the results of this novel automatic method correspond well to boundary detection through visual inspection of code vectors and k-means algorithm.

A Predictive Rehabilitation Software for Cerebral Palsy Patients

Young patients suffering from Cerebral Palsy are facing difficult choices concerning heavy surgeries. Diagnosis settled by surgeons can be complex and on the other hand decision for patient about getting or not such a surgery involves important reflection effort. Proposed software combining prediction for surgeries and post surgery kinematic values, and from 3D model representing the patient is an innovative tool helpful for both patients and medicine professionals. Beginning with analysis and classification of kinematics values from Data Base extracted from gait analysis in 3 separated clusters, it is possible to determine close similarity between patients. Prediction surgery best adapted to improve a patient gait is then determined by operating a suitable preconditioned neural network. Finally, patient 3D modeling based on kinematic values analysis, is animated thanks to post surgery kinematic vectors characterizing the closest patient selected from patients clustering.

Artificial Intelligence for Software Quality Improvement

This paper presents a software quality support tool, a Java source code evaluator and a code profiler based on computational intelligence techniques. It is Java prototype software developed by AI Group [1] from the Research Laboratories at Universidad de Palermo: an Intelligent Java Analyzer (in Spanish: Analizador Java Inteligente, AJI). It represents a new approach to evaluate and identify inaccurate source code usage and transitively, the software product itself. The aim of this project is to provide the software development industry with a new tool to increase software quality by extending the value of source code metrics through computational intelligence.

Evaluation of University Technology Malaysia on Campus Transport Access Management

Access Management is the proactive management of vehicular access points to land parcels adjacent to all manner of roadways. Good access management promotes safe and efficient use of the transportation network. This study attempts to utilize archived data from the University Technology of Malaysia on-campus area to assess the accuracy with which access management display some benefits. Results show that usage of access management reduces delay and fewer crashes. Clustered development can improve walking, cycling and transit travel, reduce parking requirements and improve emergency responses. Effective Access Management planning can also reduce total roadway facility costs by reducing the number of driveways and intersections. At the end after presenting recommendations some of the travel impact, and benefits that can be derived if these suggestions are implemented have been summarized with the related comments.

Balancing Strategies for Parallel Content-based Data Retrieval Algorithms in a k-tree Structured Database

The paper proposes a unified model for multimedia data retrieval which includes data representatives, content representatives, index structure, and search algorithms. The multimedia data are defined as k-dimensional signals indexed in a multidimensional k-tree structure. The benefits of using the k-tree unified model were demonstrated by running the data retrieval application on a six networked nodes test bed cluster. The tests were performed with two retrieval algorithms, one that allows parallel searching using a single feature, the second that performs a weighted cascade search for multiple features querying. The experiments show a significant reduction of retrieval time while maintaining the quality of results.

Sequential Straightforward Clustering for Local Image Block Matching

Duplicated region detection is a technical method to expose copy-paste forgeries on digital images. Copy-paste is one of the common types of forgeries to clone portion of an image in order to conceal or duplicate special object. In this type of forgery detection, extracting robust block feature and also high time complexity of matching step are two main open problems. This paper concentrates on computational time and proposes a local block matching algorithm based on block clustering to enhance time complexity. Time complexity of the proposed algorithm is formulated and effects of two parameter, block size and number of cluster, on efficiency of this algorithm are considered. The experimental results and mathematical analysis demonstrate this algorithm is more costeffective than lexicographically algorithms in time complexity issue when the image is complex.

A Study on the Design Elements of Sidewalk in Urban Commercial District

This study was to search for the desirable direction of the sidewalk planning in Korea by establishing the concepts of walking and pedestrian space, and analyzing the advanced precedents in and out of country. Also, based on the precedent studies and relevant laws, regulations, and systems, it aimed for the following sequential process: firstly, to derive design elements from the functions and characteristics of sidewalk and cluster the similar elements by each characteristics, sampling representative characteristics and making them hierarchical; then, to analyze their significances via the first questionnaire survey, and the relative weights and priorities of each elements via the Analytic Hierarchy Process(AHP); finally, based on the analysis result, to establish the frame of suggesting the direction of policy to improve the pedestrian environment of sidewalk in urban commercial district for the future planning and design of pedestrian space.

Faculty-Industry R&D Joint Ventures: Barriers VS Incentives for Developing Nations

The aspiration of this research article is to target and focus the gains of university-Industry (U-I) collaborations and exploring those hurdles which are the obstacles for attaining these gains. University-Industry collaborations have attained great importance since 1980 in USA due to its application in all fields of life. U-I collaboration is a bilateral process where academia is a proactive member to make such alliances. Universities want to ameliorate their academic-base with the technicalities of technobabbles. U-I collaboration is becoming an essential lane for achieving innovative goals in this century. Many developed nations have set successful examples to prove this phenomenon as a catalyst to reduce costs, efforts and personnel for R&D projects. This study is exploits amplitudes of UI collaboration incentives in the light of success stories of developed countries. Many universities in USA, UK, Canada and various European Countries have been engaged with enterprises for numerous collaborative agreements. A long list of strategic and short term R&D projects has been executed in developed countries to accomplish their intended purposes. Due to the lack of intentions, genuine research and research-oriented environment, the mentioned field could not grow very well in developing countries. During last decade, a new wave of research has induced the institutes of developing countries to promote R&D culture especially in Pakistan. Higher Education Commission (HEC) has initiated many projects and funding supports for universities which have collaborative intentions with industry. Findings show that rapid innovation, overwhelm the technological complexities and articulated intellectual-base are major incentives which steer both partners to establish faculty-industry alliances. Everchanging technologies, concerned about intellectual property, different research environment and culture, research relevancy (Basic or applied), exposure differences and diversity of knowledge (bookish or practical) are main barriers to establish and retain joint ventures. Findings also concluded that, it is dire need to support and enhance cooperation among academia and industry to promote highly coordinated research behaviors. Author has proposed a roadmap for developing countries to promote R&D clusters among faculty and industry to deal the technological challenges and innovation complexities. Based on our research findings, Model for R&D Collaboration for developing countries also have been proposed to promote articulated R&D environment. If developing countries follow this phenomenon, rapid innovations can be achieved with limited R&D budget heads.

Automatic Reusability Appraisal of Software Components using Neuro-fuzzy Approach

Automatic reusability appraisal could be helpful in evaluating the quality of developed or developing reusable software components and in identification of reusable components from existing legacy systems; that can save cost of developing the software from scratch. But the issue of how to identify reusable components from existing systems has remained relatively unexplored. In this paper, we have mentioned two-tier approach by studying the structural attributes as well as usability or relevancy of the component to a particular domain. Latent semantic analysis is used for the feature vector representation of various software domains. It exploits the fact that FeatureVector codes can be seen as documents containing terms -the idenifiers present in the components- and so text modeling methods that capture co-occurrence information in low-dimensional spaces can be used. Further, we devised Neuro- Fuzzy hybrid Inference System, which takes structural metric values as input and calculates the reusability of the software component. Decision tree algorithm is used to decide initial set of fuzzy rules for the Neuro-fuzzy system. The results obtained are convincing enough to propose the system for economical identification and retrieval of reusable software components.