Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown but useful and significant information from large volumes of data. Data mining is the stage of the KDD process that applies an algorithm to extract interesting patterns. Such algorithms usually generate a huge volume of patterns, which must be evaluated with interestingness measures that reflect the user's requirements. Interestingness measures fall into two classes: (i) objective measures and (ii) subjective measures. Objective measures such as support and confidence extract meaningful patterns based on the structure of the patterns, while subjective measures such as unexpectedness and novelty reflect the user's perspective. In this report, we briefly review the more widespread and successful subjective measures and propose a new subjective measure of interestingness, namely shocking.
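
As a brief illustration of the objective measures named above, the sketch below computes support and confidence for an association rule under their usual textbook definitions. The function names and the toy transactions are hypothetical and are not taken from the report.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) over the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante > 0 else 0.0


# Hypothetical market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})           # 2 of 4 baskets
rule_confidence = confidence(transactions, {"bread"}, {"milk"})   # 2 of 3 bread baskets
```

Both values depend only on counts in the data, which is what makes them objective; a subjective measure would additionally need some encoding of what the user already believes.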

Framework for Delivery Reliability in European Machinery and Equipment Industry

Today's manufacturing companies face multiple, dynamic customer-supplier relationships embedded in non-hierarchical production networks. This complex environment leads to problems with delivery reliability and wasteful turbulence throughout the entire network. This paper describes an operational model, based on a theoretical framework, which improves the delivery reliability of each individual customer-supplier relationship within non-hierarchical production networks of the European machinery and equipment industry. By developing a non-centralized coordination mechanism based on determining the value of delivery reliability and deriving an incentive system for suppliers, the number of on-time deliveries can be increased and the turbulence in the production network thus smoothed. Comparable to an electronic stock exchange, the coordination mechanism will transform the manual and non-transparent process of determining penalties for delivery delays into an automated and transparent market mechanism that creates delivery reliability.

Actionable Rules: Issues and New Directions

Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown, hidden and interesting patterns from the huge amounts of data stored in databases. Data mining is the stage of the KDD process that selects and applies a particular data mining algorithm to extract interesting and useful knowledge. Data mining methods are expected to find patterns in databases that are interesting according to some measure. It is therefore of vital importance to define good measures of interestingness that allow the system to discover only the useful patterns. Measures of interestingness are divided into objective and subjective measures. Objective measures depend only on the structure of a pattern and can be quantified using statistical methods, whereas subjective measures depend on the subjectivity and understanding of the user who examines the patterns. Subjective measures are further divided into actionable, unexpected and novel. A key issue facing the data mining community is how to take action on the basis of discovered knowledge. For a pattern to be actionable, the user's subjectivity is captured by having the user provide background knowledge about the domain. Here, we consider the actionability of the discovered knowledge as a measure of interestingness and raise important issues that need to be addressed in order to discover actionable knowledge.

An Agent-Based Approach to Immune Modelling: Priming Individual Response

This study focuses on examining why the range of experience with respect to HIV infection is so diverse, especially in regard to the latency period. An agent-based approach in modelling the infection is used to extract high-level behaviour which cannot be obtained analytically from the set of interaction rules at the cellular level. A prototype model encompasses local variation in baseline properties, contributing to the individual disease experience, and is included in a network which mimics the chain of lymph nodes. The model also accounts for stochastic events such as viral mutations. The size and complexity of the model require major computational effort and parallelisation methods are used.

Mining Sequential Patterns Using Hybrid Evolutionary Algorithm

Mining sequential patterns in large databases has become an important data mining task with broad applications; it describes potential sequenced relationships among items in a database. Many algorithms have been introduced for this task. Conventional algorithms can find the exact optimal sequential pattern rules but take a long time, particularly when applied to large databases. More recently, evolutionary algorithms such as Particle Swarm Optimization (PSO) and the Genetic Algorithm (GA) have been proposed and applied to this problem. This paper introduces a new hybrid evolutionary algorithm that combines GA with PSO to mine sequential patterns, in order to speed up the convergence of evolutionary algorithms. The algorithm is referred to as SP-GAPSO.

A Neurofuzzy Learning and its Application to Control System

A two-phase neurofuzzy approach is proposed for a given set of input-output training data. In the first phase, the data set is partitioned automatically into a set of clusters, and a fuzzy if-then rule is extracted from each cluster to form a fuzzy rule base. In the second phase, a fuzzy neural network is constructed accordingly and its parameters are tuned to increase the precision of the fuzzy rule base. This network is able to learn and optimize the rule base of a Sugeno-like fuzzy inference system using a hybrid learning algorithm that combines gradient descent with the least mean square algorithm. The proposed neurofuzzy system has the advantages of determining the number of rules automatically, reducing the number of rules, decreasing computational time, learning faster and consuming less memory. The authors also investigate how neurofuzzy techniques can be applied in control theory to design a fuzzy controller for modelling linear and nonlinear dynamic systems from a set of input/output data. Simulation analysis is carried out on a wide range of processes, to identify nonlinear components online in a control system, and on a benchmark problem involving the prediction of a chaotic time series. Furthermore, well-known examples of linear and nonlinear systems are simulated in the Matlab/Simulink environment. The approach is also illustrated by modelling the relationship between automobile trips and demographic factors.

Model Development for Allocation of Raw Material in Timber Processing Industry in Indonesia

This research develops a raw material allocation model for the timber processing industry in Perum Perhutani Unit I, Central Java, Indonesia. The model can be used to determine the quantity of timber allocated between chains in the supply chain and to select suppliers, considering log price and distance. In determining the quantity of timber allocated between chains, the model considers the optimal inventory in each chain. The optimal inventory is determined from the demand forecast, the capacity and the safety stock. The allocation problem is solved with a linear programming model that minimizes the total of purchase, transportation and storage costs at each chain. The results of numerical examples show that the proposed model can generate purchase cost savings of 20.84% and select suppliers located at shorter distances.

Exploring the Roles of Social Exchanges in Using Information Systems

Previous studies have indicated that one of the most critical reasons for the failure of enterprise systems is the lack of knowledge sharing and utilization across organizations. As a consequence, many information systems researchers have examined the effect of absorptive capacity, which is closely associated with knowledge sharing and transfer, on IS usage performance. A lack of communication and interaction due to a lack of organizational citizenship behavior might lead to weak absorptive capacity and thus negatively influence knowledge sharing across organizations. In this study, a theoretical model that examines the relationship between the usage performance of enterprise systems and its determinants was established.

The Impact of Knowledge Sharing on Innovation Capability in United Arab Emirates Organizations

The purpose of this study was to explore the relationship between knowledge sharing and innovation capability by examining the influence of individual, organizational and technological factors on knowledge sharing. The research is based on a survey of 103 employees from different organizations in the United Arab Emirates. The study uses a model and a questionnaire that were previously tested by Lin [1], and thus aims to examine the validity of that model in the UAE context. The results show varying degrees of correlation between the different variables, with ICT use having the strongest relationship with the innovation capabilities of organizations. The study also revealed little evidence of knowledge collecting and knowledge sharing among UAE employees.

Numerical Simulation of the Flow Field around a Vertical Flat Plate of Infinite Extent

This paper presents a CFD analysis of the flow field around a thin flat plate of infinite span inclined at 90° to a fluid stream of infinite extent. Numerical predictions have been compared to experimental measurements in order to assess the ability of the finite volume code to determine the aerodynamic forces acting on a bluff body immersed in a fluid stream of infinite extent. Several turbulence models and spatial node distributions have been tested. Flow field characteristics in the neighborhood of the flat plate have been investigated, allowing the development of a preliminary procedure to be used as guidance in selecting the appropriate grid configuration and the corresponding turbulence model for the prediction of the flow field over a two-dimensional vertical flat plate.

ANP-based Intra and Inter-industry Analysis for Measuring Spillover Effect of ICT Industries

Interaction among information and communication technology (ICT) industries has recently become a ubiquitous phenomenon through fixed-mobile integration. To monitor the impact of this interaction, previous research has mainly focused on measuring the spillover effect among ICT industries using various methods. Among these, inter-industry analysis is a useful method for examining the spillover effect between industries. However, the more complex ICT industries become, the more important the impact within an industry is, and inter-industry analysis is limited in its ability to mirror intra-relationships within an industry. Thus, this study applies the analytic network process (ANP) to measure the spillover effect, capturing all of the intra- and inter-relationships. Using ANP-based intra- and inter-industry analysis, the spillover effect is measured effectively, mirroring the complex structure of ICT industries. A main ICT industry and its linkages are also explored to show the current structure of ICT industries. The proposed approach is expected to allow policy makers to understand the interactions of ICT industries and their impact.

A Comparative Study of Page Ranking Algorithms for Information Retrieval

This paper gives an introduction to Web mining, then describes Web structure mining in detail and explores the data structure used by the Web. The basics of Web mining and the Web mining categories are explained. The paper also explores and compares different page ranking algorithms used for information retrieval: PageRank (PR), Weighted PageRank (WPR), HITS (Hyperlink-Induced Topic Search), DistanceRank and DirichletRank. Page ranks are calculated with the PageRank and Weighted PageRank algorithms for a given hyperlink structure. A simulation program is developed for the PageRank algorithm because PageRank is the only ranking algorithm implemented in the search engine (Google). The outputs are shown in table and chart form.
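
To make the PageRank calculation for a given hyperlink structure concrete, here is a minimal power-iteration sketch. It is not the paper's simulation program: the function name, the three-page graph, the damping factor of 0.85 and the normalized (1 - d)/N form of the update are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to; every link
    target is assumed to appear as a key of links as well.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its damped rank evenly among its out-links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its damped rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical hyperlink structure: A links to B and C, B to C, C to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

In this toy graph page C collects links from both A and B, so it ends up with the highest rank, while B, which is reached only through half of A's rank, ends up lowest.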

AudioMine: Medical Data Mining in Heterogeneous Audiology Records

We report on the results of a pilot study in which a data-mining tool was developed for mining audiology records. The records were heterogeneous in that they contained numeric, categorical and textual data. The tools developed are designed to observe associations between any field in the records and any other field. The techniques employed were the statistical chi-squared test and self-organizing maps, an unsupervised neural learning approach.
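
The chi-squared test of association between two categorical fields can be sketched as follows. This computes the standard Pearson statistic and degrees of freedom for a contingency table; the function name and the counts are hypothetical and do not come from the audiology records, and every row and column total is assumed to be non-zero.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom.

    table is a contingency table given as a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two fields were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical 2x2 table: rows are two values of one field,
# columns are two values of another field.
observed = [[10, 20], [20, 10]]
stat, dof = chi_squared(observed)
```

The statistic is then compared against the chi-squared distribution with `dof` degrees of freedom; a large value suggests the two fields are associated.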

A Text Mining Technique Using Association Rules Extraction

This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique is called Extracting Association Rules from Text (EART). It relies on keyword features to discover association rules amongst the keywords labeling the documents. In this work, the EART system ignores the order in which the words occur and instead focuses on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with an Information Retrieval scheme (TF-IDF), which automatically selects the most discriminative keywords for use in association rule generation, and that it uses a data mining technique for association rule discovery. It consists of three phases: a Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), an Association Rule Mining (ARM) phase (applying our algorithm for Generating Association Rules based on a Weighting scheme, GARW) and a Visualization phase (visualization of results). Experiments were carried out on web page news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system is compared with that of another system that uses the Apriori algorithm, in terms of execution time and the quality of the extracted association rules.
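
The keyword selection step can be illustrated with a small TF-IDF sketch. This uses one common variant (relative term frequency times the logarithm of the inverse document frequency), which is not necessarily the exact weighting used in EART or GARW; the function name and the toy documents are hypothetical, and documents are assumed to be non-empty token lists.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per document.

    documents is a list of non-empty token lists.
    """
    n_docs = len(documents)
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already preprocessed news snippets.
docs = [
    ["bird", "flu", "outbreak"],
    ["bird", "flu", "vaccine"],
    ["bird", "market", "closed"],
]
weights = tfidf(docs)
```

A term that appears in every document, such as "bird" here, gets weight zero, so only the more discriminative keywords would survive a weight threshold and be passed on to rule generation.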

Evaluation of a PSO Approach for Optimum Design of a First-Order Controllers for TCP/AQM Systems

This paper presents a Particle Swarm Optimization (PSO) method for determining the optimal parameters of a first-order controller for a TCP/AQM system. The TCP/AQM model is described by a second-order system with time delay. First, an analytical approach based on the D-decomposition method and the lemma of Kharitonov is used to determine the stabilizing regions of a first-order controller. Second, the optimal parameters of the controller are obtained by the PSO algorithm. Finally, the proposed method is implemented in the Network Simulator NS-2 and compared with the PI controller.
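
A minimal sketch of the PSO search step is given below, assuming a basic global-best PSO with fixed inertia and acceleration coefficients. The cost function is a toy quadratic standing in for a controller performance index; it is not the TCP/AQM model, and the bounds, coefficient values and names are assumptions made for illustration. In the paper's setting the bounds would instead come from the stabilizing region.

```python
import random


def pso(objective, bounds, n_particles=20, iterations=100,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO that minimizes objective within box bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp so the particle stays inside the search region.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


def toy_cost(params):
    # Hypothetical stand-in for a controller cost; minimum at (1.0, -2.0).
    a, b = params
    return (a - 1.0) ** 2 + (b + 2.0) ** 2


best_params, best_cost = pso(toy_cost, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

Restricting `bounds` to a region known to be stabilizing mirrors the two-step structure described above: the analytical step fixes where to search, and PSO only decides which point in that region is best.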

Automatic Segmentation of Dermoscopy Images Using Histogram Thresholding on Optimal Color Channels

Automatic segmentation of skin lesions is the first step towards the development of a computer-aided diagnosis system for melanoma. Although numerous segmentation methods have been developed, few studies have focused on determining the most discriminative and effective color space for melanoma applications. This paper proposes a novel automatic segmentation algorithm using color space analysis and clustering-based histogram thresholding, which is able to determine the optimal color channel for the segmentation of skin lesions. To demonstrate the validity of the algorithm, it is tested on a set of 30 high-resolution dermoscopy images, and a comprehensive evaluation of the results is provided in which borders manually drawn by four dermatologists are compared to the automated borders detected by the proposed algorithm. The evaluation applies three previously used metrics (accuracy, sensitivity and specificity) and a new metric of similarity. Through ROC analysis and ranking of the metrics, it is shown that the best results are obtained with the X and XoYoR color channels, which result in an accuracy of approximately 97%. The proposed method is also compared with two state-of-the-art skin lesion segmentation methods, which demonstrates the effectiveness and superiority of the proposed segmentation method.
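
The three established metrics can be computed by comparing an automated mask with a manually drawn one pixel by pixel, as in the sketch below. The function name and the eight-pixel masks are hypothetical; the paper's new similarity metric is not defined in the abstract and is therefore not reproduced here.

```python
def segmentation_metrics(predicted, reference):
    """Accuracy, sensitivity and specificity of a binary segmentation.

    Both arguments are flat sequences of 0/1 pixel labels (1 = lesion),
    and both classes are assumed to occur in reference.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # lesion found
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # skin kept as skin
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # skin marked lesion
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # lesion missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical flattened masks: manual border vs. automated border.
reference = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = segmentation_metrics(predicted, reference)
```

With several dermatologists, the same function would simply be applied once per manual border and the results averaged or ranked.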

Challenges to Technological Advancement in Economically Weak Countries: An Assessment of the Nigerian Educational Situation

Nigeria is considered one of the many countries in sub-Saharan Africa with a weak economy and gross deficiencies in technology and engineering. Available data from international monitoring and regulatory organizations show that technology is pivotal in determining the economic strength of nations all over the world. Education is critical to technology acquisition, development, dissemination and adaptation. Thus, this paper critically assesses and discusses the issues and challenges facing technological advancement in Nigeria, particularly in the education sector, and proffers solutions to resuscitate the Nigerian education system towards achieving national technological and economic sustainability, such that Nigeria can compete favourably with other technologically driven economies of the world in the not-too-distant future.

Navigation Patterns Mining Approach based on Expectation Maximization Algorithm

Web usage mining algorithms have been widely used for modeling user web navigation behavior. In this study we advance a model for mining users' navigation patterns. The model builds a user model based on the expectation-maximization (EM) algorithm. An EM algorithm is used in statistics for finding maximum likelihood estimates of parameters in probabilistic models, where the model depends on unobserved latent variables. The experimental results show that as the number of clusters decreases, the log likelihood converges toward lower values, and that the probability of the largest cluster decreases as the number of clusters increases in each treatment.
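
The EM iteration itself can be sketched on a deliberately simple case. The example below fits a one-dimensional Gaussian mixture, which is an assumption made for brevity: the abstract does not specify the probabilistic model used for navigation sessions, and the data, initial means and function names here are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, iterations=50, min_var=1e-6):
    """EM for a 1-D Gaussian mixture with one component per initial mean."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    history = []  # log likelihood at the start of each iteration
    for _ in range(iterations):
        # E-step: responsibility of each cluster for each point.
        resp = []
        loglik = 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(loglik)
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # guard against collapse
    return weights, means, variances, history


# Hypothetical 1-D feature (for example, a per-session statistic).
data = [0.9, 1.0, 1.1, 1.2, 4.8, 5.0, 5.1, 5.3]
weights, means, variances, history = em_gmm_1d(data, means=[0.0, 6.0])
```

The cluster assignment is the unobserved latent variable here; `weights` plays the role of the cluster probabilities discussed above, and `history` tracks the log likelihood that the experiments compare across different numbers of clusters.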

An Appraisal of Coal Fly Ash Soil Amendment Technology (FASAT) of Central Institute of Mining and Fuel Research (CIMFR)

Coal will continue to be the predominant source of global energy for several decades to come. The huge generation of fly ash (FA) from the combustion of coal in thermal power plants (TPPs) raises concerns about its disposal and utilization. Application of FA as a soil ameliorant for agriculture and forestry, based on its typical characteristics, is a potential area of use and hence the subject of global effort. The inferences drawn suffer from variations in ash characteristics, soil types and agro-climatic conditions, so correlating the effects of ash across plant species and soil types is difficult. Indian FAs have low bulk density, high water-holding capacity and porosity, abundant silt-sized particles, an alkaline nature, negligible solubility and reasonable plant nutrient content. Findings from more than two decades of demonstration trials, from lab/pot scale to long-term field experiments, have been developed into the FA soil amendment technology (FASAT) by the Central Institute of Mining and Fuel Research (CIMFR), Dhanbad. The performance of different crops and plant species in cultivable and problematic soils is encouraging, and the eco-friendly technology is being adopted by farmers. FA application includes ash alone and in combination with inorganic/organic amendments; combination treatments including bio-solids perform better than FA alone. The optimum dose is up to 100 t/ha of FA for cultivable land and up to or above 200 t/ha for waste/degraded land/mine refuse, depending on the characteristics of the ash and soil. Elemental toxicity in Indian FA is usually not of much concern owing to the alkaline ashes, the oxide forms of the elements and elemental concentrations within the threshold limits for soil application. Combating toxicity, if any, is possible through combination treatments with organic materials and phytoremediation. Government initiatives through extension programmes involving farmers and ash-generating organizations need to be accelerated.

A Review on Technology Forecasting Methods and Their Application Area

Technology change has been acknowledged as a critical factor in determining the competitiveness of organizations. In such an environment, the correct anticipation of technology change is of great importance in strategic planning. To monitor technology change, technology forecasting (TF) is frequently used. From an academic perspective, TF has received great attention for a long time. However, few studies have provided an overview of the TF literature. Although some studies review TF research, they generally focus on the types and characteristics of the various TF methods, and so provide little information about patterns of TF research or about which TF method is used in which technology industry. Accordingly, this study profiles developments in and patterns of scholarly research in TF over time. It also investigates which technology industries have used which TF methods and identifies the relationships between them. This study will help in understanding trends in TF research and its application areas.