Incremental Mining of Shocking Association Patterns

Association rules are an important problem in data mining. Massively increasing volume of data in real life databases has motivated researchers to design novel and incremental algorithms for association rules mining. In this paper, we propose an incremental association rules mining algorithm that integrates shocking interestingness criterion during the process of building the model. A new interesting measure called shocking measure is introduced. One of the main features of the proposed approach is to capture the user background knowledge, which is monotonically augmented. The incremental model that reflects the changing data and the user beliefs is attractive in order to make the over all KDD process more effective and efficient. We implemented the proposed approach and experiment it with some public datasets and found the results quite promising.

Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Knowledge Discovery of Databases (KDD) is the process of extracting previously unknown but useful and significant information from large massive volume of databases. Data Mining is a stage in the entire process of KDD which applies an algorithm to extract interesting patterns. Usually, such algorithms generate huge volume of patterns. These patterns have to be evaluated by using interestingness measures to reflect the user requirements. Interestingness is defined in different ways, (i) Objective measures (ii) Subjective measures. Objective measures such as support and confidence extract meaningful patterns based on the structure of the patterns, while subjective measures such as unexpectedness and novelty reflect the user perspective. In this report, we try to brief the more widely spread and successful subjective measures and propose a new subjective measure of interestingness, i.e. shocking.

Genetic Algorithm Parameters Optimization for Bi-Criteria Multiprocessor Task Scheduling Using Design of Experiments

Multiprocessor task scheduling is a NP-hard problem and Genetic Algorithm (GA) has been revealed as an excellent technique for finding an optimal solution. In the past, several methods have been considered for the solution of this problem based on GAs. But, all these methods consider single criteria and in the present work, minimization of the bi-criteria multiprocessor task scheduling problem has been considered which includes weighted sum of makespan & total completion time. Efficiency and effectiveness of genetic algorithm can be achieved by optimization of its different parameters such as crossover, mutation, crossover probability, selection function etc. The effects of GA parameters on minimization of bi-criteria fitness function and subsequent setting of parameters have been accomplished by central composite design (CCD) approach of response surface methodology (RSM) of Design of Experiments. The experiments have been performed with different levels of GA parameters and analysis of variance has been performed for significant parameters for minimisation of makespan and total completion time simultaneously.