Abstract: Association rule mining is an important problem in data
mining. The massively increasing volume of data in real-life databases
has motivated researchers to design novel and incremental algorithms
for association rule mining. In this paper, we propose an incremental
association rule mining algorithm that integrates a shocking
interestingness criterion during the process of building the model. A
new interestingness measure, called the shocking measure, is introduced. One
of the main features of the proposed approach is to capture the user
background knowledge, which is monotonically augmented. The
incremental model that reflects the changing data and the user beliefs
is attractive because it makes the overall KDD process more
effective and efficient. We implemented the proposed approach,
experimented with it on some public datasets, and found the results quite
promising.
Abstract: Rule Discovery is an important technique for mining
knowledge from large databases. Use of objective measures for
discovering interesting rules leads to another data mining problem,
although of reduced complexity. Data mining researchers have
studied subjective measures of interestingness to reduce the volume
of discovered rules and ultimately improve the overall efficiency of
the KDD process.
In this paper we study the novelty of the discovered rules as a
subjective measure of interestingness. We propose a hybrid approach
based on both objective and subjective measures to quantify novelty
of the discovered rules in terms of their deviations from the known
rules (knowledge). We analyze the types of deviation that can arise
between two rules and categorize the discovered rules according to
a user-specified threshold. We implement the proposed framework
and experiment with some public datasets. The experimental results
are promising.
Abstract: Rule Discovery is an important technique for mining knowledge from large databases. Use of objective measures for discovering interesting rules leads to another data mining problem, although of reduced complexity. Data mining researchers have studied subjective measures of interestingness to reduce the volume of discovered rules and ultimately improve the overall efficiency of the KDD process. In this paper we study the novelty of the discovered rules as a subjective measure of interestingness. We propose a hybrid approach that uses objective and subjective measures to quantify the novelty of the discovered rules in terms of their deviations from the known rules. We analyze the types of deviation that can arise between two rules and categorize the discovered rules according to a user-specified threshold. We implement the proposed framework and experiment with some public datasets. The experimental results are quite promising.
Abstract: Knowledge Discovery in Databases (KDD) is the process of extracting previously unknown, hidden and interesting patterns from a huge amount of data stored in databases. Data mining is the stage of the KDD process that selects and applies a particular data mining algorithm to extract interesting and useful knowledge. Data mining methods are expected to find patterns in databases that are interesting according to some measures. It is therefore of vital importance to define good measures of interestingness that allow the system to discover only the useful patterns. Measures of interestingness are divided into objective and subjective measures. Objective measures depend only on the structure of a pattern and can be quantified using statistical methods. Subjective measures, in contrast, depend on the subjectivity and understanding of the user who examines the patterns. Subjective measures are further divided into actionable, unexpected and novel. A key issue facing the data mining community is how to act on the basis of discovered knowledge. For a pattern to be actionable, the user's subjectivity is captured through his/her background knowledge about the domain. Here, we consider the actionability of the discovered knowledge as a measure of interestingness and raise important issues that need to be addressed to discover actionable knowledge.
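As a small illustration of what an objective measure looks like, the sketch below computes support and confidence, two standard statistical measures for association rules. The abstract above does not name these measures or give any data; the function names and the toy transactions are assumptions made for this example only.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    sup_a = support(a, transactions)
    if sup_a == 0:
        return 0.0  # rule never applies, so confidence is reported as 0
    return support(both, transactions) / sup_a


# Hypothetical toy data, not taken from any dataset mentioned in the abstract.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

Both values depend only on the rule and the data, which is what makes them objective; a subjective measure such as actionability would additionally need the user's background knowledge as input.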
Abstract: Knowledge Discovery in Databases (KDD) has
evolved into an important and active area of research because of
theoretical challenges and practical applications associated with the
problem of discovering (or extracting) interesting and previously
unknown knowledge from very large real-world databases. Rough
Set Theory (RST) is a mathematical formalism for representing
uncertainty that can be considered an extension of the classical set
theory. It has been used in many different research areas, including
those related to inductive machine learning and reduction of
knowledge in knowledge-based systems. One important concept
related to RST is that of a rough relation. In this paper we present
the current status of research on applying rough set theory to KDD,
which will be helpful for handling the characteristics of real-world
databases. The main aim is to show how rough sets and rough set
analysis can be effectively used to extract knowledge from large
databases.
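To make the way RST represents uncertainty more concrete, the following minimal sketch computes the lower and upper approximations of a target set, the standard rough-set construction. It is not taken from the paper; the function names, the attribute names, and the toy table are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Set, Tuple


def equivalence_classes(objects: Dict[Hashable, Dict[str, str]],
                        attributes: List[str]) -> List[Set[Hashable]]:
    """Group object ids that are indiscernible on the given attributes."""
    classes = defaultdict(set)
    for obj_id, row in objects.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(objects: Dict[Hashable, Dict[str, str]],
                   attributes: List[str],
                   target: Set[Hashable]) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return (lower, upper) approximations of `target`."""
    lower: Set[Hashable] = set()
    upper: Set[Hashable] = set()
    for block in equivalence_classes(objects, attributes):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


# Hypothetical decision table: objects 1 and 2 are indiscernible.
patients = {
    1: {"fever": "yes", "cough": "yes"},
    2: {"fever": "yes", "cough": "yes"},
    3: {"fever": "no", "cough": "yes"},
    4: {"fever": "no", "cough": "no"},
}
flu = {1, 3}

lower, upper = approximations(patients, ["fever", "cough"], flu)
```

Here objects 1 and 2 cannot be told apart by the available attributes, yet only object 1 belongs to the target set, so both fall in the upper approximation but not in the lower one. The difference between the two approximations is the boundary region in which membership is uncertain.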