Abstract: The various types of frequent pattern discovery
problems, namely the frequent itemset, sequence, and graph mining
problems, are solved in different ways that are nevertheless similar in
certain aspects. The main approaches to discovering such patterns
fall into two classes: level-wise methods and database
projection-based methods.
Level-wise algorithms generally rely on clever indexing structures
to discover the patterns. This paper proposes a new approach, based on
the level-wise method, for efficiently discovering frequent sequences
and tree-like patterns. Because level-wise
algorithms spend much of their time on subpattern testing, the
new approach uses automaton theory to solve
this problem.
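To make the idea of automaton-based subpattern testing concrete, the following is a minimal sketch, not the construction used in the paper. It assumes the simplest case, where the patterns are plain sequences and "subpattern" means "subsequence". Each database sequence is compiled once into a deterministic automaton, and every candidate produced by a level-wise pass is then tested by following transitions. The function names `build_subsequence_automaton`, `contains_subsequence`, and `support` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Hashable, List, Sequence

Automaton = List[Dict[Hashable, int]]


def build_subsequence_automaton(sequence: Sequence[Hashable]) -> Automaton:
    """Compile one database sequence into a subsequence automaton.

    State i means "the first i items of the sequence have been consumed".
    transitions[i][s] is the state reached after consuming the first
    occurrence of symbol s at or after position i.
    """
    n = len(sequence)
    transitions: Automaton = [dict() for _ in range(n + 1)]
    # Fill from the right: state i inherits the transitions of state i + 1
    # and then overrides the entry for the symbol found at position i.
    for i in range(n - 1, -1, -1):
        transitions[i] = dict(transitions[i + 1])
        transitions[i][sequence[i]] = i + 1
    return transitions


def contains_subsequence(transitions: Automaton,
                         pattern: Sequence[Hashable]) -> bool:
    """Return True if the pattern is a subsequence of the compiled sequence."""
    state = 0
    for symbol in pattern:
        next_state = transitions[state].get(symbol)
        if next_state is None:
            return False  # no remaining occurrence of this symbol
        state = next_state
    return True


def support(automata: List[Automaton], pattern: Sequence[Hashable]) -> int:
    """Count the database sequences that contain the pattern."""
    return sum(1 for a in automata if contains_subsequence(a, pattern))


if __name__ == "__main__":
    database = ["abcab", "acb", "bca"]  # assumed toy data
    automata = [build_subsequence_automaton(s) for s in database]
    for candidate in ["ab", "cb", "aa"]:
        print(candidate, support(automata, candidate))
```

In this sketch the cost of testing one candidate against one sequence is proportional to the length of the candidate rather than the length of the sequence, at the price of storing one transition table per state.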
Abstract: Frequent pattern discovery is the process of
searching for patterns, such as sets of features or items, that appear frequently in data. Finding such frequent patterns
has become an important data mining task because it reveals associations, correlations, and many other interesting relationships
hidden in a database. Most of the proposed frequent pattern mining
algorithms have been implemented with imperative programming
languages. This paradigm is inefficient when the set of patterns is large
and the frequent patterns are long. We suggest applying a high-level declarative
style of programming to the problem of frequent pattern
discovery. We consider two languages: Haskell and Prolog. Our
intuition is that the problem of finding frequent patterns can
be implemented efficiently and concisely in a declarative paradigm
since pattern matching is a fundamental feature supported by most
functional languages and Prolog. Our frequent pattern mining
implementations in Haskell and Prolog confirm our
hypothesis about the conciseness of the programs. The paper reports
comparative studies of lines of code, speed, and memory usage for
declarative versus imperative programming.
Abstract: Frequent pattern discovery over data streams is a hard
problem because the continuously generated nature of a stream does not
allow data elements to be revisited. Furthermore, the pattern discovery
process must be fast enough to produce timely results. Based on these
requirements, we propose an approximate approach to
discovering frequent patterns over continuous streams.
Our approximation algorithm is intended to preprocess a
stream prior to the pattern discovery process. The paper reports
the results of approximate frequent pattern discovery.
Abstract: Frequent patterns are patterns, such as sets of features or items, that appear frequently in data. Finding such frequent patterns has become an important data mining task because it reveals associations, correlations, and many other interesting relationships hidden in a dataset. Most of the proposed frequent pattern mining algorithms have been implemented in imperative programming languages such as C, C++, and Java. The imperative paradigm is significantly inefficient when the itemset is large and the frequent patterns are long. We suggest a high-level declarative style of programming using a functional language. Our supposition is that frequent pattern discovery can be implemented efficiently and concisely in a functional paradigm, since pattern matching is a fundamental feature supported by most functional languages. Our frequent pattern mining implementation in Haskell confirms our hypothesis about the conciseness of the program. The performance studies on speed and memory usage support our intuition about the efficiency of functional languages.