Abstract: Neighborhood Rough Sets (NRS) has been proven to
be an efficient tool for heterogeneous attribute reduction. However,
most of researches are focused on dealing with complete and noiseless
data. Factually, most of the information systems are noisy, namely,
filled with incomplete data and inconsistent data. In this paper, we
introduce a generalized neighborhood rough sets model, called
VPTNRS, to deal with the problem of heterogeneous attribute
reduction in noisy system. We generalize classical NRS model with
tolerance neighborhood relation and the probabilistic theory.
Furthermore, we use the neighborhood dependency to evaluate the
significance of a subset of heterogeneous attributes and construct a
forward greedy algorithm for attribute reduction based on it.
Experimental results show that the model is efficient to deal with noisy
data.
Abstract: The healthcare environment is generally perceived as
being information rich yet knowledge poor. However, there is a lack
of effective analysis tools to discover hidden relationships and trends
in data. In fact, valuable knowledge can be discovered from
application of data mining techniques in healthcare system. In this
study, a proficient methodology for the extraction of significant
patterns from the Coronary Heart Disease warehouses for heart
attack prediction, which unfortunately continues to be a leading cause
of mortality in the whole world, has been presented. For this purpose,
we propose to enumerate dynamically the optimal subsets of the
reduced features of high interest by using rough sets technique
associated to dynamic programming. Therefore, we propose to
validate the classification using Random Forest (RF) decision tree to
identify the risky heart disease cases. This work is based on a large
amount of data collected from several clinical institutions based on
the medical profile of patient. Moreover, the experts- knowledge in
this field has been taken into consideration in order to define the
disease, its risk factors, and to establish significant knowledge
relationships among the medical factors. A computer-aided system is
developed for this purpose based on a population of 525 adults. The
performance of the proposed model is analyzed and evaluated based
on set of benchmark techniques applied in this classification problem.
Abstract: General requirements for knowledge representation in
the form of logic rules, applicable to design and control of industrial
processes, are formulated. Characteristic behavior of decision trees
(DTs) and rough sets theory (RST) in rules extraction from recorded
data is discussed and illustrated with simple examples. The
significance of the models- drawbacks was evaluated, using
simulated and industrial data sets. It is concluded that performance of
DTs may be considerably poorer in several important aspects,
compared to RST, particularly when not only a characterization of a
problem is required, but also detailed and precise rules are needed,
according to actual, specific problems to be solved.
Abstract: Students in high education are presented with new terms and concepts in nearly every lecture they attend. Many of them prefer Web-based self-tests for evaluation of their concepts understanding since they can use those tests independently of tutors- working hours and thus avoid the necessity of being in a particular place at a particular time. There is a large number of multiple-choice tests in almost every subject designed to contribute to higher level learning or discover misconceptions. Every single test provides immediate feedback to a student about the outcome of that test. In some cases a supporting system displays an overall score in case a test is taken several times by a student. What we still find missing is how to secure delivering of personalized feedback to a user while taking into consideration the user-s progress. The present work is motivated to throw some light on that question.
Abstract: This article outlines conceptualization and
implementation of an intelligent system capable of extracting
knowledge from databases. Use of hybridized features of both the
Rough and Fuzzy Set theory render the developed system flexibility
in dealing with discreet as well as continuous datasets. A raw data set
provided to the system, is initially transformed in a computer legible
format followed by pruning of the data set. The refined data set is
then processed through various Rough Set operators which enable
discovery of parameter relationships and interdependencies. The
discovered knowledge is automatically transformed into a rule base
expressed in Fuzzy terms. Two exemplary cancer repository datasets
(for Breast and Lung Cancer) have been used to test and implement
the proposed framework.
Abstract: Knowledge Discovery in Databases (KDD) has
evolved into an important and active area of research because of
theoretical challenges and practical applications associated with the
problem of discovering (or extracting) interesting and previously
unknown knowledge from very large real-world databases. Rough
Set Theory (RST) is a mathematical formalism for representing
uncertainty that can be considered an extension of the classical set
theory. It has been used in many different research areas, including
those related to inductive machine learning and reduction of
knowledge in knowledge-based systems. One important concept
related to RST is that of a rough relation. In this paper we presented
the current status of research on applying rough set theory to KDD,
which will be helpful for handle the characteristics of real-world
databases. The main aim is to show how rough set and rough set
analysis can be effectively used to extract knowledge from large
databases.