Abstract: Covering-based rough set theory is an extension of rough
set theory based on a covering rather than a partition of the
universe, which makes it more powerful than classical rough sets in
describing some practical problems. However, this generalization
increases the roughness of each model in recognizing objects, so
obtaining better approximations from covering-based rough set
models is an important issue. In this paper, two concepts,
determinate elements and indeterminate elements of a universe, are
proposed and given precise definitions. This research makes a
reasonable refinement of the covering elements from a new
viewpoint, and the refinement can generate better approximations in
covering-based rough set models. To verify the theory, it is applied
to eight major covering-based rough set models adapted from the
literature. In all of these models the lower approximation increases
effectively; correspondingly, the upper approximation decreases in
all models, with the exception of two models in some special
situations. The roughness of recognizing objects is therefore
reduced. This research provides a new approach to the study and
application of covering-based rough sets.
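The lower and upper approximations whose behavior the abstract above studies can be sketched concretely. Below is a minimal illustration of one common pair of covering-based approximation operators (the union of blocks contained in, respectively intersecting, the target set); the covering and target set are toy values chosen for illustration, not taken from the paper:

```python
def lower_approx(cover, X):
    """Union of covering blocks entirely contained in X."""
    return set().union(*(K for K in cover if K <= X))

def upper_approx(cover, X):
    """Union of covering blocks that intersect X."""
    return set().union(*(K for K in cover if K & X))

# Toy covering of U = {1,...,5}: blocks may overlap, unlike a partition.
cover = [{1, 2}, {2, 3}, {3, 4, 5}, {5}]
X = {1, 2, 3}

print(sorted(lower_approx(cover, X)))  # [1, 2, 3]
print(sorted(upper_approx(cover, X)))  # [1, 2, 3, 4, 5]
```

The gap between the two approximations is the roughness that the refinement proposed in the paper aims to reduce.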
Abstract: This paper presents the idea of a rough controller and its application to the control of an overhead traveling crane system. The structure of the controller is based on the concept of a fuzzy logic controller. A measure of fuzziness in rough sets is introduced, and a comparison between the fuzzy logic controller and the rough controller is demonstrated. The results of a simulation comparing the performance of both controllers are shown; from these results we infer that the performance of the proposed rough controller is satisfactory.
Abstract: Medical Decision Support Systems (MDSSs) are sophisticated, intelligent systems that can perform inference despite missing information and uncertainty. To model the uncertainty in such systems, various soft computing methods are used, such as Bayesian networks, rough sets, artificial neural networks, fuzzy logic, inductive logic programming, genetic algorithms, and hybrid methods formed by combining several of these. In this study, symptom-disease relationships are represented in a framework modeled with formal concept analysis, taking diseases as objects and symptoms as attributes. After a concept lattice is formed, Bayes' theorem can be used to determine the relationships between attributes and objects. The discernibility relation that forms the basis of rough sets can then be applied to the attribute data sets in order to reduce attributes and decrease the computational complexity.
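The discernibility relation mentioned at the end of this abstract can be illustrated with a small sketch. The decision table below is hypothetical (invented symptoms and diagnoses, not the paper's data): the discernibility matrix records, for each pair of objects with different decisions, which condition attributes tell them apart, and attributes appearing as singleton entries form the core used in attribute reduction:

```python
from itertools import combinations

def discernibility_matrix(table, n_cond):
    """For each pair of objects with different decisions (last column),
    the set of condition-attribute indices on which the objects differ."""
    M = {}
    for i, j in combinations(range(len(table)), 2):
        if table[i][-1] != table[j][-1]:
            M[(i, j)] = {a for a in range(n_cond) if table[i][a] != table[j][a]}
    return M

# Hypothetical toy table: (fever, cough) condition attributes, diagnosis decision.
table = [
    ("yes", "high", "flu"),
    ("yes", "low",  "cold"),
    ("no",  "high", "flu"),
    ("no",  "low",  "healthy"),
]
M = discernibility_matrix(table, 2)
core = {next(iter(s)) for s in M.values() if len(s) == 1}  # singletons are core attributes
```

Objects 0 and 2 share the decision "flu", so no entry is generated for that pair; every entry that survives must be discerned by at least one of its attributes, which is how the reduction step prunes redundant columns.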
Abstract: The distinction among urban, periurban and rural areas is a classical example of uncertainty in land classification. Satellite images, geostatistical analysis and all kinds of spatial data are very useful in urban sprawl studies, but it is important to define precise rules for combining large amounts of data to build complex knowledge about territory. Rough Set theory may be a useful method in this field: it offers a different mathematical approach to uncertainty by capturing indiscernibility, whereby two different phenomena that are indiscernible in some context are classified in the same way when the available information about them is combined. This approach has been applied in a case study, comparing the results achieved with the Map Algebra technique and with Spatial Rough Sets. The case study area, Potenza Province, is particularly suitable for the application of this theory because it includes 100 municipalities with differing numbers of inhabitants and morphologic features.
Abstract: Neighborhood Rough Sets (NRS) have proven to be an
efficient tool for heterogeneous attribute reduction. However, most
research has focused on complete and noiseless data, whereas in
practice most information systems are noisy, i.e., filled with
incomplete and inconsistent data. In this paper, we introduce a
generalized neighborhood rough set model, called VPTNRS, to deal
with heterogeneous attribute reduction in noisy systems. We
generalize the classical NRS model with a tolerance neighborhood
relation and probability theory. Furthermore, we use neighborhood
dependency to evaluate the significance of a subset of heterogeneous
attributes and, based on it, construct a forward greedy algorithm for
attribute reduction. Experimental results show that the model deals
efficiently with noisy data.
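The forward greedy search described above can be sketched generically. The dependency function below is a plain δ-neighborhood version (the fraction of samples whose neighborhood is pure in class), not the paper's VPTNRS measure with tolerance and probabilistic relaxation, and the threshold and toy data are illustrative assumptions:

```python
import numpy as np

def neighborhood_dependency(X, y, attrs, delta):
    """Fraction of samples whose delta-neighborhood over `attrs` is pure in class
    (classical NRS dependency; VPTNRS would relax this with variable precision)."""
    if not attrs:
        return 0.0
    Z = X[:, attrs]
    pure = sum(np.all(y[np.linalg.norm(Z - z, axis=1) <= delta] == c)
               for z, c in zip(Z, y))
    return pure / len(Z)

def forward_greedy_reduct(X, y, delta=0.3):
    """Repeatedly add the attribute with the largest dependency gain."""
    remaining, reduct, best = list(range(X.shape[1])), [], 0.0
    while remaining:
        gain, a = max((neighborhood_dependency(X, y, reduct + [b], delta), b)
                      for b in remaining)
        if gain <= best:   # no significance gain: stop
            break
        reduct.append(a)
        remaining.remove(a)
        best = gain
    return reduct

# Toy data: attribute 0 separates the classes, attribute 1 is noise.
X = np.array([[0.0, 0.5], [0.1, 0.9], [0.9, 0.5], [1.0, 0.9]])
y = np.array([0, 0, 1, 1])
print(forward_greedy_reduct(X, y))  # [0]
```

The greedy loop keeps only attributes that raise the dependency, which is how the algorithm avoids evaluating all attribute subsets.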
Abstract: The healthcare environment is generally perceived as
information rich yet knowledge poor, as there is a lack of effective
analysis tools to discover hidden relationships and trends in data.
Valuable knowledge can, in fact, be discovered by applying data
mining techniques in healthcare systems. In this study, a
methodology is presented for extracting significant patterns from
Coronary Heart Disease data warehouses to predict heart attack,
which unfortunately remains a leading cause of mortality worldwide.
For this purpose, we propose to enumerate dynamically the optimal
subsets of reduced features of high interest using a rough set
technique combined with dynamic programming, and then to
validate the classification using a Random Forest (RF) decision tree
classifier to identify risky heart disease cases. This work is based on
a large amount of data collected from several clinical institutions,
based on the medical profiles of patients. Moreover, experts'
knowledge in this field has been taken into consideration in order to
define the disease and its risk factors, and to establish significant
knowledge relationships among the medical factors. A
computer-aided system was developed for this purpose based on a
population of 525 adults. The performance of the proposed model is
analyzed and evaluated against a set of benchmark techniques
applied to this classification problem.
Abstract: General requirements for knowledge representation in
the form of logic rules, applicable to the design and control of
industrial processes, are formulated. The characteristic behavior of
decision trees (DTs) and rough set theory (RST) in extracting rules
from recorded data is discussed and illustrated with simple
examples. The significance of the models' drawbacks was evaluated
using simulated and industrial data sets. It is concluded that the
performance of DTs may be considerably poorer than that of RST in
several important aspects, particularly when not only a
characterization of a problem is required but also detailed and
precise rules tailored to the actual, specific problems to be solved.
Abstract: Recently, X. Ge and J. Qian investigated some relations between higher mathematics scores and calculus scores (resp. linear algebra scores, probability and statistics scores) for Chinese university students. Based on rough set theory, they established an information system S = (U, C ∪ D, V, f), in which the higher mathematics score was taken as the decision attribute and the calculus, linear algebra and probability-statistics scores were taken as condition attributes. They investigated the importance of each condition attribute with respect to the decision attribute and the strength with which each condition attribute supports the decision attribute. In this paper, we give further investigations of this issue. Based on the above information system S = (U, C ∪ D, V, f), we analyze the decision rules between condition and decision granules. For each x ∈ U, we obtain the support (resp. strength, certainty factor, coverage factor) of the decision rule C →x D, where C →x D is the decision rule induced by x in S = (U, C ∪ D, V, f). The results of this paper give a new analysis of higher mathematics scores for Chinese university students, which can further help them raise their higher mathematics scores in the Chinese graduate student entrance examination.
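The four rule measures named above have simple frequency definitions: for a rule C →x D, support counts the objects matching both granules, strength divides that count by |U|, the certainty factor divides it by the size of the condition granule C(x), and the coverage factor by the size of the decision granule D(x). The sketch below computes them for a toy score table (the records and grade levels are invented for illustration, not the paper's data):

```python
# Hypothetical records: (calculus, linear algebra, prob/stat) condition grades
# and a higher-mathematics decision grade. Invented for illustration only.
records = [
    ("high", "high", "mid", "pass"),
    ("high", "high", "mid", "pass"),
    ("high", "low",  "mid", "fail"),
    ("low",  "low",  "low", "fail"),
    ("low",  "low",  "low", "fail"),
]

def rule_factors(records, x):
    """Support, strength, certainty and coverage of the rule C ->x D induced by x."""
    C = [r for r in records if r[:3] == x[:3]]   # condition granule C(x)
    D = [r for r in records if r[3] == x[3]]     # decision granule D(x)
    both = [r for r in C if r[3] == x[3]]        # objects matching both granules
    return {"support": len(both),
            "strength": len(both) / len(records),
            "certainty": len(both) / len(C),
            "coverage": len(both) / len(D)}

print(rule_factors(records, records[0]))
```

A certainty factor of 1 means the rule is deterministic on this table, while the coverage factor shows what share of the decision class the rule explains.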
Abstract: Students in higher education are presented with new terms and concepts in nearly every lecture they attend. Many of them prefer Web-based self-tests for evaluating their understanding of those concepts, since such tests can be used independently of tutors' working hours and thus avoid the necessity of being in a particular place at a particular time. There is a large number of multiple-choice tests in almost every subject designed to contribute to higher-level learning or to discover misconceptions. Every single test provides immediate feedback to a student about the outcome of that test, and in some cases a supporting system displays an overall score when a test is taken several times by the same student. What we still find missing is a way to deliver personalized feedback to a user while taking the user's progress into consideration. The present work is motivated by the wish to throw some light on that question.
Abstract: In this paper, some practical solid transportation models are formulated that consider the per-trip capacity of each type of conveyance, with crisp and rough unit transportation costs. This is applicable to systems in which full vehicles, e.g. trucks or rail coaches, are booked for the transportation of products, so that the transportation cost is determined by the full capacity of the conveyances. The models with unit transportation costs as rough variables are transformed into deterministic forms using rough chance-constrained programming with the help of the trust measure. Numerical examples are provided to illustrate the proposed models both in a crisp environment and with unit transportation costs as rough variables.
Abstract: The knowledge base of welding defect recognition is
essentially incomplete. This characteristic means that the recognition results do not reflect the actual situation, and it further affects the classification of welding quality. This paper is
concerned with a rough set based method to reduce this influence and improve classification accuracy. First, a rough set
model of intelligent welding quality classification is built, and both the condition and decision attributes are specified. Then, groups
of representative multiple compound defects are chosen
from the defect library and classified correctly to form the
decision table. Finally, the redundant information of the decision table is reduced and the optimal decision rules are obtained. By this method, we are able to reclassify misclassified defects to
the right quality level. Compared with ordinary methods, this method
has higher accuracy and better robustness.
Abstract: Increasing detection rates and reducing false positive
rates are important problems in Intrusion Detection Systems (IDS).
Although preventative techniques such as access control and
authentication attempt to keep intruders out, they can fail, and
intrusion detection has been introduced as a second line of defence.
Rare events are events that occur very infrequently, and their
detection is a common problem in many domains. In this paper we
propose an intrusion detection method that combines rough sets and
fuzzy clustering: rough sets are used to decrease the amount of data
and remove redundancy, while fuzzy c-means clustering allows
objects to belong to several clusters simultaneously, with different
degrees of membership. Our approach recognizes not only known
attacks but also suspicious activity that may be the result of a new,
unknown attack. Experimental results on the Knowledge Discovery
and Data Mining (KDD Cup 1999) dataset show that the method is
efficient and practical for intrusion detection systems.
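The soft memberships mentioned above can be illustrated with a minimal fuzzy c-means sketch (the standard alternating updates of centers and memberships). The data here are a toy two-blob set, and the rough-set reduction of KDD Cup features that precedes clustering in the paper is omitted:

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Minimal fuzzy c-means: returns (centers, membership matrix U).
    Each row of U holds one object's memberships and sums to 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        w = U ** m                                     # fuzzified weights
        centers = (w.T @ X) / w.sum(axis=0)[:, None]   # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))                  # standard membership update
        U_new = inv / inv.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            return centers, U_new
        U = U_new
    return centers, U

# Toy data: two well-separated blobs.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
centers, U = fcm(X, c=2)
```

Each row of U grades one object across all clusters, which is the property that lets borderline network connections be flagged as suspicious rather than forced into a single class.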
Abstract: Clustering in high-dimensional space is a difficult
problem that recurs in many fields of science and engineering, e.g.,
bioinformatics, image processing, pattern recognition and data
mining. In a high-dimensional space some of the dimensions are
likely to be irrelevant, thus hiding the possible clustering, and in
very high dimensions it is common for all the objects in a dataset to
be nearly equidistant from each other, completely masking the
clusters. Hence, the performance of clustering algorithms decreases.
In this paper, we propose an algorithmic framework which
combines the (reduct) concept of rough set theory with the k-means
algorithm to remove the irrelevant dimensions in a high dimensional
space and obtain appropriate clusters. Our experiment on test data
shows that this framework increases efficiency of the clustering
process and accuracy of the results.
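A crude version of such a framework can be sketched as follows. The `naive_reduct` below drops discretized attributes whose removal leaves the indiscernibility partition unchanged; it is a simplification of a real rough-set reduct computation, and the discretization scheme, toy data and k-means initialization are illustrative assumptions rather than the paper's method:

```python
import numpy as np

def partition(disc, attrs):
    """Indiscernibility classes: row indices grouped by their values on `attrs`."""
    groups = {}
    for i, row in enumerate(disc):
        groups.setdefault(tuple(row[a] for a in attrs), []).append(i)
    return sorted(groups.values())

def naive_reduct(disc):
    """Greedily drop attributes whose removal keeps the partition unchanged."""
    attrs = list(range(disc.shape[1]))
    full = partition(disc, attrs)
    for a in list(attrs):
        rest = [b for b in attrs if b != a]
        if rest and partition(disc, rest) == full:
            attrs = rest
    return attrs

def kmeans(X, k, iters=50):
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)]  # simple spread-out init
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

def reduct_kmeans(X, k, bins=3):
    """Discretize, drop superfluous dimensions, then cluster on the kept ones."""
    disc = np.array([np.digitize(col, np.histogram(col, bins)[1][1:-1])
                     for col in X.T]).T
    kept = naive_reduct(disc)
    return kept, kmeans(X[:, kept], k)

# Toy data: dimension 0 carries the clusters, dimension 1 is constant noise.
X = np.array([[0.0, 1.0], [0.1, 1.0], [0.2, 1.0],
              [5.0, 1.0], [5.1, 1.0], [5.2, 1.0]])
kept, labels = reduct_kmeans(X, k=2)
```

Running k-means only on the kept dimensions avoids the near-equidistance problem that the irrelevant dimensions would otherwise introduce.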
Abstract: This article outlines the conceptualization and
implementation of an intelligent system capable of extracting
knowledge from databases. The use of hybridized features of both
Rough and Fuzzy Set theory gives the developed system flexibility
in dealing with discrete as well as continuous datasets. A raw data
set provided to the system is initially transformed into a
computer-legible format, followed by pruning of the data set. The
refined data set is
then processed through various Rough Set operators which enable
discovery of parameter relationships and interdependencies. The
discovered knowledge is automatically transformed into a rule base
expressed in Fuzzy terms. Two exemplary cancer repository datasets
(for Breast and Lung Cancer) have been used to test and implement
the proposed framework.
Abstract: Knowledge Discovery in Databases (KDD) has
evolved into an important and active area of research because of
theoretical challenges and practical applications associated with the
problem of discovering (or extracting) interesting and previously
unknown knowledge from very large real-world databases. Rough
Set Theory (RST) is a mathematical formalism for representing
uncertainty that can be considered an extension of the classical set
theory. It has been used in many different research areas, including
those related to inductive machine learning and reduction of
knowledge in knowledge-based systems. One important concept
related to RST is that of a rough relation. In this paper we present
the current status of research on applying rough set theory to KDD,
which should be helpful for handling the characteristics of
real-world databases. The main aim is to show how rough sets and
rough set analysis can be effectively used to extract knowledge from
large databases.