Abstract: Feature selection and attribute reduction are crucial
problems, and widely used techniques in the field of machine
learning, data mining and pattern recognition to overcome the
well-known phenomenon of the Curse of Dimensionality. This paper
presents a feature selection method that efficiently carries out attribute
reduction, thereby selecting the most informative features of a dataset.
It consists of two components: 1) a measure for feature subset
evaluation, and 2) a search strategy. For the evaluation measure,
we have employed the fuzzy-rough dependency degree (FRFDD)
of the lower approximation-based fuzzy-rough feature selection
(L-FRFS) due to its effectiveness in feature selection. As for the
search strategy, a modified version of a binary shuffled frog leaping
algorithm is proposed (B-SFLA). The proposed feature selection
method is obtained by hybridizing the B-SFLA with the FRDD. Nine
classifiers have been employed to compare the proposed approach
with several existing methods over twenty two datasets, including
nine high dimensional and large ones, from the UCI repository.
The experimental results demonstrate that the B-SFLA approach
significantly outperforms other metaheuristic methods in terms of the
number of selected features and the classification accuracy.
Abstract: Relation between tolerance class and indispensable attribute and knowledge dependency in rough set model with tolerance relation is explored. After giving definitions and concepts of knowledge dependency and knowledge dependency degree for incomplete information system in tolerance rough set model by distinguishing decision attribute containing missing attribute value or not, the result of maintaining reflectivity, transitivity, augmentation, decomposition law and merge law for complete knowledge dependency is proved. Knowledge dependency degrees (not complete knowledge dependency degrees) only satisfy some laws after transitivity, augmentation and decomposition operations. An algorithm to solve attribute reduction in an incomplete decision table is designed. The correctness is checked by an example.
Abstract: Some invariant properties of incomplete information systems homomorphism are studied in this paper. Demand conditions of tolerance class, attribute reduction, indispensable attribute and dispensable attribute being invariant under homomorphism in incomplete information system are revealed and discussed. The existing condition of endohomomorphism on an incomplete information system is also explored. It establishes some theoretical foundations for further investigations on incomplete information systems in rough set theory, like in information systems.
Abstract: One of the global combinatorial optimization
problems in machine learning is feature selection. It concerned with
removing the irrelevant, noisy, and redundant data, along with
keeping the original meaning of the original data. Attribute reduction
in rough set theory is an important feature selection method. Since
attribute reduction is an NP-hard problem, it is necessary to
investigate fast and effective approximate algorithms. In this paper,
we proposed two feature selection mechanisms based on memetic
algorithms (MAs) which combine the genetic algorithm with a fuzzy
record to record travel algorithm and a fuzzy controlled great deluge
algorithm, to identify a good balance between local search and
genetic search. In order to verify the proposed approaches, numerical
experiments are carried out on thirteen datasets. The results show that
the MAs approaches are efficient in solving attribute reduction
problems when compared with other meta-heuristic approaches.
Abstract: Neighborhood Rough Sets (NRS) has been proven to
be an efficient tool for heterogeneous attribute reduction. However,
most of researches are focused on dealing with complete and noiseless
data. Factually, most of the information systems are noisy, namely,
filled with incomplete data and inconsistent data. In this paper, we
introduce a generalized neighborhood rough sets model, called
VPTNRS, to deal with the problem of heterogeneous attribute
reduction in noisy system. We generalize classical NRS model with
tolerance neighborhood relation and the probabilistic theory.
Furthermore, we use the neighborhood dependency to evaluate the
significance of a subset of heterogeneous attributes and construct a
forward greedy algorithm for attribute reduction based on it.
Experimental results show that the model is efficient to deal with noisy
data.