A Fuzzy-Rough Feature Selection Based on Binary Shuffled Frog Leaping Algorithm

Feature selection and attribute reduction are crucial problems and widely
used techniques in machine learning, data mining, and pattern recognition,
employed to overcome the well-known Curse of Dimensionality. This paper
presents a feature selection method that efficiently carries out attribute
reduction, thereby selecting the most informative features of a dataset.
It consists of two components: 1) a measure for feature subset evaluation,
and 2) a search strategy. For the evaluation measure, we employ the
fuzzy-rough dependency degree (FRDD) of the lower approximation-based
fuzzy-rough feature selection (L-FRFS) approach, owing to its effectiveness
in feature selection. For the search strategy, a modified binary shuffled
frog leaping algorithm (B-SFLA) is proposed. The proposed feature selection
method is obtained by hybridizing the B-SFLA with the FRDD. Nine classifiers
are employed to compare the proposed approach with several existing methods
over twenty-two datasets from the UCI repository, including nine
high-dimensional and large ones. The experimental results demonstrate that
the B-SFLA approach significantly outperforms other metaheuristic methods
in terms of both the number of selected features and classification
accuracy.
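To make the two components concrete, the Python sketch below illustrates how
the fuzzy-rough dependency degree of a candidate subset can be computed, and
how a binary SFLA search might use it as a fitness measure. The dependency
degree of the decision Q on a feature subset P is
gamma'_P(Q) = (sum over x in U of mu_POS_P(Q)(x)) / |U|, where the
positive-region membership comes from the fuzzy lower approximations of the
decision classes. This is a minimal sketch under stated assumptions, not the
paper's implementation: the min t-norm for combining per-feature
similarities, the Kleene-Dienes implicator for the lower approximation, the
sigmoid binarization of the leap step, and the 0.9/0.1 weighting between
dependency and subset size are all illustrative choices.

import numpy as np

def fuzzy_rough_dependency(X, y, subset):
    # Fuzzy-rough dependency degree gamma'_P(Q) of the decision y on the
    # feature subset P = `subset` (column indices into X).
    n = X.shape[0]
    sim = np.ones((n, n))
    for a in subset:
        col = X[:, a].astype(float)
        rng = col.max() - col.min()
        if rng == 0:
            rng = 1.0
        # Per-feature fuzzy similarity, combined with the min t-norm (assumed).
        sim = np.minimum(sim, 1.0 - np.abs(col[:, None] - col[None, :]) / rng)
    gamma = 0.0
    for x in range(n):
        same = (y == y[x]).astype(float)
        # Lower-approximation membership of x in its own decision class,
        # via the Kleene-Dienes implicator I(a, b) = max(1 - a, b) (assumed).
        gamma += np.min(np.maximum(1.0 - sim[x], same))
    return gamma / n

def bsfla_select(X, y, n_frogs=30, n_memeplexes=5, iterations=50, seed=0):
    # Binary shuffled frog leaping search over feature subsets, scored by
    # the fuzzy-rough dependency degree.
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    frogs = rng.integers(0, 2, size=(n_frogs, d))

    def fitness(frog):
        subset = np.flatnonzero(frog)
        if subset.size == 0:
            return 0.0
        # The weighting between dependency and subset size is an assumption.
        return (0.9 * fuzzy_rough_dependency(X, y, subset)
                + 0.1 * (1.0 - subset.size / d))

    fit = np.array([fitness(f) for f in frogs])
    for _ in range(iterations):
        order = np.argsort(-fit)              # rank frogs by fitness
        frogs, fit = frogs[order], fit[order]
        for m in range(n_memeplexes):         # deal ranked frogs into memeplexes
            idx = np.arange(m, n_frogs, n_memeplexes)
            best, worst = idx[0], idx[-1]
            # Binary leap: flip the worst frog's bits toward the memeplex
            # best with a sigmoid-mapped probability (assumed binarization).
            step = (frogs[best] - frogs[worst]) * rng.uniform(0.0, 2.0, d)
            prob = 1.0 / (1.0 + np.exp(-step))
            trial = np.where(rng.random(d) < prob, frogs[best], frogs[worst])
            new_fit = fitness(trial)
            if new_fit > fit[worst]:
                frogs[worst], fit[worst] = trial, new_fit
            else:                             # censor: replace with a random frog
                frogs[worst] = rng.integers(0, 2, d)
                fit[worst] = fitness(frogs[worst])
    return np.flatnonzero(frogs[int(np.argmax(fit))])

Given a dataset loaded as a NumPy array X with labels y, a call such as
bsfla_select(X, y) would return the indices of the selected features; the
frog count, memeplex count, and iteration budget are the usual SFLA
parameters and would need tuning per dataset.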
