Implementing an Intuitive Reasoner with a Large Weather Database

In this paper, the implementation of a rule-based intuitive reasoner is presented. The implementation comprised two parts: a rule induction module and the intuitive reasoner itself. A large weather database was acquired as the data source, and twelve weather variables were chosen as the "target variables" whose values were to be predicted by the intuitive reasoner. A "complex" situation was simulated by making only subsets of the data available to the rule induction module. As a result, the induced rules were based on incomplete information and carried variable levels of certainty. The certainty level was modeled by a metric called "Strength of Belief", assigned to each rule or datum as ancillary information about the confidence in its accuracy. Two techniques were employed to induce rules from the data subsets: decision trees and multi-polynomial regression, for the discrete and the continuous target variables, respectively. The intuitive reasoner was tested on its ability to use the induced rules to predict the classes of the discrete target variables and the values of the continuous target variables. It implemented two types of reasoning, fast and broad, which correspond, by analogy to human thought, to quick decision making and deeper contemplation, respectively. For reference, a weather data analysis approach that had been applied to similar tasks was used to analyze the complete database and create predictive models for the same twelve target variables. The values predicted by the intuitive reasoner and by the reference approach were compared with the actual data. The intuitive reasoner reached near-100% accuracy for two of the continuous target variables, and for the discrete target variables it predicted at least 70% as accurately as the reference reasoner. Since the intuitive reasoner operated on rules derived from only about 10% of the total data, these results demonstrate its potential advantage over conventional methods when dealing with sparse data sets.
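
As a rough illustration of the rule-induction step summarized above, the following Python sketch induces rules for one discrete target variable with a decision tree trained on roughly 10% of a synthetic stand-in for the weather table, and attaches a hypothetical "Strength of Belief" value to each prediction. The data, the variable layout, and the belief formula (subset fraction scaled by leaf purity) are assumptions made for this example only; the paper does not specify these details.

# Minimal sketch, assuming a decision tree as the rule inducer for a
# discrete target variable and a made-up Strength-of-Belief weighting.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the weather database: 10,000 records with
# 4 predictor variables and one discrete target class (e.g. rain / no rain).
X_full = rng.normal(size=(10_000, 4))
y_full = (X_full[:, 0] + 0.5 * X_full[:, 1] > 0).astype(int)

# "Complex" situation: only about 10% of the records are visible
# to the rule induction module.
subset = rng.choice(len(X_full), size=len(X_full) // 10, replace=False)
X_sub, y_sub = X_full[subset], y_full[subset]

# Rule induction for the discrete target variable via a decision tree.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_sub, y_sub)

# Hypothetical Strength of Belief: the fraction of the database the rules
# were induced from, scaled by the purity of the leaf that fires for a case.
subset_fraction = len(X_sub) / len(X_full)

def predict_with_belief(x_row):
    leaf = tree.apply(x_row.reshape(1, -1))[0]
    counts = tree.tree_.value[leaf].ravel()   # class counts in that leaf
    purity = counts.max() / counts.sum()
    label = tree.predict(x_row.reshape(1, -1))[0]
    return label, subset_fraction * purity

label, belief = predict_with_belief(X_full[0])
print(f"predicted class {label} with Strength of Belief {belief:.2f}")

In the full system, an analogous regression-based inducer (multi-polynomial regression in the paper) would play the same role for the continuous target variables, with its own Strength-of-Belief annotation.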



