Information Gain Ratio Based Clustering for Investigation of Environmental Parameters Effects on Human Mental Performance

Clustering methods developed in data mining can be applied to investigate dependencies between environmental conditions and human activities. Environmental parameters such as temperature, relative humidity, atmospheric pressure and illumination are known to have significant effects on human mental performance. To investigate the effect of these parameters, a clustering technique based on entropy and the Information Gain Ratio (IGR), K(Y/X) = (H(Y) − H(Y/X))/H(Y), is used, where H(Y) = −Σ_i P_i ln(P_i) is the entropy of Y and H(Y/X) is the conditional entropy of Y given X. This technique allows the boundaries of clusters to be adjusted. It is shown that the IGR grows monotonically with the degree of connectivity between two variables. Compared with, for example, correlation analysis, this approach has the advantage of being less sensitive to the shape of the functional dependency. A variant of an algorithm implementing the proposed method is presented, together with some analysis of the above problem of environmental effects. It is shown that the proposed method converges in a finite number of steps.
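The ratio described above can be illustrated with a minimal sketch. It assumes the gain takes the standard form H(Y) − H(Y/X), that X has already been split into clusters by a list of boundary values, and that Y is a discrete performance level. The function names (`entropy`, `conditional_entropy`, `gain_ratio`, `to_bins`) and the sample data are hypothetical and are not taken from the paper; the boundary-adjustment algorithm itself is not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H = -sum(p_i * ln p_i) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(x_bins, y_bins):
    """H(Y/X): entropy of Y inside each X cluster, weighted by cluster share."""
    n = len(x_bins)
    groups = {}
    for x, y in zip(x_bins, y_bins):
        groups.setdefault(x, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def gain_ratio(x_bins, y_bins):
    """K(Y/X) = (H(Y) - H(Y/X)) / H(Y); defined as 0 when H(Y) is 0."""
    h_y = entropy(y_bins)
    if h_y == 0:
        return 0.0
    return (h_y - conditional_entropy(x_bins, y_bins)) / h_y


def to_bins(values, bounds):
    """Cluster index of each value = number of boundaries it meets or exceeds."""
    return [sum(v >= b for b in bounds) for v in values]


# Made-up illustrative data: room temperature (deg C) and a performance level.
temps = [18, 19, 20, 21, 26, 27, 28, 29]
perf = ["high", "high", "high", "high", "low", "low", "low", "low"]

well_placed = gain_ratio(to_bins(temps, [23]), perf)    # boundary separates the levels
misplaced = gain_ratio(to_bins(temps, [19.5]), perf)    # boundary cuts through one level
```

In this toy data, a boundary that separates the two performance levels gives a ratio of 1, while a poorly placed boundary gives a smaller value. This is the property a boundary-adjustment procedure could exploit by moving boundaries in the direction that increases the ratio.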



