Mixtures of Monotone Networks for Prediction

In many data mining applications, it is a priori known that the target function should satisfy certain constraints imposed by, for example, economic theory or a human decision-maker. In this paper we consider partially monotone prediction problems, where the target variable depends monotonically on some of the input variables but not all. We propose a novel method to construct prediction models in which monotone dependencies with respect to some of the input variables are preserved by construction. Our method belongs to the class of mixture models. The basic idea is to convolve monotone neural networks with weight (kernel) functions to make predictions. By using simulation and real case studies, we demonstrate the application of our method. To obtain a sound assessment of the performance of our approach, we use standard neural networks with weight decay and partially monotone linear models as benchmark methods for comparison. The results show that our approach outperforms partially monotone linear models in terms of accuracy. Furthermore, the incorporation of partial monotonicity constraints not only leads to models that are in accordance with the decision maker's expertise, but also considerably reduces the model variance in comparison to standard neural networks with weight decay.
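The construction described above can be illustrated with a minimal sketch. The assumptions here are ours, not the paper's exact architecture: each component is a min-max monotone network in the spirit of Sill's monotonic networks (nonnegative weights on the monotone inputs guarantee a non-decreasing response), and the mixture weights are Gaussian kernels over the non-monotone inputs only. Because the kernel weights do not depend on the monotone inputs, the prediction is a convex combination of monotone functions and therefore remains monotone in those inputs by construction. All names (`monotone_net`, `mixture_predict`, the shapes and the kernel bandwidth `h`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def monotone_net(x_m, W, b):
    """Min-max network: non-decreasing in every monotone input x_m.

    W has shape (groups, units, d_m); taking |W| forces nonnegative
    slopes, so each hyperplane, their per-group max, and the final
    min over groups are all non-decreasing in x_m.
    """
    planes = np.einsum('gud,nd->ngu', np.abs(W), x_m) + b  # (n, groups, units)
    return planes.max(axis=2).min(axis=1)                  # (n,)

def gaussian_kernel(x_u, centers, h):
    """Kernel weights over the NON-monotone inputs x_u only."""
    d2 = ((x_u[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # (n, J)
    return np.exp(-d2 / (2.0 * h ** 2))

def mixture_predict(x_m, x_u, nets, centers, h):
    """Nadaraya-Watson-style mixture of J monotone networks.

    The weights depend only on x_u, so the output is a convex
    combination of functions monotone in x_m -- partial
    monotonicity is preserved by construction.
    """
    K = gaussian_kernel(x_u, centers, h)                               # (n, J)
    F = np.stack([monotone_net(x_m, W, b) for W, b in nets], axis=1)   # (n, J)
    return (K * F).sum(axis=1) / K.sum(axis=1)
```

A quick check of the monotonicity guarantee: holding the non-monotone inputs fixed and increasing a monotone input coordinate can never decrease the mixture's prediction, regardless of the (random) network weights.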



