Computational Aspects of Regression Analysis of Interval Data

We consider linear regression models in which both the input data (the values of the independent variables) and the output data (the observations of the dependent variable) are interval-censored. We introduce a possibilistic generalization of the least squares estimator, the so-called OLS-set for the interval model. This set captures the loss of information in the OLS estimator caused by interval censoring and provides a tool for quantifying this effect. We study complexity-theoretic properties of the OLS-set. We also deal with restricted versions of the general interval linear regression model, in particular the crisp input – interval output model. We argue that natural descriptions of the OLS-set in the crisp input – interval output model cannot be computed in polynomial time. We then derive easily computable approximations of the OLS-set that can be used in place of the exact description. We illustrate the approach with an example.
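To make the crisp input – interval output setting concrete, the following minimal sketch computes a componentwise bounding box of the OLS-set for a simple regression y = b0 + b1*x. It assumes that the OLS-set is the set of all OLS estimates obtained as each observation y_i ranges over its interval [y_lo_i, y_hi_i]. Since the OLS estimate is linear in y for crisp x, each coefficient is a weighted sum of the y_i, and its extreme values are attained at interval endpoints chosen by the sign of the weight. The function names are hypothetical, and the box is only one simple outer approximation, not necessarily one of the approximations derived in the paper.

```python
def ols_weights(x):
    """Rows of Q = (X^T X)^{-1} X^T for the model y = b0 + b1*x with crisp x.

    Returns (q0, q1) such that b0 = sum(q0[i]*y[i]) and b1 = sum(q1[i]*y[i]).
    """
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    q1 = [(xi - xbar) / sxx for xi in x]      # slope weights
    q0 = [1.0 / n - xbar * w for w in q1]     # intercept weights
    return q0, q1


def coefficient_range(weights, y_lo, y_hi):
    """Exact range of sum(w_i * y_i) when each y_i lies in [y_lo_i, y_hi_i]."""
    # A nonnegative weight is minimized at the lower endpoint, a negative one
    # at the upper endpoint; the maximum uses the opposite choice.
    lo = sum(w * (a if w >= 0 else b) for w, a, b in zip(weights, y_lo, y_hi))
    hi = sum(w * (b if w >= 0 else a) for w, a, b in zip(weights, y_lo, y_hi))
    return lo, hi


def ols_set_box(x, y_lo, y_hi):
    """Bounding box ((b0_lo, b0_hi), (b1_lo, b1_hi)) of the OLS-set."""
    q0, q1 = ols_weights(x)
    return coefficient_range(q0, y_lo, y_hi), coefficient_range(q1, y_lo, y_hi)


if __name__ == "__main__":
    # Illustrative data only: three crisp inputs, outputs known up to +/- 1.
    intercept_range, slope_range = ols_set_box(
        [0.0, 1.0, 2.0], [-1.0, 0.0, 1.0], [1.0, 2.0, 3.0]
    )
    print("intercept range:", intercept_range)
    print("slope range:", slope_range)
```

The box is the tightest axis-aligned enclosure of the OLS-set under the stated assumption, but it does not describe the shape of the set itself: the two coefficients generally cannot attain their extreme values simultaneously.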

