On the Outlier Detection in Nonlinear Regression

The detection of outliers is essential because outliers cause serious interpretative problems in both linear and nonlinear regression analysis. Much work has been done on identifying outliers in linear regression, but far less in nonlinear regression. In this article we propose several outlier detection techniques for nonlinear regression. The main idea is to use the linear approximation of a nonlinear model and to treat the gradient as the design matrix, from which the detection techniques are then formulated. Six detection measures are developed and combined with three estimation techniques: the least-squares, M and MM estimators. The study shows that, among the six measures, only the studentized residual and Cook's distance, when combined with the MM estimator, are consistently capable of identifying the correct outliers.
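The idea of treating the gradient as the design matrix can be illustrated with a minimal sketch. The exponential decay model, the simulated data, the planted outlier and all function names below are assumptions made for illustration only; they are not taken from the article. The sketch uses a plain least-squares fit (a damped Gauss-Newton loop) rather than the MM estimator, and it applies the familiar linear-regression formulas for leverage, internally studentized residuals and Cook's distance to the gradient matrix, which may differ in detail from the measures defined in the article.

```python
import numpy as np

def model(theta, x):
    """Hypothetical nonlinear model: y = a * exp(-b * x)."""
    a, b = theta
    return a * np.exp(-b * x)

def gradient(theta, x):
    """n x p matrix of partial derivatives of the model w.r.t. (a, b)."""
    a, b = theta
    e = np.exp(-b * x)
    return np.column_stack([e, -a * x * e])

def fit_least_squares(x, y, theta0, n_iter=100, tol=1e-10):
    """Gauss-Newton with step halving (ordinary least squares, not MM)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        r = y - model(theta, x)
        J = gradient(theta, x)
        # Gauss-Newton step: solve the linearized problem J @ step ~= r.
        step, *_ = np.linalg.lstsq(J, r, rcond=None)
        sse, scale = r @ r, 1.0
        # Halve the step until the sum of squared errors does not increase.
        while scale > 1e-8:
            r_trial = y - model(theta + scale * step, x)
            if r_trial @ r_trial <= sse:
                break
            scale /= 2.0
        theta = theta + scale * step
        if np.linalg.norm(scale * step) < tol:
            break
    return theta

def diagnostics(x, y, theta):
    """Leverage, studentized residuals and Cook's distance, with the
    gradient matrix playing the role of the design matrix."""
    n, p = len(x), len(theta)
    r = y - model(theta, x)
    J = gradient(theta, x)
    Q, _ = np.linalg.qr(J)            # orthonormal basis of the column space of J
    h = np.sum(Q**2, axis=1)          # diagonal of J (J'J)^{-1} J'
    s2 = r @ r / (n - p)              # residual variance estimate
    t = r / np.sqrt(s2 * (1.0 - h))   # internally studentized residuals
    cook = t**2 * h / (p * (1.0 - h))
    return h, t, cook

# Simulated data (assumed values) with one planted outlier.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 20)
y = model((10.0, 0.5), x) + rng.normal(scale=0.05, size=x.size)
outlier_idx = 10
y[outlier_idx] += 3.0

theta_hat = fit_least_squares(x, y, theta0=(8.0, 0.3))
h, t, cook = diagnostics(x, y, theta_hat)
print("largest |studentized residual| at index", int(np.argmax(np.abs(t))))
print("largest Cook's distance at index", int(np.argmax(cook)))
```

With a single large outlier, the least-squares fit is usually enough for the studentized residual to point at the contaminated observation. With several outliers, a non-robust fit can mask them, which is the situation in which the abstract reports that the MM estimator is needed.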



