Abstract: One of the biggest challenges in nonparametric
regression is the curse of dimensionality. Additive models are known
to overcome this problem by estimating only the individual additive
effects of each covariate. However, if the model is misspecified, the
accuracy of the estimator compared to the fully nonparametric one
is unknown. In this work the efficiency of completely nonparametric
regression estimators such as the Loess is compared to the estimators
that assume additivity in several situations, including additive and
non-additive regression scenarios. The comparison is done by
computing the oracle mean square error of the estimators with regards
to the true nonparametric regression function. Then, a backward
elimination selection procedure based on the Akaike Information
Criteria is proposed, which is computed from either the additive or
the nonparametric model. Simulations show that if the additive model
is misspecified, the percentage of time it fails to select important
variables can be higher than that of the fully nonparametric approach.
A dimension reduction step is included when nonparametric estimator
cannot be computed due to the curse of dimensionality. Finally, the
Boston housing dataset is analyzed using the proposed backward
elimination procedure and the selected variables are identified.
Abstract: This study investigates the performance of radial basis function networks (RBFN) in forecasting the monthly CO2 emissions of an electric power utility. We also propose a method for input variable selection. This method is based on identifying the general relationships between groups of input candidates and the output. The effect that each input has on the forecasting error is examined by removing all inputs except the variable to be investigated from its group, calculating the networks parameter and performing the forecast. Finally, the new forecasting error is compared with the reference model. Eight input variables were identified as the most relevant, which is significantly less than our reference model with 30 input variables. The simulation results demonstrate that the model with the 8 inputs selected using the method introduced in this study performs as accurate as the reference model, while also being the most parsimonious.