Peer-Reviewed Publications

      Adaptively preconditioned Krylov spaces to identify irrelevant predictors

      Kondylis, A.; Whittaker, J.
      Published: Aug 24, 2010
      DOI: 10.1016/j.chemolab.2010.08.010
      Summary

      Linear regression methods run into estimation problems when the predictor variables are highly correlated and when their number exceeds the number of available observations. Partial least squares (PLS) is one well-known method for handling such ill-conditioned regression problems. It does so by approximating the regression solution in a low-dimensional subspace. While it copes with collinearity and singularity problems, PLS has no intrinsic variable selection procedure. However, one often needs to decide which of the numerous, correlated predictors are the most relevant. The PLS coefficient vector is a good starting point for identifying relevant variables in ill-conditioned regression settings. We propose to adaptively precondition the space generated by PLS in order to determine the most relevant predictors. The relevant subset is determined by a multiple testing procedure, and preconditioning stops when this set no longer changes. The principal objective is to perform regression modelling and to recover solutions that are easy to interpret in the high-dimensional regression setting. We use dimension reduction in a PLS fashion, using information on the response to guide the variable selection procedure. A variety of examples is studied with good results.
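
The low-dimensional subspace mentioned in the summary can be illustrated with a minimal sketch. It relies on the standard result that the PLS regression coefficient for a single response minimises the residual sum of squares over the Krylov space generated by X^T X and X^T y. This is plain PLS only, not the adaptive preconditioning or the multiple testing step proposed in the paper. The function name `pls_krylov`, its arguments, and the assumption that X and y are already centred are illustrative choices, not taken from the source.

```python
import numpy as np

def pls_krylov(X, y, m):
    """Hypothetical helper: PLS-type coefficients from an m-dimensional Krylov space.

    Assumes the columns of X and the vector y are already centred.
    """
    A = X.T @ X          # p x p cross-product matrix
    b = X.T @ y          # p-vector of predictor/response cross-products
    p = X.shape[1]
    Q = np.zeros((p, m)) # orthonormal basis of span{b, Ab, ..., A^(m-1) b}
    q = b / np.linalg.norm(b)
    used = 0
    for j in range(m):
        Q[:, j] = q
        used = j + 1
        v = A @ q
        # Orthogonalise against the basis built so far (two passes for stability).
        for _ in range(2):
            v = v - Q[:, :used] @ (Q[:, :used].T @ v)
        norm = np.linalg.norm(v)
        if norm < 1e-12:  # the Krylov space cannot grow any further
            break
        q = v / norm
    Q = Q[:, :used]
    # Least squares restricted to the subspace spanned by the columns of Q.
    z, *_ = np.linalg.lstsq(X @ Q, y, rcond=None)
    return Q @ z
```

When m equals the number of predictors and X has full column rank, the subspace is generically the whole predictor space and the result coincides with ordinary least squares; smaller m gives the regularised, low-dimensional approximation the summary refers to.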