For the selection of predictors in multiple linear regression with $p$ candidate predictors, what methods are available to find an 'optimal' subset of the predictors without explicitly testing all $2^p$ subsets? In 'Applied Survival Analysis,' Hosmer & Lemeshow make reference to Kuk's method, but I cannot find the original paper. Can anyone describe this method, or, even better, a more modern technique? One may assume normally distributed errors.
I've never heard of Kuk's method, but the hot topic these days is L1-penalised regression: you minimise the residual sum of squares plus a penalty proportional to the sum of the absolute values of the regression coefficients. The rationale is that this penalty drives the coefficients of unimportant predictors exactly to zero, so variable selection happens automatically as part of the fit.
These techniques go by some funny names: the lasso, LARS, the Dantzig selector. You can read the original papers, but a good place to start is The Elements of Statistical Learning, Chapter 3.
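To make this concrete, here is a minimal sketch of lasso-based variable selection using scikit-learn's `LassoCV`, which picks the penalty strength by cross-validation. The data here are synthetic and purely illustrative (only the first three of ten predictors carry signal); this is not Kuk's method, just the L1 approach described above.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))

# Only the first three predictors actually matter.
beta_true = np.array([3.0, -2.0, 1.5] + [0.0] * (p - 3))
y = X @ beta_true + rng.standard_normal(n)

# LassoCV chooses the L1 penalty strength alpha by 5-fold cross-validation.
lasso = LassoCV(cv=5).fit(X, y)

# Predictors whose coefficients were not shrunk to exactly zero are "selected".
selected = np.flatnonzero(lasso.coef_ != 0)
print("selected predictors:", selected)
```

With a signal this strong, the three true predictors should always be retained; a few noise predictors may survive with small coefficients, which is typical when the penalty is tuned for prediction rather than for selection.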