QUESTION

For the selection of predictors in multiple linear regression with $p$ candidate predictors, what methods are available to find an 'optimal' subset of the predictors without explicitly testing all $2^p$ subsets? In 'Applied Survival Analysis,' Hosmer & Lemeshow make reference to Kuk's method, but I cannot find the original paper. Can anyone describe this method, or, even better, a more modern technique? One may assume normally distributed errors.

{ asked by shabbychef }

ANSWER

I've never heard of Kuk's method, but the hot topic these days is L1 minimisation. The rationale is that if you add a penalty on the absolute values of the regression coefficients, the coefficients of the unimportant predictors should go to exactly zero, which drops those predictors from the model.
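
For concreteness, the Lasso form of this idea is usually written as the penalised least-squares problem below. The notation is an assumption, since none is fixed in the question: $y$ is the response vector of length $n$, $X$ the $n \times p$ matrix of predictors, $\beta$ the coefficient vector, and $\lambda \ge 0$ a tuning parameter.

```latex
$$
\hat{\beta}(\lambda) \;=\; \arg\min_{\beta \in \mathbb{R}^p}
\; \frac{1}{2n} \, \lVert y - X\beta \rVert_2^2 \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
$$
```

Setting $\lambda = 0$ gives ordinary least squares; increasing $\lambda$ sets more coefficients to exactly zero, so a single path over $\lambda$ stands in for the search over $2^p$ subsets. In practice $\lambda$ is typically chosen by cross-validation.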

These techniques have some funny names: the Lasso, LARS (least angle regression), and the Dantzig selector. You can read the original papers, but a good place to start is The Elements of Statistical Learning, Chapter 3.
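
To make the idea concrete, here is a minimal sketch of the Lasso fitted by cyclic coordinate descent in plain Python. It is an illustration under stated assumptions, not the algorithm from any of the papers: the function names `soft_threshold` and `lasso_cd`, the simulated data, and the penalty value `lam=0.5` are all made up for the example, no intercept is fitted, and the penalty is fixed by hand rather than chosen by cross-validation.

```python
import random


def soft_threshold(z, t):
    """Shrink z towards zero by t; anything in [-t, t] becomes exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of n rows with p entries each. No intercept is fitted, so the
    columns of X and y are assumed to be (roughly) centred already.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that ignores predictor j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Simulated data (an assumption for the demo): six predictors, of which only
# the first and fourth actually influence the response.
random.seed(0)
n, true_beta = 200, [3.0, 0.0, 0.0, -2.0, 0.0, 0.0]
X = [[random.gauss(0, 1) for _ in true_beta] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0, 1) for row in X]

beta = lasso_cd(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected predictors:", selected)
print("coefficients:", [round(b, 2) for b in beta])
```

The penalty here is large relative to the noise, so the four irrelevant coefficients should come out as exact zeros, while the two real ones survive but are shrunk towards zero. That shrinkage is the price of the L1 penalty; a common follow-up is to refit ordinary least squares on the selected predictors only.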

{ answered by Simon Byrne }