Shrinkage Methods in a model

In the linear regression context, subsetting means choosing a subset of the available variables to include in the model, thus reducing its dimensionality.

Shrinkage, on the other hand, means reducing the size of the coefficient estimates, i.e. shrinking them towards zero. When an estimate is shrunk all the way to exactly zero, the corresponding variable effectively drops out of the model, so shrinkage can also be seen as a kind of subsetting.

Shrinkage and selection aim to improve upon plain least squares regression. It may not be immediately obvious why constraining the coefficient estimates in this way should improve the fit, but it turns out that shrinking them can significantly reduce their variance.

There are two main shrinkage techniques:

Ridge regression

Ridge regression is very similar to least squares, except that the coefficients are estimated by minimizing a slightly different quantity.

In particular, the ridge regression coefficient estimates are the values that minimize

$$\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 + \lambda\sum_{j=1}^{p}\beta_j^2 = \mathrm{RSS} + \lambda\sum_{j=1}^{p}\beta_j^2,$$

where λ ≥ 0 is a tuning parameter, to be determined separately.

As with least squares, ridge regression seeks coefficient estimates that fit the data well, by making the RSS small. However, the second term, $\lambda\sum_{j}\beta_j^2$, called a shrinkage penalty, is small when the coefficients are close to zero, and so it has the effect of shrinking the estimates towards zero.

The tuning parameter λ serves to control the relative impact of these two terms on the regression coefficient estimates. When λ = 0, the penalty term has no effect, and ridge regression will produce the least squares estimates. However, as λ → ∞, the impact of the shrinkage penalty grows, and the ridge regression coefficient estimates will approach zero.
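To make this concrete, here is a minimal sketch of the closed-form ridge solution, assuming the predictors are centered and standardized so the intercept is left unpenalized (the function name ridge_coefficients is just for illustration):

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam * I)^(-1) X'y.

    Assumes the columns of X are centered and standardized, so the
    intercept is handled separately and is not penalized.
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

With lam = 0 this reduces to the ordinary least squares estimate, and larger values of lam pull every coefficient towards zero, exactly as described above.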

Ridge regression’s advantage over least squares is rooted in the bias-variance trade-off. As λ increases, the flexibility of the ridge regression fit decreases, leading to decreased variance but increased bias. At the least squares coefficient estimates, which correspond to ridge regression with λ = 0, the variance is high but there is no bias. But as λ increases, the shrinkage of the ridge coefficient estimates leads to a substantial reduction in the variance of the predictions, at the expense of a slight increase in bias.
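As a quick illustration of this shrinkage (a sketch on synthetic data, using scikit-learn's Ridge, where the tuning parameter λ is exposed as alpha), the overall size of the coefficient vector decreases as the penalty grows:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_beta = np.array([3.0, -2.0, 1.5] + [0.0] * 7)
y = X @ true_beta + rng.normal(size=100)

# The norm of the fitted coefficients shrinks as the penalty grows.
for alpha in [0.01, 1.0, 10.0, 100.0, 1000.0]:
    fit = Ridge(alpha=alpha).fit(X, y)
    print(f"alpha={alpha:7.2f}  ||beta||_2 = {np.linalg.norm(fit.coef_):.3f}")
```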

Lasso regression

The LASSO is a regression method that involves penalizing the absolute size of the regression coefficients.

By penalizing the absolute size of the coefficients, you end up in a situation where some of the parameter estimates may be exactly zero. The larger the penalty applied, the further the estimates are shrunk towards zero.

This is convenient when we want some automatic feature/variable selection, or when dealing with highly correlated predictors, where standard regression will usually produce coefficients that are 'too large'.
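The correlated-predictors point can be seen in a small sketch (synthetic data; scikit-learn's LinearRegression and Lasso, with an illustrative alpha value): ordinary least squares tends to produce large, unstable coefficients on near-duplicate predictors, while the lasso typically keeps one and drops the other.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso

rng = np.random.default_rng(1)
z = rng.normal(size=200)
# Two nearly identical, hence highly correlated, predictors.
X = np.column_stack([z + 0.01 * rng.normal(size=200),
                     z + 0.01 * rng.normal(size=200)])
y = z + rng.normal(scale=0.5, size=200)

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)
print("OLS coefficients:  ", ols.coef_)    # often large, offsetting values
print("Lasso coefficients:", lasso.coef_)  # typically one is exactly zero
```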

The lasso coefficient estimates are the values that minimize

$$\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 + \lambda\sum_{j=1}^{p}|\beta_j| = \mathrm{RSS} + \lambda\sum_{j=1}^{p}|\beta_j|.$$

As with ridge regression, the lasso shrinks the coefficient estimates towards zero. However, in the case of the lasso, the ℓ1 penalty has the effect of forcing some of the coefficient estimates to be exactly equal to zero when the tuning parameter λ is sufficiently large. Hence, much like best subset selection, the lasso performs variable selection. As a result, models generated from the lasso are generally much easier to interpret than those produced by ridge regression.
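A minimal sketch of this selection effect (synthetic data, scikit-learn's Lasso, with λ exposed as alpha): as the penalty grows, more coefficient estimates become exactly zero.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_beta = np.array([3.0, -2.0, 1.5] + [0.0] * 7)
y = X @ true_beta + rng.normal(size=100)

# Counting exact zeros: the lasso's built-in variable selection.
for alpha in [0.01, 0.1, 0.5, 1.0]:
    fit = Lasso(alpha=alpha).fit(X, y)
    n_zero = int(np.sum(fit.coef_ == 0.0))
    print(f"alpha={alpha}: {n_zero} of 10 coefficients exactly zero")
```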
