Multicollinearity in polynomial regression

Multicollinearity arises naturally in polynomial regression. Polynomial regression is a special case of the linear model, but it brings two problems. The first is multicollinearity: the powers \(x, x^2, \ldots, x^k\) tend to be highly correlated with one another. The second is that if the degree \(k\) is large, say more than \(3\) or \(4\), the magnitudes of these powers tend to vary over a rather wide range.
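Both problems are easy to see numerically. The sketch below is only an illustration: the predictor values (the integers 10 through 20) and the helper `pearson` are assumptions made for the example, not part of the reference.

```python
import math

def pearson(u, v):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(ss_u * ss_v)

# Hypothetical predictor values: 10, 11, ..., 20 (an assumption for this sketch).
x = [float(i) for i in range(10, 21)]

# Raw powers x^1 ... x^5, as they would appear as columns of the design matrix.
powers = {k: [xi ** k for xi in x] for k in range(1, 6)}

# Problem 1: the raw powers are almost perfectly correlated.
corr_x_x2 = pearson(powers[1], powers[2])
corr_x_x3 = pearson(powers[1], powers[3])

# Problem 2: the columns live on very different scales.
largest = max(powers[5])   # largest entry of the x^5 column
smallest = min(powers[1])  # smallest entry of the x column

print(corr_x_x2, corr_x_x3)
print(largest, smallest)
```

On this grid the correlation between \(x\) and \(x^2\) is above \(0.99\), and the entries of the \(x^5\) column reach into the millions while \(x\) itself stays between \(10\) and \(20\).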

Two strategies help with these problems. One is to limit the polynomial to a cubic (\(k=3\)) if at all possible, and generally to go no higher than a quintic (\(k=5\)). The other is to center the \(x\)-variable, that is, to fit the model \[ y=\beta_0^*+\beta_1^*(x-\bar{x})+\cdots+\beta_k^*(x-\bar{x})^k \] where \(\bar{x}\) is the sample mean of \(x\). In addition to centering, one could also scale the \(x\)-variable by dividing by its standard deviation \(s_x\), thus standardizing it to have zero mean and unit standard deviation.
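A minimal sketch of centering and standardizing, again under assumptions: the predictor values (integers 10 through 20) and the helper `pearson` are hypothetical and chosen only for illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(ss_u * ss_v)

# Hypothetical predictor values: 10, 11, ..., 20 (an assumption for this sketch).
x = [float(i) for i in range(10, 21)]
n = len(x)

# Center: subtract the sample mean x-bar.
x_bar = sum(x) / n
centered = [xi - x_bar for xi in x]

# Scale: divide by the sample standard deviation s_x (n - 1 in the denominator).
s_x = math.sqrt(sum(c ** 2 for c in centered) / (n - 1))
z = [c / s_x for c in centered]

# Correlation between the linear and quadratic terms, before and after centering.
raw_corr = pearson(x, [xi ** 2 for xi in x])
centered_corr = pearson(centered, [c ** 2 for c in centered])

# Check that the standardized variable has zero mean and unit standard deviation.
z_mean = sum(z) / n
z_sd = math.sqrt(sum((zi - z_mean) ** 2 for zi in z) / (n - 1))

print(raw_corr, centered_corr)
print(z_mean, z_sd)
```

Because this grid is symmetric about its mean, centering removes the correlation between the linear and quadratic terms entirely. With skewed data the correlation is typically reduced rather than eliminated, and terms of the same parity, such as \((x-\bar{x})\) and \((x-\bar{x})^3\), remain correlated even after centering.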

Reference

  • Statistics and Data Analysis: From Elementary to Intermediate (page 418)