Linear regression Part

Posted on 2022-04-21 Edited on 2022-11-07 In statistics

Prove that the mean of the predicted values in OLS regression equals the mean of the original values

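In OLS estimation, each response \(y_i\) decomposes into a fitted value and a residual: \[ y_i=\hat{y}_i+e_i \]
When the model includes an intercept, the normal equations \(X^\top e=0\) make the residual vector orthogonal to every column of the design matrix, including the column of ones, so \[ \sum_{i=1}^n e_i=n\overline{e}=0 \]
This is an algebraic property of the least-squares fit. It does not rely on the usual assumption that the errors follow a normal distribution \(N(0,\sigma^2)\), and it can fail when the intercept is omitted. Thereby, we have \[ \sum_{i=1}^n y_i=\sum_{i=1}^n(\hat{y}_i+e_i)=\sum_{i=1}^n\hat{y}_i \]
and dividing both sides by \(n\) gives \(\overline{y}=\overline{\hat{y}}\). The R session below confirms this on the built-in iris data.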

> iris_model <- lm(Petal.Width~Sepal.Length+Sepal.Width+Petal.Length,data = iris)
> summary(iris_model)

Call:
lm(formula = Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, 
    data = iris)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.60959 -0.10134 -0.01089  0.09825  0.60685 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -0.24031    0.17837  -1.347     0.18    
Sepal.Length -0.20727    0.04751  -4.363 2.41e-05 ***
Sepal.Width   0.22283    0.04894   4.553 1.10e-05 ***
Petal.Length  0.52408    0.02449  21.399  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.192 on 146 degrees of freedom
Multiple R-squared:  0.9379,	Adjusted R-squared:  0.9366 
F-statistic: 734.4 on 3 and 146 DF,  p-value: < 2.2e-16

> mean(iris_model$fitted.values)
[1] 1.199333
> mean(iris$Petal.Width)
[1] 1.199333
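
The equality above rests on the residuals summing to zero, which in turn needs the intercept. The following is a minimal sketch of that check, assuming the same iris model as in the session above; the variable names (`fit`, `fit_no_int`, `sum_resid`, and so on) are only illustrative and not part of the original post.

```r
# Refit the model from the session above (iris ships with base R).
fit <- lm(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length, data = iris)

# With an intercept, the residuals sum to zero up to floating-point error.
sum_resid <- sum(residuals(fit))

# Consequently the mean of the fitted values matches the mean of the response.
mean_gap <- mean(fitted(fit)) - mean(iris$Petal.Width)

# Dropping the intercept ("- 1") removes the column of ones from the design
# matrix, so nothing forces the residuals to sum to zero any more.
fit_no_int <- lm(Petal.Width ~ Sepal.Length + Sepal.Width + Petal.Length - 1,
                 data = iris)
sum_resid_no_int <- sum(residuals(fit_no_int))

print(c(sum_resid = sum_resid,
        mean_gap = mean_gap,
        sum_resid_no_int = sum_resid_no_int))
```

The first two numbers should be zero up to rounding error, while the third generally is not, which suggests that the intercept, rather than any distributional assumption, is what drives the result.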

Proof that the mean of predicted values in OLS regression is equal to the mean of original values?
