Prove $\bar{X}$ and $X_i-\bar{X}$ are independent if the $X_i$'s are normally distributed

To prove that \(\bar{X}\) and \(X_i-\bar{X}\) are independent when the \(X_i\)'s are normally distributed (they are always uncorrelated, but for normal samples this upgrades to independence), two things need to be established:

  • Prove that, given a joint MGF \(M_{X,Y}(s,t)\), the factorization \(M_{X,Y}(s,t)=M_X(s)\cdot M_Y(t)\) is a sufficient condition for the independence of \(X\) and \(Y\)
  • Prove that the MGF of \((\bar{X},X_i-\bar{X})\) can be expressed as the product of the MGF of \(\bar{X}\) and the MGF of \(X_i-\bar{X}\)

It's worth mentioning the uniqueness theorem of MGFs:

Theorem 1:

Suppose there exists \(r>0\) such that \(M_{\mathbf{X}}(\mathbf{t})<\infty\) for all \(\mathbf{t}\in\mathbb{R}^n\) with \(\|\mathbf{t}\|\leq r\). Then the distribution of \(\mathbf{X}\) is determined uniquely by the function \(M_{\mathbf{X}}\). That is, if \(\mathbf{Y}\) is any random vector whose MGF is the same as \(M_{\mathbf{X}}\), then \(\mathbf{Y}\) has the same distribution as \(\mathbf{X}\).
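
For example, in the univariate case the MGF of a standard normal random variable is finite for every \(t\): \[ M_X(t)=Ee^{tX}=\int_{-\infty}^{\infty}e^{tx}\,\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\,dx=e^{t^2/2}, \] so by Theorem 1 any random variable whose MGF equals \(e^{t^2/2}\) for all \(t\) must be distributed as \(N(0,1)\).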

Let \(\mathbf{X}\) be a random \(n\)-vector with an MGF that is finite in an open neighborhood of the origin \(\mathbf{0}\in \mathbb{R}^n\), and fix \(r\in\{1,\cdots,n\}\). Let \(\tilde{\mathbf{X}}\) denote an independent copy of \(\mathbf{X}\). Define a new random vector \(\mathbf{Y}\) as follows: \[ \mathbf{Y}:=\begin{bmatrix}X_1\\ \vdots \\ X_r\\ \tilde{X}_{r+1}\\ \vdots\\ \tilde{X}_n\end{bmatrix} \] Then, by the independence of \(\mathbf{X}\) and \(\tilde{\mathbf{X}}\), \[ \begin{gather*} M_{\mathbf{Y}}(\mathbf{t})=Ee^{\sum_{i=1}^rt_iX_i}\cdot Ee^{\sum_{j=r+1}^nt_j\tilde{X}_j}\\ =M_{\mathbf{X}}(t_1,\cdots,t_r,0,\cdots,0)\cdot M_{\mathbf{X}}(0,\cdots,0,t_{r+1},\cdots,t_n). \end{gather*} \] Now suppose that \[ M_{\mathbf{X}}(\mathbf{t})=M_{\mathbf{X}}(t_1,\cdots,t_r,0,\cdots,0)\cdot M_{\mathbf{X}}(0,\cdots,0,t_{r+1},\cdots,t_n) \] for all \(\mathbf{t}\in \mathbb{R}^n\).

Then \(\mathbf{X}\) and \(\mathbf{Y}\) have the same MGF, so by Theorem 1 they have the same distribution. That is, for all sets \(A_1,\cdots,A_n\), \[ P\{X_1\in A_1,\cdots,X_n\in A_n\}=P\{Y_1\in A_1,\cdots,Y_n\in A_n\}, \] which is, by construction, equal to \[ P\{X_1\in A_1,\cdots,X_r\in A_r\}\cdot P\{\tilde{X}_{r+1}\in A_{r+1},\cdots,\tilde{X}_n\in A_n\}. \] Since \(\tilde{\mathbf{X}}\) has the same distribution as \(\mathbf{X}\), this proves that \[ P\{X_1\in A_1,\cdots, X_n\in A_n\}=P\{X_1\in A_1,\cdots,X_r\in A_r\}\cdot P\{X_{r+1}\in A_{r+1},\cdots,X_n\in A_n\}, \] which implies that \((X_1,\cdots,X_r)\) and \((X_{r+1},\cdots,X_n)\) are independent.
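
As a concrete instance of this argument with \(n=2\) and \(r=1\): if \(\mathbf{X}=(X_1,X_2)\) is bivariate normal with mean \(\mathbf{0}\) and covariance matrix \(\operatorname{diag}(\sigma_1^2,\sigma_2^2)\), then \[ M_{\mathbf{X}}(\mathbf{t})=\exp\Bigl\{\tfrac{1}{2}\bigl(\sigma_1^2t_1^2+\sigma_2^2t_2^2\bigr)\Bigr\}=M_{\mathbf{X}}(t_1,0)\cdot M_{\mathbf{X}}(0,t_2), \] so the factorization hypothesis holds and \(X_1\) and \(X_2\) are independent. This is exactly the situation we will reach for \((\bar{X},X_i-\bar{X})\) at the end of the post.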

This argument establishes the second theorem:

Theorem 2:

(Independence theorem of MGFs). Let \(\mathbf{X}\) be a random \(n\)-vector with an MGF that is finite in an open neighborhood of the origin \(\mathbf{0}\in \mathbb{R}^n\). Suppose there exists \(r\in\{1,\cdots,n\}\) such that \[ M_{\mathbf{X}}(\mathbf{t})=M_{\mathbf{X}}(t_1,\cdots,t_r,0,\cdots,0)\cdot M_{\mathbf{X}}(0,\cdots,0,t_{r+1},\cdots,t_n) \] for all \(\mathbf{t}\in\mathbb{R}^n\). Then \((X_1,\cdots,X_r)\) and \((X_{r+1},\cdots,X_n)\) are independent.

We have now proved the first statement:

Given a joint MGF \(M_{X,Y}(s,t)\), the factorization \(M_{X,Y}(s,t)=M_X(s)\cdot M_Y(t)\) is a sufficient condition for the independence of \(X\) and \(Y\).

Next, we prove the second statement by using the first.

The proof is given by the answer to my posted question, Prove \(\bar{X}\) and \(X_i-\bar{X}\) are independent if the \(X_i\)'s are independently normally distributed.

I reproduce this elegant answer here, with thanks to Damian Pavlyshyn.

(Note: In this answer I'll only consider standard normal \(X_i\)s, since it's no harder to do when they have other means and variances)

One way of doing this is with moment generating functions. For a bivariate random variable \(Z = (Z_1, Z_2)\), \(Z_1\) and \(Z_2\) are independent if and only if the moment generating function \(M_Z(\mathbf{s}) = \mathbf{E} e^{\mathbf{s}^T Z}\) factors into a product of functions of \(s_1\) and \(s_2\) only, where \(\mathbf{s} = (s_1, s_2)\).

Now, the thing to notice is that \((\bar{X}, X_i - \bar{X})\) is a linear transformation of the whole sample \(X = (X_1, \dotsc, X_n)\). Namely, \[ \begin{pmatrix}\bar{X} \\ X_i - \bar{X}\end{pmatrix} = \begin{pmatrix}\mathbf{1}^T/n \\ e_i^T - \mathbf{1}^T/n\end{pmatrix} X := H X \] where \(\mathbf{1}\) is a vector of all \(1\)s and \(e_i\) is the vector of all \(0\)s with a \(1\) in the \(i\)th entry.
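
To make this concrete, here is a minimal NumPy sketch (the sample size \(n\), index \(i\), seed, and variable names are illustrative choices, not part of the original answer) that builds \(H\) for a small sample and checks that \(HX\) really returns \((\bar{X}, X_i-\bar{X})\):

```python
import numpy as np

# Minimal check that (X̄, X_i - X̄) = H X for a small sample.
# n, i, and the seed are illustrative choices.
n, i = 5, 2                      # sample size and the index i (0-based here)
rng = np.random.default_rng(0)
X = rng.standard_normal(n)       # iid standard normal sample

ones = np.ones(n)
e_i = np.zeros(n)
e_i[i] = 1.0
H = np.vstack([ones / n, e_i - ones / n])    # the 2 x n matrix H from above

print(H @ X)                                 # [X̄, X_i - X̄]
print(X.mean(), X[i] - X.mean())             # should agree with the line above
```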

We can use this to write down the MGF: \[ \begin{align*} \mathbf{E} \exp\biggl\{\mathbf{s}^T \begin{pmatrix}\bar{X} \\ X_i - \bar{X}\end{pmatrix}\biggr\} = \mathbf{E} \exp\{\mathbf{s}^T H X\} = \mathbf{E} \exp\{(H^T\mathbf{s})^T X\} \end{align*} \] Then we can use the known moment generating function of a vector of iid normals (\(\mathbf{E} e^{\mathbf{t}^T X} = \exp\{\frac{1}{2}\mathbf{t}^T \mathbf{t}\}\)) to conclude that \[ \mathbf{E} \exp\{(H^T\mathbf{s})^T X\} = \exp\{\frac{1}{2}(H^T\mathbf{s})^T(H^T\mathbf{s})\} = \exp\{\frac{1}{2}\mathbf{s}^T HH^T \mathbf{s}\}. \]

Now, we can multiply \[ \begin{gather*} HH^T = \begin{pmatrix}\mathbf{1}^T/n \\ e_i^T - \mathbf{1}^T/n\end{pmatrix} \begin{pmatrix}\mathbf{1}/n & e_i - \mathbf{1}/n\end{pmatrix} \\ = \begin{pmatrix} (\mathbf{1}^T/n)(\mathbf{1}/n) & (\mathbf{1}^T/n)(e_i - \mathbf{1}/n) \\ (e_i - \mathbf{1}/n)^T \mathbf{1}/n & (e_i - \mathbf{1}/n)^T(e_i - \mathbf{1}/n) \end{pmatrix} \\ = \begin{pmatrix} 1/n & 0 \\ 0 & 1 - 1/n \end{pmatrix} \end{gather*} \]
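
Continuing the NumPy sketch above, the product \(HH^T\) can also be confirmed numerically; with the illustrative choice \(n=5\) it should come out as \(\operatorname{diag}(0.2,\,0.8)\):

```python
# Continuing the sketch above: HH^T should equal diag(1/n, 1 - 1/n).
print(H @ H.T)
# Expected, up to floating-point error (n = 5):
# [[0.2 0. ]
#  [0.  0.8]]
```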

so that the MGF becomes \[ \exp\biggl\{\frac{1}{2}\mathbf{s}^T HH^T \mathbf{s}\biggr\} = \exp\biggl\{\frac{1}{2}\biggl(\frac{1}{n}s_1^2 + \frac{n-1}{n}s_2^2\biggr)\biggr\} = e^{\frac{1}{2n}s_1^2} e^{\frac{n-1}{2n} s_2^2}. \]

Since this factors into terms containing only \(s_1\) and \(s_2\) respectively, we conclude that \(\bar{X}\) and \(X_i - \bar{X}\) are independent (and also that they have \(N(0, 1/n)\) and \(N(0, 1 - 1/n)\) distributions respectively).
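
Finally, as an optional empirical sanity check (a simulation sketch, not part of the proof; the values of n, i, reps, and the seed are arbitrary), one can draw many samples and verify that the empirical covariance matrix of \((\bar{X}, X_i-\bar{X})\) is close to \(\operatorname{diag}(1/n,\,1-1/n)\):

```python
import numpy as np

# Simulation sketch: the empirical covariance matrix of (X̄, X_i - X̄)
# should be close to diag(1/n, 1 - 1/n). n, i, reps are illustrative.
n, i, reps = 5, 2, 200_000
rng = np.random.default_rng(1)
X = rng.standard_normal((reps, n))   # each row is one iid N(0,1) sample

xbar = X.mean(axis=1)
resid = X[:, i] - xbar

print(np.cov(xbar, resid))
# Expected (approximately): [[0.2, 0.0], [0.0, 0.8]]
```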

Reference