Proof of Unbiasedness of the OLS Estimator: E(β̂) = β

Master the rigorous proof of OLS estimator unbiasedness, \( E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta} \). Understand critical assumptions, geometric intuition, and common pitfalls for robust linear modeling.

The Formal Theorem

Given the multiple linear regression model \( \mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon} \), where \( \mathbf{y} \) is an \( n \times 1 \) vector of observations, \( \mathbf{X} \) is an \( n \times k \) design matrix of full column rank, \( \boldsymbol{\beta} \) is a \( k \times 1 \) vector of unknown parameters, and \( \boldsymbol{\epsilon} \) is an \( n \times 1 \) vector of random errors satisfying the Gauss-Markov assumption \( E(\boldsymbol{\epsilon} \mid \mathbf{X}) = \mathbf{0} \), the Ordinary Least Squares (OLS) estimator \( \hat{\boldsymbol{\beta}} \), defined as \( \hat{\boldsymbol{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} \), is an unbiased estimator of \( \boldsymbol{\beta} \), formally stated as:
\[ E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta} \]
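
The statement follows in three short steps from the definitions in the theorem. What follows is a sketch of the standard textbook argument rather than a reproduction of any particular source: substitute the model into the estimator, take the expectation conditional on \( \mathbf{X} \), then apply the law of iterated expectations. The only fact used beyond the stated assumptions is \( (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{X} = \mathbf{I}_k \), which holds because \( \mathbf{X} \) has full column rank.

```latex
\begin{align*}
\hat{\boldsymbol{\beta}}
  &= (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} \\
  &= (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T
     (\mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}) \\
  &= \boldsymbol{\beta}
     + (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \boldsymbol{\epsilon}, \\[4pt]
E(\hat{\boldsymbol{\beta}} \mid \mathbf{X})
  &= \boldsymbol{\beta}
     + (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T
       E(\boldsymbol{\epsilon} \mid \mathbf{X})
   = \boldsymbol{\beta}, \\[4pt]
E(\hat{\boldsymbol{\beta}})
  &= E\big( E(\hat{\boldsymbol{\beta}} \mid \mathbf{X}) \big)
   = \boldsymbol{\beta}.
\end{align*}
```

In the second block, \( (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \) can be pulled outside the conditional expectation because it is a function of \( \mathbf{X} \) alone.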

Analytical Intuition.

Imagine being a master marksman aiming for a luminous bullseye that represents the true, unknown population parameter \( \boldsymbol{\beta} \). Each time you fire, you collect a sample and calculate an Ordinary Least Squares estimate, \( \hat{\boldsymbol{\beta}} \). Your hand might shake slightly (representing the random error term \( \boldsymbol{\epsilon} \)), causing each individual bullet to land slightly off the bullseye, yet the unbiasedness property guarantees something profound: the *expected* landing point is exactly the center of the bullseye, so the average over more and more shots settles on it. The inherent randomness \( \boldsymbol{\epsilon} \) doesn't systematically push your aim high or low, left or right; it introduces deviations that average to zero around the true target (they need not be symmetric, only zero-mean given the regressors). Therefore, our OLS estimator \( \hat{\boldsymbol{\beta}} \) is a tool that, on average, provides an accurate reading of the underlying reality, ensuring \( E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta} \).
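
The marksman picture can be checked numerically. The following is a minimal Monte Carlo sketch, assuming NumPy is available; the function name `average_ols_estimate`, the parameter values in `beta`, the sample size, and the standard normal errors are illustrative choices and not part of the theorem. It holds one design matrix fixed, fires many "shots" by redrawing only the errors, and averages the resulting OLS estimates.

```python
import numpy as np

def average_ols_estimate(n=200, reps=5000, seed=0):
    """Average OLS estimates over many error draws with X held fixed."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 2.0, -0.5])          # hypothetical true parameters
    # Design matrix: an intercept column plus two regressors, drawn once.
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    # Each column of E is one independent draw of the zero-mean error vector.
    E = rng.normal(size=(n, reps))
    Y = (X @ beta)[:, None] + E                # one simulated sample per column
    # lstsq accepts a matrix right-hand side and solves every column at once.
    B = np.linalg.lstsq(X, Y, rcond=None)[0]   # shape (3, reps)
    return beta, B.mean(axis=1), B.std(axis=1)

beta, mean_estimate, spread = average_ols_estimate()
print("true beta:          ", beta)
print("mean of estimates:  ", mean_estimate)
print("spread of estimates:", spread)
```

Individual estimates scatter around the target (the spread is clearly nonzero), while their average should sit very close to `beta`. A simulation can only illustrate the result; the guarantee itself comes from the algebra of the proof.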

Institutional Warning.

Students often confuse \( E(\boldsymbol{\epsilon}) = \mathbf{0} \) with the critical \( E(\boldsymbol{\epsilon} \mid \mathbf{X}) = \mathbf{0} \). The latter, strict exogeneity, requires every error to have zero mean given the regressors of *all* observations; it implies the errors are uncorrelated with every regressor and guarantees finite-sample unbiasedness. The former, merely a zero unconditional mean, is insufficient whenever \( \mathbf{X} \) is correlated with \( \boldsymbol{\epsilon} \).
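
A one-regressor sketch makes the gap concrete. The scalars \( x_i \), \( u_i \), and \( \gamma \) are illustrative assumptions introduced only for this example: take \( E(x_i) = 0 \) and \( E(u_i \mid x_i) = 0 \), and let the error be \( \epsilon_i = \gamma x_i + u_i \).

```latex
\begin{align*}
E(\epsilon_i) &= \gamma \, E(x_i) + E(u_i) = 0, \\
E(\epsilon_i \mid x_i) &= \gamma x_i + E(u_i \mid x_i) = \gamma x_i
  \neq 0 \quad \text{whenever } \gamma \neq 0 \text{ and } x_i \neq 0.
\end{align*}
```

The unconditional mean is zero, yet the conditional mean moves with the regressor. In the regression through the origin \( y_i = \beta x_i + \epsilon_i = (\beta + \gamma) x_i + u_i \), OLS is therefore centered on \( \beta + \gamma \) rather than on \( \beta \).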

Academic Inquiries.

01

Does the unbiasedness property hold if the regressors X \mathbf{X} are stochastic (random variables) rather than fixed?

Yes, provided the strict exogeneity assumption \( E(\boldsymbol{\epsilon} \mid \mathbf{X}) = \mathbf{0} \) holds. Conditioning on \( \mathbf{X} \) accommodates stochastic regressors: for every realization of \( \mathbf{X} \) the errors have conditional mean zero, so \( E(\hat{\boldsymbol{\beta}} \mid \mathbf{X}) = \boldsymbol{\beta} \), and the law of iterated expectations then yields \( E(\hat{\boldsymbol{\beta}}) = E\big( E(\hat{\boldsymbol{\beta}} \mid \mathbf{X}) \big) = \boldsymbol{\beta} \).

02

Is an unbiased estimator always preferred over a biased one?

Not necessarily. Unbiasedness is desirable, but it says nothing about an estimator's variance. Mean Squared Error (MSE) equals variance plus squared bias, so a slightly biased estimator with much lower variance can have a smaller MSE and be preferred on that criterion. This is known as the bias-variance trade-off, crucial in advanced estimation theory.
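
The trade-off can be seen in a small simulation. The sketch below uses ridge regression purely as a familiar example of a biased estimator; the text above does not name one, and the function name `compare_ols_ridge`, the penalty `lam`, the near-collinear design, and the parameter values are all illustrative assumptions. It compares the bias and total MSE of OLS and ridge over repeated error draws.

```python
import numpy as np

def compare_ols_ridge(n=50, reps=2000, lam=1.0, seed=0):
    """Monte Carlo bias and total MSE of OLS versus ridge on a fixed design."""
    rng = np.random.default_rng(seed)
    beta = np.array([2.0, 0.0])                 # hypothetical true parameters
    # Two nearly collinear regressors make the OLS variance large.
    x1 = rng.normal(size=n)
    x2 = x1 + 0.1 * rng.normal(size=n)
    X = np.column_stack([x1, x2])
    XtX = X.T @ X
    # Each column of Y is one simulated sample with fresh zero-mean errors.
    Y = (X @ beta)[:, None] + rng.normal(size=(n, reps))
    XtY = X.T @ Y
    ols = np.linalg.solve(XtX, XtY)                       # shape (2, reps)
    ridge = np.linalg.solve(XtX + lam * np.eye(2), XtY)   # shape (2, reps)

    def summarize(estimates):
        err = estimates - beta[:, None]
        return {
            "bias": err.mean(axis=1),                  # average error per coefficient
            "mse": (err ** 2).sum(axis=0).mean(),      # average squared distance to beta
        }

    return summarize(ols), summarize(ridge)

ols_stats, ridge_stats = compare_ols_ridge()
print("OLS  :", ols_stats)
print("ridge:", ridge_stats)
```

Under this particular design the OLS bias should be near zero while its MSE is large, and ridge should show a visible bias but a smaller MSE. The ranking is not universal: it depends on the design, the true parameters, and the penalty.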

03

What happens to unbiasedness if there is perfect multicollinearity among the regressors?

Perfect multicollinearity means the design matrix \( \mathbf{X} \) does not have full column rank, rendering \( \mathbf{X}^T \mathbf{X} \) singular. Consequently, its inverse \( (\mathbf{X}^T \mathbf{X})^{-1} \) does not exist, and the OLS estimator \( \hat{\boldsymbol{\beta}} \) cannot be uniquely computed, thus making the concept of its unbiasedness moot.
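
In practice the rank condition can be checked before estimating anything. A minimal sketch, assuming NumPy; the helper name `has_full_column_rank` and the two example matrices are hypothetical and exist only for illustration.

```python
import numpy as np

def has_full_column_rank(X):
    """Return True when the columns of X are linearly independent."""
    return bool(np.linalg.matrix_rank(X) == X.shape[1])

rng = np.random.default_rng(0)
x1 = rng.normal(size=20)

# Intercept plus two unrelated regressors: full column rank.
X_ok = np.column_stack([np.ones(20), x1, rng.normal(size=20)])

# Third column is exactly twice the second: perfect multicollinearity.
X_bad = np.column_stack([np.ones(20), x1, 2.0 * x1])

print(has_full_column_rank(X_ok))
print(has_full_column_rank(X_bad))
```

Note that `np.linalg.lstsq` still returns a coefficient vector for a rank-deficient matrix (the minimum-norm least-squares solution), so the absence of an error message does not mean the individual coefficients are identified.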

04

How does omitted variable bias specifically break the unbiasedness of OLS?

Omitted variable bias occurs when a relevant variable, correlated with both an included regressor and the dependent variable, is left out of the model. Its effect is absorbed into the error term \( \boldsymbol{\epsilon} \), making \( \boldsymbol{\epsilon} \) correlated with the included \( \mathbf{X} \). This violates \( E(\boldsymbol{\epsilon} \mid \mathbf{X}) = \mathbf{0} \), causing \( E\big( (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \boldsymbol{\epsilon} \big) \) to be non-zero, leading to a biased \( \hat{\boldsymbol{\beta}} \).
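
A short simulation shows the mechanism at work. This is a sketch under stated assumptions rather than a general result: the data-generating process \( y = \beta_1 x + \delta z + u \) with \( z = \rho x + v \), the values of `beta1`, `delta`, and `rho`, and the function name `simulate_omitted_variable` are all hypothetical. Every variable has mean zero, so the regressions are run without an intercept. Under this setup the regression that omits \( z \) is centered on \( \beta_1 + \delta \rho \) instead of \( \beta_1 \).

```python
import numpy as np

def simulate_omitted_variable(n=100, reps=2000, seed=0):
    """Compare the slope on x with and without the relevant variable z."""
    rng = np.random.default_rng(seed)
    beta1, delta, rho = 1.0, 2.0, 0.5        # hypothetical parameters
    short = np.empty(reps)                   # slope on x when z is omitted
    full = np.empty(reps)                    # slope on x when z is included
    for r in range(reps):
        x = rng.normal(size=n)
        z = rho * x + rng.normal(size=n)     # z is correlated with x
        y = beta1 * x + delta * z + rng.normal(size=n)
        # Short regression: z's effect is pushed into the error term.
        short[r] = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]
        # Full regression: both relevant variables are included.
        X_full = np.column_stack([x, z])
        full[r] = np.linalg.lstsq(X_full, y, rcond=None)[0][0]
    return short.mean(), full.mean(), beta1, beta1 + delta * rho

short_mean, full_mean, beta1, biased_target = simulate_omitted_variable()
print("true beta1:              ", beta1)
print("mean slope, z included:  ", full_mean)
print("mean slope, z omitted:   ", short_mean)
print("beta1 + delta * rho:     ", biased_target)
```

The average slope from the full regression should sit near `beta1`, while the average from the short regression should sit near `beta1 + delta * rho`. The gap shrinks to zero if either `delta` or `rho` is zero, which matches the requirement that the omitted variable be both relevant and correlated with the included regressor.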

Standardized References.

  • Wooldridge, J. M. (2019). Introductory Econometrics: A Modern Approach (7th ed.). Cengage Learning.

Institutional Citation

Reference this proof in your academic research or publications.

NICEFA Visual Mathematics. (2026). Proof of Unbiasedness of the OLS Estimator: E(β̂) = β: Visual Proof & Intuition. Retrieved from https://www.nicefa.org/library/general-linear-models-/proof-of-unbiasedness-of-the-ols-estimator--e--------
