The t-statistic for Individual Regression Coefficients: Derivation and its Distribution

Master the derivation and distribution of the t-statistic in general linear models. Explore the geometry, the role of variance estimation, and the convergence of the t-distribution to the standard Normal.

The Formal Theorem

Consider the linear model \( Y = X\beta + \epsilon \), where \( Y \in \mathbb{R}^n \), \( X \in \mathbb{R}^{n \times p} \) is a full-rank matrix, and \( \epsilon \sim N(0, \sigma^2 I_n) \). The ordinary least squares estimator \( \hat{\beta} \) satisfies \( \hat{\beta} \sim N(\beta, \sigma^2 (X^T X)^{-1}) \). Let \( \hat{\sigma}^2 = \frac{e^T e}{n-p} \) be the unbiased estimator of \( \sigma^2 \), where \( e = Y - X\hat{\beta} \). For any component \( \hat{\beta}_j \), the statistic
\[ t = \frac{\hat{\beta}_j - \beta_j}{\sqrt{\hat{\sigma}^2 [(X^T X)^{-1}]_{jj}}} \sim t_{n-p} \]
follows a Student's t-distribution with \( n-p \) degrees of freedom.
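
The statistic in the theorem can be computed directly from its definition. The following is a minimal sketch, assuming NumPy is available; the function name `ols_t_statistics`, the `beta_null` argument, and the simulated data are illustrative choices that do not come from the source.

```python
import numpy as np

def ols_t_statistics(X, y, beta_null=None):
    """Return (beta_hat, sigma2_hat, t) for the model y = X beta + eps.

    beta_null holds the hypothesized coefficient values (default: all zeros).
    """
    n, p = X.shape
    if beta_null is None:
        beta_null = np.zeros(p)                  # H0: beta_j = 0 for every j
    xtx_inv = np.linalg.inv(X.T @ X)             # (X^T X)^{-1}, needs full column rank
    beta_hat = xtx_inv @ X.T @ y                 # OLS estimator
    resid = y - X @ beta_hat                     # e = Y - X beta_hat
    sigma2_hat = resid @ resid / (n - p)         # unbiased estimator of sigma^2
    se = np.sqrt(sigma2_hat * np.diag(xtx_inv))  # denominator of the t-statistic
    t = (beta_hat - beta_null) / se
    return beta_hat, sigma2_hat, t

# Simulated data (assumed for illustration): intercept plus two regressors,
# where the last true coefficient is zero.
rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, 0.0])
y = X @ beta_true + rng.normal(scale=1.5, size=n)

beta_hat, sigma2_hat, t_stats = ols_t_statistics(X, y)
print("beta_hat:", beta_hat)
print("sigma2_hat:", sigma2_hat)
print("t-statistics under H0 beta_j = 0:", t_stats)
```

Each entry of `t_stats` would be compared against a \( t_{n-p} \) reference distribution, here with \( n - p = 47 \) degrees of freedom. The explicit matrix inverse mirrors the formula in the theorem; for ill-conditioned designs a QR or least-squares solver is usually the more stable choice.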

Analytical Intuition.

In the vast multidimensional space of our data, \( \hat{\beta} \) is our best estimate of the truth \( \beta \), but it is inherently noisy. Imagine peering through a lens that vibrates due to the underlying variance \( \sigma^2 \). To confirm whether a specific variable \( X_j \) actually influences the outcome \( Y \), we must quantify how far our estimate \( \hat{\beta}_j \) deviates from the value claimed by a null hypothesis (typically \( \beta_j = 0 \)) relative to the 'noise' we perceive. The numerator \( \hat{\beta}_j - \beta_j \) captures the signal deviation, while the denominator acts as a scaling factor, normalizing this deviation by the estimated uncertainty. By dividing a standard Gaussian variable by the square root of an independent \( \chi^2 \) variable scaled by its degrees of freedom, we transition from the rigid world of the Normal distribution to the fatter-tailed Student's t-distribution. This reflects the reality that our estimate of the noise \( \hat{\sigma}^2 \) is itself uncertain, requiring us to be more conservative in our claims of statistical significance.
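
The ratio described above can be written out in a few lines. This is a sketch of the standard argument under the theorem's assumptions; the symbols \( Z \), \( V \), \( v_{jj} \) and \( H \) are shorthand introduced here and do not appear in the source.

\[
\begin{aligned}
v_{jj} &= [(X^T X)^{-1}]_{jj}, \qquad H = X (X^T X)^{-1} X^T, \\
Z &= \frac{\hat{\beta}_j - \beta_j}{\sigma \sqrt{v_{jj}}} \sim N(0, 1), \\
V &= \frac{(n-p)\,\hat{\sigma}^2}{\sigma^2} = \frac{e^T e}{\sigma^2} \sim \chi^2_{n-p}, \\
\operatorname{Cov}(\hat{\beta}, e) &= \sigma^2 (X^T X)^{-1} X^T (I_n - H) = 0 \;\Longrightarrow\; Z \text{ and } V \text{ are independent}, \\
t &= \frac{Z}{\sqrt{V/(n-p)}} = \frac{\hat{\beta}_j - \beta_j}{\sqrt{\hat{\sigma}^2\, v_{jj}}} \sim t_{n-p}.
\end{aligned}
\]

The unknown \( \sigma \) cancels between \( Z \) and \( \sqrt{V/(n-p)} \), which is why the statistic can be computed from the data alone. Independence follows because \( \hat{\beta} \) and \( e \) are jointly Gaussian and uncorrelated.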

Institutional Warning.

Students frequently confuse the standard error of the coefficient \( \text{SE}(\hat{\beta}_j) \) with the residual standard error \( \hat{\sigma} \). The former, \( \text{SE}(\hat{\beta}_j) = \hat{\sigma} \sqrt{[(X^T X)^{-1}]_{jj}} \), depends on the design matrix and therefore differs from one coefficient to the next, while the latter is a single estimate of the global noise level of the model.
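
The distinction can be seen numerically. The sketch below, assuming NumPy and an illustrative simple regression (the helper `residual_se_and_coef_se` and the data are hypothetical), rescales the regressor by a factor of 10: the coefficient standard error shrinks by the same factor, while the residual standard error does not move.

```python
import numpy as np

def residual_se_and_coef_se(X, y):
    """Return (sigma_hat, coef_se): one global noise estimate, one SE per coefficient."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma_hat = np.sqrt(resid @ resid / (n - p))       # residual standard error
    coef_se = sigma_hat * np.sqrt(np.diag(xtx_inv))    # SE of each beta_hat_j
    return sigma_hat, coef_se

# Illustrative data: y depends linearly on x with Gaussian noise.
rng = np.random.default_rng(1)
x = np.arange(10, dtype=float)
y = 2.0 + 0.5 * x + rng.normal(size=x.size)

X = np.column_stack([np.ones_like(x), x])
sigma_hat, coef_se = residual_se_and_coef_se(X, y)

# Same fit, but the regressor is expressed in units ten times smaller.
X_scaled = np.column_stack([np.ones_like(x), 10.0 * x])
sigma_hat_scaled, coef_se_scaled = residual_se_and_coef_se(X_scaled, y)

print("residual SE:", sigma_hat, "->", sigma_hat_scaled)
print("slope SE:   ", coef_se[1], "->", coef_se_scaled[1])
```

In simple regression the slope's standard error reduces to \( \hat{\sigma} / \sqrt{\sum_i (x_i - \bar{x})^2} \), so it depends on how spread out the regressor is, not only on the noise.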

Academic Inquiries.

01

Why is the t-distribution used instead of the Normal distribution?

Because \( \sigma^2 \) is unknown, we must estimate it using \( \hat{\sigma}^2 \). Replacing the fixed \( \sigma^2 \) with a random estimate adds variability to the denominator of the statistic, and this extra uncertainty produces the heavier tails of the t-distribution.

02

What happens as \( n-p \to \infty \)?

By the Law of Large Numbers, \( \hat{\sigma}^2 \to \sigma^2 \) in probability, so the extra variability in the denominator vanishes. The t-distribution converges to the Standard Normal distribution \( N(0,1) \).
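
This convergence can be checked by simulation. The sketch below, assuming NumPy, builds t-distributed draws as a standard Normal divided by the square root of an independent \( \chi^2 \) variable over its degrees of freedom, then estimates how often \( |t| \) exceeds 1.96, the two-sided 5% cutoff of \( N(0,1) \). The function name `simulate_t`, the sample size, and the chosen degrees of freedom are illustrative.

```python
import numpy as np

def simulate_t(dof, size, rng):
    """Draw t-distributed values as Z / sqrt(V / dof), with Z ~ N(0,1), V ~ chi2(dof)."""
    z = rng.standard_normal(size)
    v = rng.chisquare(dof, size)
    return z / np.sqrt(v / dof)

rng = np.random.default_rng(42)
tail_prob = {}
for dof in (5, 30, 1000):
    draws = simulate_t(dof, 200_000, rng)
    # Fraction of draws beyond the Normal 5% two-sided cutoff.
    tail_prob[dof] = np.mean(np.abs(draws) > 1.96)
    print(f"dof={dof:5d}  P(|t| > 1.96) ~ {tail_prob[dof]:.4f}")
```

With few degrees of freedom the estimated tail probability sits well above 0.05, and it approaches 0.05 as the degrees of freedom grow, which is the heavier-tail effect shrinking away.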

03

Does the t-test for βj \beta_j depend on other coefficients?

Yes, through the matrix \( (X^T X)^{-1} \). Multicollinearity increases the diagonal elements \( [(X^T X)^{-1}]_{jj} \), thereby inflating the standard error and reducing the magnitude of the t-statistic for a given estimate \( \hat{\beta}_j \).
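
The inflation is easy to observe on simulated designs. The sketch below, assuming NumPy, builds two regressors whose population correlation is set by a parameter `rho` and reports the diagonal element of \( (X^T X)^{-1} \) for the first regressor. The helper `make_design` and the chosen correlations are hypothetical.

```python
import numpy as np

def make_design(rho, n=200, seed=0):
    """Intercept plus two regressors whose population correlation is rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    x1 = z1
    x2 = rho * z1 + np.sqrt(1.0 - rho**2) * z2   # correlated with x1 by construction
    return np.column_stack([np.ones(n), x1, x2])

inflation = {}
for rho in (0.0, 0.9, 0.99):
    X = make_design(rho)
    # Diagonal element of (X^T X)^{-1} belonging to x1 (column index 1).
    inflation[rho] = np.linalg.inv(X.T @ X)[1, 1]
    print(f"rho={rho:4.2f}  [(X^T X)^-1]_11 = {inflation[rho]:.5f}")
```

Because the same seed is reused, `x1` is identical across the three designs, so the growth in the diagonal element comes only from its correlation with `x2`. With one other regressor, the element equals \( 1 / \big( S_{11} (1 - r^2) \big) \), where \( S_{11} \) is the centered sum of squares of `x1` and \( r \) is its sample correlation with `x2`.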

Standardized References.

  • Definitive Institutional Source: Rencher, A. C., & Schaalje, G. B., Linear Models in Statistics.

Institutional Citation

Reference this proof in your academic research or publications.

NICEFA Visual Mathematics. (2026). The t-statistic for Individual Regression Coefficients: Derivation and its Distribution: Visual Proof & Intuition. Retrieved from https://www.nicefa.org/library/general-linear-models-/the-t-statistic-for-individual-regression-coefficients--derivation-and-its-distribution
