Derivation of the Coefficient of Determination (R²): Interpretation and Relationship to Correlation



The Formal Theorem

For a simple linear regression model $Y = \beta_0 + \beta_1 X + \epsilon$, the coefficient of determination $R^2$ is defined as the proportion of the total variation in the dependent variable $Y$ explained by the model. It is formally derived from the identity of sums of squares:

$$
\begin{aligned}
\sum_{i=1}^{n} (y_i - \bar{y})^2 &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \\
SST &= SSR + SSE \\
R^2 &= \frac{SSR}{SST} = 1 - \frac{SSE}{SST} = r_{xy}^2
\end{aligned}
$$
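The identity can be checked numerically. The sketch below is only an illustration: the five data points are made up, and the variable names (`sst`, `ssr`, `sse`, `r_xy`) are chosen here to mirror the symbols above. It fits the line by OLS with an intercept and compares the three expressions for $R^2$.

```python
import numpy as np

# Illustrative data (made up for this sketch, not taken from any source).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Closed-form OLS estimates for y = b0 + b1 * x.
x_bar, y_bar = x.mean(), y.mean()
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar
y_hat = b0 + b1 * x

# The three sums of squares from the identity SST = SSR + SSE.
sst = np.sum((y - y_bar) ** 2)      # total variation
ssr = np.sum((y_hat - y_bar) ** 2)  # variation explained by the line
sse = np.sum((y - y_hat) ** 2)      # residual variation

r2_from_ssr = ssr / sst
r2_from_sse = 1.0 - sse / sst
r_xy = np.corrcoef(x, y)[0, 1]      # sample correlation of x and y

print(sst, ssr + sse)                        # both sides of the identity
print(r2_from_ssr, r2_from_sse, r_xy ** 2)   # three routes to R^2
```

For this data set the sums of squares are 6.0, 3.6 and 2.4, so all three expressions give $R^2 = 0.6$ up to floating-point rounding.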

Analytical Intuition.

Imagine a scattered galaxy of data points, each representing a complex reality where $Y$ fluctuates based on a multitude of hidden variables. When we draw a regression line through this chaotic cloud, we are attempting a singular act of reductionism. $R^2$ is our measure of success in this endeavor. It asks: 'How much of the total squared distance between these points and their collective average (the mean $\bar{y}$) has been successfully captured by the path of our line?' If the line perfectly tracks the fluctuations, $R^2$ hits 1; if the line predicts no better than the mean itself, $R^2$ drops to 0. It is the ratio of 'explained' variation to total variation, with the remainder being 'unexplained' noise. Because $R^2$ is the square of the correlation coefficient $r_{xy}$, it discards the direction of the relationship and reflects only the strength of the linear alignment, mapping the variance of the data onto a tidy scale from 0 to 1 of in-sample explanatory power.
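A small sketch of the "direction is discarded" point, under the assumption of made-up data and a hypothetical helper `simple_r2` written for this illustration: mirroring the response flips the sign of $r_{xy}$ but leaves $R^2$ unchanged, and an exactly linear response gives $R^2 = 1$.

```python
import numpy as np

def simple_r2(x, y):
    """R^2 of the OLS line y = b0 + b1 * x (intercept included)."""
    b1, b0 = np.polyfit(x, y, deg=1)  # coefficients, highest degree first
    y_hat = b0 + b1 * x
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
y_down = -y_up  # mirror image: same strength, opposite direction

r_up = np.corrcoef(x, y_up)[0, 1]
r_down = np.corrcoef(x, y_down)[0, 1]
r2_up = simple_r2(x, y_up)
r2_down = simple_r2(x, y_down)
r2_perfect = simple_r2(x, 3.0 + 2.0 * x)  # points lie exactly on a line

print(r_up, r_down)        # equal magnitude, opposite sign
print(r2_up, r2_down)      # identical
print(r2_perfect)          # 1 up to rounding
```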

CAUTION

Institutional Warning.

Students often mistake $R^2$ for a measure of model accuracy. It is critical to recognize that $R^2$ measures only the goodness of fit for the current sample. It does not indicate whether the model is biased, nor does it guarantee predictive power for new, out-of-sample observations.
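The warning can be made concrete with a deliberately misspecified model. In the sketch below, the data follow $y = x^2$ exactly (an assumption chosen for illustration), a straight line is fitted on $x = 0, \dots, 4$, and the same line is then scored on $x = 5, \dots, 9$. The out-of-sample score uses $1 - SSE/SST$ with the test-set mean as baseline, which is one common convention rather than the only one.

```python
import numpy as np

def r2_score_against_mean(y, y_hat):
    """1 - SSE/SST, with SST taken around the mean of the supplied y."""
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst

# Training sample: a curved relationship that a line fits deceptively well.
x_train = np.arange(0.0, 5.0)   # 0, 1, 2, 3, 4
y_train = x_train ** 2

# New observations from the same curve, outside the training range.
x_test = np.arange(5.0, 10.0)   # 5, 6, 7, 8, 9
y_test = x_test ** 2

# Fit the (biased) straight line on the training sample only.
b1, b0 = np.polyfit(x_train, y_train, deg=1)

r2_train = r2_score_against_mean(y_train, b0 + b1 * x_train)
r2_test = r2_score_against_mean(y_test, b0 + b1 * x_test)

print(r2_train)  # high, despite the systematic curvature the line misses
print(r2_test)   # negative: worse than predicting the test-set mean
```

The in-sample $R^2$ is above 0.9 even though the residuals show a clear systematic pattern, and the same line does worse than a constant on the new points.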

Academic Inquiries.

Does a high R² imply that the independent variable causes the dependent variable?

No. $R^2$ measures only the strength of the linear association. Correlation does not imply causation; the association could be spurious or driven by unobserved confounding variables.

Can R² ever be negative in a simple linear regression?

In a standard OLS regression with an intercept, $R^2$ is non-negative. However, if the model is forced through the origin or evaluated against a baseline other than the mean, the sum of squares identity $SST = SSR + SSE$ need not hold. In that case $SSE$ can exceed $SST$, so $1 - SSE/SST$ can fall below zero and no longer reads as a proportion of explained variation.
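A minimal sketch of the through-the-origin case, using three made-up points with a large intercept and a negative slope. The no-intercept slope formula $\sum x_i y_i / \sum x_i^2$ is the standard least-squares solution for $y = bx$; everything else here is illustrative.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 9.0, 8.0])  # lies exactly on y = 11 - x
sst = np.sum((y - y.mean()) ** 2)

# Least squares forced through the origin: y = b * x.
b_origin = np.sum(x * y) / np.sum(x ** 2)
sse_origin = np.sum((y - b_origin * x) ** 2)
r2_origin = 1.0 - sse_origin / sst

# Ordinary least squares with an intercept, for comparison.
b1, b0 = np.polyfit(x, y, deg=1)
sse_ols = np.sum((y - (b0 + b1 * x)) ** 2)
r2_ols = 1.0 - sse_ols / sst

print(r2_origin)  # far below zero: SSE exceeds SST
print(r2_ols)     # 1 up to rounding: the intercept model fits exactly
```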

Why does adding a predictor variable always increase the R²?

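Strictly speaking, $R^2$ never decreases; it rises unless the new variable is orthogonal to the current residuals. Mathematically, adding a variable expands the column space of the design matrix, so the projection $\hat{Y}$ onto the larger space is at least as close to $Y$ as before, and the residual sum of squares $SSE$ cannot increase. Since $SST$ is unchanged, $R^2 = 1 - SSE/SST$ cannot decrease.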
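The column-space argument can be sketched with `numpy.linalg.lstsq`. The data and the extra column `z` below are arbitrary numbers chosen for illustration, and `r2_from_design` is a hypothetical helper; the point is only that the larger design matrix yields an $R^2$ at least as large, even though `z` has no designed relationship to `y`.

```python
import numpy as np

def r2_from_design(X, y):
    """R^2 of the least-squares projection of y onto the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
z = np.array([5.0, 1.0, 4.0, 2.0, 3.0])  # arbitrary extra predictor

ones = np.ones_like(x)
X_small = np.column_stack([ones, x])     # intercept + x
X_big = np.column_stack([ones, x, z])    # intercept + x + z

r2_small = r2_from_design(X_small, y)
r2_big = r2_from_design(X_big, y)

print(r2_small)  # 0.6 up to rounding
print(r2_big)    # at least as large as r2_small
```

Because the increase happens regardless of whether `z` carries real information, a rise in $R^2$ alone is weak evidence that a new predictor belongs in the model.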

Standardized References.

Definitive Institutional Source: Montgomery, D. C., Peck, E. A., & Vining, G. G. Introduction to Linear Regression Analysis.


Institutional Citation

Reference this proof in your academic research or publications.

NICEFA Visual Mathematics. (2026). Derivation of the Coefficient of Determination (R²): Interpretation and Relationship to Correlation: Visual Proof & Intuition. Retrieved from https://www.nicefa.org/library/general-linear-models-/derivation-of-the-coefficient-of-determination--r----interpretation-and-relationship-to-correlation
