Proof of the Stationarity Condition for an AR(1) Process (|φ| < 1)

Unravel the stationarity condition for AR(1) processes. Rigorous proof, cinematic intuition, and essential insights for time series analysis.

The Formal Theorem

Consider an autoregressive process of order 1, AR(1), defined by $Y_t = c + \phi Y_{t-1} + \epsilon_t$, where $Y_t$ is the observed value at time $t$, $\phi$ is the autoregressive coefficient, $c$ is a constant, and $\epsilon_t$ is a white noise error term such that $E[\epsilon_t] = 0$, $Var(\epsilon_t) = \sigma_{\epsilon}^2 < \infty$, and $Cov(\epsilon_t, \epsilon_s) = 0$ for $t \neq s$.

If the condition $|\phi| < 1$ holds, then the AR(1) process $Y_t$ (started in the infinite past, or initialized from its stationary distribution) is weakly stationary: its mean and variance are constant over time, and its autocovariance depends only on the lag. Specifically, the mean $E[Y_t]$ and variance $Var(Y_t)$ are given by:

$$\begin{aligned} E[Y_t] &= \frac{c}{1 - \phi} \\ Var(Y_t) &= \frac{\sigma_{\epsilon}^2}{1 - \phi^2} \end{aligned}$$
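The two formulas in the theorem can be checked numerically. The following is a minimal sketch, assuming Gaussian noise and illustrative parameter values (`c = 2.0`, `phi = 0.7`, `sigma = 1.0`); the function name `simulate_ar1` and the burn-in length are choices made for this example, not part of the theorem.

```python
import random

def simulate_ar1(c, phi, sigma, n, burn_in=1_000, seed=0):
    """Simulate n values of Y_t = c + phi * Y_{t-1} + eps_t with eps_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y = c / (1.0 - phi)  # start at the theoretical mean (an assumption of this sketch)
    path = []
    for t in range(burn_in + n):
        y = c + phi * y + rng.gauss(0.0, sigma)
        if t >= burn_in:  # discard the burn-in so the start value has faded
            path.append(y)
    return path

c, phi, sigma = 2.0, 0.7, 1.0  # illustrative values with |phi| < 1
path = simulate_ar1(c, phi, sigma, n=100_000)

theoretical_mean = c / (1.0 - phi)
theoretical_var = sigma**2 / (1.0 - phi**2)

sample_mean = sum(path) / len(path)
sample_var = sum((v - sample_mean) ** 2 for v in path) / len(path)

print("mean:     theory", round(theoretical_mean, 3), "sample", round(sample_mean, 3))
print("variance: theory", round(theoretical_var, 3), "sample", round(sample_var, 3))
```

With a long enough sample, the sample moments should land close to the theoretical ones; the agreement is statistical, not exact.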

Analytical Intuition.

Imagine a grand, echoing hall where every new sound, $\epsilon_t$, is like a fresh whisper at time $t$. This whisper doesn't vanish instantly; it leaves an echo, $\phi Y_{t-1}$, that contributes to the sound at the next moment. The coefficient $\phi$ acts as the 'fading factor' for this echo. If $|\phi|$ is strictly less than 1, each subsequent echo of that original whisper becomes weaker, diminishing exponentially into the past. The hall thus reaches a steady, predictable hum (a stable, statistically constant background noise) because the influence of distant whispers eventually becomes imperceptible. However, if $|\phi|$ were 1, every whisper would echo indefinitely at full strength, leading to an ever-growing cacophony. And if $|\phi| > 1$, each echo would amplify its predecessor, turning a whisper into a deafening, explosive roar. The stationarity condition, $|\phi| < 1$, precisely ensures this graceful diminishing return, allowing the system to settle into a rhythmic, statistically constant state where the impact of past events elegantly fades over time, giving rise to predictable long-run properties like stable mean and variance.
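The echo picture can be made concrete by tracing a single unit shock through time. This is a small sketch; the helper name `impulse_response` is introduced here for illustration, and it relies only on the fact that the effect of a shock $j$ periods later is $\phi^j$.

```python
def impulse_response(phi, horizons):
    """Effect of a unit shock at time t on Y_{t+j}, for j = 0, ..., horizons - 1."""
    return [phi**j for j in range(horizons)]

# Fading echo, undiminished echo, and amplifying echo.
for phi in (0.5, 1.0, 1.1):
    echoes = impulse_response(phi, 6)
    print("phi =", phi, [round(e, 3) for e in echoes])
```

For `phi = 0.5` the echo halves each period, for `phi = 1.0` it never fades, and for `phi = 1.1` it grows.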

Institutional Warning.

Students often struggle to grasp that $|\phi| < 1$ is not merely a condition for finite variance, but fundamentally ensures that the influence of past values and initial conditions decays over time, preventing explosive behavior. They sometimes confuse this stability condition with the concept of a stable estimation of $\phi$ itself.

Institutional Deep Dive.

01
The notion of stationarity in time series analysis is fundamentally about predictability and statistical equilibrium. For an autoregressive process of order 1, $Y_t = c + \phi Y_{t-1} + \epsilon_t$, the condition $|\phi| < 1$ is the mathematical linchpin that guarantees this equilibrium. Let's dissect this through core logic, geometric mechanics, and institutional pitfalls.

**Core Logic**:

To understand why $|\phi| < 1$ is crucial, we employ iterative substitution. We can express $Y_t$ not just in terms of its immediate past, $Y_{t-1}$, but as a sum of all past innovations $\epsilon_{t-j}$.

Starting with $Y_t = c + \phi Y_{t-1} + \epsilon_t$, substitute $Y_{t-1} = c + \phi Y_{t-2} + \epsilon_{t-1}$ into the equation for $Y_t$:

$$\begin{aligned} Y_t &= c + \phi(c + \phi Y_{t-2} + \epsilon_{t-1}) + \epsilon_t \\ &= c(1 + \phi) + \phi^2 Y_{t-2} + \phi \epsilon_{t-1} + \epsilon_t \end{aligned}$$

Repeating this substitution backward in time until the recursion reaches $Y_{t-k}$:

$$Y_t = c(1 + \phi + \phi^2 + \dots + \phi^{k-1}) + \phi^k Y_{t-k} + \sum_{j=0}^{k-1} \phi^j \epsilon_{t-j}$$

For the process to be truly stationary, its statistical properties must be independent of the starting point or initial condition $Y_{t-k}$. As $k$ approaches infinity (i.e., looking infinitely far back in time), the term $\phi^k Y_{t-k}$ must vanish for the influence of the initial condition to fade. This occurs precisely when $|\phi| < 1$, because then $\lim_{k \to \infty} \phi^k = 0$ (provided $Y_{t-k}$ has a bounded second moment).

If this condition holds, and assuming $Y_t$ has been running for an infinitely long time (or started sufficiently far in the past), the sum $1 + \phi + \phi^2 + \dots$ converges to $\frac{1}{1 - \phi}$ (a geometric series sum).
Similarly, the sum of past errors $\sum_{j=0}^{\infty} \phi^j \epsilon_{t-j}$ also converges (in mean square, because $\sum_{j=0}^{\infty} \phi^{2j} < \infty$).

Thus, in the limit, $Y_t$ can be expressed as an infinite moving average:

$$Y_t = \frac{c}{1 - \phi} + \sum_{j=0}^{\infty} \phi^j \epsilon_{t-j}$$

From this representation, we can readily derive the mean and variance.

$$\begin{aligned} E[Y_t] &= E\left[\frac{c}{1 - \phi} + \sum_{j=0}^{\infty} \phi^j \epsilon_{t-j}\right] \\ &= \frac{c}{1 - \phi} + \sum_{j=0}^{\infty} \phi^j E[\epsilon_{t-j}] \end{aligned}$$

Since $E[\epsilon_t] = 0$ for all $t$, it follows that $E[Y_t] = \frac{c}{1 - \phi}$. This is a constant, independent of $t$.

For the variance:

$$Var(Y_t) = Var\left[\frac{c}{1 - \phi} + \sum_{j=0}^{\infty} \phi^j \epsilon_{t-j}\right]$$

The constant term $\frac{c}{1 - \phi}$ does not contribute to the variance. Also, since the $\epsilon_t$ are uncorrelated, the variance of their sum is the sum of their variances:

$$\begin{aligned} Var(Y_t) &= \sum_{j=0}^{\infty} (\phi^j)^2 Var(\epsilon_{t-j}) \\ &= \sum_{j=0}^{\infty} \phi^{2j} \sigma_{\epsilon}^2 \\ &= \sigma_{\epsilon}^2 \sum_{j=0}^{\infty} (\phi^2)^j \end{aligned}$$

This is another geometric series with ratio $\phi^2$. For this series to converge, we require $|\phi^2| < 1$, which simplifies to $|\phi| < 1$. If this holds, the sum converges to $\frac{1}{1 - \phi^2}$.

Therefore, $Var(Y_t) = \frac{\sigma_{\epsilon}^2}{1 - \phi^2}$. This is also a constant, independent of $t$.

The autocovariance function also proves to be time-invariant under $|\phi| < 1$: it depends only on the lag between two observations, not on $t$.

**Geometric Mechanics**:

The condition $|\phi| < 1$ acts as a damping mechanism. Imagine a physical system, like a pendulum. If you give it a push (an $\epsilon_t$ shock), it swings. But friction (analogous to $1 - \phi$) gradually reduces the amplitude until it settles back to its equilibrium.
Each swing is smaller than the last, much like $\phi^j$ gets smaller as $j$ increases.

If $\phi = 0$, $Y_t = c + \epsilon_t$, a purely random process (white noise with a constant mean). The effect of any $\epsilon_t$ is immediate and vanishes completely by $t+1$.

If $0 < \phi < 1$, a positive shock $\epsilon_t$ causes $Y_t$ to increase. This increase is carried over to $Y_{t+1}$ as $\phi Y_t$, but with a reduced magnitude. This effect continues, but its influence decays exponentially. The series "remembers" past shocks but with decreasing clarity. For $-1 < \phi < 0$ the decay in magnitude is the same, but the effect alternates in sign from one period to the next.

If $\phi = 1$, the process becomes a random walk (with drift when $c \neq 0$): $Y_t = c + Y_{t-1} + \epsilon_t$. A shock $\epsilon_t$ has a permanent impact; it shifts the level of the series indefinitely. The variance grows with time, so the process is non-stationary.

If $\phi > 1$ or $\phi < -1$, the process is explosive. A shock $\epsilon_t$ is amplified in subsequent periods ($\phi^j$ grows in magnitude), leading to an ever-increasing (or oscillating and increasing) deviation from the starting level, so the variance of a process started from a fixed initial value grows without bound.

The crucial point is that $|\phi| < 1$ ensures that the process has fading memory; the distant past ceases to have an appreciable influence on the present.

**Institutional Pitfalls**:

Students often conflate weak stationarity with strict stationarity. While strict stationarity implies weak stationarity (if the second moments exist), the reverse is not always true. However, for an AR(1) process with i.i.d. Gaussian white noise $\epsilon_t$, weak stationarity *does* imply strict stationarity, because linear combinations of Gaussian variables are also Gaussian, and Gaussian processes are fully characterized by their first two moments.

Another common pitfall is neglecting the assumptions about $\epsilon_t$.
The proof relies heavily on $E[\epsilon_t] = 0$ and $Var(\epsilon_t) = \sigma_{\epsilon}^2 < \infty$, and crucially, $Cov(\epsilon_t, \epsilon_s) = 0$ for $t \neq s$. If the error terms themselves are autocorrelated or have time-varying variance, the derived stationarity conditions and properties may no longer hold. For instance, if $\epsilon_t$ itself were an AR(1) process, the overall process $Y_t$ would be more complex and its stationarity would depend on the properties of $\epsilon_t$. Finally, the requirement $\phi \neq 1$ for the mean to be defined is critical, and $\phi \neq \pm 1$ for the variance. The $|\phi| < 1$ condition elegantly covers both these requirements while simultaneously ensuring the convergence of the infinite sums.
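The time-invariance of the autocovariance is asserted above without a derivation. A sketch of that step, using the infinite moving-average representation, is given here; the symbol $\gamma(h)$ for the lag-$h$ autocovariance is notation introduced for this sketch. Because the errors are uncorrelated, only the terms in which both sums hit the same shock (that is, $j = h + i$) survive.

```latex
\begin{aligned}
\gamma(h) &= Cov(Y_t, Y_{t-h}) \\
          &= Cov\left(\sum_{j=0}^{\infty} \phi^j \epsilon_{t-j},\ \sum_{i=0}^{\infty} \phi^i \epsilon_{t-h-i}\right) \\
          &= \sigma_{\epsilon}^2 \sum_{i=0}^{\infty} \phi^{h+i} \phi^{i} \\
          &= \phi^h \, \frac{\sigma_{\epsilon}^2}{1 - \phi^2}, \qquad h \geq 0 .
\end{aligned}
```

Setting $h = 0$ recovers $Var(Y_t)$, and the result depends on the lag $h$ only, never on $t$, which is exactly what weak stationarity requires.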

Academic Inquiries.

01

Why is it called "weak" stationarity, and what is "strict" stationarity?

Weak (or covariance) stationarity requires a constant mean, a finite and constant variance, and an autocovariance that depends only on the lag, not on time. It is called "weak" because it constrains only the first two moments of the process. Strict stationarity requires the entire joint probability distribution of the process to be invariant to shifts in time. Weak stationarity is often sufficient for practical applications and easier to verify.
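The "depends only on the lag" requirement can be illustrated by estimating autocovariances on two separate stretches of one simulated path. This is a rough sketch under assumed settings (zero constant, Gaussian noise, `phi = 0.6`); the helper names are hypothetical, and the reference value $\phi^h \sigma_{\epsilon}^2 / (1 - \phi^2)$ is the standard lag-$h$ autocovariance of a stationary AR(1).

```python
import random

def simulate_ar1(phi, sigma, n, burn_in=1_000, seed=42):
    """Simulate a zero-mean AR(1): Y_t = phi * Y_{t-1} + eps_t, eps_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y, path = 0.0, []
    for t in range(burn_in + n):
        y = phi * y + rng.gauss(0.0, sigma)
        if t >= burn_in:
            path.append(y)
    return path

def sample_autocovariance(series, lag):
    """Sample autocovariance of a series at a non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    return sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)) / n

phi, sigma = 0.6, 1.0  # illustrative values
path = simulate_ar1(phi, sigma, n=100_000)
first_half, second_half = path[:50_000], path[50_000:]

for lag in range(4):
    theory = phi**lag * sigma**2 / (1.0 - phi**2)
    print(lag, round(theory, 3),
          round(sample_autocovariance(first_half, lag), 3),
          round(sample_autocovariance(second_half, lag), 3))
```

Both halves should give similar estimates at each lag, consistent with an autocovariance that does not depend on where in time the window sits.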

02

What happens to the AR(1) process if $\phi = 1$?

If $\phi = 1$, the process becomes a random walk, $Y_t = c + Y_{t-1} + \epsilon_t$ (with drift when $c \neq 0$). In this case, past shocks have a permanent impact, the variance grows without bound over time (linearly, as $t \sigma_{\epsilon}^2$, when started from a fixed value), and the process is non-stationary. Differencing, $\Delta Y_t = Y_t - Y_{t-1} = c + \epsilon_t$, makes it stationary.
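A quick way to see both claims is to simulate many random walks and compare the variance across paths at two dates, then look at the differenced series. The sketch below assumes no drift ($c = 0$), Gaussian shocks, and a start at zero; the function names and the ensemble size are choices made for illustration.

```python
import random

def random_walk_ensemble(n_paths, n_steps, sigma=1.0, seed=1):
    """Simulate independent driftless random walks Y_t = Y_{t-1} + eps_t, with Y_0 = 0."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        y, path = 0.0, []
        for _ in range(n_steps):
            y += rng.gauss(0.0, sigma)
            path.append(y)
        paths.append(path)
    return paths

def cross_sectional_variance(paths, t):
    """Variance of Y_t across paths (Y_t is stored at index t - 1)."""
    values = [p[t - 1] for p in paths]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

paths = random_walk_ensemble(n_paths=2_000, n_steps=200)
var_50 = cross_sectional_variance(paths, 50)
var_200 = cross_sectional_variance(paths, 200)

# First differences recover the shocks, whose variance does not grow with t.
diffs = [p[t] - p[t - 1] for p in paths for t in range(1, len(p))]
diff_mean = sum(diffs) / len(diffs)
diff_var = sum((d - diff_mean) ** 2 for d in diffs) / len(diffs)

print("Var(Y_50) ~", round(var_50, 1), " Var(Y_200) ~", round(var_200, 1))
print("Var(diff) ~", round(diff_var, 3))
```

The variance at $t = 200$ should be roughly four times the variance at $t = 50$, while the differenced series keeps a variance near $\sigma_{\epsilon}^2$.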

03

Can an AR(1) process with $|\phi| > 1$ ever be stationary?

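Not in the usual (causal) sense. If $|\phi| > 1$, the impact of past shocks ($\phi^j \epsilon_{t-j}$) grows exponentially, so a process started from any fixed initial value is explosive and its variance grows without bound. A stationary solution does exist mathematically, but it expresses $Y_t$ in terms of future shocks, so it cannot be used for forecasting and is normally excluded.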
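The contrast between the three regimes can be computed exactly, without simulation. For a process started from a fixed value $Y_0$, iterating the recursion gives $Var(Y_t) = \sigma_{\epsilon}^2 \sum_{j=0}^{t-1} \phi^{2j}$. The sketch below evaluates that sum; the function name is a hypothetical helper.

```python
def variance_from_fixed_start(phi, sigma, t):
    """Var(Y_t) for an AR(1) started at a fixed Y_0: sigma^2 * sum_{j=0}^{t-1} phi^(2j)."""
    return sigma**2 * sum(phi ** (2 * j) for j in range(t))

for phi in (0.5, 1.0, 1.1):
    values = [variance_from_fixed_start(phi, 1.0, t) for t in (10, 50, 200)]
    print("phi =", phi, [round(v, 3) for v in values])
```

For `phi = 0.5` the variance levels off near $\sigma_{\epsilon}^2 / (1 - \phi^2)$, for `phi = 1.0` it grows linearly in $t$, and for `phi = 1.1` it grows geometrically.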

04

How crucial are the assumptions about the error term $\epsilon_t$ (white noise) for this proof?

Extremely crucial. The assumptions $E[\epsilon_t] = 0$, $Var(\epsilon_t) = \sigma_{\epsilon}^2 < \infty$, and $Cov(\epsilon_t, \epsilon_s) = 0$ for $t \neq s$ are fundamental. Without $E[\epsilon_t] = 0$, the mean of $Y_t$ would pick up an additional term. Without finite error variance, the variance of $Y_t$ wouldn't be finite. Most importantly, uncorrelated errors make the variance of a sum equal to the sum of the variances, $Var\left(\sum_i X_i\right) = \sum_i Var(X_i)$, which is what enables the derivation of $Var(Y_t)$.
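To see how much the uncorrelatedness assumption matters, consider a hypothetical variant in which the error is itself an AR(1): $Y_t = \phi Y_{t-1} + u_t$ with $u_t = \rho u_{t-1} + e_t$ and $e_t$ white noise. The sketch below (with $\rho$, $u_t$, $e_t$ and the function names introduced purely for illustration) computes the true variance from the moving-average weights on $e_t$ and compares it with what the white-noise formula would give if $Var(u_t)$ were plugged in naively.

```python
def true_variance(phi, rho, sigma_e, n_terms=2_000):
    """Var(Y_t) when the error u_t is AR(1) with coefficient rho.

    Y_t = sum_j psi_j * e_{t-j}, where psi_0 = 1 and psi_{j+1} = phi * psi_j + rho**(j + 1).
    The infinite sum of psi_j^2 is truncated at n_terms.
    """
    total, psi = 0.0, 1.0
    for j in range(n_terms):
        total += psi**2
        psi = phi * psi + rho ** (j + 1)
    return sigma_e**2 * total

def naive_variance(phi, rho, sigma_e):
    """White-noise formula Var(u) / (1 - phi^2), wrongly treating u_t as uncorrelated."""
    var_u = sigma_e**2 / (1.0 - rho**2)
    return var_u / (1.0 - phi**2)

phi, rho, sigma_e = 0.5, 0.5, 1.0  # illustrative values
true_var = true_variance(phi, rho, sigma_e)
naive_var = naive_variance(phi, rho, sigma_e)
print("true:", round(true_var, 4), "naive:", round(naive_var, 4))
```

With positively autocorrelated errors the naive formula understates the variance, because it drops the covariance terms that the uncorrelatedness assumption would have set to zero.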

Standardized References.

  • Box, G.E.P., Jenkins, G.M., Reinsel, G.C., and Ljung, G.M. (2015). Time Series Analysis: Forecasting and Control. 5th ed. Wiley.

Institutional Citation

Reference this proof in your academic research or publications.

NICEFA Visual Mathematics. (2026). Proof of the Stationarity Condition for an AR(1) Process (|φ| < 1): Visual Proof & Intuition. Retrieved from https://www.nicefa.org/library/time-series-analysis/proof-of-the-stationarity-condition-for-an-ar-1--process--------1-
