Equity R-Squared
9 min read
Measure how closely your equity curve follows a straight line of growth — the simplest indicator of strategy consistency.
Plot cumulative P&L against trade number, draw a line of best fit, square the correlation. That single number — Equity R-Squared (R²) — separates strategies you can size from strategies you can only hope for. It is also one of the easiest numbers in finance to fool yourself with.
R² is the coefficient of determination from a linear regression of your cumulative equity curve against trade number (or time). If you size in percentage terms, run the regression on log(equity) — geometric compounding produces an exponential, not linear, ideal curve, and a raw-equity R² will artificially fall as the account grows.
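The effect of compounding on a raw-equity regression can be seen in a minimal sketch. It assumes a hypothetical account that starts at 10,000 and grows by exactly 1% per trade for 300 trades; the helper name `equity_r_squared` is illustrative, not part of any library.

```python
import numpy as np
from scipy import stats

def equity_r_squared(equity):
    """R-squared of a straight-line fit of equity against trade index."""
    t = np.arange(len(equity))
    return stats.linregress(t, equity).rvalue ** 2

# Hypothetical account: 10,000 start, compounding a fixed 1% per trade.
equity = 10_000 * 1.01 ** np.arange(300)

r2_raw = equity_r_squared(equity)          # exponential curve fitted with a line
r2_log = equity_r_squared(np.log(equity))  # log turns constant % growth into a line

print(f"raw equity R-squared: {r2_raw:.3f}")
print(f"log equity R-squared: {r2_log:.3f}")
```

Even with zero noise, the raw-equity fit scores below 1 because the ideal path is curved, while the log-equity fit is a straight line.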
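Calculation:
R² = 1 − (SS_residual / SS_total)
Where SS_residual is the sum of squared differences between the actual equity and the regression line, and SS_total is the sum of squared differences between the actual equity and its mean.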
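As a rough sketch, the formula can be computed directly from the two sums of squares and checked against the squared correlation. The P&L series here is synthetic (normally distributed, in units of R) and the function name `r_squared_from_sums` is an assumption made for illustration.

```python
import numpy as np

def r_squared_from_sums(equity):
    """R-squared = 1 - SS_residual / SS_total for a linear fit against trade index."""
    equity = np.asarray(equity, dtype=float)
    t = np.arange(len(equity))
    slope, intercept = np.polyfit(t, equity, 1)   # least-squares line of best fit
    fitted = slope * t + intercept
    ss_residual = np.sum((equity - fitted) ** 2)  # scatter around the fitted line
    ss_total = np.sum((equity - equity.mean()) ** 2)  # scatter around the mean
    return 1.0 - ss_residual / ss_total

# Synthetic example: 250 trades with a small positive average result.
rng = np.random.default_rng(0)
pnl = rng.normal(loc=0.1, scale=1.0, size=250)
equity = np.cumsum(pnl)

r2 = r_squared_from_sums(equity)
r2_corr = np.corrcoef(np.arange(len(equity)), equity)[0, 1] ** 2
print(f"from sums of squares: {r2:.6f}, from squared correlation: {r2_corr:.6f}")
```

For a simple one-variable regression the two routes agree, which is why "square the correlation" is a valid shortcut.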
R² ranges from 0 to 1:
| R² Value | Quality | Description |
|---|---|---|
| 0.95 - 1.00 | Excellent | Near-linear growth, very consistent |
| 0.85 - 0.95 | Good | Steady growth with manageable variation |
| 0.70 - 0.85 | Moderate | Noticeable drawdowns but overall upward trend |
| 0.50 - 0.70 | Weak | Significant equity volatility, questionable consistency |
| <0.50 | Poor | Equity curve is essentially random noise around a trend |
Consider two strategies:
| Metric | Strategy A | Strategy B |
|---|---|---|
| Return (250 trades) | +200% | +80% |
| Equity R-squared | 0.45 | 0.92 |
| Max drawdown | -38% | -7% |
| Sharpe | 0.9 | 2.1 |
Strategy A's headline return looks better, but its equity curve was chaotic: periods of huge gains followed by deep drawdowns. Strategy B's equity grew steadily. Because B's path is predictable, its Kelly-optimal sizing is roughly 3× higher, so a leveraged B beats A on the same risk budget. Which would you trust with real capital?
Strategy B is almost always the better choice. The deeper reason: B's trade returns have low positive autocorrelation (see Autocorrelation of Returns), so wins and losses interleave smoothly. A's returns cluster in regimes — exactly the failure mode covered in Regime Sensitivity & Volatility Dependency. A high-R² strategy is psychologically easier to trade, more predictable for risk management, and more likely to be genuinely robust rather than lucky.
Two failure modes hide here. (1) Lumpy edge: low R² with high return means a few outsized trades carry the curve; remove them and the strategy dies. (2) Curve-fit edge: extremely high in-sample R² (>0.97) on a parameter-rich strategy is a red flag, not a green light, because an over-fit system can produce R² > 0.95 on its training set by construction. A high R² therefore does not rule out curve-fitting. Always recompute R² on a holdout window: out-of-sample (OOS) R² is the only number worth quoting. R² is necessary, not sufficient, evidence of robustness.
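One way to recompute R² on a holdout window is sketched below. It assumes the last 30% of trades were never seen during optimisation, restarts the equity curve at zero for each segment, and uses a synthetic P&L series whose edge disappears in the holdout; the function name and the 30% split are illustrative choices, not a prescribed method.

```python
import numpy as np
from scipy import stats

def split_r_squared(pnl, holdout_fraction=0.3):
    """Fit equity vs. trade index separately on the in-sample and holdout trades."""
    pnl = np.asarray(pnl, dtype=float)
    n_holdout = round(len(pnl) * holdout_fraction)
    cut = len(pnl) - n_holdout
    results = []
    for segment in (pnl[:cut], pnl[cut:]):
        equity = np.cumsum(segment)  # each segment's curve restarts from zero
        fit = stats.linregress(np.arange(len(equity)), equity)
        results.append({"n": len(segment),
                        "slope": fit.slope,
                        "r_squared": fit.rvalue ** 2})
    return tuple(results)

# Synthetic series: an edge of 0.3R per trade that vanishes in the final 90 trades.
rng = np.random.default_rng(42)
pnl = np.concatenate([rng.normal(0.3, 1.0, size=210),
                      rng.normal(0.0, 1.0, size=90)])

in_sample, holdout = split_r_squared(pnl)
print("in-sample:", in_sample)
print("holdout:  ", holdout)
```

Reporting the holdout slope alongside the holdout R² matters, since a straight holdout line can still point downwards.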
With high out-of-sample R², you can size more aggressively — the equity path is predictable. With high in-sample R² alone, you have measured nothing about the future; sizing on it is the canonical overfit-then-blow-up pattern. With low R², even optimal Kelly sizing becomes dangerous because the equity volatility makes drawdowns deeper and less predictable than the formula assumes.
R² allows fair comparison between strategies with different return profiles. A strategy with moderate returns but R² = 0.90 may be preferable to one with high returns and R² = 0.60, especially when you factor in the psychological cost of drawdowns.
R² and the Sharpe Ratio are related but not identical:
| Metric | Measures | Best at flagging | Blind to |
|---|---|---|---|
| R² | Path linearity | Lumpy / outlier-driven curves | Slope sign, magnitude |
| Sharpe | Return per σ | Volatility drag | Drawdown shape |
| Calmar | Return / max DD | Tail drawdown | Path between peaks |
| K-ratio | Slope × √N / SE(slope) | Slow, steady edge | Outlier sensitivity |
A strategy can have a high Sharpe but moderate R² if returns are good on average but the equity curve has a curved (exponential) shape. Conversely, a strategy with a perfectly linear, upward-sloping equity curve (R² = 1.0 with a positive slope) will have a high Sharpe. The slope condition matters: R² is blind to slope sign, so a straight line down also scores 1.0.
In practice, high R² with a positive slope almost always implies a respectable Sharpe, but the reverse is not always true. For the rigorous version, see Lars Kestner, Quantitative Trading Strategies (McGraw-Hill, 2003); his K-ratio formalises "how straight is the line and how steep" into a single number.
1. Start from a `pnl` array, one entry per closed trade.
2. `equity = np.cumsum(pnl)`, or `np.log(np.cumsum(pnl) + start_capital)` for percentage sizing.
3. `t = np.arange(len(equity))`.
4. `slope, _, r, _, se = scipy.stats.linregress(t, equity)`.
5. `r_squared = r**2`. Pass criterion: `r_squared > 0.85` and `slope > 0` and N ≥ 100.

```python
import numpy as np
import scipy.stats

# pnl: per-trade P&L array; start_capital: opening account balance (both supplied by you)
equity = np.log(np.cumsum(pnl) + start_capital)  # log-equity if compounding
t = np.arange(len(equity))
slope, _, r, _, se = scipy.stats.linregress(t, equity)  # se = standard error of the slope
r_squared = r ** 2
k_ratio = slope * np.sqrt(len(equity)) / se
```
Excel: =RSQ(equity_range, sequence_range).
Two notes serious systematic traders care about: (1) Kestner's K-ratio = slope × √N / std-error of slope is the proper sibling — it rewards both straightness and slope, where R² alone does not. (2) Only out-of-sample R² is informative; an over-fit curve can hit R² > 0.98 in-sample and collapse on new data.
If your equity R² is low, investigate:
Experiment with different win rates and payoff ratios to see how they affect equity curve smoothness. A high-R² strategy produces a near-linear equity curve: consistent and predictable. Use the demo to recreate the threshold table above: a 55% win-rate / 1:1 payoff strategy lands around R² ≈ 0.93; drop the win rate to about 37% with a 2:1 payoff and R² collapses to ~0.55 even though expectancy is identical (0.55 × 1 − 0.45 × 1 = +0.10R; 0.367 × 2 − 0.633 × 1 ≈ +0.10R). Path quality is not expectancy.
| Scenario | Win rate | Payoff | Equity R-squared | Expectancy |
|---|---|---|---|---|
| High frequency, balanced | 55% | 1:1 | ~0.93 | +0.10R |
| Low hit rate, big winners | ~37% | 2:1 | ~0.55 | ≈ +0.10R |
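The link between payoff profile and path smoothness can be explored with a small Monte Carlo sketch. It assumes independent trades, a fixed 1R risk per trade, 250 trades per path, and two hypothetical profiles with matched expectancy of +0.10R: 55% at 1:1, and a 2:1 payoff with the win rate set to 1.1/3 ≈ 36.7% (the value the expectancy arithmetic requires). Function names are illustrative, and simulated averages will not necessarily match the figures quoted from the demo.

```python
import numpy as np

def expectancy(win_rate, payoff):
    """Expected result per trade in R when risking 1R to win `payoff` R."""
    return win_rate * payoff - (1.0 - win_rate)

def simulated_r_squared(win_rate, payoff, n_trades=250, n_paths=2000, seed=0):
    """R-squared of each simulated equity curve against trade number."""
    rng = np.random.default_rng(seed)
    wins = rng.random((n_paths, n_trades)) < win_rate
    pnl = np.where(wins, payoff, -1.0)               # win `payoff` R or lose 1R
    equity = np.cumsum(pnl, axis=1)
    t = np.arange(n_trades) - (n_trades - 1) / 2.0   # centred trade index
    centred = equity - equity.mean(axis=1, keepdims=True)
    cov = centred @ t
    # Squared correlation between trade index and equity, one value per path.
    return cov ** 2 / (np.sum(t ** 2) * np.sum(centred ** 2, axis=1))

balanced = simulated_r_squared(0.55, 1.0)
lumpy = simulated_r_squared(1.1 / 3.0, 2.0)

print(f"balanced: expectancy {expectancy(0.55, 1.0):+.2f}R, mean R-squared {balanced.mean():.2f}")
print(f"lumpy:    expectancy {expectancy(1.1 / 3.0, 2.0):+.2f}R, mean R-squared {lumpy.mean():.2f}")
```

Both profiles earn the same amount per trade on average; the difference in mean R² comes from the larger per-trade variance of the big-winner profile.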
The three robustness metrics in this module measure overlapping but distinct things. Regime sensitivity asks does the edge survive across regimes? Autocorrelation of returns asks do trade outcomes cluster? Equity R² asks does the path look straight? A robust strategy passes all three. Equity R² is the easiest to compute and the easiest to fool — read it as the summary of the other two, not a replacement.
Above 0.85 with a positive regression slope is solid for most strategies; above 0.95 is exceptional and worth scrutinising for overfit. Below 0.70 is lumpy. The exact threshold depends on trade count — treat sub-100-trade samples as noise.
Sharpe measures return per unit volatility; R² measures path linearity. A strategy can have a high Sharpe with a curved or noisy equity path, and a strategy with R² = 0.95 can still be losing money if the slope is negative. Use both — they answer different questions.
R² is unstable below ~50 trades and noisy below ~200. Treat anything computed on fewer than ~100 trades as a confidence interval, not a point estimate. For cross-strategy comparison, ≥250 trades is the working floor.
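One hedged way to turn a small-sample R² into a range is a bootstrap over trades, sketched below. Resampling trades with replacement assumes they are independent and discards their original order, so it understates uncertainty for strategies whose results cluster; the function name, the 90% level, and the synthetic P&L are all assumptions for illustration.

```python
import numpy as np

def bootstrap_r2_interval(pnl, n_boot=2000, level=0.90, seed=0):
    """Percentile interval for equity R-squared from resampled trade sequences."""
    pnl = np.asarray(pnl, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(pnl)
    t = np.arange(n) - (n - 1) / 2.0                  # centred trade index
    idx = rng.integers(0, n, size=(n_boot, n))        # resample trades with replacement
    equity = np.cumsum(pnl[idx], axis=1)
    centred = equity - equity.mean(axis=1, keepdims=True)
    r2 = (centred @ t) ** 2 / (np.sum(t ** 2) * np.sum(centred ** 2, axis=1))
    tail = (1.0 - level) / 2.0
    low, high = np.quantile(r2, [tail, 1.0 - tail])
    return float(low), float(high)

# Synthetic trades with a small positive average, at two sample sizes.
rng = np.random.default_rng(1)
for n in (60, 600):
    pnl = rng.normal(0.1, 1.0, size=n)
    low, high = bootstrap_r2_interval(pnl)
    print(f"N = {n}: 90% interval for R-squared = [{low:.2f}, {high:.2f}]")
```

Quoting the interval rather than the single number makes it harder to over-read a backtest with few trades.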
A losing strategy can still score a high R². R² ignores the slope sign: a perfectly straight line going down still scores R² = 1.0. Always check that the regression slope is positive before celebrating a high R².
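The slope check can be folded into a simple screen, sketched here with the pass criterion stated earlier (R² above 0.85, positive slope, at least 100 trades). The function name and the two straight-line equity curves are hypothetical.

```python
import numpy as np
from scipy import stats

def passes_r2_screen(equity, min_r2=0.85, min_trades=100):
    """Return (passed, r_squared, slope) for an equity curve indexed by trade number."""
    equity = np.asarray(equity, dtype=float)
    fit = stats.linregress(np.arange(len(equity)), equity)
    r_squared = fit.rvalue ** 2
    passed = bool(r_squared > min_r2 and fit.slope > 0 and len(equity) >= min_trades)
    return passed, r_squared, fit.slope

# Hypothetical curves: one loses 0.5R on every trade, the other gains 0.5R.
losing = -0.5 * np.arange(200)
winning = 0.5 * np.arange(200)

ok, r2, slope = passes_r2_screen(losing)
print(f"losing line:  R-squared = {r2:.2f}, slope = {slope:+.2f}, passes = {ok}")
print(f"winning line: passes = {passes_r2_screen(winning)[0]}")
```

The losing line has a perfect R² and still fails, because the screen requires the slope to be positive.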
A high in-sample R² does not prove robustness. A parameter-rich system can hit R² > 0.97 in-sample by construction and collapse on new data. Only out-of-sample R² is informative for sizing or live-trading decisions.
Equity R² is the cheapest sanity check you can run on a backtest — but it is also the easiest to fool. A strategy with R² = 0.92 out-of-sample, K-ratio > 2, and N > 250 has earned the right to capital. The same R² in-sample with 60 trades has earned nothing but a closer look. The question every trader should ask is not "is my growth consistent?" but "is my growth consistent on data my optimizer never saw?"