Position Sizing Based on Confidence Intervals
8 min read
Size positions based on statistical certainty rather than emotion, using confidence intervals from your actual track record.
Position sizing on confidence intervals means scaling risk to the lower bound of your edge's statistical range — not the point estimate. With N trades, expectancy E and standard deviation s, plan around E − t·s/√N, never E itself.
Just because you had 10 wins in a row doesn’t mean you should double your risk. Here’s how to size based on statistical certainty — not emotion.
Prerequisites: Distribution of Trade Returns (variance, std dev) and Risk of Ruin (drawdown framing).
Most traders ask:
“Can I size up now?”
And most do it based on:
But professionals ask:
“Is my edge proven enough to justify larger size?”
This post introduces confidence intervals — a statistical way to decide when it’s safe to scale up, and how much your data can be trusted.
A confidence interval (CI) tells you:
“Given my sample of results, what’s the range of possible true values for my win rate, EV, or Sharpe ratio?”
By the Law of Large Numbers, the sample mean converges to the true EV — but slowly. Until N is large, your sample is a noisy estimate, and the CI quantifies how noisy.
It’s based on:
Let’s say you’ve had 20 trades with a +0.5R average return.
With a small sample, your confidence interval is wide — meaning the true EV could be much lower (or higher) — but in sizing, only the downside matters. Upside surprises don’t blow up accounts; downside surprises do. Plan around the lower bound.
You shouldn’t size up until your edge is statistically stable.
You win 12 out of 20 trades → 60% win rate.
95% Confidence Interval for win rate (Wilson score interval, preferred over the Wald normal approximation at small N; see Wilson, 1927):
≈ 39% to 78%
That means: you're 95% confident that your true win rate lies somewhere in that range. That’s… not very reliable.
Now with 100 trades and 60 wins: CI shrinks to ≈ 50% to 69% → Far more stable → safer to scale risk slightly.
Wilson 95 percent CI width (percentage points) collapses from about 39pp to about 19pp as N grows from 20 to 100.
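To check the win-rate intervals yourself, here is a minimal sketch of the Wilson score interval using only the standard library. The function name `wilson_interval` and the default `z = 1.96` (the usual normal critical value for roughly 95% confidence) are illustrative choices, not something this lesson prescribes.

```python
import math
from typing import Tuple


def wilson_interval(wins: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a win rate (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be a positive trade count")
    p_hat = wins / n                      # observed win rate
    denom = 1 + z**2 / n                  # shared denominator of the Wilson formula
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# The two samples discussed above: 12 wins in 20 trades, 60 wins in 100 trades.
for wins, n in [(12, 20), (60, 100)]:
    low, high = wilson_interval(wins, n)
    print(f"{wins}/{n}: {low:.1%} to {high:.1%} (width {100 * (high - low):.1f}pp)")
```

Same 60% win rate in both cases; only the sample size changes, and the interval narrows accordingly.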
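Let’s say you have 25 trades with a sample mean of +0.6R and a sample standard deviation of 1.4R.
The 95% CI for the mean EV is:
EV ± t(0.025, n−1) · s/√n
= 0.6 ± 1.96 · (1.4/√25)
= 0.6 ± 1.96 · 0.28
= 0.6 ± 0.55
⇒ EV in [0.05R, 1.15R]
(For n = 25 the exact t-critical value is about 2.06, not 1.96; 1.96 is used here for round numbers. With 2.06 the interval widens slightly, to about [0.02R, 1.18R].)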
where EV = sample mean expectancy in R, s = sample standard deviation, n = trade count, t = Student t-critical value at chosen confidence.
For n < 30 the t-critical value (Student, 1908) replaces 1.96. And remember: trade returns are fat-tailed, so this CI is optimistic — a bootstrap CI on your actual P&L is safer (see also VaR and CVaR for tail-risk quantification).
That’s a huge range. Would you want to size up based on that?
Now try 100 trades:
EV +/- 1.96 * (1.4 / sqrt(100)) = 0.6 +/- 0.27 => EV in [0.33R, 0.87R]
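The two intervals above can be reproduced with a few lines of code. This is a minimal sketch: `ci_from_summary` and `ci_from_trades` are hypothetical helper names, and the critical value is passed in by hand (1.96 by default, to match the round numbers used here). For small n you would look up the Student t value instead, for example about 2.06 at n = 25.

```python
import math
import statistics
from typing import Sequence, Tuple


def ci_from_summary(mean_r: float, s: float, n: int, crit: float = 1.96) -> Tuple[float, float]:
    """CI for mean expectancy in R: mean +/- crit * s / sqrt(n)."""
    if n < 2:
        raise ValueError("need at least two trades")
    half_width = crit * s / math.sqrt(n)
    return mean_r - half_width, mean_r + half_width


def ci_from_trades(returns_r: Sequence[float], crit: float = 1.96) -> Tuple[float, float]:
    """Same interval, computed from a raw list of per-trade returns in R."""
    mean_r = statistics.fmean(returns_r)
    s = statistics.stdev(returns_r)       # sample standard deviation (n - 1 denominator)
    return ci_from_summary(mean_r, s, len(returns_r), crit)


# Same edge (0.6R mean, 1.4R std dev) at two sample sizes.
for n in (25, 100):
    low, high = ci_from_summary(0.6, 1.4, n)
    print(f"n={n}: EV in [{low:.2f}R, {high:.2f}R]")
```

Quadrupling the trade count halves the interval width, because the standard error scales with 1/√n.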
Plan as if your edge equals the lower CI bound (here 0.33R), not the point estimate. Position sizing built on the upper half of a CI is just hot-streak escalation in a lab coat.
Worked example: N=40, E=0.4R, s=1.1R → 95% CI ≈ [0.05R, 0.75R]. Shrinkage factor = lower_CI / point_estimate = 0.05 / 0.4 = 0.125. If your unshrunk Kelly or base risk suggests 1.0% risk per trade, your shrunk risk = 0.125%. Trade until the lower CI moves up; then re-size.
End-to-end CI-shrunk position sizing: inputs in, shrunk risk out.
Rule of thumb: risk% = base_risk% × max(0, lower_CI / point_estimate). If the lower CI is ≤ 0, your edge is not statistically distinguishable from zero — size = 0. See expectancy decomposition for where E_hat actually comes from.
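The rule of thumb translates directly into code. A minimal sketch, assuming risk is expressed in percent of account per trade; the function name `shrunk_risk_pct` is hypothetical, and returning 0 when the point estimate itself is not positive is an added guard that the rule does not spell out.

```python
def shrunk_risk_pct(base_risk_pct: float, point_estimate: float, lower_ci: float) -> float:
    """risk% = base_risk% * max(0, lower_CI / point_estimate)."""
    if point_estimate <= 0:
        # Assumption: no positive edge estimate means no position at all.
        return 0.0
    shrinkage = max(0.0, lower_ci / point_estimate)
    return base_risk_pct * shrinkage


# Worked example from the text: E = 0.4R, lower CI = 0.05R, base risk = 1.0%.
print(shrunk_risk_pct(1.0, 0.4, 0.05))    # shrinkage 0.125 applied to 1.0% base risk

# A lower bound at or below zero means the edge is not distinguishable from zero.
print(shrunk_risk_pct(1.0, 0.4, -0.10))
```

Re-run the calculation whenever new trades move the lower bound; the size follows the evidence rather than the recent streak.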
| Method | What it sizes on | Robust to small N? | Typical failure mode |
|---|---|---|---|
| Point-estimate % risk | E_hat | No | Overconfidence after hot streaks |
| Lower-CI shrinkage | E_hat − t·s/√N | Yes | Slow scaling |
| Full Kelly | f* on E_hat | No | Catastrophic when E_hat is wrong |
| Fractional Kelly (½) | f*/2 on E_hat | Partial | Still uses point estimate |
| Bayesian posterior | posterior mean of E | Yes (with prior) | Sensitive to prior choice |
| Trade Count (Sample Size) | Recommended Max Risk per Trade | Why |
|---|---|---|
| 0–30 trades | 0.25%–0.5% | At N=20 the lower CI is typically negative — point estimate is noise. |
| 30–75 trades | 0.5%–0.75% | Lower CI starts to stabilize, but still ~10–25% of point estimate. |
| 75–150 trades | 1.0% | Lower CI typically reaches ~30–50% of point estimate. |
| 150+ trades, stable stats | 1.25%–1.5% (advanced only) | Lower CI is a meaningful fraction of point estimate; regime stability still required. |
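The sample-size table can be encoded as a simple lookup. This is a sketch with two labeled assumptions that the table does not settle: a count sitting exactly on a boundary (30, 75, 150) is assigned to the higher tier, and 150+ trades without stable stats stays at the 1.0% tier. The function name `max_risk_range_pct` is hypothetical.

```python
from typing import Tuple


def max_risk_range_pct(trade_count: int, stable_stats: bool = False) -> Tuple[float, float]:
    """Return the (low, high) recommended max risk per trade, in percent."""
    if trade_count < 0:
        raise ValueError("trade_count cannot be negative")
    if trade_count < 30:
        return (0.25, 0.5)
    if trade_count < 75:
        return (0.5, 0.75)
    if trade_count < 150 or not stable_stats:
        # Assumption: without stable stats, a large sample alone does not unlock the top tier.
        return (1.0, 1.0)
    return (1.25, 1.5)


for count, stable in [(20, False), (50, False), (100, False), (200, False), (200, True)]:
    print(count, stable, max_risk_range_pct(count, stable))
```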
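Scaling tip: Only increase size if the lower bound of your 95% CI on expectancy is positive and trending upward, not just because the point estimate held.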
Common mistake: “I held +0.6R EV across 50 trades, so I doubled size.” Compute the lower 95% CI bound. If it moved from 0.05R to 0.30R, that's modest evidence — bump risk by 20%, not 100%.
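The CI is a model. Like any model, it ships with assumptions that fail on real trade returns: it assumes normally distributed returns and independent, identically distributed (i.i.d.) trades, while real returns are fat-tailed.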
Use a bootstrap CI on your live P&L curve as a sanity check, and cross-check with risk of ruin on the lower-bound EV.
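One way to run that sanity check is a percentile bootstrap on per-trade returns. This is a minimal sketch, not a definitive implementation: the function name, the 5,000 resamples, the fixed seed, and the sample data are all illustrative assumptions. Note that resampling individual trades still treats them as independent; if your trades are serially correlated, a block bootstrap would be the more appropriate variant.

```python
import random
import statistics
from typing import Sequence, Tuple


def bootstrap_mean_ci(
    returns_r: Sequence[float],
    n_resamples: int = 5000,
    confidence: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for mean expectancy, in R."""
    if len(returns_r) < 2:
        raise ValueError("need at least two trades")
    rng = random.Random(seed)             # fixed seed so the result is reproducible
    n = len(returns_r)
    # Resample the trade log with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(returns_r, k=n)) for _ in range(n_resamples)
    )
    alpha = (1 - confidence) / 2
    low_idx = int(alpha * n_resamples)
    high_idx = max(low_idx, int((1 - alpha) * n_resamples) - 1)
    return means[low_idx], means[high_idx]


# Hypothetical trade log in R multiples (made-up numbers, for illustration only).
trades = [1.5, -1.0, 2.0, -1.0, -1.0, 3.0, -1.0, 0.5, -1.0, 2.5]
low, high = bootstrap_mean_ci(trades)
print(f"bootstrap 95% CI for EV: [{low:.2f}R, {high:.2f}R]")
```

If the bootstrap lower bound sits well below the t-based lower bound, treat the more pessimistic number as your planning edge.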
You wouldn't bet millions on 10 trades. Don't risk a large % of your capital based on a small data set.
Use confidence intervals to protect yourself from overconfidence.
Real edge is the lower bound of your CI, not the average of your trade log. Everything above the lower bound is hope.
Confidence intervals give you:
Let your system earn the right to scale.
Next: Optimal Withdrawal & Growth Strategy — once your lower CI bound is positive and stable, Kelly tells you the upper bound on size.
A confidence interval (CI) is the statistical range of plausible true values for a metric (win rate, expectancy, or Sharpe ratio) given your sample of trades. A 95% CI means: if you re-ran the same number of trades many times and computed an interval each time, about 95% of those intervals would contain the true value.
The lesson's sample-size guidance: 0–30 trades cap risk at 0.25–0.5%; 30–75 at 0.5–0.75%; 75–150 at 1.0%; 150+ with stable stats at 1.25–1.5%. The deeper rule: scale only once the lower bound of your 95% CI on expectancy is positive and trending upward, not just because the point estimate held.
Size on the lower bound, not the point estimate. Position sizing built on the point estimate (the average) is hot-streak escalation in disguise: upside surprises don't blow up accounts, downside surprises do. Plan as if your edge equals the lower CI bound, then let live evidence raise that bound before you scale.
With a small sample, the confidence interval around your edge is wide, so the true expectancy could be much lower than the recent run suggests. Sizing up on a streak treats noise as evidence — the lower bound of your CI tells you how much of that recent performance is statistically real.
The standard t/z CI assumes normality and i.i.d. trades, which trade returns violate — so the CI is optimistic in real conditions. Use it as a first-pass filter, then sanity-check with a bootstrap CI computed directly on your live P&L curve, and pair with risk-of-ruin and CVaR analysis for tail exposure.