Skewness & Kurtosis
9 min read
Go beyond mean and variance to examine the asymmetry and tail thickness of return distributions and why they matter for risk.
Skewness and kurtosis are the third and fourth standardized moments of a distribution. Skewness measures asymmetry — whether your big outcomes cluster on the win side or the loss side. Kurtosis measures tail thickness — how often extreme outcomes occur. Together they determine whether a strategy with a clean Sharpe ratio is actually safe or quietly carries blow-up risk.
Not all trading results follow a normal distribution — and ignoring this fact can ruin even statistically sound strategies.
Prerequisite: Variance & Standard Deviation — you need to be comfortable with σ before adding the third and fourth moments.
Next: Monte Carlo Simulations — the practical tool for stress-testing strategies whose PnL is not Gaussian.
You’ve built a system.
But then:
What happened?
It’s not randomness. It’s skewness and kurtosis — critical statistical traits of your trade distribution that tell you how your edge plays out over time.
Skewness = E[((X−μ)/σ)^3]. It measures the asymmetry of a distribution. Positive skew → right tail dominates (rare big wins); negative skew → left tail dominates (rare big losses). For reference: S&P 500 daily returns ≈ −0.5 to −1.0; BTC daily ≈ −0.2 to +0.5 depending on regime.
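As a minimal sketch of the definition above, the standardized third moment can be computed directly. This is the population form (dividing by n); library functions such as pandas' `skew()` apply a small-sample bias correction, so their numbers differ slightly on short series. The three example lists are made-up R-multiples, not real trade data.

```python
import math

def skewness(xs):
    """Population skewness: the mean of ((x - mu) / sigma) ** 3."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    return sum(((x - mu) / sigma) ** 3 for x in xs) / n

# Hypothetical outcomes, for illustration only.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
left_tailed = [1.0, 1.0, 1.0, 1.0, -6.0]     # small wins, one big loss
right_tailed = [-1.0, -1.0, -1.0, -1.0, 6.0]  # small losses, one big win

for name, data in [("symmetric", symmetric),
                   ("left_tailed", left_tailed),
                   ("right_tailed", right_tailed)]:
    print(f"{name}: skew = {skewness(data):+.2f}")
```

The sign is the useful part: the list with one large loss comes out negative, and its mirror image comes out positive by the same amount.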
Skew and excess kurtosis of common return series. Sigma-based risk math degrades as excess kurt grows.
| Asset / series | Daily skewness | Daily excess kurtosis | Sigma-based risk model holds? |
|---|---|---|---|
| T-bills (daily) | approx 0 | approx 0 | Yes |
| S&P 500 (daily) | -0.5 to -1.0 | 5 to 10 | Approximately |
| BTC (daily) | -0.2 to +0.5 | above 20 | No |
| Single-name stress | varies | above 5 | No |
Two distributions matter here, and they are different: (1) the return distribution of the asset, and (2) the PnL distribution of your strategy. Skewness applies to both, but the sign can flip between them — a long-vol strategy on a negatively-skewed asset typically produces positively-skewed PnL.
In trading terms:
Most retail systems are negatively skewed — they feel good (frequent wins), but blow up occasionally.
| Type | Skew | Example |
|---|---|---|
| Scalping | Negative | 80% win rate, but one –5R loss ruins a week |
| Trend-following | Positive | 30% win rate, but occasional +6R or +10R wins |
| Martingale | Very Negative | Many small wins, occasional total wipeout |
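The contrast in the table can be illustrated with two made-up trade logs. The R-multiples and counts below are hypothetical, chosen only to match the win rates in the table, and the sketch assumes pandas is installed.

```python
import pandas as pd

# Hypothetical results in R-multiples, 100 trades each (illustrative numbers).
scalper = pd.Series([0.5] * 80 + [-1.0] * 17 + [-5.0] * 3)   # 80% win rate
trend = pd.Series([-1.0] * 70 + [2.0] * 25 + [8.0] * 5)      # 30% win rate

for name, trades in [("scalper", scalper), ("trend", trend)]:
    win_rate = (trades > 0).mean()
    print(f"{name}: win rate {win_rate:.0%}, "
          f"mean {trades.mean():+.2f}R, skew {trades.skew():+.2f}")
```

Both logs have a positive average R, yet the skew has opposite signs: the high-win-rate log is negatively skewed and the low-win-rate log is positively skewed.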
Kurtosis = E[((X−μ)/σ)^4]. The normal distribution has kurtosis = 3, so analysts report excess kurtosis = kurt − 3. Excess > 0 = leptokurtic (fat tails). Real-world: S&P daily ≈ 5–10 excess; BTC daily can exceed 20. Note (Westfall 2014): kurtosis is about tails, not 'peakedness' — that older textbook description is wrong.
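A minimal sketch of the formula above, again in population form, shows how strongly a couple of extreme observations move the number. The two series are hypothetical.

```python
def excess_kurtosis(xs):
    """Population kurtosis E[((x - mu) / sigma) ** 4], minus 3."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    fourth = sum((x - mu) ** 4 for x in xs) / n
    return fourth / var ** 2 - 3.0

# Hypothetical series: 100 outcomes of +/-1, then the same with two
# of them replaced by +/-10 outliers.
calm = [1.0, -1.0] * 50
with_outliers = [1.0, -1.0] * 49 + [10.0, -10.0]

print(f"calm:          {excess_kurtosis(calm):+.2f}")
print(f"with outliers: {excess_kurtosis(with_outliers):+.2f}")
```

Changing 2 of 100 observations is enough to move the series from thin-tailed (negative excess kurtosis) to well above the fat-tail threshold of 5, because deviations enter at the fourth power.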
In trading:
Strategies with high kurtosis carry tail risk — rare, extreme outcomes that matter more than they should. Caveat: in genuinely fat-tailed regimes (Pareto-like, which crypto and stressed equity tails approximate), the sample kurtosis is itself an unreliable estimator — the true value can be effectively infinite. Treat any historical kurt number as a lower bound on your actual tail risk.
Most trading books assume a bell curve (Gaussian distribution) — where:
But in reality:
Theoretical 5-sigma down-day frequency under a normal distribution.
Roughly 7,000x more often than the Gaussian model predicts.
Tail risk roughly 80,000x worse than Gaussian expectation.
Long-Term Capital Management's sigma-based risk model labeled the Russian default a 6-sigma event, "once in the lifetime of the universe". The fund had been trading for about 4 years when it happened. Sigma-only risk models routinely under-count fat-tail events by 2 to 3 orders of magnitude.
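For reference, the Gaussian side of these comparisons is easy to reproduce. The sketch below computes the probability of a k-sigma down day under a normal distribution and converts it to an expected waiting time, assuming 252 trading days per year (an assumption, not a figure from this lesson). It says nothing about empirical frequencies, which have to come from your own data.

```python
import math

TRADING_DAYS_PER_YEAR = 252  # assumption used for the waiting-time conversion

def normal_lower_tail(k):
    """P(Z <= -k) for a standard normal Z."""
    return 0.5 * math.erfc(k / math.sqrt(2))

for k in (3, 4, 5, 6):
    p = normal_lower_tail(k)
    years = 1 / (p * TRADING_DAYS_PER_YEAR)
    print(f"{k}-sigma down day: p = {p:.2e}, "
          f"about one every {years:,.0f} years")
```

Comparing these theoretical counts with how many such days actually appear in a return history gives a direct read on how far the series is from Gaussian.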
If you only evaluate based on short-term win rate, you will misjudge the system.
Translate skew into strategy class: short-vol strategies (selling premium, mean-reversion, martingale-flavored) generate negative skew and look good on a Sharpe sheet — they're picking up nickels in front of a steamroller. Long-vol strategies (trend, breakout, long-tail option buys) generate positive skew and look bad on a Sharpe sheet but survive regime changes. Same Sharpe ≠ same survival.
Rule of thumb: excess kurtosis < 1 → near-Gaussian, full Kelly is roughly safe. 1–5 → leptokurtic, use ½-Kelly. > 5 → fat-tailed (most crypto strategies live here), cap at ¼-Kelly and stress-test with the worst observed drawdown ×2.
| Excess kurtosis | Distribution name | Real-world example | Sizing implication |
|---|---|---|---|
| ≈ 0 | Mesokurtic (Gaussian) | T-bills daily | Full Kelly OK |
| 1–5 | Mildly leptokurtic | SPX daily | ½ Kelly |
| > 5 | Fat-tailed | BTC daily, single-name stress | ¼ Kelly + stress test |
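The sizing rule of thumb can be written as a small lookup. This is only a sketch of the thresholds stated above; the function names are hypothetical, and `full_kelly` is whatever Kelly fraction you have already estimated by other means.

```python
def kelly_multiplier(excess_kurt):
    """Fraction of full Kelly suggested by the rule of thumb above."""
    if excess_kurt < 1:
        return 1.0    # near-Gaussian: full Kelly is roughly safe
    if excess_kurt <= 5:
        return 0.5    # leptokurtic: half Kelly
    return 0.25       # fat-tailed: cap at quarter Kelly

def sized_fraction(full_kelly, excess_kurt):
    """Scale an already-estimated full-Kelly fraction by tail thickness."""
    return full_kelly * kelly_multiplier(excess_kurt)

def stress_drawdown(worst_observed_drawdown):
    """Stress scenario from the rule of thumb: worst observed drawdown x2."""
    return 2.0 * worst_observed_drawdown

# Hypothetical inputs: 20% full Kelly, excess kurtosis of 20, 30% worst drawdown.
print(sized_fraction(0.20, 20.0))
print(stress_drawdown(0.30))
```

Keeping the thresholds in one function makes it easy to apply the same rule to every strategy and to tighten it later if your own data calls for it.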
Don’t use the same risk model across all strategies — match sizing to distribution shape, not just EV.
Python: returns.skew() and returns.kurt() (pandas reports excess kurtosis by default)
Excel: SKEW(range) and KURT(range) (Excel's KURT also returns excess kurtosis)
Sort trades by R, plot a histogram, and overlay a normal curve with the same μ and σ. A normal distribution puts only about 0.27% of outcomes beyond ±3σ, so if your empirical bars out there sit well above the curve, you're leptokurtic.
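A quick way to confirm that the pandas calls report excess kurtosis is to run them on simulated Gaussian data, where both numbers should land near 0. The series below is synthetic (the seed, size, and 1% daily volatility are arbitrary assumptions). With real data you would pass your own return series instead.

```python
import numpy as np
import pandas as pd

# Synthetic Gaussian "returns"; parameters are arbitrary and for illustration only.
rng = np.random.default_rng(42)
returns = pd.Series(rng.normal(loc=0.0, scale=0.01, size=100_000))

skew = returns.skew()          # sample skewness (bias-corrected)
excess_kurt = returns.kurt()   # excess kurtosis: a normal sample scores near 0
raw_kurt = excess_kurt + 3.0   # add 3 back if a tool expects raw kurtosis

print(f"skew = {skew:+.3f}, excess kurt = {excess_kurt:+.3f}, raw kurt = {raw_kurt:.3f}")
```

If a normal sample returned something near 3 instead of 0, the function would be reporting raw kurtosis, and the thresholds in this lesson would need shifting by 3.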
Look for:
Frequent small gains or losses
Rare outliers
Long “tails” of performance
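The visual check can be backed by a simple count: what share of trades lands more than 3σ from the mean, compared with the share a normal distribution would put there. The trade list below is hypothetical, and the function names are illustrative.

```python
import math
import statistics

def tail_share(trades, k=3.0):
    """Fraction of trades more than k standard deviations from the mean."""
    mu = statistics.fmean(trades)
    sigma = statistics.pstdev(trades)
    outside = sum(1 for x in trades if abs(x - mu) > k * sigma)
    return outside / len(trades)

def normal_tail_share(k=3.0):
    """Two-sided share beyond +/- k sigma under a normal distribution."""
    return math.erfc(k / math.sqrt(2))

# Hypothetical log in R-multiples: many small results plus a few outliers.
trades = [0.1] * 480 + [-0.1] * 480 + [3.0] * 20 + [-3.0] * 20

print(f"empirical share beyond 3 sigma: {tail_share(trades):.2%}")
print(f"normal share beyond 3 sigma:    {normal_tail_share():.2%}")
```

An empirical share several times the normal share is the numeric version of histogram bars poking out past the overlaid curve.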
Know what game you’re playing — and whether it fits your psychology and capital constraints.
Your edge doesn’t just live in the average. It lives in the shape of your results — and how you handle the extremes.
Ignoring skew and kurtosis is like flying without knowing your plane's stall speed. Once you've measured skew and kurt and know your distribution is non-Gaussian, the analytical formulas for VaR, drawdown, and Kelly all break. The standard fix is simulation — which is the subject of the next lesson.
Two systems with identical Sharpe ratios can have very different ruin probabilities. The difference lives in the third and fourth moments. Measure them, or stop pretending you know your risk.
Excess kurtosis = kurtosis − 3. Subtracting 3 makes the normal distribution the zero baseline, so positive excess kurtosis means fatter tails than normal. Pandas' kurt() and Excel's KURT() both return excess by default.
Any excess kurtosis above 0 is leptokurtic by definition. As working thresholds, > 1 is noticeably leptokurtic, > 5 is meaningfully fat, and > 10 is the regime where σ-based risk math fails. Crypto often sits in the 10–50 band on daily returns, while SPX daily typically lands at 5–10.
Equity index daily returns (S&P, NDX) are typically negatively skewed (≈ −0.5 to −1.0). Individual stocks and most cryptocurrencies vary by regime — the sign is not a constant.
Full Kelly is not safe once tails get fat, because it assumes thin tails. For excess kurtosis 1–5, use ½-Kelly; for > 5 (most crypto strategies), cap at ¼-Kelly and stress-test with the worst observed drawdown ×2.
Use =SKEW(range) for sample skewness and =KURT(range) for excess kurtosis. Excel's KURT returns excess (not raw) kurtosis, so the normal distribution scores 0, not 3.