r/SMA200

New /learn article: why your synthetic LETF backtest probably overstates returns by ~60%

Just published https://sma200.trade/learn/leveraged-etf-borrow-cost

Background: I posted backtest numbers in r/LETFs last week with synthetic TQQQ going back to 1999. A commenter asked how my numbers compared to Testfolio. They didn't, by a lot. Turns out I had been using the simplified L × daily − ER/252 formula that almost every Reddit/forum synthetic LETF backtest uses, and it omits the daily borrow cost real leveraged ETFs pay on the swap-financing of their leveraged exposure.

Calibration vs real TQQQ over 2015-2024:

  • Real TQQQ (ProShares fund): 20.4× wealth multiple
  • Simple-formula synthetic: 33.0× (+62% overshoot)
  • Borrow-modeled synthetic (^IRX + 40 bps spread): 21.5× (about 5% above real)
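
For reference, a minimal sketch of the two daily-return formulas being compared. The simple form is the L × daily − ER/252 formula from above. The borrow term assumes the fund finances the (L − 1) portion of its exposure at the T-bill yield plus the spread; that is an assumption for illustration, not a formula quoted from the article or the package. Function names and the example inputs are hypothetical.

```python
TRADING_DAYS = 252

def simple_letf_return(daily, leverage, expense_ratio):
    """Simplified synthetic LETF daily return: L * daily - ER/252."""
    return leverage * daily - expense_ratio / TRADING_DAYS

def borrow_modeled_letf_return(daily, tbill_yield, leverage, expense_ratio,
                               spread=0.0040):
    """Simple formula minus financing on the borrowed exposure.

    tbill_yield is an annualized decimal (e.g. an ^IRX quote divided by 100).
    Assumption: financing cost = (L - 1) * (tbill_yield + spread) / 252.
    """
    borrow = (leverage - 1.0) * (tbill_yield + spread) / TRADING_DAYS
    return simple_letf_return(daily, leverage, expense_ratio) - borrow

def wealth_multiple(returns):
    """Compound a sequence of daily returns into a terminal wealth multiple."""
    wealth = 1.0
    for r in returns:
        wealth *= 1.0 + r
    return wealth
```

Under this assumption, a 5% T-bill yield plus the 40 bps spread costs a 3× fund 2 × 5.4% = 10.8% a year that the simple formula never charges, which is why the gap compounds so quickly in high-rate periods.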

The article walks through:

  • The exact missing formula term and why every retail synthetic LETF backtest ignores it
  • The calibration proof against real TQQQ
  • Corrected 27-year synthetic TQQQ: B&H CAGR drops from 8.46% to 3.52%, SMA200 filter Sharpe edge drops from +0.141 to +0.019
  • The refined framing: SMA200 filter's value scales with how much equity-vol-decay is in your portfolio. +0.238 Sharpe lift on a 75/25 UPRO/UGL portfolio, +0.019 on a 25/75. The filter is a portfolio construction tool, not a single-asset rescue.

Also open-sourced the harness as a standalone Python package: https://github.com/prismlfx/sma200-bt

MIT licensed, pip-installable, calibrated against real TQQQ, six pytest cases pinning the formula. Anyone can verify the article's numbers in about 10 lines of code (README walks through it).

This is also the public correction to the Reddit numbers I posted last week. Better to be wrong loudly and corrected publicly than confidently wrong forever.

u/printoninja — 3 days ago

I backtested the 200-day SMA on SOXL, TQQQ, and UPRO from inception. The Sharpe result wasn't what I expected.

A few days ago I posted a primer on the 200-day SMA strategy for LETFs (link below). In that post I said filtered strategies "typically run 0.3 to 0.7 higher in Sharpe than buy-and-hold." That was the general community claim. I went back and re-ran the numbers on my own backtest harness to check, and the data doesn't support it, at least not on this window. This is the correction post.

It turned out to be more interesting than the original framing.

The setup

Pure mechanical SMA200 strategy:

  • Long the LETF when daily close > 200-day SMA
  • 100% cash when below
  • Signal computed at close, position taken at next bar's close (no lookahead)
  • No confluence indicators (no MACD, RSI, buffer band, anti-whipsaw block)
  • 1 basis point per side transaction costs
  • Total-return data (yfinance auto_adjust, dividends reinvested)
  • From inception of each LETF through 2026-05-15

Comparison: buy-and-hold the same LETF over the same window. Same data, same costs.
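
The rules above can be sketched in a few lines of plain Python. This is a simplified illustration, not the actual harness: the function name is hypothetical, cash is assumed to earn 0%, and costs are subtracted additively on the day after each trade.

```python
def sma_filter_returns(closes, window=200, cost_per_side=0.0001):
    """Daily strategy returns for a long-or-cash SMA filter.

    signal[t] = 1 if closes[t] > SMA(window) ending at t, else 0.
    The signal from close t is traded at close t + 1, so it first earns
    the return from close t + 1 to close t + 2 (no lookahead).
    Assumptions: cash earns 0%, and the strategy sits in cash during warmup.
    """
    n = len(closes)
    signal = [0] * n
    running = 0.0
    for t in range(n):
        running += closes[t]
        if t >= window:
            running -= closes[t - window]  # keep a rolling sum of `window` closes
        if t >= window - 1:
            signal[t] = 1 if closes[t] > running / window else 0

    out = []
    prev_pos = 0
    for i in range(1, n):
        # Position held over the return into close i was set at close i - 1
        # using the signal computed at close i - 2.
        pos = signal[i - 2] if i >= 2 else 0
        day_ret = closes[i] / closes[i - 1] - 1.0
        cost = cost_per_side * abs(pos - prev_pos)  # 1 bp per side by default
        out.append(pos * day_ret - cost)
        prev_pos = pos
    return out
```

Buy-and-hold over the same window is just the list of `closes[i] / closes[i - 1] - 1.0` values, so both legs run off identical data.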

The numbers

SOXL (since 2010-03-11, ~16.2 years)

| Metric | Buy-and-hold | SMA200 filter |
|---|---|---|
| CAGR | 41.5% | 28.5% |
| Max drawdown | -90.5% | -69.7% |
| Volatility (ann.) | 89.3% | 59.5% |
| Sharpe (rf=0) | 0.839 | 0.723 |
| Sortino | 1.182 | 0.794 |
| Trades per year | n/a | 3.3 |
| Time in market | 100% | 63.2% |

TQQQ (since 2010-02-11, ~16.2 years)

| Metric | Buy-and-hold | SMA200 filter |
|---|---|---|
| CAGR | 43.8% | 30.3% |
| Max drawdown | -81.7% | -50.0% |
| Volatility (ann.) | 61.0% | 42.1% |
| Sharpe (rf=0) | 0.905 | 0.842 |
| Sortino | 1.165 | 0.923 |
| Trades per year | n/a | 2.7 |
| Time in market | 100% | 74.3% |

UPRO (since 2009-06-25, ~16.9 years)

| Metric | Buy-and-hold | SMA200 filter |
|---|---|---|
| CAGR | 33.1% | 16.8% |
| Max drawdown | -76.8% | -56.7% |
| Volatility (ann.) | 51.3% | 31.7% |
| Sharpe (rf=0) | 0.817 | 0.651 |
| Sortino | 1.009 | 0.706 |
| Trades per year | n/a | 2.9 |
| Time in market | 100% | 73.3% |
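
For anyone wanting to reproduce the metric rows, here is a rough sketch of one common set of definitions. The exact conventions of the harness are not spelled out in this post, so treat these as assumptions: Sharpe and Sortino use rf=0, volatility uses the sample standard deviation, and downside deviation is measured against a 0% target over all days. The function name is hypothetical.

```python
import math

def performance_metrics(returns, periods_per_year=252):
    """CAGR, max drawdown, annualized vol, Sharpe and Sortino (rf = 0)."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two returns")

    wealth, peak, max_dd = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        max_dd = min(max_dd, wealth / peak - 1.0)  # most negative peak-to-trough

    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    # Downside deviation vs a 0% target, averaged over all days (assumption).
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in returns) / n)

    ann = math.sqrt(periods_per_year)
    vol = math.sqrt(var) * ann
    ann_mean = mean * periods_per_year
    return {
        "cagr": wealth ** (periods_per_year / n) - 1.0,
        "max_drawdown": max_dd,
        "volatility": vol,
        "sharpe": ann_mean / vol if vol > 0 else float("nan"),
        "sortino": ann_mean / (downside * ann) if downside > 0 else float("nan"),
    }
```

Different Sortino conventions (for example, averaging only over down days) will shift that row noticeably, so compare like with like before citing a difference.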

What the data actually says

Three things stand out:

1. Pure SMA200 filtering underperforms buy-and-hold on Sharpe across all three LETFs.

Sharpe differences (filter minus B&H):

  • SOXL: -0.116
  • TQQQ: -0.063
  • UPRO: -0.166

Small but consistent in the same direction. The conventional "filter improves Sharpe" claim doesn't hold across these tickers over this window. This is the part I had wrong in Post 1.

2. The filter consistently slashes max drawdown.

  • SOXL: -90% → -70% (20 pp better)
  • TQQQ: -82% → -50% (32 pp better)
  • UPRO: -77% → -57% (20 pp better)

A 90% drawdown on SOXL is the textbook "wiped out" outcome. Going from -90% to -70% is the difference between destroyed and badly bruised. On TQQQ the filter's -50% is rough but recoverable: most retail holders would survive it psychologically. The -82% on buy-and-hold is much less clear.

3. The CAGR cost is real and large.

Buy-and-hold absolutely crushes the filter on raw return:

  • SOXL: 41.5% vs 28.5% (-13 pp)
  • TQQQ: 43.8% vs 30.3% (-13.5 pp)
  • UPRO: 33.1% vs 16.8% (-16.2 pp)

The filter strategy still turned $10k into $577k on SOXL, $735k on TQQQ, $138k on UPRO over the windows. Just nowhere near what B&H did, and B&H benefited from the entire 17-year run being one of the strongest bull markets in equity history.
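
As a quick sanity check on those dollar figures, compounding the rounded filter CAGRs over the rounded window lengths gets very close to them. This is only a sketch; the helper name is hypothetical, and because the inputs are rounded the results will not match to the dollar.

```python
def terminal_value(start, cagr, years):
    """Grow `start` at a constant annual rate for `years` years."""
    return start * (1.0 + cagr) ** years

# Rounded SMA200-filter CAGRs and window lengths from the tables above.
windows = {"SOXL": (0.285, 16.2), "TQQQ": (0.303, 16.2), "UPRO": (0.168, 16.9)}

for ticker, (cagr, years) in windows.items():
    print(ticker, round(terminal_value(10_000, cagr, years)))
```

Each result lands within a percent or two of the quoted $577k, $735k and $138k.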

Why the filter case is actually behavioral, not mathematical

Here's what the Sharpe number doesn't capture: backtests assume you would hold through a 90% drawdown. In practice, people don't. They sell at the bottom, re-enter after recovery is confirmed, and capture a much worse return than the backtest implies.

The 200-day SMA filter is risk management, not return optimization. Its real value is that you can actually execute it without ego-collapsing during a bear. You exit at -10 to -20% from the top, not at -90%. That's psychologically holdable. Buy-and-hold of a 3x LETF, for most people, is not.

So when someone asks "does the SMA200 filter work on LETFs?" the honest answer is:

  • If you're a backtest robot with infinite holding capacity: it underperforms B&H on Sharpe over this specific window.
  • If you're a human with money you'd panic-sell at -70%: it probably saves you from yourself, which is the whole point.

Caveats worth naming

Don't over-cite these numbers without flagging:

Sample bias. 2009-2026 contains one of the strongest equity bull runs in modern history, plus two unusually V-shaped drawdowns (2020, 2022). A different sample, e.g. one weighted toward the 2000-2012 NDX lost decade, would tilt this comparison meaningfully toward the filter. Bear-heavy regimes are where the filter would dominate.

The 10-month SMA warmup. The filter is forced into cash during the first 200 trading days of each LETF's data (no SMA exists yet). Skipping this handicap lifts filter CAGR by 1-3 pp across these names without materially changing the Sharpe story.

Pre-tax. Filter generates ~3 short-term capital gains events per year. B&H generates zero until exit. After-tax results widen further in favor of B&H for taxable accounts.

No regime decomposition. A bull-vs-bear breakdown would show the filter winning hands-down in 2022 and underperforming materially in 2017/2019/2023. Inception-to-now totals smear that out.
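
A calendar-year split is the simplest version of that breakdown. A minimal sketch, assuming you already have aligned lists of dates and daily returns for both legs (function names are hypothetical, and calendar years are only a crude stand-in for real bull/bear regimes):

```python
from collections import defaultdict
from datetime import date

def yearly_returns(dates, returns):
    """Compound daily returns into calendar-year returns."""
    growth = defaultdict(lambda: 1.0)
    for d, r in zip(dates, returns):
        growth[d.year] *= 1.0 + r
    return {year: g - 1.0 for year, g in sorted(growth.items())}

def yearly_edge(dates, filter_returns, bh_returns):
    """Filter return minus buy-and-hold return, per calendar year."""
    f = yearly_returns(dates, filter_returns)
    b = yearly_returns(dates, bh_returns)
    return {year: f[year] - b[year] for year in f}

# Toy example: filter sits out a falling year, then lags in a rising one.
dates = [date(2022, 1, 3), date(2022, 1, 4), date(2023, 1, 3)]
edge = yearly_edge(dates, [0.0, 0.0, 0.1], [-0.1, -0.1, 0.2])
```

A positive entry means the filter beat buy-and-hold that year; a negative one means it lagged.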

One bonus finding worth flagging

For UPRO specifically, computing the SMA200 on SPY (the underlying) and applying it to UPRO position works meaningfully better than computing the SMA200 directly on UPRO:

| Metric | Direct SMA on UPRO | SPY SMA → UPRO |
|---|---|---|
| CAGR | 16.8% | |
| Max drawdown | -56.7% | |
| Sharpe | 0.651 | |

That's 8 percentage points of CAGR for free, plus a slightly better drawdown. Sharpe on the underlying-signal version basically matches buy-and-hold (-0.016 difference), but with the same drawdown protection.

For TQQQ, the QQQ-based signal is only marginally better than direct SMA on TQQQ (+0.007 Sharpe).

For SOXL, the SOXX-based signal is actually slightly worse (-0.049 Sharpe vs direct SMA on SOXL).

So the "track the underlying" claim that gets repeated in LETF communities has real evidence for UPRO/SPY, weak evidence for TQQQ/QQQ, and basically no evidence for SOXL/SOXX over this window. Worth its own post. I'll get to it next week.
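
Mechanically, the only change in the underlying-signal variant is which price series feeds the SMA. A minimal sketch, assuming both series are aligned on the same trading dates, cash earns 0%, and execution follows the same signal-at-close, trade-at-next-close rule as the setup (function names are hypothetical):

```python
def sma_signal(closes, window=200):
    """1 where close > trailing SMA(window), else 0 (0 during warmup)."""
    signal, running = [], 0.0
    for t, c in enumerate(closes):
        running += c
        if t >= window:
            running -= closes[t - window]
        ready = t >= window - 1
        signal.append(1 if ready and c > running / window else 0)
    return signal

def apply_signal(signal, traded_closes, cost_per_side=0.0001):
    """Trade `traded_closes` on a signal that may come from another series.

    The signal from close t is executed at close t + 1, so it first earns
    the return from close t + 1 to close t + 2.
    """
    if len(signal) != len(traded_closes):
        raise ValueError("signal and price series must be aligned")
    out, prev_pos = [], 0
    for i in range(1, len(traded_closes)):
        pos = signal[i - 2] if i >= 2 else 0
        day_ret = traded_closes[i] / traded_closes[i - 1] - 1.0
        out.append(pos * day_ret - cost_per_side * abs(pos - prev_pos))
        prev_pos = pos
    return out
```

With aligned lists `spy_closes` and `upro_closes`, `apply_signal(sma_signal(spy_closes), upro_closes)` gives the underlying-signal version and `apply_signal(sma_signal(upro_closes), upro_closes)` gives the direct version.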

The honest takeaway

The 200-day SMA on LETFs is not a Sharpe-improver over very long bull-heavy windows. It's a drawdown manager that costs real CAGR. The pitch is "you can sleep at night and you won't panic-sell at the bottom," not "you'll outperform B&H."

Whether that's worth it depends on whether you'd actually hold a 3x LETF through -82%. If you would: you don't need this. If you wouldn't: this is the move.

I'm working on a Python notebook version of this backtest harness for GitHub so anyone can verify or critique the methodology directly. Will link when it's up.

Has this changed how you think about it? Or were you already running it as a drawdown filter rather than a return enhancer? Curious where the community shakes out.

u/printoninja — 4 days ago