u/BacktestingArena

Grid Bot on BTC, 16 months, 297 trades: beat USD Buy & Hold by 18.66 pp over a period when B&H got crushed. But the range was set with hindsight.

Strategy Backtest

TL;DR: Ran an arithmetic grid bot on BTCUSDT, 70k–150k range, 30 grids, Jan 2025 to May 2026 (~16 months). Final return: +2.16% on $1,000 starting capital, 297 trades, 0.77% in fees. Buy & Hold over the same period: -16.51%. So the grid "outperformed" B&H by +18.66 percentage points — but the entire result hinges on a range I picked in hindsight. The honest test isn't "did it work" — it's "would anyone have set 70k–150k on January 1, 2025 in real life?"

This post is about what a grid bot actually does well, where it fails, and why outperformance numbers vs B&H are misleading when you cherry-pick the range.

The setup

  • Pair: BTCUSDT
  • Period: 2025-01-01 to 2026-05-17 (~501 active days)
  • Starting capital: $1,000
  • Grid type: Arithmetic (equal price spacing)
  • Lower bound: $70,000
  • Upper bound: $150,000
  • Grid count: 30
  • Grid interval: $2,666.67 per grid level
  • Profit per grid: 1.66% – 3.66% (top cell to bottom cell, net of the two 0.075% fees per round trip)
  • Fees: 0.075% per trade (KuCoin taker rate)
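
The interval and profit-per-grid figures in this list can be checked by hand. A minimal sketch, assuming the round-trip fee is modelled as a flat 2 x 0.075% subtracted from the gross spacing (the function name and that fee treatment are my assumptions, not something the post spells out):

```python
LOWER, UPPER, GRIDS = 70_000.0, 150_000.0, 30
FEE = 0.00075  # 0.075% per trade, as listed in the setup

# Arithmetic grid: equal price spacing between adjacent levels.
step = (UPPER - LOWER) / GRIDS


def net_profit_pct(buy_price: float) -> float:
    """Approximate net profit (in %) of buying at buy_price and selling one step higher.

    Assumption: fees are modelled as a flat 2 * FEE on the round trip.
    """
    gross = step / buy_price
    return (gross - 2 * FEE) * 100


best = net_profit_pct(LOWER)          # lowest cell: buy at 70,000
worst = net_profit_pct(UPPER - step)  # highest cell: sell at 150,000

print(f"grid interval: ${step:,.2f}")
print(f"net profit per grid: {worst:.2f}% to {best:.2f}%")
```

Under that assumption the numbers land on the listed $2,666.67 interval and the 1.66% to 3.66% profit range. The same dollar spacing is a smaller percentage at higher prices, which is why the range of profits is so wide.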

The headline numbers

Metric          | Grid                          | Buy & Hold
Final Value     | $1,021.56                     | $834.87 (implied from -16.51%)
Total Return    | +2.16%                        | -16.51%
CAGR            | +1.57% p.a.                   | -12.32% p.a.
Trades          | 297                           | 1
Fees            | $7.69 (0.77% of capital)      | -
Final balance   | 0.007965 BTC + $398.73 USDT   | -
Outperformance  | +18.66 pp                     | -
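
For anyone who wants to verify the CAGR row, here is a small sketch that annualizes the two final values over the 501-day window. It assumes a 365-day year and plain compounding; `cagr_pct` is a name I made up for illustration.

```python
DAYS = 501            # 2025-01-01 to 2026-05-17
START = 1_000.00      # starting capital
GRID_FINAL = 1_021.56
BH_FINAL = 834.87


def cagr_pct(start: float, final: float, days: int, days_per_year: int = 365) -> float:
    """Compound annual growth rate in percent (assumes a 365-day year)."""
    return ((final / start) ** (days_per_year / days) - 1) * 100


grid_cagr = cagr_pct(START, GRID_FINAL, DAYS)
bh_cagr = cagr_pct(START, BH_FINAL, DAYS)
gap_pp = (GRID_FINAL - BH_FINAL) / START * 100  # gap in percentage points

print(f"grid CAGR: {grid_cagr:+.2f}%  B&H CAGR: {bh_cagr:+.2f}%  gap: {gap_pp:+.2f} pp")
```

The CAGRs come out at the tabled values. The gap computed from the two final values lands within 0.01 pp of the reported figure, which looks like rounding in the inputs.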

Looks great on the headline. +18.66 percentage points vs Buy & Hold over 16 months is the kind of number that gets screenshotted on Twitter. But that headline is doing a lot of work covering up what actually happened.

What actually happened

The price chart over the period is the most important context, and it explains everything:

  • Jan – Apr 2025: BTC ranged $80k–$105k. Grid harvested chop. Both grid and B&H roughly flat-to-slightly-up.
  • May – Oct 2025: BTC ran from $95k to a peak of ~$125k. B&H pulled ahead significantly: on the equity curve, B&H peaks around $1,300 while the grid is stuck around $1,170. This is the classic grid weakness: capped upside in a strong trend.
  • Nov 2025 – Mar 2026: BTC crashed from $120k to a $65k low, breaking below the grid's lower bound of $70k. The B&H equity curve collapsed from ~$1,300 to ~$700. The grid held around $950–$1,000 because, as the orange line on the chart shows, the bot kept buying down to its floor, accumulating cheap inventory while B&H just sat on a depreciating bag.
  • Apr – May 2026: BTC recovered to ~$80k. B&H clawed back to ~$850. The grid ground its way back to $1,021.

The grid won not because it's a magic strategy. It won because B&H got crushed in a sharp drawdown, and the grid's mean-reversion design happened to be the right tool for that specific shape of move.

The 70k–150k range problem

Here's the thing no grid backtest screenshot ever addresses: who, on January 1, 2025, would have actually set the range to 70k–150k?

On January 1, 2025, BTC was trading around $95k. To set 70k as a lower bound, you'd need to assume a ~26% drawdown was on the table. To set 150k as upper bound, you'd need to assume a ~58% rally was on the table. Both bounds ended up bracketing the move closely: BTC peaked near 125k, about 17% below the upper bound, and bottomed near 65k, only about 7% below the lower bound. The range covered nearly the entire move, with a brief undershoot at the bottom.

That's not skill. That's hindsight. If I'd set 80k–140k (still reasonable a priori), the lower bound would have been hit harder in the Q1 2026 crash and the bot would have run out of USDT to buy with. If I'd set 60k–160k (wider, more conservative), the grid spacing would have been so loose that the chop wouldn't have triggered enough trades to matter.

The grid's outperformance is therefore not really "grid > B&H." It's "a well-calibrated grid > B&H in a regime that punished B&H." Both halves of that sentence are doing work.
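
To make the range discussion concrete, this sketch puts the three ranges mentioned in this section side by side: how big a move each bound implies from the ~$95k start, how wide each grid step is, and where the realized low sits relative to each floor. The start price, peak and low are the approximate figures quoted in the post, and the dictionary layout is just for illustration.

```python
START_PRICE = 95_000.0            # approximate BTC price on 2025-01-01, per the post
PEAK, LOW = 125_000.0, 65_000.0   # approximate realized extremes, per the post
GRIDS = 30

candidates = {
    "80k-140k": (80_000.0, 140_000.0),
    "70k-150k": (70_000.0, 150_000.0),
    "60k-160k": (60_000.0, 160_000.0),
}


def pct_move(from_price: float, to_price: float) -> float:
    """Signed percentage move from one price to another."""
    return (to_price / from_price - 1) * 100


rows = {}
for name, (lo, hi) in candidates.items():
    rows[name] = {
        "drawdown": pct_move(START_PRICE, lo),  # move needed to reach the floor
        "rally": pct_move(START_PRICE, hi),     # move needed to reach the cap
        "step": (hi - lo) / GRIDS,              # arithmetic grid spacing in USD
        "low_vs_floor": pct_move(lo, LOW),      # negative = price broke the floor
        "peak_vs_cap": pct_move(hi, PEAK),      # negative = unused headroom
    }

for name, r in rows.items():
    print(
        f"{name}: floor {r['drawdown']:+.1f}%, cap {r['rally']:+.1f}%, "
        f"step ${r['step']:,.0f}, low vs floor {r['low_vs_floor']:+.1f}%, "
        f"peak vs cap {r['peak_vs_cap']:+.1f}%"
    )
```

This only shows the geometry of each range. It says nothing about how many trades each one would have triggered, which needs an actual backtest.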

What grids actually do well

Setting aside the hindsight issue, the mechanics worked as designed:

  • 297 trades over 501 days = roughly one every 1.7 days. Steady, mechanical, low-attention.
  • Win rate effectively 100% — every grid pair (buy-low → sell-high) closes profitable by design. The only "loss" is opportunity cost when price runs out of the upper bound or accumulation cost when it crashes below the lower bound.
  • Fees were 0.77% of capital for 297 trades. On KuCoin at 0.075% taker that's exactly what you'd expect, and it's the main cost driver. Tighter grids = more trades = more fees. The 30-grid setting balanced this reasonably.
  • The Q1 2026 crash is where grids genuinely shine. While B&H lost roughly 46% from its peak (about $1,300 to ~$700 on the equity curve), the grid was buying at every level down to 70k. The final balance of 0.007965 BTC + $398.73 USDT means the bot still has about 39% of its value in cash, ready to buy if BTC drops further. B&H has zero dry powder.
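
A few of the figures in this list can be cross-checked against each other using only the reported numbers. This is back-of-envelope arithmetic, not output from the backtester, and the variable names are mine:

```python
TRADES, DAYS = 297, 501
FEES_PAID, FEE_RATE = 7.69, 0.00075
BTC_HELD, USDT_HELD, FINAL_VALUE = 0.007965, 398.73, 1_021.56

# Average gap between trades, in days.
days_per_trade = DAYS / TRADES

# Average order size implied by total fees at a flat 0.075% per trade.
avg_notional = FEES_PAID / (TRADES * FEE_RATE)

# Share of the final portfolio value sitting in USDT.
cash_share_pct = USDT_HELD / FINAL_VALUE * 100

# BTC price implied by the final balance and final value.
implied_btc_price = (FINAL_VALUE - USDT_HELD) / BTC_HELD

print(f"one trade every {days_per_trade:.1f} days")
print(f"average order size: ${avg_notional:.2f}")
print(f"cash share of final value: {cash_share_pct:.0f}%")
print(f"implied closing BTC price: ${implied_btc_price:,.0f}")
```

The implied order size is close to $1,000 / 30 grids, which is what you'd expect if capital is split evenly across levels, and the implied closing price is consistent with BTC recovering to roughly $80k.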

What grids actually do badly

The Q2–Q3 2025 bull run is where the grid's structural weakness shows:

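  • Capped upside. The grid sells a slice of BTC at every level on the way up, and once price passes the upper bound it is fully in cash and stops buying back in. B&H rode the entire move from 95k to 125k with full exposure. The grid sold into the run-up level by level, so it captured only part of the rally (about $1,170 vs $1,300 at the peak).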
  • Whipsaw chop near boundaries. When price oscillates near the lower or upper bound, the grid fills only one side. This bleeds into the equity curve in subtle ways.
  • No directional view. Grids are pure mean-reversion. If BTC enters a sustained one-way market (either parabolic up or extended drawdown that breaks the range), the strategy is structurally on the wrong side.
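
The strengths and weaknesses above show up even in a toy model. Below is a minimal arithmetic grid simulator, written for illustration only: it is not the Backtesting Arena engine, and the fill rules (orders fill exactly at the level price, cells above the start price are bought at market on day one, equal budget per cell) are my simplifying assumptions. The price paths are made up.

```python
def simulate_grid(prices, lower, upper, n_grids, capital, fee=0.00075):
    """Toy arithmetic grid: cell i buys at levels[i] and sells at levels[i + 1].

    Assumptions (illustrative): fills happen at the level price, and cells at or
    above the first price start out filled via a market buy.
    Returns (final_equity, trade_count).
    """
    step = (upper - lower) / n_grids
    levels = [lower + i * step for i in range(n_grids + 1)]
    budget = capital / n_grids   # quote currency reserved per cell
    qty = [0.0] * n_grids        # base currency currently held per cell
    cash, trades = capital, 0

    # Initial inventory: cells at or above the start price are bought at market.
    for i in range(n_grids):
        if levels[i] >= prices[0]:
            qty[i] = budget * (1 - fee) / prices[0]
            cash -= budget
            trades += 1

    for price in prices[1:]:
        for i in range(n_grids):
            if qty[i] == 0.0 and price <= levels[i]:
                # Empty cell and price touched its buy level: buy.
                qty[i] = budget * (1 - fee) / levels[i]
                cash -= budget
                trades += 1
            elif qty[i] > 0.0 and price >= levels[i + 1]:
                # Filled cell and price touched its sell level: sell.
                cash += qty[i] * levels[i + 1] * (1 - fee)
                qty[i] = 0.0
                trades += 1

    return cash + sum(qty) * prices[-1], trades


def buy_and_hold(prices, capital):
    """Equity of a single buy at the first price, marked at the last price."""
    return capital * prices[-1] / prices[0]


rally = [150, 160, 170, 180, 190, 200]  # one-way move up to the upper bound
chop = [150, 140, 150, 140, 150]        # sideways oscillation across one level

for name, path in (("rally", rally), ("chop", chop)):
    equity, n = simulate_grid(path, 100, 200, 10, 1_000, fee=0.0)
    print(f"{name}: grid {equity:.2f} ({n} trades) vs B&H {buy_and_hold(path, 1_000):.2f}")
```

With fees set to zero, the rally path leaves the grid well behind B&H because it sells out on the way up, while the chop path leaves the grid ahead of a flat B&H. That's the whole trade-off in two lines of output.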

The honest framing

What this backtest shows is not "grids beat B&H." What it shows is: if you pick a range that captures the full move, a grid will smooth your equity curve relative to B&H in volatile sideways-to-down regimes. That's a real, repeatable property of the strategy. It's also not the same thing as edge.

The fair comparison isn't grid vs B&H over a cherry-picked period. The fair comparison is:

  • Grid vs B&H averaged over many starting dates and range configurations
  • Grid vs other systematic strategies on the same data
  • Grid live performance with a range set forward, not back

I'd love to see the same setup re-run with the range set as a function of something observable at t=0 — e.g. (current price ± 1 ATR-derived band) or (Bollinger Band extremes) — so the range selection is mechanical, not artistic.
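
As one possible shape for such a rule, here is a sketch that derives the bounds from Bollinger-style bands over the closes available at t=0. The function name, the lookback and the multiplier `k` are all hypothetical knobs, the use of population standard deviation is my choice, and the closes in the example are made-up numbers rather than real BTC data.

```python
from statistics import fmean, pstdev


def bollinger_range(closes, lookback=20, k=2.0):
    """Return (lower, upper) grid bounds as mean +/- k standard deviations.

    Uses only the last `lookback` closes, i.e. data observable at t=0.
    Assumption: population standard deviation (pstdev) over the window.
    """
    window = closes[-lookback:]
    if len(window) < 2:
        raise ValueError("need at least two closes to estimate volatility")
    mid = fmean(window)
    width = k * pstdev(window)
    return mid - width, mid + width


# Made-up closes, for illustration only.
closes = [90_000, 92_000, 94_000, 96_000, 98_000]
lower, upper = bollinger_range(closes, lookback=5, k=2.0)
print(f"range: ${lower:,.0f} to ${upper:,.0f}")
```

Whether any fixed lookback and multiplier would have produced a usable range for a 16-month run is exactly the thing that would need testing across regimes.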

What I take away from this

The grid did exactly what grids are designed to do — extract value from chop and dollar-cost-average down through a drawdown. The fact that it beat B&H over this specific 16-month window is real but not generalizable. A different range, a different period, and the comparison flips.

The interesting question for grid bots isn't "do they outperform B&H?" It's "what's the cost of being wrong on the range, and how do you size that risk?" Setting the upper bound too low caps your upside in a bull run. Setting the lower bound too high means the bot runs out of dry powder in a crash and just holds bags at the bottom. The 70k–150k range I used here was, in retrospect, almost optimal — which is exactly why I'm skeptical of the result.

297 trades is a decent sample, but it's all from one market regime (one cycle peak, one drawdown, one recovery). The minimum bar to take this seriously would be running the same range-selection methodology across 2018, 2019–20, and 2021–22 and seeing if it holds. Different volatility regimes, different price ranges, different outcomes.

Open questions for discussion

  • What's the cleanest mechanical rule for setting the range at t=0? ATR-bands? Bollinger? Some volatility-aware envelope?
  • How would arithmetic vs geometric grid compare on the same data? I ran arithmetic — geometric would put more density at lower prices, which arguably matches BTC's log-normal price distribution better.
  • Has anyone tested grids on altcoins with higher vol? ETH, SOL, the chop-heavy mid-caps?
  • What's the slippage assumption people use for grid bots? I used pure 0.075% fees, no slippage. On 297 small orders that probably doesn't matter, but in tighter grids it might.
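
On the arithmetic vs geometric question, the difference in level placement is easy to see without running a backtest. A minimal sketch using the same 70k–150k range and 30 grids (function names are mine, and the $100k threshold is just a convenient cut-off for counting):

```python
def arithmetic_levels(lower, upper, n):
    """Equal dollar spacing between adjacent levels."""
    step = (upper - lower) / n
    return [lower + i * step for i in range(n + 1)]


def geometric_levels(lower, upper, n):
    """Equal percentage spacing between adjacent levels."""
    ratio = (upper / lower) ** (1 / n)
    return [lower * ratio ** i for i in range(n + 1)]


arith = arithmetic_levels(70_000.0, 150_000.0, 30)
geo = geometric_levels(70_000.0, 150_000.0, 30)

# How many levels sit below $100k under each scheme.
below_arith = sum(level < 100_000 for level in arith)
below_geo = sum(level < 100_000 for level in geo)

# Constant percentage step of the geometric grid.
geo_step_pct = (geo[1] / geo[0] - 1) * 100

print(f"levels below $100k: arithmetic {below_arith}, geometric {below_geo}")
print(f"geometric step: {geo_step_pct:.2f}% per level")
```

The geometric grid packs more levels into the lower part of the range and keeps the gross profit per grid constant in percentage terms, instead of shrinking as price rises.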

Methodology disclosure: Run on Backtesting Arena, where I'm the founder (Rule 4 in action). Standard arithmetic grid, KuCoin taker fees, no slippage modeled, no funding rates (this isn't a perp). Range was picked manually, which is the entire point of the post. Anyone can reproduce with the parameters listed above.

Per Rule 11: don't trust this just because I'm telling you. Run it with a worse range (60k–130k, 80k–140k) and watch the outperformance collapse or flip. That's the actual test.

reddit.com
u/BacktestingArena — 4 days ago