Strategy · Mar 17, 2026 · 6 min read

Why your backtest looks perfect but your live account doesn't

A strategy that wins 85% of its trades in testing and 40% in live trading doesn't have a strategy problem. It has an overfitting problem. Understanding the difference changes how you build, and what you trust.

What overfitting actually means

Overfitting happens when a strategy is too well-adapted to the historical data it was tested on. The rules become so specific — so tuned to the exact price movements that already happened — that they stop working on any data the strategy hasn't seen before.

Think of it this way: if you test a strategy on one year of data and it produces a 90% win rate, you haven't found an edge. You've found a description of that specific year. The market in year two will be different, and a strategy that only knows year one will fail.

The signs you're looking at an overfit strategy

Overfitted strategies tend to share certain characteristics. Not all of them are obvious:

  • Too many conditions: A strategy that needs five indicators to align before entering is likely fitting noise. Real edge in markets is usually simple.
  • Results that look too clean: An equity curve that goes up in a straight line with minimal drawdown is suspicious. Real strategies have losing periods. If yours doesn't, it's probably because it was trained on a period that didn't test it.
  • Win rate above 80% on short timeframes: High win rates on 1-minute or 5-minute data almost always indicate overfitting. At those resolutions, the noise-to-signal ratio is enormous.
  • The strategy only works on one asset: A genuinely robust strategy should show at least some edge across related markets. If it only works on EURUSD between 2021 and 2023, that's a data-specific pattern, not a structural one.
  • Very specific parameter values: If your strategy uses a 23-period moving average and a 0.0047 threshold, ask yourself: why those exact numbers? If the answer is 'because those gave the best backtest results', that's a problem.
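One quick check for the last point: perturb the suspiciously precise parameter and see whether the edge survives. Here is a minimal sketch, where `run_backtest` is a hypothetical stand-in for your own backtest runner (the toy version below peaks sharply at exactly 23 periods, the fragile case):

```python
def run_backtest(ma_period: int) -> float:
    """Toy stand-in for a real backtest: win rate spikes only at 23 periods."""
    return 0.85 if ma_period == 23 else 0.52

def sensitivity(param: int, neighbors: int = 2) -> dict[int, float]:
    """Re-run the backtest with nearby parameter values."""
    return {p: run_backtest(p) for p in range(param - neighbors, param + neighbors + 1)}

results = sensitivity(23)
best = results[23]
others = [v for p, v in results.items() if p != 23]

# If performance collapses one step away from the chosen value,
# the parameter is almost certainly fit to noise, not structure.
fragile = best - max(others) > 0.10
print(fragile)  # True for this toy example
```

A robust edge should degrade gracefully as parameters drift; a cliff at one exact value is the signature of curve-fitting.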

How it happens — even without trying

Overfitting isn't always deliberate. It happens naturally in the optimization process. Every time you look at backtest results and adjust a parameter, you're using historical data to make a decision. Do that enough times — change the stop loss, tweak the entry condition, adjust the timeframe — and the strategy gradually bends toward fitting the past rather than predicting the future.

This is why a strategy that took 50 iterations to build is almost always more overfit than one that took 5. The optimization process itself is the source of the problem.
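You can see this selection effect with pure coin flips. The sketch below (an illustrative simulation, not a real backtest) builds 50 "strategies" with zero edge, keeps the one with the best in-sample win rate, and then scores the same no-edge process out of sample:

```python
import random

random.seed(7)  # fixed seed so the simulation is reproducible

def random_strategy_winrate(n_trades: int = 40) -> float:
    """A 'strategy' with no edge: every trade is a 50/50 coin flip."""
    return sum(random.random() < 0.5 for _ in range(n_trades)) / n_trades

# Try 50 no-edge variants and keep the best one, mimicking 50 rounds
# of parameter tweaking against the same historical data.
in_sample = max(random_strategy_winrate() for _ in range(50))

# Out of sample, the same process is still a coin flip.
out_of_sample = random_strategy_winrate()

print(round(in_sample, 2), round(out_of_sample, 2))
```

The best of 50 coin-flip strategies reliably shows a win rate well above 50% in sample, with no edge at all. That is exactly what iterative optimization does to your backtest numbers.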

It's also why walk-forward testing exists. Instead of testing on the full dataset, you train on one portion and validate on another portion the strategy has never seen. If the results hold on the unseen data, you have more reason to believe the edge is structural rather than historical.
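The mechanics of walk-forward splitting can be sketched in a few lines. This is an illustrative index generator over a time-ordered series of bars (the window lengths here are arbitrary examples), where each variant is fitted on the train range and scored only on the unseen test range that follows it:

```python
def walk_forward_windows(n_bars: int, train: int, test: int):
    """Yield non-overlapping (train_range, test_range) index pairs."""
    start = 0
    while start + train + test <= n_bars:
        yield (range(start, start + train),
               range(start + train, start + train + test))
        start += test  # roll forward by one test window

# Example: 1000 daily bars, fit on 250, validate on the next 50.
windows = list(walk_forward_windows(n_bars=1000, train=250, test=50))
print(len(windows))  # 15 rolling train/test splits
```

The key property is that every test range sits strictly after its train range, so no tuning decision ever sees the data it is scored on.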

What to do about it

The first rule is to simplify. Most robust trading edges are simple. An entry condition based on one or two clear market behaviors — a session breakout, a liquidity sweep, a structure break — is more durable than a complex multi-condition filter that barely works on historical data.

The second rule is to test on data you haven't touched. Run the final version of your strategy on a period you didn't use during development. If the strategy was built on 2020–2023 data, test it on 2024 data before deploying. The results don't need to be identical — but they should be directionally consistent.

The third rule is to watch the sample size. A strategy that generated 12 trades in the backtest doesn't have enough data to draw conclusions from. You need at minimum a few hundred trades across different market conditions — trending, ranging, volatile, quiet — before a win rate number is meaningful.
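The sample-size point is just arithmetic. A rough normal-approximation confidence interval for an observed win rate shows how little 12 trades can tell you (the approximation itself is crude at such small samples, which only strengthens the point):

```python
import math

def winrate_interval(wins: int, trades: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval for an observed win rate (normal approximation)."""
    p = wins / trades
    half = z * math.sqrt(p * (1 - p) / trades)
    return (p - half, p + half)

# 9 wins out of 12 trades: an observed 75% win rate...
low, high = winrate_interval(9, 12)
# ...versus the same 75% observed over 300 trades.
low300, high300 = winrate_interval(225, 300)

# The 12-trade interval spans roughly +/- 24 points; the 300-trade
# interval spans roughly +/- 5 points.
print(round(high - low, 2), round(high300 - low300, 2))
```

At 12 trades, a "75% win rate" is statistically consistent with anything from a coin flip to near-perfection. Only a few hundred trades narrow it to a claim worth acting on.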

Finally: be suspicious of perfection. A strategy that looks too good is almost always too good. Real edge is modest. A win rate of 55–65% with a decent risk/reward ratio, consistent across multiple years and multiple assets, is far more valuable than an 85% win rate that disappears when the calendar year changes.
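Why a modest win rate can still be valuable comes down to expectancy: average reward per trade measured in units of risk (R). A small sketch with example numbers:

```python
def expectancy(win_rate: float, reward_risk: float) -> float:
    """Average R gained per trade: wins pay reward_risk R, losses cost 1 R."""
    return win_rate * reward_risk - (1 - win_rate) * 1.0

# A modest 58% win rate at 1.5R earns about 0.45R per trade on average.
modest = expectancy(0.58, 1.5)
print(round(modest, 2))

# A durable positive expectancy like this, held across years and assets,
# compounds; an 85% win rate that vanishes next year does not.
```

The number that matters is not the win rate in isolation but expectancy sustained on unseen data.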

How Charton handles this

When the AI reviews a strategy in Charton, one of the things it explicitly checks is structural health — whether the logic is robust or brittle. If it detects signs of overfitting — too many conditions, results that are too clean relative to the data, parameters that are suspiciously precise — it flags it.

This isn't a guarantee. No system can fully eliminate the risk of overfitting. But having an automated review that specifically looks for it — rather than only optimizing for performance numbers — changes the quality of what gets built.
