Data snooping refers to statistical inference that the researcher decides to perform after looking at the data. Doing so in a misleading way, often out of ignorance, is a common error in using statistics.
Data Snooping Bias is also referred to as Optimization Bias or Curve Fitting.
This bias is the result of refining too many parameters to improve a system’s performance on a single data set. Like most biases, data snooping is fairly easy to understand at face value. However, it has a habit of subtly creeping into system development.
A common example of data snooping starts out as an honest effort to improve a system. You test whether adding another indicator into the mix improves your results. If it does, you incorporate that indicator into the system and then test adding yet another. The end result is a system that is perfectly optimized to trade the exact data set you tested it on. The problem is that the system is optimized only for that specific data set, which consists of market history that has already happened.
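The indicator-by-indicator process described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the returns are synthetic zero-mean noise, each "indicator" is a hypothetical series of random +1/-1 votes, and the names system_pnl and greedy_snoop are made up for this example. Because nothing here has any real edge, any in-sample profit the loop finds is purely the product of snooping.

```python
import random

def system_pnl(returns, indicators, start, end):
    """Sum of returns earned by following the indicators' majority vote on [start, end)."""
    total = 0.0
    for t in range(start, end):
        vote = sum(ind[t] for ind in indicators)
        # +1 = long, -1 = short, 0 = flat (tie, or no indicators at all)
        position = (vote > 0) - (vote < 0)
        total += position * returns[t]
    return total

def greedy_snoop(returns, candidates, start, end):
    """Keep every candidate indicator that improves the result on the tested span."""
    chosen, best = [], 0.0
    for ind in candidates:
        trial = system_pnl(returns, chosen + [ind], start, end)
        if trial > best:  # "it improved my results, so I keep it"
            chosen.append(ind)
            best = trial
    return chosen, best

rng = random.Random(42)  # fixed seed so the run is repeatable
n_days, split = 500, 250

# Assumption: synthetic daily returns that are pure noise.
returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
# Assumption: 200 hypothetical indicators, each just random votes.
candidates = [[rng.choice((-1, 1)) for _ in range(n_days)] for _ in range(200)]

# Optimize on the first half only, then check the untouched second half.
chosen, in_sample = greedy_snoop(returns, candidates, 0, split)
out_of_sample = system_pnl(returns, chosen, split, n_days)

print(f"indicators kept:   {len(chosen)}")
print(f"in-sample P&L:     {in_sample:+.4f}")
print(f"out-of-sample P&L: {out_of_sample:+.4f}")
```

The in-sample figure can never be negative, since an indicator is kept only when it raises the total. The second half of the data was never used for selection, so there is no reason to expect the same profit there.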
The best way to avoid data snooping, or curve fitting, is to keep your systems simple, using as few parameters as possible. It is also important to backtest your system on many different data sets across different markets and time periods.
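As a rough sketch of that advice, the Python below runs one deliberately simple, single-parameter rule over several data sets and reports how many of them it was profitable on. Everything in it is an assumption made for illustration: the momentum rule, the function names momentum_pnl, random_walk and evaluate_across, the data set labels, and the synthetic random-walk prices standing in for real market history.

```python
import random

def momentum_pnl(prices, lookback):
    """One-parameter rule: long if price is above its level `lookback` bars ago, else short."""
    pnl = 0.0
    for t in range(lookback, len(prices) - 1):
        position = 1 if prices[t] > prices[t - lookback] else -1
        pnl += position * (prices[t + 1] - prices[t])  # result of holding for one bar
    return pnl

def random_walk(n, rng, start=100.0):
    """Assumption: synthetic prices used as a stand-in for real market data."""
    prices = [start]
    for _ in range(n - 1):
        prices.append(prices[-1] + rng.gauss(0.0, 1.0))
    return prices

def evaluate_across(datasets, lookback):
    """Run the same rule, with the same parameter, on every data set."""
    results = {name: momentum_pnl(prices, lookback) for name, prices in datasets.items()}
    profitable = sum(1 for pnl in results.values() if pnl > 0)
    return results, profitable / len(results)

rng = random.Random(7)
# Hypothetical labels: in practice these would be different markets and time periods.
datasets = {f"market_{m}_period_{p}": random_walk(500, rng)
            for m in "ABC" for p in (1, 2)}

results, share = evaluate_across(datasets, lookback=20)
for name, pnl in results.items():
    print(f"{name}: {pnl:+.2f}")
print(f"share of data sets with a profit: {share:.0%}")
```

A rule that looks good on one data set but loses on most of the others is more likely curve fit than robust. The parameter is fixed before the run and is not re-tuned per data set, since re-tuning would reintroduce the bias.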