August 22, 2018

Pre-inclusion Bias: How to create a false strategy

In the previous post I described a simple rule to double the returns of a mean reversion strategy. In this post, I show how pre-inclusion bias can take a losing strategy and make it a winning one.

Recently I had a reader send me the rules for a stock trend following strategy. He knew these are the strategies I have been researching lately. The rules were few and I had time, so I coded them up.

Here is my definition of pre-inclusion bias, from "How much does not having survivorship free data change test results?":

Pre-inclusion bias is using today’s index constituents as your trading universe and assuming these stocks were always in the index during your testing period. For example, if one were testing back to 2004, GOOG did not enter the S&P 500 index until early 2006, at a price of $390. But your testing could potentially trade GOOG during the huge rise from $100 to $300, before it was ever in the index.
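To make the definition concrete, here is a minimal Python sketch (not the AmiBroker code used for the tests in this post). The `MEMBERSHIP` table, the `OLDCO` ticker, and the exact dates are made up for illustration; the GOOG entry date is only a placeholder for "early 2006".

```python
from datetime import date

# Hypothetical membership table: ticker -> list of (entry_date, exit_date or None).
# exit_date of None means the stock is still in the index today.
MEMBERSHIP = {
    "GOOG": [(date(2006, 3, 31), None)],                # placeholder for "early 2006"
    "OLDCO": [(date(2000, 1, 3), date(2009, 6, 30))],   # made-up former member
}

def in_index_on(ticker, day):
    """Point-in-time check: was the ticker an index member on `day`?"""
    for start, end in MEMBERSHIP.get(ticker, []):
        if start <= day and (end is None or day <= end):
            return True
    return False

def in_index_today(ticker):
    """Biased check: only asks whether the ticker is a member right now."""
    return any(end is None for _, end in MEMBERSHIP.get(ticker, []))

test_day = date(2004, 11, 1)
print(in_index_today("GOOG"))          # the biased universe allows a 2004 trade
print(in_index_on("GOOG", test_day))   # the point-in-time universe does not
```

The biased check also silently drops `OLDCO`, which was a real candidate in 2005 under the point-in-time check but is invisible if you start from today's list.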

The Rules

I will not be sharing the exact rules, only the general idea. The test runs from 1/1/2007 to 6/30/2018.

Buy

  1. Some trend following rules
  2. Stock price is between $1 and $20
  3. Stock is member of the Russell 3000

Sell

  • Trend is broken

 

I made two assumptions about the rules. For rule 2, I assumed that the price is the as-traded price, meaning the stock price without adjustment for stock splits or dividends. If one went back in time, this is the price one would have seen on the chart that day. The farther back one tests, the more this choice can change the results, because more splits and dividends accumulate between the test date and today.
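A small Python sketch of why this matters for a price filter. The numbers are invented: a stock that traded at $60 and later split 2-for-1 twice shows a split-adjusted price of $15 for that same day. The function names are hypothetical, dividends are ignored, and I am assuming the $1 to $20 range is inclusive.

```python
def split_adjusted_price(as_traded, later_split_ratios):
    """Scale a historical as-traded price by every split that happened afterwards.

    A 2-for-1 split has ratio 2. Dividend adjustments are ignored in this sketch.
    """
    adjusted = as_traded
    for ratio in later_split_ratios:
        adjusted /= ratio
    return adjusted

def passes_price_filter(price, low=1.0, high=20.0):
    """Price rule from the strategy: stock price between $1 and $20 (inclusive here)."""
    return low <= price <= high

as_traded = 60.0                                        # what the chart showed that day
adjusted = split_adjusted_price(as_traded, [2, 2])      # two later 2-for-1 splits

print(adjusted)                          # 15.0
print(passes_price_filter(as_traded))    # False: a trader that day saw a $60 stock
print(passes_price_filter(adjusted))     # True: the adjusted series lets the trade in
```

With adjusted prices, the filter admits stocks that were never in the $1 to $20 range when the trade would have been placed, and these are by construction stocks that went on to rise enough to split.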

The second assumption was for rule 3. I assumed that membership uses the historical constituent information, that is, whether the stock was in the Russell 3000 on the day of the trade.

Because I use Norgate Data and subscribe to the historical constituent data, I can test this way.

Bias-free and as-traded results

Well, those numbers just suck. My first thought was that I had made a coding error. After double checking and triple checking, I was sure I had not. Then came my next thought: maybe they were using the current Russell 3000 constituents and adjusted prices. After a couple of emails, I confirmed this was the case.

With bias and adjusted prices results

Now we have a positive CAR. As you can see there is a huge difference between these two results.

All combinations

These results cover all four combinations of current vs. historical constituents and as-traded vs. adjusted prices.
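For anyone wanting to set up the same comparison, the 2x2 grid can be sketched in Python as below. The labels and the `build_test_grid` helper are hypothetical; each configuration would be passed to whatever backtest engine you use.

```python
from itertools import product

UNIVERSES = ("current", "historical")   # today's constituents vs. point-in-time
PRICES = ("adjusted", "as-traded")      # split/dividend adjusted vs. price seen that day

def build_test_grid():
    """Return the four test configurations, flagging the single bias-free one."""
    grid = []
    for universe, prices in product(UNIVERSES, PRICES):
        grid.append({
            "universe": universe,
            "prices": prices,
            "bias_free": universe == "historical" and prices == "as-traded",
        })
    return grid

for cfg in build_test_grid():
    print(cfg)
```

Only the historical plus as-traded combination matches what a trader could have known on each day; the other three each carry at least one source of hindsight.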

Spreadsheet

Fill in the form below to get the spreadsheet with lots of additional information. This includes top drawdowns, trade statistics and more.

Final Thoughts

People frequently ask me how important it is to have historical constituent data. Short answer, very! I do not trust any results that have pre-inclusion bias.

Backtesting platform used: AmiBroker. Data provider: Norgate Data (referral link)

Good quant trading,

Erik - November 19, 2022

Am I understanding this correctly?
If you use pre-inclusion bias (e.g. using stocks that are in the Russell 3000 today but may not have been there 10 years ago), the strategy provides positive returns. However, if you remove the bias by using Norgate data, the return is very poor (negative).
In other words, the strategy doesn’t work with correct data.

    Cesar Alvarez - November 21, 2022

    That is correct. When using the correct data, the strategy does not work.
