Category Archives for "Research"

April 6, 2022

Benford’s Law and Strategy Selection

While I was talking with a trader, he mentioned an article in the December 2021 issue of Technical Analysis of Stocks & Commodities about Benford’s Law. I had read the same article and was wondering how it could be applied to my trading. Benford’s Law is often used to look for fraud, and I am sure I am not committing fraud on myself. As we talked, we wondered whether it could help in selecting which strategy to trade from an optimization.

Benford’s Law Summary

According to Benford’s Law, in a large set of numbers the leading significant digit is not evenly distributed: the digit “1” occurs the most frequently (about 30% of the time) and the digit “9” the least (under 5%). The theoretical distribution is shown in this chart.
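For readers who want the numbers behind the chart, the theoretical share of each leading digit d is log10(1 + 1/d). A minimal Python sketch follows; the function name `benford_expected` is only an illustrative choice.

```python
import math

def benford_expected():
    """Expected share of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

expected = benford_expected()
for digit, share in expected.items():
    # Print each digit with its theoretical frequency as a percentage
    print(f"{digit}: {share:.1%}")
```

The nine shares sum to one, since the logarithms telescope to log10(10).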

In general, for a data set to follow Benford’s Law, it should have the following properties:

  1. Several orders of magnitude between the smallest and largest values
  2. Either the minimum or maximum should be unbounded
  3. Data set with more than a couple thousand numbers
  4. The data is not concentrated around the mean

For more information about Benford’s Law, these are two good articles: “What is Benford’s Law and why is it important for data science?” and Wikipedia’s article “Benford’s law.”

Application to stock data

Using the percentage daily return on stock data, here is how the data properties fit with what Benford’s Law wants in a data set:

  1. Yes, the data have several orders of magnitude
  2. Maximum is unbounded
  3. 10 years of a single stock is 2500 data points. More would be better.
  4. Returns are concentrated around the mean. This is not what we want.

The data are not a perfect fit, but we will try and see what we get.
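To make the digit count concrete, here is a rough sketch of one way to pull the first significant digit out of daily percentage returns. It is not the code used for this post; the helper names and the tiny list of closing prices are made up for illustration, and days with a zero return are skipped because they have no leading digit.

```python
from collections import Counter

def leading_digit(value):
    """First non-zero digit of a number, ignoring sign and scale; None for zero."""
    value = abs(value)
    if value == 0:
        return None
    # Scientific notation always starts with the first significant digit,
    # e.g. 0.034 -> '3.400000000000000e-02'
    return int(f"{value:.15e}"[0])

def digit_counts(closes):
    """Count leading digits of daily percentage returns from a list of closes."""
    returns = [100 * (today / prev - 1) for prev, today in zip(closes, closes[1:])]
    digits = [leading_digit(r) for r in returns]
    return Counter(d for d in digits if d is not None)

# Hypothetical closing prices, for illustration only
print(digit_counts([100.0, 101.0, 99.5, 99.5, 102.0]))
```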

SPY

From 2007 to 2021, here are the results of the significant digit count on the daily percentage return of the SPY:

Visually, the SPY appears to follow Benford’s Law. There are several statistical methods for testing whether it does; I will be using the Chi-Square Statistic. What we need to know about this value is that lower is better: if there were a perfect match between the data set and the theoretical distribution, the value would be zero. Typically, a cutoff of 15.5 (the 5% critical value for a Chi-Square test with 8 degrees of freedom) is used, with values below it taken to mean the data follow Benford’s Law. The Chi-Square Statistic for the SPY data is 17.9, close to the cutoff. I guess perhaps the SPY is slightly manipulated.
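As a sketch of the calculation, the Chi-Square Statistic sums (observed - expected)² / expected over the nine leading digits, where the expected count for digit d is the sample size times log10(1 + 1/d). The function name and the sample counts below are illustrative assumptions, not the SPY data.

```python
import math

# 5% critical value of the chi-square distribution with 8 degrees of freedom
# (nine digits minus one); this is where a cutoff of about 15.5 comes from.
CRITICAL_5PCT_8DF = 15.507

def benford_chi_square(counts):
    """Chi-square statistic of observed leading-digit counts against Benford's Law.

    counts maps each digit 1..9 to how often it was observed.
    """
    total = sum(counts.get(d, 0) for d in range(1, 10))
    stat = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        observed = counts.get(d, 0)
        stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: every digit equally common, which Benford's Law rejects
uniform_counts = {d: 100 for d in range(1, 10)}
stat = benford_chi_square(uniform_counts)
print(f"{stat:.1f}", "follows Benford" if stat < CRITICAL_5PCT_8DF else "does not follow Benford")
```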

S&P500 Stocks

Next, I was curious about how the individual stocks of the S&P500 behaved. I used the current S&P 500 stocks and data from 2007 to 2021.

Values for the Chi-Square Statistic range from a low of 5 for MTB to a high of 102 for AWK. The average for all was 44.4. Only seven stocks had values under 15.5. One of two things is going on here. One, these stocks are highly manipulated. Or two, the data are not a good fit for applying Benford’s Law. I am going with number two because I am an optimist and believe the markets are only slightly manipulated.

Momentum Strategy

Now to what I wanted to know. Is there any predictive value here for selecting a strategy from an optimization run?

I ran an optimization (432 runs) from 2007 to 2016 on a momentum strategy that I trade. I took the top 1/3 of the runs (145 runs) based on CAR and, for each one, computed the Chi-Square Statistic from the strategy’s daily % return. I then divided these runs into terciles by their Chi-Square Statistic.
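The tercile step can be sketched as a simple rank-based split. This is an illustration rather than the AmiBroker workflow used here; the function name is hypothetical, and it assumes tercile 1 ("top") holds the highest values.

```python
def assign_terciles(values, higher_is_top=True):
    """Return a tercile label (1 = top, 3 = bottom) for each value, by rank."""
    n = len(values)
    # Indices of the values, ordered from best rank to worst
    order = sorted(range(n), key=lambda i: values[i], reverse=higher_is_top)
    terciles = [0] * n
    for rank, idx in enumerate(order):
        terciles[idx] = rank * 3 // n + 1
    return terciles

# Hypothetical Chi-Square Statistics for six runs, for illustration only
print(assign_terciles([10, 50, 30, 20, 60, 40]))
```

When the number of runs is not divisible by three, this split puts the extra run or runs in the earlier terciles.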

Side note: I thought the Chi-Square Statistic was big for the S&P 500 stocks. In my optimization run, the range was from 210 to 2143. Not even close to following Benford’s Law!

Then, using an out-of-sample optimization, I computed the CAR tercile that each of these in-sample runs fell into. My hypothesis was that failure to follow Benford’s Law implied a higher potential for curve fitting. That in turn implies that in-sample runs in the top tercile for CAR and the bottom tercile for Chi-Square Statistic would again end up in the top tercile for CAR in the out-of-sample.

The numbers in the table represent how many runs fell into that cell, with a blue background highlighting the highest value in that row. The first row represents the 48 runs from the in-sample that fell into the top tercile by CAR and then into the top tercile by Chi-Square Statistic within that group. The blue 30 then means that 30 of those 48 runs ended up in the top tercile by CAR in the out-of-sample.
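A table like this is a cross-tabulation: one row per in-sample Chi-Square tercile, one column per out-of-sample CAR tercile, and each cell a count of runs. A minimal sketch, with a hypothetical function name and made-up tercile labels rather than the actual results:

```python
from collections import Counter

def transition_table(in_sample_terciles, out_sample_terciles):
    """Count runs per (in-sample chi-square tercile, out-of-sample CAR tercile)."""
    counts = Counter(zip(in_sample_terciles, out_sample_terciles))
    return [[counts[(row, col)] for col in (1, 2, 3)] for row in (1, 2, 3)]

# Hypothetical labels for six runs, for illustration only
chi_terciles = [1, 1, 2, 2, 3, 3]
oos_car_terciles = [1, 1, 2, 3, 3, 2]

table = transition_table(chi_terciles, oos_car_terciles)
for row_label, row in zip((1, 2, 3), table):
    # Show each row with its largest cell, the one that would be shaded blue
    print(f"Chi-Square tercile {row_label}: {row} (largest cell: {max(row)})")
```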

At first, I was excited by the very clear diagonal line and the fact that the other cells followed what I expected. But then I realized this was the opposite of what I thought. I expected the lowest tercile by Chi-Square Statistic to be most likely to stay in the top CAR tercile. I started making up stories to explain this, but then I realized I should test this on a different strategy.

Mean Reversion Strategy

Following the same process that I used for the momentum strategy, here are the results of a mean reversion strategy.

Now, it is random. No pattern at all. Too bad.

Thanks

I want to thank Matt Radtke for helping proof this article. The first time I sent it to him, he caught a huge logical mistake that I had made. It would have been embarrassing to publish an article with such a bad mistake. That is also the reason it has been so long since my last post. I had to redo a lot of the work I had done.

Final Thoughts

As often happens in research, an interesting idea leads nowhere. Given that the data do not follow the requirements for applying Benford’s Law, I am not too surprised by the results. This was a good lesson for me on not stopping when I got the first good results. Even had the results panned out, I am not sure I would have used them. No story that I could make about why this should work made me comfortable. And we must be comfortable with the reasons we are trading our strategies.

Backtesting platform used: AmiBroker. Data provider: Norgate Data (referral link)

Good quant trading,

Internal Bar Strength for Mean Reversion

I’ve been writing this blog for nine years now. Sometimes I am amazed by the topics I have not covered, and this is one of them. When developing a new strategy, these are the indicators I am likely to test: RSI, Historical Volatility and Internal Bar Strength (IBS). A reader sent me an email pointing me to research done on IBS. I thought, let me send him my blog post on this. After searching my records and site, I could not find one. That made it easy to decide what my next post would be about.

Continue reading

SP-500 Seasonality

I’ve been seeing lots of seasonality-type charts on the S&P500 where they take the average return for each day of the year and then create a return curve for the year. The chart often ‘shows’ the sell-in-May, buy-in-November flatness of the returns, and then the holiday end-of-year run-up.

Steven, my trading buddy, sent me yet another chart and I noticed something I had not seen before. A beginning of the year downtrend from January to mid-March. This got me thinking. How much does the start of the data set impact these charts?

Continue reading

January Effect on Stocks

A member of The Crew recently asked me about the January Effect and if I had done any research on it. I had not. I have tested the December effect, which is buying the worst stocks of the year on December 1st, in Should You Buy the Best or Worst YTD Stocks.

From Investopedia, “The January Effect is a perceived seasonal increase in stock prices during the month of January. Analysts generally attribute this rally to an increase in buying, which follows the drop in price that typically happens in December when investors, engaging in tax-loss harvesting to offset realized capital gains, prompt a sell-off.” And then they state that the effect has largely disappeared. Time to find out.

Continue reading

Multi-day Limits for Mean Reversion

A reader recently suggested leaving the limit orders for a mean reversion trade on for a couple of days. Typically, these orders are good only for one day unless the stock sets up again. I did not think that this would help, but as I always tell my consulting clients when they ask me if an idea will work or not, “I am always surprised by what works and what doesn’t, so I test everything and let the numbers decide.” My expectation would be higher exposure, but will this lead to higher returns?

Continue reading

Mean Reversion Entry: At Open vs. Intraday Pullback vs Confirmation

The mean reversion strategies that I have created in the past and am trading now typically enter at the next day’s open or wait for a further pullback intraday before entering. My current mean reversion strategy, which enters on a limit down, was doing great until a few months ago when the performance started to slip. Looking at the missed trades and the trades taken, it seemed like the best trades would have been entering at the open or waiting for some intraday confirmation.

Waiting for a pullback to take out the previous day’s high was very popular when I started trading. But my testing then showed that it was better to wait for a further intraday pullback to enter. Have the markets changed such that waiting for confirmation is better?

Continue reading

Volume Positive Negative Indicator for Breakouts

Probably like a lot of you, I am an indicator junkie. Whenever I read about an indicator that I have not tested and that makes some sense, I have got to try it out. Now, most of the time they turn out not to be useful for my strategies.

While reading the April 2021 Technical Analysis of Stocks & Commodities, I came across an article about the Volume Positive Negative (VPN) Indicator for detecting high-volume breakouts. As I have written before, I rarely use volume in my strategies because I can never get it to work. Here was an indicator using volume; there was not a chance I would not test this.

Continue reading

Different ranking methods for a monthly S&P500 Stock Rotation Strategy

Recently for my own trading, I have been researching rotational strategies on both the weekly and monthly timeframes. The most common indicator that I use for ranking stocks is Rate of Change (ROC) of the closing price. I read about using Rate of Change on the EMA to rank stocks. I liked a small twist on the idea and wanted to know how it compared to what I am using.

Then this led me down another path of trying other ranking methods with an interesting result using historical volatility (HV) that I did not expect.

Continue reading