Beware of survivorship bias

Strategies might be attractive on the surface but fail moving forward. Why? Here we discuss the qualities of effective trading strategies, and the many traps one should avoid.


Post by Overload » Wed Feb 22, 2006 3:01 pm

One of the first trading systems I ever found was based on stocks rebounding after extreme drops in price of 70% or more. The theory was that stocks that had been pummeled so badly would often have a significant recovery following the drop. But the risk is that my data included only companies that continued to exist after such a price drop.

This is called survivorship bias, and it means that most price data includes only the stocks of companies that have survived through today. There are many companies that have come and gone throughout the span of time within our Evaluation Periods. But our prices and symbols include only the survivors. The ones that didn’t survive (Enron is a wonderful example) are removed from our database, never to be seen or back tested again. In many cases, this isn’t a significant problem, but that depends on the nature of your trading system.

In my original trading system, based on price drops of 70% or more, there’s a very good chance that many of those companies failed to survive. Had I traded that system, I may have found that it led to a significant number of bankruptcies and complete loss of capital. I’ll never know for sure, since my price database has a survivorship bias.
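The effect described above can be illustrated with a small simulation. This is only a sketch: the 40% failure rate and the rebound distribution are invented numbers, and `simulate` is a hypothetical helper, not anything taken from a real price database.

```python
import random
from statistics import mean

def simulate(n_stocks=10_000, p_bankrupt=0.4, seed=0):
    """Compare average returns with and without the failed companies.

    All parameters are assumptions made up for illustration.
    Returns (survivors_only_mean, full_universe_mean).
    """
    rng = random.Random(seed)
    all_returns = []
    survivor_returns = []
    for _ in range(n_stocks):
        if rng.random() < p_bankrupt:
            # Company fails after the 70% drop: total loss, and the symbol
            # would later vanish from a survivors-only database.
            all_returns.append(-1.0)
        else:
            # Company survives: draw a rebound from an invented distribution,
            # floored so a survivor never loses more than 95%.
            r = max(rng.gauss(0.30, 0.40), -0.95)
            all_returns.append(r)
            survivor_returns.append(r)
    return mean(survivor_returns), mean(all_returns)

survivors_only, full_universe = simulate()
print(f"survivors-only mean return: {survivors_only:+.1%}")
print(f"full-universe mean return:  {full_universe:+.1%}")
```

A back test run on the survivors alone sees only the first number, which is why the system can look attractive even if the full universe loses money.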

Pete
Overload
 
Posts: 2247
Joined: Wed Nov 30, 2005 12:14 pm

Re: Beware of survivorship bias

Post by taowave » Tue Feb 06, 2007 5:34 pm



Hi Pete,
I don't know what you were backtesting on, but if it wasn't "all stocks", what do you think of screening in SS for stocks that were down 50% or more during the test period and adding them to your backtest portfolio?

Or enable SS to "create" synthetic stocks that are absolute pigs, which could hopefully negate the survivorship bias issue.
taowave
 
Posts: 584
Joined: Sat Dec 02, 2006 12:39 pm

Post by Overload » Tue Feb 06, 2007 10:54 pm

Certainly the less extreme the drop, the lower the impact of survivorship bias will be. At least that's probably a good guess. But I don't think it can be eliminated entirely without a prices database that actually includes those delisted symbols and their prices while they were listed. I believe such databases do exist, but it gets quite complicated with mergers, acquisitions, bankruptcies, etc.

I might not be understanding how creating synthetic stocks as you mention could solve the problem. The key piece of information we'd need is: of all stocks that drop 70% in a week, what percentage will survive? Without having a solid answer for that, it's pretty hard to manage that risk.
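Given a database that does include delisted symbols, that key statistic could be estimated along the following lines. This is a minimal sketch: the `DropEvent` record, the one-year horizon, and the sample data are all hypothetical, and a delisting is treated as a failure even though some delistings are mergers or acquisitions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class DropEvent:
    symbol: str
    drop_date: date               # date the 70% drop completed
    delisted_on: Optional[date]   # None if the symbol is still listed

def survival_rate(events, horizon_days=365):
    """Fraction of drop events whose symbol was still listed
    horizon_days after the drop."""
    if not events:
        raise ValueError("no drop events supplied")
    horizon = timedelta(days=horizon_days)
    survived = sum(
        1
        for e in events
        if e.delisted_on is None or e.delisted_on > e.drop_date + horizon
    )
    return survived / len(events)

# Invented sample data, for illustration only.
events = [
    DropEvent("AAA", date(2001, 3, 1), None),
    DropEvent("BBB", date(2001, 6, 15), date(2001, 12, 2)),
    DropEvent("CCC", date(2002, 1, 10), date(2005, 7, 1)),
    DropEvent("DDD", date(2002, 9, 5), date(2003, 2, 20)),
]
print(survival_rate(events))  # 2 of the 4 events survive one year -> 0.5
```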

Pete

Post by taowave » Tue Feb 06, 2007 11:54 pm

Overload wrote:
I might not be understanding how creating synthetic stocks as you mention could solve the problem. The key piece of information we'd need is: of all stocks that drop 70% in a week, what percentage will survive? Without having a solid answer for that, it's pretty hard to manage that risk.

Pete


I think CSI may have a database that includes companies that went under during a given time period.

There must be a way to come up with an estimate of what % of stocks go under in their respective indices. Once you do that, you could assume that those stocks, if in your portfolio, would have been stopped out on their way to zero. One could factor in some sort of "bankruptcy cost", just as one factors in slippage.

Unfortunately, I think the bigger problem is the stock's action on the initial 70% drop. Once again, one could simply factor in the same bankruptcy cost, or a slight multiple of it, assuming stop losses were exceeded due to gaps. It's certainly not perfect, but it's a step in the right direction.

I agree that one must know the % of stocks that go belly up, but I don't think that is a difficult statistic to find. It's determining the price path to zero that is the more difficult task. IMHO, some cost factor must be assigned.
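One way to picture the "bankruptcy cost" idea is as a per-trade haircut on the back-tested average return. The sketch below is an assumption about how such a cost might be modeled, not a feature of any particular package; `p_failure` and `loss_on_failure` are hypothetical inputs that would have to be estimated separately.

```python
def adjusted_expected_return(avg_return, p_failure, loss_on_failure=1.0):
    """Blend a survivors-only average trade return with an assumed failure rate.

    avg_return      -- mean return per trade from the back test (0.08 = 8%)
    p_failure       -- assumed probability that a trade lands in a failing company
    loss_on_failure -- fraction of the position lost in that case
                       (1.0 = total loss; lower if a stop is assumed to work)
    """
    if not 0.0 <= p_failure <= 1.0:
        raise ValueError("p_failure must be between 0 and 1")
    if not 0.0 <= loss_on_failure <= 1.0:
        raise ValueError("loss_on_failure must be between 0 and 1")
    return (1.0 - p_failure) * avg_return - p_failure * loss_on_failure

def bankruptcy_cost(avg_return, p_failure, loss_on_failure=1.0):
    """Per-trade cost to subtract from the back-tested return, like slippage."""
    return avg_return - adjusted_expected_return(avg_return, p_failure, loss_on_failure)

# Illustrative numbers only: 8% average trade, 5% failure rate, 60% loss
# on failure because the stop is assumed to be gapped through.
print(adjusted_expected_return(0.08, 0.05, 0.60))
print(bankruptcy_cost(0.08, 0.05, 0.60))
```

With these made-up inputs the 8% average trade shrinks to about 4.6%, so even a small failure rate can remove a large share of the apparent edge.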

Post by taowave » Wed Feb 07, 2007 9:39 am

On a related note, don't you feel dividends should be accounted for? If one backtests over significant periods of time, that can have an extremely large effect.

To a much lesser degree, split vs. non-split data can have an impact as well.
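The size of the dividend effect can be seen by comparing a price-only return with a total return. The sketch below assumes dividends are reinvested at the closing price of the period in which they are paid; the function names and the sample data are hypothetical.

```python
def price_return(prices):
    """Return from the first close to the last, ignoring dividends."""
    return prices[-1] / prices[0] - 1.0

def total_return(prices, dividends):
    """Return with each dividend reinvested at that period's close.

    prices    -- closing price per period; the position opens at prices[0]
    dividends -- cash dividend per share paid in each period (same length);
                 dividends[0] is ignored because the position opens at that close
    """
    if len(prices) != len(dividends):
        raise ValueError("prices and dividends must have the same length")
    shares = 1.0
    for price, dividend in zip(prices[1:], dividends[1:]):
        # Cash received buys additional shares at the current close.
        shares += shares * dividend / price
    return shares * prices[-1] / prices[0] - 1.0

# Invented example: a flat stock paying a 1% dividend in each of 4 periods.
prices = [100.0, 100.0, 100.0, 100.0, 100.0]
dividends = [0.0, 1.0, 1.0, 1.0, 1.0]
print(price_return(prices))             # flat price, so 0.0
print(total_return(prices, dividends))  # roughly 0.0406
```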


Allan

Post by Dacamic » Wed Feb 07, 2007 3:06 pm

taowave wrote: There must be a way to come up with an estimate of what % of stocks go under in their respective indices. Once you do that, you could assume that those stocks, if in your portfolio, would have been stopped out on their way to zero. One could factor in some sort of "bankruptcy cost", just as one factors in slippage.

Unfortunately, I think the bigger problem is the stock's action on the initial 70% drop. Once again, one could simply factor in the same bankruptcy cost, or a slight multiple of it, assuming stop losses were exceeded due to gaps. It's certainly not perfect, but it's a step in the right direction.

I agree that one must know the % of stocks that go belly up, but I don't think that is a difficult statistic to find. It's determining the price path to zero that is the more difficult task. IMHO, some cost factor must be assigned.

I must confess my desire to have a complete prices database. I've even tried to create my own, but those anemic efforts were futile. So, I instead accept -- even though reluctantly -- certain assumptions regarding the validity of my price data.

Along those lines, a few ideas are listed below as suggestions for coping with fat tail swings that might not be reflected in our data sample:
    1. Do not take signals in live trading for positions in bankruptcy, because back tests do not include those circumstances;
    2. Similar to (1), do not take signals for positions being acquired, merged or bought out;
    3. Use short, recent evaluation periods for back tests;
    4. Use stable sectors for testing;
    5. Use fundamental filters;
    6. Use minimum price filters;
    7. Adjust slippage costs to be a proxy for adverse events; and,
    8. Create a trading rule that selects symbols with X% decline.
With the possible exceptions of (1) and (2), the ideas above can create a cure that is worse than the disease. For example, short evaluation periods can easily exclude other low-probability, high-impact events, e.g. failed clinical trials (I recently looked at a biotech company whose stock price fell 80% in one day after announcing a failed trial ... youch). So, care must be taken when deciding if and how to use any of the suggestions above.
Steve
Dacamic
 
Posts: 457
Joined: Wed Nov 30, 2005 12:40 pm

Post by taowave » Wed Feb 07, 2007 3:47 pm



I think you hit the proverbial nail on the head. We are in a risk business, and all we can do is identify the risk, address it, and tilt the probability of success to our side.

Post by taowave » Sat Mar 17, 2007 7:33 am

Hi all,
I have a "philosophical" question that relates to the survivorship bias question. I like to run AutoSearches, and currently I am running a custom BB (Bollinger Band) search in which SS takes 4 cross rules and tests the Nasdaq 100 stocks. My testing period began in 2000.

The core rules are:
close crosses above lower BB
close crosses above upper BB
close crosses below upper BB
close crosses below lower BB
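
For readers unfamiliar with these rules, the four crosses can be sketched as follows. The post does not give the band settings, so the 20-period, 2-standard-deviation defaults and all function names here are assumptions; this is not how SS implements them.

```python
from statistics import mean, pstdev

def bollinger_bands(closes, period=20, width=2.0):
    """Return (lower, upper) lists; entries are None until `period` closes exist."""
    lower, upper = [], []
    for i in range(len(closes)):
        if i + 1 < period:
            lower.append(None)
            upper.append(None)
            continue
        window = closes[i + 1 - period : i + 1]
        mid = mean(window)
        dev = pstdev(window)  # population standard deviation of the window
        lower.append(mid - width * dev)
        upper.append(mid + width * dev)
    return lower, upper

def crosses_above(series, level, i):
    """True if series went from at-or-below the level to above it on bar i."""
    if i < 1 or level[i] is None or level[i - 1] is None:
        return False
    return series[i - 1] <= level[i - 1] and series[i] > level[i]

def crosses_below(series, level, i):
    """True if series went from at-or-above the level to below it on bar i."""
    if i < 1 or level[i] is None or level[i - 1] is None:
        return False
    return series[i - 1] >= level[i - 1] and series[i] < level[i]

def core_rules(closes, period=20, width=2.0):
    """Evaluate the four cross rules on every bar."""
    lower, upper = bollinger_bands(closes, period, width)
    return [
        {
            "above_lower": crosses_above(closes, lower, i),
            "above_upper": crosses_above(closes, upper, i),
            "below_upper": crosses_below(closes, upper, i),
            "below_lower": crosses_below(closes, lower, i),
        }
        for i in range(len(closes))
    ]

# Tiny invented series with a short period so the crosses are easy to follow.
signals = core_rules([10, 10, 10, 10, 7, 9], period=3, width=1.0)
print(signals[4]["below_lower"], signals[5]["above_lower"])
```

In this toy series the close falls through the lower band on the fifth bar and crosses back above it on the sixth.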

As I have learned from SS, expect the unexpected when it comes to what has worked well in the past. There were several combinations that performed very well but had extremely long holding periods, with returns of 200-500%. I got to thinking that this type of system, with long holding periods and massive returns on a handful of winning stocks, may have worked well due to survivorship bias, as opposed to systems that had very short holding periods and were more "swing"-like in nature. I was not using any stops in my initial evaluation.

Any thoughts?

Post by Overload » Sat Mar 17, 2007 8:40 am

A few questions....

1) Just how long were the holding periods?

2) Were the gains distributed pretty evenly throughout your entire evaluation period?

3) How many winning and how many losing trades were there in your back test during this period?

Pete

Post by taowave » Sat Mar 17, 2007 5:04 pm

Hi Pete,

In some of the systems the average days held exceeded 400, and surprisingly enough it appears that the gains were fairly evenly distributed.

There were only 92 trades in the period from 2000-2006, as they were very long-duration trades. On the combination report, there were over 500 trades. Monte Carlo returns drop from 35% to 25%.

A great deal of the BB systems suggest that you buy extremely oversold stocks and let the chips fall where they may. The exit rules SS came up with are very difficult to meet, so the system is almost a buy and hold. I have intentionally not added any MM rules, but I do get a sneaking suspicion I am benefiting from survivorship bias in this instance.



Post by Overload » Sun Mar 18, 2007 9:31 am

Actually I'm not so sure there's a survivorship bias going on there. For that to happen, there would need to be a significant number of stocks in that sector going belly-up. And while there may be the occasional Enron, I'm guessing not that many stocks in the Nasdaq 100 go bankrupt. I don't have any stats to back that up, but the size of the companies in the Nasdaq 100 makes bankruptcies less likely than in, say, the Russell 2000.

But because the rotation of the Nasdaq 100 is so large, there may be a predictive bias that's having some impact. In short, your early purchases (2000-2003) may be based on buying stocks that will, at some point in the future, become a part of the Nasdaq 100. Obviously, knowing that a stock will become a part of the Nasdaq 100 in 3 or 4 years would give you predictive information that couldn't be repeated in real trading.
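The usual way to avoid this kind of predictive bias is to pick the tradable universe from the index membership as it stood on each test date, not from today's list. A minimal sketch, assuming a membership history table is available; the symbols and dates below are invented.

```python
from datetime import date

# Hypothetical membership history: symbol -> list of (added, removed) spans.
# `removed` is None while the symbol is still in the index.
MEMBERSHIP = {
    "OLDCO": [(date(1998, 1, 2), date(2002, 12, 20))],
    "NEWCO": [(date(2004, 12, 20), None)],
    "STEADY": [(date(1995, 6, 1), None)],
}

def members_on(day, membership=MEMBERSHIP):
    """Symbols that were in the index on `day`, according to the history table."""
    return sorted(
        symbol
        for symbol, spans in membership.items()
        if any(start <= day and (end is None or day < end) for start, end in spans)
    )

print(members_on(date(2001, 5, 1)))  # ['OLDCO', 'STEADY']
print(members_on(date(2006, 5, 1)))  # ['NEWCO', 'STEADY']
```

A back test that uses the 2006 list for a 2001 trade would be allowed to buy NEWCO years before anyone could have known it would join the index.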

Another question that comes to mind is whether these long holding periods were simply allowing you to buy the market in general. In other words, could you have picked any random stocks in the Nasdaq 100, held them for 400 days, and come out just as well? You can test this by having a look at your Z-Ratio for that strategy. While the Z-Ratio has different statistical levels, a number less than 1.64 (roughly the one-tailed 95% confidence level) would tell you that the back test shows no statistically significant predictive ability in your strategy, and that buying any stock in the Nasdaq 100 and holding it 400 days might do just as well.
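The thread does not say how the Z-Ratio is calculated, so the following is only a generic one-sample z-score that captures the same idea: compare the strategy's average trade with the average of random same-length holds in the same universe. The function name and the sample numbers are hypothetical, and the calculation assumes independent trades, which overlapping 400-day holds would not be.

```python
from math import sqrt
from statistics import mean, stdev

def z_score(strategy_returns, baseline_returns):
    """z-score of the strategy's mean trade return against a random-hold baseline.

    strategy_returns -- returns of the strategy's trades
    baseline_returns -- returns of random holds of the same length in the universe
    """
    n = len(strategy_returns)
    if n == 0 or len(baseline_returns) < 2:
        raise ValueError("need at least 1 strategy trade and 2 baseline returns")
    standard_error = stdev(baseline_returns) / sqrt(n)
    if standard_error == 0:
        raise ValueError("baseline returns have no variation")
    return (mean(strategy_returns) - mean(baseline_returns)) / standard_error

# Invented numbers, for illustration only.
baseline = [-0.2, -0.1, 0.0, 0.1, 0.2, 0.3]
no_edge = [0.1, 0.05, 0.0, 0.05]   # same average as the baseline
strong = [0.3, 0.3, 0.3, 0.3]      # well above the baseline average
print(z_score(no_edge, baseline))
print(z_score(strong, baseline))
```

The first score is essentially zero because the trades average the same as random holds; the second clears the 1.64 threshold mentioned above.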

FYI, there have been various times when strategies with long holding periods like this have shown up on my radar as well. And if the returns are that much better than anything that can be found with short holding periods, the idea can be tempting. But I've always ditched those systems primarily because I didn't want to wait 2 years to just find out if the system is working or not. Trading systems always take time, including a significant number of trades, to reach their average performance. And waiting 2+ years for that information might be a challenging thing to do. Too challenging for me anyway.

Pete

Post by taowave » Sun Mar 18, 2007 3:17 pm

Thanks for the help, and I will look a bit more closely at the Z-Ratio.

As I mentioned, I did not apply any MM to the strategy, and I have a feeling that will make a big difference. If you took a look at the MAE/MFE charts, you would see that positions had monstrous swings. Not sure I would consider that "system" trading.

