Survivorship Bias - Norgate's Data


Postby StinkiePhish » Fri May 11, 2012 1:01 am

I purchased the historical data for current and delisted US stocks from Norgate (aka premiumdata.net) to try and solve survivorship bias in my backtesting. I utilize SS strategies that initially filter all securities on price and volume.

When a stock that was listed on a major exchange (NYSE) becomes delisted, Norgate seems to dump that entire company's history into a generic "delisted securities" folder. I have no idea if the other stocks in that folder should or should not be made available to my strategies. A company like Enron provides a great example of a problem I foresee when using Norgate's data. When Enron was listed on the NYSE, my strategy could have theoretically traded it, so it needs to be in my backtesting data during that period. However, when Enron was delisted from the NYSE and traded OTCBB/PINK, I do *not* want it available for backtesting. I do not believe that such a delineation is possible because only one history file exists as "delisted data," ENRNQ-200411.

The recent merger of Constellation (CEG) and Exelon (EXC) in which CEG was delisted provides another example. Norgate has CEG's history from 1985 until 2012 in the same generic "delisted data" folder. Obviously, CEG should be in my backtesting data.

Essentially, my question is whether it is possible to build a survivorship bias-free database from Norgate's current and delisted data that excludes stocks, or periods of a stock's history, traded OTCBB/PINK. Does anyone have any experience massaging their delisted data into a workable survivorship bias-free database for SS? Any other ideas or red flags that would signal a delisting from the major exchanges?

(As an aside, I am into my second month of SS and am really enjoying using the program.)
StinkiePhish
 
Posts: 4
Joined: Thu Apr 26, 2012 12:09 am

Re: Survivorship Bias - Norgate's Data

Postby Overload » Fri May 11, 2012 9:32 am

I have never attempted to create a prices database that accounts for survivorship bias. But I think the goals are pretty straightforward:

1) You want to include the activity of stocks that no longer exist.

2) You want to create exits that properly reflect the price you would have received had you owned that particular stock on its final day.

Companies like Enron fall into the category where their decline was displayed in their stock price pretty much from beginning to end. In other words, the stock remained publicly traded during its entire decline from $90 to $1. So, for the most part, by simply including the Enron stock in your database, you'll be including that history (#1 above).

But sometimes a company might go bankrupt, or go through a merger or buyout. In this case, you want your "exit price" to reflect the return you would have had on your investment. For example, suppose a company went bankrupt and the stock price became 0. You'd want your position to exit with a price of 0. Or, suppose a company was bought, and received the equivalent of a 25% jump in price after the transaction. You'd want your position to exit with a price reflecting that 25% jump.

The basic processing in StrataSearch is to exit positions with the last price available within that Evaluation Period. For example, suppose you were holding a position in ABC on Oct 15, 2005 when that company went bankrupt. If the final stock price were 4.55, then that is the exit price that StrataSearch would use. But if you want the final price to reflect something different, you would need to attach a final price accordingly. For example, you could enter a price of 0 on Oct 16 if the company went bankrupt. Or you could enter a price of 5.69 on Oct 16 if the stock price had a 25% jump after being bought out.
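To make the "attach a final price" idea concrete, here is a minimal sketch in Python. It is not StrataSearch code and says nothing about StrataSearch's import format; the `append_exit_price` helper and the list-of-tuples price history are assumptions made purely for illustration. It only shows the arithmetic of adding one extra bar after the last real one.

```python
from datetime import date

def append_exit_price(history, exit_date, exit_price):
    """Return a copy of history with one synthetic closing bar appended.

    history is a list of (date, close) tuples in ascending date order.
    This layout is hypothetical, chosen only for the example.
    """
    if history and exit_date <= history[-1][0]:
        raise ValueError("exit date must come after the last existing bar")
    return history + [(exit_date, exit_price)]

# Last real bar for the fictional ABC from the example above.
abc = [(date(2005, 10, 15), 4.55)]

# Bankruptcy: the position should exit at 0.
bankrupt = append_exit_price(abc, date(2005, 10, 16), 0.0)

# Buyout at a 25% premium: 4.55 * 1.25 = 5.6875, rounded to 5.69.
buyout = append_exit_price(abc, date(2005, 10, 16), round(4.55 * 1.25, 2))
```

With the extra bar in place, "the last price available" becomes the synthetic one, so the exit reflects the real outcome rather than the last quoted trade.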

Creating a survivorship bias database is a pretty tricky thing to do, and I think that's why most users haven't gone to the trouble. But at the very least, I think you'll need to identify whether #1 or #2 (or both) are important to you, and create your database accordingly. #1 is easy. #2 will take some work.

Pete
Overload
 
Posts: 2248
Joined: Wed Nov 30, 2005 12:14 pm

Re: Survivorship Bias - Norgate's Data

Postby Dacamic » Fri May 11, 2012 10:47 am

For what it's worth, I have not had any stocks disappear on me during my seven years of systems-based trading. And, I've only experienced two disappearances in thirty years of financial fiddling: Montana Power Company (bankruptcy) and Yankee Candle (buyout). I had worked for Montana Power, and thus rode with them to $0 based on loyalty (it was a small position ... don't judge me). If I'm allowed that episode as an asterisk, my disappearance count drops to one.
Steve
Dacamic
 
Posts: 457
Joined: Wed Nov 30, 2005 12:40 pm

Re: Survivorship Bias - Norgate's Data

Postby Overload » Fri May 11, 2012 11:55 am

According to one study, an average of 5.6% of stocks listed on the Nasdaq were delisted each year. That's pretty significant. But naturally, your choice of holdings will have a big effect on whether or not you'll witness a delisting. For example, if your trading system buys from the Russell 2000, you'll have a much larger chance of having one of your stocks delisted than if your system buys from the S&P 500.

In any case, here's an old but informative study on the effects of delisting bias:

http://www-personal.umich.edu/~shumway/ ... sdbias.pdf

Pete
Overload
 
Posts: 2248
Joined: Wed Nov 30, 2005 12:14 pm

Re: Survivorship Bias - Norgate's Data

Postby Dacamic » Fri May 11, 2012 1:10 pm

With mention of indexes such as the Russell 2000 and S&P 500, we enter the realm of, figuratively speaking, rotational bias. Now, there's a bucket of worms worthy of banging one's head onto their desk.
Steve
Dacamic
 
Posts: 457
Joined: Wed Nov 30, 2005 12:40 pm

Re: Survivorship Bias - Norgate's Data

Postby Overload » Fri May 11, 2012 1:35 pm

Dacamic wrote: "Now, there's a bucket of worms worthy of banging one's head onto their desk."

Right. That's a whole other ballgame. Didn't mean to bring that up.

Pete
Overload
 
Posts: 2248
Joined: Wed Nov 30, 2005 12:14 pm

Re: Survivorship Bias - Norgate's Data

Postby StinkiePhish » Fri May 11, 2012 3:34 pm

Thank you for the responses. After further investigation (unfortunately after my purchase), I can say that Norgate's data is almost entirely unusable as a base for building a survivorship bias-free database for an active trading system that wants to avoid OTCBB stocks. Norgate states that it defines delisted stocks as "untradable," and files each stock under the last symbol it traded as (see the Enron example below). I had significantly misunderstood this treatment.

Pete, I am primarily focused on your listed concern #1.

If you purchase Norgate's "current" product, you receive folders of all US exchanges (e.g. NYSE, ARCA, NASDAQ, OTC). Their "delisted" product provides a single folder, "Delisted Securities." I imported each folder into SS and identified each by a separate exchange designation. "Delisted Securities" are lumped together as a single exchange. It is a total of approximately 25,000 symbols.

My concern is that once a company like Enron (bankruptcy) or Constellation (merger) is delisted, it becomes extremely difficult to identify that those companies should be in my tradable database at all.

Here is an example of how their historical data is treated:
From http://www.premiumdata.net/products/premiumdata/ushistorical.php#delisted
NASDAQ:FWLT
Foster Wheeler
Originally formed in 1927 by the merger of a power company owned by the Foster family and the "Wheeler Condenser and Engineering Company", it was listed on the NYSE under the symbol FW. In November 2003, the stock was removed from the NYSE due to its inability to meet requirements regarding stockholder's equity. It then traded on the OTC-BB under the symbol FWHLF. In June 2005, it was able to relist on a major exchange and began trading as NASDAQ:FWLT. Our history includes price data from all of these phases.


In the end, there is only one history file for this company. It is in the "current" NASDAQ folder listed as FWLT, and contains the history from 1985-2012. The time period between 2003-2005 when it was delisted from the NYSE is not in any way identified. Ideally, I would want this as an "untradable" period for SS testing.
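For illustration only, this is roughly what marking an "untradable" period could look like if the off-exchange window were known. It is a Python sketch, not an SS feature. The `UNTRADABLE` table and `tradable_bars` helper are hypothetical, and the exact days are assumptions, since Norgate's description gives only the months (November 2003 and June 2005).

```python
from datetime import date

# Hypothetical hand-maintained table of off-exchange windows per symbol.
# Day-of-month values are placeholders; the source gives months only.
UNTRADABLE = {
    "FWLT": [(date(2003, 11, 1), date(2005, 6, 1))],
}

def tradable_bars(symbol, bars, windows=UNTRADABLE):
    """Drop bars that fall inside any untradable window for the symbol.

    bars is a list of (date, close) tuples; a window is [start, end).
    """
    blocked = windows.get(symbol, [])
    return [
        bar for bar in bars
        if not any(start <= bar[0] < end for start, end in blocked)
    ]

# Made-up prices, used only to show which dates survive the filter.
sample = [
    (date(2003, 10, 31), 10.0),  # still on the NYSE: kept
    (date(2004, 6, 15), 2.0),    # OTC-BB period: dropped
    (date(2005, 6, 2), 12.0),    # relisted on NASDAQ: kept
]
kept = tradable_bars("FWLT", sample)
```

The hard part is the table itself: the data as delivered does not flag these windows, so each one would have to be researched by hand.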

Enron provides another example:
OTC:ENRNQ
Enron Corp originally formed in 1931 as the "Northern Natural Gas Company" (NYSE:NNG). In 1979 it reorganised and changed its name to "InterNorth" (NYSE:INI). After the takeover of the "Houston Natural Gas" company in 1985, it changed its name to Enron (NYSE:ENE). In January 2002, the NYSE removed Enron from the exchange and it became an OTC security trading as ENRNQ. It was finally delisted in November 2004.


Enron's entire history, despite the fact that it traded on the NYSE as ENE for the majority of its life, is in the Delisted Stock folder under the OTCBB symbol ENRNQ.

So I am left with "current" stocks that contain data from when the company was delisted to OTCBB, and with "delisted" stocks that contain data from when the company was actively traded on a major exchange.

Is this expected? Yes. Norgate properly accounts for a given company's tradable stock price through all phases of a company's life (major exchange --> OTCBB --> major exchange). Unfortunately, that's not precisely what I was looking for. If I want to trade only NASDAQ stocks, I want only the periods in which a company traded on NASDAQ. That is the only way SS could be traded live. I recognize I may be asking too much and we're entering the murky realm of rotational bias. All I ask for is a non-OTC v. OTC distinction! :mrgreen:

Ideas:
    Focus on a volume/liquidity filter to exclude periods where stocks were traded OTCBB. This may be my best and only option.
    A symbol filter (e.g. exclude delisted securities with 5 letter symbols that end in a Q, indicating bankruptcy) will not work because of symbols like Enron, discussed above. [*Edit: But I most likely can exclude foreign or ADR symbols by filtering 5 letter symbols with F and Y. This warrants further investigation.]
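The two ideas above can be sketched in Python as follows. This is a rough illustration, not SS formula syntax. The function names, the dictionary bar layout, and the thresholds (a 5.00 minimum price and 1,000,000 in daily dollar volume) are all assumptions and would need tuning against real data.

```python
def is_liquid(bar, min_price=5.0, min_dollar_volume=1_000_000):
    """Per-bar liquidity test meant to screen out OTCBB-like periods.

    bar is a dict with "close" and "volume" keys (a hypothetical layout).
    The thresholds are placeholder values, not recommendations.
    """
    close, volume = bar["close"], bar["volume"]
    return close >= min_price and close * volume >= min_dollar_volume

def looks_foreign_or_adr(symbol):
    """Flag five-letter symbols ending in F or Y, per the edit note above."""
    return len(symbol) == 5 and symbol[-1] in ("F", "Y")

# Made-up bars showing both outcomes of the liquidity test.
listed_bar = {"close": 42.0, "volume": 500_000}  # 21,000,000 dollar volume
otc_bar = {"close": 0.35, "volume": 200_000}     # fails on price and volume
```

Because the liquidity test runs bar by bar, it can exclude only the thinly traded stretch of a history file without needing to know the exchange. A stock that stayed liquid after leaving a major exchange would still pass, so this approximates the OTC distinction rather than reproducing it.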

Thanks again for your responses and thoughts. I hope that someone, somewhere, has taken on the arduous task of using the delisted data.
StinkiePhish
 
Posts: 4
Joined: Thu Apr 26, 2012 12:09 am

Re: Survivorship Bias - Norgate's Data

Postby Overload » Fri May 11, 2012 5:17 pm

Is there any particular reason why you don't want to include dates/prices in which the stocks were delisted? Stocks still have value when they're OTCBB, and they can still be traded. I would think that including that data would be better than excluding it entirely. For example, what if you happened to be holding one of those stocks when it was delisted? Your position would be incomplete without that OTCBB data included.

I'm also curious if Norgate adjusts the data during the delisted period. It's very common for penny stocks to go through reverse splits, and I'm curious if the price data is adjusted for that. For example, did FWLT have a reverse split during the 2003-2005 period? If so, is the price data adjusted for that split within the complete FWLT price file?

If the delisted price data is adjusted, that's one more reason why it might be helpful to include that data. But I guess it depends what your goal is.

Pete
Overload
 
Posts: 2248
Joined: Wed Nov 30, 2005 12:14 pm

Re: Survivorship Bias - Norgate's Data

Postby StinkiePhish » Tue May 15, 2012 11:07 pm

I do not have a particular reason for wanting to exclude OTCBB stocks. I assumed, perhaps naively, that components of the S&P 500 or listed NASDAQ/NYSE stocks would trade sufficiently differently from OTCBB stocks. Regardless, the delisted historical data cannot be differentiated that way, so the issue is moot.

The Norgate data is adjusted for splits, but not dividends.
StinkiePhish
 
Posts: 4
Joined: Thu Apr 26, 2012 12:09 am

