
Crowdsourced stock ratings – a new dataset

We are very excited to announce a new dataset we will be distributing: crowdsourced stock ratings from ClosingBell.

ClosingBell is an app-based collaborative stock trading platform which allows its users to connect their online brokerage accounts and share their trades and portfolios with one another. Users, who are retail traders, can also share their trade ideas, or ratings, with the ClosingBell community. The ClosingBell crowdsourced stock ratings dataset includes all such Buy and Sell ratings issued by the community’s members since 2014, currently over 40,000 ratings from 2,800 contributors.

We find that the ratings are predictive of stock prices over a one to twenty trading day horizon. Dollar neutral portfolios built using the ratings return 28% with a Sharpe ratio of 2.0. These returns are robust to trading costs and are not explained by common risk factors. The ClosingBell data represents a unique and powerful source of crowdsourced alpha which can improve systematic sentiment strategies.
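
For readers who want to kick the tires themselves, here is a rough sketch of the kind of long/short evaluation behind numbers like these: form a dollar neutral portfolio from Buy and Sell ratings, hold each rating for a fixed window, and compute annualized return and Sharpe ratio. The column names, holding period, and scaling below are illustrative assumptions, not the ClosingBell schema or our actual test methodology.

```python
import numpy as np
import pandas as pd

def ratings_backtest(ratings: pd.DataFrame, returns: pd.DataFrame, hold_days: int = 10):
    """Toy dollar neutral backtest of crowdsourced Buy/Sell ratings.

    ratings: columns ['date', 'ticker', 'rating'] with rating in {'Buy', 'Sell'} (assumed layout)
    returns: daily simple returns, indexed by trading date, one column per ticker
    """
    # Map ratings to signed bets: +1 for Buy, -1 for Sell
    sign = ratings['rating'].map({'Buy': 1.0, 'Sell': -1.0})
    signal = ratings.assign(sign=sign).pivot_table(
        index='date', columns='ticker', values='sign', aggfunc='sum')
    # Hold each rating for `hold_days` trading days (ratings on non-trading days are dropped here)
    signal = signal.reindex(returns.index).ffill(limit=hold_days).fillna(0.0)
    # Dollar neutral weights: demean cross-sectionally, scale to unit gross leverage
    weights = signal.sub(signal.mean(axis=1), axis=0)
    weights = weights.div(weights.abs().sum(axis=1), axis=0).fillna(0.0)
    # Apply yesterday's weights to today's returns, before costs
    port = (weights.shift(1) * returns).sum(axis=1)
    ann_ret = port.mean() * 252
    sharpe = port.mean() / port.std() * np.sqrt(252)
    return ann_ret, sharpe
```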

You can find our fact sheet here, and please contact us for more information or if you’d like to see historical data for backtesting.

The Quant Quake, 10 years on

August 7, 2017

Ten years ago today, systematic investing got a wake-up call when many quants across the Street suffered their worst losses – before or since – over a three-day period that has come to be called the “Quant Quake.” The event wasn’t widely reported outside of the quant world, but it was a worldview-changing week for those of us who traded through it. Most quant investors today, it seems, either didn’t hear the wake-up call or have forgotten it. This article addresses what’s changed in the last ten years, what hasn’t, what we learned and what we didn’t, and offers eight ideas on how to change your research process to insulate yourself from the next Quake.

In summary:

  • The Quake, which caused massive losses in quant funds in 2007 as well as some fund closures, was driven by crowded trades and similar alphas across many funds
  • There are more quants trading more capital today than ten years ago, but most of them haven’t significantly changed their alphas or data sources and have not widely adopted alternative data, perhaps due to complacency or herding behavior
  • So there’s more risk of another Quake than there was 10 years ago. In 2017 we’re seeing evidence of crowdedness and poor performance in standard strategies, with alternative data sets exhibiting far stronger performance
  • Quants need to build systematic processes for evaluating new data sources, and should view alternative data as a prime directive

The Quake

After poor but not hugely unusual performance in July ’07, many quantitative strategies experienced dramatic losses – 12 standard deviation events or more, by some accounts – over the three consecutive days of August 7, 8, and 9. In the normally highly risk-controlled world of market neutral quant investing, such a string of returns was unheard of. Typically secretive quants even reached out to their competitors to get a handle on what was going on, though no clear answers were immediately forthcoming.

Many quants believed that the dislocations must be temporary since they were deviations from what the models considered fair value. During the chaos, however, each manager had to decide whether to cut capital to stem the bleeding – thereby locking in losses – or to hang on and risk having to close shop if the expected snap back didn’t arrive on time. And the decision was sometimes not in their hands, in cases where they didn’t have access to steady sources of capital. Hedge funds with monthly liquidity couldn’t be compelled by their investors to liquidate, but managers of SMAs and proprietary trading desks didn’t necessarily have that luxury.

On August 10th, the strategies rebounded strongly, per the chart above from Khandani and Lo’s postmortem Quant Quake paper. By the end of the week, those quants who had held on to their positions were nearly back where they started; their monthly return streams wouldn’t even register a blip! Unfortunately, many didn’t, or couldn’t, hold on; they cut capital or reduced leverage – in some cases, like GSAM, to this day. Some large funds shut down soon afterwards.

What happened???

Gradually a sort of consensus emerged about what had happened. Most likely, a multi-strategy fund which traded both classic quant signals and some less liquid strategies suffered large losses in those less liquid books, and quickly liquidated its quant books to cover the margin calls. The positions it liquidated turned out to be very similar to those held by many other quant-driven portfolios across the world, and the liquidation put downward pressure on those particular stocks, thereby hurting other managers, some of whom in turn liquidated, causing a domino effect. Meanwhile, the broader investment world didn’t notice; these strategies were mostly market neutral, and there were no large directional moves in the market at the time.

With hindsight, we can look back at some factors which we knew to have been crowded and some others which were not, and see the difference in performance during the Quake quite clearly. In the chart below, we look at three crowded factors: earnings yield; 12-month price momentum; and 5-day price reversal. Most of the data sets we now use to reduce the crowdedness of our portfolios weren’t around in 2007, but for a few of these less-crowded alphas we can go back that far in a backtest. Here, we use components of some ExtractAlpha models, namely: the Tactical Model (TM1)’s Seasonality component, which measures the historical tendency of a stock to perform well at that time of year; the Cross-Asset Model (CAM1)’s Volume component, which compares Put to Call volume and option to stock volume; and CAM1’s Skew component, which measures the implied volatility of out of the money puts. The academic research documenting these anomalies was mostly published between 2008 and 2012, and the ideas weren’t very widely known at the time; arguably, these anomalies are still relatively uncrowded compared to their “Smart Beta” counterparts.

The table above shows the average annualized return of dollar neutral, equally weighted portfolios of liquid U.S. equities built from these single factors and rebalanced daily. For the seven-year period up to and through the Quant Quake, the less crowded factors didn’t perform spectacularly, averaging around 10% annualized before costs, roughly half the return of the crowded factors, which did quite well. But their drawdowns during the Quake were minimal compared to those of the crowded factors. We can therefore view some of these less crowded factors as diversifiers or hedges against crowding. And to the extent that one does want to unwind positions, there should be more liquidity in a less-crowded portfolio.
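
For concreteness, here is a minimal sketch of the single-factor portfolio construction used in the table: dollar neutral, equally weighted within the long and short sides, rebalanced daily, with annualized return and maximum drawdown as the summary statistics. The quantile cutoff and factor inputs are placeholders rather than the exact ExtractAlpha definitions.

```python
import pandas as pd

def single_factor_portfolio(factor: pd.DataFrame, returns: pd.DataFrame,
                            quantile: float = 0.2) -> pd.Series:
    """Daily returns of a dollar neutral, equally weighted single-factor portfolio.

    factor:  factor scores, indexed by date, one column per ticker (higher = better)
    returns: daily simple returns, same shape as factor
    """
    ranks = factor.rank(axis=1, pct=True)            # cross-sectional percentile ranks
    longs = (ranks >= 1 - quantile).astype(float)    # top bucket, equally weighted
    shorts = (ranks <= quantile).astype(float)       # bottom bucket, equally weighted
    weights = longs.div(longs.sum(axis=1), axis=0) - shorts.div(shorts.sum(axis=1), axis=0)
    return (weights.shift(1) * returns).sum(axis=1)  # rebalanced daily, before costs

def summarize(port: pd.Series) -> dict:
    """Annualized return and worst drawdown, the two statistics discussed above."""
    cum = (1 + port).cumprod()
    return {"annualized_return": port.mean() * 252,
            "max_drawdown": (cum / cum.cummax() - 1).min()}
```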

It turned out, we were all trading the same stuff!

The inferior performance of the factors which we now know to have been crowded was a shocking revelation to some managers at the time, who had viewed their methodology as unique or at least uncommon. It turned out, we were all trading the same stuff! Most equity market neutral quants traded pretty much the same universe, controlling risk using pretty much the same risk models… and pretty much betting on the same alphas built on the same data sources! In many ways, the seed of the idea which became ExtractAlpha – that investors need to diversify their factor bets beyond these well-known ones – was planted in 2007. At the time one might have assumed that other quants would have had the same thought, and that the Quant Quake was a call to arms – but as we’ve learned more recently, the arms don’t seem to have been taken up.

But it won’t happen again… will it?

Quant returns were generally good in the ensuing years, but many groups took years to rehabilitate their reputations and AUMs. By early 2016 the Quant Quake seemed distant enough, and returns had been good enough for long enough, that complacency had set in. Times were good – until they weren’t: many quant strategies have fared poorly in the last 18 months. At least one sizeable quant fund has closed, and several well known multi-manager firms have shut their quant books. Meanwhile, many alternative alphas have done well. In our view this was somewhat inevitable. Since 2013 we’ve been saying that the good times would eventually end because of crowding in common quant factors, driven in part by the proliferation of quant funds, their decent performance relative to discretionary managers, and the rise of smart beta products. And there’s a clear way to protect yourself: diversify your alphas!

With so much data available today, there’s no excuse for letting your portfolio be dominated by crowded factors.

With so much data available today – most of which was unavailable in 2007 – there’s no longer any excuse for letting your portfolio be dominated by classic, crowded factors. Well, maybe some excuses. Figuring out which data sets are useful is hard. Turning them into alphas is hard. But we’ve had ten years to think about it now. These are the problems ExtractAlpha helps its clients solve, by parsing through dozens of unique data sets and turning them into actionable alphas.

You’d think quants would actively embrace new alpha sources, and would have started doing so in earnest around August 15th, 2007. Strangely, they barely seem to have done so at all. Most quant managers still rely on the same factors they always have, though they may trade them with more attention to risk, crowding, and liquidity. Alternative data hasn’t crossed the chasm.

Perhaps the many holdouts are simply hoping that value, momentum, and mean reversion aren’t really crowded, or that their take on these factors really is sufficiently differentiated – which it may be, but it seems a strange thing to rely on in the absence of better information. It’s also true that there are a lot more quants and quant funds around now than there were then, across more geographies and styles, so the institutional memory has faded a lot. Those of us who were trading in those days are veterans (and we don’t call ourselves “data scientists” either!).

It’s also possible that a behavioral explanation is at work: herding. Just like allocators who pile money into the largest funds despite those funds’ underperformance relative to emerging funds – because nobody can fault them for a decision everyone else has also already made – or like research analysts who only move their forecasts with the crowd to avoid a bold, but potentially wrong, call – perhaps quants prefer to be wrong at the same time as everyone else. Hey, everyone else lost money too, so am I so bad? This may seem to some managers to be a better outcome than adopting a strategy which is more innovative than using classic quant factors but which has a shorter track record and is potentially harder to explain to an allocator.

Another quant quake is actually more likely now than it was ten years ago.

Whatever the rationale, it seems clear that another quant quake is actually more likely now than it was ten years ago. The particular mechanism might be different, but a crowdedness-driven liquidation event seems very possible in these crowded markets.

So, what should be done?

We do see that many funds have gotten better at reaching out to data providers and working through the vendor management side of the evaluation process. But most have not become particularly efficient at evaluating the data sets themselves, in the sense of finding alpha in them.

In our view, any quant manager’s incremental research resources should be applied directly towards acquiring orthogonal signals (and, relatedly, to controlling crowdedness risk) rather than towards refining already highly correlated ones in order to make them possibly slightly less correlated. Here are eight ideas on how to do so effectively:

  1. The focus should be on allocating research resources specifically to new data sets, setting a clear time horizon for evaluating each (say, 4-6 weeks), and making a definitive call about the presence or absence of added value from a data set. This requires maintaining a pipeline of new data sets and sticking to a schedule and a process.
  2. Quants should build a turnkey backtesting environment which can efficiently evaluate new alphas and determine their potential added value to the existing process. There will always be creativity involved in testing data sets, but the more mundane data processing, evaluation, and reporting aspects should be automated to expedite the process in (1).
  3. An experienced quant should be responsible for evaluating new data sets – someone who has seen a lot of alpha factors before and can think about how the current one might be similar or different. New data sets shouldn’t be a side project, but rather a core competency of any systematic fund.
  4. Quants should pay attention to innovative data suppliers rather than what’s available from the big players (admittedly, we’re biased on this one!)
  5. Priority should be given to data sets which are relatively easy to test, in order to expedite one’s exposure to alternative alpha. More complex, raw, or unstructured data sets can indeed get you to more diversification and more unique implementations, but at the cost of sitting on your existing factors for longer – so it’s best to start with some low hanging fruit if you’re new to alternative data.
  6. Quants need to gain comfort with the limited history we often see with alternative data sets. We recognize that with many new data sets one is “making a call” subject to limited historical data. We can’t judge these data sets by the same criteria of 20-year backtests as we can with more traditional factors, both because the older data simply isn’t there and because the world of 20 years ago has little bearing on the crowded quant space of today. But the alternative sounds far riskier.
  7. In sample and out of sample methodologies might have to change to account for the shorter history and evolving quant landscape; here is one approach to the problem.
  8. Many of the new alphas we find are relatively short horizon compared to their crowded peers; the alpha horizons are often in the 1 day to 2 month range. For large-AUM asset managers who can’t be too nimble, using these faster new alphas in unconventional ways such as trade timing or separate faster-trading books can allow them to move the needle with these data sets. We’ve seen a convergence to the mid-horizon as quants who run lower-Sharpe books look to juice their returns and higher-frequency quants look for capacity, making the need for differentiated mid-horizon alphas even greater.

I haven’t addressed risk and liquidity here, two other key considerations when implementing a strategy on new or old data. But for any forward-thinking quant, sourcing unique alpha should be the primary goal, and implementing these steps should help get them there. Let’s not wait for another Quake to learn the lessons of ten years ago!

Originally posted on LinkedIn at https://www.linkedin.com/pulse/quant-quake-10-years-vinesh-jha

Avoiding overfitting in an evolving market

This is a methodological note on how we approach in sample and out of sample testing.  Avoiding overfitting is of course paramount to building robust quantitative models.  Reserving some portion of one’s historical data for validation of one’s models is a great way to know whether you’ve overfit; if out of sample performance is comparable to in sample performance, you have some evidence that your modeling was less subject to data mining (in the pejorative sense of the term).  Slicing data by time is probably the most common way to do this, with the initial say 60-75% of the dates in the sample used for training (in sample) and the most recent 25-40% of dates used for validation (out of sample).

A difficulty of this approach is that markets evolve over time.  If you build a model which works great through several years ago, it may not reflect the increasing crowdedness of a trade, or changes in market microstructure or the macroeconomic environment, for example.  To address this problem we’ve devised a “striped” approach for our model building which alternates in sample and out of sample dates for some portion of the history, as follows:

Note that for the 2005-2016 period, 60% of the dates are in sample, with a contiguous block at the beginning and alternating months towards the end.  The last two years are reserved as fully out of sample.  You’ll also notice that in the striped 2010-2014 period the particular in sample months change each year: in even years (2010, 2012, 2014), odd months (Jan/Mar/May/Jul/Sep/Nov) are in sample; and in odd years, even months are in sample.  This helps to avoid seasonal bias in our results if, for example our returns are all coming from earnings season.
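
For those who want to replicate the idea, here is a minimal sketch of how such a striped in sample / out of sample mask could be built with pandas: a contiguous in-sample block at the start of the history, alternating months in the striped years with the parity flipping each year, and the final two years held out entirely. The date boundaries below follow the description above but are otherwise illustrative, not our production code.

```python
import pandas as pd

def striped_in_sample_mask(dates: pd.DatetimeIndex) -> pd.Series:
    """True = in sample, False = out of sample.

    Splits (per the description above): an initial contiguous in-sample block,
    striped months over 2010-2014, and the last two years fully out of sample.
    """
    years, months = dates.year, dates.month
    mask = pd.Series(False, index=dates)
    # Contiguous in-sample block at the start of the history
    mask[years <= 2009] = True
    # Striped years: even years take odd months in sample, odd years take even months,
    # so the in-sample months change each year and seasonal effects (e.g. earnings
    # season) don't all fall in sample or all out of sample
    striped = (years >= 2010) & (years <= 2014)
    mask[striped & (years % 2 == 0) & (months % 2 == 1)] = True
    mask[striped & (years % 2 == 1) & (months % 2 == 0)] = True
    # 2015-2016 remain fully out of sample
    return mask

# Example: build the mask over the 2005-2016 history before any modeling
dates = pd.bdate_range("2005-01-01", "2016-12-31")
in_sample = striped_in_sample_mask(dates)
```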

The benefit here is that we’re able to look at relatively recent data without burning all of the out of sample data through that relatively recent date.  The cost is that one must be very careful to not let out of sample data “bleed” into in sample periods which come afterwards.  In particular, we completely throw away data on anything we’re trying to predict (for example, returns) from the out of sample data prior to doing any modeling.

This is just one of several techniques we use to strengthen our model construction process, some of which I touched on in an article series last year on best practices in quant research.  The result is that we should see out of sample performance which is broadly comparable with, and roughly contemporaneous with, in sample performance, as we saw for our recently released CAM1 model:

It’s comforting to see that the degradation in cumulative returns when going out of sample is modest.  Good in sample selection is a very useful item in a quant’s toolbox and deserves careful consideration.

Crowded stat arb vs alternative alpha

We’ve been hearing tales of woe across the board regarding stat arb performance lately, with May sounding especially bad.  Mean reversion strategies suffered significantly during the month, and fears of crowding and liquidation events are floating around.  Here’s a plot of the HFRX Equity Market Neutral index YTD, alongside a simple dollar neutral reversal strategy before costs:

It’s an ugly chart and roughly in line with the chatter.  It’s too soon to know whether we’re in for another August 2007-like quant crisis, but even the hint of such an event provides all the motivation one should need to diversify one’s alpha sources.
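
For reference, here is a bare-bones sketch of the kind of simple dollar neutral reversal strategy plotted above, assuming a panel of daily close-to-close returns; the five-day lookback and the weighting scheme are illustrative choices, not the exact construction behind the chart.

```python
import pandas as pd

def reversal_strategy(returns: pd.DataFrame, lookback: int = 5) -> pd.Series:
    """Daily P&L of a simple dollar neutral short-term reversal strategy, before costs.

    returns: daily simple returns, indexed by date, one column per ticker.
    """
    # Bet against the trailing `lookback`-day move in each name
    signal = -returns.rolling(lookback).sum()
    # Dollar neutral weights: demean cross-sectionally, scale to unit gross leverage
    weights = signal.sub(signal.mean(axis=1), axis=0)
    weights = weights.div(weights.abs().sum(axis=1), axis=0)
    # Apply yesterday's weights to today's returns
    return (weights.shift(1) * returns).sum(axis=1)
```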

Meanwhile we’re still seeing very solid performance from our alternative alpha models: the TRESS signal (financial bloggers sentiment) has a YTD Sharpe of 1.3, and the Digital Revenue Signal (online consumer demand) is showing a whopping 4.8 (with five months in a row of positive performance), and both models are showing much lower volatility than we’ve been seeing with mean reversion lately.

Alternative data conferences – two upcoming and one past (with video!)

Our founder Vinesh Jha was a panelist at a recent alternative data conference hosted by Wall Street Horizon in New York, discussing alternative data adoption, some of the exciting new (and in some cases, old but newly available) data sets out there, and some of the issues with data robustness.

Vinesh will be speaking at two more events: Estimize’s Learn2Quant, designed for discretionary managers who would like to embrace data driven processes more; and Battle of the Quants, where he’ll be on a panel discussing structured and unstructured datasets.

OTAS Technologies Partners With ExtractAlpha To Provide Enhanced Trading Factor Analytics

Published on Oct 03, 2016

OTAS Technologies (OTAS), a specialist provider of market analytics and trader intelligence, today announced it has partnered with ExtractAlpha, the independent fintech research firm, to provide unique, actionable trading factor analytics. The content will be available to buy-side and sell-side clients including quants and asset managers.

ExtractAlpha’s Tactical Model is now available within OTAS Core Summary allowing clients to benefit from rigorous quantitative analysis built for alpha generation and superior trade timing. ExtractAlpha’s quantitative models are designed for institutional investors to gain a measurable edge over their competitors, optimize trade entry and exit points, and avoid crowded trades. Currently ExtractAlpha has three stock selection models for U.S. equities, with further models and global coverage to follow.

“We partnered with OTAS as together both companies can provide a unique offering of trading analytics to the market that isn’t available through any other company. OTAS’ data sets are extremely valuable and will help our investors profit from this exclusive source of information,” said Vinesh Jha, CEO of ExtractAlpha. “With the proliferation of new data sets available today, forward thinking managers will now have an opportunity to be ahead of the game.”

“OTAS is proud to be working with ExtractAlpha, adding to the growing list of integration partners we are already working with,” said Tom Doris, CEO of OTAS Technologies. “OTAS is continually looking to extend our list of strategic partners, ensuring clients are armed with the tools they need to be as successful in their trading decisions as possible. With this integration, clients will now be able to take advantage of this unique analysis allowing them to make trading decisions in a quick and efficient manner.”

Speaking at the GS Quant Conference in London

We’re excited to have the opportunity to present at Goldman’s Quantitative Investing conference on Sept 15 in London! We’ll be on a panel discussing alternative data in investment management. Should be an interesting topic.

Here’s the registration page.

We’ll also be at the London Quant Group conference in Oxford from Sept 12-15.  If you’ll be at either event please say hello!

Revenue Surprise hit rates for Q2

Second quarter results for the Digital Revenue Signal (DRS) are in, and as with previous quarters, stocks with high rankings consistently beat revenue expectations whereas those with low rankings consistently missed.  DRS is a new quantitative stock selection model designed to predict the likelihood of a company beating revenue expectations based on trends in online consumer demand from web Site, Search, and Social data sources.

In line with historical averages, 70% of top-ranked stocks according to DRS beat expectations, versus only 36% of bottom-ranked stocks, as the headline chart shows.  Top ranked stocks experienced an average surprise magnitude of +2.2%, versus a 3.1% average shortfall for bottom-ranked stocks.  Both of these metrics are closely in line with last quarter and prior quarters.
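
For anyone who wants to run the same tally on their own data, here is a small sketch of how beat rates and average surprise magnitudes like those above can be computed by rank bucket. The column names and the 20% top/bottom cutoffs are assumptions for illustration, not the DRS specification.

```python
import pandas as pd

def surprise_hit_rates(df: pd.DataFrame, cutoff: float = 0.2) -> pd.DataFrame:
    """Beat rate and average revenue surprise for top- vs bottom-ranked stocks.

    df columns (assumed): 'score'    - model score or rank, higher = more bullish
                          'surprise' - reported revenue vs consensus, in percent
    """
    pct = df['score'].rank(pct=True)
    bucket = pd.Series('middle', index=df.index)
    bucket[pct >= 1 - cutoff] = 'top'
    bucket[pct <= cutoff] = 'bottom'
    stats = df.groupby(bucket)['surprise'].agg(
        beat_rate=lambda s: (s > 0).mean(),   # share of names beating expectations
        avg_surprise='mean')                  # average surprise magnitude, in percent
    return stats.loc[['top', 'bottom']]
```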

See a full report with examples here.