Glass Oracles: The Art and Danger of Prediction (Part 1)

It’s been a while since my last post, mainly due to real-life factors like, well, the end of the quarter of my Ph.D. program fast approaching. I’ve been extremely lucky to be on some fantastic podcasts to muse about life, the universe and everything, and I ended up in some bizarre Twitter drama, which is doubly hilarious given it’s been mostly mid-career finance professionals and sir, this is a Denny’s.

In this weird niche of the internet we find ourselves in, financial Twitter, you see an interesting confluence of people throughout the financial spectrum: ambitious youngsters (myself included) exchanging clout as currency, talented researchers, complete charlatans, legendary investors, powerful hedge fund managers, regular day-traders and low-information investors. What’s fascinating about the internet is that it ends up fairly meritocratic, up to a point: regardless of origin, if you produce quality content, your audience will follow. This algorithm can of course be gamed, and this isn’t to say that everyone with a large following is necessarily knowledgeable. Despite recent events, I’m still 25, and I have yet to work in industry. Books only get you so far.

The most interesting phenomenon to come out of this is the currency of clout in the trading and financial world: the art of prediction. Prediction of the market is one part magic trick, one part weather forecast. More than any other content, it tends to be the most impactful and the quickest way to build up an audience, and it tends to create a positive feedback loop (successful predictions tend to signal boost the predictor exponentially). I’m not calling prediction the realm of charlatanism; it’s undeniable that successful investors predict the market, albeit on varying timeframes. We instinctively make predictions every day, but most of us don’t call it that.

For this series I’m going to focus on the non-intuitive edge cases of probability, and the implications they have for our understanding of the markets and for trading.

The Sun Also Rises

“What is the probability that the sun will rise tomorrow?”

On its face, this question is almost obvious nonsense: within the scope of human experience (at least until modern astronomy), the Sun had always risen each day, right on cue. In general, humans understand probability in the observational sense. For example, given a fair six-sided die, I can roll it enough times and observe that the probability of any given side facing up is about one in six. If we rely on this observational understanding of probability, the answer to this question simply must be 100% (there is no chance the Sun does not rise tomorrow), because on all days ever observed, the Sun has risen. However, we know innately that this is invalid (never mind our modern-day knowledge of stellar lifecycles). There must, despite our never having observed this possibility, be some probability, however slight, that the Sun fails to show up tomorrow morning.

Pierre-Simon Laplace, a pretty famous 18th century mathematician, tried to answer this problem through the introduction of pseudocounts, also known as Laplace additive smoothing. This sounds particularly fancy, and I could use lots of Greek letters to describe it, but it is easiest to start from the following simple understanding of an event’s probability:

P(“some event happening”) = Number of times it happens/total number of times

This easily describes the die analogy — given perhaps 1000 rolls, you expect that around 166 times you’ll see 1, 166 times you’ll see 2, etc (this doesn’t happen exactly in practice, but is close enough). However, it fails to account for the Sun not rising tomorrow, simply because that has never happened in our total set of observations.
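To make the die analogy concrete, here is a minimal sketch of that observational notion of probability. The function name, the seed and the 1000 rolls are my own illustrative choices; the exact counts will wobble from run to run, which is the "close enough" part.

```python
import random
from collections import Counter


def roll_frequencies(n_rolls, seed=42):
    """Roll a simulated fair six-sided die and return observed frequencies.

    The seed is an arbitrary assumption, used only for reproducibility.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    # P(face) = number of times it happens / total number of times
    return {face: counts[face] / n_rolls for face in range(1, 7)}


frequencies = roll_frequencies(1000)
for face, freq in frequencies.items():
    print(f"face {face}: observed {freq:.3f}, theory {1 / 6:.3f}")
```

Note that a face that never came up would get a probability of exactly zero here, which is the same blind spot the Sun question exposes.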

To reconcile this simple notion of probability with an obvious possibility, Laplace simply considered that although observationally the Sun has always risen (100% of days), to truly gauge the probability of the Sun not rising we must consider two more “pseudo”-observations: a day where the Sun does not rise, and a day that the Sun does. By constructing these “pseudo”-observations, he was able to derive in his famous Law of Succession that the true probability of the Sun not rising was not 0, but:

P(“some unobserved event happening”) = 1/(total number of times+2)

This allows us to create a sensible — albeit upper bound — understanding of probability for unobserved events. This has substantial applications in bioinformatics and computer science, given it allows us to deal with otherwise sparse probability matrices in a reasonable way (for example, in motif finding).
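As a rough sketch of the difference between the two estimates, assuming a made-up record of 10,000 observed days (the function names and the day count are hypothetical, not historical data):

```python
from fractions import Fraction


def empirical_probability(occurrences, trials):
    """Plain observational estimate: times it happened / total times."""
    return Fraction(occurrences, trials)


def laplace_probability(occurrences, trials):
    """Rule of succession: add one pseudo-success and one pseudo-failure."""
    return Fraction(occurrences + 1, trials + 2)


days_observed = 10_000      # hypothetical number of recorded days
sunrises = days_observed    # the Sun rose on every one of them

p_no_sunrise_empirical = 1 - empirical_probability(sunrises, days_observed)
p_no_sunrise_laplace = 1 - laplace_probability(sunrises, days_observed)

print(p_no_sunrise_empirical)  # 0
print(p_no_sunrise_laplace)    # 1/10002
```

The smoothed estimate is exactly 1/(total number of times + 2), and it shrinks as the record of sunrises grows, but it never reaches zero.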

This is especially important in the markets. Despite common belief, stocks don’t historically only go up. In many ways, the recent 2010s bull run has been a substantial anomaly in historical market analysis (and as I’ve alluded to on Twitter, perhaps the post-2018 period has been even more of an anomaly). With minimal exception (like the 2020 Coronavirus Crash), since the nadir in February 2009 the U.S. equities market has marched ever higher, propelled by economic growth and unparalleled intervention by the U.S. Federal Reserve.

This can be problematic when constructing backtests. Although 12 years in the real world is a fairly significant length of time (heck, it’s about half of my age), it has been anything but a normal regime. When designing backtests or, in a more general context making inferences about historical correlations, one has to be mindful of the length of time they’re observing. Much like Laplace and the Sun, the window of observation can greatly distort (both in underestimation and overestimation) probability and success of backtested quantitative strategies.

This is especially salient to me given the recent predictive success of my model, the Net Options Pricing Effect. For a great example, we can observe a recent backtest:

Obviously not going to give details on the parameters here.

If you look at this prima facie, the results look incredible. Using a similar buy and hold strategy on SPY, you make nearly $50,000 less over the time period (again, ignoring transaction costs, slippage, or whatever else you want to throw at it). Let’s simply observe the elephant in the room:

This time period, if you didn’t guess, is the Coronavirus Crash of 2020. The model loves volatility, and made out like a bandit during the massive financial meltdown (at least in a perfect backtest!). However, an event of similar caliber is unlikely to recur any time soon. So even though the crash shows up in the backtest (and of course actually happened in the period tested), the likelihood of observing these returns in the future is low.

The Mail Wizard of Oz

Imagine I have a mailing list of 1,000,000 subscribers that perhaps I bought off the dark web somewhere, and I know these people are gullible and also trade the markets. I want them to believe I’m some sort of prediction guru to ideally bilk the marks of money later on down the line (or recommend them stocks to pump and dump for example).

That said in real life I’m less of a Rasputin and more of a:

I don’t really have a crystal ball to predict tomorrow, but I want people to think I do.

How would I construct my scam?

Simply put, let’s assume the probability of SPY going up tomorrow is 1/2 and the probability of it going down is also 1/2 (this isn’t exactly correct, but that’s an implementation detail here).

Each day, I could simply send to half my audience a prediction of green for the next day and to the other, a prediction of red. On the first day, half of my audience (500,000 individuals) would receive the correct prediction, while the other half would receive the incorrect prediction. If I were then crafty, I could cease sending mail to the half that received the incorrect prediction.

On the second day, I could pull off the same sleight of hand, sending half my audience a prediction of green and the other half red. Just as before, we would expect that half of our remaining audience (250,000) would receive prophetic predictions, while we would discard the other half.

We can trivially continue this train of logic to the nth day and see that the surviving audience follows a simple geometric sequence, halving each day. On the nth day, about 1,000,000 * 2^-n subscribers will have received nothing but crystal clear, accurate predictions! From their perspective, you are now an oracle. To everyone else, you’re simply making random guesses.
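The halving is easy to simulate. This is only a sketch under the 50/50 assumption above; the `run_scam` name, the seed and the coin-flip market are all hypothetical stand-ins. Notice that the market outcome barely matters: whichever way SPY goes, roughly half the list got the "right" call.

```python
import random


def run_scam(subscribers, days, seed=0):
    """Return how many subscribers have seen only correct calls after each day.

    Each day half of the remaining list is told "green" and the other half
    "red"; whoever got the wrong call is dropped from the list.
    """
    rng = random.Random(seed)
    remaining = subscribers
    history = []
    for _ in range(days):
        market_up = rng.random() < 0.5    # assumed fair coin-flip market
        told_green = remaining // 2
        told_red = remaining - told_green
        remaining = told_green if market_up else told_red
        history.append(remaining)
    return history


survivors = run_scam(1_000_000, 10)
for day, count in enumerate(survivors, start=1):
    print(f"day {day}: {count} subscribers with a perfect record")
```

After ten days, fewer than a thousand people remain, and each of them has watched ten correct calls in a row without a single real prediction being made.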

Such is the power of observational bias in prediction. From the lucky subscriber’s standpoint, using simple empirical probability, you are basically an omniscient guide to the markets. From this, you can prey on them as you please.

The dangers of this exercise are more salient when you consider the real world implications. Humans tend to be optimistic and prone to magical thinking, no matter how rational we tend to believe ourselves to be. By the nature of probability, there is always a possibility of any possible event occurring, without necessary appeal to higher authority. From the observer’s perspective, our charlatan hit a perfect home run (or whatever the sportsball metaphor is here). However, there is no causal basis here to believe in our forecast abilities; by simple math, we would anticipate that someone would receive completely accurate predictions even though no prediction was made at all. This appears in the market in the construction of many trading strategies, which I will talk more about in further posts.

According to History.com, the ancient Aztecs believed that human sacrifice was necessary to keep the Sun rising tomorrow. In a bizarre and perhaps alien way, this makes sense. Perhaps an early ancestor of the Aztecs noted a solstice or persistent cloudy period, and became the first to sacrifice a fellow countryman to appease the bloodlust of Huitzilopochtli. Perhaps it made sense in a mystical albeit causal way given Aztec cosmology. At the end of the day, blood was spilt and the Sun did rise. As the blood kept flowing, the Sun never disappeared. However, hundreds of years later and starved of blood flow, the Sun still continues her daily procession. How could this be?
