Human Hubris and the Big Data Fallacy

I recently finished reading both The Black Swan and Antifragile by Nassim Taleb, and I find myself constantly thinking about the “Turkey Problem”. For those unfamiliar with Taleb’s work, the Turkey Problem is the fallacy of assuming that past results indicate future trends.

Taleb uses a turkey being raised for Thanksgiving as an illustration. The turkey is fed and cared for every day of its life, from birth until the day it dies. A human feeds it, provides it with shelter, and protects it from the harshness of the real world (predators, weather, etc.). After the first thousand days of this, the turkey truly comes to believe that the world is a place where humans constantly work to improve its well-being. Extrapolating into the future, the turkey predicts (perhaps using Big Data techniques?) that this state of affairs will go on forever. Unfortunately for the turkey, on day 1001, the day before Thanksgiving, the plump bird is killed and served as the centerpiece of your Thanksgiving meal. Yum!

If the turkey were using modern statistical models, this event would’ve come as a complete shock. Extrapolation is dangerous territory, but what makes this situation even worse is that the turkey is in the most danger exactly when the model, backed by 1000 days of data, says it is safest!
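To make the turkey’s forecasting error concrete, here’s a minimal sketch in Python (all numbers invented for illustration): fit a simple trend line to 1000 good days of well-being data and extrapolate to day 1001, exactly as the turkey does.

```python
import numpy as np

# Hypothetical data: 1000 days of the turkey's "well-being" score,
# trending gently upward with some day-to-day noise.
rng = np.random.default_rng(42)
days = np.arange(1, 1001)
well_being = 50 + 0.05 * days + rng.normal(0, 2, size=days.size)

# Fit a straight line to the first 1000 days (ordinary least squares).
slope, intercept = np.polyfit(days, well_being, deg=1)

# Extrapolate to day 1001 -- the day before Thanksgiving.
print(f"Forecast for day 1001: {slope * 1001 + intercept:.1f}")  # ~100, the best day yet

# Nothing in those 1000 data points carries any information about
# the regime change on day 1001, so the model cannot see it coming.
```

The fit gets tighter with every passing day, which is exactly why the forecast is most confidently wrong at the end.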

Living in the SF Bay Area, I naturally found my thoughts drifting to the rise of “Big Data” and how it is now being used in almost every industry to “optimize” and make predictions about the future. In markets with physical constraints and well-understood rules (bounds, defined odds, etc.), there is definitely value in having a model and using it as a predictor. For example, a casino can model the behavior of gamblers to optimize profits. Fine. No problem with that.

Where things get tricky is when people take complex systems that are not well understood and try to build models on top of them. Don’t get me wrong: there is nothing wrong with using models to try to build a picture of the universe in some controlled manner. These models get dangerous when people use them exclusively to drive company policy or, worse, government policy, forcing people into their contrived version of “reality”. Prime examples of dangerous models include anything health-, climate-, or economy-related.

In case you think the turkey example is limited only to poultry, remember that after every major global event in our lives, the people in charge said that “this couldn’t have been predicted”. They’re right. From 9/11 to the 2008 economic meltdown to Fukushima, every major event that has shaped the present has been a surprise that went beyond anything that was modeled.

There’s a reason for this. A model simply cannot predict something that has never happened before if its only basis for the future is the past.
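Here’s a toy sketch of that limit, again with invented numbers: if we estimate the probability of a bad day purely from the historical record, anything worse than the worst day ever observed comes out at exactly zero, no matter how much data we collect.

```python
import numpy as np

# Hypothetical history: 10,000 ordinary days of a "daily loss" figure
# (positive = bad day, negative = good day).
rng = np.random.default_rng(0)
history = rng.normal(loc=0.0, scale=1.0, size=10_000)

def prob_of_loss_beyond(threshold: float, data: np.ndarray) -> float:
    """Estimate P(loss > threshold) purely from the empirical
    distribution of past events."""
    return float(np.mean(data > threshold))

# Within the observed range, the estimate looks sensible...
print(prob_of_loss_beyond(2.0, history))            # roughly 0.02

# ...but anything beyond the worst day on record gets probability zero.
print(prob_of_loss_beyond(history.max(), history))  # exactly 0.0
```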

And to finish this post off, here’s a prediction about the future of the Internet, made in 1998 by Nobel Prize-winning economist Paul Krugman:

“The growth of the Internet will slow drastically, as the flaw in ‘Metcalfe’s law’–which states that the number of potential connections in a network is proportional to the square of the number of participants–becomes apparent: most people have nothing to say to each other! By 2005 or so, it will become clear that the Internet’s impact on the economy has been no greater than the fax machine’s.” (emphasis mine)

With a track record like that, you’d think people would stop listening to the forecasters. Unfortunately, that’s not the case. Paul Krugman is still out there, making predictions and writing his weekly column in the New York Times. That’s human hubris right there.