What can Darwin’s finches tell us about the downturn?
Newspaper articles paint the markets in metaphors like “difficult climate” and “harsh landscape” – but these clichéd phrases have a kernel of truth. Thinking about markets as natural environments reveals that selective forces are at work. But it also predicts when they work. In the natural world, as the story of Darwin’s finches tells us, selection acts in times of crisis: drought, famine, and disease. For our markets, that time is now.
(Aside: I confess that relating the economic crisis to Darwin is a symptom of an academic bad habit: namely, mapping every phenomenon onto the intellectual giant of one’s field. Somewhere there is a psychologist blogging about Freud and the economy.)
When does natural selection act? This question motivated two modern naturalists, Peter and Rosemary Grant, who studied Darwin’s finches over several decades on the Galápagos Islands, and whose work is chronicled in Jonathan Weiner’s The Beak of the Finch.
During the wet seasons, it was hard to see how a finch’s beak made any difference to its fitness.
“[F]inches with long thin beaks and short fat parrot-like beaks [were] all hopping on the same lava, eating identical bird food… All those beaks were cracking the same birdseed.” [p.52]
A long line of ornithologists had concluded that the beak of the finch was unimportant.
Despite this, Peter and Rosemary Grant kept returning to the islands, and kept measuring beaks. In 1977, the rainy season brought no rain. Weiner describes what the naturalists witnessed:
“They found fewer than two hundred finches alive on the island. Just one finch in seven had made it through the drought… The average beak before the drought was 10.68 millimeters long and 9.42 deep. The average beak of the fortis that survived was 11.07 millimeters long and 9.96 deep… The birds were not simply magnified by the drought: they were reformed and revised. They were changed by their dead. Their beaks were carved by their losses.” [p.78]
The drought was the crucible that shaped the species. And it wasn’t simply size, but dimension (longer and deeper beaks, versus wider) that separated the survivors from the dead.
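To make the "dimension, not just size" point concrete, here is a minimal sketch in Python that works through the arithmetic using only the averages quoted above. The variable and function names are my own, chosen for illustration. Width is left out because the quoted passage gives no figure for it.

```python
# Average fortis beak measurements in millimeters, as quoted above
# from The Beak of the Finch (p. 78). Width is not quoted, so it is omitted.
before = {"length": 10.68, "depth": 9.42}
after = {"length": 11.07, "depth": 9.96}


def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100


# Percent change in each dimension between the pre-drought population
# and the drought survivors.
changes = {dim: percent_change(before[dim], after[dim]) for dim in before}

for dim, pct in changes.items():
    print(f"{dim}: {before[dim]} mm -> {after[dim]} mm ({pct:+.1f}%)")

# If the birds had merely been "magnified", the depth-to-length ratio
# would stay the same. A shift in the ratio indicates a change in shape.
ratio_before = before["depth"] / before["length"]
ratio_after = after["depth"] / after["length"]
print(f"depth/length ratio: {ratio_before:.3f} -> {ratio_after:.3f}")
```

On these figures, average length rises by roughly 3.7 percent while average depth rises by roughly 5.7 percent, so the survivors' beaks were proportionally deeper, not just scaled up.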
In the same way, the benefits of new technologies are often masked during good times. Firms with both new and old technologies remain solidly profitable, happily hopping along. Like ornithologists watching finches in the wet season, some analysts have questioned whether technological innovation even matters. Robert Solow summed up this paradox by quipping “You can see the computer age everywhere but in the productivity statistics.”
But when hard times hit, innovators survive. More importantly, they flourish when the business cycle swings up again. Work by Erik Brynjolfsson and others has shown strong positive evidence for technology’s impact on productivity, most markedly over five-to-seven-year periods – the resonant frequency of the business cycle. But as with Darwin’s finches, the survivors are not just those who have more technology investments, but those who get the dimensions right.
Downturns are not only good for innovation, they are necessary. While innovation may occur in times of plenty, crises allow the right innovations (hybrid cars) to outcompete the wrong ones (SUVs). This assumes that crises are allowed to run their course (the case against bailouts), but that there are at least some survivors (the case for them).
As a data guy, I’m cautiously optimistic that firms that have invested in analytics, and have quietly innovated in understanding their business data, will emerge as winners on the other side of this downturn. As a contemporary of Darwin’s said, “That which does not kill us makes us stronger.”
What I’ll be presenting at O’Reilly Money Tech 2009
I’ve been invited to speak at O’Reilly’s Money Tech conference this coming February 4-6 in New York City and thought I’d share the abstract for my talk here. I’ll likely be in New York for several days; if you’d like to get together to chat about data, drop me a line!
Open Source Analytics: Visualization and Predictive Modeling of
Big Data with the R Programming Language
ABSTRACT
Just as the explosion of online data catalyzed the development of
storage technologies such as Hadoop, new challenges in data analytics
– turning terabytes into actionable insights – demand new tools. R,
an open-source language for statistical computing and graphics, is an
extensible, embeddable, and industrial-strength solution for analytics.
In this session, I showcase R’s power by building predictive models
for Brazilian soybean harvests and baseball slugger salaries.
DESCRIPTION
The economics of data aggregation and analysis are being disrupted by
falling costs for storage and CPU power, the continuing shift of
business processes online, and the deluge of data that is being
generated as a consequence.
Satellite images, SEC filings, supply chain data (RFID data streams),
online prices, and newsgroup content represent just a few of the data
sources that hold potential for predictive modeling of markets.
Much of this data does not fit within existing paradigms for business
analysis: either its size overwhelms traditional desktop tools such as
Excel, or else its unique dimensions (such as geocodes) prevent its
being pipelined into more powerful, but narrowly designed, analysis
tools. Finally, closed-source tools cannot keep pace with the leading
edge of innovation in statistical and machine-learning algorithms.
Enter the open source programming language R. R has been dubbed the
lingua franca for statistical computing and graphical analysis, with a
pedigree tracing back several decades at Bell Labs. Though its
million-plus users are concentrated within academia, R is gaining
currency within several high-profile quantitative analysis groups,
including Google’s Customer Insights team and Barclays Global
Investors. In addition, R’s extensibility via user-contributed
packages has spawned an active developer community.
In this session, I will focus on applying R’s powerful visualization
tools to guide the construction of predictive models, using the kind
of large, multidimensional data sets that increasingly confront
quantitative analysts. Along the way, I will highlight R’s packages
for inferential statistics, its compact modeling syntax, and its ease
of connectivity with persistent data stores.
The two specific examples I will discuss are:
- an analysis of NASA’s Landsat imagery of Brazil’s center-west
agricultural regions to detect correlates for soybean harvest yields,
and a derived predictor of the Brazilian soybean market based in part
on these correlates.
- a validation of Bill James’ sabermetrics approach to batting
performance using 30 years of Major League Baseball statistics, and a
derived predictor for batters’ salaries.
For all of its strengths, R has an admittedly steep learning curve.
While source code for the examples will be provided online, this talk
will emphasize techniques and working examples over technical details.
The goal of this session is to give quantitative analysts the courage
to invest in learning the R language, by showcasing R’s power,
highlighting its features, and providing examples of its use for
innovative applications.
John Henry and decision making
Cap’n said to John Henry,
You’ve got a willin’ mind.
But you just well lay yoh hammah down,
You’ll nevah beat this drill of mine,
You’ll nevah beat this drill of mine.

Today we take it for granted that a human is no match for a machine drill when it comes to digging through a mountain. At the time such songs were first sung, the verdict was dodgier; some believed that human ability would always be superior to machines. It seems that every time our technology advances, we must struggle a little to accept that our domain of superiority has narrowed.
Nowhere is this struggle more acute than in the contemplation and recent realization of thinking machines. Our discomfort with the idea that a machine that performs intellectual labor could ever be our equal dates at least to the roots of the industrial revolution and information revolution in the Enlightenment.
“…although such machines might execute many things with equal or perhaps greater perfection than any of us, they would, without doubt, fail in certain others from which it could be discovered that they did not act from knowledge, but solely from the disposition of their organs” –René Descartes, Discours de la méthode, 1637
When it comes to decision making, which often requires drilling through a mountain of data, we still manifest a deep skepticism of our machines. We call these creations black boxes, soulless as the machine drill, and place our faith in modern John Henrys who make decisions based on gut instinct. Human insight and judgment are still irreplaceable, but they are being increasingly augmented by thinking machines that can account for far more factors than we can in making a decision. And far from a black box, these machines are just our knowledge made scalable and executable.
Popular culture picks up on our need to differentiate ourselves; futuristic artificial intelligences are portrayed as, ultimately, too rigid to compete, unable to grasp the subtleties we humans use to drive our decision making. A scientific approach means the ability to perceive such subtleties should generate a testable hypothesis. In more and more areas, the quantitative approach is showing itself able to outperform raw human judgment, even in fairly subtle situations. Particularly famous is the work of the Oakland A’s in evaluating baseball players. Orley Ashenfelter’s remarkable success in predicting the quality of wines was another day the steam drill won.