Color: The Cinderella of dataviz
“Avoiding catastrophe becomes the first principle in bringing color to information: Above all, do no harm.” — Envisioning Information, Edward Tufte, Graphics Press, 1990
Color is one of the most abused and neglected tools in data visualization. It is abused when we make poor color choices; it is neglected when we rely on poor software defaults. Yet despite its historically poor treatment at the hands of engineers and end-users alike, if used wisely, color is unrivaled as a visualization tool.
Most of us think twice before walking outside in fluorescent red underoos. If only we were as cautious in choosing colors for infographics. The difference is that few of us design our own clothes. But until good palettes (like ColorBrewer) are commonplace, to get colors that fit our purposes, we must be our own tailors.
While obsessing about how to implement color in the Dataspora Labs PitchFX viewer, I began with a basic motivating question:
Read more
People who love scatter plots & connecting dots

We hosted the first Dataviz Salon SF on Tuesday night, with lightning talks by boredom cop Shane Booth, dataviz wiz Lee Byron, computational journalist Brad Stenger, data wrangler Pete Skomoroch, and any/all data enthusiast Brendan O’Connor.
I was going to blog all about it — but Tom Carden of Stamen Design already has a great write-up.
… Dataspora invited a few people to a Dataviz Salon yesterday evening. Mike and I went along and huddled in a brick-built basement in SoMa to listen to the following:
How do you measure a major league slugger?
I gave a talk last month at SAP Labs in Palo Alto, along with Jim Porzak of Responsys, introducing the R statistical language to a Business Intelligence interest group. The goal was to highlight how open source tools like R can be used to build predictive models. The example I gave centered on baseball and a simple question: how do you measure a baseball slugger?
Michael Lewis, in Moneyball, described how the baseball analyst Bill James was frustrated by the fact that major league hitters were consistently rated by their batting averages. James wrote:
Read more
Visualizing Tim Wakefield’s knuckleball
(May 2009 Update: The Pitch F/X viewer for all MLB pitchers is up and running.)
“Back in 1980, STATS Inc. … sent its own scorekeepers [to record] play-by-play information about the games that had never before been systematically collected: the pitch count at the end of at bats, pitch types and locations, the direction and distance of batted balls. They broke the field down into twenty-six wedges radiating out from home plate.”
– Michael Lewis, Moneyball, p. 84
A friend recently sent me a blog post (originally from Josh Kalk) visualizing the differences between two Red Sox pitchers’ styles using a data set, called MLB Extended Game Log, which catalogs over a dozen attributes of each pitch thrown. This got me wondering why baseball has attracted such interest from statisticians, and how this pitching data, in particular, might be better visualized.
Read more
