Google Refine 2.0 (formerly Gridworks) released; ProPublica’s “Dollars for Docs” featured.

Good news for data nerds everywhere: version 2.0 of Google’s fantastic data-cleaning tool, Google Refine (formerly Gridworks), has been released. And the Refine team was nice enough to feature ProPublica’s Dollars for Docs as an example use case. I talked briefly to BusinessJournalism.org about how I used Refine to put together the list of top pharma earners.

It’s possible I could’ve built that list using SQL queries and Ruby libraries. But I definitely would’ve missed a lot of matches, and probably overdosed on over-the-counter pharma-painkillers.

I'm a programmer journalist, currently teaching computational journalism at Stanford University. I'm trying to do my new blogging at blog.danwin.com.