Tag Archives: ruby on rails

The Big Pharma-Dollars-for-Doctors Database, at ProPublica

Haven’t had much time to blog, or eat, or sleep in the past few months because of this project, but the first part just rolled out today (at about 2am, actually): at ProPublica, my colleagues and I collected the past two years of reports (albeit from just 7 drug companies) disclosing what those companies pay doctors to speak on their behalf. I still have a few posts and articles to write about the undertaking and its background, but it’s the first time that someone has compiled all these reports and made them available to the public, something that will be mandated by law in 2013.

Our first investigation related to the data looked at how some of the companies’ top earners, who are ostensibly supposed to be experts in their field, had either shady backgrounds or slim expertise. I did most of the data work, including collecting and managing the data, polling the various state websites to look up physician disciplinary records, and designing and coding (with the help of my genius coder co-worker Jeff Larson) the website. Whew!

Check it out.

ProPublica tracks the bailout, a year or so later

Today, my ProPublica colleague Paul Kiel and I put out some graphical revisions to PP’s bank bailout tracking site, including our master list of companies that got taxpayer bailout money:

Graphic: The Status of the Bailout

Bailout List Page

Nothing fancy; we mostly made the numbers easier to find and compare. The site itself has been far from fancy since its inception, as it was my first project after taking a crash course on Ruby on Rails. Back when the bailout was first announced in Q4 2008, the Treasury declined to name the banks it was doling out taxpayer money to, for fear that non-listed banks would take a hit in reputation. Paul was one of the first few people to comb through banks’ press releases and enter the recipients into a spreadsheet. His list of the first 26, put into a simple HTML table, was a pretty big hit.

As the list grew into the dozens and then hundreds, it became more cumbersome to maintain the static list, which was nothing more than each bank’s name, date of announcement, and amount of bailout. Plus, it was no longer just one bailout per company; Citigroup and Bank of America were beneficiaries of billions of dollars through a couple of other programs.

So, I proposed a bailout site that would allow Paul to record the data at a more granular level. Up to that point, for example, most online lists showed that AIG had tens of billions of dollars committed to it, but not the various programs, reasons, and dates on which those allocations were made. A little anal maybe, but it gave the site the flexibility to adapt when the bailout grew to include all varieties of disbursements, including to auto parts manufacturers and mortgage servicers, as well as the money flowing in the opposite direction, in the form of refunds and dividends.
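
To make the idea of line-item records concrete, here is a rough sketch in plain Ruby. It is not the site's actual schema: the field names, the `kind` values, and every figure below are assumptions made up for illustration. The point is only that per-company totals and per-program breakdowns both fall out of the same rows once each allocation and repayment is stored separately.

```ruby
require "date"

# Hypothetical ledger: one row per allocation or repayment, instead of one
# total per company. Companies, programs, and amounts are placeholders,
# not real Treasury figures.
ledger = [
  { company: "Example Bank", program: "Program A", date: Date.new(2008, 10, 28), kind: :disbursement, amount: 100 },
  { company: "Example Bank", program: "Program B", date: Date.new(2009, 1, 16), kind: :disbursement, amount: 50 },
  { company: "Example Bank", program: "Program A", date: Date.new(2009, 6, 17), kind: :refund, amount: 30 },
  { company: "Example Bank", program: "Program A", date: Date.new(2009, 8, 15), kind: :dividend, amount: 5 },
  { company: "Example Parts Co", program: "Program C", date: Date.new(2009, 4, 9), kind: :disbursement, amount: 20 }
]

# Net flow per company: disbursements count as money out, and every other
# kind (refunds, dividends) counts as money coming back.
def net_flow_by_company(ledger)
  ledger.each_with_object(Hash.new(0)) do |item, totals|
    sign = item[:kind] == :disbursement ? 1 : -1
    totals[item[:company]] += sign * item[:amount]
  end
end

# Breakdown of one company's disbursements by program, the detail that a
# single "total committed" number hides.
def disbursed_by_program(ledger, company)
  ledger
    .select { |item| item[:company] == company && item[:kind] == :disbursement }
    .group_by { |item| item[:program] }
    .transform_values { |items| items.sum { |item| item[:amount] } }
end

p net_flow_by_company(ledger)
p disbursed_by_program(ledger, "Example Bank")
```

Lumping dividends in with refunds as "money back" is a simplification; a real tracker would likely want to report repaid principal and dividend income separately.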

I saw the site as more of a place for Paul to base his bailout coverage on (he’s been doing an excellent job covering the progress of the mortgage modification program), as I assumed that in the near future, Treasury would have its own easy-to-use site for the data. Unfortunately, that is not quite the case, nearly a year and a half later. Besides some questionable UI decisions (such as having the front-facing page consist of a Flash map), the data is not presented in an easily accessible format. It could be that I need to take an Excel refresher course here, but trying to sort the columns in these Excel spreadsheets just to find the biggest bailout amount, for example, throws an error.

Only in the past couple of months did Treasury finally release dividend data in non-PDF form, and even then, it’s still a pain to work with (there’s no way, for example, to link the bank names in the dividends sheet to the master spreadsheet of bailouts). I would’ve thought that’d be the set of bailout data Treasury would be most eager to send out, because it’s the taxpayers’ return on investment. But, as it turns out, there is a glass-half-empty reading of this data (such as banks not having enough reserves to pay dividends in a timely fashion), one that would’ve been immediately obvious if the data were in a more sortable form.

ProPublica’s bailout tracking site doesn’t have much data other than the official Treasury bailout numbers; there are all kinds of other, unofficial numbers, such as how much each bank is giving out in bonuses, that people are more interested in. American University has gathered all kinds of financial health indicators for each bailout bank, too. There’s definitely much more data that PP and other bailout trackers need to collect to provide a bigger picture of the bailout situation. But for now, I guess it’s a small victory to be one of the top starting points for finding out exactly where hundreds of billions of our tax dollars went. And the why, too; Paul’s done a great job writing translations of the Treasury’s official-speak on each program.

ProPublica’s Eye on the Bailout

Bad Nurses, and Our Tragic Inability to Track Them

Get rich in the temp nursing business

On Sunday, my ProPublica colleagues Tracy Weber and Charles Ornstein, in conjunction with the Los Angeles Times, put out a story examining the lack of standards among temp nursing agencies, a dangerous situation considering California’s desperate shortage of nursing staff.

Emboldened by a chronic nursing shortage and scant regulation, the firms vie for their share of a free-wheeling, $4-billion industry. Some have become havens for nurses who hopscotch from place to place to avoid the consequences of their misconduct. (see related story: A ‘Crazy’ Way for an Industry to Operate)

A joint investigation with the Los Angeles Times found dozens of instances in which staffing agencies skimped on background checks or ignored warnings from hospitals about sub-par nurses on their payrolls. Some hired nurses sight unseen, without even conducting an interview.

The gist of the problem: California lacks virtually any kind of tracking of errant temp nurses. This nurse, for example, was accused of stealing drugs from at least six hospitals, suffered a drug-induced seizure on the job, and had his Minnesota nursing license suspended before California got around to filing an accusation against him. Two years later, after a few more reported incidents of drug theft, the California registered nursing board finally revoked his license when he didn’t make his hearing on time.

Charlie and Tracy have been covering this story since before they joined ProPublica; LATimers Maloy Moore and Doug Smith contributed a massive amount of the essential research and data analysis. This temp nurses chapter is just another consequence of what appears to be awful record-keeping and sloth by the various oversight bodies.

My own contribution to the coverage was small, the most notable aspect of which was this Ruby on Rails site I built to catalogue the sanctioned nurses, a relatively minor task compared to actually collecting and parsing the data (i.e. reading through all the PDF files for the buried information). It was pretty simple, allowing users to see at a glance the numbers of disciplined nurses by various categories, including year and type of discipline. I was a little skeptical of doing it at first, just because the CA nursing board does have a searchable and functional database of its own.
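
For a sense of how little is involved in the "numbers by category" view, here is a minimal sketch in plain Ruby rather than the Rails code behind the site. The record fields (`nurse_id`, `date_discipline`, `discipline_type`) and the sample rows are hypothetical, chosen only to show the grouping.

```ruby
require "date"

# Hypothetical disciplinary records; not drawn from the actual database.
actions = [
  { nurse_id: 1, date_discipline: Date.new(2007, 3, 14), discipline_type: "revocation" },
  { nurse_id: 2, date_discipline: Date.new(2007, 9, 2), discipline_type: "probation" },
  { nurse_id: 3, date_discipline: Date.new(2008, 1, 20), discipline_type: "revocation" }
]

# Count records by whatever key the block extracts (year, type, and so on).
def count_by(records, &key)
  records.group_by(&key).transform_values(&:size)
end

by_year = count_by(actions) { |action| action[:date_discipline].year }
by_type = count_by(actions) { |action| action[:discipline_type] }

p by_year
p by_type
```

Passing the grouping key as a block means one helper covers every category the page needs to break out.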

Theoretically (well, if it weren’t the case that the records themselves are often incomplete, so that criminal nurses come up with a clean sheet), any member of the public could look up their own nurses’ records and avoid the bad ones. But the meat of Charlie’s and Tracy’s story is the numbers: 1,254 days on average to discipline a nurse (compared to 173 for Texas). 1,706 days before one nurse, who was kicked out of a drug-recovery program and considered a threat to public safety, had even an accusation filed against her. Our site makes it evident that hard numbers, not just heartbreaking anecdotes, argue against California’s regulatory status quo.

A screenshot from our sanctioned nurses database

The reporters on this story put in months of time manually tabulating the data to come up with the thrust of their stories. Sadly, all of these numbers and statistical conclusions were probably right under the nursing board’s nose. The regulators apparently track dates and types of accusations and disciplines for each nurse. A few simple database queries would’ve quickly uncovered the glaring delays and bottlenecks in the system (e.g. SELECT AVG(TO_DAYS(`date_discipline`) - TO_DAYS(`date_initial_complaint`)) AS average_delay FROM `disciplinary_actions`).
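
The same average-delay calculation can be sketched in plain Ruby. The rows below are made up, and the hash keys simply borrow the column names from the query; nothing here reflects the board's real schema. Rows missing either date are skipped, which mirrors how SQL's AVG ignores NULL values.

```ruby
require "date"

# Hypothetical rows standing in for a `disciplinary_actions` table.
actions = [
  { date_initial_complaint: Date.new(2005, 1, 1), date_discipline: Date.new(2005, 1, 11) }, # 10 days
  { date_initial_complaint: Date.new(2006, 3, 1), date_discipline: Date.new(2006, 3, 31) }, # 30 days
  { date_initial_complaint: Date.new(2007, 6, 1), date_discipline: nil }                    # still pending
]

# Average number of days between complaint and discipline.
# Returns nil when no row has both dates, as AVG would return NULL.
def average_delay_days(actions)
  closed = actions.select { |a| a[:date_initial_complaint] && a[:date_discipline] }
  return nil if closed.empty?

  # Subtracting two Date objects gives the difference in days as a Rational.
  total_days = closed.sum { |a| (a[:date_discipline] - a[:date_initial_complaint]).to_i }
  total_days.to_f / closed.size
end

puts average_delay_days(actions)
```

Swapping the mean for a sorted list of the longest delays would surface the individual worst cases rather than the overall lag.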

A day after Charlie and Tracy’s initial story in July 2009, Gov. Schwarzenegger sacked a majority of the registered nursing board, and new regulations include making public the restrictions on a nurse’s license. Read ProPublica’s complete coverage on California’s flawed oversight of health-care workers here.