Coding for Journalists 101 : A four-part series


Photo by Nico Cavallotto on Flickr

Update, January 2012: Everything…yes, everything, is superseded by my free online book, The Bastards Book of Ruby, which is a much more complete walkthrough of basic programming principles with far more practical and up-to-date examples and projects than what you’ll find here.

I’m only keeping this old walkthrough up as a historical reference. I’m sure the code is so ugly that I’m not going to even try re-reading it.

So check it out: The Bastards Book of Ruby

-Dan

Update, Dec. 30, 2010: I published a series of data collection and cleaning guides for ProPublica, to describe what I did for our Dollars for Docs project. There is a guide for Pfizer which supersedes the one I originally posted here.

So a little while ago, I set out to write some tutorials that would guide the non-coding-but-computer-savvy journalist through enough programming fundamentals so that he/she could write a web scraper to collect data from public websites. A “little while” turned out to be more than a month and a half. I actually wrote most of it in a week and then forgot about it. The timeliness of the fourth lesson, which shows how to help Pfizer in its mission to be more transparent, compelled me to just publish them in incomplete form. There are probably inconsistencies in the writing and in some of the code examples, but the final code sections at the end of each tutorial do seem to execute as expected.

As the tutorials are aimed at people who aren’t experienced programmers, the code is pretty verbose, pedantic, and in some cases, a little inefficient. It was my attempt to make the code as readable as possible, and I very much welcome suggested edits.

DISCLAIMER: The code, data files, and results are meant for reference and example only. Use them at your own risk.

Follow me on Twitter: @dancow. If you're interested in learning programming, check out my online-book-in-progress, The Bastards Book of Ruby.


