DISCLAIMER: The code, data files, and results are meant for reference and example only. Use them at your own risk.
In particular, with lesson 3, I skipped basically any explanation of the code. I hope to get around to it later.
Going to Court
In the last lesson, we learned how to write a script that would record who was in jail at a given hour. This could yield some interesting stories for a crime reporter, including spates of arrests for notable crimes and inmates who are held with $1,000,000 bail for relatively minor crimes. However, an even more interesting angle would be to check the inmates’ prior records, to get a glimpse of the recidivism rate, for example.
Sacramento Superior Court allows users to search not just by name, but also by the unique ID number given to inmates by Sacramento-area jurisdictions. This makes it pretty easy to link current inmates to court records.
However, the techniques we used in past lessons to automate the data collection won’t work here. As you can see in the above picture, you have to fill out a form. That’s not something any of the code we’ve written previously will do. Luckily, that’s where Ruby’s mechanize comes in.
Ruby Mechanize
Go to the mechanize library homepage to learn how to install it as a Ruby gem. It requires that nokogiri is installed, which you should’ve done if you’ve made it this far into my tutorials.
There are some basic examples on the project page, but you’re going to have to read some of the technical documentation to learn some of mechanize’s commands.
Here’s a code example we’ll be using:
search_form['txtXref'] = '00112233'
result_page_form = search_form.submit
search_form refers to a mechanize Form object. In that HTML form is a text field with a name of ‘txtXref’. The bracket notation above sets that text field to the value ‘00112233’.
Then, using mechanize’s Form object’s submit method, we submit the form just as if we had clicked the “Submit” button on a webpage.
That’s the basic theory.
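To make the theory a little more concrete without touching the live site, here is a minimal sketch of what submitting a form typically amounts to: the field names and their values get sent to the server as URL-encoded name=value pairs. It uses only Ruby's standard uri library, not mechanize. The fields hash is a hypothetical stand-in for the mechanize Form object, and the real form carries more fields than the one shown.

```ruby
require 'uri'

# Hypothetical stand-in for the form's fields; the real mechanize Form
# object holds more fields than this one.
fields = { 'txtXref' => '' }

# Same idea as search_form['txtXref'] = '00112233'
fields['txtXref'] = '00112233'

# A submitted form typically travels as URL-encoded name=value pairs
body = URI.encode_www_form(fields)
puts body
```

Mechanize handles this encoding, plus the request and response, for you. The sketch is only meant to show why setting a field by its name is enough to fill out the form.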
The Code
Note: The following code works, if you have an inmates.txt file from the last lesson (use this one if you don’t; keep in mind that the last names and birthdates have been changed/redacted). However, it’s very rudimentary, with no error-checking at all. Still, it’ll give you a couple tab-delimited files that will list an inmate’s past charges and past sentences served, with XREF being the key that links those files to inmates.txt.
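Since the script has no error-checking, one easy place to add some is where it reads the XREF number from the eighth tab-separated column of each line in inmates.txt. The sketch below is one possible way to do that, not the method the script uses. The extract_xref helper and the sample lines are made up for illustration. It skips lines that are too short or that have no digits in that column, where the one-liner in the script would either raise an error or produce an empty string.

```ruby
# Pull the XREF number out of one tab-delimited line of inmates.txt.
# Returns nil when the line has no eighth column or no digits in it.
# (extract_xref is a hypothetical helper, not part of the lesson's script.)
def extract_xref(line)
  column = line.chomp.split("\t")[7]
  return nil if column.nil?
  match = column.match(/[0-9]+/)
  match && match[0]
end

# Made-up sample lines; real lines would come from inmates.txt
sample_lines = [
  "a\tb\tc\td\te\tf\tg\tX-00112233\th\n",
  "too\tshort\n",
  "a\tb\tc\td\te\tf\tg\tno digits\n"
]

# Drop the lines that yielded nothing, then de-duplicate
xrefs = sample_lines.map { |line| extract_xref(line) }.compact.uniq
puts xrefs.inspect
```

With a real file, you could swap sample_lines for File.readlines("inmates.txt").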
Remember that you’re accessing a live site here. This script pauses for 2 seconds after each access…there should be no reason to be more frequent about it.
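If you want the pause to live in one place, a small helper can wrap the loop. This is only a sketch: each_politely is a name I made up, the XREF values are placeholders, and the tiny delay is there so the example finishes quickly. For a live site you would pass a delay of a second or more.

```ruby
# Run the block once per item, pausing between items (not after the last one).
# delay is in seconds. (each_politely is a hypothetical helper name.)
def each_politely(items, delay)
  items.each_with_index do |item, index|
    sleep(delay) if index > 0
    yield item
  end
end

visited = []
# Placeholder XREF values; a real run would use the list read from inmates.txt
each_politely(%w[00112233 00445566], 0.01) do |xref|
  visited << xref   # a real script would fetch a page here
end
puts visited.inspect
```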
This tutorial will be updated in the future.
require 'rubygems'
require 'mechanize'

search_url = 'https://services.saccourt.com/indexsearchnew/CriminalSearchV2.aspx'

# open datafile and collect the unique XREF numbers from the eighth column
xrefs = File.readlines("inmates.txt").map{|x| x.split("\t")[7].match(/[0-9]+/).to_s}.uniq

a = Mechanize.new { |agent| agent.user_agent_alias = 'Mac Safari' }
search_page = a.get(search_url)
search_form = search_page.form_with(:name=>'frmCriminalSearch')

# show the fieldnames
search_form.fields.map {|f| f.name}
#=> ["__EVENTTARGET", "__EVENTARGUMENT", "__VIEWSTATE", "txtLastName", "txtFirstName", "txtDOB", "txtXref", "txtCaseNumber", "lstCaseType"]
search_form.buttons.map{|m| m.name}
# => ["btnFindByName", "btnFindByNumber"]

xrefs.each do |xref|
  puts "\nFinding info for xref: #{xref}"

  # fill in the XREF field, pick a case type, and submit the search
  search_form['txtXref'] = xref
  search_form.field_with(:name=>'lstCaseType').options[1].select
  result_page_form = search_form.submit.forms.first

  # every button except the first and the last opens a case file
  case_buttons = result_page_form.buttons[1..-2]
  puts "There are #{case_buttons.length} cases to check:"

  case_buttons.each do |cb|
    file_page = result_page_form.click_button(cb)
    file_page = file_page.parser   # the underlying nokogiri document

    charges_arr = []
    sentences_arr = []

    # the first row of each table is the header, so skip it
    charge_rows = file_page.css('#dgDispositionCharges tr')
    if charge_rows.length > 0
      puts "Charges: "
      charge_rows[1..-1].each do |cr|
        ctd = cr.css('td').map{|td| td.text}
        charges_arr << {:plea=>ctd[1], :charge=>ctd[2], :date=>ctd[4], :severity=>ctd[5]}
        # Hash#collect without a block returns an Enumerator in current Ruby,
        # which has no join method, so join the values instead
        puts "\t - #{charges_arr.last.values.join("\t")}"
      end
    end

    sentence_rows = file_page.css('#dgSentenceSummary tr')
    if sentence_rows.length > 0
      puts "Sentences: "
      sentence_rows[1..-1].each do |sr|
        sentences_arr << sr.css('td').map{|td| td.text}.join("\t")
        puts "\t - #{sentences_arr.last}"
      end
    end

    # append this case's rows to the output files, keyed by xref
    File.open("court_charges.txt", 'a+'){ |f|
      charges_arr.each do |c|
        f.puts("#{xref}\t#{c[:plea]}\t#{c[:charge]}\t#{c[:date]}\t#{c[:severity]}")
      end
    }
    File.open("sentences.txt", 'a+'){ |f|
      sentences_arr.each do |c|
        f.puts("#{xref}\t#{c}")
      end
    }
  end # done checking a case entry

  puts "Done with #{xref}, sleeping"
  sleep 2   # pause 2 seconds, matching the note above
end
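Once the script has run, XREF is what ties the output back to each inmate. Here is a rough sketch of grouping charges by XREF, assuming the column order that the script writes to court_charges.txt (xref, plea, charge, date, severity). The rows are made-up placeholders, not real court data.

```ruby
# Made-up rows in the layout written to court_charges.txt:
# xref, plea, charge, date, severity (tab-delimited)
charge_lines = [
  "00112233\tGuilty\tExample charge A\t01/02/2003\tFelony",
  "00112233\tNo contest\tExample charge B\t04/05/2006\tMisdemeanor",
  "00445566\tGuilty\tExample charge C\t07/08/2009\tMisdemeanor"
]

# Group the charges under their XREF; missing keys start as an empty list
charges_by_xref = Hash.new { |hash, key| hash[key] = [] }
charge_lines.each do |line|
  xref, plea, charge, date, severity = line.chomp.split("\t")
  charges_by_xref[xref] << { :plea => plea, :charge => charge, :date => date, :severity => severity }
end

charges_by_xref.each do |xref, charges|
  puts "#{xref}: #{charges.length} prior charge(s)"
end
```

To try it on real output, replace charge_lines with File.readlines("court_charges.txt"). The same lookup by XREF would work against the lines of inmates.txt.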