<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>danwin.com &#187; mechanize</title>
	<atom:link href="https://danwin.com/tag/mechanize/feed/" rel="self" type="application/rss+xml" />
	<link>https://danwin.com</link>
	<description>Words, photos, and code by Dan Nguyen. The &#039;g&#039; is mostly silent.</description>
	<lastBuildDate>Thu, 21 Nov 2019 12:29:57 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.2.39</generator>
	<item>
		<title>Coding for Journalists 103: Who&#8217;s been in jail before: Cross-checking the jail log with the court system; Use Ruby&#8217;s mechanize to fill out a form</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-part-3-cross-checking-the-jail-log-with-the-court-system-use-rubys-mechanize-to-fill-out-a-form/</link>
		<comments>https://danwin.com/2010/04/coding-for-journalists-part-3-cross-checking-the-jail-log-with-the-court-system-use-rubys-mechanize-to-fill-out-a-form/#comments</comments>
		<pubDate>Tue, 06 Apr 2010 13:40:53 +0000</pubDate>
		<dc:creator><![CDATA[Dan Nguyen]]></dc:creator>
				<category><![CDATA[works]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[courts]]></category>
		<category><![CDATA[journalism]]></category>
		<category><![CDATA[mechanize]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">https://danwin.com/?p=584</guid>
		<description><![CDATA[<p>This is part of a four-part series on web-scraping for journalists. As of Apr. 5, 2010, it was published a bit incomplete because I wanted to post a timely solution to the recent Pfizer doctor payments list release, but the code at the bottom of each tutorial should execute properly. The code examples are [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://danwin.com/2010/04/coding-for-journalists-part-3-cross-checking-the-jail-log-with-the-court-system-use-rubys-mechanize-to-fill-out-a-form/">Coding for Journalists 103: Who&#8217;s been in jail before: Cross-checking the jail log with the court system; Use Ruby&#8217;s mechanize to fill out a form</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></description>
				<content:encoded><![CDATA[<div class='over-note' style='font-size: 12pt; color: #a44; border: 1px solid black; margin: 20px; padding: 20px;'>This is part of a <a href="https://danwin.com/works/coding-for-journalists-101-a-four-part-series/">four-part series on web-scraping for journalists</a>. As of <strong>Apr. 5, 2010</strong>, it was published a bit incomplete because I wanted to post a timely solution to the <a href="https://danwin.com/works/pfizer-web-scraping-for-journalists-part-4-pfizers-doctor-payments/">recent Pfizer doctor payments list release</a>, but the code at the bottom of each tutorial should execute properly. The code examples are meant for reference and I make no claims about the accuracy of the results. Contact <a href="mailto:dan@danwin.com">dan@danwin.com</a> if you have any questions, or leave a comment below.</p>
<p><strong>DISCLAIMER:</strong> <em>The code, data files, and results are meant for reference and example only. You use it at your own risk.</em></p>
<p><b>In particular, with lesson 3</b>, I skipped nearly all explanation of the code. I hope to get around to it later.</p>
</div>
<h2>Going to Court</h2>
<p>In the <a href="https://danwin.com/works/coding-for-journalists-101-a-four-part-series/">last lesson</a>, we learned how to write a script that would record who was in jail at a given hour. This could yield some interesting stories for a crime reporter, including spates of arrests for notable crimes and inmates who are held with $1,000,000 bail for relatively minor crimes. However, an even more interesting angle would be to check the inmates&#8217; prior records, to get a glimpse of the recidivism rate, for example.</p>
<p><a href="https://services.saccourt.com/indexsearchnew/CaseType.aspx">Sacramento Superior Court</a> allows users to search not just by name, but also by the unique ID number (the XREF) given to inmates by Sacramento-area jurisdictions. This makes it pretty easy to link current inmates to court records.</p>
<p><a href="https://danwin.com/words/wp-content/uploads/2010/04/small-court-page.gif"><img src="https://danwin.com/words/wp-content/uploads/2010/04/small-court-page.gif" alt="" title="small-court-page" width="500"  class="size-full wp-image-672" /></a><br />
</p>
<p>However, the techniques we used in past lessons to automate the data collection won&#8217;t work here. As you can see in the above picture, you have to fill out a form. That&#8217;s not something any of the code we&#8217;ve written previously will do. Luckily, that&#8217;s where Ruby&#8217;s <strong>mechanize</strong> comes in.</p>
<p><span id="more-584"></span></p>
<div class="code-doc">
<link rel='stylesheet' href='https://danwin.com/css/code.css' type='text/css' media='all' />
<div class='sec'>
<h2>Ruby Mechanize</h2>
<p>Go to the <a href="http://mechanize.rubyforge.org/mechanize/">mechanize library homepage</a> to learn how to install it as a Ruby gem. It requires that <a href="http://nokogiri.rubyforge.org/">nokogiri</a> be installed, which you should&#8217;ve done if you&#8217;ve made it this far into my tutorials.</p>
<p>There are some <a href="http://mechanize.rubyforge.org/mechanize/EXAMPLES_rdoc.html">basic examples on the project page</a>, but you&#8217;re going to have to read some of the technical documentation to learn some of mechanize&#8217;s commands.</p>
<p>Here&#8217;s a code example we&#8217;ll be using:</p>
<pre class="ruby" name="code">
search_form['txtXref']='00112233'
result_page_form = search_form.submit
</pre>
<p><b>search_form</b> refers to a mechanize Form object. In that HTML form is a text field with a name of &#8216;txtXref&#8217;. The bracket notation we used above sets that text field to the value &#8216;00112233&#8217;.</p>
<p>Then, using the Form object&#8217;s <b>submit</b> method, we submit the form just as if we had clicked the &#8220;Submit&#8221; button on a webpage. The method returns the page that the server sends back.</p>
<p>That&#8217;s the basic theory.</p>
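<p>To make &#8220;submitting a form&#8221; a little less magical, here is a rough sketch of what ends up being sent when a form is submitted: the field names and their values, encoded as name=value pairs. It uses only Ruby&#8217;s standard library and the field name and value from the example above. It is an illustration of the idea, not a description of how mechanize works internally.</p>

```ruby
require 'uri'

# Field name and value are taken from the example above; a real search
# form carries more fields than this one.
fields = { 'txtXref' => '00112233' }

# Form fields are encoded as name=value pairs before being sent to the server.
body = URI.encode_www_form(fields)
puts body
```

<p>A real form usually has several more fields, some of them hidden, which is why it&#8217;s easier to let mechanize read the form off the page and fill in only the field you care about.</p>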
</div>
<div class='sec'>
<h2>The Code</h2>
<p>Note: The following code works if you have an inmates.txt file from the last lesson (<a href="https://danwin.com/static/jail-list/inmates.txt">use this one if you don&#8217;t</a>; keep in mind that the last names and birthdates have been changed/redacted). However, it&#8217;s very rudimentary, with no error-checking at all. Still, it&#8217;ll give you a couple of tab-delimited files, court_charges.txt and sentences.txt, that list an inmate&#8217;s past charges and past sentences served, with XREF being the key that links those files to inmates.txt.</p>
<p>Remember that you&#8217;re accessing a live site here. This script pauses for 2 seconds after each inmate lookup. There should be no reason to hit the site more frequently than that.</p>
<p>This tutorial will be updated in the future.</p>
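<p>The first thing the script does is pull the XREF out of each line of inmates.txt. Here is a small sketch of that step on its own. The sample line is made up for illustration. The only things taken from the script are that the file is tab-delimited and that the XREF sits in the eighth column (index 7).</p>

```ruby
# A made-up line in the general shape of inmates.txt: tab-delimited, with
# the XREF somewhere in the eighth column (index 7). The other values are
# placeholders, not real data.
line = "a\tb\tc\td\te\tf\tg\tXREF: 00112233\th\n"

columns = line.split("\t")

# Grab the first run of digits in that column (nil if there isn't one)
xref = columns[7].to_s[/[0-9]+/]
puts xref
```

<p>If your own inmates.txt keeps the XREF in a different column, the index is the only thing you should need to change.</p>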
<pre name="code" class="ruby">
require 'rubygems'
require 'mechanize'
search_url='https://services.saccourt.com/indexsearchnew/CriminalSearchV2.aspx'

# open datafile: the XREF is the run of digits in the eighth tab-delimited column
# (lines without one are skipped)
xrefs = File.readlines("inmates.txt").map{|x| x.split("\t")[7].to_s[/[0-9]+/]}.compact.uniq

a = Mechanize.new { |agent|
  agent.user_agent_alias = 'Mac Safari'
}

search_page = a.get(search_url) 
search_form = search_page.form_with(:name=>'frmCriminalSearch')

#show the fieldnames
search_form.fields.map {|f| f.name}
#=> ["__EVENTTARGET", "__EVENTARGUMENT", "__VIEWSTATE", "txtLastName", "txtFirstName", "txtDOB", "txtXref", "txtCaseNumber", "lstCaseType"]

search_form.buttons.map{|m| m.name}
# => ["btnFindByName", "btnFindByNumber"]


xrefs.each do |xref|
  puts "\nFinding info for xref: #{xref}"
  search_form['txtXref']=xref
  search_form.field_with(:name=>'lstCaseType').options[1].select
  result_page_form = search_form.submit.forms.first
  case_buttons = result_page_form.buttons[1..-2]

  puts "There are #{case_buttons.length} cases to check:"
  case_buttons.each do |cb|
    file_page = result_page_form.click_button(cb)
    file_page = file_page.parser
  
    charges_arr = []
    sentences_arr =[]
    charge_rows = file_page.css('#dgDispositionCharges tr')
  
    if charge_rows.length > 0
      puts "Charges: "
      # start at index 1 to skip the first row (presumably the table header)
      charge_rows[1..-1].each do |cr|
        ctd = cr.css('td').map{|td| td.text}
        charges_arr << {:plea=>ctd[1], :charge=>ctd[2], :date=>ctd[4], :severity=>ctd[5]}
        puts "\t - #{charges_arr.last.values.join("\t")}"
      end
    end
  
    sentence_rows = file_page.css('#dgSentenceSummary tr')
  
    if sentence_rows.length > 0
      puts "Sentences: "
      sentence_rows[1..-1].each do |sr|
        sentences_arr << sr.css('td').map{|td| td.text}.join("\t")
        puts "\t - #{sentences_arr.last}"
      end
    end
    
    
    File.open("court_charges.txt",'a+'){ |f|

      charges_arr.each do |c|
        f.puts("#{xref}\t#{c[:plea]}\t#{c[:charge]}\t#{c[:date]}\t#{c[:severity]}")
      end
    }

    File.open("sentences.txt", 'a+'){ |f| 
      sentences_arr.each do |c|
        f.puts("#{xref}\t#{c}")
      end
    }
    
    
    
  
  end #done checking a case entry
  
  puts "Done with #{xref}, sleeping"
  sleep 2 # this is a live site, so pause before the next lookup
  
  
end  

 

 
 
</pre>
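<p>Once the script has run, XREF is what ties the output back to inmates.txt. As a rough sketch of one way to use it, the snippet below counts how many charge rows each XREF has. The sample rows are made-up placeholders in the same column order the script writes (xref, plea, charge, date, severity), and <b>charges_per_xref</b> is just a name I picked for this example.</p>

```ruby
# Count the charge rows for each XREF in tab-delimited lines shaped like
# the ones written to court_charges.txt.
def charges_per_xref(lines)
  counts = Hash.new(0)
  lines.each do |line|
    xref = line.chomp.split("\t").first
    next if xref.nil? || xref.empty?
    counts[xref] += 1
  end
  counts
end

# Placeholder rows, only to show the shape of the data
sample = [
  "00112233\tplea1\tcharge1\tdate1\tseverity1",
  "00112233\tplea2\tcharge2\tdate2\tseverity2",
  "00445566\tplea3\tcharge3\tdate3\tseverity3"
]

charges_per_xref(sample).each do |xref, n|
  puts "#{xref}\t#{n}"
end

# With the real file:
#   charges_per_xref(File.readlines("court_charges.txt"))
```

<p>An XREF that never shows up in the file had no charge rows written for it, which could mean no prior cases or simply that the lookup didn&#8217;t return what the script expected. Remember, there&#8217;s no error-checking.</p>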
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>https://danwin.com/2010/04/coding-for-journalists-part-3-cross-checking-the-jail-log-with-the-court-system-use-rubys-mechanize-to-fill-out-a-form/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
