<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>danwin.com &#187; crime</title>
	<atom:link href="https://danwin.com/tag/crime/feed/" rel="self" type="application/rss+xml" />
	<link>https://danwin.com</link>
	<description>Words, photos, and code by Dan Nguyen. The &#039;g&#039; is mostly silent.</description>
	<lastBuildDate>Thu, 21 Nov 2019 12:29:57 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.2.39</generator>
	<item>
		<title>Yu Yao&#8217;s killer sentenced</title>
		<link>https://danwin.com/2011/04/yu-yaos-killer-sentenced/</link>
		<comments>https://danwin.com/2011/04/yu-yaos-killer-sentenced/#comments</comments>
		<pubDate>Thu, 28 Apr 2011 14:38:08 +0000</pubDate>
		<dc:creator><![CDATA[Dan Nguyen]]></dc:creator>
				<category><![CDATA[thoughts]]></category>
		<category><![CDATA[crime]]></category>
		<category><![CDATA[Flushing]]></category>
		<category><![CDATA[Queens]]></category>
		<category><![CDATA[Yu Yao]]></category>

		<guid isPermaLink="false">https://danwin.com/?p=1652</guid>
		<description><![CDATA[<p>Last May, I blogged about a murder case that caught my eye: Yu Yao, a 23-year-old Chinese immigrant, was raped and fatally beaten as she walked home from the grocery store in Flushing, Queens. Even in the statistically safe streets of New York, shocking crimes happen on a regular basis but the circumstances behind Yu&#8217;s [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://danwin.com/2011/04/yu-yaos-killer-sentenced/">Yu Yao&#8217;s killer sentenced</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p><a href="https://danwin.com/2011/04/yu-yaos-killer-sentenced/screen-shot-2011-04-28-at-10-18-54-am-2/" rel="attachment wp-att-1653"><img src="https://danwin.com/words/wp-content/uploads/2011/04/Screen-shot-2011-04-28-at-10.18.54-AM.png" alt="" title="Screen shot 2011-04-28 at 10.18.54 AM" width="455" height="300" class="aligncenter size-full wp-image-1653" /></a></p>
<p>Last May, I <a href="https://danwin.com/2010/05/who-was-yu-yaoyau/">blogged about a murder case that caught my eye</a>: Yu Yao, a 23-year-old Chinese immigrant, was raped and fatally beaten as she walked home from the grocery store in Flushing, Queens. Even in the statistically safe streets of New York, shocking crimes happen on a regular basis, but the circumstances behind Yu&#8217;s death seemed especially tragic and senseless. Yesterday, Yu&#8217;s killer, 29-year-old Carlos Salazar Cruz, was sentenced to the maximum prison term of 22-years-to-life after pleading guilty to second-degree murder. Despite the resolution, neither the senselessness nor the tragedy of her death has lessened. <a href="https://danwin.com/2010/05/who-was-yu-yaoyau/">I&#8217;ve updated my original post</a>.</p>
<p>The post <a rel="nofollow" href="https://danwin.com/2011/04/yu-yaos-killer-sentenced/">Yu Yao&#8217;s killer sentenced</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://danwin.com/2011/04/yu-yaos-killer-sentenced/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is Solitary Confinement Torture? From Atul Gawande and the New Yorker</title>
		<link>https://danwin.com/2011/02/is-solitary-confinement-torture-from-atul-gawande-and-the-new-yorker/</link>
		<comments>https://danwin.com/2011/02/is-solitary-confinement-torture-from-atul-gawande-and-the-new-yorker/#comments</comments>
		<pubDate>Tue, 01 Feb 2011 15:30:38 +0000</pubDate>
		<dc:creator><![CDATA[Dan Nguyen]]></dc:creator>
				<category><![CDATA[thoughts]]></category>
		<category><![CDATA[Atul Gawande]]></category>
		<category><![CDATA[crime]]></category>
		<category><![CDATA[prisons]]></category>
		<category><![CDATA[the new yorker]]></category>

		<guid isPermaLink="false">https://danwin.com/?p=1543</guid>
		<description><![CDATA[<p>Thanks to longform.org for spotlighting another thought-provoking piece by Dr. Atul Gawande in the New Yorker. The tag line is: Hellhole: The United States holds tens of thousands of inmates in long-term solitary confinement. Is this torture? Dr. Gawande&#8217;s reporting builds a strong case for &#8220;Yes.&#8221; Some interesting bullet points: America holds at least 25,000 [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://danwin.com/2011/02/is-solitary-confinement-torture-from-atul-gawande-and-the-new-yorker/">Is Solitary Confinement Torture? From Atul Gawande and the New Yorker</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></description>
				<content:encoded><![CDATA[<div id="attachment_1547" style="width: 510px" class="wp-caption aligncenter"><a href="https://danwin.com/2011/02/is-solitary-confinement-torture-from-atul-gawande-and-the-new-yorker/800px-v-m-_doroshevich-sakhalin-_part_i-_punishment_cells/" rel="attachment wp-att-1547"><img src="https://danwin.com/words/wp-content/uploads/2011/02/800px-V.M._Doroshevich-Sakhalin._Part_I._Punishment_Cells-500x371.png" alt="Punishment Cells" title="800px-V.M._Doroshevich-Sakhalin._Part_I._Punishment_Cells" width="500" height="371" class="size-medium wp-image-1547" /></a><p class="wp-caption-text">Punishment Cells. From: Page 257 of part II of Vlas Mikhailovich Doroshevich Â«Sakhalin (Katorga)Â», Moscow. Sytin publisher, 1905.</p></div>
<p>Thanks to <a href="http://longform.org">longform.org</a> for spotlighting another thought-provoking piece by <a href="https://danwin.com/2010/07/letting-go-the-new-yorkers-atul-gawande-on-giving-up-life-to-live/">Dr. Atul Gawande</a> in the New Yorker. The tag line is: <a href="http://www.newyorker.com/reporting/2009/03/30/090330fa_fact_gawande">Hellhole: The United States holds tens of thousands of inmates in long-term solitary confinement. Is this torture?</a></p>
<p>Dr. Gawande&#8217;s reporting builds a strong case for &#8220;Yes.&#8221; Some interesting bullet points:</p>
<ul>
<li>America holds at least 25,000 inmates in solitary confinement in <a href="http://en.wikipedia.org/wiki/Supermax">Supermax prisons</a>
</li>
<li>More than a <strong>century</strong> ago, the U.S. Supreme Court <a href="http://supreme.justia.com/us/134/160/case.html">considered banning solitary confinement</a>
</li>
<li>A 2003 analysis of Arizona, Illinois, and Minnesota found that levels of inmate-on-inmate violence were unchanged after their supermax prisons opened
</li>
<li>The state of <strong>Maine</strong> has more inmates in long-term solitary than does all of <strong>England</strong>
</li>
</ul>
<p>Supermax prisons and the long-term isolation of large numbers of inmates, Dr. Gawande notes, are only decades-old concepts in the American prison system. However, <a href="http://supreme.justia.com/us/134/160/case.html#167">in the 1890 SCOTUS case, Medley vs. U.S</a>., the court takes note of a solitary confinement system in <a href="http://supreme.justia.com/us/134/160/case.html#167">Philadelphia back in 1787</a>. The conditions and consequences, noted more than two centuries ago, aren&#8217;t much different from what Dr. Gawande describes today:</p>
<blockquote><p>The peculiarities of this system were the complete isolation of the prisoner from all human society, and his confinement in a cell of considerable size, so arranged that he had no direct intercourse with or sight of any human being and no employment or instruction&#8230;.</p>
<p>A considerable number of the prisoners fell, after even a short confinement, into a semi-fatuous condition, from which it was next to impossible to arouse them, and others became violently insane; others still committed suicide, while those who stood the ordeal better were not generally reformed, and in most cases did not recover sufficient mental activity to be of any subsequent service to the community. </p>
<p>It became evident that some changes must be made in the system, and the separate system was originated by the Philadelphia Society for Ameliorating the Miseries of Public Prisons, founded in 1787.
</p></blockquote>
<p>Following the standard journalistic narrative, Dr. Gawande leads with his best anecdote and ends with his second-best. The entire piece is a must read, but the last anecdote is particularly astonishing. Gawande describes the case of Robert Felton, who spent 14 years of his 36 years on earth in solitary confinement. The isolation drove him crazy, Gawande writes, and Felton tried so many times to set his cell on fire with a lightbulb that &#8220;the walls of his cell were black with soot.&#8221;</p>
<p>Gawande writes about one of his last meetings with Felton. Felton had just found out that the prison director who kept him in solitary confinement had been convicted of taking bribes from lobbyists (a side story that would probably illuminate why America holds on to certain prison strategies regardless of effect) and sentenced to two years in prison:</p>
<blockquote><p>
&#8220;Two years in prison,&#8221; Felton marvelled. &#8220;He could end up right where I used to be.&#8221;</p>
<p>I asked him, &#8220;If he wrote to you, asking if you would release him from solitary, what would you do?&#8221;</p>
<p>Felton didn&#8217;t hesitate for a second. &#8220;If he wrote to me to let him out, I&#8217;d let him out,&#8221; he said.</p>
<p>This surprised me. I expected anger, vindictiveness, a desire for retribution. &#8220;You&#8217;d let him out?&#8221; I said.</p>
<p>&#8220;I&#8217;d let him out,&#8221; he said, and he put his fork down to make the point. &#8220;I wouldn&#8217;t wish solitary confinement on anybody. Not even him.&#8221; </p></blockquote>
<p><a href="http://www.newyorker.com/reporting/2009/03/30/090330fa_fact_gawande">Read Dr. Gawande&#8217;s story in the New Yorker.</a></p>
<p>The post <a rel="nofollow" href="https://danwin.com/2011/02/is-solitary-confinement-torture-from-atul-gawande-and-the-new-yorker/">Is Solitary Confinement Torture? From Atul Gawande and the New Yorker</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://danwin.com/2011/02/is-solitary-confinement-torture-from-atul-gawande-and-the-new-yorker/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NYPD&#8217;s Feris Jones is too old for this shit</title>
		<link>https://danwin.com/2010/10/nypds-feris-jones-is-too-old-for-this-shit/</link>
		<comments>https://danwin.com/2010/10/nypds-feris-jones-is-too-old-for-this-shit/#comments</comments>
		<pubDate>Tue, 26 Oct 2010 18:21:35 +0000</pubDate>
		<dc:creator><![CDATA[Dan Nguyen]]></dc:creator>
				<category><![CDATA[thoughts]]></category>
		<category><![CDATA[cops]]></category>
		<category><![CDATA[crime]]></category>
		<category><![CDATA[Feris Jones]]></category>
		<category><![CDATA[NYPD]]></category>

		<guid isPermaLink="false">https://danwin.com/?p=1385</guid>
		<description><![CDATA[<p>&#8220;Lethal Weapon&#8221; reference inspired by @andymboyle From the New York Times: Off-duty NYPD officer Feris Jones was at Sabine&#8217;s Hallway, a beauty salon in Bed-Stuy, Brooklyn, when a robber came in, brandishing a gun, and ordered her and the other patrons into a bathroom. Jones told the owner to call the bathroom before stepping out [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://danwin.com/2010/10/nypds-feris-jones-is-too-old-for-this-shit/">NYPD&#8217;s Feris Jones is too old for this shit</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p><div id="attachment_1386" style="width: 200px" class="wp-caption alignright"><a href="https://danwin.com/thoughts/nypds-feris-jones-is-too-old-for-this-shit/attachment/salon1-articleinline/" rel="attachment wp-att-1386"><img src="https://danwin.com/words/wp-content/uploads/2010/10/salon1-articleInline.jpg" alt="" title="salon1-articleInline" width="190" height="227" class="size-full wp-image-1386" /></a><p class="wp-caption-text">NYPD officer Feris Jones</p></div><br />
<em><a href="http://www.imdb.com/title/tt0093409/quotes">&#8220;Lethal Weapon&#8221; reference</a> inspired by <a href="http://twitter.com/#!/andymboyle/status/28797638924">@andymboyle</a></em></p>
<p><a href="http://www.nytimes.com/2010/10/26/nyregion/26salon.html?src=me&#038;ref=nyregion">From the New York Times</a>: Off-duty NYPD officer Feris Jones was at Sabine&#8217;s Hallway, a beauty salon in Bed-Stuy, Brooklyn, when a robber came in, brandishing a gun, and ordered her and the other patrons into a bathroom. Jones told the owner to call the bathroom before stepping out and ordering the robber to surrender. The robber, from 12 feet away, fired four shots at Jones with his .44 Magnum revolver. Jones dodged the bullets and returned fire with her own revolver five times, hitting the robber&#8217;s hands and causing him to drop his gun <em>and</em> hitting the front door lock, jamming it and slowing down the robber&#8217;s escape.</p>
<p>Five-time prior arrestee Winston Cox, 19, was apprehended at a Bed-Stuy hotel on Monday, his hands wrapped in towels.</p>
<p>Jones has been with the department since 1990 and worked in evidence collection for the past 12 years. The Oct. 23 attempted stickup was the first time she had fired her gun in the line of duty. Not only should her marksmanship be commended, but so should her restraint and levelheadedness in not shooting the fleeing robber in the back (he apparently had to crawl out through a glass panel), even though moments before he had nearly killed her. <a href="http://www.nypost.com/p/news/local/brooklyn/salon_perp_mom_to_cop_hair_hair_xnUz9YMKLdVH5Ot5fEmLWK">No wonder the suspect&#8217;s mother is proud of Jones</a>.</p>
<p>Sometimes, life <em>is</em> like a cop movie.</p>
<p>The post <a rel="nofollow" href="https://danwin.com/2010/10/nypds-feris-jones-is-too-old-for-this-shit/">NYPD&#8217;s Feris Jones is too old for this shit</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://danwin.com/2010/10/nypds-feris-jones-is-too-old-for-this-shit/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Calm under pressure, New Orleans</title>
		<link>https://danwin.com/2010/07/calm-under-pressure-new-orleans/</link>
		<comments>https://danwin.com/2010/07/calm-under-pressure-new-orleans/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 13:07:36 +0000</pubDate>
		<dc:creator><![CDATA[Dan Nguyen]]></dc:creator>
				<category><![CDATA[thoughts]]></category>
		<category><![CDATA[crime]]></category>
		<category><![CDATA[Danziger Bridge]]></category>
		<category><![CDATA[new orleans]]></category>
		<category><![CDATA[police]]></category>

		<guid isPermaLink="false">https://danwin.com/?p=1064</guid>
		<description><![CDATA[<p>Contrasting testimony in the New York Times&#8217; story on the Danziger Bridge shootings: &#8220;The federal government has clearly forgotten or chosen to ignore the circumstances police officers were working under and clearly chose not to factor in any of those circumstances when they decided to charge them with an intentional act of murder,&#8221; [lawyer] Mr. [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://danwin.com/2010/07/calm-under-pressure-new-orleans/">Calm under pressure, New Orleans</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.nytimes.com/2010/07/14/us/14justice.html?_r=2&#038;hp">Contrasting testimony in the New York Times&#8217; story on the <strong>Danziger Bridge</strong> shootings</a>: </p>
<blockquote><p>&#8220;The federal government has clearly forgotten or chosen to ignore the circumstances police officers were working under and clearly chose not to factor in any of those circumstances when they decided to charge them with an intentional act of murder,&#8221; [lawyer] Mr. Eric Hessler said in an interview.</p></blockquote>
<blockquote><p>
Some officers then traveled to the other side of the bridge and found two brothers, Ronald and Lance Madison, who were on their way to check on a dentist&#8217;s office that belonged to their oldest brother, Dr. Romell Madison. <strong>According to the indictment, Mr. Faulcon then shot Ronald Madison to death with a shotgun. Afterward, it continues, Sergeant Bowen kicked and stomped on Mr. Madison as he lay dying on the ground</strong>.</p></blockquote>
<p>The post <a rel="nofollow" href="https://danwin.com/2010/07/calm-under-pressure-new-orleans/">Calm under pressure, New Orleans</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://danwin.com/2010/07/calm-under-pressure-new-orleans/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Who was Yu Yao? Rape-and-homicide case in downtown Flushing (updated)</title>
		<link>https://danwin.com/2010/05/who-was-yu-yaoyau/</link>
		<comments>https://danwin.com/2010/05/who-was-yu-yaoyau/#comments</comments>
		<pubDate>Tue, 25 May 2010 03:20:50 +0000</pubDate>
		<dc:creator><![CDATA[Dan Nguyen]]></dc:creator>
				<category><![CDATA[thoughts]]></category>
		<category><![CDATA[crime]]></category>
		<category><![CDATA[new york]]></category>
		<category><![CDATA[Yu Yao]]></category>
		<category><![CDATA[Yu Yau]]></category>

		<guid isPermaLink="false">https://danwin.com/?p=789</guid>
		<description><![CDATA[<p>Update 4/28/2011: &#8211; Carlos Salazar Cruz, 29, is sentenced to the maximum term of 22-years to life for confessing to the second-degree murder of Yu Yao. Update 6/1/2010: The accused killer appears in court. Update 5/28/2010: Chinese residents patrol the neighborhood I know that nothing in a city as big as New York should shock [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://danwin.com/2010/05/who-was-yu-yaoyau/">Who was Yu Yao? Rape-and-homicide case in downtown Flushing (updated)</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></description>
				<content:encoded><![CDATA[<div id="attachment_1648" style="width: 465px" class="wp-caption aligncenter"><a href="http://newyork.cbslocal.com/2011/04/27/mother-faces-man-convicted-of-brutally-raping-killing-daughter/"><img src="https://danwin.com/words/wp-content/uploads/2010/05/Screen-shot-2011-04-28-at-10.18.54-AM.png" alt="" title="Screen shot 2011-04-28 at 10.18.54 AM" width="455" height="300" class="size-full wp-image-1648" /></a><p class="wp-caption-text">Yu Yao (CBS NY)</p></div>
<p><strong>Update 4/28/2011:</strong> &#8211; <a href="#u4282011">Carlos Salazar Cruz, 29, is sentenced to the maximum term of 22-years to life for confessing to the second-degree murder of Yu Yao.</a></p>
<p><strong>Update 6/1/2010:</strong> <a href="#u601">The accused killer appears in court.</a><br />
<strong>Update 5/28/2010:</strong> <a href="#u528">Chinese residents patrol the neighborhood</a></p>
<p><img src="https://danwin.com/words/wp-content/uploads/2010/05/yu-yao.jpg" alt="" title="yu-yao" width="147" height="204" class="alignleft size-full wp-image-975" /><br />
I know that nothing in a city as big as New York should shock me, even during a period of record-low violent crime.</p>
<p>But the <a href="http://www.nypost.com/p/news/local/queens/beaten_raped_qns_gal_dies_6cHBpZH5RlWFFpdM0j99NL">rape and murder</a> of  <strong>Yu Yao</strong> (also spelled Yau, in some reports), a 23-year-old woman who came over from China just two months ago, was enough to snap me from my normal Monday night routine.</p>
<p>Maybe it was the <em>that-could-be-me</em> element: It was only 9 p.m. on Sunday, May 16 when Yu was attacked, while walking back from the grocery on a relatively busy street in Flushing.</p>
<div id="attachment_790" style="width: 509px" class="wp-caption aligncenter"><a href="https://danwin.com/words/wp-content/uploads/2010/05/Screen-shot-2010-05-24-at-11.08.36-PM.png"><img src="https://danwin.com/words/wp-content/uploads/2010/05/Screen-shot-2010-05-24-at-11.08.36-PM-499x340.png" alt="This is where Yao was attacked: 133-23 41st Road in downtown Flushing" title="This is where Yao was attacked: 133-23 41st Road in downtown Flushing" width="499" height="340" class="size-medium wp-image-790" /></a><p class="wp-caption-text">This is where Yao was attacked: 133-23 41st Road in downtown Flushing</p></div>
<p>Maybe it&#8217;s the pure brutality of the killing: Her attacker smashed her face with a pipe, then dragged her into an alley to beat and rape her. (<a href="http://www.myfoxny.com/dpp/news/local_news/queens/video-shows-queens-rape-suspect-20100520">Some reports say that a surveillance tape</a> shows several passersby apparently ignoring the attack.) Yao was in a coma for a week before life support was pulled on May 22.</p>
<p>Maybe it&#8217;s the hard-working immigrant angle. She reportedly moved here just 2 months ago on a student visa to live with a distant uncle, <a href="http://www.nydailynews.com/news/ny_crime/2010/05/20/2010-05-20_raped__left_for_dead_thug_bashes_chinese_woman_with_pipe_assaults_her_in_qns__co.html?r=news&#038;utm_source=feedburner&#038;utm_medium=feed&#038;utm_campaign=Feed:+nydnrss/news+(News)&#038;utm_content=Google+Reader">worked in a nail salon, and hoped to become a lawyer</a>. After the attack, authorities frantically tried to find her parents <a href="http://queens.ny1.com/content/top_stories/119167/vigil-held-for-victim-of-fatal-beating--rape">half the world away to tell them what happened</a>. And, presumably, to know whether or not they wanted to keep Yu on life support. As a 23-year-old Chinese citizen, she may have been their only child.</p>
<p>I know Yu&#8217;s is just one death out of the hundreds of murders in New York annually. But the news editor in me suggests that this would&#8217;ve gotten more coverage if it had been a young American woman who was raped and left brain-dead on a Sunday summer night. Not out of bias, necessarily, but the cultural gulf and language barrier probably make this story too difficult to cover in a 24-hour news cycle.</p>
<p>I came to New York under easy circumstances, with a good job and good friends waiting. So I admire anyone who can take the risk of moving to this busy, beautiful but uncaring city, especially from a foreign country. It&#8217;s common to fail and leave here because of the expense or the noise or the cold. But to die like that, so cruelly in an alley?</p>
<p><a href="https://danwin.com/words/wp-content/uploads/2010/05/20100520queens_tmb0001_20100520184052_320_240.jpg"><img src="https://danwin.com/words/wp-content/uploads/2010/05/20100520queens_tmb0001_20100520184052_320_240.jpg" alt="" title="20100520queens_tmb0001_20100520184052_320_240" width="320" height="240" class="alignleft size-full wp-image-792" /></a><strong>Carlos Salazar Cruz</strong>, the 28-year-old alleged murderer, was also an immigrant. He moved here two years ago from Mexico and worked at a fish market, <a href="http://www.nydailynews.com/news/ny_crime/2010/05/20/2010-05-20_raped__left_for_dead_thug_bashes_chinese_woman_with_pipe_assaults_her_in_qns__co.html?r=news&#038;utm_source=feedburner&#038;utm_medium=feed&#038;utm_campaign=Feed:+nydnrss/news+(News)&#038;utm_content=Google+Reader">according to the Daily News</a>. His sister, contacted by the Daily News, said of Cruz: &#8220;He never acted violently&#8230;.We just don&#8217;t know why he would do this. We can&#8217;t explain it.&#8221;</p>
<p>As someone who covered crime for a short time, I always wondered if I&#8217;d become completely desensitized to crime reports. And in New York, enough happens that even a crime like this is just a blurb in the papers for a week (also in the local news today, <a href="http://www.nytimes.com/2010/05/25/nyregion/25newark.html?ref=nyregion">a murder conviction in a triple-slaying</a> at a Newark schoolyard involving guns, machetes, and rape. It was a nationwide story in 2007, but I don&#8217;t remember it). I don&#8217;t know whether to feel better that yes, I can still be shocked. Or to be depressed that there is just no upper limit to horror and tragedy, even when the victim is a complete stranger.</p>
<p><strong><a name='u528'></a>Update (5/28):</strong> China&#8217;s state English-language paper <a href="http://english.peopledaily.com.cn/90001/90776/90883/7002768.htmll">has a piece on the community activism following Yu&#8217;s death</a>. It touches on the long-held perception that Asians won&#8217;t fight back:</p>
<blockquote><p>
A week after the rape, several Chinese residents in Flushing teamed up to patrol the neighborhood each weekday night. The team has since expanded to almost 40 members, one-fifth of whom are women, said Zhu Lichuang, president of the New York Chinese Associations Alliance. Zhu started the watch and is one of its volunteers.</p>
<p>&#8220;They (the criminals) choose this place because they think Chinese are usually obedient, like carrying cash and prefer to keep silent about incidents,&#8221; he said. &#8220;So we need to take some actions to show these people that they are wrong.&#8221;</p>
<p>Earlier this week, Yu Guihua, Yao&#8217;s mother, arrived at Newark airport from Heilongjiang province to the grim news of her daughter&#8217;s death. Yao&#8217;s father, who is in poor health back in China, has not still been informed.</p>
<p>&#8220;My child, you&#8217;re so well-behaved, why did you have such a fate,&#8221; Yu cried out. &#8220;My daughter was very pretty, why did he beat her like that?&#8221;</p>
<p>The New York State Assembly&#8217;s Grace Meng said several pedestrians witnessed the attack but walked away.</p>
<p>Having lived in Flushing for 23 years, Zhu said the rape case is the &#8220;most astonishing&#8221; crime he&#8217;s heard about in this neighborhood. &#8220;It&#8217;s not a premeditated crime, which however adds to its seriousness,&#8221; he said. &#8220;It exposes the problems we have had here for a long time &#8211; We Chinese are not unified enough, nor do we care enough about each other.&#8221;
</p></blockquote>
<div id="attachment_815" style="width: 460px" class="wp-caption aligncenter"><a href="https://danwin.com/words/wp-content/uploads/2010/05/P201005280812212923718804.jpg"><img src="https://danwin.com/words/wp-content/uploads/2010/05/P201005280812212923718804.jpg" alt="" title="Yu Guihua" width="450" height="282" class="size-full wp-image-815" /></a><p class="wp-caption-text">Yu Guihua (right), mother of murder victim Yao Yu, grieves on her way from Newark airport to the hospital where her daughter's body is.  (Tan Lixian for China Daily)</p></div>
<p><strong><a name='u601'></a>Update (6/1):</strong> Carlos Salazar Cruz made his first appearance in Queens Supreme Court on June 1, and pandemonium broke loose. Guihua Yu, Yao&#8217;s mother, tried to attack Cruz in court, according to the <a href="http://www.nydailynews.com/news/ny_crime/2010/06/01/2010-06-01_mother_of_yu_yao_chinese_immigrant_killed_in_pipe_attack_charges_accused_killer_.html">Daily News</a>. Cruz also said in a jailhouse interview with the NYDN that he was too drunk to remember the incident:</p>
<blockquote><p>&#8220;I want to kill that man!,&#8221; Guihua Yu wailed repeatedly in her native Mandarin. &#8220;I want my daughter back!&#8221;</p>
<p>Guihua, 55, tried to pull away by grabbing at a courthouse bench as state court officers moved in.</p>
<p>Later, she was wheeled out of the courthouse on a stretcher and taken by ambulance to a nearby hospital.</p>
<p>During the brief morning hearing in Queens Supreme Court, prosecutors upgraded charges against Carlos Salazar Cruz to second-degree murder for the May 16 attack on Yu Yao, 23.</p>
<p>&#8230;</p>
<p>In a jailhouse interview with the Daily News, Cruz claimed he&#8217;d been drinking for two days and can&#8217;t remember the attack.</p>
<p>&#8220;I never wanted to hurt her,&#8221; he said. &#8220;I never even met her.&#8221;</p></blockquote>
<div id="attachment_791" style="width: 510px" class="wp-caption aligncenter"><a href="https://danwin.com/words/wp-content/uploads/2010/05/Screen-shot-2010-05-24-at-10.48.49-PM.png"><img src="https://danwin.com/words/wp-content/uploads/2010/05/Screen-shot-2010-05-24-at-10.48.49-PM-500x264.png" alt="Yao&#039;s Family at Newark Airport" title="Yao&#039;s Family at Newark Airport (NY1)" width="500" height="264" class="size-medium wp-image-791" /></a><p class="wp-caption-text">Yao's Family at Newark Airport</p></div>
<p><a name="u4282011"></a>Update: <strong>4/28/2011</strong> Nearly a year later, Yu Yao&#8217;s killer received his punishment. Carlos Salazar Cruz, now 29, received the <a href="http://online.wsj.com/article/AP3052c142daa948c5a092f94d30b4a0d1.html">maximum sentence of 22-years-to-life in prison</a> after agreeing to plead guilty to second-degree murder. Even though Yu Yao&#8217;s murder was one of too many terrible crimes in this past year, her story has received significant attention, both then and today. <a href="http://www.nydailynews.com/news/ny_crime/2011/04/27/2011-04-27_mother_confronts_killer_who_brutally_murdered_her_daughter_with_a_metal_pipe_in_.html">Much of the coverage has focused on the dramatic confrontation between Cruz and Yu Yao&#8217;s mother</a>, who had to be granted a special visa both to receive her daughter&#8217;s body last year and, now, to attend Cruz&#8217;s sentencing. The pure senselessness of the murder has not abated with the resolution, however. Cruz, both at the time of his arrest and during his sentencing, professed an inability to understand his actions that night and blamed them on drug and alcohol use.</p>
<div id="attachment_1650" style="width: 498px" class="wp-caption aligncenter"><a href="http://newyork.cbslocal.com/2011/04/27/mother-faces-man-convicted-of-brutally-raping-killing-daughter/"><img src="https://danwin.com/words/wp-content/uploads/2010/05/Screen-shot-2011-04-28-at-10.19.26-AM.png" alt="" title="Screen shot 2011-04-28 at 10.19.26 AM" width="488" height="285" class="size-full wp-image-1650" /></a><p class="wp-caption-text">Guihua Yu breaks down while confronting her daughter&#039;s killer</p></div>
<p><a href="http://www.nydailynews.com/news/ny_crime/2011/04/27/2011-04-27_mother_confronts_killer_who_brutally_murdered_her_daughter_with_a_metal_pipe_in_.html">From the Daily News:</a></p>
<blockquote><p>
&#8220;Yao was our sun, our hope, our dreams, our future and our strength,&#8221; Guihua Yu told Carlos Salazar Cruz, who sat at the defense table with his head bowed.</p>
<p>&#8220;You beast!&#8221; she shouted during the nearly 45-minute tongue-lashing.</p>
<p>&#8220;I just wanted to be able to hold her and see her. What I saw was a corpse, a dead body,&#8221; she mourned.</p>
<p>&#8220;You have destroyed our lives,&#8221; Yu wailed. &#8220;Come back my daughter! My only child. I have lost my child. My child, my child.&#8221;</p>
<p>&#8230;</p>
<p>Cruz, another immigrant pursuing the American Dream, blamed his troubles on the alcohol and crack cocaine that buoyed him during his forced separation from his wife and child in Mexico.</p>
<p>&#8220;I ask for forgiveness,&#8221; Cruz said through a Spanish language translator. &#8220;God our Lord knows that I am completely repentant for my sins.&#8221;
 </p></blockquote>
<p>The post <a rel="nofollow" href="https://danwin.com/2010/05/who-was-yu-yaoyau/">Who was Yu Yao? Rape-and-homicide case in downtown Flushing (updated)</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://danwin.com/2010/05/who-was-yu-yaoyau/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Coding for Journalists 102: Who&#8217;s in Jail Now: Collecting info from a county jail site</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-102-collecting-info-from-a-county-jail-site/</link>
		<comments>https://danwin.com/2010/04/coding-for-journalists-102-collecting-info-from-a-county-jail-site/#comments</comments>
		<pubDate>Tue, 06 Apr 2010 13:30:51 +0000</pubDate>
		<dc:creator><![CDATA[Dan Nguyen]]></dc:creator>
				<category><![CDATA[works]]></category>
		<category><![CDATA[crime]]></category>
		<category><![CDATA[jail]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[web scraping]]></category>

		<guid isPermaLink="false">https://danwin.com/?p=485</guid>
		<description><![CDATA[<p>This is part 2 of a 4-part series in introductory coding for journalists. Go here for the first lesson. This lesson and code will still be verbose, but will have a lot less hand-holding than the previous one. This is part of a four-part series on web-scraping for journalists. As of Apr. 5, 2010, it [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://danwin.com/2010/04/coding-for-journalists-102-collecting-info-from-a-county-jail-site/">Coding for Journalists 102: Who&#8217;s in Jail Now: Collecting info from a county jail site</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>This is <a href="https://danwin.com/works/coding-for-journalists-101-a-four-part-series/">part 2 of a 4-part series</a> in introductory coding for journalists. <a href="https://danwin.com/works/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/">Go here for the first lesson</a>. This lesson and code will still be verbose, but will have a lot less hand-holding than the previous one.</p>
<p><span id="more-485"></span></p>
<link rel='stylesheet' href='https://danwin.com/css/code.css' type='text/css' media='all' />
<div class="code-doc">
<div class='over-note' style='font-size: 12pt; color: #a44; border: 1px solid black; margin: 20px; padding: 20px;'>This is part of a <a href="https://danwin.com/works/coding-for-journalists-101-a-four-part-series/">four-part series on web-scraping for journalists</a>. As of <strong>Apr. 5, 2010</strong>, it was published a bit incomplete because I wanted to post a timely solution to the <a href="https://danwin.com/works/pfizer-web-scraping-for-journalists-part-4-pfizers-doctor-payments/">recent Pfizer doctor payments list release</a>, but the code at the bottom of each tutorial should execute properly. The code examples are meant for reference and I make no claims to the accuracy of the results. Contact <a href="mailto:dan@danwin.com">dan@danwin.com</a> if you have any questions, or leave a comment below.</p>
<p><strong>DISCLAIMER:</strong> <em>The code, data files, and results are meant for reference and example only. You use it at your own risk.</em></p>
</div>
<p><b>A note about privacy</b>: This tutorial uses files that I archived from a real-world jail website. Though booking records are public record, I make no claims about the legal proceedings involving the inmates who happened to be in jail when I took my snapshot. For all I know, they could have all been wrongfully arrested and therefore don&#8217;t deserve to have their name attached in online perpetuity to erroneous charges (even if the site only purports to record who was arrested and when, and not any legal conclusions). For that reason, I&#8217;ve redacted the last names of the inmates and randomized their birthdates.</p>
<div class='sec'>
<h2>The Cops Reporter and the Log</h2>
<p>If you&#8217;re a daily cops reporter, calling the police station to ask for the list of last night&#8217;s arrests is probably part of your job. Many papers have some kind of cops blotter where arrested suspects are listed, and both online and in print, it is usually one of a paper&#8217;s top features. The St. Petersburg Times has a modern version of the feature, <a href="http://mugshots.tampabay.com/">complete with mugshots and stats summaries</a>.</p>
<p>Arrest logs have sometimes been criticized for being little more than voyeurism (<a href="http://www.poynter.org/column.asp?id=101&#038;aid=161525">here&#8217;s a discussion over the St. Pete&#8217;s mugshot site</a>). But knowing who your law officers are arresting, and why, is essential to a nice, free society (and for a fair and efficient police force). And the more data you have as a reporter, the better you&#8217;ll be able to cover your beat.</p>
<p>Most pro-active police departments will announce when they&#8217;ve made high-profile arrests. But relying on the police to tell you what the most noteworthy arrests are kind of begs the question, and doesn&#8217;t give the whole picture of arrest activity. Most states consider arrest logs to be public information (not that that <a href="http://www.arundelmuckraker.com/storyview.asp?storyID=59">stops some jurisdictions from hiding them</a>). But a paper list or a PDF is hard to analyze. Luckily, some police departments are putting their work on the Web. They might be willing to send you a spreadsheet of arrest activity, but what if you wanted up-to-the-hour information, so that you could be aware of:</p>
<ol>
<li>Suspected crimes that fall between egregious and infamous (non-fatal assaults, robberies, car jackings, etc.)</li>
<li>An abnormally large number of arrests at a given time</li>
<li>Unusual types of suspected crimes at a given time</li>
</ol>
</div>
<div class='sec'>
<p>This is where the web-scraping you learned in my last tutorial gets useful. You&#8217;re going to have an automated way of collecting the latest arrests news, in an ordered fashion (so that you could, for example, find the inmate with the largest bail at a given time), and you&#8217;ll save yourself and your friendly police PIO tedious paper shuffling and typing.</p>
<p>I&#8217;m going to base my lesson on <a href="http://www.sacsheriff.com/inmate_information/">this sheriff department&#8217;s jail system</a>. I&#8217;ve mirrored a snapshot of their site <a href="https://danwin.com/static/jail-list/current_listing.cfm.html">here</a> (zip file <a href="https://danwin.com/static/jail-list/jail-list.zip">here</a>), so I recommend you run your scripts on my mirror (root directory: <a href="https://danwin.com/static/jail-list/current_listing.cfm.html">https://danwin.com/static/jail-list/</a>) before doing a real-world test. </p>
<p>The jail web site has these characteristics:</p>
<ul>
<li>At this page is a list of every person booked in the last 24 hours</li>
<li>The list typically has 100 to 200 inmates at a time</li>
<li>Most entries in that list contain a link to an inmate&#8217;s page containing data including name, DOB, bail, charges, booking time.</li>
<li>Each inmate has a unique identifying number called X-REF</li>
<li>Not all entries have a link; inmates who have been released have only their names listed</li>
</ul>
<p>The site is pretty useful and user-friendly. However, it&#8217;s hard to quickly glean any useful information from the main list. You have to click through each individual entry to find out why someone was jailed. <strong>The purpose of the following lesson is to automate that process so you can efficiently get the big picture of a jail&#8217;s activity.</strong></p>
<p>Program flow will go something like this:</p>
<ol>
<li><a href="#t_file_io">Create two text files</a>: one to store the list of inmates (inmates.txt), one to store the list of charges (charges.txt)</li>
<li><a href='#t_open_list'>Open the inmate listing page</a></li>
<li>Collect each list entry</li>
<li>If list entry is not a link (i.e. inmate has been released)</li>
<ol>
<li><a href="#t_ifnotlinkfetch">Fetch first name, middle name, last name, intake time and release date</a></li>
</ol>
<li>Else, if the list entry is a link, open it</li>
<ol>
<li>Fetch first name, middle name, last name, xref, intake time, and DOB of an inmate</li>
<li>Fetch and parse list of charges</li>
<li>Fetch the bail amount</li>
</ol>
<li>In an each loop, for each inmate entry we collected above:	</li>
<ol>
<li> Output inmate information, in tab-delimited format, into <strong>inmates.txt</strong>, including the XREF.</li>
<li> Output the charges associated with the inmate into <strong>charges.txt</strong>. Each charge will take up one line, and the XREF of the inmate will also be included so as to provide a key to the associated inmate.</li>
</ol>
</ol>
<h3><a name="t_file_io"></a>File I/O</h3>
<p>We didn&#8217;t cover opening and writing to an external text file in the last lesson. So here&#8217;s how it goes briefly: Using Ruby&#8217;s <a href="http://ruby-doc.org/core/classes/IO.html">IO class</a>, we&#8217;re going to create two files, inmates.txt and charges.txt, and write to them what we find on the jail&#8217;s website. We&#8217;ll be using the variables <b>inmates_file</b> and <b>charges_file</b> to refer to the external files. </p>
<p>To open the files and set the variables, use the <b><a href="http://ruby-doc.org/core/classes/IO.html#M002238">new</a></b> method of File (a subclass of IO), which takes two parameters: a string designating the file name, and a string designating the mode, which in this case will be &#8220;a&#8221;: write-only, appending to the end of the file (read about the <a href="http://ruby-doc.org/core/classes/IO.html">various modes here</a>).</p>
<pre name="code" class="ruby">
inmates_file = File.new('inmates.txt', 'a')
charges_file = File.new('charges.txt', 'a')
</pre>
<p>If these files don&#8217;t already exist, they will now. If they did, the &#8216;a&#8217; mode will append new content to the end of the file.</p>
<p>To write something to the file, use the <b>puts</b> method, which writes whatever string you supply to it as one line in the file (we&#8217;ve used this method without the IO class, in which case it outputs to the screen):</p>
<pre name="code" class="ruby">
charges_file.puts("Adding a new line of text to the charges file.")
</pre>
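<p>If you want to convince yourself that &#8216;a&#8217; mode really does append, here&#8217;s a minimal sketch. It writes to a scratch file in a temporary directory (the file name <b>charges_demo.txt</b> and the text are made up for the demo) so that it doesn&#8217;t touch your real inmates.txt or charges.txt.</p>
<pre name="code" class="ruby">
require 'tmpdir'

# Hypothetical demo: write to a scratch file in a temporary directory
# so no real inmates.txt or charges.txt is touched.
demo_lines = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'charges_demo.txt')

  # First run: the file doesn't exist yet, so 'a' mode creates it
  first_run = File.new(path, 'a')
  first_run.puts('line written on the first run')
  first_run.close   # closing flushes the text to disk

  # Second run: 'a' mode keeps the old content and adds to the end
  second_run = File.new(path, 'a')
  second_run.puts('line written on the second run')
  second_run.close

  demo_lines = File.readlines(path)
end

puts demo_lines.length   # number of lines now in the scratch file
</pre>
<p>Calling <b>close</b> when you&#8217;re done is a good habit, since text may otherwise sit in a buffer until the script exits.</p>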
<p>While we&#8217;re setting up, let&#8217;s create an array of hashes, with each hash object holding an inmate and his/her information. We don&#8217;t have to do this&#8230;we could just output to the file each inmate record as we get to it, but this will allow us some flexibility later. All we have to do is initialize the array:</p>
<pre name="code" class="ruby">
inmates_array = []
</pre>
<h3><a name="t_open_list"></a>Open the inmate listing page</h3>
<p>Now let&#8217;s fetch the inmates listing. We&#8217;ll be using Nokogiri in the same fashion we did in the <a href="https://danwin.com/thoughts/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#topic_nokogiri">last lesson</a>, beginning by requiring the nokogiri and open-uri libraries, then using the Open-URI&#8217;s <b><a href="http://www.ruby-doc.org/stdlib/libdoc/open-uri/rdoc/">open</a></b> method to fetch the page, and then Nokogiri&#8217;s <a href="http://nokogiri.rubyforge.org/nokogiri/Nokogiri/HTML/Document.html">HTML class</a> to wrap up the page in a parsable format.</p>
<pre name="code" class="ruby">
require 'rubygems'
require 'nokogiri'
require 'open-uri'
		
base_url='https://danwin.com/static/jail-list/' # all links on the list will be relative to this address		
inmate_listing = Nokogiri::HTML(URI.open("#{base_url}current_listing.cfm.html")) # recent Rubies (3.0+) need URI.open to fetch a URL
</pre>
<div class="note">A reminder. The construct <b>#{something_here}</b>, when put inside a double-quoted string, will treat <b>something_here</b> as an actual value of the variable <b>something_here</b>, not just the string. This is called <em>string interpolation</em>. The two following expressions, the latter using interpolation, are equivalent, though the latter will not throw an error if string2 happens to not be a String.</p>
<p>	a_combined_string = "Hello " + string2<br />
	a_combined_string = "Hello #{string2}"</p>
<p>Read more about Ruby&#8217;s <a href="http://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#Interpolation">string interpolation here</a>.
</div>
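<p>To see the difference between the two forms for yourself, here&#8217;s a small sketch in which <b>string2</b> is deliberately a number (the value 42 is made up). Interpolation converts it quietly, while plain concatenation raises a TypeError.</p>
<pre name="code" class="ruby">
string2 = 42   # not a String, on purpose
error_name = nil

# Interpolation calls to_s on the value for us
interpolated = "Hello #{string2}"

# Concatenation with + insists on a String and raises TypeError otherwise
begin
  concatenated = "Hello " + string2
rescue TypeError => e
  concatenated = nil
  error_name = e.class.name
end

puts interpolated
puts error_name
</pre>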
<p>Let&#8217;s visit the page with a browser and examine the structure. The list is an HTML table, with each row containing several columns, the first column being the inmate&#8217;s full name and, if the inmate hasn&#8217;t been released, a link to his/her booking page.</p>
<p>If you inspect the HTML closely, you&#8217;ll see that this page is composed of several tables. What we want is the table contained inside the &lt;td&gt; element with a class of &#8220;content.&#8221;</p>
<p>So we&#8217;ll collect all the table rows, using Nokogiri&#8217;s xpath method, and iterate through them using an each loop. We&#8217;re going to use a variation of an each loop called <b>each_index</b>, which provides the numerical index of the current iteration we&#8217;re on.</p>
<p><a name="t_ifnotlinkfetch"></a></p>
<pre name="code" class="ruby">
	inmate_rows = inmate_listing.xpath("//td[@class='content']/table")[0].xpath(".//tr").to_a[1..-1]
</pre>
<p>	The XPath syntax here is looking for a td element with class=&#8217;content&#8217;, then the table inside of that. There&#8217;s more than one, but the first one on the page has the data. From that, we gather all the rows (<b>tr</b>) within it. We call <b>to_a</b> to convert the result into an array, since Nokogiri&#8217;s xpath method returns a <em>NodeSet</em>, which won&#8217;t have the <b>each_index</b> method; the [1..-1] then skips the table&#8217;s first row. (Calling collect without a block only did this conversion on very old versions of Ruby.) <strong>each_index</strong> loops through an array, just like each, but it provides the index of the current iteration.</p>
<pre name="code" class="ruby">
	inmate_rows.each_index do |i|
		inmate_row = inmate_rows[i]
		inmates_array[i] = {}
		inmate = inmates_array[i]

		# each row has a set of columns with the inmate info
		list_columns = inmate_row.xpath('./td')
</pre>
<p>Because we know we&#8217;re on the ith row, we can also initialize the ith index in inmates_array as a hash to store the ith inmate&#8217;s information. Remember that each element in the inmates_array is going to be a hash of information.</p>
<p>Let&#8217;s use the variable named inmate as a shorthand way to refer to this position in the inmates_array. Each time we iterate through the loop, <strong>inmate</strong> will refer to the next spot in the inmates_array.</p>
<p>This is easier to type out 10 times than inmates_array[i].</p>
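<p>The shorthand works because <strong>inmate</strong> and inmates_array[i] are two names for the same hash, not two copies. Here&#8217;s a minimal sketch using plain strings as stand-ins for the table rows (the <b>sample_rows</b> array and the keys are made up for the demo):</p>
<pre name="code" class="ruby">
inmates_array = []
sample_rows = ['row for inmate A', 'row for inmate B']   # stand-ins for table rows

sample_rows.each_index do |i|
  inmates_array[i] = {}        # a new, empty hash in the ith slot
  inmate = inmates_array[i]    # a second name for that very same hash

  # Writing through the shorthand also changes inmates_array[i]
  inmate['row_text'] = sample_rows[i]
  inmate['position'] = i
end

puts inmates_array.inspect
</pre>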
<p>Before we get to visiting the individual inmate pages, let&#8217;s just collect the name and other information readily available here.</p>
<p>Each name consists of a String in this format: <em>last_name</em>, <em>first_name</em> <em>middle_name</em></p>
<p>So let&#8217;s use the String split method. First, split the string by the comma; this gives us an array whose first element is what&#8217;s on the left side of the comma (the last name). Splitting the second element of that array by a space gives us <em>another</em> array, consisting of a first name and any middle names.</p>
<pre name="code" class="ruby">
		# remember that you need to call Nokogiri's content method to get the text, as a String, between a tag
		the_inmate_name = list_columns[0].content.gsub(/\302\240/, ' ').strip.split(',')
		given_names = the_inmate_name[1].split(' ')		# everything after the comma, split on spaces

		inmate['last_name'] = the_inmate_name[0]		# the name before the comma
		inmate['first_name'] = given_names[0]			# the name after the comma, but before the next space
		inmate['middle_name'] = given_names[1..-1].join(' ') if given_names.length > 1
</pre>
<p>I&#8217;m going to be using this method call after each use of <b>content</b>: gsub(/\302\240/, ' ').strip </p>
<p>Not all entries have a middle name. So we use the <em>if <strong>given_names</strong>.length > 1</em> conditional statement to tell Ruby to skip this line if nothing follows the first name.</p>
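<p>The name-splitting step can also be wrapped in a small method so you can try it on a few strings before pointing it at the real list. This is only a sketch: <b>parse_inmate_name</b> is a hypothetical helper that isn&#8217;t part of the tutorial script, and the names are invented.</p>
<pre name="code" class="ruby">
# parse_inmate_name is a hypothetical helper, not part of the tutorial script.
# It expects the listing's "LAST, FIRST MIDDLE" layout.
def parse_inmate_name(raw_name)
  last_name, rest = raw_name.strip.split(',', 2)
  given_names = rest.to_s.split(' ')   # [] when nothing follows the comma

  {
    'last_name'   => last_name.to_s.strip,
    'first_name'  => given_names[0],
    # everything after the first name, or nil if there is no middle name
    'middle_name' => given_names.length > 1 ? given_names[1..-1].join(' ') : nil
  }
end

with_middle    = parse_inmate_name('DOE, JOHN QUINCY ADAMS')
without_middle = parse_inmate_name('ROE, JANE')

puts with_middle.inspect
puts without_middle.inspect
</pre>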
<pre name="code" class="ruby">
		
		# Moving on to the next table cell, which will be the 1 spot in list_columns
		inmate['sex'] = list_columns[1].content
		
		
		# next cell, DOB
		inmate['dob'] = list_columns[2].content
			
		# next cell, booking time
		inmate['intake_time'] = list_columns[3].content
		
	
		
		
		# let's go back to the first column to see if it contained a link
		if list_columns[0].xpath('./a').length == 0  # if there was no link, there would be 0 links returned
			
			# No link to visit, so this must have been a released inmate. Let's grab his/her release date 
			# which comes in the pattern "Released mm/dd/yyyy"...so we'll split the string and capture the second term

			inmate['release_date'] = list_columns[4].content.gsub(/\302\240/, ' ').split(' ')[1]
			
		else
		
			# visit link
			# we'll get to this subroutine in the next section
			
			
		end
	end
</pre>
<div class='note'>
	I call <b>gsub</b> to cleanse the strings of data. This particular website uses <strong>&amp;nbsp;</strong> (non-breaking space) where you&#8217;d expect a space character, and Nokogiri returns it as a character distinct from a normal space, so <strong>strip</strong> doesn&#8217;t work as intended. So this method call is used frequently:<br />
	.gsub(/\302\240/, ' ')</p>
<p>	Read more about this from <a href="http://www.vitarara.org/cms/hpricot_to_nokogiri_day_1">Vita Rara</a>
</div>
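<p>Here&#8217;s a quick way to watch <strong>strip</strong> leave a non-breaking space alone. This sketch assumes Ruby 1.9 or later with UTF-8 text, and it swaps in the escape "\u00A0" for the byte pattern used above; <b>clean_cell</b> is a hypothetical helper name and the cell text is made up.</p>
<pre name="code" class="ruby">
NBSP = "\u00A0"   # the non-breaking space character

# clean_cell is a hypothetical helper name
def clean_cell(text)
  text.gsub(NBSP, ' ').strip
end

raw_cell = "#{NBSP}Released 04/05/2010#{NBSP}"   # made-up cell text

stripped_only = raw_cell.strip          # the non-breaking spaces survive
cleaned       = clean_cell(raw_cell)    # swapped for spaces, then stripped
release_date  = cleaned.split(' ')[1]   # second term of "Released mm/dd/yyyy"

puts stripped_only == raw_cell
puts release_date
</pre>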
<p>OK, that should&#8217;ve given you a refresher on arrays, hashes, XPath, and string manipulation. Now we&#8217;ll handle the case of when the first <b>list_column</b> array item does contain a link. It will involve fetching the page from that link and then more XPathing to pick out the wanted data.</p>
<p>At this time, go to the <a href="https://danwin.com/static/jail-list/current_listing.cfm.html">inmate list page</a> and click on one of the inmate pages in the browser.</p>
<p>There&#8217;s a lot more information here; what will be most relevant to us right now is the X-Reference Number, charges, and bail. This next section of code will fit into the <b>else</b> branch of our <a href="#t_ifnotlinkfetch">previous section of code</a>.</p>
<pre name="code" class="ruby">
	# visit link (remember that the xpath method returns an array, so we have to explicitly refer to
	# the 0th index to get the link)
	inmate_link = list_columns[0].xpath('./a')[0]["href"] 
	
	# remember that we set base_url to contain the site's base address. we append 
	# inmate_link to it to get the absolute address to the inmate page
	inmate_page = Nokogiri::HTML(URI.open("#{base_url}#{inmate_link}"))
	
	# everything is inside a &lt;td&gt; with a class="content" attribute, so let's set a variable
	# to hold the table rows inside
	
	content_table_rows = inmate_page.xpath("//td[@class='content']/table/tr")
	
	# the xref number appears to be in the third row and in the third cell
	# again, we're still using the inmate variable to hold the data associated with an inmate
	
	inmate["xref"] = content_table_rows[2].xpath("./td")[2].content.gsub(/\302\240/, ' ').strip
	# the strip method removes leading and trailing whitespace, such as spaces, tabs and carriage returns
	
	inmate['booking_number'] = content_table_rows[3].xpath("./td")[2].content.gsub(/\302\240/, ' ').strip
	inmate['arresting_agency'] = content_table_rows[13].xpath("./td")[2].content.gsub(/\302\240/, ' ').strip
	inmate['total_bail'] = content_table_rows[16].xpath("./td")[2].content.gsub(/(\302\240)|\s|\n|\r/, ' ').strip
</pre>
<div class='note'>
	Total bail gets an extra <strong>gsub</strong> condition because there are a few cases where carriage returns are in the table cell, which causes issues when we later try to import the result into a tab-delimited file/spreadsheet.
</div>
<p>OK, so we collected the basic info about each inmate. Now, we want to collect the charges leveled against them. This is a little bit trickier. If you inspect the table-cell containing the charges, you&#8217;ll see that the charge listing itself is a table. The first row of the table lists the case number and type of arrest (warrant, or fresh pickup). Below that is a list of charges, with each charge taking up two rows, like so:</p>
<table>
<tr>
<td>1st Row:</td>
<td>1st Cell: Charge code (e.g. PC 459)</td>
<td>2nd Cell: Charge severity (e.g. Felony)</td>
</tr>
<tr>
<td>2nd Row:</td>
<td colspan='2'>Charge description (e.g. &#8220;Burglary&#8221;)</td>
</tr>
</table>
<p>For most of the inmate listings, this is immediately followed by another row listing the bail amount.</p>
<p><strong>However</strong>, there are a few inmates who are held on more than one charge. And there are some who are being held on multiple charges stemming from multiple warrants, such as this person here, who appears to have racked up a number of public nuisance accusations, including evading ticket fare and prohibited public drinking. In his case, the charge listing is one row after another, and each row could mention the case, the agency that issued the warrant, the charge, or the bail amount per warrant.</p>
<p>My point here is that you won&#8217;t be able to predict that the third row, for instance, always contains the charge code and severity. But using <b>Inspect Element</b>, we see that the table cells containing the code, severity, and description have class attributes &#8220;cellTopLeft&#8221;, &#8220;cellTopMiddle&#8221; and &#8220;cellBottom&#8221;, respectively. The bail amount per case is in the cell with class &#8220;cellBail&#8221;&#8230;but we&#8217;re not interested in bail per case, so we&#8217;ll ignore it.</p>
<p>We&#8217;re going to loop through the rows inside this table, and if a row contains a td cell of class &#8220;cellTopLeft&#8221;, we know that this row will contain the code and severity of a charge. We&#8217;re going to assume that the row immediately following it has a cell with class &#8220;cellBottom,&#8221; which contains the description.</p>
<p>Processing this sub-table of charges will require its own loop. And since each inmate could have more than one charge, we need to store <b>&#8220;charges&#8221;</b> inside our <b>inmate</b> hash&#8230;<b>charges</b> will point to an array. And each item in the <b>charges</b> array will itself be a hash, with keys of &#8220;code&#8221;, &#8220;severity&#8221;, and &#8220;description.&#8221;</p>
<p>Confusing? Well, here&#8217;s a quick diagram of what we have so far, in terms of variables:</p>
<pre>
inmates		=> an array of Hashes...
				inmate = inmates[index] (each inmate is a Hash)
			=> inmate['first_name'] => inmate's first name
			=> inmate['last_name']  => inmate's last name
			=> inmate['xref'] 		=> inmate's xref
			... all the other attributes
			=> inmate['charges']  =>  an array of hashes
						charge = inmate['charges'][charge_index] (each charge is a Hash)
						charge['code']			=> charge's code
						charge['severity']			=> charge's severity
						charge['description']	=> charge's description
</pre>
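<p>If the diagram is still fuzzy, it can help to build one record by hand and dig back into it. In this sketch every value is invented, apart from the PC 459 burglary example from the table above:</p>
<pre name="code" class="ruby">
# Every value below is invented for illustration
inmate = {
  'first_name' => 'JOHN',
  'last_name'  => 'DOE',
  'xref'       => '0000000',
  'charges'    => []          # will hold one hash per charge
}

inmate['charges'].push({ 'code' => 'PC 459', 'severity' => 'Felony', 'description' => 'Burglary' })
inmate['charges'].push({ 'code' => 'PC 000', 'severity' => 'Misdemeanor', 'description' => 'Placeholder charge' })

inmates = [inmate]

# Walk the structure from the outside in: array, hash, array, hash
second_charge_severity = inmates[0]['charges'][1]['severity']
all_codes = inmates[0]['charges'].collect { |charge| charge['code'] }

puts second_charge_severity
puts all_codes.inspect
</pre>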
<p>The loop to fill out that charge array is as follows:</p>
<pre name="code" class="ruby">
	# first, grab the entire table of charges that exists in the 16th row of the main content table
	table_of_charges = content_table_rows[15].xpath("./td")[2]
	
	# and give this inmate an array of charges
	inmate['charges'] = []
	
	# Now, collect all rows that have a td with class "cellTopLeft"
	charge_1st_rows =  table_of_charges.xpath(".//tr[td[@class='cellTopLeft']]")
	
	# Now, collect all rows that have a td with class "cellBottom"
	charge_2nd_rows = table_of_charges.xpath(".//tr[td[@class='cellBottom']]")
	
	# OK, you should do some basic error checking here. We expect the arrays of charge_1st_rows and charge_2nd_rows to have
	# equal length, since each charge has a code, severity and description, right?
	
	# If not, that means our assumption was wrong, and you should do something...like exit the script and re-examine your
	# datasource and assumptions about it. But I'll skip that for now
	
	charge_1st_rows.to_a.each_index do |charge_row_index|
	
		# we found a row with a charge, so let's create a new hash that will hold the charge's attributes
		hash_of_inmate_charge = {}
		
		charge_1st_row = charge_1st_rows[charge_row_index]
		hash_of_inmate_charge['code'] = charge_1st_row.xpath('.//td')[0].content.gsub(/\302\240/, ' ').strip
		hash_of_inmate_charge['severity'] = charge_1st_row.xpath('.//td')[1].content.gsub(/\302\240/, ' ').strip
			
		# we assume that the row, with the same index in the charge_2nd_rows array will be the description of the charge
		# listed in charge_1st_rows
			
		hash_of_inmate_charge['description'] = charge_2nd_rows[charge_row_index].xpath('.//td')[0].content.gsub(/\302\240/, ' ').strip
		
		
		# push this hash on to the array of inmate charges:
		inmate['charges'] << hash_of_inmate_charge	
		
	end
</pre>
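<p>The error check that the comments above skip could look something like this. It&#8217;s a sketch, not part of the original script: <b>pair_charge_rows</b> is a hypothetical helper, and plain arrays of made-up strings stand in for the two sets of rows so that it runs without fetching anything.</p>
<pre name="code" class="ruby">
# pair_charge_rows is a hypothetical helper. It accepts anything array-like,
# so plain arrays stand in here for the two Nokogiri node sets.
def pair_charge_rows(first_rows, second_rows)
  unless first_rows.length == second_rows.length
    raise ArgumentError,
          "found #{first_rows.length} code rows but #{second_rows.length} description rows"
  end
  first_rows.to_a.zip(second_rows.to_a)
end

# Lengths match: each code row is paired with its description row
pairs = pair_charge_rows(['PC 459 | Felony'], ['Burglary'])

# Lengths differ: our assumption about the page was wrong, so stop and say so
mismatch_message = nil
begin
  pair_charge_rows(['PC 459 | Felony', 'PC 000 | Misdemeanor'], ['Burglary'])
rescue ArgumentError => e
  mismatch_message = e.message
end

puts pairs.inspect
puts mismatch_message
</pre>
<p>Failing loudly here is a deliberate choice: a silent mismatch would attach descriptions to the wrong charges, which is worse than a script that stops.</p>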
<p>	Well, we've collected all the relevant inmate information, and if our assumptions were right, each of the inmate's charges. We've reached the end of the loop that examines each row in the main inmate listing. Our script will go onto the next inmate and collect his/her info. And so on until it has reached the end of the list. Here's all the code so far:</p>
<pre name="code" class="ruby">
		require 'rubygems'
		require 'nokogiri'
		require 'open-uri'
		inmates_array = []
		base_url='' 		
		inmate_listing = Nokogiri::HTML(URI.open("#{base_url}current_listing.cfm.html"))

		inmate_rows = inmate_listing.xpath("//td[@class='content']/table")[0].xpath(".//tr").to_a[1..-1]
		inmate_rows.each_index do |i|
			inmate_row = inmate_rows[i]		
			inmates_array[i] = {}
			inmate = inmates_array[i]


			list_columns = inmate_row.xpath('./td')		
			the_inmate_name =  list_columns[0].content.gsub(/\302\240/, ' ').strip.split(',')
			inmate['last_name'] = the_inmate_name[0]					# the name before the comma

			inmate['first_name'] = the_inmate_name[1].split(' ')[0]		# the name after the comma, but before the next space
			inmate['middle_name'] = the_inmate_name[1].split(' ')[1..-1].join(' ') if the_inmate_name[1].split(' ').length > 1



			inmate['sex'] = list_columns[1].content		
			inmate['dob'] = list_columns[2].content

			inmate['intake_time'] = list_columns[3].content


			if list_columns[0].xpath('./a').length == 0 
				inmate['release_date'] = list_columns[4].content.gsub(/\302\240/, ' ').split(' ')[1]
			else

				inmate_link = list_columns[0].xpath('./a')[0]["href"] 
				inmate_page = Nokogiri::HTML(URI.open("#{base_url}#{inmate_link}"))
				content_table_rows = inmate_page.xpath("//td[@class='content']/table/tr")

		    if content_table_rows.length > 0


			  	inmate["xref"] = content_table_rows[2].xpath("./td")[2].content.gsub(/\302\240/, ' ').strip
		  		inmate['booking_number'] = content_table_rows[3].xpath("./td")[2].content.gsub(/\302\240/, ' ').strip
		  		inmate['arresting_agency'] = content_table_rows[13].xpath("./td")[2].content.gsub(/\302\240/, ' ').strip
		  		inmate['total_bail'] = content_table_rows[16].xpath("./td")[2].content.gsub(/\302\240/, ' ').gsub(/\s|\n|\r/, ' ').strip

		  		table_of_charges = content_table_rows[15].xpath("./td")[2]
		  		inmate['charges'] = []

		  		charge_1st_rows =  table_of_charges.xpath(".//tr[td[@class='cellTopLeft']]")
		  		charge_2nd_rows = table_of_charges.xpath(".//tr[td[@class='cellBottom']]")

		  		charge_1st_rows.collect{|x| x}.each_index do |charge_row_index|

		  			hash_of_inmate_charge = {}

		  			charge_1st_row = charge_1st_rows[charge_row_index]
		  			hash_of_inmate_charge['code'] = charge_1st_row.xpath('.//td')[0].content.gsub(/\302\240/, ' ').strip
		  			hash_of_inmate_charge['severity'] = charge_1st_row.xpath('.//td')[1].content.gsub(/\302\240/, ' ').strip
		  			hash_of_inmate_charge['description'] = charge_2nd_rows[charge_row_index].xpath('.//td')[0].content.gsub(/\302\240/, ' ').strip

		  			# push this hash on to the array of inmate charges:
		  			inmate['charges'] << hash_of_inmate_charge	

		  		end
				end # end if content_table_rows

			end
		end
	</pre>
</p></div>
<div class="sec">
<h3><a name="#topic_file"></a>Storing your Data into a File</h3>
<p>At this point in your script, all your carefully collected data is in memory. When the script finishes execution, it disappears. That defeats the purpose of tracking data at all. So let's store it in a persistent way...my choice would be in some kind of database, like MySQL or SQLite. But for our purposes, we can quickly learn the methods to store this information in a tab-delimited file that can be opened as an Excel spreadsheet.</p>
<p>		We will be using Ruby's <a href="http://ruby-doc.org/core/classes/File.html#M002579">File class</a>:</p>
<pre name="code" class="ruby">

			##write to file
			File.open("inmates.txt", 'w'){ |f| 

				f.write("first_name\tmiddle_name\tlast_name\tsex\tdob\tintake_time\trelease_date\txref\tbooking_number\tarresting_agency\ttotal_bail\n")

				inmates_array.each do |inmate|

			f.write("#{inmate['first_name']}\t#{inmate['middle_name']}\t#{inmate['last_name']}\t#{inmate['sex']}\t#{inmate['dob']}\t#{inmate['intake_time']}\t#{inmate['release_date']}\t#{inmate['xref']}\t#{inmate['booking_number']}\t#{inmate['arresting_agency']}\t#{inmate['total_bail']}\n")

				end
			}
</pre>
<p>A quick explanation. The <b>File</b> class has the <b>open</b> method, to which we pass in two arguments: the name of the file we want to write to, and the <em>mode</em>. In this case, we're using 'w', which stands for "write" mode (it replaces the contents of any existing file of that name). The curly braces set off the code that gets executed while this File is open, with the variable <strong>f</strong> referring to the actual file.</p>
<p>File also has an instance method called <b>write</b>, which takes in a String as an argument to write to the open file.</p>
<p>Backslash-t will write a <b>tab</b>, and backslash-n will write a newline character.</p>
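<p>Typing out long strings of backslash-t gets error-prone, so here's one way to build each line from an array instead. This is a sketch under my own assumptions: <b>tsv_line</b> is a hypothetical helper, and the record is made up. It also swaps out any stray tabs or line breaks inside a value, which would otherwise split a row in the spreadsheet.</p>
<pre name="code" class="ruby">
# tsv_line is a hypothetical helper: it builds one tab-delimited line
# and keeps stray tabs or line breaks inside a value from splitting the row.
def tsv_line(values)
  cleaned = values.collect { |value| value.to_s.gsub(/[\t\r\n]+/, ' ').strip }
  cleaned.join("\t") + "\n"
end

columns = ['first_name', 'last_name', 'total_bail']
inmate  = { 'first_name' => 'JOHN', 'last_name' => 'DOE', 'total_bail' => "10000\r\n" }  # made-up record

header_line = tsv_line(columns)
inmate_line = tsv_line(columns.collect { |column| inmate[column] })

puts header_line
puts inmate_line
</pre>
<p>Because the header and the data line are built from the same <b>columns</b> array, they can't drift out of order.</p>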
<p>The next block of code is similar to the first...but it refers to a "charges.txt" file. Remember that each inmate could have more than one charge to his/her name. The following file lists every charge, but also lists the xref key to tie back into inmates.txt. For convenience's sake, we're also going to print out the inmate name and the inmate's <em>total bail</em> on each line.</p>
<pre name="code" class="ruby">

			File.open("charges.txt",'w'){ |f|
			  f.write("name\txref\ttotal_bail\tcode\tseverity\tdescription\n")

			  inmates_array.each do |inmate|	  
				  if inmate['charges']
				    inmate['charges'].each do |charge|
			  	    f.write("#{inmate['first_name']} #{inmate['last_name']}\t#{inmate['xref']}\t#{inmate['total_bail']}\t#{charge['code']}\t#{charge['severity']}\t#{charge['description']}\n")
			      end
			    end
				end

			}
		</pre>
<p>Printing out the inmate's name and total bail, although redundant, allows us to quickly skim the list to see if there were any unusual crimes connected to unusual amounts of bail (note that the jail site does not break down bail amounts per charge).</p></div>
<div class="sec">
<h3><a name="#topic_realworld"></a> Putting it all together for the real world</h3>
<p>The above code, put all together, will execute cleanly and compile some nice text files for you, especially if you've saved the package of HTML files onto your hard drive. But in the real world, you'll be targeting an internet server, which may not like you hitting it at a rate of five times per second, or which may intermittently fail.</p>
<p>		To deal with this, I've added a call to Ruby's <a href="http://ruby-doc.org/core/classes/Kernel.html#M005972">sleep</a> method, which pauses script execution for a given number of seconds. I've also thrown in some <a href="http://ruby.activeventure.com/programmingruby/book/tut_exceptions.html">error-handling.</a> Here's the basic structure:</p>
<pre name="code" class="ruby">
		# some code
		begin
			# risky code here
			# The Ruby interpreter will watch the code that gets executed within the begin branch...if something goes wrong, it's going to execute code in the following rescue branch
	
		rescue
			# the begin-branch messed up, time to run some other code
			puts "An error happened!"
		else
			# this code gets executed if the begin-branch worked fine
		ensure
			# this code in the ensure branch (which is optional) runs no matter what.
			puts "We're done with our error handling"
		end
	
		</pre>
<p>		Read more about <a href="http://ruby.activeventure.com/programmingruby/book/tut_exceptions.html">error-handling here</a>.</p>
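<p>		To see the order in which those branches actually fire, here is a minimal sketch you can run on its own. The <b>try_to_parse</b> method is a made-up helper for illustration only (it is not part of the scraper); it records each branch it passes through, once for input that converts cleanly and once for input that raises an error.</p>
<pre name="code" class="ruby">
# try_to_parse is a hypothetical helper, used only to show branch order
def try_to_parse(text)
  steps = []
  begin
    steps << "begin"
    Integer(text)            # raises ArgumentError if text is not a whole number
  rescue ArgumentError
    steps << "rescue"        # runs only when the begin branch raised
  else
    steps << "else"          # runs only when the begin branch did NOT raise
  ensure
    steps << "ensure"        # runs either way
  end
  steps
end

puts try_to_parse("42").join(" -> ")     # begin -> else -> ensure
puts try_to_parse("oops").join(" -> ")   # begin -> rescue -> ensure
</pre>
<p>		Note that the else and rescue branches never both run: you get one or the other, and then ensure.</p>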
<p>		And finally, I'm going to make a few alterations to the script so that it runs repeatedly, once every half hour (essentially, by sleeping a half hour after going through the list). This is the crudest way to schedule a script, but it'll work for now. It will also use another instance method of <b>File</b>: <b>readlines</b>.</p>
<p>		Each half hour, it's likely that the list of inmates will be mostly the same. So a crude way to reduce the number of repeat listings is to check the inmates.txt file (using the <b>match</b> method) to see if a given inmate's name is already in there. This gets slower as inmates.txt grows. Like I said, it's crude. I prefer using a database, which is a topic outside the scope of this tutorial.</p>
<p>		So I've taken the code above and split it into five parts:</p>
<ol>
<li>the <strong>process_inmate_row</strong> method - This method takes in a single row from the list of inmates and reads the basic information, including name, sex, and date of birth. It takes in as its second argument the entire text of inmates.txt and checks whether inmates.txt already contains the name. If it does not, the method returns a hash of inmate data. If it does, the method returns nil.
<p>Note: As said previously, constantly searching the entire inmates.txt file is incredibly inefficient. And, what happens if two John Smiths are arrested in the same time period? The name-check will fail to differentiate inmates of similar names (an even better match method would involve using the date of birth). But I leave it as an exercise for you to develop a more efficient method, which could involve a database. Or storing the name columns of inmates.txt into an array.</p>
<p>But the reason why we're doing the name-check is to save us the time of entering an inmate's page. And, of course, to not fill the inmates.txt file with duplicate entries.
				</li>
<li>the <strong>process_inmate_page_link</strong> method - The code that fetches an inmate's individual page and then processes the extra data, including the total bail amount and charges, is done here. It returns a hash of the inmate data. </li>
<li>the <strong>write_to_file</strong> method - This code invokes the File.open methods and, for each inmate and charge, writes a tab-delimited line to the inmates.txt and charges.txt files</li>
<li>the <strong>check_the_site</strong> method - This is the master method. It retrieves the list of inmates from the jail site and then on each row of inmate data, calls all the previously defined methods. It also has some basic error handling. If something happens, like your internet connection drops in the middle of a page retrieval, it skips the current inmate and moves on. This is better than just crashing.</li>
<li>The main execution loop - All the code previously written out as methods will <b>do nothing</b> unless you actually invoke the methods. So we initialize a counter variable, called <b>hours</b>, to zero and while that is less than 24, we run the <strong>check_the_site</strong> method. After <strong>check_the_site</strong> finishes, <strong>hours</strong> is incremented and the script sleeps for half an hour (1800 seconds). Despite its name, <strong>hours</strong> counts passes through the loop, so 24 passes cover roughly 12 hours.</li>
</ol></div>
<div class='sec'>
<p>Here's the final code, which will be reading from the mirrored archive list I've provided <a href="https://danwin.com/static/jail-list/current_listing.cfm.html">here</a>. So obviously, running the main collection loop more than once is pointless as my list is static...but at least it's practice. You can download a <a href="https://danwin.com/static/jail-list/jail-list.zip">zipped archive</a> of the files here.</p>
<pre name='code' class='ruby'>
require 'rubygems'
require 'nokogiri'
require 'open-uri'


def process_inmate_row(inmate_row, inmate_text)
  
	list_columns = inmate_row.xpath('./td')		
	inmate = {}
	the_inmate_name =  list_columns[0].content.gsub(/\302\240/, ' ').strip.split(',')
	inmate['last_name'] = the_inmate_name[0]					# the name before the comma  
	given_names = the_inmate_name[1].split(' ')					# everything after the comma, split on spaces
	inmate['first_name'] = given_names[0]						# the first word after the comma
	inmate['middle_name'] = given_names[1..-1].join(' ') if given_names.length > 1	# any remaining words, joined into one string
  
  # at this point, we can determine if the inmate is already in our textfile
  
  name_to_match="#{inmate['first_name']}\t#{inmate['middle_name']}\t#{inmate['last_name']}"  
   # remember that in the text file, we tab-delimited the name, so we have to match that pattern
	
  
   if inmate_text.match(name_to_match)
     puts "NOT adding inmate #{name_to_match} to inmates txt, as it already exists"
     inmate = nil
     # the method that invoked process_inmate_row will only add the inmate if it is not nil
     # we DON'T want this inmate added, so that's why we're setting it to nil
		
   else  
     
     	puts "Adding inmate #{name_to_match} to inmates txt"
 		  inmate['sex'] = list_columns[1].content		
   	  inmate['dob'] = list_columns[2].content
   	  inmate['intake_time'] = list_columns[3].content
	    puts "Basic info of inmate: #{inmate['first_name']} #{inmate['last_name']}: #{inmate['dob']}"
       
  end
  
  
  return inmate
  
end


def process_inmate_page_link(inmate_link)
  	inmate_page = Nokogiri::HTML(URI.open(inmate_link)) # open-uri's URI.open; a bare open(url) no longer fetches URLs on Ruby 3+
		content_table_rows = inmate_page.xpath("//td[@class='content']/table/tr")

    more_inmate_stuff = {}
    
    if content_table_rows.length > 0
      
	  	more_inmate_stuff["xref"] = content_table_rows[2].xpath("./td")[2].content.gsub(/\302\240/, ' ').strip
  		
  		more_inmate_stuff['booking_number'] = content_table_rows[3].xpath("./td")[2].content.gsub(/\302\240/, ' ').strip
  		more_inmate_stuff['arresting_agency'] = content_table_rows[13].xpath("./td")[2].content.gsub(/\302\240/, ' ').strip
  		more_inmate_stuff['total_bail'] = content_table_rows[16].xpath("./td")[2].content.gsub(/\302\240/, ' ').gsub(/\s|\n|\r/, ' ').strip

  		puts "Found more inmate info, total-bail: #{more_inmate_stuff['total_bail']} arresting-agency: #{more_inmate_stuff['arresting_agency']}"


  		table_of_charges = content_table_rows[15].xpath("./td")[2]
  		more_inmate_stuff['charges'] = []

  		charge_1st_rows =  table_of_charges.xpath(".//tr[td[@class='cellTopLeft']]")
  		
  		puts "Number of charges: #{charge_1st_rows.length}"
  		charge_2nd_rows = table_of_charges.xpath(".//tr[td[@class='cellBottom']]")

  		charge_1st_rows.collect{|x| x}.each_index do |charge_row_index|

  			hash_of_inmate_charge = {}

  			charge_1st_row = charge_1st_rows[charge_row_index]
  			hash_of_inmate_charge['code'] = charge_1st_row.xpath('.//td')[0].content.gsub(/\302\240/, ' ').strip
  			hash_of_inmate_charge['severity'] = charge_1st_row.xpath('.//td')[1].content.gsub(/\302\240/, ' ').strip
  			hash_of_inmate_charge['description'] = charge_2nd_rows[charge_row_index].xpath('.//td')[0].content.gsub(/\302\240/, ' ').strip

  			# push this hash on to the array of inmate charges:
  			more_inmate_stuff['charges'] << hash_of_inmate_charge	
    
        puts hash_of_inmate_charge.to_a.join(" | ") # to_a gives key/value pairs; join flattens them into one line
  		end
  		
  	else
  	  puts "Could not find more inmate info"	
		end # end if content_table_rows
  
    return more_inmate_stuff
end

def write_to_file(inmate)

    ##write to file
    puts "Writing to inmates.txt"
    
    # note that we use the 'a+' mode here, which appends new input onto the end of an existing file (or creates a new one if it doesn't exist), instead of overwriting it
    # Obviously, we don't want to keep overwriting inmates.txt if we intend it to be a persistent record of the inmate log
    
    File.open("inmates.txt", 'a+'){ |f| 
	f.write("first_name\tmiddle_name\tlast_name\tsex\tdob\tintaketime\trelease_date\txref\tbooking_number\tarresting_agency\ttotal_bail\trecorded_at\n") unless File.size(f) > 0 
      # we don't want to repeatedly print the column headers, so we only write them while the file is still empty
      # (recorded_at labels the Time.now value that ends each row below)
      f.write("#{inmate['first_name']}\t#{inmate['middle_name']}\t#{inmate['last_name']}\t#{inmate['sex']}\t#{inmate['dob']}\t#{inmate['intake_time']}\t#{inmate['release_date']}\t#{inmate['xref']}\t#{inmate['booking_number']}\t#{inmate['arresting_agency']}\t#{inmate['total_bail']}\t#{Time.now}\n")
   
    }
   
    puts "Writing to charges.txt"

    File.open("charges.txt",'a+'){ |f|
      f.write("name\txref\ttotal_bail\tcode\tseverity\tdescription\trecorded_at\n") unless File.size(f) > 0 
      # we don't want to repeatedly print the column headers, so we only write them while the file is still empty
      
  	  if inmate['charges']
  	    inmate['charges'].each do |charge|
  	      puts "Writing charge: #{charge['description']}"
    	    f.write("#{inmate['first_name']} #{inmate['last_name']}\t#{inmate['xref']}\t#{inmate['total_bail']}\t#{charge['code']}\t#{charge['severity']}\t#{charge['description']}\t#{Time.now}\n")
        end
      end
  
    }
end
  
  

def check_the_site(base_url, index_url)
  # read the contents of inmates.txt into a variable so that we can check to see if an inmate already exists
   inmate_text = File.exist?("inmates.txt") ? File.open("inmates.txt", 'r') { |f| f.readlines.join } : '' # the block form closes the file when done
   inmates_added_count = 0 # just a piece of info we want to keep track of. We'll increment this number on each successful add
   
    
    begin	
      inmate_listing = Nokogiri::HTML(URI.open("#{base_url}#{index_url}"))
    rescue StandardError=>e
      puts "Oops, had a problem getting the inmates list at #{Time.now}: #{e}"
      return nil #get out of here.
    end
      
    inmate_rows = inmate_listing.xpath("//td[@class='content']/table")[0].xpath(".//tr").to_a[1..-1] # skip the header row
    puts "There are #{inmate_rows.length} rows to process"
    inmate_rows.each_index do |i|
  
      puts "\nProcessing inmate row: #{i}"
      inmate_row = inmate_rows[i]
      
      begin
        # The following code is potentially risky; we're making calls to process_inmate_row and process_inmate_page_link, two methods that could potentially throw an error if the data is improperly formatted or if the website refuses to send data
        
        # I've set up some rudimentary error handling to notify you of an error, but to keep chugging along to the next row
        
        inmate = process_inmate_row(inmate_rows[i], inmate_text)
        
        # process_inmate_row will return a hash of inmate data 
        # BUT, it will return nil if it turns out this inmate already exists
        # so here's another if branch to check for that
        
        if inmate.nil?
          # do nothing
        else  
          # inmate was not blank, so let's continue
          list_columns = inmate_row.xpath('./td')		
        	if list_columns[0].xpath('./a').length == 0 
        		inmate['release_date'] = list_columns[4].content.gsub(/\302\240/, ' ').split(' ')[1]
        		puts "inmate was released on #{inmate['release_date']}"
        	else
        	  inmate_link = list_columns[0].xpath('./a')[0]["href"] 
        	  inmate_link = "#{base_url}#{inmate_link}"
        	  puts "Fetching: #{inmate_link}"
            more_inmate_attributes = process_inmate_page_link(inmate_link)
            inmate.merge!(more_inmate_attributes)
          end
    
          
        end # end of the if inmate.nil? branch
      rescue Timeout::Error => e 
        # this rescue has to come before the general one below, or it would never be reached
        puts "Had a timeout error: #{e}"
        sleep(10)
      rescue StandardError=>e
        puts "Oops, had a problem getting data from inmate row #{i}, Error: #{e}"
      else
         # got all the info for the inmate, so let's add him/her to the file
        
        unless inmate.nil?
          # remember that inmate was set to nil if it already existed in the text file
          # we don't want to write it out again in such a case, hence the 'unless'
          write_to_file(inmate)
          inmates_added_count+=1
          puts "We successfully queried the site, so let's sleep a second"
          sleep 1
        end

        
        
      end
    
    end
	
	  # reached the end, let's print a summary:
	  puts "#{Time.now}: Out of #{inmate_rows.length}, we added #{inmates_added_count} inmates"
	
end

   


hours = 0
BASE_URL='https://danwin.com/static/jail-list/'

while(hours < 24)
  puts "Checking the site (#{hours} out of 24 times):"
  puts "***********************"
  check_the_site(BASE_URL, 'current_listing.cfm.html')
  # run the code that hits the site, processes the links, and appends any new inmates to the text files
  
  
  
  hours += 1 # increment the counter, or this will run forever...
  puts "sleeping till next iteration"
  
  sleep_count = 0
  while(sleep_count < 1800)
    sleep(1) # sleep one second at a time; 1800 of these add up to half an hour
    sleep_count +=1
    puts "Will check again in #{(1800-sleep_count)/60} minutes" if sleep_count%60==0
  end
  
  
end

    
</pre>
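<p>As for the exercise of finding a cheaper and stricter duplicate check: one possible approach, short of a database, is to read inmates.txt once into a Set of keys built from the name <em>plus</em> the date of birth. This is only a sketch under assumptions. The helper names <b>inmate_key</b> and <b>load_seen_keys</b> are made up, the sample rows and their date format are invented, and it assumes the column order used in the inmates.txt header (first, middle, last, sex, dob).</p>
<pre name='code' class='ruby'>
require 'set'

# hypothetical helper: the key we compare on is name plus date of birth
def inmate_key(inmate)
  [inmate['first_name'], inmate['middle_name'], inmate['last_name'], inmate['dob']].join("\t")
end

# hypothetical helper: turn the text of inmates.txt into a Set of keys
def load_seen_keys(text)
  seen = Set.new
  text.each_line do |line|
    cols = line.chomp.split("\t", -1)   # -1 keeps empty columns, such as a blank middle name
    next if cols.length < 5 || cols[0] == 'first_name'   # skip short lines and the header row
    seen << [cols[0], cols[1], cols[2], cols[4]].join("\t")
  end
  seen
end

# invented sample data, standing in for the contents of inmates.txt
sample = "first_name\tmiddle_name\tlast_name\tsex\tdob\n" +
         "JOHN\t\tSMITH\tM\t01/02/1980\n"
seen = load_seen_keys(sample)

same_person = { 'first_name' => 'JOHN', 'last_name' => 'SMITH', 'dob' => '01/02/1980' }
other_john  = { 'first_name' => 'JOHN', 'last_name' => 'SMITH', 'dob' => '05/06/1991' }

puts seen.include?(inmate_key(same_person))   # true, already on file
puts seen.include?(inmate_key(other_john))    # false, a different John Smith
</pre>
<p>A Set lookup takes about the same time no matter how long the file gets, whereas matching against the whole text slows down as inmates.txt grows. The catch is that the date of birth has to be read from the row before the check can be made.</p>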
</div>
<p><b>4/4/2010:</b> This lesson remains unfinished, but the above code should execute. From it, you should have text files that, at a glance, will tell you some of the more interesting circumstances that this set of inmates was arrested under. There are various kinds of analysis you could do on a long-term basis. But trying to figure out why some inmates have bail set at $1,000,000 isn't easy; you need to know their prior criminal record too...which is <a href="https://danwin.com/works/coding-for-journalists-part-3-cross-checking-the-jail-log-with-the-court-system-use-rubys-mechanize-to-fill-out-a-form/">what we hope to do in the third tutorial in this series</a>.</p>
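<p>As a first pass at that kind of skimming, here is a rough sketch that reads charges.txt-style text and lists the rows with the highest total bail. It rests on assumptions: that the text starts with the header row (name, xref, total_bail, code, severity, description), and that bail is written as a dollar figure such as "$1,000,000.00". The helper names and the sample rows are made up for illustration.</p>
<pre name='code' class='ruby'>
# hypothetical helper: strip everything but digits and the decimal point
def bail_to_number(text)
  text.to_s.gsub(/[^\d.]/, '').to_f
end

# hypothetical helper: turn tab-delimited text into an array of hashes keyed by the header row
def parse_charges(text)
  lines = text.split("\n")
  return [] if lines.empty?
  headers = lines.shift.split("\t")
  lines.map { |line| Hash[headers.zip(line.split("\t", -1))] }
end

# hypothetical helper: the rows with the largest total bail, biggest first
def highest_bail(charges, count)
  charges.sort_by { |charge| -bail_to_number(charge['total_bail']) }.first(count)
end

# invented sample data, standing in for File.read("charges.txt")
sample = "name\txref\ttotal_bail\tcode\tseverity\tdescription\n" +
         "JANE DOE\t111\t$5,000.00\tX1\tM\tSAMPLE CHARGE A\n" +
         "JOHN ROE\t222\t$1,000,000.00\tX2\tF\tSAMPLE CHARGE B\n"

highest_bail(parse_charges(sample), 1).each do |charge|
  puts "#{charge['name']}: #{charge['total_bail']} (#{charge['description']})"
end
</pre>
<p>Because charges.txt repeats the total bail on every charge line, an inmate with several charges will show up several times in this list.</p>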
</div>
<p>The post <a rel="nofollow" href="https://danwin.com/2010/04/coding-for-journalists-102-collecting-info-from-a-county-jail-site/">Coding for Journalists 102: Who&#8217;s in Jail Now: Collecting info from a county jail site</a> appeared first on <a rel="nofollow" href="https://danwin.com">danwin.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://danwin.com/2010/04/coding-for-journalists-102-collecting-info-from-a-county-jail-site/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
