<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	
	>
<channel>
	<title>Comments on: Coding for Journalists 101: Go from knowing nothing to scraping Web pages. In an hour. Hopefully.</title>
	<atom:link href="https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/feed/" rel="self" type="application/rss+xml" />
	<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/</link>
	<description>Words, photos, and code by Dan Nguyen. The &#039;g&#039; is mostly silent.</description>
	<lastBuildDate>Sun, 07 Dec 2025 04:13:29 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=4.2.39</generator>
	<item>
		<title>By: Dataliser &#187; Blog Archive &#187; Tutorial Websites</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#comment-2148</link>
		<dc:creator><![CDATA[Dataliser &#187; Blog Archive &#187; Tutorial Websites]]></dc:creator>
		<pubDate>Tue, 03 Apr 2012 00:53:41 +0000</pubDate>
		<guid isPermaLink="false">https://danwin.com/?p=436#comment-2148</guid>
		<description><![CDATA[[...] Coding for journalists 101: Scraping for Journalism: A Guide for Collecting Data [...]]]></description>
		<content:encoded><![CDATA[<p>[&#8230;] Coding for journalists 101: Scraping for Journalism: A Guide for Collecting Data [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Cracking the Code: My Pledge to Learn How to Web Scrape &#171; Newsfangled: Learning to Teach New Media</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#comment-2141</link>
		<dc:creator><![CDATA[Cracking the Code: My Pledge to Learn How to Web Scrape &#171; Newsfangled: Learning to Teach New Media]]></dc:creator>
		<pubDate>Fri, 30 Mar 2012 15:42:33 +0000</pubDate>
		<guid isPermaLink="false">https://danwin.com/?p=436#comment-2141</guid>
		<description><![CDATA[[...] Coding for Journalists [...]]]></description>
		<content:encoded><![CDATA[<p>[&#8230;] Coding for Journalists [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Datenjournalismus &#124; Pearltrees</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#comment-2124</link>
		<dc:creator><![CDATA[Datenjournalismus &#124; Pearltrees]]></dc:creator>
		<pubDate>Fri, 23 Mar 2012 12:13:11 +0000</pubDate>
		<guid isPermaLink="false">https://danwin.com/?p=436#comment-2124</guid>
		<description><![CDATA[[...] Coding for Journalists 101: Go from knowing nothing to scraping Web pages. In an hour. Hopefully. &#124; ...  But now, it&#039;s possible that a public-information officer will just point you to the public website and say, there it is. And it&#039;s not always a case of them being ignorant/disdainful of laws that oblige them to give the dataset, in electronic form, that backs the website. From their viewpoint, the information is there for any idiot with an Internet connection to ask for, so what are you whining about? At this point, you can either go through a weeks-long argument through emails and phone messages that ends with their legal counsel compelling the PI officer to hand over the data. Or, if keeping your story idea secret isn&#039;t a priority, you could explain what your intent is, and why you need a whole dataset to see if a trend exists. [...]]]></description>
		<content:encoded><![CDATA[<p>[&#8230;] Coding for Journalists 101: Go from knowing nothing to scraping Web pages. In an hour. Hopefully. | &#8230;  But now, it&#8217;s possible that a public-information officer will just point you to the public website and say, there it is. And it&#8217;s not always a case of them being ignorant/disdainful of laws that oblige them to give the dataset, in electronic form, that backs the website. From their viewpoint, the information is there for any idiot with an Internet connection to ask for, so what are you whining about? At this point, you can either go through a weeks-long argument through emails and phone messages that ends with their legal counsel compelling the PI officer to hand over the data. Or, if keeping your story idea secret isn&#8217;t a priority, you could explain what your intent is, and why you need a whole dataset to see if a trend exists. [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Screen Scraping (It&#8217;s Kinda Like Christmas)! &#171; The One Techeteer</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#comment-1825</link>
		<dc:creator><![CDATA[Screen Scraping (It&#8217;s Kinda Like Christmas)! &#171; The One Techeteer]]></dc:creator>
		<pubDate>Tue, 27 Dec 2011 08:39:12 +0000</pubDate>
		<guid isPermaLink="false">https://danwin.com/?p=436#comment-1825</guid>
		<description><![CDATA[[...] found a great tutorial by Dan Nguyen, a developer at ProPublica, to screen scrape with Ruby and the Nokogiri parsing tool. [...]]]></description>
		<content:encoded><![CDATA[<p>[&#8230;] found a great tutorial by Dan Nguyen, a developer at ProPublica, to screen scrape with Ruby and the Nokogiri parsing tool. [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Barney</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#comment-1458</link>
		<dc:creator><![CDATA[Barney]]></dc:creator>
		<pubDate>Fri, 14 Oct 2011 15:58:08 +0000</pubDate>
		<guid isPermaLink="false">https://danwin.com/?p=436#comment-1458</guid>
		<description><![CDATA[Wow. You put a lot of effort into this post. Awesome. I would like to add that there are web scraping software options out there that take away the need for any coding (although, learning XPath will be helpful to anyone that is serious about web scraping). For instance, I work for &lt;a href=&quot;http://www.mozenda.com&quot; rel=&quot;nofollow&quot;&gt;Mozenda&lt;/a&gt;, and we have people successfully using it that haven&#039;t touched a line of code in their lives. Anyway, I don&#039;t want to take anything away from such an in-depth tutorial, but if anyone is finding this stuff daunting, you can always check out Mozenda or any of our competitors (we&#039;re pretty confident in offering the easiest-to-use scraper at an affordable price).

Again, great job on the tutorial, and good luck to everyone in getting your data!]]></description>
		<content:encoded><![CDATA[<p>Wow. You put a lot of effort into this post. Awesome. I would like to add that there are web scraping software options out there that take away the need for any coding (although, learning XPath will be helpful to anyone that is serious about web scraping). For instance, I work for <a href="http://www.mozenda.com" rel="nofollow">Mozenda</a>, and we have people successfully using it that haven&#8217;t touched a line of code in their lives. Anyway, I don&#8217;t want to take anything away from such an in-depth tutorial, but if anyone is finding this stuff daunting, you can always check out Mozenda or any of our competitors (we&#8217;re pretty confident in offering the easiest-to-use scraper at an affordable price).</p>
<p>Again, great job on the tutorial, and good luck to everyone in getting your data!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Grepsr</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#comment-1452</link>
		<dc:creator><![CDATA[Grepsr]]></dc:creator>
		<pubDate>Thu, 13 Oct 2011 11:39:52 +0000</pubDate>
		<guid isPermaLink="false">https://danwin.com/?p=436#comment-1452</guid>
		<description><![CDATA[You could always try Grepsr (http://www.grepsr.com/), a newly launched, innovative, cloud-based data extraction service. We are offering a free trial at the moment.]]></description>
		<content:encoded><![CDATA[<p>You could always try Grepsr (<a href="http://www.grepsr.com/" rel="nofollow">http://www.grepsr.com/</a>), a newly launched, innovative, cloud-based data extraction service. We are offering a free trial at the moment.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Coding for Journalists 101: Go from knowing nothing to scraping Web pages. In an hour. Hopefully. &#124; Dan Nguyen pronounced fast is danwin &#124; AZ Journalism &#124; Journalism news &#124; Journalism career</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#comment-1380</link>
		<dc:creator><![CDATA[Coding for Journalists 101: Go from knowing nothing to scraping Web pages. In an hour. Hopefully. &#124; Dan Nguyen pronounced fast is danwin &#124; AZ Journalism &#124; Journalism news &#124; Journalism career]]></dc:creator>
		<pubDate>Tue, 16 Aug 2011 16:13:38 +0000</pubDate>
		<guid isPermaLink="false">https://danwin.com/?p=436#comment-1380</guid>
		<description><![CDATA[[...] Delicious/tag/journalism   This entry was posted in Journalism and tagged Coding, danwin, fast, from, Hopefully., hour., journalists, knowing, Nguyen, nothing, pages., pronounced, scraping. Bookmark the permalink.    &#8592; Google buys Motorola Mobility for $12.5bn Starbucks guitar man &#8594; [...]]]></description>
		<content:encoded><![CDATA[<p>[&#8230;] Delicious/tag/journalism   This entry was posted in Journalism and tagged Coding, danwin, fast, from, Hopefully., hour., journalists, knowing, Nguyen, nothing, pages., pronounced, scraping. Bookmark the permalink.    &larr; Google buys Motorola Mobility for $12.5bn Starbucks guitar man &rarr; [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: anon</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#comment-1238</link>
		<dc:creator><![CDATA[anon]]></dc:creator>
		<pubDate>Mon, 30 May 2011 01:53:58 +0000</pubDate>
		<guid isPermaLink="false">https://danwin.com/?p=436#comment-1238</guid>
		<description><![CDATA[You could always just use freelancers on sites like Freelancer.com 

They will do it for peanuts!]]></description>
		<content:encoded><![CDATA[<p>You could always just use freelancers on sites like Freelancer.com </p>
<p>They will do it for peanuts!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Norm Cimon</title>
		<link>https://danwin.com/2010/04/coding-for-journalists-go-from-a-know-nothing-to-web-scraper-in-an-hour-hopefully/#comment-897</link>
		<dc:creator><![CDATA[Norm Cimon]]></dc:creator>
		<pubDate>Mon, 17 Jan 2011 18:39:58 +0000</pubDate>
		<guid isPermaLink="false">https://danwin.com/?p=436#comment-897</guid>
		<description><![CDATA[Good morning. Two things:
- Thank you very much for the tutorial. It&#039;s a nice introduction to nokogiri and to the use of xpath. That&#039;s much appreciated.

- Given the changes to the wiki page and the dropped content attribute, the correct construct to fetch the president&#039;s name is:

list_of_presidents.xpath(&quot;//tr/td[4]/a[1]&quot;)[0].content

Again, thank you very much.]]></description>
		<content:encoded><![CDATA[<p>Good morning. Two things:<br />
&#8211; Thank you very much for the tutorial. It&#8217;s a nice introduction to nokogiri and to the use of xpath. That&#8217;s much appreciated.</p>
<p>&#8211; Given the changes to the wiki page and the dropped content attribute, the correct construct to fetch the president&#8217;s name is:</p>
<p>list_of_presidents.xpath(&quot;//tr/td[4]/a[1]&quot;)[0].content</p>
<p>Again, thank you very much.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
