Tag Archives: programming

How to compile OpenCV 2.4.10 on Ubuntu 14.04 and 14.10

For my upcoming Computational Methods in the Civic Sphere class at Stanford, I wanted my students to have access to OpenCV so that they could explore computer-vision algorithms, such as face detection with Haar classifiers.

On the Stanford FarmShare machines (which run on Ubuntu 13.10), I had trouble getting their installation of OpenCV working, but was able to use the
Anaconda distribution to install both Python 2.7.8 and OpenCV 2.4.9.1 via the Binstar package repo.

Briefly, here are the instructions:

  1. Get the Anaconda download link
  2. curl (*the-anaconda-URL-script*) -o /tmp/anaconda-install.sh && bash /tmp/anaconda-install.sh
  3. conda install binstar
  4. conda install -c https://conda.binstar.org/menpo opencv

Note: For Mac users for whom `brew install opencv` isn’t working: Anaconda worked well enough for me, though I had to install from a different pacakge repo:

conda install -c https://conda.binstar.org/jjhelmus opencv

The Anaconda system, which I hadn't used before but find really convenient, automatically upgrades/downgrades the necessary dependencies (such as numpy).

Using Anaconda works fine on fresh Ubuntu installs (I tested on AWS and Digital Ocean), but I wanted to see if I could compile it from source just in case I couldn't use Anaconda. This ended up being a very painful time of wading through blog articles and Github issues. Admittedly, I'm not at all an expert at *nix administration, but it's obvious there's a lot of incomplete and varying answers out there.

The help.ubuntu.docs on OpenCV are the most extensive, but right at the top, they state:

Ubuntu's latest incarnation, Utopic Unicorn, comes with a new version of libav, and opencv sources will fail to build with this new library version. Likewise, some packages required by the script no longer exist (libxine-dev, ffmpeg) in the standard repositories. The procedures and script described below will therefore not work at least since Ubuntu 14.10!

The removal of ffmpeg from the official Ubuntu package repo is, from what I can tell, the main source of errors when trying to compile OpenCV for Ubuntu 14.04/14.10. Many of the instructions deal with getting ffmpeg from a personal-package-archive and then trying to build OpenCV. That approach didn't work for me, but admittedly, I didn't test out all the possible variables (such as version of ffmpeg).

In the end, what worked was to simply just set the flag to build without ffmpeg:

  cmake [etc] -D WITH_FFMPEG=OFF

I've created a gist to build out all the software I want for my class machines, but here are the relevant parts for OpenCV:

sudo apt-get update && sudo apt-get -y upgrade
sudo apt-get -y dist-upgrade && sudo apt-get -y autoremove

# build developer tools. Some of these are probably non-pertinent
sudo apt-get install -y git-core curl zlib1g-dev build-essential \
     libssl-dev libreadline-dev libyaml-dev libsqlite3-dev \
     libxml2-dev libxslt1-dev libcurl4-openssl-dev \
     python-software-properties

# numpy is a dependency for OpenCV, so most of these other
# packages are probably optional
sudo apt-get install -y python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose
## Other scientific libraries (obviously not needed for OpenCV)
pip install -U scikit-learn
pip install -U nltk

### opencv from source
# first, installing some utilities
sudo apt-get install -y qt-sdk unzip
OPENCV_VER=2.4.10
curl "http://fossies.org/linux/misc/opencv-${OPENCV_VER}.zip" -o opencv-${OPENCV_VER}.zip
unzip "opencv-${OPENCV_VER}.zip" && cd "opencv-${OPENCV_VER}"
mkdir build && cd build
# build without ffmpeg
cmake -D WITH_TBB=ON -D BUILD_NEW_PYTHON_SUPPORT=ON \
      -D WITH_V4L=ON -D INSTALL_C_EXAMPLES=ON \
      -D INSTALL_PYTHON_EXAMPLES=ON -D BUILD_EXAMPLES=ON \
      -D WITH_QT=ON -D WITH_OPENGL=ON -D WITH_VTK=ON \
      -D WITH_FFMPEG=OFF ..

A recurring issue I had come across – I didn't test it myself, but just saw it in the variety of speculation regarding the difficulty of building OpenCV – is that building with a Python other than the system's Python would cause problems. So, for what it's worth, the above process works with 14.04's Python 2.7.6, and 14.10's 2.7.8. I'm not much of a Python user myself so I don't know much about best practices regarding environment…pyenv works pretty effortlessly (that is, it works just like rbenv), but I didn't try it in relation to building OpenCV.

Also, this isn't the bare minimum…I'm not sure what dev tools or which cmake flags are are absolutely needed, or if qt-sdk is needed if you don't build with Qt support. But it works, so hopefully anyone Googling this issue will be able to make some progress.

Note: Other things I tried that did not work on clean installs of Ubuntu 14.04/14.10:

The Python code needed to do simple face-detection looks something like this (based off of examples from OpenCV-Python and Practical Python and OpenCV:

(You can find pre=built XML classifiers at the OpenCV repo)


import cv2
face_cascade_path = '/YOUR/PATH/TO/haarcascade_frontalface_default.xml'
face_cascade = cv2.CascadeClassifier(face_cascade_path)

scale_factor = 1.1
min_neighbors = 3
min_size = (30, 30)
flags = cv2.cv.CV_HAAR_SCALE_IMAGE

# load the image
image_path = "YOUR/PATH/TO/image.jpg"
image = cv2.imread(image_path)

# this does the work
rects = face_cascade.detectMultiScale(image, scaleFactor = scale_factor,
  minNeighbors = min_neighbors, minSize = min_size, flags = flags)

for( x, y, w, h ) in rects:
  cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 0), 2)

cv2.imwrite("YOUR/PATH/TO/output.jpg", image)

The Computer Science of Tomorrow, Today, and the Past

“The thing you are doing has likely been done before. And that might seem depressing, but I think it’s the most wonderful thing ever. Because it means an education in computer science is worth something.”

The quote above comes from an informative and entertaining talk that John Graham-Cumming gave at OSCON in 2013, in which he points out that there hasn’t been much new in computing since 1983, what with wireless networking first implemented in 1971, markup languages in the 1960s, and, as pictured above, “hypertext with clickable links, 1967″.

Because progress has largely consisted of performance and interface improvements, it is comforting to know that applying yourself to the knowledge of computing is nearly as vital and timeless a pursuit as math and literacy. Fittingly, I saw this video after someone linked to it on Hacker News, in response to a 1964 Atlantic article I linked to: Martin Greenberger’s “Computers of Tomorrow”.

In his 50-year-old essay, Greenberger effectively predicts, the Internet, net neutrality, cloud computing, and the automation of the New York Stock Exchange. But the best line is the essay’s last line, which aligns with Graham-Cumming’s optimism about human knowledge in computing:

By 2000 AD man should have a much better comprehension of himself and his system, not because he will be innately any smarter than he is today, but because he will have learned to use imaginatively the most powerful amplifier of intelligence yet devised.

Graham-Cumming’s talk is available on SlideShare too.

Preparing eggs and programming

As an egg fan, I loved this Times dining article about a “tasting expedition” of the high- and low-brow egg dishes in New York. As a programmer, there were two passages that stuck out to me about the nature of skill, complexity, and genius behind cooking (and programming):

“In the French Laundry book, no one step is very difficult,” [author Michael Ruhlman] said. “There are just so many that it takes technique to its farthest reaches.” For instance, Mr. Keller insists that fava beans be peeled before cooking. “If you’re good, it takes 20 seconds per bean,” Mr. Ruhlman said. “Someone in his kitchen put a batch of them in the water once it lost its boil. Thomas [Keller] said, ‘Get rid of those.’ That guy didn’t last.”

This next passage comes after the Times writer and Ruhlman visit Aldea in the Flatiron district to try George Mendes’ “signature Knoll Krest Farm Egg with bacalao (salt cod), olive and potato.”

After we left, I expressed surprise that so much effort went into a dish billed on the menu as a “snack.” Mr. Ruhlman nodded. “Working as a chef can be mind-numbingly boring,” he said. “The reason dishes are so good is not because someone is a genius, but because he or she has done it a thousand times. They are looking to keep their minds active and energetic.”

I couldn’t describe programming better myself: no one line is difficult, its the order and arrangement of thousands of steps that make a useful program. And you don’t have to be a genius, but because programming inherently involves repetitive processes, you have to keep your mind alive, and be continuously observant and critical of the patterns you come across.

Fran Allen and the social relevance of computer science

If you haven’t read it yet, Peter Seibel’s Coders at Work (2009), is one of the best books about computer programming that doesn’t have actual code in it. It distills “nearly eighty hours of conversations with fifteen all-time great programmers and computer scientists,” with equal parts given to fascinating technical minutiae (including the respondents’ best/worst bug hunting stories) and to learning how these coders came to think the way they do.

So in a book full of interviews worth reading, it’s not quite accurate to say that Fran Allen stands out. It’s better to say that Allen is different; as a Turing Award recipient for her “pioneering contributions to the theory and practice of optimizing compiler techniques,” Allen spends much of her interview arguing that compiler optimization is woefully unstudied. Allen even argues that the popular adoption of C was a step backwards for computer science, which is kind of an alien concept for those of us today who almost exclusively study and use high-level languages.

Allen is also different in that she’s the only woman in Seibel’s book, and understandably, she has a few thoughts about their place in computer science. The summary of it is that she’s not at all optimistic about the “50/50 by 2020″ initiative (the goal to have women make up half of computer science degree earners by 2020). And the problem, Allen (who was a math teacher herself) is not in the curriculum:

I feel it’s our problem to solve. It’s not telling the educators to change their training; we in the field have to make it more appealing.

What I found particularly insightful in Allen’s interview with Seibel is that it’s not just about the need for more role models, because the current lack of women programmers is going to place a limit on that. In Allen’s opinion, girls have shown an equal aptitude for science, especially in medicine, biology, and ecology. So she suspects that the problem is with how _limited_ computer science can appear as a profession.

At my little high school in Croton, New York, we had a Westinghouse person nationally come in fifth. And they have a nice science program. Six of the seven people in it this year at the senior level are women doing amazing pieces of individual science.

What’s happening with those women is that they’re going into socially relevant fields. Computer science could be extremely socially relevant, but they’re going into earth sciences, biological sciences, medicine. Medicine is going to be 50/50 very soon. A lot of fields have belied that theory, but we [in computer science] haven’t.

I don’t necessarily think this perception that programming doesn’t seem to have a purpose behind obsessively sitting in front of a computer all day is exclusive to women. Even for those who’ve pursued a degree in computer science, it’s not clear how programming has relevance that is not an end to itself.

Check out this 2008 Slashdot thread, in which a recent computer science undergrad asks for suggestions of “Non-Programming Jobs for a Computer Science Major?” because he can’t think of ways to use computational thinking that doesn’t directly involve code. Or more recently, this screed by a NYU journalism professor, who sees coding as a trend du jour, little more than a pointless struggle to learn more code before a new language becomes hot and makes you obsolete.

I can’t claim to have insight myself, because when I left college with a computer engineering degree, I had no idea how to use it except to be a computer engineer, which I didn’t want to be, so I ditched it entirely at my first journalism job. Years later, I’ve slowly learned how to use programming to, well, practice journalism’s core function of interpreting and disseminating information. However, I attribute this to how much our world has become digitized with far fewer bottlenecks in applying computational thinking. So now it seems much more obvious that computer science can be as directly relevant to general society as medicine and ecology.

Non-scientists often assume that all scientists, and similarly left-brained people, can equally grok the concepts of programming. But this is as wrong an assumption as thinking that any programmer can easily pass the MCATs. Within the field of biological research, for example, there’s a difference of roles for biologists who can program and those who cannot.

The two fields of research are described as “wet-lab” and “dry-lab” work. In a recent issue of Nature, Roberta Kwok writes about how “biologists frustrated with wet-lab work can find rewards in a move to computational research“:

During her master’s programme in genetics from 2005 to 2008, Sarah Hird dreaded going into the lab. She was studying subspecies of red-tailed chipmunks and had become discouraged and frustrated by the uncertainties of molecular-biology experiments. She spent six weeks trying to amplify repetitive sequences in chipmunk DNA as part of an experiment to identify genetic differences between populations — but to no avail. Hird tried replacing reagents, switching to a different machine for running the polymerase chain reaction and decontaminating the sample-preparation area. Nothing worked. And then, for reasons that she never quite deciphered, the technique suddenly started working again.

By the end of her master’s, Hird had come to dislike working in a wet lab, and she decided not to apply for PhD programmes.

About six months after finishing her master’s degree, while working as a part-time technician at Louisiana State University in Baton Rouge, she discovered a better direction. The lab’s principal investigator had suggested that she learn a computer-programming language so that she could help with a simulation project. Hird, who had never programmed before, taught herself the language using a book and online tutorials, and quickly became engrossed.

“Once I started, it was like an addiction,” she says. She enjoyed developing algorithms, and she found the software-debugging process less frustrating than troubleshooting wet-lab problems. The work felt more under her control.

Later in her article, Kwok interviews a German biologist at the Max Planck Institute who offers this insight:

He notes that newcomers may stay more motivated if they can apply computational skills to real scientific problems rather than to the ‘toy’ exercises in a computer-science class. For example, a researcher who works with many image files could write a program to automatically perform processing steps, such as contrast enhancement, on thousands of images.

If young students – male or female – are turned off at the prospect of learning computer science, it’s not enough to just have role models. The usefulness of computational thinking are far too broad for just that. Why should only dedicated computer scientists benefit from the techniques and theory of programming, as if the importance of writing should only be left up to published writers?

Ideally the importance of computational thinking would be part of the general curriculum, and not just as a separate programming class, but integrated in the same way that you must read and write and perform calculations in your biology, physics, and economics class. But while we wait for that change to come about eventually – if at all – those of us in the field can help to increase diversity in computer science by increasing the visibility of computer science’s diverse impacts and applications.

After Allen complains about computer science’s too-narrow scope, Seibel simply asks, “So why do you like it?” She responds:

Part of it is that there’s the potential for new ideas every day. One sees something, and says, “Oh, that’s new.” The whole field gets refreshed very frequently. It’s very exciting to think about what the potential for all of this is and the impacts it can have.

Isaac Asimov made a statement about the future of computers-I don’t know whether it’s right or not-that what they’ll do is make every one of us much more creative. Computing would launch the age of creativity. One sees some of that happening-particularly in media. Kids are doing things they weren’t able to do before-make movies, create pictures.

We tend to think of creativity as a special gift that a person has, just as being able to read and write were special gifts in the Dark Ages-things only a few people were able to do. I found the idea that computers are the enablers of creativity very inspiring.

There’s a lot of other great stuff and stories in Allen’s interview, including her attempt to teach Fortran to IBM scientists, the need for compiler optimization in the age of petaflop-speed computing, and how other women in the industry, including one “who essentially was the inventor of multiprogramming”, have been robbed of their achievements. Read the rest of Allen’s interview, and 14 other equally great interviews with coders, in Seibel’s book, Coders at Work.

“Better know a developer” at AAJA 2013

I had the privilege of being on a panel with the New York Times’s Chase Davis and former YouTube designer Hong Qu at this year’s Asian American Journalists Association convention

The panel was titled, “Better Know a Developer” and my part of it was to discuss how non-programming journalists can work best with programmers.

You can see the slides here. The advice boils down to: Don’t believe in magic. Think about how you would do it yourself. And use a spreadsheet.

Ruby MiniTest Cheat Sheet, Unit and Spec reference

Ruby’s standard testing suite, MiniTest, is in dire need of a quick-and-handy reference for its syntax. I’ve put one together comparing the unit syntax (assert/refute) and the spec syntax (must/wont). You can see it below or download the Google Spreadsheet I made and roll your own sheet (HTML, XLS)

Test syntax

Unit Spec Arguments Examples

assert_empty
refute_empty

must_be_empty
wont_be_empty

obj, msg=nil

assert_empty []
refute_empty [1,2,3]
[].must_be_empty
[1,2,3].wont_be_empty


assert_equal
refute_equal

must_equal
wont_equal

exp, act, msg=nil

assert_equal 2, 2
refute_equal 2,1
2.must_equal 2
2.wont_equal 1


assert_in_delta
refute_in_delta

must_be_within_delta
wont_be_within_delta

exp, act, dlt=0.001, msg=nil

assert_in_delta 2012, 2010, 2
refute_in_delta 2012, 3012, 2
2012.must_be_within_delta 2010, 2
2012.wont_be_within_delta 3012, 2


must_be_close_to
wont_be_close_to

act, dlt=0.001, msg=nil

2012.must_be_close_to 2010, 2
2012.wont_be_close_to 3012, 2


assert_in_epsilon
refute_in_epsilon

must_be_within_epsilon
wont_be_within_epsilon

a, b, eps=0.001, msg=nil

assert_in_epsilon 1.0, 1.02, 0.05
refute_in_epsilon 1.0, 1.06, 0.05
1.0.must_be_within_epsilon 1.02, 0.05
1.0.wont_be_within_epsilon 1.06, 0.05


assert_includes
refute_includes

must_include
wont_include

collection, obj, msg=nil

assert_includes [1, 2], 2
refute_includes [1, 2], 3
[1, 2].must_include 2
[1, 2].wont_include 3


assert_instance_of
refute_instance_of

must_be_instance_of
wont_be_instance_of

klass, obj, msg=nil

assert_instance_of String, "bar"
refute_instance_of String, 100
"bar".must_be_instance_of String
100.wont_be_instance_of String


assert_kind_of
refute_kind_of

must_be_kind_of
wont_be_kind_of

klass, obj, msg=nil

assert_kind_of Numeric, 100
refute_kind_of Numeric, "bar"
100.must_be_kind_of Numeric
"bar".wont_be_kind_of Numeric


assert_match
refute_match

must_match
wont_match

exp, act, msg=nil

assert_match /\d/, "42"
refute_match /\d/, "foo"
"42".must_match /\d/
"foo".wont_match /\d/


assert_nil
refute_nil

must_be_nil
wont_be_nil

obj, msg=nil

assert_nil [].first
refute_nil [1].first
[].first.must_be_nil
[1].first.wont_be_nil


assert_operator
refute_operator

must_be
wont_be

o1, op, o2, msg=nil

assert_operator 1, :<, 2
refute_operator 1, :>, 2
1.must_be :<, 2
1.wont_be :>, 2


assert_output

must_output

stdout = nil, stderr = nil

assert_output("hi\n"){ puts "hi" }
Proc.new{puts "hi"}.must_output "hi\n"


assert_raises

must_raise

*exp

assert_raises(NoMethodError){ nil! }
Proc.new{nil!}.must_raise NoMethodError


assert_respond_to
refute_respond_to

must_respond_to
wont_respond_to

obj, meth, msg=nil

assert_respond_to "foo",:empty?
refute_respond_to 100, :empty?
"foo".must_respond_to :empty?
100.wont_respond_to :empty?


assert_same
refute_same

must_be_same_as
wont_be_same_as

exp, act, msg=nil

assert_same :foo, :foo
refute_same ['foo'], ['foo']
:foo.must_be_same_as :foo
['foo'].wont_be_same_as ['foo']


assert_silent

must_be_silent

-

assert_silent{ 1 + 1 }
Proc.new{ 1 + 1}.must_be_silent


assert_throws

must_throw

sym, msg=nil

assert_throws(:up){ throw :up}
Proc.new{throw :up}.must_throw :up

Test Setup

Unit Spec
setup() before(type = nil, &block)
teardown() after(type = nil, &block)

Mocks

via MiniTest::Mock

  • expect(name, retval, args=[]) – Expect that method name is called, optionally with args or a blk, and returns retval.

    Example:

    @mock.expect(:meaning_of_life, 42)
    @mock.meaning_of_life # => 42
    
    @mock.expect(:do_something_with, true, [some_obj, true])
    @mock.do_something_with(some_obj, true) # => true
    
    @mock.expect(:do_something_else, true) do |a1, a2|
       a1 == "buggs" && a2 == :bunny
    end
    
  • verify – Verify that all methods were called as expected. Raises MockExpectationError if the mock object was not called as expected.

Other syntax

  • def flunk(msg=nil)
  • def pass(msg=nil)
  • def skip(msg=nil, bt=caller)
  • def it (desc="anonymous", &block)
  • i_suck_and_my_tests_are_order_dependent!() – Call this at the top of your tests when you absolutely positively need to have ordered tests. In doing so, you’re admitting that you suck and your tests are weak. (TestCase public class method)
  • parallelize_me!() – Call this at the top of your tests when you want to run your tests in parallel. In doing so, you’re admitting that you rule and your tests are awesome.
  • make_my_diffs_pretty!() – Make diffs for this TestCase use pretty_inspect so that diff in assert_equal can be more details. NOTE: this is much slower than the regular inspect but much more usable for complex objects.

Tutorials

Reference

Learning to code is learning digital literacy

One of my least fulfilling reporting experiences involved a meeting between Ukrainian community leaders and local school district officials. There had been a brawl at a middle school and the Ukrainians, being the minority in that area, suspected their kids were racially targeted.

Anyway, the difficulty was that they didn’t speak English very well and many others at the meeting – the district officials and me, included – didn’t speak Ukrainian. So we had translators. The meeting had taken some time to plan and parents were nervous about their children’s safety, so everyone hoped for some kind of resolution. But after the meeting kicked off with long speeches by both sides – made doubly long by having to listen to the sterile translation that came afterwards, it seemed obvious that even a meaningful, impassioned debate was futile, nevermind reaching a satisfying resolution. The district officials told me afterwards, of course, that this was a good use of time, while the Ukrainian community members didn’t think anything came out of it.

When you think about it in theory, having a translator seems like a “good enough” solution. But when you’re sitting there, even though you and your other side share the same thought processes and general human understanding (independent of spoken language), a translator cannot:

  • Connect statements to the body language and emotion nuance of the original speaker
  • Mediate the kind of free-flowing give-and-take that is necessary for fulfilling debates
  • Make up for the time delay between the original statement and spoken translation

And of course, there’s the problem of knowing if the translation is even correct, as we know from Suntory Time:

This is a roundabout way to argue for the need for digital literacy, i.e. the Learn to Code movement. Code.org launched with big fanfare this week, featuring just about, well, everyone – Bill Gates, Bill Clinton, will.i.am(?), and everyone’s favorite “computer geek trapped in an NBA player’s body,” Chris Bosh – telling us why we all needed to program.

And of course there’s the backlash: why does everyone need to program? Jeff Atwood wrote a well-read essay about it last year (“Please Don’t Learn to Code”), Coders Lexicon posted an essay last night:

The fact is, the whole world should not learn to code anymore than all of us should learn to be a space shuttle engine designer or a lawyer. While I understand the need for more people to get interested in computer science and to fill our ranks with people who can meet the skills of the 21st century, going out there and telling everyone that coding is as easy as putting a bit of syntax down into an IDE and hitting compile is not the way. We need passionate people who are creative and want to learn to DESIGN software in addition to coding.

There are a couple problems with this argument. The first is making programming sound as time-intensive and domain-specific as space shuttle engineering or law. The second is making programming sound like an all-for-nothing proposition: don’t get into programming unless you can do it really well, or else your program will kill people.

(Not that that hasn’t happened…but the more relevant statement would be: don’t code people-killing programs unless you know what you’re doing.)

What would schools be like if we warned children to not learn how to read or write because ideas in books and newspapers have frequently made people’s lives miserable? Or, how about, don’t learn math, or the metric system, because even Lockheed Martin and NASA engineers can screw up and lose a $125 million Mars satellite?

The reason to learn how code works is to have a direct involvement in the technology and data we rely on. Just as we learn to read and write at a basic level so that our time isn’t spent waiting for someone – who may be much better and specialized at it – to read the news or road signs or write love letters for us.

In the newspaper world, it’s assumed that every reporter knows how to write. That’s not even remotely close to reality. In the golden, fat days, a reporter could expect three to four layers of editors to go through his/her work before it got printed (today, readers just have to deal with a lot more typos and errors). But even then, you couldn’t get by as being a fantastic reporter and storyteller who was also illiterate. Think about it: you don’t need to know how to read or write to go around asking questions or dictating what you remember hearing of a blockbuster scoop. But imagine the editors who get to spend their day transcribing and retelling you what they’re actually putting down on paper.

More importantly, think about what you, the illiterate reporter loses out on. You don’t get hands-on involvement on how the facts and quotes are arranged, because you have to wait for it to be done (and then read back to you) before you have a say. And there’s a lot that you just don’t know that you don’t know. Like how a headline, by its size, color, and length, can change everything about how a reader perceives your story. Or how a story that’s in print has a much different, detached effect than what you’ve experienced with the spoken word (similarly, writing copy for broadcast is its own separate art compared to newspaper writing).

But you don’t know that because you know that there’s a story, of some sort, is on the paper and people seem to be reading it so what’s the big deal?

This is the gap of understanding for anyone who isn’t digitally literate. It’s not about knowing how to build a website or assemble your own computer. It’s not even that you need to know a lot, just as you don’t have to be able to recite the entire works of Shakespeare to justify traditional literacy. With digital literacy, it’s about knowing enough to know what’s possible with computers, data, and technology, to control more of how it affects you. Or, as Donald Rumsfeld puts it, having fewer unknown unknowns.

How much is “knowing enough”? There’s a lot that’s been written about it in better places, Code.org probably isn’t a bad place to start. Besides learning to program, which is not the only thing to do or even the ultimate end goal, the two practical steps I can suggest are:


Also this week, Jeremy Ashkenas announced the first release of Literate CoffeeScript, a format that combines CoffeeScript with the easy-to-read-and-turn-to-HTML Markdown syntax. If you’re an experienced coder, this seems like nothing more than block-extended comments that are found in any language.

However, in practice, the power is in the details. I know that I avoid documenting code – even though we spend far more time re-reading code than actually writing it – because I just have never committed to some kind of system of making code comments easy to read (should I use asterisks to emphasize something? Or all caps? Or double space things?). With Markdown, you have the power of HTML to document your code. And more importantly, it’s intuitive and easy, which makes you more likely to write comments in the first place.

Check out Ashkenas’s demonstration: his own blog is powered by a “quick-and-dirty” blogging engine he wrote in Literate CoffeeScript. This is what the generated code and documentation look like on Github.

On a related note: if you’re a programmer who doesn’t understand why non-programmers need to be digitally literate: imagine having to write code in another language. Not programming language, but human language, such as in Chinese characters, without actually knowing that language. Sure, you could conceive an amazing program. But writing it yourself, with the help of an interpreter who types out what you say? Sounds even more laborious and mind-numbing than the worst programming project imaginable.

The Bastards Book of Regular Expressions

Well, I’m not quite done with my promised revision of the Bastards Book of Ruby. Or of Photography…but I’ve decided, oh what the hell, I should write something about regular expressions.

Actually, there is some method to this madness. As part of the process of updating the Ruby book, I realized I needed to spin off some of the larger, non-Ruby related topics. So, at some point, there will be mini-books about HTML and SQL. Regular expressions, as I keep telling people who want to deal with data, are incredibly important, even if you think you never want to learn programming. Hopefully this mini-book will make a strong case for learning regexes.

The second motive is I’ve been looking for a html/text-to-pdf workflow. So this is my experiment with Leanpub, which promises to turn a set of Markdown files into PDF/mobi/etc, while handling the selling process. I don’t expect to sell any copies of the BBoRegexes, but I hope to get a lot of insight about the mechanics behind Leanpub and if it presents a viable way for me to publish my other projects.

Check out the Leanpub homepage for my tentatively tiled book, The Bastards Book of Regular Expressions. Or, you could just read the mega-chapter on regexes in my Ruby book.

Tools to get to the precipice of programming

I’m not a master programmer but it’s been so long since I’ve done my first “Hello World” that I don’t remember how people first grok the point of programming (for me, it was to get a good grade in programming class).

So when teaching non-programmers the value of code, I’m hoping there’s an even friendlier, shallower first step than the many zero-to-coder references out there, including Zed Shaw’s excellent Learn Code the Hard Way series.

Not only should this first step be “easy”, but nearly ubiquitous, free-to-use, and most importantly: has immediate benefit for both beginners and experts. The point here is not to teach coding, per se, but to get them to a precipice of great things. So that when they stand at the edge, they can at least see something to program towards, even if the end goal is simply labor-aversion, i.e. “I don’t want to copy-and-paste 100 web page tables by hand.”

Here are a few tools I’ve tried:

Inspecting a cat photo

1. Using the web inspector – I’ve never seen the point of taking an indepth HTML class (unless you want to become a full-time web designer/developer, and even then…) because so many non-techies even grasp that webpages are (largely) text, external multimedia assets (such as photos and videos), and the text that describes where those assets come from. To them, editing a webpage is as arcane as compiling a binary.

Nothing breaks that illusion better than the web inspector. Its basic element-inspector and network panel illustrates immediately the “magic” behind the web. As a bonus, with regular, casual use, the inspector can teach you the HTML and CSS vocabulary if you do intend to be a developer. It’s hard to think of another tool that is as ubiquitous and easy to use as the web inspector, yet as immensely useful to beginner and expert alike.

Its uses are immediate, especially for anyone who’s ever wanted to download a video from YouTube. To journalists, I’ve taught how this simple-to-use tool has helped me in my investigative reporting when I needed to find an XML file that was obfuscated through a Flash object.

In a hands-on class I taught, a student asked “So how do I get that XML into Excel?” – and that’s when you can begin to describe the joy of a basic for loop.

Here’s an overview of a hands-on web session I taught at NICAR12. Here’s the guide I wrote for my ProPublica project. And here’s the first of a multi-part introduction to the web inspector.

Refine WH Visitors

2. Google Refine – Refine is a spreadsheet-like software that allows you to easily explore and clean data: the most common example is resolving varied entries (“JOHN F KENNEDY”, “John F. Kennedy”, “Jack Kennedy”, “John Fitzgerald Kennedy”) into one (“John F. Kennedy”). Given that so many great investigative stories and data projects start with “How many times does this person’s name appear in this messy database?”, its uses are immediate and obvious.

Refine is an open-source tool that works out of the web browser and yet is such a powerful point-and-click interface that I’m happy to take my data out of my scripted workflow in order to use Refine’s features on it. Not only can you use regular expressions to help filter/clean your data, you can write full-on scripts, making Refine a pretty good environment to show some basic concepts of code (such as variables and functions).

I wrote a guide showing how Refine was essential for one of my investigative data projects. Refine’s official video tutorial is also a great place to start.

3. Regular Expressions – maybe it was because my own comsci curriculum skipped regexes, leaving me to figure out their worth much much later than I should have. But I really try to push learning regexes every time the following questions are asked:

  • In Excel, how do I split this “last_name, first_name middle_name” column into three different columns?
  • In Excel, how do I get all these date formats to be the same?
  • In Excel, how do I extract the zip code from this address field?

…and so on. The use of LEFT, TRIM, RIGHT, etc. functions seem to always be much more convoluted than the regex needed to do this kind of simple parsing. And while regexes aren’t the answer to every parsing problem, they sure deliver a lot of return for the investment (which can start from a simple cheat sheet next to your computer).

Regular-expressions.info has always been one of my favorite references. Zed Shaw is also writing a book on regexes. I’ve also written a lengthy tutorial on regexes.

So none of these tools or concepts involve programming…yet. But they’re immediately useful on their own, opening new doors to useful data just enough to interest beginners into going further. In that sense, I think these tools make for an inviting introduction towards learning programming.

Analyzing the U.S. Senate Smiles: A Ruby tutorial with the Face.com and NYT Congress APIs

U.S. Senate Smiles, ranked by Face.com face-detection algorithm

The smiles of your U.S. Senate from most smiley-est to least, according to Face.com's algorithm

Who’s got the biggest smile among our U.S. senators? Let’s find out and exercise our Ruby coding and civic skills. This article consists of a quick coding strategy overview (from the full code is at my Github). Or jump here to see the results, as sorted by Face’s algorithm.

About this tutorial

This is a Ruby coding lesson to demonstrate the basic features of Face.com’s face-detection API for a superficial use case. We’ll mash with the New York Times Congress API and data from the Sunlight Foundation.

The code comprehension is at a relatively simple level and is intended for learning programmers who are comfortable with RubyGems, hashes, loops and variables.

If you’re a non-programmer: The use case may be a bit silly here but I hope you can view it from an abstract-big-picture level and see the use of programming to: 1) Make quick work of menial work and 2) create and analyze datapoints where none existed before.

On to the lesson!

The problem with portraits

For the SOPA Opera app I built a few weeks ago, I wanted to use the Congressional mugshots to illustrate
the front page. The Sunlight Foundation provides a convenient zip file download of every sitting Congressmember’s face. The problem is that the portraits were a bit inconsistent in composition (and quality). For example, here’s a usable, classic head-and-shoulders portrait of Senator Rand Paul:

Sen. Rand Paul

But some of the portraits don’t have quite that face-to-photo ratio; Here’s Sen. Jeanne Shaheen’s portrait:

Sen. Jeanne Shaheen

It’s not a terrible Congressional portrait. It’s just out of proportion compared to Sen. Paul’s. What we need is a closeup crop of Sen. Shaheen’s face:

Sen. Jeanne Shaheen's face cropped

How do we do that for a given set of dozens (even hundreds) of portraits that doesn’t involve manually opening each image and cropping the heads in a non-carpal-tunnel-syndrome-inducing manner?

Easy face detection with Face.com’s Developer API

Face-detection is done using an algorithm that scans an image and looks for shapes proportional to the average human face and containing such inner shapes as eyes, a nose and mouth in the expected places. It’s not as if the algorithm has to have an idea of what an eye looks like exactly; two light-ish shapes about halfway down what looks like a head might be good enough.

You could write your own image-analyzer to do this, but we just want to crop faces right now. Luckily, Face.com provides a generous API that when you send it an image, it will send you back a JSON file in this format:

{
    "photos": [{
        "url": "http:\/\/face.com\/images\/ph\/12f6926d3e909b88294ceade2b668bf5.jpg",
        "pid": "F@e9a7cd9f2a52954b84ab24beace23046_1243fff1a01078f7c339ce8c1eecba44",
        "width": 200,
        "height": 250,
        "tags": [{
            "tid": "TEMP_F@e9a7cd9f2a52954b84ab24beace23046_1243fff1a01078f7c339ce8c1eecba44_46.00_52.40_0_0",
            "recognizable": true,
            "threshold": null,
            "uids": [],
            "gid": null,
            "label": "",
            "confirmed": false,
            "manual": false,
            "tagger_id": null,
            "width": 43,
            "height": 34.4,
            "center": {
                "x": 46,
                "y": 52.4
            },
            "eye_left": {
                "x": 35.66,
                "y": 44.91
            },
            "eye_right": {
                "x": 58.65,
                "y": 43.77
            },
            "mouth_left": {
                "x": 37.76,
                "y": 61.83
            },
            "mouth_center": {
                "x": 49.35,
                "y": 62.79
            },
            "mouth_right": {
                "x": 57.69,
                "y": 59.75
            },
            "nose": {
                "x": 51.58,
                "y": 56.15
            },
            "ear_left": null,
            "ear_right": null,
            "chin": null,
            "yaw": 22.37,
            "roll": -3.55,
            "pitch": -8.23,
            "attributes": {
                "glasses": {
                    "value": "false",
                    "confidence": 16
                },
                "smiling": {
                    "value": "true",
                    "confidence": 92
                },
                "face": {
                    "value": "true",
                    "confidence": 79
                },
                "gender": {
                    "value": "male",
                    "confidence": 50
                },
                "mood": {
                    "value": "happy",
                    "confidence": 75
                },
                "lips": {
                    "value": "parted",
                    "confidence": 39
                }
            }
        }]
    }],
    "status": "success",
    "usage": {
        "used": 42,
        "remaining": 4958,
        "limit": 5000,
        "reset_time_text": "Tue, 24 Jan 2012 05:23:21 +0000",
        "reset_time": 1327382601
    }
}		
	

The JSON includes an array of photos (if you sent more than one to be analyzed) and then an array of tags – one tag for each detected face. The important part for cropping purposes are the attributes dealing with height, width, and center:

		"width": 43,
      "height": 34.4,
      "center": {
          "x": 46,
          "y": 52.4
      },	
	

These numbers represent percentage values from 0-100. So the width of the face is 43% of the image’s total width. If the image is 200 pixels wide, then the face spans 86 pixels.

Using your favorite HTTP-calling library (I like the RestClient gem), you can simply ping the Face.com API’s detect feature to get these coordinates for any image you please.

Image manipulation with RMagick

So how do we do the actual cropping? By using the RMagick (a Ruby wrapper for the ImageMagick graphics library) gem, which lets us do crops with commands as simple as these:

img = Magick::Image.read("somefile.jpg")[0]

# crop a 100x100 image starting from the top left corner
img = img.crop(0,0,100,100) 

The RMagick documentation page is a great place to start. I’ve also written an image-manipulation chapter for The Bastards Book of Ruby.

The Process

The code for all of this is stored at my Github account.

I’ve divided this into two parts/scripts. You could combine it into one script but to make things easier to comprehend (and to lessen the amount of best-practices error-handling code for me to write), I divide it into a “fetch” and “process” stage.

In the fetch.rb stage, we essentially download all the remote files we need to do our work:

  • Download a zip file of images from Sunlight Labs and unzip it at the command line
  • Use NYT’s Congress API to get latest list of Senators
  • Use Face.com API to download face-coordinates as JSON files

In the process.rb stage, we use RMagick to crop the photos based from the metadata we downloaded from the NYT and Face.com. As a bonus, I’ve thrown in a script to programmatically create a crude webpage that ranks the Congressmembers’ faces by smile, glasses-wearingness, and androgenicity. How do I do this? The Face.com API handily provides these numbers in its response:

	"attributes": {
            "glasses": {
                "value": "false",
                "confidence": 16
            },
            "smiling": {
                "value": "true",
                "confidence": 92
            },
            "face": {
                "value": "true",
                "confidence": 79
            },
            "gender": {
                "value": "male",
                "confidence": 50
            },
            "mood": {
                "value": "happy",
                "confidence": 75
            },
            "lips": {
                "value": "parted",
                "confidence": 39
            }
        }
	

I’m not going to reprint the code from my Github account, you can see the scripts yourself there:

https://github.com/dannguyen/Congressmiles

First things first: sign up for API keys at the NYT and Face.com

I also use the following gems:

The Results

Here’s what you should see after you run the process.rb script (all judgments made by Face.com’s algorithm…I don’t think everyone will agree with about the quality of the smiles):


10 Biggest Smiles

Sen. Wicker (R-MS)
Sen. Wicker (R-MS) [100]
Sen. Reid (D-NV)
Sen. Reid (D-NV) [100]
Sen. Shaheen (D-NH)
Sen. Shaheen (D-NH) [99]
Sen. Hagan (D-NC)
Sen. Hagan (D-NC) [99]
Sen. Snowe (R-ME)
Sen. Snowe (R-ME) [98]
Sen. Kyl (R-AZ)
Sen. Kyl (R-AZ) [98]
Sen. Klobuchar (D-MN)
Sen. Klobuchar (D-MN) [98]
Sen. Crapo (R-ID)
Sen. Crapo (R-ID) [98]
Sen. Johanns (R-NE)
Sen. Johanns (R-NE) [98]
Sen. Hutchison (R-TX)
Sen. Hutchison (R-TX) [98]


10 Most Ambiguous Smiles

Sen. Inouye (D-HI)
Sen. Inouye (D-HI) [40]
Sen. Kohl (D-WI)
Sen. Kohl (D-WI) [43]
Sen. McCain (R-AZ)
Sen. McCain (R-AZ) [47]
Sen. Durbin (D-IL)
Sen. Durbin (D-IL) [49]
Sen. Roberts (R-KS)
Sen. Roberts (R-KS) [50]
Sen. Whitehouse (D-RI)
Sen. Whitehouse (D-RI) [52]
Sen. Hoeven (R-ND)
Sen. Hoeven (R-ND) [54]
Sen. Alexander (R-TN)
Sen. Alexander (R-TN) [54]
Sen. Shelby (R-AL)
Sen. Shelby (R-AL) [62]
Sen. Johnson (D-SD)
Sen. Johnson (D-SD) [63]

The Non-Smilers

Sen. Bingaman (D-NM)
Sen. Bingaman (D-NM) [79]
Sen. Coons (D-DE)
Sen. Coons (D-DE) [77]
Sen. Burr (R-NC)
Sen. Burr (R-NC) [72]
Sen. Hatch (R-UT)
Sen. Hatch (R-UT) [72]
Sen. Reed (D-RI)
Sen. Reed (D-RI) [71]
Sen. Paul (R-KY)
Sen. Paul (R-KY) [71]
Sen. Lieberman (I-CT)
Sen. Lieberman (I-CT) [59]
Sen. Bennet (D-CO)
Sen. Bennet (D-CO) [55]
Sen. Udall (D-NM)
Sen. Udall (D-NM) [51]
Sen. Levin (D-MI)
Sen. Levin (D-MI) [50]
Sen. Boozman (R-AR)
Sen. Boozman (R-AR) [48]
Sen. Isakson (R-GA)
Sen. Isakson (R-GA) [41]
Sen. Franken (D-MN)
Sen. Franken (D-MN) [37]


10 Most Bespectacled Senators

Sen. Franken (D-MN)
Sen. Franken (D-MN) [99]
Sen. Sanders (I-VT)
Sen. Sanders (I-VT) [98]
Sen. McConnell (R-KY)
Sen. McConnell (R-KY) [98]
Sen. Grassley (R-IA)
Sen. Grassley (R-IA) [96]
Sen. Coburn (R-OK)
Sen. Coburn (R-OK) [93]
Sen. Mikulski (D-MD)
Sen. Mikulski (D-MD) [93]
Sen. Roberts (R-KS)
Sen. Roberts (R-KS) [93]
Sen. Inouye (D-HI)
Sen. Inouye (D-HI) [91]
Sen. Akaka (D-HI)
Sen. Akaka (D-HI) [88]
Sen. Conrad (D-ND)
Sen. Conrad (D-ND) [86]


10 Most Masculine-Featured Senators

Sen. Bingaman (D-NM)
Sen. Bingaman (D-NM) [94]
Sen. Boozman (R-AR)
Sen. Boozman (R-AR) [92]
Sen. Bennet (D-CO)
Sen. Bennet (D-CO) [92]
Sen. McConnell (R-KY)
Sen. McConnell (R-KY) [91]
Sen. Nelson (D-FL)
Sen. Nelson (D-FL) [91]
Sen. Rockefeller IV (D-WV)
Sen. Rockefeller IV (D-WV) [90]
Sen. Carper (D-DE)
Sen. Carper (D-DE) [90]
Sen. Casey (D-PA)
Sen. Casey (D-PA) [90]
Sen. Blunt (R-MO)
Sen. Blunt (R-MO) [89]
Sen. Toomey (R-PA)
Sen. Toomey (R-PA) [88]


10 Most Feminine-Featured Senators

Sen. McCaskill (D-MO)
Sen. McCaskill (D-MO) [95]
Sen. Boxer (D-CA)
Sen. Boxer (D-CA) [93]
Sen. Shaheen (D-NH)
Sen. Shaheen (D-NH) [93]
Sen. Gillibrand (D-NY)
Sen. Gillibrand (D-NY) [92]
Sen. Hutchison (R-TX)
Sen. Hutchison (R-TX) [91]
Sen. Collins (R-ME)
Sen. Collins (R-ME) [90]
Sen. Stabenow (D-MI)
Sen. Stabenow (D-MI) [86]
Sen. Hagan (D-NC)
Sen. Hagan (D-NC) [81]
Sen. Ayotte (R-NH)
Sen. Ayotte (R-NH) [79]
Sen. Klobuchar (D-MN)
Sen. Klobuchar (D-MN) [79]

For the partisan data-geeks, here’s some faux analysis with averages:

Party Smiles Non-smiles Avg. Smile Confidence
D 44 7 85
R 42 5 86
I 1 1 85

There you have it, the Republicans are the smiley-est party of them all.

Further discussion

This is an exercise to show off the very cool Face.com API and to demonstrate the value of a little programming knowledge. Writing the script doesn’t take too long, though I spent more time than I liked on idiotic bugs of my own making. But this was way preferable than cropping photos by hand. And once I had the gist of things, I not only had a set of cropped files, I had the ability to whip up any kind of visualization I needed with just a minute’s more work.

And it wasn’t just face-detection that I was using, but face-detection in combination with deep data-sources like the Times’s Congress API and the Sunlight Foundation. For the SOPA Opera app, it didn’t take long at all to populate the site with legislator data and faces. (I didn’t get around to using this face-detection technique to clean up the images, but hey, I get lazy too…)

Please don’t judge the value of programming by my silly example here – having an easy-to-use service like Face.com API (mind the usage terms, of course) gives you a lot of great possibilities if you’re creative. Off the top of my head, I can think of a few:

  • As a photographer, I’ve accumulated thousands of photos but have been quite lazy in tagging them. I could conceivably use Face.com’s API to quickly find photos without faces for stock photo purposes. Or maybe a client needs to see male/female portraits. The Face.com API gives me an ad-hoc way to retrieve those without menial browsing.
  • Data on government hearing webcasts are hard to come by. I’m sure there’s a programmatic way to split up a video into thousands of frames. Want to know at which points Sen. Harry Reid shows up? Train Face.com’s API to recognize his face and set it loose on those still frames to find when he speaks.
  • Speaking of breaking up video…use the Face API to detect the eyes of someone being interviewed and use RMagick to detect when the eyes are closed (the pixels in those positions are different in color than the second before) to do that college-level psych experiment of correlating blinks-per-minute to truthiness.

Thanks for reading. This was a quick post and I’ll probably go back to clean it up. At some point, I’ll probably add this to the Bastards Book.