The Google death and resurrection of Amy Wilentz

Author Amy Wilentz has a fun piece about how Google listed her as “dead” in the rich snippet search result for her name. Her untimely death apparently came from her Wikipedia entry, which was, to say the least, unconventionally created:

(is there a conventional way for Wikipedia entries to come about?)

Google picked up my facts from my Wikipedia entry. My Wikipedia entry, oddly, was put up by Cousin Joel, who has a genealogy obsession, and has assembled an astounding dossier on our family, finding members of it in places as far flung as Dvinsk, Latvia, Hollywood, California, and Perth Amboy, New Jersey.

So it’s not too surprising that my original Wikipedia entry, as conceived by Joel, was — let’s be honest — more about my father (a famous New Jersey judge) than about me. Joel began the entry with my connection to my father, and immediately mentioned my father’s birthdate and the date of his death.

Google is not a subtle thief. If your name on Wikipedia is followed by a birth and death date, apparently those belong to you from that day forward, no matter whose dates they may be. Seen that way, I suppose I should just be glad that I’m not related (as far as Joel knows) to King Solomon, another judge.

The problem was probably not Google’s fault: natural language processing across the entire corpus of the web is a tricky thing. But Wilentz tackles the technical topic of search indexing from a layperson’s standpoint, which, in my opinion, makes it a particularly valuable read as she details the impenetrable process of getting Google to correct itself. I understand the technical theory (I think) behind Google’s searchbots, but I’m not sure that even I know how to get something fixed in the search results. More importantly, even if Google wanted to improve things, I don’t know how they might do so without crimping the technical workflow. Anyway, Wilentz’s anecdote is well worth reading and, as you’d expect from an author deserving of a Wikipedia entry, nicely written and entertaining.
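To make the "tricky thing" concrete, here is a toy sketch of how a naive rule could hand a relative's dates to the subject of a page. To be clear, this is not how Google does it (I have no idea what their pipeline looks like); the function name, the regex, and the Jane/John Doe sentences with their placeholder years are all invented for illustration.

```python
import re

# Hypothetical toy extractor, NOT Google's method: it simply grabs the first
# "(YYYY-YYYY)" span in a sentence and assumes it belongs to the page subject.
LIFESPAN = re.compile(r"\((\d{4})\s*[-–]\s*(\d{4})\)")


def naive_lifespan(first_sentence):
    """Return (birth_year, death_year) from the first '(YYYY-YYYY)' span, or None."""
    match = LIFESPAN.search(first_sentence)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Invented sentences with placeholder years, not real biographical data.
about_relative = "Jane Doe is the daughter of John Doe (1900-1980), a judge."
about_subject = "Jane Doe (born 1950) is a writer."

print(naive_lifespan(about_relative))  # (1900, 1980): the father's dates, pinned on Jane
print(naive_lifespan(about_subject))   # None: no lifespan found, so Jane stays alive
```

The rule never asks whose name sits next to the parentheses, which is exactly the kind of shortcut that works on most biography pages and fails on one that opens with a famous parent.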

At some point after Wilentz wrote her post, her search result began correctly listing her as alive (for now). It’s likely a result of her Wikipedia entry’s first line now giving her birth date: “Amy Wilentz (born September 1, 1954) is an American journalist and writer,” as opposed to “Amy Wilentz is an American journalist and writer.” Note/Update: this theory is wrong, as the corrected birthdate format didn’t happen until today. Matt Cutts responded to the post on Hacker News.

But who really knows the machinations behind Google’s search results? Wilentz’s fixed lifespan reminds me of this entertaining anecdote from Steven Levy’s “In the Plex” on how Google engineers fixed a vexing problem of a garden gnome that wouldn’t go away:

But one problem was so glaring that the team wasn’t comfortable releasing Froogle: when the query “running shoes” was typed in, the top result was a garden gnome sculpture that happened to be wearing sneakers. Every day engineers would try to tweak the algorithm so that it would be able to distinguish between lawn art and footwear, but the gnome kept its top position.

One day, seemingly miraculously, the gnome disappeared from the results. At a meeting, no one on the team claimed credit. Then an engineer arrived late, holding an elf with running shoes. He had bought the one-of-a kind product from the vendor, and since it was no longer for sale, it was no longer in the index. “The algorithm was now returning the right results,” says a Google engineer. “We didn’t cheat, we didn’t change anything, and we launched.”

Someday, a Google engineer may find it easier to just resurrect someone than algorithmically fix a search snippet…

Update: Google search engineer Matt Cutts responded on Hacker News. He doesn’t say how it was eventually fixed, but says that the “Feedback / More info” link really does lead to a reporting tool that gets reviewed “and that’s the fastest way to report an issue.”