Extracting Meaning from Your Twitter Stream

Part of Twitter’s allure is that it’s always on, yet you have a life to live, so no matter how often you dip into the stream you’ve always missed something interesting.

Of course, you can always find things via search, or through hashtags (those words preceded by # that are sprinkled inside some tweets as a way of aggregating the discussion around a specific topic). But neither provides an easy way to browse what’s being said around a subject, let alone draw your attention to other tweets that you might find interesting.

That’s why I was so intrigued to read in MIT’s Technology Review recently about intelligent filtering software under development at the Palo Alto Research Center, just a few miles up the road. Called Eddi (after the eddies that form around perturbations in a stream), the prototype has two distinct modes to help you overcome what some people call “information overload.”

The first tool is a recommendation engine, which ranks tweets by how interesting they are likely to be to you. To determine this, Eddi’s algorithms look at your own tweets and interactions with other Twitter users, as well as applying mathematical values to content sources, topic interest models for users, and social voting.
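To make the idea concrete, here is a minimal sketch of ranking by a weighted mix of signals. It is not Eddi's actual algorithm, which the article does not spell out: the `Tweet` fields, the `WEIGHTS` values, and the `interest` and `rank` functions are all hypothetical, and each signal is assumed to have been scaled to a number between 0 and 1 beforehand.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    text: str
    source_score: float  # assumed: how much the reader values this author (0..1)
    topic_score: float   # assumed: match with the reader's topic interests (0..1)
    vote_score: float    # assumed: social voting signal, e.g. scaled retweets (0..1)


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"source": 0.4, "topic": 0.4, "votes": 0.2}


def interest(tweet: Tweet) -> float:
    """Combine the three signals into a single interest score."""
    return (
        WEIGHTS["source"] * tweet.source_score
        + WEIGHTS["topic"] * tweet.topic_score
        + WEIGHTS["votes"] * tweet.vote_score
    )


def rank(tweets: list[Tweet]) -> list[Tweet]:
    """Return tweets ordered from most to least interesting."""
    return sorted(tweets, key=interest, reverse=True)
```

A real system would presumably learn the weights from your behavior rather than hard-code them, but the shape is the same: score each tweet, then sort.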

The second tool, a Twitter topic browser, “summarizes the contents of a user’s timeline so that the user can quickly survey what information has come through Twitter without having to read through every post.” To do so, the system extracts the meaning behind each tweet, an impressive feat when you consider that most other natural language processing tools need to analyze much larger text samples to understand what they’re about.
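For a rough feel of what a topic browser does, the sketch below groups a timeline under its most frequent keywords. This is a deliberately crude stand-in: plain word counting, not the meaning extraction described above. The `STOPWORDS` set and the `tokens` and `topic_summary` functions are made up for illustration.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"the", "and", "for", "was", "with", "that", "this", "are", "you"}


def tokens(text: str) -> set[str]:
    """Lowercase a tweet and return its distinct, non-trivial words."""
    words = re.findall(r"[a-z0-9#']+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


def topic_summary(timeline: list[str], top_n: int = 3) -> dict[str, list[str]]:
    """Map each of the top_n most common terms to the tweets that mention it."""
    counts = Counter()
    for tweet in timeline:
        counts.update(tokens(tweet))  # each term counts once per tweet
    topics = [term for term, _ in counts.most_common(top_n)]
    return {term: [t for t in timeline if term in tokens(t)] for term in topics}
```

Calling `topic_summary(timeline, top_n=5)` on a day's tweets would give five keyword headings, each with the tweets filed under it, which is the browsing experience in miniature.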

I remember visiting PARC back when it was Xerox PARC and seeing a demo of a similar system that, even then, was jaw-droppingly sophisticated. Starting with a 20,000-word scientific article on any topic under the sun, it could rapidly output a perfectly intelligent summary of the article’s ideas in 200 words, or 500 words, or whatever was needed. The technology has clearly taken huge strides since then, as evidenced by its ability to distill meaning out of a mere 140 characters.

So even though the tweets just keep on coming, don’t fret about drowning in all this new information – you’ll be able to safely ignore it, and let the software provide a running summary of the discourse, alerting you when something especially interesting emerges.