Parse.ly’s P3 Platform

I was finally able to spend quality time with the Parse.ly Reader, an app designed to show some of the capabilities of the underlying Parse.ly platform, called P3, which is currently in beta. To be clear, unlike many other players in the recommendation patch (GetGlue, Xydo, Hunch, etc.), this NYC-based startup is not in the business of providing a direct service to users.

Instead they give access to their cloud-based recommendation server through a set of RESTful APIs. The Reader app is just a demonstration of what can be done with their technology.

So what can be done?

After reading through the P3 reference documents and interacting with the Parse.ly Reader, you quickly see that P3’s aim is to reproduce formerly expensive, proprietary technology mastered by a few players (Netflix, Amazon) for businesses in general, most likely those in the small-to-medium bins.

It’s another Nick Carr moment for me, in which technology has turned a previously mysterious application, recommendation algorithms in this case, into something closer to an appliance meant for wider usage.

If you’re Netflix, you offer a $1 million prize to find the best recommendation algorithms for matching movies to users. Of course, Netflix owns the intellectual property of all those submissions.

Or, if you’re Amazon, you use massive computing facilities to cluster millions of customers into smaller subsets, and then generate book recommendations based on the subgroup you or I have been mapped to (see reference below).

What about the rest of us? We can all try to recreate collaborative filtering—matching user preferences with crowd averages—using standard techniques and algorithms from the literature. But that can be difficult to implement and get right on a larger scale.
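To make the idea concrete, here is a minimal sketch of textbook user-based collaborative filtering over yes/no votes. It is not Parse.ly’s (or Netflix’s or Amazon’s) method; the user names, item names, and the `recommend` helper are made up for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sets of liked items (binary votes)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(target, likes, top_n=3):
    """Rank items the target has not seen by the summed similarity
    of the other users who liked them."""
    seen = likes[target]
    scores = {}
    for user, items in likes.items():
        if user == target:
            continue
        sim = cosine(seen, items)
        if sim == 0.0:
            continue  # users with no overlap contribute nothing
        for item in items - seen:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical click data: user -> set of items they liked.
likes = {
    "ann": {"fcc-agenda", "net-neutrality", "broadband-plan"},
    "bob": {"fcc-agenda", "net-neutrality", "spectrum-auction"},
    "cat": {"celebrity-news", "sports-scores"},
    "dan": {"net-neutrality", "spectrum-auction", "open-internet"},
}

print(recommend("ann", likes))
```

Even this toy version compares the target against every other user on each request, which hints at why the approach gets harder to run as the user base grows.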

This is where Parse.ly steps in. They’ve already done the hard work. Their API plugs into a web site’s back-end, effectively turning P3 into a separate black-box component, one that’s completely dedicated to learning user interest in Web pages or documents. With Parse.ly’s approach, a business with a modest level of IT and Web development skills would be able to deploy a capable recommendation system into their existing infrastructure.

Parse.ly has gotten around the training phase that appears in many of the recommendation sites—the endless questions about what you’re interested in—by treating a click as a binary yes-no vote on the underlying content. So users are actually training the model about their content preferences as they go.

And the core model, by the way, appears to be of the “Bag of Words” variety, which I’ve mentioned in another post, employing the usual Bayesian magic to determine other relevant content.
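For readers who want to see what “clicks as votes” plus a bag-of-words Bayesian model might look like, here is a rough sketch of a naive Bayes classifier that updates on each click. This is my own guess at the general technique, not Parse.ly’s actual model; the `ClickModel` class and the sample headlines are hypothetical.

```python
import math
import re
from collections import Counter


class ClickModel:
    """Multinomial naive Bayes over word counts, trained one vote at a time."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    @staticmethod
    def tokenize(text):
        # Bag of words: lowercase alphabetic tokens, order ignored.
        return re.findall(r"[a-z]+", text.lower())

    def vote(self, text, clicked):
        """Record a yes (clicked) or no (skipped) vote on a piece of content."""
        words = self.tokenize(text)
        self.word_counts[clicked].update(words)
        self.doc_counts[clicked] += 1
        self.vocab.update(words)

    def log_odds(self, text):
        """Log-odds that the user would click, with add-one smoothing."""
        total_docs = sum(self.doc_counts.values())
        # The extra +1 reserves a slot for words never seen in training.
        vocab_size = len(self.vocab) + 1
        score = 0.0
        for label, sign in ((True, 1.0), (False, -1.0)):
            logp = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                logp += math.log((count + 1) / (total_words + vocab_size))
            score += sign * logp
        return score


model = ClickModel()
model.vote("FCC chairman puts net neutrality on agenda", True)
model.vote("Broadband policy and net neutrality proposal", True)
model.vote("Celebrity gossip roundup", False)
model.vote("Sports scores roundup", False)

print(model.log_odds("FCC net neutrality vote") > 0)
print(model.log_odds("Celebrity sports roundup") < 0)
```

A positive score means the words look more like the stories this user clicked than the ones they skipped. With no votes recorded the score is zero, which is the cold-start problem in miniature.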

Obviously, a yes-no indication is not as fine-tuned as, say, a 1-10 scale, and without a formal training session you have the usual issues with a “cold start,” but I think that P3 may be just enough for most SMB web sites.

According to their API documentation, content enters P3 as feeds (“data source management”) and can be queried with regular expressions (“item query”). P3 supports collaborative filtering, using the crowd to generate additional likely relevant content, by grouping users into “taste profiles.” I don’t know what algorithms are in P3’s crowd filtering machinery (they don’t say), but keep in mind that even well-known clustering methods did very, very well in the Netflix contest.

From the P3 reference manual.
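The regular-expression item query is easy to picture with a local stand-in. The sketch below does not call the P3 API, whose endpoints and parameters I won’t guess at; it simply filters a hypothetical in-memory list of feed items, and the `item_query` function, field names, and URLs are all invented for illustration.

```python
import re

# Hypothetical feed items, as they might look after being pulled from a feed.
feed = [
    {"title": "FCC chairman places open Internet on December agenda",
     "url": "http://example.com/1"},
    {"title": "New net neutrality proposal expected this week",
     "url": "http://example.com/2"},
    {"title": "Holiday gadget guide",
     "url": "http://example.com/3"},
]


def item_query(items, pattern, field="title"):
    """Return the items whose chosen field matches the regular expression."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [item for item in items if rx.search(item[field])]


hits = item_query(feed, r"\bFCC\b|net neutrality")
for hit in hits:
    print(hit["title"])
```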

So how did the Parse.ly Reader do? I gave it “FCC”, “broadband policy”, and “net neutrality” as high-interest keywords. And it deftly caught the big story yesterday of FCC Chairman Genachowski placing open Internet on the agency’s December agenda, and also the net neutrality proposal that Mr. G will be presenting.

One nice thing about news feeds is the timeliness of the data, often producing better results (for me at least) than a standard Google search.

I’ve registered interest in a few of the stories, so I will wait and see whether the Reader begins to fine-tune subsequent feed items.

It’s not clear whether other beta users are part of the crowd, or whether I’m just a solo act. No matter, it’s still worthwhile to see what emerges.

I should note that I’ve been told by Sachin Kamdar, Parse.ly CEO, that the Reader is not using the latest P3 technology.

My overall take: Parse.ly’s P3 gives impressive number crunching capabilities to businesses that would not have considered using these techniques.

Nick Carr would approve.

Parse.ly
Google’s Pretty Good Recommendation Service (technoverseblog.com)
Amazon.com Recommendations: Item-to-item Collaborative Filtering (slideshare.net)
A Survey of Collaborative Filtering Techniques (slideshare.net)
Netflix Prize (netflixprize.com)
Nick Carr’s Blog (roughtype.com)
