Xerox Sues Google Over Search Algorithm

I’m having a little trouble processing that headline too. But Bloomberg reported that Xerox has filed suit against Google for using its search inventions without permission. Xerox claims that a patent issued in 2004 is being infringed by Google’s AdSense and AdWords software, the money-making heart of the search engine Midas.

So I decided to look up the patent. The fact that I used Google’s Patent Search should probably tell you something about the merits of Xerox’s case.

The Xerox suit came onto my radar just as I finished reading a fascinating article by Wired’s Steven Levy on why Google’s search is so much better than everyone else’s. The company started out with an innovative algorithm called PageRank. Unlike other approaches, its big idea was to evaluate a Web page’s relevancy by counting the number of pages that point back to the target page (I’m oversimplifying; it’s actually a probabilistic technique). Google started off with a far superior way to manage keyword searches for Web content, and then sprinted ahead of everyone else with a succession of refinements and breakthroughs.
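For the curious, here is roughly what that “probabilistic technique” looks like in code. This is a minimal sketch of the textbook PageRank iteration, not Google’s production system: the four-page web, the function name pagerank, the damping value of 0.85, and the fixed 50 iterations are all my own assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not Google's settings.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline: the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank equally among its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# A made-up four-page web: three pages link to "c", nobody links to "d".
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy web, page "c" comes out on top because three pages link to it, while "d", which nobody links to, lands at the bottom. Note that "a" ranks well with only one inbound link, because that link comes from the highly ranked "c". That is the part plain link counting misses.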

Back to Xerox. I decided to take a brief tour through the patent.

Xerox's 2004 Search Patent

Let me first extend my sympathies to the lawyers involved in having to argue over the text in this document. I suspect the judge and attorneys will want to gas up at Starbucks before starting their day in court.

One troubling aspect of the patent for me was that many of the diagrams included copier-scanner devices. Clearly, this is a Xerox view of the world. No matter. The patent seems to describe a knowledge management system in which documents are dynamic, incorporating links and content that have been culled from the information repository and (I think) the Internet. The document is “enriched” with this meta-information.

The trick is to come up with the right queries to retrieve external information that’s appropriate for the document’s “personality”. I was beginning to get it. And then I came across some words (in section 21) that stopped me in my tracks:

“…searches performed by a selected service can be limited to a specific category in the information provider’s directory (e.g., Google) of information content.”

Yikes! The Xerox patent references Google as a search provider. I’m not a lawyer, but I’d guess this is not favorable to Xerox.

I was losing steam at this point. But one takeaway from reading the patent was how revolutionary Google’s approach to determining relevancy was. While Xerox was using and extending standard techniques (Bayesian classifiers, etc.), the founders of Google, Page and Brin, turned search on its head by letting the crowd, that is, us, decide on the importance of content based on our votes (the links from other Web pages). That’s a cool idea.

Game, set, and match to Google.