Freebase: Semantic Sandwich for Google

There actually was some significant news last week in the technoverse, and it didn’t involve another episode from Mark Zuckerberg’s reality show: on July 16, Google purchased Metaweb, the semantic database company and the force behind the freewheeling Freebase.

No doubt, the semantic web has entered into your own knowledgebase during the last year.

If it hasn’t, quickly go to Google and enter empire state building height in the search box. Notice that the numeric height, “1250 ft. (380 m.)”, is highlighted in the search results. Google knew to answer this query with an actual number, instead of merely returning text snippets in which those search keywords were found. This flavor of artificial intelligence comes courtesy of an analysis of the knowledge space.

In a way, Google comprehended that “empire state building” is a structure, which has an attribute or property known as height, which itself has a numeric value associated with it measured in distance units.
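
To make that concrete, here is a minimal Python sketch of the idea: an entity with a type, and a property whose value carries both a number and a unit. The class names, field names, and the answer() helper are my own inventions for illustration, not anything Google or Freebase exposes. The height figure is simply the one from the search result above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Measurement:
    """A numeric value together with the unit it is measured in."""
    value: float
    unit: str


@dataclass
class Entity:
    """A named thing with one or more types and a bag of measured properties."""
    name: str
    types: Set[str] = field(default_factory=set)
    properties: Dict[str, Measurement] = field(default_factory=dict)


def answer(entity: Entity, attribute: str) -> Optional[str]:
    """Return the attribute as 'number unit', or None if the entity lacks it."""
    measurement = entity.properties.get(attribute)
    if measurement is None:
        return None
    return f"{measurement.value:g} {measurement.unit}"


# Hypothetical record, using the figure quoted in the search result.
esb = Entity(
    name="Empire State Building",
    types={"structure"},
    properties={"height": Measurement(1250, "ft")},
)

print(answer(esb, "height"))  # 1250 ft
```

The point is only that the question "how tall?" resolves to a typed value rather than to a text snippet that happens to contain the keywords.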

Impressive.

Metaweb’s Freebase provides both the knowledge structure and the content to answer date-of-birth type questions. You get a tasty knowledge sandwich instead of a bag of raw ingredients.

With the Freebase purchase, Google has obtained “a vast, free, open online database of structured knowledge, powered and maintained by Metaweb Technologies.” This self-description is a little misleading. Freebase is a database, but it is not based on the relational model, with its conservative rules on stable schemas.

[Image: Google’s semantic analysis of a birthday query]

Freebase is a network database in which relationships between objects are the key to its design and give it the power to map raw data into a concept hierarchy—a tree of knowledge, so to speak.

Throw out your conceptual view of databases as a series of tables with columns, and replace it with a branching graph that connects knowledge nuggets. The paths between those content nuggets are where the crucial classification insights can be harvested.
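
A rough sketch of that graph view, in Python, might look like the following. This is a toy in-memory structure, not how Freebase is implemented. The KnowledgeGraph class and the relation names (located_in, contained_by) are made up for the example.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy graph: nodes are names, edges are labeled relationships."""

    def __init__(self):
        # subject -> relation -> list of objects
        self._edges = defaultdict(lambda: defaultdict(list))

    def add(self, subject, relation, obj):
        """Record one labeled edge from subject to obj."""
        self._edges[subject][relation].append(obj)

    def follow(self, start, *relations):
        """Walk a path of relations from start and return the nodes reached."""
        frontier = [start]
        for relation in relations:
            reached = []
            for node in frontier:
                # .get avoids creating empty entries for unknown nodes
                reached.extend(self._edges.get(node, {}).get(relation, []))
            frontier = reached
        return frontier


graph = KnowledgeGraph()
graph.add("Colosseum", "located_in", "Rome")
graph.add("Rome", "contained_by", "Italy")

# Two hops along the path answer "which country is the Colosseum in?"
print(graph.follow("Colosseum", "located_in", "contained_by"))  # ['Italy']
```

Nothing in the data says "the Colosseum is in Italy" directly. The answer falls out of the path, which is the whole appeal of storing relationships rather than rows.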

And by open, the Freebase architects mean that anyone can contribute new knowledge to the existing network data structure, or even extend the graph to classify new topics. In practice, Freebase has adopted a Wikipedia-style organization of contributors and editors. There are good and bad aspects to this, but crowdsourcing has the advantage of spreading the giant problem of content gathering and classification across individuals: asking lots of humans to fill up the blank slate of this database baby.

What can you do with Freebase? The possibilities are broad and far-reaching. For Google, it means that searches can now be turned into queries: “Italian restaurants near Union Square” or “Civil War battles in which Stonewall Jackson led troops” would return tabular answers, perhaps with more information and relationships than the requester had initially known about.

For those interested in taking a peek at the middleware layer of this engine, the Freebase Query Editor is a good starting place. Be warned that the learning curve is steep. The JSON-like query syntax is not very friendly and looks much more forbidding than anything you would write for MySQL. Freebase semantic queries are intended to be generated by another layer of software, somewhere deep in a Google application.

In any case, I took a stab at running my own queries (after spending more than a few hours parsing the Freebase API manual). Once you grok the object-oriented feel of the database (knowledge forms hierarchies, and objects inherit multiple types), you can start hammering out complex queries that simulate database joins from the relational world.
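
Here is a small Python sketch of what multiple typing buys you. The property paths are the ones that appear in the Rome query later in this post, but the TYPE_PROPERTIES table, the available_properties() helper, and the list of types I assign to Rome are my own guesses, inferred from those paths rather than taken from the Freebase schema.

```python
# Assumed mapping from a type to the properties it contributes.
TYPE_PROPERTIES = {
    "/travel/travel_destination": [
        "/travel/travel_destination/tourist_attractions",
    ],
    "/location/location": [
        "/location/location/containedby",
        "/location/location/contains",
    ],
    "/location/statistical_region": [
        "/location/statistical_region/population",
    ],
}


def available_properties(types):
    """Collect every property an object gains from each of its types."""
    properties = []
    for type_id in types:
        properties.extend(TYPE_PROPERTIES.get(type_id, []))
    return properties


# Assumption: Rome carries all three types at once.
rome_types = [
    "/travel/travel_destination",
    "/location/location",
    "/location/statistical_region",
]

for prop in available_properties(rome_types):
    print(prop)
```

Because one object carries several types, a single query can reach attractions, geography, and population together, which is the part that feels like a join.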

Freebase is rich in location and travel information, so I queried it for tourist attractions in Rome. And because of Freebase’s flexible typing (an object like Rome is not only a travel destination but also a location), I can pull in all sorts of related geographic information:

[{
  "/type/object/type": "/travel/travel_destination",
  "/type/object/name": "Rome",
  "/location/location/containedby": [{
    "/type/object/name": "Italy",
    "/location/location/contains": []
  }],
  "/travel/travel_destination/tourist_attractions": [{
    "/type/object/name": null
  }],
  "/location/statistical_region/population": {
    "/measurement_unit/dated_integer/number": null
  }
}]
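
The query reads as a template: filled-in values such as "Rome" and "Italy" are constraints, while null and [] mark the blanks I want the server to fill in. The Python below is a toy sketch of that fill-in-the-blanks matching against a small hand-made record. It is not the Freebase engine; the fill() function, the shortened key names, and the sample data are all assumptions made for illustration.

```python
def fill(template, record):
    """Fill the blanks (None or []) in template from record.

    Returns the filled-in copy, or None if any constraint fails to match.
    """
    result = {}
    for key, want in template.items():
        have = record.get(key)
        if want is None or want == []:
            # A blank slot: copy whatever the record holds.
            result[key] = have
        elif isinstance(want, dict):
            # A nested template must match a nested record.
            sub = fill(want, have) if isinstance(have, dict) else None
            if sub is None:
                return None
            result[key] = sub
        elif isinstance(want, list):
            # A one-element list holds a sub-template applied to every item.
            items = have if isinstance(have, list) else []
            matches = []
            for item in items:
                if isinstance(item, dict):
                    sub = fill(want[0], item)
                    if sub is not None:
                        matches.append(sub)
            if not matches:
                return None
            result[key] = matches
        elif have == want:
            # A plain value is a constraint that must match exactly.
            result[key] = have
        else:
            return None
    return result


# Hand-made sample record with shortened, hypothetical keys.
rome = {
    "type": "/travel/travel_destination",
    "name": "Rome",
    "containedby": [{"name": "Italy"}, {"name": "Lazio"}],
    "tourist_attractions": [{"name": "Colosseum"}, {"name": "Pantheon"}],
}

template = {
    "type": "/travel/travel_destination",
    "name": "Rome",
    "containedby": [{"name": "Italy"}],
    "tourist_attractions": [{"name": None}],
}

filled = fill(template, rome)
print(filled["tourist_attractions"])
```

The answer comes back in the same shape as the question, with the blanks filled in.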

You can see the results of my query here; the link makes a web service request to the Freebase MQL (Metaweb Query Language) server.
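
For the curious, this is roughly how such a request URL could be assembled in Python. Treat it as a sketch under assumptions: the endpoint below is a placeholder rather than the real service address, the {"query": ...} envelope is my assumption about the wire format, and build_read_url() is a hypothetical helper. The code only builds the URL and decodes it again; it does not contact any server.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint: the real service URL is not given in this post.
BASE_URL = "https://example.invalid/mqlread"


def build_read_url(query, base_url=BASE_URL):
    """Serialize an MQL-style query into a GET URL (assumed envelope format)."""
    envelope = {"query": query}  # assumption: the query is wrapped in an envelope
    return base_url + "?" + urlencode({"query": json.dumps(envelope)})


# A trimmed version of the Rome query; Python's None becomes JSON null.
query = [{
    "/type/object/type": "/travel/travel_destination",
    "/type/object/name": "Rome",
    "/travel/travel_destination/tourist_attractions": [{
        "/type/object/name": None,
    }],
}]

url = build_read_url(query)

# Round-trip check: decode the URL and recover the original query.
decoded = json.loads(parse_qs(urlsplit(url).query)["query"][0])
print(decoded["query"][0]["/type/object/name"])  # Rome
```

The round trip at the end is there to show that the URL-encoded parameter carries the JSON query intact.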

There are many ways to use Freebase to extend the knowledge reach of any website. To help, Metaweb offers a development environment called Acre to ease you into creating real apps.

According to Google, Freebase will continue to be an open database available to anyone.
