Somewhere on my Internet journeys, I learned about Dandelion. It’s a platform for doing semantic analysis of text. In other words, it’s software that has some reading comprehension abilities. Maybe even as smart as a 10-year-old.
Startups in this automated reading-and-understanding space are each making claims as to the intelligence of their underlying machine learning engines. Dandelion is the work of SpazioDati, an Italian startup based in Trento. Anyway, with the Dandelion APIs you can extract entities (ideas, names, products, locations, brands) from a document or web page. Dandelion is not promising, as some are, to completely analyze an article or story and then write a summary.
However, what Dandelion does is still quite impressive. It can make connections from partial information to more specific entities and general concepts. Rather than explain further, let’s take a real-world example.
Here’s a small part of a review for Woody Allen’s love letter to Italy and Rome, “To Rome With Love”:
Allen is no stranger to Italy and Italians: in Play It Again, Sam (1972), his film critic Allan comes out of the portmanteau film Le Coppie, or Couples, by De Sica and others – and then fantasises about jealous husband Tony Roberts going for him with a knife, as if in a spoof Italian movie. There was also a brilliant Antonioni pastiche in Everything You Always Wanted to Know about Sex (1972), and his Stardust Memories (1980) was famously a homage to Fellini. To Rome With Love has nothing of this stylishly offbeat connoisseurship. Everything is pretty much on the nose. This film is far from the greatness of earlier years, although it sometimes has a cantering gaiety and sense of farcical fun.
You and I, reading this, would quickly realize through our own internal knowledge maps that Allen refers to Woody Allen, that Play It Again, Sam is another Allen movie, that Fellini is the famous director, and that Antonioni is yet another great Italian director. Note to self: must see L’Avventura again.
Using the free entity extraction demo, here’s what Dandelion came up with:
Wow! Quite good. It made some reasonable inferences and clearly understood that this brief excerpt from a longer review is about films and directors.
Dandelion is not perfect, and neither is anyone else’s software, but this might be just enough knowledge parsing for most of us.
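If you would rather script this than paste text into the demo, here is a rough Python sketch of what an entity extraction call might look like. Treat the details as my assumptions rather than gospel: the endpoint URL, the parameter names (text, lang, min_confidence, token), and the response fields (annotations, spot, title, confidence) are how I understand the API to work, so check them against Dandelion's current documentation. The helper names are my own, and the token is a placeholder for your own API key.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for Dandelion's entity extraction API; verify against the docs.
NEX_ENDPOINT = "https://api.dandelion.eu/datatxt/nex/v1"


def build_request_url(text, token, lang="en", min_confidence=0.6):
    """Build the GET URL for one entity extraction request.

    The parameter names are assumptions about the API, not confirmed here.
    """
    params = {
        "text": text,
        "lang": lang,
        "min_confidence": min_confidence,
        "token": token,
    }
    return NEX_ENDPOINT + "?" + urllib.parse.urlencode(params)


def extract_entities(response):
    """Reduce a parsed JSON response to (spot, title, confidence) tuples.

    Assumes the response carries a list under "annotations"; a missing
    list yields an empty result rather than an error.
    """
    return [
        (a.get("spot"), a.get("title"), a.get("confidence"))
        for a in response.get("annotations", [])
    ]


def fetch_entities(text, token, lang="en"):
    """Call the service over HTTP and return the reduced entity list."""
    url = build_request_url(text, token, lang)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return extract_entities(json.load(resp))
```

With a real key, calling fetch_entities(review_text, "YOUR_TOKEN") would send the excerpt off and hand back the matched phrases alongside the entities they were linked to. Keeping the URL building and the response parsing in separate functions means each can be checked without touching the network.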
What are some of the use cases? Before you can analyze Big Data, in many cases you need to scrub raw text. Dandelion has obvious applications in marketing for analyzing user feedback and sentiment, in data classification of raw content in, say, legal or business contexts, and finally in an area that I’ve been writing about, Google-like searches of corporate file systems.
Dandelion supports other languages besides English, including French, German, Portuguese, and of course Italian. Dandelion also allows you to create your own specialized vocabularies.
It has a freemium model for using its web APIs, and you can check out the pricing here.