iSpeech is a Newark-based (yes, you read that right) startup that has created a very usable and practical cloud-based text-to-speech and speech-rec platform. I hadn’t really thought about speech rec since my well-documented experiments with Voxeo and Twilio. Once upon a time, I assembled a few apps with these services, and then came to the hard-earned conclusion that it would require more geek skills than I could muster to do text-to-speech right.
I don’t mean to pick on either product, since both are reasonable choices. But Twilio had definite limits on its speech abilities, and Voxeo, a company with far deeper roots in speech technology, required advanced knowledge of VoiceXML. I should add that one of my experimental apps involved reading headlines from a news feed.
Back to iSpeech. The folks there have pinged me every so often about taking a look at their software. The final motivation came after I reviewed a recent TechCrunch Disrupt startup’s (pause) news-reading app. I had thought that a minimally capable reader should be feasible, circa 2012. After that review, I was beginning to reach the opposite conclusion.
But it is possible, if you use iSpeech. This morning I finally tried out their TTS web service APIs (the PHP version with JSON encoding), feeding them part of the New York Times’ reporting on the Nadal-Rosol match. That is, I gave the service raw, unmarked text to see what it could do.
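For developers curious what "raw, unmarked text with JSON encoding" might look like in practice, here is a minimal sketch in Python rather than the PHP I used. It only shows how plain text could be packaged into a JSON request body. The function name and the field names (`apikey`, `action`, `voice`, `text`) are my own assumptions for illustration, not taken from iSpeech's documentation, so check the real API reference before relying on any of them.

```python
import json


def build_tts_request(text, api_key="YOUR_API_KEY", voice="default"):
    """Package raw, unmarked text as a JSON body for a TTS web service.

    All field names below are hypothetical placeholders, not the
    documented iSpeech parameters.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    payload = {
        "apikey": api_key,    # hypothetical: your account key
        "action": "convert",  # hypothetical: ask for text-to-speech
        "voice": voice,       # hypothetical: which voice to use
        # Collapse stray whitespace and line breaks; no speech markup is added.
        "text": " ".join(text.split()),
    }
    return json.dumps(payload)


body = build_tts_request("Game, set,\n  match: what a fifth set!")
print(body)
```

The point is that nothing is annotated: punctuation and numbers go in exactly as a reporter wrote them, and the service is left to work out pacing and pronunciation by itself.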
To these ears, iSpeech sounded pretty darn good. The speech was fluid, and the software seemed to get word accents and phonemes right without any special help. It handled scores pretty well, and it understood commas, question marks, dashes, and other punctuation more or less correctly. Check out this audio file.
I was, though, a little confused about how to fine-tune iSpeech’s pronunciations. Even so, what you get out of the box is quite good, and should be well suited to many developers’ applications.
iSpeech’s text-to-speech and speech-rec prowess has been discovered by the startup world. Its customers include Talktoit and TeleNav. iSpeech also has a few apps of its own, including a free mobile email reader called DriveSafe.ly. For developers, the company has a freemium model, wherein you can try out the different APIs and then transition to a pay-as-you-go plan.
iSpeech will have some interesting news in the coming weeks. The company will be releasing a demo version of a Siri-like app, built on its own AI technology, that will understand and respond to requests. Cool stuff.