We have just submitted Friends Aloud 3.0, with Nuance-powered (Siri-like) speech recognition, to the App Store. If Apple approves it in the usual week or so, it should go live towards the end of next week. We are all excited to have this one almost in the hands of our loyal users.
Tweets Aloud 3.0, with similar new features, is in beta and following hot on the heels of Friends Aloud 3.0. It could make it to the App Store as early as next week, too.
We were delighted to read a Forbes article about how Apple stock could more than double on the sheer power of Siri. As developers who are extending Siri's speech recognition to the several million existing iOS devices, not just the latest and greatest iPhone 4S, we are sympathetic to what the article's author had to say about the importance of speech recognition to the future of mobile devices:
- Siri is bringing in new customers who were previously Android users or who were not using smart phones. I initially drew this preliminary conclusion from my conversations with a number of AT&T store managers. Now there is a variety of data from a number of sources supporting this conclusion.
- Siri increases stickiness of Apple phones. High stickiness means that once someone starts using Siri they are unlikely to switch. When asked to call my wife, Siri responded with question wanting to know which one of my contacts was my wife. After about 10 minutes, when I again asked Siri to call my wife, it knew who my wife was. Imagine Siri learning about you and what you do for two years and then giving up Siri for an Android phone that does not know you!
- In due course Siri will migrate search revenues from Google (GOOG) to Apple.
- Siri raises the bar for Google Android and Microsoft (MSFT) Mango to compete.
We at VoiZapp, like Apple, believe that the future of mobile devices is speech-enabled!
Friends Aloud 3.0 with Siri-like speech recognition is now in beta-testing. We expect to submit it to the App Store in a week or two. As existing users who upgraded to iPhone 4S may have already discovered, Friends Aloud 2.0 and Tweets Aloud 2.0 now have keyboards with optional Siri-powered speech recognition available via a small microphone icon next to the spacebar. But the hundreds of millions of pre-iPhone 4S iOS users were left behind. VoiZapp will soon give all iOS users of its Aloud series of products the ability to use Siri-like speech recognition natively within its entire product line, starting with Friends Aloud 3.0. This is a monumental shift in user interfaces, aptly summed up in this article by Robert Hof:
“Siri is the culmination of the Jobs legacy,” contends Gary Morgenthaler, a partner at the venture capital firm Morgenthaler Ventures in Menlo Park, Calif. Morgenthaler was the first VC investor in Siri and was a board member until Apple acquired it, as well as an investor and board member of Nuance Communications, the voice recognition software company whose technology also is used by Siri. Both companies were spun out of the research institute SRI International.
In an exclusive interview, Morgenthaler provided a revealing look at how Siri developed and what it could potentially do–including how it could reshape the worlds of e-commerce and advertising. Morgenthaler makes a good case that Siri represents the third revolution in human-computer interfaces (emphasis ours) that Jobs perfected and popularized. The first was the graphical user interface, using a mouse as a pointing device, which Jobs adapted (some might say stole) from Xerox PARC and SRI to make the Macintosh. The second came in the iPod and the iPad, using a gestural interface–again, not a technology it invented, but anyone who uses Apple’s touch technology knows it performs better than anyone else’s. The third, Morgenthaler contends, is a conversational interface epitomized by Siri. “It could create a new paradigm for interacting with computers, a new man-machine interface,” he says. “We are at a turning point in history where people can talk to a computer and be understood. It’s a watershed moment where people won’t go back.”
We at VoiZapp believe that Morgenthaler is exactly right about Jobs' legacy. Once you experience the power of being able to listen to your Facebook news feed and then reply naturally in your own voice with updates, comments, and likes, you'll never want to hunch over a keyboard again. Here's a short video demo of Friends Aloud 3.0. Look for it soon in the App Store. As always, you can purchase Friends Aloud 2.0 now and automatically update to 3.0 for free the instant that it becomes available.
Last month we released a new version of Friends Aloud 2.0, and this week we are following up with a new version of Tweets Aloud 2.0. The main feature we have just added is the ability to compose new Status Updates, Comments, or Tweets using the built-in iOS keyboard. However, as VoiZapp is focused on speech, we are delighted to note that the just-released iPhone 4S now comes with a little microphone button next to the spacebar on the built-in keyboard, which lets you compose using Siri's automated speech recognition. Naturally, Siri sometimes mishears what you said, in which case the full keyboard is still available for correcting it.
This upgrade is, of course, free to existing users. If you just purchased an iPhone 4S and don't already own the Friends Aloud or Tweets Aloud apps, you can purchase them from Apple's App Store for $1.99 apiece. Then you can stay in touch with your friends via Facebook and Twitter strictly via your voice, both coming and going!
One of the standard features of Interactive Voice Response (IVR) systems is known in the industry as "barge in." This is the capability that accepts your saying "billing" before the system gets through its spiel of "Please say 'billing' for Billing, 'technical support' for Technical Support, 'new accounts' for New Accounts, ..."
We have begun working on adding speech recognition to our Aloud line of products, so that you can dictate replies and control the app strictly through the use of your voice. Of course Barge-In is essential here: if, for instance, Friends Aloud is reading a long post that you're not interested in listening to, you should be able to quickly say something like "Next" and have it immediately stop reading that post and start reading the next one. Same with messages and emails and tweets and news articles. The whole idea behind the Aloud series is that they shine when you can't divert your eyes and hands from the primary task at hand.
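To make the "Next" behavior concrete, here is a minimal sketch in Python (not our app's actual code). It simulates reading posts word by word and stops the current post as soon as a "next" command is heard. The function name `read_with_barge_in` and the `commands_heard` lookup, which stands in for a live speech recognizer, are hypothetical and exist only for illustration.

```python
def read_with_barge_in(posts, commands_heard):
    """Read each post word by word; abandon the current post on "next".

    commands_heard maps (post_index, word_index) to a recognized command.
    It is a hypothetical stand-in for a live recognizer.
    """
    transcript = []
    for p, post in enumerate(posts):
        spoken = []
        for w, word in enumerate(post.split()):
            # Check for a command before speaking each word, so the
            # interruption takes effect right away, not at the end of the post.
            if commands_heard.get((p, w)) == "next":
                break
            spoken.append(word)
        transcript.append(" ".join(spoken))
    return transcript


posts = ["a very long post about nothing", "short post"]
# The listener says "next" just before the fourth word of the first post.
result = read_with_barge_in(posts, {(0, 3): "next"})
print(result)
```

The first post is cut off after three words and the second is read in full. Checking for a command between small units of speech is what makes the interruption feel immediate.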
But there's a technical problem that we must solve before adding this sort of obvious capability to our apps: how can the apps' speech recognition feature distinguish between the audio coming out of the mobile device's speaker and what the user says? In a telephone IVR system, the microphone generally does not pick up what comes out of the handset earpiece, but in our case, most users are letting their mobile device speak to them using its built-in loudspeaker while both their eyes and hands are occupied. So we have a conundrum -- if the word "Next" is heard while the app is speaking, who said it anyway?
We are presently testing a variety of speech recognition technologies to see whether any of them can handle this problem. Some systems, for instance, claim to be able to identify human speakers based on their voiceprints. Perhaps we can use this sort of technology to distinguish between our app's voice and its user's voice? Alternatively, if the phone's microphone is directional enough and/or the audio coming out of the speaker is quiet enough, we might be able to distinguish the two simply according to sound power level at the microphone input.
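The power-level idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a tested solution: audio frames are lists of samples between -1.0 and 1.0, the level of speaker audio leaking into the microphone has been measured beforehand, and the 6 dB margin is an arbitrary placeholder. The names `level_dbfs` and `is_user_speech` are hypothetical.

```python
import math


def level_dbfs(samples):
    """RMS level of a non-empty frame of samples in [-1.0, 1.0], in dB
    relative to full scale (0 dB is the loudest possible signal)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(max(rms, 1e-10))  # floor avoids log10(0) on silence


def is_user_speech(mic_frame, playback_level_db, margin_db=6.0):
    """Guess that the user spoke if the microphone level exceeds the
    expected speaker-leakage level by more than margin_db (an assumed value)."""
    return level_dbfs(mic_frame) > playback_level_db + margin_db


# Synthetic test tones standing in for real audio (8 kHz, 0.1 s each).
leak = [0.05 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
user = [0.5 * math.sin(2 * math.pi * 220 * n / 8000) for n in range(800)]

leak_db = level_dbfs(leak)            # measured while only the app is speaking
print(is_user_speech(user, leak_db))  # louder frame: treated as the user
print(is_user_speech(leak, leak_db))  # same level as leakage: treated as the app
```

A fixed threshold like this only works when the user's voice arrives noticeably louder than the speaker leakage, which is exactly the condition described above.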
Bottom line: we hope and expect to add speech input and control to our apps within months. Stay tuned to this blog, where we will update you on our progress towards incorporating this powerful feature into the Aloud series.
One of our favorite speech translation companies -- Mobile Technologies LLC -- has been busy adding languages and platforms to their handheld translators. Their press releases are reprinted here.
PITTSBURGH – Mobile Technologies, LLC, developers of “Jibbigo,” the leading iPhone and Android apps for speech-to-speech foreign language translation, today announced that the company’s top-selling voice translation apps are now available for translating spoken French, German, Korean and Tagalog (Filipino).
Jibbigo voice translators enable users to speak into the phone in one language, and the statement is automatically translated and repeated aloud in another language. Jibbigo is the world’s only voice translation app that functions without an internet or phone connection, enabling users to understand spoken language in locations without reliable phone/data connections, and without fear of incurring high roaming fees.
With today’s announcement, Jibbigo users can now translate spoken language between English and any of the following languages: Spanish, Chinese, Japanese, German, French, Korean, Iraqi Arabic and Tagalog (Filipino).
Since the launch of Jibbigo for the iPhone in October 2009, the app has garnered significant praise for its groundbreaking technology. In May 2010, Apple featured Jibbigo in international television campaigns for the iPhone, and the Jibbigo Japanese-English translator became a top-selling app in Japan within the first two weeks of its launch. In August, Travel & Leisure magazine named Jibbigo a “Top Travel App” for 2010.
PITTSBURGH – Mobile Technologies, LLC, developers of “Jibbigo,” the leading iPhone apps for speech-to-speech foreign language translation, today announced that the top-selling voice translators are now available for Android devices through the Android Market. Users of Android devices now have access to the world’s only voice translation app that functions without an internet or phone connection.
Android users can now utilize Jibbigo translators to translate spoken language between English and any of the following languages: Spanish, Chinese, Japanese, German, French, Korean, and Tagalog.
“We’re very excited to bring Jibbigo translators to the Android platform,” said Dr. Alex Waibel, Mobile Technologies’ Founder and Chairman. “With the explosive growth in the number of devices supporting Android apps, this significantly expands the number of people who can use our voice translators worldwide.”
Jibbigo voice translators have been available for Apple iPhones for more than a year, and were the first apps in the category of “Speech-to-Speech Translators” – users simply speak into the phone in one language, and the statement is automatically translated and repeated aloud in another language. Jibbigo is the only app of its kind that functions without an internet or cell phone connection – enabling users to communicate in foreign countries without the fear of losing connectivity or the risk of incurring high roaming charges.