One of our favorite series -- Star Trek -- was mentioned twice today in blogs by columnists whom I respect and follow: Mike Elgan and John C. Dvorak.
Here is Mike's post today from his Google+ blog:
Columnist mind meld? Dvorak also uses 'Star Trek' reference
I linked earlier to my most recent column (submitted yesterday morning but posted this morning), which is about how Apple and Google will lead us to the Star Trek-like final frontier of talking to our computers. My column focused on how those companies will train us to get into the habit of using our cell phones.
Here's that post: https://plus.google.com/113117251731252114390/posts/WM2bg9CeTYm
I just discovered that +John C Dvorak posted a column yesterday on PCMag.com slamming Microsoft for pursuing Star Trek-like technologies, including voice-command PCs.
FWIW, I disagree with Dvorak. Voice-command and virtual-assistant technologies are not only "forward thinking," they're perfectly inevitable. The only questions are: Which company will get there first, and which will profit most?
And here is a link to John's parallel (kind of) post that he refers to above: Microsoft's Wacky New Direction: Star Trek!
I replied to both of these postings with the following:
I've dedicated my career to speech recognition since the 1990s, when my first company produced a training simulator for air traffic controllers that has since become the standard used in control towers throughout the world. My latest startup, VoiZapp, is taking the baby steps necessary to INTELLIGENTLY convert your visual flows of information into audio streams on your mobile devices, controllable via your voice.
There are basically four classes of voice-related capabilities that we all might want to incorporate into our mobile devices, listed roughly in order of technological maturity:
1) Text-to-speech (TTS), whose technology is "good enough" now.
2) Voice control, which enables us to command the computer via a circumscribed set of keywords and is also fairly well advanced.
3) Voice dictation, which accepts the entire vocabulary of a language so that we can dictate replies to messages, articles, etc. Its technology works decently in controlled environments (Dragon Dictation) but not all that well in the wild.
4) Natural language understanding, in which the computer must convert those strings of words into your intentions and desires in a way that "makes sense" in a conversational context. Its technology works well enough to win at Jeopardy! but is the least advanced overall.
We've released a first iOS app, "Friends Aloud," that incorporates the first of these by reading your Facebook news feed aloud. We are about to release several more with voice control enabled, and we are working on retrofitting voice dictation into Friends Aloud and the others before the end of this quarter. Apple's Siri, based on DARPA-funded work by my friends at SRI a few years back, is a great first step in the fourth category. By next year, we at VoiZapp hope to have completed some apps that can similarly act as personal assistants and respond to your requests, but that will further banter back and forth with you to fine-tune your desires, intelligently feeding back (via voice, of course) what they find in their web searches, just as a seasoned personal assistant might.
So ... no Star Trek capabilities yet, but those of us playing in this sandbox are moving smartly in that direction at ever-accelerating (sub-warp) speeds. Stay tuned...
VoiZapp's research labs have been very busy adding hands-free operation to our Aloud line of reading-aloud apps. This technology is standard in Star Trek: remember how Spock or Kirk could start a conversation with the Enterprise's computer by simply saying, "Computer?" Remember when Scotty tried that with an early Mac Plus in Star Trek IV: The Voyage Home? To our knowledge, no iOS app has cracked the code to do that yet. VoiZapp has, and we are feverishly incorporating this technology into our new reading-aloud apps Reader Aloud and Later Aloud, both of which are slated to be released next month (considerably earlier than the 23rd century). Once we get them out, we will retrofit this voice-control technology into our existing apps Friends Aloud and Tweets Aloud. When we're done, you will be able to control your VoiZapp apps via simple, intuitive voice commands -- Play | Read [Aloud], Pause | Stop [Reading], Skip [Forward | Reverse], Go Back | Review | Previous, etc. No voice training required, of course, nothing to memorize, and exceptional recognition accuracy. It just works.
We are adding our world-class voice-command technology to the website/blog reading-aloud apps first because we believe that iOS 5, when released in September, will incorporate full dictation-quality automated speech recognition (ASR) licensed from Nuance. Nuance's Dragon Dictate is the $6 billion gorilla in the room, so we hope and expect to use their technology to compose Facebook and/or Twitter posts, replies, and messages strictly via your voice. Our R&D plan, then, subject as always to the vicissitudes of the market, remains to incorporate voice control first, closely followed by the addition of Nuance's voice dictation throughout VoiZapp's line of Aloud apps. It's going to be a busy summer here at VoiZapp.
A new version 1.1 of Friends Aloud Pro has just been submitted to the App Store and should become available within the next day or two. It fixes an obscure problem caused by optimization technology within iOS: if an app that plays music (or voices, in our case) stops playing for 30 seconds or longer after the screen has auto-locked (gone dark), it will "lose focus" and effectively stop playing from then on, even if it's in the foreground, presumably to further reduce power usage. In our case, when Friends Aloud Pro finishes reading aloud one of its regular posting updates and is quietly awaiting arrival of the next one, it may occasionally and incorrectly be switched off by iOS. Version 1.1 fixes this problem, so that Friends Aloud Pro will now work as intended for as long as you wish, reading incoming posts and comments as they arrive, whether the screen has auto-locked or not. This is, of course, a free update, so please visit the App Store at your convenience to download it.
From a Mashable article published this morning:
As more automakers move toward adding social media in cars, Transportation Secretary Ray LaHood has a word of advice: Stop.
Speaking to The Wall Street Journal, LaHood said he is lobbying automakers not to add features that could distract drivers. “There’s absolutely no reason for any person to download their Facebook into the car,” LaHood said in the interview, showing a shaky command of social media terminology. “It’s not necessary.”
LaHood’s opinion on the matter is significant: He and the National Highway Traffic Safety Administration, which reports to him, can force automakers to stop adding social media feeds to new cars. He is also pushing for them to create public service campaigns against texting and driving. So far, two automakers — Subaru and BMW — have done so. BMW’s campaign, showing an overprotective mom who nevertheless endangers her kids by texting and driving, went live on June 3.
Despite LaHood’s opposition, many carmakers are busy integrating social media hooks into their new models. Toyota, for instance, inked a deal with Salesforce.com last month to create “Toyota Friend,” a social network for Toyota drivers. Toyota is also working with Microsoft on a software system called Entune that will let drivers access a version of Bing and Pandora. General Motors has also recently added a feature to some models that lets drivers access real-time Facebook status updates. GM touted that feature in a Super Bowl ad for its Chevy Cruze.
VoiZapp is pleased to introduce its new Mobile App Development Division. As part of our commitment to design, create, and market the very best mobile apps available today, VoiZapp has assembled and manages a variety of programming teams at multiple locations around the world. These expert teams maintain broad experience across the various modalities of mobile app programming, and they are now available to create custom mobile apps to meet your needs. Please visit our Services page to send us an outline of your wish list and/or design thoughts, so that we can begin a thoughtful dialogue about how VoiZapp might help design and build your custom mobile apps. If desired, we will, of course, enter into a mutual NDA to ensure that your concepts and designs remain protected.
Now iPad users can have a full-featured experience with Friends Aloud HD. No more magnifying an iPhone-centric app using the "2x" button or suffering through a tiny middle-of-the-screen display. Friends Aloud HD uses the native character set of the full iPad screen to bring you the posts and comments from your Facebook news feed. It goes on sale bright 'n early tomorrow morning for $2.99 in the App Store. For more information, please see the press release in the Press Room page of this website, or check out the revised Friends Aloud product page, where you can also click through to some screenshots of Friends Aloud HD. If you have an iPad, we recommend that you purchase this new HD version of our popular Friends Aloud app.
One of the standard features of Interactive Voice Response (IVR) systems is known in the industry as "barge in." This is the capability that accepts your saying "billing" before the system gets through its spiel of "Please say 'billing' for Billing, 'technical support' for Technical Support, 'new accounts' for New Accounts, ..."
We have begun working on adding speech recognition to our Aloud line of products, so that you can dictate replies and control the app strictly through the use of your voice. Of course Barge-In is essential here: if, for instance, Friends Aloud is reading a long post that you're not interested in listening to, you should be able to quickly say something like "Next" and have it immediately stop reading that post and start reading the next one. Same with messages and emails and tweets and news articles. The whole idea behind the Aloud series is that they shine when you can't divert your eyes and hands from the primary task at hand.
But there's a technical problem that we must solve before adding this sort of obvious capability to our apps: how can the apps' speech recognition feature distinguish between the audio coming out of the mobile device's speaker vs. what the user says? In a telephone IVR system, the microphone generally does not pick up what comes out the handset earpiece, but in our case, most users are letting their mobile device speak to them using its built-in loudspeaker while both their eyes and hands are occupied. So we have a conundrum -- if the word "Next" is heard while the app is speaking, who said it anyway?
We are presently testing a variety of speech recognition technologies to see whether any of them can handle this problem. Some systems, for instance, claim to be able to identify human speakers by their voiceprints. Perhaps we can use this sort of technology to distinguish between our app's voice and its user's voice. Alternatively, if the phone's microphone is directional enough and/or the audio coming out of the speaker quiet enough, we might be able to distinguish the two simply by the sound power level at the microphone input.
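As a rough illustration of the second idea -- discriminating by sound power level -- here is a toy sketch. The margin value is invented for illustration, and a real audio pipeline would also need echo cancellation and calibrated levels; this is not how our apps actually work.

```python
import math

# Toy sketch of the power-level test described above: if the microphone
# suddenly carries much more energy than our own playback accounts for,
# assume the user is speaking. The 2x margin is an invented placeholder.

def rms(samples):
    """Root-mean-square power of one block of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def user_is_speaking(mic_block, playback_rms, margin=2.0):
    """True when mic energy clearly exceeds what our own speaker explains."""
    return rms(mic_block) > margin * playback_rms
```

The appeal of this approach is that it needs no speaker identification at all, only a running estimate of how loud our own playback sounds at the microphone.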
Bottom line: we hope and expect to add speech input and control to our apps within months. Stay tuned to this blog, where we will update you on our progress towards incorporating this powerful feature into the Aloud series.
This article, appearing in the New York Times today, discusses what VoiZapp is trying to accomplish:
May 9, 2011, 5:28 PM
By STEVE LOHR
There are at least two irrefutable facts about the practice of people communicating by cellphones while driving cars. First, it is a big problem, with studies estimating that more than 500,000 traffic accidents and 2,600 deaths a year are caused by cellphone-distracted drivers. No one has documented the problem as thoroughly as my colleague, Matt Richtel.
The second is that cars are not going to become monastic cones of silence. America is a car culture, and the pressures of modern work and life mean that people are going to communicate from their cars, despite the potential danger.
The real issue, then, is how best to reduce the risk. Education is going to help by making people more aware. One thing to be aware of, according to recent research, is that hands-free calling does not help much, if at all. The big trouble is that human communication distracts the brain, slowing recognition and reaction times.
In a paper presented on Monday at a research conference in Vancouver, Eric Horvitz, a scientist at Microsoft Research, and three collaborators provide evidence that a properly designed computer assistant could do a lot to reduce distracted-driving accidents. Bring some artificial intelligence to the car, they suggest, and the safety payoff could be well worth it.
The paper, “Hang on a Sec! Effects of Proactive Mediation of Phone Conversations while Driving,” lays out the results of research done with people conversing on a hands-free phone while at the wheel of a driving simulator.
The volunteers drove through a simulated environment while answering questions like, “When did you last get gas for your car?” and “Name the last movie you saw.” Questions that require recall are more cognitively demanding than other conversation, researchers say.
The drivers had to navigate through city streets, pedestrian crosswalks and frequent turns. Their performance was measured on comparably difficult routes both when their conversations were interrupted by the “semi-smart mediation technology,” as Mr. Horvitz puts it, and when they were not.
The alerts tried out ranged from a short message simply stating, “Focus needed,” to more descriptive messages like “residential neighborhood ahead with children playing.” And calls could be put on hold, typically for 10 to 25 seconds, while the driver navigated through a setting that required maximum attention to the road.
In the simulated course, drivers did better with the semi-smart helper offering driving tips and interrupting conversations than when they talked and drove continuously. When assisted, the drivers had on average 27 percent fewer collision errors and 81 percent fewer turning errors.
“I think we could see a significant drop in traffic accidents using this kind of system,” said Mr. Horvitz, whose three co-authors were Shamsi Iqbal and Yun-Cheng Ju of Microsoft Research, and Ella Matthews, a student at the California Institute of Technology.
The research initiative was a “proof of concept” project, not a working system. But the ability of modern computing to tap and mine large Web-based data sets — road conditions, weather, accident reports — and deliver answers in real time is opening the door to such systems in cars.
One large car company, Mr. Horvitz said, has expressed an interest in his team’s research. Within five years, he predicts, computer-safety assistants could become commonplace — if automakers pursue safety services and regulators prod things along. Perhaps this is the equivalent of digital seat belts?
“Cars will begin to tell people about road conditions and potential dangers,” Mr. Horvitz said.
A reviewer complained about pronunciation issues yesterday. He said:
"Though I was amazed at how well it said some of my friends’ names (I’m Asian, so it’s not like I have a bunch of John Smiths as friends), saying “nees” instead of “nice” is unacceptable. Thankfully most words are pronounced correctly, but minor mistakes like these are noticeable."
His example got us thinking about how Friends Aloud handles heteronyms -- words that are spelled the same but pronounced differently. The word "read" is the quintessential example, being pronounced differently in the present ("reed") vs. past ("red") tense. The word "nice," meaning pleasant, is pronounced with a long 'i,' while "Nice," the city in France, is pronounced "neess." When the word is capitalized, such as at the start of a sentence (or perhaps improperly for emphasis, as in "You are So Nice!"), it's difficult for the computer to know which pronunciation to use. So we started testing Friends Aloud on this very word in context, and we discovered that it actually does pretty well in this particular instance, leaving us to wonder just what the exact context of the reviewer's mispronunciation was. For example, it verbalizes "Nice to see you" properly, even as it also reads "We are going to Nice" correctly; it uses word context to determine the pronunciation of heteronyms. Still, since Friends Aloud relies not on full artificial-intelligence natural language understanding but only on word context to figure out pronunciations, it is bound to get some things like this wrong. Just as a human reader would, frankly.
The problem here, most likely, was that the single exclamatory sentence "Nice!" was used in a post or comment and, having no surrounding context at all for that single word, Friends Aloud chose to pronounce it as the city in France. We will keep working to teach Friends Aloud to be less high-brow and to resolve such conflicts in favor of modern-day slang. In the meantime, rest assured that it does get at least most of this complicated language we call English correct, even when you use texting-style abbreviations.
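To make the word-context idea concrete, here is a toy disambiguator for this one word. The cue words and phonetic spellings are invented for illustration; this is not how Friends Aloud actually decides.

```python
# Toy heteronym disambiguation by word context, invented for illustration.
# A real TTS front end uses far richer rules than this single-word demo.

CITY_CUES = ("to", "in", "from", "near")  # prepositions suggesting the city

def pronounce_nice(sentence):
    """Guess a pronunciation for "nice"/"Nice" from its neighboring words."""
    words = [w.strip(".,!?") for w in sentence.split()]
    for i, w in enumerate(words):
        if w.lower() != "nice":
            continue
        prev = words[i - 1].lower() if i > 0 else ""
        if w == "Nice" and prev in CITY_CUES:
            return "neess"   # the city in France
        return "nys"         # the adjective
    return None              # the word never appeared
```

Note how a bare, context-free "Nice!" gives such rules nothing to work with, which is exactly the failure mode discussed above.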
From an article in the New York Times, and why Friends Aloud is so important ...
It’s easy to become complacent. Maybe you’re a good driver, and you’ve gotten away with such actions for years. Maybe you managed to avert a near-accident when your attention returned to the road in the nick of time. But one of these days, your luck may run out and you, or someone you hit, could be maimed for life or dead.
“Driving while distracted is roughly equivalent to driving drunk,” Dr. Amy N. Ship, an internist at Harvard Medical School, wrote last year in a commentary in The New England Journal of Medicine. “Any activity that distracts a driver visually or cognitively increases the risk of an accident. None of them is safe.”
Following widespread publicity about the hazards of distracted driving, including a Pulitzer Prize-winning series in this newspaper, medical groups are working hard to make patients more aware of the problem. The most recent effort was started last week by the American Academy of Orthopaedic Surgeons and the Orthopaedic Trauma Association, whose “Decide to Drive” campaign calls attention to the increasing number of distractions engaged in by multitasking drivers and the resulting toll on people’s lives.
“We take care of a lot of people injured in car accidents, and distracted driving is a substantial contributor to these accidents,” Dr. Daniel Berry, president of the academy, said in an interview. “If we could get rid of this part of our practice, it would be a great service to the people we care for.”
Orthopedists would do very well, thank you, without the business generated by the 307,369 crashes that have occurred so far this year, according to estimates from the National Safety Council, involving drivers talking on cellphones or texting.
Last year Aaron Brookens of Beloit, Wis., then 19, was driving home at 75 miles an hour after spending a weekend with his girlfriend when he decided to send her a text message — and wound up pinned under a semi. The toll: two broken femurs, a broken kneecap and ankle, nerve damage to both legs, and a lacerated spleen, kidney and liver.
Numerous operations and a lengthy rehab later, Mr. Brookens knows he’s lucky to be alive. “No one thinks it will happen to them,” he said on Wednesday at a news conference convened by the orthopedists. He now realizes that “deciding to drive” is always the best option, and he wants others to learn from his mistake.