The battle over voice control

You know… you get to a certain age and it becomes harder and harder to believe that “the next big thing” is going to be as big as everyone says it is. Your brain gets full of all sorts of promises that were never kept — 2.88MB floppy disks, trackballs, self-destructing DVDs, Myspace… all these “next big things” and more come and go and the world keeps turning.

Wasn’t it just 18 months ago that gestural control was the next big thing? We were all going to wave our hands at the TV to change channels. Don’t recall anyone actually doing that. I think it’s because no one could figure out a way to gesture to the TV that they didn’t like a show, without using some of the most obvious gestures.

So now it’s voice control. Apple’s digital assistant Siri opened the door to that. (By the way, did you notice how, even though it’s just a robot, we still can’t call it a secretary?) Siri spawned Google Voice Search and even DIRECTV’s Voice Control app. Now everything from Xboxes to computers is going to come with voice control. Yes, there has been a little bit of paranoid ranting on this site already (sadly, the article has been lost). But you can’t just be an old-school fussbudget sitting on the porch crying “get off my lawn.” Voice control is the current “next big thing” and it’s going to succeed with or without the endorsement of this blog.

What does voice control need?

It needs to be fast. Google has proven that it can recognize words as fast as you can say them. This is critical for voice control’s success. If you can do it faster just by pushing buttons, that’s how you’ll do it.

It has to really understand. Here, Apple really has the lead. Tell Siri how you are feeling and she’ll recommend a solution. Tell Google how you’re feeling and you’ll get a Wikipedia article about that. Voice control will only work if people can talk to computers the way they talk to their friends… in half-sentences full of shared (but unspoken) context. You’re not going to break into Dickensian English just to get the TV to turn on. You’ll probably choose to say “Hey stupid, I want to watch American Idol.”

It has to be consistent. Someone is going to have to break through with a voice system that is adaptable to everything. You don’t want to use one set of commands for the microwave and another set for the toaster. You just want everything to act the same way and understand you the same way. Here Google has the natural edge, as they could put Android in everything you own without too much trouble.

It has to know when you’re talking to it. The problem with Google Glass is that whenever anyone in the room says, “OK Glass,” it responds. What do you do with 10 people in the same room, all with Google Glass? This is a tough one to solve. How do you get your electronics to know that you, and you alone, are talking to them, and then how do you get them to know that sometimes someone else can?

If voice control is really going to be the next big thing, all these questions need to be answered. So maybe… the next big thing might be something else.