The Human Factor

The true value of a human being can be found in the degree to which he has attained liberation from the self.
Albert Einstein

When you create voice applications for a living, it can be hard to watch things like the new Citibank commercials that extol the virtues of being able to dump an automated phone system and talk to a human. (Although, I will admit that I laugh every time the actor in the commercial has to say his password louder a second time … “Big Boy!”)

There are entire web sites devoted to providing information on how to circumvent automated phone systems. As a VoiceXML developer, I’ve been asking myself some tough questions about what these developments say about the current state of voice technologies.

Aggravation with automated phone systems is not a new phenomenon. It’s just that we’re currently in a pretty happy time for the technologies used to build telephone applications: the VoiceXML 2.0 standard has been widely adopted, along with a host of related technologies that make creating voice applications easier and more cost effective; platform vendors and developers are embracing the new standards with gusto; and improvements to the current standards, which will dramatically enhance their power and flexibility, are already in the works.

So why aren’t more people happy with the current state of voice applications? Why aren’t consumers taking corporate web applications to task in the same way they do voice applications? Why is it still possible for those Madison Avenue weenies to elicit such a visceral reaction from the public when they take jabs at telephone applications?

After I stopped feeling picked on for a few minutes (and secretly laughing at the Citibank commercials) I came up with at least two reasons that explain this apparent paradox:

  • With voice, it’s personal. Voice applications are inherently more personal than other types of interactive applications, web-based or otherwise. Because the act of talking is such a fundamental way of communicating and emoting, people will always react differently to voice applications. As such, they will always hold voice applications to a different (and higher) standard. I don’t think there is a way around this, but I do think that there is a silver lining in this precept for voice developers.
  • VoiceXML makes it easier, not (necessarily) better. There is an excellent discussion in the latest issue of VoiceXML Review that talks about the reasons the technology was developed. This helps underscore the simple fact that it is very possible to build a lousy IVR system using a great technology like VoiceXML. VoiceXML changes the economics and the complexity of building voice applications; it doesn’t make voice applications immune to second-rate performance or design issues.

It is incumbent upon voice application developers and designers to understand the unique nature of voice as an interactive medium, and to appreciate the limitations that come with even the most powerful new voice technologies. Simply put, we have to use the new generation of voice technologies to build the intuitive, agile and elegant voice applications users expect. I think most would admit that there is a lot of work to be done to lift the stigma that hangs over voice applications.

Until then, enjoy your laughs while you can, Big Boy.