Audio Control in VoiceXML: Redux

One of the questions I hear most from VoiceXML developers relates to audio control features in VoiceXML. VoiceXML does not natively support the ability to “rewind” or “fast forward” through audio files, but some vendors provide this functionality as extensions on their platforms.

Does this mean you can’t implement audio control in a VoiceXML application? No, it doesn’t.

Over two years ago, I wrote a post describing how to achieve audio control in VoiceXML in a way that was portable and platform-agnostic. I still think this is the right approach for achieving audio control in voice applications.

VoiceXML is a markup language that can be used in conjunction with any number of server-side languages: PHP, Perl, Java, C#, Ruby, etc. If a language can be used to build a traditional web app, it can be used to build a VoiceXML app as well.

I think it’s great that other platforms are offering innovative, language-specific methods for achieving audio control in voice applications.

But the true power of VoiceXML is that it is standards-based, runs on a large number of platforms, and won’t shoehorn you into a programming language that may not be right for you.

VoiceXML = Flexibility, Portability, Opportunity!


2 thoughts on “Audio Control in VoiceXML: Redux”

  1. Nice hack – really out-of-the-box thinking!!! However, I am not clear on how you will get the current reference position from which the forward or rewind operation is to be carried out. For that matter, I would be interested to know the details of how pause & resume works.

    Obviously, I differ with your last statement about VoiceXML being flexible. But irrespective of that, it is inspiring enough for me to write a blog post on a similar topic. Hopefully very soon!


  2. Yusaf –

    The reference position is obtained by using the marktime shadow variable. In a nutshell, when the field playing the audio file is filled (i.e., when a grammar match occurs), the marktime shadow variable provides the number of milliseconds that elapsed between the last mark executed in the prompt and the bargein that filled the field. It’s part of the VoiceXML 2.1 spec, and it gives a developer an easy way to determine how long a caller listened to an audio file before they barged in and uttered a command.

    Pause and resume are not implemented in my simple example, but they could very easily be added by including one more simple VoiceXML field and tweaking the existing field’s grammar slightly. I’d be happy to provide a code example if you would be interested.

    I guess the point I wanted to make about flexibility is that the mechanics of audio file control are all available in VoiceXML. Moreover, because VoiceXML is just markup, I can use any number of different languages and platforms to generate it, not just PHP.

    Don’t get me wrong – I love PHP, but I know that others do not. The power of VoiceXML is that it does not lock a developer into a single language or platform.
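
The marktime mechanics described in the reply above can be sketched in a few lines of server-side code. This is only an illustration, written in Python to underline that any language can generate the markup; it is not the code from the original post. The 10-second skip size, the "audio" and "control" URLs, the "offset" and "elapsed" parameter names, and the idea that the server can serve the audio file starting at a given offset are all assumptions made for the example.

```python
from xml.sax.saxutils import quoteattr

SKIP_MS = 10_000  # hypothetical skip size for "rewind" and "forward"

# VoiceXML 2.1 page: a <mark> precedes the audio, so command$.marktime
# reports how many milliseconds played before the caller barged in.
PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="player">
    <var name="offset" expr="{offset}"/>
    <field name="command">
      <grammar type="application/srgs+xml" version="1.0" root="cmd" xml:lang="en-US">
        <rule id="cmd">
          <one-of><item>rewind</item><item>forward</item></one-of>
        </rule>
      </grammar>
      <prompt>
        <mark name="start"/>
        <audio src={audio_src}/>
      </prompt>
      <filled>
        <var name="elapsed" expr="command$.marktime"/>
        <submit next="control" namelist="command offset elapsed"/>
      </filled>
    </field>
  </form>
</vxml>
"""


def next_offset(command, offset_ms, marktime_ms, duration_ms):
    """Return the playback offset (ms) for the next page.

    offset_ms is where the previous page started playing; marktime_ms is
    the value submitted from command$.marktime.
    """
    position = offset_ms + marktime_ms  # where the caller was at bargein
    if command == "rewind":
        position -= SKIP_MS
    elif command == "forward":
        position += SKIP_MS
    # Clamp so we never seek before the start or past the end of the file.
    return max(0, min(position, duration_ms))


def render_page(offset_ms):
    """Render the VoiceXML page that resumes playback at offset_ms."""
    # Hypothetical URL: assumes the server can stream audio from an offset.
    audio_src = quoteattr(f"audio?offset={offset_ms}")
    return PAGE.format(offset=offset_ms, audio_src=audio_src)


if __name__ == "__main__":
    # Caller started at 30 s, listened for 5 s, then said "rewind".
    print(render_page(next_offset("rewind", 30_000, 5_000, 120_000)))
```

Each request to the hypothetical "control" URL would call next_offset with the submitted values and return a freshly rendered page, so the loop of play, bargein, and reposition stays entirely within standard VoiceXML 2.1 features. Support for mark and marktime should still be checked on the target platform.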
