Grammar Support in Opera 8

Conflicting newsgroup posts about the new multimodal browser released by Opera raise the question of which grammar formats it supports.

This post on the VoiceXML Forum Community Message Board bemoans the lack of support for the SRGS (Speech Recognition Grammar Specification) format for speech grammars in Opera 8. It suggests that the new browser only supports the JSGF (Java Speech Grammar Format).

This would be a major issue in my opinion, as the SRGS specification is closely tied to the VoiceXML standard – both were developed under the auspices of the W3C, both were adopted as standards on the same day and the certification of both developers and platforms by the VoiceXML Forum is contingent on support for the SRGS. If one of the benefits of XHTML+Voice is that it will leverage the existing skills of VoiceXML developers, why wouldn’t the first XHTML+Voice browser support the designated standard for grammars under the VoiceXML specification?

After further research, I came across this post on the Opera Developer forum. It states pretty clearly that the SRGS spec is indeed supported in Opera 8. The only way to be sure, however, is to set up a quick test.

I took one of the examples on the Opera developer site, which is designed to use a JSGF grammar, and modified it to use an SRGS grammar (in XML format). These two sample files can be tested by pointing the latest version of the Opera Browser at the links below:

Both work identically, demonstrating that the new XHTML+Voice enabled browser from Opera does indeed support the SRGS grammar format.
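For anyone who hasn't seen the two formats side by side, here is a rough sketch of what that kind of conversion looks like. The grammar is a hypothetical one (a three-word `drink` rule), not the grammar from the Opera example, and the helper functions `jsgf_alternatives` and `srgs_alternatives` are made up for illustration. They only handle a flat list of alternatives. The Python code simply checks that both forms list the same words.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical JSGF grammar (illustrative only, not the Opera sample).
JSGF_GRAMMAR = """#JSGF V1.0;
grammar drinks;
public <drink> = coffee | tea | milk;
"""

# The same rule written as an SRGS grammar in its XML form.
SRGS_GRAMMAR = """<?xml version="1.0"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice" root="drink">
  <rule id="drink" scope="public">
    <one-of>
      <item>coffee</item>
      <item>tea</item>
      <item>milk</item>
    </one-of>
  </rule>
</grammar>
"""

SRGS_NS = {"srgs": "http://www.w3.org/2001/06/grammar"}


def jsgf_alternatives(jsgf_text: str, rule: str) -> List[str]:
    """Return the alternatives of a simple 'a | b | c' JSGF rule."""
    match = re.search(r"<%s>\s*=\s*([^;]+);" % re.escape(rule), jsgf_text)
    if match is None:
        raise ValueError("rule not found: %s" % rule)
    return [alt.strip() for alt in match.group(1).split("|")]


def srgs_alternatives(srgs_xml: str, rule: str) -> List[str]:
    """Return the <item> texts under <one-of> for the named SRGS rule."""
    root = ET.fromstring(srgs_xml)
    for rule_el in root.findall("srgs:rule", SRGS_NS):
        if rule_el.get("id") == rule:
            items = rule_el.findall("srgs:one-of/srgs:item", SRGS_NS)
            return [item.text.strip() for item in items]
    raise ValueError("rule not found: %s" % rule)


if __name__ == "__main__":
    print(jsgf_alternatives(JSGF_GRAMMAR, "drink"))
    print(srgs_alternatives(SRGS_GRAMMAR, "drink"))
```

The SRGS version is wordier, but the structure maps one to one: the JSGF rule name becomes the `id` of a `<rule>`, and each alternative becomes an `<item>` inside `<one-of>`.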


On-Premises Hosting vs. Outsourcing

An interesting article in the recent issue of Speech Technology Magazine outlines the pros and cons of on-premises versus outsourced hosting of voice applications. The article makes a compelling case for outsourcing voice application hosting and allowing application developers to focus on what they know best – developing applications.

It goes without saying that any developer with enough experience is going to understand more than just the core languages and technologies used to build applications. Good developers understand how infrastructure (e.g., networks) and platforms (application servers, database servers, operating systems, etc.) will impact the behavior and performance of an application.

However, unlike traditional customer-facing applications, voice applications have an additional component that can be challenging, even to experienced developers. Voice applications require a bridge to the public switched telephone network (PSTN) so that users can access these applications from traditional telephones. Creating the interface to the PSTN from a voice application environment, and managing the required telephony infrastructure, can be difficult.

Governments that do not have overriding security concerns should consider leveraging the expertise of voice application hosting providers to deploy applications. Some, like Voxeo, will host application servers as well as voice platforms (VoiceXML interpreter, TTS engine, ASR, etc.). Others, like BeVocal, will host the voice platform and leave application server hosting to the client. Both are excellent – and Voxeo’s platform has been certified by the VoiceXML Forum.

Candidate Recommendation Status for VXML 2.1

VoiceXML 2.1 received “Candidate Recommendation” status from the W3C on June 13th, taking it one step closer to the level of a formal standard.

There are a number of extremely cool features in the new specification that will make building dynamic voice applications with VoiceXML simpler and more efficient. A good overview of these new features can be found in the “First Words” column authored by Rob Marchand for the VoiceXML Review.

Despite the relatively early stage in the process, most of the larger VoiceXML platform vendors now support most (if not all) of the new functionality spelled out in the 2.1 specification.