On the demise of SALT

There has been much talk of late about Microsoft’s announcement that it will support VoiceXML in a forthcoming version of Speech Server. Many of the people I have read or listened to have pointed to this as:

  • Good news; and,
  • Evidence that open standards (like VoiceXML) are truly the best way to develop phone-based applications.

There is ample evidence that Microsoft has no problem advancing its own standard – even under the banner of “openness” – if it sees a financial benefit in doing so. If you don’t agree, I’d refer you to the debate raging about an open document format. So, while I agree wholeheartedly with the second point, I’m not so sure I agree with the first. At least not totally.

I think the bad news in Microsoft’s announcement can be identified by remembering what its nascent SALT specification was designed to do. Speech Application Language Tags were designed to be extensions to XHTML – in other words, the specification was developed specifically to build multimodal applications. And although you can build pure telephone applications with SALT, this was not the original intent.

So, if Microsoft suddenly got religion and decided to support VoiceXML for building telephone applications, it may mean that multimodal applications aren’t going anywhere for a while. That’s bad news, in my opinion.

I’d ask that readers refute this assertion by pointing out some existing production uses of multimodal technology. If they can find any…


One thought on “On the demise of SALT”

  1. Hi Matt, that’s a perceptive insight about multimodality, and I wish I could refute it. (I will certainly refute the “demise” of SALT, see http://blogs.msdn.com/spokenword/archive/2006/04/05/569107.aspx 🙂)

    To be frank, although Speech Server’s multimodal capabilities have captured the imagination of many, especially developers, the overwhelming market for Speech Server continues to be in telephony voice services, rather than multimodal. And the latest features of Speech Server are focused squarely on meeting customer needs. But the vision of speech-enabled multimodal interaction is alive and well here; we are learning, and we continue to innovate in multimodal speech technologies. The market may not be there yet, as the lack of comments suggests, but I believe it has a strong future, especially within the converging communications landscape.

    Stephen Potter (Microsoft)
