Next Steps in the Evolution of Multimodal Applications

Over at eComm Europe – being held in Amsterdam – RJ Auburn gave a rocking presentation that can be summed up by its very apt title – the “Rise of Real Time Text and the Demise of Voice.”

There are many important takeaways in this presentation for governments and any other organization that interfaces with customers (yes, taxpayers and citizens are customers). The most important is the increased use and growing ubiquity of alternate communications channels – IM, SMS, social networks, etc.

Stated simply, the customers of tomorrow will communicate differently than the customers of today. The customers of today are used to voice self service (although many grudgingly so). The customers of tomorrow will use new communications channels (perhaps some that do not yet exist). Your father’s customer service paradigm will probably not apply to them.

Bottom line, if you haven’t developed methods for communicating with the new wave of customers that use different modes of communication then your shiznit will be cooked. Ya dig?


Voxeo (the company for which RJ is CTO) has been on a buying spree of late, with the seeming goal of shoring up its offerings to cover a wide array of different communication modalities. I’ve expounded in the past on one acquisition in particular as being especially relevant – particularly as it relates to the next generation of customers – the acquisition of IMified.

IMified provides a simple API that allows developers to create applications that work across a host of IM networks, SMS and even Twitter. Voxeo has leveraged this acquisition to deploy new functionality on its core voice application platform to allow developers to deploy multimodal applications – apps that a user can interact with through several different modalities, whichever is most convenient for them.

Multimodal applications are not new – I’ve written about them many times and built several. But Voxeo and IMified have taken the notion of multimodality to a new level by making it practical for almost any developer to build one. Even more compelling, Voxeo’s platform lets you re-purpose applications developed for one specific modality (e.g., phone) for others (SMS or IM).

Multimodal functionality is pretty much a requirement these days for successful customer interactions, but RJ’s presentation got me thinking about other possibilities.

Cascading Modality

The next step in the evolution of multimodal applications will be to support what I call “cascading modality.” Cascading applications will allow users to move across modalities over the course of one interaction with a company or a government.

For example, say a company wants to start a customer off in a communications channel that has relatively low cost – IM – using an application to collect basic customer information at the start of an interaction. At some point during this IM session, the customer could opt to move to a different modality. Say they send the following to the IM bot:

#switch 6401254789

This could generate an outbound call to (640) 125-4789 so that the caller could interact with an IVR system to complete their interaction – say, if they began the IM session on their desktop computer at the office and completed it while walking to the parking garage to get in their car. The information entered during the initial IM session is persisted across the switch to the IVR call, and all of the information (from both modes) is captured in a fulfillment or CRM system.
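Here is a rough sketch of how a bot might recognize that switch command and hand the session over to a voice call without losing what the customer already entered. Everything in it is an assumption made for illustration: the `Session` class, the `handle_im_message` function and the `"place_call"` return value are hypothetical names, not part of the Voxeo or IMified APIs.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# "#switch" followed by a 10-digit phone number, as in the example above.
SWITCH_PATTERN = re.compile(r"#switch\s+(\d{10})")


@dataclass
class Session:
    """Hypothetical record of one customer interaction across modalities."""
    customer_id: str
    modality: str = "im"
    phone_number: Optional[str] = None
    collected: dict = field(default_factory=dict)


def handle_im_message(session: Session, text: str) -> str:
    """Return "place_call" when the user asks to switch, else "continue"."""
    match = SWITCH_PATTERN.fullmatch(text.strip())
    if match is None:
        return "continue"
    # Record the new modality; data in session.collected is left untouched
    # so the IVR can pick up where the IM bot left off.
    session.phone_number = match.group(1)
    session.modality = "voice"
    return "place_call"
```

The point of the design is that the switch changes only the modality and the phone number; the data gathered so far rides along in the same session object.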

This session could be followed up by using a third modality – perhaps a confirmation message or receipt that is sent via SMS or even e-mail.

Concurrent Modality

Now consider another scenario, perhaps one that involves an older user who may be less comfortable with IM. This person could send the following to an IM bot:

  • User: #assist
  • IM Bot: I’d like to call you, and provide some additional assistance over the phone. Enter your 10-digit phone number.
  • User: 6401254789
  • IM Bot: Thank you. Hold one second while your call is placed.

As in the previous scenario, this would generate an outbound call to (640) 125-4789 but the focus of the IVR would not be to collect information – you still want the user to enter the information into the IM client they are using. The focus here would be to use the IVR to provide supportive information, so that the caller can more easily or efficiently enter required information.

One example of this “tag team” approach would be to simplify the input of information that needs to be in a particular format:

  • IVR App says: Enter your account number, which is a three part number separated by dashes. Enter all leading zeros on the left hand side of your account number.
  • IM Bot displays: Example: 00012345-87-1
  • User enters: 00078945-44-9

By using two modalities simultaneously to interact with the user, the information can be collected in fewer steps – a typical IVR system would probably collect this type of account number in three separate steps, and could be prone to error (“to the left of the first dash, or the second..?”).
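On the IM side, the whole account number can then be checked in a single pass. The sketch below assumes a format of eight digits, two digits and one digit separated by dashes; that format is inferred only from the example above, and `parse_account_number` is a hypothetical helper name.

```python
import re

# Assumed format, inferred from the example "00012345-87-1":
# 8 digits, a dash, 2 digits, a dash, 1 digit.
ACCOUNT_PATTERN = re.compile(r"(\d{8})-(\d{2})-(\d)")


def parse_account_number(text):
    """Return the three parts as a tuple of strings, or None if malformed."""
    match = ACCOUNT_PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    # Parts stay as strings so leading zeros are preserved.
    return match.groups()
```

If the function returns None, the bot can simply re-display the example and ask again, while the IVR repeats its spoken instructions.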

In this scenario, if a user enters 0 on their phone or sends #help via IM, they could automatically be routed to an agent for assistance.

Building Next Generation Multimodal Apps

Companies like Voxeo have removed a lot of the complexity from building multimodal applications, but developers will need to take heed of several factors that will become important as these kinds of applications become more widespread.

State persistence. Cascading modality will only work if a user can switch seamlessly from one mode to another without repeating data entry. VoiceXML applications and IM bots typically communicate with a backend via HTTP, which is stateless. And while there are any number of different ways to maintain state in an HTTP-based application, they do not always scale well. Things can get complicated when clusters of servers or load balancers are required. These considerations require specialized skills to address properly.
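One common way to deal with this is to keep session state out of the individual web servers and in a shared store that every server can reach. The sketch below is a minimal illustration of that idea, not a production design: `SessionStore` is a hypothetical class, and its in-memory dictionary merely stands in for a real database or cache.

```python
import json


class SessionStore:
    """Stand-in for a shared store reachable from every server in a cluster."""

    def __init__(self):
        self._rows = {}

    def save(self, session_id, data):
        # Serialize to JSON, as one would before writing to an external store.
        self._rows[session_id] = json.dumps(data)

    def load(self, session_id):
        raw = self._rows.get(session_id)
        return json.loads(raw) if raw is not None else {}


def handle_request(store, session_id, new_fields):
    """Handle one stateless HTTP request: load state, merge, save, return."""
    state = store.load(session_id)
    state.update(new_fields)
    store.save(session_id, state)
    return state
```

Because each request loads and saves the state by session ID, it does not matter whether the request came from the IM bot or the VoiceXML application, or which server behind the load balancer handled it.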

Secure data transfer. A profusion of multimodal applications can raise questions about data security, particularly if said data is transmitted across public IM and social networks. Developers need to think clearly about what is suitable for transmission across these networks, and ways that data security can be enhanced where needed.
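One simple precaution, offered here only as an example, is to mask sensitive values before echoing them back over a public channel. The `mask_account_number` helper below is hypothetical, and masking is just one of several measures a developer might consider.

```python
def mask_account_number(value, visible=1):
    """Replace every digit except the last `visible` digits with '*'."""
    total_digits = sum(ch.isdigit() for ch in value)
    to_mask = max(total_digits - visible, 0)
    out = []
    for ch in value:
        if ch.isdigit() and to_mask > 0:
            out.append("*")
            to_mask -= 1
        else:
            # Dashes and the trailing visible digits pass through unchanged.
            out.append(ch)
    return "".join(out)
```

A bot could confirm an entry with the masked value over IM while the full number travels only between the application and its backend.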

Yes, the dawn of a new customer service era is upon us, my friends. Who knows, if you tool on over to the Voxeo or IMified developer sites, you might just get an opportunity to help build it.


Drupal Notifications Framework

I’m experimenting with the very cool Notifications Framework for Drupal on a test site that I have set up.

This module (really a collection of different modules and add ons for Drupal) adds some extremely powerful functionality to a Drupal site to allow site administrators to select different node types for subscriptions, and to set up different subscription channels. I’ve currently got Prowl and e-mail working, and am configuring Twitter and XMPP for testing now — I hope to get to SMS later this week. It also makes it easy for site users to subscribe to different nodes, to be alerted when they are updated or if a comment is submitted. Users can also select the channel they want to be notified on (a default setting or a custom selection for a particular node).

It doesn’t look like a user can select multiple channels to be notified on (e.g., Prowl and XMPP for the same event). There also isn’t currently a phone-based channel (i.e., outbound phone call with TTS), at least not that I could easily find. I may have to remedy that soon…

This is but one example of the incredible array of custom modules available from the Drupal community that Drupal-powered sites can take advantage of.

Be nice to see some sort of multi-channel notification functionality on the new Drupal-powered site. Don’t think they have that yet.

Cloudvox: Building Phone Apps in the Cloud

If you are a developer of telephone applications, there has never been a more varied and powerful array of tools at your disposal to do your work than right now.

If you’re not a phone app developer, but always wanted to be (believe me when I say that chicks dig fellas that can build phone apps), the traditional barriers between web applications and phone applications have never been as blurred as they are right now. If you have the basic tools and skills needed to build solid web applications, you can quickly and easily add a phone interface to your code.

A new service for building phone applications that embodies this blurriness between the worlds of traditional web applications and phone applications was unveiled recently – Cloudvox. Cloudvox makes building phone applications dead simple. Here’s how.

Cloudvox is built on the open source Asterisk platform, so all of the existing tools available to Asterisk developers (PHPAGI, Adhearsion, Asterisk-Java) are available to Cloudvox developers.

What makes Cloudvox exciting is that it removes the need to build, host and manage an Asterisk server (or servers, depending on the scale of your app). If you’ve ever thought about standing up an Asterisk server just so you could build a killer phone application, then Cloudvox is worth a look. They provide an easy-to-use interface for managing applications (application code is still deployed on your server) and reviewing call statistics. They take the work out of building an Asterisk-based phone app, and leave all of the fun stuff (writing the actual code) to you. But wait, there’s more.

The team at Cloudvox has also deployed an API that lets you build phone apps with nothing but JSON and HTTP. Lots of platforms strive to be language agnostic and embrace developers of all stripes, but this takes it to a new level. If your language of choice can speak HTTP and supports JSON, you get an invite to the party.
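To give a feel for what “nothing but JSON and HTTP” could look like, here is a small sketch of a web endpoint that answers a request with a JSON list of call steps. The step names and fields (`action`, `text`) are invented for illustration and are not Cloudvox’s actual schema; check their documentation for the real format. The `app` function follows the standard Python WSGI interface.

```python
import json


def build_call_steps(caller_name):
    """Return a JSON-encoded list of call steps (hypothetical schema)."""
    steps = [
        {"action": "speak", "text": f"Hello {caller_name}, thanks for calling."},
        {"action": "hangup"},
    ]
    return json.dumps(steps).encode("utf-8")


def app(environ, start_response):
    """Minimal WSGI application that serves the steps as application/json."""
    body = build_call_steps("caller")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

Nothing here is specific to Python: any language that can serve an HTTP response and serialize JSON could produce the same output.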

The Cloudvox management interface also lets you register SIP phones (physical ones and softphones too), so that you can direct callers from an application to an extension that will ring wherever your SIP phone is currently registered from. You can also use the registered extension to make outbound phone calls.

There is a lot to like about Cloudvox. If you’re a phone application developer, or ever wanted to be one, it is worth your time to check this new service out.