Evolution of an Open Data Program

The first great challenge for any municipal data program is deciding what data to release.

The second great challenge for municipalities that embark on open data programs can be summed up in how they react to applications that get built with their data. This is particularly true if the apps that are built have not been fully considered by the municipality releasing the data.

Because of some work done by smart, passionate civic developers in Baltimore, this second challenge now confronts officials in Charm City:

“Whether web developers can use the city data to make applications that bring in revenue is still unknown. Baltimore officials released the data without providing terms of service to guide developers seeking to create applications for for-profit ventures.”

“Baltimore’s law department is crafting guidelines for how developers could use the data in such ventures,” said Rico Singleton, Baltimore’s chief information officer.

How Baltimore crafts the terms of service for their open data sets will dictate, to a large extent, how successful the program will ultimately be. Will there be more apps like SpotAgent? We’ll have to wait and see. I, for one, hope so.

It’s not surprising that governments struggle with this important milestone in the evolution of an open data program.

To a large extent, releasing open data means that governments abdicate control. There are typically some limitations on what developers can do with government data, but for the most part the creativity and personal ambitions of external developers are the engine that drives open data programs.

It can be a big leap for governments to give up this control, something that not all of them fully consider before the first data sets are released.

Here’s hoping that Baltimore officials take an approach that encourages the kind of app development that has resulted in SpotAgent. It would be great to see more of this.

Witnessing the CfA Effect in Philadelphia

Yesterday in Philadelphia, at the offices of Azavea (a Philly-based geographic data and civic apps company), 7 Code for America fellows sponsored what was the last in a month-long series of workshops/gatherings/hackathons.
Yesterday’s event was promoted as a Data Camp – a one day sprint meant to identify useful sources of information about Philadelphia, and to build civic applications and mashups using that data. The idea was to have “demoable” projects by the end of the day.

I had the pleasure of attending this event yesterday, and was impressed by what I saw. I’m referring not only to the projects that were worked on (which were cool), or the excitement visible on the faces of everyone there – from independent hackers, to Azavea employees to city officials (there were several in attendance throughout the day). That was cool too.

What I was most impressed with was the ability of this event to highlight to those that were there what is truly possible when government data is open to and usable by developers. It provided an object lesson for all those there in the true potential of civic hacking.

I spoke with a number of people after the event, when the demos for all of the projects were complete, and the unanimous sentiment was – “Wow, all this got done in one day?”

The idea of releasing open government data has been gaining momentum in Philadelphia for a while now, principally under the auspices of the Digital Philadelphia initiative.

But yesterday’s event – in my opinion, as someone from outside of Philadelphia looking in – seemed to crystallize the benefits of releasing open data for many people there. They had already been supportive of releasing open government data in principle, and most had worked toward achieving this end. But to actually see apps getting built with open data really made the lightbulbs go off.

Having the Code for America fellows in Philadelphia, and having them essentially kick start civic coding using city data, has accelerated the awareness of what is possible. I think people would have achieved the awareness that was realized yesterday eventually, but the CfA fellows got people there sooner.

I call it “the CfA Effect.” It was pretty cool to see first hand.

Looking forward to more cool stuff from this gifted and dedicated group of young people.

Interactive Screen Pops with Asterisk & XMPP

I’ve got a thing about screen pops.
I’ve written before about using Asterisk and XMPP to enable IM-based screen pops, but the recent release of Asterisk 1.8 creates a whole new reason to be excited about this topic.

The new version of Asterisk includes a new dialplan function called JABBER_RECEIVE.

This new function nicely complements the existing JabberSend() dialplan application and lets you read incoming XMPP messages into dialplan variables (via Set()).

Now that you can both send and receive XMPP messages via the dialplan, it is possible to build sophisticated CTI applications using standards-based XMPP servers and clients with nothing but extensions.conf. Here’s how.

You’ll need an XMPP server with (at least) two accounts. One for you, as a user. One for Asterisk. You’ll also want to fire up your XMPP client and add the Asterisk user to your buddy list.

Set up jabber.conf with the details of the Asterisk account on your XMPP server (make sure you run jabber reload in the Asterisk CLI after modifying the file):

Once you’ve done that, you’ll need to add some dialplan logic to use both JabberSend() and JABBER_RECEIVE (run dialplan reload in the Asterisk CLI after adding this logic):

In this simple example, anytime a call comes into the default context, a set of IM messages is sent to the XMPP account user@xxx.xxx.xxx.xxx (where xxx.xxx.xxx.xxx represents the host name/IP for your XMPP server). The following line in the dialplan will cause Asterisk to wait 10 seconds to receive a response from user@xxx.xxx.xxx.xxx.

exten => _XXXX,n,Set(OPTION = ${JABBER_RECEIVE(asterisk,user@xxx.xxx.xxx.xxx,10)})

When a response is received, it is read into the variable OPTION. Subsequent dialplan logic will either send the call to the extension that was dialed, or simply hang up (you could just as easily add options and logic to route the call to one of several different phone numbers or to voicemail).

That’s it!

This powerful new addition to Asterisk makes building sophisticated, interactive XMPP-based screen pops easy. Just imagine what other juicy little nuggets await in the new version of Asterisk.

Happy screen popping!

Building Voice Applications with Tropo and Node.js

I’ve been smitten of late with Node.js.

Node.js is a framework for building server-based applications in JavaScript. Node.js is event driven, so if you’ve got a fairly good understanding of state-driven development frameworks you’ll probably get it quickly. If not, start here.
I wanted to learn more about Node.js, so I decided to build a module. There are lots of modules out there, but I wanted to do something very specific with mine. I wanted to use Node.js to build voice applications. (Not a shocker, it’s what I do.)

Turns out Node.js is a very nice match for the Tropo WebAPI, a cloud-based API for building sophisticated speech and communication applications. The Tropo WebAPI speaks JSON, and I can’t think of any more natural way of creating and consuming JSON than with good ol’ JavaScript. Really, you can see why this gets me excited.
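To make the JSON point concrete, here is a minimal hand-rolled sketch of building a WebAPI-style response in plain JavaScript. It does not use any particular module's API. The helper names (say, hangup, renderJSON) are made up for illustration, and the payload shape (a top-level "tropo" array of single-key action objects) is my assumption about the format, so check it against the Tropo documentation before relying on it.

```javascript
// Hand-rolled sketch; these helpers are hypothetical and are not taken
// from any published Tropo library.

// Build a "say" action. Assumed shape: { say: [{ value: "..." }] }.
function say(text) {
  return { say: [{ value: text }] };
}

// Build a "hangup" action. Assumed shape: { hangup: null }.
function hangup() {
  return { hangup: null };
}

// Wrap a list of actions in the assumed envelope and serialize it.
function renderJSON(actions) {
  return JSON.stringify({ tropo: actions });
}

// The string an HTTP handler would send back as the response body.
const body = renderJSON([say("Hello from Node.js."), hangup()]);
```

Because the whole response is an ordinary object until the last step, it can be assembled, inspected and unit tested like any other JavaScript data structure.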

The Node.js module that I’ve been working on for interacting with the Tropo WebAPI is now available on GitHub. It comes with some very nice examples, and even a set of unit tests (yes, Virginia, you can write unit tests with Node.js). It has everything you need to get started using Node.js to write voice apps in JavaScript.

If you decide to give it a try (which I hope you do), there are some additional ingredients I would recommend adding to the mix:

  • The Express.js framework – a Node.js module very much like Sinatra in Ruby or Limonade in PHP.
  • CouchDB – The wonderfully powerful document-oriented storage engine that uses both JavaScript (for map/reduce and views) and JSON (for storing documents). There are also many fine Node.js modules available for interacting with CouchDB.

With these ingredients you’ve got a pretty powerful foundation on which to build robust, sophisticated multi-channel communication apps.

But why would you want to build a voice application with JavaScript?
Pretty much all of the voice application development tools and technologies that have been developed over the last decade or so have one essential unifying characteristic – each of them seeks to leverage easy to understand, low cost web technologies to build phone applications.

This principle can be seen very clearly in the approach embodied by the new Node.js library for the Tropo WebAPI. If you can write JavaScript, you can build sophisticated, cloud-based communication applications that not long ago required specialized skills, training, software and hardware (Big bucks, people. Big bucks).

Cloud-based telephony services based around simple to use APIs that employ widely supported standards like HTTP and JSON are democratizing phone and voice application development.

It’s really exciting to be a part of this trend and to contribute tools that others can use to build powerful applications.

Make the Cloud Listen (and Understand)

Yesterday I wrote a post about the changing cloud telephony landscape, and highlighted some key factors that will dictate which cloud telephony providers are around for the long haul and deliver the next innovations.

One of those factors – support for speech recognition – is a good differentiator for developers to use when choosing a cloud telephony platform.

Speech recognition is becoming increasingly important in our everyday lives. Smartphones and powerful handheld devices enable multimodality, and there are more and more restrictions placed on our use of phones while doing other things (like driving).

Plus, I can’t think of a more deflating concept than a cloud telephony provider that allows developers to build sophisticated apps and mashups in the language of their choice but that chains users of those apps to a telephone keypad. No fun.

To give an example of how powerful speech recognition can be, and how easy it is to use with a cloud telephony provider that supports it, I worked up a small demo to illustrate the point. The sample code for this demo is on GitHub, and we’ll dive into it in more detail below.

This demo uses two PHP libraries that are designed to work with the Tropo platform (one of the few cloud telephony providers to support speech recognition): the PHP WebAPI library and the PHPGrammar library.

If you’ve read any of my previous posts on building applications for the Tropo platform, you’ll see lots of similarities between this and previous sample apps. Here I continue my use of the insanely awesome Limonade Framework for PHP.

Let’s take the example of a company directory that allows callers to dial a single number, select a person or department at the company and then be transferred to the person they select.

With cloud telephony, there is no need to have such a system live on a machine in the server room – it can be hosted externally in the cloud, making it easier to manage and to scale. In addition, with the Tropo Platform, it doesn’t have to be the same tired old DTMF-based menu telling callers to press an extension number or to “dial by name…”.

Using the PHP WebAPI Library and Limonade, we can construct a simple yet powerful script that looks like this:

This script is pretty self-explanatory, but there are some key points I want to emphasize. First, note the $options array that holds the reference to an external grammar file (more on that in a bit). Tropo seems to need this reference to be an absolute URL rather than a relative path to the file (not hard to do with PHP – you just need to be aware of it).

Also, the file reference needs to include a trailing parameter indicating that this is an XML grammar (;type=application/grammar-xml). This seems to be true even if the grammar file is served with the correct MIME type by whatever is serving it.
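The two requirements above (an absolute reference plus the trailing type parameter) are easy to capture in a small helper. This is a sketch in JavaScript rather than the PHP used in the demo. The function name grammarUrl, the example.com base URL and the shape of the ask object are all assumptions made for illustration.

```javascript
// Hypothetical helper: resolve a grammar path against the application's
// public base URL, then append the type hint described above.
function grammarUrl(baseUrl, path) {
  // new URL() turns a relative path into an absolute URL.
  const absolute = new URL(path, baseUrl).href;
  return absolute + ";type=application/grammar-xml";
}

// example.com and grammar.php are placeholders, not the demo's real location.
const url = grammarUrl("http://example.com/directory/", "grammar.php");

// Assumed shape of an "ask" action that points at the external grammar.
const ask = {
  ask: {
    name: "directory",
    say: [{ value: "Who would you like to speak with?" }],
    choices: { value: url },
  },
};
```

Keeping the suffix in one place means a relative path can never slip into the prompt options by accident.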

Now let’s have a look at this grammar file.

This simplistic example demonstrates how to use the PHPGrammar library. Note the simple array structure that is being used to hold the details of employees for our fictitious company. This could very easily be replaced with a dip into a data source of pretty much any kind, like an LDAP directory or database holding employee details.

Also note in this example that we want to do something referred to as Semantic Interpretation. Our grammar file is a set of rules that will be applied to what the caller says – Semantic Interpretation (SI) dictates the value that is given to our application from the grammar when a successful match occurs.

In this example, we want the caller to be able to say the name of the person they want to be transferred to. We make the first name optional so they may either say the last name of the person or (optionally) the full name. Obviously this may need to be changed based on the size of the directory to render in a grammar file (e.g., multiple employees with the same last name).
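As a rough illustration of the idea (not the PHPGrammar library itself), the sketch below generates an SRGS XML grammar in JavaScript from a simple array. The employee names and numbers are invented, and the function names are hypothetical. Each entry makes the first name optional with repeat="0-1" and returns the phone number through a script-style semantic tag.

```javascript
// Hypothetical employee data; this could come from LDAP or a database instead.
const employees = [
  { first: "Jane", last: "Doe", number: "+14075550100" },
  { first: "John", last: "Smith", number: "+14075550101" },
];

// Escape characters that are special in XML text.
function escapeXml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// One <item> per employee: optional first name, required last name,
// and a script-style tag whose value is the phone number.
function employeeItem(e) {
  return (
    "      <item>" +
    `<item repeat="0-1">${escapeXml(e.first)}</item>` +
    escapeXml(e.last) +
    `<tag>out="${e.number}";</tag>` +
    "</item>"
  );
}

// Assemble the complete grammar document as a string.
function buildGrammar(list) {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"',
    '         xml:lang="en-US" mode="voice" root="employee"',
    '         tag-format="semantics/1.0">',
    '  <rule id="employee" scope="public">',
    "    <one-of>",
    ...list.map(employeeItem),
    "    </one-of>",
    "  </rule>",
    "</grammar>",
  ].join("\n");
}

const grammar = buildGrammar(employees);
```

With this structure a caller can say "Doe" or "Jane Doe" and the application receives the same number either way.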

Do note that the Tropo platform seems to require the “Script” syntax for returning SI values on a successful match as opposed to the “String Literal” syntax. (More on these alternatives here.)

Works on Tropo (Script syntax):

Does not work on Tropo (String Literal syntax):

So, when a caller says the name of a person in our company directory we want to return the number for that person to our Tropo script so we can transfer the call to them. This can clearly be seen when we examine the Result object that is delivered by the Tropo platform.

Tropo’s Result object includes the full grammar engine output, and lots of very detailed information about the recognition. As you can see, the utterance that the speech recognition engine heard was the name of one of our faux employees. The value that was returned is the number of that person.
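For readers who want to poke at the result outside of the PHP library, here is a small JavaScript sketch that pulls the utterance and value out of a result payload. The sample payload is invented, and its field names (actions, disposition, utterance, value) are assumptions about the format, so compare them with a real result before using this.

```javascript
// Hypothetical result payload; the exact shape is an assumption.
const samplePayload = JSON.stringify({
  result: {
    state: "ANSWERED",
    actions: {
      name: "directory",
      disposition: "SUCCESS",
      utterance: "jane doe",
      value: "+14075550100",
      confidence: 87,
    },
  },
});

// Return what the caller said and the value the grammar produced,
// or null when there is no successful recognition in the payload.
function readSelection(rawJson) {
  const result = JSON.parse(rawJson).result || {};
  // "actions" may be a single object or an array; normalize to an array.
  const actions = [].concat(result.actions || []);
  const match = actions.find((a) => a.disposition === "SUCCESS");
  return match ? { heard: match.utterance, value: match.value } : null;
}

const selection = readSelection(samplePayload);
```

Returning null for a missing match gives the calling code an obvious place to re-prompt the caller instead of attempting a transfer with no number.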

We use this value in the transfer_call() method of our Tropo script.

// Create a new instance of the Result object.
$result = new Result();

// Get the value of the selection the caller made.
$phone = $result->getValue();

// Create a new instance of the Tropo object and transfer the call.
$tropo = new Tropo();
$tropo->transfer($phone);

// Write out the JSON for Tropo to consume.
$tropo->renderJSON();

Using the PHP WebAPI library, it takes just 5 lines of code (excluding comments) to get the value of the grammar result and transfer the call. How cool is that?!

Obviously there are lots of things that can be done to enhance this script, to make it more robust, but it illustrates the essential concepts of speech recognition in the cloud.

What’s more, because of all of the great functionality provided by the Tropo cloud platform we can really push the envelope on the tired old company directory:

  • We could take an inbound call from a Skype user and transfer to a cell phone (or a SIP endpoint).
  • We could let our caller select a department in our company and then ring several different numbers at once, transferring the call to the first one answered (sort of a “hunt group in the cloud”).
  • We could use Tropo’s built in IM capabilities to send a screen pop to the person receiving the call.
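The "hunt group in the cloud" idea can be sketched as a transfer action that carries several destinations. The department roster and the transferTo name are made up for this example. It assumes that a transfer's "to" field may hold a list of numbers that are dialed together, which should be confirmed against the platform documentation.

```javascript
// Hypothetical department roster with fictional numbers.
const departments = {
  sales: ["+14075550110", "+14075550111"],
  support: ["+14075550120"],
};

// Build a response that transfers the caller to every number listed for
// a department. Assumption: "to" accepts an array of destinations.
function transferTo(department) {
  const numbers = departments[department];
  if (!numbers || numbers.length === 0) {
    throw new Error(`No numbers configured for ${department}`);
  }
  return { tropo: [{ transfer: { to: numbers } }] };
}

const response = JSON.stringify(transferTo("sales"));
```

Failing loudly on an unknown department keeps a recognition mistake from turning into a transfer with no destination.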

The sky is the limit. Which I guess is the point of cloud telephony…

What Matters in Cloud Telephony

The landscape of cloud telephony continues to change.

I was heartened this week to see some of the sharpest minds I know in cloud telephony and unified communications get together with the acquisition of Teleku by Voxeo. Teleku and Voxeo’s Tropo service are complementary ones that offer lots of goodies for developers, and I’m anxious to see what these guys will be cooking up now that they have joined forces. Congrats to all involved!

While there is lots of discussion about what this acquisition means for the constantly changing landscape of cloud telephony, this move validates (in my mind) some of the important trends that will determine which cloud telephony companies will be around for the long-term and how developers will use their services.

None of this is new – I’ve said it all before. It is worth noting, however, that all of the trends that I’ve observed before that are going to make the difference in the cloud telephony space are ones that both Tropo and Teleku do very well.

Portability – underscored not only by Teleku’s support for the open standard VoiceXML, but also the Tropo crew’s involvement in the Asterisk world, and the de facto standard for building Asterisk apps in Ruby – Adhearsion.

SIP integration – remember this, kids: true cloud telephony has SIP baked in – the rest is just marketing fluff. Both Tropo and Teleku support SIP interoperability and make it very easy for developers to use SIP as part of their applications.

Multi-channel / multi-modality – Both Tropo and Teleku have big multi-modal chops. Being able to interact with users on multiple communication channels from one code base is a key tenet of unified communications and cloud telephony, and this will become increasingly important in the future.

Speech recognition – cloud telephony isn’t your grandfather’s way to build a phone app, so why should users be restricted to their grandfather’s way of interacting with a phone app? Speech recognition is fully supported in both Tropo and Teleku, and this will matter more and more to cloud telephony developers going forward.

So if you’re wondering what the next change in the cloud telephony landscape will be, you can bet that one of these trends will dictate the change.

Until then, I’ll be hacking on some cloud-based, speech rec enabled UC apps. 😉