New Year, New Blog

The final days of 2011 are ticking away, and so too are the final days of this blog.

I’ve been writing about voice technologies, phone apps and open government under the Vox Populi moniker for a long time.

I first registered the voiceingov.org domain in 2002, and built a very basic website to share my excitement and enthusiasm for an emerging set of standards for building telephone applications using web technologies.

My interest in this area was spurred by my involvement in helping create the e-government initiative in the State of Delaware, and launching projects like the Delaware Internet Access Locater.

I have always believed in the importance of communications technology in general and telephone applications specifically for engaging citizens and encouraging an open dialog between governments and the people they serve.

This interest was cemented when I left state government to work as a technologist, using the latest and greatest tools for building communication apps of every stripe.

I’ve continued to write posts on this blog, trying to emphasize the core message that these new technologies can make it easier and more efficient for citizens to connect with government, even as my interests and work have expanded into other areas, like open government and civic hacking.

More and more, I’ve felt like I’ve been shoehorning posts on these topics into this blog, which has an admittedly narrower focus.

In addition, I’ve had much less time to devote to writing posts for this blog – I write extensively for my company’s blogs as well as for other blog sites covering open government and the civic hacking space.

With all of that said, I’m happy to say that I will soon announce a new direction in my writing and blogging efforts. This announcement, sadly, means the end of Vox Populi as it is currently constituted.

It’s hard to let go of something I’ve been working on for so long, and that is central to the way that I think technology should work to bring government and citizens together.

But the tension between my current work and interests and the narrow focus of this blog can’t continue.

In 2012, I’ll launch a new site where I will write more extensively about open government, open data, civic hacking and more of the things that will change the way that government works and the way we interact with those who represent us.

It’s been a great run with this blog, and I’m still proud of many of the things I wrote and all of the things I tried to do with it.

Looking forward to 2012.

Innovations in Civic Hacking

There are some exciting things happening in the world of civic hacking, and some cool innovations are being used to make civic hacking events more engaging and to extend the value of hackathon projects beyond the events themselves.

Last night in San Francisco, a forum of mayoral candidates was presented with the top projects from a Summer-long series of hacking events called (appropriately) the “Summer of Smart.”

I wasn’t able to be at this event, but I did attend an earlier San Francisco mayoral candidate forum that kicked off the Summer of Smart. By all accounts, the event last night was a great success and there were some awesome projects presented to those who would be Mayor of San Francisco.

I love the idea behind Summer of Smart – holding an intensive, tightly clustered series of events over several months and then “jacking in” to the political process by presenting the outcome directly to mayoral candidates in a public forum.

I’m also really interested to learn more about the new “civic residency program,” which will provide resources and space for hackathon participants to continue to develop their projects. If it works, this could become a model for other parts of the country.

Another innovative civic hacking event is taking place in Reno, NV – the Hack4Reno event gets more interesting as the days go by.

The event’s chief organizer – Kristy Fifelsky, of GovGirl fame – has started a video series on how to organize and run a civic hackathon. Tips and tricks from someone who is actually hands-on with organizing one could be extremely valuable to other municipalities that want to hold similar events.

One of the most interesting tidbits about the Hack4Reno event from the first installment in this series – the hackathon will take place out in the open, literally. The current plan is to hold the event outside the Pioneer Center in downtown Reno. How awesome is that!

There appear to be contingency plans if the weather doesn’t cooperate, but I love that the organizers are thinking outside the box.

Another cool detail from the Hack4Reno event is that almost all of the planning for the event is being done using GitHub to assign and track work. This is a great way to encourage collaboration and underscores the commitment of those involved to the idea of transparency. Everything about this event is open.

Really looking forward to seeing more innovations from civic hacking organizers.

Keep ’em coming!

Operation Data Liberation

I’m working on a neat new OpenGov project that has grown out of a few things I’ve worked on in the past.

I’m hoping to launch a new SMS-based service this week that will let anyone quickly and easily find important locations in their neighborhood using an ordinary cell phone (i.e., a non-smart phone).

I’ll launch the new service in Philadelphia, but I plan to open source the code for it so that anyone that wants to launch a similar service in another location will have a code base to start from.

As a data source for the new service, I’m planning on using data sets that are available in PHLAPI. The only drawback to this approach, currently, is that there aren’t a whole heck of a lot of point data sets available in PHLAPI.

That’s not necessarily a deal breaker, as there are plenty of data sets out there just waiting to be liberated and added to the PHLAPI instance. So, as a way of getting the process rolling, I decided to start hacking away at liberating some open data.

One of the more interesting data sets I came across in looking for candidates (pun intended) was a listing of polling locations in the City of Philadelphia. I found a PDF document containing a detailed list of locations (prepared, one presumes, for the primary election in Philly last week) on the site of the Committee of Seventy.

The PDF document I started with can be found here. There is also a web-based app available on the Committee of Seventy site for finding polling locations, but I wanted the data in as raw a format as possible (ideally, CSV).

It turned out to be surprisingly easy to do this. Here’s how I did it.

Converting PDF to Text

Since I run a variety of *NIX machines in my home office, it was pretty easy to use pdftotext to convert the Committee of Seventy PDF document into a text document. Using the -layout option with pdftotext allowed me to maintain the nice table layout of the original document, and helped with further processing.
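
If you want to reproduce this step, the command is a one-liner (the file names here are just placeholders):

pdftotext -layout polling_places.pdf polling_places.txt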

From Plain Text to CSV

Once I had the document in text format, it was time to fire up Google Refine. This is an indispensable tool in the arsenal of any OpenGov hacker, and it is enormously powerful for cleaning up and enhancing messy data.

Google Refine comes with a built-in scripting language called the Google Refine Expression Language (GREL) – very much like JavaScript in its syntax – that lets you manipulate and refine the data in a project.
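
To give a flavor of the syntax, here are a couple of expressions of the kind you might use on a data set like this one (the column name below is made up for illustration, not taken from the actual file):

value.trim().toTitlecase()
cells["Location Name"].value + ", Philadelphia, PA"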

I won’t go over all of the steps I used to convert the plain text file I started with to CSV, but if you spend any time playing around with Google Refine you’ll see how easy it is to enhance a data set.

Adding Locational Information

Next it was time to geocode the address of each polling location – something that is not included in the original Committee of Seventy document. One of the really awesome features of Google Refine is that it allows you to add a new column to a data set that is created by making a call to a web service.

This functionality makes it pretty straightforward to use Google’s Geocoding API to get the latitude and longitude for each address in the original file. There are some really good screencasts and demos of this technique on the web that are worth seeking out.
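
As a rough sketch of how that works here, the first expression below is the kind of thing you hand to Refine’s “Add column by fetching URLs” feature to build a Geocoding API request for each row, and the second parses the latitude out of the JSON that comes back (the column name is an assumption, and Google’s rate limits apply):

"http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address=" + escape(cells["Location Address"].value + ", Philadelphia, PA", "url")
value.parseJson().results[0].geometry.location.lat

The longitude comes from the same expression with .lng in place of .lat.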

Exporting to CSV

One of the last steps in the process, once the data was cleaned up and enhanced, was to simply export it to CSV.

I’ve uploaded a copy of the CSV file I created to GitHub. This format is nice because just about any software program or development tool can consume it (you could, for example, simply open this file in Excel).

But what I really hoped to do was to be able to make this data easy to import into a CouchDB instance, like PHLAPI. Not only would this support my new OpenGov project, it would let anyone that wanted to use the data simply replicate it from a publicly available CouchDB instance.

To do this, I turned to Node.js.

Inserting into CouchDB

What I needed to do with my CSV file was to parse it and turn each row into a JSON-formatted document that I could then insert into CouchDB via HTTP POST. This is a snap to do with Node.js and a few of the handy modules made available by the Node community (most notably, node-csv and the cradle module).

The script I used is available on GitHub.
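
In outline, it does something like the sketch below. This is a simplified stand-in rather than the script itself – it uses Node’s built-in fs and http modules in place of node-csv and cradle, and the file name, column handling and CouchDB location are all assumptions:

// Read the CSV, turn each row into a JSON document, and POST it to a local
// CouchDB database named phl_polling_places.
var fs = require('fs');
var http = require('http');

var lines = fs.readFileSync('phl_polling_places.csv', 'utf8').trim().split('\n');
var headers = lines[0].split(',');

lines.slice(1).forEach(function (line) {
  // Naive comma split; fields containing quoted commas need a real CSV parser.
  var fields = line.split(',');
  var doc = {};
  headers.forEach(function (header, i) {
    doc[header] = fields[i];
  });

  var body = JSON.stringify(doc);
  var req = http.request({
    host: '127.0.0.1',
    port: 5984,
    path: '/phl_polling_places',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(body)
    }
  }, function (res) {
    console.log('Inserted document, HTTP status: ' + res.statusCode);
  });
  req.write(body);
  req.end();
});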

A bit of a hack, but it works just fine. So I now have all of the polling places in the City of Philadelphia in one of my CouchDB instances.

Want to replicate it and use it for yourself? Just run this at the command line (assumes you are running CouchDB locally, and have created a DB named phl_polling_places):

curl -X POST "http://127.0.0.1:5984/_replicate" \
-H 'Content-type: application/json' \
-d '{"source": "http://markh.couchone.com/phl_polling_places", "target": "phl_polling_places"}'

If this post has inspired you to try and liberate some data yourself, let me know. I’d love to help with more efforts like this.

I’m also still in the process of checking my converted data, to ensure that everything looks correct. Once this is done, I’ll work on getting it inserted in PHLAPI.

Stay tuned for my new service, launching soon…!

UPDATE:

Right after I posted this I learned of an even easier way to export data from Google Refine to CouchDB: Max Ogden’s Refine Uploader. It’s worth checking out if you are moving data from Google Refine into a CouchDB instance.

Evolution of an Open Data Program

The first great challenge for any municipal data program is deciding what data to release.

The second great challenge for municipalities that embark on open data programs is how they react to the applications that get built with their data. This is particularly true if those apps are ones the municipality had not fully considered when it released the data.

Because of some work done by smart, passionate civic developers in Baltimore, this second challenge now confronts officials in Charm City:

“Whether web developers can use the city data to make applications that bring in revenue is still unknown. Baltimore officials released the data without providing terms of service to guide developers seeking to create applications for for-profit ventures.”

“Baltimore’s law department is crafting guidelines for how developers could use the data in such ventures,” said Rico Singleton, Baltimore’s chief information officer.

How Baltimore crafts the terms of service for their open data sets will dictate, to a large extent, how successful the program will ultimately be. Will there be more apps like SpotAgent? We’ll have to wait and see. I, for one, hope so.

It’s not surprising that governments struggle with this important milestone in the evolution of an open data program.

To a large extent, releasing open data means that governments abdicate control. There are typically some limitations on what developers can do with government data, but for the most part the creativity and personal ambitions of external developers are the engine that drives open data programs.

It can be a big leap for governments to give up this control, something that not all of them fully consider before the first data sets are released.

Here’s hoping that Baltimore officials take an approach that encourages the kind of app development that has resulted in SpotAgent. It would be great to see more of this.

Witnessing the CfA Effect in Philadelphia

Yesterday in Philadelphia, at the offices of Azavea (a Philly-based geographic data and civic apps company), seven Code for America fellows sponsored the last in a month-long series of workshops/gatherings/hackathons.

Yesterday’s event was promoted as a Data Camp – a one-day sprint meant to identify useful sources of information about Philadelphia, and to build civic applications and mashups using that data. The idea was to have “demoable” projects by the end of the day.

I had the pleasure of attending this event yesterday, and was impressed by what I saw. I’m referring not only to the projects that were worked on (which were cool), or to the excitement visible on the faces of everyone there – from independent hackers to Azavea employees to city officials (there were several in attendance throughout the day). That was cool too.

What I was most impressed with was the ability of this event to show those who were there what is truly possible when government data is open to and usable by developers. It provided an object lesson in the true potential of civic hacking.

I spoke with a number of people after the event, when the demos for all of the projects were complete, and the unanimous sentiment was – “Wow, all this got done in one day?”

The idea of releasing open government data has been gaining momentum in Philadelphia for a while now, principally under the auspices of the Digital Philadelphia initiative.

But yesterday’s event – in my opinion, as someone from outside of Philadelphia looking in – seemed to crystallize the benefits of releasing open data for many people there. They had already been supportive of releasing open government data in principle, and most had worked toward achieving this end. But to actually see apps getting built with open data really made the lightbulbs go off.

Having the Code for America fellows in Philadelphia, and having them essentially kick-start civic coding using city data, has accelerated awareness of what is possible. I think people would have reached that awareness eventually, but the CfA fellows got them there sooner.

I call it “the CfA Effect.” It was pretty cool to see first hand.

Looking forward to more cool stuff from this gifted and dedicated group of young people.

Node.js Coming of Age

I’ve written before about using Node.js to build communication apps, and I’m always looking to sharpen my skills and beef up the Node.js module I wrote for building Tropo applications.

It’s interesting to see Node.js start to come of age and really gain traction in the developer community.

One concrete indication that this is happening is the announcement today of a new hosting service for Node.js applications – NodeFu. NodeFu is the brainchild of my friend and co-worker Chris Matthieu.

It works very much like Heroku, and makes it dead simple to deploy and run Node.js apps. NodeFu isn’t the only Node.js hosting platform out there, but it is one of the first to start letting developers deploy apps. (Maybe this will encourage some of the other players in the space to tap the gas pedal a bit.)

I don’t think I know anybody who thinks as big about technology as Chris does. He’s a voice developer from way back, and like lots of us Chris has always thought of better ways to build voice applications.

Unlike the rest of us, however, who toil away creating libraries in various languages to ease the pain of voice app development, Chris actually built his own platform.

And true to form, once he got the Node.js bug, Chris again thought big. The result – NodeFu.

Bow to your Sensei!