The Noisy Channel


Transparent Text Symposium: Day 1

September 21st, 2009 · 8 Comments · General

Wow, what an intense day at the Transparent Text symposium! I won’t try to give detailed summaries of the talks–videos will be posted after the conference, and you can get a pretty good picture from the live tweet stream at #tt09. Instead, I’ll try to capture my personal highlights and reactions.

I’ll start with Deputy U.S. CTO Beth Noveck’s keynote about the Open Government Initiative. First, the very existence of such an initiative is incredible, given the culture of secrecy traditionally associated with Washington. Second, I like the top priority of releasing raw data so that other people can work on analyzing it, visualizing it, and generally making it more accessible either to the general public or to particular interest groups. This is very much what I had in mind in January when I posted “Information Sharing We Can Believe In” and I’m glad to see tangible progress. I was never a big fan of faith-based initiatives. :-)

The next session was a group of talks about watchdogs and accountability–people looking at how to ensure government transparency from the outside. New York Times editor Aron Pilhofer and software developer Jeremy Ashkenas talked about DocumentCloud, an ambitious project to enable exploratory search for news documents on the open web. Sunlight Foundation co-founder and executive director Ellen Miller offered a particularly compelling example of the power of visualization: a graph correlating the campaign contributions and earmarks associated with a congressman under investigation. But my favorite presenter in this session was ProPublica’s Amanda Michel, whose thoughts about a “human test of transparency” are worth a talk in themselves. For now, I recommend you look at the two projects she discussed: Stimulus Spot Check and Off the Bus.

After lunch, we shifted gears from government transparency to more of a focus on text. The first of the two afternoon sessions was entitled “Analyzing the Written Record” and featured Matthew Gray from Google Books, Tom Tague from Open Calais (a free text annotation service that almost all of the previous speakers raved about), and Ethan Zuckerman from Harvard’s Berkman Center. All of the talks were solid, but Ethan’s was outstanding. I blogged about his Media Cloud project back in March, but it’s come a long way in the past six months and is doing something I’ve been waiting years to see someone do: comparing how different news organizations select and cover news.

The final session was about visualization. David Small offered a presentation about literally transparent text that was, in the words of Marian Dörk, “refreshingly non-utilitarian and visually stimulating”. Ben Fry showed the power of visualizing changes in a document over time–specifically, a project called “the preservation of favoured traces” that illustrates the evolution of Darwin’s On the Origin of Species. But, as expected, IBM’s Many Eyes researchers Fernanda Viégas and Martin Wattenberg stole the show with an incredibly informative and entertaining presentation about the visualization of repetition in text. No summary can do it justice, so I urge you to watch the video when it is available.

After all that, we enjoyed a nice reception at the IBM Center for Social Software. I’m incredibly grateful to IBM for organizing and sponsoring this event, and to Martin Wattenberg for being so kind as to invite me. I’ll try to earn my keep in my 5 minutes at the “Ignite-style” session tomorrow morning.

8 responses so far ↓

  • 1 david yehaskel // Sep 21, 2009 at 11:54 pm

    Thanks for letting me attend vicariously through your tweets and posts. I’m saving this one for the morning coffee.

    Thanks!

  • 2 jeremy // Sep 22, 2009 at 12:26 am

    It is really refreshing to hear of a whole day spent pursuing and discussing exploration and visualization as it relates to information retrieval. If there was ever a large, important domain in which this type of IR was necessary, rather than the “known item”, navigational IR that has come to dominate most search today, government transparency would be it.

    I’m curious: Was there any discussion of the target user of the Open Government Initiative? I.e. was it geared more at journalists and watchdog groups? Or at better informed/participatory citizens? Or was there any discussion about how tools could be made to appeal to a wider/mass audience?

    I guess what I’m asking is: What happens when the unwashed web masses meet the larger exploratory potential of open government? Is there an impedance mismatch when the two meet up, and was there any discussion about how to overcome that?

  • 3 dave fauth // Sep 22, 2009 at 8:58 am

    Jeremy et al.,

    There has been recent discussion of open data standards at the Web 2.0 expo in DC earlier this month. You can see some of the presentations online at http://gov2events.blip.tv/posts?view=archive.

    I’m not sure most people understand the audience or what the desired outcome is. Right now it seems as if we are making data available without understanding how or why it will be used.

    I believe that the unwashed web masses (mom, my wife) for the most part will never explore the raw open data. Rather, organizations such as the Sunlight Foundation will explore the data and publish their analysis, allowing people to make their own decisions.

    Where I would like to see more open data is at the local level (county schools and government) where I can analyze how they are prioritizing and spending the local dollars. Santa Cruz, CA was able to use some open source software to set up a site that enabled the community to participate in the budget deficit discussions in that city.

  • 4 Daniel Tunkelang // Sep 22, 2009 at 9:57 am

    Glad folks are enjoying my relentless tweets–I was tweeting so much yesterday that I was afraid I’d killed Michael Jackson!

    Jeremy: their main concern is making data available generally. I don’t think they’ve targeted beyond that. In fact, one of the open questions (which I asked myself) is what metrics will be used to measure success. But I agree with Dave that, in practice, mass access will be mediated through NGOs.

  • 5 jeremy // Sep 22, 2009 at 10:28 am

    I believe that unwashed web masses (mom, my wife) for the most part will never explore the raw open data. I agree with Dave that, in practice, mass access will be mediated through NGOs.

    So even for something like Many Eyes, that would translate, practically, into Many NGO Eyes, rather than Many Unwashed Eyes?

    I’m not saying that’s bad. In fact, I think this sort of govt scenario provides the opportunity to develop expert tools for the expert searcher…a great opportunity for HCIR research and development.

  • 6 Transparent Text Symposium: Day 2 | The Noisy Channel // Sep 23, 2009 at 12:33 am

    [...] RSS   ← Transparent Text Symposium: Day 1 [...]

  • 7 kellys // Sep 23, 2009 at 8:02 am

    thanks for your tweets and blog posts about #tt09.

  • 8 Privacy, Pseudonymity, and Copyright | The Noisy Channel // Sep 29, 2009 at 4:49 pm

    [...] lunch conversation during the Transparent Text symposium about transparency in social media (also a hot topic in the Ethics of Blogging [...]
