Category: General

General posts, typically analyzing HCIR issues.

How Do I Blog So Much?

I received a nice email from a reader today. As per my mantra of “when in doubt, make it public”, I thought I’d post an answer here.

Subject: Daniel, You Blog So Much…

…how do you do it? I am a regular reader of your blog, and I am amazed at the frequency of your posts. You seem to write posts faster than I can think of topics. Every time I blog it turns into a huge time sink. You seem to succeed despite having what I assume is a demanding job and family. Do you have any tips/tricks/strategies/methodologies?

First, the constraints: job and family.

My job is demanding, though I’ve been lucky that I have an enlightened employer that sees the value in the time I spend blogging. Even though this isn’t a corporate blog–perhaps because it isn’t a corporate blog–it does help promote Endeca as a thought leader. That doesn’t mean I can prioritize blogging over other work, but it does mean I can devote an hour a day to my blog without incurring the wrath of our accountants or investors.

Family is trickier. I try not to blog between 6pm and 11pm, or during the day on weekends. I even went without blogging for a week! But there’s no question that one of the reasons I can spend so much time blogging is that my wife takes on a disproportionate share of the parenting load. If she starts blogging, we’ll have to rebalance.

Given those constraints, how do I manage the frequency? I post roughly daily, sometimes more than that. I have some topics queued up, but many of my posts are quick reactions to what I read, either on Techmeme or on other people’s blogs. Sometimes I’m lucky and someone emails me material that is great blog fodder (as is the present case); other times, I simply blog about what I’m doing.

Do I have any tips/tricks/strategies/methodologies? Read interesting stuff that other people write, and write about your reactions to it. Work on interesting problems, and talk about them when you can. Instead of writing an email or ranting to a co-worker, put those same thoughts into a blog post.

Perhaps most importantly, cultivate a passion for unsolved problems, so that the world reminds you of them at every turn. It’s the same advice I’ve heard given to researchers, except that the threshold for publishing a blog post is a lot lower than that for submitting a publication for peer review.

And it’s like Steve Jobs says: real artists ship. Blog posts aren’t the Great American Novel you spend your life perfecting. Some of the best blog posts are reactions to current news stories. The value of timeliness is a great forcing function to make you write something good enough and publish it while the story is fresh in people’s minds.

General

The Macroeconomics of Information and Attention

Post author By Daniel Tunkelang
Post date December 16, 2008

Note: this post is cross-posted at the Panjiva blog, which discusses issues affecting the global trade community. I’ve recently joined Panjiva’s advisory board (alongside Panjiva investor and renowned economist Larry Summers), and I’m proud to be helping this new venture transform global trade by providing an unprecedented level of transparency about the strengths and weaknesses of companies engaged in it. Learn more about Panjiva at their web site or blog.

While I’m a neophyte on matters of global trade (fortunately fellow MIT alum and Panjiva investor Larry Summers is a bit more qualified on those matters), I do know a thing or two about how people interact with information. So it’s my delight to share a short series of posts on the macroeconomics of information and attention.

In Brief Principles of Macroeconomics, Greg Mankiw lists ten principles of economics that he divides into three groups:

How People Make Decisions

People Face Tradeoffs.
The Cost of Something is What You Give Up to Get It.
Rational People Think at the Margin.
People Respond to Incentives.

How People Interact

Trade Can Make Everyone Better Off.
Markets Are Usually a Good Way to Organize Economic Activity.
Governments Can Sometimes Improve Market Outcomes.

How the Economy Works as a Whole

A Country’s Standard of Living Depends on Its Ability to Produce Goods and Services.
Prices Rise When the Government Prints Too Much Money.
Society Faces a Short-Run Tradeoff Between Inflation and Unemployment.

Nobel Laureate Herb Simon articulated the concept of an attention economy in his 1971 article, “Designing Organizations for an Information-Rich World”:

in an information-rich world, the wealth of information means a dearth of something else: a scarcity of whatever it is that information consumes. What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention and a need to allocate that attention efficiently among the overabundance of information sources that might consume it.

In the next few posts, I’ll try to apply Mankiw’s principles to Simon’s conception of an attention economy to establish a macroeconomics of information and attention.

Continue: How People Make Decisions

General

What You Need To Know About Social Media

Post author By Daniel Tunkelang
Post date December 16, 2008

Today I delivered an internal presentation at Endeca entitled “What You Need To Know About Social Media” with the goal of setting a baseline of what every technologist should know about this brave new world.

As proof that I drink my own kool-aid / eat my own dog food, I’m offering it here for public consumption. I suspect that a lot of it will be familiar to readers here, but you never know. I also encourage you to re-use this presentation in your own organizations to convince skeptics that social media are not just a bunch of hype.

Enjoy!

http://static.slideshare.net/swf/ssplayer2.swf?doc=what-you-need-to-know-about-social-media-1229455704538543-1&stripped_title=what-you-need-to-know-about-social-media-presentation

General

Making Government Information More Accessible

Post author By Daniel Tunkelang
Post date December 16, 2008

A co-worker tipped me off to a public, non-profit service that deserves all the publicity it can get. It’s called Public.Resource.Org, run by technologist and public domain advocate Carl Malamud, and devoted to “making [U.S.] government information more accessible”.

Not sold on the value of this service yet? Consider this example of their good deeds.

Public Access to Court Electronic Records (PACER) is an electronic public access service that allows users to obtain case and docket information from Federal Appellate, District and Bankruptcy courts, and the U.S. Party/Case Index via the Internet. These documents are works of the United States government and are in the public domain. But, for reasons that have no place in a 21st century democratic government, PACER charges $0.08 / page to download copies of these records–more than most of us pay for analog photocopying!

Enter the PACER recycling program, run by Public.Resource.Org:

Just upload all your PACER Documents to our recycling bin. Click on the recycle bin and you’ll be presented with a dialogue to choose files to upload. Then, just hit the “Start Upload” button and you’ll hear the sounds of progress as your documents get reinjected into the public domain.

We’ll take the documents, look at them, and then put them onto bulk.resource.org/courts.gov/pacer for future distribution. This is a manual process and you won’t see your documents show up right away. But, over time, we hope to accumulate a significant database of PACER Documents.

They claim to have saved $9,104.08 so far. That’s hardly enough to, say, bail out the auto industry, but it’s a step in the right direction. More importantly, efforts like this instill a culture I wish we could take for granted–namely, that public government documents should be generally available to the citizenry. Like most technologists, I have a libertarian streak, and I’m the first to defend the private sector. But this is a case where the goods themselves belong to the public. No one should profit at the expense of an informed citizenry.

p.s. Perhaps this effort will interest people trying to assemble corpora for information retrieval research.

General

Is SOA Enabling Intelligent Agents?

Post author By Daniel Tunkelang
Post date December 14, 2008

I recently blogged about software agents, mostly musing about how to reconcile their inherent rationality with our lack thereof as human beings.

But today I noticed an article by John Markoff in the New York Times entitled “A Software Secretary That Takes Charge”, which considers some companies trying to build services based on such agents. The article called my attention to the recent death of AI pioneer Oliver Selfridge, who coined the term “intelligent agents” and devoted much of his career to trying to make them a reality.

Markoff notes that “efforts to build useful computerized assistants have consistently ended in failure”, which raises the question of why any student of history is still investing in this area. Markoff quotes an answer from Rearden Commerce founder / CEO Patrick Grady:

The promise of the Web 2.0 era of the Internet has been the interconnection of Web services. Mr. Grady says he has a far easier task today because the heavy lifting has been done by others.

“This is the connective tissue that sits on top of the Web and brings you more than the sum of the parts,” he said. “I set out to deliver on the longstanding ‘holy grail of user-centric computing,’ a ‘personal Internet assistant.’”

In other words, intelligent agents are possible now because of Web 2.0 and service-oriented architecture. An interesting theory–and I can certainly accept it in theory. But I’m curious how it plays out in practice. It seems to me that there’s a lot of “heavy lifting” still waiting to be done.

General

Should We Donate Attention To Support Bloggers?

Post author By Daniel Tunkelang
Post date December 14, 2008

In a post today entitled “The joke of advertising on social media”, Steve Hodson goes through some familiar territory in the challenges social media companies face in coming up with a viable business model:

Social media is built around the idyllic concept that content should be free.
Social media companies insist to advertisers that they have a willing flock to make their millions off of.
But early adopters of social media behave like a swarm of pissed off wasps at the mere mention of advertising.
Case in point: many people (myself included) rejected Google’s Chrome browser because it didn’t support an ad blocking plug-in.

So far his argument is–or should be–uncontroversial. And I agree with his prediction that “Social media in all its goodness will only survive if people like you and me can contribute but know that we can pay our bills at the same time.” There’s no such thing as a free lunch, and bloggers cannot live by page views alone.

But then he continues: “users of social media…have to stop being so greedy with their attention span.” This is a slightly different take on the “no free lunch” argument than I expected. If I understand him correctly, he is suggesting that we click on ads out of a sense of obligation, to make up for the fact that we are receiving content for free.

If my understanding is accurate, then I have to part ways with Hodson. If bloggers want to put out a tip jar and encourage readers to leave tips, that’s fine. And if they want to make it clear that clicking on an ad is, from their perspective, equivalent to leaving a few pennies as a tip, that’s fine too. There’s nothing wrong with asking users to be generous.

But the whole point of tipping is that it isn’t out of a sense of obligation. Tips are supposed to be on top of paying a fair market value for services rendered. I know that isn’t always the case; in the United States, the minimum wage law calls out the existence of occupations where tips customarily represent a substantial portion of an employee’s income. But that doesn’t make it a good idea, let alone an example for new markets.

I realize that individual bloggers are hardly in a position to unilaterally change the prevailing business model accepted by a culture where people expect information to be free. And so we run through the sequence Hodson describes, and everyone is frustrated: underpaid writers, advertising-inundated readers, and profitless investors. But at least reality is finally sinking in, and I am personally optimistic that the days are numbered for the dominance of the advertising-supported model. Call it an audacity of hope.

General

Transparency vs. Simplicity

Post author By Daniel Tunkelang
Post date December 12, 2008

As regular readers know, I am a strong advocate for transparency in any system where people interact with machines. In fact, such transparency is a core HCIR value, since communication depends on the clarity with which a message traverses the noisy channel of human-computer interaction.

So I was a bit taken aback by a recent blog post in which Stephen Arnold seemed to attack the notion that an effective search engine could be transparent. But a more careful reading led me to believe that he’s reacting, perhaps a bit too cynically, to the increased currency that the word “transparency” has in marketing literature.

Let me try to cut through the marketing hype. Transparency is more than a buzzword to sell software. It is a core value that imposes significant constraints on how a system can act. If a system is not bound by transparency, then it is free to respond to user inputs arbitrarily, unconstrained by any requirement to offer users insight into the basis for its response. In contrast, a transparent system must produce user-consumable explanations of its output. A transparent system can’t get away with saying “if I told you, I’d have to kill you.”

In fact, a transparent system might have to reject a possible response to a user because it can’t present an explanation for the response that the user will understand. For this reason, some machine learning purists reject transparency as overly constraining, and prefer approaches that simply optimize an objective function that, in all likelihood, is completely opaque to the user–and possibly even to the system developer.

Why is transparency so important in systems that support information seeking, i.e., search and information retrieval systems? Because any system that requires people to interact non-trivially with machines is fraught with communication challenges. Best-effort attempts to extrapolate user intent from a query–often a query comprised of only a couple of words–are beyond AI-hard; they’re ESP-hard. While all systems have to accept that they’ll misread users’ intentions a significant fraction of the time, transparent systems at least offer users the opportunity to work with the system to get back on track.

To be clear, implementing transparency isn’t simple. It’s like Mark Twain said: short letters are often harder to write than long ones. In a related vein, the world’s greatest minds aren’t always the world’s greatest communicators. And what holds true for people holds even more so for machines (or the people who program them): it’s hard to develop algorithms that deliver useful results and provide human-consumable explanations for them.

I understand Arnold’s frustration with vendors. And I won’t claim that Endeca always gets it right, though I think (and have been told) that we do better than many in communicating how our technology works. But there is no question in my mind that information seeking support systems have to become more transparent if we want them to work in the real world.

General

Computational Information Design

Post author By Daniel Tunkelang
Post date December 12, 2008

Tonight I had the good fortune to attend a talk by Ben Fry on Computational Information Design at the Broad Institute of MIT and Harvard. Ben Fry is one of those rare human beings whose work spans from the heart of academia (he’s worked with Eric Lander on visualizing genetic data) to popular culture (his work appears in Minority Report and The Hulk). And he’s an outstanding speaker.

The content of his talk reflected his dissertation work at the MIT Media Laboratory, his postdoc work at the Broad Institute, and some of his more recent work as a designer and consultant. I can’t do justice to the talk, which unfortunately is not available in any recorded form. But I do suggest you seize the opportunity to hear him speak, should it come your way. He communicates the power of visualization through examples, in a way that conveys both their practical value and their beauty.

The Q&A session was almost as long as the talk, and probably could have gone on indefinitely if the organizers hadn’t finally cut it off. Suspecting that I was one of the few non-academics in the audience, I asked two eminently practical questions: how do you know that a visualization is effective, and how do you guard against a visualization skewing your perception of the data?

Fry’s answers were incisive. He judges the effectiveness of a visualization based on whether people give up their previous tools to use it. And he selects problem areas where he sees a significant opportunity to improve the state of the art. That way, the difference in adoption is so obvious that you don’t need to perform user studies to observe it.

As for the concern about visualization skewing perception of the data, he acknowledges it as valid but points out that we don’t seem to raise the same concern with non-visual (e.g., textual) data presentation. Somehow we are especially suspicious of aesthetic representation, a sort of “don’t hate me because I’m beautiful” bias. He adds that the risk of design skewing our perception is dwarfed by the cost of not designing at all.

Visualization is a tricky subject, and I’ll freely admit that I’m underwhelmed by much of the work I’ve encountered. Perhaps my past work in information visualization makes me a particularly harsh critic. But Fry presents a compelling picture–or rather, a compelling video, since his work is full of motion. My only complaint is that he hasn’t explored the world of search and information retrieval. His work seems to beg for application in that domain. Food for thought.

General

This is not a corporate blog

Post author By Daniel Tunkelang
Post date December 10, 2008

To paraphrase René Magritte, ceci n’est pas un blogue corporate (this is not a corporate blog).

Why do I bring this up? Because today I saw a post by Richard MacManus on ReadWriteWeb entitled “Report: Corporate Blogs Not Trusted” and a similar post by Joe Wilcox about how to “Make Your Corporate Blog Believable”. They cite a report from Forrester that company blogs represent the least trusted information source, down at 16%. Actually, personal blogs don’t fare much better at 18%, but I’d like to use this report as a pretext to talk about what it means to blog as an industry professional.

Frankly, I don’t think corporate blogs, at least as they are conventionally understood, are a good idea. Companies put out press releases that no sane person would trust as an objective information source. A corporate blog is just a repackaging of a press release web page, trying to masquerade as something more hip. I don’t think it fools anyone, and I’m not aware of any corporate blogs other than the Official Google Blog that have significant readership.

Bloggers are people who speak with individual voices. And industry professionals are still people, regardless of their corporate affiliations. I am no more the voice of Endeca’s public relations department than Greg Linden is the voice of Microsoft’s or Matt Cutts the voice of Google’s.

Of course, I have my point of view, which unsurprisingly has some alignment with my employer’s overall vision. When I advocate for HCIR, I don’t make excuses for the fact that HCIR underlies Endeca’s approach to information access. But I speak as an individual and in my own voice. When I blog, I put my own credibility on the line, and I cultivate a reputation that extends beyond my corporate affiliation.

I think the interesting question for companies is not whether they should publish corporate blogs, but rather whether they should encourage their employees to publish personal blogs that relate to the work the company does. As someone who has been involved in the development of Endeca’s core intellectual property, I understand the reservations that companies have about letting their employees publish. But I think that companies are often too conservative, and incur an enormous opportunity cost in the name of protecting trade secrets. Letting employees blog (and, more generally, publish) not only provides the companies with free marketing, but also provides employees with an avenue for personal development.

I’d be curious to hear perspectives from readers here who work for companies. Perhaps I’m lucky to work for an enlightened employer; do most corporate citizens get the memo from Legal saying that blogging is something only the marketing department should do?

General

Overwhelmed by Email?

Normally, I don’t post about press releases that people email me. But in this case, the title, “Half of Americans Are Overwhelmed by E-Mail”, hit far too close to home. Having spent the day catching up on a week of email, I’m feeling more than a little overwhelmed. And it’s made me think hard about imposing more discipline on how I manage email.

I’m in no immediate danger of declaring email bankruptcy, but I have reached a point where my ad hoc approach to managing email–particularly to checking email as it arrives–costs me so much in productivity that I am considering reducing the frequency with which I check email to once or twice a day.

One might ask if there’s anything unique about email as a source of context switching. Don’t the same issues apply to news feeds, Twitter, etc? And there’s instant messenger–which is intended to trigger an immediate context switch. Why single out email?

My suspicion is that email satisfies an unholy mix of properties:

A substantial fraction of email is personal and important, and there are no reliable automatic ways of identifying this fraction.
The sender’s expectation of how long to wait for a response varies widely–from as soon as possible (e.g., one-line bodyless emails used as instant messages) to days or even indefinite (e.g., an FYI email that does not require a response).
The typical sender sees email as the least invasive way to communicate, and therefore uses it as the default means of doing so.

The result: you (or at least I) end up with a relentless queue of email, faced with a choice of looking at all of it frequently, or likely deferring something urgent.

Of course, there are conventions, like marking emails as urgent, that are intended to sort out some of the above. But it isn’t realistic to expect everyone to use these consistently, at least not this late in the game.

Perhaps the answer is, as was suggested in an earlier post, to make it public. Specifically, Tantek suggests to “Move as much 1:1 communication into 1:many or 1:all mediums.” At some level, that is counterintuitive–after all, doesn’t that just make my problem everyone’s problem? But the key is that public communication sets different expectations. I might know the answer to your question, but so might any number of other people, so let’s balance the load.

It’s a nice idea, though it’s not clear how anarchic 1:many and 1:all mediums can accomplish this load balancing efficiently. But arguably that’s just an implementation detail. The first step is to calibrate the specificity of distribution to the specificity of the information / communication need.

Where does this leave me and others who are overwhelmed by email? For now, stuck with heuristics like disciplined management. For the long haul, advocating for more scalable social norms.

p.s. Ironically, this post is a public response to a private email.