Categories
General

A Scaling Challenge for Twitter Search

The other day, I explained why, as far as I can tell, Twitter’s existing search functionality isn’t that hard to implement. In a subsequent post, I argued that Twitter is not a search engine, an opinion that seems to place me in a minority of the blogosphere, albeit a substantial one.

Today I see that Twitter is having some trouble with what strikes me as a core constituency: SXSW attendees.

Daniel Terdiman at CNET writes that “At SXSW, attendees confront Twitter saturation”:

At SXSW, the standard is for everyone to include the tag “#sxsw” in their tweets. For example, on Friday, I was looking for sources for a different story and tweeted, “If you are launching an iPhone app at #sxsw, or know someone who is, please let me know. Thanks!”

That’s a great convention because it allows anyone wanting to know what’s going on to search Twitter for posts using any search term important to them.

I did a search for the “#sxsw” tag on Saturday afternoon and found that there had been 392 tweets with the term in just the previous 10 minutes. That number mushroomed to more than 1,500 in the previous hour.

Large volumes of results wouldn’t be such a problem if users had a way to summarize, navigate, and explore them. But that will take a search engine that offers more than reverse-date ordering of results for Boolean queries.
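As a minimal sketch of what such summarization might look like, the snippet below counts the hashtags that co-occur with a query tag, which could serve as navigable facets over a large result set. The function name, the regular expression, and the sample tweets are all illustrative assumptions; nothing here reflects how Twitter's search actually works.

```python
import re
from collections import Counter

# Assumption: a hashtag is '#' followed by word characters.
HASHTAG = re.compile(r"#\w+")

def cooccurring_hashtags(tweets, query_tag, top_n=5):
    """Count hashtags that appear alongside query_tag in matching tweets."""
    query_tag = query_tag.lower()
    counts = Counter()
    for text in tweets:
        tags = {t.lower() for t in HASHTAG.findall(text)}
        if query_tag in tags:
            # Each tweet contributes at most once per co-occurring tag.
            counts.update(tags - {query_tag})
    return counts.most_common(top_n)

# Made-up sample data, for illustration only.
sample_tweets = [
    "Launching an iPhone app at #sxsw? Let me know! #iphone",
    "Great panel on search #sxsw #search",
    "Anyone at the #sxsw #iphone meetup tonight?",
    "Unrelated tweet about lunch #food",
]

facets = cooccurring_hashtags(sample_tweets, "#sxsw")
print(facets)
```

A user facing 1,500 results in an hour could then narrow the stream by picking one of the returned tags instead of scrolling through everything in reverse-date order.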

I wonder if Abdur Chowdhury and his team are working on this problem. Perhaps if Twitter is willing to make some of the historical logs available for download (they’re already public, just not easily downloaded), some of us HCIR wonks could implement interfaces on top of it to explore the possibilities. Abdur, if you read this and are interested, please let us know!

Categories
General

Challenge: Blog + Twitter vs. Aardvark

I asked Aardvark the following question this afternoon:

Trying to track down an animated short where a bunch of critters invent a machine to discover where they are, only to learn that they are a dream inside someone’s head. They ultimately turn into pink flamingos as the dream evolves. I remember them all chanting “Flick the switch!” when the invention is unveiled. No luck tracking it down with my web searching skills. 😦

The correct answer came within 6 hours. I’m curious if anyone who reads this message will find it independently–without using Aardvark themselves. If not, I’ll be forced to give Aardvark a very glowing review for answering a question that has been plaguing me for years. One way or the other, I’ll post the answer tomorrow night.

Added: Check out the rematch!

Categories
General

Clay Shirky: Save Society, Not Newspapers

There is so much writing about the impending demise of the newspaper industry that it’s becoming easy to tune it out. But it’s refreshing to see a cutting analysis like the one Clay Shirky makes in “Newspapers and Thinking the Unthinkable”.

He starts with an anecdote about how, in the early 90s, the Knight-Ridder newspaper chain was fighting the unauthorized online distribution of a Dave Barry column. He quotes Gordy Thompson, who then managed internet services at the New York Times: “When a 14 year old kid can blow up your business in his spare time, not because he hates you but because he loves you, then you got a problem.” His most cutting remark:

The newspaper people often note that newspapers benefit society as a whole. This is true, but irrelevant to the problem at hand; “You’re gonna miss us when we’re gone!” has never been much of a business model.

His main point is this: “Society doesn’t need newspapers. What we need is journalism.” And he simply doesn’t see newspapers surviving as the way to deliver journalism.

I’m not so sure, but I think Shirky has his priorities right. We should be worrying about the end, not the means. I don’t subscribe to his and Jeff Jarvis’s faith-based optimism; I’m not convinced that the market demands journalism enough to sustain it. Witness the challenge of sustaining other public goods, such as education and energy conservation. My own view is that journalism’s best hope is monetizing participation. I’m actually in the middle of writing an article about it; I’ll let you know if and when it gets published.

Categories
General

Vivisimo, Please Keep It Real

Let me preface this post with a clear disclaimer: I am the Chief Scientist of Endeca, a leading enterprise search vendor, but the views I express on this blog, including those about Endeca’s customers, partners, and competitors, are my own.

And one of my strong personal opinions is that marketing campaigns should be honest. One of the blogs I read is Search Done Right, a corporate blog maintained by Vivisimo.  Regardless of my opinions about Vivisimo’s technology, I am a fan of their marketing department. They’ve achieved visibility disproportionate to their market share–in no small part by promoting interface innovation. I also have met their CEO and CTO, and I think they’re both great guys.

But today, I saw a post entitled “New Enterprise Search Pundit on the Scene?”, which I copy in full below:

When I logged into my Vivisimo email this morning, I had a message from a guy named Stan. He must have gotten my contact info from this blog or the LinkedIn Enterprise Search Professionals group as someone interested in search and sent me the manifesto below since I don’t recall meeting him at a recent Gartner PCC show or an Enterprise Search Strategy (ESS) event. At any rate, I bookmarked his site and will let you know when it goes live – should be interesting. Maybe he has the makings of a search industry pundit in him…

I don’t know about you guys, but I’m getting really sick of not being able to find things at my company. I mean, I have a hard enough time finding my coffee cup each morning because someone in Marketing (I know it’s you, Jan) keeps moving it. But when it comes to data, it’s just impossible to find anything. Think about it for a second. On when I’m on the Internet, I do a Google search for “product requirements document” and get more than 22 million results in less than half a second. Try doing that on your intranet. I mean seriously. Can you even find a search box on your intranet? My last company didn’t even have one.

And when there is a box, what does it really search? A couple of intranet pages? What happens if you have multiple intranets? I bet your search only looks at one. Then you have to use a separate desktop search. And another search for email. And another for your external website. All of this searching is stopping us from getting any work done.

Then I heard about this enterprise search thing. Frankly, it sounds too good to be true. Searching across multiple repositories from a single search box. Presenting results into topical clusters. Tagging and rating documents to impact future search relevancy. Sharing results with other users. All of this while respecting my individual security rights. Seems like pie in the sky to me. Is anybody actually doing this, or is it just some marketing hype?

So I’m starting my own website, www.meetstan.com, to figure things out for myself. The site will be up March 16. Come by and tell me what you think.

Curious, I went to Whois.com and looked up meetstan.com. As you can verify for yourself, the registrant is none other than Vivisimo.

Am I naive to be shocked at this sort of marketing gimmick? Perhaps. But I’m sensitive because there aren’t that many people who understand enterprise search, and there’s a lot of concern about analysts offering less than independent opinions. I assume that Vivisimo isn’t planning to use the site to promote a shill analyst, but rather is using this blog post to create pre-launch buzz around a marketing portal.

Please, Vivisimo, don’t play these sorts of games. Given how uninformed so many people are about enterprise search, many are likely to take a hoax like this seriously. That’s bad for Vivisimo’s reputation, and for the field as a whole. I’m sure this was an innocent mistake. I hope it will be quickly resolved, and that you will go back to promoting your technology and vision without resorting to such gimmicks.

I also encourage anyone from Vivisimo to comment here and offer clarification.

Categories
General

I Have 10 Aardvark Invites

The kind folks at Aardvark appreciated my write-up and sent me an invite, and, as per the usual viral rules, that means I can invite 10 more people. Let me know if you’re interested via the comments. I might be offline much of the day, but I’ll process requests in the order received.

Categories
General

Is the Aardvark a Social Animal?

A colleague alerted me to Aardvark, a social search service scheduled to launch during SXSW that lets users ask questions via instant messenger or email and receive live answers from their social network. Check out recent coverage by John Battelle and ReadWriteWeb.

The initial press is quite positive. In particular, ReadWriteWeb compares it favorably to asking questions on Twitter:

In our internal tests, we realized that a lot of the answers often rivaled those we received when asking our Twitter network. Thanks to the fact that Aardvark automatically routed our questions to people with the right expertise, all the answers we received so far were top-notch. In case you didn’t like the answer (or if it was obscene), you can flag it and rate it on the service’s website.

I haven’t experienced the service, so I’m in no position to evaluate it. I can’t say I’ve been overwhelmed by social question answering on Google (R.I.P.), Yahoo, or LinkedIn. Asking questions on Twitter works well for me, but that’s probably because I have a substantial number of real, knowledgeable followers (the TunkRank is strong with this one!).

But what I don’t understand is Aardvark’s incentive system. I’ve looked at their blog and white paper, but I don’t see any mention of tangible or intangible incentives. Perhaps the incentives are reputation and the interaction itself.

In any case, I’m cautiously optimistic. If anyone has managed to get an invite and can share, I’d greatly appreciate a chance to try it out.

Categories
General

Media Cloud: Watch, Analyze, Learn

A couple of months ago, Tom Tague, who leads the Calais initiative at Thomson Reuters, presented at the New York Semantic Web Meetup. One of the projects he alluded to was announced today and reported in ReadWriteWeb: “Media Cloud Leverages Calais to Track News Trends”:

Media Cloud, a new project from the Berkman Center at Harvard University, has an ambitious goal: It will do the heavy lifting of analyzing stories from thousands of traditional news sources, analyzing the semantics of the content through Calais (covered here and here), and then providing tools to quickly get trending results.

The article also points to an interview with project developer Ethan Zuckerman by the Nieman Journalism Lab.

What particularly excites me about this project is the possibility of comparing how different news organizations–or, better yet, different clusters of similarly biased news organizations–select and cover news. Ever since hearing Miles Efron present “The Liberal Media and Right-Wing Conspiracies: Using Cocitation Information to Estimate Political Orientation in Web Documents” at CIKM 2004, I’ve been waiting for someone to take the next step and build analysis tools to compare the media “conspiracies”. For example, what stories are covered in the New York Times, but not in the National Review–and vice versa? Which details appear only in papers associated with one end of the political spectrum?
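The simplest version of that comparison is a pair of set differences. Here is a minimal sketch, assuming each outlet's coverage has already been reduced to a set of story identifiers; the outlet names, story labels, and the `coverage_gap` helper are all hypothetical, and mapping real articles to shared story identifiers is the hard part this sketch skips.

```python
def coverage_gap(coverage, outlet_a, outlet_b):
    """Return (stories only in outlet_a, stories only in outlet_b)."""
    only_a = coverage[outlet_a] - coverage[outlet_b]
    only_b = coverage[outlet_b] - coverage[outlet_a]
    return only_a, only_b

# Made-up placeholder data: outlet name -> set of story identifiers.
coverage = {
    "outlet_a": {"story-1", "story-2", "story-3"},
    "outlet_b": {"story-2", "story-3", "story-4"},
}

only_a, only_b = coverage_gap(coverage, "outlet_a", "outlet_b")
print(sorted(only_a), sorted(only_b))
```

The same function works for clusters of outlets if each cluster's sets are first merged with a union.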

I don’t know that most people care about these questions. In fact, I suspect they don’t; my experience is that few people are interested in hearing viewpoints that challenge their own. But I fear that we are being personalized to death–that our control over what we read leads to the unfortunate behavior that we only let content through the filter if it reinforces our prejudices.

I know that Media Cloud won’t solve this problem on its own. But at least it’s a great tool for those who do want to broaden their perspectives, and I have hope that intellectually honest people will have the courage to learn from it.

Categories
General

Making Ads More Interesting…for Users or for Google?

Google announced today that:

We think we can make online advertising even more relevant and useful by using additional information about the websites people visit. Today we are launching “interest-based” advertising as a beta test on our partner sites and on YouTube. These ads will associate categories of interest — say sports, gardening, cars, pets — with your browser, based on the types of sites you visit and the pages you view. We may then use those interest categories to show you more relevant text and display ads.

They do realize that this announcement raises lots of hackles in a world that is increasingly distrustful of Google’s accumulation of data and its control over so much of our online experience. They offer the following as grounds for trusting them:

  • Transparency – We already clearly label most of the ads provided by Google on the AdSense partner network and on YouTube. You can click on the labels to get more information about how we serve ads, and the information we use to show you ads. This year we will expand the range of ad formats and publishers that display labels that provide a way to learn more and make choices about Google’s ad serving.
  • Choice – We have built a tool called Ads Preferences Manager, which lets you view, delete, or add interest categories associated with your browser so that you can receive ads that are more interesting to you.
  • Control – You can always opt out of the advertising cookie for the AdSense partner network here. To make sure that your opt-out decision is respected (and isn’t deleted if you clear the cookies from your browser), we have designed a plug-in for your browser that maintains your opt-out choice.

Despite the predictable reactions from privacy groups, I don’t know that I find behaviorally targeted ads any worse than ads in general. Indeed, Google is probably right that users will find the ads more relevant; after all, Google has every incentive to increase click-through rates. Privacy groups are right to call out Google’s hypocrisy in changing its tune on behavioral advertising, but so what? If Google’s going to live and die by the ad-supported model, and if the overwhelming majority of the online population is on board with it, then it’s to be expected that Google will optimize for ad revenue.

Of course, my idea of choice and control is to use an ad blocker (specifically, the CustomizeGoogle Firefox extension), and I think Google takes a very narrow view of transparency. Still, I’m amused that Google is drawing so much heat for what seems to me a minor, incremental change.

Well, a minor change for users. Perhaps it’s not a coincidence that Google’s stock is up 3% today. $3B in market cap is a significant increment, even for Google.

Categories
General

Exploring Semantic Means

I gave a talk last week at the New York Semantic Web Meetup entitled “exploring semantic means”, and I thought readers here might want to peruse the slides. You can see more pictures of the event here, as well as the slides Ken Ellis presented about the work he’s doing at Daylife. I was also interviewed for a few minutes after the talk; I’ll post a link to the podcast when it’s available.

Categories
Uncategorized

More Adventures with PR People

A few weeks ago, I wrote a reply to all PR people who seem to think that, because I blog, they should pitch their companies’ press releases at me. I’m not sure whether to be flattered or annoyed.

What I’ve decided to do is share some of my experiences with readers–my top 3 that I haven’t already obliterated beyond recovery. Hopefully these same PR people will learn that indiscriminate marketing isn’t always a net gain. I’ve removed any personally identifying information about the senders; I don’t want vigilante or mischievous readers to get any ideas. Well, at least not to act on them. Here they are, in reverse order of absurdity. Drum roll, please.


#3) The GodTubes Must Be Crazy

Hi Daniel,

I want to introduce you to a new social network called tangle.com. Originally launched in 2007 as GodTube.com, a video sharing site that set the record as the fastest growing Web site in the U.S. during its first month of operation, it attracted 2.7 million users a month. Now, tangle has expanded to become the go-to Web site for the family-friendly community to safely interact on a full social network. Below is the press release that went out this morning, announcing the tangle.com launch.

I’d be happy to arrange a phone interview for you with tangle CEO, Jason Illian, to discuss tangle.com. Jason can provide a unique look into family-friendly social media and how tangle.com differentiates itself from other social networking sites. Additionally, Jason is the author of “MySpace®, MyKids: A Parent’s Guide to Protecting Your Kids and Navigating MySpace.com.”

Please feel free to shoot me an e-mail at XXXXXX@XXXXXXXXXXXX.com or call me at (XXX) XXX-XXXX for more information on tangle.com or to schedule time to speak with Jason.

Thanks for your consideration.

Best,
XXXXXX  for tangle.com


#2) Who Died?

Hello Daniel,

Last week, WSJ writer Jeffrey Zaslow reported that, starting next month, the Detroit News and the Detroit Free Press will be offering home delivery just three days a week. So, readers who’ve made a daily ritual of perusing obituaries with their morning coffee — and who won’t go out to buy the paper or go online — aren’t necessarily going to learn about the deaths of their acquaintances. See article here: http://online.wsj.com/article/SB123431793199571075.html

But what if there was a technology that kept readers informed about obituary news, anywhere and at any time of day?

That’s where Tributes.com steps in, the comprehensive resource for local and national obituary news and personal tributes. Tributes.com has over 82 million current and historical death records dating back to 1936.

Tributes.com makes sure that consumers can stay informed 24/7 and connected with accurate obit email alerts for any town in the US, alumni, family name, or military unit. Users can set up alerts based on the zip code they currently reside in as well as previous locations they have lived in, and when someone has passed away in their community, an email will be sent to them with names of those who have passed. Those who like to read the morning obits as much as they like their morning cup of joe won’t have to worry about missing the opportunity to leave a message of condolence or to attend a funeral because of missing the news in the paper.

Like it or not, many newspapers are cutting back on home delivery and people want their news quickly and accurately. Tributes.com is the best alternative, go-to resource for obituary news, making sure no one is left in the dark about a passing.

A few interesting facts surrounding this include:

  • The obituary market as a $750M-$1B nearly untouched industry
  • Obituaries- “last man standing” – every other classified section has gone online and made millions (Match.com, eHarmony.com, EBay, Craigslist, Monster.com, etc.)
  • Newspapers lost $64.5 billion in market value in 12 months in 2008
  • 2.5 million people die in the U.S. every year, and 12,000 of those people are turning 50 every day

Please consider mentioning this in your blog. I would be happy to arrange an interview with Jeff Taylor, founder of Tributes.com and Monster.com, to speak about new online technology and a modern world is changing the face of the print obituary.

For more information or to arrange an interview, please contact me at (XXX) XXX-XXXX xXXX or email me at XXXXXXX@XXXXXXXXXXXX.XXX.

Thanks for your consideration.

Best,

XXXXXXX


And, finally, #1: a letter from the folks at Wolfram Alpha:

Hi Daniel,

I wanted to thank you for your interest in Wolfram|Alpha and for sharing our exciting news with your readers. The response has been fantastic.
We look forward to sharing more news about this new website soon!

If you haven’t already, please sign up to receive Wolfram|Alpha release news.
You can do so at:
http://www.wolframalpha.com/

Thanks again,
XXXXXX

XXXXXX XXXXXX
Wolfram Research, Inc.

I have to say, I was a bit stunned by this email–the only reasonable explanation was my recent post about Wolfram Alpha, which didn’t, in my view, merit such a grateful response. Bemused, I responded and volunteered to look at their technology with an open mind–and even under NDA–if they had anything they were willing to share. I’ll keep you all posted.

Anyway, I hope you all found this amusing. And to any PR people who found their materials reposted here, I hope you understand that unsolicited pitches are fair game. Perhaps this is actually what you hoped for. In that case, you’re welcome, and thank you for helping me entertain my readers.