Categories
Uncategorized

Fight the Spammers that Be!

Misery loves company, so I’m reassured to know that I’m not the only recent victim of the uptick in spam. Matt Hurst reports that Political Streams has recently been hit with a lot of LiveJournal spam. A current look at the site shows that the problem persists, at least for blogs.

Matt says that “minor modification to our spam filter should take care of it.” I hope so. But it’s clear that, as social media become increasingly important, spammers are taking note.

p.s. For folks too young to remember the 80s, the title is a play on this Public Enemy song.

Calling All New York Area CTY Alumni

My apologies to regular readers for this completely off-topic post. If you’ve never heard of CTY, feel free to get back to your regularly scheduled reading.

But if you are a CTY alum in the New York area and are interested in meeting your peers, please keep reading. CTY alumni coordinator Sarah Shelfer and alum Matt Mochary organized a gathering at the Pegu Club for 1980s CTY alumni in New York. We barely managed to fit around the table (new folks arrived as early birds rotated out), and all of us were excited at the prospect of renewing our connection to CTY and to one another. We’re still figuring out next steps, but the first one is to start finding more of one another. I’m hoping that this blog post helps spread the word.

If you are a CTY alum, even if you’re not in New York, and you’re interested in renewing your connection to CTY and the people who shared this formative experience with you, please contact Sarah at ctyalumni@jhu.edu. And, if you are in New York–or if you remember me from my three summers at Dickinson and Franklin & Marshall–please give me a shout!

To Advertise Or Not To Advertise

More from Greg on gems from CIKM:

Andrei Broder and a large crew from Yahoo Research had a paper at CIKM 2008, “To Swing or not to Swing: Learning when (not) to Advertise” (PDF), that is a joy to see for those of us that are hoping to make advertising more useful and less annoying.

Of course, folks like me dislike advertising enough that we install plug-ins like Adblock Plus and CustomizeGoogle to avoid ads entirely. I wonder if a good learning algorithm would spare me the trouble. But, more importantly, I wonder how far an ad-supported industry wants to go in making it easy for people to opt out of advertising.

Blogs I Read: Geeking with Greg

One of the great things about blogging as a social medium is that you quickly discover the most reputable people in your field. When I started blogging and participating in other blogs that dealt with topics related to information retrieval, I quickly discovered Geeking with Greg.

Greg Linden was the primary developer and designer of the Amazon.com recommendations engine, which is one of the most widely used–and perhaps the most famous–recommendation systems on the web. He tried his hand at a news and blog aggregator, Findory, that was based on collaborative filtering but ultimately gave up on that to join Microsoft Live Labs.

Greg has been blogging since 2004. His posts combine strong research interests with practical grounding. For example, he recently wrote about a paper at CIKM 2008 (by the way, he’s been great at blogging about the conference) on learning to rank using click data. Given my day job, I am sometimes involved in “bake offs” between our engine and those of competitors, and I’d be delighted to see prospects run evaluation experiments based on this paper.

Greg’s posts range from the theoretical (“Does the entropy of search logs indicate that search should be easy?”) to the practical (“Advertising, search, and drive-by malware”). He talks about great work coming out of Microsoft’s labs, but gives equal time to work from competing labs, such as Yahoo and Google.

In short, Geeking with Greg is a must-read for anyone serious about real-world information retrieval.

Under Attack by Spammers

I’ve recently seen a sharp increase in spam comments that are getting past the Akismet filter. If it keeps up, I may need to take defensive action, such as moderating comments or using some kind of CAPTCHA. Please let me know if you have any suggestions. I don’t like anything that impedes the flow of legitimate comments, but I imagine that spam comments are just as annoying to you as they are to me.

NSF Symposium on Semantic Knowledge Discovery, Organization and Use

This Friday and Saturday, I’ll be attending the NSF-sponsored Symposium on Semantic Knowledge Discovery, Organization and Use at NYU. Evidently it’s so popular that registration now has a waiting list! If you’re attending, please come by on Saturday afternoon to see my demo, or find me any time to chat!

Spamalytics

Nice piece in the BBC today about a study led by UCSD computer scientist Stefan Savage on how spammers cash in. You can read the full CCS ’08 paper here. It’s an illuminating study, and a nice example of overcoming the challenges of ethically investigating spamming while still obtaining real-world data.

LinkedIn Pushing Ads into Inbox

I’d noticed a number of reports about LinkedIn monetizing its audience through advertising, but this morning is the first time I’ve seen an ad in my inbox.

I absolutely understand their need to generate revenue. But I’m curious how users will feel about this approach–particularly if the ads are not even targeted. I am a delighted (though non-paying) LinkedIn user, but my delight would quickly fade if my inbox became overrun by spam.

Happy Birthday to the ACM Digital Library!

This month’s issue of the Communications of the ACM includes a letter from ACM CEO John White celebrating the 10th anniversary of ACM’s Digital Library. As some of you may know, my colleagues and I at Endeca have been working with the ACM to improve the search and navigation functionality that the Digital Library provides.

In particular, ACM recently deployed a terminology extraction feature that we presented at HCIR ’08. While it’s still a work in progress (their version isn’t quite as current as what we demonstrated at the workshop), it represents a strong step in the direction of supporting exploratory search as part of the online library experience.

Please check it out and provide them with feedback, especially regarding the user interface that they designed using their own consultants.

MIT User Interface Design Teatime Blog

I just discovered that the User Interface Design group at MIT has started blogging. Here’s the mission statement from their opening post:

The sharing of knowledge and ideas is of fundamental importance to the advancement of technology. With this goal in mind, MIT’s User Interface Design group meets once a day at Tea Time to brainstorm new ideas, review new technologies and ideas, and share their experiences working in the field.

If we hope to herald innovation by sharing ideas with a research group, then there’s a boundless value to sharing ideas and thoughts with the world at large. With this goal in mind, we will post a daily log of the musings and observations we discuss in our tea time meetings, and welcome your thoughts and comments about Human Computer Interaction, User Interface Design, and increasing the value and effectiveness of how we use technology.

I’m psyched whenever I see academics blogging, and even more psyched to see a collective effort like this one.