Categories
Uncategorized

Enterprise Search on Wikipedia: An Entry in Need of Attention

It’s been a while since we rallied as a community to improve the state of information retrieval entries on Wikipedia. There are 86 entries in Category:Information Retrieval, and a number of these could use some help. But the one that strikes me as most in need of attention is the Enterprise search entry.

What is so wrong with this entry? Perhaps a better question to ask is what is right with it! But I’ll summarize my objections as follows:

  • The entry is so focused on enumerating vendors that it barely even describes what enterprise search is. Yes, there is some debate about how to define enterprise search, but in that case a Wikipedia entry should “teach the controversy” as it were.
  • The classification of vendors isn’t particularly informative. I don’t see how describing a vendor as a major vendor, a specialized vendor, a superplatform, etc. helps anyone understand the technology space. For example, I cannot figure out what makes a vendor “major” in this taxonomy.
  • The entry seems to largely reflect the haphazard edits of vendors and their surrogates, rather than any principled analysis of the space.

So I’d really like to fix it! But, given my association with Endeca, I’m not sure how much I can do personally without drawing accusations of conflict of interest. Perhaps the readership here can help out. My goal here is not to create an entry that is pro-Endeca or biased against Endeca’s competitors. In fact, I have half a mind to remove all specific vendor references, or perhaps to move them to a separate List of Enterprise Search Vendors entry. What I really want to see is an entry that helps people understand what enterprise search is.

Please chime in if you are interested in being part of this effort. I’m happy to pitch in myself, but I feel the effort has to be collaborative in order to address the understandable concerns about conflict of interest.


Classifying Posts into Categories

I’ve started classifying posts into four broad categories:

  • General: General posts, typically analyzing HCIR issues.
  • Community: Outreach to the broader community, e.g., cleaning up Wikipedia.
  • Noise: Discussion of The Noisy Channel, or about my personal whereabouts.
  • Quick Bites: Brief notes about cool finds from the blogosphere or elsewhere on the web.

Please let me know if you find these helpful. I may break out General a bit, though I’m wary of creating artificial boundaries between overlapping subject areas. Also, I would like it to be easier to carry out conversations about topics that span multiple posts. Ideas welcome as always!

p.s. This is a typical example of a post in the Noise category.


Yahoo the Platform

The other day, there was a conversation in a comment thread about Yahoo BOSS. Today, I noticed this article on Mashable entitled “The Future is Yahoo the Platform”. It mentions BOSS, but what caught my eye was the Yahoo! Query Language (YQL). Skimming the documentation, I had the impression that it might allow functionality that I’ve never seen a web search API allow: namely, the ability to pick your own sort. To be useful (at least in my view), the sort has to be applied to the full set of matching results, before they are truncated to the top results based on the search engine’s default ranking.
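To make the distinction concrete, here is a minimal sketch in Python. It does not use YQL or any real search API; the toy index, the field layout, and both function names are hypothetical and exist only to show why the order of sorting and truncation matters.

```python
# Hypothetical toy index: (title, relevance score, publication year).
DOCS = [
    ("A", 0.9, 2001),
    ("B", 0.8, 2003),
    ("C", 0.7, 2008),
    ("D", 0.6, 2007),
]


def truncate_then_sort(docs, k):
    """Keep the top k by default relevance, then re-sort those k by year."""
    top = sorted(docs, key=lambda d: d[1], reverse=True)[:k]
    return sorted(top, key=lambda d: d[2], reverse=True)


def sort_then_truncate(docs, k):
    """Sort every matching document by year, then keep the top k."""
    return sorted(docs, key=lambda d: d[2], reverse=True)[:k]


if __name__ == "__main__":
    print([d[0] for d in truncate_then_sort(DOCS, 2)])
    print([d[0] for d in sort_then_truncate(DOCS, 2)])
```

With k set to 2, the first function can only reorder the two most relevant documents, so the two most recent ones never appear. The second function returns the two most recent documents overall, which is what I would want from a "pick your own sort" feature.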

Does anyone here know more? I know we have some Yahoos among the readership.


Event Processing Meets Text

Thanks to Curt Monash for calling attention to an article by Seth Grimes about a presentation that Richard Brown of Thomson Reuters made at Gartner’s Event Processing Summit.

Brown’s talk was about the Reuters NewsScope Sentiment Engine, which “processes a stream of Reuters English language news items, producing sentiment data for a list of customer determined target companies.” The list of collaborators is intriguing:

I have no idea how valuable the results are, but I am intrigued by the ambition of putting all of these pieces together to support financial-market trading. Given the rocky history of quantitative trading, I’m curious to hear about their results.


David Bowie on Techmeme!

Well, not quite, but here’s the screen shot:

[Screenshot: David Bowie on Techmeme!]

“Notice anything different” indeed! Nice follow-up on the information accountability theme.

Clicking through to the post on the Joost blog reveals the source of confusion: a quote at the top of the article.

“Ch-ch-ch-ch-changes”
– David Bowie


Powerset’s First Live Search Projects

Just saw that the Powerset Blog is announcing Powerset’s first Live Search projects. I’ve been a Powerset skeptic from even before I participated in their beta test, and I confess that I still don’t see the value proposition. If anyone reading this does see it, please help me understand.


Welcome to the New Noisy Channel!

If you’ve made it this far, then the migration is a success. Please stay tuned for continuous improvements. Better yet, feel free to suggest your own!


Migrating Tonight!

At long last, this blog will migrate over to a hosted WordPress platform at https://thenoisychannel.com/. Thanks to Andy Milk (and to Endeca for lending me his services) and especially to Noisy Channel regular David Fauth for making this promised migration a reality!

As of midnight EST, please visit the new site. My goal is to redirect all incoming Blogger traffic to the new hosted site. This will be the last post here at Blogger.

p.s. Please note that I’ll be manually migrating any content (posts and comments) from the past 5 days, i.e., since I performed an import on September 12th. My apologies if anything is lost in translation.


Progress on the Migration

Please check out https://thenoisychannel.com/ to see the future of The Noisy Channel in progress. I’m using WordPress hosted on GoDaddy and did the minimum work to port all posts and comments (not including this one).

Here is my current list of tasks that I’d like to get done before we move.

  • Design! I’m currently using the default WordPress theme, which is pretty lame. I’m inclined to use a clean but stylish two-column theme that is widget-friendly. Maybe Cutline. In any case, I’d like the new site to be a tad less spartan before we move into it.
  • Internal Links. My habit of linking back to previous posts now means I have to map those links to the new posts. I suspect I’ll do it manually, since I don’t see an easy way to automate it.
  • Redirects. Unfortunately I don’t think I can actually get Blogger to redirect traffic automatically. So my plan is to post signage throughout this blog making it clear that the blog has moved.
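For the internal links, the hard part is building the mapping from old URLs to new ones. If such a mapping existed, the rewriting step itself could be sketched roughly as below. This is only an illustration in Python: the URLs in `URL_MAP` are made-up placeholders, `rewrite_links` is a hypothetical helper, and the regular expression assumes links are written as double-quoted `href` attributes.

```python
import re

# Hypothetical mapping from old post URLs to new ones (placeholder values).
URL_MAP = {
    "http://old.example.com/2008/09/post-a.html":
        "https://new.example.com/2008/09/post-a/",
}

# Assumes links appear as double-quoted href attributes.
HREF = re.compile(r'href="([^"]+)"')


def rewrite_links(html, url_map):
    """Replace each href found in url_map; leave all other links untouched."""
    def swap(match):
        old = match.group(1)
        return 'href="%s"' % url_map.get(old, old)
    return HREF.sub(swap, html)


if __name__ == "__main__":
    post = '<a href="http://old.example.com/2008/09/post-a.html">earlier post</a>'
    print(rewrite_links(post, URL_MAP))
```

Links that are missing from the mapping pass through unchanged, so a partial mapping would still be safe to apply.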

I’d love help, particularly in the form of advice on the design side. And I’ll happily give administration access to anyone who has the cycles to help implement any of these or other ideas. Please let me know by posting here or by emailing me: dtunkelang@{endeca,gmail}.com.


Probably Irrelevant. (Not!)

Thanks to Jeff Dalton for spreading the word about a new information retrieval blog: Probably Irrelevant. It’s a group blog, currently listing Fernando Diaz and Jon Elsas as contributors. Given the authors, and given that the blog’s name is an anagram of “Re-plan IR revolt, baby!”, I expect great things!