Categories
Uncategorized

More Adventures with PR People

A few weeks ago, I wrote a reply to all PR people who seem to think that, because I blog, they should pitch their companies’ press releases at me. I’m not sure whether to be flattered or annoyed.

What I’ve decided to do is share some of my experiences with readers–my top 3 that I haven’t already obliterated beyond recovery. Hopefully these same PR people will learn that indiscriminate marketing isn’t always a net gain. I’ve removed any personally identifying information about the senders; I don’t want vigilante or mischievous readers to get any ideas. Well, at least not to act on them. Here they are, in reverse order of absurdity. Drum roll, please.


#3) The GodTubes Must Be Crazy

Hi Daniel,

I want to introduce you to a new social network called tangle.com. Originally launched in 2007 as GodTube.com, a video sharing site that set the record as the fastest growing Web site in the U.S. during its first month of operation, it attracted 2.7 million users a month. Now, tangle has expanded to become the go-to Web site for the family-friendly community to safely interact on a full social network. Below is the press release that went out this morning, announcing the tangle.com launch.

I’d be happy to arrange a phone interview for you with tangle CEO, Jason Illian, to discuss tangle.com.  Jason can provide a unique look into family-friendly social media and how tangle.com differentiates itself from other social networking sites. Additionally, Jason is the author of “MySpace®, MyKids: A Parent’s Guide to Protecting Your Kids and Navigating MySpace.com.”

Please feel free to shoot me an e-mail at XXXXXX@XXXXXXXXXXXX.com or call me at (XXX) XXX-XXXX for more information on tangle.com or to schedule time to speak with Jason.

Thanks for your consideration.

Best,
XXXXXX  for tangle.com


#2) Who Died?

Hello Daniel,

Last week, WSJ writer Jeffrey Zaslow reported that, starting next month, the Detroit News and the Detroit Free Press will be offering home delivery just three days a week. So, readers who’ve made a daily ritual of perusing obituaries with their morning coffee — and who won’t go out to buy the paper or go online — aren’t necessarily going to learn about the deaths of their acquaintances. See article here: http://online.wsj.com/article/SB123431793199571075.html

But what if there was a technology that kept readers informed about obituary news, anywhere and at any time of day?

That’s where Tributes.com steps in, the comprehensive resource for local and national obituary news and personal tributes. Tributes.com has over 82 million current and historical death records dating back to 1936.

Tributes.com makes sure that consumers can stay informed 24/7 and connected with accurate obit email alerts for any town in the US, alumni, family name, or military unit. Users can set up alerts based on the zip code they currently reside in as well as previous locations they have lived in, and when someone has passed away in their community, an email will be sent to them with names of those who have passed. Those who like to read the morning obits as much as they like their morning cup of joe won’t have to worry about missing the opportunity to leave a message of condolence or to attend a funeral because of missing the news in the paper.

Like it or not, many newspapers are cutting back on home delivery and people want their news quickly and accurately. Tributes.com is the best alternative, go-to resource for obituary news, making sure no one is left in the dark about a passing.

A few interesting facts surrounding this include:

  • The obituary market as a $750M-$1B nearly untouched industry
  • Obituaries- “last man standing” – every other classified section has gone online and made millions (Match.com, eHarmony.com, EBay, Craigslist, Monster.com, etc.)
  • Newspapers lost $64.5 billion in market value in 12 months in 2008
  • 2.5 million people die in the U.S. every year, and 12,000 of those people are turning 50 every day

Please consider mentioning this in your blog. I would be happy to arrange an interview with Jeff Taylor, founder of Tributes.com and Monster.com, to speak about new online technology and a modern world is changing the face of the print obituary.

For more information or to arrange an interview, please contact me at (XXX) XXX-XXXX xXXX or email me at XXXXXXX@XXXXXXXXXXXX.XXX.

Thanks for your consideration.

Best,

XXXXXXX


And, finally, #1: a letter from the folks at Wolfram Alpha:

Hi Daniel,

I wanted to thank you for your interest in Wolfram|Alpha and for sharing our exciting news with your readers. The response has been fantastic.
We look forward to sharing more news about this new website soon!

If you haven’t already, please sign up to receive Wolfram|Alpha release news.
You can do so at:
http://www.wolframalpha.com/

Thanks again,
XXXXXX

XXXXXX XXXXXX
Wolfram Research, Inc.

I have to say, I was a bit stunned by this email–the only reasonable explanation was my recent post about Wolfram Alpha, which didn’t, in my view, merit such a grateful response. Bemused, I responded and volunteered to look at their technology with an open mind–and even under NDA–if they had anything they were willing to share. I’ll keep you all posted.

Anyway, I hope you all found this amusing. And to any PR people who found their materials reposted here, I hope you understand that unsolicited pitches are fair game. Perhaps this is actually what you hoped for. In that case, you’re welcome, and thank you for helping me entertain my readers.


Functional Requirements for Bibliographic Records

This is the first of what I hope to be many guest posts. Our guest blogger is Kelley McGrath, a Cataloging and Metadata Services Librarian at Ball State University Libraries, and at my request she’s supplying a perspective that I feel is crucial for anyone interested in HCIR–that of an actual librarian who deals with the realities of cataloging technologies.

THE PROBLEM

Do people really judge books by their covers? If not, then why is it that the Penguin version of Passage to India is rated 5 stars while the Penguin Classic version only gets 3? A recent BBC blog entry asks this question and concludes that the problem is that bookstore and library records for books (or other things) are designed largely to support inventory tasks and are based on identifiers like ISBNs that relate to particular editions. The BBC points out that sometimes what we really need are what they call “cultural identifiers” for cultural artifacts that point to one place even if you’re talking about the large print or Spanish translation and I’m reading the standard English paperback.

A POSSIBLE SOLUTION

In fact, libraries and the publishing world are well aware of this problem. Since I’m in the library world and I work primarily with cataloging moving images (film, TV, video), I’m going to talk from that perspective. The library world’s proposed solution is an entity-relationship conceptual model called the Functional Requirements for Bibliographic Records (FRBR, often pronounced “ferber”). FRBR divides the bibliographic universe into four main entities, which from the most abstract to concrete are:

Work: This is the BBC’s “cultural artifact” or the abstract commonality of something that is considered the same essential creation. In most cases, this is clear-cut, but there can be disagreements about where to draw the boundaries. For example, is Gus van Sant’s frame-by-frame remake of Psycho a new work or is it an expression of Hitchcock’s work?

Expression: These are versions that vary in content in some significant way. If you actually want to get your hands on something, differences in expression are important. Examples of expressions are things like language translations (dubbed versions, subtitles), accessibility modifications (captions, audio descriptions), widescreen vs. full screen, colorized versions of black and white films, and theatrical release vs. uncut/unrated/director’s cut versions. All differences in expressions may not be practical to track (e.g., the various video versions of Star Wars).

Manifestation: This is more or less the same as what libraries or bookstores keep track of now—generally, a particular published edition or the set of items that have the same characteristics for ordering purposes, e.g., the Warner Home Video DVD with a certain ISBN released in a certain year. From a library perspective VHS vs. DVD vs. Blu-ray, as well as publisher names and publication dates, are manifestation-level attributes.

Item: This is the particular DVD that has a certain barcode on it that you’ve checked out from the library and neglected to bring back on time so now the library wants to charge you a fine.
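To make the four levels concrete, here is a minimal sketch of the hierarchy as nested Python dataclasses. The class and field names (`barcode`, `carrier`, `description`, and so on) are illustrative assumptions rather than anything defined by FRBR itself, and the publisher and barcodes are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    barcode: str  # one physical copy on the shelf


@dataclass
class Manifestation:
    publisher: str
    year: int
    carrier: str  # e.g. "DVD", "VHS", "Blu-ray"
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    description: str  # e.g. "widescreen, English audio"
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self) -> List[Item]:
        """Walk down the hierarchy and collect every physical copy."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]


# Hypothetical example data: one work, one expression, one edition, two copies.
psycho = Work(
    title="Psycho",
    expressions=[
        Expression(
            description="widescreen, English audio",
            manifestations=[
                Manifestation(
                    publisher="Example Home Video",  # made-up publisher
                    year=2008,
                    carrier="DVD",
                    items=[Item(barcode="0001"), Item(barcode="0002")],
                )
            ],
        )
    ],
)
copies = psycho.all_items()
```

A catalog built this way could show one overview entry for the work and let the user drill down to versions, editions, and available copies.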

Particularly as larger, shared library catalogs have become more common, the multi-level FRBR model could be used to present options to users in a more succinct and usable manner. Would it not be easier to see one basic overview record for Hamlet and choices for versions and availability rather than a long list of records of different editions of Hamlet with not much information on the initial hit list page to differentiate them?

STEPS TOWARD THE SOLUTION

So the library world has the challenge of seeing if we can get from where we are now to a FRBR-based model. What we have are records at the manifestation (published edition) level, e.g., the 2-disc special edition released by Paramount on DVD in 2008. This record probably includes information from other levels of the FRBR model such as the director’s name (work level) or the fact that it’s the full screen version (expression level). Unfortunately, these various bits of information are intermingled, not clearly identified in terms of what level they apply to, and often given in free text notes which are hard to analyze or turn into controlled data.

To get from where we are now to a world where the multi-level FRBR model is truly useful, there are a few things I think we need. Again, I am going to talk in terms of film and video, as that is what I work with. I am only going to talk in terms of two levels, work and manifestation (published edition). I think the two levels are a more practical first step. In addition, for most materials this may be a viable approach even in the long run in that the characteristics that identify the expression (version) have to be identified and verified for each new manifestation (published edition) and most of them can be coded in machine-readable form such that expressions (versions) could be automatically calculated.

1. Work sets

The existing manifestation (published edition) records have to somehow be related to the work or works they contain. This is generally done by grouping the records into work sets or by linking the relevant manifestations to work records.

OCLC, a large nonprofit, membership, computer library service and research organization, has developed what is probably the best-known clustering-based FRBR algorithm. OCLC’s algorithm uses several approaches with the most commonly-occurring one based on primary author and title. Because library rules consider most moving images to be works of mixed responsibility without a primary author, this approach works less well for them. The sets that are created by this algorithm are sometimes closer to the expression (version) level than the work level and the data that is displayed to the end user is derived algorithmically from the set of manifestations.
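As a rough illustration of author-and-title clustering in general, and not a reproduction of OCLC’s actual algorithm, the sketch below groups manifestation records under a normalized author/title key. The record layout, the `work_key` function, and the normalization rules are all assumptions made for the example.

```python
import re
from collections import defaultdict


def work_key(author, title):
    """Build a crude matching key from primary author and title."""
    def normalize(text):
        # Lowercase, turn punctuation into spaces, collapse whitespace.
        text = re.sub(r"[^\w\s]", " ", text.lower())
        return " ".join(text.split())
    return (normalize(author), normalize(title))


def cluster_into_work_sets(records):
    """Group manifestation record ids that share the same author/title key."""
    work_sets = defaultdict(list)
    for record in records:
        key = work_key(record["author"], record["title"])
        work_sets[key].append(record["id"])
    return dict(work_sets)


# Hypothetical manifestation records with small cataloging variations.
records = [
    {"id": "m1", "author": "Forster, E. M.", "title": "A Passage to India"},
    {"id": "m2", "author": "Forster, E.M.", "title": "A passage to India."},
    {"id": "m3", "author": "Shakespeare, William", "title": "Hamlet"},
]
work_sets = cluster_into_work_sets(records)
```

A key like this has nothing to match on when a record has no primary author, which is one way to see why the approach struggles with moving images.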

LibraryThing, a social book cataloging site, suggests possible combinations, but relies on human intervention to create its work clusters based on what founder Tim Spalding calls the “cocktail party” test. This test asks whether two people would think they’re talking about the same book in casual conversation. LibraryThing does include basic work-level records, parts of which appear to be surfaced from the manifestation clusters and parts of which are entered manually by users.

Both OCLC and LibraryThing offer services that create work sets on the fly based on an ISBN (xISBN and thingISBN, respectively). Manifestation (published edition) records that include multiple works can be particularly problematic for these services.

2. Work Identifiers

As the BBC noted, we need some stable way to identify and refer to works. Both OCLC and LibraryThing offer work identifiers of a sort. OCLC’s database is more comprehensive, but its identifiers are less reliably at the work level. Ed Summers recently pointed out that these identifiers only provide human-readable data and not data useful for machines.

3. Work Records

In the long run there are serious limitations with the clustering approach and with displaying work-level information created by extracting and analyzing data in sets of manifestation (published edition) records. For one thing, if information is automatically generated from clusters, it is difficult to correct errors that may sneak in. This problem may be particularly prevalent in the library world due to the practice of copying much information on new records from previous editions without verification. It is also redundant to re-enter and store all this data in multiple manifestation records when it would be more efficient to assess and maintain it in a single work-level record. In addition, a single work-level record would present consistent information to all users. Currently manifestation records may be more or less complete and may give conflicting information about the same work so what a user finds is arbitrarily influenced by the particular records that happen to be in the library catalog being searched.

A group that I am part of, OLAC (Online Audiovisual Catalogers), has created a task force to examine what it might mean to create work-level records for moving images in a library context and also to what extent we might be able to leverage existing library data. We did a project to try to extract work-level data from existing manifestation (published edition) records. Lynne Bisko and I have an article about our experience in The Code4Lib Journal. Our basic idea is that by extracting what we can from existing library records, possibly in combination with information from external data sources, we can create basic, good-enough work-level records that can be corrected and maintained by human beings. These work-level records could then be linked to our existing manifestation-level records. This data could be used to create an interface to provide better access to moving images both in terms of the work (director, cast, original date, language, and country of production, place and time period of setting, etc.) and in terms of the characteristics of the expression and manifestation that will help users easily select the best items for their needs (language(s) of audio and subtitle tracks, captioning, DVD vs. VHS vs. Blu-ray, availability, etc.).
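One way to picture a “good-enough” draft work record is a simple majority vote across the manifestation records in a work set, with disagreements flagged for a human to check. This is only a sketch under that assumption, not the method described in the Code4Lib article; the field names and the `needs_review` flag are hypothetical.

```python
from collections import Counter


def draft_work_record(manifestations, work_fields=("director", "original_year")):
    """Pick the most common value per work-level field and flag conflicts."""
    draft = {}
    for name in work_fields:
        # Ignore records where the field is missing or empty.
        values = [m[name] for m in manifestations if m.get(name)]
        if not values:
            continue
        value, count = Counter(values).most_common(1)[0]
        # If any record disagrees with the winner, ask a cataloger to review.
        draft[name] = {"value": value, "needs_review": count < len(values)}
    return draft


# Hypothetical manifestation records for the same film.
manifestations = [
    {"id": "m1", "director": "Alfred Hitchcock", "original_year": 1960},
    {"id": "m2", "director": "Alfred Hitchcock", "original_year": 1960},
    {"id": "m3", "director": "Hitchcock, Alfred"},
]
draft = draft_work_record(manifestations)
```

Here the director field is filled in but marked for review because one record formats the name differently, while the original year is accepted without a flag.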


Find Out More about the Guardian Open Platform!

If you’re reading this, then it’s at least 9am GMT, and you should be able to learn more about the Guardian Open Platform at http://www.guardian.co.uk/open-platform. That’s a bit early on this side of the pond, but I promise to share more details and impressions once I’m awake and have a chance to gather them!


Guardian Launching Open Platform

The Guardian, an internationally acclaimed newspaper (and a long-time Endeca client!) that has been a major force in the United Kingdom for 180 years, is launching an open platform tomorrow. The Guardian has led the media in openness, making the unprecedented decision last fall to offer the full text of its articles in its RSS feeds.

I’ll report more about the new platform when there are more details that I can publicly share.


Apologies to Google Reader Users

For some reason that I have still not diagnosed, readers who view the RSS feed for this blog using Google Reader are seeing a handful of bogus entries in the feed–something like this. I don’t know why those entries are showing up, let alone why they seem to congregate at the front of the feed. If anyone has a suggestion on how to diagnose or resolve the issue, I’d greatly appreciate it. In the meantime, I apologize for the inconvenience and annoyance.


Google’s Marissa Mayer on Privacy vs. Transparency

TechCrunch posted a transcript of Charlie Rose interviewing Google Vice President of Search Product and User Experience Marissa Mayer.

Here’s an excerpt I found particularly interesting:

Charlie Rose:
This is a broader philosophical question I want to talk about later. But I mean is there some point in which we know too much about people?

Marissa Mayer:
Well I think that in all cases it’s a tradeoff, right, where you will give you some of your privacy in order to gain some functionality, and so we really need to make those tradeoffs really clear to people, what information are we using and what’s the benefit to them? And then ultimately leave it to user choice so the user can decide. And you have to be very transparent about what information you have about that user and how it’s being used.

Charlie Rose:
But it’s also seems to me clearly a product of age and generation, how willing you are to give up privacy and to allow transparency, clearly.

Marissa Mayer:
Sure, absolutely…

That’s a great attitude. I only wish Charlie Rose had fact-checked Google’s actual policy when it comes to transparency. Indeed, Google’s lack of transparency with advertisers, who are its bread and butter, recently cost them $761 and a bunch of bad press. While I’m sure Google can afford the judgment (less than 2.5 shares of GOOG stock at the time of this writing), I hope they see this experience as an opportunity to review their principles.

And, of course, don’t get me started on the lack of transparency in their approach to relevance! For those who haven’t been regular readers, here are two of my recent posts about Google:


Can You Digg It?

According to an article in the LA Times, USocial CEO and founder Leon Hill is bragging that they are “gaming Digg” by letting advertisers buy votes. Sound familiar? When will people figure out that anonymous social voting schemes that don’t offer users control over the social lens are just begging to be gamed?

It’s as Ben Franklin said, “Experience is the best teacher, but a fool will learn from no other.” Emphasis mine.


Jason Adams Explains TunkRank

Jason Adams, who recently won the TunkRank implementation challenge, explains on his blog how he implemented TunkRank.com. He implemented the algorithm in Ruby using Merb, MySQL, Capistrano, nginx, and ActiveRecord. For more details, check out his blog!

Note: he just added a follow-up post: The Road Ahead for TunkRank.


Craig’s Dissertation on People Search

Craig McDonald (now Dr. Craig McDonald!) just announced that his thesis, The Voting Model for People Search, is available online.

Here is a teaser from the abstract:

The thesis investigates how persons in an enterprise organisation can be ranked in response to a query, so that those persons with relevant expertise to the query topic are ranked first. The expertise areas of the persons are represented by documentary evidence of expertise, known as candidate profiles. The statement of this research work is that the expert search task in an enterprise setting can be successfully and effectively modelled using a voting paradigm. In the so-called Voting Model, when a document is retrieved for a query, this document represents a vote for every expert associated with the document to have relevant expertise to the query topic. This voting paradigm is manifested by the proposition of various voting techniques that aggregate the votes from documents to candidate experts. Moreover, the research work demonstrates that these voting techniques can be modelled in terms of a Bayesian belief network, providing probabilistic semantics for the proposed voting paradigm.
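To give a feel for the voting idea, here is a minimal sketch in which each retrieved document casts one vote for every candidate associated with it, and candidates are ranked by vote count. Plain counting is just one simple aggregation chosen for illustration; the abstract mentions “various voting techniques” without spelling them out, and the function name and sample data below are hypothetical.

```python
from collections import Counter


def rank_experts(retrieved_docs, doc_to_candidates):
    """Rank candidates by how many retrieved documents vote for them."""
    votes = Counter()
    for doc_id in retrieved_docs:
        # Every candidate linked to a retrieved document receives one vote.
        for candidate in doc_to_candidates.get(doc_id, []):
            votes[candidate] += 1
    # Most votes first; break ties alphabetically so the order is stable.
    return sorted(votes.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical mapping from documents to the people associated with them.
doc_to_candidates = {
    "d1": ["alice", "bob"],
    "d2": ["alice"],
    "d3": ["carol"],
}
ranking = rank_experts(["d1", "d2", "d3"], doc_to_candidates)
```

In this toy example alice comes first with two votes, followed by bob and carol with one each.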


If Anyone Wants a Likaholix Invite…

I just joined Likaholix to see what all the buzz was about. According to their About page:

Likaholix is a fun and easy way to share and discuss your likes and discover new ones with people you know. You can like anything from a great book you have read to your favorite food to some art work that you love.

We have found that recommendations from friends, whose tastes, you trust are usually much better than most reviews on the web. Most people, when they are out on social occasions with friends, find themselves exchanging notes and discussing things that they like. We hope to bring the same experience online with Likaholix. Likaholix serves as both a self-expression and a recommendation tool. We provide personalized recommendations based on the people, topics and items you like.

It’s a nice idea, but I have to say I’m underwhelmed by the experience. Still, I’d be more than happy to share my 10 invites: first come, first served. I believe that every new user gets 10 invites, so clearly they’re hoping for exponential growth through viral marketing. Hey, can’t hurt to try.