One of the big challenges of working with heterogeneous data is curating it. Below are introductions to two tools for doing so:
- Gridworks, developed by David Huynh, Stefano Mazzocchi, and their colleagues at Metaweb, the company behind Freebase.
- Needlebase, developed by Justin Boyan and colleagues at ITA Software, the company powering travel search for Kayak, Orbitz, and others.
If you’re concerned with building and maintaining collections of semi-structured data, or building your own technology for this purpose, I suggest you check out these state-of-the-art tools.