Note: This post was written by Scott Nicholson, a Senior Data Scientist at LinkedIn. Scott is a data and modeling geek with a passion for startups, product and user experience. His work at LinkedIn focuses on analyzing and improving user engagement and monetization.
I’m happy to report back on my experience at the Data 2.0 conference, an event organized by midVentures and targeted at entrepreneurs building products to leverage the dramatic increase in publicly and privately collected data. The conference had four main themes: what data is available, how to obtain data, how to store and access data, and how to create value from data products. For data nerds or hackers, the conference offered a delightful stream of “you know what would be cool…” ideas.
The morning got off to a strong start with a talk by Vivek Wadhwa on how data is going to define the next generation of successful startups in a new information age. He observed the increasing online access to data that had previously been restricted to offline access (or no access at all). He also emphasized the importance of new sources of data, such as medical records and genome data. We need to think of social use of data beyond Twitter, Facebook and LinkedIn: for example, genome data will allow us to connect to each other in ways that help us better understand our similarities and differences. Meanwhile, some existing data sources will become increasingly open and available to all. Wadhwa stressed the importance of leveraging the open sources of federal, state and local government data to come up with alternatives to the closed and clunky legacy systems that governments use to generate data reports (a pity that data.gov and related programs may be defunded - DT).
The morning keynote segued nicely into the panel on open data sources. Jay Nath, Director of CRM for the city of San Francisco, noted that, while many applications are using government data and APIs, they mostly address consumer convenience (e.g., public transit apps) rather than government efficiency. Panelists agreed that government employees have few incentives to take risks by using new technology: legacy systems might be expensive, inflexible and inefficient, but they do perform their limited function. Alluding to Eric Ries’s idea of a “lean startup”, Nath suggested the concept of a “lean government” that lowers costs, speeds up its operations, and avoids procurement processes by using open source technology, all in the context of providing services to its citizens.
The inspiring mid-day keynote by former Amazon Chief Scientist Andreas Weigend took a different perspective from the morning sessions: he focused on how data sharing can provide tangible value to end-users, even resulting in significant behavior change. He cited products like tweeting weight scales, FitBit, and Nike+ that allow people to share data about their fitness efforts, thus leading to social reinforcement for positive behaviors. I personally see this area as a great example of where data scientists and engineers can create enormous economic value and increase people’s welfare.
The day also featured various product launches and presentations. Here are a few that caught my attention:
- Micello: Google Maps for indoors. They won the startup competition that was held in conjunction with the conference.
- Tropo: API for voice calls and SMS.
- DataStax Brisk: Technology unifying Hadoop, Hive & Cassandra. A new Hadoop distribution powered by Cassandra.
- Neer: Always-on location awareness app from Qualcomm. Privately share location with groups and families.
- Heritage Health Prize: $3MM prize for predictive modeling around who will require hospitalization (a follow-up to their announcement of the prize at Strata).
Overall, it was great to see hundreds of people exploring innovations and opportunities to use data to improve business, technology and society.