Google may be reactionary when it comes to information seeking approaches, but they are at the cutting edge of systems research. Their official blog post today on sorting a petabyte in six hours using MapReduce was a reminder of the impressive caliber of their systems team. You can learn more from their Technology RoundTable Series.
4 replies on “Sorting a Petabyte”
The problem with MapReduce, as far as I can see, is that it is not necessarily power efficient.
That’s like saying that if I put twelve V8 engines in my car, it will run faster. Big deal.
I want to go fast, and save on gas.
Given that Google’s energy consumption represents a significant cost for their daily operations, I have to imagine they’ve worked on the energy efficiency of MapReduce. And I also suspect they are funding this work at UC Berkeley: http://www.eecs.berkeley.edu/Research/Projects/Data/105613.html
That’s a good way of thinking about it. Petascale computing should be measured in operations per kilowatt-hour (or pick your favorite unit of energy consumption). That forces the issue to be a combination of system architecture and algorithm design, as it should be.
“Google may be reactionary when it comes to information seeking approaches, but they are at the cutting edge of systems research.”
Totally agree. In fact, one of my coworkers is fond of saying that when we look back at Google in 30 years, the lasting legacy, how we remember them and how they will have contributed overall to the field of computing, will not be their information retrieval advances. It will be their systems advances.