Sorting a Petabyte

Google may be reactionary when it comes to information seeking approaches, but they are at the cutting edge of systems research. Their official blog post today on sorting a petabyte in six hours using MapReduce was a reminder of the impressive caliber of their systems team. You can learn more from their Technology RoundTable Series.

By Daniel Tunkelang

High-Class Consultant.

4 replies on “Sorting a Petabyte”

The problem with MapReduce, as far as I can see, is that it is not necessarily power efficient.

That’s like saying that if I put twelve V8 engines in my car, it will run faster. Big deal.

I want to go fast, and save on gas.

Given that Google’s energy consumption represents a significant cost for their daily operations, I have to imagine they’ve worked on the energy efficiency of MapReduce. And I also suspect they are funding this work at UC Berkeley: http://www.eecs.berkeley.edu/Research/Projects/Data/105613.html

That’s a good way of thinking about it. Petascale computing should be measured in operations per kilowatt-hour (or pick your favorite unit of energy consumption). That forces the issue to be a combination of system architecture and algorithm design, as it should be.

Google may be reactionary when it comes to information seeking approaches, but they are at the cutting edge of systems research.

Totally agree. In fact, one of my coworkers is fond of saying that when we look back at Google in 30 years, the lasting legacy, how we remember them and how they will have contributed overall to the field of computing, will not be their information retrieval advances. It will be their systems advances.
