Session
Scalable Computing with MapReduce
Doug Cutting, Yahoo!
Track: Java
Date: Wednesday, August 3rd, 2005
Time: 4:30pm - 5:15pm
Location: D135
Over the past few years, Google has published details of its infrastructure. Developers within Google can easily write algorithms that efficiently and reliably process many terabytes of data. To do this, they rely on two technologies: the Google File System (GFS), which implements reliable distributed storage, and MapReduce, a reliable distributed-processing layer built on GFS. Together, these form a platform that facilitates tasks as diverse as log analysis and database construction. One can efficiently "grep" weeks of logs from a high-volume site, constructing concise summaries. One can build efficiently searchable indexes of huge datasets. Such tasks are easily implemented with little code. Scalability and reliability are handled by the system, so that developers can focus on algorithms.
The Nutch project has now implemented a similar platform in open source, so that developers outside of Google can enjoy these benefits. This talk outlines the platform's architecture and implementation, and shows how it can be used to solve real problems.
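To give a rough feel for the "grep" style of log summary mentioned in the abstract, here is a minimal single-process sketch of the map, group, and reduce steps. It is an illustration only, not the Nutch or Google API: the function names (`map_grep`, `reduce_sum`, `run_mapreduce`), the sample log lines, and the status-code pattern are all hypothetical, and a real platform would distribute these steps across many machines.

```python
import re
from collections import defaultdict

def map_grep(line, pattern):
    """Map step (hypothetical helper): emit (matched_text, 1) per match in one log line."""
    for match in re.finditer(pattern, line):
        yield match.group(0), 1

def reduce_sum(key, values):
    """Reduce step (hypothetical helper): collapse all counts for one key into a total."""
    return key, sum(values)

def run_mapreduce(lines, pattern):
    """Run map, group-by-key, and reduce in one process; a real system distributes these."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_grep(line, pattern):
            groups[key].append(value)
    return dict(reduce_sum(key, values) for key, values in groups.items())

# Made-up log lines, assumed purely for illustration.
logs = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 500",
    "GET /gone 404",
]

# Count occurrences of 4xx and 5xx status codes across the logs.
summary = run_mapreduce(logs, r"\b[45]\d\d\b")
print(summary)  # {'404': 2, '500': 1}
```

The developer writes only the two small functions; grouping, scheduling, and failure handling are the parts a MapReduce platform would take over.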