The O'Reilly Peer-to-Peer and Web Services Conference

Practical Tools For Innovation
O'Reilly Bioinformatics Technology Conference
January 28-31, 2002 -- Tucson, AZ

Linux Clusters for Biologists 101: Building Clusters with ROCKS

Glen Otero, Principal Consultant, Linux Prophet

Track: Bioinformatics Tutorials
Date: Monday, January 28
Time: 1:30pm - 5:00pm
Location: Canyon I

In the interests of saving time, money, and their sanity when analyzing biological data, many life scientists have chosen to use powerful, open source software like Linux and Perl. But what happens when your analysis requires more processing power than one workstation can handle? After all, data sets exhibiting exponential growth rates are both boon and bane to life science research these days. More data points only provide more accurate results if you can analyze the data effectively and in a timely manner.

Many life scientists faced with this challenge are finding it helpful, and sometimes necessary, to engage multiple computer processors for efficient data analysis. That’s a relatively simple task for the lucky ones out there working for companies or universities with massive computing resources manned by plenty of experienced admins. But most of us aren’t that lucky and face the daunting task of creating some sort of high(er) performance computing solution ourselves. So what kind of tools does one invest in, or invent, then?

Once again, Linux and open source software provide excellent options to the life scientist. Since many bioinformatics applications like sequence analysis and molecular modeling are well suited to run on clusters of Linux computers, many researchers have taken advantage of commodity computer prices, and free software that doesn’t suck, to build Linux Beowulf clusters: virtual supercomputers with unbeatable price/performance ratios. But how difficult is it to roll your own Beowulf?

Not that difficult, it turns out. I couldn’t have said that a year ago. But many recent advances in hardware and clustering software have made it no more difficult to install a Beowulf cluster than it is to install Linux. With computer vendors focusing on cluster building, the physical assembly and reliability of cluster hardware have been greatly improved. More importantly, second-generation Beowulf cluster software and documentation are now available in a format simple enough for anyone to build a Beowulf. It’s now possible to get all the software you need to build and run a Beowulf cluster on a single CD.

The “Linux Clusters for Biologists 101” tutorial will be a practical introduction to building, administering, and using Beowulf clusters for life scientists. I’ll cover the physical and environmental factors associated with designing Beowulf clusters, as well as the not-so-physical software layers responsible for running applications, load balancing, job scheduling, resource management, message-passing, security, and other essentials. The format will include lecture, discussion, Q&A, live Beowulf assembly and installation, and running applications (time permitting). We’ll be using the Rocks cluster distribution from the San Diego Supercomputer Center when building our Beowulf, but we’ll also discuss other excellent Beowulf cluster distributions including Scyld, MSC.Linux (OSCAR), and SCE. I’m planning to have plenty of free Beowulf software CDs to give away, and a special guest star appearance, so don’t hesitate!

© 2001, O'Reilly Media, Inc.