Track: System Administration and Infrastructure
Date: Tuesday, February 04
Time: 10:45am - 12:15pm
Location: California Ballroom C
Clusters and compute farms are a proven concept in life science informatics -- so successful, in fact, that there is a misconception that such systems are easy to build and trivial to deploy. However, architectures and management techniques that work at the lab or department level often fail to scale or do not meet the stringent demands of the production IT datacenter.
There are significant architectural, administrative, and software challenges that must be addressed in order to build flexible and scalable systems suitable for concurrent use by multiple competing researchers, groups, and projects.
Unfortunately, even in 2003, much of the publicly available documentation and reference material is still biased toward designing highly specialized Beowulf-style parallel environments. In many cases the specific hardware and architectural recommendations are inefficient or inappropriate choices for life science research systems.
Dagdigian focuses on IT-centric tips and best practices learned while building and integrating compute farm infrastructures within academic and industry environments. He addresses the following topics, using examples from real projects:
- What compute farms can and can't do
- Architecting for high-throughput life science computing
- Why price/performance isn't everything
- Hardware and software tricks to reduce administrative burden
- SANs, NAS, hybrid storage approaches, distributed filesystems, and data grids
- Distributed resource management (DRM) suites such as LSF and Gridengine
- Reflecting business and scientific priorities in overall resource allocation
- Puncturing the grid computing hype -- what can we do today?