The rapid growth of sequence data in public and proprietary databases has created unprecedented opportunities to mine human and other genome databases for novel genes, proteins and biological pathways. In addition, web-based user interfaces have made these resources accessible to researchers with limited computing skills. However, these interactive interfaces become unwieldy for complex comparative analyses involving hundreds or thousands of sequences and multiple databases.
Perl has emerged as the language of choice for the automated access and manipulation of bioinformatics data. However, while writing Perl programs is relatively easy, fully exploiting Perl's bioinformatics capabilities requires a level of programming experience and sophistication not common among biologists. To address this problem, the Bioperl package, a set of object-oriented modules that implement common bioinformatics tasks, has been developed.
This tutorial describes Perl and Bioperl and their application to practical problems in molecular biology sequence analysis. The tutorial includes an overview of the principal features of Perl and Bioperl relevant to biology, followed by examples of how they can be applied to common bioinformatics tasks. Attention will also be paid to identifying bioinformatics problems for which Perl and Bioperl are not appropriate tools. By the end of the tutorial, participants will have a sense of the capabilities Perl and Bioperl can offer in molecular biology research, as well as pointers to resources for acquiring the skills and knowledge needed to take advantage of those capabilities.