Peter Cooper, The National Center for Biotechnology Information
Track: Bioinformatics 2003 Tutorial
Date: Monday, February 03
Time: 8:30am - 12:00pm
Location: Plaza
The NCBI has a growing collection of complete and draft-level complex genomes. Most notably, this includes the human and mouse genomes, as well as those of the fruit fly, C. elegans, Arabidopsis thaliana, and Anopheles gambiae. Three important and integrated avenues of access to genome data are the MapViewer, LocusLink, and UniGene.
The MapViewer serves as a common graphical platform for displaying various kinds of maps and providing access to annotations, sequences, and variations (SNPs). It supports display at every scale, from the whole chromosome down to the sequence itself, and it supports both text queries and sequence-similarity searching through the genomic BLAST pages.
For several of the complete genomes, as well as other important experimental organisms, the NCBI has created LocusLink, a central point of access to primary and reference sequences, including assembled contigs, curated transcripts and proteins, gene models, and expressed sequences, along with mapping information, functional data, and nomenclature.
For many organisms, the richest source of sequence data is expressed sequences in the form of expressed sequence tags (ESTs). The NCBI organizes and reduces this highly redundant but important set of data in the UniGene collections. UniGene contains automatically generated, non-redundant sets of gene-based clusters of expressed sequences for a number of important organisms.
The MapViewer, LocusLink, and UniGene are fully integrated with each other as genome resources. In addition, they are integrated with the Entrez/PubMed and BLAST systems, either through Entrez links or through the LinkOut service. Cooper’s tutorial shows how to access genome information using these tools by sequence similarity, marker interval, gene name, and sequence identifier, and demonstrates how they can be used for gene discovery.