Session
High Throughput Analysis in Model Organism Genomics: The Bulk Data Pipeline To Rat Genome Database
Dean Pasko, Bioinformatics Specialist, Medical College of Wisconsin
Track: Fundamentals
Date: Wednesday, January 30
Time: 3:00pm - 3:45pm
Location: Canyon IV
The bulk data pipeline is a major component of the overall Rat Genome Database (RGD) curation activity, which works with collaborators around the globe to create a consistent environment fully integrated with the major rat database repositories. Using object-oriented methodology, we have defined RGD objects (such as SSLPs, genes, ESTs, mapped data, references, sequence data, strains, and QTLs) that are the functional entities the database needs to support most of the desired functionality. We have also developed a template for each data object for use in the bulk data pipeline. The pipeline can process and check large amounts of data. Processing entails the unique identification of data objects, symbols, sequences (both primers and clones), and each data attribute. In addition, the results of processing are available to curators through a web-based interface, where they can view the results and load the clean data into RGD. A bulk data pipeline of this kind could also benefit genomic data processing for other organisms.
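As a rough illustration of the checking step described in the abstract, the sketch below validates a batch of records against a simple object template and separates clean records from rejects. It is not the RGD implementation: the `SslpRecord` fields, the `check_records` function, and the specific checks (unique symbol, primers made up only of the bases A, C, G, T, or N) are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed alphabet for primer sequences; the real pipeline's rules may differ.
VALID_BASES = set("ACGTN")


@dataclass(frozen=True)
class SslpRecord:
    # Hypothetical object template; field names are illustrative only.
    symbol: str
    forward_primer: str
    reverse_primer: str


def check_records(records):
    """Split records into clean ones and (record, reason) rejects."""
    clean, rejected = [], []
    seen_symbols = set()
    for rec in records:
        key = rec.symbol.strip().upper()  # compare symbols case-insensitively
        primers = (rec.forward_primer, rec.reverse_primer)
        if not key:
            rejected.append((rec, "missing symbol"))
        elif key in seen_symbols:
            rejected.append((rec, "duplicate symbol"))
        elif any(not p or set(p.upper()) - VALID_BASES for p in primers):
            rejected.append((rec, "invalid primer sequence"))
        else:
            seen_symbols.add(key)
            clean.append(rec)
    return clean, rejected
```

In a setup like this, the rejected list with its reasons is what a curator would review, and only the clean list would go on to be loaded.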