
Practical Tools For Innovation
O'Reilly Bioinformatics Technology Conference
January 28-31, 2002 -- Tucson, AZ


Data Integration for Function Discovery

Keith Allen, Ph.D., Paradigm Genetics

Track: Integration
Date: Wednesday, January 30
Time: 11:15am - 12:00pm
Location: Canyon II

A central truth of the genomics revolution is that only a small percentage of newly found genes can be reliably annotated (i.e., assigned a function with certainty) by even the most sensitive sequence comparison methods. Adding structural information improves this percentage slightly, but the sheer complexity of even the simplest biological systems dictates that function discovery must involve collecting and successfully integrating data from a variety of data streams. For small samples, human analysis of disparate data sources is feasible, but it becomes impossible at genome scale. High-throughput function discovery requires integrated data.

Keith Allen describes efforts to integrate data from sequence annotation, gene expression profiling, biochemical profiling, and detailed phenotypic analysis. Data integration is only a prerequisite for function discovery, however: all of the data must also be transformed into coherent data sets. We define coherent data as truly comparable data from multiple technology platforms. Thus the data streams themselves must be translated so that they are cross-compatible and can be analyzed simultaneously by a single method. Once coherent data sets are established, any analytical tool (e.g., cluster analysis) that could be applied to a single data set can be applied to all of the data at once. In this way, meaningful comparisons can be made between data points from heterogeneous sources, and biological relationships can be discerned that would otherwise have been missed. He shows how this approach has worked in a pilot study of herbicide action in Arabidopsis.
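
As a rough illustration of what making data streams cross-compatible could look like, the sketch below rescales two streams measured in different units to z-scores, joins them on a shared sample identifier, and runs one cluster analysis over the combined table. The abstract does not say how the translation was actually done: z-score standardization, naive single-linkage clustering, the function names, the treatment labels, and all of the numbers are assumptions made up for this example.

```python
from math import dist
from statistics import mean, pstdev


def zscore_columns(table):
    """Rescale each measurement column to mean 0 and unit variance across samples."""
    ids = sorted(table)
    n_cols = len(table[ids[0]])
    scaled = {i: [] for i in ids}
    for c in range(n_cols):
        column = [table[i][c] for i in ids]
        mu, sigma = mean(column), pstdev(column)
        for i in ids:
            # A constant column carries no information; map it to 0.
            scaled[i].append((table[i][c] - mu) / sigma if sigma else 0.0)
    return scaled


def merge_streams(*streams):
    """Standardize each stream, then concatenate rows for samples present in all."""
    shared = set.intersection(*(set(s) for s in streams))
    scaled = [zscore_columns({i: s[i] for i in shared}) for s in streams]
    return {i: [v for s in scaled for v in s[i]] for i in sorted(shared)}


def single_linkage(points, k):
    """Naive agglomerative clustering: merge the closest clusters until k remain."""
    clusters = [[i] for i in points]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[x], points[y])
                        for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return [sorted(c) for c in clusters]


# Hypothetical data: two expression log-ratios and one metabolite
# concentration per treatment, on very different numeric scales.
expression = {"trt_A": [2.1, -0.3], "trt_B": [1.9, -0.2],
              "trt_C": [-1.5, 0.8], "trt_D": [-1.7, 0.9]}
metabolite = {"trt_A": [120.0], "trt_B": [135.0],
              "trt_C": [900.0], "trt_D": [870.0]}

merged = merge_streams(expression, metabolite)
clusters = single_linkage(merged, 2)
print(clusters)
```

Without the rescaling step, the metabolite column would dominate every distance simply because its raw values are hundreds of times larger, which is one reason merged data need to be made comparable before a single method is applied to all of it.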

© 2001, O'Reilly Media, Inc.