Data integration is a growing field in bioinformatics, as researchers combine information from multiple, diverse data sets to learn about and explain natural processes. Methods have been developed to integrate insights from hybridized cDNA and oligomer microarray data with genome sequence annotations. The integrated data make it possible to use genome annotations to explain gene expression patterns and to compare the expression patterns of orthologous genes from different organisms.
Terry Gaasterland discusses the integration project, with examples from the genomes of Trypanosoma brucei (a parasite of humans and cattle), six strains of Staphylococcus aureus (a pathogenic bacterium), human, and mouse.