While centralized repositories such as GenBank have proven to be
valuable resources, they are costly to maintain and often do not
reflect the needs of individual laboratories. This is especially true
of gene expression data, where the contextual data is as important as
the numerical expression values. Many
laboratories will want to organize their data in local data management
systems, and each system will likely vary in the amount and type of
contextual data that it records. One issue is that centralized
databases will likely be unable to store all of the contextual
information associated with every experiment (because their data model
supports only the common denominator of all experiment types) or
unwilling to do so (because of the resources required to store that
level of detail).
The GeneX project (genex.sf.net) has begun exploring peer-to-peer
mechanisms for publishing scientific data: helping peers find what
data is available, and enabling peers to query the data or download it
to their local machines. GeneX includes a framework, which we call
G2G, based upon the successful model of the Distributed Annotation
System project (DAS - biodas.org).
G2G defines a distributed network of cooperating gene expression
databases that use a common basic data model, standards for minimal
information about microarray data, and standardized exchange formats
(defined by MAGE - mged.sf.net). Each peer is responsible for the
quality of the data it controls, which distributes the cost and
responsibility across the community.
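As an illustration only, the peer operations described above (publish,
find, query, download) can be sketched in Python. Every class and
method name below is hypothetical and is not part of GeneX, G2G, DAS,
or MAGE; the sketch also assumes in-memory objects, whereas real peers
would exchange MAGE-formatted documents over a network.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    """Hypothetical record: shared core fields plus peer-specific context."""
    accession: str
    organism: str
    context: dict   # contextual annotations; each peer decides what to keep
    values: dict    # gene identifier -> numerical expression value


@dataclass
class Peer:
    """One local database that controls, and vouches for, its own data."""
    name: str
    experiments: dict = field(default_factory=dict)

    def publish(self, exp):
        # Make an experiment visible to other peers.
        self.experiments[exp.accession] = exp

    def list_available(self):
        # Tell other peers what data this peer holds.
        return sorted(self.experiments)

    def query(self, organism):
        # Answer a query using only the shared core data model.
        return sorted(acc for acc, exp in self.experiments.items()
                      if exp.organism == organism)

    def download(self, accession):
        # Hand the full record, context included, to the requester.
        return self.experiments[accession]


class Network:
    """Hypothetical registry through which cooperating peers are found."""

    def __init__(self):
        self.peers = {}

    def join(self, peer):
        self.peers[peer.name] = peer

    def find(self, organism):
        # Ask every peer; report only the peers that have matching data.
        results = {}
        for name, peer in self.peers.items():
            hits = peer.query(organism)
            if hits:
                results[name] = hits
        return results

    def download(self, peer_name, accession):
        return self.peers[peer_name].download(accession)
```

In this sketch each peer stores whatever contextual data it chooses,
while queries across the network rely only on the fields that all
peers share.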