O'Reilly Open Source Convention

Building the Open Warehouse
Roger Magoulas, Director Market Research, O'Reilly Media, Inc.

Track: MySQL
Date: Wednesday, July 28
Time: 1:45pm - 3:20pm
Location: Salon I


Commodity hardware, faster disks, and open source software now make building a data warehouse more of a resource and design issue than a cost issue for many organizations. A robust analysis infrastructure can now be built on an open source platform without compromising performance or functionality.

This talk will cover a proven analysis architecture, the open source tool options for each architecture component, the basics of dimensional modeling, and a few tricks of the trade.

Why open source? Aside from the cost savings, open source lets you leverage what your staff already knows -- tools like Perl, SQL and Apache -- rather than having to procure and staff for the proprietary tools that dominate the commercial space.

Data Warehouse Architecture
- Consolidated Data Store (CDS)
- Process to condition, correlate and transform data
- Multi-topic data marts
- Dimensional models
- Multi-channel data access
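
The flow from consolidated store to data mart can be sketched roughly as follows. This is an illustration only, not the speaker's implementation: it uses Python's built-in sqlite3 module in place of the MySQL and Perl/DBI stack named in this abstract, and the table names (cds_orders, mart_sales_by_region) and columns are hypothetical.

```python
import sqlite3

def build_mart(raw_rows):
    """Load raw rows into a CDS table, then condition and summarize them into a mart.

    raw_rows: iterable of (order_id, region, amount) with region and amount as
    untidy strings. All table and column names here are hypothetical.
    """
    db = sqlite3.connect(":memory:")

    # Consolidated Data Store: raw data lands here as received.
    db.execute("CREATE TABLE cds_orders (order_id INTEGER, region TEXT, amount TEXT)")
    db.executemany("INSERT INTO cds_orders VALUES (?, ?, ?)", raw_rows)

    # Data mart: conditioned (trimmed, upper-cased, typed) and summarized.
    db.execute("CREATE TABLE mart_sales_by_region (region TEXT PRIMARY KEY, total REAL)")
    db.execute("""
        INSERT INTO mart_sales_by_region (region, total)
        SELECT UPPER(TRIM(region)), SUM(CAST(amount AS REAL))
        FROM cds_orders
        WHERE amount IS NOT NULL AND TRIM(amount) <> ''
        GROUP BY UPPER(TRIM(region))
    """)
    db.commit()
    return db
```

Keeping the raw rows in the CDS untouched means the conditioning step can be rerun or changed later without going back to the source systems.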

Open Source Components
Database: MySQL
- fast, effective
Data Movement: Perl/DBI/SQL
- flexible data access
Data Access: Perl/Apache/SQL
- template toolkit for ad hoc SQL
- Perl hash for crosstabs/pivot
- Perl for reports
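
As a rough illustration of the hash-based crosstab idea, the sketch below pivots (row, column, value) tuples with nested dictionaries. It is written in Python rather than Perl, so a dict stands in for the Perl hash; the function names crosstab and render are hypothetical and not taken from the talk.

```python
from collections import defaultdict

def crosstab(rows):
    """Pivot (row_key, col_key, value) tuples into a nested dict.

    Returns (sorted column keys, {row_key: {col_key: summed value}}).
    """
    table = defaultdict(lambda: defaultdict(float))
    cols = set()
    for row_key, col_key, value in rows:
        table[row_key][col_key] += value  # accumulate duplicates into one cell
        cols.add(col_key)
    return sorted(cols), {r: dict(cells) for r, cells in table.items()}

def render(cols, table):
    """Format the crosstab as tab-separated text, filling missing cells with 0.0."""
    lines = ["\t".join(["", *cols])]
    for row_key in sorted(table):
        cells = (str(table[row_key].get(c, 0.0)) for c in cols)
        lines.append("\t".join([row_key, *cells]))
    return "\n".join(lines)
```

In practice the input tuples would come straight from a GROUP BY query, so the database does the summing and the script only rearranges rows into columns.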

Dimensional Model
- organizes data for queries and navigation from detail to summary
- normalized fact table for quantitative data
- denormalized dimensions with descriptive data
- conformed dimensions shared by multiple fact tables
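
A minimal star schema along these lines might look like the sketch below. It is an assumption-laden example, not the schema from the presentation: it uses Python's sqlite3 module instead of MySQL, and every table and column name (dim_date, dim_product, fact_sales, fact_returns) is hypothetical. The two fact tables share the same dimension tables, and the query rolls detail rows up to a category-by-month summary using explicit JOIN ... ON syntax.

```python
import sqlite3

# Hypothetical star schema: two fact tables sharing the same dimensions.
SCHEMA = """
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, name TEXT, category TEXT
);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    units INTEGER, revenue REAL
);
CREATE TABLE fact_returns (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    units INTEGER
);
"""

def create_star_schema():
    """Create the empty schema in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db

def revenue_by_category_and_month(db):
    """Summarize fact rows by descriptive attributes held in the dimensions."""
    return db.execute("""
        SELECT p.category, d.month, SUM(f.revenue)
        FROM fact_sales AS f
        INNER JOIN dim_date AS d ON d.date_key = f.date_key
        INNER JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category, d.month
        ORDER BY p.category, d.month
    """).fetchall()
```

Because the descriptive columns live in the dimensions, changing the level of summary (for example, year instead of month) only changes the GROUP BY, not the fact table.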

Performance Considerations
- configuration
- indexing
- SQL-92 joins
- aggregate tables and aggregate navigation
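
Aggregate navigation can be pictured as choosing the smallest table that is able to answer a given query. The sketch below is one simple way to do that, under stated assumptions: it uses Python's sqlite3 module rather than MySQL, and the catalog, table names (agg_sales_month, fact_sales_detail), and helper functions are all hypothetical rather than taken from the talk.

```python
import sqlite3

# Hypothetical catalog, smallest table first: table name -> grouping columns it supports.
AGGREGATES = [
    ("agg_sales_month", {"month"}),
    ("fact_sales_detail", {"month", "day", "product"}),
]

def pick_table(group_by):
    """Return the first (smallest) table whose columns cover the requested grouping."""
    needed = set(group_by)
    if not needed:
        raise ValueError("group_by must name at least one column")
    for table, cols in AGGREGATES:
        if needed <= cols:
            return table
    raise ValueError(f"no table can group by {sorted(needed)}")

def total_revenue(db, group_by):
    """Sum revenue by the given columns, reading from the cheapest suitable table."""
    table = pick_table(group_by)  # also rejects column names not in the catalog
    cols = ", ".join(group_by)
    sql = f"SELECT {cols}, SUM(revenue) FROM {table} GROUP BY {cols} ORDER BY {cols}"
    return table, db.execute(sql).fetchall()

def build_demo():
    """Create a detail table and a monthly aggregate derived from it."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE fact_sales_detail (month TEXT, day TEXT, product TEXT, revenue REAL)")
    db.executemany("INSERT INTO fact_sales_detail VALUES (?, ?, ?, ?)", [
        ("2004-07", "2004-07-27", "book", 30.0),
        ("2004-07", "2004-07-28", "book", 20.0),
        ("2004-08", "2004-08-01", "pdf", 5.0),
    ])
    db.execute("""
        CREATE TABLE agg_sales_month AS
        SELECT month, SUM(revenue) AS revenue FROM fact_sales_detail GROUP BY month
    """)
    return db
```

A query grouped only by month is routed to the aggregate table, while one that needs day or product falls through to the detail table; callers get the same answer either way and do not need to know which table was read.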

The presentation should provide you with the basic architecture, toolkit, design principles, and strategy for building an effective open source data warehouse.

© 2004, O'Reilly Media, Inc.