O'Reilly Open Source Convention
From the Frontiers of Research to the Heart of the Enterprise
Sheraton San Diego Hotel and Marina
July 22-26, 2002 -- San Diego, CA


Tutorial

Regular Expression Mastery
Mark-Jason Dominus, Plover Systems Co.

Track: Perl
Date: Monday, July 22
Time: 1:45pm - 5:15pm
Location: Grande Ballroom B

Almost everyone has written a regex that failed to match something they wanted it to, or that matched something they thought it shouldn't, and often it can be hard to predict what a regex will do. This class will fix that.

The first section will explore the algorithm that Perl uses internally to do regex matching. Understanding this algorithm will allow us to predict whether a regex will match, which of several matches Perl will find, and which regexes will be faster than others. During this discussion we'll pause to discuss practical applications that illustrate features of the algorithm. We'll examine the essential but frequently misunderstood concept of 'greed', and we'll learn why commonly used regex symbols like '.', '$', and '\1' might not mean what you thought they did.
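As a rough illustration of the kinds of surprises mentioned above, here is a minimal sketch. It is not taken from the tutorial materials, and it uses Python's `re` module as a stand-in for Perl, on the assumption that the two engines behave alike for these particular constructs. The helper `first_match` and the sample strings are hypothetical.

```python
import re

def first_match(pattern, text, flags=0):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text, flags)
    return m.group(0) if m else None

html = "<b>bold</b>"

# Greed: '.+' grabs as much as it can and only gives back what it must.
greedy = first_match(r"<.+>", html)
# Anti-greed: '.+?' grabs as little as it can.
lazy = first_match(r"<.+?>", html)

# '.' does not match a newline unless asked to (Perl's /s, Python's DOTALL).
dot_plain = first_match(r"a.c", "a\nc")
dot_all = first_match(r"a.c", "a\nc", re.DOTALL)

# '$' matches at the very end, but also just before a final newline.
dollar = first_match(r"foo$", "foo\n")
# Python's \Z means "absolute end of string" (Perl spells this \z).
strict_end = first_match(r"foo\Z", "foo\n")

# '\1' matches the text that group 1 captured, not the group's pattern.
backref_same = first_match(r"(a|b)\1", "aa")
backref_diff = first_match(r"(a|b)\1", "ab")

if __name__ == "__main__":
    for name in ("greedy", "lazy", "dot_plain", "dot_all",
                 "dollar", "strict_end", "backref_same", "backref_diff"):
        print(name, "=", repr(globals()[name]))
```

Running the script prints each result, so the greedy and non-greedy matches, and the two ways of anchoring at the end of a string, can be compared side by side.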

In the second section, we'll apply our knowledge of the internals, examining several common disasters, a few practical parsing applications, and some new features that would have been hard to understand before. We'll see an example of every regex metacharacter and modifier. We'll finish with a discussion of some of the new optimizations that were added in Perl 5.6, and why you should avoid the '/i' modifier.

  • Inside the Regex Engine
    • Regular Expressions are Programs
    • Backtracking
    • Quantifiers
    • Greed
    • Anti-greed
    • Anchors and assertions
    • Backreferences
  • Disasters and Optimizations
    • Where machines come from
    • Disaster examples
    • Regex modifiers
    • Tokenizing
    • New optimizations
    • Matching strings with balanced parentheses





© 2001, O'Reilly Media, Inc.