O'Reilly Open Source Convention
Sheraton San Diego Hotel, San Diego, CA
July 23-27, 2001


Statistical Disambiguation of Word Senses with Perl

Dan Brian, Software Engineer, Conceptuary

Track: Perl Conference 5
Date: Wednesday, July 25
Time: 1:45pm - 2:15pm
Location: Grande Ballroom B

This presentation examines a new technology that uses Perl in a complex Natural Language Processing (NLP) application. Resolving word sense ambiguity is considered one of the most difficult tasks that any NLP system must perform. The presentation first examines the problems facing NLP development, then briefly explores the Perl modules Lingua::Wordnet and Lingua::LinkParser and their uses in addressing these problems. A framework that draws upon these technologies will then be presented, including a new algorithm that makes possible the accurate unsupervised disambiguation of an entire English text using grammatical and collocational statistics from a referenced WordNet lexicon. The end result is an arbitrary text annotated with word sense tags, which may be used in applications ranging from automated document categorization to smart retrieval technologies to conversation simulators.
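
To give a rough sense of what collocation-based sense selection can look like, here is a minimal Perl sketch. It is not the algorithm presented in the session: it is a simplified overlap score, loosely in the spirit of Lesk-style methods, over a hand-written toy sense inventory. The `%senses` table, the sense labels such as `bank#finance`, and the `disambiguate` function are all hypothetical names invented for illustration, and no Lingua::Wordnet or Lingua::LinkParser calls are used.

```perl
use strict;
use warnings;
use 5.010;    # for the defined-or operator (//)

# Hypothetical toy sense inventory (an assumption for illustration):
# each sense of a word maps to made-up collocate counts. A real system
# would derive such statistics from a lexicon rather than hard-code them.
my %senses = (
    bank => {
        'bank#finance' => { money => 5, loan  => 4, account => 3 },
        'bank#river'   => { water => 5, river => 4, shore   => 2 },
    },
);

# Score each candidate sense by summing the counts of its collocates
# that occur in the context words, and return the highest-scoring sense.
# Ties go to the first sense in sorted order; unknown words return nothing.
sub disambiguate {
    my ($word, @context) = @_;
    my $candidates = $senses{$word} or return;
    my ($best, $best_score);
    for my $sense (sort keys %$candidates) {
        my $score = 0;
        $score += $candidates->{$sense}{$_} // 0 for @context;
        ($best, $best_score) = ($sense, $score)
            if !defined $best_score || $score > $best_score;
    }
    return $best;
}

my @sentence = qw(she opened an account at the bank to get a loan);
print disambiguate('bank', @sentence), "\n";    # prints "bank#finance"
```

In this sketch the words "account" and "loan" in the sentence contribute to the financial sense only, so that sense is selected. The framework described above additionally draws on grammatical statistics, which this toy example does not attempt to model.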

This presentation holds value for developers interested not only in Natural Language Processing but also in statistical analysis, complex data models, data storage, and the use of the Tk modules for graphical analysis of algorithmic processes. Although technically advanced, the presentation is also suitable for beginning Perl programmers, since the results are easily understood and appreciated.


© 2001, O'Reilly Media, Inc.