This presentation examines a new technology that uses Perl in a complex Natural
Language Processing application. Word sense disambiguation is considered one
of the most difficult tasks that an NLP system must perform. This presentation
first examines the problems facing NLP development, then briefly explores the
Perl modules Lingua::Wordnet and Lingua::LinkParser and their uses in
addressing these problems. A framework that draws upon these
technologies will then be presented, including a new algorithm that makes
possible the accurate unsupervised disambiguation of an entire English text
using grammatical and collocational statistics of a referenced WordNet lexicon.
The end result is an arbitrary text annotated with word sense tags, which may
be used in applications ranging from automated document categorization to
smart retrieval technologies to conversation simulators.
This presentation holds value for developers interested not only in Natural
Language Processing, but also in statistical analysis, complex data models,
the use of the Tk modules for graphical analysis of algorithmic processes, and
data storage. While technically advanced, this presentation is also suitable
for beginning Perl programmers, since the results are easily understood and
appreciated.