Natural Language Processing with Java and LingPipe Cookbook


Introduction


Coreference is a basic mechanism in human language that allows two sentences to be about the same thing. It is a big deal for human communication: it functions in much the same way as variable names do in programming languages, with the additional subtlety that scope is defined by very different rules than blocks. Coreference is less important commercially; maybe this chapter will help change that. Here is an example:

Alice walked into the garden. She was surprised.

Coreference exists between Alice and She; the phrases talk about the same thing. It all gets very interesting when we start asking whether Alice in one document is the same as Alice in another.
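
To make the variable-name analogy concrete, here is a minimal sketch in plain Java that represents the example as mentions tagged with an entity ID. It does not use the LingPipe API; the `CoreferenceSketch` class, the `Mention` type, and the hand-assigned entity IDs are all hypothetical names chosen for illustration, and a real system would have to infer the IDs rather than be given them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CoreferenceSketch {

    /** A phrase in the text (a mention) tagged with the ID of the entity it refers to. */
    static final class Mention {
        final String phrase;
        final int start;     // character offset of the phrase's first occurrence in the text
        final int entityId;  // hypothetical ID: mentions that share it corefer

        Mention(String text, String phrase, int entityId) {
            this.phrase = phrase;
            this.start = text.indexOf(phrase);
            this.entityId = entityId;
        }
    }

    /** Two mentions corefer when they carry the same entity ID. */
    static boolean corefer(Mention a, Mention b) {
        return a.entityId == b.entityId;
    }

    /** Groups mention phrases by entity ID, preserving the order in which they were given. */
    static Map<Integer, List<String>> chains(List<Mention> mentions) {
        Map<Integer, List<String>> byEntity = new LinkedHashMap<>();
        for (Mention m : mentions) {
            byEntity.computeIfAbsent(m.entityId, id -> new ArrayList<>()).add(m.phrase);
        }
        return byEntity;
    }

    public static void main(String[] args) {
        String text = "Alice walked into the garden. She was surprised.";
        // Entity IDs are assigned by hand here (an assumption of this sketch).
        List<Mention> mentions = Arrays.asList(
                new Mention(text, "Alice", 0),
                new Mention(text, "the garden", 1),
                new Mention(text, "She", 0));
        System.out.println(chains(mentions)); // prints {0=[Alice, She], 1=[the garden]}
    }
}
```

In this representation, "Alice" and "She" behave like two references to the same variable: they are different strings at different offsets, but they resolve to the same entity ID.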

Coreference, like word-sense disambiguation, is a next-generation industrial capability. The challenges of coreference contribute to the IRS's insistence on a social security number that unambiguously identifies a person independently of their name. Many of the techniques discussed were developed to help track persons and organizations...