Natural Language Processing with Java and LingPipe Cookbook

Regular expression-based chunking for NER


Named Entity Recognition (NER) is the process of finding mentions of specific things in text. A simple name and location named-entity recognizer, for example, might find Ford Prefect and Guildford as the name and location mentions, respectively, in the following text:

Ford Prefect used to live in Guildford before he needed to move.

We will start by building rule-based NER systems and move up to machine-learning methods. Here, we'll take a look at building an NER system that can extract e-mail addresses from text.
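To make the idea concrete before running the recipe, here is a minimal sketch of regex-based mention finding that uses only the standard `java.util.regex` package. It is not the recipe's `RegexNer` class and does not use LingPipe; the class name `EmailRegexSketch`, the method `findEmailSpans`, the `EMAIL` label, and the deliberately simple e-mail pattern are all assumptions made for illustration. Each match is reported as a start/end character offset into the input, which is the essence of a chunk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailRegexSketch {

    // Assumed, intentionally simple e-mail pattern; real-world addresses
    // can be more varied than this expression allows.
    static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Returns one {start, end} pair per match; end is exclusive,
    // so text.substring(start, end) is the matched mention.
    static List<int[]> findEmailSpans(String text) {
        List<int[]> spans = new ArrayList<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            spans.add(new int[] { matcher.start(), matcher.end() });
        }
        return spans;
    }

    public static void main(String[] args) {
        // Hypothetical input text, not taken from the recipe.
        String text = "Write to alice@example.com or bob.smith@example.org.";
        for (int[] span : findEmailSpans(text)) {
            // Print offsets, an illustrative type label, and the covered text.
            System.out.println(span[0] + "-" + span[1] + ":EMAIL "
                + text.substring(span[0], span[1]));
        }
    }
}
```

Because the pattern requires the address to end in a run of letters, the sentence-final period in the sample text is left outside the second span. A rule-based recognizer of this kind is only as good as its expression, which is one reason to move on to machine-learning methods later.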

How to do it…

  1. Enter the following command into the command prompt:

    java -cp lingpipe-cookbook.1.0.jar:lib/lingpipe-4.1.0.jar com.lingpipe.cookbook.chapter5.RegexNer
    
  2. Interaction with the program proceeds as follows:

    Enter text, . to quit:
    >Hello,my name is Foo and my email is [email protected] or you can also contact me at [email protected].
    input=Hello,my name is Foo and my email is [email protected] or you can also contact me at [email protected].
    chunking...