Creating a StringField
Let's begin with a quick recap of field objects in Lucene: they belong to a document and hold information about it. A field is composed of three parts: name, type, and value. Values can be text, binary, or numeric. A field can also be stored in the index so that its value is returned along with hits. Lucene provides a number of field implementations out of the box that are suitable for most applications. In this section, we will cover StringField, a field implementation that stores the literal string. Any value stored in this field is indexed but not tokenized; the entire string is treated as a single token.
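To make the single-token behavior concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class SingleTokenSketch and both helper methods are hypothetical, and the whitespace-and-lowercase split is only a stand-in for what an analyzer might do. In Lucene 4.x and later, the field itself is created with the StringField(String name, String value, Field.Store stored) constructor.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class SingleTokenSketch {

    // Hypothetical helper: keep the whole value as one term,
    // which is how an untokenized literal string field behaves.
    public static Set<String> asSingleToken(String value) {
        return Collections.singleton(value);
    }

    // Hypothetical helper: naive lowercase + whitespace split,
    // standing in for the work an analyzer would do on a tokenized field.
    public static Set<String> asTokens(String value) {
        Set<String> tokens = new LinkedHashSet<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String city = "New York";
        System.out.println(asSingleToken(city)); // prints [New York]
        System.out.println(asTokens(city));      // prints [new, york]
    }
}
```

With the single-token form, only an exact match on "New York" finds the value; with the tokenized form, a search for "york" alone would match as well.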
So why would we not want to tokenize the text, given how much we have already said about tokenization? Suppose that part of a document is an address, with fields such as street address, city, state, and country. It's not a very good idea to analyze and tokenize the city, state, and country, because it's...