Lucene 4 Cookbook

By: Edwood Ng, Vineeth Mohan

Overview of this book

This book is for software developers who are new to Lucene and who want to explore the more advanced topics to build a search engine. Knowledge of Java is necessary to follow the code samples. You will learn core concepts, best practices, and also advanced features, in order to build an effective search application.

TermVectors


TermVectors is a Lucene feature that lets you retrieve per-document, term-based statistical data from the index. These additional data points can be useful for features such as highlighting or term-based reporting and analysis. As you may expect, this feature is not enabled by default, because computing these data points can be expensive and storing them would increase the index size significantly.

TermVectors provides the following additional data points for each document:

  • Term frequency

  • Term position(s)

  • Term offsets

Term frequency is the number of times a term appears in a document. Position is the term's ordinal place in the document's sequence of terms; it is incremented by one for each successive term. Offsets are the starting and ending character positions at which the term can be located in the document.
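
To make these three data points concrete, here is a minimal sketch in plain Java that computes them by hand. It does not use the Lucene API: the class `TermVectorSketch`, the method `collect`, and the sample text are hypothetical names chosen for illustration, and the sketch assumes simple whitespace tokenization with end-exclusive offsets. A real Lucene analyzer may lowercase terms, drop stop words, or otherwise produce different values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: mimics the kind of data TermVectors exposes,
// without using Lucene. Assumes whitespace tokenization.
final class TermVectorSketch {

    /** Statistics collected for one term within one document. */
    static final class TermStats {
        int frequency;                                  // times the term occurs
        final List<Integer> positions = new ArrayList<>(); // token positions, 0-based
        final List<int[]> offsets = new ArrayList<>();  // {start, end}, end exclusive
    }

    /** Scans the text once and records frequency, positions, and offsets per term. */
    static Map<String, TermStats> collect(String text) {
        Map<String, TermStats> stats = new LinkedHashMap<>();
        int position = 0;
        int i = 0;
        int n = text.length();
        while (i < n) {
            // Skip whitespace between terms.
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            int start = i;
            // Consume characters up to the next whitespace.
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            String term = text.substring(start, i);
            TermStats s = stats.computeIfAbsent(term, k -> new TermStats());
            s.frequency++;
            s.positions.add(position++);
            s.offsets.add(new int[] {start, i});
        }
        return stats;
    }

    public static void main(String[] args) {
        // Hypothetical sample text, chosen so that some terms repeat.
        Map<String, TermStats> stats = collect("to be or not to be");
        for (Map.Entry<String, TermStats> e : stats.entrySet()) {
            TermStats s = e.getValue();
            StringBuilder offsets = new StringBuilder();
            for (int[] o : s.offsets) {
                offsets.append('[').append(o[0]).append(',').append(o[1]).append(") ");
            }
            System.out.println(e.getKey() + " freq=" + s.frequency
                    + " positions=" + s.positions
                    + " offsets=" + offsets.toString().trim());
        }
    }
}
```

Under these assumptions, the term "to" has a frequency of 2, appears at positions 0 and 4, and spans the character offsets 0 to 2 and 13 to 15. Positions count terms, whereas offsets count characters, which is why offsets are the data point a highlighter would need.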

Let's look at an example of what you can expect to see in TermVectors. Here is a piece of text to be added to a document:

humpty dumpty sat on a wall

Here is what you will retrieve from TermVectors:

Term    Frequency    Position

...