Lucene 4 Cookbook

By: Edwood Ng, Vineeth Mohan

Index

A

  • abstract class
    • acceptsDocsOutOfOrder() / How it works...
    • collect(int) / How it works...
    • setNextReader(AtomicReaderContext) / How it works...
    • setScorer(Scorer) / How it works...
  • acquire
    • about / Using the SearcherManager to refresh IndexSearcher
  • action types
    • index / Performing bulk indexing
    • create / Performing bulk indexing
    • delete / Performing bulk indexing
    • update / Performing bulk indexing
  • advanced filtering
    • performing / Performing advanced filtering, How it works…
  • analysis process
    • defining / Introduction
  • analyzer
    • creating / Creating an analyzer, How it works…
  • analyzers, Lucene
    • WhitespaceAnalyzer / Getting ready
    • SimpleAnalyzer / Getting ready
    • StopAnalyzer / Getting ready
    • StandardAnalyzer / Getting ready
    • SnowballAnalyzer / Getting ready
  • Apache Solr / Some Lucene implementations
  • arguments, filter
    • about / How to do it...
  • autogenerated phrase query / Autogenerated phrase query
  • autosuggest implementations
    • AnalyzingSuggester / Employing autosuggest
    • AnalyzingInfixSuggester / Employing autosuggest
    • FreeTextSuggester / Employing autosuggest
    • FuzzySuggester / Employing autosuggest
  • autosuggest module
    • employing / Employing autosuggest, Getting ready…, How to do it…

B

  • Backus Normal Form (BNF) / Creating queries with the Lucene QueryParser
  • BBoxStrategy / Exploring spatial search
  • BM25 model
    • about / The BM25 model
    • implementing / Implementing the BM25 model, How to do it…, How it works…
  • Boolean model (BM)
    • about / Introduction
  • BooleanQuery
    • about / BooleanQuery, How it works…
  • boosting
    • about / Delving into field norms
    • index time boost / Delving into field norms
    • query time boost / Delving into field norms
  • built-in filters, Lucene
    • defining / Performing advanced filtering
    • TermRangeFilter / Performing advanced filtering
    • NumericRangeFilter / Performing advanced filtering
    • FieldCacheRangeFilter / Performing advanced filtering
    • QueryWrapperFilter / Performing advanced filtering
    • PrefixFilter / Performing advanced filtering
    • FieldCacheTermsFilter / Performing advanced filtering
    • FieldValueFilter / Performing advanced filtering
    • CachingWrapperFilter / Performing advanced filtering
  • built-in spatial strategies
    • BBoxStrategy / Exploring spatial search
    • PointVectorStrategy / Exploring spatial search
    • PrefixTreeStrategy / Exploring spatial search
    • SerializedDVStrategy / Exploring spatial search
  • bulk indexing
    • performing / Performing bulk indexing, How to do it…, How it works…

C

  • CLucene
    • URL / Some Lucene implementations
  • cluster customization
    • URL / There's more…
  • clustering
    • about / Scaling Elasticsearch
  • collectors
    • using / Using Collectors, How it works...
  • common analyzer
    • obtaining / Obtaining a common analyzer, How to do it..., How it works..., There's more…
  • components, divergence from randomness model
    • BasicModel / Implementing the divergence from randomness model
    • AfterEffect / Implementing the divergence from randomness model
    • Normalization / Implementing the divergence from randomness model
  • components, information-based model
    • Distribution / Implementing the information-based model
    • Lambda / Implementing the information-based model
    • Normalization / Implementing the information-based model
  • ControlledRealTimeReopenThread class
    • about / Generational indexing with TrackingIndexWriter
  • cURL
    • URL / Introduction
  • custom analyzers
    • defining / Defining custom analyzers
  • custom attributes
    • defining / Defining custom attributes, How it works…
  • custom Collector
    • building / How to do it...
  • custom FieldComparator
    • sorting with / Sorting with custom FieldComparator, How it works...
  • custom filter
    • creating / Creating a custom filter, How to do it...
  • CustomScoreQuery
    • defining / CustomScoreQuery, How to do it…, How it works…
  • custom TokenFilters
    • defining / Defining custom TokenFilters, How it works…
  • custom tokenizers
    • defining / Defining custom tokenizers

D

  • date resolution / Date resolution
  • default operator / Default operator
  • DirectoryReader
    • used, for opening index in Near Real-Time / Using the DirectoryReader to open index in Near Real-Time, How it works...
  • DisjunctionMaxQuery
    • defining / DisjunctionMaxQuery, How it works…
  • divergence from randomness model
    • about / The divergence from randomness model
    • implementing / Implementing the divergence from randomness model, How to do it…, How it works…
  • DocIdSet
    • about / Creating a custom filter
  • document
    • adding / Adding a document, How to do it..., How it works...
    • deleting / Deleting a document
    • updating / Updating a document, How to do it…
  • document ID (called DocId) / How Lucene works
  • document object per thread
    • reusing / Reusing field and document objects per thread, How it works...
  • documents
    • adding, to index / Creating and writing documents to an index
    • deleting / Deleting documents, How it works…
  • documents, index
    • defining / How it works…
  • DocValue field
    • creating / Creating a DocValue Field, How it works...
  • DocValue types
    • BinaryDocValues / Creating a DocValue Field
    • NumericDocValues / Creating a DocValue Field
    • SortedDocValues / Creating a DocValue Field
    • SortedNumericDocValues / Creating a DocValue Field
    • SortedSetDocValues / Creating a DocValue Field
  • DSL (domain-specific language)
    • about / Searching the index

E

  • Eclipse IDE
    • URL / Getting ready
  • Elasticsearch
    • about / Introduction
    • obtaining / Getting Elasticsearch, How to do it...
    • URL / Getting ready
    • starting / How to do it...
    • scaling / Scaling Elasticsearch, How to do it…, How it works…
  • Elasticsearch-head plugin
    • URL / There's more…
  • Extract, Transform, Load (ETL) process / How Lucene works

F

  • faceting
    • performing / Performing faceting, How it works…
    • benefits / Performing faceting
    • implementing / Performing faceting
  • Ferret
    • URL / Some Lucene implementations
  • field boost
    • about / Introduction
  • FieldCache
    • used, for un-inverting single-valued fields in memory / Un-inverting single-valued fields in memory with FieldCache, How it works...
    • about / Un-inverting single-valued fields in memory with FieldCache, Creating a custom filter
  • FieldComparator
    • defining / Sorting with custom FieldComparator
  • field mappings
    • predefining / Predefine field mappings, How to do it..., How it works...
  • field norms
    • delving into / Delving into field norms, How it works...
  • Field object
    • name / Creating a StringField
    • type / Creating a StringField
    • value / Creating a StringField
  • field object per thread
    • reusing / Reusing field and document objects per thread, How it works...
  • fields
    • creating / Creating fields
  • formula, Lucene
    • tf(t in d) / Introduction
    • idf(t) / Introduction
    • coord(q,d) / Introduction
    • queryNorm(q) / Introduction
    • t.getBoost() / Introduction
    • norm(t,d) / Introduction
    • computeNorm(FieldInvertState) / Introduction
  • FSDirectory / Obtaining an IndexWriter
  • fuzzy query / Fuzzy query
  • FuzzyQuery
    • defining / FuzzyQuery

G

  • generational indexing, with TrackingIndexWriter
    • about / Generational indexing with TrackingIndexWriter, How it works...
  • geo-spatial search
    • about / Exploring spatial search
  • grouping
    • implementing / Implementing grouping, How to do it…
  • grouping implementations
    • single-pass search / Implementing grouping
    • two-pass search / Implementing grouping

H

  • highlighting
    • implementing / Implementing highlighting, How it works…
  • HTTP PUT method
    • using / How to do it…

I

  • index
    • about / Introduction
    • searching / Searching the index, How it works...
  • index-time join / Implementing joins
    • and query-time join, comparing / Implementing joins
  • IndexDeletionPolicy
    • KeepOnlyLastCommitDeletionPolicy / Transactional commits and index versioning
    • NoDeletionPolicy / Transactional commits and index versioning
    • SnapshotDeletionPolicy / Transactional commits and index versioning
    • PersistentSnapshotDeletionPolicy / Transactional commits and index versioning
  • indexing / How Lucene works
  • IndexReader
    • about / Obtaining IndexReaders
    • AtomicReader / Obtaining IndexReaders
    • CompositeReader / Obtaining IndexReaders
  • IndexReader attributes
    • about / Introduction
  • IndexReaders
    • obtaining / Obtaining IndexReaders, How it works...
  • IndexSearcher
    • obtaining / Obtaining an IndexSearcher
    • defining / IndexSearcher
    • refreshing, SearcherManager used / Using the SearcherManager to refresh IndexSearcher, How it works...
  • index segments
    • about / Introduction
  • index time boost
    • about / Delving into field norms
  • index versioning
    • about / Transactional commits and index versioning, How it works...
  • IndexWriter
    • obtaining / Obtaining an IndexWriter, Obtaining an IndexWriter, How it works...
  • information-based model
    • about / The information-based model
    • implementing / Implementing the information-based model, How it works…
  • inheritance / How Lucene works
  • inverted index
    • defining / Introduction

J

  • Java
    • download page / Getting ready
  • Javadoc, Lucene
    • URL / DisjunctionMaxQuery
  • joins
    • implementing / Implementing joins, How to do it…
  • join methods, types
    • index-time join / Implementing joins
    • query-time join / Implementing joins

K

  • Kibana
    • about / Introduction
  • KinoSearch
    • URL / Some Lucene implementations

L

  • language model
    • about / The language model
    • implementing / Implementing the language model, How to do it…
  • latency
    • about / Performance tuning: latency and throughput, How to do it…, How it works…
    • benefits / How to do it…
  • lengthNorm
    • about / Introduction
  • Logstash
    • about / Introduction
  • lowercase expanded term / Lowercase expanded term
  • Lucene
    • about / Introduction, Introduction, Introduction
    • three-stage process flow / Introduction
    • working / How Lucene works
    • features / Why is Lucene so popular?
    • wiki page, URL / Why is Lucene so popular?
    • implementations / Some Lucene implementations
    • installing / Installing Lucene, How to do it...
    • official page / How to do it...
    • URL / How it works...
  • Lucene.Net
    • URL / Some Lucene implementations
  • Lucene4c
    • URL / Some Lucene implementations
  • LuceneKit
    • URL / Some Lucene implementations
  • Lucene QueryParser
    • queries, creating with / Creating queries with the Lucene QueryParser
  • Lupy
    • URL / Some Lucene implementations

M

  • Maven
    • URL / How to do it...
  • Maven repository
    • URL / How to do it...
  • Montezuma
    • URL / Some Lucene implementations
  • MultiPhraseQuery
    • defining / PhraseQuery and MultiPhraseQuery
  • MUTIS
    • URL / Some Lucene implementations

N

  • Near Real-Time
    • DirectoryReader used, for opening index in / Using the DirectoryReader to open index in Near Real-Time, How it works...
  • Near Real-Time (NRT)
    • about / Introduction
  • new index
    • creating / Creating a new index, How it works…
  • NLucene
    • URL / Some Lucene implementations
  • node
    • about / Scaling Elasticsearch
  • norms
    • calculating / Delving into field norms
  • NRT
    • benefits / How it works…
    • about / How it works…
  • numeric field
    • creating / Creating a numeric field, How it works...
  • NumericRangeQuery
    • defining / NumericRangeQuery

O

  • Object-Oriented Programming (OOP) / How Lucene works
  • Okapi BM25
    • about / The BM25 model
  • OpenMode options
    • APPEND / Obtaining an IndexWriter
    • CREATE / Obtaining an IndexWriter
    • CREATE_OR_APPEND / Obtaining an IndexWriter
  • ordered tree data structure
    • about / Creating a numeric field

P

  • pagination
    • about / Pagination, How it works...
  • PerFieldAnalyzerWrapper
    • using / Using PerFieldAnalyzerWrapper, How to do it…, How it works…
    • example / Using PerFieldAnalyzerWrapper
  • performance
    • improving / How to do it…
  • PhraseQuery
    • defining / PhraseQuery and MultiPhraseQuery
  • phrase slop / Phrase slop
  • Plucene
    • URL / Some Lucene implementations
  • plugins, Elasticsearch
    • URL / There's more…
  • PointVectorStrategy / Exploring spatial search
  • PositionIncrementAttribute
    • using / Using PositionIncrementAttribute, How to do it..., How it works…
  • precisionStep
    • about / Creating a numeric field
  • PrefixQuery
    • about / PrefixQuery and WildcardQuery, How it works…
  • PrefixTreeStrategy
    • about / Exploring spatial search
    • RecursivePrefixTreeStrategy / Exploring spatial search
    • TermQueryPrefixTreeStrategy / Exploring spatial search
  • PyLucene
    • URL / Some Lucene implementations

Q

  • queries
    • creating, with Lucene QueryParser / Creating queries with the Lucene QueryParser
    • constructing / Constructing queries, How it works...
  • Query / Creating queries with the Lucene QueryParser
  • query-time join / Implementing joins
    • and index-time join, comparing / Implementing joins
  • Query DSL page, Elasticsearch
    • URL / There's more…
  • QueryParser
    • searching with / Searching with QueryParser, How to do it...
    • wildcard search / Wildcard search
    • term range search / Term range search
    • autogenerated phrase query / Autogenerated phrase query
    • date resolution / Date resolution
    • default operator / Default operator
    • position increments, enabling / Enable position increments
    • fuzzy query / Fuzzy query
    • lowercase expanded term / Lowercase expanded term
    • phrase slop / Phrase slop
  • query time boost
    • about / Delving into field norms

R

  • RAMDirectory / Obtaining an IndexWriter
  • ranking value
    • about / How to do it…
  • RegexpQuery
    • defining / RegexpQuery
  • RegExp syntax
    • URL / RegexpQuery
  • release
    • about / Using the SearcherManager to refresh IndexSearcher
  • relevancy ranking
    • about / Introduction
  • replica
    • about / Scaling Elasticsearch
  • results
    • enumerating / Enumerating results, How it works…

S

  • scoring
    • about / Introduction
  • scoring models
    • BM25 model / The BM25 model
    • language model / The language model
    • divergence from randomness model / The divergence from randomness model
    • information-based model / The information-based model
  • search
    • performing / Performing a search, How it works…
  • SearcherLifetimeManager
    • search sessions, maintaining with / Maintaining search sessions with SearcherLifetimeManager, How to do it…, How it works…
  • SearcherManager
    • used, for refreshing IndexSearcher / Using the SearcherManager to refresh IndexSearcher, How it works...
  • search result
    • forming / Forming a search result, How it works...
  • search sessions
    • maintaining, with SearcherLifetimeManager / Maintaining search sessions with SearcherLifetimeManager, How to do it…, How it works…
  • SerializedDVStrategy / Exploring spatial search
  • sharding
    • about / Scaling Elasticsearch
  • similarity class
    • overriding / Overriding similarity, How to do it…, How it works…
  • Similarity class
    • computeNorm / How to do it…
    • computeWeight / How to do it…
    • simScorer / How to do it…
  • similarity implementation
    • changing / Changing similarity implementation used during indexing, How to do it…
  • similarity methods
    • computeNorm(FieldInvertState) / Overriding similarity
    • computeWeight(float, CollectionStatistics, TermStatistics) / Overriding similarity
    • coord(int, int) / Overriding similarity
    • queryNorm(float) / Overriding similarity
    • simScorer(Similarity.SimWeight, AtomicReaderContext) / Overriding similarity
  • SimpleAnalyzer / How it works…
  • simple Java Lucene project
    • setting up / Setting up a simple Java Lucene project, How to do it...
  • single-pass search
    • defining / Implementing grouping
  • single-valued fields, in memory
    • un-inverting, with FieldCache / Un-inverting single-valued fields in memory with FieldCache, How it works...
  • SnowballAnalyzer / How it works…
  • Solr
    • about / Introduction
  • Sort class
    • RELEVANCE / Specifying sort logic
    • INDEXORDER / Specifying sort logic
  • SortedDocValues
    • about / Creating a custom filter
  • sort logic
    • specifying / Specifying sort logic, How it works...
  • SpanQuery
    • defining / SpanQuery, How it works…
    • SpanTermQuery / SpanQuery
    • SpanNearQuery / SpanQuery
    • SpanFirstQuery / SpanQuery
    • SpanNotQuery / SpanQuery
    • SpanOrQuery / SpanQuery
    • SpanMultiTermQueryWrapper / SpanQuery
    • FieldMaskingSpanQuery / SpanQuery
    • SpanPositionRangeQuery / SpanQuery
  • Spatial4j
    • URL / There's more…
  • spatial search
    • exploring / Exploring spatial search, How to do it…, How it works…
  • StandardAnalyzer / How it works…
  • stemming
    • about / Introduction
  • StopAnalyzer / How it works…
  • stopword filtering
    • about / Introduction
  • stopword removal
    • about / Introduction
  • StringField
    • creating / Creating a StringField, How it works...
  • synonym expansion
    • about / Introduction

T

  • Term
    • about / Introduction
  • TermQuery
    • about / TermQuery and TermRangeQuery
  • TermRangeQuery
    • about / TermQuery and TermRangeQuery
  • term range search / Term range search
  • TermVectors
    • about / TermVectors, How it works...
    • retrieving from / TermVectors
  • TextField
    • creating / Creating a TextField
  • text normalization
    • about / Introduction
  • TFIDF (term frequency and inverse document frequency)
    • about / Introduction
  • TFIDFSimilarity
    • about / Changing similarity implementation used during indexing, Introduction
  • throughput
    • about / Performance tuning: latency and throughput, How to do it…, How it works…
  • token attribute interface
    • CharTermAttribute / Getting ready
    • PositionIncrementAttribute / Getting ready
    • OffsetAttribute / Getting ready
    • TypeAttribute / Getting ready
    • FlagsAttribute / Getting ready
    • PayloadAttribute / Getting ready
  • TokenAttribute values
    • obtaining / Obtaining TokenAttribute values, How to do it…, How it works…
  • TokenFilter
    • defining / Introduction
  • TokenStream
    • defining / Introduction
    • obtaining / Obtaining a TokenStream
  • transactional commits
    • about / Transactional commits and index versioning, How it works...
    • atomicity / Transactional commits and index versioning
    • consistency / Transactional commits and index versioning
    • isolation / Transactional commits and index versioning
    • durability / Transactional commits and index versioning
  • two-pass search
    • defining / Implementing grouping
    • example / How to do it…

V

  • vector space model (VSM)
    • about / Introduction
  • vertical scaling
    • about / Scaling Elasticsearch

W

  • WildcardQuery
    • about / PrefixQuery and WildcardQuery, How it works…
  • wildcard search / Wildcard search

Z

  • Zend Search
    • URL / Some Lucene implementations