2. Indexing Your Data | Solr Cookbook

Solr Cookbook - Third Edition

By: Rafal Kuc

3.8 (6)

Overview of this book

This book is for intermediate Solr developers who want to learn and apply professional-level practices, techniques, and solutions. This edition will particularly appeal to developers who want to get to grips quickly with the changes and new features of Apache Solr 5.

Preface

What this book covers

What you need for this book

Who this book is for

Sections

Conventions

Reader feedback

Customer support

1. Apache Solr Configuration

Introduction

Running Solr on a standalone Jetty

Installing ZooKeeper for SolrCloud

Migrating configuration from master-slave to SolrCloud

Choosing the proper directory configuration

Configuring the Solr spellchecker

Using Solr in a schemaless mode

Limiting I/O usage

Using core discovery

Configuring SolrCloud for NRT use cases

Configuring SolrCloud for high-indexing use cases

Configuring SolrCloud for high-querying use cases

Configuring the Solr heartbeat mechanism

Changing similarity

2. Indexing Your Data

Introduction

Indexing PDF files

Counting the number of fields

Using parsing update processors to parse data

Using scripting update processors to modify documents

Indexing data from a database using Data Import Handler

Incremental imports with DIH

Transforming data when using DIH

Indexing multiple geographical points

Updating document fields

Detecting the document language during indexation

Optimizing the primary key indexation

Handling multiple currencies

3. Analyzing Your Text Data

Introduction

Using the enumeration type

Removing HTML tags during indexing

Storing data outside of Solr index

Using synonyms

Stemming different languages

Using nonaggressive stemmers

Using the n-gram approach to do performant trailing wildcard searches

Using position increment to divide sentences

Using patterns to replace tokens

4. Querying Solr

Introduction

Understanding and using the Lucene query language

Using position aware queries

Using boosting with autocomplete

Phrase queries with shingles

Handling user queries without errors

Handling hierarchies with nested documents

Sorting data on the basis of a function value

Controlling the number of terms needed to match

Affecting document score using function queries

Using simple nested queries

Using the Solr document query join functionality

Handling typos with n-grams

Rescoring query results

5. Faceting

Introduction

Getting the number of documents with the same field value

Getting the number of documents with the same value range

Getting the number of documents matching the query and subquery

Removing filters from faceting results

Using decision tree faceting

Calculating faceting for relevant documents in groups

Improving faceting performance for low cardinality fields

6. Improving Solr Performance

Introduction

Handling deep paging efficiently

Configuring the document cache

Configuring the query result cache

Configuring the filter cache

Improving Solr query performance after the start and commit operations

Lowering the memory consumption of faceting and sorting

Speeding up indexing with Solr segment merge tuning

Avoiding caching of rare filters to improve the performance

Controlling the filter execution to improve expensive filter performance

Configuring numerical fields for high-performance sorting and range queries

7. In the Cloud

Introduction

Creating a new SolrCloud cluster

Setting up multiple collections on a single cluster

Splitting shards

Having more than a single shard from a collection on a node

Creating a collection on defined nodes

Adding replicas after collection creation

Removing replicas

Moving shards between nodes

Using aliasing

Using routing

8. Using Additional Functionalities

Introduction

Finding similar documents

Highlighting fragments found in documents

Efficient highlighting

Using versioning

Retrieving information about the index structure

Altering the index structure on a live collection

Grouping documents by the field value

Grouping documents by the query value

Grouping documents by the function value

Efficient documents grouping using the post filter

9. Dealing with Problems

Introduction

Dealing with the too many opened files exception

Diagnosing and dealing with memory problems

Configuring sorting for non-English languages

Migrating data to another collection

SolrCloud read-side fault tolerance

Using the check index functionality

Adjusting the Jetty configuration to avoid deadlocks

Tuning segment merging

Avoiding swapping

10. Real-life Situations

Introduction

Implementing the autocomplete functionality for products

Implementing the autocomplete functionality for categories

Handling time-sliced data using aliases

Boosting words closer to each other

Using the Solr spellchecking functionality

Using the Solr administration panel for monitoring

Automatically expiring Solr documents

Exporting whole query results

Index

Customer Reviews

3.8 (6)

5 star: 16.7%
4 star: 50%
3 star: 33.3%
2 star
1 star

Indexing PDF files

How to do it...
