
Elasticsearch 8.x Cookbook

In Spark, you can read data from many sources, but NoSQL data stores such as HBase, Accumulo, and Cassandra generally support only a limited set of queries, so you often need to scan all the data to read only what is required. With Elasticsearch, you can retrieve just the subset of documents that matches your Elasticsearch query, which can make reading the data several times faster.
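As a minimal sketch of this idea, the following Scala snippet reads only the documents that match a query. It assumes a spark-shell session started with the elasticsearch-hadoop (elasticsearch-spark) connector on the classpath and an Elasticsearch node reachable with the connector's default settings. The index name `mybooks` and the query `?q=title:spark` are hypothetical placeholders, not values taken from this recipe.

```scala
// Assumption: run inside spark-shell, where `sc` (SparkContext) already exists
// and the elasticsearch-spark connector JAR is on the classpath.
import org.elasticsearch.spark._ // adds the esRDD method to SparkContext

// Hypothetical index name and URI query; replace them with your own.
val index = "mybooks"
val query = "?q=title:spark"

// The query is sent to Elasticsearch, so only matching documents are read
// instead of the whole index. Each element is a (document id, fields) pair.
val matching = sc.esRDD(index, query)

// Print a few of the matching documents.
matching.take(5).foreach { case (id, doc) => println(s"$id -> $doc") }
```

Calling `sc.esRDD(index)` without a query would read every document in the index, which is the full scan that the query is meant to avoid.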
You need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe in Chapter 1, Getting Started.
You also need a working installation of Apache Spark and the data that we indexed in the previous example.
To read data from Elasticsearch via Apache Spark, we will perform the following steps:
./bin/spark-shell \ --conf spark.es.index...