Elastic Stack 8.x Cookbook

By: Huage Chen, Yazid Akadiri

Overview of this book

Learn how to make the most of the Elastic Stack (ELK Stack) products, including Elasticsearch, Kibana, Elastic Agent, and Logstash, to take data reliably and securely from any source, in any format, and then search, analyze, and visualize it in real time. This cookbook takes a practical approach to unlocking the full potential of the Elastic Stack through detailed, step-by-step recipes. Starting with installing and ingesting data using Elastic Agent and Beats, this book guides you through data transformation and enrichment with various Elastic components and explores the latest advancements in search applications, including semantic search and Generative AI. You'll then visualize and explore your data and create dashboards using Kibana. As you progress, you'll advance your skills with machine learning for data science, get to grips with natural language processing, and discover the power of vector search. The book covers Elastic Observability use cases for log, infrastructure, and synthetics monitoring, along with essential strategies for securing the Elastic Stack. Finally, you'll gain expertise in Elastic Stack operations to effectively monitor and manage your system.
Table of Contents (16 chapters)

Deleting data in Elasticsearch

In this recipe, we will explore how to delete a document from an Elasticsearch index.

Getting ready

Refer to the requirements for the Updating data in Elasticsearch recipe.

Make sure to download the following Python script from the GitHub repository: https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter2/python-client-sample/sampledata_delete.py.

The snippets of the recipe are available at https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter2/snippets.md#deleting-data-in-elasticsearch.

How to do it…

  1. First, let us inspect the sampledata_delete.py Python script. Like the process in the previous recipe, we need to retrieve document_id from the tmp.txt file:
    with open('tmp.txt', 'r') as file:
        document_id = file.read()
  2. We can now check that document_id is not empty, verify that the document exists in the index, and then delete it by using the previously obtained document_id:
    if document_id != '':
        if es.exists(index=index_name, id=document_id):
            # delete the document in Elasticsearch
            response = es.delete(index=index_name, id=document_id)
            print(f"delete status: {response['result']}")
  3. After reviewing the delete script, execute it with the following command:
    $ python sampledata_delete.py

    You should see the following output:

Figure 2.8 – The output of the sampledata_delete.py script

  4. For further verification, return to the Dev Tools in Kibana and execute the search request again on the movies index:
    GET movies/_search

    This time, the result should reflect the deletion:

Figure 2.9 – The search results in the movies index after deletion

The total hits will now be 0, confirming that the document has been successfully deleted.
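The steps above can be collected into two small helpers. This is a minimal sketch rather than the book's script: the names read_document_id and delete_if_exists are hypothetical, es is assumed to be an already-connected Elasticsearch Python client as in sampledata_delete.py, and stripping whitespace from the ID and returning an empty string for a missing tmp.txt are assumptions added for robustness.

```python
from pathlib import Path


def read_document_id(path="tmp.txt"):
    """Return the document ID stored in the file, or '' if the file is missing."""
    file = Path(path)
    if not file.exists():
        return ""
    # strip() guards against a trailing newline, which would not match the stored ID
    return file.read_text().strip()


def delete_if_exists(es, index_name, document_id):
    """Delete the document if it exists; return the result string, else None.

    `es` is assumed to expose exists() and delete() like the Elasticsearch
    Python client used in the recipe.
    """
    if not document_id:
        return None
    if not es.exists(index=index_name, id=document_id):
        return None
    # delete the document in Elasticsearch
    response = es.delete(index=index_name, id=document_id)
    return response["result"]
```

Calling delete_if_exists(es, "movies", read_document_id()) would then return "deleted" on success and None when there is nothing to delete, which makes the no-op case explicit instead of silent.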

How it works...

When a document is deleted in Elasticsearch, it is not immediately removed from disk. Instead, Elasticsearch marks the document as deleted in the segment that holds it, so that it no longer appears in search results. The marked document stays in the index until a background segment merge rewrites that segment, at which point Elasticsearch physically expunges it.

This mechanism allows Elasticsearch to handle deletions efficiently. Because segments are immutable, removing a document outright would mean rewriting the segment that contains it. By marking documents as deleted instead, Elasticsearch defers that cost to merges that run as controlled background tasks.
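One way to observe this behavior is to read the deleted-document counter from the index stats API. The following is a minimal sketch, assuming es is a connected Elasticsearch Python client; deleted_doc_count is a hypothetical helper name. Note that the counter also includes old versions left behind by updates, and it may already have dropped to 0 if a merge has run.

```python
def deleted_doc_count(es, index_name):
    """Return how many documents in the primaries are marked deleted but not yet merged away.

    Assumes `es.indices.stats` behaves like the Elasticsearch Python client's
    index stats call, restricted here to the "docs" metric.
    """
    stats = es.indices.stats(index=index_name, metric="docs")
    return stats["_all"]["primaries"]["docs"]["deleted"]
```

Running deleted_doc_count(es, "movies") shortly after the delete script is likely to return a non-zero value, and 0 once the affected segment has been merged.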

There’s more…

While we have discussed deleting a single document by document_id, this is not an efficient approach for deleting many documents. For such scenarios, the Delete By Query API is more suitable; an example request follows the note below.

Note

Before executing the upcoming query, it is necessary to re-index the document, since it was deleted earlier in the recipe. Ensure that you have re-added the document to the movies index by executing the sampledata_index.py Python script.

POST /movies/_delete_by_query
{
  "query": {
    "match": {
      "genre": "comedy"
    }
  }
}

The preceding request deletes every document in the movies index whose genre field matches comedy.

Also, when deleting many documents, the best practice is to use Delete By Query with the slices parameter. Slicing splits one large deletion task into smaller sub-tasks that run in parallel, which can shorten the overall run time. For example, appending ?slices=auto to the request lets Elasticsearch choose the number of slices for you.
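The same sliced deletion can be issued from the Python client. This is a minimal sketch, assuming es is a connected Elasticsearch 8.x Python client; delete_by_genre is a hypothetical helper name, and the genre field and movies index come from the example request earlier in this section.

```python
def delete_by_genre(es, index_name, genre, slices="auto"):
    """Delete all documents whose genre matches, using sliced Delete By Query.

    `slices` may be an integer or "auto"; with "auto", Elasticsearch picks
    the number of slices itself.
    """
    response = es.delete_by_query(
        index=index_name,
        query={"match": {"genre": genre}},
        slices=slices,
    )
    # Report how many documents were deleted and any per-document failures
    return {"deleted": response["deleted"], "failures": response["failures"]}
```

For example, delete_by_genre(es, "movies", "comedy") mirrors the Dev Tools request with slices=auto added. Checking the failures list is worthwhile, because a Delete By Query request can complete while still reporting documents it could not delete.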

See also

For more details on the Delete By Query feature, refer to the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html.
