Elastic Stack 8.x Cookbook

By: Huage Chen, Yazid Akadiri

Overview of this book

Learn how to make the most of the Elastic Stack (ELK Stack) products, including Elasticsearch, Kibana, Elastic Agent, and Logstash, to take data reliably and securely from any source, in any format, and then search, analyze, and visualize it in real time. This cookbook takes a practical approach to unlocking the full potential of the Elastic Stack through detailed, step-by-step recipes. Starting with installing and ingesting data using Elastic Agent and Beats, this book guides you through data transformation and enrichment with various Elastic components and explores the latest advancements in search applications, including semantic search and Generative AI. You'll then visualize and explore your data and create dashboards using Kibana. As you progress, you'll advance your skills with machine learning for data science, get to grips with natural language processing, and discover the power of vector search. The book covers Elastic Observability use cases for log, infrastructure, and synthetics monitoring, along with essential strategies for securing the Elastic Stack. Finally, you'll gain expertise in Elastic Stack operations to effectively monitor and manage your system.

Defining index mapping

In Elasticsearch, mapping refers to the process of defining the schema or structure of an index. It defines how documents and their fields are stored and indexed within Elasticsearch. Mapping allows you to specify the data type of each field, such as text, keyword, numeric, or date types, and to configure various properties for each field, including indexing options and analyzers. By defining a mapping, you provide Elasticsearch with crucial information about the data you intend to index, enabling it to efficiently store, search, and analyze the documents.

Mapping plays a critical role in delivering precise search results, efficient data storage, and effective handling of different data types within Elasticsearch.

When no mapping is predefined, Elasticsearch attempts to dynamically infer data types and create the mapping; this is what has occurred with our movie dataset thus far.
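
To see dynamic mapping in action, a minimal sketch along the following lines can be run in Dev Tools. The index name dynamic-mapping-demo and the document values are hypothetical, chosen only for illustration. With default settings, a new string field is usually mapped as text with a keyword sub-field:

    # Index one document into a throwaway index that has no predefined mapping
    PUT /dynamic-mapping-demo/_doc/1
    {
      "title": "An example title",
      "genre": "comedy"
    }

    # Inspect the mapping that Elasticsearch inferred for the new fields
    GET /dynamic-mapping-demo/_mapping

    # Remove the throwaway index afterward
    DELETE /dynamic-mapping-demo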

In this recipe, we will apply an explicit mapping to the movies index.

Getting ready

Make sure that you have completed the Updating data in Elasticsearch recipe.

All the command snippets for the Dev Tools in this recipe are available at https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter2/snippets.md#defining-index-mapping.

How to do it…

You can define mappings during index creation or update them in an existing index.

An important note on mappings

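When you update the mapping of an existing index that already contains documents, those documents are not reindexed: the mapping change applies only to documents indexed afterward. In addition, the type of an existing field cannot be changed in place, which is why this recipe creates a new index and reindexes the data.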
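
As an optional illustration of this note, the following sketch adds a new field to an existing index. The director field is hypothetical and not part of the recipe's dataset; skip this step if you want the movies mapping to match the figures in this recipe. Adding a field in this way is accepted, whereas a similar request that tries to change the type of a field that already exists is rejected by Elasticsearch:

    # Add a new (hypothetical) field to the existing mapping
    PUT /movies/_mapping
    {
      "properties": {
        "director": {
          "type": "keyword"
        }
      }
    }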

In this recipe, you are going to create a new index with explicit mapping, and then reindex the data from the movies index, assuming that you have already created that index beforehand:

  1. Head to Kibana | Dev Tools.
  2. Next, let’s check the mapping of the previously created index with the following command:
    GET /movies/_mapping

    You will get the results shown in the following figure. Note that, for readability, some fields were collapsed.

Figure 2.13 – The default mapping on the movies index

Let’s review what’s going on in the figure:

a. Examining the current mapping of the genre field reveals a multi-field mapping technique. This approach allows a single field to be indexed in several ways to serve different purposes. For example, the genre field is indexed both as a text field for full-text search and as a keyword field for sorting and aggregation. This dual approach to mapping the genre field is actually beneficial and well-suited for its intended use cases.

b. The release_year field is currently indexed as text, which is not optimal: it holds numeric data, and a numeric type supports range queries and other numeric-specific operations. Retaining the keyword sub-field is still advantageous for sorting and aggregation purposes. The next step is therefore to apply an explicit mapping that treats release_year as a numeric field.

c. Two other fields require mapping adjustments: plot and cast. Because it is unlikely that you will need to sort or aggregate on them, these fields should be indexed solely as text, which still allows for effective full-text searching against them.
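
The multi-field behavior described in point a can be sketched with a single search request, assuming the movies index still has its default dynamic mapping. The search term comedy and the aggregation name top_genres are only illustrative: the match query runs against the analyzed genre text field, while the terms aggregation uses the genre.keyword sub-field:

    GET /movies/_search
    {
      "size": 3,
      "query": {
        "match": { "genre": "comedy" }
      },
      "aggs": {
        "top_genres": {
          "terms": { "field": "genre.keyword" }
        }
      }
    }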

  3. Now, let’s create a new index with the correct explicit mapping for the cast, plot, and release_year fields:
    PUT movies-with-explicit-mapping
    {
      "mappings": {
        "properties": {
          "release_year": {
            "type": "short",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "cast": {
            "type": "text"
          },
          "plot": {
            "type": "text"
          }
        }
      }
    }
  4. Next, reindex the original data into the new index so that the new mapping is applied:
    POST /_reindex
    {
      "source": {
        "index": "movies"
      },
      "dest": {
        "index": "movies-with-explicit-mapping"
      }
    }
  5. Check whether the new mapping has been applied to the new index:
    GET movies-with-explicit-mapping/_mapping

    Figure 2.14 shows the explicit mapping applied to the index:

Figure 2.14 – Explicit mapping
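
Once the data is reindexed, a range query is one way to confirm that release_year now behaves as a number. This is only a sketch, and the year bounds are arbitrary illustrative values:

    GET /movies-with-explicit-mapping/_search
    {
      "query": {
        "range": {
          "release_year": { "gte": 1990, "lte": 1999 }
        }
      },
      "sort": [
        { "release_year": "asc" }
      ]
    }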

How it works...

Explicit mapping in Elasticsearch lets you define the schema of your index yourself. Instead of relying on dynamic mapping, which detects a type and adds a mapping entry the first time each new field appears in an indexed document, explicit mapping gives you full control over the data types, field properties, and analysis settings for each field in your index, as shown in Figure 2.15:

Figure 2.15 – The field mapping options

There’s more…

Mapping is a key aspect of data modeling in Elasticsearch. Avoid relying on dynamic mapping and try, when possible, to explicitly define your mappings to have better control over the field types, properties, and analysis settings. This helps maintain consistency and avoids unexpected field mappings.
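
One way to stop relying on dynamic mapping is the dynamic mapping parameter. The sketch below uses a hypothetical index name and a single illustrative field. With dynamic set to strict, documents that contain fields missing from the mapping are rejected instead of silently extending it:

    PUT /strict-mapping-demo
    {
      "mappings": {
        "dynamic": "strict",
        "properties": {
          "title": { "type": "text" }
        }
      }
    }

    # Rejected: "rating" is not declared in the mapping
    PUT /strict-mapping-demo/_doc/1
    {
      "title": "An example title",
      "rating": 5
    }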

You should consider using multi-field mapping to index the same field in different ways, depending on the use cases. For instance, a full-text search of a string field requires a text mapping. If the same string field is mostly used for aggregations, filtering, or sorting, then mapping it as a keyword field is more efficient.

Also, consider using mapping limit settings to prevent a mapping explosion (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-settings-limit.html). With dynamic mapping, every newly ingested document can introduce new fields, and each of them is added to the index mapping. If this continues unchecked, the index ends up with too many fields. This situation is known as a mapping explosion, and it can lead to memory shortages and recovery challenges.

When it comes to mapping limit settings, there are several best practices to keep in mind. First, limit the number of field mappings to prevent documents from causing a mapping explosion. Second, limit the maximum depth of a field. Third, restrict the number of different nested mappings an index can have. Fourth, set a maximum for the count of nested JSON objects allowed in a single document, across all nested types. Finally, limit the maximum length of a field name. Keep in mind that setting higher limits can affect performance and cause memory problems.
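
The five limits listed above correspond to index settings. A sketch of how they might be set on the new index follows. The values are illustrative only (256 for the field name length in particular is an arbitrary choice), so check the mapping limit settings page referenced earlier before applying anything similar:

    PUT /movies-with-explicit-mapping/_settings
    {
      # Maximum number of fields in the index mapping
      "index.mapping.total_fields.limit": 1000,
      # Maximum depth of a field, counted in levels of nested objects
      "index.mapping.depth.limit": 20,
      # Maximum number of distinct nested mappings in the index
      "index.mapping.nested_fields.limit": 50,
      # Maximum number of nested JSON objects in a single document
      "index.mapping.nested_objects.limit": 10000,
      # Maximum length of a field name (illustrative value)
      "index.mapping.field_name_length.limit": 256
    }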

For many years now, Elastic has been developing a specification called Elastic Common Schema (ECS) that provides a consistent and customizable way to structure data in Elasticsearch. Adopting this mapping has a lot of benefits (data correlation, reuse, and future-proofing, to name a few), and as a best practice, always refer to the ECS convention when you consider naming your fields. We will see more examples using ECS in the next chapters.
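
As a small illustration of ECS-style naming, the sketch below maps a few fields using names taken from ECS (@timestamp, host.name, and source.ip). The index name ecs-naming-demo is hypothetical, and the field selection is only an example, not a complete ECS mapping:

    PUT /ecs-naming-demo
    {
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date" },
          "host": {
            "properties": {
              "name": { "type": "keyword" }
            }
          },
          "source": {
            "properties": {
              "ip": { "type": "ip" }
            }
          }
        }
      }
    }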

See also
