
Elasticsearch is often used to store machine learning data for training algorithms. X-Pack provides the Dense Vector field to store vectors with up to 2,048 dimensions.
You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.
To execute the commands in this recipe, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or similar. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.
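For example, before running the steps, you can check that the cluster is reachable with a quick curl call. This is only a sketch that assumes a default installation listening on http://localhost:9200 without security enabled; if security is on, add the -u user:password and --cacert (or -k) options:

curl -X GET "http://localhost:9200/"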
We want to use Elasticsearch to store a vector of values for our machine learning models. To achieve this, follow these steps:
PUT test-dvector
{
  "mappings": {
    "properties": {
      "vector": {
        "type": "dense_vector",
        "dims": 4
      },
      "model": {
        "type": "keyword"
      }
    }
  }
}
POST test-dvector/_doc/1
{
  "model": "pipe_flood",
  "vector": [8.1, 8.3, 12.1, 7.32]
}
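If you want to check that the vector has been stored as submitted, you can fetch the document back with a standard document GET (nothing here is specific to the Dense Vector field); the vector values are returned unchanged in the _source field:

GET test-dvector/_doc/1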
The Dense Vector field is a helper field for storing vectors in Elasticsearch.
The ingested data for the field must be a list of floating-point values whose length exactly matches the dimension declared by the dims property of the mapping (4, in our example).
If the dimension of the vector field is incorrect, an exception is raised, and the document is not indexed.
For example, let's see what happens when we try to index a similar document with the wrong feature dimension:
POST test-dvector/_doc/1
{
  "model": "pipe_flood",
  "vector": [8.1, 8.3, 12.1]
}
We will get an exception that enforces the correct dimension size, and the document will not be stored:
{ "error" : { "root_cause" : [ { "type" : "mapper_parsing_exception", "reason" : "failed to parse" } ], "type" : "mapper_parsing_exception", "reason" : "failed to parse", "caused_by" : { "type" : "illegal_argument_exception", "reason" : "Field [vector] of type [dense_vector] of doc [1] has number of dimensions [3] less than defined in the mapping [4]" } }, "status" : 400 }