
ElasticSearch Cookbook

Related to shard management are the key concepts of replication and cluster status.
You need one or more running nodes to have a cluster. To test a cluster effectively, you need at least two nodes (they can run on the same machine).
An index can have one or more replicas. Its shards are called primary if they are part of the master index and secondary if they are part of a replica.
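As a rough illustration of how primary and secondary shards add up, the following sketch counts the shard copies that one index produces. The function name and its parameters are hypothetical; the only assumption is that each replica is a full copy of every primary shard.

```python
def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Return the number of shard copies (primary plus secondary) for one index."""
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need at least one primary shard and a non-negative replica count")
    # Every replica duplicates each primary shard exactly once.
    return primary_shards * (1 + replicas)
```

For example, under this model an index with 5 primary shards and 1 replica holds 10 shards in total: 5 primary and 5 secondary.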
To maintain consistency in write operations the following workflow is executed:
During search operations, a valid set of shards is chosen randomly among the primary and secondary shards to improve performance.
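The random choice of shard copies can be pictured with a toy model. This is only a sketch of the idea, not Elasticsearch's actual routing code: the function name, the dictionary layout, and the node names are all made up for illustration.

```python
import random


def pick_search_copies(copies_by_shard, rng=random):
    """Pick one available copy (primary or secondary) for each shard.

    copies_by_shard maps a shard number to the list of nodes that hold
    a copy of that shard (hypothetical layout, for illustration only).
    """
    chosen = {}
    for shard_id, nodes in copies_by_shard.items():
        if not nodes:
            # No copy of this shard is available, so no valid set exists.
            raise ValueError(f"shard {shard_id} has no available copy")
        # Any copy can answer the query, so choose one at random.
        chosen[shard_id] = rng.choice(nodes)
    return chosen
```

Calling `pick_search_copies({0: ["node1", "node2"], 1: ["node2", "node1"]})` returns one node per shard, and repeated calls spread the load over the nodes that hold copies.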
The following figure shows an example of a possible shard configuration:
In order to prevent data loss and to have High Availability, it's good to have at least one replica so that your system can survive a node failure without downtime and without loss of data.
Related to the concept of replication is the indicator of the health of your cluster.
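It can be in one of three states: green (all primary and replica shards are allocated), yellow (all primary shards are allocated, but one or more replicas are not), and red (one or more primary shards are not allocated).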
A yellow status is mainly due to some shards not being allocated. If your cluster is recovering, just wait, provided there is enough space on the nodes for your shards.
If your cluster is still in the yellow state even after recovery, you don't have enough nodes to contain your replicas, so you can either reduce the number of replicas or add the required number of nodes.
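The checks described above can be sketched against the cluster health REST endpoint (`GET /_cluster/health`), whose response includes the `status`, `initializing_shards`, and `unassigned_shards` fields. The helper names, the default URL (a node assumed to listen on localhost:9200), and the advice strings are assumptions made for this example.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url="http://localhost:9200"):
    """Fetch the cluster health document (base_url is an assumed local node)."""
    with urlopen(f"{base_url}/_cluster/health") as response:
        return json.load(response)


def describe_health(health):
    """Turn a cluster health document into a short piece of advice."""
    status = health["status"]
    if status == "green":
        return "all shards are allocated"
    if status == "yellow":
        if health.get("initializing_shards", 0) > 0:
            # Shards are still being recovered: waiting may be enough.
            return "recovery in progress: wait"
        unassigned = health.get("unassigned_shards", 0)
        return f"{unassigned} shard(s) unassigned: add nodes or reduce replicas"
    # Anything else is treated as red: at least one primary shard is missing.
    return "primary shard(s) missing: restore the missing nodes or a backup"
```

With a node running, `describe_health(fetch_cluster_health())` gives a one-line summary; `describe_health` can also be tried on a hand-written dictionary such as `{"status": "yellow", "unassigned_shards": 5}`.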
The total number of nodes must be greater than the maximum number of replicas, because a replica is never allocated on the same node as its primary shard.
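A small sketch of this node-count check follows. It assumes that Elasticsearch never places two copies of the same shard on one node, so every copy (the primary plus each replica) needs a node of its own; it also assumes that all nodes are up and have enough disk space. The function names are hypothetical.

```python
def expected_status(nodes: int, replicas: int) -> str:
    """Predict green or yellow from the node count and the replica count."""
    if nodes < 1 or replicas < 0:
        raise ValueError("need at least one node and a non-negative replica count")
    # One node for the primary copy plus one node per replica.
    copies_per_shard = 1 + replicas
    return "green" if nodes >= copies_per_shard else "yellow"


def replicas_that_fit(nodes: int) -> int:
    """Return the largest replica count that can be fully allocated."""
    return max(nodes - 1, 0)
```

Under these assumptions, a single node with one replica configured stays yellow, while two nodes with one replica can reach green.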
When you have lost data (that is, one or more shards are missing), you need to try restoring the missing node(s). If your nodes restart and the system goes back to yellow or green status, you are safe. Otherwise, you have lost data and your cluster is not usable. In this case, delete the affected index/indices and restore them from a backup (if you have one) or from other sources.
To prevent data loss, I suggest always having at least two nodes and the number of replicas set to 1.
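One way to apply this suggestion is through the index settings REST endpoint, which accepts a `number_of_replicas` value. The sketch below only assumes a node reachable at localhost:9200 and a hypothetical index called `myindex`; the helper names are made up for the example.

```python
import json
from urllib.request import Request, urlopen


def build_replica_request(index, replicas, base_url="http://localhost:9200"):
    """Build (but do not send) a PUT request that changes the replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return Request(
        f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def set_replicas(index, replicas, base_url="http://localhost:9200"):
    """Send the request and return the decoded JSON answer."""
    with urlopen(build_replica_request(index, replicas, base_url)) as response:
        return json.load(response)
```

With a cluster running, `set_replicas("myindex", 1)` asks Elasticsearch to keep one replica of every shard of that index. Building the request separately from sending it makes the payload easy to inspect before anything is changed.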
Having one or more replicas on different nodes on different machines allows you to have a live backup of your data, always updated.