In this tutorial, we cover a few common issues related to shard management in Elasticsearch, their solutions, and several best practices. In some use cases, we incorporate special tricks to get things done. 

Moving a Shard from One Node to Another

This is one of the most common use cases when dealing with clusters of any size. A typical scenario is a single node that holds too many shards, all of which are heavily used for querying or indexing.

This situation presents a potential risk to node and cluster health. Therefore, it is good practice to move shards from one node to another. Elasticsearch might not deal with this situation automatically, which means we need to intervene manually. How do we make this happen?

Elasticsearch provides a cluster-level API, the reroute API, which allows moving shards from one node to another. Let's see an example of using this API below:

curl -XPOST 'http://localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{
  "commands" : [
    {
      "move" : {
        "index" : "testindex", "shard" : 0,
        "from_node" : "source_node_name", "to_node" : "destination_node_name"
      }
    }
  ]
}'

In Elasticsearch versions before 7.0, an index created with default settings gets 5 primary shards, numbered from 0 to 4 (from 7.0 onward, the default is a single primary shard, numbered 0). In the above request, we have provided 0 as the value of the "shard" parameter. This is the number of the shard to move within the index named "testindex".

Each node in a cluster has a unique name. Both the name of the node that currently holds the shard ("from_node") and the name of the destination node ("to_node") must be provided for the above API to work properly.
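
As a minimal sketch, the same move request can be assembled in Python with only the standard library. The helper name build_move_request, the default base URL, and the node names "node-a" and "node-b" are assumptions made for illustration. The request is only built here; sending it (for example with urllib.request.urlopen) is left to the caller.

```python
import json
import urllib.request


def build_move_request(index, shard, from_node, to_node,
                       base_url="http://localhost:9200"):
    """Build (but do not send) a _cluster/reroute request that moves one shard."""
    body = {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }
    return urllib.request.Request(
        url=f"{base_url}/_cluster/reroute",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical node names; replace them with names from your own cluster.
request = build_move_request("testindex", 0, "node-a", "node-b")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Keeping the body construction in one function makes it easy to inspect the payload before anything is sent to a live cluster.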

Decommissioning a Node

Another use case is decommissioning a node from an active cluster. One of the main challenges in this scenario is decommissioning the node without causing downtime or a restart of the cluster. Fortunately, Elasticsearch provides an option to decommission a node gracefully without losing data or causing downtime. Let's see how it can be achieved:

curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
  "transient" : {
    "cluster.routing.allocation.exclude._ip" : "IP of the node"
  }
}'

The above API call makes the cluster stop allocating shards to the specified node. Meanwhile, the shards already on this node are moved to non-excluded nodes. This data transfer occurs in the background. When it completes, the node no longer holds any shards and can be shut down and removed from the cluster safely.

When decommissioning a node, the free disk space on the other nodes must exceed the size of the data to be transferred. Otherwise, the cluster state may become red or yellow, which could cause downtime.

It is often helpful to have other ways to identify the node to be decommissioned. In the above example, the node is identified by its IP address. We can do the same using the node ID or the node name.

Exclude by Node ID:

curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
  "transient" : {
    "cluster.routing.allocation.exclude._id" : "unique id of the node"
  }
}'

Exclude by Node Name:

curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
  "transient" : {
    "cluster.routing.allocation.exclude._name" : "name of the node"
  }
}'
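
The three exclusion requests differ only in the attribute suffix, so a small helper can generate the settings body for any of them. This is a sketch under stated assumptions: the function name build_exclude_settings and the address 10.0.0.5 are placeholders invented for this example, and the helper only produces the JSON body; it does not contact a cluster.

```python
import json

# The three node attributes used in this tutorial.
ALLOWED_ATTRIBUTES = ("_ip", "_id", "_name")


def build_exclude_settings(attribute, value):
    """Return the transient _cluster/settings body that excludes a node."""
    if attribute not in ALLOWED_ATTRIBUTES:
        raise ValueError(f"unsupported attribute: {attribute!r}")
    setting = f"cluster.routing.allocation.exclude.{attribute}"
    return {"transient": {setting: value}}


# 10.0.0.5 is a placeholder address, not a value taken from the tutorial.
print(json.dumps(build_exclude_settings("_ip", "10.0.0.5"), indent=2))
```

Rejecting unknown attributes up front guards against a typo such as "ip" instead of "_ip" reaching the cluster.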

How do we know when the decommissioning of the node is complete? There are two ways to check:

1. Check the cluster health to see if there is any reallocation happening.

curl -XGET 'http://localhost:9200/_cluster/health?pretty'

A sample response is shown below:

{
"cluster_name" : "elasticsearch",
"status" : "yellow",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 15,
"active_shards" : 15,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 15,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 50.0
}

In the above response, the "relocating_shards" value is 0, which indicates that no shards are currently being transferred.

2. Check the excluded node's status by using the API below.

curl -XGET 'http://localhost:9200/_nodes/<NAME_OF_THE_NODE>/stats/indices?pretty'

In the response, check the field "indices.docs.count". If it is zero, the data transfer is complete.
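
The two checks can be combined into one small test over the parsed JSON responses. The sketch below assumes the node-stats response nests each node under a top-level "nodes" object keyed by node ID. The function names, the node ID "abc123", and the node name "node-a" are hypothetical, and the sample data stands in for real responses.

```python
def no_shards_relocating(health):
    """Return True when a _cluster/health response reports no relocating shards."""
    return health["relocating_shards"] == 0


def node_is_empty(node_stats):
    """Return True when every node in a node-stats response holds zero documents."""
    nodes = node_stats["nodes"]
    return bool(nodes) and all(
        node["indices"]["docs"]["count"] == 0 for node in nodes.values()
    )


# Trimmed copy of the health response shown earlier.
sample_health = {"status": "yellow", "relocating_shards": 0, "unassigned_shards": 15}

# Hypothetical node-stats response; the node ID and name are made up.
sample_stats = {
    "nodes": {
        "abc123": {"name": "node-a", "indices": {"docs": {"count": 0, "deleted": 0}}}
    }
}

decommission_done = no_shards_relocating(sample_health) and node_is_empty(sample_stats)
print("decommission finished:", decommission_done)
```

In practice these functions would be called in a polling loop on freshly fetched responses until both return True.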

Renaming Indices

Another use case is renaming indices. It can be done in a couple of ways depending on the use case.

Aliasing

If we want an index to be renamed without losing any data, the most commonly used method is aliasing.

For example, suppose we want to rename the index "testindex" to "testindex-1". We can give the index "testindex" the alias "testindex-1", so that all requests referring to "testindex-1" are routed to "testindex". This can be done as below:

curl -XPOST 'localhost:9200/_aliases?pretty' -H 'Content-Type: application/json' -d'
{
"actions" : [
{ "add" : { "index" : "testindex", "alias" : "testindex-1" } }
  ]
}'

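This method makes the index available under the new name with zero downtime. Note that the underlying index keeps its original name; the alias is an additional name that points to it.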
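
When several indices need new names at once, the "actions" list can carry one "add" entry per index. The following sketch builds such a body from a mapping of index names to alias names. The helper name build_add_alias_body and the second pair ("logs", "logs-1") are assumptions made for illustration.

```python
import json


def build_add_alias_body(aliases_by_index):
    """Return an _aliases request body with one "add" action per index."""
    return {
        "actions": [
            {"add": {"index": index, "alias": alias}}
            for index, alias in aliases_by_index.items()
        ]
    }


# "testindex" comes from the tutorial; the "logs" pair is a made-up example.
body = build_add_alias_body({"testindex": "testindex-1", "logs": "logs-1"})
print(json.dumps(body, indent=2))
```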

Reindex API

Sometimes, aliasing is not the best choice for renaming. In such cases, we can use reindexing, which copies all the documents from a source index to a destination index. For this to work well, there are two things to check:

  1. Whether there is enough space left on the machine.

  2. Whether the destination index exists with the right mapping.

If the above two conditions are met, we can use the reindex API as below:

curl -XPOST 'localhost:9200/_reindex?pretty' -H 'Content-Type: application/json' -d'
{
"source": {
"index": "testindex"
 },
"dest": {
"index": "testindex-1"
 }
}'
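
A minimal sketch of building the same reindex body in Python follows. The helper name build_reindex_body is an assumption, and the guard against identical source and destination names is a local sanity check added for this example, since reindexing an index into itself is not a meaningful rename.

```python
import json


def build_reindex_body(source_index, dest_index):
    """Return a _reindex request body copying source_index into dest_index."""
    if source_index == dest_index:
        raise ValueError("source and destination index must differ")
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


print(json.dumps(build_reindex_body("testindex", "testindex-1"), indent=2))
```

The body only describes the copy; the two preconditions listed above (free disk space and a correctly mapped destination index) still have to be verified separately.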

Conclusion

In this tutorial, we discussed how to move shards between nodes, decommission a node, and rename indices via aliases or reindexing. These basic operations are indispensable for effective Elasticsearch cluster administration.
