Editor's Note: This post is part 3 of a 3-part series on tuning Elasticsearch performance. Part 1 can be found here and Part 2 can be found here.

Shard allocation, rebalancing, and awareness are crucial for preventing data loss and for avoiding the painful Cluster Status: RED (a sign that the cluster is missing one or more primary shards). Apart from shard allocation, everyone loves to tweak threadpools. For whatever reason, it seems people cannot resist increasing thread counts.

The default threadpool settings in Elasticsearch are very sensible. For all threadpools (except search) the thread count is set to the number of CPU cores. If we have eight cores, we can run only eight threads simultaneously, so it makes sense to assign only eight threads to any particular threadpool.

In this tutorial, we’ll be focusing on Shard Allocation and Threadpool Configuration settings to keep our cluster’s health green and improve overall performance.

For this post, we will be using hosted Elasticsearch on Qbox.io. You can sign up or launch your cluster here, or click “Get Started” in the header navigation. If you need help setting up, refer to “Provisioning a Qbox Elasticsearch Cluster.”

Diving deep into Shard Allocation and Rebalancing

We can set up Elasticsearch to have multiple shards and replicas of each index it serves. This can be very handy in many situations. With the ability to have multiple shards of a single index, we can deal with indices that are too large to be served efficiently by a single machine. With the ability to have multiple replicas of each shard, we can handle higher query load by spreading replicas over multiple servers. In order to shard and replicate, Elasticsearch has to figure out where in the cluster it should place shards and replicas.
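
For reference, the shard and replica counts are chosen per index at creation time. A minimal sketch (the index name my_index and the counts are placeholders for illustration):

curl -XPUT 'localhost:9200/my_index?pretty' -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}'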

Cluster Based Shard Allocation

Shard allocation is the process of allocating shards to nodes. This can happen during initial recovery, replica allocation, rebalancing, or when nodes are added or removed. Elasticsearch includes several recovery properties that improve both cluster recovery and restart times. The values that will work best for us depend on the hardware in use (disk and network being the usual bottlenecks), and the best advice we can give is to test, test, and test again.

The following dynamic settings may be used to control shard allocation and recovery and can be configured in the elasticsearch.yml config file or updated dynamically with the Cluster Update Settings API:

cluster.routing.allocation.enable – Enables or disables allocation for specific kinds of shards:

  • all – (default) Allows shard allocation for all kinds of shards.
  • primaries – Allows shard allocation only for primary shards.
  • new_primaries – Allows shard allocation only for primary shards for new indices.
  • none – No shard allocations of any kind are allowed for any indices.

This setting does not affect the recovery of local primary shards when restarting a node.
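
A common use of this setting is a rolling restart: temporarily disable shard allocation before taking a node down, then re-enable it once the node has rejoined. A minimal sketch using the Cluster Update Settings API (the host localhost:9200 is assumed, as in the watermark example further below):

curl -XPUT 'localhost:9200/_cluster/settings?pretty' -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}'

# ... restart the node and wait for it to rejoin the cluster ...

curl -XPUT 'localhost:9200/_cluster/settings?pretty' -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}'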

cluster.routing.allocation.node_concurrent_outgoing_recoveries – How many concurrent outgoing shard recoveries are allowed on a node. Outgoing recoveries are those where the source shard (most likely the primary, unless a shard is relocating) is allocated on the node. Defaults to 2.

cluster.routing.allocation.node_concurrent_incoming_recoveries – How many concurrent incoming shard recoveries are allowed on a node. Incoming recoveries are those where the target shard (most likely the replica, unless a shard is relocating) is allocated on the node. Defaults to 2.

cluster.routing.allocation.node_concurrent_recoveries – A shortcut to set both cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries.

cluster.routing.allocation.node_initial_primaries_recoveries – While the recovery of replicas happens over the network, recovery of an unassigned primary after a node restart uses data from the local disk. This should be fast, so more initial primary recoveries can happen in parallel on the same node. Defaults to 4.

cluster.routing.allocation.same_shard.host – Performs a check to prevent allocation of multiple instances of the same shard on a single host, based on hostname and host address. Defaults to false, and only applies if multiple nodes are started on the same machine.

indices.recovery.concurrent_streams – Controls the number of parallel streams to open to support recovery of a shard.

indices.recovery.max_bytes_per_sec – Closely tied to the number of streams, this setting caps the total network bandwidth available for recovery.
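
Settings like these can be adjusted at runtime through the Cluster Update Settings API. A hedged sketch that raises recovery concurrency and the recovery bandwidth cap (the values are illustrative starting points, not recommendations):

curl -XPUT 'localhost:9200/_cluster/settings?pretty' -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": 4,
    "indices.recovery.max_bytes_per_sec": "80mb"
  }
}'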

Disk Based Shard Allocation

Elasticsearch takes the available disk space on a node into account before deciding whether to allocate new shards to that node or to actively relocate shards away from it.

The following dynamic settings may be used to control the disk based shard allocation across the cluster:

  • cluster.routing.allocation.disk.threshold_enabled – Defaults to true. Set to false to disable the disk allocation decider.
  • cluster.routing.allocation.disk.watermark.low – Controls the low watermark for disk usage. It defaults to 85%, meaning ES will not allocate new shards to nodes once they have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent ES from allocating shards if less than the configured amount of space is available.
  • cluster.routing.allocation.disk.watermark.high – Controls the high watermark. It defaults to 90%, meaning ES will attempt to relocate shards to another node if the node disk usage rises above 90%. It can also be set to an absolute byte value (similar to the low watermark) to relocate shards once less than the configured amount of space is available on the node.

Note: Percentage values refer to used disk space, while byte values refer to free disk space. This can be confusing, since it flips the meaning of high and low. For example, it makes sense to set the low watermark to 10gb and the high watermark to 5gb, but not the other way around.

Here is an example that updates the low watermark so no more than 85% of the disk is used, the high watermark so at least 60 gigabytes remain free, and the interval at which the cluster refreshes its disk usage information to two minutes:

curl -XPUT 'localhost:9200/_cluster/settings?pretty' -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "60gb",
    "cluster.info.update.interval": "2m"
  }
}'

Shard Rebalancing

The following dynamic settings may be used to control the rebalancing of shards across the cluster:

cluster.routing.rebalance.enable – Enables or disables rebalancing for specific kinds of shards:

  • all – (default) Allows shard balancing for all kinds of shards.
  • primaries – Allows shard balancing only for primary shards.
  • replicas – Allows shard balancing only for replica shards.
  • none – No shard balancing of any kind is allowed for any indices.

cluster.routing.allocation.allow_rebalance – Specify when shard rebalancing is allowed:

  • always – Always allow rebalancing.
  • indices_primaries_active – Only when all primaries in the cluster are allocated.
  • indices_all_active – (default) Only when all shards (primaries and replicas) in the cluster are allocated.

cluster.routing.allocation.cluster_concurrent_rebalance – How many concurrent shard rebalances are allowed cluster-wide. Defaults to 2. This setting only controls the number of concurrent shard relocations due to imbalances in the cluster; it does not limit shard relocations due to allocation filtering or forced awareness.
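
These are dynamic settings as well. For example, rebalancing can be switched off temporarily during maintenance and switched back on afterwards (a minimal sketch against localhost:9200):

curl -XPUT 'localhost:9200/_cluster/settings?pretty' -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.rebalance.enable": "none"
  }
}'

Setting the value back to all (or clearing the transient setting) restores the default behavior.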

Shard Balancing Discovery

The following settings are used together to determine where to place each shard. The cluster is balanced when no allowed action can bring the weights of each node closer together by more than the balance.threshold.

  • cluster.routing.allocation.balance.shard – Defines the weight factor for shards allocated on a node (float). Defaults to 0.45f. Raising this raises the tendency to equalize the number of shards across all nodes in the cluster.
  • cluster.routing.allocation.balance.index – Defines a factor to the number of shards per index allocated on a specific node (float). Defaults to 0.55f. Raising this raises the tendency to equalize the number of shards per index across all nodes in the cluster.
  • cluster.routing.allocation.balance.threshold – Minimal optimization value of operations that should be performed (non-negative float). Defaults to 1.0f. Raising this will cause the cluster to be less aggressive about optimizing the shard balance.
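
These balance factors are dynamic cluster settings too. A minimal sketch that slightly raises the per-index factor so shards of the same index spread out more evenly across nodes (the values are assumptions for illustration, not tuned recommendations):

curl -XPUT 'localhost:9200/_cluster/settings?pretty' -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.balance.index": "0.65",
    "cluster.routing.allocation.balance.threshold": "1.0"
  }
}'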

Configuring the Threadpool Properties

A node holds several thread pools in order to improve how thread memory consumption is managed within a node. Many of these pools also have queues associated with them, which allow pending requests to be held instead of discarded.

There are several thread pools, but the important ones include:

  • index – For index/delete operations. Thread pool type is fixed, with a size equal to the number of available processors and a queue_size of 200. The maximum size for this pool is (1 + No. of available processors).
  • search – For count/search/suggest operations. Thread pool type is fixed, with a size of int((No. of available processors * 3) / 2) + 1 and a queue_size of 1000.
  • bulk – For bulk operations. Thread pool type is fixed, with a size equal to the number of available processors and a queue_size of 50. The maximum size for this pool is (1 + No. of available processors).
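
Before tuning any of these pools, it helps to check how busy they actually are. The _cat thread pool API reports active, queued, and rejected counts per pool on each node (a read-only example against localhost:9200):

curl -XGET 'localhost:9200/_cat/thread_pool/index,search,bulk?v&h=node_name,name,active,queue,rejected'

A steadily climbing rejected count is usually a better sign that a pool is undersized than CPU usage alone.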

Thread Pool Types:

The following are the types of thread pools and their respective parameters:

Fixed: The fixed thread pool holds a fixed size of threads to handle the requests with a queue (optionally bounded) for pending requests that have no threads to service them.

thread_pool:
    index:
        size: 30 # number of threads; for the index pool this defaults to the number of available processors
        queue_size: 1000 # size of the queue of pending requests that have no threads to execute them; for the index pool this defaults to 200

Scaling: The scaling thread pool holds a dynamic number of threads. This number is proportional to the workload and varies between the value of the core and max parameters.

The keep_alive parameter determines how long a thread should be kept around in the thread pool without it doing any work.

thread_pool:
    warmer:
        core: 1
        max: 8
        keep_alive: 2m # how long a thread should be kept around in the thread pool without it doing any work

Note: The number of processors is automatically detected, and the thread pool settings are automatically sized based on it. In some rare cases where detection is faulty, it can be useful to override the detected number explicitly by setting:

processors: 2

Optimal Configurations:

The Elasticsearch config (elasticsearch.yml) can be tweaked to overcome default limitations and to increase concurrency. If searching is the primary use case, i.e. more search operations and fewer indexing operations, the threadpool for search can be increased and the threadpool for indexing can be kept much lower. Since these pools are fixed by default, only their size and queue_size need to be adjusted:

#for search operations (the search pool is fixed by default)
thread_pool.search.size: 50
thread_pool.search.queue_size: 200
 
#for bulk operations (the bulk pool is fixed by default)
thread_pool.bulk.size: 10
thread_pool.bulk.queue_size: 100
 
#for indexing operations (the index pool is fixed by default)
thread_pool.index.size: 60
thread_pool.index.queue_size: 1000

ES by default assumes the primary use case is searching and querying, so it reserves only a small slice of the heap (10% by default) as the indexing buffer and leaves the rest for search and other operations. This can be changed with the indices.memory.index_buffer_size setting, but the implication can be significant, since memory granted to indexing is no longer available for search. The following setting grants ES 30% of its heap memory for indexing purposes.

indices.memory.index_buffer_size: 30%

Give It a Whirl!

It’s easy to spin up a standard hosted Elasticsearch cluster on any of our 47 Rackspace, Softlayer, Amazon or Microsoft Azure data centers. And you can now provision your own AWS Credits on Qbox Private Hosted Elasticsearch.

Questions? Drop us a note, and we’ll get you a prompt response.

Not yet enjoying the benefits of a hosted ELK-stack enterprise search on Qbox? We invite you to create an account today and discover how easy it is to manage and scale your Elasticsearch environment in our cloud hosting service.