Tutorial Series: The Authoritative Guide to Elasticsearch Performance Tuning
Search and analytics are key features of modern software applications. Many applications, such as mobile apps, web applications and data analytics tools, demand scalability and the ability to handle large volumes of data in near real time. Autocomplete in text fields, search suggestions, location or geospatial search, and faceted navigation are now standard usability features that businesses expect.
Tuning is essential, and any system tuning must be supported by performance measurements. That is why a clear understanding of monitoring, and of what a change in each metric implies, is essential for anyone using Elasticsearch.
This three-part tutorial series introduces tips and methods for performance tuning, explaining at each step the most relevant system configuration settings and metrics.
Editor’s Note: This post is part 2 of a 3-part series on tuning Elasticsearch performance. Part 1 can be found here.
If we are using Elasticsearch mainly for search, or if search is a customer-facing feature that is key to our organization, we should monitor query latency and take action if it surpasses a threshold.
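As a rough illustration of the latency check described above, the sketch below derives an average query latency from two snapshots of the cumulative `query_total` and `query_time_in_millis` counters that Elasticsearch reports under `search` in its stats APIs. The function names, the sample numbers and the 40 ms threshold are assumptions made up for this example, and fetching the snapshots from the cluster is left out.

```python
def lifetime_latency_ms(search_stats):
    """Mean query latency in ms over the whole lifetime of the counters."""
    total = search_stats.get("query_total", 0)
    if total == 0:
        return 0.0
    return search_stats["query_time_in_millis"] / total


def interval_latency_ms(previous, current):
    """Mean query latency in ms for queries run between two snapshots."""
    queries = current["query_total"] - previous["query_total"]
    millis = current["query_time_in_millis"] - previous["query_time_in_millis"]
    if queries <= 0:
        # No new queries (or counters were reset): nothing to measure.
        return 0.0
    return millis / queries


def exceeds_threshold(previous, current, threshold_ms):
    """True when the interval latency is above the alerting threshold."""
    return interval_latency_ms(previous, current) > threshold_ms


# Hypothetical snapshots taken one polling interval apart.
previous = {"query_total": 1000, "query_time_in_millis": 20000}
current = {"query_total": 1500, "query_time_in_millis": 45000}

THRESHOLD_MS = 40  # assumed alerting threshold, pick one that fits your SLA
alert = exceeds_threshold(previous, current, THRESHOLD_MS)
```

Working from the difference between snapshots matters because the counters are cumulative: in this sample the lifetime average still looks healthy, while the most recent interval is already above the threshold.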
It’s important to monitor relevant metrics about queries and fetches that can help us determine how our searches perform over time. For example, we may want to track the cluster’s health to maintain high availability, or track spikes and long-term increases in query requests, so that we are prepared to tweak our configuration for better performance and reliability.
In this tutorial, we continue focusing on performance tuning strategies to keep our cluster’s health green and improve overall performance.
Shard allocation, rebalancing and awareness are crucial for preventing data loss and for avoiding the painful Cluster Status: RED (a sign that the cluster is missing some primary shards). Apart from shard allocation, everyone loves to tweak threadpools. For whatever reason, it seems people cannot resist increasing thread counts.
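To make the meaning of the status colors concrete, here is a simplified sketch of how a health color follows from shard states. It assumes shard records shaped like the `prirep` and `state` columns of the `_cat/shards` output, and it treats a shard as active when it is `STARTED` or `RELOCATING`. The function name and the sample data are hypothetical, and this is an illustration of the rule rather than Elasticsearch's own implementation.

```python
# Shard states in which the shard can serve requests.
ACTIVE_STATES = {"STARTED", "RELOCATING"}


def derive_status(shards):
    """Return 'green', 'yellow' or 'red' for a list of shard records.

    red    -> at least one primary shard is not active
    yellow -> all primaries are active, at least one replica is not
    green  -> every primary and replica is active
    """
    status = "green"
    for shard in shards:
        if shard["state"] in ACTIVE_STATES:
            continue
        if shard["prirep"] == "p":
            # A missing primary means part of the data is unavailable.
            return "red"
        status = "yellow"
    return status


# Hypothetical shard listing for a two-shard index with one replica each.
shards = [
    {"index": "logs", "shard": 0, "prirep": "p", "state": "STARTED"},
    {"index": "logs", "shard": 0, "prirep": "r", "state": "UNASSIGNED"},
    {"index": "logs", "shard": 1, "prirep": "p", "state": "STARTED"},
    {"index": "logs", "shard": 1, "prirep": "r", "state": "STARTED"},
]
status = derive_status(shards)
```

In practice the cluster reports this color directly in the `status` field of the `_cluster/health` response, so a monitor would normally read it from there instead of recomputing it.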
The default threadpool settings in Elasticsearch are very sensible. For all threadpools (except search) the thread count is set to the number of CPU cores. If we have eight cores, only eight threads can run simultaneously, so it makes sense to assign only eight threads to any particular threadpool.
In this tutorial, we’ll focus on shard allocation and threadpool configuration settings to keep our cluster’s health green and improve overall performance.
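The arithmetic behind this argument can be sketched in a few lines. The helper names below are made up for illustration, and the model is deliberately simple: it assumes CPU-bound threads and ignores time spent waiting on I/O.

```python
import os


def threads_running_at_once(thread_count, cores):
    """Threads that can execute simultaneously: never more than the core count."""
    return min(thread_count, cores)


def threads_left_waiting(thread_count, cores):
    """Threads that must wait for a core while the others run."""
    return max(0, thread_count - cores)


# Core count of the local machine; os.cpu_count() may return None.
local_cores = os.cpu_count() or 1

# The eight-core example from the text, with the pool sized to the core count.
sized_to_cores = threads_running_at_once(8, 8)
# The same machine with the pool inflated to 32 threads.
inflated_running = threads_running_at_once(32, 8)
inflated_waiting = threads_left_waiting(32, 8)
```

Under these assumptions, raising the pool from 8 to 32 threads does not let more work run at once on an eight-core machine; it only adds 24 threads that compete for the same cores.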