Editor's Note: This post is part 3 of a 3-part series on tuning Elasticsearch performance. Part 1 can be found here and Part 2 can be found here.

Shard allocation, rebalancing, and awareness are crucial for preventing data loss and for avoiding the painful Cluster Status: RED (a sign that the cluster is missing some primary shards). Apart from shard allocation, everyone loves to tweak threadpools. For whatever reason, it seems people cannot resist increasing thread counts.

The default threadpool settings in Elasticsearch are very sensible. For all threadpools (except search), the thread count is set to the number of CPU cores. If we have eight cores, only eight threads can run simultaneously, so it makes sense to assign only eight threads to any particular threadpool.

In this tutorial, we’ll focus on shard allocation and threadpool configuration settings to keep our cluster's health green and improve overall performance.

Keep reading

Editor's Note: This post is part 2 of a 3-part series on tuning Elasticsearch performance. Part 1 can be found here.

If we are using Elasticsearch mainly for search, or if search is a customer-facing feature that is key to our organization, we should monitor query latency and take action if it surpasses a threshold. 

It’s important to monitor relevant metrics about queries and fetches that can help us determine how our searches perform over time. For example, we may want to track the cluster's health to provide high availability, or track spikes and long-term increases in query requests, so that we are prepared to tweak our configuration for better performance and reliability.

In this tutorial, we continue focusing on performance tuning strategies to keep our cluster's health green and improve overall performance.

Keep reading

Search and analytics are key features of modern software applications. Scalability and the capability to handle large volumes of data in near real-time are demanded by many applications, such as mobile apps, web applications, and data analytics applications. Autocomplete in text fields, search suggestions, location or geospatial search, and faceted navigation are now standard usability features needed to meet business requirements.

Tuning is essential! Any system tuning must be supported by performance measurements, which is why a clear understanding of monitoring, and of what a change in a metric implies, is essential for anyone using Elasticsearch.

This three-part tutorial series introduces some tips and methods for performance tuning, explaining at each step the most relevant system configuration settings and metrics.

Keep reading

The penetration testing world is fast moving and persistently demands new ideas, tools and methods for solving problems and breaking things. In recent years many people have gotten used to the idea of using Elasticsearch in the penetration testing workflow, most notably for hacking web applications.  

More and more companies and websites are opening bug bounty programs. If you have new tools in your arsenal that other people don’t use or understand yet, you could make a great deal more money from bug bounty hunting. This tutorial teaches you how to use new tools with Elasticsearch to give you that competitive edge.

Keep reading

A common question that comes up when we use any product is: how can we get metrics from it, and how can we monitor it? Elasticsearch has provided a way to do this since its early releases, through its _cat and stats APIs. For Logstash, however, there wasn’t a way to gather metrics and monitor it until recently. Starting with the 5.0 release, Logstash provides a set of monitoring APIs. In this article we explore the monitoring APIs exposed by Logstash, which include the Node Info API, the Plugins API, the Node Stats API, and the Hot Threads API.
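
As a rough sketch of what calling these four APIs can look like, the Python snippet below maps each one to its HTTP path and builds the request URL. It assumes a local Logstash node listening on the default API port 9600. The names LOGSTASH_API, ENDPOINTS, monitoring_url, and fetch are hypothetical helpers made up for this illustration, not part of Logstash.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: a local Logstash node with the monitoring API on its default port.
LOGSTASH_API = "http://localhost:9600"

# HTTP paths of the four monitoring APIs.
ENDPOINTS = {
    "node_info": "/_node",
    "plugins": "/_node/plugins",
    "node_stats": "/_node/stats",
    "hot_threads": "/_node/hot_threads",
}


def monitoring_url(api, base=LOGSTASH_API):
    """Return the full URL for one of the monitoring APIs."""
    return urljoin(base, ENDPOINTS[api])


def fetch(api, base=LOGSTASH_API, timeout=5):
    """GET one monitoring API and decode the JSON response (needs a running node)."""
    with urlopen(monitoring_url(api, base), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the URLs only, so the sketch runs without a Logstash node.
    for name in ENDPOINTS:
        print(name, "->", monitoring_url(name))
```

With a node running, fetch("node_stats") would return the decoded statistics as a Python dictionary.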

Keep reading

When working with thousands of documents, a question that emerges is how to find documents that are similar to a given document or set of documents. There are often use cases where one would like to show documents that are similar to the one the user is viewing or is interested in. Elasticsearch has a query feature called “More Like This Query”, also known as the MLT Query, that tackles these cases.
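
To give a feel for the shape of such a request, here is a minimal Python sketch that builds a more_like_this search body for "documents similar to the one being viewed". The index name, document id, and field names are made-up placeholders, and the helper more_like_this_body is hypothetical. The parameter values are illustrative rather than recommended, and older Elasticsearch versions may also expect a _type entry in the like clause.

```python
import json


def more_like_this_body(index, doc_id, fields, min_term_freq=1, max_query_terms=12):
    """Build a search body asking for documents similar to an indexed document."""
    return {
        "query": {
            "more_like_this": {
                # Fields whose text is compared for similarity.
                "fields": list(fields),
                # The reference document, identified by index and id.
                "like": [{"_index": index, "_id": doc_id}],
                # Ignore terms that occur fewer times than this in the reference document.
                "min_term_freq": min_term_freq,
                # Upper bound on the number of terms selected for the query.
                "max_query_terms": max_query_terms,
            }
        }
    }


if __name__ == "__main__":
    # Placeholder index, id, and fields, for illustration only.
    body = more_like_this_body("articles", "42", ["title", "body"])
    print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of a search request against the index in question.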

Keep reading