
In this article, we’ll continue our overview of Elasticsearch bucket aggregations, focusing on the significant terms and significant text aggregations. These aggregations are designed to find interesting or unusual occurrences of terms in your datasets that can reveal hidden properties of your data. This functionality is especially useful for the following use cases:

  • Identifying relevant documents for user queries containing synonyms, acronyms, etc. For example, the significant terms aggregation could suggest documents with “bird flu” when the user searches for H5N1.
  • Identifying anomalies and interesting occurrences in your data. For example, by filtering documents based on location, we could identify the crime types that are unusually frequent in particular areas.
  • Identifying the most significant properties of a group of subjects using the significant terms aggregation on integer fields like height, weight, income, etc.

It should be noted that both the significant terms and significant text aggregations perform complex statistical computations on the documents retrieved by the direct query (the foreground set) and all other documents in your index (the background set). Therefore, both aggregations are computationally intensive and should be configured properly to run quickly. However, once you master them with the help of this tutorial, you’ll have a powerful tool for building useful features in your applications and drawing insights from your datasets. Let’s get started!
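The foreground/background comparison can be illustrated with a small Python sketch. It is not the Elasticsearch implementation: the tokenized documents, the function name `significance_scores`, and the choice to treat the whole index as the background set are all assumptions made for illustration. The score multiplies the absolute and the relative change in a term's document frequency, in the spirit of the JLH heuristic.

```python
from collections import Counter


def significance_scores(foreground_docs, background_docs):
    """Score terms that are over-represented in the foreground set.

    Each document is a list of tokens. The score is
    (fg% - bg%) * (fg% / bg%), a simplified JLH-style heuristic.
    """
    # Count each term at most once per document (document frequency).
    fg_counts = Counter(t for doc in foreground_docs for t in set(doc))
    bg_counts = Counter(t for doc in background_docs for t in set(doc))

    scores = {}
    for term, fg_count in fg_counts.items():
        fg_pct = fg_count / len(foreground_docs)
        bg_pct = bg_counts[term] / len(background_docs)
        # Keep only terms that are more common in the foreground.
        if bg_pct == 0 or fg_pct <= bg_pct:
            continue
        scores[term] = (fg_pct - bg_pct) * (fg_pct / bg_pct)
    return scores


# Hypothetical, already tokenized index (assumption: background = whole index).
index = [
    ["bird", "flu", "h5n1", "outbreak"],
    ["h5n1", "virus", "outbreak"],
    ["flu", "season", "vaccine"],
    ["football", "season", "results"],
    ["election", "results"],
    ["vaccine", "trial", "results"],
]
# Foreground set: documents matching the query term "h5n1".
foreground = [doc for doc in index if "h5n1" in doc]
scores = significance_scores(foreground, index)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy index, "outbreak" ranks well above "flu": both appear in the foreground, but "flu" is also common elsewhere, so it says less about the matching documents.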


Bucket aggregations in Elasticsearch create buckets, or sets of documents, based on certain criteria. Depending on the aggregation type, you can create filtering buckets as well as buckets representing different value ranges and intervals for numeric values, dates, IP ranges, and more.

Although bucket aggregations do not calculate metrics themselves, they can hold metrics sub-aggregations that calculate metrics for each bucket they generate. This makes bucket aggregations very useful for granular representation and analysis of your Elasticsearch indices. In this article, we’ll focus on such bucket aggregations as histogram, range, filters, and terms. Let’s get started!
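As a rough illustration of a bucket aggregation holding a metrics sub-aggregation, the Python sketch below groups documents into fixed-width histogram buckets and computes an average per bucket. The field names `price` and `rating`, the sample documents, and the helper `histogram_with_avg` are hypothetical; the sketch also omits empty buckets, which a real histogram aggregation may fill in.

```python
import math
from collections import defaultdict


def histogram_with_avg(docs, bucket_field, interval, metric_field):
    """Group docs into fixed-width buckets and average a metric per bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        # Round the value down to the start of its interval to get the key.
        key = math.floor(doc[bucket_field] / interval) * interval
        buckets[key].append(doc[metric_field])
    return [
        {"key": key, "doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in sorted(buckets.items())
    ]


# Hypothetical product documents.
products = [
    {"price": 12, "rating": 4},
    {"price": 18, "rating": 5},
    {"price": 25, "rating": 3},
    {"price": 47, "rating": 2},
]
result = histogram_with_avg(products, "price", 20, "rating")
```

The bucketing step decides which documents belong together, and the metric is then computed independently inside each bucket.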


This blog post continues our overview of Elasticsearch metrics aggregations. We will focus here on such metrics aggregations as geo bounds, geo centroid, percentiles, percentile ranks, and some other single-value and multi-value aggregations. By the end of this series, you’ll have a good understanding of metrics aggregations in Elasticsearch, including some important statistical measures and how to visualize them in Kibana. Let’s get started!
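To give a feel for what geo bounds and geo centroid compute, here is a minimal Python sketch over a handful of points. The sample coordinates and function names are assumptions for illustration. The sketch ignores bounding boxes that cross the antimeridian, and it uses a plain mean of latitudes and longitudes for the centroid.

```python
def geo_bounds(points):
    """Return the bounding box enclosing all points (no antimeridian handling)."""
    lats = [p["lat"] for p in points]
    lons = [p["lon"] for p in points]
    return {
        "top_left": {"lat": max(lats), "lon": min(lons)},
        "bottom_right": {"lat": min(lats), "lon": max(lons)},
    }


def geo_centroid(points):
    """Return the plain mean of latitudes and longitudes."""
    n = len(points)
    return {
        "lat": sum(p["lat"] for p in points) / n,
        "lon": sum(p["lon"] for p in points) / n,
    }


# Hypothetical locations (roughly Paris, Berlin, and Rome).
locations = [
    {"lat": 48.86, "lon": 2.35},
    {"lat": 52.52, "lon": 13.40},
    {"lat": 41.90, "lon": 12.50},
]
bounds = geo_bounds(locations)
centroid = geo_centroid(locations)
```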


In a previous tutorial, we discussed the structure of Elasticsearch pipeline aggregations and walked you through setting up several common pipelines such as derivatives, cumulative sums, and avg bucket aggregations.

In this article, we’ll continue with the analysis of Elasticsearch pipeline aggregations, focusing on such pipelines as stats, moving averages and moving functions, percentiles, bucket sorts, and bucket scripts, among others. Some of the pipeline aggregations discussed in the article, such as moving averages, are supported in Kibana, so we’ll show you how to visualize them as well. Let’s get started!
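A moving average over a series of bucket values can be sketched in a few lines of Python. This is a simple trailing window that includes the current bucket, which is an assumption made for clarity; the window semantics of the Elasticsearch moving average and moving function pipelines may differ. The name `moving_average` and the sample values are hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; early positions use a shorter window."""
    result = []
    for i in range(len(values)):
        # Take up to `window` values ending at (and including) position i.
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


# Hypothetical per-bucket metric values, for example daily visit counts.
daily_visits = [10, 20, 30, 40]
smoothed = moving_average(daily_visits, window=2)
```

A larger window smooths the series more strongly but reacts more slowly to recent changes.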


As you might already know from the previous Elasticsearch aggregation series, both metrics and bucket aggregations work directly on the numeric fields in the document set.

In contrast, pipeline aggregations, which we discuss in this article, work on the output produced by other aggregations, transforming the values those aggregations have already computed. A pipeline aggregation therefore works on intermediate values that are not present in the original document set. This makes pipeline aggregations very useful for calculating complex statistical and mathematical measures such as cumulative sums, derivatives, and moving averages, among others.
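The idea of operating on already-computed bucket values, rather than on documents, can be sketched in Python as follows. The `monthly_sales` buckets stand in for the output of an earlier aggregation and are hypothetical, as are the helper names; this is an illustration of the concept, not the Elasticsearch implementation.

```python
from itertools import accumulate

# Hypothetical output of an earlier bucket aggregation with a sum metric.
monthly_sales = [
    {"key": "2023-01", "sales": 100},
    {"key": "2023-02", "sales": 150},
    {"key": "2023-03", "sales": 120},
]


def cumulative_sum(buckets, field):
    """Running total of a metric across ordered buckets."""
    return list(accumulate(b[field] for b in buckets))


def derivative(buckets, field):
    """Change from the previous bucket; the first bucket has no predecessor."""
    values = [b[field] for b in buckets]
    return [None] + [cur - prev for prev, cur in zip(values, values[1:])]


running_total = cumulative_sum(monthly_sales, "sales")
change = derivative(monthly_sales, "sales")
```

Neither function ever sees the original documents; both only read the per-bucket values produced upstream.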

In the first part of this series, we’ll discuss two basic types of pipeline aggregations and show examples of such common Elasticsearch pipelines as sum and cumulative sum, min and max, avg bucket, and derivative pipeline aggregations. Let’s get started!


With this blog post, we begin a comprehensive overview of Elasticsearch metrics aggregations, starting with numeric metrics aggregations, a subset of metrics aggregations that produces numeric values. There are two types of these aggregations in Elasticsearch: single-value aggregations, which output a single value, and multi-value aggregations, which generate multiple metrics.
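The difference between the two types can be sketched in Python: a single-value metric returns one number, while a multi-value metric returns several at once. The helper names and sample values are hypothetical, and the use of population variance in `extended_stats` is an assumption made for this illustration.

```python
import math


def avg(values):
    """Single-value metric: one number for the whole set."""
    return sum(values) / len(values)


def extended_stats(values):
    """Multi-value metric: several statistics computed together."""
    count = len(values)
    mean = sum(values) / count
    # Assumption: population variance (divide by count, not count - 1).
    variance = sum((v - mean) ** 2 for v in values) / count
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": mean,
        "sum": sum(values),
        "sum_of_squares": sum(v * v for v in values),
        "variance": variance,
        "std_deviation": math.sqrt(variance),
    }


# Hypothetical numeric field values taken from a set of documents.
grades = [2, 4, 4, 6]
single = avg(grades)
multi = extended_stats(grades)
```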

In the first part of our metrics aggregations series, we’ll discuss such single-value metrics aggregations as average and weighted average, min, max, and cardinality. The only multi-value aggregation type discussed in this article is the extended stats aggregation. To help you understand how these aggregations work, we’ll accompany each description with the corresponding visualization in the Kibana dashboard. Let’s get started!
