Recent Posts by Kirill Goltsman

Kirill Goltsman is a tech writer, blogger, and technology enthusiast with over five years of experience. His portfolio includes research and blog articles on topics as diverse as cloud computing, machine learning, artificial intelligence, and web programming.

The new ELK stack 6.6.0 was officially released by Elastic on January 29, 2019, and it offers a number of new features and enhancements for Elasticsearch, Kibana, Logstash, APM, and Beats.

We've already tested Elasticsearch 6.6.0 with the brand new Kibana and are excited to share our experience with such valuable features as Index Lifecycle Management and Remote Cluster management. In this article, we'll summarize these and other major new features for Elasticsearch, Kibana, and Elastic APM and will give you a glimpse of some cool stuff you can now do with your Elasticsearch indices in Kibana 6.6.0. Let's get started!

Keep reading

Done

  • Deploying Supergiant Capacity Service for more efficient management of cluster resources. Supergiant Capacity Service, which provisions Kubernetes nodes for Elasticsearch Pods, substantially improves the stability of user deployments. Since deploying it, we've experienced fewer failures when provisioning Elasticsearch clusters.

In Progress

  • Adding Elasticsearch 6.4 to the list of available versions. This release offers several new features (e.g., WeightedAvg metric aggregation), and you can learn about them here.
  • Adding the Learn-to-Rank plugin to the provisioner. The Elasticsearch Learn-to-Rank plugin leverages Machine Learning (ML) to improve search relevance ranking.
  • Security Improvements. We are currently working on requiring email verification and user verification before users can log into Qbox.

Planned

  • Add Hunspell dictionaries to the Qbox dashboard. Hunspell is a spell checker and morphological analyzer originally designed for the Hungarian language. It is a good analyzer solution for languages with rich morphology and complex word compounding and character encoding.

In this article, we'll continue our overview of Elasticsearch bucket aggregations, focusing on significant terms and significant text aggregations. These aggregations are designed to search for interesting and/or unusual occurrences of terms in your datasets that can tell much about the hidden properties of your data. This functionality is especially useful for the following use cases:

  • Identifying relevant documents for user queries containing synonyms, acronyms, etc. For example, the significant terms aggregation could suggest documents with "bird flu" when the user searches for H5N1.
  • Identifying anomalies and interesting occurrences in your data. For example, by filtering documents based on location, we could identify the crime types that are unusually frequent in particular areas.
  • Identifying the most significant properties of a group of subjects using the significant terms aggregation on integer fields like height, weight, income, etc.

Note that both significant terms and significant text aggregations perform complex statistical computations that compare the documents retrieved by the direct query (the foreground set) with a background set, which by default is the whole index. Therefore, both aggregations are computationally intensive and should be properly configured to work fast. However, once you master them with the help of this tutorial, you'll acquire a powerful tool for building useful features in your applications and getting valuable insights from your datasets. Let's get started!
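To make the foreground/background comparison concrete, here is a minimal pure-Python sketch. It does not call Elasticsearch. It scores terms with a simplified formula modeled on the JLH heuristic (absolute change multiplied by relative change), and it assumes the background set is the whole dataset. The `reports` data, the field names, and the helper functions are all hypothetical.

```python
from collections import Counter


def jlh_score(fg_count, fg_total, bg_count, bg_total):
    """Simplified JLH-style score: absolute change times relative change."""
    if fg_total == 0 or bg_total == 0 or bg_count == 0:
        return 0.0
    fg_pct = fg_count / fg_total  # share of foreground docs with the term
    bg_pct = bg_count / bg_total  # share of background docs with the term
    if fg_pct <= bg_pct:
        return 0.0  # the term is not more frequent than in the background
    return (fg_pct - bg_pct) * (fg_pct / bg_pct)


def significant_terms(foreground, background, field, size=3):
    """Rank the values of `field` by how unusual they are in the foreground."""
    fg_counts = Counter(doc[field] for doc in foreground)
    bg_counts = Counter(doc[field] for doc in background)
    scored = [
        (term, jlh_score(count, len(foreground), bg_counts[term], len(background)))
        for term, count in fg_counts.items()
    ]
    scored = [(term, score) for term, score in scored if score > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:size]


# Hypothetical crime reports; every document has an area and a crime type.
reports = [
    {"area": "downtown", "crime": "theft"},
    {"area": "downtown", "crime": "theft"},
    {"area": "downtown", "crime": "burglary"},
    {"area": "airport", "crime": "smuggling"},
    {"area": "airport", "crime": "smuggling"},
    {"area": "airport", "crime": "theft"},
    {"area": "suburb", "crime": "theft"},
    {"area": "suburb", "crime": "burglary"},
    {"area": "suburb", "crime": "theft"},
    {"area": "suburb", "crime": "theft"},
]

# Foreground set: documents matching the "query" (reports from the airport).
airport = [doc for doc in reports if doc["area"] == "airport"]
top = significant_terms(airport, reports, "crime")
print(top)
```

In this toy dataset, theft is the most common crime overall, yet only smuggling is reported as significant for the airport because its share there is far higher than in the dataset as a whole.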

Keep reading

Bucket aggregations in Elasticsearch create buckets or sets of documents based on certain criteria. Depending on the aggregation type, you can create filtering buckets, that is, buckets representing different value ranges and intervals for numeric values, dates, IP ranges, and more. 

Although bucket aggregations do not calculate metrics, they can hold metrics sub-aggregations that can calculate metrics for each bucket generated by the bucket aggregation. This makes bucket aggregations very useful for the granular representation and analysis of your Elasticsearch indices. In this article, we'll focus on such bucket aggregations as histogram, range, filters, and terms. Let's get started!
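As a rough illustration of how a bucket aggregation can hold a metrics sub-aggregation, the sketch below mimics a histogram with an average computed per bucket. It is plain Python, not the Elasticsearch API; the documents, the field names, and the `histogram_with_avg` helper are hypothetical, and the sketch omits empty buckets for brevity.

```python
import math
from collections import defaultdict


def histogram_with_avg(docs, bucket_field, interval, metric_field):
    """Group docs into fixed-width buckets, then average a metric per bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        # Round the value down to the nearest multiple of the interval.
        key = math.floor(doc[bucket_field] / interval) * interval
        buckets[key].append(doc[metric_field])
    return [
        {"key": key, "doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in sorted(buckets.items())
    ]


# Hypothetical product documents.
products = [
    {"price": 12, "rating": 4.0},
    {"price": 18, "rating": 3.0},
    {"price": 25, "rating": 5.0},
    {"price": 47, "rating": 2.0},
]

result = histogram_with_avg(products, "price", 20, "rating")
for bucket in result:
    print(bucket)
```

The bucketing step only decides which documents belong together; the average is a separate calculation that runs once per bucket, which is the division of labor described above.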

Keep reading

This blog post continues our overview of Elasticsearch metrics aggregations. We will focus here on such metrics aggregations as geo bounds, geo centroid, percentiles, percentile ranks, and some other single-value and multi-value aggregations. By the end of this series, you'll have a good understanding of metrics aggregations in Elasticsearch, including some important statistical measures and how to visualize them in Kibana. Let's get started!

Keep reading

In a previous tutorial, we discussed the structure of Elasticsearch pipeline aggregations and walked you through setting up several common pipelines such as derivatives, cumulative sums, and avg bucket aggregations. 

In this article, we'll continue with the analysis of Elasticsearch pipeline aggregations, focusing on such pipelines as stats, moving averages and moving functions, percentiles, bucket sorts, and bucket scripts, among others. Some of the pipeline aggregations discussed in the article such as moving averages are supported in Kibana, so we'll show you how to visualize them as well. Let's get started!

Keep reading

As you might already know from the previous Elasticsearch aggregation series, both metrics and bucket aggregations work directly on the fields of the documents in the document set.

In contrast, pipeline aggregations, which we discuss in this article, work on the output produced by other aggregations, transforming the values those aggregations have already computed. A pipeline aggregation, hence, works on intermediary values not present in the original document set. This makes pipeline aggregations very useful for calculating complex statistical and mathematical measures like cumulative sums, derivatives, and moving averages, among others.
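The idea can be sketched in a few lines of plain Python. We assume an earlier aggregation has already produced one value per bucket (the hypothetical `monthly_sales` list below), and the two helpers then operate only on those intermediary values, never on the original documents. This is an illustration of the concept, not the Elasticsearch API.

```python
from itertools import accumulate

# Hypothetical output of a date histogram with a sum sub-aggregation:
# one (bucket key, total sales) pair per month.
monthly_sales = [("2018-01", 100.0), ("2018-02", 150.0), ("2018-03", 120.0)]


def cumulative_sum(values):
    """Running total across consecutive buckets."""
    return list(accumulate(values))


def derivative(values):
    """Change from the previous bucket; the first bucket has no predecessor."""
    if not values:
        return []
    return [None] + [curr - prev for prev, curr in zip(values, values[1:])]


sales = [value for _, value in monthly_sales]
running_totals = cumulative_sum(sales)
changes = derivative(sales)

for (month, value), total, change in zip(monthly_sales, running_totals, changes):
    print(month, value, total, change)
```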

In the first part of this series, we'll discuss two basic types of pipeline aggregations and show examples of such common Elasticsearch pipelines as a sum and cumulative sum, min and max, avg bucket, and derivative pipeline aggregations. Let's get started!

Keep reading

The Qbox dashboard offers a variety of useful features such as cluster monitoring, backups, cloning, viewing alerts, etc. Our Kubernetes-backed AWS users can now easily access Elasticsearch logs from their dashboards.

Qbox Download Logs Feature

To get your Elasticsearch logs, select "Download Logs" under the "Manage" drop-down of your cluster. The logs will be downloaded in tar format.

With this blog post, we begin a comprehensive overview of Elasticsearch metrics aggregations, starting with numeric metrics aggregations -- a subset of metrics aggregations that produce numeric values. There are two types of these aggregations in Elasticsearch: single-value aggregations, which output a single value, and multi-value aggregations, which generate multiple metrics.
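The difference between the two types is easy to see in a small pure-Python sketch. The `avg` helper returns one number, while the `extended_stats` helper returns several metrics at once. Both helpers and the sample values are hypothetical; the sketch covers only a subset of the statistics and assumes population variance.

```python
import math


def avg(values):
    """Single-value metric: one number for the whole document set."""
    return sum(values) / len(values)


def extended_stats(values):
    """Multi-value metric: several statistics computed in one pass over the set."""
    count = len(values)
    total = sum(values)
    mean = total / count
    # Population variance (divide by count, not count - 1); an assumption here.
    variance = sum((v - mean) ** 2 for v in values) / count
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": mean,
        "sum": total,
        "sum_of_squares": sum(v * v for v in values),
        "variance": variance,
        "std_deviation": math.sqrt(variance),
    }


# Hypothetical values of a numeric field across eight documents.
grades = [2, 4, 4, 4, 5, 5, 7, 9]

single = avg(grades)
multi = extended_stats(grades)
print(single)
print(multi)
```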

In the first part of our metrics aggregations series, we'll discuss such single-value metrics aggregations as average and weighted average, min, max, and cardinality. The only multi-value aggregation type discussed in this article is the extended stats aggregation. To help you understand how these aggregations work, we'll accompany each description with the corresponding visualization in the Kibana dashboard. Let's get started!

Keep reading

In an earlier post, How to Build an Autocomplete Feature with Elasticsearch, we showed how to build a basic autocomplete that searches across all documents in the index. This approach works for generic autocomplete, but it is not enough if, for example, your index spans many product categories. Therefore, in this post we'll explore context-based autocompletion, which will help you implement intelligent filtering based on categories and geo points. Let's get started!
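As a rough preview of what context-based autocompletion looks like, the sketch below builds the JSON bodies for a completion field with a category context and a geo context, a sample document, and a suggest request filtered by category. The bodies are plain Python dicts that are only printed, not sent to a cluster. The field name `suggest`, the context names `category` and `location`, the suggestion name `product_suggestions`, and the helper functions are all hypothetical choices for illustration.

```python
import json


def completion_field_mapping():
    """Mapping for a completion field that carries two kinds of context."""
    return {
        "type": "completion",
        "contexts": [
            {"name": "category", "type": "category"},
            {"name": "location", "type": "geo", "precision": 4},
        ],
    }


def product_document(name, category):
    """A document whose suggestion input is tagged with a category context."""
    return {
        "suggest": {
            "input": [name],
            "contexts": {"category": [category]},
        }
    }


def suggest_query(prefix, category, size=5):
    """Suggest request that only returns completions from one category."""
    return {
        "suggest": {
            "product_suggestions": {
                "prefix": prefix,
                "completion": {
                    "field": "suggest",
                    "size": size,
                    "contexts": {"category": [category]},
                },
            }
        }
    }


mapping = completion_field_mapping()
document = product_document("iPhone", "phones")
query = suggest_query("ip", "phones")

print(json.dumps(mapping, indent=2))
print(json.dumps(document, indent=2))
print(json.dumps(query, indent=2))
```

Because the category travels with both the indexed suggestion and the request, a prefix such as "ip" would only match products tagged with the requested category rather than every document in the index.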

Keep reading