01.04.2019 -- Qbox Changelog

Posted by Kirill Goltsman January 6, 2019

Done

  • Deploying Supergiant Capacity Service for more efficient management of cluster resources. Supergiant Capacity Service, which provisions Kubernetes nodes for Elasticsearch Pods, substantially improves the stability of user deployments. Since deploying it, we've seen fewer failures to provision Elasticsearch clusters.

In Progress

  • Adding Elasticsearch 6.4 to the list of available versions. This release offers several new features (e.g., the WeightedAvg metric aggregation), and you can learn about them here.
  • Adding the Learn-to-Rank plugin to the provisioner. The Elasticsearch Learn-to-Rank plugin leverages Machine Learning (ML) to improve search relevance ranking.
  • Security improvements. We are currently working on requiring email and user verification before a user can log in to Qbox.

Planned

  • Add Hunspell dictionaries to the Qbox dashboard. Hunspell is a spell checker and morphological analyzer originally designed for the Hungarian language. It is a good analyzer solution for languages with rich morphology and complex word compounding and character encoding.

In this article, we'll continue our overview of Elasticsearch bucket aggregations, focusing on significant terms and significant text aggregations. These aggregations are designed to search for interesting and/or unusual occurrences of terms in your datasets that can tell much about the hidden properties of your data. This functionality is especially useful for the following use cases:

  • Identifying relevant documents for user queries containing synonyms, acronyms, etc. For example, the significant terms aggregation could suggest documents with "bird flu" when the user searches for H5N1.
  • Identifying anomalies and interesting occurrences in your data. For example, by filtering documents based on location, we could identify the most frequent crime types in particular areas. 
  • Identifying the most significant properties of a group of subjects using the significant terms aggregation on integer fields like height, weight, income, etc. 

Note that both the significant terms and significant text aggregations perform complex statistical computations that compare the documents matched by the query (the foreground set) with the documents in your index as a whole (the background set). Both aggregations are therefore computationally intensive and need to be configured properly to run fast. Once you master them with the help of this tutorial, however, you'll have a powerful tool for building useful features in your applications and extracting insights from your datasets. Let's get started!
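As a rough illustration of the foreground/background idea, the sketch below builds a search request body in Python: the `query` part selects the foreground set, and a `significant_terms` aggregation is run over it. The field names (`area`, `crime_type`), the aggregation name, and the helper function are hypothetical and chosen only to mirror the crime-type use case above; the body is printed rather than sent to a cluster.

```python
import json


def build_significant_terms_request(filter_field, filter_values, terms_field, size=5):
    """Build a search body for a significant terms aggregation.

    The query defines the foreground set; Elasticsearch compares term
    frequencies in that set against the background set.
    All field names passed in are assumptions about your own mapping.
    """
    return {
        "size": 0,  # return aggregation results only, no search hits
        "query": {"terms": {filter_field: filter_values}},
        "aggs": {
            "significant_values": {
                "significant_terms": {"field": terms_field, "size": size}
            }
        },
    }


# Hypothetical fields: documents carry an "area" and a "crime_type".
body = build_significant_terms_request("area", ["harbour district"], "crime_type")
print(json.dumps(body, indent=2))
```

Keeping the foreground query narrow is one simple way to limit how much work the aggregation has to do.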

Bucket aggregations in Elasticsearch create buckets, or sets of documents, based on certain criteria. Depending on the aggregation type, you can create filtering buckets or buckets that represent value ranges and intervals for numeric values, dates, IP addresses, and more.

Although bucket aggregations do not calculate metrics themselves, they can hold metrics sub-aggregations that calculate metrics for each bucket they generate. This makes bucket aggregations very useful for the granular representation and analysis of your Elasticsearch indices. In this article, we'll focus on the histogram, range, filters, and terms bucket aggregations. Let's get started!
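To make the bucket-plus-metric idea concrete, here is a minimal pure-Python sketch that mimics what a histogram bucket aggregation with an avg sub-aggregation produces. It is not Elasticsearch code: the documents, the field names (`price`, `rating`), and the helper are made up for illustration, and unlike a real histogram this sketch does not fill in empty buckets.

```python
import math
from collections import defaultdict


def histogram_with_avg(docs, bucket_field, interval, metric_field):
    """Group docs into fixed-width buckets and average a metric per bucket."""
    grouped = defaultdict(list)
    for doc in docs:
        # Bucket key: the value rounded down to the nearest multiple of interval.
        key = math.floor(doc[bucket_field] / interval) * interval
        grouped[key].append(doc[metric_field])
    return [
        {"key": key, "doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in sorted(grouped.items())
    ]


# Hypothetical documents with a numeric "price" and "rating".
docs = [
    {"price": 12, "rating": 4.0},
    {"price": 18, "rating": 3.0},
    {"price": 25, "rating": 5.0},
    {"price": 41, "rating": 2.0},
]

buckets = histogram_with_avg(docs, "price", 20, "rating")
for bucket in buckets:
    print(bucket)
```

Each output entry corresponds to one bucket: the bucket aggregation decides which documents belong together, and the metric is then computed separately inside each bucket.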

This blog post continues our overview of Elasticsearch metrics aggregations. We will focus here on geo bounds, geo centroid, percentiles, percentile ranks, and some other single-value and multi-value aggregations. By the end of this series, you'll have a good understanding of metrics aggregations in Elasticsearch, including some important statistical measures and how to visualize them in Kibana. Let's get started!
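For orientation, the following sketch assembles a request body that combines the four metrics aggregations named above. The field names (`load_time`, `location`), the aggregation names, the chosen percents and values, and the helper function are all hypothetical; the body is only printed, so you would still need to send it to your own index.

```python
import json


def build_metrics_request(numeric_field, geo_field):
    """Build a search body with percentile and geo metrics aggregations.

    numeric_field is assumed to be a numeric field and geo_field a
    geo_point field in your own mapping.
    """
    return {
        "size": 0,  # aggregation results only
        "aggs": {
            "value_percentiles": {
                "percentiles": {"field": numeric_field, "percents": [50, 95, 99]}
            },
            "value_ranks": {
                "percentile_ranks": {"field": numeric_field, "values": [500, 1000]}
            },
            "viewport": {"geo_bounds": {"field": geo_field}},
            "centroid": {"geo_centroid": {"field": geo_field}},
        },
    }


body = build_metrics_request("load_time", "location")
print(json.dumps(body, indent=2))
```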

In a previous tutorial, we discussed the structure of Elasticsearch pipeline aggregations and walked you through setting up several common pipelines such as derivatives, cumulative sums, and avg bucket aggregations. 

In this article, we'll continue with the analysis of Elasticsearch pipeline aggregations, focusing on stats, moving averages and moving functions, percentiles, bucket sorts, and bucket scripts, among others. Some of the pipeline aggregations discussed in the article, such as moving averages, are supported in Kibana, so we'll show you how to visualize them as well. Let's get started!

As you might already know from the previous Elasticsearch aggregation series, both metrics and bucket aggregations work directly on field values (typically numeric) in the document set.

In contrast, pipeline aggregations, which we discuss in this article, work on the output produced by other aggregations, transforming the values those aggregations have already computed. A pipeline aggregation therefore works on intermediate values that are not present in the original document set. This makes pipeline aggregations very useful for calculating complex statistical and mathematical measures such as cumulative sums, derivatives, and moving averages.

In the first part of this series, we'll discuss two basic types of pipeline aggregations and show examples of such common Elasticsearch pipelines as a sum and cumulative sum, min and max, avg bucket, and derivative pipeline aggregations. Let's get started!
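The point that pipelines consume the output of other aggregations, not the documents, can be sketched in plain Python. The functions below are illustrative stand-ins rather than Elasticsearch APIs, and the monthly sales figures are made-up inputs that play the role of per-bucket sums already computed by an earlier aggregation.

```python
def cumulative_sum(bucket_values):
    """Running total across buckets, one output value per bucket."""
    total = 0
    running = []
    for value in bucket_values:
        total += value
        running.append(total)
    return running


def derivative(bucket_values):
    """Difference between each bucket and the previous one.

    The first bucket has no predecessor, so it gets None.
    """
    if not bucket_values:
        return []
    diffs = [curr - prev for prev, curr in zip(bucket_values, bucket_values[1:])]
    return [None] + diffs


def avg_bucket(bucket_values):
    """Single average computed across all buckets."""
    return sum(bucket_values) / len(bucket_values)


# Hypothetical per-month sums produced by an earlier aggregation.
monthly_sales = [550, 60, 375]

print(cumulative_sum(monthly_sales))  # [550, 610, 985]
print(derivative(monthly_sales))      # [None, -490, 315]
print(avg_bucket(monthly_sales))
```

Here `cumulative_sum` and `derivative` add a value to each existing bucket, while `avg_bucket` yields one value alongside the buckets; none of the three ever looks at the underlying documents.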
