Bulk indexing in Elasticsearch is worth understanding because you may occasionally need to write your own code to bulk index custom data. Experience with bulk indexing also helps when you need to diagnose performance issues in an Elasticsearch cluster.
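
As a rough illustration of what such custom code has to produce, here is a minimal sketch that serializes documents into the newline-delimited JSON body the bulk API expects. The helper name `build_bulk_body`, the `products` index, and the sample documents are hypothetical; older Elasticsearch versions also expect a `_type` in the action line, which this sketch leaves out.

```python
import json


def build_bulk_body(index_name, docs):
    """Serialize (doc_id, source) pairs into a bulk request body (NDJSON)."""
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch what to do with the line that follows.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical index name and documents, for illustration only.
body = build_bulk_body("products", [("1", {"name": "kettle"}), ("2", {"name": "mug"})])
```

The resulting string would be sent to the `_bulk` endpoint; sending many documents per request instead of one request per document is what makes bulk indexing efficient.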

Keep reading

In this blog post, we will cover an important feature: filtering values with partitions in the terms aggregation, which can also be used to page through the unique terms in the aggregation's buckets.

Prior to 5.2, you could set the size of a terms aggregation to zero to fetch all terms. This approach was problematic because retrieving the entire set of terms at once could exhaust memory.
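
To make the idea concrete, here is a minimal sketch that builds one search body per partition, so the unique terms can be fetched in several smaller requests instead of all at once. The function name, the `account_id` field, and the partition and size values are assumptions chosen for illustration.

```python
def partitioned_terms_bodies(field, num_partitions, terms_per_partition):
    """Yield one search body per partition of a terms aggregation."""
    for partition in range(num_partitions):
        yield {
            # Top-level size of 0 suppresses hits; it is not the aggregation size.
            "size": 0,
            "aggs": {
                "unique_terms": {
                    "terms": {
                        "field": field,
                        # Only terms that fall into this partition are returned.
                        "include": {
                            "partition": partition,
                            "num_partitions": num_partitions,
                        },
                        "size": terms_per_partition,
                    }
                }
            },
        }


# Hypothetical field name and sizing, for illustration only.
bodies = list(partitioned_terms_bodies("account_id", 4, 1000))
```

Each body would be sent as a separate search request. The number of partitions should be large enough that the terms in any one partition fit within the requested aggregation size.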

Keep reading

Common use cases when working with Elasticsearch (ES) include creating dynamic fields, performing calculations on fields on the fly, and modifying the scoring based on custom logic. To perform these operations, Elasticsearch supports scripting.

ES has supported scripting since its early versions, but the scripting language has changed over the releases: MVEL before version 1.4, Groovy after version 1.4, and now Painless, introduced in ES 5.0. A key reason for this evolution is the need for a faster, safer, and simpler scripting language.
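
As an example of a calculation performed on the fly, the sketch below builds a search body with a Painless script field. The `price` field, the `scaled_price` name, and the helper function are hypothetical. The script text is passed under the `source` key here, which is an assumption about the target version: early 5.x releases used `inline` instead.

```python
def scripted_field_body(factor):
    """Build a search body that computes a field at query time with Painless."""
    return {
        "query": {"match_all": {}},
        "script_fields": {
            # Hypothetical computed field; it is not stored in the index.
            "scaled_price": {
                "script": {
                    "lang": "painless",
                    # Reads the indexed value and multiplies it by a parameter.
                    "source": "doc['price'].value * params.factor",
                    "params": {"factor": factor},
                }
            }
        },
    }


body = scripted_field_body(1.2)
```

Passing the multiplier through `params` rather than embedding it in the script text lets Elasticsearch reuse the compiled script across requests with different values.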

Keep reading

Scaling Elasticsearch is not an easy task. In this article, we go over different methods to make a High-Availability Logstash Indexing Solution using Qbox Hosted Elasticsearch.

The Logstash indexer is the component that indexes events and sends them to Elasticsearch for faster searches. We will use multiple Logstash indexers with exactly the same configuration, which opens up different ways to build a highly available Logstash solution for your ELK stack. These identically configured indexer nodes can easily be created using configuration management tools like Puppet or Chef.

Keep reading

We have already discussed indexing parent-child relationships in Elasticsearch. We have seen that the parent-child functionality allows us to associate one document type with another in a one-to-many relationship: one parent to many children.

For this post, we will be using hosted Elasticsearch on Qbox.io. You can sign up or launch your cluster here, or click “Get Started” in the header navigation. If you need help setting up, refer to “Provisioning a Qbox Elasticsearch Cluster.”

The advantages that parent-child has over nested objects are as follows:

  • The parent document can be updated without reindexing the children.
  • Child documents can be added, changed, or deleted without affecting either the parent or other children. This is especially useful when child documents are large in number and need to be added or changed frequently.
  • Child documents can be returned as the results of a search request.
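
Because parents and children are separate documents, either side can be searched by conditions on the other. The sketch below builds the two query bodies involved; the `branch` and `employee` type names and the field values are hypothetical, and the helper functions are only for illustration.

```python
def has_child_body(child_type, child_query):
    """Find parent documents that have at least one matching child."""
    return {"query": {"has_child": {"type": child_type, "query": child_query}}}


def has_parent_body(parent_type, parent_query):
    """Find child documents whose parent matches the given query."""
    return {"query": {"has_parent": {"parent_type": parent_type, "query": parent_query}}}


# Hypothetical types and fields: branches (parents) and employees (children).
parents_with_child = has_child_body("employee", {"match": {"name": "alice"}})
children_of_parent = has_parent_body("branch", {"match": {"country": "uk"}})
```

The second body returns child documents as search hits, which is the capability described in the last bullet above.
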
Keep reading

Effective log management means being able to instantly draw useful insights from millions of log entries, identify issues as they arise, and visualize and communicate the patterns that emerge from your application logs. Fortunately, the ELK stack (Elasticsearch, Logstash, and Kibana) makes it easy to ship logs from your application to ES collections for storage and analysis.

Recently, the Elastic infrastructure was extended with a useful set of log-shipping tools called Beats. Filebeat is a part of the Beats tool set that can be configured to send log events either to Logstash (and from there to Elasticsearch) or directly to Elasticsearch. The tool turns your logs into searchable and filterable ES documents with fields and properties that can be easily visualized and analyzed.

In a previous post, we discussed how to use Filebeat to ship Linux system logs. Now it's time to show how to ship logs from your MySQL database to your Elasticsearch cluster via Filebeat. Making MySQL general and slow logs accessible via Kibana and Logstash will radically improve your database management, log analysis, and pattern discovery, leveraging the full potential of the ELK stack.

Keep reading