Recent Posts by Adam Vanderbush

VP Marketing for Qbox and Supergiant.io. Qbox is a venture-backed company focusing on search as a service. The foundational cloud Elasticsearch product at Qbox helps users discover insights through data exploration and analytics.

In the previous article, we covered “painless” and provided details about its syntax and usage. We also covered some best practices, such as why to use params, when to use “doc” values versus “_source” when accessing document fields, and how to create fields on the fly.

We also covered using painless scripting in a query context and a filter context, along with conditionals in scripting, deleting fields and nested fields, accessing nested objects, and scripting in scoring. In this final "Painless" post, we explore how to use painless scripting in Kibana.

Keep reading

Is there a simple way to index emails to Elasticsearch? Logstash is the answer. Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite "stash." Here, “stash” means products like Elasticsearch, PagerDuty, Email, Nagios, Jira, and more.

The Logstash event processing pipeline has three stages: inputs → filters → outputs. Inputs generate events, filters modify them, and outputs ship them elsewhere. Inputs and outputs support codecs that enable you to encode or decode the data as it enters or exits the pipeline without having to use a separate filter.
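The three stages can be pictured with a small Python sketch. This is a toy model of the inputs → filters → outputs idea, not Logstash itself: every function name here is hypothetical, and the JSON codec stands in for the real codec plugins.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, object]


def json_decode(line: str) -> Event:
    """Input-side codec: decode one raw JSON line into an event."""
    return json.loads(line)


def json_encode(event: Event) -> str:
    """Output-side codec: encode an event back into a JSON line."""
    return json.dumps(event, sort_keys=True)


def line_input(lines: Iterable[str], decode: Callable[[str], Event]) -> Iterator[Event]:
    """Input stage: generate events, decoding each line with the codec."""
    for line in lines:
        yield decode(line)


def add_tag_filter(events: Iterable[Event], tag: str) -> Iterator[Event]:
    """Filter stage: modify each event by appending a tag."""
    for event in events:
        tagged = dict(event)
        tagged["tags"] = list(tagged.get("tags", [])) + [tag]
        yield tagged


def list_output(events: Iterable[Event], encode: Callable[[Event], str], sink: List[str]) -> None:
    """Output stage: ship encoded events elsewhere (here, into a list)."""
    for event in events:
        sink.append(encode(event))


def run_pipeline(lines: Iterable[str], sink: List[str]) -> None:
    """Wire the stages together: input -> filter -> output."""
    events = line_input(lines, json_decode)
    events = add_tag_filter(events, "email")
    list_output(events, json_encode, sink)
```

Because the codecs are passed into the input and output stages, no separate filter is needed just to parse or serialize the data, which mirrors the role codecs play in the description above.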

Keep reading

Developers and administrators of Elasticsearch find it scary when they see that an index is “red” or that some of the shards are in the “unassigned” state. What is much scarier is that when they try to identify the reason using APIs like “_cat/shards,” or try relocating shards using the “_cluster/reroute” API, they often fail to find the real reason and the factors that left those shards unassigned.

Wouldn’t it also be nice to find out why a particular shard is assigned to its current node and is not rebalanced to another node? To help answer these questions, Elasticsearch 5.0 introduced the cluster allocation explain API, _cluster/allocation/explain, which is helpful when diagnosing why a shard is unassigned, or why a shard continues to remain on its current node when you might expect otherwise.
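As a rough illustration, the request for one specific shard can be built in Python as below. This is a minimal sketch: the host http://localhost:9200 and the index name my_index are assumptions, the helper name is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_explain_request(base_url: str, index: str, shard: int, primary: bool) -> urllib.request.Request:
    """Build a request asking why one shard is (un)assigned.

    The body names the index, the shard number, and whether the
    primary or a replica copy should be explained.
    """
    body = {"index": index, "shard": shard, "primary": primary}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/allocation/explain",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",
    )


# Hypothetical cluster address and index name, for illustration only.
request = build_explain_request("http://localhost:9200", "my_index", 0, True)
```

Passing the request to urllib.request.urlopen against a running cluster should return a JSON explanation of the allocation decision for that shard.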

Keep reading

In the previous article, we covered “painless” and provided details about its syntax and usage. We also covered some best practices, such as why to use params, when to use “doc” values versus “_source” when accessing document fields, and how to create fields on the fly.

In this article, we explore further uses of “painless” scripting. This article covers using painless scripting in a query context and a filter context, using conditionals in scripting, deleting fields and nested fields, accessing nested objects, using scripts in scoring, and more.

Keep reading

Common use cases when working with Elasticsearch (ES) include creating dynamic fields, performing calculations on fields on the fly, and modifying the scoring based on custom logic. To perform these operations, Elasticsearch supports scripting.
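For instance, a field computed on the fly might be requested with a search body like the one this Python sketch builds. The field names price and discounted_price and the helper name are hypothetical. The script is placed under the "inline" key, as in early ES 5.x releases; later versions renamed that key to "source".

```python
import json


def scripted_field_search(field_name: str, factor: float) -> dict:
    """Build a search body that adds one scripted field to every hit.

    The Painless script reads the hypothetical "price" field from doc
    values and multiplies it by a factor passed in through params.
    """
    return {
        "query": {"match_all": {}},
        "script_fields": {
            field_name: {
                "script": {
                    "lang": "painless",
                    # "inline" is the ES 5.0-era key; newer versions use "source".
                    "inline": "doc['price'].value * params.factor",
                    "params": {"factor": factor},
                }
            }
        },
    }


body = scripted_field_search("discounted_price", 0.9)
print(json.dumps(body, indent=2))
```

Passing the factor through params rather than hard-coding it in the script text keeps the script string constant across requests.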

ES has supported scripting since its early versions, but the scripting language has evolved over the ES releases: MVEL before version 1.4, then Groovy from version 1.4 onward, and now the latest entry, “Painless,” starting with ES 5.0. A key reason for this evolution is the need for a faster, safer, and simpler scripting language.

Keep reading

We have discussed indexing, searching, and aggregations for parent-child and grandparent-grandchildren relationships in Elasticsearch. The parent-child functionality allows us to associate one document type with another in a one-to-many relationship: one parent to many children.

Keep reading

We have covered a lot on parent-child relationships in Elasticsearch: indexing, searching, aggregations, and the challenges they can pose. We continue our streak by exploring parent-child relationships further. The parent-child relationship is similar in nature to the nested model: both allow us to associate one entity with another. The difference is that, with nested objects, all entities live within the same document, while with parent-child, the parent and children are completely separate documents.
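The difference in document layout can be sketched in plain Python. This is only an illustration of the two shapes, not an Elasticsearch client call. The company and employee data, the helper name, and the "parent" and "source" keys are hypothetical stand-ins for however the parent ID is attached at index time.

```python
from typing import Dict, List, Tuple


def split_nested(doc: Dict, child_field: str, parent_id: str) -> Tuple[Dict, List[Dict]]:
    """Turn one nested document into a parent plus separate child documents.

    Each child carries a reference to its parent's ID instead of
    living inside the parent document.
    """
    parent = {key: value for key, value in doc.items() if key != child_field}
    children = [
        {"parent": parent_id, "source": dict(child)}
        for child in doc.get(child_field, [])
    ]
    return parent, children


# Nested model: all entities live within the same document.
nested_doc = {
    "name": "Acme",
    "employees": [{"name": "Ann"}, {"name": "Bob"}],
}

# Parent-child model: the parent and each child are separate documents.
parent_doc, child_docs = split_nested(nested_doc, "employees", "company-1")
```

In the nested shape, changing one employee means rewriting the whole company document. In the parent-child shape, each child can change on its own.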

Keep reading

In the past few articles, we have focused on indexing and searching parent-child relationships in Elasticsearch. The parent-child functionality allows us to associate one document type with another in a one-to-many relationship: one parent to many children. In this tutorial, we continue with parent-child aggregations in Elasticsearch.

Keep reading

We have already discussed indexing parent-child relationships in Elasticsearch. We have seen that the parent-child functionality allows us to associate one document type with another in a one-to-many relationship: one parent to many children.

For this post, we will be using hosted Elasticsearch on Qbox.io. You can sign up or launch your cluster here, or click "Get Started" in the header navigation. If you need help setting up, refer to "Provisioning a Qbox Elasticsearch Cluster."

The advantages that parent-child has over nested objects are as follows:

  • The parent document can be updated without reindexing the children.

  • Child documents can be added, changed, or deleted without affecting either the parent or other children. This is especially useful when child documents are large in number and need to be added or changed frequently.

  • Child documents can be returned as the results of a search request.

Keep reading

We have discussed handling relationships and data modeling extensively in our series so far. The need to bridge the gap between flat mapping and the real world has made us focus on the following techniques:

  • Application-side joins

  • Data denormalization

  • Nested objects

Keep reading