Recent Posts by Adam Vanderbush

VP Marketing for Qbox and Supergiant.io. Qbox is a venture-backed company focusing on search as a service. Qbox's foundational cloud Elasticsearch product helps users discover insights through data exploration and analytics.

When we index data, documents rarely exist in isolation. Sometimes we are better off denormalizing all data into the child documents. For example, if we were modeling blog posts, adding an author field to each blog document could be a sensible choice, even if the database, the authoritative data source, splits the data into separate authors and blogs tables. It’s simple, and one can easily construct queries on both the attributes of the blogs and the author’s name.
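As a rough illustration of the blog example above, the following in-memory Python sketch copies the author's name into each blog document. The table contents, field names, and the `denormalize` and `search` helpers are assumptions made for illustration; they are not Elasticsearch calls, and a real setup would send the resulting documents to an index instead of a dictionary.

```python
# Hypothetical rows from the authoritative relational store.
authors = {1: {"name": "Jane Doe"}}
blogs = [
    {"id": 10, "title": "Intro to search", "author_id": 1},
    {"id": 11, "title": "Scaling indices", "author_id": 1},
]


def denormalize(blog, authors):
    """Return a search document with the author's name copied in."""
    return {
        "title": blog["title"],
        "author": {"name": authors[blog["author_id"]]["name"]},
    }


# One self-contained document per blog post, keyed by blog id.
documents = {blog["id"]: denormalize(blog, authors) for blog in blogs}


def search(documents, title_word, author_name):
    """Naive stand-in for a query on both the title and the author name."""
    return sorted(
        doc_id
        for doc_id, doc in documents.items()
        if title_word.lower() in doc["title"].lower()
        and doc["author"]["name"] == author_name
    )


matching_ids = search(documents, "search", "Jane Doe")
```

The trade-off is that a change to an author's name has to be copied into every blog document that embeds it.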

Keep reading

We discussed data denormalization in our previous post, Denormalization and Concurrency Issues in Elasticsearch, where we emulated a filesystem with directory trees in Elasticsearch, much like a filesystem on Linux: the root of the directory is /, and each directory can contain files and subdirectories. The problem comes when we want to allow more than one person to rename files or directories at the same time. In this post, we discuss concurrency issues and the various kinds of locking in Elasticsearch.

Keep reading

One of the key principles behind Elasticsearch is to allow you to make the most out of your data. Historically, search was a read-only enterprise where a search engine was loaded with data from a single source. As usage grows and Elasticsearch becomes more central to your application, data often needs to be updated by multiple components.

Multiple components lead to concurrency and concurrency leads to conflicts. Elasticsearch's versioning system is there to help cope with those conflicts.
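The idea behind version-based conflict handling can be sketched with a small in-memory store: a write that names the version it expects is rejected if another writer has already changed the document. The `VersionedStore` class and `VersionConflict` exception are hypothetical names used only for this sketch, not the Elasticsearch client API.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version of the document."""


class VersionedStore:
    """Toy document store that bumps a version number on every write."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def get(self, doc_id):
        return self._docs[doc_id]

    def index(self, doc_id, source, expected_version=None):
        current_version = self._docs.get(doc_id, (0, None))[0]
        # Reject the write if the caller's view of the document is outdated.
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, dict(source))
        return new_version


store = VersionedStore()
store.index("cart-1", {"items": 1})

# Two components read the same version of the document.
version_seen_by_a, _ = store.get("cart-1")
version_seen_by_b, _ = store.get("cart-1")

# The first write succeeds and bumps the version.
store.index("cart-1", {"items": 2}, expected_version=version_seen_by_a)

# The second write is based on a stale version and is rejected.
try:
    store.index("cart-1", {"items": 5}, expected_version=version_seen_by_b)
    conflict_detected = False
except VersionConflict:
    conflict_detected = True
```

A component that hits a conflict would typically re-read the document, reapply its change, and retry.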

Keep reading

Complex relational databases can lead to tortuous SQL queries and slow responses from the web application. If you’re trying to return a long list of objects that are built up from five, ten, or even seventeen related tables, your response times can be unacceptably slow.

Such problems are encountered regularly in large and complex data modeling applications. We have found that using Elasticsearch along with some conventions for denormalising complex objects can make it easy to generate sufficiently speedy responses, even when they are returning lots of rows.

Keep reading

Elasticsearch is a different kind of beast, especially if you come from the world of SQL. It comes with many benefits: performance, scale, near real-time search, and analytics across massive amounts of data.

Handling relationships between entities is not as obvious as it is with a dedicated relational store. The golden rule of a relational database, i.e., normalize your data, does not apply to Elasticsearch. This tutorial series will walk through Handling Relationships, Nested Objects, and Parent-Child Relationship to discuss the pros and cons of each of the available approaches.

Keep reading

We have already discussed Elasticsearch 5.0 and its many new features, and if you've been paying attention, then you know that one of the more prominent of these is the shiny new ingest node. Simply put, ingest aims to provide a lightweight solution for pre-processing and enriching documents within Elasticsearch itself before they are indexed.

We can use an ingest node to pre-process documents before the actual indexing takes place. This pre-processing is performed by an ingest node that intercepts bulk and index requests, applies the transformations, and then passes the documents back to the index or bulk APIs.
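The intercept, transform, then index flow described above can be sketched in plain Python as a list of processor functions applied in order before a document is stored. The `lowercase`, `set_field`, `run_pipeline`, and `index_with_pipeline` helpers are hypothetical stand-ins written for this sketch; they are not the Elasticsearch ingest API.

```python
def lowercase(field):
    """Build a processor that lowercases one string field."""
    def processor(doc):
        doc[field] = doc[field].lower()
        return doc
    return processor


def set_field(field, value):
    """Build a processor that sets a field to a fixed value."""
    def processor(doc):
        doc[field] = value
        return doc
    return processor


def run_pipeline(processors, doc):
    """Apply each processor in order to a copy of the incoming document."""
    doc = dict(doc)
    for processor in processors:
        doc = processor(doc)
    return doc


def index_with_pipeline(index, doc_id, doc, processors):
    """Transform the document first, then hand it to the (toy) index."""
    index[doc_id] = run_pipeline(processors, doc)


index = {}
pipeline = [lowercase("level"), set_field("env", "prod")]
incoming = {"level": "ERROR", "message": "disk full"}
index_with_pipeline(index, "1", incoming, pipeline)
```

Because the pipeline works on a copy, the incoming document is left unchanged and only the transformed version reaches the index.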

Keep reading

This post is about tuning Elasticsearch disk usage. In it, we discuss disk usage tuning techniques, strategies, and recommendations specific to Elasticsearch 5.0 and later.

Keep reading

This post is Part 3 of a 3-part series about tuning Elasticsearch Search. Part 1 can be found here, and Part 2 can be found here. The aim of this tutorial is to further discuss Search Tuning techniques, strategies, and recommendations specific to Elasticsearch 5.0 and later.

Elasticsearch 5.0.0 was a major release after the Elasticsearch 2.x series, and it has something for everyone. It is part of a wider release of the Elastic Stack that lines up the version numbers of all the stack products. Kibana, Logstash, Beats, and Elasticsearch are all version 5.0 now. It is the fastest, safest, most resilient, easiest to use version of Elasticsearch ever, and it comes with a boatload of enhancements and new features.

Keep reading

This post is Part 2 of a 3-part series about tuning Elasticsearch Search. Part 1 can be found here. The aim of this tutorial is to further discuss Search Tuning techniques, strategies, and recommendations specific to Elasticsearch 5.0 and later.

Kibana, Logstash, Beats, and Elasticsearch are all version 5.0 now. It is the fastest, safest, most resilient, easiest to use version of Elasticsearch ever, and it comes with a boatload of enhancements and new features.

Keep reading

Elasticsearch 5.0.0 was a major release after the Elasticsearch 2.x series, and it has something for everyone. It is part of a wider release of the Elastic Stack that lines up the version numbers of all the stack products. Kibana, Logstash, Beats, and Elasticsearch are all version 5.0 now. It is the fastest, safest, most resilient, easiest to use version of Elasticsearch ever, and it comes with a boatload of enhancements and new features.

Keep reading