In the previous tutorial, we learned how to set up a QBox cluster with the ES-Hadoop connector to interface with Hadoop’s data warehouse component, Hive, and run SQL queries on top of Elasticsearch. Offloading ES indices to Hive and manipulating them there opens the way to high-performing, deeper analysis across large data sets.

In this tutorial, we take it a step further and use Logstash to import an existing data set, in the form of a CSV file, into Elasticsearch, so that we can later run batch analytics in Hadoop’s powerful ecosystem.

The suggest API is one of the most important APIs in Elasticsearch. It is used extensively in search solutions to improve the user experience. From plain autocomplete to context-based suggestions, this API has many interesting use cases, which we will explore. In this tutorial, we show how to implement a simple autocomplete with Elasticsearch.
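
As a rough illustration of what a simple autocomplete involves, the sketch below builds the two JSON bodies typically used with the completion suggester: a mapping that declares a `completion` field, and a search body that asks for suggestions matching a prefix. The index name `products`, the field name `name_suggest`, and the label `autocomplete` are hypothetical, and the typeless mapping layout assumes a recent Elasticsearch version; older releases use a different request shape. Nothing is sent over the network here.

```python
import json

# Hypothetical names, chosen only for illustration.
INDEX = "products"
FIELD = "name_suggest"


def completion_mapping(field=FIELD):
    """Index body declaring `field` as a completion field.

    Assumes the typeless mapping layout of recent Elasticsearch versions.
    """
    return {"mappings": {"properties": {field: {"type": "completion"}}}}


def autocomplete_request(prefix, field=FIELD, size=5):
    """Search body asking the completion suggester for entries starting with `prefix`."""
    return {
        "suggest": {
            "autocomplete": {  # arbitrary label; the response echoes it back
                "prefix": prefix,
                "completion": {"field": field, "size": size},
            }
        }
    }


if __name__ == "__main__":
    # Bodies one might send to PUT /products and POST /products/_search.
    print(json.dumps(completion_mapping(), indent=2))
    print(json.dumps(autocomplete_request("ela"), indent=2))
```

Keeping the bodies as plain dictionaries makes them easy to pass to whichever HTTP or Elasticsearch client is in use.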

Sometimes a query is delayed or its response time is slow. There can be a number of reasons for the sluggishness, ranging from shard issues to the cost of computing certain elements in the query. Elasticsearch, from version 2.2, provides the Profile API so that users can inspect query execution time and other details. In this blog post, we explore how the Profile API can be used to look into query timings.
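
A minimal sketch of the idea: profiling is switched on by adding `"profile": true` to an ordinary search body, and the response then carries a `profile` section with per-shard timings. The helper names below are hypothetical, and the response layout (`profile.shards[].searches[].query[].time_in_nanos`) is an assumption based on recent Elasticsearch versions; early releases reported timings differently. The sample response is hand-written for illustration, not real output.

```python
def with_profiling(search_body):
    """Return a copy of a search body with the profile flag enabled."""
    body = dict(search_body)
    body["profile"] = True
    return body


def total_query_nanos(response):
    """Sum the top-level query timings over all shards in a profiled response.

    Assumes each top-level query entry has a `time_in_nanos` field.
    """
    total = 0
    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for query in search.get("query", []):
                total += query.get("time_in_nanos", 0)
    return total


if __name__ == "__main__":
    body = with_profiling({"query": {"match": {"title": "elasticsearch"}}})
    print(body)

    # Hand-written stand-in for a profiled response (values are made up).
    sample = {
        "profile": {
            "shards": [
                {"searches": [{"query": [{"type": "TermQuery", "time_in_nanos": 1200}]}]},
                {"searches": [{"query": [{"type": "TermQuery", "time_in_nanos": 800}]}]},
            ]
        }
    }
    print(total_query_nanos(sample))
```

Summing only the top-level entries avoids double counting, since child query timings are already included in their parent's total.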

In this blog post, we explain in detail the memory-related settings that can give Elasticsearch better performance, especially when scaling. We also go over issues caused by poor memory settings and the ways to overcome them.

Are you looking for full-text search and highlighting on the .pdf, .doc, or .epub files in your system? In this tutorial, we show you how to get it with the mapper-attachments plugin.

In the previous article, we covered “Painless” and provided details about its syntax and usage. We also covered some best practices, such as why to use params, when to use “doc” values versus “_source” to access document fields, and how to create fields on the fly.

We also covered using Painless scripting in a query context and a filter context, using conditionals in scripts, deleting fields and nested fields, accessing nested objects, and using scripts in scoring. In this final “Painless” post, we explore how to use Painless scripting in Kibana.
