In this blog post, we explain Elasticsearch's memory-related settings in detail and show how tuning them improves performance, especially when scaling. We also go over the issues caused by poor memory settings and ways to overcome them.
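
Before tuning anything, it helps to see where memory currently stands. As a minimal sketch (standard APIs, assuming only a reachable cluster), these console requests report per-node heap and RAM usage and confirm whether memory locking took effect:

```
# Per-node JVM heap and OS RAM usage
GET _cat/nodes?v&h=name,heap.percent,heap.max,ram.percent

# Check whether bootstrap.memory_lock actually took effect on each node
GET _nodes?filter_path=**.mlockall
```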

Keep reading

Are you looking for full-text search and highlighting on the .pdf, .doc, or .epub files in your system? In this tutorial, we show you how with the mapper-attachments plugin.
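
As a rough sketch of the idea, assuming the mapper-attachments plugin is installed on an Elasticsearch 2.x-era cluster: map a field as an attachment (storing the extracted content with term vectors so it can be highlighted), index the file as base64, and search with a highlight block. The index name docs, type doc, field name file, and the query term are illustrative:

```
PUT docs
{
  "mappings": {
    "doc": {
      "properties": {
        "file": {
          "type": "attachment",
          "fields": {
            "content": {
              "type": "string",
              "store": true,
              "term_vector": "with_positions_offsets"
            }
          }
        }
      }
    }
  }
}

PUT docs/doc/1
{
  "file": "<base64-encoded contents of the .pdf/.doc/.epub file>"
}

GET docs/_search
{
  "query":     { "match": { "file.content": "invoice" } },
  "highlight": { "fields": { "file.content": {} } }
}
```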

Keep reading

In the previous article, we covered “painless” and provided details about its syntax and usage. We also covered some best practices, such as why to use params, when to use “doc” values versus “_source” when accessing document fields, and how to create fields on the fly.

We also covered using painless scripting in a query context and a filter context, using conditionals in scripts, deleting fields and nested fields, accessing nested objects, and using scripting in scoring. In this final “Painless” post, we explore how to use painless scripting in Kibana.
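
To give a flavor of where this is heading, here is a small sketch of the kind of painless expression you would paste into a Kibana scripted field, wrapped in a script_fields search request so it can also be tried from the console; the index name my-index and the fields price and quantity are illustrative:

```
GET my-index/_search
{
  "script_fields": {
    "total_price": {
      "script": {
        "lang": "painless",
        "source": "doc['price'].value * doc['quantity'].value"
      }
    }
  }
}
```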

Keep reading

Is there a simple way to index emails into Elasticsearch? Logstash is the answer. Logstash is an open-source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite “stash.” Here, “stash” means destinations like Elasticsearch, PagerDuty, Email, Nagios, Jira, and more.

The Logstash event processing pipeline has three stages: inputs → filters → outputs. Inputs generate events, filters modify them, and outputs ship them elsewhere. Inputs and outputs support codecs that enable you to encode or decode the data as it enters or exits the pipeline without having to use a separate filter.
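
As a rough sketch of how those three stages map onto the email use case, here is a minimal pipeline configuration assuming the imap input and elasticsearch output plugins; the mail host, credentials, and index name are placeholders:

```
input {
  imap {
    host     => "imap.example.com"
    user     => "me@example.com"
    password => "secret"
    port     => 993
    secure   => true
  }
}

filter {
  # illustrative filter stage: tag each event before it is shipped
  mutate { add_field => { "source_type" => "email" } }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "emails-%{+YYYY.MM.dd}"
  }
}
```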

Keep reading

Developers and administrators of Elasticsearch find it scary when they see an index turn “red” or some of its shards stuck in an “unassigned” state. What is scarier still is that when they try to identify the cause using APIs like “_cat/shards,” or try relocating shards with the “_cluster/reroute” API, they often fail to pin down the real reason and the factors that left those shards unassigned.

Wouldn’t it also be nice to find out why a particular shard stays on its current node instead of being rebalanced to another node? To help answer such questions, Elasticsearch 5.0 introduced the cluster allocation explain API, _cluster/allocation/explain, which is helpful when diagnosing why a shard is unassigned, or why a shard remains on its current node when you might expect otherwise.
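
For a concrete picture, a typical call looks like the request below. The body is optional (without it, Elasticsearch explains the first unassigned shard it finds), and the index name my-index is illustrative:

```
GET _cluster/allocation/explain
{
  "index":   "my-index",
  "shard":   0,
  "primary": true
}
```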

Keep reading

In the previous article, we covered “painless” and provided details about its syntax and usage. We also covered some best practices, such as why to use params, when to use “doc” values versus “_source” when accessing document fields, and how to create fields on the fly.

In this article, we explore further uses of “painless” scripting. We cover using painless scripting in a query context and a filter context, using conditionals in scripts, deleting fields and nested fields, accessing nested objects, using scripting in scoring, and more.
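
As a small taste of the filter-context material, here is a sketch of a painless script query that uses params instead of hard-coded values; the index name my-index, the field price, and the threshold are illustrative:

```
GET my-index/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "lang":   "painless",
            "source": "doc['price'].value > params.threshold",
            "params": { "threshold": 100 }
          }
        }
      }
    }
  }
}
```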

Keep reading