Although Elasticsearch can scale to very large datasets, you should store only the data you actually need. Doing so speeds up searches, shortens response times, and can substantially reduce resource utilization.

Elasticsearch uses an inverted index to retrieve the data you search for. Although this structure is one of the best for text searching, keeping only the data you need in the index is still the best approach.

In this tutorial, we discuss data retention techniques that you can use in Elasticsearch. The right approach depends on the kind of data and your application, because some data needs a longer retention policy than other data.

Imagine an application that deals with finance and money transactions. Such an application will need to keep all of its records indefinitely. But do those records always need to live in Elasticsearch? Does all of this data need to be quickly searchable?

Logstash provides ways to segregate different events and route them to standard file storage, rather than Elasticsearch, for long-term retention.
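
As a minimal sketch, a Logstash output section along these lines can route archival events to flat files and everything else to Elasticsearch. The retention field, file paths, and hosts here are hypothetical placeholders:

```conf
output {
  # Route events based on a (hypothetical) "retention" field set earlier in the pipeline
  if [retention] == "archive" {
    # Long-term storage as plain files, one per day
    file {
      path => "/var/log/archive/%{type}-%{+YYYY-MM-dd}.log"
    }
  } else {
    # Everything else stays searchable in Elasticsearch
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "logstash-%{+YYYY.MM.dd}"
    }
  }
}
```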

Keep reading

Filebeat is extremely lightweight compared to its predecessors when it comes to efficiently shipping log events. It uses the Lumberjack protocol, supports compression, and is easy to configure with a YAML file. It can send events directly to Elasticsearch as well as to Logstash, and it keeps track of each file and its read position so that it can resume where it left off.

The goal of this tutorial is to set up a proper environment for shipping Linux system logs to Elasticsearch with Filebeat, and then to share helpful tips for making good use of the environment in Kibana.
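
For reference, a minimal filebeat.yml along these lines might look as follows; the log paths and hosts are placeholders, and the exact section names vary slightly between Filebeat versions:

```yaml
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/syslog
      - /var/log/auth.log

# Ship events directly to Elasticsearch...
output.elasticsearch:
  hosts: ["localhost:9200"]

# ...or comment out the section above and ship to Logstash instead:
#output.logstash:
#  hosts: ["localhost:5044"]
```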

Keep reading

In a previous tutorial, we discussed how to use one of Rust's Elasticsearch clients, rs-es, to interact with Elasticsearch via its REST API. Now we'll take a look at the other Rust Elasticsearch client, elastic.

Elastic, like rs-es, is idiomatic Elasticsearch, but unlike rs-es, it is not idiomatic Rust: strong typing of document types and query responses is prioritized over providing a comprehensive mapping of the Query DSL into Rust constructs. The elastic project aims to be equally usable by developers with and without Rust experience.

Structurally, the elastic crate combines several other crates, each of which can also be used independently depending on the user's needs. The first of these is elastic-reqwest, a synchronous implementation of the Elasticsearch REST API based on Rust's reqwest library; it serves as the HTTP backend for the elastic crate itself.

Second is elastic-requests, a strongly-typed implementation of Elasticsearch's REST API. Third is elastic-responses, which integrates with elastic-reqwest and facilitates handling Elasticsearch search responses by creating iterators over search results. Finally, elastic-types allows Elasticsearch types to be defined as custom Rust structures. It uses serde, which we encountered in the prior Rust Elasticsearch tutorial, for serialization.
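
To give a sense of how these pieces fit together, here is a rough sketch. The builder and method names are assumptions based on the crate's documented style and have changed between versions, so treat this as illustrative rather than exact:

```rust
use elastic::prelude::*;
use serde::{Deserialize, Serialize};

// elastic-types: an Elasticsearch document type defined as a plain Rust struct,
// serialized with serde. (Derive re-exports vary by crate version.)
#[derive(Debug, Serialize, Deserialize, ElasticType)]
struct Article {
    id: i32,
    title: String,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // elastic-reqwest supplies the synchronous HTTP backend behind this client;
    // the node address is a placeholder
    let client = SyncClientBuilder::new()
        .base_url("http://localhost:9200")
        .build()?;

    // Index a strongly-typed document
    client
        .document()
        .index(Article { id: 1, title: "Hello, elastic".to_owned() })
        .send()?;

    // elastic-responses exposes search hits as an iterator of typed documents
    let response = client.search::<Article>().index("articles").send()?;
    for hit in response.hits() {
        println!("{:?}", hit.document());
    }

    Ok(())
}
```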

Keep reading

Previous tutorials have discussed how to use native clients in languages like Java and Python to interact with Elasticsearch via REST API.

Meanwhile, the systems programming language Rust has been gaining wider use in production, and its library ecosystem is growing. So, is it now possible to interact with the Elasticsearch API using a Rust client?

The answer is yes! There are actually two Rust Elasticsearch clients under active development, rs-es and elastic. In this tutorial, we will use rs-es. While both libraries are idiomatic Elasticsearch, only rs-es is also idiomatic Rust. For example, the type of an Elasticsearch document is referred to in rs-es as doc_type, since type is a reserved keyword in Rust. By contrast, although documents in elastic are strongly typed, queries are weakly typed, meaning some errors are not caught until runtime.

Currently, rs-es implements the most common Elasticsearch APIs, including searching and indexing. The implementation of other APIs is planned, as are other improvements, including to performance. rs-es supports Elasticsearch versions 2.0 and up. Finally, it's worth noting that neither rs-es nor elastic supports asynchronous calls yet, though async support is planned for later versions of elastic.
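
To give a flavor of the rs-es style, here is a rough sketch of indexing and searching. The constructor and builder names are assumptions from the crate's documentation and differ between rs-es versions, so check the docs for the version you use:

```rust
use rs_es::Client;
use rs_es::query::Query;
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
struct Article {
    title: String,
}

fn main() {
    // Connection details are placeholders; the constructor signature has
    // varied across rs-es versions (older releases take host and port separately)
    let mut client = Client::new("http://localhost:9200").unwrap();

    // Index a document; note doc_type in the API, since type is reserved in Rust
    client
        .index("articles", "article") // index name, doc_type
        .with_doc(&Article { title: "Hello, rs-es".to_owned() })
        .with_id("1")
        .send()
        .unwrap();

    // Build a match query with the idiomatic-Rust query builder and search
    let query = Query::build_match("title", "hello").build();
    let result = client
        .search_query()
        .with_indexes(&["articles"])
        .with_query(&query)
        .send::<Article>()
        .unwrap();

    println!("total hits: {:?}", result.hits.total);
}
```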

Keep reading

While a search request returns a single “page” of results, the scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database. Scrolling is not intended for real-time user requests, but rather for processing large amounts of data, e.g. in order to reindex the contents of one index into a new index with a different configuration.

The results that are returned from a scroll request reflect the state of the index at the time that the initial search request was made, like a snapshot in time. Subsequent changes to documents (index, update or delete) will only affect later search requests.
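
For example, a scroll is opened with an initial search request and then paged through by repeatedly passing back the scroll ID from each response. The index name, batch size, and timeout below are placeholders:

```sh
# Open a scroll context that stays alive for 1 minute and return the first batch
curl -XPOST "localhost:9200/my_index/_search?scroll=1m" \
  -H 'Content-Type: application/json' -d '
{
  "size": 1000,
  "query": { "match_all": {} }
}'

# Fetch the next batch using the _scroll_id from the previous response,
# renewing the context for another minute each time; repeat until no hits remain
curl -XPOST "localhost:9200/_search/scroll" \
  -H 'Content-Type: application/json' -d '
{
  "scroll": "1m",
  "scroll_id": "<scroll_id from previous response>"
}'
```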

Keep reading

In the previous tutorial in the ElastAlert series, we implemented cardinality, percentage match, and single metric aggregation rules for ElastAlert alerting via HipChat. Next, we will look at configuring and setting up ElastAlert alerting for the fast, simple, and free messaging app Telegram.

ElastAlert is now available on Qbox-provisioned Elasticsearch clusters and is easy to configure: when you provision a cluster, there is a configuration box where you can input your alert rules. If you’re unclear how to structure rules in YAML, be sure to consult the ElastAlert documentation.
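
As a rough illustration, a frequency rule that alerts to Telegram might be structured like this; the rule name, index, threshold, filter, and bot credentials are all placeholders:

```yaml
name: Example frequency rule
type: frequency
index: logstash-*

# Fire when 50 matching events occur within 5 minutes
num_events: 50
timeframe:
  minutes: 5

# Standard Elasticsearch query DSL filters
filter:
  - term:
      response: "500"

# Send the alert to Telegram (bot token and chat ID are placeholders)
alert:
  - telegram
telegram_bot_token: "<your-bot-token>"
telegram_room_id: "<your-chat-id>"
```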

Keep reading