The last two blogs in the analyzer series covered a wide range of topics, from the basics of analyzers to building a custom analyzer from multiple components for our own purposes. In this blog we are going to look at a few special tokenizers, such as the email-link (UAX URL email) tokenizer, and token filters such as the edge-n-gram and phonetic token filters.

These tokenizers and filters provide very useful functionality that can make our searches noticeably more precise.
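As a preview, here is a minimal sketch of how an edge-n-gram token filter might be wired into an index (the index name, field names, and gram sizes are assumptions for illustration; the phonetic token filter is configured in the same way but requires the analysis-phonetic plugin to be installed):

```json
PUT /autocomplete_demo
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_edge": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 10
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "autocomplete_edge"]
        }
      }
    }
  }
}
```

With settings like these, a term such as "elastic" is indexed as the prefixes "el", "ela", "elas", and so on — the building block behind search-as-you-type behavior.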

Keep reading

In the previous blog in our analyzer series we learned, in detail, about the creation of an inverted index and the components of an analyzer, and walked through a simple example of how those components work together as a single entity to analyze input text.

Now, in this blog, we will move on to the application of analyzers by creating a custom analyzer with multiple components. We will look at analyzer design and at further components that are crucial for more accurate search results, with examples along the way.
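To make the idea concrete, a custom analyzer is simply a named combination of character filters, a tokenizer, and token filters. A minimal sketch, with an assumed index name and an illustrative choice of components:

```json
PUT /blog_demo
{
  "settings": {
    "analysis": {
      "analyzer": {
        "cleaned_english": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  }
}
```

You can then inspect its output with the `_analyze` API — for example, analyzing `<p>The Quick Foxes</p>` with this analyzer strips the HTML, lowercases the terms, and drops the stopword, leaving the tokens `quick` and `foxes`.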

Keep reading

Beginners in Elasticsearch sometimes fire what looks like the perfect query, only to get partial results, no results, or totally unexpected ones. Since the query itself is flawless, it is natural to be curious about what is happening behind the curtain.

Remember that Elasticsearch is a text-based search engine built on top of Lucene. The wide range of operations available in Lucene is made easy to use in Elasticsearch by encapsulating them effectively into simple APIs.

In order to make our searches more effective and accurate, we need to understand some key aspects of Lucene, which forms the foundation of Elasticsearch.

So in this blog, we will familiarise ourselves with concepts such as terms, tokens, and stems, and their role in the similarity algorithm, to learn how text is handled and processed in Elasticsearch.

Keep reading

It can be challenging to get the right outcomes from your Elasticsearch aggregations, but precise results are possible with proper tokenization, exact mappings, and a custom analyzer.

In this article, we explain some of the subtleties inherent in the design of the Elasticsearch analyzer. We help you understand a common cause of erroneous result sets, then show you two methods for improving the results and making them entirely accurate. We also provide many resources to help you gain proficiency in ES aggregations.
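One of the standard fixes discussed in this space is a keyword multi-field, so that aggregations run on the unanalyzed value instead of on individual tokens. A minimal sketch (index and field names are assumptions for illustration):

```json
PUT /cities
{
  "mappings": {
    "properties": {
      "city": {
        "type": "text",
        "fields": {
          "raw": { "type": "keyword" }
        }
      }
    }
  }
}
```

Aggregating on `city.raw` keeps "New York" as a single bucket, whereas a terms aggregation on the analyzed `city` field would split it into the tokens "new" and "york".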

Keep reading

Welcome to Episode #3 of our Elasticsearch tutorial. In our last episode, we searched with, and learned about, some of Elasticsearch's Query DSL. Today we'll build unstructured search in Elasticsearch using analyzers. Once you have an Elasticsearch cluster running per the instructions in Episode #1, we'll get started.

Keep reading