In this blog post, we show how the suggest API in Elasticsearch can handle misspelled words using the term suggester. We also explore several ways of sending term suggest requests.

Term Suggestion for Spell Checking

Users often misspell words in their search queries. A search solution should still be able to offer useful suggestions when the query contains spelling mistakes. The suggest API provides two spell-check suggesters for this purpose: the term suggester and the phrase suggester. The term suggester corrects the spelling of individual mistyped words. The phrase suggester builds on the term suggester and takes multiple terms into account. Let us go over the term suggester and how it behaves.

Test Documents

Index some sample documents to illustrate how the term suggester operates. The following five documents are used throughout this post:

curl -XPOST localhost:9200/term-suggest/test/1 -d '{"name": "bald"}'
curl -XPOST localhost:9200/term-suggest/test/2 -d '{"name": "bold"}'
curl -XPOST localhost:9200/term-suggest/test/3 -d '{"name": "blend"}'
curl -XPOST localhost:9200/term-suggest/test/4 -d '{"name": "bend"}'
curl -XPOST localhost:9200/term-suggest/test/5 -d '{"name": "blood"}'
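
The five requests above can also be combined into a single request body. The sketch below is an illustration and not part of the original walkthrough. It builds a newline-delimited payload in the shape the Elasticsearch "_bulk" endpoint expects (one action line followed by one source line per document), reusing the index name, type, and IDs from the curl commands. The helper name build_bulk_body is hypothetical, and the code only builds the payload; sending it is left out.

```python
import json

# The five sample documents from the curl commands above, keyed by document ID.
DOCS = {1: "bald", 2: "bold", 3: "blend", 4: "bend", 5: "blood"}


def build_bulk_body(index, doc_type, docs):
    """Return a newline-delimited bulk payload (hypothetical helper)."""
    lines = []
    for doc_id, name in docs.items():
        # Action line: tells the bulk endpoint where the next document goes.
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps({"name": name}))
    # A bulk body has to end with a trailing newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body("term-suggest", "test", DOCS)
print(body)
```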

Case 1 - Simple Use Case

Now that the index holds the necessary documents, send a suggest query to see how a misspelled word is handled and which corrections are returned. Pass the following suggest query to the index:

curl -XPOST localhost:9200/term-suggest/_suggest -d '{
 "suggest_demo_01": {
   "text": "blod",
   "term": {
     "field": "name"
   }
 }
}'

The above query is a basic example of the term suggest API. Here "suggest_demo_01" is an arbitrary identifier of our choosing, which the response uses to label the results. The "text" field carries the search keyword, which in this case is a misspelled word. The "term" object names the field to check against, so spelling suggestions are drawn from the "name" field of the indexed documents. The result of the above query is given below:

{
 "_shards": {
   "total": 5,
   "successful": 5,
   "failed": 0
 },
 "suggest_demo_01": [
   {
     "text": "blod",
     "offset": 0,
     "length": 4,
     "options": [
       {
         "text": "blood",
         "score": 0.75,
         "freq": 1
       },
       {
         "text": "bold",
         "score": 0.75,
         "freq": 1
       },
       {
         "text": "bald",
         "score": 0.5,
         "freq": 1
       },
       {
         "text": "bend",
         "score": 0.5,
         "freq": 1
       },
       {
         "text": "blend",
         "score": 0.5,
         "freq": 1
       }
     ]
   }
 ]
}

In the above response, the spelling suggestions are listed in the "options" array inside the "suggest_demo_01" entry. The closest matches for the search term ("blod") are "blood" and "bold", each with a score of 0.75. The other terms in the array score 0.5, which indicates they are less close matches.
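
To make the scores less opaque, the sketch below recomputes them in Python. It assumes the score is 1 minus the edit distance divided by the length of the shorter word, with a swap of two adjacent letters counted as a single edit. That assumption reproduces the five scores in the response above, but it is an illustration and not a statement of Elasticsearch's internal scoring code. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Edit distance where swapping two adjacent letters counts as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every letter of a
    for j in range(cols):
        d[0][j] = j  # insert every letter of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "lo" <-> "ol".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def assumed_score(query, candidate):
    """Assumed formula: 1 - distance / length of the shorter word."""
    return 1 - edit_distance(query, candidate) / min(len(query), len(candidate))


for word in ["blood", "bold", "bald", "bend", "blend"]:
    print(word, edit_distance("blod", word), assumed_score("blod", word))
```

Under this assumption, "blood" (one inserted letter) and "bold" (one swap) are each a single edit away from "blod", while "bald", "bend", and "blend" each need two edits.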

Case 2 - Multi Term Suggest Request

Another useful feature is the multi term suggest request, which packs two or more term suggest requests into the same suggest query. Here is how it is done:

curl -XPOST localhost:9200/term-suggest/_suggest -d '{
 "suggest_demo_01": {
   "text": "blod",
   "term": {
     "field": "name"
   }
 },
 "suggest_demo_02": {
   "text": "blad",
   "term": {
     "field": "name"
   }
 }
}'

In the above example, both suggestions check spelling against the same field. We can do the same across multiple fields by changing the "field" value inside "suggest_demo_02" to any other field.
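
As a rough sketch of that idea, the Python helper below assembles one suggest body from several (name, text, field) entries, so each named suggestion can point at its own field. The helper name build_suggest_body is hypothetical, and the "title" field is a made-up example, since the sample documents in this post only contain "name".

```python
import json


def build_suggest_body(requests):
    """Build one suggest body from (name, text, field) tuples (hypothetical helper)."""
    body = {}
    for name, text, field in requests:
        # Each suggestion needs its own identifier, because results are keyed by it.
        if name in body:
            raise ValueError(f"duplicate suggestion name: {name}")
        body[name] = {"text": text, "term": {"field": field}}
    return body


body = build_suggest_body([
    ("suggest_demo_01", "blod", "name"),
    # "title" is a made-up field for illustration only.
    ("suggest_demo_02", "blad", "title"),
])
print(json.dumps(body, indent=1))
```

The printed JSON has the same shape as the request body in the curl command above, with only the second "field" value changed.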

The response to the two-suggestion curl request above is:

{
 "_shards": {
   "total": 5,
   "successful": 5,
   "failed": 0
 },
 "suggest_demo_02": [
   {
     "text": "blad",
     "offset": 0,
     "length": 4,
     "options": [
       ...
     ]
   }
 ],
 "suggest_demo_01": [
   {
     "text": "blod",
     "offset": 0,
     "length": 4,
     "options": [
       ...
     ]
   }
 ]
}

We can see from the above response that the results are now returned as two separate arrays, "suggest_demo_01" and "suggest_demo_02", one for each named suggestion.
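
Because the arrays in the response above do not come back in the order they were requested, it is safer to read them by name than by position. The Python sketch below shows one way to do that. The function name corrections_by_name is hypothetical. The sample response is hand-written: the options for "suggest_demo_01" are the top two from Case 1, and the options for "suggest_demo_02" are left empty because the output above elides them.

```python
def corrections_by_name(response):
    """Map each suggestion name to {input text: [suggested texts]} (hypothetical helper)."""
    result = {}
    for name, entries in response.items():
        if name == "_shards":
            continue  # shard statistics, not a suggestion
        result[name] = {
            entry["text"]: [option["text"] for option in entry["options"]]
            for entry in entries
        }
    return result


# Hand-written sample in the shape of the response above (not captured output).
sample_response = {
    "_shards": {"total": 5, "successful": 5, "failed": 0},
    "suggest_demo_02": [
        # Options left empty: the original output elides them.
        {"text": "blad", "offset": 0, "length": 4, "options": []},
    ],
    "suggest_demo_01": [
        {"text": "blod", "offset": 0, "length": 4, "options": [
            {"text": "blood", "score": 0.75, "freq": 1},
            {"text": "bold", "score": 0.75, "freq": 1},
        ]},
    ],
}

corrections = corrections_by_name(sample_response)
print(corrections["suggest_demo_01"])
```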

Case 3 - Term Suggest as Part of a Search Request

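The term suggester can also be used as part of a search request. In all of the above cases, the suggest requests were sent to the "_suggest" endpoint. The same suggestions can be requested through the "_search" endpoint by placing them under a "suggest" key. Later Elasticsearch releases deprecated and then removed the standalone "_suggest" endpoint, so this is the form to use there. One such query is given below: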

curl -XPOST localhost:9200/term-suggest/_search -d '{
 "query": {
   "term": {
     "name": "blood"
   }
 },
 "suggest": {
   "suggest-term-01": {
     "text": "blod",
     "term": {
       "field": "name"
     }
   }
 }
}'

In the above request, the "query" section searches for the term "blood" and the "suggest" section carries the suggestion text. The thing to note here is that the results of the suggest part are completely independent of the query: the term suggester does not take the query into account.
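
Since the two parts are independent, a client can read them separately from the same response. The Python sketch below illustrates this under stated assumptions: search hits sit under "hits" and suggestions under "suggest" in the search response, the function name split_search_response is hypothetical, and the sample response is a trimmed, hand-written example based on the Case 1 data, not captured output.

```python
def split_search_response(search_response, suggestion_name):
    """Return (hit sources, suggested texts), each read independently (hypothetical helper)."""
    hits = [hit["_source"] for hit in search_response["hits"]["hits"]]
    entries = search_response.get("suggest", {}).get(suggestion_name, [])
    suggestions = [option["text"] for entry in entries for option in entry["options"]]
    return hits, suggestions


# Trimmed, hand-written sample response (an assumption, not real output).
sample_response = {
    "hits": {"hits": [{"_id": "5", "_source": {"name": "blood"}}]},
    "suggest": {
        "suggest-term-01": [
            {"text": "blod", "offset": 0, "length": 4, "options": [
                {"text": "blood", "score": 0.75, "freq": 1},
                {"text": "bold", "score": 0.75, "freq": 1},
            ]},
        ],
    },
}

hits, suggestions = split_search_response(sample_response, "suggest-term-01")
print("hits:", hits)
print("suggestions:", suggestions)
```

The hits come from the query for "blood", while the suggestions come from the text "blod"; changing one would not affect the other.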

Conclusion

In this tutorial, we explored one feature of the suggest API: term suggestion. We also went over several use cases, including how to pass multiple suggestions in one request and how to send them through the search endpoint.
