The Suggest API is one of the most mature APIs in Elasticsearch. It is widely used in search solutions, where it can noticeably improve the user experience. Its use cases range from plain autocomplete to context-based suggestions. In this tutorial, we show how to implement a simple autocomplete with Elasticsearch.

Spin Up a Qbox Elasticsearch Cluster 

Setting up and configuring Elasticsearch can be quite the undertaking. For this reason, we prefer to spin up a Qbox cluster to handle the Elasticsearch heavy lifting. You can get started with a Qbox Cluster here. For this tutorial, we will be using Elasticsearch version 2.4.4.

Document Sample

A sample document for populating the index for this blog is given below:

{
  "fullName": "Alex Sat",
  "city": "New York"
}

Here we plan to implement the autocomplete feature on the field "city".

The data set we use in this tutorial, containing 1000 documents, can be found here.

Index Creation and Mapping

For the purpose of this tutorial, let's create an index with the name "autosuggest".

curl -XPUT "http://localhost:9200/autosuggest"

We also need to create a mapping for the above documents that contains a suggestion field:

curl -X PUT "http://localhost:9200/autosuggest/mytype/_mapping" \
-H "Content-Type: application/json" -d '{
  "mytype": {
    "properties": {
      "fullName": { "type": "string" },
      "city": { "type": "string" },
      "citySuggest": {
        "type": "completion",
        "analyzer": "simple",
        "search_analyzer": "simple",
        "payloads": true
      }
    }
  }
}'

In the above mapping, along with the general fields "fullName" and "city", we have also defined a field called "citySuggest" that holds all autosuggest options. The "citySuggest" field has the "completion" type and uses the "simple" analyzer at both index and search time.

Indexing the Documents

Next, we index the documents. Each document contains a resident's name, a city name, and a "citySuggest" object for the autocomplete to use:

curl -X PUT 'localhost:9200/autosuggest/mytype/44' -d '{
 "fullName": "Anna Reyes",
 "city": "At Tall al Kabīr",
 "citySuggest": {
   "input": [
     "At",
     "Tall",
     "al",
     "Kabir"
   ],
   "output": "At Tall al Kabīr"
 }
}'

The "citySuggest" object contains two fields named "input" and "output". The "input" array holds the space-delimited words of the field "city". The "output" field holds the string that should be returned as the suggestion, here the full value of the field "city".
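As a rough sketch of how such an object could be generated from a city name, the Python helper below splits the name on whitespace and strips diacritics from each word, which is one way "Kabīr" could end up as "Kabir" in the inputs above. The function names `fold_accents` and `build_city_suggest` and the accent-folding step are assumptions made for illustration; the original indexing code may work differently.

```python
import unicodedata


def fold_accents(text):
    """Return text with combining marks removed (for example, macrons and acute accents)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_city_suggest(city):
    """Build a 2.x-style completion object: one input per word, full city as output.

    Assumption: inputs are the whitespace-separated words with accents folded.
    """
    tokens = [fold_accents(token) for token in city.split()]
    return {"input": tokens, "output": city}


if __name__ == "__main__":
    print(build_city_suggest("At Tall al Kabīr"))
```

Folding accents in the inputs lets a user who types plain ASCII still reach the accented city name, while the output keeps the original spelling.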

The code for indexing the entire data set can be found in this repo.
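The repository code is not reproduced here. As one possible approach, the sketch below builds a request body for the Elasticsearch `_bulk` endpoint, in which every document is preceded by an action line and the body ends with a newline. The helper name `build_bulk_body` is hypothetical, and the documents are assumed to already carry their "citySuggest" object.

```python
import json


def build_bulk_body(docs, index="autosuggest", doc_type="mytype"):
    """Return a newline-delimited JSON string for the _bulk endpoint.

    docs is an iterable of (doc_id, source_dict) pairs.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch where to index the next line.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Source line: the document itself, kept on a single line.
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = {
        "fullName": "Anna Reyes",
        "city": "At Tall al Kabīr",
        "citySuggest": {
            "input": ["At", "Tall", "al", "Kabir"],
            "output": "At Tall al Kabīr",
        },
    }
    print(build_bulk_body([("44", sample)]), end="")
```

The resulting text could be saved to a file and sent to the cluster's `_bulk` endpoint, for example with curl's `--data-binary` option.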

How Auto Suggestion Works

In the above section, you saw how the documents were indexed with an extra field that contains the "input" list and the "output" value. This structure feeds the finite state transducer (FST) that the completion suggester uses to look up suggestions. An FST has two tapes, the input tape and the output tape, and the transducer defines a binary relation between input and output values. In the above indexing example, we can see such a relation: any of the inputs "At", "Tall", "al", or "Kabir" is related to the output "At Tall al Kabīr".
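To make the input/output relation concrete, here is a toy Python model of it. It is not how Lucene's FST is implemented: it uses a plain dictionary and a linear prefix scan, and the class name `ToySuggester` is made up for this illustration. Inputs and queries are lowercased, roughly mimicking what the "simple" analyzer does.

```python
class ToySuggester:
    """Toy stand-in for the completion lookup: input words map to output strings."""

    def __init__(self):
        self._outputs_by_input = {}

    def add(self, inputs, output):
        """Relate every input word to the given output string."""
        for token in inputs:
            self._outputs_by_input.setdefault(token.lower(), set()).add(output)

    def suggest(self, prefix):
        """Return, sorted, all outputs that have an input starting with prefix."""
        prefix = prefix.lower()
        matches = set()
        for token, outputs in self._outputs_by_input.items():
            if token.startswith(prefix):
                matches |= outputs
        return sorted(matches)


if __name__ == "__main__":
    suggester = ToySuggester()
    suggester.add(["At", "Tall", "al", "Kabir"], "At Tall al Kabīr")
    suggester.add(["Fray", "Luis", "A.", "Beltran"], "Fray Luis A. Beltrán")
    suggester.add(["Luisiana"], "Luisiana")
    print(suggester.suggest("luis"))
```

A real FST shares prefixes and suffixes between entries so that lookups stay fast and compact; the dictionary here only shows which inputs lead to which output.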

We can query such an occurrence with the query below:

curl -XPOST "http://localhost:9200/autosuggest/_suggest" \
-H "Content-Type: application/json" -d '{
 "city-suggest": {
   "text": "luis",
   "completion": {
     "field": "citySuggest"
   }
 }
}'

The query above returns the following response:

{
 "_shards": {
   "total": 5,
   "successful": 5,
   "failed": 0
 },
 "city-suggest": [
   {
     "text": "luis",
     "offset": 0,
     "length": 4,
     "options": [
       {
         "text": "Fray Luis A. Beltrán",
         "score": 1
       },
       {
         "text": "Luisiana",
         "score": 1
       }
     ]
   }
 ]
}
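An application usually needs only the suggestion strings from this response. The following Python sketch pulls them out of an already parsed response body; the function name `suggestion_texts` is an illustrative assumption, and the code relies only on the response shape shown above.

```python
def suggestion_texts(response, name="city-suggest"):
    """Collect the option texts of one named suggester, in response order."""
    texts = []
    for entry in response.get(name, []):
        for option in entry.get("options", []):
            texts.append(option["text"])
    return texts


if __name__ == "__main__":
    response = {
        "_shards": {"total": 5, "successful": 5, "failed": 0},
        "city-suggest": [
            {
                "text": "luis",
                "offset": 0,
                "length": 4,
                "options": [
                    {"text": "Fray Luis A. Beltrán", "score": 1},
                    {"text": "Luisiana", "score": 1},
                ],
            }
        ],
    }
    print(suggestion_texts(response))
```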

In the above response, each suggestion is a city name in which one of the words starts with "luis". If we replace "luis" with "At", the result set contains city names in which one of the words starts with "at". This works because every word of the city name was indexed as a separate input, and the "simple" analyzer lowercases both the inputs and the query text.

Note: The _suggest endpoint has been deprecated in favour of running suggesters through the _search endpoint. In Elasticsearch 5.0, the _search endpoint was optimized for suggest-only search requests. This tutorial is based on Elasticsearch 2.4. Please consult the official documentation for your Elasticsearch version to see the available options and the changes from earlier versions.
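For readers on a newer version, the request body sent to _search nests the suggester under a top-level "suggest" key. The Python helper below is a hedged sketch of building such a body; the function name `build_search_suggest_body` is hypothetical, and the use of the "prefix" key is an assumption based on the 5.x completion suggester documentation, so verify it against the documentation for your version.

```python
import json


def build_search_suggest_body(prefix, field="citySuggest", name="city-suggest"):
    """Build a suggest-only request body for the _search endpoint (5.x style)."""
    return {
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {"field": field},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_search_suggest_body("luis"), indent=2))
```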

Conclusion

In this tutorial, we have covered the most basic use of the Elasticsearch Suggest API, along with some examples. In future tutorials, we will cover phrase suggesters and context-based autocomplete.