In a previous post, How to Build an Autocomplete Feature with Elasticsearch, we showed how to build a simple autosuggest in Elasticsearch. In this post, we explore context-based autosuggest and show how to implement it.

Context Autocomplete

We have seen how the completion suggester works, and how its use of finite state transducers (FSTs) makes it faster than other approaches to autocomplete, such as prefix queries and n-grams. The simple completion suggester has a limitation, however: it does not take into account the context in which a suggestion is needed.

Consider a searchable website that lists manufacturers across different categories, such as mobiles, drones, and pesticides. When a user selects the "mobile" category, the search is limited to that category, and the autosuggest should be limited to it as well. The basic completion suggester, however, builds its FSTs over all of the available data, which is memory intensive and time consuming.

As a solution, Elasticsearch created the "context" based autosuggestion, so that the suggestion generation can be limited to only a specific data set. Let us explore how it is done in the coming sections.

Path Based Context Autocomplete

Suppose we have an index consisting of manufacturers of different products. For the sake of this tutorial, let us consider two types of products, "mobiles" and "drones". Create the index first, like below:

curl -XPUT localhost:9200/manufacturers

Next, create two types in this index, one for "mobiles" and the other for "drones". Each type gets its own mapping, as below:

For the Type Mobiles

curl -XPUT localhost:9200/manufacturers/mobiles/_mapping -d '{
 "mobiles": {
   "properties": {
     "manufacturer_suggest": {
       "type": "completion",
       "context": {
         "type": {
           "type": "category",
           "path": "_type"
         }
       }
     }
   }
 }
}'

For the Type Drones

curl -XPUT localhost:9200/manufacturers/drones/_mapping -d '{
 "drones": {
   "properties": {
     "manufacturer_suggest": {
       "type": "completion",
       "context": {
         "type": {
           "type": "category",
           "path": "_type"
         }
       }
     }
   }
 }
}'

In the above mappings, the completion field carries an extra parameter named "context". Inside it we declare a context called "type", set its type to "category", and set its path to "_type", so that the context value for each document is read from the document's "_type" field.
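
Since the two mappings differ only in the type name, they can be generated from one template. The following is a minimal sketch; context_mapping is a hypothetical helper written for this post, not part of Elasticsearch, and it only prints the JSON body.

```shell
# Sketch: emit the completion mapping for a given type so the JSON is written once.
# context_mapping is a hypothetical helper, not an Elasticsearch command.
context_mapping() {
  local type_name="$1"
  cat <<EOF
{
  "${type_name}": {
    "properties": {
      "manufacturer_suggest": {
        "type": "completion",
        "context": {
          "type": {
            "type": "category",
            "path": "_type"
          }
        }
      }
    }
  }
}
EOF
}

# Print the mapping body for each type used in this tutorial.
for t in mobiles drones; do
  context_mapping "$t"
done
```

To apply a mapping instead of printing it, the output can be piped to curl, for example: context_mapping mobiles | curl -XPUT localhost:9200/manufacturers/mobiles/_mapping -d @-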

Data Indexing

For this example, we index 10 documents under each type.

An example for the document under the type "mobiles" is below:

curl -XPUT localhost:9200/manufacturers/mobiles/1 -d '{
 "manufacturer": "apple",
 "manufacturer_suggest": {
   "input": [
     "apple"
   ]
 }
}'

An example for the document under the type "drones" is given below:

curl -XPUT localhost:9200/manufacturers/drones/1 -d '{
 "manufacturer": "dji",
 "manufacturer_suggest": {
   "input": [
     "dji"
   ]
 }
}'

The full set of data documents can be found here.
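
Indexing documents one at a time gets tedious, so the requests can be combined with the bulk API. The sketch below builds a bulk body for one type; bulk_body is a hypothetical helper, the names passed to it are sample values rather than the original data set, and it assumes the names contain no characters that need JSON escaping.

```shell
# Sketch: build a _bulk request body that indexes several manufacturers into one type.
# bulk_body is a hypothetical helper; IDs are assigned sequentially starting at 1.
bulk_body() {
  local type_name="$1"
  shift
  local id=1
  local name
  for name in "$@"; do
    # Action line, then the document itself; each must end with a newline.
    printf '{"index":{"_index":"manufacturers","_type":"%s","_id":"%s"}}\n' "$type_name" "$id"
    printf '{"manufacturer":"%s","manufacturer_suggest":{"input":["%s"]}}\n' "$name" "$name"
    id=$((id + 1))
  done
}

# Print the bulk body for two sample mobile manufacturers.
bulk_body mobiles apple samsung
```

To send the body, pipe it to curl with --data-binary so the newlines are preserved: bulk_body mobiles apple samsung | curl -XPOST localhost:9200/_bulk --data-binary @-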

Suggest

After indexing the data set, we can use the suggest API on a specific type. In the full data set, both types contain documents whose "manufacturer_suggest" value starts with "h". A simple completion suggest for "h" would return documents from both types. We do not want this, so we restrict the suggestions to a single type, say "mobiles". This can be done with the following query:

curl -XPOST localhost:9200/manufacturers/_suggest -d '{
 "suggest": {
   "text": "h",
   "completion": {
     "field": "manufacturer_suggest",
     "size": 10,
     "context": {
       "type": "mobiles"
     }
   }
 }
}'

The results for the above query are taken only from the type "mobiles".
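
On a real site, the category comes from whatever the user has selected, so the request body is usually built from two inputs: the typed prefix and the chosen category. A minimal sketch, assuming a hypothetical helper named suggest_body and inputs that need no JSON escaping:

```shell
# Sketch: build the suggest request body for a typed prefix and a selected category.
# suggest_body is a hypothetical helper; it only prints JSON and sends nothing.
suggest_body() {
  local text="$1"
  local type_name="$2"
  cat <<EOF
{
  "suggest": {
    "text": "${text}",
    "completion": {
      "field": "manufacturer_suggest",
      "size": 10,
      "context": {
        "type": "${type_name}"
      }
    }
  }
}
EOF
}

# Same prefix as above, but restricted to the "drones" type this time.
suggest_body h drones
```

The output can be piped to the same endpoint used above: suggest_body h drones | curl -XPOST localhost:9200/manufacturers/_suggest -d @-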

Geo Based Context Autosuggestion

In the above section we saw context-based autosuggestion based on path (types). Elasticsearch also supports context-based autosuggest on geographical fields. For example, if we are at a particular location, say Paris, and we search for a hotel, we would prefer hotels close to our current location. This kind of suggest is called geo-based context autosuggestion. Let us have a look at the implementation.

Consider an index that holds hotel information:

curl -XPUT localhost:9200/hotels

We want to implement a geolocation-based autosuggest feature. For this we need a radius of coverage within which Elasticsearch returns suggestions. In this case, take a distance of about 10-12 km as the radius. To achieve this, the geohash grid should have a "precision" parameter of about 5km. We apply this to the mapping as below:

curl -XPUT localhost:9200/hotels/location/_mapping -d '{
 "location": {
   "properties": {
     "hotel_suggest": {
       "type": "completion",
       "context": {
         "location": {
           "type": "geo",
           "precision": "5km"
         }
       }
     }
   }
 }
}'

The above will generate a geohash grid that covers a radius of approximately 10-12 km.
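
To get a feel for these numbers, it helps to look at how large a geohash cell is at each hash length. The sketch below computes approximate cell sizes at the equator using plain geohash arithmetic. How Elasticsearch rounds a "5km" precision to a hash length is not covered in this post, so treat the link between the two as an assumption; geohash_cell_km is a hypothetical helper.

```shell
# Sketch: approximate geohash cell size (width x height, in km) at the equator.
# A geohash of length n encodes 5n bits, alternating longitude and latitude,
# so longitude gets the extra bit when 5n is odd.
geohash_cell_km() {
  awk -v n="$1" 'BEGIN {
    bits = 5 * n
    lon_bits = int((bits + 1) / 2)
    lat_bits = int(bits / 2)
    km_per_deg = 111.195   # approx. km per degree on a sphere of radius 6371 km
    printf "%.2f x %.2f\n", 360 / 2^lon_bits * km_per_deg, 180 / 2^lat_bits * km_per_deg
  }'
}

for n in 4 5 6; do
  echo "length $n: $(geohash_cell_km "$n") km"
done
```

A hash length of 5 gives cells of roughly 4.9 km on each side, which is the closest fit to a 5km precision. If neighbouring cells are matched as well, the area covered around a point grows to a few cells across.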

Index a few documents containing the hotel information and the location. In the documents below, the first three are within the 10 km grid and the last one is outside the 10 km range.

Set-1 Indexing Hotels within 10 to 12 KM Range

Document 01
curl -XPOST localhost:9200/hotels/location/2 -d '{
 "hotel_suggest": {
   "input": [
     "best"
   ],
   "context": {
     "location": {
       "lat": 10.0221731,
       "lon": 76.3345504
     }
   }
 }
}'


Document 02
curl -XPOST localhost:9200/hotels/location/3 -d '{
 "hotel_suggest": {
   "input": [
     "goldsteins"
   ],
   "context": {
     "location": {
       "lat": 10.0231731,
       "lon": 76.3346544
     }
   }
 }
}'


Document 03
curl -XPOST localhost:9200/hotels/location/4 -d '{
 "hotel_suggest": {
   "input": [
     "galore"
   ],
   "context": {
     "location": {
       "lat": 10.0231720,
       "lon": 76.3336655
     }
   }
 }
}'

Set-2 Indexing Hotels Above 12 KM Range

curl -XPOST localhost:9200/hotels/location/5 -d '{
 "hotel_suggest": {
   "input": [
     "gamble"
   ],
   "context": {
     "location": {
       "lat": 10.5334732,
       "lon": 76.3345504
     }
   }
 }
}'

Now construct a suggest query with the prefix "g" and a location inside the first data set (close to the first three hotels), as below:

curl -XPOST localhost:9200/hotels/_suggest -d '{
 "suggest": {
   "text": "g",
   "completion": {
     "field": "hotel_suggest",
     "context": {
       "location": {
         "value": {
           "lat": 10.0231731,
           "lon": 76.3346544
         }
       }
     }
   }
 }
}'

As the result of this query, we see suggestions only for the hotels from set-1, as below:

{
 "_shards": {
   "total": 5,
   "successful": 5,
   "failed": 0
 },
 "suggest": [
   {
     "text": "g",
     "offset": 0,
     "length": 1,
     "options": [
       {
         "text": "galore",
         "score": 1
       },
       {
         "text": "goldsteins",
         "score": 1
       }
     ]
   }
 ]
}

From the results, it is clear that the hotel beyond the 12 km range was omitted even though its name starts with the letter "g".
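
As a sanity check on the test data, the distance between the query location and each hotel can be computed directly. The sketch below uses the haversine formula on a sphere of radius 6371 km, which is an approximation; haversine_km is a hypothetical helper and is unrelated to how Elasticsearch matches geohash cells internally.

```shell
# Sketch: great-circle (haversine) distance in km between two lat/lon points.
haversine_km() {
  awk -v lat1="$1" -v lon1="$2" -v lat2="$3" -v lon2="$4" 'BEGIN {
    pi = atan2(0, -1)
    rad = pi / 180
    dlat = (lat2 - lat1) * rad
    dlon = (lon2 - lon1) * rad
    a = sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
    printf "%.1f\n", 2 * 6371 * atan2(sqrt(a), sqrt(1 - a))
  }'
}

# Query location vs. "gamble" (set-2), then vs. "galore" (set-1).
haversine_km 10.0231731 76.3346544 10.5334732 76.3345504
haversine_km 10.0231731 76.3346544 10.0231720 76.3336655
```

With the coordinates used in this post, "gamble" is roughly 56.7 km from the query location, while "galore" is only about 0.1 km away, which matches the result above.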

Conclusion

In this tutorial, we explained completion autosuggest that takes the context of the search into account. We explored two kinds of context: path-based and geo-based. In the next post in this series, we explore the phrase suggester API, used for "did you mean" type suggestions in Elasticsearch.
