In this post we will walk through the basics of using ngrams in Elasticsearch.

Wikipedia has this to say about ngrams:

In the fields of computational linguistics and probability, an n-gram is a contiguous sequence of n items from a given sequence of text or speech. The items can be phonemes, syllables, letters, words or base pairs according to the application. The n-grams typically are collected from a text or speech corpus. When the items are words, n-grams may also be called shingles.

In the fields of machine learning and data mining, "ngram" will often refer to a sequence of n words. In Elasticsearch, however, an "ngram" is a sequence of n characters. There are various ways these sequences can be generated and used. We'll take a look at some of the most common.
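
To make the word-versus-character distinction concrete, here is a small Python sketch. It is only an illustration of the idea, not Elasticsearch's implementation: the function names `word_ngrams` and `char_ngrams` and the parameters `min_gram` and `max_gram` are hypothetical names chosen for this example.

```python
def word_ngrams(text, n):
    """Return every run of n consecutive words in text (word-level n-grams)."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def char_ngrams(text, min_gram, max_gram):
    """Return every run of min_gram..max_gram consecutive characters in text."""
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            # Skip sizes that would run past the end of the string.
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


# Word-level n-grams, as the term is often used in machine learning:
print(word_ngrams("the quick brown fox", 2))
# ['the quick', 'quick brown', 'brown fox']

# Character-level n-grams, closer to what the term means in Elasticsearch:
print(char_ngrams("quick", 2, 3))
# ['qu', 'qui', 'ui', 'uic', 'ic', 'ick', 'ck']
```

The character-level version produces many more tokens per word, which is worth keeping in mind when thinking about index size.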

