Before setting up Elasticsearch for entity extraction, it is worth looking at how this became such an easy task. There is a lot of buzz around the new Ingest API that shipped with Elasticsearch 5.x.
The Ingest API allows data manipulation and enrichment by defining a pipeline that every document passes through before it is indexed. A pipeline is built from a set of processors, each of which performs a specific task that enriches our data. A typical example is the grok processor, which lets you turn unstructured log lines into structured fields using pattern matching. Elasticsearch 5 ships with many built-in processors, which are described in the official Elasticsearch documentation.
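To make the idea of a pipeline more concrete, here is a minimal sketch of a pipeline definition with a single grok processor, written in Python using only the standard library. The field name `message`, the log format behind the grok pattern, and the helper names `build_grok_pipeline` and `put_pipeline` are assumptions made for illustration, not something prescribed by Elasticsearch.

```python
import json
import urllib.request


def build_grok_pipeline():
    """Return the body of an ingest pipeline with a single grok processor."""
    return {
        "description": "Parse simple access-log lines (illustrative example)",
        "processors": [
            {
                "grok": {
                    # Field of the incoming document that holds the raw log line
                    # (assumed here to be called "message").
                    "field": "message",
                    # Grok expressions, tried in order until one matches.
                    "patterns": [
                        "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} "
                        "%{NUMBER:bytes} %{NUMBER:duration}"
                    ],
                }
            }
        ],
    }


def put_pipeline(base_url, pipeline_id, body):
    """Send PUT <base_url>/_ingest/pipeline/<pipeline_id> and return the parsed reply."""
    request = urllib.request.Request(
        url=f"{base_url}/_ingest/pipeline/{pipeline_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build the pipeline body and show the JSON that would be sent to the cluster.
pipeline_body = build_grok_pipeline()
print(json.dumps(pipeline_body, indent=2))
```

Assuming a node is reachable at `http://localhost:9200`, calling `put_pipeline("http://localhost:9200", "access-log", pipeline_body)` would register the pipeline under the hypothetical id `access-log`. Documents indexed with the `pipeline=access-log` query parameter would then pass through the grok processor before being stored.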