Scripting is perhaps the most useful and powerful feature of Elasticsearch.

Elasticsearch users can perform a variety of operations by enabling and using the scripting module. Scripts can be used for a broad range of tasks, such as returning specific fields in a search request or modifying specific elements in a field.

In this article, we explore general scripting in Elasticsearch by introducing you to the basics of scripting and by reviewing some basic examples.

Scripting Languages

Releases prior to Elasticsearch 1.4 used MVEL as the default scripting language, but it was discontinued because of security vulnerabilities and a lack of adequate support from the MVEL community. From release 1.4 onward, Elasticsearch uses Groovy as the default scripting language. Groovy's syntax is close to Java's and will also look familiar to JavaScript users.

Here is a complete list of the scripting languages that Elasticsearch supports:

  1. Groovy: Groovy is the default scripting language, so no additional plugin is necessary. It is not sandboxed by default.
  2. Expression: This is another scripting language that has native support in Elasticsearch, so there's no need for a plugin. It is sandboxed by default.
  3. Mustache: Mustache also has built-in support in Elasticsearch for scripting and is likewise sandboxed by default.
  4. mvel: Although this scripting language was supported in earlier versions, it is now necessary to install an additional plugin if you want to use mvel. You can download the plugin here: https://github.com/elastic/elasticsearch-lang-mvel.
  5. JavaScript: Scripting in JavaScript also requires an external plugin to be installed. The plugin can be obtained at https://github.com/elastic/elasticsearch-lang-javascript.
  6. Python: The plugin for Python is based on Jython and is available as an external plugin here: https://github.com/elastic/elasticsearch-lang-python

Types of Scripting

There are two types of scripting available in Elasticsearch:

  • config folder: One of the most preferred ways to use scripting is to save the scripts in the config folder and later call them from a request by specifying the file name.
  • in-requests: Also known as dynamic scripting, this is the use of scripts directly within the request or code. By default, dynamic scripting is disabled in Elasticsearch. To enable the dynamic scripting facility, it's necessary to change a script setting in the elasticsearch.yml file (which you can find in the config folder) to read as follows:
script.disable_dynamic: false

Setting disable_dynamic to false turns the restriction off, which enables the dynamic scripting facility. After changing the setting, you'll need to restart the cluster to put it into effect.
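Because the setting is phrased as a negative, it is easy to misread. The short Python sketch below shows how the flag maps to behaviour. The helper name dynamic_scripting_enabled and the naive line-based parsing are assumptions made for illustration only; this is not how Elasticsearch itself reads its configuration.

```python
def dynamic_scripting_enabled(yml_text):
    """Return True when in-request (dynamic) scripts would be allowed.

    Hypothetical helper: it scans flat `key: value` lines only and is
    not a real YAML parser.
    """
    disabled = True  # dynamic scripting is described as off by default
    for raw_line in yml_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "script.disable_dynamic":
            # Only an explicit "false" lifts the restriction in this sketch.
            disabled = value.lower() != "false"
    return not disabled
```

Calling dynamic_scripting_enabled("script.disable_dynamic: false") returns True, while an empty or commented-out configuration leaves dynamic scripting off.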

Index Creation

For this post, we will be using hosted Elasticsearch on Qbox.io. If you need help setting up, refer to "Provisioning a Qbox Elasticsearch Cluster."

In the remainder of this article, we'll explore scripting with some simple tutorials. (We assume that you have an active Elasticsearch installation on your machine.) Start the service by entering this command at the terminal: sudo service elasticsearch start.

Let's begin by creating a sample index. You might give the index a name such as testindex. First, let's create a document containing the personal details (name and age), test scores, and teacher remarks for a student in a class. Here is the document containing information for a student named Bob:

{
    "personalDetails": {
        "name": "bob",
        "age": "13"
    },
    "marks": {
        "physics": 48,
        "maths": 45,
        "chemistry": 44
    },
    "remarks": [
        "hard working",
        "intelligent"
    ]
}

To create the index and store this document in it, let's send the above JSON data with the following command in the terminal:

curl -XPUT 'localhost:9200/testindex/testindex/1' -d '{
   "personalDetails": {
     "name": "bob",
     "age": "13"
   },
   "marks": {
     "physics": 48,
     "maths": 45,
     "chemistry": 44
   },
   "remarks": [
     "hard working",
     "intelligent"
   ]
 }'

In the curl call above:

  • testindex is the index name.
  • the second testindex is the type name.
  • "1" is the document ID, which in this case is the roll number of the student.
  • The curly braces contain the data set that we need to associate with the document.

After typing in the above command, we will get an acknowledgement response from the server as below:

{
  "_index": "testindex",
  "_type": "testindex",
  "_id": "1",
  "_version": 1,
  "created": true
}

NOTE: "created":true indicates that the document was created successfully. The index itself is created automatically when the first document is added to it.
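The same request can also be assembled from code. The following is a minimal Python sketch that builds, but does not send, the PUT request using only the standard library. The function name build_index_request is our own, and the host, index, type, and ID simply mirror the curl call above.

```python
import json
import urllib.request


def build_index_request(host, index, doc_type, doc_id, document):
    """Build (but do not send) the PUT request that indexes one document."""
    url = "http://{}/{}/{}/{}".format(host, index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


bob = {
    "personalDetails": {"name": "bob", "age": "13"},
    "marks": {"physics": 48, "maths": 45, "chemistry": 44},
    "remarks": ["hard working", "intelligent"],
}
request = build_index_request("localhost:9200", "testindex", "testindex", 1, bob)
# urllib.request.urlopen(request) would send it to a running cluster.
```

Keeping the construction separate from the sending makes it easy to inspect the URL and body before anything reaches the cluster.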

Basic Examples of Scripting

Now that we have seen how to create an index, we can move on to some basic examples:

  • adding a new element to an existing field
  • updating the value of an existing element in a field
  • removing an existing element from a field
  • removing an entire field

We explain each of these below.

Adding a New Element to an Existing Field

Continuing the example of Bob the student, we saw that he has taken tests in three subjects, each with a corresponding test score (mark). Let's add the score for the "English" subject to the marks field in the document by entering this command:

curl -XPOST 'localhost:9200/testindex/testindex/1/_update' -d '{
    "script" : "ctx._source.marks.english = 41"
}'

Here we use the _update endpoint to tell Elasticsearch that this is an update operation on the existing document. Inside the script, the ctx variable gives access to the document being updated, and ctx._source exposes its _source so that the script can read and modify it. Upon execution of this command, we'll get an acknowledgement such as this:

{
  "_index": "testindex",
  "_type": "testindex",
  "_id": "1",
  "_version": 2
}

The version number increases each time the document is changed. To see the document content, enter this command:

curl -XGET 'http://localhost:9200/testindex/testindex/_search?pretty=true&size=3' -d '{
  "query": {
    "match_all": {}
  }
}'

This is the response:

{
    "took" : 107,
    "timed_out" : false,
    "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
    },
    "hits" : {
        "total" : 1,
        "max_score" : 1.0,
        "hits" : [ {
            "_index" : "testindex",
            "_type" : "testindex",
            "_id" : "1",
            "_score" : 1.0,
            "_source" : {
                "personalDetails" : {
                    "name" : "bob",
                    "age" : "13"
                },
                "marks" : {
                    "physics" : 48,
                    "maths" : 45,
                    "chemistry" : 44,
                    "english" : 41
                },
                "remarks" : [ "hard working", "intelligent" ]
            }
        } ]
    }
}


In the response above, we can see that the new element english, along with its corresponding mark of 41, is now in the marks field.
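To make the role of ctx._source more concrete, here is a simplified local model of a scripted update, written in Python. It is only a sketch under our own assumptions: apply_update and add_english are hypothetical names, a Python callable stands in for the Groovy script, and the real update API does considerably more than this.

```python
import copy


def apply_update(stored, script):
    """Run `script` against a copy of the stored document and bump the version.

    `stored` is a dict with "_version" and "_source" keys; `script` is a
    Python callable standing in for the Groovy script and receives `ctx`.
    """
    ctx = {"_source": copy.deepcopy(stored["_source"])}
    script(ctx)
    return {"_version": stored["_version"] + 1, "_source": ctx["_source"]}


def add_english(ctx):
    # Python counterpart of the Groovy script: ctx._source.marks.english = 41
    ctx["_source"]["marks"]["english"] = 41


stored = {
    "_version": 1,
    "_source": {"marks": {"physics": 48, "maths": 45, "chemistry": 44}},
}
updated = apply_update(stored, add_english)
```

The script only ever touches the copy handed to it through ctx, and the result is stored as a new version of the document, which is why the acknowledgement reports version 2.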

Updating Values of an Existing Element in a Field

Now, let's suppose that the mark for the physics subject is incorrect in Bob's record. We can update the mark with a script, like this:

curl -XPOST 'localhost:9200/testindex/testindex/1/_update' -d '{"script" : "ctx._source.marks.physics = physics","params" : {"physics" : 49 }}'

The script reads the new value from the physics entry in params, so the previous mark of 48 is replaced by 49. We get this response:

{"_index":"testindex","_type":"testindex","_id":"1","_version":3}

Take note that the version number increases from 2 to 3, which indicates that the record has undergone a change.
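When such requests are generated from code, it can help to keep the script text and its parameters apart. The Python sketch below builds the same request body as the curl command above; build_update_body is a hypothetical helper name, and the body layout simply copies the one used in this article.

```python
import json


def build_update_body(script, params=None):
    """Return the JSON body for a scripted update, with optional params."""
    body = {"script": script}
    if params:
        body["params"] = params
    return json.dumps(body, sort_keys=True)


body = build_update_body("ctx._source.marks.physics = physics", {"physics": 49})
```

With this split, the script string stays the same from one request to the next and only the value in params changes.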

As before, we enter this command to see the document content:

curl -XGET 'http://localhost:9200/testindex/testindex/_search?pretty=true&size=3' -d '{
  "query": {
    "match_all": {}
  }
}'

Now we get this response:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "testindex",
        "_type": "testindex",
        "_id": "1",
        "_score": 1,
        "_source": {
          "personalDetails": {
            "name": "bob",
            "age": "13"
          },
          "marks": {
            "physics": 49,
            "maths": 45,
            "chemistry": 44,
            "english": 41
          },
          "remarks": [
            "hard working",
            "intelligent"
          ]
        }
      }
    ]
  }
}

In the _source field, where our data set resides, we can see that the mark for physics is now 49.

Removing an Existing Element of a Field

To remove the physics entry from the marks field (which removes it entirely from the document itself), we could use the following script:

curl -XPOST 'localhost:9200/testindex/testindex/1/_update' -d '{
  "script": "ctx._source.marks.remove(\"physics\")"
}'

Entering the above command returns this response, which indicates the version change:

{
  "_index": "testindex",
  "_type": "testindex",
  "_id": "1",
  "_version": 4
}

Running the same search query as before now returns this response:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "testindex",
        "_type": "testindex",
        "_id": "1",
        "_score": 1,
        "_source": {
          "personalDetails": {
            "name": "bob",
            "age": "13"
          },
          "marks": {
            "maths": 45,
            "chemistry": 44,
            "english": 41
          },
          "remarks": [
            "hard working",
            "intelligent"
          ]
        }
      }
    ]
  }
}

The physics element is now gone from the marks field.

Removing an Entire Field

We've seen how to delete an element from a field, and now we'll see how to remove the entire field itself from the document. The command is very similar to the previous one, except for the path. We point to _source and then call the remove method with the name of the field we want to remove (the marks field).

curl -XPOST 'localhost:9200/testindex/testindex/1/_update' -d '{
    "script" : "ctx._source.remove(\"marks\")"
}'

This will remove the entire marks field and return an acknowledgement that indicates the version number:

{
  "_index": "testindex",
  "_type": "testindex",
  "_id": "1",
  "_version": 5
}

Searching the index again returns this response:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "testindex",
        "_type": "testindex",
        "_id": "1",
        "_score": 1,
        "_source": {
          "personalDetails": {
            "name": "bob",
            "age": "13"
          },
          "remarks": [
            "hard working",
            "intelligent"
          ]
        }
      }
    ]
  }
}

As we expect, the marks field is no longer part of the document.
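Both removals follow the same pattern: walk to the parent object, then remove one key from it. Here is a small Python sketch of that idea on a plain dictionary. The remove_path helper and its dotted-path argument are our own illustration and are not part of Elasticsearch or Groovy.

```python
def remove_path(source, path):
    """Remove the key named by dotted `path` from `source` in place.

    Returns the removed value, or None when the key is absent, which is
    also what java.util.Map.remove returns for a missing key.
    """
    *parents, leaf = path.split(".")
    node = source
    for key in parents:
        node = node[key]
    return node.pop(leaf, None)


source = {
    "personalDetails": {"name": "bob", "age": "13"},
    "marks": {"physics": 49, "maths": 45, "chemistry": 44, "english": 41},
    "remarks": ["hard working", "intelligent"],
}
# Counterpart of ctx._source.marks.remove("physics")
removed_mark = remove_path(source, "marks.physics")
remaining_subjects = sorted(source["marks"])
# Counterpart of ctx._source.remove("marks")
removed_field = remove_path(source, "marks")
```

The only difference between the two calls is how deep the path goes, which matches the difference between the two scripts above.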

Scripting Using Separate Files

In the examples above, scripts are given in the in-request form. We can also write the same scripts in separate files, store them in the config/scripts folder with a .groovy extension, and then invoke them by name from the command line.

For example, take the scripting portion of the first example, where we added a new element to a field:

"script" : "ctx._source.marks.english = 41"

Write only the Groovy statement (ctx._source.marks.english = 41, without the surrounding JSON) in a text file, save it as example_script.groovy in the config/scripts folder, and then execute the command below:

curl -XPOST 'localhost:9200/testindex/testindex/1/_update' -d '{
  "_script" : {
    "script_id" : "example_script",
    "lang" : "groovy"
  }
}'

The lang field specifies the script language, and script_id gives the name of the file (without its extension) in which our script is stored in the config/scripts folder.
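The file-based workflow can be sketched in Python as follows. This is an illustration under stated assumptions: save_script is a hypothetical helper, a temporary directory stands in for the real config/scripts folder, and the request body simply mirrors the command above.

```python
import json
import os
import tempfile


def save_script(scripts_dir, name, source, extension="groovy"):
    """Write `source` to <scripts_dir>/<name>.<extension> and return the path."""
    os.makedirs(scripts_dir, exist_ok=True)
    path = os.path.join(scripts_dir, "{}.{}".format(name, extension))
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(source + "\n")
    return path


# A temporary directory stands in for the real config/scripts folder.
scripts_dir = os.path.join(tempfile.mkdtemp(), "config", "scripts")
script_path = save_script(scripts_dir, "example_script", "ctx._source.marks.english = 41")

# The request refers to the script by its file name without the extension.
script_name = os.path.splitext(os.path.basename(script_path))[0]
update_body = json.dumps({"_script": {"script_id": script_name, "lang": "groovy"}})
```

Note that the file holds only the Groovy statement, while the request carries just the script's name and language.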

There are two other ways to invoke scripts: the scripts API and native scripts. Since a tutorial for these would require more complex examples, we will cover them in a future article.

Conclusion

In this article, we've covered the basic Elasticsearch scripting concepts, the languages in which scripts can be written, the ways in which we can invoke a script, and some basic scripting examples. In a future article, we will cover advanced concepts of scripting along with more elaborate examples.

Got questions? Just drop us a note, and we'll get you a prompt response.
