Painless uses a Java-style syntax that is similar to Groovy. In fact, most Painless scripts are also valid Groovy, and simple Groovy scripts are typically valid Painless. (This specification assumes you have at least a passing familiarity with Java and related languages.)

Painless is essentially a subset of Java with some additional scripting language features that make scripts easier to write. However, there are some important differences, particularly with the casting model. For more detailed conceptual information about the basic constructs that Java and Painless share, refer to the corresponding topics in the Java Language Specification.

Painless scripts are parsed and compiled using the ANTLR4 and ASM libraries. They are compiled directly into Java bytecode and executed against a standard Java Virtual Machine. This specification uses ANTLR4 grammar notation to describe the allowed syntax. However, the actual Painless grammar is more compact than that shown here.

Painless is a simple and secure scripting language designed specifically for use with Elasticsearch. It is the default scripting language for Elasticsearch and can safely be used for inline and stored scripts.

For this post, we will be using hosted Elasticsearch on Qbox.io. You can sign up or launch your cluster here, or click "Get Started" in the header navigation. If you need help setting up, refer to "Provisioning a Qbox Elasticsearch Cluster."

We can use Painless anywhere scripts can be used in Elasticsearch. Painless features include:

  • Fast performance: Painless scripts run several times faster than the alternatives.
  • Syntax: Extends Java’s syntax to provide Groovy-style scripting language features that make scripts easier to write.
  • Safety: Fine-grained whitelist with method call/field granularity. (See the Painless API Reference for a complete list of available classes and methods.)
  • Optional typing: Variables and parameters can use explicit types or the dynamic def type.
  • Optimizations: Designed specifically for Elasticsearch scripting.

Let's illustrate how Painless works by loading some academic stats into an Elasticsearch index:

curl -XPUT 'ES_HOST:ES_PORT/academics/student/_bulk?refresh&pretty' -H 'Content-Type: application/json' -d'
{"index":{"_id":1}}
{"first":"Agatha","last":"Christie","base_score":[9,27,1],"target_score":[17,46,0],"grade_point_index":[26,82,1],"born":"1978/08/13"}
{"index":{"_id":2}}
{"first":"Alan","last":"Moore","base_score":[7,54,26],"target_score":[11,26,13],"grade_point_index":[26,82,82],"born":"1976/10/12"}
{"index":{"_id":3}}
{"first":"Henrik","last":"Ibsen","base_score":[5,34,36],"target_score":[11,62,42],"grade_point_index":[24,80,79],"born":"1983/01/04"}
{"index":{"_id":4}}
{"first":"William","last":"Blake","base_score":[4,6,15],"target_score":[8,23,15],"grade_point_index":[26,82,82],"born":"1990/02/17"}
{"index":{"_id":5}}
{"first":"Shaun","last":"Tan","base_score":[5,0,0],"target_score":[8,1,0],"grade_point_index":[26,1,0],"born":"1993/06/20"}
{"index":{"_id":6}}
{"first":"Peter","last":"Hitchens","base_score":[0,26,15],"target_score":[11,30,24],"grade_point_index":[26,81,82],"born":"1969/03/20"}
{"index":{"_id":7}}
{"first":"Raymond","last":"Carver","base_score":[7,19,5],"target_score":[3,17,4],"grade_point_index":[26,45,34],"born":"1963/08/10"}
{"index":{"_id":8}}
{"first":"Lee","last":"Child","base_score":[2,14,7],"target_score":[8,42,30],"grade_point_index":[26,82,82],"born":"1992/06/07"}
{"index":{"_id":39}}
{"first":"Joseph","last":"Heller","base_score":[6,30,15],"target_score":[3,30,24],"grade_point_index":[26,60,63],"born":"1984/10/03"}
{"index":{"_id":10}}
{"first":"Harper","last":"Lee","base_score":[3,15,13],"target_score":[6,24,18],"grade_point_index":[26,82,82],"born":"1976/03/17"}
{"index":{"_id":11}}
{"first":"Ian","last":"Fleming","base_score":[3,18,13],"target_score":[6,20,24],"grade_point_index":[26,67,82],"born":"1972/01/30"}'

Accessing Doc Values from Painless

Document values can be accessed from a Map named doc. For example, the following script calculates a student’s total base score by summing the values in the base_score field. This example uses a strongly typed int and a for loop.

curl -XGET 'ES_HOST:ES_PORT/academics/_search?pretty' -H 'Content-Type: application/json' -d '{
 "query": {
   "function_score": {
     "script_score": {
       "script": {
         "lang": "painless",
         "inline": "int total = 0; for (int i = 0; i < doc[\"base_score\"].length; ++i) { total += doc[\"base_score\"][i]; } return total;"
       }
     }
   }
 }
}'
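
To make the arithmetic concrete, here is a minimal Java sketch of the same loop. It is an illustration only: a plain long[] stands in for doc['base_score'], and the class and method names are hypothetical rather than part of any Elasticsearch API.

```java
// Sketch only: a long[] stands in for the doc values of base_score.
final class BaseScoreTotal {
    // Mirrors the script: a strongly typed int accumulator and a for loop.
    static int total(long[] baseScore) {
        int total = 0;
        for (int i = 0; i < baseScore.length; ++i) {
            total += baseScore[i]; // compound assignment narrows the long to int
        }
        return total;
    }

    public static void main(String[] args) {
        // Student 1 (Agatha Christie) has base_score [9, 27, 1].
        System.out.println(total(new long[] {9, 27, 1})); // prints 37
    }
}
```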

Alternatively, we could do the same using a script field instead of a function score: 

curl -XGET 'ES_HOST:ES_PORT/academics/_search?pretty' -H 'Content-Type: application/json' -d '{
 "query": {
   "match_all": {}
 },
 "script_fields": {
   "total_base_score": {
     "script": {
       "lang": "painless",
       "inline": "int total = 0; for (int i = 0; i < doc[\"base_score\"].length; ++i) { total += doc[\"base_score\"][i]; } return total;"
     }
   }
 }
}'

The following example uses a Painless script to sort the students by their combined first and last names. Because the names are text fields, the script reads them from their keyword sub-fields using doc['first.keyword'].value and doc['last.keyword'].value.

curl -XGET 'ES_HOST:ES_PORT/academics/_search?pretty' -H 'Content-Type: application/json' -d '{
 "query": {
   "match_all": {}
 },
 "sort": {
   "_script": {
     "type": "string",
     "order": "asc",
     "script": {
       "lang": "painless",
       "inline": "doc[\"first.keyword\"].value + \" \" + doc[\"last.keyword\"].value"
     }
   }
 }
}'
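
The sort key the script builds is just the two names joined by a space. The Java sketch below reproduces that key and sorts a few of the students from the sample data by it. The class and method names are hypothetical, and Java's natural String ordering is assumed to agree with Elasticsearch's string sort for these plain ASCII names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch only: each student is a {first, last} pair instead of an indexed document.
final class NameSort {
    // Builds the same key as the script: first + " " + last.
    static String sortKey(String first, String last) {
        return first + " " + last;
    }

    // Returns the combined names in ascending order, like "order": "asc".
    static List<String> sortedKeys(String[][] students) {
        List<String> keys = new ArrayList<>();
        for (String[] student : students) {
            keys.add(sortKey(student[0], student[1]));
        }
        keys.sort(Comparator.naturalOrder());
        return keys;
    }

    public static void main(String[] args) {
        String[][] students = {{"William", "Blake"}, {"Alan", "Moore"}, {"Agatha", "Christie"}};
        System.out.println(sortedKeys(students)); // prints [Agatha Christie, Alan Moore, William Blake]
    }
}
```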

Updating Fields with Painless

We can also easily update fields by accessing the original source of a field as ctx._source.<field-name>.

First, let’s look at the source data for a student by submitting the following request: 

curl -XGET 'ES_HOST:ES_PORT/academics/_search?pretty' -H 'Content-Type: application/json' -d '{
 "stored_fields": [
   "_id",
   "_source"
 ],
 "query": {
   "term": {
     "_id": 1
   }
 }
}'

In order to change student 1’s last name to ‘Frost’, simply set ctx._source.last to the new value: 

curl -XPOST 'ES_HOST:ES_PORT/academics/student/1/_update?pretty' -H 'Content-Type: application/json' -d '{
 "script": {
   "lang": "painless",
   "inline": "ctx._source.last = params.last",
   "params": {
     "last": "Frost"
   }
 }
}'

We can also add fields to a document. For example, this script sets the student’s last name to “Smith” and adds a new field, nick, that contains the student’s nickname, “JS”.

curl -XPOST 'ES_HOST:ES_PORT/academics/student/1/_update?pretty' -H 'Content-Type: application/json' -d '{
 "script": {
   "lang": "painless",
   "inline": "ctx._source.last = params.last; ctx._source.nick = params.nick",
   "params": {
     "last": "Smith",
     "nick": "JS"
   }
 }
}'
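
Since _source behaves like a Java Map (the debugging section of this post shows it is a LinkedHashMap), an assignment such as ctx._source.nick = params.nick amounts to a put on that map. The sketch below mimics the script above with plain maps. The class and method names are hypothetical, and nothing here talks to Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: plain maps stand in for ctx._source and params.
final class SourceUpdate {
    // Equivalent of: ctx._source.last = params.last; ctx._source.nick = params.nick
    static Map<String, Object> apply(Map<String, Object> source, Map<String, Object> params) {
        source.put("last", params.get("last")); // overwrites the existing field
        source.put("nick", params.get("nick")); // adds a field that did not exist
        return source;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("first", "Agatha");
        source.put("last", "Christie");

        Map<String, Object> params = new LinkedHashMap<>();
        params.put("last", "Smith");
        params.put("nick", "JS");

        System.out.println(apply(source, params)); // prints {first=Agatha, last=Smith, nick=JS}
    }
}
```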

Dates 

Date fields are exposed as ReadableDateTime, so they support methods such as getYear, getDayOfWeek, and getMillis. For example, the following request returns every student’s birth year:

curl -XGET 'ES_HOST:ES_PORT/academics/_search?pretty' -H 'Content-Type: application/json' -d '{
 "script_fields": {
   "birth_year": {
     "script": {
       "inline": "doc.born.value.year"
     }
   }
 }
}'
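
As a rough stand-alone illustration of what the script extracts, the Java sketch below parses a born value from the sample data and reads its year. Note the swap: it uses java.time.LocalDate rather than the ReadableDateTime type that Painless exposes, and the class name and the yyyy/MM/dd pattern are assumptions based on how the sample dates are written.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Sketch only: java.time stands in for the date object Painless exposes.
final class BirthYear {
    // Assumed pattern, matching sample values such as "1978/08/13".
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy/MM/dd");

    // Rough equivalent of doc.born.value.year for a single date string.
    static int birthYear(String born) {
        return LocalDate.parse(born, FORMAT).getYear();
    }

    public static void main(String[] args) {
        System.out.println(birthYear("1978/08/13")); // prints 1978
    }
}
```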

Regular Expressions

Regexes are disabled by default because they circumvent Painless’s protection against long-running and memory-hungry scripts. To make matters worse, even innocuous-looking regexes can have staggering performance and stack depth behavior. They remain an amazingly powerful tool but are too scary to be enabled by default. Set script.painless.regex.enabled: true in elasticsearch.yml to enable them.

Painless’s native support for regular expressions has the following syntax constructs:

  • /pattern/: Pattern literals create patterns. This is the only way to create a pattern in Painless. The pattern inside the slashes is just a Java regular expression.
  • =~: The find operator returns a Boolean, true if a subsequence of the text matches, false otherwise.
  • ==~: The match operator returns a Boolean, true if the entire text matches, false if it doesn’t.
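
The difference between the two operators is easiest to see side by side. The Java sketch below assumes that =~ behaves like Matcher.find and ==~ like Matcher.matches, which is how the descriptions above read. The class and method names are hypothetical, and the sample names come from the data set.

```java
import java.util.regex.Pattern;

// Sketch only: Java's Matcher methods stand in for the Painless operators.
final class FindVersusMatch {
    // Like text =~ /regex/: true if any subsequence matches.
    static boolean find(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    // Like text ==~ /regex/: true only if the whole text matches.
    static boolean match(String text, String regex) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(find("Ibsen", "b"));   // prints true: "b" occurs inside the name
        System.out.println(match("Ibsen", "b"));  // prints false: the whole name is not just "b"
        System.out.println(find("Blake", "b"));   // prints false: patterns are case-sensitive
        System.out.println(match("Blake", "[^aeiou].*[aeiou]")); // prints true
    }
}
```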

Using the find operator (=~), we can update all academic students with "b" in their last name:

curl -XPOST 'ES_HOST:ES_PORT/academics/student/_update_by_query?pretty' -H 'Content-Type: application/json' -d '{
 "script": {
   "lang": "painless",
   "inline": "if (ctx._source.last =~ /b/) {ctx._source.last += \"matched\"} else {ctx.op = \"noop\"}"
 }
}'

Using the match operator (==~), we can update all the academic students whose last names start with a consonant and end with a vowel:

curl -XPOST 'ES_HOST:ES_PORT/academics/student/_update_by_query?pretty' -H 'Content-Type: application/json' -d '{
 "script": {
   "lang": "painless",
   "inline": "if (ctx._source.last ==~ /[^aeiou].*[aeiou]/) {ctx._source.last += \"matched\"} else {ctx.op = \"noop\"}"
 }
}'

We can call Pattern.matcher directly to get a Matcher instance and remove all of the vowels from all of the last names:

curl -XPOST 'ES_HOST:ES_PORT/academics/student/_update_by_query?pretty' -H 'Content-Type: application/json' -d '{
 "script": {
   "lang": "painless",
   "inline": "ctx._source.last = /[aeiou]/.matcher(ctx._source.last).replaceAll(\"\")"
 }
}'

Matcher.replaceAll is just a call to Java’s Matcher.replaceAll method, so it supports $1-style group references in the replacement:

curl -XPOST 'ES_HOST:ES_PORT/academics/student/_update_by_query?pretty' -H 'Content-Type: application/json' -d '{
 "script": {
   "lang": "painless",
   "inline": "ctx._source.last = /n([aeiou])/.matcher(ctx._source.last).replaceAll(\"$1\")"
 }
}'
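
To see what the $1 reference does, here is the same pattern applied in plain Java. This is a sketch: the class and method names are hypothetical, and "Kennedy" is an invented last name that is not in the sample data, chosen because it contains an "n" followed by a vowel.

```java
import java.util.regex.Pattern;

// Sketch only: the pattern is the one used in the script, n followed by a captured vowel.
final class GroupReplacement {
    private static final Pattern N_THEN_VOWEL = Pattern.compile("n([aeiou])");

    // Replaces each "n" + vowel with just the vowel, using the $1 group reference.
    static String dropNBeforeVowel(String last) {
        return N_THEN_VOWEL.matcher(last).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(dropNBeforeVowel("Kennedy"));  // prints Kenedy
        System.out.println(dropNBeforeVowel("Hitchens")); // prints Hitchens (no "n" before a vowel)
    }
}
```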

If you need more control over replacements, you can call replaceAll on a CharSequence with a Function<Matcher, String> that builds the replacement. This form does not support $1 or \1 for accessing captured groups, because you already have a reference to the matcher and can get them with m.group(1).

Note: Calling Matcher.find inside of the function that builds the replacement is rude and will likely break the replacement process.

The following request will make all of the vowels in the academic students' last names upper case: 

curl -XPOST 'ES_HOST:ES_PORT/academics/student/_update_by_query?pretty' -H 'Content-Type: application/json' -d '{
 "script": {
   "lang": "painless",
   "inline": "ctx._source.last = ctx._source.last.replaceAll(/[aeiou]/, m -> m.group().toUpperCase(Locale.ROOT))"
 }
}'

Or we can use the CharSequence.replaceFirst to make the first vowel in their last names upper case:

curl -XPOST 'ES_HOST:ES_PORT/academics/student/_update_by_query?pretty' -H 'Content-Type: application/json' -d '{
 "script": {
   "lang": "painless",
   "inline": "ctx._source.last = ctx._source.last.replaceFirst(/[aeiou]/, m -> m.group().toUpperCase(Locale.ROOT))"
 }
}' 
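
For comparison, plain Java (version 9 and later) offers similar function-based replacements on Matcher itself. The sketch below is an approximation of the two scripts above, not the Painless API: Java's functions receive a MatchResult rather than a Matcher, and the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Sketch only: requires Java 9+ for the function-based replaceAll and replaceFirst.
final class VowelCase {
    private static final Pattern VOWEL = Pattern.compile("[aeiou]");

    // Upper-cases every lowercase vowel, like the replaceAll script.
    static String upperAllVowels(String last) {
        return VOWEL.matcher(last).replaceAll(m -> m.group().toUpperCase(Locale.ROOT));
    }

    // Upper-cases only the first lowercase vowel, like the replaceFirst script.
    static String upperFirstVowel(String last) {
        return VOWEL.matcher(last).replaceFirst(m -> m.group().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(upperAllVowels("Christie"));  // prints ChrIstIE
        System.out.println(upperFirstVowel("Christie")); // prints ChrIstie
    }
}
```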

Painless Debugging 

Painless doesn’t have a REPL, and while it would be nice for it to have one someday, a REPL wouldn’t tell the whole story of debugging Painless scripts embedded in Elasticsearch, because the data that the scripts have access to (the "context") is so important. The best way to debug embedded scripts is to throw exceptions at choice places. While we can throw our own exceptions, Painless’s sandbox prevents us from accessing useful information such as the type of an object. So Painless provides a utility method, Debug.explain, which throws the exception for us. For example, we can use _explain to explore the context available to a script query:

curl -XPUT 'ES_HOST:ES_PORT/academics/student/1?refresh&pretty' -H 'Content-Type: application/json' -d '
{"first":"Robert","last":"Williamson","base_score":[3,5,12],"target_Score":[12,15,18],"grade_point_index":[9,14,16]}
'
curl -XPOST 'ES_HOST:ES_PORT/academics/student/1/_explain?pretty' -H 'Content-Type: application/json' -d '{
 "query": {
   "script": {
     "script": "Debug.explain(doc.base_score)"
   }
 }
}'

The response shows that the class of doc.base_score is org.elasticsearch.index.fielddata.ScriptDocValues.Longs:

{
  "error": {
     "type": "script_exception",
     "to_string": "[3,5,12]",
     "painless_class": "org.elasticsearch.index.fielddata.ScriptDocValues.Longs",
     "java_class": "org.elasticsearch.index.fielddata.ScriptDocValues$Longs",
     ...
  },
  "status": 500
}

We can use the same trick to see that _source is a LinkedHashMap in the _update API:

curl -XPOST 'ES_HOST:ES_PORT/academics/student/1/_update?pretty' -H 'Content-Type: application/json' -d '{
 "script": "Debug.explain(ctx._source)"
}'

The response looks like:

{
 "error" : {
   "root_cause": ...,
   "type": "illegal_argument_exception",
   "reason": "failed to execute script",
   "caused_by": {
     "type": "script_exception",
     "to_string": "{grade_point_index=[9, 14, 16], last=Williamson, target_Score=[12, 15, 18], first=Robert, base_score=[3, 5, 12]}",
     "painless_class": "LinkedHashMap",
     "java_class": "java.util.LinkedHashMap",
     ...
   }
 },
 "status": 400
}
