Getting Started with Painless
Painless is a simple, secure scripting language designed specifically for use with Elasticsearch. It is the default scripting language for Elasticsearch and can safely be used for inline and stored scripts. For a detailed description of the Painless syntax and language features, see the {painless}/painless-lang-spec.html[Painless Language Specification].
You can use Painless anywhere scripts can be used in Elasticsearch. Painless provides:
- Fast performance: Painless scripts run several times faster than the alternatives.
- Safety: Fine-grained whitelist with method call/field granularity. See the {painless}/painless-api-reference.html[Painless API Reference] for a complete list of available classes and methods.
- Optional typing: Variables and parameters can use explicit types or the dynamic def type.
- Syntax: Extends Java’s syntax to provide Groovy-style scripting language features that make scripts easier to write.
- Optimizations: Designed specifically for Elasticsearch scripting.
Painless Examples
To illustrate how Painless works, let’s load some hockey stats into an Elasticsearch index:
PUT hockey/player/_bulk?refresh
{"index":{"_id":1}}
{"first":"johnny","last":"gaudreau","goals":[9,27,1],"assists":[17,46,0],"gp":[26,82,1],"born":"1993/08/13"}
{"index":{"_id":2}}
{"first":"sean","last":"monohan","goals":[7,54,26],"assists":[11,26,13],"gp":[26,82,82],"born":"1994/10/12"}
{"index":{"_id":3}}
{"first":"jiri","last":"hudler","goals":[5,34,36],"assists":[11,62,42],"gp":[24,80,79],"born":"1984/01/04"}
{"index":{"_id":4}}
{"first":"micheal","last":"frolik","goals":[4,6,15],"assists":[8,23,15],"gp":[26,82,82],"born":"1988/02/17"}
{"index":{"_id":5}}
{"first":"sam","last":"bennett","goals":[5,0,0],"assists":[8,1,0],"gp":[26,1,0],"born":"1996/06/20"}
{"index":{"_id":6}}
{"first":"dennis","last":"wideman","goals":[0,26,15],"assists":[11,30,24],"gp":[26,81,82],"born":"1983/03/20"}
{"index":{"_id":7}}
{"first":"david","last":"jones","goals":[7,19,5],"assists":[3,17,4],"gp":[26,45,34],"born":"1984/08/10"}
{"index":{"_id":8}}
{"first":"tj","last":"brodie","goals":[2,14,7],"assists":[8,42,30],"gp":[26,82,82],"born":"1990/06/07"}
{"index":{"_id":39}}
{"first":"mark","last":"giordano","goals":[6,30,15],"assists":[3,30,24],"gp":[26,60,63],"born":"1983/10/03"}
{"index":{"_id":10}}
{"first":"mikael","last":"backlund","goals":[3,15,13],"assists":[6,24,18],"gp":[26,82,82],"born":"1989/03/17"}
{"index":{"_id":11}}
{"first":"joe","last":"colborne","goals":[3,18,13],"assists":[6,20,24],"gp":[26,67,82],"born":"1990/01/30"}
Accessing Doc Values from Painless
Document values can be accessed from a Map named doc.
For example, the following script calculates a player’s total goals. This example uses a strongly typed int and a for loop.
GET hockey/_search
{
"query": {
"function_score": {
"script_score": {
"script": {
"lang": "painless",
"source": """
int total = 0;
for (int i = 0; i < doc['goals'].length; ++i) {
total += doc['goals'][i];
}
return total;
"""
}
}
}
}
}
Alternatively, you could do the same thing using a script field instead of a function score:
GET hockey/_search
{
"query": {
"match_all": {}
},
"script_fields": {
"total_goals": {
"script": {
"lang": "painless",
"source": """
int total = 0;
for (int i = 0; i < doc['goals'].length; ++i) {
total += doc['goals'][i];
}
return total;
"""
}
}
}
}
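For intuition, the loop in both requests above is ordinary Java-style code. The following plain Java sketch (not Painless; the class and method names are chosen only for illustration) runs the same loop over the goals of player 1 from the sample data:

```java
// Plain Java analog of the Painless loop. TotalGoals and total are
// illustrative names, not part of Elasticsearch or Painless.
class TotalGoals {
    // Sums per-season goals the same way the script sums doc['goals'].
    static int total(long[] goals) {
        int total = 0;
        for (int i = 0; i < goals.length; ++i) {
            total += goals[i];   // compound assignment narrows long to int
        }
        return total;
    }

    public static void main(String[] args) {
        // goals for johnny gaudreau in the sample data: [9, 27, 1]
        System.out.println(total(new long[] {9, 27, 1}));   // prints 37
    }
}
```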
The following example uses a Painless script to sort the players by their combined first and last names. The names are accessed using doc['first.keyword'].value and doc['last.keyword'].value.
GET hockey/_search
{
"query": {
"match_all": {}
},
"sort": {
"_script": {
"type": "string",
"order": "asc",
"script": {
"lang": "painless",
"source": "doc['first.keyword'].value + ' ' + doc['last.keyword'].value"
}
}
}
}
Missing values
If you request the value from a field named field that isn’t in the document, doc['field'].value for this document returns:
- 0 if field has a numeric datatype (long, double etc.)
- false if field has a boolean datatype
- epoch date if field has a date datatype
- null if field has a string datatype
- null if field has a geo datatype
- "" if field has a binary datatype
Important: Starting in 7.0, doc['field'].value throws an exception if the field is missing in a document. To enable this behavior now, set the {ref}/jvm-options.html[jvm.option] -Des.scripting.exception_for_missing_value=true on a node. If you do not enable this behavior, a deprecation warning may be logged when Painless processes a missing value.
To check if a document is missing a value, you can call doc['field'].size() == 0.
Updating Fields with Painless
You can also easily update fields. You access the original source for a field as ctx._source.<field-name>.
First, let’s look at the source data for a player by submitting the following request:
GET hockey/_search
{
"query": {
"term": {
"_id": 1
}
}
}
To change player 1’s last name to hockey, simply set ctx._source.last to the new value:
POST hockey/player/1/_update
{
"script": {
"lang": "painless",
"source": "ctx._source.last = params.last",
"params": {
"last": "hockey"
}
}
}
You can also add fields to a document. For example, this script adds a new field that contains the player’s nickname, hockey.
POST hockey/player/1/_update
{
"script": {
"lang": "painless",
"source": """
ctx._source.last = params.last;
ctx._source.nick = params.nick
""",
"params": {
"last": "gaudreau",
"nick": "hockey"
}
}
}
Dates
Date fields are exposed as ReadableDateTime, so they support methods like getYear, getDayOfWeek, or getMillis, which returns milliseconds since epoch. To use these in a script, leave out the get prefix and lowercase the first letter of the rest of the method name. For example, the following returns every hockey player’s birth year:
GET hockey/_search
{
"script_fields": {
"birth_year": {
"script": {
"source": "doc.born.value.year"
}
}
}
}
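The naming rule for these shortcuts can be sketched in plain Java. This is only an illustration of how a getter name maps to the property name used in a script; shortcutFor is a hypothetical helper, not a Painless or Elasticsearch API.

```java
// Illustrates the naming rule only. shortcutFor is a hypothetical helper,
// not a Painless or Elasticsearch API.
class GetterShortcut {
    // Drops the "get" prefix and lowercases the first remaining letter.
    static String shortcutFor(String getter) {
        if (!getter.startsWith("get") || getter.length() == 3) {
            return getter;   // not a getter-style name; leave unchanged
        }
        String rest = getter.substring(3);
        return Character.toLowerCase(rest.charAt(0)) + rest.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(shortcutFor("getYear"));       // year
        System.out.println(shortcutFor("getDayOfWeek"));  // dayOfWeek
        System.out.println(shortcutFor("getMillis"));     // millis
    }
}
```

So getYear on the date value is what doc.born.value.year reads in the request above.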
Regular expressions
Note: Regexes are disabled by default because they circumvent Painless’s protection against long running and memory hungry scripts. To make matters worse, even innocuous looking regexes can have staggering performance and stack depth behavior. They remain an amazingly powerful tool but are too scary to enable by default. To enable them yourself, set script.painless.regex.enabled: true in elasticsearch.yml. We’d like very much to have a safe alternative implementation that can be enabled by default, so check this space for later developments!
Painless’s native support for regular expressions has the following syntax constructs:
- /pattern/: Pattern literals create patterns. This is the only way to create a pattern in Painless. The pattern inside the /'s is just a Java regular expression. See [pattern-flags] for more.
- =~: The find operator returns a boolean, true if a subsequence of the text matches, false otherwise.
- ==~: The match operator returns a boolean, true if the text matches, false if it doesn’t.
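The difference between the two operators can be sketched in plain Java, assuming they behave like Matcher.find and Matcher.matches respectively, which is how the descriptions above read. The class and method names are illustrative only.

```java
import java.util.regex.Pattern;

// Plain Java analog of the two Painless regex operators.
// FindVsMatch, find, and match are illustrative names.
class FindVsMatch {
    // Analog of `text =~ /regex/`: true if any subsequence matches.
    static boolean find(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    // Analog of `text ==~ /regex/`: true only if the whole text matches.
    static boolean match(String text, String regex) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(find("bennett", "b"));     // true: "b" occurs in the text
        System.out.println(match("bennett", "b"));    // false: "b" is not the whole text
        System.out.println(match("bennett", "b.*"));  // true: pattern covers the whole text
        System.out.println(find("gaudreau", "b"));    // false: no "b" anywhere
    }
}
```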
Using the find operator (=~) you can update all hockey players with "b" in their last name:
POST hockey/player/_update_by_query
{
"script": {
"lang": "painless",
"source": """
if (ctx._source.last =~ /b/) {
ctx._source.last += "matched";
} else {
ctx.op = "noop";
}
"""
}
}
Using the match operator (==~) you can update all the hockey players whose last names start with a consonant and end with a vowel:
POST hockey/player/_update_by_query
{
"script": {
"lang": "painless",
"source": """
if (ctx._source.last ==~ /[^aeiou].*[aeiou]/) {
ctx._source.last += "matched";
} else {
ctx.op = "noop";
}
"""
}
}
You can use Pattern.matcher directly to get a Matcher instance and remove all of the vowels in all of their last names:
POST hockey/player/_update_by_query
{
"script": {
"lang": "painless",
"source": "ctx._source.last = /[aeiou]/.matcher(ctx._source.last).replaceAll('')"
}
}
Matcher.replaceAll is just a call to Java’s Matcher.replaceAll method, so it supports $1-style group references in the replacement:
POST hockey/player/_update_by_query
{
"script": {
"lang": "painless",
"source": "ctx._source.last = /n([aeiou])/.matcher(ctx._source.last).replaceAll('$1')"
}
}
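Because the script above delegates to Java's Matcher.replaceAll, its effect on individual names can be previewed in plain Java. This is a sketch using two last names from the sample data; the class and method names are illustrative.

```java
import java.util.regex.Pattern;

// Plain Java preview of /n([aeiou])/.matcher(last).replaceAll('$1').
// GroupReference and dropNBeforeVowel are illustrative names.
class GroupReference {
    // Replaces each "n" + vowel pair with just the captured vowel ($1).
    static String dropNBeforeVowel(String last) {
        return Pattern.compile("n([aeiou])").matcher(last).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(dropNBeforeVowel("monohan"));  // moohan
        System.out.println(dropNBeforeVowel("bennett"));  // benett
    }
}
```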
If you need more control over replacements you can call replaceAll on a CharSequence with a Function<Matcher, String> that builds the replacement. This does not support $1 or \1 to access replacements because you already have a reference to the matcher and can get them with m.group(1).
Important: Calling Matcher.find inside of the function that builds the replacement is rude and will likely break the replacement process.
This will make all of the vowels in the hockey players’ last names upper case:
POST hockey/player/_update_by_query
{
"script": {
"lang": "painless",
"source": """
ctx._source.last = ctx._source.last.replaceAll(/[aeiou]/, m ->
m.group().toUpperCase(Locale.ROOT))
"""
}
}
Or you can use CharSequence.replaceFirst to make the first vowel in their last names upper case:
POST hockey/player/_update_by_query
{
"script": {
"lang": "painless",
"source": """
ctx._source.last = ctx._source.last.replaceFirst(/[aeiou]/, m ->
m.group().toUpperCase(Locale.ROOT))
"""
}
}
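A close plain Java analog of the two function-based replacements exists since Java 9, where Matcher.replaceAll and Matcher.replaceFirst accept a function of the match. Note the difference in shape: Painless calls these on the CharSequence, while this sketch calls them on a Matcher. The class and method names are illustrative.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Plain Java (9+) analog of the function-based replacements.
// UppercaseVowels and its methods are illustrative names.
class UppercaseVowels {
    private static final Pattern VOWEL = Pattern.compile("[aeiou]");

    // Upper-cases every vowel; the lambda builds each replacement.
    static String all(String last) {
        return VOWEL.matcher(last).replaceAll(m -> m.group().toUpperCase(Locale.ROOT));
    }

    // Upper-cases only the first vowel.
    static String first(String last) {
        return VOWEL.matcher(last).replaceFirst(m -> m.group().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(all("gaudreau"));    // gAUdrEAU
        System.out.println(first("gaudreau"));  // gAudreau
    }
}
```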
Note: all of the _update_by_query examples above could really do with a query to limit the data that they pull back. While you could use a {ref}/query-dsl-script-query.html[script query] it wouldn’t be as efficient as using any other query because script queries aren’t able to use the inverted index to limit the documents that they have to check.
How Painless dispatches functions
Painless uses receiver, name, and arity for method dispatch. For example, s.foo(a, b) is resolved by first getting the class of s and then looking up the method foo with two parameters. This is different from Groovy, which uses the runtime types of the parameters, and Java, which uses the compile time types of the parameters.
The consequence of this is that Painless doesn’t support overloaded methods like Java, leading to some trouble when it whitelists classes from the Java standard library. For example, in Java and Groovy, Matcher has two methods: group(int) and group(String). Painless can’t whitelist both of these methods because they have the same name and the same number of parameters. So instead it has group(int) and namedGroup(String).
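The Matcher example can be made concrete in plain Java, where both overloads exist. The sketch below also includes a hypothetical dispatchKey helper to show why a lookup by name and arity alone cannot tell the two overloads apart; the key format is invented for illustration and is not how Painless stores its whitelist.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Shows Java's two group overloads. OverloadedGroup, dispatchKey, and
// initialBothWays are illustrative names, not Painless internals.
class OverloadedGroup {
    // Hypothetical lookup key built from method name and arity only.
    static String dispatchKey(String methodName, int arity) {
        return methodName + "/" + arity;
    }

    // Reads the first letter of a name through both Java overloads.
    static String[] initialBothWays(String name) {
        Matcher m = Pattern.compile("(?<initial>[a-z]).*").matcher(name);
        if (!m.matches()) {
            return new String[0];
        }
        String byIndex = m.group(1);         // Java resolves group(int)
        String byName = m.group("initial");  // Java resolves group(String)
        return new String[] {byIndex, byName};
    }

    public static void main(String[] args) {
        String[] r = initialBothWays("gaudreau");
        System.out.println(r[0] + " " + r[1]);             // g g
        // group(int) and group(String) collapse to the same key,
        // which is why Painless renames the second to namedGroup.
        System.out.println(dispatchKey("group", 1));       // group/1
        System.out.println(dispatchKey("namedGroup", 1));  // namedGroup/1
    }
}
```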
We have a few justifications for this different way of dispatching methods:
- It makes operating on def types simpler and, presumably, faster. Using receiver, name, and arity means that when Painless sees a call on a def object it can dispatch the appropriate method without having to do expensive comparisons of the types of the parameters. The same is true for invocations with def typed parameters.
- It keeps things consistent. It would be genuinely weird for Painless to behave like Groovy if any def typed parameters were involved and Java otherwise. It’d be slow for it to behave like Groovy all the time.
- It keeps Painless maintainable. Adding the Java or Groovy like method dispatch feels like it’d add a ton of complexity which’d make maintenance and other improvements much more difficult.
Painless Debugging
Debug.Explain
Painless doesn’t have a REPL and while it’d be nice for it to have one one day, it wouldn’t tell you the whole story around debugging Painless scripts embedded in Elasticsearch because the data that the scripts have access to, or "context", is so important. For now the best way to debug embedded scripts is by throwing exceptions at choice places. While you can throw your own exceptions (throw new Exception('whatever')), Painless’s sandbox prevents you from accessing useful information like the type of an object. So Painless has a utility method, Debug.explain, which throws the exception for you. For example, you can use {ref}/search-explain.html[_explain] to explore the context available to a {ref}/query-dsl-script-query.html[script query].
PUT /hockey/player/1?refresh
{"first":"johnny","last":"gaudreau","goals":[9,27,1],"assists":[17,46,0],"gp":[26,82,1]}
POST /hockey/player/1/_explain
{
"query": {
"script": {
"script": "Debug.explain(doc.goals)"
}
}
}
Which shows that the class of doc.goals is org.elasticsearch.index.fielddata.ScriptDocValues.Longs by responding with:
{
"error": {
"type": "script_exception",
"to_string": "[1, 9, 27]",
"painless_class": "org.elasticsearch.index.fielddata.ScriptDocValues.Longs",
"java_class": "org.elasticsearch.index.fielddata.ScriptDocValues$Longs",
...
},
"status": 500
}
You can use the same trick to see that _source is a LinkedHashMap in the _update API:
POST /hockey/player/1/_update
{
"script": "Debug.explain(ctx._source)"
}
The response looks like:
{
"error" : {
"root_cause": ...,
"type": "illegal_argument_exception",
"reason": "failed to execute script",
"caused_by": {
"type": "script_exception",
"to_string": "{gp=[26, 82, 1], last=gaudreau, assists=[17, 46, 0], first=johnny, goals=[9, 27, 1]}",
"painless_class": "java.util.LinkedHashMap",
"java_class": "java.util.LinkedHashMap",
...
}
},
"status": 400
}
Once you have a class you can go to [painless-api-reference] to see a list of available methods.
Painless execute API
Experimental: The Painless execute API is new and the request / response format may change in a breaking way in the future.
The Painless execute API allows an arbitrary script to be executed and a result to be returned.
Name | Required | Default | Description |
---|---|---|---|
script | yes | - | The script to execute. |
context | no | painless_test | The context the script should be executed in. |
context_setup | no | - | Additional parameters to the context. |
Contexts
Contexts control how scripts are executed, what variables are available at runtime and what the return type is.
Painless test context
The painless_test context executes scripts as is and does not add any special parameters.
The only variable that is available is params, which can be used to access user defined values.
The result of the script is always converted to a string.
If no context is specified then this context is used by default.
Example
Request:
POST /_scripts/painless/_execute
{
"script": {
"source": "params.count / params.total",
"params": {
"count": 100.0,
"total": 1000.0
}
}
}
Response:
{
"result": "0.1"
}
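The response is the string "0.1" rather than a number because this context converts the script result to a string. A plain Java sketch of the same arithmetic, assuming the conversion behaves like Java's String.valueOf (an assumption, not something the text above states), looks like this; the class and method names are illustrative.

```java
// Plain Java analog of "params.count / params.total" in the
// painless_test context. PainlessTestResult and ratio are illustrative
// names; String.valueOf stands in for the context's string conversion.
class PainlessTestResult {
    // Divides two doubles and converts the result to a string.
    static String ratio(double count, double total) {
        return String.valueOf(count / total);
    }

    public static void main(String[] args) {
        System.out.println(ratio(100.0, 1000.0));   // 0.1
    }
}
```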
Filter context
The filter context executes scripts as if they were executed inside a script query.
For testing purposes a document must be provided that will be indexed temporarily in-memory and is accessible to the script being tested. Because of this the _source, stored fields and doc values are available in the script being tested.
The following parameters may be specified in context_setup for a filter context:
- document: Contains the document that will be temporarily indexed in-memory and is accessible from the script.
- index: The name of an index containing a mapping that is compatible with the document being indexed.
Example
PUT /my-index
{
"mappings": {
"_doc": {
"properties": {
"field": {
"type": "keyword"
}
}
}
}
}
POST /_scripts/painless/_execute
{
"script": {
"source": "doc['field'].value.length() <= params.max_length",
"params": {
"max_length": 4
}
},
"context": "filter",
"context_setup": {
"index": "my-index",
"document": {
"field": "four"
}
}
}
Response:
{
"result": true
}
Score context
The score context executes scripts as if they were executed inside a script_score function in a function_score query.
The following parameters may be specified in context_setup for a score context:
- document: Contains the document that will be temporarily indexed in-memory and is accessible from the script.
- index: The name of an index containing a mapping that is compatible with the document being indexed.
- query: If _score is used in the script then a query can be specified that will be used to compute a score.
Example
PUT /my-index
{
"mappings": {
"_doc": {
"properties": {
"field": {
"type": "keyword"
},
"rank": {
"type": "long"
}
}
}
}
}
POST /_scripts/painless/_execute
{
"script": {
"source": "doc['rank'].value / params.max_rank",
"params": {
"max_rank": 5.0
}
},
"context": "score",
"context_setup": {
"index": "my-index",
"document": {
"rank": 4
}
}
}
Response:
{
"result": 0.8
}