"Fossies" - the Fresh Open Source Software Archive

Member "elasticsearch-6.8.23/docs/reference/cluster.asciidoc" (29 Dec 2021, 3944 Bytes) of package /linux/www/elasticsearch-6.8.23-src.tar.gz:


As a special service "Fossies" has tried to format the requested source page into HTML format (assuming AsciiDoc format). Alternatively you can here view or download the uninterpreted source code file. A member file download can also be achieved by clicking within a package contents listing on the according byte size field.

Cluster Health

The cluster health API returns a simple status report on the health of the cluster. For example, on a quiet single-node cluster with a single index with 5 shards and one replica, this request:

GET _cluster/health

Returns this:

{
  "cluster_name" : "testcluster",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 5,
  "active_shards" : 5,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 5,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 50.0
}

The API can also be executed against one or more indices to get the health of just the specified indices:

GET /_cluster/health/test1,test2

The cluster health status is: green, yellow or red. On the shard level, a red status indicates that the specific shard is not allocated in the cluster, yellow means that the primary shard is allocated but replicas are not, and green means that all shards are allocated. The index level status is controlled by the worst shard status. The cluster status is controlled by the worst index status.

One of the main benefits of the API is the ability to wait until the cluster reaches a certain high water-mark health level. For example, the following will wait for 50 seconds for the cluster to reach the yellow level (if it reaches the green or yellow status before 50 seconds elapse, it will return at that point):

GET /_cluster/health?wait_for_status=yellow&timeout=50s
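
As an illustration only, here is a minimal Python sketch of the same call, assuming the third-party requests library and a cluster reachable at http://localhost:9200 (neither is part of this API):

import requests

# Ask the cluster to wait up to 50 seconds for at least yellow health.
# http://localhost:9200 is an assumed local endpoint.
resp = requests.get(
    "http://localhost:9200/_cluster/health",
    params={"wait_for_status": "yellow", "timeout": "50s"},
)
health = resp.json()

# timed_out is true when the timeout expired before the status was reached.
if health["timed_out"]:
    print("cluster did not reach yellow within 50s; status is", health["status"])
else:
    print("cluster status:", health["status"])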

Request Parameters

The cluster health API accepts the following request parameters:

level

Can be one of cluster, indices or shards. Controls the details level of the health information returned. Defaults to cluster.

wait_for_status

One of green, yellow or red. Will wait (until the timeout provided) until the status of the cluster changes to the one provided or better, i.e. green > yellow > red. By default, will not wait for any status.

wait_for_no_relocating_shards

A boolean value which controls whether to wait (until the timeout provided) for the cluster to have no shard relocations. Defaults to false, which means it will not wait for relocating shards.

wait_for_no_initializing_shards

A boolean value which controls whether to wait (until the timeout provided) for the cluster to have no shard initializations. Defaults to false, which means it will not wait for initializing shards.

wait_for_active_shards

A number controlling how many active shards to wait for, all to wait for all shards in the cluster to be active, or 0 to not wait. Defaults to 0.

wait_for_nodes

The request waits until the specified number N of nodes is available. It also accepts >=N, <=N, >N and <N. Alternatively, it is possible to use ge(N), le(N), gt(N) and lt(N) notation.

wait_for_events

Can be one of immediate, urgent, high, normal, low, languid. Wait until all currently queued events with the given priority are processed.

timeout

A time-based parameter controlling how long to wait when one of the wait_for_XXX parameters is provided. Defaults to 30s.

master_timeout

A time-based parameter controlling how long to wait if the master has not yet been discovered or has disconnected. If not provided, uses the same value as timeout.

local

If true, the request retrieves information from the local node rather than from the master node. Default: false.

The following is an example of getting the cluster health at the shards level:

GET /_cluster/health/twitter?level=shards

Cluster State

The cluster state API allows access to metadata representing the state of the whole cluster. This includes information such as

  • the set of nodes in the cluster

  • all cluster-level settings

  • information about the indices in the cluster, including their mappings and settings

  • the locations of all the shards in the cluster

The response is an internal representation of the cluster state and its format may change from version to version. If possible, you should obtain any information from the cluster state using the other, more stable, cluster APIs.

GET /_cluster/state

The response provides the cluster state itself, which can be filtered to only retrieve the parts of interest as described below.

The cluster’s cluster_uuid is also returned as part of the top-level response, in addition to the metadata section (added in 6.4.0).

Note
While the cluster is still forming, it is possible for the cluster_uuid to be _na_ and for the cluster state’s version to be -1.

By default, the cluster state request is routed to the master node, to ensure that the latest cluster state is returned. For debugging purposes, you can retrieve the cluster state local to a particular node by adding local=true to the query string.

Response Filters

The cluster state contains information about all the indices in the cluster, including their mappings, as well as templates and other metadata. This means it can sometimes be quite large. To avoid the need to process all this information you can request only the part of the cluster state that you need:

GET /_cluster/state/{metrics}
GET /_cluster/state/{metrics}/{indices}

{metrics} is a comma-separated list of the following options.

version

Shows the cluster state version.

master_node

Shows the elected master_node part of the response.

nodes

Shows the nodes part of the response.

routing_table

Shows the routing_table part of the response. If you supply a comma separated list of indices, the returned output will only contain the routing table for these indices.

metadata

Shows the metadata part of the response. If you supply a comma separated list of indices, the returned output will only contain metadata for these indices.

blocks

Shows the blocks part of the response.

_all

Shows all metrics.

The following example returns only metadata and routing_table data for the foo and bar indices:

GET /_cluster/state/metadata,routing_table/foo,bar

The next example returns everything for the foo and bar indices:

GET /_cluster/state/_all/foo,bar

Finally, this example returns only the blocks metadata:

GET /_cluster/state/blocks
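
A minimal Python sketch of such a filtered request, assuming the requests library and a cluster at http://localhost:9200, with foo and bar as placeholder index names:

import requests

# Fetch only the metadata and routing_table parts of the cluster state,
# restricted to the foo and bar indices (assumed index names).
resp = requests.get(
    "http://localhost:9200/_cluster/state/metadata,routing_table/foo,bar"
)
state = resp.json()

# Top-level fields such as cluster_name are returned alongside the
# requested metrics.
print(state["cluster_name"], sorted(state.keys()))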

Cluster Stats

The Cluster Stats API retrieves statistics from a cluster-wide perspective. The API returns basic index metrics (shard numbers, store size, memory usage) and information about the current nodes that form the cluster (number, roles, os, jvm versions, memory usage, cpu and installed plugins).

GET /_cluster/stats?human&pretty

Will return, for example:

{
   "_nodes" : {
      "total" : 1,
      "successful" : 1,
      "failed" : 0
   },
   "cluster_uuid": "YjAvIhsCQ9CbjWZb2qJw3Q",
   "cluster_name": "elasticsearch",
   "timestamp": 1459427693515,
   "status": "green",
   "indices": {
      "count": 1,
      "shards": {
         "total": 5,
         "primaries": 5,
         "replication": 0,
         "index": {
            "shards": {
               "min": 5,
               "max": 5,
               "avg": 5
            },
            "primaries": {
               "min": 5,
               "max": 5,
               "avg": 5
            },
            "replication": {
               "min": 0,
               "max": 0,
               "avg": 0
            }
         }
      },
      "docs": {
         "count": 10,
         "deleted": 0
      },
      "store": {
         "size": "16.2kb",
         "size_in_bytes": 16684
      },
      "fielddata": {
         "memory_size": "0b",
         "memory_size_in_bytes": 0,
         "evictions": 0
      },
      "query_cache": {
         "memory_size": "0b",
         "memory_size_in_bytes": 0,
         "total_count": 0,
         "hit_count": 0,
         "miss_count": 0,
         "cache_size": 0,
         "cache_count": 0,
         "evictions": 0
      },
      "completion": {
         "size": "0b",
         "size_in_bytes": 0
      },
      "segments": {
         "count": 4,
         "memory": "8.6kb",
         "memory_in_bytes": 8898,
         "terms_memory": "6.3kb",
         "terms_memory_in_bytes": 6522,
         "stored_fields_memory": "1.2kb",
         "stored_fields_memory_in_bytes": 1248,
         "term_vectors_memory": "0b",
         "term_vectors_memory_in_bytes": 0,
         "norms_memory": "384b",
         "norms_memory_in_bytes": 384,
         "points_memory" : "0b",
         "points_memory_in_bytes" : 0,
         "doc_values_memory": "744b",
         "doc_values_memory_in_bytes": 744,
         "index_writer_memory": "0b",
         "index_writer_memory_in_bytes": 0,
         "version_map_memory": "0b",
         "version_map_memory_in_bytes": 0,
         "fixed_bit_set": "0b",
         "fixed_bit_set_memory_in_bytes": 0,
         "max_unsafe_auto_id_timestamp" : -9223372036854775808,
         "file_sizes": {}
      }
   },
   "nodes": {
      "count": {
         "total": 1,
         "data": 1,
         "coordinating_only": 0,
         "master": 1,
         "ingest": 1
      },
      "versions": [
         "{version}"
      ],
      "os": {
         "available_processors": 8,
         "allocated_processors": 8,
         "names": [
            {
               "name": "Mac OS X",
               "count": 1
            }
         ],
         "pretty_names": [
            {
               "pretty_name": "Mac OS X",
               "count": 1
            }
         ],
         "mem" : {
            "total" : "16gb",
            "total_in_bytes" : 17179869184,
            "free" : "78.1mb",
            "free_in_bytes" : 81960960,
            "used" : "15.9gb",
            "used_in_bytes" : 17097908224,
            "free_percent" : 0,
            "used_percent" : 100
         }
      },
      "process": {
         "cpu": {
            "percent": 9
         },
         "open_file_descriptors": {
            "min": 268,
            "max": 268,
            "avg": 268
         }
      },
      "jvm": {
         "max_uptime": "13.7s",
         "max_uptime_in_millis": 13737,
         "versions": [
            {
               "version": "1.8.0_74",
               "vm_name": "Java HotSpot(TM) 64-Bit Server VM",
               "vm_version": "25.74-b02",
               "vm_vendor": "Oracle Corporation",
               "count": 1
            }
         ],
         "mem": {
            "heap_used": "57.5mb",
            "heap_used_in_bytes": 60312664,
            "heap_max": "989.8mb",
            "heap_max_in_bytes": 1037959168
         },
         "threads": 90
      },
      "fs": {
         "total": "200.6gb",
         "total_in_bytes": 215429193728,
         "free": "32.6gb",
         "free_in_bytes": 35064553472,
         "available": "32.4gb",
         "available_in_bytes": 34802409472
      },
      "plugins": [
        {
          "name": "analysis-icu",
          "version": "{version}",
          "description": "The ICU Analysis plugin integrates Lucene ICU module into elasticsearch, adding ICU relates analysis components.",
          "classname": "org.elasticsearch.plugin.analysis.icu.AnalysisICUPlugin",
          "has_native_controller": false
        },
        ...
      ],
      ...
   }
}

This API can be restricted to a subset of the nodes using node filters:

GET /_cluster/stats/nodes/node1,node*,master:false
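
A minimal Python sketch of the filtered stats request above, under the same assumptions (requests library, local cluster at http://localhost:9200, placeholder node names):

import requests

# Cluster-wide stats restricted by node filters; the filters below mirror
# the request shown above and use assumed node names.
resp = requests.get(
    "http://localhost:9200/_cluster/stats/nodes/node1,node*,master:false",
    params={"human": "true"},
)
stats = resp.json()

# _nodes reports how many of the filtered nodes responded.
print("nodes queried:", stats["_nodes"]["total"], "status:", stats["status"])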

Pending cluster tasks

The pending cluster tasks API returns a list of any cluster-level changes (e.g. create index, update mapping, allocate or fail shard) which have not yet been executed.

Note
This API returns a list of any pending updates to the cluster state. These are distinct from the tasks reported by the Task Management API, which include periodic tasks and tasks initiated by the user, such as node stats, search queries, or create index requests. However, if a user-initiated task such as a create index command causes a cluster state update, the activity of this task might be reported by both the task API and the pending cluster tasks API.
GET /_cluster/pending_tasks

Usually this will return an empty list as cluster-level changes are usually fast. However, if there are tasks queued up, the output will look something like this:

{
   "tasks": [
      {
         "insert_order": 101,
         "priority": "URGENT",
         "source": "create-index [foo_9], cause [api]",
         "time_in_queue_millis": 86,
         "time_in_queue": "86ms"
      },
      {
         "insert_order": 46,
         "priority": "HIGH",
         "source": "shard-started ([foo_2][1], node[tMTocMvQQgGCkj7QDHl3OA], [P], s[INITIALIZING]), reason [after recovery from shard_store]",
         "time_in_queue_millis": 842,
         "time_in_queue": "842ms"
      },
      {
         "insert_order": 45,
         "priority": "HIGH",
         "source": "shard-started ([foo_2][0], node[tMTocMvQQgGCkj7QDHl3OA], [P], s[INITIALIZING]), reason [after recovery from shard_store]",
         "time_in_queue_millis": 858,
         "time_in_queue": "858ms"
      }
  ]
}
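
A minimal Python sketch that polls this endpoint and prints any queued entries, assuming the requests library and a local cluster at http://localhost:9200:

import requests

# List cluster-state update tasks that are still waiting to be executed.
resp = requests.get("http://localhost:9200/_cluster/pending_tasks")
tasks = resp.json()["tasks"]

if not tasks:
    print("no pending cluster tasks")
for task in tasks:
    print(task["priority"], task["source"], task["time_in_queue"])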

Cluster Reroute

The reroute command allows for manual changes to the allocation of individual shards in the cluster. For example, a shard can be moved from one node to another explicitly, an allocation can be cancelled, and an unassigned shard can be explicitly allocated to a specific node.

Here is a short example of a simple reroute API call:

POST /_cluster/reroute
{
    "commands" : [
        {
            "move" : {
                "index" : "test", "shard" : 0,
                "from_node" : "node1", "to_node" : "node2"
            }
        },
        {
          "allocate_replica" : {
                "index" : "test", "shard" : 1,
                "node" : "node3"
          }
        }
    ]
}

It is important to note that after processing any reroute commands Elasticsearch will perform rebalancing as normal (respecting the values of settings such as cluster.routing.rebalance.enable) in order to remain in a balanced state. For example, if the requested allocation includes moving a shard from node1 to node2 then this may cause a shard to be moved from node2 back to node1 to even things out.

The cluster can be set to disable allocations using the cluster.routing.allocation.enable setting. If allocations are disabled then the only allocations that will be performed are explicit ones given using the reroute command, and consequent allocations due to rebalancing.

It is possible to run reroute commands in "dry run" mode by using the ?dry_run URI query parameter, or by passing "dry_run": true in the request body. This will calculate the result of applying the commands to the current cluster state, and return the resulting cluster state after the commands (and re-balancing) have been applied, but will not actually perform the requested changes.

If the ?explain URI query parameter is included then a detailed explanation of why the commands could or could not be executed is included in the response.
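
A minimal Python sketch combining dry_run and explain for the move command shown earlier, assuming the requests library, a local cluster at http://localhost:9200 and placeholder index/node names:

import requests

# Dry-run a single move command and ask for an explanation of each decision.
# The index and node names are placeholders taken from the example above.
body = {
    "commands": [
        {"move": {"index": "test", "shard": 0,
                  "from_node": "node1", "to_node": "node2"}}
    ]
}
resp = requests.post(
    "http://localhost:9200/_cluster/reroute",
    params={"dry_run": "true", "explain": "true"},
    json=body,
)
result = resp.json()

# With dry_run the resulting state is computed but never applied; explain
# adds an explanations section describing each command's decisions.
print(result.get("acknowledged"), sorted(result.keys()))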

The commands supported are:

move

Move a started shard from one node to another node. Accepts index and shard for index name and shard number, from_node for the node to move the shard from, and to_node for the node to move the shard to.

cancel

Cancel allocation of a shard (or recovery). Accepts index and shard for index name and shard number, and node for the node to cancel the shard allocation on. This can be used to force resynchronization of existing replicas from the primary shard by cancelling them and allowing them to be reinitialized through the standard recovery process. By default only replica shard allocations can be cancelled. If it is necessary to cancel the allocation of a primary shard then the allow_primary flag must also be included in the request.

allocate_replica

Allocate an unassigned replica shard to a node. Accepts index and shard for index name and shard number, and node to allocate the shard to. Takes allocation deciders into account.

Retrying failed allocations

The cluster will attempt to allocate a shard a maximum of index.allocation.max_retries times in a row (defaults to 5), before giving up and leaving the shard unallocated. This scenario can be caused by structural problems such as having an analyzer which refers to a stopwords file which doesn’t exist on all nodes.

Once the problem has been corrected, allocation can be manually retried by calling the reroute API with the ?retry_failed URI query parameter, which will attempt a single retry round for these shards.
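
A minimal Python sketch of such a retry, assuming the requests library and a local cluster at http://localhost:9200:

import requests

# Trigger a single retry round for shards that exhausted
# index.allocation.max_retries; no reroute commands are needed.
resp = requests.post(
    "http://localhost:9200/_cluster/reroute",
    params={"retry_failed": "true"},
)
resp.raise_for_status()
print("retry requested; acknowledged:", resp.json().get("acknowledged"))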

Forced allocation on unrecoverable errors

Two more commands are available that allow the allocation of a primary shard to a node. These commands should however be used with extreme care, as primary shard allocation is usually fully automatically handled by Elasticsearch. Reasons why a primary shard cannot be automatically allocated include the following:

  • A new index was created but there is no node which satisfies the allocation deciders.

  • An up-to-date shard copy of the data cannot be found on the current data nodes in the cluster. To prevent data loss, the system does not automatically promote a stale shard copy to primary.

The following two commands are dangerous and may result in data loss. They are meant to be used in cases where the original data cannot be recovered and the cluster administrator accepts the loss. If you have suffered a temporary issue that can be fixed, please see the retry_failed flag described above. To emphasise: if these commands are performed and then a node joins the cluster that holds a copy of the affected shard then the copy on the newly-joined node will be deleted or overwritten.

allocate_stale_primary

Allocate a primary shard to a node that holds a stale copy. Accepts the index and shard for index name and shard number, and node to allocate the shard to. Using this command may lead to data loss for the provided shard id. If a node which has the good copy of the data rejoins the cluster later on, that data will be deleted or overwritten with the data of the stale copy that was forcefully allocated with this command. To ensure that these implications are well-understood, this command requires the flag accept_data_loss to be explicitly set to true.

allocate_empty_primary

Allocate an empty primary shard to a node. Accepts the index and shard for index name and shard number, and node to allocate the shard to. Using this command leads to a complete loss of all data that was indexed into this shard, if it was previously started. If a node which has a copy of the data rejoins the cluster later on, that data will be deleted. To ensure that these implications are well-understood, this command requires the flag accept_data_loss to be explicitly set to true.

Cluster Update Settings

Use this API to review and change cluster-wide settings.

To review cluster settings:

GET /_cluster/settings

By default, this API call only returns settings that have been explicitly defined, but it can also include the default settings (see the include_defaults parameter under Cluster Get Settings).

Updates to settings can be persistent, meaning they apply across restarts, or transient, where they don’t survive a full cluster restart. Here is an example of a persistent update:

PUT /_cluster/settings
{
    "persistent" : {
        "indices.recovery.max_bytes_per_sec" : "50mb"
    }
}

This update is transient:

PUT /_cluster/settings?flat_settings=true
{
    "transient" : {
        "indices.recovery.max_bytes_per_sec" : "20mb"
    }
}

The response to an update returns the changed setting, as in this response to the transient example:

{
    ...
    "persistent" : { },
    "transient" : {
        "indices.recovery.max_bytes_per_sec" : "20mb"
    }
}

You can reset persistent or transient settings by assigning a null value. If a transient setting is reset, the first one of these values that is defined is applied:

  • the persistent setting

  • the setting in the configuration file

  • the default value.

This example resets a setting:

PUT /_cluster/settings
{
    "transient" : {
        "indices.recovery.max_bytes_per_sec" : null
    }
}

The response does not include settings that have been reset:

{
    ...
    "persistent" : {},
    "transient" : {}
}

You can also reset settings using wildcards. For example, to reset all dynamic indices.recovery settings:

PUT /_cluster/settings
{
    "transient" : {
        "indices.recovery.*" : null
    }
}
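
A minimal Python sketch of the update-and-reset flow shown above, assuming the requests library and a cluster at http://localhost:9200:

import requests

settings_url = "http://localhost:9200/_cluster/settings"  # assumed endpoint

# Apply the transient recovery throttle from the example above.
requests.put(settings_url, json={
    "transient": {"indices.recovery.max_bytes_per_sec": "20mb"}
}).raise_for_status()

# Reset it again by assigning null (None in Python); the reset setting no
# longer appears in the response.
resp = requests.put(settings_url, json={
    "transient": {"indices.recovery.max_bytes_per_sec": None}
})
print(resp.json())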

Order of Precedence

The order of precedence for cluster settings is:

  1. transient cluster settings

  2. persistent cluster settings

  3. settings in the elasticsearch.yml configuration file.

It’s best to set all cluster-wide settings with the settings API and use the elasticsearch.yml file only for local configurations. This way you can be sure that the setting is the same on all nodes. If, on the other hand, you define different settings on different nodes by accident using the configuration file, it is very difficult to notice these discrepancies.

You can find the list of settings that you can dynamically update in Modules.

Cluster Get Settings

The cluster get settings API retrieves the cluster-wide settings.

GET /_cluster/settings

Or

GET /_cluster/settings?include_defaults=true

In the second example above, the parameter include_defaults ensures that the settings which were not set explicitly are also returned. By default include_defaults is set to false.

Nodes Stats

Nodes statistics

The cluster nodes stats API retrieves statistics for one or more (or all) of the cluster nodes.

GET /_nodes/stats
GET /_nodes/nodeId1,nodeId2/stats

The first command retrieves stats for all the nodes in the cluster. The second command selectively retrieves stats for only nodeId1 and nodeId2. All the node selection options are explained in Node specification.

By default, all stats are returned. You can limit this by combining any of indices, os, process, jvm, transport, http, fs, breaker and thread_pool. The available metrics are:

indices

Indices stats about size, document count, indexing and deletion times, search times, field cache size, merges and flushes

fs

File system information, data path, free disk space, read/write stats (see FS information)

http

HTTP connection information

jvm

JVM stats, memory pool information, garbage collection, buffer pools, number of loaded/unloaded classes

os

Operating system stats, load average, mem, swap (see OS statistics)

process

Process statistics, memory consumption, cpu usage, open file descriptors (see Process statistics)

thread_pool

Statistics about each thread pool, including current size, queue and rejected tasks

transport

Transport statistics about sent and received bytes in cluster communication

breaker

Statistics about the field data circuit breaker

discovery

Statistics about discovery

ingest

Statistics about ingest preprocessing

adaptive_selection

Statistics about adaptive replica selection. See adaptive selection statistics.

# return just indices
GET /_nodes/stats/indices

# return just os and process
GET /_nodes/stats/os,process

# return just process for node with IP address 10.0.0.1
GET /_nodes/10.0.0.1/stats/process

All stats can be explicitly requested via /_nodes/stats/_all or /_nodes/stats?metric=_all.
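
A minimal Python sketch of a filtered stats request, assuming the requests library, a local cluster at http://localhost:9200 and placeholder node ids:

import requests

# Only the os and process metrics, for two assumed node ids.
resp = requests.get(
    "http://localhost:9200/_nodes/nodeId1,nodeId2/stats/os,process"
)
for node_id, node in resp.json()["nodes"].items():
    print(node_id, node["name"], node["process"]["open_file_descriptors"])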

FS information

The fs flag can be set to retrieve information that concerns the file system:

fs.timestamp

Last time the file stores statistics have been refreshed

fs.total.total_in_bytes

Total size (in bytes) of all file stores

fs.total.free_in_bytes

Total number of unallocated bytes in all file stores

fs.total.available_in_bytes

Total number of bytes available to this Java virtual machine on all file stores

fs.data

List of all file stores

fs.data.path

Path to the file store

fs.data.mount

Mount point of the file store (ex: /dev/sda2)

fs.data.type

Type of the file store (ex: ext4)

fs.data.total_in_bytes

Total size (in bytes) of the file store

fs.data.free_in_bytes

Total number of unallocated bytes in the file store

fs.data.available_in_bytes

Total number of bytes available to this Java virtual machine on this file store

fs.io_stats.devices (Linux only)

Array of disk metrics for each device that is backing an Elasticsearch data path. These disk metrics are probed periodically and averages between the last probe and the current probe are computed.

fs.io_stats.devices.device_name (Linux only)

The Linux device name.

fs.io_stats.devices.operations (Linux only)

The total number of read and write operations for the device completed since starting Elasticsearch.

fs.io_stats.devices.read_operations (Linux only)

The total number of read operations for the device completed since starting Elasticsearch.

fs.io_stats.devices.write_operations (Linux only)

The total number of write operations for the device completed since starting Elasticsearch.

fs.io_stats.devices.read_kilobytes (Linux only)

The total number of kilobytes read for the device since starting Elasticsearch.

fs.io_stats.devices.write_kilobytes (Linux only)

The total number of kilobytes written for the device since starting Elasticsearch.

fs.io_stats.operations (Linux only)

The total number of read and write operations across all devices used by Elasticsearch completed since starting Elasticsearch.

fs.io_stats.read_operations (Linux only)

The total number of read operations across all devices used by Elasticsearch completed since starting Elasticsearch.

fs.io_stats.write_operations (Linux only)

The total number of write operations across all devices used by Elasticsearch completed since starting Elasticsearch.

fs.io_stats.read_kilobytes (Linux only)

The total number of kilobytes read across all devices used by Elasticsearch since starting Elasticsearch.

fs.io_stats.write_kilobytes (Linux only)

The total number of kilobytes written across all devices used by Elasticsearch since starting Elasticsearch.

Operating System statistics

The os flag can be set to retrieve statistics that concern the operating system:

os.timestamp

Last time the operating system statistics have been refreshed

os.cpu.percent

Recent CPU usage for the whole system, or -1 if not supported

os.cpu.load_average.1m

One-minute load average on the system (field is not present if one-minute load average is not available)

os.cpu.load_average.5m

Five-minute load average on the system (field is not present if five-minute load average is not available)

os.cpu.load_average.15m

Fifteen-minute load average on the system (field is not present if fifteen-minute load average is not available)

os.mem.total_in_bytes

Total amount of physical memory in bytes

os.mem.free_in_bytes

Amount of free physical memory in bytes

os.mem.free_percent

Percentage of free memory

os.mem.used_in_bytes

Amount of used physical memory in bytes

os.mem.used_percent

Percentage of used memory

os.swap.total_in_bytes

Total amount of swap space in bytes

os.swap.free_in_bytes

Amount of free swap space in bytes

os.swap.used_in_bytes

Amount of used swap space in bytes

os.cgroup.cpuacct.control_group (Linux only)

The cpuacct control group to which the Elasticsearch process belongs

os.cgroup.cpuacct.usage_nanos (Linux only)

The total CPU time (in nanoseconds) consumed by all tasks in the same cgroup as the Elasticsearch process

os.cgroup.cpu.control_group (Linux only)

The cpu control group to which the Elasticsearch process belongs

os.cgroup.cpu.cfs_period_micros (Linux only)

The period of time (in microseconds) for how regularly all tasks in the same cgroup as the Elasticsearch process should have their access to CPU resources reallocated.

os.cgroup.cpu.cfs_quota_micros (Linux only)

The total amount of time (in microseconds) for which all tasks in the same cgroup as the Elasticsearch process can run during one period (as specified by os.cgroup.cpu.cfs_period_micros)

os.cgroup.cpu.stat.number_of_elapsed_periods (Linux only)

The number of reporting periods (as specified by os.cgroup.cpu.cfs_period_micros) that have elapsed

os.cgroup.cpu.stat.number_of_times_throttled (Linux only)

The number of times all tasks in the same cgroup as the Elasticsearch process have been throttled.

os.cgroup.cpu.stat.time_throttled_nanos (Linux only)

The total amount of time (in nanoseconds) for which all tasks in the same cgroup as the Elasticsearch process have been throttled.

os.cgroup.memory.control_group (Linux only)

The memory control group to which the Elasticsearch process belongs

os.cgroup.memory.limit_in_bytes (Linux only)

The maximum amount of user memory (including file cache) allowed for all tasks in the same cgroup as the Elasticsearch process. This value can be too big to store in a long, so is returned as a string so that the value returned can exactly match what the underlying operating system interface returns. Any value that is too large to parse into a long almost certainly means no limit has been set for the cgroup.

os.cgroup.memory.usage_in_bytes (Linux only)

The total current memory usage by processes in the cgroup (in bytes) by all tasks in the same cgroup as the Elasticsearch process. This value is stored as a string for consistency with os.cgroup.memory.limit_in_bytes.

Note
For the cgroup stats to be visible, cgroups must be compiled into the kernel, the cpu and cpuacct cgroup subsystems must be configured and stats must be readable from /sys/fs/cgroup/cpu and /sys/fs/cgroup/cpuacct.

Process statistics

The process flag can be set to retrieve statistics that concern the currently running process:

process.timestamp

Last time the process statistics have been refreshed

process.open_file_descriptors

Number of opened file descriptors associated with the current process, or -1 if not supported

process.max_file_descriptors

Maximum number of file descriptors allowed on the system, or -1 if not supported

process.cpu.percent

CPU usage in percent, or -1 if not known at the time the stats are computed

process.cpu.total_in_millis

CPU time (in milliseconds) used by the process on which the Java virtual machine is running, or -1 if not supported

process.mem.total_virtual_in_bytes

Size in bytes of virtual memory that is guaranteed to be available to the running process

Indices statistics

You can get information about indices stats at the node, indices, or shards level.

# Fielddata summarised by node
GET /_nodes/stats/indices/fielddata?fields=field1,field2

# Fielddata summarised by node and index
GET /_nodes/stats/indices/fielddata?level=indices&fields=field1,field2

# Fielddata summarised by node, index, and shard
GET /_nodes/stats/indices/fielddata?level=shards&fields=field1,field2

# You can use wildcards for field names
GET /_nodes/stats/indices/fielddata?fields=field*

Supported metrics are:

  • completion

  • docs

  • fielddata

  • flush

  • get

  • indexing

  • merge

  • query_cache

  • recovery

  • refresh

  • request_cache

  • search

  • segments

  • store

  • translog

  • warmer

Search groups

You can get statistics about search groups for searches executed on this node.

# All groups with all stats
GET /_nodes/stats?groups=_all

# Some groups from just the indices stats
GET /_nodes/stats/indices?groups=foo,bar

Ingest statistics

The ingest flag can be set to retrieve statistics that concern ingest:

ingest.total.count

The total number of documents ingested during the lifetime of this node

ingest.total.time_in_millis

The total time spent on ingest preprocessing documents during the lifetime of this node

ingest.total.current

The total number of documents currently being ingested.

ingest.total.failed

The total number of ingest preprocessing operations that failed during the lifetime of this node

On top of these overall ingest statistics, these statistics are also provided on a per-pipeline basis.

Adaptive selection statistics

The adaptive_selection flag can be set to retrieve statistics that concern adaptive replica selection. These statistics are keyed by node. For each node:

adaptive_selection.outgoing_searches

The number of outstanding search requests from the node these stats are for to the keyed node.

avg_queue_size

The exponentially weighted moving average queue size of search requests on the keyed node.

avg_service_time_ns

The exponentially weighted moving average service time of search requests on the keyed node.

avg_response_time_ns

The exponentially weighted moving average response time of search requests on the keyed node.

rank

The rank of this node; used for shard selection when routing search requests.

Nodes Info

The cluster nodes info API retrieves information about one or more (or all) of the cluster nodes.

GET /_nodes
GET /_nodes/nodeId1,nodeId2

The first command retrieves information about all the nodes in the cluster. The second command selectively retrieves information for only nodeId1 and nodeId2. All the node selection options are explained in Node specification.

By default, it just returns all attributes and core settings for a node:

build_hash

Short hash of the last git commit in this release.

host

The node’s host name.

ip

The node’s IP address.

name

The node’s name.

total_indexing_buffer

Total heap allowed to be used to hold recently indexed documents before they must be written to disk. This size is a shared pool across all shards on this node, and is controlled by Indexing Buffer settings.

total_indexing_buffer_in_bytes

Same as total_indexing_buffer, but expressed in bytes.

transport_address

Host and port where transport HTTP connections are accepted.

version

Elasticsearch version running on this node.

It also allows retrieving only information on settings, os, process, jvm, thread_pool, transport, http, plugins, ingest and indices:

# return just process
GET /_nodes/process

# same as above
GET /_nodes/_all/process

# return just jvm and process of only nodeId1 and nodeId2
GET /_nodes/nodeId1,nodeId2/jvm,process

# same as above
GET /_nodes/nodeId1,nodeId2/info/jvm,process

# return all the information of only nodeId1 and nodeId2
GET /_nodes/nodeId1,nodeId2/_all

The _all flag can be set to return all the information - or you can simply omit it.
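
A minimal Python sketch of a filtered info request, assuming the requests library, a local cluster at http://localhost:9200 and placeholder node ids:

import requests

# Only the jvm and process information, for two assumed node ids.
resp = requests.get(
    "http://localhost:9200/_nodes/nodeId1,nodeId2/jvm,process"
)
for node_id, node in resp.json()["nodes"].items():
    print(node_id, node["jvm"]["version"], node["process"]["id"])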

Operating System information

The os flag can be set to retrieve information that concerns the operating system:

os.refresh_interval_in_millis

Refresh interval for the OS statistics

os.name

Name of the operating system (ex: Linux, Windows, Mac OS X)

os.arch

Name of the JVM architecture (ex: amd64, x86)

os.version

Version of the operating system

os.available_processors

Number of processors available to the Java virtual machine

os.allocated_processors

The number of processors actually used to calculate thread pool size. This number can be set with the processors setting of a node and defaults to the number of processors reported by the OS. In both cases this number will never be larger than 32.

Process information

The process flag can be set to retrieve information that concerns the currently running process:

process.refresh_interval_in_millis

Refresh interval for the process statistics

process.id

Process identifier (PID)

process.mlockall

Indicates if the process address space has been successfully locked in memory

Plugins information

plugins - if set, the result will contain details about the installed plugins and modules per node:

GET /_nodes/plugins

The result will look similar to:

{
  "_nodes": ...
  "cluster_name": "elasticsearch",
  "nodes": {
    "USpTGYaBSIKbgSUJR2Z9lg": {
      "name": "node-0",
      "transport_address": "192.168.17:9300",
      "host": "node-0.elastic.co",
      "ip": "192.168.17",
      "version": "{version}",
      "build_flavor": "{build_flavor}",
      "build_type": "zip",
      "build_hash": "587409e",
      "roles": [
        "master",
        "data",
        "ingest"
      ],
      "attributes": {},
      "plugins": [
        {
          "name": "analysis-icu",
          "version": "{version}",
          "description": "The ICU Analysis plugin integrates Lucene ICU module into elasticsearch, adding ICU relates analysis components.",
          "classname": "org.elasticsearch.plugin.analysis.icu.AnalysisICUPlugin",
          "has_native_controller": false
        }
      ],
      "modules": [
        {
          "name": "lang-painless",
          "version": "{version}",
          "description": "An easy, safe and fast scripting language for Elasticsearch",
          "classname": "org.elasticsearch.painless.PainlessPlugin",
          "has_native_controller": false
        }
      ]
    }
  }
}

The following information is available for each plugin and module:

  • name: plugin name

  • version: version of Elasticsearch the plugin was built for

  • description: short description of the plugin’s purpose

  • classname: fully-qualified class name of the plugin’s entry point

  • has_native_controller: whether or not the plugin has a native controller process

Ingest information

ingest - if set, the result will contain details about the available processors per node:

GET /_nodes/ingest

The result will look similar to:

{
  "_nodes": ...
  "cluster_name": "elasticsearch",
  "nodes": {
    "USpTGYaBSIKbgSUJR2Z9lg": {
      "name": "node-0",
      "transport_address": "192.168.17:9300",
      "host": "node-0.elastic.co",
      "ip": "192.168.17",
      "version": "{version}",
      "build_flavor": "{build_flavor}",
      "build_type": "zip",
      "build_hash": "587409e",
      "roles": [],
      "attributes": {},
      "ingest": {
        "processors": [
          {
            "type": "date"
          },
          {
            "type": "uppercase"
          },
          {
            "type": "set"
          },
          {
            "type": "lowercase"
          },
          {
            "type": "gsub"
          },
          {
            "type": "convert"
          },
          {
            "type": "remove"
          },
          {
            "type": "fail"
          },
          {
            "type": "foreach"
          },
          {
            "type": "split"
          },
          {
            "type": "trim"
          },
          {
            "type": "rename"
          },
          {
            "type": "join"
          },
          {
            "type": "append"
          }
        ]
      }
    }
  }
}

The following information is available for each ingest processor:

  • type: the processor type

Nodes Feature Usage

Nodes usage

The cluster nodes usage API retrieves information on the usage of features for each node.

GET _nodes/usage
GET _nodes/nodeId1,nodeId2/usage

The first command retrieves usage information for all the nodes in the cluster. The second command selectively retrieves usage information for only nodeId1 and nodeId2. All the node selection options are explained in Node specification.

REST actions usage information

The rest_actions field in the response contains a map of REST action class names to the number of times each action has been called on the node:

{
  "_nodes": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "cluster_name": "my_cluster",
  "nodes": {
    "pQHNt5rXTTWNvUgOrdynKg": {
      "timestamp": 1492553961812, (1)
      "since": 1492553906606, (2)
      "rest_actions": {
        "org.elasticsearch.rest.action.admin.cluster.RestNodesUsageAction": 1,
        "org.elasticsearch.rest.action.admin.indices.RestCreateIndexAction": 1,
        "org.elasticsearch.rest.action.document.RestGetAction": 1,
        "org.elasticsearch.rest.action.search.RestSearchAction": 19, (3)
        "org.elasticsearch.rest.action.admin.cluster.RestNodesInfoAction": 36
      }
    }
  }
}
  1. Timestamp for when this nodes usage request was performed.

  2. Timestamp for when the usage information recording was started. This is equivalent to the time that the node was started.

  3. Search action has been called 19 times for this node.

Remote Cluster Info

The cluster remote info API retrieves all of the configured remote cluster information.

GET /_remote/info

This command returns connection and endpoint information keyed by the configured remote cluster alias.

seeds

The configured initial seed transport addresses of the remote cluster.

http_addresses

The published http addresses of all connected remote nodes.

connected

True if there is at least one connection to the remote cluster.

num_nodes_connected

The number of connected nodes in the remote cluster.

max_connections_per_cluster

The maximum number of connections maintained for the remote cluster.

initial_connect_timeout

The initial connect timeout for remote cluster connections.

skip_unavailable

Whether the remote cluster is skipped in case it is searched through a cross-cluster search request but none of its nodes are available.

Task Management API

Beta
The Task Management API is new and should still be considered a beta feature. The API may change in ways that are not backwards compatible.

Current Tasks Information

The task management API retrieves information about the tasks currently executing on one or more nodes in the cluster.

GET _tasks (1)
GET _tasks?nodes=nodeId1,nodeId2 (2)
GET _tasks?nodes=nodeId1,nodeId2&actions=cluster:* (3)
  1. Retrieves all tasks currently running on all nodes in the cluster.

  2. Retrieves all tasks running on nodes nodeId1 and nodeId2. See Node specification for more info about how to select individual nodes.

  3. Retrieves all cluster-related tasks running on nodes nodeId1 and nodeId2.

The result will look similar to the following:

{
  "nodes" : {
    "oTUltX4IQMOUUVeiohTt8A" : {
      "name" : "H5dfFeA",
      "transport_address" : "127.0.0.1:9300",
      "host" : "127.0.0.1",
      "ip" : "127.0.0.1:9300",
      "tasks" : {
        "oTUltX4IQMOUUVeiohTt8A:124" : {
          "node" : "oTUltX4IQMOUUVeiohTt8A",
          "id" : 124,
          "type" : "direct",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1458585884904,
          "running_time_in_nanos" : 47402,
          "cancellable" : false,
          "parent_task_id" : "oTUltX4IQMOUUVeiohTt8A:123"
        },
        "oTUltX4IQMOUUVeiohTt8A:123" : {
          "node" : "oTUltX4IQMOUUVeiohTt8A",
          "id" : 123,
          "type" : "transport",
          "action" : "cluster:monitor/tasks/lists",
          "start_time_in_millis" : 1458585884904,
          "running_time_in_nanos" : 236042,
          "cancellable" : false
        }
      }
    }
  }
}

It is also possible to retrieve information for a particular task. The following example retrieves information about task oTUltX4IQMOUUVeiohTt8A:124:

GET _tasks/oTUltX4IQMOUUVeiohTt8A:124

If the task isn’t found, the API returns a 404.

To retrieve all children of a particular task:

GET _tasks?parent_task_id=oTUltX4IQMOUUVeiohTt8A:123

If the parent isn’t found, the API does not return a 404.

You can also use the detailed request parameter to get more information about the running tasks. This is useful for telling one task from another but is more costly to execute. For example, fetching all searches using the detailed request parameter:

GET _tasks?actions=*search&detailed

The results might look like:

{
  "nodes" : {
    "oTUltX4IQMOUUVeiohTt8A" : {
      "name" : "H5dfFeA",
      "transport_address" : "127.0.0.1:9300",
      "host" : "127.0.0.1",
      "ip" : "127.0.0.1:9300",
      "tasks" : {
        "oTUltX4IQMOUUVeiohTt8A:464" : {
          "node" : "oTUltX4IQMOUUVeiohTt8A",
          "id" : 464,
          "type" : "transport",
          "action" : "indices:data/read/search",
          "description" : "indices[test], types[test], search_type[QUERY_THEN_FETCH], source[{\"query\":...}]",
          "start_time_in_millis" : 1483478610008,
          "running_time_in_nanos" : 13991383,
          "cancellable" : true
        }
      }
    }
  }
}

The new description field contains human readable text that identifies the particular request that the task is performing such as identifying the search request being performed by a search task like the example above. Other kinds of task have different descriptions, like _reindex which has the search and the destination, or _bulk which just has the number of requests and the destination indices. Many requests will only have an empty description because more detailed information about the request is not easily available or particularly helpful in identifying the request.

Important

_tasks requests with detailed may also return a status. This is a report of the internal status of the task. As such its format varies from task to task. While we try to keep the status for a particular task consistent from version to version this isn’t always possible because we sometimes change the implementation. In that case we might remove fields from the status for a particular request so any parsing you do of the status might break in minor releases.

The task API can also be used to wait for completion of a particular task. The following call will block for 10 seconds or until the task with id oTUltX4IQMOUUVeiohTt8A:12345 is completed.

GET _tasks/oTUltX4IQMOUUVeiohTt8A:12345?wait_for_completion=true&timeout=10s

You can also wait for all tasks for certain action types to finish. This command will wait for all reindex tasks to finish:

GET _tasks?actions=*reindex&wait_for_completion=true&timeout=10s
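
A minimal Python sketch combining the listing and wait_for_completion calls above, assuming the requests library and a local cluster at http://localhost:9200:

import requests

tasks_url = "http://localhost:9200/_tasks"  # assumed local endpoint

# List currently running search tasks together with their descriptions.
resp = requests.get(tasks_url,
                    params={"actions": "*search", "detailed": "true"})
for node in resp.json()["nodes"].values():
    for task_id, task in node["tasks"].items():
        print(task_id, task["action"], task.get("description", ""))

# Block for up to 10 seconds until all reindex tasks have finished.
requests.get(tasks_url,
             params={"actions": "*reindex",
                     "wait_for_completion": "true",
                     "timeout": "10s"})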

Tasks can also be listed using the _cat version of the list tasks command, which accepts the same arguments as the standard list tasks command.

GET _cat/tasks
GET _cat/tasks?detailed

Task Cancellation

If a long-running task supports cancellation, it can be cancelled with the cancel tasks API. The following example cancels task oTUltX4IQMOUUVeiohTt8A:12345:

POST _tasks/oTUltX4IQMOUUVeiohTt8A:12345/_cancel

The task cancellation command supports the same task selection parameters as the list tasks command, so multiple tasks can be cancelled at the same time. For example, the following command will cancel all reindex tasks running on the nodes nodeId1 and nodeId2.

POST _tasks/_cancel?nodes=nodeId1,nodeId2&actions=*reindex

Task Grouping

The task lists returned by task API commands can be grouped either by nodes (default) or by parent tasks using the group_by parameter. The following command will change the grouping to parent tasks:

GET _tasks?group_by=parents

The grouping can be disabled by specifying none as a group_by parameter:

GET _tasks?group_by=none

Identifying running tasks

The X-Opaque-Id header, when provided on the HTTP request, is returned as a header in the response as well as in the headers field of the task information. This allows you to track certain calls, or associate certain tasks with the client that started them:

curl -i -H "X-Opaque-Id: 123456" "http://localhost:9200/_tasks?group_by=parents"

The result will look similar to the following:

HTTP/1.1 200 OK
X-Opaque-Id: 123456 (1)
content-type: application/json; charset=UTF-8
content-length: 831

{
  "tasks" : {
    "u5lcZHqcQhu-rUoFaqDphA:45" : {
      "node" : "u5lcZHqcQhu-rUoFaqDphA",
      "id" : 45,
      "type" : "transport",
      "action" : "cluster:monitor/tasks/lists",
      "start_time_in_millis" : 1513823752749,
      "running_time_in_nanos" : 293139,
      "cancellable" : false,
      "headers" : {
        "X-Opaque-Id" : "123456" (2)
      },
      "children" : [
        {
          "node" : "u5lcZHqcQhu-rUoFaqDphA",
          "id" : 46,
          "type" : "direct",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1513823752750,
          "running_time_in_nanos" : 92133,
          "cancellable" : false,
          "parent_task_id" : "u5lcZHqcQhu-rUoFaqDphA:45",
          "headers" : {
            "X-Opaque-Id" : "123456" (3)
          }
        }
      ]
    }
  }
}
  1. id as a part of the response header

  2. id for the task that was initiated by the REST request

  3. the child task of the task initiated by the REST request

Nodes hot_threads

This API yields a breakdown of the hot threads on each selected node in the cluster. Its endpoints are /_nodes/hot_threads and /_nodes/{nodes}/hot_threads:

GET /_nodes/hot_threads
GET /_nodes/nodeId1,nodeId2/hot_threads

The first command gets the hot threads of all the nodes in the cluster. The second command gets the hot threads of only nodeId1 and nodeId2. Nodes can be selected using node filters.

The output is plain text with a breakdown of each node’s top hot threads. The allowed parameters are:

threads

number of hot threads to provide, defaults to 3.

interval

the interval to do the second sampling of threads. Defaults to 500ms.

type

The type to sample, defaults to cpu, but supports wait and block to see hot threads that are in wait or block state.

ignore_idle_threads

If true, known idle threads (e.g. waiting in a socket select, or to get a task from an empty queue) are filtered out. Defaults to true.
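
A minimal Python sketch of a parameterised hot threads request, assuming the requests library and a local cluster at http://localhost:9200:

import requests

# Top 5 hot threads per node, sampled over a 1 second interval; the
# response body is plain text rather than JSON.
resp = requests.get(
    "http://localhost:9200/_nodes/hot_threads",
    params={"threads": 5, "interval": "1s", "type": "cpu"},
)
print(resp.text)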

Cluster Allocation Explain API

The purpose of the cluster allocation explain API is to provide explanations for shard allocations in the cluster. For unassigned shards, the explain API provides an explanation for why the shard is unassigned. For assigned shards, the explain API provides an explanation for why the shard is remaining on its current node and has not moved or rebalanced to another node. This API can be very useful when attempting to diagnose why a shard is unassigned or why a shard continues to remain on its current node when you might expect otherwise.

Explain API Request

To explain the allocation of a shard, first an index should exist:

PUT /myindex

And then the allocation for shards of that index can be explained:

GET /_cluster/allocation/explain
{
  "index": "myindex",
  "shard": 0,
  "primary": true
}

Specify the index and shard id of the shard you would like an explanation for, as well as the primary flag to indicate whether to explain the primary shard for the given shard id or one of its replica shards. These three request parameters are required.

You may also specify an optional current_node request parameter to only explain a shard that is currently located on current_node. The current_node can be specified as either the node id or node name.

GET /_cluster/allocation/explain
{
  "index": "myindex",
  "shard": 0,
  "primary": false,
  "current_node": "nodeA"                         (1)
}
  1. The node on which shard 0 currently has a replica

You can also have Elasticsearch explain the allocation of the first unassigned shard that it finds by sending an empty body for the request:

GET /_cluster/allocation/explain
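
A minimal Python sketch of an explain request for a specific shard, assuming the requests library, a local cluster at http://localhost:9200 and myindex as a placeholder index name:

import requests

# Explain allocation for primary shard 0 of an assumed index "myindex".
resp = requests.get(
    "http://localhost:9200/_cluster/allocation/explain",
    json={"index": "myindex", "shard": 0, "primary": True},
)
explain = resp.json()
print(explain.get("current_state"), explain.get("allocate_explanation"))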

Explain API Response

This section includes examples of the cluster allocation explain API response output under various scenarios.

The API response for an unassigned shard:

{
  "index" : "idx",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",                 (1)
  "unassigned_info" : {
    "reason" : "INDEX_CREATED",                   (2)
    "at" : "2017-01-04T18:08:16.600Z",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "no",                          (3)
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "8qt2rY-pT6KNZB3-hGfLnw",
      "node_name" : "node-0",
      "transport_address" : "127.0.0.1:9401",
      "node_attributes" : {},
      "node_decision" : "no",                     (4)
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "filter",                   (5)
          "decision" : "NO",
          "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"  (6)
        }
      ]
    }
  ]
}
  1. The current state of the shard

  2. The reason for the shard originally becoming unassigned

  3. Whether to allocate the shard

  4. Whether to allocate the shard to the particular node

  5. The decider which led to the no decision for the node

  6. An explanation as to why the decider returned a no decision, with a helpful hint pointing to the setting that led to the decision

You can return information gathered by the cluster info service about disk usage and shard sizes by setting the include_disk_info parameter to true:

GET /_cluster/allocation/explain?include_disk_info=true

Additionally, if you would like to include all decisions that were factored into the final decision, the include_yes_decisions parameter will return all decisions for each node:

GET /_cluster/allocation/explain?include_yes_decisions=true

The default value for include_yes_decisions is false, which will only include the no decisions in the response. This is generally what you would want, as the no decisions indicate why a shard is unassigned or cannot be moved, and including all decisions, including the yes ones, adds a lot of verbosity to the API’s response output.

The API response output for an unassigned primary shard that had previously been allocated to a node in the cluster:

{
  "index" : "idx",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2017-01-04T18:03:28.464Z",
    "details" : "node_left[OIWe8UhhThCK0V5XfmdrmQ]",
    "last_allocation_status" : "no_valid_shard_copy"
  },
  "can_allocate" : "no_valid_shard_copy",
  "allocate_explanation" : "cannot allocate because a previous copy of the primary shard existed but can no longer be found on the nodes in the cluster"
}

The API response output for a replica that is unassigned due to delayed allocation:

{
  "index" : "idx",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2017-01-04T18:53:59.498Z",
    "details" : "node_left[G92ZwuuaRY-9n8_tc-IzEg]",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "allocation_delayed",
  "allocate_explanation" : "cannot allocate because the cluster is still waiting 59.8s for the departed node holding a replica to rejoin, despite being allowed to allocate the shard to at least one other node",
  "configured_delay" : "1m",                      (1)
  "configured_delay_in_millis" : 60000,
  "remaining_delay" : "59.8s",                    (2)
  "remaining_delay_in_millis" : 59824,
  "node_allocation_decisions" : [
    {
      "node_id" : "pmnHu_ooQWCPEFobZGbpWw",
      "node_name" : "node_t2",
      "transport_address" : "127.0.0.1:9402",
      "node_decision" : "yes"
    },
    {
      "node_id" : "3sULLVJrRneSg0EfBB-2Ew",
      "node_name" : "node_t0",
      "transport_address" : "127.0.0.1:9400",
      "node_decision" : "no",
      "store" : {                                 (3)
        "matching_size" : "4.2kb",
        "matching_size_in_bytes" : 4325
      },
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[idx][0], node[3sULLVJrRneSg0EfBB-2Ew], [P], s[STARTED], a[id=eV9P8BN1QPqRc3B4PLx6cg]]"
        }
      ]
    }
  ]
}
  1. The configured delay before allocating a replica shard that does not exist due to the node holding it leaving the cluster

  2. The remaining delay before allocating the replica shard

  3. Information about the shard data found on a node

The API response output for an assigned shard that is not allowed to remain on its current node and is required to move:

{
  "index" : "idx",
  "shard" : 0,
  "primary" : true,
  "current_state" : "started",
  "current_node" : {
    "id" : "8lWJeJ7tSoui0bxrwuNhTA",
    "name" : "node_t1",
    "transport_address" : "127.0.0.1:9401"
  },
  "can_remain_on_current_node" : "no",            (1)
  "can_remain_decisions" : [                      (2)
    {
      "decider" : "filter",
      "decision" : "NO",
      "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"
    }
  ],
  "can_move_to_other_node" : "no",                (3)
  "move_explanation" : "cannot move shard to another node, even though it is not allowed to remain on its current node",
  "node_allocation_decisions" : [
    {
      "node_id" : "_P8olZS8Twax9u6ioN-GGA",
      "node_name" : "node_t0",
      "transport_address" : "127.0.0.1:9400",
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"non_existent_node\"]"
        }
      ]
    }
  ]
}
  1. Whether the shard is allowed to remain on its current node

  2. The deciders that factored into the decision of why the shard is not allowed to remain on its current node

  3. Whether the shard is allowed to be allocated to another node

The API response output for an assigned shard that remains on its current node because moving the shard to another node does not form a better cluster balance:

{
  "index" : "idx",
  "shard" : 0,
  "primary" : true,
  "current_state" : "started",
  "current_node" : {
    "id" : "wLzJm4N4RymDkBYxwWoJsg",
    "name" : "node_t0",
    "transport_address" : "127.0.0.1:9400",
    "weight_ranking" : 1
  },
  "can_remain_on_current_node" : "yes",
  "can_rebalance_cluster" : "yes",                (1)
  "can_rebalance_to_other_node" : "no",           (2)
  "rebalance_explanation" : "cannot rebalance as no target node exists that can both allocate this shard and improve the cluster balance",
  "node_allocation_decisions" : [
    {
      "node_id" : "oE3EGFc8QN-Tdi5FFEprIA",
      "node_name" : "node_t1",
      "transport_address" : "127.0.0.1:9401",
      "node_decision" : "worse_balance",          (3)
      "weight_ranking" : 1
    }
  ]
}
  1. Whether rebalancing is allowed on the cluster

  2. Whether the shard can be rebalanced to another node

  3. The reason the shard cannot be rebalanced to the node, in this case indicating that it offers no better balance than the current node