"Fossies" - the Fresh Open Source Software Archive

Member "elasticsearch-6.8.23/docs/reference/cat.asciidoc" (29 Dec 2021, 6766 Bytes) of package /linux/www/elasticsearch-6.8.23-src.tar.gz:


As a special service "Fossies" has tried to format the requested source page into HTML format (assuming AsciiDoc format). Alternatively you can here view or download the uninterpreted source code file. A member file download can also be achieved by clicking within a package contents listing on the according byte size field.

cat aliases

aliases shows information about currently configured aliases to indices, including filter and routing information.

GET /_cat/aliases?v

Might respond with:

alias  index filter routing.index routing.search
alias1 test1 -      -            -
alias2 test1 *      -            -
alias3 test1 -      1            1
alias4 test1 -      2            1,2

The output shows that alias2 has a configured filter, and that alias3 and alias4 have specific routing configurations.

If you only want to get information about specific aliases, you can specify the aliases in comma-delimited format as a URL parameter, e.g., /_cat/aliases/alias1,alias2.
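For context, a filtered or routed alias like the ones shown above could have been set up with the alias API; the filter field and value below are purely illustrative, while the alias4 routing values match the sample output:

POST /_aliases
{
  "actions": [
    { "add": { "index": "test1", "alias": "alias2", "filter": { "term": { "user": "kimchy" } } } },
    { "add": { "index": "test1", "alias": "alias4", "index_routing": "2", "search_routing": "1,2" } }
  ]
}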

cat allocation

allocation provides a snapshot of how many shards are allocated to each data node and how much disk space they are using.

GET /_cat/allocation?v

Might respond with:

shards disk.indices disk.used disk.avail disk.total disk.percent host      ip        node
     5         260b    47.3gb     43.4gb    100.7gb           46 127.0.0.1 127.0.0.1 CSUXak2

Here we can see that all 5 shards have been allocated to the single node available.
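The output can also be limited to one or more nodes by appending node IDs or names to the URL, for example using the node from the output above:

GET /_cat/allocation/CSUXak2?v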

cat count

count provides quick access to the document count of the entire cluster, or individual indices.

GET /_cat/count?v

Looks like:

epoch      timestamp count
1475868259 15:24:19  121

Or for a single index:

GET /_cat/count/twitter?v

epoch      timestamp count
1475868259 15:24:20  120

Note
The document count indicates the number of live documents and does not include deleted documents which have not yet been cleaned up by the merge process.

cat fielddata

fielddata shows how much heap memory is currently being used by fielddata on every data node in the cluster.

GET /_cat/fielddata?v

Looks like:

id                     host      ip        node    field   size
Nqk-6inXQq-OxUfOUI8jNQ 127.0.0.1 127.0.0.1 Nqk-6in body    544b
Nqk-6inXQq-OxUfOUI8jNQ 127.0.0.1 127.0.0.1 Nqk-6in soul    480b

Fields can be specified either as a query parameter, or in the URL path:

GET /_cat/fielddata?v&fields=body

Which looks like:

id                     host      ip        node    field   size
Nqk-6inXQq-OxUfOUI8jNQ 127.0.0.1 127.0.0.1 Nqk-6in body    544b

And it accepts a comma-delimited list:

GET /_cat/fielddata/body,soul?v

Which produces the same output as the first snippet:

id                     host      ip        node    field   size
Nqk-6inXQq-OxUfOUI8jNQ 127.0.0.1 127.0.0.1 Nqk-6in body    544b
Nqk-6inXQq-OxUfOUI8jNQ 127.0.0.1 127.0.0.1 Nqk-6in soul    480b

The output shows the individual fielddata for the body and soul fields, one row per field per node.

cat health

health is a terse, one-line representation of the same information from /_cluster/health.

GET /_cat/health?v

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1475871424 16:17:04  elasticsearch green           1         1      5   5    0    0        0             0                  -                100.0%

It has one option, ts, which disables the timestamp:

GET /_cat/health?v&ts=false

which looks like:

cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
elasticsearch green           1         1      5   5    0    0        0             0                  -                100.0%

A common use of this command is to verify the health is consistent across nodes:

% pssh -i -h list.of.cluster.hosts curl -s localhost:9200/_cat/health
[1] 20:20:52 [SUCCESS] es3.vm
1384309218 18:20:18 foo green 3 3 3 3 0 0 0 0
[2] 20:20:52 [SUCCESS] es1.vm
1384309218 18:20:18 foo green 3 3 3 3 0 0 0 0
[3] 20:20:52 [SUCCESS] es2.vm
1384309218 18:20:18 foo green 3 3 3 3 0 0 0 0

A less obvious use is to track recovery of a large cluster over time. With enough shards, starting a cluster, or even recovering after losing a node, can take time (depending on your network & disk). A way to track its progress is by using this command in a delayed loop:

% while true; do curl localhost:9200/_cat/health; sleep 120; done
1384309446 18:24:06 foo red 3 3 20 20 0 0 1812 0
1384309566 18:26:06 foo yellow 3 3 950 916 0 12 870 0
1384309686 18:28:06 foo yellow 3 3 1328 916 0 12 492 0
1384309806 18:30:06 foo green 3 3 1832 916 4 0 0
^C

In this scenario, we can tell that recovery took roughly four minutes. If this were going on for hours, we would be able to watch the UNASSIGNED shards drop precipitously. If that number remained static, we would have an idea that there is a problem.

Why the timestamp?

You typically use the health command when a cluster is malfunctioning. During this period, it’s extremely important to correlate activities across log files, alerting systems, etc.

There are two outputs. The HH:MM:SS output is simply for quick human consumption. The epoch time retains more information, including date, and is machine sortable if your recovery spans days.
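For example, if each health line is appended to a file, the leading epoch column makes the log trivially sortable later; a minimal shell sketch (the file name is illustrative):

% while true; do curl -s localhost:9200/_cat/health >> health.log; sleep 120; done
% sort -n -k1 health.log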

cat indices

The indices command provides a cross-section of each index. This information spans nodes. For example:

GET /_cat/indices/twi*?v&s=index

Might respond with:

health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   twitter  u8FNjxh8Rfy_awN11oDKYQ   1   1       1200            0     88.1kb         88.1kb
green  open   twitter2 nYFWZEO7TUiOjLQXBaYJpA   5   0          0            0       260b           260b

We can tell quickly how many shards make up an index, the number of docs, deleted docs, primary store size, and total store size (all shards including replicas). All these exposed metrics come directly from Lucene APIs.

Notes:

  1. Since the document and deleted-document counts shown here are at the Lucene level, they include hidden documents (e.g. from nested documents) as well.

  2. To get the actual count of documents at the Elasticsearch level, the recommended way is to use either the cat count API or the count API.

Primaries

By default, the index stats are shown for all of an index’s shards, including replicas. A pri flag can be supplied to view the relevant stats in the context of only the primaries.
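
For example, to restrict the twitter stats to its primary shards only:

GET /_cat/indices/twitter?pri&v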

Examples

Which indices are yellow?

GET /_cat/indices?v&health=yellow

Which looks like:

health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   twitter  u8FNjxh8Rfy_awN11oDKYQ   1   1       1200            0     88.1kb         88.1kb

Which index has the largest number of documents?

GET /_cat/indices?v&s=docs.count:desc

Which looks like:

health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   twitter  u8FNjxh8Rfy_awN11oDKYQ   1   1       1200            0     88.1kb         88.1kb
green  open   twitter2 nYFWZEO7TUiOjLQXBaYJpA   5   0          0            0       260b           260b

How many merge operations have the shards for the twitter index completed?

GET /_cat/indices/twitter?pri&v&h=health,index,pri,rep,docs.count,mt

Might look like:

health index   pri rep docs.count mt pri.mt
yellow twitter   1   1 1200       16     16

How much memory is used per index?

GET /_cat/indices?v&h=i,tm&s=tm:desc

Might look like:

i         tm
twitter   8.1gb
twitter2  30.5kb

cat master

master doesn’t have any extra options. It simply displays the master’s node ID, bound IP address, and node name. For example:

GET /_cat/master?v

might respond:

id                     host      ip        node
YzWoH_2BT-6UjVGDyPdqYg 127.0.0.1 127.0.0.1 YzWoH_2

This information is also available via the nodes command, but this is slightly shorter when all you want to do, for example, is verify all nodes agree on the master:

% pssh -i -h list.of.cluster.hosts curl -s localhost:9200/_cat/master
[1] 19:16:37 [SUCCESS] es3.vm
Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 H5dfFeA
[2] 19:16:37 [SUCCESS] es2.vm
Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 H5dfFeA
[3] 19:16:37 [SUCCESS] es1.vm
Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 H5dfFeA

cat nodeattrs

The nodeattrs command shows custom node attributes. For example:

GET /_cat/nodeattrs?v

Could look like:

node    host      ip        attr     value
...
node-0 127.0.0.1 127.0.0.1 testattr test
...

The first few columns (node, host, ip) give you basic info per node and the attr and value columns give you the custom node attributes, one per line.

Columns

Below is an exhaustive list of the existing headers that can be passed to nodeattrs?h= to retrieve the relevant details in ordered columns. If no headers are specified, then those marked to Appear by Default will appear. If any header is specified, then the defaults are not used.

Aliases can be used in place of the full header name for brevity. Columns appear in the order that they are listed below unless a different order is specified (e.g., h=attr,value versus h=value,attr).

When specifying headers, the headers are not placed in the output by default. To have the headers appear in the output, use verbose mode (v). The header name will match the supplied value (e.g., pid versus p). For example:

GET /_cat/nodeattrs?v&h=name,pid,attr,value

Might look like:

name    pid   attr     value
...
node-0 19566 testattr test
...

Header  Alias       Appear by Default  Description           Example
node    name        Yes                Name of the node      DKDM97B
id      nodeId      No                 Unique node ID        k0zy
pid     p           No                 Process ID            13061
host    h           Yes                Host name             n1
ip      i           Yes                IP address            127.0.1.1
port    po          No                 Bound transport port  9300
attr    attr.name   Yes                Attribute name        rack
value   attr.value  Yes                Attribute value       rack123

cat nodes

The nodes command shows the cluster topology. For example:

GET /_cat/nodes?v

Might look like:

ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
127.0.0.1           65          99  42    3.07                  mdi       *      mJw06l1

The first few columns (ip, heap.percent, ram.percent, cpu, load_*) tell you where your nodes live and give a quick picture of performance stats.

The last (node.role, master, and name) columns provide ancillary information that can often be useful when looking at the cluster as a whole, particularly large ones. How many master-eligible nodes do I have?

The nodes API accepts an additional URL parameter full_id, which accepts true or false. This parameter controls whether the ID field (if requested with id or nodeId) is shown in its full length or in abbreviated form (the default).
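
For example, to display unabbreviated node IDs:

GET /_cat/nodes?v&h=id,ip,name&full_id=true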

Columns

Below is an exhaustive list of the existing headers that can be passed to nodes?h= to retrieve the relevant details in ordered columns. If no headers are specified, then those marked to Appear by Default will appear. If any header is specified, then the defaults are not used.

Aliases can be used in place of the full header name for brevity. Columns appear in the order that they are listed below unless a different order is specified (e.g., h=pid,id versus h=id,pid).

When specifying headers, the headers are not placed in the output by default. To have the headers appear in the output, use verbose mode (v). The header name will match the supplied value (e.g., pid versus p). For example:

GET /_cat/nodes?v&h=id,ip,port,v,m

Might look like:

id   ip        port  v         m
veJR 127.0.0.1 59938 {version} *

Header                         Alias                             Appear by Default  Description                                    Example
id                             nodeId                            No                 Unique node ID                                 k0zy
pid                            p                                 No                 Process ID                                     13061
ip                             i                                 Yes                IP address                                     127.0.1.1
port                           po                                No                 Bound transport port                           9300
http_address                   http                              No                 Bound http address                             127.0.0.1:9200
version                        v                                 No                 Elasticsearch version                          {version}
build                          b                                 No                 Elasticsearch build hash                       5c03844
jdk                            j                                 No                 Running Java version                           1.8.0
disk.total                     dt, diskTotal                     No                 Total disk space                               458.3gb
disk.used                      du, diskUsed                      No                 Used disk space                                259.8gb
disk.avail                     d, disk, diskAvail                No                 Available disk space                           198.4gb
disk.used_percent              dup, diskUsedPercent              No                 Used disk space percentage                     56.71
heap.current                   hc, heapCurrent                   No                 Used heap                                      311.2mb
heap.percent                   hp, heapPercent                   Yes                Used heap percentage                           7
heap.max                       hm, heapMax                       No                 Maximum configured heap                        1015.6mb
ram.current                    rc, ramCurrent                    No                 Used total memory                              513.4mb
ram.percent                    rp, ramPercent                    Yes                Used total memory percentage                   47
ram.max                        rm, ramMax                        No                 Total memory                                   2.9gb
file_desc.current              fdc, fileDescriptorCurrent        No                 Used file descriptors                          123
file_desc.percent              fdp, fileDescriptorPercent        Yes                Used file descriptors percentage               1
file_desc.max                  fdm, fileDescriptorMax            No                 Maximum number of file descriptors             1024
cpu                                                              No                 Recent system CPU usage as percent             12
load_1m                        l                                 No                 Most recent load average                       0.22
load_5m                        l                                 No                 Load average for the last five minutes         0.78
load_15m                       l                                 No                 Load average for the last fifteen minutes      1.24
uptime                         u                                 No                 Node uptime                                    17.3m
node.role                      r, role, nodeRole                 Yes                Master eligible node (m); Data node (d); Ingest node (i); Coordinating node only (-)  mdi
master                         m                                 Yes                Elected master (*); Not elected master (-)     *
name                           n                                 Yes                Node name                                      I8hydUG
completion.size                cs, completionSize                No                 Size of completion                             0b
fielddata.memory_size          fm, fielddataMemory               No                 Used fielddata cache memory                    0b
fielddata.evictions            fe, fielddataEvictions            No                 Fielddata cache evictions                      0
query_cache.memory_size        qcm, queryCacheMemory             No                 Used query cache memory                        0b
query_cache.evictions          qce, queryCacheEvictions          No                 Query cache evictions                          0
request_cache.memory_size      rcm, requestCacheMemory           No                 Used request cache memory                      0b
request_cache.evictions       rce, requestCacheEvictions         No                 Request cache evictions                        0
request_cache.hit_count        rchc, requestCacheHitCount        No                 Request cache hit count                        0
request_cache.miss_count       rcmc, requestCacheMissCount       No                 Request cache miss count                       0
flush.total                    ft, flushTotal                    No                 Number of flushes                              1
flush.total_time               ftt, flushTotalTime               No                 Time spent in flush                            1
get.current                    gc, getCurrent                    No                 Number of current get operations               0
get.time                       gti, getTime                      No                 Time spent in get                              14ms
get.total                      gto, getTotal                     No                 Number of get operations                       2
get.exists_time                geti, getExistsTime               No                 Time spent in successful gets                  14ms
get.exists_total               geto, getExistsTotal              No                 Number of successful get operations            2
get.missing_time               gmti, getMissingTime              No                 Time spent in failed gets                      0s
get.missing_total              gmto, getMissingTotal             No                 Number of failed get operations                1
indexing.delete_current        idc, indexingDeleteCurrent        No                 Number of current deletion operations          0
indexing.delete_time           idti, indexingDeleteTime          No                 Time spent in deletions                        2ms
indexing.delete_total          idto, indexingDeleteTotal         No                 Number of deletion operations                  2
indexing.index_current         iic, indexingIndexCurrent         No                 Number of current indexing operations          0
indexing.index_time            iiti, indexingIndexTime           No                 Time spent in indexing                         134ms
indexing.index_total           iito, indexingIndexTotal          No                 Number of indexing operations                  1
indexing.index_failed          iif, indexingIndexFailed          No                 Number of failed indexing operations           0
merges.current                 mc, mergesCurrent                 No                 Number of current merge operations             0
merges.current_docs            mcd, mergesCurrentDocs            No                 Number of current merging documents            0
merges.current_size            mcs, mergesCurrentSize            No                 Size of current merges                         0b
merges.total                   mt, mergesTotal                   No                 Number of completed merge operations           0
merges.total_docs              mtd, mergesTotalDocs              No                 Number of merged documents                     0
merges.total_size              mts, mergesTotalSize              No                 Size of completed merges                       0b
merges.total_time              mtt, mergesTotalTime              No                 Time spent merging documents                   0s
refresh.total                  rto, refreshTotal                 No                 Number of refreshes                            16
refresh.time                   rti, refreshTime                  No                 Time spent in refreshes                        91ms
script.compilations            scrcc, scriptCompilations         No                 Total script compilations                      17
script.cache_evictions         scrce, scriptCacheEvictions       No                 Total compiled scripts evicted from cache      6
search.fetch_current           sfc, searchFetchCurrent           No                 Current fetch phase operations                 0
search.fetch_time              sfti, searchFetchTime             No                 Time spent in fetch phase                      37ms
search.fetch_total             sfto, searchFetchTotal            No                 Number of fetch operations                     7
search.open_contexts           so, searchOpenContexts            No                 Open search contexts                           0
search.query_current           sqc, searchQueryCurrent           No                 Current query phase operations                 0
search.query_time              sqti, searchQueryTime             No                 Time spent in query phase                      43ms
search.query_total             sqto, searchQueryTotal            No                 Number of query operations                     9
search.scroll_current          scc, searchScrollCurrent          No                 Open scroll contexts                           2
search.scroll_time             scti, searchScrollTime            No                 Time scroll contexts held open                 2m
search.scroll_total            scto, searchScrollTotal           No                 Completed scroll contexts                      1
segments.count                 sc, segmentsCount                 No                 Number of segments                             4
segments.memory                sm, segmentsMemory                No                 Memory used by segments                        1.4kb
segments.index_writer_memory   siwm, segmentsIndexWriterMemory   No                 Memory used by index writer                    18mb
segments.version_map_memory    svmm, segmentsVersionMapMemory    No                 Memory used by version map                     1.0kb
segments.fixed_bitset_memory   sfbm, fixedBitsetMemory           No                 Memory used by fixed bit sets for nested object field types and type filters for types referred in join fields  1.0kb
suggest.current                suc, suggestCurrent               No                 Number of current suggest operations           0
suggest.time                   suti, suggestTime                 No                 Time spent in suggest                          0
suggest.total                  suto, suggestTotal                No                 Number of suggest operations                   0

cat pending tasks

pending_tasks provides the same information as the /_cluster/pending_tasks API in a convenient tabular format. For example:

GET /_cat/pending_tasks?v

Might look like:

insertOrder timeInQueue priority source
       1685       855ms HIGH     update-mapping [foo][t]
       1686       843ms HIGH     update-mapping [foo][t]
       1693       753ms HIGH     refresh-mapping [foo][[t]]
       1688       816ms HIGH     update-mapping [foo][t]
       1689       802ms HIGH     update-mapping [foo][t]
       1690       787ms HIGH     update-mapping [foo][t]
       1691       773ms HIGH     update-mapping [foo][t]

cat plugins

The plugins command provides a view per node of running plugins. This information spans nodes.

GET /_cat/plugins?v&s=component&h=name,component,version,description

Might look like:

name    component               version   description
U7321H6 analysis-icu            {version} The ICU Analysis plugin integrates the Lucene ICU module into Elasticsearch, adding ICU-related analysis components.
U7321H6 analysis-kuromoji       {version} The Japanese (kuromoji) Analysis plugin integrates Lucene kuromoji analysis module into elasticsearch.
U7321H6 analysis-nori           {version} The Korean (nori) Analysis plugin integrates Lucene nori analysis module into elasticsearch.
U7321H6 analysis-phonetic       {version} The Phonetic Analysis plugin integrates phonetic token filter analysis with elasticsearch.
U7321H6 analysis-smartcn        {version} Smart Chinese Analysis plugin integrates Lucene Smart Chinese analysis module into elasticsearch.
U7321H6 analysis-stempel        {version} The Stempel (Polish) Analysis plugin integrates Lucene stempel (polish) analysis module into elasticsearch.
U7321H6 analysis-ukrainian      {version} The Ukrainian Analysis plugin integrates the Lucene UkrainianMorfologikAnalyzer into elasticsearch.
U7321H6 discovery-azure-classic {version} The Azure Classic Discovery plugin allows to use Azure Classic API for the unicast discovery mechanism
U7321H6 discovery-ec2           {version} The EC2 discovery plugin allows to use AWS API for the unicast discovery mechanism.
U7321H6 discovery-file          {version} Discovery file plugin enables unicast discovery from hosts stored in a file.
U7321H6 discovery-gce           {version} The Google Compute Engine (GCE) Discovery plugin allows to use GCE API for the unicast discovery mechanism.
U7321H6 ingest-attachment       {version} Ingest processor that uses Apache Tika to extract contents
U7321H6 mapper-annotated-text   {version} The Mapper Annotated_text plugin adds support for text fields with markup used to inject annotation tokens into the index.
U7321H6 mapper-murmur3          {version} The Mapper Murmur3 plugin allows to compute hashes of a field's values at index-time and to store them in the index.
U7321H6 mapper-size             {version} The Mapper Size plugin allows document to record their uncompressed size at index time.
U7321H6 store-smb               {version} The Store SMB plugin adds support for SMB stores.

We can tell quickly how many plugins per node we have and which versions.

cat recovery

The recovery command is a view of index shard recoveries, both ongoing and previously completed. It is a more compact view of the JSON recovery API.

A recovery event occurs anytime an index shard moves to a different node in the cluster. This can happen during a snapshot recovery, a change in replication level, node failure, or on node startup. This last type is called a local store recovery and is the normal way for shards to be loaded from disk when a node starts up.

As an example, here is what the recovery state of a cluster may look like when there are no shards in transit from one node to another:

GET _cat/recovery?v

The response of this request will be something like:

index   shard time type  stage source_host source_node target_host target_node repository snapshot files files_recovered files_percent files_total bytes bytes_recovered bytes_percent bytes_total translog_ops translog_ops_recovered translog_ops_percent
twitter 0     13ms store done  n/a         n/a         127.0.0.1   node-0      n/a        n/a      0     0               100%          13          0     0               100%          9928        0            0                      100.0%

In the above case, the source and target nodes are the same because the recovery type was store, i.e. they were read from local storage on node start.

Now let’s see what a live shard recovery looks like. We can observe one by increasing the replica count of our index and bringing another node online to host the replicas.
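
The replica count can be raised with a routine index settings update; for example, assuming twitter currently has no replicas:

PUT /twitter/_settings
{
  "index": { "number_of_replicas": 1 }
}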

GET _cat/recovery?v&h=i,s,t,ty,st,shost,thost,f,fp,b,bp

This will return a line like:

i       s t      ty   st    shost       thost       f     fp      b bp
twitter 0 1252ms peer done  192.168.1.1 192.168.1.2 0     100.0%  0 100.0%

We can see in the above listing that our twitter shard was recovered from another node. Notice that the recovery type is shown as peer. The files and bytes copied are real-time measurements.

Finally, let’s see what a snapshot recovery looks like. Assuming I have previously made a backup of my index, I can restore it using the snapshot and restore API.
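
Using the repository and snapshot names that appear in the sample output below, such a restore might be kicked off as follows (assuming the twitter index has first been deleted or closed):

POST /_snapshot/twitter/snap_1/_restore
{
  "indices": "twitter"
}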

GET _cat/recovery?v&h=i,s,t,ty,st,rep,snap,f,fp,b,bp

This will show a recovery of type snapshot in the response

i       s t      ty       st    rep     snap   f  fp   b     bp
twitter 0 1978ms snapshot done  twitter snap_1 79 8.0% 12086 9.0%

cat repositories

The repositories command shows the snapshot repositories registered in the cluster. For example:

GET /_cat/repositories?v

might look like:

id    type
repo1   fs
repo2   s3

We can quickly see which repositories are registered and their type.
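
For reference, a shared file system repository like repo1 might have been registered as follows; the location path is illustrative:

PUT /_snapshot/repo1
{
  "type": "fs",
  "settings": { "location": "/mount/backups/repo1" }
}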

cat thread pool

The thread_pool command shows cluster-wide thread pool statistics per node. By default the active, queue, and rejected statistics are returned for all thread pools.

GET /_cat/thread_pool

Which looks like:

node-0 analyze             0 0 0
...
node-0 fetch_shard_started 0 0 0
node-0 fetch_shard_store   0 0 0
node-0 flush               0 0 0
...
node-0 write               0 0 0

The first column is the node name

node_name
node-0

The second column is the thread pool name

name
analyze
ccr (default distro only)
fetch_shard_started
fetch_shard_store
flush
force_merge
generic
get
index
listener
management
ml_autodetect (default distro only)
ml_datafeed (default distro only)
ml_utility (default distro only)
refresh
rollup_indexing (default distro only)
search
security-token-key (default distro only)
snapshot
warmer
watcher (default distro only)
write

The next three columns show the active, queue, and rejected statistics for each thread pool

active queue rejected
     0     0        0
     0     0        0
     0     0        0
     0     0        0
     0     0        0
     0     0        0
     0     0        0
     0     0        0
     0     0        0
     0     0        0
     1     0        0
     0     0        0
     0     0        0
     0     0        0
     0     0        0

The cat thread pool API accepts a thread_pool_patterns URL parameter for specifying a comma-separated list of regular expressions to match thread pool names.

GET /_cat/thread_pool/generic?v&h=id,name,active,rejected,completed

which looks like:

id                     name    active rejected completed
0EWUhXeBQtaVGlexUeVwMg generic      0        0        70

Here the node ID and thread pool name are displayed, along with the active, rejected and completed statistics for the generic thread pool.

All built-in thread pools and custom thread pools are available.

Thread Pool Fields

For each thread pool, you can load details about it by using the field names in the table below.

Field Name   Alias  Description
type         t      The current (*) type of thread pool (fixed or scaling)
active       a      The number of active threads in the current thread pool
size         s      The number of threads in the current thread pool
queue        q      The number of tasks in the queue for the current thread pool
queue_size   qs     The maximum number of tasks permitted in the queue for the current thread pool
rejected     r      The number of tasks rejected by the thread pool executor
largest      l      The highest number of active threads in the current thread pool
completed    c      The number of tasks completed by the thread pool executor
min          mi     The configured minimum number of active threads allowed in the current thread pool
max          ma     The configured maximum number of active threads allowed in the current thread pool
keep_alive   k      The configured keep alive time for threads

Other Fields

In addition to details about each thread pool, it is also convenient to get an understanding of where those thread pools reside. As such, you can request other details like the ip of the responding node(s).

Field Name     Alias  Description
node_id        id     The unique node ID
ephemeral_id   eid    The ephemeral node ID
pid            p      The process ID of the running node
host           h      The hostname for the current node
ip             i      The IP address for the current node
port           po     The bound transport port for the current node
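
Drawing on both tables, for example, thread pool fields and node fields can be combined in a single header list to see where each search pool lives and how it is configured:

GET /_cat/thread_pool/search?v&h=host,ip,name,type,size,queue_size,largest,completed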

cat shards

The shards command is the detailed view of what nodes contain which shards. It will tell you if it’s a primary or replica, the number of docs, the bytes it takes on disk, and the node where it’s located.

Here we see a single index, with one primary shard and no replicas:

GET _cat/shards

This will return

twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA

Index pattern

If you have many shards, you may wish to limit which indices show up in the output. You can always do this with grep, but you can save some bandwidth by supplying an index pattern at the end of the URL.

GET _cat/shards/twitt*

Which will return the following

twitter 0 p STARTED 3014 31.1mb 192.168.56.10 H5dfFeA

Relocation

Let’s say you’ve checked your health and you see relocating shards. Where are they from and where are they going?

GET _cat/shards

A relocating shard will be shown as follows

twitter 0 p RELOCATING 3014 31.1mb 192.168.56.10 H5dfFeA -> 192.168.56.30 bGG90GE

Shard states

Before a shard can be used, it goes through an INITIALIZING state. The shards command can show you which ones.

GET _cat/shards

You can get the initializing state in the response like this

twitter 0 p STARTED      3014 31.1mb 192.168.56.10 H5dfFeA
twitter 0 r INITIALIZING    0 14.3mb 192.168.56.30 bGG90GE

If a shard cannot be assigned, for example because you’ve allocated more replicas than there are nodes in the cluster, the shard will remain UNASSIGNED with the reason code ALLOCATION_FAILED.

You can use the shards API to find out that reason.

GET _cat/shards?h=index,shard,prirep,state,unassigned.reason

The reason for an unassigned shard will be listed as the last field

twitter 0 p STARTED    3014 31.1mb 192.168.56.10 H5dfFeA
twitter 0 r STARTED    3014 31.1mb 192.168.56.30 bGG90GE
twitter 0 r STARTED    3014 31.1mb 192.168.56.20 I8hydUG
twitter 0 r UNASSIGNED ALLOCATION_FAILED

Reasons for unassigned shard

These are the possible reasons for a shard to be in an unassigned state:

INDEX_CREATED

Unassigned as a result of an API creation of an index.

CLUSTER_RECOVERED

Unassigned as a result of a full cluster recovery.

INDEX_REOPENED

Unassigned as a result of opening a closed index.

DANGLING_INDEX_IMPORTED

Unassigned as a result of importing a dangling index.

NEW_INDEX_RESTORED

Unassigned as a result of restoring into a new index.

EXISTING_INDEX_RESTORED

Unassigned as a result of restoring into a closed index.

REPLICA_ADDED

Unassigned as a result of explicit addition of a replica.

ALLOCATION_FAILED

Unassigned as a result of a failed allocation of the shard.

NODE_LEFT

Unassigned as a result of the node hosting it leaving the cluster.

REROUTE_CANCELLED

Unassigned as a result of an explicit cancel reroute command.

REINITIALIZED

When a shard moves from started back to initializing.

REALLOCATED_REPLICA

A better replica location is identified and causes the existing replica allocation to be cancelled.

cat segments

The segments command provides low-level information about the segments in the shards of an index. It provides information similar to the _segments endpoint. For example:

GET /_cat/segments?v

might look like:

index shard prirep ip        segment generation docs.count docs.deleted size size.memory committed searchable version compound
test  3     p      127.0.0.1 _0               0          1            0  3kb        2042 false     true       {lucene_version}   true
test1 3     p      127.0.0.1 _0               0          1            0  3kb        2042 false     true       {lucene_version}   true

The output shows information about index names and shard numbers in the first two columns.

If you only want to get information about segments in one particular index, you can add the index name in the URL, for example /_cat/segments/test. Several indices can also be queried together, like /_cat/segments/test,test1.
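
As a request, the multi-index form looks like:

GET /_cat/segments/test,test1?v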

The following columns provide additional monitoring information:

prirep

Whether this segment belongs to a primary or replica shard.

ip

The ip address of the segment’s shard.

segment

A segment name, derived from the segment generation. The name is internally used to generate the file names in the directory of the shard this segment belongs to.

generation

The generation number is incremented with each segment that is written. The name of the segment is derived from this generation number.

docs.count

The number of non-deleted documents that are stored in this segment. Note that these are Lucene documents, so the count will include hidden documents (e.g. from nested types).

docs.deleted

The number of deleted documents that are stored in this segment. It is perfectly fine if this number is greater than 0; space is going to be reclaimed when this segment gets merged.

size

The amount of disk space that this segment uses.

size.memory

Segments store some data into memory in order to be searchable efficiently. This column shows the number of bytes in memory that are used.

committed

Whether the segment has been synced to disk. Segments that are committed would survive a hard reboot. There is no need to worry if this is false, as the data from uncommitted segments is also stored in the transaction log so that Elasticsearch is able to replay changes on the next start.

searchable

True if the segment is searchable. A value of false would most likely mean that the segment has been written to disk but no refresh occurred since then to make it searchable.

version

The version of Lucene that has been used to write this segment.

compound

Whether the segment is stored in a compound file. When true, this means that Lucene merged all files from the segment into a single one in order to save file descriptors.

cat snapshots

The snapshots command shows all snapshots that belong to a specific repository. To find a list of available repositories to query, the command /_cat/repositories can be used. Querying the snapshots of a repository named repo1 then looks as follows.

GET /_cat/snapshots/repo1?v&s=id

Which looks like:

id     status start_epoch start_time end_epoch  end_time duration indices successful_shards failed_shards total_shards
snap1  FAILED 1445616705  18:11:45   1445616978 18:16:18     4.6m       1                 4             1            5
snap2 SUCCESS 1445634298  23:04:58   1445634672 23:11:12     6.2m       2                10             0           10

Each snapshot contains information about when it was started and stopped. Start and stop timestamps are available in two formats. The HH:MM:SS output is simply for quick human consumption. The epoch time retains more information, including date, and is machine sortable if the snapshot process spans days.

cat templates

The templates command provides information about existing templates.

GET /_cat/templates?v&s=name

which looks like

name      index_patterns order version
template0 [te*]          0
template1 [tea*]         1
template2 [teak*]        2     7

The output shows that there are three existing templates, with template2 having a version value.

The endpoint also supports giving a template name or pattern in the url to filter the results, for example /_cat/templates/template* or /_cat/templates/template0.
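
For example, to show only the templates whose names begin with template:

GET /_cat/templates/template*?v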