Discovery Plugins
Discovery plugins extend Elasticsearch by adding new discovery mechanisms that can be used instead of {ref}/modules-discovery-zen.html[Zen Discovery].
Core discovery plugins
The core discovery plugins are:
EC2 discovery::
The EC2 discovery plugin uses the AWS API for unicast discovery.

Azure Classic discovery::
The Azure Classic discovery plugin uses the Azure Classic API for unicast discovery.

GCE discovery::
The Google Compute Engine discovery plugin uses the GCE API for unicast discovery.

File-based discovery::
The File-based discovery plugin allows providing the unicast hosts list through a dynamically updatable file.
Community contributed discovery plugins
A number of discovery plugins have been contributed by our community:
- eskka Discovery Plugin (by Shikhar Bhushan)
- Kubernetes Discovery Plugin (by Jimmi Dyson, fabric8)
EC2 Discovery Plugin
The EC2 discovery plugin uses the AWS API for unicast discovery.
If you are looking for a hosted solution of Elasticsearch on AWS, please visit http://www.elastic.co/cloud.
Installation
This plugin can be installed using the plugin manager:
sudo bin/elasticsearch-plugin install discovery-ec2
The plugin must be installed on every node in the cluster, and each node must be restarted after installation.
This plugin can be downloaded for offline install from {plugin_url}/discovery-ec2/discovery-ec2-{version}.zip.
Removal
The plugin can be removed with the following command:
sudo bin/elasticsearch-plugin remove discovery-ec2
The node must be stopped before removing the plugin.
Getting started with AWS
The plugin provides a hosts provider for Zen discovery named ec2. This hosts provider finds other Elasticsearch instances in EC2 through AWS metadata. Authentication is done using IAM Role credentials by default. To enable the plugin, set the unicast host provider for Zen discovery to ec2:
discovery.zen.hosts_provider: ec2
Settings
EC2 host discovery supports a number of settings. Some settings are sensitive and must be stored in the {ref}/secure-settings.html[elasticsearch keystore]. For example, to use explicit AWS access keys:
bin/elasticsearch-keystore add discovery.ec2.access_key
bin/elasticsearch-keystore add discovery.ec2.secret_key
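To double-check what has been stored, you can list the keystore entries (this prints setting names only, not values):

bin/elasticsearch-keystore list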
The following are the available discovery settings. All settings take the discovery.ec2. prefix. Those that must be stored in the keystore are marked as Secure.
access_key::
An EC2 access key. The secret_key setting must also be specified. (Secure)

secret_key::
An EC2 secret key. The access_key setting must also be specified. (Secure)

session_token::
An EC2 session token. The access_key and secret_key settings must also be specified. (Secure)

endpoint::
The EC2 service endpoint to connect to. See http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region. This defaults to ec2.us-east-1.amazonaws.com.

protocol::
The protocol to use to connect to EC2. Valid values are either http or https. Defaults to https.

proxy.host::
The host name of a proxy to connect to EC2 through.

proxy.port::
The port of a proxy to connect to EC2 through.

proxy.username::
The username to connect to the proxy.host with. (Secure)

proxy.password::
The password to connect to the proxy.host with. (Secure)

read_timeout::
The socket timeout for connecting to EC2. The value should specify the unit. For example, a value of 5s specifies a 5-second timeout. The default value is 50 seconds.

groups::
Either a comma-separated list or an array-based list of (security) groups. Only instances with the provided security groups will be used in the cluster discovery. (Note: you can provide either the group NAME or the group ID.)

host_type::
The type of host to use to communicate with other instances. Can be one of private_ip, public_ip, private_dns, public_dns or tag:TAGNAME, where TAGNAME refers to the name of a tag configured for all EC2 instances. Instances which don't have this tag set will be ignored by the discovery process. For example, if you defined a tag my-elasticsearch-host in EC2 and set it to myhostname1.mydomain.com, then setting host_type: tag:my-elasticsearch-host will tell the EC2 discovery plugin to read the host name from the my-elasticsearch-host tag. In this case, it will be resolved to myhostname1.mydomain.com. Read more about EC2 Tags. Defaults to private_ip.

availability_zones::
Either a comma-separated list or an array-based list of availability zones. Only instances within the provided availability zones will be used in the cluster discovery.

any_group::
If set to false, all security groups must be present for the instance to be used for the discovery. Defaults to true.

node_cache_time::
How long the list of hosts is cached to prevent further requests to the AWS API. Defaults to 10s.
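For illustration, here is a sketch of an elasticsearch.yml that combines several of these settings; the endpoint, security group, and availability zone values are placeholders for your own:

# Enable the EC2 hosts provider and narrow discovery to one region,
# one security group and two availability zones
discovery.zen.hosts_provider: ec2
discovery.ec2.endpoint: ec2.eu-west-1.amazonaws.com
discovery.ec2.groups: my-security-group
discovery.ec2.host_type: private_ip
discovery.ec2.availability_zones: eu-west-1a,eu-west-1b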
All secure settings of this plugin are {ref}/secure-settings.html#reloadable-secure-settings[reloadable]. After you reload the settings, an AWS SDK client with the latest settings from the keystore will be used.
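For example, after rotating the AWS credentials in the keystore, you can apply them without a restart through the nodes reload secure settings API:

POST _nodes/reload_secure_settings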
Important
Binding the network host
It's important to define network.host, as by default it is bound to localhost. You can use {ref}/modules-network.html[core network host settings] or EC2 specific host settings:
EC2 Network Host
When the discovery-ec2 plugin is installed, the following are also allowed as valid network host settings:
EC2 Host Value | Description
---|---
_ec2:privateIpv4_ | The private IP address (ipv4) of the machine.
_ec2:privateDns_ | The private host of the machine.
_ec2:publicIpv4_ | The public IP address (ipv4) of the machine.
_ec2:publicDns_ | The public host of the machine.
_ec2:privateIp_ | Equivalent to _ec2:privateIpv4_.
_ec2:publicIp_ | Equivalent to _ec2:publicIpv4_.
_ec2_ | Equivalent to _ec2:privateIpv4_.
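For example, a minimal sketch binding the node to the private IPv4 address of the instance:

network.host: _ec2:privateIpv4_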
Recommended EC2 Permissions
EC2 discovery requires making a call to the EC2 service. You'll want to set up an IAM policy to allow this. You can create a custom policy via the IAM Management Console. It should look similar to this:
{
"Statement": [
{
"Action": [
"ec2:DescribeInstances"
],
"Effect": "Allow",
"Resource": [
"*"
]
}
],
"Version": "2012-10-17"
}
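As a sketch, the same policy can also be created with the AWS CLI; the policy name and file name below are hypothetical:

# Save the JSON above as es-discovery-policy.json, then:
aws iam create-policy \
    --policy-name es-discovery \
    --policy-document file://es-discovery-policy.json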
Filtering by Tags
The EC2 discovery can also filter machines to include in the cluster based on tags (and not just groups). The settings to use include the discovery.ec2.tag. prefix. For example, if you defined a tag stage in EC2 and set it to dev, setting discovery.ec2.tag.stage to dev will only filter instances with a tag key set to stage, and a value of dev. Adding multiple discovery.ec2.tag settings will require all of those tags to be set for the instance to be included.
One practical use for tag filtering is when an EC2 cluster contains many nodes that are not running Elasticsearch. In this case (particularly with high discovery.zen.ping_timeout values) there is a risk that a new node's discovery phase will end before it has found the cluster (which will result in it declaring itself master of a new cluster with the same name - highly undesirable). Tagging Elasticsearch EC2 nodes and then filtering by that tag will resolve this issue.
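For example, a minimal elasticsearch.yml sketch that requires both a stage tag set to dev and a hypothetical second tag, role, set to search:

discovery.zen.hosts_provider: ec2
discovery.ec2.tag.stage: dev
discovery.ec2.tag.role: search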
Automatic Node Attributes
Though not dependent on actually using ec2 as discovery (it still requires the discovery-ec2 plugin to be installed), the plugin can automatically add node attributes relating to EC2. In the future this may support other attributes, but it currently only adds an aws_availability_zone node attribute, which is the availability zone of the current node. Attributes can be used to isolate primary and replica shards across availability zones by using the {ref}/allocation-awareness.html[Allocation Awareness] feature.

In order to enable it, set cloud.node.auto_attributes to true in the settings. For example:
cloud.node.auto_attributes: true
cluster.routing.allocation.awareness.attributes: aws_availability_zone
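To check that the attribute was added after the node restarts, you can query the cat nodeattrs API (a quick verification; the output depends on your zones and nodes):

GET _cat/nodeattrs?v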
Best Practices in AWS
Collection of best practices and other information around running Elasticsearch on AWS.
Instance/Disk
When selecting a disk, please be aware of the following order of preference:

- EFS - Avoid, as the sacrifices made to offer durability, shared storage, and grow/shrink capability come at a performance cost; such file systems have been known to cause corruption of indices, and because Elasticsearch is distributed and has built-in replication, the benefits that EFS offers are not needed.
- EBS - Works well if running a small cluster (1-2 nodes) that cannot easily tolerate the loss of all storage backing a node, or if running indices with no replicas. If EBS is used, leverage provisioned IOPS to ensure performance.
- Instance Store - When running clusters of larger size and with replicas, the ephemeral nature of Instance Store is ideal since Elasticsearch can tolerate the loss of shards. With Instance Store you get the performance benefit of having disks physically attached to the host running the instance, and also the cost benefit of avoiding paying extra for EBS.
Prefer Amazon Linux AMIs: since Elasticsearch runs on the JVM, OS dependencies are very minimal, and you can benefit from the lightweight nature, support, and EC2-specific performance tweaks that the Amazon Linux AMIs offer.
Networking
- Network throttling takes place on smaller instance types, in the form of both bandwidth and number of connections. Therefore, if a large number of connections is needed and networking is becoming a bottleneck, avoid instance types with networking labeled as Moderate or Low.
- Multicast is not supported, even within a VPC; the AWS cloud plugin instead joins nodes by performing a security group lookup.
- When running in multiple availability zones, be sure to leverage {ref}/allocation-awareness.html[shard allocation awareness] so that not all copies of shard data reside in the same availability zone.
- Do not span a cluster across regions. If necessary, use cross-cluster search.
Misc
- If you have split your nodes into roles, consider tagging the EC2 instances by role to make it easier to filter and view your EC2 instances in the AWS console.
- Consider enabling termination protection for all of your instances to avoid accidentally terminating a node in the cluster and causing a potentially disruptive reallocation.
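As an illustration, termination protection can be enabled from the AWS CLI; the instance ID below is hypothetical:

# Protect a node from accidental termination
aws ec2 modify-instance-attribute \
    --instance-id i-1234567890abcdef0 \
    --disable-api-termination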
Azure Classic Discovery Plugin
The Azure Classic Discovery plugin uses the Azure Classic API for unicast discovery.
deprecated[5.0.0, Use coming Azure ARM Discovery plugin instead]
Installation
This plugin can be installed using the plugin manager:
sudo bin/elasticsearch-plugin install discovery-azure-classic
The plugin must be installed on every node in the cluster, and each node must be restarted after installation.
This plugin can be downloaded for offline install from {plugin_url}/discovery-azure-classic/discovery-azure-classic-{version}.zip.
Removal
The plugin can be removed with the following command:
sudo bin/elasticsearch-plugin remove discovery-azure-classic
The node must be stopped before removing the plugin.
Azure Virtual Machine Discovery
Azure VM discovery allows you to use the Azure APIs to perform automatic discovery (similar to multicast in non-hostile multicast environments). Here is a simple sample configuration:
cloud:
azure:
management:
subscription.id: XXX-XXX-XXX-XXX
cloud.service.name: es-demo-app
keystore:
path: /path/to/azurekeystore.pkcs12
password: WHATEVER
type: pkcs12
discovery:
zen.hosts_provider: azure
Important
Binding the network host
The keystore file must be placed in a directory accessible by Elasticsearch, such as the config directory. It's important to define network.host, as by default it is bound to localhost. You can use {ref}/modules-network.html[core network host settings]. For example _en0_.
How to start (short story)
- Create Azure instances
- Install Elasticsearch
- Install the Azure plugin
- Modify the elasticsearch.yml file
- Start Elasticsearch
Azure credential API settings
The following is a list of settings that can further control the credential API:

Setting | Value
---|---
cloud.azure.management.keystore.path | /path/to/keystore
cloud.azure.management.keystore.type | pkcs12, jceks or jks. Defaults to pkcs12.
cloud.azure.management.keystore.password | your_password for the keystore
cloud.azure.management.subscription.id | your_azure_subscription_id
cloud.azure.management.cloud.service.name | your_azure_cloud_service_name. This is the cloud service name/DNS but without the cloudapp.net part.
Advanced settings
The following is a list of settings that can further control the discovery:

discovery.azure.host.type::
Either public_ip or private_ip (default). Azure discovery will use the one you set to ping other nodes.

discovery.azure.endpoint.name::
When using public_ip, this setting is used to identify the endpoint name used to forward requests to Elasticsearch (aka the transport port name). Defaults to elasticsearch. In the Azure management console, you could define an endpoint elasticsearch forwarding, for example, requests on the public IP on port 8100 to the virtual machine on port 9300.

discovery.azure.deployment.name::
Deployment name, if any. Defaults to the value set with cloud.azure.management.cloud.service.name.

discovery.azure.deployment.slot::
Either staging or production (default).
For example:
discovery:
type: azure
azure:
host:
type: private_ip
endpoint:
name: elasticsearch
deployment:
name: your_azure_cloud_service_name
slot: production
Setup process for Azure Discovery
We will describe here one strategy, which is to hide our Elasticsearch cluster from the outside.
With this strategy, only VMs behind the same virtual port can talk to each other. That means that with this mode, you can use Elasticsearch unicast discovery to build a cluster, using the Azure API to retrieve information about your nodes.
Prerequisites
Before starting, you need to have:
- OpenSSL that isn't from MacPorts; specifically, OpenSSL 1.0.1f 6 Jan 2014 doesn't seem to create a valid keypair for ssh. FWIW, OpenSSL 1.0.1c 10 May 2012 on Ubuntu 14.04 LTS is known to work.
- SSH keys and certificate.

You should follow this guide to learn how to create or use existing SSH keys. If you have already done it, you can skip the following.

Here is a description of how to generate SSH keys using openssl:

# You may want to use another dir than /tmp
cd /tmp
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout azure-private.key -out azure-certificate.pem
chmod 600 azure-private.key azure-certificate.pem
openssl x509 -outform der -in azure-certificate.pem -out azure-certificate.cer
Generate a keystore which will be used by the plugin to authenticate all Azure API calls with a certificate.
# Generate a keystore (azurekeystore.pkcs12)
# Transform private key to PEM format
openssl pkcs8 -topk8 -nocrypt -in azure-private.key -inform PEM -out azure-pk.pem -outform PEM
# Transform certificate to PEM format
openssl x509 -inform der -in azure-certificate.cer -out azure-cert.pem
cat azure-cert.pem azure-pk.pem > azure.pem.txt
# You MUST enter a password!
openssl pkcs12 -export -in azure.pem.txt -out azurekeystore.pkcs12 -name azure -noiter -nomaciter
Upload the azure-certificate.cer file both in the Elasticsearch Cloud Service (under Manage Certificates), and under Settings → Manage Certificates.

Important: When prompted for a password, you need to enter a non-empty one. See this guide for more details about how to create keys for Azure.
Once done, you need to upload your certificate in Azure:

- Go to the management console.
- Sign in using your account.
- Click on Portal.
- Go to Settings (bottom of the left list).
- On the bottom bar, click on Upload and upload your azure-certificate.cer file.
You may want to use the Windows Azure Command-Line Tool:

- Install NodeJS, for example using homebrew on MacOS X:

brew install node

- Install the Azure tools:

sudo npm install azure-cli -g

- Download and import your azure settings:

# This will open a browser and will download a .publishsettings file
azure account download
# Import this file (we have downloaded it to /tmp)
# Note, it will create needed files in ~/.azure. You can remove azure.publishsettings when done.
azure account import /tmp/azure.publishsettings
Creating your first instance
You need to have a storage account available. Check the Azure Blob Storage documentation for more information.

You will need to choose the operating system you want to run on. To get a list of the officially available images, run:
azure vm image list
Let’s say we are going to deploy an Ubuntu image on an extra small instance in West Europe:
Parameter | Value
---|---
Azure cluster name | azure-elasticsearch-cluster
Image | b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-13_10-amd64-server-20130808-alpha3-en-us-30GB
VM Name | myesnode1
VM Size | extrasmall
Location | West Europe
Login | elasticsearch
Password | password1234!!
Using command line:
azure vm create azure-elasticsearch-cluster \
b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-13_10-amd64-server-20130808-alpha3-en-us-30GB \
--vm-name myesnode1 \
--location "West Europe" \
--vm-size extrasmall \
--ssh 22 \
--ssh-cert /tmp/azure-certificate.pem \
elasticsearch password1234\!\!
You should see something like:
info: Executing command vm create
+ Looking up image
+ Looking up cloud service
+ Creating cloud service
+ Retrieving storage accounts
+ Configuring certificate
+ Creating VM
info: vm create command OK
Now, your first instance is started.
Tip
Working with SSH
You need to give the private key and username each time you log on to your instance. But you can also define them once in your ~/.ssh/config file.
Next, you need to install Elasticsearch on your new instance. First, copy your keystore to the instance, then connect to the instance using SSH:
scp /tmp/azurekeystore.pkcs12 azure-elasticsearch-cluster.cloudapp.net:/home/elasticsearch
ssh azure-elasticsearch-cluster.cloudapp.net
Once connected, install Elasticsearch:
# Install Latest Java version
# Read http://www.webupd8.org/2012/09/install-oracle-java-8-in-ubuntu-via-ppa.html for details
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
# If you want to install OpenJDK instead
# sudo apt-get update
# sudo apt-get install openjdk-8-jre-headless
# Download Elasticsearch
curl -s https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-{version}.deb -o elasticsearch-{version}.deb
# Prepare Elasticsearch installation
sudo dpkg -i elasticsearch-{version}.deb
Check that Elasticsearch is running:
GET /
This command should give you a JSON result:
{
"name" : "Cp8oag6",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "AT69_T_DTp-1qgIJlatQqA",
"version" : {
"number" : "{version}",
"build_flavor" : "{build_flavor}",
"build_type" : "zip",
"build_hash" : "f27399d",
"build_date" : "2016-03-30T09:51:41.449Z",
"build_snapshot" : false,
"lucene_version" : "{lucene_version}",
"minimum_wire_compatibility_version" : "1.2.3",
"minimum_index_compatibility_version" : "1.2.3"
},
"tagline" : "You Know, for Search"
}
Install Elasticsearch cloud azure plugin
# Stop Elasticsearch
sudo service elasticsearch stop
# Install the plugin
sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install discovery-azure-classic
# Configure it
sudo vi /etc/elasticsearch/elasticsearch.yml
And add the following lines:
# If you don't remember your account id, you may get it with `azure account list`
cloud:
azure:
management:
subscription.id: your_azure_subscription_id
cloud.service.name: your_azure_cloud_service_name
keystore:
path: /home/elasticsearch/azurekeystore.pkcs12
password: your_password_for_keystore
discovery:
type: azure
# Recommended (warning: non durable disk)
# path.data: /mnt/resource/elasticsearch/data
Restart Elasticsearch:
sudo service elasticsearch start
If anything goes wrong, check your logs in /var/log/elasticsearch.
Scaling Out!
First, you need to create an image of your previous machine. Disconnect from your machine and run the following commands locally:
# Shutdown the instance
azure vm shutdown myesnode1
# Create an image from this instance (it could take some minutes)
azure vm capture myesnode1 esnode-image --delete
# Note that the previous instance has been deleted (mandatory)
# So you need to create it again and BTW create other instances.
azure vm create azure-elasticsearch-cluster \
esnode-image \
--vm-name myesnode1 \
--location "West Europe" \
--vm-size extrasmall \
--ssh 22 \
--ssh-cert /tmp/azure-certificate.pem \
elasticsearch password1234\!\!
Tip
It can happen that Azure changes the endpoint's public IP address. DNS propagation can take some minutes before you can connect again using the name. If needed, you can get the IP address from Azure with the azure vm show command.
Let’s start more instances!
for x in $(seq 2 10)
do
echo "Launching azure instance #$x..."
azure vm create azure-elasticsearch-cluster \
esnode-image \
--vm-name myesnode$x \
--vm-size extrasmall \
--ssh $((21 + $x)) \
--ssh-cert /tmp/azure-certificate.pem \
--connect \
elasticsearch password1234\!\!
done
If you want to remove your running instances:
azure vm delete myesnode1
GCE Discovery Plugin
The Google Compute Engine Discovery plugin uses the GCE API for unicast discovery.
Installation
This plugin can be installed using the plugin manager:
sudo bin/elasticsearch-plugin install discovery-gce
The plugin must be installed on every node in the cluster, and each node must be restarted after installation.
This plugin can be downloaded for offline install from {plugin_url}/discovery-gce/discovery-gce-{version}.zip.
Removal
The plugin can be removed with the following command:
sudo bin/elasticsearch-plugin remove discovery-gce
The node must be stopped before removing the plugin.
GCE Virtual Machine Discovery
Google Compute Engine VM discovery allows you to use the Google APIs to perform automatic discovery (similar to multicast in non-hostile multicast environments). Here is a simple sample configuration:
cloud:
gce:
project_id: <your-google-project-id>
zone: <your-zone>
discovery:
zen.hosts_provider: gce
The following gce settings (prefixed with cloud.gce) are supported:
project_id::
Your Google project id. By default the project id will be derived from the instance metadata. Note: deriving the project id from system properties or environment variables (`GOOGLE_CLOUD_PROJECT` or `GCLOUD_PROJECT`) is not supported.

zone::
Helps to retrieve instances running in a given zone. It should be one of the GCE supported zones. By default the zone will be derived from the instance metadata. See also Using GCE zones.

retry::
If set to true, the client will use the ExponentialBackOff policy to retry failed HTTP requests. Defaults to true.

max_wait::
The maximum elapsed time since the client started to retry. If the elapsed time goes past max_wait, the client stops retrying. A negative value means that it will wait indefinitely. Defaults to 0s (retry indefinitely).

refresh_interval::
How long the list of hosts is cached to prevent further requests to the GCE API. 0s disables caching. A negative value will cause infinite caching. Defaults to 0s.
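For illustration, a sketch of an elasticsearch.yml combining several of these settings; the project, zone, and timing values are placeholders:

cloud:
  gce:
    project_id: es-cloud
    zone: europe-west1-a
    retry: true
    max_wait: 60s
    refresh_interval: 10s
discovery:
  zen.hosts_provider: gce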
Important
Binding the network host
It's important to define network.host, as by default it is bound to localhost. You can use {ref}/modules-network.html[core network host settings] or GCE specific host settings:
GCE Network Host
When the discovery-gce plugin is installed, the following are also allowed as valid network host settings:
GCE Host Value | Description
---|---
_gce:privateIp:X_ | The private IP address of the machine for a given network interface.
_gce:hostname_ | The hostname of the machine.
_gce_ | Same as _gce:privateIp:0_ (recommended).
Examples:
# get the IP address from network interface 1
network.host: _gce:privateIp:1_
# Using GCE internal hostname
network.host: _gce:hostname_
# shortcut for _gce:privateIp:0_ (recommended)
network.host: _gce_
How to start (short story)
- Create a Google Compute Engine instance (with compute rw permissions)
- Install Elasticsearch
- Install the Google Compute Engine Cloud plugin
- Modify the elasticsearch.yml file
- Start Elasticsearch
Setting up GCE Discovery
Prerequisites
Before starting, you need:
- Your project ID, e.g. es-cloud. Get it from the Google API Console.
- To install the Google Cloud SDK.

If you did not set it yet, you can define the default project you will work on:

gcloud config set project es-cloud
Login to Google Cloud
If you haven't already, log in to Google Cloud:
gcloud auth login
This will open your browser. You will be asked to sign-in to a Google account and authorize access to the Google Cloud SDK.
Creating your first instance
gcloud compute instances create myesnode1 \
--zone <your-zone> \
--scopes compute-rw
When done, a report like this one should appear:
Created [https://www.googleapis.com/compute/v1/projects/es-cloud-1070/zones/us-central1-f/instances/myesnode1].
NAME ZONE MACHINE_TYPE PREEMPTIBLE INTERNAL_IP EXTERNAL_IP STATUS
myesnode1 us-central1-f n1-standard-1 10.240.133.54 104.197.94.25 RUNNING
You can now connect to your instance:
# Connect using google cloud SDK
gcloud compute ssh myesnode1 --zone europe-west1-a
# Or using SSH with external IP address
ssh -i ~/.ssh/google_compute_engine 192.158.29.199
Important
Service Account Permissions
It's important when creating an instance that the correct permissions are set. At a minimum, you must ensure you have:

- compute-rw

Failing to set this will result in unauthorized messages when starting Elasticsearch. See Machine Permissions.
Once connected, install Elasticsearch:
sudo apt-get update
# Download Elasticsearch
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-2.0.0.deb
# Prepare Java installation (Oracle)
sudo echo "deb http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" | sudo tee /etc/apt/sources.list.d/webupd8team-java.list
sudo echo "deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu trusty main" | sudo tee -a /etc/apt/sources.list.d/webupd8team-java.list
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys EEA14886
sudo apt-get update
sudo apt-get install oracle-java8-installer
# Prepare Java installation (or OpenJDK)
# sudo apt-get install java8-runtime-headless
# Prepare Elasticsearch installation
sudo dpkg -i elasticsearch-2.0.0.deb
Install Elasticsearch discovery gce plugin
Install the plugin:
# Use Plugin Manager to install it
sudo bin/elasticsearch-plugin install discovery-gce
Open the elasticsearch.yml file:
sudo vi /etc/elasticsearch/elasticsearch.yml
And add the following lines:
cloud:
gce:
project_id: es-cloud
zone: europe-west1-a
discovery:
zen.hosts_provider: gce
Start Elasticsearch:
sudo /etc/init.d/elasticsearch start
If anything goes wrong, you should check logs:
tail -f /var/log/elasticsearch/elasticsearch.log
If needed, you can change the log level to trace by opening log4j2.properties:
sudo vi /etc/elasticsearch/log4j2.properties
and adding the following line:
# discovery
logger.discovery_gce.name = discovery.gce
logger.discovery_gce.level = trace
Cloning your existing machine
In order to build a cluster on many nodes, you can clone your configured instance to new nodes. You won’t have to reinstall everything!
First create an image of your running instance and upload it to Google Cloud Storage:
# Create an image of your current instance
sudo /usr/bin/gcimagebundle -d /dev/sda -o /tmp/
# An image has been created in `/tmp` directory:
ls /tmp
e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz
# Upload your image to Google Cloud Storage:
# Create a bucket to hold your image, let's say `esimage`:
gsutil mb gs://esimage
# Copy your image to this bucket:
gsutil cp /tmp/e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz gs://esimage
# Then add your image to images collection:
gcloud compute images create elasticsearch-2-0-0 --source-uri gs://esimage/e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz
# If the previous command did not work for you, logout from your instance
# and launch the same command from your local machine.
Start new instances
Now that you have an image, you can create as many instances as you need:
# Just change node name (here myesnode2)
gcloud compute instances create myesnode2 --image elasticsearch-2-0-0 --zone europe-west1-a
# If you want to provide all details directly, you can use:
gcloud compute instances create myesnode2 --image=elasticsearch-2-0-0 \
--zone europe-west1-a --machine-type f1-micro --scopes=compute-rw
Remove an instance (aka shut it down)
You can use Google Cloud Console or CLI to manage your instances:
# Stopping and removing instances
gcloud compute instances delete myesnode1 myesnode2 \
--zone=europe-west1-a
# Consider removing the disks as well if you don't need them anymore
gcloud compute disks delete boot-myesnode1 boot-myesnode2 \
--zone=europe-west1-a
Using GCE zones
cloud.gce.zone helps to retrieve instances running in a given zone. It should be one of the GCE supported zones.

The GCE discovery can support multiple zones, although you need to be aware of network latency between zones.

To enable discovery across more than one zone, just add your zone list to the cloud.gce.zone setting:
cloud:
gce:
project_id: <your-google-project-id>
zone: ["<your-zone1>", "<your-zone2>"]
discovery:
zen.hosts_provider: gce
Filtering by tags
The GCE discovery can also filter machines to include in the cluster based on tags using the discovery.gce.tags settings. For example, setting discovery.gce.tags to dev will only filter instances having a tag set to dev. Setting several tags will require all of those tags to be set for the instance to be included.

One practical use for tag filtering is when a GCE cluster contains many nodes that are not running Elasticsearch. In this case (particularly with high discovery.zen.ping_timeout values) there is a risk that a new node's discovery phase will end before it has found the cluster (which will result in it declaring itself master of a new cluster with the same name - highly undesirable). Adding a tag on Elasticsearch GCE nodes and then filtering by that tag will resolve this issue.
Add your tag when building the new instance:
gcloud compute instances create myesnode1 --project=es-cloud \
--scopes=compute-rw \
--tags=elasticsearch,dev
Then, define it in elasticsearch.yml
:
cloud:
gce:
project_id: es-cloud
zone: europe-west1-a
discovery:
zen.hosts_provider: gce
gce:
tags: elasticsearch, dev
Changing default transport port
By default, the Elasticsearch GCE plugin assumes that you run Elasticsearch on the default port, 9300. But you can specify the port value Elasticsearch is meant to use with the Google Compute Engine metadata key es_port:
When creating an instance
Add the --metadata es_port=9301 option:
# when creating first instance
gcloud compute instances create myesnode1 \
--scopes=compute-rw,storage-full \
--metadata es_port=9301
# when creating an instance from an image
gcloud compute instances create myesnode2 --image=elasticsearch-1-0-0-RC1 \
--zone europe-west1-a --machine-type f1-micro --scopes=compute-rw \
--metadata es_port=9301
On a running instance
gcloud compute instances add-metadata myesnode1 \
--zone europe-west1-a \
--metadata es_port=9301
GCE Tips
Store project id locally
If you don't want to repeat the project id each time, you can save it in the local gcloud config:
gcloud config set project es-cloud
Machine Permissions
If you have created a machine without the correct permissions, you will see 403 unauthorized error messages. To change the machine permissions on an existing instance, first stop the instance, then Edit. Scroll down to Access Scopes to change the permissions. The other way to alter these permissions is to delete the instance (NOT THE DISK), then create another one with the correct permissions.
Creating machines with gcloud::
Ensure the following flags are set:

--scopes=compute-rw

Creating with console (web)::
When creating an instance using the web portal, click Show advanced options. At the bottom of the page, under PROJECT ACCESS, choose >> Compute >> Read Write.

Creating with knife google::
Set the service account scopes when creating the machine:

knife google server create www1 \
    -m n1-standard-1 \
    -I debian-8 \
    -Z us-central1-a \
    -i ~/.ssh/id_rsa \
    -x jdoe \
    --gce-service-account-scopes https://www.googleapis.com/auth/compute.full_control

Or, you may use the alias:

--gce-service-account-scopes compute-rw
Testing GCE
Integration tests in this plugin require a working GCE configuration and are therefore disabled by default. To enable the tests, prepare a config file elasticsearch.yml with the following content:
cloud:
gce:
project_id: es-cloud
zone: europe-west1-a
discovery:
zen.hosts_provider: gce
Replace project_id and zone with your settings.

To run the tests:
mvn -Dtests.gce=true -Dtests.config=/path/to/config/file/elasticsearch.yml clean test
File-Based Discovery Plugin
The functionality provided by the discovery-file plugin is now available in Elasticsearch without requiring a plugin. This plugin still exists to ensure backwards compatibility, but it will be removed in a future version.

On installation, this plugin creates a file at $ES_PATH_CONF/discovery-file/unicast_hosts.txt that comprises comments describing how to use it. It is preferable not to install this plugin and instead to create this file, and its containing directory, using standard tools.
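For illustration, a unicast_hosts.txt sketch with hypothetical addresses; each line supplies one host, optionally with a transport port, and lines beginning with # are comments:

# One host per line; the port defaults to the transport port if omitted
10.10.10.5
10.10.10.6:9305
# IPv6 addresses with a port must be wrapped in brackets
[2001:db8::1]:9301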
Installation
This plugin can be installed using the plugin manager:
sudo bin/elasticsearch-plugin install discovery-file
The plugin must be installed on every node in the cluster, and each node must be restarted after installation.
This plugin can be downloaded for offline install from {plugin_url}/discovery-file/discovery-file-{version}.zip.
Removal
The plugin can be removed with the following command:
sudo bin/elasticsearch-plugin remove discovery-file
The node must be stopped before removing the plugin.