Amazon Elasticsearch Service
Developer Guide
API Version 2015-01-01
Amazon's trademarks and trade dress may not be used in connection with any product or service that is not
Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or
discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may
or may not be affiliated with, connected to, or sponsored by Amazon.
Table of Contents
What Is Amazon Elasticsearch Service? ................................................................................................. 1
Features of Amazon Elasticsearch Service ...................................................................................... 1
Supported Elasticsearch Versions ................................................................................................. 2
Pricing for Amazon ES ................................................................................................................ 3
Getting Started with Amazon Elasticsearch Service ......................................................................... 3
Related Services ......................................................................................................................... 3
Getting Started with Amazon ES Domains ............................................................................................ 5
Step 1: Creating an Amazon ES Domain ........................................................................................ 5
Step 2: Uploading Data for Indexing ............................................................................................ 6
Step 3: Searching Documents in an Amazon ES Domain .................................................................. 7
Step 4: Deleting an Amazon ES Domain ........................................................................................ 8
Creating and Managing Amazon ES Domains ......................................................................................... 9
Creating Amazon ES Domains ..................................................................................................... 9
Creating Amazon ES Domains (Console) ................................................................................ 9
Creating Amazon ES Domains (AWS CLI) ............................................................................. 11
Creating Amazon ES Domains (AWS SDKs) ........................................................................... 13
Configuring Access Policies ........................................................................................................ 13
Advanced Options .................................................................................................................... 13
Configuration Changes .............................................................................................................. 14
Charges for Configuration Changes ..................................................................................... 15
Service Software Updates .......................................................................................................... 15
Configuring a Multi-AZ Domain .................................................................................................. 17
Shard Distribution ............................................................................................................ 17
Dedicated Master Node Distribution ................................................................................... 18
Availability Zone Disruptions .............................................................................................. 19
VPC Support ............................................................................................................................ 20
Limitations ...................................................................................................................... 22
About Access Policies on VPC Domains ............................................................................... 23
Testing VPC Domains ........................................................................................................ 23
Prerequisites .................................................................................................................... 24
Creating a VPC ................................................................................................................. 25
Reserving IP Addresses in a VPC Subnet .............................................................................. 26
Service-Linked Role for VPC Access ..................................................................................... 26
Migrating from Public Access to VPC Access ......................................................................... 27
Amazon VPC Documentation ............................................................................................. 27
Monitoring Cluster Metrics ........................................................................................................ 27
Interpreting Health Dashboards ......................................................................................... 27
Cluster Metrics ................................................................................................................. 28
Dedicated Master Node Metrics .......................................................................................... 31
EBS Volume Metrics .......................................................................................................... 32
Instance Metrics ............................................................................................................... 33
UltraWarm Metrics ........................................................................................................... 36
Alerting Metrics ............................................................................................................... 37
Anomaly Detection Metrics ................................................................................................ 38
SQL Metrics ..................................................................................................................... 39
KNN Metrics .................................................................................................................... 40
Cross-Cluster Search Metrics .............................................................................................. 40
Learning to Rank Metrics ................................................................................................... 40
Configuring Logs ...................................................................................................................... 41
Enabling Log Publishing (Console) ...................................................................................... 41
Enabling Log Publishing (AWS CLI) ..................................................................................... 42
Enabling Log Publishing (AWS SDKs) .................................................................................. 43
Setting Elasticsearch Logging Thresholds for Slow Logs ........................................................ 44
Viewing Logs ................................................................................................................... 44
Amazon ES provisions all the resources for your Elasticsearch cluster and launches it. It also
automatically detects and replaces failed Elasticsearch nodes, reducing the overhead associated with
self-managed infrastructures. You can scale your cluster with a single API call or a few clicks in the
console.
To get started using Amazon ES, you create a domain. An Amazon ES domain is synonymous with an
Elasticsearch cluster. Domains are clusters with the settings, instance types, instance counts, and storage
resources that you specify. Each instance acts as one Elasticsearch node.
You can use the Amazon ES console to set up and configure a domain in minutes. If you prefer
programmatic access, you can use the AWS CLI or the AWS SDKs.
Topics
• Features of Amazon Elasticsearch Service (p. 1)
• Supported Elasticsearch Versions (p. 2)
• Pricing for Amazon Elasticsearch Service (p. 3)
• Getting Started with Amazon Elasticsearch Service (p. 3)
• Related Services (p. 3)
Scale
• Numerous configurations of CPU, memory, and storage capacity, known as instance types
• Up to 3 PB of attached storage
• Cost-effective UltraWarm (p. 184) storage for read-only data
Security
Stability
• Numerous geographical locations for your resources, known as Regions and Availability Zones
• Node allocation across two or three Availability Zones in the same AWS Region, known as Multi-AZ
• Dedicated master nodes to offload cluster management tasks
• Automated snapshots to back up and restore Amazon ES domains
Flexibility
Compared to earlier versions of Elasticsearch, the 7.x and 6.x versions offer powerful features that make
them faster, more secure, and easier to use. Here are a few highlights:
• Higher indexing performance – Newer versions of Elasticsearch provide superior indexing capabilities
that significantly increase the throughput of data updates.
• Better safeguards – Newer versions of Elasticsearch help prevent overly broad or complex queries
from negatively affecting the performance and stability of the cluster.
• Vega visualizations – Kibana 6.2 and later versions support the Vega visualization language, which lets
you make context-aware Elasticsearch queries, combine multiple data sources into a single graph, add
user interactivity to graphs, and much more.
• Java high-level REST client – Compared to the low-level client, this client offers a simplified
development experience and supports most Elasticsearch APIs. For a code example, see Signing HTTP
Requests (p. 114).
For more information, see the section called “Supported Elasticsearch Operations” (p. 216), the section
called “Features by Elasticsearch Version” (p. 213), and the section called “Plugins by Elasticsearch
Version” (p. 214).
If you start a new Elasticsearch project, we strongly recommend that you choose the latest supported
Elasticsearch version. If you have an existing domain that uses an older Elasticsearch version, you
can choose to keep the domain or migrate your data. For more information, see the section called
“Upgrading Elasticsearch” (p. 52).
However, some notable data transfer exceptions exist. If a domain uses multiple Availability
Zones (p. 17), Amazon ES does not bill for traffic between the Availability Zones. Significant data
transfer occurs within a domain during shard allocation and rebalancing. Amazon ES neither meters nor
bills for this traffic. Similarly, Amazon ES does not bill for data transfer between UltraWarm (p. 184)
nodes and Amazon S3.
For full pricing details, see Amazon Elasticsearch Service Pricing. For information about charges incurred
during configuration changes, see the section called “Charges for Configuration Changes” (p. 15).
For information on migrating to Amazon ES from a self-managed Elasticsearch cluster, see the section
called “Migrating to Amazon ES” (p. 241).
Related Services
Amazon ES is commonly used with the following services:
Amazon CloudWatch
Amazon ES domains automatically send metrics to CloudWatch so that you can monitor domain
health and performance. For more information, see Monitoring Cluster Metrics with Amazon
CloudWatch (p. 27).
Data can also flow in the other direction: you can configure CloudWatch Logs to stream
data to Amazon ES for analysis. To learn more, see the section called “Loading Streaming Data into
Amazon ES from Amazon CloudWatch” (p. 141).
AWS CloudTrail
Use AWS CloudTrail to get a history of the Amazon ES configuration API calls and related events
for your account. For more information, see Logging and Monitoring in Amazon Elasticsearch
Service (p. 96).
Amazon Kinesis
Kinesis is a managed service for real-time processing of streaming data at a massive scale. For more
information, see the section called “Loading Streaming Data into Amazon ES from Amazon Kinesis
Data Streams” (p. 136) and the section called “Loading Streaming Data into Amazon ES from
Amazon Kinesis Data Firehose” (p. 141).
Amazon S3
Amazon Simple Storage Service (Amazon S3) provides storage for the internet. This guide provides
Lambda sample code for integration with Amazon S3. For more information, see the section called
“Loading Streaming Data into Amazon ES from Amazon S3” (p. 131).
AWS IAM
AWS Identity and Access Management (IAM) is a web service that you can use to manage access
to your Amazon ES domains. For more information, see the section called “Identity and Access
Management” (p. 65).
AWS Lambda
AWS Lambda is a compute service that lets you run code without provisioning or managing servers.
This guide provides Lambda sample code to stream data from DynamoDB, Amazon S3, and Kinesis.
For more information, see the section called “Loading Streaming Data into Amazon ES” (p. 131).
Amazon DynamoDB
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable
performance with seamless scalability. To learn more about streaming data to Amazon ES, see the
section called “Loading Streaming Data into Amazon ES from Amazon DynamoDB” (p. 138).
The tutorial walks you through the basic steps to get a domain up and running quickly. For more detailed
information, see Creating and Managing Amazon ES Domains (p. 9) and the other topics within this
guide. For information on migrating to Amazon ES from a self-managed Elasticsearch cluster, see the
section called “Migrating to Amazon ES” (p. 241).
You can complete the following steps by using the Amazon ES console, the AWS CLI, or the AWS SDK:
For information about installing and setting up the AWS CLI, see the AWS Command Line Interface User
Guide.
An Amazon ES domain is synonymous with an Elasticsearch cluster. Domains are clusters with the
settings, instance types, instance counts, and storage resources that you specify. You can create an
Amazon ES domain by using the console, the AWS CLI, or the AWS SDKs.
You can upload data to an Amazon Elasticsearch Service domain using the command line or most
programming languages.
The following example requests use curl, a common HTTP client, for brevity and convenience. Clients like
curl can't perform the request signing that is required if your access policies specify IAM users or roles. To
successfully perform the instructions in this step, you must use fine-grained access control with a master
user name and password, like you configured in step 1 (p. 5).
You can install curl on Windows and use it from the command prompt, but we recommend a tool like
Cygwin or the Windows Subsystem for Linux. macOS and most Linux distributions come with curl pre-
installed.
• Run the following command to add a single document to the movies domain:
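A minimal sketch of what this request looks like, assuming Elasticsearch 7.x path conventions. Here
domain-endpoint, master-user, and master-user-password are placeholders for your own values, and the
document body is illustrative:

curl -XPUT -u 'master-user:master-user-password' \
  'https://domain-endpoint/movies/_doc/1' \
  -H 'Content-Type: application/json' \
  -d '{"title": "Mars Attacks!", "year": 1996}'

The trailing /1 assigns the document an ID of 1, and the -u flag supplies the master user credentials
that fine-grained access control requires.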
For a detailed explanation of this command and how to make signed requests to Amazon ES, see
Indexing Data (p. 129).
1. Create a file called bulk_movies.json. Copy and paste the following content into it, and add a
trailing newline:
Julie", "Kleeb, Helen", "Gray, Joe", "Nalder, Reggie", "Stevens, Bert", "Masters,
Michael", "Lowell, Tom"], "title": "The Manchurian Candidate"}
{ "index" : { "_index": "movies", "_id" : "3" } }
{"director": "Baird, Stuart", "genre": ["Action", "Crime", "Thriller"], "year": 1998,
"actor": ["Downey Jr., Robert", "Jones, Tommy Lee", "Snipes, Wesley", "Pantoliano,
Joe", "Jacob, Ir\u00e8ne", "Nelligan, Kate", "Roebuck, Daniel", "Malahide, Patrick",
"Richardson, LaTanya", "Wood, Tom", "Kosik, Thomas", "Stellate, Nick", "Minkoff,
Robert", "Brown, Spitfire", "Foster, Reese", "Spielbauer, Bruce", "Mukherji, Kevin",
"Cray, Ed", "Fordham, David", "Jett, Charlie"], "title": "U.S. Marshals"}
{ "index" : { "_index": "movies", "_id" : "4" } }
{"director": "Ray, Nicholas", "genre": ["Drama", "Romance"], "year": 1955, "actor":
["Hopper, Dennis", "Wood, Natalie", "Dean, James", "Mineo, Sal", "Backus, Jim",
"Platt, Edward", "Ray, Nicholas", "Hopper, William", "Allen, Corey", "Birch, Paul",
"Hudson, Rochelle", "Doran, Ann", "Hicks, Chuck", "Leigh, Nelson", "Williams, Robert",
"Wessel, Dick", "Bryar, Paul", "Sessions, Almira", "McMahon, David", "Peters Jr.,
House"], "title": "Rebel Without a Cause"}
2. Run the following command to upload the file to the movies domain:
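A minimal sketch of the upload, using the same placeholders as before. Note that --data-binary
(rather than -d) preserves the newlines that the bulk format requires:

curl -XPOST -u 'master-user:master-user-password' \
  'https://domain-endpoint/_bulk' \
  --data-binary @bulk_movies.json \
  -H 'Content-Type: application/json'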
For more information about the bulk file format, see Indexing Data (p. 129).
• Run the following command to search the movies domain for the word mars:
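A minimal sketch of the search request, using the same placeholders:

curl -XGET -u 'master-user:master-user-password' \
  'https://domain-endpoint/movies/_search?q=mars&pretty=true'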
If you used the bulk data on the previous page, try searching for rebel instead.
1. Point your browser to the Kibana plugin for your Amazon ES domain. You can find the Kibana
endpoint on your domain dashboard on the Amazon ES console. The URL follows this format:
domain-endpoint/_plugin/kibana/
Unlike the brief instructions in the Getting Started (p. 5) tutorial, this chapter describes all options and
provides relevant reference information. You can complete each procedure by using instructions for the
Amazon ES console, the AWS Command Line Interface (AWS CLI), or the AWS SDKs.
• Production domains use Multi-AZ and dedicated master nodes for higher availability.
• Development and testing domains use a single Availability Zone.
• Custom domains let you choose from all configuration options.
Important
Different deployment types present different options on subsequent pages. These steps
include all options (the Custom deployment type).
5. For Elasticsearch version, we recommend that you choose the latest version. For more information,
see the section called “Supported Elasticsearch Versions” (p. 2).
6. Choose Next.
7. For Elasticsearch domain name, enter a domain name. The name must meet the following criteria:
Note
Not all Availability Zones support all instance types. If you choose 3-AZ, we recommend
choosing current-generation instance types such as R5 or I3.
10. For Number of nodes, choose the number of data nodes.
For maximum values, see the section called “Cluster and Instance Limits” (p. 231). Single-node
clusters are fine for development and testing, but should not be used for production workloads. For
more guidance, see the section called “Sizing Amazon ES Domains” (p. 203) and the section called
“Configuring a Multi-AZ Domain” (p. 17).
11. For Data nodes storage type, choose either Instance (default) or EBS.
For guidance on creating especially large domains, see Petabyte Scale (p. 207). If you choose EBS,
the following options appear:
a. For EBS volume type, choose a volume type. If you choose Provisioned IOPS (SSD), for
Provisioned IOPS, enter the baseline IOPS performance that you want. For more information, see
Amazon EBS Volumes in the Amazon EC2 documentation.
b. For EBS storage size per node, enter the size of the EBS volume that you want to attach to each
data node.
EBS volume size is per node. You can calculate the total cluster size for the Amazon ES domain
by multiplying the number of data nodes by the EBS volume size. The minimum and maximum
size of an EBS volume depends on both the specified EBS volume type and the instance type
that it's attached to. To learn more, see EBS Volume Size Limits (p. 232).
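For example, a domain with three data nodes and 100 GiB of EBS storage per node has 3 * 100 GiB =
300 GiB of total cluster storage.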
12. (Optional) Enable or disable dedicated master nodes (p. 208). Dedicated master nodes increase
cluster stability and are required for domains that have instance counts greater than 10. We
recommend three dedicated master nodes for production domains.
Note
You can choose different instance types for your dedicated master nodes and data nodes.
For example, you might select general purpose or storage-optimized instances for your data
nodes, but compute-optimized instances for your dedicated master nodes.
13. (Optional) To enable UltraWarm storage (p. 184), choose Enable UltraWarm data nodes. Each
instance type has a maximum amount of storage (p. 232) that it can address. Multiply that amount
by the number of warm data nodes for the total addressable warm storage.
14. (Optional) For domains running Elasticsearch 5.3 and later, Automated snapshot start hour has no
effect. For more information about automated snapshots, see the section called “Working with Index
Snapshots” (p. 45).
15. (Optional) Choose Optional Elasticsearch cluster settings. For a summary of these options, see the
section called “Advanced Options” (p. 13).
16. Choose Next.
17. In the Network configuration section, choose either VPC access or Public access. If you choose
Public access, skip to the next step. If you choose VPC access, ensure that you have met the
prerequisites (p. 24), and then do the following:
a. For VPC, choose the ID of the VPC that you want to use.
Note
The VPC and domain must be in the same AWS Region, and you must select a VPC
with tenancy set to Default. Amazon ES does not yet support VPCs that use dedicated
tenancy.
b. For Subnet, choose a subnet. If you enabled Multi-AZ, you must choose two or three subnets.
Amazon ES will place a VPC endpoint and elastic network interfaces in the subnets.
Note
You must reserve sufficient IP addresses for the network interfaces in the subnet (or
subnets). For more information, see Reserving IP Addresses in a VPC Subnet (p. 26).
c. For Security groups, choose the VPC security groups that need access to the Amazon ES
domain. For more information, see the section called “VPC Support” (p. 20).
d. For IAM role, keep the default role. Amazon ES uses this predefined role (also known as a
service-linked role) to access your VPC and to place a VPC endpoint and network interfaces in
the subnet of the VPC. For more information, see Service-Linked Role for VPC Access (p. 26).
18. In the Fine-grained access control section, enable or disable fine-grained access control:
• If you want to use IAM for user management, choose Set IAM role as master user and specify the
ARN for an IAM role.
• If you want to use the internal user database, choose Create a master user and specify a user
name and password.
Whichever option you choose, the master user can access all indices in the cluster and all
Elasticsearch APIs. For guidance on which option to choose, see the section called “Key
Concepts” (p. 82).
If you disable fine-grained access control, you can still control access to your domain by placing it
within a VPC, applying a restrictive access policy, or both. You must enable node-to-node encryption
and encryption at rest to use fine-grained access control.
19. (Optional) If you want to use Amazon Cognito authentication for Kibana, choose Enable Amazon
Cognito authentication.
• Choose the Amazon Cognito user pool and identity pool that you want to use for Kibana
authentication. For guidance on creating these resources, see the section called “Authentication
for Kibana” (p. 100).
20. For Domain access policy, add the ARNs or IP addresses that you want or choose a preconfigured
policy from the dropdown list. For more information, see the section called “Identity and Access
Management” (p. 65) and the section called “About Access Policies on VPC Domains” (p. 23). A
sample policy sketch appears after the following note.
Note
If you chose VPC access in step 17, IP-based policies are prohibited. Instead, you can use
security groups to control which IP addresses can access the domain. For more information,
see the section called “About Access Policies on VPC Domains” (p. 23).
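The following is a sketch of the kind of IP-based policy described in step 20, suitable for domains with
public access. The Region, account ID, domain name, and CIDR range are illustrative values:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-east-1:123456789012:domain/mylogs/*",
      "Condition": { "IpAddress": { "aws:SourceIp": "192.0.2.0/24" } }
    }
  ]
}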
21. (Optional) To require that all requests to the domain arrive over HTTPS, select the Require HTTPS
for all traffic to the domain check box.
22. (Optional) To enable node-to-node encryption, select the Node-to-node encryption check box. For
more information, see the section called “Node-to-node Encryption” (p. 64).
23. (Optional) To enable encryption of data at rest, select the Enable encryption of data at rest check
box.
Select (Default) aws/es to have Amazon ES create a KMS encryption key on your behalf (or use the
one that it already created). Otherwise, choose your own KMS encryption key from the KMS master
key menu. For more information, see the section called “Encryption at Rest” (p. 62).
24. Choose Next.
25. On the Review page, review your domain configuration, and then choose Confirm.
Example Commands
This first example demonstrates the following Amazon ES domain configuration:
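A representative sketch of such a command; the domain name, Elasticsearch version, instance type,
instance count, and volume size are illustrative values:

aws es create-elasticsearch-domain \
  --domain-name mylogs \
  --elasticsearch-version 7.4 \
  --elasticsearch-cluster-config InstanceType=m5.large.elasticsearch,InstanceCount=2 \
  --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=10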
Note
If you attempt to create an Amazon ES domain and a domain with the same name already
exists, the CLI does not report an error. Instead, it returns details for the existing domain.
The console provides preconfigured access policies that you can customize for the specific needs of your
domain. You also can import access policies from other Amazon ES domains. For information about
how these access policies interact with VPC access, see the section called “About Access Policies on VPC
Domains” (p. 23).
Advanced Options
Use advanced options to configure the following (a brief CLI example follows this list):
rest.action.multi.allow_explicit_index
Specifies whether explicit references to indices are allowed inside the body of HTTP requests.
Setting this property to false prevents users from bypassing access control for subresources. By
default, the value is true. For more information, see the section called “Advanced Options and API
Considerations” (p. 74).
indices.fielddata.cache.size
Specifies the percentage of Java heap space that is allocated to field data. By default, this setting is
unbounded.
Note
Many customers query rotating daily indices. We recommend that you begin benchmark
testing with indices.fielddata.cache.size configured to 40% of the JVM heap for
most such use cases. However, if you have very large indices you might need a large field
data cache.
indices.query.bool.max_clause_count
Specifies the maximum number of clauses allowed in a Lucene boolean query. The default is 1,024.
Queries with more than the permitted number of clauses result in a TooManyClauses error. For
more information, see the Lucene documentation.
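A minimal AWS CLI sketch for setting these options on an existing domain; the domain name and the
values shown are illustrative:

aws es update-elasticsearch-domain-config \
  --domain-name mylogs \
  --advanced-options rest.action.multi.allow_explicit_index=true,indices.fielddata.cache.size=40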
Configuration Changes
Amazon ES uses a blue/green deployment process when updating domains. Blue/green typically refers
to the practice of running two production environments, one live and one idle, and switching the two
as you make software changes. In the case of Amazon ES, it refers to the practice of creating a new
environment for domain updates and routing users to the new environment after those updates are
complete. The practice minimizes downtime and maintains the original environment in the event that
deployment to the new environment is unsuccessful.
There are some exceptions. For example, if you haven't reconfigured your domain since the launch
of three Availability Zone support, Amazon ES might perform a one-time blue/green deployment to
redistribute your dedicated master nodes across Availability Zones.
If you initiate a configuration change, the domain state changes to Processing while Amazon ES creates
a new environment with the latest service software (p. 15). During certain service software updates,
the state remains Active. In both cases, you can review the cluster health and Amazon CloudWatch
metrics and see that the number of nodes in the cluster temporarily increases—often doubling—while
the domain update occurs. In the following illustration, you can see the number of nodes doubling from
11 to 22 during a configuration change and returning to 11 when the update is complete.
This temporary increase can strain the cluster's dedicated master nodes (p. 208), which suddenly might
have many more nodes to manage. It's important to maintain sufficient capacity on dedicated master
nodes to handle the overhead that is associated with these blue/green deployments.
Important
You do not incur any additional charges during configuration changes and service maintenance.
You are billed only for the number of nodes that you request for your cluster. For specifics, see
the section called “Charges for Configuration Changes” (p. 15).
To prevent overloading dedicated master nodes, you can monitor usage with the Amazon CloudWatch
metrics (p. 27). For recommended maximum values, see the section called “Recommended
CloudWatch Alarms” (p. 210).
• If you change the instance type, you are charged for both clusters for the first hour. After the first hour,
you are charged only for the new cluster.
Example: You change the configuration from three m3.xlarge instances to four m4.large instances.
For the first hour, you are charged for both clusters (3 * m3.xlarge + 4 * m4.large). After the first
hour, you are charged only for the new cluster (4 * m4.large).
• If you don’t change the instance type, you are charged only for the largest cluster for the first hour.
After the first hour, you are charged only for the new cluster.
Example: You change the configuration from six m3.xlarge instances to three m3.xlarge instances.
For the first hour, you are charged for the largest cluster (6 * m3.xlarge). After the first hour, you are
charged only for the new cluster (3 * m3.xlarge).
Amazon ES regularly releases system software updates that add features or otherwise improve your
domains. The console is the easiest way to see if an update is available. When new service software
becomes available, you can request an update to your domain and benefit from new features more
quickly. You might also want to start the update at a low traffic time.
• If you take no action on required updates, we still update the service software automatically after a
certain timeframe (typically two weeks).
• If the console does not include an automatic deployment date, the update is optional.
Your domain might be ineligible for a service software update if it is in any of the states that are shown
in the following table.
State Description
Domain in processing: The domain is in the middle of a configuration change. Check update
eligibility after the operation completes.
Red cluster status: One or more indices in the cluster are red. For troubleshooting steps, see the
section called “Red Cluster Status” (p. 262).
High error rate: The Elasticsearch cluster is returning a large number of 5xx errors when
attempting to process requests. This problem is usually the result of too many simultaneous read or
write requests. Consider reducing traffic to the cluster or scaling your domain.
Split brain: Split brain means that your Elasticsearch cluster has more than one master node and has
split into two clusters that will never rejoin on their own. You can avoid split brain by using the
recommended number of dedicated master nodes (p. 208). For help recovering from split brain,
contact AWS Support.
Amazon Cognito integration issue: Your domain uses authentication for Kibana (p. 100), and Amazon
ES can't find one or more Amazon Cognito resources. This problem usually occurs if the Amazon
Cognito user pool is missing. To correct the issue, recreate the missing resource and configure the
Amazon ES domain to use it.
Other Amazon ES service issue: Issues with Amazon ES itself might cause your domain to display as
ineligible for an update. If none of the previous conditions apply to your domain and the problem
persists for more than a day, contact AWS Support.
You can use the following commands to see if an update is available, check upgrade eligibility, and
request an update (a brief example follows the list):
• describe-elasticsearch-domain (DescribeElasticsearchDomain)
• start-elasticsearch-service-software-update
(StartElasticsearchServiceSoftwareUpdate)
For more information, see the AWS CLI Command Reference and Amazon ES Configuration API
Reference (p. 271).
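A brief sketch, assuming a domain named mylogs. The first command prints the
ServiceSoftwareOptions block, which indicates whether an update is available; the second requests
the update:

aws es describe-elasticsearch-domain --domain-name mylogs \
  --query 'DomainStatus.ServiceSoftwareOptions'
aws es start-elasticsearch-service-software-update --domain-name mylogs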
Tip
After requesting an update, you might have a narrow window of time in which you can
cancel it. Use the console or stop-elasticsearch-service-software-update
(StopElasticsearchServiceSoftwareUpdate) command.
For domains that run production workloads, we recommend the following configuration:
• Choose a Region that supports three Availability Zones with Amazon ES.
• Deploy the domain across three zones.
• Choose current-generation instance types for dedicated master nodes and data nodes.
• Use three dedicated master nodes and at least three data nodes.
• Create at least one replica for each index in your cluster.
The rest of this section provides explanations for and context around these recommendations.
Shard Distribution
If you enable Multi-AZ, you should create at least one replica for each index in your cluster. Without
replicas, Amazon ES can't distribute copies of your data to other Availability Zones, which largely defeats
the purpose of Multi-AZ. Fortunately, the default configuration for any index is a replica count of 1. As
the following diagram shows, Amazon ES makes a best effort to distribute primary shards and their
corresponding replica shards to different zones.
In addition to distributing shards by Availability Zone, Amazon ES distributes them by node. Still, certain
domain configurations can result in imbalanced shard counts. Consider the following domain:
• 5 data nodes
• 5 primary shards
• 2 replicas
• 3 Availability Zones
In this situation, Amazon ES has to overload one node in order to distribute the primary and replica
shards across the zones, as shown in the following diagram.
To avoid these kinds of situations, which can strain individual nodes and hurt performance, we
recommend that you choose an instance count that is a multiple of three if you plan to have two or more
replicas per index.
• If you choose an older-generation instance type that is not available in three Availability Zones, the
following scenarios apply:
• If you chose three Availability Zones for the domain, Amazon ES throws an error. Choose a different
instance type, and try again.
• If you chose two Availability Zones for the domain, Amazon ES distributes the dedicated master
nodes across two zones.
• Not all AWS Regions have three Availability Zones. In these Regions, you can only configure a domain
to use two zones (and Amazon ES can only distribute dedicated master nodes across two zones).
In all configurations, regardless of the cause, node failures can cause the cluster's remaining data nodes
to experience a period of increased load while Amazon ES automatically configures new nodes to replace
the now-missing ones.
For example, in the event of an Availability Zone disruption in a three-zone configuration, two-thirds as
many data nodes have to process just as many requests to the cluster. As they process these requests,
the remaining nodes are also replicating shards onto new nodes as they come online, which can further
impact performance. If availability is critical to your workload, consider adding resources to your cluster
to alleviate this concern.
Note
Amazon ES manages Multi-AZ domains transparently, so you can't manually simulate
Availability Zone disruptions.
Placing an Amazon ES domain within a VPC enables secure communication between Amazon ES and
other services within the VPC without the need for an internet gateway, NAT device, or VPN connection.
All traffic remains securely within the AWS Cloud. Because of their logical isolation, domains that reside
within a VPC have an extra layer of security when compared to domains that use public endpoints.
To support VPCs, Amazon ES places an endpoint into one, two, or three subnets of your VPC. A subnet is
a range of IP addresses in your VPC. If you enable multiple Availability Zones (p. 17) for your domain,
each subnet must be in a different Availability Zone in the same region. If you only use one Availability
Zone, Amazon ES places an endpoint into only one subnet.
The following illustration shows the VPC architecture for one Availability Zone.
The following illustration shows the VPC architecture for two Availability Zones.
Amazon ES also places an elastic network interface (ENI) in the VPC for each of your data nodes. Amazon
ES assigns each ENI a private IP address from the IPv4 address range of your subnet. The service also
assigns a public DNS hostname (which is the domain endpoint) for the IP addresses. You must use a
public DNS service to resolve the endpoint (which is a DNS hostname) to the appropriate IP addresses for
the data nodes:
• If your VPC uses the Amazon-provided DNS server by setting the enableDnsSupport option to true
(the default value), resolution for the Amazon ES endpoint will succeed.
• If your VPC uses a private DNS server and the server can reach the public authoritative DNS servers to
resolve DNS hostnames, resolution for the Amazon ES endpoint will also succeed.
Because the IP addresses might change, you should resolve the domain endpoint periodically so that you
can always access the correct data nodes. We recommend that you set the DNS resolution interval to one
minute. If you’re using a client, you should also ensure that the DNS cache in the client is cleared.
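For example, you can check resolution with a standard DNS tool; the endpoint shown here is a
placeholder for your domain's VPC endpoint:

dig +short vpc-mydomain-abc123.us-east-1.es.amazonaws.com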
Note
Amazon ES doesn't support IPv6 addresses with a VPC. You can use a VPC that has IPv6 enabled,
but the domain will use IPv4 addresses.
Topics
• Limitations (p. 22)
• About Access Policies on VPC Domains (p. 23)
• Testing VPC Domains (p. 23)
• Before You Begin: Prerequisites for VPC Access (p. 24)
• Creating a VPC (p. 25)
• Reserving IP Addresses in a VPC Subnet (p. 26)
• Service-Linked Role for VPC Access (p. 26)
• Migrating from Public Access to VPC Access (p. 27)
• Amazon VPC Documentation (p. 27)
Limitations
Currently, operating an Amazon ES domain within a VPC has the following limitations:
• You can either launch your domain within a VPC or use a public endpoint, but you can't do both. You
must choose one or the other when you create your domain.
• If you launch a new domain within a VPC, you can't later switch it to use a public endpoint. The reverse
is also true: If you create a domain with a public endpoint, you can't later place it within a VPC. Instead,
you must create a new domain and migrate your data.
• You can't launch your domain within a VPC that uses dedicated tenancy. You must use a VPC with
tenancy set to Default.
• After you place a domain within a VPC, you can't move it to a different VPC. However, you can change
the subnets and security group settings.
• Compared to public domains, VPC domains display less information in the Amazon ES console.
Specifically, the Cluster health tab does not include shard information, and the Indices tab is not
present at all.
• To access the default installation of Kibana for a domain that resides within a VPC, users must have
access to the VPC. This process varies by network configuration, but likely involves connecting to a
VPN or managed network or using a proxy server. To learn more, see the section called “About Access
Policies on VPC Domains” (p. 23), the Amazon VPC User Guide, and the section called “Controlling
Access to Kibana” (p. 179).
When you create a domain with public access, the endpoint takes the following form:
https://search-domain-name-identifier.region.es.amazonaws.com
As the "public" label suggests, this endpoint is accessible from any internet-connected device, though
you can (and should) control access to it (p. 65). If you access the endpoint in a web browser, you
might receive a Not Authorized message, but the request reaches the domain.
When you create a domain with VPC access, the endpoint looks similar to a public endpoint:
https://vpc-domain-name-identifier.region.es.amazonaws.com
If you try to access the endpoint in a web browser, however, you might find that the request times out.
To perform even basic GET requests, your computer must be able to connect to the VPC. This connection
often takes the form of a VPN, managed network, or proxy server. For details on the various forms it can
take, see Scenarios and Examples in the Amazon VPC User Guide. For a development-focused example,
see the section called “Testing VPC Domains” (p. 23).
In addition to this connectivity requirement, VPCs let you manage access to the domain through security
groups. For many use cases, this combination of security features is sufficient, and you might feel
comfortable applying an open access policy to the domain.
Operating with an open access policy does not mean that anyone on the internet can access the Amazon
ES domain. Rather, it means that if a request reaches the Amazon ES domain and the associated security
groups permit it, the domain accepts the request without further security checks.
For an additional layer of security, we recommend using access policies that specify IAM users or roles.
Applying these policies means that, for the domain to accept a request, the security groups must permit
it and it must be signed with valid credentials.
Note
Because security groups already enforce IP-based access policies, you can't apply IP-based access
policies to Amazon ES domains that reside within a VPC. If you use public access, IP-based
policies are still available.
1. For your domain's access policy, choose Allow open access to the domain. You can always update this
setting after you finish testing.
2. Create an Amazon Linux Amazon EC2 instance in the same VPC, subnet, and security group as your
Amazon ES domain.
Because this instance is for testing purposes and needs to do very little work, choose an inexpensive
instance type like t2.micro. Assign the instance a public IP address and either create a new key pair
or choose an existing one. If you create a new key, download it to your ~/.ssh directory.
To learn more about creating instances, see Getting Started with Amazon EC2 Linux Instances.
3. Add an internet gateway to your VPC.
4. In the route table for your VPC, add a new route. For Destination, specify a CIDR block that contains
your computer's public IP address. For Target, specify the internet gateway you just created.
For example, you might specify 123.123.123.123/32 for just your computer or
123.123.123.0/24 for a range of computers.
5. For the security group, specify two inbound rules:
The first rule lets you SSH into your EC2 instance. The second allows the EC2 instance to communicate
with the Amazon ES domain over HTTPS.
6. From the terminal, run the following command:
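The command is a standard SSH tunnel of the following form, where the key file, EC2 public IP
address, and domain endpoint are placeholders for your own values:

ssh -i ~/.ssh/mykey.pem ec2-user@ec2-instance-public-ip \
  -N -L 9200:vpc-domain-name-identifier.region.es.amazonaws.com:443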
This command creates an SSH tunnel that forwards requests to https://localhost:9200 to your
Amazon ES domain through the EC2 instance. By default, Elasticsearch listens for traffic on port 9200.
Specifying this port simulates a local Elasticsearch install, but use whichever port you'd like.
The command provides no feedback and runs indefinitely. To stop it, press Ctrl + C.
7. Navigate to https://localhost:9200/_plugin/kibana/ in your web browser. You might need to
acknowledge a security exception.
Alternatively, you can send requests to https://localhost:9200 using curl, Postman, or your favorite
programming language.
Tip
If you encounter curl errors due to a certificate mismatch, try the --insecure flag.
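For example, a search through the tunnel might look like the following, reusing the master user
credentials from earlier; --insecure works around the localhost certificate mismatch:

curl -XGET --insecure -u 'master-user:master-user-password' \
  'https://localhost:9200/movies/_search?q=rebel'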
As an alternative to this approach, if your domain is in a region that AWS Cloud9 supports, you can
create an EC2 environment in the same VPC as your domain, add the environment's security group to
your Amazon ES domain configuration, add the HTTPS rule from step 5 to your security group, and use
the web-based Bash in AWS Cloud9 to issue curl commands.
• Create a VPC
To create your VPC, you can use the Amazon VPC console, the AWS CLI, or one of the AWS SDKs. For
more information, see Creating a VPC (p. 25) and the CLI sketch after this list. If you already have
a VPC, you can skip this step.
• Reserve IP addresses
Amazon ES enables the connection of a VPC to a domain by placing network interfaces in a subnet
of the VPC. Each network interface is associated with an IP address. You must reserve a sufficient
number of IP addresses in the subnet for the network interfaces. For more information, see Reserving
IP Addresses in a VPC Subnet (p. 26).
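As noted in the first item above, you can create the VPC and subnet with the AWS CLI instead of the
console; a minimal sketch with illustrative CIDR blocks, VPC ID, and Availability Zone:

aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-0abc123 --cidr-block 10.0.0.0/24 \
  --availability-zone us-east-1a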
Creating a VPC
To create your VPC, you can use one of the following: the Amazon VPC console, the AWS CLI, or one
of the AWS SDKs. The VPC must have between one and three subnets, depending on the number of
Availability Zones (p. 17) for your domain.
The following procedure shows how to use the Amazon VPC console to create a VPC with a public
subnet, reserve IP addresses for the subnet, and create a security group to control access to your Amazon
ES domain. For other VPC configurations, see Scenarios and Examples in the Amazon VPC User Guide.
1. Sign in to the AWS Management Console, and open the Amazon VPC console at https://
console.aws.amazon.com/vpc/.
2. In the navigation pane, choose VPC Dashboard.
3. Choose Start VPC Wizard.
4. On the Select a VPC Configuration page, select VPC with a Single Public Subnet.
5. On the VPC with a Single Public Subnet page, keep the default options, and then choose Create
VPC.
6. In the confirmation message that appears, choose Close.
7. If you intend to enable multiple Availability Zones (p. 17) for your Amazon ES domain, you must
create additional subnets. Otherwise, skip to step 8.
12. Define a network ingress rule for your security group. This rule allows you to connect to your
Amazon ES domain.
a. In the navigation pane, choose Security Groups, and then select the security group that you just
created.
b. At the bottom of the page, choose the Inbound Rules tab.
c. Choose Edit, and then choose HTTPS (443).
d. Choose Save.
Now you are ready to launch an Amazon ES domain (p. 9) in your Amazon VPC.
• Number of data nodes in your domain. (Master nodes are not included in the number.)
• Number of Availability Zones. If you enable two or three Availability Zones, you need only half or one-
third the number of IP addresses per subnet that you need for one Availability Zone.
Here is the basic formula: The number of IP addresses reserved in each subnet is three times the number
of nodes, divided by the number of Availability Zones.
Examples
• If a domain has 10 data nodes and two Availability Zones, the IP count is 10 / 2 * 3 = 15.
• If a domain has 10 data nodes and one Availability Zone, the IP count is 10 * 3 = 30.
When you create the domain, Amazon ES reserves the IP addresses, uses some for the domain, and
reserves the rest for blue/green deployments (p. 14). You can see the network interfaces and their
associated IP addresses in the Network Interfaces section of the Amazon EC2 console at https://
console.aws.amazon.com/ec2/. The Description column shows which Amazon ES domain the network
interface is associated with.
Tip
We recommend that you create dedicated subnets for the Amazon ES reserved IP addresses. By
using dedicated subnets, you avoid overlap with other applications and services and ensure that
you can reserve additional IP addresses if you need to scale your cluster in the future. To learn
more, see Creating a Subnet in Your VPC.
Amazon ES automatically creates the role when you use the Amazon ES console to create a
domain within a VPC. For this automatic creation to succeed, you must have permissions for the
es:CreateElasticsearchServiceRole and iam:CreateServiceLinkedRole actions. To learn
more, see Service-Linked Role Permissions in the IAM User Guide.
For full information on this role's permissions and how to delete it, see the section called “Using Service-
Linked Roles” (p. 111).
Description Documentation
How to get started using Amazon VPC: Amazon VPC Getting Started Guide
How to use Amazon VPC through the AWS Management Console: Amazon VPC User Guide
Complete descriptions of all the Amazon VPC commands: Amazon EC2 Command Line Reference (The
Amazon VPC commands are part of the Amazon EC2 reference.)
Complete descriptions of the Amazon VPC API actions, data types, and errors: Amazon EC2 API
Reference (The Amazon VPC API actions are part of the Amazon EC2 reference.)
For more detailed information about Amazon Virtual Private Cloud, see Amazon Virtual Private Cloud.
• Each colored box shows the range of values for the node over the specified time period.
• Blue boxes represent values that are consistent with other nodes. Red boxes represent outliers.
• The white line within each box shows the node's current value.
• The “whiskers” on either side of each box show the minimum and maximum values for all nodes over
the time period.
Amazon ES domains send performance metrics to Amazon CloudWatch every minute. If you use General
Purpose or Magnetic EBS volumes, the EBS volume metrics update only every five minutes. To view these
metrics, use the Cluster health and Instance health tabs in the Amazon Elasticsearch Service console.
The metrics are provided at no extra charge.
If you make configuration changes to your domain, the list of individual instances in the Cluster health
and Instance health tabs often doubles in size for a brief period before returning to the correct number.
For an explanation of this behavior, see the section called “Configuration Changes” (p. 14).
All metrics are in the AWS/ES namespace. Metrics for individual nodes are in the ClientId,
DomainName, NodeId dimension. Cluster metrics are in the Per-Domain, Per-Client Metrics
dimension. Some node metrics are aggregated at the cluster level and thus included in both dimensions.
The service archives metrics for two weeks before discarding them.
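As an example of retrieving these metrics outside the console, the following AWS CLI sketch fetches
per-minute average CPU utilization; the domain name, account ID (the ClientId dimension), and time
range are illustrative values:

aws cloudwatch get-metric-statistics \
  --namespace AWS/ES \
  --metric-name CPUUtilization \
  --dimensions Name=DomainName,Value=mylogs Name=ClientId,Value=123456789012 \
  --statistics Average --period 60 \
  --start-time 2020-06-01T00:00:00Z --end-time 2020-06-01T01:00:00Z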
Cluster Metrics
Amazon Elasticsearch Service provides the following metrics for clusters.
Metric Description
ClusterStatus.green A value of 1 indicates that all index shards are allocated to nodes in
the cluster.
ClusterStatus.yellow A value of 1 indicates that the primary shards for all indices are
allocated to nodes in the cluster, but replica shards for at least one
index are not. For more information, see the section called “Yellow
Cluster Status” (p. 264).
ClusterStatus.red A value of 1 indicates that the primary and replica shards for at
least one index are not allocated to nodes in the cluster. For more
information, see the section called “Red Cluster Status” (p. 262).
SearchableDocuments The total number of searchable documents across all data nodes in
the cluster.
DeletedDocuments The total number of documents marked for deletion across all data
nodes in the cluster. These documents no longer appear in search
results, but Elasticsearch only removes deleted documents from disk
during segment merges. This metric increases after delete requests
and decreases after segment merges.
CPUUtilization The percentage of CPU usage for data nodes in the cluster. Maximum
shows the node with the highest CPU usage. Average represents all
nodes in the cluster. This metric is also available for individual nodes.
FreeStorageSpace The free space for data nodes in the cluster. Sum shows total free
space for the cluster, but you must leave the period at one minute to
get an accurate value. Minimum and Maximum show the nodes with
the most and least free space, respectively. This metric is also available
for individual nodes. Amazon ES throws a ClusterBlockException
when this metric reaches 0. To recover, you must either delete indices,
add larger instances, or add EBS-based storage to existing instances.
To learn more, see the section called “Lack of Available Storage
Space” (p. 264).
Note
FreeStorageSpace will always be lower than the value that
the Elasticsearch _cluster/stats API provides. Amazon ES
reserves a percentage of the storage space on each instance
for internal operations.
ClusterUsedSpace The total used space for the cluster. You must leave the period at one
minute to get an accurate value.
JVMMemoryPressure The maximum percentage of the Java heap used for all data nodes
in the cluster. Amazon ES uses half of an instance's RAM for the Java
heap, up to a heap size of 32 GiB. You can scale instances vertically
up to 64 GiB of RAM, at which point you can scale horizontally by
adding instances. See the section called “Recommended CloudWatch
Alarms” (p. 210).
AutomatedSnapshotFailure The number of failed automated snapshots for the cluster. A value of
1 indicates that no automated snapshot was taken for the domain in
the previous 36 hours.
CPUCreditBalance The remaining CPU credits available for data nodes in the
cluster. A CPU credit provides the performance of a full CPU core
for one minute. For more information, see CPU Credits in the
Amazon EC2 Developer Guide. This metric is available only for the
t2.micro.elasticsearch, t2.small.elasticsearch, and
t2.medium.elasticsearch instance types.
KMSKeyError A value of 1 indicates that the KMS customer master key used to
encrypt data at rest has been disabled. To restore the domain to
normal operations, re-enable the key. The console displays this metric
only for domains that encrypt data at rest.
KMSKeyInaccessible A value of 1 indicates that the KMS customer master key used to
encrypt data at rest has been deleted or revoked its grants to Amazon
ES. You can't recover domains that are in this state. If you have a
manual snapshot, though, you can use it to migrate the domain's data
to a new domain. The console displays this metric only for domains
that encrypt data at rest.
InvalidHostHeaderRequests The number of HTTP requests made to the Elasticsearch cluster that
included an invalid (or missing) host header. If you see large values for this
metric, confirm that your Elasticsearch clients include the domain hostname
(and not, for example, its IP address) in their requests.
2xx, 3xx, 4xx, 5xx The number of requests to the domain that resulted in the given HTTP
response code (2xx, 3xx, 4xx, 5xx).
Dedicated Master Node Metrics
Amazon Elasticsearch Service provides the following metrics for dedicated master nodes.
Metric Description
MasterFreeStorageSpace This metric is not relevant and can be ignored. The service does not
use master nodes as data nodes.
MasterJVMMemoryPressure The maximum percentage of the Java heap used for all dedicated
master nodes in the cluster. We recommend moving to a larger
instance type when this metric reaches 85 percent.
MasterCPUCreditBalance The remaining CPU credits available for dedicated master nodes in
the cluster. A CPU credit provides the performance of a full CPU core
for one minute. For more information, see CPU Credits in the Amazon
EC2 User Guide for Linux Instances. This metric is available only for
the t2.micro.elasticsearch, t2.small.elasticsearch, and
t2.medium.elasticsearch instance types.
EBS Volume Metrics
Amazon Elasticsearch Service provides the following metrics for EBS volumes.
Metric Description
ReadThroughput The throughput, in bytes per second, for read operations on EBS volumes.
WriteThroughput The throughput, in bytes per second, for write operations on EBS volumes.
DiskQueueDepth The number of pending input and output (I/O) requests for an EBS volume.
ReadIOPS The number of input and output (I/O) operations per second for read
operations on EBS volumes.
WriteIOPS The number of input and output (I/O) operations per second for write
operations on EBS volumes.
Instance Metrics
Amazon Elasticsearch Service provides the following metrics for each instance in a domain. Amazon ES
also aggregates these instance metrics to provide insight into overall cluster health. You can verify this
behavior using the Data samples statistic in the console. Note that each metric in the following table has
relevant statistics for the node and the cluster.
Important
Different versions of Elasticsearch use different thread pools to process calls to the _index API.
Elasticsearch 1.5 and 2.3 use the index thread pool. Elasticsearch 5.x, 6.0, and 6.2 use the bulk
thread pool. 6.3 and later use the write thread pool. Currently, the Amazon ES console doesn't
include a graph for the bulk thread pool.
Metric Description
IndexingRate The number of indexing operations per minute. A single call to the _bulk
API that adds two documents and updates two counts as four operations,
which might be spread across one or more nodes. If that index has one or
more replicas, other nodes in the cluster also record a total of four indexing
operations. Document deletions do not count towards this metric.
SearchLatency The average time, in milliseconds, that it takes a shard on a data node to
complete a search operation.
SearchRate The total number of search requests per minute for all shards on a data
node. A single call to the _search API might return results from many
different shards. If five of these shards are on one node, the node would
report 5 for this metric, even though the client only made one request.
Relevant cluster statistics: Average, Maximum, Sum
SysMemoryUtilizationThe percentage of the instance's memory that is in use. High values for this
metric are normal and usually do not represent a problem with your cluster.
For a better indicator of potential performance and stability issues, see the
JVMMemoryPressure metric.
JVMGCYoungCollectionCount The number of times that "young generation" garbage collection has run. A
large, ever-growing number of runs is a normal part of cluster operations.
JVMGCYoungCollectionTime The amount of time, in milliseconds, that the cluster has spent performing
"young generation" garbage collection.
JVMGCOldCollectionCount The number of times that "old generation" garbage collection has run. In a
cluster with sufficient resources, this number should remain small and grow
infrequently.
JVMGCOldCollectionTime The amount of time, in milliseconds, that the cluster has spent performing
"old generation" garbage collection.
ThreadpoolForce_mergeQueue The number of queued tasks in the force merge thread pool. If the queue
size is consistently high, consider scaling your cluster.
ThreadpoolForce_mergeRejected The number of rejected tasks in the force merge thread pool. If this number
continually grows, consider scaling your cluster.
ThreadpoolIndexQueue The number of queued tasks in the index thread pool. If the queue size is
consistently high, consider scaling your cluster. The maximum index queue
size is 200.
ThreadpoolIndexRejected The number of rejected tasks in the index thread pool. If this number
continually grows, consider scaling your cluster.
ThreadpoolSearchQueue The number of queued tasks in the search thread pool. If the queue size is
consistently high, consider scaling your cluster. The maximum search queue
size is 1,000.
ThreadpoolSearchRejected The number of rejected tasks in the search thread pool. If this number
continually grows, consider scaling your cluster.
ThreadpoolBulkQueue The number of queued tasks in the bulk thread pool. If the queue size is
consistently high, consider scaling your cluster.
ThreadpoolBulkRejected The number of rejected tasks in the bulk thread pool. If this number
continually grows, consider scaling your cluster.
UltraWarm Metrics
Amazon Elasticsearch Service provides the following metrics for UltraWarm (p. 184) nodes.
Metric Description
WarmCPUUtilization The percentage of CPU usage for UltraWarm nodes in the cluster. Maximum
shows the node with the highest CPU usage. Average represents all
UltraWarm nodes in the cluster. This metric is also available for individual
UltraWarm nodes.
WarmFreeStorageSpace The amount of free warm storage space in MiB. Because UltraWarm uses
Amazon S3 rather than attached disks, Sum is the only relevant statistic. You
must leave the period at one minute to get an accurate value.
WarmJVMMemoryPressure The maximum percentage of the Java heap used for the UltraWarm nodes.
WarmSearchableDocuments The total number of searchable documents across all warm indices in the
cluster. You must leave the period at one minute to get an accurate value.
Relevant cluster statistics: Average, Maximum
WarmSearchRate The total number of search requests per minute for all shards on an
UltraWarm node. A single call to the _search API might return results
from many different shards. If five of these shards are on one node, the
node would report 5 for this metric, even though the client only made one
request.
WarmStorageSpaceUtilization The total amount of warm storage space that the cluster is using. The
Amazon ES console displays this value in GiB. The Amazon CloudWatch
console displays it in MiB.
HotStorageSpaceUtilization The total amount of hot storage space that the cluster is using. The Amazon
ES console displays this value in GiB. The Amazon CloudWatch console
displays it in MiB.
Alerting Metrics
Amazon Elasticsearch Service provides the following metrics for the alerting feature (p. 198).
Metric Description
AlertingDegraded A value of 1 means that either the alerting index is red or one or more nodes
is not on schedule. A value of 0 indicates normal behavior.
AlertingIndexStatus.green The health of the index. A value of 1 means green. A value of 0 means that
the index either doesn't exist or isn't green.
Relevant statistics: Maximum
AlertingIndexStatus.red The health of the index. A value of 1 means red. A value of 0 means that the
index either doesn't exist or isn't red.
AlertingIndexStatus.yellow The health of the index. A value of 1 means yellow. A value of 0 means that
the index either doesn't exist or isn't yellow.
AlertingNodesOnSchedule A value of 1 means that all alerting jobs are running on schedule (or that
no alerting jobs exist). A value of 0 means some jobs are not running on
schedule.
SQL Metrics
Amazon Elasticsearch Service provides the following metrics for SQL support (p. 152).
Metric Description
Relevant statistics: Sum
KNN Metrics
Amazon Elasticsearch Service includes metrics for KNN (p. 153). For a summary of each, see the Open
Distro for Elasticsearch documentation.
Add a CloudWatch alarm in the event that you lose a connection unexpectedly. For steps to create an
alarm, see Create a CloudWatch Alarm Based on a Static Threshold.
Learning to Rank Metrics
Amazon Elasticsearch Service provides the following metric for Learning to Rank.
Metric Description
LTRStoreIndexIsRed Tracks if one of the indices needed to run the plugin is red.
Configuring Logs
Amazon ES exposes three Elasticsearch logs through Amazon CloudWatch Logs: error logs, search slow
logs, and index slow logs. These logs are useful for troubleshooting performance and stability issues, but
are disabled by default. If enabled, standard CloudWatch pricing applies.
Note
Error logs are available only for Elasticsearch versions 5.1 and later. Slow logs are available
for all Elasticsearch versions.
For its logs, Elasticsearch uses Apache Log4j 2 and its built-in log levels (from least to most severe) of
TRACE, DEBUG, INFO, WARN, ERROR, and FATAL.
If you enable error logs, Amazon ES publishes log lines of WARN, ERROR, and FATAL to CloudWatch.
Amazon ES also publishes several exceptions from the DEBUG level, including the following:
• org.elasticsearch.index.mapper.MapperParsingException
• org.elasticsearch.index.query.QueryShardException
• org.elasticsearch.action.search.SearchPhaseExecutionException
• org.elasticsearch.common.util.concurrent.EsRejectedExecutionException
• java.lang.IllegalArgumentException
Error logs can help with troubleshooting in many situations, including the following:
6. Choose an access policy that contains the appropriate permissions, or create a policy using the JSON
that the console provides:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "es.amazonaws.com"
},
"Action": [
"logs:PutLogEvents",
"logs:CreateLogStream"
],
"Resource": "cw_log_group_arn"
}
]
}
Important
CloudWatch Logs supports 10 resource policies per Region. If you plan to enable logs for
several Amazon ES domains, you should create and reuse a broader policy that includes
multiple log groups to avoid reaching this limit. For steps on updating your policy, see the
section called “Enabling Log Publishing (AWS CLI)” (p. 42).
7. Choose Enable.
The status of your domain changes from Active to Processing. The status must return to Active
before log publishing is enabled. This change typically takes 30 minutes, but can take longer
depending on your domain configuration.
If you enabled one of the slow logs, see the section called “Setting Elasticsearch Logging Thresholds
for Slow Logs” (p. 44). If you enabled only error logs, you don't need to perform any additional
configuration steps.
Enabling Log Publishing (AWS CLI)
Enter the next command to find the log group's ARN, and then make a note of it:
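For example (a representative command; substitute your own log group name):
aws logs describe-log-groups --log-group-name-prefix my-log-group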
Now you can give Amazon ES permissions to write to the log group. You must provide the log group's
ARN near the end of the command:
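A sketch of that command, assuming a hypothetical policy name of my-policy and the policy document shown earlier; replace cw_log_group_arn with the ARN that you noted:
aws logs put-resource-policy --policy-name my-policy --policy-document '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Principal": {"Service": "es.amazonaws.com"}, "Action": ["logs:PutLogEvents","logs:CreateLogStream"], "Resource": "cw_log_group_arn"}]}'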
Important
CloudWatch Logs supports 10 resource policies per Region. If you plan to enable slow logs for
several Amazon ES domains, you should create and reuse a broader policy that includes multiple
log groups to avoid reaching this limit.
If you need to review this policy at a later time, use the aws logs describe-resource-policies
command. To update the policy, issue the same aws logs put-resource-policy command with a
new policy document.
Finally, you can use the --log-publishing-options option to enable publishing. The syntax for the
option is the same for both the create-elasticsearch-domain and update-elasticsearch-domain-config commands.
--log-publishing-options SEARCH_SLOW_LOGS={CloudWatchLogsLogGroupArn=cw_log_group_arn,Enabled=true|false}
INDEX_SLOW_LOGS={CloudWatchLogsLogGroupArn=cw_log_group_arn,Enabled=true|false}
ES_APPLICATION_LOGS={CloudWatchLogsLogGroupArn=cw_log_group_arn,Enabled=true|false}
Note
If you plan to enable multiple logs, we recommend publishing each to its own log group. This
separation makes the logs easier to scan.
Example
The following example enables the publishing of search and index slow logs for the specified domain:
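A representative command follows; the domain name and log group ARNs are illustrative placeholders that you must replace with your own values:
aws es update-elasticsearch-domain-config --domain-name my-domain --log-publishing-options "SEARCH_SLOW_LOGS={CloudWatchLogsLogGroupArn=arn:aws:logs:us-east-1:123456789012:log-group:my-search-slow-logs,Enabled=true},INDEX_SLOW_LOGS={CloudWatchLogsLogGroupArn=arn:aws:logs:us-east-1:123456789012:log-group:my-index-slow-logs,Enabled=true}"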
If you enabled one of the slow logs, see the section called “Setting Elasticsearch Logging Thresholds
for Slow Logs” (p. 44). If you enabled only error logs, you don't need to perform any additional
configuration steps.
Enabling Log Publishing (AWS SDKs)
Before you can enable log publishing through the SDKs, you need a CloudWatch log group, its ARN, and a resource policy that lets Amazon ES write to it. The relevant CloudWatch Logs operations are the following:
• CreateLogGroup
• DescribeLogGroups
• PutResourcePolicy
The AWS SDKs (except the Android and iOS SDKs) support all the operations that are defined in Amazon
ES Configuration API Reference (p. 271), including the --log-publishing-options option for
CreateElasticsearchDomain and UpdateElasticsearchDomainConfig.
If you enabled one of the slow logs, see the section called “Setting Elasticsearch Logging Thresholds
for Slow Logs” (p. 44). If you enabled only error logs, you don't need to perform any additional
configuration steps.
Setting Elasticsearch Logging Thresholds for Slow Logs
Elasticsearch disables slow logs by default. After you enable the publishing of slow logs to
CloudWatch, you still must specify logging thresholds for each index. The following example request
sets search slow log thresholds for an index:
PUT elasticsearch_domain_endpoint/index/_settings
{
"index.search.slowlog.threshold.query.warn": "5s",
"index.search.slowlog.threshold.query.info": "2s"
}
To test that slow logs are publishing successfully, consider starting with very low values to verify that
logs appear in CloudWatch, and then increase the thresholds to more useful levels.
If no log data appears in CloudWatch even though publishing is enabled, check the following:
• Does the CloudWatch log group exist? Check the CloudWatch console.
• Does Amazon ES have permissions to write to the log group? Check the Amazon ES console.
• Is the Amazon ES domain configured to publish to the log group? Check the Amazon ES
console, use the AWS CLI describe-elasticsearch-domain-config option, or call
DescribeElasticsearchDomainConfig using one of the SDKs.
• Are the Elasticsearch logging thresholds low enough that your requests are exceeding them? To review
your thresholds for an index, use the following command:
GET elasticsearch_domain_endpoint/index/_settings?pretty
If you want to disable slow logs for an index, return any thresholds that you changed to their default
values of -1.
Disabling publishing to CloudWatch using the Amazon ES console or AWS CLI does not stop Elasticsearch
from generating logs; it only stops the publishing of those logs. Be sure to check your index settings if
you no longer need the slow logs.
Viewing Logs
Viewing the application and slow logs in CloudWatch is just like viewing any other CloudWatch log. For
more information, see View Log Data in the Amazon CloudWatch Logs User Guide.
Note the following limitation:
• Amazon ES publishes only the first 255,000 characters of each line to CloudWatch. Any remaining
content is truncated.
Working with Index Snapshots
On Amazon Elasticsearch Service, snapshots come in two forms: automated and manual.
• Automated snapshots are only for cluster recovery. You can use them to restore your domain (p. 50)
in the event of red cluster status (p. 262) or other data loss. Amazon ES stores automated snapshots
in a preconfigured Amazon S3 bucket at no additional charge.
• Manual snapshots are for cluster recovery or moving data from one cluster to another. As the name
suggests, you have to initiate manual snapshots. These snapshots are stored in your own Amazon
S3 bucket, and standard S3 charges apply. If you have a snapshot from a self-managed Elasticsearch
cluster, you can even use that snapshot to migrate to an Amazon ES domain (p. 241).
• For domains running Elasticsearch 5.3 and later, Amazon ES takes hourly automated snapshots and
retains up to 336 of them for 14 days.
• For domains running Elasticsearch 5.1 and earlier, Amazon ES takes daily automated snapshots (during
the hour you specify) and retains up to 14 of them for 30 days.
If your cluster enters red status, Amazon ES stops taking automated snapshots. If you don't correct the
problem within two weeks, you can permanently lose your cluster's data. For troubleshooting steps, see
the section called “Red Cluster Status” (p. 262).
Topics
• Manual Snapshot Prerequisites (p. 45)
• Registering a Manual Snapshot Repository (p. 47)
• Taking Manual Snapshots (p. 49)
• Restoring Snapshots (p. 50)
• Using Curator for Snapshots (p. 51)
Manual Snapshot Prerequisites
Prerequisite Description
S3 bucket Stores manual snapshots for your Amazon ES domain. Make a note of the bucket's
name. You need it in two places:
• Resource statement of the IAM policy that is attached to your IAM role
• Python client that is used to register a snapshot repository
For more information, see Create a Bucket in the Amazon Simple Storage Service
Getting Started Guide.
Important
Do not apply an S3 Glacier lifecycle rule to this bucket. Manual snapshots do
not support the S3 Glacier storage class.
IAM role Delegates permissions to Amazon Elasticsearch Service. The rest of this chapter refers
to this role as TheSnapshotRole.
The trust relationship for the role must specify Amazon Elasticsearch Service in the
Principal statement, as shown in the following example:
{
"Version": "2012-10-17",
"Statement": [{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": "es.amazonaws.com"
},
"Action": "sts:AssumeRole"
}]
}
The role must also have an attached policy that grants access to the S3 bucket, as in the following
example:
{
"Version": "2012-10-17",
"Statement": [{
"Action": [
"s3:ListBucket"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::s3-bucket-name"
]
},
{
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::s3-bucket-name/*"
]
}
]
}
For more information, see Adding IAM Identity Permissions in the IAM User Guide.
Permissions You must be able to assume TheSnapshotRole in order to register the snapshot
repository. You also need access to the es:ESHttpPut action. The following policy
includes these permissions:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "iam:PassRole",
"Resource": "arn:aws:iam::123456789012:role/TheSnapshotRole"
},
{
"Effect": "Allow",
"Action": "es:ESHttpPut",
"Resource": "arn:aws:es:region:123456789012:domain/my-domain/*"
}
]
}
If your user or role doesn't have iam:PassRole permissions, registering the repository fails with an
error like the following:
$ python register-repo.py
{"Message":"User: arn:aws:iam::123456789012:user/MyUserAccount
is not authorized to perform: iam:PassRole on resource:
arn:aws:iam::123456789012:role/TheSnapshotRole"}
Registering a Manual Snapshot Repository
You can't use curl to perform this operation, because it doesn't support AWS request signing.
Instead, use the sample Python client (p. 48), Postman, or some other method to send a signed
request (p. 114) to register the snapshot repository. The request takes the following form:
PUT elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo-name
{
"type": "s3",
"settings": {
"bucket": "s3-bucket-name",
"region": "region",
"role_arn": "arn:aws:iam::123456789012:role/TheSnapshotRole"
}
}
Registering a snapshot directory is a one-time operation, but to migrate from one domain to another,
you must register the same snapshot repository on the old domain and the new domain. The repository
name is arbitrary.
Important
If the S3 bucket is in the us-east-1 region, you must use "endpoint": "s3.amazonaws.com"
instead of "region": "us-east-1".
To enable server-side encryption with S3-managed keys for the snapshot repository, add
"server_side_encryption": true to the "settings" JSON.
If your domain resides within a VPC, your computer must be connected to the VPC in order for the
request to successfully register the snapshot repository. Accessing a VPC varies by network configuration,
but likely involves connecting to a VPN or corporate network. To check that you can reach the Amazon
ES domain, navigate to https://your-vpc-domain.region.es.amazonaws.com in a web browser
and verify that you receive the default JSON response.
If you use fine-grained access control, see the section called “Manual Snapshots” (p. 94) for an
additional step.
You must update the following variables: host, region, path, and payload.
import boto3
import requests
from requests_aws4auth import AWS4Auth

host = 'https://elasticsearch-domain-endpoint/' # include https:// and trailing /
region = 'us-west-1' # e.g. us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service,
                   session_token=credentials.token)

# Register repository

path = '_snapshot/my-snapshot-repo-name' # the Elasticsearch API endpoint
url = host + path

payload = {
  "type": "s3",
  "settings": {
    "bucket": "s3-bucket-name",
    # "endpoint": "s3.amazonaws.com", # for us-east-1
    "region": "us-west-1", # for all other regions
    "role_arn": "arn:aws:iam::123456789012:role/TheSnapshotRole"
  }
}

headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers)

print(r.status_code)
print(r.text)
# # Take snapshot
#
# path = '_snapshot/my-snapshot-repo/my-snapshot'
# url = host + path
#
# r = requests.put(url, auth=awsauth)
#
# print(r.text)
#
# # Delete index
#
# path = 'my-index'
# url = host + path
#
# r = requests.delete(url, auth=awsauth)
#
# print(r.text)
#
# # Restore snapshot (all indices except Kibana and fine-grained access control)
#
# path = '_snapshot/my-snapshot-repo/my-snapshot/_restore'
# url = host + path
#
# payload = {
# "indices": "-.kibana*,-.opendistro_security",
# "include_global_state": false
# }
#
# headers = {"Content-Type": "application/json"}
#
# r = requests.post(url, auth=awsauth, json=payload, headers=headers)
#
# # Restore snapshot (one index)
#
# path = '_snapshot/my-snapshot-repo/my-snapshot/_restore'
# url = host + path
#
# payload = {"indices": "my-index"}
#
# headers = {"Content-Type": "application/json"}
#
# r = requests.post(url, auth=awsauth, json=payload, headers=headers)
#
# print(r.text)
Taking Manual Snapshots
Elasticsearch snapshots are incremental, meaning that they only store data that has changed since
the last successful snapshot. This incremental nature means that the difference in disk usage between
frequent and infrequent snapshots is often minimal. In other words, taking hourly snapshots for a week
(for a total of 168 snapshots) might not use much more disk space than taking a single snapshot at the
end of the week. Also, the more frequently you take snapshots, the less time they take to complete.
Some Elasticsearch users take snapshots as often as every half hour.
The examples in this chapter use curl, a common HTTP client, for convenience and brevity. If your access
policies specify IAM users or roles, however, you must sign your snapshot requests. You can use the
commented-out examples in the sample Python client (p. 48) to make signed HTTP requests to the
same endpoints that the curl commands use.
1. You can't take a snapshot if one is currently in progress. To check, run the following command:
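One way to check (an illustrative curl call; substitute your domain endpoint, and use a signed request if your access policy requires it):
curl -XGET 'elasticsearch-domain-endpoint/_snapshot/_status'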
Note
The time required to take a snapshot increases with the size of the Amazon ES domain.
Long-running snapshot operations sometimes encounter the following error: 504
GATEWAY_TIMEOUT. Typically, you can ignore these errors and wait for the operation to
complete successfully. Use the following command to verify the state of all snapshots of your
domain:
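An illustrative command, assuming a repository named my-snapshot-repo:
curl -XGET 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/_all?pretty'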
Restoring Snapshots
Warning
If you use index aliases, cease write requests to an alias (or switch the alias to another index)
prior to deleting its index. Halting write requests helps avoid the following scenario:
If you switched the alias to another index, specify "include_aliases": false when you
restore from a snapshot.
To restore a snapshot
1. Identify the snapshot that you want to restore. To see all snapshot repositories, run the following
command:
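For example (illustrative; substitute your domain endpoint):
curl -XGET 'elasticsearch-domain-endpoint/_snapshot?pretty'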
After you identify the repository, run the following command to see all snapshots:
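For example, for a hypothetical repository named my-snapshot-repo:
curl -XGET 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/_all?pretty'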
Note
Most automated snapshots are stored in the cs-automated repository. If your domain
encrypts data at rest, they are stored in the cs-automated-enc repository. If you don't
see the manual snapshot repository that you're looking for, make sure that you registered
it (p. 47) to the domain.
2. (Optional) Delete or rename one or more indices in the Amazon ES domain. You don't need to
perform this step if you have no naming conflicts between indices on the cluster and indices in the
snapshot.
You can't restore a snapshot of your indices to an Elasticsearch cluster that already contains indices
with the same names. Currently, Amazon ES does not support the Elasticsearch _close API, so you
must use one of the following alternatives:
• Delete the indices on the same Amazon ES domain, and then restore the snapshot.
• Rename the indices as you restore them from the snapshot (p. 266), and later, reindex them.
• Restore the snapshot to a different Amazon ES domain (only possible with manual snapshots).
The following example shows how to delete all existing indices for a domain:
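An illustrative command (this deletes every index on the domain, so use it with care):
curl -XDELETE 'elasticsearch-domain-endpoint/_all'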
If you don't plan to restore all indices, though, you might want to delete only one:
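For example, to delete only my-index:
curl -XDELETE 'elasticsearch-domain-endpoint/my-index'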
Due to special permissions on the Kibana and fine-grained access control indices, attempts to restore
all indices might fail, especially if you try to restore from an automated snapshot. The following
example restores just one index, my-index, from 2017-snapshot in the cs-automated snapshot
repository:
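An illustrative request:
curl -XPOST 'elasticsearch-domain-endpoint/_snapshot/cs-automated/2017-snapshot/_restore' -H 'Content-Type: application/json' -d '{"indices": "my-index"}'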
Alternately, you might want to restore all indices except for the Kibana and fine-grained access
control indices:
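A sketch of such a request:
curl -XPOST 'elasticsearch-domain-endpoint/_snapshot/cs-automated/2017-snapshot/_restore' -H 'Content-Type: application/json' -d '{"indices": "-.kibana*,-.opendistro_security", "include_global_state": false}'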
Note
If not all primary shards were available for the indices involved, a snapshot might have a state
of PARTIAL. This value indicates that data from at least one shard was not stored successfully.
You can still restore from a partial snapshot, but you might need to use older snapshots to
restore any missing indices.
Using Curator for Snapshots
Curator offers advanced filtering functionality that can help simplify management tasks on complex
clusters. Amazon ES supports Curator on domains running Elasticsearch version 5.1 and above. You can
use Curator as a command line interface (CLI) or Python API. If you use the CLI, export your credentials at
the command line and configure curator.yml as follows:
client:
  hosts: search-my-domain.us-west-1.es.amazonaws.com
  port: 443
  use_ssl: True
  aws_region: us-west-1
  aws_sign_request: True
  ssl_no_validate: False
  timeout: 60
logging:
  loglevel: INFO
For sample Lambda functions that use the Python API, see the section called “Using Curator to Rotate
Data” (p. 193).
Upgrading Elasticsearch
Note
Elasticsearch version upgrades differ from service software updates. For information on
updating the service software for your Amazon ES domain, see the section called “Service
Software Updates” (p. 15).
Amazon ES offers in-place Elasticsearch upgrades for domains that run versions 5.1 and later. If you use
services like Amazon Kinesis Data Firehose or Amazon CloudWatch Logs to stream data to Amazon ES,
check that these services support the newer version of Elasticsearch before migrating.
Your Current Version Available Upgrade Versions
7.x 7.x
6.8 7.x
Important
Elasticsearch 7.0 includes numerous breaking changes. Before initiating an
in-place upgrade, we recommend taking a manual snapshot (p. 45) of the
6.8 domain, restoring it on a test 7.x domain, and using that test domain to
identify potential upgrade issues.
As in Elasticsearch 6.x, indices can contain only one mapping type, but that
type must now be named _doc. As a result, certain APIs no longer require a
mapping type in the request body (such as the _bulk API).
For new indices, self-hosted Elasticsearch 7.x has a default shard count of
one. Amazon ES 7.x domains retain the previous default of five.
6.x 6.x
5.6 6.x
Important
Indices created in version 6.x no longer support multiple mapping types.
Indices created in version 5.x still support multiple mapping types when
restored into a 6.x cluster. Check that your client code creates only a single
mapping type per index.
5.x 5.6
1. Pre-upgrade checks – Amazon ES performs a series of checks for issues that can block an upgrade
and doesn't proceed to the next step unless these checks succeed.
2. Snapshot – Amazon ES takes a snapshot of the Elasticsearch cluster and doesn't proceed to the next
step unless the snapshot succeeds. If the upgrade fails, Amazon ES uses this snapshot to restore the
cluster to its original state. For more information about this snapshot, see the section called “Can't
Downgrade After Upgrade” (p. 267).
3. Upgrade – Amazon ES starts the upgrade, which can take from 15 minutes to several hours to
complete. Kibana might be unavailable during some or all of the upgrade.
Troubleshooting an Upgrade
In-place Elasticsearch upgrades require healthy domains. Your domain might be ineligible for an upgrade
or fail to upgrade for a wide variety of reasons. The following table shows the most common issues.
Issue Description
Too many shards per node The 7.x versions of Elasticsearch have a default setting of no more than
1,000 shards per node. If a node in your current cluster exceeds this setting,
Amazon ES doesn't allow you to upgrade. See the section called “Maximum
Shard Limit” (p. 266) for troubleshooting options.
Domain in processing The domain is in the middle of a configuration change. Check upgrade
eligibility after the operation completes.
Red cluster status One or more indices in the cluster is red. For troubleshooting steps, see the
section called “Red Cluster Status” (p. 262).
High error rate The Elasticsearch cluster is returning a large number of 5xx errors when
attempting to process requests. This problem is usually the result of too
many simultaneous read or write requests. Consider reducing traffic to the
cluster or scaling your domain.
Split brain Split brain means that your Elasticsearch cluster has more than one master
node and has split into two clusters that will never rejoin on their own. You
can avoid split brain by using the recommended number of dedicated master
nodes (p. 208). For help recovering from split brain, contact AWS Support.
Master node not found Amazon ES can't find the cluster's master node. If your domain uses multi-
AZ (p. 17), an Availability Zone failure might have caused the cluster to
lose quorum and be unable to elect a new master node (p. 208). If the
issue does not self-resolve, contact AWS Support.
Too many pending tasks The master node is under heavy load and has many pending tasks. Consider
reducing traffic to the cluster or scaling your domain.
Impaired storage volume The disk volume of one or more nodes isn't functioning properly. This
issue often occurs alongside other issues, like a high error rate or too many
pending tasks. If it occurs in isolation and doesn't self-resolve, contact AWS
Support.
KMS key issue The KMS key that is used to encrypt the domain is either inaccessible or
missing. For more information, see the section called “Monitoring Domains
That Encrypt Data at Rest” (p. 63).
Snapshot in progress The domain is currently taking a snapshot. Check upgrade eligibility
after the snapshot finishes. Also check that you can list manual snapshot
repositories, list snapshots within those repositories, and take manual
snapshots. If Amazon ES is unable to check whether a snapshot is in
progress, upgrades can fail.
Snapshot timeout or failure The pre-upgrade snapshot took too long to complete or failed. Check cluster
health, and try again. If the problem persists, contact AWS Support.
Incompatible indices One or more indices is incompatible with the target Elasticsearch version.
This problem can occur if you migrated the indices from an older version of
Elasticsearch, like 2.3. Reindex the indices, and try again.
High disk usage Disk usage for the cluster is above 90%. Delete data or scale the domain,
and try again.
High JVM usage JVM memory pressure is above 75%. Reduce traffic to the cluster or scale
the domain, and try again.
Kibana alias problem .kibana is already configured as an alias and maps to an incompatible
index, likely one from an earlier version of Kibana. Reindex, and try again.
Red Kibana status Kibana status is red. Try using Kibana when the upgrade completes. If the
red status persists, resolve it manually, and try again.
Other Amazon ES Issues with Amazon ES itself might cause your domain to display as ineligible
service issue for an upgrade. If none of the preceding conditions apply to your domain
and the problem persists for more than a day, contact AWS Support.
Starting an Upgrade
The upgrade process is irreversible and can't be paused or canceled. During an upgrade, you can't make
configuration changes to the domain. Before starting an upgrade, double-check that you want to
proceed. You can use these same steps to perform the pre-upgrade check without actually starting an
upgrade.
If the cluster has dedicated master nodes, upgrades complete without downtime. Otherwise, the cluster
might be unresponsive for several seconds post-upgrade while it elects a master node.
1. Take a manual snapshot (p. 45) of your domain. This snapshot serves as a backup that you can
restore on a new domain (p. 50) if you want to return to using the prior Elasticsearch version.
2. Go to https://aws.amazon.com, and then choose Sign In to the Console.
3. Under Analytics, choose Elasticsearch Service.
4. In the navigation pane, under My domains, choose the domain that you want to upgrade.
5. Choose Actions and Upgrade domain.
6. For Operation, choose Upgrade, Submit, and Continue.
7. Return to the Overview tab and choose Upgrade status to monitor the state of the upgrade.
You can use the following operations to identify the right Elasticsearch version for your domain, start an
in-place upgrade, perform the pre-upgrade check, and view progress:
• get-compatible-elasticsearch-versions (GetCompatibleElasticsearchVersions)
• upgrade-elasticsearch-domain (UpgradeElasticsearchDomain)
• get-upgrade-status (GetUpgradeStatus)
• get-upgrade-history (GetUpgradeHistory)
For more information, see the AWS CLI Command Reference and Amazon ES Configuration API
Reference (p. 271).
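For example, the following illustrative command runs only the pre-upgrade check for a hypothetical domain and target version (the target version must come from get-compatible-elasticsearch-versions):
aws es upgrade-elasticsearch-domain --domain-name my-domain --target-version 7.1 --perform-check-only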
The following table shows how to use snapshots to migrate data to a domain that uses a different
Elasticsearch version. For more information about taking and restoring snapshots, see the section called
“Working with Index Snapshots” (p. 45).
From Version To Version Migration Process
6.x 7.x 1. Review breaking changes for 7.0 to see if you need to
make adjustments to your indices or applications. For
other considerations, see the table in the section called
“Upgrading Elasticsearch” (p. 52).
2. Create a manual snapshot of the 6.x domain.
3. Create a 7.x domain.
4. Restore the snapshot from the original domain to the 7.x
domain. During the operation, you likely need to restore
the .kibana index under a new name:
POST _snapshot/<repository-name>/<snapshot-name>/_restore
{
  "indices": "*",
  "ignore_unavailable": true,
  "rename_pattern": ".kibana",
  "rename_replacement": ".backup-kibana"
}
5. If you no longer need your 6.x domain, delete it.
Otherwise, you continue to incur charges for the domain.
5.x 6.x 1. Review breaking changes for 6.0 to see if you need to
make adjustments to your indices or applications. For
other considerations, see the table in the section called
“Upgrading Elasticsearch” (p. 52).
2. Create a manual snapshot of the 5.x domain.
3. Create a 6.x domain.
4. Restore the snapshot from the original domain to the 6.x
domain.
5. If you no longer need your 5.x domain, delete it.
Otherwise, you continue to incur charges for the domain.
2.3 6.x Elasticsearch 2.3 snapshots are not compatible with 6.x.
To migrate your data directly from 2.3 to 6.x, you must
manually recreate your indices in the new domain.
Alternately, you can follow the 2.3 to 5.x steps in this table,
perform _reindex operations in the new 5.x domain to
convert your 2.3 indices to 5.x indices, and then follow the
5.x to 6.x steps.
2.3 5.x 1. Review breaking changes for 5.0 to see if you need to
make adjustments to your indices or applications.
2. Create a manual snapshot of the 2.3 domain.
3. Create a 5.x domain.
4. Restore the snapshot from the 2.3 domain to the 5.x
domain.
5. If you no longer need your 2.3 domain, delete it.
Otherwise, you continue to incur charges for the domain.
1.5 5.x Elasticsearch 1.5 snapshots are not compatible with 5.x.
To migrate your data from 1.5 to 5.x, you must manually
recreate your indices in the new domain.
Important
1.5 snapshots are compatible with 2.3, but Amazon
ES 2.3 domains do not support the _reindex
operation. Because you cannot reindex them,
indices that originated in a 1.5 domain still fail to
restore from 2.3 snapshots to 5.x domains.
1.5 2.3 1. Use the migration plugin to find out if you can directly
upgrade to version 2.3. You might need to make changes
to your data before migration.
a. In a web browser, open http://domain_endpoint/_plugin/migration/.
b. Choose Run checks now.
c. Review the results and, if needed, follow the
instructions to make changes to your data.
2. Create a manual snapshot of the 1.5 domain.
3. Create a 2.3 domain.
4. Restore the snapshot from the 1.5 domain to the 2.3
domain.
5. If you no longer need your 1.5 domain, delete it.
Otherwise, you continue to incur charges for the domain.
Tagging Amazon Elasticsearch Service Domains
Tag key The tag key is the required name of the tag. Tag keys must be unique for the Amazon
ES domain to which they are attached. For a list of basic restrictions on tag keys and
values, see User-Defined Tag Restrictions.
Tag value The tag value is an optional string value of the tag. Tag values can be null and do
not have to be unique in a tag set. For example, you can have a key-value pair in a tag
set of project/Trinity and cost-center/Trinity. For a list of basic restrictions on tag keys
and values, see User-Defined Tag Restrictions.
Each Amazon ES domain has a tag set, which contains all the tags that are assigned to that Amazon ES
domain. AWS does not automatically set any tags on Amazon ES domains. A tag set can contain up to 50
tags, or it can be empty. If you add a tag to an Amazon ES domain that has the same key as an existing
tag for a resource, the new value overwrites the old value.
You can use these tags to track costs by grouping expenses for similarly tagged resources. An Amazon
ES domain tag is a name-value pair that you define and associate with an Amazon ES domain. The name
is referred to as the key. You can use tags to assign arbitrary information to an Amazon ES domain. A
tag key could be used, for example, to define a category, and the tag value could be an item in that
category. For example, you could define a tag key of “project” and a tag value of “Salix,” indicating
that the Amazon ES domain is assigned to the Salix project. You could also use tags to designate
Amazon ES domains as being used for test or production by using a key such as environment=test or
environment=production. We recommend that you use a consistent set of tag keys to make it easier
to track metadata that is associated with Amazon ES domains.
You also can use tags to organize your AWS bill to reflect your own cost structure. To do this, sign up to
get your AWS account bill with tag key values included. Then, organize your billing information according
to resources with the same tag key values to see the cost of combined resources. For example, you can
tag several Amazon ES domains with key-value pairs, and then organize your billing information to see
the total cost for each domain across several services. For more information, see Using Cost Allocation
Tags in the AWS Billing and Cost Management documentation.
Note
Tags are cached for authorization purposes. Because of this, additions and updates to tags on
Amazon ES domains might take several minutes before they are available.
For more information about using the console to work with tags, see Working with Tag Editor in the AWS
Management Console Getting Started Guide.
Syntax
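The command takes the following general form (a sketch; --tag-list accepts one or more Key=<key>,Value=<value> pairs):
add-tags --arn=<domain_arn> --tag-list Key=<key>,Value=<value>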
Parameter Description
--arn Amazon resource name for the Amazon ES domain to which the tag is
attached.
Example
The following example creates two tags for the logs domain:
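An illustrative command with hypothetical tag names:
aws es add-tags --arn "arn:aws:es:us-east-1:123456789012:domain/logs" --tag-list Key=service,Value=Elasticsearch Key=instances,Value=m3.2xlarge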
You can remove tags from an Amazon ES domain using the remove-tags command.
Syntax
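The command takes the following general form (a sketch):
remove-tags --arn=<domain_arn> --tag-keys <key> [<key> ...]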
Parameter Description
--arn Amazon Resource Name (ARN) for the Amazon ES domain to which the
tag is attached.
--tag-keys Set of space-separated tag keys that you want to remove from the
Amazon ES domain.
Example
The following example removes two tags from the logs domain that were created in the preceding
example:
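An illustrative command:
aws es remove-tags --arn "arn:aws:es:us-east-1:123456789012:domain/logs" --tag-keys service instances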
You can view the existing tags for an Amazon ES domain with the list-tags command:
Syntax
list-tags --arn=<domain_arn>
Parameter Description
--arn Amazon Resource Name (ARN) for the Amazon ES domain to which the
tags are attached.
Example
The following example lists all resource tags for the logs domain:
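An illustrative command:
aws es list-tags --arn "arn:aws:es:us-east-1:123456789012:domain/logs"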
Security in Amazon Elasticsearch Service
Security is a shared responsibility between AWS and you. The shared responsibility model describes this
as security of the cloud and security in the cloud:
• Security of the cloud – AWS is responsible for protecting the infrastructure that runs AWS services in
the AWS Cloud. AWS also provides you with services that you can use securely. Third-party auditors
regularly test and verify the effectiveness of our security as part of the AWS compliance programs. To
learn about the compliance programs that apply to Amazon Elasticsearch Service, see AWS Services in
Scope by Compliance Program.
• Security in the cloud – Your responsibility is determined by the AWS service that you use. You are also
responsible for other factors including the sensitivity of your data, your company’s requirements, and
applicable laws and regulations.
This documentation helps you understand how to apply the shared responsibility model when using
Amazon ES. The following topics show you how to configure Amazon ES to meet your security and
compliance objectives. You also learn how to use other AWS services that help you to monitor and secure
your Amazon ES resources.
Topics
• Data Protection in Amazon Elasticsearch Service (p. 61)
• Identity and Access Management in Amazon Elasticsearch Service (p. 65)
• Fine-Grained Access Control in Amazon Elasticsearch Service (p. 77)
• Logging and Monitoring in Amazon Elasticsearch Service (p. 96)
• Compliance Validation for Amazon Elasticsearch Service (p. 98)
• Resilience in Amazon Elasticsearch Service (p. 99)
• Infrastructure Security in Amazon Elasticsearch Service (p. 99)
• Amazon Cognito Authentication for Kibana (p. 100)
• Using Service-Linked Roles for Amazon ES (p. 111)
Data Protection in Amazon Elasticsearch Service
For data protection purposes, we recommend that you protect AWS account credentials and set up
individual user accounts with AWS Identity and Access Management (IAM), so that each user is given only
the permissions necessary to fulfill their job duties. We also recommend that you secure your data in the
following ways:
We strongly recommend that you never put sensitive identifying information, such as your customers'
account numbers, into domain names, index names, document types, or document IDs. Elasticsearch uses
these names in its Uniform Resource Identifiers (URIs). Servers and applications often log HTTP requests,
which can lead to unnecessary data exposure if URIs contain sensitive information.
For more information about data protection, see the AWS Shared Responsibility Model and GDPR blog
post on the AWS Security Blog.
Encryption of Data at Rest for Amazon Elasticsearch Service
Amazon ES domains offer encryption of data at rest. The feature uses AWS KMS to store and manage
your encryption keys. If enabled, it encrypts the following aspects of a domain:
• Indices
• Elasticsearch logs
• Swap files
• All other data in the application directory
• Automated snapshots
The following are not encrypted when you enable encryption of data at rest, but you can take additional
steps to protect them:
• Manual snapshots: Currently, you can't use KMS master keys to encrypt manual snapshots. You
can, however, use server-side encryption with S3-managed keys to encrypt the bucket that you use
as a snapshot repository. For instructions, see the section called “Registering a Manual Snapshot
Repository” (p. 47).
• Slow logs and error logs: If you publish logs (p. 41) and want to encrypt them, you can encrypt their
CloudWatch Logs log group using the same AWS KMS master key as the Amazon ES domain. For more
information, see Encrypt Log Data in CloudWatch Logs Using AWS KMS in the Amazon CloudWatch
Logs User Guide.
Amazon ES supports only symmetric customer master keys, not asymmetric ones. To learn how to create
symmetric customer master keys, see Creating Keys in the AWS Key Management Service Developer Guide.
Regardless of whether encryption at rest is enabled, all domains automatically encrypt custom
packages (p. 147) using AES-256 and Amazon ES-managed keys.
To use the Amazon ES console to create a domain that encrypts data at rest, you must have read-only
permissions to AWS KMS, such as the following identity-based policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"kms:List*",
"kms:Describe*"
],
"Resource": "*"
}
]
}
If you want to use a key other than (Default) aws/es, you must also have permissions to create grants for
the key. These permissions typically take the form of a resource-based policy that you specify when you
create the key.
If you want to keep your key exclusive to Amazon ES, you can add the kms:ViaService condition to the
key policy:
"Condition": {
"StringEquals": {
"kms:ViaService": "es.us-west-1.amazonaws.com"
},
"Bool": {
"kms:GrantIsForAWSResource": "true"
}
}
For more information, see Using Key Policies in AWS KMS in the AWS Key Management Service Developer
Guide.
Warning
If you delete the key that you used to encrypt a domain, the domain becomes inaccessible. The
Amazon ES team can't help you recover your data. AWS KMS deletes master keys only after a
waiting period of at least seven days, so the Amazon ES team might contact you if they detect
that your domain is at risk.
Other Considerations
• Automatic key rotation preserves the properties of your AWS KMS master keys, so the rotation has no
effect on your ability to access your Elasticsearch data. Encrypted Amazon ES domains don't support
manual key rotation, which involves creating a new master key and updating any references to the old
key. To learn more, see Rotating Customer Master Keys in the AWS Key Management Service Developer
Guide.
• Certain instance types don't support encryption of data at rest. For details, see the section called
“Supported Instance Types” (p. 212).
• Domains that encrypt data at rest use a different repository name for their automated snapshots. For
more information, see the section called “Restoring Snapshots” (p. 50).
• Encrypting an Amazon ES domain requires a grant, and each encryption key has a limit of 500 grants
per principal. This limit means that the maximum number of Amazon ES domains that you can encrypt
using a single key is 500. Currently, Amazon ES supports a maximum of 100 domains per account (per
Region), so this grant limit is of no consequence. If the domain limit per account increases, however,
the grant limit might become relevant.
If you need to encrypt more than 500 domains at that time, you can create additional keys. Keys are
regional, not global, so if you operate in more than one AWS Region, you already need multiple keys.
Node-to-node Encryption for Amazon Elasticsearch Service
Each Amazon ES domain—regardless of whether the domain uses VPC access—resides within its
own, dedicated VPC. This architecture prevents potential attackers from intercepting traffic between
Elasticsearch nodes and keeps the cluster secure. By default, however, traffic within the VPC is
unencrypted. Node-to-node encryption enables TLS 1.2 encryption for all communications within the
VPC.
If you send data to Amazon ES over HTTPS, node-to-node encryption helps ensure that your data
remains encrypted as Elasticsearch distributes (and redistributes) it throughout the cluster. If data arrives
unencrypted over HTTP, Amazon ES encrypts it after it reaches the cluster. You can require that all traffic
to the domain arrive over HTTPS using the console, AWS CLI, or configuration API.
Other Considerations
• Kibana still works on domains that use node-to-node encryption.
Types of Policies
Amazon ES supports three types of access policies:
Resource-based Policies
You add a resource-based policy, sometimes called the domain access policy, when you create a domain.
These policies specify which actions a principal can perform on the domain's subresources. Subresources
include Elasticsearch indices and APIs.
The Principal element specifies the accounts, users, or roles that are allowed access. The Resource
element specifies which subresources these principals can access. The following resource-based policy
grants test-user full access (es:*) to test-domain:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::123456789012:user/test-user"
]
},
"Action": [
"es:*"
],
"Resource": "arn:aws:es:us-west-1:987654321098:domain/test-domain/*"
}
]
}
• These privileges apply only to this domain. Unless you create additional policies, test-user can't
access data from other domains.
• The trailing /* in the Resource element is significant. Resource-based policies only apply to the
domain's subresources, not the domain itself.
For example, test-user can make requests against an index (GET https://search-test-domain.us-west-1.es.amazonaws.com/test-index), but can't update the domain's configuration.
You might later decide that test-user needs only the ability to search a single index. The following
policy narrows the user's access to the _search API of test-index:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::123456789012:user/test-user"
]
},
"Action": [
"es:ESHttpGet"
],
"Resource": "arn:aws:es:us-west-1:987654321098:domain/test-domain/test-index/_search"
}
]
}
Now test-user can perform only one operation: searches against test-index. All other indices within
the domain are inaccessible, and without permissions to use the es:ESHttpPut or es:ESHttpPost
actions, test-user can't add or modify documents.
Next, you might decide to configure a role for power users. This policy gives power-user-role access
to the HTTP GET and PUT methods for all URIs in the index:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::123456789012:role/power-user-role"
]
},
"Action": [
"es:ESHttpGet",
"es:ESHttpPut"
],
"Resource": "arn:aws:es:us-west-1:987654321098:domain/test-domain/test-index/*"
}
]
}
For information about all available actions, see the section called “Policy Element Reference” (p. 70).
Identity-based Policies
Unlike resource-based policies, which are a part of each Amazon ES domain, you attach identity-based
policies to users or roles using the AWS Identity and Access Management (IAM) service. Just like resource-
based policies (p. 65), identity-based policies specify who can access a service, which actions they can
perform, and if applicable, the resources on which they can perform those actions.
While they certainly don't have to be, identity-based policies tend to be more generic. They often govern
only the configuration API actions a user can perform. After you have these policies in place, you can use
resource-based policies in Amazon ES to offer users access to Elasticsearch indices and APIs.
Because identity-based policies attach to users or roles (principals), the JSON doesn't specify a principal.
The following policy grants access to actions that begin with Describe and List. This combination of
actions provides read-only access to domain configurations, but not to the data stored in the domain
itself:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"es:Describe*",
"es:List*"
],
"Effect": "Allow",
"Resource": "*"
}
]
}
An administrator might have full access to Amazon ES and all data stored on all domains:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"es:*"
],
"Effect": "Allow",
"Resource": "*"
}
]
}
For more information about the differences between resource-based and identity-based policies, see IAM
Policies in the IAM User Guide.
Note
Users with the AWS-managed AmazonESReadOnlyAccess policy can't see cluster health status
on the console. To allow them to see cluster health status, add the "es:ESHttpGet" action to
an access policy and attach it to their accounts or roles.
IP-based Policies
IP-based policies restrict access to a domain to one or more IP addresses or CIDR blocks. Technically, IP-
based policies are not a distinct type of policy. Instead, they are just resource-based policies that specify
an anonymous principal and include a special Condition element.
The primary appeal of IP-based policies is that they allow unsigned requests to an Amazon ES domain,
which lets you use clients like curl and Kibana (p. 179) or access the domain through a proxy server. To
learn more, see the section called “Using a Proxy to Access Amazon ES from Kibana” (p. 179).
Note
If you enabled VPC access for your domain, you can't configure an IP-based policy. Instead,
you can use security groups to control which IP addresses can access the domain. For more
information, see the section called “About Access Policies on VPC Domains” (p. 23).
The following policy grants all HTTP requests that originate from the specified IP range access to test-
domain:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": [
"es:ESHttp*"
],
"Condition": {
"IpAddress": {
"aws:SourceIp": [
"192.0.2.0/24"
]
}
},
"Resource": "arn:aws:es:us-west-1:987654321098:domain/test-domain/*"
}
]
}
If your domain has a public endpoint and doesn't use fine-grained access control (p. 77), we
recommend combining IAM principals and IP addresses. This policy grants test-user HTTP access only
if the request originates from the specified IP range:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::987654321098:user/test-user"
]
},
"Action": [
"es:ESHttp*"
],
"Condition": {
"IpAddress": {
"aws:SourceIp": [
"192.0.2.0/24"
]
}
},
"Resource": "arn:aws:es:us-west-1:987654321098:domain/test-domain/*"
}]
}
• To make calls to the Amazon ES configuration API, we recommend that you use one of the AWS SDKs.
The SDKs greatly simplify the process and can save you a significant amount of time compared to
creating and signing your own requests. The configuration API endpoints use the following format:
es.region.amazonaws.com/2015-01-01/
For example, the following request makes a configuration change to the movies domain, but you have
to sign it yourself (not recommended):
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/domain/movies/config
{
"ElasticsearchClusterConfig": {
"InstanceType": "c5.xlarge.elasticsearch"
}
}
If you use one of the SDKs, such as Boto 3, the SDK automatically handles the request signing:
import boto3
client = boto3.client('es')
response = client.update_elasticsearch_domain_config(
DomainName='movies',
ElasticsearchClusterConfig={
'InstanceType': 'c5.xlarge.elasticsearch'
}
)
For a Java code sample, see the section called “Using the AWS SDKs” (p. 124).
• To make calls to the Elasticsearch APIs, you must sign your own requests. For sample code in a variety
of languages, see the section called “Signing HTTP Requests” (p. 114). The Elasticsearch APIs use the
following format:
domain-id.region.es.amazonaws.com
For example, the following request searches the movies index for thor:
GET https://my-domain.us-east-1.es.amazonaws.com/movies/_search?q=thor
Note
The service ignores parameters passed in URLs for HTTP POST requests that are signed with
Signature Version 4.
For example, if a resource-based policy grants you access to a domain, but an identity-based policy
denies you access, you are denied access. If an identity-based policy grants access and a resource-based
policy does not specify whether or not you should have access, you are allowed access. See the following
table of intersecting policies for a full summary of outcomes.
Policy Element Reference
Version The current version of the policy language is 2012-10-17. All access policies
should specify this value.
Effect This element specifies whether the statement allows or denies access to the
specified actions. Valid values are Allow or Deny.
Principal This element specifies the AWS account or IAM user or role that is allowed or
denied access to a resource.
Action This element specifies the actions that the statement allows or denies.
Amazon ES supports the following actions:
• es:ESHttpDelete
• es:ESHttpGet
• es:ESHttpHead
• es:ESHttpPost
• es:ESHttpPut
• es:ESHttpPatch
• es:AddTags
• es:CreateElasticsearchDomain
• es:CreateElasticsearchServiceRole
• es:DeleteElasticsearchDomain
• es:DeleteElasticsearchServiceRole
• es:DescribeElasticsearchDomain
• es:DescribeElasticsearchDomainConfig
• es:DescribeElasticsearchDomains
• es:DescribeElasticsearchInstanceTypeLimits
• es:DescribeReservedElasticsearchInstanceOfferings
• es:DescribeReservedElasticsearchInstances
• es:ESCrossClusterGet
• es:GetCompatibleElasticsearchVersions
• es:ListDomainNames
• es:ListElasticsearchInstanceTypeDetails
• es:ListElasticsearchInstanceTypes
• es:ListElasticsearchVersions
• es:ListTags
• es:PurchaseReservedElasticsearchInstanceOffering
• es:RemoveTags
• es:UpdateElasticsearchDomainConfig
Tip
You can use wildcards to specify a subset of actions, such as
"Action":"es:*" or "Action":"es:Describe*".
An identity-based policy (p. 66) can list all es: actions and
group them according to whether they apply to the domain subresources
(test-domain/*), to the domain configuration (test-domain), or only to
the service (*).
Note
While resource-level permissions for
es:CreateElasticsearchDomain might seem unintuitive
—after all, why give a user permissions to create a domain
that already exists?—the use of a wildcard lets you enforce a
simple naming scheme for your domains, such as "Resource":
"arn:aws:es:us-west-1:987654321098:domain/my-team-
name-*".
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"es:ESHttpGet",
"es:DescribeElasticsearchDomain"
],
"Resource": "*"
}
]
}
To learn more about pairing actions and resources, see the Resource
element in this table.
Condition This element specifies conditions under which a statement applies. When
configuring an IP-based policy (p. 67), for example, you specify the IP addresses
or CIDR block as a condition, such as the following:
"Condition": {
"IpAddress": {
"aws:SourceIp": [
"192.0.2.0/32"
]
}
}
"Resource": "*"
"Resource": "arn:aws:es:region:aws-account-id:domain/domain-
name"
"Resource": "arn:aws:es:region:aws-account-id:domain/domain-
name/*"
You don't have to use a wildcard. Amazon ES lets you define a different
access policy for each Elasticsearch index or API. For example, you might
limit a user's permissions to the test-index index:
"Resource": "arn:aws:es:region:aws-account-id:domain/domain-
name/test-index"
Instead of full access to test-index, you might prefer to limit the policy
to just the search API:
"Resource": "arn:aws:es:region:aws-account-id:domain/domain-
name/test-index/_search"
"Resource": "arn:aws:es:region:aws-account-id:domain/domain-
name/test-index/test-type/1"
For details about which actions support resource-level permissions, see the
Action element in this table.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::123456789012:user/test-user"
]
},
"Action": [
"es:ESHttp*"
],
"Resource": [
"arn:aws:es:us-west-1:987654321098:domain/test-domain/test-index/*",
"arn:aws:es:us-west-1:987654321098:domain/test-domain/_bulk"
]
},
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::123456789012:user/test-user"
]
},
"Action": [
"es:ESHttpGet"
],
"Resource": "arn:aws:es:us-west-1:987654321098:domain/test-domain/restricted-index/*"
}
]
}
This policy grants test-user full access to test-index and the Elasticsearch bulk API. It also allows
GET requests to restricted-index.
The following indexing request, as you might expect, fails due to a permissions error:
PUT https://search-test-domain.us-west-1.es.amazonaws.com/restricted-index/movie/1
{
"title": "Your Name",
"director": "Makoto Shinkai",
"year": "2016"
}
Unlike the index API, the bulk API lets you create, update, and delete many documents in a single call.
You often specify these operations in the request body, however, rather than in the request URL. Because
Amazon ES uses URLs to control access to domain subresources, test-user can, in fact, use the bulk API
to make changes to restricted-index. Even though the user lacks POST permissions on the index, the
following request succeeds:
POST https://search-test-domain.us-west-1.es.amazonaws.com/_bulk
{ "index" : { "_index": "restricted-index", "_type" : "movie", "_id" : "1" } }
{ "title": "Your Name", "director": "Makoto Shinkai", "year": "2016" }
In this situation, the access policy fails to fulfill its intent. To prevent users from bypassing these kinds
of restrictions, you can change rest.action.multi.allow_explicit_index to false. If this value
is false, all calls to the bulk, mget, and msearch APIs that specify index names in the request body stop
working. In other words, calls to _bulk no longer work, but calls to test-index/_bulk do. This second
endpoint contains an index name, so you don't need to specify one in the request body.
Kibana (p. 179) relies heavily on mget and msearch, so it is unlikely to work properly after this change.
For partial remediation, you can leave rest.action.multi.allow_explicit_index as true and
deny certain users access to one or more of these APIs.
For information about changing this setting, see the section called “Advanced Options” (p. 13).
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::123456789012:user/test-user"
},
"Action": "es:ESHttp*",
"Resource": "arn:aws:es:us-west-1:987654321098:domain/test-domain/*"
},
{
"Effect": "Deny",
"Principal": {
"AWS": "arn:aws:iam::123456789012:user/test-user"
},
"Action": "es:ESHTTP*",
"Resource": "arn:aws:es:us-west-1:987654321098:domain/test-domain/restricted-index/*"
}
]
}
• Despite the explicit deny, test-user can still make calls such as GET https://search-test-domain.us-west-1.es.amazonaws.com/_all/_search and GET https://search-test-domain.us-west-1.es.amazonaws.com/*/_search to access the documents in restricted-index.
• Because the Resource element references restricted-index/*, test-user doesn't have
permissions to directly access the index's documents. The user does, however, have permissions to
delete the entire index. To prevent access and deletion, the policy instead must specify restricted-
index*.
Rather than mixing broad allows and focused denies, the safest approach is to follow the principle of
least privilege and grant only the permissions that are required to perform a task. For more information
about controlling access to individual indices or Elasticsearch operations, see the section called “Fine-
Grained Access Control” (p. 77).
Topics
• The Bigger Picture: Fine-Grained Access Control and Amazon ES Security (p. 77)
• Key Concepts (p. 82)
• Enabling Fine-Grained Access Control (p. 82)
• Accessing Kibana as the Master User (p. 82)
• Managing Permissions (p. 83)
• Recommended Configurations (p. 86)
• Tutorial: IAM Master User and Amazon Cognito (p. 87)
• Tutorial: Internal User Database and HTTP Basic Authentication (p. 90)
• Limitations (p. 91)
• Modifying the Master User (p. 92)
• Additional Master Users (p. 92)
• Manual Snapshots (p. 94)
• Integrations (p. 94)
• REST API Differences (p. 94)
Network
The first security layer is the network, which determines whether requests reach an Amazon
ES domain. If you choose Public access when you create a domain, requests from any internet-
connected client can reach the domain endpoint. If you choose VPC access, clients must connect to
the VPC (and the associated security groups must permit it) for a request to reach the endpoint. For
more information, see the section called “VPC Support” (p. 20).
The second security layer is the domain access policy. After a request reaches a domain endpoint, the
resource-based access policy (p. 65) allows or denies the request access to a given URI. The access
policy accepts or rejects requests at the "edge" of the domain, before they reach Elasticsearch itself.
The third and final security layer is fine-grained access control. After a resource-based access
policy allows a request to reach a domain endpoint, fine-grained access control evaluates the
user credentials and either authenticates the user or denies the request. If fine-grained access
control authenticates the user, it fetches all roles mapped to that user and uses the complete set of
permissions to determine how to handle the request.
Note
If a resource-based access policy contains IAM users or roles, clients must send signed requests
using AWS Signature Version 4. As such, access policies can conflict with fine-grained access
control, especially if you use the internal user database and HTTP basic authentication. You can't
sign a request with both a user name and password and IAM credentials. In general, if you enable
fine-grained access control, we recommend using a domain access policy that doesn't require
signed requests.
One common configuration is a VPC access domain with fine-grained access control enabled, an
IAM-based access policy, and an IAM master user. Another common configuration is a public access
domain with fine-grained access control enabled, an access policy that doesn't use IAM principals,
and a master user in the internal user database.
Example
Consider a GET request to movies/_search?q=thor. Does the user have permissions to search the
movies index? If so, does the user have permissions to see all documents within it? Should the response
omit or anonymize any fields? For the master user, the response might look like this:
{
"hits": {
"total": 7,
"max_score": 8.772789,
"hits": [{
"_index": "movies",
"_type": "_doc",
"_id": "tt0800369",
"_score": 8.772789,
"_source": {
"directors": [
"Kenneth Branagh",
"Joss Whedon"
],
"release_date": "2011-04-21T00:00:00Z",
"genres": [
"Action",
"Adventure",
"Fantasy"
],
"plot": "The powerful but arrogant god Thor is cast out of Asgard to live amongst
humans in Midgard (Earth), where he soon becomes one of their finest defenders.",
"title": "Thor",
"actors": [
"Chris Hemsworth",
"Anthony Hopkins",
"Natalie Portman"
],
"year": 2011
}
},
...
]
}
}
If a user with more limited permissions issues the exact same request, the response might look like this:
{
"hits": {
"total": 2,
"max_score": 8.772789,
"hits": [{
"_index": "movies",
"_type": "_doc",
"_id": "tt0800369",
"_score": 8.772789,
"_source": {
"year": 2011,
"release_date":
"3812a72c6dd23eef3c750c2d99e205cbd260389461e19d610406847397ecb357",
"plot": "The powerful but arrogant god Thor is cast out of Asgard to live amongst
humans in Midgard (Earth), where he soon becomes one of their finest defenders.",
"title": "Thor"
}
},
...
]
}
}
The response has fewer hits and fewer fields for each hit. Also, the release_date field is anonymized. If
a user with no permissions makes the same request, the cluster returns an error:
{
"error": {
"root_cause": [{
"type": "security_exception",
"reason": "no permissions for [indices:data/read/search] and User [name=limited-user,
roles=[], requestedTenant=null]"
}],
"type": "security_exception",
"reason": "no permissions for [indices:data/read/search] and User [name=limited-user,
roles=[], requestedTenant=null]"
},
"status": 403
}
Key Concepts
Roles are the core way of using fine-grained access control. In this case, roles are distinct from IAM roles.
Roles contain any combination of permissions: cluster-wide, index-specific, document level, and field
level.
After configuring a role, you map it to one or more users. For example, you might map three roles to a
single user: one role that provides access to Kibana, one that provides read-only access to index1, and
one that provides write access to index2. Or you could include all of those permissions in a single role.
Users are people or applications that make requests to the Elasticsearch cluster. Users have credentials—
either IAM access keys or a user name and password—that they specify when they make requests. With
fine-grained access control on Amazon Elasticsearch Service, you choose one or the other for your master
user when you configure your domain. The master user has full permissions to the cluster and manages
roles and role mappings.
• If you choose IAM for your master user, all requests to the cluster must be signed using AWS Signature
Version 4. For sample code, see the section called “Signing HTTP Requests” (p. 114).
We recommend IAM if you want to use the same users on multiple clusters, if you want to use Amazon
Cognito to access Kibana (with or without an external identity provider), or if you have Elasticsearch
clients that support Signature Version 4 signing.
• If you choose the internal user database, you can use HTTP basic authentication (as well as IAM
credentials) to make requests to the cluster. Most clients support basic authentication, including curl.
The internal user database is stored in an Elasticsearch index, so you can't share it with other clusters.
We recommend the internal user database if you don't need to reuse users across multiple clusters, if
you want to use HTTP basic authentication to access Kibana (rather than Amazon Cognito), or if you
have clients that only support basic authentication. The internal user database is the simplest way to
get started with Amazon ES.
You can't enable fine-grained access control on existing domains, only new ones. After you enable fine-
grained access control, you can't disable it.
• If you choose to use IAM for user management, you must enable Amazon Cognito authentication
for Kibana (the section called “Authentication for Kibana” (p. 100)) and sign in using credentials
from your user pool to access Kibana. Otherwise, Kibana shows a nonfunctional sign-in page. See
the section called “Limitations” (p. 91).
One of the assumed roles from the Amazon Cognito identity pool must match the IAM role that you
specified for the master user. For more information about this configuration, see the section called
“(Optional) Configuring Granular Access” (p. 106) and the section called “Tutorial: IAM Master User
and Amazon Cognito” (p. 87).
• If you choose to use the internal user database, you can sign in to Kibana with your master user name
and password. You must access Kibana over HTTPS. For more information about this configuration, see
the section called “Tutorial: Internal User Database and HTTP Basic Authentication” (p. 90).
Managing Permissions
As noted in the section called “Key Concepts” (p. 82), you manage fine-grained access control
permissions using roles, users, and mappings. This section describes how to create and apply those
resources. We recommend that you sign in to Kibana as the master user (p. 82) to perform these
operations.
Creating Roles
You can create new roles for fine-grained access control using Kibana or the _opendistro/_security
operation in the REST API. For more information, see the Open Distro for Elasticsearch documentation.
Fine-grained access control also includes a number of predefined roles. Clients such as Kibana and
Logstash make a wide variety of requests to Elasticsearch, which can make it hard to manually create
roles with the minimum set of permissions. For example, the kibana_user role includes the permissions
that a user needs to work with index patterns, visualizations, dashboards, and tenants. We recommend
mapping it (p. 85) to any user or backend role that accesses Kibana, along with additional roles that
allow access to other indices.
Cluster-Level Security
Cluster-level permissions include the ability to make broad requests such as _mget, _msearch,
and _bulk, monitor health, take snapshots, and more. Manage these permissions using the Cluster
Permissions tab when creating a role. For a list of cluster-level action groups, see the Open Distro for
Elasticsearch documentation.
Index-Level Security
Index-level permissions include the ability to create new indices, search indices, read and write
documents, delete documents, manage aliases, and more. Manage these permissions using the Index
Permissions tab when creating a role. For a list of index-level action groups, see the Open Distro for
Elasticsearch documentation.
Document-Level Security
Document-level security lets you restrict which documents in an index a user can see. When creating a
role, specify an index pattern and an Elasticsearch query. Any users that you map to that role can see
only the documents that match the query. Document-level security affects the number of hits that you
receive when you search (p. 80).
For more information, see the Open Distro for Elasticsearch documentation.
Field-Level Security
Field-level security lets you control which document fields a user can see. When creating a role, add a
list of fields to either include or exclude. If you include fields, any users you map to that role can see only
those fields. If you exclude fields, they can see all fields except the excluded ones. Field-level security
affects the number of fields included in hits when you search (p. 80).
For more information, see the Open Distro for Elasticsearch documentation.
Field Masking
Field masking is an alternative to field-level security that lets you anonymize the data in a field rather
than remove it altogether. When creating a role, add a list of fields to mask. Field masking affects
whether you can see the contents of a field when you search (p. 80).
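As a sketch, a single role created through the REST API might combine document-level security, field-level security, and field masking. The role name, index pattern, query, and field names here are illustrative, not part of this guide's samples:
PUT _opendistro/_security/api/roles/movies-reader
{
  "index_permissions": [{
    "index_patterns": ["movies*"],
    "allowed_actions": ["read"],
    "dls": "{\"match\": {\"year\": 2011}}",
    "fls": ["~actors"],
    "masked_fields": ["release_date"]
  }]
}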
Creating Users
If you enabled the internal user database, you can create users using Kibana or the _opendistro/
_security operation in the REST API. For more information, see the Open Distro for Elasticsearch
documentation.
If you chose IAM for your master user, ignore this portion of Kibana. Create IAM users and IAM roles
instead. For more information, see the IAM User Guide.
Mapping Roles to Users
Backend roles offer another way of mapping roles to users. Rather than mapping the same role to dozens
of different users, you can map the role to a single backend role, and then make sure that all users have
that backend role. Backend roles can be IAM roles or arbitrary strings that you specify when you create
users in the internal user database.
• Specify users, IAM user ARNs, and Amazon Cognito user strings in the Users section. Cognito user
strings take the form of Cognito/user-pool-id/username.
• Specify backend roles and IAM role ARNs in the Backend roles section.
You can map roles to users using Kibana or the _opendistro/_security operation in the REST API.
For more information, see the Open Distro for Elasticsearch documentation.
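As a sketch, a mapping request through the REST API might look like the following. The role name and backend role ARN are illustrative:
PUT _opendistro/_security/api/rolesmapping/movies-reader
{
  "backend_roles": ["arn:aws:iam::123456789012:role/analysts"],
  "users": ["new-user"]
}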
Kibana Multi-Tenancy
Tenants are spaces for saving index patterns, visualizations, dashboards, and other Kibana objects.
Kibana multi-tenancy lets you safely share your work with other Kibana users (or keep it private). You can
control which roles have access to a tenant and whether those roles have read or write access. To learn
more, see the Open Distro for Elasticsearch documentation.
To view your current tenant or change tenants
1. Navigate to Kibana and sign in.
2. Choose Tenants.
3. Verify your tenant before creating visualizations or dashboards. If you want to share your work with
all other Kibana users, choose Global. To share your work with a subset of Kibana users, choose a
different shared tenant. Otherwise, choose Private.
Recommended Configurations
Due to how fine-grained access control interacts with other security features (p. 77), we recommend
several fine-grained access control configurations that work well for most use cases.
Tutorial: IAM Master User and Amazon Cognito
This tutorial covers a common fine-grained access control configuration: an IAM master user with
Amazon Cognito authentication for Kibana. It uses a domain with the following settings:
• Elasticsearch 7.7
• Public access
• Fine-grained access control enabled with an IAM role as the master user (IAMMasterUserRole
for the rest of this tutorial)
• Amazon Cognito authentication for Kibana (p. 100) enabled
• The following access policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"*"
]
},
"Action": [
"es:ESHttp*"
],
"Resource": "arn:aws:es:region:account:domain/domain-name/*"
}
]
}
The IAM role that you use as the master user (IAMMasterUserRole) needs a trust relationship that
allows the Amazon Cognito identity pool to assume it:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "cognito-identity.amazonaws.com"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"cognito-identity.amazonaws.com:aud": "identity-pool-id"
},
"ForAnyValue:StringLike": {
"cognito-identity.amazonaws.com:amr": "authenticated"
}
}
}]
}
The role that you create later in the tutorial uses the following document-level security query, which
limits users to documents in which FlightDelay is true:
{
"match": {
"FlightDelay": true
}
}
38. Choose Dev Tools and run the default search:
GET _search
{
"query": {
"match_all": {}
}
}
Note the permissions error. limited-user doesn't have permissions to run cluster-wide searches.
39. Run another search:
GET kibana_sample_data_flights/_search
{
"query": {
"match_all": {}
}
}
Note that all matching documents have a FlightDelay field of true, an anonymized Dest field,
and no FlightNum field.
40. In your original browser window, signed in as master-user, choose Dev Tools, and then perform
the same searches. Note the difference in permissions, number of hits, matching documents, and
included fields.
Tutorial: Internal User Database and HTTP Basic Authentication
This tutorial covers another common configuration: a master user in the internal user database with
HTTP basic authentication for Kibana. It uses a domain with the following settings:
• Elasticsearch 7.7
• Public access
• Fine-grained access control with a master user in the internal user database (TheMasterUser for
the rest of this tutorial)
• Amazon Cognito authentication for Kibana disabled
• The following access policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"*"
]
},
"Action": [
"es:ESHttp*"
],
"Resource": "arn:aws:es:region:account:domain/domain-name/*"
}
]
}
As in the first tutorial, the role that you create uses the following document-level security query:
{
"match": {
"FlightDelay": true
}
}
Only new-user has the kibana_user role, but all users with the new-backend-role backend role
have the new-role role.
20. In a new, private browser window, navigate to Kibana, sign in using new-user, and then choose
Explore on my own.
21. Choose Dev Tools and run the default search:
GET _search
{
"query": {
"match_all": {}
}
}
Note the permissions error. new-user doesn't have permissions to run cluster-wide searches.
22. Run another search:
GET kibana_sample_data_flights/_search
{
"query": {
"match_all": {}
}
}
Note that all matching documents have a FlightDelay field of true, an anonymized Dest field,
and no FlightNum field.
23. In your original browser window, signed in as TheMasterUser, choose Dev Tools and perform
the same searches. Note the difference in permissions, number of hits, matching documents, and
included fields.
Limitations
Fine-grained access control has several important limitations:
• The hosts aspect of role mappings, which maps roles to hostnames or IP addresses, doesn't work if
the domain is within a VPC. You can still map roles to users and backend roles.
• Users in the internal user database can't change their own passwords. Master users (or users with
equivalent permissions) must change their passwords for them.
• If you choose IAM for the master user and don't enable Amazon Cognito authentication, Kibana
displays a nonfunctional sign-in page.
• If you choose IAM for the master user, you can still create users in the internal user database. Because
HTTP basic authentication is not enabled under this configuration, however, any requests signed with
those user credentials are rejected.
• If you use SQL (p. 152) to query an index that you don't have access to, you receive a "no
permissions" error. If the index doesn't exist, you receive a "no such index" error. This difference in error
messages means that you can confirm the existence of an index if you happen to guess its name.
To minimize the issue, don't include sensitive information in index names (p. 129). To deny all access
to SQL, add the following element to your domain access policy:
{
"Effect": "Deny",
"Principal": {
"AWS": [
"*"
]
},
"Action": [
"es:*"
],
"Resource": "arn:aws:es:us-east-1:123456789012:domain/my-domain/_opendistro/_sql"
}
Modifying the Master User
If you later decide to use a different master user, note the following:
• If you previously used an IAM master user, fine-grained access control re-maps the all_access
role to the new IAM ARN that you specify.
• If you previously used the internal user database, fine-grained access control creates a new master
user. You can use the new master user to delete the old one.
6. Choose Submit.
Additional Master Users
You can use the master user to designate additional master users, either in Kibana or through the
REST API:
• In Kibana, choose Security, Role Mappings, and then map the new master user to the all_access
and security_manager roles.
PUT _opendistro/_security/api/rolesmapping/all_access
{
"backend_roles": [
"arn:aws:iam::123456789012:role/fourth-master-user"
],
"hosts": [],
"users": [
"master-user",
"second-master-user",
"arn:aws:iam::123456789012:user/third-master-user"
]
}
PUT _opendistro/_security/api/rolesmapping/security_manager
{
"backend_roles": [
"arn:aws:iam::123456789012:role/fourth-master-user"
],
"hosts": [],
"users": [
"master-user",
"second-master-user",
"arn:aws:iam::123456789012:user/third-master-user"
]
}
These requests replace the current role mappings, so perform GET requests first so that you can
include all current roles in the PUT requests. The REST API is especially useful if you can't access Kibana
and want to map an IAM role from Amazon Cognito to the all_access role.
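For example, to see the current all_access mappings before overwriting them:
GET _opendistro/_security/api/rolesmapping/all_access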
Manual Snapshots
Fine-grained access control introduces some additional complications with taking manual snapshots. To
register a snapshot repository—even if you use HTTP basic authentication for all other purposes—you
must map the manage_snapshots role to an IAM role that has iam:PassRole permissions to assume
TheSnapshotRole, as defined in the section called “Manual Snapshot Prerequisites” (p. 45).
Then use that IAM role to send a signed request to the domain, as outlined in the section called
“Registering a Manual Snapshot Repository” (p. 47).
Integrations
If you use other AWS services (p. 131) with Amazon ES, you must provide the IAM roles for those
services with appropriate permissions. For example, Kinesis Data Firehose delivery streams often
use an IAM role called firehose_delivery_role. In Kibana, create a role for fine-grained access
control (p. 84), and map the IAM role to it (p. 85). In this case, the new role needs the following
permissions:
{
"cluster_permissions": [
"cluster_composite_ops",
"cluster_monitor"
],
"index_permissions": [{
"index_patterns": [
"firehose-index*"
],
"allowed_actions": [
"create_index",
"manage",
"crud"
]
}]
}
Permissions vary based on the actions each service performs. An AWS IoT rule or AWS Lambda function
that indexes data likely needs similar permissions to Kinesis Data Firehose, while a Lambda function that
only performs searches can use a more limited set.
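As a sketch, a search-only role might need nothing more than read access to the relevant indices (the index pattern here is illustrative):
{
  "index_permissions": [{
    "index_patterns": ["firehose-index*"],
    "allowed_actions": ["read"]
  }]
}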
REST API Differences
The fine-grained access control REST API differs slightly depending on your Elasticsearch version. For
example, on Elasticsearch 6.x, a request to create a user specifies roles:
PUT _opendistro/_security/api/user/new-user
{
"password": "some-password",
"roles": ["new-backend-role"]
}
On Elasticsearch 7.x, the same request uses backend_roles instead:
PUT _opendistro/_security/api/user/new-user
{
"password": "some-password",
"backend_roles": ["new-backend-role"]
}
Responses to some GET requests also differ by version. For example, here is the all_access role on
Elasticsearch 6.x:
GET _opendistro/_security/api/roles/all_access
{
"all_access": {
"cluster": ["UNLIMITED"],
"tenants": {
"admin_tenant": "RW"
},
"indices": {
"*": {
"*": ["UNLIMITED"]
}
},
"readonly": "true"
}
}
GET _opendistro/_security/api/tenants
{
"global_tenant": {
"reserved": true,
"hidden": false,
"description": "Global tenant",
"static": false
}
}
For documentation on the 7.x REST API, see the Open Distro for Elasticsearch documentation.
Tip
If you use the internal user database, you can use curl to make requests and test your domain.
Try the following sample commands:
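# Indicative examples; replace the endpoint and credentials with your own values.
curl -XGET -u 'master-user:master-user-password' 'https://domain-endpoint/_search'
curl -XGET -u 'master-user:master-user-password' 'https://domain-endpoint/_opendistro/_security/api/user'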
Monitoring Amazon Elasticsearch Service API Calls with AWS CloudTrail
Amazon Elasticsearch Service integrates with AWS CloudTrail, a service that provides a record of
actions taken by a user, role, or AWS service in Amazon ES. CloudTrail captures all configuration API
calls for Amazon ES as events. The captured calls include calls from the Amazon ES console, AWS CLI,
or an AWS SDK. If you create a
trail, you can enable continuous delivery of CloudTrail events to an Amazon S3 bucket, including events
for Amazon ES. If you don't configure a trail, you can still view the most recent events on the CloudTrail
console in Event history. Using the information collected by CloudTrail, you can determine the request
that was made to Amazon ES, the IP address from which the request was made, who made the request,
when it was made, and additional details.
To learn more about CloudTrail, see the AWS CloudTrail User Guide.
For an ongoing record of events in your AWS account, including events for Amazon ES, create a trail.
A trail enables CloudTrail to deliver log files to an Amazon S3 bucket. By default, when you create a
trail in the console, the trail applies to all AWS Regions. The trail logs events from all Regions in the
AWS partition and delivers the log files to the Amazon S3 bucket that you specify. Additionally, you can
configure other AWS services to further analyze and act upon the event data collected in CloudTrail logs.
For more information, see the following:
All Amazon ES configuration API actions are logged by CloudTrail and are documented in the Amazon ES
Configuration API Reference (p. 271).
Every event or log entry contains information about who generated the request. The identity
information helps you determine the following:
• Whether the request was made with root or AWS Identity and Access Management (IAM) user
credentials
• Whether the request was made with temporary security credentials for a role or federated user
• Whether the request was made by another AWS service
The following example shows a CloudTrail log entry that demonstrates the
CreateElasticsearchDomain operation:
{
"eventVersion": "1.05",
"userIdentity": {
"type": "IAMUser",
"principalId": "AIDACKCEVSQ6C2EXAMPLE",
"arn": "arn:aws:iam::123456789012:user/test-user",
"accountId": "123456789012",
"accessKeyId": "access-key",
"userName": "test-user",
"sessionContext": {
"attributes": {
"mfaAuthenticated": "false",
"creationDate": "2018-08-21T21:59:11Z"
}
},
"invokedBy": "signin.amazonaws.com"
},
"eventTime": "2018-08-21T22:00:05Z",
"eventSource": "es.amazonaws.com",
"eventName": "CreateElasticsearchDomain",
"awsRegion": "us-west-1",
"sourceIPAddress": "123.123.123.123",
"userAgent": "signin.amazonaws.com",
"requestParameters": {
"elasticsearchVersion": "6.3",
"elasticsearchClusterConfig": {
"instanceType": "m4.large.elasticsearch",
"instanceCount": 1
},
"snapshotOptions": {
"automatedSnapshotStartHour": 0
},
"domainName": "test-domain",
"encryptionAtRestOptions": {},
"eBSOptions": {
"eBSEnabled": true,
"volumeSize": 10,
"volumeType": "gp2"
},
"accessPolicies": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow
\",\"Principal\":{\"AWS\":[\"123456789012\"]},\"Action\":[\"es:*\"],\"Resource\":
\"arn:aws:es:us-west-1:123456789012:domain/test-domain/*\"}]}",
"advancedOptions": {
"rest.action.multi.allow_explicit_index": "true"
}
},
"responseElements": {
"domainStatus": {
"created": true,
"elasticsearchClusterConfig": {
"zoneAwarenessEnabled": false,
"instanceType": "m4.large.elasticsearch",
"dedicatedMasterEnabled": false,
"instanceCount": 1
},
"cognitoOptions": {
"enabled": false
},
"encryptionAtRestOptions": {
"enabled": false
},
"advancedOptions": {
"rest.action.multi.allow_explicit_index": "true"
},
"upgradeProcessing": false,
"snapshotOptions": {
"automatedSnapshotStartHour": 0
},
"eBSOptions": {
"eBSEnabled": true,
"volumeSize": 10,
"volumeType": "gp2"
},
"elasticsearchVersion": "6.3",
"processing": true,
"aRN": "arn:aws:es:us-west-1:123456789012:domain/test-domain",
"domainId": "123456789012/test-domain",
"deleted": false,
"domainName": "test-domain",
"accessPolicies": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",
\"Principal\":{\"AWS\":\"arn:aws:iam::123456789012:root\"},\"Action\":\"es:*\",\"Resource
\":\"arn:aws:es:us-west-1:123456789012:domain/test-domain/*\"}]}"
}
},
"requestID": "12345678-1234-1234-1234-987654321098",
"eventID": "87654321-4321-4321-4321-987654321098",
"eventType": "AwsApiCall",
"recipientAccountId": "123456789012"
}
Compliance Validation for Amazon Elasticsearch Service
Third-party auditors assess the security and compliance of Amazon Elasticsearch Service as part of
multiple AWS compliance programs. For a list of AWS services in scope of specific compliance programs, see AWS Services in Scope by
Compliance Program. For general information, see AWS Compliance Programs.
You can download third-party audit reports using AWS Artifact. For more information, see Downloading
Reports in AWS Artifact.
Your compliance responsibility when using Amazon ES is determined by the sensitivity of your data,
your company's compliance objectives, and applicable laws and regulations. AWS provides the following
resources to help with compliance:
• Security and Compliance Quick Start Guides – These deployment guides discuss architectural
considerations and provide steps for deploying security- and compliance-focused baseline
environments on AWS.
• Architecting for HIPAA Security and Compliance Whitepaper – This whitepaper describes how
companies can use AWS to create HIPAA-compliant applications.
• AWS Compliance Resources – This collection of workbooks and guides might apply to your industry
and location.
• AWS Config – This AWS service assesses how well your resource configurations comply with internal
practices, industry guidelines, and regulations.
• AWS Security Hub – This AWS service provides a comprehensive view of your security state within AWS
that helps you check your compliance with security industry standards and best practices.
Resilience in Amazon Elasticsearch Service
The AWS global infrastructure is built around AWS Regions and Availability Zones. For more
information about AWS Regions and Availability Zones, see AWS Global Infrastructure.
In addition to the AWS global infrastructure, Amazon ES offers several features to help support your data
resiliency and backup needs:
• Automated and manual snapshots of your domain's data
• Multi-AZ domains, which distribute data nodes across Availability Zones
Infrastructure Security in Amazon Elasticsearch Service
As a managed service, Amazon Elasticsearch Service is protected by the AWS global network security
procedures. You use AWS published API calls to access the Amazon ES configuration API through the network. Clients
must support Transport Layer Security (TLS) 1.0 or later. We recommend TLS 1.2 or later. Clients must
also support cipher suites with perfect forward secrecy (PFS) such as Ephemeral Diffie-Hellman (DHE) or
Elliptic Curve Ephemeral Diffie-Hellman (ECDHE). Most modern systems such as Java 7 and later support
these modes.
Additionally, requests to the configuration API must be signed by using an access key ID and a secret
access key that is associated with an IAM principal. Or you can use the AWS Security Token Service (AWS
STS) to generate temporary security credentials to sign requests.
Depending on your domain configuration, you might also need to sign requests to the Elasticsearch APIs.
For more information, see the section called “Making and Signing Amazon ES Requests” (p. 68).
Amazon ES supports public access domains, which can receive requests from any internet-connected
device, and VPC access domains (p. 20), which are isolated from the public internet.
Amazon Cognito Authentication for Kibana
Much of the authentication process occurs in Amazon Cognito, but this section offers guidelines and
requirements for configuring Amazon Cognito resources to work with Amazon ES domains. Standard
pricing applies to all Amazon Cognito resources.
Tip
The first time that you configure a domain to use Amazon Cognito authentication for Kibana, we
recommend using the console. Amazon Cognito resources are extremely customizable, and the
console can help you identify and understand the features that matter to you.
Topics
• Prerequisites (p. 100)
• Configuring an Amazon ES Domain (p. 102)
• Allowing the Authenticated Role (p. 104)
• Configuring Identity Providers (p. 104)
• (Optional) Configuring Granular Access (p. 106)
• (Optional) Customizing the Sign-in Page (p. 108)
• (Optional) Configuring Advanced Security (p. 108)
• Testing (p. 108)
• Limits (p. 108)
• Common Configuration Issues (p. 109)
• Disabling Amazon Cognito Authentication for Kibana (p. 111)
• Deleting Domains that Use Amazon Cognito Authentication for Kibana (p. 111)
Prerequisites
Before you can configure Amazon Cognito authentication for Kibana, you must fulfill several
prerequisites. The Amazon ES console helps streamline the creation of these resources, but
understanding the purpose of each resource helps with configuration and troubleshooting. Amazon
Cognito authentication for Kibana requires the following resources:
Note
The user pool and identity pool must be in the same AWS Region. You can use the same user
pool, identity pool, and IAM role to add Amazon Cognito authentication for Kibana to multiple
Amazon ES domains. To learn more, see the section called “Limits” (p. 108).
When you create a user pool to use with Amazon ES, consider the following:
• Your Amazon Cognito user pool must have a domain name. Amazon ES uses this domain name to
redirect users to a login page for accessing Kibana. Other than a domain name, the user pool doesn't
require any non-default configuration.
• You must specify the pool's required standard attributes—attributes like name, birth date, email
address, and phone number. You can't change these attributes after you create the user pool, so
choose the ones that matter to you at this time.
• While creating your user pool, choose whether users can create their own accounts, the minimum
password strength for accounts, and whether to enable multi-factor authentication. If you plan to use
an external identity provider, these settings are inconsequential. Technically, you can enable the user
pool as an identity provider and enable an external identity provider, but most people prefer one or
the other.
User pool IDs take the form of region_ID. If you plan to use the AWS CLI or an AWS SDK to configure
Amazon ES, make note of the ID.
• If you use the Amazon Cognito console, you must select the Enable access to unauthenticated
identities check box to create the identity pool. After you create the identity pool and configure the
Amazon ES domain (p. 102), Amazon Cognito disables this setting.
• You don't need to add external identity providers to the identity pool. When you configure Amazon
ES to use Amazon Cognito authentication, it configures the identity pool to use the user pool that you
just created.
• After you create the identity pool, you must choose unauthenticated and authenticated IAM roles.
These roles specify the access policies that users have before and after they log in. If you use the
Amazon Cognito console, it can create these roles for you. After you create the authenticated
role, make note of the ARN, which takes the form of arn:aws:iam::123456789012:role/
Cognito_identitypoolAuth_Role.
Identity pool IDs take the form of region:ID-ID-ID-ID-ID. If you plan to use the AWS CLI or an AWS
SDK to configure Amazon ES, make note of the ID.
If you use the AWS CLI or one of the AWS SDKs, you must create your own role, attach the policy, and
specify the ARN for this role when you configure your Amazon ES domain. The role must have the
following trust relationship:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "es.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
For instructions, see Creating a Role to Delegate Permissions to an AWS Service and Attaching and
Detaching IAM Policies in the IAM User Guide.
If you use the console and want Amazon ES to create the CognitoAccessForAmazonES role for you,
you need the following IAM permissions:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"iam:GetRole",
"iam:PassRole",
"iam:CreateRole",
"iam:AttachRolePolicy",
"ec2:DescribeVpcs",
"cognito-identity:ListIdentityPools",
"cognito-idp:ListUserPools"
],
"Resource": "*"
}
]
}
If the CognitoAccessForAmazonES role already exists, you need fewer permissions:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": [
"ec2:DescribeVpcs",
"cognito-identity:ListIdentityPools",
"cognito-idp:ListUserPools"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"iam:GetRole",
"iam:PassRole"
],
"Resource": "arn:aws:iam::123456789012:role/CognitoAccessForAmazonES"
}
]
}
After your domain finishes processing, see the section called “Allowing the Authenticated Role” (p. 104)
and the section called “Configuring Identity Providers” (p. 104) for additional configuration steps.
To configure Amazon Cognito authentication for Kibana using the AWS CLI, include the following
parameter when you create or update your domain:
--cognito-options Enabled=true,UserPoolId="user-pool-id",IdentityPoolId="identity-pool-id",RoleArn="arn:aws:iam::123456789012:role/CognitoAccessForAmazonES"
Example
The following example creates a domain in the us-east-1 Region that enables Amazon Cognito
authentication for Kibana using the CognitoAccessForAmazonES role and provides domain access to
Cognito_Auth_Role:
east-1:12345678-1234-1234-1234-123456789012",RoleArn="arn:aws:iam::123456789012:role/
CognitoAccessForAmazonES"
After your domain finishes processing, see the section called “Allowing the Authenticated Role” (p. 104)
and the section called “Configuring Identity Providers” (p. 104) for additional configuration steps.
You can include these permissions in an identity-based (p. 66) policy, but unless you want
authenticated users to have access to all Amazon ES domains, a resource-based (p. 65) policy attached
to a single domain is the better approach:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::123456789012:role/Cognito_identitypoolAuth_Role"
]
},
"Action": [
"es:ESHttp*"
],
"Resource": "arn:aws:es:region:123456789012:domain/domain-name/*"
}
]
}
For instructions about adding a resource-based policy to an Amazon ES domain, see the section called
“Configuring Access Policies” (p. 13).
Warning
Don't rename or delete the app client.
Depending on how you configured your user pool, you might need to create user accounts manually, or
users might be able to create their own. If these settings are acceptable, you don't need to take further
action. Many people, however, prefer to use external identity providers.
To enable a SAML 2.0 identity provider, you must provide a SAML metadata document. To enable social
identity providers like Login with Amazon, Facebook, and Google, you must have an app ID and app
secret from those providers. You can enable any combination of identity providers. The sign-in page adds
options as you add providers.
The easiest way to configure your user pool is to use the Amazon Cognito console. Use the Identity
Providers page to add external identity providers and the App client settings page to enable and disable
identity providers for the Amazon ES domain's app client. For example, you might want to enable your
own SAML identity provider and disable Cognito User Pool as an identity provider.
For instructions, see Using Federation from a User Pool and Specifying Identity Provider Settings for Your
User Pool App in the Amazon Cognito Developer Guide.
By default, all authenticated users assume the same IAM role. To give different users different
permission sets, you have two options:
• Create user groups and configure your identity provider to choose the IAM role based on the user's
authentication token (recommended).
• Configure your identity provider to choose the IAM role based on one or more rules.
You configure these options using the Edit identity pool page of the Amazon Cognito console. For a
walkthrough that includes fine-grained access control, see the section
called “Tutorial: IAM Master User and Amazon Cognito” (p. 87).
Important
Just like the default role, Amazon Cognito must be part of each additional role's trust
relationship. For details, see Creating Roles for Role Mapping in the Amazon Cognito Developer
Guide.
After you create one or more user groups, you can configure your authentication provider to assign
users their groups' roles rather than the identity pool's default role. Choose the Choose role from token
option. Then choose either Use default Authenticated role or DENY to specify how the identity pool
should handle users who are not part of a group.
Rules
Rules are essentially a series of if statements that Amazon Cognito evaluates sequentially. For example,
if a user's email address contains @corporate, Amazon Cognito assigns that user Role_A. If a user's
email address contains @subsidiary, it assigns that user Role_B. Otherwise, it assigns the user the
default authenticated role.
To learn more, see Using Rule-Based Mapping to Assign Roles to Users in the Amazon Cognito Developer
Guide.
Testing
After you are satisfied with your configuration, verify that the user experience meets your expectations.
To access Kibana
If any step of this process fails, see the section called “Common Configuration Issues” (p. 109) for
troubleshooting information.
Limits
Amazon Cognito has soft limits on many of its resources. If you want to enable Kibana authentication for
a large number of Amazon ES domains, review Limits in Amazon Cognito and request limit increases as
necessary.
Each Amazon ES domain adds an app client to the user pool, which adds an authentication provider to
the identity pool. If you enable Kibana authentication for more than 10 domains, you might encounter
the "maximum Amazon Cognito user pool providers per identity pool" limit. If you exceed a limit, any
Amazon ES domains that you try to configure to use Amazon Cognito authentication for Kibana can get
stuck in a configuration state of Processing.
Configuring Amazon ES
Issue: Amazon ES can't create the role (console)
Solution: You don't have the correct IAM permissions. Add the permissions specified in the section
called “Configuring Amazon Cognito Authentication (Console)” (p. 102).

Issue: User is not authorized to perform: cognito-identity:ListIdentityPools on resource
Solution: You don't have read permissions for Amazon Cognito. Attach the AmazonCognitoReadOnly
policy to your account.

Issue: An error occurred (ValidationException) when calling the CreateElasticsearchDomain
operation: User pool does not exist
Solution: Amazon ES can't find the user pool. Confirm that you created one and have the correct ID.
To find the ID, you can use the Amazon Cognito console or the following AWS CLI command:

aws cognito-idp list-user-pools --max-results 60 --region region

Issue: An error occurred (ValidationException) when calling the CreateElasticsearchDomain
operation: IdentityPool not found
Solution: Amazon ES can't find the identity pool. Confirm that you created one and have the correct
ID. To find the ID, you can use the Amazon Cognito console or the following AWS CLI command:

aws cognito-identity list-identity-pools --max-results 60 --region region

Issue: An error occurred (ValidationException) when calling the CreateElasticsearchDomain
operation: Domain needs to be specified for user pool
Solution: The user pool does not have a domain name. You can configure one using the Amazon
Cognito console or the following AWS CLI command:

aws cognito-idp create-user-pool-domain --domain name --user-pool-id id
Accessing Kibana
Issue: The login page doesn't show my preferred identity providers.
Solution: Check that you enabled the identity provider for the Amazon ES app client as specified in
the section called “Configuring Identity Providers” (p. 104).

Issue: The login page doesn't look as if it's associated with my organization.
Solution: See the section called “(Optional) Customizing the Sign-in Page” (p. 108).

Issue: My login credentials don't work.
Solution: Check that you have configured the identity provider as specified in the section called
“Configuring Identity Providers” (p. 104). If you use the user pool as your identity provider, check
that the account exists and is confirmed on the Users and groups page of the Amazon Cognito
console.

Issue: Kibana either doesn't load at all or doesn't work properly.
Solution: The Amazon Cognito authenticated role needs es:ESHttp* permissions for the domain (/*)
to access and use Kibana. Check that you added an access policy as specified in the section called
“Allowing the Authenticated Role” (p. 104).

Issue: Invalid identity pool configuration. Check assigned IAM roles for this pool.
Solution: Amazon Cognito doesn't have permissions to assume the IAM role on behalf of the
authenticated user. Modify the trust relationship for the role to include:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "cognito-identity.amazonaws.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "cognito-identity.amazonaws.com:aud": "identity-pool-id"
      },
      "ForAnyValue:StringLike": {
        "cognito-identity.amazonaws.com:amr": "authenticated"
      }
    }
  }]
}

Issue: Token is not from a supported provider of this identity pool.
Solution: This uncommon error can occur when you remove the app client from the user pool. Try
opening Kibana in a new browser session.
Important
If you no longer need the Amazon Cognito user pool and identity pool, delete them. Otherwise,
you can continue to incur charges.
Using Service-Linked Roles for Amazon ES
Amazon ES uses AWS Identity and Access Management (IAM) service-linked roles. A service-linked
role is a unique type of IAM role that is linked directly to Amazon ES. Service-linked roles are
predefined by Amazon ES and include all the permissions that the service requires to call other AWS
services on your behalf.
A service-linked role makes setting up Amazon ES easier because you don’t have to manually add the
necessary permissions. Amazon ES defines the permissions of its service-linked roles, and unless defined
otherwise, only Amazon ES can assume its roles. The defined permissions include the trust policy and the
permissions policy, and that permissions policy cannot be attached to any other IAM entity.
You can delete a service-linked role only after first deleting its related resources. This protects your
Amazon ES resources because you can't inadvertently remove permission to access the resources.
For information about other services that support service-linked roles, see AWS Services That Work with
IAM and look for the services that have Yes in the Service-Linked Role column. Choose a Yes with a link
to view the service-linked role documentation for that service.
The AWSServiceRoleForAmazonElasticsearchService service-linked role trusts the following services
to assume the role:
• es.amazonaws.com
The role permissions policy allows Amazon ES to complete the following actions on the specified
resources:
• Action: ec2:CreateNetworkInterface on *
• Action: ec2:DeleteNetworkInterface on *
• Action: ec2:DescribeNetworkInterfaces on *
• Action: ec2:ModifyNetworkInterfaceAttribute on *
• Action: ec2:DescribeSecurityGroups on *
• Action: ec2:DescribeSubnets on *
• Action: ec2:DescribeVpcs on *
You must configure permissions to allow an IAM entity (such as a user, group, or role) to create, edit, or
delete a service-linked role. For more information, see Service-Linked Role Permissions in the IAM User
Guide.
If you delete this service-linked role and then need to create it again, you can use the same process to
recreate the role in your account.
You can also use the IAM console, the IAM CLI, or the IAM API to create a service-linked role manually.
For more information, see Creating a Service-Linked Role in the IAM User Guide.
Amazon ES doesn't allow you to edit the AWSServiceRoleForAmazonElasticsearchService service-linked
role. After you create a service-linked role, you can't change its name because various
entities might reference the role. However, you can edit the description of the role using IAM. For more
information, see Editing a Service-Linked Role in the IAM User Guide.
To check whether the service-linked role has an active session in the IAM console
1. Sign in to the AWS Management Console and open the IAM console at https://
console.aws.amazon.com/iam/.
2. In the navigation pane of the IAM console, choose Roles. Then choose the name (not the check box)
of the AWSServiceRoleForAmazonElasticsearchService role.
3. On the Summary page for the selected role, choose the Access Advisor tab.
4. On the Access Advisor tab, review recent activity for the service-linked role.
Note
If you are unsure whether Amazon ES is using the
AWSServiceRoleForAmazonElasticsearchService role, you can try to delete the role. If the
service is using the role, then the deletion fails and you can view the regions where the role
is being used. If the role is being used, then you must wait for the session to end before you
can delete the role. You cannot revoke the session for a service-linked role.
To delete the service-linked role, you must first ensure that no VPC domains are using it:
1. Sign in to the AWS Management Console and open the Amazon ES console.
2. Delete any domains that list VPC under the Endpoint column.
Topics
• Signing HTTP Requests to Amazon Elasticsearch Service (p. 114)
• Compressing HTTP Requests (p. 122)
• Using the AWS SDKs with Amazon Elasticsearch Service (p. 124)
Signing HTTP Requests to Amazon Elasticsearch Service
This section includes examples of how to send signed HTTP requests to Amazon Elasticsearch Service
using Elasticsearch clients and other common libraries.
Topics
• Java (p. 114)
• Python (p. 117)
• Ruby (p. 119)
• Node (p. 120)
• Go (p. 121)
Java
The easiest way of sending a signed request is to use the AWS Request Signing Interceptor. The
repository contains some samples to help you get started, or you can download a sample project for
Amazon ES on GitHub.
The following example uses the Elasticsearch low-level Java REST client to perform two unrelated
actions: registering a snapshot repository and indexing a document. You must provide values for region
and host.
import org.apache.http.HttpEntity;
import org.apache.http.HttpHost;
import org.apache.http.HttpRequestInterceptor;
import org.apache.http.entity.ContentType;
import org.apache.http.nio.entity.NStringEntity;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;
import com.amazonaws.auth.AWS4Signer;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.http.AWSRequestSigningApacheInterceptor;
import java.io.IOException;
// Index a document
entity = new NStringEntity(sampleDocument, ContentType.APPLICATION_JSON);
String id = "1";
request = new Request("PUT", indexingPath + "/" + id);
request.setEntity(entity);
response = esClient.performRequest(request);
System.out.println(response.toString());
}
If you prefer the high-level REST client, which offers most of the same features and simpler code, try the
following sample, which also uses the AWS Request Signing Interceptor:
import org.apache.http.HttpHost;
import org.apache.http.HttpRequestInterceptor;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import com.amazonaws.auth.AWS4Signer;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.http.AWSRequestSigningApacheInterceptor;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
// Form the indexing request, send it, and print the response
IndexRequest request = new IndexRequest(index, type, id).source(document);
IndexResponse response = esClient.index(request, RequestOptions.DEFAULT);
System.out.println(response.toString());
}
Tip
Both signed samples use the default credential chain. Run aws configure using the AWS CLI
to set your credentials.
Python
You can install elasticsearch-py, an Elasticsearch client for Python, using pip. Instead of the client,
you might prefer requests. The requests-aws4auth and SDK for Python (Boto3) packages simplify the
authentication process, but are not strictly required. From the terminal, run the following commands:
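pip install boto3
pip install elasticsearch
pip install requests
pip install requests-aws4auth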
The following sample code establishes a secure connection to the specified Amazon ES domain and
indexes a single document. You must provide values for region and host.
from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3

host = '' # For example, my-test-domain.us-east-1.es.amazonaws.com
region = '' # e.g. us-west-1

service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service,
    session_token=credentials.token)

es = Elasticsearch(
    hosts = [{'host': host, 'port': 443}],
    http_auth = awsauth,
    use_ssl = True,
    verify_certs = True,
    connection_class = RequestsHttpConnection
)

document = {
    "title": "Moneyball",
    "director": "Bennett Miller",
    "year": "2011"
}

# Index the document and print the response. 'movies' is an example index name.
print(es.index(index="movies", doc_type="_doc", id="5", body=document))
If you don't want to use elasticsearch-py, you can just make standard HTTP requests. This sample creates
a new index with seven shards and two replicas:
import boto3
import requests
from requests_aws4auth import AWS4Auth

host = '' # The domain with https:// and trailing slash. For example, https://my-test-domain.us-east-1.es.amazonaws.com/
path = 'my-index' # the Elasticsearch API endpoint
region = '' # For example, us-west-1
service = 'es'

credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service,
    session_token=credentials.token)

# The index body defines the shard and replica counts
payload = {'settings': {'number_of_shards': 7, 'number_of_replicas': 2}}

r = requests.put(host + path, auth=awsauth, json=payload)
print(r.text)
This next example uses the Beautiful Soup library to help build a bulk file from a local directory of HTML
files. Using the same client as the first example, you can send the file to the _bulk API for indexing. You
could use this code as the basis for adding search functionality to a website:
from bs4 import BeautifulSoup
from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3
import glob
import json

host = '' # For example, my-test-domain.us-east-1.es.amazonaws.com
region = '' # e.g. us-west-1

bulk_file = ''
id = 1

# This loop iterates through all HTML files in the current directory and
# indexes two things: the contents of the first h1 tag and all other text.
for html_file in glob.glob('*.html'):
    with open(html_file) as f:
        soup = BeautifulSoup(f, 'html.parser')

    title = soup.h1.string
    body = soup.get_text(" ", strip=True)
    # If get_text() is too noisy, you can do further processing on the string.

    # Each bulk action is a metadata line followed by a document line. 'site' is an example index name.
    bulk_file += json.dumps({'index': {'_index': 'site', '_type': '_doc', '_id': id}}) + '\n'
    bulk_file += json.dumps({'title': title, 'body': body}) + '\n'
    id += 1

service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service,
    session_token=credentials.token)

es = Elasticsearch(
    hosts = [{'host': host, 'port': 443}],
    http_auth = awsauth,
    use_ssl = True,
    verify_certs = True,
    connection_class = RequestsHttpConnection
)

es.bulk(bulk_file)
Ruby
This first example uses the Elasticsearch Ruby client and Faraday middleware to perform the request
signing. From the terminal, run the following commands:
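gem install elasticsearch
gem install faraday_middleware-aws-sigv4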
This sample code creates a new Elasticsearch client, configures Faraday middleware to sign requests, and
indexes a single document. You must provide values for full_url_and_port and region.
require 'elasticsearch'
require 'faraday_middleware/aws_sigv4'
puts client.index index: index, type: type, id: id, body: document
If your credentials don't work, export them at the terminal using the following commands:
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_SESSION_TOKEN="your-session-token"
This next example uses the AWS SDK for Ruby and standard Ruby libraries to send a signed HTTP
request. Like the first example, it indexes a single document. You must provide values for host and
region.
require 'aws-sdk-elasticsearchservice'
service = 'es'
region = '' # e.g. us-west-1
signer = Aws::Sigv4::Signer.new(
service: service,
region: region,
access_key_id: ENV['AWS_ACCESS_KEY_ID'],
secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
session_token: ENV['AWS_SESSION_TOKEN']
)
signature = signer.sign_request(
http_method: 'PUT',
url: host + '/' + index + '/' + type + '/' + id,
body: document.to_json
)
Node
This example uses the SDK for JavaScript in Node.js. From the terminal, run the following commands:
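npm install aws-sdk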
This sample code indexes a single document. You must provide values for region and domain.
var AWS = require('aws-sdk');
var region = ''; // e.g. us-west-1
var domain = ''; // e.g. search-my-test-domain.us-west-1.es.amazonaws.com
var index = 'node-test'; // example index name
var type = '_doc';
var id = '1';
var json = {
  "title": "Moneyball",
  "director": "Bennett Miller",
  "year": "2011"
}

indexDocument(json);

function indexDocument(document) {
  var endpoint = new AWS.Endpoint(domain);
  var request = new AWS.HttpRequest(endpoint, region);
  request.method = 'PUT';
  request.path += index + '/' + type + '/' + id;
  request.body = JSON.stringify(document);
  request.headers['host'] = domain;
  request.headers['Content-Type'] = 'application/json';
  // Content-Length is only needed for DELETE requests that include a request
  // body, but including it for all requests doesn't seem to hurt anything.
  request.headers['Content-Length'] = Buffer.byteLength(request.body);
  // Sign the request with credentials from the environment, then send it
  var credentials = new AWS.EnvironmentCredentials('AWS');
  var signer = new AWS.Signers.V4(request, 'es');
  signer.addAuthorization(credentials, new Date());
  var client = new AWS.HttpClient();
  client.handleRequest(request, null, function(response) {
    var responseBody = '';
    response.on('data', function(chunk) { responseBody += chunk; });
    response.on('end', function(chunk) { console.log('Response: ' + responseBody); });
  }, function(error) {
    console.log('Error: ' + error);
  });
}
If your credentials don't work, export them at the terminal using the following commands:
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_SESSION_TOKEN="your-session-token"
Go
This example uses the AWS SDK for Go and indexes a single document. You must provide values for
domain and region.
package main

import (
    "fmt"
    "net/http"
    "strings"
    "time"

    "github.com/aws/aws-sdk-go/aws/credentials"
    "github.com/aws/aws-sdk-go/aws/signer/v4"
)

func main() {
    domain := "" // e.g. https://my-test-domain.us-east-1.es.amazonaws.com
    endpoint := domain + "/my-index/_doc/1" // 'my-index' is an example index name
    region := "" // e.g. us-east-1
    json := `{ "title": "Moneyball", "director": "Bennett Miller", "year": "2011" }`
    body := strings.NewReader(json)

    // Get credentials from environment variables and create the AWS Signature Version 4 signer
    credentials := credentials.NewEnvCredentials()
    signer := v4.NewSigner(credentials)

    // Form the HTTP request
    req, err := http.NewRequest(http.MethodPut, endpoint, body)
    if err != nil {
        fmt.Print(err)
    }
    // You can probably infer Content-Type programmatically, but here, we just say that it's JSON
    req.Header.Add("Content-Type", "application/json")

    // Sign the request, send it, and print the response status
    signer.Sign(req, body, "es", region, time.Now())
    client := &http.Client{}
    resp, err := client.Do(req)
    if err != nil {
        fmt.Print(err)
    }
    fmt.Print(resp.Status + "\n")
}
If your credentials don't work, export them at the terminal using the following commands:
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_SESSION_TOKEN="your-session-token"
Compressing HTTP Requests
Gzip compression can greatly reduce the size of requests and responses. If gzip compression isn't
enabled on your domain, you can enable it by updating the http_compression.enabled cluster setting:
PUT _cluster/settings
{
"persistent" : {
"http_compression.enabled": true
}
}
Requests to _cluster/settings must be uncompressed, so you might need to use a separate client or
standard HTTP request to update cluster settings.
Required Headers
When including a gzip-compressed request body, keep the standard Content-Type: application/
json header, and add the Content-Encoding: gzip header. To accept a gzip-compressed response,
add the Accept-Encoding: gzip header, as well. If an Elasticsearch client supports gzip compression,
it likely includes these headers automatically.
Some Elasticsearch clients can compress requests for you if you enable a compression option when
you create the client. With such a client, you define and index documents exactly as you normally
would:
document = {
"title": "Moneyball",
"director": "Bennett Miller",
"year": "2011"
}
Alternately, you can specify the required headers, compress the request body yourself, and use a
standard HTTP library like Requests. This code authenticates using HTTP basic credentials, which your
domain might support if you use fine-grained access control (p. 77).
import requests
import gzip
import json

base_url = '' # The domain with https:// and a trailing slash. For example, https://my-test-domain.us-east-1.es.amazonaws.com/
auth = ('master-user', 'master-user-password') # For testing only. Don't store credentials in code.
headers = {'Accept-Encoding': 'gzip', 'Content-Type': 'application/json', 'Content-Encoding': 'gzip'}

document = {
    "title": "Moneyball",
    "director": "Bennett Miller",
    "year": "2011"
}

# Compress the document and index it. 'movies' is an example index name.
compressed_document = gzip.compress(json.dumps(document).encode())
r = requests.post(base_url + 'movies/_doc', auth=auth, data=compressed_document, headers=headers)
print(r.status_code)
print(r.text)
Java
This example uses the AWS SDK for Java to create a domain, update its configuration, and delete it.
Uncomment the calls to waitForDomainProcessing (and comment the call to deleteDomain) to
allow the domain to come online and be usable.
package com.amazonaws.samples;
import java.util.concurrent.TimeUnit;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.elasticsearch.AWSElasticsearch;
import com.amazonaws.services.elasticsearch.AWSElasticsearchClientBuilder;
import com.amazonaws.services.elasticsearch.model.CreateElasticsearchDomainRequest;
import com.amazonaws.services.elasticsearch.model.CreateElasticsearchDomainResult;
import com.amazonaws.services.elasticsearch.model.DeleteElasticsearchDomainRequest;
import com.amazonaws.services.elasticsearch.model.DeleteElasticsearchDomainResult;
import com.amazonaws.services.elasticsearch.model.DescribeElasticsearchDomainRequest;
import com.amazonaws.services.elasticsearch.model.DescribeElasticsearchDomainResult;
import com.amazonaws.services.elasticsearch.model.EBSOptions;
import com.amazonaws.services.elasticsearch.model.ElasticsearchClusterConfig;
import com.amazonaws.services.elasticsearch.model.NodeToNodeEncryptionOptions;
import com.amazonaws.services.elasticsearch.model.ResourceNotFoundException;
import com.amazonaws.services.elasticsearch.model.SnapshotOptions;
import com.amazonaws.services.elasticsearch.model.UpdateElasticsearchDomainConfigRequest;
import com.amazonaws.services.elasticsearch.model.UpdateElasticsearchDomainConfigResult;
import com.amazonaws.services.elasticsearch.model.VolumeType;
/**
* Sample class demonstrating how to use the AWS SDK for Java to create, update,
* and delete Amazon Elasticsearch Service domains.
*/
/**
* Creates an Amazon Elasticsearch Service domain with the specified options.
* Some options require other AWS resources, such as an Amazon Cognito user pool
* and identity pool, whereas others require just an instance type or instance
* count.
*
* @param client
* The AWSElasticsearch client to use for the requests to Amazon
* Elasticsearch Service
* @param domainName
* The name of the domain you want to create
*/
    private static void createDomain(final AWSElasticsearch client, final String domainName) {
        // Create the request and set the desired configuration options.
        CreateElasticsearchDomainRequest createRequest = new CreateElasticsearchDomainRequest()
                .withDomainName(domainName)
                .withElasticsearchVersion("7.1") // An example version
                .withElasticsearchClusterConfig(new ElasticsearchClusterConfig()
                        .withDedicatedMasterEnabled(true)
                        .withDedicatedMasterCount(3)
                        // Small, inexpensive instance types for testing. Not recommended for production domains.
                        .withDedicatedMasterType("t2.small.elasticsearch")
                        .withInstanceType("t2.small.elasticsearch")
                        .withInstanceCount(5))
                // Many instance types require EBS storage.
                .withEBSOptions(new EBSOptions()
                        .withEBSEnabled(true)
                        .withVolumeSize(10)
                        .withVolumeType(VolumeType.Gp2))
                // You can uncomment this line and add your account ID, a user name, and the
                // domain name to add an access policy.
                // .withAccessPolicies("{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam::123456789012:user/user-name\"]},\"Action\":[\"es:*\"],\"Resource\":\"arn:aws:es:region:123456789012:domain/domain-name/*\"}]}")
                .withNodeToNodeEncryptionOptions(new NodeToNodeEncryptionOptions()
                        .withEnabled(true));

        // Make the request.
        System.out.println("Sending domain creation request...");
        CreateElasticsearchDomainResult createResponse = client.createElasticsearchDomain(createRequest);
        System.out.println("Domain creation response from Amazon Elasticsearch Service:");
        System.out.println(createResponse.getDomainStatus().toString());
    }
/**
* Updates the configuration of an Amazon Elasticsearch Service domain with the
* specified options. Some options require other AWS resources, such as an
* Amazon Cognito user pool and identity pool, whereas others require just an
* instance type or instance count.
*
* @param client
* The AWSElasticsearch client to use for the requests to Amazon
* Elasticsearch Service
* @param domainName
* The name of the domain to update
*/
    private static void updateDomain(final AWSElasticsearch client, final String domainName) {
        try {
            // Updates the domain to use three data instances instead of five.
            // You can uncomment the Cognito lines and fill in the strings to
            // enable Cognito authentication for Kibana.
            final UpdateElasticsearchDomainConfigRequest updateRequest = new UpdateElasticsearchDomainConfigRequest()
                    .withDomainName(domainName)
                    // .withCognitoOptions(new CognitoOptions()
                    //     .withEnabled(true)
                    //     .withUserPoolId("user-pool-id")
                    //     .withIdentityPoolId("identity-pool-id")
                    //     .withRoleArn("role-arn"))
                    .withElasticsearchClusterConfig(new ElasticsearchClusterConfig()
                            .withInstanceCount(3));

            // Make the request.
            final UpdateElasticsearchDomainConfigResult updateResponse = client.updateElasticsearchDomainConfig(updateRequest);
            System.out.println("Domain update response from Amazon Elasticsearch Service:");
            System.out.println(updateResponse.toString());
        } catch (ResourceNotFoundException e) {
            System.out.println("Domain not found. Please check the domain name.");
        }
    }
/**
* Deletes an Amazon Elasticsearch Service domain. Deleting a domain can take
* several minutes.
*
* @param client
* The AWSElasticsearch client to use for the requests to Amazon
* Elasticsearch Service
* @param domainName
* The name of the domain that you want to delete
*/
    private static void deleteDomain(final AWSElasticsearch client, final String domainName) {
        try {
            final DeleteElasticsearchDomainRequest deleteRequest = new DeleteElasticsearchDomainRequest()
                    .withDomainName(domainName);

            // Make the request.
            System.out.println("Sending domain deletion request...");
            final DeleteElasticsearchDomainResult deleteResponse = client.deleteElasticsearchDomain(deleteRequest);
            System.out.println("Domain deletion response from Amazon Elasticsearch Service:");
            System.out.println(deleteResponse.toString());
        } catch (ResourceNotFoundException e) {
            System.out.println("Domain not found. Please check the domain name.");
        }
    }
    /**
     * Waits for the domain to finish processing changes. New domains typically take 15-30 minutes
     * to initialize, but can take longer depending on the configuration. Most updates to existing
     * domains take a similar amount of time. This method checks every 15 seconds and finishes only
     * when the domain's processing status changes to false.
*
* @param client
* The AWSElasticsearch client to use for the requests to Amazon
* Elasticsearch Service
* @param domainName
* The name of the domain that you want to check
*/
    private static void waitForDomainProcessing(final AWSElasticsearch client, final String domainName) {
        // Create a new request to check the domain status.
        final DescribeElasticsearchDomainRequest describeRequest = new DescribeElasticsearchDomainRequest()
                .withDomainName(domainName);

        // Check whether the domain is processing changes every 15 seconds.
        DescribeElasticsearchDomainResult describeResponse = client.describeElasticsearchDomain(describeRequest);
        while (describeResponse.getDomainStatus().isProcessing()) {
            try {
                System.out.println("Domain still processing...");
                TimeUnit.SECONDS.sleep(15);
                describeResponse = client.describeElasticsearchDomain(describeRequest);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        System.out.println("Amazon Elasticsearch Service has finished processing changes for your domain.");
    }
For an introduction to indexing, see the Open Distro for Elasticsearch documentation.
Don't include sensitive information in index, type, or document ID names. Elasticsearch uses these names
in its Uniform Resource Identifiers (URIs). Servers and applications often log HTTP requests, which can
lead to unnecessary data exposure if URIs contain sensitive information. For example, even without the
permissions (p. 65) to view the associated JSON document, someone reading such a log line could infer
that one of Dr. Doe's patients with a phone number of 202-555-0100 had the flu in 2018.
Elasticsearch responses include metadata that you might not need. Consider the following indexing
request and its response:
PUT elasticsearch_domain/more-movies/_doc/1
{"title": "Back to the Future"}
Response
{
"_index": "more-movies",
"_type": "_doc",
"_id": "1",
"_version": 4,
"result": "updated",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 3,
"_primary_term": 1
}
This response size might seem minimal, but if you index 1,000,000 documents per day—approximately
11.5 documents per second—339 bytes per response works out to 10.17 GB of download traffic per
month.
If data transfer costs are a concern, use the filter_path parameter to reduce the size of the
Elasticsearch response, but be careful not to filter out fields that you need in order to identify or retry
failed requests. These fields vary by client. The filter_path parameter works for all Elasticsearch REST
APIs, but is especially useful with APIs that you call frequently, such as the _index and _bulk APIs:
PUT elasticsearch_domain/more-movies/_doc/1?filter_path=result,_shards.total
{"title": "Back to the Future"}
Response
{
"result": "updated",
"_shards": {
"total": 2
}
}
Instead of including fields, you can exclude fields with a - prefix. filter_path also supports wildcards:
POST elasticsearch_domain/_bulk?filter_path=-took,-items.index._*
{ "index": { "_index": "more-movies", "_id": "1" } }
{"title": "Back to the Future"}
{ "index": { "_index": "more-movies", "_id": "2" } }
{"title": "Spirited Away"}
Response
{
"errors": false,
"items": [
{
"index": {
"result": "updated",
"status": 200
}
},
{
"index": {
"result": "updated",
"status": 200
}
}
  ]
}
Topics
• Loading Streaming Data into Amazon ES from Amazon S3 (p. 131)
• Loading Streaming Data into Amazon ES from Amazon Kinesis Data Streams (p. 136)
• Loading Streaming Data into Amazon ES from Amazon DynamoDB (p. 138)
• Loading Streaming Data into Amazon ES from Amazon Kinesis Data Firehose (p. 141)
• Loading Streaming Data into Amazon ES from Amazon CloudWatch (p. 141)
• Loading Data into Amazon ES from AWS IoT (p. 141)
Loading Streaming Data into Amazon ES from Amazon S3
This method of streaming data is extremely flexible. You can index object metadata, or if the object is
plaintext, parse and index some elements of the object body. This section includes some unsophisticated
Python sample code that uses regular expressions to parse a log file and index the matches.
Tip
For more robust code in Node.js, see amazon-elasticsearch-lambda-samples on GitHub. Some
Lambda blueprints also contain useful parsing examples.
Prerequisites
Before proceeding, you must have the following resources.
Prerequisite Description
Amazon S3 Bucket For more information, see Creating a Bucket in the Amazon Simple Storage
Service Getting Started Guide. The bucket must reside in the same region as
your Amazon ES domain.
Amazon ES Domain The destination for data after your Lambda function processes it. For more
information, see Creating Amazon ES Domains (p. 9).
import boto3
import re
import requests
from requests_aws4auth import AWS4Auth

region = 'us-west-1' # An example value; use your domain's region
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, 'es', session_token=credentials.token)
url = '' # Domain endpoint plus index and type, for example https://domain-endpoint/lambda-s3-index/lambda-type
# Simple regular expressions that parse one log line; adjust them for your log format
ip_pattern = re.compile(r'(\d+\.\d+\.\d+\.\d+)')
time_pattern = re.compile(r'\[(.+)\]')
message_pattern = re.compile(r'"(.+)"')
s3 = boto3.client('s3')

def handler(event, context):
    for record in event['Records']:
        # Get the bucket name and key for the new file
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        lines = s3.get_object(Bucket=bucket, Key=key)['Body'].read().decode('utf-8').splitlines()
        # Match the regular expressions to each line and index the JSON
        for line in lines:
            ip = ip_pattern.search(line).group(1)
            timestamp = time_pattern.search(line).group(1)
            message = message_pattern.search(line).group(1)
            document = {'ip': ip, 'timestamp': timestamp, 'message': message}
            requests.post(url, auth=awsauth, json=document, headers={'Content-Type': 'application/json'})
cd s3-to-es
pip install requests -t .
pip install requests_aws4auth -t .
All Lambda execution environments have Boto3 installed, so you don't need to include it in your
deployment package.
Tip
If you use macOS, these commands might not work properly. As a workaround, add a file
named setup.cfg to the s3-to-es directory:
[install]
prefix=
zip -r lambda.zip *
This example assumes that you are using the console. Choose Python 2.7 and a role that has S3 read
permissions and Amazon ES write permissions.
After you create the function, you must add a trigger. For this example, we want the code to run
whenever a log file arrives in an S3 bucket:
1. Choose S3.
2. Choose your bucket.
3. For Event type, choose PUT.
4. For Prefix, type logs/.
5. For Suffix, type .log.
6. Select Enable trigger.
7. Choose Add.
1. For Handler, type sample.handler. This setting tells Lambda the file (sample.py) and method
(handler) that it should run after a trigger.
2. For Code entry type, choose Upload a .ZIP file, and then follow the prompts to upload your
deployment package.
3. Choose Save.
At this point, you have a complete set of resources: a bucket for log files, a function that runs whenever a
log file is added to the bucket, code that performs the parsing and indexing, and an Amazon ES domain
for searching and visualization.
Upload the file to the logs folder of your S3 bucket. For instructions, see Add an Object to a Bucket in
the Amazon Simple Storage Service Getting Started Guide.
Then use the Amazon ES console or Kibana to verify that the lambda-s3-index index contains two
documents. You can also make a standard search request:
GET https://es-domain/lambda-s3-index/_search?pretty
{
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "lambda-s3-index",
"_type" : "lambda-type",
"_id" : "vTYXaWIBJWV_TTkEuSDg",
"_score" : 1.0,
"_source" : {
"ip" : "12.345.678.91",
"message" : "GET /some-file.jpg",
"timestamp" : "10/Oct/2000:14:56:14 -0700"
}
},
{
"_index" : "lambda-s3-index",
"_type" : "lambda-type",
"_id" : "vjYmaWIBJWV_TTkEuCAB",
"_score" : 1.0,
"_source" : {
"ip" : "12.345.678.90",
"message" : "PUT /some-file.jpg",
"timestamp" : "10/Oct/2000:13:55:36 -0700"
}
}
]
}
}
Loading Streaming Data into Amazon ES from Amazon Kinesis Data Streams
Prerequisites
Before proceeding, you must have the following resources.
Prerequisite Description
Amazon Kinesis Data Stream The event source for your Lambda function. To learn more, see Kinesis
Data Streams.
Amazon ES Domain The destination for data after your Lambda function processes it. For more
information, see Creating Amazon ES Domains (p. 9).
IAM Role This role must have basic Amazon ES, Kinesis, and Lambda permissions,
such as the following:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"es:ESHttpPost",
"es:ESHttpPut",
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents",
"kinesis:GetShardIterator",
"kinesis:GetRecords",
"kinesis:DescribeStream",
"kinesis:ListStreams"
],
"Resource": "*"
}
]
}
The role must also have the following trust relationship, which allows Lambda to assume it:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
To learn more, see Creating IAM Roles in the IAM User Guide.
import base64
import boto3
import json
import requests
from requests_aws4auth import AWS4Auth
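The body of the function is omitted here. A minimal handler under these imports might look like the
following sketch; url, awsauth, and headers are assumed to be configured the same way as in the
Amazon S3 example:

def handler(event, context):
    count = 0
    for record in event['Records']:
        id = record['eventID']
        timestamp = record['kinesis']['approximateArrivalTimestamp']
        # Kinesis record data is base64-encoded, so decode it before indexing
        message = base64.b64decode(record['kinesis']['data']).decode('utf-8')
        document = {'id': id, 'timestamp': timestamp, 'message': message}
        r = requests.put(url + '/' + id, auth=awsauth, json=document, headers=headers)
        count += 1
    return str(count) + ' records processed.'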
cd kinesis-to-es
pip install requests -t .
pip install requests_aws4auth -t .
Then follow the instructions in the section called “Creating the Lambda Function” (p. 133), but specify
the IAM role from the section called “Prerequisites” (p. 136) and the appropriate settings for the Kinesis
trigger.
To learn more, see Working with Amazon Kinesis Data Streams in the Amazon Kinesis Data Streams
Developer Guide.
At this point, you have a complete set of resources: a Kinesis data stream, a function that runs after
the stream receives new data and indexes that data, and an Amazon ES domain for searching and
visualization.
aws kinesis put-record --stream-name es-test --data "My test data." --partition-key partitionKey1 --region us-west-1
Then use the Amazon ES console or Kibana to verify that lambda-kine-index contains a document.
You can also use the following request:
GET https://es-domain/lambda-kine-index/_search
{
"hits" : [
{
"_index": "lambda-kine-index",
"_type": "lambda-kine-type",
"_id":
"shardId-000000000000:49583511615762699495012960821421456686529436680496087042",
"_score": 1,
"_source": {
"timestamp": 1523648740.051,
"message": "My test data.",
"id":
"shardId-000000000000:49583511615762699495012960821421456686529436680496087042"
}
}
]
}
Loading Streaming Data into Amazon ES from Amazon DynamoDB
Prerequisites
Before proceeding, you must have the following resources.
Prerequisite Description
DynamoDB Table The table contains your source data. For more information, see Basic
Operations for Tables in the Amazon DynamoDB Developer Guide.
The table must reside in the same region as your Amazon ES domain and
have a stream set to New image. To learn more, see Enabling a Stream.
Amazon ES Domain The destination for data after your Lambda function processes it. For more
information, see Creating Amazon ES Domains (p. 9).
IAM Role This role must have basic Amazon ES, DynamoDB, and Lambda execution
permissions, such as the following:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"es:ESHttpPost",
"es:ESHttpPut",
"dynamodb:DescribeStream",
"dynamodb:GetRecords",
"dynamodb:GetShardIterator",
"dynamodb:ListStreams",
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "*"
}
]
}
The role must also have the following trust relationship, which allows Lambda to assume it:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
To learn more, see Creating IAM Roles in the IAM User Guide.
import boto3
import requests
from requests_aws4auth import AWS4Auth

region = 'us-east-1' # An example value; use your domain's region
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, 'es', session_token=credentials.token)
url = '' # Domain endpoint plus index and type, plus a trailing slash, for example https://domain-endpoint/lambda-index/lambda-type/
headers = {'Content-Type': 'application/json'}

def handler(event, context):
    count = 0
    for record in event['Records']:
        # Get the primary key of the table item for use as the document ID
        id = record['dynamodb']['Keys']['id']['S']
        if record['eventName'] == 'REMOVE':
            r = requests.delete(url + id, auth=awsauth)
        else:
            document = record['dynamodb']['NewImage']
            r = requests.put(url + id, auth=awsauth, json=document, headers=headers)
        count += 1
    return str(count) + ' records processed.'
cd ddb-to-es
pip install requests -t .
pip install requests_aws4auth -t .
Then follow the instructions in the section called “Creating the Lambda Function” (p. 133), but specify
the IAM role from the section called “Prerequisites” (p. 138) and the appropriate settings for the
DynamoDB trigger.
To learn more, see Processing New Items in a DynamoDB Table in the Amazon DynamoDB Developer
Guide.
At this point, you have a complete set of resources: a DynamoDB table for your source data, a DynamoDB
stream of changes to the table, a function that runs after your source data changes and indexes those
changes, and an Amazon ES domain for searching and visualization.
Then use the Amazon ES console or Kibana to verify that lambda-index contains a document. You can
also use the following request:
GET https://es-domain/lambda-index/lambda-type/00001
{
"_index": "lambda-index",
"_type": "lambda-type",
"_id": "00001",
"_version": 1,
"found": true,
"_source": {
"director": {
"S": "Kevin Costner"
},
"id": {
"S": "00001"
},
"title": {
"S": "The Postman"
}
}
}
Before you load data into Amazon ES, you might need to perform transforms on the data. To learn more
about using Lambda functions to perform this task, see Data Transformation in the Amazon Kinesis Data
Firehose Developer Guide.
As you configure a delivery stream, Kinesis Data Firehose features a "one-click" IAM role that gives it the
resource access it needs to send data to Amazon ES, back up data on Amazon S3, and transform data
using Lambda. Because of the complexity involved in creating such a role manually, we recommend using
the provided role.
If your Amazon ES domain uses fine-grained access control (p. 77) with HTTP basic authentication,
configuration is similar to any other Elasticsearch cluster. This example configuration file takes its input
from the open source version of Filebeat (Filebeat OSS).
input {
beats {
port => 5044
}
}
output {
elasticsearch {
hosts => ["https://domain-endpoint:443"]
ssl => true
index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
user => "some-user"
password => "some-user-password"
ilm_enabled => false
}
}
If your domain uses an IAM-based domain access policy or fine-grained access control with an IAM
master user, you must sign all requests to Amazon ES using IAM credentials. In this case, the simplest
solution to sign requests from Logstash is to use the logstash-output-amazon-es plugin. First, install the
plugin.
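If you install plugins with the logstash-plugin tool, the command likely looks like the following (the
plugin name comes from the awslabs/logstash-output-amazon_es repository):

bin/logstash-plugin install logstash-output-amazon_es

Then make your IAM credentials available to Logstash, for example as environment variables: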
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_SESSION_TOKEN="your-session-token"
Finally, change your configuration file to use the plugin for its output. This example configuration file
takes its input from files in an S3 bucket.
input {
s3 {
bucket => "my-s3-bucket"
region => "us-east-1"
}
}
output {
amazon_es {
hosts => ["domain-endpoint"]
ssl => true
region => "us-east-1"
index => "production-logs-%{+YYYY.MM.dd}"
}
}
If your Amazon ES domain is in a VPC, Logstash must run on a machine in that same VPC and have access
to the domain through the VPC security groups. For more information, see the section called “About
Access Policies on VPC Domains” (p. 23).
URI Searches
Universal Resource Identifier (URI) searches are the simplest form of search. In a URI search, you specify
the query as an HTTP request parameter:
GET https://search-my-domain.us-west-1.es.amazonaws.com/_search?q=house
{
"took": 25,
"timed_out": false,
"_shards": {
"total": 10,
"successful": 10,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 85,
"max_score": 6.6137657,
"hits": [
{
"_index": "movies",
"_type": "movie",
"_id": "tt0077975",
"_score": 6.6137657,
"_source": {
"directors": [
"John Landis"
],
"release_date": "1978-07-27T00:00:00Z",
"rating": 7.5,
"genres": [
"Comedy",
"Romance"
],
"image_url": "http://ia.media-imdb.com/images/M/
MV5BMTY2OTQxNTc1OF5BMl5BanBnXkFtZTYwNjA3NjI5._V1_SX400_.jpg",
"plot": "At a 1962 College, Dean Vernon Wormer is determined to expel the entire
Delta Tau Chi Fraternity, but those troublemakers have other plans for him.",
"title": "Animal House",
"rank": 527,
"running_time_secs": 6540,
"actors": [
"John Belushi",
"Karen Allen",
"Tom Hulce"
],
"year": 1978,
"id": "tt0077975"
}
},
...
]
}
}
By default, this query searches all fields of all indices for the term house. To narrow the search, specify an
index (movies) and a document field (title) in the URI:
GET https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house
You can include additional parameters in the request, but the supported parameters provide only a
small subset of the Elasticsearch search options. The following request returns 20 results (instead of the
default of 10) and sorts by year (rather than by _score):
GET https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house&size=20&sort=year:desc
Request Body Searches
You can achieve the same result, and express far more complex queries, by specifying the search criteria
in the HTTP request body:
POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
"size": 20,
"sort": {
"year": {
"order": "desc"
}
},
"query": {
"query_string": {
"default_field": "title",
"query": "house"
}
}
}
Note
The _search API accepts HTTP GET and POST for request body searches, but not all HTTP
clients support adding a request body to a GET request. POST is the more universal choice.
In many cases, you might want to search several fields, but not all fields. Use the multi_match query:
POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
"size": 20,
"query": {
"multi_match": {
"query": "house",
"fields": ["title", "plot", "actors", "directors"]
}
}
}
Boosting Fields
You can improve search relevancy by "boosting" certain fields. Boosts are multipliers that weigh matches
in one field more heavily than matches in other fields. In the following example, a match for john in the
title field influences _score twice as much as a match in the plot field and four times as much as
a match in the actors or directors fields. The result is that films like John Wick and John Carter are
near the top of the search results, and films starring John Travolta are near the bottom.
POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
"size": 20,
"query": {
"multi_match": {
"query": "john",
"fields": ["title^4", "plot^2", "actors", "directors"]
}
}
}
Paginating Search Results
To paginate through a larger result set, use the from parameter with size. The following request returns
results 21 through 40:
POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
"from": 20,
"size": 20,
"query": {
"multi_match": {
"query": "house",
"fields": ["title^4", "plot^2", "actors", "directors"]
}
}
}
Highlighting Search Results
Highlighting returns the portion of a document that matched the query, with the matching text
emphasized. The following request highlights matches in the plot field:
POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
"size": 20,
"query": {
"multi_match": {
"query": "house",
"fields": ["title^4", "plot^2", "actors", "directors"]
}
},
"highlight": {
"fields": {
"plot": {}
}
}
}
If the query matched the content of the plot field, a hit might look like the following:
{
"_index": "movies",
"_type": "movie",
"_id": "tt0091541",
"_score": 11.276199,
"_source": {
"directors": [
"Richard Benjamin"
],
"release_date": "1986-03-26T00:00:00Z",
"rating": 6,
"genres": [
"Comedy",
"Music"
],
"image_url": "http://ia.media-imdb.com/images/M/
MV5BMTIzODEzODE2OF5BMl5BanBnXkFtZTcwNjQ3ODcyMQ@@._V1_SX400_.jpg",
"plot": "A young couple struggles to repair a hopelessly dilapidated house.",
"title": "The Money Pit",
"rank": 4095,
"running_time_secs": 5460,
"actors": [
"Tom Hanks",
"Shelley Long",
"Alexander Godunov"
],
"year": 1986,
"id": "tt0091541"
},
"highlight": {
"plot": [
"A young couple struggles to repair a hopelessly dilapidated <em>house</em>."
]
}
}
By default, Elasticsearch wraps the matching string in <em> tags, provides up to 100 characters of
context around the match, and breaks content into sentences by identifying punctuation marks, spaces,
tabs, and line breaks. All of these settings are customizable:
POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
"size": 20,
"query": {
"multi_match": {
"query": "house",
"fields": ["title^4", "plot^2", "actors", "directors"]
}
},
"highlight": {
"fields": {
"plot": {}
},
"pre_tags": "<strong>",
"post_tags": "</strong>",
"fragment_size": 200,
"boundary_chars": ".,!? "
}
}
Count API
If you're not interested in the contents of your documents and just want to know the number of
matches, you can use the _count API instead of the _search API. The following request uses the
query_string query to identify romantic comedies:
POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_count
{
"query": {
"query_string": {
"default_field": "genres",
"query": "romance AND comedy"
}
}
}
Response
{
"count": 564,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
}
}
Topics
• Uploading Packages to Amazon S3 (p. 147)
• Importing and Associating Packages (p. 148)
• Using Custom Packages with Elasticsearch (p. 148)
• Updating Custom Packages (p. 150)
• Dissociating and Removing Packages (p. 151)
If your package contains sensitive information, specify server-side encryption with S3-managed keys
when you upload it. Amazon ES can't access files on S3 that you protect using an AWS KMS master key.
After you upload the file, make note of its S3 path. The path format is s3://bucket-name/file-path/file-name.
You can use the following synonyms file for testing purposes. Save it as synonyms.txt.
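For example, a small file like the following, which matches the sample documents used later in this
section, works for testing:

danish, croissant, pastry
ice cream, gelato, frozen custard
sneaker, tennis shoe, hightop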
Certain dictionaries, such as Hunspell dictionaries, use multiple files and require their own directories on
the file system. At this time, Amazon ES only supports single-file dictionaries.
Alternately, use the AWS CLI, SDKs, or configuration API to import and associate packages. For more
information, see the AWS CLI Command Reference and Amazon ES Configuration API Reference (p. 271).
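With the AWS CLI, for example, the create-package and associate-package commands might look like
the following; the bucket, package, and domain names are illustrative, and the package ID that Amazon
ES assigns (F111111111 here) is what you reference in synonyms_path in the next request:

aws es create-package \
  --package-name my-synonyms \
  --package-type TXT-DICTIONARY \
  --package-source S3BucketName=my-bucket,S3Key=synonyms.txt
aws es associate-package --package-id F111111111 --domain-name my-domain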
PUT my-index
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"synonym_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["synonym_filter"]
}
},
"filter": {
"synonym_filter": {
"type": "synonym",
"synonyms_path": "analyzers/F111111111"
}
}
}
}
},
"mappings": {
"properties": {
"description": {
"type": "text",
"analyzer": "synonym_analyzer"
}
}
}
}
This request creates a custom analyzer for the index that uses the standard tokenizer and a synonym
token filter.
• Tokenizers break streams of characters into tokens (typically words) based on some set of rules. The
simplest example is the whitespace tokenizer, which starts a new token each time it encounters a
whitespace character. A more complex example is the standard tokenizer, which uses a set of grammar-
based rules to work across many languages.
• Token filters add, modify, or delete tokens. For example, a synonym token filter adds tokens when it
finds a word in the synonyms list. The stop token filter removes tokens when it finds a word in the stop
words list.
This request also adds a text field (description) to the mapping and tells Elasticsearch to use the new
analyzer for that field.
POST _bulk
{ "index": { "_index": "my-index", "_id": "1" } }
{ "description": "ice cream" }
{ "index": { "_index": "my-index", "_id": "2" } }
{ "description": "croissant" }
{ "index": { "_index": "my-index", "_id": "3" } }
{ "description": "tennis shoe" }
{ "index": { "_index": "my-index", "_id": "4" } }
{ "description": "hightop" }
GET my-index/_search
{
"query": {
"match": {
"description": "gelato"
}
}
}
{
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.99463606,
"hits": [{
"_index": "my-index",
"_type": "_doc",
"_id": "1",
"_score": 0.99463606,
"_source": {
"description": "ice cream"
}
}]
}
}
Tip
Dictionary files use Java heap space proportional to their size. For example, a 2 GiB dictionary
file might consume 2 GiB of heap space on a node. If you use large files, ensure that your nodes
have enough heap space to accommodate them. Monitor (p. 28) the JVMMemoryPressure
metric, and scale your cluster as necessary.
After you associate the updated file with your domain, you can use it with new indices by using the
requests in the section called “Using Custom Packages with Elasticsearch” (p. 148).
If you want to use the updated file with existing indices, however, you must reindex them. First, create an
index that uses the updated synonyms file:
PUT my-new-index
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"synonym_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": ["synonym_filter"]
}
},
"filter": {
"synonym_filter": {
"type": "synonym",
"synonyms_path": "analyzers/F222222222"
}
}
}
}
},
"mappings": {
"properties": {
"description": {
"type": "text",
"analyzer": "synonym_analyzer"
}
}
}
}
POST _reindex
{
"source": {
"index": "my-index"
},
"dest": {
"index": "my-new-index"
}
}
If you frequently update synonym files, use index aliases to maintain a consistent path to the latest
index:
POST _aliases
{
"actions": [
{
"remove": {
"index": "my-index",
"alias": "latest-index"
}
},
{
"add": {
"index": "my-new-index",
"alias": "latest-index"
}
}
]
}
If you don't need the old index, delete it. If you no longer need the older version of the package,
dissociate and remove it (p. 151).
The console is the simplest way to dissociate a package from a domain and remove it from Amazon ES.
Removing a package from Amazon ES does not remove it from its original location on Amazon S3.
6. If you want to use the package with other domains, stop here. To continue with removing the
package, choose Packages in the navigation pane.
7. Select the package and choose Delete.
Alternately, use the AWS CLI, SDKs, or configuration API to dissociate and remove packages. For more
information, see the AWS CLI Command Reference and Amazon ES Configuration API Reference (p. 271).
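With the AWS CLI, for example, the dissociate-package and delete-package commands might look like
the following; the package ID (here, the updated package from the previous section) and domain name
are illustrative:

aws es dissociate-package --package-id F222222222 --domain-name my-domain
aws es delete-package --package-id F222222222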
SQL support is available on domains running Elasticsearch 6.5 or higher. Full documentation is available
in the Open Distro for Elasticsearch documentation.
Sample Call
To query your data using SQL, send HTTP requests to _opendistro/_sql using the following format:
POST elasticsearch_domain/_opendistro/_sql
{
"query": "SELECT * FROM my-index LIMIT 50"
}
For security considerations related to using SQL with fine-grained access control, see Fine-Grained Access
Control in Amazon Elasticsearch Service (p. 91).
The Open Distro for Elasticsearch SQL plugin includes many tuneable settings, but on Amazon ES, use
the _opendistro/_sql/settings path rather than the standard _cluster/settings path:
PUT _opendistro/_sql/settings
{
"persistent": {
"opendistro.sql.cursor.enabled": true
}
}
Workbench
The SQL Workbench is a Kibana user interface that lets you run on-demand SQL queries, translate SQL
into its REST equivalent, and view and save results as text, JSON, JDBC, or CSV. For more information, see
Workbench.
SQL CLI
The SQL CLI is a standalone Python application that you can launch with the odfesql command. For
steps to install, configure, and use, see SQL CLI.
JDBC Driver
The Java Database Connectivity (JDBC) driver lets you integrate Amazon ES domains with your favorite
business intelligence (BI) applications. To get started, see the GitHub repository. The following table
summarizes version compatibility for the driver.
Elasticsearch Version JDBC Driver Version
7.7 1.8.0
7.4 1.4.0
7.1 1.0.0
6.8 0.9.0
6.7 0.9.0
6.5 0.9.0
ODBC Driver
The Open Database Connectivity (ODBC) driver is a read-only ODBC driver for Windows and macOS that
lets you connect business intelligence and data visualization applications like Tableau and Microsoft
Excel to the SQL plugin. For information on downloading and using the driver, see the SQL repository
on GitHub.
KNN
Short for its associated k-nearest neighbors algorithm, KNN for Amazon Elasticsearch Service lets
you search for points in a vector space and find the "nearest neighbors" for those points by Euclidean
distance or cosine similarity. Use cases include recommendations (for example, an "other songs you
might like" feature in a music application), image recognition, and fraud detection.
KNN requires Elasticsearch 7.1 or later. Full documentation for the Elasticsearch feature, including
descriptions of settings and statistics, is available in the Open Distro for Elasticsearch documentation. For
background information about the k-nearest neighbors algorithm, see Wikipedia.
To get started, create an index with one or more fields of the knn_vector data type:
PUT my-index
{
"settings": {
"index.knn": true
},
"mappings": {
"properties": {
"my_vector1": {
"type": "knn_vector",
"dimension": 2
},
"my_vector2": {
"type": "knn_vector",
"dimension": 4
}
}
}
}
The knn_vector data type supports a single list of up to 10,000 floats, with the number of floats
defined by the required dimension parameter. After you create the index, add some data to it.
POST _bulk
{ "index": { "_index": "my-index", "_id": "1" } }
{ "my_vector1": [1.5, 2.5], "price": 12.2 }
{ "index": { "_index": "my-index", "_id": "2" } }
{ "my_vector1": [2.5, 3.5], "price": 7.1 }
{ "index": { "_index": "my-index", "_id": "3" } }
{ "my_vector1": [3.5, 4.5], "price": 12.9 }
{ "index": { "_index": "my-index", "_id": "4" } }
{ "my_vector1": [5.5, 6.5], "price": 1.2 }
{ "index": { "_index": "my-index", "_id": "5" } }
{ "my_vector1": [4.5, 5.5], "price": 3.7 }
{ "index": { "_index": "my-index", "_id": "6" } }
{ "my_vector2": [1.5, 5.5, 4.5, 6.4], "price": 10.3 }
{ "index": { "_index": "my-index", "_id": "7" } }
{ "my_vector2": [2.5, 3.5, 5.6, 6.7], "price": 5.5 }
{ "index": { "_index": "my-index", "_id": "8" } }
{ "my_vector2": [4.5, 5.5, 6.7, 3.7], "price": 4.4 }
{ "index": { "_index": "my-index", "_id": "9" } }
{ "my_vector2": [1.5, 5.5, 4.5, 6.4], "price": 8.9 }
Then you can search the data using the knn query type.
GET my-index/_search
{
"size": 2,
"query": {
"knn": {
"my_vector2": {
"vector": [2, 3, 5, 6],
"k": 2
}
}
}
}
In this case, k is the number of neighbors you want the query to return, but you must also include the
size option. Otherwise, you get k results for each shard (and each segment) rather than k results for the
entire query. KNN supports a maximum k value of 10,000.
If you mix the knn query with other clauses, you might receive fewer than k results. In this example, the
post_filter clause reduces the number of results from 2 to 1.
GET my-index/_search
{
"size": 2,
"query": {
"knn": {
"my_vector2": {
"vector": [2, 3, 5, 6],
"k": 2
}
}
},
"post_filter": {
"range": {
"price": {
"gte": 6,
"lte": 10
}
}
}
}
In particular, check the KNNGraphMemoryUsage metric on each data node against the
knn.memory.circuit_breaker.limit statistic and the available RAM for the instance type. Amazon
ES uses half of an instance's RAM for the Java heap (up to a heap size of 32 GiB). By default, KNN uses
up to 60% of the remaining half, so an instance type with 32 GiB of RAM can accommodate 9.6 GiB of
graphs (32 * 0.5 * 0.6). Performance can suffer if graph memory usage exceeds this value.
It often makes more sense to use multiple smaller domains instead of a single large domain, especially
when you're running different types of workloads.
Cross-cluster search supports Kibana, so you can create visualizations and dashboards across all your
domains.
Topics
• Limitations (p. 156)
• Cross-Cluster Search Prerequisites (p. 156)
• Cross-Cluster Search Pricing (p. 156)
• Setting Up a Connection (p. 156)
• Removing a Connection (p. 157)
• Setting Up Security and Sample Walkthrough (p. 157)
• Kibana (p. 161)
Limitations
Cross-cluster search has several important limitations:
• You can only implement cross-cluster search on domains created on or after June 3rd, 2020.
• You can't connect to self-managed Elasticsearch clusters.
• You can't connect to domains in different AWS Regions.
• A domain can have a maximum of 20 outgoing connections. Similarly, a domain can have a maximum
of 20 incoming connections.
• Domains must either share the same major version, or be on the final minor version and the next major
version (for example, 6.8 and 7.x are compatible).
• You can't use custom dictionaries or SQL with cross-cluster search.
• You can't use AWS CloudFormation to connect domains.
• You can't use cross-cluster search on M3 and T2 instances.
Setting Up a Connection
The “source” domain refers to the domain that a cross-cluster search request originates from. In other
words, the source domain is the one that you send the initial search request to.
The “destination” domain is the domain that the source domain queries.
A cross-cluster connection is unidirectional from the source to the destination domain. This means that
the destination domain can’t query the source domain. However, you can set up another connection in
the opposite direction.
The source domain creates an "outbound" connection to the destination domain. The destination domain
receives an "inbound" connection request from the source domain.
To set up a connection
1. On your domain dashboard, choose a domain, and choose the Cross-cluster search connections tab.
• To connect to a domain in your AWS account, from the dropdown list, choose the domain that you
want to connect to and choose Submit.
• To connect to a domain in another AWS account, specify the ARN of the remote domain and
choose Submit.
5. Cross-cluster search first validates the connection request to make sure that the prerequisites are
met to ensure compatibility. If the domains are found to be incompatible, the connection request
enters the “Validation failed” state.
6. After the connection request is validated successfully, it is sent to the destination domain, where
it needs to be approved. Until this approval happens, the connection remains in a “Pending
acceptance” state. When the connection request is accepted at the destination domain, the state
changes to “Active” and the destination domain becomes available for queries.
• The domain page shows you the overall domain health and instance health details of your
destination domain. Only domain owners have the flexibility to create, view, remove, and monitor
connections to or from their domains.
After the connection is established, any traffic that flows between the nodes of the connected domains
is encrypted. If you connect a VPC domain to a non-VPC domain and the non-VPC domain is a public
endpoint that can receive traffic from the internet, the cross-cluster traffic between the domains is still
encrypted and secure.
Removing a Connection
Removing a connection stops any cross-cluster operation on its indices.
You can perform these steps on either the source or destination domain to remove the connection.
After the connection is removed, it's still visible with a "Deleted" status for a period of 15 days.
You can't delete a domain with active cross-cluster connections. To delete a domain, first remove all
incoming and outgoing connections from that domain. This requirement ensures that you account for
the cross-cluster users of a domain before you delete it.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"*"
]
},
"Action": [
"es:ESHttp*"
],
"Resource": "arn:aws:es:region:account:domain/src-domain/*"
}
]
}
Note
The domain resource policy evaluates the URI literally, so if you include remote indices in the
path, use arn:aws:es:us-east-1:123456789012:domain/my-domain/local_index,dst%3Aremote_index
rather than arn:aws:es:us-east-1:123456789012:domain/my-domain/local_index,dst:remote_index.
If you choose to use a restrictive access policy in addition to fine-grained access control, your policy
must allow access to es:ESHttpGet at a minimum.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::123456789012:user/test-es-user"
]
},
"Action": "es:ESHttpGet",
"Resource": "arn:aws:es:region:account:domain/src-domain/*"
}
]
}
3. Fine-grained access control (p. 77) on the source domain evaluates the request:
If the request only searches data on the destination domain (for example, dest-alias:dest-
index/_search), you only need permissions on the destination domain.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": "es:ESCrossClusterGet",
"Resource": "arn:aws:es:region:account:domain/dst-domain"
}
]
}
Make sure that the es:ESCrossClusterGet permission is applied for /dst-domain and not /dst-domain/*.
However, this minimum policy only allows cross-cluster searches. To perform other operations, such
as indexing documents and performing standard searches, you need additional permissions. We
recommend the following policy on the destination domain:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": [
"*"
]
},
"Action": [
"es:ESHttp*"
],
"Resource": "arn:aws:es:region:account:domain/dst-domain/*"
},
{
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": "es:ESCrossClusterGet",
"Resource": "arn:aws:es:region:account:domain/dst-domain"
}
]
}
Note
All cross-cluster search requests between domains are encrypted in transit by default as
part of node-to-node encryption.
5. The destination domain performs the search and returns the results to the source domain.
6. The source domain combines its own results (if any) with the results from the destination domain
and returns them to you.
7. We recommend Postman for testing requests:
POST https://dst-domain.us-east-1.es.amazonaws.com/books/_doc/1
{
"Dracula": "Bram Stoker"
}
• To query this index from the source domain, include the connection alias of the destination
domain within the query.
GET https://src-domain.us-east-1.es.amazonaws.com/<connection_alias>:books/_search
{
...
"hits": [
{
"_index": "source-destination:books",
"_type": "_doc",
"_id": "1",
"_score": 1,
"_source": {
"Dracula": "Bram Stoker"
}
}
]
}
You can find the connection alias on the Cross-cluster search connections tab on your domain
dashboard.
• If you set up a connection between domain-a -> domain-b with connection alias cluster_b
and domain-a -> domain-c with connection alias cluster_c, search domain-a, domain-b,
and domain-c as follows:
GET https://src-domain.us-east-1.es.amazonaws.com/local_index,cluster_b:b_index,cluster_c:c_index/_search
{
"query": {
"match": {
"user": "domino"
}
}
}
Response
{
"took": 150,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"failed": 0,
"skipped": 0
},
"_clusters": {
"total": 3,
"successful": 3,
"skipped": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "local_index",
"_type": "_doc",
"_id": "0",
"_score": 1,
"_source": {
"user": "domino",
"message": "Lets unite the new mutants",
"likes": 0
}
},
{
"_index": "cluster_b:b_index",
"_type": "_doc",
"_id": "0",
"_score": 2,
"_source": {
"user": "domino",
"message": "I'm different",
"likes": 0
}
},
{
"_index": "cluster_c:c_index",
"_type": "_doc",
"_id": "0",
"_score": 3,
"_source": {
"user": "domino",
"message": "So am I",
"likes": 0
}
}
]
}
}
All destination clusters that you search need to be available for your search request to run successfully.
If even one domain is unavailable, the whole request fails and no search results are returned.
Kibana
You can visualize data from multiple connected domains in the same way as from a single domain,
except that you must access the remote indices using connection-alias:index. So, your index
pattern must match connection-alias:index.
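For example, using the alias from the earlier walkthrough, an index pattern such as the following
matches the remote index (the pattern is illustrative):

cluster_b:b_index*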
Learning to rank is an open source Elasticsearch plugin that lets you use machine learning and behavioral
data to tune the relevance of documents. The plugin uses models from the XGBoost and Ranklib libraries
to rescore the search results.
Learning to rank requires Elasticsearch 7.7 or later. Full documentation for the feature, including detailed
steps and API descriptions, is available in the Learning to Rank documentation.
Note
To use the Learning to Rank plugin, you must have full admin permissions. To learn more, see
the section called “Modifying the Master User” (p. 92).
Topics
• Getting Started with Learning to Rank (p. 162)
• Learning to Rank API (p. 174)
PUT _ltr
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : ".ltrstore"
}
This command creates a hidden .ltrstore index that stores metadata information such as feature sets
and models.
A judgment list is a collection of examples from which a machine learning model learns. Your judgment
list should include keywords that are important to you and a set of graded documents for each keyword.
In this example, we have a judgment list for a movie data set. A grade of 4 indicates a perfect match. A
grade of 0 indicates the worst match.
You can create this judgment list manually with the help of human annotators or infer it
programmatically from analytics data.
Build a feature set with a Mustache template for each feature. For more information about features, see
Working with Features.
In this example, we build a movie_features feature set with the title and overview fields:
POST _ltr/_featureset/movie_features
{
"featureset" : {
"name" : "movie_features",
"features" : [
{
"name" : "1",
"params" : [
"keywords"
],
"template_language" : "mustache",
"template" : {
"match" : {
"title" : "{{keywords}}"
}
}
},
{
"name" : "2",
"params" : [
"keywords"
],
"template_language" : "mustache",
"template" : {
"match" : {
"overview" : "{{keywords}}"
}
}
}
]
}
}
If you query the original .ltrstore index, you get back your feature set:
GET _ltr/_featureset
Combine the feature set and judgment list to log the feature values. For more information about logging
features, see Logging Feature Scores.
In this example, the bool query retrieves the graded documents with the filter and then selects the
feature set with the sltr query. The ltr_log query combines the documents and the features to log
the corresponding feature values:
POST tmdb/_search
{
"_source": {
"includes": [
"title",
"overview"
]
},
"query": {
"bool": {
"filter": [
{
"terms": {
"_id": [
"7555",
"1370",
"1369",
"1368"
]
}
},
{
"sltr": {
"_name": "logged_featureset",
"featureset": "movie_features",
"params": {
"keywords": "rambo"
}
}
}
]
}
},
"ext": {
"ltr_log": {
"log_specs": {
"name": "log_entry1",
"named_query": "logged_featureset"
}
}
}
}
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "1368",
"_score" : 0.0,
"_source" : {
"overview" : "When former Green Beret John Rambo is harassed by local law
enforcement and arrested for vagrancy, the Vietnam vet snaps, runs for the hills and rat-
a-tat-tats his way into the action-movie hall of fame. Hounded by a relentless sheriff,
Rambo employs heavy-handed guerilla tactics to shake the cops off his tail.",
"title" : "First Blood"
},
"fields" : {
"_ltrlog" : [
{
"log_entry1" : [
{
"name" : "1"
},
{
"name" : "2",
"value" : 10.558305
}
]
}
]
},
"matched_queries" : [
"logged_featureset"
]
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "7555",
"_score" : 0.0,
"_source" : {
"overview" : "When governments fail to act on behalf of captive missionaries,
ex-Green Beret John James Rambo sets aside his peaceful existence along the Salween River
in a war-torn region of Thailand to take action. Although he's still haunted by violent
memories of his time as a U.S. soldier during the Vietnam War, Rambo can hardly turn his
back on the aid workers who so desperately need his help.",
"title" : "Rambo"
},
"fields" : {
"_ltrlog" : [
{
"log_entry1" : [
{
"name" : "1",
"value" : 11.2569065
},
{
"name" : "2",
"value" : 9.936821
}
]
}
]
},
"matched_queries" : [
"logged_featureset"
]
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "1369",
"_score" : 0.0,
"_source" : {
"overview" : "Col. Troutman recruits ex-Green Beret John Rambo for a highly
secret and dangerous mission. Teamed with Co Bao, Rambo goes deep into Vietnam to rescue
POWs. Deserted by his own team, he's left in a hostile jungle to fight for his life,
avenge the death of a woman and bring corrupt officials to justice.",
"title" : "Rambo: First Blood Part II"
},
"fields" : {
"_ltrlog" : [
{
"log_entry1" : [
{
"name" : "1",
"value" : 6.334839
},
{
"name" : "2",
"value" : 10.558305
}
]
}
]
},
"matched_queries" : [
"logged_featureset"
]
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "1370",
"_score" : 0.0,
"_source" : {
"overview" : "Combat has taken its toll on Rambo, but he's finally begun to find
inner peace in a monastery. When Rambo's friend and mentor Col. Trautman asks for his help
on a top secret mission to Afghanistan, Rambo declines but must reconsider when Trautman
is captured.",
"title" : "Rambo III"
},
"fields" : {
"_ltrlog" : [
{
"log_entry1" : [
{
"name" : "1",
"value" : 9.425955
},
{
"name" : "2",
"value" : 11.262714
}
]
}
]
},
"matched_queries" : [
"logged_featureset"
]
}
]
}
}
In the above example, the first feature doesn’t have a feature value because the keyword “rambo”
doesn’t appear in the title field of the document with ID equal to 1368. This is a missing feature value in
the training data.
The next step is to combine the judgment list and feature values to create a training dataset. If your
original judgment list looks like this:
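For example, in the Ranklib format, a judgment list might look like the following; the grades, query ID,
and document IDs here are illustrative:

4 qid:1 # 7555 rambo
3 qid:1 # 1370 rambo
3 qid:1 # 1369 rambo
3 qid:1 # 1368 rambo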
Convert it into the final training dataset, which looks like this:
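Continuing the illustration, each row now carries the feature values logged earlier; the missing feature
value for document 1368 is written as 0.0:

4 qid:1 1:11.2569065 2:9.936821 # 7555 rambo
3 qid:1 1:9.425955 2:11.262714 # 1370 rambo
3 qid:1 1:6.334839 2:10.558305 # 1369 rambo
3 qid:1 1:0.0 2:10.558305 # 1368 rambo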
You can perform this step manually or write a program to automate it.
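A short script can automate the conversion. The following sketch (not from this guide) joins a judgment
dictionary with logged feature values to produce Ranklib-format rows; all names and values are
illustrative:

def build_training_rows(judgments, features, keywords):
    # judgments maps (qid, doc_id) to a grade; features maps (qid, doc_id)
    # to a dict of feature number -> logged value; keywords maps qid to terms
    rows = []
    for (qid, doc_id), grade in sorted(judgments.items()):
        values = features.get((qid, doc_id), {})
        # Missing feature values default to 0.0
        pairs = ' '.join('%d:%s' % (num, values.get(num, 0.0)) for num in (1, 2))
        rows.append('%d qid:%d %s # %s %s' % (grade, qid, pairs, doc_id, keywords[qid]))
    return rows

rows = build_training_rows(
    judgments={(1, '7555'): 4, (1, '1368'): 3},
    features={(1, '7555'): {1: 11.2569065, 2: 9.936821},
              (1, '1368'): {2: 10.558305}},
    keywords={1: 'rambo'},
)
print('\n'.join(rows))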
With the training dataset in place, the next step is to use XGBoost or Ranklib libraries to build a model.
XGBoost and Ranklib libraries let you build popular models such as LambdaMART, Random Forests, and
so on.
For steps to use XGBoost and Ranklib to build the model, see the XGBoost and RankLib documentation,
respectively. To use Amazon SageMaker to build the XGBoost model, see XGBoost Algorithm.
For example, a model that you build with Ranklib begins with a comment header that describes its
parameters:
## LambdaMART
## Number of trees = 5
## Number of leaves = 10
## Number of threshold candidates = 256
## Learning rate = 0.1
## Stop early = 100
POST _ltr/_featureset/movie_features/_createmodel
{
"model": {
"name": "my_ranklib_model",
"model": {
"type": "model/ranklib+json",
"definition": "<ensemble>
<tree id="1" weight="0.1">
<split>
<feature>1</feature>
<threshold>10.357876</threshold>
<split pos="left">
<feature>1</feature>
<threshold>0.0</threshold>
<split pos="left">
<output>-2.0</output>
</split>
<split pos="right">
<feature>1</feature>
<threshold>7.0105133</threshold>
<split pos="left">
<output>-2.0</output>
</split>
<split pos="right">
<output>-2.0</output>
</split>
</split>
</split>
<split pos="right">
<output>2.0</output>
</split>
</split>
</tree>
<tree id="2" weight="0.1">
<split>
<feature>1</feature>
<threshold>10.357876</threshold>
<split pos="left">
<feature>1</feature>
<threshold>0.0</threshold>
<split pos="left">
<output>-1.67031991481781</output>
</split>
<split pos="right">
<feature>1</feature>
<threshold>7.0105133</threshold>
<split pos="left">
<output>-1.67031991481781</output>
</split>
<split pos="right">
<output>-1.6703200340270996</output>
</split>
</split>
</split>
<split pos="right">
<output>1.6703201532363892</output>
</split>
</split>
</tree>
<tree id="3" weight="0.1">
<split>
<feature>2</feature>
<threshold>10.573917</threshold>
<split pos="left">
<output>1.479954481124878</output>
</split>
<split pos="right">
<feature>1</feature>
<threshold>7.0105133</threshold>
<split pos="left">
<feature>1</feature>
<threshold>0.0</threshold>
<split pos="left">
<output>-1.4799546003341675</output>
</split>
<split pos="right">
<output>-1.479954481124878</output>
</split>
</split>
<split pos="right">
<output>-1.479954481124878</output>
</split>
</split>
</split>
</tree>
<tree id="4" weight="0.1">
<split>
<feature>1</feature>
<threshold>10.357876</threshold>
<split pos="left">
<feature>1</feature>
<threshold>0.0</threshold>
<split pos="left">
<output>-1.3569872379302979</output>
</split>
<split pos="right">
<feature>1</feature>
<threshold>7.0105133</threshold>
<split pos="left">
<output>-1.3569872379302979</output>
</split>
<split pos="right">
<output>-1.3569872379302979</output>
</split>
</split>
</split>
<split pos="right">
<output>1.3569873571395874</output>
</split>
</split>
</tree>
<tree id="5" weight="0.1">
<split>
<feature>1</feature>
<threshold>10.357876</threshold>
<split pos="left">
<feature>1</feature>
<threshold>0.0</threshold>
<split pos="left">
<output>-1.2721362113952637</output>
</split>
<split pos="right">
<feature>1</feature>
<threshold>7.0105133</threshold>
<split pos="left">
<output>-1.2721363306045532</output>
</split>
<split pos="right">
<output>-1.2721363306045532</output>
</split>
</split>
</split>
<split pos="right">
<output>1.2721362113952637</output>
</split>
</split>
</tree>
</ensemble>"
}
}
}
GET _ltr/_model/my_ranklib_model
Perform the sltr query with the features that you’re using and the name of the model that you want to
execute:
POST tmdb/_search
{
"_source": {
"includes": ["title", "overview"]
},
"query": {
"multi_match": {
"query": "rambo",
"fields": ["title", "overview"]
}
},
"rescore": {
"query": {
"rescore_query": {
"sltr": {
"params": {
"keywords": "rambo"
},
"model": "my_ranklib_model"
}
}
}
}
}
With Learning to Rank, you see “Rambo” as the first result because we assigned it the highest grade
in the judgment list:
"took" : 12,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 7,
"relation" : "eq"
},
"max_score" : 13.096414,
"hits" : [
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "7555",
"_score" : 13.096414,
"_source" : {
"overview" : "When governments fail to act on behalf of captive missionaries,
ex-Green Beret John James Rambo sets aside his peaceful existence along the Salween River
in a war-torn region of Thailand to take action. Although he's still haunted by violent
memories of his time as a U.S. soldier during the Vietnam War, Rambo can hardly turn his
back on the aid workers who so desperately need his help.",
"title" : "Rambo"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "1370",
"_score" : 11.17245,
"_source" : {
"overview" : "Combat has taken its toll on Rambo, but he's finally begun to find
inner peace in a monastery. When Rambo's friend and mentor Col. Trautman asks for his help
on a top secret mission to Afghanistan, Rambo declines but must reconsider when Trautman
is captured.",
"title" : "Rambo III"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "1368",
"_score" : 10.442155,
"_source" : {
"overview" : "When former Green Beret John Rambo is harassed by local law
enforcement and arrested for vagrancy, the Vietnam vet snaps, runs for the hills and rat-
a-tat-tats his way into the action-movie hall of fame. Hounded by a relentless sheriff,
Rambo employs heavy-handed guerilla tactics to shake the cops off his tail.",
"title" : "First Blood"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "1369",
"_score" : 10.442155,
"_source" : {
"overview" : "Col. Troutman recruits ex-Green Beret John Rambo for a highly
secret and dangerous mission. Teamed with Co Bao, Rambo goes deep into Vietnam to rescue
POWs. Deserted by his own team, he's left in a hostile jungle to fight for his life,
avenge the death of a woman and bring corrupt officials to justice.",
"title" : "Rambo: First Blood Part II"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "31362",
"_score" : 7.424202,
"_source" : {
"overview" : "It is 1985, and a small, tranquil Florida town is being rocked
by a wave of vicious serial murders and bank robberies. Particularly sickening to the
authorities is the gratuitous use of violence by two “Rambo” like killers who dress
themselves in military garb. Based on actual events taken from FBI files, the movie
depicts the Bureau’s efforts to track down these renegades.",
"title" : "In the Line of Duty: The F.B.I. Murders"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "13258",
"_score" : 6.43182,
"_source" : {
"overview" : """Will Proudfoot (Bill Milner) is looking for an escape from his
family's stifling home life when he encounters Lee Carter (Will Poulter), the school
bully. Armed with a video camera and a copy of "Rambo: First Blood", Lee plans to make
cinematic history by filming his own action-packed video epic. Together, these two
newfound friends-turned-budding-filmmakers quickly discover that their imaginative # and
sometimes mishap-filled # cinematic adventure has begun to take on a life of its own!""",
"title" : "Son of Rambow"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "61410",
"_score" : 3.9719706,
"_source" : {
"overview" : "It's South Africa 1990. Two major events are about to happen: The
release of Nelson Mandela and, more importantly, it's Spud Milton's first year at an elite
boys only private boarding school. John Milton is a boy from an ordinary background who
wins a scholarship to a private school in Kwazulu-Natal, South Africa. Surrounded by boys
with nicknames like Gecko, Rambo, Rain Man and Mad Dog, Spud has his hands full trying to
adapt to his new home. Along the way Spud takes his first tentative steps along the path
to manhood. (The path it seems could be a rather long road). Spud is an only child. He
is cursed with parents from well beyond the lunatic fringe and a senile granny. His dad
is a fervent anti-communist who is paranoid that the family domestic worker is running
a shebeen from her room at the back of the family home. His mom is a free spirit and
a teenager's worst nightmare, whether it's shopping for Spud's underwear in the local
supermarket",
"title" : "Spud"
}
}
]
}
}
If you search without using the Learning to Rank plugin, Elasticsearch returns different results:
POST tmdb/_search
{
"_source": {
"includes": ["title", "overview"]
},
"query": {
"multi_match": {
"query": "Rambo",
"fields": ["title", "overview"]
}
}
}
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "eq"
},
"max_score" : 11.262714,
"hits" : [
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "1370",
"_score" : 11.262714,
"_source" : {
"overview" : "Combat has taken its toll on Rambo, but he's finally begun to find
inner peace in a monastery. When Rambo's friend and mentor Col. Trautman asks for his help
on a top secret mission to Afghanistan, Rambo declines but must reconsider when Trautman
is captured.",
"title" : "Rambo III"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "7555",
"_score" : 11.2569065,
"_source" : {
"overview" : "When governments fail to act on behalf of captive missionaries,
ex-Green Beret John James Rambo sets aside his peaceful existence along the Salween River
in a war-torn region of Thailand to take action. Although he's still haunted by violent
memories of his time as a U.S. soldier during the Vietnam War, Rambo can hardly turn his
back on the aid workers who so desperately need his help.",
"title" : "Rambo"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "1368",
"_score" : 10.558305,
"_source" : {
"overview" : "When former Green Beret John Rambo is harassed by local law
enforcement and arrested for vagrancy, the Vietnam vet snaps, runs for the hills and rat-
a-tat-tats his way into the action-movie hall of fame. Hounded by a relentless sheriff,
Rambo employs heavy-handed guerilla tactics to shake the cops off his tail.",
"title" : "First Blood"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "1369",
"_score" : 10.558305,
"_source" : {
"overview" : "Col. Troutman recruits ex-Green Beret John Rambo for a highly
secret and dangerous mission. Teamed with Co Bao, Rambo goes deep into Vietnam to rescue
POWs. Deserted by his own team, he's left in a hostile jungle to fight for his life,
avenge the death of a woman and bring corrupt officials to justice.",
"title" : "Rambo: First Blood Part II"
}
},
{
"_index" : "tmdb",
"_type" : "movie",
"_id" : "13258",
"_score" : 6.4600153,
"_source" : {
"overview" : """Will Proudfoot (Bill Milner) is looking for an escape from his
family's stifling home life when he encounters Lee Carter (Will Poulter), the school
bully. Armed with a video camera and a copy of "Rambo: First Blood", Lee plans to make
cinematic history by filming his own action-packed video epic. Together, these two
newfound friends-turned-budding-filmmakers quickly discover that their imaginative # and
sometimes mishap-filled # cinematic adventure has begun to take on a life of its own!""",
"title" : "Son of Rambow"
}
}
]
}
}
Based on how well you think the model is performing, adjust the judgment list and features. Then,
repeat steps 2-8 to improve the ranking results over time.
Create Store
This command creates a hidden .ltrstore index that stores metadata information such as feature sets
and models.
PUT _ltr
Delete Store
Deletes the hidden .ltrstore index and resets the plugin.
DELETE _ltr
Create Feature Set
Creates a feature set.
POST _ltr/_featureset/<name_of_feature_set>
Delete Feature Set
Deletes a feature set.
DELETE _ltr/_featureset/<name_of_feature_set>
Get Feature Set
Retrieves a feature set.
GET _ltr/_featureset/<name_of_feature_set>
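When you create a feature set, the request body defines each feature as a templated query. The
following is a minimal sketch in the plugin's feature set format; the feature set name, feature name,
field, and parameter are all illustrative:
POST _ltr/_featureset/movie_features
{
  "featureset": {
    "features": [{
      "name": "title_query",
      "params": ["keywords"],
      "template_language": "mustache",
      "template": {
        "match": {
          "title": "{{keywords}}"
        }
      }
    }]
  }
}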
Create Model
Creates a model.
POST _ltr/_featureset/<name_of_feature_set>/_createmodel
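The request body wraps a trained model definition, such as a RankLib model produced during your
training step. A sketch with illustrative names; the definition itself is elided:
POST _ltr/_featureset/movie_features/_createmodel
{
  "model": {
    "name": "my_ranklib_model",
    "model": {
      "type": "model/ranklib",
      "definition": "...RankLib model definition..."
    }
  }
}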
Delete Model
Deletes a model.
DELETE _ltr/_model/<name_of_model>
Get Model
Retrieves a model.
GET _ltr/_model/<name_of_model>
Get Stats
Provides information about how the plugin is behaving.
GET _opendistro/_ltr/stats
To limit the response to certain nodes or statistics, specify them in the path:
GET _opendistro/_ltr/<node_id>,<node_id>/stats/<stat>,<stat>
{
"_nodes" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"cluster_name" : "873043598401:ltr-77",
"stores" : {
".ltrstore" : {
"model_count" : 1,
"featureset_count" : 1,
"feature_count" : 2,
"status" : "green"
}
},
"status" : "green",
"nodes" : {
"DjelK-_ZSfyzstO5dhGGQA" : {
"cache" : {
"feature" : {
"eviction_count" : 0,
"miss_count" : 0,
"entry_count" : 0,
"memory_usage_in_bytes" : 0,
"hit_count" : 0
},
"featureset" : {
"eviction_count" : 2,
"miss_count" : 2,
"entry_count" : 0,
"memory_usage_in_bytes" : 0,
"hit_count" : 0
},
"model" : {
"eviction_count" : 2,
"miss_count" : 3,
"entry_count" : 1,
"memory_usage_in_bytes" : 3204,
"hit_count" : 1
}
},
"request_total_count" : 6,
"request_error_count" : 0
}
}
}
The statistics are provided at two levels, node and cluster.
Get Cache Stats
Returns statistics about the plugin's cache usage:
GET _opendistro/_ltr/_cachestats
{
"_nodes": {
"total": 2,
"successful": 2,
"failed": 0
},
"cluster_name": "es-cluster",
"all": {
"total": {
"ram": 612,
"count": 1
},
"features": {
"ram": 0,
"count": 0
},
"featuresets": {
"ram": 612,
"count": 1
},
"models": {
"ram": 0,
"count": 0
}
},
"stores": {
".ltrstore": {
"total": {
"ram": 612,
"count": 1
},
"features": {
"ram": 0,
"count": 0
},
"featuresets": {
"ram": 612,
"count": 1
},
"models": {
"ram": 0,
"count": 0
}
}
},
"nodes": {
"ejF6uutERF20wOFNOXB61A": {
"name": "elasticsearch3",
"hostname": "172.18.0.4",
"stats": {
"total": {
"ram": 612,
"count": 1
},
"features": {
"ram": 0,
"count": 0
},
"featuresets": {
"ram": 612,
"count": 1
},
"models": {
"ram": 0,
"count": 0
}
}
},
"Z2RZNWRLSveVcz2c6lHf5A": {
"name": "elasticsearch1",
"hostname": "172.18.0.2",
"stats": {
...
}
}
}
}
Clear Cache
Clears the plugin cache. Use this to refresh the model.
POST _opendistro/_ltr/_clearcache
Kibana
Kibana is a popular open source visualization tool designed to work with Elasticsearch. Amazon ES
provides an installation of Kibana with every Amazon ES domain. You can find a link to Kibana on your
domain dashboard on the Amazon ES console. The URL is domain-endpoint/_plugin/kibana/.
Queries using this default Kibana installation have a 300-second timeout.
Because Kibana is a JavaScript application, requests originate from the user's IP address. IP-based access
control might be impractical due to the sheer number of IP addresses you would need to allow in order
for each user to have access to Kibana. One workaround is to place a proxy server between Kibana and
Amazon ES. Then you can add an IP-based access policy that allows requests from only one IP address,
the proxy's. The following diagram shows this configuration.
1. This is your Amazon ES domain. IAM provides authorized access to this domain. An additional, IP-
based access policy provides access to the proxy server.
2. This is the proxy server, running on an Amazon EC2 instance.
3. Other applications can use the Signature Version 4 signing process to send authenticated requests to
Amazon ES.
4. Kibana clients connect to your Amazon ES domain through the proxy.
To enable this sort of configuration, you need a resource-based policy that specifies roles and IP
addresses. Here's a sample policy:
{
"Version": "2012-10-17",
"Statement": [{
"Resource": "arn:aws:es:us-west-2:111111111111:domain/my-domain/*",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/allowedrole1"
},
"Action": [
"es:ESHttpGet"
],
"Effect": "Allow"
},
{
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": "es:*",
"Condition": {
"IpAddress": {
"aws:SourceIp": [
"123.456.789.123"
]
}
},
"Resource": "arn:aws:es:us-west-2:111111111111:domain/my-domain/*"
}
]
}
We recommend that you configure the EC2 instance running the proxy server with an Elastic IP address.
This way, you can replace the instance when necessary and still attach the same public IP address to it. To
learn more, see Elastic IP Addresses in the Amazon EC2 User Guide for Linux Instances.
If you use a proxy server and the section called “Authentication for Kibana” (p. 100), you might need to
add settings for Kibana and Amazon Cognito to avoid redirect_mismatch errors. See the following
nginx.conf example:
server {
    listen 443;
    server_name $host;
    rewrite ^/$ https://$host/_plugin/kibana redirect;

    ssl_certificate /etc/nginx/cert.crt;
    ssl_certificate_key /etc/nginx/cert.key;
    ssl on;
    ssl_session_cache builtin:1000 shared:SSL:10m;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
    ssl_prefer_server_ciphers on;

    location /_plugin/kibana {
        # Forward requests to Kibana
        proxy_pass https://$kibana_host/_plugin/kibana;
    }

    location ~ \/(log|sign|fav|forgot|change|saml|oauth2) {
        # Forward requests to Cognito
        proxy_pass https://$cognito_host;
    }
}
Configuring Kibana to Use a WMS Map Server
To configure Kibana to use a Web Map Service (WMS) map server:
1. Open Kibana.
2. Choose Management.
3. Choose Advanced Settings.
4. Locate visualization:tileMap:WMSdefaults.
5. Change enabled to true and url to the URL of a valid WMS map server:
{
"enabled": true,
"url": "wms-server-url",
"options": {
"format": "image/png",
"transparent": true
}
}
6. Choose Save.
To apply the new default value to visualizations, you might need to reload Kibana. If you have saved
visualizations, choose Options after opening the visualization. Verify that WMS map server is enabled
and WMS url contains your preferred map server, and then choose Apply changes.
Note
Map services often have licensing fees or restrictions. You are responsible for all such
considerations on any map server that you specify. You might find the map services from the
U.S. Geological Survey useful for testing.
If you run your own Kibana instance instead of the default installation, configure it to connect to your
Amazon ES domain in kibana.yml:
kibana.index: ".kibana_1"
# Use elasticsearch.url for versions older than 6.6
# elasticsearch.url: "https://domain-endpoint:443"
# Use elasticsearch.hosts for versions 6.6 and later
elasticsearch.hosts: "https://domain-endpoint:443"
Older versions of Elasticsearch might only work over HTTP. In all cases, add the http or https prefix.
For older versions, you must explicitly specify port 80 or 443. For newer versions, you can omit the port.
Managing Indices
After you add data to Amazon Elasticsearch Service, you often need to reindex that data, work with
index aliases, move an index to more cost-effective storage, or delete it altogether. This chapter covers
UltraWarm storage and Index State Management. For information on the Elasticsearch index APIs, see
the Open Distro for Elasticsearch documentation.
Topics
• UltraWarm for Amazon Elasticsearch Service (p. 184)
• Index State Management (p. 190)
• Using Curator to Rotate Data in Amazon Elasticsearch Service (p. 193)
UltraWarm for Amazon Elasticsearch Service
UltraWarm provides a cost-effective way to store large amounts of read-only data in Amazon
Elasticsearch Service. Rather than attached storage, UltraWarm nodes use Amazon S3 and a sophisticated caching solution
to improve performance. For indices that you are not actively writing to and query less frequently,
UltraWarm offers significantly lower costs per GiB of data. In Elasticsearch, these warm indices behave
just like any other index. You can query them using the same APIs or use them to create dashboards in
Kibana.
Topics
• Prerequisites (p. 184)
• Calculating UltraWarm Storage Requirements (p. 185)
• UltraWarm Pricing (p. 185)
• Enabling UltraWarm (p. 185)
• Migrating Indices to UltraWarm Storage (p. 186)
• Automating Migrations (p. 188)
• Listing Hot and Warm Indices (p. 188)
• Returning Warm Indices to Hot Storage (p. 189)
• Restoring Warm Indices from Automated Snapshots (p. 189)
• Manual Snapshots of Warm Indices (p. 189)
• Disabling UltraWarm (p. 190)
Prerequisites
UltraWarm has a few important prerequisites:
• If your domain uses a T2 instance type for your data nodes, you can't use warm storage.
Because it uses Amazon S3, UltraWarm incurs none of the overhead of hot storage, such as disk space
for replicas and the operating system and service reserves. When calculating UltraWarm
storage requirements, you consider only the size of the primary shards. The durability of data
in S3 removes the need for replicas, and S3 abstracts away any operating system or service
considerations. A shard that requires 10 GiB in hot storage requires only 10 GiB of warm storage. If you provision an
ultrawarm1.large.elasticsearch instance, you can use all 20 TiB of its maximum storage for
primary shards. See the section called “UltraWarm Storage Limits” (p. 232) for a summary of instance
types and the maximum amount of storage that each can address.
Tip
With UltraWarm, we still recommend a maximum shard size of 50 GiB.
UltraWarm Pricing
With hot storage, you pay for what you provision. Some instances require an attached Amazon EBS
volume, while others include an instance store. Whether that storage is empty or full, you pay the same
price.
With UltraWarm storage, you pay for what you use. An ultrawarm1.large.elasticsearch instance
can address up to 20 TiB of storage on S3, but if you store only 1 TiB of data, you're only billed for 1
TiB of data. Like all other node types, you also pay an hourly rate for each UltraWarm node. For more
information, see the section called “Pricing for Amazon ES” (p. 3).
Enabling UltraWarm
The console is the simplest way to create a domain that uses warm storage. While creating the domain,
choose Enable UltraWarm data nodes and the number of warm nodes that you want. The same basic
process works on existing domains, provided they meet the prerequisites (p. 184). Even after the
domain state changes from Processing to Active, UltraWarm might not be available to use for several
hours.
You can also use the AWS CLI or configuration API (p. 271) to enable UltraWarm, specifically the
WarmEnabled, WarmCount, and WarmType options in ElasticsearchClusterConfig.
Note
Domains support a maximum number of warm nodes. For details, see the section called
“Limits” (p. 231).
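For an existing domain that meets the prerequisites, a configuration API request along the following
lines enables warm storage. This is a sketch; the endpoint Region, domain name, and warm node count
are illustrative:
POST https://es.us-east-2.amazonaws.com/2015-01-01/es/domain/my-domain/config
{
  "ElasticsearchClusterConfig": {
    "WarmEnabled": true,
    "WarmCount": 2,
    "WarmType": "ultrawarm1.medium.elasticsearch"
  }
}
The following request creates a new domain with warm storage enabled from the start: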
POST https://es.us-east-2.amazonaws.com/2015-01-01/es/domain
{
"ElasticsearchClusterConfig": {
"InstanceCount": 3,
"InstanceType": "r5.large.elasticsearch",
"DedicatedMasterEnabled": true,
"DedicatedMasterType": "c5.large.elasticsearch",
"DedicatedMasterCount": 3,
"ZoneAwarenessEnabled": true,
"ZoneAwarenessConfig": {
"AvailabilityZoneCount": 3
},
"WarmEnabled": true,
"WarmCount": 6,
"WarmType": "ultrawarm1.medium.elasticsearch"
},
"EBSOptions": {
"EBSEnabled": true,
"VolumeType": "gp2",
"VolumeSize": 11
},
"EncryptionAtRestOptions": {
"Enabled": true
},
"NodeToNodeEncryptionOptions": {
"Enabled": true
},
"DomainEndpointOptions": {
"EnforceHTTPS": true,
"TLSSecurityPolicy": "Policy-Min-TLS-1-2-2019-07"
},
"ElasticsearchVersion": "6.8",
"DomainName": "my-domain",
"AccessPolicies": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow
\",\"Principal\":{\"AWS\":[\"123456789012\"]},\"Action\":[\"es:*\"],\"Resource\":
\"arn:aws:es:us-east-1:123456789012:domain/my-domain/*\"}]}"
}
For detailed information, see Amazon ES Configuration API Reference (p. 271).
Migrating Indices to UltraWarm Storage
To migrate an index from hot to warm storage, make the following request:
POST _ultrawarm/migration/my-index/_warm
Then check the status of the migration:
GET _ultrawarm/migration/my-index/_status
{
"migration_status": {
"index": "my-index",
"state": "RUNNING_SHARD_RELOCATION",
"migration_type": "HOT_TO_WARM",
"shard_level_status": {
"running": 0,
"total": 5,
"pending": 3,
"failed": 0,
"succeeded": 2
}
}
}
If you migrate several indices in quick succession, you can get a summary of all migrations in plaintext,
similar to the _cat API:
GET _ultrawarm/migration/_status?v
You can have up to 25 simultaneous migrations from hot to warm storage. To check the current number,
monitor the HotToWarmMigrationQueueSize metric (p. 36).
The status response includes one of the following migration states:
PENDING_INCREMENTAL_SNAPSHOT
RUNNING_INCREMENTAL_SNAPSHOT
FAILED_INCREMENTAL_SNAPSHOT
PENDING_FORCE_MERGE
RUNNING_FORCE_MERGE
FAILED_FORCE_MERGE
PENDING_FULL_SNAPSHOT
RUNNING_FULL_SNAPSHOT
FAILED_FULL_SNAPSHOT
PENDING_SHARD_RELOCATION
RUNNING_SHARD_RELOCATION
FINISHED_SHARD_RELOCATION
As these states indicate, migrations might fail during snapshots, shard relocations, or force merges.
Failures during snapshots or shard relocation are typically due to node failures or S3 connectivity issues.
Lack of disk space is usually the underlying cause of force merge failures.
After a migration finishes, the same _status request returns an error. If you check the index at that
time, you can see some settings that are unique to warm indices:
GET my-index/_settings
{
"my-index": {
"settings": {
"index": {
"refresh_interval": "-1",
"auto_expand_replicas": "false",
"provided_name": "my-index",
"creation_date": "1572886951679",
"unassigned": {
"node_left": {
"delayed_timeout": "5m"
}
},
"number_of_replicas": "1",
"uuid": "3iyTkhXvR8Cytc6sWKBirg",
"version": {
"created": "6080099"
},
"routing": {
"allocation": {
"require": {
"box_type": "warm"
}
}
},
"number_of_shards": "5"
}
}
}
}
• number_of_replicas, in this case, is the number of passive replicas, which don't consume disk
space.
• routing.allocation.require.box_type specifies that the index should use warm nodes rather
than standard data nodes.
Indices in warm storage are read-only unless you return them to hot storage (p. 189). You can query
the indices and delete them, but you can't add, update, or delete individual documents. If you try, you
might encounter the following error:
{
"error": {
"root_cause": [{
"type": "cluster_block_exception",
"reason": "blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"
}],
"type": "cluster_block_exception",
"reason": "blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];"
},
"status": 403
}
Automating Migrations
We recommend using the section called “Index State Management” (p. 190) to automate the migration
process after an index reaches a certain age or meets other conditions. The sample policy here (p. 191)
demonstrates that workflow.
Listing Hot and Warm Indices
UltraWarm adds two options, similar to _all, to help you manage hot and warm indices. To list all warm
or hot indices, make the following requests:
GET _warm
GET _hot
You can use these options in other requests that specify indices, such as:
_cat/indices/_warm
_cluster/state/_all/_hot
Returning Warm Indices to Hot Storage
To return a warm index to hot storage, make the following request:
POST _ultrawarm/migration/my-index/_hot
You can have up to 10 simultaneous migrations from warm to hot storage. To check the current number,
monitor the WarmToHotMigrationQueueSize metric (p. 36).
After the migration finishes, check the index settings to make sure they meet your needs. Indices return
to hot storage with one replica.
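A quick way to confirm the replica count after the migration is to filter the settings response; this is a
sketch using the standard filter_path parameter:
GET my-index/_settings?filter_path=*.settings.index.number_of_replicas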
Restoring Warm Indices from Automated Snapshots
Amazon ES stores automated snapshots of warm indices in a separate repository, cs-ultrawarm.
Unlike other automated snapshots, each snapshot in this repository contains only one index. When you
restore a snapshot from cs-ultrawarm, it restores to warm storage, not hot storage. Snapshots in the
cs-automated and cs-automated-enc repositories restore to hot storage.
1. Identify the latest snapshot that contains the index that you want to restore:
GET _snapshot/cs-ultrawarm/_all
{
"snapshots": [{
"snapshot": "snapshot-name",
"version": "6.8.0",
"indices": [
"my-index"
]
}]
}
2. If the index still exists, delete it:
DELETE my-index
If you don't want to delete the index, return it to hot storage (p. 189) and reindex it.
3. Restore the snapshot:
POST _snapshot/cs-ultrawarm/snapshot-name/_restore
UltraWarm ignores any index settings you specify in this restore request, but you can specify options
like rename_pattern and rename_replacement. For a summary of Elasticsearch snapshot restore
options, see the Open Distro for Elasticsearch documentation.
Manual Snapshots of Warm Indices
By default, Amazon ES does not include warm indices in manual snapshots. For example, the following
call only includes hot indices:
PUT _snapshot/my-repository/my-snapshot
If you choose to take manual snapshots of warm indices, several important considerations apply.
• You can't mix hot and warm indices. For example, the following request fails:
PUT _snapshot/my-repository/my-snapshot
{
"indices": "warm-index-1,hot-index-1",
"include_global_state": false
}
If they include a mix of hot and warm indices, wildcard (*) statements fail, as well.
• You can only include one warm index per snapshot. For example, the following request fails:
PUT _snapshot/my-repository/my-snapshot
{
"indices": "warm-index-1,warm-index-2,other-warm-indices-*",
"include_global_state": false
}
The following request, which includes only one warm index, succeeds:
PUT _snapshot/my-repository/my-snapshot
{
"indices": "warm-index-1",
"include_global_state": false
}
• Manual snapshots always restore to hot storage, even if they originally included a warm index.
Disabling UltraWarm
The console is the simplest way to disable UltraWarm. Choose the domain, Edit domain, uncheck Enable
UltraWarm data nodes, and Submit. You can also use the WarmEnabled option in the AWS CLI and
configuration API.
Before you disable UltraWarm, you must either delete all warm indices or migrate them back to hot
storage. After warm storage is empty, wait five minutes before attempting to disable the feature.
Index State Management
Index State Management (ISM) lets you define custom management policies to automate routine tasks
and apply them to indices and index patterns.
A policy contains a default state and a list of states for the index to transition between. Within each
state, you can define a list of actions to perform and conditions that trigger these transitions. A typical
use case is to periodically delete old indices after a certain period of time.
For example, you can define a policy that moves your index into a read_only state after 30 days and
then ultimately deletes it after 90 days.
ISM requires Elasticsearch 6.8 or later. Full documentation for the feature is available in the Open Distro
for Elasticsearch documentation.
Note
After you attach a policy to an index, ISM creates a job that runs every 30 to 48 minutes to
perform policy actions, check conditions, and transition the index into different states. The base
time for this job to run is every 30 minutes, plus a random 0-60% jitter is added to it to make
sure you do not see a surge of activity from all your indices at the same time.
Sample Policies
This first sample policy moves an index from hot storage to UltraWarm (p. 184) storage after seven
days and deletes the index after 90 days.
In this case, an index is initially in the hot state. After seven days, ISM moves it to the warm state. 83
days later, the service sends a notification to an Amazon Chime room that the index is being deleted and
then permanently deletes it.
{
"policy": {
"description": "Demonstrate a hot-warm-delete workflow.",
"default_state": "hot",
"schema_version": 1,
"states": [{
"name": "hot",
"actions": [],
"transitions": [{
"state_name": "warm",
"conditions": {
"min_index_age": "7d"
}
}]
},
{
"name": "warm",
"actions": [{
"warm_migration": {},
"timeout": "24h",
"retry": {
"count": 5,
"delay": "1h"
}
}],
"transitions": [{
"state_name": "delete",
"conditions": {
"min_index_age": "90d"
}
}]
},
{
"name": "delete",
"actions": [{
"notification": {
"destination": {
"chime": {
"url": "<URL>"
}
},
"message_template": {
"source": "The index {{ctx.index}} is being deleted."
}
}
},
{
"delete": {}
}
]
}
]
}
}
This second, simpler sample policy reduces replica count to zero after seven days to conserve disk space
and then deletes the index after 21 days. This policy assumes your index is non-critical and no longer
receiving write requests; having zero replicas carries some risk of data loss.
{
"policy": {
"description": "Changes replica count and deletes.",
"schema_version": 1,
"default_state": "current",
"states": [{
"name": "current",
"actions": [],
"transitions": [{
"state_name": "old",
"conditions": {
"min_index_age": "7d"
}
}]
},
{
"name": "old",
"actions": [{
"replica_count": {
"number_of_replicas": 0
}
}],
"transitions": [{
"state_name": "delete",
"conditions": {
"min_index_age": "21d"
}
}]
},
{
"name": "delete",
"actions": [{
"delete": {}
}],
"transitions": []
}
]
}
}
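To put a policy like either sample to work, you first create it and then attach it to one or more indices. A
sketch using the Open Distro ISM API; the policy ID and index pattern are illustrative, and the policy body
is elided:
PUT _opendistro/_ism/policies/hot-warm-delete
{
  "policy": {
    ...
  }
}
POST _opendistro/_ism/add/my-index-*
{
  "policy_id": "hot-warm-delete"
}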
Differences
Compared to Open Distro for Elasticsearch, ISM for Amazon Elasticsearch Service has several differences.
ISM Operations
Amazon ES supports a unique ISM operation, warm_migration. If your domain has
UltraWarm (p. 184) enabled, this action transitions the index to warm storage. The warm_migration
action has a default timeout of 12 hours. For large clusters, you might need to change this value, as
shown in the sample policy (p. 191).
Amazon ES does not support the following ISM operations:
• open
• close
• snapshot
ISM Settings
Open Distro for Elasticsearch lets you change all available ISM settings using the _cluster/settings
API. On Amazon ES, you can only change the following settings:
• Cluster-level settings:
• enabled
• history.enabled
• Index-level settings:
• rollover_alias
• policy_id
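As a sketch, the cluster-level settings live in the opendistro.index_state_management namespace
in Open Distro; assuming that setting name, a request like the following turns off ISM history indices:
PUT _cluster/settings
{
  "persistent": {
    "opendistro.index_state_management.history.enabled": false
  }
}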
Using Curator to Rotate Data in Amazon Elasticsearch Service
Although Curator is often used as a command line interface (CLI), it also features a Python API, which
means that you can use it within Lambda functions.
For information about configuring Lambda functions and creating deployment packages, see the section
called “Loading Streaming Data into Amazon ES from Amazon S3” (p. 131). For even more information,
see the AWS Lambda Developer Guide. This section contains only sample code, basic settings, triggers,
and permissions.
Topics
• Sample Code (p. 193)
• Basic Settings (p. 196)
• Triggers (p. 196)
• Permissions (p. 196)
Sample Code
The following sample code uses Curator and elasticsearch-py to delete any index whose name contains
a time stamp indicating that the data is more than 30 days old. For example, if an index name is my-
logs-2014.03.02, the index is deleted. Deletion occurs even if you create the index today, because this
filter uses the name of the index to determine its age.
The code also contains some commented-out examples of other common filters, including one that
determines age by creation date. The AWS SDK for Python (Boto3) and requests-aws4auth library sign
the requests to Amazon ES.
Warning
Both code samples in this section delete data—potentially a lot of data. Modify and test each
sample on a non-critical domain until you're satisfied with its behavior.
Index Deletion
import boto3
from requests_aws4auth import AWS4Auth
from elasticsearch import Elasticsearch, RequestsHttpConnection
import curator

# Update these values to match your domain.
host = '' # For example, search-my-domain.region.es.amazonaws.com
region = '' # For example, us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service,
    session_token=credentials.token)

# Build the Elasticsearch client that Curator uses.
es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)

# A test document.
document = {
    "title": "Moneyball",
    "director": "Bennett Miller",
    "year": "2011"
}

# Index the test document so that we have an index that matches the timestring pattern.
# You can delete this line and the test document if you already created some test indices.
es.index(index="movies-2017.01.31", doc_type="movie", id="1", body=document)

index_list = curator.IndexList(es)
# Filter by age, anything with a time stamp older than 30 days in the index name.
index_list.filter_by_age(source='name', direction='older', timestring='%Y.%m.%d',
    unit='days', unit_count=30)
# Delete the matching indices. If the filtered list is empty, skip the call.
if index_list.indices:
    curator.DeleteIndices(index_list).do_action()
The next code sample deletes any snapshot that is more than two weeks old. It also takes a new
snapshot.
Snapshot Deletion
import boto3
from datetime import datetime
from requests_aws4auth import AWS4Auth
from elasticsearch import Elasticsearch, RequestsHttpConnection
import logging
import curator

# Update these values to match your domain and snapshot repository.
host = '' # For example, search-my-domain.region.es.amazonaws.com
region = '' # For example, us-west-1
service = 'es'
repository_name = '' # The name of your registered snapshot repository
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service,
    session_token=credentials.token)

# Adding a logger isn't strictly required, but helps with understanding Curator's
# requests and debugging.
logger = logging.getLogger('curator')
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.INFO)

# Taking and deleting snapshots can be slow, so use a generous client timeout.
es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    timeout=120
)

now = datetime.now()
snapshot_name = 'snapshot-' + now.strftime('%Y%m%d%H%M%S')

try:
    # Get all snapshots in the repository.
    snapshot_list = curator.SnapshotList(es, repository=repository_name)
    # Filter by age, any snapshot older than two weeks.
    snapshot_list.filter_by_age(source='creation_date', direction='older',
        unit='weeks', unit_count=2)
    # Delete the old snapshots.
    curator.DeleteSnapshots(snapshot_list, retry_interval=30, retry_count=3).do_action()
except (curator.exceptions.SnapshotInProgress, curator.exceptions.NoSnapshots,
        curator.exceptions.FailedExecution) as e:
    print(e)

# Split into two try blocks. We still want to try and take a snapshot if deletion failed.
try:
    # Get the list of indices.
    # You can filter this list if you didn't want to snapshot all indices.
    index_list = curator.IndexList(es)
    # Take a new snapshot. This operation can take a while, so we don't want to wait
    # for it to complete.
    curator.Snapshot(index_list, repository=repository_name, name=snapshot_name,
        wait_for_completion=False).do_action()
except (curator.exceptions.SnapshotInProgress, curator.exceptions.FailedExecution) as e:
    print(e)
You must update the values for host, region, snapshot_name, and repository_name. If the output
is too verbose for your taste, you can change logging.INFO to logging.WARN.
Because taking and deleting snapshots can take a while, this code is more sensitive to connection and
Lambda timeouts—hence the extra logging code. In the Elasticsearch client, you can see that we set
the timeout to 120 seconds. If the DeleteSnapshots function takes longer to get a response from the
Amazon ES domain, you might need to increase this value. You must also increase the Lambda function
timeout from its default value of three seconds. For a recommended value, see the section called “Basic
Settings” (p. 196).
Basic Settings
We recommend the following settings for these code samples.
Triggers
Rather than reacting to some event (such as a file upload to Amazon S3), these functions are meant to
run on a schedule, such as a CloudWatch Events scheduled rule. Depending on your needs, you might
prefer to run these functions more or less frequently.
Permissions
Both Lambda functions in this section need the basic logging permissions that all Lambda functions
need, plus HTTP method permissions for the Amazon ES domain:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "logs:CreateLogGroup",
"Resource": "arn:aws:logs:us-west-1:123456789012:*"
},
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": [
"arn:aws:logs:us-west-1:123456789012:log-group:/aws/lambda/your-lambda-function:*"
]
},
{
"Effect": "Allow",
"Action": [
"es:ESHttpPost",
"es:ESHttpGet",
"es:ESHttpPut",
"es:ESHttpDelete"
],
"Resource": "arn:aws:es:us-west-1:123456789012:domain/my-domain/*"
}
]
}
Monitoring Data
Amazon Elasticsearch Service lets you monitor your data proactively with the alerting and anomaly
detection features.
Alerting notifies you when your data exceeds certain thresholds. Anomaly detection uses machine
learning to automatically detect any outliers in your streaming data. You can pair anomaly detection
with alerting for Amazon ES to notify you as soon as an anomaly is detected.
Topics
• Alerting for Amazon Elasticsearch Service (p. 198)
• Anomaly Detection for Amazon Elasticsearch Service (p. 199)
Alerting for Amazon Elasticsearch Service
Alerting requires Elasticsearch 6.2 or higher. Full documentation for the feature is available in the Open
Distro for Elasticsearch documentation.
Differences
Compared to Open Distro for Elasticsearch, the Amazon Elasticsearch Service alerting feature has some
notable differences.
For example, Amazon ES supports Amazon SNS as an alerting destination. To add an SNS destination:
1. Open Kibana.
2. Choose Alerting.
3. Choose the Destinations tab and then Add Destination.
4. Provide a unique name for the destination.
5. For Type, choose Amazon SNS.
6. Provide the SNS topic ARN.
7. Provide the ARN for an IAM role within your account that has the following trust relationship and
permissions (at minimum):
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Service": "es.amazonaws.com"
},
"Action": "sts:AssumeRole"
}]
}
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": "sns:Publish",
"Resource": "sns-topic-arn"
}]
}
For more information, see Adding IAM Identity Permissions in the IAM User Guide.
8. Choose Create.
Alerting Settings
Open Distro for Elasticsearch lets you modify certain alerting settings using the _cluster/settings
API (for example, opendistro.alerting.monitor.max_monitors). Amazon ES uses the default
values, and you can't change them.
You can, however, disable the alerting feature. Send the following request:
PUT _cluster/settings
{
"persistent" : {
"opendistro.scheduled_jobs.enabled" : false
}
}
If you previously created monitors and want to stop the creation of daily alerting indices, delete all alert
history indices:
DELETE .opendistro-alerting-alert-history-*
Alerting Permissions
To use the Amazon ES alerting feature on a domain that uses fine-grained access control (p. 77), you
must map the all_access role to your user or backend role.
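A sketch of that mapping using the Open Distro security REST API; the user name and backend role ARN
are illustrative:
PUT _opendistro/_security/api/rolesmapping/all_access
{
  "backend_roles": ["arn:aws:iam::123456789012:role/my-alerting-role"],
  "users": ["my-user"]
}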
Anomaly Detection for Amazon Elasticsearch Service
Anomaly detection uses the Random Cut Forest (RCF) algorithm to detect anomalies in near-real time.
RCF is an unsupervised machine learning algorithm that models a sketch of your incoming data stream.
It computes an anomaly grade and confidence score value for each incoming data point. The
anomaly detection feature uses these values to differentiate an anomaly from normal variations in your
data.
You can pair the anomaly detection plugin with the section called “Alerting” (p. 198) plugin to
notify you as soon as an anomaly is detected.
Anomaly detection requires Elasticsearch 7.4 or later. Full documentation for the feature, including
detailed steps and API descriptions, is available in the Open Distro for Elasticsearch documentation.
Note
To use the anomaly detection plugin, your user role must be mapped to the master role that
gives you full access to the domain. To learn more, see the section called “Modifying the Master
User” (p. 92).
A detector finds anomalies in one or more features of your data, where each feature pairs an index field
with an aggregation method. For example, if you choose min(), the detector focuses on finding anomalies based on the minimum
values of your feature. If you choose average(), the detector finds anomalies based on the average
values of your feature.
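As a sketch of what a detector definition might look like through the Open Distro anomaly detection
API, assuming an index of server metrics; the detector name, index pattern, time field, and feature are
all illustrative:
POST _opendistro/_anomaly_detection/detectors
{
  "name": "cpu-detector",
  "description": "Detect anomalies in average CPU usage",
  "time_field": "timestamp",
  "indices": ["server-metrics-*"],
  "detection_interval": {
    "period": { "interval": 10, "unit": "Minutes" }
  },
  "feature_attributes": [{
    "feature_name": "avg_cpu",
    "feature_enabled": true,
    "aggregation_query": {
      "avg_cpu": { "avg": { "field": "cpu_usage" } }
    }
  }]
}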
• The Live anomalies chart displays the live anomaly results for the last 60 intervals. For example, if
the interval is set to 10, it shows the results for the last 600 minutes. This chart refreshes every 30
seconds.
• The Anomaly history chart plots the anomaly grade with the corresponding measure of confidence.
• The Feature breakdown graph plots the features based on the aggregation method. You can vary the
date-time range of the detector.
• The Anomaly occurrence table shows the Start time, End time, Data confidence, and
Anomaly grade for each anomaly detected.
Best Practices for Amazon Elasticsearch Service
This chapter addresses some best practices for operating Amazon Elasticsearch Service domains and
provides general guidelines that apply to many use cases:
• Apply a restrictive resource-based access policy (p. 65) to the domain (or enable fine-grained access
control), and follow the principle of least privilege when granting access to the configuration API and
the Elasticsearch APIs.
• Configure at least one replica, the Elasticsearch default, for each index.
• Use three dedicated master nodes (p. 208).
• Deploy the domain across three Availability Zones. This configuration lets Amazon ES distribute replica
shards to different Availability Zones than their corresponding primary shards. For a list of Regions
that have three Availability Zones and some other considerations, see the section called “Configuring a
Multi-AZ Domain” (p. 17).
• Upgrade to the latest Elasticsearch versions (p. 52) as they become available on Amazon Elasticsearch
Service.
• Update to the latest service software (p. 15) as it becomes available.
• Size the domain appropriately for your workload. For storage volume, shard size, and data node
recommendations, see the section called “Sizing Amazon ES Domains” (p. 203) and the section
called “Petabyte Scale” (p. 207). For dedicated master node recommendations, see the section called
“Dedicated Master Nodes” (p. 208).
• Have no more than 1,000 shards on any data node. This limit is the default in Elasticsearch 7.x and
later. For a more nuanced guideline, see the section called “Choosing the Number of Shards” (p. 205).
• Use the latest-generation instances available on the service. For example, use I3 instances rather than
I2 instances.
• Don't use burstable instances for production domains. For example, don't use T2 instances as data
nodes or dedicated master nodes.
• If appropriate for your network configuration, create the domain within a VPC (p. 20).
• If your domain stores sensitive data, enable encryption of data at rest (p. 62) and node-to-node
encryption (p. 64).
Topics
• Sizing Amazon ES Domains (p. 203)
• Petabyte Scale for Amazon Elasticsearch Service (p. 207)
• Dedicated Master Nodes (p. 208)
• Recommended CloudWatch Alarms (p. 210)
Sizing Amazon ES Domains
There is no perfect method of sizing Amazon ES domains, but by starting with an understanding of your
storage needs, the service, and Elasticsearch itself, you can make an educated initial estimate on your
hardware needs. This estimate can serve as a useful starting point for the most critical aspect of sizing
domains: testing them with representative workloads and monitoring their performance.
Topics
• Calculating Storage Requirements (p. 204)
• Choosing the Number of Shards (p. 205)
• Choosing Instance Types and Testing (p. 206)
Calculating Storage Requirements
Most Elasticsearch workloads fall into one of two broad categories:
• Long-lived index: You write code that processes data into one or more Elasticsearch indices and then
updates those indices periodically as the source data changes. Some common examples are website,
document, and ecommerce search.
• Rolling indices: Data continuously flows into a set of temporary indices, with an indexing period
and retention window, such as a set of daily indices that is retained for two weeks. Some common
examples are log analytics, time-series processing, and clickstream analytics.
For long-lived index workloads, you can examine the source data on disk and easily determine how much
storage space it consumes. If the data comes from multiple sources, just add those sources together.
For rolling indices, you can multiply the amount of data generated during a representative time period
by the retention period. For example, if you generate 200 MiB of log data per hour, that's 4.7 GiB per day,
which is 66 GiB of data at any given time if you have a two-week retention period.
The size of your source data, however, is just one aspect of your storage requirements. You also have to
consider the following:
1. Number of replicas: Each replica is a full copy of an index and needs the same amount of disk space.
By default, each Elasticsearch index has one replica. We recommend at least one to prevent data
loss. Replicas also improve search performance, so you might want more if you have a read-heavy
workload.
2. Elasticsearch indexing overhead: The on-disk size of an index varies, but is often 10% larger than the
source data. After indexing your data, you can use the _cat/indices?v API and pri.store.size
value to calculate the exact overhead. _cat/allocation?v also provides a useful summary.
3. Operating system reserved space: By default, Linux reserves 5% of the file system for the root user
for critical processes, system recovery, and to safeguard against disk fragmentation problems.
4. Amazon ES overhead: Amazon ES reserves 20% of the storage space of each instance (up to 20 GiB)
for segment merges, logs, and other internal operations.
Because of this 20 GiB maximum, the total amount of reserved space can vary dramatically
depending on the number of instances in your domain. For example, a domain might have
three m4.xlarge.elasticsearch instances, each with 500 GiB of storage space, for a total
of 1.46 TiB. In this case, the total reserved space is only 60 GiB. Another domain might have 10
m3.medium.elasticsearch instances, each with 100 GiB of storage space, for a total of 0.98 TiB.
Here, the total reserved space is 200 GiB, even though the first domain is 50% larger.
In the following formula, we apply a "worst-case" estimate for overhead that includes additional free
space to help minimize the impact of node failures and Availability Zone outages.
In summary, if you have 66 GiB of data at any given time and want one replica, your minimum storage
requirement is closer to 66 * 2 * 1.1 / 0.95 / 0.8 = 191 GiB. You can generalize this calculation as follows:
Insufficient storage space is one of the most common causes of cluster instability, so you should cross-
check the numbers when you choose instance types, instance counts, and storage volumes (p. 206).
• If your minimum storage requirement exceeds 1 PB, see the section called “Petabyte Scale” (p. 207).
• If you have rolling indices and want to use a hot-warm architecture, see the section called
“UltraWarm” (p. 184).
Choosing the Number of Shards
The overarching goal of choosing a number of shards is to distribute an index evenly across all data
nodes in the cluster. However, these shards shouldn't be too large or too numerous. A good rule of
thumb is to try to keep shard size between 10–50 GiB. Large shards can make it difficult for Elasticsearch
to recover from failure, but because each shard uses some amount of CPU and memory, having too many
small shards can cause performance issues and out of memory errors. In other words, shards should be
small enough that the underlying Amazon ES instance can handle them, but not so small that they place
needless strain on the hardware.
For example, suppose you have 66 GiB of data. You don't expect that number to increase over time,
and you want to keep your shards around 30 GiB each. Your number of shards therefore should be
approximately 66 * 1.1 / 30 = 3. You can generalize this calculation as follows:
(Source Data + Room to Grow) * (1 + Indexing Overhead) / Desired Shard Size = Approximate Number
of Primary Shards
This equation helps compensate for growth over time. If you expect those same 66 GiB of data to
quadruple over the next year, the approximate number of shards is (66 + 198) * 1.1 / 30 = 10. Remember,
though, you don't have those extra 198 GiB of data yet. Check to make sure that this preparation for the
future doesn't create unnecessarily tiny shards that consume huge amounts of CPU and memory in the
present. In this case, 66 * 1.1 / 10 shards = 7.26 GiB per shard, which will consume extra resources and
is below the recommended size range. You might consider the more middle-of-the-road approach of six
shards, which leaves you with 12 GiB shards today and 48 GiB shards in the future. Then again, you might
prefer to start with three shards and reindex your data when the shards exceed 50 GiB.
A far less common issue involves limiting the number of shards per node. If you size your shards
appropriately, you typically run out of disk space long before encountering this limit. For example, an
m5.large.elasticsearch instance has a maximum disk size of 512 GiB. If you stay below 80% disk
usage and size your shards at 20 GiB, it can accommodate approximately 20 shards. Elasticsearch 7.x
and later have a limit of 1,000 shards per node, adjustable using the cluster.max_shards_per_node
setting.
Sizing shards appropriately almost always keeps you below this limit, but you can also consider the
number of shards for each GiB of Java heap. On a given node, have no more than 20 shards per GiB of
Java heap. For example, an m5.large.elasticsearch instance has a 4 GiB heap, so each node should
have no more than 80 shards. At that shard count, each shard is roughly 5 GiB in size, which is well below
our recommendation.
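To see how many shards each node currently holds (the shards column), along with disk usage, you can
use the _cat allocation API:
GET _cat/allocation?v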
Choosing Instance Types and Testing
In general, the storage limits (p. 231) for each instance type map to the amount of CPU and memory
that you might need for light workloads. For example, an m4.large.elasticsearch instance has
a maximum EBS volume size of 512 GiB, 2 vCPU cores, and 8 GiB of memory. If your cluster has many
shards, performs taxing aggregations, updates documents frequently, or processes a large number of
queries, those resources might be insufficient for your needs. If you believe your cluster falls into one of
these categories, try starting with a configuration closer to 2 vCPU cores and 8 GiB of memory for every
100 GiB of your storage requirement.
Tip
For a summary of the hardware resources that are allocated to each instance type, see Amazon
Elasticsearch Service Pricing.
Still, even those resources might be insufficient. Some Elasticsearch users report that they need many
times those resources to fulfill their requirements. Finding the right hardware for your workload means
making an educated initial estimate, testing with representative workloads, adjusting, and testing again:
1. To start, we recommend a minimum of three nodes to avoid potential Elasticsearch issues, such as
split brain. If you have three dedicated master nodes (p. 208), we still recommend a minimum of two
data nodes for replication.
2. If you have a 184 GiB storage requirement and the recommended minimum number of three nodes,
use the equation 184 / 3 = 61 GiB to find the amount of storage that each node needs. In this
example, you might select three m5.large.elasticsearch instances, each using a 90 GiB EBS
storage volume so that you have a safety net and some room for growth over time. This configuration
provides 6 vCPU cores and 24 GiB of memory, so it's suited to lighter workloads.
For a more substantial example, consider a 14 TiB (14,336 GiB) storage requirement and a
heavy workload. Following the guideline of 2 vCPU cores and 8 GiB of memory for every 100 GiB of
storage, 14,336 / 100 rounds to 144 units, so you might begin testing with 2 * 144 = 288 vCPU
cores and 8 * 144 = 1,152 GiB of memory. These numbers work out to approximately 18
i3.4xlarge.elasticsearch instances. If you don't need the fast, local storage, you could also test
18 r5.4xlarge.elasticsearch instances, each using a 1 TiB EBS storage volume.
If your cluster includes hundreds of terabytes of data, see the section called “Petabyte
Scale” (p. 207).
3. After configuring the cluster, you can add your indices (p. 129) using the number of shards you
calculated earlier, perform some representative client testing using a realistic dataset, and monitor
CloudWatch metrics (p. 27) to see how the cluster handles the workload.
4. If performance satisfies your needs, tests succeed, and CloudWatch metrics are normal, the cluster is
ready to use. Remember to set CloudWatch alarms (p. 210) to detect unhealthy resource usage.
If performance isn't acceptable, tests fail, or CPUUtilization or JVMMemoryPressure are high, you
might need to choose a different instance type (or add instances) and continue testing. As you add
instances, Elasticsearch automatically rebalances the distribution of shards throughout the cluster.
Because it is easier to measure the excess capacity in an overpowered cluster than the deficit in an
underpowered one, we recommend starting with a larger cluster than you think you need. Next, test
and scale down to an efficient cluster that has the extra resources to ensure stable operations during
periods of increased activity.
Production clusters or clusters with complex states benefit from dedicated master nodes (p. 208),
which improve performance and cluster reliability.
Petabyte Scale for Amazon Elasticsearch Service
While this section frequently references the i3.16xlarge.elasticsearch instance types, you can use
several other instance types to reach 1 PB of total domain storage.
Creating domains
Domains of this size exceed the default limit of 40 instances per domain. To request a service limit
increase of up to 200 instances per domain, open a case at the AWS Support Center.
Pricing
Before creating a domain of this size, check the Amazon Elasticsearch Service Pricing page to ensure
that the associated costs match your expectations. Examine the section called “UltraWarm” (p. 184)
to see if a hot-warm architecture fits your use case.
Storage
The i3 instance types are designed to provide fast, local non-volatile memory express (NVMe)
storage. Because this local storage tends to offer performance benefits when compared to Amazon
Elastic Block Store, EBS volumes are not an option when you select these instance types in Amazon
ES. If you prefer EBS storage, use another instance type, such as r5.12xlarge.elasticsearch.
Shard size and count
A common Elasticsearch guideline is not to exceed 50 GB per shard. Given the number
of shards necessary to accommodate large domains and the resources available to
i3.16xlarge.elasticsearch instances, we recommend a shard size of 100 GB.
For example, if you have 450 TB of source data and want one replica, your minimum storage
requirement is closer to 450 TB * 2 * 1.1 / 0.95 = 1.04 PB. For an explanation of this calculation, see
the section called “Calculating Storage Requirements” (p. 204). Although 1.04 PB / 15 TB = 70
instances, you might select 90 or more i3.16xlarge.elasticsearch instances to give yourself
a storage safety net, deal with node failures, and account for some variance in the amount of data
over time. Each instance adds another 20 GiB to your minimum storage requirement, but for disks of
this size, those 20 GiB are almost negligible.
Controlling the number of shards is tricky. Elasticsearch users often rotate indices on a daily basis
and retain data for a week or two. In this situation, you might find it useful to distinguish between
"active" and "inactive" shards. Active shards are, well, actively being written to or read from. Inactive
shards might service some read requests, but are largely idle. In general, you should keep the
number of active shards below a few thousand. As the number of active shards approaches 10,000,
considerable performance and stability risks emerge.
To calculate the number of primary shards, use this formula: 450,000 GB * 1.1 / 100 GB per shard =
4,950 shards. Doubling that number to account for replicas is 9,900 shards, which represents a major
concern if all shards are active. But if you rotate indices and only 1/7th or 1/14th of the shards are
active on any given day (1,414 or 707 shards, respectively), the cluster might work well. As always,
the most important step of sizing and configuring your domain is to perform representative client
testing using a realistic dataset.
Dedicated Master Nodes
Amazon ES uses dedicated master nodes to increase cluster stability. A dedicated master node performs
cluster management tasks, but does not hold data or respond to data upload requests.
We recommend that you add three dedicated master nodes to each production Amazon ES domain.
Never choose an even number of dedicated master nodes.
1. One dedicated master node means that you have no backup in the event of a failure.
2. Two dedicated master nodes means that your cluster does not have the necessary quorum of nodes to
elect a new master node in the event of a failure.
A quorum is the number of dedicated master nodes / 2 + 1 (rounded down to the nearest whole
number), which Amazon ES sets to discovery.zen.minimum_master_nodes when you create your
domain.
In this case, 2 / 2 + 1 = 2. Because one dedicated master node has failed and only one backup exists,
the cluster doesn't have a quorum and can't elect a new master.
3. Three dedicated master nodes, the recommended number, provides two backup nodes in the event of
a master node failure and the necessary quorum (2) to elect a new master.
4. Four dedicated master nodes are no better than three and can cause issues if you use multiple
Availability Zones (p. 17).
• If one master node fails, you have the quorum (3) to elect a new master. If two nodes fail, you lose
that quorum, just as you do with three dedicated master nodes.
• In a three Availability Zone configuration, two AZs have one dedicated master node, and one AZ has
two. If that AZ experiences a disruption, the remaining two AZs don't have the necessary quorum (3)
to elect a new master.
5. Having five dedicated master nodes works as well as three and allows you to lose two nodes while
maintaining a quorum. But because only one dedicated master node is active at any given time,
this configuration means paying for four idle nodes. Many users find this level of failover protection
excessive.
If a cluster has an even number of master-eligible nodes, Elasticsearch versions 7.x and later ignore one
node so that the voting configuration is always an odd number. In this case, four dedicated master nodes
are essentially equivalent to three (and two to one).
Note
If your cluster doesn't have the necessary quorum to elect a new master node, write and read
requests to the cluster both fail. This behavior differs from the Elasticsearch default.
Dedicated master nodes perform cluster management tasks such as the following:
• Replicate changes to the cluster state across all nodes in the cluster
• Monitor the health of all cluster nodes by sending heartbeat signals, periodic signals that monitor the
availability of the data nodes in the cluster
The following illustration shows an Amazon ES domain with ten instances. Seven of the instances are
data nodes and three are dedicated master nodes. Only one of the dedicated master nodes is active; the
two gray dedicated master nodes wait as backup in case the active dedicated master node fails. All data
upload requests are served by the seven data nodes, and all cluster management tasks are offloaded to
the active dedicated master node.
Although dedicated master nodes don't process search and query requests, their size is highly correlated
with the number of instances, indices, and shards that they can manage. For production clusters, we
recommend the following instance types for dedicated master nodes. These recommendations are based
on typical workloads and can vary based on your needs. Clusters with many shards or field mappings can
benefit from larger instance types. Monitor the dedicated master node metrics (p. 210) to see if you
need to use a larger instance type.
Instance Count    Recommended Minimum Dedicated Master Instance Type
1–10              c5.large.elasticsearch
10–30             c5.xlarge.elasticsearch
30–75             c5.2xlarge.elasticsearch
75–200            r5.4xlarge.elasticsearch
• For information about how certain configuration changes can affect dedicated master nodes, see the
section called “Configuration Changes” (p. 14).
• For clarification on instance count limits, see the section called “Cluster and Instance Limits” (p. 231).
• For more information about specific instance types, including vCPU, memory, and pricing, see Amazon
Elasticsearch Instance Prices.
Recommended CloudWatch Alarms
CloudWatch alarms perform an action when a CloudWatch metric exceeds a specified value for some
amount of time. The following alarms are recommended for most Amazon ES domains.
For more information about setting alarms, see Creating Amazon CloudWatch Alarms in the Amazon
CloudWatch User Guide.
Alarm: ClusterStatus.red maximum is >= 1 for 1 minute, 1 consecutive time
Issue: At least one primary shard and its replicas are not allocated to a node. See the section called
“Red Cluster Status” (p. 262).

Alarm: ClusterStatus.yellow maximum is >= 1 for 1 minute, 1 consecutive time
Issue: At least one replica shard is not allocated to a node. See the section called “Yellow Cluster
Status” (p. 264).

Alarm: FreeStorageSpace minimum is <= 20480 for 1 minute, 1 consecutive time
Issue: A node in your cluster is down to 20 GiB of free storage space. See the section called “Lack of
Available Storage Space” (p. 264). This value is in MiB, so rather than 20480, we recommend setting it
to 25% of the storage space for each node.

Alarm: Nodes minimum is < x for 1 day, 1 consecutive time
Issue: x is the number of nodes in your cluster. This alarm indicates that at least one node in your
cluster has been unreachable for one day. See the section called “Failed Cluster Nodes” (p. 265).

Alarm: CPUUtilization maximum is >= 80% for 15 minutes, 3 consecutive times
Issue: 100% CPU utilization isn't uncommon, but sustained high usage is problematic. Consider using
larger instance types or adding instances.

Alarm: JVMMemoryPressure maximum is >= 80% for 5 minutes, 3 consecutive times
Issue: The cluster could encounter out of memory errors if usage increases. Consider scaling vertically.
Amazon ES uses half of an instance's RAM for the Java heap, up to a heap size of 32 GiB. You can scale
instances vertically up to 64 GiB of RAM, at which point you can scale horizontally by adding instances.

Alarm: MasterCPUUtilization maximum is >= 50% for 15 minutes, 3 consecutive times
Issue: Consider using larger instance types for your dedicated master nodes (p. 208). Because of their
role in cluster stability and blue/green deployments (p. 14), dedicated master nodes should have lower
CPU usage than data nodes.

Alarm: MasterJVMMemoryPressure maximum is >= 80% for 15 minutes, 1 consecutive time
Issue: Consider using larger instance types for your dedicated master nodes (p. 208).

Alarm: KMSKeyError is >= 1 for 1 minute, 1 consecutive time
Issue: The KMS encryption key that is used to encrypt data at rest in your domain is disabled. Re-enable
it to restore normal operations. For more information, see the section called “Encryption at
Rest” (p. 62).

Alarm: KMSKeyInaccessible is >= 1 for 1 minute, 1 consecutive time
Issue: The KMS encryption key that is used to encrypt data at rest in your domain has been deleted
or has revoked its grants to Amazon ES. You can't recover domains that are in this state, but if you
have a manual snapshot, you can use it to migrate to a new domain. To learn more, see the section called
“Encryption at Rest” (p. 62).
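As a sketch, the first alarm in this list might be created with the AWS CLI as follows; the domain name,
account ID, Region, and SNS topic are illustrative:
aws cloudwatch put-metric-alarm \
  --alarm-name my-domain-cluster-status-red \
  --namespace AWS/ES \
  --metric-name ClusterStatus.red \
  --dimensions Name=DomainName,Value=my-domain Name=ClientId,Value=123456789012 \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 1 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:sns:us-west-1:123456789012:my-sns-topic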
Note
If you just want to view metrics, see Monitoring CloudWatch Metrics (p. 27).
Amazon Elasticsearch Service General Reference
Topics
• Supported Instance Types (p. 212)
• Features by Elasticsearch Version (p. 213)
• Plugins by Elasticsearch Version (p. 214)
• Supported Elasticsearch Operations (p. 216)
• Amazon Elasticsearch Service Limits (p. 231)
• Amazon Elasticsearch Service Reserved Instances (p. 236)
• Other Supported Resources (p. 239)
Supported Instance Types
Amazon ES supports the following instance types. For information about which instance type is
appropriate for your use case, see the section called “Sizing Amazon ES Domains” (p. 203), the section
called “EBS Volume Size Limits” (p. 232), and the section called “Network Limits” (p. 234).
Instance Type    Restrictions
C4               None.
I2               None.
I3               The I3 instance types require Elasticsearch version 5.1 or later and do not support
                 EBS storage volumes.
M3               The M3 instance types do not support encryption of data at rest, fine-grained access
                 control, or cross-cluster search.
M4               None.
R4               None.
T2               • You can use the T2 instance types only if the instance count for your domain is 10
                   or fewer.
                 • The t2.micro.elasticsearch instance type supports only Elasticsearch 1.5
                   and 2.3.
                 • The T2 instance types do not support encryption of data at rest, fine-grained
                   access control, UltraWarm storage, or cross-cluster search.
                 • The t2.micro and t2.small instance types do not support anomaly detection.
Tip
You can use different instance types for dedicated master nodes (p. 208) and data nodes.
Features by Elasticsearch Version
Feature                                        Minimum Elasticsearch Version
Require HTTPS for all traffic to the domain
Multi-AZ support
Dedicated master nodes
Custom packages
Curator CLI support
Encryption of data at rest
Cognito authentication for Kibana
In-place Elasticsearch upgrades
Hourly automated snapshots                     5.3
Node-to-node encryption                        6.0
Java high-level REST client support
HTTP request and response compression
Alerting                                       6.2
SQL                                            6.5
Cross-cluster search                           6.7
Fine-grained access control                    6.7
UltraWarm                                      6.8
KNN                                            7.1
Anomaly Detection                              7.4
For information about plugins, which enable some of these features and additional functionality, see the
section called “Plugins by Elasticsearch Version” (p. 214). For information about the Elasticsearch API
for each version, see the section called “Supported Elasticsearch Operations” (p. 216).
Plugins by Elasticsearch Version
Plugin                            Minimum Elasticsearch Version
Japanese (kuromoji) Analysis
Phonetic Analysis                 2.3
Seunjeon Korean Analysis          5.1
Smart Chinese Analysis
Stempel Polish Analysis
Ingest Attachment Processor
Ingest User Agent Processor
Mapper Murmur3
Ukrainian Analysis
Note
This table is not comprehensive. Amazon ES uses additional plugins to enable core service
functionality, such as the S3 Repository plugin for snapshots and the Open Distro for
Elasticsearch Performance Analyzer plugin for optimization and monitoring.
Supported Elasticsearch Operations
The following sections list the Elasticsearch operations that Amazon ES supports for each Elasticsearch
version.
Topics
• Notable API Differences (p. 216)
• Version 7.7 (p. 218)
• Version 7.4 (p. 219)
• Version 7.1 (p. 220)
• Version 6.8 (p. 220)
• Version 6.7 (p. 221)
• Version 6.5 (p. 222)
• Version 6.4 (p. 223)
• Version 6.3 (p. 224)
• Version 6.2 (p. 225)
• Version 6.0 (p. 225)
• Version 5.6 (p. 226)
• Version 5.5 (p. 227)
• Version 5.3 (p. 228)
• Version 5.1 (p. 229)
• Version 2.3 (p. 229)
• Version 1.5 (p. 230)
Notable API Differences
Settings
Amazon ES only accepts PUT requests to the _cluster/settings API that use the "flat" settings
form. It rejects requests that use the expanded settings form:
// Accepted
PUT _cluster/settings
{
"persistent" : {
"action.auto_create_index" : false
}
}
// Rejected
PUT _cluster/settings
{
"persistent": {
"action": {
"auto_create_index": false
}
}
}
The high-level Java REST client uses the expanded form, so if you need to send settings requests, use the
low-level client.
Prior to Elasticsearch 5.3, the _cluster/settings API on Amazon ES domains supported only the
HTTP PUT method, not the GET method. Later versions support the GET method, as shown in the
following example:
GET https://domain.region.es.amazonaws.com/_cluster/settings?pretty
{
"persistent": {
"cluster": {
"routing": {
"allocation": {
"cluster_concurrent_rebalance": "2",
"node_concurrent_recoveries": "2",
"disk": {
"watermark": {
"low": "1.35gb",
"flood_stage": "0.45gb",
"high": "0.9gb"
}
},
"node_initial_primaries_recoveries": "4"
}
}
},
"indices": {
"recovery": {
"max_bytes_per_sec": "40mb"
}
}
}
}
If you compare responses from an open source Elasticsearch cluster and Amazon ES for certain settings
and statistics APIs, you might notice missing fields. Amazon ES redacts certain information that exposes
service internals, such as the file system data path from _nodes/stats or the operating system name
and version from _nodes.
Shrink
The _shrink API can cause upgrades, configuration changes, and domain deletions to fail. We don't
recommend using it on domains that run Elasticsearch versions 5.3 or 5.1. These versions have a bug that
can cause snapshot restoration of shrunken indices to fail.
If you use the _shrink API on other Elasticsearch versions, make the following request before starting
the shrink operation:
PUT https://domain.region.es.amazonaws.com/source-index/_settings
{
"settings": {
"index.routing.allocation.require._name": "name-of-the-node-to-shrink-to",
"index.blocks.read_only": true
}
}
Then make the following requests after completing the shrink operation:
PUT https://domain.region.es.amazonaws.com/source-index/_settings
{
"settings": {
"index.routing.allocation.require._name": null,
"index.blocks.read_only": false
}
}
PUT https://domain.region.es.amazonaws.com/shrunken-index/_settings
{
"settings": {
"index.routing.allocation.require._name": null,
"index.blocks.read_only": false
}
}
Version 7.7
For Elasticsearch 7.7, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 7.4
For Elasticsearch 7.4, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 7.1
For Elasticsearch 7.1, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 6.8
For Elasticsearch 6.8, Amazon ES supports the following operations.
• All operations (for example, /index-name/_forcemerge and /index-name/update/id) except /index-name/_close 1
• /_alias
• /_aliases
• /_all
• /_analyze
• /_bulk
• /_cat (except /_cat/nodeattrs)
• /_cluster/allocation/explain
• /_cluster/health
• /_cluster/pending_tasks
• /_cluster/settings for several properties 4:
  • action.auto_create_index
  • action.search.shard_count.limit
  • indices.breaker.fielddata.limit
  • indices.breaker.request.limit
  • indices.breaker.total.limit
  • cluster.max_shards_per_node
  • cluster.blocks.read_only
• /_cluster/stats
• /_count
• /_delete_by_query 1
• /_explain
• /_field_caps
• /_field_stats
• /_flush
• /_ingest/pipeline
• /_mapping
• /_mget
• /_msearch
• /_mtermvectors
• /_nodes
• /_opendistro/_alerting
• /_opendistro/_ism
• /_opendistro/_security
• /_opendistro/_sql
• /_percolate
• /_plugin/kibana
• /_rank_eval
• /_reindex 1
• /_render
• /_rollover
• /_scripts 3
• /_search 2
• /_search profile
• /_shard_stores
• /_shrink 5
• /_snapshot
• /_split
• /_stats
• /_status
• /_tasks
• /_template
• /_update_by_query 1
• /_validate
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 6.7
For Elasticsearch 6.7, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 6.5
For Elasticsearch 6.5, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 6.4
For Elasticsearch 6.4, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 6.3
For Elasticsearch 6.3, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
Version 6.2
For Elasticsearch 6.2, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 6.0
For Elasticsearch 6.0, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 5.6
For Elasticsearch 5.6, Amazon ES supports the following operations.
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 5.5
For Elasticsearch 5.5, Amazon ES supports the following operations.
• /_reindex 1
• action.search.shard_count.limit
• indices.breaker.fielddata.limit
• indices.breaker.request.limit
• indices.breaker.total.limit
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. For considerations about using scripts, see the section called “Other Supported Resources” (p. 239).
4. Refers to the PUT method. For information about the GET method, see the section called “Notable API
Differences” (p. 216).
5. See the section called “Shrink” (p. 217).
Version 5.3
For Elasticsearch 5.3, Amazon ES supports the following operations.
• indices.breaker.fielddata.limit
• indices.breaker.request.limit
• indices.breaker.total.limit
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
Version 5.1
For Elasticsearch 5.1, Amazon ES supports the following operations.
• indices.breaker.fielddata.limit
• indices.breaker.request.limit
• indices.breaker.total.limit
1. Cluster configuration changes might interrupt these operations before completion. We recommend
that you use the /_tasks operation along with these operations to verify that the requests
completed successfully.
2. DELETE requests to /_search/scroll with a message body must specify "Content-Length" in
the HTTP header. Most clients add this header by default. To avoid a problem with = characters in
scroll_id values, use the request body, not the query string, to pass scroll_id values to Amazon
ES.
3. See the section called “Shrink” (p. 217).
Version 2.3
For Elasticsearch 2.3, Amazon ES supports the following operations.
Version 1.5
For Elasticsearch 1.5, Amazon ES supports the following operations.
• threadpool.percolate.queue_size
• threadpool.search.queue_size
• threadpool.suggest.queue_size
Maximum number of data nodes per cluster: 40 (except for the T2 instance types, which have a maximum of 10)

Smallest supported instance type per Elasticsearch version: t2.micro.elasticsearch (versions 1.5 and 2.3) and t2.small.elasticsearch (versions 5.x and 6.x)
For a list of the instance types that Amazon ES supports, see Supported Instance Types (p. 212).
ultrawarm1.large.elasticsearch: 20 TiB maximum storage
• If you choose magnetic storage under EBS volume type when creating your domain, the maximum
volume size is 100 GiB for all instance types except t2.micro, t2.small, and t2.medium. For the
maximum sizes listed in the following table, choose one of the SSD options.
• 512 GiB is the maximum volume size for Elasticsearch version 1.5.
• Some older-generation instance types include instance storage, but also support EBS storage. If you
choose EBS storage for one of these instance types, the storage volumes are not additive. You can use
either an EBS volume or the instance storage, not both.
Network Limits
The following table shows the maximum size of HTTP request payloads.
Instance Type              Maximum Size of HTTP Request Payloads

t2.micro.elasticsearch     10 MiB
t2.small.elasticsearch     10 MiB
t2.medium.elasticsearch    10 MiB
m3.medium.elasticsearch    10 MiB
m3.large.elasticsearch     10 MiB
m3.xlarge.elasticsearch    100 MiB
m3.2xlarge.elasticsearch   100 MiB
m4.large.elasticsearch     10 MiB
m4.xlarge.elasticsearch    100 MiB
m4.2xlarge.elasticsearch   100 MiB
m4.4xlarge.elasticsearch   100 MiB
m4.10xlarge.elasticsearch  100 MiB
m5.large.elasticsearch     10 MiB
m5.xlarge.elasticsearch    100 MiB
m5.2xlarge.elasticsearch   100 MiB
m5.4xlarge.elasticsearch   100 MiB
m5.12xlarge.elasticsearch  100 MiB
c4.large.elasticsearch     10 MiB
c4.xlarge.elasticsearch    100 MiB
c4.2xlarge.elasticsearch   100 MiB
c4.4xlarge.elasticsearch   100 MiB
c4.8xlarge.elasticsearch   100 MiB
c5.large.elasticsearch     10 MiB
c5.xlarge.elasticsearch    100 MiB
c5.2xlarge.elasticsearch   100 MiB
c5.4xlarge.elasticsearch   100 MiB
c5.9xlarge.elasticsearch   100 MiB
c5.18xlarge.elasticsearch  100 MiB
r3.large.elasticsearch     10 MiB
r3.xlarge.elasticsearch    100 MiB
r3.2xlarge.elasticsearch   100 MiB
r3.4xlarge.elasticsearch   100 MiB
r3.8xlarge.elasticsearch   100 MiB
r4.xlarge.elasticsearch    100 MiB
r4.2xlarge.elasticsearch   100 MiB
r4.4xlarge.elasticsearch   100 MiB
r4.8xlarge.elasticsearch   100 MiB
r4.16xlarge.elasticsearch  100 MiB
r5.xlarge.elasticsearch    100 MiB
r5.2xlarge.elasticsearch   100 MiB
r5.4xlarge.elasticsearch   100 MiB
r5.12xlarge.elasticsearch  100 MiB
i2.xlarge.elasticsearch    100 MiB
i2.2xlarge.elasticsearch   100 MiB
i3.xlarge.elasticsearch    100 MiB
i3.2xlarge.elasticsearch   100 MiB
i3.4xlarge.elasticsearch   100 MiB
i3.8xlarge.elasticsearch   100 MiB
i3.16xlarge.elasticsearch  100 MiB
Amazon ES RIs require one- or three-year terms and have three payment options that affect the discount
rate:
• No Upfront – You pay nothing upfront. You pay a discounted hourly rate for every hour within the
term.
• Partial Upfront – You pay a portion of the cost upfront, and you pay a discounted hourly rate for every
hour within the term.
• All Upfront – You pay the entirety of the cost upfront. You don't pay an hourly rate for the term.
Generally speaking, a larger upfront payment means a larger discount. You can't cancel Reserved
Instances—when you reserve them, you commit to paying for the entire term—and upfront payments
are nonrefundable. For full details, see Amazon Elasticsearch Service Pricing and FAQ.
Topics
• Purchasing Reserved Instances (Console) (p. 237)
• Purchasing Reserved Instances (AWS CLI) (p. 237)
• Purchasing Reserved Instances (AWS SDKs) (p. 239)
• Examining Costs (p. 239)
To purchase a reservation
On this page, you can view your existing reservations. If you have many reservations, you can filter
them to more easily identify and view a particular reservation.
Tip
If you don't see the Reserved Instances link, create a domain (p. 9) in the region.
4. Choose Purchase Reserved Instance.
5. For Reservation Name, type a unique, descriptive name.
6. Choose an instance type, size, and number of instances. For guidance, see the section called “Sizing
Amazon ES Domains” (p. 203).
7. Choose a term length and payment option.
8. Review the payment details carefully.
9. Choose Submit.
10. Review the purchase summary carefully. Purchases of Reserved Instances are non-refundable.
11. Choose Purchase.
Finally, you can list your reservations for a given region using the following example:

aws es describe-reserved-elasticsearch-instances

The truncated response looks like the following:

...
      "CurrencyCode": "USD"
    }
  ]
}
Note
StartTime is Unix epoch time, which is the number of seconds that have passed since midnight
UTC of 1 January 1970. For example, 1522872571 epoch time is 20:09:31 UTC of 4 April 2018.
You can use online converters.
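Alternatively, a quick conversion in Python:

from datetime import datetime, timezone

# 1522872571 is 20:09:31 UTC on 4 April 2018.
print(datetime.fromtimestamp(1522872571, tz=timezone.utc))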
To learn more about the commands used in the preceding examples, see the AWS CLI Command
Reference.
• DescribeReservedElasticsearchInstanceOfferings
• PurchaseReservedElasticsearchInstanceOffering
• DescribeReservedElasticsearchInstances
For more information about installing and using the AWS SDKs, see AWS Software Development Kits.
Examining Costs
Cost Explorer is a free tool that you can use to view your spending data for the past 13 months.
Analyzing this data helps you identify trends and understand if RIs fit your use case. If you already have
RIs, you can group by Purchase Option and show amortized costs to compare that spending to your
spending for On-Demand Instances. You can also set usage budgets to make sure you are taking full
advantage of your reservations. For more information, see Analyzing Your Costs with Cost Explorer in the
AWS Billing and Cost Management User Guide.
The service enables bootstrap.mlockall in elasticsearch.yml, which locks JVM memory and
prevents the operating system from swapping it to disk. This applies to all supported instance types
except for the following:
• t2.micro.elasticsearch
• t2.small.elasticsearch
• t2.medium.elasticsearch
Scripting module
The service supports scripting for Elasticsearch 5.x and later domains. The service does not support
scripting for 1.5 or 2.3.
Supported scripting options include the following:
• Painless
• Lucene Expressions
• Mustache
For Elasticsearch 5.5 and later domains, Amazon ES supports stored scripts using the _scripts
endpoint. Elasticsearch 5.3 and 5.1 domains support inline scripts only.
TCP transport
The service supports HTTP on port 80 and HTTPS on port 443, but does not support TCP transport.
Topics
• Migrating to Amazon Elasticsearch Service (p. 241)
• Creating a Search Application with Amazon Elasticsearch Service (p. 246)
• Visualizing Customer Support Calls with Amazon Elasticsearch Service and Kibana (p. 251)
1. Take a snapshot of the existing cluster, and upload the snapshot to an Amazon S3 bucket.
2. Create an Amazon ES domain.
3. Give Amazon ES permissions to access the bucket, and give your user account permissions to work
with snapshots.
4. Restore the snapshot on the Amazon ES domain.
This walkthrough provides more detailed steps and alternate options, where applicable.
For smaller clusters, a one-time approach is to take a shared file system snapshot and then use the AWS
CLI to upload it to S3. If you already have a snapshot, skip to step 4.
1. Add the path.repo setting to elasticsearch.yml on all nodes, and then restart each node.
path.repo: ["/my/shared/directory/snapshots"]
2. Register a snapshot repository that points to the shared directory:

PUT _snapshot/migration-repository
{
  "type": "fs",
  "settings": {
    "location": "/my/shared/directory/snapshots"
  }
}
3. Take a snapshot of the indices that you want to migrate:

PUT _snapshot/migration-repository/migration-snapshot
{
  "indices": "migration-index1,migration-index2,other-indices-*",
  "include_global_state": false
}
4. Install the AWS CLI, and run aws configure to add your credentials.
5. Navigate to the snapshot directory. Then run the following commands to create a new S3 bucket
and upload the contents of the snapshot directory to that bucket:
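As a sketch, assuming a bucket named migration-bucket in the us-west-2 region (both placeholders):

aws s3 mb s3://migration-bucket --region us-west-2
aws s3 sync . s3://migration-bucket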
Depending on the size of the snapshot and the speed of your internet connection, this operation can
take a while.
aws es create-elasticsearch-domain \
  --domain-name migration-domain \
  --elasticsearch-version 7.7 \
  --elasticsearch-cluster-config InstanceType=c5.large.elasticsearch,InstanceCount=2 \
  --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=100 \
  --node-to-node-encryption-options Enabled=true \
  --encryption-at-rest-options Enabled=true \
  --domain-endpoint-options EnforceHTTPS=true,TLSSecurityPolicy=Policy-Min-TLS-1-2-2019-07 \
  --advanced-security-options Enabled=true,InternalUserDatabaseEnabled=true,MasterUserOptions='{MasterUserName=master-user,MasterUserPassword=master-user-password}' \
  --access-policies '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":["*"]},"Action":["es:ESHttp*"],"Resource":"arn:aws:es:us-west-2:123456789012:domain/migration-domain/*"}]}' \
  --region us-west-2
As is, the command creates an internet-accessible domain with two data nodes, each with 100 GiB
of storage. It also enables fine-grained access control (p. 77) with HTTP basic authentication and all
encryption settings. Use the Amazon ES console if you need a more advanced security configuration,
such as a VPC.
Before issuing the command, change the domain name, master user credentials, and account number.
Specify the same region that you used for the S3 bucket and an Elasticsearch version that is compatible
with your snapshot.
Important
Snapshots are only forward-compatible, and only by one major version. For example, you can't
restore a snapshot from a 2.x cluster on a 1.x cluster or a 6.x cluster, only a 2.x or 5.x cluster.
Minor version matters, too. You can't restore a snapshot from a self-managed 5.3.3 cluster on
a 5.3.2 Amazon ES domain. We recommend choosing the most-recent version of Elasticsearch
that your snapshot supports.
Provide Permissions
In the AWS Identity and Access Management (IAM) console, create a role with the following permissions
and trust relationship. Name the role AmazonESSnapshotRole so that it's easy to find.
Permissions
{
  "Version": "2012-10-17",
  "Statement": [{
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::migration-bucket"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::migration-bucket/*"
      ]
    }
  ]
}
Trust Relationship
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Service": "es.amazonaws.com"
    },
    "Action": "sts:AssumeRole"
  }]
}
Then give your personal IAM user or role—whatever you used to configure the AWS CLI earlier—
permissions to assume AmazonESSnapshotRole. Create the following policy and attach it to your
identity.
Permissions
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "iam:PassRole",
    "Resource": "arn:aws:iam::123456789012:role/AmazonESSnapshotRole"
  }]
}
Then log in to Kibana using the master user credentials you specified when you created the Amazon ES
domain. You can find the Kibana URL in the Amazon ES console. It takes the form of https://domain-
endpoint/_plugin/kibana/.
In Kibana, choose Security, Role Mappings, and Add. For Role, choose manage_snapshots. Then specify
the ARN for your personal IAM user or role in the appropriate field. User ARNs go in the Users section.
Role ARNs go in the Backend roles section. This step uses fine-grained access control (p. 85) to give your
identity permissions to work with snapshots.
Most programming languages have libraries to assist with signing requests (p. 114), but the simpler
approach is to use a tool like Postman and put your IAM credentials into the Authorization section.
1. Regardless of how you choose to sign your requests, the first step is to register the repository:
PUT _snapshot/migration-repository
{
  "type": "s3",
  "settings": {
    "bucket": "migration-bucket",
    "region": "us-west-2",
    "role_arn": "arn:aws:iam::123456789012:role/AmazonESSnapshotRole"
  }
}
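If you'd rather script the registration than use Postman, a minimal sketch using the requests and requests_aws4auth libraries might look like the following. The domain endpoint and region are placeholders.

import boto3
import requests
from requests_aws4auth import AWS4Auth

host = 'https://domain-endpoint/'  # placeholder endpoint, with trailing slash
region = 'us-west-2'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, 'es',
                   session_token=credentials.token)

payload = {
    'type': 's3',
    'settings': {
        'bucket': 'migration-bucket',
        'region': 'us-west-2',
        'role_arn': 'arn:aws:iam::123456789012:role/AmazonESSnapshotRole'
    }
}

# Register the snapshot repository with a signed PUT request.
r = requests.put(host + '_snapshot/migration-repository', auth=awsauth, json=payload)
print(r.status_code, r.text)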
2. Then list the snapshots in the repository, and find the one you want to restore. At this point, you can continue using Postman or switch to a tool like curl.

Shorthand

GET _snapshot/migration-repository/_all

curl

curl -XGET -u 'master-user:master-user-password' https://domain-endpoint/_snapshot/migration-repository/_all

3. Restore the snapshot:

Shorthand

POST _snapshot/migration-repository/migration-snapshot/_restore
{
  "indices": "migration-index1,migration-index2,other-indices-*",
  "include_global_state": false
}

curl

curl -XPOST -u 'master-user:master-user-password' https://domain-endpoint/_snapshot/migration-repository/migration-snapshot/_restore -H 'Content-Type: application/json' -d '{"indices":"migration-index1,migration-index2,other-indices-*","include_global_state":false}'

4. Then check that all expected indices are present:

Shorthand

GET _cat/indices?v

curl

curl -XGET -u 'master-user:master-user-password' https://domain-endpoint/_cat/indices?v
At this point, the migration is complete. You might configure your clients to use the new Amazon ES
endpoint, resize the domain (p. 203) to suit your workload, check the shard count for your indices, switch
to an IAM master user (p. 82), or start building Kibana dashboards.
If you want to write client-side code that doesn't rely on a server, however, you should compensate
for the security and performance risks. Allowing unsigned, public access to the Elasticsearch APIs is
inadvisable. Users might access unsecured endpoints or impact cluster performance through overly
broad queries (or too many queries).
This chapter presents a solution: use Amazon API Gateway to restrict users to a subset of the
Elasticsearch APIs and AWS Lambda to sign requests from API Gateway to Amazon ES.
Note
Standard API Gateway and Lambda pricing applies, but within the limited usage of this tutorial,
costs should be negligible.
POST https://search-my-domain.us-west-1.es.amazonaws.com/_bulk
{ "index": { "_index": "movies", "_type": "movie", "_id": "tt1979320" } }
{"fields":{"directors":["Ron
Howard"],"release_date":"2013-09-02T00:00:00Z","rating":8.3,"genres":
["Action","Biography","Drama","Sport"],"image_url":"http://ia.media-imdb.com/images/
M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg","plot":"A re-
creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki
Lauda.","title":"Rush","rank":2,"running_time_secs":7380,"actors":["Daniel Brühl","Chris
Hemsworth","Olivia Wilde"],"year":2013},"id":"tt1979320","type":"add"}
{ "index": { "_index": "movies", "_type": "movie", "_id": "tt1951264" } }
{"fields":{"directors":["Francis Lawrence"],"release_date":"2013-11-11T00:00:00Z","genres":
["Action","Adventure","Sci-Fi","Thriller"],"image_url":"http://ia.media-imdb.com/images/
M/MV5BMTAyMjQ3OTAxMzNeQTJeQWpwZ15BbWU4MDU0NzA1MzAx._V1_SX400_.jpg","plot":"Katniss
Everdeen and Peeta Mellark become targets of the Capitol after their victory in the 74th
Hunger Games sparks a rebellion in the Districts of Panem.","title":"The Hunger Games:
Catching Fire","rank":4,"running_time_secs":8760,"actors":["Jennifer Lawrence","Josh
Hutcherson","Liam Hemsworth"],"year":2013},"id":"tt1951264","type":"add"}
...
Setting Values

Resource: /
Method request settings: Authorization: none
Query string parameter: Name: q; Required: Yes
Throttling: Rate: 1000; Burst: 500
These settings configure an API that has only one method: a GET request to the endpoint root
(https://some-id.execute-api.us-west-1.amazonaws.com/search-es-api-test). The
request requires a single parameter (q), the query string to search for. When called, the method passes
the request to Lambda, which runs the search-es-lambda function. For more information, see
Creating an API in Amazon API Gateway and Deploying an API in Amazon API Gateway.
import boto3
import json
import requests
from requests_aws4auth import AWS4Auth

region = ''  # Placeholder; for example, us-west-1
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service,
                   session_token=credentials.token)

host = ''  # Placeholder; the domain endpoint, with https:// and no trailing slash
index = 'movies'
url = host + '/' + index + '/_search'

# Lambda execution starts here.
def handler(event, context):

    # Put the user query into the query DSL for more accurate search results.
    # Note that certain fields are boosted (^).
    query = {
        "size": 25,
        "query": {
            "multi_match": {
                "query": event['queryStringParameters']['q'],
                "fields": ["fields.title^4", "fields.plot^2", "fields.actors",
                           "fields.directors"]
            }
        }
    }

    # Elasticsearch 6.x and later require an explicit Content-Type header.
    headers = {"Content-Type": "application/json"}

    # Make the signed HTTP request to the Amazon ES domain.
    r = requests.get(url, auth=awsauth, headers=headers, data=json.dumps(query))

    # Create the response and add some extra content to support CORS
    response = {
        "statusCode": 200,
        "headers": {
            "Access-Control-Allow-Origin": '*'
        },
        "isBase64Encoded": False
    }

    # Add the search results to the response.
    response['body'] = r.text
    return response
Because this sample function uses external libraries, you must create a deployment package and
upload it to Lambda for the code to work. For more information about creating Lambda functions and
deployment packages, see Creating a Deployment Package (Python) in the AWS Lambda Developer Guide
and the section called “Creating the Lambda Deployment Package” (p. 132) in this guide.
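As a rough sketch (the directory layout and archive name are assumptions, not requirements), building the package on Linux or macOS might look like this:

# Install the function's dependencies alongside the code, then zip everything.
pip install requests requests_aws4auth -t .
zip -r search-es-lambda.zip .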
The following domain access policy, for example, allows the function's execution role to query the movies index:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/service-role/search-es-role"
      },
      "Action": "es:ESHttpGet",
      "Resource": "arn:aws:es:us-west-1:123456789012:domain/web/movies/_search"
    }
  ]
}
For more information, see the section called “Configuring Access Policies” (p. 13).
1. Download sample-site.zip, unzip it, and open scripts/search.js in your favorite text editor.
2. Update the apigatewayendpoint variable to point to your API Gateway endpoint. The endpoint
takes the form of https://some-id.execute-api.us-west-1.amazonaws.com/search-es-
api-test.
3. Open index.html and try running searches for thor, house, and a few other terms.
Next Steps
This chapter is just a starting point to demonstrate a concept. You might extend the application with modifications of your own.
Visualizing Customer Support Calls with Amazon Elasticsearch Service and Kibana

A manual workflow might involve employees listening to recordings, noting the subject of each call, and
Such a process would be extremely labor-intensive. Assuming an average time of 10 minutes per call,
each employee could listen to only 48 calls per day. Barring human bias, the data they generate would be
highly accurate, but the amount of data would be minimal: just the subject of the call and a boolean for
whether or not the customer was satisfied. Anything more involved, such as a full transcript, would take
a huge amount of time.
Using Amazon S3, Amazon Transcribe, Amazon Comprehend, and Amazon Elasticsearch Service
(Amazon ES), you can automate a similar process with very little code and end up with much more
data. For example, you can get a full transcript of the call, keywords from the transcript, and an overall
"sentiment" of the call (positive, negative, neutral, or mixed). Then you can use Elasticsearch and Kibana
to search and visualize the data.
While you can use this walkthrough as-is, the intent is to spark ideas about how to enrich your JSON
documents before you index them in Amazon ES.
Estimated Costs
In general, performing the steps in this walkthrough should cost less than $2. The walkthrough uses the
following resources:
Topics
• Step 1: Configure Prerequisites (p. 252)
• Step 2: Copy Sample Code (p. 252)
• (Optional) Step 3: Add Sample Data (p. 255)
• Step 4: Analyze and Visualize Your Data (p. 256)
• Step 5: Clean Up Resources and Next Steps (p. 260)
Amazon S3 Bucket: For more information, see Creating a Bucket in the Amazon Simple Storage Service Getting Started Guide.

Amazon ES Domain: The destination for data. For more information, see Creating Amazon ES Domains (p. 9).
If you don't already have these resources, you can create them using the following AWS CLI commands:
Note
These commands use the us-west-2 region, but you can use any region that Amazon
Comprehend supports. To learn more, see the AWS General Reference.
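As a sketch, the following commands create a bucket and a small single-node domain; the names, Elasticsearch version, instance type, and volume size are illustrative assumptions:

aws s3 mb s3://my-transcribe-test --region us-west-2
aws es create-elasticsearch-domain \
  --domain-name my-transcribe-test \
  --elasticsearch-version 7.1 \
  --elasticsearch-cluster-config InstanceType=t2.medium.elasticsearch,InstanceCount=1 \
  --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=10 \
  --region us-west-2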
import boto3
import datetime
import json
import requests
from requests_aws4auth import AWS4Auth
import time
import urllib.request

# Variables to update
audio_file_name = ''  # For example, 000001.mp3
bucket_name = ''  # For example, my-transcribe-test

# Service clients (shown here so the excerpt reads in context; the full script
# also defines the Amazon ES endpoint, region, and credentials).
s3_client = boto3.client('s3')
comprehend_client = boto3.client('comprehend')

# ... (uploading the file to S3 and starting the transcription job are omitted
# from this excerpt)

# Get the necessary details and build the URL to the audio file on S3.
# For all other regions.
response = s3_client.get_bucket_location(
    Bucket=bucket_name
)
bucket_region = response['LocationConstraint']
mp3_uri = 'https://' + bucket_name + '.s3-' + bucket_region + '.amazonaws.com/' + audio_file_name

# ... (polling the transcription job until it completes is omitted)

transcript_uri = response['TranscriptionJob']['Transcript']['TranscriptFileUri']

# Open the JSON file, read it, and get the transcript.
response = urllib.request.urlopen(transcript_uri)
raw_json = response.read()
loaded_json = json.loads(raw_json)
transcript = loaded_json['results']['transcripts'][0]['transcript']

# ... (trimming the transcript and detecting key phrases with Amazon
# Comprehend are omitted)

keywords = []
for keyword in response['KeyPhrases']:
    keywords.append(keyword['Text'])

print('Detecting sentiment...')
response = comprehend_client.detect_sentiment(
    Text=trimmed_transcript,
    LanguageCode='en'
)
sentiment = response['Sentiment']

# ... (building the JSON document and indexing it in Amazon ES are omitted;
# the final prints show the Amazon ES response)

print(response)
print(response.json())
4. Place your MP3 in the same directory as call-center.py and run the script. A sample output
follows:
$ python call-center.py
Uploading 000001.mp3...
Starting transcription job...
Waiting for job to complete...
Still waiting...
Still waiting...
Still waiting...
Still waiting...
Still waiting...
Still waiting...
Still waiting...
Detecting key phrases...
Detecting sentiment...
Indexing document...
<Response [201]>
{u'_type': u'call', u'_seq_no': 0, u'_shards': {u'successful': 1, u'failed': 0,
u'total': 2}, u'_index': u'support-calls4', u'_version': 1, u'_primary_term': 1,
u'result': u'created', u'_id': u'000001'}
1. The script uploads an audio file (in this case, an MP3, but Amazon Transcribe supports several formats)
to your S3 bucket.
2. It sends the audio file's URL to Amazon Transcribe and waits for the transcription job to finish.
The time to finish the transcription job depends on the length of the audio file. Assume minutes, not
seconds.
Tip
To improve the quality of the transcription, you can configure a custom vocabulary for
Amazon Transcribe.
3. After the transcription job finishes, the script extracts the transcript, trims it to 5,000 characters, and
sends it to Amazon Comprehend for keyword and sentiment analysis.
4. Finally, the script adds the full transcript, keywords, sentiment, and current time stamp to a JSON
document and indexes it in Amazon ES.
Tip
LibriVox has public domain audiobooks that you can use for testing.
import boto3
from elasticsearch import Elasticsearch, RequestsHttpConnection
import json
from requests_aws4auth import AWS4Auth

host = ''  # Placeholder; the domain endpoint without https://
region = ''  # Placeholder; for example, us-west-2
service = 'es'

credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service,
                   session_token=credentials.token)

es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)

# Read the bulk request body from a file. The file name is a placeholder.
bulk_file = open('bulk-data.txt', 'r').read()

response = es.bulk(bulk_file)
print(json.dumps(response, indent=2, sort_keys=True))
$ python bulk-helper.py
{
"errors": false,
"items": [
{
"index": {
"_id": "1",
"_index": "test-data",
"_primary_term": 1,
"_seq_no": 42,
"_shards": {
"failed": 0,
"successful": 1,
"total": 2
},
"_type": "_doc",
"_version": 9,
"result": "updated",
"status": 200
}
},
...
],
"took": 27
}
1. Navigate to https://search-domain.region.es.amazonaws.com/_plugin/kibana.
2. Before you can use Kibana, you need an index pattern. Kibana uses index patterns to narrow your
analysis to one or more indices. To match the support-calls index that call-center.py
created, define an index pattern of support*, and then choose Next step.
3. For Time Filter field name, choose timestamp.
4. Now you can start creating visualizations. Choose Visualize, and then add a new visualization.
5. Choose the pie chart and the support* index pattern.
6. The default visualization is basic, so choose Split Slices to create a more interesting visualization.
For Aggregation, choose Terms. For Field, choose sentiment.keyword. Then choose Apply changes
and Save.
7. Return to the Visualize page, and add another visualization. This time, choose the horizontal bar
chart.
8. Choose Split Series.
For Aggregation, choose Terms. For Field, choose keywords.keyword and change Size to 20. Then
choose Apply Changes and Save.
9. Return to the Visualize page and add one final visualization, a vertical bar chart.
10. Choose Split Series. For Aggregation, choose Date Histogram. For Field, choose timestamp and
change Interval to Daily.
11. Choose Metrics & Axes and change Mode to normal.
12. Choose Apply Changes and Save.
13. Now that you have three visualizations, you can add them to a Kibana dashboard. Choose
Dashboard, create a dashboard, and add your visualizations.
Transcripts require much less disk space than MP3 files. You might be able to shorten your MP3 retention
window—for example, from three months of call recordings to one month—retain years of transcripts,
and still save on storage costs.
You could also automate the transcription process using AWS Step Functions and Lambda, add
additional metadata before indexing, or craft more complex visualizations to fit your exact use case.
If your Amazon ES domain uses VPC access, you might not receive this error. Instead, the request might
time out. To learn more about correcting this issue and the various configuration options available to
you, see the section called “Controlling Access to Kibana” (p. 179), the section called “About Access
Policies on VPC Domains” (p. 23), and the section called “Identity and Access Management” (p. 65).
• If your cluster uses dedicated master nodes, quorum loss occurs when half or more are unavailable.
• If your cluster does not use dedicated master nodes, quorum loss occurs when half or more of your
data nodes are unavailable.
If quorum loss occurs and your cluster has more than one node, Amazon ES restores quorum and places
the cluster into a read-only state. You have two options:
If you prefer to use the cluster as-is, verify that cluster health is green using the following request:
GET _cat/health?v
If cluster health is red, we recommend restoring the cluster from a snapshot. You can also see the section
called “Red Cluster Status” (p. 262) for troubleshooting steps. If cluster health is green, check that all
expected indices are present using the following request:
GET _cat/indices?v
Then run some searches to verify that the expected data is present. If it is, you can remove the read-only
state using the following request:
PUT _cluster/settings
{
  "persistent": {
    "cluster.blocks.read_only": false
  }
}
If quorum loss occurs and your cluster has only one node, Amazon ES replaces the node and does not
place the cluster into a read-only state. Otherwise, your options are the same: use the cluster as-is or
restore from a snapshot.
In both situations, Amazon ES sends two events to your Personal Health Dashboard. The first informs
you of the loss of quorum. The second occurs after Amazon ES successfully restores quorum. For more
information about using the Personal Health Dashboard, see the AWS Health User Guide.
The most common causes of a red cluster status are failed cluster nodes (p. 265) and the Elasticsearch
process crashing due to a continuous heavy processing load.
Note
Amazon ES stores automatic snapshots for 14 days, so if the red cluster status persists for
more than two weeks, you can permanently lose your cluster's data. If your Amazon ES domain
enters a red cluster status, AWS Support might contact you to ask whether you want to address
the problem yourself or you want the support team to assist. You can set a CloudWatch
alarm (p. 210) to notify you when a red cluster status occurs.
Ultimately, red shards cause red clusters, and red indices cause red shards. To identify the indices causing the red cluster status, Elasticsearch has some helpful APIs.
• GET /_cluster/allocation/explain chooses the first unassigned shard that it finds and explains
why it cannot be allocated to a node:
{
  "index": "test4",
  "shard": 0,
  "primary": true,
  "current_state": "unassigned",
  "can_allocate": "no",
  ...
• GET /_cat/indices?v shows the health status, number of documents, and disk usage for each
index:
Deleting red indices is the fastest way to fix a red cluster status. Depending on the reason for the red
cluster status, you might then scale your Amazon ES domain to use larger instance types, more instances,
or more EBS-based storage and try to recreate the problematic indices.
If deleting a problematic index isn't feasible, you can restore a snapshot (p. 50), delete documents from
the index, change the index settings, reduce the number of replicas, or delete other indices to free up
disk space. The important step is to resolve the red cluster status before reconfiguring your Amazon ES
domain. Reconfiguring a domain with a red cluster status can compound the problem and lead to the
domain being stuck in a configuration state of Processing until you resolve the status.
GET elasticsearch_domain/_nodes/stats/jvm?pretty

CPUUtilization: Specifies the percentage of CPU resources used for data nodes in a cluster. View the Maximum statistic for this metric, and look for a continuous pattern of high usage. To remedy, add data nodes or increase the size of the instance types of existing data nodes.
ClusterBlockException
You might receive a ClusterBlockException error for the following reasons.
To avoid issues, monitor the FreeStorageSpace metric in the Amazon ES console and create
CloudWatch alarms (p. 210) to trigger when FreeStorageSpace drops below a certain threshold.
GET /_cat/allocation?v also provides a useful summary of shard allocation and disk usage. To
resolve issues associated with a lack of storage space, scale your Amazon ES domain to use larger
instance types, more instances, or more EBS-based storage.
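For example, a minimal Boto3 sketch that creates such an alarm follows; the domain name, account ID, threshold, and SNS topic are placeholders:

import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')
cloudwatch.put_metric_alarm(
    AlarmName='my-domain-free-storage',  # placeholder name
    Namespace='AWS/ES',
    MetricName='FreeStorageSpace',
    Dimensions=[
        {'Name': 'DomainName', 'Value': 'my-domain'},
        {'Name': 'ClientId', 'Value': '123456789012'}
    ],
    Statistic='Minimum',
    Period=300,
    EvaluationPeriods=1,
    Threshold=20480.0,  # placeholder threshold, in MB
    ComparisonOperator='LessThanOrEqualToThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:my-sns-topic']  # placeholder topic
)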
When the JVMMemoryPressure metric returns to 88% or lower for five minutes, the protection is
disabled, and write operations to the cluster are unblocked.
JVM OutOfMemoryError
A JVM OutOfMemoryError typically means that one of the following JVM circuit breakers was reached.
To check for this condition, open your domain dashboard on the Amazon ES console. Choose the Cluster
health tab, and then choose the Nodes metric. See if the reported number of nodes is fewer than the
number that you configured for your cluster. If the metric shows that one or more nodes is down for
more than one day, contact AWS Support.
You can also set a CloudWatch alarm (p. 210) to notify you when this issue occurs.
Note
The Nodes metric is not accurate during changes to your cluster configuration and during
routine maintenance for the service. This behavior is expected. The metric will report the
correct number of cluster nodes soon. To learn more, see the section called “Configuration
Changes” (p. 14).
To protect your clusters from unexpected node terminations and restarts, create at least one replica for
each index in your Amazon ES domain.
POST /_snapshot/my-repository/my-snapshot/_restore
{
  "indices": "my-index-1,my-index-2",
  "include_global_state": true,
  "rename_pattern": "my-index-(\\d)",
  "rename_replacement": "restored-my-index-$1"
}
If you plan to reindex, shrink, or split an index, you likely want to stop writing to it before performing the
operation.
Request Throttling
If you receive persistent 403 Request throttled due to too many requests errors, consider
scaling vertically. Amazon Elasticsearch Service throttles requests if the payload would cause memory
usage to exceed the maximum size of the Java heap.
If you need more insight into the performance of the cluster, you can publish error logs and slow logs to
CloudWatch (p. 41).
If you need to restore a snapshot from the bucket, restore the objects from S3 Glacier, copy the objects
to a new bucket, and register the new bucket (p. 47) as a snapshot repository.
If you receive an Invalid Host Header error when making a request, check that your client includes the Amazon ES domain endpoint (and not, for example, its IP address) in the Host header:

Host: search-my-sample-domain-ih2lhn2ew2scurji.us-west-2.es.amazonaws.com
We recommend choosing a newer instance type. For domains running Elasticsearch 6.7 and later, the following restrictions apply:
• If your existing domain does not use M3 instances, you can no longer change to them.
• If you change an existing domain from an M3 instance type to another instance type, you can't switch
back.
Regions in which Amazon ES is not available return "Could not connect to the endpoint URL."
You are not authorized to perform this operation. (Service: AmazonEC2; Status Code: 403;
Error Code: UnauthorizedOperation
To enable this query, you must have access to the ec2:DescribeVpcs, ec2:DescribeSubnets,
and ec2:DescribeSecurityGroups operations. This requirement is only for the console. If you use
the AWS CLI to create and configure a domain with a VPC endpoint, you don't need access to those
operations.
You can prevent these failures by keeping your computer's CA certificates and operating system up-to-
date. If you encounter this issue in a corporate environment and do not manage your own computer, you
might need to ask an administrator to assist with the update process.
The following list shows minimum operating system and Java versions:
• Microsoft Windows versions that have updates from January 2005 or later installed contain at least
one of the required CAs in their trust list.
• Mac OS X 10.4 with Java for Mac OS X 10.4 Release 5 (February 2007), Mac OS X 10.5 (October 2007),
and later versions contain at least one of the required CAs in their trust list.
• Red Hat Enterprise Linux 5 (March 2007), 6, and 7 and CentOS 5, 6, and 7 all contain at least one of
the required CAs in their default trusted CA list.
• Java 1.4.2_12 (May 2006), 5 Update 2 (March 2005), and all later versions, including Java 6 (December
2006), 7, and 8, contain at least one of the required CAs in their default trusted CA list.
The required CAs are the following:
• Amazon Root CA 1
• Starfield Services Root Certificate Authority - G2
• Starfield Class 2 Certification Authority
Root certificates from the first two authorities are available from Amazon Trust Services, but keeping
your computer up-to-date is the more straightforward solution. To learn more about ACM-provided
certificates, see AWS Certificate Manager FAQs.
Note
Currently, Amazon ES domains in the us-east-1 region use certificates from a different authority.
We plan to update the region to use these new certificate authorities in the near future.
Actions
The following table provides a quick reference to the HTTP method required for each operation for the
REST interface to the Amazon Elasticsearch Service configuration API. The description of each operation
also includes the required HTTP method.
Note
All configuration service requests must be signed. For more information, see Signing Amazon
Elasticsearch Service Requests (p. 68) in this guide and Signature Version 4 Signing Process in
the AWS General Reference.
DescribeOutboundCrossClusterSearchConnections (p. 286)    POST
AcceptInboundCrossClusterSearchConnection
Allows the destination domain owner to accept an inbound cross-cluster search connection request.
Syntax
PUT https://es.us-east-1.amazonaws.com/2015-01-01/es/ccs/inboundConnection/{ConnectionId}/accept
Request Parameters
This operation does not use HTTP request parameters.
Request Body
This operation does not use the HTTP request body.
Response Elements
CrossClusterSearchConnection (Object): Inbound connection details.
AddTags
Attaches resource tags to an Amazon ES domain. For more information, see Tagging Amazon ES
Domains (p. 57).
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/tags
{
  "ARN": "domain-arn",
  "TagList": [{
    "Key": "tag-key",
    "Value": "tag-value"
  }]
}
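If you use the AWS SDK for Python (Boto3), the equivalent call is add_tags. A minimal sketch, with a placeholder domain ARN:

import boto3

client = boto3.client('es', region_name='us-east-1')
# Attach a tag to the domain identified by its ARN.
client.add_tags(
    ARN='arn:aws:es:us-east-1:123456789012:domain/my-domain',
    TagList=[{'Key': 'tag-key', 'Value': 'tag-value'}]
)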
Request Parameters
This operation does not use request parameters.
Request Body
Response Elements
The AddTags operation does not return a data structure.
AssociatePackage
Associates a package with an Amazon ES domain.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/packages/associate/package-id/domain-name
Request Parameters
DomainName (the section called “DomainName” (p. 308), required): Name of the domain that you want to associate the package with.
Request Body
This operation does not use the HTTP request body.
Response Elements
CreateElasticsearchDomain
Creates an Amazon ES domain. For more information, see the section called “ Creating Amazon ES
Domains” (p. 9).
Note
If you attempt to create an Amazon ES domain and a domain with the same name already
exists, the API does not report an error. Instead, it returns details for the existing domain.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/domain
{
"ElasticsearchClusterConfig": {
"ZoneAwarenessConfig": {
"AvailabilityZoneCount": 3
},
"ZoneAwarenessEnabled": true|false,
"InstanceCount": 3,
"DedicatedMasterEnabled": true|false,
"DedicatedMasterType": "c5.large.elasticsearch",
"DedicatedMasterCount": 3,
"InstanceType": "r5.large.elasticsearch",
"WarmCount": 3,
"WarmEnabled": true|false,
"WarmType": "ultrawarm1.large.elasticsearch"
},
"EBSOptions": {
"EBSEnabled": true|false,
"VolumeType": "io1|gp2|standard",
"Iops": 1000,
"VolumeSize": 35
},
"EncryptionAtRestOptions": {
"Enabled": true|false,
"KmsKeyId":"arn:aws:kms:us-east-1:123456789012:alias/my-key"
},
"SnapshotOptions": {
"AutomatedSnapshotStartHour": 3
},
"VPCOptions": {
"VPCId": "vpc-12345678",
"SubnetIds": ["subnet-abcdefg1", "subnet-abcdefg2", "subnet-abcdefg3"],
"SecurityGroupIds": ["sg-12345678"]
},
"AdvancedOptions": {
"rest.action.multi.allow_explicit_index": "true|false",
"indices.fielddata.cache.size": "40",
"indices.query.bool.max_clause_count": "1024"
},
"CognitoOptions": {
"Enabled": true|false,
"UserPoolId": "us-east-1_121234567",
"IdentityPoolId": "us-east-1:12345678-1234-1234-1234-123456789012",
"RoleArn": "arn:aws:iam::123456789012:role/service-role/CognitoAccessForAmazonES"
},
"NodeToNodeEncryptionOptions": {
"Enabled": true|false
},
"DomainEndpointOptions": {
"EnforceHTTPS": true|false,
"TLSSecurityPolicy": "Policy-Min-TLS-1-2-2019-07|Policy-Min-TLS-1-0-2019-07"
},
"LogPublishingOptions": {
"SEARCH_SLOW_LOGS": {
"CloudWatchLogsLogGroupArn":"arn:aws:logs:us-east-1:264071961897:log-group1:sample-
domain",
"Enabled":true|false
},
"INDEX_SLOW_LOGS": {
"CloudWatchLogsLogGroupArn":"arn:aws:logs:us-east-1:264071961897:log-group2:sample-
domain",
"Enabled":true|false
},
"ES_APPLICATION_LOGS": {
"CloudWatchLogsLogGroupArn":"arn:aws:logs:us-east-1:264071961897:log-group3:sample-
domain",
"Enabled":true|false
}
},
"AdvancedSecurityOptions": {
"Enabled": true|false,
"InternalUserDatabaseEnabled": true|false,
"MasterUserOptions": {
"MasterUserARN": "arn:aws:iam::123456789012:role/my-master-user-role"
"MasterUserName": "my-master-username",
"MasterUserPassword": "my-master-password"
}
},
"ElasticsearchVersion": "7.1",
"DomainName": "my-domain",
"AccessPolicies": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",
\"Principal\":{\"AWS\":[\"123456789012\"]},\"Action\":[\"es:es:ESHttp*\"],\"Resource\":
\"arn:aws:es:us-east-1:123456789012:domain/my-domain/*\"}]}"
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
ElasticsearchVersion (String, optional): Version of Elasticsearch. If not specified, 1.5 is used as the default. For the full list of supported versions, see the section called “Supported Elasticsearch Versions” (p. 2).
Response Elements
CreateOutboundCrossClusterSearchConnection
Creates a new cross-cluster search connection from a source domain to a destination domain.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/ccs/outboundConnection
{
"ConnectionAlias": "StringValue",
"SourceDomainInfo": {
"DomainName": "Domain-name",
"Region": "us-east-1"
},
"DestinationDomainInfo": {
"OwnerId": "Account-id",
"DomainName": "Domain-name",
"Region": "us-east-1"
}
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
DestinationDomainInfo (Object, required): Name and region of the destination domain.
Response Elements
DestinationDomainInfo (Object): Name and region of the destination domain.

CrossClusterSearchConnectionId (String): The ID for the outbound connection.
CreatePackage
Add a package for use with Amazon ES domains.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/packages
{
  "PackageName": "my-package-name",
  "PackageType": "TXT-DICTIONARY",
  "PackageDescription": "My synonym file.",
  "PackageSource": {
    "S3BucketName": "my-s3-bucket",
    "S3Key": "synonyms.txt"
  }
}
Request Parameters
This operation does not use request parameters.
Request Body
PackageSource (the section called “PackageSource” (p. 316), required): S3 bucket and key for the package.
Response Elements
DeleteElasticsearchDomain
Deletes an Amazon ES domain and all of its data. A domain cannot be recovered after it is deleted.
Syntax
DELETE https://es.us-east-1.amazonaws.com/2015-01-01/es/domain/domain-name
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
DeleteElasticsearchServiceRole
Deletes the service-linked role between Amazon ES and Amazon EC2. This role gives Amazon ES
permissions to place VPC endpoints into your VPC. A service-linked role must be in place for domains
with VPC endpoints to be created or function properly.
Note
This action succeeds only if no domains are using the service-linked role.
Syntax
DELETE https://es.us-east-1.amazonaws.com/2015-01-01/es/role
Request Parameters
This operation does not use request parameters.
Request Body
This operation does not use the HTTP request body.
Response Elements
The DeleteElasticsearchServiceRole operation does not return a data structure.
DeleteInboundCrossClusterSearchConnection
Allows the destination domain owner to delete an existing inbound cross-cluster search connection.
Syntax
DELETE https://es.us-east-1.amazonaws.com/2015-01-01/es/ccs/inboundConnection/{ConnectionId}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
This operation does not use the HTTP request body.
Response Elements
CrossClusterSearchConnection (Object): Inbound connection details.
DeleteOutboundCrossClusterSearchConnection
Allows the source domain owner to delete an existing outbound cross-cluster search connection.
Syntax
DELETE https://es.us-east-1.amazonaws.com/2015-01-01/es/ccs/outboundConnection/{ConnectionId}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
This operation does not use the HTTP request body.
Response Elements
CrossClusterSearchConnection (Object): Outbound connection details.
DeletePackage
Deletes a package from Amazon ES. The package must not be associated with any Amazon ES domain.
Syntax
DELETE https://es.us-east-1.amazonaws.com/2015-01-01/packages/package-id
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
DescribeElasticsearchDomain
Describes the domain configuration for the specified Amazon ES domain, including the domain ID,
domain service endpoint, and domain ARN.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/domain/domain-name
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
DescribeElasticsearchDomainConfig
Displays the configuration of an Amazon ES domain.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/domain/domain-name/config
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
DescribeElasticsearchDomains
Describes the domain configuration for up to five specified Amazon ES domains. Information includes
the domain ID, domain service endpoint, and domain ARN.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/domain-info
{
  "DomainNames": [
    "domain-name1",
    "domain-name2"
  ]
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
Response Elements
DescribeElasticsearchInstanceTypeLimits
Describes the instance count, storage, and master node limits for a given Elasticsearch version and
instance type.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/instanceTypeLimits/elasticsearch-version/instance-type?domainName=domain-name
Request Parameters
ElasticsearchVersion (String, required): Elasticsearch version. For a list of supported versions, see the section called “Supported Elasticsearch Versions” (p. 2).
Request Body
This operation does not use the HTTP request body.
Response Elements
DescribeInboundCrossClusterSearchConnections
Lists all the inbound cross-cluster search connections for a destination domain.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/ccs/inboundConnection/search
{
  "Filters": [
    {
      "Name": filter-name (string),
      "Values": [val1, val2, ...] (list of strings)
    },
    ...
  ],
  "MaxResults": int (optional, default 100),
  "NextToken": "next-token-string" (optional)
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
Response Elements
CrossClusterSearchConnections (Object): List of inbound connections.
DescribeOutboundCrossClusterSearchConnections
Lists all outbound cross-cluster search connections for a source domain.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/ccs/outboundConnection/search
{
  "Filters": [
    {
      "Name": filter-name (string),
      "Values": [val1, val2, ...] (list of strings)
    },
    ...
  ],
  "MaxResults": int (optional, default 100),
  "NextToken": "next-token-string" (optional)
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
Response Elements
CrossClusterSearchConnections (Object): List of outbound connections.
DescribePackages
Describes all packages available to Amazon ES. Includes options for filtering, limiting the number of
results, and pagination.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/packages/describe
{
  "Filters": [{
    "Name": "PackageStatus",
    "Value": [
      "DELETING", "AVAILABLE"
    ]
  }],
  "MaxResults": 5,
  "NextToken": "next-token"
}
Request Parameters
This operation does not use request parameters.
Request Body
Response Elements
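A Boto3 sketch that pages through all AVAILABLE packages; note that, as in the request body above, package filters use "Value" rather than "Values":
import boto3

client = boto3.client("es", region_name="us-east-1")

kwargs = {
    "Filters": [{"Name": "PackageStatus", "Value": ["AVAILABLE"]}],
    "MaxResults": 5,
}
while True:
    response = client.describe_packages(**kwargs)
    for package in response["PackageDetailsList"]:
        print(package["PackageID"], package["PackageName"])
    token = response.get("NextToken")
    if not token:
        break
    kwargs["NextToken"] = token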
DescribeReservedElasticsearchInstanceOfferings
Describes the available Reserved Instance offerings for a given Region.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/reservedInstanceOfferings?offeringId=offering-id&maxResults=max-results&nextToken=next-token
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
ReservedElasticsearchInstanceOfferings (Object): Container for all information about a Reserved Instance offering. For more information, see the section called “Purchasing Reserved Instances (AWS CLI)” (p. 237).
DescribeReservedElasticsearchInstances
Describes the instances that you have reserved in a given Region.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/reservedInstances?reservationId=reservation-id&maxResults=max-results&nextToken=next-token
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
DissociatePackage
Removes a package from the specified Amazon ES domain. The package must not be in use with any ES index for the dissociation to succeed. The package remains available in Amazon ES for later association.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/packages/dissociate/package-id/domain-name
Request Parameters
DomainName (DomainName (p. 308), required): Name of the domain that you want to dissociate the package from.
Request Body
This operation does not use the HTTP request body.
Response Elements
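Dissociation must complete before a package can be deleted. A Boto3 sketch of the typical sequence, with placeholder IDs:
import boto3

client = boto3.client("es", region_name="us-east-1")

# Placeholder IDs; find real package IDs with DescribePackages.
client.dissociate_package(PackageID="F11111111", DomainName="my-domain")

# Dissociation is asynchronous; in a real script, poll
# list_domains_for_package until the association is gone, then delete.
client.delete_package(PackageID="F11111111")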
GetCompatibleElasticsearchVersions
Returns a map of Elasticsearch versions and the versions you can upgrade them to.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/compatibleVersions?domainName=domain-name
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
CompatibleElasticsearchVersions (Map): A map of Elasticsearch versions and the versions that you can upgrade them to:
"CompatibleElasticsearchVersions": [{
  "SourceVersion": "6.7",
  "TargetVersions": ["6.8"]
}]
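A Boto3 sketch that prints the upgrade targets for a placeholder domain:
import boto3

client = boto3.client("es", region_name="us-east-1")

response = client.get_compatible_elasticsearch_versions(DomainName="my-domain")
for entry in response["CompatibleElasticsearchVersions"]:
    print(entry["SourceVersion"], "->", entry["TargetVersions"])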
GetUpgradeHistory
Returns a list of the domain's 10 most recent upgrade operations.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/upgradeDomain/domain-name/history?maxResults=max-results&nextToken=next-token
Request Parameters
DomainName (DomainName (p. 308), required): Name of the Amazon ES domain whose upgrade history you want to view.
Request Body
This operation does not use the HTTP request body.
Response Elements
GetUpgradeStatus
Returns the most recent status of a domain's Elasticsearch version upgrade.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/upgradeDomain/domain-name/status
Request Parameters
DomainName (DomainName (p. 308), required): Name of the Amazon ES domain whose upgrade status you want to view.
Request Body
This operation does not use the HTTP request body.
Response Elements
ListDomainNames
Displays the names of all Amazon ES domains owned by the current user in the active Region.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/domain
Request Parameters
This operation does not use request parameters.
Request Body
This operation does not use the HTTP request body.
Response Elements
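Because this operation takes no parameters, the Boto3 call is a one-liner:
import boto3

client = boto3.client("es", region_name="us-east-1")

response = client.list_domain_names()
print([d["DomainName"] for d in response["DomainNames"]])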
ListDomainsForPackage
Lists all Amazon ES domains that a package is associated with.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/packages/package-id/domains?maxResults=max-results&nextToken=next-token
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
DomainPackageDetailsList (List): List of DomainPackageDetails (p. 309) objects.
ListElasticsearchInstanceTypeDetails
Lists all Elasticsearch instance types that are supported for a given Elasticsearch version and the features
that these instance types support.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/instanceTypeDetails/elasticsearch-version?domainName=domain-name&maxResults=max-results&nextToken=next-token
Request Parameters
ElasticsearchVersion (String, required): The Elasticsearch version.
Request Body
This operation does not use the HTTP request body.
Response Elements
ElasticsearchInstanceTypes (List): List of supported instance types for the given Elasticsearch version and the features that these instance types support.
ListElasticsearchInstanceTypes (Deprecated)
Lists all Elasticsearch instance types that are supported for a given Elasticsearch version. This action is
deprecated. Use ListElasticsearchInstanceTypeDetails (p. 294) instead.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/instanceTypes/elasticsearch-version?domainName=domain-name&maxResults=max-results&nextToken=next-token
Request Parameters
ElasticsearchVersion (String, required): The Elasticsearch version.
Request Body
This operation does not use the HTTP request body.
Response Elements
ElasticsearchInstanceTypes (List): List of supported instance types for the given Elasticsearch version.
ListElasticsearchVersions
Lists all supported Elasticsearch versions on Amazon ES.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/es/versions?maxResults=max-results&nextToken=next-token
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
ListPackagesForDomain
Lists all packages associated with an Amazon ES domain.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/domain/domain-name/packages?maxResults=max-results&nextToken=next-token
Request Parameters
Request Body
This operation does not use the HTTP request body.
Response Elements
DomainPackageDetailsList (List): List of DomainPackageDetails (p. 309) objects.
ListTags
Displays all resource tags for an Amazon ES domain.
Syntax
GET https://es.us-east-1.amazonaws.com/2015-01-01/tags?arn=domain-arn
Request Parameters
ARN (ARN (p. 306), required): Amazon Resource Name (ARN) for the Amazon ES domain.
Request Body
This operation does not use the HTTP request body.
Response Elements
TagList (TagList (p. 318)): List of resource tags. For more information, see Tagging Amazon Elasticsearch Service Domains (p. 57).
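A Boto3 sketch; the ARN is a placeholder that you would copy from the domain's configuration:
import boto3

client = boto3.client("es", region_name="us-east-1")

response = client.list_tags(
    ARN="arn:aws:es:us-east-1:123456789012:domain/my-domain")
for tag in response["TagList"]:
    print(tag["Key"], "=", tag["Value"])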
PurchaseReservedElasticsearchInstanceOffering
Purchases a Reserved Instance.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/purchaseReservedInstanceOffering
{
"ReservationName" : "my-reservation",
"ReservedElasticsearchInstanceOfferingId" : "1a2a3a4a5-1a2a-3a4a-5a6a-1a2a3a4a5a6a",
"InstanceCount" : 3
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
ReservedElasticsearchInstanceOfferingId (String, required): The offering ID.
Response Elements
ReservedElasticsearchInstanceId (String): The reservation ID.
RejectInboundCrossClusterSearchConnection
Allows the destination domain owner to reject an inbound cross-cluster search connection request.
Syntax
PUT https://es.us-east-1.amazonaws.com/2015-01-01/es/ccs/inboundConnection/{ConnectionId}/reject
Request Parameters
This operation does not use HTTP request parameters.
Request Body
This operation does not use the HTTP request body.
Response Elements
CrossClusterSearchConnection (Object): Inbound connection details.
RemoveTags
Removes the specified resource tags from an Amazon ES domain.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/tags-removal
{
"ARN": "arn:aws:es:us-east-1:123456789012:domain/my-domain",
"TagKeys": [
"tag-key1",
"tag-key2"
]
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
TagKeys (TagKey (p. 318), required): List of tag keys for resource tags that you want to remove from an Amazon ES domain.
Response Elements
The RemoveTags operation does not return a response element.
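The equivalent Boto3 call, with placeholder ARN and tag keys:
import boto3

client = boto3.client("es", region_name="us-east-1")

# RemoveTags returns no response elements, so there is nothing to print.
client.remove_tags(
    ARN="arn:aws:es:us-east-1:123456789012:domain/my-domain",
    TagKeys=["tag-key1", "tag-key2"])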
StartElasticsearchServiceSoftwareUpdate
Schedules a service software update for an Amazon ES domain.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/serviceSoftwareUpdate/start
{
"DomainName": "domain-name"
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
DomainName (DomainName (p. 308), required): Name of the Amazon ES domain that you want to update to the latest service software.
Response Elements
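A Boto3 sketch that schedules the update and prints the resulting state; while the status remains PENDING_UPDATE, the update can still be canceled with StopElasticsearchServiceSoftwareUpdate:
import boto3

client = boto3.client("es", region_name="us-east-1")

response = client.start_elasticsearch_service_software_update(
    DomainName="my-domain")
print(response["ServiceSoftwareOptions"]["UpdateStatus"])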
StopElasticsearchServiceSoftwareUpdate
Stops a scheduled service software update for an Amazon ES domain. Only works if the domain's
UpdateStatus is PENDING_UPDATE.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/serviceSoftwareUpdate/stop
{
"DomainName": "domain-name"
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
DomainName (DomainName (p. 308), required): Name of the Amazon ES domain whose service software update you want to stop.
Response Elements
ServiceSoftwareOptions (ServiceSoftwareOptions (p. 317)): Container for the state of your domain relative to the latest service software.
UpdateElasticsearchDomainConfig
Modifies the configuration of an Amazon ES domain, such as the instance type and the number of
instances. You need to specify only the values that you want to update.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/domain/<DOMAIN_NAME>/config
{
"ElasticsearchClusterConfig": {
"ZoneAwarenessConfig": {
"AvailabilityZoneCount": 3
},
"ZoneAwarenessEnabled": true|false,
"InstanceCount": 3,
"DedicatedMasterEnabled": true|false,
"DedicatedMasterType": "c5.large.elasticsearch",
"DedicatedMasterCount": 3,
"InstanceType": "r5.large.elasticsearch",
"WarmCount": 6,
"WarmType": "ultrawarm1.medium.elasticsearch"
},
"EBSOptions": {
"EBSEnabled": true|false,
"VolumeType": "io1|gp2|standard",
"Iops": 1000,
"VolumeSize": 35
},
"SnapshotOptions": {
"AutomatedSnapshotStartHour": 3
},
"VPCOptions": {
"SubnetIds": ["subnet-abcdefg1", "subnet-abcdefg2", "subnet-abcdefg3"],
"SecurityGroupIds": ["sg-12345678"]
},
"AdvancedOptions": {
"rest.action.multi.allow_explicit_index": "true|false",
"indices.fielddata.cache.size": "40",
"indices.query.bool.max_clause_count": "1024"
},
"CognitoOptions": {
"Enabled": true|false,
"UserPoolId": "us-east-1_121234567",
"IdentityPoolId": "us-east-1:12345678-1234-1234-1234-123456789012",
"RoleArn": "arn:aws:iam::123456789012:role/service-role/CognitoAccessForAmazonES"
},
"DomainEndpointOptions": {
"EnforceHTTPS": true|false,
"TLSSecurityPolicy": "Policy-Min-TLS-1-2-2019-07|Policy-Min-TLS-1-0-2019-07"
},
"LogPublishingOptions": {
"SEARCH_SLOW_LOGS": {
"CloudWatchLogsLogGroupArn":"arn:aws:logs:us-east-1:264071961897:log-group1:sample-
domain",
"Enabled":true|false
},
"INDEX_SLOW_LOGS": {
"CloudWatchLogsLogGroupArn":"arn:aws:logs:us-east-1:264071961897:log-group2:sample-
domain",
"Enabled":true|false
},
"ES_APPLICATION_LOGS": {
"CloudWatchLogsLogGroupArn":"arn:aws:logs:us-east-1:264071961897:log-group3:sample-
domain",
"Enabled":true|false
}
},
"AdvancedSecurityOptions": {
"InternalUserDatabaseEnabled": true|false,
"MasterUserOptions": {
"MasterUserARN": "arn:aws:iam::123456789012:role/my-master-user-role"
"MasterUserName": "my-master-username",
"MasterUserPassword": "my-master-password"
}
},
"DomainName": "my-domain",
"AccessPolicies": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow
\",\"Principal\":{\"AWS\":[\"*\"]},\"Action\":[\"es:*\"],\"Resource\":\"arn:aws:es:us-
east-1:123456789012:domain/my-domain/*\"}]}"
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
Response Elements
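Because only the values you pass are changed, a configuration update can be quite small. A Boto3 sketch that resizes the data nodes of a placeholder domain:
import boto3

client = boto3.client("es", region_name="us-east-1")

response = client.update_elasticsearch_domain_config(
    DomainName="my-domain",
    ElasticsearchClusterConfig={
        "InstanceType": "r5.large.elasticsearch",
        "InstanceCount": 3,
    })
# Each config section in the response carries its own update status.
print(response["DomainConfig"]["ElasticsearchClusterConfig"]["Status"]["State"])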
UpgradeElasticsearchDomain
Upgrades an Amazon ES domain to a new version of Elasticsearch. Alternatively, checks upgrade eligibility.
Syntax
POST https://es.us-east-1.amazonaws.com/2015-01-01/es/upgradeDomain
{
"DomainName": "domain-name",
"TargetVersion": "7.7",
"PerformCheckOnly": true|false
}
Request Parameters
This operation does not use HTTP request parameters.
Request Body
Response Elements
UpgradeElasticsearchDomainResponse (Map): Basic response confirming operation details.
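A Boto3 sketch that runs the eligibility check first; setting PerformCheckOnly to False would start the actual upgrade. The domain name and target version are placeholders:
import boto3

client = boto3.client("es", region_name="us-east-1")

response = client.upgrade_elasticsearch_domain(
    DomainName="my-domain",
    TargetVersion="7.7",
    PerformCheckOnly=True)
print(response)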
Data Types
This section describes the data types used by the configuration API.
AdvancedOptions
Key-value pairs to specify advanced Elasticsearch configuration options.
Key-value pair: "rest.action.multi.allow_explicit_index":"true"
Note the use of a string rather than a boolean. Specifies whether explicit references to indices are allowed inside the body of HTTP requests. If you want to configure access policies for domain sub-resources, such as specific indices and domain APIs, you must disable this property. For more information about access policies for sub-resources, see the section called “Configuring Access Policies” (p. 13).
Key-value pair: "indices.query.bool.max_clause_count":"1024"
Note the use of a string rather than an integer. Specifies the maximum number of clauses allowed in a Lucene boolean query. 1,024 is the default. Queries with more than the permitted number of clauses result in a TooManyClauses error. To learn more, see the Lucene documentation.
ARN
String: Amazon Resource Name (ARN) of an Amazon ES domain.
AdvancedSecurityOptions
InternalUserDatabaseEnabled (Boolean): True to enable the internal user database.
MasterUserOptions (MasterUserOptions (p. 315)): Container for information about the master user.
CognitoOptions
UserPoolId (String): The Amazon Cognito user pool ID that you want Amazon ES to use for Kibana authentication.
CreateElasticsearchDomainRequest
Container for the parameters required by the CreateElasticsearchDomain service operation.
ElasticsearchClusterConfig (ElasticsearchClusterConfig (p. 309)): Container for the cluster configuration of an Amazon ES domain.
NodeToNodeEncryptionOptions (NodeToNodeEncryptionOptions (p. 315)): Specify true to enable node-to-node encryption.
DomainEndpointOptions
TLSSecurityPolicy (String): The minimum TLS version required for traffic to the domain. Valid values:
• Policy-Min-TLS-1-0-2019-07
• Policy-Min-TLS-1-2-2019-07
DomainID
String: Unique identifier for an Amazon ES domain.
DomainName
String: Name of an Amazon ES domain. Domain names are unique across all domains owned by the same account within an AWS Region. Domain names must start with a lowercase letter and must be between 3 and 28 characters. Valid characters are a-z (lowercase only), 0-9, and - (hyphen).
DomainNameList
List of Amazon ES domain name strings.
["<Domain_Name>","<Domain_Name>"...]
DomainPackageDetails
Information on a package that is associated with a domain.
EBSOptions
Container for the parameters required to enable EBS-based storage for an Amazon ES domain.
VolumeSize (String): Specifies the size (in GiB) of EBS volumes attached to data nodes.
ElasticsearchClusterConfig
Container for the cluster configuration of an Amazon ES domain.
ZoneAwarenessConfig (ZoneAwarenessConfig (p. 319)): Container for zone awareness configuration options. Required only if ZoneAwarenessEnabled is true.
WarmType (String): The instance type for the cluster's warm nodes.
ElasticsearchDomainConfig
Container for the configuration of an Amazon ES domain.
ElasticsearchClusterConfig (ElasticsearchClusterConfig (p. 309)): Container for the cluster configuration of an Amazon ES domain.
NodeToNodeEncryptionOptions (NodeToNodeEncryptionOptions (p. 315)): Whether node-to-node encryption is enabled or disabled.
ElasticsearchDomainStatus
Container for the contents of a DomainStatus data structure.
ElasticsearchClusterConfig (ElasticsearchClusterConfig (p. 309)): Container for the cluster configuration of an Amazon ES domain.
NodeToNodeEncryptionOptions (NodeToNodeEncryptionOptions (p. 315)): Whether node-to-node encryption is enabled or disabled.
ElasticsearchDomainStatusList
List that contains the status of each specified Amazon ES domain.
EncryptionAtRestOptions
Specifies whether the domain should encrypt data at rest, and if so, the AWS Key Management Service
(KMS) key to use. Can be used only to create a new domain, not update an existing one. To learn more,
see the section called “Enabling Encryption of Data at Rest” (p. 62).
EndpointsMap
The key-value pair that contains the VPC endpoint. Only exists if the Amazon ES domain resides in a VPC.
Filters
Filters the packages included in a DescribePackages (p. 287) response.
LogPublishingOptions
Specifies whether the Amazon ES domain publishes the Elasticsearch application and slow logs to
Amazon CloudWatch. You still have to enable the collection of slow logs using the Elasticsearch REST API.
To learn more, see the section called “Setting Elasticsearch Logging Thresholds for Slow Logs” (p. 44).
"CloudWatchLogsLogGroupArn":"arn:aws:logs:us-
east-1:264071961897:log-group:sample-
domain",
"Enabled":true
"CloudWatchLogsLogGroupArn":"arn:aws:logs:us-
east-1:264071961897:log-group:sample-
domain",
"Enabled":true
"CloudWatchLogsLogGroupArn":"arn:aws:logs:us-
east-1:264071961897:log-group:sample-
domain",
"Enabled":true
MasterUserOptions
MasterUserARN (String): ARN for the master user. Specify only if InternalUserDatabaseEnabled is false.
MasterUserName (String): Username for the master user. Specify only if InternalUserDatabaseEnabled is true.
MasterUserPassword (String): Password for the master user. Specify only if InternalUserDatabaseEnabled is true.
NodeToNodeEncryptionOptions
Enables or disables node-to-node encryption.
OptionState
State of an update to advanced options for an Amazon ES domain.
• RequiresIndexDocuments
• Processing
• Active
OptionStatus
Status of an update to configuration options for an Amazon ES domain.
PackageDetails
Basic information about a package.
PackageSource
Bucket and key for the package you want to add to Amazon ES.
ServiceSoftwareOptions
Container for the state of your domain relative to the latest service software.
ServiceURL
Domain-specific endpoint used to submit index, search, and data upload requests to an Amazon ES
domain.
SnapshotOptions
DEPRECATED. See the section called “Working with Index Snapshots” (p. 45). Container for parameters
required to configure the time of daily automated snapshots of the indices in an Amazon ES domain.
Tag
Key (TagKey (p. 318)): Required name of the tag. Tag keys must be unique for the Amazon ES domain to which they are attached. For more information, see Tagging Amazon Elasticsearch Service Domains (p. 57).
Value (TagValue (p. 318)): Optional string value of the tag. Tag values can be null and do not have to be unique in a tag set. For example, you can have a key-value pair in a tag set of project/Trinity and cost-center/Trinity.
TagKey
String: Name of the tag. String can have up to 128 characters.
TagList
List of resource tags attached to an Amazon ES domain.
TagValue
Value (String): Holds the value for a TagKey. String can have up to 256 characters.
VPCDerivedInfo
VPCId (String): The ID for your VPC. Amazon VPC generates this value when you create a VPC.
VPCOptions
ZoneAwarenessConfig
Errors
Amazon ES throws the following errors:
BaseException: Thrown for all service errors. Contains the HTTP status code of the error.
LimitExceededException: Thrown when trying to create more than the allowed number and type of Amazon ES domain resources and sub-resources. Returns HTTP status code 409.
AccessDeniedException: Thrown when the request is denied because the user does not have permission to access the resource. Returns HTTP status code 403.
ConflictException: Thrown when the request conflicts with the current state of the resource, such as trying to modify a resource that is in use. Returns HTTP status code 409.
Release Notes
The following list describes important changes to Amazon ES. For notifications about updates, you can subscribe to the RSS feed.
Important
Service software updates add support for new features, security patches, bug fixes, and other
improvements. To use new features, you might need to update the service software on your
domain. For more information, see the section called “Service Software Updates” (p. 15).
Learning to Rank (July 27, 2020): Amazon Elasticsearch Service now supports the open source Learning to Rank plugin, which lets you use machine learning technologies to improve search relevance. This feature requires service software R20200721 or later. For more information, see the documentation.
gzip Compression (July 23, 2020): Amazon Elasticsearch Service now supports gzip compression for most HTTP requests and responses, which can reduce latency and conserve bandwidth. This feature requires service software R20200721 or later. For more information, see the documentation.
KNN Cosine Similarity (July 23, 2020): KNN now lets you search for "nearest neighbors" by cosine similarity in addition to Euclidean distance. This feature requires service software R20200721 or later. For more information, see the documentation.
Kibana Map Service (June 18, 2020): The default installation of Kibana for Amazon Elasticsearch Service now includes a map service, except for domains in the India and China Regions. For more information, see Configuring Kibana to Use a WMS Map Server.
SQL Improvements (June 3, 2020): SQL support for Amazon Elasticsearch Service now includes many new operations, a dedicated Kibana user interface for data exploration, and an interactive CLI. For more information, see SQL Support.
Custom Dictionaries (April 21, 2020): Amazon Elasticsearch Service lets you upload custom dictionary files for use with your cluster. These files improve your search results by telling Elasticsearch to ignore certain high-frequency words or to treat terms as equivalent. For more information, see Custom Packages.
Encryption Features for China Regions (November 20, 2019): Encryption of data at rest and node-to-node encryption are now available in the cn-north-1 (Beijing) and cn-northwest-1 (Ningxia) Regions.
Require HTTPS (October 3, 2019): You can now require that all traffic to your Amazon ES domains arrive over HTTPS. When configuring your domain, check the Require HTTPS box. This feature requires service software R20190808 or later.
Elasticsearch 7.1 and 6.8 Support (August 13, 2019): Amazon Elasticsearch Service now supports Elasticsearch versions 7.1 and 6.8. To learn more, see Supported Elasticsearch Versions and Upgrading Elasticsearch.
Hourly Snapshots (July 8, 2019): Rather than daily snapshots, Amazon Elasticsearch Service now takes hourly snapshots of domains running Elasticsearch 5.3 and later so that you have more frequent backups from which to restore your data. To learn more, see Working with Amazon Elasticsearch Service Index Snapshots.
SQL Support (May 15, 2019): Amazon Elasticsearch Service now lets you query your data using SQL. For more information, see SQL Support. This feature requires service software R20190418 or later.
5-series Instance Types (April 24, 2019): Amazon Elasticsearch Service now supports M5, C5, and R5 instance types. Compared to previous-generation instance types, these new types offer better performance at lower prices. For more information, see Supported Instance Types and Limits.
Alerting (March 25, 2019): The alerting feature notifies you when data from one or more Elasticsearch indices meets certain conditions. To learn more, see Alerting. This feature requires service software R20190221 or later.
200-Node Clusters (January 22, 2019): Amazon ES now lets you create clusters with up to 200 data nodes for a total of 3 PB of storage. To learn more, see Petabyte Scale.
Elasticsearch 6.3 and 5.6 Support (August 14, 2018): Amazon Elasticsearch Service now supports Elasticsearch versions 6.3 and 5.6. To learn more, see Supported Elasticsearch Versions.
Error Logs (July 31, 2018): Amazon ES now lets you publish Elasticsearch error logs to Amazon CloudWatch. To learn more, see Configuring Logs.
China (Ningxia) Reserved Instances (May 29, 2018): Amazon ES now offers Reserved Instances in the China (Ningxia) Region.
Reserved Instances (May 7, 2018): Amazon ES now offers Reserved Instances. To learn more, see Amazon ES Reserved Instances.
Earlier Updates
The following list describes important changes to Amazon ES before May 2018.
Amazon Cognito Authentication for Kibana (April 2, 2018): Amazon ES now offers login page protection for Kibana. To learn more, see the section called “Authentication for Kibana” (p. 100).
Elasticsearch 6.2 Support (March 14, 2018): Amazon Elasticsearch Service now supports Elasticsearch version 6.2.
Korean Analysis Plugin (March 13, 2018): Amazon ES now supports a memory-optimized version of the Seunjeon Korean analysis plugin.
Instant Access Control Updates (March 7, 2018): Changes to the access control policies on Amazon ES domains now take effect instantly.
Petabyte Scale (December 19, 2017): Amazon ES now supports I3 instance types and total domain storage of up to 1.5 PB. To learn more, see the section called “Petabyte Scale” (p. 207).
Encryption of Data at Rest (December 7, 2017): Amazon ES now supports encryption of data at rest. To learn more, see the section called “Encryption at Rest” (p. 62).
Elasticsearch 6.0 Support (December 6, 2017): Amazon ES now supports Elasticsearch version 6.0. For migration considerations and instructions, see the section called “Upgrading Elasticsearch” (p. 52).
VPC Support (October 17, 2017): Amazon ES now lets you launch domains within an Amazon Virtual Private Cloud. VPC support provides an additional layer of security and simplifies communications between Amazon ES and other services within a VPC. To learn more, see the section called “VPC Support” (p. 20).
Slow Logs Publishing (October 16, 2017): Amazon ES now supports the publishing of slow logs to CloudWatch Logs. To learn more, see the section called “Configuring Logs” (p. 41).
Elasticsearch 5.5 Support (September 7, 2017): Amazon ES now supports Elasticsearch version 5.5. For new feature summaries, see the Amazon announcement of availability.
Elasticsearch 5.3 Support (June 1, 2017): Amazon ES added support for Elasticsearch version 5.3.
More Instances and EBS Capacity per Cluster (April 5, 2017): Amazon ES now supports up to 100 nodes and 150 TB of EBS capacity per cluster.
Canada (Central) and EU (London) Support (March 20, 2017): Amazon ES added support for the following Regions: Canada (Central), ca-central-1, and EU (London), eu-west-2.
More Instances and Larger EBS Volumes (February 21, 2017): Amazon ES added support for more instances and larger EBS volumes.
Elasticsearch 5.1 Support (January 30, 2017): Amazon ES added support for Elasticsearch version 5.1.
Support for the Phonetic Analysis Plugin (December 22, 2016): Amazon ES now provides built-in integration with the Phonetic Analysis plugin, which allows you to run "sounds-like" queries on your data.
US East (Ohio) Support (October 17, 2016): Amazon ES added support for the following Region: US East (Ohio), us-east-2.
Elasticsearch 2.3 Support (July 27, 2016): Amazon ES added support for Elasticsearch version 2.3.
Asia Pacific (Mumbai) Support (June 27, 2016): Amazon ES added support for the following Region: Asia Pacific (Mumbai), ap-south-1.
More Instances per Cluster (May 18, 2016): Amazon ES increased the maximum number of instances (instance count) per cluster from 10 to 20.
Asia Pacific (Seoul) Support (January 28, 2016): Amazon ES added support for the following Region: Asia Pacific (Seoul), ap-northeast-2.
AWS glossary
For the latest AWS terminology, see the AWS glossary in the AWS General Reference.