
HOWTO: Manually Backup Snapshots via Amazon Elasticsearch Service

Like many of you, when I recently needed to manually back up a few Amazon Elasticsearch Service clusters, my first stop was the AWS ES FAQ page. At the bottom of the page, there are a few relevant pieces of information (re-posted here for convenience):

Q: Can I create additional snapshots of my Amazon Elasticsearch domains as needed?

Yes. You can use the Elasticsearch snapshot API to create additional manual snapshots in addition to the daily-automated snapshots created by Amazon Elasticsearch Service. The manual snapshots are stored in your S3 bucket and will incur relevant Amazon S3 usage charges.

Q: Can snapshots created by the manual snapshot process be used to recover a domain in the event of a failure?

Yes. Customers can create a new Amazon Elasticsearch domain and load data from the snapshot into the newly created Amazon Elasticsearch domain using the Elasticsearch restore API.

Q: What happens to my snapshots when I delete my Amazon Elasticsearch domain?

The daily snapshots retained by Amazon Elasticsearch Service will be deleted as part of domain deletion. Before deleting a domain, you should consider creating a snapshot of the domain in your own S3 buckets using the manual snapshot process. The snapshots stored in your S3 bucket will not be affected if you delete your Amazon Elasticsearch domain.

Given the above information, I next looked through the ES API documentation (pg 44) to see what I could learn about using the AWS CLI and anything else needed to take a manual snapshot.

I also came across this seemingly random but useful reference.

Ultimately, I was able to string together a few steps that I thought might save others from endlessly searching through AWS Forums, Stack Overflow, etc.

Basically, the steps include:

1) Setting up a few environment variables in your console to help the AWS CLI do its magic
2) Configuring your AWS CLI IAM User with the appropriate IAM policy to actually make the various requests properly
3) Setting up the basic S3 repository for your manually backed-up ES cluster
4) Initiating the manual backup
5) Confirming that everything worked correctly and comparing the size of the contents of the new S3 bucket with the ES cluster

So first up, let’s get a few things set in the console.

Console Configuration

The following environment variables are read by the Python script below. You can either export them as shown here or edit the values in-line within the Python script.

export ES_MANUAL_SNAPSHOT_ROLENAME=your_es_s3_rolename
export ES_MANUAL_SNAPSHOT_IAM_POLICY_NAME=your_es_iam_policy_name
export ES_MANUAL_SNAPSHOT_S3_BUCKET=your_es_backup_s3_bucket_name
export ES_IAM_MANUAL_SNAPSHOT_ROLE_ARN=your_snapshot_role_arn
export ES_REGION=your_es_aws_region
export ES_CLUSTER_DNS=your_es_dns_name
export ES_AWS_ACCESS_KEY_ID=your_aws_cli_user_access_key_id
export ES_AWS_SECRET_ACCESS_KEY=your_aws_cli_user_secret_access_key
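
Since a missing or unexported variable only shows up as a KeyError once the script runs, it can help to check the environment first. The following is a minimal sketch, not part of the original script; the helper name missing_vars is hypothetical, and the list covers the six variables that the Python script below actually reads.

```python
import os

# The six names the registration script looks up in os.environ.
REQUIRED_VARS = [
    "ES_MANUAL_SNAPSHOT_S3_BUCKET",
    "ES_IAM_MANUAL_SNAPSHOT_ROLE_ARN",
    "ES_REGION",
    "ES_CLUSTER_DNS",
    "ES_AWS_ACCESS_KEY_ID",
    "ES_AWS_SECRET_ACCESS_KEY",
]


def missing_vars(env):
    """Return the required names that are absent or empty in env.

    Hypothetical helper for illustration; env is any dict-like mapping.
    """
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

A variable that was assigned in the shell but never exported will be reported as missing here, because child processes such as Python only see exported variables.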

AWS CLI IAM

Next, let’s create a policy that will apply to the AWS IAM User that you use for the AWS CLI.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1469200763880",
            "Action": [
                "iam:AttachRolePolicy",
                "iam:CreateRole",
                "iam:PutRolePolicy",
                "iam:PassRole"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}

AWS IAM Role for Interacting with S3

The policy for this role allows reading from and writing to the S3 bucket that will hold your snapshots, so make sure you replace the sample bucket name (sample-es-service-backup-bucket) with your own in both of the Resource entries below.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::sample-es-service-backup-bucket"
            ]
        },
        {
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "iam:PassRole"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::sample-es-service-backup-bucket/*"
            ]
        }
    ]
}
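
Because the bucket name appears in two places, it is easy to update one and forget the other. As a rough sketch (the helper name snapshot_role_policy is hypothetical, and the actions are simply copied from the policy above), the document can be generated from a single bucket name:

```python
import json


def snapshot_role_policy(bucket_name):
    """Build the S3 policy shown above for one bucket (hypothetical helper)."""
    bucket_arn = "arn:aws:s3:::" + bucket_name
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Action": ["s3:ListBucket"],
                "Effect": "Allow",
                "Resource": [bucket_arn],
            },
            {
                # Object-level actions apply to everything inside the bucket.
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "iam:PassRole",
                ],
                "Effect": "Allow",
                "Resource": [bucket_arn + "/*"],
            },
        ],
    }


if __name__ == "__main__":
    policy = snapshot_role_policy("sample-es-service-backup-bucket")
    print(json.dumps(policy, indent=4))
```

The printed JSON can then be pasted into the IAM console in place of the hand-edited version.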

Python Script

Here’s a sample Python script that uses the environment variables set up earlier. Take note that the snapshot repository name (es-index-backups) is still hard-coded in the path passed to make_request.

import json
import os

from boto.connection import AWSAuthConnection


class ESConnection(AWSAuthConnection):
    """boto connection that signs requests for the "es" service."""

    def __init__(self, region, **kwargs):
        super(ESConnection, self).__init__(**kwargs)
        self._set_auth_region_name(region)
        self._set_auth_service_name("es")

    def _required_auth_capability(self):
        return ['hmac-v4']


if __name__ == "__main__":

    client = ESConnection(
        region=os.environ['ES_REGION'],
        host=os.environ['ES_CLUSTER_DNS'],
        aws_access_key_id=os.environ['ES_AWS_ACCESS_KEY_ID'],
        aws_secret_access_key=os.environ['ES_AWS_SECRET_ACCESS_KEY'],
        is_secure=False)

    # Repository settings: the S3 bucket, its region, and the IAM role ARN.
    data = json.dumps({
        "type": "s3",
        "settings": {
            "bucket": os.environ['ES_MANUAL_SNAPSHOT_S3_BUCKET'],
            "region": os.environ['ES_REGION'],
            "role_arn": os.environ['ES_IAM_MANUAL_SNAPSHOT_ROLE_ARN'],
        },
    })

    print('Request body: ' + data)

    print('Registering Snapshot Repository')
    # The repository name (es-index-backups) is hard-coded in this path.
    resp = client.make_request(method='POST',
                               path='/_snapshot/es-index-backups',
                               data=data)
    body = resp.read()

    print('Response body: ' + body.decode('utf-8'))

If you run into issues running the script, make sure to update ‘boto’ using the following command:
sudo pip install --upgrade boto

On successfully registering this new Snapshot Repository, you should receive the following message:
{"acknowledged":true}
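
If you want the script to fail loudly rather than relying on reading the printed output, the response body can be checked in code. This is only a sketch: the helper name repository_registered is hypothetical, and it assumes the success response is the JSON object shown above.

```python
import json


def repository_registered(response_body):
    """Return True only if the body is JSON with "acknowledged": true.

    Hypothetical helper; accepts the raw response body as str or bytes.
    """
    try:
        parsed = json.loads(response_body)
    except ValueError:
        # Not JSON at all, e.g. an HTML error page or an empty body.
        return False
    return isinstance(parsed, dict) and parsed.get("acknowledged") is True


if __name__ == "__main__":
    print(repository_registered('{"acknowledged":true}'))
    print(repository_registered('{"message": "some error"}'))
```

In the registration script, this could be called on the value returned by resp.read() before moving on to the snapshot step.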

Manual Snapshot Backup

The last step is to initiate a manual snapshot backup to this new Snapshot Repository with the following command:
curl -XPUT 'http://es-service-aws-cluster-dns.us-east-1.es.amazonaws.com/_snapshot/es-index-backups/snapshot'

Be sure to change this to your ES cluster (es-service-aws-cluster-dns), AWS region (us-east-1), and snapshot repository (es-index-backups). The final path segment (snapshot) is the name given to the snapshot itself.

On successfully initiating the manual snapshot backup, you should receive the following message:
{"accepted":true}
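
Since the URL is assembled from three pieces that each need changing, a small helper can make the substitutions less error-prone. The sketch below is an illustration only: both function names are hypothetical, and the dated naming scheme is an assumption of mine rather than something the service requires.

```python
import datetime


def snapshot_name(prefix, day):
    """Return a dated name such as 'snapshot-2016-07-22'.

    The prefix-plus-date scheme is an assumption for illustration.
    """
    return "{0}-{1}".format(prefix, day.isoformat())


def snapshot_url(cluster_dns, repository, name):
    """Build the URL used by the curl -XPUT command above."""
    return "http://{0}/_snapshot/{1}/{2}".format(cluster_dns, repository, name)


if __name__ == "__main__":
    name = snapshot_name("snapshot", datetime.date.today())
    print(snapshot_url(
        "es-service-aws-cluster-dns.us-east-1.es.amazonaws.com",
        "es-index-backups",
        name))
```

Using a different name for each run keeps repeated backups from colliding with a snapshot that already exists in the repository.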

If you get a RepositoryMissingException error, it might be because your snapshot repository was registered with a different name than what you passed above (e.g. es-index-backups) so double check to make sure they are the same.
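
For the final step in the list at the top (confirming that everything worked), one option is to ask the cluster for the snapshots in the repository with a GET request to /_snapshot/es-index-backups/_all and look at each snapshot's state. The sketch below only parses such a response; the helper name snapshot_states is hypothetical, and the sample body is a trimmed, made-up example of the layout I would expect from the Elasticsearch snapshot API, not output captured from a real cluster. It does not cover comparing sizes with the S3 bucket.

```python
import json


def snapshot_states(response_body):
    """Map each snapshot name to its state from a snapshot listing response.

    Hypothetical helper; assumes a body shaped like
    {"snapshots": [{"snapshot": "<name>", "state": "<STATE>", ...}]}.
    """
    parsed = json.loads(response_body)
    return dict(
        (item["snapshot"], item["state"])
        for item in parsed.get("snapshots", [])
    )


if __name__ == "__main__":
    # Made-up sample body for illustration only.
    sample = json.dumps({
        "snapshots": [
            {"snapshot": "snapshot", "state": "SUCCESS"},
        ]
    })
    for name, state in snapshot_states(sample).items():
        print(name + ": " + state)
```

A snapshot that is still running reports a state of IN_PROGRESS, so it may take a few polls before SUCCESS appears.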

Jonathan

Published in AWS