Scheduling Elastic Block Storage (EBS) Snapshots with AWS Lambda

  2015-11-17


Traditionally, scheduling snapshots of your Elastic Block Storage (EBS) volumes required the setup and maintenance of an EC2 instance or the use of a third-party service like Skeddly. Depending on cost or security concerns (having to grant a service like Skeddly access to your AWS account) this may not be an option. Additionally, storing your access keys on an EC2 instance may not be acceptable, even if you limit the IAM role to only allow the creation and deletion of snapshots.

Enter AWS Lambda. This service lets you write a small application in Node.js, Java, or Python that runs either on a schedule or in response to other events. In this article we will focus on a Python script that creates EBS snapshots once a day and deletes snapshots more than a week old to keep storage costs in check.

Just a couple notes before we begin:

  • The Python code below is a compilation of two articles located here and here. Big shout-out to the original author!
  • Lambda is billed by the number of requests made to your application (the number of times your Lambda function is triggered) and the amount of time it runs, measured in milliseconds. The cost will vary with how many volumes you are snapshotting. In my scenario I am only creating snapshots for a single volume, and at this low frequency it does not cost me a single penny. See this article for more details on pricing.
  • The type of backup we are configuring here is considered “crash consistent”. If any data is being written to the EBS volume when the snapshot starts, there is a chance of corrupt or lost data when restoring from that snapshot. The only way I know of to get an application-consistent backup is to power down the instance, start the snapshot, then start the instance again. This can be done in these scripts, but it is outside the scope of this particular article.
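
To put the billing note above into numbers, here is a rough back-of-the-envelope sketch. The prices and free-tier allowances in it are assumptions (0.20 USD per million requests, 0.00001667 USD per GB-second, with 1 million requests and 400,000 GB-seconds free each month), so check the pricing page for current figures. The run duration and memory size are made-up values for illustration.

```python
# Rough monthly Lambda cost estimate. All prices and free-tier figures are
# assumptions; substitute the current values from the AWS pricing page.
PRICE_PER_MILLION_REQUESTS = 0.20   # USD, assumed
PRICE_PER_GB_SECOND = 0.00001667    # USD, assumed
FREE_REQUESTS = 1000000             # per month, assumed
FREE_GB_SECONDS = 400000            # per month, assumed


def monthly_cost(invocations, seconds_per_run, memory_mb):
    """Return the estimated monthly bill in USD after the free tier."""
    gb_seconds = invocations * seconds_per_run * (memory_mb / 1024.0)
    billable_requests = max(0, invocations - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    return (billable_requests / 1000000.0 * PRICE_PER_MILLION_REQUESTS
            + billable_gb_seconds * PRICE_PER_GB_SECOND)


# Two functions, each run once a day for roughly 5 seconds at 128 MB
# (hypothetical figures): 60 invocations a month.
print(monthly_cost(invocations=60, seconds_per_run=5, memory_mb=128))  # 0.0
```

Under these assumptions a pair of daily functions stays far inside the free tier, which matches the "not a single penny" experience described above.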

With that out of the way, on to the configuration…

Create a new Identity and Access Management (IAM) role in the AWS console

  1. Login to the AWS Console and go to Identity & Access Management
  2. Click Roles on the left navigation
  3. Click Create New Role
  4. Name the role (no spaces allowed) and click Next Step
  5. Click the Select button for AWS Lambda
  6. Do not attach any policies, just click Next Step
  7. Click Create Role
  8. The new role has been created and you are returned to the Roles list. Click the role you just created so that we can add the new custom policy.
  9. Expand Inline Policies and click the “click here” link
  10. Choose Custom Policy and click the Select button
  11. Name the policy (you can just call it the same thing you did in step 4)
  12. Paste in the following policy document:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["logs:*"],
                "Resource": "arn:aws:logs:*:*:*"
            },
            {
                "Effect": "Allow",
                "Action": "ec2:Describe*",
                "Resource": "*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:CreateSnapshot",
                    "ec2:DeleteSnapshot",
                    "ec2:CreateTags",
                    "ec2:ModifySnapshotAttribute",
                    "ec2:ResetSnapshotAttribute"
                ],
                "Resource": ["*"]
            }
        ]
    }
    
  13. Click Apply Policy. The new role has been created and is ready for use by Lambda.
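
If you would rather script the console steps above, the following is a minimal sketch using boto3's IAM client. The function name `create_snapshot_role` and the role name in the usage comment are hypothetical. The trust policy is my assumption of what selecting AWS Lambda in step 5 sets up, and the inline policy mirrors the document from step 12.

```python
import json

# Trust policy that lets the Lambda service assume the role (assumed to be
# what the console creates when AWS Lambda is selected as the role type).
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Same permissions as the policy document pasted in step 12.
SNAPSHOT_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["logs:*"],
         "Resource": "arn:aws:logs:*:*:*"},
        {"Effect": "Allow", "Action": "ec2:Describe*", "Resource": "*"},
        {"Effect": "Allow",
         "Action": ["ec2:CreateSnapshot", "ec2:DeleteSnapshot",
                    "ec2:CreateTags", "ec2:ModifySnapshotAttribute",
                    "ec2:ResetSnapshotAttribute"],
         "Resource": ["*"]},
    ],
}


def create_snapshot_role(iam, role_name):
    """Create the role and attach the snapshot policy as an inline policy."""
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=role_name,  # same name as the role, as in step 11
        PolicyDocument=json.dumps(SNAPSHOT_POLICY),
    )


# Usage (needs boto3 and credentials that are allowed to manage IAM):
# import boto3
# create_snapshot_role(boto3.client('iam'), 'ebs-snapshot-lambda')
```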

Create a New Lambda Function to Create the Snapshots

  1. Go to the Lambda console and click Get Started. Choose to create a new Lambda function.
  2. Click Skip on the Select blueprint page
  3. Give your function a name and optionally a description
  4. Choose Python 2.7 for the Runtime
  5. Paste in the following code:

    import boto3
    import collections
    import datetime
    
    ec = boto3.client('ec2')
    
    def lambda_handler(event, context):
        # Find every instance that carries a "backup" or "Backup" tag key.
        reservations = ec.describe_instances(
            Filters=[
                {'Name': 'tag-key', 'Values': ['backup', 'Backup']},
            ]
        ).get(
            'Reservations', []
        )
    
        # Flatten the per-reservation instance lists into a single list.
        instances = [i for r in reservations for i in r['Instances']]
    
        print("Found %d instances that need backing up" % len(instances))
    
        # Maps a retention period (in days) to the snapshot IDs that use it.
        to_tag = collections.defaultdict(list)
    
        for instance in instances:
            # An optional "Retention" tag overrides the default of 7 days.
            # Fall back to the default if the tag is missing or not a number.
            try:
                retention_days = [
                    int(t.get('Value')) for t in instance['Tags']
                    if t['Key'] == 'Retention'][0]
            except (IndexError, ValueError):
                retention_days = 7
    
            for dev in instance['BlockDeviceMappings']:
                # Skip devices that are not EBS volumes.
                if dev.get('Ebs', None) is None:
                    continue
                vol_id = dev['Ebs']['VolumeId']
                print("Found EBS volume %s on instance %s" % (
                    vol_id, instance['InstanceId']))
    
                snap = ec.create_snapshot(
                    VolumeId=vol_id,
                )
    
                to_tag[retention_days].append(snap['SnapshotId'])
    
                print("Retaining snapshot %s of volume %s from instance %s for %d days" % (
                    snap['SnapshotId'],
                    vol_id,
                    instance['InstanceId'],
                    retention_days,
                ))
    
        # Tag each group of snapshots with the date on which it should be deleted.
        for retention_days in to_tag.keys():
            delete_date = datetime.date.today() + datetime.timedelta(days=retention_days)
            delete_fmt = delete_date.strftime('%Y-%m-%d')
            print("Will delete %d snapshots on %s" % (len(to_tag[retention_days]), delete_fmt))
            ec.create_tags(
                Resources=to_tag[retention_days],
                Tags=[
                    {'Key': 'DeleteOn', 'Value': delete_fmt},
                ]
            )
    
  6. Under the code editor section choose the new role you created from the Role drop-down.

  7. Click Next

  8. Click Create function.
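
The retention logic in the function above can be tried locally without touching AWS. The sketch below pulls it into two helper functions; both names are hypothetical and not part of the original script. One reads an optional Retention tag and falls back to seven days, the other computes the DeleteOn date. Treating a non-numeric Retention value as "use the default" is my own choice here.

```python
import datetime

DEFAULT_RETENTION_DAYS = 7


def retention_for(tags, default=DEFAULT_RETENTION_DAYS):
    """Return the 'Retention' tag value as an int, or the default."""
    for tag in tags:
        if tag['Key'] == 'Retention':
            try:
                return int(tag['Value'])
            except ValueError:
                return default  # tag present but not a whole number
    return default


def delete_on(today, retention_days):
    """Format the date on which a snapshot taken today should be deleted."""
    return (today + datetime.timedelta(days=retention_days)).strftime('%Y-%m-%d')


# Tags in the shape returned by describe_instances.
tags = [{'Key': 'Backup', 'Value': 'Backup'},
        {'Key': 'Retention', 'Value': '14'}]
print(delete_on(datetime.date(2015, 11, 17), retention_for(tags)))  # 2015-12-01
```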

Go into the EC2 console and add a tag to any EC2 Instances that will be included in the backup.
Create a new tag on each instance you would like to include in the backup. Enter Backup for both the Key and the Value. The script matches on the tag key only (Backup or backup) and will create snapshots of every EBS volume attached to a tagged instance.
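
If you have more than a handful of instances, tagging can also be scripted. This is a minimal sketch that assumes a boto3 EC2 client is passed in. The helper name `tag_for_backup` and the instance ID in the usage comment are placeholders. The optional Retention tag is the one the snapshot script reads to override its seven-day default.

```python
def tag_for_backup(ec2, instance_ids, retention_days=None):
    """Add the Backup tag (and optionally a Retention tag) to instances."""
    tags = [{'Key': 'Backup', 'Value': 'Backup'}]
    if retention_days is not None:
        # Tag values are strings; the snapshot script converts this to an int.
        tags.append({'Key': 'Retention', 'Value': str(retention_days)})
    ec2.create_tags(Resources=list(instance_ids), Tags=tags)
    return tags


# Usage (i-0123456789abcdef0 is a placeholder instance ID):
# import boto3
# tag_for_backup(boto3.client('ec2'), ['i-0123456789abcdef0'], retention_days=14)
```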

Test the Lambda Function

  1. Go back to the Lambda console and click the function you created earlier.
  2. Click the Test button. Leave “Hello World” selected for the Sample event template and click Save and test. If you go back to the EC2 console and click snapshots you should see a snapshot in the process of being created.
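
As an alternative to checking the EC2 console by eye, the sketch below lists the snapshots that carry a DeleteOn tag, together with their state and scheduled deletion date. It assumes a boto3 EC2 client is passed in, and the helper name `pending_snapshots` is hypothetical.

```python
def pending_snapshots(ec2):
    """Return (snapshot id, state, DeleteOn date) for tagged snapshots."""
    response = ec2.describe_snapshots(
        OwnerIds=['self'],  # only snapshots owned by this account
        Filters=[{'Name': 'tag-key', 'Values': ['DeleteOn']}],
    )
    rows = []
    for snap in response['Snapshots']:
        tags = dict((t['Key'], t['Value']) for t in snap.get('Tags', []))
        rows.append((snap['SnapshotId'], snap['State'], tags.get('DeleteOn')))
    return rows


# Usage:
# import boto3
# for row in pending_snapshots(boto3.client('ec2')):
#     print("%s  %s  delete on %s" % row)
```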

Create Another Function to Delete Old Backups
Create another Lambda function as we did in the “Create a New Lambda Function to Create the Snapshots” section of this how-to, but use the following code for step 5 instead. Replace the '12345' in account_ids = ['12345'] with your actual AWS account number (found on the My Account page of the AWS console).

    import boto3
    import datetime
    
    ec = boto3.client('ec2')
    
    # This function looks at *all* snapshots that have a "DeleteOn" tag containing
    # the current day formatted as YYYY-MM-DD. This function should be run at least
    # daily.
    
    # To get your account id, run this snippet:
    # > import boto3
    # > iam = boto3.client('iam')
    # > print(iam.get_user()['User']['Arn'].split(':')[4])
    account_ids = ['12345']
    
    def lambda_handler(event, context):
        delete_on = datetime.date.today().strftime('%Y-%m-%d')
        # Match only snapshots whose DeleteOn tag holds today's date. Separate
        # tag-key and tag-value filters would also match a snapshot that has a
        # DeleteOn tag plus today's date as the value of some other tag.
        filters = [
            {'Name': 'tag:DeleteOn', 'Values': [delete_on]},
        ]
        snapshot_response = ec.describe_snapshots(OwnerIds=account_ids, Filters=filters)
    
        for snap in snapshot_response['Snapshots']:
            print("Deleting snapshot %s" % snap['SnapshotId'])
            ec.delete_snapshot(SnapshotId=snap['SnapshotId'])
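
The function above only matches snapshots whose DeleteOn tag equals today's date, so a snapshot is left behind if the function fails or is skipped on that day. One possible variation, offered as a sketch and not as the original author's code, is to treat any DeleteOn date on or before today as expired. The helper name `expired_snapshot_ids` is hypothetical.

```python
import datetime


def expired_snapshot_ids(snapshots, today):
    """Return IDs of snapshots whose DeleteOn date is today or earlier."""
    expired = []
    for snap in snapshots:
        tags = dict((t['Key'], t['Value']) for t in snap.get('Tags', []))
        try:
            delete_on = datetime.datetime.strptime(
                tags['DeleteOn'], '%Y-%m-%d').date()
        except (KeyError, ValueError):
            continue  # no DeleteOn tag, or not YYYY-MM-DD: leave it alone
        if delete_on <= today:
            expired.append(snap['SnapshotId'])
    return expired


# Sample data in the shape returned by describe_snapshots.
sample = [
    {'SnapshotId': 'snap-old', 'Tags': [{'Key': 'DeleteOn', 'Value': '2015-11-10'}]},
    {'SnapshotId': 'snap-new', 'Tags': [{'Key': 'DeleteOn', 'Value': '2015-11-24'}]},
]
print(expired_snapshot_ids(sample, datetime.date(2015, 11, 17)))  # ['snap-old']
```

Inside the Lambda function you would feed it the 'Snapshots' list from a describe_snapshots call filtered on the DeleteOn tag key, then call delete_snapshot for each returned ID.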

Adjusting Retention
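Retention duration by default is seven days. You can change this by modifying the snapshot creation code in the “Create a New Lambda Function to Create the Snapshots” section of this how-to. The specific line you are looking for reads “retention_days = 7”. The script also reads an optional Retention tag on each instance, so you can override the default for a single instance by adding a tag with the key Retention and the number of days as its value.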

Scheduling the Lambda Functions

  1. Lastly, we need to schedule our two new functions to run daily (or as often as you would like). Select one of the functions and choose the Event sources tab.
  2. Click Add event source
  3. Choose Scheduled Event and give it a name
  4. Enter cron(0 7 * * ? *) for the Schedule expression. This runs the function every day at 07:00 UTC; change the minute and hour fields to pick a time of your choosing. See this link for more information on how to specify a cron expression.
  5. Click Submit. The function will now run on the schedule you specified. Repeat these steps for the other function.
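
The schedule expression in step 4 uses the six-field AWS cron format (minutes, hours, day-of-month, month, day-of-week, year), evaluated in UTC. As a small illustration, this hypothetical helper builds a once-a-day expression for a given time:

```python
def daily_cron(hour_utc, minute=0):
    """Build a schedule expression that fires once a day at the given UTC time."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError('hour must be 0-23 and minute 0-59')
    # Fields: minutes hours day-of-month month day-of-week year.
    # The "?" in day-of-week means "no specific value".
    return 'cron(%d %d * * ? *)' % (minute, hour_utc)


print(daily_cron(7))       # cron(0 7 * * ? *)
print(daily_cron(23, 30))  # cron(30 23 * * ? *)
```

Remember to convert your local time to UTC before choosing the hour.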

That’s it, you should now have daily backups of your instance(s) based on the schedule you specified. If you would like more frequent backups just change the first function you created to run more frequently.
