
EC2 Snapshots – finding snapshots for volumes that no longer exist

I have at times managed dozens of EC2 instances, each with its own volume. Alongside these instances, I’ve had dozens of additional volumes that were attached to various instances or used for other purposes.

In an earlier post, I showed you how to automatically create snapshots of your volumes. This can come in handy. It can also leave you with dozens of snapshots for volumes that no longer exist. Stored snapshots take up space and cost you money, and it can really add up if you aren’t good about cleaning up. If you automate snapshot creation, you will over time end up with snapshots that are of no value to you.

KEEP IN MIND that you may purposely have snapshots for volumes that no longer exist. You may (for example) have snapshots that were backups of older volumes that you wish to keep around. EC2 allows you to label/describe your snapshots so you can differentiate those.

So I’ve written a quick shell one-liner that shows all of the snapshots you have that do not match up with a current volume. Make sure you have your environment set up and the latest version of the EC2 API Tools installed and in your PATH. You can place this command in a cron entry, or simply run it from the command line. It relies on bash process substitution (the <(...) syntax), so run it under bash rather than plain sh. Here is the command:

for s in $(comm -13 <(ec2-describe-volumes |cut -f2 |sort -u) <(ec2-describe-snapshots |cut -f3| sort -u)); do ec2-describe-snapshots |grep "$s"; done

Just like a few of my earlier scripts, this one-line loop uses the AWS command line tools. I'll explain what is happening.

The bulk of the activity happens in this chunk of code:

comm -13 <(ec2-describe-volumes |cut -f2 |sort -u) <(ec2-describe-snapshots |cut -f3| sort -u)

The comm command will compare the output from the first list (the list of volumes) to the second list (the list of volumes that have snapshots).

The first list is the sorted, unique list of identifiers for the volumes that currently exist.
The second list is the sorted, unique list of volume identifiers recorded on your snapshots.

The -13 option tells comm to suppress lines unique to the first list and lines that are common to both lists. What we end up with is the list of snapshotted volumes that aren't in the list of current volumes. Pretty clever, huh?
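To see the comm -13 behaviour in isolation, here is a minimal sketch using made-up volume IDs instead of real ec2-describe-* output. It writes the two lists to temporary files rather than using <(...), purely so that it runs in any POSIX shell; the file names and IDs are hypothetical.

```shell
# Toy stand-ins for the two sorted lists (hypothetical volume IDs).
tmpdir=$(mktemp -d)
printf '%s\n' vol-aaaa1111 vol-bbbb2222 | sort -u > "$tmpdir/volumes"
printf '%s\n' vol-dddd4444 vol-aaaa1111 vol-cccc3333 | sort -u > "$tmpdir/snapshotted"

# -1 drops lines found only in the first file, -3 drops lines found in both,
# so only lines unique to the second file remain.
orphaned=$(comm -13 "$tmpdir/volumes" "$tmpdir/snapshotted")
rm -rf "$tmpdir"

echo "$orphaned"
# prints:
# vol-cccc3333
# vol-dddd4444
```

vol-aaaa1111 is dropped because it appears in both lists, and vol-bbbb2222 is dropped because it has no snapshots at all.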

Next we iterate through this list, running ec2-describe-snapshots once for each orphaned volume and grepping its output for that volume ID. WARNING: every pass through the loop calls the API again, so this can be a time-consuming operation, especially if you have hundreds of volumes!
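If the repeated API calls become a problem, one possible variation is to save the output of each command once and do the matching locally. The sketch below is not the one-liner from this post but an alternative under a few assumptions: the tools emit tab-separated rows (as the cut -f calls imply), the function name orphan_snapshots and the file names are hypothetical, and awk stands in for the comm/grep combination.

```shell
# orphan_snapshots VOLUMES_FILE SNAPSHOTS_FILE
# VOLUMES_FILE:   saved output of ec2-describe-volumes
# SNAPSHOTS_FILE: saved output of ec2-describe-snapshots
orphan_snapshots() {
  awk -F'\t' '
    # First file: remember every volume ID that still exists (field 2).
    FILENAME == ARGV[1] { live[$2] = 1; next }
    # Second file: print SNAPSHOT rows whose volume ID (field 3) is not live.
    $1 == "SNAPSHOT" && !($3 in live)
  ' "$1" "$2"
}

# Usage (needs the EC2 API Tools, so it is left commented out here):
#   ec2-describe-volumes   > volumes.txt
#   ec2-describe-snapshots > snapshots.txt
#   orphan_snapshots volumes.txt snapshots.txt
```

This makes two API calls in total, however many orphaned volumes there are.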

The output of the one-liner is just the list of snapshots whose volumes no longer exist. Here is an example:

SNAPSHOT snap-xxxxxxxx vol-xxxxxxxx completed 2010-05-24T20:49:17+0000 100% XXXXXXXXXXXX 15 Lengthy Snapshot Description provided by you.
SNAPSHOT snap-xxxxxxxx vol-xxxxxxxx completed 2010-08-02T02:34:02+0000 100% XXXXXXXXXXXX 15 KEEP THIS snapshot forever
SNAPSHOT snap-xxxxxxxx vol-xxxxxxxx completed 2010-05-24T20:49:17+0000 100% XXXXXXXXXXXX 15 Automated backup of a volume we don't care about anymore
SNAPSHOT snap-xxxxxxxx vol-xxxxxxxx completed 2010-05-24T20:49:17+0000 100% XXXXXXXXXXXX 15 Perhaps another backup we don't care about

Now that you have this list, you can use ec2-delete-snapshot to delete the unneeded snapshots individually.

Don't you feel better for keeping your EC2 world a little cleaner?
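If the list is long, a cautious way to proceed is to generate the delete commands first and review them before running anything. The sketch below reads tab-separated SNAPSHOT rows like the sample output on standard input and prints one ec2-delete-snapshot command per row. The function name print_delete_commands and the KEEP marker are my own hypothetical convention (borrowed from the "KEEP THIS" description in the sample), not an EC2 feature.

```shell
# Rows whose text contains this marker are skipped (hypothetical convention).
KEEP_MARKER=${KEEP_MARKER:-KEEP}

# Reads SNAPSHOT rows on stdin and prints, but does not run, delete commands.
print_delete_commands() {
  grep -vF -- "$KEEP_MARKER" |
    awk -F'\t' '$1 == "SNAPSHOT" { print "ec2-delete-snapshot " $2 }'
}

# Usage: save the orphaned-snapshot list to a file, then review the commands:
#   print_delete_commands < orphaned.txt
# Only after checking the list, pipe it to a shell to run the deletions:
#   print_delete_commands < orphaned.txt | sh
```

Printing the commands instead of running them gives you a chance to spot a snapshot you meant to keep.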
