Automating EC2 EBS Snapshot Cleanup
I’ve recently taken on the task of building, and now administering, a cluster of Amazon EC2 instances. The EC2 command line tools provide the basic functionality you’ll need for creating new instances, EBS volumes, and snapshots, and for nearly everything else you would want to do with those assets. The one missing piece was a script to clean up snapshots. Snapshots are stored in S3 on your account’s behalf and accumulate until you delete them, and you pay for that storage.
So the problem in a nutshell: I have 10 volumes, each of which is snapshotted by cron at various times of the day (how often depends on the specific volume). With 10 volumes, my S3 storage costs can get out of hand quite quickly. So I needed to develop a set of scripts that would scan my snapshots and remove the oldest ones, so I’m not paying for storage I don’t need. It is important to keep at least a couple of snapshots for each volume, and in some cases I’d like to keep several. For example, one of my volumes stores the main database for the CMS, and it is backed up once every two hours. For that specific volume, I’d like to always have my choice of the last 10 snapshots to restore. If the database all of a sudden becomes corrupt, it may be necessary to restore earlier backups to see where and when the corruption started. Other volumes may only require the last 1 or 2 snapshots. So the script needed to be flexible, in that I could specify how many snapshots to keep for each volume.
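To make the retention rule concrete, here is a minimal sketch of the selection logic in Python. It is not the actual cleanup script and it makes no AWS calls: it assumes you have already collected each snapshot’s ID, volume ID, and start time from the EC2 tools. The volume IDs, the `KEEP` table, the `DEFAULT_KEEP` fallback, and the function name are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical per-volume policy: volume ID -> number of snapshots to keep.
KEEP = {"vol-db": 10, "vol-logs": 2}
DEFAULT_KEEP = 2  # assumed fallback for volumes not listed in KEEP


def snapshots_to_delete(snapshots, keep=KEEP, default_keep=DEFAULT_KEEP):
    """Return the IDs of snapshots that fall outside each volume's quota.

    `snapshots` is an iterable of (snapshot_id, volume_id, start_time)
    tuples, where start_time is a datetime.
    """
    # Group snapshots by the volume they were taken from.
    by_volume = defaultdict(list)
    for snap_id, vol_id, started in snapshots:
        by_volume[vol_id].append((started, snap_id))

    doomed = []
    for vol_id, snaps in by_volume.items():
        snaps.sort(reverse=True)  # newest first
        quota = keep.get(vol_id, default_keep)
        # Everything past the newest `quota` snapshots is a deletion candidate.
        doomed.extend(snap_id for _, snap_id in snaps[quota:])
    return doomed


if __name__ == "__main__":
    # Twelve two-hourly database snapshots; only the newest ten are kept.
    sample = [("snap-db-%02d" % i, "vol-db", datetime(2010, 1, 1, 2 * i))
              for i in range(12)]
    for snap_id in snapshots_to_delete(sample):
        print("would delete", snap_id)
```

Keeping the selection step separate from the deletion step means you can print the candidate list and check it before anything is actually removed.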

