When you first start using Eucalyptus, you are mostly concerned with the intricacies of building images, configuring virtual instances and using the full variety of API features and tools that Eucalyptus offers.
However, when Eucalyptus becomes a tool your business relies on, you need to make sure you can recover the system from a catastrophic failure. In the enterprise this usually means making adequate backups and being able to restore the whole system from them, either through an automated procedure using tools such as Ansible, Puppet or Chef, or through a documented manual restore procedure.
Of course, like most things, it's never as easy as it seems it should be! :-)
This article will cover backing up and manually restoring the Eucalyptus Cloud Controller (CLC) to a known state.
The Eucalyptus CLC is the "brain" of your whole cloud. It stores metadata for all images, users, accounts and policy settings, along with all state related to your cloud. It is one of the larger pieces you need to consider when backing up your cloud. However, other components need backing up too, such as the Walrus server (containing your S3 buckets) and the Storage Controller (SC), which contains your EBS volumes.
The Eucalyptus Enhancement Bug EUCA-2139 has some more details on the reasons to back up the CLC.
Eucalyptus 3.1 and above use PostgreSQL as the database that stores all of the critical data mentioned above. PostgreSQL is a very well established database with thousands of users worldwide, and it has well defined backup and replication procedures. PostgreSQL dumps and Continuous Archiving are the two main options we can consider.
I'll focus on dumps specifically, as they are simple to understand and restore from, and most sysadmins using PostgreSQL are familiar with them.
Taking a PostgreSQL dump is very simple with the pg_dumpall or pg_dump commands. However, we also need to back up our cloud key files and certificates, and then restore everything in the correct order.
1. Run pg_dumpall to produce SQL dumps of all databases
mkdir -p /tmp/euca-backup
pg_dumpall -c -o -h /var/lib/eucalyptus/db/data -p 8777 -U root -f /tmp/euca-backup/eucalyptus-database-`date +%Y%m%d`.sql
2. Back up the keys directory
tar -czvf /tmp/euca-backup/eucalyptus-keysdir-`date +%Y%m%d`.tgz /var/lib/eucalyptus/keys
3. Back up the /etc/eucalyptus directory
tar -czvf /tmp/euca-backup/eucalyptus-etcdir-`date +%Y%m%d`.tgz /etc/eucalyptus
4. Store the sets of files together
tar -czvf /tmp/eucalyptus-backup-`date +%Y%m%d`.tgz /tmp/euca-backup/eucalyptus-database-`date +%Y%m%d`.sql /tmp/euca-backup/eucalyptus-keysdir-`date +%Y%m%d`.tgz /tmp/euca-backup/eucalyptus-etcdir-`date +%Y%m%d`.tgz
Now we have a backup file that contains SQL dumps, keys directory and Eucalyptus settings directory in a compressed file called "/tmp/eucalyptus-backup-
(where the date command is replaced with today's date). You can safely store this file off-site with your other backups.
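If you would rather run the four backup steps as one unit, they can be sketched roughly as below. This is only a minimal sketch, not a definitive implementation: the function names and the STAMP and WORK_DIR variables are illustrative assumptions, while the paths, port and pg_dumpall flags are copied from the commands above. Computing the date stamp once means every file carries the same date even if the backup happens to run across midnight.

```shell
# Sketch of backup steps 1-4. Function names, STAMP and WORK_DIR are
# illustrative assumptions; paths, port and flags mirror the steps above.
STAMP="${STAMP:-$(date +%Y%m%d)}"          # computed once, shared by all files
WORK_DIR="${WORK_DIR:-/tmp/euca-backup}"   # staging directory from step 1

# Step 1: dump every database into one SQL file.
dump_databases() {
    pg_dumpall -c -o -h /var/lib/eucalyptus/db/data -p 8777 -U root \
        -f "$WORK_DIR/eucalyptus-database-$STAMP.sql"
}

# Steps 2 and 3: archive a directory as eucalyptus-<label>-<stamp>.tgz.
archive_dir() {
    label="$1"
    src="$2"
    tar -czf "$WORK_DIR/eucalyptus-$label-$STAMP.tgz" "$src"
}

# Step 4: bundle the SQL dump and both tarballs into one file.
bundle_backup() {
    out="$1"
    tar -czf "$out" \
        "$WORK_DIR/eucalyptus-database-$STAMP.sql" \
        "$WORK_DIR/eucalyptus-keysdir-$STAMP.tgz" \
        "$WORK_DIR/eucalyptus-etcdir-$STAMP.tgz"
}

# Run all four steps in order, stopping at the first failure.
run_backup() {
    mkdir -p "$WORK_DIR" &&
    dump_databases &&
    archive_dir keysdir /var/lib/eucalyptus/keys &&
    archive_dir etcdir /etc/eucalyptus &&
    bundle_backup "/tmp/eucalyptus-backup-$STAMP.tgz"
}
```

Nothing happens until you call run_backup on the CLC itself; the other functions can also be called one at a time if you want to check each step by hand.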
Restoring a Eucalyptus CLC involves several additional steps that aren't completely obvious at first glance.
1. Install CentOS 6
CentOS 6 installation is beyond the scope of this guide. If you don't already know how to do this, skip this whole article and head over to http://www.centos.org.
2. Follow the Eucalyptus Documentation for package installation
The Eucalyptus documentation covers everything you need to know regarding pre-configuring your system (network settings for example) and package installation.
You do not need to register any components. Just install the packages and dependencies up to the point where your next step would be to initialise the Eucalyptus database.
3. Stop the CLC service
For good measure, stop the service. On a fresh install it should already be stopped.
4. Remove any old database directory (may not exist)
You may find that you have an old database on the system. If you want to keep it, copy this directory to another location first; otherwise remove it.
rm -rf /var/lib/eucalyptus/db
5. Initialise the new database structure
The initialisation command creates the whole database structure, including the db directory and the PostgreSQL configuration files.
6. Start the Eucalyptus PostgreSQL database manually
su eucalyptus -c "/usr/pgsql-9.1/bin/pg_ctl start -w -s -D/var/lib/eucalyptus/db/data -o '-h0.0.0.0/0 -p8777 -i'"
We start the database by hand because starting it via the eucalyptus-cloud init script populates the database with some content that makes restoring our backup difficult.
7. Prepare the backup file
Copy the backup file from your off-site backup facility back to your new Eucalyptus CLC and untar it.
tar -xvf /tmp/eucalyptus-backup-XXXX.tgz -C /
Here XXXX is the date the backup was taken, e.g. eucalyptus-backup-20130308.tgz
8. Restore the SQL backup
psql -U root -d postgres -p 8777 -h /var/lib/eucalyptus/db/data -f /tmp/euca-backup/eucalyptus-database-XXXX.sql
9. Restore keys and certificates
tar -xvf /tmp/euca-backup/eucalyptus-keysdir-XXXX.tgz -C /
10. Restore /etc/eucalyptus directory
tar -xvf /tmp/euca-backup/eucalyptus-etcdir-XXXX.tgz -C /
11. Stop Eucalyptus PostgreSQL database
su eucalyptus -c "/usr/pgsql-9.1/bin/pg_ctl stop -D/var/lib/eucalyptus/db/data"
12. Start the CLC
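The file-handling part of the restore (steps 7 to 10) can be sketched in the same way. Again this is only a sketch under assumptions: the function names and the optional root argument are illustrative and not part of any Eucalyptus tool. The root argument defaults to / as in the commands above, and exists so you can rehearse a restore into a scratch directory first. The sketch relies on GNU tar having stripped the leading / from member names when the backup was created, so the staged files reappear under tmp/euca-backup below the chosen root.

```shell
# Sketch of restore steps 7-10. Function names and the optional "root"
# argument are illustrative assumptions; root defaults to / as in the steps.

# Step 7: unpack the outer backup file under the chosen root.
unpack_backup() {
    archive="$1"
    root="${2:-/}"
    tar -xzf "$archive" -C "$root"
}

# Step 8: replay the SQL dump into the manually started database.
restore_sql() {
    stamp="$1"
    root="${2:-/}"
    psql -U root -d postgres -p 8777 -h /var/lib/eucalyptus/db/data \
        -f "$root/tmp/euca-backup/eucalyptus-database-$stamp.sql"
}

# Steps 9 and 10: unpack one inner tarball (label is keysdir or etcdir).
restore_dir() {
    label="$1"
    stamp="$2"
    root="${3:-/}"
    tar -xzf "$root/tmp/euca-backup/eucalyptus-$label-$stamp.tgz" -C "$root"
}

# Run steps 7-10 in the documented order, stopping at the first failure.
run_restore() {
    archive="$1"
    stamp="$2"
    unpack_backup "$archive" &&
    restore_sql "$stamp" &&
    restore_dir keysdir "$stamp" &&
    restore_dir etcdir "$stamp"
}
```

For example, run_restore /tmp/eucalyptus-backup-20130308.tgz 20130308 would replay a backup taken on 8 March 2013. The database must already be running as described in step 6, and steps 11 and 12 still need to follow.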
I've written a rather rudimentary bash script to automate the backup (I'm working on a better Python version!), which you are welcome to download and modify under a BSD license. I use this script in cron to back up my test systems every Sunday night, but you may want to incorporate it into your nightly OS backup procedure instead.
You can get it here: https://github.com/tomellis/scripts/blob/master/euca/euca-clc-backup.sh
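To schedule a script like this for Sunday nights, a cron.d entry is one option. The helper below is a hypothetical sketch: the function name, the 23:00 run time and both paths in the usage note are assumptions, so adjust them to your own system. It writes a single line in /etc/cron.d format, where the sixth field is the user the job runs as.

```shell
# Write a cron.d entry that runs a backup script every Sunday at 23:00 as root.
# The function name and schedule are illustrative assumptions.
write_cron_entry() {
    script="$1"      # full path to the backup script
    cron_file="$2"   # cron.d file to create or overwrite
    # Fields: minute hour day-of-month month day-of-week user command
    printf '0 23 * * 0 root %s\n' "$script" > "$cron_file"
}
```

Called as write_cron_entry /usr/local/bin/euca-clc-backup.sh /etc/cron.d/euca-clc-backup (both paths hypothetical), it would install the schedule system-wide. Run it as root, and make sure the script itself is executable.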
Now you can go to sleep safe in the knowledge that your CLC is backed up ready to restore in the event of any catastrophic failure!