Back up a WordPress Amazon EC2 instance

Previously I’ve had some difficulty backing up the WordPress MySQL db on my microinstance; running mysqldump would cause 100% CPU use until I rebooted the instance and restarted apache & mysql.

Why was mysqldump evil?

At least I’ve finally discovered the reason I’d get those nasty CPU spikes:

[image: the CPU spike in question (w2db_fail)]

The culprit is, most likely, the Slimstat plugin:

[image: the Slimstat plugin (screenshot-2)]

This is a real-time web analytics plugin which has a lovely dashboard view showing you everything about who is or has viewed your site, how long they were there for, how they got there, etc. Addictive stuff.

However, in order to get these stats it saves a lot of data in a new set of mysql tables; these tables not only get really big, but are also being accessed constantly.
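
If you’re curious just how big those Slimstat tables have grown, a quick check against information_schema will show you; this is only a sketch, and it assumes the default wp_slim_ table prefix that Slimstat creates alongside a standard wp_ WordPress prefix:

# show each Slimstat table and its rough size in MB
mysql -u<username> -p<password> -e "SELECT table_name, ROUND((data_length + index_length)/1024/1024, 2) AS size_mb FROM information_schema.TABLES WHERE table_name LIKE 'wp_slim_%' ORDER BY size_mb DESC;"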

As such, the brief overview of backing up from the command line referenced in the WordPress Codex has a couple of mysqldump commands similar to:

mysqldump --add-drop-table -h <hostname> -u <username> -p <database name>

This will try to back up everything in the named database, which includes all the Slimstat tables.

Solutions

There are a couple of approaches to this. I’ve now done both.

  1. Change your backup script to select the WordPress MySQL tables only
  2. Use Google Analytics instead of Slimstat

Using Google Analytics

Head over to www.google.co.uk/analytics and set up your account, then save the file it gives you onto your web host to prove it’s you. Now you can get lovely stats just by placing a tracking script reference in the footer of your WordPress theme; be careful to reimplement this if you change themes.


Selecting the WordPress tables

Get the table names for all WordPress tables (assuming you went with the default naming convention of a "wp_" prefix; if not, adapt to suit your convention):

mysql -u<username> -p<password> --skip-column-names -e "select table_name from information_schema.TABLES where TABLE_NAME like 'wp_%';"

Pipe it into mysqldump:

mysql -u<username> -p<password> --skip-column-names -e "select table_name from information_schema.TABLES where TABLE_NAME like 'wp_%';" | xargs mysqldump --add-drop-table -u<username> -p<password> <database name>

Send that output to a file:

mysql -u<username> -p<password> --skip-column-names -e "select table_name from information_schema.TABLES where TABLE_NAME like 'wp_%';" | xargs mysqldump --add-drop-table -u<username> -p<password> <database name> > <backup file name>
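
One thing worth noting: as written, that select pulls every table named wp_% from every database on the server. If you happen to have more than one database with a wp_ prefix, a safer variant (just a tweak of the above) is to filter on the schema as well:

mysql -u<username> -p<password> --skip-column-names -e "select table_name from information_schema.TABLES where TABLE_SCHEMA = '<database name>' and TABLE_NAME like 'wp_%';" | xargs mysqldump --add-drop-table -u<username> -p<password> <database name> > <backup file name>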

In summary

Using this info I can now create a backup script that actually works; it will generate a single gzipped tar of all my website’s content files (including the WordPress code) and the database backup:

#!/bin/bash 

# Set the date format, filename and the directories where your backup files will be placed and which directory will be archived.
NOW=$(date +"%Y-%m-%d-%H%M")
FILE="mywebsitebackup.$NOW.tar"
BACKUP_DIR="/home/ec2-user/_backup"
WWW_DIR="/var/www" 

# MySQL database credentials
DB_USER="root"
DB_PASS="enteryourpwdhere"
DB_NAME="wordpress"
DB_FILE="mydbbackup.$NOW.sql" 

# dump the wordpress dbs
mysql -u$DB_USER -p$DB_PASS --skip-column-names -e "select table_name from information_schema.TABLES where TABLE_NAME like 'wp_%';" | xargs mysqldump --add-drop-table -u$DB_USER -p$DB_PASS $DB_NAME > $BACKUP_DIR/$DB_FILE

# archive the website files
tar -cvf $BACKUP_DIR/$FILE $WWW_DIR 

# append the db backup to the archive
tar --append --file=$BACKUP_DIR/$FILE $BACKUP_DIR/$DB_FILE 

# remove the db backup
rm $BACKUP_DIR/$DB_FILE 

# compress the archive
gzip -9 $BACKUP_DIR/$FILE

Save that in a script, make it executable, and if you like you can add a cron task to create a backup on a regular basis; there’s a sketch of that just after this list. A couple of things to watch out for:

  1. These files can be big, so they will use up a good chunk of your Elastic Block Store volumes, which cost a few dollars more than a teeny EC2 instance
  2. Creating the archive can be a bit processor-intensive at times, which may stress your poor little microinstance
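
For the cron task mentioned just above, something along these lines should do it. This is only a sketch: it assumes you saved the script as /home/ec2-user/backup.sh, and the times and the seven-day retention are arbitrary, so adjust to suit:

# make the script executable
chmod +x /home/ec2-user/backup.sh

# then add these via "crontab -e":
# run the backup every day at 2am
0 2 * * * /home/ec2-user/backup.sh

# optional: delete backups older than 7 days so they don't slowly fill the EBS volume
0 3 * * * find /home/ec2-user/_backup -name "*.tar.gz" -mtime +7 -delete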

Coming up

Using this backup I’ll automate the creation of a duplicate blog and script the entire process using Chef and Vagrant!

Aside #1 – EC2 WordPress Issues

Whilst I’ve been attempting to learn a new thing each month this year, I’ve been finding it really tricky to keep to the pretty loose schedule. As such, I thought I’d try to note down every time a shiny new thing takes my interest, so that I have some idea why I’m incapable of completing a series of blog posts.

EC2 Issues

I’m really having problems with EC2 these days, and I’m constantly being dragged back into finding out what the problems are.

Restarts don’t restart

I had to add entries to rc.local to start httpd and mysql on boot, and also had to change the permissions on the file so that it gets executed at startup (sudo chmod 6755 /etc/rc.d/rc.local).
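
For reference, the entries themselves are nothing fancy; a sketch of what mine look like, assuming the Amazon Linux service names httpd and mysqld (yours might be mysql or apache2 on another distro):

# appended to /etc/rc.d/rc.local
# start the web server and database on boot
/sbin/service httpd start
/sbin/service mysqld start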

Instigator
— because the blog went down a couple of times and I didn’t notice!

Backups kill the site!

WordPress 2 DropBox killed me, as does mysqldump in general.

DB backup

mysqldump --add-drop-table -u <username> -p <database> | bzip2 -czs > <db backup filename>.bz2

Website backup

tar -cjf <site backup filename>.bz2 /var/www
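
Restoring is pretty much the reverse; a rough sketch, assuming the filenames above and that the target database already exists:

# restore the database dump
bunzip2 -c <db backup filename>.bz2 | mysql -u <username> -p <database>

# restore the site files (tar strips the leading / when archiving, so extract from /)
tar -xjf <site backup filename>.bz2 -C /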

Instigator
— wordpress kept telling me to upgrade to the latest version, but also had a big MAKE SURE YOU BACKUP FIRST warning; and the WordPress 2 DropBox plugin killed my EC2 microinstance.

WordPress: Alerting on high CPU usage

#!/bin/sh
#
# Script to check CPU usage and tweet me if it's over a threshold
#
# field 10 of uptime's output is the 1-minute load average (the field position depends on uptime's output format)
if [ `uptime | awk '{ print $10 }' | cut -d. -f1` -gt 50 ];
then
    sudo python /home/ec2-user/tweepy/ec2Event.py '@rposbo LOAD: ' < /tmp/load
else
    echo lo
fi

This totally doesn’t work! Uptime seems to happily report a low load average even when the site is dying. Maybe CPU is low and mem is high... hmm... may have to change that check.
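
If memory does turn out to be the real culprit, a check based on free might be more useful than uptime. This is only a sketch: the 90% threshold is arbitrary, and piping the value into my ec2Event.py tweepy script the same way as the load check is an assumption about how that script reads its input:

#!/bin/sh
#
# Sketch: tweet me if memory usage goes over a threshold
#
# rough percentage of physical memory in use, according to free
# (older versions of free count buffers/cache as "used")
MEM_USED_PCT=$(free | awk '/^Mem:/ { printf "%d", $3 / $2 * 100 }')

if [ "$MEM_USED_PCT" -gt 90 ];
then
    echo "${MEM_USED_PCT}%" | sudo python /home/ec2-user/tweepy/ec2Event.py '@rposbo MEM: '
fi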

Instigator
— attempting to backup wordpress keeps breaking my site, dammit!

logstash, graphite, statsd

Currently trying to get those three to work together and help me find out why the EC2 microinstance is so.. uh.. micro..

Instigator
— EC2 just keeps on dying and I need to find out why