A Step-by-Step Guide to using Terraform to define an autoscaling private WebPageTest instance in code

WebPagetest Terraform

In a previous article I went through the steps needed to create your own private, autoscaling, WebPageTest setup in Amazon AWS. It wasn’t particularly complicated, but it was quite manual; I don’t like pointing and clicking in a GUI since I can’t easily put it in version control and run it again and again on demand.

Fortunately, whatever you create within AWS can be described using a language called CloudFormation which allows you to define your infrastructure as code.

Unfortunately it’s not easy to understand (in my opinion!) and I could never quite get my head around it, which annoyed me no end.

In this article I’ll show you how to use Terraform to define your private autoscaling WebPageTest setup in easily understandable infrastructure as code, enabling an effortless and reproducible web performance testing setup, which you can then fearlessly edit and improve!

Terraform

Terraform is a domain specific language (DSL) which MASSIVELY simplifies AWS’s CloudFormation language.

I’ll build up the infrastructure code section by section below, and then review the script in its entirety at the end.

Your first prerequisite is to go and download Terraform and ensure it’s in your PATH; it’s a standalone executable, which makes this super simple!
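You can quickly confirm it’s wired up from any terminal; the exact version you see will depend on when you grab it:

$ terraform version
Terraform v0.11.14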

Done that? Ok, cool – let’s move on:

Autoscaling Private WebPageTest

The three required sections for this setup are:

  1. A user with permissions to create new EC2 instances (which the WebPageTest server uses to spin up test agents) and S3 permissions to archive the tests (IAM).
  2. A place to archive test results, freeing up space on the WebPageTest server itself (S3).
  3. A VM instance on which to host your WebPageTest server which orchestrates the whole process (EC2).

Previously we created and linked these up manually, and this time we’ll get the same result using Terraform to code it all up!

Codified Autoscaling Private WebPageTest setup

A big difference with this approach is that in order to use Terraform we will need to create an AWS user for it to act as: since Terraform will be creating AWS resources on our behalf, that user needs suitably powerful permissions.

This is a one-off task: log in to your AWS console, hop over to IAM, and create a new user with programmatic access and admin privileges.

This is seriously overkill, so you might want to read up on Terraform Best Practices to grant just the level of access you actually need for your setup.

Grab the access key and secret for your Terraform user and save them in any one of the following:

  1. Environment variables
  2. AWS Profile entry in the local profile file
  3. Inline (see below)
  4. Plain text file

If you choose the AWS profile file or a plain text file, then it should use this format:

[terraform]
aws_access_key_id=BLAHBLAHBLAHBLAHBLAH
aws_secret_access_key=bLAHBLAhbLAHBLAhb
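If you go the environment variable route (option 1), the AWS provider will pick these up automatically; for example, on Linux or macOS:

export AWS_ACCESS_KEY_ID=BLAHBLAHBLAHBLAHBLAH
export AWS_SECRET_ACCESS_KEY=bLAHBLAhbLAHBLAhb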

Set up the AWS provider

Terraform needs to know which infrastructure provider you plan to use; it currently officially supports over 100 providers and has unofficial support for over 100 more via the community.

Basically if you deal with infrastructure, there’s probably a Terraform provider for it. We’re using AWS so ours will look like this, depending on where you saved your Terraform IAM details:

# Credentials supplied via environment variables (1)
# (nothing extra needed in the provider block)
provider "aws" {
  region        = "eu-west-1"
}

# OR
# Use the "terraform" profile from your ~/.aws/credentials file (2)
provider "aws" {
  region        = "eu-west-1"
  profile       = "terraform"
}

# OR
# Inline (3)
provider "aws" {
  region     = "eu-west-1"
  access_key = "<access_key>"
  secret_key = "<secret_key>"
}

# OR
# Specific credentials file (4)
# (same format as a profile file)
provider "aws" {
  region        = "eu-west-1"
  shared_credentials_file = "aws_credentials.txt"
}

Just choose one of these approaches and save it as "webpagetest.tf".

IAM

Now that the admin is out of the way, we can get back to the list of three things the WebPageTest setup needs:

  1. IAM
  2. S3
  3. EC2

We need to create a user for the webpagetest server to use in order to create and destroy the EC2 test agents, and archive test results to S3.

1) Create the IAM user

The general structure for Terraform resources is:

# resource structure
resource "resource type" "resource name" {
  property_name = "property value"
}

For creating our IAM user, it looks like this:

# IAM resource
resource "aws_iam_user" "wpt-user" {
  name = "wpt-user"
}
  • The "resource type" is "aws_iam_user"
  • The "resource name" for this one is "wpt-user"
  • The "property_name" "name" has the "property_value" of "wpt-user"

Save this as "iam-wpt.tf" in the same directory as your "webpagetest.tf" file from above (with the configuration details).

Easy, right? That’s our first resource! Want to try it out? Sure you do! Hop over to your command line, get into the same directory as your tf files, and run:

$ terraform init

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "aws" (2.12.0)...

This will initialise the directory with the AWS Terraform provider, which can take a while to download. Once that’s finished you can run:

terraform plan

This will show what Terraform will do, without actually doing it yet:

Terraform will perform the following actions:

  + aws_iam_user.wpt-user
      id:                                    <computed>
      arn:                                   <computed>
      force_destroy:                         "false"
      name:                                  "wpt-user"
      path:                                  "/"
      unique_id:                             <computed>

  + aws_iam_user_policy_attachment.ec2-policy-attach
      id:                                    <computed>
      policy_arn:                            "arn:aws:iam::aws:policy/AmazonEC2FullAccess"
      user:                                  "wpt-user"

  + aws_iam_user_policy_attachment.s3-policy-attach
      id:                                    <computed>
      policy_arn:                            "arn:aws:iam::aws:policy/AmazonS3FullAccess"
      user:                                  "wpt-user"

  + aws_instance.webpagetest
      id:                                    <computed>
      ami:                                   "ami-9978f6ee"
      arn:                                   <computed>
      associate_public_ip_address:           <computed>
    ...

If you actually want to try this out, then run:

terraform apply

You’ll be asked to approve this step by typing "yes"; if you’re feeling confident and foolhardy you can bypass this each time with an extra parameter:

terraform apply -auto-approve

The IAM user will be created, visible in your AWS console:

webpagetest IAM user created

Since we’re not done yet, we can just leave it there: Terraform keeps track of the current state of your setup, and will apply subsequent updates as incremental changes where possible, or tear down and recreate the entire thing where not.

If you don’t want to leave it lying around, you can run:

terraform destroy

This will tear down whatever was created by the .tf files in the current directory.
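As with apply you’ll be asked to confirm with a "yes", and you’ll get a summary along these lines (the resource count depends on what you’d created so far):

$ terraform destroy
...
Destroy complete! Resources: 1 destroyed.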

Give EC2 permissions

We need to find the EC2 policy for AmazonEC2FullAccess and attach it to the user; for this you can use the appropriate arn (Amazon Resource Name) inline:

# AmazonEC2FullAccess
# Attaching the policy to the user
resource "aws_iam_user_policy_attachment" "ec2-policy-attach" {
  user = "${aws_iam_user.wpt-user.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2FullAccess"
}

There are a few other ways to do this; I’ll pick this up later in the Improvements section.

Notice the use of references: instead of typing:

user = "webpagetest"

we can refer to the properties of other resources, so this code instead becomes:

user = "${aws_iam_user.wpt-user.name}"

In some cases (such as using resources we’ve already created), this dynamic referencing makes for a wonderfully flexible infrastructure-as-code setup; we’ll use this lots more later on.

Give S3 Permissions

Same as above, but this time for AmazonS3FullAccess:

# S3 policy
# attach the S3 policy to the user
resource "aws_iam_user_policy_attachment" "s3-policy-attach" {
  user = "${aws_iam_user.wpt-user.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3FullAccess"
}

Checkpoint

We’ve now defined a Terraform user with admin powers, created a WebPageTest IAM user, and granted that user full EC2 and S3 permissions.

Ideally you’d only grant powers to create the resources that you actually need – there’s a lot of valuable info over on the Terraform site around best practices

You can check this out with a terraform plan and/or terraform apply if you like, or just move on to the next section.

2) S3 Bucket for test archiving

Creating an S3 bucket is pretty simple, and since we’ve given the IAM user full S3 access then it can push tests into whatever bucket we create. There’s one snag though…

resource "aws_s3_bucket" "wpt-archive" {
  bucket = "HowDoIMakeSureThisIsUniqueIsh?!" # great question..
  acl    = "private"
}

S3 bucket names need to be globally unique, not just unique within your account; how can we code that? Luckily, Terraform has the concept of a random string generator, which will give us a pretty good chance of coming up with something unique:

# the "random_string" resource with some S3-friendly settings:
# no special chars and no uppercase
resource "random_string" "bucket" {
  length = 10
  special = false
  upper = false
}

# Now get the actual value using ".result"
resource "aws_s3_bucket" "wpt-archive" {
  bucket = "my-wpt-test-archive-${random_string.bucket.result}"
  acl    = "private"
}

Important point: random_string comes from a separate "random" provider, so you’ll need to run terraform init again to pull down that provider before you’ll be allowed to run terraform plan or terraform apply.
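The init output will look much like before, just with the extra provider; your version number will differ:

$ terraform init

Initializing provider plugins...
- Downloading plugin for provider "random" (2.1.0)...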

3) EC2 WebPageTest Instance

To create the WebPageTest server EC2 instance we need to specify the AMI to use; luckily for us, these are all listed in the WebPageTest github repo:

WebPageTest server EC2 AMI list by region

Pick the one for your chosen AWS region; in my case this will be eu-west-1, so I use the ami value of "ami-9978f6ee" for the first step:

# WebPageTest EC2 instance
resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"
}

Security

We also need to specify the security config to open up port 80, and/or 443 (for HTTPS). This is a separate resource which we then refer back to in the EC2 one:

# Security group for the WebPageTest EC2 instance
resource "aws_security_group" "wpt-sg" {
  name = "wpt-sg"

  # http  
  ingress {
    # What range of ports are we opening up?
    from_port = 80 # From port 80...
    to_port = 80 # to port 80!
    protocol = "tcp"
    description = "HTTP"
    cidr_blocks = ["0.0.0.0/0"] # who can access it? The world!
  }  
  
  # SSH
  ingress {
    from_port = 22
    to_port = 22
    protocol = "tcp"
    description = "SSH"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Outgoing traffic
  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    cidr_blocks     = ["0.0.0.0/0"]
  }
}

I’m not a security specialist, I just like connecting dots to see what happens! Please feel free to suggest improvements to the repo over on github; I appreciate any help I can get.

Now we can link that security group resource to the EC2 instance resource like so:

# WebPageTest EC2 instance
resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"

  # we can refer back to the security group using a reference
  # made up of "resource_type.resource_name" to get the
  # id property
  vpc_security_group_ids  = ["${aws_security_group.wpt-sg.id}"]
}

A couple more things and we’re done with the EC2 instance.

Key Pair

Any EC2 instance needs an associated public key, so that you can remotely connect if necessary. We could make this dynamic, using something like this:

# create a keypair
resource "aws_key_pair" "wpt-admin-key" {
  key_name   = "wpt-admin-key"
  public_key = "ssh-rsa blahblahblah... [email protected]"
}

resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"
  vpc_security_group_ids  = ["${aws_security_group.wpt-sg.id}"]
  # reference the new key pair's name
  key_name      = "${aws_key_pair.wpt-admin-key.name}"
}

Or we could just use one that already exists in the AWS account.

resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"
  vpc_security_group_ids  = ["${aws_security_group.wpt-sg.id}"]
  # paste in your existing key pair's name
  key_name      = "robins-key"
}
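If you want the dynamic version but don’t have a key pair handy, generating one locally is a one-liner; the file name and comment here are just examples:

# generate a new RSA key pair; paste the contents of
# ~/.ssh/wpt-admin-key.pub into the public_key property above
ssh-keygen -t rsa -b 4096 -f ~/.ssh/wpt-admin-key -C "wpt-admin"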

User data

The last thing needed to get the WebPageTest server EC2 instance working is to set the user data, which defines the settings for the WebPageTest server. The tricky part here seems to be passing in line-delimited data within user data.

If you remember from last time, we want to pass in the following information (and if you don’t remember from last time, why not have a quick read to get context?):

  • ec2_key: the IAM user access key ID
  • ec2_secret: the IAM user secret access key
  • api_key: a super secret api key that you provide
  • waterfall_show_user_timing: makes for pretty waterfall charts if you have user timings in your pages
  • iq: image quality – defaults to 30%, but 75% is nicer, especially if you’re using S3 for storage!
  • pngss: full resolution images for screenshots
  • archive_s3_key: the IAM user key
  • archive_s3_secret: the IAM user secret
  • archive_s3_bucket: the WPT archive bucket
  • archive_days: number of days before tests are pushed to S3
  • cron_archive: run archive script hourly automatically as agents poll for work

First up, let’s get a reference to the wpt-user‘s IAM credentials, so they can be passed in as user data:

# IAM Access Key for WebPageTest user
resource "aws_iam_access_key" "wpt-user" {
  user = "${aws_iam_user.wpt-user.name}"
}

The WebPageTest IAM user’s access key details can now be accessed via this new aws_iam_access_key resource:

key = "${aws_iam_access_key.wpt-user.id}"
secret = "${aws_iam_access_key.wpt-user.secret}"

We could just create a really long string with "\n" separating each setting, or we could reference an external file. I’ll show an inline string here with references to both the aws_iam_access_key and our previously created aws_s3_bucket resource called "wpt-archive", and will show the external file version in the Improvements section towards the end.

Unfortunately you can’t have multiline variable values, so this inline version just becomes one very very long line of text. Not easy to maintain or debug!

# specifying the user data inline
user_data     = "ec2_key=${aws_iam_access_key.wpt-user.id} \n ec2_secret=${aws_iam_access_key.wpt-user.secret} \n api_key=<my crazy long api key> \n waterfall_show_user_timing=1 \n iq=75 \n pngss=1 \n archive_s3_server=s3.amazonaws.com \n archive_s3_key=${aws_iam_access_key.wpt-user.id} \n archive_s3_secret=${aws_iam_access_key.wpt-user.secret} \n archive_s3_bucket=${aws_s3_bucket.wpt-archive.bucket} \n archive_days=1 \n cron_archive=1"

There are plenty of potential improvements here: pulling the user data into a template, which allows us to create it dynamically from a template file is a favourite of mine; I’ll demonstrate this later in the Improvements section.

Try it out

The resulting script is below – be sure to replace the placeholders with your own values:

# Setting up the AWS Terraform provider
provider "aws" {
  region     = "eu-west-1"
  
  # FILL IN THESE PLACEHOLDERS (or use another method):
  access_key = "<access_key>"
  secret_key = "<secret_key>"
}

# IAM config
resource "aws_iam_user" "wpt-user" {
  name = "wpt-user"
}

resource "aws_iam_user_policy_attachment" "ec2-policy-attach" {
  user = "${aws_iam_user.wpt-user.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2FullAccess"
}

resource "aws_iam_user_policy_attachment" "s3-policy-attach" {
  user = "${aws_iam_user.wpt-user.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3FullAccess"
}

resource "aws_iam_access_key" "wpt-user" {
  user = "${aws_iam_user.wpt-user.name}"
}

# S3 Config
resource "random_string" "bucket" {
  length = 10
  special = false
  upper = false
}

resource "aws_s3_bucket" "wpt-archive" {
  bucket = "my-wpt-test-archive-${random_string.bucket.result}"
  acl    = "private"
}

# Main EC2 config
resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"
  vpc_security_group_ids  = ["${aws_security_group.wpt-sg.id}"]
  
  # FILL IN THIS PLACEHOLDER:
  key_name      = "<key pair name>"
  
  # FILL IN THE API KEY PLACEHOLDER:
  user_data     = "ec2_key=${aws_iam_access_key.wpt-user.id} \n ec2_secret=${aws_iam_access_key.wpt-user.secret} \n api_key=<my crazy long api key> \n waterfall_show_user_timing=1 \n iq=75 \n pngss=1 \n archive_s3_server=s3.amazonaws.com \n archive_s3_key=${aws_iam_access_key.wpt-user.id} \n archive_s3_secret=${aws_iam_access_key.wpt-user.secret} \n archive_s3_bucket=${aws_s3_bucket.wpt-archive.bucket} \n archive_days=1 \n cron_archive=1"
}

# Security group for the WebPageTest server
resource "aws_security_group" "wpt-sg" {
  name = "wpt-sg"

  # http  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    description = "HTTP"
    cidr_blocks = ["0.0.0.0/0"]
  }  
  
  # SSH
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    description = "SSH"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    cidr_blocks     = ["0.0.0.0/0"]
  }
}

# This is to make it easier to get the resulting URL
# for your WebPageTest instance
output "webpagetest" {
  value = "${aws_instance.webpagetest.public_dns}"
}

The final section is really handy; the output type will print out the result of the value property – in this case the public URL for the WebPageTest server itself.

We could also create an output for any of the following (a quick sketch follows the list):

  • Randomised S3 bucket name,
  • WebPageTest IAM user’s key and secret,
  • etc
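Here’s what those extra outputs might look like; note the sensitive flag, which stops Terraform echoing the secret to the console:

output "wpt_archive_bucket" {
  value = "${aws_s3_bucket.wpt-archive.bucket}"
}

output "wpt_user_key" {
  value = "${aws_iam_access_key.wpt-user.id}"
}

output "wpt_user_secret" {
  value     = "${aws_iam_access_key.wpt-user.secret}"
  sensitive = true
}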

Let’s run terraform apply and watch what happens:

Apply complete! Resources: 9 added, 0 changed, 0 destroyed.

Outputs:

webpagetest = ec2-34-245-85-104.eu-west-1.compute.amazonaws.com

Go ahead and visit that URL to find the familiar WebPageTest homepage. Enter a URL to test and submit it; you’ll see an EC2 test agent start to spin up in the appropriate AWS region.

Complete

Hopefully you’ve seen how easy it can be to get started with Terraform, creating your very own autoscaling private WebPageTest instance in AWS from one Terraform file.

You can stop here, have a refreshing beverage of choice and make smug finger-guns at people as they pass by.

Or you could check out the next section which takes your current knowledge and builds on it – go on, you’ll love it, honest!

Improvements

The script above will totally get you a fully autoscaling WebPageTest private instance in AWS and is pretty flexible via the user_data options that can configure WebPageTest in some detail.

Here are a few opportunities to improve on this.

Improvement 1: Variables

There are a few places in the Terraform script that use hard coded values; by introducing variables we can make the script more flexible. For instance, right at the very top the "region" is set to "eu-west-1", so let’s pull that into a variable:

# Define a variable for the region
variable "region" {
  default = "eu-west-1"
}

We can now refer to this anywhere that we would have hard coded the region, for example:

# Setting up the AWS Terraform provider
provider "aws" {
  region = "${var.region}"
}

Let’s define another one that abstracts the EC2 ami for the WebPageTest server; this will be a "map" type instead of the default "string" type:

# WebPageTest EC2 AMIs
variable "wpt_ami" {
    type    = "map"
    default = {
        us-east-1 = "ami-fcfd6194"
        us-west-1= "ami-e44853a1"
        us-west-2= "ami-d7bde6e7"
        sa-east-1= "ami-0fce7112"
        eu-west-1= "ami-9978f6ee"
        eu-central-1= "ami-22cefd3f"
        ap-southeast-1 = "ami-88bd97da"
        ap-southeast-2 = "ami-eb3542d1"
        ap-northeast-1 = "ami-66233967"
    }
}

Since there’s a different one for each region, we can combine these variables using a lookup:

resource "aws_instance" "webpagetest" {
  ami = "${lookup(var.wpt_ami, var.region)}"
  ...
}

Cool, huh? Although the default region is set to "eu-west-1" in this example, it can be overridden when calling from the command line:

terraform apply -var "region=ap-southeast-1"

This will set the "region" variable to "ap-southeast-1", affecting the provider resource and also choosing the matching "wpt_ami" value. It would result in the equivalent of:

provider "aws" {
  region = "ap-southeast-1"
}

...

resource "aws_instance" "webpagetest" {
  ami  = "ami-88bd97da"
  ...
}

Handy! We’ve now extended the original script to support all AWS regions that WebPageTest AMIs exist for.

The github repo includes this addition.

Improvement 2: Template for User Data

Having that user data as a looooonnnggg inline string is quite ugly and unmaintainable. We can improve this by using the template_file data type.

This comes from another new provider (template), so you’ll need to run terraform init again before it can be used.

By abstracting out our user data into a separate data source, we can update the user data string in the EC2 definition to:

user_data     = "${data.template_file.ec2_wpt_userdata.rendered}"

There are a few methods – all covered below – to implement this and they all use the template_file data type. This allows us to use a template input with placeholders, and define the values for those placeholders in a vars object, which are merged together later:

#  a) Inline template string
# Separates it out, but still a messy single line
data "template_file" "ec2_wpt_userdata" {
    template = "ec2_key=$${key} \n ec2_secret=$${secret} \n api_key=$${api_key} \n waterfall_show_user_timing=1 \n iq=75 \n pngss=1 \n archive_s3_server=s3.amazonaws.com \n archive_s3_key=$${key} \n archive_s3_secret=$${secret} \n archive_s3_bucket=$${wpt_s3_archive} \n archive_days=1 \n cron_archive=1"

    vars = {
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        api_key = "123412341234123412341234"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

# b) Inline heredoc syntax - much more readable!
# Now we can have new lines for improved readability
# Note the double $$
data "template_file" "ec2_wpt_userdata" {
    template = <<-EOT
      ec2_key=$${key}
      ec2_secret=$${secret}
      api_key=$${api_key}
      waterfall_show_user_timing=1
      iq=75
      pngss=1
      archive_s3_server=s3.amazonaws.com
      archive_s3_key=$${key}
      archive_s3_secret=$${secret}
      archive_s3_bucket=$${wpt_s3_archive}
      archive_days=1
      cron_archive=1
    EOT

    vars = {
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        api_key = "123412341234123412341234"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

# c) External TPL file
# Keeps it nice and tidy!
data "template_file" "ec2_wpt_userdata" {
    template = "${file("user_data.tpl")}"

    vars = {
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        api_key = "123412341234123412341234"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

All of the above options can be referenced using .rendered on the template_file:

# refer to this as a "rendered" value
resource "aws_instance" "webpagetest" {
  user_data     = "${data.template_file.ec2_wpt_userdata.rendered}"
  ...
}

The external template file would look like the below – note the single $ this time:

ec2_key=${key}
ec2_secret=${secret}
api_key=${api_key}
waterfall_show_user_timing=1
iq=75
pngss=1
archive_s3_server=s3.amazonaws.com
archive_s3_key=${key}
archive_s3_secret=${secret}
archive_s3_bucket=${wpt_s3_archive}
archive_days=1
cron_archive=1

The github repo includes the heredoc template syntax version.

Improvement 3: Dynamic API key

Up until now we’ve used a static API key value:

# e.g.
api_key=<my crazy long api key>

# or
api_key = "123412341234123412341234"

Of course, Terraform has a solution to this; first up, the random_string as we used for the S3 bucket name:

# API key as a random 40 char string
resource "random_string" "api-key" {
  length = 40
  special = false
}

data "template_file" "ec2_wpt_userdata" {
    template = "${file("user_data.tpl")}"

    vars = {
        # reference the api key to get the resulting random string
        api_key = "${random_string.api-key.result}"
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

All seems good, but we can actually improve on this further. What use is an API key if you don’t know what it is? You can’t easily get the value back out of the rendered user data without rendering the whole string, and doing so will regenerate the random value! It’s like a quantum variable!

One trick to getting the random value out is Terraform’s locals; a local value assigns a name to an expression, allowing it to be used multiple times within a module without repeating it. It also means that the value is calculated once and can be referenced many times.

# API key as a random 40 char string
resource "random_string" "api-key" {
  length = 40
  special = false
}

# define a local "api_key" variable
locals {
  api_key = "${random_string.api-key.result}"
}

data "template_file" "ec2_wpt_userdata" {
    template = "${file("user_data.tpl")}"

    vars = {
        # reference the new local
        api_key = "${local.api_key}"
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

# BONUS! Return the API key without
# regenerating the random value
output "api_key" {
  value = "${local.api_key}"
}
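A nice bonus of that output: once applied, you can read the key back at any time without re-rendering anything:

$ terraform output api_key
<your 40 character api key>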

Putting it all together

The full script is over on github, and once you fill in the AWS credentials for your Terraform user and the key pair name, then after running a terraform init and terraform apply you’ll be greeted with something like this:

Outputs:

api_key = t2glfd2MlixzkQpr1e0v37xmGkQOBUVWU1pKeQKd
webpagetest = ec2-34-246-124-170.eu-west-1.compute.amazonaws.com

The user data is generated as expected; you can see the api key is the same in user data as the output from above:

Generated User Data

You’ll see the familiar WebPageTest homepage if you pop over to the URL in the output from above:

Generated WPT instance

Different Regions

Let’s try this same script, but in a different region!

Be aware that once you’ve executed one test, then the S3 bucket will not be deleted when you call destroy as it’s not empty. Usually this isn’t a problem, since any subsequent terraform apply checks the local "terraform.tfstate" file, knows this S3 bucket still exists, and won’t create a new one. If you change region then this apply will fail, since the S3 bucket exists in "terraform.tfstate", but doesn’t exist in the new region that you’re now referencing. You can just delete your "terraform.tfstate" file if you want to start from scratch and it’ll work.
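So a pragmatic (if heavy-handed) region switch looks something like this, assuming you’re happy to lose the recorded state:

# tear down the old region first (empty the S3 bucket if destroy complains)
terraform destroy
# then start the new region from a clean slate
rm terraform.tfstate terraform.tfstate.backup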

ALSO be aware that your key pair doesn’t exist in the new region, so you’ll need to create a new one there first, or use Terraform’s inline key pair creation to automate it!

terraform apply -var 'region=ap-southeast-1'

After ticking along for a while (assuming you’ve tidied up any S3 and missing key pair), you’ll see something like this:

Outputs:

api_key = 5titWU7aE3R6gkbf851v3tPjwCsosNVZnmreOSuq
webpagetest = ec2-54-169-221-212.ap-southeast-1.compute.amazonaws.com

Oooh – ap-southeast-1! Cool huh?

Given that the WebPageTest server can already spin up test instances in many AWS regions, you can choose to deploy the server into whichever region you need.

EXTRA POST CREDITS NICK FURY BONUS IMPROVEMENT

Here’s one last improvement to thank you for reading to the end! Notice the EC2 instances appearing in the AWS console:

EC2 webpagetest instances - no names!

No name! That’s not pretty.

If we add tags, specifically one called "Name", then we’ll get something more useful in that listing.

resource "aws_instance" "webpagetest" {
  ...
  # Add your tags in here
  tags {
   Name      =   "webpagetest"
   Project   =   "webperf"
  }
}

Now check it out:

EC2 webpagetest instances - named!

Cool!

Summary

Phew, we’ve covered a whole lot here! With a bit of Terraform we’ve managed to set up WebPageTest’s IAM, S3, EC2, Security Groups, and make it region agnostic and autoscaling, with a dynamically generated API key.

The resulting script is in this github repo. Have a go and let me know how you get on!

Chef for Developers: part 4 – WordPress, Backups, & Restoring

I’m continuing with my plan to create a series of articles for learning Chef from a developer perspective.

Part #1 gave an intro to Chef, Chef Solo, Vagrant, and Virtualbox. I also created my first Ubuntu VM running Apache and serving up the default website.

Part #2 got into creating a cookbook of my own, and evolved it whilst introducing PHP into the mix.

Part #3 wired in MySql and refactored things a bit.

WordPress Restore – Attempt #1: Hack It Together

Now that we’ve got a generic LAMP VM it’s time to evolve it a bit. In this post I’ll cover adding WordPress to your VM via Chef, scripting a backup of your current WordPress site, and finally creating a carbon copy of that backup on your new WordPress VM.

I’m still focussing on using Chef Solo with Vagrant and VirtualBox for the time being; I’m learning to walk before running!

Kicking off

Create a new directory for working in and create a cookbooks subdirectory; you don’t need to prep the directory with a vagrant init as I’ll add in a couple of clever lines at the top of my new Vagrantfile to initialise it straight from a vagrant up.

Installing WordPress

As in the previous articles, just pull down the WordPress cookbook from the opscode repo into your cookbooks directory:

cd cookbooks
git clone https://github.com/opscode-cookbooks/wordpress.git

Looking at the top of the WordPress default.rb file you can see which other cookbooks it depends on:

include_recipe "apache2"
include_recipe "mysql::server"
include_recipe "mysql::ruby"
include_recipe "php"
include_recipe "php::module_mysql"
include_recipe "apache2::mod_php5"

From the last post we know that MySQL also depends on OpenSSL, and mysql::ruby depends on build-essential. Go get those both in your cookbooks directory, as well as the others mentioned above:

git clone https://github.com/opscode-cookbooks/apache2.git
git clone https://github.com/opscode-cookbooks/mysql.git
git clone https://github.com/opscode-cookbooks/openssl.git
git clone https://github.com/opscode-cookbooks/build-essential.git
git clone https://github.com/opscode-cookbooks/php.git

Replace the default Vagrantfile with the one below to reference the wordpress cookbook, and configure the database, username, and password for wordpress to use; I’m basing this one on the Vagrantfile from my last post but have removed everything to do with the “mysite” cookbook:

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.box = "precise32"
  config.vm.box_url = "http://files.vagrantup.com/precise32.box"
  config.vm.network :forwarded_port, guest: 80, host: 8080

  config.vm.provision :shell, :inline => "apt-get clean; apt-get update" 

  config.vm.provision :chef_solo do |chef|

    chef.json = {
      "mysql" => {
      "server_root_password" => "myrootpwd",
      "server_repl_password" => "myrootpwd",
      "server_debian_password" => "myrootpwd"
      },
      "wordpress" => {
        "db" => {
          "database" => "wordpress",
          "user" => "wordpress",
          "password" => "mywppassword"
        }
      }
    }

    chef.cookbooks_path = ["cookbooks"]
    chef.add_recipe "wordpress"
  end
end

The lines

  config.vm.box = "precise32"
  config.vm.box_url = "http://files.vagrantup.com/precise32.box"

mean you can skip the vagrant init stage as we’re defining the same information here instead.

You don’t need to reference the dependent recipes directly since the WordPress one references them already.

You also don’t need to disable the default site since the wordpress recipe does this anyway. As such, remove this from the json area:

      "apache" => {
        "default_site_enabled" => false
      },

Note: An issue I’ve found with the current release of the WordPress cookbook

I had to comment out the last line of execution which just displays a message to you saying

Navigate to 'http://#{server_fqdn}/wp-admin/install.php' to complete wordpress installation.

For some reason the method “message” on “log” appears to be invalid. You don’t need it though, so if you get the same problem you can just comment it out yourself for now.

To do this, head to line 116 in cookbooks/wordpress/recipes/default.rb and add a # at the start, e.g.:

log "wordpress_install_message" do
  action :nothing
  # message "Navigate to 'http://#{server_fqdn}/wp-admin/install.php' to complete wordpress installation"
end

Give that a

vagrant up

Then browse to localhost:8080/wp-admin/install.php and you should see:

wordpress inital screen 8080

From here you could quite happily set up your wordpress site on a local VM, but I’m going to move on to the next phase in my cunning plan.

Restore a WordPress Backup

I’ve previously blogged about backing up a WordPress blog, the output of which was a gzipped tar of the entire WordPress directory and the WordPress database tables. I’m now going to restore it to this VM so that I have a functioning copy of my backed up blog.

I’d suggest you head over and read the backup post I link to above, or you can just use the resulting script:

backup_blog.sh

#!/bin/bash

# Set the date format, filename and the directories where your backup files will be placed and which directory will be archived.
NOW=$(date +"%Y-%m-%d-%H%M")
FILE="rposbowordpressrestoredemo.$NOW.tar"
BACKUP_DIR="/home/<user>/_backup"
WWW_DIR="/var/www"

# MySQL database credentials
DB_USER="root"
DB_PASS="myrootpwd"
DB_NAME="wordpress"
DB_FILE="rposbowordpressrestoredemo.$NOW.sql"

# dump the wordpress dbs
mysql -u$DB_USER -p$DB_PASS --skip-column-names -e "select table_name from information_schema.TABLES where TABLE_NAME like 'wp_%';" | xargs mysqldump --add-drop-table -u$DB_USER -p$DB_PASS $DB_NAME > $BACKUP_DIR/$DB_FILE

# archive the website files
tar -cvf $BACKUP_DIR/$FILE $WWW_DIR

# append the db backup to the archive
tar --append --file=$BACKUP_DIR/$FILE $BACKUP_DIR/$DB_FILE

# remove the db backup
rm $BACKUP_DIR/$DB_FILE

# compress the archive
gzip -9 $BACKUP_DIR/$FILE

That results in a gzipped tarball of the entire wordpress directory and the wordpress database dumped to a sql file, all saved in the directory specified at the top – BACKUP_DIR=”/home/<user>/_backup”

First Restore Attempt – HACK-O-RAMA!

For the initial attempt I’m just going to brute-force it, to validate the actual importing and restoring of the backup. The steps are:

  1. copy an archive of the backup over to the VM (or in my case I’ll just set up a shared directory)
  2. uncompress the archive into a temp dir
  3. copy the wordpress files into a website directory
  4. import the mysql dump
  5. update some site specific items in mysql to enable local browsing

You can skip that last one if you want to just add some HOSTS entries to direct calls to the actual wordpress backed up site over to your VM.
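If you go the HOSTS route, an entry like the one below will do (the hostname is whatever your backed up blog uses; this one’s made up). Bear in mind you’ll still need the forwarded port when browsing, e.g. http://www.myblog.com:8080:

# /etc/hosts (or C:\Windows\System32\drivers\etc\hosts on Windows)
127.0.0.1    www.myblog.com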

Prerequisite

Create a backup of a wordpress site using the script above (or similar) and download the archive to your host machine.

I’ve actually done this using another little vagrant box with a base wordpress install for you to create a quick blog to play around with backing up and restoring – repo is over on github.

For restoring

Since this is the HACK-O-RAMA version, just create a bash script in that same directory called restore_backup.sh into which you’ll be pasting the chunks of code from below to execute the restore.

We can then call this script from the Vagrantfile directly. Haaacckkyyyy…

Exposing the archive to the VM

I’m saving the wordpress archive in a directory called “blog_backup” which is a subdirectory of the project dir on the host machine; I’ll share that directory with the VM using this line somewhere in the Vagrantfile:

config.vm.synced_folder "blog_backup/", "/var/blog_backup/"

If you’re using Vagrant v1 then the syntax would be:

config.vm.share_folder "blog", "/var/blog_backup/", "blog_backup/"

Uncompress the archive into the VM

This can be done using the commands below, pasted into that restore_backup.sh:

# pull in the backup to a temp dir
mkdir /tmp/restore

# untar and expand it
cd /tmp/restore
tar -zxvf /var/blog_backup/<yoursite>.*.tar.gz

Copy the wordpress files over

# copy the website files to the wordpress site root
sudo cp -Rf /tmp/restore/var/www/wordpress/* /var/www/wordpress/

Import the MySQL dump

# import the db
mysql -uroot -p<dbpassword> wordpress < /tmp/restore/home/<user>/_backup/<yoursite>.*.sql

Update some site-specific settings to enable browsing

Running these db updates will allow you to browse both the wordpress blog locally and also the admin pages:

# set the default site to localhost for testage
mysql -uroot -p<dbpassword> wordpress -e "UPDATE wp_options SET option_value='http://localhost:8080' WHERE wp_options.option_name='siteurl'"
mysql -uroot -p<dbpassword> wordpress -e "UPDATE wp_options SET option_value='http://localhost:8080' WHERE wp_options.option_name='home'"

Note: Pretty Permalinks

If you’re using pretty permalinks – i.e., robinosborne.co.uk/2013/07/02/chef-for-developers/ instead of http://robinosborne.co.uk/?p=1418 – then you’ll need to both install the apache2::mod_rewrite recipe and configure your .htaccess to allow mod_rewrite to do its thing. Create the .htaccess below to enable rewrites and save it in the same dir as your restore script.

.htaccess

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

restore_backup.sh

# copy over the .htaccess to support mod_rewrite for pretty permalinks
sudo cp /var/blog_backup/.htaccess /var/www/wordpress/
sudo chmod 644 /var/www/wordpress/.htaccess

Also add this to your Vagrantfile:

chef.add_recipe "apache2::mod_rewrite"

The final setup and scripts

Bringing this all together we now have a backed up wordpress blog, restored and running as a local VM:

wordpress restore 1

The files needed to achieve this feat are:

Backup script

To be saved on your blog host, executed on demand, and the resulting archive file manually downloaded (probably SCPed). I have mine saved in a shared directory – /var/vagrant/blog_backup.sh:

blog_backup.sh

#!/bin/bash

# Set the date format, filename and the directories where your backup files will be placed and which directory will be archived.
NOW=$(date +"%Y-%m-%d-%H%M")
FILE="rposbowordpressrestoredemo.$NOW.tar"
BACKUP_DIR="/home/vagrant"
WWW_DIR="/var/www"

# MySQL database credentials
DB_USER="root"
DB_PASS="myrootpwd"
DB_NAME="wordpress"
DB_FILE="rposbowordpressrestoredemo.$NOW.sql"

# dump the wordpress dbs
mysql -u$DB_USER -p$DB_PASS --skip-column-names -e "select table_name from information_schema.TABLES where TABLE_NAME like 'wp_%';" | xargs mysqldump --add-drop-table -u$DB_USER -p$DB_PASS $DB_NAME > $BACKUP_DIR/$DB_FILE

# archive the website files
tar -cvf $BACKUP_DIR/$FILE $WWW_DIR

# append the db backup to the archive
tar --append --file=$BACKUP_DIR/$FILE $BACKUP_DIR/$DB_FILE

# remove the db backup
rm $BACKUP_DIR/$DB_FILE

# compress the archive
gzip -9 $BACKUP_DIR/$FILE
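Getting that archive from the blog host down into the blog_backup directory on your host machine is a one-liner; the host name here is illustrative:

# copy the latest backup archive into the shared folder
scp user@myblog.example.com:/home/vagrant/rposbowordpressrestoredemo.*.tar.gz ./blog_backup/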

Restore script

To be saved in a directory on the host to be shared with the VM, along with your blog archive.

restore_backup.sh

# pull in the backup, untar and expand it, copy the website files, import the db
mkdir /tmp/restore
cd /tmp/restore
tar -zxvf /var/blog_backup/rposbowordpressrestoredemo.*.tar.gz
sudo cp -Rf /tmp/restore/var/www/wordpress/* /var/www/wordpress/
mysql -uroot -pmyrootpwd wordpress < /tmp/restore/home/vagrant/_backup/rposbowordpressrestoredemo.*.sql

# create the .htaccess to support mod_rewrite for pretty permalinks
sudo cp /var/blog_backup/.htaccess /var/www/wordpress/
sudo chmod 644 /var/www/wordpress/.htaccess

# set the default site to localhost for testage
mysql -uroot -pmyrootpwd wordpress -e "UPDATE wp_options SET option_value='http://localhost:8080' WHERE wp_options.option_name='siteurl'"
mysql -uroot -pmyrootpwd wordpress -e "UPDATE wp_options SET option_value='http://localhost:8080' WHERE wp_options.option_name='home'"

.htaccess

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.box = "precise32"
  config.vm.box_url = "http://files.vagrantup.com/precise32.box"
  config.vm.network :forwarded_port, guest: 80, host: 8080
  config.vm.synced_folder "blog_backup/", "/var/blog_backup/"

  config.vm.provision :shell, :inline => "apt-get clean; apt-get update" 

  config.vm.provision :chef_solo do |chef|

    chef.json = {
      "mysql" => {
      "server_root_password" => "myrootpwd",
      "server_repl_password" => "myrootpwd",
      "server_debian_password" => "myrootpwd"
      },
      "wordpress" => {
        "db" => {
          "database" => "wordpress",
          "user" => "wordpress",
          "password" => "mywppassword"
        }
      }
    }

    chef.cookbooks_path = ["cookbooks"]
    chef.add_recipe "wordpress"
    chef.add_recipe "apache2::mod_rewrite"
  end

  # hacky first attempt at restoring the blog from a script on a share
  config.vm.provision :shell, :path => "blog_backup/restore_backup.sh"
end

myrootpwd

The password used to set up the MySQL instance; it needs to be consistent in your Vagrantfile and your restore_backup.sh script.

mywppassword

If you can’t remember your current WordPress user’s password, look in the wp-config.php file in the backed up archive.

Go get it

I’ve created a fully working setup for your perusal over on github. This repo, combined with the base wordpress install one will give you a couple of fully functional VMs to play with.

If you pull down the restore repo you’ll just need to run setup_cookbooks.sh to pull down the prerequisite cookbooks, then edit the wordpress default recipe to comment out that damned message line.

Once that’s all done, just run

vagrant up

and watch everything tick over until you get your prompt back. At this point you can open a browser and hit http://localhost:8080/ to see:

restored blog from github

Next up

I’ll be trying to move all of this hacky cleverness into a Chef recipe or two. Stay tuned.

Chef For Developers part 3

I’m continuing with my plan to create a series of articles for learning Chef from a developer perspective.

Part #1 gave an intro to Chef, Chef Solo, Vagrant, and Virtualbox. I also created my first Ubuntu VM running Apache and serving up the default website.

Part #2 got into creating a cookbook of my own, and evolved it whilst introducing PHP into the mix.

In this article I’ll get MySQL installed and integrated with PHP, and tidy up my own recipe.

Adding a database into the mix

1. Getting MySQL

Download the MySQL cookbook from the Opscode github repo into your “cookbooks” subdirectory:

mysql

git clone https://github.com/opscode-cookbooks/mysql.git

Since this will be a server install instead of a client one you’ll also need to get OpenSSL:

openssl

git clone https://github.com/opscode-cookbooks/openssl.git

Now use Chef Solo to configure it by including the recipe reference and the MySQL password in the Vagrantfile I’ve been using in the previous articles:

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.box = "precise32"
  config.vm.box_url = "http://files.vagrantup.com/precise32.box"
  config.vm.network :forwarded_port, guest: 80, host: 8080

  config.vm.provision :shell, :inline => "apt-get clean; apt-get update" 

  config.vm.provision :chef_solo do |chef|

    chef.json = {
      "apache" => {
        "default_site_enabled" => false
      },
      "mysql" => {
          "server_root_password" => "blahblah",
          "server_repl_password" => "blahblah",
          "server_debian_password" => "blahblah"
      },
      "mysite" => {
        "name" => "My AWESOME site",
        "web_root" => "/var/www/mysite"
      }
    }

    chef.cookbooks_path = ["cookbooks","site-cookbooks"]
    chef.add_recipe "mysql::server"
    chef.add_recipe "mysite"
  end
end

No need to explicitly reference OpenSSL; it’s in the “cookbooks” directory, and since the mysql::server recipe references it, it just gets pulled in.

If you run that now you’ll be able to ssh in and fool around with mysql using the user root and password as specified in the chef.json block.

vagrant ssh

and then

mysql -u root -p

and enter your password (“blahblah” in my case) to get into your mysql instance.

MySQL not doing very much

Now let’s make it do something. Using the mysql::ruby recipe it’s possible to orchestrate a lot of mysql functionality; this also relies on the build-essential cookbook, so download that into your “cookbooks” directory:

Build essential

git clone https://github.com/opscode-cookbooks/build-essential.git

To get some useful database abstraction methods we need the database cookbook:

Database

git clone https://github.com/opscode-cookbooks/database.git

The database cookbook gives a nice way of monkeying around with an RDBMS, making it possible to do funky things like:

mysql_connection = {:host => "localhost", :username => 'root',
                    :password => node['mysql']['server_root_password']}

mysql_database "#{node.mysite.database}" do
  connection mysql_connection
  action :create
end

to create a database.

Add the following to the top of the mysite/recipes/default.rb file:

include_recipe "mysql::ruby"

mysql_connection = {:host => "localhost", :username => 'root',
                    :password => node['mysql']['server_root_password']}

mysql_database node['mysite']['database'] do
  connection mysql_connection
  action :create
end

mysql_database_user "root" do
  connection mysql_connection
  password node['mysql']['server_root_password']
  database_name node['mysite']['database']
  host 'localhost'
  privileges [:select,:update,:insert, :delete]
  action [:create, :grant]
end

mysql_conn_args = "--user=root --password=#{node['mysql']['server_root_password']}"

execute 'insert-dummy-data' do
  command %Q{mysql #{mysql_conn_args} #{node['mysite']['database']} <<EOF
    CREATE TABLE transformers (name VARCHAR(32) PRIMARY KEY, type VARCHAR(32));
    INSERT INTO transformers (name, type) VALUES ('Hardhead','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Chromedome','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Brainstorm','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Highbrow','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Cerebros','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Fortress Maximus','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Chase','Throttlebot');
    INSERT INTO transformers (name, type) VALUES ('Freeway','Throttlebot');
    INSERT INTO transformers (name, type) VALUES ('Rollbar','Throttlebot');
    INSERT INTO transformers (name, type) VALUES ('Searchlight','Throttlebot');
    INSERT INTO transformers (name, type) VALUES ('Wideload','Throttlebot');
EOF}
  not_if "echo 'SELECT count(name) FROM transformers' | mysql #{mysql_conn_args} --skip-column-names #{node['mysite']['database']} | grep '^11$'"
end

and add in the new database variable in Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "precise32"
  config.vm.box_url = "http://files.vagrantup.com/precise32.box"
  config.vm.network :forwarded_port, guest: 80, host: 8080

  config.vm.provision :shell, :inline => "apt-get clean; apt-get update" 

  config.vm.provision :chef_solo do |chef|

    chef.json = {
      "apache" => {
        "default_site_enabled" => false
      },
      "mysql" => {
      "server_root_password" => "blahblah",
      "server_repl_password" => "blahblah",
      "server_debian_password" => "blahblah"
      },
      "mysite" => {
        "name" => "My AWESOME site",
        "web_root" => "/var/www/mysite",
        "database" => "great_cartoons"
      }
    }

    chef.cookbooks_path = ["cookbooks","site-cookbooks"]
    chef.add_recipe "mysql::server"
    chef.add_recipe "mysite"
  end
end

Now we need a page to display that data, but we need to pass in the mysql password as a parameter. That means we need to use a template; create the file templates/default/robotsindisguise.php.erb with this content:

<?php
    $con = mysqli_connect("localhost","root", "<%= @pwd %>");
    if (mysqli_connect_errno())
    {
        die('Could not connect: ' . mysqli_connect_error());
    }

    $sql = "SELECT * FROM great_cartoons.transformers";
    $result = mysqli_query($con, $sql);

?>
    <table>
    <tr>
    <th>Transformer Name</th>
    <th>Type</th>
    </tr>
    <?php
    while($row = mysqli_fetch_array($result, MYSQLI_ASSOC))
    {
    ?>
        <tr>
                <td><?php echo $row['name']?></td>
                <td><?php echo $row['type']?></td>
        </tr>
    <?php
    }//end while
    ?>
    </table>
<?php
mysqli_free_result($result);
mysqli_close($con);
?>

That line at the top might look odd:

$con = mysqli_connect("localhost","root", "<%= @pwd %>");

But bear in mind that it’s an ERB (Embedded Ruby) file, so it gets processed by the Ruby parser to generate the resulting file; the PHP processor only kicks in once the file is requested from a browser.

As such, if you kick off a vagrant up now and (eventually) vagrant ssh in, open /var/www/mysite/robotsindisguise.php in nano/vi and you’ll see that the line

$con = mysqli_connect("localhost","root", "<%= @pwd %>");

has become

$con = mysqli_connect("localhost","root", "blahblah");

Browsing to http://localhost:8080/robotsindisguise.php should give something like this:

Autobots: COMBINE!

2. Tidy it up a bit

Right now we’ve got data access stuff in the default.rb recipe, so let’s move that lot out; I’ve created the file /recipes/data.rb with these contents:

data.rb

include_recipe "mysql::ruby"

mysql_connection = {:host => "localhost", :username => 'root',
                    :password => node['mysql']['server_root_password']}

mysql_database node['mysite']['database'] do
  connection mysql_connection
  action :create
end

mysql_database_user "root" do
  connection mysql_connection
  password node['mysql']['server_root_password']
  database_name node['mysite']['database']
  host 'localhost'
  privileges [:select,:update,:insert, :delete]
  action [:create, :grant]
end

mysql_conn_args = "--user=root --password=#{node['mysql']['server_root_password']}"

execute 'insert-dummy-data' do
  command %Q{mysql #{mysql_conn_args} #{node['mysite']['database']} <<EOF
    CREATE TABLE transformers (name VARCHAR(32) PRIMARY KEY, type VARCHAR(32));
    INSERT INTO transformers (name, type) VALUES ('Hardhead','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Chromedome','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Brainstorm','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Highbrow','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Cerebros','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Fortress Maximus','Headmaster');
    INSERT INTO transformers (name, type) VALUES ('Chase','Throttlebot');
    INSERT INTO transformers (name, type) VALUES ('Freeway','Throttlebot');
    INSERT INTO transformers (name, type) VALUES ('Rollbar','Throttlebot');
    INSERT INTO transformers (name, type) VALUES ('Searchlight','Throttlebot');
    INSERT INTO transformers (name, type) VALUES ('Wideload','Throttlebot');
EOF}
  not_if "echo 'SELECT count(name) FROM transformers' | mysql #{mysql_conn_args} --skip-column-names #{node['mysite']['database']} | grep '^11$'"
end

I’ve moved the php recipe references into recipes/webfiles.rb:

webfiles.rb

include_recipe "php"
include_recipe "php::module_mysql"

# -- Setup the website
# create the webroot
directory "#{node.mysite.web_root}" do
    mode 0755
end

# copy in an index.html from mysite/files/default/index.html
 cookbook_file "#{node.mysite.web_root}/index.html" do
    source "index.html"
    mode 0755
 end

# copy in my usual favicon, just for the helluvit..
 cookbook_file "#{node.mysite.web_root}/favicon.ico" do
    source "favicon.ico"
    mode 0755
 end

# copy in the mysql demo php file
 template "#{node.mysite.web_root}/robotsindisguise.php" do
    source "robotsindisguise.php.erb"
    variables ({
        :pwd => node.mysql.server_root_password
    })
    mode 0755
 end

 # use a template to create a phpinfo page (just creating the file and passing in one variable)
template "#{node.mysite.web_root}/phpinfo.php" do
    source "testpage.php.erb"
    mode 0755
    variables ({
        :title => node.mysite.name
    })
end

So /recipes/default.rb now looks like this:

default.rb

include_recipe "apache2"
include_recipe "apache2::mod_php5"

# call "web_app" from the apache recipe definition to set up a new website
web_app "mysite" do
    # where the website will live
   docroot "#{node.mysite.web_root}"

   # apache virtualhost definition
   template "mysite.conf.erb"
end

include_recipe "mysite::webfiles"
include_recipe "mysite::data"

Summary

Over the past three articles we’ve automated the creation of a virtual environment via a series of code files, flat files, and template files, and a main script to pull it all together. The result is a full LAMP stack virtual machine. We also created a new website and pushed that onto the VM.

All files used in this post can be found in the associated github repo.

Any comments or questions would be greatly appreciated, as would pull requests for improving my lame ruby and php skillz! (and lame css and html..)

Chef For Developers part 2

I’m continuing with my plan to create a series of articles for learning Chef from a developer perspective.

Part #1 gave an intro to Chef, Chef Solo, Vagrant, and Virtualbox. I also created my first Ubuntu VM running Apache and serving up the default website.

In this article I’ll get on to creating a cookbook of my own, and evolve it whilst introducing PHP into the mix.

Creating and evolving your own cookbook

1. Cook your own book

Downloaded configuration cookbooks live in the cookbooks subdirectory; this should be left alone as you can exclude it from version control knowing that the cookbooks are remotely hosted and can be downloaded as needed.

For your own ones you need to create a new directory; the convention for this has become to use site-cookbooks, but you can use whatever name you like as far as I can tell. You just need to add a reference to that directory in the Vagrantfile:

chef.cookbooks_path = ["cookbooks", "site-cookbooks", "blahblahblah"]

Within that new subdirectory you need to have, at a minimum, a recipes subdirectory with a default.rb ruby file which defines what your recipe does. Other key subdirectories are files (exactly that: files to be referenced/copied/whatever) and templates (ruby ERB templates which can be referenced to create a new file).

To create this default structure (for a cookbook called mysite) just use the one-liner:

mkdir -p site-cookbooks/mysite/{recipes,{templates,files}/default}
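which gives you this skeleton to work with:

site-cookbooks/
└── mysite/
    ├── files/
    │   └── default/
    ├── recipes/
    └── templates/
        └── default/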

You’ll need to create two new files to spin up our new website; a favicon and a flat index html file. Create something simple and put them in the files/default/ directory (or use my ones).

Now in order for them to be referenced there needs to be a default.rb in recipes:

# -- Setup the website
# create the webroot
directory "#{node.mysite.web_root}" do
    mode 0755
end

# copy in an index.html from mysite/files/default/index.html
 cookbook_file "#{node.mysite.web_root}/index.html" do
    source "index.html"
    mode 0755
 end

# copy in my usual favicon, just for the helluvit..
 cookbook_file "#{node.mysite.web_root}/favicon.ico" do
    source "favicon.ico"
    mode 0755
 end

This will create a directory for the website (the location of which needs to be defined in the chef.json section of the Vagrantfile), copy the specified files from files/default/ over, and set the permissions on them all so that the web process can access them.

You can also use the syntax:

directory node['mysite']['web_root'] do

in place of

directory "#{node.mysite.web_root}" do

So how will Apache know about this site? Better configure it with a conf file from a template; create a new file in templates/default/ called mysite.conf.erb:

<VirtualHost *:80>
  DocumentRoot <%= @params[:docroot] %>
</VirtualHost>

And then reference it from the default.rb recipe file (add to the end of the one we just created, above):

web_app "mysite" do
    # where the website will live
   docroot "#{node.mysite.web_root}"

   # apache virtualhost definition
   template "mysite.conf.erb"
end

That just calls the web_app method that exists within the Apache cookbook to create a new site called “mysite”, point the docroot at the directory we just created, and set up the virtual host to reference it via the ERB template.
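
For the curious: web_app is a Chef “definition” that lives in the apache2 cookbook, and it’s also why the ERB template reads its values via @params. Heavily simplified (this is a sketch of the idea, not the cookbook’s actual source), it does something along these lines:

define :web_app do
    # render the virtualhost config from whichever template was passed in,
    # exposing the definition's parameters to the ERB as @params
    template "/etc/apache2/sites-available/#{params[:name]}.conf" do
        source params[:template]
        variables :params => params
    end

    # enable the new site
    apache_site params[:name]
end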

The Vagrantfile now needs to become:

Vagrant.configure("2") do |config|
  config.vm.box = "precise32"
  config.vm.box_url = "http://files.vagrantup.com/precise32.box"
  config.vm.network :forwarded_port, guest: 80, host: 8080

  config.vm.provision :chef_solo do |chef|

    chef.json = {
      "apache" => {
        "default_site_enabled" => false
      },
      "mysite" => {
        "name" => "My AWESOME site",
        "web_root" => "/var/www/mysite"
      }
    }

    chef.cookbooks_path = ["cookbooks","site-cookbooks"]
    chef.add_recipe "apache2"
    chef.add_recipe "mysite"
  end
end

Pro tip: be careful with quotes around the value for default_site_enabled; the string “false” is truthy in Ruby (so it acts like true), whereas the bare boolean false acts like, well, false.
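
That’s just Ruby truthiness at work: anything other than nil or false is truthy, including the string “false”. A quick demonstration:

value = "false"
puts "enabled!" if value    # prints "enabled!" - any string is truthy
puts "enabled!" if false    # prints nothing - the boolean is falsy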

Make sure you’ve destroyed your existing vagrant vm and bring this new one up, a decent one-liner is:

vagrant destroy --force && vagrant up

You should see a load of references to your new cookbook in the output and hopefully once it’s finished you’ll be able to browse to http://localhost:8080 and see something as GORGEOUS as:

Salmonpink is underrated

2. Skipping the M in LAMP, Straight to the P: PHP

Referencing PHP

Configure your code to bring in PHP; a new recipe needs to be referenced as a module of Apache:

chef.add_recipe "apache2::mod_php5"

It’s probably worth mentioning that

add_recipe "apache2"

actually means

add_recipe "apache2::default"

As such, mod_php5 is a recipe file itself, much like default.rb is; you can find it in the Apache cookbook under cookbooks/apache2/recipes/mod_php5.rb and all it does is call the appropriate package manager to install the necessary libraries.
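
Conceptually (a simplified sketch, not the recipe’s exact contents) it boils down to installing the package and enabling the Apache module:

# roughly what apache2::mod_php5 amounts to on Debian/Ubuntu
package "libapache2-mod-php5" do
    action :install
end

# apache_module is another definition from the apache2 cookbook
apache_module "php5"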

You may find that you receive the following error after adding in that recipe reference:

apt-get -q -y install libapache2-mod-php5=5.3.10-1ubuntu3.3 returned 100, expected 0

To get around this you need to add in some simple apt-get housekeeping before any other provisioning:

config.vm.provision :shell, :inline => "apt-get clean; apt-get update"

PHPInfo

Let’s make a basic phpinfo page to show that PHP is in there and running. To do this you could create a new file and just whack in a call to phpinfo(), but I’m going to create a new template so we can pass in a page title for it to use (create your own, or just use mine):

<html>
<head>
    <title><%= @title %></title>
    .. snip..
</head>
<body>
<h1><%= @title %></h1>
<div class="description">
    <?php
    phpinfo( );
    ?>
</div>
.. snip ..
</body>
</html>

The default.rb recipe now needs a new section to create a file from the template:

# use a template to create a phpinfo page (just creating the file and passing in one variable)
template "#{node.mysite.web_root}/phpinfo.php" do
    source "testpage.php.erb"
    mode 0755
    variables ({
        :title => node.mysite.name
    })
end

Destroy, rebuild, and browse to http://localhost:8080/phpinfo.php:

A spanking new phpinfo page - wowzers!

Notice the heading and the title of the tab are set to the values passed in from the Vagrantfile.

3. Refactor the Cookbook

We can actually put the add_recipe calls inside of other recipes using include_recipe, so that the dependencies are explicit; no need to worry about forgetting to include apache in the Vagrantfile if you’re including it in your recipe itself.

Let’s make default.rb responsible for the web app itself, and make a new recipe for creating the web files; create a new webfiles.rb in site-cookbooks/mysite/recipes and move the file-related stuff in there:

webfiles.rb

# -- Setup the website
# create the webroot
directory "#{node.mysite.web_root}" do
    mode 0755
end

# copy in an index.html from mysite/files/default/index.html
 cookbook_file "#{node.mysite.web_root}/index.html" do
    source "index.html"
    mode 0755
 end

# copy in my usual favicon, just for the helluvit..
 cookbook_file "#{node.mysite.web_root}/favicon.ico" do
    source "favicon.ico"
    mode 0755
 end

 # use a template to create a phpinfo page (just creating the file and passing in one variable)
template "#{node.mysite.web_root}/phpinfo.php" do
    source "testpage.php.erb"
    mode 0755
    variables ({
        :title => node.mysite.name
    })
end

default.rb now looks like this:

include_recipe "apache2"
include_recipe "apache2::mod_php5"

# call "web_app" from the apache recipe definition to set up a new website
web_app "mysite" do
    # where the website will live
   docroot "#{node.mysite.web_root}"

   # apache virtualhost definition
   template "mysite.conf.erb"
end

include_recipe "mysite::webfiles"

And Vagrantfile now looks like:

Vagrant.configure("2") do |config|
  config.vm.box = "precise32"
  config.vm.box_url = "http://files.vagrantup.com/precise32.box"
  config.vm.network :forwarded_port, guest: 80, host: 8080

  config.vm.provision :shell, :inline => "apt-get clean; apt-get update" 

  config.vm.provision :chef_solo do |chef|

    chef.json = {
      "apache" => {
        "default_site_enabled" => false
      },
      "mysite" => {
        "name" => "My AWESOME site",
        "web_root" => "/var/www/mysite"
      }
    }

    chef.cookbooks_path = ["cookbooks","site-cookbooks"]
    chef.add_recipe "mysite"
  end
end

The add_recipe calls are now include_recipe calls moved into default.rb, the file-related resources live in webfiles.rb, and there’s an include_recipe to reference this new file:

include_recipe "mysite::webfiles"

Why the refactoring is important!

Well, refactoring is a nice, cathartic thing to do anyway. But there’s also a specific reason for doing it here: once we move from using Chef Solo to Grown Up Chef (aka Hosted Chef) the Vagrantfile won’t be used anymore.

As such, moving the logic out of the Vagrantfile (e.g., add_recipe calls) and into our own cookbook (e.g. include_recipe calls) will allow us to use our same recipe in both Chef Solo and also Hosted Chef.

Next up

We’ll be getting stuck in to MySQL integration and evolving a slightly more dynamic recipe.

All files used in this post can be found in the associated github repo.

Chef For Developers

Chef, Vagrant, VirtualBox

In this upcoming series of articles I’ll be trying to demonstrate (and learn for myself) how to effectively configure the creation of an environment. I’ve decided to look into Chef as my environment configuration tool of choice, just because it managed to settle in my brain quicker than Puppet did.

I’m planning on starting really slowly and simply using Chef Solo so I don’t need to learn about the concepts of hosted Chef servers and Chef client nodes to begin with. I’ll be using virtual machines instead of metal, so will be using VirtualBox for the VM-ing and Vagrant for the VM orchestration.

Sounds like Ops to me..

The numerous other articles I’ve read about using Chef all seem to assume a fundamental Linux SysOps background, which melted my little brain somewhat; hence why I’m starting my own series and doing it from a developer perspective.

LINUX?!

Don’t worry if you’re not familiar with Linux; although I’ll start with a Linux VM I’ll eventually move on to applying the same process to Windows, and the commands used in Linux will be srsly basic. Srsly.
Lolz.

Part 1 – I ♥ LAMP

These first few articles will cover Chef, VirtualBox, and Vagrant.

Chef

“Chef is an automation platform that transforms infrastructure into code”. You are ultimately able to describe what your infrastructure looks like in ruby code and manage your entire server estate via a central repository; adding, removing, and updating features, applications, and configuration from the command line with an extensive Chef toolbelt.

Yes, there are knives. And cookbooks and recipes. Even a food critic!

Here’s the important bit: the difference between Chef Solo and the Hosted Chef options

Chef Solo

  1. You only have a single Chef client which uses a local json file to understand what it is comprised of.
  2. Cookbooks are either saved locally to the client or referenced via URL to a tar archive.
  3. There is no concept of different environments.

Hosted Chef

  1. You have a master Chef server to which all Chef client nodes connect to understand what they are comprised of.
  2. Cookbooks are uploaded to the Chef server using the Knife command line tool.
  3. There is the concept of different environments (dev, test, prod).

I’ll eventually get on to this in more detail as I’ll be investigating Chef over the next few posts in this series; for now, please just be aware that in this scenario Chef Solo is being used to demonstrate the benefit of environment configuration and is not being recommended as a production solution. Although in some cases it might be.

VirtualBox

virtualbox

“VirtualBox is a cross-platform virtualization application”. You can easily configure a virtual machine in terms of RAM, HDD size and type, network interface type and number, CPU, even configure shared folders between host and client. Then you can point the virtual machine’s optical drive at an ISO on the host computer and install an OS as if you were sitting at a physical machine.

This has so many uses, including things like setting up a development VM for installing loads of dev tools if you want to keep your own computer clean, or setting up a presentation machine containing just PowerPoint, your slides, and Visual Studio for demos.

Vagrant

vagrant up

Vagrant is an open source development environment virtualisation technology written in Ruby. Essentially you use Vagrant to script against VirtualBox, VMWare, AWS or many others; you can even write your own provider for it to hook into!

The code for Vagrant is open source and can be found on GitHub.

Getting started

Downloads

For this first post you don’t even need to download the Chef client, so we’ll leave that for now.

Go and download Vagrant and VirtualBox and install them.

Your First Scripted Environment

1. Get a base OS image

To do this, download a Vagrant “box” (an actual base OS image, of which there are many) from a specified URL, assign it a friendly name (e.g. “precise32”), and create a base “Vagrantfile” using Vagrant’s “init” command; from the command line run:

vagrant init precise32 http://files.vagrantup.com/precise32.box

vagrant init

A Vagrantfile is a little bit of ruby to define the configuration of your Vagrant box; the autogenerated one is HUGE but it’s pretty much all tutorial-esque comments. Ignoring the comments gives you something like this:

Vagrant::Config.run do |config|
  config.vm.box = "precise32"
  config.vm.box_url = "http://files.vagrantup.com/precise32.box"
end

Yours might also look like this depending on whether you’re defaulting to Vagrant v2 or v1:

Vagrant.configure("2") do |config|
  config.vm.box = "precise32"
  config.vm.box_url = "http://files.vagrantup.com/precise32.box"
end

This is worth bearing in mind, as the syntax for various operations differs slightly between versions.
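
Port forwarding, which we’ll need shortly, is a good example; it looks roughly like this in each version:

# Vagrant v1
config.vm.forward_port 80, 8080

# Vagrant v2
config.vm.network :forwarded_port, guest: 80, host: 8080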

2. Create and start your basic VM

From the command line:

Create and start up the basic vm

vagrant up

vagrant up

If you have Virtualbox running you’ll see the new VM pop up and the preview window will show it booting.

vagrant up in virtualbox

SSH into it

vagrant ssh

vagrant ssh

Stop it

vagrant halt

vagrant halt

Remove all trace of it

vagrant destroy

vagrant destroy

And that’s your first basic, scripted virtual machine using Vagrant! Now let’s add some more useful functionality to it:

3. Download Apache Cookbook

Create a subdirectory “cookbooks” in the same place as your Vagrantfile, then head over to the opscode github repo and download the Apache2 cookbook into the “cookbooks” directory.

OpsCode cookbooks repo for Apache

Apache

git clone https://github.com/opscode-cookbooks/apache2.git

Gitting it

4. Set up Apache using Chef Solo

Now it starts to get interesting.

Update your Vagrantfile to include port forwarding so that browsing to localhost:8080 redirects to your VM’s port 80:

Vagrant.configure("2") do |config|
      config.vm.box = "precise32"
      config.vm.box_url = "http://files.vagrantup.com/precise32.box"
      config.vm.network :forwarded_port, guest: 80, host: 8080
    end

Now add in the Chef provisioning to include Apache in the build:

Vagrant.configure("2") do |config|
      config.vm.box = "precise32"
      config.vm.box_url = "http://files.vagrantup.com/precise32.box"
      config.vm.network :forwarded_port, guest: 80, host: 8080

      config.vm.provision :chef_solo do |chef|
        chef.cookbooks_path = ["cookbooks"]
        chef.add_recipe "apache2"
      end
    end

Kick it off:

vagrant up

Vagrant with Apache - starting boot

..tick tock..

Vagrant with Apache - finishing boot

So we now have a fresh new Ubuntu VM with Apache installed, configured, and running on port 80, with our own port 8080 forwarded to the VM’s port 80; let’s check it out!

Browsing the wonderful Apache site

Huh? Where’s the lovely default site you normally get with Apache? Apache is definitely running – check the footer of that screen.

What’s happening is that on Ubuntu the default site doesn’t get enabled so we have to do that ourselves. This is also a great intro to passing data into the chef provisioner.

Add in this little chunk of JSON to the Vagrantfile:

chef.json = {
  "apache" => {
    "default_site_enabled" => true
  }
}

So it should now look like this:

Vagrant.configure("2") do |config|
      config.vm.box = "precise32"
      config.vm.box_url = "http://files.vagrantup.com/precise32.box"
      config.vm.network :forwarded_port, guest: 80, host: 8080

      config.vm.provision :chef_solo do |chef|

        chef.json = {
          "apache" => {
            "default_site_enabled" => true
          }
        }

        chef.cookbooks_path = ["cookbooks"]
        chef.add_recipe "apache2"
      end
    end

The chef.json section passes the specified variable values into the specified recipe file. If you dig into default.rb in /cookbooks/apache2/recipes you’ll see this block towards the end:

apache_site "default" do
  enable node['apache']['default_site_enabled']
end

Essentially this says “for the site default, set its status equal to the value defined by default_site_enabled in the apache node config section”. For Ubuntu this defaults to false (other OSs default it to true) and we’ve just set ours to true.
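
Those per-OS defaults come from the cookbook’s attributes files; conceptually it’s something like this (a simplified sketch, not the actual cookbook source):

# sketch of cookbooks/apache2/attributes/default.rb
case node['platform_family']
when 'debian'
    default['apache']['default_site_enabled'] = false
else
    default['apache']['default_site_enabled'] = true
end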

Let’s try that again:

vagrant reload

(reload is the equivalent of vagrant halt && vagrant up)

Notice that this time, towards the end of the run, we get the message

INFO: execute[a2ensite default] ran successfully

instead of on the previous one:

INFO: execute[a2dissite default] ran successfully

  • a2ensite = enable site
  • a2dissite = disable site

So what does this look like?

Browsing the wonderful Apache site.. take 2

BOOM!

Next up

Let’s dig into the concept of Chef recipes and creating our own ones.