A Step by Step Guide to using Terraform to define an AutoScaling Private WebPageTest instance in code

WebPagetest Terraform

In a previous article I went through the steps needed to create your own private, autoscaling, WebPageTest setup in Amazon AWS. It wasn’t particularly complicated, but it was quite manual; I don’t like pointing and clicking in a GUI since I can’t easily put it in version control and run it again and again on demand.

Fortunately, whatever you create within AWS can be described using a language called CloudFormation which allows you to define your infrastructure as code.

Unfortunately it’s not easy to understand (in my opinion!) and I could never quite get my head around it, which annoyed me no end.

In this article I’ll show you how to use Terraform to define your private autoscaling WebPageTest setup in easily understandable infrastructure as code, enabling an effortless and reproducible web performance testing setup, which you can then fearlessly edit and improve!

Terraform

Terraform is built around a domain specific language (DSL) which MASSIVELY simplifies defining your infrastructure compared to AWS’s CloudFormation.

I’ll build up the infrastructure code section by section below, and then review the script in its entirety at the end.

Your first prerequisite is to go and download terraform and ensure it’s in your PATH; it’s a standalone executable, which makes this super simple!
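You can quickly check it’s wired up correctly from any shell; the exact version will depend on when you downloaded it, but you should see something like:

$ terraform version
Terraform v0.11.13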

Done that? Ok, cool – let’s move on:

Autoscaling Private WebPageTest

The three required sections for this setup are:

  1. A user with permissions to create new EC2 instances which the WebPageTest server uses to spin up test agents and S3 permissions to archive the tests (IAM).
  2. A place to archive test results, freeing up space on the WebPageTest server itself (S3).
  3. A VM instance on which to host your WebPageTest server which orchestrates the whole process (EC2).

Previously we created and linked these up manually, and this time we’ll get the same result using Terraform to code it all up!

Codified Autoscaling Private WebPageTest setup

A big difference with this approach is that, in order to use Terraform, we need to create an AWS user for Terraform itself; it will be creating AWS resources on our behalf, and as such it needs the appropriate admin permissions.

This is a one-off task: log in to your AWS console, hop over to IAM, and create a new user with programmatic access and admin privileges.

This is seriously overkill, so you might want to read up on Terraform Best Practices to grant just the level of access you actually need for your setup.
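As a rough sketch, this particular setup only ever touches IAM, EC2, and S3, so you could attach just those managed policies to your Terraform user instead of full admin; for example with the AWS CLI (assuming your Terraform user is called "terraform"):

# attach only the policies this setup actually exercises
aws iam attach-user-policy --user-name terraform --policy-arn arn:aws:iam::aws:policy/IAMFullAccess
aws iam attach-user-policy --user-name terraform --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
aws iam attach-user-policy --user-name terraform --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess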

Grab the access key and secret for your Terraform user and save them in any one of the following:

  1. Environment variables
  2. AWS Profile entry in the local profile file
  3. Inline (see below)
  4. Plain text file

If you choose the AWS profile file or a plain text file, then it should use this format:

[terraform]
aws_access_key_id=BLAHBLAHBLAHBLAHBLAH
aws_secret_access_key=bLAHBLAhbLAHBLAhb
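
If you go with environment variables instead (option 1), the AWS provider picks up the standard variable names automatically; for example:

# the AWS provider reads these if no other credentials are configured
export AWS_ACCESS_KEY_ID=BLAHBLAHBLAHBLAHBLAH
export AWS_SECRET_ACCESS_KEY=bLAHBLAhbLAHBLAhb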

Set up the AWS provider

Terraform needs to know which infrastructure provider you plan to use; it currently officially supports over 100 providers and has unofficial support for over 100 more via the community.

Basically, if you deal with infrastructure, there’s probably a Terraform provider for it. We’re using AWS, so ours will look like one of the following, depending on where you saved your Terraform IAM details:

# Let the provider pick up credentials itself,
# e.g. via environment variables (1)
provider "aws" {
  region        = "eu-west-1"
}

# OR
# in your ~/.aws/credentials file, for example (2)
provider "aws" {
  region        = "eu-west-1"
  profile       = "terraform"
}

# OR
# Inline (3)
provider "aws" {
  region     = "eu-west-1"
  access_key = "<access_key>"
  secret_key = "<secret_key>"
}

# OR
# Specific credentials file (4)
# (same format as a profile file)
provider "aws" {
  region        = "eu-west-1"
  shared_credentials_file = "aws_credentials.txt"
}

Just choose one of these approaches and save it as "webpagetest.tf".

IAM

Now that the admin is complete, we can get back to the list of three things the WebPageTest setup needs:

  1. IAM
  2. S3
  3. EC2

We need to create a user for the WebPageTest server to use in order to create and destroy the EC2 test agents, and to archive test results to S3.

1) Create the IAM user

The general structure for Terraform resources is:

# resource structure
resource "resource type" "resource name" {
  property_name = "property value"
}

For creating our IAM user, it looks like this:

# IAM resource
resource "aws_iam_user" "wpt-user" {
  name = "wpt-user"
}
  • The "resource type" is "aws_iam_user"
  • The "resource name" for this one is "wpt-user"
  • The "property_name" "name" has the "property_value" of "wpt-user"

Save this as "iam-wpt.tf" in the same directory as your "webpagetest.tf" file from above (with the configuration details).

Easy, right? That’s our first resource! Want to try it out? Sure you do! Hop over to your command line, get into the same directory as your tf files, and run:

$ terraform init

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "aws" (2.12.0)...

This will initialise the directory with the AWS Terraform provider, which can take a while to download. Once that’s finished you can run:

terraform plan

This will show what Terraform will do, without actually doing it yet:

Terraform will perform the following actions:

  + aws_iam_user.wpt-user
      id:                                    <computed>
      arn:                                   <computed>
      force_destroy:                         "false"
      name:                                  "wpt-user"
      path:                                  "/"
      unique_id:                             <computed>

  + aws_iam_user_policy_attachment.ec2-policy-attach
      id:                                    <computed>
      policy_arn:                            "arn:aws:iam::aws:policy/AmazonEC2FullAccess"
      user:                                  "wpt-user"

  + aws_iam_user_policy_attachment.s3-policy-attach
      id:                                    <computed>
      policy_arn:                            "arn:aws:iam::aws:policy/AmazonS3FullAccess"
      user:                                  "wpt-user"

  + aws_instance.webpagetest
      id:                                    <computed>
      ami:                                   "ami-9978f6ee"
      arn:                                   <computed>
      associate_public_ip_address:           <computed>
    ...

If you actually want to try this out, then run:

terraform apply

You’ll be asked to approve this step by typing "yes"; if you’re feeling confident and foolhardy you can bypass this each time with an extra parameter:

terraform apply -auto-approve

The IAM user will be created, visible in your AWS console:

webpagetest IAM user created

Since we’re not done yet, we could just leave it there: Terraform keeps track of the current state of your setup, and will apply subsequent updates as incremental changes where possible, or else tear down and recreate the entire thing.

If you don’t want to leave it lying around, you can run:

terraform destroy

This will tear down whatever was created by the .tf files in the current directory.
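If you only want to remove a single resource rather than the whole setup, destroy also accepts a -target parameter; for example:

# tear down just the IAM user, leaving everything else in place
terraform destroy -target=aws_iam_user.wpt-user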

Give EC2 permissions

We need to attach the AmazonEC2FullAccess policy to the user; for this you can use the appropriate ARN (Amazon Resource Name) inline:

# AmazonEC2FullAccess
# Attaching the policy to the user
resource "aws_iam_user_policy_attachment" "ec2-policy-attach" {
  user = "${aws_iam_user.wpt-user.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2FullAccess"
}

There are a few other ways to do this; I’ll pick this up later in the Improvements section.

Notice the use of references: instead of typing:

user = "webpagetest"

we can refer to the properties of other resources, so this code instead becomes:

user = "${aws_iam_user.wpt-user.name}"

In some cases (such as using resources we’ve already created), this dynamic referencing makes for a wonderfully flexible infrastructure-as-code setup; we’ll use this lots more later on.

Give S3 Permissions

Same as above, but this time for AmazonS3FullAccess:

# S3 policy
# attach the S3 policy to the user
resource "aws_iam_user_policy_attachment" "s3-policy-attach" {
  user = "${aws_iam_user.wpt-user.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3FullAccess"
}

Checkpoint

We’ve now defined a Terraform user with admin powers, created a WebPageTest IAM user, and granted that user full EC2 and S3 permissions.

Ideally you’d only grant powers to create the resources that you actually need – there’s a lot of valuable info over on the Terraform site around best practices.

You can check this out with a terraform plan and/or terraform apply if you like, or just move on to the next section.

2) S3 Bucket for test archiving

Creating an S3 bucket is pretty simple, and since we’ve given the IAM user full S3 access then it can push tests into whatever bucket we create. There’s one snag though…

resource "aws_s3_bucket" "wpt-archive" {
  bucket = "HowDoIMakeSureThisIsUniqueIsh?!" # great question..
  acl    = "private"
}

S3 buckets need to have globally unique names – how can we code that? Luckily, Terraform has the concept of a random string generator, which will give us a pretty good chance of coming up with something unique:

# the "random_string" resource with some S3-friendly settings:
# no special chars and no uppercase
resource "random_string" "bucket" {
  length = 10
  special = false
  upper = false
}

# Now get the actual value using ".result"
resource "aws_s3_bucket" "wpt-archive" {
  bucket = "my-wpt-test-archive-${random_string.bucket.result}"
  acl    = "private"
}

Important point: random_string comes from a new provider ("random"), so you’ll need to run terraform init again to pull that provider down before you’ll be allowed to run terraform plan or terraform apply.
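The init output will look something like the below; your plugin version may well differ:

$ terraform init

Initializing provider plugins...
- Downloading plugin for provider "random" (2.1.2)...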

3) EC2 WebPageTest Instance

To create the WebPageTest server EC2 instance we need to specify the AMI to use; luckily for us, these are all listed in the WebPageTest github repo:

WebPageTest server EC2 AMI list by region

Pick the one for your chosen AWS region; in my case this will be eu-west-1, so I use the ami value of "ami-9978f6ee" for the first step:

# WebPageTest EC2 instance
resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"
}

Security

We also need to specify the security config to open up port 80 (and/or 443 if you’re serving HTTPS), plus port 22 for SSH. This is a separate resource which we then refer back to in the EC2 one:

# Security group for the WebPageTest EC2 instance
resource "aws_security_group" "wpt-sg" {
  name = "wpt-sg"

  # http  
  ingress {
    # What range of ports are we opening up?
    from_port = 80 # From port 80...
    to_port = 80 # to port 80!
    protocol = "tcp"
    description = "HTTP"
    cidr_blocks = ["0.0.0.0/0"] # who can access it? The world!
  }  
  
  # SSH
  ingress {
    from_port = 22
    to_port = 22
    protocol = "tcp"
    description = "SSH"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Outgoing traffic
  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    cidr_blocks     = ["0.0.0.0/0"]
  }
}

I’m not a security specialist; I just like connecting dots to see what happens! Please feel free to suggest improvements to the repo over on github; I appreciate any help I can get.
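One easy win, if you have a static IP, is locking the SSH rule down to just your own address instead of the whole world. A sketch, with a placeholder for your own IP:

  # SSH from a single trusted address only
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    description = "SSH from my IP"
    cidr_blocks = ["<your.ip.address>/32"]
  }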

Now we can link that security group resource to the EC2 instance resource like so:

# WebPageTest EC2 instance
resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"

  # we can refer back to the security group using a reference
  # made up of  "resource-type.resource-name" to  get the
  # id property
  vpc_security_group_ids  = ["${aws_security_group.wpt-sg.id}"]
}

A couple more things and we’re done with the EC2 instance.

Key Pair

Any EC2 instance needs an associated public key, so that you can remotely connect if necessary. We could make this dynamic, using something like this:

# create a keypair
resource "aws_key_pair" "wpt-admin-key" {
  key_name   = "wpt-admin-key"
  public_key = "ssh-rsa blahblahblah... [email protected]"
}

resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"
  vpc_security_group_ids  = ["${aws_security_group.wpt-sg.id}"]
  # reference the new key pair's name
  key_name      = "${aws_key_pair.wpt-admin-key.name}"
}

Or we could just use one that already exists in the AWS account:

resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"
  vpc_security_group_ids  = ["${aws_security_group.wpt-sg.id}"]
  # paste in your existing key pair's name
  key_name      = "robins-key"
}

User data

The last thing needed to get the WebPageTest server EC2 instance working is to set the user data, which defines the settings for the WebPageTest server. The tricky part here is passing line-delimited data within user data.

If you remember from last time, we want to pass in the following information (and if you don’t remember from last time, why not have a quick read to get context?):

  • ec2_key: the IAM user access key ID
  • ec2_secret: the IAM user secret access key
  • api_key: a super secret api key that you provide
  • waterfall_show_user_timing: makes for pretty waterfall charts if you have user timings in your pages
  • iq: image quality – defaults to 30%, but 75% is nicer, especially if you’re using S3 for storage!
  • pngss: full resolution images for screenshots
  • archive_s3_key: the IAM user key
  • archive_s3_secret: the IAM user secret
  • archive_s3_bucket: the WPT archive bucket
  • archive_days: number of days before tests are pushed to S3
  • cron_archive: run archive script hourly automatically as agents poll for work

First up, let’s get a reference to the wpt-user’s IAM credentials, so they can be passed in as user data:

# IAM Access Key for WebPageTest user
resource "aws_iam_access_key" "wpt-user" {
  user = "${aws_iam_user.wpt-user.name}"
}

The WebPageTest IAM user’s access key details can now be accessed via this new aws_iam_access_key resource:

key = "${aws_iam_access_key.wpt-user.id}"
secret = "${aws_iam_access_key.wpt-user.secret}"

We could just create a really long string with "\n" separating each setting, or we could reference an external file. I’ll show an inline string here, with references to both the aws_iam_access_key and our previously created aws_s3_bucket resource called "wpt-archive", and will show the external file version in the Improvements section towards the end.

Unfortunately you can’t have multiline variable values, so this inline version just becomes one very, very long line of text. Not easy to maintain or debug!

# specifying the user data inline
user_data     = "ec2_key=${aws_iam_access_key.wpt-user.id} \n ec2_secret=${aws_iam_access_key.wpt-user.secret} \n api_key=<my crazy long api key> \n waterfall_show_user_timing=1 \n iq=75 \n pngss=1 \n archive_s3_server=s3.amazonaws.com \n archive_s3_key=${aws_iam_access_key.wpt-user.id} \n archive_s3_secret=${aws_iam_access_key.wpt-user.secret} \n archive_s3_bucket=${aws_s3_bucket.wpt-archive.bucket} \n archive_days=1 \n cron_archive=1"

There are plenty of potential improvements here: pulling the user data into a template file, which allows us to create it dynamically, is a favourite of mine; I’ll demonstrate this later in the Improvements section.

Try it out

The resulting script is below – be sure to replace the placeholders with your own values!

# Setting up the AWS Terraform provider
provider "aws" {
  region     = "eu-west-1"
  
  # FILL IN THESE PLACEHOLDERS (or use another method):
  access_key = "<access_key>"
  secret_key = "<secret_key>"
}

# IAM config
resource "aws_iam_user" "wpt-user" {
  name = "wpt-user"
}

resource "aws_iam_user_policy_attachment" "ec2-policy-attach" {
  user = "${aws_iam_user.wpt-user.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2FullAccess"
}

resource "aws_iam_user_policy_attachment" "s3-policy-attach" {
  user = "${aws_iam_user.wpt-user.name}"
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3FullAccess"
}

resource "aws_iam_access_key" "wpt-user" {
  user = "${aws_iam_user.wpt-user.name}"
}

# S3 Config
resource "random_string" "bucket" {
  length = 10
  special = false
  upper = false
}

resource "aws_s3_bucket" "wpt-archive" {
  bucket = "my-wpt-test-archive-${random_string.bucket.result}"
  acl    = "private"
}

# Main EC2 config
resource "aws_instance" "webpagetest" {
  ami           = "ami-9978f6ee"
  instance_type = "t2.micro"
  vpc_security_group_ids  = ["${aws_security_group.wpt-sg.id}"]
  
  # FILL IN THIS PLACEHOLDER:
  key_name      = "<key pair name>"
  
  # FILL IN THE API KEY PLACEHOLDER:
  user_data     = "ec2_key=${aws_iam_access_key.wpt-user.id} \n ec2_secret=${aws_iam_access_key.wpt-user.secret} \n api_key=<my crazy long api key> \n waterfall_show_user_timing=1 \n iq=75 \n pngss=1 \n archive_s3_server=s3.amazonaws.com \n archive_s3_key=${aws_iam_access_key.wpt-user.id} \n archive_s3_secret=${aws_iam_access_key.wpt-user.secret} \n archive_s3_bucket=${aws_s3_bucket.wpt-archive.bucket} \n archive_days=1 \n cron_archive=1"
}

# Security group for the WebPageTest server
resource "aws_security_group" "wpt-sg" {
  name = "wpt-sg"

  # http  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    description = "HTTP"
    cidr_blocks = ["0.0.0.0/0"]
  }  
  
  # SSH
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    description = "SSH"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    cidr_blocks     = ["0.0.0.0/0"]
  }
}

# This is to make it easier to get the resulting URL
# for your WebPageTest instance
output "webpagetest" {
  value = "${aws_instance.webpagetest.public_dns}"
}

The final section is really handy; the output type will print out the result of the value property – in this case the public URL for the WebPageTest server itself.

We could also create an output for:

  • Randomised S3 bucket name,
  • WebPageTest IAM user’s key and secret,
  • etc – see the sketches below
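
As a sketch, those first two might look like this; marking the secret as sensitive means Terraform redacts it in the apply summary:

# surface the generated bucket name
output "wpt_archive_bucket" {
  value = "${aws_s3_bucket.wpt-archive.bucket}"
}

# the WebPageTest IAM user's credentials
output "wpt_user_key" {
  value = "${aws_iam_access_key.wpt-user.id}"
}

output "wpt_user_secret" {
  value     = "${aws_iam_access_key.wpt-user.secret}"
  sensitive = true
}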

Let’s run terraform apply and watch what happens:

Apply complete! Resources: 9 added, 0 changed, 0 destroyed.

Outputs:

webpagetest = ec2-34-245-85-104.eu-west-1.compute.amazonaws.com

Go ahead and visit that URL to find the familiar WebPageTest homepage. Enter a URL to test and submit it; you’ll see an EC2 test agent start to spin up in the appropriate AWS region.

Complete

Hopefully you’ve seen how easy it can be to get started with Terraform, creating your very own autoscaling private WebPageTest instance in AWS from one Terraform file.

You can stop here, have a refreshing beverage of choice and make smug finger-guns at people as they pass by.

Or you could check out the next section which takes your current knowledge and builds on it – go on, you’ll love it, honest!

Improvements

The script above will totally get you a fully autoscaling WebPageTest private instance in AWS and is pretty flexible via the user_data options that can configure WebPageTest in some detail.

Here are a few opportunities to improve on this.

Improvement 1: Variables

There are a few places in the Terraform script that use hard coded values; by introducing variables we can make the script more flexible. For instance, right at the very top the "region" is set to "eu-west-1", so let’s pull that into a variable:

# Define a variable for the region
variable "region" {
  default = "eu-west-1"
}

We can now refer to this anywhere that we would have hard coded the region, for example:

# Setting up the AWS Terraform provider
provider "aws" {
  region = "${var.region}"
}

Let’s define another one that abstracts the EC2 AMI for the WebPageTest server; this will be a "map" type instead of the default string type:

# WebPageTest EC2 AMIs
variable "wpt_ami" {
    type    = "map"
    default = {
        us-east-1      = "ami-fcfd6194"
        us-west-1      = "ami-e44853a1"
        us-west-2      = "ami-d7bde6e7"
        sa-east-1      = "ami-0fce7112"
        eu-west-1      = "ami-9978f6ee"
        eu-central-1   = "ami-22cefd3f"
        ap-southeast-1 = "ami-88bd97da"
        ap-southeast-2 = "ami-eb3542d1"
        ap-northeast-1 = "ami-66233967"
    }
}

Since there’s a different one for each region, we can combine these variables using a lookup:

resource "aws_instance" "webpagetest" {
  ami = "${lookup(var.wpt_ami, var.region)}"
  ...
}

Cool, huh? Although the default region is set to "eu-west-1" in this example, it can be overridden when calling from the command line:

terraform apply -var "region=ap-southeast-1"

This will set the "region" variable to "ap-southeast-1", affecting the provider resource and also choose the matching "wpt_ami" value. This would result in the equivalent of:

provider "aws" {
  region = "ap-southeast-1"
}

...

resource "aws_instance" "webpagetest" {
  ami  = "ami-88bd97da"
  ...
}

Handy! We’ve now extended the original script to support all AWS regions that WebPageTest AMIs exist for.
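If you’d rather not pass -var flags every time, the same override can live in a "terraform.tfvars" file, which Terraform loads automatically:

# terraform.tfvars
region = "ap-southeast-1"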

The github repo includes this addition

Improvement 2: Template for User Data

Having that user data as a looooonnnggg inline string is quite ugly and unmaintainable. We can improve this by using the template_file data type.

This is a new provider, so you’ll need to run terraform init before it can be used.

By abstracting out our user data into a separate data source, we can update the user data string in the EC2 definition to:

user_data     = "${data.template_file.ec2_wpt_userdata.rendered}"

There are a few methods – all covered below – to implement this, and they all use the template_file data type. This allows us to use a template input with placeholders and define the values for those placeholders in a vars object; the two are merged when the template is rendered:

#  a) Inline template string
# Separates it out, but still a messy single line
data "template_file" "ec2_wpt_userdata" {
    template = "ec2_key=$${key} \n ec2_secret=$${secret} \n api_key=$${api_key} \n waterfall_show_user_timing=1 \n iq=75 \n pngss=1 \n archive_s3_server=s3.amazonaws.com \n archive_s3_key=$${key} \n archive_s3_secret=$${secret} \n archive_s3_bucket=$${wpt_s3_archive} \n archive_days=1 \n cron_archive=1"

    vars = {
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        api_key = "123412341234123412341234"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

# b) Inline heredoc syntax - much more readable!
# Now we can have new lines for improved readability
# Note the double $$
data "template_file" "ec2_wpt_userdata" {
    template = <<EOT
      ec2_key=$${key}
      ec2_secret=$${secret}
      api_key=$${api_key}
      waterfall_show_user_timing=1
      iq=75
      pngss=1
      archive_s3_server=s3.amazonaws.com
      archive_s3_key=$${key}
      archive_s3_secret=$${secret}
      archive_s3_bucket=$${wpt_s3_archive}
      archive_days=1
      cron_archive=1
    EOT

    vars = {
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        api_key = "123412341234123412341234"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

# c) External TPL file
# Keeps it nice and tidy!
data "template_file" "ec2_wpt_userdata" {
    template = "${file("user_data.tpl")}"

    vars = {
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        api_key = "123412341234123412341234"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

All of the above options can be referenced using .rendered on the template_file:

# refer to this as a "rendered" value
resource "aws_instance" "webpagetest" {
  user_data     = "${data.template_file.ec2_wpt_userdata.rendered}"
  ...
}

The external template file would look like the below – note the single $ this time:

ec2_key=${key}
ec2_secret=${secret}
api_key=${api_key}
waterfall_show_user_timing=1
iq=75
pngss=1
archive_s3_server=s3.amazonaws.com
archive_s3_key=${key}
archive_s3_secret=${secret}
archive_s3_bucket=${wpt_s3_archive}
archive_days=1
cron_archive=1

The github repo includes the heredoc template syntax version

Improvement 3: Dynamic API key

Up until now we’ve used a static API key value:

# e.g.
api_key=<my crazy long api key>

# or
api_key = "123412341234123412341234"

Of course, Terraform has a solution to this; first up, a random_string like the one we used for the S3 bucket name:

# API key as a random 40 char string
resource "random_string" "api-key" {
  length = 40
  special = false
}

data "template_file" "ec2_wpt_userdata" {
    template = "${file("user_data.tpl")}"

    vars = {
        # reference the api key to get the resulting random string
        api_key = "${random_string.api-key.result}"
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

All seems good, but we can actually improve on this further. What use is an API key if you don’t know what it is? You can’t easily get the value back out of the rendered user data without rendering the whole string, and doing so will regenerate the random value! It’s like a quantum variable!

One trick to getting the random value out is Terraform’s locals: a local value assigns a name to an expression, allowing it to be used multiple times within a module without repeating it. It also means that the value is calculated once and can be referenced many times.

# API key as a random 40 char string
resource "random_string" "api-key" {
  length = 40
  special = false
}

# define a local "api_key" variable
locals {
  "api_key" = "${random_string.api-key.result}"
}

data "template_file" "ec2_wpt_userdata" {
    template = "${file("user_data.tpl")}"

    vars = {
        # reference the new local
        api_key = "${local.api_key}"
        key = "${aws_iam_access_key.wpt_user.id}"
        secret = "${aws_iam_access_key.wpt_user.secret}"
        wpt_s3_archive = "${aws_s3_bucket.wpt-archive.bucket}"
    }
}

# BONUS! Return the API key without
# regenerating the random value
output "api_key" {
  value = "${local.api_key}"
}

Putting it all together

The full script is over on github, and once you fill in the AWS credentials for your Terraform user and the key pair name, then after running a terraform init and terraform apply you’ll be greeted with something like this:

Outputs:

api_key = t2glfd2MlixzkQpr1e0v37xmGkQOBUVWU1pKeQKd
webpagetest = ec2-34-246-124-170.eu-west-1.compute.amazonaws.com

The user data is generated as expected; you can see the API key in the user data matches the output from above:

Generated User Data

You’ll see the familiar WebPageTest homepage if you pop over to the URL in the output from above:

Generated WPT instance

Different Regions

Let’s try this same script, but in a different region!

Be aware that once you’ve executed one test, the S3 bucket will not be deleted when you call destroy, as it’s not empty. Usually this isn’t a problem: any subsequent terraform apply checks the local "terraform.tfstate" file, knows this S3 bucket still exists, and won’t create a new one. If you change region, however, the apply will fail, since the S3 bucket exists in "terraform.tfstate" but doesn’t exist in the new region you’re now referencing. You can just delete your "terraform.tfstate" file if you want to start from scratch and it’ll work.
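Starting from scratch is just a case of removing the state files; note this makes Terraform forget about any resources that still exist, rather than destroying them:

# wipe Terraform's memory of this setup
rm terraform.tfstate terraform.tfstate.backup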

ALSO be aware that your key pair doesn’t exist in the new region, so you’ll need to create a new one there first, or use Terraform’s inline key pair creation to automate it!

terraform apply -var 'region=ap-southeast-1'

After ticking along for a while (assuming you’ve tidied up any S3 and missing key pair), you’ll see something like this:

Outputs:

api_key = 5titWU7aE3R6gkbf851v3tPjwCsosNVZnmreOSuq
webpagetest = ec2-54-169-221-212.ap-southeast-1.compute.amazonaws.com

Oooh – ap-southeast-1! Cool huh?

Given that the WebPageTest server can already spin up test instances in many AWS regions, you can choose to deploy the server into whichever region you need.

EXTRA POST CREDITS NICK FURY BONUS IMPROVEMENT

Here’s one last improvement to thank you for reading to the end! Notice the EC2 instances appearing in the AWS console:

EC2 webpagetest instances - no names!

No name! That’s not pretty.

If we add tags, specifically one called "Name", then we’ll get something more useful in that listing.

resource "aws_instance" "webpagetest" {
  ...
  # Add your tags in here
  tags {
   Name      =   "webpagetest"
   Project   =   "webperf"
  }
}

Now check it out:

EC2 webpagetest instances - named!

Cool!

Summary

Phew, we’ve covered a whole lot here! With a bit of Terraform we’ve managed to set up WebPageTest’s IAM, S3, EC2, Security Groups, and make it region agnostic and autoscaling, with a dynamically generated API key.

The resulting script is in this github repo. Have a go and let me know how you get on!

A Step by Step Guide to setting up an AutoScaling Private WebPageTest instance

If you have any interest in website performance optimisation, then you have undoubtedly heard of WebPageTest. Being able to test your websites from all over the world, on every major browser, on different operating systems, and even on physical mobile devices, is the greatest ever addition to a web performance engineer’s toolbox.

One small shelf of Pat Meenan's epic WebPageTest device lab

The sheer scale of WebPageTest, with test agents spread literally across the globe (even in China!), of course means that queues for the popular locations can get quite long – not great when you’re in the middle of a performance debug session and need answers FAST!

Also since these test agents query your website from the public internet they won’t be able to hit internal systems – for example pre-production or QA, or even just a corporate intranet that isn’t accessible outside of a certain network.

In this article I’ll show you how to set up your very own private instance of WebPageTest in Amazon AWS, with autoscaling test agents to keep costs down.

Once you have this you can optionally extend the setup, for example:

  • Always-on test agent(s) – get your test result faster
  • Your own test agent within the China firewall – get realistic test results from the perspective of a user behind the Great Firewall
  • Automation and scripting – schedule regular tests
  • Reporting and visualisation – graph your tests to find trends

We’ll get on to those topics later, but first up let’s focus on building a solid WebPageTest foundation!

Getting familiar with AWS

You don’t need to be an AWS guru to get this all running. AWS can be confusing; however, this setup only requires a few clicks throughout the AWS console. Make sure you have your own AWS account ready to go.

There are 2 or 3 main areas we will need to use in AWS for setting up our private WebPageTest instance:

  1. A user with permissions to create new EC2 instances which the WebPageTest server uses to spin up test agents, and optionally S3 permissions to archive the tests.
  2. A place to archive test results, freeing up space on the WebPageTest server itself (optional, but highly recommended – and cheap).
  3. A VM instance on which to host your WebPageTest server which orchestrates the whole process.

Let’s start off with –

1. Create a WebPageTest User

IAM is the AWS Identity and Access Management area ("IAM" – "I am"…); we need to create a new programmatic user for the WebPageTest server to run under.

Log in to your AWS console and head over to IAM by finding it in the Services menu:

AWS IAM

From there, click Add user, enter a name, and select Programmatic Access:

AWS IAM - creating wpt user

Now we need to Set permissions. Select Attach Existing Policies and search for and select "AmazonEC2FullAccess" (so it can start test agents)…

AWS IAM - setting permissions for EC2

… and search for and select "AmazonS3FullAccess" (to archive tests into S3).

AWS IAM - setting permissions for S3

The tests can easily fill up even a large EC2 volume quickly, so archiving to S3 is strongly recommended – S3 is super cheap, EBS volumes are not.

Archiving tests to S3 won’t change how WebPageTest looks and feels; it will fetch the zipped test from S3 on demand incredibly quickly.

Copy the AWS Access Key ID and Secret Access Key for this new user somewhere as we’ll need them in a moment.

AWS IAM - user created

2. Configure Test Storage in S3

EBS volumes (the hard drive on an EC2 instance) will fill up quickly and, although expanding them is possible, it isn’t easy and it can’t be reversed. S3 is the super cheap, almost limitless, alternative.

Head over to the S3 area of your AWS console – again, you can find this under the Services menu:

AWS S3

Give your bucket a unique name – bucket names are globally unique, so yours cannot already exist in any AWS account – and set the region where you’ll be creating your WebPageTest server:

AWS S3 - create a WPT bucket

With that done, now you’re ready to create the server!

3. Setting up the WebPageTest server on EC2

Now let’s create the actual WebPageTest server itself; this is quite a long set of steps so get that coffee ready!

Head over to the EC2 area within the AWS console and select Launch Instance:

AWS EC2

Search for "webpagetest" and select Community AMIs:

AWS EC2 - choose a WPT AMI

Select the top webpagetest-server result.

Choose instance size: t2.micro is ok to start with as you can always scale up if necessary, and t2.micro is currently free tier eligible:

AWS EC2 - set AMI size

This is an important step: instead of launching the server and logging in to edit settings, we can actually define the settings in "user data" which is passed in to the instance at start-up.

If you prefer, you can connect to your WebPageTest instance after launching and configure these settings directly in the file /var/www/webpagetest/www/settings/settings.ini; you can copy the sample and edit it.

Tap on Configure instance details, then Advanced, and paste in user data similar to this, filling in the blanks:

# User to start/stop agents and save tests to s3
ec2_key=<the IAM user access key ID>
ec2_secret=<the IAM user secret access key>

# Set the API key to use when programmatically enqueuing tests
# Can't think of one? check out http://new-guid.com/
api_key=<choose a super secret api key>

# show user timing marks in waterfalls - very handy
waterfall_show_user_timing=1

# better images for screenshots - you're using S3, right?
# so you have the storage!
iq=75
pngss=1

# archiving to s3
archive_s3_server=s3.amazonaws.com
archive_s3_key=<the IAM user key>
archive_s3_secret=<the IAM user secret>
archive_s3_bucket=<the WPT archive bucket>

# number of days to keep tests locally before archiving
archive_days=1

# run archive script hourly automatically as agents poll for work
cron_archive=1

The full list of options that can be set is over on the github repo for WebPageTest, as the settings.ini.sample file; as mentioned earlier, you could skip the user data step and set these options in an ini file created at /var/www/webpagetest/www/settings/settings.ini

AWS EC2 - set user data

Now we need to make sure we can access the server over port 80; for this you need to select Configure security group, then add rule and choose HTTP:

AWS EC2 - set security groups

Alright! Ready to rock! Hit Review and Launch, select or create a keypair (used to log in to the instance), and once the instance has finished initialising we’ll be given a URL, where it says Public DNS:

AWS EC2 - starting up

Head over to that URL and you should see the familiar WebPageTest homepage:

AWS EC2 - WPT server running, submitting a test

Try it out – pop in a URL, select a location, and submit the test. If you then check the EC2 area of your AWS account, after a moment you’ll notice a new instance starting up called "WebPagetest Agent":

AWS EC2 - WPT test agent being created

A new test agent can take a few minutes to actually connect and start testing – sometimes up to 10 minutes – but once connected it’ll automatically pick up the enqueued test and run it:

AWS EC2 - submitted test being picked up

That was easy, right? The server will update itself from the WebPageTest github repo regularly, as will the test agents. Your tests will be automatically archived to, and retrieved from, S3.

AWS EC2 - test running

Your private WebPageTest foundation has been laid! We will build on this over a few more articles.

BONUS) Scaling Up and Staying There

By default, only one test agent will be created per 100 queued tests in each region. One of the reasons I wanted a private WebPageTest instance was my impatience at queueing up to find out how slow my sites are. If you can throw money at the problem, then change this in your user data:

# This will create one new test agent for every 5 tests queued
# *up to the location max*:
EC2.ScaleFactor=5

# The default max per location is 1, so you need to override it per
# location to enable scale out:
EC2.us-east-1.max=10
EC2.us-west-1.max=20
EC2.eu-west-1.max=15
...
# etc

If there are no more tests queued which that agent can pick up for an hour (or however long you want; an hour is the default) then it will be terminated. This means that you’ll have to wait for the agent to be recreated next time, but it also means you’re not paying for an EC2 instance that you’re not using; these agents should be c4.large, so they’re not free tier eligible.

To avoid this you can alter the user data (you’ll need to stop the WebPageTest server instance before you can edit it) to add in a line for each location you want to keep an agent always available, e.g.:

 EC2.us-east-1.min=1
 EC2.us-west-1.min=1
 EC2.eu-west-1.min=1

It will now scale down to a minimum of one test agent for each of the specified regions.

Debugging

Wondering why your test isn’t starting? Getting impatient? Check /install to make sure you have green everywhere, except at the bottom where no test agents will exist (since they’re spun up on demand):

AWS debugging - install check

You can also check /getTesters.php?f=html to see what is connected:

AWS debugging - no test agents connected

The test agents are only created on demand, so let’s ensure there’s a test actually registered at /getLocations.php?f=html:

AWS debugging - confirming a test has been submitted for a location

You can also log in and check what’s happening on the server:

ssh -i "<your keypair>.pem" [email protected]<the url of your instance>

e.g.

ssh -i "webpagetest.pem" [email protected]

Now let’s check nginx for requests to getwork.php – this is the URL that test agents poll to pick up the next test in their queue:

tail -f /var/log/nginx/access.log | grep getwork

e.g.

$ tail -f /var/log/nginx/access.log | grep getwork
127.0.0.1 - - [25/Apr/2019:20:42:01 +0000] "GET /work/getwork.php HTTP/1.1" 200 5 "-" "Wget/1.15 (linux-gnu)"
127.0.0.1 - - [25/Apr/2019:20:43:01 +0000] "GET /work/getwork.php HTTP/1.1" 200 5 "-" "Wget/1.15 (linux-gnu)"
127.0.0.1 - - [25/Apr/2019:20:44:01 +0000] "GET /work/getwork.php HTTP/1.1" 200 5 "-" "Wget/1.15 (linux-gnu)"
127.0.0.1 - - [25/Apr/2019:20:45:02 +0000] "GET /work/getwork.php HTTP/1.1" 200 5 "-" "Wget/1.15 (linux-gnu)"

Hmm… not a lot happening there. That’s the server pinging itself for some reason; you can tell this since the requests are from "127.0.0.1". Usually after a few more minutes you’ll see:

172.31.20.24 - - [25/Apr/2019:20:45:37 +0000] "GET /work/getwork.php?f=json&shards=1&reboot=1&location=eu-west-1&pc=EC2AMAZ-CO5OM1I&key=4d446cb0d60d76dced79ffa39cf3c1e953db594b&ec2=i-0323a543b58e9ba00&ec2zone=eu-west-1a&version=190221.200223&screenwidth=1920&screenheight=1200&freedisk=7.464&upminutes=9 HTTP/1.1" 200 282 "-" "python-requests/2.21.0"
127.0.0.1 - - [25/Apr/2019:20:46:01 +0000] "GET /work/getwork.php HTTP/1.1" 200 5 "-" "Wget/1.15 (linux-gnu)"
172.31.20.24 - - [25/Apr/2019:20:46:39 +0000] "GET /work/getwork.php?f=json&shards=1&reboot=1&location=eu-west-1_IE11&pc=EC2AMAZ-CO5OM1I&key=4d446cb0d60d76dced79ffa39cf3c1e953db594b&ec2=i-0323a543b58e9ba00&ec2zone=eu-west-1a&version=190221.200223&screenwidth=1920&screenheight=1200&freedisk=7.404&upminutes=10 HTTP/1.1" 200 31 "-" "python-requests/2.21.0"
172.31.20.24 - - [25/Apr/2019:20:46:39 +0000] "GET /work/getwork.php?f=json&shards=1&reboot=1&location=eu-west-1&pc=EC2AMAZ-CO5OM1I&key=4d446cb0d60d76dced79ffa39cf3c1e953db594b&ec2=i-0323a543b58e9ba00&ec2zone=eu-west-1a&version=190221.200223&screenwidth=1920&screenheight=1200&freedisk=7.404&upminutes=10 HTTP/1.1" 200 284 "-" "python-requests/2.21.0"

Notice the timestamps show there was nothing going on for over 3 minutes after I had started to get impatient; AWS EC2 test agents can take a while to wake up and connect, so bear this in mind.

AWS debugging - test agent connected

Interesting point: notice the querystring parameters in the request; the test agent tells the server a lot about itself, even including available disk space.

Things to check:

  • User Data – did you set the correct IAM details for EC2? If not, then the WebPageTest server will not be able to spin up the agents
  • IAM – did you give EC2 permissions to the IAM user you created? If not, same issue as above.
  • WPT logs – you can check for issues in the WebPageTest logs, found in /var/www/webpagetest/www/log/ (not /logs/, which is where the submitted tests are logged, not errors).
  • nginx logs – you can check whether the agents are able to connect to your WebPageTest server at all; see the sketch below.
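
A quick sketch of checking those last two over SSH, assuming the default paths mentioned above:

# most recently written WebPageTest logs
ls -lt /var/www/webpagetest/www/log/ | head

# nginx error log, for agent connection problems
tail -n 50 /var/log/nginx/error.log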

Summary

Hopefully you followed along and successfully set up your own private WebPageTest instance, and can now queue up all the tests you like!

If you have issues, then hit me up on twitter or head over to the WebPageTest forums – seriously, given how many times the same question must be asked in there, the gang are exceptionally patient and helpful.

Good luck!

AI Awesomeness Part Deux! Microsoft Cognitive Services Speaker Identification

The Speaker Recognition API: AI Awesomeness

In a recent article I introduced Microsoft Cognitive Services’ Speaker Verification service, using a recording of a person repeating one of a set of key phrases to verify that user by their voiceprint.

The second main feature of the Speaker Recognition API is Speaker Identification, which can compare a piece of audio to a selection of voiceprints and tell you who was talking! For example, both Barclays and HSBC banks have investigated using passive speaker identification during customer support calls to give an added layer of user identification while you’re chatting to customer support. Or you could prime your profiles against all the speakers in a conference, and have their name automatically appear on screen when they’re talking in a panel discussion.

In this article I’m going to introduce you to the Speaker Identification API from the Cognitive Services and go through an example of using it for fun and profit! Though mainly fun.


AI Awesomeness! Microsoft Cognitive Services Speech Verification

AI Awesomeness: The Speaker Recognition API

Microsoft have been consistently ramping up their AI offerings over the past couple of years under the grouping of “Cognitive Services”. These include some incredible offerings as services for things that would have required a degree in Maths and a deep understanding of Python and R to achieve, such as image recognition, video analysis, speech synthesis, intent analysis, sentiment analysis and so much more.

I think it’s quite incredible to have the capability to ping an endpoint with an image and very quickly get a response containing a text description of the image contents. Surely we live in the future!

In this article I’m going to introduce you to the Cognitive Services, focus on the Speech Recognition ones, and implement a working example for Speaker Verification.


London Bot Framework Meetup Numero Four

On the 16th January I had the pleasure of hosting another London BotFramework meetup at the newly constructed event space in the Just Eat offices.

London BotFramework Meetup #4

They’ve joined three floors with a staircase, so attendees can have beers and pizza upstairs while the presenters sweat with the AV equipment downstairs!

There was a great turnout for this one, including the usual gang and a few new faces too.

Before I get started, in case you haven’t already seen it, you should totally subscribe to the weekly Artificially Intelligent newsletter that has the latest news in AI, Chatbots, and Speech and Image Recognition!
Go sign up for Artificially Intelligent!

Video

Just want to get stuck in? Here’s the video; first half is Jimmy, second half is Jessica.

Sessions

For this meetup we were fortunate enough to have the Engström MVP power team, Jessica and Jimmy, who were in town for NDC London and graced us with their presence.

1) Developing Cross Platform Bots: Jimmy Engström

Jimmy Engstrom - Cross Platform Bots
The first session included several fantastic live demos in which Jimmy created a simple chatbot and, with minimal development effort, got it working on Alexa, Cortana, and Google Home!

(Rendering my own ingenious Alexa BotFramework hack from last year quite useless!)

By day Jimmy Engström is a .NET developer; he does all the fun stuff in his spare time. He and his wife run a code intensive user group (Coding After Work) that focuses on helping participants with code and design problems, and a podcast with the same name. Jimmy can be found tweeting as @apeoholic

2) Conversational UX: Jessica Engström

Jessica Engstrom - Conversational UX
In the second half Jessica gave a great overview of creating a framework to ensure your bot – speech or text based – seems less, well, robotic!

Some great takeaways from this which can easily be applied to your next project.

Being a geek shows in all parts of Jessica Engström’s life, whether it be organizing hackathons, running a user group and a podcast with her husband, game nights (retro or VR/MR) with friends, just catching the latest superhero movie or speaking internationally at conferences.

Her favorite topics are UX/UI and Mixed Reality and other futuristic tech. She’s a Windows Development MVP. Together with her husband she runs a company called “AZM dev” which is focused on HoloLens and Windows development.

Follow her exploits over on twitter as @grytlappen

Summary

The updated event space at Just Eat is great and gives better visibility of the sessions thanks to stadium seating at the back.

The sessions were insightful and overall I think this went well.

Here’s to the next one and don’t forget to join up (and actually attend when you RSVP… ahem…)

Image Placeholders: Do it right or don’t do it at all. Please.

Hello. I’m a grumpy old web dev. I’m still wasting valuable memory on things like the deprecated img element’s lowsrc attribute (bring it back!), the hacks needed to get a website looking acceptable in both Firefox 2.5 and IE5.5 and IE on Mac, and what “cards” and “decks” meant in WAP terminology.

Having this – possibly pointless – information to hand means I am constantly getting frustrated at supposed “breakthrough” approaches to web development and optimisation which seem to be adding complexity for the sake of it, sometimes apparently ignoring existing tech.

What’s more annoying is when a good approach to something is implemented so badly that it reflects poorly on the original concept. I’ve previously written about how abusing something clever like React results in an awful user experience.

Don’t get me wrong, I absolutely love new tech, new approaches, new thinking, new opinions. I’m just sometimes grumpy about it because these new things don’t suit my personal preferences. Hence this article! Wahey!


London Bot Framework Meetup the Third

Welcome to the Third London BotFramework Meetup! Here's the line up

On Wednesday 22nd November 2017 I had the pleasure of running the third London Bot Framework meetup at the lovely Just Eat office in central London. The offices have been recently upgraded, and the new meetup space has a huge 9-screen display and a multiple-mic speaker system, including a fantastic CatchBox throwable mic for ensuring everyone hears the audience questions.

It has been a year since the previous one (whoops) but it was great to see some familiar faces return in the attendees. I had forgotten how much fun it is to emcee an event like this! Maybe next time I’ll be sure to just emcee and not also commit to presenting a session.


The Tesco Mobile Website and The Importance of Device Testing

A constant passion of mine is efficiency: not being wasteful, repeating something until the process has been refined to the most effective, efficient, economical, form of the activity that is realistically achievable.

I’m not saying I always get it right, just that it’s frustrating when I see this not being done. Especially so when the opposite seems to be true, as if people are actively trying to make things as bad as possible.

Which brings me on to the current Tesco mobile website, the subject of this article, and of my dislike of the misuse of a particular form of web technology: client side rendering.

What follows is a mixture of web perf analysis and my own opinions and preferences. And you know what they say about opinions…

Client Side Rendering; What is it good for?

client side rendering frameworks

No, it’s not “absolutely nothing”! Angular, React, Vue; they all have their uses. They do a job, and in the most part they do it well.

The problem comes when developers treat every problem like something that can be solved with client side rendering.


Building your first Botframework based Cortana Skill

Hi. I'm Cortana.

At //BUILD 2017 Microsoft announced support for Cortana Skills and connecting a Cortana Skill into a Bot Framework chatbot; given the number of chatbots out there using Microsoft Bot Framework, this is an extremely exciting move.

In this article I’ll show you how to create your first Cortana Skill from a Bot Framework chatbot and make it talk!

Cortana

If you’re not already familiar with Cortana, this is Microsoft’s “personal assistant” and is available on Windows 10 (version 1607 and above) and a couple of Windows phones (Lumia 950/950 XL), a standalone speaker – like an Amazon Echo – and a plethora of devices that can run the Cortana app, including iOS and Android and plenty of laptops.

Cortana all the things, Derrick.

You’re going to be seeing a lot more of this little box of tricks (“Bot” of tricks? Box of bots?.. hmm…), so you might as well get in on the act right now!


Involved in a startup? Read this!

Having been the VP of Engineering at a startup, I understand a lot of the challenges. The technical ones relating to the solution you think you need to build, more technical ones relating to the solutions the investors want you to build, the development process to best fit a rapidly changing product, team, requirements, and priorities, as well as managing the team through uncertain terrain.

They’re the fun ones. The easy ones! Especially given how talented my dev team was.

The founder had the difficult challenges; define a product that could be a success, iterate that idea based on extensive user testing, and most importantly, ensure there was funding.

Luckily, our founder was as talented at soliciting funds as we were at building epic tech!

If you are involved in a startup, perhaps Just Eat’s Accelerator programme can help with both types of challenge!

Continue reading