A Step by Step Guide to setting up an AutoScaling Private WebPageTest instance

Update: November 2021
This article is out of date; there are no longer WebPageTest Server AMIs, so you need to install WPT on a base OS. The updated article is here: WebPageTest Private Instance: 2021 Edition.

If you have any interest in website performance optimisation, then you have undoubtedly heard of WebPageTest. Being able to test your websites from all over the world, on every major browser, on different operating systems, and even on physical mobile devices, is the greatest ever addition to a web performance engineer’s toolbox.

One small shelf of Pat Meenan's epic WebPageTest device lab

The sheer scale of WebPageTest, with test agents literally all over the globe (even in China!), of course means that queues for the popular locations can get quite long – not great when you’re in the middle of a performance debugging session and need answers FAST!

Also since these test agents query your website from the public internet they won’t be able to hit internal systems – for example pre-production or QA, or even just a corporate intranet that isn’t accessible outside of a certain network.

In this article I’ll show you how to set up your very own private instance of WebPageTest in Amazon AWS, with autoscaling test agents to keep costs down.

Continue reading

Image Placeholders: Do it right or don’t do it at all. Please.

Hello. I’m a grumpy old web dev. I’m still wasting valuable memory on things like the deprecated img element’s lowsrc attribute (bring it back!), the hacks needed to get a website looking acceptable in both Firefox 2.5 and IE5.5 and IE on Mac, and what “cards” and “decks” meant in WAP terminology.

Having this – possibly pointless – information to hand means I am constantly getting frustrated at supposed “breakthrough” approaches to web development and optimisation which seem to be adding complexity for the sake of it, sometimes apparently ignoring existing tech.

What’s more annoying is when a good approach to something is implemented so badly that it reflects poorly on the original concept. I’ve previously written about how abusing something clever like React results in an awful user experience.

Don’t get me wrong, I absolutely love new tech, new approaches, new thinking, new opinions. I’m just sometimes grumpy about it because these new things don’t suit my personal preferences. Hence this article! Wahey!

Continue reading

The Tesco Mobile Website and The Importance of Device Testing

A constant passion of mine is efficiency: not being wasteful, repeating something until the process has been refined into the most effective, efficient, economical form of the activity that is realistically achievable.

I’m not saying I always get it right, just that it’s frustrating when I see this not being done. Especially so when the opposite seems to be true, as if people are actively trying to make things as bad as possible.

Which brings me on to the current Tesco mobile website, the subject of this article, and of my dislike of the misuse of a particular form of web technology: client side rendering.

What follows is a mixture of web perf analysis and my own opinions and preferences. And you know what they say about opinions…

Client Side Rendering; What is it good for?

client side rendering frameworks

No, it’s not “absolutely nothing”! Angular, React, Vue; they all have their uses. They do a job, and for the most part they do it well.

The problem comes when developers treat every problem like something that can be solved with client side rendering.

Continue reading

Introduction to GruntJS for Visual Studio

As a developer, there are often tasks that we need to automate to make our daily lives easier. You may have heard about GruntJS or even Gulp before.

In this article, I am going to run through a quick intro to successfully using gruntjs to automate your build process within the usual IDE of .Net developers: Visual Studio.

gruntjs (Grunt)

gruntjs logo

What is it?

Gruntjs is a JavaScript task runner; one of a few that exist, but one of only two to become mainstream – the other being Gulp. Both do pretty similar things, both have great support and great communities.

The key difference: gulp tasks are defined in code, whereas grunt tasks are defined in configuration.

It’s been going on for a while – check this first commit from 2011!

What does it do?

A JavaScript task runner allows you to define a set of tasks, subtasks, and dependent tasks, and execute these tasks at a time of your choosing; on demand, before or after a specific event, or any time a file changes, for example.

These tasks range from CSS and JS minification and combination, image optimisation, HTML minification, and HTML generation, through to code redaction and test running. A large number of the available plugins are in fact grunt wrappers around existing executables, meaning you can now run those programs from a chain of tasks; for example: LESS, WebSocket, ADB, Jira, XCode, SASS, RoboCopy.

The list goes on and on – and you can even add your own to it!

How does it work?

GruntJS is a nodejs module, and as such is installed via npm (the node package manager), which also means you need both nodejs and npm installed to use Grunt.

nodejs logo npm logo

By installing it globally or just into your project directory you’re able to execute it from the command line (or other places) and it will check the current directory for a specific file called “gruntfile.js”. It is in this gruntfile.js that you will specify and configure your tasks and the order in which you would like them to run. Each of those tasks is also a nodejs module, so will also need to be installed via npm and referenced in the package.json file.

The package.json is not a grunt-specific file, but an npm-specific file; when you clone a repo containing grunt tasks, you must first ensure all development dependencies are met by running npm install, which installs the modules referenced within this package.json file. It can also be used by grunt to pull in project settings, configuration, and data for use within the various grunt tasks; for example, adding a copyright to each file with your name and the current date.

Using grunt – WITHOUT Visual Studio

Sounds AMAAAAYYZING, right? So how can you get your grubby mitts on it? I’ve mentioned a few dependencies before, but here they all are:

  • nodejs – grunt is a nodejs module, so needs to run on nodejs.
  • npm – grunt is a nodejs module and depends on many other nodejs packages; sort of makes sense that you’d need a nodejs package manager for this job, eh?
  • grunt-cli – the grunt command line tool, which is needed to actually run grunt tasks.
  • package.json – the package dependencies and project information, for npm to know what to install.
  • gruntfile.js – the guts of the operation; where we configure the tasks we want to run and when.

First things first

You need to install nodejs and npm (npm comes bundled with the nodejs installer).

grunt-cli

Now you’ve got node and npm, open a terminal and fire off npm install -g grunt-cli to install the grunt command line tool globally. (You could skip this step and just create a package.json with grunt as a dependency and then run npm install in that directory.)
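
A quick sanity check before moving on – if each of these commands prints a version number, everything is installed correctly:

node --version
npm --version
grunt --version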

Configuration

The package.json contains information about your project and its various package dependencies. Think of it as a slice of NuGet’s packages.config and a sprinkle of your project’s .sln file; it contains project-specific data, such as the name, author’s name, repo location, and description, as well as defining the modules on which your project depends in order to build and run.

Create a package.json file with some simple configuration, such as that used on the gruntjs site:

{
  "name": "my-project-name",
  "version": "0.1.0"
}

Or you could run npm init, but that asks for lots more info than we really need here, so the generated package.json is a bit bloated:

npm init
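
If you just want to skip the questions, npm init --yes (or -y) accepts all the defaults, and you can then trim the generated file down to only the fields you need:

npm init --yes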

So, what’s going on in that first package.json? We’re setting a name for our project and a version. Now we could just add in a few more lines and run npm install to go and get those dependencies for us, for example:

{
  "name": "my-project-name",
  "version": "0.1.0",
  "devDependencies": {
    "grunt": "~0.4.5",
    "grunt-contrib-jshint": "~0.10.0",
    "grunt-contrib-nodeunit": "~0.4.1",
    "grunt-contrib-uglify": "~0.5.0"
  }
}

Here we’re saying what we need to run our project; if you’re writing a nodejs or iojs project then you’ll have lots of your own stuff referenced in here, however for us .Net peeps we just have things our grunt tasks need.

Within devDependencies we’re firstly saying we use grunt, and we want at least version 0.4.5; the tilde versioning means we want version 0.4.5 or above, up to but not including 0.5.0.

Then we’re saying this project also needs jshint, nodeunit, and uglify.
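
For reference, here’s how the common version range prefixes compare (the caret crops up in a later example); the comments are just for illustration, since JSON itself doesn’t allow them:

"grunt": "0.4.5"   // exactly 0.4.5
"grunt": "~0.4.5"  // >=0.4.5 and <0.5.0 (patch updates only)
"grunt": "^0.4.5"  // for 0.x versions, also >=0.4.5 and <0.5.0; ^1.2.3 would allow anything <2.0.0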

A note on packages: “grunt-contrib” packages are those verified and officially maintained by the grunt team.

But what if we don’t want to write these in by hand, check the correct versions on the npm website, and then run npm install each time to actually pull them down? There’s another way of doing this.

Rewind back to when we just had this:

{
  "name": "my-project-name",
  "version": "0.1.0"
}

Now if you were to run the following commands, you would have the same resulting package.json as before:

npm install grunt --save-dev
npm install grunt-contrib-jshint --save-dev
npm install grunt-contrib-nodeunit --save-dev
npm install grunt-contrib-uglify --save-dev

However, this time they’re already installed and their correct versions are already set in your package.json file.

Below is an example package.json for an autogenerated flat file website:

{
  "name": "webperf",
  "description": "Website collecting articles and interviews relating to web performance",
  "version": "0.1.0",
  "devDependencies": {
    "grunt": "^0.4.5",
    "grunt-directory-to-html": "^0.2.0",
    "grunt-markdown": "^0.7.0"
  }
}

In the example here we’re starting out by just depending on grunt itself, and two other modules; one that creates an html list from a directory structure, and one that generates html from markdown files.

Last step – gruntfile.js

Now you can create a gruntfile.js and paste in something like the example from the gruntjs site:

module.exports = function(grunt) {
  // Project configuration.
  grunt.initConfig({
    pkg: grunt.file.readJSON('package.json'),
    uglify: {
      options: {
        banner: '/*! <%= pkg.name %> <%= grunt.template.today("yyyy-mm-dd") %> */\n'
      },
      build: {
        src: 'src/<%= pkg.name %>.js',
        dest: 'build/<%= pkg.name %>.min.js'
      }
    }
  });

  // Load the plugin that provides the "uglify" task.
  grunt.loadNpmTasks('grunt-contrib-uglify');

  // Default task(s).
  grunt.registerTask('default', ['uglify']);

};

What’s happening in here then? The standard nodejs module.exports pattern is used to expose your content as a function. Then we read in the package.json file and make the resulting object available as the pkg property of the grunt config.

Then it gets interesting; we configure the grunt-contrib-uglify npm package with the uglify task, setting a banner for the minified js file to contain the package name – as specified in package.json – and today’s date, then specifying a “target” called build with source and destination files.

After the configuration is specified, we tell grunt to load the grunt-contrib-uglify npm module (which must already be installed locally or globally for this to work), and then register a default task that calls the uglify task.

BINGO. The javascript file named after your project in the “src” directory will get minified, have the banner added, and the result dumped into the project’s “build” directory any time we run grunt.
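
If you want the “any time a file changes” behaviour mentioned earlier, the official grunt-contrib-watch plugin covers it. Below is a minimal sketch – assuming you’ve also run npm install grunt-contrib-watch --save-dev – extending the gruntfile above so that a change to any js file under “src” re-runs uglify:

module.exports = function(grunt) {
  grunt.initConfig({
    pkg: grunt.file.readJSON('package.json'),
    uglify: {
      // same uglify configuration as above
      build: {
        src: 'src/<%= pkg.name %>.js',
        dest: 'build/<%= pkg.name %>.min.js'
      }
    },
    watch: {
      scripts: {
        files: ['src/**/*.js'], // watch every js file under src
        tasks: ['uglify']       // re-run uglify when one changes
      }
    }
  });

  grunt.loadNpmTasks('grunt-contrib-uglify');
  grunt.loadNpmTasks('grunt-contrib-watch');

  grunt.registerTask('default', ['uglify']);
};

Kick it off with grunt watch and it will sit there re-minifying on every save until you stop it.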

Example gruntfile.js for an autogenerated website

module.exports = function(grunt) {

  grunt.initConfig({
    markdown: {
      // convert every markdown file in _drafts into an html file in articles/
      all: {
        files: [
          {
            cwd: '_drafts',
            expand: true,
            src: '*.md',
            dest: 'articles/',
            ext: '.html'
          }
        ]
      },
      options: {
        template: 'templates/article.html',
        // pull the article title from an "@-title:" line within the markdown source
        preCompile: function(src, context) {
          var matcher = src.match(/@-title:\s?([^@:\n]+)\n/i);
          context.title = matcher && matcher.length > 1 && matcher[1];
        },
        markdownOptions: {
          gfm: false,
          highlight: 'auto'
        }
      }
    },
    to_html: {
      // generate a listing page from the articles directory using a handlebars template
      build: {
        options: {
          useFileNameAsTitle: true,
          rootDirectory: 'articles',
          template: grunt.file.read('templates/listing.hbs'),
          templatingLanguage: 'handlebars'
        },
        files: {
          'articles.html': 'articles/*.html'
        }
      }
    }
  });

  grunt.loadNpmTasks('grunt-markdown');
  grunt.loadNpmTasks('grunt-directory-to-html');

  grunt.registerTask('default', ['markdown', 'to_html']);

};

This one will convert all markdown files in a _drafts directory to html based on a template html file (grunt-markdown), then create a listing page based on the directory structure and a template handlebars file (grunt-directory-to-html).
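
As with any gruntfile, you can run the whole chain or just a single task by name from the command line:

grunt            (runs the default task list: markdown, then to_html)
grunt markdown   (runs just the markdown task)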

Using grunt – WITH Visual Studio

Prerequisites

You still need nodejs, npm, and grunt-cli, so make sure you’ve installed nodejs and run npm install -g grunt-cli.

To use task runners within Visual Studio you first need to have a version that supports them. If you already have VS 2015 you can skip these install sections.

Visual Studio 2013.3 or above

If you have VS 2013 then you need to make sure you have at least RC3 or above (free upgrades!). Go and install it from your pals at Microsoft.

This is a lengthy process, so remember to come back here once you’ve done it!

TRX Task Runner Explorer Extension

This gives your Visual Studio an extra window that displays all available tasks, as defined within your grunt or gulp file. So go and install that from the Visual Studio Gallery.

NPM Intellisense Extension

You can get extra powers for yourself if you install the intellisense extension, which makes using grunt in Visual Studio much easier. Go get it from the Visual Studio Gallery.

Grunt Launcher Extension

Even more extra powers; right-click on certain files in your solution to launch grunt, gulp, bower, and npm commands using the Grunt Launcher Extension.

Tasks Configuration

Create a new web project, or open an existing one, and add a package.json and a gruntfile.js.

Example package.json

{
  "name": "grunt-demo",
  "version": "0.1.0",
  "devDependencies": {
    "grunt": "~0.4.5",
    "grunt-contrib-uglify": "~0.5.0"
  }
}

Example gruntfile.js

module.exports = function(grunt) {
  // Project configuration.
  grunt.initConfig({
    pkg: grunt.file.readJSON('package.json'),
    uglify: {
      options: {
        banner: '/*! <%= pkg.name %> <%= grunt.template.today("yyyy-mm-dd") %> */\n'
      },
      build: {
        src: 'Scripts/bootstrap.js',
        dest: 'Scripts/build/bootstrap.min.js'
      }
    }
  });

  // Load the plugin that provides the "uglify" task.
  grunt.loadNpmTasks('grunt-contrib-uglify');

  // Default task(s).
  grunt.registerTask('default', ['uglify']);

};

Using The Task Runner Extension in Visual Studio

Up until this point the difference between working with and without Visual Studio has been non-existent; but here’s where it gets pretty cool.

If you installed everything mentioned above, then you’ll notice some cool stuff happening when you open a project that already contains a package.json.

The Grunt Launcher extension will “do a nuget” and attempt to restore your “devDependencies” npm packages when you open your project:

npm package restore

And the same extension will give you a right click option to force an npm install:

npm package restore - menu

This one also allows you to kick off your grunt tasks straight from a context menu on the gruntfile itself:

grunt launcher

Assuming you installed the intellisense extension, you now get things like auto-suggestion for npm package versions, along with handy tooltip explainers for what the version syntax actually means:

npm intellisense

If you’d like some more power over when the grunt tasks run, this is where the Task Runner Explorer extension comes in to play:

task runner

This gives you a persistent window that lists your available grunt tasks and lets you kick any one of them off with a double click, showing the results in an output window.

task runner explorer output

Which is the equivalent of running the same grunt tasks outside of Visual Studio.

What’s really quite cool with this extension is being able to configure when these tasks run automatically; your options are:

  • Before Build
  • After Build
  • Clean
  • Solution Open

task runner explorer

Which means you can ensure that when you hit F5 in Visual Studio, all of your tasks will run to generate the output required to render your website before it is launched in a browser; or that when you execute a “Clean” on the solution, a task fires off to delete some temp directories or the output from the last task execution.
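
Under the bonnet, the Task Runner Explorer records these bindings as a specially formatted comment at the top of your gruntfile.js – something along the lines of the example below (the exact syntax may vary between extension versions, and the ‘clean’ task here is a hypothetical one you’d have defined yourself):

/// <binding BeforeBuild='default' Clean='clean' />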

Summary

Grunt and Gulp are fantastic tools to help you bring automation into your projects; and now they’re supported in Visual Studio, so even you .Net developers have no excuse not to play around with them!

Have a go with the tools above, and let me know how you get on!

Top 5 Biggest Queries of 2014

During this year I became slightly addicted to the fantastic community site bigqueri.es; a site where people playing around with the data available in Google’s BigQuery can share their queries and get help, comments, and validation of their ideas.

A query can start a conversation which can end up refining or even changing the direction of the initial idea.

BigQuery contains a few different publicly available large datasets for you to query, including all of Wikipedia, Shakespeare’s works, and GitHub metadata.

HTTP Archive

The main use of bigqueri.es is for discussing the contents of the HTTP Archive (there are a few about other things, however) and that’s where I’ve been focussing my nerdiness.

What follows is a summary of the five most popular HTTP Archive queries created this year, by page views. I’m hoping that you find them as fascinating as I do, and perhaps even sign up at bigqueri.es to continue the conversation, or sign up for BigQuery and submit your own query for review.

Here they are, in reverse order:

5) 3rd party content: Who is guarding the cache? (1.5k views)

http://bigqueri.es/t/3rd-party-content-who-is-guarding-the-cache/182

Doug Sillars (@dougsillars) riffs on a previous query by Ilya Grigorik to investigate what percentage of requests come from 3rd parties, the total size of these requests (in MB), and how much of that content is cacheable.

I’ve run what I believe to be the same query over the entire year of 2014 and you can see the results below:

We can see that there’s a generally good show from the 3rd parties, with June and October being particularly highly cacheable; something appears to have happened in September though, as there’s a sudden drop-off after 80 of the top 100 sites, whereas in the other months we see that same drop-off after 90 sites.

4) Analyzing HTML, CSS, and JavaScript response bodies (2.4k views)

http://bigqueri.es/t/analyzing-html-css-and-javascript-response-bodies/442

Ilya Grigorik (@igrigorik) gets stuck into a recent addition to the HTTP Archive (in fact, it only exists for ONE run due to the sheer volume of data); the response bodies! Mental.

By searching within the response bodies themselves – such as raw HTML, Javascript, and CSS – you’re able to look inside the inner workings of each site. The field is just text and can be interrogated by applying regular expressions or “contains” type functions.

The query he references (actually created as an example query by Steve Souders (@souders)) examines the asynchronous vs synchronous usages of the Google Analytics tracking script, which tells us that there are 80,577 async uses, 44 sync uses, and a bizarre 6,707 uses that fall into neither category.

I’m working on several queries myself using the response body data; it’s amazing that this is even available for querying! Do be aware that if you’re using BigQuery for this you will very quickly use up your free usage! Try downloading the mysql archive if you’re serious.

3) Sites that deliver Images using gzip/deflate encoding (4.4k views)

http://bigqueri.es/t/sites-that-deliver-images-using-gzip-deflate-encoding/220

Paddy Ganti (@paddy_ganti) starts a great conversation by attempting to discover which domains are disobeying a guideline for reducing payload: don’t gzip images or other binary files, since their own compression algorithms will do a better job than gzip/deflate, which might even result in a larger file. Yikes!

The query looks into the response’s content type, checking that it’s an image, and compares this with the content encoding, checking if compression has been used.

There are over 19k compressed image responses coming from Akamai alone in the latest dataset:

Although you can see the results suggest a significant number of requests are gzip or deflate encoded images, the great discussion that follows sheds some light on the reasons for this.

2) Are Popular Websites Faster? (4.9k views)

http://bigqueri.es/t/are-popular-websites-faster/162

Doug Sillars (@dougsillars) has another popular query where he looks into the speed index of the most popular websites (using the “rank” column).

We’re all aware of the guideline to keep page load as close to a maximum of 2 seconds as possible, so do the “big sites” manage that better than the others?

If we graph the top 1000 sites – split into top 100, 100-500, and 500-1000 – and get a count of sites per Speed Index (displayed as a single whole number along the x-axis; e.g. 2 = SI 2000), we can see the relative performance of each group.

Top 100

The top 100 sites have between 25 and 30 sites with Speed Indexes around 2000-3000, then drop off sharply.

Top 100-500

Although the next 400 have over 60 sites each with a Speed Index of 2000 or 4000, and almost 90 sites with 3000, their drop-off is smoother and there’s a long tail out to 25000.

Top 500-1000

The next 500 have a similar pattern but a much less dramatic drop-off, then a gentle tail out to around 25000 again.

This shows that although there are sites in each range which achieve extremely good performance, the distribution of the remainder gets more and more spread out. Essentially, the percentage of each range that achieves good performance shrinks.

The post is very detailed with lots of great visualisations of the data, leading to some interesting conclusions.

1) M dot or RWD. Which is faster? (7.6k views)

http://bigqueri.es/t/m-dot-or-rwd-which-is-faster/296

The most popular query by quite a way is another one from Doug Sillars (@dougsillars).

The key question he investigates is whether a website which redirects from the main domain to a mobile-specific domain performs better than a single responsive website.

He identifies those sites which may be mobile-specific using CASE clauses like these (excerpted from the full query linked above):

 WHEN HOST(requests.url)  LIKE 'm.%' then "M dot"
 WHEN HOST(requests.url)  LIKE 't.%' then "T dot"
 WHEN HOST(requests.url)  LIKE '%.mobi%' then "dot mobi"
 WHEN HOST(requests.url)  LIKE 'mobile%' then "mobile"
 WHEN HOST(requests.url)  LIKE 'iphone%' then "iphone"
 WHEN HOST(requests.url)  LIKE 'wap%' then "wap"
 WHEN HOST(requests.url)  LIKE 'mobil%' then "mobil"
 WHEN HOST(requests.url)  LIKE 'movil%' then "movil"
 WHEN HOST(requests.url)  LIKE 'touch%' then "touch"

The key is this clause, used to check when the HTML is being served:

 WHERE requests.firstHtml=true

These are then compared to sites whose URLs don’t significantly change (such as merely adding or removing “www.”).

The fascinating article goes into a heap of detail and ultimately reaches the conclusion that responsively designed websites appear to outperform mobile-specific websites. Obviously, this is only true for well-written sites, because it is still easy to make a complete mess of a RWD site!

bigqueri.es

Hopefully this has given you cause to head over to the http://bigqueri.es website, check out what other people are looking into, and possibly help out or try out some web performance detective work of your own over the holiday season.