Honestly, the hardest part of this by far was getting the audio recorded in the correct format for the APIs to accept! I hacked the wonderful recorderjs from Matt Diamond to set the correct bitrate, channels, etc, and eventually got there through trial and error (and squinting at the minified source code of the Microsoft demo page)!
One of these changes managed to break my existing Speaker Recognition API applications (it’s still in Preview, so don’t be surprised!) by moving Speaker Recognition under the Speech Service project, slightly changing the APIs and their endpoints, and adding new functionality (exciting!).
In this article I’ll show the same web page implementation, but use the updated 2020 Speaker Recognition APIs, and introduce the new Verification API that doesn’t rely on a predefined list of passphrases.
WebPageTest is incredible. It allows us to visit a web page, enter a few values and then produce performance results from any destination around the world. Best of all, you can do this in many different browser configurations, and even on many different real devices.
If you’re doing this a lot, then using that simple web form can become the bottleneck to rapidly iterating on your web performance improvements.
In this article I’ll show you how to easily execute your web performance tests in a simple, repeatable, automated way using the WebPageTest API.
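To give a flavour of what that automation looks like, here is a minimal sketch that builds the request URL for a test run instead of filling in the web form. It assumes the public instance at www.webpagetest.org and the classic `runtest.php` query parameters (`url`, `k` for the API key, `f=json`, `runs`, `location`); the function name `build_runtest_url` and the key value are made up for illustration, and a private instance would use its own hostname.

```python
from urllib.parse import urlencode

# Assumption: the public WebPageTest instance; swap in your own server if you run one.
WPT_SERVER = "https://www.webpagetest.org"


def build_runtest_url(page_url, api_key, location=None, runs=1, server=WPT_SERVER):
    """Build a runtest.php URL that asks WebPageTest to start a test.

    Hypothetical helper for illustration; it only builds the URL and
    does not send the request.
    """
    params = {
        "url": page_url,  # the page to test
        "k": api_key,     # your API key
        "f": "json",      # ask for a JSON response rather than a redirect
        "runs": runs,     # number of test runs
    }
    if location:
        # e.g. "Dulles:Chrome"; valid values depend on the server you use
        params["location"] = location
    return f"{server}/runtest.php?{urlencode(params)}"


if __name__ == "__main__":
    # "MY_API_KEY" is a placeholder, not a real key.
    print(build_runtest_url("https://example.com/", "MY_API_KEY", runs=3))
```

Requesting that URL (with `curl`, or `urllib.request.urlopen`) should kick off a test and return JSON describing where to collect the results, which is what makes the whole thing scriptable and repeatable.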
In a previous article I went through the steps needed to create your own private, autoscaling, WebPageTest setup in Amazon AWS. It wasn’t particularly complicated, but it was quite manual; I don’t like pointing and clicking in a GUI since I can’t easily put it in version control and run it again and again on demand.
Fortunately, whatever you create within AWS can be described using a language called CloudFormation which allows you to define your infrastructure as code.
Unfortunately it’s not easy to understand (in my opinion!) and I could never quite get my head around it, which annoyed me no end.
In this article I’ll show you how to use Terraform to define your private autoscaling WebPageTest setup in easily understandable infrastructure as code, enabling an effortless and reproducible web performance testing setup, which you can then fearlessly edit and improve!
If you have any interest in website performance optimisation, then you have undoubtedly heard of WebPageTest. Being able to test your websites from all over the world, on every major browser, on different operating systems, and even on physical mobile devices, is the greatest ever addition to a web performance engineer’s toolbox.
The sheer scale of WebPageTest, with test agents literally global (even in China!), of course means that queues for the popular locations can get quite long – not great when you’re in the middle of a performance debug session and need answers FAST!
Also since these test agents query your website from the public internet they won’t be able to hit internal systems – for example pre-production or QA, or even just a corporate intranet that isn’t accessible outside of a certain network.
In this article I’ll show you how to set up your very own private instance of WebPageTest in Amazon AWS with autoscaling test agents to keep costs down.
The second main feature of the Speaker Recognition API is Speaker Identification, which can compare a piece of audio to a selection of voiceprints and tell you who was talking! For example, both Barclays and HSBC banks have investigated using passive speaker identification during customer support calls to give an added layer of user identification while you’re chatting to customer support. Or you could prime your profiles against all the speakers in a conference, and have their name automatically appear on screen when they’re talking in a panel discussion.
In this article I’m going to introduce you to the Speaker Identification API from the Cognitive Services and go through an example of using it for fun and profit! Though mainly fun.
Microsoft have been consistently ramping up their AI offerings over the past couple of years under the grouping of “Cognitive Services”. These include some incredible offerings as services for things that would have required a degree in Maths and a deep understanding of Python and R to achieve, such as image recognition, video analysis, speech synthesis, intent analysis, sentiment analysis and so much more.
I think it’s quite incredible to have the capability to ping an endpoint with an image and very quickly get a response containing a text description of the image contents. Surely we live in the future!
In this article I’m going to introduce you to the Cognitive Services, focus on the Speech Recognition ones, and implement a working example for Speaker Verification.
They’ve joined three floors with a staircase, so attendees can have beers and pizza upstairs while the presenters sweat with the AV equipment downstairs!
There was a great turnout for this one, including the usual gang and a few new faces too.
Before I get started, in case you haven’t already seen it, you should totally subscribe to the weekly Artificially Intelligent newsletter that has the latest news in AI, Chatbots, and Speech and Image Recognition! Go sign up for Artificially Intelligent!
Just want to get stuck in? Here’s the video; first half is Jimmy, second half is Jessica.
For this meetup we were fortunate enough to have the Engström MVP power team, Jessica and Jimmy, who were in town for NDC London and graced us with their presence.
1) Developing Cross Platform Bots: Jimmy Engström
The first session included several fantastic live demos where Jimmy creates a simple chat bot and, with minimal development effort, gets it working on Alexa, Cortana, and Google Home!
During the day Jimmy Engström is a .NET developer and he does all the fun stuff during his spare time. He and his wife run a code-intensive user group (Coding After Work) that focuses on helping participants with code and design problems, and a podcast with the same name. Jimmy can be found tweeting as @apeoholic.
2) Conversational UX: Jessica Engström
In the second half Jessica gave a great overview of creating a framework to ensure your bot – speech or text based – seems less, well, robotic!
Some great takeaways from this which can easily be applied to your next project.
Being a geek shows in all parts of Jessica Engström’s life, whether it be organizing hackathons, running a user group and a podcast with her husband, game nights (retro or VR/MR) with friends, just catching the latest superhero movie or speaking internationally at conferences.
Her favorite topics are UX/UI, mixed reality, and other futuristic tech. She’s a Windows Development MVP. Together with her husband she runs a company called “AZM dev” which is focused on HoloLens and Windows development.
Hello. I’m a grumpy old web dev. I’m still wasting valuable memory on things like the deprecated img element’s lowsrc attribute (bring it back!), the hacks needed to get a website looking acceptable in both Firefox 2.5 and IE5.5 and IE on Mac, and what “cards” and “decks” meant in WAP terminology.
Having this – possibly pointless – information to hand means I am constantly getting frustrated at supposed “breakthrough” approaches to web development and optimisation which seem to be adding complexity for the sake of it, sometimes apparently ignoring existing tech.
Don’t get me wrong, I absolutely love new tech, new approaches, new thinking, new opinions. I’m just sometimes grumpy about it because these new things don’t suit my personal preferences. Hence this article! Wahey!