The second main feature of the Speaker Recognition API is Speaker Identification, which can compare a piece of audio to a selection of voiceprints and tell you who was talking! For example, both Barclays and HSBC banks have investigated using passive speaker identification during customer support calls to add a layer of user identification while you’re chatting. Or you could prime your profiles against all the speakers in a conference, and have their names automatically appear on screen when they’re talking in a panel discussion.
In this article I’m going to introduce you to the Speaker Identification API from the Cognitive Services and go through an example of using it for fun and profit! Though mainly fun.
Microsoft have been consistently ramping up their AI offerings over the past couple of years under the grouping of “Cognitive Services”. These include some incredible services for things that would previously have required a degree in Maths and a deep understanding of Python and R to achieve, such as image recognition, video analysis, speech synthesis, intent analysis, sentiment analysis and so much more.
I think it’s quite incredible to have the capability to ping an endpoint with an image and very quickly get a response containing a text description of the image contents. Surely we live in the future!
In this article I’m going to introduce you to the Cognitive Services, focus on the Speech Recognition ones, and implement a working example for Speaker Verification.
They’ve joined three floors with a staircase, so attendees can have beers and pizza upstairs while the presenters sweat with the AV equipment downstairs!
There was a great turnout for this one, including the usual gang and a few new faces too.
Before I get started, in case you haven’t already seen it, you should totally subscribe to the weekly Artificially Intelligent newsletter that has the latest news in AI, Chatbots, and Speech and Image Recognition! Go sign up for Artificially Intelligent!
Just want to get stuck in? Here’s the video; first half is Jimmy, second half is Jessica.
For this meetup we were fortunate enough to have the Engström MVP power team, Jessica and Jimmy, who were in town for NDC London and graced us with their presence.
1) Developing Cross Platform Bots: Jimmy Engström
The first session included several fantastic live demos where Jimmy creates a simple chat bot and, with minimal development effort, gets it working on Alexa, Cortana, and Google Home!
During the day Jimmy Engström is a .NET developer and he does all the fun stuff during his spare time. He and his wife run a code-intensive user group (Coding After Work) that focuses on helping participants with code and design problems, and a podcast with the same name. Jimmy can be found tweeting as @apeoholic.
2) Conversational UX: Jessica Engström
In the second half Jessica gave a great overview of creating a framework to ensure your bot – speech or text based – seems less, well, robotic!
Some great takeaways from this which can easily be applied to your next project.
Being a geek shows in all parts of Jessica Engström’s life, whether it be organizing hackathons, running a user group and a podcast with her husband, game nights (retro or VR/MR) with friends, just catching the latest superhero movie or speaking internationally at conferences.
Her favorite topics are UX/UI, mixed reality, and other futuristic tech. She’s a Windows Development MVP. Together with her husband she runs a company called “AZM dev” which is focused on HoloLens and Windows development.
Hello. I’m a grumpy old web dev. I’m still wasting valuable memory on things like the deprecated img element’s lowsrc attribute (bring it back!), the hacks needed to get a website looking acceptable in both Firefox 2.5 and IE5.5 and IE on Mac, and what “cards” and “decks” meant in WAP terminology.
Having this – possibly pointless – information to hand means I am constantly getting frustrated at supposed “breakthrough” approaches to web development and optimisation which seem to be adding complexity for the sake of it, sometimes apparently ignoring existing tech.
Don’t get me wrong, I absolutely love new tech, new approaches, new thinking, new opinions. I’m just sometimes grumpy about it because these new things don’t suit my personal preferences. Hence this article! Wahey!
On Wednesday 22nd November 2017 I had the pleasure of running the third London Bot Framework meetup at the lovely Just Eat office in central London. The offices have been recently upgraded and the new meetup space has a huge 9 screen display and a multiple mic speaker system, including a fantastic CatchBox throwable mic for ensuring everyone hears the audience questions.
It has been a year since the previous one (whoops) but it was great to see some familiar faces return among the attendees. I had forgotten how much fun it is to emcee an event like this! Maybe next time I’ll be sure to just emcee and not also commit to presenting a session too.
A constant passion of mine is efficiency: not being wasteful, repeating something until the process has been refined to the most effective, efficient, economical form of the activity that is realistically achievable.
I’m not saying I always get it right, just that it’s frustrating when I see this not being done. Especially so when the opposite seems to be true, as if people are actively trying to make things as bad as possible.
Which brings me on to the current Tesco mobile website, the subject of this article, and to my dislike of the misuse of a particular form of web technology: client side rendering.
What follows is a mixture of web perf analysis and my own opinions and preferences. And you know what they say about opinions…
Client Side Rendering; What is it good for?
No, it’s not “absolutely nothing”! Angular, React, Vue; they all have their uses. They do a job, and for the most part they do it well.
The problem comes when developers treat every problem like something that can be solved with client side rendering.
At //BUILD 2017 Microsoft announced support for Cortana Skills and connecting a Cortana Skill into a Bot Framework chatbot; given the number of chatbots out there using Microsoft Bot Framework, this is an extremely exciting move.
In this article I’ll show you how to create your first Cortana Skill from a Bot Framework chatbot and make it talk!
If you’re not already familiar with Cortana, this is Microsoft’s “personal assistant” and is available on Windows 10 (version 1607 and above) and a couple of Windows phones (Lumia 950/950 XL), a standalone speaker – like an Amazon Echo – and a plethora of devices that can run the Cortana app, including iOS and Android and plenty of laptops.
You’re going to be seeing a lot more of this little box of tricks (“Bot” of tricks? Box of bots?.. hmm…), so you might as well get in on the act right now!
Having been the VP of Engineering at a startup, I understand a lot of the challenges. The technical ones relating to the solution you think you need to build, more technical ones relating to the solutions the investors want you to build, the development process to best fit a rapidly changing product, team, requirements, and priorities, as well as managing the team through uncertain terrain.
They’re the fun ones. The easy ones! Especially given how talented my dev team was.
The founder had the difficult challenges: define a product that could be a success, iterate that idea based on extensive user testing, and most importantly, ensure there was funding.
Luckily, our founder was as talented at soliciting funds as we were at building epic tech!
If you are involved in a startup, perhaps Just Eat’s Accelerator programme can help with both types of challenge!
If you’re getting a “403” HTTP error when attempting to receive an image sent to your Skype bot, and the previous use of message.ServiceUrl to create a ConnectorClient didn’t work, try this more verbose version which explicitly sets the authorization header:
if (image.ContentUrl != null)
{
    using (var connectorClient = new ConnectorClient(new Uri(message.ServiceUrl)))
    {
        var token =
            await (connectorClient.Credentials as MicrosoftAppCredentials)
                .GetTokenAsync();
        var uri = new Uri(image.ContentUrl);
        using (var httpClient = new HttpClient())
        {
            // Assumed host check: only send the token to Skype-hosted attachments over HTTPS
            if (uri.Host.EndsWith("skype.com")
                && uri.Scheme == Uri.UriSchemeHttps)
            {
                httpClient.DefaultRequestHeaders.Authorization =
                    new AuthenticationHeaderValue("Bearer", token);
            }
            // Get the image in a byte variable
            data = await httpClient.GetByteArrayAsync(uri);
        }
    }
}
Whatever your social media tool of choice is these days, it’s almost guaranteed to be filled with images and their associated hashtags #sorrynotsorry #lovelife #sunnyday
Sometimes coming up with those tags is more work than perfectly framing your latest #flatlay shot.
In the age of amazing image recognition tech, it must be possible to create something that can help us out and give us more time to move that light source around to cast the right shadow over your meal.
Turns out, it is possible! Yay! (of course..)
In this article I’ll show you how to automatically generate image hashtags via a chatbot using Microsoft’s Computer Vision API.
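To give a flavour of the idea, here is a minimal sketch of the last step: turning a tagging response into hashtags. The response shape isn't shown in this excerpt, so the JSON layout (a `tags` array of `name`/`confidence` pairs) is an assumption, and `HashtagSketch`, `ToHashtags` and `minConfidence` are hypothetical names used purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class HashtagSketch
{
    // Hypothetical helper: converts an image-tagging JSON response into hashtags.
    // Assumes a shape like {"tags":[{"name":"food","confidence":0.98}, ...]}.
    public static List<string> ToHashtags(string json, double minConfidence = 0.5)
    {
        using var doc = JsonDocument.Parse(json);
        return doc.RootElement.GetProperty("tags")
            .EnumerateArray()
            // Drop low-confidence guesses so the caption isn't full of noise
            .Where(t => t.GetProperty("confidence").GetDouble() >= minConfidence)
            // Hashtags can't contain spaces, so "coffee cup" becomes "#coffeecup"
            .Select(t => "#" + (t.GetProperty("name").GetString() ?? "").Replace(" ", ""))
            .Distinct()
            .ToList(); // materialise before the JsonDocument is disposed
    }
}
```

Called with a sample response containing "food" (0.98), "coffee cup" (0.81) and "indoor" (0.32), this would keep the first two and return `#food` and `#coffeecup`. The 0.5 threshold is an arbitrary starting point rather than a recommended value.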