A while back I showed how to use the Microsoft Speaker Recognition APIs in the simplest way I could think of: using a web page to record audio and call the various APIs to set up and test Speaker Verification and Speaker Identification.
Honestly, the hardest part of this by far was getting the audio recorded in the correct format for the APIs to accept! I hacked the wonderful recorderjs from Matt Diamond to set the correct bitrate, channels, etc, and eventually got there through trial and error (and squinting at the minified source code of the Microsoft demo page)!
In the run-up to //Build this year, there have been a lot of changes in the Microsoft AI space.
One of these changes managed to break my existing Speaker Recognition API applications (it’s still in Preview, so don’t be surprised!) by moving Speaker Recognition under the Speech Service project, slightly changing the APIs and their endpoints, and adding new functionality (exciting!).
In this article I’ll show the same web page implementation, but use the updated 2020 Speaker Recognition APIs, and introduce the new Verification API that doesn’t rely on a predefined list of passphrases.
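To make the "correct format" point concrete, here is a minimal sketch of packing audio samples into a WAV file. It assumes the API expects 16 kHz, 16-bit, mono PCM; treat those numbers as an assumption and check them against the current service documentation. The helper names (`encode_wav`, `describe_wav`, `one_second_tone`) are made up for this illustration, and the original implementation did this in the browser with recorderjs rather than in Python.

```python
import io
import math
import struct
import wave

# Assumed target format: 16 kHz, 16-bit, mono PCM (verify against the service docs).
SAMPLE_RATE_HZ = 16_000
CHANNELS = 1
SAMPLE_WIDTH_BYTES = 2


def encode_wav(samples):
    """Pack float samples in [-1.0, 1.0] into an in-memory 16-bit mono WAV."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in clipped)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(frames)
    return buffer.getvalue()


def describe_wav(wav_bytes):
    """Return (channels, sample width in bytes, sample rate, frame count)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return (
            wav_file.getnchannels(),
            wav_file.getsampwidth(),
            wav_file.getframerate(),
            wav_file.getnframes(),
        )


def one_second_tone(frequency_hz=440.0):
    """Generate one second of a sine tone as stand-in audio for the sketch."""
    return [
        math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE_HZ)
        for n in range(SAMPLE_RATE_HZ)
    ]


wav_bytes = encode_wav(one_second_tone())
print(describe_wav(wav_bytes))
```

Reading the header back with `describe_wav` is a cheap sanity check before sending anything to an endpoint: if the channel count, sample width or sample rate is off, the problem is in the recording step and not in the API call.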
In a recent article I introduced Microsoft Cognitive Services’ Speaker Verification service, using a recording of a person repeating one of a set of key phrases to verify that user by their voiceprint.
The second main feature of the Speaker Recognition API is Speaker Identification, which can compare a piece of audio to a selection of voiceprints and tell you who was talking! For example, both Barclays and HSBC banks have investigated using passive speaker identification during customer support calls to give an added layer of user identification while you’re chatting to customer support. Or you could prime your profiles against all the speakers in a conference, and have their name automatically appear on screen when they’re talking in a panel discussion.
In this article I’m going to introduce you to the Speaker Identification API from the Cognitive Services and go through an example of using it for fun and profit! Though mainly fun.
Microsoft have been consistently ramping up their AI offerings over the past couple of years under the grouping of “Cognitive Services”. These include some incredible services for things that would previously have required a degree in Maths and a deep understanding of Python and R to achieve, such as image recognition, video analysis, speech synthesis, intent analysis, sentiment analysis and so much more.
I think it’s quite incredible to have the capability to ping an endpoint with an image and very quickly get a response containing a text description of the image contents. Surely we live in the future!
In this article I’m going to introduce you to the Cognitive Services, focus on the Speech Recognition ones, and implement a working example for Speaker Verification.
On Wednesday 22nd November 2017 I had the pleasure of running the third London Bot Framework meetup at the lovely Just Eat office in central London. The offices have been recently upgraded and the new meetup space has a huge 9-screen display and a multiple-mic speaker system, including a fantastic CatchBox throwable mic for ensuring everyone hears the audience questions.
It has been a year since the previous one (whoops) but it was great to see some familiar faces return among the attendees. I had forgotten how much fun it is to emcee an event like this! Maybe next time I’ll be sure to just emcee and not also commit to presenting a session.
Whatever your social media tool of choice is these days, it’s almost guaranteed to be filled with images and their associated hashtags #sorrynotsorry #lovelife #sunnyday
Sometimes coming up with those tags is more work than perfectly framing your latest #flatlay shot.
In the age of amazing image recognition tech, it must be possible to create something that can help us out and give us more time to move that light source around to cast the right shadow over our meal.
Turns out, it is possible! Yay! (of course..)
In this article I’ll show you how to automatically generate image hashtags via a chatbot using Microsoft’s Computer Vision API.
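As a taste of the idea, here is a minimal sketch of the last step: turning image tags into hashtags. It assumes the image description response carries a `description.tags` list of plain strings; the `sample_response` below is hand-written for illustration, not real API output, and the function names are hypothetical. The HTTP call and the chatbot wiring are left out.

```python
import re


def to_hashtag(tag):
    """Lowercase a tag and strip anything that is not a letter or digit."""
    cleaned = re.sub(r"[^0-9a-z]", "", tag.lower())
    return f"#{cleaned}" if cleaned else None


def hashtags_from_description(response, limit=5):
    """Build an ordered, de-duplicated hashtag list from a description response."""
    tags = response.get("description", {}).get("tags", [])
    hashtags = []
    for tag in tags:
        hashtag = to_hashtag(tag)
        if hashtag and hashtag not in hashtags:
            hashtags.append(hashtag)
    return hashtags[:limit]


# Hand-written sample in the assumed response shape, not real API output.
sample_response = {
    "description": {
        "tags": ["food", "plate", "table", "Coffee Cup", "food"],
        "captions": [{"text": "a plate of food on a table", "confidence": 0.9}],
    }
}

print(" ".join(hashtags_from_description(sample_response)))
```

Keeping the tag clean-up separate from the API call means the chatbot can cap or filter the hashtags however it likes without touching the request code.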