speaker identification | Robin Osborne

A while back I showed how to use the Microsoft Speaker Recognition APIs in the simplest way I could think of: a web page that records audio and calls the various APIs to set up and test Speaker Verification and Speaker Identification.

Honestly, the hardest part by far was getting the audio recorded in a format the APIs would accept! I hacked the wonderful recorderjs from Matt Diamond to set the correct bitrate, channels, etc., and eventually got there through trial and error (and by squinting at the minified source code of the Microsoft demo page).
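
To give a flavour of the format wrangling involved, here is a minimal sketch of writing mono audio samples into a WAV container. It is not the actual change made to recorderjs. It assumes the target is 16 kHz, 16-bit, mono PCM, and that the samples have already been resampled to 16 kHz (browsers usually record at 44.1 or 48 kHz). The helper name `encodeWavPcm16` is hypothetical.

```typescript
// Sketch only: wrap mono float samples (range -1..1) in a 16-bit PCM WAV file.
// Assumption: `samples` are already at `sampleRate` (no resampling is done here).
function encodeWavPcm16(samples: Float32Array, sampleRate: number = 16000): ArrayBuffer {
  const numChannels = 1;    // mono
  const bytesPerSample = 2; // 16-bit
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize); // 44-byte header + audio data
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  // RIFF/WAVE header; all multi-byte fields are little-endian.
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true);      // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);                // size of the fmt chunk
  view.setUint16(20, 1, true);                 // audio format 1 = uncompressed PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * numChannels * bytesPerSample, true); // byte rate
  view.setUint16(32, numChannels * bytesPerSample, true);              // block align
  view.setUint16(34, 8 * bytesPerSample, true);                        // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Clamp each float sample and scale it into the signed 16-bit range.
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(44 + i * bytesPerSample, scaled, true);
  }
  return buffer;
}
```

In a browser, the returned buffer could be wrapped as `new Blob([encodeWavPcm16(samples)], { type: "audio/wav" })` before being sent anywhere; check the current API documentation for the exact format it requires.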

In the run-up to //Build this year, there have been a lot of changes in the Microsoft AI space.

One of these changes broke my existing Speaker Recognition API applications (the service is still in Preview, so don’t be surprised!): Speaker Recognition moved under the Speech Service project, the APIs and their endpoints changed slightly, and new functionality was added (exciting!).

In this article I’ll show the same web page implementation, updated to use the 2020 Speaker Recognition APIs, and introduce the new Verification API that doesn’t rely on a predefined list of passphrases.

Tag Archives: speaker identification

AI Awesomeness: 2020 Update! Microsoft Cognitive Services Speaker Recognition API