August 21, 2017
Microsoft’s AI-backed systems have been creating a lot of buzz lately. Its chatbots are holding actual conversations on one end, while on the input end, the conversational speech recognition system developed by Microsoft has just hit another milestone.
Microsoft announced that its conversational speech recognition system has reached a 5.1% error rate, its lowest so far. This breaks the previous record of 5.9%, which was reached last year by a group of researchers from Microsoft Artificial Intelligence and Research. Interestingly, this puts the system on par with professional human transcribers, who have advantages like the ability to listen to the audio several times.
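For readers wondering what an error rate like 5.1% measures: figures of this kind are typically word error rates, meaning the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that calculation under simplifying assumptions (whitespace tokenization, no text normalization). The function name and the example sentences are made up for illustration and are not taken from Microsoft's system or the Switchboard corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference length.

    Illustrative only: real evaluations also normalize text
    (case, punctuation, filler words) before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Made-up example: one word out of four is transcribed incorrectly.
print(word_error_rate("see you next week", "see you next weak"))
```

Under this definition, a 5.1% error rate corresponds to roughly five word errors for every 100 words in the reference transcript.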
In both cases, the researchers transcribed recordings from the Switchboard corpus, a collection of about 2,400 telephone conversations that researchers have used to test speech recognition systems since the early 1990s.
The new study, performed by a group of researchers at Microsoft AI and Research, set out to achieve the same level of accuracy as a group of human transcribers who were able to listen to what they were transcribing several times.
All in all, the researchers at Microsoft were able to reduce the error rate by about 12 percent compared to last year’s findings by improving the neural net-based acoustic and language models of Microsoft’s speech recognition system. Most notably, the speech recognizer is able to use entire conversations, which lets it adapt its transcriptions to context and predict words or phrases that are likely to come next, much as humans do when talking to one another.