Will Automatic Speech Recognition Ever Be Accurate?

Ever wondered how your iPhone responds to the command “Hey Siri”? Or how Alexa plays your favorite song on request? Behind both is automatic speech recognition (ASR) software, which picks up your voice, transcribes what you said, and passes the result on so the device can carry out the action you asked for.

The History of Speech Recognition Technology

Speech recognition technology is not confined to smartphones and virtual assistants. It is also used in medicine, the military, business, banking, journalism, education, and entertainment. In 1952, Bell Laboratories built a speech recognition system called ‘Audrey’ that recognized digits spoken by a speaker. Later, in 1990, Dragon Systems released ‘Dragon Dictate’, widely regarded as the first speech recognition product for consumers, which converted speech to text. Today, the Dragon product line includes several advanced speech recognition software packages, including Dragon Anywhere and Dragon Home.

Drawbacks of Speech Recognition Technology

Despite being a groundbreaking technology, the speech recognition software used by leading companies has noticeable drawbacks when recognizing a speaker’s voice and carrying out commands. They include:

1. Lack of Accuracy

This is one of the major disadvantages of current speech recognition software. Computers still cannot reliably interpret everything a human says, so transcripts and commands often contain errors.

2. Too Much Time Consumption

Because the output is not always accurate, users spend extra time correcting mistakes and repeating commands to get the software to work.

3. Needs Training

With a few exceptions, speech recognition software packages need to be trained before they can understand a particular speaker’s voice well.

4. Misinterpreted Commands

Commands can easily be misinterpreted. In fields such as medicine, where high accuracy is required, even a small misinterpretation can lead to a serious mistake.

5. Noise Interference

Often, we have to use the software in a public place or in a room full of other people. In such cases, it is hard for the software to filter out the background noise and understand the speaker’s command.

6. Problems with Accent

Speech recognition software is generally not tailored to a particular country or accent. Because the same words sound different in different accents, they are harder for the computer to transcribe.
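
The accuracy problems in the list above are usually quantified with a metric called word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the software's transcript into the correct one, divided by the number of words in the correct one. The sketch below is a minimal illustration in Python, not the scoring code of any particular product. The function name `word_error_rate` and the sample sentences are made up for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word was dropped (deletion)
                curr[j - 1] + 1,     # extra word was added (insertion)
                prev[j - 1] + cost,  # words match, or one was substituted
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one word misheard ("off" -> "of"), one dropped ("the").
print(word_error_rate("turn off the lights", "turn of lights"))  # 0.5
```

A WER of 0.5 means that half of the spoken words were transcribed incorrectly. A perfect transcript scores 0, and the score can exceed 1 when the software inserts many extra words.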

Today, several techniques, such as deep neural networks and Viterbi search, are used to improve the accuracy of voice recognition. Yet we still have a long way to go in terms of accuracy.
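
To give a feel for what Viterbi search does, here is a minimal sketch in Python. Given a sequence of observed sounds, it finds the most likely sequence of hidden states that produced them. The two states (`silence`, `speech`), the three loudness labels, and all the probabilities are invented for illustration. Real recognizers use far larger models, and this is not the implementation of any specific product.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations.

    Works in log space to avoid underflow; assumes no probability is zero.
    """
    # Each column maps state -> (best log-probability so far, previous state).
    first = observations[0]
    columns = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the best predecessor for state s.
            score, prev = max(
                (columns[-1][p][0] + math.log(trans_p[p][s]), p)
                for p in states
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        columns.append(column)

    # Start from the best final state and follow the back-pointers.
    state = max(states, key=lambda s: columns[-1][s][0])
    path = [state]
    for column in reversed(columns[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Toy model with made-up numbers (assumptions, not measured data).
STATES = ["silence", "speech"]
START_P = {"silence": 0.6, "speech": 0.4}
TRANS_P = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.4, "speech": 0.6},
}
EMIT_P = {
    "silence": {"quiet": 0.5, "medium": 0.4, "loud": 0.1},
    "speech": {"quiet": 0.1, "medium": 0.3, "loud": 0.6},
}

print(viterbi(["quiet", "medium", "loud"], STATES, START_P, TRANS_P, EMIT_P))
# ['silence', 'silence', 'speech']
```

In a speech recognizer the hidden states would stand for units such as phonemes or words rather than just silence and speech, but the search for the single most likely path works the same way.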

How Efficient Is the Automatic Speech Recognition Software?