Unrecognized life of speech recognition

I’ll show you the great power of science! (By dictating this entry in Dragon NaturallySpeaking.)

Robert Fortner has an article called “Rest in Peas: The Unrecognized Death of Speech Recognition”, which unfortunately has gained some attention. Unfortunately, because even though it may be factually correct, it is highly misleading. There is a graph early in the article, while the reader’s attention is still fresh, showing that the error rate in speech recognition reached a minimum in 2001. Presumably this is correct according to some kind of research. But then he follows up later in the article with repeated mentions of a specific product, Dragon NaturallySpeaking. This gives the impression, unless you read very carefully, that the accuracy of Dragon NaturallySpeaking has not improved since 2001. This is exactly the opposite of the truth.

As it happens, 2001 was about the time I first tested Dragon NaturallySpeaking, which was then in version 5. I was not impressed. In fact, I compared it to a drunk and homesick Asian high school exchange student. My conclusion was that unless your body was seriously damaged, this software had little more than entertainment value. While you could probably not type faster with your feet, I think it might have been a close race.

I skipped version 6 and tried again at version 7. It had improved, but still had mostly entertainment value to me. It continued to improve with version 8, at the end of 2004. Then in version 9, in 2006, it actually became useful even to me, who has a noticeable Norwegian accent. The improvement up to version 10, in 2008, was less dramatic. Even so, it was with this version that the program (at least to me) crossed the “uncanny valley” and became comparable to talking to a fellow human. Version 11 did not change the speech recognition engine, as far as I know. It was mainly an interface and usability update, and in my opinion it does not deserve a new version number, but should have been called 10.6 or some such. So it does indeed seem that the accuracy of speech recognition has reached a limit – but in 2008 rather than 2001.

Meanwhile, Microsoft keeps improving its own speech recognition, which is built into its Windows operating system. It is still lagging behind Dragon, but the distance is less than it used to be. It is not impossible that Microsoft may overtake Nuance, if Nuance can’t make their speech recognition engine more accurate than it is today.

But even today, speech recognition is far from dead. It doesn’t actually understand what you say, but it is able to take dictation with the best of them and lets you use your computer hands-free (demo on YouTube). That’s pretty impressive for something that’s supposed to be pushing up daisies, don’t you think?

