Google offers new offline voice recognition feature
A new tool developed by Google reduces latency in speech recognition, improves its accuracy and lets it work offline. With it, the Mountain View company could streamline voice input on mobile devices. No more keyboards?
In 2012, research in speech recognition showed major advances driven by machine learning, more precisely by the use of deep neural networks for acoustic modeling. These developments enabled the adoption of speech recognition in products such as Google Voice Search. But that was only the beginning of a revolution: new architectures have appeared every year to improve recognition technology, from deep (DNN) and recurrent (RNN) neural networks to convolutional neural networks, to name a few examples.
One of the most important objectives of these architectures has always been to reduce latency, that is, to shorten the wait between speech and recognition. With that goal in mind, Google has announced a new all-neural recognition feature that makes it easier to enter text on its Gboard keyboard. The new technology uses a recurrent neural network transducer (RNN-T). Because the model runs on the device rather than on a server, it avoids network latency and the irregularities of an unreliable connection.
The transducer is compact enough to run on mobile devices, so the technology moves from being hosted on a server to being hosted on the device itself. This also means that voice recognition works offline. The model transcribes at the phoneme/character level, so the keyboard types characters as the user speaks, as if someone were typing along in real time.
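The character-by-character behavior described above can be illustrated with a minimal sketch of greedy, streaming transducer-style decoding. This is not Google's implementation: `greedy_transducer_decode`, `toy_joint`, `TARGET`, `FRAMES` and the blank symbol are all hypothetical names, and the toy "model" simply pretends each audio frame tells it how many characters have been heard so far. The point is only the control flow: for each incoming frame, emit characters until the model outputs "blank", then move on to the next frame.

```python
from typing import Callable, Iterator, List, Sequence

BLANK = ""  # hypothetical "emit nothing, move to the next frame" symbol


def greedy_transducer_decode(
    frames: Sequence[int],
    joint: Callable[[int, Sequence[str]], str],
    max_symbols_per_frame: int = 5,
) -> Iterator[str]:
    """Yield characters as frames arrive (greedy, one hypothesis only)."""
    history: List[str] = []  # characters emitted so far (label history)
    for frame in frames:
        # A transducer may emit zero, one, or several symbols per frame.
        for _ in range(max_symbols_per_frame):
            symbol = joint(frame, history)
            if symbol == BLANK:
                break  # nothing more to say for this frame
            history.append(symbol)
            yield symbol  # available to the keyboard immediately


# --- Toy stand-in for a trained model (assumption, for illustration) ---
TARGET = "hi there"
# Each toy frame is the number of characters "heard" up to that frame.
FRAMES = [0, 1, 2, 2, 3, 5, 8]


def toy_joint(frame: int, history: Sequence[str]) -> str:
    """Return the next unheard-but-spoken character, or BLANK if caught up."""
    if len(history) < frame:
        return TARGET[len(history)]
    return BLANK


if __name__ == "__main__":
    for ch in greedy_transducer_decode(FRAMES, toy_joint):
        print(ch, end="", flush=True)  # characters appear as frames stream in
    print()
```

In a real recognizer the joint function would combine an acoustic encoder output with a prediction network conditioned on the label history; here both are replaced by a lookup so the streaming loop itself is easy to follow.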
The new voice recognition for Gboard is planned to roll out to all Pixel devices. For the moment, however, only Pixels set to American English will receive the update. Given the current algorithmic improvements, the feature is expected to become available in more languages quite soon.
What do you think of this new breakthrough in voice recognition? Can you imagine a near future where keyboards become obsolete? You can tell us your opinion in the comments below.
Source: Google AI Blog
So, is there an API for this? And how soon is it coming to non-Pixel devices? We're not awash in Pixels in our lab.
I work with a number of people who are either deaf or blind, so speech recognition is something I use all day, every day. Any improvement is hugely appreciated. Also, I tend to write tens of thousands of words per day, so if we ever get dictation software that works properly, it will save my fingers, plus be a huge speed improvement. I type fast, upwards of 90 words per minute, but we all talk closer to 120 and some people up to 160. Most professional secretaries only type 40-60 words per minute, and on a phone way less than that. Voice is the only way to go for the future.