OpenAI Releases ‘Whisper’ - Automatic Speech Recognition Tool

OpenAI, an artificial intelligence (AI) research organization whose stated mission is to develop and direct AI to benefit humanity as a whole, has released Whisper. According to OpenAI, Whisper is an automatic speech recognition (ASR) system that enables 'robust' transcription in multiple languages and can also translate speech in those languages into English.

ASR has long been a hard problem for AI and machine learning, and Whisper is a step in the right direction for OpenAI. OpenAI trained Whisper on 680,000 hours of audio and matching transcripts collected from the web in 98 languages. The models and inference code are open source, so they can be used to build useful apps and to further research into making speech processing more reliable.

CLIP, an open source computer vision model that OpenAI released in January 2021, arguably ignited the recent era of rapidly progressing image synthesis technology such as DALL-E 2 and Stable Diffusion.

OpenAI describes Whisper as an encoder-decoder transformer, a type of neural network that learns associations from its input data and uses them to produce the model's output. OpenAI gives this overview of how Whisper works: 'Input audio is divided into 30-second chunks, converted to a log-Mel spectrogram, and then passed through an encoder. A decoder is trained to predict the corresponding text caption, and special tokens are mixed in to direct the single model to perform tasks like language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation.'

Approximately one-third of Whisper's audio dataset is non-English, and for that audio the model is alternately given the task of transcribing in the original language or translating into English. The researchers claim that this approach is effective for learning speech-to-text translation and that, zero-shot, it outperforms the supervised state of the art (SOTA) on CoVoST2 to-English translation. The OpenAI researchers also hope that Whisper's high accuracy and ease of use will enable developers to incorporate voice interfaces into a broader range of applications.
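
To make the pipeline in OpenAI's overview more concrete, here is a minimal Python sketch of two of its steps: cutting audio into 30-second chunks and building the special-token prefix that tells the single model which task to perform. It is an illustration under stated assumptions, not OpenAI's implementation. The 16 kHz sampling rate, the zero-padding of the final chunk, the token spellings (modeled on those in the published Whisper paper), and the helper names `split_into_chunks` and `task_prefix` are all assumptions that do not come from this article. The log-Mel spectrogram and the transformer itself are left out.

```python
SAMPLE_RATE = 16_000   # assumed sampling rate in Hz (not stated in the article)
CHUNK_SECONDS = 30     # chunk length given in OpenAI's overview
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def split_into_chunks(samples):
    """Split audio samples into 30-second chunks, zero-padding the last one."""
    chunks = []
    for start in range(0, len(samples), CHUNK_SAMPLES):
        chunk = list(samples[start:start + CHUNK_SAMPLES])
        # Assumption: a short final chunk is padded with silence to full length.
        chunk.extend([0.0] * (CHUNK_SAMPLES - len(chunk)))
        chunks.append(chunk)
    return chunks


def task_prefix(language, translate=False, timestamps=False):
    """Build an illustrative special-token prefix that selects the decoder task."""
    tokens = ["<|startoftranscript|>", f"<|{language}|>"]
    # One model, two tasks: transcribe in the source language or translate to English.
    tokens.append("<|translate|>" if translate else "<|transcribe|>")
    if not timestamps:
        tokens.append("<|notimestamps|>")
    return tokens


if __name__ == "__main__":
    audio = [0.0] * (SAMPLE_RATE * 70)   # 70 seconds of silence as stand-in audio
    chunks = split_into_chunks(audio)
    print(len(chunks), len(chunks[-1]))  # number of chunks, samples per chunk
    print(task_prefix("fr", translate=True))
```

In this sketch, French audio that should come out as English text would get the prefix from `task_prefix("fr", translate=True)`, while `task_prefix("fr")` would ask for a French transcript of the same audio.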

By Jozeph P

Journalism explorer, tech Enthusiast. Love to read and write.
