Search results
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
Robust Speech Recognition via Large-Scale Weak Supervision - Releases · openai/whisper.
Sep 21, 2022 · Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language.
Whisper large-v3 is supported in Hugging Face 🤗 Transformers. To run the model, first install the Transformers library through the GitHub repo. For this example, we'll also install 🤗 Datasets to load a toy audio dataset from the Hugging Face Hub.
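As a rough illustration of the Transformers route this result describes, the sketch below wraps the library's `pipeline` helper for the `openai/whisper-large-v3` checkpoint. The function names `build_transcriber` and `transcribe` are hypothetical, and the `chunk_length_s=30` setting is an assumption chosen so that recordings longer than 30 seconds can be processed. It also assumes `transformers` and `torch` are installed and that `ffmpeg` is available for decoding audio files.

```python
MODEL_ID = "openai/whisper-large-v3"  # checkpoint name on the Hugging Face Hub


def build_transcriber(model_id=MODEL_ID, device="cpu"):
    """Create a speech-recognition pipeline (hypothetical helper)."""
    # Imported lazily so the module loads even before the heavy
    # dependencies are installed.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=model_id,
        device=device,
        chunk_length_s=30,  # assumption: split long audio into 30 s chunks
    )


def transcribe(path, transcriber=None):
    """Return the transcript of the audio file at `path` as plain text."""
    if transcriber is None:
        transcriber = build_transcriber()
    result = transcriber(path)  # the pipeline returns a dict with a "text" key
    return result["text"].strip()
```

Calling `transcribe("sample.wav")` would download the checkpoint on first use; passing a prebuilt `transcriber` avoids reloading the model for every file.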
Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalise to many datasets and domains without the need for fine-tuning.
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.
Constructs a Whisper processor which wraps a Whisper feature extractor and a Whisper tokenizer into a single processor. WhisperProcessor offers all the functionalities of WhisperFeatureExtractor and WhisperTokenizer.
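To make the "single processor" idea concrete, here is a minimal sketch of how the two halves might be used: the feature-extractor side turns raw audio into model inputs, and the tokenizer side turns predicted token ids back into text. The helper names (`load_processor`, `audio_to_features`, `ids_to_text`) are hypothetical, and the `openai/whisper-tiny` checkpoint and 16 kHz sampling rate are assumptions made for the example.

```python
def load_processor(model_id="openai/whisper-tiny"):
    """Load a WhisperProcessor from the Hub (hypothetical helper)."""
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import WhisperProcessor

    return WhisperProcessor.from_pretrained(model_id)


def audio_to_features(processor, waveform, sampling_rate=16000):
    """Feature-extractor half: raw waveform -> log-mel input features."""
    inputs = processor(
        audio=waveform,
        sampling_rate=sampling_rate,  # assumption: audio sampled at 16 kHz
        return_tensors="pt",
    )
    return inputs.input_features


def ids_to_text(processor, token_ids):
    """Tokenizer half: batch of generated token ids -> list of strings."""
    return processor.batch_decode(token_ids, skip_special_tokens=True)
```

In a full pipeline, the output of `audio_to_features` would be passed to a Whisper model's `generate` method, and the resulting ids to `ids_to_text`.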
Nov 17, 2023 · Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model - Const-me/Whisper