DeepSpeech is an open-source Speech-To-Text (STT) engine that runs a model trained with machine learning techniques. Its design was initially based on Baidu’s Deep Speech research paper, and the project is now maintained by Mozilla.
Use DeepSpeech to transcribe audio files, such as podcasts or lectures, into text.
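
As a rough illustration of what transcribing a file can look like, here is a minimal sketch in Python. It assumes the `deepspeech` Python package (the 0.7-or-later API with `Model`, `enableExternalScorer`, `sampleRate`, and `stt`) and `numpy` are installed, and that the input is a mono, 16-bit PCM WAV file recorded at the sample rate the model expects. The helper names `read_wav_pcm16` and `transcribe` are made up for this example and are not part of DeepSpeech.

```python
import wave


def read_wav_pcm16(path):
    """Return (sample_rate, raw_bytes) for a mono 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(wav_path, model_path, scorer_path=None):
    """Transcribe one WAV file with a pre-trained DeepSpeech model."""
    # Imported lazily so the WAV helper above works without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path is not None:
        # The external scorer (language model) is optional.
        model.enableExternalScorer(scorer_path)

    sample_rate, pcm = read_wav_pcm16(wav_path)
    expected_rate = model.sampleRate()
    if sample_rate != expected_rate:
        raise ValueError(
            f"model expects {expected_rate} Hz audio, got {sample_rate} Hz"
        )

    # DeepSpeech takes the audio as an array of 16-bit integer samples.
    audio = np.frombuffer(pcm, dtype=np.int16)
    return model.stt(audio)
```

With a downloaded model, a call such as `transcribe("lecture.wav", "model.pbmm", "model.scorer")` would return the transcript as a string; the three file names here are placeholders. Audio in other formats or sample rates (an MP3 podcast, for instance) would need to be converted to a suitable WAV file first.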