Active filters:

  • Resource type: Unspecified
  • Tool task: ASR
14 record(s) found

Search results

  • Voice control and question answering (22.10)

    The goal of this work package was to develop Kaldi recipes for voice control and question answering systems for Icelandic. We defined six tasks, generated or gathered data for each, normalized the data, and trained Kaldi language models. This submission includes six specialized ASR language models, an acoustic model, the training data for the language models, and all the code used to generate the data and create the models. For further information, see the file README.md.
  • Kaldi Recipe for Faroese

    The "Kaldi Recipe for Faroese" (Icelandic title: "Kaldi Forskrift fyrir færeysku") is a code recipe that shows how to use the corpus "Ravnursson Faroese Speech and Transcripts" [1] to create automatic speech recognition systems with the Kaldi toolkit [2].

    [1] Hernández Mena, Carlos Daniel; Simonsen, Annika. "Ravnursson Faroese Speech and Transcripts". Available at: http://hdl.handle.net/20.500.12537/276
    [2] Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., ... & Vesely, K. (2011). The Kaldi speech recognition toolkit. In IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society.
  • Speech Corpora Toolkit (22.06)

    Speech Corpora Toolkit is a collection of tools that process audio and scripts into a standardized form, preparing them for segmentation and alignment. The output for each source is standardized.
  • Slovene Conformer CTC BPE E2E Automated Speech Recognition model PROTOVERB-ASR-E2E 1.0

    This Conformer CTC BPE E2E Automated Speech Recognition model was trained following the NVIDIA NeMo Conformer-CTC fine-tuning recipe (for details, see the official NVIDIA NeMo ASR documentation, https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/intro.html, and the NVIDIA NeMo GitHub repository, https://github.com/NVIDIA/NeMo). It transcribes Slovene speech to text. The starting point was the Conformer CTC BPE E2E Automated Speech Recognition model RSDO-DS2-ASR-E2E 2.0, which was fine-tuned on the closed Protoverb dataset. The model was fine-tuned for 20 epochs, which improved performance by 9.8% relative WER on the Protoverb test dataset and by 3.3% relative WER on the Slobench dataset.
  • Slovene Conformer CTC BPE E2E Automated Speech Recognition model RSDO-DS2-ASR-E2E 2.0

    This Conformer CTC BPE E2E Automated Speech Recognition model was trained following the NVIDIA NeMo Conformer-CTC recipe (for details, see the official NVIDIA NeMo ASR documentation, https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/intro.html, and the NVIDIA NeMo GitHub repository, https://github.com/NVIDIA/NeMo). It transcribes Slovene speech to text. The training, development, and test datasets were based on the Artur dataset and consisted of 630.38, 16.48, and 15.12 hours of transcribed speech in standardised form, respectively. The model was trained for 200 epochs and reached a WER of 0.0429 on the development dataset and 0.0558 on the test dataset.
  • Polish Speech Services

    This archive contains the source code and configuration of the speech tools web service available at http://mowa.clarin-pl.eu/mowa. The services provided include speech-to-text alignment, speaker diarization, speech transcription, speech activity detection and noise classification, and keyword spotting.