CLARIN Tool Portal

Saper

1 resource

Shallow semantic parser for processing Polish texts. Includes word sense disambiguation, mapping to SUMO concepts, and semantic role labelling.

The CLASSLA-StanfordNLP model for named entity recognition of non-standard Croatian 1.0

2 resources

This model for named entity recognition of non-standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1183), the ReLDI-NormTagNER-hr corpus (http://hdl.handle.net/11356/1241) and the ReLDI-NormTagNER-sr corpus (http://hdl.handle.net/11356/1240), using the CLARIN.SI-embed.hr word embeddings (http://hdl.handle.net/11356/1205). The training corpora were additionally augmented for handling missing diacritics by repeating parts of the corpora with diacritics removed.
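The diacritic augmentation described above can be sketched as follows. This is a minimal illustration, not the authors' actual procedure: the function names, the sampling fraction, the random seed, and the mapping of "đ" to "d" are all assumptions made for the example.

```python
import random
import unicodedata

# "đ"/"Đ" have no Unicode decomposition, so they need an explicit mapping.
# Mapping them to "d"/"D" is an assumption of this sketch.
EXTRA_MAP = str.maketrans({"đ": "d", "Đ": "D"})


def strip_diacritics(text: str) -> str:
    """Remove combining marks (e.g. the caron in "č") and map "đ" to "d"."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return unicodedata.normalize("NFC", without_marks).translate(EXTRA_MAP)


def augment(sentences, fraction=0.5, seed=0):
    """Return the original sentences plus diacritic-free copies of a sample.

    `fraction` (hypothetical parameter) controls how much of the corpus
    is repeated without diacritics.
    """
    rng = random.Random(seed)
    k = int(len(sentences) * fraction)
    sampled = rng.sample(list(sentences), k)
    return list(sentences) + [strip_diacritics(s) for s in sampled]
```

Calling `augment(corpus, fraction=0.5)` would keep every original sentence and append stripped copies of half of them, so the model sees both spellings of the same material.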

Samrómur-Adolescents Kaldi Recipe 22.06

2 resources

The "Samrómur-Adolescents Kaldi Recipe 22.06" is a code recipe intended to show how to integrate the adolescent portion of the corpus "Samrómur Children's Icelandic Speech Data 21.09" [1] and the "Icelandic Language Models with Pronunciations 22.01" [2] to create automatic speech recognition systems using the Kaldi toolkit [3].

NPSemRel

1 resource

NPSemRel is a tool for recognizing semantic roles in nominal phrases.

VIADAT-STAT (2019-12-31)

2 resources

A VIADAT module; the purpose of VIADAT-STAT is statistical analysis of recordings stored by the platform. Developed in cooperation with ÚSD AV ČR and NFA.

Font ZRCalo 1.0

2 resources

ZRCalo is an open font meant to gradually phase out the ZRCola font as one of the components of the ZRCola 2 input system (http://hdl.handle.net/11356/1090). The current version is a baseline variant covering the basic Latin Unicode blocks. Future versions will aim to build on Unicode's combining character mechanism to replace ZRCola's extensive use of the Private Use Area.

The CLASSLA-Stanza model for UD dependency parsing of standard Slovenian 2.2

3 resources

This model for UD dependency parsing of standard Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SUK training corpus (http://hdl.handle.net/11356/1747) and using the CLARIN.SI-embed.sl word embeddings (http://hdl.handle.net/11356/1204) expanded with the MaCoCu-sl Slovene web corpus (http://hdl.handle.net/11356/1517). The estimated LAS of the parser is ~90.42. The difference from the previous version is that this model was trained on the improved SUK 1.1 version of the training corpus.
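For readers unfamiliar with the metric, LAS (labeled attachment score) is the percentage of tokens for which both the predicted head and the predicted dependency relation match the gold annotation. The sketch below shows the computation on toy data. The function name and the input layout are assumptions for illustration, and this is not the evaluation script used to obtain the figure above.

```python
def attachment_scores(gold, predicted):
    """Compute (UAS, LAS) in percent.

    Both arguments are lists with one (head_index, relation) pair per token.
    UAS counts tokens with the correct head; LAS also requires the
    correct relation label.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    n = len(gold)
    correct_heads = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    correct_both = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct_heads / n, 100.0 * correct_both / n


# Toy example: four tokens, one wrong label and one wrong head.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "iobj"), (3, "punct")]
uas, las = attachment_scores(gold, pred)
print(uas, las)  # 75.0 50.0
```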

Samrómur DeepSpeech Recipe 22.06

2 resources

The "Samrómur DeepSpeech Recipe 22.06" is a code recipe intended to show how to integrate the corpus "Samromur 21.05" [1] and the "DeepSpeech Scorer for Icelandic 22.06" [2] to create automatic speech recognition systems using Mozilla's DeepSpeech recognizer [3].

The CLASSLA-Stanza model for morphosyntactic annotation of non-standard Slovenian 2.1

3 resources

This model for morphosyntactic annotation of non-standard Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SUK training corpus (http://hdl.handle.net/11356/1747) and the Janes-Tag corpus (http://hdl.handle.net/11356/1732), using the CLARIN.SI-embed.sl word embeddings (http://hdl.handle.net/11356/1204) that were expanded with the MaCoCu-sl Slovene web corpus (http://hdl.handle.net/11356/1517). These corpora were additionally augmented for handling missing diacritics by repeating parts of the corpora with diacritics removed. The model simultaneously produces UPOS, FEATS and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~92.17. The difference from the previous version is that this model was trained on the SUK training corpus and version 3.0 of Janes-Tag, and uses new embeddings and the new version of the Slovene morphological lexicon, Sloleks 3.0 (http://hdl.handle.net/11356/1745).

The CLASSLA-Stanza model for morphosyntactic annotation of standard Slovenian 2.0

3 resources

This model for morphosyntactic annotation of standard Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SUK training corpus (http://hdl.handle.net/11356/1747) and using the CLARIN.SI-embed.sl word embeddings (http://hdl.handle.net/11356/1204) that were expanded with the MaCoCu-sl Slovene web corpus (http://hdl.handle.net/11356/1517). The model simultaneously produces UPOS, FEATS and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~98.27. The difference from the previous version is that this model was trained on the SUK training corpus and uses new embeddings and the new version of the Slovene morphological lexicon, Sloleks 3.0 (http://hdl.handle.net/11356/1745).
