CLARIN Tool Portal

EVALD 2.0

3 resources

EVALD 2.0 automatically evaluates surface coherence (cohesion) in Czech texts written by native speakers of Czech.

MSTperl delexicalized parser transfer scripts and configuration files

3 resources

This is a set of MSTperl parser configuration files and scripts for delexicalized parser transfer. They were used in the work reported in arXiv:1506.04897 (http://arxiv.org/abs/1506.04897), as well as several related papers. The MSTperl parser is available at http://hdl.handle.net/11234/1-1480

Universal Dependencies 2.0 Models for UDPipe (2017-08-01)

3 resources

Tokenizer, POS tagger, lemmatizer and parser models for all 50 languages of the Universal Dependencies 2.0 treebanks, created solely using UD 2.0 data (http://hdl.handle.net/11234/1-1983). The model documentation, including performance figures, can be found at http://ufal.mff.cuni.cz/udpipe/users-manual#universal_dependencies_20_models . To use these models, you need the UDPipe binary, version 1.2 or later, which you can download from http://ufal.mff.cuni.cz/udpipe . In addition to the models themselves, all additional data and the hyperparameter values used for training are available in the second archive, allowing reproducible training.
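As a rough illustration of how one of these models might be run, the sketch below builds a command line for the UDPipe binary using its documented `--tokenize --tag --parse` options. The helper name `build_udpipe_command`, the model file name and the input file name are assumptions made for this example, not part of the archive; the actual model file names should be taken from the downloaded archive.

```python
import shlex
from typing import List


def build_udpipe_command(model_path: str, input_path: str,
                         binary: str = "udpipe") -> List[str]:
    """Build an argv list that tokenizes, tags and parses a plain-text file.

    `binary` is assumed to be a UDPipe 1.2+ executable on the PATH.
    """
    return [binary, "--tokenize", "--tag", "--parse", model_path, input_path]


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    cmd = build_udpipe_command("czech-ud-2.0-170801.udpipe", "input.txt")
    # Print a shell-safe version of the command instead of executing it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True, capture_output=True, text=True)`; UDPipe then writes the analysis in CoNLL-U format to standard output.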

ForFun 1.0

2 resources

ForFun is a database of linguistic forms and their syntactic functions, built from the multi-layer annotated corpora of Czech, the Prague Dependency Treebanks. The purpose of the Prague Database of Forms and Functions (ForFun) is to help linguists study the form-function relation, which we assume to be one of the principal tasks of both theoretical linguistics and natural language processing. Prototypical questions are "What purposes does the preposition 'po' serve?" or "What linguistic means in a sentence can express the meaning 'a destination of an action'?". There are almost 1500 distinct forms (besides the 'po' preposition) and 65 distinct functions (besides the 'destination').

Translation Models (en-de) (v1.0)

2 resources

En-De translation models, exported via TensorFlow Serving and available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). The models are compatible with Tensor2Tensor version 1.6.6. For details about the model training (data, model hyper-parameters), please contact the archive maintainer. Evaluation on newstest2020 (BLEU): en->de: 25.9; de->en: 33.4 (evaluated using multeval: https://github.com/jhclark/multeval).

The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Croatian 1.1

3 resources

The model for morphosyntactic annotation of standard Croatian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the hr500k training corpus (http://hdl.handle.net/11356/1183) and using the CLARIN.SI-embed.hr word embeddings (http://hdl.handle.net/11356/1205). The model simultaneously produces UPOS, FEATS and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~94.1. Unlike the previous version of the model, this version predicts the whole XPOS tag rather than specific characters, as was the case in stanfordnlp, which resulted in illegal XPOS tags (and slightly decreased performance).

The CLASSLA-StanfordNLP model for lemmatisation of standard Slovenian 1.2

2 resources

The model for lemmatisation of standard Slovenian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the ssj500k training corpus (http://hdl.handle.net/11356/1210) and using the Sloleks inflectional lexicon (http://hdl.handle.net/11356/1230). The estimated F1 of the lemma annotations is ~99.0. Unlike the previous version, this version relies solely on XPOS annotations, and not on a combination of UPOS, FEATS (lexicon lookup) and XPOS (lemma prediction) annotations.

NLP Web services and NLP workflow engine

2 resources

Web-based system for natural language processing of texts in Polish. It allows running complex workflows of language and machine learning tools and makes them available via REST web services.

XLM-RoBERTa-LARGE events relation recognition

1 resource

A set of basic language tools for the Polish language. Z4.2a Improving the quality of recognition of relations between events using Transformer-type deep networks.

Universal Dependencies 2.4 Models for UDPipe (2019-05-31)

93 resources

Tokenizer, POS tagger, lemmatizer and parser models for 90 treebanks of 60 languages of the Universal Dependencies 2.4 treebanks, created solely using UD 2.4 data (http://hdl.handle.net/11234/1-2988). The model documentation, including performance figures, can be found at http://ufal.mff.cuni.cz/udpipe/models#universal_dependencies_24_models . To use these models, you need the UDPipe binary, version 1.2 or later, which you can download from http://ufal.mff.cuni.cz/udpipe . In addition to the models themselves, all additional data and the hyperparameter values used for training are available in the second archive, allowing reproducible training.
