CLARIN Tool Portal

EVALD 4.0 for Foreigners – Evaluator of Discourse

3 resources

EVALD 4.0 for Foreigners is software for automatic evaluation of surface coherence (cohesion) in Czech texts written by non-native speakers of Czech.

CUBBITT Translation Models (en-cs) (v1.0)

3 resources

CUBBITT En-Cs translation models, exported via TensorFlow Serving and available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). The models are compatible with Tensor2Tensor version 1.6.6. For details about model training (data, model hyper-parameters), please contact the archive maintainer. Evaluation on newstest2014 (BLEU): en->cs 27.6, cs->en 34.4 (evaluated using multeval: https://github.com/jhclark/multeval).

EVALD 2.0 for Foreigners

3 resources

EVALD 2.0 for Foreigners is software for automatic evaluation of surface coherence (cohesion) in Czech texts written by non-native speakers of Czech.

NameTag

1 resource

NameTag is an open-source tool for named entity recognition (NER). NameTag identifies proper names in text and classifies them into predefined categories, such as names of persons, locations, organizations, etc.

LINDAT Translation

1 resource

The input file size is limited to 100 kB. Supported translation directions (from -> to):
Czech -> English, Hindi, French, Russian, German
English -> Russian, German, Czech, Hindi, French
Russian -> German, French, Czech, Hindi, English
German -> Russian, Hindi, Czech, English, French
French -> Russian, German, Czech, English, Hindi
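The limits listed for this service can be checked on the client side before submitting a file. The following is a minimal sketch, not part of the service itself: the function name `can_translate` is hypothetical, languages are keyed by their English names because the listing gives no language codes, and reading "100 kB" as 100,000 bytes is an assumption.

```python
# Translation directions as listed in the entry above (source -> targets).
# Hindi appears only as a target language in the listing.
SUPPORTED = {
    "Czech": {"English", "Hindi", "French", "Russian", "German"},
    "English": {"Russian", "German", "Czech", "Hindi", "French"},
    "Russian": {"German", "French", "Czech", "Hindi", "English"},
    "German": {"Russian", "Hindi", "Czech", "English", "French"},
    "French": {"Russian", "German", "Czech", "English", "Hindi"},
}

# Assumption: "100 kB" is read as 100 * 1000 bytes; the service may count differently.
MAX_INPUT_BYTES = 100 * 1000


def can_translate(source: str, target: str, payload: bytes) -> bool:
    """Return True if the direction is listed and the payload fits the size limit."""
    direction_ok = target in SUPPORTED.get(source, set())
    size_ok = len(payload) <= MAX_INPUT_BYTES
    return direction_ok and size_ok


if __name__ == "__main__":
    text = "Dobrý den".encode("utf-8")
    print(can_translate("Czech", "Hindi", text))
    print(can_translate("Hindi", "Czech", text))
```

Checking the encoded byte length rather than the character count matters here, since Czech, Russian, and Hindi text takes more than one byte per character in UTF-8.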

UDPipe

1 resource

UDPipe is a trainable pipeline for tokenization, tagging, lemmatization and dependency parsing of CoNLL-U files. UDPipe is language-agnostic and can be trained given only annotated data in CoNLL-U format. Trained models are provided for nearly all UD treebanks.
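For readers unfamiliar with CoNLL-U, the format stores one token per line in ten tab-separated columns, with comment lines starting with `#` and a blank line after each sentence. The sketch below reads such text into Python dictionaries. It does not use UDPipe itself; the function name `parse_conllu` and the sample sentence are illustrative assumptions.

```python
# The ten CoNLL-U columns, in order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")

# Hand-written sample, not the output of any real model.
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)


def parse_conllu(text: str) -> list[list[dict[str, str]]]:
    """Split CoNLL-U text into sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():          # a blank line ends the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):      # sentence-level comment, skipped here
            continue
        columns = line.split("\t")
        if len(columns) != len(FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        current.append(dict(zip(FIELDS, columns)))
    if current:                       # input without a trailing blank line
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    for token in parse_conllu(SAMPLE)[0]:
        print(token["id"], token["form"], token["upos"], token["head"], token["deprel"])
```

All values are kept as strings because the ID column can also hold ranges (multiword tokens) and decimal IDs (empty nodes), which a plain integer conversion would reject.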

EduPo: Analysis and Generation of Czech Poetry, v0.5

2 resources

A suite of tools for the analysis and generation of Czech poetry. This is a snapshot of the public GitHub repository at https://github.com/ufal/edupo, the beta version of the tool suite, released together with a scientific paper at the NLP4DH 2025 conference.

Generator of Czech lyrics according to structure

3 resources

Fine-tuned Czech TinyLlama model (https://huggingface.co/BUT-FIT/CSTinyLlama-1.2B) and Czech GPT2 small model (https://huggingface.co/lchaloupsky/czech-gpt2-oscar) that generate lyrics of song sections based on the provided syllable counts, keywords, and rhyme scheme. The TinyLlama-based model yields better results; however, the GPT2-based model can run locally. Both models are discussed in the bachelor thesis "Generation of Czech Lyrics to Cover Songs".

Debiasing Algorithm through Model Adaptation

2 resources

Debiasing Algorithm through Model Adaptation (DAMA) is based on guarding stereotypical gender signals and model editing. DAMA is performed on specific modules prone to conveying gender bias, as shown by causal tracing. Our novel method effectively reduces gender bias in LLaMA models in three diagnostic tests: generation, coreference (WinoBias), and stereotypical sentence likelihood (StereoSet). The method does not change the model’s architecture, parameter count, or inference cost. We have also shown that the model’s performance in language modeling and a diverse set of downstream tasks is almost unaffected. This package contains both the source code and the English, English-to-Czech, and English-to-German datasets.

MSTperl parser (2015-05-19)

2 resources

MSTperl is a Perl reimplementation of the MST parser of Ryan McDonald (http://www.seas.upenn.edu/~strctlrn/MSTParser/MSTParser.html). MST parser (Maximum Spanning Tree parser) is a state-of-the-art natural language dependency parser, a tool that takes a sentence and returns its dependency tree.

In MSTperl, only some functionality was implemented. The limitations include the following: the parser is non-projective, currently with no possibility of enforcing projectivity of the parse trees; only first-order features are supported, i.e. no second-order or third-order features are possible; the implementation of MIRA is a single-best MIRA, with a closed-form update instead of quadratic programming.

On the other hand, the parser supports several advanced features: parallel features, i.e. enriching the parser input with a word-aligned sentence in another language; adding large-scale information, i.e. enriching the feature set with features corresponding to pointwise mutual information of word pairs in a large corpus (CzEng); weighted/unweighted parser model interpolation; combination of several instances of the MSTperl parser (through the MST algorithm); combination of several existing parses from any parsers (through the MST algorithm).

The MSTperl parser is tuned for parsing Czech. Trained models are available for Czech, English and German. We can train the parser for other languages on demand, or you can train it yourself; the guidelines are part of the documentation. The parser, together with detailed documentation, is available on CPAN (http://search.cpan.org/~rur/Treex-Parser-MSTperl/).
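To illustrate what a single-best MIRA update with a closed-form step looks like, here is a minimal Python sketch of the standard textbook formulation. It is not taken from the MSTperl source: the function name `mira_update`, the sparse-dictionary feature representation, and the choice of loss (for example, the number of wrongly attached words) are all assumptions made for illustration.

```python
def mira_update(weights: dict, gold_feats: dict, pred_feats: dict, loss: float) -> dict:
    """Apply one single-best MIRA step in place and return the weights.

    The step is the smallest change to the weights that makes the gold tree
    outscore the predicted tree by at least `loss`; with one constraint this
    has a closed form, so no quadratic programming solver is needed.
    """
    # Feature difference between the gold parse and the predicted parse.
    diff = {}
    for name, value in gold_feats.items():
        diff[name] = diff.get(name, 0.0) + value
    for name, value in pred_feats.items():
        diff[name] = diff.get(name, 0.0) - value

    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0.0:
        return weights  # identical feature vectors, nothing to update

    # Current score margin of the gold parse over the predicted parse.
    margin = sum(weights.get(name, 0.0) * v for name, v in diff.items())

    # Closed-form step size, clipped at zero when the margin already suffices.
    tau = max(0.0, (loss - margin) / norm_sq)

    for name, v in diff.items():
        weights[name] = weights.get(name, 0.0) + tau * v
    return weights


if __name__ == "__main__":
    w = {}
    gold = {"head=bark|dep=Dogs": 1.0}   # hypothetical first-order edge feature
    pred = {"head=.|dep=Dogs": 1.0}
    print(mira_update(w, gold, pred, loss=1.0))
```

After one update the gold parse outscores the prediction by exactly the loss, so repeating the same update leaves the weights unchanged.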
