Active filters:

  • Resource type: Unspecified
  • Organisation: CLARIN.SI
2 record(s) found

Search results

  • Pretrained models for recognising sex education concepts SemSEX 1.0

    Pretrained language models for detecting and classifying sex education concepts in Slovene curriculum documents. The models are PyTorch neural network models intended for use with the HuggingFace transformers library (https://github.com/huggingface/transformers). They are based on the Slovenian RoBERTa contextual embeddings model SloBERTa 2.0 (http://hdl.handle.net/11356/1397) and on the CroSloEngual BERT model (http://hdl.handle.net/11356/1330). The source code of the models, together with example usage, is available in the GitHub repository https://github.com/TimotejK/SemSex. The models and tokenizers can be loaded with the AutoModelForSequenceClassification.from_pretrained() and AutoTokenizer.from_pretrained() functions from the transformers library; an example of such usage is available at https://github.com/TimotejK/SemSex/blob/main/Concept%20detection/Classifiers/full_pipeline.py. The corpus on which these models were trained is available at http://hdl.handle.net/11356/1895.
  • SloBENCH evaluation framework

    The evaluation framework contains public evaluation scripts. Each script is accompanied by a Dockerfile that allows for platform-independent evaluation and exact comparison of results. Pre-built Docker images are available in the slobench/eval DockerHub repository. The evaluation framework is used and maintained by the SloBENCH leaderboard website team. SloBENCH submitters can check the compliance of their submissions and evaluate their models on training/validation data prior to submission. The initial version of SloBENCH contains evaluation scripts with examples of training and testing datasets for nine different tasks: named entity recognition, part-of-speech tagging, lemmatization, dependency parsing, semantic role labeling, translation (ENG-SLO, SLO-ENG), summarization and question answering.
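
For the SemSEX 1.0 record above, the loading step it describes might look roughly like the sketch below. It is not taken from the SemSex repository: the function names `pick_label` and `classify` and the `model_dir` argument are hypothetical, the model is assumed to have been downloaded and unpacked into a local directory, and a single-label classification head is assumed. The full_pipeline.py script linked in the record remains the authoritative example.

```python
from pathlib import Path


def pick_label(scores, id2label):
    """Return the label whose score is highest (assumes single-label output)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return id2label[best]


def classify(texts, model_dir):
    """Classify each text with a model stored in a local directory.

    `model_dir` is a hypothetical path to an unpacked SemSEX model.
    """
    # Imported lazily so that pick_label stays usable without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_dir = Path(model_dir)
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()  # inference mode: disables dropout

    # Tokenize all texts as one padded batch of PyTorch tensors.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits

    # Map each row of logits to the label name stored in the model config.
    return [pick_label(row.tolist(), model.config.id2label) for row in logits]
```

Calling `classify(["..."], "path/to/model")` would return one label name per input text, provided the model configuration stores its label names in `id2label`.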