CLARIN Tool Portal

703 record(s) found

Search results

BinPackage (0.3.1)

3 resources

BinPackage is a Python package that embeds the vocabulary of the DMII (BÍN, Beygingarlýsing íslensks nútímamáls; bin.arnastofnun.is) and offers various lookups and queries of the data. The database, maintained by The Árni Magnússon Institute for Icelandic Studies, contains over 6.5 million entries, over 3.1 million unique word forms, and about 300,000 distinct lemmas. The database has been encapsulated in an easy-to-install Python package and compressed from a 400+ megabyte CSV file to an ~80 megabyte indexed binary structure. More information at: https://github.com/mideind/BinPackage
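The relationship between entries, unique word forms and lemmas can be illustrated with a toy index. This is a minimal sketch using only the Python standard library, not BinPackage's API or its binary format; the four-column row layout and the `build_index` helper are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, simplified rows: lemma;word class;word form;tag.
# This layout is an assumption for illustration, not the real DMII CSV schema.
ROWS = """\
hestur;kk;hestur;NFET
hestur;kk;hest;ÞFET
hestur;kk;hesti;ÞGFET
hestur;kk;hests;EFET
hestur;kk;hesta;ÞFFT
hestur;kk;hesta;EFFT
köttur;kk;köttur;NFET
köttur;kk;kött;ÞFET
"""

def build_index(text):
    """Map each word form to every (lemma, word class, tag) entry that yields it."""
    index = defaultdict(list)
    for lemma, word_class, form, tag in csv.reader(io.StringIO(text), delimiter=";"):
        index[form].append((lemma, word_class, tag))
    return index

index = build_index(ROWS)
entries = sum(len(v) for v in index.values())
lemmas = {lemma for v in index.values() for lemma, _, _ in v}

print(entries, len(index), len(lemmas))  # entries, unique word forms, lemmas
print(index["hesta"])                    # one form, two entries
```

One word form can correspond to several entries, which is why the entry count exceeds the number of unique word forms, which in turn exceeds the number of lemmas.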

Punctuation model (20.09)

9 resources

A Python package that punctuates Icelandic text. The input is unpunctuated text, and punctuated text is returned. The user can choose between two punctuation models: a BERT-based Transformer and a bidirectional RNN ([Punctuator 2](www.github.com/ottokart/punctuator2)) in TensorFlow 2.

Prep for Adventure: A game for the acquisition of English prepositions

2 resources

The game is designed to teach the six most frequent English prepositions (to, of, in, for, on, and with) at the A1 to A2 levels of proficiency. Prep for Adventure is a single-player game composed of five separate tasks: a jumping puzzle, cooking, a town maze, lighting the goblets, and banter with a classmate. Their mechanics are then combined in the final task (The Final Fight) to elicit correct responses from the subject. The language used in the game is adjusted to the subjects' level of proficiency; the game is fully voiced and offers a degree of customization. All tasks are based on the gap-filling type of exercise, where subjects have to complete a sentence with a missing word, either by typing it in or via different kinds of multiple-choice formats. The game is designed to advance the subjects' performance in prepositional structures by exposing players to as much input as possible. An average playthrough takes approximately 30-45 minutes. The game was created in the RPG Maker MV engine. RPG stands for role-playing game, a genre in which the player adopts the role or roles of one or more fictional characters in a (partly or fully) invented setting. The game story: The Grammar School of Witchcraft has been taken over by the Evil Preposition Magician, and the player is trying to win their school back alongside a young witch named Morphologina (the player's guide).

Lithuanian Spelling Checker V.1.0.45 for Linux

1 resource

Lithuanian spelling checker for Linux, version 1.0.45 (2020-04-07).

Yfirlestur Word 22.10

2 resources

Yfirlestur Word is the source code for a spelling and grammar correction add-on for Icelandic, for use with Microsoft Word. The plugin provides error annotation and replacement, based on user interaction. The source code is intended for third-party development and can be installed and tested locally using Node.js. The plugin requires third-party correction software for its functionality. For development and testing, the open-access Yfirlestur.is API produced by Miðeind was used (see: https://github.com/icelandic-lt/Yfirlestur), but it is not intended for production use. This software is licensed under the MIT License. More information at https://github.com/icelandic-lt/Yfirlestur-Word.

WordnetLoom 2.0

4 resources

WordnetLoom 2.0 executable files for plWordNet 4.0. Source code available at https://github.com/CLARIN-PL/WordnetLoom. WordnetLoom is a wordnet editor application built for the construction of the largest Polish wordnet, plWordNet. WordnetLoom provides two means of interaction: a form-based one, implemented initially, and a visual, graph-based one, introduced more recently. The visual, graph-based interactive presentation of the wordnet structure enables browsing and direct editing of the structure of lexico-semantic relations and synsets. WordnetLoom works in a distributed environment, i.e. several linguists can work simultaneously from different sites on the same central database.

Dependency tree extraction tool STARK 1.0

2 resources

STARK is a Python-based command-line tool for extracting dependency trees from parsed corpora, aimed at corpus-driven linguistic investigations of syntactic phenomena of various kinds. It supports the CoNLL-U format (https://universaldependencies.org/format.html) as input and returns a list of all relevant dependency trees, their frequencies, and other associated information in the form of a tab-separated .tsv file. For installation, execution and the description of the various user-defined parameter settings, see the official project page at: https://gitea.cjvt.si/lkrsnik/STARK. This entry corresponds to commit 421f12cac6 in the Git repository.
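The general idea of reading CoNLL-U input and writing frequencies to a tab-separated file can be sketched as follows. This is a simplified illustration using only the Python standard library, not STARK's implementation or command-line interface: it counts only single-edge trees (head, relation, dependent), and the sample sentences and the function names `read_conllu`, `count_edges` and `to_tsv` are made up for the example.

```python
from collections import Counter
import csv
import io

# Tiny made-up sample; columns are typed with spaces here and converted to tabs below.
_SAMPLE_ROWS = """\
# text = The dog barks.
1 The the DET _ _ 2 det _ _
2 dog dog NOUN _ _ 3 nsubj _ _
3 barks bark VERB _ _ 0 root _ _
4 . . PUNCT _ _ 3 punct _ _

# text = The cat sleeps.
1 The the DET _ _ 2 det _ _
2 cat cat NOUN _ _ 3 nsubj _ _
3 sleeps sleep VERB _ _ 0 root _ _
4 . . PUNCT _ _ 3 punct _ _
"""
SAMPLE = "\n".join("\t".join(line.split()) for line in _SAMPLE_ROWS.splitlines())

def read_conllu(text):
    """Yield each sentence as a list of token dicts (ID, UPOS, HEAD, DEPREL only)."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():            # a blank line ends a sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):        # sentence-level comment
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():       # skip multiword ranges (1-2) and empty nodes (1.1)
            continue
        sentence.append({"id": int(cols[0]), "upos": cols[3],
                         "head": int(cols[6]), "deprel": cols[7]})
    if sentence:
        yield sentence

def count_edges(text):
    """Count (head UPOS, relation, dependent UPOS) triples over all sentences."""
    counts = Counter()
    for sent in read_conllu(text):
        by_id = {tok["id"]: tok for tok in sent}
        for tok in sent:
            if tok["head"] != 0:        # the root has no head token
                counts[(by_id[tok["head"]]["upos"], tok["deprel"], tok["upos"])] += 1
    return counts

def to_tsv(counts):
    """Render the counts as tab-separated text, most frequent first."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["head_upos", "deprel", "dependent_upos", "frequency"])
    for (head, rel, dep), n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([head, rel, dep, n])
    return out.getvalue()

counts = count_edges(SAMPLE)
print(to_tsv(counts))
```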

Colloc -- A Tool for Automatic Identification of Multiword Expressions

1 resource

Colloc, a tool for automatic identification of multiword expressions (MWEs), is freely available for online use at http://resursai.mwe.lt/atpazintuvas. The DELFI.lt corpus (http://tekstynas.mwe.lt/) was used as training material. Identification uses a combination of two trained models (an RNN bi-LSTM and a CRF). Automatically identified MWEs can be retrieved in two formats: a list of MWEs and/or the text with annotated MWEs.

Rule-based g2p for Icelandic

2 resources

Manually developed grapheme-to-phoneme (g2p) transcription rules for Icelandic, written in Thrax grammar syntax. The rules cover standard Icelandic pronunciation and three regional variants: the northern variant (harðmæli), the north-eastern variant (raddaður framburður) and the southern variant (hv-framburður). The package also contains a command-line tool written in C++.

Slovene Text Denormalizator RSDO-DS2-DENORM 1.0

2 resources

This text denormalizer converts Slovene spoken-form text into written-form text. It is typically used as a post-processing step in automatic speech recognition, which traditionally outputs spoken-form text. The input can be a string, a list of tokens, or a list of dictionaries with a mandatory "text" field. The output is a dictionary.

Example of use:

denormalize("Danes, osmega sedmega dva tisoč dvaindvajset, je lep sončen dan, saj je zunaj prijetnih petindvajset stopinj Celzija.")

{'denormalized_content': [{'text': 'Danes', 'index': [0]}, {'text': ',', 'index': [1]}, {'text': '8.', 'index': [2]}, {'text': '7.', 'index': [3]}, {'text': '2022', 'index': [4, 5, 6]}, {'text': ',', 'index': [7]}, {'text': 'je', 'index': [8]}, {'text': 'lep', 'index': [9]}, {'text': 'sončen', 'index': [10]}, {'text': 'dan', 'index': [11]}, {'text': ',', 'index': [12]}, {'text': 'saj', 'index': [13]}, {'text': 'je', 'index': [14]}, {'text': 'zunaj', 'index': [15]}, {'text': 'prijetnih', 'index': [16]}, {'text': '25', 'index': [17]}, {'text': '°C', 'index': [18, 19]}, {'text': '.', 'index': [20]}], 'denormalized_string': 'Danes, 8. 7. 2022, je lep sončen dan, saj je zunaj prijetnih 25 °C.'}
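In the example output, each "index" list appears to hold the positions of the input tokens that a written-form token replaces. The sketch below works under that assumption and pairs each rewritten token with its spoken-form source. It does not call the denormalizer itself: `tokenize` and `align` are hypothetical helpers, the tokenization rule is a guess that happens to match the example, and `result` is an excerpt of the output shown above.

```python
import re

def tokenize(text):
    # Assumed tokenization: each word and each punctuation mark is one token.
    return re.findall(r"\w+|[^\w\s]", text)

def align(spoken_text, result):
    """Pair each written-form token with the spoken-form tokens it replaces."""
    tokens = tokenize(spoken_text)
    return [
        (" ".join(tokens[i] for i in item["index"]), item["text"])
        for item in result["denormalized_content"]
    ]

spoken = ("Danes, osmega sedmega dva tisoč dvaindvajset, je lep sončen dan, "
          "saj je zunaj prijetnih petindvajset stopinj Celzija.")

# Excerpt of the example output: only the tokens that were rewritten.
result = {"denormalized_content": [
    {"text": "8.", "index": [2]},
    {"text": "7.", "index": [3]},
    {"text": "2022", "index": [4, 5, 6]},
    {"text": "25", "index": [17]},
    {"text": "°C", "index": [18, 19]},
]}

rewritten = align(spoken, result)
for source, target in rewritten:
    print(f"{source!r} -> {target!r}")
```

An alignment like this could be useful when timestamps or confidence scores attached to the spoken-form tokens need to be carried over to the written-form output.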


