Active filters:

  • Metadata provider: DSpace
  • Keywords: multiword expressions
3 record(s) found

Search results

  • The Database of Lithuanian multiword expressions

    The Database of Lithuanian multiword expressions (MWEs) has been freely accessible for online search at https://resursai.pastovu.vdu.lt/paieska/paprastoji since 2019. It contains two-word and three-word MWEs extracted from the DELFI.lt corpus, which represents news texts on various topics (https://klc.vdu.lt/pastovuSearch.html). Initially, 12,000 MWEs (mostly collocations, a few idioms) were included in the database. In 2022, the database was updated by adding new collocations from the same corpus and by flagging arbitrary collocations: out of approximately 19,000 collocations, approximately 9,000 are marked as arbitrary collocations, i.e., as having lexical collocability restrictions. The database provides rich information about the usage of collocations: lemma, word forms, frequencies (in the DELFI.lt corpus), morphological information, syntactic relations, grammatical variants, text genres, and usage examples. Cases of usage variation are also illustrated, for example, word order changes or insertions between collocation constituents.
  • Colloc -- A Tool for Automatic Identification of Multiword Expressions

    Colloc, a tool for automatic identification of multiword expressions (MWEs), is freely available for online use at http://resursai.mwe.lt/atpazintuvas. The DELFI.lt corpus (http://tekstynas.mwe.lt/) was used as training material. Identification relies on a combination of two trained models (an RNN bi-LSTM and a CRF). Automatically identified MWEs can be retrieved in two formats: a list of MWEs and/or the text with annotated MWEs.
  • Dependency tree extraction tool STARK 3.0

    STARK is a highly customizable tool designed for extracting different types of syntactic structures (trees) from parsed corpora (treebanks), aimed at corpus-driven linguistic investigations of syntactic and lexical phenomena of various kinds. It takes a treebank in the CoNLL-U format as input and returns a list of all relevant dependency trees with frequency information and other useful statistics, such as the strength of association between the nodes of a tree, or its significance in comparison to another treebank. For installation, execution and the description of various user-defined parameter settings, see the official project page at: https://github.com/clarinsi/STARK. An online demo version of the tool is available at: https://orodja.cjvt.si/stark/. In comparison to v2, this version introduces several new features and improvements, such as the ability to extract very long trees, ignore irrelevant relations, process multi-root treebanks, or handle special operators when querying.
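
To give a rough idea of what "a list of dependency trees with frequency information" can look like, the following is a minimal sketch that counts the simplest possible trees (one head, one dependent) in a CoNLL-U treebank. It is not STARK's code and does not use STARK's interface: the function names `read_sentences` and `count_two_node_trees` and the two-sentence sample treebank are hypothetical, chosen only for illustration. STARK itself covers larger trees, association measures and treebank comparison, none of which is attempted here.

```python
from collections import Counter

# Hypothetical two-sentence treebank in the 10-column CoNLL-U format
# (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC).
SAMPLE = """\
# text = The dog barks.
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_
4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_

# text = The cat sleeps.
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tcat\tcat\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tsleeps\tsleep\tVERB\t_\t_\t0\troot\t_\t_
4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_
"""


def read_sentences(text):
    """Yield each sentence as a list of 10-column token rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if rows:
                yield rows
                rows = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Keep only regular tokens; skip multiword ranges (1-2) and empty nodes (1.1).
        if len(cols) == 10 and cols[0].isdigit():
            rows.append(cols)
    if rows:
        yield rows


def count_two_node_trees(text):
    """Count (head UPOS, relation, dependent UPOS) triples across all sentences."""
    counts = Counter()
    for rows in read_sentences(text):
        upos_by_id = {row[0]: row[3] for row in rows}
        for row in rows:
            head, deprel, upos = row[6], row[7], row[3]
            if head not in upos_by_id:
                continue  # root token (HEAD = 0) or unresolved head
            counts[(upos_by_id[head], deprel, upos)] += 1
    return counts


if __name__ == "__main__":
    for tree, freq in count_two_node_trees(SAMPLE).most_common():
        print(freq, tree)
```

Keying the counts on part-of-speech tags rather than word forms is one arbitrary choice among many; using `row[2]` (the lemma) instead would give lexical rather than syntactic patterns.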