Active filters:

  • Language: Dutch
  • Availability: Other
6 record(s) found

Search results

  • WebCelex

    WebCelex is a web-based interface to the CELEX lexical databases of English, Dutch and German. CELEX was developed as a joint enterprise of the University of Nijmegen, the Institute for Dutch Lexicology in Leiden, the Max Planck Institute for Psycholinguistics in Nijmegen, and the Institute for Perception Research in Eindhoven. For each language, the database contains detailed information on: orthography (variations in spelling, hyphenation), phonology (phonetic transcriptions, variations in pronunciation, syllable structure, primary stress), morphology (derivational and compositional structure, inflectional paradigms), syntax (word class, word class-specific subcategorizations, argument structures) and word frequency (summed word and lemma counts, based on recent and representative text corpora).
  • Fast and easy development of pronunciation lexicons for names

    The AUTONOMATA transcription tool set consists of a transcription tool and learning tools, with which one can enrich word lists with precise information on the pronunciation. The tool uses a general grapheme-to-phoneme converter (the g2p-converter).
    This STEVIN project is about the investigation of new pronunciation modeling technologies that can improve the automatic recognition of spoken names in the context of a POI (Point-of-Interest) information providing business service. Collaboration with RU (Nijmegen), UiL (Utrecht), Nuance and TeleAtlas.
  • Evaluating Repetitions, or how to Improve your Multilingual ASR System by doing Nothing

    A demo of a speech recognizer for POIs (Points of Interest). This demo recognizes stay-over addresses and eateries in some big cities (inter alia Amsterdam, Antwerpen, Gent, Rotterdam).
    This STEVIN project is about the investigation of new pronunciation modeling technologies that can improve the automatic recognition of spoken names in the context of a POI (Point-of-Interest) information providing business service. Collaboration with RU (Nijmegen), UiL (Utrecht), Nuance and TeleAtlas.
  • Cornetto: Combinatorial and Relational Network as Toolkit for Dutch Language Technology

    Cornetto is a lexical resource for the Dutch language which combines two resources with different semantic organisations: the Dutch Wordnet with its synset organisation and the Dutch Reference Lexicon which includes definitions, usage constraints, selectional restrictions, syntactic behaviours, illustrative contexts, etc. The Cornetto database contains over 92K lemmas and almost 120K word meanings. The Cornetto lexical resource for Dutch covers the most generic and central part of the language. Cornetto combines the structures of the Princeton Wordnet, some of the features from the FrameNet for English and the information on morphological, syntactic, semantic and combinatorial features of lexemes normally found in dictionaries. The Cornetto resource is compiled by combining and aligning two existing semantic resources for Dutch: the Dutch wordnet (DWN) and the Referentie Bestand Nederlands (RBN). The resource has recently been revised and extended with sentiment values in the From Text to Political Positions project, and with semantic annotations in SONAR, CGN and texts from the Web in the DutchSemCor project. The Cornetto Lexical Resource consists of two large repositories of lexicon data: the lexical entry repository and the synset repository. A Lexical Entry (LE) is a word-meaning pair (i.e. a single meaning of a certain word form), for which morphological, syntactical, semantical and combinatorial information is given. As such, LEs are word senses in the lexical semantic tradition, containing the linguistic knowledge that is needed to properly use the word in a specific meaning in a language. Since the LEs follow a word-to-meaning view, the semantical and combinatorial information for each meaning clarifies the differences across the meanings. LEs focus on the polysemy of words and typically follow an approach to represent condensed and generalised meanings from which more specific ones can be derived.
    Each LE is aligned with a synset (set of synonyms) in the synset repository. As such, a synset can be seen as a set of LEs with the same meaning and every synset stands for a concept. The synsets in Cornetto are interconnected by different semantic relations such as hyponymy, antonymy and meronymy. The Cornetto resource is aligned with the English Wordnet, from which domain information was imported. The domains represent clusters of concepts that are related by a shared area of interest, such as sport, education or politics. The definitions of LEs from the same synset should be semantically equivalent and the LEs of a single word form should belong to different synsets. The LEs of a single word form typically differ in terms of connotation, pragmatics, syntax and semantics, whereas synonymous words in the same synset can be differentiated by connotation, pragmatics and syntax but not semantics. This structure of the resource makes it possible to combine the very detailed information on form and usage of a specific LE or a group of LEs with the semantic relations which are specified in the corresponding synset(s). For an open-source lexico-semantic database for Dutch, see the Open Source Dutch Wordnet (ODWN): http://wordpress.let.vupr.nl/odwn/
    Vossen, P., I. Maks, R. Segers, H. van der Vliet, M.F. Moens, K. Hofmann, E. Tjong Kim Sang, M. de Rijke (2013), Cornetto: a lexical semantic database for Dutch, Chapter in: P. Spyns and J. Odijk (eds): Essential Speech and Language Technology for Dutch, Results by the STEVIN-programme, Publ. Springer series Theory and Applications of Natural Language Processing, ISBN 978-3-642-30909-0.
    Vossen, P., I. Maks, R. Segers and H. van der Vliet (2008). Integrating Lexical Units, Synsets, and Ontology in the Cornetto Database. In Proceedings of LREC-2008, Marrakech, Morocco.
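    The two-repository organisation described for Cornetto (lexical entries aligned with synsets, and synsets linked by semantic relations) can be illustrated with a minimal sketch. All class names, field names and identifiers below are hypothetical and chosen for illustration only; they are not Cornetto's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A concept; hypothetical stand-in for an entry in the synset repository."""
    synset_id: str
    # relation name (e.g. "hyponym_of") -> list of related synset ids
    relations: dict = field(default_factory=dict)


@dataclass
class LexicalEntry:
    """A word-meaning pair (LE) aligned with exactly one synset."""
    le_id: str
    word_form: str
    synset_id: str


def synonyms(entries, synset_id):
    """Word forms of all LEs aligned with the given synset."""
    return sorted(le.word_form for le in entries if le.synset_id == synset_id)


def forms_with_shared_synset(entries):
    """Word forms with two LEs in the same synset.

    The description says the LEs of a single word form should belong to
    different synsets, so a non-empty result flags a violation.
    """
    seen, clashes = set(), set()
    for le in entries:
        key = (le.word_form, le.synset_id)
        if key in seen:
            clashes.add(le.word_form)
        seen.add(key)
    return sorted(clashes)


# Illustrative data (invented ids): Dutch "bank" as a financial institution
# and as a piece of furniture, the latter sharing a synset with "sofa".
SYNSETS = {
    "s_institution": Synset("s_institution"),
    "s_furniture": Synset("s_furniture"),
    "s_couch": Synset("s_couch", {"hyponym_of": ["s_furniture"]}),
}
ENTRIES = [
    LexicalEntry("le1", "bank", "s_institution"),
    LexicalEntry("le2", "bank", "s_couch"),
    LexicalEntry("le3", "sofa", "s_couch"),
]

couch_synonyms = synonyms(ENTRIES, "s_couch")
violations = forms_with_shared_synset(ENTRIES)
```

    In this sketch the two senses of "bank" sit in different synsets, so the check reports no violations, while "bank" and "sofa" are returned as synonyms for the shared synset.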
  • PaQu - Parse and Query

    PaQu uses the Alpino parser to make treebanks of your own text corpus, and to search in these treebanks using an interface based on the LASSY Word Relations Search interface (http://dev.clarin.nl/node/1966). Several treebanks are already available in the application, such as: Lassy Klein (1M words, manually checked syntactic analysis) and Lassy Groot (700M words, syntactic analysis automatically assigned by Alpino). PaQu offers two ways to search through the syntactically annotated texts. The first option is to use the search bar to look for word pairs, optionally complemented by their syntactic relationship. The second search option is to use the query language XPath.
    Odijk, J, van Noord, G, Kleiweg, P and Tjong Kim Sang, E. 2017. The Parse and Query (PaQu) Application. In: Odijk, J and van Hessen, A. (eds.) CLARIN in the Low Countries, Pp. 281–297. London: Ubiquity Press. DOI: https://doi.org/10.5334/bbi.23. License: CC-BY 4.0
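    As a rough illustration of the XPath search option, the sketch below runs two simple queries over a small hand-written dependency tree. The XML fragment is a simplified assumption modelled loosely on Alpino-style output (nested node elements with rel, cat, pt and word attributes); it was not produced by Alpino or PaQu. Python's standard library supports only a subset of XPath, so the queries use chained attribute predicates.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified tree for "Jan leest een boek" (assumed layout).
SAMPLE = """
<alpino_ds>
  <node cat="top" rel="top">
    <node cat="smain" rel="--">
      <node rel="su" pt="n" word="Jan" lemma="Jan"/>
      <node rel="hd" pt="ww" word="leest" lemma="lezen"/>
      <node cat="np" rel="obj1">
        <node rel="det" pt="lid" word="een" lemma="een"/>
        <node rel="hd" pt="n" word="boek" lemma="boek"/>
      </node>
    </node>
  </node>
</alpino_ds>
"""


def find_words(xml_text, xpath):
    """Return the word attribute of every node matched by the XPath query."""
    root = ET.fromstring(xml_text.strip())
    return [n.get("word") for n in root.findall(xpath) if n.get("word")]


# All nodes that carry the subject relation.
subjects = find_words(SAMPLE, ".//node[@rel='su']")

# All nodes that are both a head and a noun (two chained predicates).
noun_heads = find_words(SAMPLE, ".//node[@rel='hd'][@pt='n']")
```

    Here the first query matches "Jan" and the second matches "boek". A full XPath engine would also allow richer conditions, for example on the relation between a node and its ancestors.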
  • Syntactic Profiler of Dutch

    SPOD is a syntactic profiler that covers a broad spectrum of properties. It is part of the PaQu application but has its own interface with a menu of predefined queries. It can be used to provide general information about corpus properties, such as the number of main and subordinate clauses, types of main and subordinate clauses, and their frequencies, average length of clauses (per clause type: e.g. relative clauses, indirect questions, finite complement clauses, infinitival clauses, finite adverbial clauses, etc.). It yields output in HTML and tab-separated text format.
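    The kind of per-clause-type summary described above (frequency and average length, written as tab-separated text) can be sketched as follows. The input format, the clause-type labels and the column names are assumptions made for illustration; they do not reflect SPOD's actual queries or output layout.

```python
from collections import defaultdict


def profile(clauses):
    """Summarise (clause_type, length_in_words) pairs as tab-separated text."""
    totals = defaultdict(lambda: [0, 0])  # type -> [clause count, word count]
    for clause_type, length in clauses:
        totals[clause_type][0] += 1
        totals[clause_type][1] += length

    lines = ["clause_type\tcount\tavg_length"]
    for clause_type in sorted(totals):
        count, words = totals[clause_type]
        lines.append(f"{clause_type}\t{count}\t{words / count:.2f}")
    return "\n".join(lines)


# Invented example input: clause type and clause length in words.
EXAMPLE = [
    ("main", 7),
    ("relative", 4),
    ("main", 9),
    ("indirect_question", 5),
]

report = profile(EXAMPLE)
```

    For the invented input, the report contains a header row followed by one row per clause type, with the two main clauses averaging 8.00 words.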