Result filters

Metadata provider

Language

Resource type

Availability

Project

  • Language Technology for Icelandic 2019-2023

Active filters:

  • Project: Language Technology for Icelandic 2019-2023
  • Tool task: Parsing
Loading...
10 record(s) found

Search results

  • Miðeind's Neural Constituency Parser - v. 1.0

    The Miðeind neural constituency parser is an experimental variant of the Berkeley neural parser architecture. It is self-contained and conveniently plug-and-play via a docker image. Currently POS tags are not part of its constituency trees. The input to the parser is a full path to a text file (${INPUT_FILE}) where each line contains a sentence that will be parsed. No prior tokenization is required. The output file will be located in ${OUTPUT_DIR}/output.txt and the output format is line-separated bracketed trees . To run the parser use the following: docker run --volume ${INPUT_FILE}:/data/input.txt --volume ${OUTPUT_DIR}:/data/ mideind/neural-parser:${TAG} The output follows the bracketed tree format described at https://www.ling.upenn.edu/~janabeck/tutorial.html --- Tauganetsþáttari Miðeindar er tilraunaafbrigði af Berkeley tauganetsþáttaranum. Þáttarinn skilar stofnliðatrjám án POS-marka (eins og er). Inntakið í þáttarann er full algjör slóð texta að skrá (${INPUT_FILE}) þar sem hver lína geymir eina málsgrein. Eftir keyrslu má finna úttakið í skránni ${OUTPUT_DIR}/output.txt þar sem úttakssniðið er tré á svigaformi með auðri línu á milli . Til að keyra þáttarann skal nota: docker run --volume ${INPUT_FILE}:/data/input.txt --volume ${OUTPUT_DIR}:/data/ mideind/neural-parser:${TAG} (edited)
  • Biaffine-based UD Parser 22.10

    ENGLISH: This Universal Dependencies parser for Icelandic was trained with Diaparser [1] on IcePaHC [2] and UD_Icelandic-Modern [3], the latter one having been revised before training, as some duplicate sentences had to be removed. The parser utilizes information from an ELECTRA language model [4]. Its UAS (unlabeled attachment score) is 89.52 and its LAS (labeled attachment score) is 86.23.
  • GreynirPackage 3.5.2 (22.10)

    GreynirPackage is a Python 3 package for working with Icelandic natural language text. Greynir can parse text into sentence trees, find lemmas, inflect noun phrases, assign part-of-speech tags and much more. Greynir's sentence trees can inter alia be used to extract information from text, for instance about people, titles, entities, facts, actions and opinions. Greynir uses the Tokenizer package, by the same authors, to tokenize text (see http://hdl.handle.net/20.500.12537/262). More information at https://github.com/icelandic-lt/GreynirEngine and detailed documentation at https://greynir.is/doc/. GreynirPackage er Python 3 pakki sem vinnur með íslenskan texta. Greynir þáttar texta í setningar, lemmar og markar texta, beygir nafnliði og margt fleira. Hægt er að nýta þáttunartrén sem tólið býr til í þeim tilgangi að draga upplýsingar út úr texta, til dæmis um manneskjur, starfstitla, sérnafnaeiningar, staðreyndir, atburði og skoðanir. Greynir notar Tokenizer-pakkann, eftir sömu höfunda, til að tilreiða texta (sjá http://hdl.handle.net/20.500.12537/262). Frekari upplýsingar má finna á https://github.com/icelandic-lt/GreynirEngine og ítarlega skjölun (á ensku) á https://greynir.is/doc/.
  • COMBO-based UD Parser 22.10

    ENGLISH: This Universal Dependencies parser for Icelandic was trained with COMBO on IcePaHC and UD_Icelandic-Modern, the latter one having been revised before training, as some duplicate sentences had to be removed. It utilizes information from an ELECTRA language model (https://huggingface.co/jonfd/electra-base-igc-is). Its UAS (unlabeled attachment score) is 89.13 and its LAS (labeled attachment score) is 85.97.
  • GreynirPackage (2021-05-12)

    GreynirPackage is a Python 3 package for working with Icelandic natural language text. Greynir can parse text into sentence trees, find lemmas, inflect noun phrases, assign part-of-speech tags and much more. Greynir's sentence trees can inter alia be used to extract information from text, for instance about people, titles, entities, facts, actions and opinions. Greynir uses the Tokenizer package, by the same authors, to tokenize text. More information at https://github.com/mideind/GreynirPackage and detailed documentation at https://greynir.is/doc/. GreynirPackage er Python 3 pakki sem vinnur með íslenskan texta. Greynir þáttar texta í setningar, lemmar og markar texta, beygir nafnliði og margt fleira. Hægt er að nýta þáttunartrén sem tólið býr til í þeim tilgangi að draga upplýsingar út úr texta, til dæmis um manneskjur, starfstitla, sérnafnaeiningar, staðreyndir, atburði og skoðanir. Greynir notar Tokenizer-pakkann, eftir sömu höfunda, til að tilreiða texta. Frekari upplýsingar má finna á https://github.com/mideind/GreynirPackage og ítarlega skjölun (á ensku) á https://greynir.is/doc/.
  • COMBO-based UD Parser for Icelandic 22.12

    ENGLISH: This Universal Dependencies parser for Icelandic was trained with COMBO [1]. This version of it was trained on v2.11 of UD_Icelandic-IcePaHC [2] and UD_Icelandic-Modern [3]. (Note that texts in UD_Icelandic-Modern [3] labeled RUV_TGS_2017 and RUV_ESP_2017 were not included here as these were originally parsed with COMBO-based UD Parser 22.10 [4] and the output subsequently corrected.) The parser utilizes information from an ELECTRA language model [4]. Its UAS (unlabeled attachment score) is 88.80 (89.00 on a pre-tokenized text file) and its LAS (labeled attachment score) is 85.52 (85.71 if pre-tokenized).   ICELANDIC: Þessi UD-þáttari var þjálfaður með COMBO [1]. Hann var þjálfaður á útgáfu 2.11 af UD_Icelandic-IcePaHC [2] og UD_Icelandic-Modern [3]. (Ath. að textar í UD_Icelandic-Modern [3] merktir RUV_TGS_2017 og RUV_ESP_2017 voru ekki notaðir við þjálfunina þar sem þeir voru upphaflega þáttaðir með COMBO-based UD Parser 22.10 [4] og úttakið leiðrétt að því loknu.) Þáttarinn nýtir sér upplýsingar úr ELECTRA-mállíkani [5]. Hann skorar 88.80 (89.00 á fortókuðu skjali) á UAS (unlabeled attachment score) og 85.52 (85.71 á fortókuðu skjali) á LAS (labeled attachment score). [1] COMBO: https://gitlab.clarin-pl.eu/syntactic-tools/combo/  [2] UD_Icelandic-IcePaHC: https://github.com/UniversalDependencies/UD_Icelandic-IcePaHC/  [3] UD_Icelandic-Modern: https://github.com/UniversalDependencies/UD_Icelandic-Modern/  [4] COMBO-based UD Parser 22.10: http://hdl.handle.net/20.500.12537/272 [5] electra-base-igc-is: https://huggingface.co/jonfd/electra-base-igc-is
  • GreynirPackage v3.5.1

    GreynirPackage is a Python 3 package for working with Icelandic natural language text. Greynir can parse text into sentence trees, find lemmas, inflect noun phrases, assign part-of-speech tags and much more. Greynir's sentence trees can inter alia be used to extract information from text, for instance about people, titles, entities, facts, actions and opinions. Greynir uses the Tokenizer package, by the same authors, to tokenize text. More information at https://github.com/mideind/GreynirPackage and detailed documentation at https://greynir.is/doc/. GreynirPackage er Python 3 pakki sem vinnur með íslenskan texta. Greynir þáttar texta í setningar, lemmar og markar texta, beygir nafnliði og margt fleira. Hægt er að nýta þáttunartrén sem tólið býr til í þeim tilgangi að draga upplýsingar út úr texta, til dæmis um manneskjur, starfstitla, sérnafnaeiningar, staðreyndir, atburði og skoðanir. Greynir notar Tokenizer-pakkann, eftir sömu höfunda, til að tilreiða texta. Frekari upplýsingar má finna á https://github.com/mideind/GreynirPackage og ítarlega skjölun (á ensku) á https://greynir.is/doc/.
  • GreynirPackage 3.1.0

    GreynirPackage is a Python 3 package for working with Icelandic natural language text. Greynir can parse text into sentence trees, find lemmas, inflect noun phrases, assign part-of-speech tags and much more. Greynir's sentence trees can inter alia be used to extract information from text, for instance about people, titles, entities, facts, actions and opinions. Greynir uses the Tokenizer package, by the same authors, to tokenize text. More information at https://github.com/mideind/GreynirPackage and detailed documentation at https://greynir.is/doc/. GreynirPackage er Python 3 pakki sem vinnur með íslenskan texta. Greynir þáttar texta í setningar, lemmar og markar texta, beygir nafnliði og margt fleira. Hægt er að nýta þáttunartrén sem tólið býr til í þeim tilgangi að draga upplýsingar út úr texta, til dæmis um manneskjur, starfstitla, sérnafnaeiningar, staðreyndir, atburði og skoðanir. Greynir notar Tokenizer-pakkann, eftir sömu höfunda, til að tilreiða texta. Frekari upplýsingar má finna á https://github.com/mideind/GreynirPackage og ítarlega skjölun (á ensku) á https://greynir.is/doc/.
  • GreynirPackage 2.6.1

    GreynirPackage is a Python 3 package for working with Icelandic natural language text. Greynir can parse text into sentence trees, find lemmas, inflect noun phrases, assign part-of-speech tags and much more. Greynir's sentence trees can inter alia be used to extract information from text, for instance about people, titles, entities, facts, actions and opinions. Greynir uses the Tokenizer package, by the same authors, to tokenize text. More information at https://github.com/mideind/GreynirPackage and detailed documentation at https://greynir.is/doc/. GreynirPackage er Python 3 pakki sem vinnur með íslenskan texta. Greynir þáttar texta í setningar, lemmar og markar texta, beygir nafnliði og margt fleira. Hægt er að nýta þáttunartrén sem tólið býr til í þeim tilgangi að draga upplýsingar út úr texta, til dæmis um manneskjur, starfstitla, sérnafnaeiningar, staðreyndir, atburði og skoðanir. Greynir notar Tokenizer-pakkann, eftir sömu höfunda, til að tilreiða texta. Frekari upplýsingar má finna á https://github.com/mideind/GreynirPackage og ítarlega skjölun (á ensku) á https://greynir.is/doc/.
  • Biaffine-based UD Parser for Icelandic 22.12

    ENGLISH: This Universal Dependencies parser for Icelandic was trained with Diaparser [1]. This version of it was trained on v2.11 of UD_Icelandic-IcePaHC [2] and UD_Icelandic-Modern [3]. (Note that texts in UD_Icelandic-Modern [3] labeled RUV_TGS_2017 and RUV_ESP_2017 were not included here as these were originally parsed with COMBO-based UD Parser 22.10 [4] and the output subsequently corrected.) The parser utilizes information from an ELECTRA language model [5]. Its UAS (unlabeled attachment score) is 89.58 and its LAS (labeled attachment score) is 86.46.   ICELANDIC: Þessi UD-þáttari var þjálfaður með Diaparser [1]. Þessi útgáfa hans var þjálfuð á útgáfu 2.11 af UD_Icelandic-IcePaHC [2] og UD_Icelandic-Modern [3]. (Ath. að textar í UD_Icelandic-Modern [3] merktir RUV_TGS_2017 og RUV_ESP_2017 voru ekki notaðir við þjálfunina þar sem þeir voru upphaflega þáttaðir með COMBO-based UD Parser 22.10 [4] og úttakið leiðrétt að því loknu.) Þáttarinn nýtir sér upplýsingar úr ELECTRA-mállíkani [5]. Hann skorar 89.58 á UAS (unlabeled attachment score) og 86.46 á LAS (labeled attachment score). [1] Diaparser: https://github.com/Unipisa/diaparser  [2] UD_Icelandic-IcePaHC: https://github.com/UniversalDependencies/UD_Icelandic-IcePaHC/  [3] UD_Icelandic-Modern: https://github.com/UniversalDependencies/UD_Icelandic-Modern/  [4] COMBO-based UD Parser 22.10: http://hdl.handle.net/20.500.12537/272 [5] electra-base-igc-is: https://huggingface.co/jonfd/electra-base-igc-is