Active filters:

  • Metadata provider: DSpace
  • Keywords: punctuation
3 record(s) found

Search results

  • NeMo Punctuation and Capitalisation service RSDO-DS2-P&C-API 1.0

    Punctuation and Capitalisation service for NeMo models. For details on building such models, see the official NVIDIA NeMo documentation (https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/punctuation_and_capitalization.html) and the NVIDIA NeMo GitHub repository (https://github.com/NVIDIA/NeMo). A model for restoring punctuation and capitalisation in lowercased, unpunctuated Slovene text can be downloaded from http://hdl.handle.net/11356/1735. The service accepts either a single string or a list of strings for which punctuation and capitalisation should be restored, and returns the result in the same format as the request. The maximum accepted text length is 5000 characters. Note that processing one 5000-character text block on a CPU uses all available cores and may take about 30 seconds (on a system with 24 vCPUs). See the service README.md for further details.
  • Punctuation model (20.09)

    A Python package that punctuates Icelandic text. It takes unpunctuated text as input and returns punctuated text. The user can choose between two punctuation models: a BERT-based Transformer and a bidirectional RNN ([Punctuator 2](www.github.com/ottokart/punctuator2)) in TensorFlow 2.
  • Slovene Punctuation and Capitalisation model RSDO-DS2-P&C 3.6

    This Punctuation and Capitalisation model was trained following the NVIDIA NeMo Punctuation and Capitalisation recipe (for details, see the official NVIDIA NeMo P&C documentation, https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/punctuation_and_capitalization.html, and the NVIDIA NeMo GitHub repository, https://github.com/NVIDIA/NeMo). It restores punctuation (, . ! ?) and capital letters in lowercased, unpunctuated Slovene text. The training corpus was built from publicly available datasets and a small portion of proprietary data. In total, the training corpus consisted of 38,829,529 sentences and the validation corpus of 2,092,497 sentences.
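
The NeMo service entry above states two input constraints: a request is either a single string or a list of strings, and each text may be at most 5000 characters long. The following is a minimal client-side sketch of how text could be prepared under those constraints. The helper names `split_text` and `validate_input` are hypothetical and not part of the service, and breaking at spaces is an assumption made for illustration. The actual endpoint and request format are documented in the service README.md and are not shown here.

```python
from typing import List, Union

# Limit taken from the service description; everything else here is illustrative.
MAX_CHARS = 5000

TextInput = Union[str, List[str]]


def split_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Breaks at the last space that fits; falls back to a hard cut
    when a chunk contains no space at all.
    """
    chunks: List[str] = []
    rest = text.strip()
    while len(rest) > limit:
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: hard cut at the limit
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks


def validate_input(payload: TextInput, limit: int = MAX_CHARS) -> TextInput:
    """Check that payload is a string or list of strings within the limit."""
    items = [payload] if isinstance(payload, str) else payload
    for index, item in enumerate(items):
        if not isinstance(item, str):
            raise TypeError(f"item {index} is not a string")
        if len(item) > limit:
            raise ValueError(f"item {index} exceeds {limit} characters")
    return payload
```

A longer document could be passed through `split_text` first and the resulting list checked with `validate_input` before it is sent as a single list-of-strings request. Since the description says the response mirrors the request format, the restored chunks would come back as a list in the same order.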