Active filters:

  • Metadata provider: DSpace
  • Project: Clarin-PL
8 record(s) found

Search results

  • WebStylo

    Web-based, open stylometry system based on Multilevel Text Analysis. Runs the cluto and stylo (R) clustering methods. Built on a Natural Language Processing workflow engine (included in the distribution).
  • Liner2.5 model NER

    Prepared by: Michał Marcińczuk <marcinczuk@gmail.com> Date: 25.05.2016 Project: Clarin-PL (http://clarin-pl.eu) Authors: Michał Marcińczuk, Jan Kocoń, Michał Krautforst. Models for the Liner2.5 named entity recognition tool. The Liner2.5 tool is available at http://hdl.handle.net/11321/231. The package contains three models: 1. config-nam.ini: named entity boundaries, 2. config-top9.ini: boundaries and coarse-grained categorization of entities (9 categories), 3. config-n82.ini: boundaries and fine-grained categorization of entities (82 categories).
  • Polish Speech Services

    This archive contains the source code and configuration of the speech tools web service available at http://mowa.clarin-pl.eu/mowa. The services provided include: speech-to-text alignment, speaker diarization, speech transcription, speech activity detection and noise classification, and keyword spotting.
  • Long term archive operating system source code

    This submission contains the operating system of the long-term archive built at the Polish-Japanese Academy of Information Technology for the Clarin-PL project. The basic elements of the archive are data nodes equipped with mass storage. The nodes are controlled by embedded low-power computers that are powered up independently, only when their storage is about to be accessed. This both limits overall energy consumption and lowers environmental demands (no air conditioning is needed). The nodes are grouped in trays. The basic, recommended configuration allows 30 nodes per tray, but this limit can be extended up to 253. Each tray contains several networks designed for data transport, device state control, and power supply. Communication with clients is conducted through buffers, which are the only parts visible from externally connected networks. Stored files are therefore completely isolated and cannot be accessed directly.

    Multiple trays located at a single physical site form a complete archive. Storage space can be split into virtual archives that are separated at the logical level. The operating system of the data network can store from 3 to 7 copies of a single digital file on different nodes. Additional copies of a resource may also be stored automatically in remotely located archives. The trays are treated as local parts of a wider, dispersed data network structure. The archive software not only provides secure read and write operations but also automatically takes care of the stored data: it periodically regenerates the physical state of saved files, and in case of device failure clients are transparently redirected to local or remote redundant copies.

    A mechanism of "software bots" is implemented: the archive can be supplied with external programs that process files stored inside the data network. This allows for data analysis, indexing, post-data creation, statistical computations, or finding associations in unstructured Big Data sets. Only the output of a software bot can be accessed externally, which makes such operations very secure. Client programs communicate with the archive using a set of simple protocols based on key-value pair strings, which makes it convenient to build web interfaces for archive access and administration. By automating the supervision of resources, reducing storage requirements, and precisely controlling energy consumption, the proposed solution significantly lowers the cost of long-term data storage.
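
    The record does not document the archive's wire format, so the following is only a minimal sketch of what a protocol "based on key-value pair strings" could look like. The `key=value;key=value` layout, the percent-escaping, and the field names `op` and `file` are all assumptions made for illustration, not the archive's actual protocol.

```python
from urllib.parse import quote, unquote


def encode_message(fields: dict[str, str]) -> str:
    """Serialize fields as 'key=value;key=value' (hypothetical format).

    Keys and values are percent-escaped so that '=' and ';' inside
    them cannot be confused with the separators.
    """
    return ";".join(
        f"{quote(key, safe='')}={quote(value, safe='')}"
        for key, value in fields.items()
    )


def decode_message(line: str) -> dict[str, str]:
    """Parse a string produced by encode_message back into a dict."""
    fields: dict[str, str] = {}
    for pair in line.split(";"):
        if not pair:
            continue  # tolerate empty segments such as a trailing ';'
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in pair: {pair!r}")
        fields[unquote(key)] = unquote(value)
    return fields


# Hypothetical request; 'op' and 'file' are illustrative field names.
request = encode_message({"op": "read", "file": "a=b;c.txt"})
assert decode_message(request) == {"op": "read", "file": "a=b;c.txt"}
```

    A flat string format of this kind is easy to produce and parse from a web front end, which is consistent with the convenience the description claims for building web interfaces.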