Web-based, open stylometry system built on Multilevel Text Analysis. Runs the CLUTO and stylo (R package) clustering methods. Based on a Natural Language Processing workflow engine (included in the distribution).
Liner2.6 NER NKJP model
The package contains a pre-trained Liner2 (https://github.com/CLARIN-PL/Liner2) model for recognizing named entities according to the NKJP guidelines. The model was trained on the NKJP corpus (http://nkjp.pl/) and evaluated in PolEval 2018 Task 2 (http://poleval.pl/tasks/).
The model won third place with the following results: Exact — 0.778, Overlap — 0.818, Final — 0.810.
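The three scores above are related: the reported Final value is consistent with a weighted combination of the Overlap and Exact scores. The sketch below assumes the weighting 0.8 for Overlap and 0.2 for Exact, which is our understanding of the PolEval 2018 Task 2 scoring but is not stated in this description; the function name `final_score` is hypothetical.

```python
def final_score(exact: float, overlap: float,
                w_overlap: float = 0.8, w_exact: float = 0.2) -> float:
    """Combine the two NER scores into a single value.

    The default weights (0.8 overlap, 0.2 exact) are an assumption
    about the PolEval 2018 Task 2 scoring, not taken from this package.
    """
    return w_overlap * overlap + w_exact * exact


# Scores reported for the Liner2.6 NKJP model.
score = final_score(exact=0.778, overlap=0.818)
print(f"{score:.3f}")  # prints 0.810
```

Under these assumed weights the result matches the reported Final score of 0.810.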
References:
* NKJP corpus in TEI format — http://clip.ipipan.waw.pl/NationalCorpusOfPolish?action=AttachFile&do=view&target=NKJP-PodkorpusMilionowy-1.2.tar.gz
* PolEval 2018 Task 2 evaluation corpus — http://mozart.ipipan.waw.pl/~axw/poleval2018/
A set of basic language tools for the Polish language.
Z4.2a Improving the quality of recognition of relations between events using Transformer-type deep networks.