A service that integrates several methods for determining keywords, including generative models and multi-label classification. Combining several techniques is intended to make the resulting keywords more reliable and accurate.
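The description above does not say how the individual methods are combined. As a purely illustrative sketch, one simple option is to average per-keyword scores across methods, so that keywords proposed by several methods rank higher. The function name `combine_keywords`, the score dictionaries, and the averaging rule are all assumptions made for this example and are not taken from the service itself.

```python
from collections import defaultdict


def combine_keywords(method_scores, top_n=5):
    """Average keyword scores over several methods (hypothetical combination rule).

    method_scores: list of dicts, one per method, mapping keyword -> score in [0, 1].
    A keyword that a method does not return counts as score 0 for that method.
    """
    if not method_scores:
        return []
    totals = defaultdict(float)
    for scores in method_scores:
        for keyword, score in scores.items():
            totals[keyword] += score
    averaged = {kw: total / len(method_scores) for kw, total in totals.items()}
    # Sort by descending score, breaking ties alphabetically for a stable order.
    return sorted(averaged.items(), key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Hypothetical outputs of two methods for the same document.
generative = {"topic modeling": 0.9, "corpus": 0.4}
classifier = {"topic modeling": 0.7, "speech": 0.5}
print(combine_keywords([generative, classifier], top_n=3))
```

Averaging rewards agreement between methods; a real system might instead use weighted voting or rank fusion.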
A service that automatically extracts the topics covered in a set of texts. It uses topic modeling (LDA), which detects topics from the co-occurrence of words within documents. The service can assign each document to several topics. Each detected topic is represented as a list of pairs: a word and the probability of its occurrence in the topic. This enables both qualitative analysis (detection of non-obvious topics) and quantitative analysis of the processed texts.
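To make the output format concrete, the following is a minimal sketch of LDA fitted with a toy collapsed Gibbs sampler in plain Python. It is not the service's implementation; the function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the tiny example corpus are assumptions chosen for illustration. It returns each topic as a list of (word, probability) pairs and each document as a mixture over topics.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample a topic in proportion to document fit times word fit.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Each topic: (word, probability) pairs, most probable first.
    topics = [
        sorted(
            ((w, (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta))
             for w in vocab),
            key=lambda pair: pair[1],
            reverse=True,
        )
        for k in range(num_topics)
    ]
    # Each document: probability of every topic, so one document can cover several.
    doc_mix = [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return topics, doc_mix


docs = [
    "cat dog pet cat vet".split(),
    "dog pet leash walk dog".split(),
    "stock market trade price stock".split(),
    "market price bank trade".split(),
]
topics, doc_mix = fit_lda(docs, num_topics=2)
for k, topic in enumerate(topics):
    print(k, [(w, round(p, 2)) for w, p in topic[:3]])
```

In practice a library implementation would be used on real corpora; the sketch only shows how word co-occurrence within documents leads to the word-probability lists described above.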
A service for processing literary texts and extracting statistical information from them. Among other things, it supports lemmatization, part-of-speech tagging, characterization of the verbs used in a text, creation of a list of proper names, and extraction of statistics from a corpus of texts.
A portal that implements a transcription chain for batch processing of speech files using automatic speech recognition, the OCTRA editor, the Munich Automatic Segmentation system (MAUS), and the EMU-webApp viewer.
A set of tools used in natural language processing to assign labels or tags to text elements such as words or tokens. Postagger runs after the text has been analyzed by a morphological or syntactic tagger; it makes the final classification and assigns the appropriate labels to individual text elements.