PICCL: Philosophical Integrator of Computational and Corpus Libraries
<?xml version="1.0" encoding="UTF-8"?>
<cmd:CMD xmlns:cmd="http://www.clarin.eu/cmd/1"
xmlns:cmdp="http://www.clarin.eu/cmd/1/profiles/clarin.eu:cr1:p_1342181139640"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
CMDVersion="1.2"
xsi:schemaLocation="http://www.clarin.eu/cmd/1 https://infra.clarin.eu/CMDI/1.x/xsd/cmd-envelop.xsd http://www.clarin.eu/cmd/1/profiles/clarin.eu:cr1:p_1342181139640 https://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/1.1/profiles/clarin.eu:cr1:p_1342181139640/1.2/xsd">
<cmd:Header>
<cmd:MdCreator>janodijk</cmd:MdCreator>
<cmd:MdCreationDate>2018-08-01+02:00</cmd:MdCreationDate>
<cmd:MdProfile>clarin.eu:cr1:p_1342181139640</cmd:MdProfile>
<cmd:MdCollectionDisplayName>CLARIN Netherlands</cmd:MdCollectionDisplayName>
</cmd:Header>
<cmd:Resources>
<cmd:ResourceProxyList>
<cmd:ResourceProxy id="PICCL001">
<cmd:ResourceType>Resource</cmd:ResourceType>
<cmd:ResourceRef>https://webservices-lst.science.ru.nl/piccl/</cmd:ResourceRef>
</cmd:ResourceProxy>
<cmd:ResourceProxy id="PICCL002">
<cmd:ResourceType>LandingPage</cmd:ResourceType>
<cmd:ResourceRef>https://webservices-lst.science.ru.nl/piccl/info/</cmd:ResourceRef>
</cmd:ResourceProxy>
</cmd:ResourceProxyList>
<cmd:JournalFileProxyList/>
<cmd:ResourceRelationList/>
</cmd:Resources>
<cmd:Components>
<cmdp:ClarinSoftwareDescription>
<cmdp:GeneralInfo>
<cmdp:name xml:lang="eng">PICCL</cmdp:name>
<cmdp:title xml:lang="eng">PICCL: Philosophical Integrator of Computational and Corpus Libraries</cmdp:title>
<cmdp:version>v0.6.4</cmdp:version>
<cmdp:publicationYear>2015</cmdp:publicationYear>
<cmdp:url>https://webservices-lst.science.ru.nl/piccl/</cmdp:url>
<cmdp:CLARINCentre>none yet</cmdp:CLARINCentre>
<cmdp:ReleaseStatus>
<cmdp:LifeCycleStatus>published</cmdp:LifeCycleStatus>
<cmdp:lastUpdate>2018-07-12</cmdp:lastUpdate>
</cmdp:ReleaseStatus>
<cmdp:NationalProjects>
<cmdp:Project>
<cmdp:name>CLARIN-NL</cmdp:name>
<cmdp:title>CLARIN in the Netherlands</cmdp:title>
<cmdp:id>184.021.003</cmdp:id>
<cmdp:funder>NWO</cmdp:funder>
<cmdp:url>http://www.clarin.nl</cmdp:url>
<cmdp:Contact>
<cmdp:Person>Jan Odijk</cmdp:Person>
<cmdp:Role>National Coordinator</cmdp:Role>
<cmdp:Address>Utrecht, the Netherlands</cmdp:Address>
<cmdp:Email>j.odijk@uu.nl</cmdp:Email>
<cmdp:Department>UiL-OTS</cmdp:Department>
<cmdp:Organisation>Utrecht University</cmdp:Organisation>
</cmdp:Contact>
<cmdp:Duration>
<cmdp:StartYear>2009</cmdp:StartYear>
<cmdp:CompletionYear>2015</cmdp:CompletionYear>
</cmdp:Duration>
</cmdp:Project>
<cmdp:Project>
<cmdp:name>CLARIAH-CORE</cmdp:name>
<cmdp:title>Common Lab Research Infrastructure for the Arts and the Humanities</cmdp:title>
<cmdp:id>184.033.101</cmdp:id>
<cmdp:funder>NWO</cmdp:funder>
<cmdp:url>http://www.clariah.nl</cmdp:url>
<cmdp:Contact>
<cmdp:Person>Jan Odijk</cmdp:Person>
<cmdp:Role>National Coordinator</cmdp:Role>
<cmdp:Address>Utrecht, the Netherlands</cmdp:Address>
<cmdp:Email>j.odijk@uu.nl</cmdp:Email>
<cmdp:Department>UiL-OTS</cmdp:Department>
<cmdp:Organisation>Utrecht University</cmdp:Organisation>
</cmdp:Contact>
<cmdp:Duration>
<cmdp:StartYear>2015</cmdp:StartYear>
<cmdp:CompletionYear>2018</cmdp:CompletionYear>
</cmdp:Duration>
</cmdp:Project>
</cmdp:NationalProjects>
<cmdp:Country>
<cmdp:CountryName>Netherlands</cmdp:CountryName>
<cmdp:CountryCoding>NL</cmdp:CountryCoding>
</cmdp:Country>
<cmdp:Description>
<cmdp:Description xml:lang="eng">PICCL is a set of workflows for corpus building through OCR, post-correction, modernization of historical language, and Natural Language Processing. It combines Tesseract optical character recognition, TICCL functionality, and Frog functionality in a single pipeline.
Tesseract is open-source software for optical character recognition.
TICCL (Text Induced Corpus Clean-up) is a system for detecting and correcting typographical errors (misprints) and OCR (optical character recognition) errors in texts. It searches a corpus for all existing variants of (potentially) all words occurring in that corpus. The corpus can be one text or several, in one or more directories, located on one or more machines. TICCL creates word frequency lists that record, for each word type, how often it occurs in the corpus. The frequency of a normalized word form is the sum of the frequencies of the actual word forms found in the corpus. Errors occur when books or other texts are scanned from paper and the resulting images are converted into digital text files. For instance, the letter combination 'in' can be read as 'm', so that the word 'regeering' is incorrectly reproduced as 'regeermg'. TICCL can be used to detect these errors and to suggest a correct form.
Frog enriches textual documents with various linguistic annotations.
</cmdp:Description>
</cmdp:Description>
</cmdp:GeneralInfo>
<cmdp:SoftwareFunction>
<cmdp:toolCategory>written language tool</cmdp:toolCategory>
<cmdp:ToolTasks>
<cmdp:toolTask>optical character recognition</cmdp:toolTask>
<cmdp:toolTask>orthographic normalisation</cmdp:toolTask>
<cmdp:toolTask>sentence splitting</cmdp:toolTask>
<cmdp:toolTask>tokenisation</cmdp:toolTask>
<cmdp:toolTask>dependency parsing</cmdp:toolTask>
<cmdp:toolTask>shallow parsing</cmdp:toolTask>
<cmdp:toolTask>lemmatisation</cmdp:toolTask>
<cmdp:toolTask>morphological analysis</cmdp:toolTask>
<cmdp:toolTask>named entity recognition</cmdp:toolTask>
<cmdp:toolTask>part of speech tagging</cmdp:toolTask>
</cmdp:ToolTasks>
<cmdp:ResearchPhases>
<cmdp:ResearchPhase>Enriching Data</cmdp:ResearchPhase>
</cmdp:ResearchPhases>
<cmdp:ResearchDomains>
<cmdp:researchDomain>Linguistics</cmdp:researchDomain>
<cmdp:researchDomain>Philosophy</cmdp:researchDomain>
<cmdp:researchDomain>Literary Studies</cmdp:researchDomain>
<cmdp:researchDomain>Religion Studies</cmdp:researchDomain>
<cmdp:researchDomain>History</cmdp:researchDomain>
</cmdp:ResearchDomains>
<cmdp:LinguisticsSubject>
<cmdp:linguisticsSubject>general linguistics</cmdp:linguisticsSubject>
<cmdp:Description>
<cmdp:Description/>
</cmdp:Description>
</cmdp:LinguisticsSubject>
<cmdp:LinguisticsSubject>
<cmdp:linguisticsSubject>orthography</cmdp:linguisticsSubject>
<cmdp:linguisticsSubject>morphology</cmdp:linguisticsSubject>
<cmdp:linguisticsSubject>syntax</cmdp:linguisticsSubject>
<cmdp:Description>
<cmdp:Description/>
</cmdp:Description>
</cmdp:LinguisticsSubject>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Dutch</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>nld</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>no</cmdp:centuryDependent>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Swedish</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>swe</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>20</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Russian</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>rus</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>20</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Spanish</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>spa</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>20</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Portuguese</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>por</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>20</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>English</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>20</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>German</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>deu</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>20</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>French</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>fra</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>20</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Italian</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>ita</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>20</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Finnish</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>fin</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>20</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Modern Greek</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>ell</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>15</cmdp:centuryFrom>
<cmdp:centuryThrough>21</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Classical Greek</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>grc</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>-10</cmdp:centuryFrom>
<cmdp:centuryThrough>15</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Icelandic</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>isl</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>-10</cmdp:centuryFrom>
<cmdp:centuryThrough>15</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>German (Fraktur)</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>deu</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>-10</cmdp:centuryFrom>
<cmdp:centuryThrough>15</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Latin</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>lat</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>no</cmdp:centuryDependent>
</cmdp:Centuries>
</cmdp:LanguageVariety>
<cmdp:LanguageVariety>
<cmdp:languageDependent>yes</cmdp:languageDependent>
<cmdp:Language>
<cmdp:LanguageName>Romanian</cmdp:LanguageName>
<cmdp:ISO639>
<cmdp:iso-639-3-code>ron</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Language>
<cmdp:Centuries>
<cmdp:centuryDependent>yes</cmdp:centuryDependent>
<cmdp:CenturyInterval>
<cmdp:centuryFrom>-10</cmdp:centuryFrom>
<cmdp:centuryThrough>15</cmdp:centuryThrough>
</cmdp:CenturyInterval>
</cmdp:Centuries>
</cmdp:LanguageVariety>
</cmdp:SoftwareFunction>
<cmdp:SoftwareImplementation>
<cmdp:distributionMedium>Online available</cmdp:distributionMedium>
<cmdp:sourcecodeURI>https://github.com/LanguageMachines/piccl</cmdp:sourcecodeURI>
<cmdp:InstallationRequirements>
<cmdp:MinimumHardwareRequirements>
<cmdp:SystemRequirements>
<cmdp:workingMemoryMin>not specified</cmdp:workingMemoryMin>
<cmdp:hardDiskMin>not specified</cmdp:hardDiskMin>
<cmdp:Platform>
<cmdp:operatingSystem>POSIX</cmdp:operatingSystem>
<cmdp:operatingSystemVersion>not specified</cmdp:operatingSystemVersion>
<cmdp:bitArchitecture>unknown</cmdp:bitArchitecture>
</cmdp:Platform>
</cmdp:SystemRequirements>
</cmdp:MinimumHardwareRequirements>
<cmdp:SoftwareRequirements>
<cmdp:RequiredSoftware>
<cmdp:SoftwareShortDescription>
<cmdp:resourceName>nextflow</cmdp:resourceName>
<cmdp:version>not specified</cmdp:version>
<cmdp:url>https://github.com/nextflow-io/nextflow</cmdp:url>
<cmdp:applicationType>localDesktop</cmdp:applicationType>
</cmdp:SoftwareShortDescription>
<cmdp:SoftwareShortDescription>
<cmdp:resourceName>ticcltools</cmdp:resourceName>
<cmdp:version>not specified</cmdp:version>
<cmdp:url>https://github.com/LanguageMachines/ticcltools</cmdp:url>
<cmdp:applicationType>localDesktop</cmdp:applicationType>
</cmdp:SoftwareShortDescription>
<cmdp:SoftwareShortDescription>
<cmdp:resourceName>foliautils</cmdp:resourceName>
<cmdp:version>not specified</cmdp:version>
<cmdp:url>https://github.com/LanguageMachines/foliautils</cmdp:url>
<cmdp:applicationType>localDesktop</cmdp:applicationType>
</cmdp:SoftwareShortDescription>
<cmdp:SoftwareShortDescription>
<cmdp:resourceName>tesseract</cmdp:resourceName>
<cmdp:version>not specified</cmdp:version>
<cmdp:url>https://github.com/tesseract-ocr/tesseract</cmdp:url>
<cmdp:applicationType>localDesktop</cmdp:applicationType>
</cmdp:SoftwareShortDescription>
</cmdp:RequiredSoftware>
</cmdp:SoftwareRequirements>
</cmdp:InstallationRequirements>
<cmdp:UserInterface>
<cmdp:interfaceType>command line interface</cmdp:interfaceType>
<cmdp:applicationType>local desktop</cmdp:applicationType>
</cmdp:UserInterface>
<cmdp:UserInterface>
<cmdp:interfaceType>graphical user interface</cmdp:interfaceType>
<cmdp:applicationType>web application</cmdp:applicationType>
</cmdp:UserInterface>
<cmdp:UserInterface>
<cmdp:interfaceType>web interface</cmdp:interfaceType>
<cmdp:applicationType>web service</cmdp:applicationType>
</cmdp:UserInterface>
<cmdp:Input>
<cmdp:inputType>text</cmdp:inputType>
<cmdp:inputResource>PDF picture</cmdp:inputResource>
<cmdp:MimeType>
<cmdp:MimeType>application/pdf</cmdp:MimeType>
</cmdp:MimeType>
</cmdp:Input>
<cmdp:Input>
<cmdp:inputType>text</cmdp:inputType>
<cmdp:inputResource>PDF with embedded text</cmdp:inputResource>
<cmdp:MimeType>
<cmdp:MimeType>application/pdf</cmdp:MimeType>
</cmdp:MimeType>
</cmdp:Input>
<cmdp:Input>
<cmdp:inputType>text</cmdp:inputType>
<cmdp:inputResource>TIFF picture</cmdp:inputResource>
<cmdp:MimeType>
<cmdp:MimeType>image/tiff</cmdp:MimeType>
</cmdp:MimeType>
</cmdp:Input>
<cmdp:Input>
<cmdp:inputType>text</cmdp:inputType>
<cmdp:inputResource>Lexicon (one word per line)</cmdp:inputResource>
<cmdp:MimeType>
<cmdp:MimeType>text/plain</cmdp:MimeType>
</cmdp:MimeType>
</cmdp:Input>
<cmdp:Input>
<cmdp:inputType>text</cmdp:inputType>
<cmdp:inputResource>Post-OCR text document</cmdp:inputResource>
<cmdp:MimeType>
<cmdp:MimeType>text/plain</cmdp:MimeType>
</cmdp:MimeType>
</cmdp:Input>
<cmdp:Input>
<cmdp:inputType>text</cmdp:inputType>
<cmdp:Schema>
<cmdp:schemaname>FoLiA</cmdp:schemaname>
</cmdp:Schema>
<cmdp:MimeType>
<cmdp:MimeType>text/folia+xml</cmdp:MimeType>
</cmdp:MimeType>
</cmdp:Input>
<cmdp:Input>
<cmdp:inputType>text</cmdp:inputType>
<cmdp:Schema>
<cmdp:schemaname>DjVu</cmdp:schemaname>
</cmdp:Schema>
<cmdp:MimeType>
<cmdp:MimeType>image/vnd.djvu</cmdp:MimeType>
</cmdp:MimeType>
</cmdp:Input>
<cmdp:Output>
<cmdp:outputType>text</cmdp:outputType>
<cmdp:characterEncoding>utf8</cmdp:characterEncoding>
<cmdp:Schema>
<cmdp:schemaname>FoLiA</cmdp:schemaname>
</cmdp:Schema>
<cmdp:MimeType>
<cmdp:MimeType>text/folia+xml</cmdp:MimeType>
</cmdp:MimeType>
<cmdp:AnnotationType>
<cmdp:AnnotationType>Discourse/Sentence Boundaries</cmdp:AnnotationType>
<cmdp:AnnotationType>Orthography/Token</cmdp:AnnotationType>
<cmdp:AnnotationType>Morphosyntax/Inflection</cmdp:AnnotationType>
<cmdp:AnnotationType>Morphosyntax/Lemma</cmdp:AnnotationType>
<cmdp:AnnotationType>Morphosyntax/POS</cmdp:AnnotationType>
<cmdp:AnnotationType>Morphosyntax/Word form</cmdp:AnnotationType>
<cmdp:TagSet>POSTags/DCOI Tagset</cmdp:TagSet>
</cmdp:AnnotationType>
</cmdp:Output>
</cmdp:SoftwareImplementation>
<cmdp:Access>
<cmdp:ResourceLicense>
<cmdp:license>GNU GPL</cmdp:license>
<cmdp:version>3.0</cmdp:version>
<cmdp:distributionType>public</cmdp:distributionType>
<cmdp:url>https://spdx.org/licenses/GPL-3.0</cmdp:url>
<cmdp:Price>
<cmdp:amount>0</cmdp:amount>
<cmdp:ISO4217>
<cmdp:iso-4217-currency>EUR</cmdp:iso-4217-currency>
</cmdp:ISO4217>
</cmdp:Price>
</cmdp:ResourceLicense>
<cmdp:Contact>
<cmdp:Person>
Martin Reynaert
</cmdp:Person>
<cmdp:Address>Tilburg, the Netherlands</cmdp:Address>
<cmdp:Email>
reynaert@uvt.nl
</cmdp:Email>
<cmdp:Department>Department of Cognitive Science and Artificial Intelligence</cmdp:Department>
<cmdp:Organisation>
Tilburg University, Tilburg
</cmdp:Organisation>
<cmdp:Url>https://www.tilburguniversity.edu/about/schools/humanities/departments/dca/</cmdp:Url>
</cmdp:Contact>
</cmdp:Access>
<cmdp:ResourceDocumentation>
<cmdp:Documentation>
<cmdp:title>Information Page</cmdp:title>
<cmdp:documentationTarget>technical</cmdp:documentationTarget>
<cmdp:url>https://webservices-lst.science.ru.nl/piccl/info/</cmdp:url>
<cmdp:ISO639>
<cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Documentation>
<cmdp:Documentation>
<cmdp:title>readme</cmdp:title>
<cmdp:documentationTarget>technical</cmdp:documentationTarget>
<cmdp:url>https://github.com/LanguageMachines/PICCL/blob/master/README.md</cmdp:url>
<cmdp:ISO639>
<cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Documentation>
<cmdp:Documentation>
<cmdp:title>releaseNotes</cmdp:title>
<cmdp:documentationTarget>user</cmdp:documentationTarget>
<cmdp:url>https://github.com/LanguageMachines/PICCL/releases</cmdp:url>
<cmdp:ISO639>
<cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Documentation>
<cmdp:Documentation>
<cmdp:title>issueTracker</cmdp:title>
<cmdp:documentationTarget>technical</cmdp:documentationTarget>
<cmdp:url>https://github.com/LanguageMachines/PICCL/issues</cmdp:url>
<cmdp:ISO639>
<cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Documentation>
<cmdp:Documentation>
<cmdp:title>contIntegration</cmdp:title>
<cmdp:documentationTarget>technical</cmdp:documentationTarget>
<cmdp:url>https://travis-ci.org/LanguageMachines/PICCL</cmdp:url>
<cmdp:ISO639>
<cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
</cmdp:ISO639>
</cmdp:Documentation>
<cmdp:Publication>
<cmdp:publicationCategory>in proceedings</cmdp:publicationCategory>
<cmdp:publicationPurpose>scientific background</cmdp:publicationPurpose>
<cmdp:peerReviewStatus>yes</cmdp:peerReviewStatus>
<cmdp:Description>
<cmdp:Description LanguageID="eng">Martin Reynaert, Maarten van Gompel, Ko van der Sloot and Antal van den Bosch. 2015. PICCL: Philosophical Integrator of Computational and Corpus Libraries. Proceedings of CLARIN Annual Conference 2015, pp. 75-79. Wrocław, Poland. http://www.nederlab.nl/cms/wp-content/uploads/2015/10/Reynaert_PICCL-Philosophical-Integrator-of-Computational-and-Corpus-Libraries.pdf
</cmdp:Description>
</cmdp:Description>
</cmdp:Publication>
</cmdp:ResourceDocumentation>
<cmdp:SoftwareDevelopment>
<cmdp:Project>
<cmdp:name>CLARIN-NL</cmdp:name>
<cmdp:title/>
<cmdp:funder>NWO</cmdp:funder>
<cmdp:url>http://www.clarin.nl</cmdp:url>
<cmdp:Contact>
<cmdp:Person/>
<cmdp:Email/>
<cmdp:Organisation xml:lang="eng"/>
</cmdp:Contact>
<cmdp:Duration/>
</cmdp:Project>
<cmdp:Project>
<cmdp:name>CLARIAH-CORE</cmdp:name>
<cmdp:title/>
<cmdp:funder>NWO</cmdp:funder>
<cmdp:url>https://www.clariah.nl</cmdp:url>
<cmdp:Contact>
<cmdp:Person/>
<cmdp:Email/>
<cmdp:Organisation xml:lang="eng"/>
</cmdp:Contact>
<cmdp:Duration/>
</cmdp:Project>
<cmdp:Project>
<cmdp:name>Nederlab</cmdp:name>
<cmdp:title/>
<cmdp:funder>NWO</cmdp:funder>
<cmdp:url>http://www.nederlab.nl</cmdp:url>
<cmdp:Contact>
<cmdp:Person/>
<cmdp:Email/>
<cmdp:Organisation xml:lang="eng"/>
</cmdp:Contact>
<cmdp:Duration/>
</cmdp:Project>
<cmdp:Creator>
<cmdp:Contact>
<cmdp:Person>Martin Reynaert</cmdp:Person>
<cmdp:Organisation xml:lang="eng"/>
</cmdp:Contact>
</cmdp:Creator>
<cmdp:Creator>
<cmdp:Role>
project lead
</cmdp:Role>
<cmdp:Contact>
<cmdp:Person>
Martin Reynaert
</cmdp:Person>
<cmdp:Address>Tilburg, the Netherlands</cmdp:Address>
<cmdp:Email>
reynaert@uvt.nl
</cmdp:Email>
<cmdp:Department>Department of Cognitive Science and Artificial Intelligence</cmdp:Department>
<cmdp:Organisation>Tilburg University, Tilburg</cmdp:Organisation>
<cmdp:Url>https://www.tilburguniversity.edu/about/schools/humanities/departments/dca/</cmdp:Url>
</cmdp:Contact>
</cmdp:Creator>
<cmdp:Creator>
<cmdp:Role>
software developer
</cmdp:Role>
<cmdp:Contact>
<cmdp:Person>
Maarten van Gompel
</cmdp:Person>
<cmdp:Address>Nijmegen, the Netherlands</cmdp:Address>
<cmdp:Email>
proycon@anaproy.nl
</cmdp:Email>
<cmdp:Department>Center for Language and Speech Technology</cmdp:Department>
<cmdp:Organisation>
Radboud University Nijmegen
</cmdp:Organisation>
<cmdp:Url>
https://www.ru.nl/clst/
</cmdp:Url>
</cmdp:Contact>
</cmdp:Creator>
<cmdp:Creator>
<cmdp:Role>
software developer
</cmdp:Role>
<cmdp:Contact>
<cmdp:Person>
Ko van der Sloot
</cmdp:Person>
<cmdp:Address>Nijmegen, the Netherlands</cmdp:Address>
<cmdp:Department>Center for Language and Speech Technology</cmdp:Department>
<cmdp:Organisation>
Radboud University Nijmegen
</cmdp:Organisation>
<cmdp:Url>
https://www.ru.nl/clst/
</cmdp:Url>
</cmdp:Contact>
</cmdp:Creator>
</cmdp:SoftwareDevelopment>
<cmdp:TechnicalInfo>
<cmdp:ImplementationLanguage>
<cmdp:implementationLanguage>nextflow</cmdp:implementationLanguage>
<cmdp:version>unknown</cmdp:version>
</cmdp:ImplementationLanguage>
</cmdp:TechnicalInfo>
<cmdp:LRS>
<cmdp:Authentication>Yes. Before tool use, please register at https://webservices-lst.science.ru.nl/register.</cmdp:Authentication>
<cmdp:Description>
<cmdp:Description>PICCL</cmdp:Description>
</cmdp:Description>
<cmdp:ToolTasks>
<cmdp:toolTask>optical character recognition</cmdp:toolTask>
<cmdp:toolTask>orthographic normalisation</cmdp:toolTask>
<cmdp:toolTask>sentence splitting</cmdp:toolTask>
<cmdp:toolTask>tokenisation</cmdp:toolTask>
<cmdp:toolTask>dependency parsing</cmdp:toolTask>
<cmdp:toolTask>shallow parsing</cmdp:toolTask>
<cmdp:toolTask>lemmatisation</cmdp:toolTask>
<cmdp:toolTask>morphological analysis</cmdp:toolTask>
<cmdp:toolTask>named entity recognition</cmdp:toolTask>
<cmdp:toolTask>part of speech tagging</cmdp:toolTask>
</cmdp:ToolTasks>
<cmdp:ActualParameters>
<cmdp:ActualParameter>
<cmdp:ActualParameterName>project</cmdp:ActualParameterName>
<cmdp:ActualParameterValue>new</cmdp:ActualParameterValue>
</cmdp:ActualParameter>
<cmdp:ActualParameter>
<cmdp:ActualParameterName>input</cmdp:ActualParameterName>
<cmdp:ActualParameterValue>self.linkToResource</cmdp:ActualParameterValue>
</cmdp:ActualParameter>
</cmdp:ActualParameters>
<cmdp:LRSMapping>
<cmdp:LRSParameterName>input</cmdp:LRSParameterName>
<cmdp:ActualParameterName>pdftext_url</cmdp:ActualParameterName>
</cmdp:LRSMapping>
</cmdp:LRS>
</cmdp:ClarinSoftwareDescription>
</cmd:Components>
</cmd:CMD>
Organisation:
- Utrecht University
- Tilburg University, Tilburg
- Radboud University Nijmegen