PICCL: Philosophical Integrator of Computational and Corpus Libraries

<?xml version="1.0" encoding="UTF-8"?>
<cmd:CMD xmlns:cmd="http://www.clarin.eu/cmd/1"
         xmlns:cmdp="http://www.clarin.eu/cmd/1/profiles/clarin.eu:cr1:p_1342181139640"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         CMDVersion="1.2"
         xsi:schemaLocation="http://www.clarin.eu/cmd/1 https://infra.clarin.eu/CMDI/1.x/xsd/cmd-envelop.xsd http://www.clarin.eu/cmd/1/profiles/clarin.eu:cr1:p_1342181139640 https://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/1.1/profiles/clarin.eu:cr1:p_1342181139640/1.2/xsd">
   <cmd:Header>
      <cmd:MdCreator>janodijk</cmd:MdCreator>
      <cmd:MdCreationDate>2018-08-01+02:00</cmd:MdCreationDate>
      <cmd:MdProfile>clarin.eu:cr1:p_1342181139640</cmd:MdProfile>
      <cmd:MdCollectionDisplayName>CLARIN Netherlands</cmd:MdCollectionDisplayName>
   </cmd:Header>
   <cmd:Resources>
      <cmd:ResourceProxyList>
		       <cmd:ResourceProxy id="PICCL001">
			         <cmd:ResourceType>Resource</cmd:ResourceType>
			         <cmd:ResourceRef>https://webservices-lst.science.ru.nl/piccl/</cmd:ResourceRef>
		       </cmd:ResourceProxy>
		       <cmd:ResourceProxy id="PICCL002">
			         <cmd:ResourceType>LandingPage</cmd:ResourceType>
			         <cmd:ResourceRef>https://webservices-lst.science.ru.nl/piccl/info/</cmd:ResourceRef>
		       </cmd:ResourceProxy>
	     </cmd:ResourceProxyList>
      <cmd:JournalFileProxyList/>
      <cmd:ResourceRelationList/>
   </cmd:Resources>
   <cmd:Components>
      <cmdp:ClarinSoftwareDescription>
         <cmdp:GeneralInfo>
            <cmdp:name xml:lang="eng">PICCL</cmdp:name>
            <cmdp:title xml:lang="eng">PICCL: Philosophical Integrator of Computational and Corpus Libraries</cmdp:title>
			         <cmdp:version>v0.6.4</cmdp:version>
            <cmdp:publicationYear>2015</cmdp:publicationYear>
            <cmdp:url>https://webservices-lst.science.ru.nl/piccl/</cmdp:url>
            <cmdp:CLARINCentre>none yet</cmdp:CLARINCentre>
            <cmdp:ReleaseStatus>
               <cmdp:LifeCycleStatus>published</cmdp:LifeCycleStatus>
               <cmdp:lastUpdate>2018-07-12</cmdp:lastUpdate>
            </cmdp:ReleaseStatus>
            <cmdp:NationalProjects>
               <cmdp:Project>
                  <cmdp:name>CLARIN-NL</cmdp:name>
                  <cmdp:title>CLARIN in the Netherlands</cmdp:title>
                  <cmdp:id>184.021.003</cmdp:id>
                  <cmdp:funder>NWO</cmdp:funder>
                  <cmdp:url>http://www.clarin.nl</cmdp:url>
                  <cmdp:Contact>
                     <cmdp:Person>Jan Odijk</cmdp:Person>
                     <cmdp:Role>National Coordinator</cmdp:Role>
                     <cmdp:Address>Utrecht, the Netherlands</cmdp:Address>
                     <cmdp:Email>j.odijk@uu.nl</cmdp:Email>
                     <cmdp:Department>UiL-OTS</cmdp:Department>
                     <cmdp:Organisation>Utrecht University</cmdp:Organisation>
                  </cmdp:Contact>
                  <cmdp:Duration>
                     <cmdp:StartYear>2009</cmdp:StartYear>
                     <cmdp:CompletionYear>2015</cmdp:CompletionYear>
                  </cmdp:Duration>
               </cmdp:Project>
               <cmdp:Project>
                  <cmdp:name>CLARIAH-CORE</cmdp:name>
                  <cmdp:title>Common Lab Research Infrastructure for the Arts and the Humanities</cmdp:title>
                  <cmdp:id>184.033.101</cmdp:id>
                  <cmdp:funder>NWO</cmdp:funder>
                  <cmdp:url>http://www.clariah.nl</cmdp:url>
                  <cmdp:Contact>
                     <cmdp:Person>Jan Odijk</cmdp:Person>
                     <cmdp:Role>National Coordinator</cmdp:Role>
                     <cmdp:Address>Utrecht, the Netherlands</cmdp:Address>
                     <cmdp:Email>j.odijk@uu.nl</cmdp:Email>
                     <cmdp:Department>UiL-OTS</cmdp:Department>
                     <cmdp:Organisation>Utrecht University</cmdp:Organisation>
                  </cmdp:Contact>
                  <cmdp:Duration>
                     <cmdp:StartYear>2015</cmdp:StartYear>
                     <cmdp:CompletionYear>2018</cmdp:CompletionYear>
                  </cmdp:Duration>
               </cmdp:Project>
            </cmdp:NationalProjects>
            <cmdp:Country>
               <cmdp:CountryName>Netherlands</cmdp:CountryName>
               <cmdp:CountryCoding>NL</cmdp:CountryCoding>
            </cmdp:Country>
            <cmdp:Description>
	              <cmdp:Description xml:lang="eng">PICCL is a set of workflows for corpus building through OCR, post-correction, modernization of historic language and Natural Language Processing. It combines Tesseract Optical Character Recognition, TICCL functionality and Frog functionality in a single pipeline.
		  
		  Tesseract is open-source software for optical character recognition.
		  
		  TICCL (Text Induced Corpus Clean-up) is a system designed to detect and correct typographical errors (misprints) and optical character recognition (OCR) errors in texts. It searches a corpus for all existing variants of (potentially) all words occurring in that corpus. The corpus can be a single text or several texts, in one or more directories, located on one or more machines. TICCL creates word frequency lists that record, for each word type, how often it occurs in the corpus. The frequency of a normalized word form is the sum of the frequencies of the actual word forms found in the corpus. Errors occur when books or other texts are scanned from paper and the resulting images are converted into digital text files. For instance, the letter combination 'in' can be read as 'm', so that the word 'regeering' is incorrectly reproduced as 'regeermg'. TICCL can be used to detect these errors and to suggest a correct form.
		  
		  Frog enriches textual documents with various linguistic annotations.
		 </cmdp:Description>
            </cmdp:Description>
         </cmdp:GeneralInfo>
         <cmdp:SoftwareFunction>
            <cmdp:toolCategory>written language tool</cmdp:toolCategory>
			         <cmdp:ToolTasks>
				           <cmdp:toolTask>optical character recognition</cmdp:toolTask>
				           <cmdp:toolTask>orthographic normalisation</cmdp:toolTask>
				           <cmdp:toolTask>sentence splitting</cmdp:toolTask>
				           <cmdp:toolTask>tokenisation</cmdp:toolTask>
				           <cmdp:toolTask>dependency parsing</cmdp:toolTask>
				           <cmdp:toolTask>shallow parsing</cmdp:toolTask>
				           <cmdp:toolTask>lemmatisation</cmdp:toolTask>
				           <cmdp:toolTask>morphological analysis</cmdp:toolTask>
				           <cmdp:toolTask>named entity recognition</cmdp:toolTask>
				           <cmdp:toolTask>part of speech tagging</cmdp:toolTask>
			         </cmdp:ToolTasks>
			         <cmdp:ResearchPhases>
               <cmdp:ResearchPhase>Enriching Data</cmdp:ResearchPhase>
            </cmdp:ResearchPhases>
            <cmdp:ResearchDomains>
				           <cmdp:researchDomain>Linguistics</cmdp:researchDomain>
				           <cmdp:researchDomain>Philosophy</cmdp:researchDomain>
				           <cmdp:researchDomain>Literary Studies</cmdp:researchDomain>
				           <cmdp:researchDomain>Religion Studies</cmdp:researchDomain>
				           <cmdp:researchDomain>History</cmdp:researchDomain>
			         </cmdp:ResearchDomains>
            <cmdp:LinguisticsSubject>
               <cmdp:linguisticsSubject>general linguistics</cmdp:linguisticsSubject>
	              <cmdp:Description>
		                <cmdp:Description/>
	              </cmdp:Description>
            </cmdp:LinguisticsSubject>
            <cmdp:LinguisticsSubject>
               <cmdp:linguisticsSubject>orthography</cmdp:linguisticsSubject>
               <cmdp:linguisticsSubject>morphology</cmdp:linguisticsSubject>
               <cmdp:linguisticsSubject>syntax</cmdp:linguisticsSubject>
	              <cmdp:Description>
		                <cmdp:Description/>
	              </cmdp:Description>
            </cmdp:LinguisticsSubject>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Dutch</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>nld</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>no</cmdp:centuryDependent>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Swedish</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>swe</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>20</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Russian</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>rus</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>20</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Spanish</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>spa</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>20</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Portuguese</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>por</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>20</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>English</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>20</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>German</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>deu</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>20</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>French</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>fra</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>20</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Italian</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>ita</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>20</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Finnish</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>fin</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>20</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Modern Greek</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>ell</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>15</cmdp:centuryFrom>
					                <cmdp:centuryThrough>21</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Classical Greek</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>grc</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>-10</cmdp:centuryFrom>
					                <cmdp:centuryThrough>15</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Icelandic</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>isl</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>-10</cmdp:centuryFrom>
					                <cmdp:centuryThrough>15</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>German (Fraktur)</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>deu</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>-10</cmdp:centuryFrom>
					                <cmdp:centuryThrough>15</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Latin</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>lat</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>no</cmdp:centuryDependent>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
            <cmdp:LanguageVariety>
               <cmdp:languageDependent>yes</cmdp:languageDependent>
               <cmdp:Language>
                  <cmdp:LanguageName>Romanian</cmdp:LanguageName>
                  <cmdp:ISO639>
                     <cmdp:iso-639-3-code>ron</cmdp:iso-639-3-code>
                  </cmdp:ISO639>
               </cmdp:Language>
               <cmdp:Centuries>
					             <cmdp:centuryDependent>yes</cmdp:centuryDependent>
					             <cmdp:CenturyInterval>
					                <cmdp:centuryFrom>-10</cmdp:centuryFrom>
					                <cmdp:centuryThrough>15</cmdp:centuryThrough>
					             </cmdp:CenturyInterval>
				           </cmdp:Centuries>
            </cmdp:LanguageVariety>
	         </cmdp:SoftwareFunction>
	
         <cmdp:SoftwareImplementation>
            <cmdp:distributionMedium>Online available</cmdp:distributionMedium>
		          <cmdp:sourcecodeURI>https://github.com/LanguageMachines/piccl</cmdp:sourcecodeURI>
            <cmdp:InstallationRequirements>
               <cmdp:MinimumHardwareRequirements>
                  <cmdp:SystemRequirements>
                     <cmdp:workingMemoryMin>not specified</cmdp:workingMemoryMin>
                     <cmdp:hardDiskMin>not specified</cmdp:hardDiskMin>
                     <cmdp:Platform>
                        <cmdp:operatingSystem>POSIX</cmdp:operatingSystem>
						                  <cmdp:operatingSystemVersion>not specified</cmdp:operatingSystemVersion>
						                  <cmdp:bitArchitecture>unknown</cmdp:bitArchitecture>
                     </cmdp:Platform>
                  </cmdp:SystemRequirements>
               </cmdp:MinimumHardwareRequirements>
               <cmdp:SoftwareRequirements>
                  <cmdp:RequiredSoftware>
                     <cmdp:SoftwareShortDescription>
                        <cmdp:resourceName>nextflow</cmdp:resourceName>
						                  <cmdp:version>not specified</cmdp:version>
						                  <cmdp:url>https://github.com/nextflow-io/nextflow</cmdp:url>
                        <cmdp:applicationType>localDesktop</cmdp:applicationType>
                     </cmdp:SoftwareShortDescription>
                     <cmdp:SoftwareShortDescription>
                        <cmdp:resourceName>ticcltools</cmdp:resourceName>
						                  <cmdp:version>not specified</cmdp:version>
						                  <cmdp:url>https://github.com/LanguageMachines/ticcltools</cmdp:url>
                        <cmdp:applicationType>localDesktop</cmdp:applicationType>
                     </cmdp:SoftwareShortDescription>
                     <cmdp:SoftwareShortDescription>
                        <cmdp:resourceName>foliautils</cmdp:resourceName>
						                  <cmdp:version>not specified</cmdp:version>
						                  <cmdp:url>https://github.com/LanguageMachines/foliautils</cmdp:url>
                        <cmdp:applicationType>localDesktop</cmdp:applicationType>
                     </cmdp:SoftwareShortDescription>
                     <cmdp:SoftwareShortDescription>
                        <cmdp:resourceName>tesseract</cmdp:resourceName>
						                  <cmdp:version>not specified</cmdp:version>
						                  <cmdp:url>https://github.com/tesseract-ocr/tesseract</cmdp:url>
                        <cmdp:applicationType>localDesktop</cmdp:applicationType>
                     </cmdp:SoftwareShortDescription>
                  </cmdp:RequiredSoftware>
               </cmdp:SoftwareRequirements>
            </cmdp:InstallationRequirements>
            <cmdp:UserInterface>
               <cmdp:interfaceType>command line interface</cmdp:interfaceType>
               <cmdp:applicationType>local desktop</cmdp:applicationType>
            </cmdp:UserInterface>
            <cmdp:UserInterface>
               <cmdp:interfaceType>graphical user interface</cmdp:interfaceType>
               <cmdp:applicationType>web application</cmdp:applicationType>
            </cmdp:UserInterface>
            <cmdp:UserInterface>
               <cmdp:interfaceType>web interface</cmdp:interfaceType>
               <cmdp:applicationType>web service</cmdp:applicationType>
            </cmdp:UserInterface>
		          <cmdp:Input>
			            <cmdp:inputType>text</cmdp:inputType>
			            <cmdp:inputResource>PDF picture</cmdp:inputResource>
		             <cmdp:MimeType>
                  <cmdp:MimeType>application/pdf</cmdp:MimeType>
               </cmdp:MimeType>
		          </cmdp:Input>
		          <cmdp:Input>
			            <cmdp:inputType>text</cmdp:inputType>
			            <cmdp:inputResource>PDF with embedded text</cmdp:inputResource>
		             <cmdp:MimeType>
                  <cmdp:MimeType>application/pdf</cmdp:MimeType>
               </cmdp:MimeType>
		          </cmdp:Input>
		          <cmdp:Input>
			            <cmdp:inputType>text</cmdp:inputType>
			            <cmdp:inputResource>TIFF picture</cmdp:inputResource>
		             <cmdp:MimeType>
                  <cmdp:MimeType>image/tiff</cmdp:MimeType>
               </cmdp:MimeType>
		          </cmdp:Input>
		          <cmdp:Input>
			            <cmdp:inputType>text</cmdp:inputType>
			            <cmdp:inputResource>Lexicon (one word per line)</cmdp:inputResource>
		             <cmdp:MimeType>
                  <cmdp:MimeType>text/plain</cmdp:MimeType>
               </cmdp:MimeType>
		          </cmdp:Input>
		          <cmdp:Input>
			            <cmdp:inputType>text</cmdp:inputType>
			            <cmdp:inputResource>Post-OCR text document</cmdp:inputResource>
		             <cmdp:MimeType>
                  <cmdp:MimeType>text/plain</cmdp:MimeType>
               </cmdp:MimeType>
		          </cmdp:Input>
		          <cmdp:Input>
			            <cmdp:inputType>text</cmdp:inputType>
			            <cmdp:Schema>
                  <cmdp:schemaname>FoLiA</cmdp:schemaname>
               </cmdp:Schema>
		             <cmdp:MimeType>
                  <cmdp:MimeType>text/folia+xml</cmdp:MimeType>
               </cmdp:MimeType>
		          </cmdp:Input>
		          <cmdp:Input>
			            <cmdp:inputType>text</cmdp:inputType>
			            <cmdp:Schema>
                  <cmdp:schemaname>DjVu</cmdp:schemaname>
               </cmdp:Schema>
		             <cmdp:MimeType>
                  <cmdp:MimeType>image/vnd.djvu</cmdp:MimeType>
               </cmdp:MimeType>
		          </cmdp:Input>

		          <cmdp:Output>
               <cmdp:outputType>text</cmdp:outputType>
				           <cmdp:characterEncoding>utf8</cmdp:characterEncoding>
                <cmdp:Schema>
                  <cmdp:schemaname>FoLiA</cmdp:schemaname>
               </cmdp:Schema> 
		             <cmdp:MimeType>
                  <cmdp:MimeType>text/folia+xml</cmdp:MimeType>
               </cmdp:MimeType>
                <cmdp:AnnotationType>
					             <cmdp:AnnotationType>Discourse/Sentence Boundaries</cmdp:AnnotationType>
					             <cmdp:AnnotationType>Orthography/Token</cmdp:AnnotationType>
					             <cmdp:AnnotationType>Morphosyntax/Inflection</cmdp:AnnotationType>
					             <cmdp:AnnotationType>Morphosyntax/Lemma</cmdp:AnnotationType>
					             <cmdp:AnnotationType>Morphosyntax/POS</cmdp:AnnotationType>
					             <cmdp:AnnotationType>Morphosyntax/Word form</cmdp:AnnotationType>
					             <cmdp:TagSet>POSTags/DCOI Tagset</cmdp:TagSet> 
				           </cmdp:AnnotationType>

				        </cmdp:Output>

         </cmdp:SoftwareImplementation>
         <cmdp:Access>
            <cmdp:ResourceLicense>
               <cmdp:license>GNU GPL</cmdp:license>
			            <cmdp:version>3.0</cmdp:version>
               <cmdp:distributionType>public</cmdp:distributionType>
               <cmdp:url>https://spdx.org/licenses/GPL-3.0</cmdp:url>
               <cmdp:Price>
                  <cmdp:amount>0</cmdp:amount>
                  <cmdp:ISO4217>
                     <cmdp:iso-4217-currency>EUR</cmdp:iso-4217-currency>
                  </cmdp:ISO4217>
               </cmdp:Price>
            </cmdp:ResourceLicense>
            <cmdp:Contact>
               <cmdp:Person>Martin Reynaert</cmdp:Person>
               <cmdp:Address>Tilburg, the Netherlands</cmdp:Address>
               <cmdp:Email>reynaert@uvt.nl</cmdp:Email>
               <cmdp:Department>Department of Cognitive Science and Artificial Intelligence</cmdp:Department>
               <cmdp:Organisation>Tilburg University, Tilburg</cmdp:Organisation>
               <cmdp:Url>https://www.tilburguniversity.edu/about/schools/humanities/departments/dca/</cmdp:Url>
            </cmdp:Contact>

         </cmdp:Access>
		 
		 
         <cmdp:ResourceDocumentation>
            <cmdp:Documentation>
               <cmdp:title>Information Page</cmdp:title>
               <cmdp:documentationTarget>technical</cmdp:documentationTarget>
               <cmdp:url>https://webservices-lst.science.ru.nl/piccl/info/</cmdp:url>
               <cmdp:ISO639>
                  <cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
               </cmdp:ISO639>
			         </cmdp:Documentation>

            <cmdp:Documentation>
               <cmdp:title>readme</cmdp:title>
               <cmdp:documentationTarget>technical</cmdp:documentationTarget>
               <cmdp:url>https://github.com/LanguageMachines/PICCL/blob/master/README.md</cmdp:url>
               <cmdp:ISO639>
                  <cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
               </cmdp:ISO639>
            </cmdp:Documentation>
            <cmdp:Documentation>
               <cmdp:title>releaseNotes</cmdp:title>
               <cmdp:documentationTarget>user</cmdp:documentationTarget>
               <cmdp:url>https://github.com/LanguageMachines/PICCL/releases</cmdp:url>
               <cmdp:ISO639>
                  <cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
               </cmdp:ISO639>
            </cmdp:Documentation>
            <cmdp:Documentation>
               <cmdp:title>issueTracker</cmdp:title>
               <cmdp:documentationTarget>technical</cmdp:documentationTarget>
               <cmdp:url>https://github.com/LanguageMachines/PICCL/issues</cmdp:url>
               <cmdp:ISO639>
                  <cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
               </cmdp:ISO639>
            </cmdp:Documentation>
            <cmdp:Documentation>
               <cmdp:title>contIntegration</cmdp:title>
               <cmdp:documentationTarget>technical</cmdp:documentationTarget>
               <cmdp:url>https://travis-ci.org/LanguageMachines/PICCL</cmdp:url>
               <cmdp:ISO639>
                  <cmdp:iso-639-3-code>eng</cmdp:iso-639-3-code>
               </cmdp:ISO639>
            </cmdp:Documentation>
			         <cmdp:Publication>
		             <cmdp:publicationCategory>in proceedings</cmdp:publicationCategory>
		             <cmdp:publicationPurpose>scientific background</cmdp:publicationPurpose>
		             <cmdp:peerReviewStatus>yes</cmdp:peerReviewStatus>
		             <cmdp:Description>
		                <cmdp:Description LanguageID="eng">Martin Reynaert, Maarten van Gompel, Ko van der Sloot and Antal van den Bosch. 2015. PICCL: Philosophical Integrator of Computational and Corpus Libraries. Proceedings of CLARIN Annual Conference 2015, pp. 75-79. Wrocław, Poland. http://www.nederlab.nl/cms/wp-content/uploads/2015/10/Reynaert_PICCL-Philosophical-Integrator-of-Computational-and-Corpus-Libraries.pdf
		   </cmdp:Description>
		             </cmdp:Description>
		          </cmdp:Publication>

         </cmdp:ResourceDocumentation>
         <cmdp:SoftwareDevelopment>
             <cmdp:Project>
               <cmdp:name>CLARIN-NL</cmdp:name>
               <cmdp:title/>
               <cmdp:funder>NWO</cmdp:funder>
               <cmdp:url>http://www.clarin.nl</cmdp:url>
               <cmdp:Contact>
		                <cmdp:Person/>
		                <cmdp:Email/>
		                <cmdp:Organisation xml:lang="eng"/>
	              </cmdp:Contact>
               <cmdp:Duration/>
            </cmdp:Project>
            <cmdp:Project>
               <cmdp:name>CLARIAH-CORE</cmdp:name>
               <cmdp:title/>
               <cmdp:funder>NWO</cmdp:funder>
               <cmdp:url>https://www.clariah.nl</cmdp:url>
               <cmdp:Contact>
		                <cmdp:Person/>
		                <cmdp:Email/>
		                <cmdp:Organisation xml:lang="eng"/>
	              </cmdp:Contact>
               <cmdp:Duration/>
            </cmdp:Project>
             <cmdp:Project>
               <cmdp:name>Nederlab</cmdp:name>
               <cmdp:title/>
               <cmdp:funder>NWO</cmdp:funder>
               <cmdp:url>http://www.nederlab.nl</cmdp:url>
               <cmdp:Contact>
		                <cmdp:Person/>
		                <cmdp:Email/>
		                <cmdp:Organisation xml:lang="eng"/>
	              </cmdp:Contact>
               <cmdp:Duration/>
            </cmdp:Project>
            <cmdp:Creator>
               <cmdp:Contact>
		                <cmdp:Person>Martin Reynaert</cmdp:Person>
		                <cmdp:Organisation xml:lang="eng"/>
	              </cmdp:Contact>
		          </cmdp:Creator> 
            <cmdp:Creator>
               <cmdp:Role>project lead</cmdp:Role>
               <cmdp:Contact>
                  <cmdp:Person>Martin Reynaert</cmdp:Person>
                  <cmdp:Address>Tilburg, the Netherlands</cmdp:Address>
                  <cmdp:Email>reynaert@uvt.nl</cmdp:Email>
                  <cmdp:Department>Department of Cognitive Science and Artificial Intelligence</cmdp:Department>
                  <cmdp:Organisation>Tilburg University, Tilburg</cmdp:Organisation>
                  <cmdp:Url>https://www.tilburguniversity.edu/about/schools/humanities/departments/dca/</cmdp:Url>
               </cmdp:Contact>
            </cmdp:Creator>
            <cmdp:Creator>
               <cmdp:Role>software developer</cmdp:Role>
               <cmdp:Contact>
                  <cmdp:Person>Maarten van Gompel</cmdp:Person>
                  <cmdp:Address>Nijmegen, the Netherlands</cmdp:Address>
                  <cmdp:Email>proycon@anaproy.nl</cmdp:Email>
                  <cmdp:Department>Center for Language and Speech Technology</cmdp:Department>
                  <cmdp:Organisation>Radboud University Nijmegen</cmdp:Organisation>
                  <cmdp:Url>https://www.ru.nl/clst/</cmdp:Url>
               </cmdp:Contact>
            </cmdp:Creator>
            <cmdp:Creator>
               <cmdp:Role>software developer</cmdp:Role>
               <cmdp:Contact>
                  <cmdp:Person>Ko van der Sloot</cmdp:Person>
                  <cmdp:Address>Nijmegen, the Netherlands</cmdp:Address>
                  <cmdp:Department>Center for Language and Speech Technology</cmdp:Department>
                  <cmdp:Organisation>Radboud University Nijmegen</cmdp:Organisation>
                  <cmdp:Url>https://www.ru.nl/clst/</cmdp:Url>
               </cmdp:Contact>
            </cmdp:Creator>
			
         </cmdp:SoftwareDevelopment>
         <cmdp:TechnicalInfo>
            <cmdp:ImplementationLanguage>
               <cmdp:implementationLanguage>nextflow</cmdp:implementationLanguage>
               <cmdp:version>unknown</cmdp:version>
            </cmdp:ImplementationLanguage>
         </cmdp:TechnicalInfo>
	        <cmdp:LRS>
	           <cmdp:Authentication>Yes. Register at https://webservices-lst.science.ru.nl/register before using the tool.</cmdp:Authentication>
		          <cmdp:Description>
               <cmdp:Description>PICCL</cmdp:Description>
            </cmdp:Description>
		          <cmdp:ToolTasks>
	              <cmdp:toolTask>optical character recognition</cmdp:toolTask>
               <cmdp:toolTask>orthographic normalisation</cmdp:toolTask>
	              <cmdp:toolTask>sentence splitting</cmdp:toolTask>
	              <cmdp:toolTask>tokenisation</cmdp:toolTask>
               <cmdp:toolTask>dependency parsing</cmdp:toolTask>
               <cmdp:toolTask>shallow parsing</cmdp:toolTask>
               <cmdp:toolTask>lemmatisation</cmdp:toolTask>
	              <cmdp:toolTask>morphological analysis</cmdp:toolTask>
	              <cmdp:toolTask>named entity recognition</cmdp:toolTask>
	              <cmdp:toolTask>part of speech tagging</cmdp:toolTask>
		          </cmdp:ToolTasks>
		          <cmdp:ActualParameters>
			            <cmdp:ActualParameter>
				              <cmdp:ActualParameterName>project</cmdp:ActualParameterName>
				              <cmdp:ActualParameterValue>new</cmdp:ActualParameterValue>
			            </cmdp:ActualParameter>
			            <cmdp:ActualParameter>
				              <cmdp:ActualParameterName>input</cmdp:ActualParameterName>
				              <cmdp:ActualParameterValue>self.linkToResource</cmdp:ActualParameterValue>
			            </cmdp:ActualParameter>
		          </cmdp:ActualParameters>
		          <cmdp:LRSMapping>
		             <cmdp:LRSParameterName>input</cmdp:LRSParameterName>
		             <cmdp:ActualParameterName>pdftext_url</cmdp:ActualParameterName>
		          </cmdp:LRSMapping>
	        </cmdp:LRS>
      </cmdp:ClarinSoftwareDescription>
   </cmd:Components>
</cmd:CMD>
Organisation:
  • Utrecht University
  • Tilburg University, Tilburg
  • Radboud University Nijmegen
