Component Overview 1.0.0

This document provides an overview of the analytics components and data formats supported by the component collections of the OpenMinTeD partners.

Analytics by category

Uncategorized (132)

Components listed here are presently uncategorized.

Component Description Framework

ANNIE NE Transducer

ANNIE named entity grammar.

GATE

ANNIE OrthoMatcher

ANNIE orthographical coreference component.

GATE

ANNIE+Measurements

Ready-made application for ANNIE plus the measurement tagger

GATE

Ab3P

synopsis

AlvisNLP

Action

Applies action expressions on selected elements.

AlvisNLP

AggregateValues

synopsis

AlvisNLP

Agreement Evaluator

Reports agreement on annotations coming from different views (sofas).

NaCTeM (UIMA)

AlchemyAPI: Entity Extraction

Runs the AlchemyAPI Entity Extraction service on a GATE document

GATE

AlchemyAPI: Keyword Extraction

Runs the AlchemyAPI Keyword Extraction service on a GATE document

GATE

AlvisREPrepareCrossValidation

synopsis

AlvisNLP

AnchorTuples

Creates tuples with a common argument.

AlvisNLP

Annotation Remover

Removes span-of-text annotations.

NaCTeM (UIMA)

AnnotationTermbank

TermRaider Termbank derived from document annotations

GATE

AntecedentChoice

Biotopes-specific module: chooses an antecedent.

AlvisNLP

Arabic Gazetteer Collector

No description

GATE

Arabic Main Grammar

A module for executing Jape grammars.

GATE

Arabic OrthoMatcher

ANNIE orthographical coreference component.

GATE

Assert

Tests an assertion on specified elements.

AlvisNLP

AssertAnnotations$InternalJCasHolder

Descriptor automatically generated by uimaFIT

DKPro Core (UIMA)

AttestedTermsProjector

Projects a list of terms given in tree-tagger format.

AlvisNLP

BDM Computation PR

Compute BDM score for each pair of concepts in the given ontology.

GATE

Banner Sentence Breaker

Sentence breaker using the Sun Java API "BreakIterator".

NaCTeM (UIMA)

BioLG

Applies BioLG and lp2lp to sentences.

AlvisNLP

CSV Corpus Populater

Populate a corpus from CSV files

GATE

CartesianProductTuples

Creates tuples for each element of a Cartesian product.

AlvisNLP

Cebuano Transducer

A module for executing Jape grammars.

GATE

Cebuano Transducer Postprocessor

A module for executing Jape grammars.

GATE

Chemical Entity Recogniser

A named entity recogniser capable of annotating names of chemicals, drugs and metabolites.

NaCTeM (UIMA)

ColognePhoneticTranscriptor

Cologne phonetic (Kölner Phonetik) transcription based on Apache Commons Codec.

DKPro Core (UIMA)

Compound Document

GATE Compound Document.

GATE

Compound Document From Xml

GATE Compound Document.

GATE

ConnectSesameOntology

Connect to a repository containing an ontology

GATE

Control Script

Editor for the Groovy script controlling a scriptable controller

GATE

Copy Anns to Another Doc PR

Copy the annotations from one document to another document.

GATE

Corpus Indexing Support

No description

GATE

Crawler PR

GATE implementation of the Websphinx crawling API

GATE

CreateSesameOntology

Create an ontology from a Sesame configuration file for a repository

GATE

Dictionary Pluggable Soft TF/IDF Matcher

Tests whether input tokens belong to an entry in the specified dictionary using SecondString Soft TF/IDF.

NaCTeM (UIMA)

DisambiguateAlternatives

Disambiguate features that have multiple values.

AlvisNLP

DocumentFrequencyBank

Document frequency counter derived from corpora and other DFBs

GATE

DoubleMetaphonePhoneticTranscriptor

Double-Metaphone phonetic transcription based on Apache Commons Codec.

DKPro Core (UIMA)

ElementMapper

Maps elements according to a collection of mapping elements.

AlvisNLP

ElementProjector

Searches for entries in a dictionary generated by an expression.

AlvisNLP

ElementProjector2

synopsis

AlvisNLP

EngLemmatiser

English lemmatiser which is adapted from WordNet.

NaCTeM (UIMA)

Feature Generator

Generates a list of user-defined observations for each token.

NaCTeM (UIMA)

FileMapper

Maps the value of an annotation feature according to a mapping file.

AlvisNLP

FileMapper2

Maps elements according to a tab-separated mapping file.

AlvisNLP

FreelingMorpho

Performs tokenisation, and determines possible lemmas and POS tags for each token, with confidence scores.

NaCTeM (UIMA)

GATE Composite document

GATE Composite document.

GATE

Gazetteer List Collector

Gazetteer lists collector.

GATE

GermanSeparatedParticleAnnotator

Annotator to be used for post-processing of German corpora that have been lemmatized and POS-tagged with the TreeTagger, based on the STTS tagset.

DKPro Core (UIMA)

Groovy support for GATE

No description

GATE

Hindi Main Grammar

A module for executing Jape grammars

GATE

Hindi OrthoMatcher

Hindi Orthomatcher

GATE

Hindi Tokeniser Postprocessor

A module for executing Jape grammars

GATE

HyponymyTermbank

TermRaider Termbank derived from head/string hyponymy

GATE

IOTestRunner$Validator

Descriptor automatically generated by uimaFIT

DKPro Core (UIMA)

InsertContents

synopsis

AlvisNLP

Kleio Search

Uses the Kleio service to fetch MEDLINE abstracts matching a specified query.

NaCTeM (UIMA)

LBJ Named Entity Recognizer

A wrapper for the Illinois Named Entity Tagger

NaCTeM (UIMA)

LayerComparator

Compares annotations in two different layers.

AlvisNLP

Linguistic Simplifier

A processing resource that takes document and corpus parameters

GATE

Linguistic Simplifier

Example application for the linguistic simplifier

GATE

Lucene IR Engine

No description

GATE

Lupedia Service PR

Runs a lupedia annotation service on a GATE document

GATE

Majority-vote consensus builder (annotation)

Process results of a crowd annotation task to find where annotators agree and disagree.

GATE

MergeLayers

Creates a new layer in each section containing all annotations in source layers.

AlvisNLP

MergeSections

Merge several sections into a single one.

AlvisNLP

MetaMap Annotator

This plugin uses the MetaMap Java API to send GATE document content to MetaMap skrmedpostctl server and PrologBeans mmserver instances running on the given machine/port

GATE

MetaphonePhoneticTranscriptor

Metaphone phonetic transcription based on Apache Commons Codec.

DKPro Core (UIMA)

MutationFinder

GATE MutationFinder Wrapper

GATE

NGramAnnotator

N-gram annotator.

DKPro Core (UIMA)

NGrams

Computes annotation n-grams.

AlvisNLP

NeMine

No description

NaCTeM (UIMA)

NewCount

Counts element occurrences and writes the results in a file, including tfidf.

AlvisNLP

OBOMapper

synopsis

AlvisNLP

OBOProjector

Projects OBO terms and synonyms on sections.

AlvisNLP

OWLIM Ontology

Ontology created as a temporary OWLIM3 in-memory repository

GATE

OWLIM Ontology DEPRECATED

Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only

GATE

OntoReif

synopsis

AlvisNLP

OpenNLPNEDetector

Detects named entities in text and creates corresponding entity annotations that span the found entities.

NaCTeM (UIMA)

OpenNLPSentenceDetector

Detect sentence boundaries and create sentence annotations that span these boundaries.

NaCTeM (UIMA)

OrthoRef

An orthographic coreferencer

GATE

OscarMER

Runs Oscar 3 with maximum entropy based recogniser with syntactic tokens as input

NaCTeM (UIMA)

PMI Bank

Pointwise Mutual Information from corpora

GATE

PMI Example (English)

Example application for the PMI (pointwise mutual information) tool

GATE

PatternMatcher

Matches a regular expression-like pattern on the sequence of annotations in a given layer.

AlvisNLP

ProminentConceptReporter

synopsis

AlvisNLP

Quality Assurance PR

The Quality Assurance PR provides the functionality of the Corpus QA Tool in GATE Developer

GATE

QuickHTML

synopsis

AlvisNLP

RO_FDGBank

This reader performs the transformation of the CONLL tab separated text format to the CAS ConllDependency format.

NaCTeM (UIMA)

Reference Evaluator

Reports annotation performance comparing views (sofas) to one selected reference view.

NaCTeM (UIMA)

RegExp

Matches a regular expression on section contents and creates an annotation for each match.

AlvisNLP

Regex Annotator

Annotates spans of text based on a custom regular expression.

NaCTeM (UIMA)

RemoveContents

synopsis

AlvisNLP

RemoveEquivalent

Removes duplicate elements.

AlvisNLP

RemoveOverlaps

Removes overlapping annotations from a given layer.

AlvisNLP

Romanian Transducer

A module for executing Jape grammars

GATE

SFTP BioNLP Shared Task Data Provider

Reads a corpus in BioNLP Shared Task format from a remote directory on a user-specified server via SFTP.

NaCTeM (UIMA)

SQLImport

synopsis

AlvisNLP

SeSMig

Detects sentence boundaries and creates one annotation for each sentence. This module assumes WoSMig processed the same sections.

AlvisNLP

Search Results

Viewer for IR search results

GATE

SearchPR

Provides IR functionality.

GATE

Sequence_Impl

Sequence of modules.

AlvisNLP

Show/Hide Resources

Show resources that would otherwise be hidden, e.g. resources created for internal use by other resources

GATE

SimpleProjector

Projects a simple dictionary on sections.

AlvisNLP

SimpleProjector2

Projects a simple dictionary on sections.

AlvisNLP

SoundexPhoneticTranscriptor

Soundex phonetic transcription based on Apache Commons Codec.

DKPro Core (UIMA)

Species

Calls the Species taxon tagger.

AlvisNLP

SplitOverlaps

Splits overlapping annotations.

AlvisNLP

TermRaider English Term Extraction

Example application showing typical set-up for the TermRaider tools

GATE

Termbank Score Copier

Copy scores from Termbanks back to their source annotations

GATE

TextRazor Service PR

Runs the TextRazor annotation service (http://textrazor.com) on a GATE document

GATE

TfIdfTermbank

TermRaider Termbank derived from vectors in document features

GATE

TfidfAnnotator

This component adds Tfidf annotations consisting of a term and a tfidf weight.

DKPro Core (UIMA)

TomapProjector

synopsis

AlvisNLP

TomapTrain

synopsis

AlvisNLP

TyDIProjector

Projects terms from a TyDI export.

AlvisNLP

Type Mapper

No description

NaCTeM (UIMA)

UAICDiacriticsDescriptor

No description

NaCTeM (UIMA)

UAICLemmav1

Assigns base forms to tokenised text.

NaCTeM (UIMA)

UAICLemmav2

Assigns base forms in Romanian text, given POS-tagged text.

NaCTeM (UIMA)

UAICSegV1

Splits texts into fragments

NaCTeM (UIMA)

UMLS Full Dictionary Feature Extractor

Extracts Dictionary features from a UMLS-sourced dictionary

NaCTeM (UIMA)

WapitiLabel

synopsis

AlvisNLP

WapitiTrain

synopsis

AlvisNLP

WoSMig

Performs word segmentation on section contents.

AlvisNLP

WordNet

WordNet

GATE

WordNet 1.6

Princeton WordNet 1.6.

GATE

YateaProjector

synopsis

AlvisNLP

Zemanta Service PR

Runs a zemanta annotation service on a GATE document

GATE
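
For illustration, the n-gram components listed above (NGramAnnotator, NGrams) can be pictured with a small sketch. This is plain Python, not the DKPro Core or AlvisNLP implementation; the function name `ngrams` and the token-list input are assumptions made for the example.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams over a token list as tuples.

    Hypothetical helper for illustration; assumes n >= 1.
    """
    # Slide a window of width n over the tokens; a list shorter
    # than n yields an empty result.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    print(ngrams(["text", "mining", "is", "fun"], 2))
```

The real components attach n-grams as annotations with document offsets; the sketch only returns the token tuples.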

Chunker (7)

Component Description Framework

ANNIE VP Chunker

ANNIE VP Chunker component.

GATE

ILSP Chunker

No description

ILSP (UIMA)

Noun Phrase Chunker

Ready-made NP chunking application

GATE

Noun Phrase Chunker

Implementation of the Ramshaw and Marcus base noun phrase chunker

GATE

OpenNLP Chunker

Chunker using an OpenNLP maxent model

GATE

OpenNlpChunker

Chunk annotator using OpenNLP.

DKPro Core (UIMA)

TreeTaggerChunker

Chunk annotator using TreeTagger.

DKPro Core (UIMA)

Classifier (8)

Component Description Framework

Entity Classification Job Builder

Build a CrowdFlower job asking users to select the right label for entities

GATE

Entity Classification Results Importer

Import judgments from a CrowdFlower job created by the Entity Classification Job Builder as GATE annotations.

GATE

Majority-vote consensus builder (classification)

Process results of a crowd annotation task to find where annotators agree and disagree.

GATE

SelectingElementClassifier

Searches for discriminating attributes with Weka.

AlvisNLP

TaggingElementClassifier

Classifies elements with a Weka classifier.

AlvisNLP

Text Categorization PR

Classify text based on a semantic space

GATE

Textalytics Text Classification

Textalytics Text Classification

GATE

TrainingElementClassifier

Trains a Weka classifier where examples are elements.

AlvisNLP
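
The majority-vote consensus builder listed above works on crowd judgments. A minimal sketch of the general idea follows, assuming judgments arrive as a mapping from item id to a non-empty list of labels. The function name `majority_vote`, the data layout, and the strict-majority threshold are assumptions for this example, not the GATE plugin's API.

```python
from collections import Counter


def majority_vote(judgments, min_agreement=0.5):
    """Split items into agreed and disputed sets.

    judgments: dict mapping an item id to a non-empty list of labels
               (hypothetical layout, one label per annotator).
    An item counts as agreed when its most frequent label is chosen
    by strictly more than `min_agreement` of the annotators.
    """
    agreed, disputed = {}, {}
    for item, labels in judgments.items():
        label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) > min_agreement:
            agreed[item] = label
        else:
            # Keep the competing labels so the item can be adjudicated.
            disputed[item] = sorted(set(labels))
    return agreed, disputed


if __name__ == "__main__":
    sample = {"e1": ["PER", "PER", "ORG"], "e2": ["LOC", "ORG"]}
    print(majority_vote(sample))
```

Items that land in the disputed set would typically go to a further adjudication round rather than being discarded.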

Coreference (3)

Component Description Framework

ANNIE Nominal Coreferencer

Nominal Coreference resolution component

GATE

ANNIE Pronominal Coreferencer

Pronominal Coreference resolution component.

GATE

StanfordCoreferenceResolver

No description

DKPro Core (UIMA)

CrowdSourcing (1)

Component Description Framework

Entity Annotation Job Builder

Build a CrowdFlower job asking users to annotate entities within a snippet of text

GATE

Developers/Debugging (9)

Component Description Framework

DependencyDumper

Dump dependencies to screen.

DKPro Core (UIMA)

DocumentMetaDataStripper

Removes fields from the document meta data which may be different depending on the machine a test is run on.

DKPro Core (UIMA)

EDT Monitor

Warns whenever an AWT component is updated from anywhere other than the event dispatch thread

GATE

JCasHolder

Utility analysis engine for use with CAS multipliers in uimaFIT pipelines.

DKPro Core (UIMA)

Java Heap Dumper

Dumps the Java heap to the specified file

GATE

Log4J Level: ALL

Allows the Log4J log level to be set to ALL from within the GUI

GATE

Stopwatch

Can be used to measure how long the processing between two points in a pipeline takes.

DKPro Core (UIMA)

TagsetDescriptionStripper

No description

DKPro Core (UIMA)

Unload Unused Plugins

Unloads all plugins for which we cannot find any loaded instances

GATE

Evaluation (2)

Component Description Framework

CompareElements

Compares two sets of elements.

AlvisNLP

IAA Computation PR

Compute inter-annotator agreement (IAA).

GATE

Filtering (6)

Component Description Framework

AnnotationByLengthFilter

Removes annotations that do not conform to minimum or maximum length constraints.

DKPro Core (UIMA)

AnnotationByTextFilter

Reads a list of words from a text file (one token per line) and retains only tokens or other annotations that match any of these words.

DKPro Core (UIMA)

Boilerpipe Content Detection

Uses boilerpipe to determine which sections of a document are interesting content and which are just boilerplate

GATE

PosFilter

Removes all tokens/lemmas/stems/POS tags (depending on the "Mode" setting) that do not match the given parts of speech.

DKPro Core (UIMA)

RegexTokenFilter

Remove every token that does or does not match a given regular expression.

DKPro Core (UIMA)

StopWordRemover

Remove all of the specified types from the CAS if their covered text is in the stop word dictionary.

DKPro Core (UIMA)
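
Two of the filters above (AnnotationByLengthFilter and StopWordRemover) follow a simple keep-or-drop pattern. The sketch below shows that pattern on plain strings. The function names, default bounds, and the case-insensitive stop word comparison are assumptions for illustration; the DKPro Core components operate on CAS annotations and may behave differently in detail.

```python
def filter_by_length(tokens, min_len=1, max_len=30):
    """Keep only tokens whose length lies within [min_len, max_len]."""
    return [t for t in tokens if min_len <= len(t) <= max_len]


def remove_stop_words(tokens, stop_words):
    """Drop tokens whose text appears in the stop word list.

    Comparison is case-insensitive here, which is an assumption
    made for this sketch.
    """
    stop = {w.lower() for w in stop_words}
    return [t for t in tokens if t.lower() not in stop]


if __name__ == "__main__":
    tokens = ["The", "parser", "is", "a", "component"]
    print(remove_stop_words(filter_by_length(tokens, min_len=2), ["the", "is"]))
```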

Flow (8)

Component Description Framework

Annotation Merging PR

Merge Annotations from different annotators.

GATE

Annotation Set Transfer

Annotation set transfer component.

GATE

Combine Members PR

Combines documents in a composite document.

GATE

Delete Member PR

Deletes one member document from a compound doc.

GATE

Document Reset PR

Remove named annotation sets or reset the default annotation set

GATE

Scriptable Controller

A controller whose execution strategy is controlled by a Groovy script

GATE

Segment Processing PR

Processes individual segments as separate documents

GATE

Switch Member PR

Sets the focus of a compound document to a specified member document.

GATE

Gazetteer (16)

Component Description Framework

ANNIE Gazetteer

A list lookup component.

GATE

Arabic Gazetteer

A list lookup component.

GATE

Arabic Infered Gazetteer

A list lookup component.

GATE

Cebuano Gazetteer

A list lookup component.

GATE

DictionaryAnnotator

Takes a plain text file with phrases as input and annotates the phrases in the CAS file.

DKPro Core (UIMA)

Flexible Gazetteer

A more flexible list lookup component.

GATE

Hash Gazetteer

A list lookup component implemented by OntoText Lab.

GATE

Hindi Gazetteer

A list lookup component.

GATE

Hindi Tokeniser Gazetteer

A list lookup component.

GATE

Inflectional gazetteer

Gazetteer with support for inflectional morphology

GATE

Large KB Gazetteer

KIM KB based alias-lookup component

GATE

Onto Root Gazetteer

An ontology lookup component

GATE

OntoGazetteer

A list lookup component based on mapping between ontology classes and gazetteer lists.

GATE

Romanian Gazetteer

A list lookup component.

GATE

Russian Gazetteer

Customised version of the hash gazetteer

GATE

Sharable Gazetteer

A list lookup component.

GATE

Irrelevant (1)

Component Description Framework

The Duplicator

Duplicate any resource with a right click menu option

GATE

Keywords/Terms (3)

Component Description Framework

KEA Keyphrase Extractor

A Keyphrase Extractor by Eibe Frank.

GATE

KeywordsSelector

Selects most relevant keywords in documents.

AlvisNLP

YateaExtractor

Extract terms from the corpus using the YaTeA term extractor.

AlvisNLP

Language Identifier (7)

Component Description Framework

LangDetectLanguageIdentifier

Langdetect language identifier based on character n-grams.

DKPro Core (UIMA)

LanguageDetectorWeb1T

Language detector based on n-gram frequency counts, e.g. as provided by Web1T

DKPro Core (UIMA)

LanguageIdentifier

Detection based on character n-grams.

DKPro Core (UIMA)

LingPipe Language Identifier PR

GATE PR for language identification using LingPipe

GATE

TextCat Fingerprint Generator

Generate language fingerprints for use with the TextCat Language Identification PR

GATE

TextCat Language Identification

Recognizes the document language using TextCat

GATE

Textalytics Language Identification

Textalytics Language Identification

GATE

Lemmatizer (7)

Component Description Framework

ClearNlpLemmatizer

Lemmatizer using Clear NLP.

DKPro Core (UIMA)

GateLemmatizer

Wrapper for the GATE rule based lemmatizer.

DKPro Core (UIMA)

ILSP Lemmatizer

ILSP Lemmatizer assigns lemmas to tokens from Greek texts.

ILSP (UIMA)

LanguageToolLemmatizer

Naive lexicon-based lemmatizer.

DKPro Core (UIMA)

MateLemmatizer

DKPro Annotator for the MateToolsLemmatizer.

DKPro Core (UIMA)

MorphaLemmatizer

Lemmatize based on a finite-state machine.

DKPro Core (UIMA)

StanfordLemmatizer

Stanford Lemmatizer component.

DKPro Core (UIMA)

Machine Learning (2)

Component Description Framework

Batch Learning PR

Supports training, application and evaluation of machine learning models for NLP tasks

GATE

Machine Learning PR

Trains a machine learning algorithm from a corpus.

GATE

MorphTagger (3)

Component Description Framework

GATE Morphological analyser

Morphological Analyzer for the English Language.

GATE

RASP2 Morphological Analyser

RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter)

GATE

SfstAnnotator

Sfst morphological analyzer.

DKPro Core (UIMA)

Named Entity Recognizer (11)

Component Description Framework

ABNER

Wraps the ABNER entity identification system into the UIMA framework.

NaCTeM (UIMA)

CRF++ Trainer

Produces a Conditional Random Fields model.

NaCTeM (UIMA)

ILSP NERC

This module uses a Maximum Entropy NER engine that focuses on news-style text in Greek (EL) or English (EN).

ILSP (UIMA)

LingPipe NER PR

LingPipe Named Entity Recognizer

GATE

OpenNLP NER

NER PR using a set of OpenNLP maxent models

GATE

OpenNlpNamedEntityRecognizer

OpenNLP name finder wrapper.

DKPro Core (UIMA)

SVMLight Trainer

Produces an SVMLight model based on user-specified learning parameters.

NaCTeM (UIMA)

Stanford NER

Stanford Named Entity Recogniser

GATE

StanfordNER

synopsis

AlvisNLP

StanfordNamedEntityRecognizer

Stanford Named Entity Recognizer component.

DKPro Core (UIMA)

Yeast Metabliner

Annotates yeast metabolites with a supervised CRF-based NER system.

NaCTeM (UIMA)

Normalizer (19)

Component Description Framework

ApplyChangesAnnotator

Applies changes annotated using a SofaChangeAnnotation.

DKPro Core (UIMA)

Backmapper

After processing a file with the ApplyChangesAnnotator this annotator can be used to map the annotations created in the cleaned view back to the original view.

DKPro Core (UIMA)

CapitalizationNormalizer

Takes a text and replaces wrong capitalization

DKPro Core (UIMA)

CjfNormalizer

Converts traditional Chinese to simplified Chinese or vice-versa.

DKPro Core (UIMA)

Date Annotation Normalizer

provides normalized values for all existing date annotations

GATE

Date Normalizer

provides normalized values for all known dates

GATE

DictionaryBasedTokenTransformer

Reads a tab-separated file containing mappings from one token to another.

DKPro Core (UIMA)

Document normalizer

Normalize document content to remove "smart quotes" etc.

GATE

ExpressiveLengtheningNormalizer

Takes a text and shortens extra long words

DKPro Core (UIMA)

FileBasedTokenTransformer

Replaces all tokens that are listed in the file in #PARAM_MODEL_LOCATION by the string specified in #PARAM_REPLACEMENT.

DKPro Core (UIMA)

HyphenationRemover

Simple dictionary-based hyphenation remover.

DKPro Core (UIMA)

RegexBasedTokenTransformer

A JCasTransformerChangeBased_ImplBase implementation that replaces tokens based on a regular expression.

DKPro Core (UIMA)

ReplacementFileNormalizer

Takes a text and replaces desired expressions. This class should not work on tokens, as some expressions might span several tokens.

DKPro Core (UIMA)

SharpSNormalizer

Takes a text and replaces sharp s

DKPro Core (UIMA)

SpellingNormalizer

Converts annotations of the type SpellingAnomaly into a SofaChangeAnnotation.

DKPro Core (UIMA)

StanfordPtbTransformer

Uses the normalizing tokenizer of the Stanford CoreNLP tools to escape the text PTB-style.

DKPro Core (UIMA)

TokenCaseTransformer

Change tokens to follow a specific casing: all upper case, all lower case, or 'normal case': lowercase everything but the first character of a token and the characters immediately following a hyphen.

DKPro Core (UIMA)

Tweet Normaliser

Normalise texts in tweets (convert spelling mistakes, colloquialisms, typing variations and so on into standard English)

GATE

UmlautNormalizer

Takes a text and checks for umlauts written as "ae", "oe", or "ue" and normalizes them if they really are umlauts depending on a frequency model.

DKPro Core (UIMA)
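
The 'normal case' rule described for TokenCaseTransformer above is precise enough to sketch. The function below is a plain-Python illustration of that rule as worded in the description. The name `normal_case` is hypothetical, and the code is not taken from DKPro Core.

```python
def normal_case(token: str) -> str:
    """Lowercase every character except the first character of the
    token and any character immediately following a hyphen.

    Those exempt characters are left unchanged (not forced to upper case).
    """
    out = []
    keep_next = True  # the first character is left as-is
    for ch in token:
        out.append(ch if keep_next else ch.lower())
        # The character right after a hyphen is also left as-is.
        keep_next = (ch == "-")
    return "".join(out)


if __name__ == "__main__":
    print(normal_case("HELLO-WORLD"))  # prints "Hello-World"
```

Note that the rule only protects characters from lowercasing; a token such as "iPhone" becomes "iphone" because its first character was already lower case.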

Parser (24)

Component Description Framework

BerkeleyParser

Berkeley Parser annotator.

DKPro Core (UIMA)

CCGParser

Syntax parsing with CCG Parser.

AlvisNLP

ClearNlpParser

Clear parser annotator.

DKPro Core (UIMA)

English Dependency Parser

Ready-made application for Stanford English parser

GATE

English POS Tagger and Dependency Parser

Ready-made application for Stanford English POS tagger and parser

GATE

Enju Parser

A syntactic parser for English.

NaCTeM (UIMA)

EnjuParser

Parses sentences with the ENJU dependency parser.

AlvisNLP

EnjuParser2

synopsis

AlvisNLP

FreelingShallowParser

Performs tokenisation, lemmatisation, POS tagging and shallow parsing (chunking).

NaCTeM (UIMA)

GENIA Dependency Parser

A dependency parser for biomedical text.

NaCTeM (UIMA)

ILSP Dependency Parser

ILSP Dependency Parser is a tool trained on the Greek Dependency Treebank (Prokopidis et al., 2005), a resource which comprises data annotated at several linguistic levels.

ILSP (UIMA)

MaltParser

Dependency parsing using MaltParser.

DKPro Core (UIMA)

MateParser

DKPro Annotator for the MateToolsParser.

DKPro Core (UIMA)

Minipar Wrapper

MiniPar is a shallow parser.

GATE

MstParser

Dependency parsing using MSTParser.

DKPro Core (UIMA)

OpenNLP Parser

Syntactic parser from Apache OpenNLP

GATE

OpenNLPParser

Parse the document and create phrasal and clausal annotations over the text.

NaCTeM (UIMA)

OpenNlpParser

OpenNLP parser.

DKPro Core (UIMA)

RASP2 Parser

RASP dependency parser

GATE

Stanford Dependency Parser

Generates Stanford-style dependencies together with POS tokens for English.

NaCTeM (UIMA)

StanfordDependencyConverter

Converts a constituency structure into a dependency structure.

DKPro Core (UIMA)

StanfordParser

Stanford parser wrapper

GATE

StanfordParser

Stanford Parser component.

DKPro Core (UIMA)

Textalytics Lemmatization, PoS and Parsing

Textalytics Lemmatization, PoS and Parsing

GATE

Pre-built Workflows (12)

Component Description Framework

Arabic IE System

Ready-made Arabic IE application

GATE

Cebuano IE System

Ready-made Cebuano IE application

GATE

Chinese IE System

Ready-made Chinese IE application

GATE

French IE System

Ready-made French IE application

GATE

German IE System

Ready-made German IE application

GATE

Measurements

Ready-made application for measurement annotator

GATE

Romanian IE System

Ready-made Romanian IE application

GATE

RussIE

Basic version of the RussIE application

GATE

RussIE + Inflectional Gazetteer & OrthoMatcher

RussIE application with orthomatcher and inflexional gazetteer

GATE

RussIE + Inflectional Gazetteer

RussIE application with inflexional gazetteer

GATE

RussIE + OrthoMatcher

RussIE application with orthomatcher

GATE

TwitIE (EN)

English TwitIE application

GATE

Readability (1)

Component Description Framework

ReadabilityAnnotator

Assign a set of popular readability scores to the text.

DKPro Core (UIMA)

SRL (2)

Component Description Framework

ClearNlpSemanticRoleLabeler

ClearNLP semantic role labeller.

DKPro Core (UIMA)

MateSemanticRoleLabeler

DKPro Annotator for the MateTools Semantic Role Labeler.

DKPro Core (UIMA)

Scripted analytics (6)

Component Description Framework

Groovy scripting PR

Runs a Groovy script as a processing resource

GATE

JAPE Transducer

A module for executing Jape grammars.

GATE

JAPE-Plus Transducer

An optimised, JAPE-compatible transducer.

GATE

RunProlog

Runs a Prolog program with the corpus data structure encoded as facts.

AlvisNLP

Script

Runs a script.

AlvisNLP

UIMA Analysis Engine

Wrapper for a Text Analysis Engine from UIMA.

GATE

Segmenter (55)

Component Description Framework

ANNIE English Tokeniser

A customisable English tokeniser.

GATE

ANNIE Sentence Splitter

ANNIE sentence splitter.

GATE

Arabic Tokeniser

A customisable English tokeniser.

GATE

ArktweetTokenizer

ArkTweet tokenizer.

DKPro Core (UIMA)

Banner Base Tokenizer

Tokens returned by this class consist primarily of contiguous alphanumeric characters or single punctuation marks; however, certain constructs such as real numbers and percentages are recognized and returned as a single token.

NaCTeM (UIMA)

Banner Simple Tokenizer

Tokens output by this tokenizer consist of a contiguous block of alphanumeric characters or a single punctuation mark.

NaCTeM (UIMA)

Banner Whitespace Tokenizer

Instances of this class tokenize sentences only at whitespace characters.

NaCTeM (UIMA)

BreakIteratorSegmenter

BreakIterator segmenter.

DKPro Core (UIMA)

Cafetiere Sentence Splitter

Uses a set of heuristics and patterns to find sentence boundaries.

NaCTeM (UIMA)

CamelCaseTokenSegmenter

Split up existing tokens again if they are camel-case text.

DKPro Core (UIMA)

Cebuano Gazetteer Tokeniser

A list lookup component.

GATE

Cebuano Tokeniser

A customisable English tokeniser.

GATE

Chinese Segmenter PR

Segment the Chinese text into words, based on the PAUM learning algorithm.

GATE

ClearNlpSegmenter

Tokenizer using Clear NLP.

DKPro Core (UIMA)

CompoundAnnotator

Annotates compound parts and linking morphemes.

DKPro Core (UIMA)

Freeling Sentence Splitter

Performs tokenisation.

NaCTeM (UIMA)

FreelingTokenizer

Performs tokenisation.

NaCTeM (UIMA)

GATE Unicode Tokeniser

A customisable Unicode tokeniser.

GATE

GENIA Sentence Splitter

A processing resource that takes document and corpus parameters

GATE

GENIA Sentence Splitter

Machine learning-based sentence splitter optimized for biomedical texts.

NaCTeM (UIMA)

Hashtag Tokenizer

Tokenizes Multi-Word Hashtags

GATE

Hindi Splitter

A Sentence Splitter.

GATE

Hindi Tokeniser

A customisable Hindi tokeniser.

GATE

ILSP Paragraph, Sentence and Token Segmentor

This module is a regex- and abbreviation-based segmentor targeting texts written in Greek.

ILSP (UIMA)

IULATokenizer

Performs paragraph splitting, sentence splitting, and tokenisation.

NaCTeM (UIMA)

JTokSegmenter

JTok segmenter.

DKPro Core (UIMA)

LanguageToolSegmenter

Segmenter using LanguageTool to do the heavy lifting.

DKPro Core (UIMA)

LineBasedSentenceSegmenter

Annotates each line in the source text as a sentence.

DKPro Core (UIMA)

LingPipe Sentence Splitter

Sentence splitter based on LingPipe models.

NaCTeM (UIMA)

LingPipe Sentence Splitter PR

Provides an interface to LingPipe sentence splitter API.

GATE

LingPipe Tokenizer PR

Provides a LingPipe tokenizer.

GATE

MLRS Maltese Tokeniser

Tokenises Maltese text

NaCTeM (UIMA)

MLRS Paragraph Splitter

Identifies the paragraphs in the text, creating a Paragraph annotation for each one

NaCTeM (UIMA)

MLRS Sentence Splitter

Identifies the sentences in the text, creating a Sentence annotation for each

NaCTeM (UIMA)

OSCAR 4 Tokeniser

Segments text into tokens.

NaCTeM (UIMA)

OgmiosTokenizer

Tokenizes the sections contents according to the Ogmios tokenizer specifications.

AlvisNLP

OpenNLP Sentence Splitter

Sentence splitter using an OpenNLP maxent model

GATE

OpenNLP Tokenizer

Tokenizer using an OpenNLP maxent model

GATE

OpenNLPTokenizer

Tokenize the text and create token annotations that span the tokens.

NaCTeM (UIMA)

OpenNlpSegmenter

Tokenizer and sentence splitter using OpenNLP.

DKPro Core (UIMA)

ParagraphSplitter

This class creates paragraph annotations for the given input document.

DKPro Core (UIMA)

PatternBasedTokenSegmenter

Split up existing tokens again at particular split-chars.

DKPro Core (UIMA)

Penn BioTokenizer

Tokenizer for biomedical text

GATE

RASP2 Tokenizer

RASP2 Tokenizer.

GATE

RegEx Sentence Splitter

A sentence splitter based on regular expressions.

GATE

RegexTokenizer

This segmenter splits sentences and tokens based on regular expressions that define the sentence and token boundaries.

DKPro Core (UIMA)

Romanian Tokeniser

A customisable Romanian tokeniser.

GATE

Stanford PTB Tokenizer

Stanford Penn Treebank v3 Tokenizer, for English

GATE

StanfordSegmenter

No description

DKPro Core (UIMA)

TokenMerger

Merges any Tokens that are covered by a given annotation type.

DKPro Core (UIMA)

TokenTrimmer

Remove prefixes and suffixes from tokens.

DKPro Core (UIMA)

TrailingCharacterRemover

Removes trailing characters (or character sequences) from tokens, e.g. punctuation.

DKPro Core (UIMA)

Twitter Tokenizer (EN)

Tokenizer tuned for Tweets

GATE

UAICTokenizerDescriptor

No description

NaCTeM (UIMA)

WhitespaceTokenizer

A strict whitespace tokenizer, i.e. tokenizes according to whitespaces and linebreaks only.

DKPro Core (UIMA)
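
Two of the simpler segmenters above (WhitespaceTokenizer and CamelCaseTokenSegmenter) can be pictured with a short sketch. It assumes tokens are represented as `(begin, end, text)` tuples and that camel-case splitting happens at each lowercase-to-uppercase boundary. Both choices, and the function names, are assumptions for illustration and not the DKPro Core behaviour in every detail.

```python
import re


def whitespace_tokenize(text):
    """Split strictly at whitespace and line breaks.

    Returns (begin, end, text) tuples so that each token keeps
    its character offsets in the source text.
    """
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


def split_camel_case(token):
    """Split a non-empty token at each lowercase-to-uppercase boundary."""
    parts, start = [], 0
    for i in range(1, len(token)):
        if token[i - 1].islower() and token[i].isupper():
            parts.append(token[start:i])
            start = i
    parts.append(token[start:])
    return parts


if __name__ == "__main__":
    for begin, end, tok in whitespace_tokenize("parse camelCaseText now"):
        print(begin, end, split_camel_case(tok))
```

Under this simple boundary rule an acronym followed by a word, such as "HTTPServer", stays in one piece; a real segmenter may treat such cases differently.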

Semantics (2)

Component Description Framework

Semantic Enrichment PR

The Semantic Enrichment PR allows adding new data to semantic annotations by querying external RDF (Linked Data) repositories.

GATE

SemanticFieldAnnotator

This Analysis Engine annotates English single words with semantic field information retrieved from an ExternalResource.

DKPro Core (UIMA)

Sentiment (1)

Component Description Framework

Textalytics Sentiment Analysis

Textalytics Sentiment Analysis

GATE

Spelling/Grammar (5)

Component Description Framework

CorrectionsContextualizer

This component assumes that some spell checker has already been applied upstream.

DKPro Core (UIMA)

JazzyChecker

This annotator uses Jazzy for the decision whether a word is spelled correctly or not.

DKPro Core (UIMA)

LanguageToolChecker

Detects grammatical errors in text using LanguageTool, a rule-based grammar checker.

DKPro Core (UIMA)

NorvigSpellingCorrector

Creates SofaChangeAnnotations containing corrections for previously identified spelling errors.

DKPro Core (UIMA)

Textalytics Spell, Grammar and Style Proofreading

Textalytics Spell, Grammar and Style Proofreading

GATE

Stemmer (4)

Component Description Framework

BulStem

This plugin is an implementation of the BulStem stemmer algorithm for Bulgarian developed by Preslav Nakov.

GATE

PorterStemmer

synopsis

AlvisNLP

SnowballStemmer

UIMA wrapper for the Snowball stemmer.

DKPro Core (UIMA)

Stemmer PR

Wrapper for the Snowball stemmer.

GATE

Tagger (52)

Component Description Framework

ABNER Tagger

GATE wrapper over ABNER

GATE

ANNIE POS Tagger

Mark Hepple's Brill-style POS tagger

GATE

Anatomical Entity Tagger

Tags anatomical entities using Brown, UMLS and OBO Anatomy dictionary features

NaCTeM (UIMA)

ArktweetPosTagger

Wrapper for Twitter Tokenizer and POS Tagger.

DKPro Core (UIMA)

BANNER CRF Tagger

A UIMA wrapper for BANNER entity tagger.

NaCTeM (UIMA)

BioCreative Gene Mention Tagger

Tags Gene mentions using a model trained on BioCreative GM task data, with Entrez Gene and UMLS dictionary features.

NaCTeM (UIMA)

CCGPosTagger

Applies the CCG POS tagger on annotations.

AlvisNLP

CRF++ Tagger

Uses Conditional Random Fields model for labeling.

NaCTeM (UIMA)

Cebuano POS Tagger

Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword

GATE

Chemistry Tagger

A tagger for chemical names.

GATE

ClearNlpPosTagger

Part-of-Speech annotator using Clear NLP.

DKPro Core (UIMA)

FreelingTagger

Performs tokenisation, lemmatisation and POS tagging.

NaCTeM (UIMA)

GENIA Tagger

Tags biological named entities: proteins, cell lines, cell types, DNAs, and RNAs.

NaCTeM (UIMA)

GenericTagger

The Generic Tagger is Generic!

GATE

GeniaTagger

Runs Genia Tagger on annotations.

AlvisNLP

Hepple POS Tagger

Mark Hepple's POS tagger, from dragontools/Banner toolkit.

NaCTeM (UIMA)

HepplePosTagger

GATE Hepple part-of-speech tagger.

DKPro Core (UIMA)

Hindi POS Tagger

Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword

GATE

HunPosTagger

Part-of-Speech annotator using HunPos.

DKPro Core (UIMA)

ILSP FBT Tagger

ILSP FBT Tagger is an adaptation of the Brill tagger trained on Greek text.

ILSP (UIMA)

IULATagger

Performs paragraph splitting, sentence splitting, tokenisation and POS tagging.

NaCTeM (UIMA)

LingPipe POS Tagger PR

Provides a LingPipe part of speech tagger.

GATE

MateMorphTagger

DKPro Annotator for the MateToolsMorphTagger.

DKPro Core (UIMA)

MatePosTagger

DKPro Annotator for the MateToolsPosTagger

DKPro Core (UIMA)

MeCabTagger

Annotator for the MeCab Japanese POS Tagger.

DKPro Core (UIMA)

Measurement Tagger

A measurement tagger based upon GNU Units

GATE

Medical Condition Tagger

A tagger that recognises mentions of medical conditions.

NaCTeM (UIMA)

NormaGene Tagger

A processing resource that takes document and corpus parameters

GATE

Numbers Tagger

Finds numbers (in both words and digits) and annotates them with their numeric value

GATE

OpenCalais Tagger

An OpenCalais based semantic annotator

GATE

OpenNLP POS Tagger

POS Tagger using an OpenNLP maxent model

GATE

OpenNlpPosTagger

Part-of-Speech annotator using OpenNLP.

DKPro Core (UIMA)

POS Mapper

Map complex Russian morphology tags into simpler POS categories

GATE

Penn BioTagger

Ready-made application for the Penn BioTagger

GATE

Penn BioTagger: Genes

Penn BioTagger for Genes

GATE

Penn BioTagger: Malignancy

Penn BioTagger for malignancy types

GATE

Penn BioTagger: Variation

Penn BioTagger for variations

GATE

PosMapper

Maps existing POS tags from one tagset to another using a user provided properties file.

DKPro Core (UIMA)

RASP POS Converter

Converts from PennTreebank POS tags to the C2 tagset used by RASP.

GATE

RASP2 POS Tagger

RASP part-of-speech tagger, creating WordForm annotations

GATE

RfTagger

RFTagger morphological analyzer.

DKPro Core (UIMA)

Roman Numerals Tagger

Finds and annotates Roman numerals

GATE

Russian POS Tagger

Part-of-speech tagger for Russian

GATE

SVMLight Tagger

Applies an SVMLight-trained model on instances.

NaCTeM (UIMA)

Species Tagger

Tags species

NaCTeM (UIMA)

Stanford POS Tagger

Stanford Part-of-Speech Tagger

GATE

StanfordPosTagger

Stanford Part-of-Speech tagger component.

DKPro Core (UIMA)

Stepp Tagger

No description

NaCTeM (UIMA)

TreeTagger

Runs tree-tagger.

AlvisNLP

TreeTaggerPosTagger

Part-of-Speech and lemmatizer annotator using TreeTagger.

DKPro Core (UIMA)

Twitter POS Tagger (EN)

Stanford POS tagger trained on Tweets

GATE

UaicPosTagger

Carries out sentence splitting, tokenisation, POS tagging and lemmatisation on plain text.

NaCTeM (UIMA)

Topics (3)

Component Description Framework

MalletTopicModelEstimator

Estimate an LDA topic model using Mallet and write it to a file.

DKPro Core (UIMA)

MalletTopicModelInferencer

Infers the topic distribution over documents using a Mallet ParallelTopicModel.

DKPro Core (UIMA)

Textalytics Topics Extraction

Textalytics Topics Extraction

GATE

Validation (1)

Component Description Framework

Schema Enforcer

Produces an annotation set whose content is restricted by the specified set of schemas

GATE

Viewer/Editor (18)

Component Description Framework

Compound Document Editor

Editor for compound documents.

GATE

GATE Ontology Editor

Ontology editing tool.

GATE

GAZE

Gazetteer viewer and editor

GATE

Gazetteer Editor

Gazetteer viewer and editor.

GATE

JAPE-Plus Viewer

A JAPE grammar file viewer

GATE

Jape Viewer

A JAPE grammar file viewer

GATE

OAT

Ontology Annotation Tool.

GATE

Pairbank Viewer

Viewer for the TermRaider Pairbank

GATE

RAT-C

Relation Annotation Tool Class view.

GATE

RAT-I

Relation Annotation Tool Instance view.

GATE

Schema Annotations Editor

An annotation editor restricted by schemas.

GATE

Script Editor

Editor for the Groovy script behind this PR

GATE

Shell

Starts an interactive shell for querying the corpus data structure.

AlvisNLP

Shell2

Starts an interactive shell for querying the corpus data structure.

AlvisNLP

Simple Schema Viewer

A Simple Annotation Schema Viewer

GATE

Syntax tree viewer

Viewer for syntax trees generated by a parser.

GATE

Termbank Viewer

Viewer for the TermRaider Termbank

GATE

WordNet Viewer

WordNet viewer

GATE

Analytics by product

(original) AlvisNLP (52)

The components listed here could not be associated with a known third-party tool collection and are assumed to be original components.

Component Description Framework

Ab3P

synopsis

AlvisNLP

Action

Applies action expressions on selected elements.

AlvisNLP

AggregateValues

synopsis

AlvisNLP

AlvisREPrepareCrossValidation

synopsis

AlvisNLP

AnchorTuples

Creates tuples with a common argument.

AlvisNLP

AntecedentChoice

Biotopes-specific module: chooses an antecedent.

AlvisNLP

Assert

Tests an assertion on specified elements.

AlvisNLP

AttestedTermsProjector

Projects a list of terms given in tree-tagger format.

AlvisNLP

CartesianProductTuples

Creates tuples for each element of a Cartesian product.

AlvisNLP

CompareElements

Compares two sets of elements.

AlvisNLP

DisambiguateAlternatives

Disambiguate features that have multiple values.

AlvisNLP

ElementMapper

Maps elements according to a collection of mapping elements.

AlvisNLP

ElementProjector

Searches for entries in a dictionary generated by an expression.

AlvisNLP

ElementProjector2

synopsis

AlvisNLP

FileMapper

Maps the value of an annotation feature according to a mapping file.

AlvisNLP

FileMapper2

Maps elements according to a tab-separated mapping file.

AlvisNLP

InsertContents

synopsis

AlvisNLP

KeywordsSelector

Selects most relevant keywords in documents.

AlvisNLP

LayerComparator

Compares annotations in two different layers.

AlvisNLP

MergeLayers

Creates a new layer in each section containing all annotations in source layers.

AlvisNLP

MergeSections

Merge several sections into a single one.

AlvisNLP

NGrams

Computes annotation n-grams.

AlvisNLP

NewCount

Counts element occurrences and writes the results in a file, including tfidf.

AlvisNLP

OBOMapper

synopsis

AlvisNLP

OBOProjector

Projects OBO terms and synonyms on sections.

AlvisNLP

OntoReif

synopsis

AlvisNLP

PatternMatcher

Matches a regular expression-like pattern on the sequence of annotations in a given layer.

AlvisNLP

ProminentConceptReporter

synopsis

AlvisNLP

QuickHTML

synopsis

AlvisNLP

RegExp

Matches a regular expression against section contents and creates an annotation for each match.

AlvisNLP

RemoveContents

synopsis

AlvisNLP

RemoveEquivalent

Removes duplicate elements.

AlvisNLP

RemoveOverlaps

Removes overlapping annotations from a given layer.

AlvisNLP

RunProlog

Runs a Prolog program with the corpus data structure encoded as facts.

AlvisNLP

SQLImport

synopsis

AlvisNLP

Script

Runs a script.

AlvisNLP

SeSMig

Detects sentence boundaries and creates one annotation for each sentence. This module assumes that WoSMig has already processed the same sections.

AlvisNLP

SelectingElementClassifier

Searches for discriminating attributes with Weka.

AlvisNLP

Sequence_Impl

Sequence of modules.

AlvisNLP

Shell

Starts an interactive shell for querying the corpus data structure.

AlvisNLP

Shell2

Starts an interactive shell for querying the corpus data structure.

AlvisNLP

SimpleProjector

Projects a simple dictionary on sections.

AlvisNLP

SimpleProjector2

Projects a simple dictionary on sections.

AlvisNLP

SplitOverlaps

Splits overlapping annotations.

AlvisNLP

TaggingElementClassifier

Classifies elements with a Weka classifier.

AlvisNLP

TomapProjector

synopsis

AlvisNLP

TomapTrain

synopsis

AlvisNLP

TrainingElementClassifier

Trains a Weka classifier where examples are elements.

AlvisNLP

TyDIProjector

Projects terms from a TyDI export.

AlvisNLP

WapitiLabel

synopsis

AlvisNLP

WapitiTrain

synopsis

AlvisNLP

WoSMig

Performs word segmentation on section contents.

AlvisNLP

(original) DKPro Core (UIMA) (52)

The components listed here could not be associated with a known third-party tool collection and are assumed to be original components.

Component Description Framework

AnnotationByLengthFilter

Removes annotations that do not conform to minimum or maximum length constraints.

DKPro Core (UIMA)

AnnotationByTextFilter

Reads a list of words from a text file (one token per line) and retains only tokens or other annotations that match any of these words.

DKPro Core (UIMA)

ApplyChangesAnnotator

Applies changes annotated using a SofaChangeAnnotation.

DKPro Core (UIMA)

AssertAnnotations$InternalJCasHolder

Descriptor automatically generated by uimaFIT

DKPro Core (UIMA)

Backmapper

After processing a file with the ApplyChangesAnnotator this annotator can be used to map the annotations created in the cleaned view back to the original view.

DKPro Core (UIMA)

BerkeleyParser

Berkeley Parser annotator.

DKPro Core (UIMA)

CamelCaseTokenSegmenter

Split up existing tokens again if they are camel-case text.

DKPro Core (UIMA)

CapitalizationNormalizer

Takes a text and replaces wrong capitalization

DKPro Core (UIMA)

ColognePhoneticTranscriptor

Cologne phonetic (Kölner Phonetik) transcription based on Apache Commons Codec.

DKPro Core (UIMA)

CompoundAnnotator

Annotates compound parts and linking morphemes.

DKPro Core (UIMA)

CorrectionsContextualizer

This component assumes that some spell checker has already been applied upstream.

DKPro Core (UIMA)

DependencyDumper

Dump dependencies to screen.

DKPro Core (UIMA)

DictionaryAnnotator

Takes a plain text file with phrases as input and annotates the phrases in the CAS file.

DKPro Core (UIMA)

DictionaryBasedTokenTransformer

Reads a tab-separated file containing mappings from one token to another.

DKPro Core (UIMA)

DocumentMetaDataStripper

Removes fields from the document meta data which may be different depending on the machine a test is run on.

DKPro Core (UIMA)

DoubleMetaphonePhoneticTranscriptor

Double-Metaphone phonetic transcription based on Apache Commons Codec.

DKPro Core (UIMA)

ExpressiveLengtheningNormalizer

Takes a text and shortens extra long words

DKPro Core (UIMA)

FileBasedTokenTransformer

Replaces all tokens that are listed in the file in #PARAM_MODEL_LOCATION by the string specified in #PARAM_REPLACEMENT.

DKPro Core (UIMA)

GateLemmatizer

Wrapper for the GATE rule based lemmatizer.

DKPro Core (UIMA)

GermanSeparatedParticleAnnotator

Annotator to be used for post-processing of German corpora that have been lemmatized and POS-tagged with the TreeTagger, based on the STTS tagset.

DKPro Core (UIMA)

HyphenationRemover

Simple dictionary-based hyphenation remover.

DKPro Core (UIMA)

IOTestRunner$Validator

Descriptor automatically generated by uimaFIT

DKPro Core (UIMA)

JCasHolder

Utility analysis engine for use with CAS multipliers in uimaFIT pipelines.

DKPro Core (UIMA)

LineBasedSentenceSegmenter

Annotates each line in the source text as a sentence.

DKPro Core (UIMA)

MalletTopicModelEstimator

Estimate an LDA topic model using Mallet and write it to a file.

DKPro Core (UIMA)

MateParser

DKPro Annotator for the MateToolsParser.

DKPro Core (UIMA)

MetaphonePhoneticTranscriptor

Metaphone phonetic transcription based on Apache Commons Codec.

DKPro Core (UIMA)

MstParser

Dependency parsing using MSTParser.

DKPro Core (UIMA)

NGramAnnotator

N-gram annotator.

DKPro Core (UIMA)

NorvigSpellingCorrector

Creates SofaChangeAnnotations containing corrections for previously identified spelling errors.

DKPro Core (UIMA)

ParagraphSplitter

This class creates paragraph annotations for the given input document.

DKPro Core (UIMA)

PatternBasedTokenSegmenter

Split up existing tokens again at particular split-chars.

DKPro Core (UIMA)

PosFilter

Removes all tokens/lemmas/stems/POS tags (depending on the "Mode" setting) that do not match the given parts of speech.

DKPro Core (UIMA)

PosMapper

Maps existing POS tags from one tagset to another using a user provided properties file.

DKPro Core (UIMA)

ReadabilityAnnotator

Assign a set of popular readability scores to the text.

DKPro Core (UIMA)

RegexBasedTokenTransformer

A JCasTransformerChangeBased_ImplBase implementation that replaces tokens based on a regular expression.

DKPro Core (UIMA)

RegexTokenFilter

Remove every token that does or does not match a given regular expression.

DKPro Core (UIMA)

RegexTokenizer

This segmenter splits sentences and tokens based on regular expressions that define the sentence and token boundaries.

DKPro Core (UIMA)

ReplacementFileNormalizer

Takes a text and replaces desired expressions. This class should not work on tokens, as some expressions might span several tokens.

DKPro Core (UIMA)

SharpSNormalizer

Takes a text and replaces sharp s

DKPro Core (UIMA)

SoundexPhoneticTranscriptor

Soundex phonetic transcription based on Apache Commons Codec.

DKPro Core (UIMA)

SpellingNormalizer

Converts annotations of the type SpellingAnomaly into a SofaChangeAnnotation.

DKPro Core (UIMA)

StopWordRemover

Remove all of the specified types from the CAS if their covered text is in the stop word dictionary.

DKPro Core (UIMA)

Stopwatch

Can be used to measure how long the processing between two points in a pipeline takes.

DKPro Core (UIMA)

TagsetDescriptionStripper

No description

DKPro Core (UIMA)

TfidfAnnotator

This component adds Tfidf annotations consisting of a term and a tfidf weight.

DKPro Core (UIMA)

TokenCaseTransformer

Change tokens to follow a specific casing: all upper case, all lower case, or 'normal case': lowercase everything but the first character of a token and the characters immediately following a hyphen.

DKPro Core (UIMA)

TokenMerger

Merges any Tokens that are covered by a given annotation type.

DKPro Core (UIMA)

TokenTrimmer

Remove prefixes and suffixes from tokens.

DKPro Core (UIMA)

TrailingCharacterRemover

Removing trailing character (sequences) from tokens, e.g. punctuation.

DKPro Core (UIMA)

UmlautNormalizer

Takes a text and checks for umlauts written as "ae", "oe", or "ue" and normalizes them if they really are umlauts depending on a frequency model.

DKPro Core (UIMA)

WhitespaceTokenizer

A strict whitespace tokenizer, i.e. tokenizes according to whitespaces and linebreaks only.

DKPro Core (UIMA)

(original) GATE (135)

The components listed here could not be associated with a known third-party tool collection and are assumed to be original components.

Component Description Framework

ANNIE English Tokeniser

A customisable English tokeniser.

GATE

ANNIE Gazetteer

A list lookup component.

GATE

ANNIE NE Transducer

ANNIE named entity grammar.

GATE

ANNIE Nominal Coreferencer

Nominal Coreference resolution component

GATE

ANNIE OrthoMatcher

ANNIE orthographical coreference component.

GATE

ANNIE Pronominal Coreferencer

Pronominal Coreference resolution component.

GATE

ANNIE Sentence Splitter

ANNIE sentence splitter.

GATE

ANNIE VP Chunker

ANNIE VP Chunker component.

GATE

ANNIE+Measurements

Ready-made application for ANNIE plus the measurement tagger

GATE

Annotation Merging PR

Merge Annotations from different annotators.

GATE

Annotation Set Transfer

Annotation set transfer component.

GATE

Arabic Gazetteer

A list lookup component.

GATE

Arabic Gazetteer Collector

No description

GATE

Arabic IE System

Ready-made Arabic IE application

GATE

Arabic Infered Gazetteer

A list lookup component.

GATE

Arabic Main Grammar

A module for executing Jape grammars.

GATE

Arabic OrthoMatcher

ANNIE orthographical coreference component.

GATE

Arabic Tokeniser

A customisable English tokeniser.

GATE

BDM Computation PR

Compute BDM score for each pair of concepts in the given ontology.

GATE

Batch Learning PR

Supports training, application and evaluation of machine learning models for NLP tasks

GATE

Boilerpipe Content Detection

Uses boilerpipe to determine which sections of a document are interesting content and which are just boilerplate

GATE

CSV Corpus Populater

Populate a corpus from CSV files

GATE

Cebuano Gazetteer

A list lookup component.

GATE

Cebuano Gazetteer Tokeniser

A list lookup component.

GATE

Cebuano IE System

Ready-made Cebuano IE application

GATE

Cebuano Tokeniser

A customisable English tokeniser.

GATE

Cebuano Transducer

A module for executing Jape grammars.

GATE

Cebuano Transducer Postprocessor

A module for executing Jape grammars.

GATE

Chemistry Tagger

A tagger for chemical names.

GATE

Chinese IE System

Ready-made Chinese IE application

GATE

Chinese Segmenter PR

Segment the Chinese text into words, based on the PAUM learning algorithm.

GATE

Combine Members PR

Combines documents in a composite document.

GATE

Compound Document

GATE Compound Document.

GATE

Compound Document Editor

Editor for compound documents.

GATE

Compound Document From Xml

GATE Compound Document.

GATE

ConnectSesameOntology

Connect to a repository containing an ontology

GATE

Control Script

Editor for the Groovy script controlling a scriptable controller

GATE

Copy Anns to Another Doc PR

Copy the annotations from one document to another document.

GATE

Corpus Indexing Support

No description

GATE

Crawler PR

GATE implementation of the Websphinx crawling API

GATE

CreateSesameOntology

Create an ontology from a Sesame configuration file for a repository

GATE

Date Annotation Normalizer

Provides normalized values for all existing date annotations

GATE

Date Normalizer

Provides normalized values for all known dates

GATE

Delete Member PR

Deletes one member document from a compound doc.

GATE

Document Reset PR

Remove named annotation sets or reset the default annotation set

GATE

Document normalizer

Normalize document content to remove "smart quotes" etc.

GATE

DocumentFrequencyBank

Document frequency counter derived from corpora and other DFBs

GATE

EDT Monitor

Warns whenever an AWT component is updated from anywhere other than the event dispatch thread

GATE

Flexible Gazetteer

A more flexible list lookup component.

GATE

French IE System

Ready-made French IE application

GATE

GATE Composite document

GATE Composite document.

GATE

GATE Morphological analyser

Morphological Analyzer for the English Language.

GATE

GATE Ontology Editor

Ontology editing tool.

GATE

GATE Unicode Tokeniser

A customisable Unicode tokeniser.

GATE

GAZE

Gazetteer viewer and editor

GATE

Gazetteer Editor

Gazetteer viewer and editor.

GATE

Gazetteer List Collector

Gazetteer lists collector.

GATE

GenericTagger

The Generic Tagger is Generic!

GATE

German IE System

Ready-made German IE application

GATE

Groovy scripting PR

Runs a Groovy script as a processing resource

GATE

Groovy support for GATE

No description

GATE

Hash Gazetteer

A list lookup component implemented by OntoText Lab.

GATE

Hashtag Tokenizer

Tokenizes Multi-Word Hashtags

GATE

Hindi Gazetteer

A list lookup component.

GATE

Hindi Main Grammar

A module for executing Jape grammars

GATE

Hindi OrthoMatcher

Hindi Orthomatcher

GATE

Hindi Splitter

A Sentence Splitter.

GATE

Hindi Tokeniser

A customisable Hindi tokeniser.

GATE

Hindi Tokeniser Gazetteer

A list lookup component.

GATE

Hindi Tokeniser Postprocessor

A module for executing Jape grammars

GATE

IAA Computation PR

Compute inter-annotator agreement (IAA).

GATE

Inflectional gazetteer

Gazetteer with support for inflectional morphology

GATE

JAPE Transducer

A module for executing Jape grammars.

GATE

JAPE-Plus Transducer

An optimised, JAPE-compatible transducer.

GATE

JAPE-Plus Viewer

A JAPE grammar file viewer

GATE

Jape Viewer

A JAPE grammar file viewer

GATE

Java Heap Dumper

Dumps the Java heap to the specified file

GATE

Large KB Gazetteer

KIM KB-based alias-lookup component

GATE

Linguistic Simplifier

A processing resource that takes document and corpus parameters

GATE

Linguistic Simplifier

Example application for the linguistic simplifier

GATE

Log4J Level: ALL

Allows the Log4J log level to be set to ALL from within the GUI

GATE

Machine Learning PR

Trains a machine learning algorithm from a corpus.

GATE

Majority-vote consensus builder (annotation)

Process results of a crowd annotation task to find where annotators agree and disagree.

GATE

Majority-vote consensus builder (classification)

Process results of a crowd annotation task to find where annotators agree and disagree.

GATE

Measurement Tagger

A measurement tagger based upon GNU Units

GATE

Measurements

Ready-made application for measurement annotator

GATE

MetaMap Annotator

This plugin uses the MetaMap Java API to send GATE document content to MetaMap skrmedpostctl server and PrologBeans mmserver instances running on the given machine/port

GATE

Noun Phrase Chunker

Ready-made NP chunking application

GATE

Noun Phrase Chunker

Implementation of the Ramshaw and Marcus base noun phrase chunker

GATE

Numbers Tagger

Finds numbers (in both words and digits) and annotates them with their numeric value

GATE

OAT

Ontology Annotation Tool.

GATE

OWLIM Ontology

Ontology created as a temporary OWLIM3 in-memory repository

GATE

OWLIM Ontology DEPRECATED

Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only

GATE

Onto Root Gazetteer

An ontology lookup component

GATE

OntoGazetteer

A list lookup component based on mapping between ontology classes and gazetteer lists.

GATE

OrthoRef

An orthographic coreferencer

GATE

PMI Bank

Pointwise Mutual Information from corpora

GATE

PMI Example (English)

Example application for the PMI (pointwise mutual information) tool

GATE

POS Mapper

Map complex Russian morphology tags into simpler POS categories

GATE

Quality Assurance PR

The Quality Assurance PR provides the functionality of the Corpus QA Tool in GATE Developer

GATE

RAT-C

Relation Annotation Tool Class view.

GATE

RAT-I

Relation Annotation Tool Instance view.

GATE

RegEx Sentence Splitter

A sentence splitter based on regular expressions.

GATE

Roman Numerals Tagger

Finds and annotates Roman numerals

GATE

Romanian Gazetteer

A list lookup component.

GATE

Romanian IE System

Ready-made Romanian IE application

GATE

Romanian Tokeniser

A customisable Romanian tokeniser.

GATE

Romanian Transducer

A module for executing Jape grammars

GATE

RussIE

Basic version of the RussIE application

GATE

RussIE + Inflectional Gazetteer & OrthoMatcher

RussIE application with orthomatcher and inflexional gazetteer

GATE

RussIE + Inflectional Gazetter

RussIE application with inflexional gazetteer

GATE

RussIE + OrthoMatcher

RussIE application with orthomatcher

GATE

Russian Gazetteer

Customised version of the hash gazetteer

GATE

Russian POS Tagger

Part-of-speech tagger for Russian

GATE

Schema Annotations Editor

An annotation editor restricted by schemas.

GATE

Schema Enforcer

Produces an annotation set whose content is restricted by the specified set of schemas

GATE

Script Editor

Editor for the Groovy script behind this PR

GATE

Scriptable Controller

A controller whose execution strategy is controlled by a Groovy script

GATE

Search Results

Viewer for IR search results

GATE

SearchPR

Provides IR functionality.

GATE

Segment Processing PR

Processes individual segments as separate documents

GATE

Semantic Enrichment PR

The Semantic Enrichment PR allows adding new data to semantic annotations by querying external RDF (Linked Data) repositories.

GATE

Sharable Gazetteer

A list lookup component.

GATE

Show/Hide Resources

Show resources that would otherwise be hidden, e.g. resources created for internal use by other resources

GATE

Simple Schema Viewer

A Simple Annotation Schema Viewer

GATE

Switch Member PR

Sets the focus of a compound document to a specified member document.

GATE

Syntax tree viewer

Viewer for syntax trees generated by a parser.

GATE

Termbank Score Copier

Copy scores from Termbanks back to their source annotations

GATE

Text Categorization PR

Classify text based on a semantic space

GATE

The Duplicator

Duplicate any resource with a right click menu option

GATE

Tweet Normaliser

Normalises text in tweets (converts spelling mistakes, colloquialisms, typing variations and so on into standard English)

GATE

TwitIE (EN)

English TwitIE application

GATE

Twitter Tokenizer (EN)

Tokenizer tuned for Tweets

GATE

UIMA Analysis Engine

Wrapper for a Text Analysis Engine from UIMA.

GATE

Unload Unused Plugins

Unloads all plugins for which we cannot find any loaded instances

GATE

(original) ILSP (UIMA) (5)

The components listed here could not be associated with a known third-party tool collection and are assumed to be original components.

Component Description Framework

ILSP Chunker

No description

ILSP (UIMA)

ILSP FBT Tagger

ILSP FBT Tagger is an adaptation of the Brill tagger trained on Greek text.

ILSP (UIMA)

ILSP Lemmatizer

ILSP Lemmatizer assigns lemmas to tokens from Greek texts.

ILSP (UIMA)

ILSP NERC

This module uses a Maximum Entropy NER engine focusing on Greek (EL) or English (EN) news text.

ILSP (UIMA)

ILSP Paragraph, Sentence and Token Segmentor

This module is a regex- and abbreviation-based segmentor targeting texts written in Greek.

ILSP (UIMA)

(original) NaCTeM (UIMA) (18)

The components listed here could not be associated with a known third-party tool collection and are assumed to be original components.

Component Description Framework

Agreement Evaluator

Reports agreement on annotations coming from different views (sofas).

NaCTeM (UIMA)

Anatomical Entity Tagger

Tags anatomical entities using Brown, UMLS and OBO Anatomy dictionary features

NaCTeM (UIMA)

Annotation Remover

Removes span-of-text annotations.

NaCTeM (UIMA)

Cafetiere Sentence Splitter

Uses a set of heuristics and patterns to find sentence boundaries.

NaCTeM (UIMA)

Dictionary Pluggable Soft TF/IDF Matcher

Tests whether input tokens belong to an entry in the specified dictionary using SecondString Soft TF/IDF.

NaCTeM (UIMA)

Feature Generator

Generates a list of user-defined observations for each token.

NaCTeM (UIMA)

Kleio Search

Uses the Kleio service to fetch MEDLINE abstracts matching a specified query.

NaCTeM (UIMA)

Medical Condition Tagger

A tagger that recognises mentions of medical conditions.

NaCTeM (UIMA)

NeMine

No description

NaCTeM (UIMA)

OSCAR 4 Tokeniser

Segments text into tokens.

NaCTeM (UIMA)

OscarMER

Runs Oscar 3 with a maximum-entropy-based recogniser, taking syntactic tokens as input

NaCTeM (UIMA)

RO_FDGBank

This reader performs the transformation of the CONLL tab separated text format to the CAS ConllDependency format.

NaCTeM (UIMA)

Reference Evaluator

Reports annotation performance comparing views (sofas) to one selected reference view.

NaCTeM (UIMA)

Regex Annotator

Annotates spans of text based on a custom regular expression.

NaCTeM (UIMA)

SFTP BioNLP Shared Task Data Provider

Reads a corpus in BioNLP Shared Task format from a remote directory on a user-specified server via SFTP.

NaCTeM (UIMA)

Type Mapper

No description

NaCTeM (UIMA)

UMLS Full Dictionary Feature Extractor

Extracts Dictionary features from a UMLS-sourced dictionary

NaCTeM (UIMA)

Yeast Metabliner

Annotates yeast metabolites using a supervised CRF-based NER system.

NaCTeM (UIMA)

(service) AlchemyAPI (2)

Component Description Framework

AlchemyAPI: Entity Extraction

Runs the AlchemyAPI Entity Extraction service on a GATE document

GATE

AlchemyAPI: Keyword Extraction

Runs the AlchemyAPI Keyword Extraction service on a GATE document

GATE

(service) CrowdFlower (3)

Component Description Framework

Entity Annotation Job Builder

Build a CrowdFlower job asking users to annotate entities within a snippet of text

GATE

Entity Classification Job Builder

Build a CrowdFlower job asking users to select the right label for entities

GATE

Entity Classification Results Importer

Import judgments from a CrowdFlower job created by the Entity Classification Job Builder as GATE annotations.

GATE

(service) Lupedia (1)

Component Description Framework

Lupedia Service PR

Runs a lupedia annotation service on a GATE document

GATE

(service) TextRazor (1)

Component Description Framework

TextRazor Service PR

Runs the TextRazor annotation service (http://textrazor.com) on a GATE document

GATE

(service) Textalytics (6)

Component Description Framework

Textalytics Language Identification

Textalytics Language Identification

GATE

Textalytics Lemmatization, PoS and Parsing

Textalytics Lemmatization, PoS and Parsing

GATE

Textalytics Sentiment Analysis

Textalytics Sentiment Analysis

GATE

Textalytics Spell, Grammar and Style Proofreading

Textalytics Spell, Grammar and Style Proofreading

GATE

Textalytics Text Classification

Textalytics Text Classification

GATE

Textalytics Topics Extraction

Textalytics Topics Extraction

GATE

(service) UAIC (6)

Component Description Framework

UAICDiacriticsDescriptor

No description

NaCTeM (UIMA)

UAICLemmav1

Assigns base forms to tokenised text.

NaCTeM (UIMA)

UAICLemmav2

Assigns base forms in Romanian text, given POS-tagged text.

NaCTeM (UIMA)

UAICSegV1

Splits texts into fragments

NaCTeM (UIMA)

UAICTokenizerDescriptor

No description

NaCTeM (UIMA)

UaicPosTagger

Carries out sentence splitting, tokenisation, POS tagging and lemmatisation on plain text.

NaCTeM (UIMA)

ABNER (2)

Component Description Framework

ABNER

Wraps the ABNER entity identification system into the UIMA framework.

NaCTeM (UIMA)

ABNER Tagger

GATE wrapper over ABNER

GATE

Arktweet (2)

Component Description Framework

ArktweetPosTagger

Wrapper for Twitter Tokenizer and POS Tagger.

DKPro Core (UIMA)

ArktweetTokenizer

ArkTweet tokenizer.

DKPro Core (UIMA)

BANNER (5)

Component Description Framework

BANNER CRF Tagger

A UIMA wrapper for BANNER entity tagger.

NaCTeM (UIMA)

Banner Base Tokenizer

Tokens returned by this class consist primarily of contiguous alphanumeric characters or single punctuation marks; however, certain constructs such as real numbers and percentages are recognized and returned as a single token.

NaCTeM (UIMA)

Banner Simple Tokenizer

Tokens output by this tokenizer consist of a contiguous block of alphanumeric characters or a single punctuation mark.

NaCTeM (UIMA)

Banner Whitespace Tokenizer

Instances of this class tokenize Sentences only at whitespace characters.

NaCTeM (UIMA)

EngLemmatiser

English lemmatiser which is adapted from WordNet.

NaCTeM (UIMA)
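The two simpler tokenisation strategies described in the BANNER table above can be illustrated with a minimal sketch. This is not BANNER's code: the class name, method names and the regular expression are assumptions chosen for illustration, and the real components may treat edge cases differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the BANNER implementation.
final class TokenizerSketch {
    // A token is either a run of letters/digits or one other non-whitespace character.
    private static final Pattern SIMPLE_TOKEN =
            Pattern.compile("[\\p{L}\\p{N}]+|[^\\p{L}\\p{N}\\s]");

    /** "Simple" strategy: alphanumeric blocks and single punctuation marks. */
    static List<String> tokenizeSimple(String sentence) {
        List<String> tokens = new ArrayList<>();
        Matcher m = SIMPLE_TOKEN.matcher(sentence);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    /** "Whitespace" strategy: break only at whitespace characters. */
    static List<String> tokenizeWhitespace(String sentence) {
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    public static void main(String[] args) {
        String s = "IL-2 binds (weakly).";
        System.out.println(tokenizeSimple(s));
        System.out.println(tokenizeWhitespace(s));
    }
}
```

The contrast matters for biomedical text: the simple strategy splits a name such as IL-2 into three tokens, while the whitespace strategy keeps it whole but leaves brackets and full stops attached to neighbouring words.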

BioCreative (2)

Component Description Framework

BioCreative Gene Mention Tagger

Tags Gene mentions using a model trained on BioCreative GM task data, with Entrez Gene and UMLS dictionary features.

NaCTeM (UIMA)

Chemical Entity Recogniser

A named entity recogniser capable of annotating names of chemicals, drugs and metabolites.

NaCTeM (UIMA)

BioLG (1)

Component Description Framework

BioLG

Applies BioLG and lp2lp to sentences.

AlvisNLP

BulStem (1)

Component Description Framework

BulStem

This plugin is an implementation of the BulStem stemmer algorithm for Bulgarian developed by Preslav Nakov.

GATE

CCG (2)

Component Description Framework

CCGParser

Syntax parsing with CCG Parser.

AlvisNLP

CCGPosTagger

Applies the CCG POS tagger on annotations.

AlvisNLP

CRF++ (2)

Component Description Framework

CRF++ Tagger

Uses Conditional Random Fields model for labeling.

NaCTeM (UIMA)

CRF++ Trainer

Produces a Conditional Random Fields model.

NaCTeM (UIMA)

Cjf (1)

Component Description Framework

CjfNormalizer

Converts traditional Chinese to simplified Chinese or vice-versa.

DKPro Core (UIMA)

ClearNLP (5)

Component Description Framework

ClearNlpLemmatizer

Lemmatizer using Clear NLP.

DKPro Core (UIMA)

ClearNlpParser

Clear parser annotator.

DKPro Core (UIMA)

ClearNlpPosTagger

Part-of-Speech annotator using Clear NLP.

DKPro Core (UIMA)

ClearNlpSegmenter

Tokenizer using Clear NLP.

DKPro Core (UIMA)

ClearNlpSemanticRoleLabeler

ClearNLP semantic role labeller.

DKPro Core (UIMA)

EnjuParser (3)

Component Description Framework

Enju Parser

A syntactic parser for English.

NaCTeM (UIMA)

EnjuParser

Parses sentences with the ENJU dependency parser.

AlvisNLP

EnjuParser2

synopsis

AlvisNLP

FreeLing (5)

Component Description Framework

Freeling Sentence Splitter

Performs tokenisation.

NaCTeM (UIMA)

FreelingMorpho

Performs tokenisation, and determines possible lemmas and POS tags for each token, with confidence scores.

NaCTeM (UIMA)

FreelingShallowParser

Performs tokenisation, lemmatisation, POS tagging and shallow parsing (chunking).

NaCTeM (UIMA)

FreelingTagger

Performs tokenisation, lemmatisation and POS tagging.

NaCTeM (UIMA)

FreelingTokenizer

Performs tokenisation.

NaCTeM (UIMA)

GATE Hepple (5)

Component Description Framework

ANNIE POS Tagger

Mark Hepple's Brill-style POS tagger

GATE

Cebuano POS Tagger

Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword

GATE

Hepple POS Tagger

Mark Hepple's POS tagger, from dragontools/Banner toolkit.

NaCTeM (UIMA)

HepplePosTagger

GATE Hepple part-of-speech tagger.

DKPro Core (UIMA)

Hindi POS Tagger

Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword

GATE

GENIA (5)

Component Description Framework

GENIA Dependency Parser

A dependency parser for biomedical text.

NaCTeM (UIMA)

GENIA Sentence Splitter

A processing resource that takes document and corpus parameters

GATE

GENIA Sentence Splitter

Machine learning-based sentence splitter optimized for biomedical texts.

NaCTeM (UIMA)

GENIA Tagger

Tags biological named entities: proteins, cell lines, cell types, DNAs, and RNAs.

NaCTeM (UIMA)

GeniaTagger

Runs Genia Tagger on annotations.

AlvisNLP

HunPos (1)

Component Description Framework

HunPosTagger

Part-of-Speech annotator using HunPos.

DKPro Core (UIMA)

IULA (2)

Component Description Framework

IULATagger

Performs paragraph splitting, sentence splitting, tokenisation and POS tagging.

NaCTeM (UIMA)

IULATokenizer

Performs paragraph splitting, sentence splitting, and tokenisation.

NaCTeM (UIMA)

JTok (1)

Component Description Framework

JTokSegmenter

JTok segmenter.

DKPro Core (UIMA)

Java BreakIterator (2)

Component Description Framework

Banner Sentence Breaker

Sentence breaker using the Sun Java API "BreakIterator".

NaCTeM (UIMA)

BreakIteratorSegmenter

BreakIterator segmenter.

DKPro Core (UIMA)
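Both components in this group build on the JDK class java.text.BreakIterator. The following is a minimal sketch of sentence splitting with that class; the wrapper class and method names are assumptions for illustration and do not reproduce the code of either component.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the code of either listed component.
final class SentenceBreakSketch {
    /** Splits text into sentences using the JDK's locale-aware BreakIterator. */
    static List<String> splitSentences(String text, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = it.first();
        // next() returns the following boundary, or DONE when the text is exhausted.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "This is one sentence. Here is another one.";
        for (String s : splitSentences(text, Locale.ENGLISH)) {
            System.out.println(s);
        }
    }
}
```

BreakIterator relies on general-purpose boundary rules, so abbreviations and domain-specific notation may be split incorrectly; that is one reason the catalogue also lists trained sentence splitters.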

Jazzy (1)

Component Description Framework

JazzyChecker

This annotator uses Jazzy for the decision whether a word is spelled correctly or not.

DKPro Core (UIMA)

KEA (1)

Component Description Framework

KEA Keyphrase Extractor

A Keyphrase Extractor by Eibe Frank.

GATE

LBJ (1)

Component Description Framework

LBJ Named Entity Recognizer

A wrapper for the Illinois Named Entity Tagger

NaCTeM (UIMA)

Langdetect (1)

Component Description Framework

LangDetectLanguageIdentifier

Langdetect language identifier based on character n-grams.

DKPro Core (UIMA)

LanguageTool (3)

Component Description Framework

LanguageToolChecker

Detects grammatical errors in text using LanguageTool, a rule-based grammar checker.

DKPro Core (UIMA)

LanguageToolLemmatizer

Naive lexicon-based lemmatizer.

DKPro Core (UIMA)

LanguageToolSegmenter

Segmenter using LanguageTool to do the heavy lifting.

DKPro Core (UIMA)

LingPipe (6)

Component Description Framework

LingPipe Language Identifier PR

GATE PR for language identification using LingPipe

GATE

LingPipe NER PR

LingPipe Named Entity Recognizer

GATE

LingPipe POS Tagger PR

Provides a LingPipe part of speech tagger.

GATE

LingPipe Sentence Splitter

Sentence splitter based on LingPipe models.

NaCTeM (UIMA)

LingPipe Sentence Splitter PR

Provides an interface to LingPipe sentence splitter API.

GATE

LingPipe Tokenizer PR

Provides a LingPipe tokenizer.

GATE

Lucene/Solr (1)

Component Description Framework

Lucene IR Engine

No description

GATE

MLRS (3)

Component Description Framework

MLRS Maltese Tokeniser

Tokenises Maltese text

NaCTeM (UIMA)

MLRS Paragraph Splitter

Identifies the paragraphs in the text, creating a Paragraph annotation for each one

NaCTeM (UIMA)

MLRS Sentence Splitter

Identifies the sentences in the text, creating a Sentence annotation for each

NaCTeM (UIMA)

Mallet (1)

Component Description Framework

MalletTopicModelInferencer

Infers the topic distribution over documents using a Mallet ParallelTopicModel.

DKPro Core (UIMA)

MaltParser (2)

Component Description Framework

ILSP Dependency Parser

ILSP Dependency Parser is a tool trained on the Greek Dependency Treebank (Prokopidis et al., 2005), a resource which comprises data annotated at several linguistic levels.

ILSP (UIMA)

MaltParser

Dependency parsing using MaltParser.

DKPro Core (UIMA)

Mate Tools (4)

Component Description Framework

MateLemmatizer

DKPro Annotator for the MateToolsLemmatizer.

DKPro Core (UIMA)

MateMorphTagger

DKPro Annotator for the MateToolsMorphTagger.

DKPro Core (UIMA)

MatePosTagger

DKPro Annotator for the MateToolsPosTagger

DKPro Core (UIMA)

MateSemanticRoleLabeler

DKPro Annotator for the MateTools Semantic Role Labeler.

DKPro Core (UIMA)

MeCab (1)

Component Description Framework

MeCabTagger

Annotator for the MeCab Japanese POS Tagger.

DKPro Core (UIMA)

Minipar (1)

Component Description Framework

Minipar Wrapper

MiniPar is a shallow parser.

GATE

Morpha (1)

Component Description Framework

MorphaLemmatizer

Lemmatize based on a finite-state machine.

DKPro Core (UIMA)

MutationFinder (1)

Component Description Framework

MutationFinder

GATE MutationFinder Wrapper

GATE

NormaGene (1)

Component Description Framework

NormaGene Tagger

A processing resource that takes document and corpus parameters

GATE

Ogmios (1)

Component Description Framework

OgmiosTokenizer

Tokenizes the section contents according to the Ogmios tokenizer specifications.

AlvisNLP

OpenCalais (1)

Component Description Framework

OpenCalais Tagger

An OpenCalais based semantic annotator

GATE

OpenNLP (15)

Component Description Framework

OpenNLP Chunker

Chunker using an OpenNLP maxent model

GATE

OpenNLP NER

NER PR using a set of OpenNLP maxent models

GATE

OpenNLP POS Tagger

POS Tagger using an OpenNLP maxent model

GATE

OpenNLP Parser

Syntactic parser from Apache OpenNLP

GATE

OpenNLP Sentence Splitter

Sentence splitter using an OpenNLP maxent model

GATE

OpenNLP Tokenizer

Tokenizer using an OpenNLP maxent model

GATE

OpenNLPNEDetector

Detects named entities in text and creates corresponding entity annotations that span the found entities.

NaCTeM (UIMA)

OpenNLPParser

Parse the document and create phrasal and clausal annotations over the text.

NaCTeM (UIMA)

OpenNLPSentenceDetector

Detect sentence boundaries and create sentence annotations that span these boundaries.

NaCTeM (UIMA)

OpenNLPTokenizer

Tokenize the text and create token annotations that span the tokens.

NaCTeM (UIMA)

OpenNlpChunker

Chunk annotator using OpenNLP.

DKPro Core (UIMA)

OpenNlpNamedEntityRecognizer

OpenNLP name finder wrapper.

DKPro Core (UIMA)

OpenNlpParser

OpenNLP parser.

DKPro Core (UIMA)

OpenNlpPosTagger

Part-of-Speech annotator using OpenNLP.

DKPro Core (UIMA)

OpenNlpSegmenter

Tokenizer and sentence splitter using OpenNLP.

DKPro Core (UIMA)

Penn Bio-Tools (5)

Component Description Framework

Penn BioTagger

Ready-made application for the Penn BioTagger

GATE

Penn BioTagger: Genes

Penn BioTagger for Genes

GATE

Penn BioTagger: Malignancy

Penn BioTagger for malignancy types

GATE

Penn BioTagger: Variation

Penn BioTagger for variations

GATE

Penn BioTokenizer

Tokenizer for biomedical text

GATE

Porter Stemmer (1)

Component Description Framework

PorterStemmer

synopsis

AlvisNLP

RASP (5)

Component Description Framework

RASP POS Converter

Converts from PennTreebank POS tags to the C2 tagset used by RASP.

GATE

RASP2 Morphological Analyser

RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter)

GATE

RASP2 POS Tagger

RASP part-of-speech tagger, creating WordForm annotations

GATE

RASP2 Parser

RASP dependency parser

GATE

RASP2 Tokenizer

RASP2 Tokenizer.

GATE

RfTagger (1)

Component Description Framework

RfTagger

Rftagger morphological analyzer.

DKPro Core (UIMA)

SPECIES (2)

Component Description Framework

Species

Calls the Species taxon tagger.

AlvisNLP

Species Tagger

Tags species

NaCTeM (UIMA)

STePP (1)

Component Description Framework

Stepp Tagger

No description

NaCTeM (UIMA)

SVMLight (2)

Component Description Framework

SVMLight Tagger

Applies an SVMLight-trained model on instances.

NaCTeM (UIMA)

SVMLight Trainer

Produces an SVMLight model based on user-specified learning parameters.

NaCTeM (UIMA)

Sfst (1)

Component Description Framework

SfstAnnotator

Sfst morphological analyzer.

DKPro Core (UIMA)

Snowball (2)

Component Description Framework

SnowballStemmer

UIMA wrapper for the Snowball stemmer.

DKPro Core (UIMA)

Stemmer PR

Wrapper for the Snowball stemmer.

GATE

Stanford (17)

Component Description Framework

English Dependency Parser

Ready-made application for Stanford English parser

GATE

English POS Tagger and Dependency Parser

Ready-made application for Stanford English POS tagger and parser

GATE

Stanford Dependency Parser

Generates Stanford-style dependencies together with POS tokens for English.

NaCTeM (UIMA)

Stanford NER

Stanford Named Entity Recogniser

GATE

Stanford POS Tagger

Stanford Part-of-Speech Tagger

GATE

Stanford PTB Tokenizer

Stanford Penn Treebank v3 Tokenizer, for English

GATE

StanfordCoreferenceResolver

No description

DKPro Core (UIMA)

StanfordDependencyConverter

Converts a constituency structure into a dependency structure.

DKPro Core (UIMA)

StanfordLemmatizer

Stanford Lemmatizer component.

DKPro Core (UIMA)

StanfordNER

synopsis

AlvisNLP

StanfordNamedEntityRecognizer

Stanford Named Entity Recognizer component.

DKPro Core (UIMA)

StanfordParser

Stanford parser wrapper

GATE

StanfordParser

Stanford Parser component.

DKPro Core (UIMA)

StanfordPosTagger

Stanford Part-of-Speech tagger component.

DKPro Core (UIMA)

StanfordPtbTransformer

Uses the normalizing tokenizer of the Stanford CoreNLP tools to escape the text PTB-style.

DKPro Core (UIMA)

StanfordSegmenter

No description

DKPro Core (UIMA)

Twitter POS Tagger (EN)

Stanford POS tagger trained on Tweets

GATE

TermRaider (6)

Component Description Framework

AnnotationTermbank

TermRaider Termbank derived from document annotations

GATE

HyponymyTermbank

TermRaider Termbank derived from head/string hyponymy

GATE

Pairbank Viewer

viewer for the TermRaider Pairbank

GATE

TermRaider English Term Extraction

Example application showing typical set-up for the TermRaider tools

GATE

Termbank Viewer

viewer for the TermRaider Termbank

GATE

TfIdfTermbank

TermRaider Termbank derived from vectors in document features

GATE

TextCat (3)

Component Description Framework

LanguageIdentifier

Detection based on character n-grams.

DKPro Core (UIMA)

TextCat Fingerprint Generator

Generate language fingerprints for use with the TextCat Language Identification PR

GATE

TextCat Language Identification

Recognizes the document language using TextCat

GATE

TreeTagger (3)

Component Description Framework

TreeTagger

Runs tree-tagger.

AlvisNLP

TreeTaggerChunker

Chunk annotator using TreeTagger.

DKPro Core (UIMA)

TreeTaggerPosTagger

Part-of-Speech and lemmatizer annotator using TreeTagger.

DKPro Core (UIMA)

Web1T (1)

Component Description Framework

LanguageDetectorWeb1T

Language detector based on n-gram frequency counts, e.g. as provided by Web1T

DKPro Core (UIMA)

WordNet (4)

Component Description Framework

SemanticFieldAnnotator

This Analysis Engine annotates English single words with semantic field information retrieved from an ExternalResource.

DKPro Core (UIMA)

WordNet

WordNet

GATE

WordNet 1.6

Princeton WordNet 1.6.

GATE

WordNet Viewer

WordNet viewer

GATE

Yatea (2)

Component Description Framework

YateaExtractor

Extract terms from the corpus using the YaTeA term extractor.

AlvisNLP

YateaProjector

synopsis

AlvisNLP

Zemanta (1)

Component Description Framework

Zemanta Service PR

Runs a zemanta annotation service on a GATE document

GATE

I/O components by format

Uncategorized (47)

Component Description Framework

ACE Corpus Reader

Reads ...

NaCTeM (UIMA)

ADBWriter

synopsis

AlvisNLP

AlvisAEReader

Reads documents and annotations from an AlvisAE campaign.

AlvisNLP

AlvisAEReader2

Reads documents and annotations from an AlvisAE campaign.

AlvisNLP

AlvisDBIndexer

synopsis

AlvisNLP

AlvisIRIndexer

synopsis

AlvisNLP

AnimalReader

Project-specific file reader.

AlvisNLP

BIO Format Collection Reader

Reads BIO format files from specified directory.

NaCTeM (UIMA)

BIO Format Writer Cas Consumer

Writes specified types of annotations to the specified directory in the BIO format.

NaCTeM (UIMA)

BioC Reader

Reads a file in BioC format.

NaCTeM (UIMA)

BioC Writer

Writes BioC annotations to files.

NaCTeM (UIMA)

BioCreative CHEMDNER Reader

Reads data prepared specifically for the BioCreative IV's CHEMDNER track.

NaCTeM (UIMA)

BioNLP ST Data Reader

Reads files formatted for the BioNLP Shared Task series and outputs documents with named entity, relation and event annotations.

NaCTeM (UIMA)

BioNLP ST Data Writer

Writes BioNLP entity and event annotations to files.

NaCTeM (UIMA)

BlikiWikipediaReader

Bliki-based Wikipedia reader.

DKPro Core (UIMA)

CombinationReader

Combines multiple readers into a single reader.

DKPro Core (UIMA)

Configurable Exporter

Allows annotations to be exported according to a specified format.

GATE

Entity Annotation Results Importer

Import judgments from a CrowdFlower job created by the Entity Annotation Job Builder as GATE annotations.

GATE

ExpressionExtract

Write elements in a tab separated file.

AlvisNLP

FillDB

Stores the corpus into a SQL database.

AlvisNLP

Flexible Exporter

Exports a document with GATE annotations to its original format.

GATE

HtmlReader

Reads the contents of a given URL and strips the HTML.

DKPro Core (UIMA)

ILSP File System Collection Reader

Reads files from the filesystem.

ILSP (UIMA)

LIBSVMReader

Reads a dataset in LIBSVM format

NaCTeM (UIMA)

Legacy Coref Data Writer

A simple PR that converts co-reference data from the Relations-based model to the legacy format (based on 'matches' annotation and document features).

GATE

MalletTopicProportionsWriter

Writes topic proportions to a file. This depends on the TopicDistribution annotation, which should have been created by MalletTopicModelInferencer beforehand.

DKPro Core (UIMA)

MalletTopicsProportionsSortedWriter

Write the topic proportions according to an LDA topic model to an output file.

DKPro Core (UIMA)

PubTatorReader

synopsis

AlvisNLP

Shared Task 2004 Reader

Reads training or evaluation data from the BioNLP/NLPBA 2004 Bio-Entity Recognition Task

NaCTeM (UIMA)

TGrepWriter

TGrep2 corpus file writer.

DKPro Core (UIMA)

TSV Reader

No description

NaCTeM (UIMA)

TSV Writer

Saves annotations of a selected type to a file in tab-separated-value format.

NaCTeM (UIMA)

TabularExport

Writes the corpus data structure in files in tabular format.

AlvisNLP

TabularReader

synopsis

AlvisNLP

TfidfConsumer

This consumer builds a DfModel.

DKPro Core (UIMA)

TreeTaggerReader

Reads files in tree-tagger output format and creates a document for each file read.

AlvisNLP

Twitter Collection Reader

No description

NaCTeM (UIMA)

Twitter Corpus Populator

Populate a corpus from Twitter JSON containing multiple Tweets

GATE

TwitterDatabaseConsumer

No description

NaCTeM (UIMA)

WebOfKnowledgeReader

Reads Web of Knowledge search result import files.

AlvisNLP

WhatsWrongExport

Writes files in What's Wrong with my NLP format.

AlvisNLP

WikipediaArticleInfoReader

Reads all general article infos without retrieving the whole Page objects

DKPro Core (UIMA)

WikipediaDiscussionReader

Reads all discussion pages.

DKPro Core (UIMA)

WikipediaLinkReader

Read links from Wikipedia.

DKPro Core (UIMA)

WikipediaQueryReader

Reads all article pages that match a query created by the numerous parameters of this class.

DKPro Core (UIMA)

WikipediaRevisionPairReader

Reads pairs of adjacent revisions of all articles.

DKPro Core (UIMA)

WikipediaRevisionReader

Reads Wikipedia page revisions.

DKPro Core (UIMA)

AclAnthology (1)

Component Description Framework

AclAnthologyReader

Reads the ACL anthology corpus and outputs CASes with plain text documents.

DKPro Core (UIMA)

Alvis Enriched Document (1)

Component Description Framework

EnrichedDocumentWriter

Writes the corpus in the infamous Alvis Enriched Document Format suitable for indexation with Zebra-Alvis.

AlvisNLP

BNC (1)

Component Description Framework

BncReader

Reader for the British National Corpus (XML version).

DKPro Core (UIMA)

BioNLP Shared Task (2)

Component Description Framework

GeniaReader

Reads text files and their associated annotation files in BioNLP Shared Task format.

AlvisNLP

GeniaWriter

Writes each section in three files in the BioNLP challenge format.

AlvisNLP

BioNLP-ST 2013 a1/a2 (1)

Component Description Framework

BioNLPSTReader

Reads documents and annotations in the BioNLP-ST 2013 a1/a2 format.

AlvisNLP

Brat (2)

Component Description Framework

BratReader

Reader for the brat format.

DKPro Core (UIMA)

BratWriter

Writer for the brat annotation format.

DKPro Core (UIMA)

CLARIN TCF (2)

Component Description Framework

TcfReader

Reader for the WebLicht TCF format.

DKPro Core (UIMA)

TcfWriter

Writer for the WebLicht TCF format.

DKPro Core (UIMA)

CadixeJSON (1)

Component Description Framework

ExportCadixeJSON

Writes each document in a file in the AlvisAE protocol format.

AlvisNLP

CoNLL 2000 (2)

Component Description Framework

Conll2000Reader

Reads the Conll 2000 chunking format.

DKPro Core (UIMA)

Conll2000Writer

Writes the CoNLL 2000 chunking format.

DKPro Core (UIMA)

CoNLL 2002 (2)

Component Description Framework

Conll2002Reader

Reads the CoNLL 2002 named entity format.

DKPro Core (UIMA)

Conll2002Writer

Writes the CoNLL 2002 named entity format.

DKPro Core (UIMA)

CoNLL 2006 (2)

Component Description Framework

Conll2006Reader

Reads a file in the CoNLL-2006 format (aka CoNLL-X).

DKPro Core (UIMA)

Conll2006Writer

Writes a file in the CoNLL-2006 format (aka CoNLL-X).

DKPro Core (UIMA)

CoNLL 2007 (1)

Component Description Framework

CoNLL2007 Cas Consumer

Writes sentences from the CAS in the CoNLL 2007 format.

ILSP (UIMA)

CoNLL 2009 (2)

Component Description Framework

Conll2009Reader

Reads a file in the CoNLL-2009 format.

DKPro Core (UIMA)

Conll2009Writer

Writes a file in the CoNLL-2009 format.

DKPro Core (UIMA)

CoNLL 2012 (2)

Component Description Framework

Conll2012Reader

Reads a file in the CoNLL-2012 format.

DKPro Core (UIMA)

Conll2012Writer

Writer for the CoNLL-2012 format.

DKPro Core (UIMA)

Cochrane (1)

Component Description Framework

GATE .cochrane.txt document format

Load this to allow the opening of Cochrane text documents, and choose the mime type "text/x-cochrane", or use the correct file extension.

GATE

DataSift JSON (1)

Component Description Framework

GATE DataSift JSON Document Format

Format parser for DataSift JSON files

GATE

Factored Tag Lem (1)

Component Description Framework

Factored Tag Lem Consumer

Writes sentences from the CAS in the Factored Tag Lem format

ILSP (UIMA)

Fast Infoset (2)

Component Description Framework

Fast Infoset Document Format

Format parser for GATE XML stored in the binary Fast Infoset format

GATE

Fast Infoset Exporter

Export GATE documents to GATE XML stored in the binary Fast Infoset format

GATE

GATE JSON (1)

Component Description Framework

GATE JSON Exporter

Export documents and corpora in JSON format

GATE

GATE XML (2)

Component Description Framework

GATE XML Writer CAS Consumer

Writes the CAS to GATE XML format

ILSP (UIMA)

GateXMLReaderDescriptor

Reads GATE documents created with ILSP tools

ILSP (UIMA)

Genia JSON (1)

Component Description Framework

GeniaJSONReader

synopsis

AlvisNLP

GrAF (1)

Component Description Framework

ILSP GrAF Consumer

Writes sentences from the CAS to GrAF standoff format.

ILSP (UIMA)

HTML (1)

Component Description Framework

Simplified Text Exporter

Simplified text exporter (HTML output)

GATE

HTML5 Microdata (1)

Component Description Framework

HTML5 Microdata Exporter

Exports Annotations as HTML5 Microdata

GATE

I2B2 (1)

Component Description Framework

I2B2Reader

Reads files in the format of the I2B2 challenge.

AlvisNLP

ImsCwb (2)

Component Description Framework

ImsCwbReader

Reads a tab-separated format including pseudo-XML tags.

DKPro Core (UIMA)

ImsCwbWriter

This Consumer outputs the content of all CASes into the IMS workbench format.

DKPro Core (UIMA)

JDBC (1)

Component Description Framework

JdbcReader

Collection reader for a JDBC database. The obtained data will be written into CAS DocumentText as well as fields of the DocumentMetaData annotation.

DKPro Core (UIMA)

KEA Corpus (1)

Component Description Framework

KEA Corpus Importer

Imports a KEA-style corpus into GATE

GATE

LLL (1)

Component Description Framework

LLLReader

Read files and annotations in LLL format.

AlvisNLP

MediaWiki markup (1)

Component Description Framework

MediaWiki Document Format

Document format for parsing MediaWiki markup

GATE

NEGRA Export (1)

Component Description Framework

NegraExportReader

This CollectionReader reads a file which is formatted in the NEGRA export format.

DKPro Core (UIMA)

OBO (1)

Component Description Framework

OBOReader

Reads terms in OBO files as documents.

AlvisNLP

PDF (1)

Component Description Framework

PdfReader

Collection reader for PDF files.

DKPro Core (UIMA)

Penn Treebank Chunked (1)

Component Description Framework

PennTreebankChunkedReader

Penn Treebank chunked format reader.

DKPro Core (UIMA)

Penn Treebank Combined (2)

Component Description Framework

PennTreebankCombinedReader

Penn Treebank combined format reader.

DKPro Core (UIMA)

PennTreebankCombinedWriter

Penn Treebank combined format writer.

DKPro Core (UIMA)

Prague Markup Language (1)

Component Description Framework

ILSP PML Cas Consumer

Writes sentences from the CAS in the Prague Markup Language format for editing dependency structures in TrEd

ILSP (UIMA)

PubMed (2)

Component Description Framework

GATE .pubMed.txt document format

Load this to allow the opening of PubMed text documents, and choose the mime type "text/x-pubmed", or use the correct file extension.

GATE

PubMed Abstract Reader

Fetches PubMed abstracts from NaCTeM's Kleio service.

NaCTeM (UIMA)

RDF (3)

Component Description Framework

RDF Reader

Reads Common Annotation Structures (CASes) from RDF-encoded files.

NaCTeM (UIMA)

RDF Writer

Saves Common Annotation Structures into RDF files.

NaCTeM (UIMA)

RDFExport

synopsis

AlvisNLP

RTF (1)

Component Description Framework

RTFReader

Read RTF (Rich Text Format) files.

DKPro Core (UIMA)

Relp (1)

Component Description Framework

RelpWriter

Writes the corpus in relp format.

AlvisNLP

Reuters-21578 (2)

Component Description Framework

Reuters21578SgmlReader

Read a Reuters-21578 corpus in SGML format.

DKPro Core (UIMA)

Reuters21578TxtReader

Read a Reuters-21578 corpus that has been transformed into text format using ExtractReuters in the lucene-benchmarks project.

DKPro Core (UIMA)

Solr (1)

Component Description Framework

SolrWriter

A simple implementation of SolrWriter_ImplBase

DKPro Core (UIMA)

TEI-XML (4)

Component Description Framework

Aimed Collection Reader

Reads the Aimed corpus (225 abstracts from MEDLINE) with the gold standard sentence, protein, and protein-protein interaction annotations.

NaCTeM (UIMA)

TeiReader

Reader for the TEI XML.

DKPro Core (UIMA)

TeiWriter

UIMA CAS consumer writing the CAS document text in TEI format.

DKPro Core (UIMA)

WikipediaTemplateFilteredArticleReader

Reads all pages that contain or do not contain the templates specified in the template whitelist and template blacklist.

DKPro Core (UIMA)

TIGER-XML (2)

Component Description Framework

TigerXmlReader

UIMA collection reader for TIGER-XML files.

DKPro Core (UIMA)

TigerXmlWriter

UIMA CAS consumer writing the CAS document text in the TIGER-XML format.

DKPro Core (UIMA)

Text (14)

Component Description Framework

AssertAnnotations$InternalStringReader

Descriptor automatically generated by uimaFIT

DKPro Core (UIMA)

EuropePMC Open Access Reader

Reads open-access full-text articles from the Europe PMC web service

NaCTeM (UIMA)

FSOVFileReader

Project-specific text file reader.

AlvisNLP

Input Text Reader

Reads text supplied in a parameter.

NaCTeM (UIMA)

Merge GENIA-coref with -term Collection Reader

Read GENIA-coref files and GENIA-event/-term files and merge each couple into one CAS.

NaCTeM (UIMA)

SFTP Document Reader

Reads plain-text documents from a remote directory on a user-specified server via SFTP.

NaCTeM (UIMA)

Simplified Text Exporter

Simplified text exporter (plain text output)

GATE

StringReader

Simple reader that generates a CAS from a String.

DKPro Core (UIMA)

TextFileReader

Reads files and adds a document in the corpus for each file.

AlvisNLP

TextReader

UIMA collection reader for plain text files.

DKPro Core (UIMA)

TextWriter

UIMA CAS consumer writing the CAS document text as plain text file.

DKPro Core (UIMA)

TokenizedTextWriter

This class writes a set of pre-processed documents into a large text file containing one sentence per line, with tokens separated by whitespace.

DKPro Core (UIMA)

WikipediaArticleReader

Reads all article pages.

DKPro Core (UIMA)

WikipediaPageReader

Reads all Wikipedia pages in the database (articles, discussions, etc).

DKPro Core (UIMA)

TüPP-D/Z (1)

Component Description Framework

TueppReader

UIMA collection reader for Tübingen Partially Parsed Corpus of Written German (TüPP-D/Z) XML files.

DKPro Core (UIMA)

Twitter JSON (1)

Component Description Framework

GATE JSON Tweet Document Format

Format parser for Twitter JSON files

GATE

UIMA Binary CAS (4)

Component Description Framework

BinaryCasReader

UIMA Binary CAS formats reader.

DKPro Core (UIMA)

BinaryCasWriter

Write CAS in one of the UIMA binary formats.

DKPro Core (UIMA)

SerializedCasReader

No description

DKPro Core (UIMA)

SerializedCasWriter

No description

DKPro Core (UIMA)

UIMA CAS Dump (1)

Component Description Framework

CasDumpWriter

Dumps CAS content to a text file.

DKPro Core (UIMA)

UIMA JSON (1)

Component Description Framework

JsonWriter

UIMA JSON format writer.

DKPro Core (UIMA)

Web1T (1)

Component Description Framework

Web1TWriter

Web1T n-gram index format writer.

DKPro Core (UIMA)

XCES (2)

Component Description Framework

ILSP XCES Consumer

Writes sentences from the CAS to the XCES format

ILSP (UIMA)

XcesReaderDescriptor

Reads XCES XML files.

ILSP (UIMA)

XMI (7)

Component Description Framework

ILSP Xmi Writer CAS Consumer

Serializes the CAS to XMI.

ILSP (UIMA)

SFTP XMI Reader

Reads an XMI-formatted corpus from an SFTP-enabled server.

NaCTeM (UIMA)

SFTP XMI Writer

Saves Common Annotation Structures to an SFTP server

NaCTeM (UIMA)

XMI Reader

Reads common annotation structures (CAS) from files in XMI format.

NaCTeM (UIMA)

XMI Writer

Serialises entire common annotation structures (CAS) to XMI format.

NaCTeM (UIMA)

XmiReader

Reader for UIMA XMI files.

DKPro Core (UIMA)

XmiWriter

UIMA XMI format writer.

DKPro Core (UIMA)

XML (12)

Component Description Framework

ExportAlignmentPR

A PR to export alignment information in an xml file.

GATE

InlineXmlWriter

Writes an approximation of the content of a textual CAS as an inline XML file.

DKPro Core (UIMA)

MediaWiki Corpus Populater

Populate a corpus from a MediaWiki XML dump

GATE

MediaWiki XML Document Format

Deprecated MediaWiki importer

GATE

XMLReader

Reads a corpus in XML files.

AlvisNLP

XMLReader2

Reads XML files and creates elements.

AlvisNLP

XMLWriter

Writes an XML serialization of the corpus into a file.

AlvisNLP

XMLWriter2

Writes the corpus data structure into a file via an XSLT stylesheet.

AlvisNLP

XMLWriter2ForINIST

synopsis

AlvisNLP

XmlReader

Reader for XML files.

DKPro Core (UIMA)

XmlTextReader

No description

DKPro Core (UIMA)

XmlXPathReader

A component reader for XML files implemented with XPath.

DKPro Core (UIMA)

Component details

Uncategorized (132)

ANNIE NE Transducer

Category: Uncategorized
Framework: GATE
Version: unknown

ANNIE named entity grammar.

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationAccessors

 — 

java.util.List

 — 

 — 

 — 

 — 

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

enableDebugging

 — 

java.lang.Boolean

 — 

false

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

grammarURL

 — 

java.net.URL

 — 

resources/NE/main.jape

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

operators

 — 

java.util.List

 — 

 — 

 — 

 — 

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

ANNIE OrthoMatcher

Category: Uncategorized
Framework: GATE
Version: unknown

ANNIE orthographical coreference component.

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationSetName

 — 

java.lang.String

 — 

 — 

 — 

true

annotationTypes

 — 

java.util.List

 — 

Organization;Person;Location;Date

 — 

true

caseSensitive

 — 

java.lang.Boolean

 — 

false

 — 

 — 

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

definitionFileURL

 — 

java.net.URL

 — 

resources/othomatcher/listsNM.def

 — 

 — 

document

 — 

gate.Document

 — 

 — 

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

extLists

 — 

java.lang.Boolean

 — 

true

 — 

 — 

highPrecisionOrgs

 — 

java.lang.Boolean

 — 

false

 — 

 — 

minimumNicknameLikelihood

 — 

java.lang.Double

 — 

0.50

 — 

 — 

organizationType

 — 

java.lang.String

 — 

Organization

 — 

 — 

personType

 — 

java.lang.String

 — 

Person

 — 

 — 

processUnknown

 — 

java.lang.Boolean

 — 

true

 — 

 — 

ANNIE+Measurements

Category: Uncategorized
Framework: GATE
Version: unknown

Ready-made application for ANNIE plus the measurement tagger

Parameter Description Type Mandatory Default Value Multi-value Runtime

menu

 — 

java.util.List

 — 

 — 

 — 

 — 

pipelineURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

Ab3P

Category: Uncategorized
Framework: AlvisNLP
Version:

synopsis

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

installDir

 — 

org.bibliome.util.files.InputDirectory

True

 — 

 — 

 — 

longFormFeature

 — 

java.lang.String

True

 — 

 — 

 — 

longFormRole

 — 

java.lang.String

True

 — 

 — 

 — 

longFormsLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

relationName

 — 

java.lang.String

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

shortFormRole

 — 

java.lang.String

True

 — 

 — 

 — 

shortFormsLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

Action

Category: Uncategorized
Framework: AlvisNLP
Version: 2012-04-30

Applies action expressions on selected elements.

Parameter Description Type Mandatory Default Value Multi-value Runtime

action

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

addToLayer

 — 

java.lang.Boolean

False

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantDocumentFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantSectionFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

createAnnotations

 — 

java.lang.Boolean

False

 — 

 — 

 — 

createDocuments

 — 

java.lang.Boolean

False

 — 

 — 

 — 

createRelations

 — 

java.lang.Boolean

False

 — 

 — 

 — 

createSections

 — 

java.lang.Boolean

False

 — 

 — 

 — 

createTuples

 — 

java.lang.Boolean

False

 — 

 — 

 — 

deleteElements

 — 

java.lang.Boolean

False

 — 

 — 

 — 

removeFromLayer

 — 

java.lang.Boolean

False

 — 

 — 

 — 

setArguments

 — 

java.lang.Boolean

False

 — 

 — 

 — 

setFeatures

 — 

java.lang.Boolean

False

 — 

 — 

 — 

target

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

AggregateValues

Category: Uncategorized
Framework: AlvisNLP
Version:

synopsis

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

aggregators

 — 

org.bibliome.alvisnlp.modules.aggregate.Aggregator[]

True

 — 

 — 

 — 

entries

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

key

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

outFile

 — 

org.bibliome.util.streams.TargetStream

True

 — 

 — 

 — 

separator

 — 

java.lang.Character

True

 — 

 — 

 — 

Agreement Evaluator

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

Reports agreement on annotations coming from different views (sofas).

Parameter Description Type Mandatory Default Value Multi-value Runtime

OutputFile

 — 

String

True

 — 

false

 — 

AlchemyAPI: Entity Extraction

Category: Uncategorized
Framework: GATE
Version: unknown

Runs the AlchemyAPI Entity Extraction service on a GATE document

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationType

 — 

java.lang.String

 — 

Mention

 — 

true

apiKey

 — 

java.lang.String

 — 

 — 

 — 

 — 

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

numberOfSentencesInBatch

 — 

java.lang.Integer

 — 

 — 

 — 

true

numberOfSentencesInContext

 — 

java.lang.Integer

 — 

 — 

 — 

true

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

AlchemyAPI: Keyword Extraction

Category: Uncategorized
Framework: GATE
Version: unknown

Runs the AlchemyAPI Keyword Extraction service on a GATE document

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationType

 — 

java.lang.String

 — 

Keyword

 — 

true

apiKey

 — 

java.lang.String

 — 

 — 

 — 

 — 

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

numberOfSentencesInBatch

 — 

java.lang.Integer

 — 

 — 

 — 

true

numberOfSentencesInContext

 — 

java.lang.Integer

 — 

 — 

 — 

true

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

AlvisREPrepareCrossValidation

Category: Uncategorized
Framework: AlvisNLP
Version:

synopsis

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

cParameter

 — 

java.lang.Double

True

 — 

 — 

 — 

dependencies

 — 

org.bibliome.alvisnlp.modules.alvisre.AlvisRERelations

True

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

folds

 — 

java.lang.Integer

True

 — 

 — 

 — 

outDir

 — 

org.bibliome.util.files.OutputDirectory

True

 — 

 — 

 — 

relations

 — 

org.bibliome.alvisnlp.modules.alvisre.AlvisRERelations[]

True

 — 

 — 

 — 

schema

 — 

org.w3c.dom.DocumentFragment

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sectionSeparator

 — 

java.lang.String

True

 — 

 — 

 — 

sentences

 — 

org.bibliome.alvisnlp.modules.alvisre.AlvisRETokens

True

 — 

 — 

 — 

similarityFunction

 — 

org.w3c.dom.DocumentFragment

True

 — 

 — 

 — 

terms

 — 

org.bibliome.alvisnlp.modules.alvisre.AlvisRETokens[]

True

 — 

 — 

 — 

words

 — 

org.bibliome.alvisnlp.modules.alvisre.AlvisRETokens

True

 — 

 — 

 — 

AnchorTuples

Category: Uncategorized
Framework: AlvisNLP
Version: 2012-04-30

Creates tuples with a common argument.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

anchor

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

anchorRole

 — 

java.lang.String

True

 — 

 — 

 — 

arguments

 — 

alvisnlp.module.types.ExpressionMapping

True

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

relationName

 — 

java.lang.String

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

Annotation Remover

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

Removes span-of-text annotations.

Parameter Description Type Mandatory Default Value Multi-value Runtime

mode

Set to 'remove' if you wish to remove annotations of the types given in 'types'. Set to 'retain' if you wish to retain only the annotations of the types given in 'types'.

String

True

 — 

false

 — 

types

List of annotation types.

String

True

 — 

true

 — 
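The two values of 'mode' act as complementary filters over annotation types. The following is a minimal Python sketch of that behaviour, not the component's implementation. It assumes annotations are modelled as (type, begin, end) tuples; the function name `filter_annotations` and this data model are hypothetical.

```python
from typing import Iterable, List, Tuple

# Hypothetical model of a span-of-text annotation: (type name, begin, end).
Annotation = Tuple[str, int, int]


def filter_annotations(annotations: Iterable[Annotation],
                       mode: str,
                       types: Iterable[str]) -> List[Annotation]:
    """Return the annotations that survive the given mode."""
    selected = set(types)
    if mode == "remove":
        # Drop annotations whose type is listed in 'types'.
        return [a for a in annotations if a[0] not in selected]
    if mode == "retain":
        # Keep only annotations whose type is listed in 'types'.
        return [a for a in annotations if a[0] in selected]
    raise ValueError(f"mode must be 'remove' or 'retain', got {mode!r}")


annotations = [("Token", 0, 5), ("Sentence", 0, 11), ("Token", 6, 11)]
removed = filter_annotations(annotations, "remove", ["Token"])
retained = filter_annotations(annotations, "retain", ["Token"])
```

With 'types' set to Token, 'remove' leaves only the Sentence annotation, while 'retain' leaves only the two Token annotations.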

AnnotationTermbank

Category: Uncategorized
Framework: GATE
Version: unknown

TermRaider Termbank derived from document annotations

Parameter Description Type Mandatory Default Value Multi-value Runtime

corpora

 — 

java.util.Set

 — 

 — 

 — 

 — 

debugMode

 — 

java.lang.Boolean

 — 

false

 — 

 — 

idDocumentFeature

 — 

java.lang.String

 — 

 — 

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

 — 

inputAnnotationFeature

 — 

java.lang.String

 — 

canonical

 — 

 — 

inputAnnotationTypes

 — 

java.util.Set

 — 

SingleWord;MultiWord

 — 

 — 

inputScoreFeature

 — 

java.lang.String

 — 

localAugTfIdf

 — 

 — 

languageFeature

 — 

java.lang.String

 — 

lang

 — 

 — 

mergingMode

 — 

gate.termraider.modes.MergingMode

 — 

MAXIMUM

 — 

 — 

normalization

 — 

gate.termraider.modes.Normalization

 — 

Sigmoid

 — 

 — 

scoreProperty

 — 

java.lang.String

 — 

tfIdfAug

 — 

 — 

AntecedentChoice

Category: Uncategorized
Framework: AlvisNLP
Version: 2012-04-30

Biotopes-specific module: chooses an antecedent.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

Arabic Gazetteer Collector

Category: Uncategorized
Framework: GATE
Version: unknown

Parameter Description Type Mandatory Default Value Multi-value Runtime

menu

 — 

java.util.List

 — 

Arabic

 — 

 — 

pipelineURL

 — 

java.net.URL

 — 

resources/arabic_lists_collector.gapp

 — 

 — 

Arabic Main Grammar

Category: Uncategorized
Framework: GATE
Version: unknown

A module for executing Jape grammars.

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationAccessors

 — 

java.util.List

 — 

 — 

 — 

 — 

binaryGrammarURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

enableDebugging

 — 

java.lang.Boolean

 — 

false

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

grammarURL

 — 

java.net.URL

 — 

resources/grammar/main.jape

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

ontology

 — 

gate.creole.ontology.Ontology

 — 

 — 

 — 

true

operators

 — 

java.util.List

 — 

 — 

 — 

 — 

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

Arabic OrthoMatcher

Category: Uncategorized
Framework: GATE
Version: unknown

ANNIE orthographical coreference component.

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationSetName

 — 

java.lang.String

 — 

 — 

 — 

true

annotationTypes

 — 

java.util.List

 — 

Organization;Person;Location;Date

 — 

true

caseSensitive

 — 

java.lang.Boolean

 — 

false

 — 

 — 

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

definitionFileURL

 — 

java.net.URL

 — 

resources/orthomatcher/listsNM.def

 — 

 — 

document

 — 

gate.Document

 — 

 — 

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

extLists

 — 

java.lang.Boolean

 — 

true

 — 

 — 

highPrecisionOrgs

 — 

java.lang.Boolean

 — 

false

 — 

 — 

minimumNicknameLikelihood

 — 

java.lang.Double

 — 

0.50

 — 

 — 

organizationType

 — 

java.lang.String

 — 

Organization

 — 

 — 

personType

 — 

java.lang.String

 — 

Person

 — 

 — 

processUnknown

 — 

java.lang.Boolean

 — 

true

 — 

 — 

Assert

Category: Uncategorized
Framework: AlvisNLP
Version:

Tests an assertion on specified elements.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

assertion

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

severe

 — 

java.lang.Boolean

True

 — 

 — 

 — 

stopAt

 — 

java.lang.Integer

False

 — 

 — 

 — 

target

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

AssertAnnotations$InternalJCasHolder

Category: Uncategorized
Framework: DKPro Core (UIMA)
Version: 1.8.0

Descriptor automatically generated by uimaFIT

AttestedTermsProjector

Category: Uncategorized
Framework: AlvisNLP
Version: 2010-10-28

Projects a list of terms given in tree-tagger format.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

errorDuplicateValues

 — 

java.lang.Boolean

False

 — 

 — 

 — 

ignoreCase

 — 

java.lang.Boolean

False

 — 

 — 

 — 

ignoreDiacritics

 — 

java.lang.Boolean

False

 — 

 — 

 — 

ignoreWhitespace

 — 

java.lang.Boolean

False

 — 

 — 

 — 

lemmaFeatureName

 — 

java.lang.String

True

 — 

 — 

 — 

lemmaKeys

 — 

java.lang.Boolean

True

 — 

 — 

 — 

multipleValueAction

 — 

org.bibliome.alvisnlp.modules.projectors.MultipleValueAction

True

 — 

 — 

 — 

normalizeSpace

 — 

java.lang.Boolean

False

 — 

 — 

 — 

posFeatureName

 — 

java.lang.String

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

subject

 — 

org.bibliome.alvisnlp.modules.projectors.Subject

True

 — 

 — 

 — 

targetLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

termFeatureName

 — 

java.lang.String

False

 — 

 — 

 — 

termsFile

 — 

org.bibliome.util.streams.SourceStream

True

 — 

 — 

 — 

BDM Computation PR

Category: Uncategorized
Framework: GATE
Version: unknown

Compute BDM score for each pair of concepts in the given ontology.

Parameter Description Type Mandatory Default Value Multi-value Runtime

ontology

 — 

gate.creole.ontology.Ontology

 — 

 — 

 — 

true

outputBDMFile

 — 

java.net.URL

 — 

 — 

 — 

true

Banner Sentence Breaker

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

Sentence breaker using the Sun Java API "BreakIterator".

BioLG

Category: Uncategorized
Framework: AlvisNLP
Version: 2012-04-30

Applies BioLG and lp2lp to sentences.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

dependencyLabelFeature

 — 

java.lang.String

True

 — 

 — 

 — 

dependencyRelation

 — 

java.lang.String

True

 — 

 — 

 — 

dependentRole

 — 

java.lang.String

True

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

headRole

 — 

java.lang.String

True

 — 

 — 

 — 

linkageNumberFeature

 — 

java.lang.String

True

 — 

 — 

 — 

lp2lpConf

 — 

org.bibliome.util.files.InputFile

True

 — 

 — 

 — 

lp2lpExecutable

 — 

org.bibliome.util.files.ExecutableFile

True

 — 

 — 

 — 

maxLinkages

 — 

java.lang.Integer

False

 — 

 — 

 — 

parserPath

 — 

org.bibliome.util.files.WorkingDirectory

True

 — 

 — 

 — 

posFeature

 — 

java.lang.String

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sentenceFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sentenceLayer

 — 

java.lang.String

True

 — 

 — 

 — 

sentenceRole

 — 

java.lang.String

True

 — 

 — 

 — 

timeout

 — 

java.lang.Integer

True

 — 

 — 

 — 

union

 — 

java.lang.Boolean

True

 — 

 — 

 — 

wordLayer

 — 

java.lang.String

True

 — 

 — 

 — 

wordNumberLimit

 — 

java.lang.Integer

True

 — 

 — 

 — 

CSV Corpus Populater

Category: Uncategorized
Framework: GATE
Version: unknown

Populate a corpus from CSV files

CartesianProductTuples

Category: Uncategorized
Framework: AlvisNLP
Version: 2012-04-30

Creates tuples for each element of a Cartesian product.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

anchor

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

arguments

 — 

alvisnlp.module.types.ExpressionMapping

True

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

relationName

 — 

java.lang.String

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 
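As an illustration of what "a tuple for each element of a Cartesian product" means, the sketch below builds one tuple per combination of candidate arguments. It is a simplified model under stated assumptions: the role names and string elements are hypothetical, and the real module evaluates AlvisNLP expressions rather than taking Python lists.

```python
from itertools import product
from typing import Dict, List


def cartesian_tuples(arguments: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Build one role-to-element mapping per combination of candidates."""
    roles = list(arguments)
    candidate_lists = (arguments[role] for role in roles)
    return [dict(zip(roles, combo)) for combo in product(*candidate_lists)]


# Hypothetical roles and elements, for illustration only.
tuples = cartesian_tuples({
    "agent": ["geneA", "geneB"],
    "target": ["proteinX", "proteinY"],
})
```

Two candidates per role yield four tuples, from geneA/proteinX through geneB/proteinY.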

Cebuano Transducer

Category: Uncategorized
Framework: GATE
Version: unknown

A module for executing Jape grammars.

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationAccessors

 — 

java.util.List

 — 

 — 

 — 

 — 

binaryGrammarURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

enableDebugging

 — 

java.lang.Boolean

 — 

false

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

grammarURL

 — 

java.net.URL

 — 

resources/grammar/main.jape

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

ontology

 — 

gate.creole.ontology.Ontology

 — 

 — 

 — 

true

operators

 — 

java.util.List

 — 

 — 

 — 

 — 

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

Cebuano Transducer Postprocessor

Category: Uncategorized
Framework: GATE
Version: unknown

A module for executing Jape grammars.

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationAccessors

 — 

java.util.List

 — 

 — 

 — 

 — 

binaryGrammarURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

enableDebugging

 — 

java.lang.Boolean

 — 

false

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

grammarURL

 — 

java.net.URL

 — 

resources/tokeniser/join.jape

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

ontology

 — 

gate.creole.ontology.Ontology

 — 

 — 

 — 

true

operators

 — 

java.util.List

 — 

 — 

 — 

 — 

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

Chemical Entity Recogniser

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 0.1

A named entity recogniser capable of annotating names of chemicals, drugs and metabolites. Built on top of the NERsuite package [1].

Available models:
Chemical: trained on the BioCreative IV CHEMDNER Track training and development corpora [2]
Drug: trained on the DDI training corpus [3]
Metabolite: trained on NaCTeM's Metabolite corpus [4]

Dictionaries used:
Chemical: ChEBI [5], DrugBank [6], CTD Chemicals [7], PubChem Compound [8], Jochem [9]
Drug: DrugBank [6]
Metabolite: ChEBI [5], Human Metabolome Database [10]

Links:
[1] http://nersuite.nlplab.org
[2] http://www.biocreative.org/resources/corpora/bc-iv-chemdner-corpus
[3] http://labda.inf.uc3m.es/doku.php?id=en:labda_ddicorpus
[4] http://www.nactem.ac.uk/metabolite-corpus
[5] http://www.ebi.ac.uk/chebi
[6] http://www.drugbank.ca
[7] http://ctdbase.org
[8] http://pubchem.ncbi.nlm.nih.gov
[9] http://www.biosemantics.org/new/index.php?page=Jochem
[10] http://www.hmdb.ca

Parameter Description Type Mandatory Default Value Multi-value Runtime

model

The model to use

String

True

 — 

false

 — 

performAbbreviationRecognition

Additionally perform abbreviation recognition

Boolean

False

 — 

false

 — 

performTokenRelabelling

Additionally perform relabelling based on token chemical composition

Boolean

False

 — 

false

 — 

ColognePhoneticTranscriptor

Category: Uncategorized
Framework: DKPro Core (UIMA)
Version: 1.8.0

Cologne phonetic (Kölner Phonetik) transcription based on Apache Commons Codec. Works for German.

Compound Document

Category: Uncategorized
Framework: GATE
Version: unknown

GATE Compound Document.

Parameter Description Type Mandatory Default Value Multi-value Runtime

collectRepositioningInfo

 — 

java.lang.Boolean

 — 

false

 — 

 — 

documentIDs

 — 

java.util.ArrayList

 — 

 — 

 — 

 — 

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

markupAware

 — 

java.lang.Boolean

 — 

true

 — 

 — 

preserveOriginalContent

 — 

java.lang.Boolean

 — 

false

 — 

 — 

sourceUrl

 — 

java.net.URL

 — 

 — 

 — 

 — 

Compound Document From Xml

Category: Uncategorized
Framework: GATE
Version: unknown

GATE Compound Document.

Parameter Description Type Mandatory Default Value Multi-value Runtime

compoundDocumentUrl

 — 

java.net.URL

 — 

 — 

 — 

 — 

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

ConnectSesameOntology

Category: Uncategorized
Framework: GATE
Version: unknown

Connect to a repository containing an ontology

Parameter Description Type Mandatory Default Value Multi-value Runtime

repositoryID

 — 

java.lang.String

 — 

 — 

 — 

 — 

repositoryLocation

 — 

java.net.URL

 — 

 — 

 — 

 — 

Control Script

Category: Uncategorized
Framework: GATE
Version: unknown

Editor for the Groovy script controlling a scriptable controller

Copy Anns to Another Doc PR

Category: Uncategorized
Framework: GATE
Version: unknown

Copy the annotations from one document to another document.

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationTypes

 — 

java.util.List

 — 

 — 

 — 

true

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

sourceFilesURL

 — 

java.net.URL

 — 

 — 

 — 

true

Corpus Indexing Support

Category: Uncategorized
Framework: GATE
Version: unknown

Crawler PR

Category: Uncategorized
Framework: GATE
Version: unknown

GATE implementation of the Websphinx crawling API

Parameter Description Type Mandatory Default Value Multi-value Runtime

convertXmlTypes

 — 

java.lang.Boolean

 — 

true

 — 

true

depth

 — 

java.lang.Integer

 — 

3

 — 

true

dfs

 — 

java.lang.Boolean

 — 

true

 — 

true

domain

 — 

crawl.DomainMode

 — 

SUBTREE

 — 

true

keywords

 — 

java.util.List

 — 

 — 

 — 

true

keywordsCaseSensitive

 — 

java.lang.Boolean

 — 

true

 — 

true

max

 — 

java.lang.Integer

 — 

-1

 — 

true

maxPageSize

 — 

java.lang.Integer

 — 

100

 — 

true

outputCorpus

 — 

gate.Corpus

 — 

 — 

 — 

true

root

 — 

java.lang.String

 — 

 — 

 — 

true

source

 — 

gate.Corpus

 — 

 — 

 — 

true

stopAfter

 — 

java.lang.Integer

 — 

-1

 — 

true

userAgent

 — 

java.lang.String

 — 

 — 

 — 

true

CreateSesameOntology

Category: Uncategorized
Framework: GATE
Version: unknown

Create an ontology from a Sesame configuration file for a repository

Parameter Description Type Mandatory Default Value Multi-value Runtime

configFile

 — 

java.net.URL

 — 

 — 

 — 

 — 

repositoryID

 — 

java.lang.String

 — 

 — 

 — 

 — 

repositoryLocation

 — 

java.net.URL

 — 

 — 

 — 

 — 

Dictionary Pluggable Soft TF/IDF Matcher

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

Tests whether input tokens belong to an entry in the specified dictionary, using SecondString Soft TF/IDF. The dictionary file name should have the suffix .list, and its format should be: key1 TAB alias11 TAB alias12 ... NEWLINE key2 TAB alias21 TAB alias22 ...

Parameter Description Type Mandatory Default Value Multi-value Runtime

DictionaryFile

File which contains the dictionary (Format: key1 TAB alias11 TAB alias12 ... NEWLINE key2 TAB alias21 TAB alias22 ...)

String

True

 — 

false

 — 

MaxTokenCombination

 — 

Integer

False

 — 

false

 — 

MinMatchingScore

 — 

Float

False

 — 

false

 — 
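The dictionary format (a key followed by its aliases, TAB-separated, one entry per line) can be read with a few lines of code. The sketch below only parses that layout; it does not implement the Soft TF/IDF matching itself. The function name `parse_dictionary` and the sample entries are hypothetical.

```python
from typing import Dict, List


def parse_dictionary(text: str) -> Dict[str, List[str]]:
    """Parse 'key TAB alias TAB alias ...' lines into key -> aliases."""
    entries: Dict[str, List[str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, *aliases = line.split("\t")
        entries[key] = aliases
    return entries


# Hypothetical content of a .list file.
sample = "aspirin\tacetylsalicylic acid\tASA\nparacetamol\tacetaminophen\n"
dictionary = parse_dictionary(sample)
```

Each key maps to the list of aliases found on its line, so an entry with no aliases maps to an empty list.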

DisambiguateAlternatives

Category: Uncategorized
Framework: AlvisNLP
Version: 2012-04-30

Disambiguate features that have multiple values.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

ambiguousFeature

 — 

java.lang.String

True

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

target

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

warnIfAmbiguous

 — 

java.lang.Boolean

False

 — 

 — 

 — 

DocumentFrequencyBank

Category: Uncategorized
Framework: GATE
Version: unknown

Document frequency counter derived from corpora and other DFBs

Parameter Description Type Mandatory Default Value Multi-value Runtime

corpora

 — 

java.util.Set

 — 

 — 

 — 

 — 

debugMode

 — 

java.lang.Boolean

 — 

false

 — 

 — 

idDocumentFeature

 — 

java.lang.String

 — 

 — 

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

 — 

inputAnnotationFeature

 — 

java.lang.String

 — 

canonical

 — 

 — 

inputAnnotationTypes

 — 

java.util.Set

 — 

SingleWord;MultiWord

 — 

 — 

inputBanks

 — 

java.util.Set

 — 

 — 

 — 

 — 

languageFeature

 — 

java.lang.String

 — 

lang

 — 

 — 

scoreProperty

 — 

java.lang.String

 — 

documentFrequency

 — 

 — 

segmentAnnotationType

 — 

java.lang.String

 — 

 — 

 — 

 — 
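Document frequency is the number of documents in which a term occurs at least once, regardless of how often it occurs within each document. The sketch below shows that counting rule in isolation, assuming each document has already been reduced to a list of term strings. It is not the GATE TermRaider implementation, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def document_frequencies(documents: Iterable[List[str]]) -> Counter:
    """Count, for each term, the number of documents that contain it."""
    df: Counter = Counter()
    for terms in documents:
        df.update(set(terms))  # a term counts once per document
    return df


docs = [["cell", "protein", "cell"], ["protein", "gene"], ["protein"]]
df = document_frequencies(docs)
```

Here "cell" appears twice in the first document but still has a document frequency of 1, while "protein" occurs in all three documents.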

DoubleMetaphonePhoneticTranscriptor

Category: Uncategorized
Framework: DKPro Core (UIMA)
Version: 1.8.0

Double-Metaphone phonetic transcription based on Apache Commons Codec. Works for English.

ElementMapper

Category: Uncategorized
Framework: AlvisNLP
Version:

Maps elements according to a collection of mapping elements.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

entries

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

form

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

ignoreCase

 — 

java.lang.Boolean

False

 — 

 — 

 — 

key

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

operator

 — 

org.bibliome.alvisnlp.modules.mapper.MappingOperator

True

 — 

 — 

 — 

target

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

targetFeatures

 — 

java.lang.String[]

True

 — 

 — 

 — 

values

 — 

alvisnlp.corpus.expressions.Expression[]

True

 — 

 — 

 — 

ElementProjector

Category: Uncategorized
Framework: AlvisNLP
Version:

Searches for entries in a dictionary generated by an expression.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

errorDuplicateValues

 — 

java.lang.Boolean

False

 — 

 — 

 — 

features

 — 

alvisnlp.module.types.ExpressionMapping

True

 — 

 — 

 — 

ignoreCase

 — 

java.lang.Boolean

False

 — 

 — 

 — 

ignoreDiacritics

 — 

java.lang.Boolean

False

 — 

 — 

 — 

ignoreWhitespace

 — 

java.lang.Boolean

False

 — 

 — 

 — 

key

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

multipleValueAction

 — 

org.bibliome.alvisnlp.modules.projectors.MultipleValueAction

True

 — 

 — 

 — 

normalizeSpace

 — 

java.lang.Boolean

False

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

subject

 — 

org.bibliome.alvisnlp.modules.projectors.Subject

True

 — 

 — 

 — 

targetLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

values

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 
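The ignoreCase, ignoreDiacritics and normalizeSpace parameters suggest that dictionary keys and text are normalised before comparison. The sketch below shows one plausible reading of those flags, inferred from the parameter names alone. It is an assumption, not the AlvisNLP implementation, and `normalize_key` is a hypothetical helper.

```python
import re
import unicodedata


def normalize_key(text: str,
                  ignore_case: bool = False,
                  ignore_diacritics: bool = False,
                  normalize_space: bool = False) -> str:
    """Normalise a string before dictionary lookup (assumed semantics)."""
    if ignore_diacritics:
        # Decompose characters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if ignore_case:
        text = text.casefold()
    if normalize_space:
        # Collapse runs of whitespace and trim both ends.
        text = re.sub(r"\s+", " ", text).strip()
    return text


key = normalize_key("K\u00f6lner  Phonetik", ignore_case=True,
                    ignore_diacritics=True, normalize_space=True)
```

Applying the same normalisation to both the dictionary entries and the searched text lets the two be compared with plain string equality.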

ElementProjector2

Category: Uncategorized
Framework: AlvisNLP
Version:

synopsis

Parameter Description Type Mandatory Default Value Multi-value Runtime

action

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

addToLayer

 — 

java.lang.Boolean

False

 — 

 — 

 — 

allUpperCaseInsensitive

 — 

java.lang.Boolean

False

 — 

 — 

 — 

allowJoined

 — 

java.lang.Boolean

False

 — 

 — 

 — 

caseInsensitive

 — 

java.lang.Boolean

False

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantDocumentFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantSectionFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

createAnnotations

 — 

java.lang.Boolean

False

 — 

 — 

 — 

createDocuments

 — 

java.lang.Boolean

False

 — 

 — 

 — 

createRelations

 — 

java.lang.Boolean

False

 — 

 — 

 — 

createSections

 — 

java.lang.Boolean

False

 — 

 — 

 — 

createTuples

 — 

java.lang.Boolean

False

 — 

 — 

 — 

deleteElements

 — 

java.lang.Boolean

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

entries

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

ignoreDiacritics

 — 

java.lang.Boolean

False

 — 

 — 

 — 

joinDash

 — 

java.lang.Boolean

False

 — 

 — 

 — 

key

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

matchStartCaseInsensitive

 — 

java.lang.Boolean

False

 — 

 — 

 — 

multipleEntryBehaviour

 — 

org.bibliome.alvisnlp.modules.trie.MultipleEntryBehaviour

True

 — 

 — 

 — 

removeFromLayer

 — 

java.lang.Boolean

False

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

setArguments

 — 

java.lang.Boolean

False

 — 

 — 

 — 

setFeatures

 — 

java.lang.Boolean

False

 — 

 — 

 — 

skipConsecutiveWhitespaces

 — 

java.lang.Boolean

False

 — 

 — 

 — 

skipWhitespace

 — 

java.lang.Boolean

False

 — 

 — 

 — 

subject

 — 

org.bibliome.alvisnlp.modules.trie.Subject

True

 — 

 — 

 — 

targetLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

trieSink

 — 

org.bibliome.util.files.OutputFile

False

 — 

 — 

 — 

trieSource

 — 

org.bibliome.util.files.InputFile

False

 — 

 — 

 — 

wordStartCaseInsensitive

 — 

java.lang.Boolean

False

 — 

 — 

 — 

EngLemmatiser

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

English lemmatiser which is adapted from WordNet. From dragontools/Banner toolkit.

Parameter Description Type Mandatory Default Value Multi-value Runtime

DisableVerbAdjective

 — 

Boolean

True

 — 

false

 — 

IndexLookup

 — 

Boolean

True

 — 

false

 — 

Feature Generator

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

Generates a list of user-defined observations for each token. Token and sequence boundaries are also parametrised. The output of this component is useful for machine learning components.

Parameter Description Type Mandatory Default Value Multi-value Runtime

FeatureDefinitions

 — 

String

True

 — 

true

 — 

SequenceAnnotationType

 — 

String

True

 — 

false

 — 

TokenAnnotationType

 — 

String

True

 — 

false

 — 

FileMapper

Category: Uncategorized
Framework: AlvisNLP
Version: 2010-10-28

Maps the value of an annotation feature according to a mapping file.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

ignoreCase

 — 

java.lang.Boolean

False

 — 

 — 

 — 

mappedLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

mappingFile

 — 

org.bibliome.util.streams.SourceStream

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

separator

 — 

java.lang.Character

True

 — 

 — 

 — 

sourceFeature

 — 

java.lang.String

True

 — 

 — 

 — 

targetFeatures

 — 

java.lang.String[]

True

 — 

 — 

 — 

FileMapper2

Category: Uncategorized
Framework: AlvisNLP
Version:

Maps elements according to a tab-separated mapping file.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

form

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

ignoreCase

 — 

java.lang.Boolean

False

 — 

 — 

 — 

keyColumn

 — 

java.lang.Integer

True

 — 

 — 

 — 

mappingFile

 — 

org.bibliome.util.streams.SourceStream

True

 — 

 — 

 — 

operator

 — 

org.bibliome.alvisnlp.modules.mapper.MappingOperator

True

 — 

 — 

 — 

separator

 — 

java.lang.Character

True

 — 

 — 

 — 

target

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

targetFeatures

 — 

java.lang.String[]

True

 — 

 — 

 — 

FreelingMorpho

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

Performs tokenisation and determines possible lemmas and POS tags for each token, with confidence scores. Operates on English (en), Spanish (es), Catalan (ca), Welsh (cy), Galician (gl), Italian (it) and Portuguese (pt); the language is selected by setting the "language" parameter (default is English).

Parameter Description Type Mandatory Default Value Multi-value Runtime

language

 — 

String

True

 — 

false

 — 

GATE Composite document

Category: Uncategorized
Framework: GATE
Version: unknown

GATE Composite document.

Gazetteer List Collector

Category: Uncategorized
Framework: GATE
Version: unknown

Gazetteer lists collector.

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationTypes

 — 

java.util.ArrayList

 — 

Organization;Person;Location;Date

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

gazetteer

 — 

gate.creole.gazetteer.Gazetteer

 — 

 — 

 — 

true

markupASName

 — 

java.lang.String

 — 

Key

 — 

true

theLanguage

 — 

java.lang.String

 — 

 — 

 — 

true

GermanSeparatedParticleAnnotator

Category: Uncategorized
Framework: DKPro Core (UIMA)
Version: 1.8.0

Annotator to be used for post-processing of German corpora that have been lemmatized and POS-tagged with the TreeTagger, based on the STTS tagset. This annotator deals with German particle verbs. Particle verbs consist of a particle and a stem, e.g. anfangen = an + fangen. There are many usages of German particle verbs where the stem and the particle are separated, e.g. "Wir fangen gleich an." The TreeTagger lemmatizes the verb stem as "fangen" and the separated particle as "an", so the proper verb lemma "anfangen" is not available as an annotation. The GermanSeparatedParticleAnnotator replaces the lemma of the stem of a particle verb (e.g. fangen) with the proper verb lemma (e.g. anfangen) and leaves the lemma of the separated particle unchanged.
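The lemma replacement described for GermanSeparatedParticleAnnotator can be illustrated with a minimal Python sketch. It is not the DKPro Core implementation: the `Token` class and the function name are hypothetical, and pairing each separated particle (STTS tag PTKVZ) with the nearest preceding finite verb (VVFIN or VVIMP) is an assumed heuristic.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical token record: surface form, lemma and STTS POS tag."""
    text: str
    lemma: str
    pos: str


def attach_separated_particles(sentence):
    """Prefix the lemma of a finite verb with the lemma of a later separated particle.

    Assumption: a particle (PTKVZ) belongs to the nearest preceding finite
    full verb (VVFIN or VVIMP) in the same sentence.
    """
    last_verb = None
    for token in sentence:
        if token.pos in ("VVFIN", "VVIMP"):
            last_verb = token
        elif token.pos == "PTKVZ" and last_verb is not None:
            # The particle lemma stays unchanged; only the verb lemma is rewritten.
            last_verb.lemma = token.lemma + last_verb.lemma
            last_verb = None
    return sentence


sentence = [
    Token("Wir", "wir", "PPER"),
    Token("fangen", "fangen", "VVFIN"),
    Token("gleich", "gleich", "ADV"),
    Token("an", "an", "PTKVZ"),
]
attach_separated_particles(sentence)
```

After the call, the lemma of "fangen" is "anfangen" and the lemma of the particle is still "an", which matches the behaviour described above.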

Groovy support for GATE

Category: Uncategorized
Framework: GATE
Version: unknown

Hindi Main Grammar

Category: Uncategorized
Framework: GATE
Version: unknown

A module for executing Jape grammars

Parameter Description Type Mandatory Default Value Multi-value Runtime

document

 — 

gate.Document

 — 

 — 

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

grammarURL

 — 

java.net.URL

 — 

resources/grammar/main.jape

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

Hindi OrthoMatcher

Category: Uncategorized
Framework: GATE
Version: unknown

Hindi Orthomatcher

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationSetName

 — 

java.lang.String

 — 

 — 

 — 

true

annotationTypes

 — 

java.util.ArrayList

 — 

Organization;Person;Location;Date

 — 

true

caseSensitive

 — 

java.lang.Boolean

 — 

false

 — 

 — 

definitionFileURL

 — 

java.net.URL

 — 

resources/orthomatcher/listsNM.def

 — 

 — 

document

 — 

gate.Document

 — 

 — 

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

extLists

 — 

java.lang.Boolean

 — 

true

 — 

 — 

highPrecisionOrgs

 — 

java.lang.Boolean

 — 

false

 — 

 — 

minimumNicknameLikelihood

 — 

java.lang.Double

 — 

0.50

 — 

 — 

organizationType

 — 

java.lang.String

 — 

Organization

 — 

 — 

personType

 — 

java.lang.String

 — 

Person

 — 

 — 

processUnknown

 — 

java.lang.Boolean

 — 

true

 — 

 — 

Hindi Tokeniser Postprocessor

Category: Uncategorized
Framework: GATE
Version: unknown

A module for executing Jape grammars

Parameter Description Type Mandatory Default Value Multi-value Runtime

document

 — 

gate.Document

 — 

 — 

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

grammarURL

 — 

java.net.URL

 — 

resources/tokeniser/join.jape

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

HyponymyTermbank

Category: Uncategorized
Framework: GATE
Version: unknown

TermRaider Termbank derived from head/string hyponymy

Parameter Description Type Mandatory Default Value Multi-value Runtime

corpora

 — 

java.util.Set

 — 

 — 

 — 

 — 

debugMode

 — 

java.lang.Boolean

 — 

false

 — 

 — 

idDocumentFeature

 — 

java.lang.String

 — 

 — 

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

 — 

inputAnnotationFeature

 — 

java.lang.String

 — 

canonical

 — 

 — 

inputAnnotationTypes

 — 

java.util.Set

 — 

SingleWord;MultiWord

 — 

 — 

inputHeadFeatures

 — 

java.util.List

 — 

 — 

 — 

 — 

languageFeature

 — 

java.lang.String

 — 

lang

 — 

 — 

normalization

 — 

gate.termraider.modes.Normalization

 — 

Sigmoid

 — 

 — 

scoreProperty

 — 

java.lang.String

 — 

kyotoDomainRelevance

 — 

 — 

IOTestRunner$Validator

Category: Uncategorized
Framework: DKPro Core (UIMA)
Version: 1.8.0

Descriptor automatically generated by uimaFIT

InsertContents

Category: Uncategorized
Framework: AlvisNLP
Version:

synopsis

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantSectionFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

insert

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

offset

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

points

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

userFunctions

 — 

org.bibliome.alvisnlp.library.UserFunction[]

True

 — 

 — 

 — 

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 0.3

Uses the Kleio service to fetch MEDLINE abstracts matching a specified query. Kleio is available at http://www.nactem.ac.uk/Kleio/

Parameter Description Type Mandatory Default Value Multi-value Runtime

query

Kleio query

String

True

 — 

false

 — 

recentFirst

If true, results will be sorted by the date of publication in decreasing order. Otherwise, they will be sorted by relevance.

Boolean

False

 — 

false

 — 

LBJ Named Entity Recognizer

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

A wrapper for the Illinois Named Entity Tagger

Parameter Description Type Mandatory Default Value Multi-value Runtime

BeamSize

 — 

Integer

False

 — 

false

 — 

BrownClusterFiles

Set of resource files

String

True

 — 

true

 — 

BrownClusterThresholds

Settings per cluster resource file

Integer

True

 — 

true

 — 

BrownIsLowercase

Setting per cluster resource

String

True

 — 

true

 — 

ChunkScheme

Whether BIO, BILOU, IOB2, etc.

String

True

 — 

false

 — 

EmbeddingDimensionalities

 — 

Integer

False

 — 

false

 — 

Features

Which features to use

String

True

 — 

true

 — 

ForceNewSentenceOnLineBreaks

 — 

Boolean

False

 — 

false

 — 

InferenceMethod

 — 

String

False

 — 

false

 — 

IsLowercaseWordEmbeddings

 — 

Boolean

False

 — 

false

 — 

KeepOriginalFileTokenizationAndSentenceSplitting

 — 

Boolean

False

 — 

false

 — 

Labels

Which labels to output

String

True

 — 

true

 — 

LinkScoreThreshold

 — 

Float

False

 — 

false

 — 

MinWordAppThresholdsForEmbeddings

 — 

Integer

False

 — 

false

 — 

NormalizationConstantsForEmbeddings

 — 

Float

False

 — 

false

 — 

NormalizationMethodsForEmbeddings

 — 

String

False

 — 

false

 — 

NormalizeTitleText

 — 

Boolean

True

 — 

false

 — 

PredictionConfidenceThreshold

 — 

Integer

False

 — 

false

 — 

ThresholdPrediction

 — 

Boolean

False

 — 

false

 — 

TokenizationScheme

 — 

String

True

 — 

false

 — 

LayerComparator

Category: Uncategorized
Framework: AlvisNLP
Version: 2010-10-28

Compares annotations in two different layers.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

outFile

 — 

org.bibliome.util.streams.TargetStream

True

 — 

 — 

 — 

predictedLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

referenceLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

Linguistic Simplifier

Category: Uncategorized
Framework: GATE
Version: unknown

A processing resource that takes document and corpus parameters

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationSetName

 — 

java.lang.String

 — 

 — 

 — 

true

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

encoding

 — 

java.lang.String

 — 

UTF-8

 — 

 — 

gazetteerURL

 — 

java.net.URL

 — 

resources/gazetteer/lists.def

 — 

 — 

japeURL

 — 

java.net.URL

 — 

resources/jape/main.jape

 — 

 — 

nounVerbMapURL

 — 

java.net.URL

 — 

resources/noun_verb.csv

 — 

 — 

wordNet

 — 

gate.wordnet.WordNet

 — 

 — 

 — 

true

Linguistic Simplifier

Category: Uncategorized
Framework: GATE
Version: unknown

Example application for the linguistic simplifier

Parameter Description Type Mandatory Default Value Multi-value Runtime

menu

 — 

java.util.List

 — 

 — 

 — 

 — 

pipelineURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

Lucene IR Engine

Category: Uncategorized
Framework: GATE
Version: unknown

Lupedia Service PR

Category: Uncategorized
Framework: GATE
Version: unknown

Runs a lupedia annotation service on a GATE document

Parameter Description Type Mandatory Default Value Multi-value Runtime

caseSensitive

 — 

java.lang.Boolean

 — 

true

 — 

true

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

datasets

 — 

java.util.List

 — 

Person;Event;Place;Organisation;Work

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

keepFirstAndLongestMatch

 — 

java.lang.Boolean

 — 

true

 — 

true

keepHighest

 — 

java.lang.Boolean

 — 

true

 — 

true

keepSpecific

 — 

java.lang.Boolean

 — 

true

 — 

true

lang

 — 

gate.lupedia.Language

 — 

en

 — 

true

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

singleGreedyMatch

 — 

java.lang.Boolean

 — 

false

 — 

true

skipShortWords

 — 

java.lang.Boolean

 — 

true

 — 

true

skipStopWords

 — 

java.lang.Boolean

 — 

true

 — 

true

threshold

 — 

java.lang.Double

 — 

0.70

 — 

true

Majority-vote consensus builder (annotation)

Category: Uncategorized
Framework: GATE
Version: unknown

Process results of a crowd annotation task to find where annotators agree and disagree.

Parameter Description Type Mandatory Default Value Multi-value Runtime

consensusASName

 — 

java.lang.String

 — 

 — 

 — 

true

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

disputeASName

 — 

java.lang.String

 — 

crowdDisputed

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

minimumAgreement

 — 

java.lang.Integer

 — 

 — 

 — 

true

resultASName

 — 

java.lang.String

 — 

crowdResults

 — 

true

resultAnnotationType

 — 

java.lang.String

 — 

 — 

 — 

true
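As a rough illustration of majority-vote consensus building, the following Python sketch splits crowd judgements into agreed and disputed items. It is not the GATE plugin's code: the function name, the input layout and the tie handling are assumptions; only the idea of a minimum agreement threshold comes from the minimumAgreement parameter above.

```python
from collections import Counter


def split_by_agreement(votes_per_item, minimum_agreement):
    """Separate items with a clear majority label from disputed ones.

    votes_per_item maps a (hypothetical) item id to the list of labels
    chosen by individual annotators. An item reaches consensus when its
    most frequent label has at least minimum_agreement votes and is not
    tied with another label (assumed tie handling).
    """
    consensus, disputed = {}, {}
    for item, votes in votes_per_item.items():
        if not votes:
            continue  # nothing to decide for items without judgements
        ranked = Counter(votes).most_common(2)
        top_label, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        if top_count >= minimum_agreement and not tied:
            consensus[item] = top_label
        else:
            disputed[item] = sorted(set(votes))
    return consensus, disputed


votes = {
    "t1": ["Person", "Person", "Location"],
    "t2": ["Person", "Location"],
}
agreed, contested = split_by_agreement(votes, minimum_agreement=2)
```

Here "t1" ends up in the consensus result with the label "Person", while "t2" is reported as disputed because neither label reaches two votes.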

MergeLayers

Category: Uncategorized
Framework: AlvisNLP
Version: 2010-10-28

Creates a new layer in each section containing all annotations in source layers.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sourceLayerNames

 — 

java.lang.String[]

True

 — 

 — 

 — 

targetLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

MergeSections

Category: Uncategorized
Framework: AlvisNLP
Version:

Merge several sections into a single one.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantSectionFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

fragmentLayerName

 — 

java.lang.String

False

 — 

 — 

 — 

fragmentSelection

 — 

org.bibliome.alvisnlp.modules.clone.FragmentSelection

True

 — 

 — 

 — 

fragmentSeparator

 — 

java.lang.String

True

 — 

 — 

 — 

removeSections

 — 

java.lang.Boolean

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sectionSeparator

 — 

java.lang.String

True

 — 

 — 

 — 

sectionsLayerName

 — 

java.lang.String

False

 — 

 — 

 — 

targetSectionName

 — 

java.lang.String

True

 — 

 — 

 — 

MetaMap Annotator

Category: Uncategorized
Framework: GATE
Version: unknown

This plugin uses the MetaMap Java API to send GATE document content to MetaMap skrmedpostctl server and PrologBeans mmserver instances running on the given machine/port

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotNormalize

 — 

gate.metamap.AnnotNormalizeMode

 — 

None

 — 

true

annotateNegEx

 — 

java.lang.Boolean

 — 

false

 — 

true

annotatePhrases

 — 

java.lang.Boolean

 — 

false

 — 

true

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

excludeIfContains

 — 

java.util.ArrayList

 — 

 — 

 — 

true

excludeIfWithin

 — 

java.util.ArrayList

 — 

 — 

 — 

true

inputASName

 — 

java.lang.String

 — 

 — 

 — 

true

inputASTypeFeature

 — 

java.lang.String

 — 

 — 

 — 

true

inputASTypes

 — 

java.util.ArrayList

 — 

 — 

 — 

true

metaMapOptions

 — 

java.lang.String

 — 

-Xy

 — 

true

outputASName

 — 

java.lang.String

 — 

 — 

 — 

true

outputASType

 — 

java.lang.String

 — 

MetaMap

 — 

true

outputMode

 — 

gate.metamap.OutputMode

 — 

HighestMappingOnly

 — 

true

taggerMode

 — 

gate.metamap.TaggerMode

 — 

CoReference

 — 

true

MetaphonePhoneticTranscriptor

Category: Uncategorized
Framework: DKPro Core (UIMA)
Version: 1.8.0

Metaphone phonetic transcription based on Apache Commons Codec. Works for English.

MutationFinder

Category: Uncategorized
Framework: GATE
Version: unknown

GATE MutationFinder Wrapper

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationSetName

 — 

java.lang.String

 — 

 — 

 — 

true

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

regexURL

 — 

java.net.URL

 — 

resources/regex.txt

 — 

 — 

NGramAnnotator

Category: Uncategorized
Framework: DKPro Core (UIMA)
Version: 1.8.0

N-gram annotator.

Parameter Description Type Mandatory Default Value Multi-value Runtime

N

The length of the n-grams to generate (the "n" in n-gram).

Integer

True

 — 

false

 — 
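The role of the N parameter of NGramAnnotator can be shown with a small Python sketch. It is only an illustration over a plain token list, not the DKPro Core code; whether the component also emits shorter n-grams is not stated here, so the sketch produces n-grams of exactly length n.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of exactly length n as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # range() is empty when the sequence is shorter than n, so the result is [].
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


bigrams = ngrams(["the", "quick", "brown", "fox"], 2)
```

For the four tokens above, `bigrams` holds three pairs: ("the", "quick"), ("quick", "brown") and ("brown", "fox").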

NGrams

Category: Uncategorized
Framework: AlvisNLP
Version: 2012-04-30

Computes annotation n-grams.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

keepAnnotations

 — 

java.lang.String[]

True

 — 

 — 

 — 

maxNGramSize

 — 

java.lang.Integer

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sentenceLayerName

 — 

java.lang.String

False

 — 

 — 

 — 

targetLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

tokenLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

NeMine

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 0.0.1-SNAPSHOT

Parameter Description Type Mandatory Default Value Multi-value Runtime

threshold

 — 

Float

True

 — 

false

 — 

NewCount

Category: Uncategorized
Framework: AlvisNLP
Version: 2012-04-30

Counts element occurrences and writes the results in a file, including tfidf.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

countFile

 — 

org.bibliome.util.streams.TargetStream

False

 — 

 — 

 — 

documents

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

featureKey

 — 

java.lang.String

True

 — 

 — 

 — 

headers

 — 

java.lang.Boolean

False

 — 

 — 

 — 

target

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

tfidfFile

 — 

org.bibliome.util.streams.TargetStream

False

 — 

 — 

 — 
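To make the counting and tf-idf output of NewCount more concrete, here is a minimal Python sketch. The exact weighting used by AlvisNLP is not given in this overview, so the formula below (raw term count multiplied by the natural logarithm of the number of documents divided by the document frequency) is an assumption, as are the function name and the input layout.

```python
import math
from collections import Counter


def count_and_tfidf(documents):
    """Count feature values per document and derive a simple tf-idf score.

    documents maps a document id to the list of feature values found in it.
    Assumed weighting: tf * ln(N / df), with tf the raw count, N the number
    of documents and df the number of documents containing the value.
    """
    counts = {doc_id: Counter(values) for doc_id, values in documents.items()}
    doc_freq = Counter()
    for counter in counts.values():
        doc_freq.update(counter.keys())  # each document counts once per value
    n_docs = len(documents)
    tfidf = {
        doc_id: {
            term: tf * math.log(n_docs / doc_freq[term])
            for term, tf in counter.items()
        }
        for doc_id, counter in counts.items()
    }
    return counts, tfidf


counts, tfidf = count_and_tfidf({"d1": ["a", "b", "a"], "d2": ["b", "c"]})
```

In this example the value "b" occurs in every document and therefore scores 0, while "a" occurs twice in "d1" only and receives the highest score there.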

OBOMapper

Category: Uncategorized
Framework: AlvisNLP
Version:

synopsis

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

ancestorsFeature

 — 

java.lang.String

False

 — 

 — 

 — 

childrenFeature

 — 

java.lang.String

False

 — 

 — 

 — 

form

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

idFeature

 — 

java.lang.String

False

 — 

 — 

 — 

idKeys

 — 

java.lang.Boolean

False

 — 

 — 

 — 

ignoreCase

 — 

java.lang.Boolean

False

 — 

 — 

 — 

keepDBXref

 — 

java.lang.Boolean

False

 — 

 — 

 — 

nameFeature

 — 

java.lang.String

False

 — 

 — 

 — 

oboFiles

 — 

java.lang.String[]

True

 — 

 — 

 — 

operator

 — 

org.bibliome.alvisnlp.modules.mapper.MappingOperator

True

 — 

 — 

 — 

parentsFeature

 — 

java.lang.String

False

 — 

 — 

 — 

pathFeature

 — 

java.lang.String

False

 — 

 — 

 — 

target

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

versionFeature

 — 

java.lang.String

False

 — 

 — 

 — 

OBOProjector

Category: Uncategorized
Framework: AlvisNLP
Version:

Projects OBO terms and synonyms on sections.

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

allUpperCaseInsensitive

 — 

java.lang.Boolean

False

 — 

 — 

 — 

allowJoined

 — 

java.lang.Boolean

False

 — 

 — 

 — 

ancestorsFeature

 — 

java.lang.String

False

 — 

 — 

 — 

caseInsensitive

 — 

java.lang.Boolean

False

 — 

 — 

 — 

childrenFeature

 — 

java.lang.String

False

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

idFeature

 — 

java.lang.String

False

 — 

 — 

 — 

ignoreDiacritics

 — 

java.lang.Boolean

False

 — 

 — 

 — 

joinDash

 — 

java.lang.Boolean

False

 — 

 — 

 — 

keepDBXref

 — 

java.lang.Boolean

False

 — 

 — 

 — 

matchStartCaseInsensitive

 — 

java.lang.Boolean

False

 — 

 — 

 — 

multipleEntryBehaviour

 — 

org.bibliome.alvisnlp.modules.trie.MultipleEntryBehaviour

True

 — 

 — 

 — 

nameFeature

 — 

java.lang.String

False

 — 

 — 

 — 

oboFiles

 — 

java.lang.String[]

True

 — 

 — 

 — 

parentsFeature

 — 

java.lang.String

False

 — 

 — 

 — 

pathFeature

 — 

java.lang.String

False

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

skipConsecutiveWhitespaces

 — 

java.lang.Boolean

False

 — 

 — 

 — 

skipWhitespace

 — 

java.lang.Boolean

False

 — 

 — 

 — 

subject

 — 

org.bibliome.alvisnlp.modules.trie.Subject

True

 — 

 — 

 — 

targetLayerName

 — 

java.lang.String

True

 — 

 — 

 — 

trieSink

 — 

org.bibliome.util.files.OutputFile

False

 — 

 — 

 — 

trieSource

 — 

org.bibliome.util.files.InputFile

False

 — 

 — 

 — 

versionFeature

 — 

java.lang.String

False

 — 

 — 

 — 

wordStartCaseInsensitive

 — 

java.lang.Boolean

False

 — 

 — 

 — 

OWLIM Ontology

Category: Uncategorized
Framework: GATE
Version: unknown

Ontology created as a temporary OWLIM3 in-memory repository

Parameter Description Type Mandatory Default Value Multi-value Runtime

baseURI

 — 

java.lang.String

 — 

 — 

 — 

 — 

dataDirectoryURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

loadImports

 — 

java.lang.Boolean

 — 

true

 — 

 — 

mappingsURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

n3URL

 — 

java.net.URL

 — 

 — 

 — 

 — 

ntriplesURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

persistent

 — 

java.lang.Boolean

 — 

false

 — 

 — 

rdfXmlURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

turtleURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

OWLIM Ontology DEPRECATED

Category: Uncategorized
Framework: GATE
Version: unknown

Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only

Parameter Description Type Mandatory Default Value Multi-value Runtime

baseURI

 — 

java.lang.String

 — 

 — 

 — 

 — 

dataDirectoryURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

defaultNameSpace

 — 

java.lang.String

 — 

 — 

 — 

 — 

loadImports

 — 

java.lang.Boolean

 — 

true

 — 

 — 

mappingsURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

n3URL

 — 

java.net.URL

 — 

 — 

 — 

 — 

ntriplesURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

persistent

 — 

java.lang.Boolean

 — 

false

 — 

 — 

rdfXmlURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

turtleURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

OntoReif

Category: Uncategorized
Framework: AlvisNLP
Version:

synopsis

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

OpenNLPNEDetector

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

Detects named entities in text and creates corresponding entity annotations that span the found entities. Uses the OpenNLP MaxEnt named entity Detector. Each entity class has a separate MaxEnt model file. All model files must be stored in a single model file directory and use the following naming convention: "class.bin.gz", where "class" is the entity class name and ".bin.gz" must appear as shown, e.g., "person.bin.gz". This analysis engine takes a parameter called "EntityTypeMapping" which maps each entity class name to an entity annotation type. The entity class name must match a model file in the model file directory, and the entity annotation type must be defined in the type system and have a corresponding JCas Java class.

Parameter Description Type Mandatory Default Value Multi-value Runtime

EntityTypeMappings

Mapping from entity names (obtained from the model filename) to the JCas class for the corresponding annotation. Each mapping string is of the form "name,class", i.e., the entity type name followed by a comma followed by the annotation class.

String

False

 — 

true

 — 

ModelDirectory

 — 

String

True

 — 

false

 — 
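The naming convention for model files and the "name,class" mapping format of OpenNLPNEDetector can be sketched in Python as follows. This only illustrates the two conventions described above and is not part of the component; the function names and the class name `org.example.types.Person` are hypothetical.

```python
from pathlib import PurePosixPath


def parse_entity_type_mappings(mappings):
    """Turn "name,class" strings into a dict from entity class name to annotation class."""
    result = {}
    for entry in mappings:
        name, sep, annotation_class = entry.partition(",")
        name, annotation_class = name.strip(), annotation_class.strip()
        if not sep or not name or not annotation_class:
            raise ValueError(f"expected 'name,class' but got {entry!r}")
        result[name] = annotation_class
    return result


def model_file_for(model_directory, entity_class):
    """Build the expected model path: <directory>/<class>.bin.gz."""
    return str(PurePosixPath(model_directory) / f"{entity_class}.bin.gz")


mapping = parse_entity_type_mappings(["person,org.example.types.Person"])
paths = {name: model_file_for("models", name) for name in mapping}
```

With these inputs, the entity class "person" is expected to have its model at "models/person.bin.gz" and is mapped to the (hypothetical) annotation class `org.example.types.Person`.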

OpenNLPSentenceDetector

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

Detects sentence boundaries and creates sentence annotations that span these boundaries. Uses the OpenNLP MaxEnt Sentence Detector.

Parameter Description Type Mandatory Default Value Multi-value Runtime

ModelFile

Filename of the model file.

String

True

 — 

false

 — 

OrthoRef

Category: Uncategorized
Framework: GATE
Version: unknown

An orthographic coreferencer

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationSetName

 — 

java.lang.String

 — 

 — 

 — 

true

configFileUrl

 — 

java.net.URL

 — 

resources/default-config.coref.xml

 — 

 — 

corpus

 — 

gate.Corpus

 — 

 — 

 — 

true

document

 — 

gate.Document

 — 

 — 

 — 

true

maxLookBehind

 — 

java.lang.Integer

 — 

10

 — 

true

OscarMER

Category: Uncategorized
Framework: NaCTeM (UIMA)
Version: 1.0

Runs Oscar 3 with maximum entropy based recogniser with syntactic tokens as input

PMI Bank

Category: Uncategorized
Framework: GATE
Version: unknown

Pointwise Mutual Information from corpora

Parameter Description Type Mandatory Default Value Multi-value Runtime

allowOverlapCollocations

 — 

java.lang.Boolean

 — 

false

 — 

 — 

corpora

 — 

java.util.Set

 — 

 — 

 — 

 — 

debugMode

 — 

java.lang.Boolean

 — 

false

 — 

 — 

innerAnnotationTypes

 — 

java.util.Set

 — 

Entity

 — 

 — 

inputASName

 — 

java.lang.String

 — 

 — 

 — 

 — 

inputAnnotationFeature

 — 

java.lang.String

 — 

canonical

 — 

 — 

languageFeature

 — 

java.lang.String

 — 

lang

 — 

 — 

outerAnnotationType

 — 

java.lang.String

 — 

Sentence

 — 

 — 

outerAnnotationWindow

 — 

java.lang.Integer

 — 

2

 — 

 — 

requireTypeDifference

 — 

java.lang.Boolean

 — 

false

 — 

 — 

scoreProperty

 — 

java.lang.String

 — 

pmiScore

 — 

 — 
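For readers unfamiliar with pointwise mutual information, the score behind PMI Bank can be sketched in Python from raw counts. This is the textbook definition, PMI(x, y) = log(p(x, y) / (p(x) p(y))); how the GATE component estimates the probabilities from its annotation windows and which logarithm base it uses are not stated here, so the base 2 and the function signature are assumptions.

```python
import math


def pmi(pair_count, x_count, y_count, total):
    """Pointwise mutual information (base 2) estimated from co-occurrence counts.

    pair_count: number of contexts containing both x and y
    x_count, y_count: number of contexts containing x, respectively y
    total: number of contexts overall
    """
    if min(pair_count, x_count, y_count, total) <= 0:
        raise ValueError("all counts must be positive")
    # p(x,y) / (p(x) * p(y)) simplifies to pair_count * total / (x_count * y_count).
    return math.log2(pair_count * total / (x_count * y_count))


score = pmi(pair_count=10, x_count=20, y_count=20, total=1000)
```

A positive score, as in this example, means the two terms co-occur more often than expected if they were independent; a score of 0 means they co-occur exactly as often as independence predicts.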

PMI Example (English)

Category: Uncategorized
Framework: GATE
Version: unknown

Example application for the PMI (pointwise mutual information) tool

Parameter Description Type Mandatory Default Value Multi-value Runtime

menu

 — 

java.util.List

 — 

 — 

 — 

 — 

pipelineURL

 — 

java.net.URL

 — 

 — 

 — 

 — 

PatternMatcher

Category: Uncategorized
Framework: AlvisNLP
Version: 2010-10-28

Matches a regular expression-like pattern on the sequence of annotations in a given layer.

Parameter Description Type Mandatory Default Value Multi-value Runtime

actions

 — 

org.bibliome.alvisnlp.modules.pattern.action.MatchAction[]

True

 — 

 — 

 — 

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

annotationComparator

 — 

alvisnlp.corpus.AnnotationComparator

True

 — 

 — 

 — 

constantAnnotationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantRelationFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

constantTupleFeatures

 — 

alvisnlp.module.types.Mapping

False

 — 

 — 

 — 

documentFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

layerName

 — 

java.lang.String

True

 — 

 — 

 — 

overlappingBehaviour

 — 

org.bibliome.alvisnlp.modules.pattern.OverlappingBehaviour

True

 — 

 — 

 — 

pattern

 — 

org.bibliome.alvisnlp.modules.pattern.ElementPattern

True

 — 

 — 

 — 

sectionFilter

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

ProminentConceptReporter

Category: Uncategorized
Framework: AlvisNLP
Version:

synopsis

Parameter Description Type Mandatory Default Value Multi-value Runtime

active

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

conceptAnnotations

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

conceptId

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

documents

 — 

alvisnlp.corpus.expressions.Expression

True

 — 

 — 

 — 

sectionName

 — 

java.lang.String

True

 — 

 — 

 — 

Quality Assurance PR

Category: Uncategorized
Framework: GATE
Version: unknown

The Quality Assurance PR provides the functionality of the Corpus QA Tool in GATE Developer.

Parameter Description Type Mandatory Default Value Multi-value Runtime

annotationTypes

 — 

java.util.List

 — 

 — 

 — 

true

corpus

 — 

gate.Corpus

 — 

 —