Typesystem Alignment 1.0.0

This document aligns the type systems and annotation conventions used by the OpenMinTeD partners and LAPPS with one another, taking the types supported by LAPPS as the reference.

Overview

Table 1. Typesystems (6)
Typesystem Size
Alvis 8
Argo 14
DKPro_Core 22
GATE 15
ILSP 13
LAPPS 22

Type alignments

Table 2. LAPPS type alignments
LAPPS Alvis Argo DKPro_Core GATE ILSP

Annotation

 — 

 — 

AnnotationBase

Annotation

gr.ilsp.types.Annotation

AudioDocument

 — 

 — 

 — 

Unsupported

 — 

Constituent

Tuple

org.u_compare.shared.syntactic.TreeNode

Constituent

SyntaxTreeNode

 — 

Coreference

Tuple

org.u_compare.shared.semantic.CoreferenceAnnotation

CoreferenceChain

 — 

 — 

Date

 — 

 — 

Date

Date

Timex2Mention

Dependency

Tuple

org.u_compare.shared.syntactic.Dependency

Dependency

Dependency

DependencyRelation

DependencyStructure

Relation

 — 

 — 

 — 

 — 

Document

Document

org.u_compare.shared.document.DocumentAnnotation

DocumentMetaData

Document

DocumentAnnotation

Location

 — 

org.u_compare.shared.semantic.Place

Location

Location

LOC

Markable

 — 

 — 

CoreferenceLink

 — 

 — 

NamedEntity

Annotation

org.u_compare.shared.semantic.NamedEntity

NamedEntity

 — 

 — 

NounChunk

Annotation

 — 

NC

NounChunk

 — 

Organization

 — 

 — 

Organization

Organization

ORG

Person

 — 

org.u_compare.shared.semantic.Person

Person

Person

PER

PhraseStructure

Relation

 — 

ROOT

SyntaxTreeNode

 — 

Sentence

Annotation

org.u_compare.shared.syntactic.Sentence

Sentence

Sentence

Sentence

Span

Annotation

Uima.tcas.Annotation

Annotation

Annotation

gr.ilsp.types.Annotation

TextDocument

Section

 — 

 — 

(GATE only supports textual documents, so all documents are implicitly text)

 — 

Thing

Element

Uima.cas.TOP

TOP

 — 

uima.tcas.Annotation

Token

Annotation

org.u_compare.shared.syntactic.Token

Token

Token

Token

Token.metadata.posTagSet and friends

 — 

 — 

TagSetDescription

 — 

 — 

VerbChunk

Annotation

 — 

VC

VG

 — 
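To illustrate how Table 2 can be used, the following minimal sketch encodes four of its rows as a lookup structure. The dictionary layout, the function names and the use of None for a cell without an equivalent are assumptions made for this example; only the type names come from the table.

```python
# Excerpt of Table 2 (four rows only). None marks a cell with no aligned type.
TYPE_ALIGNMENT = {
    "Person": {
        "Alvis": None,
        "Argo": "org.u_compare.shared.semantic.Person",
        "DKPro_Core": "Person",
        "GATE": "Person",
        "ILSP": "PER",
    },
    "Organization": {
        "Alvis": None,
        "Argo": None,
        "DKPro_Core": "Organization",
        "GATE": "Organization",
        "ILSP": "ORG",
    },
    "Token": {
        "Alvis": "Annotation",
        "Argo": "org.u_compare.shared.syntactic.Token",
        "DKPro_Core": "Token",
        "GATE": "Token",
        "ILSP": "Token",
    },
    "VerbChunk": {
        "Alvis": "Annotation",
        "Argo": None,
        "DKPro_Core": "VC",
        "GATE": "VG",
        "ILSP": None,
    },
}


def aligned_type(lapps_type, typesystem):
    """Return the type aligned to a LAPPS type, or None if the table lists none."""
    return TYPE_ALIGNMENT[lapps_type][typesystem]


def to_lapps(typesystem, native_type):
    """Reverse lookup: all LAPPS types (in this excerpt) aligned to a native type."""
    return sorted(
        lapps for lapps, row in TYPE_ALIGNMENT.items() if row[typesystem] == native_type
    )
```

The reverse lookup returns a list because the alignment is not one-to-one: a generic type such as the Alvis Annotation is aligned to several LAPPS types.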

Feature alignments

Table 3. LAPPS type Annotation feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Annotation.metadata.producer

 — 

 — 

 — 

 — 

Annotation.componentId

Annotation.metadata.rules

 — 

 — 

 — 

 — 

 — 

Annotation.properties.id

 — 

 — 

 — 

Annotation.id

Annotation.id

Table 4. LAPPS type AudioDocument feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP
Table 5. LAPPS type Constituent feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Constituent.properties.children

Tuple.arguments

children

Constituent.children

 — 

 — 

Constituent.properties.label

 — 

 — 

Constituent.constituentType

 — 

 — 

Constituent.properties.parent

 — 

parent

Constituent.parent

SyntaxTreeNode.features.consists

 — 

Table 6. LAPPS type Coreference feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Coreference.properties.mentions

 — 

linkedAnnotationSets

CoreferenceChain.links()

 — 

 — 

Coreference.properties.representative

 — 

 — 

 — 

 — 

 — 

Table 7. LAPPS type Date feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Date.properties.dateType

 — 

 — 

NamedEntity.value

Date.features.kind

 — 

Table 8. LAPPS type Dependency feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Dependency.properties.dependent

 — 

target

Dependency.dependent

Dependency.features.args[1]

 — 

Dependency.properties.governor

 — 

source

Dependency.governor

Dependency.features.args[0]

head

Dependency.properties.label

 — 

label

Dependency.dependencyType

Dependency.features.kind

label

Table 9. LAPPS type DependencyStructure feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

DependencyStructure.metadata.dependencySet

 — 

 — 

 — 

 — 

 — 

DependencyStructure.properties.dependencies

Relation.tuples

 — 

 — 

 — 

 — 

DependencyStructure.properties.dependencyType

 — 

 — 

 — 

 — 

 — 

Table 10. LAPPS type Document feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Document.properties.encoding

 — 

 — 

 — 

Document.mimeType

 — 

Document.properties.id

Document.id

Document.externalReference.id

DocumentMetaData.documentId

Document.name

DocumentAnnotation.docId

Document.properties.language

 — 

 — 

DocumentAnnotation.language

Document.features.language

DocumentAnnotation.language

Document.properties.source

 — 

Document.externalReference

DocumentMetaData.documentUri

Document.sourceUrl or Document.stringContent

DocumentAnnotation.uri

Document.properties.sourceType

 — 

 — 

 — 

 — 

 — 

Table 11. LAPPS type Location feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Location.properties.locType

 — 

 — 

NamedEntity.value

Location.features.locType

 — 

Table 12. LAPPS type Markable feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Coreference.properties.mentions

 — 

 — 

CoreferenceLink.next

 — 

 — 

Table 13. LAPPS type NamedEntity feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP
Table 14. LAPPS type NounChunk feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP
Table 15. LAPPS type Organization feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Organization.properties.orgType

 — 

 — 

NamedEntity.value

Organization.features.orgType

 — 

Table 16. LAPPS type Person feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Person.properties.gender

 — 

 — 

NamedEntity.value

Person.features.gender

 — 

Table 17. LAPPS type PhraseStructure feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

PhraseStructure.metadata.categorySet

 — 

 — 

TagSetDescription

 — 

 — 

PhraseStructure.properties.constituents

Relation.tuples

 — 

 — 

 — 

 — 

Table 18. LAPPS type Sentence feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Sentence.properties.sentenceType

 — 

 — 

 — 

 — 

 — 

Table 19. LAPPS type Span feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Span.properties.end

Annotation.end

Annotation.end

Annotation.end

Annotation.endOffset

Annotation.end

Span.properties.start

Annotation.start

Annotation.begin

Annotation.begin

Annotation.startOffset

Annotation.begin

Span.properties.targets

 — 

 — 

 — 

 — 

 — 
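The Span alignment in Table 19 amounts to a renaming of the two offset features. The sketch below shows that renaming under the assumption that an annotation is available as a plain dictionary keyed by feature name; the function name and the dictionary representation are hypothetical, and the sketch does not attempt any conversion between offset conventions.

```python
# Offset feature names per type system, taken from Table 19: (start, end).
SPAN_FEATURES = {
    "Alvis": ("start", "end"),
    "Argo": ("begin", "end"),
    "DKPro_Core": ("begin", "end"),
    "GATE": ("startOffset", "endOffset"),
    "ILSP": ("begin", "end"),
}


def to_lapps_span(typesystem, annotation):
    """Rename native offset features to the LAPPS Span properties start and end."""
    start_key, end_key = SPAN_FEATURES[typesystem]
    return {"start": int(annotation[start_key]), "end": int(annotation[end_key])}
```

For example, `to_lapps_span("GATE", {"startOffset": 3, "endOffset": 9})` yields a dictionary with `start` 3 and `end` 9.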

Table 20. LAPPS type TextDocument feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP
Table 21. LAPPS type Thing feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Thing.properties.alternateName

 — 

 — 

 — 

 — 

 — 

Table 22. LAPPS type Token feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

Constituent.parent

 — 

 — 

Token.parent

 — 

Token.parent

Token.metadata.posTagSet

 — 

 — 

TagSetDescription

 — 

 — 

Token.properties.lemma

 — 

 — 

Token.lemma.value

Token.features.root

Token.lemma.value

Token.properties.length

Annotation.length

 — 

Token.getCoveredText().length()

Token.features.length

Token.getCoveredText().length()

Token.properties.orth

 — 

 — 

 — 

Token.features.orth

Token.orthogr.value

Token.properties.pos

 — 

 — 

Token.pos.posValue

Token.features.category

Token.POSTag.value

Token.properties.tokenType

 — 

 — 

 — 

Token.features.kind

Token.tokenType
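As a worked reading of the GATE column in Table 22, the sketch below copies GATE Token features into the corresponding LAPPS Token properties. Treating the feature map as a plain dictionary is an assumption, the function name is hypothetical, and the feature values in the usage note are invented for illustration.

```python
# GATE Token feature name -> LAPPS Token property name, as listed in Table 22.
GATE_TO_LAPPS_TOKEN = {
    "root": "lemma",
    "length": "length",
    "orth": "orth",
    "category": "pos",
    "kind": "tokenType",
}


def gate_token_to_lapps(features):
    """Keep only the aligned features and rename them to LAPPS property names."""
    return {
        lapps_name: features[gate_name]
        for gate_name, lapps_name in GATE_TO_LAPPS_TOKEN.items()
        if gate_name in features
    }
```

Calling `gate_token_to_lapps({"category": "NN", "root": "dog", "kind": "word"})` produces the properties `pos`, `lemma` and `tokenType`; features without an aligned LAPPS property are dropped.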

Table 23. LAPPS type Token.metadata.posTagSet and friends feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP
Table 24. LAPPS type VerbChunk feature alignment
LAPPS Alvis Argo DKPro_Core GATE ILSP

VerbChunk.neg

 — 

 — 

MorphologicalFeatures.negative

VG.features.neg

 — 

VerbChunk.tense

 — 

 — 

MorphologicalFeatures.tense

VG.features.tense

 — 

VerbChunk.vcType

 — 

 — 

MorphologicalFeatures.verbForm

VG.features.type

 — 

VerbChunk.voice

 — 

 — 

MorphologicalFeatures.voice

VG.features.voice

 — 
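The GATE column of Table 24 can be read as a simple conversion. The sketch below relies on the convention documented for VG.features.neg, namely that the feature is "yes" on negated verbs and absent otherwise. Representing the feature map as a dictionary and turning neg into a boolean are assumptions made for this example, and the function name is hypothetical.

```python
def vg_to_verb_chunk(vg_features):
    """Map GATE VG features to the LAPPS VerbChunk names from Table 24."""
    # The neg feature is absent on non-negated verb groups, so default to False.
    chunk = {"neg": vg_features.get("neg") == "yes"}
    for gate_name, lapps_name in (("tense", "tense"), ("type", "vcType"), ("voice", "voice")):
        if gate_name in vg_features:
            chunk[lapps_name] = vg_features[gate_name]
    return chunk
```

A verb group without a `neg` feature therefore converts to a chunk whose `neg` entry is `False`, rather than to a chunk with the entry missing.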

Typesystems

Alvis

Annotation

Supertype: Element

Table 25. Features of type Annotation
Feature Type Description

Annotation.end

int

0-based offset, counted in Unicode characters, of the character after the last character of the span.

Annotation.length

Annotation.start

int

0-based offset, counted in Unicode characters, of the first character of the span.
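Because the start offset points at the first character and the end offset at the character after the last one, the offsets behave like a half-open interval. The sketch below shows this with Python string slicing, which also counts in Unicode characters; the function names and the sample text are hypothetical.

```python
def covered_text(contents, start, end):
    """Return the text covered by a span with 0-based start and exclusive end."""
    return contents[start:end]


def span_length(start, end):
    """Length of the span in characters; no +1 is needed because end is exclusive."""
    return end - start


sample = "Alvis aligns types"
word = covered_text(sample, 6, 12)  # the second word of the sample
```

With this convention the length of a span always equals the number of characters in its covered text.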

Corpus

Supertype: Element

A corpus representing the data to be processed.

Table 26. Features of type Corpus
Feature Type Description

Corpus.documents

Document[]

A collection of documents of the corpus

Corpus.processedBy

String[]

The software components that have processed the corpus

Document

Supertype: Element

A document

Table 27. Features of type Document
Feature Type Description

Document.id

String

Document.sections

Section[]

Element

Supertype: Object

A generic element

Table 28. Features of type Element
Feature Type Description

Element.features

Map<String,String[]>

Key-value pairs. Keys are local to the processing of a corpus. Some feature keys follow usage conventions.

Layer

Supertype: Object

Set of annotations of the same type.

Table 29. Features of type Layer
Feature Type Description

Layer.annotations

Annotation[]

Layer.name

String

Layer name. Local to the process. Layer names follow conventions.

Relation

Supertype: Element

Set of element tuples. "Relation" in the RDBMS sense.

Table 30. Features of type Relation
Feature Type Description

Relation.name

String

Name of the relation

Relation.tuples

Tuple[]

Tuple

Element

Named tuple of elements.

Tuple.arguments

Map<String,Element>

Arguments of the tuple. As in features, argument names are local, and there are conventions.

Section

Supertype: Element

A section is a passage of the text document.

Table 31. Features of type Section
Feature Type Description

Section.allAnnotations

Annotation[]

Annotations in this section.

Section.contents

String

Text contents of the section.

Section.layers

Layer[]

Annotation containers.

Section.name

String

Name of the section.

Section.relations

Relation[]

Relations in this section.

Tuple

Supertype: Element

Convention: relation name "coreferences".

Table 32. Features of type Tuple
Feature Type Description

Tuple.arguments

Arguments are either of type Tuple (sub-constituent) or Annotation (terminal).

Argo

Uima.cas.TOP

Supertype: <none>

Most generic specification in uima.cas

Table 33. Features of type Uima.cas.TOP
Feature Type Description

Uima.tcas.Annotation

Supertype: TOP

Table 34. Features of type Uima.tcas.Annotation
Feature Type Description

Annotation.begin

Integer

Annotation.end

Integer

org.u_compare.shared.document.DocumentAnnotation

Supertype: org.u_compare.shared.ReferenceAnnotation

Table 35. Features of type org.u_compare.shared.document.DocumentAnnotation
Feature Type Description

Document.externalReference

org.u_compare.shared.ExternalReference

Document.externalReference.id

uima.cas.String

begin

uima.cas.Integer

Defined in Uima.tcas.Annotation

end

uima.cas.Integer

Defined in Uima.tcas.Annotation

fragments

uima.cas.FSArray

metadata

org.u_compare.shared.AnnotationMetadata

Placeholder for a confidence value

org.u_compare.shared.semantic.CoreferenceAnnotation

Supertype: org.u_compare.shared.semantic.DiscoursePhenomenon

Table 36. Features of type org.u_compare.shared.semantic.CoreferenceAnnotation
Feature Type Description

linkedAnnotationSets

uima.cas.FSArray (org.u_compare.shared.semantic.LinkedAnnotationSet)

org.u_compare.shared.semantic.NamedEntity

Supertype: org.u_compare.shared.semantic.SemanticClassAnnotation

Table 37. Features of type org.u_compare.shared.semantic.NamedEntity
Feature Type Description

org.u_compare.shared.semantic.Person

Supertype: org.u_compare.shared.semantic.ProperName

Table 38. Features of type org.u_compare.shared.semantic.Person
Feature Type Description

org.u_compare.shared.semantic.ProperName

org.u_compare.shared.semantic.NamedEntity

org.u_compare.shared.semantic.bio.CellLine

org.u_compare.shared.semantic.NamedEntity

org.u_compare.shared.semantic.bio.CellType

org.u_compare.shared.semantic.NamedEntity

org.u_compare.shared.semantic.bio.DNA

org.u_compare.shared.semantic.NamedEntity

org.u_compare.shared.semantic.bio.GeneOrGeneProduct

org.u_compare.shared.semantic.NamedEntity

org.u_compare.shared.semantic.bio.RNA

org.u_compare.shared.semantic.NamedEntity

org.u_compare.shared.semantic.Place

Supertype: org.u_compare.shared.semantic.ProperName

Table 39. Features of type org.u_compare.shared.semantic.Place
Feature Type Description

org.u_compare.shared.syntactic.Chunk

Supertype: org.u_compare.shared.syntactic.Constituent

Similar to specific chunks below but more general

Table 40. Features of type org.u_compare.shared.syntactic.Chunk
Feature Type Description

org.u_compare.shared.syntactic.Dependency

Supertype: __

Table 41. Features of type org.u_compare.shared.syntactic.Dependency
Feature Type Description

label

org.u_compare.shared.label.DependencyLabel

source

uima.cas.TOP

target

uima.cas.TOP

org.u_compare.shared.syntactic.POSToken

Supertype: org.u_compare.shared.syntactic.Token

A token containing a POS

Table 42. Features of type org.u_compare.shared.syntactic.POSToken
Feature Type Description

pos

org.u_compare.shared.label.POS

posString

uima.cas.String

org.u_compare.shared.syntactic.RichToken

Supertype: org.u_compare.shared.syntactic.POSToken

A token containing a POS and a word

Table 43. Features of type org.u_compare.shared.syntactic.RichToken
Feature Type Description

base

uima.cas.String

org.u_compare.shared.syntactic.Sentence

Supertype: org.u_compare.shared.syntactic.SyntacticAnnotation

Table 44. Features of type org.u_compare.shared.syntactic.Sentence
Feature Type Description

org.u_compare.shared.syntactic.Token

Supertype: org.u_compare.shared.syntactic.SyntacticAnnotation

Table 45. Features of type org.u_compare.shared.syntactic.Token
Feature Type Description

org.u_compare.shared.syntactic.TreeNode

Supertype: org.u_compare.shared.syntactic.SyntacticAnnotation

Table 46. Features of type org.u_compare.shared.syntactic.TreeNode
Feature Type Description

children

uima.cas.FSArray (uima.cas.TOP)

parent

uima.cas.TOP

DKPro_Core

Annotation

Supertype: AnnotationBase

UIMA Annotation Type

Table 47. Features of type Annotation
Feature Type Description

Annotation.begin

Integer

Annotation.end

Integer

AnnotationBase

Supertype: TOP

Abstract UIMA Annotation Type

Table 48. Features of type AnnotationBase
Feature Type Description

AnnotationBase.sofaFS

SofaFS

AnnotationBase.view

CAS

Chunk

Supertype: Annotation

Table 49. Features of type Chunk
Feature Type Description

chunkValue

String

Chunk category as produced by chunker

Constituent

Supertype: Annotation

Table 50. Features of type Constituent
Feature Type Description

Constituent.children

Annotation[]

Points to child Constituents or Tokens

Constituent.constituentType

String

Label on the constituent

Constituent.parent

Annotation

Points to parent Constituent or null

Constituent.syntacticFunction

String

Label on the link to the parent

CoreferenceChain

Supertype: TOP

Type to easily find the entry into a coreference chain.

Table 51. Features of type CoreferenceChain
Feature Type Description

CoreferenceChain.first

CoreferenceLink

The head of the coreference chain

CoreferenceChain.links()

CoreferenceLink[]

Convenience method at JCas level to collect all the links (mentions) in a chain.

CoreferenceLink

Supertype: Annotation

Table 52. Features of type CoreferenceLink
Feature Type Description

CoreferenceLink.next

CoreferenceLink

Points to next element in chain

CoreferenceLink.referenceLink

String

Category of the link to the next element in the chain

CoreferenceLink.referenceType

String

Category of the present element
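The relationship between CoreferenceChain.first, CoreferenceLink.next and the links() convenience method can be sketched as a linked-list traversal. The Python classes below are stand-ins written for this example, not the DKPro Core JCas classes, and the `text` field is a hypothetical addition for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CoreferenceLink:
    text: str  # hypothetical field, used only to make the example readable
    next: Optional["CoreferenceLink"] = None  # next element in the chain


@dataclass
class CoreferenceChain:
    first: Optional[CoreferenceLink] = None  # head of the chain

    def links(self) -> List[CoreferenceLink]:
        """Collect all links by following next pointers from the head."""
        collected = []
        link = self.first
        while link is not None:
            collected.append(link)
            link = link.next
        return collected
```

Flattening the chain in this way gives a list of mentions, which is the shape of the LAPPS Coreference.properties.mentions property to which CoreferenceChain.links() is aligned.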

Date

Supertype: NamedEntity

Table 53. Features of type Date
Feature Type Description

NamedEntity.value

Dependency

Supertype: Annotation

Table 54. Features of type Dependency
Feature Type Description

Dependency.dependencyType

String

The dependency category as per parser

Dependency.dependent

Token

Points to dependent

Dependency.governor

Token

Points to governor

DocumentMetaData

Supertype: DocumentAnnotation / Annotation

Table 55. Features of type DocumentMetaData
Feature Type Description

DocumentAnnotation.language

String

ISO 639 two-letter code. DKPro Core uses these codes to look up models automatically. Other language codes can be stored, but model lookup will then fail and the language has to be overridden for each processing component.

DocumentMetaData.collectionId

String

The ID of the collection/corpus.

DocumentMetaData.documentBaseUri

String

The base URI for the corpus/collection.

DocumentMetaData.documentId

String

The ID of the document within the collection

DocumentMetaData.documentTitle

String

The title of the document.

DocumentMetaData.documentUri

String

The URI from which the document was obtained. Typically this is a location on the local file system or within a JAR/ZIP file.

DocumentMetaData.isLastSegment

boolean

If a document was split during processing, this flag indicates the last segment, e.g. when reconstructing the full document from the splits.

Location

Supertype: NamedEntity

Table 56. Features of type Location
Feature Type Description

NamedEntity.value

MorphologicalFeatures

Supertype: Annotation

Morphological features per Universal Dependencies Treebank definition

Table 57. Features of type MorphologicalFeatures
Feature Type Description

MorphologicalFeatures.animacy

String

Normalized category mapped from original tag

MorphologicalFeatures.aspect

String

Normalized category mapped from original tag

MorphologicalFeatures.case

String

Normalized category mapped from original tag

MorphologicalFeatures.definiteness

String

Normalized category mapped from original tag

MorphologicalFeatures.degree

String

Normalized category mapped from original tag

MorphologicalFeatures.gender

String

Normalized category mapped from original tag

MorphologicalFeatures.mood

String

Normalized category mapped from original tag

MorphologicalFeatures.negative

String

Normalized category mapped from original tag

MorphologicalFeatures.numType

String

Normalized category mapped from original tag

MorphologicalFeatures.number

String

Normalized category mapped from original tag

MorphologicalFeatures.person

String

Normalized category mapped from original tag

MorphologicalFeatures.possessive

String

Normalized category mapped from original tag

MorphologicalFeatures.pronType

String

Normalized category mapped from original tag

MorphologicalFeatures.reflex

String

Normalized category mapped from original tag

MorphologicalFeatures.tense

String

Normalized category mapped from original tag

MorphologicalFeatures.value

String

Original tag as produced by morphological analyzer

MorphologicalFeatures.verbForm

String

Normalized category mapped from original tag

MorphologicalFeatures.voice

String

Normalized category mapped from original tag

NC

Supertype: Chunk

Table 58. Features of type NC
Feature Type Description

NamedEntity

Supertype: Annotation

Table 59. Features of type NamedEntity
Feature Type Description

NamedEntity.value

String

Entity category

Organization

Supertype: NamedEntity

Table 60. Features of type Organization
Feature Type Description

NamedEntity.value

Person

Supertype: NamedEntity

Table 61. Features of type Person
Feature Type Description

NamedEntity.value

ROOT

Supertype: __

Table 62. Features of type ROOT
Feature Type Description

TagSetDescription

Sentence

Supertype: Annotation

Table 63. Features of type Sentence
Feature Type Description

TOP

Supertype: <none>

UIMA Top type

Table 64. Features of type TOP
Feature Type Description

TagDescription

Supertype: TOP

Table 65. Features of type TagDescription
Feature Type Description

TagDescription.name

String

The tag (e.g. N, VP, Person, …​)

TagSetDescription

Supertype: Annotation

Describes a tag set

Table 66. Features of type TagSetDescription
Feature Type Description

TagSetDescription.layer

String

The annotation type for which this tag set applies (e.g. "Token" FQN)

TagSetDescription.name

String

The name of the tagset

TagSetDescription.tags

TagDescription[]

The tag descriptions

Token

Supertype: Annotation

Table 67. Features of type Token
Feature Type Description

TagSetDescription

Token.getCoveredText().length()

Integer

Method that returns the length of the covered text (essentially end - begin)

Token.lemma.value

String

Lemma produced by lemmatizer.

Token.morphologicalFeatures

MorphologicalFeatures

Morphological features as per the Universal Dependencies Treebank definition.

Token.parent

Annotation

By convention, the parent is a Constituent. For purely technical reasons, it is an Annotation.

Token.pos.posValue

String

Original POS tag produced by tagger.

Token.stem.value

String

Stem produced by a stemmer.

VC

Supertype: Chunk

Table 68. Features of type VC
Feature Type Description

MorphologicalFeatures.negative

String

MorphologicalFeatures.tense

String

MorphologicalFeatures.verbForm

String

This information is likely to be encoded in the POS or morphological tags.

MorphologicalFeatures.voice

String

GATE

(GATE only supports textual documents, so all documents are implicitly text)

Supertype: __

Table 69. Features of type (GATE only supports textual documents, so all documents are implicitly text)
Feature Type Description

Annotation

Supertype: Annotation

Table 70. Features of type Annotation
Feature Type Description

Annotation.endOffset

Long

The ending offset (0-based) in the primary data.

Annotation.id

Integer

Unique identifier for the annotation

Annotation.startOffset

Long

The starting offset (0-based) in the primary data.

Date

Supertype: Annotation

Table 71. Features of type Date
Feature Type Description

Date.features.kind

String

The DateNormalizer plugin adds extra features, such as a normalized value (YYYYMMDD as a number)
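One way to read "YYYYMMDD as a number" is as a single integer built from the year, month and day. The sketch below computes such a value; it is an interpretation of the description above, not the plugin's code, and the function name is hypothetical.

```python
from datetime import date


def normalized_value(d: date) -> int:
    """Encode a date as the integer YYYYMMDD."""
    return d.year * 10000 + d.month * 100 + d.day
```

Encoded this way, dates compare in chronological order when compared as numbers.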

Dependency

Supertype: Annotation

GATE’s wrapper for the Stanford Parser represents dependencies in two ways: as annotations that link to the dependent and governing Tokens by ID, and as a complex feature value on the Token itself that gives its outgoing dependency arcs.

Table 72. Features of type Dependency
Feature Type Description

Dependency.features.args[0]

List of Integer

"args" is a list with the governor token ID first and the dependent token ID second

Dependency.features.args[1]

Dependency.features.kind

String

Dependency type
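Combining the "args" convention with the Dependency feature alignment, a GATE Dependency annotation can be converted to the LAPPS properties governor, dependent and label. The sketch below assumes the feature map is available as a dictionary and keeps the GATE token IDs as they are; the function name and the sample values are hypothetical.

```python
def gate_dependency_to_lapps(features):
    """Map GATE Dependency features to LAPPS Dependency properties."""
    # args holds exactly two token IDs: governor first, dependent second.
    governor_id, dependent_id = features["args"]
    return {
        "governor": governor_id,
        "dependent": dependent_id,
        "label": features["kind"],
    }
```

Unpacking into two names makes the conversion fail with an error if "args" does not contain exactly two IDs, which is preferable to silently swapping or dropping an argument.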

Document

Supertype: Document

Table 73. Features of type Document
Feature Type Description

Document.features.language

String

Document.mimeType

String

MIME type used to determine the format parser GATE will use to extract the text from the document

Document.name

String

Every GATE resource (including documents) can have a name, but it is not necessarily a unique ID

Document.sourceUrl or Document.stringContent

URL or String

The source of the document, either a URL or a plain text string

Location

Supertype: Annotation

Table 74. Features of type Location
Feature Type Description

Location.features.locType

String

Mention

Supertype: Annotation

Mentions of concepts in a knowledge base for semantically linked annotation tasks

Table 75. Features of type Mention
Feature Type Description

Mention.features.class

URI

URI of the (main) ontology class to which the instance belongs

Mention.features.inst

URI

URI of the instance to which the mention links

NounChunk

Supertype: Annotation

As produced by GATE default NP chunker

Table 76. Features of type NounChunk
Feature Type Description

Organization

Supertype: Annotation

Table 77. Features of type Organization
Feature Type Description

Organization.features.orgType

String

Person

Supertype: Annotation

Table 78. Features of type Person
Feature Type Description

Person.features.gender

String

Sentence

Supertype: Annotation

Table 79. Features of type Sentence
Feature Type Description

SyntaxTreeNode

Supertype: Annotation

GATE models syntax trees as nodes, representing constituents by containment

Table 80. Features of type SyntaxTreeNode
Feature Type Description

SyntaxTreeNode.features.consists

Token

Supertype: Annotation

Table 81. Features of type Token
Feature Type Description

Token.features.category

String

Token.features.kind

String

Token.features.length

Long

Token.features.orth

String

GATE’s default tokeniser uses lower camel-case names "upperInitial", "allCaps", "lowercase", "mixedCaps"

Token.features.root

String

Morphological root. The affix is stored in a separate "affix" feature.

Unsupported

Supertype: __

Table 82. Features of type Unsupported
Feature Type Description

VG

Supertype: Annotation

As produced by the verb group chunker

Table 83. Features of type VG
Feature Type Description

VG.features.neg

String

"yes" for negated verbs, feature absent on non-negated ones

VG.features.tense

String

VG.features.type

String

VG.features.voice

String

"active" or "passive"

ILSP

Chunk

Supertype: gr.ilsp.types.Annotation

Table 84. Features of type Chunk
Feature Type Description

Chunk.label

String

Several values specific to the ILSP chunker

Clause

gr.ilsp.types.Annotation

Clause.label

String

Several values specific to the ILSP clause segmenter

ConceptMention

Supertype: gr.ilsp.types.Annotation

Table 85. Features of type ConceptMention
Feature Type Description

EntityMention

ConceptMention

specificType

String

DependencyRelation

Supertype: gr.ilsp.types.Annotation

Table 86. Features of type DependencyRelation
Feature Type Description

head

Token

label

String

DocumentAnnotation

Supertype: gr.ilsp.types.Annotation

Table 87. Features of type DocumentAnnotation
Feature Type Description

Author

a gr.ilsp.types.Annotation (with features for surname etc.)

Used for authors of the document

Date

a gr.ilsp.types.Annotation (with features for year, month, day)

The date of the document

DocumentAnnotation.docId

String

Can be set by the process creating the header

DocumentAnnotation.language

String

Expected to be a two- or three-letter code.

DocumentAnnotation.title

String

The title of the document.

DocumentAnnotation.uri

String

The URI from which this document was created

DocumentAnnotation.offsetInSource

Integer

Byte offset of the start of the document content within the original source file or other input source. Only used if the CAS document was retrieved from a source where one physical source file contained several conceptual documents. Zero otherwise.

LOC

Supertype: EntityMention

Table 88. Features of type LOC
Feature Type Description

ORG

Supertype: EntityMention

Table 89. Features of type ORG
Feature Type Description

PER

Supertype: EntityMention

Table 90. Features of type PER
Feature Type Description

Sentence

Supertype: gr.ilsp.types.Annotation

Table 91. Features of type Sentence
Feature Type Description

Paragraph

gr.ilsp.types.Annotation

Identified automatically based on document markup

Sentence.orthogr.value

String

Values are uppercase (THIS IS A SENTENCE), lowercase (this is a sentence), and capitalized (This Is A Sentence)
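The three orthography categories described for Sentence.orthogr.value can be approximated with Python's built-in string predicates. This is only a sketch: the returned labels are taken from the wording of the description and may not match the exact strings the ILSP tools store, and the function name is hypothetical.

```python
from typing import Optional


def sentence_orthography(text: str) -> Optional[str]:
    """Classify a sentence as uppercase, lowercase or capitalized, else None."""
    if text.isupper():
        return "uppercase"
    if text.islower():
        return "lowercase"
    if text.istitle():  # every word starts with an uppercase letter
        return "capitalized"
    return None
```

An ordinary sentence such as "This is a sentence" falls into none of the three categories, so the sketch returns None for it.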

Timex2Mention

Supertype: ConceptMention

Table 92. Features of type Timex2Mention
Feature Type Description

Token

Supertype: gr.ilsp.types.Annotation

Table 93. Features of type Token
Feature Type Description

Token.POSTag.value

String

Token.depRel

gr.ilsp.types.DependencyRelation

Dependency relation. See the DependencyRelation type.

Token.feats

gr.ilsp.types.GrammaticalFeats

Morphological features of a word category

Token.getCoveredText().length()

Integer

Method that returns the length of the covered text (essentially end - begin)

Token.lemma.value

String

Token.orthogr.value

String

Token.parent

Annotation

By convention, the parent is a Constituent. For purely technical reasons, it is an Annotation.

Token.sentOrd

Integer

Order of token in the sentence

Token.stemmedForm.value

String

Stem produced by a stemmer.

Token.tokenType

String

gr.ilsp.types.Annotation

Supertype: uima.tcas.Annotation

ILSP Annotation Type

Table 94. Features of type gr.ilsp.types.Annotation
Feature Type Description

Annotation.begin

Integer

Annotation.componentId

String

Annotation.confidence

long

Annotation.end

Integer

Annotation.id

String

projective

Supertype: boolean

Table 95. Features of type projective
Feature Type Description

uima.tcas.Annotation

Supertype: TOP

UIMA Annotation Type

Table 96. Features of type uima.tcas.Annotation
Feature Type Description

LAPPS

Annotation

Supertype: Thing

Linguistic information added to a word, phrase, clause, sentence, text, or a relation among them

Table 97. Features of type Annotation
Feature Type Description

Annotation.metadata.producer

URI[]

The software that produced the annotations.

Annotation.metadata.rules

URI[]

The documentation (if any) for the rules that were used to identify the annotations.

Annotation.properties.id

String

A unique identifier associated with the annotation.

AudioDocument

Supertype: Document

Table 98. Features of type AudioDocument
Feature Type Description

Constituent

Supertype: Span

Table 99. Features of type Constituent
Feature Type Description

Constituent.properties.children

ID[]

Constituent.properties.label

String or URI

Constituent.properties.parent

ID

Coreference

Supertype: Annotation

Used to mark references to other mentions of the same entity or instance.

Table 100. Features of type Coreference
Feature Type Description

Coreference.properties.mentions

ID[]

A list of identifiers. Each identifier points to an object of type Annotation, or a subtype thereof.

Coreference.properties.representative

ID

An identifier that points to the representative item in the coreference chain.

Date

Supertype: NamedEntity

Table 101. Features of type Date
Feature Type Description

Date.properties.dateType

String or URI

Sub-type information such as date, datetime, time, etc. Ideally a URI referencing a pre-defined descriptor.

Dependency

Supertype: Annotation

Table 102. Features of type Dependency
Feature Type Description

Dependency.properties.dependent

ID

Dependency.properties.governor

ID

Dependency.properties.label

String or URI

DependencyStructure

Supertype: Annotation

Table 103. Features of type DependencyStructure
Feature Type Description

DependencyStructure.metadata.dependencySet

String or URI

DependencyStructure.properties.dependencies

ID[] (Set)

DependencyStructure.properties.dependencyType

String or URI

Document

Supertype: Thing

Table 104. Features of type Document
Feature Type Description

Document.properties.encoding

String or URI

The physical or digital manifestation of the resource. Encoding may be used to determine the software, hardware or other equipment needed to display or operate the resource. Recommended best practice is to select a value from the list of Internet Media Types [http://www.iana.org/assignments/media-types/] defining computer media formats.

See also http://dublincore.org/documents/2001/04/12/usageguide/sectd.shtml#format

Document.properties.id

String

A unique identifier associated with the document.

Document.properties.language

String or URI

A language of the intellectual content of the resource. Recommended best practice for the values of the Language element is defined by RFC 3066 [RFC 3066, http://www.ietf.org/rfc/rfc3066.txt] which, in conjunction with ISO 639 [ISO 639, http://www.oasis-open.org/cover/iso639a.html], defines two- and three-letter primary language tags with optional sub-tags. Examples include "en" or "eng" for English, "akk" for Akkadian, and "en-GB" for English used in the United Kingdom.

See also http://dublincore.org/documents/2001/04/12/usageguide/sectd.shtml#language

Document.properties.source

String or URI

The source of the document.

Document.properties.sourceType

String or URI

Source types include creator, distributor, contributor, publisher, etc.
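Document.properties.language admits tags such as "en", "eng" and "en-GB", whereas the DKPro Core language feature is described as an ISO 639 two-letter code. The sketch below extracts the primary subtag of a tag and checks whether it has two letters. Whether the primary subtag is an acceptable substitute for the full tag in a given pipeline is an assumption, and both function names are hypothetical.

```python
def primary_subtag(tag: str) -> str:
    """Return the primary language subtag, e.g. the part before the first hyphen."""
    return tag.split("-", 1)[0].lower()


def has_two_letter_primary_subtag(tag: str) -> bool:
    """True if the primary subtag looks like a two-letter language code."""
    primary = primary_subtag(tag)
    return len(primary) == 2 and primary.isalpha()
```

Three-letter tags such as "eng" or "akk" fail the check; mapping them to two-letter codes would need a lookup table, which this sketch does not provide.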

Location

Supertype: NamedEntity

Table 105. Features of type Location
Feature Type Description

Location.properties.locType

String or URI

Location type: country, city, GPE, sea, lake, etc. Ideally a URI referencing a pre-defined descriptor.

Markable

Supertype: Span

Annotation type used if the coreferenced item is not already wrapped in a suitable annotation type that can be referenced.

Table 106. Features of type Markable
Feature Type Description

Coreference.properties.mentions

NamedEntity

Supertype: Span

Table 107. Features of type NamedEntity
Feature Type Description

NounChunk

Supertype: Span

The initial portion of a non-recursive noun phrase up to the head, including determiners but not including postmodifying prepositional phrases or clauses.

Table 108. Features of type NounChunk
Feature Type Description

Organization

Supertype: NamedEntity

Table 109. Features of type Organization
Feature Type Description

Organization.properties.orgType

String or URI

Sub-type information (e.g., government, educational, religious, political, museum, hotel, medical, etc.). Ideally a URI referencing a pre-defined descriptor.

Person

Supertype: NamedEntity

Table 110. Features of type Person
Feature Type Description

Person.properties.gender

String or URI

A value such as male, female, unknown. Ideally a URI referencing a pre-defined descriptor.

PhraseStructure

Supertype: Annotation

A container for phrase structure information.

Table 111. Features of type PhraseStructure
Feature Type Description

PhraseStructure.metadata.categorySet

String or URI

The URI for the category set.

PhraseStructure.properties.constituents

ID[] (Set)

The set of IDs for the top-level Constituents in the parse tree. (cf. https://github.com/lapps/vocabulary-pages/issues/15)
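As a rough sketch of how the constituents ID set might be used, the snippet below walks down from the top-level IDs to the leaves. Only the ID set on PhraseStructure and the categorySet metadata come from the table. The "label" and "children" keys on the constituent records, the token IDs, and the example.org URI are assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical constituent records keyed by ID. "label" and "children" are
# assumed keys; they are not defined in the table above.
constituents: Dict[str, dict] = {
    "c0": {"label": "S", "children": ["c1", "c2"]},
    "c1": {"label": "NP", "children": ["tok0"]},
    "c2": {"label": "VP", "children": ["tok1"]},
}

# PhraseStructure container pointing at the top-level constituent(s).
phrase_structure = {
    "metadata": {"categorySet": "http://example.org/categories/ptb"},
    "properties": {"constituents": ["c0"]},
}


def leaves(node_id: str) -> List[str]:
    """Depth-first list of IDs below node_id that are not constituents."""
    node = constituents.get(node_id)
    if node is None:  # not a constituent, so treat it as a token ID
        return [node_id]
    result: List[str] = []
    for child in node["children"]:
        result.extend(leaves(child))
    return result


# Collect the leaf IDs under every top-level constituent, in tree order.
token_ids = [
    leaf
    for root in phrase_structure["properties"]["constituents"]
    for leaf in leaves(root)
]
```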

Sentence

Supertype: Span

A sequence of words capable of standing alone to make an assertion, ask a question, or give a command, usually consisting of a subject and a predicate containing a finite verb.

Table 112. Features of type Sentence
Feature Type Description

Sentence.properties.sentenceType

String or URI

Values such as declarative, interrogative, exclamatory, question, fragment. Ideally a URI referencing a pre-defined descriptor.

Span

Supertype: Annotation

An annotation over a span of text. A Span may be defined by start and end offsets or by linking to one or more Token annotations with the targets property.

Table 113. Features of type Span
Feature Type Description

Span.properties.end

Integer

The ending offset (0-based) in the primary data.

Span.properties.start

Integer

The starting offset (0-based) in the primary data.

Span.properties.targets

ID[]

IDs of a sequence of annotations covering the span of text referred to by this annotation. Used as an alternative to start and end to point to component annotations (e.g., a token sequence) rather than directly into primary data, or to link two or more annotations (e.g., in a coreference annotation).
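The two ways of anchoring a Span described above (direct offsets or targets) can be sketched as follows. The Span class and the resolve_offsets helper are hypothetical stand-ins, not part of any LAPPS library. The sketch also assumes that end is an exclusive offset, which the table does not state.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Span:
    """Hypothetical in-memory stand-in for a Span annotation."""
    id: str
    start: Optional[int] = None  # Span.properties.start (0-based)
    end: Optional[int] = None    # Span.properties.end (assumed exclusive)
    targets: List[str] = field(default_factory=list)  # Span.properties.targets


def resolve_offsets(span: Span, index: Dict[str, Span]) -> Tuple[int, int]:
    """Return (start, end), following targets when offsets are absent."""
    if span.start is not None and span.end is not None:
        return span.start, span.end
    if not span.targets:
        raise ValueError(f"span {span.id} has neither offsets nor targets")
    resolved = [resolve_offsets(index[t], index) for t in span.targets]
    # The covering range runs from the earliest start to the latest end.
    return min(s for s, _ in resolved), max(e for _, e in resolved)


text = "Alice met Bob."
tokens = [Span("tok0", 0, 5), Span("tok1", 6, 9), Span("tok2", 10, 13)]
index = {t.id: t for t in tokens}

# A sentence-like span that points at tokens instead of primary data.
sentence = Span("s0", targets=["tok0", "tok1", "tok2"])
index[sentence.id] = sentence

start, end = resolve_offsets(sentence, index)
covered = text[start:end]
```

Resolving through targets keeps higher-level annotations valid if token offsets are later corrected, at the cost of an extra lookup.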

TextDocument

Supertype: Document

Table 114. Features of type TextDocument
Feature Type Description

Thing

Supertype: <none>

The most generic specification.

Table 115. Features of type Thing
Feature Type Description

Thing.properties.alternateName

String

An alias for the item.

Token

Supertype: Span

Table 116. Features of type Token
Feature Type Description

Constituent.parent

Token.metadata.posTagSet

String or URI

Token.properties.lemma

String or URI

Token.properties.length

Integer

Token.properties.orth

String or URI

Orthographic properties of the token such as LowerCase, UpperCase, UpperInitial, etc. Ideally a URI referencing a pre-defined descriptor.

Token.properties.pos

String or URI

Token.properties.tokenType

String or URI

Sub-type such as word, punctuation, abbreviation, number, symbol, etc. Ideally a URI referencing a pre-defined descriptor.

Token.metadata.posTagSet and friends

Supertype: <none>

Table 117. Features of type Token.metadata.posTagSet and friends
Feature Type Description

VerbChunk

Supertype: Span

Non-recursive verb groups, which include modals, auxiliary verbs, and medial adverbs, and end at the head verb or predicate adjective.

Table 118. Features of type VerbChunk
Feature Type Description

VerbChunk.neg

String or URI

Indicates whether or not the verb is negated. Values include YES, NO.

VerbChunk.tense

String or URI

Provides tense information for the verb. Example values include BeVBG, BeVBN, FutCon, HaveVBN, Pas, PasCon, PasPer, PasPerCon, Per, Pre, PreCon, PrePer, PrePerCon, SimFut, SimPas, SimPre, none.

VerbChunk.vcType

String or URI

Values such as finite, non-finite, participle, modal, special (e.g., 'is going to investigate').

VerbChunk.voice

String or URI

Indicates whether the verb group is active or passive. Possible values include ACTIVE, PASSIVE, or NONE.
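The VerbChunk features can be checked against the example values listed above with a short sketch. The descriptions say that values "include" these items, so the sets are treated as non-exhaustive and the helper only flags unlisted values rather than rejecting them. The helper name and the sample chunk are hypothetical.

```python
from typing import Dict, List, Set

# Example values copied from the feature descriptions. vcType is left out
# because its description gives open-ended examples only.
LISTED_VALUES: Dict[str, Set[str]] = {
    "neg": {"YES", "NO"},
    "voice": {"ACTIVE", "PASSIVE", "NONE"},
    "tense": {
        "BeVBG", "BeVBN", "FutCon", "HaveVBN", "Pas", "PasCon", "PasPer",
        "PasPerCon", "Per", "Pre", "PreCon", "PrePer", "PrePerCon",
        "SimFut", "SimPas", "SimPre", "none",
    },
}


def unlisted_features(chunk: Dict[str, str]) -> List[str]:
    """Return names of features whose value is not among the listed examples."""
    return sorted(
        name
        for name, value in chunk.items()
        if name in LISTED_VALUES and value not in LISTED_VALUES[name]
    )


# Hypothetical VerbChunk features; the values are illustrative only.
chunk = {"neg": "YES", "tense": "PrePer", "vcType": "finite", "voice": "PASSIVE"}

flagged = unlisted_features(chunk)
# Values are compared case-sensitively, so a lowercase voice is flagged.
flagged_lower = unlisted_features({**chunk, "voice": "passive"})
```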