This document explains how to embed OpenMinTeD-SHARE descriptors in software components and resource packages, and how to use the related tooling.
Introduction
Package structure
We expect that the resource you distribute is packaged as a ZIP or JAR file. Within this file, you can place your OpenMinTeD-SHARE descriptors at any location you like. If you package descriptors for TDM software components, you may wish to place them directly next to the classes that implement these components. If you package multiple resources, e.g. annotation schema descriptions, in a single ZIP file, you might wish to place the respective OpenMinTeD-SHARE descriptors directly next to them.
However, in order to allow for the automatic detection of your OpenMinTeD-SHARE descriptors, you will have to set up a file in a well-known location which points to your descriptors. This is described in the next section.
Discovery
Making descriptors discoverable
In order to make your descriptors discoverable, you have to create a folder called META-INF/eu.openminted.share in your ZIP or JAR file. Into this folder, you have to place a file called descriptors.txt with pointers to the actual descriptor files.
Example descriptors.txt file:
../../descriptors/Component1.xml
../../descriptors/Component2.xml
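The pointers in the example above look like paths relative to the location of descriptors.txt itself. The following minimal sketch, assuming that interpretation, shows how such pointers could be turned into entry names relative to the archive root. The class name DescriptorPointers and its resolve method are hypothetical and not part of the OpenMinTeD-SHARE tooling.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class DescriptorPointers {
    // Well-known location of the index file inside the ZIP or JAR.
    static final String INDEX = "META-INF/eu.openminted.share/descriptors.txt";

    /**
     * Resolves every non-blank line of a descriptors.txt file against the
     * location of the index file (assumption: pointers are relative to it).
     */
    static List<String> resolve(String descriptorsTxt) {
        URI base = URI.create("/" + INDEX);
        List<String> entries = new ArrayList<>();
        for (String line : descriptorsTxt.split("\\R")) {
            String pointer = line.trim();
            if (pointer.isEmpty()) {
                continue;
            }
            // URI.resolve removes the "../" segments; dropping the leading
            // slash yields a name relative to the archive root.
            entries.add(base.resolve(pointer).getPath().substring(1));
        }
        return entries;
    }

    public static void main(String[] args) {
        String txt = "../../descriptors/Component1.xml\n"
                + "../../descriptors/Component2.xml\n";
        // Prints descriptors/Component1.xml and descriptors/Component2.xml
        resolve(txt).forEach(System.out::println);
    }
}
```

Under this assumption, the two pointers refer to files in a descriptors folder at the top level of the archive.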
Discovering descriptors
We presently provide a convenience API for Java to discover descriptors. For the moment, we also assume that any descriptors you wish to discover exist on your Java classpath, e.g. because they are packaged as Maven artifacts.
Note: In the near future, we will also provide convenience methods to discover descriptors in explicitly named JAR and ZIP files that do not need to be on the Java classpath.
To locate the descriptors, you can use the DescriptorFactory class.
Using the DescriptorFactory class:
URL[] descriptorPaths = DescriptorFactory.scanDescriptors();
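To illustrate what classpath-based discovery involves, here is a minimal sketch that uses only the Java standard library. It is not the implementation of DescriptorFactory; the class name ClasspathDescriptorScan and the method scan are hypothetical, and the sketch assumes that pointers are relative to the location of descriptors.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

public class ClasspathDescriptorScan {
    static final String INDEX = "META-INF/eu.openminted.share/descriptors.txt";

    /** Collects the URLs of all descriptors referenced by index files visible to the loader. */
    static List<URL> scan(ClassLoader loader) throws IOException {
        List<URL> descriptors = new ArrayList<>();
        // Every JAR or folder on the classpath may contribute its own index file.
        Enumeration<URL> indexes = loader.getResources(INDEX);
        while (indexes.hasMoreElements()) {
            URL index = indexes.nextElement();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(index.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    String pointer = line.trim();
                    if (!pointer.isEmpty()) {
                        // Resolve the pointer relative to the index file. This URL
                        // constructor is deprecated since Java 20 but still available.
                        descriptors.add(new URL(index, pointer));
                    }
                }
            }
        }
        return descriptors;
    }

    public static void main(String[] args) throws IOException {
        for (URL url : scan(Thread.currentThread().getContextClassLoader())) {
            System.out.println(url);
        }
    }
}
```

When no index file is on the classpath, the sketch simply prints nothing.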
Annotating components
Java classes can be directly annotated with OpenMinTeD-SHARE metadata so you do not have to manually maintain a separate XML file. The OpenMinTeD-SHARE Maven Plugin can then be used to generate the OpenMinTeD-SHARE descriptor automatically as part of a build.
import eu.openminted.share.annotations.api.Component;
import eu.openminted.share.annotations.api.constants.OperationType;
@Component(classes=OperationType.READER)
class TextCorpusReader
Documentation
In addition to the descriptions obtained from the native framework descriptors, it is possible to
reference external documentation or publications. URLs referring to such external resources may
contain placeholders such as ${version} or ${command}. In addition, properties defined in the
configuration section of the OMTD-SHARE Maven plugin in the Maven POM are interpolated. This is
useful e.g. to centrally configure a documentation base URL.
Property | Description |
---|---|
version | Component version |
command | Command registered in the first distribution info section of the OMTD-SHARE descriptor |
shortClassName | If the command contains dots, this property addresses the substring starting after the last dot |
import eu.openminted.share.annotations.api.Component;
import eu.openminted.share.annotations.api.DocumentationResource;
import eu.openminted.share.annotations.api.constants.OperationType;
@Component(classes=OperationType.READER)
@DocumentationResource("${docbase}/${version}/${command}.html")
class TextCorpusReader
<plugin>
  <groupId>eu.openminted.share.annotations</groupId>
  <artifactId>omtd-share-annotations-maven-plugin</artifactId>
  <configuration>
    <properties>
      <docbase>http://mywebsite.com/docs</docbase>
    </properties>
  </configuration>
</plugin>
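The effect of placeholder interpolation can be sketched in a few lines of plain Java. This is an illustration only, not the plugin's code: the class DocUrlInterpolation, the component version 1.0.0 and the command org.example.TextCorpusReader are hypothetical, and leaving unknown placeholders untouched is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DocUrlInterpolation {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Substring after the last dot, or the whole command if it contains no dot. */
    static String shortClassName(String command) {
        return command.substring(command.lastIndexOf('.') + 1);
    }

    /** Replaces ${name} placeholders; unknown names are left untouched (assumption). */
    static String interpolate(String template, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String command = "org.example.TextCorpusReader"; // hypothetical command
        Map<String, String> props = new LinkedHashMap<>();
        props.put("docbase", "http://mywebsite.com/docs"); // from the plugin configuration
        props.put("version", "1.0.0");                     // hypothetical component version
        props.put("command", command);
        props.put("shortClassName", shortClassName(command));

        // Prints http://mywebsite.com/docs/1.0.0/org.example.TextCorpusReader.html
        System.out.println(interpolate("${docbase}/${version}/${command}.html", props));
        // Prints http://mywebsite.com/docs/TextCorpusReader.html
        System.out.println(interpolate("${docbase}/${shortClassName}.html", props));
    }
}
```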
Parameters
The parameters of a component are picked up from the native component annotations, e.g. @ConfigurationParameter (UIMA/uimaFIT) or @Parameter (GATE).
Hiding parameters
Some parameters offered by the components are not suitable for OpenMinTeD, e.g. because they require the specification of a file system path which is not reasonably possible on the OMTD platform. Thus, it is sensible to hide such parameters from users of the OMTD platform.
Note: Hidden parameters must either be optional or provide a default value.
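import eu.openminted.share.annotations.api.Component;
import eu.openminted.share.annotations.api.Parameters;
import eu.openminted.share.annotations.api.constants.OperationType;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.descriptor.ConfigurationParameter;
import org.apache.uima.jcas.JCas;
@Component(classes=OperationType.READER)
@Parameters(exclude = TextCorpusReader.PARAM_HIDDEN)
class TextCorpusReader
    extends JCasAnnotator_ImplBase
{
    /**
     * Hidden parameter.
     */
    public static final String PARAM_HIDDEN = "hidden";
    // Mandatory, but it has a default value, so it may be hidden.
    @ConfigurationParameter(name = PARAM_HIDDEN, mandatory = true, defaultValue = "val")
    private String hidden;
    @Override
    public void process(JCas aJCas) throws AnalysisEngineProcessException {
        // Processing logic omitted.
    }
}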
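The rule for hidden parameters can be expressed as a simple check. The following sketch is only an illustration of that rule; the class HiddenParameterCheck and its nested Param type are hypothetical and do not exist in the OpenMinTeD-SHARE API.

```java
public class HiddenParameterCheck {
    /** Hypothetical, simplified view of a component parameter. */
    static final class Param {
        final String name;
        final boolean mandatory;
        final String defaultValue; // null means "no default value"

        Param(String name, boolean mandatory, String defaultValue) {
            this.name = name;
            this.mandatory = mandatory;
            this.defaultValue = defaultValue;
        }
    }

    /** A parameter may be hidden if it is optional or if it has a default value. */
    static boolean canBeHidden(Param p) {
        return !p.mandatory || p.defaultValue != null;
    }

    public static void main(String[] args) {
        // Mandatory with a default value, as in the example above: prints true
        System.out.println(canBeHidden(new Param("hidden", true, "val")));
        // Mandatory without a default value: prints false
        System.out.println(canBeHidden(new Param("sourceLocation", true, null)));
        // Optional without a default value: prints true
        System.out.println(canBeHidden(new Param("language", false, null)));
    }
}
```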
Using the Maven Plugin
A set of properly annotated class files within a Maven project can be automatically processed as part of the build to
produce the relevant descriptor files using the Maven plugin we provide. To use this plugin, simply add the following to
your existing pom.xml.
<dependencies>
  <dependency>
    <groupId>eu.openminted.share.annotations</groupId>
    <artifactId>omtd-share-annotations-api</artifactId>
    <version>3.0.2.7</version>
  </dependency>
</dependencies>
<build>
  <plugins>
    <plugin>
      <groupId>eu.openminted.share.annotations</groupId>
      <artifactId>omtd-share-annotations-maven-plugin</artifactId>
      <version>3.0.2.7</version>
      <executions>
        <execution>
          <phase>process-classes</phase>
          <goals>
            <goal>generate</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
Note that if you already have a repositories, pluginRepositories or build section within your pom.xml, you will
only need to include the relevant repository or plugin element.
UIMA type mappings
UIMA type capabilities can be automatically converted to OMTD-SHARE annotation type information. This requires adding an additional configuration to the OMTD-SHARE Maven Plugin:
<plugin>
  <groupId>eu.openminted.share.annotations</groupId>
  <artifactId>omtd-share-annotations-maven-plugin</artifactId>
  <version>3.0.2.7</version>
  <executions>
    ...
  </executions>
  <configuration>
    <uimaTypeMappings>
      <uimaTypeMapping>META-INF/eu.openminted.share/uimaTypeMapping.map</uimaTypeMapping>
    </uimaTypeMappings>
  </configuration>
</plugin>
The plugin looks for the mappings in the source paths of the current module as well as its dependencies. The idea is that the mapping files are maintained in the same place as the UIMA type systems they describe. For example, the DKPro Core Named Entity API module provides a named entity type and also includes a UIMA-to-OMTD type mapping file which can be used by the Maven plugin.
The mapping file is a simple Java properties file assigning a UIMA type name to an OMTD-SHARE annotation type:
de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token=http://w3id.org/meta-share/omtd-share/Token
de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence=http://w3id.org/meta-share/omtd-share/Sentence
de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS=http://w3id.org/meta-share/omtd-share/PartOfSpeech
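Because the mapping file uses the Java properties format, it can be read with java.util.Properties. The following minimal sketch loads the three entries shown above from a string and looks up a type. The class TypeMappingLookup is hypothetical, and the sketch makes no claim about how the Maven plugin itself reads the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class TypeMappingLookup {
    /** Parses mapping file content in Java properties syntax. */
    static Properties load(String content) throws IOException {
        Properties mapping = new Properties();
        mapping.load(new StringReader(content));
        return mapping;
    }

    public static void main(String[] args) throws IOException {
        String content =
              "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token=http://w3id.org/meta-share/omtd-share/Token\n"
            + "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence=http://w3id.org/meta-share/omtd-share/Sentence\n"
            + "de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS=http://w3id.org/meta-share/omtd-share/PartOfSpeech\n";
        Properties mapping = load(content);

        // The first "=" separates key and value, so the ":" in the URL is kept.
        // Prints http://w3id.org/meta-share/omtd-share/PartOfSpeech
        System.out.println(mapping.getProperty(
                "de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS"));

        // A type without a mapping yields null. The type name is hypothetical.
        System.out.println(mapping.getProperty("org.example.type.Unmapped"));
    }
}
```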
MIME type mappings
MIME types can be automatically converted to OMTD-SHARE data format information. Note, however, that OMTD-SHARE data formats are usually more specific than MIME types, so in many cases such a mapping is not very useful. Enabling the mapping requires adding an additional configuration to the OMTD-SHARE Maven Plugin:
<plugin>
  <groupId>eu.openminted.share.annotations</groupId>
  <artifactId>omtd-share-annotations-maven-plugin</artifactId>
  <version>3.0.2.7</version>
  <executions>
    ...
  </executions>
  <configuration>
    <uimaTypeMappings>
      <uimaTypeMapping>META-INF/eu.openminted.share/mimeTypeMapping.map</uimaTypeMapping>
    </uimaTypeMappings>
  </configuration>
</plugin>
The mapping lookup mechanism is the same as for the UIMA type mappings described above.
The mapping file is a simple Java properties file assigning a MIME type name to an OMTD-SHARE data format:
text/tab-separated-values=http://w3id.org/meta-share/omtd-share/TabularFormat
Metadata mappings
The OMTD-SHARE Maven plugin looks for metadata in the following order:
- UIMA
- GATE
- Maven
All metadata found in the process is usually aggregated, not overwritten. In some cases, that might lead to duplicate items.
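The difference between aggregating and overwriting can be illustrated with a small sketch. It is not taken from the plugin: the class MetadataAggregation and the vendor value are hypothetical, and the deduplication step at the end is only one possible way a consumer of the descriptor might deal with duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

public class MetadataAggregation {
    /** Appends the values of every source in the given order; nothing is overwritten. */
    @SafeVarargs
    static List<String> aggregate(List<String>... sources) {
        List<String> all = new ArrayList<>();
        for (List<String> source : sources) {
            all.addAll(source);
        }
        return all;
    }

    public static void main(String[] args) {
        // Hypothetical values found for the same metadata item in each source.
        List<String> uima = Arrays.asList("Example Vendor");
        List<String> gate = Collections.emptyList();
        List<String> maven = Arrays.asList("Example Vendor");

        // Sources are visited in the order UIMA, GATE, Maven.
        List<String> merged = aggregate(uima, gate, maven);
        // Prints [Example Vendor, Example Vendor]
        System.out.println(merged);

        // Removing duplicates while keeping the original order.
        // Prints [Example Vendor]
        System.out.println(new LinkedHashSet<>(merged));
    }
}
```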
Maven
Maven | OMTD-SHARE |
---|---|
/project/version | /componentMetadataRecord/componentInfo/versionInfo/version |
/project/version | /componentMetadataRecord/componentInfo/identificationInfo/resourceIdentifiers/resourceIdentifier |
/project/groupId | /componentMetadataRecord/componentInfo/identificationInfo/resourceIdentifiers/resourceIdentifier |
/project/artifactId | /componentMetadataRecord/componentInfo/identificationInfo/resourceIdentifiers/resourceIdentifier |
/project/version | /componentMetadataRecord/componentInfo/distributionInfos/componentDistributionInfo/distributionLocation |
/project/groupId | /componentMetadataRecord/componentInfo/distributionInfos/componentDistributionInfo/distributionLocation |
/project/artifactId | /componentMetadataRecord/componentInfo/distributionInfos/componentDistributionInfo/distributionLocation |
/project/url | /componentMetadataRecord/componentInfo/contactInfo/contactPoint |
/project/developers/name | /componentMetadataRecord/componentInfo/resourceCreationInfo/resourceCreators/resourceCreator/surname |
/project/developers/email | /componentMetadataRecord/componentInfo/resourceCreationInfo/resourceCreators/resourceCreator/communicationInfo/emails/email |
/project/developers/organization | /componentMetadataRecord/componentInfo/resourceCreationInfo/resourceCreators/resourceCreator/affiliation/organizationNames/organizationName |
/project/developers/roles/role | /componentMetadataRecord/componentInfo/resourceCreationInfo/resourceCreators/resourceCreator/affiliation/position |
/project/licenses/license | /componentMetadataRecord/componentInfo/rightsInfo/licenseInfos/licenseInfo |
/project/mailingLists/mailingList/name | /componentMetadataRecord/componentInfo/contactInfo/mailingLists/mailingListInfo/mailingListName |
/project/mailingLists/mailingList/archive | /componentMetadataRecord/componentInfo/contactInfo/mailingLists/mailingListInfo/archive |
/project/mailingLists/mailingList/post | /componentMetadataRecord/componentInfo/contactInfo/mailingLists/mailingListInfo/post |
/project/mailingLists/mailingList/subscribe | /componentMetadataRecord/componentInfo/contactInfo/mailingLists/mailingListInfo/subscribe |
/project/mailingLists/mailingList/unsubscribe | /componentMetadataRecord/componentInfo/contactInfo/mailingLists/mailingListInfo/unsubscribe |
UIMA
The UIMA mappings shown below are for analysis engines. They apply in a similar way to collection readers, except that the root element and the metadata element have different names.
UIMA | OMTD-SHARE |
---|---|
/analysisEngineDescription/analysisEngineMetaData/name | /componentMetadataRecord/componentInfo/identificationInfo/resourceNames/resourceName |
/analysisEngineDescription/annotatorImplementationName | /componentMetadataRecord/componentInfo/identificationInfo/resourceIdentifiers/resourceIdentifier |
/analysisEngineDescription/analysisEngineMetaData/vendor | /componentMetadataRecord/componentInfo/contactInfo/contactGroups/contactGroup/groupNames/groupName |
/analysisEngineDescription/analysisEngineMetaData/copyright | /componentMetadataRecord/componentInfo/rightsInfo/copyrightStatement |
/analysisEngineDescription/analysisEngineMetaData/configurationParameters/configurationParameter/name | /componentMetadataRecord/componentInfo/parameterInfos/parameterInfo/parameterName |
/analysisEngineDescription/analysisEngineMetaData/configurationParameters/configurationParameter/name | /componentMetadataRecord/componentInfo/parameterInfos/parameterInfo/parameterLabel |
/analysisEngineDescription/analysisEngineMetaData/configurationParameters/configurationParameter/description | /componentMetadataRecord/componentInfo/parameterInfos/parameterInfo/parameterDescription |
/analysisEngineDescription/analysisEngineMetaData/configurationParameters/configurationParameter/type | /componentMetadataRecord/componentInfo/parameterInfos/parameterInfo/parameterType |
/analysisEngineDescription/analysisEngineMetaData/configurationParameters/configurationParameter/multiValued | /componentMetadataRecord/componentInfo/parameterInfos/parameterInfo/multiValue |
/analysisEngineDescription/analysisEngineMetaData/configurationParameters/configurationParameter/mandatory | /componentMetadataRecord/componentInfo/parameterInfos/parameterInfo/optional |
/analysisEngineDescription/analysisEngineMetaData/configurationParameterSettings/nameValuePair/value | /componentMetadataRecord/componentInfo/parameterInfos/parameterInfo/defaultValue |
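The parameter rows of the table can be read as a field-by-field copy. The sketch below illustrates this with two hypothetical value classes, UimaParam and ParameterInfo, which stand in for the UIMA and OMTD-SHARE elements and are not part of either API. Two points are assumptions of this sketch rather than statements from the table: that optional is the logical inverse of mandatory, and that the type name is copied without translation.

```java
public class ParameterMapping {
    /** Hypothetical stand-in for a UIMA configurationParameter element. */
    static final class UimaParam {
        final String name;
        final String description;
        final String type;
        final boolean multiValued;
        final boolean mandatory;

        UimaParam(String name, String description, String type,
                boolean multiValued, boolean mandatory) {
            this.name = name;
            this.description = description;
            this.type = type;
            this.multiValued = multiValued;
            this.mandatory = mandatory;
        }
    }

    /** Hypothetical stand-in for an OMTD-SHARE parameterInfo element. */
    static final class ParameterInfo {
        String parameterName;
        String parameterLabel;
        String parameterDescription;
        String parameterType;
        boolean multiValue;
        boolean optional;
    }

    static ParameterInfo map(UimaParam p) {
        ParameterInfo info = new ParameterInfo();
        info.parameterName = p.name;   // the UIMA name feeds both name and label
        info.parameterLabel = p.name;
        info.parameterDescription = p.description;
        info.parameterType = p.type;   // assumption: copied without translation
        info.multiValue = p.multiValued;
        info.optional = !p.mandatory;  // assumption: optional is the inverse of mandatory
        return info;
    }

    public static void main(String[] args) {
        ParameterInfo info = map(new UimaParam("hidden", "Hidden parameter.", "String", false, true));
        // Prints hidden optional=false
        System.out.println(info.parameterName + " optional=" + info.optional);
    }
}
```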