
Apache Stanbol Launcher Default Configuration

This artifact provides the default configuration used by the Stanbol Launchers. It depends on the

org.apache.stanbol.commons.installer.bundleprovider

being available, because this module is what actually loads the provided configuration.

See STANBOL-529 for more details.

Users that do not want to run with the defaults can stop/uninstall this bundle to deactivate/remove the defaults.

NOTE: The default configuration does not include configuration for other ‘/data’ modules (such as DBpedia or OpenNLP) nor ‘/demo’ modules.

Language Detection Chain

This configures a chain that detects the language of parsed Content using the LangId Engine, with the Tika Engine and the Metaxa Engine included as optional engines. This EnhancementChain is intended for users that are only interested in detecting the language of some text.

This EnhancementChain can also be used if neither the Tika nor the Metaxa Engine is available. However, it will then only be able to process plain text content.
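
As a rough sketch of how such a chain could be called, the helper below posts plain text to a chain endpoint of the Stanbol Enhancer. The host and port, the chain name `language`, and the function names `chain_endpoint` and `detect_language` are assumptions made for illustration; replace the chain name with the one configured in your installation.

```bash
#!/usr/bin/env bash
# Assumed defaults: Stanbol runs on localhost:8080 and the chain is
# named "language". Both values are illustrative and can be overridden.
STANBOL_URL="${STANBOL_URL:-http://localhost:8080}"
CHAIN_NAME="${CHAIN_NAME:-language}"

# Build the URL of an enhancement chain from its name.
chain_endpoint() {
    printf '%s/enhancer/chain/%s\n' "$STANBOL_URL" "$1"
}

# Send plain text to the chain and print the returned enhancements.
# Plain text is used because it works even without Tika or Metaxa.
detect_language() {
    curl -s -X POST \
        -H "Content-Type: text/plain" \
        -H "Accept: text/turtle" \
        --data "$1" \
        "$(chain_endpoint "$CHAIN_NAME")"
}
```

With a running launcher, `detect_language "Das ist ein kurzer deutscher Satz."` should return the enhancement results for the text, serialized as Turtle.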

Keyword Extraction using Entityhub

A configuration that extracts Entities from parsed content based on Entities added to the Entityhub (http://{host}:{port}/entityhub/entity).

This Engine can be used to extract and link entities that were previously added to the Entityhub, e.g. by using:

:::bash
curl -i -X PUT -H "Content-Type:application/rdf+xml" -T {file.rdf} \
    "http://localhost:8080/entityhub/entity"

The property “rdfs:label” is used for extraction. “rdfs:seeAlso” is used for processing redirects. So make sure that the entities you add use “rdfs:label” to store their names.
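
To illustrate the label requirement, the following sketch writes a minimal RDF/XML file whose entity stores its name in “rdfs:label” and then uploads it with the same PUT request shown earlier. The entity URI, the label text, and the function names `write_sample_entity` and `upload_entity` are made-up values for illustration only.

```bash
#!/usr/bin/env bash
# Write a minimal RDF/XML description of one entity to the given file.
# The entity URI and the label are hypothetical example values.
write_sample_entity() {
    cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://example.org/entity/apache-stanbol">
    <rdfs:label xml:lang="en">Apache Stanbol</rdfs:label>
  </rdf:Description>
</rdf:RDF>
EOF
}

# Upload the given RDF/XML file to the Entityhub (assumes localhost:8080).
upload_entity() {
    curl -i -X PUT -H "Content-Type:application/rdf+xml" -T "$1" \
        "http://localhost:8080/entityhub/entity"
}
```

A typical run would be `write_sample_entity entity.rdf && upload_entity entity.rdf`.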

Mappings to the dc:type values used by fise:TextAnnotation are defined for the following rdf:types:

If the Entities you add to the Entityhub use one of those types, the KeywordLinkingEngine will create TextAnnotations with the corresponding dc:type. If not, no dc:type will be set.