DisambiguationEngine: code cleanup. The Entityhub site is now determined from the entityhub:site values of the fise:EntityAnnotations. EntityAnnotations referring to multiple TextAnnotations are now cloned. To allow that, Suggestion was repurposed to represent EntityAnnotations, so that all changes made by the DisambiguationEngine are saved there and only applied at the end. Disambiguation no longer uses the label matches to calculate the score; it is assumed that the original confidence already represents this. The original and disambiguation scores are now combined (see comments for a detailed description).

Enhancer Jersey: improved the Enhancer UI to support disambiguated Entities (same selected-text with different Suggestions) and added metadata to the UI. This includes fixes for STANBOL-725 and STANBOL-726 (already present in trunk).

KeywordExtractionEngine: implemented a different matching for tokens, based on matches of 'processable' tokens instead of matches of all tokens. This improves performance for configurations with less restrictive rules for suggestions (e.g. minFoundTokens=1). It was done with the expectation that such configurations will be more common with a DisambiguationEngine present in the EnhancementChain.

Default Configuration: added the Disambiguation Engine to the default chain and added a KeywordLinking engine based configuration based on DBpedia. These configurations are intended to ease testing of the developments in this branch and will most likely not be merged back into the trunk.

git-svn-id: https://svn.apache.org/repos/asf/incubator/stanbol/branches/disambiguation-engine@1379385 13f79535-47bb-0310-9956-ffa450edef68
16 files changed