| // Licensed to the Apache Software Foundation (ASF) under one |
| // or more contributor license agreements. See the NOTICE file |
| // distributed with this work for additional information |
| // regarding copyright ownership. The ASF licenses this file |
| // to you under the Apache License, Version 2.0 (the |
| // "License"); you may not use this file except in compliance |
| // with the License. You may obtain a copy of the License at |
| // |
| // http://www.apache.org/licenses/LICENSE-2.0 |
| // |
| // Unless required by applicable law or agreed to in writing, |
| // software distributed under the License is distributed on an |
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| // KIND, either express or implied. See the License for the |
| // specific language governing permissions and limitations |
| // under the License. |
| |
| [[_ugr.tools.uimafit.externalresources]] |
| = External Resources |
| |
| An analysis engine often uses some data model. |
| This may be as simple as word frequency counts or as complex as the model of a parser. |
| Often these models can become quite large. |
| If an analysis engine is deployed multiple times in the same pipeline or runs on multiple CPU cores, memory can be saved by using a shared instance of the data model. |
| UIMA supports such a scenario by so-called external resources. |
| The following sections illustrates how external resources can be used with uimaFIT. |
| |
| First create a class for the shared data model. |
| Usually this class would load its data from some URI and then expose it via its methods. |
| An example would be to load word frequency counts and to provide a `getFrequency()` method. |
| In our simple example we do not load anything from the provided URI - we just offer a method to get the URI from which data be loaded. |
| |
| [source,java] |
| ---- |
| // Simple model that only stores the URI it was loaded from. Normally data |
| // would be loaded from the URI instead and made accessible through methods |
| // in this class. This simple example only allows accessing the URI. |
| public static final class SharedModel implements SharedResourceObject { |
| private String uri; |
| |
| public void load(DataResource aData) |
| throws ResourceInitializationException { |
| |
| uri = aData.getUri().toString(); |
| } |
| |
| public String getUri() { return uri; } |
| } |
| ---- |
| |
| == Resource injection |
| |
| === Regular UIMA components |
| |
| When an external resource is used in a regular UIMA component, it is usually fetched from the context, cast and copied to a class member variable. |
| |
| [source,java] |
| ---- |
| class MyAnalysisEngine extends CasAnnotator_ImplBase { |
| final static String MODEL_KEY = "Model"; |
| private SharedModel model; |
| |
| public void initialize(UimaContext context) |
| throws ResourceInitializationException { |
| |
| configuredResource = (SharedModel) |
| getContext().getResourceObject(MODEL_KEY); |
| } |
| } |
| ---- |
| |
| uimaFIT can be used to inject external resources into such traditional components using the `createDependencyAndBind()` method. |
| To show that this works with any off-the-shelf UIMA component, the following example uses uimaFIT to configure the OpenNLP Tokenizer: |
| |
| [source,java] |
| ---- |
| // Create descriptor |
| AnalysisEngineDescription tokenizer = createEngineDescription( |
| Tokenizer.class, |
| UimaUtil.TOKEN_TYPE_PARAMETER, Token.class.getName(), |
| UimaUtil.SENTENCE_TYPE_PARAMETER, Sentence.class.getName()); |
| |
| // Create the external resource dependency for the model and bind it |
| createDependencyAndBind(tokenizer, UimaUtil.MODEL_PARAMETER, |
| TokenizerModelResourceImpl.class, |
| "http://opennlp.sourceforge.net/models-1.5/en-token.bin"); |
| ---- |
| |
| [NOTE] |
| ==== |
| We recommend declaring parameter constants in the classes that use them, e.g. |
| here in `Tokenizer`. |
| This way, the parameters for a class can be found easily. |
| However, OpenNLP declares parameters centrally in `UimaUtil`. |
| Thus, the example above is correct, although unconventional. |
| ==== |
| |
| [NOTE] |
| ==== |
| Note that uimaFIT is unable to perform type-coercion on parameters if a descriptor is created from a class that does not contain `@ConfigurationParameter` annotations, such as the OpenNLP `Tokenizer`. |
| Such a descriptor does not contain any parameter declarations! However, it is still possible to configure such a component using uimaFIT by passing exactly the expected types as parameter values. |
| Thus, we need use the `getName()` method to get the class name as a string, instead of simply passing the class itself. |
| Also, setting multi-valued parameter from a list or single value does not work here. |
| Multi-values parameters must be passed as an array of the required type. |
| Only the default UIMA types are possible: `String`, `boolean`, `int`, and `float`. |
| ==== |
| |
| === uimaFIT-aware components |
| |
| uimaFIT provides the `@ExternalResource` annotation to inject external resources directly into class member variables. |
| |
| .`@ExternalResource` annotation |
| [cols="1,1,1", frame="all", options="header"] |
| |=== |
| | Parameter |
| | Description |
| | Default |
| |
| |key |
| |Resource key |
| |field name |
| |
| |api |
| |Used when the external resource type is different from the field type, e.g. |
| when using an ExternalResourceLocator |
| |field type |
| |
| |mandatory |
| |Whether a value must be specified |
| |true |
| |=== |
| |
| [source,java] |
| ---- |
| // Example annotator that uses the SharedModel. In the process() we only |
| // test if the model was properly initialized by uimaFIT |
| public static class Annotator |
| extends org.apache.uima.fit.component.JCasAnnotator_ImplBase { |
| |
| final static String MODEL_KEY = "Model"; |
| @ExternalResource(key = MODEL_KEY) |
| private SharedModel model; |
| |
| public void process(JCas aJCas) throws AnalysisEngineProcessException { |
| assertTrue(model.getUri().endsWith("gene_model_v02.bin")); |
| // Prints the instance ID to the console - this proves the same |
| // instance of the SharedModel is used in both Annotator instances. |
| System.out.println(model); |
| } |
| } |
| ---- |
| |
| Note, that it is no longer necessary to implement the `initialize()` method. |
| uimaFIT takes care of locating the external resource `Model` in the UIMA context and assigns it to the field `model`. |
| If a mandatory resource is not present in the context, an exception is thrown. |
| |
| The resource injection mechanism is implemented in the `ExternalResourceInitializer` class. |
| uimaFIT provides several base classes that already come with an `initialize()` method using the initializer: |
| |
| * `CasAnnotator_ImplBase` |
| * `CasCollectionReader_ImplBase` |
| * `CasConsumer_ImplBase` |
| * `CasFlowController_ImplBase` |
| * `CasMultiplier_ImplBase` |
| * `JCasAnnotator_ImplBase` |
| * `JCasCollectionReader_ImplBase` |
| * `JCasConsumer_ImplBase` |
| * `JCasFlowController_ImplBase` |
| * `JCasMultiplier_ImplBase` |
| * `Resource_ImplBase` |
| |
| When building a pipeline, external resources can be set of a component just like configuration parameters. |
| External resources and configuration parameters can be mixed and appear in any order when creating a component description. |
| |
| Note that in the following example, we create only one external resource description and use it to configure two different analysis engines. |
| Because we only use a single description, also only a single instance of the external resource is created and shared between the two engines. |
| [source,java] |
| ---- |
| ExternalResourceDescription extDesc = createSharedResourceDescription( |
| SharedModel.class, new File("somemodel.bin")); |
| |
| // Binding external resource to each Annotator individually |
| AnalysisEngineDescription aed1 = createEngineDescription( |
| Annotator.class, |
| Annotator.MODEL_KEY, extDesc); |
| |
| AnalysisEngineDescription aed2 = createEngineDescription( |
| Annotator.class, |
| Annotator.MODEL_KEY, extDesc); |
| |
| // Check the external resource was injected |
| AnalysisEngineDescription aaed = createEngineDescription(aed1, aed2); |
| AnalysisEngine ae = createEngine(aaed); |
| ae.process(ae.newJCas()); |
| ---- |
| |
| This example is given as a full JUnit-based example in the the _uimaFIT-examples_ project. |
| |
| === Resources extending Resource_ImplBase |
| |
| One kind of resources extend `Resource_ImplBase`. |
| These are the easiest to handle, because uimaFIT's version of `Resource_ImplBase` already implements the necessary logic. |
| Just be sure to call `super.initialize()` when overriding `initialize()`. |
| Also mind that external resources are not available yet when `initialize()` is called. |
| For any initialization logic that requires resources, override and implement `afterResourcesInitialized()`. |
| Other than that, injection of external resources works as usual. |
| |
| [source,java] |
| ---- |
| public static class ChainableResource extends Resource_ImplBase { |
| public final static String PARAM_CHAINED_RESOURCE = "chainedResource"; |
| @ExternalResource(key = PARAM_CHAINED_RESOURCE) |
| private ChainableResource chainedResource; |
| |
| public void afterResourcesInitialized() { |
| // init logic that requires external resources |
| } |
| } |
| ---- |
| |
| === Resources implementing SharedResourceObject |
| |
| The other kind of resources implement `SharedResourceObject``. |
| Since this is an interface, uimaFIT cannot provide the initialization logic, so you have to implement a couple of things in the resource: |
| |
| * implement `ExternalResourceAware` |
| * declare a configuration parameter `ExternalResourceFactory.PARAM_RESOURCE_NAME` and return its value in `getResourceName()` |
| * invoke `ConfigurationParameterInitializer.initialize()` in the `load()` method. |
| |
| Again, mind that external resource not properly initialized until uimaFIT invokes `afterResourcesInitialized()`. |
| |
| [source,java] |
| ---- |
| public class TestSharedResourceObject implements |
| SharedResourceObject, ExternalResourceAware { |
| |
| @ConfigurationParameter(name=ExternalResourceFactory.PARAM_RESOURCE_NAME) |
| private String resourceName; |
| |
| public final static String PARAM_CHAINED_RESOURCE = "chainedResource"; |
| @ExternalResource(key = PARAM_CHAINED_RESOURCE) |
| private ChainableResource chainedResource; |
| |
| public String getResourceName() { |
| return resourceName; |
| } |
| |
| public void load(DataResource aData) |
| throws ResourceInitializationException { |
| |
| ConfigurationParameterInitializer.initialize(this, aData); |
| // rest of the init logic that does not require external resources |
| } |
| |
| public void afterResourcesInitialized() { |
| // init logic that requires external resources |
| } |
| } |
| ---- |
| |
| === Note on injecting resources into resources |
| |
| Nested resources are only initialized if they are used in a pipeline which contains at least one component that calls `ConfigurationParameterInitializer.initialize()`. |
| Any component extending uimaFIT's component base classes qualifies. |
| If you use nested resources in a pipeline without any uimaFIT-aware components, you can just add uimaFIT's `NoopAnnotator` to the pipeline. |
| |
| == Resource locators |
| |
| Normally, in UIMA an external resource needs to implement either `SharedResourceObject` or `Resource`. |
| In order to inject arbitrary objects, uimaFIT has the concept of `ExternalResourceLocator`. |
| When a resource implements this interface, not the resource itself is injected, but the method `getResource()` is called on the resource and the result is injected. |
| The following example illustrates how to inject an object from JNDI into a UIMA component: |
| |
| [source,java] |
| ---- |
| class MyAnalysisEngine2 extends JCasAnnotator_ImplBase { |
| static final String RES_DICTIONARY = "dictionary"; |
| @ExternalResource(key = RES_DICTIONARY) |
| Dictionary dictionary; |
| } |
| |
| AnalysisEngineDescription desc = createEngineDescription( |
| MyAnalysisEngine2.class); |
| |
| bindResource(desc, MyAnalysisEngine2.RES_DICTIONARY, |
| JndiResourceLocator.class, |
| JndiResourceLocator.PARAM_NAME, "dictionaries/german"); |
| ---- |