<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"[
<!ENTITY imgroot "../images/tools/tools.cde/" >
<!ENTITY % uimaents SYSTEM "../entities.ent" >
%uimaents;
]>
<!-- | |
Licensed to the Apache Software Foundation (ASF) under one | |
or more contributor license agreements. See the NOTICE file | |
distributed with this work for additional information | |
regarding copyright ownership. The ASF licenses this file | |
to you under the Apache License, Version 2.0 (the | |
"License"); you may not use this file except in compliance | |
with the License. You may obtain a copy of the License at | |
http://www.apache.org/licenses/LICENSE-2.0 | |
Unless required by applicable law or agreed to in writing, | |
software distributed under the License is distributed on an | |
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | |
KIND, either express or implied. See the License for the | |
specific language governing permissions and limitations | |
under the License. | |
--> | |
<chapter id="ugr.tools.cde"> | |
<title>Component Descriptor Editor User's Guide</title> | |
<titleabbrev>CDE User's Guide</titleabbrev> | |
<para>The Component Descriptor Editor is an Eclipse plug-in that provides a forms-based | |
interface for creating and editing UIMA XML descriptors. It supports most of the | |
descriptor formats, except the Collection Processing Engine descriptor, the PEAR | |
package descriptor and some remote deployment descriptors.</para> | |
<section id="ugr.tools.cde.launching"> | |
<title>Launching the Component Descriptor Editor</title> | |
<para>Here's how to launch this tool on a descriptor contained in the examples. This | |
presumes you have installed the examples as described in the SDK Installation and Setup | |
chapter.</para> | |
<itemizedlist spacing="compact"><listitem><para>Expand the uimaj-examples | |
project in the Eclipse Navigator or Package Explorer view</para></listitem> | |
<listitem><para>Within this project, browse to the file | |
descriptors/tutorial/ex1/RoomNumberAnnotator.xml.</para></listitem> | |
<listitem><para>Right-click on this file and select Open With → Component
Descriptor Editor. (If this option is not present, check that you installed
the plug-ins as described in <olink targetdoc="&uima_docs_overview;"
targetptr="ugr.ovv.eclipse_setup.installation"/>. The EMF plug-in is also
required.)</para></listitem>
<listitem><para>This should open a graphical editor and display the contents of the | |
RoomNumberAnnotator descriptor. </para></listitem></itemizedlist> | |
</section> | |
<section id="ugr.tools.cde.creating_new_ae_descriptor"> | |
<title>Creating a New AE Descriptor</title> | |
<para>A new AE descriptor file may be created by selecting the File → New → | |
Other... menu. This brings up the following dialog: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.8in" format="JPG" fileref="&imgroot;image002.jpg"/> | |
</imageobject> | |
<textobject><phrase>Screenshot of selecting new UIMA component in Eclipse</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>If the user then selects UIMA and Analysis Engine Descriptor File, and clicks the
Next &gt; button, the following dialog is displayed. We will cover creating other kinds
of components later in the documentation.
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="3.2in" format="JPG" fileref="&imgroot;image004.jpg"/> | |
</imageobject> | |
<textobject><phrase>Screenshot of selecting new UIMA component in Eclipse | |
after pushing Next</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>After entering the appropriate parent folder and file name, and clicking Finish, | |
an initial AE descriptor file is created with the given name, and the descriptor is | |
opened up within the Component Descriptor Editor.</para> | |
<para>At this point, the display inside the Component Descriptor Editor is the same | |
whether one started by creating a new AE descriptor, as in the preceding paragraph, or | |
one merely opened a previously created AE descriptor from, say, the Package Explorer | |
view. We show a previously created AE in the figure below: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.84in" format="JPG" fileref="&imgroot;image006.jpg"/> | |
</imageobject> | |
<textobject><phrase>Screenshot of CDE showing overview page</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>To see all the information shown in the main editor pane with less scrolling, double | |
click the title tab to toggle between the <quote>full screen</quote> and normal | |
views.</para> | |
<para>It is possible to set the Component Descriptor Editor as the default editor for all | |
.xml files by going to Window → Preferences, and then selecting File Associations | |
on the left, and *.xml on the right, and finally by clicking on Component Descriptor | |
Editor, the Default button and then OK. If AE and Type System descriptors are not the | |
primary .xml files you work with within the Eclipse environment, we recommend not | |
setting the Component Descriptor Editor as your default editor for all .xml files. To | |
open an .xml file using the Component Descriptor Editor, if the Component Descriptor | |
Editor is not set as your default editor, right click on the file in the Package Explorer, | |
or other navigational view, and select Open With → Component Descriptor Editor. | |
This choice is remembered by Eclipse for subsequent open operations.</para> | |
</section> | |
<section id="ugr.tools.cde.pages_within_the_editor"> | |
<title>Pages within the Editor</title> | |
<para>The Component Descriptor Editor follows a standard Eclipse paradigm for these | |
kinds of editors. There are several pages in the editor; each one can be selected, one at a | |
time, by clicking on the bottom tabs. The last page contains the actual XML source file | |
being edited, and is displayed as plain text.</para> | |
<para>The same set of tabs appears at the bottom of each page in the Component Descriptor
Editor. The Component Descriptor Editor uses this <quote>multi-page editor</quote> | |
paradigm to give the user a view of conceptually distinct portions of the Descriptor | |
metadata in separate pages. At any point in time the user may click on the Source tab to | |
view the actual XML source. The Component Descriptor Editor is, in a way, just a fancy GUI | |
for editing the XML. The tabs provide quick access to the following pages: Overview, | |
Aggregate, Parameters, Parameter Settings, Type System, Capabilities, Indexes, | |
Resources, and Source. We discuss each of these pages in turn.</para> | |
<section id="ugr.tools.cde.adjusting_display_of_pages"> | |
<title>Adjusting the display of pages</title> | |
<para>Most pages in the editor have a <quote>sash</quote> bar. This is a light gray bar | |
which separates sub-sections of the page. This bar can be dragged with the mouse to | |
adjust how the display area is split between the two sash panes. You can also change the | |
orientation of the Sash so it splits vertically, instead of horizontally, by | |
clicking on the small icons at the top right of the page that look like this: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width=".7in" format="JPG" fileref="&imgroot;image008.jpg"/> | |
</imageobject> | |
<textobject><phrase>Changing orientation of two window split</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>All of the sections on a page have subtitles, with an indicator to the left which | |
you can click to collapse or expand that particular section. Collapsing sections can | |
sometimes be useful to free up screen area for other sections.</para> | |
</section> | |
</section> | |
<section id="ugr.tools.cde.overview_page"> | |
<title>Overview Page</title> | |
<para>Normally, the first page displayed in the Component Descriptor Editor is the | |
Overview page (the name of the page is shown in the GUI panel at the top left). If there is an | |
error reading and parsing the source, the Source page is shown instead, giving you the | |
opportunity to correct the problem. For many components, the Overview page contains | |
three sections: Implementation Details, Runtime Information and overall | |
Identification Information.</para> | |
<section id="ugr.tools.cde.overview_page.implementation_details"> | |
<title>Implementation Details</title> | |
<para>In the Implementation Details section you specify the Implementation Language | |
and Engine Type. There are two kinds of Engines: Aggregate, and non-Aggregate (also | |
called Primitive). An Aggregate engine is one which is composed of additional | |
component engines and contains no code, itself. Several of the pages in the Component | |
Descriptor Editor have different formats, depending on the engine type.</para> | |
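<para>As a rough sketch of how these two choices might appear on the Source page, assuming a
primitive engine implemented in Java (for a C++ implementation the framework identifier would be
<literal>org.apache.uima.cpp</literal>):</para>
<programlisting><![CDATA[<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <!-- Implementation Language: Java -->
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <!-- Engine Type: true = Primitive, false = Aggregate -->
  <primitive>true</primitive>
  <!-- remaining elements omitted in this sketch -->
</analysisEngineDescription>]]></programlisting>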
</section> | |
<section id="ugr.tools.cde.overview_page.runtime_info"> | |
<title>Runtime Information</title> | |
<para>Runtime information is only applicable for primitive engines and is disabled | |
for aggregates and other kinds of descriptors. This is where you specify the class name of the annotator | |
implementation, if you are doing a Java implementation, or the C++ shared object or dll name, | |
if you are doing a C++ implementation. Most Analysis Engines will specify that | |
they update the CAS, and that they may be replicated (for performance reasons) when deployed. If | |
a particular Analysis Engine must see every CAS (for instance, if it is counting the | |
number of CASes), then uncheck the <quote>multiple deployment allowed</quote> | |
box. If the Analysis Engine doesn't update the CAS, uncheck the <quote>updates | |
the CAS</quote> box. (Most CAS Consumers do not update the CAS, and this parameter | |
defaults to unchecked for new CAS Consumer descriptors).</para> | |
<para>Analysis engines written using the CAS Multiplier APIs
(see <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.cm"/>)
can create additional CASes for analysis. To specify that they
do this, check the <quote>returns new artifacts</quote> box.</para>
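<para>The fragment below is an illustrative sketch of how the Runtime Information settings are
likely to look in the XML source. The class name is an assumption based on the tutorial example,
and the mapping of each check box to an element is inferred from the element names rather than
stated by the editor itself:</para>
<programlisting><![CDATA[<!-- Java class implementing the annotator (assumed name) -->
<annotatorImplementationName>org.apache.uima.tutorial.ex1.RoomNumberAnnotator</annotatorImplementationName>
<analysisEngineMetaData>
  <!-- other metadata omitted in this sketch -->
  <operationalProperties>
    <!-- "updates the CAS" -->
    <modifiesCas>true</modifiesCas>
    <!-- "multiple deployment allowed" -->
    <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
    <!-- "returns new artifacts" (set to true for CAS Multipliers) -->
    <outputsNewCASes>false</outputsNewCASes>
  </operationalProperties>
</analysisEngineMetaData>]]></programlisting>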
</section> | |
<section id="ugr.tools.cde.overview_page.overall_id_info"> | |
<title>Overall Identification Information</title> | |
<para>The Name should be a human-readable name that describes this component. The | |
Version, Vendor, and Description fields are optional, and are arbitrary | |
strings.</para> | |
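<para>For orientation, these fields correspond to the first few elements of the metadata block in
the XML source. The values shown here are placeholders, not taken from an actual descriptor:</para>
<programlisting><![CDATA[<analysisEngineMetaData>
  <name>Room Number Annotator</name>
  <description>Placeholder description of what this component does.</description>
  <version>1.0</version>
  <vendor>Example Vendor</vendor>
  <!-- remaining metadata omitted in this sketch -->
</analysisEngineMetaData>]]></programlisting>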
</section> | |
</section> | |
<section id="ugr.tools.cde.aggregate_page"> | |
<title>Aggregate Page</title> | |
<para>For primitive Analysis Engines, Flow Controllers or Collection Processing | |
components, the Aggregate page is not used. For aggregate engines, the page looks like | |
this: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.84in" format="JPG" fileref="&imgroot;image010.jpg"/> | |
</imageobject> | |
<textobject><phrase>CDE Aggregate page</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>On the left we see a list of component engines, and on the right information about the | |
flow. If you hover the mouse over an item in the list of component engines, that | |
engine's description metadata will be shown. If you right-click on one of these
items, you get an option to open that delegate descriptor in another editor instance. | |
Any changes you make, however, won't be seen until you close and reopen the editor | |
on the importing file.</para> | |
<para>Engines can be added to the list on the left by clicking the Add button at the bottom of | |
the Component Engine section. This brings up the following dialog: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="3.2in" format="JPG" fileref="&imgroot;delegate-chooser.jpg"/> | |
</imageobject> | |
<textobject><phrase>Adding an Analysis Engine to an Aggregate</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>This dialog lets you select | |
a descriptor from your workspace, or browse the file system to select a descriptor. | |
</para> | |
<para>You can specify that the import should be by Name (the name is looked up using both the | |
Project's class path, and DataPath), or by location. If it is by name, it may | |
contain part of the path within the name. For instance, if the file name picked is | |
<literal>c:/project/subproject/src/com/company/prod/xyz.xml</literal>, and | |
the class path includes <literal>c:/project/subproject/src</literal>, the name in | |
the descriptor will be <quote><literal>com.company.prod.xyz</literal></quote>. | |
If it is by location, the file reference is converted to a relative reference if | |
possible, in the descriptor.</para> | |
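<para>The two import styles might look as follows in the aggregate's XML source. This is a sketch
only: the by-name import reuses the example above, while the key names and the relative path in
the by-location import are hypothetical:</para>
<programlisting><![CDATA[<delegateAnalysisEngineSpecifiers>
  <!-- import by name: looked up through the class path and data path -->
  <delegateAnalysisEngine key="xyz">
    <import name="com.company.prod.xyz"/>
  </delegateAnalysisEngine>
  <!-- import by location: hypothetical path, relative to this descriptor -->
  <delegateAnalysisEngine key="Tokenizer">
    <import location="../tokenizer/Tokenizer.xml"/>
  </delegateAnalysisEngine>
</delegateAnalysisEngineSpecifiers>]]></programlisting>
<para>An import by name keeps working when the descriptor is packaged in a Jar on the class path,
whereas an import by location depends on the relative position of the two files.</para>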
<para>The final selection at the bottom tells whether or not the selected engine(s) | |
should automatically be added to the end of the flow section (the right section on the | |
Aggregate page). The OK button does not become activated until a descriptor | |
file is selected.</para> | |
<para>To remove an analysis engine from the component engine list simply select an engine | |
and click the Remove button, or press the delete key. If the engine is already in the flow | |
list you will be warned that deletion will also delete the specified engine from this | |
list.</para> | |
<section id="ugr.tools.cde.aggregate_page.adding_components_more_than_once"> | |
<title>Adding components more than once</title> | |
<para>Components may be added to the left panel more than once. Each of these components | |
will be given a key which is unique. A typical reason this might be done is to use a | |
component in a flow several times, but have each use be associated with different | |
configuration parameters (different configuration parameters can be associated | |
with each instance).</para> | |
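<para>A minimal sketch of the result, assuming the same descriptor is added twice. The key names
and the descriptor name are hypothetical; the editor may generate different keys:</para>
<programlisting><![CDATA[<delegateAnalysisEngineSpecifiers>
  <!-- the same descriptor imported twice under two distinct keys -->
  <delegateAnalysisEngine key="RegexAnnotatorDates">
    <import name="com.company.prod.RegexAnnotator"/>
  </delegateAnalysisEngine>
  <delegateAnalysisEngine key="RegexAnnotatorPhones">
    <import name="com.company.prod.RegexAnnotator"/>
  </delegateAnalysisEngine>
</delegateAnalysisEngineSpecifiers>]]></programlisting>
<para>Because parameter overrides refer to a delegate by its key, each instance can then receive
its own parameter values.</para>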
</section> | |
<section | |
id="ugr.tools.cde.aggregate_page.adding_removing_components_from_flow"> | |
<title>Adding or Removing components in a flow</title> | |
<para>The button in-between the Component Engines and the Flow List, labeled
<literal>&gt;&gt;</literal>, adds a chosen engine to the flow list and the button
labeled <literal>&lt;&lt;</literal> removes an engine from the flow list. To add an
engine to the flow list you must first select an engine from the left hand list, and then
press the <literal>&gt;&gt;</literal> button. Engines may appear any number of
times in the flow list. To remove an engine from the flow list, select an engine from the
right hand list and press the <literal>&lt;&lt;</literal> button.</para>
</section> | |
<section id="ugr.tools.cde.aggregate_page.adding_remote_aes"> | |
<title>Adding remote Analysis Engines</title> | |
<para>There are two ways to add remote engines: add an existing descriptor, which | |
specifies a remote engine (just as if you were adding a non-remote engine) or use the | |
Add Remote button which will create a remote descriptor, save it, and then import it, | |
all in one operation. The Add Remote button enables you to easily specify the | |
information needed to create a Service Client descriptor for a remote AE - one that | |
runs on a different computer connected over the network. The Service Client | |
descriptor is described in <olink targetdoc="&uima_docs_ref;" | |
targetptr="ugr.ref.xml.component_descriptor.service_client"/>. The Add | |
Remote button creates this descriptor, saves it as a file in the workspace, and | |
imports it into the aggregate.</para> | |
<para>Of course, if you already have a Service Client descriptor, you can add it to the | |
set of delegates, just like adding other kinds of analysis engines.</para> | |
<para>After clicking on Add Remote, the following dialog is displayed: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.84in" format="JPG" fileref="&imgroot;image014.jpg"/> | |
</imageobject> | |
<textobject><phrase>Adding a remote client to an aggregate</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>To define a remote service you specify the Service Kind, Protocol Service Type, | |
URI and Key. You can also specify a Timeout in milliseconds, used by the SOAP service, | |
and a VNS Host and Port used by the Vinci Service. Just like when one adds an engine from | |
the file system, you have the option of adding the engine to the end of the flow. The | |
Component Descriptor Editor currently only supports Vinci and SOAP services using | |
this dialog.</para> | |
<para>Remote engines are added to the descriptor using the
&lt;import ... &gt; syntax. The information you specify here is saved in the Eclipse
project as a file, using a generated name, &lt;key-name&gt;.xml, where
&lt;key-name&gt; is the name you listed as the Key. Because of this, the key-name must
be a valid file name. If you want a different name, you can change the path information
in the dialog box.</para>
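<para>The generated file is a Service Client descriptor. The following sketch shows roughly what
such a file could contain for a Vinci service. The service name, host, port and timeout are
placeholder values, and the exact output of the Add Remote dialog may differ:</para>
<programlisting><![CDATA[<uriSpecifier xmlns="http://uima.apache.org/resourceSpecifier">
  <resourceType>AnalysisEngine</resourceType>
  <!-- for Vinci, the URI is the service name; for SOAP, an HTTP URL -->
  <uri>com.company.prod.MyRemoteService</uri>
  <protocol>Vinci</protocol>
  <!-- timeout in milliseconds -->
  <timeout>60000</timeout>
  <parameters>
    <!-- location of the Vinci Name Server (placeholder values) -->
    <parameter name="VNS_HOST" value="vns.example.org"/>
    <parameter name="VNS_PORT" value="9000"/>
  </parameters>
</uriSpecifier>]]></programlisting>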
</section> | |
<section id="ugr.tools.cde.aggregate_page.connecting_to_remote_services"> | |
<title>Connecting to Remote Services</title> | |
<para>If you are using the Vinci protocol, it requires that you specify the location of | |
the Vinci Name Server (an IP address and a Port number). You can specify these in the | |
service descriptor, or globally, for your Eclipse workspace, using the Eclipse menu | |
item: Window → Preferences... → UIMA Preferences. If the remote service | |
is available (up and running), additional operations become possible. For | |
instance, hovering the mouse over the remote descriptor will show the description | |
metadata from the remote service.</para> | |
</section> | |
<section id="ugr.tools.cde.aggregate_page.finding_aes_by_searching"> | |
<title>Finding Analysis Engines by searching</title> | |
<para>The next button that appears between the component engine list and the flow list | |
is the Find AE button. When this button is pressed the following dialog is displayed, | |
which allows one to search for AEs by name, by input or output types, or by a combination | |
of these criteria. This function searches the existing Eclipse workspace for | |
matching *.xml descriptor source files; it does not look inside Jar files. | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.3in" format="JPG" fileref="&imgroot;image016.jpg"/> | |
</imageobject> | |
<textobject><phrase>Searching for an AE to add to an aggregate</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>The search automatically adds a <quote>match any characters</quote>-style
(*) wildcard at the beginning and end of anything entered. Thus, if <literal>person</literal> is
specified for an output type, a <quote>*person*</quote> search is performed. Such a
search would match such things as <quote>my.namespace.person</quote> and
<quote>person.governmentOfficial</quote>. One can search in all projects or one
particular project. The search does an implicit <emphasis>and</emphasis> on all
fields which are left non-blank.</para>
</section> | |
<section id="ugr.tools.cde.aggregate_page.component_engine_flow"> | |
<title>Component Engine Flow</title> | |
<para>The UIMA SDK currently supports three kinds of sequencing flows: Fixed, | |
CapabilityLanguageFlow (see <olink targetdoc="&uima_docs_ref;" | |
targetptr="ugr.ref.xml.component_descriptor.aes.aggregate.flow_constraints.capability_language_flow"/> | |
), and user-defined. The first two require specification of a linear flow sequence; | |
this linear flow sequence can also be read by a user-defined flow controller (what use | |
is made of it is up to the user-defined flow controller). The Component Engine Flow | |
section allows specification of these items.</para> | |
<para>The pull-down labeled Flow Kind picks between the three flow models. When the | |
user-defined flow is selected, the Browse and Search buttons become enabled to let | |
you pick the flow controller XML descriptor to import. | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="3.8in" format="JPG" fileref="&imgroot;image018.jpg"/> | |
</imageobject> | |
<textobject><phrase>Specifying flow control</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>The key name value is set automatically from the XML descriptor being imported, | |
and enables parameters to be overridden for that descriptor (see following | |
sections).</para> | |
<para>The Up and Down buttons to the right in the Flow section are activated when an
engine in the flow is selected. The Up button moves the selected engine up one place in
the execution order, and the Down button moves the selected engine down one place in the
execution order. Remember that engines can appear multiple times in the flow (or not
at all).</para>
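<para>As a sketch of what the Flow section writes into the descriptor, assuming a Fixed flow over
delegates with the hypothetical keys <literal>Tokenizer</literal> and
<literal>NameRecognizer</literal>:</para>
<programlisting><![CDATA[<analysisEngineMetaData>
  <!-- other metadata omitted in this sketch -->
  <flowConstraints>
    <!-- use capabilityLanguageFlow instead of fixedFlow for the second flow kind -->
    <fixedFlow>
      <!-- nodes are delegate keys, listed in execution order; a key may repeat -->
      <node>Tokenizer</node>
      <node>NameRecognizer</node>
      <node>Tokenizer</node>
    </fixedFlow>
  </flowConstraints>
</analysisEngineMetaData>]]></programlisting>
<para>For a user-defined flow, the flow controller descriptor is imported under its own key,
alongside the delegate specifiers. The key and file name below are again hypothetical:</para>
<programlisting><![CDATA[<flowController key="MyFlowController">
  <import location="MyFlowController.xml"/>
</flowController>]]></programlisting>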
</section> | |
</section> | |
<section id="ugr.tools.cde.parm_definition"> | |
<title>Parameters Definition Page</title> | |
<para>There are two pages for parameters: the first one is where parameters are defined,
and the second one is where the parameter settings are configured. The first page is the
Parameter Definition page and has two alternatives, depending on whether or not the
descriptor is an Aggregate. We start with a description of parameter definitions
for Primitive engines, CAS Consumers, Collection Readers, CAS Initializers, and Flow
Controllers. Here is an example:
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.2in" format="JPG" fileref="&imgroot;image020.jpg"/> | |
</imageobject> | |
<textobject><phrase>Parameter Definitions - not Aggregate</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>The first checkbox at the top simplifies things if you are not using Parameter | |
Groups (see the following section for a discussion of groups). In this case, leave the | |
check box unchecked. The main area shows a list of parameter definitions. Each | |
parameter has a name, which must be unique for this Analysis Engine. The other three
attributes specify whether the parameter can have a single value or multiple values (an array
of values), whether it is Optional or Mandatory, and what type of value it can hold
(String, Integer, Float, or Boolean).</para>
<para>In addition to using the buttons on the right to edit this information, you can | |
double-click a parameter to edit it, or remove (delete) a selected parameter by | |
pressing the delete key. Use the Add button to add a new parameter to the list.</para> | |
<para>Parameters have an additional description field, which you can specify when you | |
add or edit a parameter. To see the value of the description, hover the mouse over the | |
item, as shown in the picture below: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.5in" format="JPG" fileref="&imgroot;image022.jpg"/> | |
</imageobject> | |
<textobject><phrase>Parameter description shown in a hover message</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
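<para>A parameter definition without groups might look like this in the XML source. The parameter
names and descriptions are hypothetical and serve only to illustrate the four attributes and the
description field:</para>
<programlisting><![CDATA[<configurationParameters>
  <configurationParameter>
    <name>Patterns</name>
    <description>Regular expressions to match (shown in the hover text).</description>
    <!-- one of String, Integer, Float, Boolean -->
    <type>String</type>
    <!-- true = array of values, false = single value -->
    <multiValued>true</multiValued>
    <!-- true = Mandatory, false = Optional -->
    <mandatory>true</mandatory>
  </configurationParameter>
  <configurationParameter>
    <name>CaseSensitive</name>
    <type>Boolean</type>
    <multiValued>false</multiValued>
    <mandatory>false</mandatory>
  </configurationParameter>
</configurationParameters>]]></programlisting>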
<section id="ugr.tools.cde.parm_definition.using_groups"> | |
<title>Using groups</title> | |
<para>The group concept for parameters arose from the observation that sets of | |
parameters were sometimes associated with different configuration needs. As an | |
example, you might have an Analysis Engine which needed different configuration | |
based on the language of a document.</para> | |
<para>To use groups, you check the <quote>Use Parameter Groups</quote> box. When you | |
do this, you get the ability to add groups, and to define parameters within these | |
groups. You also get a capability to define <quote>Common</quote> parameters, | |
which are parameters which are defined for all groups. Here is a screen shot showing | |
some parameter groups in use: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.2in" format="JPG" fileref="&imgroot;image024.jpg"/> | |
</imageobject> | |
<textobject><phrase>Using parameter groups</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>You can see the <quote>&lt;Common&gt;</quote> parameters as well as two
different sets of groups.</para>
<para>The Default Group is an optional specification of what Group to use if the | |
parameter is not available for the group requested.</para> | |
<para>The Search strategy specifies what to do when a parameter is not available for the | |
group requested. It can have the values of None, language_fallback, or | |
default_fallback. These are more fully described in the section <olink | |
targetdoc="&uima_docs_ref;" | |
targetptr="ugr.ref.xml.component_descriptor.aes.configuration_parameter_declaration"/> | |
.</para> | |
<para>Groups are added using the Add Group button. Once added, they can be edited or | |
removed, using the buttons to the right, or the standard gestures for editing | |
(double-clicking the item) and removing (pressing the delete key after an item is | |
selected). Removing a group removes all the parameter definitions in the group. If
you try to remove the <quote>&lt;Common&gt;</quote> group, it just removes the
parameters in the group.</para>
<para>Each entry for a group in the table specifies one or more group names. For example,
the highlighted entry above specifies two groups: <quote>myNewGroup2</quote>
and <quote>mg3</quote>. The parameter definition underneath is considered to be in
both groups.</para>
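<para>The following sketch shows how grouped parameters could be represented in the XML source.
The group names <literal>myNewGroup2</literal> and <literal>mg3</literal> come from the example
above; the parameter names, the default group and the chosen search strategy are
assumptions:</para>
<programlisting><![CDATA[<configurationParameters defaultGroup="myNewGroup2"
                         searchStrategy="default_fallback">
  <!-- parameters defined for all groups -->
  <commonParameters>
    <configurationParameter>
      <name>CaseSensitive</name>
      <type>Boolean</type>
      <multiValued>false</multiValued>
      <mandatory>false</mandatory>
    </configurationParameter>
  </commonParameters>
  <!-- one entry naming two groups; its parameters belong to both -->
  <configurationGroup names="myNewGroup2 mg3">
    <configurationParameter>
      <name>Dictionary</name>
      <type>String</type>
      <multiValued>false</multiValued>
      <mandatory>true</mandatory>
    </configurationParameter>
  </configurationGroup>
</configurationParameters>]]></programlisting>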
</section> | |
<section id="ugr.tools.cde.parm_definition.aggregates"> | |
<title>Parameter declarations for Aggregates</title> | |
<para>Aggregates declare parameters which always must override a parameter setting | |
for a component making up the aggregate. They do this using the version of this page | |
which is shown when the descriptor is an Aggregate; here's an example: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.84in" format="JPG" fileref="&imgroot;image026.jpg"/> | |
</imageobject> | |
<textobject><phrase>Aggregate parameters</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>There is an additional panel shown (on the right) which lists all of the | |
components by their key names, and shows for each of them their defined parameters. To | |
add a new override for one or more of these parameters to the aggregate, select the | |
component parameter you wish to override and push the Create Override button (or, you | |
can just double-click the component parameter). This will automatically add a | |
parameter of the same name (by default – you can change the name if you like) to | |
the aggregate, putting it into the same group(s) (if groups are being used in the | |
component – this is required), and setting the properties of the parameter to | |
match those of the component (this is required).</para> | |
<note><para>If the name of the parameter being added already is in use in the aggregate, | |
and the parameters are not compatible, a new parameter name is generated by suffixing | |
the name with a number. If the parameters are compatible, the selected component | |
parameter is added to the existing aggregate parameter, as an additional override. If | |
you don't want this behavior, but want to have a new name generated in this case, | |
push the Create non-shared Override button instead, or hold down the | |
<quote>shift</quote> key when double clicking the component parameter.</para> | |
<para>The required / optional setting in the aggregate parameter is set to match that of | |
the parameter being overridden. You may want to make an optional delegate parameter | |
required. You can do this by changing that value manually in the source editor view. | |
</para></note> | |
<para>In the above example, the user has just double-clicked the
<quote>TypeNames</quote> parameter in the <quote>NameRecognizer</quote>
component. This added that parameter to this aggregate under the <quote>&lt;Not in
any group&gt;</quote> section – since it wasn't part of a group.</para>
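<para>In the aggregate's XML source, the override created in this example would presumably look
something like the sketch below. The type, multi-valued and mandatory values are assumptions,
since they are copied from the component's own definition, which is not shown here:</para>
<programlisting><![CDATA[<configurationParameter>
  <name>TypeNames</name>
  <!-- the next three values must match the overridden component parameter -->
  <type>String</type>
  <multiValued>true</multiValued>
  <mandatory>false</mandatory>
  <overrides>
    <!-- delegate key, a slash, then the parameter name in that delegate -->
    <parameter>NameRecognizer/TypeNames</parameter>
  </overrides>
</configurationParameter>]]></programlisting>
<para>A shared override lists several <literal>parameter</literal> entries inside the same
<literal>overrides</literal> element, one for each component parameter it controls.</para>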
<para>Once you have added a parameter definition to the aggregate, you can use the | |
buttons on the right side of the left panel to add additional overrides or remove | |
parameters or their overrides. <phrase | |
id="ugr.tools.cde.parm_definition.removing_groups"> You can also remove | |
groups; removing a group is like removing all the parameter definitions in the | |
group.</phrase></para> | |
<para>In addition to adding one parameter at a time from a component, you can also add all | |
the parameters for a group within a component, or all the parameters in the component, | |
by selecting those items.</para> | |
<para>If you double-click (or push Create Override) the
<quote>&lt;Common&gt;</quote> group or a parameter in the &lt;Common&gt; group in
a component, a special group is created in the Aggregate consisting of all of the | |
groups in that component, and the overriding parameter (or parameters) are added to | |
that. This is done because each component can have different groups belonging to the | |
Common group notion; the Common group for a component is just shorthand for all the | |
groups in that component.</para> | |
<para>The Aggregate's specification of the default group and search strategy
overrides any specifications contained in the components.</para>
</section> | |
</section> | |
<section id="ugr.tools.cde.parameter_settings"> | |
<title>Parameter Settings Page</title> | |
<para>The Parameter Settings page is rather straightforward; it is where the user | |
defines parameter settings for their engines. An example of such a page is given below: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.84in" format="JPG" fileref="&imgroot;image028.jpg"/> | |
</imageobject> | |
<textobject><phrase>Parameter settings page</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>For single-valued parameters, the user simply types the default value into the
Value box on the right hand side. For multi-valued parameters the user should use the
Add, Edit and Remove buttons to manage the list of multiple parameter values.</para>
<para>Values within groups are shown with each group separately displayed, to allow | |
configuring different values for each group.</para> | |
<para>Values are checked for validity. For Boolean values in a list, use the words | |
<literal>true</literal> or <literal>false</literal>.</para> | |
<note><para>If you specify a value in a single-valued parameter, and then delete all the | |
characters in the value, the CDE will treat this as if you wanted to not specify any setting | |
for this parameter. In order to specify a 0 length string setting for a String-valued | |
parameter, you will have to manually edit the XML using the <quote>Source</quote> tab. | |
</para> | |
<para> For array valued parameters, if you remove all of the entries for a particular array | |
parameter setting, the XML will reflect a 0-length array. To change this to an | |
unspecified parameter setting, you will have to manually edit the XML using the | |
<quote>Source</quote> tab. </para></note> | |
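<para>To make the distinctions in the note concrete, here is a sketch of parameter settings as they
might appear on the Source page. The parameter names are hypothetical. An unspecified setting is
simply one with no <literal>nameValuePair</literal> entry at all:</para>
<programlisting><![CDATA[<configurationParameterSettings>
  <!-- single-valued Boolean setting -->
  <nameValuePair>
    <name>CaseSensitive</name>
    <value><boolean>true</boolean></value>
  </nameValuePair>
  <!-- multi-valued String setting -->
  <nameValuePair>
    <name>Patterns</name>
    <value>
      <array>
        <string>[0-9]+</string>
        <string>[A-Z]+</string>
      </array>
    </value>
  </nameValuePair>
  <!-- 0-length string: has to be entered by hand on the Source page -->
  <nameValuePair>
    <name>Prefix</name>
    <value><string></string></value>
  </nameValuePair>
  <!-- 0-length array: what remains after removing every entry of an array -->
  <nameValuePair>
    <name>StopWords</name>
    <value><array/></value>
  </nameValuePair>
</configurationParameterSettings>]]></programlisting>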
</section> | |
<section id="ugr.tools.cde.type_system"> | |
<title>Type System Page</title> | |
<para>This page declares the type system used by the annotator. For aggregates it is | |
derived by merging the type systems of all constituent AEs. The types used by the AE | |
constitute the language in which the inputs and outputs are described in the | |
Capabilities page and also affect the choice of indexes on the Indexes page. The Type | |
System page looks like the following: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.84in" format="JPG" fileref="&imgroot;image030.jpg"/> | |
</imageobject> | |
<textobject><phrase>Type System declaration page</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>Before discussing this page in detail, note that two settings affect its
operation. They are accessed by selecting
UIMA → Settings (or by going to the Eclipse Window → Preferences → UIMA
Preferences) and checking or unchecking one of the following: <quote>Auto generate
.java files when defining types</quote> and <quote>Display fully qualified type
names.</quote></para>
<para id="ugr.tools.cde.auto_jcasgen">When the Auto generate option is checked and the development language for the AE is | |
Java, any time a change is made to a type and the change is saved, the corresponding .java | |
files are generated using the JCasGen tool. The results are stored in the primary source | |
directory defined for the project. The primary source directory is that listed first | |
when you right click on your project and select Properties → Java Build Path, click | |
on the Source tab and look in the list box under the text that reads: <quote>Source folder | |
on build path.</quote> If no source folders are defined, you will get a warning that you | |
have no source folders defined and JCasGen will not be run. (For information on JCasGen | |
see <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.jcasgen"/>). | |
When JCasGen is run, you can monitor the progress of the generation by observing the | |
status on the Eclipse status line (normally at the bottom of the Eclipse window). | |
JCasGen runs on the fully-merged type system, consisting of the type specification | |
plus any imported type system, plus (for aggregates) the merged type systems of all the | |
components in an aggregate.</para> | |
<warning><para>If the components of the aggregate have different definitions for the same | |
type name, the CDE will show a warning. It is possible to continue past this warning, | |
in which case the CDE will produce the correct | |
Java source files representing the merged types (that is, the | |
type definition that contains all of the features defined on that type by all of your | |
components). However, it is not recommended to use this feature | |
(of having different definitions for the same type name) since it can make it | |
difficult to combine/package your annotator with others. See <olink | |
targetdoc="&uima_docs_ref;" | |
targetptr="ugr.ref.jcas.merging_types_from_other_specs"/> for more information. | |
</para></warning> | |
<note><para>In addition to running automatically, you can manually run JCasGen on the | |
fully merged type system by clicking the JCasGen button, or by selecting Run JCasGen from | |
the UIMA pulldown menu: </para></note> | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.2in" format="JPG" fileref="&imgroot;image032.jpg"/> | |
</imageobject> | |
<textobject><phrase>Setting JCasGen options</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot> | |
<para>When <quote>Display fully qualified type names</quote> is left unchecked, the | |
namespace of types is not displayed, i.e. if a fully qualified type name is | |
my.namespace.person, only the abbreviated type name person will be displayed. In the | |
Type page diagram shown above, <quote>Display fully qualified type names</quote> is | |
in fact unchecked.</para> | |
<para>To add, edit, or remove types, use the buttons in the top left section. When
adding or editing types, fully qualified type names should be used,
regardless of whether <quote>Display fully qualified type names</quote> is
checked. Removing or editing a type has a cascading effect: the type
removal/edit will affect inputs, outputs, indexes and type priorities in the natural
way.</para>
<para>When a type is added, this dialog is shown: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="4.2in" format="JPG" fileref="&imgroot;image034.jpg"/> | |
</imageobject> | |
<textobject><phrase>Adding a type</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>Type names should be specified using a namespace. The namespace is like a Java
package name, and serves to ensure type names are unique. It also serves as the package
name for the generated JCas classes. The namespace name is the set of names up to the last
period in the string.</para>
<para>The supertype must be picked from an existing type. The entry field for the | |
supertype supports Eclipse-style content assist. To use it, put the cursor in the | |
supertype field, and type a letter or two of the supertype name (lower case is fine), | |
either starting with the name space, or just with the type name (without the name space), | |
and hold down the Control key and then press the spacebar. When you do this, you can see a | |
list of suitable matching types. You can then type more letters to narrow down your | |
choices, or pick the right entry with the mouse.</para> | |
<para>To see the available types and pick one, press the Browse button. This will show the | |
available types, and as you type letters for the type name (in lower case – | |
capitalization is ignored), the available types that match are narrowed. When | |
you've typed enough to specify the type you want, press Enter. Or you can use the | |
list of matching type names and pick the one you want with the mouse.</para> | |
<para>Once you've added the type, you can add features to it by highlighting the | |
type, and pressing the Add button.</para> | |
<para>If the type being defined is a subtype of uima.cas.String, the Add button allows you | |
to add allowed values for the string, instead of adding features.</para> | |
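<para>For orientation, a string subtype with allowed values might appear roughly as follows
in the <quote>Source</quote> view. This is only a sketch: the type name
<literal>my.namespace.Gender</literal> and its values are hypothetical.</para>
<programlisting><![CDATA[<typeDescription>
  <!-- hypothetical type; the namespace is everything up to the last period -->
  <name>my.namespace.Gender</name>
  <description/>
  <supertypeName>uima.cas.String</supertypeName>
  <!-- a subtype of uima.cas.String lists allowed values instead of features -->
  <allowedValues>
    <value>
      <string>female</string>
    </value>
    <value>
      <string>male</string>
    </value>
  </allowedValues>
</typeDescription>]]></programlisting>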
<para>To edit a type or feature, you can double-click the entry, or highlight the entry and
press the Edit button. To delete a type or feature, highlight the entry to be deleted,
and click the Delete button or press the Delete key.</para>
<para>If the range of a feature is an array or one of the built-in list types, an additional
specification allows you to specify whether multiple references to the object referenced by
this feature are allowed. If they are not allowed, then the XMI serialization of
instances of this type uses a more efficient format.</para>
<para>If the range of a feature is an array of Feature Structures, then it is possible to | |
specify an element type for the array. This information is used in the XMI serialization | |
and also by the JCas generation routines to generate more efficient code. | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="4.2in" format="JPG" fileref="&imgroot;image036.jpg"/> | |
</imageobject> | |
<textobject><phrase>Specifying a Feature Structure</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
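<para>The two feature options described above might show up in the descriptor XML along
these lines. This is a sketch under assumptions: the type and feature names
(<literal>my.namespace.Person</literal>, <literal>aliases</literal>,
<literal>my.namespace.Alias</literal>) are hypothetical.</para>
<programlisting><![CDATA[<typeDescription>
  <name>my.namespace.Person</name>
  <description/>
  <supertypeName>uima.tcas.Annotation</supertypeName>
  <features>
    <featureDescription>
      <!-- hypothetical feature whose range is an array of Feature Structures -->
      <name>aliases</name>
      <description/>
      <rangeTypeName>uima.cas.FSArray</rangeTypeName>
      <!-- element type of the array -->
      <elementType>my.namespace.Alias</elementType>
      <!-- false: no other feature refers to this same array object -->
      <multipleReferencesAllowed>false</multipleReferencesAllowed>
    </featureDescription>
  </features>
</typeDescription>]]></programlisting>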
<para>It is also possible to import type systems for inclusion in your descriptor. To do
this, use the Type Import panel's <literal>Add...</literal> button. This
allows you to import a type system descriptor.</para>
<para>When importing by name, the name is resolved using the class path for the Eclipse | |
project containing the descriptor file being edited, or by looking up this name in the | |
UIMA DataPath. The DataPath can be set by pushing the Set DataPath button. It will be | |
remembered for this Eclipse project, as a project Property, so you only have to set it | |
once (per project). The value of the DataPath setting is written just like a class path, | |
and can include directories or JAR files, just as is true for class paths.</para> | |
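<para>As a sketch of the two import styles, the fragment below shows one import by name and
one by location. The names and paths are hypothetical. An import by name uses dots as
separators and is looked up as a resource such as
<literal>my/namespace/MyTypeSystem.xml</literal> on the class path or DataPath.</para>
<programlisting><![CDATA[<typeSystemDescription>
  
</typeSystemDescription>]]></programlisting>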
<para>The following dialog allows you to pick one or more files from the Eclipse | |
workspace, or one file (at a time) from the file system: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="3.5in" format="JPG" fileref="&imgroot;import-chooser.jpg"/> | |
</imageobject> | |
<textobject><phrase>Picking files for importing</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>This is essentially the same dialog as was used to add component engines to an
aggregate. To import from a type system descriptor that is not part of your Eclipse
workspace, click the <quote>Browse the file system...</quote> button.</para>
<para>Imported types are validated, and if OK, they are added to the list in the Imported | |
Type Systems section of the Type System page. Any types they define are merged with the | |
existing type system.</para> | |
<para>Imported types and features which are only defined in imports are shown in the Type
System section, but in a grayed-out font; these types cannot be edited here. To change
them, open up the imported type system descriptor, and change them there.</para>
<para>If you hover the mouse over an import specification, it will show more information | |
about the import. If you right-click, it will bring up a context menu that allows opening | |
the imported file in the Editor, if the imported file is part of the Eclipse workspace. | |
Changes you make, however, won't be seen until you close and reopen the editor on | |
the importing file.</para> | |
<para>It is not possible to define types for an aggregate analysis engine. In this case the | |
type system is computed from the component AEs. The Type System information is shown in a | |
grayed-out font.</para> | |
<section id="ugr.tools.cde.type_system.exporting"> | |
<title>Exporting</title> | |
<para>In addition to importing type specifications, you can export as well. When you | |
push the Export... button, the editor will create a new importable XML descriptor for | |
the types in this type system, and change the existing descriptor to import that newly | |
created one. | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="3.75in" format="JPG" fileref="&imgroot;image040.jpg"/> | |
</imageobject> | |
<textobject><phrase>Exporting a type system</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>The base file name you type is inserted into the path in the line below | |
automatically. You can change the path where the generated part descriptor is stored | |
by overtyping the lower text box. When you click OK, the new part descriptor will be | |
generated, and the current descriptor will be changed to import that part.</para> | |
</section> | |
</section> | |
<section id="ugr.tools.cde.capabilities"> | |
<title>Capabilities Page</title> | |
<para>Capabilities come in <quote>sets</quote>. You can have multiple sets of | |
capabilities; each one specifies languages supported, plus inputs and outputs of the | |
Analysis Engine. The idea behind having multiple sets is the concept that different | |
inputs can result in different outputs. Many Analysis Engines, though, will probably | |
define just one set of capabilities. A sample Capabilities page is given below: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.2in" format="JPG" fileref="&imgroot;image042.jpg"/> | |
</imageobject> | |
<textobject><phrase>Capabilities page</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>When defining the capabilities of a primitive analysis engine, input and output | |
types can be any type defined in the type system. When defining the capabilities of an | |
aggregate the inputs must be a subset of the union of the inputs in the constituent | |
analysis engines and the outputs must be a subset of the union of the outputs of the | |
constituent analysis engines.</para> | |
<para>To add a type, first select something in the set you wish to add the type to, and press | |
Add Type. The following dialog appears presenting the user with a list of types which are | |
candidates for additional inputs: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="4.4in" format="JPG" fileref="&imgroot;image044.jpg"/> | |
</imageobject> | |
<textobject><phrase>Adding a type to the capabilities page</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>Follow the instructions to mark the types as input and / or output (a type can be
both). By default, the &lt;all features&gt; flag is set to true. If you want to specify a
subset of features of a type, read on.</para>
<para>When types have features, you can specify what features are input and / or output. A | |
type doesn't have to be an output to have an output feature. For example, an | |
Analysis Engine might be passed as input a type Token, and it adds (outputs) a feature to | |
the existing Token types. If no new Token instances were created, it would not be an | |
output Type, but it would have features which are output.</para> | |
<para>To specify features as input and / or output (they can be both), select a type, and | |
press Add. The following dialog box appears: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="4in" format="JPG" fileref="&imgroot;image046.jpg"/> | |
</imageobject> | |
<textobject><phrase>Specifying features as input or output</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>To mark a feature as being input and / or output, click the mouse in the input and / or
output column for the feature. If you select &lt;all features&gt;, it unmarks any
individual feature you selected, since &lt;all features&gt; subsumes all the
features.</para>
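<para>The Token example above could be expressed in the descriptor roughly as shown here.
Treat it as a sketch: the type <literal>my.namespace.Token</literal> and the feature
<literal>partOfSpeech</literal> are hypothetical names.</para>
<programlisting><![CDATA[<capabilities>
  <capability>
    <inputs>
      <!-- Token is passed in; allAnnotatorFeatures corresponds to <all features> -->
      <type allAnnotatorFeatures="true">my.namespace.Token</type>
    </inputs>
    <outputs>
      <!-- only a feature is output; no new Token instances are created -->
      <feature>my.namespace.Token:partOfSpeech</feature>
    </outputs>
  </capability>
</capabilities>]]></programlisting>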
<para>The Languages part of the capability is where you specify what languages are | |
supported by the Analysis Engine. Supported languages should be listed using either a | |
two letter ISO-639 language code, or an ISO-639 language code followed by a two-letter | |
ISO-3166 country code. Add a language by selecting Languages and pressing the Add | |
button. The dialog for adding languages is given below. | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="4in" format="JPG" fileref="&imgroot;image048.jpg"/> | |
</imageobject> | |
<textobject><phrase>Specifying a language</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
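<para>In the XML, the supported languages are listed inside the capability set. The sketch
below assumes an Analysis Engine that handles English in general and US English in
particular; the language choices are illustrative only.</para>
<programlisting><![CDATA[<capability>
  <inputs/>
  <outputs/>
  <languagesSupported>
    <!-- two-letter ISO-639 language code -->
    <language>en</language>
    <!-- language code followed by a two-letter ISO-3166 country code -->
    <language>en-US</language>
  </languagesSupported>
</capability>]]></programlisting>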
<para>The Sofa part of the capability is optional; it allows defining Sofa names that this | |
component uses, and whether they are input (meaning they are created outside of this | |
component, and passed into it), or output (meaning that they are created by this | |
component). Note that a Sofa can be either input or output, but can't be | |
both.</para> | |
<para>To add a Sofa name (which is synonymous with the view name), press the Add Sofa | |
button, and this dialog appears: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="4.2in" format="JPG" fileref="&imgroot;image050.jpg"/> | |
</imageobject> | |
<textobject><phrase>Specifying a Sofa name</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
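<para>Sofa names declared this way end up in the capability set, separated into inputs and
outputs. A minimal sketch, using the hypothetical Sofa names
<literal>MyInputSofa</literal> and <literal>MyOutputSofa</literal>:</para>
<programlisting><![CDATA[<capability>
  <inputs/>
  <outputs/>
  <!-- created outside this component and passed into it -->
  <inputSofas>
    <sofaName>MyInputSofa</sofaName>
  </inputSofas>
  <!-- created by this component -->
  <outputSofas>
    <sofaName>MyOutputSofa</sofaName>
  </outputSofas>
</capability>]]></programlisting>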
<section id="ugr.tools.cde.capabilities.sofa_name_mapping"> | |
<title>Sofa (and view) name mappings</title> | |
<para>Sofa names, once created, are used in Sofa Mappings. These are optional | |
mappings, done in an aggregate, that specify which Sofas are the same ones but with | |
different names. The Sofa Mappings section is minimized unless you are editing an | |
Aggregate descriptor, and have one or more Sofa names defined for the aggregate. In | |
that case, the Sofa Mappings section will look like this: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.4in" format="JPG" fileref="&imgroot;image052.jpg"/> | |
</imageobject> | |
<textobject><phrase>Sofa mappings</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>Here the aggregate has defined two input Sofas, named | |
<quote>MyInputSofa</quote>, and <quote>AnotherSofa</quote>. Any named sofas in | |
the aggregate's capabilities will appear in the Sofa Mapping section, listed | |
either under Inputs or Outputs. Each name in the Mappings has 0 or more delegate | |
(component) sofa names mapped to it. A delegate may have multiple Sofas, as in this | |
example, where the GovernmentOfficialRecognizer delegate has Sofas named | |
<quote>so1</quote> and <quote>so2</quote>.</para> | |
<para>Delegate components may be written as Single-View components. In this case, | |
they have one implicit, default Sofa (<quote>_InitialView</quote>), and to map to | |
it you use the form shown for the <quote>NameRecognizer</quote> – you map to | |
the delegate's key name in the aggregate, without specifying a Sofa name. You | |
can also specify the sofa name explicitly, e.g., | |
NameRecognizer/_InitialView.</para> | |
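<para>For reference, mappings like these are stored in the aggregate descriptor in a form
similar to the sketch below. Which delegate Sofa is mapped to which aggregate Sofa is an
assumption made for illustration; it is not taken from the screenshot.</para>
<programlisting><![CDATA[<sofaMappings>
  <!-- delegate Sofa "so1" is the same Sofa as the aggregate's "MyInputSofa" -->
  <sofaMapping>
    <componentKey>GovernmentOfficialRecognizer</componentKey>
    <componentSofaName>so1</componentSofaName>
    <aggregateSofaName>MyInputSofa</aggregateSofaName>
  </sofaMapping>
  <!-- Single-View delegate: componentSofaName omitted, so the default Sofa is mapped -->
  <sofaMapping>
    <componentKey>NameRecognizer</componentKey>
    <aggregateSofaName>AnotherSofa</aggregateSofaName>
  </sofaMapping>
</sofaMappings>]]></programlisting>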
<para>To add a new mapping, select the Aggregate Sofa name you wish to add the mapping | |
for, and press the Add button. This brings up a window like this, showing all available | |
delegates and their Sofas; select one or more (use the normal multi-select methods) | |
of these and press OK to add them. | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.84in" format="JPG" fileref="&imgroot;image054.jpg"/> | |
</imageobject> | |
<textobject><phrase>Adding a Sofa mapping</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>To edit an existing mapping, select the mapping and press Edit. This will show the | |
existing mapping with all mapped items <quote>selected</quote>, and other | |
available items unselected. Change the items selected to match what you want, | |
deselecting some, and perhaps selecting others, and press OK.</para> | |
</section> | |
</section> | |
<section id="ugr.tools.cde.indexes"> | |
<title>Indexes Page</title> | |
<para>The Indexes page is where the user declares what indexes and type priority lists are | |
used by the analysis engine. Indexes are used to determine which Feature | |
Structures of a particular type are fetched, using an iterator in the UIMA API. An | |
unpopulated Indexes page is displayed below: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.5in" format="JPG" fileref="&imgroot;image056.jpg"/> | |
</imageobject> | |
<textobject><phrase>Index page</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>Both indexes and type priority lists can have imports. These imports work just like | |
the type system imports, described above. Both indexes and type priority lists can be | |
exported to new component descriptors, using the Export... button, just like the type | |
system export operation described above.</para> | |
<para>The built-in Annotation Index is always present. It is based on the built-in type
<literal>uima.tcas.Annotation</literal> and has keys begin (Ascending), end
(Descending) and TYPE_PRIORITY. There are no built-in type priorities, so this last
sort item does not play a role in the index unless type priorities are specified.</para>
<para>Type priority may be combined with other keys. Type priorities are defined in the
Priority Lists section, using one or more priority lists. A given priority list gives an
ordering among a group of types. Types that appear higher in the priority list are given
higher priority; in other words, they sort first when TYPE_PRIORITY is specified as the
index key. Subtypes of these types are also ordered in a consistent manner, unless
overridden by another specific type priority specification. To get the ordering used
among all the types, all of the type priority lists are merged. This gives a partial
ordering among the types. Ties are resolved in an unspecified fashion. The Component
Descriptor Editor checks for incompatible orderings, and informs the user if they
exist, so they can be corrected.</para>
<para>To create a new index, use the Add Index button in the top left section. This brings up | |
this dialog: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="4in" format="JPG" fileref="&imgroot;image058.jpg"/> | |
</imageobject> | |
<textobject><phrase>Adding a new index</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>Each index needs a globally unique index name. Every index indexes one CAS type (including | |
its subtypes). If you're using Eclipse 3.2 or later, the entry field for this | |
has content assist (start typing the type name | |
and press Control – Spacebar to get help, or press the Browse button to pick a | |
type).</para> | |
<para>Indexes can be sorted, in which case you need to specify one or more keys to sort on. | |
Sort keys are selected from features whose range type is Integer, Float, or String. Some | |
elements will be disabled if they are not relevant. For instance, if the index kind is | |
<quote>bag</quote>, you cannot provide sort keys. The order of sort keys can be | |
adjusted using the up and down buttons, if necessary.</para> | |
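<para>A sorted index defined through this dialog might be written to the descriptor in a
form like the sketch below. The index label and type name are hypothetical; the keys mirror
the begin/end/type-priority pattern of the built-in Annotation Index.</para>
<programlisting><![CDATA[<fsIndexCollection>
  <fsIndexes>
    <fsIndexDescription>
      <!-- hypothetical index name; it must be globally unique -->
      <label>my.namespace.PersonIndex</label>
      <typeName>my.namespace.Person</typeName>
      <!-- kind is one of: sorted, set, bag -->
      <kind>sorted</kind>
      <keys>
        <fsIndexKey>
          <featureName>begin</featureName>
          <comparator>standard</comparator>   <!-- ascending -->
        </fsIndexKey>
        <fsIndexKey>
          <featureName>end</featureName>
          <comparator>reverse</comparator>    <!-- descending -->
        </fsIndexKey>
        <fsIndexKey>
          <typePriority/>
        </fsIndexKey>
      </keys>
    </fsIndexDescription>
  </fsIndexes>
</fsIndexCollection>]]></programlisting>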
<note><para>There is usually no need to explicitly declare a Bag index in your descriptor. | |
As of UIMA v2.1, if you do not declare any index for a type (or any of its | |
supertypes), a Bag index will be automatically created. This index is | |
accessed using the <literal>getAllIndexedFS(...)</literal> method defined on the index repository.</para></note> | |
<para>A set index will contain no duplicates of the same type, where a duplicate is defined | |
by the indexing comparator. That is, if you commit two feature structures of the same | |
type that are equal with respect to the indexing comparator, only the first one will be | |
entered into the index. Note that you can still have duplicates with respect to the | |
indexing order, if they are of a different type. A set index is not guaranteed to be | |
sorted. If no keys are specified for a set index, then all instances are considered by | |
default to be equal, so only the first instance (for a particular type or subtype of the | |
type being indexed) is indexed. On the other hand, <quote>bag</quote> indicates that | |
all annotation instances are indexed, including duplicates.</para> | |
<para>The Priority Lists section of the Indexes page is used to specify Priority Lists of | |
types. Priority Lists are unnamed ordered sets of type names. Add a new priority list by | |
clicking the Add Set button. Add a type to an existing priority list by first selecting | |
the set, and then clicking Add. You can use the up and down buttons to adjust the order as | |
necessary; these buttons move the selected item up or down.</para> | |
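<para>A priority list is stored simply as an ordered sequence of type names. In the sketch
below, the type names are hypothetical; the type listed first sorts first when
TYPE_PRIORITY is used as an index key.</para>
<programlisting><![CDATA[<typePriorities>
  <!-- one unnamed priority list; further <priorityList> elements may follow -->
  <priorityList>
    <type>my.namespace.Sentence</type>
    <type>my.namespace.Token</type>
  </priorityList>
</typePriorities>]]></programlisting>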
<para>Although it is possible to import self-contained index and type priority files, | |
the creation of such files is not yet supported by the Component Descriptor Editor. If | |
you create these files using another editor, they can be imported using the | |
corresponding Import panels, shown on the right. Imports are specified in the same | |
manner as they are for Type System imports.</para> | |
</section> | |
<section id="ugr.tools.cde.resources"> | |
<title>Resources Page</title> | |
<para>The Resources page describes resource dependencies (for primitive Analysis
Engines), external resource specifications, and the bindings between the external
resources and the resource dependencies.</para>
<para>Only primitive Analysis Engines define resource dependencies. Primitive and | |
Aggregate Analysis Engines can define external resources and connect them (bind them) | |
to resource dependencies.</para> | |
<para>When an Aggregate is providing an external resource to be bound to a dependency, the
binding is specified using a possibly multi-level path. The path starts at the Aggregate
and specifies which component (by its key name), and then, if that component is in turn an
Aggregate, which component (again by its key name), and so on until you reach a
primitive. The sequence of key names is made into the binding specification by joining
the parts with a <quote>/</quote> character. All of this is done for you by the Component
Descriptor Editor.</para>
<para>Any external resource provided by an Aggregate will override any binding provided | |
by any lower level component for the same resource dependency.</para> | |
<para>There are two views of the Resources page, depending on whether the Analysis Engine | |
is an Aggregate or Primitive. Here's the view for a Primitive: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5in" format="JPG" fileref="&imgroot;image060.jpg"/> | |
</imageobject> | |
<textobject><phrase>Resources page for a primitive</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>To declare a resource dependency, click the Add button in the right hand panel. This | |
puts up the dialog: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="4in" format="JPG" fileref="&imgroot;image062.jpg"/> | |
</imageobject> | |
<textobject><phrase>Specifying a resource dependency</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>The Key must be unique within the descriptor declaring it. The Interface, if | |
present, is the name of a Java interface the Analysis Engine uses to access the | |
resource.</para> | |
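<para>A dependency declared in this dialog corresponds to XML along the following lines.
The key <literal>AcronymTable</literal> and the interface
<literal>my.pkg.StringMapResource</literal> are hypothetical names used only for this
sketch.</para>
<programlisting><![CDATA[<externalResourceDependencies>
  <externalResourceDependency>
    <!-- key: unique within the descriptor declaring it -->
    <key>AcronymTable</key>
    <description/>
    <!-- optional Java interface the Analysis Engine uses to access the resource -->
    <interfaceName>my.pkg.StringMapResource</interfaceName>
    <optional>false</optional>
  </externalResourceDependency>
</externalResourceDependencies>]]></programlisting>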
<para>Declare the actual external resources on the left side of the page. Clicking
<quote>Add</quote> brings up this dialog:
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="4.2in" format="JPG" fileref="&imgroot;image064.jpg"/> | |
</imageobject> | |
<textobject><phrase>Specifying an External Resource</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>The Name must be unique within this Analysis Engine. The URL identifies a file
resource. If both the URL and URL suffix are used, the file resource is formed by
combining the first URL part with the language-identifier, followed by the URL suffix;
see <olink targetdoc="&uima_docs_ref;"
targetptr="ugr.ref.xml.component_descriptor.aes.primitive.resource_manager_configuration"/>.
URLs may be written as <quote>relative</quote> URLs; in this case they are resolved by
looking them up relative to the classpath and/or datapath. A relative URL has the path
part starting without an initial <quote>/</quote>; for example:
file:my/directory/file. An absolute URL starts with file:/ or file:/// or
file://some.network.address/. For more information about URLs, please read the
Javadoc for the Java class <quote>URL</quote>.</para>
<para>The Implementation is optional, and if given, must be a Java class that implements | |
the interface specified in any Resource Dependencies this resource is bound | |
to.</para> | |
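<para>The two URL forms might look as follows in the descriptor. This is a sketch: the
resource names, URLs and implementation class are hypothetical. The second resource uses
the URL plus URL suffix form, where the language identifier is inserted between the two
parts.</para>
<programlisting><![CDATA[<externalResources>
  <externalResource>
    <name>AcronymTableFile</name>
    <description/>
    <fileResourceSpecifier>
      <!-- relative URL: resolved against the classpath and/or datapath -->
      <fileUrl>file:my/directory/file</fileUrl>
    </fileResourceSpecifier>
    <!-- optional implementation class -->
    <implementationName>my.pkg.StringMapResource_impl</implementationName>
  </externalResource>
  <externalResource>
    <name>StopWordsByLanguage</name>
    <description/>
    <fileLanguageResourceSpecifier>
      <!-- prefix + language identifier + suffix form the file URL -->
      <fileUrlPrefix>file:my/directory/stopwords_</fileUrlPrefix>
      <fileUrlSuffix>.txt</fileUrlSuffix>
    </fileLanguageResourceSpecifier>
  </externalResource>
</externalResources>]]></programlisting>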
<section id="ugr.tools.cde.resources.binding"> | |
<title>Binding</title> | |
<para>Once you have an external resource definition, and a Resource Dependency, you | |
can bind them together. To do this, you select the two things (an external resource | |
definition, and a Resource Dependency) that you want to bind together, and click | |
Bind.</para> | |
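<para>The result of clicking Bind is recorded as a binding between a dependency key and a
resource name. The sketch below uses hypothetical names. The second binding illustrates the
path form used when an aggregate binds a resource to a dependency of a delegate whose key
is assumed to be <literal>annotator1</literal>.</para>
<programlisting><![CDATA[<externalResourceBindings>
  <!-- in a primitive: the key is the dependency key itself -->
  <externalResourceBinding>
    <key>AcronymTable</key>
    <resourceName>AcronymTableFile</resourceName>
  </externalResourceBinding>
  <!-- in an aggregate: delegate key names joined with "/" lead to the dependency -->
  <externalResourceBinding>
    <key>annotator1/AcronymTable</key>
    <resourceName>AcronymTableFile</resourceName>
  </externalResourceBinding>
</externalResourceBindings>]]></programlisting>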
</section> | |
<section id="ugr.tools.cde.resources.aggregates"> | |
<title>Resources with Aggregates</title> | |
<para>When editing an Aggregate Descriptor, the Resource definitions panel will show | |
all the resources at the primitive level, with paths down through the components | |
(multiple levels, if needed) to get to the primitives. The Aggregate can define | |
external resources, and bind them to one or more uses by the primitives.</para> | |
</section> | |
<section id="ugr.tools.cde.resources.imports_exports"> | |
<title>Imports and Exports</title> | |
<para>Resource definitions and their bindings can be imported, just like other | |
imports. Existing Resource definitions and their bindings can be exported to a new | |
importable part, and replaced with an import for that importable part, using the | |
<quote>Export...</quote> button, just like the similar function on the Type System | |
page.</para> | |
</section> | |
</section> | |
<section id="ugr.tools.cde.source"> | |
<title>Source Page</title> | |
<para>The Source page is a text view of the XML content of the Analysis Engine or Type System
being configured. An example of this page is displayed below:
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="5.84in" format="JPG" fileref="&imgroot;image066.jpg"/> | |
</imageobject> | |
<textobject><phrase>Source page</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>Changes made in the GUI are immediately reflected in the XML source, and changes
made in the XML source are immediately reflected back in the GUI. The thought here is that
the GUI view and the Source view are just two ways of looking at the same data. When the data
is in an unsaved state the file name is prefaced with an asterisk in the currently
selected file tab in the editor pane inside Eclipse (as in the example above).</para>
<para>You may accidentally create invalid descriptors or XML by editing directly in the
Source view. If you do this, the error will be detected and reported when you try to save
or when you switch to a different view. In the case of saving, the file will be saved,
even if it is in an error state.</para>
<section id="ugr.tools.cde.source.formatting"> | |
<title>Source formatting – indentation</title> | |
<para>The XML is indented using an indentation amount saved as a global UIMA
preference. To change this preference, use the Eclipse menu item: Window →
Preferences → UIMA Preferences.</para>
</section> | |
</section> | |
<section id="ugr.tools.cde.creating_self_contained_type_system"> | |
<title>Creating a Self-Contained Type System</title> | |
<para>It is also possible to use the Component Descriptor Editor to create or edit | |
self-contained type systems. To create a self-contained type system, select the menu | |
item File → New → Other and then select Type System Descriptor File. From the | |
next page of the selection wizard specify a Parent Folder and File name and click Finish. | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="3.5in" format="JPG" fileref="&imgroot;image068.jpg"/> | |
</imageobject> | |
<textobject><phrase>Working with a self-contained type system</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot> | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="3.5in" format="JPG" fileref="&imgroot;image070.jpg"/> | |
</imageobject> | |
<textobject><phrase></phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>This will take you to a version of the Component Descriptor Editor for editing a type | |
system file which contains just three pages: an overview page, a type system page, and a | |
source page. The overview page is a bit more spartan than in the case of an AE. It looks like | |
the following: | |
<screenshot> | |
<mediaobject> | |
<imageobject> | |
<imagedata width="3.7in" format="JPG" fileref="&imgroot;image072.jpg"/> | |
</imageobject> | |
<textobject><phrase>Editing a type system object</phrase> | |
</textobject> | |
</mediaobject> | |
</screenshot></para> | |
<para>Just like an AE has an associated name, version, vendor and description, the same is | |
true of a self-contained type system. The Type System page is identical to that in an AE | |
descriptor file, as is the Source page. Note that a self-contained type system can | |
import type systems just like the type system associated with an AE.</para> | |
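<para>For orientation, a self-contained type system descriptor has roughly the shape
sketched below. The name, version, vendor and import shown are hypothetical; the file
created by the wizard may differ in detail.</para>
<programlisting><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <!-- the fields shown on the overview page -->
  <name>MyTypeSystem</name>
  <description>Hypothetical example type system</description>
  <version>1.0</version>
  <vendor>Example Vendor</vendor>
  <!-- a self-contained type system can itself import other type systems -->
  
  <types>
    <typeDescription>
      <name>my.namespace.Person</name>
      <description/>
      <supertypeName>uima.tcas.Annotation</supertypeName>
    </typeDescription>
  </types>
</typeSystemDescription>]]></programlisting>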
<para>A type system component can also be created from an existing descriptor which | |
contains a type system definition section, by clicking on the Export... button on the | |
Type System page.</para> | |
</section> | |
<section id="ugr.tools.cde.creating_other_descriptor_components"> | |
<title>Creating Other Descriptor Components</title> | |
<para>The New wizard can create several other kinds of components: Collection
Processing Management (CPM) components, flow controllers, and importable parts
(in addition to Type Systems, described above, these are Indexes, Type Priorities, and
Resource Manager Configuration imports).</para>
<para>The CPM components supported by this editor include the Collection Reader, CAS | |
Initializer, and CAS Consumer descriptors. Each of these is basically treated just | |
like a primitive AE descriptor, with small changes to accommodate the different | |
semantics. For instance, a CAS Consumer can't declare in its capabilities | |
section that it outputs types or features.</para> | |
<para>Flow controllers are components that control the flow of CASes within an
aggregate, and are edited in a similar fashion to a primitive Analysis Engine.</para>
<para>The importable part support requires context information to enable the editor to | |
work, because much of the power of this editor comes from extensive checking that | |
requires additional information, other than what is available in just the importable | |
part. For instance, when you create or edit an Indexes import, the facility for adding | |
new indexes needs the type information, which is not present in this part when it is | |
edited alone. </para> | |
<para>To overcome this, when you edit these descriptors, you will be asked to | |
specify a context descriptor, usually a descriptor which would import the part being | |
edited, which would have the additional information needed. </para> | |
<para>Various methods are used
to guess what the context descriptor should be; if the guess is correct, you can just
press the Enter key to confirm. The last successful context file is remembered and will
be suggested as the context file to use at the next edit session.</para>
</section> | |
</chapter> |