<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>UIMA Tools Guide and Reference</title><link rel="stylesheet" type="text/css" href="css/stylesheet-html.css"><meta name="generator" content="DocBook XSL-NS Stylesheets V1.76.1"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div lang="en" class="book" title="UIMA Tools Guide and Reference" id="d5e1"><div xmlns:d="http://docbook.org/ns/docbook" class="titlepage"><div><div><h1 class="title">UIMA Tools Guide and Reference</h1></div><div><div class="authorgroup">
<h3 class="corpauthor">Written and maintained by the Apache UIMA&#8482; Development Community</h3>
</div></div><div><p class="releaseinfo">Version 3.1.1</p></div><div><p class="copyright">Copyright &copy; 2006, 2019 The Apache Software Foundation</p></div><div><div class="legalnotice" title="Legal Notice"><a name="d5e8"></a>
<p> </p>
<p title="License and Disclaimer">
<b>License and Disclaimer.&nbsp;</b>
The ASF licenses this documentation
to you under the Apache License, Version 2.0 (the
"License"); you may not use this documentation except in compliance
with the License. You may obtain a copy of the License at
</p><div class="blockquote"><blockquote class="blockquote">
<a class="ulink" href="http://www.apache.org/licenses/LICENSE-2.0" target="_top">http://www.apache.org/licenses/LICENSE-2.0</a>
</blockquote></div><p title="License and Disclaimer">
Unless required by applicable law or agreed to in writing,
this documentation and its contents are distributed under the License
on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
</p>
<p> </p>
<p> </p>
<p title="Trademarks">
<b>Trademarks.&nbsp;</b>
All terms mentioned in the text that are known to be trademarks or
service marks have been appropriately capitalized. Use of such terms
in this book should not be regarded as affecting the validity of the
trademark or service mark.
</p>
</div></div><div><p class="pubdate">November, 2019</p></div></div><hr></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="chapter"><a href="#ugr.tools.cde">1. CDE User's Guide</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cde.launching">1.1. Launching the Component Descriptor Editor</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.creating_new_ae_descriptor">1.2. Creating a New AE Descriptor</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.pages_within_the_editor">1.3. Pages within the Editor</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cde.adjusting_display_of_pages">1.3.1. Adjusting the display of pages</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.cde.overview_page">1.4. Overview Page</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cde.overview_page.implementation_details">1.4.1. Implementation Details</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.overview_page.runtime_info">1.4.2. Runtime Information</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.overview_page.overall_id_info">1.4.3. Overall Identification Information</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.cde.aggregate_page">1.5. Aggregate Page</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cde.aggregate_page.adding_components_more_than_once">1.5.1. Adding components more than once</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.aggregate_page.adding_removing_components_from_flow">1.5.2. Adding or Removing components in a flow</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.aggregate_page.adding_remote_aes">1.5.3. Adding remote Analysis Engines</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.aggregate_page.connecting_to_remote_services">1.5.4. 
Connecting to Remote Services</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.aggregate_page.finding_aes_by_searching">1.5.5. Finding Analysis Engines by searching</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.aggregate_page.component_engine_flow">1.5.6. Component Engine Flow</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.cde.parm_definition">1.6. Parameters Definition Page</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cde.parm_definition.using_groups">1.6.1. Using groups</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.parm_definition.adding">1.6.2. Adding or Editing a Parameter</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.parm_definition.aggregates">1.6.3. Parameter declarations for Aggregates</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.cde.parameter_settings">1.7. Parameter Settings Page</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.type_system">1.8. Type System Page</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cde.type_system.exporting">1.8.1. Exporting</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.cde.capabilities">1.9. Capabilities Page</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cde.capabilities.sofa_name_mapping">1.9.1. Sofa (and view) name mappings</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.cde.indexes">1.10. Indexes Page</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.resources">1.11. Resources Page</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cde.resources.binding">1.11.1. Binding</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.resources.aggregates">1.11.2. Resources with Aggregates</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.resources.imports_exports">1.11.3. 
Imports and Exports</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.cde.source">1.12. Source Page</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cde.source.formatting">1.12.1. Source formatting &#8211; indentation</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.cde.creating_self_contained_type_system">1.13. Creating a Self-Contained Type System</a></span></dt><dt><span class="section"><a href="#ugr.tools.cde.creating_other_descriptor_components">1.14. Creating Other Descriptor Components</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.tools.cpe">2. CPE Configurator User's Guide</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cpe.limitations">2.1. Limitations of the CPE Configurator</a></span></dt><dt><span class="section"><a href="#ugr.tools.cpe.starting">2.2. Starting the CPE Configurator</a></span></dt><dt><span class="section"><a href="#ugr.tools.cpe.selecting_component_descriptors">2.3. Selecting Component Descriptors</a></span></dt><dt><span class="section"><a href="#ugr.tools.cpe.running">2.4. Running a Collection Processing Engine</a></span></dt><dt><span class="section"><a href="#ugr.tools.cpe.file_menu">2.5. The File Menu</a></span></dt><dt><span class="section"><a href="#ugr.tools.cpe.help_menu">2.6. The Help Menu</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.tools.doc_analyzer">3. Document Analyzer User's Guide</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.doc_analyzer.starting">3.1. Starting the Document Analyzer</a></span></dt><dt><span class="section"><a href="#ugr.tools.doc_analyzer.running_an_ae">3.2. Running an AE</a></span></dt><dt><span class="section"><a href="#ugr.tools.doc_analyzer.viewing_results">3.3. Viewing the Analysis Results</a></span></dt><dt><span class="section"><a href="#ugr.tools.doc_analyzer.configuring">3.4. 
Configuring the Annotation Viewer</a></span></dt><dt><span class="section"><a href="#ugr.tools.doc_analyzer.interactive_mode">3.5. Interactive Mode</a></span></dt><dt><span class="section"><a href="#ugr.tools.doc_analyzer.view_mode">3.6. View Mode</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.tools.annotation_viewer">4. Annotation Viewer</a></span></dt><dt><span class="chapter"><a href="#ugr.tools.cvd">5. CAS Visual Debugger</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cvd.introduction">5.1. Introduction</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.cvd.introduction.running">5.1.1. Running CVD</a></span></dt><dt><span class="section"><a href="#cvd.introduction.commandline">5.1.2. Command line parameters</a></span></dt></dl></dd><dt><span class="section"><a href="#cvd.errorHandling">5.2. Error Handling</a></span></dt><dt><span class="section"><a href="#cvd.preferencesFile">5.3. Preferences File</a></span></dt><dt><span class="section"><a href="#cvd.theMenus">5.4. The Menus</a></span></dt><dd><dl><dt><span class="section"><a href="#cvd.fileMenu">5.4.1. The File Menu</a></span></dt><dt><span class="section"><a href="#cvd.editMenu">5.4.2. The Edit Menu</a></span></dt><dt><span class="section"><a href="#cvd.runMenu">5.4.3. The Run Menu</a></span></dt><dt><span class="section"><a href="#cvd.toolsMenu">5.4.4. The tools menu</a></span></dt></dl></dd><dt><span class="section"><a href="#cvd.mainDisplayArea">5.5. The Main Display Area</a></span></dt><dd><dl><dt><span class="section"><a href="#cvd.statusBar">5.5.1. The Status Bar</a></span></dt><dt><span class="section"><a href="#cvd.keyboardNavigation">5.5.2. Keyboard Navigation and Shortcuts</a></span></dt></dl></dd></dl></dd><dt><span class="chapter"><a href="#ugr.tools.eclipse_launcher">6. Eclipse Analysis Engine Launcher's Guide</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.eclipse_launcher.create_configuration">6.1. 
Creating an Analysis Engine launch configuration</a></span></dt><dt><span class="section"><a href="#ugr.tools.eclipse_launcher.launching">6.2. Launching an Analysis Engine</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.tools.ce">7. Cas Editor User's Guide</a></span></dt><dd><dl><dt><span class="section"><a href="#sandbox.caseditor.Introduction">7.1. Introduction</a></span></dt><dt><span class="section"><a href="#sandbox.caseditor.Launching">7.2. Launching the Cas Editor</a></span></dt><dd><dl><dt><span class="section"><a href="#sandbox.caseditor.typeSystemSpec">7.2.1. Specifying a type system</a></span></dt></dl></dd><dt><span class="section"><a href="#sandbox.caseditor.annotation_editor">7.3. Annotation editor</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cas_editor.annotation_editor.editor">7.3.1. Editor</a></span></dt><dt><span class="section"><a href="#sandbox.caseditor.annotation_editor.styling">7.3.2. Configure annotation styling</a></span></dt><dt><span class="section"><a href="#ugr.tools.cas_editor.annotation_editor.cas_views">7.3.3. CAS view support</a></span></dt><dt><span class="section"><a href="#ugr.tools.cas_editor.annotation_editor.outline">7.3.4. Outline view</a></span></dt><dt><span class="section"><a href="#ugr.tools.cas_editor.annotation_editor.properties_view">7.3.5. Edit Views</a></span></dt><dt><span class="section"><a href="#ugr.tools.cas_editor.annotation_editor.fs_view">7.3.6. FeatureStructure View</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.cas_editor.custom_view">7.4. Implementing a custom Cas Editor View</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.cas_editor.custom_view.sample">7.4.1. Annotation Status View Sample</a></span></dt></dl></dd></dl></dd><dt><span class="chapter"><a href="#ugr.tools.jcasgen">8. JCasGen User's Guide</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.jcasgen.running_without_eclipse">8.1. 
Running stand-alone without Eclipse</a></span></dt><dt><span class="section"><a href="#ugr.tools.jcasgen.running_standalone_with_eclipse">8.2. Running stand-alone with Eclipse</a></span></dt><dt><span class="section"><a href="#ugr.tools.jcasgen.running_within_eclipse">8.3. Running within Eclipse</a></span></dt><dt><span class="section"><a href="#ugr.tools.jcasgen.maven_plugin">8.4. Using the jcasgen-maven-plugin</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.tools.pear.packager">9. PEAR Packager User's Guide</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.pear.packager.using_eclipse_plugin">9.1. Using the PEAR Eclipse Plugin</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.pear.packager.add_uima_nature">9.1.1. Add UIMA Nature to your project</a></span></dt><dt><span class="section"><a href="#ugr.tools.pear.packager.using_pear_generation_wizard">9.1.2. Using the PEAR Generation Wizard</a></span></dt></dl></dd><dt><span class="section"><a href="#ugr.tools.pear.packager.using_command_line">9.2. Using the PEAR command line packager</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.tools.pear.packager.maven.plugin.usage">10. The PEAR Packaging Maven Plugin</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.pear.packager.maven.plugin.usage.configure">10.1. Specifying the PEAR Packaging Maven Plugin</a></span></dt><dt><span class="section"><a href="#ugr.tools.pear.packager.maven.plugin.usage.dependencies">10.2. Automatically including dependencies</a></span></dt><dt><span class="section"><a href="#ugr.tools.pear.packager.maven.plugin.commandline">10.3. Running from the command line</a></span></dt><dt><span class="section"><a href="#ugr.tools.pear.packager.maven.plugin.install.src">10.4. Building the PEAR Packaging Plugin From Source</a></span></dt></dl></dd><dt><span class="chapter"><a href="#ugr.tools.pear.installer">11. 
PEAR Installer User's Guide</a></span></dt><dt><span class="chapter"><a href="#ugr.tools.pear.merger">12. PEAR Merger User's Guide</a></span></dt><dd><dl><dt><span class="section"><a href="#ugr.tools.pear.merger.merge_details">12.1. Details of the merging process</a></span></dt><dt><span class="section"><a href="#ugr.tools.pear.merger.testing_modifying_resulting_pear">12.2. Testing and Modifying the resulting PEAR</a></span></dt><dt><span class="section"><a href="#ugr.tools.pear.merger.restrictions_limitations">12.3. Restrictions and Limitations</a></span></dt></dl></dd></dl></div>
<div class="chapter" title="Chapter&nbsp;1.&nbsp;Component Descriptor Editor User's Guide" id="ugr.tools.cde"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;1.&nbsp;Component Descriptor Editor User's Guide</h2></div></div></div>
<p>The Component Descriptor Editor is an Eclipse plug-in that provides a forms-based
interface for creating and editing UIMA XML descriptors. It supports most of the
descriptor formats, except the Collection Processing Engine descriptor, the PEAR
package descriptor and some remote deployment descriptors.</p>
<div class="section" title="1.1.&nbsp;Launching the Component Descriptor Editor"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.launching">1.1.&nbsp;Launching the Component Descriptor Editor</h2></div></div></div>
<p>Here's how to launch this tool on a descriptor contained in the examples. This
presumes you have installed the examples as described in the SDK Installation and Setup
chapter.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc" compact><li class="listitem"><p>Expand the uimaj-examples
project in the Eclipse Navigator or Package Explorer view</p></li><li class="listitem"><p>Within this project, browse to the file
descriptors/tutorial/ex1/RoomNumberAnnotator.xml.</p></li><li class="listitem"><p>Right-click on this file and select Open With <span class="symbol">&#8594;</span> Component
Descriptor Editor. (If this option is not present, check to make sure you installed
the plug-ins as described in <a href="overview_and_setup.html#ugr.ovv.eclipse_setup.installation" class="olink">Section&nbsp;3.1, &#8220;Installation&#8221;</a> of the <a href="overview_and_setup.html#d4e1" class="olink">UIMA Overview &amp; SDK Setup</a> book.
The EMF plugin is also
required.)</p></li><li class="listitem"><p>This should open a graphical editor and display the contents of the
RoomNumberAnnotator descriptor. </p></li></ul></div>
</div>
<div class="section" title="1.2.&nbsp;Creating a New AE Descriptor"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.creating_new_ae_descriptor">1.2.&nbsp;Creating a New AE Descriptor</h2></div></div></div>
<p>A new AE descriptor file may be created by selecting the File <span class="symbol">&#8594;</span> New <span class="symbol">&#8594;</span>
Other... menu. This brings up the following dialog:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.cde/image002.jpg" width="574" alt="Screenshot of selecting new UIMA component in Eclipse"></td></tr></table></div>
</div>
<p>If the user then selects UIMA and Analysis Engine Descriptor File, and clicks the
Next &gt; button, the following dialog is displayed. We will cover creating other kinds
of components later in the documentation.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="317"><tr><td><img src="images/tools/tools.cde/image004.jpg" width="317" alt="Screenshot of selecting new UIMA component in Eclipse after pushing Next"></td></tr></table></div>
</div>
<p>After entering the appropriate parent folder and file name, and clicking Finish,
an initial AE descriptor file is created with the given name, and the descriptor is
opened up within the Component Descriptor Editor.</p>
<p>At this point, the display inside the Component Descriptor Editor is the same
whether one started by creating a new AE descriptor, as in the preceding paragraph, or
one merely opened a previously created AE descriptor from, say, the Package Explorer
view. We show a previously created AE in the figure below:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.cde/image006.jpg" width="564" alt="Screenshot of CDE showing overview page"></td></tr></table></div>
</div>
<p>To see all the information shown in the main editor pane with less scrolling,
double-click the title tab to toggle between the <span class="quote">&#8220;<span class="quote">full screen</span>&#8221;</span> and normal
views.</p>
<p>To make the Component Descriptor Editor the default editor for all
.xml files, go to Window <span class="symbol">&#8594;</span> Preferences, select File Associations
on the left and *.xml on the right, then click Component Descriptor
Editor, the Default button, and finally OK. If AE and Type System descriptors are not the
primary .xml files you work with in the Eclipse environment, we recommend not
setting the Component Descriptor Editor as your default editor for all .xml files. To
open an .xml file with the Component Descriptor Editor when it is not the
default editor, right-click the file in the Package Explorer,
or other navigational view, and select Open With <span class="symbol">&#8594;</span> Component Descriptor Editor.
Eclipse remembers this choice for subsequent open operations.</p>
</div>
<div class="section" title="1.3.&nbsp;Pages within the Editor"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.pages_within_the_editor">1.3.&nbsp;Pages within the Editor</h2></div></div></div>
<p>The Component Descriptor Editor follows a standard Eclipse paradigm for these
kinds of editors. There are several pages in the editor; each one can be selected, one at a
time, by clicking on the bottom tabs. The last page contains the actual XML source file
being edited, and is displayed as plain text.</p>
<p>The same set of tabs appears at the bottom of each page in the Component Descriptor
Editor. The Component Descriptor Editor uses this <span class="quote">&#8220;<span class="quote">multi-page editor</span>&#8221;</span>
paradigm to give the user a view of conceptually distinct portions of the Descriptor
metadata in separate pages. At any point in time the user may click on the Source tab to
view the actual XML source. The Component Descriptor Editor is, in a way, just a fancy GUI
for editing the XML. The tabs provide quick access to the following pages: Overview,
Aggregate, Parameters, Parameter Settings, Type System, Capabilities, Indexes,
Resources, and Source. We discuss each of these pages in turn.</p>
<div class="section" title="1.3.1.&nbsp;Adjusting the display of pages"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.adjusting_display_of_pages">1.3.1.&nbsp;Adjusting the display of pages</h3></div></div></div>
<p>Most pages in the editor have a <span class="quote">&#8220;<span class="quote">sash</span>&#8221;</span> bar. This is a light gray bar
which separates sub-sections of the page. This bar can be dragged with the mouse to
adjust how the display area is split between the two sash panes. You can also change the
orientation of the sash so it splits vertically, instead of horizontally, by
clicking on the small icons at the top right of the page that look like this:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="69"><tr><td><img src="images/tools/tools.cde/image008.jpg" width="69" alt="Changing orientation of two window split"></td></tr></table></div>
</div>
<p>All of the sections on a page have subtitles, with an indicator to the left which
you can click to collapse or expand that particular section. Collapsing sections can
sometimes be useful to free up screen area for other sections.</p>
</div>
</div>
<div class="section" title="1.4.&nbsp;Overview Page"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.overview_page">1.4.&nbsp;Overview Page</h2></div></div></div>
<p>Normally, the first page displayed in the Component Descriptor Editor is the
Overview page (the name of the page is shown in the GUI panel at the top left). If there is an
error reading and parsing the source, the Source page is shown instead, giving you the
opportunity to correct the problem. For many components, the Overview page contains
three sections: Implementation Details, Runtime Information and overall
Identification Information.</p>
<div class="section" title="1.4.1.&nbsp;Implementation Details"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.overview_page.implementation_details">1.4.1.&nbsp;Implementation Details</h3></div></div></div>
<p>In the Implementation Details section you specify the Implementation Language
and Engine Type. There are two kinds of Engines: Aggregate, and non-Aggregate (also
called Primitive). An Aggregate engine is one which is composed of additional
component engines and contains no code itself. Several of the pages in the Component
Descriptor Editor have different formats, depending on the engine type.</p>
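<p>As a rough illustration, the two settings above correspond to elements like the
following on the Source page. This is a sketch of a primitive Java engine with the
remaining elements omitted, not output captured from the editor.</p>
<pre class="programlisting">
&lt;analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier"&gt;
  &lt;!-- Implementation Language: org.apache.uima.java or org.apache.uima.cpp --&gt;
  &lt;frameworkImplementation&gt;org.apache.uima.java&lt;/frameworkImplementation&gt;
  &lt;!-- Engine Type: true for a primitive engine, false for an aggregate --&gt;
  &lt;primitive&gt;true&lt;/primitive&gt;
  &lt;!-- remaining elements omitted --&gt;
&lt;/analysisEngineDescription&gt;
</pre>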
</div>
<div class="section" title="1.4.2.&nbsp;Runtime Information"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.overview_page.runtime_info">1.4.2.&nbsp;Runtime Information</h3></div></div></div>
<p>Runtime information is only applicable for primitive engines and is disabled
for aggregates and other kinds of descriptors. This is where you specify the class name of the annotator
implementation, if you are doing a Java implementation, or the C++ shared object or dll name,
if you are doing a C++ implementation. Most Analysis Engines will specify that
they update the CAS, and that they may be replicated (for performance reasons) when deployed. If
a particular Analysis Engine must see every CAS (for instance, if it is counting the
number of CASes), then uncheck the <span class="quote">&#8220;<span class="quote">multiple deployment allowed</span>&#8221;</span>
box. If the Analysis Engine doesn't update the CAS, uncheck the <span class="quote">&#8220;<span class="quote">updates
the CAS</span>&#8221;</span> box. (Most CAS Consumers do not update the CAS, and this parameter
defaults to unchecked for new CAS Consumer descriptors.)</p>
<p>Analysis engines that are written using the CAS Multiplier APIs
(see <a href="tutorials_and_users_guides.html#d5e1" class="olink">UIMA Tutorial and Developers' Guides</a>
<a href="tutorials_and_users_guides.html#ugr.tug.cm" class="olink">Chapter&nbsp;7, <i>CAS Multiplier Developer's Guide</i></a>)
can create additional CASes for analysis. To specify that they
do this, check the <span class="quote">&#8220;<span class="quote">returns new artifacts</span>&#8221;</span> box.</p>
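<p>The sketch below shows how the runtime settings described above might look in the
descriptor source for a primitive Java annotator. The class name is taken from the
tutorial example and is used here only for illustration. The three check boxes
correspond to the elements inside <code class="literal">operationalProperties</code>.</p>
<pre class="programlisting">
&lt;analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier"&gt;
  &lt;frameworkImplementation&gt;org.apache.uima.java&lt;/frameworkImplementation&gt;
  &lt;primitive&gt;true&lt;/primitive&gt;
  &lt;!-- class name of the annotator implementation (illustrative value) --&gt;
  &lt;annotatorImplementationName&gt;org.apache.uima.tutorial.ex1.RoomNumberAnnotator&lt;/annotatorImplementationName&gt;
  &lt;analysisEngineMetaData&gt;
    &lt;!-- other metadata omitted --&gt;
    &lt;operationalProperties&gt;
      &lt;!-- "updates the CAS" --&gt;
      &lt;modifiesCas&gt;true&lt;/modifiesCas&gt;
      &lt;!-- "multiple deployment allowed" --&gt;
      &lt;multipleDeploymentAllowed&gt;true&lt;/multipleDeploymentAllowed&gt;
      &lt;!-- "returns new artifacts" (CAS Multipliers only) --&gt;
      &lt;outputsNewCASes&gt;false&lt;/outputsNewCASes&gt;
    &lt;/operationalProperties&gt;
  &lt;/analysisEngineMetaData&gt;
&lt;/analysisEngineDescription&gt;
</pre>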
</div>
<div class="section" title="1.4.3.&nbsp;Overall Identification Information"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.overview_page.overall_id_info">1.4.3.&nbsp;Overall Identification Information</h3></div></div></div>
<p>The Name should be a human-readable name that describes this component. The
Version, Vendor, and Description fields are optional, and are arbitrary
strings.</p>
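<p>In the descriptor source, these fields appear as simple elements at the start of the
metadata section. The fragment below is a sketch, and all of its values are made up
for illustration.</p>
<pre class="programlisting">
&lt;analysisEngineMetaData&gt;
  &lt;!-- human-readable component name --&gt;
  &lt;name&gt;Room Number Annotator&lt;/name&gt;
  &lt;!-- optional free-form strings --&gt;
  &lt;description&gt;Finds room numbers in text.&lt;/description&gt;
  &lt;version&gt;1.0&lt;/version&gt;
  &lt;vendor&gt;Example Vendor&lt;/vendor&gt;
  &lt;!-- remaining metadata omitted --&gt;
&lt;/analysisEngineMetaData&gt;
</pre>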
</div>
</div>
<div class="section" title="1.5.&nbsp;Aggregate Page"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.aggregate_page">1.5.&nbsp;Aggregate Page</h2></div></div></div>
<p>For primitive Analysis Engines, Flow Controllers or Collection Processing
components, the Aggregate page is not used. For aggregate engines, the page looks like
this:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.cde/image010.jpg" width="564" alt="CDE Aggregate page"></td></tr></table></div>
</div>
<p>On the left we see a list of component engines, and on the right information about the
flow. If you hover the mouse over an item in the list of component engines, that
engine's description metadata will be shown. If you right-click on one of these
items, you get an option to open that delegate descriptor in another editor instance.
Any changes you make, however, won't be seen until you close and reopen the editor
on the importing file.</p>
<p>Engines can be added to the list on the left by clicking the Add button at the bottom of
the Component Engine section. This brings up one of the following two dialogs:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="384"><tr><td><img src="images/tools/tools.cde/import-by-location.jpg" width="384" alt="Adding an Analysis Engine to an Aggregate, by location"></td></tr></table></div>
</div>
<p>This dialog lets you select
a descriptor from your workspace, or browse the file system to select a descriptor.
</p>
<p>Or, if you have selected to import by name, this dialog is shown:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="524"><tr><td><img src="images/tools/tools.cde/import-by-name.jpg" width="524" alt="Adding an Analysis Engine to an Aggregate, by name"></td></tr></table></div>
</div>
<p>You can specify that the import should be by Name (the name is looked up using both the
Project's class path, and DataPath), or by location. If it is by name,
the dialog shows the available xml files on the class path, to pick from. If the
one you want isn't showing, this means it isn't on the enclosing Eclipse Java Project's
classpath, nor on the datapath, and one of those needs to be updated to include the
path to the resource. If the name picked is
<code class="literal">com/company/prod/xyz.xml</code>, the name in
the descriptor will be <span class="quote">&#8220;<span class="quote"><code class="literal">com.company.prod.xyz</code></span>&#8221;</span>.
The "Browse the file system..." button is disabled when import by name is checked, because
the file system is not the source of the imports; rather, it is the resources on the
classpath or datapath that are.</p>
<p>
If the import is by location, the file reference in the descriptor is converted to a
relative reference where possible.</p>
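<p>To make the difference between the two import styles concrete, the sketch below shows
one delegate imported by name and one imported by location. The keys and the relative
path are hypothetical. The name form uses the dotted name from the example above.</p>
<pre class="programlisting">
&lt;delegateAnalysisEngineSpecifiers&gt;
  &lt;!-- import by name: resolved through the classpath and datapath --&gt;
  &lt;delegateAnalysisEngine key="Xyz"&gt;
    &lt;import name="com.company.prod.xyz"/&gt;
  &lt;/delegateAnalysisEngine&gt;
  &lt;!-- import by location: a reference relative to the importing descriptor --&gt;
  &lt;delegateAnalysisEngine key="RoomNumber"&gt;
    &lt;import location="../ex1/RoomNumberAnnotator.xml"/&gt;
  &lt;/delegateAnalysisEngine&gt;
&lt;/delegateAnalysisEngineSpecifiers&gt;
</pre>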
<p>The final selection at the bottom tells whether or not the selected engine(s)
should automatically be added to the end of the flow section (the right section on the
Aggregate page). The OK button does not become activated until a descriptor
file is selected.</p>
<p>To remove an analysis engine from the component engine list simply select an engine
and click the Remove button, or press the delete key. If the engine is already in the flow
list you will be warned that deletion will also delete the specified engine from this
list.</p>
<div class="section" title="1.5.1.&nbsp;Adding components more than once"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.aggregate_page.adding_components_more_than_once">1.5.1.&nbsp;Adding components more than once</h3></div></div></div>
<p>Components may be added to the left panel more than once. Each of these components
will be given a key which is unique. A typical reason this might be done is to use a
component in a flow several times, but have each use be associated with different
configuration parameters (different configuration parameters can be associated
with each instance).</p>
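<p>A minimal sketch of what this could look like in the descriptor source follows: the
same descriptor is imported twice under two different keys. The key names and the file
name are hypothetical, and the keys the editor actually generates may differ.</p>
<pre class="programlisting">
&lt;delegateAnalysisEngineSpecifiers&gt;
  &lt;!-- first instance of the component --&gt;
  &lt;delegateAnalysisEngine key="Tagger"&gt;
    &lt;import location="Tagger.xml"/&gt;
  &lt;/delegateAnalysisEngine&gt;
  &lt;!-- second instance: same descriptor, different unique key --&gt;
  &lt;delegateAnalysisEngine key="Tagger2"&gt;
    &lt;import location="Tagger.xml"/&gt;
  &lt;/delegateAnalysisEngine&gt;
&lt;/delegateAnalysisEngineSpecifiers&gt;
</pre>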
</div>
<div class="section" title="1.5.2.&nbsp;Adding or Removing components in a flow"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.aggregate_page.adding_removing_components_from_flow">1.5.2.&nbsp;Adding or Removing components in a flow</h3></div></div></div>
<p>The button between the Component Engines and the Flow List, labeled
<code class="literal">&gt;&gt;</code>, adds a chosen engine to the flow list and the button
labeled <code class="literal">&lt;&lt;</code> removes an engine from the flow list. To add an
engine to the flow list you must first select an engine from the left-hand list, and then
press the <code class="literal">&gt;&gt;</code> button. Engines may appear any number of
times in the flow list. To remove an engine from the flow list, select an engine from the
right-hand list and press the <code class="literal">&lt;&lt;</code> button.</p>
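<p>For a fixed flow, the flow list is stored in the aggregate's metadata as an ordered list
of delegate keys. The fragment below is a sketch with hypothetical keys. It shows one
engine appearing twice in the flow.</p>
<pre class="programlisting">
&lt;analysisEngineMetaData&gt;
  &lt;!-- other metadata omitted --&gt;
  &lt;flowConstraints&gt;
    &lt;fixedFlow&gt;
      &lt;!-- each node names the key of a delegate engine, in execution order --&gt;
      &lt;node&gt;Tokenizer&lt;/node&gt;
      &lt;node&gt;RoomNumber&lt;/node&gt;
      &lt;node&gt;Tokenizer&lt;/node&gt;
    &lt;/fixedFlow&gt;
  &lt;/flowConstraints&gt;
&lt;/analysisEngineMetaData&gt;
</pre>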
</div>
<div class="section" title="1.5.3.&nbsp;Adding remote Analysis Engines"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.aggregate_page.adding_remote_aes">1.5.3.&nbsp;Adding remote Analysis Engines</h3></div></div></div>
<p>There are two ways to add remote engines: add an existing descriptor, which
specifies a remote engine (just as if you were adding a non-remote engine) or use the
Add Remote button which will create a remote descriptor, save it, and then import it,
all in one operation. The Add Remote button enables you to easily specify the
information needed to create a remote service descriptor for a remote AE - one that
runs on a different computer connected over the network. There are three kinds of
these: two are variants of the Service Client
descriptor, described in <a href="references.html#d5e1" class="olink">UIMA References</a> <a href="references.html#ugr.ref.xml.component_descriptor.service_client" class="olink">Section&nbsp;2.7, &#8220;Service Client Descriptors&#8221;</a>;
the other is the UIMA-AS JMS Service descriptor, described in the
<a href="uima_async_scaleout.pdf" class="olink">UIMA Asynchronous Scaleout</a> documentation. The Add
Remote button creates an instance of one of these descriptors,
saves it as a file in the workspace, and
imports it into the aggregate.</p>
<p>Of course, if you already have a remote service descriptor, you can add it to the
set of delegates using the <code class="code">Add</code> button, just like adding other kinds of analysis engines.</p>
<p>After clicking on <code class="code">Add Remote</code>, the following dialog is displayed:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="485"><tr><td><img src="images/tools/tools.cde/image014v2.jpg" width="485" alt="Adding a remote client to an aggregate"></td></tr></table></div>
</div>
<p>To define a remote service you specify the Service Kind, Protocol Service Type,
URI and Key. You can also specify a Timeout in milliseconds, used by the SOAP and
JMS services,
and a VNS Host and Port used by the Vinci Service.
The JMS service has additional timeouts and other parameters you may specify.
Just as when you add an engine from
the file system, you have the option of adding the engine to the end of the flow. The
Component Descriptor Editor currently only supports Vinci and SOAP services using
this dialog.</p>
<p>Remote engines are added to the descriptor using the
&lt;import ... &gt; syntax. The information you specify here is saved in the Eclipse
project as a file, using a generated name, &lt;key-name&gt;.xml, where
&lt;key-name&gt; is the name you listed as the Key. Because of this, the key-name must
be a valid file name. If you want a different name, you can change the path information
in the dialog box.</p>
</div>
<div class="section" title="1.5.4.&nbsp;Connecting to Remote Services"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.aggregate_page.connecting_to_remote_services">1.5.4.&nbsp;Connecting to Remote Services</h3></div></div></div>
<p>The Vinci protocol requires that you specify the location of
the Vinci Name Server (an IP address and a port number). You can specify these in the
service descriptor, or globally, for your Eclipse workspace, using the Eclipse menu
item: Window <span class="symbol">&#8594;</span> Preferences... <span class="symbol">&#8594;</span> UIMA Preferences.
</p>
<p>If the remote service
is available (up and running), additional operations become possible. For
instance, hovering the mouse over the remote descriptor will show the description
metadata from the remote service.</p>
</div>
<div class="section" title="1.5.5.&nbsp;Finding Analysis Engines by searching"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.aggregate_page.finding_aes_by_searching">1.5.5.&nbsp;Finding Analysis Engines by searching</h3></div></div></div>
<p>The next button that appears between the component engine list and the flow list
is the Find AE button. When this button is pressed the following dialog is displayed,
which allows one to search for AEs by name, by input or output types, or by a combination
of these criteria. This function searches the existing Eclipse workspace for
matching *.xml descriptor source files; it does not look inside Jar files.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="525"><tr><td><img src="images/tools/tools.cde/image016.jpg" width="525" alt="Searching for an AE to add to an aggregate"></td></tr></table></div>
</div>
<p>The search automatically adds a <span class="quote">&#8220;<span class="quote">match any characters</span>&#8221;</span> - style
(*) wildcard at the beginning and end of anything entered. Thus, if person is
specified for an output type, a <span class="quote">&#8220;<span class="quote">*person*</span>&#8221;</span> search is performed. Such a
search would match such things as <span class="quote">&#8220;<span class="quote">my.namespace.person</span>&#8221;</span> and
<span class="quote">&#8220;<span class="quote">person.governmentOfficial.</span>&#8221;</span> One can search in all projects or one
particular project. The search does an implicit <span class="emphasis"><em>and</em></span> on all
fields which are left non-blank.</p>
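<p>The matching rule described above can be sketched in a few lines of Java. This is
only an illustration, not the editor's actual code: the class and method names are
hypothetical, only two of the search fields are modeled, and the comparison is assumed
to be case-sensitive, which the dialog may or may not be.</p>

```java
import java.util.List;

/** Illustrative only: mimics the implicit "*text*" matching of the Find AE dialog. */
final class FindAeMatchSketch {

    /** A blank field places no constraint; otherwise "*pattern*" is a containment test. */
    static boolean fieldMatches(String pattern, String candidate) {
        if (pattern == null || pattern.trim().isEmpty()) {
            return true;
        }
        // Assumption: case-sensitive comparison.
        return candidate.contains(pattern.trim());
    }

    /** Implicit AND: every non-blank field must match for the descriptor to be listed. */
    static boolean descriptorMatches(String namePattern, String outputTypePattern,
                                     String aeName, List<String> outputTypes) {
        boolean nameOk = fieldMatches(namePattern, aeName);
        boolean outputBlank = outputTypePattern == null || outputTypePattern.trim().isEmpty();
        boolean outputOk = outputBlank
                || outputTypes.stream().anyMatch(t -> fieldMatches(outputTypePattern, t));
        return nameOk && outputOk;
    }

    public static void main(String[] args) {
        System.out.println(fieldMatches("person", "my.namespace.person"));
        System.out.println(fieldMatches("person", "person.governmentOfficial"));
    }
}
```

<p>Leaving a field blank means it does not take part in the implicit
<span class="emphasis"><em>and</em></span>, which is why the sketch treats a blank
pattern as matching everything.</p>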
</div>
<div class="section" title="1.5.6.&nbsp;Component Engine Flow"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.aggregate_page.component_engine_flow">1.5.6.&nbsp;Component Engine Flow</h3></div></div></div>
<p>The UIMA SDK currently supports three kinds of sequencing flows: Fixed,
CapabilityLanguageFlow, and user-defined
(see <a href="references.html#d5e1" class="olink">UIMA References</a> <a href="references.html#ugr.ref.xml.component_descriptor.aes.aggregate.flow_constraints" class="olink">Section&nbsp;2.4.2.3, &#8220;FlowConstraints&#8221;</a>).
The first two require specification of a linear flow sequence;
this linear flow sequence can also be read by a user-defined flow controller (what use
is made of it is up to the user-defined flow controller). The Component Engine Flow
section allows specification of these items.</p>
<p>The pull-down labeled Flow Kind picks between the three flow models. When the
user-defined flow is selected, the Browse and Search buttons become enabled to let
you pick the flow controller XML descriptor to import.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="376"><tr><td><img src="images/tools/tools.cde/image018.jpg" width="376" alt="Specifying flow control"></td></tr></table></div>
</div>
<p>The key name value is set automatically from the XML descriptor being imported,
and enables parameters to be overridden for that descriptor (see following
sections).</p>
<p>The Up and Down buttons to the right in the Flow section are activated when an
engine in the flow is selected. The Up button moves the selected engine up one place in
the execution order, and the Down button moves it down one place. Remember that
engines can appear multiple times in the flow (or not
at all).</p>
</div>
</div>
<div class="section" title="1.6.&nbsp;Parameters Definition Page"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.parm_definition">1.6.&nbsp;Parameters Definition Page</h2></div></div></div>
<p>There are two pages for parameters: the first is where parameters are defined,
and the second is where the parameter settings are configured. The first page is the
Parameter Definition page, which has two variants, depending on whether or not the
descriptor is an Aggregate. We start with a description of parameter definitions
for Primitive engines, CAS Consumers, Collection Readers, CAS Initializers, and Flow
Controllers. Here is an example:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="515"><tr><td><img src="images/tools/tools.cde/image020.jpg" width="515" alt="Parameter Definitions - not Aggregate"></td></tr></table></div>
</div>
<p>The first checkbox at the top simplifies things if you are not using Parameter
Groups (see the following section for a discussion of groups). In this case, leave the
check box unchecked. The main area shows a list of parameter definitions. Each
parameter has a name, which must be unique for this Analysis Engine. The first three
attributes specify whether the parameter can have a single or multiple values (an array
of values), whether it is Optional or Mandatory, and what value type it can hold
(String, Integer, Float, or Boolean). If an external override name has been specified,
an attribute of "XO" is included. See <a href="references.html#d5e1" class="olink">UIMA References</a>
<a href="references.html#ugr.ref.xml.component_descriptor.aes.external_configuration_parameter_overrides" class="olink">Section&nbsp;2.4.3.4, &#8220;External Configuration Parameter Overrides&#8221;</a>
for a discussion of external configuration parameter overrides.</p>
<p>In addition to using the buttons on the right to edit this information, you can
double-click a parameter to edit it, or remove (delete) a selected parameter by
pressing the delete key. Use the Add button to add a new parameter to the list.</p>
<p>Parameters have an additional description field, which you can specify when you
add or edit a parameter. To see the value of the description, hover the mouse over the
item, as shown in the picture below. If the parameter has an external override name its value
is included in the hover.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="545"><tr><td><img src="images/tools/tools.cde/image022.jpg" width="545" alt="Parameter description shown in a hover message"></td></tr></table></div>
</div>
<div class="section" title="1.6.1.&nbsp;Using groups"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.parm_definition.using_groups">1.6.1.&nbsp;Using groups</h3></div></div></div>
<p>The group concept for parameters arose from the observation that sets of
parameters were sometimes associated with different configuration needs. As an
example, you might have an Analysis Engine which needed different configuration
based on the language of a document.</p>
<p>To use groups, check the <span class="quote">&#8220;<span class="quote">Use Parameter Groups</span>&#8221;</span> box. This
lets you add groups and define parameters within these
groups. You can also define <span class="quote">&#8220;<span class="quote">Common</span>&#8221;</span> parameters,
which are defined for all groups. Here is a screenshot showing
some parameter groups in use:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="515"><tr><td><img src="images/tools/tools.cde/image024.jpg" width="515" alt="Using parameter groups"></td></tr></table></div>
</div>
<p>You can see the <span class="quote">&#8220;<span class="quote">&lt;Common&gt;</span>&#8221;</span> parameters as well as two
different sets of groups.</p>
<p>The Default Group is an optional specification of what Group to use if the
parameter is not available for the group requested.</p>
<p>The Search strategy specifies what to do when a parameter is not available for the
group requested. It can have the values of None, language_fallback, or
default_fallback. These are more fully described in the section
<a href="references.html#d5e1" class="olink">UIMA References</a> <a href="references.html#ugr.ref.xml.component_descriptor.aes.configuration_parameter_declaration" class="olink">Section&nbsp;2.4.3.1, &#8220;Configuration Parameter Declaration&#8221;</a>
.</p>
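<p>As a rough illustration of how the three search strategies differ, the following Java
sketch computes the order in which groups would be consulted. It is not UIMA code: the
class, enum and method names are hypothetical, the group names are assumed to be language
codes joined by hyphens (such as en-US), and the fallback order is one reading of the
description in the References section cited above, which remains the authoritative source.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: models the group search order, not the UIMA implementation. */
final class GroupLookupSketch {

    enum SearchStrategy { NONE, DEFAULT_FALLBACK, LANGUAGE_FALLBACK }

    /** Returns the groups to consult, in order, for the requested group. */
    static List<String> searchOrder(String requested, String defaultGroup,
                                    SearchStrategy strategy) {
        List<String> order = new ArrayList<>();
        order.add(requested);
        if (strategy == SearchStrategy.LANGUAGE_FALLBACK) {
            // Assumption: drop trailing "-XX" parts one at a time, e.g. "en-US" -> "en".
            String group = requested;
            int dash;
            while ((dash = group.lastIndexOf('-')) > 0) {
                group = group.substring(0, dash);
                order.add(group);
            }
        }
        // Both fallback strategies end with the default group, if one is declared.
        if (strategy != SearchStrategy.NONE && defaultGroup != null
                && !order.contains(defaultGroup)) {
            order.add(defaultGroup);
        }
        return order;
    }

    /** Looks a parameter up group by group; returns null when no group supplies a value. */
    static String lookup(Map<String, Map<String, String>> settings, String requested,
                         String param, String defaultGroup, SearchStrategy strategy) {
        for (String group : searchOrder(requested, defaultGroup, strategy)) {
            Map<String, String> values = settings.get(group);
            if (values != null && values.containsKey(param)) {
                return values.get(param);
            }
        }
        return null;
    }
}
```

<p>With a strategy of None, only the requested group is consulted, so a parameter that is
missing from that group simply has no value.</p>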
<p>Groups are added using the Add Group button. Once added, they can be edited or
removed, using the buttons to the right, or the standard gestures for editing
(double-clicking the item) and removing (pressing the delete key after an item is
selected). Removing a group removes all the parameter definitions in the group. If
you try to remove the <span class="quote">&#8220;<span class="quote">&lt;Common&gt;</span>&#8221;</span> group, only the
parameters in the group are removed.</p>
<p>Each entry for a group in the table specifies one or more group names. For example,
the highlighted entry above specifies two groups: <span class="quote">&#8220;<span class="quote">myNewGroup2</span>&#8221;</span>
and <span class="quote">&#8220;<span class="quote">mg3</span>&#8221;</span>. The parameter definition underneath is considered to be in
both groups.</p>
</div>
<div class="section" title="1.6.2.&nbsp;Adding or Editing a Parameter"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.parm_definition.adding">1.6.2.&nbsp;Adding or Editing a Parameter</h3></div></div></div>
<p>When creating or modifying a parameter, both a unique name and a valid type must be
specified. The Description and External Override fields are optional. The defaults for the two
checkboxes indicate a single-valued optional parameter in the example below:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="465"><tr><td><img src="images/tools/tools.cde/image025.jpg" width="465" alt="Aggregate parameters"></td></tr></table></div>
</div>
</div>
<div class="section" title="1.6.3.&nbsp;Parameter declarations for Aggregates"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.parm_definition.aggregates">1.6.3.&nbsp;Parameter declarations for Aggregates</h3></div></div></div>
<p>Aggregates declare parameters that must always override a parameter setting
for a component making up the aggregate. They do this using the version of this page
that is shown when the descriptor is an Aggregate; here's an example:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.cde/image026.jpg" width="564" alt="Aggregate parameters"></td></tr></table></div>
</div>
<p>There is an additional panel shown (on the right) which lists all of the
components by their key names, and shows for each of them their defined parameters. To
add a new override for one or more of these parameters to the aggregate, select the
component parameter you wish to override and push the Create Override button (or, you
can just double-click the component parameter). This will automatically add a
parameter of the same name (by default &#8211; you can change the name if you like) to
the aggregate, putting it into the same group(s) (if groups are being used in the
component &#8211; this is required), and setting the properties of the parameter to
match those of the component (this is required).</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>If the name of the parameter being added already is in use in the aggregate,
and the parameters are not compatible, a new parameter name is generated by suffixing
the name with a number. If the parameters are compatible, the selected component
parameter is added to the existing aggregate parameter, as an additional override. If
you don't want this behavior, but want to have a new name generated in this case,
push the Create non-shared Override button instead, or hold down the
<span class="quote">&#8220;<span class="quote">shift</span>&#8221;</span> key when double clicking the component parameter.</p>
<p>The required / optional setting in the aggregate parameter is set to match that of
the parameter being overridden. You may want to make an optional delegate parameter
required. You can do this by changing that value manually in the source editor view.
</p></div>
<p>In the above example, the user has just double-clicked the
<span class="quote">&#8220;<span class="quote">TypeNames</span>&#8221;</span> parameter in the <span class="quote">&#8220;<span class="quote">NameRecognizer</span>&#8221;</span>
component. This added that parameter to this aggregate under the <span class="quote">&#8220;<span class="quote">&lt;Not in
any group&gt;</span>&#8221;</span> section &#8211; since it wasn't part of a group.</p>
<p>Once you have added a parameter definition to the aggregate, you can use the
buttons on the right side of the left panel to add additional overrides or remove
parameters or their overrides. <span><a name="ugr.tools.cde.parm_definition.removing_groups"></a> You can also remove
groups; removing a group is like removing all the parameter definitions in the
group.</span></p>
<p>In addition to adding one parameter at a time from a component, you can also add all
the parameters for a group within a component, or all the parameters in the component,
by selecting those items.</p>
<p>If you double-click (or push Create Override) the
<span class="quote">&#8220;<span class="quote">&lt;Common&gt;</span>&#8221;</span> group or a parameter in the &lt;Common&gt; group in
a component, a special group is created in the Aggregate consisting of all of the
groups in that component, and the overriding parameter (or parameters) are added to
that. This is done because each component can have different groups belonging to the
Common group notion; the Common group for a component is just shorthand for all the
groups in that component.</p>
<p>The Aggregate's specification of the default group and search strategy
override any specifications contained in the components.</p>
</div>
</div>
<div class="section" title="1.7.&nbsp;Parameter Settings Page"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.parameter_settings">1.7.&nbsp;Parameter Settings Page</h2></div></div></div>
<p>The Parameter Settings page is rather straightforward; it is where the user
defines parameter settings for their engines. An example of such a page is given below:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.cde/image028.jpg" width="564" alt="Parameter settings page"></td></tr></table></div>
</div>
<p>For single-valued parameters, the user simply types the default value into the
Value box on the right-hand side. For multi-valued parameters, the user should use the
Add, Edit and Remove buttons to manage the list of multiple parameter values.</p>
<p>Values within groups are shown with each group separately displayed, to allow
configuring different values for each group.</p>
<p>Values are checked for validity. For Boolean values in a list, use the words
<code class="literal">true</code> or <code class="literal">false</code>.</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>If you specify a value in a single-valued parameter, and then delete all the
characters in the value, the CDE will treat this as if you wanted to not specify any setting
for this parameter. In order to specify a 0 length string setting for a String-valued
parameter, you will have to manually edit the XML using the <span class="quote">&#8220;<span class="quote">Source</span>&#8221;</span> tab.
</p>
<p> For array valued parameters, if you remove all of the entries for a particular array
parameter setting, the XML will reflect a 0-length array. To change this to an
unspecified parameter setting, you will have to manually edit the XML using the
<span class="quote">&#8220;<span class="quote">Source</span>&#8221;</span> tab. </p></div>
</div>
<div class="section" title="1.8.&nbsp;Type System Page"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.type_system">1.8.&nbsp;Type System Page</h2></div></div></div>
<p>This page declares the type system used by the annotator. For aggregates it is
derived by merging the type systems of all constituent AEs. The types used by the AE
constitute the language in which the inputs and outputs are described in the
Capabilities page and also affect the choice of indexes on the Indexes page. The Type
System page looks like the following:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="634"><tr><td><img src="images/tools/tools.cde/limitJCasGenType.jpg" width="634" alt="Type System declaration page"></td></tr></table></div>
</div>
<p>Before discussing this page in detail, it is important to note that there are three
settings that affect the operation of this page. These are accessed by selecting
UIMA <span class="symbol">&#8594;</span> Settings (or by going to the Eclipse Window <span class="symbol">&#8594;</span> Preferences <span class="symbol">&#8594;</span> UIMA
Preferences) and checking or unchecking one of the following: <span class="quote">&#8220;<span class="quote">Auto generate
.java files when defining types</span>&#8221;</span>,
<span class="quote">&#8220;<span class="quote">Generate JCasGen classes only for types defined within the local project scope</span>&#8221;</span>
and <span class="quote">&#8220;<span class="quote">Display fully qualified type
names.</span>&#8221;</span></p>
<p><a name="ugr.tools.cde.auto_jcasgen"></a>When the Auto generate option is checked and the development language for the AE is
Java, any time a change is made to a type and the change is saved, the corresponding .java
files are generated using the JCasGen tool. The results are stored in the primary source
directory defined for the project. The primary source directory is that listed first
when you right click on your project and select Properties <span class="symbol">&#8594;</span> Java Build Path, click
on the Source tab and look in the list box under the text that reads: <span class="quote">&#8220;<span class="quote">Source folder
on build path.</span>&#8221;</span> If no source folders are defined, you will get a warning that you
have no source folders defined and JCasGen will not be run. (For information on JCasGen
see <a href="tools.html#d5e1" class="olink">UIMA Tools Guide and Reference</a>
<a href="tools.html#ugr.tools.jcasgen" class="olink">Chapter&nbsp;8, <i>JCasGen User's Guide</i></a>).
When JCasGen is run, you can monitor the progress of the generation by observing the
status on the Eclipse status line (normally at the bottom of the Eclipse window).
JCasGen runs on the fully-merged type system, consisting of the type specification
plus any imported type system, plus (for aggregates) the merged type systems of all the
components in an aggregate.</p>
<div class="warning" title="Warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>If the components of the aggregate have different definitions for the same
type name, the CDE will show a warning. It is possible to continue past this warning,
in which case the CDE will produce the correct
Java source files representing the merged types (that is, the
type definition that contains all of the features defined on that type by all of your
components). However, it is not recommended to use this feature
(of having different definitions for the same type name) since it can make it
difficult to combine/package your annotator with others. See <a href="references.html#d5e1" class="olink">UIMA References</a>
<a href="references.html#ugr.ref.jcas.merging_types_from_other_specs" class="olink">Section&nbsp;5.5, &#8220;Merging Types&#8221;</a> for more information.
</p></div>
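<p>To make the idea of a merged type concrete, here is a small Java sketch that combines
the feature lists of several definitions of the same type name into one definition
holding every feature. It is illustrative only and does not use the UIMA API: the class
and method names are hypothetical, a definition is reduced to a map from feature name to
range type, and rejecting a feature declared with two different range types is an
assumption about how such a conflict would be handled.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: merges feature maps (feature name -> range type) for one type name. */
final class TypeMergeSketch {

    @SafeVarargs
    static Map<String, String> mergeFeatures(Map<String, String>... definitions) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> definition : definitions) {
            for (Map.Entry<String, String> feature : definition.entrySet()) {
                String previous = merged.putIfAbsent(feature.getKey(), feature.getValue());
                // Assumption: the same feature with a different range type cannot be merged.
                if (previous != null && !previous.equals(feature.getValue())) {
                    throw new IllegalArgumentException(
                            "Incompatible range types for feature " + feature.getKey());
                }
            }
        }
        return merged;
    }
}
```

<p>The generated Java class for such a type needs accessors for every feature in the
merged map, which is why JCasGen is run on the fully merged type system.</p>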
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>In addition to running automatically, you can manually run JCasGen on the
fully merged type system by clicking the JCasGen button, or by selecting Run JCasGen from
the UIMA pulldown menu: </p></div>
<div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="515"><tr><td><img src="images/tools/tools.cde/image032.jpg" width="515" alt="Setting JCasGen options"></td></tr></table></div>
</div>
<p>When <span class="quote">&#8220;<span class="quote">Generate JCasGen classes only for types defined within the local project scope</span>&#8221;</span>
is checked, then JCasGen skips generating classes for types that are imported from sources outside this project.
This might be done, for instance, if you have an aggregate which is importing type systems from its delegates,
some of which are defined in other projects, and have JCasGen'd files already present in those other projects.
</p>
<p>The UIMA settings and preferences for controlling this are used to initialize a particular instance of the
editor, when it is started. Following that, you can override this setting, just for that editor, by checking or
unchecking the box shown on the type system page:</p>
<div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="72"><tr><td><img src="images/tools/tools.cde/limitJCasGen.jpg" width="72" alt="Setting JCasGen options"></td></tr></table></div>
</div>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>If this is checked, and one of the types that would be excluded has merged type features, an error message
is issued - because JCasGen will need to be run for the combined (merged) type in order to get a class definition
that will work for this configuration (have access to all the features). If this happens, you have to run without
limiting JCasGen, and manually delete any duplicated/unwanted source results.</p></div>
<p>When <span class="quote">&#8220;<span class="quote">Display fully qualified type names</span>&#8221;</span> is left unchecked, the
namespace of types is not displayed, i.e. if a fully qualified type name is
my.namespace.person, only the abbreviated type name person will be displayed. In the
Type page diagram shown above, <span class="quote">&#8220;<span class="quote">Display fully qualified type names</span>&#8221;</span> is
in fact unchecked.</p>
<p>To add, edit, or remove types, use the buttons in the top left section. When
adding or editing types, fully qualified type names should be used,
regardless of whether <span class="quote">&#8220;<span class="quote">Display fully qualified type names</span>&#8221;</span> is
checked. Removing or editing a type has a cascading effect, in that the type
removal/edit will affect inputs, outputs, indexes and type priorities in the natural
way.</p>
<p>When a type is added, this dialog is shown:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="416"><tr><td><img src="images/tools/tools.cde/image034.jpg" width="416" alt="Adding a type"></td></tr></table></div>
</div>
<p>Type names should be specified using a namespace. The namespace is like a Java
package name, and serves to ensure type names are unique. It also serves as the package
name for the generated JCas classes. The namespace name is the set of names up to the last
period in the string.</p>
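<p>The split between namespace and short type name can be illustrated with a few lines of
Java. The class and method names here are hypothetical and are not part of the UIMA API;
the sketch only restates the rule that the namespace is everything up to the last
period.</p>

```java
/** Illustrative only: splits a fully qualified type name at its last period. */
final class TypeNameSketch {

    /** Everything before the last period, or the empty string if there is no period. */
    static String namespaceOf(String qualifiedTypeName) {
        int lastDot = qualifiedTypeName.lastIndexOf('.');
        return lastDot < 0 ? "" : qualifiedTypeName.substring(0, lastDot);
    }

    /** Everything after the last period: the abbreviated name shown when namespaces are hidden. */
    static String shortNameOf(String qualifiedTypeName) {
        return qualifiedTypeName.substring(qualifiedTypeName.lastIndexOf('.') + 1);
    }

    public static void main(String[] args) {
        System.out.println(namespaceOf("my.namespace.person"));
        System.out.println(shortNameOf("my.namespace.person"));
    }
}
```

<p>For the type my.namespace.person, the namespace is my.namespace and the short name
is person.</p>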
<p>The supertype must be picked from an existing type. The entry field for the
supertype supports Eclipse-style content assist. To use it, put the cursor in the
supertype field, and type a letter or two of the supertype name (lower case is fine),
either starting with the name space, or just with the type name (without the name space),
and hold down the Control key and then press the spacebar. When you do this, you can see a
list of suitable matching types. You can then type more letters to narrow down your
choices, or pick the right entry with the mouse.</p>
<p>To see the available types and pick one, press the Browse button. This will show the
available types, and as you type letters for the type name (in lower case &#8211;
capitalization is ignored), the available types that match are narrowed. When
you've typed enough to specify the type you want, press Enter. Or you can use the
list of matching type names and pick the one you want with the mouse.</p>
<p>Once you've added the type, you can add features to it by highlighting the
type, and pressing the Add button.</p>
<p>If the type being defined is a subtype of uima.cas.String, the Add button allows you
to add allowed values for the string, instead of adding features.</p>
<p>To edit a type or feature, you can double click the entry, or highlight the entry and
press the Edit button. To delete a type or feature, you highlight the entry to be deleted,
and click the delete button or push the delete key.</p>
<p>If the range of a feature is an array or one of the built-in list types, an additional
specification allows you to specify if multiple references to the object referenced by
this feature are allowed. If they are not allowed then the XMI serialization of
instances of this type use a more efficient format.</p>
<p>If the range of a feature is an array of Feature Structures, then it is possible to
specify an element type for the array. This information is used in the XMI serialization
and also by the JCas generation routines to generate more efficient code.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="416"><tr><td><img src="images/tools/tools.cde/image036.jpg" width="416" alt="Specifying a Feature Structure"></td></tr></table></div>
</div>
<p>It is also possible to import type systems for inclusion in your descriptor. To do
this, use the Type Import panel's <code class="literal">Add...</code> button. This
allows you to import a type system descriptor.</p>
<p>When importing by name, the name is resolved using the class path for the Eclipse
project containing the descriptor file being edited, or by looking up this name in the
UIMA DataPath. The DataPath can be set by pushing the Set DataPath button. It will be
remembered for this Eclipse project, as a project Property, so you only have to set it
once (per project). The value of the DataPath setting is written just like a class path,
and can include directories or JAR files, just as is true for class paths.</p>
<p>The following dialog allows you to pick one or more files from the Eclipse
workspace, or one file (at a time) from the file system:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="347"><tr><td><img src="images/tools/tools.cde/import-chooser.jpg" width="347" alt="Picking files for importing"></td></tr></table></div>
</div>
<p>This is essentially the same dialog as was used to add component engines to an
aggregate. To import from a type system descriptor that is not part of your Eclipse
workspace, click the Browse the file system... button.</p>
<p>Imported types are validated, and if OK, they are added to the list in the Imported
Type Systems section of the Type System page. Any types they define are merged with the
existing type system.</p>
<p>Imported types and features which are only defined in imports are shown in the Type
System section, but in a grayed-out font; these types cannot be edited here. To change
them, open up the imported type system descriptor, and change them there.</p>
<p>If you hover the mouse over an import specification, it will show more information
about the import. If you right-click, it will bring up a context menu that allows opening
the imported file in the Editor, if the imported file is part of the Eclipse workspace.
Changes you make, however, won't be seen until you close and reopen the editor on
the importing file.</p>
<p>It is not possible to define types for an aggregate analysis engine. In this case the
type system is computed from the component AEs. The Type System information is shown in a
grayed-out font.</p>
<div class="section" title="1.8.1.&nbsp;Exporting"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.type_system.exporting">1.8.1.&nbsp;Exporting</h3></div></div></div>
<p>In addition to importing type specifications, you can export as well. When you
push the Export... button, the editor will create a new importable XML descriptor for
the types in this type system, and change the existing descriptor to import that newly
created one.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="371"><tr><td><img src="images/tools/tools.cde/image040.jpg" width="371" alt="Exporting a type system"></td></tr></table></div>
</div>
<p>The base file name you type is inserted into the path in the line below
automatically. You can change the path where the generated part descriptor is stored
by overtyping the lower text box. When you click OK, the new part descriptor will be
generated, and the current descriptor will be changed to import that part.</p>
</div>
</div>
<div class="section" title="1.9.&nbsp;Capabilities Page"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.capabilities">1.9.&nbsp;Capabilities Page</h2></div></div></div>
<p>Capabilities come in <span class="quote">&#8220;<span class="quote">sets</span>&#8221;</span>. You can have multiple sets of
capabilities; each one specifies languages supported, plus inputs and outputs of the
Analysis Engine. Multiple sets allow for the case where different
inputs result in different outputs. Many Analysis Engines, though, will probably
define just one set of capabilities. A sample Capabilities page is given below:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="515"><tr><td><img src="images/tools/tools.cde/image042.jpg" width="515" alt="Capabilities page"></td></tr></table></div>
</div>
<p>When defining the capabilities of a primitive analysis engine, input and output
types can be any type defined in the type system. When defining the capabilities of an
aggregate, the inputs must be a subset of the union of the inputs of the constituent
analysis engines, and the outputs must be a subset of the union of the outputs of the
constituent analysis engines.</p>
<p>To add a type, first select something in the set you wish to add the type to, and press
Add Type. The following dialog appears presenting the user with a list of types which are
candidates for additional inputs:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="436"><tr><td><img src="images/tools/tools.cde/image044.jpg" width="436" alt="Adding a type to the capabilities page"></td></tr></table></div>
</div>
<p>Follow the instructions to mark the types as input and / or output (a type can be
both). By default, the &lt;all features&gt; flag is set to true. If you want to specify a
subset of features of a type, read on.</p>
<p>When types have features, you can specify what features are input and / or output. A
type doesn't have to be an output to have an output feature. For example, an
Analysis Engine might be passed as input a type Token, and it adds (outputs) a feature to
the existing Token types. If no new Token instances were created, it would not be an
output Type, but it would have features which are output.</p>
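<p>The Token example above might appear in the descriptor XML roughly as follows. This is
a sketch only: the package name <code class="literal">org.example</code> and the feature name
<code class="literal">partOfSpeech</code> are hypothetical, and the
<code class="literal">allAnnotatorFeatures</code> attribute is assumed here to be the XML
counterpart of the editor's &lt;all features&gt; flag.</p>
<pre class="programlisting">&lt;capabilities&gt;
  &lt;capability&gt;
    &lt;inputs&gt;
      &lt;!-- Token instances are passed in from outside this Analysis Engine --&gt;
      &lt;type allAnnotatorFeatures="true"&gt;org.example.Token&lt;/type&gt;
    &lt;/inputs&gt;
    &lt;outputs&gt;
      &lt;!-- only a feature is output; Token itself is not listed as an output type --&gt;
      &lt;feature&gt;org.example.Token:partOfSpeech&lt;/feature&gt;
    &lt;/outputs&gt;
  &lt;/capability&gt;
&lt;/capabilities&gt;</pre>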
<p>To specify features as input and / or output (they can be both), select a type, and
press Add. The following dialog box appears:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="396"><tr><td><img src="images/tools/tools.cde/image046.jpg" width="396" alt="Specifying features as input or output"></td></tr></table></div>
</div>
<p>To mark a feature as being input and / or output, click the mouse in the input and / or
output column for the feature. If you select &lt;all features&gt;, it unmarks any
individual feature you selected, since &lt;all features&gt; subsumes all the
features.</p>
<p>The Languages part of the capability is where you specify what languages are
supported by the Analysis Engine. Supported languages should be listed using either a
two letter ISO-639 language code, or an ISO-639 language code followed by a hyphen and then a two-letter
ISO-3166 country code. Add a language by selecting Languages and pressing the Add
button. The dialog for adding languages is given below.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="396"><tr><td><img src="images/tools/tools.cde/image048.jpg" width="396" alt="Specifying a language"></td></tr></table></div>
</div>
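<p>In the descriptor XML, the languages of a capability set are recorded along these
lines. The particular codes are only examples of the two accepted forms (a bare
language code, and a language code with a country code):</p>
<pre class="programlisting">&lt;capability&gt;
  &lt;inputs/&gt;
  &lt;outputs/&gt;
  &lt;languagesSupported&gt;
    &lt;!-- two-letter ISO-639 language code --&gt;
    &lt;language&gt;en&lt;/language&gt;
    &lt;!-- language code, hyphen, two-letter ISO-3166 country code --&gt;
    &lt;language&gt;pt-BR&lt;/language&gt;
  &lt;/languagesSupported&gt;
&lt;/capability&gt;</pre>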
<p>The Sofa part of the capability is optional; it allows defining Sofa names that this
component uses, and whether they are input (meaning they are created outside of this
component, and passed into it), or output (meaning that they are created by this
component). Note that a Sofa can be either input or output, but can't be
both.</p>
<p>To add a Sofa name (which is synonymous with the view name), press the Add Sofa
button, and this dialog appears:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="416"><tr><td><img src="images/tools/tools.cde/image050.jpg" width="416" alt="Specifying a Sofa name"></td></tr></table></div>
</div>
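<p>A capability set that declares Sofas might look roughly like this in the descriptor
XML. The Sofa names <code class="literal">SourceText</code> and
<code class="literal">TranslatedText</code> are hypothetical, chosen only to show one
input and one output:</p>
<pre class="programlisting">&lt;capability&gt;
  &lt;inputs/&gt;
  &lt;outputs/&gt;
  &lt;inputSofas&gt;
    &lt;!-- created outside this component and passed into it --&gt;
    &lt;sofaName&gt;SourceText&lt;/sofaName&gt;
  &lt;/inputSofas&gt;
  &lt;outputSofas&gt;
    &lt;!-- created by this component --&gt;
    &lt;sofaName&gt;TranslatedText&lt;/sofaName&gt;
  &lt;/outputSofas&gt;
&lt;/capability&gt;</pre>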
<div class="section" title="1.9.1.&nbsp;Sofa (and view) name mappings"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.capabilities.sofa_name_mapping">1.9.1.&nbsp;Sofa (and view) name mappings</h3></div></div></div>
<p>Sofa names, once created, are used in Sofa Mappings. These are optional
mappings, done in an aggregate, that specify which Sofas are the same ones but with
different names. The Sofa Mappings section is minimized unless you are editing an
Aggregate descriptor, and have one or more Sofa names defined for the aggregate. In
that case, the Sofa Mappings section will look like this:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="535"><tr><td><img src="images/tools/tools.cde/image052.jpg" width="535" alt="Sofa mappings"></td></tr></table></div>
</div>
<p>Here the aggregate has defined two input Sofas, named
<span class="quote">&#8220;<span class="quote">MyInputSofa</span>&#8221;</span>, and <span class="quote">&#8220;<span class="quote">AnotherSofa</span>&#8221;</span>. Any named sofas in
the aggregate's capabilities will appear in the Sofa Mapping section, listed
either under Inputs or Outputs. Each name in the Mappings has 0 or more delegate
(component) sofa names mapped to it. A delegate may have multiple Sofas, as in this
example, where the GovernmentOfficialRecognizer delegate has Sofas named
<span class="quote">&#8220;<span class="quote">so1</span>&#8221;</span> and <span class="quote">&#8220;<span class="quote">so2</span>&#8221;</span>.</p>
<p>Delegate components may be written as Single-View components. In this case,
they have one implicit, default Sofa (<span class="quote">&#8220;<span class="quote">_InitialView</span>&#8221;</span>), and to map to
it you use the form shown for the <span class="quote">&#8220;<span class="quote">NameRecognizer</span>&#8221;</span> &#8211; you map to
the delegate's key name in the aggregate, without specifying a Sofa name. You
can also specify the sofa name explicitly, e.g.,
NameRecognizer/_InitialView.</p>
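<p>In the aggregate descriptor's XML, mappings of this kind could be written roughly as
shown below. The names come from the example above, but the particular pairings are
assumptions made for this sketch, since the actual pairings are shown only in the
screenshot.</p>
<pre class="programlisting">&lt;sofaMappings&gt;
  &lt;sofaMapping&gt;
    &lt;!-- a multi-view delegate: the delegate's Sofa name is given explicitly --&gt;
    &lt;componentKey&gt;GovernmentOfficialRecognizer&lt;/componentKey&gt;
    &lt;componentSofaName&gt;so1&lt;/componentSofaName&gt;
    &lt;aggregateSofaName&gt;MyInputSofa&lt;/aggregateSofaName&gt;
  &lt;/sofaMapping&gt;
  &lt;sofaMapping&gt;
    &lt;!-- a single-view delegate: no componentSofaName, so its default Sofa is mapped --&gt;
    &lt;componentKey&gt;NameRecognizer&lt;/componentKey&gt;
    &lt;aggregateSofaName&gt;AnotherSofa&lt;/aggregateSofaName&gt;
  &lt;/sofaMapping&gt;
&lt;/sofaMappings&gt;</pre>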
<p>To add a new mapping, select the Aggregate Sofa name you wish to add the mapping
for, and press the Add button. This brings up a window like this, showing all available
delegates and their Sofas; select one or more (use the normal multi-select methods)
of these and press OK to add them.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.cde/image054.jpg" width="564" alt="Adding a Sofa mapping"></td></tr></table></div>
</div>
<p>To edit an existing mapping, select the mapping and press Edit. This will show the
existing mapping with all mapped items <span class="quote">&#8220;<span class="quote">selected</span>&#8221;</span>, and other
available items unselected. Change the items selected to match what you want,
deselecting some, and perhaps selecting others, and press OK.</p>
</div>
</div>
<div class="section" title="1.10.&nbsp;Indexes Page"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.indexes">1.10.&nbsp;Indexes Page</h2></div></div></div>
<p>The Indexes page is where the user declares what indexes and type priority lists are
used by the analysis engine. Indexes are used to determine which Feature
Structures of a particular type are fetched, using an iterator in the UIMA API. An
unpopulated Indexes page is displayed below:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="545"><tr><td><img src="images/tools/tools.cde/image056.jpg" width="545" alt="Index page"></td></tr></table></div>
</div>
<p>Both indexes and type priority lists can have imports. These imports work just like
the type system imports, described above. Both indexes and type priority lists can be
exported to new component descriptors, using the Export... button, just like the type
system export operation described above.</p>
<p>The built-in Annotation Index is always present. It is based on the built-in type
<code class="literal">uima.tcas.Annotation</code> and has keys begin (Ascending), end
(Descending) and TYPE_PRIORITY. There are no built-in type priorities, so this last
sort item does not play a role in the index unless type priorities are specified.</p>
<p>Type priority may be combined with other keys. Type priorities are defined in the
Priority Lists section, using one or more priority lists. A given priority list gives an
ordering among a group of types. Types that appear higher in the priority list are given
higher priority; in other words, they sort first when TYPE_PRIORITY is specified as the
index key. Subtypes of these types are also ordered in a consistent manner, unless
overridden by another specific type priority specification. To get the ordering used
among all the types, all of the type priority lists are merged. This gives a partial
ordering among the types. Ties are resolved in an unspecified fashion. The Component
Descriptor Editor checks for incompatible orderings, and informs the user if they
exist, so they can be corrected.</p>
<p>To create a new index, use the Add Index button in the top left section. This brings up
this dialog:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="396"><tr><td><img src="images/tools/tools.cde/image058.jpg" width="396" alt="Adding a new index"></td></tr></table></div>
</div>
<p>Each index needs a globally unique index name. Every index indexes one CAS type (including
its subtypes). If you're using Eclipse 3.2 or later, the entry field for this
has content assist (start typing the type name
and press Control &#8211; Spacebar to get help, or press the Browse button to pick a
type).</p>
<p>Indexes can be sorted, in which case you need to specify one or more keys to sort on.
Sort keys are selected from features whose range type is Integer, Float, or String. Some
elements will be disabled if they are not relevant. For instance, if the index kind is
<span class="quote">&#8220;<span class="quote">bag</span>&#8221;</span>, you cannot provide sort keys. The order of sort keys can be
adjusted using the up and down buttons, if necessary.</p>
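<p>A sorted index defined this way is stored in the descriptor XML roughly as follows.
The index label and the type <code class="literal">org.example.Token</code> are
hypothetical, and the type is assumed to be a subtype of
<code class="literal">uima.tcas.Annotation</code> so that the
<code class="literal">begin</code> and <code class="literal">end</code> features exist.</p>
<pre class="programlisting">&lt;fsIndexCollection&gt;
  &lt;fsIndexes&gt;
    &lt;fsIndexDescription&gt;
      &lt;label&gt;TokensByPosition&lt;/label&gt;
      &lt;typeName&gt;org.example.Token&lt;/typeName&gt;
      &lt;kind&gt;sorted&lt;/kind&gt;
      &lt;keys&gt;
        &lt;!-- first key: begin, ascending --&gt;
        &lt;fsIndexKey&gt;
          &lt;featureName&gt;begin&lt;/featureName&gt;
          &lt;comparator&gt;standard&lt;/comparator&gt;
        &lt;/fsIndexKey&gt;
        &lt;!-- second key: end, descending --&gt;
        &lt;fsIndexKey&gt;
          &lt;featureName&gt;end&lt;/featureName&gt;
          &lt;comparator&gt;reverse&lt;/comparator&gt;
        &lt;/fsIndexKey&gt;
        &lt;!-- last key: type priority --&gt;
        &lt;fsIndexKey&gt;
          &lt;typePriority/&gt;
        &lt;/fsIndexKey&gt;
      &lt;/keys&gt;
    &lt;/fsIndexDescription&gt;
  &lt;/fsIndexes&gt;
&lt;/fsIndexCollection&gt;</pre>
<p>The key order in the XML matches the order shown in the dialog, so moving a key with
the up and down buttons changes its position in the
<code class="literal">keys</code> element.</p>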
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>There is usually no need to explicitly declare a Bag index in your descriptor.
As of UIMA v2.1, if you do not declare any index for a type (or any of its
supertypes), a Bag index will be automatically created. This index is
accessed using the <code class="literal">getAllIndexedFS(...)</code> method defined on the index repository.</p></div>
<p>A set index will contain no duplicates of the same type, where a duplicate is defined
by the indexing comparator. That is, if you commit two feature structures of the same
type that are equal with respect to the indexing comparator, only the first one will be
entered into the index. Note that you can still have duplicates with respect to the
indexing order, if they are of a different type. A set index is not guaranteed to be
sorted. If no keys are specified for a set index, then all instances are considered by
default to be equal, so only the first instance (for a particular type or subtype of the
type being indexed) is indexed. On the other hand, <span class="quote">&#8220;<span class="quote">bag</span>&#8221;</span> indicates that
all annotation instances are indexed, including duplicates.</p>
<p>The Priority Lists section of the Indexes page is used to specify Priority Lists of
types. Priority Lists are unnamed ordered sets of type names. Add a new priority list by
clicking the Add Set button. Add a type to an existing priority list by first selecting
the set, and then clicking Add. You can use the up and down buttons to adjust the order as
necessary; these buttons move the selected item up or down.</p>
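<p>A single priority list appears in the descriptor XML roughly as shown here. The two
type names are hypothetical; the type listed first has the higher priority.</p>
<pre class="programlisting">&lt;typePriorities&gt;
  &lt;priorityList&gt;
    &lt;!-- listed first, so it sorts first when TYPE_PRIORITY is an index key --&gt;
    &lt;type&gt;org.example.Sentence&lt;/type&gt;
    &lt;type&gt;org.example.Token&lt;/type&gt;
  &lt;/priorityList&gt;
&lt;/typePriorities&gt;</pre>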
<p>Although it is possible to import self-contained index and type priority files,
the creation of such files is not yet supported by the Component Descriptor Editor. If
you create these files using another editor, they can be imported using the
corresponding Import panels, shown on the right. Imports are specified in the same
manner as they are for Type System imports.</p>
</div>
<div class="section" title="1.11.&nbsp;Resources Page"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.resources">1.11.&nbsp;Resources Page</h2></div></div></div>
<p>The Resources page describes resource dependencies (for primitive Analysis
Engines), external resource specifications, and the bindings of those resources to the
resource dependencies.</p>
<p>Only primitive Analysis Engines define resource dependencies. Primitive and
Aggregate Analysis Engines can define external resources and connect them (bind them)
to resource dependencies.</p>
<p>When an Aggregate provides an external resource to be bound to a dependency, the
binding is specified using a possibly multi-level path. The path starts at the Aggregate
and names a component (by its key name); if that component is, in turn, an
Aggregate, it names one of its components (again by key name), and so on until it reaches a
primitive. The sequence of key names is made into the binding specification by joining
the parts with a <span class="quote">&#8220;<span class="quote">/</span>&#8221;</span> character. All of this is done for you by the Component
Descriptor Editor.</p>
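<p>For illustration, a binding generated for a two-level path might look roughly like
the following in the outer Aggregate's descriptor. All names are hypothetical:
<code class="literal">InnerAggregate</code> and <code class="literal">NameRecognizer</code>
stand for component key names, <code class="literal">NameDictionary</code> for the
dependency key declared by the primitive, and
<code class="literal">SharedNameDictionary</code> for an external resource defined by the
Aggregate.</p>
<pre class="programlisting">&lt;resourceManagerConfiguration&gt;
  &lt;externalResourceBindings&gt;
    &lt;externalResourceBinding&gt;
      &lt;!-- component key names joined with "/", ending in the dependency key --&gt;
      &lt;key&gt;InnerAggregate/NameRecognizer/NameDictionary&lt;/key&gt;
      &lt;resourceName&gt;SharedNameDictionary&lt;/resourceName&gt;
    &lt;/externalResourceBinding&gt;
  &lt;/externalResourceBindings&gt;
&lt;/resourceManagerConfiguration&gt;</pre>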
<p>Any external resource provided by an Aggregate will override any binding provided
by any lower level component for the same resource dependency.</p>
<p>There are two views of the Resources page, depending on whether the Analysis Engine
is an Aggregate or Primitive. Here's the view for a Primitive:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="495"><tr><td><img src="images/tools/tools.cde/image060.jpg" width="495" alt="Resources page for a primitive"></td></tr></table></div>
</div>
<p>To declare a resource dependency, click the Add button in the right hand panel. This
puts up the dialog:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="396"><tr><td><img src="images/tools/tools.cde/image062.jpg" width="396" alt="Specifying a resource dependency"></td></tr></table></div>
</div>
<p>The Key must be unique within the descriptor declaring it. The Interface, if
present, is the name of a Java interface the Analysis Engine uses to access the
resource.</p>
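<p>A dependency declared through this dialog is stored in the descriptor XML roughly as
follows. The key and the interface name are hypothetical placeholders.</p>
<pre class="programlisting">&lt;externalResourceDependencies&gt;
  &lt;externalResourceDependency&gt;
    &lt;!-- must be unique within this descriptor --&gt;
    &lt;key&gt;NameDictionary&lt;/key&gt;
    &lt;description&gt;Dictionary of known names&lt;/description&gt;
    &lt;!-- optional: Java interface used to access the resource --&gt;
    &lt;interfaceName&gt;org.example.NameDictionary&lt;/interfaceName&gt;
    &lt;optional&gt;false&lt;/optional&gt;
  &lt;/externalResourceDependency&gt;
&lt;/externalResourceDependencies&gt;</pre>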
<p>Declare the actual external resources on the left side of the page. Clicking
<span class="quote">&#8220;<span class="quote">Add</span>&#8221;</span> brings up this dialog:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="416"><tr><td><img src="images/tools/tools.cde/image064.jpg" width="416" alt="Specifying an External Resource"></td></tr></table></div>
</div>
<p>The Name must be unique within this Analysis Engine. The URL identifies a file
resource. If both the URL and URL suffix are used, the file resource is formed by
combining the first URL part with the language-identifier, followed by the URL suffix;
see <a href="references.html#d5e1" class="olink">UIMA References</a> <a href="references.html#ugr.ref.xml.component_descriptor.aes.primitive.resource_manager_configuration" class="olink">Section&nbsp;2.4.1.9, &#8220;Resource Manager Configuration&#8221;</a>.
URLs may be written as <span class="quote">&#8220;<span class="quote">relative</span>&#8221;</span> URLs; in this case they are resolved by
looking them up relative to the classpath and/or datapath. A relative URL has the path
part starting without an initial <span class="quote">&#8220;<span class="quote">/</span>&#8221;</span>; for example:
file:my/directory/file. An absolute URL starts with file:/ or file:/// or
file://some.network.address/. For more information about URLs, please read the
Javadoc for the Java class <span class="quote">&#8220;<span class="quote">URL</span>&#8221;</span>.</p>
<p>The Implementation is optional, and if given, must be a Java class that implements
the interface specified in any Resource Dependencies this resource is bound
to.</p>
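<p>The two forms of external resource described above might be recorded in the
descriptor XML roughly as follows. The resource names, the implementation class, and
the prefix and suffix values are hypothetical; the first URL reuses the relative URL
example given above.</p>
<pre class="programlisting">&lt;resourceManagerConfiguration&gt;
  &lt;externalResources&gt;
    &lt;!-- plain file resource with an optional implementation class --&gt;
    &lt;externalResource&gt;
      &lt;name&gt;SharedNameDictionary&lt;/name&gt;
      &lt;description&gt;Dictionary of known names&lt;/description&gt;
      &lt;fileResourceSpecifier&gt;
        &lt;fileUrl&gt;file:my/directory/file&lt;/fileUrl&gt;
      &lt;/fileResourceSpecifier&gt;
      &lt;implementationName&gt;org.example.NameDictionaryImpl&lt;/implementationName&gt;
    &lt;/externalResource&gt;
    &lt;!-- language-dependent resource: prefix + language identifier + suffix --&gt;
    &lt;externalResource&gt;
      &lt;name&gt;StopWords&lt;/name&gt;
      &lt;description&gt;Per-language stop word list&lt;/description&gt;
      &lt;fileLanguageResourceSpecifier&gt;
        &lt;fileUrlPrefix&gt;file:my/directory/stopwords_&lt;/fileUrlPrefix&gt;
        &lt;fileUrlSuffix&gt;.txt&lt;/fileUrlSuffix&gt;
      &lt;/fileLanguageResourceSpecifier&gt;
    &lt;/externalResource&gt;
  &lt;/externalResources&gt;
&lt;/resourceManagerConfiguration&gt;</pre>
<p>With these hypothetical values, a request for the language
<code class="literal">en</code> would resolve the second resource to
<code class="literal">file:my/directory/stopwords_en.txt</code>.</p>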
<div class="section" title="1.11.1.&nbsp;Binding"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.resources.binding">1.11.1.&nbsp;Binding</h3></div></div></div>
<p>Once you have an external resource definition, and a Resource Dependency, you
can bind them together. To do this, you select the two things (an external resource
definition, and a Resource Dependency) that you want to bind together, and click
Bind.</p>
</div>
<div class="section" title="1.11.2.&nbsp;Resources with Aggregates"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.resources.aggregates">1.11.2.&nbsp;Resources with Aggregates</h3></div></div></div>
<p>When editing an Aggregate Descriptor, the Resource definitions panel will show
all the resources at the primitive level, with paths down through the components
(multiple levels, if needed) to get to the primitives. The Aggregate can define
external resources, and bind them to one or more uses by the primitives.</p>
</div>
<div class="section" title="1.11.3.&nbsp;Imports and Exports"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.resources.imports_exports">1.11.3.&nbsp;Imports and Exports</h3></div></div></div>
<p>Resource definitions and their bindings can be imported, just like other
imports. Existing Resource definitions and their bindings can be exported to a new
importable part, and replaced with an import for that importable part, using the
<span class="quote">&#8220;<span class="quote">Export...</span>&#8221;</span> button, just like the similar function on the Type System
page.</p>
</div>
</div>
<div class="section" title="1.12.&nbsp;Source Page"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.source">1.12.&nbsp;Source Page</h2></div></div></div>
<p>The Source page is a text view of the xml content of the Analysis Engine or Type System
being configured. An example of this page is displayed below:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.cde/image066.jpg" width="564" alt="Source page"></td></tr></table></div>
</div>
<p>Changes made in the GUI are immediately reflected in the XML source, and changes
made in the XML source are immediately reflected back in the GUI. The GUI view and the
Source view are simply two ways of looking at the same data. When the data
is in an unsaved state, the file name is prefaced with an asterisk in the currently
selected file tab in the editor pane inside Eclipse (as in the example above).</p>
<p>You may accidentally create invalid descriptors or XML by editing directly in the
Source view. If you do this, the error will be detected and reported when you try to save
or when you switch to a different view. In the case of saving, the file will be saved
even if it is in an error state.</p>
<div class="section" title="1.12.1.&nbsp;Source formatting &#8211; indentation"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cde.source.formatting">1.12.1.&nbsp;Source formatting &#8211; indentation</h3></div></div></div>
<p>The XML is indented using an indentation amount saved as a global UIMA
preference. To change this preference, use the Eclipse menu item: Window <span class="symbol">&#8594;</span>
Preferences <span class="symbol">&#8594;</span> UIMA Preferences.</p>
</div>
</div>
<div class="section" title="1.13.&nbsp;Creating a Self-Contained Type System"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.creating_self_contained_type_system">1.13.&nbsp;Creating a Self-Contained Type System</h2></div></div></div>
<p>It is also possible to use the Component Descriptor Editor to create or edit
self-contained type systems. To create a self-contained type system, select the menu
item File <span class="symbol">&#8594;</span> New <span class="symbol">&#8594;</span> Other and then select Type System Descriptor File. From the
next page of the selection wizard specify a Parent Folder and File name and click Finish.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="347"><tr><td><img src="images/tools/tools.cde/image068.jpg" width="347" alt="Working with a self-contained type system"></td></tr></table></div>
</div><p>
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="347"><tr><td><img src="images/tools/tools.cde/image070.jpg" width="347"></td></tr></table></div>
</div>
<p>This will take you to a version of the Component Descriptor Editor for editing a type
system file which contains just three pages: an overview page, a type system page, and a
source page. The overview page is a bit more spartan than in the case of an AE. It looks like
the following:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="366"><tr><td><img src="images/tools/tools.cde/image072.jpg" width="366" alt="Editing a type system object"></td></tr></table></div>
</div>
<p>Just like an AE has an associated name, version, vendor and description, the same is
true of a self-contained type system. The Type System page is identical to that in an AE
descriptor file, as is the Source page. Note that a self-contained type system can
import type systems just like the type system associated with an AE.</p>
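<p>A minimal self-contained type system descriptor might look like the sketch below.
The name, version, vendor, type, and feature are all hypothetical; they only show where
the Overview page fields and the Type System page entries end up in the file.</p>
<pre class="programlisting">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier"&gt;
  &lt;!-- fields edited on the Overview page --&gt;
  &lt;name&gt;ExampleTypeSystem&lt;/name&gt;
  &lt;description&gt;Types shared by the example annotators&lt;/description&gt;
  &lt;version&gt;1.0&lt;/version&gt;
  &lt;vendor&gt;Example Vendor&lt;/vendor&gt;
  &lt;!-- entries edited on the Type System page --&gt;
  &lt;types&gt;
    &lt;typeDescription&gt;
      &lt;name&gt;org.example.Token&lt;/name&gt;
      &lt;description/&gt;
      &lt;supertypeName&gt;uima.tcas.Annotation&lt;/supertypeName&gt;
      &lt;features&gt;
        &lt;featureDescription&gt;
          &lt;name&gt;partOfSpeech&lt;/name&gt;
          &lt;description/&gt;
          &lt;rangeTypeName&gt;uima.cas.String&lt;/rangeTypeName&gt;
        &lt;/featureDescription&gt;
      &lt;/features&gt;
    &lt;/typeDescription&gt;
  &lt;/types&gt;
&lt;/typeSystemDescription&gt;</pre>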
<p>A type system component can also be created from an existing descriptor which
contains a type system definition section, by clicking on the Export... button on the
Type System page.</p>
</div>
<div class="section" title="1.14.&nbsp;Creating Other Descriptor Components"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cde.creating_other_descriptor_components">1.14.&nbsp;Creating Other Descriptor Components</h2></div></div></div>
<p>The New wizard can create several other kinds of components: Collection
Processing Management (CPM) components, flow controllers, and importable parts
(in addition to Type Systems, described above, these are Indexes, Type Priorities, and
Resource Manager Configuration imports).</p>
<p>The CPM components supported by this editor include the Collection Reader, CAS
Initializer, and CAS Consumer descriptors. Each of these is basically treated just
like a primitive AE descriptor, with small changes to accommodate the different
semantics. For instance, a CAS Consumer can't declare in its capabilities
section that it outputs types or features.</p>
<p>Flow controllers are components that control the flow of CASes within an
aggregate, and are edited in much the same way as a primitive Analysis Engine.</p>
<p>The importable part support requires context information to enable the editor to
work, because much of the power of this editor comes from extensive checking that
requires additional information, other than what is available in just the importable
part. For instance, when you create or edit an Indexes import, the facility for adding
new indexes needs the type information, which is not present in this part when it is
edited alone. </p>
<p>To overcome this, when you edit these descriptors, you will be asked to
specify a context descriptor, usually a descriptor which would import the part being
edited, which would have the additional information needed. </p>
<p>Various methods are used
to guess what the context descriptor should be; if the guess is correct, you can just
press the Enter key to confirm. The last successful context file is remembered and will
be suggested as the context file to use at the next edit session.</p>
</div>
</div>
<div class="chapter" title="Chapter&nbsp;2.&nbsp;Collection Processing Engine Configurator User's Guide" id="ugr.tools.cpe"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;2.&nbsp;Collection Processing Engine Configurator User's Guide</h2></div></div></div>
<p>A <span class="emphasis"><em>Collection Processing Engine (CPE)</em></span> processes
collections of artifacts (documents) through the combination of the following
components: a Collection Reader, Analysis Engines, and CAS Consumers.
<sup>[<a name="d5e597" href="#ftn.d5e597" class="footnote">1</a>]</sup>
</p>
<p>The <span class="emphasis"><em>Collection Processing Engine Configurator (CPE
Configurator)</em></span> is a graphical tool that allows you to assemble and run
CPEs.</p>
<p>For an introduction to Collection Processing Engine concepts, including
developing the components that make up a CPE, read <a href="tutorials_and_users_guides.html#d5e1" class="olink">UIMA Tutorial and Developers' Guides</a>
<a href="tutorials_and_users_guides.html#ugr.tug.cpe" class="olink">Chapter&nbsp;2, <i>Collection Processing Engine Developer's Guide</i></a>. This
chapter is a user's guide for using the CPE Configurator tool, and does not describe
UIMA's Collection Processing Architecture itself.</p>
<div class="section" title="2.1.&nbsp;Limitations of the CPE Configurator"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cpe.limitations">2.1.&nbsp;Limitations of the CPE Configurator</h2></div></div></div>
<p>The CPE Configurator only supports basic CPE configurations.</p>
<p>It only supports <span class="quote">&#8220;<span class="quote">Integrated</span>&#8221;</span> deployments (although it will
connect to remotes if particular CAS Processors are specified with remote service
descriptors). It doesn't support configuration of the error handling. It
doesn't support Sofa Mappings; it assumes all Single-View components are
operating with the _InitialView Sofa. Multi-View components will not have their names
mapped. It sets up a fixed-sized CAS Pool.</p>
<p>To set these additional options, you must edit the CPE Descriptor XML file
directly. See <a href="references.html#d5e1" class="olink">UIMA References</a>
<a href="references.html#ugr.ref.xml.cpe_descriptor" class="olink">Chapter&nbsp;3, <i>Collection Processing Engine Descriptor Reference</i></a> for the syntax.
You may then open the CPE Descriptor in the CPE Configurator and run it. The changes
you applied to the CPE Descriptor <span class="emphasis"><em>will</em></span> be respected, although you
will not be able to see them or edit them from the GUI.
</p>
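<p>As an illustration of the kind of settings that must be edited by hand, the fragment
below sketches the CAS pool size and the per-processor error handling in a CPE
Descriptor. The processor name, the descriptor location, and all numeric values are
placeholders chosen for this sketch; the CPE Descriptor reference cited above remains
the authoritative source for the syntax and defaults.</p>
<pre class="programlisting">&lt;casProcessors casPoolSize="3" processingUnitThreadCount="1"&gt;
  &lt;casProcessor deployment="integrated" name="MyAnalysisEngine"&gt;
    &lt;descriptor&gt;
      &lt;!-- hypothetical descriptor location --&gt;
      &lt;import location="descriptors/MyAnalysisEngine.xml"/&gt;
    &lt;/descriptor&gt;
    &lt;deploymentParameters/&gt;
    &lt;!-- error handling is not configurable from the CPE Configurator GUI --&gt;
    &lt;errorHandling&gt;
      &lt;errorRateThreshold action="terminate" value="100/1000"/&gt;
      &lt;maxConsecutiveRestarts action="terminate" value="30"/&gt;
      &lt;timeout max="100000"/&gt;
    &lt;/errorHandling&gt;
    &lt;checkpoint batch="1000"/&gt;
  &lt;/casProcessor&gt;
&lt;/casProcessors&gt;</pre>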
</div>
<div class="section" title="2.2.&nbsp;Starting the CPE Configurator"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cpe.starting">2.2.&nbsp;Starting the CPE Configurator</h2></div></div></div>
<p>The CPE Configurator tool can be run using the <code class="literal">cpeGui</code> shell
script, which is located in the <code class="literal">bin</code> directory of the UIMA SDK. If
you've installed the example Eclipse project (see <a href="overview_and_setup.html#d4e1" class="olink">UIMA Overview &amp; SDK Setup</a>
<a href="overview_and_setup.html#ugr.ovv.eclipse_setup.example_code" class="olink">Section&nbsp;3.2, &#8220;Setting up Eclipse to view Example Code&#8221;</a>), you can also run it using the
<span class="quote">&#8220;<span class="quote">UIMA CPE GUI</span>&#8221;</span> run configuration provided in that project.</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>If you are planning to build a CPE using components other than the examples
included in the UIMA SDK, you will first need to update your CLASSPATH environment
variable to include the classes needed by these components.</p></div>
<p>When you first start the CPE Configurator, you will see the main window shown here:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.cpe/image002.jpg" width="574" alt="CPE Configurator main GUI window"></td></tr></table></div>
</div>
</div>
<div class="section" title="2.3.&nbsp;Selecting Component Descriptors"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cpe.selecting_component_descriptors">2.3.&nbsp;Selecting Component Descriptors</h2></div></div></div>
<p>The CPE Configurator's main window is divided into three sections, one each for the Collection
Reader, Analysis Engines, and CAS Consumers.<sup>[<a name="d5e633" href="#ftn.d5e633" class="footnote">2</a>]</sup></p>
<p>In each section of the CPE Configurator, you can select the component(s) you want to use by browsing to (or
typing the location of) their XML descriptors. You must select a Collection Reader, and at least one Analysis
Engine or CAS Consumer.</p>
<p>When you select a descriptor, the configuration parameters that are defined in that descriptor will then
be displayed in the GUI; these can be modified to override the values present in the descriptor.</p>
<p>For example, the screen shot below shows the CPE Configurator after the following components have been
chosen:
</p><pre class="programlisting">examples/descriptors/collectionReader/FileSystemCollectionReader.xml
examples/descriptors/analysis_engine/NamesAndPersonTitles_TAE.xml
examples/descriptors/cas_consumer/XmiWriterCasConsumer.xml</pre>
<div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.cpe/image004.jpg" width="574" alt="CPE Configurator after components chosen"></td></tr></table></div>
</div>
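<p>When a configuration like this is saved, a parameter value overridden in the GUI is
recorded in the CPE Descriptor alongside the reference to the component descriptor.
The sketch below shows the general shape for the Collection Reader. The parameter name
<code class="literal">InputDirectory</code> and the directory value are assumptions made
for illustration, and the exact form of the saved location may differ.</p>
<pre class="programlisting">&lt;collectionReader&gt;
  &lt;collectionIterator&gt;
    &lt;descriptor&gt;
      &lt;import location="examples/descriptors/collectionReader/FileSystemCollectionReader.xml"/&gt;
    &lt;/descriptor&gt;
    &lt;!-- values here override those in the component descriptor --&gt;
    &lt;configurationParameterSettings&gt;
      &lt;nameValuePair&gt;
        &lt;name&gt;InputDirectory&lt;/name&gt;
        &lt;value&gt;
          &lt;string&gt;/path/to/input/documents&lt;/string&gt;
        &lt;/value&gt;
      &lt;/nameValuePair&gt;
    &lt;/configurationParameterSettings&gt;
  &lt;/collectionIterator&gt;
&lt;/collectionReader&gt;</pre>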
</div>
<div class="section" title="2.4.&nbsp;Running a Collection Processing Engine"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cpe.running">2.4.&nbsp;Running a Collection Processing Engine</h2></div></div></div>
<p>After selecting each of the components and providing configuration settings,
click the play (forward arrow) button at the bottom of the screen to begin processing. A
progress bar should be displayed in the lower left corner. (Note that the progress bar
will not begin to move until all components have completed their initialization, which
may take several seconds.) Once processing has begun, the pause and stop buttons become
enabled.</p>
<p>If an error occurs, you will be informed by an error dialog. If processing completes
successfully, you will be presented with a performance report.</p>
</div>
<div class="section" title="2.5.&nbsp;The File Menu"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cpe.file_menu">2.5.&nbsp;The File Menu</h2></div></div></div>
<p>The CPE Configurator's File Menu has the following options:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>Open CPE Descriptor</p></li><li class="listitem"><p>Save CPE Descriptor</p></li><li class="listitem"><p>Save Options (submenu)</p></li><li class="listitem"><p>Refresh Descriptors from File System</p></li><li class="listitem"><p>Clear All</p></li><li class="listitem"><p>Exit </p></li></ul></div>
<p><span class="bold"><strong>Open CPE Descriptor</strong></span> will allow you to select a
CPE Descriptor file from disk, and will read in that CPE Descriptor and configure the GUI
appropriately.</p>
<p><span class="bold"><strong>Save CPE Descriptor</strong></span> will create a CPE
Descriptor file that defines the CPE you have constructed. This CPE Descriptor will
identify the components that constitute the CPE, as well as the configuration settings
you have specified for each of these components. Later, you can use <span class="quote">&#8220;<span class="quote">Open CPE
Descriptor</span>&#8221;</span> to restore the CPE Configurator to this saved state. CPE
Descriptors can also be used to run a CPE from a Java program; see
<a href="tutorials_and_users_guides.html#d5e1" class="olink">UIMA Tutorial and Developers' Guides</a> <a href="tutorials_and_users_guides.html#ugr.tug.application.running_a_cpe_from_a_descriptor" class="olink">Section&nbsp;3.3.1, &#8220;Running a CPE from a Descriptor&#8221;</a>.</p>
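<p>As a minimal sketch of running a saved CPE Descriptor from Java, the class below parses the descriptor
and starts processing. The class name <code class="literal">RunSavedCpe</code> and the descriptor path taken from the
first command line argument are illustrative assumptions, and the UIMA core classes must be on the classpath.</p>
<pre class="programlisting">import org.apache.uima.UIMAFramework;
import org.apache.uima.collection.CollectionProcessingEngine;
import org.apache.uima.collection.metadata.CpeDescription;
import org.apache.uima.util.XMLInputSource;

public class RunSavedCpe {
  public static void main(String[] args) throws Exception {
    // args[0]: path of a CPE Descriptor saved from the CPE Configurator (placeholder)
    CpeDescription cpeDesc =
        UIMAFramework.getXMLParser().parseCpeDescription(new XMLInputSource(args[0]));

    // Instantiate the CPE defined by the descriptor
    CollectionProcessingEngine cpe =
        UIMAFramework.produceCollectionProcessingEngine(cpeDesc);

    // Start processing; this call returns while the CPE continues in its own threads
    cpe.process();
  }
}</pre>
<p>Because <code class="literal">process()</code> returns immediately, an application that needs to know when processing
has finished, or whether errors occurred, would typically register a listener with
<code class="literal">addStatusCallbackListener</code> before calling <code class="literal">process()</code>.</p>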
<p>CPE Descriptors also allow specifying operational parameters, such as error
handling options, that are not currently available for configuration through the CPE
Configurator. For more information on manually creating a CPE Descriptor, see
<a href="references.html#d5e1" class="olink">UIMA References</a>
<a href="references.html#ugr.ref.xml.cpe_descriptor" class="olink">Chapter&nbsp;3, <i>Collection Processing Engine Descriptor Reference</i></a>.
</p>
<p>The <span class="bold"><strong>Save Options</strong></span> submenu has one item,
"Use &lt;import&gt;". If this item is checked (the default), saved CPE descriptors
will use the <code class="literal">&lt;import&gt;</code> syntax to refer to their component
descriptors. If unchecked, the older <code class="literal">&lt;include&gt;</code> syntax will
be used for new components that you add to your CPE using the GUI. (However, if you
open a CPE descriptor that used &lt;import&gt;, these imports will not be replaced.)
</p>
<p><span class="bold"><strong>Refresh Descriptors from File System</strong></span> will
reload all descriptors from disk. This is useful if you have changed a
descriptor outside of the CPE Configurator and want to refresh the display.</p>
<p><span class="bold"><strong>Clear All</strong></span> will reset the CPE Configurator to
its initial state, with no components selected.</p>
<p><span class="bold"><strong>Exit</strong></span> will close the CPE Configurator. If you
have unsaved changes, you will be prompted as to whether you would like to save them to a
CPE Descriptor file. If you do not save them, they will be lost.</p>
<p>When you restart the CPE Configurator, it will automatically reload the last CPE
descriptor file that you were working with.</p>
</div>
<div class="section" title="2.6.&nbsp;The Help Menu"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cpe.help_menu">2.6.&nbsp;The Help Menu</h2></div></div></div>
<p>The CPE Configurator's Help menu provides <span class="quote">&#8220;<span class="quote">About</span>&#8221;</span>
information and some very simple instructions on how to use the tool.</p>
</div>
<div class="footnotes"><br><hr width="100" align="left"><div class="footnote"><p><sup>[<a id="ftn.d5e597" href="#d5e597" class="para">1</a>] </sup>Earlier versions of UIMA supported another component, the CAS
Initializer, but this component is now deprecated in UIMA Version 2.</p></div><div class="footnote">
<p><sup>[<a id="ftn.d5e633" href="#d5e633" class="para">2</a>] </sup>There is also a fourth pane, for the CAS Initializer, but it is hidden by default. To enable it, click the
<code class="literal">View <span class="symbol">&#8594;</span> CAS Initializer Panel</code> menu item.</p></div></div></div>
<div class="chapter" title="Chapter&nbsp;3.&nbsp;Document Analyzer User's Guide" id="ugr.tools.doc_analyzer"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;3.&nbsp;Document Analyzer User's Guide</h2></div></div></div>
<p>The <span class="emphasis"><em>Document Analyzer</em></span> is a tool provided by the
UIMA SDK for testing annotators and AEs. It reads text files from your disk, processes them using an AE, and
allows you to view the results. The
Document Analyzer is designed to work with text files and cannot be used with
Analysis Engines that process other types of data.</p>
<p>For an introduction to developing annotators and Analysis
Engines, read <a href="tutorials_and_users_guides.html#d5e1" class="olink">UIMA Tutorial and Developers' Guides</a>
<a href="tutorials_and_users_guides.html#ugr.tug.aae" class="olink">Chapter&nbsp;1, <i>Annotator and Analysis Engine Developer's Guide</i></a>.
This chapter is a user's guide for using the Document Analyzer tool, and
does not describe the process of developing annotators and Analysis Engines.</p>
<div class="section" title="3.1.&nbsp;Starting the Document Analyzer"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.doc_analyzer.starting">3.1.&nbsp;Starting the Document Analyzer</h2></div></div></div>
<p>To run the Document Analyzer, execute the <code class="literal">documentAnalyzer</code> script that is in the <code class="literal">bin</code> directory of your UIMA SDK installation, or, if you
are using the example Eclipse project, execute the <span class="quote">&#8220;<span class="quote">UIMA Document Analyzer</span>&#8221;</span>
run configuration supplied with that project.</p>
<p>Note that if you're planning to run an Analysis Engine
other than one of the examples included in the UIMA SDK, you'll first need to
update your CLASSPATH environment variable to include the classes needed by
that Analysis Engine.</p>
<p>When you first run the Document Analyzer, you should see a
screen that looks like this:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.doc_analyzer/DocAnalyzerScr1.png" width="574" alt="Document Analyzer GUI"></td></tr></table></div>
</div>
</div>
<div class="section" title="3.2.&nbsp;Running an AE"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.doc_analyzer.running_an_ae">3.2.&nbsp;Running an AE</h2></div></div></div>
<p>To run an AE, you must first configure the fields on
the main screen of the Document Analyzer, which are described below.</p>
<p><span class="bold"><strong>Input Directory:</strong></span>
Browse to or type the path of a directory containing text files that you
want to analyze. Some sample documents
are provided in the UIMA SDK under the <code class="literal">examples/data</code>
directory.</p>
<p><span class="bold"><strong>Input File Format:</strong></span> Set this to "text". Alternatively, it can
be set to one of the two serialized forms for CASes, if you have previously generated and saved these.
For the CAS formats only, you can also check "Lenient deserialization". When it is checked, extra
types and features in the CAS being deserialized (those not defined in the type system of the annotator
to be run) are ignored instead of causing a deserialization error.</p>
<p><span class="bold"><strong>Character Encoding:</strong></span>
The character encoding of the input files. The default, UTF-8, also works for ASCII
text files. If your files use a different
encoding, select it here. For more information on character sets and their names, see the Javadocs for
<code class="literal">java.nio.charset.Charset</code>.</p>
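<p>If you are unsure whether an encoding name is recognized by your Java runtime, a small check such as
the following sketch can help. It uses only <code class="literal">java.nio.charset.Charset</code>; the class name
<code class="literal">EncodingCheck</code> and the sample names are illustrative and are not part of the Document Analyzer.</p>
<pre class="programlisting">import java.nio.charset.Charset;

public class EncodingCheck {
  // Returns true if this Java runtime supports the named character encoding.
  static boolean isUsable(String name) {
    try {
      return Charset.isSupported(name);
    } catch (IllegalArgumentException e) {
      // Thrown for syntactically illegal names, e.g. a name containing spaces
      return false;
    }
  }

  public static void main(String[] args) {
    for (String name : new String[] {"UTF-8", "ISO-8859-1", "no-such-encoding"}) {
      System.out.println(name + " usable: " + isUsable(name));
    }
  }
}</pre>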
<p><span class="bold"><strong>Output Directory:</strong></span> Browse to or type the path of a directory where you want
output to be written. (As we'll see later, you won't normally need to look directly at these files, but the
Document Analyzer needs to know where to write them.) The files written to this directory will be an XML
representation of the analyzed documents. If this directory doesn't exist, it will be created. If the
directory exists, any files in it will be deleted (but the tool will ask you to confirm this before doing so). If you
leave this field blank, your AE will be run but no output will be generated.</p>
<p><span class="bold"><strong>Location of AE XML Descriptor:</strong></span>
Browse to or type the path of the descriptor
for the AE that you want to run. There
are some example descriptors provided in the UIMA SDK under the <code class="literal">examples/descriptors/analysis_engine</code> and <code class="literal">examples/descriptors/tutorial</code> directories.</p>
<p><span class="bold"><strong>XML Tag containing Text:</strong></span>
This is an optional feature. If you enter a value here, it specifies the
name of an XML tag, expected to be found within the input documents, that
contains the text to be analyzed. For
example, the value <code class="literal">TEXT</code> would cause the AE to only
analyze the portion of the document enclosed within &lt;TEXT&gt;...&lt;/TEXT&gt;
tags. Also, any XML tags occurring within that text will be removed prior to analysis.</p>
<p><span class="bold"><strong>Language:</strong></span>
Specify
the language in which the documents are written. Some Analysis Engines, but not all, require
that this be set correctly in order to do their analysis. You can select a value from the drop-down
list or type your own. The value entered
here must be an ISO language identifier, the list of which can be found here:
<a class="ulink" href="http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt" target="_top">http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt</a>.
</p>
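<p>When typing your own value, one way to check a two-letter code before using it is to compare it with the
ISO 639 codes known to the Java runtime. The sketch below does this with <code class="literal">java.util.Locale</code>;
the class name <code class="literal">LanguageCodeCheck</code> is illustrative, and the check only covers plain two-letter
codes, not longer identifiers.</p>
<pre class="programlisting">import java.util.Arrays;
import java.util.Locale;

public class LanguageCodeCheck {
  // True if code is one of the two-letter ISO 639 language codes known to this runtime.
  static boolean isIsoLanguage(String code) {
    return Arrays.asList(Locale.getISOLanguages()).contains(code);
  }

  public static void main(String[] args) {
    System.out.println("en: " + isIsoLanguage("en"));
    System.out.println("english: " + isIsoLanguage("english"));
  }
}</pre>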
<p>Once you've filled in the appropriate values, press the
<span class="quote">&#8220;<span class="quote">Run</span>&#8221;</span> button.</p>
<p>If an error occurs, a dialog will appear with the error
message. (A stack trace will also be
printed to the console, which may help you if the error was generated by your
own annotator code.) Otherwise, an
<span class="quote">&#8220;<span class="quote">Analysis Results</span>&#8221;</span> window will appear.</p>
</div>
<div class="section" title="3.3.&nbsp;Viewing the Analysis Results"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.doc_analyzer.viewing_results">3.3.&nbsp;Viewing the Analysis Results</h2></div></div></div>
<p>After a successful analysis, the <span class="quote">&#8220;<span class="quote">Analysis
Results</span>&#8221;</span> window will appear.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="416"><tr><td><img src="images/tools/tools.doc_analyzer/image004.jpg" width="416" alt="Analysis Results Window"></td></tr></table></div>
</div>
<p>The <span class="quote">&#8220;<span class="quote">Results Display Format</span>&#8221;</span> options at the
bottom of this window show the different ways you can view your analysis &#8211; the
Java Viewer, Java Viewer (JV) with User Colors, HTML, and XML.
The default, Java Viewer, is recommended.</p>
<p>Once you have selected your desired Results Display
Format, you can double-click on one of the files in the list to view the
analysis done on that file.</p>
<p>For the Java viewer, two view modes are supported, selected with the
radio buttons titled "Annotations" and "Features":</p>
<p>In the "Annotations" view, each annotation type that is declared to be an output of the pipeline
(in the topmost Annotator Descriptor) is given a checkbox and a color in the bottom panel. You can control which
annotations are shown by using the checkboxes in the bottom panel, the Select All button,
or the Deselect All button. The results display looks like this (for the AE descriptor
<code class="literal">examples/descriptors/tutorial/ex4/MeetingDetectorTAE.xml</code>):
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.doc_analyzer/image006v2.png" width="574" alt="Analysis Results Window showing results from tutorial example 4 in Annotations view mode"></td></tr></table></div>
</div>
<p>You can click the mouse on one of the highlighted
annotations to see a list of all its features in the frame on the right.</p>
<p>In the "Features" view, you can specify a combination of a single type, a single feature of that type, and some feature values for that feature.
The annotations whose feature values match will be highlighted. As the first step, select a specific annotation type by using
a radio button in the first tab of the legend.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.doc_analyzer/image007-1v2.png" width="574" alt="Analysis Results Window showing results from tutorial example 4 in Features view mode by selecting the DateAnnotation type."></td></tr></table></div>
</div>
<p>Selecting this automatically transitions to the second tab, where you then select a specific feature
of the annotation type.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.doc_analyzer/image007-2v2.png" width="574" alt="Analysis Results Window showing results from tutorial example 4 in Features view mode by selecting the shortDateString feature."></td></tr></table></div>
</div>
<p>Selecting a feature automatically transitions to the third tab of the legend, where you select
specific feature values.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.doc_analyzer/image007-3v2.png" width="574" alt="Analysis Results Window showing results from tutorial example 4 in Features view mode by selecting individual shortDateString feature values."></td></tr></table></div>
</div>
<p>In each of the above two view modes, you can click the mouse on one of the highlighted
annotations to see a list of all its features in the frame on the right.</p>
<p>If you are viewing a CAS that contains multiple subjects
of analysis, then a selector will appear at the bottom right of the Annotation
Viewer window. This will allow you to
choose the Sofa that you wish to view. Note that only text Sofas containing a non-null document are available
for viewing.</p>
</div>
<div class="section" title="3.4.&nbsp;Configuring the Annotation Viewer"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.doc_analyzer.configuring">3.4.&nbsp;Configuring the Annotation Viewer</h2></div></div></div>
<p>The <span class="quote">&#8220;<span class="quote">JV User Colors</span>&#8221;</span> and the HTML viewer allow
you to specify exactly which colors are used to display each of your annotation
types. For the Java Viewer, you can also
specify which types should be initially selected, and you can hide types
entirely.</p>
<p>To configure the viewer, click the <span class="quote">&#8220;<span class="quote">Edit Style
Map</span>&#8221;</span> button on the <span class="quote">&#8220;<span class="quote">Analysis Results</span>&#8221;</span> dialog.
You should see a dialog that looks like this:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.doc_analyzer/image008.jpg" width="574" alt="Configuring the Analysis Results Viewer"></td></tr></table></div>
</div>
<p>To change the color assigned to a type, simply click on
the colored cell in the <span class="quote">&#8220;<span class="quote">Background</span>&#8221;</span> column for the type you wish to
edit. This will display a dialog that
allows you to choose the color. For the
HTML viewer only, you can also change the foreground color.</p>
<p>If you would like the type to be initially checked
(selected) in the legend when the viewer is first launched, check the box in
the <span class="quote">&#8220;<span class="quote">Checked</span>&#8221;</span> column. If you
would like the type to never be shown in the viewer, click the box in the
<span class="quote">&#8220;<span class="quote">Hidden</span>&#8221;</span> column. These
settings only affect the Java Viewer, not the HTML view.</p>
<p>When you are done editing, click the <span class="quote">&#8220;<span class="quote">Save</span>&#8221;</span>
button. This will save your choices to a
file in the same directory as your AE descriptor. From now on, when you view analysis results
produced by this AE using the <span class="quote">&#8220;<span class="quote">JV User Colors</span>&#8221;</span> or <span class="quote">&#8220;<span class="quote">HTML</span>&#8221;</span>
options, the viewer will be configured as you have specified.</p>
</div>
<div class="section" title="3.5.&nbsp;Interactive Mode"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.doc_analyzer.interactive_mode">3.5.&nbsp;Interactive Mode</h2></div></div></div>
<p>Interactive Mode allows you to analyze text that you type
or cut-and-paste into the tool, rather than requiring that the documents be
stored as files.</p>
<p>In the main Document Analyzer window, you can invoke
Interactive Mode by clicking the <span class="quote">&#8220;<span class="quote">Interactive</span>&#8221;</span> button instead of the
<span class="quote">&#8220;<span class="quote">Run</span>&#8221;</span> button. This will
display a dialog that looks like this:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="545"><tr><td><img src="images/tools/tools.doc_analyzer/image010.jpg" width="545" alt="Invoking Interactive Mode"></td></tr></table></div>
</div>
<p>You can type or cut-and-paste your text into this window,
then choose your Results Display Format and click the <span class="quote">&#8220;<span class="quote">Analyze</span>&#8221;</span>
button. Your AE will be run on the text
that you supplied and the results will be displayed as usual.</p>
</div>
<div class="section" title="3.6.&nbsp;View Mode"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.doc_analyzer.view_mode">3.6.&nbsp;View Mode</h2></div></div></div>
<p>If you have previously run an AE and saved its analysis
results, you can use the Document Analyzer's View mode to view those results
without re-running your analysis. To do
this, on the main Document Analyzer window simply select the location of your
analyzed documents in the <span class="quote">&#8220;<span class="quote">Output Directory</span>&#8221;</span> field and click the
<span class="quote">&#8220;<span class="quote">View</span>&#8221;</span> button. You can then
view your analysis results as described in
<a class="xref" href="#ugr.tools.doc_analyzer.viewing_results" title="3.3.&nbsp;Viewing the Analysis Results">Section&nbsp;3.3, &#8220;Viewing the Analysis Results&#8221;</a>.</p>
</div>
</div>
<div class="chapter" title="Chapter&nbsp;4.&nbsp;Annotation Viewer" id="ugr.tools.annotation_viewer"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;4.&nbsp;Annotation Viewer</h2></div></div></div>
<p>The <span class="emphasis"><em>Annotation Viewer</em></span> is a tool for viewing analysis results
that have been saved to your disk as <span class="emphasis"><em>external XML representations of the
CAS</em></span>. These are saved in a particular format called XMI. In the UIMA SDK, XML
versions of CASes can be generated by:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>Running the Document Analyzer (see <a href="tools.html#ugr.tools.doc_analyzer" class="olink">Chapter&nbsp;3, <i>Document Analyzer User's Guide</i></a>), which
saves an XML representation of the CAS to the specified output directory.</p>
</li><li class="listitem"><p>Running a Collection Processing Engine that includes the
<span class="emphasis"><em>XMI Writer </em></span>CAS Consumer
(<code class="literal">examples/descriptors/cas_consumer/XmiWriterCasConsumer.xml</code>).
</p></li><li class="listitem"><p>Explicitly creating XML representations of the CAS from your own
application using the <code class="literal">org.apache.uima.cas.impl.XmiCasSerializer</code> class. The best way
to learn how to do this is to look at the example code for the XMI Writer CAS Consumer,
located in
<code class="literal">examples/src/org/apache/uima/examples/xmi/XmiWriterCasConsumer.java</code>.
<sup>[<a name="d5e844" href="#ftn.d5e844" class="footnote">3</a>]</sup> </p></li></ul></div>
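<p>For the last option in the list above, a minimal sketch of writing a CAS as XMI from your own application
might look as follows. It uses the static <code class="literal">serialize(CAS, OutputStream)</code> method of
<code class="literal">org.apache.uima.cas.impl.XmiCasSerializer</code>; the class name <code class="literal">XmiDump</code>
and the <code class="literal">writeXmi</code> helper are illustrative assumptions, and the example source named above
remains the fuller reference.</p>
<pre class="programlisting">import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.uima.cas.CAS;
import org.apache.uima.cas.impl.XmiCasSerializer;

public class XmiDump {
  // Writes the given CAS to the file at path in XMI format.
  static void writeXmi(CAS cas, String path) throws Exception {
    try (OutputStream out = new FileOutputStream(path)) {
      XmiCasSerializer.serialize(cas, out);
    }
  }
}</pre>
<p>Files written this way can be placed in a directory and opened with the Annotation Viewer, together with the
descriptor of the AE that produced the analysis.</p>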
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>The Annotation Viewer only shows CAS views where the Sofa data type is a String.
</p></div>
<p>You can run the Annotation Viewer by executing the
<code class="literal">annotationViewer</code> shell script located in the bin directory of the
UIMA SDK or the "UIMA Annotation Viewer" Eclipse run configuration in the
<code class="literal">uimaj-examples</code> project. This will open the following window:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.annotation_viewer/image002.jpg" width="574" alt="Screenshot of annotationViewer"></td></tr></table></div>
</div>
<p>Select an input directory (which must contain XMI files), and the descriptor for the
AE that produced the Analysis (which is needed to get the type system for the analysis).
Then press the <span class="quote">&#8220;<span class="quote">View</span>&#8221;</span> button.</p>
<p>This will bring up a dialog where you can select a viewing format and double-click on a
document to view it. This dialog is the same as the one that is described in <a href="tools.html#ugr.tools.doc_analyzer.viewing_results" class="olink">Section&nbsp;3.3, &#8220;Viewing the Analysis Results&#8221;</a>.</p>
<div class="footnotes"><br><hr width="100" align="left"><div class="footnote"><p><sup>[<a id="ftn.d5e844" href="#d5e844" class="para">3</a>] </sup>An older, different XML format for the CAS is also provided,
mainly for backwards compatibility. This format is called XCAS, and you can see examples
of its use in
<code class="literal">examples/src/org/apache/uima/examples/cpe/XCasWriterCasConsumer.java</code>.
</p></div></div></div>
<div class="chapter" title="Chapter&nbsp;5.&nbsp;CAS Visual Debugger" id="ugr.tools.cvd"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;5.&nbsp;CAS Visual Debugger</h2></div></div></div>
<div class="section" title="5.1.&nbsp;Introduction"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cvd.introduction">5.1.&nbsp;Introduction</h2></div></div></div>
<p>
The CAS Visual Debugger is a tool to run text analysis engines in UIMA
and view the results. The tool is implemented as a stand-alone GUI
tool using Java's Swing library.
</p>
<p>
This is a developer's tool.&nbsp; It is intended to support you in writing
text analysis annotators for UIMA (Unstructured Information Management
Architecture).&nbsp; As a development tool, the emphasis is not so much on
pretty pictures, but rather on navigability.&nbsp; It is intended to show
you all the information you need, and show it to you quickly (at least
on a fast machine ;-).
</p>
<p>
The main purpose of this application is to let you browse all the data
that was created when you ran an analysis engine over some text.&nbsp; The
display mimics the access methods you have in the CAS API in terms of
indexes, types, feature structures and feature values.
</p>
<p>
As in the CAS, there is special support for annotations.&nbsp; Clicking on
an annotation will select the corresponding text, and conversely, you
can display all annotations that cover a given position in the text.
This will be explained in more detail in the section on the main
display area.
</p>
<p>
As usual, the graphics in this manual are for illustrative purposes
and may not look 100% like the actual version of CVD you are running.
This depends on your operating system, your version of Java, and a
variety of other factors.
</p>
<div class="section" title="5.1.1.&nbsp;Running CVD"><div class="titlepage"><div><div><h3 class="title" id="ugr.cvd.introduction.running">5.1.1.&nbsp;Running CVD</h3></div></div></div>
<p>
You will usually want to start CVD from the command line, or from Eclipse. To start CVD from the
command line, you minimally need the uima-core and uima-tools jars. Below is a sample command
line for sh and its offspring.
</p><pre class="programlisting">java -cp ${UIMA_HOME}/lib/uima-core.jar:${UIMA_HOME}/lib/uima-tools.jar \
    org.apache.uima.tools.cvd.CVD</pre><p>
However, there is no need to type this. The ${UIMA_HOME}/bin directory contains the scripts cvd.sh and
cvd.bat for Unix/Linux/MacOS and Windows, respectively.
</p>
<p>
In Eclipse, a ready-to-use launch configuration is available once you have installed the
UIMA sample project (see <a href="overview_and_setup.html#d4e1" class="olink">UIMA Overview &amp; SDK Setup</a> <a href="overview_and_setup.html#ugr.ovv.eclipse_setup.example_code" class="olink">Section&nbsp;3.2, &#8220;Setting up Eclipse to view Example Code&#8221;</a>). Below is a screenshot of the Eclipse Run
dialog with the CVD
run configuration selected.
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.cvd/eclipse-cvd-launch.jpg" width="459" alt="Eclipse run dialog with CVD selected"></div>
</div><p>
</p>
</div>
<div class="section" title="5.1.2.&nbsp;Command line parameters"><div class="titlepage"><div><div><h3 class="title" id="cvd.introduction.commandline">5.1.2.&nbsp;Command line parameters</h3></div></div></div>
<p>
You can provide some command line parameters to influence the startup behavior of CVD. For
example, if you want to run a certain analysis engine on a certain text over and over again
(for debugging, say), you can make CVD load the annotator and text at startup and execute
the annotator. Here's a list of the supported command line options.
</p>
<div class="table"><a name="cvd.table.commandline"></a><p class="title"><b>Table&nbsp;5.1.&nbsp;Command line options</b></p><div class="table-contents">
<table summary="Command line options" style="border: none;"><colgroup><col><col></colgroup><thead><tr><th style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Option</th><th style="border-bottom: 0.5pt solid black; ">Description</th></tr></thead><tbody><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">-text &lt;textFile&gt;</code>
</td><td style="border-bottom: 0.5pt solid black; ">Loads the text file <code class="computeroutput">&lt;textFile&gt;</code></td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">-desc &lt;descriptorFile&gt;</code>
</td><td style="border-bottom: 0.5pt solid black; ">Loads the descriptor <code class="computeroutput">&lt;descriptorFile&gt;</code></td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">-exec</code>
</td><td style="border-bottom: 0.5pt solid black; ">Runs the pre-loaded annotator; only allowed in conjunction with <code class="computeroutput">-desc</code> </td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">-datapath &lt;datapath&gt;</code>
</td><td style="border-bottom: 0.5pt solid black; ">Sets the data path to <code class="computeroutput">&lt;datapath&gt;</code></td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">-ini &lt;iniFile&gt;</code>
</td><td style="border-bottom: 0.5pt solid black; ">Makes CVD use the alternative ini file <code class="computeroutput">&lt;iniFile&gt;</code> (default is ~/annotViewer.pref)</td></tr><tr><td style="border-right: 0.5pt solid black; ">
<code class="computeroutput">-lookandfeel &lt;lnfClass&gt;</code>
</td><td style="">Uses alternative look-and-feel <code class="computeroutput">&lt;lnfClass&gt;</code></td></tr></tbody></table>
</div></div><br class="table-break">
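<p>The options can be combined. As a sketch, the small launcher below passes <code class="computeroutput">-desc</code>,
<code class="computeroutput">-text</code> and <code class="computeroutput">-exec</code> to the CVD main class, which is the
same class started by the sample command line earlier in this chapter. The class name <code class="literal">LaunchCvd</code>
and both file paths are placeholders, not files shipped with the SDK; the uima-core and uima-tools jars must be on the classpath.</p>
<pre class="programlisting">import org.apache.uima.tools.cvd.CVD;

public class LaunchCvd {
  public static void main(String[] args) throws Exception {
    // Both paths are placeholders; substitute your own descriptor and text file.
    CVD.main(new String[] {
      "-desc", "desc/MyAnnotator.xml", // load this analysis engine descriptor
      "-text", "data/sample.txt",      // load this text file
      "-exec"                          // run the loaded annotator (requires -desc)
    });
  }
}</pre>
<p>A launcher like this can be convenient when you debug the same annotator on the same text repeatedly from an IDE.</p>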
</div>
</div>
<div class="section" title="5.2.&nbsp;Error Handling"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="cvd.errorHandling">5.2.&nbsp;Error Handling</h2></div></div></div>
<p>
On encountering
an error, CVD will pop up an error dialog with a short,
usually incomprehensible message.&nbsp; Often, the error message will
claim that there is more information available in the log file, and
sometimes, this is actually true; so do go and check the log.&nbsp; You
can view the log file by selecting the appropriate item in the
"Tools" menu.
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.cvd/ErrorExample.jpg" alt="Sample error dialog"></div>
</div><p>
</p>
</div>
<div class="section" title="5.3.&nbsp;Preferences File"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="cvd.preferencesFile">5.3.&nbsp;Preferences File</h2></div></div></div>
<p>
The program will attempt to read on startup and save on exit a file
called annotViewer.pref in your home directory.&nbsp; This file contains
information about choices you made while running the program:
directories (such as where your data files are) and window sizes.&nbsp;
These settings will be used the next time you use the program. There
is no user control over this process, but the file format is
reasonably transparent, in case you feel like changing it.&nbsp; Note,
however, that the file will be overwritten every time you exit the
program.
</p>
<p>
If you use CVD for several projects, it may be convenient to use a different
ini file for each project. You can specify the ini file CVD should use
with the </p><pre class="programlisting">-ini &lt;iniFile&gt;</pre><p> parameter on the
command line.
</p>
</div>
<div class="section" title="5.4.&nbsp;The Menus"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="cvd.theMenus">5.4.&nbsp;The Menus</h2></div></div></div>
<p>
We give a brief description of the various menus. All menu items come
with mnemonics (e.g., Alt-F X will exit the program). In addition,
some menu items have their own keyboard accelerators that you can use
anywhere in the program. For example, Ctrl-S will save the text
you've been editing.
</p>
<div class="section" title="5.4.1.&nbsp;The File Menu"><div class="titlepage"><div><div><h3 class="title" id="cvd.fileMenu">5.4.1.&nbsp;The File Menu</h3></div></div></div>
<p>
The File menu lets you load, create and save text, load and save
color settings, and import and export the XCAS format. Here's a
screenshot.
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.cvd/FileMenu.jpg" alt="The File menu"></div>
</div><p>
</p>
<div class="itemizedlist"><p>
Below is a list of the menu items, together with an explanation.
</p><ul class="itemizedlist" type="disc"><li class="listitem">
<p title="New Text...">
<b>New Text...&nbsp;</b>
Clears the text area. Text you type is written to an anonymous
buffer. You can use "Save Text As..." to save the text
you typed to a file. Note: whenever you modify the text, be it
through typing, loading a file or using the "New
Text..." menu item, previous analysis results will be lost.
Since the previous analysis is specific to the text, modifying
the text invalidates the analysis.
</p>
</li><li class="listitem">
<p title="Open Text File">
<b>Open Text File.&nbsp;</b>
Loads a new text file into the viewer.&nbsp; The next time you run an
analysis engine, it will run on the text you loaded last.&nbsp; Depending
on the annotator you're using, the program may run slowly with very
large text files, so you may want to experiment.
</p>
</li><li class="listitem">
<p title="Save Text File">
<b>Save Text File.&nbsp;</b>
Saves the currently open text file. If no file is currently
loaded (either because you haven't loaded a file, or you've used
the "New Text..." menu item), this menu item is
disabled (and Ctrl-S will do nothing).
</p>
</li><li class="listitem">
<p title="Save Text As...">
<b>Save Text As...&nbsp;</b>
Save the text to a file of your choosing. This can be an existing
file, which is then overwritten, or it can be a new file that
you're creating.
</p>
</li><li class="listitem">
<p title="Change Code Page">
<b>Change Code Page.&nbsp;</b>
Allows you to change the code page that is used to load and save
text files. If you're sure the text you're loading is in ASCII or
one of the 8-bit extensions such as ISO-8859-1 (ISO Latin1),
there is probably nothing you need to do. Just load the text and
look at the display. If you see no funny characters or square
boxes, chances are your selected code page is compatible with
your text file.
Note that the code page setting is also in effect when you save
files. You can observe the effects with a hex editor or by just
looking at the file size. For example, if you save the default
text
<code class="computeroutput">This is where the text goes.</code>
to a file on Windows using the default code page, the size of the
file will be 28 bytes. If you now change the code page to UTF-16
and save the file again, the file size will be 58 bytes: two
bytes for each character, plus two bytes for the byte-order mark.
Now switch the code page back to the default Windows code page
and reload the UTF-16 file to see the difference in the editor.
CVD will display all code pages that are available in the JVM
you're running it on. The first code page in the list is the
default code page of your system. This is also CVD's default if
you don't make a specific choice.
Your code page selection will be remembered in CVD's ini file.
</p>
</li><li class="listitem">
<p title="Load Color Settings">
<b>Load Color Settings.&nbsp;</b>
Load previously saved color settings from a file (see
Tools/Customize Annotation Display).&nbsp; It is highly recommended
that you only load automatically generated files.&nbsp; Strange things
may happen if you try to load the wrong file format. On startup,
the program attempts to load the last color settings file that
you loaded or saved during a previous session. If you intend to
use the same color settings as the last time you ran the program,
there is therefore no need to manually load a color settings
file.
</p>
</li><li class="listitem">
<p title="Save Color Settings">
<b>Save Color Settings.&nbsp;</b>
Save your customized color settings (see Tools/Customize
Annotation Display).&nbsp; The file is a Java properties file, and as
such, reasonably transparent.&nbsp; What is not transparent is the
encoding of the colors (integer encoding of 24-bit RGB values),
so changing the file by hand is not really recommended.
</p>
</li><li class="listitem">
<p title="Read Type System File">
<b>Read Type System File.&nbsp;</b>
Load a type system file. This allows you to load an XCAS file
without having to have access to the corresponding annotator.
</p>
</li><li class="listitem">
<p title="Write Type System File">
<b>Write Type System File.&nbsp;</b>
Create a type system file from the currently loaded type
definitions. In addition, you can save the current CAS as an XCAS
file (see below). This allows you to later load the type system
and XCAS to view the CAS without having to rerun the annotator.
</p>
</li><li class="listitem">
<p title="Read XMI CAS File">
<b>Read XMI CAS File.&nbsp;</b>
Read an XMI CAS file. Important: XMI CAS is a serialization format that
serializes a CAS without type system and index information. It is
therefore impossible to read in a stand-alone XMI CAS file. XMI CAS
files can only be interpreted in the context of an existing type
system. Consequently, you need to first load the Analysis Engine that was used to
create the XMI file, to be able to load that XMI file.
</p>
</li><li class="listitem">
<p title="Write XMI CAS File">
<b>Write XMI CAS File.&nbsp;</b>
Writes the current analysis out as an XMI CAS file.
</p>
</li><li class="listitem">
<p title="Read XCAS File">
<b>Read XCAS File.&nbsp;</b>
Read an XCAS file. Important: XCAS is a serialization format that
serializes a CAS without type system and index information. It is
therefore impossible to read in a stand-alone XCAS file. XCAS
files can only be interpreted in the context of an existing type
system. Consequently, you need to load the Analysis Engine that was used to
create the XCAS file to be able to load it. Loading an XCAS file
without loading the Analysis Engine may produce strange errors. You may get
syntax errors on loading the XCAS file, or worse, everything may
appear to go smoothly but in reality your CAS may be corrupted.
</p>
</li><li class="listitem">
<p title="Write XCAS File">
<b>Write XCAS File.&nbsp;</b>
Writes the current analysis out as an XCAS file.
</p>
</li><li class="listitem">
<p title="Exit">
<b>Exit.&nbsp;</b>
Exits the program. Your preferences will be saved.
</p>
</li></ul></div>
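<p>
The byte counts quoted under "Change Code Page" above can be checked outside of CVD
with a few lines of Java. This is only a sketch: the class and method names are made up
for illustration, and ISO-8859-1 stands in for whatever 8-bit default code page your
system uses.
</p>

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodePageSizes {

    // Number of bytes the text occupies when encoded with the given charset.
    static int encodedSize(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "This is where the text goes.";

        // One byte per character in an 8-bit code page (assumed here: ISO-8859-1).
        System.out.println(encodedSize(text, StandardCharsets.ISO_8859_1)); // 28

        // Two bytes per character, plus a two-byte byte-order mark.
        System.out.println(encodedSize(text, StandardCharsets.UTF_16)); // 58

        // The JVM's default charset; the result depends on your system.
        System.out.println(Charset.defaultCharset().name());
    }
}
```

<p>
<code class="computeroutput">Charset.availableCharsets()</code> returns the charsets known to the
JVM, which is presumably the kind of list the code page menu is built from.
</p>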
</div>
<div class="section" title="5.4.2.&nbsp;The Edit Menu"><div class="titlepage"><div><div><h3 class="title" id="cvd.editMenu">5.4.2.&nbsp;The Edit Menu</h3></div></div></div>
<p>
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.cvd/EditMenu.jpg" alt="The Edit menu"></div>
</div><p>
The "Edit" menu provides a standard text editing menu with
Cut, Copy and Paste, as well as unlimited Undo.
</p>
<p>
Note that standard keyboard accelerators Ctrl-X, Ctrl-C, Ctrl-V and
Ctrl-Z can be used for Cut, Copy, Paste and Undo, respectively. The
text area supports other standard keyboard operations, such as
navigation with HOME, Ctrl-HOME etc., as well as marking text with
Shift-&lt;ArrowKey&gt;.
</p>
</div>
<div class="section" title="5.4.3.&nbsp;The Run Menu"><div class="titlepage"><div><div><h3 class="title" id="cvd.runMenu">5.4.3.&nbsp;The Run Menu</h3></div></div></div>
<p>
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.cvd/RunMenu.jpg" alt="The Run menu"></div>
</div><p>
In the Run menu, you can load and run text analysis engines.
</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">
<p title="Load AE">
<b>Load AE.&nbsp;</b>
Loads and initializes a text analysis engine. Choosing this menu
item will display a file open dialog where you should choose an
XML descriptor of a Text Analysis Engine to process the current
text.&nbsp; Even if the analysis engine runs fast, this will take a
while, since there is a lot of setup work to do when a new TAE is
created.&nbsp; So be patient.
When you develop a new annotator, you will often need to
recompile your code. CVD will not reload your annotator code.
When you recompile your code, you need to terminate the GUI and
restart it. If you only make changes to the XML descriptor, you
don't need to restart the GUI. Simply reload the XML file.
</p>
</li><li class="listitem">
<p title="Run AE">
<b>Run AE.&nbsp;</b>
Before you have (successfully) loaded a TAE, this menu item will
be disabled. After you have loaded a TAE, it will be enabled, and
the name changes according to the name of the TAE you have
loaded. For example, if you've loaded "The World's Fastest
Parser", you will have a menu item called "Run The
World's Fastest Parser". When you choose the item, the TAE
is run on whatever text you have currently loaded.
After a TAE has run successfully, the index window in the upper
left-hand corner of the screen should be updated and show the
indexes that were created by this run.&nbsp; We will have more to say
about indexes and what to do with them later.
</p>
</li><li class="listitem">
<p title="Run AE on CAS">
<b>Run AE on CAS.&nbsp;</b>
This allows you to run an analysis engine on the current CAS.
This is useful if you have loaded a CAS from an XCAS file, and
would like to run further analysis on it.
</p>
</li><li class="listitem">
<p title="Run collectionProcessComplete">
<b>Run collectionProcessComplete.&nbsp;</b>
When you select this item, the analysis engine's
collectionProcessComplete() method is called.
</p>
</li><li class="listitem">
<p title="Performance Report">
<b>Performance Report.&nbsp;</b>
After you've run your analysis, you can view a performance report. It will show
you where the time went: which component used how much of the processing time.
</p>
</li><li class="listitem">
<p title="Recently used">
<b>Recently used.&nbsp;</b>
Collects a list of recently used analysis engines as a short-cut
for loading.
</p>
</li><li class="listitem">
<p title="Language">
<b>Language.&nbsp;</b>
Some annotators do language specific processing. For example, if
you run lexical analysis, the results may be quite different
depending on what the analysis engine thinks the language of the
document is. With this menu item, you can manually set the
document language. Alternatively, you can use an automatic
language identification annotator. If the analysis engines you're
working with are language agnostic, there is no need to set the
language.
</p>
</li></ul></div>
</div>
<div class="section" title="5.4.4.&nbsp;The tools menu"><div class="titlepage"><div><div><h3 class="title" id="cvd.toolsMenu">5.4.4.&nbsp;The tools menu</h3></div></div></div>
<p>
The tools menu contains some assorted utilities, such as the log
file viewer. Here you can also set the log level for UIMA.
A more detailed description of some of the menu items
follows below.
</p>
<div class="section" title="5.4.4.1.&nbsp;View Type System"><div class="titlepage"><div><div><h4 class="title" id="cvd.viewTypeSystem">5.4.4.1.&nbsp;View Type System</h4></div></div></div>
<p>
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.cvd/TypeSystemViewer.jpg"></div>
</div><p>
Brings up a new window that displays the type system. This menu
item is disabled until the first time you have run an analysis
engine, since there is no type system to display until then. An
example is shown above.
</p>
<p>
You can view the inheritance tree on the left by expanding and
collapsing nodes.&nbsp; When you select a type, the features defined on
that type are displayed in the table on the right.&nbsp; The feature
table has three columns.&nbsp; The first gives the name of the feature,
the second one the type of the feature (i.e., what values it
takes), and the third column displays the highest type this feature
is defined on.&nbsp; In this example, the features "begin" and
"end" are inherited from the built-in annotation type.
</p>
<p>
In the options menu, you can configure if you want to see inherited
features or not (not yet implemented).
</p>
</div>
<div class="section" title="5.4.4.2.&nbsp;Show Selected Annotations"><div class="titlepage"><div><div><h4 class="title" id="cvd.showSelectedAnnotations">5.4.4.2.&nbsp;Show Selected Annotations</h4></div></div></div>
<p>
</p><div class="figure"><a name="AnnotationViewerFigure"></a><div class="figure-contents">
<div class="mediaobject"><img src="images/tools/tools.cvd/AnnotationViewer.jpg"></div>
</div><p class="title"><b>Figure&nbsp;5.1.&nbsp;
Annotations produced by a statistical named entity tagger
</b></p></div><p><br class="figure-break">
</p>
<p>
To enable this menu, you must have run an analysis engine and
selected the "AnnotationIndex" or one of its subnodes in the
upper left-hand corner of the screen.&nbsp; It will bring up a new text
window with all selected annotations marked up in the text.&nbsp;
</p>
<p>
<a class="xref" href="#AnnotationViewerFigure" title="Figure&nbsp;5.1.&nbsp; Annotations produced by a statistical named entity tagger">Figure&nbsp;5.1, &#8220;
Annotations produced by a statistical named entity tagger
&#8221;</a>
shows the results of applying a statistical named entity tagger to
a newspaper article.&nbsp; Some annotation colors have been customized:
countries are in reverse video, organizations have a turquoise
background, person names are green, and occupations have a maroon
background.&nbsp; The default background color is yellow.&nbsp; This color is
also used if there is more than one annotation spanning a certain
text.&nbsp; Clearly, this display is only useful if you don't have any
overlapping annotations, or at least not too many.
</p>
<p>
This menu item is also available as a context menu in the Index
Tree area of the main window. To use it, select the annotation
index or one of its subnodes, right-click to bring up a popup menu,
and select the only item in the popup menu. The popup menu is
actually a better way to invoke the annotation display, since it
changes according to the selection in the Index Tree area, and will
tell you if what you've selected can be displayed or not.
</p>
</div>
</div>
</div>
<div class="section" title="5.5.&nbsp;The Main Display Area"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="cvd.mainDisplayArea">5.5.&nbsp;The Main Display Area</h2></div></div></div>
<p>
The main display area has three sub-areas.&nbsp; In the upper left-hand
corner is the
<span class="bold"><strong>index display</strong></span>, which shows the indexes that were defined in the
AE, as well as
the types of the indexes and their subtypes.&nbsp; In the lower left-hand
corner, the content of indexes and sub-indexes is displayed
(<span class="bold"><strong>FS display</strong></span>).&nbsp; Clicking on any node in the index display will
show the
corresponding feature structures in the FS display.&nbsp; You can explore
those structures by expanding the tree nodes.&nbsp; Clicking on a
node that represents an annotation will cause the
corresponding text span to be marked in the
<span class="bold"><strong>text display</strong></span>.
</p>
<p>
</p><div class="figure"><a name="Main1Figure"></a><div class="figure-contents">
<div class="mediaobject"><img src="images/tools/tools.cvd/Main1.jpg"></div>
</div><p class="title"><b>Figure&nbsp;5.2.&nbsp;State of GUI after running an analysis engine</b></p></div><p><br class="figure-break">
</p>
<p>
<a class="xref" href="#Main1Figure" title="Figure&nbsp;5.2.&nbsp;State of GUI after running an analysis engine">Figure&nbsp;5.2, &#8220;State of GUI after running an analysis engine&#8221;</a>
shows the state after running the UIMA_Analysis_Example.xml aggregate from the
uimaj-examples project.&nbsp; There are two indexes in the index display, and the
annotation index has been selected.&nbsp; Note that the number of
structures in an index is displayed in square brackets after the
index name.
</p>
<p>
Since displaying thousands of sister nodes is both confusing and
slow, nodes are grouped in powers of 10.&nbsp; As soon as there are no
more than 100 sister nodes, they are displayed next to each other.
</p>
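<p>
The grouping rule just described can be read in more than one way. The following sketch
shows one plausible reading, not CVD's actual implementation: pick the smallest power of
10 as group size such that no more than 100 groups remain at one level. The class and
method names are hypothetical.
</p>

```java
public class NodeGrouping {

    // Largest number of sister nodes displayed next to each other (from the text above).
    static final int MAX_SISTERS = 100;

    // Returns 1 if the nodes can be shown directly; otherwise the smallest
    // power of 10 that leaves at most MAX_SISTERS groups at this level.
    static int groupSize(int nodeCount) {
        if (nodeCount <= MAX_SISTERS) {
            return 1;
        }
        int size = 10;
        // Use long arithmetic for the rounded-up division to avoid overflow.
        while (((long) nodeCount + size - 1) / size > MAX_SISTERS) {
            size *= 10;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(groupSize(80));     // 1: shown without grouping
        System.out.println(groupSize(2500));   // 100: 25 groups of 100 nodes
        System.out.println(groupSize(150000)); // 10000: 15 groups, each grouped again
    }
}
```

<p>
Under this reading, a group that still holds more than 100 nodes is subdivided by
applying the same rule to its contents.
</p>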
<p>
In our example, a name annotation has been selected, and the
corresponding token text is highlighted in the text area.&nbsp; We have
also expanded the token node to display its structure (not much to see in this simple example).
</p>
<p>
In <a class="xref" href="#Main1Figure" title="Figure&nbsp;5.2.&nbsp;State of GUI after running an analysis engine">Figure&nbsp;5.2, &#8220;State of GUI after running an analysis engine&#8221;</a>, we selected an annotation in the FS display to find the
corresponding text.&nbsp; We can also do the reverse and find out what
annotations cover a certain point in the text.&nbsp; Let's go back to the
name recognizer for an example.
</p>
<p>
</p><div class="figure"><a name="Main2Figure"></a><div class="figure-contents">
<div class="mediaobject"><img src="images/tools/tools.cvd/Main2.jpg"></div>
</div><p class="title"><b>Figure&nbsp;5.3.&nbsp;
Finding annotations for a specific location in the text
</b></p></div><p><br class="figure-break">
</p>
<p>
We would like to know if the name Michael Baessler has been
recognized as a name.&nbsp; So we position the cursor somewhere in the
corresponding text span, then right-click to bring up the context menu
telling us which annotations exist at this point. An example is shown
in
<a class="xref" href="#Main2Figure" title="Figure&nbsp;5.3.&nbsp; Finding annotations for a specific location in the text">Figure&nbsp;5.3, &#8220;
Finding annotations for a specific location in the text
&#8221;</a>.
</p>
<p>
</p><div class="figure"><a name="Main3Figure"></a><div class="figure-contents">
<div class="mediaobject"><img src="images/tools/tools.cvd/Main3.jpg"></div>
</div><p class="title"><b>Figure&nbsp;5.4.&nbsp;
Selecting an annotation from the context menu will highlight that
annotation in the FS display
</b></p></div><p><br class="figure-break">
</p>
<p>
At this point (<a class="xref" href="#Main2Figure" title="Figure&nbsp;5.3.&nbsp; Finding annotations for a specific location in the text">Figure&nbsp;5.3, &#8220;
Finding annotations for a specific location in the text
&#8221;</a>),
we only know that somewhere around the text cursor position (not
visible in the picture), we discovered a name. When we select the corresponding entry in the
context menu, the name annotation is selected in the FS display, and its covered text is
highlighted.
<a class="xref" href="#Main3Figure" title="Figure&nbsp;5.4.&nbsp; Selecting an annotation from the context menu will highlight that annotation in the FS display">Figure&nbsp;5.4, &#8220;
Selecting an annotation from the context menu will highlight that
annotation in the FS display
&#8221;</a> shows the display after
the name node has been selected in
the popup menu.
</p>
<p>
We're glad to see that, indeed, Michael Baessler is
considered to be a name.&nbsp; Note that in the FS display, the
corresponding annotation node has been selected, and the tree has
been expanded to make the node visible.
</p>
<p>
Note that the annotations displayed in the popup menu come from the
annotations currently displayed in the FS display.&nbsp; If you didn't
select the annotation index or one of its sub-nodes, no annotations
can be displayed and the popup menu will be empty.
</p>
<div class="section" title="5.5.1.&nbsp;The Status Bar"><div class="titlepage"><div><div><h3 class="title" id="cvd.statusBar">5.5.1.&nbsp;The Status Bar</h3></div></div></div>
<p>
At the bottom of the screen, some useful information is displayed in
the
<span class="bold"><strong>status bar</strong></span>. The left-most area shows the most recent major event, with the
time when the event terminated in square brackets. The next area
shows the file name of the currently loaded XML descriptor. This
area supports a tool tip that will show the full path to the file.
The right-most area shows the current cursor position, or the extent
of the selection, if a portion of the text has been selected. The
numbers correspond to the character offsets that are used for
annotations.
</p>
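<p>
To relate the numbers in the status bar to annotation offsets, the following sketch may
help. It assumes the usual UIMA convention that an annotation's begin offset is inclusive
and its end offset is exclusive, so that both can be used directly as Java string indexes.
The class name, method name and sample text are invented for illustration.
</p>

```java
public class CoveredText {

    // Text covered by an annotation with the given begin (inclusive)
    // and end (exclusive) character offsets.
    static String coveredText(String documentText, int begin, int end) {
        return documentText.substring(begin, end);
    }

    public static void main(String[] args) {
        String text = "Michael Baessler works on UIMA.";
        // A name annotation spanning offsets 0 to 16 covers the full name.
        System.out.println(coveredText(text, 0, 16)); // Michael Baessler
    }
}
```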
</div>
<div class="section" title="5.5.2.&nbsp;Keyboard Navigation and Shortcuts"><div class="titlepage"><div><div><h3 class="title" id="cvd.keyboardNavigation">5.5.2.&nbsp;Keyboard Navigation and Shortcuts</h3></div></div></div>
<p>
The GUI can be completely navigated and operated through the
keyboard. All menus and menu items support keyboard mnemonics, and
some common operations are accessible through keyboard accelerators.
</p>
<p>
You can move the focus between the three main areas using
<code class="computeroutput">Tab</code>
(clockwise) and
<code class="computeroutput">Shift-Tab</code>
(counterclockwise). When the focus is on the text area, the
<code class="computeroutput">Tab</code>
key will insert the corresponding character into the text, so you
will need to use
<code class="computeroutput">Ctrl-Tab</code>
and
<code class="computeroutput">Ctrl-Shift-Tab</code>
instead. Alternatively, you can use the following key bindings to
jump directly to one of the areas:
<code class="computeroutput">Ctrl-T</code>
to focus the text area,
<code class="computeroutput">Ctrl-I</code>
for the index repository frame and
<code class="computeroutput">Ctrl-F</code>
for the feature structure area.
</p>
<p>
Some additional keyboard shortcuts are available only in the text
area, such as
<code class="computeroutput">Ctrl-X</code>
for Cut,
<code class="computeroutput">Ctrl-C</code>
for Copy,
<code class="computeroutput">Ctrl-V</code>
for Paste and
<code class="computeroutput">Ctrl-Z</code>
for Undo. The context menu in the text area can be invoked through the
<code class="computeroutput">Alt-Enter</code>
shortcut. Text can be selected using the arrow keys while holding
the
<code class="computeroutput">Shift</code>
key.
</p>
<p>
The following table shows the supported keyboard shortcuts.
</p>
<div class="table"><a name="cvd.table.keyboardShortcuts"></a><p class="title"><b>Table&nbsp;5.2.&nbsp;Keyboard shortcuts</b></p><div class="table-contents">
<table summary="Keyboard shortcuts" style="border: none;"><colgroup><col><col><col></colgroup><thead><tr><th style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Shortcut</th><th style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Action</th><th style="border-bottom: 0.5pt solid black; ">Scope</th></tr></thead><tbody><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-O</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Open text file</td><td style="border-bottom: 0.5pt solid black; ">Global</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-S</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Save text file</td><td style="border-bottom: 0.5pt solid black; ">Global</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-L</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Load AE descriptor</td><td style="border-bottom: 0.5pt solid black; ">Global</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-R</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Run current AE</td><td style="border-bottom: 0.5pt solid black; ">Global</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-I</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Switch focus to index repository</td><td style="border-bottom: 0.5pt solid black; ">Global</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-T</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Switch focus to text area</td><td style="border-bottom: 0.5pt solid black; ">Global</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-F</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Switch focus to FS area</td><td style="border-bottom: 0.5pt solid black; ">Global</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-X</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Cut selection</td><td style="border-bottom: 0.5pt solid black; ">Text</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-C</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Copy selection</td><td style="border-bottom: 0.5pt solid black; ">Text</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-V</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Paste selection</td><td style="border-bottom: 0.5pt solid black; ">Text</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">
<code class="computeroutput">Ctrl-Z</code>
</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; ">Undo</td><td style="border-bottom: 0.5pt solid black; ">Text</td></tr><tr><td style="border-right: 0.5pt solid black; ">
<code class="computeroutput">Alt-Enter</code>
</td><td style="border-right: 0.5pt solid black; ">Show context menu</td><td style="">Text</td></tr></tbody></table>
</div></div><br class="table-break">
</div>
</div>
</div>
<div class="chapter" title="Chapter&nbsp;6.&nbsp;Eclipse Analysis Engine Launcher's Guide" id="ugr.tools.eclipse_launcher"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;6.&nbsp;Eclipse Analysis Engine Launcher's Guide</h2></div></div></div>
<p>
The Analysis Engine Launcher is an Eclipse plug-in that provides debug and run support
for Analysis Engines directly within Eclipse, in the same way that a Java program can be debugged.
It supports most of the descriptor formats except CPE, UIMA AS and
some remote deployment descriptors.
</p>
<div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="495"><tr><td><img src="images/tools/tools.eclipse_launcher/image01.png" width="495"></td></tr></table></div>
</div>
<div class="section" title="6.1.&nbsp;Creating an Analysis Engine launch configuration"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.eclipse_launcher.create_configuration">6.1.&nbsp;Creating an Analysis Engine launch configuration</h2></div></div></div>
<p>
To debug or run an Analysis Engine, a launch configuration must be created. To do this,
select "Run -&gt; Run Configurations" or "Run -&gt; Debug Configurations" from the menu bar. A dialog
will open where the launch configuration can be created. Select UIMA Analysis Engine and create
a new configuration by pressing the New button at the top, or via the New button in the context menu.
The newly created configuration will be automatically selected and the Main tab will be displayed.
</p>
<p>
The Main tab defines the Analysis Engine which will be launched. First select the project which
contains the descriptor, then choose a descriptor and select the input. The input can either be
a folder which contains input files or a single input file. If the "recursively" check box
is marked, the input folder will be scanned recursively for input files.
</p>
<p>
The input format defines the format of the input files. If it is set to CASes, the input resource
must be in either the XMI or the XCAS format; if it is set to plain text, plain text input files in
the specified encoding are expected. When the input is a folder, the input logic filters out all files
that do not have an appropriate file ending: depending on the chosen format, the file ending must be
one of .xcas, .xmi or .txt, and all other files are ignored. If a single file is selected, it will be
processed regardless of its file ending.
</p>
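<p>
The file-ending rule for folder input can be pictured as a simple filter. The sketch below
is not the launcher's code; the class, enum and method names are hypothetical, and the
case-sensitive comparison is an assumption, since the text above does not say how
upper-case endings are treated.
</p>

```java
public class InputFileFilter {

    // The two input formats offered on the Main tab (names are illustrative).
    enum InputFormat { CAS, PLAIN_TEXT }

    // Decides whether a file found in an input folder would be processed.
    // Assumption: endings are compared case-sensitively.
    static boolean accept(String fileName, InputFormat format) {
        switch (format) {
            case CAS:
                return fileName.endsWith(".xmi") || fileName.endsWith(".xcas");
            case PLAIN_TEXT:
                return fileName.endsWith(".txt");
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accept("report.xmi", InputFormat.CAS));        // true
        System.out.println(accept("report.txt", InputFormat.CAS));        // false
        System.out.println(accept("report.txt", InputFormat.PLAIN_TEXT)); // true
    }
}
```

<p>
A single file selected as input bypasses this check, as described above.
</p>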
<p>
The output directory is optional. If it is set, all processed input files will be written to the specified
directory in the XMI CAS format. If the clear check box is marked, all files inside the output folder will be deleted.
Usually this option is not needed, because existing files will be overwritten without notice.
</p>
<p>
The other tabs in the launch configuration are documented in the Eclipse documentation;
see "Java development user guide -&gt; Tasks -&gt; Running and Debugging".
</p>
</div>
<div class="section" title="6.2.&nbsp;Launching an Analysis Engine"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.eclipse_launcher.launching">6.2.&nbsp;Launching an Analysis Engine</h2></div></div></div>
<p>
To launch an Analysis Engine, go to the previously created launch configuration and
click on "Debug" or "Run", depending on the desired run mode. The Analysis Engine will
now be launched. The output will be shown in the Console View. To debug an Analysis Engine,
place breakpoints inside the implementation class. If a breakpoint is hit, execution will pause
just as it does in a Java program.
</p>
</div>
</div>
<div class="chapter" title="Chapter&nbsp;7.&nbsp;Cas Editor User's Guide" id="ugr.tools.ce"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;7.&nbsp;Cas Editor User's Guide</h2></div></div></div>
<div class="section" title="7.1.&nbsp;Introduction"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="sandbox.caseditor.Introduction">7.1.&nbsp;Introduction</h2></div></div></div>
<p>
The CAS Editor is an Eclipse-based annotation tool which supports manual and automatic
annotation (via running UIMA annotators) of CASes stored in files.
Currently only text-based CASes are supported.
The CAS Editor can visualize and edit all feature structures.
Feature structures which are annotations can additionally be viewed and edited
directly on the text.
</p>
<div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.caseditor/CasEditor.png" width="564"></td></tr></table></div>
</div>
</div>
<div class="section" title="7.2.&nbsp;Launching the Cas Editor"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="sandbox.caseditor.Launching">7.2.&nbsp;Launching the Cas Editor</h2></div></div></div>
<p>
To open a CAS, the Cas Editor needs a compatible type system
and styling information that specifies how to display the types.
The styling information is created automatically by the Cas Editor,
but the type system file must be provided by the user.
</p>
<p>A CAS in the XMI or XCAS format can simply be opened by clicking
on it, just as a text file is opened with the Eclipse text editor.</p>
<div class="section" title="7.2.1.&nbsp;Specifying a type system"><div class="titlepage"><div><div><h3 class="title" id="sandbox.caseditor.typeSystemSpec">7.2.1.&nbsp;Specifying a type system</h3></div></div></div>
<p>
The Cas Editor expects a type system file at the root of the project
named TypeSystem.xml. If a type system cannot be found, this message is shown:
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/ProvideTypeSystem.png" alt="No type system available for the opened CAS."></div>
</div><p>
</p>
<p>
If the type system file does not exist in this
location, you can point the Cas Editor to a specific
type system file.
You can also change the default type system location in the
properties page of the Eclipse project. To do that, right-click the project,
select Properties, go to the UIMA Type System tab, and specify the default
location for the type system file.
</p>
</div>
<p>
After the Cas Editor is opened, switch to the Cas Editor
Perspective to see all the Cas Editor related views.
</p>
</div>
<div class="section" title="7.3.&nbsp;Annotation editor"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="sandbox.caseditor.annotation_editor">7.3.&nbsp;Annotation editor</h2></div></div></div>
<p>
The annotation editor shows the text with annotations and
provides different views to show aspects of the CAS.
</p>
<div class="section" title="7.3.1.&nbsp;Editor"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cas_editor.annotation_editor.editor">7.3.1.&nbsp;Editor</h3></div></div></div>
<p>
After the editor is opened, it shows the default sofa of the CAS.
(Displaying another sofa is currently not possible.)
The editor has an associated, changeable CAS Type. This type is called the editor "mode".
By default the editor only shows annotations of this type. Actions and views are
sensitive to this mode. The next screen shows the display, where the mode is set to "Person":
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.caseditor/EditorOneType.png" width="564"></td></tr></table></div>
</div><p>
To change the mode for the editor, use the "Mode" menu in the editor context menu.
To open the context menu, right click somewhere on the text.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.caseditor/ModeMenu.png" width="564"></td></tr></table></div>
</div><p>
The current mode is displayed in the status line at the bottom and in the Style View.
</p>
<p>
It's possible to work with more than one annotation type at a time; the mode just selects the default annotation type
which can be marked with the fewest keystrokes. To show annotations of other types, use the "Show" menu in
the context menu.
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/ShowAnnotationsMenu.png"></div>
</div><p>
Alternatively, you may select the annotation types to be shown in the Style View.
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/StyleView2.png"></div>
</div><p>
The editor will show the additional selected types.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.caseditor/EditorAllTypes.png" width="564"></td></tr></table></div>
</div><p>
The annotation renderer and rendering layer can be changed in the Properties dialog. After the
change all editors which share the same type system will be updated.
</p>
<p>
The editor automatically selects annotations of the editor mode type that are near the
cursor. This selection is then synchronized or displayed in other views.
</p>
<p>
To create an annotation manually using the editor, mark a piece of text and then
press the enter key. This creates an annotation of the
type of the editor mode, having bounds corresponding to the selection.
You can also use the "Quick Annotate" action from the context menu.
</p>
<p>
It is also possible to choose the annotation type; press
shift + enter (smart insert) or click on "Annotate" in the context menu for this.
A dialog will ask for the annotation type to create; either select the desired type or use
the associated key shortcut. In the screen shot below, pressing the "p" key
will create a Person annotation for "Obama".
</p>
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/ShiftEnter.png"></div>
</div>
<p>
To delete an annotation, select it and press the delete
key. Only annotations of the editor mode can be deleted with this method.
To delete non-editor mode annotations use the Outline View.
</p>
<p>
For annotation projects you can change the font size in the editor.
The default font size is 13. To change it, open the Eclipse preference dialog
and go to "UIMA Annotation Editor".
</p>
</div>
<div class="section" title="7.3.2.&nbsp;Configure annotation styling"><div class="titlepage"><div><div><h3 class="title" id="sandbox.caseditor.annotation_editor.styling">7.3.2.&nbsp;Configure annotation styling</h3></div></div></div>
<p>
The Cas Editor can visualize the annotations in multiple
highlighting colors and with different annotation drawing styles.
The annotation styling is defined per type system. When it is changed,
the appearance changes in all opened editors sharing that type system.
</p>
<p>
The styling is initialized with a unique color for every
annotation type, and every annotation is drawn with the
Squiggles annotation style. You may adjust
the annotation styles and coloring to suit the needs
of the project.
</p>
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/StyleView.png"></div>
</div>
<p>
The Cas Editor offers a property page to edit the
styling. To open this property page, click on the "Properties"
button in the Style View.
</p>
<p>
The property page can be seen below. By clicking on one of the
annotation types, the color, drawing style and drawing layer can be edited on the right
side.
</p>
<div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.caseditor/StyleProperties.png" width="564"></td></tr></table></div>
</div>
<p>
The annotations can be visualized with one of the following
annotation styles:
</p><div class="table"><a name="d5e1306"></a><p class="title"><b>Table&nbsp;7.1.&nbsp;Style Table</b></p><div class="table-contents">
<table summary="Style Table" style="border-collapse: collapse;border-top: 0.5pt solid black; border-bottom: 0.5pt solid black; border-left: 0.5pt solid black; border-right: 0.5pt solid black; "><colgroup><col><col><col></colgroup><thead><tr><th style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">Style</th><th style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">Sample</th><th style="border-bottom: 0.5pt solid black; " align="left">Description</th></tr></thead><tbody><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">BACKGROUND</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/Style-Background.png"></div>
</div>
</td><td style="border-bottom: 0.5pt solid black; " align="left">
<p>The background is drawn in the annotation color.</p>
</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">TEXT_COLOR</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/Style-TextColor.png"></div>
</div>
</td><td style="border-bottom: 0.5pt solid black; " align="left">
<p>The text is drawn in the annotation color.</p>
</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">TOKEN</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/Style-Token.png"></div>
</div>
</td><td style="border-bottom: 0.5pt solid black; " align="left">
<p>
The token style assumes that token annotations are always separated
by whitespace. Only if two tokens are not separated by whitespace
is a vertical line drawn between them to show that there are two token annotations.
The image on the left actually contains three annotations, one each for "Mr", "."
and "Obama".
</p>
</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">SQUIGGLES</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/Style-Squiggles.png"></div>
</div>
</td><td style="border-bottom: 0.5pt solid black; " align="left">
<p>Squiggles are drawn under the annotation in the annotation color.</p>
</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">BOX</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/Style-Box.png"></div>
</div>
</td><td style="border-bottom: 0.5pt solid black; " align="left">
<p>A box in the annotation color is drawn around
the annotation.</p>
</td></tr><tr><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">UNDERLINE</td><td style="border-right: 0.5pt solid black; border-bottom: 0.5pt solid black; " align="left">
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/Style-Underline.png"></div>
</div>
</td><td style="border-bottom: 0.5pt solid black; " align="left">
<p>A line in the annotation color is drawn below
the annotation.</p>
</td></tr><tr><td style="border-right: 0.5pt solid black; " align="left">BRACKET</td><td style="border-right: 0.5pt solid black; " align="left">
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/Style-Bracket.png"></div>
</div>
</td><td style="" align="left">
<p>An opening bracket is drawn around the first
character of the annotation and a closing bracket
is drawn around the last character of the annotation.</p>
</td></tr></tbody></table>
</div></div><p><br class="table-break">
</p>
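<p>
The rule described for the TOKEN style in the table above can be sketched as follows.
This is only an illustration of the rule, not the Cas Editor's drawing code: the class
<code class="literal">TokenSeparatorSketch</code>, the method name and the representation of tokens
as <code class="literal">{begin, end}</code> offset pairs are all assumptions made for the example.
</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not taken from the Cas Editor sources. */
final class TokenSeparatorSketch {

  /**
   * Returns the text offsets at which a separator line would be needed,
   * that is, where one token ends exactly where the next one begins.
   * Each token is a hypothetical {begin, end} pair; the array is assumed
   * to be sorted by begin offset and free of overlaps.
   */
  static List<Integer> separatorOffsets(int[][] tokens) {
    List<Integer> offsets = new ArrayList<>();
    for (int i = 0; i + 1 < tokens.length; i++) {
      int endOfCurrent = tokens[i][1];
      int beginOfNext = tokens[i + 1][0];
      if (endOfCurrent == beginOfNext) {
        // No whitespace between the two tokens: a line is needed here
        offsets.add(endOfCurrent);
      }
    }
    return offsets;
  }

  public static void main(String[] args) {
    // "Mr. Obama": "Mr" spans [0,2), "." spans [2,3), "Obama" spans [4,9)
    int[][] tokens = {{0, 2}, {2, 3}, {4, 9}};
    System.out.println(separatorOffsets(tokens)); // prints [2]
  }
}
```

<p>
In the "Mr. Obama" example only the boundary between "Mr" and "." needs a line, because
"." and "Obama" are already separated by a space.
</p>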
<p>
The Cas Editor can draw the annotations in different
layers. If the spans of two annotations overlap, the annotation
in the higher layer is drawn over the annotation in the lower
layer. Depending on the drawing style it is possible to see
both annotations. The drawing order is defined by the layer
number: layer 0 is drawn first, then layer 1, and so on.
If annotations in the same layer overlap, it is not defined which
annotation type is drawn first.
</p>
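<p>
The layer ordering described above amounts to sorting the annotation types by their layer number
before drawing. The following is a minimal sketch of that idea, assuming a hypothetical map from
type name to layer number; <code class="literal">LayerOrderSketch</code> and
<code class="literal">drawingOrder</code> are names invented for this example and are not part of
the Cas Editor API.
</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not taken from the Cas Editor sources. */
final class LayerOrderSketch {

  /**
   * Returns the type names in the order they would be drawn:
   * lowest layer first, so that higher layers end up on top.
   * The relative order of types sharing a layer is not meaningful.
   */
  static List<String> drawingOrder(Map<String, Integer> layerByType) {
    List<Map.Entry<String, Integer>> entries = new ArrayList<>(layerByType.entrySet());
    entries.sort(Map.Entry.comparingByValue());

    List<String> order = new ArrayList<>();
    for (Map.Entry<String, Integer> entry : entries) {
      order.add(entry.getKey());
    }
    return order;
  }

  public static void main(String[] args) {
    // Hypothetical styling: Person is configured to be drawn on top
    Map<String, Integer> layers = Map.of("Person", 2, "Sentence", 0, "Token", 1);
    System.out.println(drawingOrder(layers)); // prints [Sentence, Token, Person]
  }
}
```

<p>
Because a type drawn later covers the ones drawn earlier, placing small annotations such as
"Person" in a higher layer than "Sentence" keeps them visible where the spans overlap.
</p>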
</div>
<div class="section" title="7.3.3.&nbsp;CAS view support"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cas_editor.annotation_editor.cas_views">7.3.3.&nbsp;CAS view support</h3></div></div></div>
<p>
The Annotation Editor can only display CAS views whose Sofa is text. A CAS view
with a Sofa of a different type cannot be displayed; instead, the editor shows
a page from which you can switch to another CAS view. The Edit and Feature Structure Browser views
are still available and can be used to edit Feature Structures which belong to the CAS view.
</p>
<p>
To switch to another CAS view, right click in the editor to open
the context menu and choose "CAS Views" and the view the editor
should switch to.
</p>
</div>
<div class="section" title="7.3.4.&nbsp;Outline view"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cas_editor.annotation_editor.outline">7.3.4.&nbsp;Outline view</h3></div></div></div>
<p>
The outline view gives an overview of the annotations which are
shown in the editor. The annotations are grouped by type. There are
actions to increase or decrease the bounds of the selected annotation. There is
also an action to merge selected annotations. The outline has a second view mode where only
annotations of the current editor mode are shown.
</p><div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/Outline.png"></div>
</div><p>
The outline style can be switched in the view menu to one that shows only the annotations which
belong to the current editor mode.
</p>
</div>
<div class="section" title="7.3.5.&nbsp;Edit Views"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cas_editor.annotation_editor.properties_view">7.3.5.&nbsp;Edit Views</h3></div></div></div>
<p>
The Edit Views show details about the currently
selected annotations or feature structures. It is
possible to change primitive values in this view.
Referenced feature structures can be created and deleted,
including arrays. To link a feature structure with
other feature structures, it can be pinned to the edit
view. This means that it does not change if the
selection changes.
</p>
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/EditView.png"></div>
</div>
</div>
<div class="section" title="7.3.6.&nbsp;FeatureStructure View"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cas_editor.annotation_editor.fs_view">7.3.6.&nbsp;FeatureStructure View</h3></div></div></div>
<p>
The FeatureStructure View lists all feature structures of
a specified type. The type is selected in the type
combobox.
</p>
<p>
It's possible to create and delete feature structures of
every type.
</p>
<div class="screenshot">
<div class="mediaobject"><img src="images/tools/tools.caseditor/FSView.png"></div>
</div>
</div>
</div>
<div class="section" title="7.4.&nbsp;Implementing a custom Cas Editor View"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.cas_editor.custom_view">7.4.&nbsp;Implementing a custom Cas Editor View</h2></div></div></div>
<p>
Custom Cas Editor views can be added
to rapidly create, access and/or change Feature Structures in the CAS.
While the Annotation Editor and its views offer support for general viewing and editing,
accessing and editing things in the CAS can be streamlined using a custom Cas Editor view.
A custom Cas Editor view can be
programmed to use a particular type system and optimized to quickly change or show something.
</p>
<p>
Annotation projects often need to track the annotation status of a CAS, where a user
needs to mark which parts have been annotated or corrected. To do this with the Cas Editor,
a user would need to use the Feature Structure Browser view to select the Feature Structure
and then edit it inside the Edit view.
A custom Cas Editor view could directly select and show the Feature Structure and offer
a tailored user interface to change the annotation status.
Some features such as the name of the annotator could even be automatically filled in.
</p>
<p>
The creation of Feature Structures which are linked to existing annotations or Feature Structures
is usually difficult with the standard views. A custom view which can make assumptions about the
type system is usually needed to do this efficiently.
</p>
<div class="section" title="7.4.1.&nbsp;Annotation Status View Sample"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.cas_editor.custom_view.sample">7.4.1.&nbsp;Annotation Status View Sample</h3></div></div></div>
<p>
The Cas Editor provides the CasEditorView class as a base class for views which need to access
the CAS which is opened in the current editor. It shows a "view not available" message when the current
editor does not show a CAS, no editor is opened at all or the current CAS view is incompatible with
the view.
</p>
<p>
The following snippet shows how it is usually implemented:
</p><pre class="programlisting">
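import org.apache.uima.caseditor.editor.CasEditorView;
import org.apache.uima.caseditor.editor.ICasDocument;
import org.apache.uima.caseditor.editor.ICasEditor;
import org.eclipse.ui.part.IPageBookViewPage;

public class AnnotationStatusView extends CasEditorView {

  public AnnotationStatusView() {
    // Message shown whenever no page can be created for the current editor
    super("The Annotation Status View is currently not available.");
  }

  @Override
  protected IPageBookViewPage doCreatePage(ICasEditor editor) {
    ICasDocument document = editor.getDocument();

    if (document != null) {
      return new AnnotationStatusViewPage(editor);
    }

    // Nothing to show: the "not available" message is displayed instead
    return null;
  }
}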
</pre><p>
The doCreatePage method is called to create the actual view page. If the document
is null, the editor failed to load a document and is showing an error message.
If the document is not null but the CAS view is incompatible, the method
should return null to indicate that it has nothing to show. In this case the
"not available" message is displayed.
</p>
<p>
The next step is to implement the AnnotationStatusViewPage. This is the page which
gets the CAS as input and needs to provide the user with a UI to change the Annotation
Status Feature Structure.
</p><pre class="programlisting">
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.FeatureStructure;
import org.apache.uima.caseditor.editor.ICasDocument;
import org.apache.uima.caseditor.editor.ICasEditor;
import org.eclipse.swt.widgets.Composite;
import org.eclipse.ui.part.Page;

public class AnnotationStatusViewPage extends Page {

  private ICasEditor editor;

  AnnotationStatusViewPage(ICasEditor editor) {
    this.editor = editor;
  }

  ...

  public void createControl(Composite parent) {
    // create ui elements here
    ...

    ICasDocument document = editor.getDocument();
    CAS cas = document.getCAS();

    // Retrieve Annotation Status FS from CAS
    // and initialize the ui elements with it
    FeatureStructure statusFS;
    ...

    // Add event listeners to the ui element
    // to save an update to the CAS
    // and to advertise a change
    ...

    // Send update event
    document.update(statusFS);
  }
}
</pre><p>
The above code sketches out how a typical view page is implemented. The CAS can be directly used
to access any Feature Structures or annotations stored in it.
When something is added, removed, or changed, the modification must be advertised via the ICasDocument object.
It has multiple notification methods which send an event so that other views can be updated.
The view itself can also register a listener to receive CAS change events.
</p>
</div>
</div>
</div>
<div class="chapter" title="Chapter&nbsp;8.&nbsp;JCasGen User's Guide" id="ugr.tools.jcasgen"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;8.&nbsp;JCasGen User's Guide</h2></div></div></div>
<p>JCasGen reads a descriptor for an application (either an Analysis Engine Descriptor,
or a Type System Descriptor), creates the merged type system
specification by merging all the type system information from all the components
referred to in the descriptor, and then uses this merged type system to create Java source
files for classes that enable JCas access to the CAS. Java classes are not produced for the
built-in types, since these classes are already provided by the UIMA SDK. (An exception is
the built-in type <code class="literal">uima.tcas.DocumentAnnotation</code>, see the warning below.) </p>
<div class="warning" title="Warning" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Warning</h3><p>If the components comprising the input to the type merging process
have different definitions for the same type name,
JCasGen will show a warning, and in some environments may offer to abort the operation.
If you continue past this warning,
JCasGen will produce correct Java source files representing the merged types
(that is, the
type definition containing all of the features defined on that type by all of the
components). It is recommended that you do not use this capability (of having
two different definitions for the same type name, with different feature sets) since it can make it
difficult to combine/package your annotator with others. See <a href="references.html#d5e1" class="olink">UIMA References</a>
<a href="references.html#ugr.ref.jcas.merging_types_from_other_specs" class="olink">Section&nbsp;5.5, &#8220;Merging Types&#8221;</a> for more information.
</p>
<p>Also note that if your type system declares a custom version of the
<code class="literal">uima.tcas.DocumentAnnotation</code>
built-in type, then JCasGen will generate a Java source file for it. If you do this, you need to be
aware of the issues discussed in <a href="references.html#d5e1" class="olink">UIMA References</a>
<a href="references.html#ugr.ref.jcas.documentannotation_issues" class="olink">Section&nbsp;5.5.4, &#8220;Adding Features to DocumentAnnotation&#8221;</a>.</p></div>
<p>JCasGen can be run in many ways. For Eclipse users using the Component Descriptor Editor, there's a button
on the Type System Description page to run it on that type system. There's also a jcasgen-maven-plugin to use
in Maven build scripts. There's a menu-driven GUI tool for it,
and there are command line scripts you can use to invoke it.</p>
<p>There are several versions of JCasGen. The basic version reads an XML descriptor
which contains a type system descriptor, and generates the corresponding Java Class
Models for those types. Variants exist for the Eclipse environment that allow merging the
newly generated Java source code with previously augmented versions; see <a href="references.html#d5e1" class="olink">UIMA References</a> <a href="references.html#ugr.ref.jcas.augmenting_generated_code" class="olink">Section&nbsp;5.4, &#8220;Augmenting the generated Java Code&#8221;</a> for a discussion of how the
Java Class Models can be augmented by adding additional methods and fields.</p>
<p>Input to JCasGen needs to be mostly self-contained. In particular, any types that are
defined to depend on user-defined supertypes must have that supertype defined, if the
supertype is <code class="literal">uima.tcas.Annotation</code> or a subtype of it. Any features
referencing ranges which are subtypes of uima.cas.String must have those subtypes
included. If this is not followed, a warning message is given stating that the resulting
generation may be inaccurate.</p>
<p>JCasGen is typically invoked automatically when using the Component Descriptor
Editor (see <a href="tools.html#ugr.tools.cde.auto_jcasgen" class="olink">Section&nbsp;1.8, &#8220;Type System Page&#8221;</a>), but can also be run using a shell
script. These scripts can take 0, 1, or 2 arguments. The first argument is the location of
the file containing the input XML descriptor. The second argument specifies where the
generated Java source code should go. If it isn't given, JCasGen generates its
output into a subfolder called JCas (or sometimes JCasNew &#8211; see below), of the first
argument's path.</p>
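<p>The way the two arguments interact can be sketched as follows. This is a minimal illustration
of the behavior described above, not JCasGen's own code: the class and method names are hypothetical,
the input is assumed to be a plain file path, and "the first argument's path" is assumed to mean
the directory containing the input descriptor.</p>

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch only; not taken from the JCasGen sources. */
final class JCasGenOutputSketch {

  /**
   * Picks the output folder from the script arguments.
   * args[0] is the input descriptor, args[1] (optional) the output folder.
   * mergingWithEclipse selects between the default folder names
   * "JCas" (merge with Eclipse) and "JCasNew" (no merge).
   */
  static Path resolveOutputDir(String[] args, boolean mergingWithEclipse) {
    if (args.length == 0) {
      // With no arguments the real tool opens its GUI instead
      throw new IllegalArgumentException("an input descriptor is required");
    }
    if (args.length >= 2) {
      return Paths.get(args[1]); // explicit second argument wins
    }
    Path parent = Paths.get(args[0]).getParent();
    if (parent == null) {
      parent = Paths.get(""); // descriptor given without a directory part
    }
    return parent.resolve(mergingWithEclipse ? "JCas" : "JCasNew");
  }

  public static void main(String[] args) {
    System.out.println(resolveOutputDir(new String[] {"desc/MyTs.xml"}, false));
    System.out.println(resolveOutputDir(new String[] {"desc/MyTs.xml", "generated"}, false));
  }
}
```

<p>With only the descriptor given, the sketch resolves to a <code class="literal">JCasNew</code> folder next to it;
with a second argument, that argument is used unchanged.</p>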
<p>The first argument, the input file, can be written as
<code class="literal">jar:&lt;url&gt;!/{entry}</code>, for example:
<code class="literal">jar:http://www.foo.com/bar/baz.jar!/COM/foo/quux.class</code></p>
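<p>A jar-style location such as the example above consists of the URL of the archive and the path of
the entry inside it, separated by <code class="literal">!/</code>. The sketch below splits such a string into
its two parts using plain string operations; <code class="literal">JarUrlSketch</code> and
<code class="literal">split</code> are hypothetical names, and this is not how JCasGen itself resolves the argument.</p>

```java
/** Illustrative sketch only; not taken from the JCasGen sources. */
final class JarUrlSketch {

  /**
   * Splits "jar:<url>!/{entry}" into { archive URL, entry path }.
   * Throws IllegalArgumentException if the string has a different shape.
   */
  static String[] split(String spec) {
    String prefix = "jar:";
    String separator = "!/";
    int separatorIndex = spec.indexOf(separator);
    if (!spec.startsWith(prefix) || separatorIndex < 0) {
      throw new IllegalArgumentException("not a jar-style location: " + spec);
    }
    String archiveUrl = spec.substring(prefix.length(), separatorIndex);
    String entryPath = spec.substring(separatorIndex + separator.length());
    return new String[] {archiveUrl, entryPath};
  }

  public static void main(String[] args) {
    String[] parts = split("jar:http://www.foo.com/bar/baz.jar!/COM/foo/quux.class");
    System.out.println(parts[0]); // prints http://www.foo.com/bar/baz.jar
    System.out.println(parts[1]); // prints COM/foo/quux.class
  }
}
```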
<p>If no arguments are given to JCasGen, then it launches a GUI to interact with the user
and ask for the same input. The GUI will remember the arguments you previously used.
Here's what it looks like:
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.jcasgen/image002.jpg" width="574" alt="JCasGen tool showing fields for input arguments"></td></tr></table></div>
</div>
<p>When running with automatic merging of the generated Java source with previously
augmented versions, the output location is where the merge function obtains the source
for the merge operation.</p>
<p>As is customary for Java, the generated class source files are placed in the
appropriate subdirectory structure according to Java conventions that correspond to
the package (name space) name.</p>
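<p>The mapping from a type name to the location of its generated source file follows the usual Java
convention and can be sketched as below. The type name <code class="literal">org.example.Person</code>, the
class <code class="literal">SourcePathSketch</code> and its method are made up for this illustration.</p>

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch only; not taken from the JCasGen sources. */
final class SourcePathSketch {

  /**
   * Maps a fully qualified type name to the source file location under
   * the output folder: one directory per package segment, then
   * the simple class name with a ".java" extension.
   */
  static Path sourceFileFor(Path outputDir, String typeName) {
    String relativePath = typeName.replace('.', '/') + ".java";
    return outputDir.resolve(relativePath);
  }

  public static void main(String[] args) {
    // A type named org.example.Person ends up in JCas/org/example/Person.java
    System.out.println(sourceFileFor(Paths.get("JCas"), "org.example.Person"));
  }
}
```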
<p>The Java classes must be compiled and the resulting class files included in the class
path of your application; you make these classes available for other annotator writers
using your types, perhaps packaged as an xxx.jar file. If the xxx.jar file is made to
contain only the Java Class Models for the CAS types, it can be reused by any users of these
types.</p>
<div class="section" title="8.1.&nbsp;Running stand-alone without Eclipse"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.jcasgen.running_without_eclipse">8.1.&nbsp;Running stand-alone without Eclipse</h2></div></div></div>
<p>Automatic merging of the generated Java source with previous versions is only
available when running with Eclipse. If JCasGen is run without Eclipse, no
merging is done, and
the output is put in a folder called <span class="quote">&#8220;<span class="quote">JCasNew</span>&#8221;</span> unless overridden by
specifying a second argument.</p>
<p>The distribution includes a shell script/bat file to run the stand-alone version,
called jcasgen.</p>
</div>
<div class="section" title="8.2.&nbsp;Running stand-alone with Eclipse"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.jcasgen.running_standalone_with_eclipse">8.2.&nbsp;Running stand-alone with Eclipse</h2></div></div></div>
<p>If you have Eclipse (version 3 or later) and EMF (the Eclipse Modeling Framework) installed,
both of which are available from <a class="ulink" href="http://www.eclipse.org" target="_top">http://www.eclipse.org</a>,
JCasGen can merge the Java code it generates with previous versions, picking up
changes you might have inserted by hand. The output (and source of the merge input) is in a
folder <span class="quote">&#8220;<span class="quote">JCas</span>&#8221;</span> under the same path as the input XML file, unless
overridden by specifying a second argument.</p>
<p>You must install the UIMA plug-ins into Eclipse to enable this function.</p>
<p>The distribution includes a shell script/bat file to run the stand-alone with
Eclipse version, called jcasgen_merge. This works by starting Eclipse in
<span class="quote">&#8220;<span class="quote">headless</span>&#8221;</span> mode (no GUI) and invoking JCasGen within Eclipse. You will
need to set the ECLIPSE_HOME environment variable or modify the jcasgen_merge shell
script to specify where to find Eclipse. The version of Eclipse needed is 3 or higher,
with the EMF plug-in and the UIMA runtime plug-in installed. A temporary workspace is
used; the name/location of this is customizable in the shell script.</p>
<p>Log and error messages are written to the UIMA log. This file is called uima.log, and
is located in the default working directory, which if not overridden, is the startup
directory of Eclipse.</p>
</div>
<div class="section" title="8.3.&nbsp;Running within Eclipse"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.jcasgen.running_within_eclipse">8.3.&nbsp;Running within Eclipse</h2></div></div></div>
<p>There are two ways to run JCasGen within Eclipse. The first way is to configure an
Eclipse external tools launcher, and use it to run the stand-alone shell scripts, with
the arguments filled in. Here's a picture of a typical launcher configuration
screen (you get here by navigating from the top menu: Run &#8211;&gt; External Tools
&#8211;&gt; External tools...).
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.jcasgen/image004.jpg" width="574" alt="Running JCasGen within Eclipse using the external tool launcher"></td></tr></table></div>
</div>
<p>The second way (which is the normal way it's done) to run within Eclipse is to use the
Component Descriptor Editor (CDE) (see <a href="tools.html#ugr.tools.cde" class="olink">Chapter&nbsp;1, <i>Component Descriptor Editor User's Guide</i></a>). This tool can be configured to automatically
launch JCasGen whenever the type system descriptor is modified. In this release, this
operation completely regenerates the files, even if just a small thing changed. For
very large type systems, you probably don't want to enable this all the time. The
configurator tool has an option to enable/disable this function.</p>
</div>
<div class="section" title="8.4.&nbsp;Using the jcasgen-maven-plugin"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.jcasgen.maven_plugin">8.4.&nbsp;Using the jcasgen-maven-plugin</h2></div></div></div>
<p>For Maven builds, you can use the jcasgen-maven-plugin to take one or more
top level descriptors (Type System or Analysis Engine descriptors), merge them
together in the standard way UIMA merges type definitions, and produce the corresponding
JCas source classes. These, by default, are generated to the standard spot for Maven
builds for generated files.</p>
<p>You can use ant-like include / exclude patterns to specify the top level descriptor
files. If you set &lt;limitToProject&gt; to true, then after a complete UIMA type system
merge is done with all of the types, including those that are imported, only those
types which are defined within this Maven project (that is, in some subdirectory of the project)
will be generated.</p>
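<p>The effect of &lt;limitToProject&gt; can be pictured as a filter applied after the merge.
The sketch below is an illustration under stated assumptions, not the plugin's implementation:
it assumes a hypothetical map from each merged type name to the descriptor file that defines it,
and keeps only the types whose descriptor lies under the project directory.</p>

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; not taken from the jcasgen-maven-plugin sources. */
final class LimitToProjectSketch {

  /**
   * Returns the names of the types for which classes would be generated.
   * With limitToProject set, types defined outside projectDir are skipped.
   */
  static List<String> typesToGenerate(
      Map<String, Path> descriptorByType, Path projectDir, boolean limitToProject) {
    Path root = projectDir.normalize();
    List<String> selected = new ArrayList<>();
    // The TreeMap copy gives the result a stable, alphabetical order
    for (Map.Entry<String, Path> entry : new TreeMap<>(descriptorByType).entrySet()) {
      boolean insideProject = entry.getValue().normalize().startsWith(root);
      if (!limitToProject || insideProject) {
        selected.add(entry.getKey());
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    Map<String, Path> merged = Map.of(
        "org.example.Person", Paths.get("myproject/src/main/resources/MyTs.xml"),
        "org.other.Token", Paths.get("otherproject/desc/Imported.xml"));
    System.out.println(typesToGenerate(merged, Paths.get("myproject"), true));
    // prints [org.example.Person]
    System.out.println(typesToGenerate(merged, Paths.get("myproject"), false));
    // prints [org.example.Person, org.other.Token]
  }
}
```

<p>The imported type still takes part in the merge; it is only left out of the generated sources,
on the assumption that another project provides its class.</p>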
<p>To use the jcasgen-maven-plugin, specify it in the POM as follows:</p>
<pre class="programlisting">&lt;plugin&gt;
  &lt;groupId&gt;org.apache.uima&lt;/groupId&gt;
  &lt;artifactId&gt;jcasgen-maven-plugin&lt;/artifactId&gt;
  &lt;version&gt;2.4.1&lt;/version&gt;  &lt;!-- change this to the latest version --&gt;
  &lt;executions&gt;
    &lt;execution&gt;
      &lt;goals&gt;&lt;goal&gt;generate&lt;/goal&gt;&lt;/goals&gt;  &lt;!-- this is the only goal --&gt;
      &lt;!-- runs in phase process-resources by default --&gt;
      &lt;configuration&gt;

        &lt;!-- REQUIRED --&gt;
        &lt;typeSystemIncludes&gt;
          &lt;!-- one or more ant-like file patterns
               identifying top level descriptors --&gt;
          &lt;typeSystemInclude&gt;src/main/resources/MyTs.xml&lt;/typeSystemInclude&gt;
        &lt;/typeSystemIncludes&gt;

        &lt;!-- OPTIONAL --&gt;
        &lt;!-- a sequence of ant-like file patterns
             to exclude from the above include list --&gt;
        &lt;typeSystemExcludes&gt;
        &lt;/typeSystemExcludes&gt;

        &lt;!-- OPTIONAL --&gt;
        &lt;!-- where the generated files go --&gt;
        &lt;!-- default value:
             ${project.build.directory}/generated-sources/jcasgen --&gt;
        &lt;outputDirectory&gt;
        &lt;/outputDirectory&gt;

        &lt;!-- OPTIONAL --&gt;
        &lt;!-- true or false, default = false --&gt;
        &lt;!-- if true, then although the complete merged type system
             will be created internally, only those types whose
             definition is contained within this maven project will be
             generated. The others will be presumed to be
             available via other projects. --&gt;
        &lt;limitToProject&gt;false&lt;/limitToProject&gt;
      &lt;/configuration&gt;
    &lt;/execution&gt;
  &lt;/executions&gt;
&lt;/plugin&gt;</pre>
</div>
</div>
<div class="chapter" title="Chapter&nbsp;9.&nbsp;PEAR Packager User's Guide" id="ugr.tools.pear.packager"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;9.&nbsp;PEAR Packager User's Guide</h2></div></div></div>
<p>A PEAR (Processing Engine ARchive) file is a standard package for UIMA (Unstructured
Information Management Architecture) components. The PEAR package can be used for
distribution and reuse by other components or applications. It also allows applications
and tools to manage UIMA components automatically for verification, deployment,
invocation, testing, etc. Please refer to <a href="references.html#d5e1" class="olink">UIMA References</a>
<a href="references.html#ugr.ref.pear" class="olink">Chapter&nbsp;6, <i>PEAR Reference</i></a>
for more information about the internal structure of a PEAR file.</p>
<p>This chapter describes how to use the PEAR Eclipse plugin or the PEAR command line packager
to create PEAR files for standard UIMA components.</p>
<div class="section" title="9.1.&nbsp;Using the PEAR Eclipse Plugin"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.pear.packager.using_eclipse_plugin">9.1.&nbsp;Using the PEAR Eclipse Plugin</h2></div></div></div>
<p>The PEAR Eclipse plugin is automatically installed if you followed the directions in
<a href="overview_and_setup.html#d4e1" class="olink">UIMA Overview &amp; SDK Setup</a>
<a href="overview_and_setup.html#ugr.ovv.eclipse_setup" class="olink">Chapter&nbsp;3, <i>Setting up the Eclipse IDE to work with UIMA</i></a>. The use of the
plugin involves the following two steps:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc" compact><li class="listitem"><p>Add the UIMA nature to your project
</p></li><li class="listitem"><p>Create a PEAR file using the PEAR generation wizard </p>
</li></ul></div>
<div class="section" title="9.1.1.&nbsp;Add UIMA Nature to your project"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.pear.packager.add_uima_nature">9.1.1.&nbsp;Add UIMA Nature to your project</h3></div></div></div>
<p>First, create a project for your UIMA component:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc" compact><li class="listitem"><p>Create a Java project, which
will contain all the files and folders needed for your UIMA component.</p>
</li><li class="listitem"><p>Create a source folder called <span class="quote">&#8220;<span class="quote">src</span>&#8221;</span> in your
project, and make it the only source folder, by clicking on
<span class="quote">&#8220;<span class="quote">Properties</span>&#8221;</span> in your project's context menu (right-click),
then select <span class="quote">&#8220;<span class="quote">Java Build Path</span>&#8221;</span>, then add the <span class="quote">&#8220;<span class="quote">src</span>&#8221;</span>
folder to the source folders list, and remove any other folder from the
list.</p></li><li class="listitem"><p>Specify an output folder for your project called bin, by clicking
on <span class="quote">&#8220;<span class="quote">Properties</span>&#8221;</span> in your project's context menu
(right-click), then select <span class="quote">&#8220;<span class="quote">Java Build Path</span>&#8221;</span>, and specify
<span class="quote">&#8220;<span class="quote"><span class="emphasis"><em>your_project_name</em></span>/bin</span>&#8221;</span> as the default
output folder. </p></li></ul></div>
<p>Then, add the UIMA nature to your project by clicking on <span class="quote">&#8220;<span class="quote">Add UIMA
Nature</span>&#8221;</span> in the context menu (right-click) of your project. Click
<span class="quote">&#8220;<span class="quote">Yes</span>&#8221;</span> on the <span class="quote">&#8220;<span class="quote">Adding UIMA custom Nature</span>&#8221;</span> dialog box.
Click <span class="quote">&#8220;<span class="quote">OK</span>&#8221;</span> on the confirmation dialog box.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.pear.packager/image002.jpg" width="574" alt="Screenshot of Adding the UIMA Nature to your project"></td></tr></table></div>
</div>
<p>Adding the UIMA nature to your project creates the PEAR structure in your
project. The PEAR structure is a structured tree of folders and files, including the
following elements:
</p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p><span class="bold"><strong>Required
Elements:</strong></span>
</p><div class="itemizedlist"><ul class="itemizedlist" type="circle"><li class="listitem"><p>The <span class="bold"><strong>
metadata</strong></span> folder which contains the PEAR installation descriptor
and properties files.</p></li><li class="listitem"><p>The installation descriptor (<span class="bold"><strong>
metadata/install.xml</strong></span>)
</p></li></ul></div></li><li class="listitem"><p><span class="bold"><strong>Optional Elements:</strong></span>
</p><div class="itemizedlist"><ul class="itemizedlist" type="circle"><li class="listitem"><p>The <span class="bold"><strong>
desc</strong></span> folder to contain descriptor files of analysis engines,
component analysis engines (all levels), and other components (Collection
Readers, CAS Consumers, etc.).</p></li><li class="listitem"><p>The <span class="bold"><strong>src </strong></span>folder to
contain the source code</p></li><li class="listitem"><p>The <span class="bold"><strong>bin</strong></span> folder to
contain executables, scripts, class files, dlls, shared libraries,
etc.</p></li><li class="listitem"><p>The <span class="bold"><strong>lib</strong></span> folder to
contain jar files. </p></li><li class="listitem"><p>The <span class="bold"><strong>doc </strong></span>folder
containing documentation materials, preferably accessible through an
index.html.</p></li><li class="listitem"><p>The <span class="bold"><strong>data</strong></span> folder to
contain data files (e.g. for testing).</p></li><li class="listitem"><p>The <span class="bold"><strong>conf</strong></span> folder to
contain configuration files.</p></li><li class="listitem"><p>The <span class="bold"><strong>resources</strong></span> folder
to contain other resources and dependencies.</p></li><li class="listitem"><p>Other user-defined folders or files are allowed, but
<span class="emphasis"><em>should be avoided</em></span>. </p></li></ul></div><p> </p></li></ul></div>
<p>For more information about the PEAR structure, please refer to the
<span class="quote">&#8220;<span class="quote">Processing Engine Archive</span>&#8221;</span> section.
</p><div class="figure"><a name="ugr.tools.pear.packager.fig.pear_structure"></a><div class="figure-contents">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="297"><tr><td><img src="images/tools/tools.pear.packager/image004.jpg" width="297" alt="Pear structure"></td></tr></table></div>
</div><p class="title"><b>Figure&nbsp;9.1.&nbsp;The Pear Structure</b></p></div><p><br class="figure-break"></p>
</div>
<div class="section" title="9.1.2.&nbsp;Using the PEAR Generation Wizard"><div class="titlepage"><div><div><h3 class="title" id="ugr.tools.pear.packager.using_pear_generation_wizard">9.1.2.&nbsp;Using the PEAR Generation Wizard</h3></div></div></div>
<p>Before using the PEAR Generation Wizard, add all the files needed to
run your component including descriptors, jars, external libraries, resources,
and component analysis engines (in the case of an aggregate analysis engine), etc.
<span class="emphasis"><em>Do not</em></span> add Jars for the UIMA framework, however. Doing so will
cause class loading problems at run time.</p>
<p>
If you're using a Java IDE like Eclipse, instead of using the output folder (usually
<code class="literal">bin</code>) as the source of your classes, it's recommended that
you generate a Jar file containing these classes.</p>
<p>Then, click on <span class="quote">&#8220;<span class="quote">Generate PEAR file</span>&#8221;</span> from the context menu
(right-click) of your project, to open the PEAR Generation wizard, and follow the
instructions on the wizard to generate the PEAR file.</p>
<div class="section" title="9.1.2.1.&nbsp;The Component Information page"><div class="titlepage"><div><div><h4 class="title" id="ugr.tools.pear.packager.wizard.component_information">9.1.2.1.&nbsp;The Component Information page</h4></div></div></div>
<p>The first page of the PEAR generation wizard is the component information
page. Specify in this page a component ID for your PEAR and select the main Analysis
Engine descriptor. The descriptor must be specified using a pathname relative to
the project's root (e.g. <span class="quote">&#8220;<span class="quote">desc/MyAE.xml</span>&#8221;</span>). The component ID
is a string that uniquely identifies the component. It should follow the Java package naming
convention (e.g. org.apache.uima.mycomponent).</p>
<p>Optionally, you can include a specific Collection Iterator, CAS Initializer (deprecated
as of Version 2.1),
or CAS Consumer. In this case, specify the corresponding descriptors in this
page.
</p><div class="figure"><a name="ugr.tools.pear.packager.fig.wizard.component_information"></a><div class="figure-contents">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.pear.packager/image006.jpg" width="574" alt="Pear Wizard - component information page"></td></tr></table></div>
</div><p class="title"><b>Figure&nbsp;9.2.&nbsp;The Component Information Page</b></p></div><p><br class="figure-break"></p>
</div>
<div class="section" title="9.1.2.2.&nbsp;The Installation Environment page"><div class="titlepage"><div><div><h4 class="title" id="ugr.tools.pear.packager.wizard.install_environment">9.1.2.2.&nbsp;The Installation Environment page</h4></div></div></div>
<p>The installation environment page is used to specify the following:
</p><div class="itemizedlist"><ul class="itemizedlist" type="disc" compact><li class="listitem"><p>Preferred operating
system</p></li><li class="listitem"><p>Required JDK version, if applicable.</p></li><li class="listitem"><p>Required Environment variable settings. This is where
you specify special CLASSPATH paths. You do not need to specify this for
any Jar that is listed in your Eclipse project classpath settings; those are automatically
put into the generated CLASSPATH. Nor should you include paths to the
UIMA Framework itself here; doing so may cause class loading problems.
</p>
<p>CLASSPATH segments are written here using a semicolon ";" as the separator;
during PEAR installation, these will be adjusted to be the correct character for the
target Operating System.</p>
<p>In order to specify the UIMA datapath for your component you have to create an environment
variable with the property name <code class="literal">uima.datapath</code>. The value of this property
must contain the UIMA datapath settings.</p>
</li></ul></div>
<p>Path names should be specified using macros (see below), instead of
hard-coded absolute paths that might work locally but probably won't if the
PEAR is deployed on a different machine or in a different environment.</p>
<p>Macros are variables such as $main_root, used to represent a string such as the
full path of a certain directory.</p>
<p>These macros should be defined in the PEAR.properties file using the local
values. The tools and applications that use and deploy PEAR files should replace
these macros (in the files included in the conf and desc folders) with the
corresponding values in the local environment as part of the deployment
process.</p>
<p>Currently, there are two types of macros:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>$main_root, which represents the local absolute
path of the main component root directory after deployment.</p></li><li class="listitem"><p><span class="emphasis"><em>$component_id$root</em></span>, which
represents the local absolute path to the root directory of the component which
has <span class="emphasis"><em>component_id</em></span> as component ID. This component could
be, for instance, a delegate component. </p></li></ul></div>
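<p>As an illustration, the environment variable entries on this page might combine both macro types
as sketched below, shown here as name and value pairs. The Jar names, the <code class="literal">resources</code>
subfolder, and the delegate component ID <code class="literal">org.example.delegate</code> are hypothetical
placeholders, not names defined by UIMA:</p>
<pre class="programlisting">CLASSPATH      $main_root/lib/mycomponent.jar;$org.example.delegate$root/lib/delegate.jar
uima.datapath  $main_root/resources</pre>
<p>Because every entry starts with a macro rather than an absolute path, the same values remain usable
wherever the PEAR is eventually installed.</p>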
<div class="figure"><a name="ugr.tools.pear.packager.fig.wizard.install_environment"></a><div class="figure-contents">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.pear.packager/image008.jpg" width="574" alt="Pear Wizard - install environment page"></td></tr></table></div>
</div><p class="title"><b>Figure&nbsp;9.3.&nbsp;The Installation Environment Page</b></p></div><br class="figure-break">
</div>
<div class="section" title="9.1.2.3.&nbsp;The PEAR file content page"><div class="titlepage"><div><div><h4 class="title" id="ugr.tools.pear.packager.wizard.file_content">9.1.2.3.&nbsp;The PEAR file content page</h4></div></div></div>
<p>The last page of the wizard is the <span class="quote">&#8220;<span class="quote">PEAR file Export</span>&#8221;</span> page, which
allows the user to select the files to include in the PEAR file. The metadata folder
and all its contents are mandatory. Make sure you include all the files needed to run
your component including descriptors, jars, external libraries, resources, and
component analysis engines (in the case of an aggregate analysis engine), etc.
It's recommended to generate a jar file from your code as an alternative to
building the project and making sure the output folder (bin) contains the required
class files.</p>
<p>Eclipse compiles your class files into some output directory, often named
"bin" when you take the usual defaults in Eclipse. The recommended practice is to
take all these files and put them into a Jar file, perhaps using the Eclipse Export
wizard. You would place that Jar file into the PEAR <code class="literal">lib</code> directory.</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p>If you are relying on the class files generated in the output folder
(usually called bin) to run your code, then make sure the project is built properly,
and all the required class files are generated without errors, and then put the
output folder (e.g. $main_root/bin) in the classpath using the option to set
environment variables, by setting the CLASSPATH variable to include this folder (see the
<span class="quote">&#8220;<span class="quote">Installation Environment</span>&#8221;</span> page).
Beware that using a Java output folder named "bin" in this case is a poor practice,
because the PEAR installation
tools will presume this folder contains binary executable files, and will add this folder to
the PATH environment variable.
</p> </div>
<div class="figure"><a name="ugr.tools.pear.packager.fig.wizard.export"></a><div class="figure-contents">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="564"><tr><td><img src="images/tools/tools.pear.packager/image010.jpg" width="564" alt="Pear Wizard - File Export Page"></td></tr></table></div>
</div><p class="title"><b>Figure&nbsp;9.4.&nbsp;The PEAR File Export Page</b></p></div><br class="figure-break">
</div>
</div>
</div>
<div class="section" title="9.2.&nbsp;Using the PEAR command line packager"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.pear.packager.using_command_line">9.2.&nbsp;Using the PEAR command line packager</h2></div></div></div>
<p>The PEAR command line packager takes PEAR package parameter settings on the command line and creates a
UIMA PEAR file.</p>
<p>To run the PEAR command line packager you can use the provided runPearPackager (.bat for Windows, and .sh for Unix)
scripts. The packager can be used in three different modes.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">
<p>Mode 1: creates a complete PEAR package with the provided information (default mode)</p>
<pre class="programlisting">runPearPackager -compID &lt;componentID&gt;
-mainCompDesc &lt;mainComponentDesc&gt; [-classpath &lt;classpath&gt;]
[-datapath &lt;datapath&gt;] -mainCompDir &lt;mainComponentDir&gt;
-targetDir &lt;targetDir&gt; [-envVars &lt;propertiesFilePath&gt;]</pre>
<p> The created PEAR file has the file name &lt;componentID&gt;.pear and is located in the &lt;targetDir&gt;.</p>
</li><li class="listitem">
<p>Mode 2: creates a PEAR installation descriptor without packaging the PEAR file</p>
<pre class="programlisting">runPearPackager -create -compID &lt;componentID&gt;
-mainCompDesc &lt;mainComponentDesc&gt; [-classpath &lt;classpath&gt;]
[-datapath &lt;datapath&gt;] -mainCompDir &lt;mainComponentDir&gt;
[-envVars &lt;propertiesFilePath&gt;]</pre>
<p> The PEAR installation descriptor is created in the &lt;mainComponentDir&gt;/metadata directory.</p>
</li><li class="listitem">
<p>Mode 3: creates a PEAR package with an existing PEAR installation descriptor</p>
<pre class="programlisting">runPearPackager -package -compID &lt;componentID&gt;
-mainCompDir &lt;mainComponentDir&gt; -targetDir &lt;targetDir&gt;</pre>
<p> The created PEAR file has the file name &lt;componentID&gt;.pear and is located in the &lt;targetDir&gt;.</p>
</li></ul></div><p>
</p>
<p>Modes 2 and 3 should be used when you want to manipulate the PEAR installation descriptor before packaging
the PEAR file. </p>
<p>More details about the PearPackager parameters are provided in the list below:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">
<code class="literal">&lt;componentID&gt;</code>: PEAR package component ID.
</li><li class="listitem">
<code class="literal">&lt;mainComponentDesc&gt;</code>: Main component descriptor of the PEAR package.
</li><li class="listitem">
<code class="literal">&lt;classpath&gt;</code>: PEAR classpath settings. Use $main_root macros to specify
path entries. Use <code class="literal">;</code> to separate the entries.
</li><li class="listitem">
<code class="literal">&lt;datapath&gt;</code>: PEAR datapath settings. Use $main_root macros to specify
path entries. Use <code class="literal">;</code> to separate the path entries.
</li><li class="listitem">
<code class="literal">&lt;mainComponentDir&gt;</code>: Main component directory that contains the PEAR package content.
</li><li class="listitem">
<code class="literal">&lt;targetDir&gt;</code>: Target directory where the created PEAR file is written to.
</li><li class="listitem">
<code class="literal">&lt;propertiesFilePath&gt;</code>: Path name to a properties file that contains environment variables that must be
set to run the PEAR content.
</li></ul></div><p>
</p>
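<p>As a minimal sketch, a Mode 1 invocation on Unix might look like the following. The component ID, the
directory names, the Jar names, and the properties file are hypothetical placeholders, and the descriptor path is
assumed to be given relative to the main component directory, as it is in the Eclipse wizard. The macro values are
wrapped in single quotes so that the shell passes <code class="literal">$main_root</code> and the
<code class="literal">;</code> separator through unchanged instead of interpreting them:</p>
<pre class="programlisting">runPearPackager.sh -compID org.example.mycomponent \
   -mainCompDesc desc/MyAE.xml \
   -classpath '$main_root/lib/mycomponent.jar;$main_root/lib/helper.jar' \
   -datapath '$main_root/resources' \
   -mainCompDir /home/me/workspace/mycomponent \
   -targetDir /home/me/pears \
   -envVars /home/me/workspace/mycomponent/env.properties</pre>
<p>With these values, the packager would be expected to write
<code class="literal">org.example.mycomponent.pear</code> into <code class="literal">/home/me/pears</code>.
The file passed to <code class="literal">-envVars</code> is assumed here to use the usual Java properties
syntax, one name and value pair per line, for example:</p>
<pre class="programlisting"># env.properties (hypothetical content)
MY_COMPONENT_LANGUAGE=en
MY_COMPONENT_LOG_LEVEL=INFO</pre>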
</div>
</div>
<div class="chapter" title="Chapter&nbsp;10.&nbsp;The PEAR Packaging Maven Plugin" id="ugr.tools.pear.packager.maven.plugin.usage"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;10.&nbsp;The PEAR Packaging Maven Plugin</h2></div></div></div>
<p>
UIMA includes a Maven plugin that supports creating PEAR packages using Maven.
When configured for a project, it assumes that the project has the PEAR layout,
and will copy the standard directories that are part of a PEAR structure under the
project root into the PEAR, excluding files that start with a period (".").
It also will put the Jar that is built for the project
into the lib/ directory and include it first on the generated classpath.
</p>
<p>
The classpath that is generated for this includes the artifact's Jar first, any user specified
entries second (in the order they are specified), and finally, entries for all Jars
found in the lib/ directory (in some arbitrary order).
</p>
<div class="section" title="10.1.&nbsp;Specifying the PEAR Packaging Maven Plugin"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.pear.packager.maven.plugin.usage.configure">10.1.&nbsp;Specifying the PEAR Packaging Maven Plugin</h2></div></div></div>
<p>
To use the PEAR Packaging Plugin within a Maven build,
the plugin must be added to the plugins section of the
Maven POM as shown below:
</p>
<p>
</p><pre class="programlisting">&lt;build&gt;
&lt;plugins&gt;
...
&lt;plugin&gt;
&lt;groupId&gt;org.apache.uima&lt;/groupId&gt;
&lt;artifactId&gt;PearPackagingMavenPlugin&lt;/artifactId&gt;
&lt;!-- if version is omitted, then --&gt;
&lt;!-- version is inherited from parent's pluginManagement section --&gt;
&lt;!-- otherwise, include a version element here --&gt;
&lt;!-- says to load Maven extensions
(such as packaging and type handlers) from this plugin --&gt;
&lt;extensions&gt;true&lt;/extensions&gt;
&lt;executions&gt;
&lt;execution&gt;
&lt;phase&gt;package&lt;/phase&gt;
&lt;!-- where you specify details of the thing being packaged --&gt;
&lt;configuration&gt;
&lt;classpath&gt;
&lt;!-- PEAR file component classpath settings --&gt;
$main_root/lib/sample.jar
&lt;/classpath&gt;
&lt;mainComponentDesc&gt;
&lt;!-- PEAR file main component descriptor --&gt;
desc/${artifactId}.xml
&lt;/mainComponentDesc&gt;
&lt;componentId&gt;
&lt;!-- PEAR file component ID --&gt;
${artifactId}
&lt;/componentId&gt;
&lt;datapath&gt;
&lt;!-- PEAR file UIMA datapath settings --&gt;
$main_root/resources
&lt;/datapath&gt;
&lt;/configuration&gt;
&lt;goals&gt;
&lt;goal&gt;package&lt;/goal&gt;
&lt;/goals&gt;
&lt;/execution&gt;
&lt;/executions&gt;
&lt;/plugin&gt;
...
&lt;/plugins&gt;
&lt;/build&gt;
</pre><p>
</p>
<p>
To configure the plugin with the specific settings of a PEAR package, the
<code class="code">&lt;configuration&gt;</code> element section is used. This section contains all parameters
that are used by the PEAR Packaging Plugin to package the right content and set the specific PEAR package settings.
The details about each parameter and how it is used are shown below:
</p>
<p>
</p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">
<p>
<code class="code">&lt;classpath&gt;</code>
- This element specifies the classpath settings for the
PEAR component. The Jar artifact that is built during the current Maven build is
automatically added to the PEAR classpath settings and does not have to be added manually.
In addition, all Jars in the lib directory and its subdirectories will be added to the
generated classpath when the PEAR is installed.
</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3>
<p>Use $main_root variables to refer to libraries inside
the PEAR package. For more details about PEAR packaging please refer to the
Apache UIMA PEAR documentation.</p>
</div>
</li><li class="listitem">
<p>
<code class="code">&lt;mainComponentDesc&gt;</code>
- This element specifies the relative path to the main component descriptor
that should be used to run the PEAR content. The path must be relative to the
project root. A good default to use is <code class="code">desc/${artifactId}.xml</code>.
</p>
</li><li class="listitem">
<p>
<code class="code">&lt;componentID&gt;</code>
- This element specifies the PEAR package component ID. A good default
to use is <code class="code">${artifactId}</code>.
</p>
</li><li class="listitem">
<p>
<code class="code">&lt;datapath&gt;</code>
- This element specifies the PEAR package UIMA datapath settings.
If no datapath settings are necessary, this element can be omitted.
</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3>
<p>Use $main_root variables to refer to libraries inside
the PEAR package. For more details about PEAR packaging please refer to the
Apache UIMA PEAR documentation.</p>
</div>
</li></ul></div><p>
</p>
<p>
For most Maven projects it is sufficient to specify the parameters described above. In some cases, for
more complex projects, it may be necessary to specify some additional configuration
parameters. These parameters are listed below with the default values that are used if they are not
added to the configuration section shown above.
</p>
<p>
</p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem">
<p>
<code class="code">&lt;mainComponentDir&gt;</code>
- This element specifies the main component directory where the UIMA
nature is applied. By default this parameter points to the project root
directory - ${basedir}.
</p>
</li><li class="listitem">
<p>
<code class="code">&lt;targetDir&gt;</code>
- This element specifies the target directory where the results of the plugin
are written. By default this parameter points to the default Maven output
directory - ${basedir}/target
</p>
</li></ul></div><p>
</p>
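<p>
As a sketch, the fragment below spells out these two optional parameters inside the plugin's
<code class="code">&lt;configuration&gt;</code> section, using the default values stated above. In a real
project they would only need to be present when a different directory is wanted:
</p>
<pre class="programlisting">&lt;configuration&gt;
  ...
  &lt;!-- directory holding the PEAR structure (default: project root) --&gt;
  &lt;mainComponentDir&gt;${basedir}&lt;/mainComponentDir&gt;
  &lt;!-- directory the plugin writes its results to (default: Maven output directory) --&gt;
  &lt;targetDir&gt;${basedir}/target&lt;/targetDir&gt;
&lt;/configuration&gt;</pre>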
</div>
<div class="section" title="10.2.&nbsp;Automatically including dependencies"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.pear.packager.maven.plugin.usage.dependencies">10.2.&nbsp;Automatically including dependencies</h2></div></div></div>
<p>
A key concept in PEARs is that they allow specifying other Jars in the classpath.
You can optionally include these Jars within the PEAR package.
</p>
<p>
The PEAR Packaging Plugin does not automatically
add these Jars (that the PEAR might depend on) to the PEAR archive.
However, this
behavior can be added manually to your Maven POM.
The following two build plugins
hook into the build cycle and ensure that all runtime
dependencies are included in the PEAR file.
</p>
<p>
The dependencies will be automatically included in the
PEAR file using this procedure; the PEAR install process also automatically
adds all files in the lib directory (and subdirectories) to the
classpath.
</p>
<p>
The <code class="code">maven-dependency-plugin</code>
copies the runtime dependencies of the PEAR into the
<code class="code">lib</code> folder, which is where the PEAR packaging
plugin expects them.
</p>
<pre class="programlisting">&lt;build&gt;
&lt;plugins&gt;
...
&lt;plugin&gt;
&lt;groupId&gt;org.apache.maven.plugins&lt;/groupId&gt;
&lt;artifactId&gt;maven-dependency-plugin&lt;/artifactId&gt;
&lt;executions&gt;
&lt;!-- Copy the dependencies to the lib folder for the PEAR to copy --&gt;
&lt;execution&gt;
&lt;id&gt;copy-dependencies&lt;/id&gt;
&lt;phase&gt;package&lt;/phase&gt;
&lt;goals&gt;
&lt;goal&gt;copy-dependencies&lt;/goal&gt;
&lt;/goals&gt;
&lt;configuration&gt;
&lt;outputDirectory&gt;${basedir}/lib&lt;/outputDirectory&gt;
&lt;overWriteSnapshots&gt;true&lt;/overWriteSnapshots&gt;
&lt;includeScope&gt;runtime&lt;/includeScope&gt;
&lt;/configuration&gt;
&lt;/execution&gt;
&lt;/executions&gt;
&lt;/plugin&gt;
...
&lt;/plugins&gt;
&lt;/build&gt;
</pre>
<p>
The second Maven plugin, <code class="code">maven-antrun-plugin</code>, hooks into the <code class="code">clean</code>
phase of the build life-cycle and deletes the Jars in the
<code class="code">lib</code> folder.
</p>
<div class="note" title="Note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3>
<p>
With this approach, the <code class="code">lib</code> folder is
automatically created, populated, and removed
during the build process. Therefore, it should not be checked into
the source control system, and you should not
manually place any Jars in it.
</p>
</div>
<pre class="programlisting">&lt;build&gt;
&lt;plugins&gt;
...
&lt;plugin&gt;
&lt;artifactId&gt;maven-antrun-plugin&lt;/artifactId&gt;
&lt;executions&gt;
&lt;!-- Clean the libraries after packaging --&gt;
&lt;execution&gt;
&lt;id&gt;CleanLib&lt;/id&gt;
&lt;phase&gt;clean&lt;/phase&gt;
&lt;configuration&gt;
&lt;tasks&gt;
&lt;delete quiet="true"
failOnError="false"&gt;
&lt;fileset dir="lib" includes="**/*.jar"/&gt;
&lt;/delete&gt;
&lt;/tasks&gt;
&lt;/configuration&gt;
&lt;goals&gt;
&lt;goal&gt;run&lt;/goal&gt;
&lt;/goals&gt;
&lt;/execution&gt;
&lt;/executions&gt;
&lt;/plugin&gt;
...
&lt;/plugins&gt;
&lt;/build&gt;
</pre>
</div>
<div class="section" title="10.3.&nbsp;Running from the command line"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.pear.packager.maven.plugin.commandline">10.3.&nbsp;Running from the command line</h2></div></div></div>
<p>
The pear packager can be run as a maven command. To enable this, you have to add the following to your
maven settings file:
</p><pre class="programlisting">&lt;settings&gt;
...
&lt;pluginGroups&gt;
&lt;pluginGroup&gt;org.apache.uima&lt;/pluginGroup&gt;
&lt;/pluginGroups&gt;</pre><p>
To invoke the pear packager using maven, use the command:
</p><pre class="programlisting">mvn uima-pear:package &lt;parameters...&gt;</pre><p>
The settings are the same ones used in the configuration above, specified as -D variables
where the variable name is pear.parameterName.
For example:
</p><pre class="programlisting">mvn uima-pear:package -Dpear.mainComponentDesc=desc/mydescriptor.xml
-Dpear.componentId=foo</pre><p>
</p>
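<p>
As a further sketch, and assuming the <code class="code">pear.parameterName</code> pattern also applies to the
<code class="code">classpath</code> and <code class="code">datapath</code> parameters, a fuller invocation
from a Unix shell might look like this. The descriptor, component ID, and path values are the illustrative ones
used elsewhere in this chapter; the single quotes keep the shell from interpreting
<code class="code">$main_root</code>:
</p>
<pre class="programlisting">mvn uima-pear:package -Dpear.mainComponentDesc=desc/mydescriptor.xml \
    -Dpear.componentId=foo \
    -Dpear.classpath='$main_root/lib/sample.jar' \
    -Dpear.datapath='$main_root/resources'</pre>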
</div>
<div class="section" title="10.4.&nbsp;Building the PEAR Packaging Plugin From Source"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.pear.packager.maven.plugin.install.src">10.4.&nbsp;Building the PEAR Packaging Plugin From Source</h2></div></div></div>
<p>
The plugin code is available in the Apache
subversion repository at:
<a class="ulink" href="http://svn.apache.org/repos/asf/uima/uimaj/trunk/PearPackagingMavenPlugin" target="_top">http://svn.apache.org/repos/asf/uima/uimaj/trunk/PearPackagingMavenPlugin</a>.
Use the following command line to build it (you will need the Maven build tool, available from Apache):
</p>
<p>
</p><pre class="programlisting">#PearPackagingMavenPlugin&gt; mvn install</pre><p>
</p>
<p>
This maven command will build the tool and install it in your local maven repository,
making it available for use by other maven POMs. The plugin version number
is displayed at the end of the Maven build as shown in the example below. For this example, the plugin
version number is: <code class="code">2.3.0-incubating</code>
</p>
<p>
</p><pre class="programlisting">[INFO] Installing
/code/apache/PearPackagingMavenPlugin/target/
PearPackagingMavenPlugin-2.3.0-incubating.jar
to
/maven-repository/repository/org/apache/uima/PearPackagingMavenPlugin/
2.3.0-incubating/
PearPackagingMavenPlugin-2.3.0-incubating.jar
[INFO] [plugin:updateRegistry]
[INFO] --------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] --------------------------------------------------------------
[INFO] Total time: 6 seconds
[INFO] Finished at: Tue Nov 13 15:07:11 CET 2007
[INFO] Final Memory: 10M/24M
[INFO] --------------------------------------------------------------</pre><p>
</p>
</div>
</div>
<div class="chapter" title="Chapter&nbsp;11.&nbsp;PEAR Installer User's Guide" id="ugr.tools.pear.installer"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;11.&nbsp;PEAR Installer User's Guide</h2></div></div></div>
<p>PEAR (Processing Engine ARchive) is a standard for packaging UIMA compliant
components. This standard defines several service elements that should be included in
the archive package to enable automated installation of the encapsulated UIMA
component. The major PEAR service element is an XML Installation Descriptor that
specifies installation platform, component attributes, custom installation
procedures and environment variables. </p>
<p>The installation of a UIMA compliant component includes 2 steps: (1) installation of
the component code and resources in a local file system, and (2) verification of the
serviceability of the installed component. Installation of the component code and
resources involves extracting component files from the archive (PEAR) package in a
designated directory and localizing file references in component descriptors and other
configuration files. Verification of the component serviceability is accomplished
with the help of standard UIMA mechanisms for instantiating analysis engines.
</p><div class="screenshot">
<div class="mediaobject"><table border="0" summary="manufactured viewport for HTML img" cellspacing="0" cellpadding="0" width="574"><tr><td><img src="images/tools/tools.pear.installer/image002.jpg" width="574" alt="PEAR Installer GUI"></td></tr></table></div>
</div>
<p>There are two versions of the PEAR Installer. One is an interactive, GUI-based
application which puts up a panel asking for the parameters of the installation; the
other is a command line interface version where you pass the parameters needed on the command
line itself. To launch the GUI version of the PEAR Installer, use the script in the UIMA bin directory:
<code class="code">runPearInstaller.bat</code> or <code class="code">runPearInstaller.sh</code>.
The command line version is launched using <code class="code">runPearInstallerCli.cmd</code> or
<code class="code">runPearInstallerCli.sh</code>.</p>
<p>The PEAR Installer installs UIMA
compliant components (analysis engines) from PEAR packages in a local file system. To
install a desired UIMA component the user needs to select the appropriate PEAR file in a
local file system and specify the installation directory (optional). If no installation
directory is specified, the PEAR file is installed to the current working directory.
By default, PEAR packages are not installed directly into the specified installation directory.
Instead, for each PEAR a subdirectory named after the PEAR's ID is created, and the PEAR package is
installed there. If the PEAR installation directory already exists, the old content is automatically
deleted before the new content is installed. During the
component installation the user can read messages printed by the installation program in
the message area of the application window. If the installation fails, an appropriate error
message is printed to help identify and fix the problem.</p>
<p>After the desired UIMA component is successfully installed, the PEAR Installer
allows testing this component in the CAS Visual Debugger (CVD) application, which is
provided with the UIMA package. The CVD application will load your UIMA component using
its XML descriptor file. If the component is loaded successfully, you'll be able to
run it either with sample documents provided in the
<code class="literal">&lt;UIMA_HOME&gt;/examples/data</code> directory, or with any other
sample documents. See <a href="tools.html#ugr.tools.cvd" class="olink">Chapter&nbsp;5, <i>CAS Visual Debugger</i></a> for more information about the CVD application.
Running your component in the CVD application helps to make sure the component will run in
other UIMA applications. If the CVD application fails to load or run your component, or
throws an exception, you can find more information about the problem in the uima.log file
in the current working directory. The log file can be viewed with the CVD.</p>
<p>The PEAR Installer creates a file named <code class="literal">setenv.txt</code> in the
<code class="literal">&lt;component_root&gt;/metadata</code> directory. This file contains
environment variables required to run your component in any UIMA application.
It also creates a PEAR descriptor file named <code class="literal">&lt;componentID&gt;_pear.xml</code>
in the <code class="literal">&lt;component_root&gt;</code> directory
(see also <a href="references.html#d5e1" class="olink">UIMA References</a>
<a href="references.html#ugr.ref.pear.specifier" class="olink">Section&nbsp;6.3, &#8220;PEAR package descriptor&#8221;</a>).
This descriptor can be used to run the installed PEAR directly in your application.
</p>
<p>
The <code class="literal">metadata/setenv.txt</code> file is not read anywhere by the UIMA framework.
It exists for non-UIMA application code that wants to set environment variables.
It is just a "convenience" file duplicating what's in the XML.
</p>
<p>
The setenv.txt file has two special variables: CLASSPATH and PATH.
The CLASSPATH is computed from any supplied CLASSPATH environment variable,
plus the jars that are configured in the PEAR structure, including subcomponents.
The PATH is computed similarly, from any supplied PATH environment variable plus
the "bin" subdirectory of the PEAR structure, if it exists.
</p>
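<p>A rough Python sketch of how such values could be assembled is shown below. The function names
are hypothetical, and the ordering (PEAR entries first, supplied value last) is an assumption
made for the example; the text above does not state the order the installer uses.</p>

```python
import os
from pathlib import Path


def compute_classpath(jars, existing=""):
    """Join the PEAR's jars with any CLASSPATH that was already supplied."""
    entries = [str(jar) for jar in jars]
    if existing:
        # Assumed order: PEAR jars first, then the supplied CLASSPATH.
        entries.append(existing)
    return os.pathsep.join(entries)


def compute_path(component_root, existing=""):
    """Add the PEAR's bin subdirectory to PATH, but only if it exists."""
    entries = []
    bin_dir = Path(component_root) / "bin"
    if bin_dir.is_dir():
        entries.append(str(bin_dir))
    if existing:
        entries.append(existing)
    return os.pathsep.join(entries)
```

<p>Non-UIMA application code could apply the resulting strings to its own process environment
before launching the component.</p>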
<p>The command line version of the PEAR installer has one required argument:
the path to the PEAR file being installed. A second argument can specify the
installation directory (default is the current working directory).
An optional argument, one of "-c" or "-check" or "-verify", causes verification to be done
after installation, as described above.</p>
</div>
<div class="chapter" title="Chapter&nbsp;12.&nbsp;PEAR Merger User's Guide" id="ugr.tools.pear.merger"><div class="titlepage"><div><div><h2 class="title">Chapter&nbsp;12.&nbsp;PEAR Merger User's Guide</h2></div></div></div>
<p>The PEAR Merger utility takes two or more PEAR files and merges their contents,
creating a new PEAR which has, in turn, a new Aggregate analysis engine whose delegates are
the components from the original files being merged. It does this by (1) copying the
contents of the input components into the output component, placing each component into a
separate subdirectory, (2) generating a UIMA descriptor for the output Aggregate
analysis engine and (3) creating an output PEAR file that encapsulates the output
Aggregate.</p>
<p>The merge logic is quite simple, and is intended to work for simple cases. More complex
merging needs to be done by hand. Please see the Restrictions and Limitations section,
below.</p>
<p>To run the PearMerger command line utility you can use the runPearMerger scripts (.bat for Windows, and .sh for
Unix). The usage of the tooling is shown below:</p>
<pre class="programlisting">runPearMerger 1st_input_pear_file ... nth_input_pear_file
-n output_analysis_engine_name [-f output_pear_file ]</pre>
<p>The first group of parameters are the input PEAR files. No duplicates are allowed
here. The <code class="literal">-n</code> parameter is the name of the generated Aggregate
Analysis Engine. The optional <code class="literal">-f</code> parameter specifies the name of
the output file. If it is omitted, the output is written to
<code class="literal">output_analysis_engine_name.pear</code> in the current working directory.</p>
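<p>The argument rules can be illustrated with a short Python sketch. It is not the
PearMerger implementation: the function <code class="literal">parse_merger_args</code> and the attribute
names are hypothetical, and duplicates are detected by simple string comparison of the paths,
which is an assumption.</p>

```python
import argparse


def parse_merger_args(argv):
    """Parse arguments shaped like the runPearMerger usage line (hypothetical)."""
    parser = argparse.ArgumentParser(prog="runPearMerger")
    parser.add_argument("input_pears", nargs="+", help="input PEAR files")
    parser.add_argument("-n", dest="name", required=True,
                        help="name of the generated aggregate analysis engine")
    parser.add_argument("-f", dest="output", help="output PEAR file")
    args = parser.parse_args(argv)
    # No duplicates are allowed among the input PEAR files.
    if len(set(args.input_pears)) != len(args.input_pears):
        parser.error("duplicate input PEAR files are not allowed")
    # Without -f, the output file is named after the aggregate.
    if args.output is None:
        args.output = args.name + ".pear"
    return args
```

<p>For example, parsing <code class="literal">a.pear b.pear -n MyAggregate</code> yields
<code class="literal">MyAggregate.pear</code> as the output file name.</p>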
<p>During the running of this tool, work files are written to a temporary directory
created in the user's home directory.</p>
<div class="section" title="12.1.&nbsp;Details of the merging process"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.pear.merger.merge_details">12.1.&nbsp;Details of the merging process</h2></div></div></div>
<p>The PEARs are merged using the following steps:</p>
<div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>A temporary working directory is created for the
output aggregate component.</p></li><li class="listitem"><p>Each input PEAR file is extracted into a separate
'input_component_name' folder under the working directory.</p>
</li><li class="listitem"><p>The extracted files are processed to adjust the
'$main_root' macros. This operation differs from the PEAR installation
operation, because it does not replace the macros with absolute paths.</p>
</li><li class="listitem"><p>The output PEAR directory structure, 'metadata' and
'desc' folders under the working directory, are created.</p>
</li><li class="listitem"><p>The UIMA AE descriptor for the output aggregate component is built
in the 'desc' folder. This aggregate descriptor refers to the input
delegate components, specifying 'fixed flow' based on the original
order of the input components in the command line. The aggregate descriptor's
'capabilities' and
'operational properties' sections are built based on the input
components' specifications.</p></li><li class="listitem"><p>A new PEAR installation descriptor is created in the
'metadata' folder, referencing the new output aggregate descriptor
built in the previous step. </p></li><li class="listitem"><p>The content of the temporary output working directory is zipped to
create the output PEAR, and then the temporary working directory is deleted.
</p></li></ol></div>
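<p>The first, fourth and last steps of the list above can be sketched in Python as follows. This is
an illustration under stated assumptions, not the utility's code: the function names are
hypothetical, the working directory is created in the system temporary location rather than
in the user's home directory, and the extraction, macro adjustment and descriptor generation
steps are left out.</p>

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def create_working_dir():
    """Create a temporary working directory with the output PEAR skeleton."""
    work_dir = Path(tempfile.mkdtemp(prefix="pear_merge_"))
    for name in ("metadata", "desc"):
        (work_dir / name).mkdir()
    return work_dir


def package_and_clean(work_dir, output_pear):
    """Zip the working directory into the output PEAR, then delete it."""
    work_dir = Path(work_dir)
    with zipfile.ZipFile(output_pear, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(work_dir.rglob("*")):
            if path.is_file():
                # Store entries relative to the working directory root.
                archive.write(path, path.relative_to(work_dir).as_posix())
    shutil.rmtree(work_dir)
    return Path(output_pear)
```

<p>The output file should be written outside the working directory, so that the archive does not
try to include itself.</p>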
<p>The PEAR merger utility logs all the operations both to standard console output and
to a log file, pm.log, which is created in the current working directory.</p>
</div>
<div class="section" title="12.2.&nbsp;Testing and Modifying the resulting PEAR"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.pear.merger.testing_modifying_resulting_pear">12.2.&nbsp;Testing and Modifying the resulting PEAR</h2></div></div></div>
<p>The output PEAR file can be installed and tested using the PEAR Installer. The
output aggregate component can also be tested by using the CVD or DocAnalyzer
tools.</p>
<p>The PEAR Installer creates Eclipse project files (.classpath and .project) in the
root directory of the installed PEAR, so the installed component can be imported into
the Eclipse IDE as an external project. Once the component is in the Eclipse IDE,
developers may use the Component Descriptor Editor and the PEAR Packager to modify the
output aggregate descriptor and re-package the component.</p>
</div>
<div class="section" title="12.3.&nbsp;Restrictions and Limitations"><div class="titlepage"><div><div><h2 class="title" style="clear: both" id="ugr.tools.pear.merger.restrictions_limitations">12.3.&nbsp;Restrictions and Limitations</h2></div></div></div>
<p>The PEAR Merger utility only does basic merging operations, and is limited as
follows. You can overcome these by editing the resulting PEAR file or the resulting
Aggregate Descriptor.</p>
<div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>The Merge operation specifies Fixed Flow sequencing
for the Aggregate.</p></li><li class="listitem"><p>The merged aggregate does not define any parameters, so the delegate
parameters cannot be overridden.</p></li><li class="listitem"><p>No External Resource definitions are generated for the
aggregate.</p></li><li class="listitem"><p>No Sofa Mappings are generated for the aggregate.</p>
</li><li class="listitem"><p>Name collisions are not checked for. Possible name collisions could
occur in the fully-qualified class names of the implementing Java classes, the names
of JAR files, the names of descriptor files, and the names of resource bindings or
resource file paths.</p></li><li class="listitem"><p>The input and output capabilities are generated based on merging the
capabilities from the components (removing duplicates). Multiple capability sets are
not supported: only the first set of each component is used in this process, and only one set is created
for the generated Aggregate. There is no support for merging Sofa
specifications.</p></li><li class="listitem"><p>No Indexes or Type Priorities are created for the generated
Aggregate. No checking is done to see if the Indexes or Type Priorities of the
components conflict or are inconsistent.</p></li><li class="listitem"><p>You can only merge Analysis Engines and CAS Consumers. </p>
</li><li class="listitem"><p>Although PEAR file installation descriptors that are being merged
can have specific XML elements describing Collection Reader and CAS Consumer
descriptors, these elements are ignored during the merge, in the sense that the
installation descriptor that is created by the merge does not set these elements. The
merge process does not use these elements; the output PEAR's new aggregate only
references the merged components' main PEAR descriptor element, as
identified by the PEAR element:
</p><pre class="programlisting">&lt;SUBMITTED_COMPONENT&gt;
&lt;DESC&gt;the_component.xml&lt;/DESC&gt;...
&lt;/SUBMITTED_COMPONENT&gt;
</pre>
</li></ol></div>
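<p>The capability limitation in the list above can be made concrete with a small Python sketch.
The data layout (each component as a list of capability sets, each set a dictionary with
<code class="literal">inputs</code> and <code class="literal">outputs</code>) and the function name are
assumptions made for the example; the real utility works on UIMA descriptors.</p>

```python
def merge_capabilities(components):
    """Build one capability set from the first set of each component."""
    merged = {"inputs": [], "outputs": []}
    for capability_sets in components:
        if not capability_sets:
            continue
        # Only the first capability set is considered; later sets are ignored.
        first = capability_sets[0]
        for key in ("inputs", "outputs"):
            for type_name in first.get(key, []):
                # Duplicates are removed while keeping first-seen order.
                if type_name not in merged[key]:
                    merged[key].append(type_name)
    return merged
```

<p>If a component relies on a second capability set, the generated Aggregate descriptor needs to
be edited by hand to reflect it.</p>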
</div>
</div>
</div></body></html>