 <?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
 "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
 <!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
 %uimaents;
 ]>
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership.  The ASF licenses this file
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing,
 software distributed under the License is distributed on an
 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
 <chapter id="ugr.project_overview">
   <title>UIMA Overview</title>
   <titleabbrev>Overview</titleabbrev>

   <para>The Unstructured Information Management Architecture (UIMA) is an architecture and software framework
     for creating, discovering, composing and deploying a broad range of multi-modal analysis capabilities and
     integrating them with search technologies.  The architecture is being standardized by a technical committee
     within <ulink url="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=uima">OASIS</ulink>; the
     resulting standard is referred to as the <emphasis>UIMA specification</emphasis>.
     </para>

   <para>The <emphasis>Apache UIMA</emphasis> framework is an Apache-licensed, open-source implementation of the
     UIMA Architecture, and provides a run-time environment in which developers can plug in
     and run their UIMA component implementations and with which they can build and deploy UIM applications. The
     framework itself is not specific to any IDE or platform.</para>

   <para>It includes an all-Java implementation of the
     UIMA framework for the development, description, composition and deployment of UIMA components and
     applications. It also provides the developer with an Eclipse-based (<ulink url="http://www.eclipse.org/"/>)
     development environment that includes a set of tools and utilities for using UIMA. In addition, it includes
     a C++ version of the framework, and
     enablements for Annotators built in Perl, Python, and TCL.</para>

   <para>This chapter is the intended starting point for readers who are new to the Apache UIMA Project. It includes
     this introduction and the following sections:</para>
   <itemizedlist>
     <listitem>
       <para> <xref linkend="ugr.project_overview_doc_overview"/> provides a list of the books and topics included in
         the Apache UIMA documentation with a brief summary of each. </para>
     </listitem>
     <listitem>
       <para> <xref linkend="ugr.project_overview_doc_use"/> describes a recommended path through the
         documentation to help get the reader up and running with UIMA. </para>
     </listitem>
     <listitem>
       <para> <xref linkend="ugr.project_overview_migrating_from_ibm_uima"/> is intended for users of IBM
         UIMA, and describes the steps needed to upgrade to Apache UIMA. </para>
     </listitem>
     <listitem>
       <para> <xref linkend="ugr.project_overview_changes_from_v1"/> lists the changes that occurred between UIMA
         v1.x and UIMA v2.x (independent of the transition to Apache).</para>
     </listitem>
   </itemizedlist>

     <para>The main website for Apache UIMA is <ulink url="http://uima.apache.org"/>.  Here you
     can find out many things, including:
      <itemizedlist spacing="compact">
        <listitem><para>how to download (both the binary and source distributions)</para></listitem>
        <listitem><para>how to participate in the development</para></listitem>
        <listitem><para>mailing lists - including the user list used like a forum for questions and answers</para></listitem>
        <listitem><para>a Wiki where you can find and contribute all kinds of information, including tips and best practices</para></listitem>
        <listitem><para>a sandbox - a subproject for potential new additions to Apache UIMA or to subprojects of it.  Things here
        are works in progress, and may (or may not) be included in releases.</para></listitem>
        <listitem><para>links to conferences</para></listitem>
      </itemizedlist>
       </para>

   <section id="ugr.project_overview_doc_overview">
     <title>Apache UIMA Project Documentation Overview</title>
     <para> The user documentation for UIMA is organized into several parts.
       <itemizedlist spacing="compact">
         <listitem>
           <para> Overviews - this documentation </para>
         </listitem>
         <listitem>
           <para> Eclipse Tooling Installation and Setup - also in this document </para>
         </listitem>
         <listitem>
           <para> Tutorials and Developer's Guides </para>
         </listitem>
         <listitem>
           <para> Tools Users' Guides </para>
         </listitem>
         <listitem>
           <para> References </para>
         </listitem>
       </itemizedlist> </para>

     <para>
     The first two parts make up this book; each of the last three has its own
     book.  The books are provided both as
     (somewhat large) HTML files, viewable in browsers, and as PDF files.
     The documentation is fully hyperlinked, with tables of contents.  The PDF versions are set up to
     print nicely: cross references within a book include page numbers. </para>

     <para>If you view the PDF files inside
     a browser that supports embedded viewing of PDF, the hyperlinks between different PDF books may work (not
     all browsers have been tested).</para>

     <para>The following set of tables gives a more detailed overview of the various parts of the
     documentation.
     </para>

     <section id="ugr.project_overview_overview">
       <title>Overviews</title>

       <informaltable frame="all" rowsep="1" colsep="1">
         <tgroup cols="2">
           <colspec colnum="1" colname="col1" colwidth="1*"/>
           <colspec colnum="2" colname="col2" colwidth="2.5*"/>
           <tbody>
             <row>
               <entry><emphasis>Overview of the Documentation</emphasis>
               </entry>
               <entry>
                 <para>What you are currently reading.  Lists the documents provided in the Apache
                 UIMA documentation set and provides
                  a recommended path through the documentation for getting started using
                   UIMA.  It includes release notes and provides a brief high-level description of
                   the different software modules included in the
                   Apache UIMA Project.  See <xref linkend="ugr.project_overview_doc_overview"/>.</para>
               </entry>
             </row>
             <row>
               <entry><emphasis>Conceptual Overview</emphasis>
               </entry>
               <entry>Provides a broad conceptual overview of the UIMA component architecture; includes
                 references to the other documents in the documentation set that provide more detail.
                 See <xref linkend="ugr.ovv.conceptual"/>.</entry>
             </row>
             <row>
               <entry><emphasis>UIMA FAQs</emphasis>
               </entry>
               <entry>Frequently Asked Questions about general UIMA concepts. (Not a programming
                 resource.)  See <xref linkend="ugr.faqs"/>.</entry>
             </row>
             <row>
               <entry><emphasis>Known Issues</emphasis>
               </entry>
               <entry>Known issues and problems with the UIMA SDK.  See <xref linkend="ugr.issues"/>.</entry>
             </row>
             <row>
               <entry><emphasis>Glossary</emphasis>
               </entry>
               <entry>UIMA terms and concepts and their basic definitions.  See <xref linkend="ugr.glossary"/>.</entry>
             </row>
           </tbody>
         </tgroup>
       </informaltable>
     </section>
     <section id="ugr.project_overview_setup">
       <title>Eclipse Tooling Installation and Setup</title>
       <para>Provides step-by-step instructions for installing Apache UIMA in the Eclipse Integrated
         Development Environment.  See <xref linkend="ugr.ovv.eclipse_setup"/>.</para>
     </section>

     <section id="ugr.project_overview_tutorials_dev_guides">
       <title>Tutorials and Developer&apos;s Guides</title>
       <informaltable>
         <tgroup cols="2">
           <colspec colnum="1" colname="col1" colwidth="1*"/>
           <colspec colnum="2" colname="col2" colwidth="2.5*"/>
           <tbody>
             <row id="ugr.project_overview_tutorial_annotator">
               <entry><emphasis>Annotators and Analysis Engines</emphasis>
               </entry>
               <entry>Tutorial-style guide for building UIMA annotators and analysis engines. This chapter
                 introduces the developer to creating type systems and using UIMA&apos;s common data structure,
                 the CAS or Common Analysis Structure. It demonstrates how to use built in tools to specify and create
                 basic UIMA analysis components.  See
                 <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aae"/>.</entry>
             </row>
             <row id="ugr.project_overview_tutorial_cpe">
               <entry><emphasis>Building UIMA Collection Processing Engines</emphasis>
               </entry>
               <entry>Tutorial-style guide for building UIMA collection processing engines. These
                manage the
                 analysis of collections of documents from source to sink.  See
                 <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.cpe"/>.</entry>
             </row>
             <row id="ugr.project_overview_tutorial_application_development">
               <entry><emphasis>Developing Complete Applications</emphasis>
               </entry>
               <entry>Tutorial-style guide on using the UIMA APIs to create, run and manage UIMA components from
                 your application. Also describes APIs for saving and restoring the contents of a CAS using an XML
                 format called <trademark class="registered"> XMI</trademark>.  See
                 <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.application"/>.</entry>
             </row>
             <row id="ugr.project_overview_guide_flow_controller">
               <entry><emphasis>Flow Controller</emphasis>
               </entry>
               <entry>When multiple components are combined in an Aggregate, each CAS flows among the various
                 components. UIMA provides two built-in flows, and also allows custom flows to be
                 implemented.  See <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.fc"/>.</entry>
             </row>
             <row id="ugr.project_overview_guide_multiple_sofas">
               <entry><emphasis>Developing Applications using Multiple Subjects of Analysis</emphasis>
               </entry>
               <entry>A single CAS may be associated with multiple subjects of analysis (Sofas). These are useful
                 for representing and analyzing different formats or translations of the same document. For
                 multi-modal analysis, Sofas are good for different modal representations of the same stream
                 (e.g., audio and closed captions). This chapter provides the developer details on how to use
                 multiple Sofas in an application.  See
                 <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aas"/>.</entry>
             </row>
             <row id="ugr.project_overview_guide_multiple_views">
               <entry><emphasis>Multiple CAS Views of an Artifact</emphasis>
               </entry>
               <entry>UIMA provides an extension to the basic model of the CAS which supports
               analysis of multiple views of the same artifact, all contained within the CAS. This
               chapter describes the concepts, terminology, and the API and XML extensions that
               enable this.  See
                 <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.mvs"/>.</entry>
             </row>
             <row id="ugr.project_overview_guide_cas_multiplier">
               <entry><emphasis>CAS Multiplier</emphasis>
               </entry>
               <entry>A component may add additional CASes into the workflow. This may be useful to break up a large
                 artifact into smaller units, or to create a new CAS that collects information from multiple other
                 CASes.  See <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.cm"/>.</entry>
             </row>
             <row id="ugr.project_overview_xmi_emf">
               <entry><emphasis>XMI and EMF Interoperability</emphasis>
               </entry>
               <entry>The UIMA Type system and the contents of the CAS itself can be externalized using the XMI
                 standard for XML MetaData. Eclipse Modeling Framework (EMF) tooling can be used to develop
                 applications that use this information.  See
                 <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.xmi_emf"/>.</entry>
             </row>
           </tbody>
         </tgroup>
       </informaltable>
     </section>

     <section id="ugr.project_overview_tool_guides">
       <title>Tools Users&apos; Guides</title>

       <informaltable>
         <tgroup cols="2">
           <colspec colnum="1" colname="col1" colwidth="1*"/>
           <colspec colnum="2" colname="col2" colwidth="2.5*"/>
           <tbody>
             <row id="ugr.project_overview_tools_component_descriptor_editor">
               <entry><emphasis>Component Descriptor Editor</emphasis>
               </entry>
               <entry>Describes the features of the Component Descriptor Editor Tool. This tool provides a GUI for
                 specifying the details of UIMA component descriptors, including those for Analysis Engines
                 (primitive and aggregate), Collection Readers, CAS Consumers and Type Systems.  See
                 <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cde"/>.</entry>
             </row>
             <row id="ugr.project_overview_tools_cpe_configurator">
               <entry><emphasis>Collection Processing Engine Configurator</emphasis>
               </entry>
               <entry>Describes the User Interfaces and features of the CPE Configurator tool. This tool allows the
                 user to select and configure the components of a Collection Processing Engine and then to run the
                 engine.  See
                 <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cpe"/>.</entry>
             </row>
             <row id="ugr.project_overview_tools_pear_packager">
               <entry><emphasis>PEAR Packager</emphasis>
               </entry>
               <entry>Describes how to use the PEAR Packager utility. This utility enables developers to produce an
                 archive file for an analysis engine that includes all required resources for installing that
                 analysis engine in another UIMA environment.  See
                 <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.packager"/>.</entry>
             </row>
             <row id="ugr.project_overview_tools_pear_installer">
               <entry><emphasis>PEAR Installer</emphasis>
               </entry>
               <entry>Describes how to use the PEAR Installer utility. This utility installs and verifies an
                 analysis engine from an archive file (PEAR) with all its resources in the right place so it is ready to
                 run.  See
                 <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.installer"/>.</entry>
             </row>
             <row id="ugr.project_overview_tools_pear_merger">
               <entry><emphasis>PEAR Merger</emphasis>
               </entry>
               <entry>Describes how to use the PEAR Merger utility, which does a simple merge of multiple PEAR
                 packages into one.  See
                 <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.merger"/>.</entry>
             </row>
             <row id="ugr.project_overview_tools_document_analyzer">
               <entry><emphasis>Document Analyzer</emphasis>
               </entry>
               <entry>Describes the features of a tool for applying a UIMA analysis engine to a set of documents and
                 viewing the results.  See
                 <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.doc_analyzer"/>.</entry>
             </row>
             <row id="ugr.project_overview_tools_cas_visual_debugger">
               <entry><emphasis>CAS Visual Debugger</emphasis>
               </entry>
               <entry>Describes the features of a tool for viewing the detailed structure and contents of a CAS. Good
                 for debugging.  See
                 <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cvd"/>.</entry>
             </row>
             <row id="ugr.project_overview_tools_jcasgen">
               <entry><emphasis>JCasGen</emphasis>
               </entry>
               <entry>Describes how to run the JCasGen utility, which automatically builds Java classes that
                 correspond to a particular CAS Type System.  See
                 <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.jcasgen"/>.</entry>
             </row>
             <row id="ugr.project_overview_tools_xml_cas_viewer">
               <entry><emphasis>XML CAS Viewer</emphasis>
               </entry>
               <entry>Describes how to run the supplied viewer to view externalized XML forms of CASes. This viewer
                 is used in the examples.  See
                 <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.annotation_viewer"/>.</entry>
             </row>
           </tbody>
         </tgroup>
       </informaltable>
     </section>

     <section id="ugr.project_overview_reference">
       <title>References</title>
       <informaltable>
         <tgroup cols="2">
           <colspec colnum="1" colname="col1" colwidth="1*"/>
           <colspec colnum="2" colname="col2" colwidth="2.5*"/>
           <tbody>
             <row id="ugr.project_overview_javadocs">
               <entry><emphasis>Introduction to the UIMA API Javadocs</emphasis>
               </entry>
               <entry>Javadocs detailing the UIMA programming interfaces.  See
                 <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.javadocs"/>.</entry>
             </row>
             <row id="ugr.project_overview_xml_ref_component_descriptor">
               <entry><emphasis>XML: Component Descriptor</emphasis>
               </entry>
               <entry>Provides the detailed XML format for all the UIMA component descriptors, except the CPE (see
                 next).  See
                 <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xml.component_descriptor"/>.</entry>
             </row>
             <row id="ugr.project_overview_xml_ref_collection_processing_engine_descriptor">
               <entry><emphasis>XML: Collection Processing Engine Descriptor</emphasis>
               </entry>
               <entry>Provides the detailed XML format for the Collection Processing Engine descriptor.  See
                 <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xml.cpe_descriptor"/>.</entry>
             </row>
             <row id="ugr.project_overview_cas">
               <entry><emphasis>CAS</emphasis>
               </entry>
               <entry>Provides a detailed description of the principal CAS interface.  See
                 <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.cas"/>.</entry>
             </row>
             <row id="ugr.project_overview_jcas">
               <entry><emphasis>JCas</emphasis>
               </entry>
               <entry>Provides details on the JCas, a native Java interface to the CAS.  See
                 <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.jcas"/>.</entry>
             </row>
             <row id="ugr.project_overview_ref_pear">
               <entry><emphasis>PEAR Reference</emphasis>
               </entry>
               <entry>Provides a detailed description of the deployable archive format for UIMA
                 components.  See
                 <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.pear"/>.</entry>
             </row>
             <row id="ugr.project_overview_xmi_cas_serialization">
               <entry><emphasis>XMI CAS Serialization Reference</emphasis>
               </entry>
               <entry>Provides a detailed description of the XMI format used to serialize the contents
                 of a CAS.  See
                 <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xmi"/>.</entry>
             </row>
           </tbody>
         </tgroup>
       </informaltable>
     </section>
   </section>

   <section id="ugr.project_overview_doc_use">
     <!-- _crossRef358 -->
     <title>How to use the Documentation</title>
     <orderedlist>
       <listitem>
         <para>Explore this chapter to get an overview of the different documents that are included with Apache UIMA.</para>
       </listitem>
       <listitem>
         <para> Read <olink targetdoc="&uima_docs_overview;" targetptr="ugr.ovv.conceptual"/> to get a broad
           view of the basic UIMA concepts and philosophy with reference to the other documents included in the
           documentation set which provide greater detail. </para>
       </listitem>
       <listitem>
         <para> For more general information on the UIMA architecture and how it has been used, refer to the IBM Systems
           Journal special issue on Unstructured Information Management, on-line at <ulink
             url="http://www.research.ibm.com/journal/sj43-3.html"/>, or to the section of the Apache UIMA project
           website where other publications are listed. </para>
       </listitem>
       <listitem>
         <para> Set up Apache UIMA in your Eclipse environment. To do this, follow the instructions in <xref
             linkend="ugr.ovv.eclipse_setup"/>. </para>
       </listitem>
       <listitem>
         <para> Develop sample UIMA annotators, run them and explore the results. Read <olink
             targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aae"/> and follow it like a tutorial
           to learn how to develop your first UIMA annotator and set up and run your first UIMA analysis engines.
           <itemizedlist>
             <listitem>
               <para> As part of this you will use a few tools including
                 <itemizedlist>
                   <listitem>
                     <para> The UIMA Component Descriptor Editor, described in more detail in <olink
                         targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cde"/> and </para>
                   </listitem>
                   <listitem>
                     <para> The Document Analyzer, described in more detail in <olink
                         targetdoc="&uima_docs_tools;" targetptr="ugr.tools.doc_analyzer"/>. </para>
                   </listitem>

                 </itemizedlist> </para>

             </listitem>
             <listitem>
               <para>While following along in <olink targetdoc="&uima_docs_tutorial_guides;"
                   targetptr="ugr.tug.aae"/>, reference documents that may help are:
                 <itemizedlist>
                   <listitem>
                     <para> <olink targetdoc="&uima_docs_ref;"
                         targetptr="ugr.ref.xml.component_descriptor"/> for understanding the analysis
                       engine descriptors </para>
                   </listitem>
                   <listitem>
                     <para> <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.jcas"/> for
                       understanding the JCas </para>
                   </listitem>
                 </itemizedlist> </para>
             </listitem>
           </itemizedlist> </para>
       </listitem>
       <listitem>
         <para> Learn how to create, run and manage a UIMA analysis engine as part of an application.
           Connect your analysis engine to the provided semantic search engine to learn how a
           complete analysis and search application may be built with Apache UIMA. <olink
             targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.application"/> will guide you
           through this process.
           <itemizedlist>
             <listitem>
               <para> As part of this you will use the Document Analyzer (described in more detail in <olink
                   targetdoc="&uima_docs_tools;" targetptr="ugr.tools.doc_analyzer"/>) and the semantic search
                 GUI tools (see <olink targetdoc="&uima_docs_tutorial_guides;"
                   targetptr="ugr.tug.application.search.query_tool"/>). </para>
             </listitem>
           </itemizedlist> </para>
       </listitem>
       <listitem>
         <para> Pat yourself on the back. Congratulations! If you have reached this step successfully, then you have an
           appreciation for the UIMA analysis engine architecture. You will have built a few sample annotators,
           deployed UIMA analysis engines to analyze a few documents, searched over the results using the built-in
           semantic search engine and viewed the results through a built-in viewer
           &ndash; all as part of a simple but complete application. </para>
       </listitem>
       <listitem>
         <para> Develop and run a Collection Processing Engine (CPE) to analyze and gather the results of an entire
           collection of documents. <olink targetdoc="&uima_docs_tutorial_guides;"
             targetptr="ugr.tug.cpe"/> will guide you through this process.
           <itemizedlist>
             <listitem>
               <para> As part of this you will use the CPE Configurator tool. For details see <olink
                   targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cpe"/>. </para>
             </listitem>
             <listitem>
               <para> You will also learn about CPE Descriptors. The detailed format for these may be found in <olink
                   targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xml.cpe_descriptor"/>. </para>
             </listitem>
           </itemizedlist> </para>
       </listitem>
       <listitem>
         <para> Learn how to package up an analysis engine for easy installation into another UIMA environment.
             <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.packager"/> and <olink
             targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.installer"/> will teach you how to
           create UIMA analysis engine archives so that you can easily share your components with a broader
           community. </para>
       </listitem>
     </orderedlist>
   </section>

   <section id="ugr.project_overview_changes_from_previous">
       <title>Changes from Previous Major Versions</title>
     <para> There are two previous versions of UIMA, available from IBM's alphaWorks: version 1.4.x and version 2.0
       (the 2.0 version was a "beta" only release). This section describes the changes relative to both of these
       releases. A migration utility is provided which updates your Java code and descriptors as needed for this
       release. See <xref linkend="ugr.project_overview_migrating_from_ibm_uima"/> for instructions on how to
       run the migration utility. </para>

      <note><para>Each Apache UIMA release includes RELEASE_NOTES and RELEASE_NOTES.html files that
         describe the changes made in that release.
         Please refer to those files for release-specific details.</para></note>

     <section id="ugr.project_overview_changes_from_2_0">
     <title>Changes from IBM UIMA 2.0 to Apache UIMA 2.1</title>

     <para>This section describes what has changed between version 2.0 and version 2.1 of UIMA;
       the following section describes the differences between version 1.4 and version 2.1.
       </para>

       <section id="ugr.project_overview.migration_utility.java_package_name_changes">
         <title>Java Package Name Changes</title>
         <para>All of the UIMA Java package names have changed in Apache UIMA. They now start with
           <literal>org.apache</literal> rather than <literal>com.ibm</literal>. There have been other
           changes as well. The package name segment <literal>reference_impl</literal> has been shortened to
           <literal>impl</literal>, and some segments have been reordered. For example
           <literal>com.ibm.uima.reference_impl.analysis_engine</literal> has become
           <literal>org.apache.uima.analysis_engine.impl</literal>. Tools are now consolidated under
           <literal>org.apache.uima.tools</literal> and service adapters under
           <literal>org.apache.uima.adapter</literal>. </para>
         <para>The migration utility will replace all occurrences of IBM UIMA package names with their Apache UIMA
           equivalents. It will not replace <emphasis>prefixes</emphasis> of package names, so if your code uses
           a package called <literal>com.ibm.uima.myproject</literal> (although that is not recommended), it
           will not be replaced.</para>
       </section>
       <section id="ugr.project_overview.migration_utility.xml_descriptor_changes">
         <title>XML Descriptor Changes</title>
         <para>The XML namespace in UIMA component descriptors has changed from
           <literal>http://uima.watson.ibm.com/resourceSpecifier</literal> to
           <literal>http://uima.apache.org/resourceSpecifier</literal>. The value of the
           <literal>&lt;frameworkImplementation&gt;</literal> element must now be
           <literal>org.apache.uima.java</literal> or <literal>org.apache.uima.cpp</literal>. The
           migration script will apply these replacements. </para>
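         <para>As a minimal sketch, a migrated descriptor that reflects these two changes might begin as follows.
           The annotator class name <literal>com.example.MyAnnotator</literal> is a hypothetical placeholder,
           and a real descriptor would normally carry further metadata elements that are omitted here.</para>
         <programlisting><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<!-- Namespace was http://uima.watson.ibm.com/resourceSpecifier in IBM UIMA -->
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <!-- Must be org.apache.uima.java or org.apache.uima.cpp -->
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>true</primitive>
  <!-- Hypothetical annotator class, for illustration only -->
  <annotatorImplementationName>com.example.MyAnnotator</annotatorImplementationName>
</analysisEngineDescription>]]></programlisting>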
       </section>
       <section id="ugr.project_overview.migration_utility.tcas_replaced_by_cas">
         <title>TCAS replaced by CAS</title>
         <para>In Apache UIMA the <literal>TCAS</literal> interface has been removed. All uses of it must now be
           replaced by the <literal>CAS</literal> interface. (All methods that used to be defined on
           <literal>TCAS</literal> were moved to <literal>CAS</literal> in v2.0.) The method
           <literal>CAS.getTCAS()</literal> is replaced with <literal>CAS.getCurrentView()</literal> and
           <literal>CAS.getTCAS(String)</literal> is replaced with <literal>CAS.getView(String)</literal>.
           The following have also been removed and replaced with the equivalent "CAS" variants:
           <literal>TCASException</literal>, <literal>TCASRuntimeException</literal>,
           <literal>TCasPool</literal>, and <literal>CasCreationUtils.createTCas(...)</literal>. </para>
         <para>The migration script will apply the necessary replacements.</para>
       </section>
       <section id="ugr.project_overview.migration_utility.jcas_interface">
         <title>JCas Is Now an Interface</title>
         <para>In previous versions, user code accessed the JCas <emphasis>class</emphasis> directly. In Apache
           UIMA there is now an interface, <literal>org.apache.uima.jcas.JCas</literal>, which all JCas-based
           user code must now use. Static methods that were previously on the JCas class (and called from JCas cover
           classes generated by JCasGen) have been moved to the new
           <literal>org.apache.uima.jcas.JCasRegistry</literal> class. The migration script will apply the
           necessary replacements to your code, including any JCas cover classes that are part of your codebase.
           </para>
       </section>
       <section id="ugr.project_overview.migration_utility.jar_files">
         <title>JAR File Names Have Changed</title>
         <para>The UIMA JAR file names have changed slightly.  Underscores have been replaced with hyphens to
           be consistent with Apache naming conventions.  For example <literal>uima_core.jar</literal> is now
           <literal>uima-core.jar</literal>.  Also <literal>uima_jcas_builtin_types.jar</literal> has been
           renamed to <literal>uima-document-annotation.jar</literal>.  Finally, the <literal>jVinci.jar</literal>
           file is now in the <literal>lib</literal> directory rather than the <literal>lib/vinci</literal>
           directory as was previously the case.  The migration script will apply the necessary replacements,
           for example to script files or Eclipse launch configurations. (See <xref
           linkend="ugr.project_overview_running_the_migration_utility"/> for a list of file extensions that
           the migration utility will process by default.)
           </para>
       </section>
     <section id="ugr.ovv.search_engine_repackaged">
       <title>Semantic Search Engine Repackaged</title>
       <para>The versions of the UIMA SDK prior to the move into Apache came with a semantic search engine. The Apache
         version does not include this search engine. The search engine has been repackaged and is separately
         available from <ulink url="http://www.alphaworks.ibm.com/tech/uima"/>. The intent is to integrate,
         over time, with other open source search engines, such as the Lucene search engine project in Apache.</para>
     </section>
   </section>


   <section id="ugr.project_overview_changes_from_v1">
     <title>Changes from UIMA Version 1.x</title>
     <para>Version 2.x of UIMA provides new capabilities and refines several areas of the UIMA
       architecture, as compared with version 1.</para>

     <section id="ugr.project_overview_new_capabilities">
       <title>New Capabilities</title>
       <formalpara id="ugr.project_overview_new_data_types">
         <title>New Primitive data types</title>
         <para>UIMA now supports Boolean (bit), Byte, Short (16-bit integers), Long (64-bit
           integers), and Double (64-bit floating point) primitive types, and arrays of
           these. These types can be used like all the other primitive types.</para>
       </formalpara>
       <formalpara id="ugr.ovv.simpler_aes_and_cases">
         <title>Simpler Analysis Engines and CASes</title>
         <para>Version 1.x made a distinction between Analysis Engines and Text Analysis
           Engines. This distinction has been eliminated in Version 2; new code should just
           refer to Analysis Engines. Analysis Engines can operate on multiple kinds of
           artifacts, including text.</para>
       </formalpara>
       <formalpara id="ugr.ovv.sofas_and_cas_views_simplified">
         <title>Sofas and CAS Views simplified</title>
         <para>The APIs for manipulating multiple subjects of analysis (Sofas) and their
           corresponding CAS Views have been simplified.</para>
       </formalpara>
       <formalpara id="ugr.ovv.ae_support_multiple_new_cases">
         <title>Analysis Component generalized to support multiple new CAS
           outputs</title>
         <para>Analysis Components, in general, can make use of new capabilities to return
           multiple new CASes, in addition to returning the original CAS that is passed in.
           This allows components to have Collection Reader-like capabilities, but be
           placed anywhere in the flow. See <olink
             targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.cm"/>.</para>
       </formalpara>
       <formalpara id="ugr.ovv.user_customized_fc">
         <title>User-customized Flow Controllers</title>
         <para>A new component, the Flow Controller, can be supplied by the user to implement
           arbitrary flow control for CASes within an Aggregate. This is in addition to the two
           built-in flow control choices of linear and language-capability flow. See <olink
             targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.fc"/>.</para>
       </formalpara>
     </section>

     <section id="ugr.ovv.other_changes">
       <title>Other Changes</title>

       <formalpara>
         <title>New Annotator API ImplBase classes</title>
         <para>
           As of version 2.1, UIMA has a new set of Annotator interfaces. Annotators should now
           extend CasAnnotator_ImplBase or JCasAnnotator_ImplBase instead of the v1.x
           TextAnnotator_ImplBase and JTextAnnotator_ImplBase.  The v1.x annotator
           interfaces are unchanged and are still supported for backwards compatibility.
          </para>
         </formalpara>
       <para>
         The new Annotator interfaces support the changed approaches for ResultSpecifications
         and the changed exception names (see below), and have all the methods that CAS Consumers
       have, including collectionProcessComplete and batchProcessComplete.</para>

     <formalpara id="ugr.ovv.exceptions_rationalized">
       <title>UIMA Exceptions rationalized</title>

       <para>In version 1 there were different exceptions for the methods of an
         AnalysisEngine and for the corresponding methods of an Annotator; these were merged
         in version 2.

         <itemizedlist spacing="compact">
           <listitem><para>AnnotatorProcessException (v1) &rarr;
             AnalysisEngineProcessException (v2)</para></listitem>
           <listitem><para>AnnotatorInitializationException (v1) &rarr;
             ResourceInitializationException (v2)</para></listitem>
           <listitem><para>AnnotatorConfigurationException (v1) &rarr;
             ResourceConfigurationException (v2)</para></listitem>
           <listitem><para>AnnotatorContextException (v1) &rarr;
             ResourceAccessException (v2)</para></listitem>
         </itemizedlist> The previous exceptions are still available, but new code should
         use the new exceptions.</para>
         </formalpara>
         <note><para>The <quote>throws</quote> clause in the signature of typeSystemInit has changed; the method now
           throws AnalysisEngineProcessException. For Annotators that extend the previous base classes, the previous
           definition of typeSystemInit continues to work for backwards compatibility.
        </para></note>


     <formalpara id="ugr.ovv.result_specification">
       <title>Changes in Result Specifications</title>
       <para>In version 1, the <literal>process(...)</literal> method took a second
         argument, a ResultSpecification. In version 2, the Result Specification is instead set
         through a separate method call whenever it changes, and it is up to the
         annotator to store it in a local field and make it available when needed.
         This approach lets the annotator receive a specific signal (a method call) when
         the Result Specification changes. Previously, the annotator had to check on every call to
         <literal>process(...)</literal> whether it had changed. The default impl base classes provide
         set/getResultSpecification(...) methods for this.</para>
     </formalpara>

     <formalpara id="ugr.ovv.one_capability_set">
       <title>Only one Capability Set</title>
       <para>In version 1, you could define
         multiple capability sets. These were not well supported, and in version 2
         this is simplified: you should use only one capability set.
         (For backwards compatibility, using more than one
         does not cause a problem for now.)</para>
     </formalpara>


       <formalpara>
         <title>TextAnalysisEngine deprecated; use AnalysisEngine instead</title>
       <para>TextAnalysisEngine has been deprecated - it is now no different than
         AnalysisEngine. Previous code that uses this should still continue to work,
         however.</para></formalpara>

       <formalpara>
         <title>Annotator Context deprecated; use UimaContext instead</title>
         <para>The context for the Annotator is the same as the overall UIMA context.
         The impl base classes provide a getContext() method which now returns the
         UimaContext object.</para>
       </formalpara>

       <formalpara>
         <title>DocumentAnalyzer tool uses XMI formats</title>
       <para>The DocumentAnalyzer tool saves outputs in the new XMI serialization format.
         The AnnotationViewer and SemanticSearchGUI tools can read both the new XMI format
         and the previous XCAS format.</para></formalpara>

       <formalpara>
         <title>CAS Initializer deprecated</title>
         <para>Example code that used CAS Initializers has been rewritten so that it no longer uses them.</para>
       </formalpara>
     </section>

     <section id="ugr.project_overview_backwards_compatibility">
       <title>Backwards Compatibility</title>
       <para>Other than the changes from IBM UIMA to Apache UIMA described above, most UIMA 1.x
         applications should not require additional changes to upgrade to UIMA 2.x. However,
         there are a few exceptions that UIMA 1.x users may need to be aware of:
         <itemizedlist>
           <listitem>
             <para> There have been some changes to ResultSpecifications. We do not
               guarantee 100% backwards compatibility for applications that made use of
               them, although most cases should work. </para>
           </listitem>
           <listitem>
             <para> For applications that deal with multiple subjects of analysis (Sofas),
               the rules that determine whether a component is Multi-View or Single-View
               have been made more consistent. A component is considered Multi-View if and
               only if it declares at least one inputSofa or outputSofa in its descriptor.
               This leads to the following incompatibilities in unusual cases:
               <itemizedlist>
                 <listitem>
                   <para> It is an error if an annotator that implements the TextAnnotator or
                     JTextAnnotator interface also declares inputSofas or outputSofas in
                     its descriptor. Such annotators must be Single-View. </para>
                 </listitem>
                 <listitem>
                   <para> Annotators that implement GenericAnnotator but do not declare
                     any inputSofas or outputSofas will now be passed the view of the default
                     Sofa instead of the Base CAS. </para>
                 </listitem>
               </itemizedlist> </para>
           </listitem>
         </itemizedlist> </para>
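       <para>As an illustration of the Multi-View rule above, a descriptor whose capability declares Sofas
         might contain a fragment like the following sketch. The Sofa names
         <literal>EnglishDocument</literal> and <literal>GermanDocument</literal> are hypothetical
         placeholders. A component whose capability declares neither <literal>&lt;inputSofas></literal> nor
         <literal>&lt;outputSofas></literal> would be treated as Single-View.</para>
       <programlisting>&lt;capabilities>
  &lt;capability>
    &lt;inputs/>
    &lt;outputs/>
    &lt;!-- Declaring at least one input or output Sofa makes the component Multi-View. -->
    &lt;inputSofas>
      &lt;sofaName>EnglishDocument&lt;/sofaName>
    &lt;/inputSofas>
    &lt;outputSofas>
      &lt;sofaName>GermanDocument&lt;/sofaName>
    &lt;/outputSofas>
  &lt;/capability>
&lt;/capabilities></programlisting>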

     </section>
   </section>
   </section>

   <section id="ugr.project_overview_migrating_from_ibm_uima">
     <title>Migrating from IBM UIMA to Apache UIMA</title>
     <para>In Apache UIMA, several things have changed that require changes to user code and descriptors.
       A migration utility is provided which will make the required updates to your files.  The most
       significant change is that the Java package names for all of the UIMA classes and interfaces have changed
       from what they were in IBM UIMA; all of the package names now start with the prefix <literal>org.apache</literal>.</para>

     <section id="ugr.project_overview_running_the_migration_utility">
       <title>Running the Migration Utility</title>
       <note>
         <para>Before running the migration utility, be sure to back up your files, just in case you encounter any
         problems, because the migration tool updates the files in place in the directories where it finds them.</para>
       </note>
       <para> The migration utility is run by executing the script file
         <literal>apache-uima/bin/ibmUimaToApacheUima.bat</literal> (Windows) or
         <literal>apache-uima/bin/ibmUimaToApacheUima.sh</literal> (UNIX). You must pass one argument: the
         directory containing the files that you want to be migrated. Subdirectories will be processed
         recursively.</para>

       <para>The script scans your files and applies the necessary updates, for example replacing the com.ibm
         package names with the new org.apache package names. For more details on what has changed in the UIMA APIs and
         what changes are performed by the migration script, see <xref linkend="ugr.project_overview_changes_from_2_0"/>.</para>

       <para>The script will only attempt to modify files with the extensions: java, xml, xmi, wsdd, properties,
         launch, bat, cmd, sh, ksh, or csh; and files with no extension. Also, files with size greater than 1,000,000
         bytes will be skipped. (If you want the script to modify files with other extensions, you can edit the script
         file and change the <literal>-ext</literal> argument appropriately.) </para>

       <para>If the migration tool reports warnings, there may be a few additional steps to take.  The following two
         sections explain some simple manual changes that you might need to make to your code.</para>

       <section id="ugr.project_overview_running_the_migration_utility.jcas_for_document_annotation">
         <title>JCas Cover Classes for DocumentAnnotation</title>
         <para> If you have run JCasGen it is likely that you have the classes
           <literal>com.ibm.uima.jcas.tcas.DocumentAnnotation</literal> and
           <literal>com.ibm.uima.jcas.tcas.DocumentAnnotation_Type</literal> as part of your code. This
           package name is no longer valid, and the migration utility does not move your files between directories so
           it is unable to fix this. </para>
         <para> If you have not made manual modifications to these classes, the best solution is usually to just delete
           these two classes (and their containing package). There is a default version in the
           <literal>uima-document-annotation.jar</literal> file that is included in Apache UIMA. If you
           <emphasis>have</emphasis> made custom changes, then you should not delete the file but instead move it to
           the correct package <literal>org.apache.uima.jcas.tcas</literal>. For more information about JCas
           and DocumentAnnotation please see <olink targetdoc="&uima_docs_ref;"
             targetptr="ugr.ref.jcas.documentannotation_issues"/>.</para>
       </section>
       <section id="ugr.project_overview_running_the_migration_utility.manual_migration_needed.getdocumentannotation">
         <title>JCas.getDocumentAnnotation</title>
         <para>The deprecated method <literal>JCas.getDocumentAnnotation</literal> has been removed. Its use
           must be replaced with <literal>JCas.getDocumentAnnotationFs</literal>. The method
           <literal>JCas.getDocumentAnnotationFs()</literal> returns type <literal>TOP</literal>, so your
           code must cast this to type <literal>DocumentAnnotation</literal>. The reasons for this are described
           in <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.jcas.documentannotation_issues"/>.
           </para>
       </section>

     </section>


     <section id="ugr.project_overview_rare_migration">
       <title>Manual Migration</title>
       <para>The following are rare cases where you may need to take additional steps to migrate your code.  You need only
         read this section if the migration tool reported a warning or if you are having trouble getting your code to
         compile or run after running the migration.  For most users, attention to these things will not
         be required.</para>

       <section id="ugr.project_overview.manual_migration_needed.xiinclude">
         <title>xi:include</title>
         <para>The use of &lt;xi:include> in UIMA component descriptors has been discouraged for some time, and in
           Apache UIMA support for it has been removed. If you have descriptors that use it, you must change them to
           use UIMA's &lt;import> syntax instead. The proper syntax is described in <olink
             targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xml.component_descriptor.imports"/>.
           </para>
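         <para>For example, a type system descriptor that previously pulled in another file with
           &lt;xi:include> could instead use an import along the lines of the sketch below. The file name
           <literal>MyBaseTypes.xml</literal> and the type system name <literal>MyTypeSystem</literal> are
           hypothetical; the referenced section gives the exact rules for imports by
           <literal>location</literal> and by <literal>name</literal>.</para>
         <programlisting>&lt;typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  &lt;name>MyTypeSystem&lt;/name>
  &lt;imports>
    &lt;!-- A relative location is resolved against the descriptor containing the import. -->
    &lt;import location="MyBaseTypes.xml"/>
  &lt;/imports>
&lt;/typeSystemDescription></programlisting>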
       </section>
       <section id="ugr.project_overview.manual_migration_needed.duplicate_methods_cas_tcas">
         <title>Duplicate Methods Taking CAS and TCAS as Arguments</title>
         <para>Because <literal>TCAS</literal> has been replaced by <literal>CAS</literal>, if you had two
           methods distinguished only by whether an argument type was <literal>TCAS</literal> or
           <literal>CAS</literal>, the migration tool will cause these to have identical signatures, which will be
           a compile error. If this happens, consider why the two variants were needed in the first place. Often, it may
           work to simply delete one of the methods.</para>
       </section>
       <section id="ugr.project_overview.manual_migration_needed.undocumented_methods">
         <title>Use of Undocumented Methods from the com.ibm.uima.util package</title>
         <titleabbrev>Undocumented Methods</titleabbrev>
         <para>Previous UIMA versions had some methods in the <literal>com.ibm.uima.util</literal> package that
           were for internal use and were not documented in the Javadoc. (There are also many methods in that package
           which are documented, and there is no issue with using these.) It is not recommended that you use any of the
           undocumented methods. If you do, the migration script will not handle them correctly. These have now been
           moved to <literal>org.apache.uima.internal.util</literal>, and you will have to manually update your
           imports to point to this location.</para>
       </section>
       <section id="ugr.project_overview.manual_migration_needed.uima_package_names_in_user_code">
         <title>Use of UIMA Package Names for User Code</title>
         <titleabbrev>Package Names</titleabbrev>
         <para>If you have placed your own classes in a package that has exactly the same name as one of the UIMA packages
           (not recommended), this will cause problems when you run the migration script. Since the script replaces
           UIMA package names, all of your imports that refer to your class will get replaced and your code will no
           longer compile. If this happens, you can fix it by manually moving your code to the new Apache UIMA package
           name (i.e., whatever name your imports got replaced with). However, we recommend instead that you do not
           use Apache UIMA package names for your own code.</para>
         <para>An even rarer case would be if you had a package name that started with a capital letter (poor Java
           style) AND was prefixed by one of the UIMA package names, for example a package named
           <literal>com.ibm.uima.MyPackage</literal>. This would be treated as a class name and replaced with
           <literal>org.apache.uima.MyPackage</literal> wherever it occurs.</para>
       </section>
       <section id="ugr.project_overview.manual_migration_needed.exceptions_extend_uima_exceptions">
         <title>CASException and CASRuntimeException now extend UIMA(Runtime)Exception</title>
         <titleabbrev>Changes to CAS Exceptions</titleabbrev>
         <para>
           This change may affect user code to a small extent, as some of the APIs on
           <literal>CASException</literal> and <literal>CASRuntimeException</literal> no longer exist.
           On the up side, all UIMA exceptions are now derived from the same base classes and behave
           the same way.  The most significant change is that you can no longer check for the specific
           type of exception the way you used to.  For example, if you had code like this:

           <programlisting>catch (CASRuntimeException e) {
   if (e.getError() == CASRuntimeException.ILLEGAL_ARRAY_SIZE) {
     // Do something in case this particular error is caught
   }
 }</programlisting>

           you will need to replace it with the following:

           <programlisting>catch (CASRuntimeException e) {
   // Message keys are strings, so compare with equals() rather than ==
   if (e.getMessageKey().equals(CASRuntimeException.ILLEGAL_ARRAY_SIZE)) {
     // Do something in case this particular error is caught
   }
 }</programlisting>

           as the message keys are now strings.  This change is not handled by the migration script.
         </para>
       </section>
     </section>
   </section>

   <section id="ugr.project_overview_summary">
     <title>Apache UIMA Summary</title>
     <section id="ugr.ovv.summary.general">
       <title>General</title>
       <para>UIMA supports the development, discovery, composition and deployment of multi-modal
         analytics for the analysis of unstructured information and its integration with search
         technologies.</para>

       <para>Apache UIMA includes APIs and tools for creating analysis components. Examples of analysis components include
         tokenizers, summarizers, categorizers, parsers, and named-entity detectors. Tutorial examples are
         provided with Apache UIMA; additional components are available from the community. </para>

       <para>Apache UIMA does not itself include a semantic search engine; instructions are included for
         incorporating the semantic search SDK from IBM's <ulink url="http://alphaworks.ibm.com/tech/uima">alphaWorks</ulink>,
         which can index the results of analysis, and for using this semantic index to perform more
         advanced search. </para>
     </section>
     <section id="ugr.ovv.summary.programming_language_support">
       <title>Programming Language Support</title>
       <para>UIMA supports the development and integration of analysis algorithms developed in different
         programming languages. </para>

       <para>The Apache UIMA project provides both a Java framework and a matching C++
         enablement layer, which allows annotators to be written in C++ and have access to a C++ version of the CAS. The
         C++ enablement layer also enables annotators to be written in Perl, Python, and Tcl, and to interoperate with
         those written in other languages. <!--Documentation for this is provided here (link to be filled in).-->
         </para>

     </section>
     <section id="ugr.ovv.general.summary.multi_modal_support">
       <title>Multi-Modal Support</title>
       <para>The UIMA architecture supports the development, discovery, composition and deployment of
         multi-modal analytics, including text, audio and video. <olink
           targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aas"/> discusses this in more
         detail.</para>
     </section>
     <section id="ugr.ovv.summary.general.semantic_search_components">
       <title>Semantic Search Components</title>
       <para> The Lucene search engine as of this writing (November, 2006) does not support searching with
         annotations. The site <ulink url="http://www.alphaworks.ibm.com/tech/uima"/> provides a download of a
         semantic search engine, a simple demo query tool, some documentation on the semantic search engine, and a
         component that connects the results of UIMA analysis to the indexer so that the annotations as well as
         keywords can be indexed. </para>

       <para>Previous versions of the UIMA SDK (prior to the Apache versions) are available from <ulink
           url="http://www.alphaworks.ibm.com/tech/uima"> IBM's alphaWorks</ulink>. The source code for
         previous versions of the main UIMA framework is available on <ulink
           url="http://uima-framework.sourceforge.net/"> SourceForge</ulink>.</para>
     </section>
   </section>

   <section id="ugr.project_overview_summary_sdk_capabilities">
     <title>Summary of Apache UIMA Capabilities</title>
     <informaltable frame="all" rowsep="1" colsep="1">
       <tgroup cols="2">
         <colspec colnum="1" colname="col1" colwidth=".75*"/>
         <colspec colnum="2" colname="col2" colwidth="*"/>
         <tbody>
           <row>
             <entry role="tableSubhead">Module</entry>
             <entry role="tableSubhead">Description</entry>
           </row>
           <row>
             <entry>UIMA Framework Core</entry>
             <entry>
               <para>A framework integrating core functions for creating, deploying, running and managing UIMA
                 components, including analysis engines and Collection Processing Engines in collocated and/or
                 distributed configurations. </para>

               <para>The framework includes an implementation of core components for transport layer adaptation,
                 CAS management, workflow management based on declarative specifications, resource management,
                 configuration management, logging, and other functions.</para>
             </entry>
           </row>
           <row>
             <entry>C++ and other programming language Interoperability</entry>

             <entry>
               <para>Includes C++ CAS and supports the creation of UIMA compliant C++ components that can be
                 deployed in the UIMA run-time through a built-in JNI adapter. This includes high-speed binary
                 serialization.</para>

               <para>Includes support for creating service-based UIMA engines. This is ideal for
                 wrapping existing code written in different languages.</para>
             </entry>
           </row>
           <row>
             <entry role="tableSubhead">Framework Services and APIs</entry>
             <entry role="tableSubhead">Note that the interfaces of these components are available to the developer,
               but their implementations may differ between implementations of the UIMA
               framework.</entry>
           </row>
           <row>
             <entry>CAS</entry>
             <entry>These classes provide the developer with typed access to the Common Analysis Structure (CAS),
               including type system schema, elements, subjects of analysis and indices. The multiple subjects of
               analysis (Sofas) mechanism supports the independent or simultaneous analysis of multiple views of
               the same artifacts (e.g. documents), supporting multi-lingual and multi-modal analysis.</entry>
           </row>
           <row>
             <entry>JCas</entry>
             <entry>An alternative interface to the CAS, providing Java-based UIMA Analysis components with
               native Java object access to CAS types and their attributes or features, using the
               JavaBeans conventions of getters and setters.</entry>
           </row>

           <row>
             <entry>Collection Processing Management (CPM)</entry>
             <entry>Core functions for running UIMA collection processing engines in collocated and/or
               distributed configurations. The CPM provides scalability across parallel processing pipelines,
               check-pointing, performance monitoring and recoverability.</entry>
           </row>
           <row>
             <entry>Resource Manager</entry>
             <entry>Provides UIMA components with run-time access to external resources, handling capabilities
               such as resource naming, sharing, and caching. </entry>
           </row>
           <row>
             <entry>Configuration Manager</entry>
             <entry>Provides UIMA components with run-time access to their configuration parameter settings.
               </entry>
           </row>
           <row>
             <entry>Logger</entry>
             <entry>Provides access to a common logging facility.</entry>
           </row>
           <row>
             <entry namest="col1" nameend="col2" align="center" role="tableSubhead"> Tools and Utilities
               </entry>
           </row>
           <row>
             <entry>JCasGen</entry>
             <entry>Utility for generating a Java object model for CAS types from a UIMA XML type system
               definition.</entry>
           </row>
           <row>
             <entry>Saving and Restoring CAS contents</entry>
             <entry>APIs in the core framework support saving and restoring the contents of a CAS to streams using an
               XMI format. </entry>
           </row>
           <row>
             <entry>PEAR Packager for Eclipse</entry>
             <entry>Tool for building a UIMA component archive to facilitate porting, registering, installing and
               testing components.</entry>
           </row>
           <row>
             <entry>PEAR Installer</entry>
             <entry>Tool for installing and verifying a UIMA component archive in a UIMA installation.</entry>
           </row>
           <row>
             <entry>PEAR Merger</entry>
             <entry>Utility that combines multiple PEARs into one.</entry>
           </row>
           <row>
             <entry>Component Descriptor Editor</entry>
             <entry>Eclipse Plug-in for specifying and configuring component descriptors for UIMA analysis
               engines as well as other UIMA component types including Collection Readers and CAS
               Consumers.</entry>
           </row>
           <row>
             <entry>CPE Configurator</entry>
             <entry>Graphical tool for configuring Collection Processing Engines and applying them to
               collections of documents.</entry>
           </row>
           <row>
             <entry>Java Annotation Viewer</entry>
             <entry>Viewer for exploring annotations and related CAS data.</entry>
           </row>
           <row>
             <entry>CAS Visual Debugger</entry>
             <entry>GUI Java application that provides developers with a detailed visual view of the contents of a
               CAS.</entry>
           </row>
           <row>
             <entry>Document Analyzer</entry>
             <entry>GUI Java application that applies analysis engines to sets of documents and shows results in a
               viewer.</entry>
           </row>
           <row>
             <entry namest="col1" nameend="col2" align="center" role="tableSubhead"> Example Analysis
               Components </entry>
           </row>
           <row>
             <entry>Database Writer</entry>
             <entry>CAS Consumer that writes the content of selected CAS types into a relational database, using
               JDBC. This code is in cpe/PersonTitleDBWriterCasConsumer. </entry>
           </row>
           <row>
             <entry>Annotators</entry>
             <entry> Set of simple annotators meant for pedagogical purposes. Includes: Date/time, Room-number,
               Regular expression, Tokenizer, and Meeting-finder annotator. There are sample CAS Multipliers
               as well. </entry>
           </row>
           <row>
             <entry>Flow Controllers</entry>
             <entry> There is a sample flow-controller based on the whiteboard concept of sending the CAS to whatever
               annotator hasn't yet processed it, when that annotator's inputs are available in the CAS. </entry>
           </row>
           <row>
             <entry>XMI Collection Reader, CAS Consumer</entry>
             <entry>Reads and writes the CAS in the XMI format.</entry>
           </row>

           <row>
             <entry>File System Collection Reader</entry>
             <entry> Simple Collection Reader for pulling documents from the file system and initializing CASes.
               </entry>
           </row>
           <row>
             <entry namest="col1" nameend="col2" align="center" role="tableSubhead"> Components available
               from <ulink url="http://www.alphaworks.ibm.com/tech/uima"></ulink> </entry>
           </row>
           <row>
             <entry>Semantic Search CAS Indexer</entry>
             <entry>A CAS Consumer that uses the semantic search engine indexer to build an index from a stream of
               CASes. Requires the semantic search engine (available from the same place). </entry>
           </row>
         </tbody>
       </tgroup>
     </informaltable>
   </section>

 </chapter>