<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >  
%uimaents;
]>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->
<chapter id="ugr.project_overview">
  <title>UIMA Overview</title>
  <titleabbrev>Overview</titleabbrev>
  
  <para>The Unstructured Information Management Architecture (UIMA) is an architecture and software framework
    for creating, discovering, composing and deploying a broad range of multi-modal analysis capabilities and
    integrating them with search technologies.  The architecture is being standardized as the 
    <emphasis>UIMA specification</emphasis> by a technical committee within
    <ulink url="http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=uima">OASIS</ulink>.  
    </para>
  
  <para>The <emphasis>Apache UIMA</emphasis> framework is an Apache-licensed, open source implementation of the
    UIMA Architecture. It provides a run-time environment in which developers can plug in
    and run their UIMA component implementations and with which they can build and deploy UIMA applications. The
    framework itself is not specific to any IDE or platform.</para>
  
  <para>It includes an all-Java implementation of the
    UIMA framework for the development, description, composition and deployment of UIMA components and
    applications. It also provides the developer with an Eclipse-based (<ulink url="http://www.eclipse.org/"/>)
    development environment that includes a set of tools and utilities for using UIMA. It also includes 
    a C++ version of the framework, and
    enablements for Annotators built in Perl, Python, and TCL.</para>
  
  <para>This chapter is the intended starting point for readers who are new to the Apache UIMA Project. It includes
    this introduction and the following sections:</para> 
  <itemizedlist>
    <listitem>
      <para> <xref linkend="ugr.project_overview_doc_overview"/> provides a list of the books and topics included in
        the Apache UIMA documentation with a brief summary of each. </para>
    </listitem>
    <listitem>
      <para> <xref linkend="ugr.project_overview_doc_use"/> describes a recommended path through the
        documentation to help get the reader up and running with UIMA. </para>
    </listitem>
    <listitem>
      <para> <xref linkend="ugr.project_overview_migrating_from_ibm_uima"/> is intended for users of IBM
        UIMA, and describes the steps needed to upgrade to Apache UIMA. </para>
    </listitem>
    <listitem>
      <para> <xref linkend="ugr.project_overview_changes_from_v1"/> lists the changes that occurred between UIMA
        v1.x and UIMA v2.x (independent of the transition to Apache).</para>
    </listitem>
  </itemizedlist>
    
    <para>The main website for Apache UIMA is <ulink url="http://uima.apache.org"/>.  Here you 
    can find out many things, including:
     <itemizedlist spacing="compact">
       <listitem><para>how to download (both the binary and source distributions)</para></listitem>
       <listitem><para>how to participate in the development</para></listitem>
       <listitem><para>mailing lists - including the user list used like a forum for questions and answers</para></listitem>
       <listitem><para>a Wiki where you can find and contribute all kinds of information, including tips and best practices</para></listitem>
       <listitem><para>a sandbox - a subproject for potential new additions to Apache UIMA or to subprojects of it.  Things here
       are works in progress, and may (or may not) be included in releases.</para></listitem>
       <listitem><para>links to conferences</para></listitem>
     </itemizedlist>
      </para>
 
  <section id="ugr.project_overview_doc_overview">
    <title>Apache UIMA Project Documentation Overview</title>
    <para> The user documentation for UIMA is organized into several parts.
      <itemizedlist spacing="compact">
        <listitem>
          <para> Overviews - this documentation </para>
        </listitem>
        <listitem>
          <para> Eclipse Tooling Installation and Setup - also in this document </para>
        </listitem>
        <listitem>
          <para> Tutorials and Developer's Guides </para>
        </listitem>
        <listitem>
          <para> Tools Users' Guides </para>
        </listitem>
        <listitem>
          <para> References </para>
        </listitem>
      </itemizedlist> </para>
    
    <para>
    The first 2 parts make up this book; the last 3 have individual 
    books.  The books are provided both as
    (somewhat large) html files, viewable in browsers, and also as PDF files.  
    The documentation is fully hyperlinked, with tables of contents.  The PDF versions are set up to 
    print nicely - they have page numbers included on the cross references within a book. </para>
    
    <para>If you view the PDF files inside
    a browser that supports embedded viewing of PDF, the hyperlinks between different PDF books may work (not 
    all browsers have been tested).</para>
    
    <para>The following set of tables gives a more detailed overview of the various parts of the
    documentation.
    </para>
    
    <section id="ugr.project_overview_overview">
      <title>Overviews</title>
      
      <informaltable frame="all" rowsep="1" colsep="1">
        <tgroup cols="2">
          <colspec colnum="1" colname="col1" colwidth="1*"/>
          <colspec colnum="2" colname="col2" colwidth="2.5*"/>
          <tbody>
            <row>
              <entry><emphasis>Overview of the Documentation</emphasis>
              </entry>
              <entry>
                <para>What you are currently reading.  Lists the documents provided in the Apache 
                UIMA documentation set and provides
                 a recommended path through the documentation for getting started using
                  UIMA.  It includes release notes and provides a brief high-level description of 
                  the different software modules included in the
                  Apache UIMA Project.  See <xref linkend="ugr.project_overview_doc_overview"/>.</para>
              </entry>
            </row>
            <row>
              <entry><emphasis>Conceptual Overview</emphasis>
              </entry>
              <entry>Provides a broad conceptual overview of the UIMA component architecture; includes
                references to the other documents in the documentation set that provide more detail.
                See <xref linkend="ugr.ovv.conceptual"/>.</entry>
            </row>
            <row>
              <entry><emphasis>UIMA FAQs</emphasis>
              </entry>
              <entry>Frequently Asked Questions about general UIMA concepts. (Not a programming
                resource.)  See <xref linkend="ugr.faqs"/>.</entry>
            </row>
            <row>
              <entry><emphasis>Known Issues</emphasis>
              </entry>
              <entry>Known issues and problems with the UIMA SDK.  See <xref linkend="ugr.issues"/>.</entry>
            </row>
            <row>
              <entry><emphasis>Glossary</emphasis>
              </entry>
              <entry>UIMA terms and concepts and their basic definitions.  See <xref linkend="ugr.glossary"/>.</entry>
            </row>
          </tbody>
        </tgroup>
      </informaltable>
    </section>
    <section id="ugr.project_overview_setup">
      <title>Eclipse Tooling Installation and Setup</title>
      <para>Provides step-by-step instructions for installing Apache UIMA in the Eclipse Integrated
        Development Environment.  See <xref linkend="ugr.ovv.eclipse_setup"/>.</para>
    </section>
    
    <section id="ugr.project_overview_tutorials_dev_guides">
      <title>Tutorials and Developer&apos;s Guides</title>
      <informaltable>
        <tgroup cols="2">
          <colspec colnum="1" colname="col1" colwidth="1*"/>
          <colspec colnum="2" colname="col2" colwidth="2.5*"/>
          <tbody>
            <row id="ugr.project_overview_tutorial_annotator">
              <entry><emphasis>Annotators and Analysis Engines</emphasis>
              </entry>
              <entry>Tutorial-style guide for building UIMA annotators and analysis engines. This chapter
                introduces the developer to creating type systems and using UIMA&apos;s common data structure,
                the CAS or Common Analysis Structure. It demonstrates how to use built-in tools to specify and create
                basic UIMA analysis components.  See 
                <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aae"/>.</entry>
            </row>
            <row id="ugr.project_overview_tutorial_cpe">
              <entry><emphasis>Building UIMA Collection Processing Engines</emphasis>
              </entry>
              <entry>Tutorial-style guide for building UIMA collection processing engines. These
               manage the
                analysis of collections of documents from source to sink.  See 
                <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.cpe"/>.</entry>
            </row>
            <row id="ugr.project_overview_tutorial_application_development">
              <entry><emphasis>Developing Complete Applications</emphasis>
              </entry>
              <entry>Tutorial-style guide on using the UIMA APIs to create, run and manage UIMA components from
                your application. Also describes APIs for saving and restoring the contents of a CAS using an XML
                format called <trademark class="registered">XMI</trademark>.  See 
                <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.application"/>.</entry>
            </row>
            <row id="ugr.project_overview_guide_flow_controller">
              <entry><emphasis>Flow Controller</emphasis>
              </entry>
              <entry>When multiple components are combined in an Aggregate, each CAS flows among the various
                components. UIMA provides two built-in flows, and also allows custom flows to be
                implemented.  See <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.fc"/>.</entry>
            </row>
            <row id="ugr.project_overview_guide_multiple_sofas">
              <entry><emphasis>Developing Applications using Multiple Subjects of Analysis</emphasis>
              </entry>
              <entry>A single CAS may be associated with multiple subjects of analysis (Sofas). These are useful
                for representing and analyzing different formats or translations of the same document. For
                multi-modal analysis, Sofas are good for different modal representations of the same stream
                (e.g., audio and closed captions). This chapter provides the developer details on how to use
                multiple Sofas in an application.  See 
                <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aas"/>.</entry>
            </row>
            <row id="ugr.project_overview_guide_multiple_views">
              <entry><emphasis>Multiple CAS Views of an Artifact</emphasis>
              </entry>
              <entry>UIMA provides an extension to the basic model of the CAS which supports 
              analysis of multiple views of the same artifact, all contained within the CAS. This 
              chapter describes the concepts, terminology, and the API and XML extensions that 
              enable this.  See 
                <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.mvs"/>.</entry>
            </row>
            <row id="ugr.project_overview_guide_cas_multiplier">
              <entry><emphasis>CAS Multiplier</emphasis>
              </entry>
              <entry>A component may add additional CASes into the workflow. This may be useful to break up a large
                artifact into smaller units, or to create a new CAS that collects information from multiple other
                CASes.  See <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.cm"/>.</entry>
            </row>
            <row id="ugr.project_overview_xmi_emf">
              <entry><emphasis>XMI and EMF Interoperability</emphasis>
              </entry>
              <entry>The UIMA Type system and the contents of the CAS itself can be externalized using the XMI
                standard for XML MetaData. Eclipse Modeling Framework (EMF) tooling can be used to develop
                applications that use this information.  See 
                <olink targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.xmi_emf"/>.</entry>
            </row>
          </tbody>
        </tgroup>
      </informaltable>
    </section>
    
    <section id="ugr.project_overview_tool_guides">
      <title>Tools Users&apos; Guides</title>
      
      <informaltable>
        <tgroup cols="2">
          <colspec colnum="1" colname="col1" colwidth="1*"/>
          <colspec colnum="2" colname="col2" colwidth="2.5*"/>
          <tbody>
            <row id="ugr.project_overview_tools_component_descriptor_editor">
              <entry><emphasis>Component Descriptor Editor</emphasis>
              </entry>
              <entry>Describes the features of the Component Descriptor Editor Tool. This tool provides a GUI for
                specifying the details of UIMA component descriptors, including those for Analysis Engines
                (primitive and aggregate), Collection Readers, CAS Consumers and Type Systems.  See 
                <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cde"/>.</entry>
            </row>
            <row id="ugr.project_overview_tools_cpe_configurator">
              <entry><emphasis>Collection Processing Engine Configurator</emphasis>
              </entry>
              <entry>Describes the User Interfaces and features of the CPE Configurator tool. This tool allows the
                user to select and configure the components of a Collection Processing Engine and then to run the
                engine.  See 
                <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cpe"/>.</entry>
            </row>
            <row id="ugr.project_overview_tools_pear_packager">
              <entry><emphasis>PEAR Packager</emphasis>
              </entry>
              <entry>Describes how to use the PEAR Packager utility. This utility enables developers to produce an
                archive file for an analysis engine that includes all required resources for installing that
                analysis engine in another UIMA environment.  See 
                <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.packager"/>.</entry>
            </row>
            <row id="ugr.project_overview_tools_pear_installer">
              <entry><emphasis>PEAR Installer</emphasis>
              </entry>
              <entry>Describes how to use the PEAR Installer utility. This utility installs and verifies an
                analysis engine from an archive file (PEAR) with all its resources in the right place so it is ready to
                run.  See 
                <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.installer"/>.</entry>
            </row>
            <row id="ugr.project_overview_tools_pear_merger">
              <entry><emphasis>PEAR Merger</emphasis>
              </entry>
              <entry>Describes how to use the PEAR Merger utility, which does a simple merge of multiple PEAR
                packages into one.  See 
                <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.merger"/>.</entry>
            </row>
            <row id="ugr.project_overview_tools_document_analyzer">
              <entry><emphasis>Document Analyzer</emphasis>
              </entry>
              <entry>Describes the features of a tool for applying a UIMA analysis engine to a set of documents and
                viewing the results.  See 
                <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.doc_analyzer"/>.</entry>
            </row>
            <row id="ugr.project_overview_tools_cas_visual_debugger">
              <entry><emphasis>CAS Visual Debugger</emphasis>
              </entry>
              <entry>Describes the features of a tool for viewing the detailed structure and contents of a CAS. Good
                for debugging.  See 
                <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cvd"/>.</entry>
            </row>
            <row id="ugr.project_overview_tools_jcasgen">
              <entry><emphasis>JCasGen</emphasis>
              </entry>
              <entry>Describes how to run the JCasGen utility, which automatically builds Java classes that
                correspond to a particular CAS Type System.  See 
                <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.jcasgen"/>.</entry>
            </row>
            <row id="ugr.project_overview_tools_xml_cas_viewer">
              <entry><emphasis>XML CAS Viewer</emphasis>
              </entry>
              <entry>Describes how to run the supplied viewer to view externalized XML forms of CASes. This viewer
                is used in the examples.  See 
                <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.annotation_viewer"/>.</entry>
            </row>
          </tbody>
        </tgroup>
      </informaltable>
    </section>
    
    <section id="ugr.project_overview_reference">
      <title>References</title>
      <informaltable>
        <tgroup cols="2">
          <colspec colnum="1" colname="col1" colwidth="1*"/>
          <colspec colnum="2" colname="col2" colwidth="2.5*"/>
          <tbody>
            <row id="ugr.project_overview_javadocs">
              <entry><emphasis>Introduction to the UIMA API Javadocs</emphasis>
              </entry>
              <entry>Javadocs detailing the UIMA programming interfaces.  See 
                <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.javadocs"/>.</entry>
            </row>
            <row id="ugr.project_overview_xml_ref_component_descriptor">
              <entry><emphasis>XML: Component Descriptor</emphasis>
              </entry>
              <entry>Provides detailed XML format for all the UIMA component descriptors, except the CPE (see
                next).  See 
                <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xml.component_descriptor"/>.</entry>
            </row>
            <row id="ugr.project_overview_xml_ref_collection_processing_engine_descriptor">
              <entry><emphasis>XML: Collection Processing Engine Descriptor</emphasis>
              </entry>
              <entry>Provides detailed XML format for the Collection Processing Engine descriptor.  See 
                <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xml.cpe_descriptor"/>.</entry>
            </row>
            <row id="ugr.project_overview_cas">
              <entry><emphasis>CAS</emphasis>
              </entry>
              <entry>Provides a detailed description of the principal CAS interface.  See 
                <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.cas"/>.</entry>
            </row>
            <row id="ugr.project_overview_jcas">
              <entry><emphasis>JCas</emphasis>
              </entry>
              <entry>Provides details on the JCas, a native Java interface to the CAS.  See 
                <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.jcas"/>.</entry>
            </row>
            <row id="ugr.project_overview_ref_pear">
              <entry><emphasis>PEAR Reference</emphasis>
              </entry>
              <entry>Provides a detailed description of the deployable archive format for UIMA
                components.  See 
                <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.pear"/>.</entry>
            </row>
            <row id="ugr.project_overview_xmi_cas_serialization">
              <entry><emphasis>XMI CAS Serialization Reference</emphasis>
              </entry>
              <entry>Provides a detailed description of the XMI format used to serialize the contents of a
                CAS.  See 
                <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xmi"/>.</entry>
            </row>
          </tbody>
        </tgroup>
      </informaltable>
    </section>
  </section>
  
  <section id="ugr.project_overview_doc_use">
    <!-- _crossRef358 -->
    <title>How to use the Documentation</title>
    <orderedlist>
      <listitem>
        <para>Explore this chapter to get an overview of the different documents that are included with Apache UIMA.</para>
      </listitem>
      <listitem>
        <para> Read <olink targetdoc="&uima_docs_overview;" targetptr="ugr.ovv.conceptual"/> to get a broad
          view of the basic UIMA concepts and philosophy with reference to the other documents included in the
          documentation set which provide greater detail. </para>
      </listitem>
      <listitem>
        <para> For more general information on the UIMA architecture and how it has been used, refer to the IBM Systems
          Journal special issue on Unstructured Information Management, on-line at <ulink
            url="http://www.research.ibm.com/journal/sj43-3.html"/>, or to the section of the Apache UIMA project
          website where other publications are listed. </para>
      </listitem>
      <listitem>
        <para> Set up Apache UIMA in your Eclipse environment. To do this, follow the instructions in <xref
            linkend="ugr.ovv.eclipse_setup"/>. </para>
      </listitem>
      <listitem>
        <para> Develop sample UIMA annotators, run them and explore the results. Read <olink
            targetdoc="&uima_docs_tutorial_guides;"/> <olink
            targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aae"/> and follow it like a tutorial
          to learn how to develop your first UIMA annotator and set up and run your first UIMA analysis engines.
          <itemizedlist>
            <listitem>
              <para> As part of this you will use a few tools including
                <itemizedlist>
                  <listitem>
                    <para> The UIMA Component Descriptor Editor, described in more detail in <olink
                        targetdoc="&uima_docs_tools;"/> <olink
                        targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cde"/> and </para>
                  </listitem>
                  <listitem>
                    <para> The Document Analyzer, described in more detail in <olink
                        targetdoc="&uima_docs_tools;"/> <olink
                        targetdoc="&uima_docs_tools;" targetptr="ugr.tools.doc_analyzer"/>. </para>
                  </listitem>
                  
                </itemizedlist> </para>
              
            </listitem>
            <listitem>
              <para>While following along in <olink targetdoc="&uima_docs_tutorial_guides;"/>
                <olink targetdoc="&uima_docs_tutorial_guides;"
                  targetptr="ugr.tug.aae"/>, reference documents that may help are:
                <itemizedlist>
                  <listitem>
                    <para> <olink targetdoc="&uima_docs_ref;"/> <olink targetdoc="&uima_docs_ref;"
                        targetptr="ugr.ref.xml.component_descriptor"/> for understanding the analysis
                      engine descriptors </para>
                  </listitem>
                  <listitem>
                    <para> <olink targetdoc="&uima_docs_ref;"/> 
                      <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.jcas"/> for
                      understanding the JCas </para>
                  </listitem>
                </itemizedlist> </para>
            </listitem>
          </itemizedlist> </para>
      </listitem>
      <listitem>
        <para> Learn how to create, run and manage a UIMA analysis engine as part of an application. 
          Connect your analysis engine to the provided semantic search engine to learn how a
          complete analysis and search application may be built with Apache UIMA. <olink
            targetdoc="&uima_docs_tutorial_guides;"/> <olink
            targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.application"/> will guide you
          through this process.
          <itemizedlist>
            <listitem>
              <para> As part of this you will use the document analyzer (described in more detail in <olink
                  targetdoc="&uima_docs_tools;"/> <olink
                  targetdoc="&uima_docs_tools;" targetptr="ugr.tools.doc_analyzer"/>) and semantic search
                GUI tools (see <olink targetdoc="&uima_docs_tutorial_guides;"/>
                <olink targetdoc="&uima_docs_tutorial_guides;"
                  targetptr="ugr.tug.application.search.query_tool"/>). </para>
            </listitem>
          </itemizedlist> </para>
      </listitem>
      <listitem>
        <para> Pat yourself on the back. Congratulations! If you have reached this step successfully, then you have an
          appreciation for the UIMA analysis engine architecture. You will have built a few sample annotators,
          deployed UIMA analysis engines to analyze a few documents, searched over the results using the built-in
          semantic search engine and viewed the results through a built-in viewer
          &ndash; all as part of a simple but complete application. </para>
      </listitem>
      <listitem>
        <para> Develop and run a Collection Processing Engine (CPE) to analyze and gather the results of an entire
          collection of documents. <olink targetdoc="&uima_docs_tutorial_guides;"/>
          <olink targetdoc="&uima_docs_tutorial_guides;"
            targetptr="ugr.tug.cpe"/> will guide you through this process.
          <itemizedlist>
            <listitem>
              <para> As part of this you will use the CPE Configurator tool. For details see <olink
                  targetdoc="&uima_docs_tools;"/> <olink
                  targetdoc="&uima_docs_tools;" targetptr="ugr.tools.cpe"/>. </para>
            </listitem>
            <listitem>
              <para> You will also learn about CPE Descriptors. The detailed format for these may be found in <olink
                  targetdoc="&uima_docs_ref;"/> <olink
                  targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xml.cpe_descriptor"/>. </para>
            </listitem>
          </itemizedlist> </para>
      </listitem>
      <listitem>
        <para> Learn how to package up an analysis engine for easy installation into another UIMA environment.
            <olink targetdoc="&uima_docs_tools;"/>
            <olink targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.packager"/> and <olink
            targetdoc="&uima_docs_tools;"/> <olink
            targetdoc="&uima_docs_tools;" targetptr="ugr.tools.pear.installer"/> will teach you how to
          create UIMA analysis engine archives so that you can easily share your components with a broader
          community. </para>
      </listitem>
    </orderedlist>
  </section>
  
  <section id="ugr.project_overview_changes_from_previous">
      <title>Changes from Previous Major Versions</title>
    <para> There are two previous versions of UIMA, available from IBM's alphaWorks: version 1.4.x and version 2.0
      (the 2.0 version was a "beta" only release). This section describes the changes relative to both of these
      releases. A migration utility is provided which updates your Java code and descriptors as needed for this
      release. See <xref linkend="ugr.project_overview_migrating_from_ibm_uima"/> for instructions on how to
      run the migration utility. </para>
    
     <note><para>Each Apache UIMA release includes RELEASE_NOTES and RELEASE_NOTES.html files that
        describe the changes that have occurred in each release.
        Please refer to those files for specific changes for each Apache UIMA release.</para></note>

    <section id="ugr.project_overview_changes_from_2_0">
    <title>Changes from IBM UIMA 2.0 to Apache UIMA 2.1</title>
    
    <para>This section describes what has changed between version 2.0 and version 2.1 of UIMA;
      the following section describes the differences between version 1.4 and version 2.1.
      </para>
    
      <section id="ugr.project_overview.migration_utility.java_package_name_changes">
        <title>Java Package Name Changes</title>
        <para>All of the UIMA Java package names have changed in Apache UIMA. They now start with
          <literal>org.apache</literal> rather than <literal>com.ibm</literal>. There have been other
          changes as well. The package name segment <literal>reference_impl</literal> has been shortened to
          <literal>impl</literal>, and some segments have been reordered. For example
          <literal>com.ibm.uima.reference_impl.analysis_engine</literal> has become
          <literal>org.apache.uima.analysis_engine.impl</literal>. Tools are now consolidated under
          <literal>org.apache.uima.tools</literal> and service adapters under
          <literal>org.apache.uima.adapter</literal>. </para>
        <para>The migration utility will replace all occurrences of IBM UIMA package names with their Apache UIMA
          equivalents. It will not replace <emphasis>prefixes</emphasis> of package names, so if your code uses
          a package called <literal>com.ibm.uima.myproject</literal> (although that is not recommended), it
          will not be replaced.</para>
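        <para>The whole-name matching rule described above can be sketched as follows. This is a minimal,
          hypothetical illustration and not the migration utility's actual code: the class
          <literal>PackageRenameSketch</literal>, its <literal>migrate</literal> method, and the
          <literal>com.ibm.uima</literal> table entry are assumptions made for this example. The sketch also
          assumes the usual Java convention that package segments start with a lowercase letter and class
          names with an uppercase letter.</para>
        <programlisting><![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch only; NOT the actual Apache UIMA migration utility. */
public class PackageRenameSketch {

    // Old-to-new package names. The first entry is the example given in the text;
    // the second is an assumed entry added for illustration. A real table would
    // have to list every renamed IBM UIMA package.
    private static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("com.ibm.uima.reference_impl.analysis_engine",
                    "org.apache.uima.analysis_engine.impl");
        RENAMES.put("com.ibm.uima", "org.apache.uima");
    }

    /** Replaces known package names in a piece of source text, matching whole names only. */
    public static String migrate(String source) {
        String result = source;
        for (Map.Entry<String, String> entry : RENAMES.entrySet()) {
            // The name must not be preceded by an identifier character or a dot, and
            // must not be followed by another lowercase package segment. A longer
            // package that merely starts with a known name is therefore left alone.
            Pattern wholeName = Pattern.compile(
                "(?<![\\w.])" + Pattern.quote(entry.getKey()) + "(?!\\w|\\.[a-z_])");
            result = wholeName.matcher(result)
                              .replaceAll(Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // A known package: rewritten to its Apache UIMA equivalent.
        System.out.println(migrate(
            "import com.ibm.uima.reference_impl.analysis_engine.SomeClass;"));
        // A user package that only shares the prefix: printed unchanged.
        System.out.println(migrate("import com.ibm.uima.myproject.Bar;"));
    }
}
```
]]></programlisting>
        <para>Looking names up in an explicit table, rather than rewriting every string that begins with
          <literal>com.ibm.uima</literal>, is what keeps a package such as
          <literal>com.ibm.uima.myproject</literal> untouched in this sketch.</para>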
      </section>
      <section id="ugr.project_overview.migration_utility.xml_descriptor_changes">
        <title>XML Descriptor Changes</title>
        <para>The XML namespace in UIMA component descriptors has changed from
          <literal>http://uima.watson.ibm.com/resourceSpecifier</literal> to
          <literal>http://uima.apache.org/resourceSpecifier</literal>. The value of the
          <literal>&lt;frameworkImplementation&gt;</literal> element must now be
          <literal>org.apache.uima.java</literal> or <literal>org.apache.uima.cpp</literal>. The
          migration script will apply these replacements. </para>
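        <para>To check by hand whether a descriptor already follows these rules, something like the sketch
          below may help. It is a hypothetical helper built only on the standard Java XML APIs, not part of
          Apache UIMA: the class <literal>DescriptorCheckSketch</literal> and its methods are assumptions, and
          the tiny sample descriptor, with an <literal>analysisEngineDescription</literal> root element, is
          invented for illustration.</para>
        <programlisting><![CDATA[
```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical sketch only; not part of Apache UIMA or its migration utility. */
public class DescriptorCheckSketch {

    public static final String OLD_NS = "http://uima.watson.ibm.com/resourceSpecifier";
    public static final String NEW_NS = "http://uima.apache.org/resourceSpecifier";

    /** Builds a minimal sample descriptor (illustrative content only). */
    public static String descriptor(String namespace, String frameworkImplementation) {
        return "<analysisEngineDescription xmlns=\"" + namespace + "\">"
             + "<frameworkImplementation>" + frameworkImplementation
             + "</frameworkImplementation></analysisEngineDescription>";
    }

    /**
     * Returns true if the root element is in the Apache namespace and every
     * frameworkImplementation element holds one of the two accepted values.
     * A descriptor without that element is not treated as an error here.
     */
    public static boolean isMigrated(String descriptorXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getNamespaceURI() to work
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(descriptorXml.getBytes(StandardCharsets.UTF_8)));
            if (!NEW_NS.equals(doc.getDocumentElement().getNamespaceURI())) {
                return false;
            }
            NodeList impls = doc.getElementsByTagNameNS(NEW_NS, "frameworkImplementation");
            for (int i = 0; i < impls.getLength(); i++) {
                String value = impls.item(i).getTextContent().trim();
                if (!value.equals("org.apache.uima.java")
                        && !value.equals("org.apache.uima.cpp")) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalArgumentException("Descriptor could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isMigrated(descriptor(NEW_NS, "org.apache.uima.java")));
        System.out.println(isMigrated(descriptor(OLD_NS, "org.apache.uima.java")));
    }
}
```
]]></programlisting>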
      </section>
      <section id="ugr.project_overview.migration_utility.tcas_replaced_by_cas">
        <title>TCAS replaced by CAS</title>
        <para>In Apache UIMA the <literal>TCAS</literal> interface has been removed. All uses of it must now be
          replaced by the <literal>CAS</literal> interface. (All methods that used to be defined on
          <literal>TCAS</literal> were moved to <literal>CAS</literal> in v2.0.) The method
          <literal>CAS.getTCAS()</literal> is replaced with <literal>CAS.getCurrentView()</literal> and
          <literal>CAS.getTCAS(String)</literal> is replaced with <literal>CAS.getView(String)</literal>.
          The following have also been removed and replaced with the equivalent "CAS" variants:
          <literal>TCASException</literal>, <literal>TCASRuntimeException</literal>,
          <literal>TCasPool</literal>, and <literal>CasCreationUtils.createTCas(...)</literal>. </para>
        <para>The migration script will apply the necessary replacements.</para>
      </section>
      <section id="ugr.project_overview.migration_utility.jcas_interface">
        <title>JCas Is Now an Interface</title>
        <para>In previous versions, user code accessed the JCas <emphasis>class</emphasis> directly. In Apache
          UIMA there is now an interface, <literal>org.apache.uima.jcas.JCas</literal>, which all JCas-based
          user code must now use. Static methods that were previously on the JCas class (and called from JCas cover
          classes generated by JCasGen) have been moved to the new
          <literal>org.apache.uima.jcas.JCasRegistry</literal> class. The migration script will apply the
          necessary replacements to your code, including any JCas cover classes that are part of your codebase.
          </para>
      </section>
      <section id="ugr.project_overview.migration_utility.jar_files">
        <title>JAR File Names Have Changed</title>
        <para>The UIMA JAR file names have changed slightly.  Underscores have been replaced with hyphens to 
          be consistent with Apache naming conventions.  For example <literal>uima_core.jar</literal> is now 
          <literal>uima-core.jar</literal>.  Also <literal>uima_jcas_builtin_types.jar</literal> has been 
          renamed to <literal>uima-document-annotation.jar</literal>.  Finally, the <literal>jVinci.jar</literal> 
          file is now in the <literal>lib</literal> directory rather than the <literal>lib/vinci</literal> 
          directory as was previously the case.  The migration script will apply the necessary replacements,
          for example to script files or Eclipse launch configurations. (See <xref
          linkend="ugr.project_overview_running_the_migration_utility"/> for a list of file extensions that
          the migration utility will process by default.)
          </para>
      </section>      
    <section id="ugr.ovv.search_engine_repackaged">
      <title>Semantic Search Engine Repackaged</title>
      <para>The versions of the UIMA SDK prior to the move into Apache came with a semantic search engine. The Apache
        version does not include this search engine. The search engine has been repackaged and is separately
        available from <ulink url="http://www.alphaworks.ibm.com/tech/uima"/>. The intent is to hook up (over
        time) with other open source search engines, such as the Lucene search engine project in Apache.</para>
    </section>
  </section>
    
    
  <section id="ugr.project_overview_changes_from_v1">
    <title>Changes from UIMA Version 1.x</title>
    <para>Version 2.x of UIMA provides new capabilities and refines several areas of the UIMA
      architecture, as compared with version 1.</para>
    
    <section id="ugr.project_overview_new_capabilities">
      <title>New Capabilities</title>
      <formalpara id="ugr.project_overview_new_data_types">
        <title>New Primitive data types</title>
        <para>UIMA now supports Boolean (bit), Byte, Short (16 bit integers), Long (64 bit
          integers), and Double (64 bit floating point) primitive types, and arrays of
          these. These types can be used like all the other primitive types.</para>
      </formalpara>
      <formalpara id="ugr.ovv.simpler_aes_and_cases">
        <title>Simpler Analysis Engines and CASes</title>
        <para>Version 1.x made a distinction between Analysis Engines and Text Analysis
          Engines. This distinction has been eliminated in Version 2 - new code should just
          refer to Analysis Engines. Analysis Engines can operate on multiple kinds of
          artifacts, including text.</para>
      </formalpara>
      <formalpara id="ugr.ovv.sofas_and_cas_views_simplified">
        <title>Sofas and CAS Views simplified</title>
        <para>The APIs for manipulating multiple subjects of analysis (Sofas) and their
          corresponding CAS Views have been simplified.</para>
      </formalpara>
      <formalpara id="ugr.ovv.ae_support_multiple_new_cases">
        <title>Analysis Component generalized to support multiple new CAS
          outputs</title>
        <para>Analysis Components, in general, can make use of new capabilities to return
          multiple new CASes, in addition to returning the original CAS that is passed in.
          This allows components to have Collection Reader-like capabilities, but be
          placed anywhere in the flow. See <olink
            targetdoc="&uima_docs_tutorial_guides;"/> <olink
            targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.cm"/>.</para>
      </formalpara>
      <formalpara id="ugr.ovv.user_customized_fc">
        <title>User-customized Flow Controllers</title>
        <para>A new component, the Flow Controller, can be supplied by the user to implement
          arbitrary flow control for CASes within an Aggregate. This is in addition to the two
          built-in flow control choices of linear and language-capability flow. See <olink
            targetdoc="&uima_docs_tutorial_guides;"/> <olink
            targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.fc"/>.</para>
      </formalpara>
    </section>
 
    <section id="ugr.ovv.other_changes">
      <title>Other Changes</title>
            
      <formalpara>
        <title>New Annotator API Base Classes</title>
        <para>
          As of version 2.1, UIMA has a new set of Annotator interfaces. Annotators should now 
          extend CasAnnotator_ImplBase or JCasAnnotator_ImplBase instead of the v1.x 
          TextAnnotator_ImplBase and JTextAnnotator_ImplBase.  The v1.x annotator 
          interfaces are unchanged and are still supported for backwards compatibility.
         </para>
        </formalpara>
      <para>
        The new Annotator interfaces support the changed approaches for ResultSpecifications
        and the changed exception names (see below), and have all the methods that CAS Consumers
        have, including <literal>collectionProcessComplete</literal> and
        <literal>batchProcessComplete</literal>.</para>
  
    <formalpara id="ugr.ovv.exceptions_rationalized">
      <title>UIMA Exceptions rationalized</title>
      
      <para>In version 1 there were different exceptions for the methods of an
        AnalysisEngine and for the corresponding methods of an Annotator; these were merged
        in version 2.
        
        <itemizedlist spacing="compact">
          <listitem><para>AnnotatorProcessException (v1) &rarr;
            AnalysisEngineProcessException (v2)</para></listitem>
          <listitem><para>AnnotatorInitializationException (v1) &rarr;
            ResourceInitializationException (v2)</para></listitem>
          <listitem><para>AnnotatorConfigurationException (v1) &rarr;
            ResourceConfigurationException (v2)</para></listitem>
          <listitem><para>AnnotatorContextException (v1) &rarr;
            ResourceAccessException (v2)</para></listitem>
        </itemizedlist> The previous exceptions are still available, but new code should
        use the new exceptions.</para>
        </formalpara>
        <note><para>The <quote>throws</quote> clause of typeSystemInit has changed; the method now throws AnalysisEngineProcessException.
          For Annotators that extend the previous base classes, the previous definition of typeSystemInit will continue to 
          work for backwards compatibility.
       </para></note>

    
    <formalpara id="ugr.ovv.result_specification">
      <title>Changes in Result Specifications</title>
      <para>In version 1, the <literal>process(...)</literal> method took a second
        argument, a ResultSpecification. In version 2, the ResultSpecification is instead set on the
        annotator whenever it changes, and it is up to the annotator to store it in a local field
        and make it available when needed.
        This approach lets the annotator receive a specific signal (a method call) when
        the Result Specification changes. Previously, it had to check on every call to
        see whether it had changed. The default impl base classes provide set/getResultSpecification(...)
        methods for this.</para>
    </formalpara>
    
    <formalpara id="ugr.ovv.one_capability_set">
      <title>Only one Capability Set</title>
      <para>In version 1, you could define 
        multiple capability sets. These were not well supported, and version 2 
        simplifies this: you should use only one capability set. 
        (For backwards compatibility, using more than one 
        will not cause a problem for now.)</para>
    </formalpara>
    
    
      <formalpara>
        <title>TextAnalysisEngine deprecated; use AnalysisEngine instead</title>
      <para>TextAnalysisEngine has been deprecated - it is now no different than
        AnalysisEngine. Previous code that uses this should still continue to work,
        however.</para></formalpara>
      
      <formalpara>
        <title>Annotator Context deprecated; use UimaContext instead</title>
        <para>The context for the Annotator is the same as the overall UIMA context. 
        The impl base classes provide a getContext() method which now returns the 
        UimaContext object.</para>
      </formalpara>
      
      <formalpara>
        <title>DocumentAnalyzer tool uses XMI formats</title>
      <para>The DocumentAnalyzer tool saves outputs in the new XMI serialization format.
        The AnnotationViewer and SemanticSearchGUI tools can read both the new XMI format
        and the previous XCAS format.</para></formalpara>
      
      <formalpara>
        <title>CAS Initializer deprecated</title>
        <para>Example code that used CAS Initializers has been rewritten so that it no longer uses them.</para> 
      </formalpara>
    </section>
    
    <section id="ugr.project_overview_backwards_compatibility">
      <title>Backwards Compatibility</title>
      <para>Other than the changes from IBM UIMA to Apache UIMA described above, most UIMA 1.x
        applications should not require additional changes to upgrade to UIMA 2.x. However,
        there are a few exceptions that UIMA 1.x users may need to be aware of:
        <itemizedlist>
          <listitem>
            <para> There have been some changes to ResultSpecifications. We do not
              guarantee 100% backwards compatibility for applications that made use of
              them, although most cases should work. </para>
          </listitem>
          <listitem>
            <para> For applications that deal with multiple subjects of analysis (Sofas),
              the rules that determine whether a component is Multi-View or Single-View
              have been made more consistent. A component is considered Multi-View if and
              only if it declares at least one inputSofa or outputSofa in its descriptor.
              This leads to the following incompatibilities in unusual cases:
              <itemizedlist>
                <listitem>
                  <para> It is an error if an annotator that implements the TextAnnotator or
                    JTextAnnotator interface also declares inputSofas or outputSofas in
                    its descriptor. Such annotators must be Single-View. </para>
                </listitem>
                <listitem>
                  <para> Annotators that implement GenericAnnotator but do not declare
                    any inputSofas or outputSofas will now be passed the view of the default
                    Sofa instead of the Base CAS. </para>
                </listitem>
                <listitem>
                  <para> As of version 2.7.0, all annotators will be passed the view of 
                    the default Sofa. </para>
                </listitem>
              </itemizedlist> </para>
          </listitem>
        </itemizedlist> </para>
      
    </section>
  </section>
  </section>

  <section id="ugr.project_overview_migrating_from_ibm_uima">
    <title>Migrating from IBM UIMA to Apache UIMA</title>
    <para>In Apache UIMA, several things have changed that require changes to user code and descriptors.
      A migration utility is provided which will make the required updates to your files.  The most
      significant change is that the Java package names for all of the UIMA classes and interfaces have changed 
      from what they were in IBM UIMA; all of the package names now start with the prefix <literal>org.apache</literal>.</para>
    
    <section id="ugr.project_overview_running_the_migration_utility">
      <title>Running the Migration Utility</title> 
      <note>
        <para>Before running the migration utility, be sure to back up your files, just in case you encounter any
        problems, because the migration tool updates the files in place in the directories where it finds them.</para> 
      </note>
      <para> The migration utility is run by executing the script file
        <literal>apache-uima/bin/ibmUimaToApacheUima.bat</literal> (Windows) or
        <literal>apache-uima/bin/ibmUimaToApacheUima.sh</literal> (UNIX). You must pass one argument: the
        directory containing the files that you want to be migrated. Subdirectories will be processed
        recursively.</para>

      <para>The script scans your files and applies the necessary updates, for example replacing the com.ibm
        package names with the new org.apache package names. For more details on what has changed in the UIMA APIs and
        what changes are performed by the migration script, see <xref linkend="ugr.project_overview_changes_from_2_0"/>.</para>
      
      <para>The script will only attempt to modify files with the extensions: java, xml, xmi, wsdd, properties,
        launch, bat, cmd, sh, ksh, or csh; and files with no extension. Also, files with size greater than 1,000,000
        bytes will be skipped. (If you want the script to modify files with other extensions, you can edit the script
        file and change the <literal>-ext</literal> argument appropriately.) </para>
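      <para>The selection rule above can be summarized in a short, hypothetical sketch. It is not taken from
        the migration utility itself: the class name <literal>MigrationFileFilterSketch</literal>, the method
        <literal>shouldProcess</literal>, and the case-sensitive comparison of extensions are assumptions
        made for illustration; only the extension list and the size limit come from the description
        above.</para>
      <programlisting>// Hypothetical illustration only; not the code of the real migration utility.
public class MigrationFileFilterSketch {
  // Default extensions listed in the text above.
  static final String[] EXTENSIONS = {
    "java", "xml", "xmi", "wsdd", "properties", "launch", "bat", "cmd", "sh", "ksh", "csh"
  };

  // Files larger than this many bytes are skipped.
  static final long MAX_BYTES = 1000000L;

  static boolean shouldProcess(String fileName, long sizeInBytes) {
    if (sizeInBytes > MAX_BYTES) {
      return false; // too large, skipped regardless of extension
    }
    int dot = fileName.lastIndexOf('.');
    if (dot == -1) {
      return true; // files with no extension are processed
    }
    String extension = fileName.substring(dot + 1);
    for (String allowed : EXTENSIONS) {
      if (allowed.equals(extension)) {
        return true;
      }
    }
    return false; // any other extension is left untouched
  }

  public static void main(String[] args) {
    System.out.println(shouldProcess("MyAnnotator.java", 2048L));
    System.out.println(shouldProcess("notes.txt", 2048L));
  }
}</programlisting>
      <para>Under these assumptions, a small <literal>.java</literal> file is selected, while a
        <literal>.txt</literal> file, or any file larger than 1,000,000 bytes, is not.</para>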
      
      <para>If the migration tool reports warnings, there may be a few additional steps to take.  The following two
        sections explain some simple manual changes that you might need to make to your code.</para>

      <section id="ugr.project_overview_running_the_migration_utility.jcas_for_document_annotation">
        <title>JCas Cover Classes for DocumentAnnotation</title>
        <para> If you have run JCasGen it is likely that you have the classes
          <literal>com.ibm.uima.jcas.tcas.DocumentAnnotation</literal> and
          <literal>com.ibm.uima.jcas.tcas.DocumentAnnotation_Type</literal> as part of your code. This
          package name is no longer valid, and the migration utility does not move your files between directories so
          it is unable to fix this. </para>
        <para> If you have not made manual modifications to these classes, the best solution is usually to just delete
          these two classes (and their containing package). There is a default version in the
          <literal>uima-document-annotation.jar</literal> file that is included in Apache UIMA. If you
          <emphasis>have</emphasis> made custom changes, then you should not delete the file but instead move it to
          the correct package <literal>org.apache.uima.jcas.tcas</literal>. For more information about JCas
          and DocumentAnnotation please see <olink targetdoc="&uima_docs_ref;"/>
          <olink targetdoc="&uima_docs_ref;"
            targetptr="ugr.ref.jcas.documentannotation_issues"/>. </para>
      </section>
      <section id="ugr.project_overview_running_the_migration_utility.manual_migration_needed.getdocumentannotation">
        <title>JCas.getDocumentAnnotation</title>
        <para>The deprecated method <literal>JCas.getDocumentAnnotation</literal> has been removed. Its use
          must be replaced with <literal>JCas.getDocumentAnnotationFs</literal>. The method
          <literal>JCas.getDocumentAnnotationFs()</literal> returns type <literal>TOP</literal>, so your
          code must cast this to type <literal>DocumentAnnotation</literal>. The reasons for this are described
          in <olink targetdoc="&uima_docs_ref;"/>
          <olink targetdoc="&uima_docs_ref;" targetptr="ugr.ref.jcas.documentannotation_issues"/>.
          </para>
      </section>      
      
    </section>
    
     
    <section id="ugr.project_overview_rare_migration">
      <title>Manual Migration</title>
      <para>The following are rare cases where you may need to take additional steps to migrate your code.  You need only 
        read this section if the migration tool reported a warning or if you are having trouble getting your code to 
        compile or run after running the migration.  For most users, attention to these things will not
        be required.</para>
      
      <section id="ugr.project_overview.manual_migration_needed.xiinclude">
        <title>xi:include</title>
        <para>The use of &lt;xi:include> in UIMA component descriptors has been discouraged for some time, and in
          Apache UIMA support for it has been removed. If you have descriptors that use that, you must change them to
          use UIMA's &lt;import> syntax instead. The proper syntax is described in <olink
            targetdoc="&uima_docs_ref;"/> <olink
            targetdoc="&uima_docs_ref;" targetptr="ugr.ref.xml.component_descriptor.imports"/>.
          </para>
      </section>
      <section id="ugr.project_overview.manual_migration_needed.duplicate_methods_cas_tcas">
        <title>Duplicate Methods Taking CAS and TCAS as Arguments</title>
        <para>Because <literal>TCAS</literal> has been replaced by <literal>CAS</literal>, if you had two
          methods distinguished only by whether an argument type was <literal>TCAS</literal> or
          <literal>CAS</literal>, the migration tool will cause these to have identical signatures, which will be
          a compile error. If this happens, consider why the two variants were needed in the first place. Often, it may
          work to simply delete one of the methods.</para>
      </section>
      <section id="ugr.project_overview.manual_migration_needed.undocumented_methods">
        <title>Use of Undocumented Methods from the com.ibm.uima.util package</title>
        <titleabbrev>Undocumented Methods</titleabbrev>
        <para>Previous UIMA versions had some methods in the <literal>com.ibm.uima.util</literal> package that
          were for internal use and were not documented in the Javadoc. (There are also many methods in that package
          which are documented, and there is no issue with using these.) It is not recommended that you use any of the
          undocumented methods. If you do, the migration script will not handle them correctly. These have now been
          moved to <literal>org.apache.uima.internal.util</literal>, and you will have to manually update your
          imports to point to this location.</para>
      </section>
      <section id="ugr.project_overview.manual_migration_needed.uima_package_names_in_user_code">
        <title>Use of UIMA Package Names for User Code</title>
        <titleabbrev>Package Names</titleabbrev>
        <para>If you have placed your own classes in a package that has exactly the same name as one of the UIMA packages
          (not recommended), this will cause problems when you run the migration script. Since the script replaces
          UIMA package names, all of your imports that refer to your class will get replaced and your code will no
          longer compile. If this happens, you can fix it by manually moving your code to the new Apache UIMA package
          name (i.e., whatever name your imports got replaced with). However, we recommend instead that you do not
          use Apache UIMA package names for your own code.</para>
        <para>An even rarer case would be if you had a package name that started with a capital letter (poor Java
          style) AND was prefixed by one of the UIMA package names, for example a package named
          <literal>com.ibm.uima.MyPackage</literal>. This would be treated as a class name and replaced with
          <literal>org.apache.uima.MyPackage</literal> wherever it occurs.</para>
      </section>
      <section id="ugr.project_overview.manual_migration_needed.exceptions_extend_uima_exceptions">
        <title>CASException and CASRuntimeException now extend UIMA(Runtime)Exception</title>
        <titleabbrev>Changes to CAS Exceptions</titleabbrev>
        <para>
          This change may affect user code to a small extent, as some of the APIs on 
          <literal>CASException</literal> and <literal>CASRuntimeException</literal> no longer exist.
          On the up side, all UIMA exceptions are now derived from the same base classes and behave
          the same way.  The most significant change is that you can no longer check for the specific
          type of exception the way you used to.  For example, if you had code like this:
          
          <programlisting>catch (CASRuntimeException e) {
  if (e.getError() == CASRuntimeException.ILLEGAL_ARRAY_SIZE) {
    // Do something in case this particular error is caught
  }
}</programlisting>
          
          you will need to replace it with the following:
          
          <programlisting>catch (CASRuntimeException e) {
  if (e.getMessageKey().equals(CASRuntimeException.ILLEGAL_ARRAY_SIZE)) {
    // Do something in case this particular error is caught
  }
}</programlisting>
          
          as the message keys are now strings.  This change is not handled by the migration script.
        </para>
      </section>
    </section>
  </section>
  
  <section id="ugr.project_overview_summary">
    <title>Apache UIMA Summary</title>
    <section id="ugr.ovv.summary.general">
      <title>General</title>
      <para>UIMA supports the development, discovery, composition and deployment of multi-modal
        analytics for the analysis of unstructured information and its integration with search
        technologies.</para>
      
      <para>Apache UIMA includes APIs and tools for creating analysis components. Examples of analysis components include
        tokenizers, summarizers, categorizers, parsers, named-entity detectors etc. Tutorial examples are
        provided with Apache UIMA; additional components are available from the community. </para>
      
      <para>Apache UIMA does not itself include a semantic search engine; instructions are included for 
        incorporating the semantic search SDK from IBM's <ulink url="http://alphaworks.ibm.com/tech/uima">alphaWorks</ulink>,
        which can index the results of
        analysis, and for using this semantic index to perform more advanced search. </para>
    </section>
    <section id="ugr.ovv.summary.programming_language_support">
      <title>Programming Language Support</title>
      <para>UIMA supports the development and integration of analysis algorithms developed in different
        programming languages. </para>
      
      <para>The Apache UIMA project is both a Java framework and a matching C++
        enablement layer, which allows annotators to be written in C++ and have access to a C++ version of the CAS. The
        C++ enablement layer also enables annotators to be written in Perl, Python, and TCL, and to interoperate with
        those written in other languages. <!--Documentation for this is provided here (link to be filled in).-->
        </para>
      
    </section>
    <section id="ugr.ovv.general.summary.multi_modal_support">
      <title>Multi-Modal Support</title>
      <para>The UIMA architecture supports the development, discovery, composition and deployment of
        multi-modal analytics, including text, audio and video. <olink
          targetdoc="&uima_docs_tutorial_guides;"/> <olink
          targetdoc="&uima_docs_tutorial_guides;" targetptr="ugr.tug.aas"/> discusses this in more
        detail.</para>
    </section>
    <section id="ugr.ovv.summary.general.semantic_search_components">
      <title>Semantic Search Components</title>
      <para> The Lucene search engine as of this writing (November, 2006) does not support searching with
        annotations. The site <ulink url="http://www.alphaworks.ibm.com/tech/uima"/> provides a download of a
        semantic search engine, a simple demo query tool, some documentation on the semantic search engine, and a
        component that connects the results of UIMA analysis to the indexer so that the annotations as well as
        key-words can be indexed. </para>
      
      <para>Previous versions of the UIMA SDK (prior to the Apache versions) are available from <ulink
          url="http://www.alphaworks.ibm.com/tech/uima"> IBM's alphaWorks</ulink>. The source code for
        previous versions of the main UIMA framework is available on <ulink
          url="http://uima-framework.sourceforge.net/"> SourceForge</ulink>.</para>      
    </section>
  </section>
  
  <section id="ugr.project_overview_summary_sdk_capabilities">
    <title>Summary of Apache UIMA Capabilities</title>
    <informaltable frame="all" rowsep="1" colsep="1">
      <tgroup cols="2">
        <colspec colnum="1" colname="col1" colwidth=".75*"/>
        <colspec colnum="2" colname="col2" colwidth="*"/>
        <tbody>
          <row>
            <entry role="tableSubhead">Module</entry>
            <entry role="tableSubhead">Description</entry>
          </row>
          <row>
            <entry>UIMA Framework Core</entry>
            <entry>
              <para>A framework integrating core functions for creating, deploying, running and managing UIMA
                components, including analysis engines and Collection Processing Engines in collocated and/or
                distributed configurations. </para>
              
              <para>The framework includes an implementation of core components for transport layer adaptation,
                CAS management, workflow management based on declarative specifications, resource management,
                configuration management, logging, and other functions.</para>
            </entry>
          </row>
          <row>
            <entry>C++ and other programming language Interoperability</entry>
            
            <entry>
              <para>Includes a C++ CAS and supports the creation of UIMA-compliant C++ components that can be
                deployed in the UIMA run-time through a built-in JNI adapter. This includes high-speed binary
                serialization.</para>
              
              <para>Includes support for creating service-based UIMA engines. This is ideal for
                wrapping existing code written in different languages.</para>
            </entry>
          </row>
          <row>
            <entry role="tableSubhead">Framework Services and APIs</entry>
            <entry role="tableSubhead">Note that interfaces of these components are available to the developer
              but different implementations are possible in different implementations of the UIMA
              framework.</entry>
          </row>
          <row>
            <entry>CAS</entry>
            <entry>These classes provide the developer with typed access to the Common Analysis Structure (CAS),
              including type system schema, elements, subjects of analysis and indices. The multiple subjects of
              analysis (Sofas) mechanism supports the independent or simultaneous analysis of multiple views of
              the same artifacts (e.g. documents), supporting multi-lingual and multi-modal analysis.</entry>
          </row>
          <row>
            <entry>JCas</entry>
            <entry>An alternative interface to the CAS, providing Java-based UIMA Analysis components with
              native Java object access to CAS types and their attributes or features, using the
              JavaBeans conventions of getters and setters.</entry>
          </row>
          
          <row>
            <entry>Collection Processing Management (CPM)</entry>
            <entry>Core functions for running UIMA collection processing engines in collocated and/or
              distributed configurations. The CPM provides scalability across parallel processing pipelines,
              check-pointing, performance monitoring and recoverability.</entry>
          </row>
          <row>
            <entry>Resource Manager</entry>
            <entry>Provides UIMA components with run-time access to external resources, with capabilities
              such as resource naming, sharing, and caching. </entry>
          </row>
          <row>
            <entry>Configuration Manager</entry>
            <entry>Provides UIMA components with run-time access to their configuration parameter settings.
              </entry>
          </row>
          <row>
            <entry>Logger</entry>
            <entry>Provides access to a common logging facility.</entry>
          </row>
          <row>
            <entry namest="col1" nameend="col2" align="center" role="tableSubhead"> Tools and Utilities
              </entry>
          </row>
          <row>
            <entry>JCasGen</entry>
            <entry>Utility for generating a Java object model for CAS types from a UIMA XML type system
              definition.</entry>
          </row>
          <row>
            <entry>Saving and Restoring CAS contents</entry>
            <entry>APIs in the core framework support saving and restoring the contents of a CAS to streams using an
              XMI format. </entry>
          </row>
          <row>
            <entry>PEAR Packager for Eclipse</entry>
            <entry>Tool for building a UIMA component archive to facilitate porting, registering, installing and
              testing components.</entry>
          </row>
          <row>
            <entry>PEAR Installer</entry>
            <entry>Tool for installing and verifying a UIMA component archive in a UIMA installation.</entry>
          </row>
          <row>
            <entry>PEAR Merger</entry>
            <entry>Utility that combines multiple PEARs into one.</entry>
          </row>
          <row>
            <entry>Component Descriptor Editor</entry>
            <entry>Eclipse Plug-in for specifying and configuring component descriptors for UIMA analysis
              engines as well as other UIMA component types including Collection Readers and CAS
              Consumers.</entry>
          </row>
          <row>
            <entry>CPE Configurator</entry>
            <entry>Graphical tool for configuring Collection Processing Engines and applying them to
              collections of documents.</entry>
          </row>
          <row>
            <entry>Java Annotation Viewer</entry>
            <entry>Viewer for exploring annotations and related CAS data.</entry>
          </row>
          <row>
            <entry>CAS Visual Debugger</entry>
            <entry>GUI Java application that provides developers with a detailed visual view of the contents of a
              CAS.</entry>
          </row>
          <row>
            <entry>Document Analyzer</entry>
            <entry>GUI Java application that applies analysis engines to sets of documents and shows results in a
              viewer.</entry>
          </row>
          <row>
            <entry namest="col1" nameend="col2" align="center" role="tableSubhead"> Example Analysis
              Components </entry>
          </row>
          <row>
            <entry>Database Writer</entry>
            <entry>CAS Consumer that writes the content of selected CAS types into a relational database, using
              JDBC. This code is in cpe/PersonTitleDBWriterCasConsumer. </entry>
          </row>
          <row>
            <entry>Annotators</entry>
            <entry> Set of simple annotators meant for pedagogical purposes. Includes Date/time, Room-number,
              Regular expression, Tokenizer, and Meeting-finder annotators. There are sample CAS Multipliers
              as well. </entry>
          </row>
          <row>
            <entry>Flow Controllers</entry>
            <entry> Sample flow controller based on the whiteboard concept: it sends the CAS to any annotator
              that has not yet processed it, once that annotator's inputs are available in the CAS. </entry>
          </row>
          <row>
            <entry>XMI Collection Reader, CAS Consumer</entry>
            <entry>Reads and writes the CAS in the XMI format.</entry>
          </row>
          
          <row>
            <entry>File System Collection Reader</entry>
            <entry> Simple Collection Reader for pulling documents from the file system and initializing CASes.
              </entry>
          </row>
          <row>
            <entry namest="col1" nameend="col2" align="center" role="tableSubhead"> Components available
              from <ulink url="http://www.alphaworks.ibm.com/tech/uima"></ulink> </entry>
          </row>
          <row>
            <entry>Semantic Search CAS Indexer</entry>
            <entry>A CAS Consumer that uses the semantic search engine indexer to build an index from a stream of
              CASes. Requires the semantic search engine (available from the same place). </entry>
          </row>
        </tbody>
      </tgroup>
    </informaltable>
  </section>
  
</chapter>