<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
<!ENTITY % uimaents SYSTEM "../../target/docbook-shared/entities.ent" >
%uimaents;
]>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<chapter id="ugr.ref.jcas">
<title>JCas Reference</title>
<para>The CAS is a system for sharing data among annotators, consisting of data structures
(definable at run time), sets of indexes over these data, metadata describing these, subjects of
analysis, and a high-performance serialization/deserialization mechanism. JCas provides a Java
approach to accessing CAS data, and is based on using generated, specific Java classes for each CAS
type.</para>
<para>Annotators process one CAS per call to their process method. During processing,
annotators can retrieve feature structures from the passed-in CAS, add new ones, modify
existing ones, and use and update CAS indexes. An annotator can also use plain
Java objects; however, only the data in the CAS is shared among annotators within
an application.</para>
<para>All the facilities present in the APIs for the CAS are available when using the JCas
APIs; indeed, you can use the getCas() method to get the corresponding CAS object from a
JCas (and vice-versa). The JCas APIs often have helper methods that make using this
interface more convenient for Java developers.</para>
<para>The data in the CAS are typed objects having fields. JCas uses a set of generated Java
classes (each corresponding to a particular CAS type) with <quote>getter</quote> and
<quote>setter</quote> methods for the features, plus a constructor so new instances can
be made. The Java classes don&apos;t actually store the data in the class instance;
instead, the getters and setters forward to the underlying CAS data representation.
Because of this, applications which use the JCas interface can share data with annotators
using plain CAS (i.e., not using the JCas approach). </para>
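<para>The forwarding idea can be illustrated with a small, self-contained toy model. This is
only a sketch and not UIMA code: <literal>ToyCas</literal> and <literal>MeetingCover</literal>
are hypothetical names invented for this illustration, and the real CAS storage layout is
different. The point it shows is that the cover object holds no feature data itself, so any
other reader of the same store sees the same values.</para>
<programlisting><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: ToyCas and MeetingCover are hypothetical, not UIMA classes.
public class CoverClassSketch {

  // Stands in for the underlying CAS data representation.
  static class ToyCas {
    private final Map<String, String> slots = new HashMap<String, String>();
    private int nextId = 1;

    int createFs() { return nextId++; }
    void setValue(int fsId, String feature, String value) { slots.put(fsId + "/" + feature, value); }
    String getValue(int fsId, String feature) { return slots.get(fsId + "/" + feature); }
  }

  // Cover object: holds only a reference to the store and the id of its feature structure.
  static class MeetingCover {
    private final ToyCas cas;
    private final int fsId;

    MeetingCover(ToyCas cas) { this(cas, cas.createFs()); }  // creates a new feature structure
    MeetingCover(ToyCas cas, int fsId) { this.cas = cas; this.fsId = fsId; }  // covers an existing one

    int getId() { return fsId; }
    String getLocation() { return cas.getValue(fsId, "location"); }   // forwards to the store
    void setLocation(String v) { cas.setValue(fsId, "location", v); } // forwards to the store
  }

  public static void main(String[] args) {
    ToyCas cas = new ToyCas();
    MeetingCover m = new MeetingCover(cas);
    m.setLocation("Room 5");
    // A second cover object over the same feature structure sees the same data,
    // as does code that reads the store directly without any cover class.
    MeetingCover again = new MeetingCover(cas, m.getId());
    System.out.println(again.getLocation());
    System.out.println(cas.getValue(m.getId(), "location"));
  }
}
```
]]></programlisting>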
<para>Users can modify the JCas generated
Java classes by adding fields to them; this allows arbitrary non-CAS data to be
represented within the JCas objects as well. However, the non-CAS data stored in the JCas
object instances cannot be shared with annotators using the plain CAS.</para>
<para>Data in the CAS initially has no corresponding JCas type instances; these are created
as needed at the first reference. This means that if your annotator is passed a large CAS having
millions of CAS feature structures, but you only reference a few of them, and no
JCas object instances were previously created by upstream annotators, the only Java
objects that will be created are those that correspond to the CAS feature structures
that you reference.</para>
<para>The JCas class Java source files are generated from XML type system descriptions. The
JCasGen utility does the work of generating the corresponding Java Class Model for the CAS
types. There are a variety of ways JCasGen can be run; these are described later. You
include the generated classes with your UIMA component, and you can publish these classes
for others who might want to use your type system.</para>
<para>The specification of the type system in XML can be written using a conventional text
editor, an XML editor, or the Eclipse plug-in that supports editing UIMA
descriptors.</para>
<para>Changes to the type system are done by changing the XML and regenerating the
corresponding Java Class Models. Of course, once you&apos;ve published your type system
for others to use, you should be careful that any changes you make don&apos;t adversely
impact the users. Additional features can be added to existing types without breaking
other code.</para>
<para>A separate Java class is generated for each type; this class implements the CAS
FeatureStructure interface, and also has the special getters and setters for the
included features. In the current implementation, an additional helper class per type is
also generated. The generated Java classes have methods (getters and setters) for the
fields as defined in the XML type specification. Descriptor comments are reflected in the
generated Java code as Javadoc-style comments.</para>
<section id="ugr.ref.jcas.name_spaces">
<title>Name Spaces</title>
<para>Full Type names consist of a <quote>namespace</quote> prefix dotted with a simple
name. Namespaces are used like packages to avoid collisions between types that are
defined by different people at different times. The namespace is used as the Java
package name for generated Java files.</para>
<para>Type names used in the CAS correspond to the generated Java classes directly. If the
CAS name is com.myCompany.myProject.ExampleClass, the generated Java class is in the
package com.myCompany.myProject, and the class is ExampleClass.</para>
<para>
An exception to this rule is the built-in types
starting with <literal>uima.cas</literal> and <literal>uima.tcas</literal>;
these names are mapped to Java packages named
<literal>org.apache.uima.jcas.cas</literal> and
<literal>org.apache.uima.jcas.tcas</literal>.</para>
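<para>The naming rule above can be sketched as a small function. This is an illustration
only: <literal>JCasNameSketch</literal> and <literal>toJavaClassName</literal> are
hypothetical names, not part of UIMA or JCasGen, and the sketch ignores the built-in
primitive types, which map to Java primitives rather than to classes.</para>
<programlisting><![CDATA[
```java
// Sketch only: JCasNameSketch and toJavaClassName are hypothetical helpers, not UIMA API.
public class JCasNameSketch {

  // Maps a fully qualified CAS type name to the Java class name described in the text.
  static String toJavaClassName(String casTypeName) {
    if (casTypeName.startsWith("uima.cas.")) {
      return "org.apache.uima.jcas.cas." + casTypeName.substring("uima.cas.".length());
    }
    if (casTypeName.startsWith("uima.tcas.")) {
      return "org.apache.uima.jcas.tcas." + casTypeName.substring("uima.tcas.".length());
    }
    return casTypeName; // the namespace is used unchanged as the Java package
  }

  public static void main(String[] args) {
    System.out.println(toJavaClassName("com.myCompany.myProject.ExampleClass"));
    System.out.println(toJavaClassName("uima.tcas.Annotation"));
  }
}
```
]]></programlisting>
<para>With these inputs, the first name is returned unchanged and the second becomes
<literal>org.apache.uima.jcas.tcas.Annotation</literal>.</para>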
</section>
<section id="ugr.ref.jcas.use_of_description">
<title>XML description element</title>
<titleabbrev>Use of XML Description</titleabbrev>
<para>Each XML type specification can have &lt;description ...
&gt; tags. The description for a type will be copied into the generated Java code, as a
Javadoc-style comment for the class. When writing these descriptions in the XML type
specification file, you might want to use HTML tags, as allowed in Javadocs.</para>
<para>If you use the Component Descriptor Editor, you can write the HTML tags normally,
for instance, <quote>&lt;h1&gt;My Title&lt;/h1&gt;</quote>. The Component
Descriptor Editor will take care of converting the actual descriptor source so that it
has the leading <quote>&lt;</quote> character written as <quote>&amp;lt;</quote>,
to avoid confusing the XML type specification. For example, &lt;p&gt; would be written
in the source of the descriptor as &amp;lt;p&gt;. Any characters used in the Javadoc
comment must of course be from the character set allowed by the XML type specification.
These specifications often start with the line &lt;?xml version=<quote>1.0</quote>
encoding=<quote>UTF-8</quote> ?&gt;, which means you can use any of the UTF-8
characters.</para>
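<para>If you edit the descriptor source by hand instead, the same conversion can be done with
a one-line helper. The following is a minimal sketch; <literal>escapeForDescriptor</literal>
is a hypothetical name, not a UIMA API. It also escapes the ampersand, which is a general
XML requirement rather than something specific to UIMA, and it must do so first so that the
entities it produces are not escaped again.</para>
<programlisting><![CDATA[
```java
// Sketch only: escapeForDescriptor is a hypothetical helper, not part of UIMA.
public class DescriptionEscapeSketch {

  // Escapes '&' first, then '<', so HTML tags can be placed inside a description element.
  static String escapeForDescriptor(String javadocHtml) {
    return javadocHtml.replace("&", "&amp;").replace("<", "&lt;");
  }

  public static void main(String[] args) {
    System.out.println(escapeForDescriptor("<h1>My Title</h1>"));
  }
}
```
]]></programlisting>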
</section>
<section id="ugr.ref.jcas.mapping_built_ins">
<title>Mapping built-in CAS types to Java types</title>
<para>The built-in primitive CAS types map to Java types as follows:</para>
<programlisting>uima.cas.Boolean &rarr; boolean
uima.cas.Byte &rarr; byte
uima.cas.Short &rarr; short
uima.cas.Integer &rarr; int
uima.cas.Long &rarr; long
uima.cas.Float &rarr; float
uima.cas.Double &rarr; double
uima.cas.String &rarr; String</programlisting>
</section>
<section id="ugr.ref.jcas.augmenting_generated_code">
<title>Augmenting the generated Java Code</title>
<para>The Java Class Models generated for each type can be augmented by the user. Typical
augmentations include adding additional (non-CAS) fields and methods, and import
statements that might be needed to support these. Commonly added methods include
additional constructors (having different parameter signatures), and
implementations of toString().</para>
<para>To augment the code, just edit the generated Java source code for the class named the
same as the CAS type. Here&apos;s an example of an additional method you might add; the
various getter methods are retrieving values from the instance:</para>
<programlisting>public String toString() { // for debugging
return "XsgParse "
+ getslotName() + ": "
+ getheadWord().getCoveredText()
+ " seqNo: " + getseqNo()
+ ", cAddr: " + id
+ ", size left mods: " + getlMods().size()
+ ", size right mods: " + getrMods().size();
}</programlisting>
<section id="ugr.ref.jcas.data_persistence">
<title>Persistence of additional data</title>
<para>If you add custom instance fields to JCas cover classes, these exist in the JCas cover object instance,
but not in the CAS itself. Each time a CAS object is referenced (by an iterator, or by following a Feature
Structure reference), a new JCas cover object instance may be created. If you need these values, you can (a)
make them CAS values if possible, or (b) hold a reference to the particular JCas cover object instance in
your Java code. For some simple cases, setting the performance tuning option JCAS_CACHE_ENABLE (see
<olink targetdoc="&uima_docs_tutorial_guides;" targetptr="tug.application.pto"/>)
to true
will cause the same JCas cover object that was previously used for a particular CAS Feature Structure to be
reused. However, this capability won't work when other factors interfere with the ability to reuse the same
object. PEAR isolation is an example of this.</para>
<para>Because of this, and because the JCas Cache holds on to the JCas cover objects beyond their useful life and
prevents them from being garbage collected, it is normally recommended to run with
JCAS_CACHE_ENABLE set to "false".</para>
</section>
<section id="ugr.ref.jcas.keeping_augmentations_when_regenerating">
<title>Keeping hand-coded augmentations when regenerating</title>
<para>If the type system specification changes, you have to re-run the JCasGen
generator. This will produce updated Java for the Class Models that capture the
changed specification. If you have previously augmented the source for these Java
Class Models, your changes must be merged with the newly (re)generated Java source
code for the Class Models. This can be done by hand, or you can run the version of JCasGen
that is integrated with Eclipse, and use automatic merging that is done using Eclipse&apos;s EMF
plug-in. You can obtain Eclipse and the needed EMF plug-in from <ulink
url="http://www.eclipse.org/"/>.</para>
<para>If you run the generator version that works without using Eclipse, it will not
merge Java source changes you may have previously made; if you want them retained,
you&apos;ll have to do the merging by hand.</para>
<para>The Java source merging will keep additional constructors, additional fields,
and any changes you may have made to the readObject method (see below). Merging will
<emphasis>not</emphasis> delete classes in the target that correspond to deleted CAS types (types that are no
longer in the source) &ndash; you should delete these by hand.</para>
<warning><para>The merging supports Java 1.4 syntactic constructs only.
JCasGen generates Java 1.4 code, so as long as any code you change here also sticks to
only Java 1.4 constructs, the merge will work. If you use Java 5 or later specific syntax or constructs, the merge
operation will likely fail to merge properly.</para></warning>
</section>
<section id="ugr.ref.jcas.additional_constructors">
<title>Additional Constructors</title>
<para>Any additional constructors that you add must include the JCas argument. The
first line of your constructor is required to be</para>
<programlisting>this(jcas); // run the standard constructor</programlisting>
<para>where jcas is the passed-in JCas reference. If the type you&apos;re defining
extends <literal>uima.tcas.Annotation</literal>, JCasGen will automatically
add a constructor which takes 2 additional parameters (the begin and end Java
int values) and sets the <literal>uima.tcas.Annotation</literal>
<literal>begin</literal> and <literal>end</literal> fields.</para>
<para>Here&apos;s an example: If you&apos;re defining a type MyType which has a
feature parent, you might make an additional constructor which has an additional
argument of parent:</para>
<programlisting>MyType(JCas jcas, MyType parent) {
this(jcas); // run the standard constructor
setParent(parent); // set the parent field from the parameter
}</programlisting>
<section id="ugr.ref.jcas.using_readobject">
<title>Using readObject</title>
<para>Fields defined by augmenting the Java Class Model to include additional
fields represent data that exist for this class in Java, in a local JVM (Java Virtual
Machine), but do not exist in the CAS when it is passed to other environments (for
example, passing to a remote annotator).</para>
<para>A problem can arise when new instances are created, perhaps by the underlying
system when it iterates over an index: how to ensure that any additional
non-CAS fields are properly initialized. To allow for arbitrary initialization
at instance creation time, the Java Class Model has an initialization method
called readObject. The generated default for this method is to do nothing,
but it is one of the methods that you can modify &ndash; to do whatever
initialization might be needed. It is called with 0 parameters, during the
constructor for the object, after the basic object fields have been set up. It can
refer to fields in the CAS using the getters and setters, and other fields in the Java
object instance being initialized.</para>
<para>A pre-existing CAS feature structure could exist if a CAS was being passed to
this annotator; in this case the JCas system calls the readObject method when
creating the corresponding Java instance for the first time for the CAS feature
structure. This can happen at two points: when a new object is being returned from an
iterator over a CAS index, or a getter method is getting a field for the first time
whose value is a feature structure.</para>
</section>
</section>
<section id="ugr.ref.jcas.modifying_generated_items">
<title>Modifying generated items</title>
<para>The following modifications, if made in generated items, will be preserved when
regenerating.</para>
<para>The public/private etc. flags associated with methods (getters and setters).
You can change the default (<quote>public</quote>) if needed.</para>
<para><quote>final</quote> or <quote>abstract</quote> can be added to the type
itself, with the usual semantics.</para>
</section>
</section>
<section id="ugr.ref.jcas.merging_types_from_other_specs">
<title>Merging types</title>
<titleabbrev>Merging Types</titleabbrev>
<para>Type definitions are merged by the framework from all the components being run together.</para>
<section id="ugr.ref.jcas.merging_types.aggregates_and_cpes">
<title>Aggregate AEs and CPEs as sources of types</title>
<para>When running aggregate AEs (Analysis Engines), or a set of AEs in a collection processing engine, the
UIMA framework will build a merged type system (Note: this <quote>merge</quote> is merging types, not to be
confused with merging Java source code, discussed above). This merged type system has all the types of every
component used in the application. In addition, application code can use UIMA Framework APIs to read and merge
type descriptions, manually.</para>
<para>In most cases, each type system can have its own Java Class Models generated individually, perhaps at an
earlier time, and the resulting class files (or .jar files containing these class files) can be put in the
class path to enable JCas.</para>
<para>However, it is possible that there may be multiple definitions of the same CAS type, each of which might
have different features defined. In this case, the UIMA framework will create a merged type by accumulating
all the defined features for a particular type into that type&apos;s type definition. However, the JCas
classes for these types are not automatically merged, which can create some issues for JCas users, as
discussed in the next section.</para>
</section>
<section id="ugr.ref.jcas.merging_types.jcasgen_support">
<title>JCasGen support for type merging</title>
<para>When there are multiple definitions of the same CAS type with different features defined, then JCasGen
can be re-run on the merged type system, to create one set of JCas Class definitions for the merged types,
which can then be shared by all the components.
Directions for running JCasGen can be found in <olink
targetdoc="&uima_docs_tools;" targetptr="ugr.tools.jcasgen"/>. This is typically done by the person who
is assembling the Aggregate Analysis Engine or Collection Processing Engine. The resulting merged Java
Class Model will then contain get and set methods for the complete set of features. These Java classes must
then be made available in the class path, <emphasis>replacing</emphasis> the pre-merge versions of the
classes.</para>
<para>If hand-modifications were done to the pre-merge versions of the classes, these must be applied to the
merged versions, as described in section <xref
linkend="ugr.ref.jcas.keeping_augmentations_when_regenerating"/>, above. If just one of the
pre-merge versions had hand-modifications, the source for this hand-modified version can be put into the
file system where the generated output will go, and the -merge option for JCasGen will automatically
merge the hand-modifications with the generated code. If
<emphasis>both</emphasis> pre-merged versions had hand-modifications, then these modifications must
be manually merged.</para>
<para>An alternative to this is packaging the components as individual PEAR files, each with their own
version of the JCas generated Classes. The Framework (as of release 2.2) can run PEAR files using the
pear file descriptor, and supply each component with its particular version of the JCas generated class.</para>
</section>
<section id="ugr.ref.jcas.impact_of_type_merging_on_composability">
<title>Impact of Type Merging on Composability of Annotators</title>
<titleabbrev>Type Merging impacts on Composability</titleabbrev>
<para>The recommended approach in UIMA is to build and maintain type systems as separate components, which are
imported by Annotators. Using this approach, Type Merging does not occur because the Type System and its JCas
classes are centrally managed and shared by the annotators.</para>
<para>If you do choose to create a JCas Annotator that relies on Type Merging (meaning that your annotator
redefines a Type that is already in use elsewhere, and adds its own features), this can negatively impact the
reusability of your annotator, unless your component is used as a PEAR file.</para>
<para>If the PEAR file packaging isolation capability is not used, whenever
anyone wants to combine your annotator with another annotator that uses a different version of
the same Type, they will need to be aware of all of the issues described in the previous section. They will need
to have the know-how to re-run JCasGen and appropriately set up their classpath to include the merged Java
classes and to not include the pre-merge classes. (To enable this, you should package these classes
separately from other .jar files for your annotator, so that they can be more easily excluded.) And, if you
have done hand-modifications to your JCas classes, the person assembling your annotator will need to
properly merge those changes. These issues significantly complicate the task of combining annotators, and
will cause your annotator not to be as easily reusable as other UIMA annotators. </para>
</section>
<section id="ugr.ref.jcas.documentannotation_issues">
<title>Adding Features to DocumentAnnotation</title>
<para>There is one built-in type, <literal>uima.tcas.DocumentAnnotation</literal>,
to which applications can add additional features. (All other built-in types
are "feature-final" and you cannot add additional features to them.) Frequently,
additional features are added to <literal>uima.tcas.DocumentAnnotation</literal>
to provide a place to store document-level metadata.</para>
<para>For the same reasons mentioned in the previous section, adding features to
DocumentAnnotation is not recommended if you are using JCas. Instead, it is recommended
that you define your own type for storing your document-level metadata. You can create
an instance of this type and add it to the indexes in the usual way. You can then
retrieve this instance using the iterator returned from the method <literal>getAllIndexedFS(type)</literal>
on an instance of a JFSIndexRepository object.
(As of UIMA v2.1, you do not have to declare a custom index in your descriptor to
get this to work).</para>
<para>If you do choose to add features to DocumentAnnotation, there are additional issues to
be aware of. The UIMA SDK provides the JCas cover class for the built-in definition of
DocumentAnnotation, in the separate jar file <literal>uima-document-annotation.jar</literal>.
If you add additional features to DocumentAnnotation, you must remove this jar file
from your classpath, because you will not want to use the default JCas cover class.
You will need to re-run JCasGen as described in <xref
linkend="ugr.ref.jcas.merging_types.jcasgen_support"/>. JCasGen will generate a new cover
class for DocumentAnnotation, which you must place in your classpath in lieu of the version
in <literal>uima-document-annotation.jar</literal>.</para>
<para>Also, this is the reason why the method <literal>JCas.getDocumentAnnotationFs()</literal> returns
type <literal>TOP</literal>, rather than type <literal>DocumentAnnotation</literal>. Because the
<literal>DocumentAnnotation</literal> class can be replaced by users, it is not part of
<literal>uima-core.jar</literal> and so the core UIMA framework cannot have any references
to it. In your code, you may <quote>cast</quote> the result of <literal>JCas.getDocumentAnnotationFs()</literal>
to type <literal>DocumentAnnotation</literal>, which must be available on the classpath either via
<literal>uima-document-annotation.jar</literal> or by including a custom version that you have generated using JCasGen.</para>
</section>
</section>
<section id="ugr.ref.jcas.using_within_an_annotator">
<title>Using JCas within an Annotator</title>
<para>To use JCas within an annotator, you must include the generated Java classes output
from JCasGen in the class path.</para>
<para>An annotator written using JCas is built by defining a class for the annotator that
extends JCasAnnotator_ImplBase. The process method for this annotator is
written as follows:</para>
<programlisting>public void process(JCas jcas)
throws AnalysisEngineProcessException {
... // body of annotator goes here
}</programlisting>
<para>The process method is passed the JCas instance to use as a parameter.</para>
<para>The JCas reference is used throughout the annotator to refer to the particular JCas
instance being worked on. In pooled or multi-threaded implementations, each thread
works on its own separate JCas.</para>
<para>You can do several kinds of operations using the JCas APIs: create new feature
structures, that is, instances of CAS types (using the new operator); access existing feature
structures passed to your annotator in the JCas (for example, by using the next method of
an iterator over the feature structures); get and set the fields of a particular
instance of a feature structure; and add and remove feature structure instances from
the CAS indexes. To support iteration, there are also functions to get and use indexes
and iterators over the instances in a JCas.</para>
<section id="ugr.ref.jcas.new_instances">
<title>Creating new instances using the Java <quote>new</quote> operator</title>
<titleabbrev>Creating new instances</titleabbrev>
<para>The new operator creates new instances of JCas types. It takes at least one
parameter, the JCas instance in which the type is to be created. For example, if there
was a type Meeting defined, you can create a new instance of it using:
<programlisting>Meeting m = new Meeting(jcas);</programlisting></para>
<para>Other variations of constructors can be added in custom code; the single
parameter version is the one automatically generated by JCasGen. For types that are
subtypes of Annotation, JCasGen also generates an additional constructor with
additional <quote>begin</quote> and <quote>end</quote> arguments.</para>
</section>
<section id="ugr.ref.jcas.getters_and_setters">
<title>Getters and Setters</title>
<para>If the CAS type Meeting had fields location and time, you could get or set these by
using getter or setter methods. These methods have names formed by splicing together
the word <quote>get</quote> or <quote>set</quote> followed by the field name, with
the first letter of the field name capitalized. For instance
<programlisting>getLocation()</programlisting></para>
<para>The getter forms take no parameters and return the value of the field; the setter
forms take one parameter, the value to set into the field, and return void.</para>
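<para>The naming convention can be written down as a tiny helper. This is a sketch for
illustration only; <literal>AccessorNameSketch</literal> and its methods are hypothetical
and are not part of JCasGen. It assumes a non-empty feature name.</para>
<programlisting><![CDATA[
```java
// Sketch only: AccessorNameSketch is a hypothetical helper, not part of JCasGen.
public class AccessorNameSketch {

  // Upper-cases the first letter of a (non-empty) feature name.
  static String capitalize(String featureName) {
    return Character.toUpperCase(featureName.charAt(0)) + featureName.substring(1);
  }

  static String getterName(String featureName) { return "get" + capitalize(featureName); }

  static String setterName(String featureName) { return "set" + capitalize(featureName); }

  public static void main(String[] args) {
    System.out.println(getterName("location"));
    System.out.println(setterName("time"));
  }
}
```
]]></programlisting>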
<para>There are built-in CAS types for arrays of integers, strings, floats, and
feature structures. For fields whose values are these types of arrays, there is an
alternate form of getters and setters that take an additional parameter, written as
the first parameter, which is the index in the array of an item to get or set.</para>
</section>
<section id="ugr.ref.jcas.obtaining_refs_to_indexes">
<title>Obtaining references to Indexes</title>
<para>The only way to access instances (not otherwise referenced from other
instances) passed in to your annotator in its JCas is to use an iterator over some
index. Indexes in the CAS are specified in the annotator descriptor. Indexes have a
name; text annotators have a built-in, standard index over all annotations.</para>
<para>To get an index, first get the JFSIndexRepository from the JCas using the method
jcas.getJFSIndexRepository(). Here are the calls to get indexes:</para>
<programlisting>JFSIndexRepository ir = jcas.getJFSIndexRepository();
ir.getIndex(name-of-index) // get the index by its name, a string
ir.getIndex(name-of-index, Foo.type) // filtered by specific type
ir.getAnnotationIndex() // get AnnotationIndex
ir.getAnnotationIndex(Foo.type) // filtered by specific type</programlisting>
<para>For convenience, the getAnnotationIndex method is available directly on the JCas object
instance; the implementation merely forwards to the associated index repository.</para>
<para>A filtering type has to be a subtype of the type specified for this index in its
index specification. It can be written either as Foo.type or, if you have an instance
of Foo, as</para>
<programlisting>fooInstance.jcasType.casType</programlisting>
<para>Foo is (of course) an example of the name of the type.</para>
</section>
<section id="ugr.ref.jcas.adding_removing_instances_to_indexes">
<title>Adding (and removing) instances to (from) indexes</title>
<titleabbrev>Updating Indexes</titleabbrev>
<para>CAS indexes are maintained automatically by the CAS. However, you must explicitly add
each feature structure instance that you want an index to find, by using the
call:</para>
<programlisting>myInstance.addToIndexes();</programlisting>
<para>Do this after setting all features in the instance <emphasis role="bold-italic">which could be used in indexing</emphasis>, for example, in
determining the sorting order. After indexing, do not change the values of these
particular features because the indexes will not be updated. If you need to change the
values, you must first remove the instance from the CAS indexes, change the values,
and then add the instance back. To remove an instance from the indexes, use the method:
<programlisting>myInstance.removeFromIndexes();</programlisting></para>
<note><para>It&apos;s OK to change feature values which are not used in determining
sort ordering (or set membership), without removing and re-adding back to the index.
</para></note>
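<para>The reason for the remove, change, add sequence can be seen in an analogy with a
sorted Java collection. The sketch below is not UIMA code: a
<literal>java.util.TreeSet</literal> stands in for a sorted CAS index, and
<literal>Span</literal>, <literal>newIndex</literal> and <literal>setBegin</literal> are
hypothetical names. Like a CAS index, the set places an element according to its key at
the time it is added, so the key is only changed while the element is out of the set.</para>
<programlisting><![CDATA[
```java
import java.util.Comparator;
import java.util.TreeSet;

// Analogy only: a TreeSet stands in for a sorted CAS index. Span is a hypothetical class.
public class ReindexSketch {

  static class Span {
    int begin; // plays the role of a feature used as a sort key
    Span(int begin) { this.begin = begin; }
  }

  // Creates an empty "index" sorted by the begin value.
  static TreeSet<Span> newIndex() {
    return new TreeSet<Span>(new Comparator<Span>() {
      public int compare(Span a, Span b) { return Integer.compare(a.begin, b.begin); }
    });
  }

  // Safe update: remove from the index, change the key, then add back.
  static void setBegin(TreeSet<Span> index, Span span, int newBegin) {
    index.remove(span);
    span.begin = newBegin;
    index.add(span);
  }

  public static void main(String[] args) {
    TreeSet<Span> index = newIndex();
    Span a = new Span(10);
    Span b = new Span(20);
    index.add(a);
    index.add(b);
    setBegin(index, a, 30);
    System.out.println(index.first() == b); // true: the order reflects the new value
    System.out.println(index.last() == a);  // true
  }
}
```
]]></programlisting>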
<para>When writing a Multi-View component, you may need to index instances in multiple
CAS views. The methods above use the indexes associated with the current JCas object.
There is a variation of the <literal>addToIndexes / removeFromIndexes</literal> methods which
takes one argument: a reference to a JCas object holding the view in which you want to
index this instance.
<programlisting>myInstance.addToIndexes(anotherJCas)
myInstance.removeFromIndexes(anotherJCas)</programlisting>
</para>
<para>
You can also explicitly add instances to other views using the addFsToIndexes method on
other JCas (or CAS) objects. For instance, if you had 2 other CAS views (myView1 and
myView2), in which you wanted to index myInstance, you could write:</para>
<programlisting>myInstance.addToIndexes(); // index myInstance in the view used with the new operator
myView1.addFsToIndexes(myInstance); // index myInstance in myView1
myView2.addFsToIndexes(myInstance); // index myInstance in myView2</programlisting>
<para>
The rules for determining which index to use with a particular JCas object are designed to
behave the way most would think they should; if you need specific behavior, you can always
explicitly designate which view the index adding and removing operations should work on.
</para>
<para>
The rules are:
If the instance is a subtype of AnnotationBase, then the view is the view associated with the
annotation as specified in the feature holding the view reference in AnnotationBase.
Otherwise, if the instance was created using the "new" operator, then the view is the view passed to the
instance's constructor.
Otherwise, if the instance was created by getting a feature value from some other instance, whose range
type is a feature structure, then the view is the same as the referring instance.
Otherwise, if the instance was created by any of the Feature Structure Iterator operations over some index,
then it is the view associated with the index.
</para>
</section>
<section id="ugr.ref.jcas.using_iterators">
<title>Using Iterators</title>
<para>Once you have an index obtained from the JCas, you can get an iterator from the
index; here is an example:</para>
<programlisting>// Unfiltered: get the index by its name
FSIndexRepository ir = jcas.getFSIndexRepository();
FSIndex myIndex = ir.getIndex("myIndexName");
FSIterator myIterator = myIndex.iterator();

// Alternatively, filtered by a specific type
JFSIndexRepository jir = jcas.getJFSIndexRepository();
FSIndex myFilteredIndex = jir.getIndex("myIndexName", Foo.type); // filtered
FSIterator myFilteredIterator = myFilteredIndex.iterator();</programlisting>
<para>Iterators work like normal Java iterators, but are augmented to support
additional capabilities. Iterators are described in the CAS Reference, <olink
targetdoc="&uima_docs_ref;"
targetptr="ugr.ref.cas.indexes_and_iterators"/>.</para>
</section>
<section id="ugr.ref.jcas.class_loaders">
<title>Class Loaders in UIMA</title>
<para>The basic concept of a UIMA application includes assembling engines into a flow.
The application made up of these Engines is run within the UIMA Framework, either by
the Collection Processing Manager, or by using more basic UIMA Framework
APIs.</para>
<para>The UIMA Framework exists within a JVM (Java Virtual Machine). A JVM has the
capability to load multiple applications, in a way where each one is isolated from the
others, by using a separate class loader for each application. For instance, one set
of UIMA Framework Classes could be shared by multiple sets of application-specific
classes, even if these application-specific classes had the same names but were
different versions.</para>
<section id="ugr.ref.jcas.class_loaders.optional">
<title>Use of Class Loaders is optional</title>
<para>The UIMA framework will use a specific ClassLoader, based on how
ResourceManager instances are used. Specific ClassLoaders are only created if
you specify an ExtensionClassPath as part of the ResourceManager. If you do not
need to support multiple applications within one UIMA framework within a JVM,
don&apos;t specify an ExtensionClassPath; in this case, the classloader used
will be the one used to load the UIMA framework - usually the overall application
class loader.</para>
<para>Of course, you should not run multiple UIMA applications together, in this
way, if they have different class definitions for the same class name. This
includes the JCas <quote>cover</quote> classes. This case might arise, for
instance, if both applications extended
<literal>uima.tcas.DocumentAnnotation</literal> in differing,
incompatible ways. Each application would need its own definition of this class,
but only one could be loaded (unless you specify ExtensionClassPath in the
ResourceManager which will cause the UIMA application to load its private
versions of its classes, from its classpath).</para>
</section>
</section>
<section id="ugr.ref.jcas.accessing_jcas_objects_outside_uima_components">
<title>Issues accessing JCas objects outside of UIMA Engine Components</title>
<para>If you are using the ExtensionClassPaths, the JCas cover classes are loaded
under a class loader created by the ResourceManager part of the UIMA Framework.
If you reference the same JCas
classes outside of any UIMA component, for instance, in top level application code,
the JCas classes used by that top level application code also must be in the class path
for the application code.</para>
<para>Alternatively, you could do all the JCas processing inside a UIMA component (and do no
processing using JCas outside of the UIMA pipeline).</para>
</section>
</section>
<section id="ugr.ref.jcas.setting_up_classpath">
<title>Setting up Classpath for JCas</title>
<para>The JCas Java classes generated by JCasGen are typically compiled and put into a JAR
file, which, in turn, is put into the application&apos;s class path.</para>
<para>This JAR file must be generated from the application&apos;s merged type system.
This is most conveniently done by opening the top level descriptor used by the
application in the Component Descriptor Editor tool, and pressing the Run-JCasGen
button on the Type System Definition page.</para>
</section>
<section id="ugr.ref.jcas.pear_support">
<title>PEAR isolation</title>
<para>
As of version 2.2, the framework supports component descriptors which are PEAR descriptors.
These descriptors define components plus include information on the class path needed to
run them. The framework uses the class path information to set up a localized class path, just
for code running within the PEAR context. This allows PEAR files requiring different
versions of common code to work well together, even if classes in the different versions
have the same names.
</para>
<para>The mechanism used to switch the class loaders when entering a PEAR-packaged annotator in
a flow depends on the framework knowing if JCas is being used within that annotator code. The
framework will know this if the particular view being passed has had a previous call to
getJCas(), or if the particular annotator is marked as a JCas-using one (by having it extend the
class <code>JCasAnnotator_ImplBase</code>).</para>
</section>
</chapter>