blob: 746cffb73d017d87c8b70484056dee3ddce2ae22 [file] [log] [blame]
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<chapter id="ugr.tools.uimafit.casutil">
<title>CAS Utilities</title>
<para>uimaFIT facilitates working with the CAS and JCas by offering various convenient methods for
accessing and navigating annotations and feature structures. Additionally, the the convenience
methods for JCas access are fully type-safe and return the JCas type or a collection of the JCas
type which you wanted to access.</para>
<section>
<title>Access methods</title>
<para>uimaFIT supports the following convenience methods for accessing CAS and JCas structures.
All methods respect the UIMA index definitions and return annotations or feature structures in
the order defined by the indexes. Unless the default UIMA index for annotations has been
overwritten, annotations are returned sorted by begin (increasing) and end
(decreasing).</para>
<itemizedlist>
<listitem>
<para><code>select(cas, type)</code> - fetch all annotations of the given type from the
CAS/JCas. Variants of this method also exist to fetch annotations from a
<type>FSList</type> or <type>FSArray</type>.</para>
</listitem>
<listitem>
<para><code>selectAll(cas)</code> - fetch all annotations from the CAS or fetch all feature
structures from the JCas.</para>
</listitem>
<listitem>
<para><code>selectBetween(type, annotation1, annotation2)</code>* - fetch all annotations
between the given two annotations.</para>
</listitem>
<listitem>
<para><code>selectCovered(type, annotation)</code>* - fetch all annotations covered by the
given annotation. If this operation is used intensively, <code>indexCovered(...)</code>
should be used to pre-calculate annotation covering information.</para>
</listitem>
<listitem>
<para><code>selectCovering(type, annotation)*</code> - fetch all annotations covering the
given annotation. If this operation is used intensively, <code>indexCovering(...)</code>
should be used to pre-calculate annotation covering information.</para>
</listitem>
<listitem>
<para><code>selectByIndex(cas, type, n)</code> - fetch the n-th feature structure of the
given type.</para>
</listitem>
<listitem>
<para><code>selectSingle(cas, type)</code> - fetch the single feature structure of the given
type. An exception is thrown if there is not exactly one feature structure of the type.
</para>
</listitem>
<listitem>
<para><code>selectSingleRelative(type, annotation, n)</code>* - fetch a single annotation
relative to the given annotation. A positive <parameter>n</parameter> fetches the n-th
annotation right of the specified annotation, while the a negative
<parameter>n</parameter> fetches to the left.</para>
</listitem>
<listitem>
<para><code>selectPreceding(type, annotation, n)</code>* - fetch the n annotations preceding
the given annotation. If there are less then n preceding annotations, all preceding
annotations are returned.</para>
</listitem>
<listitem>
<para><code>selectFollowing(type, annotation, n)</code>* - fetch the n annotations following
the given annotation. If there are less then n following annotations, all following
annotations are returned.</para>
</listitem>
</itemizedlist>
<note>
<para>For historical reasons, the method marked with * also exist in a version that accepts a
CAS/JCas as the first argument. These may not work as expected when the annoation arguments
provided to the method are from a different CAS/JCas/view. Also, for any method accepting
two annotations, these should come from the same CAS/JCas/view. In future, the potentially
problematic signatures may be deprecated, removed, or throw exeptions if these conditions
are not met.</para>
</note>
<note>
<para>You should expect the structures returned by these methods to be backed by the CAS/JCas
contents. In particular, if you remove any feature structures from the CAS while iterating
over these structures may cause failures. For this reason, you should also not hold on to
these structures longer than necessary, as is the case for UIMA <code>FSIterator</code>s as
well.</para>
</note>
<para>Depending on whether one works with a CAS or JCas, the respective methods are available
from the JCasUtil or CasUtil classes. </para>
<para>JCasUtil expect a JCas wrapper class for the <parameter>type</parameter> argument, e.g.
<code>select(jcas, Token.class)</code> and return this type or a collection using this
generic type. Any subtypes of the specified type are returned as well. CasUtil expects a UIMA
<type>Type</type> instance. For conveniently getting these, CasUtil offers the methods
<code>getType(CAS, Class&lt;?>)</code> or <code>getType(CAS, String)</code> which fetch a
type either by its JCas wrapper class or by its name.</para>
<para>Unless annotations are specifically required, e.g. because begin/end offsets are required,
the JCasUtil methods can be used to access any feature structure inheriting from
<type>TOP</type>, not only annotations. The CasUtil methods generally work only on
annotations. Alternative methods ending in "FS" are provided for accessing arbitrary feature
structures, e.g. <code>selectFS</code>.</para>
<para>Examples:</para>
<programlisting>// CAS version
Type tokenType = CasUtil.getType(cas, "my.Token");
for (AnnotationFS token : CasUtil.select(cas, tokenType)) {
...
}
// JCas version
for (Token token : JCasUtil.select(jcas, Token.class)) {
...
}</programlisting>
</section>
</chapter>