blob: 3dc20e062ff54a37b48e27e40daef9c8cd58d708 [file] [log] [blame]
<!--
***************************************************************
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
***************************************************************
-->
<html>
<head>
<title>Apache UIMA v3.2.0 Release Notes</title>
</head>
<body>
<h1>Apache UIMA (Unstructured Information Management Architecture) v3.2.0 Release Notes</h1>
<h2>Contents</h2>
<p>
<a href="#what.is.uima">What is UIMA?</a><br/>
<a href="#major.changes">Major Changes in this Release</a><br/>
<a href="#get.involved">How to Get Involved</a><br/>
<a href="#report.issues">How to Report Issues</a><br/>
<a href="#list.issues">List of JIRA Issues Fixed in this Release</a><br/>
</p>
<h2><a id="what.is.uima">What is UIMA?</a></h2>
<p>
Unstructured Information Management applications are
software systems that analyze large volumes of
unstructured information in order to discover knowledge
that is relevant to an end user. UIMA is a framework and
SDK for developing such applications. An example UIM
application might ingest plain text and identify
entities, such as persons, places, organizations; or
relations, such as works-for or located-at. UIMA enables
such an application to be decomposed into components,
for example "language identification" -&gt; "language
specific segmentation" -&gt; "sentence boundary
detection" -&gt; "entity detection (person/place names
etc.)". Each component must implement interfaces defined
by the framework and must provide self-describing
metadata via XML descriptor files. The framework manages
these components and the data flow between them.
Components are written in Java or C++; the data that
flows between components is designed for efficient
mapping between these languages. UIMA additionally
provides capabilities to wrap components as network
services, and can scale to very large volumes by
replicating processing pipelines over a cluster of
networked nodes.
</p>
<p>
Apache UIMA is an Apache-licensed open source
implementation of the UIMA specification (that
specification is, in turn, being developed concurrently
by a technical committee within
<a href="http://www.oasis-open.org">OASIS</a>,
a standards organization). We invite and encourage you
to participate in both the implementation and
specification efforts.
</p>
<p>
UIMA is a component framework for analysing unstructured
content such as text, audio and video. It comprises an
SDK and tooling for composing and running analytic
components written in Java and C++, with some support
for Perl, Python and TCL.
</p>
<h2><a id="major.changes">Notable changes in this release</a></h2>
<ul>
<li>Added AnnotationPredicates utility class providing various predicates testing how annotations
relate to each other (e.g. covering, being covered by, following, preceding, etc.)</li>
<li>Added single-int arg version of select.startAt()</li>
<li>Added trim method to AnnotationFS</li>
<li>Added ability to serialize as XMIs as XML 1.1</li>
<li>Added ability to serialize as XMIs pretty-printed using CasIOUtils</li>
<li>Added typed parameter support to PEARs</li>
<li>Improve performance of setting up JCas classes by reducing sync lock contention</li>
<li>Improved speed of constructing aggregate engines</li>
<li>Fixed de-serialization of array subtypes in form 6 binary CASes</li>
<li>Fixed parameter-fetching methods PearSpecifier_impl not returning null because they promise not to</li>
<li>Fixed logs being spammed with "Import by location/name..." messages</li>
<li>Fixed numerous bugs and inconsistencies in the SelectFS implementation</li>
<li>Fixed CAS-transportable Java objects not being properly deserialized</li>
<li>Fixed FSArray.spliterator() to work in PEAR scenarios</li>
<li>Fixed memory leak in FSClassRegistry in scenarios with large numbers of dynamically created classloaders</li>
<li>Fixed oddball "race" condition when initializing JCas classes</li>
<li>Fixed problem when reading mixed sets of binary CASes from UIMAv2 and UIMAv3</li>
<li>Fixed IndexOutOfBoundsException in CVD</li>
<li>Fixed bug causing Annotation to be returned when asking JCas for a specific type</li>
<li>Fixed ability to install PEARs into directories containing XML special characters in the name</li>
<li>Fixed not-indexed document annotation being wrongly added back to the index during de-serialization</li>
<li>Fixed index protection for cases that no FSes were indexed</li>
<li>Fixed concurrent binary serialization producing corrupt output</li>
<li>Fixed deep cloning of AnalysisEngineDescription</li>
<li>Fixed race condition in type system consolidation</li>
<li>Fixed re-initialization of multi-view CAS with a different type system</li>
<li>Fixed logger silently discarding a parameter in some cases (placeholder filler or throwable)</li>
<li>No longer ship Pack200-compressed versions of the Eclipse plugins</li>
<li>Converted UIMAv3 User's Guide from DocBook to Asciidoc</li>
</ul>
<h3>API changes</h3>
<h4>SelectFS API with zero-width annotations</h4>
<p>The behavior of the selectFS API changes in this release, in particular with respect to the
handling of zero-width annotations (those that have the same start and end position). The behavior
has been made to align with the new annotation predicates, the details of which are described in the
UIMAv3 User's Guide.
</p>
<h4>SelectFS API with negative shift on bounded selections</h4>
<p>The <code>shifted</code> operation can no longer be used to expand a selection beyond its
selection boundaries. Consider the following example:<p>
<pre><code>
t1 = new Token(0,1)
t2 = new Token(2,3)
t3 = new Token(4,5)
t4 = new Token(6,7)
t5 = new Token(8,9)
</code></pre>
<p>In previous versions, was also possible to use a negative shift with a bounding operator such as
<code>following</code>, <code>coveredBy</code>, etc. and it would call <code>moveToPrevious</code>
on the internal iterator of the selection operation, causing it to return annotations occurring
before the bounds e.g.:
<pre><code>
select().shifted(-1).following(t3) => {t3, t4, t5}
</code></pre>
<p>This was found to be inconsistent behavior. The iterator used for the selection (which can also be
obtained by calling <code>fsIterator()</code>) should respect the bounds.</p>
<p>As of this UIMA version, using <code>shifted</code> with a negative argument in conjunction with
a bounding operator will trigger a warning in the logs and return an empty result.</p>
<pre><code>
select().shifted(-1).following(t3) => {}
select().following(t3) => {t4, t5}
</code></pre>
<h4>SelectFS API with Backwards selection with startAt</h4>
<p>In previous versions, the using the <code>moveTo</code> operation backwards iterators obtained
through <code>SelectFSs</code> did never ignore type priorities - even though <code>SelectFSs</code>
by default should ignore them.</p>
<h2><a id="list.issues">Full list of JIRA Issues affecting this Release</a></h2>
Click <a href="issuesFixed/jira-report.html">issuesFixed/jira-report.hmtl</a> for the list of
issues affecting this release.
<p>Please use the mailing lists
( http://uima.apache.org/mail-lists.html )
for feedback.</p>
<h2><a id="get.involved">How to Get Involved</a></h2>
<p>
The Apache UIMA project really needs and appreciates any contributions,
including documentation help, source code and feedback. If you are interested
in contributing, please visit
<a href="http://uima.apache.org/get-involved.html">
http://uima.apache.org/get-involved.html</a>.
</p>
<h2><a id="report.issues">How to Report Issues</a></h2>
<p>
The Apache UIMA project uses JIRA for issue tracking. Please report any
issues you find at
<a href="http://issues.apache.org/jira/browse/uima">http://issues.apache.org/jira/browse/uima</a>
</p>
</body>
</html>