
 	<!--
     ***************************************************************
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements.  See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership.  The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License.  You may obtain a copy of the License at
     *
     *   http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing,
     * software distributed under the License is distributed on an
     * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     * KIND, either express or implied.  See the License for the
     * specific language governing permissions and limitations
     * under the License.
     ***************************************************************
    -->
 <html>
 <head>
   <title>Apache UIMA v3.2.0 Release Notes</title>
 </head>
 <body>
 <h1>Apache UIMA (Unstructured Information Management Architecture) v3.2.0 Release Notes</h1>

 <h2>Contents</h2>
 <p>
 <a href="#what.is.uima">What is UIMA?</a><br/>
 <a href="#major.changes">Notable Changes in this Release</a><br/>
 <a href="#get.involved">How to Get Involved</a><br/>
 <a href="#report.issues">How to Report Issues</a><br/>
 <a href="#list.issues">Full List of JIRA Issues Affecting this Release</a><br/>
 </p>

 <h2><a id="what.is.uima">What is UIMA?</a></h2>

      <p>
   			Unstructured Information Management applications are
 				software systems that analyze large volumes of
 				unstructured information in order to discover knowledge
 				that is relevant to an end user. UIMA is a framework and
 				SDK for developing such applications. An example UIM
 				application might ingest plain text and identify
 				entities, such as persons, places, organizations; or
 				relations, such as works-for or located-at. UIMA enables
 				such an application to be decomposed into components,
 				for example "language identification" -&gt; "language
 				specific segmentation" -&gt; "sentence boundary
 				detection" -&gt; "entity detection (person/place names
 				etc.)". Each component must implement interfaces defined
 				by the framework and must provide self-describing
 				metadata via XML descriptor files. The framework manages
 				these components and the data flow between them.
 				Components are written in Java or C++; the data that
 				flows between components is designed for efficient
 				mapping between these languages. UIMA additionally
 				provides capabilities to wrap components as network
 				services, and can scale to very large volumes by
 				replicating processing pipelines over a cluster of
 				networked nodes.
 			</p>
       <p>
 				Apache UIMA is an Apache-licensed open source
 				implementation of the UIMA specification (that
 				specification is, in turn, being developed concurrently
 				by a technical committee within
 				<a href="http://www.oasis-open.org">OASIS</a>,
 				a standards organization). We invite and encourage you
 				to participate in both the implementation and
 				specification efforts.
 			</p>
       <p>
 				UIMA is a component framework for analysing unstructured
 				content such as text, audio and video. It comprises an
 				SDK and tooling for composing and running analytic
 				components written in Java and C++, with some support
 				for Perl, Python and TCL.
 			</p>

 <h2><a id="major.changes">Notable changes in this release</a></h2>

 <ul>
   <li>Added AnnotationPredicates utility class providing various predicates testing how annotations
       relate to each other (e.g. covering, being covered by, following, preceding, etc.)</li>
   <li>Added single-int arg version of select.startAt()</li>
   <li>Added trim method to AnnotationFS</li>
   <li>Added ability to serialize XMIs as XML 1.1</li>
   <li>Added ability to serialize XMIs pretty-printed using CasIOUtils</li>
   <li>Added typed parameter support to PEARs</li>
   <li>Improved performance of setting up JCas classes by reducing sync lock contention</li>
   <li>Improved speed of constructing aggregate engines</li>
   <li>Fixed de-serialization of array subtypes in form 6 binary CASes</li>
   <li>Fixed parameter-fetching methods of PearSpecifier_impl so that they no longer return null, as their contract promises</li>
   <li>Fixed logs being spammed with "Import by location/name..." messages</li>
   <li>Fixed numerous bugs and inconsistencies in the SelectFS implementation</li>
   <li>Fixed CAS-transportable Java objects not being properly deserialized</li>
   <li>Fixed FSArray.spliterator() to work in PEAR scenarios</li>
   <li>Fixed memory leak in FSClassRegistry in scenarios with large numbers of dynamically created classloaders</li>
   <li>Fixed oddball "race" condition when initializing JCas classes</li>
   <li>Fixed problem when reading mixed sets of binary CASes from UIMAv2 and UIMAv3</li>
   <li>Fixed IndexOutOfBoundsException in CVD</li>
   <li>Fixed bug causing Annotation to be returned when asking JCas for a specific type</li>
   <li>Fixed ability to install PEARs into directories containing XML special characters in the name</li>
   <li>Fixed not-indexed document annotation being wrongly added back to the index during de-serialization</li>
   <li>Fixed index protection for cases where no FSes were indexed</li>
   <li>Fixed concurrent binary serialization producing corrupt output</li>
   <li>Fixed deep cloning of AnalysisEngineDescription</li>
   <li>Fixed race condition in type system consolidation</li>
   <li>Fixed re-initialization of multi-view CAS with a different type system</li>
   <li>Fixed logger silently discarding a parameter in some cases (placeholder filler or throwable)</li>
   <li>No longer ship Pack200-compressed versions of the Eclipse plugins</li>
   <li>Converted UIMAv3 User's Guide from DocBook to Asciidoc</li>
 </ul>

 <h3>API changes</h3>

 <h4>SelectFS API with zero-width annotations</h4>

 <p>The behavior of the SelectFS API changes in this release, in particular with respect to the
 handling of zero-width annotations (those that have the same start and end position). The behavior
 has been made to align with the new annotation predicates, the details of which are described in the
 UIMAv3 User's Guide.
 </p>

 <h4>SelectFS API with negative shift on bounded selections</h4>

 <p>The <code>shifted</code> operation can no longer be used to expand a selection beyond its
 selection boundaries. Consider the following example:</p>

 <pre><code>
 t1 = new Token(0,1)
 t2 = new Token(2,3)
 t3 = new Token(4,5)
 t4 = new Token(6,7)
 t5 = new Token(8,9)
 </code></pre>

 <p>In previous versions, it was also possible to use a negative shift with a bounding operator such as
 <code>following</code>, <code>coveredBy</code>, etc. Doing so would call <code>moveToPrevious</code>
 on the internal iterator of the selection operation, causing it to return annotations occurring
 before the bounds, e.g.:</p>

 <pre><code>
 select().shifted(-1).following(t3) => {t3, t4, t5}
 </code></pre>

 <p>This was found to be inconsistent behavior. The iterator used for the selection (which can also be
 obtained by calling <code>fsIterator()</code>) should respect the bounds.</p>

 <p>As of this UIMA version, using <code>shifted</code> with a negative argument in conjunction with
 a bounding operator will trigger a warning in the logs and return an empty result.</p>

 <pre><code>
 select().shifted(-1).following(t3) => {}
 select().following(t3) => {t4, t5}
 </code></pre>
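
 <p>The new behavior can be illustrated with a small stand-alone model. This is only a sketch and
 not UIMA code: the class <code>ShiftedFollowingSketch</code>, the method
 <code>followingShifted</code> and the token table are hypothetical names introduced for this
 example. The model assumes that "following" means "begins at or after the end of the reference
 token" and that a positive shift skips forward. The line printed to standard error stands in for
 the warning that the framework writes to its log.</p>

```java
import java.util.Arrays;

/** Toy model of the documented behavior; it does not use or reproduce UIMA classes. */
final class ShiftedFollowingSketch {

    /** Hypothetical stand-ins for the tokens t1..t5 above, stored as {begin, end}. */
    static final int[][] TOKENS = { {0, 1}, {2, 3}, {4, 5}, {6, 7}, {8, 9} };

    /**
     * Returns the 1-based numbers of the tokens following the reference token.
     * Assumption of this model: a token follows the reference if it begins at or
     * after the reference's end, and a positive shift skips that many results.
     * A negative shift would have to step before the bound, so the result is empty.
     */
    static int[] followingShifted(int refNumber, int shift) {
        if (shift < 0) {
            // Stand-in for the warning that is logged by the framework.
            System.err.println("warning: negative shift on a bounded selection");
            return new int[0];
        }
        int refEnd = TOKENS[refNumber - 1][1];
        int[] result = new int[TOKENS.length];
        int count = 0;
        int skipped = 0;
        for (int i = 0; i < TOKENS.length; i++) {
            if (TOKENS[i][0] >= refEnd) {
                if (skipped < shift) {
                    skipped++;          // consume the forward shift first
                    continue;
                }
                result[count++] = i + 1;
            }
        }
        return Arrays.copyOf(result, count);
    }

    public static void main(String[] args) {
        // Models select().following(t3)
        System.out.println(Arrays.toString(followingShifted(3, 0)));
        // Models select().shifted(-1).following(t3)
        System.out.println(Arrays.toString(followingShifted(3, -1)));
    }
}
```

 <p>In this model, the unshifted call yields the tokens numbered 4 and 5, and the call with a
 negative shift yields an empty array, matching the two results listed above.</p>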

 <h4>SelectFS API with backwards selection and startAt</h4>

 <p>In previous versions, backwards iterators obtained through <code>SelectFSs</code> never ignored
 type priorities when the <code>moveTo</code> operation was used, even though <code>SelectFSs</code>
 should ignore them by default.</p>

 <h2><a id="list.issues">Full list of JIRA Issues affecting this Release</a></h2>
 <p>Click <a href="issuesFixed/jira-report.html">issuesFixed/jira-report.html</a> for the list of
 issues affecting this release.</p>

 <p>Please use the mailing lists
 (<a href="http://uima.apache.org/mail-lists.html">http://uima.apache.org/mail-lists.html</a>)
 for feedback.</p>

 <h2><a id="get.involved">How to Get Involved</a></h2>
 <p>
 The Apache UIMA project really needs and appreciates any contributions,
 including documentation help, source code and feedback.  If you are interested
 in contributing, please visit
 <a href="http://uima.apache.org/get-involved.html">
   http://uima.apache.org/get-involved.html</a>.
 </p>

 <h2><a id="report.issues">How to Report Issues</a></h2>
 <p>
 The Apache UIMA project uses JIRA for issue tracking.  Please report any
 issues you find at
 <a href="http://issues.apache.org/jira/browse/uima">http://issues.apache.org/jira/browse/uima</a>
 </p>
 </body>
 </html>