blob: 16cc8820addc66f056162a3ac4d90db694d6e4f5 [file] [log] [blame]
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
<title>Apache Distributed UIMA Cluster Computing (DUCC) 3.0.0 Release Notes</title>
<h1>Apache UIMA-DUCC (Unstructured Information Management Architecture - Distributed UIMA Cluster Computing ) v.3.0.0 Release Notes</h1>
<a href="">1. What is UIMA-DUCC?</a><br/>
<a href="#major.changes">2. Major Changes in this Release</a><br/>
<a href="#migration">3. Migration from a Prior Release</a><br/>
<a href="#migration">4. Limitations</a><br/></p>
<h2><a name="">1. What is UIMA-DUCC?</a></h2>
DUCC stands for Distributed UIMA Cluster Computing. DUCC is a cluster management system providing tooling,
management, and scheduling facilities to automate the scale-out of applications written to the UIMA framework.
Core UIMA provides a generalized framework for applications that process unstructured information such as human
language, but does not provide a scale-out mechanism. UIMA-AS provides a scale-out mechanism to distribute UIMA
pipelines over a cluster of computing resources, but does not provide job or cluster management of the resources.
DUCC defines a formal job model that closely maps to a standard UIMA pipeline. Around this job model DUCC
provides cluster management services to automate the scale-out of UIMA pipelines over computing clusters.
<h2><a name="major.changes">2. Major Changes in this Release</a></h2>
Apache UIMA DUCC 3.0.0 is a maintenance release containing new features and bug fixes. What's new:<br>
<li>Support for UIMA v2 and v3</li>
<li>Added support for reliable DUCC - automatic head node failover</li>
<li>Created new pull service that can be run with or without DUCC</li>
<li>Enable DUCC to run without shared file system</li>
<li>Add new DUCC stop options, including quiesce</li>
<li>Add "CASes processed" to annotator performance metrics</li>
<li>Upgraded to Cassandra Server v.3.11.3, Cassandra Driver v.3.6.0, Jetty v.9.4.14.v20181114, guava v.18.0,
joda v.2.4, commons.lang v.3.1, commons.math v.3.2, netty v.4.0.44, snappy v.</li>
For a complete list of issues fixed and up-to-date information on UIMA-DUCC issues, see our issue tracker:
<a href=""></a>
<h2><a name="migration">3. Migration from a Prior Release</a></h2>
When upgrading from an existing installation the ducc_update script may be used to replace the system files while leaving the site-specific configuration files in place. For more information see <strong>ducc_update</strong> in the Administrative Commands section of the DuccBook.
<h2><a name="limitations">4. Limitations</a></h2>
On some systems cgroups swap accounting is not enabled and duccmon will show N/A for swap. To
confirm, please check memory.stat file in <cgroups base dir>/ducc/ folder. If swap accounting is
enabled there should be "swap" property defined. If it's missing, you need to add a kernel parameter
swapaccount=1. Details of how to do this can be found <a href="">here</a>.
Due to a bug in uima sdk, the uima AnalysisEngineProcessException cannot be serialized as a Java object. If your
analysis engine throws an exception in process(), the ducc framework will stringify it and wrap it in
java RuntimeException. If you have a custom error handler plugged in into a job driver you will not be
able to test for AnalysisEngineProcessException in a stack trace with a code like this:
if ( error instanceof AnalysisEngineProcessException ) ...
To use OS-based login with the WebServer while running DUCC with IBM java, the minimum JDK version is
Java 8 SR4 FP5 (