<html>
<!--
***************************************************************
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
***************************************************************
-->
<head>
<title>Apache Distributed UIMA Cluster Computing (DUCC) 1.1.0 Release Notes</title>
</head>
<body>
<h1>Apache UIMA-DUCC (Unstructured Information Management Architecture - Distributed UIMA Cluster Computing) v1.1.0 Release Notes</h1>
<h2>Contents</h2>
<p>
<a href="#what.is.uima-ducc">1. What is UIMA-DUCC?</a><br/>
<a href="#major.changes">2. Major Changes in this Release</a><br/>
<a href="#limitations">3. Limitations in this Release</a><br/>
</p>
<h2><a name="what.is.uima-ducc">1. What is UIMA-DUCC?</a></h2>
<p>
DUCC stands for Distributed UIMA Cluster Computing. DUCC is a cluster management system providing tooling,
management, and scheduling facilities to automate the scale-out of applications written to the UIMA framework.
Core UIMA provides a generalized framework for applications that process unstructured information such as human
language, but does not provide a scale-out mechanism. UIMA-AS provides a scale-out mechanism to distribute UIMA
pipelines over a cluster of computing resources, but does not provide job or cluster management of the resources.
DUCC defines a formal job model that closely maps to a standard UIMA pipeline. Around this job model DUCC
provides cluster management services to automate the scale-out of UIMA pipelines over computing clusters.
</p>
<h2><a name="major.changes">2. Major Changes in this Release</a></h2>
<p>
Apache UIMA DUCC 1.1.0 is a maintenance release containing bug fixes and a few
new features. What's new:<br>
<h3>2.1 Service Manager Changes</h3>
<ul>
<li>Advanced ping support: a pinger is able to microschedule service instances.
The Ping API is enhanced to allow the following actions:
<ul>
<li>start instances</li>
<li>stop instances</li>
<li>change autostart</li>
<li>set last-usage information in query</li>
<li>manage the instance failure window. Default failure-window management is provided.</li>
</ul>
A sample microscheduling pinger is supplied to illustrate the new API.</li>
<li>Multiple pingers may be registered by the admin to run internally as threads in the SM instead of as external processes.</li>
<li>CLI support to enable or disable instance startup.</li>
<li>CLI support to seamlessly transition service startup mode among autostarted, reference-started, and manually-started.</li>
<li>Query is enhanced to provide more information to the CLI and web server:
<ul>
<li>Service last use</li>
<li>Registration date</li>
<li>Explicit denotation of start mode: autostart, reference start, and manual start</li>
</ul>
</li>
<li>The CLI now does user authentication and administrator authorization on all actions.</li>
<li>Most registration parameters can be dynamically modified without re-registering the service.</li>
<li>Dynamic modification of pinger properties automatically restarts the pinger.</li>
<li>Debug support: a service may be registered to connect back to a debug port when it is started.</li>
<li>Multiple per-service admins: a service may register a list of ids which are allowed to perform administrative functions for that service.</li>
</ul>
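<p>
The failure window mentioned in the Ping API actions above can be pictured as a count of instance failures within a sliding time interval. The following Python sketch is illustrative only and is not DUCC's implementation or its Ping API: the class name <code>FailureWindow</code>, its parameters, and the policy of flagging a service once the limit is exceeded are all assumptions made for this example.
</p>

```python
from collections import deque


class FailureWindow:
    """Hypothetical sliding failure window (not the DUCC Ping API)."""

    def __init__(self, max_failures, window_seconds):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = deque()  # timestamps of recent failures

    def record_failure(self, now):
        """Record a failure at time `now` (seconds) and drop expired ones."""
        self._failures.append(now)
        self._expire(now)

    def _expire(self, now):
        # Forget failures that have fallen out of the window.
        while self._failures and now - self._failures[0] >= self.window_seconds:
            self._failures.popleft()

    def exceeded(self, now):
        """True when more than max_failures failures lie inside the window."""
        self._expire(now)
        return len(self._failures) > self.max_failures


window = FailureWindow(max_failures=2, window_seconds=60)
for t in (0, 10, 20):
    window.record_failure(t)
too_many_at_20 = window.exceeded(20)  # three failures within 60 seconds
too_many_at_65 = window.exceeded(65)  # the failure at t=0 has expired
```

<p>
In this sketch a pinger would stop requesting restarts while <code>exceeded</code> returns true, and resume once old failures age out of the window.
</p>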
<h3>2.2 ducc_ling Changes</h3>
<p>
All registered groups are set for processes.
Users may set DUCC_UMASK to establish the umask for a process.
</p>
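<p>
As a reminder of what a umask does (standard POSIX behavior, not specific to DUCC): bits set in the umask are cleared from the permissions a process requests when it creates a file. The Python sketch below only illustrates that arithmetic. The helper name <code>effective_mode</code> is hypothetical, and how ducc_ling reads the value of DUCC_UMASK is not shown here.
</p>

```python
def effective_mode(requested_mode, umask):
    """Return the permission bits left after the umask is applied."""
    return requested_mode & ~umask & 0o777


# A file requested with rw-rw-rw- (0o666) under a umask of 0o027
# ends up as rw-r----- (0o640).
file_mode = effective_mode(0o666, 0o027)
# A directory requested with rwxrwxrwx (0o777) under the same umask
# ends up as rwxr-x--- (0o750).
dir_mode = effective_mode(0o777, 0o027)
print(oct(file_mode), oct(dir_mode))
```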
<h3>2.3 Resource Manager Changes</h3>
<ul>
<li>Administrative CLI interface:
<ul>
<li>Vary-off a node to temporarily exclude it from scheduling</li>
<li>Vary-on a node to return it to the scheduling pool</li>
<li>Query occupancy: for each node, shows what is scheduled there</li>
<li>Query load: summary of scheduling tables to allow external entities such as LSF to collaborate with the DUCC scheduler</li>
</ul>
</li>
<li>Miscellaneous enhancements:
<ul>
<li>Better handling of failed nodes: all work other than reservations is purged</li>
<li>Improved de-fragmentation logic</li>
<li>Improved handling of small clusters</li>
<li>Improved eviction: the amount of work that would be lost is taken into account before a process is scheduled for eviction</li>
</ul>
</li>
</ul>
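<p>
The eviction improvement can be illustrated with a small sketch: when a process must be evicted, prefer the one whose eviction would discard the least work. This is an illustration under stated assumptions, not the Resource Manager's actual algorithm. The <code>Process</code> record, its <code>work_at_risk</code> field, and the function name are hypothetical.
</p>

```python
from dataclasses import dataclass


@dataclass
class Process:
    """Hypothetical view of a scheduled process (not a DUCC class)."""
    name: str
    work_at_risk: float  # e.g. seconds of in-flight work lost on eviction


def pick_eviction_candidate(processes):
    """Choose the process whose eviction loses the least work."""
    if not processes:
        return None
    return min(processes, key=lambda p: p.work_at_risk)


candidates = [
    Process("p1", work_at_risk=540.0),
    Process("p2", work_at_risk=12.5),
    Process("p3", work_at_risk=95.0),
]
victim = pick_eviction_candidate(candidates)
print(victim.name)
```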
<h3>2.4 Web Server Changes</h3>
<p>
Added Node visualization
</p>
For a complete list of issues fixed and up-to-date information on UIMA-DUCC issues, see our issue tracker:
<a href="https://issues.apache.org/jira/issues/?jql=project%20%3D%20UIMA%20AND%20fixVersion%20%3D%20%221.1.0-Ducc%22%20ORDER%20BY%20key%20ASC">https://issues.apache.org/jira/issues/?jql=project%20%3D%20UIMA%20AND%20fixVersion%20%3D%20%221.1.0-Ducc%22%20ORDER%20BY%20key%20ASC</a>
</p>
<h2><a name="limitations">3. Limitations in this Release</a></h2>
<h3>3.1 Firefox memory bloat</h3>
<p>
DUCC's Web Server includes JavaScript that lets users monitor various
aspects of the DUCC system from a browser.
It has occasionally been observed that when several browser tabs are
open simultaneously, each containing an "Automatic" monitor of one
aspect of the DUCC system, the browser process may consume a large
amount of memory (on the order of several GB) over a relatively long
period of time (on the order of days).
At the time of this writing, the problem cannot be reliably reproduced.
It has not been observed in "Manual" monitoring mode.
The memory bloat has only been observed with the Firefox browser.
</p>
</body>
</html>