blob: a4e1e8cb0280b71fe20f17e85c39f6105f309a66 [file] [log] [blame]
%
% Licensed to the Apache Software Foundation (ASF) under one
% or more contributor license agreements. See the NOTICE file
% distributed with this work for additional information
% regarding copyright ownership. The ASF licenses this file
% to you under the Apache License, Version 2.0 (the
% "License"); you may not use this file except in compliance
% with the License. You may obtain a copy of the License at
%
% http://www.apache.org/licenses/LICENSE-2.0
%
% Unless required by applicable law or agreed to in writing,
% software distributed under the License is distributed on an
% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
% KIND, either express or implied. See the License for the
% specific language governing permissions and limitations
% under the License.
%
% Force internal link target to this chapter
% Reference page from javadoc via <a href="/doc/duccbook.html#DUCC_CLI">whatever text</a>
\ifpdf
\else
\HCode{<a name='DUCC_CLI'></a>}
\fi
\chapter{Command Line Interface}
\label{chap:cli}
\paragraph{Overview}
The DUCC CLI is the primary means of communication with DUCC. Work is submitted, work is
canceled, work is monitored, and work is queried with this interface.
All parameters may be passed to all the CLI commands in the form of Unix-like ``long-form''
(key, value) pairs, in which the key is proceeded by the characters ``$--$". As well, the
parameters may be saved in a standard Java Properties file, without the leading ``$--$''
characters. Both a properties file and command-line parameters may be passed to each CLI. When
both are present, the parameters on the command line take precedence. Take, for example
the following simple job properties file, call it {\tt 1.job}, where the environment variable
``DH'' has been set to the location of \ducchome.
\begin{verbatim}
description Test job 1
classpath ${DH}/lib/uima-ducc/examples/*
environment AE_INIT_TIME=5 AE_INIT_RANGE=5 LD_LIBRARY_PATH=/a/nother/path
scheduling_class normal
driver_descriptor_CR org.apache.uima.ducc.test.randomsleep.FixedSleepCR
driver_descriptor_CR_overrides jobfile=${DH}/lib/examples/simple/1.inputs compression=10
error_rate=0.0
driver_jvm_args -Xmx500M
process_descriptor_AE org.apache.uima.ducc.test.randomsleep.FixedSleepAE
process_memory_size 4
process_jvm_args -Xmx100M
process_pipeline_count 2
process_per_item_time_max 5
process_deployments_max 999
\end{verbatim}
This can be submitted, overriding the scheduling class and memory, thus:
\begin{verbatim}
ducc_submit --specification 1.job --process_memory_size 16 --scheduling_class high
\end{verbatim}
The DUCC CLI parameters are now described in detail.
\section{The DUCC Job Descriptor}
The DUCC Job Descriptor includes properties to enable automated management and scale-out
over large computing clusters. The job descriptor includes
\begin{itemize}
\item References to the various UIMA components required by the job (CR, CM, AE, CC, and maybe DD)
\item Scale-out requirements: number of processes, number of threads per process, etc
\item Environment requirements: log directory, working directory, environment variables, etc,
\item JVM parameters
\item Scheduling class
\item Error-handling preferences: acceptable failure counts, timeouts, etc
\item Debugging and monitoring requirements and preferences
\end{itemize}
\section{Operating System Limit Support}
The CLI supports specification of operating system limits applied to the various job processes.
To specify a limit, pass the name of the limit and its value in the {\em environment} specified
in the job. Limits are named with the string ``DUCC\_RLIMIT\_name'' where ``name'' is the name of
a specific limit. Supported limits include:
\begin{itemize}
\item DUCC\_RLIMIT\_CORE
\item DUCC\_RLIMIT\_CPU
\item DUCC\_RLIMIT\_DATA
\item DUCC\_RLIMIT\_FSIZE
\item DUCC\_RLIMIT\_MEMLOCK
\item DUCC\_RLIMIT\_NOFILE
\item DUCC\_RLIMIT\_NPROC
\item DUCC\_RLIMIT\_RSS
\item DUCC\_RLIMIT\_STACK
\item DUCC\_RLIMIT\_AS
\item DUCC\_RLIMIT\_LOCKS
\item DUCC\_RLIMIT\_SIGPENDING
\item DUCC\_RLIMIT\_MSGQUEUE
\item DUCC\_RLIMIT\_NICE
\item DUCC\_RLIMIT\_STACK
\item DUCC\_RLIMIT\_RTPRIO
\end{itemize}
See the Linux documentation for details on the meanings of these limits and their values.
For example, to set the maximum number of open files allowed in any job process, specify
an environment similar to this when submitting the job:
\begin{verbatim}
ducc_submit .... --environment="DUCC_RLIMIT_NOFILE=1024" ...
\end{verbatim}
\section{Command Line Forms}
The Command Line Interface is provided in several forms:
\begin{enumerate}
\item A wrapper script around the uima-ducc-cli.jar.
\item Direct invocation of each command's {\tt class} with the {\tt java} command.
\end{enumerate}
When using the scripts the full execution environment is established
silently. When invoking a command's {\tt class} directly, the java {\tt CLASSPATH}
must include the uima-ducc-cli.jar, as illustrated in the wrapper scripts.
\section{DUCC Commands}
The following commands are provided:
\begin{description}
\item[ducc\_submit] Submit a job for execution.
\item[ducc\_cancel] Cancel a job in progress.
\item[ducc\_reserve] Request a reservation of a machine.
\item[ducc\_unreserve] Cancel a reservation.
\item[ducc\_monitor] Monitor the progress of a job that is already submitted.
\item[ducc\_process\_submit] Submit an arbitrary process (managed reservation) for execution.
\item[ducc\_process\_cancel] Cancel an arbitrary process.
\item[ducc\_services] Register, unregister, start, stop, modify, disable, enable,
ignore references, observe references, and query a service.
\item[viaducc] This is a script wrapper to facilitate execution of Eclipse workspaces as
DUCC jobs as well as general execution of arbitrary processes in DUCC-managed resources.
\end{description}
The next section describes these commands in detail.
%% These all input sections
\input{part2/cli/ducc-submit.tex}
\input{part2/cli/ducc-cancel.tex}
% \input{part2/cli/ducc-monitor.tex}
\input{part2/cli/ducc-reserve.tex}
\input{part2/cli/ducc-unreserve.tex}
% service submit/cancel not part of the public CLI/API
% \input{part2/cli/ducc-service-submit.tex}
% \input{part2/cli/ducc-service-cancel.tex}
\input{part2/cli/ducc-process-submit.tex}
\input{part2/cli/ducc-process-cancel.tex}
\input{part2/cli/ducc-services.tex}
\input{part2/cli/viaducc.tex}
% Create well-known link to this spot for HTML version
\ifpdf
\else
\HCode{<a name='DUCC_API'></a>}
\fi
\chapter{The DUCC Public API}
\label{chap:api}
\input{part2/ducc-api.tex}
% Create well-known link to this spot for HTML version
\ifpdf
\else
\HCode{<a name='DUCC_SERVICES'></a>}
\fi
\chapter{Service Management}
\label{chap:services}
\input{part2/services.tex}
%% this inputs a chapter
\input{part2/job-logs.tex}
%% this inputs a chapter
\input{part2/job-errors.tex}
%% this inputs a chapter
\input {part2/webserver.tex}