| % |
| % Licensed to the Apache Software Foundation (ASF) under one |
| % or more contributor license agreements. See the NOTICE file |
| % distributed with this work for additional information |
| % regarding copyright ownership. The ASF licenses this file |
| % to you under the Apache License, Version 2.0 (the |
| % "License"); you may not use this file except in compliance |
| % with the License. You may obtain a copy of the License at |
| % |
| % http://www.apache.org/licenses/LICENSE-2.0 |
| % |
| % Unless required by applicable law or agreed to in writing, |
| % software distributed under the License is distributed on an |
| % "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| % KIND, either express or implied. See the License for the |
| % specific language governing permissions and limitations |
| % under the License. |
| % |
| % Force internal link target to this chapter |
| % Reference page from javadoc via <a href="/doc/duccbook.html#DUCC_CLI">whatever text</a> |
| \ifpdf |
| \else |
| \HCode{<a name='DUCC_CLI'></a>} |
| \fi |
| \chapter{Command Line Interface} |
| \label{chap:cli} |
| |
| \paragraph{Overview} |
| The DUCC CLI is the primary means of communication with DUCC. Work is submitted, work is |
| canceled, work is monitored, and work is queried with this interface. |
| |
| All parameters may be passed to all the CLI commands in the form of Unix-like ``long-form'' |
| (key, value) pairs, in which the key is proceeded by the characters ``$--$". As well, the |
| parameters may be saved in a standard Java Properties file, without the leading ``$--$'' |
| characters. Both a properties file and command-line parameters may be passed to each CLI. When |
| both are present, the parameters on the command line take precedence. Take, for example |
| the following simple job properties file, call it {\tt 1.job}, where the environment variable |
| ``DH'' has been set to the location of \ducchome. |
| \begin{verbatim} |
| description Test job 1 |
| |
| classpath ${DH}/lib/uima-ducc/examples/* |
| environment AE_INIT_TIME=5 AE_INIT_RANGE=5 LD_LIBRARY_PATH=/a/nother/path |
| scheduling_class normal |
| |
| driver_descriptor_CR org.apache.uima.ducc.test.randomsleep.FixedSleepCR |
| driver_descriptor_CR_overrides jobfile=${DH}/lib/examples/simple/1.inputs compression=10 |
| error_rate=0.0 |
| |
| driver_jvm_args -Xmx500M |
| |
| process_descriptor_AE org.apache.uima.ducc.test.randomsleep.FixedSleepAE |
| process_memory_size 4 |
| process_jvm_args -Xmx100M |
| process_pipeline_count 2 |
| process_per_item_time_max 5 |
| process_deployments_max 999 |
| |
| \end{verbatim} |
| |
| This can be submitted, overriding the scheduling class and memory, thus: |
| \begin{verbatim} |
| ducc_submit --specification 1.job --process_memory_size 16 --scheduling_class high |
| \end{verbatim} |
| |
| The DUCC CLI parameters are now described in detail. |
| |
| \section{The DUCC Job Descriptor} |
| The DUCC Job Descriptor includes properties to enable automated management and scale-out |
| over large computing clusters. The job descriptor includes |
| \begin{itemize} |
| \item References to the various UIMA components required by the job (CR, CM, AE, CC, and maybe DD) |
| \item Scale-out requirements: number of processes, number of threads per process, etc |
| \item Environment requirements: log directory, working directory, environment variables, etc, |
| \item JVM parameters |
| \item Scheduling class |
| \item Error-handling preferences: acceptable failure counts, timeouts, etc |
| \item Debugging and monitoring requirements and preferences |
| \end{itemize} |
| |
| \section{Operating System Limit Support} |
| The CLI supports specification of operating system limits applied to the various job processes. |
| To specify a limit, pass the name of the limit and its value in the {\em environment} specified |
| in the job. Limits are named with the string ``DUCC\_RLIMIT\_name'' where ``name'' is the name of |
| a specific limit. Supported limits include: |
| \begin{itemize} |
| \item DUCC\_RLIMIT\_CORE |
| \item DUCC\_RLIMIT\_CPU |
| \item DUCC\_RLIMIT\_DATA |
| \item DUCC\_RLIMIT\_FSIZE |
| \item DUCC\_RLIMIT\_MEMLOCK |
| \item DUCC\_RLIMIT\_NOFILE |
| \item DUCC\_RLIMIT\_NPROC |
| \item DUCC\_RLIMIT\_RSS |
| \item DUCC\_RLIMIT\_STACK |
| \item DUCC\_RLIMIT\_AS |
| \item DUCC\_RLIMIT\_LOCKS |
| \item DUCC\_RLIMIT\_SIGPENDING |
| \item DUCC\_RLIMIT\_MSGQUEUE |
| \item DUCC\_RLIMIT\_NICE |
| \item DUCC\_RLIMIT\_STACK |
| \item DUCC\_RLIMIT\_RTPRIO |
| \end{itemize} |
| See the Linux documentation for details on the meanings of these limits and their values. |
| |
| For example, to set the maximum number of open files allowed in any job process, specify |
| an environment similar to this when submitting the job: |
| \begin{verbatim} |
| ducc_submit .... --environment="DUCC_RLIMIT_NOFILE=1024" ... |
| \end{verbatim} |
| |
| \section{Command Line Forms} |
| The Command Line Interface is provided in several forms: |
| |
| \begin{enumerate} |
| \item A wrapper script around the uima-ducc-cli.jar. |
| \item Direct invocation of each command's {\tt class} with the {\tt java} command. |
| \end{enumerate} |
| |
| When using the scripts the full execution environment is established |
| silently. When invoking a command's {\tt class} directly, the java {\tt CLASSPATH} |
| must include the uima-ducc-cli.jar, as illustrated in the wrapper scripts. |
| |
| \section{DUCC Commands} |
| The following commands are provided: |
| \begin{description} |
| \item[ducc\_submit] Submit a job for execution. |
| \item[ducc\_cancel] Cancel a job in progress. |
| \item[ducc\_reserve] Request a reservation of a machine. |
| \item[ducc\_unreserve] Cancel a reservation. |
| \item[ducc\_monitor] Monitor the progress of a job that is already submitted. |
| \item[ducc\_process\_submit] Submit an arbitrary process (managed reservation) for execution. |
| \item[ducc\_process\_cancel] Cancel an arbitrary process. |
| \item[ducc\_services] Register, unregister, start, stop, modify, disable, enable, |
| ignore references, observe references, and query a service. |
| \item[viaducc] This is a script wrapper to facilitate execution of Eclipse workspaces as |
| DUCC jobs as well as general execution of arbitrary processes in DUCC-managed resources. |
| \end{description} |
| |
| The next section describes these commands in detail. |
| |
| %% These all input sections |
| \input{part2/cli/ducc-submit.tex} |
| \input{part2/cli/ducc-cancel.tex} |
| % \input{part2/cli/ducc-monitor.tex} |
| \input{part2/cli/ducc-reserve.tex} |
| \input{part2/cli/ducc-unreserve.tex} |
| % service submit/cancel not part of the public CLI/API |
| % \input{part2/cli/ducc-service-submit.tex} |
| % \input{part2/cli/ducc-service-cancel.tex} |
| \input{part2/cli/ducc-process-submit.tex} |
| \input{part2/cli/ducc-process-cancel.tex} |
| \input{part2/cli/ducc-services.tex} |
| \input{part2/cli/viaducc.tex} |
| |
| The next sections describe various tools in detail. |
| |
| \input{part2/cli/tools-ducc-status.tex} |
| \input{part2/cli/tools-ducc-watcher.tex} |
| |
| % Create well-known link to this spot for HTML version |
| \ifpdf |
| \else |
| \HCode{<a name='DUCC_API'></a>} |
| \fi |
| |
| \chapter{The DUCC Public API} |
| \label{chap:api} |
| \input{part2/ducc-api.tex} |
| |
| % Create well-known link to this spot for HTML version |
| \ifpdf |
| \else |
| \HCode{<a name='DUCC_SERVICES'></a>} |
| \fi |
| \chapter{Service Management} |
| \label{chap:services} |
| \input{part2/services.tex} |
| |
| %% this inputs a chapter |
| |
| \input{part2/job-logs.tex} |
| |
| %% this inputs a chapter |
| |
| \input{part2/job-errors.tex} |
| |
| %% this inputs a chapter |
| \input {part2/webserver.tex} |
| |