| % |
| % Licensed to the Apache Software Foundation (ASF) under one |
| % or more contributor license agreements. See the NOTICE file |
| % distributed with this work for additional information |
| % regarding copyright ownership. The ASF licenses this file |
| % to you under the Apache License, Version 2.0 (the |
| % "License"); you may not use this file except in compliance |
| % with the License. You may obtain a copy of the License at |
| % |
| % http://www.apache.org/licenses/LICENSE-2.0 |
| % |
| % Unless required by applicable law or agreed to in writing, |
| % software distributed under the License is distributed on an |
| % "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| % KIND, either express or implied. See the License for the |
| % specific language governing permissions and limitations |
| % under the License. |
| % |
| |
| \section{Administrative Commands} |
| |
The administrative commands include commands to start DUCC, stop it, verify the
configuration, and query the state of the cluster.
| |
Note: The scripting that supports some of these functions runs (by default) in multi-threaded mode so that
large clusters can be started, stopped, and queried quickly. If DUCC is running on an older
system where the threading is not supported, the scripts detect this and fall back to
single-threaded operation. In addition, all of these commands support a ``--nothreading'' option to manually
disable the threading.
| |
| \subsection{start\_ducc} |
| \label{subsec:admin.start-ducc} |
| |
| \subsubsection{{\em Description}} |
| The command \ducchome/admin/start\_ducc is used to start DUCC processes. |
| It must be run from the head node. |
If run with no parameters, it takes the following actions:
| \begin{itemize} |
| \item Starts the ActiveMQ server. |
| \item Starts the database. |
| \item Starts the management processes Resource Manager, Orchestrator, Process Manager, |
| Services Manager, and Web Server on the local node. |
| \item Starts an agent process on every node named in the default node list. |
| \end{itemize} |
| |
| \subsubsection{{\em Usage}} |
| |
| \begin{description} |
| \item[start\_ducc {[options]}] \hfill \\ |
| If no options are given, all DUCC processes are started, using the default node list, |
| {\em ducc.nodes}. |
| |
| \end{description} |
| |
| \subsubsection{{\em Options: }} |
| \begin{description} |
| |
| \item[-n, --nodelist {[nodefile] }] \hfill \\ |
| Start agents on the nodes in the nodefile. Multiple nodefiles may be specified: |
| \begin{verbatim} |
start_ducc -n foo.nodes -n bar.nodes -n baz.nodes
| \end{verbatim} |
| |
| |
| \item[-c, --component {[component] }] \hfill \\ |
| Start a specific DUCC component, optionally on a specific node. If the component |
| name is qualified with a nodename, the component is started on that node. To qualify |
| a component name with a destination node, use the notation component@nodename. |
| Multiple components may be specified: |
| \begin{verbatim} |
start_ducc -c sm -c pm -c rm -c or@bj22 -c agent@n1 -c agent@n2
| \end{verbatim} |
| |
| Components include: |
| \begin{description} |
\item[rm] The Resource Manager
\item[or] The Orchestrator
\item[pm] The Process Manager
\item[sm] The Service Manager
\item[ws] The Web Server
\item[agent@node] Node Agents
\item[broker] The ActiveMQ broker
\item[db] The database
| \end{description} |
| |
| \item[--nothreading] If specified, the command does not run in multi-threaded mode |
| even if it is supported on the local platform. |
| |
| \end{description} |
| |
| \subsubsection{{\em Notes: }} |
A different nodelist may be used to specify where Agents are started. In addition, multiple node
lists may be specified, in which case Agents are started on all the nodes in all of the
lists.
| |
| To start only agents, run start\_ducc specifying a nodelist explicitly. Note that the broker |
| must have already been started. |
| |
To start a specific management process, run start\_ducc with the -c component option,
specifying the component that should be started.
| |
| \subsubsection{{\em Examples: }} |
| |
Start agents on the nodes in two different nodelists and start the broker:
| \begin{verbatim} |
start_ducc -n foo.nodes -n bar.nodes -c broker
| \end{verbatim} |
| |
| Start an agent on a specific node: |
| \begin{verbatim} |
start_ducc -c agent@a.specific.node
| \end{verbatim} |
| |
| Start the webserver on node 'bingle': |
| \begin{verbatim} |
start_ducc -c ws@bingle
| \end{verbatim} |
| |
| \subsubsection{{\em Debugging:}} |
| |
Sometimes a component will not start, and it can be difficult to understand why. To diagnose, it is
helpful to know that {\em start\_ducc} is simply a wrapper around a lower-level bit of scripting
that does the actual work. That lower-level code can be invoked stand-alone, in which case
console messages that {\em start\_ducc} would have suppressed are presented on the console.
| |
The lower-level script is called {\em ducc.py} and accepts the same {\em -c component} flag as
start\_ducc. If some component will not start, try running {\em ducc.py -c component} directly.
It starts in the foreground, and usually the cause of the problem becomes evident from
the console output.
| |
| For example, suppose the Resource Manager will not start. Run the following: |
| \begin{verbatim} |
| ./ducc.py -c rm |
| \end{verbatim} |
and examine the output. Use {\em CTRL-C} to stop the component when done.
| |
| |
| \subsection{stop\_ducc} |
| \label{subsec:admin.stop-ducc} |
| |
| \subsubsection{{\em Description:}} |
| Stop\_ducc is used to stop DUCC processes. At least one parameter is required. |
| When {\em -a} is specified, the following actions are taken: |
| \begin{itemize} |
\item Uses the ActiveMQ broker to broadcast a shutdown request to all
DUCC components other than the ActiveMQ broker itself and the database.
\item Waits for the daemons to stop (60 seconds by default; see the {\em -w} option).
| \item Stops the database. |
| \item Stops the ActiveMQ broker. |
| \end{itemize} |
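
For example, a complete shutdown of the cluster is a single command (a minimal sketch; like
the other administrative commands, it is run from the head node):
\begin{verbatim}
stop_ducc -a
\end{verbatim}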
| |
| |
| \subsubsection{\em Usage:} |
| |
| \begin{description} |
| \item[stop\_ducc {[options]}] \hfill \\ |
| If no options are given, help text is presented. At least one option is required, to avoid |
| accidental cluster shutdown. |
| \end{description} |
| |
| |
\subsubsection{{\em Options:}}
| \begin{description} |
| |
\item[-a, --all] \hfill \\
Stop all the DUCC processes, including agents and management processes. This
broadcasts a ``shutdown'' command to all DUCC processes. Shutdown is normally
performed gracefully, with all processes given time to save state.
All user processes, both jobs and services, are sent shutdown signals. Job and service
processes that do not shut down within a designated grace period are then forcibly
terminated with kill -9.
| |
\item[-n, --nodelist {[nodefile]}] \hfill \\
Only the DUCC agents in the designated nodelists are shut down. The processes are sent
kill -INT signals, which trigger the Java shutdown hooks and enable graceful shutdown.
All user processes on the indicated nodes, both jobs and services, are sent ``shutdown''
signals and are given a minute to shut down gracefully. Job and service processes that do
not shut down within the designated grace period are then forcibly terminated with kill -9.
| |
| \begin{verbatim} |
stop_ducc -n foo.nodes -n bar.nodes -n baz.nodes
| \end{verbatim} |
| |
| \item[-c, --component {[component]}] \hfill \\ |
| Stop a specific DUCC component. |
| |
| This may be used to stop an errant management component and subsequently restart it |
| (with start\_ducc). |
| |
| This may also be used to stop a specific agent and the job and services processes it is |
| managing, without the need to specify a nodelist. |
| |
| Examples: |
| |
| Stop agents on nodes n1 and n2: |
| \begin{verbatim} |
| stop_ducc -c agent@n1 -c agent@n2 |
| \end{verbatim} |
| |
| Stop and restart the rm: |
| \begin{verbatim} |
| stop_ducc -c rm |
| start_ducc -c rm |
| \end{verbatim} |
| |
| Components include: |
| \begin{description} |
| \item[rm] The Resource Manager. |
| \item[or] The Orchestrator. |
| \item[pm] The Process Manager. |
| \item[sm] The Service Manager. |
| \item[ws] The Web Server. |
| \item[db] The database. |
| \item[broker] The ActiveMQ broker (only if the broker is auto-managed). |
| \item[agent@node] Node Agent on the specified node. |
| \end{description} |
| |
\item[-w, --wait {[time in seconds]}] If given, this specifies the time to wait
after broadcasting the shutdown signal and before stopping the ActiveMQ broker itself.
If not specified, the default is 60 seconds.
| |
NOTE: In production systems it is generally wise to use the default of 60 seconds. For
test systems a shorter wait speeds cycle time. Be sure to use {\em check\_ducc -k} after
{\em stop\_ducc} if you change the wait time, to ensure all processes are actually stopped.
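
For example, on a test system (a sketch; the wait time is illustrative, and the
{\em check\_ducc -k} step verifies nothing was left running):
\begin{verbatim}
stop_ducc -a -w 30
check_ducc -k
\end{verbatim}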
| |
| \item[--nothreading] If specified, the command does not run in multi-threaded mode |
| even if it is supported on the local platform. |
| |
| \end{description} |
| |
| \subsubsection{{\em Notes:}} |
Sometimes problems in the network or elsewhere prevent the DUCC components from stopping properly. The
{\em check\_ducc} command, described in the following section, contains options to query the
existence of DUCC processes in the cluster, to forcibly terminate them ({\em kill -9}), and to
terminate them more gracefully ({\em kill -INT}).
| |
| |
| |
| \subsection{check\_ducc} |
| \label{subsec:admin.check-ducc} |
| \subsubsection{{\em Description:}} |
| |
| Check\_ducc is used to verify the integrity of the DUCC installation and to find and report on |
| DUCC processes. It identifies processes owned by ducc (management processes, agents, |
| and job processes), and processes started by DUCC on behalf of users. |
| |
Check\_ducc can also be used to clean up errant DUCC processes when stop\_ducc is unable
to do so. The difference is that stop\_ducc generally tries to stop processes more gracefully;
check\_ducc is used as a last resort, or when a fast but graceless shutdown is desired.
| |
| \subsubsection{\em{Usage: }} |
| |
| \begin{description} |
| \item[check\_ducc {[options]}] |
If no options are given, this is the equivalent of:
| \begin{verbatim} |
| check_ducc -c -n ../resources/ducc.nodes |
| \end{verbatim} |
| |
| This verifies the integrity of the DUCC installation and searches for all the |
| processes owned by user {\em ducc} and started by DUCC on all the nodes in ducc.nodes. |
| \end{description} |
| |
| \subsubsection{\em{Options:}} |
| \begin{description} |
\item[-n, --nodelist {[nodefile]}]
Only the nodes specified in the nodefile are searched. The option may be specified
multiple times for multiple nodefiles. Note that the ``local'' node is always checked as well.
| \begin{verbatim} |
| check_ducc -n nlist1 -n nlist2 |
| \end{verbatim} |
| |
\item[-c, --configuration]
| Verify the \hyperref[sec:ducc.classes]{Resource Manager configuration}. |
| |
\item[-p, --pids]
| Rewrite the PID file. The PID file contains the process ids of all known DUCC |
| management and agent processes. The PID file is normally managed by start\_ducc and |
| stop\_ducc and is stored in the file {\em ducc.pids} in directory {\em ducc\_runtime/state}. |
| |
| Occasionally the PID file can become partially or fully corrupted; for example, if a DUCC |
| process dies spontaneously. Use check\_ducc -p to search the cluster for processes and |
| refresh the PID file. |
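
For example, to rebuild the PID file after a daemon has died unexpectedly:
\begin{verbatim}
check_ducc -p
\end{verbatim}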
| |
| \item[-i, --int] \hfill \\ |
Use this to send a shutdown signal ({\em kill -INT}) to all the DUCC processes. The DUCC processes
catch this signal, close their resources, and exit. Some resources take time to close or, in
case of problems, may be unable to close, in which case the DUCC processes exit unconditionally.
| |
| Sometimes problems in the network or elsewhere prevent {\em check\_ducc -i} from terminating |
| the DUCC processes. In this case, use {\em check\_ducc -k}, described below. |
| |
| \item[-k, --kill] \hfill \\ |
| Use this to forcibly kill a component using kill -9. This should only be used if {\em stop\_ducc} |
| or {\em check\_ducc -i} does not work. |
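
A typical escalation sequence when the cluster will not stop cleanly might be (a sketch:
first attempt a graceful stop, then force termination of anything that remains):
\begin{verbatim}
check_ducc -i
check_ducc -k
\end{verbatim}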
| |
| \item[--nothreading] If specified, the command does not run in multi-threaded mode |
| even if it is supported on the local platform. |
| |
| \item[-v, --verbose] \hfill \\ |
| When specified with {\em -c} to check the configuration, this emits a formatted version |
| of the node list showing the full structure of the scheduling classes. |
| |
| |
| \end{description} |
| |
| |
| \subsection{ducc\_post\_install} |
| \label{subsec:admin.ducc-post-install} |
| |
| \paragraph{Description:} |
| The post-installation script must be run only after the first installation of DUCC. |
| When updating an existing installation use \hyperref[subsec:admin.ducc-update]{\em ducc\_update}. |
| ducc\_post\_install performs these tasks: |
| \begin{enumerate} |
\item Verifies that the correct levels of Java and Python are installed and available.
| \item Creates a default nodelist, \duccruntime/resources/ducc.nodes, containing the name of the node you are installing on. |
| \item Defines the ``ducc head'' node to be the node you are installing from. |
| \item Initializes the database. |
| \item Sets up the default https keystore for the webserver. |
| \item Installs the DUCC documentation ``ducc book'' into the DUCC webserver root. |
| \item Builds and installs the C program, ``ducc\_ling'', into the default location. |
| \item Ensures that the (supplied) ActiveMQ broker is runnable. |
| \end{enumerate} |
| |
Once the script completes successfully, \hyperref[subsec:admin.start-ducc]{\em start\_ducc} will run a single-user/unprivileged DUCC.
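
For example, a first-time setup and startup might look like this (a sketch; the
installation path is illustrative):
\begin{verbatim}
cd /home/ducc/ducc_runtime/admin    # your DUCC_HOME/admin
./ducc_post_install
./start_ducc
\end{verbatim}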
| |
| \paragraph{Notes:} |
If the script is rerun, it renames the previously created files so that any customizations
applied to them can be recovered.
| |
| \subsection{ducc\_update} |
| \label{subsec:admin.ducc-update} |
| |
| \paragraph{Description:} |
This command is used to unpack a new release of DUCC and create a new installation or update
an existing one.
| For a new installation it simply unpacks the tar file with the appropriate permissions. |
| The setup must be completed by running \hyperref[subsec:admin.ducc-post-install]{\em ducc\_post\_install}. |
| |
| When updating an existing installation it performs the following actions: |
| \begin{enumerate} |
| \item Checks that DUCC is not running. |
| \item Creates a site.ducc.properties file if updating from DUCC 1.1.0. |
| \item Creates a time-stamped archive directory to hold the old runtime. |
| \item Archives current files before updating them, except for the customizable ones. |
\item Leaves in place any files added to the directories that may hold site-specific files.
\item Reports which files are replaced, added, or kept.
| \item Rebuilds the non-privileged ducc\_ling. |
| \end{enumerate} |
| |
| The site-specific files, those holding customizations such as node and class definitions |
| as well as logs and job history, are left in place, |
| while all replaced files are archived under a folder called {\em ducc\_archives} |
| so the previous installation can be restored if necessary. |
| |
Note that the update does not create the database. After updating to 2.1.0 from an earlier
version that used the file-based persistence scheme, the database should be created with
\hyperref[subsec:admin.db-create]{\em db\_create},
and the files holding state such as job history and service registrations should be loaded into the database with
\hyperref[subsec:admin.db-loader]{\em db\_loader}, as shown below.
If this conversion is omitted, DUCC will continue to use the file-based scheme, but with some
loss of the functionality that the database design provides.
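
A minimal conversion sequence might look like this (a sketch; the DUCC\_HOME path is
illustrative):
\begin{verbatim}
db_create
db_loader -i /home/ducc/ducc_runtime
\end{verbatim}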
| |
| \paragraph{Usage:} |
This command takes two parameters: the DUCC\_HOME to be updated or created,
and the name of the tar file containing the new build.
| \begin{description} |
| \item[ducc\_update {\em some-ducc-home} {\em binary-tar-file}] |
| Update an existing DUCC installation or install a new one. |
| \end{description} |
| |
| \paragraph{Arguments:} |
| \begin{description} |
| \item[{\em some-ducc-home}] |
| This specifies the DUCC\_HOME you wish to create or update. If it doesn't exist a new |
| installation is created, otherwise it is updated. |
| \item[{\em binary-tar-file}] |
| The name of the binary tar file containing the new build. |
| \end{description} |
| |
| \paragraph{Example:} |
| \begin{verbatim} |
ducc_update /home/ducc/ducc_runtime Downloads/uima-ducc-2.1.0-bin.tar.gz
| \end{verbatim} |
| |
| \subsection{rm\_reconfigure} |
| \label{subsec:admin.rm-reconfigure} |
| |
| \subsubsection{{\em Description:}} |
| Rm\_reconfigure is used to force the Resource Manager (RM) to reread all its configuration |
| files and reconfigure itself accordingly, without the need to fully stop and restart RM. |
| This is generally much faster than RM restart and avoids losing most state messages from |
| the other DUCC processes. |
| |
The {\em rm\_reconfigure} command first performs a
\hyperref[sec:admin.properties-merge]{properties merge}.
| |
| RM then validates the new |
| configuration, and if no errors are found, saves certain information such as current node |
| online-offline status. It then rereads the following configuration files and rebuilds its |
| internal structures accordingly: |
| \begin{itemize} |
\item ducc.properties (after merging default.ducc.properties and site.ducc.properties),
| \item ducc.classes, |
| \item log4j.xml. |
| \end{itemize} |
| The saved configuration is then restored into the newly configured structures. |
| On receipt of the next Orchestrator state, the RM fully rebuilds its state from the current |
| DUCC load and scheduling restarts. |
| |
| Depending on the nature of the new configuration, the current load may be adjusted; for |
| example, if the weight of a fair-share class is changed, preemptions or extra allocations |
| may be performed. |
| |
| If the new configuration is not consistent with the current load, a number of more drastic |
| adjustments will be performed: |
| \begin{itemize} |
| \item If a fair-share class is deleted, all existing jobs for that class are preempted |
| and a {\em refusal} message is sent to the Orchestrator for each affected job. |
\item If a fair-share class is redefined over a different nodepool such that existing
work is no longer legally scheduled, any shares allocated on inappropriate
hosts are {\em preempted}. As soon as the preemptions are acknowledged, the RM
reschedules the shares over the differently-configured resources.
\item If a non-preemptable class is deleted or reconfigured so that existing non-preemptable
work is no longer allocated correctly, the following will occur:
| \begin{itemize} |
| \item If the shares are for services, they are deallocated and a {\em refusal} is |
| sent to the Orchestrator. The Service Manager will observe this and restart the |
| processes, causing them to be reallocated over the changed configuration. |
\item Otherwise, the RM leaves the allocation in place, but places the hosts on an
internal {\em blacklist}, preventing subsequent scheduling to those hosts. Once
the (now) incorrectly placed shares are freed (e.g.\ by canceling a reservation or
by the exit of a managed reservation), the hosts are whitelisted again and made available
for scheduling.
| \end{itemize} |
| \end{itemize} |
| |
| In short, the RM makes every effort to avoid disturbing existing allocations, and blacklists |
| hosts that are no longer consistently configured for the current load, until such time as |
| the allocations on those hosts are released. |
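
For example, to apply a configuration change without restarting the RM (a sketch; the
edit step stands for whatever change is being made):
\begin{verbatim}
# edit DUCC_HOME/resources/ducc.classes or site.ducc.properties, then:
rm_reconfigure
\end{verbatim}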
| |
| \subsubsection{\em Usage:} |
| |
| \begin{description} |
| \item[rm\_reconfigure] \hfill \\ |
| This command has no options. |
| \end{description} |
| |
| |
| \iffalse % Dropped this script for 2.1 .... needs work |
| |
| \subsection{rm\_qload} |
| \label{subsec:admin.rm-qload} |
| |
| \subsubsection{{\em Description:}} |
| Rm\_qload is used to query the Resource Manager's scheduling tables to determine the |
| current demand and capacity of the system, as the RM sees it. The primary purpose |
| is to provide information to adjunct resource managers (such as a ``cloud'') to |
| determine the current needs, or lack thereof, of the system. The administrative |
| command is implemented as a Python script that interacts with the underlying |
| Java ``RmQueryLoadReply'' API and is provided mostly as an example of how |
| scripting can be used to interact with the RM. |
| |
| After displaying the current scheduling quantum, the response is provided in two sections: |
| \begin{enumerate} |
| \item Information showing the current demand and usage of resource classes, and |
| \item Information showing the current nodepool usage. |
| \end{enumerate} |
| |
| \subsubsection{\em Class section} |
| Three lines are emitted per class: |
| \begin{enumerate} |
| \item The name of the class and its scheduling policy, |
| \item A line showing the {\em demand}, or {\em request} by quantum, on the class, and |
| \item A line showing the {\em usage}, or {\em award}, by quantum on the class. |
| \end{enumerate} |
| |
| The numbers shown for {\em request} and {\em award} show the number of processes, by |
| memory, in terms of scheduling quantum, for each class. For example, assuming the |
| scheduling quantum is 15GB, the following shows: |
| \begin{itemize} |
| \item Five processes of quantum 2 (15-30GB) are requested, but only two have been awarded, |
| \item Three processes of quantum 3 (31-45GB) are requested and all have been awarded, |
| \item Four processes of quantum 4 (46-60GB) are requested, and two have been awarded. |
| \end{itemize} |
| \begin{verbatim} |
| Class normal policy FAIR_SHARE |
| requested 0 0 5 3 4 0 0 0 0 |
| awarded 0 0 2 3 2 0 0 0 0 |
| \end{verbatim} |
| |
| \subsubsection{\em Nodepool section} |
| Six lines are displayed for each nodepool: |
| \begin{enumerate} |
| \item The name of the nodepool, |
\item A summary showing the number of hosts in the pool which are online, dead (unresponsive), and
| varied-off, the total quantum shares available to the nodepool, and the total unscheduled or |
| {\em free} shares. |
| \item The number of hosts known to the nodepool, by quantum, similar to the class listings above, |
\item The number of online hosts, by quantum,
| \item The number of completely free hosts by quantum (no work currently scheduled), and |
| \item The number of {\em virtual} hosts, by quantum. A {\em virtual host} is created when a |
host is partially scheduled. For example, if a 32G process is scheduled on a 64G host, this
| creates one free 32G {\em virtual host}. |
| \end{enumerate} |
| To determine the number of processes, by quantum, that can be scheduled, one must {\em sum} the |
| ``free'' and ``virtual'' columns. |
| |
| For example, (assuming a 15GB quantum), the following listing shows |
| that nodepool ``power'' contains fourteen hosts with at least 45GB each (3 quanta). Two |
| of these hosts have something scheduled on them (the ``free |
| machines'' line), leaving unused space of one 15G quantum on one |
| host, and one 30GB quantum on another host. |
| |
| \begin{verbatim} |
| Nodepool power |
| online 14 dead 0 offline 0 total-shares 42 free-shares 42 |
| all machines: 0 0 0 14 0 0 0 0 0 |
| online machines: 0 0 0 0 0 0 0 0 0 |
| free machines: 0 0 0 12 0 0 0 0 0 |
| virtual machines: 0 1 1 0 0 0 0 0 0 |
| \end{verbatim} |
| |
| \subsubsection{\em Usage:} |
| \begin{description} |
| \item[rm\_qload] \hfill \\ |
| This command has no options. |
| \end{description} |
| |
| \fi % End of dropped rm_qload |
| |
| \subsection{rm\_qoccupancy} |
| \label{subsec:admin.rm-qoccupancy} |
| |
| \subsubsection{{\em Description:}} |
Rm\_qoccupancy provides a list of all hosts known to the RM and, for each host, the following information:
\begin{itemize}
\item The name of the host,
\item Whether the host has any blacklisted processes on it,
\item Whether the host is currently online (responsive),
\item The status of the host: whether it is schedulable ({\em up}) or not ({\em down}). A responsive host becomes
unschedulable ({\em down}) if it is varied off,
| \item The nodepool the host is a member of, |
| \item The reported memory size of the host, |
| \item The {\em order} of the host. The {\em order} is defined to be the maximum number of quantum shares |
| supported by the host, |
| \item The number of unscheduled quantum shares on the host, and |
\item If work is scheduled on the host, information relevant to the scheduled processes (or reservations).
| \end{itemize} |
| |
| If work is scheduled on a host, the work summary is keyed thus: |
| \begin{description} |
| \item[J] The Orchestrator-assigned job id of the work, |
| \item[S] The RM-assigned share id of the work, |
| \item[O] The {\em order} of the allocation; that is, the number of quantum shares the allocation occupies, |
\item[II] The {\em initialization investment}; the number of milliseconds the allocated work spent in its
initialization phase, if any (usually only UIMA-AS processes display this),
\item[IR] The {\em runtime investment}; the number of milliseconds spent processing the current CASs, if this
is a UIMA-AS process. Note that this number can change dramatically, as it is the sum of time spent only
on the current CASs. When a CAS completes, it no longer contributes to the investment of the process. The RM
uses this information to determine the best candidate for eviction, if needed to maintain fair share.
\item[E] Whether the RM has preempted (evicted) the process, but it has not yet exited,
\item[P] Whether the RM has purged the process (evicted because the host is non-responsive), but the
eviction has not yet been confirmed,
\item[F] Whether the process is {\em fixed}; that is, non-preemptable,
\item[I] Whether the initialization phase is complete (usually only for UIMA-AS processes).
| \end{description} |
| |
| The following example shows seven hosts, one with a preemptable share in the {\em --default--} |
| nodepool (on bluej290-5), and one with a non-preemptable share in the {\em jobdriver} nodepool. |
| \begin{verbatim} |
| Node Blacklisted Online Status Nodepool Memory Order Free |
| bluej290-5 False True up --default-- 32505856 2 0 |
| J[ 6006] S[ 189] O[2] II[ 0] IR[ 0] E[False] P[False] F[False] I[False] |
| |
| bluej290-6 False True up --default-- 32505856 2 2 |
| bluej290-7 False True up --default-- 32505856 2 2 |
| bluej291-26 False True up nightly-test 32505856 2 2 |
| bluej291-27 False True up nightly-test 32505856 2 2 |
| bluej293-60 False True up intel 32505856 2 2 |
| bluej537-73 False True up jobdriver 32505856 2 1 |
| J[ 5973] S[ 1] O[1] II[ 0] IR[ 0] E[False] P[False] F[ True] I[False] |
| |
| |
| \end{verbatim} |
| |
| |
| \subsubsection{\em Usage:} |
| |
| \begin{description} |
| \item[rm\_qoccupancy] \hfill \\ |
| This command has no options. |
| \end{description} |
| |
| |
| |
| \subsection{vary\_off} |
| \label{subsec:admin.vary-off} |
| \subsubsection{{\em Description:}} |
| |
Vary\_off is used to remove a host from scheduling and to evict the preemptable work running on it.
This allows for the graceful clearance of a host so that it can be taken offline for maintenance,
or for any other purpose (such as sharing the host with other applications).
The DUCC agent is NOT stopped; use \hyperref[subsec:admin.stop-ducc]{stop\_ducc} to stop the
agent.
Managed and unmanaged reservations are not canceled by {\em vary\_off}.
| |
| Only the userid that started DUCC may issue {\em vary\_off}; attempts from other userids |
| are rejected. |
| |
| \subsubsection{\em{Usage: }} |
| |
| \begin{description} |
| \item[vary\_off list-of-hosts] |
The {\em list-of-hosts} is a space-delimited list of hosts to be removed from
scheduling in the DUCC cluster.
| \end{description} |
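
For example, to clear two hosts for maintenance (hostnames are illustrative):
\begin{verbatim}
vary_off node290-1 node290-2
\end{verbatim}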
| |
| \subsection{vary\_on} |
| \label{subsec:admin.vary-on} |
| \subsubsection{{\em Description:}} |
| |
| Vary\_on is used to restore a host to scheduling by DUCC. If the agent is still |
| alive the host becomes immediately available. The agent is not started by |
{\em vary\_on}; use
| \hyperref[subsec:admin.start-ducc]{start\_ducc} to start the agent if needed. |
| |
| Only the userid that started DUCC may issue {\em vary\_on}; attempts from other userids |
| are rejected. |
| |
| \subsubsection{\em{Usage: }} |
| |
| \begin{description} |
| \item[vary\_on list-of-hosts] |
The {\em list-of-hosts} is a space-delimited list of hosts to be restored for
scheduling in the DUCC cluster.
| \end{description} |
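
A typical maintenance cycle combines the two commands (a sketch; the hostname is
illustrative):
\begin{verbatim}
vary_off node290-1
# ... perform maintenance on node290-1 ...
vary_on node290-1
\end{verbatim}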
| |
| |
| \subsection{ducc\_properties\_manager} |
| \label{sec:cli.ducc-properties-manager} |
| |
| \paragraph{Description:} |
| This CLI is used to manually merge or difference two properties files. |
| |
Normally, the DUCC scripts {\em start\_ducc}, {\em check\_ducc}, and {\em rm\_reconfigure} automatically
merge the files {\em default.ducc.properties} and {\em site.ducc.properties} when invoked.
| |
| \paragraph{Usage:} |
| \begin{description} |
| \item[ducc\_props\_manager --merge file1 --with file2 --to file3] |
| Merge two properties files into one. Properties added to, or changed in, the second file |
| are used to override those in the first file, with the result written to the third file. |
| \item[ducc\_props\_manager --delta file1 --with file2 --to file3] |
| Compare two properties files and write the differences into a third file. The first file is |
| considered a ``master'' file. Properties with different values in the second file, or which |
| do not occur in the first file, are written into the third file. |
| \end{description} |
| |
| \paragraph{Options:} |
| \begin{description} |
\item[$--$merge file1]
In this form, {\em file1} is merged with the file specified by {\em $--$with}, and the
result is placed in the file specified by {\em $--$to}. Overrides are flagged with a change tag and the date of the merge.
| |
| {\em file1} is considered the ``master'' properties file and is usually the unmodified file provided |
| with the DUCC distribution, {\em default.ducc.properties}. |
| |
| {\em file2} is considered a set of override or additional properties and is usually the site local |
| properties file, {\em site.ducc.properties.} |
| |
\item[$--$delta file1]
In this form, {\em file1} is compared with the file specified by {\em $--$with}, and the
differences are placed in the file specified by {\em $--$to}.
| |
| {\em file1} is considered the ``master'' properties file and is usually the unmodified file provided |
| with the DUCC distribution, {\em default.ducc.properties}. |
| |
| {\em file2} is considered the ``external'' properties file and is usually the properties file from |
| an older version of DUCC. |
| |
Differences are placed in the file specified by {\em $--$to}, which may be a viable first cut at a new {\em site.ducc.properties.}
| |
| \item[$--$with file2] This specifies the properties file to merge with the master, or to difference |
| with the master properties file. |
| |
| \item[$--$to file3] This specifies the file to which the results of the merge or delta are written. |
| \end{description} |
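
\paragraph{Examples:}
The following sketches show typical invocations; the file names follow the conventions
described above, and the {\em old/ducc.properties} path is illustrative:
\begin{verbatim}
ducc_props_manager --merge default.ducc.properties \
    --with site.ducc.properties --to ducc.properties
ducc_props_manager --delta default.ducc.properties \
    --with old/ducc.properties --to site.ducc.properties
\end{verbatim}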
| |
| \paragraph{Notes:} |
| None. |
| |
| \subsection{db\_create} |
| \label{subsec:admin.db-create} |
| |
| \paragraph{Description:} |
| This command is used to initialize the database. Normally the database is initialized |
| during {\em ducc\_post\_install} but if this is an existing DUCC installation that is |
| being migrated from a version that does not use the database, it will be necessary to |
| initialize the database with this command. |
| |
| This command performs the following actions: |
| \begin{enumerate} |
| \item Starts the database. |
| \item Disables the default database superuser. |
| \item Installs a database superuser as ``ducc'' and sets the password |
| to a random string. The password is saved |
| in DUCC\_HOME/resources.private/ducc.private.properties. |
| \item Installs the DUCC database schema. |
| \item Stops the database. |
| \end{enumerate} |
| |
| |
| This command takes no parameters. |
| |
NOTE: The database user and password are NOT RELATED to any login ID on the system;
they are used and maintained by the database only.
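
For example, a minimal sketch of initializing the database and inspecting the generated
credentials (the DUCC\_HOME path is illustrative; the credentials file is the one named above):
\begin{verbatim}
db_create
cat /home/ducc/ducc_runtime/resources.private/ducc.private.properties
\end{verbatim}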
| |
| \subsection{db\_loader} |
| \label{subsec:admin.db-loader} |
| |
| \paragraph{Description:} |
| This command is used to copy the data from DUCC's older (pre 2.1.0) file-based persistence |
| into the database. The database schema must already exist, created either |
| with {\em ducc\_post\_install} or with {\em db\_create}. |
| |
| This command performs the following actions: |
| \begin{enumerate} |
| \item Starts the database. |
| \item Drops some of the indexes in the database. |
| \item Loads the Orchestrator checkpoint file from {\em DUCC\_HOME/state/orchestrator.chkpt}. |
| \item Loads all job history from {\em DUCC\_HOME/history/jobs}. |
| \item Loads all reservation history from {\em DUCC\_HOME/history/reservations}. |
| \item Loads all service instance and AP history from {\em DUCC\_HOME/history/services}. |
| \item Loads the service registry from {\em DUCC\_HOME/state/services}. |
\item Loads the service registry history from {\em DUCC\_HOME/history/service-registry}.
\item Reloads the Orchestrator checkpoint, as a spot-check of the loader's instrumentation (to ensure
load times stay reasonable).
| \item Re-installs the DUCC database schema. |
| \item Stops the database. |
\item Optionally renames the file-based state so that, if you rerun the command, the data is not reloaded.
| \end{enumerate} |
| |
| When the command exits, DUCC should be ready to run with all its state in the database. |
| |
This command takes two parameters: the DUCC\_HOME you want to load from, and
a flag to disable the rename of the file-based state.
| |
| \paragraph{Usage:} |
| \begin{description} |
| \item[db\_loader -i {\em some-ducc-home} {[--no-archive]}] |
| Load the database from the specified DUCC\_HOME, and optionally do not archive the original files |
| by renaming them. |
| \end{description} |
| |
| \paragraph{Options:} |
| \begin{description} |
\item[-i {\em some-ducc-home}]
| This specifies the DUCC\_HOME you wish to load. Most of the time it is the DUCC\_HOME you |
| are running within, but it can be some other DUCC\_HOME if you have multiple installations and |
| want other history and state loaded. |
\item[$--$no-archive]
| If specified, the original files are not renamed. Note that only the directories in {\em history} |
| and {\em state} are renamed. To restore these, simply rename them back without the {\em archive} |
| suffix. |
| \end{description} |
| |
| \paragraph{Example:} |
| \begin{verbatim} |
| db_loader -i /home/ducc/ducc_runtime |
| db_loader -i /home/ducc.old/ducc_runtime --no-archive |
| \end{verbatim} |
| |
| \paragraph{Notes:} |
The console shows the progress of the loader. Full details of the load are written to a log, {\em db-loader-log},
in the usual DUCC log directory, for reference and problem determination if something goes wrong.
| |