Posted to commits@uima.apache.org by ch...@apache.org on 2013/09/20 14:37:47 UTC

svn commit: r1524983 - in /uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook: part2/services.tex part4/admin/ducc-classes.tex

Author: challngr
Date: Fri Sep 20 12:37:47 2013
New Revision: 1524983

URL: http://svn.apache.org/r1524983
Log:
UIMA-2682 Updates for new RM configuration.

Modified:
    uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex
    uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex

Modified: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex
URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex?rev=1524983&r1=1524982&r2=1524983&view=diff
==============================================================================
--- uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex (original)
+++ uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part2/services.tex Fri Sep 20 12:37:47 2013
@@ -308,7 +308,7 @@ public class CustomPing
 {
     String host;
     String port;
-    public void init(String endpoint) throws Exception {
+    public void init(String args, String endpoint) throws Exception {
         // Parse the service endpoint, which is a String of the form 
         //    host:port
         String[] parts = endpoint.split(":");

Modified: uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex
URL: http://svn.apache.org/viewvc/uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex?rev=1524983&r1=1524982&r2=1524983&view=diff
==============================================================================
--- uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex (original)
+++ uima/sandbox/uima-ducc/trunk/uima-ducc-duccdocs/src/site/tex/duccbook/part4/admin/ducc-classes.tex Fri Sep 20 12:37:47 2013
@@ -1,185 +1,196 @@
-\section{DUCC Class Definitions}
+\section{Scheduler Configuration: Classes and Nodepools}
 \label{sec:ducc.classes}
 
-    The class configuration file is used by the Resource Manager configure the rules used for job 
-    scheduling. See the Resource Manager chapter for a detailed description of the DUCC schedueler. 
-
-    The name of class configuration file is specified in ducc.properties. The default name is 
-    ducc.classes [105] and is specified by the property ducc.rm.class.definitions property. 
-
-    This file configures the classes and the associate scheduling rules of each class. It contains 
-    properties to declare the following: 
+The class configuration file is used by the Resource Manager to configure the rules used for job
+scheduling. See the \hyperref[sec:]{Resource Manager chapter} for a detailed description of the DUCC
+scheduler, scheduling classes, and how classes are used to configure the scheduling process.
+
+The name of the scheduler configuration file is specified in ducc.properties by the property
+{\em ducc.rm.class.definitions}. The default name is ducc.classes.
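+For example, the default setting appears in ducc.properties as:
+\begin{verbatim}
+    # Location of the scheduler class configuration file
+    ducc.rm.class.definitions = ducc.classes
+\end{verbatim}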
+  
+\subsection{Nodepools}
+
+\subsubsection{Overview}
+    A {\em nodepool} is a grouping of a subset of the physical nodes to allow differing
+    scheduling policies to be applied to different nodes in the system.  Some typical
+    nodepool groupings might include:
     \begin{enumerate}
-      \item The names of each class. 
-      \item The default class to use if none is specified with the job. 
-      \item The names of all the nodepools. 
-      \item For each nodepool, the name of the file containing member nodes. 
-      \item A set of properties for each class, declaring the rules enforced by that class. 
+      \item Group Intel and Power nodes separately so that users may submit jobs that run
+        only on Intel architecture, only on Power, or ``don't care''.
+      \item Designate a group of nodes with large locally attached disks so that users
+        can run jobs that require those disks.
+      \item Designate a specific set of nodes with specialized hardware, such as a high-speed
+        network, so that jobs can be scheduled to run only on those nodes.
     \end{enumerate}
 
-    The general properties are as follows. The default values are the defaults in the system as initially 
-    installed. 
-
-    \begin{description}
+    A Nodepool is a subset of some larger collection of nodes.  Nodepools themselves may be
+    further subdivided.  Nodepools may not overlap: every node belongs to exactly one
+    nodepool.  During system start-up the consistency of the nodepool definitions is checked
+    and the system will refuse to start if the configuration is incorrect.
+
+    For example, the diagram below is an abstract representation of all the nodes in a
+    system.  There are five nodepools defined:
+    \begin{itemize}
+      \item Nodepool ``Default'' is subdivided into three pools, NP1, NP2, and NP3.  All
+        the nodes not contained in NP1, NP2, and NP3 belong to the pool called ``Default''.
+      \item Nodepool NP1 is not further subdivided.
+      \item Nodepool NP2 is not further subdivided.
+      \item Nodepool NP3 is further subdivided to form NP4.  All nodes within NP3 but
+        not in NP4 are contained in NP3.
+      \item Nodepool NP4 is not further subdivided.
+    \end{itemize}
+
+    \begin{figure}[H]
+      \centering
+      \includegraphics[bb=0 0 241 161, width=5.5in]{images/Nodepool1.jpg}
+      \caption{Nodepool Example}
+      \label{fig:Nodepools1}
+    \end{figure}
 
-      \item[scheduling.class.set] \hfill \\
-        This defines the set of class names for the installation.  The names themselves are arbitrary
-        and correspond to the rules defined in subsequent properties.
-
-        \begin{description}
-          \item[Default Value] background low normal high urgent weekly fixed reserve JobDriver 
-        \end{description}
-          
-      \item[scheduling.default.name] \hfill \\
-        This is the default class that jobs are assigned to, when not otherwise designated in their 
-        submission properties. 
-        \begin{description}
-          \item[Default Value] normal 
-        \end{description}
-    \end{description}        
+    In the figure below the Nodepools are incorrectly defined for two reasons:
+    \begin{enumerate}
+       \item NP1 and NP2 overlap.
+       \item NP4 overlaps both nodepool ``Default'' and NP3.
+    \end{enumerate}
     
-    Nodepools are declared with a set of properties to name each nodepool and to name a file for 
-    each pool that declares membership in the nodepool. For each nodepool a property of the form 
-    scheduling.nodepool.NODEPOOLNAME is declared, where NODEPOOLNAME is one of the 
-    declared nodepools. 
-
-    The property to declare nodepool names is as follows: 
-
-    \begin{description}
-      \item[scheduling.nodepool] \hfill \\
-      This is the list of nodepool names. For example: 
-\begin{verbatim}
-      scheduling.nodepool = res res1 res2 
-\end{verbatim}
-      \begin{description}
-        \item[Default Value] reserve 
-      \end{description}
-    \end{description}
+    \begin{figure}[H]
+      \centering
+      \includegraphics[bb=0 0 241 161, width=5.5in]{images/Nodepool2.jpg}
+      \caption{Nodepools: Overlapping Pools are Incorrect}
+      \label{fig:Nodepools2}
+    \end{figure}
+
+    Multiple ``top-level'' nodepools are allowed.  A ``top-level'' nodepool has no containing
+    pool.  Multiple top-level pools logically divide a cluster of machines into {\em multiple
+      independent clusters} from the standpoint of the scheduler.  Work scheduled over one
+    pool in no way affects work scheduled over the other pool.  The figure below shows an
+    abstract nodepool configuration with two top-level nodepools, ``Top-NP1'' and ``Top-NP2''.
+    \begin{figure}[H]
+      \centering
+      \includegraphics[bb=0 0 496 161, width=5.5in]{images/Nodepool3.jpg}
+      \caption{Nodepools: Multiple top-level Nodepools}
+      \label{fig:Nodepools3}
+    \end{figure}
+
+\subsubsection{Scheduling considerations}
+    A primary goal of the scheduler is to ensure that no resources are left idle if there
+    is pending work that is able to use those resources.  Therefore, work scheduled to
+    a class defined over a specific nodepool (say, NpAllOfThem) may be scheduled on nodes
+    in any of the nodepools contained within NpAllOfThem.  If work defined over a
+    subpool (such as NP1) arrives, processes on nodes in NP1 that were scheduled for
+    NpAllOfThem are considered ``squatters'' and are the most likely candidates for
+    eviction. (Processes assigned to their proper nodepools are considered ``residents''
+    and are evicted only after all ``squatters'' have been evicted.)  The scheduler strives
+    to avoid creating ``squatters''.
+
+    Because non-preemptable processes cannot be preempted, work submitted to a class
+    implementing one of the non-preemptable policies (FIXED or RESERVE) is never allowed
+    to ``squat'' in other nodepools and is scheduled only on the nodes in its
+    proper nodepool.
+
+    In the case of multiple top-level nodepools: these nodepools and their subpools
+    form independent scheduling groups.  Specifically, fair-share allocations over any
+    nodepool in one top-level pool does NOT affect the fair-share allocations for jobs
+    in any other top-level nodepool.
+
+\subsubsection{Configuration}
+    DUCC uses a simplified JSON-like structure to define nodepools.
+
+    At least one nodepool definition is required.  This nodepool need not have any subpools or node
+    definitions.  The first top-level nodepool is considered the ``default'' nodepool.  Any node
+    that checks in with DUCC but is not specifically named in one of the node files is assigned
+    to this first, or ``default'', nodepool.
+
+    Thus, if only one nodepool is defined with no other attributes, all nodes are
+    assigned to that pool.
+
+    A nodepool definition consists of the token ``Nodepool'' followed by its
+    name, followed by a block delimited with ``curly'' braces \{ and \}.  This
+    block contains the attributes of the nodepool as key/value pairs.
+    Line ends are ignored.  A semicolon ``;'' may optionally be used to
+    delimit key/value pairs for readability, and an equals sign ``='' may optionally
+    be used to delimit keys from values, also just for readability.
+
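+    For illustration, the following two declarations of a hypothetical nodepool are
+    equivalent; the first spreads the key/value pairs over several lines, the second
+    uses the optional ``;'' and ``='' delimiters on a single line:
+\begin{verbatim}
+    Nodepool intel {
+        nodefile intel.nodes
+        parent   --default--
+    }
+
+    Nodepool intel { nodefile = intel.nodes ; parent = --default-- }
+\end{verbatim}
+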
+    The attributes of a Nodepool are:
+    \begin{description}
+      \item[domain] This is valid only in the ``default'' nodepool.  Any node
+        in any node file which does not have a domain, and any node which checks
+        in with the scheduler without a domain name, is assigned this domain name
+        so that the scheduler may deal entirely with fully-qualified node names.
+      \item[nodefile] This is the name of a file containing the names of the nodes
+        which are members of this nodepool.
+      \item[parent] This is used to indicate which nodepool is the logical parent.
+        Any nodepool without a ``parent'' is considered a top-level nodepool.
+    \end{description}
         
-    This is an example of a declaration of three nodepools. 
-
-\begin{verbatim}
-scheduling.nodepool = res res1 res1 
-scheduling.nodepool.res = res.nodes 
-scheduling.nodepool.res1 = res1.nodes 
-scheduling.nodepool.res2 = res2.nodes 
-\end{verbatim}
-    
-    There is no way to enforce priority assignment to any given nodepool. It is possible to declare a 
-    "preference", such that the resources in a given nodepool are considered first when searching for 
-    nodes. To configure a preference, use the order decorattion on a nodepool specificaion. 
-
-    To declare nodepool order, specify the property {\tt scheduling.nodepool.[poolname].order}. The
-    nodepools are sorted numerically according to their order, and pools with lower order are
-    searched before pools with higher order. The global nodepool always order "0" so it is usally
-    searched first. For example, the pool configuration below establishes a search order of
-
+    The following example defines six nodepools, 
     \begin{enumerate}
-      \item global 
-      \item res2 
-      \item res 
-      \item res1 
+      \item A top-level nodepool called ``--default--'',
+      \item A top-level nodepool called ``jobdriver'',
+      \item A subpool of ``--default--'' called ``intel'',
+      \item A subpool of ``--default--'' called ``power'',
+      \item A subpool of ``intel'' called ``nightly-test'',
+      \item And a subpool of ``power'' called ``timing-p7''.
     \end{enumerate}
     
-    This is an example of a declaration of three nodepools. 
-
 \begin{verbatim}
-scheduling.nodepool = res res1 res1 
-scheduling.nodepool.res = res.nodes 
-scheduling.nodepool.res.order = 4 
-scheduling.nodepool.res1 = res1.nodes 
-scheduling.nodepool.res1.order = 7 
-scheduling.nodepool.res2 = res2.nodes 
-scheduling.nodepool.res2.order = 2 
-\end{verbatim}
+    Nodepool --default--  { domain bluej.net }
+    Nodepool jobdriver    { nodefile jobdriver.nodes }
     
-    For each class named in scheduling.class.set a set of properties is specified, defining the rules 
-    implemented by that class. Each such property is of the form 
+    Nodepool intel        { nodefile intel.nodes        ; parent --default-- }
+    Nodepool power        { nodefile power.nodes        ; parent --default-- }
 
-\begin{verbatim}
-scheduling.class.CLASSNAME.RULE = VALUE 
+    Nodepool nightly-test { nodefile nightly-test.nodes ; parent intel }
+    Nodepool timing-p7    { nodefile timing-p7.nodes    ; parent power }
 \end{verbatim}
     
-    where 
-    \begin{description}
-      \item[CLASSNAME] specifies is the name of the class. 
-      \item[RULE] specifies rule. Rules are described below. 
-      \item[VALUE] specifies the value of the rule, as described below. 
-      \end{description}
-      
-      The rules are: 
-      \begin{description}
+\subsection{Class Definitions}
 
-        \item[policy] \hfill \\
-          This is the scheduling policy, required, and must be one of: 
-          \begin{itemize}
-            \item[] FAIR\_SHARE 
-            \item[] FIXED\_SHARE 
-            \item[] RESERVE 
-          \end{itemize}
-            
-        \item[share\_weight] \hfill \\
-          This is any integer. This is the weighted-fair-share weight for the class as discussed above. It is 
-          only used when policy = FAIR\_SHARE. 
-
-        \item[priority] \hfill \\
-          This is the evaluation priority for the class as discussed above. This is used for all scheduling 
-          policies. 
-
-        \item[cap] \hfill \\
-          This is an integer, or an integer with "\%" appended to denote a percentage. It is used for all 
-          scheduling classes. 
-
-          This is the class cap as discussed above. It may be an absolute value, in processes (which may 
-          comprise more than one share quanta), or it may be specified as a percentage by appending 
-          "\%" to the end. When specified as a percentage, it caps the shares allocated to this class as 
-          that percentage of the total shares remaining when the class is evaluated.. It does not consider 
-          shares that may have been available and assigned to higher-priority classes. 
-
-        \item[nodepool] \hfill \\
-          This is the name of the nodepool associated with this class. It must be one of the names 
-          declared in the property scheduling.nodepool. 
-
-        \item[prediction] \hfill \\
-          Acceptable values are true and false. When set to true the scheduler uses prediction when 
-          allocating shares. It is only used when policy = FAIR\_SHARE. 
-
-        \item[prediction.fudge] \hfill \\
-          Acceptable values are any integer, denoting milliseconds. This is the prediction fudge as 
-          discussed above. It is only used when policy = FAIR\_SHARE. 
-
-        \item[expand.by.doubling] \hfill \\
-          Acceptable values are true and false. When set to true the scheduler doubles a job's shares 
-          up to it's fair-share when possible, as discussed above. It is only used when policy = 
-          FAIR\_SHARE. 
-
-        \item[expand.by.doubling] \hfill \\
-          Acceptable values are true and false. When set to true the scheduler doubles a job's shares up 
-          to it's fair-share when possible, as discussed above. When set in ducc.classes it overrides the 
-          defaults from ducc.properties. It is only used when policy = FAIR\_SHARE. 
-
-        \item[initialization.cap] \hfill \\
-          Acceptable values are any integer. This is the maximum number of processes assigned to a job 
-          until the first process has successfully completed initialization. To disable the cap, set it to zero 
-          0. It is only used when policy = FAIR\_SHARE. 
-
-        \item[max\_processes] \hfill \\
-          Acceptable values are any integer. This is the maximum number of processes assigned to a 
-          FIXED\_SHARE request. If more are requested, the request is canceled. It is only used when 
-          policy = FIXED\_SHARE. If set to 0 or not specified, there is no enforced maximum. 
-
-        \item[max\_machines] \hfill \\
-          Acceptable values are any integer. This is the maximum number of machines assigned to a 
-          RESERVE request. If more are requested, the request is canceled. It is only used when policy = 
-          RESERVE. If set to 0 or not specified, there is no enforced maximum. 
-
-        \item[enforce.memory] \hfill \\
-          Acceptable values are true and false. When set to true the scheduler requires that any machine 
-          selected for a reservation matches the reservation's declared memory. The declared memory 
-          is converted to a number of quantum shares. Only machines whose memory, when converted 
-          to share quanta are selected. When set to false, any machine in the configured nodepool is 
-          selected. It is only used when policy = RESERVE. 
-      \end{description}
-          
+    Scheduler classes are defined in the same simplified JSON-like language as
+    nodepools.
 
-        
+    A simple inheritance (or ``template'') scheme is supported for classes.  Any
+    class may be configured to ``derive'' from any other class.  In this case, the
+    child class acquires all the attributes of the parent class, any of which may
+    be selectively overridden.  Multiple inheritance is not supported but
+    nested inheritance is; that is, class A may inherit from class B which inherits
+    from class C and so on. In this way, generalized templates for the site's
+    class structure may be defined.  
+
+    The general form of a class definition consists of the keyword Class, followed
+    by the name of the class, and then optionally by the name of a ``parent'' class
+    whose characteristics it inherits.   Following the name (and optionally parent class
+    name) are the attributes of the class, also within a \{ \} block.
+
+    The attributes defined for classes are:
+    \begin{description}
+      \item[abstract] If specified, this indicates this class is a template ONLY. It is used
+        as a model for other classes.  Values are ``true'' or ``false''.  The default is
+        ``false''.
+      \item[cap] This specifies the largest number of shares any job in this class
+        may be assigned.  It may be an absolute number or a percentage.  If specified as
+        a percentage (i.e. it contains a trailing \%), it specifies a percentage of the
+        total nodes in the containing nodepool.
+      \item[debug] FAIR\_SHARE only. This specifies the name of a class to substitute
+        for jobs submitted for debug.
+      \item[expand-by-doubling] FAIR\_SHARE only.  If ``true'', and the ``initialization-cap'' is
+        set, then after any process has initialized, the job will expand to its maximum allowable
+        shares by doubling in size each scheduling cycle.
+      \item[initialization-cap] FAIR\_SHARE only. If specified, this is the largest number of processes this job
+        may be assigned until at least one process has successfully completed initialization.
+      \item[max-processes] FIXED\_SHARE only.  This is the largest number of FIXED\_SHARE,
+        non-preemptable shares any single job may be assigned.
+      \item[prediction-fudge] FAIR\_SHARE only. When the scheduler is considering expanding the
+        number of processes for a job it tries to determine if the job may complete before those
+        processes are allocated and initialized.  The ``prediction-fudge'' adds some amount of 
+        time (in milliseconds) to the projected completion time.  This allows installations to
+        prevent jobs from expanding when they were otherwise going to end in a few minutes
+        anyway.
+      \item[nodepool] If specified, jobs for this class are assigned to nodes in this nodepool. 
+      \item[policy] This is the scheduling policy, one of FAIR\_SHARE, FIXED\_SHARE, or RESERVE. This
+        attribute is required (there is no default).
+      \item[priority] This is the scheduling priority for jobs in this class.
+      \item[weight] FAIR\_SHARE only. This is the fair-share weight for jobs in this class.
+      
+    \end{description}
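+
+    As an illustrative sketch only (the class names, weights, and caps below are
+    placeholders, not values shipped with DUCC), a site might define an abstract
+    template class and derive two fair-share classes from it:
+\begin{verbatim}
+    Class fair-base {
+        abstract   true
+        policy     FAIR_SHARE
+        nodepool   intel
+        priority   10
+        weight     100
+    }
+
+    Class normal fair-base { cap 50% }
+    Class high   fair-base { weight 200 ; initialization-cap 2 }
+\end{verbatim}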
+