Posted to commits@slider.apache.org by st...@apache.org on 2015/11/19 19:42:22 UTC

svn commit: r1715237 - in /incubator/slider/site/trunk/content: developing/functional_tests.md developing/releasing.md developing/testing.md docs/api/slider_REST_api_v2.md docs/getting_started.md docs/security.md docs/troubleshooting.md

Author: stevel
Date: Thu Nov 19 18:42:22 2015
New Revision: 1715237

URL: http://svn.apache.org/viewvc?rev=1715237&view=rev
Log:
general improvements

Modified:
    incubator/slider/site/trunk/content/developing/functional_tests.md
    incubator/slider/site/trunk/content/developing/releasing.md
    incubator/slider/site/trunk/content/developing/testing.md
    incubator/slider/site/trunk/content/docs/api/slider_REST_api_v2.md
    incubator/slider/site/trunk/content/docs/getting_started.md
    incubator/slider/site/trunk/content/docs/security.md
    incubator/slider/site/trunk/content/docs/troubleshooting.md

Modified: incubator/slider/site/trunk/content/developing/functional_tests.md
URL: http://svn.apache.org/viewvc/incubator/slider/site/trunk/content/developing/functional_tests.md?rev=1715237&r1=1715236&r2=1715237&view=diff
==============================================================================
--- incubator/slider/site/trunk/content/developing/functional_tests.md (original)
+++ incubator/slider/site/trunk/content/developing/functional_tests.md Thu Nov 19 18:42:22 2015
@@ -81,7 +81,7 @@ If set:
 1. The standard `-site.xml` files are loaded by the JUnit test runner, to bond
 the test classes to the YARN cluster.
 1. The property is used to set the environment variable `HADOOP_CONF_DIR`
-before the `bin/slider` or bin\slider.py` script is executed.
+before the `bin/slider` or `bin\slider.py` script is executed.
 
 
 **Note 1:** a path can be set relative to ${SLIDER_CONF_DIR}

Modified: incubator/slider/site/trunk/content/developing/releasing.md
URL: http://svn.apache.org/viewvc/incubator/slider/site/trunk/content/developing/releasing.md?rev=1715237&r1=1715236&r2=1715237&view=diff
==============================================================================
--- incubator/slider/site/trunk/content/developing/releasing.md (original)
+++ incubator/slider/site/trunk/content/developing/releasing.md Thu Nov 19 18:42:22 2015
@@ -26,7 +26,7 @@ Here is our release process from Slider
 As well as everything needed to build slider, there are some extra requirements
 for releasing:
 
-1. Shell: (Currently: Bash; some `fish` examples too)
+1. Shell: `bash`
 1. [git flow](http://danielkummer.github.io/git-flow-cheatsheet/)
 1. OS/X and windows: [Atlassian SourceTree](http://www.sourcetreeapp.com/).
 This can perform the git flow operations, as well as show the state of your
@@ -35,6 +35,7 @@ git graph.
  
 ### Before you begin
 
+Read the [ASF incubator release manual](http://incubator.apache.org/guides/releasemanagement.html)
 
 Check out the latest version of the branch to be released,
 and run the tests. This should be done on a checked out
@@ -53,9 +54,9 @@ create HBase and Accumulo clusters in th
 *Make sure that the integration tests are passing (and not being skipped) before
 starting to make a release*
 
-*3.* Make sure there are no uncommitted files in your local repo. 
+Make sure there are no uncommitted files in your local repo. 
 
-*4.* If you are not building against a stable Hadoop release
+If you are not building against a stable Hadoop release:
 
   1. Check out the Hadoop branch you intend to build and test against —and include in
      the redistributable artifacts.
@@ -425,7 +426,7 @@ Clone this project and read its instruct
 ## Close the release in Nexus
 
 1. log in to [https://repository.apache.org/index.html](https://repository.apache.org/index.html)
-with your ASF username & LDAP password
+with your ASF username and LDAP password
 1. go to [Staging Repositories](https://repository.apache.org/index.html#stagingRepositories)
 1. find the latest slider repository in the list
 1. select it; 

Modified: incubator/slider/site/trunk/content/developing/testing.md
URL: http://svn.apache.org/viewvc/incubator/slider/site/trunk/content/developing/testing.md?rev=1715237&r1=1715236&r2=1715237&view=diff
==============================================================================
--- incubator/slider/site/trunk/content/developing/testing.md (original)
+++ incubator/slider/site/trunk/content/developing/testing.md Thu Nov 19 18:42:22 2015
@@ -27,4 +27,4 @@
 Slider core contains a suite of tests that are designed to run on the local machine,
 using Hadoop's `MiniDFSCluster` and `MiniYARNCluster` classes to create small,
 one-node test clusters. All the YARN/HDFS code runs in the JUnit process; the
-AM and spawned processeses run independently.
+AM and spawned processes run independently.

Modified: incubator/slider/site/trunk/content/docs/api/slider_REST_api_v2.md
URL: http://svn.apache.org/viewvc/incubator/slider/site/trunk/content/docs/api/slider_REST_api_v2.md?rev=1715237&r1=1715236&r2=1715237&view=diff
==============================================================================
--- incubator/slider/site/trunk/content/docs/api/slider_REST_api_v2.md (original)
+++ incubator/slider/site/trunk/content/docs/api/slider_REST_api_v2.md Thu Nov 19 18:42:22 2015
@@ -374,15 +374,15 @@ Core concepts:
 2. The live view of what is going on in the application under `/application/model`.
 
 
-## /application 
+## `/application`
 ### All Application resources
 
 All entries will be under the service path `/application`, which itself is under the `/ws/v1/` path of the Slider web interface.
 
-## /application/model/ : 
-### GET/ and, for some URLs, PUT view of the specification 
+## `/application/model/` : 
+### GET and, for some URLs, PUT view of the specification
 
-### /application/model/desired/ 
+### `/application/model/desired/`
 
 This is where the specification of the application (resources and configuration) can be read and written.
 
@@ -390,13 +390,11 @@ This is where the specification of the a
 
 2. Write accesses to `configuration` will only take effect on a cluster upgrade or restart
 
-### /application/model/resolved/
-
-
+### `/application/model/resolved/`
 
 The resolved specification: the one where the inheritance is implemented and, when we eventually do x-refs, all non-LAZY references are resolved. This lets the caller see the final configuration model.
 
-### /application/model/internal/
+### `/application/model/internal/`
 
 Read-only view of `internal.json`. Exported for diagnostics and completeness.
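+
+Non-normatively, these model resources can be read with plain GETs. A sketch, where
+`<am-host:port>` stands in for the AM web address and, as noted above, everything lives
+under the `/ws/v1/` path:
+
+```
+# read the desired, resolved and internal views of the application model
+curl http://<am-host:port>/ws/v1/application/model/desired/
+curl http://<am-host:port>/ws/v1/application/model/resolved/
+curl http://<am-host:port>/ws/v1/application/model/internal/
+```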
 
@@ -422,7 +420,7 @@ DELETE node_id will decommission all con
 
 1. "system" state: AM state, outstanding requests, upgrade in progress
 
-## /application/actions
+## `/application/actions`
 ### POST state changing operations
 
 These are for operations which are hard to represent in a simple REST view within the AM itself.
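+
+Non-normatively, such an operation would be triggered with a plain POST. In this sketch
+`<action>` is a placeholder for whichever action name the AM exposes, and `<am-host:port>`
+for the AM web address:
+
+```
+# <action> is a placeholder action name
+curl -X POST http://<am-host:port>/ws/v1/application/actions/<action>
+```
+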
@@ -445,8 +443,7 @@ All of these are GET operations on data
     <td>desired/resources.json extended with statistics of the actual, pending, and failed resource allocations.</td>
   </tr>
   <tr>
-    <td>live/containers
-</td>
+    <td>live/containers</td>
     <td>sorted list of container IDs</td>
   </tr>
   <tr>
@@ -542,33 +539,33 @@ This is different from a POST in that a
 
 # Non-normative Example Data structures
 
-## application/live/resources
+## `application/live/resources`
 
 The contents of `application/live/resources` on an application which only has an application master deployed. The entries in italic are the statistics related to the live state; the remainder are the original values.
+```
+{
+  "schema" : "http://example.org/specification/v2.0.0",
+  "metadata" : { },
+  "global" : { },
+  "credentials" : { },
+  "components" : {
+    "slider-appmaster" : {
+      "yarn.memory" : "1024",
+      "yarn.vcores" : "1",
+      "yarn.component.instances" : "1",
+      "yarn.component.instances.requesting" : "0",
+      "yarn.component.instances.actual" : "1",
+      "yarn.component.instances.releasing" : "0",
+      "yarn.component.instances.failed" : "0",
+      "yarn.component.instances.completed" : "0",
+      "yarn.component.instances.started" : "1"
 
-    {
-      "schema" : "http://example.org/specification/v2.0.0",
-      "metadata" : { },
-      "global" : { },
-      "credentials" : { },
-      "components" : {
-        "slider-appmaster" : {
-          "yarn.memory" : "1024",
-          "yarn.vcores" : "1",
-          "yarn.component.instances" : "1",
-          "yarn.component.instances.requesting" : "0",
-          "yarn.component.instances.actual" : "1",
-          "yarn.component.instances.releasing" : "0",
-          "yarn.component.instances.failed" : "0",
-          "yarn.component.instances.completed" : "0",
-          "yarn.component.instances.started" : "1"
-    
-        }
-    
-      }
-    
     }
 
+  }
+
+}
+```
 
 ## `live/liveness`
 
@@ -577,12 +574,13 @@ application as perceived by Slider itsel
 
 See `org.apache.slider.api.types.ApplicationLivenessInformation`
 
-    {
-      "allRequestsSatisfied": true,
-      "requestsOutstanding": 0
-    }
-        
- 
+```
+{
+  "allRequestsSatisfied": true,
+  "requestsOutstanding": 0
+}
+```
+
 Its initial/basic form counts the number of outstanding container requests.
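+
+A caller might poll this resource until the application is fully live. A minimal `bash`
+sketch, where `<am-host:port>` is again a placeholder for the AM web address and the
+string match is naively against the example payload above:
+
+```
+# wait until the AM reports all outstanding container requests as satisfied
+until curl -s http://<am-host:port>/ws/v1/application/live/liveness \
+    | grep -q '"allRequestsSatisfied": true'; do
+  sleep 5
+done
+```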
 
 This could be extended in future with more criteria, such as the minimum number/

Modified: incubator/slider/site/trunk/content/docs/getting_started.md
URL: http://svn.apache.org/viewvc/incubator/slider/site/trunk/content/docs/getting_started.md?rev=1715237&r1=1715236&r2=1715237&view=diff
==============================================================================
--- incubator/slider/site/trunk/content/docs/getting_started.md (original)
+++ incubator/slider/site/trunk/content/docs/getting_started.md Thu Nov 19 18:42:22 2015
@@ -50,7 +50,7 @@ The Slider deployment has the following
 
 * Required Services: HDFS, YARN and ZooKeeper
 
-* Oracle JDK 1.6 (64-bit)
+* Oracle JDK 1.7 (64-bit)
 
 * Python 2.6
 
@@ -80,6 +80,17 @@ In `yarn-site.xml` make the following mo
   </tr>
 </table>
 
+Example:
+
+    <property>
+      <name>yarn.scheduler.minimum-allocation-mb</name>
+      <value>256</value>
+    </property>
+    <property>
+      <name>yarn.nodemanager.delete.debug-delay-sec</name>
+      <value>3600</value>
+    </property>
+
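+After editing `yarn-site.xml` the ResourceManager and NodeManagers must be restarted for the
+changes to take effect. How to do that depends on your installation; as a sketch for a plain
+Apache Hadoop tarball layout (the paths are placeholders for your own):
+
+    $HADOOP_HOME/sbin/stop-yarn.sh
+    $HADOOP_HOME/sbin/start-yarn.sh
+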
 
 There are other options detailed in the Troubleshooting file available [here](/docs/troubleshooting.html).
 

Modified: incubator/slider/site/trunk/content/docs/security.md
URL: http://svn.apache.org/viewvc/incubator/slider/site/trunk/content/docs/security.md?rev=1715237&r1=1715236&r2=1715237&view=diff
==============================================================================
--- incubator/slider/site/trunk/content/docs/security.md (original)
+++ incubator/slider/site/trunk/content/docs/security.md Thu Nov 19 18:42:22 2015
@@ -46,7 +46,7 @@ Slider runs in secure clusters, but with
   in the clusters *MUST* have read/write access to these files. This can be
   done with a shortname that matches that of the user, or by requesting
   that Slider create a directory with group write permissions -and using LDAP
-  to indentify the application principals as members of the same group
+  to identify the application principals as members of the same group
   as the user.
 
 
@@ -80,7 +80,7 @@ Slider runs in secure clusters, but with
 *  Kerberos is running and that HDFS and YARN are running Kerberized.
 *  LDAP cannot be assumed. 
 *  Credentials needed for the application can be pushed out into the local filesystems of 
-  the of the worker nodes via some external mechanism (e.g. scp), and protected by
+  the worker nodes via some external mechanism (e.g. `scp`), and protected by
   the access permissions of the native filesystem. Any user with access to these
   credentials is considered to have been granted such rights.
 *  These credentials can outlive the duration of the application instances
@@ -93,7 +93,7 @@ kerberos identities.
 1. The user is expected to have their own Kerberos principal, and have used `kinit`
   or equivalent to authenticate with Kerberos and gain a (time-bounded) TGT
 1. The user is expected to have principals for every host in the cluster of the form
-  username/hostname@REALM for component aunthentication.  The AM authentication requirements
+  username/hostname@REALM for component authentication.  The AM authentication requirements
   can be satisfied with a non-host based principal (username@REALM).
 1. Separate keytabs should be generated for the AM, which contains the AM login principal, and the service components, which contain all the service principals.  The keytabs can be manually distributed
   to all the nodes in the cluster with read access permissions to the user, or the user may elect to leverage the Slider keytab distribution mechanism.
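+
+How these principals and keytabs get created is site-specific. Purely as a non-normative
+MIT Kerberos sketch, where the realm `EXAMPLE.COM`, the user `user` and the host
+`node1.example.com` are placeholders:
+
+    # authenticate as the user and obtain a TGT
+    kinit user@EXAMPLE.COM
+
+    # create a per-host principal and export it into a service keytab
+    # (kadmin will prompt for admin credentials)
+    kadmin -q "addprinc -randkey user/node1.example.com@EXAMPLE.COM"
+    kadmin -q "ktadd -k user.service.keytab user/node1.example.com@EXAMPLE.COM"
+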
@@ -103,7 +103,7 @@ kerberos identities.
 The Slider Client will talk to HDFS and YARN authenticating itself with the TGT,
 talking to the YARN and HDFS principals which it has been configured to expect.
 
-This can be done as described in [Client Configuration] (/docs/client-configuration.html) on the command line as
+This can be done as described in [Client Configuration](/docs/client-configuration.html) on the command line as
 
      -D yarn.resourcemanager.principal=yarn/master@LOCAL 
      -D dfs.namenode.kerberos.principal=hdfs/master@LOCAL
@@ -115,14 +115,15 @@ It will then deploy the AM, which will (
 rights of the user that created the cluster.
 
 The Application Master will read in the JSON cluster specification file, and instantiate the
-relevant number of componentss. 
+relevant number of components. 
 
 ### The Keytab distribution/access Options
-  Rather than relying on delegation token based authentication mechanisms, the AM leverages keytab files for obtaining the principals to authenticate to the configured cluster KDC. In order to perform this login the AM requires access to a keytab file that contains the principal representing the user identity to be associated with the launched application instance (e.g. in an HBase installation you may elect to leverage the `hbase` principal for this purpose). There are two mechanisms supported for keytab access and/or distribution:
+
+Rather than relying on delegation token based authentication mechanisms, the AM leverages keytab files for obtaining the principals to authenticate to the configured cluster KDC. In order to perform this login the AM requires access to a keytab file that contains the principal representing the user identity to be associated with the launched application instance (e.g. in an HBase installation you may elect to use the `hbase` principal for this purpose). There are two mechanisms supported for keytab access and/or distribution:
 
 #### Local Keytab file access:
 
-  An application deployer may choose to pre-distribute the keytab files required to the Node Manager (NM) hosts in a Yarn cluster. In that instance the appConfig.json requires the following properties:
+An application deployer may choose to pre-distribute the keytab files required to the Node Manager (NM) hosts in a Yarn cluster. In that instance the appConfig.json requires the following properties:
 
     . . .
     "components": {
@@ -133,9 +134,9 @@ relevant number of componentss.
         }
     }
 
-  The `slider.am.keytab.local.path` property provides the full path to the keytab file location and is mandatory for the local lookup mechanism.  The principal to leverage from the file is identified by the `slider.keytab.principal.name` property.
-  
-  In this scenario the distribution of keytab files for the AM AND the application itself is the purview of the application deployer.  So, for example, for an hbase deployment, the hbase site service keytab will have to be distributed as well and indicated in the hbase-site properties:
+The `slider.am.keytab.local.path` property provides the full path to the keytab file location and is mandatory for the local lookup mechanism.  The principal to leverage from the file is identified by the `slider.keytab.principal.name` property.
+
+In this scenario the distribution of keytab files for the AM AND the application itself is the purview of the application deployer.  So, for example, for an hbase deployment, the hbase site service keytab will have to be distributed as well and indicated in the hbase-site properties:
 
         . . .
         "site.hbase-site.hbase.master.kerberos.principal": "hbase/_HOST@EXAMPLE.COM",
@@ -144,7 +145,7 @@ relevant number of componentss.
 
 #### Slider keytab distribution:
 
-  The deployer can select to upload the keytab files (manually or using the Slider client install-keytab option - see below) for the AM and the application to an HDFS directory (with appropriate permissions set) and slider will localize the keytab files to locations accessible by the AM or the application containers:
+The deployer can choose to upload the keytab files (manually or using the Slider client `install-keytab` option - see below) for the AM and the application to an HDFS directory (with appropriate permissions set) and slider will localize the keytab files to locations accessible by the AM or the application containers:
 
     . . .
     "components": {
@@ -155,7 +156,7 @@ relevant number of componentss.
             "slider.keytab.principal.name" : "hbase"
         }
     }
-     
+
 The `slider.hdfs.keytab.dir` points to an HDFS path, relative to the user's home directory
 (e.g. `/users/hbase`), in which slider can find all keytab files required for both AM login
 as well as application services. For example, for Apache HBase the uses would be the headless keytab
@@ -182,7 +183,7 @@ For both mechanisms above, the principal
 
 #### Slider Client Keytab installation:
 
-The Slider client can be leveraged to install keytab files individually into a designated
+The Slider client can be used to install keytab files individually into a designated
 keytab HDFS folder. The format of the command is:
 
 	slider install-keytab --keytab <path to keytab on local file system> --folder <name of HDFS folder to store keytab> [--overwrite]
@@ -199,20 +200,21 @@ The command can be used to upload keytab
 
 Subsequently, the associated hbase-site configuration properties would be:
 
-	"global": {
-	    . . .
-       "site.hbase-site.hbase.master.kerberos.principal": "hbase/_HOST@EXAMPLE.COM",
-       "site.hbase-site.hbase.master.keytab.file": "${AGENT_WORK_ROOT}/keytabs/hbase.service.keytab",
-       . . .
+    "global": {
+        . . .
+         "site.hbase-site.hbase.master.kerberos.principal": "hbase/_HOST@EXAMPLE.COM",
+         "site.hbase-site.hbase.master.keytab.file": "${AGENT_WORK_ROOT}/keytabs/hbase.service.keytab",
+         . . .
+      }
+    "components": {
+         "slider-appmaster": {
+             "jvm.heapsize": "256M",
+             "slider.hdfs.keytab.dir": ".slider/keytabs/HBASE",
+             "slider.am.login.keytab.name": "hbase.headless.keytab"
+             `slider.keytab.principal.name` : `hbase"
+         }
     }
-	"components": {
-       "slider-appmaster": {
-           "jvm.heapsize": "256M",
-           "slider.hdfs.keytab.dir": ".slider/keytabs/HBASE",
-           "slider.am.login.keytab.name": "hbase.headless.keytab"
-           `slider.keytab.principal.name` : `hbase"
-       }
-   	}
+
 
 ## Securing communications between the Slider Client and the Slider AM.
 
@@ -302,7 +304,8 @@ In this example:
 This property is specified in the appConfig file's global section (with the "site.myapp-site" prefix), and is referenced here to indicate to Slider which application property provides the store password.
 
 ### Specifying a keystore/truststore Credential Provider alias
-Applications that utilize the Credenfial Provider API to retrieve application passwords can specify the following configuration:
+
+Applications that utilize the Credential Provider API to retrieve application passwords can specify the following configuration:
 
 * Indicate the credential storage path in the `credentials` section of the app configuration file:
 
@@ -312,32 +315,34 @@ Applications that utilize the Credenfial
 
 If you specify a list of aliases and are making use of the Slider CLI for application deployment, you will be prompted to enter a value for the passwords specified if no password matching a configured alias is found in the credential store.  However, any mechanism available for pre-populating the credential store may be utilized.
 
-*  Reference the alias to use for securing the keystore/truststore in the component's configuraton section:
+*  Reference the alias to use for securing the keystore/truststore in the component's configuration section:
 
         "APP_COMPONENT": {
             "slider.component.security.stores.required": "true", 
             "slider.component.keystore.credential.alias.property": "app_component.keystore.password.alias"
         }
         
-At runtime, Slider will read the credential mapped to the alias (in this case, "app_component.keystore.password.alias"), and leverage the password stored to secure the generated keystore.
+At runtime, Slider will read the credential mapped to the alias (in this case, `app_component.keystore.password.alias`), and use the stored password to secure the generated keystore.
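+
+One way (among others) to pre-populate the credential store mentioned above is the standard
+`hadoop credential` CLI, which prompts for the password. A sketch, where the jceks path is a
+placeholder for whatever is listed in the `credentials` section of your app configuration:
+
+    hadoop credential create app_component.keystore.password.alias \
+        -provider jceks://hdfs/user/<user>/mycluster.jceks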
 
 ## Important: Java Cryptography Package  
 
 
-When trying to talk to a secure, cluster you may see the message:
+When trying to talk to a secure cluster you may see the message:
 
     No valid credentials provided (Mechanism level: Illegal key size)]
 or
+
     No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
  
 This means that the JRE does not have the extended cryptography package
 needed to work with the keys that Kerberos needs. This must be downloaded
 from Oracle (or other supplier of the JVM) and installed according to
-its accompanying instructions.
+the accompanying instructions.
 
 
 ## Useful Links
 
+1. [Hadoop and Kerberos: The Madness Beyond the Gate](https://www.gitbook.com/book/steveloughran/kerberos_and_hadoop/details)
 1. [Adding Security to Apache Hadoop](http://hortonworks.com/wp-content/uploads/2011/10/security-design_withCover-1.pdf)
 1. [The Role of Delegation Tokens in Apache Hadoop Security](http://hortonworks.com/blog/the-role-of-delegation-tokens-in-apache-hadoop-security/)
 1. [Chapter 8. Secure Apache HBase](http://hbase.apache.org/book/security.html)

Modified: incubator/slider/site/trunk/content/docs/troubleshooting.md
URL: http://svn.apache.org/viewvc/incubator/slider/site/trunk/content/docs/troubleshooting.md?rev=1715237&r1=1715236&r2=1715237&view=diff
==============================================================================
--- incubator/slider/site/trunk/content/docs/troubleshooting.md (original)
+++ incubator/slider/site/trunk/content/docs/troubleshooting.md Thu Nov 19 18:42:22 2015
@@ -24,6 +24,50 @@ that works
 
 ## Common problems
 
+
+
+### Not all the containers start -but whenever you kill one, another one comes up.
+
+This is often caused by YARN not having enough capacity in the cluster to start
+up the requested set of containers. The AM has submitted a list of container
+requests to YARN, but only when an existing container is released or killed
+is one of the outstanding requests granted.
+
+Fix #1: Ask for smaller containers
+
+Edit the `yarn.memory` option for roles to be smaller: set it to 64 for a smaller
+YARN allocation. *This does not affect the actual heap size of the
+application component deployed.*
+
+Fix #2: Tell YARN to be less strict about memory consumption
+
+Here are the properties in `yarn-site.xml` which we set to allow YARN 
+to schedule more role instances than it nominally has room for.
+
+    <property>
+      <name>yarn.scheduler.minimum-allocation-mb</name>
+      <value>128</value>
+    </property>
+    <property>
+      <description>Whether physical memory limits will be enforced for
+        containers.
+      </description>
+      <name>yarn.nodemanager.pmem-check-enabled</name>
+      <value>false</value>
+    </property>
+    <!-- we really don't want checking here-->
+    <property>
+      <name>yarn.nodemanager.vmem-check-enabled</name>
+      <value>false</value>
+    </property>
+  
+*Important* In a real cluster, the minimum size of an allocation should be larger, such
+as `256`, to stop the RM being overloaded. With the PMEM and VMEM checks disabled, hosts
+can be overcommitted: create too many instances and they will start swapping and
+performance will collapse -we do not recommend this in production.
+
+### The complete instance never comes up -some containers are outstanding
+
+This means that there isn't enough space in the cluster to satisfy all of the outstanding container requests.
+
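+To see how much capacity YARN actually has free, one option is the ResourceManager REST API.
+A sketch, where `<rm-host>:8088` is a placeholder for your ResourceManager web address (8088
+is the default web port):
+
+    # available and allocated memory/vcores across the cluster
+    curl http://<rm-host>:8088/ws/v1/cluster/metrics
+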
 ### Slider instances not being able to create registry paths on secure clusters
 
 This feature requires the YARN Resource Manager to do the setup securely of
@@ -66,46 +110,6 @@ It is from those logs that the cause of
 output of the actual application which Slider is trying to deploy.
 
 
-
-### Not all the containers start -but whenever you kill one, another one comes up.
-
-This is often caused by YARN not having enough capacity in the cluster to start
-up the requested set of containers. The AM has submitted a list of container
-requests to YARN, but only when an existing container is released or killed
-is one of the outstanding requests granted.
-
-Fix #1: Ask for smaller containers
-
-edit the `yarn.memory` option for roles to be smaller: set it 64 for a smaller
-YARN allocation. *This does not affect the actual heap size of the 
-application component deployed*
-
-Fix #2: Tell YARN to be less strict about memory consumption
-
-Here are the properties in `yarn-site.xml` which we set to allow YARN 
-to schedule more role instances than it nominally has room for.
-
-    <property>
-      <name>yarn.scheduler.minimum-allocation-mb</name>
-      <value>1</value>
-    </property>
-    <property>
-      <description>Whether physical memory limits will be enforced for
-        containers.
-      </description>
-      <name>yarn.nodemanager.pmem-check-enabled</name>
-      <value>false</value>
-    </property>
-    <!-- we really don't want checking here-->
-    <property>
-      <name>yarn.nodemanager.vmem-check-enabled</name>
-      <value>false</value>
-    </property>
-  
-If you create too many instances, your hosts will start swapping and
-performance will collapse -we do not recommend using this in production.
-
-
 ### Configuring YARN for better debugging
  
  
@@ -149,4 +153,5 @@ where hbasesliderapp is the name of Slid
 The script would retrieve hbase-site.xml and run HBase shell command.
 
 You can issue the following command to see supported options:
+   
     ./hbase-slider