Posted to commits@knox.apache.org by km...@apache.org on 2014/09/18 03:37:04 UTC

svn commit: r1625871 - in /knox: site/books/knox-0-5-0/knox-0-5-0.html trunk/books/0.5.0/book.md trunk/books/0.5.0/book_service-details.md trunk/books/0.5.0/service_yarn.md

Author: kminder
Date: Thu Sep 18 01:37:04 2014
New Revision: 1625871

URL: http://svn.apache.org/r1625871
Log:
KNOX-420: Docs for HDFS HA support.  KNOX-427: Docs for YARN support.

Added:
    knox/trunk/books/0.5.0/service_yarn.md
Modified:
    knox/site/books/knox-0-5-0/knox-0-5-0.html
    knox/trunk/books/0.5.0/book.md
    knox/trunk/books/0.5.0/book_service-details.md

Modified: knox/site/books/knox-0-5-0/knox-0-5-0.html
URL: http://svn.apache.org/viewvc/knox/site/books/knox-0-5-0/knox-0-5-0.html?rev=1625871&r1=1625870&r2=1625871&view=diff
==============================================================================
--- knox/site/books/knox-0-5-0/knox-0-5-0.html (original)
+++ knox/site/books/knox-0-5-0/knox-0-5-0.html Thu Sep 18 01:37:04 2014
@@ -46,6 +46,7 @@
     <li><a href="#Oozie">Oozie</a></li>
     <li><a href="#HBase">HBase</a></li>
     <li><a href="#Hive">Hive</a></li>
+    <li><a href="#Yarn">Yarn</a></li>
   </ul></li>
   <li><a href="#Limitations">Limitations</a></li>
   <li><a href="#Troubleshooting">Troubleshooting</a></li>
@@ -1575,6 +1576,7 @@ dep/commons-codec-1.7.jar
   <li><a href="#Oozie">Oozie</a></li>
   <li><a href="#HBase">HBase</a></li>
   <li><a href="#Hive">Hive</a></li>
+  <li><a href="#Yarn">Yarn</a></li>
 </ul><h3><a id="Assumptions"></a>Assumptions</h3><p>This document assumes a few things about your environment in order to simplify the examples.</p>
 <ul>
   <li>The JVM is executable as simply java.</li>
@@ -2832,6 +2834,96 @@ connection.close();
 2012-02-03 --- 18:35:34 --- SampleClass6 --- [TRACE]
 2012-02-03 --- 18:35:34 --- SampleClass2 --- [DEBUG]
 ...
+</code></pre><h3><a id="Yarn"></a>Yarn</h3><p>Knox provides gateway functionality for the REST APIs of the ResourceManager. The ResourceManager REST APIs allow the user to get information about the cluster: its status, metrics, scheduler information, nodes, and applications. Also, as of Hadoop version 2.5.0, the user can submit a new application, as well as kill it or get its state, using the &lsquo;Writable&rsquo; APIs.</p><p>The documentation for these APIs can be found here:</p><p><a href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html">http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html</a></p><p>To enable this functionality, a topology file needs to have the following configuration:</p>
+<pre><code>&lt;service&gt;
+        &lt;role&gt;RESOURCEMANAGER&lt;/role&gt;
+        &lt;url&gt;http://&lt;hostname&gt;:&lt;port&gt;/ws&lt;/url&gt;
+&lt;/service&gt;
+</code></pre><p>The default ResourceManager HTTP port is 8088. If it is configured to use a different port, the value can be found in yarn-site.xml under the property &lsquo;yarn.resourcemanager.webapp.address&rsquo;.</p><h4><a id="Yarn+URL+Mapping"></a>Yarn URL Mapping</h4><p>For Yarn URLs, the mapping of Knox Gateway accessible URLs to direct Yarn URLs is the following.</p>
+<table>
+  <tbody>
+    <tr>
+      <td>Gateway </td>
+      <td><code>https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/resourcemanager</code> </td>
+    </tr>
+    <tr>
+      <td>Cluster </td>
+      <td><code>http://{yarn-host}:{yarn-port}/ws</code> </td>
+    </tr>
+  </tbody>
+</table><h4><a id="Yarn+Examples+via+cURL"></a>Yarn Examples via cURL</h4><p>Some of the calls that can be made, with examples using curl, are listed below.</p>
+<pre><code># 0. Getting cluster info
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster&#39;
+
+# 1. Getting cluster metrics
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/metrics&#39;
+
+# To get the same information in XML format
+
+curl -ikv -u guest:guest-password -H Accept:application/xml -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/metrics&#39;
+
+# 2. Getting scheduler information
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/scheduler&#39;
+
+# 3. Getting all the applications listed and their information
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps&#39;
+
+# 4. Getting applications statistics
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/appstatistics&#39;
+
+# Query parameters can also be used to filter the results, as below
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/appstatistics?states=accepted,running,finished&amp;applicationTypes=mapreduce&#39;
+
+# 5. To get a specific application (please note, replace the application id with a real value)
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/{application_id}&#39;
+
+# 6. To get the attempts made for a particular application
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/{application_id}/appattempts&#39;
+
+# 7. To get information about the various nodes
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/nodes&#39;
+
+# To get a specific node, use an id obtained from the response above (the node id is scrambled) and issue the following
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/nodes/{node_id}&#39;
+
+# 8. To create a new Application
+
+curl -ikv -u guest:guest-password -X POST &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/new-application&#39;
+
+# An application id is returned from the request above and this can be used to submit an application.
+
+# 9. To submit an application, put together a request containing the application id received in the above response
+# (please refer to the Yarn REST API documentation).
+
+curl -ikv -u guest:guest-password -T request.json -H Content-Type:application/json -X POST &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps&#39;
+
+# Here the request is saved in a file called request.json
+
+# 10. To get application state
+
+curl -ikv -u guest:guest-password -X GET &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/{application_id}/state&#39;
+
+# 11. To kill a running application, issue the command below with the id of the application that is to be killed.
+# The contents of the state-killed.json file are:
+
+{
+  &quot;state&quot;:&quot;KILLED&quot;
+}
+
+curl -ikv -u guest:guest-password -H Content-Type:application/json -X PUT -T state-killed.json &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/{application_id}/state&#39;
+
+# For example, with a concrete application id:
+curl -ikv -u guest:guest-password -H Content-Type:application/json -X PUT -T state-killed.json &#39;https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/application_1409008107556_0007/state&#39;
 </code></pre><h2><a id="Limitations"></a>Limitations</h2><h3><a id="Secure+Oozie+POST/PUT+Request+Payload+Size+Restriction"></a>Secure Oozie POST/PUT Request Payload Size Restriction</h3><p>With one exception there are no known size limits for request or response payloads that pass through the gateway. The exception involves POST or PUT request payload sizes for Oozie in a Kerberos secured Hadoop cluster. In this one case there is currently a 4Kb payload size limit for the first request made to the Hadoop cluster. This is a result of how the gateway negotiates a trust relationship between itself and the cluster via SPNego. There is an undocumented configuration setting to modify this limit&rsquo;s value if required. In the future this will be made more easily configurable and at that time it will be documented.</p><h3><a id="LDAP+Groups+Acquisition+from+AD"></a>LDAP Groups Acquisition from AD</h3><p>The LDAP authenticator currently does not &ldquo;out of the box&rdquo; support the acquisition of group information from Microsoft Active Directory. Building this into the default implementation is on the roadmap.</p><h3><a id="Group+Membership+Propagation"></a>Group Membership Propagation</h3><p>Groups that are acquired via Shiro Group Lookup and/or Identity Assertion Group Principal Mapping are not propagated to the Hadoop services. Therefore groups used for Service Level Authorization policy may not match those acquired within the cluster via GroupMappingServiceProvider plugins.</p><h2><a id="Troubleshooting"></a>Troubleshooting</h2><h3><a id="Finding+Logs"></a>Finding Logs</h3><p>When things aren&rsquo;t working the first thing you need to do is examine the diagnostic logs. Depending upon how you are running the gateway these diagnostic logs will be output to different locations.</p><h4><a id="java+-jar+bin/gateway.jar"></a>java -jar bin/gateway.jar</h4><p>When the gateway is run this way the diagnostic output is written directly to the console. If you want to capture that output you will need to redirect the console output to a file using OS specific techniques.</p>
 <pre><code>java -jar bin/gateway.jar &gt; gateway.log
 </code></pre><h4><a id="bin/gateway.sh+start"></a>bin/gateway.sh start</h4><p>When the gateway is run this way the diagnostic output is written to /var/log/knox/knox.out and /var/log/knox/knox.err. Typically only knox.out will have content.</p><h3><a id="Increasing+Logging"></a>Increasing Logging</h3><p>The <code>log4j.properties</code> file in <code>{GATEWAY_HOME}/conf</code> can be used to change the granularity of the logging done by Knox. The Knox server must be restarted in order for these changes to take effect. There are various useful loggers pre-populated but commented out.</p>

Modified: knox/trunk/books/0.5.0/book.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.5.0/book.md?rev=1625871&r1=1625870&r2=1625871&view=diff
==============================================================================
--- knox/trunk/books/0.5.0/book.md (original)
+++ knox/trunk/books/0.5.0/book.md Thu Sep 18 01:37:04 2014
@@ -51,6 +51,7 @@
     * #[Oozie]
     * #[HBase]
     * #[Hive]
+    * #[Yarn]
 * #[Limitations]
 * #[Troubleshooting]
 * #[Export Controls]

Modified: knox/trunk/books/0.5.0/book_service-details.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.5.0/book_service-details.md?rev=1625871&r1=1625870&r2=1625871&view=diff
==============================================================================
--- knox/trunk/books/0.5.0/book_service-details.md (original)
+++ knox/trunk/books/0.5.0/book_service-details.md Thu Sep 18 01:37:04 2014
@@ -34,6 +34,7 @@ These are the current Hadoop services wi
 * #[Oozie]
 * #[HBase]
 * #[Hive]
+* #[Yarn]
 
 ### Assumptions
 
@@ -78,5 +79,6 @@ Therefore each request via cURL will res
 <<service_oozie.md>>
 <<service_hbase.md>>
 <<service_hive.md>>
+<<service_yarn.md>>
 
 

Added: knox/trunk/books/0.5.0/service_yarn.md
URL: http://svn.apache.org/viewvc/knox/trunk/books/0.5.0/service_yarn.md?rev=1625871&view=auto
==============================================================================
--- knox/trunk/books/0.5.0/service_yarn.md (added)
+++ knox/trunk/books/0.5.0/service_yarn.md Thu Sep 18 01:37:04 2014
@@ -0,0 +1,125 @@
+<!---
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--->
+
+### Yarn ###
+
+Knox provides gateway functionality for the REST APIs of the ResourceManager. The ResourceManager REST APIs allow the
+user to get information about the cluster: its status, metrics, scheduler information, nodes, and applications.
+Also, as of Hadoop version 2.5.0, the user can submit a new application, as well as kill it or get its state, using
+the 'Writable' APIs.
+
+The documentation for these APIs can be found here:
+
+http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html
+
+To enable this functionality, a topology file needs to have the following configuration:
+
+
+    <service>
+            <role>RESOURCEMANAGER</role>
+            <url>http://<hostname>:<port>/ws</url>
+    </service>
+
+The default ResourceManager HTTP port is 8088. If it is configured to use a different port, the value can be
+found in yarn-site.xml under the property 'yarn.resourcemanager.webapp.address'.
+
+#### Yarn URL Mapping ####
+
+For Yarn URLs, the mapping of Knox Gateway accessible URLs to direct Yarn URLs is the following.
+
+| ------- | ------------------------------------------------------------------------------------- |
+| Gateway | `https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/resourcemanager` |
+| Cluster | `http://{yarn-host}:{yarn-port}/ws`                                                   |
+
+
+#### Yarn Examples via cURL ####
+
+Some of the calls that can be made, with examples using curl, are listed below.
+
+    # 0. Getting cluster info
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster'
+    
+    # 1. Getting cluster metrics
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/metrics'
+    
+    # To get the same information in XML format
+    
+    curl -ikv -u guest:guest-password -H Accept:application/xml -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/metrics'
+    
+    # 2. Getting scheduler information
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/scheduler'
+    
+    # 3. Getting all the applications listed and their information
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps'
+    
+    # 4. Getting applications statistics
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/appstatistics'
+    
+    # Query parameters can also be used to filter the results, as below
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/appstatistics?states=accepted,running,finished&applicationTypes=mapreduce'
+    
+    # 5. To get a specific application (please note, replace the application id with a real value)
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/{application_id}'
+    
+    # 6. To get the attempts made for a particular application
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/{application_id}/appattempts'
+    
+    # 7. To get information about the various nodes
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/nodes'
+    
+    # To get a specific node, use an id obtained from the response above (the node id is scrambled) and issue the following
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/nodes/{node_id}'
+    
+    # 8. To create a new Application
+    
+    curl -ikv -u guest:guest-password -X POST 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/new-application'
+    
+    # An application id is returned from the request above and this can be used to submit an application.
+    
+    # 9. To submit an application, put together a request containing the application id received in the above response
+    # (please refer to the Yarn REST API documentation).
+    
+    curl -ikv -u guest:guest-password -T request.json -H Content-Type:application/json -X POST 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps'
+    
+    # Here the request is saved in a file called request.json
+    
+    # 10. To get application state
+    
+    curl -ikv -u guest:guest-password -X GET 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/{application_id}/state'
+    
+    # 11. To kill a running application, issue the command below with the id of the application that is to be killed.
+    # The contents of the state-killed.json file are:
+    
+    {
+      "state":"KILLED"
+    }
+    
+    curl -ikv -u guest:guest-password -H Content-Type:application/json -X PUT -T state-killed.json 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/{application_id}/state'
+    
+    # For example, with a concrete application id:
+    curl -ikv -u guest:guest-password -H Content-Type:application/json -X PUT -T state-killed.json 'https://localhost:8443/gateway/sandbox/resourcemanager/v1/cluster/apps/application_1409008107556_0007/state'
+
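
Note on the URL mapping documented in service_yarn.md: the following is a minimal Python sketch, not part of the commit, of how a gateway URL and the corresponding direct cluster URL for the same ResourceManager resource could be built, along with the JSON body used to kill an application. The function names (`gateway_rm_url`, `cluster_rm_url`, `kill_payload`) and the host name `rm-host` are hypothetical and chosen only for illustration. The paths and the `{"state": "KILLED"}` body come from the examples in the diff above. Nothing here contacts a server.

```python
import json
from urllib.parse import urlencode


def gateway_rm_url(gateway_host, gateway_port, gateway_path, cluster_name,
                   resource, params=None):
    """Build the Knox gateway URL for a ResourceManager REST resource.

    Follows the documented pattern:
    https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/resourcemanager
    """
    base = (f"https://{gateway_host}:{gateway_port}/"
            f"{gateway_path}/{cluster_name}/resourcemanager")
    url = f"{base}/{resource.lstrip('/')}"
    if params:
        # Keep commas unescaped so list-valued filters stay readable.
        url += "?" + urlencode(params, safe=",")
    return url


def cluster_rm_url(yarn_host, yarn_port, resource):
    """Build the direct cluster URL: http://{yarn-host}:{yarn-port}/ws/..."""
    return f"http://{yarn_host}:{yarn_port}/ws/{resource.lstrip('/')}"


def kill_payload():
    """Return the JSON body shown as state-killed.json in the examples."""
    return json.dumps({"state": "KILLED"})


if __name__ == "__main__":
    # Hypothetical values matching the sandbox examples in the documentation.
    print(gateway_rm_url("localhost", 8443, "gateway", "sandbox",
                         "v1/cluster/appstatistics",
                         {"states": "accepted,running,finished",
                          "applicationTypes": "mapreduce"}))
    print(cluster_rm_url("rm-host", 8088, "v1/cluster/appstatistics"))
    print(kill_payload())
```

The two helpers differ only in their base URL, which reflects the mapping table: the gateway's `resourcemanager` path segment stands in for the cluster's `/ws` root.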