Posted to commits@zeppelin.apache.org by ah...@apache.org on 2016/09/29 12:01:46 UTC

zeppelin git commit: [ZEPPELIN-1279] Zeppelin with CDH5.x docker document.

Repository: zeppelin
Updated Branches:
  refs/heads/master b24491baf -> c7ce709f3


[ZEPPELIN-1279] Zeppelin with CDH5.x docker document.

### What is this PR for?
This PR adds documentation for running Zeppelin with a CDH Docker environment.
It is part of https://issues.apache.org/jira/browse/ZEPPELIN-1198.

Tested with CDH 5.7 on Ubuntu.

### What type of PR is it?
Documentation

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1281

### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? no

Author: astroshim <hs...@nflabs.com>
Author: AhyoungRyu <ah...@apache.org>
Author: HyungSung <hs...@nflabs.com>

Closes #1451 from astroshim/ZEPPELIN-1281 and squashes the following commits:

5dcb8c1 [astroshim] move configurations to right path and add excluding rat-plugin
09408e3 [HyungSung] Merge pull request #11 from AhyoungRyu/ZEPPELIN-1281-ahyoung
850119c [AhyoungRyu] Generate TOC & change some sentences
e687a53 [AhyoungRyu] Replace zeppelin_with_cdh.png to crop the url part
cc9a023 [AhyoungRyu] Remove main title link anchor
b525f68 [astroshim] separate cdh doc with spark_cluster_mode.md
e66993f [astroshim] fix doc
a7b5b2d [astroshim] cdh docker environment


Project: http://git-wip-us.apache.org/repos/asf/zeppelin/repo
Commit: http://git-wip-us.apache.org/repos/asf/zeppelin/commit/c7ce709f
Tree: http://git-wip-us.apache.org/repos/asf/zeppelin/tree/c7ce709f
Diff: http://git-wip-us.apache.org/repos/asf/zeppelin/diff/c7ce709f

Branch: refs/heads/master
Commit: c7ce709f356c5d007e12824ff9214e9e95905d84
Parents: b24491b
Author: astroshim <hs...@nflabs.com>
Authored: Tue Sep 27 11:24:34 2016 +0900
Committer: AhyoungRyu <ah...@apache.org>
Committed: Thu Sep 29 21:01:31 2016 +0900

----------------------------------------------------------------------
 docs/_includes/themes/zeppelin/_navigation.html |   1 +
 .../img/docs-img/cdh_yarn_applications.png      | Bin 0 -> 124719 bytes
 .../zeppelin/img/docs-img/zeppelin_with_cdh.png | Bin 0 -> 41727 bytes
 docs/index.md                                   |   1 +
 docs/install/cdh.md                             | 100 +++++++++++++++++++
 pom.xml                                         |   1 +
 .../cdh/hdfs_conf/core-site.xml                 |   6 ++
 .../cdh/hdfs_conf/hdfs-site.xml                 |  64 ++++++++++++
 .../cdh/hdfs_conf/mapred-site.xml               |   6 ++
 .../cdh/hdfs_conf/yarn-site.xml                 |  26 +++++
 10 files changed, 205 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/_includes/themes/zeppelin/_navigation.html
----------------------------------------------------------------------
diff --git a/docs/_includes/themes/zeppelin/_navigation.html b/docs/_includes/themes/zeppelin/_navigation.html
index e86ffb7..4a7e75b 100644
--- a/docs/_includes/themes/zeppelin/_navigation.html
+++ b/docs/_includes/themes/zeppelin/_navigation.html
@@ -108,6 +108,7 @@
                 <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-standalone-mode">Zeppelin on Spark Cluster Mode (Standalone)</a></li>
                 <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-on-yarn-mode">Zeppelin on Spark Cluster Mode (YARN)</a></li>
                 <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-on-mesos-mode">Zeppelin on Spark Cluster Mode (Mesos)</a></li>
+                <li><a href="{{BASE_PATH}}/install/cdh.html">Zeppelin on CDH</a></li>
                 <li role="separator" class="divider"></li>
                 <li class="title"><span><b>Contibute</b><span></li>
                 <li><a href="{{BASE_PATH}}/development/writingzeppelininterpreter.html">Writing Zeppelin Interpreter</a></li>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png
----------------------------------------------------------------------
diff --git a/docs/assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png b/docs/assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png
new file mode 100644
index 0000000..980ea5b
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png differ

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png
----------------------------------------------------------------------
diff --git a/docs/assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png b/docs/assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png
new file mode 100644
index 0000000..9dae220
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png differ

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/index.md
----------------------------------------------------------------------
diff --git a/docs/index.md b/docs/index.md
index 8c2ce95..0f25750 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -172,6 +172,7 @@ Join to our [Mailing list](https://zeppelin.apache.org/community.html) and repor
   * [Zeppelin on Spark Cluster Mode (Standalone via Docker)](./install/spark_cluster_mode.html#spark-standalone-mode)
   * [Zeppelin on Spark Cluster Mode (YARN via Docker)](./install/spark_cluster_mode.html#spark-on-yarn-mode)
   * [Zeppelin on Spark Cluster Mode (Mesos via Docker)](./install/spark_cluster_mode.html#spark-on-mesos-mode)
+  * [Zeppelin on CDH (via Docker)](./install/cdh.html)
 * Contribute
   * [Writing Zeppelin Interpreter](./development/writingzeppelininterpreter.html)
   * [Writing Zeppelin Application (Experimental)](./development/writingzeppelinapplication.html)

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/docs/install/cdh.md
----------------------------------------------------------------------
diff --git a/docs/install/cdh.md b/docs/install/cdh.md
new file mode 100644
index 0000000..f661417
--- /dev/null
+++ b/docs/install/cdh.md
@@ -0,0 +1,100 @@
+---
+layout: page
+title: "Apache Zeppelin on CDH"
+description: "This document guides you through building and configuring a CDH environment for Apache Zeppelin using Docker scripts."
+group: install
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+{% include JB/setup %}
+
+# Apache Zeppelin on CDH
+
+<div id="toc"></div>
+
+### 1. Import Cloudera QuickStart Docker image
+
+>[Cloudera](http://www.cloudera.com/) officially provides a CDH Docker image on Docker Hub. See [this guide page](http://www.cloudera.com/documentation/enterprise/latest/topics/quickstart_docker_container.html#cloudera_docker_container) for more information.
+
+You can import the Docker image by pulling it from Cloudera's Docker Hub repository.
+
+```
+docker pull cloudera/quickstart:latest
+```
+
+
+### 2. Run docker
+
+```
+docker run -it \
+ -p 80:80 \
+ -p 4040:4040 \
+ -p 8020:8020 \
+ -p 8022:8022 \
+ -p 8030:8030 \
+ -p 8032:8032 \
+ -p 8033:8033 \
+ -p 8040:8040 \
+ -p 8042:8042 \
+ -p 8088:8088 \
+ -p 8480:8480 \
+ -p 8485:8485 \
+ -p 8888:8888 \
+ -p 9083:9083 \
+ -p 10020:10020 \
+ -p 10033:10033 \
+ -p 18088:18088 \
+ -p 19888:19888 \
+ -p 25000:25000 \
+ -p 25010:25010 \
+ -p 25020:25020 \
+ -p 50010:50010 \
+ -p 50020:50020 \
+ -p 50070:50070 \
+ -p 50075:50075 \
+ -h quickstart.cloudera --privileged=true \
+ cloudera/quickstart:latest /usr/bin/docker-quickstart
+```
+
+### 3. Verify running CDH
+
+To verify that CDH is running, check the HDFS web UI at `http://<hostname>:50070/` and the YARN web UI at `http://<hostname>:8088/cluster`.
+
+
+### 4. Configure Spark interpreter in Zeppelin
+Set the following configuration in `conf/zeppelin-env.sh`.
+
+```
+export MASTER=yarn-client
+export HADOOP_CONF_DIR=[your_hadoop_conf_path]
+export SPARK_HOME=[your_spark_home_path]
+```
+
+The Hadoop configuration files for `HADOOP_CONF_DIR` are provided in `/scripts/docker/spark-cluster-managers/cdh/hdfs_conf`.
+
+Don't forget to set the Spark `master` property to `yarn-client` on the Zeppelin **Interpreters** settings page, as shown below.
+
+<img src="../assets/themes/zeppelin/img/docs-img/zeppelin_yarn_conf.png" />
+
+### 5. Run Zeppelin with Spark interpreter
+After running a single paragraph with the Spark interpreter in Zeppelin,
+
+<img src="../assets/themes/zeppelin/img/docs-img/zeppelin_with_cdh.png" />
+
+<br/>
+
+browse `http://<hostname>:8088/cluster/apps` to check whether the Zeppelin application is running.
+
+<img src="../assets/themes/zeppelin/img/docs-img/cdh_yarn_applications.png" />
+
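
The checks described in steps 3 and 5 of the document above can also be scripted. The following is a minimal sketch, not part of this commit, assuming `curl` is installed and that the container ports are published on the host as in step 2. The helper names `cdh_ui_urls` and `check_cdh_uis` are hypothetical and do not exist in Zeppelin or CDH.

```shell
# Hypothetical helper: print the web UI URLs from steps 3 and 5 for a host.
# Defaults to localhost when no host is given.
cdh_ui_urls() {
  host="${1:-localhost}"
  printf 'http://%s:50070/\n' "$host"             # HDFS NameNode web UI
  printf 'http://%s:8088/cluster\n' "$host"       # YARN ResourceManager web UI
  printf 'http://%s:8088/cluster/apps\n' "$host"  # YARN application list
}

# Hypothetical helper: report which of those URLs respond over HTTP.
# curl -f treats HTTP status codes >= 400 as failures.
check_cdh_uis() {
  cdh_ui_urls "$1" | while read -r url; do
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}
```

Running `check_cdh_uis <hostname>` after the container has started prints one `OK` or `FAIL` line per URL. A `FAIL` on port 8088 right after startup may only mean that YARN is still initializing, so retrying after a short wait is reasonable.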

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/pom.xml
----------------------------------------------------------------------
diff --git a/pom.xml b/pom.xml
index c93f4b8..03b2263 100644
--- a/pom.xml
+++ b/pom.xml
@@ -753,6 +753,7 @@
               <exclude>.spark-dist/**</exclude>
               <exclude>**/interpreter-setting.json</exclude>
               <exclude>**/constants.json</exclude>
+              <exclude>scripts/**</exclude>
 
               <!-- bundled from bootstrap -->
               <exclude>docs/assets/themes/zeppelin/bootstrap/**</exclude>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/core-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/core-site.xml b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/core-site.xml
new file mode 100644
index 0000000..6cdbc7f
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/core-site.xml
@@ -0,0 +1,6 @@
+<configuration>
+  <property>
+    <name>fs.defaultFS</name>
+    <value>hdfs://0.0.0.0:8020</value>
+  </property>
+</configuration>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/hdfs-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/hdfs-site.xml b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/hdfs-site.xml
new file mode 100644
index 0000000..ce031cf
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/hdfs-site.xml
@@ -0,0 +1,64 @@
+<configuration>
+  <property>
+    <name>dfs.replication</name>
+    <value>1</value>
+  </property>
+
+
+  <property>
+    <name>dfs.data.dir</name>
+    <value>/data/hdfs</value>
+    <final>true</final>
+  </property>
+
+  <property>
+    <name>dfs.permissions</name>
+    <value>false</value>
+  </property>
+
+
+  <property>
+    <name>dfs.client.use.datanode.hostname</name>
+    <value>true</value>
+    <description>Whether clients should use datanode hostnames when
+      connecting to datanodes.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.use.datanode.hostname</name>
+    <value>true</value>
+    <description>Whether datanodes should use datanode hostnames when
+      connecting to other datanodes for data transfer.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.address</name>
+    <value>0.0.0.0:50010</value>
+    <description>
+      The address where the datanode server will listen to.
+      If the port is 0 then the server will start on a free port.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.http.address</name>
+    <value>0.0.0.0:50075</value>
+    <description>
+      The datanode http server address and port.
+      If the port is 0 then the server will start on a free port.
+    </description>
+  </property>
+
+  <property>
+    <name>dfs.datanode.ipc.address</name>
+    <value>0.0.0.0:50020</value>
+    <description>
+      The datanode ipc server address and port.
+      If the port is 0 then the server will start on a free port.
+    </description>
+  </property>
+
+</configuration>
+

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/mapred-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/mapred-site.xml b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/mapred-site.xml
new file mode 100644
index 0000000..6dc557d
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/mapred-site.xml
@@ -0,0 +1,6 @@
+<configuration>
+  <property>
+    <name>mapreduce.framework.name</name>
+    <value>yarn</value>
+  </property>
+</configuration>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c7ce709f/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/yarn-site.xml
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/yarn-site.xml b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/yarn-site.xml
new file mode 100644
index 0000000..4fce42f9
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/cdh/hdfs_conf/yarn-site.xml
@@ -0,0 +1,26 @@
+<configuration>
+  <property>
+    <name>yarn.resourcemanager.scheduler.address</name>
+    <value>0.0.0.0:8030</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.address</name>
+    <value>0.0.0.0:8032</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.webapp.address</name>
+    <value>0.0.0.0:8088</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.resource-tracker.address</name>
+    <value>0.0.0.0:8031</value>
+  </property>
+  <property>
+    <name>yarn.resourcemanager.admin.address</name>
+    <value>0.0.0.0:8033</value>
+  </property>
+  <property>
+      <name>yarn.application.classpath</name>
+      <value>/usr/local/hadoop/etc/hadoop, /usr/local/hadoop/share/hadoop/common/*, /usr/local/hadoop/share/hadoop/common/lib/*, /usr/local/hadoop/share/hadoop/hdfs/*, /usr/local/hadoop/share/hadoop/hdfs/lib/*, /usr/local/hadoop/share/hadoop/mapreduce/*, /usr/local/hadoop/share/hadoop/mapreduce/lib/*, /usr/local/hadoop/share/hadoop/yarn/*, /usr/local/hadoop/share/hadoop/yarn/lib/*, /usr/local/hadoop/share/spark/*</value>
+   </property>
+</configuration>