Posted to commits@zeppelin.apache.org by bz...@apache.org on 2016/08/03 10:20:03 UTC

zeppelin git commit: [ZEPPELIN-1198][Spark Standalone] Documents for running zeppelin on production environments.

Repository: zeppelin
Updated Branches:
  refs/heads/master 16b320ff9 -> b96550329


[ZEPPELIN-1198][Spark Standalone] Documents for running zeppelin on production environments.

### What is this PR for?
This PR adds documentation for running Zeppelin in production environments.

### What type of PR is it?
Documentation

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1198

### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? no

Author: astroshim <hs...@nflabs.com>

Closes #1227 from astroshim/ZEPPELIN-1198/standalone and squashes the following commits:

53a32f2 [astroshim] add 'via Docker'
61a0e5e [astroshim] add apache license header
83fdef6 [astroshim] doc for spark standalone


Project: http://git-wip-us.apache.org/repos/asf/zeppelin/repo
Commit: http://git-wip-us.apache.org/repos/asf/zeppelin/commit/b9655032
Tree: http://git-wip-us.apache.org/repos/asf/zeppelin/tree/b9655032
Diff: http://git-wip-us.apache.org/repos/asf/zeppelin/diff/b9655032

Branch: refs/heads/master
Commit: b965503291fd004f2044df1c8d257aa4c7b1c522
Parents: 16b320f
Author: astroshim <hs...@nflabs.com>
Authored: Fri Jul 29 01:03:17 2016 +0900
Committer: Alexander Bezzubov <bz...@apache.org>
Committed: Wed Aug 3 18:47:21 2016 +0900

----------------------------------------------------------------------
 docs/_includes/themes/zeppelin/_navigation.html |   5 +-
 .../themes/zeppelin/img/docs-img/spark_ui.png   | Bin 0 -> 206211 bytes
 .../zeppelin/img/docs-img/standalone_conf.png   | Bin 0 -> 184762 bytes
 docs/index.md                                   |   4 +-
 docs/install/spark_cluster_mode.md              |  74 +++++++++++++++++++
 .../spark_standalone/Dockerfile                 |  54 ++++++++++++++
 .../spark_standalone/entrypoint.sh              |  31 ++++++++
 7 files changed, 166 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/zeppelin/blob/b9655032/docs/_includes/themes/zeppelin/_navigation.html
----------------------------------------------------------------------
diff --git a/docs/_includes/themes/zeppelin/_navigation.html b/docs/_includes/themes/zeppelin/_navigation.html
index 7756f23..e809b09 100644
--- a/docs/_includes/themes/zeppelin/_navigation.html
+++ b/docs/_includes/themes/zeppelin/_navigation.html
@@ -32,7 +32,6 @@
                 <li><a href="{{BASE_PATH}}/manual/notebookashomepage.html">Customize Zeppelin Homepage</a></li>
                 <li role="separator" class="divider"></li>
                 <li class="title"><span><b>More</b><span></li>
-                <li><a href="{{BASE_PATH}}/install/virtual_machine.html">Zeppelin on Vagrant VM</a></li>
                 <li><a href="{{BASE_PATH}}/install/upgrade.html">Upgrade Zeppelin Version</a></li>
               </ul>
             </li>
@@ -103,6 +102,10 @@
                 <li><a href="{{BASE_PATH}}/security/notebook_authorization.html">Notebook Authorization</a></li>
                 <li><a href="{{BASE_PATH}}/security/datasource_authorization.html">Data Source Authorization</a></li>
                 <li role="separator" class="divider"></li>
+                <li class="title"><span><b>Advanced</b><span></li>
+                <li><a href="{{BASE_PATH}}/install/virtual_machine.html">Zeppelin on Vagrant VM</a></li>
+                <li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-standalone-mode">Zeppelin on Spark Cluster Mode (Standalone)</a></li>
+                <li role="separator" class="divider"></li>
                 <li class="title"><span><b>Contibute</b><span></li>
                 <li><a href="{{BASE_PATH}}/development/writingzeppelininterpreter.html">Writing Zeppelin Interpreter</a></li>
                 <li><a href="{{BASE_PATH}}/development/writingzeppelinapplication.html">Writing Zeppelin Application (Experimental)</a></li>                

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/b9655032/docs/assets/themes/zeppelin/img/docs-img/spark_ui.png
----------------------------------------------------------------------
diff --git a/docs/assets/themes/zeppelin/img/docs-img/spark_ui.png b/docs/assets/themes/zeppelin/img/docs-img/spark_ui.png
new file mode 100644
index 0000000..ca91cf0
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/spark_ui.png differ

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/b9655032/docs/assets/themes/zeppelin/img/docs-img/standalone_conf.png
----------------------------------------------------------------------
diff --git a/docs/assets/themes/zeppelin/img/docs-img/standalone_conf.png b/docs/assets/themes/zeppelin/img/docs-img/standalone_conf.png
new file mode 100644
index 0000000..908fc84
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/standalone_conf.png differ

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/b9655032/docs/index.md
----------------------------------------------------------------------
diff --git a/docs/index.md b/docs/index.md
index 141e7f6..399393c 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -133,7 +133,6 @@ Join to our [Mailing list](https://zeppelin.apache.org/community.html) and repor
   * [Publish your Paragraph](./manual/publish.html) results into your external website
   * [Customize Zeppelin Homepage](./manual/notebookashomepage.html) with one of your notebooks
 * More
-  * [Apache Zeppelin on Vagrant VM](./install/virtual_machine.html): a guide for installing Apache Zeppelin on Vagrant virtual machine
   * [Upgrade Apache Zeppelin Version](./install/upgrade.html): a manual procedure of upgrading Apache Zeppelin version
 
 ####Interpreter
@@ -168,6 +167,9 @@ Join to our [Mailing list](https://zeppelin.apache.org/community.html) and repor
   * [Shiro Authentication](./security/shiroauthentication.html)
   * [Notebook Authorization](./security/notebook_authorization.html)
   * [Data Source Authorization](./security/datasource_authorization.html)
+* Advanced
+  * [Apache Zeppelin on Vagrant VM](./install/virtual_machine.html)
+  * [Zeppelin on Spark Cluster Mode (Standalone via Docker)](./install/spark_cluster_mode.html#spark-standalone-mode)
 * Contribute
   * [Writing Zeppelin Interpreter](./development/writingzeppelininterpreter.html)
   * [Writing Zeppelin Application (Experimental)](./development/writingzeppelinapplication.html)

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/b9655032/docs/install/spark_cluster_mode.md
----------------------------------------------------------------------
diff --git a/docs/install/spark_cluster_mode.md b/docs/install/spark_cluster_mode.md
new file mode 100644
index 0000000..d2517bd
--- /dev/null
+++ b/docs/install/spark_cluster_mode.md
@@ -0,0 +1,74 @@
+---
+layout: page
+title: "Apache Zeppelin on Spark cluster mode"
+description: ""
+group: install
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+{% include JB/setup %}
+
+# Apache Zeppelin on Spark Cluster Mode
+
+<div id="toc"></div>
+
+## Overview 
+[Apache Spark](http://spark.apache.org/) supports three cluster manager types so far: [Standalone](http://spark.apache.org/docs/latest/spark-standalone.html), [Apache Mesos](http://spark.apache.org/docs/latest/running-on-mesos.html), and [Hadoop YARN](http://spark.apache.org/docs/latest/running-on-yarn.html).
+This document describes how to build and configure an environment for each of the three Spark cluster managers with Apache Zeppelin using [Docker](https://www.docker.com/) scripts.
+[Install Docker](https://docs.docker.com/engine/installation/) on your machine first.
+
+## Spark standalone mode
+[Spark standalone](http://spark.apache.org/docs/latest/spark-standalone.html) is a simple cluster manager included with Spark that makes it easy to set up a cluster.
+You can set up a Spark standalone environment with the steps below.
+
+> **Note:** Because Apache Zeppelin and Spark both use port `8080` for their web UI by default, you might need to change `zeppelin.server.port` in `conf/zeppelin-site.xml`.
+
+### 1. Build Docker file
+You can find the Docker script files under `scripts/docker/spark-cluster-managers`.
+
+```
+cd $ZEPPELIN_HOME/scripts/docker/spark-cluster-managers/spark_standalone
+docker build -t "spark_standalone" .
+```
+
+### 2. Run docker
+
+```
+docker run -it \
+-p 8080:8080 \
+-p 7077:7077 \
+-p 8888:8888 \
+-p 8081:8081 \
+-h sparkmaster \
+--name spark_standalone \
+spark_standalone bash
+```
+
+### 3. Configure Spark interpreter in Zeppelin
+Set the Spark master to `spark://localhost:7077` on the Zeppelin **Interpreters** settings page.
+
+<img src="../assets/themes/zeppelin/img/docs-img/standalone_conf.png" />
+
+### 4. Run Zeppelin with Spark interpreter
+After running a single paragraph with the Spark interpreter in Zeppelin, browse to `http://localhost:8080` and check whether the Spark cluster is running properly.
+
+<img src="../assets/themes/zeppelin/img/docs-img/spark_ui.png" />
+
+You can also verify that Spark is running in the Docker container with the command below.
+
+```
+ps -ef | grep spark
+```
+
+

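As a minimal sketch of the port note in `docs/install/spark_cluster_mode.md`: besides editing `zeppelin.server.port` in `conf/zeppelin-site.xml`, Zeppelin also reads the `ZEPPELIN_PORT` environment variable, which is typically set in `conf/zeppelin-env.sh`. The value `8180` is an assumption chosen for illustration; this commit does not prescribe a port.

```shell
# conf/zeppelin-env.sh (fragment)
# Assumption: 8180 is an arbitrary free port, picked only so that Zeppelin
# does not collide with the Spark master web UI published on 8080.
export ZEPPELIN_PORT=8180
```

Restart Zeppelin after changing the port so the new value takes effect.
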
http://git-wip-us.apache.org/repos/asf/zeppelin/blob/b9655032/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile b/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
new file mode 100644
index 0000000..a7bae23
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/spark_standalone/Dockerfile
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+FROM centos:centos6
+MAINTAINER hsshim@nflabs.com
+
+ENV SPARK_PROFILE 1.6
+ENV SPARK_VERSION 1.6.2
+ENV HADOOP_PROFILE 2.3
+ENV SPARK_HOME /usr/local/spark
+
+# Update the image with the latest packages
+RUN yum update -y; yum clean all
+
+# Get utils
+RUN yum install -y \
+wget \
+tar \
+curl \
+&& \
+yum clean all
+
+# Remove old jdk
+RUN yum remove -y java; yum remove -y jdk
+
+# install jdk7 
+RUN yum install -y java-1.7.0-openjdk-devel
+ENV JAVA_HOME /usr/lib/jvm/java
+ENV PATH $PATH:$JAVA_HOME/bin
+
+# install spark
+RUN curl -s http://apache.mirror.cdnetworks.com/spark/spark-$SPARK_VERSION/spark-$SPARK_VERSION-bin-hadoop$HADOOP_PROFILE.tgz | tar -xz -C /usr/local/
+RUN cd /usr/local && ln -s spark-$SPARK_VERSION-bin-hadoop$HADOOP_PROFILE spark
+
+# update boot script
+COPY entrypoint.sh /etc/entrypoint.sh
+RUN chown root:root /etc/entrypoint.sh
+RUN chmod 700 /etc/entrypoint.sh
+
+#spark
+EXPOSE 8080 7077 8888 8081
+
+ENTRYPOINT ["/etc/entrypoint.sh"]

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/b9655032/scripts/docker/spark-cluster-managers/spark_standalone/entrypoint.sh
----------------------------------------------------------------------
diff --git a/scripts/docker/spark-cluster-managers/spark_standalone/entrypoint.sh b/scripts/docker/spark-cluster-managers/spark_standalone/entrypoint.sh
new file mode 100755
index 0000000..f4fded0
--- /dev/null
+++ b/scripts/docker/spark-cluster-managers/spark_standalone/entrypoint.sh
@@ -0,0 +1,31 @@
+#!/bin/bash
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+export SPARK_MASTER_PORT=7077
+
+# run spark 
+cd /usr/local/spark/sbin
+./start-master.sh
+./start-slave.sh spark://$(hostname):$SPARK_MASTER_PORT
+
+CMD=${1:-"exit 0"}
+if [[ "$CMD" == "-d" ]];
+then
+	service sshd stop
+	/usr/sbin/sshd -D -d
+else
+	/bin/bash -c "$*"
+fi
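
The argument handling at the end of `entrypoint.sh` can be sketched in isolation. The helper below is hypothetical and not part of this commit: it only prints which branch the script would take for a given argument list, without starting `sshd` or a shell. It suggests why `docker run ... spark_standalone bash` in the documentation ends up in an interactive shell after Spark has started.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): mirrors the if/else at the end of
# entrypoint.sh and reports the action instead of executing it.
describe_dispatch() {
  if [ "${1:-}" = "-d" ]; then
    # entrypoint.sh would run sshd in the foreground here.
    echo "sshd"
  else
    # entrypoint.sh would pass all arguments to bash as a single command string.
    echo "bash -c '$*'"
  fi
}

describe_dispatch -d     # prints: sshd
describe_dispatch bash   # prints: bash -c 'bash'
describe_dispatch        # prints: bash -c ''
```

With no arguments the command string is empty, so the container would exit as soon as the Spark start scripts return.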