Posted to commits@zeppelin.apache.org by zj...@apache.org on 2021/02/20 08:53:24 UTC

[zeppelin] branch master updated: [ZEPPELIN-5074] Improve on how to install doc

This is an automated email from the ASF dual-hosted git repository.

zjffdu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/zeppelin.git


The following commit(s) were added to refs/heads/master by this push:
     new 41cc755  [ZEPPELIN-5074] Improve on how to install doc
41cc755 is described below

commit 41cc755c01155fbee23fe4418cbe33c0ba36b52a
Author: Jeff Zhang <zj...@apache.org>
AuthorDate: Sat Jan 30 12:58:26 2021 +0800

    [ZEPPELIN-5074] Improve on how to install doc
    
    ### What is this PR for?
    
    Update the install doc for the latest zeppelin version 0.9.0.
    
    ### What type of PR is it?
    [ Documentation ]
    
    ### Todos
    * [ ] - Task
    
    ### What is the Jira issue?
    * https://issues.apache.org/jira/browse/ZEPPELIN-5074
    
    ### How should this be tested?
    * No test needed
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Do the license files need an update? No
    * Are there breaking changes for older versions? No
    * Does this need documentation? No
    
    Author: Jeff Zhang <zj...@apache.org>
    
    Closes #4026 from zjffdu/ZEPPELIN-5074 and squashes the following commits:
    
    8b376ba2a [Jeff Zhang] remove Ubuntu 16.x
    4f0dfaf57 [Jeff Zhang] [ZEPPELIN-5074] Improve on how to install doc
---
 docs/quickstart/install.md              | 31 +++++++++--
 docs/setup/basics/hadoop_integration.md |  2 +-
 docs/setup/basics/how_to_build.md       | 99 +++++++++++++++++++--------------
 3 files changed, 85 insertions(+), 47 deletions(-)

diff --git a/docs/quickstart/install.md b/docs/quickstart/install.md
index f0c3985..ec0ccc8 100644
--- a/docs/quickstart/install.md
+++ b/docs/quickstart/install.md
@@ -40,16 +40,16 @@ Apache Zeppelin officially supports and is tested on the following environments:
   </tr>
   <tr>
     <td>OS</td>
-    <td>Mac OSX <br /> Ubuntu 16.X</td>
+    <td>Mac OSX <br/> Ubuntu 18.04 <br/> Ubuntu 20.04</td>
   </tr>
 </table>
 
 ### Downloading Binary Package
 
-Two binary packages are available on the [download page](http://zeppelin.apache.org/download.html). Only difference between these two binaries is interpreters are included in the package file.
+Two binary packages are available on the [download page](http://zeppelin.apache.org/download.html). The only difference between these two binaries is whether all the interpreters are included in the package file.
 
 - **all interpreter package**: unpack it in a directory of your choice and you're ready to go.
-- **net-install interpreter package**: unpack and follow [install additional interpreters](../usage/interpreter/installation.html) to install interpreters. If you're unsure, just run `./bin/install-interpreter.sh --all` and install all interpreters.
+- **net-install interpreter package**: only the spark, python, markdown and shell interpreters are included. Unpack and follow [install additional interpreters](../usage/interpreter/installation.html) to install other interpreters. If you're unsure, just run `./bin/install-interpreter.sh --all` and install all interpreters.
   
 ### Building Zeppelin from source
 
@@ -67,12 +67,35 @@ bin/zeppelin-daemon.sh start
 
 After Zeppelin has started successfully, go to [http://localhost:8080](http://localhost:8080) with your web browser.
 
+By default, Zeppelin listens on `127.0.0.1:8080`, so you can't access it from another machine when it is deployed on a remote host.
+To access a remote Zeppelin, you need to change `zeppelin.server.addr` to `0.0.0.0` in `conf/zeppelin-site.xml`.
+
 #### Stopping Zeppelin
 
 ```
 bin/zeppelin-daemon.sh stop
 ```
 
+
+## Using the official docker image
+
+Make sure that [docker](https://www.docker.com/community-edition) is installed on your local machine.  
+
+Use this command to launch Apache Zeppelin in a container.
+
+```bash
+docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:0.9.0
+
+```
+To persist the `logs` and `notebook` directories, use the [volume](https://docs.docker.com/engine/reference/commandline/run/#mount-volume--v-read-only) option for the docker container.
+
+```bash
+docker run -p 8080:8080 --rm -v $PWD/logs:/logs -v $PWD/notebook:/notebook -e ZEPPELIN_LOG_DIR='/logs' -e ZEPPELIN_NOTEBOOK_DIR='/notebook' --name zeppelin apache/zeppelin:0.9.0
+```
+
+If you have trouble accessing `localhost:8080` in the browser, please clear the browser cache.
+
+
 ## Start Apache Zeppelin with a service manager
 
 > **Note :** The below description was written based on Ubuntu.
@@ -117,7 +140,7 @@ exec bin/zeppelin-daemon.sh upstart
 
 ## Next Steps
 
-Congratulations, you have successfully installed Apache Zeppelin! Here are few steps you might find useful:
+Congratulations, you have successfully installed Apache Zeppelin! Here are a few steps you might find useful:
 
 #### New to Apache Zeppelin...
  * For an in-depth overview, head to [Explore Zeppelin UI](../quickstart/explore_ui.html).
diff --git a/docs/setup/basics/hadoop_integration.md b/docs/setup/basics/hadoop_integration.md
index 9417ede..dd7d8a5 100644
--- a/docs/setup/basics/hadoop_integration.md
+++ b/docs/setup/basics/hadoop_integration.md
@@ -32,7 +32,7 @@ Hadoop is an optional component of zeppelin unless you need the following featur
 
 ## Requirements
 
-In Zeppelin 0.9 doesn't ship with hadoop dependencies, you need to include hadoop jars by yourself via the following steps
+Zeppelin 0.9 doesn't ship with hadoop dependencies, so you need to include the hadoop jars yourself via the following steps:
 
 * Hadoop client (both 2.x and 3.x are supported) is installed.
 * `$HADOOP_HOME/bin` is put in `PATH`. Because internally zeppelin will run command `hadoop classpath` to get all the hadoop jars and put them in the classpath of Zeppelin.
diff --git a/docs/setup/basics/how_to_build.md b/docs/setup/basics/how_to_build.md
index f38861b..7f70c33 100644
--- a/docs/setup/basics/how_to_build.md
+++ b/docs/setup/basics/how_to_build.md
@@ -38,7 +38,7 @@ If you want to build from source, you must first install the following dependenc
   </tr>
   <tr>
     <td>Maven</td>
-    <td>3.1.x or higher</td>
+    <td>3.6.3 or higher</td>
   </tr>
   <tr>
     <td>OpenJDK or Oracle JDK</td>
@@ -64,32 +64,40 @@ You can build Zeppelin with following maven command:
 mvn clean package -DskipTests [Options]
 ```
 
-If you're unsure about the options, use the same commands that creates official binary package.
+Check the [build-profiles](#build-profiles) section for further build options.
+If you are behind a proxy, follow the instructions in the [Proxy setting](#proxy-setting-optional) section.
+
+If you're interested in contributing, please check [Contributing to Apache Zeppelin (Code)](../../development/contribution/how_to_contribute_code.html) and [Contributing to Apache Zeppelin (Website)](../../development/contribution/how_to_contribute_website.html).
 
-```bash
-# update all pom.xml to use scala 2.11
-./dev/change_scala_version.sh 2.11
-# build zeppelin with all interpreters and include latest version of Apache spark support for local mode.
-mvn clean package -DskipTests -Pspark-2.3 -Phadoop-2.6 -Pscala-2.11
-```
 
 #### 3. Done
-You can directly start Zeppelin by running after successful build:
+You can directly start Zeppelin by running the following command after a successful build:
 
 ```bash
 ./bin/zeppelin-daemon.sh start
 ```
 
-Check [build-profiles](#build-profiles) section for further build options.
-If you are behind proxy, follow instructions in [Proxy setting](#proxy-setting-optional) section.
+### Build profiles
 
-If you're interested in contribution, please check [Contributing to Apache Zeppelin (Code)](../../development/contribution/how_to_contribute_code.html) and [Contributing to Apache Zeppelin (Website)](../../development/contribution/how_to_contribute_website.html).
 
-### Build profiles
+#### Scala profile
+
+Note that this scala profile affects the modules that use scala (e.g. cassandra, scalding), except the Spark interpreter (the Spark interpreter uses other profiles to control its scala version, see the doc below).
+
+Set scala version (default 2.10). Available profiles are
+
+```
+-Pscala-2.10
+-Pscala-2.11
+```
 
 #### Spark Interpreter
 
-To build with a specific Spark version, Hadoop version or specific features, define one or more of the following profiles and options:
+Note that the spark profiles here only affect the embedded mode (no need to specify `SPARK_HOME`) of the spark interpreter.
+Zeppelin doesn't require you to build with a different spark profile to make different versions of spark work in Zeppelin.
+You can run different versions of Spark in Zeppelin as long as you specify `SPARK_HOME`. Zeppelin supports all versions of Spark from 1.6 to 3.0.
+
+To build with a specific Spark version or scala version, define one or more of the following profiles and options:
 
 ##### `-Pspark-[version]`
 
@@ -98,6 +106,8 @@ Set spark major version
 Available profiles are
 
 ```
+-Pspark-3.0
+-Pspark-2.4
 -Pspark-2.3
 -Pspark-2.2
 -Pspark-2.1
@@ -107,30 +117,37 @@ Available profiles are
 
 minor version can be adjusted by `-Dspark.version=x.x.x`
 
+##### `-Pspark-scala-[version] (optional)`
 
-##### `-Phadoop[version]`
-
-set hadoop major version (default hadoop2)
+Note that these profiles also only affect the embedded mode (no need to specify `SPARK_HOME`) of the Spark interpreter.
+Zeppelin supports all of these scala versions (2.10, 2.11, 2.12) in the Spark interpreter as long as you specify `SPARK_HOME`.
 
 Available profiles are
 
 ```
--Phadoop2
--Phadoop3
+-Pspark-scala-2.10
+-Pspark-scala-2.11
+-Pspark-scala-2.12
 ```
 
-minor version can be adjusted by `-Dhadoop.version=x.x.x`
+If you want to use Spark 3.x in the embedded mode, then you have to specify both the `spark-3.0` and `spark-scala-2.12` profiles,
+because Spark 3.x doesn't support scala 2.10 and 2.11.
+ 
+#### Build hadoop with Zeppelin (`-Phadoop[version]`)
+ 
+Note that hadoop profiles only affect the Zeppelin server; they don't affect any interpreter.
+The Zeppelin server uses hadoop in some cases, such as using hdfs as notebook storage. You can check this [page](./hadoop_integration.html) for more details about how to configure hadoop in Zeppelin.
 
-##### `-Pscala-[version] (optional)`
-
-set scala version (default 2.10)
+Set hadoop major version (default hadoop2).
 Available profiles are
 
 ```
--Pscala-2.10
--Pscala-2.11
+-Phadoop2
+-Phadoop3
 ```
 
+minor version can be adjusted by `-Dhadoop.version=x.x.x`
+
 
 ##### `-Pvendor-repo` (optional)
 
@@ -146,19 +163,17 @@ Build examples under zeppelin-examples directory
 Here are some examples with several options:
 
 ```bash
-# build with spark-2.1, scala-2.11
-./dev/change_scala_version.sh 2.11
-mvn clean package -Pspark-2.1 -Pscala-2.11 -DskipTests
+# build with spark-3.0, spark-scala-2.12
+mvn clean package -Pspark-3.0 -Pspark-scala-2.12 -DskipTests
 
-# build with spark-2.0, scala-2.11
-./dev/change_scala_version.sh 2.11
-mvn clean package -Pspark-2.0 -Pscala-2.11 -DskipTests
+# build with spark-2.4, spark-scala-2.11
+mvn clean package -Pspark-2.4 -Pspark-scala-2.11 -DskipTests
 
-# build with spark-1.6, scala-2.10
-mvn clean package -Pspark-1.6 -DskipTests
+# build with spark-1.6, spark-scala-2.10
+mvn clean package -Pspark-1.6 -Pspark-scala-2.10 -DskipTests
 
-# with CDH
-mvn clean package -Pspark-1.6 -Dhadoop.version=2.6.0-cdh5.5.0 -Pvendor-repo -DskipTests
+# build with CDH
+mvn clean package -Pspark-1.6 -Pspark-scala-2.10 -Dhadoop.version=2.6.0-cdh5.5.0 -Pvendor-repo -DskipTests
 ```
 
 Ignite Interpreter
@@ -217,7 +232,7 @@ If you don't have requirements prepared, install it.
 ```bash
 sudo apt-get update
 sudo apt-get install git
-sudo apt-get install openjdk-7-jdk
+sudo apt-get install openjdk-8-jdk
 sudo apt-get install npm
 sudo apt-get install libfontconfig
 sudo apt-get install r-base-dev
@@ -229,14 +244,14 @@ sudo apt-get install r-cran-evaluate
 ### Install maven
 
 ```bash
-wget http://www.eu.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
-sudo tar -zxf apache-maven-3.3.9-bin.tar.gz -C /usr/local/
-sudo ln -s /usr/local/apache-maven-3.3.9/bin/mvn /usr/local/bin/mvn
+wget http://www.eu.apache.org/dist/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz
+sudo tar -zxf apache-maven-3.6.3-bin.tar.gz -C /usr/local/
+sudo ln -s /usr/local/apache-maven-3.6.3/bin/mvn /usr/local/bin/mvn
 ```
 
 _Notes:_
  - Ensure node is installed by running `node --version`  
- - Ensure maven is running version 3.1.x or higher with `mvn -version`
+ - Ensure maven is running version 3.6.3 or higher with `mvn -version`
  - Configure maven to use more memory than usual by `export MAVEN_OPTS="-Xmx2g -XX:MaxMetaspaceSize=512m"`
 
 
@@ -316,10 +331,10 @@ mvn clean package -Pbuild-distr
 To build a distribution with specific profiles, run:
 
 ```sh
-mvn clean package -Pbuild-distr -Pspark-2.3 -Phadoop-2.4
+mvn clean package -Pbuild-distr -Pspark-2.4
 ```
 
-The profiles `-Pspark-2.3 -Phadoop-2.4` can be adjusted if you wish to build to a specific spark versions.  
+The profile `-Pspark-2.4` can be adjusted if you wish to build against a specific spark version.  
 
 The archive is generated under _`zeppelin-distribution/target`_ directory
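
The updated `how_to_build.md` in this commit raises the Maven requirement to 3.6.3 or higher and asks the reader to confirm it with `mvn -version`. A minimal sketch of automating that check is shown below. The helper name `version_ge` and the variable `REQUIRED_MAVEN` are hypothetical and not part of Zeppelin. The sketch also assumes a `sort` that supports `-V` (as GNU coreutils does) and that the first line of `mvn -version` has the form `Apache Maven <version> ...`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), assumed to be available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Minimum Maven version stated in the updated how_to_build.md.
REQUIRED_MAVEN="3.6.3"

if command -v mvn >/dev/null 2>&1; then
  # Assumption: the first line of `mvn -version` looks like "Apache Maven 3.6.3 (...)".
  current="$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')"
  if version_ge "$current" "$REQUIRED_MAVEN"; then
    echo "Maven $current satisfies the $REQUIRED_MAVEN minimum"
  else
    echo "Maven $current is older than $REQUIRED_MAVEN" >&2
  fi
else
  echo "mvn not found on PATH" >&2
fi
```

Comparing with a version sort rather than a plain string comparison matters for releases such as 3.10.x, which a lexical comparison would rank below 3.6.3.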