Posted to commits@zeppelin.apache.org by ah...@apache.org on 2017/04/04 06:10:25 UTC

zeppelin git commit: [ZEPPELIN-2298] Remove -Ppyspark build profile

Repository: zeppelin
Updated Branches:
  refs/heads/master d4085468d -> c87fa53a3


[ZEPPELIN-2298] Remove -Ppyspark build profile

### What is this PR for?
Currently, users who build Zeppelin from source need to include `-Ppyspark` to use `%pyspark` with the embedded local Spark. But it is quite inconvenient to write this build profile every time we build, I think. So I removed `-Ppyspark` and made the pyspark-related libraries download automatically when Zeppelin is built.
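For illustration only (a minimal sketch; the profile set is just an example borrowed from the release scripts), the user-facing difference is:
```
# before this PR: -Ppyspark had to be added explicitly to get %pyspark with the embedded local Spark
mvn clean package -DskipTests -Pspark-2.1 -Phadoop-2.6 -Pyarn -Ppyspark -Psparkr

# after this PR: the pyspark libraries are downloaded and packaged automatically
mvn clean package -DskipTests -Pspark-2.1 -Phadoop-2.6 -Pyarn -Psparkr
```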

### What type of PR is it?
Improvement

### Todos
* [x] - remove the remaining `-Ppyspark` build profile references in `dev/create_release.sh`, `dev/publish_release.sh`, and `docs/install/build.md` after getting feedback

### What is the Jira issue?
[ZEPPELIN-2298](https://issues.apache.org/jira/browse/ZEPPELIN-2298)

### How should this be tested?
1. Apply this patch
2. Build the source with the command below
```
mvn clean package -DskipTests -pl 'zeppelin-interpreter, zeppelin-zengine, zeppelin-server, zeppelin-display, spark, spark-dependencies'
```
3. Check the build output. After this step, there will be a `pyspark` dir under `ZEPPELIN_HOME/interpreter/spark`. Before this PR, only the `dep` dir and `zeppelin-spark_2.10-0.8.0-SNAPSHOT.jar` were generated without the `-Ppyspark` build profile.
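As a quick sanity check (a minimal sketch; it assumes `ZEPPELIN_HOME` points at the source checkout you just built), you can list the interpreter directory:
```
ls $ZEPPELIN_HOME/interpreter/spark
# expected after this patch: dep/  pyspark/  zeppelin-spark_2.10-0.8.0-SNAPSHOT.jar
ls $ZEPPELIN_HOME/interpreter/spark/pyspark
# expected: the py4j-*-src.zip and pyspark.zip files copied in by the spark-dependencies build
```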

4. Restart Zeppelin. To make sure it works, run any Python code, e.g.
```
%pyspark
print("Hello "+z.input("name"))
```
It should run successfully without any errors.

### Screenshots (if appropriate)
 tl;dr Without the `-Ppyspark` profile
 - Before
<img width="856" alt="screen shot 2017-04-02 at 2 50 57 pm" src="https://cloud.githubusercontent.com/assets/10060731/24584778/0e8ec6b0-17b4-11e7-9f0d-f2599fd7bd63.png">

 - After
<img width="893" alt="screen shot 2017-04-02 at 2 28 21 pm" src="https://cloud.githubusercontent.com/assets/10060731/24584779/10b7ed68-17b4-11e7-90d4-aa95eb9bba2d.png">

### Questions:
* Do the license files need to be updated? no
* Are there breaking changes for older versions? no
* Does this need documentation? no

As a next step, I want to include `SparkR` by default in the same way (i.e. remove the `-Psparkr` build profile). I would like to ask what the Zeppelin community thinks about this.

Author: AhyoungRyu <fb...@hanmail.net>

Closes #2213 from AhyoungRyu/ZEPPELIN-2298/includePysparkByDefault and squashes the following commits:

f7bcf06 [AhyoungRyu] Remove -Ppyspark in virtual_machine.md
458ac02 [AhyoungRyu] Remove the rest of -Ppyspark in blind side of Zeppelin :)
cee1e87 [AhyoungRyu] Change py4j.version -> python.py4j.version
ce43158 [AhyoungRyu] Change py4j.version -> spark.py4j.version
fa4fb36 [AhyoungRyu] Remove the rest of -Ppyspark
30aac81 [AhyoungRyu] Remove -Ppyspark build flag


Project: http://git-wip-us.apache.org/repos/asf/zeppelin/repo
Commit: http://git-wip-us.apache.org/repos/asf/zeppelin/commit/c87fa53a
Tree: http://git-wip-us.apache.org/repos/asf/zeppelin/tree/c87fa53a
Diff: http://git-wip-us.apache.org/repos/asf/zeppelin/diff/c87fa53a

Branch: refs/heads/master
Commit: c87fa53a3added61f82bae7000122318e34a2676
Parents: d408546
Author: AhyoungRyu <fb...@hanmail.net>
Authored: Tue Apr 4 10:49:36 2017 +0900
Committer: ahyoungryu <ah...@apache.org>
Committed: Tue Apr 4 15:10:13 2017 +0900

----------------------------------------------------------------------
 .travis.yml                                     |  14 +-
 dev/create_release.sh                           |   4 +-
 dev/publish_release.sh                          |   2 +-
 docs/install/build.md                           |  16 +--
 docs/install/virtual_machine.md                 |   2 +-
 .../install_with_flink_and_spark_cluster.md     |   7 +-
 python/pom.xml                                  |  10 +-
 scripts/vagrant/zeppelin-dev/README.md          |   2 +-
 .../vagrant/zeppelin-dev/show-instructions.sh   |   2 +-
 spark-dependencies/pom.xml                      | 128 +++++++++----------
 spark/pom.xml                                   |   6 +-
 11 files changed, 91 insertions(+), 102 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/.travis.yml
----------------------------------------------------------------------
diff --git a/.travis.yml b/.travis.yml
index 9424a91..ea42117 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -51,7 +51,7 @@ matrix:
 
     # Test selenium with spark module for 1.6.3
     - jdk: "oraclejdk7"
-      env: TEST_SELENIUM="true" SCALA_VER="2.10" SPARK_VER="1.6.3" HADOOP_VER="2.6" PROFILE="-Pspark-1.6 -Phadoop-2.6 -Ppyspark -Phelium-dev -Pexamples" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="verify -DskipRat" TEST_PROJECTS="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark -Dtest=org.apache.zeppelin.AbstractFunctionalSuite -DfailIfNoTests=false"
+      env: TEST_SELENIUM="true" SCALA_VER="2.10" SPARK_VER="1.6.3" HADOOP_VER="2.6" PROFILE="-Pspark-1.6 -Phadoop-2.6 -Phelium-dev -Pexamples" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="verify -DskipRat" TEST_PROJECTS="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark -Dtest=org.apache.zeppelin.AbstractFunctionalSuite -DfailIfNoTests=false"
 
     # Test interpreter modules
     - jdk: "oraclejdk7"
@@ -59,27 +59,27 @@ matrix:
 
     # Test spark module for 2.1.0 with scala 2.11, livy
     - jdk: "oraclejdk7"
-      env: SCALA_VER="2.11" SPARK_VER="2.1.0" HADOOP_VER="2.6" PROFILE="-Pspark-2.1 -Phadoop-2.6 -Ppyspark -Psparkr -Pscala-2.11" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark,livy" TEST_PROJECTS="-Dtest=ZeppelinSparkClusterTest,org.apache.zeppelin.spark.*,org.apache.zeppelin.livy.* -DfailIfNoTests=false"
+      env: SCALA_VER="2.11" SPARK_VER="2.1.0" HADOOP_VER="2.6" PROFILE="-Pspark-2.1 -Phadoop-2.6 -Psparkr -Pscala-2.11" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark,livy" TEST_PROJECTS="-Dtest=ZeppelinSparkClusterTest,org.apache.zeppelin.spark.*,org.apache.zeppelin.livy.* -DfailIfNoTests=false"
 
     # Test spark module for 2.0.2 with scala 2.11
     - jdk: "oraclejdk7"
-      env: SCALA_VER="2.11" SPARK_VER="2.0.2" HADOOP_VER="2.6" PROFILE="-Pspark-2.0 -Phadoop-2.6 -Ppyspark -Psparkr -Pscala-2.11" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark" TEST_PROJECTS="-Dtest=ZeppelinSparkClusterTest,org.apache.zeppelin.spark.* -DfailIfNoTests=false"
+      env: SCALA_VER="2.11" SPARK_VER="2.0.2" HADOOP_VER="2.6" PROFILE="-Pspark-2.0 -Phadoop-2.6 -Psparkr -Pscala-2.11" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark" TEST_PROJECTS="-Dtest=ZeppelinSparkClusterTest,org.apache.zeppelin.spark.* -DfailIfNoTests=false"
 
     # Test spark module for 1.6.3 with scala 2.10
     - jdk: "oraclejdk7"
-      env: SCALA_VER="2.10" SPARK_VER="1.6.3" HADOOP_VER="2.6" PROFILE="-Pspark-1.6 -Phadoop-2.6 -Ppyspark -Psparkr -Pscala-2.10" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark" TEST_PROJECTS="-Dtest=ZeppelinSparkClusterTest,org.apache.zeppelin.spark.*,org.apache.zeppelin.spark.* -DfailIfNoTests=false"
+      env: SCALA_VER="2.10" SPARK_VER="1.6.3" HADOOP_VER="2.6" PROFILE="-Pspark-1.6 -Phadoop-2.6 -Psparkr -Pscala-2.10" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark" TEST_PROJECTS="-Dtest=ZeppelinSparkClusterTest,org.apache.zeppelin.spark.*,org.apache.zeppelin.spark.* -DfailIfNoTests=false"
 
     # Test spark module for 1.6.3 with scala 2.11
     - jdk: "oraclejdk7"
-      env: SCALA_VER="2.11" SPARK_VER="1.6.3" HADOOP_VER="2.6" PROFILE="-Pspark-1.6 -Phadoop-2.6 -Ppyspark -Psparkr -Pscala-2.11" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark" TEST_PROJECTS="-Dtest=ZeppelinSparkClusterTest,org.apache.zeppelin.spark.* -DfailIfNoTests=false"
+      env: SCALA_VER="2.11" SPARK_VER="1.6.3" HADOOP_VER="2.6" PROFILE="-Pspark-1.6 -Phadoop-2.6 -Psparkr -Pscala-2.11" BUILD_FLAG="package -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-zengine,zeppelin-server,zeppelin-display,spark-dependencies,spark" TEST_PROJECTS="-Dtest=ZeppelinSparkClusterTest,org.apache.zeppelin.spark.* -DfailIfNoTests=false"
 
     # Test python/pyspark with python 2
     - jdk: "oraclejdk7"
-      env: PYTHON="2" SCALA_VER="2.10" SPARK_VER="1.6.1" HADOOP_VER="2.6" PROFILE="-Pspark-1.6 -Phadoop-2.6 -Ppyspark" BUILD_FLAG="package -am -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-display,spark-dependencies,spark,python" TEST_PROJECTS="-Dtest=org.apache.zeppelin.spark.PySpark*Test,org.apache.zeppelin.python.* -Dpyspark.test.exclude='' -DfailIfNoTests=false"
+      env: PYTHON="2" SCALA_VER="2.10" SPARK_VER="1.6.1" HADOOP_VER="2.6" PROFILE="-Pspark-1.6 -Phadoop-2.6" BUILD_FLAG="package -am -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-display,spark-dependencies,spark,python" TEST_PROJECTS="-Dtest=org.apache.zeppelin.spark.PySpark*Test,org.apache.zeppelin.python.* -Dpyspark.test.exclude='' -DfailIfNoTests=false"
 
     # Test python/pyspark with python 3
     - jdk: "oraclejdk7"
-      env: PYTHON="3" SCALA_VER="2.11" SPARK_VER="2.0.0" HADOOP_VER="2.6" PROFILE="-Pspark-2.0 -Phadoop-2.6 -Ppyspark -Pscala-2.11" BUILD_FLAG="package -am -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-display,spark-dependencies,spark,python" TEST_PROJECTS="-Dtest=org.apache.zeppelin.spark.PySpark*Test,org.apache.zeppelin.python.* -Dpyspark.test.exclude='' -DfailIfNoTests=false"
+      env: PYTHON="3" SCALA_VER="2.11" SPARK_VER="2.0.0" HADOOP_VER="2.6" PROFILE="-Pspark-2.0 -Phadoop-2.6 -Pscala-2.11" BUILD_FLAG="package -am -DskipTests -DskipRat" TEST_FLAG="test -DskipRat" MODULES="-pl .,zeppelin-interpreter,zeppelin-display,spark-dependencies,spark,python" TEST_PROJECTS="-Dtest=org.apache.zeppelin.spark.PySpark*Test,org.apache.zeppelin.python.* -Dpyspark.test.exclude='' -DfailIfNoTests=false"
 
 before_install:
   # check files included in commit range, clear bower_components if a bower.json file has changed.

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/dev/create_release.sh
----------------------------------------------------------------------
diff --git a/dev/create_release.sh b/dev/create_release.sh
index cac6a75..34e6c88 100755
--- a/dev/create_release.sh
+++ b/dev/create_release.sh
@@ -106,8 +106,8 @@ function make_binary_release() {
 
 git_clone
 make_source_package
-make_binary_release all "-Pspark-2.1 -Phadoop-2.6 -Pyarn -Ppyspark -Psparkr -Pscala-${SCALA_VERSION}"
-make_binary_release netinst "-Pspark-2.1 -Phadoop-2.6 -Pyarn -Ppyspark -Psparkr -Pscala-${SCALA_VERSION} -pl zeppelin-interpreter,zeppelin-zengine,:zeppelin-display_${SCALA_VERSION},:zeppelin-spark-dependencies_${SCALA_VERSION},:zeppelin-spark_${SCALA_VERSION},zeppelin-web,zeppelin-server,zeppelin-distribution -am"
+make_binary_release all "-Pspark-2.1 -Phadoop-2.6 -Pyarn -Psparkr -Pscala-${SCALA_VERSION}"
+make_binary_release netinst "-Pspark-2.1 -Phadoop-2.6 -Pyarn -Psparkr -Pscala-${SCALA_VERSION} -pl zeppelin-interpreter,zeppelin-zengine,:zeppelin-display_${SCALA_VERSION},:zeppelin-spark-dependencies_${SCALA_VERSION},:zeppelin-spark_${SCALA_VERSION},zeppelin-web,zeppelin-server,zeppelin-distribution -am"
 
 # remove non release files and dirs
 rm -rf "${WORKING_DIR}/zeppelin"

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/dev/publish_release.sh
----------------------------------------------------------------------
diff --git a/dev/publish_release.sh b/dev/publish_release.sh
index 5322f77..6c0bc9d 100755
--- a/dev/publish_release.sh
+++ b/dev/publish_release.sh
@@ -46,7 +46,7 @@ if [[ $RELEASE_VERSION == *"SNAPSHOT"* ]]; then
   DO_SNAPSHOT="yes"
 fi
 
-PUBLISH_PROFILES="-Ppublish-distr -Pspark-2.1 -Phadoop-2.6 -Pyarn -Ppyspark -Psparkr -Pr"
+PUBLISH_PROFILES="-Ppublish-distr -Pspark-2.1 -Phadoop-2.6 -Pyarn -Psparkr -Pr"
 PROJECT_OPTIONS="-pl !zeppelin-distribution"
 NEXUS_STAGING="https://repository.apache.org/service/local/staging"
 NEXUS_PROFILE="153446d1ac37c4"

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/docs/install/build.md
----------------------------------------------------------------------
diff --git a/docs/install/build.md b/docs/install/build.md
index 985f2da..8bae369 100644
--- a/docs/install/build.md
+++ b/docs/install/build.md
@@ -69,7 +69,7 @@ If you're unsure about the options, use the same commands that creates official
 # update all pom.xml to use scala 2.11
 ./dev/change_scala_version.sh 2.11
 # build zeppelin with all interpreters and include latest version of Apache spark support for local mode.
-mvn clean package -DskipTests -Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pr -Pscala-2.11
+mvn clean package -DskipTests -Pspark-2.0 -Phadoop-2.4 -Pyarn -Psparkr -Pr -Pscala-2.11
 ```
 
 ####3. Done
@@ -145,10 +145,6 @@ Available profiles are
 enable YARN support for local mode
 > YARN for local mode is not supported for Spark v1.5.0 or higher. Set `SPARK_HOME` instead.
 
-##### `-Ppyspark` (optional)
-
-enable [PySpark](http://spark.apache.org/docs/latest/api/python/) support for local mode.
-
 ##### `-Pr` (optional)
 
 enable [R](https://www.r-project.org/) support with [SparkR](https://spark.apache.org/docs/latest/sparkr.html) integration.
@@ -188,14 +184,14 @@ Here are some examples with several options:
 ```bash
 # build with spark-2.1, scala-2.11
 ./dev/change_scala_version.sh 2.11
-mvn clean package -Pspark-2.1 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pscala-2.11 -DskipTests
+mvn clean package -Pspark-2.1 -Phadoop-2.4 -Pyarn -Psparkr -Pscala-2.11 -DskipTests
 
 # build with spark-2.0, scala-2.11
 ./dev/change_scala_version.sh 2.11
-mvn clean package -Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pscala-2.11 -DskipTests
+mvn clean package -Pspark-2.0 -Phadoop-2.4 -Pyarn -Psparkr -Pscala-2.11 -DskipTests
 
 # build with spark-1.6, scala-2.10
-mvn clean package -Pspark-1.6 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -DskipTests
+mvn clean package -Pspark-1.6 -Phadoop-2.4 -Pyarn -Psparkr -DskipTests
 
 # spark-cassandra integration
 mvn clean package -Pcassandra-spark-1.5 -Dhadoop.version=2.6.0 -Phadoop-2.6 -DskipTests -DskipTests
@@ -328,10 +324,10 @@ mvn clean package -Pbuild-distr
 To build a distribution with specific profiles, run:
 
 ```sh
-mvn clean package -Pbuild-distr -Pspark-1.5 -Phadoop-2.4 -Pyarn -Ppyspark
+mvn clean package -Pbuild-distr -Pspark-1.5 -Phadoop-2.4 -Pyarn
 ```
 
-The profiles `-Pspark-1.5 -Phadoop-2.4 -Pyarn -Ppyspark` can be adjusted if you wish to build to a specific spark versions, or omit support such as `yarn`.  
+The profiles `-Pspark-1.5 -Phadoop-2.4 -Pyarn` can be adjusted if you wish to build to a specific spark versions, or omit support such as `yarn`.  
 
 The archive is generated under _`zeppelin-distribution/target`_ directory
 

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/docs/install/virtual_machine.md
----------------------------------------------------------------------
diff --git a/docs/install/virtual_machine.md b/docs/install/virtual_machine.md
index 6456bc5..d0973d6 100644
--- a/docs/install/virtual_machine.md
+++ b/docs/install/virtual_machine.md
@@ -110,7 +110,7 @@ This assumes you've already cloned the project either on the host machine in the
 
 ```
 cd /zeppelin
-mvn clean package -Pspark-1.6 -Ppyspark -Phadoop-2.4 -Psparkr -DskipTests
+mvn clean package -Pspark-1.6 -Phadoop-2.4 -Psparkr -DskipTests
 ./bin/zeppelin-daemon.sh start
 ```
 

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/docs/quickstart/install_with_flink_and_spark_cluster.md
----------------------------------------------------------------------
diff --git a/docs/quickstart/install_with_flink_and_spark_cluster.md b/docs/quickstart/install_with_flink_and_spark_cluster.md
index 0fec15c..89d6d6e 100644
--- a/docs/quickstart/install_with_flink_and_spark_cluster.md
+++ b/docs/quickstart/install_with_flink_and_spark_cluster.md
@@ -130,12 +130,11 @@ mvn clean package -DskipTests -Pspark-1.6 -Dflink.version=1.1.3 -Pscala-2.10
 -`-Pscala-2.10` tells maven to build with Scala v2.10.
 
 
-**Note:** You may wish to include additional build flags such as `-Ppyspark` or `-Psparkr`.  See [the build section of github for more details](https://github.com/apache/zeppelin#build).
-
 **Note:** You can build against any version of Spark that has a Zeppelin build profile available. The key is to make sure you check out the matching version of Spark to build. At the time of this writing, Spark 1.6 was the most recent Spark version available.
 
 **Note:** On build failures. Having installed Zeppelin close to 30 times now, I will tell you that sometimes the build fails for seemingly no reason.
 As long as you didn't edit any code, it is unlikely the build is failing because of something you did. What does tend to happen, is some dependency that maven is trying to download is unreachable.  If your build fails on this step here are some tips:
+
 - Don't get discouraged.
 - Scroll up and read through the logs. There will be clues there.
 - Retry (that is, run the `mvn clean package -DskipTests -Pspark-1.6` again)
@@ -154,7 +153,7 @@ Use `ifconfig` to determine the host machine's IP address. If you are not famili
 
 Open a web-browser on a machine connected to the same network as the host (or in the host operating system if using a virtual machine).  Navigate to http://`yourip`:8080, where yourip is the IP address you found in `ifconfig`.
 
-See the [Zeppelin tutorial](../tutorial/tutorial.md) for basic Zeppelin usage. It is also advised that you take a moment to check out the tutorial notebook that is included with each Zeppelin install, and to familiarize yourself with basic notebook functionality.
+See the [Zeppelin tutorial](../tutorial/tutorial.html) for basic Zeppelin usage. It is also advised that you take a moment to check out the tutorial notebook that is included with each Zeppelin install, and to familiarize yourself with basic notebook functionality.
 
 ##### Flink Test
 Create a new notebook named "Flink Test" and copy and paste the following code.
@@ -417,6 +416,6 @@ You should be able check the Flink and Spark webuis (at something like http://`y
 
 ### Next Steps
 
-Check out the [tutorial](./tutorial.md) for more cool things you can do with your new toy!
+Check out the [tutorial](./tutorial.html) for more cool things you can do with your new toy!
 
 [Join the community](http://zeppelin.apache.org/community.html), ask questions and contribute! Every little bit helps.

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/python/pom.xml
----------------------------------------------------------------------
diff --git a/python/pom.xml b/python/pom.xml
index 681986c..e520b4b 100644
--- a/python/pom.xml
+++ b/python/pom.xml
@@ -33,7 +33,7 @@
   <name>Zeppelin: Python interpreter</name>
 
   <properties>
-    <py4j.version>0.9.2</py4j.version>
+    <python.py4j.version>0.9.2</python.py4j.version>
     <python.test.exclude>
         **/PythonInterpreterWithPythonInstalledTest.java,
         **/PythonInterpreterPandasSqlTest.java,
@@ -58,7 +58,7 @@
     <dependency>
       <groupId>net.sf.py4j</groupId>
       <artifactId>py4j</artifactId>
-      <version>${py4j.version}</version>
+      <version>${python.py4j.version}</version>
     </dependency>
 
     <dependency>
@@ -108,8 +108,8 @@
             <goals><goal>download-single</goal></goals>
             <configuration>
               <url>https://pypi.python.org/packages/64/5c/01e13b68e8caafece40d549f232c9b5677ad1016071a48d04cc3895acaa3</url>
-              <fromFile>py4j-${py4j.version}.zip</fromFile>
-              <toFile>${project.build.directory}/../../interpreter/python/py4j-${py4j.version}.zip</toFile>
+              <fromFile>py4j-${python.py4j.version}.zip</fromFile>
+              <toFile>${project.build.directory}/../../interpreter/python/py4j-${python.py4j.version}.zip</toFile>
             </configuration>
           </execution>
         </executions>
@@ -123,7 +123,7 @@
             <phase>package</phase>
             <configuration>
               <target>
-                <unzip src="${project.build.directory}/../../interpreter/python/py4j-${py4j.version}.zip"
+                <unzip src="${project.build.directory}/../../interpreter/python/py4j-${python.py4j.version}.zip"
                        dest="${project.build.directory}/../../interpreter/python"/>
               </target>
             </configuration>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/scripts/vagrant/zeppelin-dev/README.md
----------------------------------------------------------------------
diff --git a/scripts/vagrant/zeppelin-dev/README.md b/scripts/vagrant/zeppelin-dev/README.md
index fd428d6..8ebde4b 100644
--- a/scripts/vagrant/zeppelin-dev/README.md
+++ b/scripts/vagrant/zeppelin-dev/README.md
@@ -87,7 +87,7 @@ This assumes you've already cloned the project either on the host machine in the
 
 ```
 cd /zeppelin
-mvn clean package -Pspark-1.6 -Ppyspark -Phadoop-2.4 -Psparkr -DskipTests
+mvn clean package -Pspark-1.6 -Phadoop-2.4 -Psparkr -DskipTests
 ./bin/zeppelin-daemon.sh start
 ```
 

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/scripts/vagrant/zeppelin-dev/show-instructions.sh
----------------------------------------------------------------------
diff --git a/scripts/vagrant/zeppelin-dev/show-instructions.sh b/scripts/vagrant/zeppelin-dev/show-instructions.sh
index f3b2b27..bb9abee 100644
--- a/scripts/vagrant/zeppelin-dev/show-instructions.sh
+++ b/scripts/vagrant/zeppelin-dev/show-instructions.sh
@@ -34,7 +34,7 @@ echo 'mvn clean package -DskipTests'
 echo
 echo '# or for a specific Spark/Hadoop build with additional options such as python and R support'
 echo
-echo 'mvn clean package -Pspark-1.6 -Ppyspark -Phadoop-2.4 -Psparkr -DskipTests'
+echo 'mvn clean package -Pspark-1.6 -Phadoop-2.4 -Psparkr -DskipTests'
 echo './bin/zeppelin-daemon.sh start'
 echo
 echo 'On your host machine browse to http://localhost:8080/'

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/spark-dependencies/pom.xml
----------------------------------------------------------------------
diff --git a/spark-dependencies/pom.xml b/spark-dependencies/pom.xml
index 2ce787c..3592509 100644
--- a/spark-dependencies/pom.xml
+++ b/spark-dependencies/pom.xml
@@ -62,7 +62,7 @@
     <spark.bin.download.url>
       http://d3kbcqa49mib13.cloudfront.net/spark-${spark.version}-bin-without-hadoop.tgz
     </spark.bin.download.url>
-    <py4j.version>0.8.2.1</py4j.version>
+    <spark.py4j.version>0.8.2.1</spark.py4j.version>
 
     <!--plugin versions-->
     <plugin.shade.version>2.3</plugin.shade.version>
@@ -514,7 +514,7 @@
       <id>spark-1.6</id>
       <properties>
         <spark.version>1.6.3</spark.version>
-        <py4j.version>0.9</py4j.version>
+        <spark.py4j.version>0.9</spark.py4j.version>
         <akka.group>com.typesafe.akka</akka.group>
         <akka.version>2.3.11</akka.version>
         <protobuf.version>2.5.0</protobuf.version>
@@ -526,7 +526,7 @@
       <properties>
         <spark.version>2.0.2</spark.version>
         <protobuf.version>2.5.0</protobuf.version>
-        <py4j.version>0.10.3</py4j.version>
+        <spark.py4j.version>0.10.3</spark.py4j.version>
         <scala.version>2.11.8</scala.version>
       </properties>
     </profile>
@@ -539,7 +539,7 @@
       <properties>
         <spark.version>2.1.0</spark.version>
         <protobuf.version>2.5.0</protobuf.version>
-        <py4j.version>0.10.4</py4j.version>
+        <spark.py4j.version>0.10.4</spark.py4j.version>
         <scala.version>2.11.8</scala.version>
       </properties>
     </profile>
@@ -828,69 +828,6 @@
     </profile>
 
     <profile>
-      <id>pyspark</id>
-      <build>
-        <plugins>
-          <plugin>
-            <groupId>com.googlecode.maven-download-plugin</groupId>
-            <artifactId>download-maven-plugin</artifactId>
-            <executions>
-              <execution>
-                <id>download-pyspark-files</id>
-                <phase>validate</phase>
-                <goals>
-                  <goal>wget</goal>
-                </goals>
-                <configuration>
-                  <readTimeOut>60000</readTimeOut>
-                  <retries>5</retries>
-                  <unpack>true</unpack>
-                  <url>${spark.src.download.url}</url>
-                  <outputDirectory>${project.build.directory}</outputDirectory>
-                </configuration>
-              </execution>
-            </executions>
-          </plugin>
-
-          <plugin>
-            <artifactId>maven-clean-plugin</artifactId>
-            <configuration>
-              <filesets>
-                <fileset>
-                  <directory>${basedir}/../python/build</directory>
-                </fileset>
-              </filesets>
-            </configuration>
-          </plugin>
-
-          <plugin>
-            <groupId>org.apache.maven.plugins</groupId>
-            <artifactId>maven-antrun-plugin</artifactId>
-            <executions>
-              <execution>
-                <id>zip-pyspark-files</id>
-                <phase>generate-resources</phase>
-                <goals>
-                  <goal>run</goal>
-                </goals>
-                <configuration>
-                  <target>
-                    <delete dir="../interpreter/spark/pyspark"/>
-                    <copy todir="../interpreter/spark/pyspark"
-                          file="${project.build.directory}/${spark.archive}/python/lib/py4j-${py4j.version}-src.zip"/>
-                    <zip destfile="${project.build.directory}/../../interpreter/spark/pyspark/pyspark.zip"
-                         basedir="${project.build.directory}/${spark.archive}/python"
-                         includes="pyspark/*.py,pyspark/**/*.py"/>
-                  </target>
-                </configuration>
-              </execution>
-            </executions>
-          </plugin>
-        </plugins>
-      </build>
-    </profile>
-
-    <profile>
       <id>sparkr</id>
       <build>
         <plugins>
@@ -1045,6 +982,63 @@
           </execution>
         </executions>
       </plugin>
+
+      <!-- include pyspark by default -->
+      <plugin>
+        <groupId>com.googlecode.maven-download-plugin</groupId>
+        <artifactId>download-maven-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>download-pyspark-files</id>
+            <phase>validate</phase>
+            <goals>
+              <goal>wget</goal>
+            </goals>
+            <configuration>
+              <readTimeOut>60000</readTimeOut>
+              <retries>5</retries>
+              <unpack>true</unpack>
+              <url>${spark.src.download.url}</url>
+              <outputDirectory>${project.build.directory}</outputDirectory>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+
+      <plugin>
+        <artifactId>maven-clean-plugin</artifactId>
+        <configuration>
+          <filesets>
+            <fileset>
+              <directory>${basedir}/../python/build</directory>
+            </fileset>
+          </filesets>
+        </configuration>
+      </plugin>
+
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-antrun-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>zip-pyspark-files</id>
+            <phase>generate-resources</phase>
+            <goals>
+              <goal>run</goal>
+            </goals>
+            <configuration>
+              <target>
+                <delete dir="../interpreter/spark/pyspark"/>
+                <copy todir="../interpreter/spark/pyspark"
+                      file="${project.build.directory}/${spark.archive}/python/lib/py4j-${spark.py4j.version}-src.zip"/>
+                <zip destfile="${project.build.directory}/../../interpreter/spark/pyspark/pyspark.zip"
+                     basedir="${project.build.directory}/${spark.archive}/python"
+                     includes="pyspark/*.py,pyspark/**/*.py"/>
+              </target>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
     </plugins>
   </build>
 </project>

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/c87fa53a/spark/pom.xml
----------------------------------------------------------------------
diff --git a/spark/pom.xml b/spark/pom.xml
index 046b4d5..c494111 100644
--- a/spark/pom.xml
+++ b/spark/pom.xml
@@ -506,7 +506,7 @@
       <id>spark-1.6</id>
       <properties>
         <spark.version>1.6.3</spark.version>
-        <py4j.version>0.9</py4j.version>
+        <spark.py4j.version>0.9</spark.py4j.version>
         <akka.group>com.typesafe.akka</akka.group>
         <akka.version>2.3.11</akka.version>
         <protobuf.version>2.5.0</protobuf.version>
@@ -518,7 +518,7 @@
       <properties>
         <spark.version>2.0.2</spark.version>
         <protobuf.version>2.5.0</protobuf.version>
-        <py4j.version>0.10.3</py4j.version>
+        <spark.py4j.version>0.10.3</spark.py4j.version>
         <scala.version>2.11.8</scala.version>
       </properties>
     </profile>
@@ -531,7 +531,7 @@
       <properties>
         <spark.version>2.1.0</spark.version>
         <protobuf.version>2.5.0</protobuf.version>
-        <py4j.version>0.10.4</py4j.version>
+        <spark.py4j.version>0.10.4</spark.py4j.version>
         <scala.version>2.11.8</scala.version>
       </properties>
     </profile>