Posted to users@zeppelin.apache.org by Thomas Bünger <th...@googlemail.com> on 2018/06/05 12:21:21 UTC

NewSparkInterpreter fails on yarn-cluster

I've tried the 0.8.0-rc4 on my EMR cluster using the preinstalled version
of spark under /usr/lib/spark.

This works fine in local or yarn-client mode, but in yarn-cluster mode I just
get a

java.lang.NullPointerException at
org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)

Seems to be caused by an unsuccessful search for the py4j libraries.
I've made sure that SPARK_HOME is actually set in .bashrc, in
zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
interpreter, something odd is going on.

Best regards,
 Thomas
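
For anyone else hitting this NPE: as noted above, the failure appears to come
from the interpreter's search for the py4j zip under SPARK_HOME/python/lib, so
SPARK_HOME must be visible to the interpreter process itself. A minimal
conf/zeppelin-env.sh sketch, assuming the EMR paths used in this thread:

  # conf/zeppelin-env.sh -- a sketch; paths assume the EMR layout above
  export SPARK_HOME=/usr/lib/spark         # must contain python/lib/py4j-*-src.zip
  export HADOOP_CONF_DIR=/etc/hadoop/conf  # lets the interpreter reach YARN

As the rest of the thread shows, this alone does not help in yarn-cluster
mode, where the interpreter runs on a remote YARN node that never sees this
file.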

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jongyoul Lee <jo...@gmail.com>.
I'm not sure whether it's my fault or not, but I fixed it by adding
/etc/spark2/hive-site.xml in the interpreter tab. I'll investigate this issue
and file it later.
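
For anyone reproducing this fix: the goal is simply to make hive-site.xml
visible to the Spark driver. A sketch of two common ways to do that, assuming
the /etc/spark2 path above:

  # copy it next to Spark's own conf files on the Zeppelin host
  sudo cp /etc/spark2/hive-site.xml /usr/lib/spark/conf/
  # or ship it with the YARN application via a Spark property
  # (set e.g. in the interpreter settings):
  #   spark.yarn.dist.files  file:///etc/spark2/hive-site.xml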

On Mon, Jun 11, 2018 at 11:13 AM, Jeff Zhang <zj...@gmail.com> wrote:

>
> Hi Jongyoul,
>
> I find HiveContext works for me in yarn-cluster mode. Did you put
> hive-site.xml under SPARK_CONF_DIR?
>
>
>
> Jeff Zhang <zj...@gmail.com>于2018年6月11日周一 上午10:07写道:
>
>>
>> Thanks Jongyoul, I will fix it before the next RC
>>
>> Jongyoul Lee <jo...@gmail.com>于2018年6月11日周一 上午9:54写道:
>>
>>> BTW, it's a bit different, but I found that a HiveContext cannot be created
>>> in yarn-cluster mode. I'll test it more, but we might need to fix it.
>>>
>>> On Sun, Jun 10, 2018 at 11:31 AM, Jeff Zhang <zj...@gmail.com> wrote:
>>>
>>>> BTW, Zeppelin doesn't need to be installed on all the nodes of the
>>>> cluster. Installing it on one node is sufficient.
>>>>
>>>>
>>>> Jeff Zhang <zj...@gmail.com>于2018年6月10日周日 上午10:22写道:
>>>>
>>>>>
>>>>> hmm, maybe it is due to the --driver-class-path in interpreter.sh. I
>>>>> will create a ticket to remove this for yarn-cluster mode.
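
To make the failure mode concrete: in yarn-cluster mode the driver is started
on an arbitrary YARN node, so a launch of roughly the following shape (an
illustrative sketch, not the literal interpreter.sh command line) hands the
driver classpath entries that only exist on the Zeppelin host:

  spark-submit \
    --master yarn --deploy-mode cluster \
    --driver-class-path '/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/*:...' \
    spark-interpreter-0.8.1-SNAPSHOT.jar
  # the same host-local paths appear as spark.driver.extraClassPath
  # in the environment dump below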
>>>>>
>>>>>
>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月10日周日 上午6:04写道:
>>>>>
>>>>>> I just copied the Zeppelin installation to the exact same
>>>>>> location on each YARN node, and now everything works fine!
>>>>>> So it seems some jar file is not being shipped to the Spark
>>>>>> driver node, or the classpath is wrong.
>>>>>>
>>>>>> Maybe the following dump from the Spark UI might help somehow?
>>>>>>
>>>>>> *Environment*
>>>>>> *Runtime Information*
>>>>>> Name Value
>>>>>> Java Version 1.8.0_171 (Oracle Corporation)
>>>>>> Java Home /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-7.b10.37.amzn1.x86_64/jre
>>>>>> Scala Version version 2.11.8
>>>>>> *Spark Properties*
>>>>>> Name Value
>>>>>> SPARK_HOME /usr/lib/spark
>>>>>> master yarn-cluster
>>>>>> spark.app.id application_1528204441221_0010
>>>>>> spark.app.name Zeppelin
>>>>>> spark.blacklist.decommissioning.enabled true
>>>>>> spark.blacklist.decommissioning.timeout 1h
>>>>>> spark.decommissioning.timeout.threshold 20
>>>>>> spark.driver.extraClassPath :/home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/lib/interpreter/*::/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/spark-interpreter-0.8.1-SNAPSHOT.jar:/etc/hadoop/conf/
>>>>>> spark.driver.extraJavaOptions -Dfile.encoding=UTF-8 -Dlog4j.configuration=log4j_yarn_cluster.properties -Dzeppelin.log.file=/home/hadoop/zeppelin-0.8.1-SNAPSHOT/logs/zeppelin-interpreter-spark-hadoop-ip-10-126-82-0.log
>>>>>> spark.driver.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
>>>>>> spark.driver.host ip-10-126-87-125.us-east-1.aws.*****
>>>>>> spark.driver.port 38237
>>>>>> spark.dynamicAllocation.enabled true
>>>>>> spark.eventLog.dir hdfs:///var/log/spark/apps
>>>>>> spark.eventLog.enabled true
>>>>>> spark.executor.cores 4
>>>>>> spark.executor.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar
>>>>>> spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
>>>>>> spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
>>>>>> spark.executor.id driver
>>>>>> spark.executorEnv.PYTHONPATH /usr/lib/spark/python/lib/py4j-0.10.6-src.zip:/usr/lib/spark/python/:<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.10.6-src.zip
>>>>>> spark.files.fetchFailure.unRegisterOutputOnHost true
>>>>>> spark.hadoop.yarn.timeline-service.enabled false
>>>>>> spark.history.fs.logDirectory hdfs:///var/log/spark/apps
>>>>>> spark.history.ui.port 18080
>>>>>> spark.jars
>>>>>> spark.master yarn
>>>>>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS ip-10-126-82-0.us-east-1.aws.*****
>>>>>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES http://ip-10-126-82-0.us-east-1.aws.*****:20888/proxy/application_1528204441221_0010
>>>>>> spark.repl.class.outputDir *********(redacted)
>>>>>> spark.repl.class.uri spark://ip-10-126-87-125.us-east-1.aws.*****:38237/classes
>>>>>> spark.resourceManager.cleanupExpiredHost true
>>>>>> spark.scheduler.mode FIFO
>>>>>> spark.shuffle.service.enabled true
>>>>>> spark.sql.catalogImplementation hive
>>>>>> spark.sql.hive.metastore.sharedPrefixes com.amazonaws.services.dynamodbv2
>>>>>> spark.sql.warehouse.dir *********(redacted)
>>>>>> spark.stage.attempt.ignoreOnDecommissionFetchFailure true
>>>>>> spark.submit.deployMode cluster
>>>>>> spark.ui.filters org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
>>>>>> spark.ui.port 0
>>>>>> spark.useHiveContext true
>>>>>> spark.yarn.app.container.log.dir /var/log/hadoop-yarn/containers/application_1528204441221_0010/container_1528204441221_0010_01_000001
>>>>>> spark.yarn.app.id application_1528204441221_0010
>>>>>> spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS $(hostname -f)
>>>>>> spark.yarn.dist.archives file:/usr/lib/spark/R/lib/sparkr.zip#sparkr
>>>>>> spark.yarn.dist.files file:///home/hadoop/zeppelin-0.8.1-SNAPSHOT/conf/log4j_yarn_cluster.properties
>>>>>> spark.yarn.historyServer.address ip-10-126-82-0.us-east-1.aws.*****:18080
>>>>>> spark.yarn.isPython true
>>>>>> zeppelin.R.cmd R
>>>>>> zeppelin.R.image.width 100%
>>>>>> zeppelin.R.knitr true
>>>>>> zeppelin.R.render.options out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F, fig.retina = 2
>>>>>> zeppelin.dep.additionalRemoteRepository spark-packages,http://dl.bintray.com/spark-packages/maven,false;
>>>>>> zeppelin.dep.localrepo local-repo
>>>>>> zeppelin.interpreter.localRepo /home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark
>>>>>> zeppelin.interpreter.max.poolsize 10
>>>>>> zeppelin.interpreter.output.limit 102400
>>>>>> zeppelin.pyspark.python python
>>>>>> zeppelin.pyspark.useIPython true
>>>>>> zeppelin.spark.concurrentSQL false
>>>>>> zeppelin.spark.enableSupportedVersionCheck true
>>>>>> zeppelin.spark.importImplicit true
>>>>>> zeppelin.spark.maxResult 1000
>>>>>> zeppelin.spark.printREPLOutput true
>>>>>> zeppelin.spark.sql.interpolation false
>>>>>> zeppelin.spark.sql.stacktrace false
>>>>>> zeppelin.spark.useHiveContext true
>>>>>> zeppelin.spark.useNew true
>>>>>>
>>>>>>
>>>>>> Am Sa., 9. Juni 2018 um 23:34 Uhr schrieb Thomas Bünger <
>>>>>> thom.bueng@googlemail.com>:
>>>>>>
>>>>>>> Hey Jeff,
>>>>>>> I just tried branch-0.8.
>>>>>>> Still the same error: No ZeppelinContext "z" available when using
>>>>>>> "yarn-cluster". (See attached screenshot)
>>>>>>> With "yarn-client" it works.
>>>>>>>
>>>>>>> Besides setting JAVA_HOME and HADOOP_CONF_DIR inside
>>>>>>> zeppelin-env.sh, no further adjustments were applied to the Zeppelin
>>>>>>> installation. (Also thanks to the new %spark.conf ;-) )
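
For readers who have not used %spark.conf yet: it is a paragraph that takes
one property per line and must run before the Spark interpreter starts.
Judging by the SPARK_HOME and master entries in the environment dump above, a
sketch of what was likely used:

  %spark.conf
  SPARK_HOME /usr/lib/spark
  master yarn-cluster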
>>>>>>>
>>>>>>> Best regards,
>>>>>>>  Thomas
>>>>>>>
>>>>>>> [image: Screen Shot 2018-06-09 at 23.27.49.png]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Am Fr., 8. Juni 2018 um 03:05 Uhr schrieb Jeff Zhang <
>>>>>>> zjffdu@gmail.com>:
>>>>>>>
>>>>>>>>
>>>>>>>> Hi Thomas,
>>>>>>>>
>>>>>>>> I tried the latest branch-0.8 and it works for me. Could you try
>>>>>>>> again to verify it?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月7日周四 下午8:34写道:
>>>>>>>>
>>>>>>>>> I specifically mean visualisation via ZeppelinContext inside a
>>>>>>>>> Spark interpreter. (e.g. "z.show(...)")
>>>>>>>>> The visualisation of SparkSQL results inside a SparkSQLInterpreter
>>>>>>>>> works fine, also in yarn-cluster mode.
>>>>>>>>>
>>>>>>>>> Am Do., 7. Juni 2018 um 14:30 Uhr schrieb Thomas Bünger <
>>>>>>>>> thom.bueng@googlemail.com>:
>>>>>>>>>
>>>>>>>>>> Hey Jeff,
>>>>>>>>>>
>>>>>>>>>> I tried your changes and now it works nicely. Thank you very much!
>>>>>>>>>>
>>>>>>>>>> But I still can't use any of the forms and visualizations in
>>>>>>>>>> yarn-cluster?
>>>>>>>>>> I was hoping that this got resolved with the new SparkInterpreter
>>>>>>>>>> so that I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm
>>>>>>>>>> still getting errors like
>>>>>>>>>> "error: not found: value z"
>>>>>>>>>>
>>>>>>>>>> Was this not in scope of that change? Is this a bug? Or is it a
>>>>>>>>>> known limitation, also not supported in 0.8?
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>  Thomas
>>>>>>>>>>
>>>>>>>>>> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <
>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I can confirm that this is a bug, and created
>>>>>>>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>>>>>>>>>>
>>>>>>>>>>> Will fix it soon
>>>>>>>>>>>
>>>>>>>>>>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> hmm, it looks like a bug. I will check it tomorrow.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>>>>>>>>>>
>>>>>>>>>>>>> $ ls /usr/lib/spark/python/lib
>>>>>>>>>>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>>>>>>>>>>
>>>>>>>>>>>>> So the folder exists and contains both necessary zips. Please
>>>>>>>>>>>>> note that in local or yarn-client mode the files are properly picked up
>>>>>>>>>>>>> from that very same location.
>>>>>>>>>>>>>
>>>>>>>>>>>>> How does yarn-cluster work under the hood? Could it be that
>>>>>>>>>>>>> environment variables (like SPARK_HOME) are lost, because they are only
>>>>>>>>>>>>> available in my local shell + zeppelin daemon process? Do I need to tell
>>>>>>>>>>>>> YARN somehow about SPARK_HOME?
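
On the under-the-hood question: in yarn-cluster mode the driver (and the
remote interpreter with it) runs inside a YARN container on a worker node, and
YARN containers do not inherit the shell environment of the submitting
machine. Spark does offer a mechanism for pushing environment variables to the
application master; a sketch:

  # in spark-defaults.conf or the interpreter properties -- a sketch
  spark.yarn.appMasterEnv.SPARK_HOME /usr/lib/spark
  # executors get theirs via spark.executorEnv.*, cf. the
  # spark.executorEnv.PYTHONPATH entry in the dump above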
>>>>>>>>>>>>>
>>>>>>>>>>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <
>>>>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you check whether the folder /usr/lib/spark/python/lib
>>>>>>>>>>>>>> exists?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二
>>>>>>>>>>>>>> 下午8:45写道:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> sys.env
>>>>>>>>>>>>>>> java.lang.NullPointerException
>>>>>>>>>>>>>>>   at org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>>>>>   at org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>>>>>>>>>>>   at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>>>>>>>>>>>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>>>>>>>>>>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>>>>>>>>>>>>   at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
>>>>>>>>>>>>>>>   at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>>>>>>>>>>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>>>>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>>>>>>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>>>>>>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>>>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>>>>>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>>>>>>>>>   at java.lang.Thread.run(Thread.java:748)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Could you paste the full stacktrace?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二
>>>>>>>>>>>>>>>> 下午8:21写道:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the
>>>>>>>>>>>>>>>>> preinstalled version of spark under /usr/lib/spark.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This works fine in local or yarn-client mode, but in
>>>>>>>>>>>>>>>>> yarn-cluster mode I just get a
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Seems to be caused by an unsuccessful search for the py4j
>>>>>>>>>>>>>>>>> libraries.
>>>>>>>>>>>>>>>>> I've made sure that SPARK_HOME is actually set in
>>>>>>>>>>>>>>>>> .bashrc, in zeppelin-env.sh and via the new %spark.conf, but somehow in
>>>>>>>>>>>>>>>>> the remote interpreter, something odd is going on.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>  Thomas
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>
>>>
>>> --
>>> 이종열, Jongyoul Lee, 李宗烈
>>> http://madeng.net
>>>
>>


-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jeff Zhang <zj...@gmail.com>.
Hi Jongyoul,

I find HiveContext works for me in yarn cluster mode, do you put
hive-site.xml under SPARK_CONF_DIR ?



Jeff Zhang <zj...@gmail.com>于2018年6月11日周一 上午10:07写道:

>
> Thanks Jongyoul, I will fix it before the next RC
>
> Jongyoul Lee <jo...@gmail.com>于2018年6月11日周一 上午9:54写道:
>
>> BTW, It's a bit different but I found HiveContext cannot be made in
>> yarn-cluster mode. I'll test it more but we might fix it
>>
>> On Sun, Jun 10, 2018 at 11:31 AM, Jeff Zhang <zj...@gmail.com> wrote:
>>
>>> BTW, zeppelin don't require to be installed in all the nodes of cluster.
>>> Install it in one node is sufficient.
>>>
>>>
>>> Jeff Zhang <zj...@gmail.com>于2018年6月10日周日 上午10:22写道:
>>>
>>>>
>>>> hmm, maybe it is due the the --driver-class-path in interpreter.sh.  I
>>>> will create ticket to remote this for yarn cluster mode
>>>>
>>>>
>>>> Thomas Bünger <th...@googlemail.com>于2018年6月10日周日 上午6:04写道:
>>>>
>>>>> I just tried to copy the zeppelin installation to the exact same
>>>>> location on each YARN-Node and then everything works fine!
>>>>> So it seems to be some missing jar file from being sent to the spark
>>>>> driver node. Or a wrong classpath.
>>>>>
>>>>> Maybe the following dump from the Spark UI might help somehow?
>>>>>
>>>>> *Environment*
>>>>> *Runtime Information*
>>>>> Name Value
>>>>> Java Version 1.8.0_171 (Oracle Corporation)
>>>>> Java Home
>>>>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-7.b10.37.amzn1.x86_64/jre
>>>>> Scala Version version 2.11.8
>>>>> *Spark Properties*
>>>>> Name Value
>>>>> SPARK_HOME /usr/lib/spark
>>>>> master yarn-cluster
>>>>> spark.app.id application_1528204441221_0010
>>>>> spark.app.name Zeppelin
>>>>> spark.blacklist.decommissioning.enabled true
>>>>> spark.blacklist.decommissioning.timeout 1h
>>>>> spark.decommissioning.timeout.threshold 20
>>>>> spark.driver.extraClassPath
>>>>> :/home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/lib/interpreter/*::/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/spark-interpreter-0.8.1-SNAPSHOT.jar:/etc/hadoop/conf/
>>>>> spark.driver.extraJavaOptions -Dfile.encoding=UTF-8
>>>>> -Dlog4j.configuration=log4j_yarn_cluster.properties
>>>>> -Dzeppelin.log.file=/home/hadoop/zeppelin-0.8.1-SNAPSHOT/logs/zeppelin-interpreter-spark-hadoop-ip-10-126-82-0.log
>>>>> spark.driver.extraLibraryPath
>>>>> /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
>>>>> spark.driver.host ip-10-126-87-125.us-east-1.aws.*****
>>>>> spark.driver.port 38237
>>>>> spark.dynamicAllocation.enabled true
>>>>> spark.eventLog.dir hdfs:///var/log/spark/apps
>>>>> spark.eventLog.enabled true
>>>>> spark.executor.cores 4
>>>>> spark.executor.extraClassPath
>>>>> /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar
>>>>> spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails
>>>>> -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC
>>>>> -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70
>>>>> -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
>>>>> spark.executor.extraLibraryPath
>>>>> /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
>>>>> spark.executor.id driver
>>>>> spark.executorEnv.PYTHONPATH
>>>>> /usr/lib/spark/python/lib/py4j-0.10.6-src.zip:/usr/lib/spark/python/:<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.10.6-src.zip
>>>>> spark.files.fetchFailure.unRegisterOutputOnHost true
>>>>> spark.hadoop.yarn.timeline-service.enabled false
>>>>> spark.history.fs.logDirectory hdfs:///var/log/spark/apps
>>>>> spark.history.ui.port 18080
>>>>> spark.jars
>>>>> spark.master yarn
>>>>>
>>>>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS
>>>>> ip-10-126-82-0.us-east-1.aws.*****
>>>>>
>>>>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES
>>>>> http://ip-10-126-82-0.us-east-1.aws.
>>>>> *****:20888/proxy/application_1528204441221_0010
>>>>> spark.repl.class.outputDir *********(redacted)
>>>>> spark.repl.class.uri
>>>>> spark://ip-10-126-87-125.us-east-1.aws.*****:38237/classes
>>>>> spark.resourceManager.cleanupExpiredHost true
>>>>> spark.scheduler.mode FIFO
>>>>> spark.shuffle.service.enabled true
>>>>> spark.sql.catalogImplementation hive
>>>>> spark.sql.hive.metastore.sharedPrefixes
>>>>> com.amazonaws.services.dynamodbv2
>>>>> spark.sql.warehouse.dir *********(redacted)
>>>>> spark.stage.attempt.ignoreOnDecommissionFetchFailure true
>>>>> spark.submit.deployMode cluster
>>>>> spark.ui.filters
>>>>> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
>>>>> spark.ui.port 0
>>>>> spark.useHiveContext true
>>>>> spark.yarn.app.container.log.dir
>>>>> /var/log/hadoop-yarn/containers/application_1528204441221_0010/container_1528204441221_0010_01_000001
>>>>> spark.yarn.app.id application_1528204441221_0010
>>>>> spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS $(hostname -f)
>>>>> spark.yarn.dist.archives file:/usr/lib/spark/R/lib/sparkr.zip#sparkr
>>>>> spark.yarn.dist.files
>>>>> file:///home/hadoop/zeppelin-0.8.1-SNAPSHOT/conf/log4j_yarn_cluster.properties
>>>>> spark.yarn.historyServer.address
>>>>> ip-10-126-82-0.us-east-1.aws.*****:18080
>>>>> spark.yarn.isPython true
>>>>> zeppelin.R.cmd R
>>>>> zeppelin.R.image.width 100%
>>>>> zeppelin.R.knitr true
>>>>> zeppelin.R.render.options out.format = 'html', comment = NA, echo =
>>>>> FALSE, results = 'asis', message = F, warning = F, fig.retina = 2
>>>>> zeppelin.dep.additionalRemoteRepository spark-packages,
>>>>> http://dl.bintray.com/spark-packages/maven,false;
>>>>> zeppelin.dep.localrepo local-repo
>>>>> zeppelin.interpreter.localRepo
>>>>> /home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark
>>>>> zeppelin.interpreter.max.poolsize 10
>>>>> zeppelin.interpreter.output.limit 102400
>>>>> zeppelin.pyspark.python python
>>>>> zeppelin.pyspark.useIPython true
>>>>> zeppelin.spark.concurrentSQL false
>>>>> zeppelin.spark.enableSupportedVersionCheck true
>>>>> zeppelin.spark.importImplicit true
>>>>> zeppelin.spark.maxResult 1000
>>>>> zeppelin.spark.printREPLOutput true
>>>>> zeppelin.spark.sql.interpolation false
>>>>> zeppelin.spark.sql.stacktrace false
>>>>> zeppelin.spark.useHiveContext true
>>>>> zeppelin.spark.useNew true
>>>>>
>>>>>
>>>>> Am Sa., 9. Juni 2018 um 23:34 Uhr schrieb Thomas Bünger <
>>>>> thom.bueng@googlemail.com>:
>>>>>
>>>>>> Hey Jeff,
>>>>>> I just tried branch-0.8.
>>>>>> Still the same error: No ZeppelinContext "z" available when using
>>>>>> "yarn-cluster". (See attached screenshot)
>>>>>> With "yarn-client" it works.
>>>>>>
>>>>>> Besides setting JAVA_HOME and HADOOP_CONF_DIR inside zeppelin-env.sh,
>>>>>> no further adjustment where applied to the zeppelin installation. (Also
>>>>>> thanks to the new %spark.conf ;-) )
>>>>>>
>>>>>> Best regards,
>>>>>>  Thomas
>>>>>>
>>>>>> [image: Screen Shot 2018-06-09 at 23.27.49.png]
>>>>>>
>>>>>>
>>>>>>
>>>>>> Am Fr., 8. Juni 2018 um 03:05 Uhr schrieb Jeff Zhang <
>>>>>> zjffdu@gmail.com>:
>>>>>>
>>>>>>>
>>>>>>> Hi Thomas,
>>>>>>>
>>>>>>> I try to the latest branch-0.8, it works for me. Could you try again
>>>>>>> to verify it ?
>>>>>>>
>>>>>>>
>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月7日周四 下午8:34写道:
>>>>>>>
>>>>>>>> I specifically mean visualisation via ZeppelinContext inside a
>>>>>>>> Spark interpreter. (e.g. "z.show(...)")
>>>>>>>> The visualisation of SparkSQL results inside a SparkSQLInterpreter
>>>>>>>> work fine, also in yarn-cluster mode.
>>>>>>>>
>>>>>>>> Am Do., 7. Juni 2018 um 14:30 Uhr schrieb Thomas Bünger <
>>>>>>>> thom.bueng@googlemail.com>:
>>>>>>>>
>>>>>>>>> Hey Jeff,
>>>>>>>>>
>>>>>>>>> I tried your changes and now it works nicely. Thank you very much!
>>>>>>>>>
>>>>>>>>> But I still can't use any of the forms and visualizations in
>>>>>>>>> yarn-cluster?
>>>>>>>>> I was hoping that this got resolved with the new SparkInterpreter
>>>>>>>>> so that I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm
>>>>>>>>> still getting errors like
>>>>>>>>> "error: not found: value z"
>>>>>>>>>
>>>>>>>>> Was this not in scope of that change? Is this a bug? Or is it
>>>>>>>>> known limitation and also not supported in 0.8?
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>  Thomas
>>>>>>>>>
>>>>>>>>> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <
>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I can confirm that this is a bug, and created
>>>>>>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>>>>>>>>>
>>>>>>>>>> Will fix it soon
>>>>>>>>>>
>>>>>>>>>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> hmm, it looks like a bug. I will check it tomorrow.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>>>>>>>>>
>>>>>>>>>>>> $ ls /usr/lib/spark/python/lib
>>>>>>>>>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>>>>>>>>>
>>>>>>>>>>>> So folder exists and contains both necessary zips. Please note,
>>>>>>>>>>>> that in local or yarn-client mode the files are properly picked up from
>>>>>>>>>>>> that very same location.
>>>>>>>>>>>>
>>>>>>>>>>>> How does yarn-cluster work under the hood? Could it be that
>>>>>>>>>>>> environment variables (like SPARK_HOME) are lost, because they are only
>>>>>>>>>>>> available in my local shell + zeppelin daemon process? Do I need to tell
>>>>>>>>>>>> YARN somehow about SPARK_HOME?
>>>>>>>>>>>>
>>>>>>>>>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <
>>>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could you check whether there's folder /usr/lib/spark/python/lib
>>>>>>>>>>>>> ?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二
>>>>>>>>>>>>> 下午8:45写道:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> sys.env
>>>>>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>>>> at
>>>>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>>>>>>>>>> at
>>>>>>>>>>>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>>>>>>>>>> at
>>>>>>>>>>>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>>>>>>>>> at
>>>>>>>>>>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>>>>>>>>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>>>>>>>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>>>>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>>>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>>>>>> at
>>>>>>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>>>>>> at
>>>>>>>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>>>>>>>> at
>>>>>>>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>>>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Could you paste the full stracktrace ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二
>>>>>>>>>>>>>>> 下午8:21写道:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the
>>>>>>>>>>>>>>>> preinstalled version of spark under /usr/lib/spark.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This works fine in local or yarn-client mode, but in
>>>>>>>>>>>>>>>> yarn-cluster mode i just get a
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Seems to be caused by an unsuccessful search for the py4j
>>>>>>>>>>>>>>>> libraries.
>>>>>>>>>>>>>>>> I've made sure that SPARK_HOME is actually set in .bash_rc,
>>>>>>>>>>>>>>>> in zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>>>>>>>>>>> interpreter, something odd is going on.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>  Thomas
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>
>>
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>>
>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jeff Zhang <zj...@gmail.com>.
Thanks Jongyoul, I will fix it before the next RC

Jongyoul Lee <jo...@gmail.com>于2018年6月11日周一 上午9:54写道:

> BTW, It's a bit different but I found HiveContext cannot be made in
> yarn-cluster mode. I'll test it more but we might fix it
>
> On Sun, Jun 10, 2018 at 11:31 AM, Jeff Zhang <zj...@gmail.com> wrote:
>
>> BTW, zeppelin don't require to be installed in all the nodes of cluster.
>> Install it in one node is sufficient.
>>
>>
>> Jeff Zhang <zj...@gmail.com>于2018年6月10日周日 上午10:22写道:
>>
>>>
>>> hmm, maybe it is due the the --driver-class-path in interpreter.sh.  I
>>> will create ticket to remote this for yarn cluster mode
>>>
>>>
>>> Thomas Bünger <th...@googlemail.com>于2018年6月10日周日 上午6:04写道:
>>>
>>>> I just tried to copy the zeppelin installation to the exact same
>>>> location on each YARN-Node and then everything works fine!
>>>> So it seems to be some missing jar file from being sent to the spark
>>>> driver node. Or a wrong classpath.
>>>>
>>>> Maybe the following dump from the Spark UI might help somehow?
>>>>
>>>> *Environment*
>>>> *Runtime Information*
>>>> Name Value
>>>> Java Version 1.8.0_171 (Oracle Corporation)
>>>> Java Home
>>>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-7.b10.37.amzn1.x86_64/jre
>>>> Scala Version version 2.11.8
>>>> *Spark Properties*
>>>> Name Value
>>>> SPARK_HOME /usr/lib/spark
>>>> master yarn-cluster
>>>> spark.app.id application_1528204441221_0010
>>>> spark.app.name Zeppelin
>>>> spark.blacklist.decommissioning.enabled true
>>>> spark.blacklist.decommissioning.timeout 1h
>>>> spark.decommissioning.timeout.threshold 20
>>>> spark.driver.extraClassPath
>>>> :/home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/lib/interpreter/*::/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/spark-interpreter-0.8.1-SNAPSHOT.jar:/etc/hadoop/conf/
>>>> spark.driver.extraJavaOptions -Dfile.encoding=UTF-8
>>>> -Dlog4j.configuration=log4j_yarn_cluster.properties
>>>> -Dzeppelin.log.file=/home/hadoop/zeppelin-0.8.1-SNAPSHOT/logs/zeppelin-interpreter-spark-hadoop-ip-10-126-82-0.log
>>>> spark.driver.extraLibraryPath
>>>> /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
>>>> spark.driver.host ip-10-126-87-125.us-east-1.aws.*****
>>>> spark.driver.port 38237
>>>> spark.dynamicAllocation.enabled true
>>>> spark.eventLog.dir hdfs:///var/log/spark/apps
>>>> spark.eventLog.enabled true
>>>> spark.executor.cores 4
>>>> spark.executor.extraClassPath
>>>> /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar
>>>> spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails
>>>> -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC
>>>> -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70
>>>> -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
>>>> spark.executor.extraLibraryPath
>>>> /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
>>>> spark.executor.id driver
>>>> spark.executorEnv.PYTHONPATH
>>>> /usr/lib/spark/python/lib/py4j-0.10.6-src.zip:/usr/lib/spark/python/:<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.10.6-src.zip
>>>> spark.files.fetchFailure.unRegisterOutputOnHost true
>>>> spark.hadoop.yarn.timeline-service.enabled false
>>>> spark.history.fs.logDirectory hdfs:///var/log/spark/apps
>>>> spark.history.ui.port 18080
>>>> spark.jars
>>>> spark.master yarn
>>>>
>>>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS
>>>> ip-10-126-82-0.us-east-1.aws.*****
>>>>
>>>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES
>>>> http://ip-10-126-82-0.us-east-1.aws.
>>>> *****:20888/proxy/application_1528204441221_0010
>>>> spark.repl.class.outputDir *********(redacted)
>>>> spark.repl.class.uri
>>>> spark://ip-10-126-87-125.us-east-1.aws.*****:38237/classes
>>>> spark.resourceManager.cleanupExpiredHost true
>>>> spark.scheduler.mode FIFO
>>>> spark.shuffle.service.enabled true
>>>> spark.sql.catalogImplementation hive
>>>> spark.sql.hive.metastore.sharedPrefixes
>>>> com.amazonaws.services.dynamodbv2
>>>> spark.sql.warehouse.dir *********(redacted)
>>>> spark.stage.attempt.ignoreOnDecommissionFetchFailure true
>>>> spark.submit.deployMode cluster
>>>> spark.ui.filters
>>>> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
>>>> spark.ui.port 0
>>>> spark.useHiveContext true
>>>> spark.yarn.app.container.log.dir
>>>> /var/log/hadoop-yarn/containers/application_1528204441221_0010/container_1528204441221_0010_01_000001
>>>> spark.yarn.app.id application_1528204441221_0010
>>>> spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS $(hostname -f)
>>>> spark.yarn.dist.archives file:/usr/lib/spark/R/lib/sparkr.zip#sparkr
>>>> spark.yarn.dist.files
>>>> file:///home/hadoop/zeppelin-0.8.1-SNAPSHOT/conf/log4j_yarn_cluster.properties
>>>> spark.yarn.historyServer.address
>>>> ip-10-126-82-0.us-east-1.aws.*****:18080
>>>> spark.yarn.isPython true
>>>> zeppelin.R.cmd R
>>>> zeppelin.R.image.width 100%
>>>> zeppelin.R.knitr true
>>>> zeppelin.R.render.options out.format = 'html', comment = NA, echo =
>>>> FALSE, results = 'asis', message = F, warning = F, fig.retina = 2
>>>> zeppelin.dep.additionalRemoteRepository spark-packages,
>>>> http://dl.bintray.com/spark-packages/maven,false;
>>>> zeppelin.dep.localrepo local-repo
>>>> zeppelin.interpreter.localRepo
>>>> /home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark
>>>> zeppelin.interpreter.max.poolsize 10
>>>> zeppelin.interpreter.output.limit 102400
>>>> zeppelin.pyspark.python python
>>>> zeppelin.pyspark.useIPython true
>>>> zeppelin.spark.concurrentSQL false
>>>> zeppelin.spark.enableSupportedVersionCheck true
>>>> zeppelin.spark.importImplicit true
>>>> zeppelin.spark.maxResult 1000
>>>> zeppelin.spark.printREPLOutput true
>>>> zeppelin.spark.sql.interpolation false
>>>> zeppelin.spark.sql.stacktrace false
>>>> zeppelin.spark.useHiveContext true
>>>> zeppelin.spark.useNew true
>>>>
>>>>
>>>> Am Sa., 9. Juni 2018 um 23:34 Uhr schrieb Thomas Bünger <
>>>> thom.bueng@googlemail.com>:
>>>>
>>>>> Hey Jeff,
>>>>> I just tried branch-0.8.
>>>>> Still the same error: No ZeppelinContext "z" available when using
>>>>> "yarn-cluster". (See attached screenshot)
>>>>> With "yarn-client" it works.
>>>>>
>>>>> Besides setting JAVA_HOME and HADOOP_CONF_DIR inside zeppelin-env.sh,
>>>>> no further adjustment where applied to the zeppelin installation. (Also
>>>>> thanks to the new %spark.conf ;-) )
>>>>>
>>>>> Best regards,
>>>>>  Thomas
>>>>>
>>>>> [image: Screen Shot 2018-06-09 at 23.27.49.png]
>>>>>
>>>>>
>>>>>
>>>>> Am Fr., 8. Juni 2018 um 03:05 Uhr schrieb Jeff Zhang <zjffdu@gmail.com
>>>>> >:
>>>>>
>>>>>>
>>>>>> Hi Thomas,
>>>>>>
>>>>>> I try to the latest branch-0.8, it works for me. Could you try again
>>>>>> to verify it ?
>>>>>>
>>>>>>
>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月7日周四 下午8:34写道:
>>>>>>
>>>>>>> I specifically mean visualisation via ZeppelinContext inside a Spark
>>>>>>> interpreter. (e.g. "z.show(...)")
>>>>>>> The visualisation of SparkSQL results inside a SparkSQLInterpreter
>>>>>>> work fine, also in yarn-cluster mode.
>>>>>>>
>>>>>>> Am Do., 7. Juni 2018 um 14:30 Uhr schrieb Thomas Bünger <
>>>>>>> thom.bueng@googlemail.com>:
>>>>>>>
>>>>>>>> Hey Jeff,
>>>>>>>>
>>>>>>>> I tried your changes and now it works nicely. Thank you very much!
>>>>>>>>
>>>>>>>> But I still can't use any of the forms and visualizations in
>>>>>>>> yarn-cluster?
>>>>>>>> I was hoping that this got resolved with the new SparkInterpreter
>>>>>>>> so that I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm
>>>>>>>> still getting errors like
>>>>>>>> "error: not found: value z"
>>>>>>>>
>>>>>>>> Was this not in scope of that change? Is this a bug? Or is it known
>>>>>>>> limitation and also not supported in 0.8?
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>  Thomas
>>>>>>>>
>>>>>>>> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <
>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I can confirm that this is a bug, and created
>>>>>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>>>>>>>>
>>>>>>>>> Will fix it soon
>>>>>>>>>
>>>>>>>>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> hmm, it looks like a bug. I will check it tomorrow.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>>>>>>>>
>>>>>>>>>>> $ ls /usr/lib/spark/python/lib
>>>>>>>>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>>>>>>>>
>>>>>>>>>>> So folder exists and contains both necessary zips. Please note,
>>>>>>>>>>> that in local or yarn-client mode the files are properly picked up from
>>>>>>>>>>> that very same location.
>>>>>>>>>>>
>>>>>>>>>>> How does yarn-cluster work under the hood? Could it be that
>>>>>>>>>>> environment variables (like SPARK_HOME) are lost, because they are only
>>>>>>>>>>> available in my local shell + zeppelin daemon process? Do I need to tell
>>>>>>>>>>> YARN somehow about SPARK_HOME?
>>>>>>>>>>>
>>>>>>>>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <
>>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Could you check whether there's folder /usr/lib/spark/python/lib
>>>>>>>>>>>> ?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> sys.env
>>>>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>>>>>>>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>>>>>>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>>>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>>>>>>> at
>>>>>>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you paste the full stracktrace ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二
>>>>>>>>>>>>>> 下午8:21写道:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the
>>>>>>>>>>>>>>> preinstalled version of spark under /usr/lib/spark.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This works fine in local or yarn-client mode, but in
>>>>>>>>>>>>>>> yarn-cluster mode i just get a
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Seems to be caused by an unsuccessful search for the py4j
>>>>>>>>>>>>>>> libraries.
>>>>>>>>>>>>>>> I've made sure that SPARK_HOME is actually set in .bash_rc,
>>>>>>>>>>>>>>> in zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>>>>>>>>>> interpreter, something odd is going on.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>  Thomas
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net
>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jongyoul Lee <jo...@gmail.com>.
BTW, It's a bit different but I found HiveContext cannot be made in
yarn-cluster mode. I'll test it more but we might fix it

On Sun, Jun 10, 2018 at 11:31 AM, Jeff Zhang <zj...@gmail.com> wrote:

> BTW, zeppelin don't require to be installed in all the nodes of cluster.
> Install it in one node is sufficient.
>
>
> Jeff Zhang <zj...@gmail.com>于2018年6月10日周日 上午10:22写道:
>
>>
>> hmm, maybe it is due the the --driver-class-path in interpreter.sh.  I
>> will create ticket to remote this for yarn cluster mode
>>
>>
>> Thomas Bünger <th...@googlemail.com>于2018年6月10日周日 上午6:04写道:
>>
>>> I just tried to copy the zeppelin installation to the exact same
>>> location on each YARN-Node and then everything works fine!
>>> So it seems to be some missing jar file from being sent to the spark
>>> driver node. Or a wrong classpath.
>>>
>>> Maybe the following dump from the Spark UI might help somehow?
>>>
>>> *Environment*
>>> *Runtime Information*
>>> Name Value
>>> Java Version 1.8.0_171 (Oracle Corporation)
>>> Java Home /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-7.b10.37.
>>> amzn1.x86_64/jre
>>> Scala Version version 2.11.8
>>> *Spark Properties*
>>> Name Value
>>> SPARK_HOME /usr/lib/spark
>>> master yarn-cluster
>>> spark.app.id application_1528204441221_0010
>>> spark.app.name Zeppelin
>>> spark.blacklist.decommissioning.enabled true
>>> spark.blacklist.decommissioning.timeout 1h
>>> spark.decommissioning.timeout.threshold 20
>>> spark.driver.extraClassPath :/home/hadoop/zeppelin-0.8.1-
>>> SNAPSHOT/local-repo/spark/*:/home/hadoop/zeppelin-0.8.1-
>>> SNAPSHOT/interpreter/spark/*:/home/hadoop/zeppelin-0.8.1-
>>> SNAPSHOT/lib/interpreter/*::/home/hadoop/zeppelin-0.8.1-
>>> SNAPSHOT/interpreter/spark/spark-interpreter-0.8.1-
>>> SNAPSHOT.jar:/etc/hadoop/conf/
>>> spark.driver.extraJavaOptions -Dfile.encoding=UTF-8
>>> -Dlog4j.configuration=log4j_yarn_cluster.properties
>>> -Dzeppelin.log.file=/home/hadoop/zeppelin-0.8.1-SNAPSHOT/logs/zeppelin-
>>> interpreter-spark-hadoop-ip-10-126-82-0.log
>>> spark.driver.extraLibraryPath /usr/lib/hadoop/lib/native:/
>>> usr/lib/hadoop-lzo/lib/native
>>> spark.driver.host ip-10-126-87-125.us-east-1.aws.*****
>>> spark.driver.port 38237
>>> spark.dynamicAllocation.enabled true
>>> spark.eventLog.dir hdfs:///var/log/spark/apps
>>> spark.eventLog.enabled true
>>> spark.executor.cores 4
>>> spark.executor.extraClassPath /usr/lib/hadoop-lzo/lib/*:/
>>> usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:
>>> /usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*
>>> :/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/
>>> security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/
>>> aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/
>>> usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/
>>> usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar
>>> spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:
>>> CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70
>>> -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
>>> spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native:/
>>> usr/lib/hadoop-lzo/lib/native
>>> spark.executor.id driver
>>> spark.executorEnv.PYTHONPATH /usr/lib/spark/python/lib/
>>> py4j-0.10.6-src.zip:/usr/lib/spark/python/:<CPS>{{PWD}}/
>>> pyspark.zip<CPS>{{PWD}}/py4j-0.10.6-src.zip
>>> spark.files.fetchFailure.unRegisterOutputOnHost true
>>> spark.hadoop.yarn.timeline-service.enabled false
>>> spark.history.fs.logDirectory hdfs:///var/log/spark/apps
>>> spark.history.ui.port 18080
>>> spark.jars
>>> spark.master yarn
>>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.
>>> AmIpFilter.param.PROXY_HOSTS ip-10-126-82-0.us-east-1.aws.*****
>>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.
>>> AmIpFilter.param.PROXY_URI_BASES http://ip-10-126-82-0.us-east-1.aws.
>>> *****:20888/proxy/application_1528204441221_0010
>>> spark.repl.class.outputDir *********(redacted)
>>> spark.repl.class.uri spark://ip-10-126-87-125.us-
>>> east-1.aws.*****:38237/classes
>>> spark.resourceManager.cleanupExpiredHost true
>>> spark.scheduler.mode FIFO
>>> spark.shuffle.service.enabled true
>>> spark.sql.catalogImplementation hive
>>> spark.sql.hive.metastore.sharedPrefixes com.amazonaws.services.
>>> dynamodbv2
>>> spark.sql.warehouse.dir *********(redacted)
>>> spark.stage.attempt.ignoreOnDecommissionFetchFailure true
>>> spark.submit.deployMode cluster
>>> spark.ui.filters org.apache.hadoop.yarn.server.
>>> webproxy.amfilter.AmIpFilter
>>> spark.ui.port 0
>>> spark.useHiveContext true
>>> spark.yarn.app.container.log.dir /var/log/hadoop-yarn/
>>> containers/application_1528204441221_0010/container_
>>> 1528204441221_0010_01_000001
>>> spark.yarn.app.id application_1528204441221_0010
>>> spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS $(hostname -f)
>>> spark.yarn.dist.archives file:/usr/lib/spark/R/lib/sparkr.zip#sparkr
>>> spark.yarn.dist.files file:///home/hadoop/zeppelin-
>>> 0.8.1-SNAPSHOT/conf/log4j_yarn_cluster.properties
>>> spark.yarn.historyServer.address ip-10-126-82-0.us-east-1.aws.*
>>> ****:18080
>>> spark.yarn.isPython true
>>> zeppelin.R.cmd R
>>> zeppelin.R.image.width 100%
>>> zeppelin.R.knitr true
>>> zeppelin.R.render.options out.format = 'html', comment = NA, echo =
>>> FALSE, results = 'asis', message = F, warning = F, fig.retina = 2
>>> zeppelin.dep.additionalRemoteRepository spark-packages,http://dl.
>>> bintray.com/spark-packages/maven,false;
>>> zeppelin.dep.localrepo local-repo
>>> zeppelin.interpreter.localRepo /home/hadoop/zeppelin-0.8.1-
>>> SNAPSHOT/local-repo/spark
>>> zeppelin.interpreter.max.poolsize 10
>>> zeppelin.interpreter.output.limit 102400
>>> zeppelin.pyspark.python python
>>> zeppelin.pyspark.useIPython true
>>> zeppelin.spark.concurrentSQL false
>>> zeppelin.spark.enableSupportedVersionCheck true
>>> zeppelin.spark.importImplicit true
>>> zeppelin.spark.maxResult 1000
>>> zeppelin.spark.printREPLOutput true
>>> zeppelin.spark.sql.interpolation false
>>> zeppelin.spark.sql.stacktrace false
>>> zeppelin.spark.useHiveContext true
>>> zeppelin.spark.useNew true
>>>
>>>
>>> Am Sa., 9. Juni 2018 um 23:34 Uhr schrieb Thomas Bünger <
>>> thom.bueng@googlemail.com>:
>>>
>>>> Hey Jeff,
>>>> I just tried branch-0.8.
>>>> Still the same error: No ZeppelinContext "z" available when using
>>>> "yarn-cluster". (See attached screenshot)
>>>> With "yarn-client" it works.
>>>>
>>>> Besides setting JAVA_HOME and HADOOP_CONF_DIR inside zeppelin-env.sh,
>>>> no further adjustment where applied to the zeppelin installation. (Also
>>>> thanks to the new %spark.conf ;-) )
>>>>
>>>> Best regards,
>>>>  Thomas
>>>>
>>>> [image: Screen Shot 2018-06-09 at 23.27.49.png]
>>>>
>>>>
>>>>
>>>> Am Fr., 8. Juni 2018 um 03:05 Uhr schrieb Jeff Zhang <zjffdu@gmail.com
>>>> >:
>>>>
>>>>>
>>>>> Hi Thomas,
>>>>>
>>>>> I try to the latest branch-0.8, it works for me. Could you try again
>>>>> to verify it ?
>>>>>
>>>>>
>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月7日周四 下午8:34写道:
>>>>>
>>>>>> I specifically mean visualisation via ZeppelinContext inside a Spark
>>>>>> interpreter. (e.g. "z.show(...)")
>>>>>> The visualisation of SparkSQL results inside a SparkSQLInterpreter
>>>>>> work fine, also in yarn-cluster mode.
>>>>>>
>>>>>> Am Do., 7. Juni 2018 um 14:30 Uhr schrieb Thomas Bünger <
>>>>>> thom.bueng@googlemail.com>:
>>>>>>
>>>>>>> Hey Jeff,
>>>>>>>
>>>>>>> I tried your changes and now it works nicely. Thank you very much!
>>>>>>>
>>>>>>> But I still can't use any of the forms and visualizations in
>>>>>>> yarn-cluster?
>>>>>>> I was hoping that this got resolved with the new SparkInterpreter so
>>>>>>> that I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm
>>>>>>> still getting errors like
>>>>>>> "error: not found: value z"
>>>>>>>
>>>>>>> Was this not in scope of that change? Is this a bug? Or is it known
>>>>>>> limitation and also not supported in 0.8?
>>>>>>>
>>>>>>> Best regards,
>>>>>>>  Thomas
>>>>>>>
>>>>>>> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <
>>>>>>> zjffdu@gmail.com>:
>>>>>>>
>>>>>>>>
>>>>>>>> I can confirm that this is a bug, and created
>>>>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>>>>>>>
>>>>>>>> Will fix it soon
>>>>>>>>
>>>>>>>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> hmm, it looks like a bug. I will check it tomorrow.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>>>>>>>
>>>>>>>>>> $ ls /usr/lib/spark/python/lib
>>>>>>>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>>>>>>>
>>>>>>>>>> So folder exists and contains both necessary zips. Please note,
>>>>>>>>>> that in local or yarn-client mode the files are properly picked up from
>>>>>>>>>> that very same location.
>>>>>>>>>>
>>>>>>>>>> How does yarn-cluster work under the hood? Could it be that
>>>>>>>>>> environment variables (like SPARK_HOME) are lost, because they are only
>>>>>>>>>> available in my local shell + zeppelin daemon process? Do I need to tell
>>>>>>>>>> YARN somehow about SPARK_HOME?
>>>>>>>>>>
>>>>>>>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <
>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Could you check whether there's folder /usr/lib/spark/python/lib
>>>>>>>>>>> ?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> sys.env
>>>>>>>>>>>> java.lang.NullPointerException at org.apache.zeppelin.spark.
>>>>>>>>>>>> NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>> at org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>>>>>>>> at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>>>>>>>> at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>>>>>>> at org.apache.zeppelin.interpreter.remote.
>>>>>>>>>>>> RemoteInterpreterServer$InterpretJob.jobRun(
>>>>>>>>>>>> RemoteInterpreterServer.java:617) at
>>>>>>>>>>>> org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>>>>>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$
>>>>>>>>>>>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>>>> at java.util.concurrent.ScheduledThreadPoolExecutor$
>>>>>>>>>>>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> Could you paste the full stracktrace ?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二
>>>>>>>>>>>>> 下午8:21写道:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the
>>>>>>>>>>>>>> preinstalled version of spark under /usr/lib/spark.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This works fine in local or yarn-client mode, but in
>>>>>>>>>>>>>> yarn-cluster mode i just get a
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> java.lang.NullPointerException at org.apache.zeppelin.spark.
>>>>>>>>>>>>>> NewSparkInterpreter.setupConfForPySpark(
>>>>>>>>>>>>>> NewSparkInterpreter.java:149)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Seems to be caused by an unsuccessful search for the py4j
>>>>>>>>>>>>>> libraries.
>>>>>>>>>>>>>> I've made sure that SPARK_HOME is actually set in .bash_rc,
>>>>>>>>>>>>>> in zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>>>>>>>>> interpreter, something odd is going on.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>  Thomas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>


-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jeff Zhang <zj...@gmail.com>.
BTW, zeppelin don't require to be installed in all the nodes of cluster.
Install it in one node is sufficient.


Jeff Zhang <zj...@gmail.com>于2018年6月10日周日 上午10:22写道:

>
> hmm, maybe it is due the the --driver-class-path in interpreter.sh.  I
> will create ticket to remote this for yarn cluster mode
>
>
> Thomas Bünger <th...@googlemail.com>于2018年6月10日周日 上午6:04写道:
>
>> I just tried to copy the zeppelin installation to the exact same location
>> on each YARN-Node and then everything works fine!
>> So it seems to be some missing jar file from being sent to the spark
>> driver node. Or a wrong classpath.
>>
>> Maybe the following dump from the Spark UI might help somehow?
>>
>> *Environment*
>> *Runtime Information*
>> Name Value
>> Java Version 1.8.0_171 (Oracle Corporation)
>> Java Home
>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-7.b10.37.amzn1.x86_64/jre
>> Scala Version version 2.11.8
>> *Spark Properties*
>> Name Value
>> SPARK_HOME /usr/lib/spark
>> master yarn-cluster
>> spark.app.id application_1528204441221_0010
>> spark.app.name Zeppelin
>> spark.blacklist.decommissioning.enabled true
>> spark.blacklist.decommissioning.timeout 1h
>> spark.decommissioning.timeout.threshold 20
>> spark.driver.extraClassPath
>> :/home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/lib/interpreter/*::/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/spark-interpreter-0.8.1-SNAPSHOT.jar:/etc/hadoop/conf/
>> spark.driver.extraJavaOptions -Dfile.encoding=UTF-8
>> -Dlog4j.configuration=log4j_yarn_cluster.properties
>> -Dzeppelin.log.file=/home/hadoop/zeppelin-0.8.1-SNAPSHOT/logs/zeppelin-interpreter-spark-hadoop-ip-10-126-82-0.log
>> spark.driver.extraLibraryPath
>> /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
>> spark.driver.host ip-10-126-87-125.us-east-1.aws.*****
>> spark.driver.port 38237
>> spark.dynamicAllocation.enabled true
>> spark.eventLog.dir hdfs:///var/log/spark/apps
>> spark.eventLog.enabled true
>> spark.executor.cores 4
>> spark.executor.extraClassPath
>> /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar
>> spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails
>> -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC
>> -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70
>> -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
>> spark.executor.extraLibraryPath
>> /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
>> spark.executor.id driver
>> spark.executorEnv.PYTHONPATH
>> /usr/lib/spark/python/lib/py4j-0.10.6-src.zip:/usr/lib/spark/python/:<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.10.6-src.zip
>> spark.files.fetchFailure.unRegisterOutputOnHost true
>> spark.hadoop.yarn.timeline-service.enabled false
>> spark.history.fs.logDirectory hdfs:///var/log/spark/apps
>> spark.history.ui.port 18080
>> spark.jars
>> spark.master yarn
>>
>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS
>> ip-10-126-82-0.us-east-1.aws.*****
>>
>> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES
>> http://ip-10-126-82-0.us-east-1.aws.
>> *****:20888/proxy/application_1528204441221_0010
>> spark.repl.class.outputDir *********(redacted)
>> spark.repl.class.uri
>> spark://ip-10-126-87-125.us-east-1.aws.*****:38237/classes
>> spark.resourceManager.cleanupExpiredHost true
>> spark.scheduler.mode FIFO
>> spark.shuffle.service.enabled true
>> spark.sql.catalogImplementation hive
>> spark.sql.hive.metastore.sharedPrefixes com.amazonaws.services.dynamodbv2
>> spark.sql.warehouse.dir *********(redacted)
>> spark.stage.attempt.ignoreOnDecommissionFetchFailure true
>> spark.submit.deployMode cluster
>> spark.ui.filters
>> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
>> spark.ui.port 0
>> spark.useHiveContext true
>> spark.yarn.app.container.log.dir
>> /var/log/hadoop-yarn/containers/application_1528204441221_0010/container_1528204441221_0010_01_000001
>> spark.yarn.app.id application_1528204441221_0010
>> spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS $(hostname -f)
>> spark.yarn.dist.archives file:/usr/lib/spark/R/lib/sparkr.zip#sparkr
>> spark.yarn.dist.files
>> file:///home/hadoop/zeppelin-0.8.1-SNAPSHOT/conf/log4j_yarn_cluster.properties
>> spark.yarn.historyServer.address ip-10-126-82-0.us-east-1.aws.*****:18080
>> spark.yarn.isPython true
>> zeppelin.R.cmd R
>> zeppelin.R.image.width 100%
>> zeppelin.R.knitr true
>> zeppelin.R.render.options out.format = 'html', comment = NA, echo =
>> FALSE, results = 'asis', message = F, warning = F, fig.retina = 2
>> zeppelin.dep.additionalRemoteRepository spark-packages,
>> http://dl.bintray.com/spark-packages/maven,false;
>> zeppelin.dep.localrepo local-repo
>> zeppelin.interpreter.localRepo
>> /home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark
>> zeppelin.interpreter.max.poolsize 10
>> zeppelin.interpreter.output.limit 102400
>> zeppelin.pyspark.python python
>> zeppelin.pyspark.useIPython true
>> zeppelin.spark.concurrentSQL false
>> zeppelin.spark.enableSupportedVersionCheck true
>> zeppelin.spark.importImplicit true
>> zeppelin.spark.maxResult 1000
>> zeppelin.spark.printREPLOutput true
>> zeppelin.spark.sql.interpolation false
>> zeppelin.spark.sql.stacktrace false
>> zeppelin.spark.useHiveContext true
>> zeppelin.spark.useNew true
>>
>>
>> Am Sa., 9. Juni 2018 um 23:34 Uhr schrieb Thomas Bünger <
>> thom.bueng@googlemail.com>:
>>
>>> Hey Jeff,
>>> I just tried branch-0.8.
>>> Still the same error: No ZeppelinContext "z" available when using
>>> "yarn-cluster". (See attached screenshot)
>>> With "yarn-client" it works.
>>>
>>> Besides setting JAVA_HOME and HADOOP_CONF_DIR inside zeppelin-env.sh, no
>>> further adjustments were applied to the zeppelin installation. (Also thanks
>>> to the new %spark.conf ;-) )
>>>
>>> Best regards,
>>>  Thomas
>>>
>>> [image: Screen Shot 2018-06-09 at 23.27.49.png]
>>>
>>>
>>>
>>> Am Fr., 8. Juni 2018 um 03:05 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>>>
>>>>
>>>> Hi Thomas,
>>>>
>>>> I tried the latest branch-0.8, and it works for me. Could you try again to
>>>> verify it?
>>>>
>>>>
>>>> Thomas Bünger <th...@googlemail.com>于2018年6月7日周四 下午8:34写道:
>>>>
>>>>> I specifically mean visualisation via ZeppelinContext inside a Spark
>>>>> interpreter. (e.g. "z.show(...)")
>>>>> The visualisation of SparkSQL results inside a SparkSQLInterpreter
>>>>> works fine, also in yarn-cluster mode.
>>>>>
>>>>> Am Do., 7. Juni 2018 um 14:30 Uhr schrieb Thomas Bünger <
>>>>> thom.bueng@googlemail.com>:
>>>>>
>>>>>> Hey Jeff,
>>>>>>
>>>>>> I tried your changes and now it works nicely. Thank you very much!
>>>>>>
>>>>>> But I still can't use any of the forms and visualizations in
>>>>>> yarn-cluster?
>>>>>> I was hoping that this got resolved with the new SparkInterpreter so
>>>>>> that I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm
>>>>>> still getting errors like
>>>>>> "error: not found: value z"
>>>>>>
>>>>>> Was this not in scope of that change? Is this a bug? Or is it a known
>>>>>> limitation that is also not supported in 0.8?
>>>>>>
>>>>>> Best regards,
>>>>>>  Thomas
>>>>>>
>>>>>> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <
>>>>>> zjffdu@gmail.com>:
>>>>>>
>>>>>>>
>>>>>>> I can confirm that this is a bug, and created
>>>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>>>>>>
>>>>>>> Will fix it soon
>>>>>>>
>>>>>>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>>>>>>
>>>>>>>>
>>>>>>>> hmm, it looks like a bug. I will check it tomorrow.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>>>>>>
>>>>>>>>> $ ls /usr/lib/spark/python/lib
>>>>>>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>>>>>>
>>>>>>>>> So the folder exists and contains both necessary zips. Please note
>>>>>>>>> that in local or yarn-client mode the files are properly picked up from
>>>>>>>>> that very same location.
>>>>>>>>>
>>>>>>>>> How does yarn-cluster work under the hood? Could it be that
>>>>>>>>> environment variables (like SPARK_HOME) are lost, because they are only
>>>>>>>>> available in my local shell + zeppelin daemon process? Do I need to tell
>>>>>>>>> YARN somehow about SPARK_HOME?
>>>>>>>>>
>>>>>>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <
>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Could you check whether there's a folder /usr/lib/spark/python/lib?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> sys.env
>>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>> at
>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>>>>>>> at
>>>>>>>>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>>>>>>> at
>>>>>>>>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>>>>>> at
>>>>>>>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>>>>>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>>>>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>>> at
>>>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>>> at
>>>>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>>>>> at
>>>>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>> Could you paste the full stack trace?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>>>>>>>>>>
>>>>>>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the
>>>>>>>>>>>>> preinstalled version of spark under /usr/lib/spark.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This works fine in local or yarn-client mode, but in
>>>>>>>>>>>>> yarn-cluster mode I just get a
>>>>>>>>>>>>>
>>>>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Seems to be caused by an unsuccessful search for the py4j
>>>>>>>>>>>>> libraries.
>>>>>>>>>>>>> I've made sure that SPARK_HOME is actually set in .bashrc, in
>>>>>>>>>>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>>>>>>>> interpreter, something odd is going on.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>  Thomas
>>>>>>>>>>>>>
>>>>>>>>>>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jeff Zhang <zj...@gmail.com>.
hmm, maybe it is due to the --driver-class-path in interpreter.sh. I will
create a ticket to remove this for yarn-cluster mode.

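For illustration, a sketch of the suspected mechanism (the exact spark-submit
call that interpreter.sh builds is an assumption here; the flag value is taken
from the spark.driver.extraClassPath in the dump quoted below):

  spark-submit --master yarn --deploy-mode cluster \
    --driver-class-path "/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/*:..." \
    ...

In yarn-cluster mode the driver runs in a YARN container on an arbitrary node,
so a --driver-class-path built from local Zeppelin paths can only resolve on
nodes that also have Zeppelin installed.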

Thomas Bünger <th...@googlemail.com> wrote on Sun, Jun 10, 2018 at 6:04 AM:

> I just tried to copy the zeppelin installation to the exact same location
> on each YARN node, and now everything works fine!
> So it seems some jar file is not being sent to the Spark driver node,
> or the classpath is wrong.
>
> Maybe the following dump from the Spark UI might help somehow?
>
> *Environment*
> *Runtime Information*
> Name Value
> Java Version 1.8.0_171 (Oracle Corporation)
> Java Home
> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-7.b10.37.amzn1.x86_64/jre
> Scala Version version 2.11.8
> *Spark Properties*
> Name Value
> SPARK_HOME /usr/lib/spark
> master yarn-cluster
> spark.app.id application_1528204441221_0010
> spark.app.name Zeppelin
> spark.blacklist.decommissioning.enabled true
> spark.blacklist.decommissioning.timeout 1h
> spark.decommissioning.timeout.threshold 20
> spark.driver.extraClassPath
> :/home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/lib/interpreter/*::/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/spark-interpreter-0.8.1-SNAPSHOT.jar:/etc/hadoop/conf/
> spark.driver.extraJavaOptions -Dfile.encoding=UTF-8
> -Dlog4j.configuration=log4j_yarn_cluster.properties
> -Dzeppelin.log.file=/home/hadoop/zeppelin-0.8.1-SNAPSHOT/logs/zeppelin-interpreter-spark-hadoop-ip-10-126-82-0.log
> spark.driver.extraLibraryPath
> /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
> spark.driver.host ip-10-126-87-125.us-east-1.aws.*****
> spark.driver.port 38237
> spark.dynamicAllocation.enabled true
> spark.eventLog.dir hdfs:///var/log/spark/apps
> spark.eventLog.enabled true
> spark.executor.cores 4
> spark.executor.extraClassPath
> /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar
> spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails
> -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC
> -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70
> -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
> spark.executor.extraLibraryPath
> /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
> spark.executor.id driver
> spark.executorEnv.PYTHONPATH
> /usr/lib/spark/python/lib/py4j-0.10.6-src.zip:/usr/lib/spark/python/:<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.10.6-src.zip
> spark.files.fetchFailure.unRegisterOutputOnHost true
> spark.hadoop.yarn.timeline-service.enabled false
> spark.history.fs.logDirectory hdfs:///var/log/spark/apps
> spark.history.ui.port 18080
> spark.jars
> spark.master yarn
>
> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS
> ip-10-126-82-0.us-east-1.aws.*****
>
> spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES
> http://ip-10-126-82-0.us-east-1.aws.
> *****:20888/proxy/application_1528204441221_0010
> spark.repl.class.outputDir *********(redacted)
> spark.repl.class.uri
> spark://ip-10-126-87-125.us-east-1.aws.*****:38237/classes
> spark.resourceManager.cleanupExpiredHost true
> spark.scheduler.mode FIFO
> spark.shuffle.service.enabled true
> spark.sql.catalogImplementation hive
> spark.sql.hive.metastore.sharedPrefixes com.amazonaws.services.dynamodbv2
> spark.sql.warehouse.dir *********(redacted)
> spark.stage.attempt.ignoreOnDecommissionFetchFailure true
> spark.submit.deployMode cluster
> spark.ui.filters
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
> spark.ui.port 0
> spark.useHiveContext true
> spark.yarn.app.container.log.dir
> /var/log/hadoop-yarn/containers/application_1528204441221_0010/container_1528204441221_0010_01_000001
> spark.yarn.app.id application_1528204441221_0010
> spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS $(hostname -f)
> spark.yarn.dist.archives file:/usr/lib/spark/R/lib/sparkr.zip#sparkr
> spark.yarn.dist.files
> file:///home/hadoop/zeppelin-0.8.1-SNAPSHOT/conf/log4j_yarn_cluster.properties
> spark.yarn.historyServer.address ip-10-126-82-0.us-east-1.aws.*****:18080
> spark.yarn.isPython true
> zeppelin.R.cmd R
> zeppelin.R.image.width 100%
> zeppelin.R.knitr true
> zeppelin.R.render.options out.format = 'html', comment = NA, echo =
> FALSE, results = 'asis', message = F, warning = F, fig.retina = 2
> zeppelin.dep.additionalRemoteRepository spark-packages,
> http://dl.bintray.com/spark-packages/maven,false;
> zeppelin.dep.localrepo local-repo
> zeppelin.interpreter.localRepo
> /home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark
> zeppelin.interpreter.max.poolsize 10
> zeppelin.interpreter.output.limit 102400
> zeppelin.pyspark.python python
> zeppelin.pyspark.useIPython true
> zeppelin.spark.concurrentSQL false
> zeppelin.spark.enableSupportedVersionCheck true
> zeppelin.spark.importImplicit true
> zeppelin.spark.maxResult 1000
> zeppelin.spark.printREPLOutput true
> zeppelin.spark.sql.interpolation false
> zeppelin.spark.sql.stacktrace false
> zeppelin.spark.useHiveContext true
> zeppelin.spark.useNew true
>
>
> Am Sa., 9. Juni 2018 um 23:34 Uhr schrieb Thomas Bünger <
> thom.bueng@googlemail.com>:
>
>> Hey Jeff,
>> I just tried branch-0.8.
>> Still the same error: No ZeppelinContext "z" available when using
>> "yarn-cluster". (See attached screenshot)
>> With "yarn-client" it works.
>>
>> Besides setting JAVA_HOME and HADOOP_CONF_DIR inside zeppelin-env.sh, no
>> further adjustments were applied to the zeppelin installation. (Also thanks
>> to the new %spark.conf ;-) )
>>
>> Best regards,
>>  Thomas
>>
>> [image: Screen Shot 2018-06-09 at 23.27.49.png]
>>
>>
>>
>> Am Fr., 8. Juni 2018 um 03:05 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>>
>>>
>>> Hi Thomas,
>>>
>>> I tried the latest branch-0.8, and it works for me. Could you try again to
>>> verify it?
>>>
>>>
>>> Thomas Bünger <th...@googlemail.com>于2018年6月7日周四 下午8:34写道:
>>>
>>>> I specifically mean visualisation via ZeppelinContext inside a Spark
>>>> interpreter. (e.g. "z.show(...)")
>>>> The visualisation of SparkSQL results inside a SparkSQLInterpreter works
>>>> fine, also in yarn-cluster mode.
>>>>
>>>> Am Do., 7. Juni 2018 um 14:30 Uhr schrieb Thomas Bünger <
>>>> thom.bueng@googlemail.com>:
>>>>
>>>>> Hey Jeff,
>>>>>
>>>>> I tried your changes and now it works nicely. Thank you very much!
>>>>>
>>>>> But I still can't use any of the forms and visualizations in
>>>>> yarn-cluster?
>>>>> I was hoping that this got resolved with the new SparkInterpreter so
>>>>> that I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm
>>>>> still getting errors like
>>>>> "error: not found: value z"
>>>>>
>>>>> Was this not in scope of that change? Is this a bug? Or is it a known
>>>>> limitation that is also not supported in 0.8?
>>>>>
>>>>> Best regards,
>>>>>  Thomas
>>>>>
>>>>> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <zjffdu@gmail.com
>>>>> >:
>>>>>
>>>>>>
>>>>>> I can confirm that this is a bug, and created
>>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>>>>>
>>>>>> Will fix it soon
>>>>>>
>>>>>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>>>>>
>>>>>>>
>>>>>>> hmm, it looks like a bug. I will check it tomorrow.
>>>>>>>
>>>>>>>
>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>>>>>
>>>>>>>> $ ls /usr/lib/spark/python/lib
>>>>>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>>>>>
>>>>>>>> So the folder exists and contains both necessary zips. Please note
>>>>>>>> that in local or yarn-client mode the files are properly picked up from
>>>>>>>> that very same location.
>>>>>>>>
>>>>>>>> How does yarn-cluster work under the hood? Could it be that
>>>>>>>> environment variables (like SPARK_HOME) are lost, because they are only
>>>>>>>> available in my local shell + zeppelin daemon process? Do I need to tell
>>>>>>>> YARN somehow about SPARK_HOME?
>>>>>>>>
>>>>>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <
>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Could you check whether there's a folder /usr/lib/spark/python/lib?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> sys.env
>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>> at
>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>>>>>> at
>>>>>>>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>>>>>> at
>>>>>>>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>>>>> at
>>>>>>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>>>>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>>>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>> at
>>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>> at
>>>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>>>> at
>>>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> Could you paste the full stack trace?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>>>>>>>>>
>>>>>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the
>>>>>>>>>>>> preinstalled version of spark under /usr/lib/spark.
>>>>>>>>>>>>
>>>>>>>>>>>> This works fine in local or yarn-client mode, but in
>>>>>>>>>>>> yarn-cluster mode I just get a
>>>>>>>>>>>>
>>>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>>
>>>>>>>>>>>> Seems to be caused by an unsuccessful search for the py4j
>>>>>>>>>>>> libraries.
>>>>>>>>>>>> I've made sure that SPARK_HOME is actually set in .bashrc, in
>>>>>>>>>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>>>>>>> interpreter, something odd is going on.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>  Thomas
>>>>>>>>>>>>
>>>>>>>>>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Thomas Bünger <th...@googlemail.com>.
I just tried to copy the zeppelin installation to the exact same location
on each YARN node, and now everything works fine!
So it seems some jar file is not being sent to the Spark driver node,
or the classpath is wrong.

Maybe the following dump from the Spark UI might help somehow?

*Environment*
*Runtime Information*
Name Value
Java Version 1.8.0_171 (Oracle Corporation)
Java Home
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.171-7.b10.37.amzn1.x86_64/jre
Scala Version version 2.11.8
*Spark Properties*
Name Value
SPARK_HOME /usr/lib/spark
master yarn-cluster
spark.app.id application_1528204441221_0010
spark.app.name Zeppelin
spark.blacklist.decommissioning.enabled true
spark.blacklist.decommissioning.timeout 1h
spark.decommissioning.timeout.threshold 20
spark.driver.extraClassPath
:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/*:/home/hadoop/zeppelin-0.8.1-SNAPSHOT/lib/interpreter/*::/home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/spark-interpreter-0.8.1-SNAPSHOT.jar:/etc/hadoop/conf/
spark.driver.extraJavaOptions -Dfile.encoding=UTF-8
-Dlog4j.configuration=log4j_yarn_cluster.properties
-Dzeppelin.log.file=/home/hadoop/zeppelin-0.8.1-SNAPSHOT/logs/zeppelin-interpreter-spark-hadoop-ip-10-126-82-0.log
spark.driver.extraLibraryPath
/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
spark.driver.host ip-10-126-87-125.us-east-1.aws.*****
spark.driver.port 38237
spark.dynamicAllocation.enabled true
spark.eventLog.dir hdfs:///var/log/spark/apps
spark.eventLog.enabled true
spark.executor.cores 4
spark.executor.extraClassPath
/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar
spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails
-XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70
-XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
spark.executor.extraLibraryPath
/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
spark.executor.id driver
spark.executorEnv.PYTHONPATH
/usr/lib/spark/python/lib/py4j-0.10.6-src.zip:/usr/lib/spark/python/:<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.10.6-src.zip
spark.files.fetchFailure.unRegisterOutputOnHost true
spark.hadoop.yarn.timeline-service.enabled false
spark.history.fs.logDirectory hdfs:///var/log/spark/apps
spark.history.ui.port 18080
spark.jars
spark.master yarn
spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS
ip-10-126-82-0.us-east-1.aws.*****
spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES
http://ip-10-126-82-0.us-east-1.aws.
*****:20888/proxy/application_1528204441221_0010
spark.repl.class.outputDir *********(redacted)
spark.repl.class.uri
spark://ip-10-126-87-125.us-east-1.aws.*****:38237/classes
spark.resourceManager.cleanupExpiredHost true
spark.scheduler.mode FIFO
spark.shuffle.service.enabled true
spark.sql.catalogImplementation hive
spark.sql.hive.metastore.sharedPrefixes com.amazonaws.services.dynamodbv2
spark.sql.warehouse.dir *********(redacted)
spark.stage.attempt.ignoreOnDecommissionFetchFailure true
spark.submit.deployMode cluster
spark.ui.filters org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
spark.ui.port 0
spark.useHiveContext true
spark.yarn.app.container.log.dir
/var/log/hadoop-yarn/containers/application_1528204441221_0010/container_1528204441221_0010_01_000001
spark.yarn.app.id application_1528204441221_0010
spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS $(hostname -f)
spark.yarn.dist.archives file:/usr/lib/spark/R/lib/sparkr.zip#sparkr
spark.yarn.dist.files
file:///home/hadoop/zeppelin-0.8.1-SNAPSHOT/conf/log4j_yarn_cluster.properties
spark.yarn.historyServer.address ip-10-126-82-0.us-east-1.aws.*****:18080
spark.yarn.isPython true
zeppelin.R.cmd R
zeppelin.R.image.width 100%
zeppelin.R.knitr true
zeppelin.R.render.options out.format = 'html', comment = NA, echo = FALSE,
results = 'asis', message = F, warning = F, fig.retina = 2
zeppelin.dep.additionalRemoteRepository spark-packages,
http://dl.bintray.com/spark-packages/maven,false;
zeppelin.dep.localrepo local-repo
zeppelin.interpreter.localRepo
/home/hadoop/zeppelin-0.8.1-SNAPSHOT/local-repo/spark
zeppelin.interpreter.max.poolsize 10
zeppelin.interpreter.output.limit 102400
zeppelin.pyspark.python python
zeppelin.pyspark.useIPython true
zeppelin.spark.concurrentSQL false
zeppelin.spark.enableSupportedVersionCheck true
zeppelin.spark.importImplicit true
zeppelin.spark.maxResult 1000
zeppelin.spark.printREPLOutput true
zeppelin.spark.sql.interpolation false
zeppelin.spark.sql.stacktrace false
zeppelin.spark.useHiveContext true
zeppelin.spark.useNew true

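Note that spark.driver.extraClassPath above is dominated by
/home/hadoop/zeppelin-0.8.1-SNAPSHOT/... entries, i.e. the local Zeppelin
installation. A quick sanity check (a sketch; run it on whichever YARN node
hosted the driver container) would be:

  ls /home/hadoop/zeppelin-0.8.1-SNAPSHOT/interpreter/spark/

If that path is missing on the driver's node, the remote driver cannot load
the spark-interpreter jar, which would explain why replicating the
installation on every node works around the problem.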

On Sat, Jun 9, 2018 at 23:34, Thomas Bünger <thom.bueng@googlemail.com> wrote:

> Hey Jeff,
> I just tried branch-0.8.
> Still the same error: No ZeppelinContext "z" available when using
> "yarn-cluster". (See attached screenshot)
> With "yarn-client" it works.
>
> Besides setting JAVA_HOME and HADOOP_CONF_DIR inside zeppelin-env.sh, no
> further adjustments were applied to the zeppelin installation. (Also thanks
> to the new %spark.conf ;-) )
>
> Best regards,
>  Thomas
>
> [image: Screen Shot 2018-06-09 at 23.27.49.png]
>
>
>
> Am Fr., 8. Juni 2018 um 03:05 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>
>>
>> Hi Thomas,
>>
>> I tried the latest branch-0.8, and it works for me. Could you try again to
>> verify it?
>>
>>
>> Thomas Bünger <th...@googlemail.com>于2018年6月7日周四 下午8:34写道:
>>
>>> I specifically mean visualisation via ZeppelinContext inside a Spark
>>> interpreter. (e.g. "z.show(...)")
>>> The visualisation of SparkSQL results inside a SparkSQLInterpreter works
>>> fine, also in yarn-cluster mode.
>>>
>>> Am Do., 7. Juni 2018 um 14:30 Uhr schrieb Thomas Bünger <
>>> thom.bueng@googlemail.com>:
>>>
>>>> Hey Jeff,
>>>>
>>>> I tried your changes and now it works nicely. Thank you very much!
>>>>
>>>> But I still can't use any of the forms and visualizations in
>>>> yarn-cluster?
>>>> I was hoping that this got resolved with the new SparkInterpreter so
>>>> that I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm
>>>> still getting errors like
>>>> "error: not found: value z"
>>>>
>>>> Was this not in scope of that change? Is this a bug? Or is it a known
>>>> limitation that is also not supported in 0.8?
>>>>
>>>> Best regards,
>>>>  Thomas
>>>>
>>>> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <zjffdu@gmail.com
>>>> >:
>>>>
>>>>>
>>>>> I can confirm that this is a bug, and created
>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>>>>
>>>>> Will fix it soon
>>>>>
>>>>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>>>>
>>>>>>
>>>>>> hmm, it looks like a bug. I will check it tomorrow.
>>>>>>
>>>>>>
>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>>>>
>>>>>>> $ ls /usr/lib/spark/python/lib
>>>>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>>>>
>>>>>>> So the folder exists and contains both necessary zips. Please note that
>>>>>>> in local or yarn-client mode the files are properly picked up from that
>>>>>>> very same location.
>>>>>>>
>>>>>>> How does yarn-cluster work under the hood? Could it be that
>>>>>>> environment variables (like SPARK_HOME) are lost, because they are only
>>>>>>> available in my local shell + zeppelin daemon process? Do I need to tell
>>>>>>> YARN somehow about SPARK_HOME?
>>>>>>>
>>>>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <
>>>>>>> zjffdu@gmail.com>:
>>>>>>>
>>>>>>>>
>>>>>>>> Could you check whether there's a folder /usr/lib/spark/python/lib?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> sys.env
>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>> at
>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>>>>> at
>>>>>>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>>>>> at
>>>>>>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>>>> at
>>>>>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>>>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>> at
>>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>> at
>>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>>> at
>>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>>
>>>>>>>>>> Could you paste the full stack trace?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>>>>>>>>
>>>>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the
>>>>>>>>>>> preinstalled version of spark under /usr/lib/spark.
>>>>>>>>>>>
>>>>>>>>>>> This works fine in local or yarn-client mode, but in
>>>>>>>>>>> yarn-cluster mode I just get a
>>>>>>>>>>>
>>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>>
>>>>>>>>>>> Seems to be caused by an unsuccessful search for the py4j
>>>>>>>>>>> libraries.
>>>>>>>>>>> I've made sure that SPARK_HOME is actually set in .bashrc, in
>>>>>>>>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>>>>>> interpreter, something odd is going on.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>  Thomas
>>>>>>>>>>>
>>>>>>>>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Thomas Bünger <th...@googlemail.com>.
Hey Jeff,
I just tried branch-0.8.
Still the same error: No ZeppelinContext "z" available when using
"yarn-cluster". (See attached screenshot)
With "yarn-client" it works.

Besides setting JAVA_HOME and HADOOP_CONF_DIR inside zeppelin-env.sh, no
further adjustments were applied to the zeppelin installation. (Also thanks
to the new %spark.conf ;-) )

Best regards,
 Thomas

[image: Screen Shot 2018-06-09 at 23.27.49.png]



On Fri, Jun 8, 2018 at 03:05, Jeff Zhang <zj...@gmail.com> wrote:

>
> Hi Thomas,
>
> I tried the latest branch-0.8, and it works for me. Could you try again to
> verify it?
>
>
> Thomas Bünger <th...@googlemail.com>于2018年6月7日周四 下午8:34写道:
>
>> I specifically mean visualisation via ZeppelinContext inside a Spark
>> interpreter. (e.g. "z.show(...)")
>> The visualisation of SparkSQL results inside a SparkSQLInterpreter works
>> fine, also in yarn-cluster mode.
>>
>> Am Do., 7. Juni 2018 um 14:30 Uhr schrieb Thomas Bünger <
>> thom.bueng@googlemail.com>:
>>
>>> Hey Jeff,
>>>
>>> I tried your changes and now it works nicely. Thank you very much!
>>>
>>> But I still can't use any of the forms and visualizations in
>>> yarn-cluster?
>>> I was hoping that this got resolved with the new SparkInterpreter so
>>> that I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm
>>> still getting errors like
>>> "error: not found: value z"
>>>
>>> Was this not in scope of that change? Is this a bug? Or is it a known
>>> limitation that is also not supported in 0.8?
>>>
>>> Best regards,
>>>  Thomas
>>>
>>> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>>>
>>>>
>>>> I can confirm that this is a bug, and created
>>>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>>>
>>>> Will fix it soon
>>>>
>>>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>>>
>>>>>
>>>>> hmm, it looks like a bug. I will check it tomorrow.
>>>>>
>>>>>
>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>>>
>>>>>> $ ls /usr/lib/spark/python/lib
>>>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>>>
>>>>>> So the folder exists and contains both necessary zips. Please note that
>>>>>> in local or yarn-client mode the files are properly picked up from that
>>>>>> very same location.
>>>>>>
>>>>>> How does yarn-cluster work under the hood? Could it be that
>>>>>> environment variables (like SPARK_HOME) are lost, because they are only
>>>>>> available in my local shell + zeppelin daemon process? Do I need to tell
>>>>>> YARN somehow about SPARK_HOME?
>>>>>>
>>>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <
>>>>>> zjffdu@gmail.com>:
>>>>>>
>>>>>>>
>>>>>>> Could you check whether there's a folder /usr/lib/spark/python/lib?
>>>>>>>
>>>>>>>
>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>>>>>
>>>>>>>>
>>>>>>>> sys.env
>>>>>>>> java.lang.NullPointerException at
>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>> at
>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>>>> at
>>>>>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>>>> at
>>>>>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>>> at
>>>>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>> at
>>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>> at
>>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>> at
>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>>>>
>>>>>>>>
>>>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>>>> zjffdu@gmail.com>:
>>>>>>>>
>>>>>>>>> Could you paste the full stack trace?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>>>>>>>
>>>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the preinstalled
>>>>>>>>>> version of spark under /usr/lib/spark.
>>>>>>>>>>
>>>>>>>>>> This works fine in local or yarn-client mode, but in yarn-cluster
>>>>>>>>>> mode I just get a
>>>>>>>>>>
>>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>>
>>>>>>>>>> Seems to be caused by an unsuccessful search for the py4j
>>>>>>>>>> libraries.
>>>>>>>>>> I've made sure that SPARK_HOME is actually set in .bashrc, in
>>>>>>>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>>>>> interpreter, something odd is going on.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>  Thomas
>>>>>>>>>>
>>>>>>>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jeff Zhang <zj...@gmail.com>.
Hi Thomas,

I tried the latest branch-0.8, and it works for me. Could you try again to
verify it?


Thomas Bünger <th...@googlemail.com> wrote on Thu, Jun 7, 2018 at 8:34 PM:

> I specifically mean visualisation via ZeppelinContext inside a Spark
> interpreter. (e.g. "z.show(...)")
> The visualisation of SparkSQL results inside a SparkSQLInterpreter works
> fine, also in yarn-cluster mode.
>
> Am Do., 7. Juni 2018 um 14:30 Uhr schrieb Thomas Bünger <
> thom.bueng@googlemail.com>:
>
>> Hey Jeff,
>>
>> I tried your changes and now it works nicely. Thank you very much!
>>
>> But I still can't use any of the forms and visualizations in yarn-cluster?
>> I was hoping that this got resolved with the new SparkInterpreter so that
>> I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm still
>> getting errors like
>> "error: not found: value z"
>>
>> Was this not in scope of that change? Is this a bug? Or is it a known
>> limitation that is also not supported in 0.8?
>>
>> Best regards,
>>  Thomas
>>
>> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>>
>>>
>>> I can confirm that this is a bug, and created
>>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>>
>>> Will fix it soon
>>>
>>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>>
>>>>
>>>> hmm, it looks like a bug. I will check it tomorrow.
>>>>
>>>>
>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>>
>>>>> $ ls /usr/lib/spark/python/lib
>>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>>
>>>>> So the folder exists and contains both necessary zips. Please note that
>>>>> in local or yarn-client mode the files are properly picked up from that
>>>>> very same location.
>>>>>
>>>>> How does yarn-cluster work under the hood? Could it be that
>>>>> environment variables (like SPARK_HOME) are lost, because they are only
>>>>> available in my local shell + zeppelin daemon process? Do I need to tell
>>>>> YARN somehow about SPARK_HOME?
>>>>>
>>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <zjffdu@gmail.com
>>>>> >:
>>>>>
>>>>>>
>>>>>> Could you check whether there's a folder /usr/lib/spark/python/lib?
>>>>>>
>>>>>>
>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>>>>
>>>>>>>
>>>>>>> sys.env
>>>>>>> java.lang.NullPointerException at
>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>> at
>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>>> at
>>>>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>>> at
>>>>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>> at
>>>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>> at
>>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>> at
>>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>> at
>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>>>
>>>>>>>
>>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>>> zjffdu@gmail.com>:
>>>>>>>
>>>>>>>> Could you paste the full stack trace?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>>>>>>
>>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the preinstalled
>>>>>>>>> version of spark under /usr/lib/spark.
>>>>>>>>>
>>>>>>>>> This works fine in local or yarn-client mode, but in yarn-cluster
>>>>>>>>> mode I just get a
>>>>>>>>>
>>>>>>>>> java.lang.NullPointerException at
>>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>>
>>>>>>>>> Seems to be caused by an unsuccessful search for the py4j
>>>>>>>>> libraries.
>>>>>>>>> I've made sure that SPARK_HOME is actually set in .bashrc, in
>>>>>>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>>>> interpreter, something odd is going on.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>  Thomas
>>>>>>>>>
>>>>>>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Thomas Bünger <th...@googlemail.com>.
I specifically mean visualisation via ZeppelinContext inside a Spark
interpreter. (e.g. "z.show(...)")
The visualisation of SparkSQL results inside a SparkSQLInterpreter works
fine, also in yarn-cluster mode.

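For readers of the archive: z is the ZeppelinContext instance that the Spark
interpreter injects into the Scala REPL session, used in a paragraph roughly
like this minimal sketch:

  %spark
  val df = spark.range(5).toDF("value")
  z.show(df)  // renders df with Zeppelin's built-in table/chart UI

So "error: not found: value z" means that this injection step never ran for
the yarn-cluster driver.
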
On Thu, Jun 7, 2018 at 14:30, Thomas Bünger <thom.bueng@googlemail.com> wrote:

> Hey Jeff,
>
> I tried your changes and now it works nicely. Thank you very much!
>
> But I still can't use any of the forms and visualizations in yarn-cluster?
> I was hoping that this got resolved with the new SparkInterpreter so that
> I can switch from yarn-client to yarn-cluster mode in 0.8, but I'm still
> getting errors like
> "error: not found: value z"
>
> Was this not in scope of that change? Is this a bug? Or is it a known
> limitation that is also not supported in 0.8?
>
> Best regards,
>  Thomas
>
> Am Mi., 6. Juni 2018 um 03:28 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>
>>
>> I can confirm that this is a bug, and created
>> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>>
>> Will fix it soon
>>
>> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>>
>>>
>>> hmm, it looks like a bug. I will check it tomorrow.
>>>
>>>
>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>>
>>>> $ ls /usr/lib/spark/python/lib
>>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>>
>>>> So the folder exists and contains both necessary zips. Please note that in
>>>> local or yarn-client mode the files are properly picked up from that very
>>>> same location.
>>>>
>>>> How does yarn-cluster work under the hood? Could it be that environment
>>>> variables (like SPARK_HOME) are lost, because they are only available in my
>>>> local shell + zeppelin daemon process? Do I need to tell YARN somehow about
>>>> SPARK_HOME?
>>>>
>>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <zjffdu@gmail.com
>>>> >:
>>>>
>>>>>
>>>>> Could you check whether there's a folder /usr/lib/spark/python/lib?
>>>>>
>>>>>
>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>>>
>>>>>>
>>>>>> sys.env
>>>>>> java.lang.NullPointerException at
>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>> at
>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>>> at
>>>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>>> at
>>>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>> at
>>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>> at
>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>> at
>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>> at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>>
>>>>>>
>>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <
>>>>>> zjffdu@gmail.com>:
>>>>>>
>>>>>>> Could you paste the full stack trace?
>>>>>>>
>>>>>>>
>>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>>>>>
>>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the preinstalled
>>>>>>>> version of spark under /usr/lib/spark.
>>>>>>>>
>>>>>>>> This works fine in local or yarn-client mode, but in yarn-cluster
>>>>>>>> mode I just get a
>>>>>>>>
>>>>>>>> java.lang.NullPointerException at
>>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>>
>>>>>>>> Seems to be caused by an unsuccessful search for the py4j libraries.
>>>>>>>> I've made sure that SPARK_HOME is actually set in .bashrc, in
>>>>>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>>> interpreter, something odd is going on.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>  Thomas
>>>>>>>>
>>>>>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Thomas Bünger <th...@googlemail.com>.
Hey Jeff,

I tried your changes and now it works nicely. Thank you very much!

But I still can't use any of the forms and visualizations in yarn-cluster?
I was hoping that this got resolved with the new SparkInterpreter so that I
can switch from yarn-client to yarn-cluster mode in 0.8, but I'm still
getting errors like
"error: not found: value z"

Was this not in scope of that change? Is this a bug? Or is it a known
limitation that is also not supported in 0.8?

Best regards,
 Thomas

On Wed, Jun 6, 2018 at 03:28, Jeff Zhang <zj...@gmail.com> wrote:

>
> I can confirm that this is a bug, and created
> https://issues.apache.org/jira/browse/ZEPPELIN-3531
>
> Will fix it soon
>
> Jeff Zhang <zj...@gmail.com>于2018年6月5日周二 下午9:01写道:
>
>>
>> hmm, it looks like a bug. I will check it tomorrow.
>>
>>
>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>>
>>> $ ls /usr/lib/spark/python/lib
>>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>>
>>> So the folder exists and contains both necessary zips. Please note that in
>>> local or yarn-client mode the files are properly picked up from that very
>>> same location.
>>>
>>> How does yarn-cluster work under the hood? Could it be that environment
>>> variables (like SPARK_HOME) are lost, because they are only available in my
>>> local shell + zeppelin daemon process? Do I need to tell YARN somehow about
>>> SPARK_HOME?
>>>
>>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>>>
>>>>
>>>> Could you check whether there's a folder /usr/lib/spark/python/lib?
>>>>
>>>>
>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>>
>>>>>
>>>>> sys.env
>>>>> java.lang.NullPointerException at
>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>> at
>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>>> at
>>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>>> at
>>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>> at
>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>> at
>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>> at
>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>> at
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>
>>>>>
>>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <zjffdu@gmail.com
>>>>> >:
>>>>>
>>>>>> Could you paste the full stack trace?
>>>>>>
>>>>>>
>>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>>>>
>>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the preinstalled
>>>>>>> version of spark under /usr/lib/spark.
>>>>>>>
>>>>>>> This works fine in local or yarn-client mode, but in yarn-cluster
>>>>>>> mode I just get a
>>>>>>>
>>>>>>> java.lang.NullPointerException at
>>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>>
>>>>>>> Seems to be caused by an unsuccessful search for the py4j libraries.
>>>>>>> I've made sure that SPARK_HOME is actually set in .bashrc, in
>>>>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>>> interpreter, something odd is going on.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>  Thomas
>>>>>>>
>>>>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jeff Zhang <zj...@gmail.com>.
I can confirm that this is a bug, and created
https://issues.apache.org/jira/browse/ZEPPELIN-3531

Will fix it soon

Jeff Zhang <zj...@gmail.com> wrote on Tue, Jun 5, 2018 at 9:01 PM:

>
> hmm, it looks like a bug. I will check it tomorrow.
>
>
> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:56写道:
>
>> $ ls /usr/lib/spark/python/lib
>> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>>
>> So the folder exists and contains both necessary zips. Please note that in
>> local or yarn-client mode the files are properly picked up from that very
>> same location.
>>
>> How does yarn-cluster work under the hood? Could it be that environment
>> variables (like SPARK_HOME) are lost, because they are only available in my
>> local shell + zeppelin daemon process? Do I need to tell YARN somehow about
>> SPARK_HOME?
>>
>> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>>
>>>
>>> Could you check whether there's a folder /usr/lib/spark/python/lib?
>>>
>>>
>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>>
>>>>
>>>> sys.env
>>>> java.lang.NullPointerException at
>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>> at
>>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>>> at
>>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>>> at
>>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>> at
>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>> at
>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>> at
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>> at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>> at java.lang.Thread.run(Thread.java:748)
>>>>
>>>>
>>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <zjffdu@gmail.com
>>>> >:
>>>>
>>>>> Could you paste the full stack trace?
>>>>>
>>>>>
>>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>>>
>>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the preinstalled
>>>>>> version of spark under /usr/lib/spark.
>>>>>>
>>>>>> This works fine in local or yarn-client mode, but in yarn-cluster
>>>>>> mode I just get a
>>>>>>
>>>>>> java.lang.NullPointerException at
>>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>>
>>>>>> Seems to be caused by an unsuccessful search for the py4j libraries.
>>>>>> I've made sure that SPARK_HOME is actually set in .bashrc, in
>>>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>>> interpreter, something odd is going on.
>>>>>>
>>>>>> Best regards,
>>>>>>  Thomas
>>>>>>
>>>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jeff Zhang <zj...@gmail.com>.
hmm, it looks like a bug. I will check it tomorrow.


Thomas Bünger <th...@googlemail.com> wrote on Tue, Jun 5, 2018 at 8:56 PM:

> $ ls /usr/lib/spark/python/lib
> py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip
>
> So the folder exists and contains both necessary zips. Please note that in
> local or yarn-client mode the files are properly picked up from that very
> same location.
>
> How does yarn-cluster work under the hood? Could it be that environment
> variables (like SPARK_HOME) are lost, because they are only available in my
> local shell + zeppelin daemon process? Do I need to tell YARN somehow about
> SPARK_HOME?
>
> Am Di., 5. Juni 2018 um 14:48 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>
>>
>> Could you check whether there's a folder /usr/lib/spark/python/lib?
>>
>>
>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>>
>>>
>>> sys.env
>>> java.lang.NullPointerException at
>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>> at
>>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>>> at
>>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>>> at
>>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>> at
>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>> at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>> at java.lang.Thread.run(Thread.java:748)
>>>
>>>
>>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>>>
>>>> Could you paste the full stack trace?
>>>>
>>>>
>>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>>
>>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the preinstalled
>>>>> version of spark under /usr/lib/spark.
>>>>>
>>>>> This works fine in local or yarn-client mode, but in yarn-cluster mode
>>>>> I just get a
>>>>>
>>>>> java.lang.NullPointerException at
>>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>>
>>>>> Seems to be caused by an unsuccessful search for the py4j libraries.
>>>>> I've made sure that SPARK_HOME is actually set in .bashrc, in
>>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>>> interpreter, something odd is going on.
>>>>>
>>>>> Best regards,
>>>>>  Thomas
>>>>>
>>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Thomas Bünger <th...@googlemail.com>.
$ ls /usr/lib/spark/python/lib
py4j-0.10.6-src.zip  PY4J_LICENSE.txt  pyspark.zip

So the folder exists and contains both necessary zips. Please note that in
local or yarn-client mode the files are properly picked up from that very
same location.

How does yarn-cluster work under the hood? Could it be that environment
variables (like SPARK_HOME) are lost, because they are only available in my
local shell + zeppelin daemon process? Do I need to tell YARN somehow about
SPARK_HOME?

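(One avenue worth trying, though unverified for Zeppelin's own py4j lookup:
Spark can forward environment variables to the YARN application master with
properties of the form

  spark.yarn.appMasterEnv.SPARK_HOME /usr/lib/spark

set via the interpreter settings or %spark.conf. This is the same
spark.yarn.appMasterEnv.* mechanism that appears for SPARK_PUBLIC_DNS in the
environment dump elsewhere in this thread.)
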
On Tue, Jun 5, 2018 at 14:48, Jeff Zhang <zj...@gmail.com> wrote:

>
> Could you check whether there's a folder /usr/lib/spark/python/lib?
>
>
> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:45写道:
>
>>
>> sys.env
>> java.lang.NullPointerException at
>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>> at
>> org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
>> at
>> org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>> at
>> org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>> at
>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>> at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at
>> org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> at java.lang.Thread.run(Thread.java:748)
>>
>>
>> Am Di., 5. Juni 2018 um 14:41 Uhr schrieb Jeff Zhang <zj...@gmail.com>:
>>
>>> Could you paste the full stack trace?
>>>
>>>
>>> Thomas Bünger <th...@googlemail.com>于2018年6月5日周二 下午8:21写道:
>>>
>>>> I've tried the 0.8.0-rc4 on my EMR cluster using the preinstalled
>>>> version of spark under /usr/lib/spark.
>>>>
>>>> This works fine in local or yarn-client mode, but in yarn-cluster mode
>>>> i just get a
>>>>
>>>> java.lang.NullPointerException at
>>>> org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
>>>>
>>>> Seems to be caused by an unsuccessful search for the py4j libraries.
>>>> I've made sure that SPARK_HOME is actually set in .bash_rc, in
>>>> zeppelin-env.sh and via the new %spark.conf, but somehow in the remote
>>>> interpreter, something odd is going on.
>>>>
>>>> Best regards,
>>>>  Thomas
>>>>
>>>

Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jeff Zhang <zj...@gmail.com>.
Could you check whether the folder /usr/lib/spark/python/lib exists?


Re: NewSparkInterpreter fails on yarn-cluster

Posted by Thomas Bünger <th...@googlemail.com>.
sys.env

java.lang.NullPointerException
    at org.apache.zeppelin.spark.NewSparkInterpreter.setupConfForPySpark(NewSparkInterpreter.java:149)
    at org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:90)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
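
Every frame above open() is scheduler plumbing, so the interpreter dies while
opening, before the sys.env paragraph ever runs; any paragraph would surface
the same NPE. Once the interpreter does open, a throwaway paragraph along
these lines (a hypothetical diagnostic, plain Scala) shows which environment
the yarn-cluster driver actually inherits:

    %spark
    // Hypothetical diagnostic: print the variables the remote driver sees,
    // to confirm whether SPARK_HOME survived the hop to the YARN node.
    Seq("SPARK_HOME", "PYTHONPATH", "HADOOP_CONF_DIR").foreach { k =>
      println(s"$k = ${sys.env.getOrElse(k, "<not set>")}")
    }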


Re: NewSparkInterpreter fails on yarn-cluster

Posted by Jeff Zhang <zj...@gmail.com>.
Could you paste the full stack trace?

