Posted to users@zeppelin.apache.org by "Y. Ethan Guo" <gu...@uber.com> on 2019/04/07 07:31:02 UTC

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

Hi Jeff,

Given that this PR is merged, I'm trying to see if I can run yarn cluster
mode from a master build.  I built Zeppelin master from this commit:

commit 3655c12b875884410224eca5d6155287d51916ac
Author: Jongyoul Lee <jo...@gmail.com>
Date:   Mon Apr 1 15:37:57 2019 +0900
    [MINOR] Refactor CronJob class (#3335)
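
In case it matters, the build itself was a plain Maven package build, roughly
as below; the profile flags are from memory, so check them against the build
docs before copying:

    mvn clean package -DskipTests -Pspark-2.4 -Pscala-2.11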

While I can successfully run the Spark interpreter in yarn client mode, I'm
having trouble making yarn cluster mode work.  Specifically, while the
interpreter job was accepted in yarn, the job failed after 1-2 minutes
because of the exception below.  Do you have any idea why this is happening?

DEBUG [2019-04-07 06:57:00,314] ({main} Logging.scala[logDebug]:58) - Created SSL options for fs: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
 INFO [2019-04-07 06:57:00,323] ({main} Logging.scala[logInfo]:54) - Starting the user application in a separate Thread
 INFO [2019-04-07 06:57:00,350] ({main} Logging.scala[logInfo]:54) - Waiting for spark context initialization...
 INFO [2019-04-07 06:57:00,403] ({Driver} RemoteInterpreterServer.java[<init>]:148) - Starting remote interpreter server on port 0, intpEventServerAddress: 172.17.0.1:45128
ERROR [2019-04-07 06:57:00,408] ({Driver} Logging.scala[logError]:91) - User class threw exception: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:154)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:139)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.main(RemoteInterpreterServer.java:285)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:635)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
    ... 8 more

Thanks,
- Ethan

On Wed, Feb 27, 2019 at 4:24 PM Jeff Zhang <zj...@gmail.com> wrote:

> Here's the PR
> https://github.com/apache/zeppelin/pull/3308
>
> Y. Ethan Guo <gu...@uber.com> wrote on Thu, Feb 28, 2019 at 2:50 AM:
>
>> Hi All,
>>
>> I'm trying to use the new yarn cluster mode feature to run Spark 2.4.0
>> jobs on Zeppelin 0.8.1. I've set the SPARK_HOME, SPARK_SUBMIT_OPTIONS, and
>> HADOOP_CONF_DIR env variables in zeppelin-env.sh so that the Spark
>> interpreter can be started in the cluster. I used `--jars` in
>> SPARK_SUBMIT_OPTIONS to add local jars. However, when I tried to import a
>> class from the jars in a Spark paragraph, the interpreter complained that
>> it could not find the package and class ("<console>:23: error: object ...
>> is not a member of package ..."). It looks like the jars are not properly
>> imported.
>>
>> I followed the instructions here
>> <https://zeppelin.apache.org/docs/0.8.1/interpreter/spark.html#2-loading-spark-properties>
>> to add the jars, but it seems that this is not working in cluster mode.
>> This issue seems to be related to this bug:
>> https://jira.apache.org/jira/browse/ZEPPELIN-3986.  Is there any update
>> on fixing it? What is the right way to add local jars in yarn cluster mode?
>> Any help or updates are much appreciated.
>>
>>
>> Here's the SPARK_SUBMIT_OPTIONS I used (packages and jars paths omitted):
>>
>> export SPARK_SUBMIT_OPTIONS="--driver-memory 12G --packages ... --jars
>> ... --repositories
>> https://repository.cloudera.com/artifactory/public/,https://repository.cloudera.com/content/repositories/releases/,http://repo.spring.io/plugins-release/
>> "
>>
>> Thanks,
>> - Ethan
>> --
>> Best,
>> - Ethan
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

Posted by Jeff Zhang <zj...@gmail.com>.
It is supposed to be fixed in 0.9.0-SNAPSHOT as well. If you hit this issue
in master, then it should be a bug; please file a ticket and describe the
details. Thanks



Y. Ethan Guo <gu...@uber.com> wrote on Mon, Apr 8, 2019 at 4:42 PM:


-- 
Best Regards

Jeff Zhang

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

Posted by "Y. Ethan Guo" <gu...@uber.com>.
I'm partially hitting this issue in 0.9.0-SNAPSHOT for Spark interpreters
with other names.  I'm not sure the ZEPPELIN-3986 issue is completely
resolved.  I'm using multiple Spark interpreters with different Spark confs
that share the same SPARK_SUBMIT_OPTIONS, including a `--jars` option.  It
seems that only one of them is working.  Shall we follow up on the ticket
and see how to fix it?

Thanks,
- Ethan

On Mon, Apr 8, 2019 at 1:34 AM Jeff Zhang <zj...@gmail.com> wrote:


Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

Posted by Jeff Zhang <zj...@gmail.com>.
Hi Ethan,

This behavior is not expected. Maybe you are hitting this issue, which is
fixed in 0.8.2:
https://jira.apache.org/jira/browse/ZEPPELIN-3986


Y. Ethan Guo <gu...@uber.com> wrote on Mon, Apr 8, 2019 at 4:26 PM:


-- 
Best Regards

Jeff Zhang

Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

Posted by "Y. Ethan Guo" <gu...@uber.com>.
Hi Jeff, Dave,

Thanks for the suggestion.  I was able to successfully run the Spark
interpreter in yarn cluster mode on another machine running Zeppelin.  The
previous problem was probably due to network issues.

I have two observations:
(1) I'm able to use the "--jars" option in SPARK_SUBMIT_OPTIONS in the "spark"
interpreter with yarn cluster mode configured.  I verified that the jars are
pushed to the driver and executors by successfully running a job that uses
some classes in the jars.  However, if I create a new "spark_abc" interpreter
under the Spark interpreter group, this new interpreter doesn't seem to pick
up SPARK_SUBMIT_OPTIONS and the --jars option, leading to errors about not
being able to access packages/classes in the jars.  (A possible workaround is
sketched below.)

(2) Once I restart the Spark interpreters in the interpreter settings, the
corresponding Spark jobs in the yarn cluster first transition from the
"RUNNING" state to the "ACCEPTED" state and then end up in the "FAILED" state.

I'm wondering whether the above behaviors are expected and whether they are
known limitations of the current 0.9.0-SNAPSHOT version.
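
As for the workaround mentioned in (1): one option I'm considering, though I
haven't verified it yet, is to move the jar list out of the shared
SPARK_SUBMIT_OPTIONS and into each interpreter setting's properties, using
the standard Spark properties (the path and coordinates below are just
placeholders):

    spark.jars            /path/to/my-lib.jar
    spark.jars.packages   com.example:my-lib:1.0.0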

Thanks,
- Ethan

On Sun, Apr 7, 2019 at 9:59 AM Dave Boyd <db...@incadencecorp.com> wrote:


Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

Posted by Dave Boyd <db...@incadencecorp.com>.
From the connection refused message, I wonder if it is an SSL error.  I note
that none of the SSL information (truststore, keystore, etc.) is set.
I would think the YARN cluster requires some form of authentication.
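
Purely as an illustration, one quick way to see whether anything TLS-capable
is listening there would be to run something like the following from a
cluster node (host and port taken from the log earlier in the thread):

    openssl s_client -connect 172.17.0.1:45128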

On 4/7/19 9:27 AM, Jeff Zhang wrote:

--
========= mailto:dboyd@incadencecorp.com ============
David W. Boyd
VP,  Data Solutions
10432 Balls Ford, Suite 240
Manassas, VA 20109
office:   +1-703-552-2862
cell:     +1-703-402-7908
============== http://www.incadencecorp.com/ ============
ISO/IEC JTC1 WG9, editor ISO/IEC 20547 Big Data Reference Architecture
Chair ANSI/INCITS TC Big Data
Co-chair NIST Big Data Public Working Group Reference Architecture
First Robotic Mentor - FRC, FTC - www.iliterobotics.org
Board Member - USSTEM Foundation - www.usstem.org

The information contained in this message may be privileged
and/or confidential and protected from disclosure.
If the reader of this message is not the intended recipient
or an employee or agent responsible for delivering this message
to the intended recipient, you are hereby notified that any
dissemination, distribution or copying of this communication
is strictly prohibited.  If you have received this communication
in error, please notify the sender immediately by replying to
this message and deleting the material from any computer.



Re: Adding jars to Spark 2.4.0 under yarn cluster mode in Zeppelin 0.8.1

Posted by Jeff Zhang <zj...@gmail.com>.
It looks like the interpreter process cannot connect to the zeppelin server
process. I guess it is due to some network issue; can you check whether the
node in the yarn cluster can connect to the zeppelin server host?
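
For example, from one of the yarn nodes you could try something like the
following, using the address and port from the intpEventServerAddress line in
your log:

    nc -vz 172.17.0.1 45128

One thing worth noting: 172.17.0.1 is typically a Docker bridge address,
which hosts outside that machine usually cannot reach, so that alone could
explain the connection refused.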

Y. Ethan Guo <gu...@uber.com> wrote on Sun, Apr 7, 2019 at 3:31 PM:


-- 
Best Regards

Jeff Zhang