Posted to users@zeppelin.apache.org by Yan Yang <ya...@deserve.com> on 2019/08/01 16:58:02 UTC

UnknownHostException on submission Spark jobs to an AWS EMR cluster

We have been trying to submit Spark jobs to an AWS EMR cluster from our own
Zeppelin instance.

The YARN job was received and started properly, but then ran into the error
below. The host named in the error is an odd 12-character alphanumeric string
that does not look like a hostname or an IP address.

*Caused by: java.net.UnknownHostException: 77c5b7197972*

We have been able to use spark-submit from the same location where Zeppelin
is installed to submit the same job to the same EMR cluster successfully.

We have tried both the client and cluster deploy modes in Zeppelin, and have
also toggled the configuration *zeppelin.spark.useNew*. However, we are still
hitting the above error.

Has anyone encountered this before? Specifically, is that odd host name
generated by Zeppelin's Spark submission logic?

Thanks
Yan

Re: UnknownHostException on submission Spark jobs to an AWS EMR cluster

Posted by Jeff Zhang <zj...@gmail.com>.
It depends on which mode you use. If you use yarn-client mode, the driver
runs on the Zeppelin host, and it needs to connect to the executors, which
run in EMR.
If you use yarn-cluster mode, the driver runs in EMR, and it needs to
connect to the Zeppelin server, which is outside of EMR.
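
In either mode, the side that initiates the connection has to be able to
resolve the host name that the other side advertises. The following is a
minimal sketch, not part of the original reply, for checking name resolution
from a given machine (for example an EMR node or the Zeppelin host). It
assumes Python 3 is available there; the helper name can_resolve is
hypothetical.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if this machine can resolve `hostname` to an address."""
    try:
        # getaddrinfo uses the system resolver (hosts file, DNS, ...).
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Resolution failed, the rough analogue of Java's
        # java.net.UnknownHostException.
        return False
```

Calling can_resolve("77c5b7197972") on the machine that reported the error
should return False if that machine really cannot resolve the name from the
stack trace.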


On Sat, Aug 3, 2019 at 3:34 AM, Yan Yang <ya...@deserve.com> wrote:

> Jeff
>
> When we run the Spark interpreter against a remote cluster, does the
> interpreter process run locally or on the Spark cluster? Which port do we
> need to open on the zeppelin-server for the interpreter?
>
> Thanks a lot for the help.
>
> Yan
>


-- 
Best Regards

Jeff Zhang

Re: UnknownHostException on submission Spark jobs to an AWS EMR cluster

Posted by Yan Yang <ya...@deserve.com>.
Jeff

When we run the Spark interpreter against a remote cluster, does the
interpreter process run locally or on the Spark cluster? Which port do we
need to open on the zeppelin-server for the interpreter?

Thanks a lot for the help.

Yan

Re: UnknownHostException on submission Spark jobs to an AWS EMR cluster

Posted by Jeff Zhang <zj...@gmail.com>.
Do you see the error in the YARN AM log? I suspect it is due to a network
issue, because Zeppelin needs bidirectional communication between
zeppelin-server and the interpreter process. Is your EMR cluster able to
access your Zeppelin server host?
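
One way to test this kind of reachability is a plain TCP connection attempt
from each side to the other. The following is a minimal sketch, not part of
the original reply, assuming Python 3 is available on the machine being
tested. The helper name can_connect is hypothetical, and the host and port
to check are placeholders: the thread does not say which port the
interpreter uses.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and tries to connect,
        # giving up after `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers resolution failures, refused connections, and timeouts.
        return False
```

Running this on an EMR node against the Zeppelin server host, and on the
Zeppelin host against an EMR node, would show whether traffic can flow in
both directions.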


On Fri, Aug 2, 2019 at 12:58 AM, Yan Yang <ya...@deserve.com> wrote:

> We have been trying to submit Spark jobs to an AWS EMR cluster from our
> own Zeppelin instance.
>
> The YARN job was received and started properly, but then ran into the
> error below. The host named in the error is an odd 12-character
> alphanumeric string that does not look like a hostname or an IP address.
>
> *Caused by: java.net.UnknownHostException: 77c5b7197972*
>
> We have been able to use spark-submit from the same location where
> Zeppelin is installed to submit the same job to the same EMR cluster
> successfully.
>
> We have tried both the client and cluster deploy modes in Zeppelin, and
> have also toggled the configuration *zeppelin.spark.useNew*. However, we
> are still hitting the above error.
>
> Has anyone encountered this before? Specifically, is that odd host name
> generated by Zeppelin's Spark submission logic?
>
> Thanks
> Yan
>
>

-- 
Best Regards

Jeff Zhang