Posted to dev@spark.apache.org by Yang Lei <ge...@gmail.com> on 2016/04/12 23:05:29 UTC

Spark on Mesos 0.28 issue

I have been able to run Spark submissions in a Docker container (HOST network) through Marathon on Mesos, targeting the Mesos cluster (zk address), with at least Spark 1.6 and 1.5.2 over Mesos 0.26 and 0.27.

I do need to define SPARK_PUBLIC_DNS and SPARK_LOCAL_IP so that the Spark driver can announce the right IP address.

However, on Mesos 0.28, the Spark framework fails with "Failed to shutdown socket with fd 54: Transport endpoint is not connected". Eventually, I worked around the problem by additionally defining LIBPROCESS_IP.
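
For illustration only, here is a minimal Python sketch of the workaround described above: it builds the environment for the driver process with all three address variables pointing at the host's routable IP. The helper name build_driver_env and the example address 192.0.2.10 are hypothetical and are not taken from the original setup; the assumption is that the same IP is appropriate for all three variables.

```python
import os


def build_driver_env(host_ip, public_dns=None, base_env=None):
    """Return a copy of the environment with the driver address variables set.

    host_ip is assumed to be the address that the Mesos master and agents
    can reach; it is not detected automatically in this sketch.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Variables already needed on Mesos 0.26 and 0.27, per the message above.
    env["SPARK_LOCAL_IP"] = host_ip
    env["SPARK_PUBLIC_DNS"] = public_dns or host_ip
    # Additional variable that worked around the failure on Mesos 0.28.
    env["LIBPROCESS_IP"] = host_ip
    return env


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address used as a placeholder.
    env = build_driver_env("192.0.2.10")
    for name in ("SPARK_LOCAL_IP", "SPARK_PUBLIC_DNS", "LIBPROCESS_IP"):
        print(name, "=", env[name])
```

The returned dictionary could then be passed as the env argument of subprocess.run when launching spark-submit, or the same three variables could be set in the Marathon app definition instead.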

Please let me know whether this behavior is expected. If it is, it would be good to document the requirement on the Spark Mesos cluster website.

Thank you.

Yang.

Re: Spark on Mesos 0.28 issue

Posted by Yang Lei <ge...@gmail.com>.
I looked at the JIRA. I do not think it is related: even without using a Docker image for the Spark task, the framework still fails. After my workaround, both scenarios, with and without Docker, worked.

About the logs, the only thing that caught my eye is the line I pasted. It is from the Mesos master log. The slave log does not have any error messages...

Yang

> On Apr 13, 2016, at 7:03 AM, Adrian Bridgett <ad...@opensignal.com> wrote:
> 
> I think you may be hitting https://issues.apache.org/jira/browse/MESOS-4878, which was fixed in Mesos 0.28.1.
> 

Re: Spark on Mesos 0.28 issue

Posted by Adrian Bridgett <ad...@opensignal.com>.
I think you may be hitting https://issues.apache.org/jira/browse/MESOS-4878, which was fixed in Mesos 0.28.1.



Re: Spark on Mesos 0.28 issue

Posted by Timothy Chen <tn...@gmail.com>.
Hi Yang,

Can you share the master log/slave log?

Tim

