Posted to user@spark.apache.org by jegordon <jg...@gmail.com> on 2015/07/09 00:19:54 UTC

Remote spark-submit not working with YARN

I'm trying to submit a Spark job from a server outside of my Spark
cluster (running Spark 1.4.0, Hadoop 2.4.0 and YARN) using the spark-submit
script:

spark/bin/spark-submit --master yarn-client --executor-memory 4G myjobScript.py

The thing is that my application never gets past the ACCEPTED state; it
stays stuck there:

15/07/08 16:49:40 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)
15/07/08 16:49:41 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)
15/07/08 16:49:42 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)
15/07/08 16:49:43 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)
15/07/08 16:49:44 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)
15/07/08 16:49:45 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)
15/07/08 16:49:46 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)
15/07/08 16:49:47 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)
15/07/08 16:49:48 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)
15/07/08 16:49:49 INFO Client: Application report for application_1436314873375_0030 (state: ACCEPTED)

But if I execute the same script with spark-submit on the master server of
my cluster, it runs correctly.

I already set the YARN configuration on the remote server in
$YARN_CONF_DIR/yarn-site.xml like this:

 <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>54.54.54.54</value>
 </property>

 <property>
   <name>yarn.resourcemanager.address</name>
   <value>54.54.54.54:8032</value>
   <description>Enter your ResourceManager hostname.</description>
 </property>

 <property>
   <name>yarn.resourcemanager.scheduler.address</name>
   <value>54.54.54.54:8030</value>
   <description>Enter your ResourceManager hostname.</description>
 </property>

 <property>
   <name>yarn.resourcemanager.resourcetracker.address</name>
   <value>54.54.54.54:8031</value>
   <description>Enter your ResourceManager hostname.</description>
 </property>

Where 54.54.54.54 is the IP of my ResourceManager node.
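
One quick sanity check from the remote server is to read the ResourceManager
endpoints out of yarn-site.xml and test whether each one accepts a TCP
connection. The following is a minimal sketch, not part of the original
message: the helper names (rm_endpoints, can_connect, check_endpoints) are
hypothetical, and it assumes the file wraps its properties in the usual
<configuration> root element. It only tests the direction from the client to
the ResourceManager; it says nothing about whether the cluster can connect
back to the client.

```python
import socket
import xml.etree.ElementTree as ET


def rm_endpoints(yarn_site_xml):
    """Return {property name: (host, port)} for host:port ResourceManager properties."""
    root = ET.fromstring(yarn_site_xml)
    endpoints = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        # Keep only yarn.resourcemanager.* values of the form host:port.
        if name.startswith("yarn.resourcemanager.") and ":" in value:
            host, port = value.rsplit(":", 1)
            if port.isdigit():
                endpoints[name] = (host, int(port))
    return endpoints


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_endpoints(endpoints, probe=can_connect):
    """Probe every endpoint and return {property name: reachable?}."""
    return {name: probe(host, port) for name, (host, port) in endpoints.items()}
```

To use it, pass the contents of $YARN_CONF_DIR/yarn-site.xml to rm_endpoints
and the result to check_endpoints. The probe argument exists only so the
network call can be swapped out when testing.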

Why is this happening? Do I have to configure something else in YARN to
accept remote submissions? Or what am I missing?

Thanks a lot

JG




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Remote-spark-submit-not-working-with-YARN-tp23728.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Remote spark-submit not working with YARN

Posted by Juan Gordon <jg...@gmail.com>.
Hi,

I checked the logs and it looks like YARN is trying to communicate with my
"test server" through its local IP (the Spark cluster and my "test server" are
in different VPCs in Amazon EC2), and that's why YARN can't respond.

I tried the same script in yarn-cluster mode and it runs correctly that
way.

So I think my issue is solved by using yarn-cluster.
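
The difference between the two modes explains this result: in yarn-client
mode the driver runs inside the spark-submit process on the submitting
machine, so the ApplicationMaster and executors on the cluster have to open
connections back to that machine, whereas in yarn-cluster mode the driver runs
on the cluster inside the ApplicationMaster. As a rough sketch (the helper
names build_submit_command and submit are hypothetical; the path and options
are the ones from the original command), the two invocations differ only in
the --master value:

```python
import subprocess


def build_submit_command(master, script="myjobScript.py", executor_memory="4G"):
    """Build the spark-submit argument list used in this thread for a given master."""
    if master not in ("yarn-client", "yarn-cluster"):
        raise ValueError("expected 'yarn-client' or 'yarn-cluster', got %r" % master)
    return [
        "spark/bin/spark-submit",
        "--master", master,
        "--executor-memory", executor_memory,
        script,
    ]


def submit(master):
    """Run spark-submit and return its exit code (needs a Spark install and YARN config)."""
    return subprocess.call(build_submit_command(master))
```

Calling submit("yarn-cluster") corresponds to the invocation that worked here.
One consequence of that mode is that the driver's output ends up in the YARN
container logs rather than on the local console.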

Thanks a lot,

JG

On Wed, Jul 8, 2015 at 7:24 PM, canan chen <cc...@gmail.com> wrote:

> The application is accepted by the YARN RM. As Sandy mentioned, please
> look at the RM log; there must be some useful info there.
>


-- 
Saludos,
Juan Gordon

Re: Remote spark-submit not working with YARN

Posted by Sandy Ryza <sa...@cloudera.com>.
Strange.  Does the application show up at all in the YARN web UI?  Does
application_1436314873375_0030 show up at all in the YARN ResourceManager
logs?

-Sandy
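
Besides the web UI, the ResourceManager exposes the same application
information through its REST API, which can be handy when checking from a
remote machine. A minimal sketch, assuming the ResourceManager web application
listens on the default port 8088 (the thread does not say) and using
hypothetical helper names:

```python
import json
from urllib.request import Request, urlopen


def app_url(rm_host, app_id, port=8088):
    """URL of the ResourceManager REST resource describing one application."""
    return "http://%s:%d/ws/v1/cluster/apps/%s" % (rm_host, port, app_id)


def summarize(report_json):
    """Pick the fields most useful for a stuck application out of the JSON reply."""
    app = json.loads(report_json)["app"]
    return {key: app.get(key) for key in ("state", "finalStatus", "diagnostics")}


def fetch_summary(rm_host, app_id, port=8088, timeout=5.0):
    """Fetch and summarize the application report (needs network access to the RM)."""
    request = Request(app_url(rm_host, app_id, port),
                      headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as reply:
        return summarize(reply.read().decode("utf-8"))
```

For the application in this thread the call would be
fetch_summary("54.54.54.54", "application_1436314873375_0030"); the
diagnostics field is where YARN usually records why an attempt failed.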

On Wed, Jul 8, 2015 at 3:32 PM, Juan Gordon <jg...@gmail.com> wrote:

> Hello Sandy,
>
> Yes, I'm sure that YARN has enough resources; I checked it in the web
> UI page of my cluster.
>
> Also, I'm able to submit the same script from any of the nodes of the
> cluster.
>
> That's why I don't understand what's happening.
>
> Thanks
>
> JG
>
> --
> Saludos,
> Juan Gordon
>

Re: Remote spark-submit not working with YARN

Posted by Sandy Ryza <sa...@cloudera.com>.
Hi JG,

One way this can occur is that YARN doesn't have enough resources to run
your job.  Have you verified that it does?  Are you able to submit using
the same command from a node on the cluster?

-Sandy
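
To check the resource question without opening the web UI, the
ResourceManager's cluster metrics REST resource reports how much memory is
currently unallocated. A minimal sketch, assuming the default web port 8088
and using hypothetical helper names; the 4096 MB default mirrors
--executor-memory 4G and ignores the extra overhead each YARN container needs:

```python
import json
from urllib.request import Request, urlopen


def metrics_url(rm_host, port=8088):
    """URL of the ResourceManager REST resource with cluster-wide metrics."""
    return "http://%s:%d/ws/v1/cluster/metrics" % (rm_host, port)


def available_mb(metrics_json):
    """Memory (in MB) that the scheduler reports as not yet allocated."""
    return json.loads(metrics_json)["clusterMetrics"]["availableMB"]


def has_room(metrics_json, needed_mb=4096):
    """True if the cluster-wide free memory is at least needed_mb."""
    return available_mb(metrics_json) >= needed_mb


def fetch_metrics(rm_host, port=8088, timeout=5.0):
    """Download the raw metrics JSON (needs network access to the RM)."""
    request = Request(metrics_url(rm_host, port),
                      headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as reply:
        return reply.read().decode("utf-8")
```

The figure is a cluster-wide total, so a True result is only a first
indication: each container still has to fit on a single NodeManager.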
