Posted to user@spark.apache.org by Tobias Pfeiffer <tg...@preferred.jp> on 2014/12/04 05:10:39 UTC

spark-submit on YARN is slow

Hi,

I am using spark-submit to submit my application to YARN in "yarn-cluster"
mode. I have put both the Spark assembly jar file and my application jar
file in HDFS, and I can see from the logging output that both files are
used from there. However, it still takes about 10 seconds for my
application's yarnAppState to switch from ACCEPTED to RUNNING.

I am aware that this is probably not a Spark issue but rather some YARN
configuration setting (or YARN-inherent slowness); I was just wondering
whether anyone has any advice on how to speed this up.

Thanks
Tobias

Re: spark-submit on YARN is slow

Posted by Tobias Pfeiffer <tg...@preferred.jp>.
Hi,

On Tue, Dec 9, 2014 at 4:39 AM, Sandy Ryza <sa...@cloudera.com> wrote:
>
> Can you try using the YARN Fair Scheduler and set
> yarn.scheduler.fair.continuous-scheduling-enabled to true?
>

I'm using Cloudera 5.2.0 and my configuration already has
  yarn.resourcemanager.scheduler.class =
    org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
  yarn.scheduler.fair.continuous-scheduling-enabled = true
by default. Changing to a different scheduler doesn't really change
anything; the transition from ACCEPTED to RUNNING always takes about
10 seconds.

Thanks
Tobias

Re: spark-submit on YARN is slow

Posted by Sandy Ryza <sa...@cloudera.com>.
Hey Tobias,

Can you try using the YARN Fair Scheduler and set
yarn.scheduler.fair.continuous-scheduling-enabled to true?
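(Those are YARN-side settings; roughly speaking they go into yarn-site.xml
on the ResourceManager, shown here as name = value rather than the actual
XML <property> entries, and on CDH they are typically managed through
Cloudera Manager:

  yarn.resourcemanager.scheduler.class =
    org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
  yarn.scheduler.fair.continuous-scheduling-enabled = true
)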

-Sandy

On Sun, Dec 7, 2014 at 5:39 PM, Tobias Pfeiffer <tg...@preferred.jp> wrote:

> Hi,
>
> thanks for your responses!
>
> On Sat, Dec 6, 2014 at 4:22 AM, Sandy Ryza <sa...@cloudera.com>
> wrote:
>>
>> What version are you using?  In some recent versions, we had a couple of
>> large hardcoded sleeps on the Spark side.
>>
>
> I am using Spark 1.1.1.
>
> As Andrew mentioned, I guess most of the 10 seconds waiting time probably
> comes from YARN itself. (Other YARN applications also take a while to start
> up.) I'm just really puzzled about what exactly takes so long there... for
> a job that runs an hour or so, that is of course negligible, but I am
> starting up an instance per client to do interactive job processing *for
> this client*, and it feels like "yeah, thanks for logging in, now please
> wait a while until you can actually use the program", that's a bit
> suboptimal.
>
> Tobias
>
>
>

Re: spark-submit on YARN is slow

Posted by Tobias Pfeiffer <tg...@preferred.jp>.
Hi,

thanks for your responses!

On Sat, Dec 6, 2014 at 4:22 AM, Sandy Ryza <sa...@cloudera.com> wrote:
>
> What version are you using?  In some recent versions, we had a couple of
> large hardcoded sleeps on the Spark side.
>

I am using Spark 1.1.1.

As Andrew mentioned, I guess most of the 10-second wait probably comes
from YARN itself. (Other YARN applications also take a while to start
up.) I'm just really puzzled about what exactly takes so long there. For
a job that runs an hour or so, that is of course negligible, but I am
starting up an instance per client to do interactive job processing *for
this client*, and it feels like "yeah, thanks for logging in, now please
wait a while until you can actually use the program", which is a bit
suboptimal.

Tobias

Re: spark-submit on YARN is slow

Posted by Sameer Farooqui <sa...@databricks.com>.
Just an FYI - I can submit the SparkPi app to YARN in cluster mode on a
1-node m3.xlarge EC2 instance and the app finishes running successfully
in about 40 seconds. I just figured the 30-40 sec run time was normal
because of the submission overhead that Andrew mentioned.

Denny, you could also try running SparkPi against YARN as a speed check.

spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --deploy-mode cluster \
  --master yarn \
  /opt/cloudera/parcels/CDH-5.2.1-1.cdh5.2.1.p0.12/jars/spark-examples-1.1.0-cdh5.2.1-hadoop2.5.0-cdh5.2.1.jar \
  10

On Fri, Dec 5, 2014 at 2:32 PM, Denny Lee <de...@gmail.com> wrote:

> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
> steps. If I was running this on standalone cluster mode the query finished
> in 55s but on YARN, the query was still running 30min later. Would the hard
> coded sleeps potentially be in play here?
> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com> wrote:
>
>> Hi Tobias,
>>
>> What version are you using?  In some recent versions, we had a couple of
>> large hardcoded sleeps on the Spark side.
>>
>> -Sandy
>>
>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com> wrote:
>>
>>> Hey Tobias,
>>>
>>> As you suspect, the reason why it's slow is because the resource manager
>>> in YARN takes a while to grant resources. This is because YARN needs to
>>> first set up the application master container, and then this AM needs to
>>> request more containers for Spark executors. I think this accounts for most
>>> of the overhead. The remaining source probably comes from how our own YARN
>>> integration code polls application (every second) and cluster resource
>>> states (every 5 seconds IIRC). I haven't explored in detail whether there
>>> are optimizations there that can speed this up, but I believe most of the
>>> overhead comes from YARN itself.
>>>
>>> In other words, no I don't know of any quick fix on your end that you
>>> can do to speed this up.
>>>
>>> -Andrew
>>>
>>>
>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>>
>>> Hi,
>>>>
>>>> I am using spark-submit to submit my application to YARN in
>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>>> application jar file put in HDFS and can see from the logging output that
>>>> both files are used from there. However, it still takes about 10 seconds
>>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>>
>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>>> anyone has an advice for how to speed this up.
>>>>
>>>> Thanks
>>>> Tobias
>>>>
>>>
>>>
>>

Re: spark-submit on YARN is slow

Posted by Sandy Ryza <sa...@cloudera.com>.
Great to hear!

-Sandy

On Fri, Dec 5, 2014 at 11:17 PM, Denny Lee <de...@gmail.com> wrote:

> Okay, my bad for not testing out the documented arguments - once i use the
> correct ones, the query shrinks completes in ~55s (I can probably make it
> faster).   Thanks for the help, eh?!
>
>
>
> On Fri Dec 05 2014 at 10:34:50 PM Denny Lee <de...@gmail.com> wrote:
>
>> Sorry for the delay in my response - for my spark calls for stand-alone
>> and YARN, I am using the --executor-memory and --total-executor-cores for
>> the submission.  In standalone, my baseline query completes in ~40s while
>> in YARN, it completes in ~1800s.  It does not appear from the RM web UI
>> that its asking for more resources than available but by the same token, it
>> appears that its only using a small amount of cores and available memory.
>>
>> Saying this, let me re-try using the --executor-cores,
>> --executor-memory, and --num-executors arguments as suggested (and
>> documented) vs. the --total-executor-cores
>>
>>
>> On Fri Dec 05 2014 at 1:14:53 PM Andrew Or <an...@databricks.com> wrote:
>>
>>> Hey Arun I've seen that behavior before. It happens when the cluster
>>> doesn't have enough resources to offer and the RM hasn't given us our
>>> containers yet. Can you check the RM Web UI at port 8088 to see whether
>>> your application is requesting more resources than the cluster has to offer?
>>>
>>> 2014-12-05 12:51 GMT-08:00 Sandy Ryza <sa...@cloudera.com>:
>>>
>>> Hey Arun,
>>>>
>>>> The sleeps would only cause maximum like 5 second overhead.  The idea
>>>> was to give executors some time to register.  On more recent versions, they
>>>> were replaced with the spark.scheduler.minRegisteredResourcesRatio and
>>>> spark.scheduler.maxRegisteredResourcesWaitingTime.  As of 1.1, by
>>>> default YARN will wait until either 30 seconds have passed or 80% of the
>>>> requested executors have registered.
>>>>
>>>> -Sandy
>>>>
>>>> On Fri, Dec 5, 2014 at 12:46 PM, Ashish Rangole <ar...@gmail.com>
>>>> wrote:
>>>>
>>>>> Likely this not the case here yet one thing to point out with Yarn
>>>>> parameters like --num-executors is that they should be specified *before*
>>>>> app jar and app args on spark-submit command line otherwise the app only
>>>>> gets the default number of containers which is 2.
>>>>> On Dec 5, 2014 12:22 PM, "Sandy Ryza" <sa...@cloudera.com> wrote:
>>>>>
>>>>>> Hi Denny,
>>>>>>
>>>>>> Those sleeps were only at startup, so if jobs are taking
>>>>>> significantly longer on YARN, that should be a different problem.  When you
>>>>>> ran on YARN, did you use the --executor-cores, --executor-memory, and
>>>>>> --num-executors arguments?  When running against a standalone cluster, by
>>>>>> default Spark will make use of all the cluster resources, but when running
>>>>>> against YARN, Spark defaults to a couple tiny executors.
>>>>>>
>>>>>> -Sandy
>>>>>>
>>>>>> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <de...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
>>>>>>> steps. If I was running this on standalone cluster mode the query finished
>>>>>>> in 55s but on YARN, the query was still running 30min later. Would the hard
>>>>>>> coded sleeps potentially be in play here?
>>>>>>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Tobias,
>>>>>>>>
>>>>>>>> What version are you using?  In some recent versions, we had a
>>>>>>>> couple of large hardcoded sleeps on the Spark side.
>>>>>>>>
>>>>>>>> -Sandy
>>>>>>>>
>>>>>>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hey Tobias,
>>>>>>>>>
>>>>>>>>> As you suspect, the reason why it's slow is because the resource
>>>>>>>>> manager in YARN takes a while to grant resources. This is because YARN
>>>>>>>>> needs to first set up the application master container, and then this AM
>>>>>>>>> needs to request more containers for Spark executors. I think this accounts
>>>>>>>>> for most of the overhead. The remaining source probably comes from how our
>>>>>>>>> own YARN integration code polls application (every second) and cluster
>>>>>>>>> resource states (every 5 seconds IIRC). I haven't explored in detail
>>>>>>>>> whether there are optimizations there that can speed this up, but I believe
>>>>>>>>> most of the overhead comes from YARN itself.
>>>>>>>>>
>>>>>>>>> In other words, no I don't know of any quick fix on your end that
>>>>>>>>> you can do to speed this up.
>>>>>>>>>
>>>>>>>>> -Andrew
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am using spark-submit to submit my application to YARN in
>>>>>>>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>>>>>>>>> application jar file put in HDFS and can see from the logging output that
>>>>>>>>>> both files are used from there. However, it still takes about 10 seconds
>>>>>>>>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>>>>>>>>
>>>>>>>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>>>>>>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>>>>>>>>> anyone has an advice for how to speed this up.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Tobias
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>

Re: spark-submit on YARN is slow

Posted by Denny Lee <de...@gmail.com>.
Okay, my bad for not testing out the documented arguments - once I use
the correct ones, the query completes in ~55s (I can probably make it
faster). Thanks for the help, eh?!


On Fri Dec 05 2014 at 10:34:50 PM Denny Lee <de...@gmail.com> wrote:

> Sorry for the delay in my response - for my spark calls for stand-alone
> and YARN, I am using the --executor-memory and --total-executor-cores for
> the submission.  In standalone, my baseline query completes in ~40s while
> in YARN, it completes in ~1800s.  It does not appear from the RM web UI
> that its asking for more resources than available but by the same token, it
> appears that its only using a small amount of cores and available memory.
>
> Saying this, let me re-try using the --executor-cores, --executor-memory,
> and --num-executors arguments as suggested (and documented) vs. the
> --total-executor-cores
>
>
> On Fri Dec 05 2014 at 1:14:53 PM Andrew Or <an...@databricks.com> wrote:
>
>> Hey Arun I've seen that behavior before. It happens when the cluster
>> doesn't have enough resources to offer and the RM hasn't given us our
>> containers yet. Can you check the RM Web UI at port 8088 to see whether
>> your application is requesting more resources than the cluster has to offer?
>>
>> 2014-12-05 12:51 GMT-08:00 Sandy Ryza <sa...@cloudera.com>:
>>
>> Hey Arun,
>>>
>>> The sleeps would only cause maximum like 5 second overhead.  The idea
>>> was to give executors some time to register.  On more recent versions, they
>>> were replaced with the spark.scheduler.minRegisteredResourcesRatio and
>>> spark.scheduler.maxRegisteredResourcesWaitingTime.  As of 1.1, by
>>> default YARN will wait until either 30 seconds have passed or 80% of the
>>> requested executors have registered.
>>>
>>> -Sandy
>>>
>>> On Fri, Dec 5, 2014 at 12:46 PM, Ashish Rangole <ar...@gmail.com>
>>> wrote:
>>>
>>>> Likely this not the case here yet one thing to point out with Yarn
>>>> parameters like --num-executors is that they should be specified *before*
>>>> app jar and app args on spark-submit command line otherwise the app only
>>>> gets the default number of containers which is 2.
>>>> On Dec 5, 2014 12:22 PM, "Sandy Ryza" <sa...@cloudera.com> wrote:
>>>>
>>>>> Hi Denny,
>>>>>
>>>>> Those sleeps were only at startup, so if jobs are taking significantly
>>>>> longer on YARN, that should be a different problem.  When you ran on YARN,
>>>>> did you use the --executor-cores, --executor-memory, and --num-executors
>>>>> arguments?  When running against a standalone cluster, by default Spark
>>>>> will make use of all the cluster resources, but when running against YARN,
>>>>> Spark defaults to a couple tiny executors.
>>>>>
>>>>> -Sandy
>>>>>
>>>>> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <de...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
>>>>>> steps. If I was running this on standalone cluster mode the query finished
>>>>>> in 55s but on YARN, the query was still running 30min later. Would the hard
>>>>>> coded sleeps potentially be in play here?
>>>>>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Tobias,
>>>>>>>
>>>>>>> What version are you using?  In some recent versions, we had a
>>>>>>> couple of large hardcoded sleeps on the Spark side.
>>>>>>>
>>>>>>> -Sandy
>>>>>>>
>>>>>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey Tobias,
>>>>>>>>
>>>>>>>> As you suspect, the reason why it's slow is because the resource
>>>>>>>> manager in YARN takes a while to grant resources. This is because YARN
>>>>>>>> needs to first set up the application master container, and then this AM
>>>>>>>> needs to request more containers for Spark executors. I think this accounts
>>>>>>>> for most of the overhead. The remaining source probably comes from how our
>>>>>>>> own YARN integration code polls application (every second) and cluster
>>>>>>>> resource states (every 5 seconds IIRC). I haven't explored in detail
>>>>>>>> whether there are optimizations there that can speed this up, but I believe
>>>>>>>> most of the overhead comes from YARN itself.
>>>>>>>>
>>>>>>>> In other words, no I don't know of any quick fix on your end that
>>>>>>>> you can do to speed this up.
>>>>>>>>
>>>>>>>> -Andrew
>>>>>>>>
>>>>>>>>
>>>>>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am using spark-submit to submit my application to YARN in
>>>>>>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>>>>>>>> application jar file put in HDFS and can see from the logging output that
>>>>>>>>> both files are used from there. However, it still takes about 10 seconds
>>>>>>>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>>>>>>>
>>>>>>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>>>>>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>>>>>>>> anyone has an advice for how to speed this up.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Tobias
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>

Re: spark-submit on YARN is slow

Posted by Denny Lee <de...@gmail.com>.
Sorry for the delay in my response - for my Spark submissions to both
standalone and YARN, I am using --executor-memory and
--total-executor-cores. In standalone mode my baseline query completes in
~40s, while on YARN it completes in ~1800s. From the RM web UI it does
not appear that the app is asking for more resources than are available,
but by the same token it appears to be using only a small number of cores
and a small amount of the available memory.

Saying this, let me re-try using the --executor-cores, --executor-memory,
and --num-executors arguments as suggested (and documented) instead of
--total-executor-cores.


On Fri Dec 05 2014 at 1:14:53 PM Andrew Or <an...@databricks.com> wrote:

> Hey Arun I've seen that behavior before. It happens when the cluster
> doesn't have enough resources to offer and the RM hasn't given us our
> containers yet. Can you check the RM Web UI at port 8088 to see whether
> your application is requesting more resources than the cluster has to offer?
>
> 2014-12-05 12:51 GMT-08:00 Sandy Ryza <sa...@cloudera.com>:
>
> Hey Arun,
>>
>> The sleeps would only cause maximum like 5 second overhead.  The idea was
>> to give executors some time to register.  On more recent versions, they
>> were replaced with the spark.scheduler.minRegisteredResourcesRatio and
>> spark.scheduler.maxRegisteredResourcesWaitingTime.  As of 1.1, by default
>> YARN will wait until either 30 seconds have passed or 80% of the requested
>> executors have registered.
>>
>> -Sandy
>>
>> On Fri, Dec 5, 2014 at 12:46 PM, Ashish Rangole <ar...@gmail.com>
>> wrote:
>>
>>> Likely this not the case here yet one thing to point out with Yarn
>>> parameters like --num-executors is that they should be specified *before*
>>> app jar and app args on spark-submit command line otherwise the app only
>>> gets the default number of containers which is 2.
>>> On Dec 5, 2014 12:22 PM, "Sandy Ryza" <sa...@cloudera.com> wrote:
>>>
>>>> Hi Denny,
>>>>
>>>> Those sleeps were only at startup, so if jobs are taking significantly
>>>> longer on YARN, that should be a different problem.  When you ran on YARN,
>>>> did you use the --executor-cores, --executor-memory, and --num-executors
>>>> arguments?  When running against a standalone cluster, by default Spark
>>>> will make use of all the cluster resources, but when running against YARN,
>>>> Spark defaults to a couple tiny executors.
>>>>
>>>> -Sandy
>>>>
>>>> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <de...@gmail.com>
>>>> wrote:
>>>>
>>>>> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
>>>>> steps. If I was running this on standalone cluster mode the query finished
>>>>> in 55s but on YARN, the query was still running 30min later. Would the hard
>>>>> coded sleeps potentially be in play here?
>>>>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Tobias,
>>>>>>
>>>>>> What version are you using?  In some recent versions, we had a couple
>>>>>> of large hardcoded sleeps on the Spark side.
>>>>>>
>>>>>> -Sandy
>>>>>>
>>>>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hey Tobias,
>>>>>>>
>>>>>>> As you suspect, the reason why it's slow is because the resource
>>>>>>> manager in YARN takes a while to grant resources. This is because YARN
>>>>>>> needs to first set up the application master container, and then this AM
>>>>>>> needs to request more containers for Spark executors. I think this accounts
>>>>>>> for most of the overhead. The remaining source probably comes from how our
>>>>>>> own YARN integration code polls application (every second) and cluster
>>>>>>> resource states (every 5 seconds IIRC). I haven't explored in detail
>>>>>>> whether there are optimizations there that can speed this up, but I believe
>>>>>>> most of the overhead comes from YARN itself.
>>>>>>>
>>>>>>> In other words, no I don't know of any quick fix on your end that
>>>>>>> you can do to speed this up.
>>>>>>>
>>>>>>> -Andrew
>>>>>>>
>>>>>>>
>>>>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>>>>>>
>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am using spark-submit to submit my application to YARN in
>>>>>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>>>>>>> application jar file put in HDFS and can see from the logging output that
>>>>>>>> both files are used from there. However, it still takes about 10 seconds
>>>>>>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>>>>>>
>>>>>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>>>>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>>>>>>> anyone has an advice for how to speed this up.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Tobias
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>

Re: spark-submit on YARN is slow

Posted by Andrew Or <an...@databricks.com>.
Hey Arun, I've seen that behavior before. It happens when the cluster
doesn't have enough resources to offer and the RM hasn't given us our
containers yet. Can you check the RM Web UI at port 8088 to see whether
your application is requesting more resources than the cluster has to offer?
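
If it's easier to script against, the RM also exposes the same
information over its REST API - something like the following (with
<rm-host> replaced by your ResourceManager host) should show pending
apps and overall cluster capacity:

  curl "http://<rm-host>:8088/ws/v1/cluster/apps?states=ACCEPTED"
  curl "http://<rm-host>:8088/ws/v1/cluster/metrics"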

2014-12-05 12:51 GMT-08:00 Sandy Ryza <sa...@cloudera.com>:

> Hey Arun,
>
> The sleeps would only cause maximum like 5 second overhead.  The idea was
> to give executors some time to register.  On more recent versions, they
> were replaced with the spark.scheduler.minRegisteredResourcesRatio and
> spark.scheduler.maxRegisteredResourcesWaitingTime.  As of 1.1, by default
> YARN will wait until either 30 seconds have passed or 80% of the requested
> executors have registered.
>
> -Sandy
>
> On Fri, Dec 5, 2014 at 12:46 PM, Ashish Rangole <ar...@gmail.com>
> wrote:
>
>> Likely this not the case here yet one thing to point out with Yarn
>> parameters like --num-executors is that they should be specified *before*
>> app jar and app args on spark-submit command line otherwise the app only
>> gets the default number of containers which is 2.
>> On Dec 5, 2014 12:22 PM, "Sandy Ryza" <sa...@cloudera.com> wrote:
>>
>>> Hi Denny,
>>>
>>> Those sleeps were only at startup, so if jobs are taking significantly
>>> longer on YARN, that should be a different problem.  When you ran on YARN,
>>> did you use the --executor-cores, --executor-memory, and --num-executors
>>> arguments?  When running against a standalone cluster, by default Spark
>>> will make use of all the cluster resources, but when running against YARN,
>>> Spark defaults to a couple tiny executors.
>>>
>>> -Sandy
>>>
>>> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <de...@gmail.com>
>>> wrote:
>>>
>>>> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
>>>> steps. If I was running this on standalone cluster mode the query finished
>>>> in 55s but on YARN, the query was still running 30min later. Would the hard
>>>> coded sleeps potentially be in play here?
>>>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com>
>>>> wrote:
>>>>
>>>>> Hi Tobias,
>>>>>
>>>>> What version are you using?  In some recent versions, we had a couple
>>>>> of large hardcoded sleeps on the Spark side.
>>>>>
>>>>> -Sandy
>>>>>
>>>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com>
>>>>> wrote:
>>>>>
>>>>>> Hey Tobias,
>>>>>>
>>>>>> As you suspect, the reason why it's slow is because the resource
>>>>>> manager in YARN takes a while to grant resources. This is because YARN
>>>>>> needs to first set up the application master container, and then this AM
>>>>>> needs to request more containers for Spark executors. I think this accounts
>>>>>> for most of the overhead. The remaining source probably comes from how our
>>>>>> own YARN integration code polls application (every second) and cluster
>>>>>> resource states (every 5 seconds IIRC). I haven't explored in detail
>>>>>> whether there are optimizations there that can speed this up, but I believe
>>>>>> most of the overhead comes from YARN itself.
>>>>>>
>>>>>> In other words, no I don't know of any quick fix on your end that you
>>>>>> can do to speed this up.
>>>>>>
>>>>>> -Andrew
>>>>>>
>>>>>>
>>>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>>>>>
>>>>>> Hi,
>>>>>>>
>>>>>>> I am using spark-submit to submit my application to YARN in
>>>>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>>>>>> application jar file put in HDFS and can see from the logging output that
>>>>>>> both files are used from there. However, it still takes about 10 seconds
>>>>>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>>>>>
>>>>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>>>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>>>>>> anyone has an advice for how to speed this up.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Tobias
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>

Re: spark-submit on YARN is slow

Posted by Sandy Ryza <sa...@cloudera.com>.
Hey Arun,

The sleeps would only add at most about 5 seconds of overhead. The idea
was to give executors some time to register. In more recent versions,
they were replaced with the spark.scheduler.minRegisteredResourcesRatio
and spark.scheduler.maxRegisteredResourcesWaitingTime settings. As of
1.1, by default YARN mode will wait until either 30 seconds have passed
or 80% of the requested executors have registered.
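
If you want to experiment with those, they can be passed at submit time
(or put in spark-defaults.conf); a sketch using the 1.1 YARN defaults,
where the wait time is in milliseconds and the class and jar are just
placeholders:

  spark-submit --master yarn --deploy-mode cluster \
    --conf spark.scheduler.minRegisteredResourcesRatio=0.8 \
    --conf spark.scheduler.maxRegisteredResourcesWaitingTime=30000 \
    --class com.example.MyApp my-app.jar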

-Sandy

On Fri, Dec 5, 2014 at 12:46 PM, Ashish Rangole <ar...@gmail.com> wrote:

> Likely this not the case here yet one thing to point out with Yarn
> parameters like --num-executors is that they should be specified *before*
> app jar and app args on spark-submit command line otherwise the app only
> gets the default number of containers which is 2.
> On Dec 5, 2014 12:22 PM, "Sandy Ryza" <sa...@cloudera.com> wrote:
>
>> Hi Denny,
>>
>> Those sleeps were only at startup, so if jobs are taking significantly
>> longer on YARN, that should be a different problem.  When you ran on YARN,
>> did you use the --executor-cores, --executor-memory, and --num-executors
>> arguments?  When running against a standalone cluster, by default Spark
>> will make use of all the cluster resources, but when running against YARN,
>> Spark defaults to a couple tiny executors.
>>
>> -Sandy
>>
>> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <de...@gmail.com> wrote:
>>
>>> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
>>> steps. If I was running this on standalone cluster mode the query finished
>>> in 55s but on YARN, the query was still running 30min later. Would the hard
>>> coded sleeps potentially be in play here?
>>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com> wrote:
>>>
>>>> Hi Tobias,
>>>>
>>>> What version are you using?  In some recent versions, we had a couple
>>>> of large hardcoded sleeps on the Spark side.
>>>>
>>>> -Sandy
>>>>
>>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com>
>>>> wrote:
>>>>
>>>>> Hey Tobias,
>>>>>
>>>>> As you suspect, the reason why it's slow is because the resource
>>>>> manager in YARN takes a while to grant resources. This is because YARN
>>>>> needs to first set up the application master container, and then this AM
>>>>> needs to request more containers for Spark executors. I think this accounts
>>>>> for most of the overhead. The remaining source probably comes from how our
>>>>> own YARN integration code polls application (every second) and cluster
>>>>> resource states (every 5 seconds IIRC). I haven't explored in detail
>>>>> whether there are optimizations there that can speed this up, but I believe
>>>>> most of the overhead comes from YARN itself.
>>>>>
>>>>> In other words, no I don't know of any quick fix on your end that you
>>>>> can do to speed this up.
>>>>>
>>>>> -Andrew
>>>>>
>>>>>
>>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>>>>
>>>>> Hi,
>>>>>>
>>>>>> I am using spark-submit to submit my application to YARN in
>>>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>>>>> application jar file put in HDFS and can see from the logging output that
>>>>>> both files are used from there. However, it still takes about 10 seconds
>>>>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>>>>
>>>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>>>>> anyone has an advice for how to speed this up.
>>>>>>
>>>>>> Thanks
>>>>>> Tobias
>>>>>>
>>>>>
>>>>>
>>>>
>>

Re: spark-submit on YARN is slow

Posted by Ashish Rangole <ar...@gmail.com>.
This is likely not the case here, but one thing to point out with YARN
parameters like --num-executors is that they should be specified *before*
the app jar and app args on the spark-submit command line; otherwise the
app only gets the default number of containers, which is 2.
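So something like this (class, jar, and numbers are just placeholders),
with the resource flags before the jar and the application's own
arguments after it:

  spark-submit --master yarn --deploy-mode cluster \
    --num-executors 10 --executor-memory 4g \
    --class com.example.MyApp my-app.jar appArg1 appArg2

Anything placed after the jar is handed to the application itself rather
than to spark-submit.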
On Dec 5, 2014 12:22 PM, "Sandy Ryza" <sa...@cloudera.com> wrote:

> Hi Denny,
>
> Those sleeps were only at startup, so if jobs are taking significantly
> longer on YARN, that should be a different problem.  When you ran on YARN,
> did you use the --executor-cores, --executor-memory, and --num-executors
> arguments?  When running against a standalone cluster, by default Spark
> will make use of all the cluster resources, but when running against YARN,
> Spark defaults to a couple tiny executors.
>
> -Sandy
>
> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <de...@gmail.com> wrote:
>
>> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
>> steps. If I was running this on standalone cluster mode the query finished
>> in 55s but on YARN, the query was still running 30min later. Would the hard
>> coded sleeps potentially be in play here?
>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com> wrote:
>>
>>> Hi Tobias,
>>>
>>> What version are you using?  In some recent versions, we had a couple of
>>> large hardcoded sleeps on the Spark side.
>>>
>>> -Sandy
>>>
>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com>
>>> wrote:
>>>
>>>> Hey Tobias,
>>>>
>>>> As you suspect, the reason why it's slow is because the resource
>>>> manager in YARN takes a while to grant resources. This is because YARN
>>>> needs to first set up the application master container, and then this AM
>>>> needs to request more containers for Spark executors. I think this accounts
>>>> for most of the overhead. The remaining source probably comes from how our
>>>> own YARN integration code polls application (every second) and cluster
>>>> resource states (every 5 seconds IIRC). I haven't explored in detail
>>>> whether there are optimizations there that can speed this up, but I believe
>>>> most of the overhead comes from YARN itself.
>>>>
>>>> In other words, no I don't know of any quick fix on your end that you
>>>> can do to speed this up.
>>>>
>>>> -Andrew
>>>>
>>>>
>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>>>
>>>> Hi,
>>>>>
>>>>> I am using spark-submit to submit my application to YARN in
>>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>>>> application jar file put in HDFS and can see from the logging output that
>>>>> both files are used from there. However, it still takes about 10 seconds
>>>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>>>
>>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>>>> anyone has an advice for how to speed this up.
>>>>>
>>>>> Thanks
>>>>> Tobias
>>>>>
>>>>
>>>>
>>>
>

Re: spark-submit on YARN is slow

Posted by Arun Ahuja <aa...@gmail.com>.
Hey Sandy,

What are those sleeps for, and do they still exist? We have been seeing
roughly 1 to 1.5 minutes of executor startup time, which is a large chunk
for jobs that run in ~10 minutes.

Thanks,
Arun

On Fri, Dec 5, 2014 at 3:20 PM, Sandy Ryza <sa...@cloudera.com> wrote:

> Hi Denny,
>
> Those sleeps were only at startup, so if jobs are taking significantly
> longer on YARN, that should be a different problem.  When you ran on YARN,
> did you use the --executor-cores, --executor-memory, and --num-executors
> arguments?  When running against a standalone cluster, by default Spark
> will make use of all the cluster resources, but when running against YARN,
> Spark defaults to a couple tiny executors.
>
> -Sandy
>
> On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <de...@gmail.com> wrote:
>
>> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
>> steps. If I was running this on standalone cluster mode the query finished
>> in 55s but on YARN, the query was still running 30min later. Would the hard
>> coded sleeps potentially be in play here?
>> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com> wrote:
>>
>>> Hi Tobias,
>>>
>>> What version are you using?  In some recent versions, we had a couple of
>>> large hardcoded sleeps on the Spark side.
>>>
>>> -Sandy
>>>
>>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com>
>>> wrote:
>>>
>>>> Hey Tobias,
>>>>
>>>> As you suspect, the reason why it's slow is because the resource
>>>> manager in YARN takes a while to grant resources. This is because YARN
>>>> needs to first set up the application master container, and then this AM
>>>> needs to request more containers for Spark executors. I think this accounts
>>>> for most of the overhead. The remaining source probably comes from how our
>>>> own YARN integration code polls application (every second) and cluster
>>>> resource states (every 5 seconds IIRC). I haven't explored in detail
>>>> whether there are optimizations there that can speed this up, but I believe
>>>> most of the overhead comes from YARN itself.
>>>>
>>>> In other words, no I don't know of any quick fix on your end that you
>>>> can do to speed this up.
>>>>
>>>> -Andrew
>>>>
>>>>
>>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>>>
>>>> Hi,
>>>>>
>>>>> I am using spark-submit to submit my application to YARN in
>>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>>>> application jar file put in HDFS and can see from the logging output that
>>>>> both files are used from there. However, it still takes about 10 seconds
>>>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>>>
>>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>>>> anyone has an advice for how to speed this up.
>>>>>
>>>>> Thanks
>>>>> Tobias
>>>>>
>>>>
>>>>
>>>
>

Re: spark-submit on YARN is slow

Posted by Sandy Ryza <sa...@cloudera.com>.
Hi Denny,

Those sleeps were only at startup, so if jobs are taking significantly
longer on YARN, that should be a different problem.  When you ran on YARN,
did you use the --executor-cores, --executor-memory, and --num-executors
arguments?  When running against a standalone cluster, by default Spark
will make use of all the cluster resources, but when running against YARN,
Spark defaults to a couple tiny executors.
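
As a concrete example (the class, jar, and numbers are placeholders -
size them for your cluster), a YARN submission that requests resources
explicitly would look something like:

  spark-submit --master yarn --deploy-mode cluster \
    --num-executors 8 --executor-cores 4 --executor-memory 4g \
    --class com.example.MyApp my-app.jar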

-Sandy

On Fri, Dec 5, 2014 at 11:32 AM, Denny Lee <de...@gmail.com> wrote:

> My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
> steps. If I was running this on standalone cluster mode the query finished
> in 55s but on YARN, the query was still running 30min later. Would the hard
> coded sleeps potentially be in play here?
> On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com> wrote:
>
>> Hi Tobias,
>>
>> What version are you using?  In some recent versions, we had a couple of
>> large hardcoded sleeps on the Spark side.
>>
>> -Sandy
>>
>> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com> wrote:
>>
>>> Hey Tobias,
>>>
>>> As you suspect, the reason why it's slow is because the resource manager
>>> in YARN takes a while to grant resources. This is because YARN needs to
>>> first set up the application master container, and then this AM needs to
>>> request more containers for Spark executors. I think this accounts for most
>>> of the overhead. The remaining source probably comes from how our own YARN
>>> integration code polls application (every second) and cluster resource
>>> states (every 5 seconds IIRC). I haven't explored in detail whether there
>>> are optimizations there that can speed this up, but I believe most of the
>>> overhead comes from YARN itself.
>>>
>>> In other words, no I don't know of any quick fix on your end that you
>>> can do to speed this up.
>>>
>>> -Andrew
>>>
>>>
>>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>>
>>> Hi,
>>>>
>>>> I am using spark-submit to submit my application to YARN in
>>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>>> application jar file put in HDFS and can see from the logging output that
>>>> both files are used from there. However, it still takes about 10 seconds
>>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>>
>>>> I am aware that this is probably not a Spark issue, but some YARN
>>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>>> anyone has an advice for how to speed this up.
>>>>
>>>> Thanks
>>>> Tobias
>>>>
>>>
>>>
>>

Re: spark-submit on YARN is slow

Posted by Denny Lee <de...@gmail.com>.
My submissions of Spark on YARN (CDH 5.2) resulted in a few thousand
steps. When I ran this in standalone cluster mode the query finished in
55s, but on YARN the query was still running 30 minutes later. Would the
hard-coded sleeps potentially be in play here?
On Fri, Dec 5, 2014 at 11:23 Sandy Ryza <sa...@cloudera.com> wrote:

> Hi Tobias,
>
> What version are you using?  In some recent versions, we had a couple of
> large hardcoded sleeps on the Spark side.
>
> -Sandy
>
> On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com> wrote:
>
>> Hey Tobias,
>>
>> As you suspect, the reason why it's slow is because the resource manager
>> in YARN takes a while to grant resources. This is because YARN needs to
>> first set up the application master container, and then this AM needs to
>> request more containers for Spark executors. I think this accounts for most
>> of the overhead. The remaining source probably comes from how our own YARN
>> integration code polls application (every second) and cluster resource
>> states (every 5 seconds IIRC). I haven't explored in detail whether there
>> are optimizations there that can speed this up, but I believe most of the
>> overhead comes from YARN itself.
>>
>> In other words, no I don't know of any quick fix on your end that you can
>> do to speed this up.
>>
>> -Andrew
>>
>>
>> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>>
>> Hi,
>>>
>>> I am using spark-submit to submit my application to YARN in
>>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>>> application jar file put in HDFS and can see from the logging output that
>>> both files are used from there. However, it still takes about 10 seconds
>>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>>
>>> I am aware that this is probably not a Spark issue, but some YARN
>>> configuration setting (or YARN-inherent slowness), I was just wondering if
>>> anyone has an advice for how to speed this up.
>>>
>>> Thanks
>>> Tobias
>>>
>>
>>
>

Re: spark-submit on YARN is slow

Posted by Sandy Ryza <sa...@cloudera.com>.
Hi Tobias,

What version are you using?  In some recent versions, we had a couple of
large hardcoded sleeps on the Spark side.

-Sandy

On Fri, Dec 5, 2014 at 11:15 AM, Andrew Or <an...@databricks.com> wrote:

> Hey Tobias,
>
> As you suspect, the reason why it's slow is because the resource manager
> in YARN takes a while to grant resources. This is because YARN needs to
> first set up the application master container, and then this AM needs to
> request more containers for Spark executors. I think this accounts for most
> of the overhead. The remaining source probably comes from how our own YARN
> integration code polls application (every second) and cluster resource
> states (every 5 seconds IIRC). I haven't explored in detail whether there
> are optimizations there that can speed this up, but I believe most of the
> overhead comes from YARN itself.
>
> In other words, no I don't know of any quick fix on your end that you can
> do to speed this up.
>
> -Andrew
>
>
> 2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:
>
> Hi,
>>
>> I am using spark-submit to submit my application to YARN in
>> "yarn-cluster" mode. I have both the Spark assembly jar file as well as my
>> application jar file put in HDFS and can see from the logging output that
>> both files are used from there. However, it still takes about 10 seconds
>> for my application's yarnAppState to switch from ACCEPTED to RUNNING.
>>
>> I am aware that this is probably not a Spark issue, but some YARN
>> configuration setting (or YARN-inherent slowness), I was just wondering if
>> anyone has an advice for how to speed this up.
>>
>> Thanks
>> Tobias
>>
>
>

Re: spark-submit on YARN is slow

Posted by Andrew Or <an...@databricks.com>.
Hey Tobias,

As you suspect, the reason it's slow is that the resource manager in
YARN takes a while to grant resources. YARN needs to first set up the
application master container, and then this AM needs to request more
containers for the Spark executors. I think this accounts for most of
the overhead. The remaining overhead probably comes from how our own
YARN integration code polls the application state (every second) and the
cluster resource state (every 5 seconds, IIRC). I haven't explored in
detail whether there are optimizations there that could speed this up,
but I believe most of the overhead comes from YARN itself.

In other words, no, I don't know of any quick fix on your end to speed
this up.

-Andrew


2014-12-03 20:10 GMT-08:00 Tobias Pfeiffer <tg...@preferred.jp>:

> Hi,
>
> I am using spark-submit to submit my application to YARN in "yarn-cluster"
> mode. I have both the Spark assembly jar file as well as my application jar
> file put in HDFS and can see from the logging output that both files are
> used from there. However, it still takes about 10 seconds for my
> application's yarnAppState to switch from ACCEPTED to RUNNING.
>
> I am aware that this is probably not a Spark issue, but some YARN
> configuration setting (or YARN-inherent slowness), I was just wondering if
> anyone has an advice for how to speed this up.
>
> Thanks
> Tobias
>