Posted to dev@spark.apache.org by Alexander Pivovarov <ap...@gmail.com> on 2016/02/09 06:03:03 UTC

spark on yarn wastes one box (or 1 GB on each box) for am container

Let's say that YARN has 53 GB of memory available on each slave.

The Spark AM container needs 896 MB (512 MB + 384 MB overhead).

I see two options for configuring Spark:

1. Configure Spark executors to use 52 GB and leave 1 GB free on each box. One
box will then also run the AM container, so the spare 1 GB goes unused on every
slave except that one.

2. Configure Spark to use all 53 GB and add an extra 53 GB box that runs only
the AM container. Then 52 GB on that extra box does nothing.

I don't like either option. Is there a better way to configure YARN/Spark?
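
As a concrete sketch of option 1 above, the executor container can be capped at
52 GB by splitting it into heap plus overhead. The 47g/5120 split and the
class/jar names below are illustrative placeholders, not defaults:

    # Option 1 sketch: 52 GB executor containers (47g heap + 5,120 MB overhead),
    # leaving ~1 GB per node so one node can also host the 896 MB AM container.
    spark-submit \
      --master yarn \
      --executor-memory 47g \
      --conf spark.yarn.executor.memoryOverhead=5120 \
      --conf spark.yarn.am.memory=512m \
      --class com.example.MyApp my_app.jar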


Alex

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Alexander Pivovarov <ap...@gmail.com>.
I mean Jonathan

On Tue, Feb 9, 2016 at 10:41 AM, Alexander Pivovarov <ap...@gmail.com>
wrote:

> I decided to do YARN over-commit and add 896
> to yarn.nodemanager.resource.memory-mb
> it was 54,272
> now I set it to 54,272+896 = 55,168
>
> Kelly, can I ask you couple questions
> 1. it is possible to add yarn label to particular instance group boxes on
> EMR?
> 2. in addition to maximizeResourceAllocation it would be nice if we have
> executorsPerBox setting in EMR.
> I have a case when I need to run 2 or 4 executors on r3.2xlarge
>
> On Tue, Feb 9, 2016 at 9:56 AM, Alexander Pivovarov <ap...@gmail.com>
> wrote:
>
>> I use hadoop 2.7.1
>> On Feb 9, 2016 9:54 AM, "Marcelo Vanzin" <va...@cloudera.com> wrote:
>>
>>> You should be able to use spark.yarn.am.nodeLabelExpression if your
>>> version of YARN supports node labels (and you've added a label to the
>>> node where you want the AM to run).
>>>
>>> On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
>>> <ap...@gmail.com> wrote:
>>> > Am container starts first and yarn selects random computer to run it.
>>> >
>>> > Is it possible to configure yarn so that it selects small computer for
>>> am
>>> > container.
>>> >
>>> > On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
>>> >>
>>> >> If it's too small to run an executor, I'd think it would be chosen for
>>> >> the AM as the only way to satisfy the request.
>>> >>
>>> >> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
>>> >> <ap...@gmail.com> wrote:
>>> >> > If I add additional small box to the cluster can I configure yarn to
>>> >> > select
>>> >> > small box to run am container?
>>> >> >
>>> >> >
>>> >> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com>
>>> wrote:
>>> >> >>
>>> >> >> Typically YARN is there because you're mediating resource requests
>>> >> >> from things besides Spark, so yeah using every bit of the cluster
>>> is a
>>> >> >> little bit of a corner case. There's not a good answer if all your
>>> >> >> nodes are the same size.
>>> >> >>
>>> >> >> I think you can let YARN over-commit RAM though, and allocate more
>>> >> >> memory than it actually has. It may be beneficial to let them all
>>> >> >> think they have an extra GB, and let one node running the AM
>>> >> >> technically be overcommitted, a state which won't hurt at all
>>> unless
>>> >> >> you're really really tight on memory, in which case something might
>>> >> >> get killed.
>>> >> >>
>>> >> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <
>>> jonathakamzn@gmail.com>
>>> >> >> wrote:
>>> >> >> > Alex,
>>> >> >> >
>>> >> >> > That's a very good question that I've been trying to answer
>>> myself
>>> >> >> > recently
>>> >> >> > too. Since you've mentioned before that you're using EMR, I
>>> assume
>>> >> >> > you're
>>> >> >> > asking this because you've noticed this behavior on emr-4.3.0.
>>> >> >> >
>>> >> >> > In this release, we made some changes to the
>>> >> >> > maximizeResourceAllocation
>>> >> >> > (which you may or may not be using, but either way this issue is
>>> >> >> > present),
>>> >> >> > including the accidental inclusion of somewhat of a bug that
>>> makes it
>>> >> >> > not
>>> >> >> > reserve any space for the AM, which ultimately results in one of
>>> the
>>> >> >> > nodes
>>> >> >> > being utilized only by the AM and not an executor.
>>> >> >> >
>>> >> >> > However, as you point out, the only viable fix seems to be to
>>> reserve
>>> >> >> > enough
>>> >> >> > memory for the AM on *every single node*, which in some cases
>>> might
>>> >> >> > actually
>>> >> >> > be worse than wasting a lot of memory on a single node.
>>> >> >> >
>>> >> >> > So yeah, I also don't like either option. Is this just the price
>>> you
>>> >> >> > pay
>>> >> >> > for
>>> >> >> > running on YARN?
>>> >> >> >
>>> >> >> >
>>> >> >> > ~ Jonathan
>>> >> >> >
>>> >> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
>>> >> >> > <ap...@gmail.com>
>>> >> >> > wrote:
>>> >> >> >>
>>> >> >> >> Lets say that yarn has 53GB memory available on each slave
>>> >> >> >>
>>> >> >> >> spark.am container needs 896MB.  (512 + 384)
>>> >> >> >>
>>> >> >> >> I see two options to configure spark:
>>> >> >> >>
>>> >> >> >> 1. configure spark executors to use 52GB and leave 1 GB on each
>>> box.
>>> >> >> >> So,
>>> >> >> >> some box will also run am container. So, 1GB memory will not be
>>> used
>>> >> >> >> on
>>> >> >> >> all
>>> >> >> >> slaves but one.
>>> >> >> >>
>>> >> >> >> 2. configure spark to use all 53GB and add additional 53GB box
>>> which
>>> >> >> >> will
>>> >> >> >> run only am container. So, 52GB on this additional box will do
>>> >> >> >> nothing
>>> >> >> >>
>>> >> >> >> I do not like both options. Is there a better way to configure
>>> >> >> >> yarn/spark?
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> Alex
>>> >> >
>>> >> >
>>>
>>>
>>>
>>> --
>>> Marcelo
>>>
>>
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Alexander Pivovarov <ap...@gmail.com>.
I decided to over-commit YARN and add 896 MB
to yarn.nodemanager.resource.memory-mb.
It was 54,272; now I set it to 54,272 + 896 = 55,168.
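
In other words, the over-commit is just a yarn-site.xml change on each
NodeManager (with a NodeManager restart afterwards), roughly:

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>55168</value>
    </property>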

Kelly, can I ask you a couple of questions?
1. Is it possible to add a YARN label to the boxes of a particular instance
group on EMR?
2. In addition to maximizeResourceAllocation, it would be nice to have an
executorsPerBox setting in EMR. I have a case where I need to run 2 or 4
executors on an r3.2xlarge.
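
Until something like executorsPerBox exists, the per-box count has to be
derived by hand, e.g. for two executors on a node with ~53 GB available to
YARN (illustrative numbers only, not EMR defaults), in spark-defaults.conf:

    # 53 GB / 2 executors = 26.5 GB per container, split as 24g heap + 2,560 MB overhead
    spark.executor.memory                24g
    spark.yarn.executor.memoryOverhead   2560
    spark.executor.cores                 4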

On Tue, Feb 9, 2016 at 9:56 AM, Alexander Pivovarov <ap...@gmail.com>
wrote:

> I use hadoop 2.7.1
> On Feb 9, 2016 9:54 AM, "Marcelo Vanzin" <va...@cloudera.com> wrote:
>
>> You should be able to use spark.yarn.am.nodeLabelExpression if your
>> version of YARN supports node labels (and you've added a label to the
>> node where you want the AM to run).
>>
>> On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
>> <ap...@gmail.com> wrote:
>> > Am container starts first and yarn selects random computer to run it.
>> >
>> > Is it possible to configure yarn so that it selects small computer for
>> am
>> > container.
>> >
>> > On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
>> >>
>> >> If it's too small to run an executor, I'd think it would be chosen for
>> >> the AM as the only way to satisfy the request.
>> >>
>> >> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
>> >> <ap...@gmail.com> wrote:
>> >> > If I add additional small box to the cluster can I configure yarn to
>> >> > select
>> >> > small box to run am container?
>> >> >
>> >> >
>> >> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com>
>> wrote:
>> >> >>
>> >> >> Typically YARN is there because you're mediating resource requests
>> >> >> from things besides Spark, so yeah using every bit of the cluster
>> is a
>> >> >> little bit of a corner case. There's not a good answer if all your
>> >> >> nodes are the same size.
>> >> >>
>> >> >> I think you can let YARN over-commit RAM though, and allocate more
>> >> >> memory than it actually has. It may be beneficial to let them all
>> >> >> think they have an extra GB, and let one node running the AM
>> >> >> technically be overcommitted, a state which won't hurt at all unless
>> >> >> you're really really tight on memory, in which case something might
>> >> >> get killed.
>> >> >>
>> >> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <
>> jonathakamzn@gmail.com>
>> >> >> wrote:
>> >> >> > Alex,
>> >> >> >
>> >> >> > That's a very good question that I've been trying to answer myself
>> >> >> > recently
>> >> >> > too. Since you've mentioned before that you're using EMR, I assume
>> >> >> > you're
>> >> >> > asking this because you've noticed this behavior on emr-4.3.0.
>> >> >> >
>> >> >> > In this release, we made some changes to the
>> >> >> > maximizeResourceAllocation
>> >> >> > (which you may or may not be using, but either way this issue is
>> >> >> > present),
>> >> >> > including the accidental inclusion of somewhat of a bug that
>> makes it
>> >> >> > not
>> >> >> > reserve any space for the AM, which ultimately results in one of
>> the
>> >> >> > nodes
>> >> >> > being utilized only by the AM and not an executor.
>> >> >> >
>> >> >> > However, as you point out, the only viable fix seems to be to
>> reserve
>> >> >> > enough
>> >> >> > memory for the AM on *every single node*, which in some cases
>> might
>> >> >> > actually
>> >> >> > be worse than wasting a lot of memory on a single node.
>> >> >> >
>> >> >> > So yeah, I also don't like either option. Is this just the price
>> you
>> >> >> > pay
>> >> >> > for
>> >> >> > running on YARN?
>> >> >> >
>> >> >> >
>> >> >> > ~ Jonathan
>> >> >> >
>> >> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
>> >> >> > <ap...@gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Lets say that yarn has 53GB memory available on each slave
>> >> >> >>
>> >> >> >> spark.am container needs 896MB.  (512 + 384)
>> >> >> >>
>> >> >> >> I see two options to configure spark:
>> >> >> >>
>> >> >> >> 1. configure spark executors to use 52GB and leave 1 GB on each
>> box.
>> >> >> >> So,
>> >> >> >> some box will also run am container. So, 1GB memory will not be
>> used
>> >> >> >> on
>> >> >> >> all
>> >> >> >> slaves but one.
>> >> >> >>
>> >> >> >> 2. configure spark to use all 53GB and add additional 53GB box
>> which
>> >> >> >> will
>> >> >> >> run only am container. So, 52GB on this additional box will do
>> >> >> >> nothing
>> >> >> >>
>> >> >> >> I do not like both options. Is there a better way to configure
>> >> >> >> yarn/spark?
>> >> >> >>
>> >> >> >>
>> >> >> >> Alex
>> >> >
>> >> >
>>
>>
>>
>> --
>> Marcelo
>>
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Alexander Pivovarov <ap...@gmail.com>.
I use Hadoop 2.7.1.
On Feb 9, 2016 9:54 AM, "Marcelo Vanzin" <va...@cloudera.com> wrote:

> You should be able to use spark.yarn.am.nodeLabelExpression if your
> version of YARN supports node labels (and you've added a label to the
> node where you want the AM to run).
>
> On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
> <ap...@gmail.com> wrote:
> > Am container starts first and yarn selects random computer to run it.
> >
> > Is it possible to configure yarn so that it selects small computer for am
> > container.
> >
> > On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
> >>
> >> If it's too small to run an executor, I'd think it would be chosen for
> >> the AM as the only way to satisfy the request.
> >>
> >> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
> >> <ap...@gmail.com> wrote:
> >> > If I add additional small box to the cluster can I configure yarn to
> >> > select
> >> > small box to run am container?
> >> >
> >> >
> >> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com>
> wrote:
> >> >>
> >> >> Typically YARN is there because you're mediating resource requests
> >> >> from things besides Spark, so yeah using every bit of the cluster is
> a
> >> >> little bit of a corner case. There's not a good answer if all your
> >> >> nodes are the same size.
> >> >>
> >> >> I think you can let YARN over-commit RAM though, and allocate more
> >> >> memory than it actually has. It may be beneficial to let them all
> >> >> think they have an extra GB, and let one node running the AM
> >> >> technically be overcommitted, a state which won't hurt at all unless
> >> >> you're really really tight on memory, in which case something might
> >> >> get killed.
> >> >>
> >> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <
> jonathakamzn@gmail.com>
> >> >> wrote:
> >> >> > Alex,
> >> >> >
> >> >> > That's a very good question that I've been trying to answer myself
> >> >> > recently
> >> >> > too. Since you've mentioned before that you're using EMR, I assume
> >> >> > you're
> >> >> > asking this because you've noticed this behavior on emr-4.3.0.
> >> >> >
> >> >> > In this release, we made some changes to the
> >> >> > maximizeResourceAllocation
> >> >> > (which you may or may not be using, but either way this issue is
> >> >> > present),
> >> >> > including the accidental inclusion of somewhat of a bug that makes
> it
> >> >> > not
> >> >> > reserve any space for the AM, which ultimately results in one of
> the
> >> >> > nodes
> >> >> > being utilized only by the AM and not an executor.
> >> >> >
> >> >> > However, as you point out, the only viable fix seems to be to
> reserve
> >> >> > enough
> >> >> > memory for the AM on *every single node*, which in some cases might
> >> >> > actually
> >> >> > be worse than wasting a lot of memory on a single node.
> >> >> >
> >> >> > So yeah, I also don't like either option. Is this just the price
> you
> >> >> > pay
> >> >> > for
> >> >> > running on YARN?
> >> >> >
> >> >> >
> >> >> > ~ Jonathan
> >> >> >
> >> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
> >> >> > <ap...@gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Lets say that yarn has 53GB memory available on each slave
> >> >> >>
> >> >> >> spark.am container needs 896MB.  (512 + 384)
> >> >> >>
> >> >> >> I see two options to configure spark:
> >> >> >>
> >> >> >> 1. configure spark executors to use 52GB and leave 1 GB on each
> box.
> >> >> >> So,
> >> >> >> some box will also run am container. So, 1GB memory will not be
> used
> >> >> >> on
> >> >> >> all
> >> >> >> slaves but one.
> >> >> >>
> >> >> >> 2. configure spark to use all 53GB and add additional 53GB box
> which
> >> >> >> will
> >> >> >> run only am container. So, 52GB on this additional box will do
> >> >> >> nothing
> >> >> >>
> >> >> >> I do not like both options. Is there a better way to configure
> >> >> >> yarn/spark?
> >> >> >>
> >> >> >>
> >> >> >> Alex
> >> >
> >> >
>
>
>
> --
> Marcelo
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Alexander Pivovarov <ap...@gmail.com>.
Can you add the ability to set custom YARN labels instead of, or in addition to, that?
On Feb 9, 2016 3:28 PM, "Jonathan Kelly" <jo...@gmail.com> wrote:

> Oh, sheesh, how silly of me. I copied and pasted that setting name without
> even noticing the "mapreduce" in it. Yes, I guess that would mean that
> Spark AMs are probably running even on TASK instances currently, which is
> OK but not consistent with what we do for MapReduce. I'll make sure we
> set spark.yarn.am.nodeLabelExpression appropriately in the next EMR release.
>
> ~ Jonathan
>
> On Tue, Feb 9, 2016 at 1:30 PM Marcelo Vanzin <va...@cloudera.com> wrote:
>
>> On Tue, Feb 9, 2016 at 12:16 PM, Jonathan Kelly <jo...@gmail.com>
>> wrote:
>> > And we do set yarn.app.mapreduce.am.labels=CORE
>>
>> That sounds very mapreduce-specific, so I doubt Spark (or anything
>> non-MR) would honor it.
>>
>> --
>> Marcelo
>>
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Jonathan Kelly <jo...@gmail.com>.
Oh, sheesh, how silly of me. I copied and pasted that setting name without
even noticing the "mapreduce" in it. Yes, I guess that would mean that
Spark AMs are probably running even on TASK instances currently, which is
OK but not consistent with what we do for MapReduce. I'll make sure we
set spark.yarn.am.nodeLabelExpression appropriately in the next EMR release.

~ Jonathan

On Tue, Feb 9, 2016 at 1:30 PM Marcelo Vanzin <va...@cloudera.com> wrote:

> On Tue, Feb 9, 2016 at 12:16 PM, Jonathan Kelly <jo...@gmail.com>
> wrote:
> > And we do set yarn.app.mapreduce.am.labels=CORE
>
> That sounds very mapreduce-specific, so I doubt Spark (or anything
> non-MR) would honor it.
>
> --
> Marcelo
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Marcelo Vanzin <va...@cloudera.com>.
On Tue, Feb 9, 2016 at 12:16 PM, Jonathan Kelly <jo...@gmail.com> wrote:
> And we do set yarn.app.mapreduce.am.labels=CORE

That sounds very mapreduce-specific, so I doubt Spark (or anything
non-MR) would honor it.

-- 
Marcelo



Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Alexander Pivovarov <ap...@gmail.com>.
Great! Thank you!

On Tue, Feb 9, 2016 at 4:02 PM, Jonathan Kelly <jo...@gmail.com>
wrote:

> You can set custom per-instance-group configurations (e.g.,
> ["classification":"yarn-site",properties:{"yarn.nodemanager.labels":"SPARKAM"}])
> using the Configurations parameter of
> http://docs.aws.amazon.com/ElasticMapReduce/latest/API/API_InstanceGroupConfig.html.
> Unfortunately, it's not currently possible to specify per-instance-group
> configurations via the CLI though, only cluster wide configurations.
>
> ~ Jonathan
>
>
> On Tue, Feb 9, 2016 at 12:36 PM Alexander Pivovarov <ap...@gmail.com>
> wrote:
>
>> Thanks Jonathan
>>
>> Actually I'd like to use maximizeResourceAllocation.
>>
>> Ideally for me would be to add new instance group having single small box
>> labelled as AM
>> I'm not sure "aws emr create-cluster" supports setting custom LABELS ,
>> the only settings awailable are:
>>
>> InstanceCount=1,BidPrice=0.5,Name=sparkAM,InstanceGroupType=TASK,InstanceType=m3.xlarge
>>
>>
>> How can I specify yarn label AM for that box?
>>
>>
>>
>> On Tue, Feb 9, 2016 at 12:16 PM, Jonathan Kelly <jo...@gmail.com>
>> wrote:
>>
>>> Interesting, I was not aware of spark.yarn.am.nodeLabelExpression.
>>>
>>> We do use YARN labels on EMR; each node is automatically labeled with
>>> its type (MASTER, CORE, or TASK). And we do
>>> set yarn.app.mapreduce.am.labels=CORE in yarn-site.xml, but we do not set
>>> spark.yarn.am.nodeLabelExpression.
>>>
>>> Does Spark somehow not actually honor this? It seems weird that Spark
>>> would have its own similar-sounding property
>>> (spark.yarn.am.nodeLabelExpression). If spark.yarn.am.nodeLabelExpression
>>> is used and yarn.app.mapreduce.am.labels ignored, I could be wrong about
>>> Spark AMs only running on CORE instances in EMR.
>>>
>>> I'm guessing though that spark.yarn.am.nodeLabelExpression would simply
>>> override yarn.app.mapreduce.am.labels, so yarn.app.mapreduce.am.labels
>>> would be treated as a default when it is set and
>>> spark.yarn.am.nodeLabelExpression is not. Is that correct?
>>>
>>> In short, Alex, you should not need to set any of the label-related
>>> properties yourself if you do what I suggested regarding using small CORE
>>> instances and large TASK instances. But if you want to do something
>>> different, it would also be possible to add a TASK instance group with
>>> small nodes and configured with some new label. Then you could set
>>> spark.yarn.am.nodeLabelExpression to that label.
>>>
>>> Thanks, Marcelo, for pointing out spark.yarn.am.nodeLabelExpression!
>>>
>>> ~ Jonathan
>>>
>>> On Tue, Feb 9, 2016 at 9:54 AM Marcelo Vanzin <va...@cloudera.com>
>>> wrote:
>>>
>>>> You should be able to use spark.yarn.am.nodeLabelExpression if your
>>>> version of YARN supports node labels (and you've added a label to the
>>>> node where you want the AM to run).
>>>>
>>>> On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
>>>> <ap...@gmail.com> wrote:
>>>> > Am container starts first and yarn selects random computer to run it.
>>>> >
>>>> > Is it possible to configure yarn so that it selects small computer
>>>> for am
>>>> > container.
>>>> >
>>>> > On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
>>>> >>
>>>> >> If it's too small to run an executor, I'd think it would be chosen
>>>> for
>>>> >> the AM as the only way to satisfy the request.
>>>> >>
>>>> >> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
>>>> >> <ap...@gmail.com> wrote:
>>>> >> > If I add additional small box to the cluster can I configure yarn
>>>> to
>>>> >> > select
>>>> >> > small box to run am container?
>>>> >> >
>>>> >> >
>>>> >> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com>
>>>> wrote:
>>>> >> >>
>>>> >> >> Typically YARN is there because you're mediating resource requests
>>>> >> >> from things besides Spark, so yeah using every bit of the cluster
>>>> is a
>>>> >> >> little bit of a corner case. There's not a good answer if all your
>>>> >> >> nodes are the same size.
>>>> >> >>
>>>> >> >> I think you can let YARN over-commit RAM though, and allocate more
>>>> >> >> memory than it actually has. It may be beneficial to let them all
>>>> >> >> think they have an extra GB, and let one node running the AM
>>>> >> >> technically be overcommitted, a state which won't hurt at all
>>>> unless
>>>> >> >> you're really really tight on memory, in which case something
>>>> might
>>>> >> >> get killed.
>>>> >> >>
>>>> >> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <
>>>> jonathakamzn@gmail.com>
>>>> >> >> wrote:
>>>> >> >> > Alex,
>>>> >> >> >
>>>> >> >> > That's a very good question that I've been trying to answer
>>>> myself
>>>> >> >> > recently
>>>> >> >> > too. Since you've mentioned before that you're using EMR, I
>>>> assume
>>>> >> >> > you're
>>>> >> >> > asking this because you've noticed this behavior on emr-4.3.0.
>>>> >> >> >
>>>> >> >> > In this release, we made some changes to the
>>>> >> >> > maximizeResourceAllocation
>>>> >> >> > (which you may or may not be using, but either way this issue is
>>>> >> >> > present),
>>>> >> >> > including the accidental inclusion of somewhat of a bug that
>>>> makes it
>>>> >> >> > not
>>>> >> >> > reserve any space for the AM, which ultimately results in one
>>>> of the
>>>> >> >> > nodes
>>>> >> >> > being utilized only by the AM and not an executor.
>>>> >> >> >
>>>> >> >> > However, as you point out, the only viable fix seems to be to
>>>> reserve
>>>> >> >> > enough
>>>> >> >> > memory for the AM on *every single node*, which in some cases
>>>> might
>>>> >> >> > actually
>>>> >> >> > be worse than wasting a lot of memory on a single node.
>>>> >> >> >
>>>> >> >> > So yeah, I also don't like either option. Is this just the
>>>> price you
>>>> >> >> > pay
>>>> >> >> > for
>>>> >> >> > running on YARN?
>>>> >> >> >
>>>> >> >> >
>>>> >> >> > ~ Jonathan
>>>> >> >> >
>>>> >> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
>>>> >> >> > <ap...@gmail.com>
>>>> >> >> > wrote:
>>>> >> >> >>
>>>> >> >> >> Lets say that yarn has 53GB memory available on each slave
>>>> >> >> >>
>>>> >> >> >> spark.am container needs 896MB.  (512 + 384)
>>>> >> >> >>
>>>> >> >> >> I see two options to configure spark:
>>>> >> >> >>
>>>> >> >> >> 1. configure spark executors to use 52GB and leave 1 GB on
>>>> each box.
>>>> >> >> >> So,
>>>> >> >> >> some box will also run am container. So, 1GB memory will not
>>>> be used
>>>> >> >> >> on
>>>> >> >> >> all
>>>> >> >> >> slaves but one.
>>>> >> >> >>
>>>> >> >> >> 2. configure spark to use all 53GB and add additional 53GB box
>>>> which
>>>> >> >> >> will
>>>> >> >> >> run only am container. So, 52GB on this additional box will do
>>>> >> >> >> nothing
>>>> >> >> >>
>>>> >> >> >> I do not like both options. Is there a better way to configure
>>>> >> >> >> yarn/spark?
>>>> >> >> >>
>>>> >> >> >>
>>>> >> >> >> Alex
>>>> >> >
>>>> >> >
>>>>
>>>>
>>>>
>>>> --
>>>> Marcelo
>>>>
>>>
>>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Jonathan Kelly <jo...@gmail.com>.
You can set custom per-instance-group configurations (e.g.,
[{"Classification":"yarn-site","Properties":{"yarn.nodemanager.labels":"SPARKAM"}}])
using the Configurations parameter of
http://docs.aws.amazon.com/ElasticMapReduce/latest/API/API_InstanceGroupConfig.html.
Unfortunately, it's not currently possible to specify per-instance-group
configurations via the CLI; only cluster-wide configurations are supported.
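
For reference, a full instance group definition using that Configurations
parameter might look roughly like this (JSON shape per the InstanceGroupConfig
API linked above; the group name, instance type, and the yarn.nodemanager.labels
value are just the examples used in this thread):

    {
      "InstanceRole": "TASK",
      "InstanceType": "m3.xlarge",
      "InstanceCount": 1,
      "Name": "sparkAM",
      "Configurations": [
        {
          "Classification": "yarn-site",
          "Properties": { "yarn.nodemanager.labels": "SPARKAM" }
        }
      ]
    }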

~ Jonathan

On Tue, Feb 9, 2016 at 12:36 PM Alexander Pivovarov <ap...@gmail.com>
wrote:

> Thanks Jonathan
>
> Actually I'd like to use maximizeResourceAllocation.
>
> Ideally for me would be to add new instance group having single small box
> labelled as AM
> I'm not sure "aws emr create-cluster" supports setting custom LABELS , the
> only settings awailable are:
>
> InstanceCount=1,BidPrice=0.5,Name=sparkAM,InstanceGroupType=TASK,InstanceType=m3.xlarge
>
>
> How can I specify yarn label AM for that box?
>
>
>
> On Tue, Feb 9, 2016 at 12:16 PM, Jonathan Kelly <jo...@gmail.com>
> wrote:
>
>> Interesting, I was not aware of spark.yarn.am.nodeLabelExpression.
>>
>> We do use YARN labels on EMR; each node is automatically labeled with its
>> type (MASTER, CORE, or TASK). And we do
>> set yarn.app.mapreduce.am.labels=CORE in yarn-site.xml, but we do not set
>> spark.yarn.am.nodeLabelExpression.
>>
>> Does Spark somehow not actually honor this? It seems weird that Spark
>> would have its own similar-sounding property
>> (spark.yarn.am.nodeLabelExpression). If spark.yarn.am.nodeLabelExpression
>> is used and yarn.app.mapreduce.am.labels ignored, I could be wrong about
>> Spark AMs only running on CORE instances in EMR.
>>
>> I'm guessing though that spark.yarn.am.nodeLabelExpression would simply
>> override yarn.app.mapreduce.am.labels, so yarn.app.mapreduce.am.labels
>> would be treated as a default when it is set and
>> spark.yarn.am.nodeLabelExpression is not. Is that correct?
>>
>> In short, Alex, you should not need to set any of the label-related
>> properties yourself if you do what I suggested regarding using small CORE
>> instances and large TASK instances. But if you want to do something
>> different, it would also be possible to add a TASK instance group with
>> small nodes and configured with some new label. Then you could set
>> spark.yarn.am.nodeLabelExpression to that label.
>>
>> Thanks, Marcelo, for pointing out spark.yarn.am.nodeLabelExpression!
>>
>> ~ Jonathan
>>
>> On Tue, Feb 9, 2016 at 9:54 AM Marcelo Vanzin <va...@cloudera.com>
>> wrote:
>>
>>> You should be able to use spark.yarn.am.nodeLabelExpression if your
>>> version of YARN supports node labels (and you've added a label to the
>>> node where you want the AM to run).
>>>
>>> On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
>>> <ap...@gmail.com> wrote:
>>> > Am container starts first and yarn selects random computer to run it.
>>> >
>>> > Is it possible to configure yarn so that it selects small computer for
>>> am
>>> > container.
>>> >
>>> > On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
>>> >>
>>> >> If it's too small to run an executor, I'd think it would be chosen for
>>> >> the AM as the only way to satisfy the request.
>>> >>
>>> >> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
>>> >> <ap...@gmail.com> wrote:
>>> >> > If I add additional small box to the cluster can I configure yarn to
>>> >> > select
>>> >> > small box to run am container?
>>> >> >
>>> >> >
>>> >> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com>
>>> wrote:
>>> >> >>
>>> >> >> Typically YARN is there because you're mediating resource requests
>>> >> >> from things besides Spark, so yeah using every bit of the cluster
>>> is a
>>> >> >> little bit of a corner case. There's not a good answer if all your
>>> >> >> nodes are the same size.
>>> >> >>
>>> >> >> I think you can let YARN over-commit RAM though, and allocate more
>>> >> >> memory than it actually has. It may be beneficial to let them all
>>> >> >> think they have an extra GB, and let one node running the AM
>>> >> >> technically be overcommitted, a state which won't hurt at all
>>> unless
>>> >> >> you're really really tight on memory, in which case something might
>>> >> >> get killed.
>>> >> >>
>>> >> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <
>>> jonathakamzn@gmail.com>
>>> >> >> wrote:
>>> >> >> > Alex,
>>> >> >> >
>>> >> >> > That's a very good question that I've been trying to answer
>>> myself
>>> >> >> > recently
>>> >> >> > too. Since you've mentioned before that you're using EMR, I
>>> assume
>>> >> >> > you're
>>> >> >> > asking this because you've noticed this behavior on emr-4.3.0.
>>> >> >> >
>>> >> >> > In this release, we made some changes to the
>>> >> >> > maximizeResourceAllocation
>>> >> >> > (which you may or may not be using, but either way this issue is
>>> >> >> > present),
>>> >> >> > including the accidental inclusion of somewhat of a bug that
>>> makes it
>>> >> >> > not
>>> >> >> > reserve any space for the AM, which ultimately results in one of
>>> the
>>> >> >> > nodes
>>> >> >> > being utilized only by the AM and not an executor.
>>> >> >> >
>>> >> >> > However, as you point out, the only viable fix seems to be to
>>> reserve
>>> >> >> > enough
>>> >> >> > memory for the AM on *every single node*, which in some cases
>>> might
>>> >> >> > actually
>>> >> >> > be worse than wasting a lot of memory on a single node.
>>> >> >> >
>>> >> >> > So yeah, I also don't like either option. Is this just the price
>>> you
>>> >> >> > pay
>>> >> >> > for
>>> >> >> > running on YARN?
>>> >> >> >
>>> >> >> >
>>> >> >> > ~ Jonathan
>>> >> >> >
>>> >> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
>>> >> >> > <ap...@gmail.com>
>>> >> >> > wrote:
>>> >> >> >>
>>> >> >> >> Lets say that yarn has 53GB memory available on each slave
>>> >> >> >>
>>> >> >> >> spark.am container needs 896MB.  (512 + 384)
>>> >> >> >>
>>> >> >> >> I see two options to configure spark:
>>> >> >> >>
>>> >> >> >> 1. configure spark executors to use 52GB and leave 1 GB on each
>>> box.
>>> >> >> >> So,
>>> >> >> >> some box will also run am container. So, 1GB memory will not be
>>> used
>>> >> >> >> on
>>> >> >> >> all
>>> >> >> >> slaves but one.
>>> >> >> >>
>>> >> >> >> 2. configure spark to use all 53GB and add additional 53GB box
>>> which
>>> >> >> >> will
>>> >> >> >> run only am container. So, 52GB on this additional box will do
>>> >> >> >> nothing
>>> >> >> >>
>>> >> >> >> I do not like both options. Is there a better way to configure
>>> >> >> >> yarn/spark?
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> Alex
>>> >> >
>>> >> >
>>>
>>>
>>>
>>> --
>>> Marcelo
>>>
>>
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Alexander Pivovarov <ap...@gmail.com>.
Thanks, Jonathan.

Actually, I'd like to use maximizeResourceAllocation.

Ideally, I would add a new instance group with a single small box labeled AM.
I'm not sure "aws emr create-cluster" supports setting custom labels; the only
settings available are:

InstanceCount=1,BidPrice=0.5,Name=sparkAM,InstanceGroupType=TASK,InstanceType=m3.xlarge


How can I specify the YARN label AM for that box?



On Tue, Feb 9, 2016 at 12:16 PM, Jonathan Kelly <jo...@gmail.com>
wrote:

> Interesting, I was not aware of spark.yarn.am.nodeLabelExpression.
>
> We do use YARN labels on EMR; each node is automatically labeled with its
> type (MASTER, CORE, or TASK). And we do
> set yarn.app.mapreduce.am.labels=CORE in yarn-site.xml, but we do not set
> spark.yarn.am.nodeLabelExpression.
>
> Does Spark somehow not actually honor this? It seems weird that Spark
> would have its own similar-sounding property
> (spark.yarn.am.nodeLabelExpression). If spark.yarn.am.nodeLabelExpression
> is used and yarn.app.mapreduce.am.labels ignored, I could be wrong about
> Spark AMs only running on CORE instances in EMR.
>
> I'm guessing though that spark.yarn.am.nodeLabelExpression would simply
> override yarn.app.mapreduce.am.labels, so yarn.app.mapreduce.am.labels
> would be treated as a default when it is set and
> spark.yarn.am.nodeLabelExpression is not. Is that correct?
>
> In short, Alex, you should not need to set any of the label-related
> properties yourself if you do what I suggested regarding using small CORE
> instances and large TASK instances. But if you want to do something
> different, it would also be possible to add a TASK instance group with
> small nodes and configured with some new label. Then you could set
> spark.yarn.am.nodeLabelExpression to that label.
>
> Thanks, Marcelo, for pointing out spark.yarn.am.nodeLabelExpression!
>
> ~ Jonathan
>
> On Tue, Feb 9, 2016 at 9:54 AM Marcelo Vanzin <va...@cloudera.com> wrote:
>
>> You should be able to use spark.yarn.am.nodeLabelExpression if your
>> version of YARN supports node labels (and you've added a label to the
>> node where you want the AM to run).
>>
>> On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
>> <ap...@gmail.com> wrote:
>> > Am container starts first and yarn selects random computer to run it.
>> >
>> > Is it possible to configure yarn so that it selects small computer for
>> am
>> > container.
>> >
>> > On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
>> >>
>> >> If it's too small to run an executor, I'd think it would be chosen for
>> >> the AM as the only way to satisfy the request.
>> >>
>> >> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
>> >> <ap...@gmail.com> wrote:
>> >> > If I add additional small box to the cluster can I configure yarn to
>> >> > select
>> >> > small box to run am container?
>> >> >
>> >> >
>> >> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com>
>> wrote:
>> >> >>
>> >> >> Typically YARN is there because you're mediating resource requests
>> >> >> from things besides Spark, so yeah using every bit of the cluster
>> is a
>> >> >> little bit of a corner case. There's not a good answer if all your
>> >> >> nodes are the same size.
>> >> >>
>> >> >> I think you can let YARN over-commit RAM though, and allocate more
>> >> >> memory than it actually has. It may be beneficial to let them all
>> >> >> think they have an extra GB, and let one node running the AM
>> >> >> technically be overcommitted, a state which won't hurt at all unless
>> >> >> you're really really tight on memory, in which case something might
>> >> >> get killed.
>> >> >>
>> >> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <
>> jonathakamzn@gmail.com>
>> >> >> wrote:
>> >> >> > Alex,
>> >> >> >
>> >> >> > That's a very good question that I've been trying to answer myself
>> >> >> > recently
>> >> >> > too. Since you've mentioned before that you're using EMR, I assume
>> >> >> > you're
>> >> >> > asking this because you've noticed this behavior on emr-4.3.0.
>> >> >> >
>> >> >> > In this release, we made some changes to the
>> >> >> > maximizeResourceAllocation
>> >> >> > (which you may or may not be using, but either way this issue is
>> >> >> > present),
>> >> >> > including the accidental inclusion of somewhat of a bug that
>> makes it
>> >> >> > not
>> >> >> > reserve any space for the AM, which ultimately results in one of
>> the
>> >> >> > nodes
>> >> >> > being utilized only by the AM and not an executor.
>> >> >> >
>> >> >> > However, as you point out, the only viable fix seems to be to
>> reserve
>> >> >> > enough
>> >> >> > memory for the AM on *every single node*, which in some cases
>> might
>> >> >> > actually
>> >> >> > be worse than wasting a lot of memory on a single node.
>> >> >> >
>> >> >> > So yeah, I also don't like either option. Is this just the price
>> you
>> >> >> > pay
>> >> >> > for
>> >> >> > running on YARN?
>> >> >> >
>> >> >> >
>> >> >> > ~ Jonathan
>> >> >> >
>> >> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
>> >> >> > <ap...@gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Lets say that yarn has 53GB memory available on each slave
>> >> >> >>
>> >> >> >> spark.am container needs 896MB.  (512 + 384)
>> >> >> >>
>> >> >> >> I see two options to configure spark:
>> >> >> >>
>> >> >> >> 1. configure spark executors to use 52GB and leave 1 GB on each
>> box.
>> >> >> >> So,
>> >> >> >> some box will also run am container. So, 1GB memory will not be
>> used
>> >> >> >> on
>> >> >> >> all
>> >> >> >> slaves but one.
>> >> >> >>
>> >> >> >> 2. configure spark to use all 53GB and add additional 53GB box
>> which
>> >> >> >> will
>> >> >> >> run only am container. So, 52GB on this additional box will do
>> >> >> >> nothing
>> >> >> >>
>> >> >> >> I do not like both options. Is there a better way to configure
>> >> >> >> yarn/spark?
>> >> >> >>
>> >> >> >>
>> >> >> >> Alex
>> >> >
>> >> >
>>
>>
>>
>> --
>> Marcelo
>>
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Jonathan Kelly <jo...@gmail.com>.
Interesting, I was not aware of spark.yarn.am.nodeLabelExpression.

We do use YARN labels on EMR; each node is automatically labeled with its
type (MASTER, CORE, or TASK). And we do
set yarn.app.mapreduce.am.labels=CORE in yarn-site.xml, but we do not set
spark.yarn.am.nodeLabelExpression.

Does Spark somehow not actually honor this? It seems weird that Spark would
have its own similar-sounding property (spark.yarn.am.nodeLabelExpression).
If spark.yarn.am.nodeLabelExpression is used
and yarn.app.mapreduce.am.labels ignored, I could be wrong about Spark AMs
only running on CORE instances in EMR.

I'm guessing though that spark.yarn.am.nodeLabelExpression would simply
override yarn.app.mapreduce.am.labels, so yarn.app.mapreduce.am.labels
would be treated as a default when it is set and
spark.yarn.am.nodeLabelExpression is not. Is that correct?

In short, Alex, you should not need to set any of the label-related
properties yourself if you do what I suggested regarding using small CORE
instances and large TASK instances. But if you want to do something
different, it would also be possible to add a TASK instance group with
small nodes and configured with some new label. Then you could set
spark.yarn.am.nodeLabelExpression to that label.
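>>>
>>> Roughly, that combination would look like this, assuming a custom label
>>> (here called SPARKAM, a made-up name) has already been attached to the
>>> small TASK nodes and that your Spark version supports both label
>>> expressions:
>>>
>>>     # spark-defaults.conf (or pass each as --conf to spark-submit)
>>>     spark.yarn.am.nodeLabelExpression        SPARKAM
>>>     # optionally keep executors off the small labeled nodes
>>>     spark.yarn.executor.nodeLabelExpression  CORE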

Thanks, Marcelo, for pointing out spark.yarn.am.nodeLabelExpression!

~ Jonathan

On Tue, Feb 9, 2016 at 9:54 AM Marcelo Vanzin <va...@cloudera.com> wrote:

> You should be able to use spark.yarn.am.nodeLabelExpression if your
> version of YARN supports node labels (and you've added a label to the
> node where you want the AM to run).
>
> On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
> <ap...@gmail.com> wrote:
> > Am container starts first and yarn selects random computer to run it.
> >
> > Is it possible to configure yarn so that it selects small computer for am
> > container.
> >
> > On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
> >>
> >> If it's too small to run an executor, I'd think it would be chosen for
> >> the AM as the only way to satisfy the request.
> >>
> >> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
> >> <ap...@gmail.com> wrote:
> >> > If I add additional small box to the cluster can I configure yarn to
> >> > select
> >> > small box to run am container?
> >> >
> >> >
> >> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com>
> wrote:
> >> >>
> >> >> Typically YARN is there because you're mediating resource requests
> >> >> from things besides Spark, so yeah using every bit of the cluster is
> a
> >> >> little bit of a corner case. There's not a good answer if all your
> >> >> nodes are the same size.
> >> >>
> >> >> I think you can let YARN over-commit RAM though, and allocate more
> >> >> memory than it actually has. It may be beneficial to let them all
> >> >> think they have an extra GB, and let one node running the AM
> >> >> technically be overcommitted, a state which won't hurt at all unless
> >> >> you're really really tight on memory, in which case something might
> >> >> get killed.
> >> >>
> >> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <
> jonathakamzn@gmail.com>
> >> >> wrote:
> >> >> > Alex,
> >> >> >
> >> >> > That's a very good question that I've been trying to answer myself
> >> >> > recently
> >> >> > too. Since you've mentioned before that you're using EMR, I assume
> >> >> > you're
> >> >> > asking this because you've noticed this behavior on emr-4.3.0.
> >> >> >
> >> >> > In this release, we made some changes to the
> >> >> > maximizeResourceAllocation
> >> >> > (which you may or may not be using, but either way this issue is
> >> >> > present),
> >> >> > including the accidental inclusion of somewhat of a bug that makes
> it
> >> >> > not
> >> >> > reserve any space for the AM, which ultimately results in one of
> the
> >> >> > nodes
> >> >> > being utilized only by the AM and not an executor.
> >> >> >
> >> >> > However, as you point out, the only viable fix seems to be to
> reserve
> >> >> > enough
> >> >> > memory for the AM on *every single node*, which in some cases might
> >> >> > actually
> >> >> > be worse than wasting a lot of memory on a single node.
> >> >> >
> >> >> > So yeah, I also don't like either option. Is this just the price
> you
> >> >> > pay
> >> >> > for
> >> >> > running on YARN?
> >> >> >
> >> >> >
> >> >> > ~ Jonathan
> >> >> >
> >> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
> >> >> > <ap...@gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Lets say that yarn has 53GB memory available on each slave
> >> >> >>
> >> >> >> spark.am container needs 896MB.  (512 + 384)
> >> >> >>
> >> >> >> I see two options to configure spark:
> >> >> >>
> >> >> >> 1. configure spark executors to use 52GB and leave 1 GB on each
> box.
> >> >> >> So,
> >> >> >> some box will also run am container. So, 1GB memory will not be
> used
> >> >> >> on
> >> >> >> all
> >> >> >> slaves but one.
> >> >> >>
> >> >> >> 2. configure spark to use all 53GB and add additional 53GB box
> which
> >> >> >> will
> >> >> >> run only am container. So, 52GB on this additional box will do
> >> >> >> nothing
> >> >> >>
> >> >> >> I do not like both options. Is there a better way to configure
> >> >> >> yarn/spark?
> >> >> >>
> >> >> >>
> >> >> >> Alex
> >> >
> >> >
>
>
>
> --
> Marcelo
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Marcelo Vanzin <va...@cloudera.com>.
You should be able to use spark.yarn.am.nodeLabelExpression if your
version of YARN supports node labels (and you've added a label to the
node where you want the AM to run).
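
On a plain Hadoop 2.7 cluster, the rough sequence would be something along
these lines, assuming node labels are already enabled in yarn-site.xml
(yarn.node-labels.enabled=true plus a label store) and the scheduler queue
accepts the label; the host, label, class, and jar names are placeholders:

    # register the label and attach it to the small node
    yarn rmadmin -addToClusterNodeLabels "smallbox"
    yarn rmadmin -replaceLabelsOnNode "small-host-1=smallbox"

    # pin the Spark AM to that label
    spark-submit \
      --master yarn \
      --conf spark.yarn.am.nodeLabelExpression=smallbox \
      --class com.example.MyApp my_app.jar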

On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
<ap...@gmail.com> wrote:
> Am container starts first and yarn selects random computer to run it.
>
> Is it possible to configure yarn so that it selects small computer for am
> container.
>
> On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
>>
>> If it's too small to run an executor, I'd think it would be chosen for
>> the AM as the only way to satisfy the request.
>>
>> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
>> <ap...@gmail.com> wrote:
>> > If I add additional small box to the cluster can I configure yarn to
>> > select
>> > small box to run am container?
>> >
>> >
>> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com> wrote:
>> >>
>> >> Typically YARN is there because you're mediating resource requests
>> >> from things besides Spark, so yeah using every bit of the cluster is a
>> >> little bit of a corner case. There's not a good answer if all your
>> >> nodes are the same size.
>> >>
>> >> I think you can let YARN over-commit RAM though, and allocate more
>> >> memory than it actually has. It may be beneficial to let them all
>> >> think they have an extra GB, and let one node running the AM
>> >> technically be overcommitted, a state which won't hurt at all unless
>> >> you're really really tight on memory, in which case something might
>> >> get killed.
>> >>
>> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <jo...@gmail.com>
>> >> wrote:
>> >> > Alex,
>> >> >
>> >> > That's a very good question that I've been trying to answer myself
>> >> > recently
>> >> > too. Since you've mentioned before that you're using EMR, I assume
>> >> > you're
>> >> > asking this because you've noticed this behavior on emr-4.3.0.
>> >> >
>> >> > In this release, we made some changes to the
>> >> > maximizeResourceAllocation
>> >> > (which you may or may not be using, but either way this issue is
>> >> > present),
>> >> > including the accidental inclusion of somewhat of a bug that makes it
>> >> > not
>> >> > reserve any space for the AM, which ultimately results in one of the
>> >> > nodes
>> >> > being utilized only by the AM and not an executor.
>> >> >
>> >> > However, as you point out, the only viable fix seems to be to reserve
>> >> > enough
>> >> > memory for the AM on *every single node*, which in some cases might
>> >> > actually
>> >> > be worse than wasting a lot of memory on a single node.
>> >> >
>> >> > So yeah, I also don't like either option. Is this just the price you
>> >> > pay
>> >> > for
>> >> > running on YARN?
>> >> >
>> >> >
>> >> > ~ Jonathan
>> >> >
>> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
>> >> > <ap...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Lets say that yarn has 53GB memory available on each slave
>> >> >>
>> >> >> spark.am container needs 896MB.  (512 + 384)
>> >> >>
>> >> >> I see two options to configure spark:
>> >> >>
>> >> >> 1. configure spark executors to use 52GB and leave 1 GB on each box.
>> >> >> So,
>> >> >> some box will also run am container. So, 1GB memory will not be used
>> >> >> on
>> >> >> all
>> >> >> slaves but one.
>> >> >>
>> >> >> 2. configure spark to use all 53GB and add additional 53GB box which
>> >> >> will
>> >> >> run only am container. So, 52GB on this additional box will do
>> >> >> nothing
>> >> >>
>> >> >> I do not like both options. Is there a better way to configure
>> >> >> yarn/spark?
>> >> >>
>> >> >>
>> >> >> Alex
>> >
>> >



-- 
Marcelo



Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Alexander Pivovarov <ap...@gmail.com>.
The AM container starts first, and YARN selects a random computer to run it.

Is it possible to configure YARN so that it selects a small computer for the
AM container?
On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:

> If it's too small to run an executor, I'd think it would be chosen for
> the AM as the only way to satisfy the request.
>
> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
> <ap...@gmail.com> wrote:
> > If I add additional small box to the cluster can I configure yarn to
> select
> > small box to run am container?
> >
> >
> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> Typically YARN is there because you're mediating resource requests
> >> from things besides Spark, so yeah using every bit of the cluster is a
> >> little bit of a corner case. There's not a good answer if all your
> >> nodes are the same size.
> >>
> >> I think you can let YARN over-commit RAM though, and allocate more
> >> memory than it actually has. It may be beneficial to let them all
> >> think they have an extra GB, and let one node running the AM
> >> technically be overcommitted, a state which won't hurt at all unless
> >> you're really really tight on memory, in which case something might
> >> get killed.
> >>
> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <jo...@gmail.com>
> >> wrote:
> >> > Alex,
> >> >
> >> > That's a very good question that I've been trying to answer myself
> >> > recently
> >> > too. Since you've mentioned before that you're using EMR, I assume
> >> > you're
> >> > asking this because you've noticed this behavior on emr-4.3.0.
> >> >
> >> > In this release, we made some changes to the
> maximizeResourceAllocation
> >> > (which you may or may not be using, but either way this issue is
> >> > present),
> >> > including the accidental inclusion of somewhat of a bug that makes it
> >> > not
> >> > reserve any space for the AM, which ultimately results in one of the
> >> > nodes
> >> > being utilized only by the AM and not an executor.
> >> >
> >> > However, as you point out, the only viable fix seems to be to reserve
> >> > enough
> >> > memory for the AM on *every single node*, which in some cases might
> >> > actually
> >> > be worse than wasting a lot of memory on a single node.
> >> >
> >> > So yeah, I also don't like either option. Is this just the price you
> pay
> >> > for
> >> > running on YARN?
> >> >
> >> >
> >> > ~ Jonathan
> >> >
> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
> >> > <ap...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Lets say that yarn has 53GB memory available on each slave
> >> >>
> >> >> spark.am container needs 896MB.  (512 + 384)
> >> >>
> >> >> I see two options to configure spark:
> >> >>
> >> >> 1. configure spark executors to use 52GB and leave 1 GB on each box.
> >> >> So,
> >> >> some box will also run am container. So, 1GB memory will not be used
> on
> >> >> all
> >> >> slaves but one.
> >> >>
> >> >> 2. configure spark to use all 53GB and add additional 53GB box which
> >> >> will
> >> >> run only am container. So, 52GB on this additional box will do
> nothing
> >> >>
> >> >> I do not like both options. Is there a better way to configure
> >> >> yarn/spark?
> >> >>
> >> >>
> >> >> Alex
> >
> >
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Jonathan Kelly <jo...@gmail.com>.
Sean, I'm not sure if that's actually the case, since the AM would be
allocated before the executors are even requested (by the driver through
the AM), right? This must at least be the case with dynamicAllocation
enabled, but I would expect that it's true regardless.

However, Alex, yes, this would be possible on EMR if you use small CORE
instances and larger TASK instances. EMR is configured to run AMs only on
CORE instances, so if you don't need much HDFS space (HDFS is stored only
on CORE instances, not TASK instances), this might be a good option for
you. Note, though, that you would have to set spark.executor.memory yourself
rather than using maximizeResourceAllocation, because maximizeResourceAllocation
currently only considers the size of the CORE instances when determining
spark.{driver,executor}.memory.
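
In that setup the sizing has to be done by hand against the large TASK
instances, e.g. something like the following in spark-defaults.conf (numbers
are illustrative only, not EMR defaults):

    spark.driver.memory                  4g
    spark.executor.memory                20g
    spark.yarn.executor.memoryOverhead   2048
    spark.executor.cores                 4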

~ Jonathan

On Tue, Feb 9, 2016 at 12:40 AM Sean Owen <so...@cloudera.com> wrote:

> If it's too small to run an executor, I'd think it would be chosen for
> the AM as the only way to satisfy the request.
>
> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
> <ap...@gmail.com> wrote:
> > If I add additional small box to the cluster can I configure yarn to
> select
> > small box to run am container?
> >
> >
> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> Typically YARN is there because you're mediating resource requests
> >> from things besides Spark, so yeah using every bit of the cluster is a
> >> little bit of a corner case. There's not a good answer if all your
> >> nodes are the same size.
> >>
> >> I think you can let YARN over-commit RAM though, and allocate more
> >> memory than it actually has. It may be beneficial to let them all
> >> think they have an extra GB, and let one node running the AM
> >> technically be overcommitted, a state which won't hurt at all unless
> >> you're really really tight on memory, in which case something might
> >> get killed.
> >>
> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <jo...@gmail.com>
> >> wrote:
> >> > Alex,
> >> >
> >> > That's a very good question that I've been trying to answer myself
> >> > recently
> >> > too. Since you've mentioned before that you're using EMR, I assume
> >> > you're
> >> > asking this because you've noticed this behavior on emr-4.3.0.
> >> >
> >> > In this release, we made some changes to the
> maximizeResourceAllocation
> >> > (which you may or may not be using, but either way this issue is
> >> > present),
> >> > including the accidental inclusion of somewhat of a bug that makes it
> >> > not
> >> > reserve any space for the AM, which ultimately results in one of the
> >> > nodes
> >> > being utilized only by the AM and not an executor.
> >> >
> >> > However, as you point out, the only viable fix seems to be to reserve
> >> > enough
> >> > memory for the AM on *every single node*, which in some cases might
> >> > actually
> >> > be worse than wasting a lot of memory on a single node.
> >> >
> >> > So yeah, I also don't like either option. Is this just the price you
> pay
> >> > for
> >> > running on YARN?
> >> >
> >> >
> >> > ~ Jonathan
> >> >
> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
> >> > <ap...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Lets say that yarn has 53GB memory available on each slave
> >> >>
> >> >> spark.am container needs 896MB.  (512 + 384)
> >> >>
> >> >> I see two options to configure spark:
> >> >>
> >> >> 1. configure spark executors to use 52GB and leave 1 GB on each box.
> >> >> So,
> >> >> some box will also run am container. So, 1GB memory will not be used
> on
> >> >> all
> >> >> slaves but one.
> >> >>
> >> >> 2. configure spark to use all 53GB and add additional 53GB box which
> >> >> will
> >> >> run only am container. So, 52GB on this additional box will do
> nothing
> >> >>
> >> >> I do not like both options. Is there a better way to configure
> >> >> yarn/spark?
> >> >>
> >> >>
> >> >> Alex
> >
> >
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Sean Owen <so...@cloudera.com>.
If it's too small to run an executor, I'd think it would be chosen for
the AM as the only way to satisfy the request.

On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
<ap...@gmail.com> wrote:
> If I add additional small box to the cluster can I configure yarn to select
> small box to run am container?
>
>
> On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> Typically YARN is there because you're mediating resource requests
>> from things besides Spark, so yeah using every bit of the cluster is a
>> little bit of a corner case. There's not a good answer if all your
>> nodes are the same size.
>>
>> I think you can let YARN over-commit RAM though, and allocate more
>> memory than it actually has. It may be beneficial to let them all
>> think they have an extra GB, and let one node running the AM
>> technically be overcommitted, a state which won't hurt at all unless
>> you're really really tight on memory, in which case something might
>> get killed.
>>
>> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <jo...@gmail.com>
>> wrote:
>> > Alex,
>> >
>> > That's a very good question that I've been trying to answer myself
>> > recently
>> > too. Since you've mentioned before that you're using EMR, I assume
>> > you're
>> > asking this because you've noticed this behavior on emr-4.3.0.
>> >
>> > In this release, we made some changes to the maximizeResourceAllocation
>> > (which you may or may not be using, but either way this issue is
>> > present),
>> > including the accidental inclusion of somewhat of a bug that makes it
>> > not
>> > reserve any space for the AM, which ultimately results in one of the
>> > nodes
>> > being utilized only by the AM and not an executor.
>> >
>> > However, as you point out, the only viable fix seems to be to reserve
>> > enough
>> > memory for the AM on *every single node*, which in some cases might
>> > actually
>> > be worse than wasting a lot of memory on a single node.
>> >
>> > So yeah, I also don't like either option. Is this just the price you pay
>> > for
>> > running on YARN?
>> >
>> >
>> > ~ Jonathan
>> >
>> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
>> > <ap...@gmail.com>
>> > wrote:
>> >>
>> >> Lets say that yarn has 53GB memory available on each slave
>> >>
>> >> spark.am container needs 896MB.  (512 + 384)
>> >>
>> >> I see two options to configure spark:
>> >>
>> >> 1. configure spark executors to use 52GB and leave 1 GB on each box.
>> >> So,
>> >> some box will also run am container. So, 1GB memory will not be used on
>> >> all
>> >> slaves but one.
>> >>
>> >> 2. configure spark to use all 53GB and add additional 53GB box which
>> >> will
>> >> run only am container. So, 52GB on this additional box will do nothing
>> >>
>> >> I do not like both options. Is there a better way to configure
>> >> yarn/spark?
>> >>
>> >>
>> >> Alex
>
>



Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Alexander Pivovarov <ap...@gmail.com>.
If I add additional small box to the cluster can I configure yarn to select
small box to run am container?


On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com> wrote:

> Typically YARN is there because you're mediating resource requests
> from things besides Spark, so yeah using every bit of the cluster is a
> little bit of a corner case. There's not a good answer if all your
> nodes are the same size.
>
> I think you can let YARN over-commit RAM though, and allocate more
> memory than it actually has. It may be beneficial to let them all
> think they have an extra GB, and let one node running the AM
> technically be overcommitted, a state which won't hurt at all unless
> you're really really tight on memory, in which case something might
> get killed.
>
> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <jo...@gmail.com>
> wrote:
> > Alex,
> >
> > That's a very good question that I've been trying to answer myself
> recently
> > too. Since you've mentioned before that you're using EMR, I assume you're
> > asking this because you've noticed this behavior on emr-4.3.0.
> >
> > In this release, we made some changes to the maximizeResourceAllocation
> > (which you may or may not be using, but either way this issue is
> present),
> > including the accidental inclusion of somewhat of a bug that makes it not
> > reserve any space for the AM, which ultimately results in one of the
> nodes
> > being utilized only by the AM and not an executor.
> >
> > However, as you point out, the only viable fix seems to be to reserve
> enough
> > memory for the AM on *every single node*, which in some cases might
> actually
> > be worse than wasting a lot of memory on a single node.
> >
> > So yeah, I also don't like either option. Is this just the price you pay
> for
> > running on YARN?
> >
> >
> > ~ Jonathan
> >
> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov <apivovarov@gmail.com
> >
> > wrote:
> >>
> >> Lets say that yarn has 53GB memory available on each slave
> >>
> >> spark.am container needs 896MB.  (512 + 384)
> >>
> >> I see two options to configure spark:
> >>
> >> 1. configure spark executors to use 52GB and leave 1 GB on each box. So,
> >> some box will also run am container. So, 1GB memory will not be used on
> all
> >> slaves but one.
> >>
> >> 2. configure spark to use all 53GB and add additional 53GB box which
> will
> >> run only am container. So, 52GB on this additional box will do nothing
> >>
> >> I do not like both options. Is there a better way to configure
> yarn/spark?
> >>
> >>
> >> Alex
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Jonathan Kelly <jo...@gmail.com>.
Praveen,

You mean cluster mode, right? That would still, in a sense, cause one box to
be "wasted", but at least that box would be used closer to its full
potential, especially if you set spark.driver.memory higher than its 1g
default. Also, cluster mode is not an option for some applications, such as
the spark-shell, the pyspark shell, or Zeppelin.

~ Jonathan
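
As a rough illustration of that (the class and jar names below are hypothetical,
and the 4g figure is an assumption rather than a recommendation), a cluster-mode
submission would look something like:

  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --driver-memory 4g \
    --class com.example.MyJob \
    my-job.jar

In cluster mode the driver runs inside the AM container, so raising
--driver-memory is what lets the "AM box" do more than host a ~1 GB container.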

On Tue, Feb 9, 2016 at 5:48 AM praveen S <my...@gmail.com> wrote:

> How about running in client mode, so that the client from which it is run
> becomes the driver.
>
> Regards,
> Praveen
> On 9 Feb 2016 16:59, "Steve Loughran" <st...@hortonworks.com> wrote:
>
>>
>> > On 9 Feb 2016, at 06:53, Sean Owen <so...@cloudera.com> wrote:
>> >
>> >
>> > I think you can let YARN over-commit RAM though, and allocate more
>> > memory than it actually has. It may be beneficial to let them all
>> > think they have an extra GB, and let one node running the AM
>> > technically be overcommitted, a state which won't hurt at all unless
>> > you're really really tight on memory, in which case something might
>> > get killed.
>>
>>
>> from my test VMs
>>
>>       <property>
>>         <description>Whether physical memory limits will be enforced for
>>           containers.
>>         </description>
>>         <name>yarn.nodemanager.pmem-check-enabled</name>
>>         <value>false</value>
>>       </property>
>>
>>       <property>
>>         <name>yarn.nodemanager.vmem-check-enabled</name>
>>         <value>false</value>
>>       </property>
>>
>>
>> it does mean that a container can swap massively, hurting the performance
>> of all containers around it as IO bandwidth gets soaked up —which is why
>> the checks are on for shared clusters. If it's dedicated, you can overcommit
>
>

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by praveen S <my...@gmail.com>.
How about running in client mode, so that the client from which it is run
becomes the driver?

Regards,
Praveen
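
A sketch of that setup, assuming the job is submitted from a gateway/edge
machine that is not one of the NodeManagers (class and jar names are
hypothetical):

  spark-submit \
    --master yarn \
    --deploy-mode client \
    --class com.example.MyJob \
    my-job.jar

One caveat: even in client mode YARN still launches a small AM container on the
cluster to negotiate executors; only the driver itself stays on the submitting
machine.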
On 9 Feb 2016 16:59, "Steve Loughran" <st...@hortonworks.com> wrote:

>
> > On 9 Feb 2016, at 06:53, Sean Owen <so...@cloudera.com> wrote:
> >
> >
> > I think you can let YARN over-commit RAM though, and allocate more
> > memory than it actually has. It may be beneficial to let them all
> > think they have an extra GB, and let one node running the AM
> > technically be overcommitted, a state which won't hurt at all unless
> > you're really really tight on memory, in which case something might
> > get killed.
>
>
> from my test VMs
>
>       <property>
>         <description>Whether physical memory limits will be enforced for
>           containers.
>         </description>
>         <name>yarn.nodemanager.pmem-check-enabled</name>
>         <value>false</value>
>       </property>
>
>       <property>
>         <name>yarn.nodemanager.vmem-check-enabled</name>
>         <value>false</value>
>       </property>
>
>
> it does mean that a container can swap massively, hurting the performance
> of all containers around it as IO bandwidth gets soaked up —which is why
> the checks are on for shared clusters. If it's dedicated, you can overcommit

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Steve Loughran <st...@hortonworks.com>.
> On 9 Feb 2016, at 06:53, Sean Owen <so...@cloudera.com> wrote:
> 
> 
> I think you can let YARN over-commit RAM though, and allocate more
> memory than it actually has. It may be beneficial to let them all
> think they have an extra GB, and let one node running the AM
> technically be overcommitted, a state which won't hurt at all unless
> you're really really tight on memory, in which case something might
> get killed.


from my test VMs

      <property>
        <description>Whether physical memory limits will be enforced for
          containers.
        </description>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
      </property>

      <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
      </property>


it does mean that a container can swap massively, hurting the performance of all containers around it as IO bandwidth gets soaked up, which is why the checks are on for shared clusters. If the cluster is dedicated, you can overcommit.

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Sean Owen <so...@cloudera.com>.
Typically YARN is there because you're mediating resource requests
from things besides Spark, so yeah using every bit of the cluster is a
little bit of a corner case. There's not a good answer if all your
nodes are the same size.

I think you can let YARN over-commit RAM though, and allocate more
memory than it actually has. It may be beneficial to let them all
think they have an extra GB, and let one node running the AM
technically be overcommitted, a state which won't hurt at all unless
you're really really tight on memory, in which case something might
get killed.
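
As a concrete sketch of that over-commit, using the numbers from the original
mail (53 GB, i.e. 54272 MB, per NodeManager plus the 896 MB AM container; the
exact value is an illustration, not a recommendation):

      <!-- yarn-site.xml: advertise ~896 MB more memory than the node really
           has, so only the node that happens to host the AM is overcommitted -->
      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>55168</value>  <!-- 54272 + 896 -->
      </property>

At worst, the node hosting the AM runs roughly 896 MB over its physical memory,
which is the trade-off described above.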

On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <jo...@gmail.com> wrote:
> Alex,
>
> That's a very good question that I've been trying to answer myself recently
> too. Since you've mentioned before that you're using EMR, I assume you're
> asking this because you've noticed this behavior on emr-4.3.0.
>
> In this release, we made some changes to the maximizeResourceAllocation
> (which you may or may not be using, but either way this issue is present),
> including the accidental inclusion of somewhat of a bug that makes it not
> reserve any space for the AM, which ultimately results in one of the nodes
> being utilized only by the AM and not an executor.
>
> However, as you point out, the only viable fix seems to be to reserve enough
> memory for the AM on *every single node*, which in some cases might actually
> be worse than wasting a lot of memory on a single node.
>
> So yeah, I also don't like either option. Is this just the price you pay for
> running on YARN?
>
>
> ~ Jonathan
>
> On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov <ap...@gmail.com>
> wrote:
>>
>> Lets say that yarn has 53GB memory available on each slave
>>
>> spark.am container needs 896MB.  (512 + 384)
>>
>> I see two options to configure spark:
>>
>> 1. configure spark executors to use 52GB and leave 1 GB on each box. So,
>> some box will also run am container. So, 1GB memory will not be used on all
>> slaves but one.
>>
>> 2. configure spark to use all 53GB and add additional 53GB box which will
>> run only am container. So, 52GB on this additional box will do nothing
>>
>> I do not like both options. Is there a better way to configure yarn/spark?
>>
>>
>> Alex

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

Posted by Jonathan Kelly <jo...@gmail.com>.
Alex,

That's a very good question that I've been trying to answer myself recently
too. Since you've mentioned before that you're using EMR, I assume you're
asking this because you've noticed this behavior on emr-4.3.0.

In this release, we made some changes to the maximizeResourceAllocation
setting (which you may or may not be using, but either way this issue is
present), including the accidental introduction of something of a bug that
prevents it from reserving any space for the AM, which ultimately results in
one of the nodes being utilized only by the AM and not an executor.

However, as you point out, the only viable fix seems to be to reserve
enough memory for the AM on *every single node*, which in some cases might
actually be worse than wasting a lot of memory on a single node.

So yeah, I also don't like either option. Is this just the price you pay
for running on YARN?


~ Jonathan
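
For completeness, a sketch of the "reserve on every node" option in
spark-defaults.conf, assuming a single executor per 53 GB node; the 47g/5120
split is an assumption chosen so that heap plus overhead comes to about 52 GB,
leaving roughly 1 GB of headroom for the 896 MB AM container wherever it lands:

  spark.executor.memory                 47g
  spark.yarn.executor.memoryOverhead    5120

Whether that ~1 GB per node costs more than idling most of one box under the AM
depends mainly on how many nodes the cluster has.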
On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov <ap...@gmail.com>
wrote:

> Lets say that yarn has 53GB memory available on each slave
>
> spark.am container needs 896MB.  (512 + 384)
>
> I see two options to configure spark:
>
> 1. configure spark executors to use 52GB and leave 1 GB on each box. So,
> some box will also run am container. So, 1GB memory will not be used on all
> slaves but one.
>
> 2. configure spark to use all 53GB and add additional 53GB box which will
> run only am container. So, 52GB on this additional box will do nothing
>
> I do not like both options. Is there a better way to configure yarn/spark?
>
>
> Alex
>