Posted to dev@reef.apache.org by Markus Weimer <ma...@weimo.de> on 2017/10/04 07:16:36 UTC

YARN Request IDs are here

Hi,

Old-timers might remember that this has been a need of ours for a long
time: YARN now allows you to tag resource requests, and the allocations
that come back carry those tags:

https://issues.apache.org/jira/browse/YARN-4879


This will greatly simplify the implementation of stuff like IMRU, where
there are different resource allocations for different Evaluators. Now it
is up to us to support that new API. I believe that the
`SchedulingConstraints` approach taken in REEF-1750 should be a workable
way to get this new feature into the `EvaluatorRequestor` APIs. For the
`AllocatedEvaluator`, we could add a new field `requestId` or such to
`EvaluatorDescriptor`. However, I think that we've reached the point where
that class might be better served by a `Map<String,String>` of metadata
about that Evaluator. That way, runtimes can communicate an open set of
metadata to the Driver.
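
As a rough illustration of the metadata idea, here is a minimal Java sketch.
It assumes a hypothetical class `EvaluatorMetadataSketch` standing in for the
metadata part of `EvaluatorDescriptor`, and a made-up key
`yarn.allocationRequestId`. Neither is existing REEF API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the metadata part of an EvaluatorDescriptor.
// Class name and key are illustrative assumptions, not REEF API.
final class EvaluatorMetadataSketch {
  // Assumed key under which a YARN runtime might report the request tag.
  static final String REQUEST_ID_KEY = "yarn.allocationRequestId";

  private final Map<String, String> metadata;

  EvaluatorMetadataSketch(final Map<String, String> metadata) {
    // Defensive copy: the Driver sees an immutable snapshot.
    this.metadata = Collections.unmodifiableMap(new HashMap<>(metadata));
  }

  // The open set of metadata the runtime chose to report.
  Map<String, String> getMetadata() {
    return metadata;
  }

  // Lookup that makes "the runtime did not report this" explicit.
  Optional<String> get(final String key) {
    return Optional.ofNullable(metadata.get(key));
  }
}

final class EvaluatorMetadataDemo {
  public static void main(final String[] args) {
    final Map<String, String> fromRuntime = new HashMap<>();
    fromRuntime.put(EvaluatorMetadataSketch.REQUEST_ID_KEY, "42");
    final EvaluatorMetadataSketch descriptor =
        new EvaluatorMetadataSketch(fromRuntime);
    System.out.println(
        descriptor.get(EvaluatorMetadataSketch.REQUEST_ID_KEY).orElse("<none>"));
  }
}
```

Returning `Optional` keeps the map open-ended: a runtime that has no request
IDs simply omits the key, and the Driver has to handle that case.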

WDYT?

Markus

Re: YARN Request IDs are here

Posted by Markus Weimer <ma...@weimo.de>.
>
> This feature is in 2.9.0, 3.0.0-beta1. When can we pick up Hadoop 2.9.0?
>

I have no objections to targeting a new Hadoop release with REEF 0.17. One
concern is, however, that each time we do that we cut off some users. Maybe
send a [DISCUSS] thread to the list to alert everyone that we might be
doing that. If nobody complains, I see no reason why we couldn't. If this
move creates a problem for some users, we can write code around it that
only enables it if the job is run on a new-enough version of Hadoop. But
that is the second-best solution.
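
A minimal sketch of what such a version gate could look like, in Java. The
class `HadoopVersionGate` and its methods are hypothetical. The version
string is passed in as a parameter; on a real cluster it could presumably be
obtained from Hadoop's `VersionInfo.getVersion()`, but that wiring is an
assumption and is not shown here.

```java
// Sketch of a version gate: use request IDs only on a new-enough Hadoop.
// Hypothetical helper, not part of REEF.
final class HadoopVersionGate {
  // Minimum release assumed to carry the feature, per this thread (2.9.0).
  private static final int[] MINIMUM = {2, 9, 0};

  // Extracts the leading major.minor.patch numbers, e.g. "3.0.0-beta1" -> {3, 0, 0}.
  // Missing or non-numeric components are treated as 0.
  static int[] parse(final String version) {
    final String[] parts = version.split("[.-]");
    final int[] result = new int[3];
    for (int i = 0; i < 3 && i < parts.length; i++) {
      try {
        result[i] = Integer.parseInt(parts[i]);
      } catch (final NumberFormatException e) {
        break; // stop at the first non-numeric component
      }
    }
    return result;
  }

  static boolean supportsRequestIds(final String hadoopVersion) {
    // Conservative assumption: the thread only names 3.0.0-beta1,
    // so alpha pre-releases are treated as unsupported.
    if (hadoopVersion.contains("alpha")) {
      return false;
    }
    final int[] v = parse(hadoopVersion);
    for (int i = 0; i < 3; i++) {
      if (v[i] != MINIMUM[i]) {
        return v[i] > MINIMUM[i];
      }
    }
    return true; // exactly the minimum version
  }
}

final class HadoopVersionGateDemo {
  public static void main(final String[] args) {
    for (final String v : new String[] {"2.7.3", "2.9.0", "3.0.0-beta1"}) {
      System.out.println(v + " -> " + HadoopVersionGate.supportsRequestIds(v));
    }
  }
}
```

Vendor-specific version strings may not follow this pattern, so a real check
would need more care than this sketch takes.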

That being said, I would not support a move to a pre-release of Hadoop.
Released versions of REEF should only depend on released versions of Hadoop.

Markus

RE: YARN Request IDs are here

Posted by "Julia Wang (QIUHE)" <Qi...@microsoft.com.INVALID>.
This feature is in 2.9.0, 3.0.0-beta1. When can we pick up Hadoop 2.9.0? 

-Julia

-----Original Message-----
From: Byung-Gon Chun [mailto:bgchun@gmail.com] 
Sent: Wednesday, October 4, 2017 1:12 AM
To: dev@reef.apache.org
Subject: Re: YARN Request IDs are here

Hi Markus,

It's great to hear that YARN request IDs are available finally. :)

The initial proposal for node labels proposed to add Map<String, String> of metadata in EvaluatorDescriptor. After a couple of review rounds, Seokchan and Gyewon suggested using SchedulingConstraints.

Do you think it's better to use Map<String, String> instead of SchedulingConstraints? If so, we can go back to the route of using Map<String, String> without adding SchedulingConstraints.

Thanks!
-Gon



On Wed, Oct 4, 2017 at 4:16 PM, Markus Weimer <ma...@weimo.de> wrote:

> Hi,
>
> the long timers might remember, but this has been a need of ours for a long
> time: YARN now allows you to tag resource requests, and the allocations
> that come back will have those tags:
>
> https://issues.apache.org/jira/browse/YARN-4879
>
>
> This will greatly simplify the implementation of stuff like IMRU, 
> where there are different resource allocations for different 
> Evaluators. Now it is up to us to support that new API. I believe that 
> the `SchedulingConstraints` approach taken in REEF-1750 should be a 
> workable way to get this new feature into the `EvaluatorRequestor` 
> APIs. For the `AllocatedEvaluator`, we could add a new field 
> `requestId` or such to `EvaluatorDescriptor`. However, I think that 
> we've reached the point where that class might be better served by a 
> `Map<String,String>` of metadata about that Evaluator. That way, 
> runtimes can communicate an open set of metadata to the Driver.
>
> WDYT?
>
> Markus
>



--
Byung-Gon Chun

Re: YARN Request IDs are here

Posted by Markus Weimer <ma...@weimo.de>.
On Wed, Oct 4, 2017 at 10:11 AM, Byung-Gon Chun <bg...@gmail.com> wrote:

> Do you think it's better to use Map<String, String> instead of
> SchedulingConstraints? If so, we can go back to the route of using
> Map<String, String> without adding SchedulingConstraints.
>

I was thinking of doing both:


   1. Add a `SchedulingConstraint` for when you ask for containers. We
   might have to come up with a better name for it, as it is no longer a
   `Constraint` like it was for Node Labels.
   2. Add a metadata map to the `EvaluatorDescriptor` where we can keep all
   the unsorted metadata the resource manager wants to tell us about. If we
   add a `SchedulingConstraint` for the request ID, we can add a typed field
   for it, of course.
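
The two points above could fit together roughly as in the following Java
sketch. Every name in it (`TaggedRequestSketch`, `AllocationSketch`,
`RequestMatcherSketch`, the `requestId` key) is hypothetical and only
illustrates the shape of the idea; none of it is existing REEF or YARN API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Point 1: a request that carries a tag (hypothetical, not EvaluatorRequest).
final class TaggedRequestSketch {
  final long requestId;
  final int memoryMb;
  final int cores;

  TaggedRequestSketch(final long requestId, final int memoryMb, final int cores) {
    this.requestId = requestId;
    this.memoryMb = memoryMb;
    this.cores = cores;
  }
}

// Point 2: an allocation with an untyped metadata map plus a typed accessor.
final class AllocationSketch {
  static final String REQUEST_ID_KEY = "requestId"; // assumed key name

  private final Map<String, String> metadata;

  AllocationSketch(final Map<String, String> metadata) {
    this.metadata = new HashMap<>(metadata);
  }

  // Typed view of the request ID; empty if absent or malformed.
  OptionalLong getRequestId() {
    final String raw = metadata.get(REQUEST_ID_KEY);
    if (raw == null) {
      return OptionalLong.empty();
    }
    try {
      return OptionalLong.of(Long.parseLong(raw));
    } catch (final NumberFormatException e) {
      return OptionalLong.empty();
    }
  }
}

// Driver-side bookkeeping: which role (e.g. IMRU mapper vs. updater)
// was each tagged request made for?
final class RequestMatcherSketch {
  private final Map<Long, String> roleByRequestId = new HashMap<>();

  void submit(final TaggedRequestSketch request, final String role) {
    roleByRequestId.put(request.requestId, role);
  }

  String roleOf(final AllocationSketch allocation) {
    final OptionalLong id = allocation.getRequestId();
    return id.isPresent()
        ? roleByRequestId.getOrDefault(id.getAsLong(), "unknown")
        : "unknown";
  }
}
```

With something like this, a Driver that asks for one large container and
several small ones would no longer need to guess from the container size
which allocation answers which request.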


Markus

Re: YARN Request IDs are here

Posted by Byung-Gon Chun <bg...@gmail.com>.
Hi Markus,

It's great to hear that YARN request IDs are available finally. :)

The initial proposal for node labels proposed to add Map<String, String> of
metadata in EvaluatorDescriptor. After a couple of review rounds, Seokchan
and Gyewon suggested using SchedulingConstraints.

Do you think it's better to use Map<String, String> instead of
SchedulingConstraints? If so, we can go back to the route of using
Map<String, String> without adding SchedulingConstraints.

Thanks!
-Gon



On Wed, Oct 4, 2017 at 4:16 PM, Markus Weimer <ma...@weimo.de> wrote:

> Hi,
>
> the long timers might remember, but this has been a need of ours for a long
> time: YARN now allows you to tag resource requests, and the allocations
> that come back will have those tags:
>
> https://issues.apache.org/jira/browse/YARN-4879
>
>
> This will greatly simplify the implementation of stuff like IMRU, where
> there are different resource allocations for different Evaluators. Now it
> is up to us to support that new API. I believe that the
> `SchedulingConstraints` approach taken in REEF-1750 should be a workable
> way to get this new feature into the `EvaluatorRequestor` APIs. For the
> `AllocatedEvaluator`, we could add a new field `requestId` or such to
> `EvaluatorDescriptor`. However, I think that we've reached the point where
> that class might be better served by a `Map<String,String>` of metadata
> about that Evaluator. That way, runtimes can communicate an open set of
> metadata to the Driver.
>
> WDYT?
>
> Markus
>



-- 
Byung-Gon Chun